Ann. N.Y. Acad. Sci. ISSN 0077-8923
ANNALS OF THE NEW YORK ACADEMY OF SCIENCES
Issue: DNA Habitats and Their RNA Inhabitants
The dynamics of the RNA world: insights and challenges
Ádám Kun,1,2 András Szilágyi,1,4 Balázs Könnyű,3 Gergely Boza,3 István Zachar,1
and Eörs Szathmáry1,3,4
1Parmenides Center for the Conceptual Foundations of Science, Munich/Pullach, Germany. 2MTA-ELTE-MTMT Ecology
Research Group, Budapest, Hungary. 3Department of Plant Systematics, Ecology and Theoretical Biology, Institute of
Biology, Eötvös University, Budapest, Hungary. 4MTA-ELTE Theoretical Biology and Evolutionary Ecology Research Group,
Department of Plant Systematics, Ecology and Theoretical Biology, Budapest, Hungary
Address for correspondence: Eörs Szathmáry, Parmenides Center for the Conceptual Foundations of Science, Kirchplatz 1,
82049 Munich/Pullach, Germany. szathmary.eors@gmail.com
The RNA world hypothesis of the origin of life, in which RNA emerged as both enzyme and information carrier, is
receiving solid experimental support. The prebiotic synthesis of biomolecules, the catalytic aid offered by mineral
surfaces, and the vast enzymatic repertoire of ribozymes are only pieces of the origin of life puzzle; the full picture
can only emerge if the pieces fit together by either following from one another or coexisting with each other. Here,
we review the theory of the origin, maintenance, and enhancement of the RNA world as an evolving population of
dynamical systems. The dynamical view of the origin of life allows us to pinpoint the missing and the not fitting pieces:
(1) How can the first self-replicating ribozyme emerge in the absence of template-directed information replication? (2)
How can nucleotide replicators avoid competitive exclusion despite utilizing the very same resources (nucleobases)?
(3) How can the information catastrophe be avoided? (4) How can enough genes integrate into a cohesive system in
order to transition to a cellular stage? (5) How can the way information is stored and metabolic complexity coevolve
to pave the road leading out of the RNA world to the present protein–DNA world?
Keywords: RNA world; origin of life; error threshold; hypercycle; metabolism; ribozyme
Introduction
The possibility of an RNA world, a period in the
origin of life on Earth, when RNA molecules acted
both as enzymes and as genetic material, was sug-
gested well before the name was coined by Gilbert
in 1986.1 The history of the research on the origin of
life2 tells us that the potential prebiotic importance
of RNA was suggested as early as the late 1950s.
When it became established that living cells harbor
much more RNA than DNA, some biologists pro-
posed that RNA preceded DNA during evolution.3,4
The discovery of the details of protein synthesis5
revealed a plethora of RNA molecules involved in
a diversity of processes within the contemporary
cells, which prompted speculation on the possible
prebiotic/ancestral role of RNA. Woese,6 Orgel,7
and Crick8 independently proposed that RNA
acted both as catalyst and as information-carrying
molecule. Gánti9 presented a detailed account of the
origin and embedding of catalytic RNA molecules
in a metabolizing and dividing chemical supersys-
tem: the chemoton.10 The idea of catalytic RNA
received prime experimental proof by the discovery
of natural RNA enzymes (ribozymes), found inde-
pendently by the groups of Altman11 and Cech.12
Jeffares13 proposed that, if we encounter a cat-
alytic RNA in a modern organism, it could be a relic
from a bygone era—especially if it is found in all
domains of life. Unfortunately, not many naturally
occurring ribozymes are known. Besides the initially
discovered RNase P11 and the group I introns,12
there are group II introns,14 the hammerhead
ribozyme,15 the hairpin ribozyme,16 the hepatitis
delta virus and like ribozymes,17 the Neurospora
Varkud satellite ribozyme,18 the glmS ribozyme,19
and the twister ribozymes.20 These molecules can,
however, only cleave RNA molecules21,22 —not a
repertoire upon which a metabolism could have
been built. A convincing argument says that these
doi: 10.1111/nyas.12700
Ann. N.Y. Acad. Sci. xxxx (2015) 1–21 C2015 New York Academy of Sciences.
The dynamics of the RNA world Kun et al.
ribozymes have been retained in evolution because
the large size of the products limits the attainable
catalytic enhancement, hence a replacement by pro-
tein enzymes could not have been selected for.13 At
the same time, the idea of an RNA world was built
upon the diverse roles of RNA in contemporary
metabolism and not upon the limited catalytic role
of these natural ribozymes. We are only beginning
to unravel the world of functional RNA molecules,
but it is already clear that they are much more than
simple information storages (RNA viruses) or in-
formation carriers between DNA and polypeptides
(mRNA).
After revealing the catalytic role of RNAs, a smok-
ing gun was found. In translation, RNA serves as a
direct link between DNA and polypeptides, the two
essential actors of contemporary life. Before this dis-
covery, the central component of translation, the
ribosome, was thought to be a normal protein en-
zyme, with an inordinate amount of RNA bundled
within. It took decades to unveil the surprising fact
that the ribosome is actually a huge ribozyme23,24
(or at least a ribozyme with peptide structural el-
ements thrown in). The fact that RNA molecules
are involved in all aspects of the translation process
led many to propose the RNA world hypothesis,6–8
even without knowing its central role yet. Hence,
this finding provides extremely strong evidence for
the theory. It is significant that evidence is accu-
mulating in favor of accepting the spliceosome as
a ribozyme,25–27 of which the RNA core has been
conserved for over one billion years.
The fossil record of the RNA world does not
stop with translation; many of the important
coenzymes contain a nucleotide part.28 NAD(P),
FAD, coenzyme A, S-adenosylmethionine,
3′-phosphoadenosine-5′-phosphosulfate (PAPS), and ATP
contain an adenine part, while thiamine pyrophos-
phate, THF, and pyridoxal phosphate have cyclic
nitrogenous bases that could have been derived from
a nucleobase. Interestingly, their biological activi-
ties do not depend on the adenine part. Why, then,
is the nucleotide part present at all? It could have
been a “handle” by which ribozymes got hold of
the coenzyme before the protein world.29,30 Indeed,
RNA aptamers evolved to bind CoA always bind
the coenzyme through the adenine part, and never
through the sulfonated pantothenic acid part.31
A ribozyme that requires a cofactor would possibly
bind the cofactor in a similar fashion. Such
coenzymes are the ones found to be autocatalytic in
metabolism,32 which also suggests their ancient ori-
gin. Although much better coenzymes might have
evolved in a purely protein world, once many of
the reactions already relied on a particular (and
crucial) coenzyme evolved in an RNA world, re-
placing them was nearly impossible, thus many an-
cient coenzymes evolved in the RNA world are still
with us.
Szathmáry33,34 proposed a way to evolve novel
ribozymes in vitro, which was realized many times
over the next decade. The success of the SELEX
technique35–37 to obtain ribozymes for many impor-
tant reactions convincingly demonstrated that RNA
can have a rich catalytic repertoire.38–42 All types of
reactions necessary for nucleotide and peptide syn-
theses can be catalyzed by such ribozymes.39 Here,
redox ribozymes43–45 should be mentioned, which
demonstrate that energy production is within the
capabilities of ribozyme-run metabolisms.
Besides metabolism, a fully functioning ribo-
organism also requires a membrane, and thus needs
membrane transporters. RNA can change the per-
meability of the membrane46 and ribozymes can
even act as membrane transporters,47 allowing
control over the exchange of material with the
environment.
The facts listed above support the existence of an
ancient RNA world.48,49 Alone, they strongly suggest
that the RNA world held sway during the invention
of the genetic code and translation. The chemical
nature of coenzymes and the enzymatic repertoire
of in vitro–evolved ribozymes indicate that the RNA
world could have a rich metabolism. RNA involve-
ment in translation suggests that peptide synthesis
evolved in the RNA world before DNA. Arguably,
modern metabolism is a palimpsest of the ancient
RNA world.50 Any proof of the existence of the RNA
world, however, does not mean that we understand
all of its aspects. In fact, we are quite far from a com-
plete understanding, as now there might be more
questions about the origins of the RNA world than
answers.
This review focuses on how the RNA world could
have emerged after the appearance of self-replicating
molecules and how it could have provided the first
scaffolding to the living cell, ultimately orchestrat-
ing the transition to peptide enzymes and to the
DNA-encoded genetic material. We survey the cur-
rent status of the RNA world from the point of
view of theoretical evolutionary biology. The story
is considerably more coherent than even a decade
ago, but burning open questions still remain. These
missing details provide the second target of this re-
view. Although the following might be seen as an
embarrassing list of ignorance, we see them as suc-
cessive steps of a research plan: a list of well-defined
questions that need to be tackled. One has to appreciate
that a surge of open questions in a field
indicates not only increasing attention but also
that the field is gaining momentum and progressing, as
without progress, no new questions would surface.
We quote Orgel’s optimistic words about the RNA
world: “We are very far from knowing whodunit.
The only certainty is that there will be a rational
solution.”51 This review complements and updates
another one (Szathmáry et al.52) written about a
decade ago.
Establishment of the RNA world: the
nature of RNA template replicators
“the Struggle for Existence amongst all
organic beings throughout the world [...]
inevitably follows from their high geometrical
powers of increase...”
Darwin, The Origin of Species, 1859
The RNA world, like any complex adaptive sys-
tem, has its own problem of origins. Although a
fully RNA-based genetic system seems feasible in
light of findings about the catalytic repertoire of
ribozymes (see later section on metabolism), as-
suming the spontaneous appearance of a general
and effective RNA polymerase ribozyme is highly
unrealistic. How could evolutionary search gradually
select for a replicase, that is, an enzyme that
can catalyze the template-directed polymerization
of itself or its complementary strand, when there is
no replication and inheritance, thus no evolution,
yet? Robertson and Joyce53 termed this a
chicken-and-egg paradox, pointing out that the
origin of genetic systems with multiplication,
variability, and heredity before an effective
RNA world is a problem that has been relegated but
not vanquished. Later, we discuss the possibilities
and consequences of some hypotheses that attempt
to remedy the paradox of the very origin of the RNA
world.
The combinatorial approach
RNA sequences can form both in solution54,55
and on mineral surfaces such as montmorillonite
clay.56,57 A generally accepted scenario58–60 postu-
lates that the first RNA sequences emerged by ran-
dom, nonenzymatic synthesis of oligomers, and
then their ligation and recombination produced
longer sequences on a surface;61 the ensuing RNA
pool would have been diverse in structure and thus
had (some) catalytic activity. However, this scenario
is problematic on more than one account. Although
small ligases most probably would form in a pre-
biotic environment and sequence diversity would
be undoubtedly huge, it must be kept in mind that
the sequence space is vast. Even if we consider only
sequences up to length L = 50 nt,56,57 the total
number of variants is
$$\sum_{L=1}^{50} 4^{L},$$
which is around $10^{30}$. Essential enzymes needed to
emerge from this sequence space, from which, at
any particular time, only an infinitesimally small
fraction could be realized. Of course, early evolu-
tion did not wait for a single, specific sequence to
appear, as many different nucleotide orders could
have provided useful enzymatic activity.
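The size of this sequence space can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sequence_space_size` is our own, and the calculation assumes a four-letter alphabet and counts every length from 1 up to the chosen maximum.

```python
def sequence_space_size(max_len: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of length 1..max_len (hypothetical helper)."""
    return sum(alphabet_size ** length for length in range(1, max_len + 1))


total = sequence_space_size(50)
print(f"{total:.2e}")  # 1.69e+30
```

The sum is dominated by the longest sequences, so the total is only about one third larger than the number of 50-mers alone.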
Ribozyme activity can be maintained if the structure
is kept intact,62,63 or it can even withstand minor
structural mutations, as the sequence → structure
map is highly degenerate (many sequences fold into
the same structure).64 Thus, it makes more
sense to look for an appropriate structure than for a
sequence. The number of structures for a sequence
space of sequence length L is $2.35^{L}$ rather than
$4^{L}$ (Ref. 64), because there are many fewer structures
than sequences for a given sequence length.64,65
Moreover, some RNA structures are more common
than others;66 for shorter sequences (around L = 30)
it was shown that more than 90% of sequences
fold to common structures. A structure is considered
common if it is formed by more sequences
than the average structure,66 that is, $N_c > 4^{L}/S_L$,
where $N_c$ is the number of sequences folding into
a common structure and $S_L$ is the number of distinct
structures of length L. The rest of the structures,
although numerous, are represented only by
a few sequences. These rare structures are hard to
find, as they exist only in some corners of the se-
quence space, as opposed to common structures
that exist everywhere. Furthermore, even if a rare
structure is found, it can be easily lost, as mutations
always result in a different structure, whereas for
common structures, mutations often result in the
same structure.
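To give a feeling for the commonness criterion, the following sketch computes the average number of sequences per structure, 4^L/S_L, and tests whether a structure counts as common. It assumes, purely for illustration, that S_L can be approximated by the 2.35^L estimate quoted above; the function names are hypothetical.

```python
def structures_estimate(length: int, base: float = 2.35) -> float:
    """Rough number of distinct structures, using the 2.35**L estimate (assumption)."""
    return base ** length


def commonness_threshold(length: int) -> float:
    """Average number of sequences per structure, 4**L / S_L."""
    return 4 ** length / structures_estimate(length)


def is_common(n_sequences: float, length: int) -> bool:
    """A structure is common if more sequences fold into it than into the average one."""
    return n_sequences > commonness_threshold(length)


print(f"{commonness_threshold(30):.1e} sequences per structure on average at L = 30")
print(is_common(10 ** 8, 30), is_common(10, 30))
```

Under this assumption, a structure of length 30 must be formed by several million sequences to qualify as common, which illustrates why rare structures occupy only small corners of sequence space.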
Thus, because of combinatorial and physical
necessities, ribozymes fold more probably into
these common structures.42,67 Common structures
are easily reached from any starting point in
sequence space via evolution; most common
structures are within a distance of at most 15–20
mutations from any arbitrary sequence of length
L = 50.65,68–70 As the composition of RNA sequences
is not random,57 the reachable sequence space
is constrained. If we assume that such sequence
constraints do not restrict the reachable structural
space, then a smaller sequence space needs to be
searched to find a useful structure. Unfortunately,
this fraction of the sequence space still contains
approximately $10^{23}$ sequences (considering sequences
only up to L = 50).
Let us say that short ligases and nucleases emerge.
If the reaction network of short oligomers results
in uncontrolled duplex formation, dissociation,
ligation, and breakage of RNA sequences,
the hypothesis of de novo emergence of ribozymes
has to face another serious obstacle: the unavoid-
able elongation of sequences (called the elongation
catastrophe).71 As the template length increases, the
number of possible elongation events suffers a com-
binatorial explosion. Consequently, the diversity in-
creases in the population, instead of producing a
restricted but useful set of sequences.
The hypotheses about the early development of
the RNA world usually conclude that if a restricted
set of RNA sequences can exhibit a large enough
structural variation, then the useful molecules can
be enriched, as such an enrichment (selection)
has been demonstrated by in vitro selection
experiments.61,72 The problem with such a line of
thought is that techniques such as SELEX73 are evo-
lutionary methods that employ template-directed
replication of the genetic material. Evolution
requires variation, multiplication, and heredity.
Random generation of RNAs offers variability and
multiplication, but no heredity.
Another possibility rests on von Kiedrowski-
type replicators:74 two trimers can form a hexamer
guided by another hexamer, then two hexamers can
form a dodecamer guided by an existing one, and
so on. Potentially, quite long replicators can be syn-
thesized in a dynamically stable and exponential
way. This scenario of convergent synthesis has been
analyzed in the model of Fernando et al.,71 which
concluded that spontaneous elongation and paral-
lel replication of short oligomers do not allow this
mechanism to raise itself above noise level. Although
this system could show multiplication and heredity,
a further problem would be that tolerable variation
is limited, as the oligomers and the template have
to be very specific, and many mutations would ruin
the templating effect. Thus, the system would lack
the potential for open-ended evolution, although
even fully fledged ribozymes can replicate in such a
manner.75
As a more feasible alternative to the de novo emer-
gence of a replicase, short functional replicators
(that can emerge spontaneously without enzymes)
may form a diverse cross-catalytic set that in turn
might be responsible for the replicase functional-
ity (although they are not replicases themselves) as
a whole (note that no autocatalysis is assumed for
members at this point, cf. Refs. 76 and 77), or they
might self-assemble to be a functional ribozyme.78
Vasas et al.79 analyzed the kinetic stability of a sim-
ple two-membered autocatalytic loop, in which each
member catalyzes the inclusion of one noncatalytic
molecule. If there are large differences in catalytic
efficiencies (as is probable in the prebiotic context),
the system shows kinetic instability. In this case, the
deterministic equilibrium concentration of one of
the members is very low, so loss by chance in a
stochastic system is likely. Thus, even if a replicase
appears in a diverse, prebiotic RNA pool, it would
still be subject to stochastic loss because initially its
concentration is too low.
Along similar lines, an interesting system has been
presented as a possible solution to the problem of
early RNA replication by Meyer et al.80 In the pro-
posed network, a polymerase helps the replication
of RNA oligomers (but not that of complete poly-
mers), and a ligase helps the formation of itself, as
well as of the polymerase, out of these oligomers.
The system is collectively autocatalytic, but there is
no direct mutual catalysis of replication. The poly-
merase helps the replication of the oligomers, but
the latter contribute stoichiometrically, rather than
catalytically, to the formation of the polymerase and
the autocatalysis of the ligase. A similar system was
analyzed by Wu and Higgs.81
We want to understand the transition from
activated monomers and short (or not so short)
oligomers to an evolving ensemble of RNA
replicators. Starting from synthesized (as opposed
to replicated, based on a template) RNA sequences,
finding a replicase ribozyme that could kick-start
evolution is problematic because of the vast se-
quence space that needs to be searched. Moreover,
maintaining a fledgling replicase in the realm of
population stochasticity would not be easy. Hence,
the emergence of the first template replicator is far
from solved; we are only beginning to understand
the problem itself.
Resource competition: Gause’s principle
A self-replicating ribozyme—whatever its origin—
would still have to compete with other sequences
and side reactions for a limited set of resources (e.g.,
activated nucleotides) and fight information loss be-
cause of erroneous replication. Simple ecological
considerations could help establish the baseline for
coexistence of prebiotic replicators.
The competitive exclusion principle is one of the
major organizing aspects of ecology, formulated first
by Gause in the golden age of theoretical ecology.82 It
states—roughly speaking—that the number of co-
existing species must be less than or equal to the
number of resources that the species compete for.
This obviously puts a limit on the diversity of co-
existing species and remains valid beyond the scope
of classical ecology. Two major refinements are in
order: first, the above statement is only valid for
steady-state situations; second, “resource” does not
mean nutrients only, but includes many other fac-
tors affecting coexistence (the so-called regulating
factors, cf. Refs. 83 and 84).
Exponential growth is generally used as a refer-
ence case for modeling in population dynamics. The
underlying assumption is simple: the change in the
amount of a given species is proportional to its actual
amount; the (asexual) mitotic division of a protist
is a fitting example. The corresponding differential
equation is
$$\frac{dx(t)}{dt} = k\,x(t)^{p}, \qquad (1)$$
where x(t) denotes the concentration of the species
at time t, k is the Malthusian parameter of growth
(per capita growth rate), and p = 1. In this
case, the population growth is exponential,
$x(t) = x(0)\,e^{kt}$, until it reaches ecological (extrinsic)
constraints. If competing species have different
Malthusian parameters, the type with the
higher k ultimately excludes all other variants in the
absence of mutations. In Eigen’s quasispecies model
(see, e.g., Ref. 85), sequences are competitors liv-
ing on a shared pool of limiting resources (e.g., one
type of monomer), thus the fastest replicator with its
mutational neighborhood (the quasispecies) always
excludes others.
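The exclusion of slower replicators under exponential growth can be illustrated with a short sketch. It assumes that the total concentration is held constant (a simplification we introduce here), in which case the relative abundance of type i at time t is proportional to x_i(0) exp(k_i t). The function name and the parameter values are hypothetical.

```python
import math


def relative_abundances(x0, k, t):
    """Relative abundances at time t for exponential growth (p = 1) without mutation.

    Assumes a constant total concentration, so only the ratios
    x_i(0) * exp(k_i * t) matter. All initial values must be positive.
    """
    log_weights = [math.log(x) + ki * t for x, ki in zip(x0, k)]
    shift = max(log_weights)  # subtract the maximum to avoid overflow
    weights = [math.exp(w - shift) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# A rare type with a 10% higher Malthusian parameter takes over.
for t in (0, 50, 200):
    print(t, relative_abundances([0.99, 0.01], [1.0, 1.1], t))
```

Even a small difference in k is decisive in the long run, because the ratio of the two types grows exponentially with the difference of their Malthusian parameters.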
Given the above results, an obvious question
arises: How much diversity can be maintained
among coexisting replicator molecules competing for
the same resources (e.g., nucleotides in the RNA
world)? Mutation-free pure resource competition
can provide a lower bound on the diversity of
coexistence. Both numerical and analytical results
of such resource competition of polynucleotides
agree with Gause’s principle:86 asymptotically stable
coexistence is only possible when the number of
replicators does not exceed the number of resources
(nucleotides) and the nucleotide composition
of replicators is sufficiently different (i.e., niche-
segregation in the RNA world). Interestingly, the
two complementary strands (the plus and minus
strands) can be counted as one replicator from an
ecological point of view, as they are strictly stoi-
chiometrically coupled, thus, for example, on four
nucleotides, at most four pairs (eight sequences)
are able to coexist. The coexistence is affected not
only by the nucleotide composition but also by the
nucleotide order. Parts of sequences that are copied
earlier have a larger influence on the dynamics ow-
ing to the higher concentration of the corresponding
replication intermediates compared to parts copied
later. This sequence effect influences coexistence
(e.g., it can allow the coexistence of two replicators
with identical nucleotide compositions but ade-
quately different sequences) but does not permit
more species to coexist. For a simple replication
system, the number of nucleotides applies a strict
and rather low bound on the number of coexisting
sequence pairs, hence the sustainable diversity.
Taking into account the phenomenon that
double-stranded RNA molecules are replication-
ally inert, the dynamics of coexisting replicators
changes dramatically, yielding a more permissive
criterion for coexistence. Both von Kiedrowski74 and
Zielinski and Orgel87 have constructed systems of
hexa- and tetranucleotides (respectively) with the
ability to self-replicate nonenzymatically. Instead
of exponential growth, growth rate was found to
be proportional to the square root of the actual
concentration, that is, p = ½ in Eq. (1). This limited
growth is due to three factors: (1) only
single-stranded nucleic acid can act as template for
replication; (2) the concentration of single-stranded
templates is proportional to the square root of the
total concentration; and (3) the immediate prod-
uct of replication is a replicationally inert double-
stranded form. Because of such dynamics, there is
always an advantage of rarity: any species can invade
a population when rare.88–90 This is true not just for
p = ½, but for any value in the range (0, 1).
Whereas the exponential case (p = 1) means “survival
of the fittest,” the interval 0 < p < 1 corresponds to
“survival of everybody.” In this so-called parabolic
regime, an arbitrary number of competing populations
can coexist in a globally stable way.91 Note
that selectivity is nevertheless enhanced
relative to the linear growth case. If p = ½,
then the ratio of the equilibrium concentrations of two
competing species is the square of the ratio of their
kinetic rate constants.
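The contrast between the two regimes can be reproduced numerically. The sketch below integrates dx_i/dt = k_i x_i^p − φ x_i with a simple Euler scheme, where the outflow φ keeps the total concentration constant. The constant-total constraint, the function name `simulate`, and all parameter values are assumptions made for illustration only.

```python
def simulate(k, x0, p, dt=0.01, steps=20_000):
    """Euler integration of dx_i/dt = k_i * x_i**p - phi * x_i.

    phi is chosen so that the total concentration stays constant
    (an illustrative selection constraint, not taken from the text).
    """
    x = list(x0)
    for _ in range(steps):
        growth = [ki * xi ** p for ki, xi in zip(k, x)]
        phi = sum(growth) / sum(x)  # dilution flux balancing total growth
        x = [xi + dt * (gi - phi * xi) for xi, gi in zip(x, growth)]
    return x


k = [2.0, 1.0]
start = [0.999, 0.001]  # the slower replicator starts rare

parabolic = simulate(k, start, p=0.5)
exponential = simulate(k, start, p=1.0)
print("p = 1/2:", parabolic, "ratio:", parabolic[0] / parabolic[1])
print("p = 1:  ", exponential)
```

With p = ½ the rare, slower replicator invades and the concentrations settle at a ratio equal to the square of the ratio of the rate constants (here 4), whereas with p = 1 the same replicator is driven toward extinction.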
Although the regime of parabolic replication can
sustain an arbitrarily large diversity and could overcome
the restriction posed by Gause’s principle,
such replicators cannot be real information integra-
tors, as evolution cannot act on them. If we assume
that any new mutant is also subject to duplex for-
mation, it can gain no selective advantage, and thus
no evolution is expected to happen in such a regime.
This is because for Darwinian selection, exponential
growth of competing replicators is necessary.88,90
If parabolic growth is coupled with additional
physically and chemically feasible assumptions
(such as degradation and binding of replicators to
the surface in an adsorption–desorption process,
see Refs. 89, 92–94), the outcome of the dynam-
ics (whether it will be survival of everybody or the
fittest) becomes a quantitative issue, depending on
external parameters. Accordingly, such a system of
replicators would be able to switch between a co-
existence (parabolic) regime and a selective (Dar-
winian) regime, which could provide the necessary
selective edge for the system to become a real unit of
evolution.
Although Gause’s principle limits the number of
coexisting species by the number of independent
resources, there could have been many ecological
and dynamical factors that extend the number of
resources, and thus relax this limit. When it comes
to coexistence, molecular replicators are not that
much different from the multicellular organisms of
supraindividual biology; and thus the results of ecol-
ogy might apply. It has been demonstrated that ex-
trinsic variation in space and time,95,96 intrinsically
generated fluctuations,97,98 and chaotic mixing99 in-
troduce other regulating factors and can increase the
number of coexisting species. The possible role of
these factors in the prebiotic context is a subject for
further research. All this offers a glimmer of hope
that a variety of replicators could have coexisted in
plausible prebiotic environments, and they could
have evolved into more complex systems. From a
theoretical–ecological point of view, however, we
do not yet have a conclusive answer.
Error threshold
Too high a degree of variability undermines hered-
ity. This sounds obvious, but it was discovered rather
late that the mutation rate (which rises as
replication accuracy falls) sets a limit on the amount
of genetic information that can be maintained by se-
lection. Eigen85 was the first to analyze the amount
of maintainable information in the context of a re-
action kinetic model of molecular evolution. This
landmark study presents the flipside of the coin
of the mutational load, known to population ge-
neticists through the investigations of Haldane.100
Eigen’s theoretical model described the dynamics of
a large population of replicating sequences (geno-
types) in a well-mixed flow reactor. In case of error-
free replication, the equilibrium population consists
only of replicators with the highest fitness (assum-
ing only one fittest type: the master). If there is
even the smallest chance of mutation during repli-
cation, the mutation–selection balance results in a
new equilibrium. A cloud of mutants appears in the
mutational neighborhood of the master sequence
that nevertheless remains the most abundant. This
well-defined distribution of mutants (together with
the master phenotype) is the quasispecies, introduced
by Eigen and Schuster,101 and it becomes the
target of selection. By decreasing replication accu-
racy, the quasispecies collapses at a critical value
with a sharp transition; beyond this point, the mas-
ter sequence is lost. The system diffuses randomly
in genotype space, and further evolution is impossi-
ble, as no information can be selectively maintained.
This critical value of replication accuracy is the error
threshold.
The loss of information is inevitable in any such
mutation–selection system, but the exact position of
the error threshold depends on the fitness landscape
(i.e., the phenotype–fitness mapping) and parame-
ters of the population dynamics of replicators (e.g.,
degradation rate, population size, and interaction
between molecules). In case of the single-peak fit-
ness landscape of the original model (the master
sequence has fitness > 1, all others have 1), the critical
per-base replication accuracy ($q^{*}$) that defines
the error threshold can be approximated analytically
as $q^{*} = s^{-1/L}$ (where s is the selective superiority of
the master sequence). Assuming that the logarithm
of s is approximately 1, the maximum chain length roughly equals
the inverse of the mutation rate per site per replication.
Without peptide enzymes, the per-nucleotide copy-
ing fidelity is approximately 96–99%.102–104 This
approximation (as a rule of thumb) suggests a very
strict limit on the sustainable sequence length that
is far from what is thought to be necessary for mini-
mal life. This sets the so-called Eigen paradox—or in
the words of John Maynard Smith, “the ‘Catch-22’ of
the origin of life: no large genome without enzymes,
and no enzymes without a large genome.”105
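The rule of thumb can be made concrete with a small calculation. The sketch below evaluates the critical accuracy q* = s^(−1/L) and, inversely, the longest sequence that can be maintained at a given per-base accuracy q, namely ln s / (−ln q). The function names are our own, and the choice ln s = 1 follows the assumption stated above.

```python
import math


def critical_accuracy(s: float, length: int) -> float:
    """Per-base accuracy at the error threshold, q* = s**(-1/L)."""
    return s ** (-1.0 / length)


def max_length(s: float, q: float) -> float:
    """Longest maintainable sequence: the L that solves q**L * s = 1."""
    return math.log(s) / -math.log(q)


s = math.e  # ln s = 1, as assumed in the text
for q in (0.96, 0.99):
    print(f"q = {q}: maximum length of about {max_length(s, q):.1f} nucleotides")
print(f"q* for L = 100: {critical_accuracy(s, 100):.4f}")
```

For copying fidelities of 96–99%, the maintainable length is thus only about 25 to 100 nucleotides, close to the inverse of the per-site mutation rate.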
There are, however, many subtleties that must
be discussed to evaluate the severity of an early
error threshold. The single-peak fitness landscape
is an abstraction with limited biological relevance.
Although a huge body of literature deals with
calculating the error threshold for further fitness
landscapes, the selection criterion for each land-
scape was unfortunately almost always analytical
tractability106 and not biological relevance. For
example, the perturbation theory of quantum
mechanics can be used to estimate the equilibrium
distribution of concentrations in the quasispecies,
but this method is applicable only when all fitnesses
are different (for details, see Ref. 107). From a
biological point of view, this is rather implausible,
as it excludes individuals sharing the same baseline
fitness. Another example rests on the formal anal-
ogy between the two bases (purine and pyrimidine)
of a binary template and a 2D Ising system with
nearest-neighbor interaction. There is an exact
correspondence between the equilibrium properties
of the 2D Ising lattice and Eigen’s model108 (for
a more general statistical physics approach, see
Refs. 109–112). In this context, the error catastrophe
corresponds to the magnetic order–disorder transition.
Some partially analytically tractable solutions
for very simple fitness landscapes can be derived
using this analogy. However, the simplifications
of the fitness landscape required to achieve tractable
solutions make the model biologically implausible
(e.g., fitness decreases with the square root of the
Hamming distance from the master genotype,109 or
decreases in a stepwise manner110). Consequently,
in the general case, a solution is possible only either
by numerically integrating the set of differential
equations or via computing the leading eigenvalue
of the value matrix of the system (see, e.g., Ref. 107).
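As an illustration of the eigenvalue route, the sketch below finds the stationary quasispecies of a single-peak landscape by power iteration on the value matrix W = Q·diag(f). To keep it small, it makes several simplifying assumptions that are ours: a binary alphabet, a uniform per-site accuracy q, and sequences lumped into classes by their Hamming distance from the master. All names and parameter values are hypothetical.

```python
from math import comb


def class_mutation_matrix(length, q):
    """Q[i][j]: probability that copying a sequence with j mismatches to the
    master yields a sequence with i mismatches (binary alphabet assumed)."""
    Q = [[0.0] * (length + 1) for _ in range(length + 1)]
    for j in range(length + 1):
        for back in range(j + 1):              # mismatches reverted
            for fwd in range(length - j + 1):  # new mismatches created
                flips = back + fwd
                prob = (comb(j, back) * comb(length - j, fwd)
                        * (1 - q) ** flips * q ** (length - flips))
                Q[j - back + fwd][j] += prob
    return Q


def master_fraction(length, q, s, iterations=500):
    """Stationary share of the master class on a single-peak landscape,
    obtained as the leading eigenvector of W by power iteration."""
    Q = class_mutation_matrix(length, q)
    fitness = [s] + [1.0] * length  # master has fitness s, all others 1
    size = length + 1
    x = [1.0 / size] * size
    for _ in range(iterations):
        y = [sum(Q[i][j] * fitness[j] * x[j] for j in range(size))
             for i in range(size)]
        total = sum(y)
        x = [v / total for v in y]
    return x[0]


# For L = 20 and s = 10 the threshold is q* = 10**(-1/20), roughly 0.89.
for q in (0.95, 0.85):
    print(q, master_fraction(20, q, s=10.0))
```

Above the threshold the master class retains a substantial share of the population, whereas below it the share collapses toward the value expected for a random sequence. For realistic landscapes the same procedure applies, but the matrix has to be built from the actual genotype fitnesses rather than from Hamming classes.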
As already discussed, it is not a particular se-
quence (genotype) but a function (phenotype)
that needs to be replicated. Thus, instead of a
genotypic error threshold, we should look for
the phenotypic error threshold—the critical mu-
tation rate above which the functional phenotype
cannot be maintained selectively. As the num-
ber of structures is considerably fewer than the
number of sequences, genotypes sharing the same
phenotype form a neutral network (or neutral
set) in the genotype space. The percolated topol-
ogy of neutral sets allows for easier evolution-
ary adaptation. Finding a given secondary struc-
ture (function) by a mutation–selection process
is easier than expected,113–115 and losing an al-
ready acquired function is also less probable. There-
fore, the so-called phenotypic error threshold is
more permissive than the original (genotypic) error
threshold.
The connectivity of neutral paths characterized
by the fraction of mutants having the same pheno-
type can account for the more permissive pheno-
typic error threshold.116,117 Below a critical level of
replication accuracy (the phenotypic error thresh-
old), the population diffuses randomly over the
whole genotype space, and the master phenotype
is lost. At a relatively high replication accuracy, the
population randomly drifts on the neutral network
of the master phenotype, preserving the secondary
structure.118 Traversing the neutral network is not
entirely random; instead, the population tends to
move to a highly connected part of the neutral set115
(Fig. 1). A reliable estimate of the structure of neu-
tral networks can only come from fitness landscapes
based on real-world data. Available data on the activ-
ity of mutated hairpin ribozyme62 and Neurospora
VS ribozyme63 allow the construction of a fitness
landscape.119 The landscape is such that structure
is the most important predictor of function and,
thus, fitness, albeit at certain sites the chemical nature
of the base is constrained (so-called critical
sites,119 e.g., the active sites or the substrate-binding
parts of the ribozymes). Thus, some part of the
structural neutral paths cannot be traversed. This
was also demonstrated recently in vivo by mapping
the fitness landscape of GTP aptamers of up
to 24 nt in length.120 The phenotypic error threshold
allows sequences nearly an order of magnitude longer121
than presumed from Eigen’s model (i.e., 700 vs.
100 nt with a 10⁻² error rate) to be maintained. The
whole genetic material required for a minimal ribo-organism,
however, cannot be replicated unless the
error rate falls below 10⁻³. Individual ribozymes,
even relatively long ones like replicases,
can nevertheless be stably replicated at this accuracy.122
Figure 1. The evolution of RNA phenotypes. Although the
evolutionary search in the sequence space (represented by the
rectangular grid; the evolutionary trajectory is represented by a
thick blue line, lower panel) is continuous, multiple sequences
may form similar phenotypes (represented by the different shading
colors on the grid), hence the fitness increase is nonlinear
(blue line in the upper panel).
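The lengths quoted above can be related to the classical threshold condition, under which the longest maintainable sequence is roughly ln(s) divided by the per-nucleotide error rate, where s is the selective superiority of the master. The sketch below is only an arithmetic illustration. It assumes ln(s) = 1 (the value at which a 10⁻² error rate gives 100 nt), and it models neutrality with a crude first-order approximation in which a fraction of mutations leaves the phenotype intact. The neutral fraction of 6/7 is a hypothetical value chosen because it reproduces a sevenfold relaxation under this approximation; it is not taken from Ref. 121, whose estimate rests on an empirical fitness landscape.

```python
import math


def max_length(error_rate, superiority=math.e, neutral_fraction=0.0):
    """Approximate longest maintainable sequence.

    Uses L_max ~ ln(s) / ((1 - lam) * mu), where mu is the per-nucleotide
    error rate, s the superiority of the master, and lam the assumed
    fraction of mutations that are phenotypically neutral.
    """
    effective_rate = (1.0 - neutral_fraction) * error_rate
    return math.log(superiority) / effective_rate


print(max_length(1e-2))                         # genotypic threshold
print(max_length(1e-2, neutral_fraction=6 / 7))  # with assumed neutrality
print(max_length(1e-3))                         # tenfold better accuracy
```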
Selectively maintainable information can be fur-
ther increased by taking into account the stalling of
replication after mismatch. Stalling after a Watson–
Crick base pair mismatch has been observed for
many DNA polymerases123–125 (the factor of slowdown
is between 10 and 10⁶). Irene Chen’s lab126,127
has demonstrated that, in the case of nonenzymatic
polymerization of DNA, the speed of polymeriza-
tion slows down by one or two orders of magnitude
after base-pair mismatches. Furthermore, “exten-
sion of RNA is relatively slow after a mismatch, to
a roughly similar extent as DNA.”127 Thus, accu-
rate copying without mismatch has an advantage of
faster replication. Remember that the original Eigen
model assumed that the speed of polymerization
as such is not affected by accuracy. Consequently,
if stalling is taken into account, more information can be
maintained, and thus the error threshold is miti-
gated.
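The replication-speed advantage of accurate copies can be illustrated with a back-of-the-envelope calculation. The sketch assumes that every extension step takes one time unit, except that the step following a mismatch takes `stall_factor` units; the template length of 100 and the stall factors are hypothetical values within the one to two orders of magnitude quoted above.

```python
def copy_time(length, mismatches, stall_factor):
    """Time to copy a template, in units of one normal extension step."""
    normal_steps = length - mismatches
    return normal_steps + mismatches * stall_factor


def slowdown(length, mismatches, stall_factor):
    """How many times slower an erroneous copy is than an error-free one."""
    return (copy_time(length, mismatches, stall_factor)
            / copy_time(length, 0, stall_factor))


for stall in (1, 10, 100):
    print(stall, slowdown(100, 1, stall))
```

A stall factor of 1 recovers the original assumption that accuracy does not affect speed. With a hundredfold stall, a single mismatch roughly halves the replication rate of a 100-nt copy under these assumptions.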
Recombination—a mechanism usually ignored
in the study of early molecular evolution—could
have had its role in the alleviation of the error
threshold.128 Santos et al.129 have found a bene-
ficial effect of recombination on the sustainable
genome size. The authors assumed compartmen-
talized populations of genes, with internal compe-
tition among unlinked templates. Recombination
during the replication of a gene was allowed. An in-
crease of roughly 30% in length could be achieved by
recombination.
Mutation rate sets a limit to the length of an
RNA molecule that can be faithfully copied. Repli-
cation accuracies at the dawn of life were not suf-
ficiently high to stably replicate all the necessary
genes strung into a chromosome. Maintenance of
structure, coupled with stalling at mutations and
recombination between different copies of the same
gene, can relax the error threshold to the level where
individual genes can be faithfully replicated.
Maintenance of the RNA world:
coexistence and evolvability of early
replicators
...differentiation is the necessary condition for
coexistence.
G. Hardin, Science, 1960
The error threshold, as we discussed in the previ-
ous section, prevents the stable maintenance of in-
formation above a certain size.85,121,130,131 Although
the whole of genetic information cannot be accu-
rately copied as one molecule, a coexisting set of
shorter replicators can still provide the same infor-
mation content. Although such a collection could
overcome the error threshold, it also introduces
a new problem. During replication, all replicator
types have to be replicated together to maintain the
complete information content of the system. This
requirement poses a serious problem, as replica-
tors competing for common resources are subject
to competitive exclusion, which ultimately means
the survival of only as many replicators as the num-
ber of resources.
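Competitive exclusion on a single shared resource can be demonstrated with a minimal simulation. The model below is an assumption made for illustration, not one taken from the cited literature: two replicators grow at a rate proportional to the free resource, decay at a common rate, and the resource is whatever part of a fixed total is not bound in replicators. All parameter values are hypothetical.

```python
def compete(rate_constants, decay=0.1, total=1.0, dt=0.05, steps=40000):
    """Euler integration of dx_i/dt = x_i * (k_i * R - decay).

    R = total - sum(x) is the free resource shared by all replicators.
    Returns the final concentrations, in the order of rate_constants.
    """
    x = [0.01] * len(rate_constants)
    for _ in range(steps):
        resource = total - sum(x)
        x = [xi + dt * xi * (k * resource - decay)
             for xi, k in zip(x, rate_constants)]
    return x


slow, fast = compete([1.0, 1.2])
print(f"slow replicator: {slow:.3e}, fast replicator: {fast:.3f}")
```

The faster replicator draws the free resource down to a level at which the slower one can no longer offset its decay, so only one type survives on the one resource.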
A feasible solution for coexistence is when the
full information content of such a system is shared
among functionally interacting shorter replicators.
Phenotypically different replicators assemble to
create a molecular community in which each mem-
ber is a replicator and is essential for the mainte-
nance of the whole system, thus it is a collectively
autocatalytic system. A functionally coupled replicator
system is vulnerable to any member that does not
contribute to the maintenance of the whole commu-
nity, jeopardizing the integrity of the system. There-
fore, the resistance against such parasites has to be
in focus when the coexistence of early replicator
communities is investigated. Note that coexistence
in prebiotic molecular communities touches on the
same or similar mechanisms as coexistence in ecological
contexts (for a review of coexistence in ecological
settings, see Ref. 84). Below, we discuss three
hypotheses for the coexistence of early replicators
and ask whether it is possible to achieve and selectively
maintain the molecular diversity required
to advance to the next stages of the RNA world.
The hypercycle versus cross-catalytic
networks
The hypercycle was the first theoretical model in
which functionally coupled replicators could form
a molecular community.85,101,130 In the original
model, an arbitrary number of replicators is directly
linked together to form a cyclic catalytic loop; thus,
each member of the loop catalyzes both its own
replication and the replication of the next member.
Accordingly, members of the hypercycle are auto-
catalytic individually and collectively, thus forming
a cooperative system (see Fig. 2A). This hypercyclic
connection is responsible for the stable coexistence
of replicators.130 The hypercycle is indeed ecologi-
cally stable.
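The ecological stability of a small hypercycle can be checked numerically. The sketch below uses the commonly studied elementary form, in which the growth of member i is proportional to its own concentration (as template) and to that of the preceding member (as catalyst), with an outflow term that keeps the total concentration constant. Equal rate constants, the three-member cycle, and the starting concentrations are assumptions made for the example.

```python
def hypercycle(x, rate_constants, dt=0.01, steps=20000):
    """Euler integration of dx_i/dt = x_i * (k_i * x_{i-1} - phi).

    phi is the mean productivity, which keeps the total concentration at 1.
    """
    x = list(x)
    n = len(x)
    for _ in range(steps):
        # x[i - 1] with i = 0 is x[-1], which closes the catalytic loop.
        gain = [rate_constants[i] * x[i - 1] for i in range(n)]
        phi = sum(xi * g for xi, g in zip(x, gain))
        x = [xi + dt * xi * (g - phi) for xi, g in zip(x, gain)]
        total = sum(x)
        x = [xi / total for xi in x]  # remove numerical drift
    return x


print(hypercycle([0.6, 0.3, 0.1], [1.0, 1.0, 1.0]))
```

Starting from unequal concentrations, the three members settle at equal shares, in contrast to uncoupled replicators competing for the same resource. The sketch contains no mutation and no stochasticity.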
There are, however, two major issues with the
hypercycle. First, if any of the members is diluted
because of stochastic effects, this can ruin the whole
hypercycle when it is running at low concentrations
(cf. demographic stochasticity). Second, the origi-
nal model has a serious oversight in not including
mutations,132,133 thus leaving an enormous theo-
retical gap in the explanation of the evolutionary
origin and survival of the hypercycle in a biologi-
cally relevant way. Allowing mutations in a hyper-
cycle can give rise to various mutants. Any selfish
mutant that is a better target for replication will
destroy the hypercycle by channeling resources out
of the cooperative cycle (toward the parasites; see
Fig. 2B). Furthermore, short-circuit mutants introduce
shortcuts in the reaction loop, cutting off some
members of the original cycle and thus reducing the
diversity maintained. Moreover, any mutant with
better catalytic activity would not increase the efficiency
of the system, as hypercycles are not evolutionary
units and such mutants cannot be selected for.132–134 In conclusion,
hypercycles are not able to overcome the
danger of information decay. They cannot compete
against harmful parasites. Moreover, although
members of the hypercycle can be units of evolution,
the cycle as a whole is not, as it is not subject
to selection with heritable variability.135,136
Figure 2. The interaction between replicators can be direct
or indirect. (A) In the hypercycle model, replicators catalyze
the replication of the next molecule; thus, they have a direct
effect on the replicative success of another member of the replicator
community. (B) Parasites of the hypercycle interrupt the
cooperation of replicators either by creating shortcuts (I) or
by accepting catalytic help without reciprocating it (II). (C)
The metabolically coupled replicator system is built on the assumption
that the interaction among members of the replicator
community is indirect and all members contribute to a common
metabolism that replicators feed on. (D) A parasite in the
metabolically coupled replicator system consumes products of
the common metabolism without contributing to production.
The classical ecological solution to temper the in-
vasion of parasites is to assume local effects (spatially
explicit models) that, however, could only provide a
partial solution for the hypercycle.137 In the model
of Boerlijst and Hogeweg, local replicator interac-
tions produce moving spiral waves in which selfish
parasites move out to the edge of spiral arms to
finally die out. In a specific case, if parasites are
dropped exactly into the center of spiral waves, they
can survive in an inert cyst. Unfortunately, even if
the selfish parasite is contained, the hypercycle cannot
be maintained stably when short-circuit parasites
appear, in either spatially implicit or spatially explicit
models.137 A further problem with this mechanism
to save the hypercycle is that it is extremely
fragile. Random perturbation in the adhesion of the
replicators in the different patches of the surface ru-
ins the spirals, and with them goes the resistance
against parasites.138
On the experimental side, it should be noted
that, contrary to erroneous claims, no instantia-
tion of the molecular hypercycle has been realized
(Szathmáry77 presents a survey of the propagation
of this conceptual error). In the hypercycle, replica-
tion is a second-order process: template replication
is catalytically aided by the previous member in the
cycle. In cross-catalytic systems, members aid the
formation rather than the replication of other mem-
bers. The first such system was realized in the von
Kiedrowski lab;139 a recent, more complex example
was presented by Vaidya et al.76 An important ques-
tion is the evolvability of the latter system, because
template replication is not a component process.
Surface-bound replicators
Surfaces, besides their favorable kinetic and
thermodynamic effects on an unfolding chemical
network,140,141 have an important role in provid-
ing population structures in which evolution is
known to proceed differently from its course in
a well-mixed flow reactor (cf. Fig. 3). A potential
interaction network was explored by the metaboli-
cally coupled replicator system (MCRS)142 (see Fig.
2C). Replicators in the MCRS interact with each
other indirectly; namely, every replicator catalyzes
only one reaction in a hypothetical metabolic re-
action network carrying out monomer production,
but all of the replicators are essential, otherwise
monomer production breaks down. Moreover,
replicators compete for monomers, and replicators
with higher replication rates can utilize monomers
faster and can become dominant in the system. In
the spatially implicit version of the MCRS, there
is no compensatory mechanism against superior
replicators. Therefore, they competitively exclude
all other replicators, and the metabolic—and hence
the replicator—system collapses.142 In the spatially
explicit model (metabolic replicator model, MRM),
however, replicators stably coexist in most parts
of the parameter space.142,143 Local interactions
and limited mixing of replicators in the spatially
explicit model ensure that the metabolic network
is more likely to be complete in the neighborhood
of rare replicators than in the vicinity of dom-
inant replicators (see Fig. 3B), thus providing a
control over the dominant species (advantage of
rarity).142,143
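The advantage of rarity can be made concrete with a small probability calculation. The sketch is a deliberate simplification and not the lattice model of Refs. 142 and 143: it assumes that the eight neighbors of a focal replicator are drawn independently from the global type frequencies (ignoring spatial clustering and empty sites), and it asks how likely it is that the focal replicator together with its neighbors covers every metabolic function. The frequencies used are hypothetical.

```python
from itertools import combinations


def completeness_probability(focal, freqs, neighbors=8):
    """Chance that a focal replicator of type `focal` sits in a complete
    metabolic neighborhood, i.e., every other type occurs among its neighbors.

    Computed by inclusion-exclusion over the sets of types that could be
    missing; `freqs` are the global frequencies of the replicator types.
    """
    required = [t for t in range(len(freqs)) if t != focal]
    prob = 0.0
    for k in range(len(required) + 1):
        for missing in combinations(required, k):
            p_avoid = 1.0 - sum(freqs[t] for t in missing)
            prob += (-1) ** k * p_avoid ** neighbors
    return prob


freqs = (0.8, 0.1, 0.1)  # one dominant and two rare replicator types
print("dominant focal:", completeness_probability(0, freqs))
print("rare focal:    ", completeness_probability(1, freqs))
```

Under these assumptions a rare replicator finds a complete neighborhood more often than a dominant one, because the dominant type it needs is almost always nearby, whereas the dominant replicator needs both rare types at once.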
The MCRS has a double advantage against par-
asites over the hypercycle. Because the main cou-
pling is indirect via metabolism, the short-circuit
parasite (in contrast to the hypercycle case) has no
meaning (see Fig. 2D). Moreover, harmful effects
of parasites occur only locally in the MRM. Par-
asites overwhelm—owing to their higher replica-
tion rate—their own local metabolic community
and break the metabolic process down, terminating
their own replication as well (cost of commonness).
As long as parasites are able to “infect” new local
metabolic communities, they coexist permanently
with metabolic replicators.142,143
Consequently, the MCRS has the ability to
incorporate a new replicator (i.e., a new function-
ality) as long as it does not impair the established
metabolic process. Therefore, a new replicator, even
a parasitic one, can permanently coexist with the
metabolic replicators. Moreover, evolution is able to
transform parasites into beneficial members of the
system without inhibiting the metabolic process.144
A related model concerns the spread of efficient
replicase ribozymes on surfaces. In a well-mixed
case, shorter, dysfunctional molecules would displace
longer competent replicases by virtue of
their faster replication rate. This is not so in the surface
model of Szabó et al.145 with limited diffusion.
Local accumulation of parasites is self-limiting, be-
cause in such a patch an average parasite finds only
other parasites around itself, and thus cannot repli-
cate. In the model, elongation activity and accuracy,
enzyme and replication rates, and template are in a
three-way trade-off. Despite this severe constraint,
a stable bimodal distribution of short parasites and
long replicases emerges as a result of simulated
evolution.
To summarize, local indirect interactions and
limited mixing of replicators are required for the co-
existence of genes. The presence of local interactions
is one of many properties linking theoretical and ex-
perimental prebiotic approaches. Mineral surfaces
could have played an influential role in the evolution
of prebiotic information-carrying molecules at multiple
levels. They may have been responsible for the
homochirality of nucleotides,146,147 may have catalyzed
the polymerization of monomers,57 and/or
may have protected polymers from degradation.148
The properties of mineral surfaces, coupled with the
theoretical demonstration of potential replicator coexistence,
hint that life may have originated on surfaces,
most probably without a soup phase (albeit
chemical intermediates could have formed in the
prebiotic ocean, or even in the atmosphere149–151).
Figure 3. Schematic representation of population structures in models of prebiotic replicators. (A) In a well-mixed model, no
population structure is assumed, hence free movement of molecules is allowed. (B) In the case of surface-bound molecules, the
replicators have a limited number of immediate neighbors and their translocation (dispersal, diffusion) is limited. (C) The stochastic
corrector model assumes that replicators multiply inside vesicles with a membrane boundary, and successful replication of the
community will accelerate vesicle growth and division, which defines the fitness of these protocells.
Active compartmentalization
Surface-bound replicators could have kick-started
life, but the number of coexisting replicators, hence
metabolic complexity, was limited.143 Compart-
mentalization provides a more articulated popu-
lation structure, and it has further advantages by
effectively increasing local concentrations within
the small volume of cells compared to free solu-
tions, which significantly improves the efficiency
of (bio)chemical reactions.152 It can also pro-
vide an efficient way to spatially segregate dif-
ferent genomes composed of several unlinked
replicators.153 Surface-bound models often assume
that small molecules produced locally do not diffuse,
or do not diffuse faster than the macromolecules cat-
alyzing their formation. This is unrealistic because
small molecular products probably leak from the
system. Properly compartmentalized catalysts can
benefit from the products of their own reactions, or
those of their cooperative partners.
The transition from surface-bound to compart-
mentalized replicators is also the transition to the
first living system, and the first major evolutionary
transition proposed by Szathmáry and Maynard
Smith.154 Thus far, we have only considered RNA
replicators; and although RNA can fulfill roles of
both template and catalyst, it cannot form mem-
branes. The first membranes were most probably
single-chain fatty acids.153,155–158 Once simple am-
phiphilic molecules were present, spontaneous for-
mation of vesicles became possible under specific
circumstances.155,157,159 Interestingly, spontaneous
membrane assembly is also catalyzed by certain
mineral surfaces,160 which offers a conceptual bridge
between the surface-bound and encapsulated repli-
cators in the evolution of the RNA world. A fea-
sible mechanism to encapsulate macromolecules
into compartments was proposed by Deamer’s
lab.161,162 A drying–wetting cycle was implemented,
in which empty compartments and macromolecules
(e.g., RNA) are mixed in a solution that is then
dried, in which phase the compartments dehy-
drate and produce multilamellar structures, with
macromolecules among the layers. Then, in the wet-
ting phase, the compartment is rehydrated, which
means new compartments are formed, encapsulat-
ing macromolecules.
Compartmentalization of the individually repli-
cating genes provides the basis of the first living cells.
A minimal living system fulfilling all criteria of life
was proposed by Gánti10,163,164: the chemoton (since
its reconceptualization in 1975) has a membrane,
an information subsystem, and a metabolism. In
2001, Szostak et al. proposed a very similar con-
struct, the protocell,165,166 in which one ribozyme
synthesizes the membrane components and another
is responsible for genome replication. Remarkable
experimental advances have been made in recent
years toward the in vitro realization of such minimal
systems.155,167
In silico investigations of (proto)cells could also
provide valuable insight into the problems faced
by these early systems: how the lipid bilayer
could self-assemble from the metabolic products
of the vesicle;168 how membrane permeability af-
fects metabolism;169 or how vesicles transform
and divide.170 Furthermore, compartmentalization
models can address the problem of maintaining the
genetic information and the effect of parasites—
which will be discussed later.
To understand the mechanisms behind the coex-
istence of genes in a compartmentalized system, a
simple yet effective model—the stochastic corrector
model (SCM)—was proposed.171,172 In the model,
it is assumed that encapsulated replicators catalyze
their own replication, as well as the growth of the
membrane (and thus the cell as a whole). As a further
natural assumption, competition is allowed among
replicators. There is an optimal replicator composi-
tion that yields the fastest cell growth. The cells are
in selective disequilibrium, thus maintaining a vari-
ety of different compositions. At a critical size, cells
undergo fission and form two daughter cells, with
random segregation of replicators. Because the
replicators replicate individually, faster-growing
ones can be overrepresented in the offspring. Several
mechanisms act against coexistence, such as the in-
ternal competition of the replicators for monomers,
competition for the replicase, and the potential for
gene loss because of random assortment of genes
to daughter cells. Nevertheless, both the stochastic-
ity of replication dynamics and the stochasticity of
cell division increase variability among the cells, and
thus selection can act on this variation. It was shown
that the stochastic nature of the daughter cell com-
position in fact facilitates coexistence, as, by chance,
daughter cells could inherit a balanced gene set even
if the parental cell had a suboptimal distribution of
genes.171 Hence, the name of the model: stochastic
corrector (Fig. 3C). This process protects the pop-
ulation from extinction and results in evolutionary
dynamics yielding a stable quasispecies at the level
of compartments.172 The SCM is inherently stable
against parasites. It has the ability to select against
inferior and for superior mutants.173–176 The cells
of the SCM are individuals subject to selection,
and thus are evolutionary units. Selection at the
level of compartments can be considered as group
selection177,178 because (1) the number of templates
in protocells is much smaller than the number of
compartments; (2) each protocell has only one par-
ent; and (3) there is no migration among groups.
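The corrective effect of random segregation can be quantified for a single division. The sketch assumes that every molecule in the parent goes to a given daughter independently with probability 1/2, and that a "balanced" composition means equal, nonzero copy numbers of every replicator type. Both are simplifications of the SCM made for illustration, and the parental composition is hypothetical.

```python
from math import comb


def daughter_composition_probability(parent, daughter):
    """Probability that one daughter receives exactly `daughter` copies of
    each type when every molecule of `parent` goes to it with chance 1/2."""
    prob = 1.0
    for n_parent, n_daughter in zip(parent, daughter):
        prob *= comb(n_parent, n_daughter) * 0.5 ** n_parent
    return prob


def balanced_daughter_probability(parent):
    """Chance that a daughter carries equal, nonzero counts of every type."""
    return sum(daughter_composition_probability(parent, (k,) * len(parent))
               for k in range(1, min(parent) + 1))


def gene_loss_probability(parent):
    """Chance that a daughter lacks at least one replicator type."""
    keep_all = 1.0
    for n_parent in parent:
        keep_all *= 1.0 - 0.5 ** n_parent
    return 1.0 - keep_all


skewed_parent = (6, 2)  # six copies of one gene, two of the other
print(balanced_daughter_probability(skewed_parent))
print(gene_loss_probability(skewed_parent))
```

Even a parent skewed 3:1 produces a balanced daughter in roughly one division out of ten under these assumptions, while about a quarter of the daughters lose a gene. Selection among compartments then favors the former and removes the latter.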
Compartmentalization can also save the
hypercycle.179–181 If a favorable mutant appears in a
compartment after random segregation of templates
into daughter cells, there is a chance of appearance
of a superior template composition that can out-
compete compartments with inferior compositions.
Consequently, the compartmentalized hypercycle can be
the subject of selection, and can integrate information
successfully. Note, however, that the reaction
topology assumed in the MCRS tolerates
higher mutational loads than a compartmentalized
hypercycle of similar gene diversity,182 making the
hypercycle still a less favored model of early infor-
mation integration.
The above models mostly focused on the coexis-
tence of only a few (1–3) genes in the face of stochas-
ticity and parasites. An important question arises
next: How many genes can actually coexist within
compartments? In an infinite population with repli-
cators having exactly the same replication rate (i.e.,
there is no internal competition), an arbitrary number
of genes can coexist.183 Finite population size and
internal competition, however, lead to a finite maximum
maintainable gene number. Recently, Hubai
and Kun184 have shown that as many as 100 genes
can coexist within the SCM.
Later, we will discuss whether 100 genes are
enough for a minimal ribo-organism. Nevertheless,
we can conclude that the compartmentalized sys-
tems are not just stable solutions against parasites,
but are also capable information integrators and are
units of evolution.
Enhancing the RNA world: chromosomes
and metabolism
Biologists must first of all be concerned with
this chemical motor, since the system of chemical
cycles is the basis of the functioning of life.
T. Gánti, The Principle of Life, 1987
Limited diffusion or compartmentalization al-
lows for some genes to coexist. Although we have
argued that a single RNA-dependent RNA replicase
is sufficient for the start of evolution, a functional
ribo-organism requires considerably more enzymes.
Comparative analyses of bacteria and theoretical
considerations place the minimal set of genes for a
present-day organism at around 200. Many of these
encode enzymes involved in translation and
DNA metabolism,185 and thus are not required for a
ribo-organism. Even among the 50 genes suggested
as the minimum for an intermediate metabolism186
are genes for the conversion of ribonucleotides to
deoxyribonucleotides. Thus, the minimal gene set
for a functional ribo-organism, that is, an organism
having an RNA genome and RNA enzymes, might
lie in the 60–100 range. Compartmentalized systems
can harbor such a diverse set of genes.
The evolution of the chromosome is the next step
toward more complex life forms, as internal competition
and the threat of stochastic loss of genes
limit the number of individually replicable genes.
The invention of the chromosome allows the further
expansion of metabolism to the point where the evo-
lution of the genetic code, and then translation, be-
come feasible. Below, we discuss the challenges the
RNA world must have faced in its evolution from
the first cells, through complex metabolism, to the
RNA–protein world.
Chromosomes
The quest for understanding the chromosome,
a single RNA molecule containing all necessary
genes (ribozymes) of the organism, has been behind
the scenes in the previous sections. Once
the replication apparatus can copy RNA with
all genes simultaneously and with sufficient fi-
delity, the problem of individually replicating genes,
and with it Eigen’s paradox, is gone. As an ad-
dition to a highly accurate replicase, a chro-
mosome requires an endonuclease that cleaves
the embedded genes (ribozymes) from it. Evolv-
ing RNA-cleavage capability is not very compli-
cated; in fact all naturally occurring ribozymes
can do it.21 Simple structural motifs exhibited by
the hairpin or the hammerhead ribozymes (the
smallest natural ribozymes) are common even in
random pools of short RNA molecules.61 Thus,
chromosomes evolve once replication is accurate
enough (see Ref. 187 for plausible molecular steps
leading to the establishment of chromosomes from
unlinked, replicating ribozymes).
This evolutionary transition probably did not re-
sult in (bacterial) chromosomes as we know them
now: a single copy of genes per cell that precisely
double for cell division. An intermediate evolu-
tionary step could have been individually repli-
cating chromosomes, when all genes are linked
together. In such an ensemble, no chromosome can
“cheat” by replicating faster than the others. Fur-
thermore, the inheritance of a full set of genes is
ensured if at least one copy of the chromosome
gets into the daughter cell. The reliable allocation
of two chromosomes to the two daughter cells re-
quires a separator mechanism, such as the one pro-
vided by the cytoskeleton, which is probably a late
prokaryotic invention. Without the evolved facili-
ties of a cytoskeleton-like system, multiple copies
of chromosomes might still be able to ensure that
no daughter cells end up missing any gene. We can
argue that if there is a higher number of copies of
the chromosome, and chromosomes are assigned
to daughter cells randomly, both daughter cells will
have at least one set of genes with high probabil-
ity. For example, with seven chromosomes per cell,
the chance of an empty daughter cell is less than
1% (binomial distribution, with P=0.5). A system
with randomly assorting chromosomes can actu-
ally outcompete cells with individually replicating
genes.188 But the aforementioned seven copies per
cell are still more than the two copies required for
cell division. One of them goes to one of the daugh-
ter cells, and the other to the other daughter cell.
As simple as it sounds, it requires either a cytoskeleton
(like in contemporary organisms189,190) or attachment
to the cell membrane (as in the replicon
model191). As RNA polymerases are powerful mo-
tors, they could have exerted force when traveling
along the strand being copied. As the membrane was
growing simultaneously, it could have aided the seg-
regation of chromosomes by letting them move to
opposite poles of the early cell (cf. bacterial chromo-
some segregation192,193). Thus, an RNA polymerase
ribozyme, besides replicating the genetic material,
could have pushed the two copies to opposite ends
of the cell, ensuring cell division.
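The binomial figure for seven chromosomes quoted above is easy to verify. The sketch assumes that each chromosome copy is assigned independently to one of the two daughters with probability P = 0.5, and it reports two readings of "an empty daughter cell": the probability for one specified daughter, and the probability that either of the two is empty.

```python
def empty_daughter_probabilities(copies, p=0.5):
    """Return (P[a specified daughter is empty], P[either daughter is empty])
    for `copies` chromosomes, each sent to the specified daughter with
    probability p. Valid for copies >= 1, where the two events are exclusive."""
    given = (1.0 - p) ** copies
    either = p ** copies + (1.0 - p) ** copies
    return given, either


for n in (2, 4, 7, 8):
    given, either = empty_daughter_probabilities(n)
    print(f"{n} copies: specified daughter {given:.4%}, either daughter {either:.4%}")
```

The sub-1% value for seven copies corresponds to the first reading. If either daughter being empty is counted, one further copy is needed to fall below 1% under the same assumptions.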
The main obstacle in the path to a chromosome
is the error threshold. When gradual increase of the
fidelity of replicase overcame the critical threshold
above which the whole genetic material could be
replicated as one molecule, chromosomes evolved.
The evolution of accurate chromosome segrega-
tion and bacterial-type cell division remain to be
elucidated.
Enzymatization
Metabolism is the fundamental core function of the
living cell.32,163,194 As we have argued previously,
there must have been a small but essential set of
molecules that catalyzed a minimal metabolism,
even at the surface-bound stage. At least a mini-
mal metabolism was required for RNA replication.
Metabolism, however, might sound like a bewilder-
ingly complex network of reactions, as it usually
is for contemporary species. How could evolution
have proceeded, then, from a few coexisting genes
catalyzing their own replication to a complex and
intertwined metabolism with a multitude of special-
ized enzymes? There are three main angles that aim
to explain the origins of a complex metabolism: (1)
discovering the catalytic repertoire of ribozymes; (2)
assembling reaction networks; and (3) understand-
ing the increasing specificity of enzymes.
First, the catalytic repertoire of ribozymes shows
that almost all reactions necessary for a ribo-
organism can be catalyzed by ribozymes. The real
challenge is to develop an efficient and accurate
replicase ribozyme. Unfortunately, at the moment
there is no known ribozyme that can stably repli-
cate itself. However, template-directed polymeriza-
tion was proven to be possible,103,195 although only
up to 14–20 nucleotides could be copied. The copy-
ing fidelity of these ribozymes is around 96% per
nucleotide per copying. Efficiency was further en-
hanced to copy 98 nucleotides,104 with accuracy in-
creased to 99%. These ribozymes are around 200
nucleotides long, and thus they are not able to repli-
cate themselves.
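The fidelity figures can be set against the ribozyme length with simple arithmetic. The sketch assumes that errors occur independently at each position and asks what a full-length copy of a 200-nt ribozyme would look like at the two reported per-nucleotide fidelities. This is a hypothetical extrapolation, since the ribozymes in question cannot yet copy templates of that length.

```python
def error_free_copy_probability(fidelity, length):
    """Chance that a copy of `length` nucleotides contains no error,
    assuming independent errors with per-nucleotide accuracy `fidelity`."""
    return fidelity ** length


def expected_errors(fidelity, length):
    """Mean number of miscopied positions per copy."""
    return (1.0 - fidelity) * length


for fidelity in (0.96, 0.99):
    print(fidelity,
          error_free_copy_probability(fidelity, 200),
          expected_errors(fidelity, 200))
```

Under this assumption, raising the fidelity from 96% to 99% changes an error-free 200-nt copy from a very rare event to one occurring in more than a tenth of copies.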
Recent advances show that, in principle, ribozymes
can catalyze the template-directed polymerization
of long sequences (up to 206 nucleotides, thus long
enough for the replicase ribozyme;196 this particu-
lar ribozyme can only replicate a very specific se-
quence). Furthermore, a shorter ribozyme could
also act as a replicase,197 albeit with unknown fidelity
or processivity. These experiments lend credence to
theoretical models that the gradual refinement of
copying fidelity is possible and the error threshold
can be overcome.145,198 Recent advances seem to in-
dicate that a self-replicating ribozyme is just around
the corner, although the research to realize it took al-
most two decades, which also indicates that a general
replicase ribozyme is not something easily evolved.
Surface-bound metabolism can enhance the for-
mation of RNA strands. Apart from the replicase,
the availability of nucleotides is critical. Let us as-
sume that nucleobases and ribose are available from
the environment. To form activated monomers, the
sugar needs to be phosphorylated twice and then
the constituents need to be put together to form the
nucleotide. Kinase ribozymes199 could first produce
the d-ribose-5-phosphate and then 5-phospho-d-
ribose-1-diphosphate (PRPP). Ribozymes can cat-
alyze the formation of the glycosidic bond be-
tween PRPP and either a pyrimidine200–202 or a
purine202,203 nucleobase. Almost all biologically im-
portant reactions could be catalyzed by ribozymes
to some extent.39
Upon leaving the mineral surface, replicators
were probably encapsulated into vesicles. Com-
partmentalization raised new problems, for ex-
ample, that of permeability: how could small
molecules (raw materials and waste) cross the membrane?
Although RNA molecules cannot be proper trans-
membrane molecules, it is possible to select for
oligonucleotide sequences that efficiently bind to
membranes,46,47,204–207 presumably in the form of
collaborative hetero-oligomeric complexes.47,204,205
These complexes can significantly change the
permeability of membranes for larger ionic
compounds46,204,208 and serve as specific trans-
porters for more complex compounds, such as
amino acids.47 Nucleotides can spontaneously dif-
fuse across fatty acid membranes.155,209 Interest-
ingly, ribose has the best permeability coefficient
among aldopentoses and hexoses, both for fatty acid
and phospholipid membranes, which promotes its
accumulation within the protocells.210 If one con-
siders the formose reaction211 as a possible prebiotic
pathway for autocatalytic carbohydrate synthesis,
such passive sorting and accumulation of ribose—
one of many products of the formose reaction—in
membrane-bound vesicles could have supplied ribose
for nucleotide synthesis.210 Taken together, the
evidence suggests that, once the RNA world reached
the stage of compartmentalized replicators, RNA
molecules could plausibly have served as mediators
of transmembrane transport.
Second, four reaction pathway evolution scenar-
ios are known. According to the backward (or ret-
rograde) evolutionary scenario,212 the last step in a
pathway leading to important molecules was enzymatized
first. Only pathways that operate without
enzymes can be populated by enzymes this way. The
last product will be depleted first, and then the last
but one, and so on. Cells evolving an enzyme for the
last nonenzymatic step have an advantage as they
can secure resources faster. The forward pathway
evolution postulates that enzymes appear first for
the early steps of a pathway, and later steps become
catalyzed later in succession.213 Such an evolution-
ary scenario could work for catabolic pathways, in
which more and more energy can be extracted by
successive processing of a molecule. The patchwork
evolution postulates that enzymes are recruited from
other pathways.214 Finally, the shell hypothesis pro-
poses that there was a core metabolic process (e.g.,
the reductive citric acid cycle), and new pathways
may have been recruited and attached to this core.215
Obviously, these scenarios cannot be entirely separated
from each other; they may all have played essential
roles in the evolution of metabolic-reaction
networks.216
The third problem is the evolution of enzymatic
efficiency, which raises further problems both from
the biochemical and the theoretical points of view.
First and foremost, modeling the evolution of en-
zymes is a challenging task. If the crystal structure
of a given enzyme is known, the interaction between
the enzyme and a small molecule as a ligand can be
analyzed on either a quasi-classical or a quantum-
mechanical level. Because the structure of early
enzymes is unknown and the structural–functional
evolution of the enzymes on the molecular level
cannot be modeled, these approaches fail. There are
two possibilities to overcome the hurdles of model-
ing the evolution of specific enzymes: either by using
a fully artificial chemistry or applying a major sim-
plification of real chemical structures that preserves
the major properties of the receptor–ligand inter-
actions. In artificial chemistry approaches, atomic
types, chemical bonds, reaction routes, and the in-
teraction between molecules are defined in arbitrary
but consistent ways. Dittrich et al.217 have argued
that “artificial chemistries are ‘the right stuff’ for the
study of prebiotic and biochemical evolution.” Such
chemistries are applicable for a wide variety of mod-
els (from biochemical to ecological systems) with a
continuously growing literature. Setting up an arti-
ficial chemistry model for studying the evolution of
enzymes is a straightforward task (see, e.g., Ref. 218).
In this context, the increasing chemical–functional
complexity, the interactions between molecules, and
the analysis of the system can be handled in a rela-
tively easy way, although the relevance of any results
so obtained is at least doubtful. We suggest that
such models are nevertheless useful to understand
larger-scale phenomena, like metabolic network ex-
pansion or self-assembly, in which the abstraction
of individual reactions does not affect the behavior
of the system.
The second method of modeling is to reduce the
complexity of the structure of real enzymes and to
simplify the treatment of the receptor–ligand inter-
action. This approach could capture the essential
features of both the evolution of enzyme functions
and the thermodynamics of the receptor–ligand in-
teraction. An early study using the above method
was done by Kacser and Beeby.219 They approxi-
mated enzymes and ligands with 3D cavities and
blocks fitting in cavities. The enzymatic activity is
assumed to be proportional to the Lennard–Jones
interaction energy between enzyme and substrate
(if the substrate can enter into the cavity, zero
otherwise). With this choice, there is one optimal
enzyme size for a given substrate. This approach
respects both the effect of the geometry of the par-
ticipants and the basics of the thermodynamics of
interactions. During evolution, the enzyme sizes can
change, altering the catalytic activity on a given substrate.
A possible extension was made by Szathmáry
et al.220 In this model, substrates and enzymes are
more than three-dimensional (3D) hyperblocks and
cavities with “active sites” on their faces. Instead
of increasing the complexity of 3D structures, in-
troducing higher dimensionality provides a way to
model the geometrical complexity of enzymes much
more easily. For proper catalysis, the active sites must
meet their complementary partners (otherwise the
catalytic product is waste), and for high catalytic ac-
tivity, enzyme cavities must optimally fit the size of
their substrates.
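The cavity-and-block picture can be made concrete with a small numerical sketch. It is not the formulation of Kacser and Beeby or of Szathmáry et al.; in particular, treating the gap between cavity size and substrate size as the Lennard–Jones distance, and the parameters epsilon and sigma, are assumptions made here for illustration. Activity is taken as the depth of the attractive energy when the substrate fits into the cavity, and zero otherwise.

```python
def lennard_jones(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """Standard 12-6 Lennard-Jones energy at separation r > 0."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def activity(cavity: float, substrate: float,
             epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """Toy catalytic activity of a cavity of size `cavity` on a given substrate.

    Assumption: the gap (cavity - substrate) plays the role of the distance r.
    """
    gap = cavity - substrate
    if gap <= 0.0:
        return 0.0  # substrate cannot enter the cavity
    # Only attractive (negative) energies contribute to activity.
    return max(0.0, -lennard_jones(gap, epsilon, sigma))


def best_cavity(substrate: float, candidates: list[float]) -> float:
    """Return the candidate cavity size with the highest toy activity."""
    return max(candidates, key=lambda c: activity(c, substrate))


if __name__ == "__main__":
    substrate = 2.0
    sizes = [substrate + 0.01 * i for i in range(1, 400)]
    print("optimal cavity size:", best_cavity(substrate, sizes))
```

In this toy setting the activity peaks when the gap equals 2^(1/6)·sigma, the minimum of the Lennard–Jones potential, which reproduces the single optimal enzyme size per substrate described in the text.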
On the basis of such a model, Szathmáry et al.220
have concluded that the formation of a chromo-
some is a prerequisite for complex metabolism run
by specific enzymes. The reason for this is that,
while replicating ribozymes are unlinked, there is
a considerable assortment load because of chance
in protocell division (genes are assorted to offspring
compartments randomly) that selects for general-
ist enzymes at the expense of specificity and effi-
ciency. Small metabolic repertoire and promiscuous
ribozymes (e.g., see Ref. 202) were the norm. The
invention of the chromosome seems to be the pin-
nacle of the RNA world, as from its very beginning it
was striving for this elusive target, but its invention
paved the road out of the RNA world. In contrast to
peptide synthesis, ribozyme production required
only RNA copying. The genetic material is copied
to produce ribozymes, and ribozymes, or a chromo-
some harboring them, are copied to replicate the ge-
netic material. Peptide enzymes are more efficient
catalysts, but their production requires many en-
zymes (the ribosome, tRNAs, and aminoacyl tRNA
synthetases). The evolution of life could have arrived
at the proliferation of such enzymatic activity
at this stage of complexity, which we can refer to as
the peptide–RNA world.
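The assortment load invoked above can be illustrated with a deliberately simple calculation. The sketch assumes that a protocell carries the same number of copies of each unlinked gene at division and that every molecule passes to a given daughter independently with probability 1/2. Both assumptions, and all names in the code, are illustrative and are not taken from the model cited above.

```python
import random


def p_complete_daughter(n_genes: int, copies_per_gene: int) -> float:
    """Probability that one daughter receives at least one copy of every gene."""
    p_gene_present = 1.0 - 0.5 ** copies_per_gene
    return p_gene_present ** n_genes


def simulate_complete_fraction(n_genes: int, copies_per_gene: int,
                               trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability under random assortment."""
    rng = random.Random(seed)
    complete = 0
    for _ in range(trials):
        inherited_all = True
        for _gene in range(n_genes):
            # Count copies of this gene that end up in the focal daughter.
            received = sum(rng.random() < 0.5 for _ in range(copies_per_gene))
            if received == 0:
                inherited_all = False
                break
        complete += inherited_all
    return complete / trials


if __name__ == "__main__":
    for n in (3, 10, 30):
        print(n, "genes:", round(p_complete_daughter(n, 4), 3))
```

Under these assumptions the chance of inheriting a complete gene set falls geometrically with the number of unlinked genes, which is the pressure that linkage on a chromosome would relieve.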
What remains to be done?
The dynamical theory of the RNA world has ad-
vanced considerably over the last two decades. Of
course, in the research of the RNA world it is taken
for granted that once upon a time an RNA world did
in fact exist. As Orgel54 and Joyce38 note, it is quite
likely that RNAs were not the first replicating tem-
plates. It is also certain that they were not the last,
either. Today we are living in a DNA–RNA–protein
world. What are the main goals for dynamical the-
ory in the further clarification of the evolution of
the RNA world? Maybe template replication was
preceded by collective autocatalysis of molecules
lacking template replication? This view, forcefully
advocated by Kauffman,221,222 received surprising
support by the demonstration of limited evolvabil-
ity of such networks in compartmentalized form.223
Computationally demanding further examination
of such systems may turn out to be very important,
but a survey of the relevant details has been beyond
the scope of this review.
More detailed and integrated models of protocells
harboring ribozymes are needed, extending our
view toward the evolutionary build-up of a com-
plex, connected metabolism, and the establishment
of resilient membranes with regulated permeability.
The name of the game is undoubtedly detailed modeling
of coevolution of metabolism, membrane, and
templates, much in the spirit of Gánti’s chemoton
concept.
The RNA world has been left behind by evolution.
One could argue that the origin of the genetic code
and translated protein enzymes was the greatest,
yet in a sense self-defeating, invention of the RNA
world. How this could have happened and what role
theory can have in the elucidation of this important
evolutionary transition will be subjects of a different
review.
Acknowledgments
Financial support has been provided by the
European Research Council under the Euro-
pean Community’s Seventh Framework Pro-
gramme (FP7/2007-2013)/ERC Grant agreement
No. [294332]. A.Sz. and Á.K. acknowledge support
by the European Union cofinanced by the European
Social Fund (Grant agreement No. TAMOP 4.2.1/
B-09/1/KMR-2010-0003). B.K. acknowledges financial
support from the Hungarian Research
Foundation (OTKA Grant No. K100806). G.B. and
Á.K. acknowledge support from the Hungarian Research
Grants (OTKA K100299). Á.K. gratefully acknowledges
a János Bolyai Research Fellowship of
the Hungarian Academy of Sciences. This work was
carried out as part of EU COST action CM1304
“Emergence and Evolution of Complex Chemical
Systems.”
References
1. Gilbert, W. 1986. Origin of life: the RNA world. Nature 319:
618.
2. Lazcano, A. 2010. Historical development of origins research.
Cold Spring Harbor Perspect. Biol. 2: a002089.
3. Belozerskii, A.N. 1959. On the species specificity of the
nucleic acids of bacteria. In The Origin of Life on Earth. A.I.
Oparin et al., Eds.: 322–321. New York: Pergamon Press.
4. Brachet, J. 1959. Les acides nucléiques et l’origine des
protéines. In The Origin of Life on Earth. A.I. Oparin et
al., Eds.: 361–367. New York: Pergamon Press.
5. Crick, F.H.C. 1958. On protein synthesis. Symposia Soc.
Exp. Biol. 12: 138–163.
6. Woese, C.R. 1967. The Genetic Code. New York: Harper &
Row.
7. Orgel, L.E. 1968. Evolution of the genetic apparatus. J. Mol.
Biol. 38: 381–393.
8. Crick, F.H.C. 1968. The origin of the genetic code. J. Mol.
Biol. 38: 367–379.
9. Gánti, T. 1979. Interpretation of prebiotic evolution based
on chemoton theory (in Hungarian). Biológia 27: 161–175.
10. Gánti, T. 2003. Chemoton Theory. New York: Kluwer Academic/Plenum
Publishers.
11. Guerrier-Takada, C. et al. 1983. The RNA moiety of ri-
bonuclease P is the catalytic subunit of the enzyme. Cell 35:
849–857.
12. Kruger, K. et al. 1982. Self-splicing RNA: autoexcision and
autocyclization of the ribosomal RNA intervening sequence
of Tetrahymena. Cell 31: 147–157.
13. Jeffares, D.C., A.M. Poole & D. Penny. 1998. Relics from
the RNA world. J. Mol. Evol. 46: 18–36.
14. Peebles, C.L. et al. 1986. A self-splicing RNA excises an
intron lariat. Cell 44: 213–223.
15. Forster, A.C. & R.H. Symons. 1987. Self-cleavage of plus
and minus RNAs of a virusoid and a structural model for
the active site. Cell 49: 211–220.
16. Hampel, A. & R.R. Tritz. 1989. RNA catalytic properties of
the minimum (-)sTRSV sequences. Biochemistry 28: 4929–
4933.
17. Sharmeen, L. et al. 1988. Antigenomic RNA of human Hepatitis
delta viruses can undergo self-cleavage. J. Virol. 62:
2674–2679.
18. Saville, B.J. & R.A. Collins. 1990. A site-specific self-cleavage
reaction performed by a novel RNA in Neurospora mito-
chondria. Cell 61: 685–696.
19. Winkler, W.C. et al. 2004. Control of gene expression by a
natural metabolite-responsive ribozyme. Nature 428: 281–
286.
20. Roth, A. et al. 2014. A widespread self-cleaving ribozyme
class is revealed by bioinformatics. Nat. Chem. Biol. 10:
56–60.
21. Doudna, J.A. & T.R. Cech. 2002. The chemical repertoire
of natural ribozymes. Nature 418: 222–228.
22. Westhof, E. 1999. Chemical diversity in RNA cleavage. Sci-
ence 286: 61–62.
23. Nissen, P. et al. 2000. The structural basis of ribosome
activity in peptide bond synthesis. Science 289: 920–930.
24. Moore, P.B. & T.A. Steitz. 2002. The involvement of RNA
in ribosome function. Nature 418: 229–235.
25. Butcher, S.E. 2009. The spliceosome as ribozyme hypothesis
takes a second step. Proc. Natl. Acad. Sci. U.S.A. 106:
12211–12212.
26. Valadkhan, S. et al. 2009. Protein-free small nuclear RNAs
catalyze a two-step splicing reaction. Proc. Natl. Acad. Sci.
U.S.A. 106: 11901–11906.
27. Fica, S.M. et al. 2013. RNA catalyses nuclear pre-mRNA
splicing. Nature 503: 229–234.
28. White, H.B. 1976. Coenzymes as fossils of an earlier
metabolic state. J. Mol. Evol. 7: 101–104.
29. Orgel, L. 1989. The origin of polynucleotide-directed pro-
tein synthesis. J. Mol. Evol. 29: 465–474.
30. Jadhav, V.R. & M. Yarus. 2002. Coenzymes as coribozymes.
Biochimie 84: 877–888.
31. Saran, D., J. Frank & D.H. Burke. 2003. The tyranny of
adenosine recognition among RNA aptamers to coenzyme
A. BMC Evolut. Biol. 3: 26.
32. Kun, Á., B. Papp & E. Szathmáry. 2008. Computational
identification of obligatorily autocatalytic replicators em-
bedded in metabolic networks. Genome Biol. 9: R51.
33. Szathmáry, E. 1990. Towards the evolution of ribozymes.
Nature 344: 115.
34. Szathmáry, E. 1989. The emergence, maintenance, and
transitions of the earliest evolutionary units. Oxford Surveys
Evolut. Biol. 6: 169–205.
35. Tuerk, C. & L. Gold. 1990. Systematic evolution of ligands
by exponential enrichment: RNA ligands to bacteriophage
T4 DNA polymerase. Science 249: 505–510.
36. Ellington, A.D. & J.W. Szostak. 1990. In vitro selection of
RNA molecules that bind specific ligands. Nature 346: 818–
822.
37. Robertson, D.L. & G.F. Joyce. 1990. Selection in vitro of an
RNA enzyme that specifically cleaves single-stranded DNA.
Nature 344: 467–468.
38. Joyce, G.F. 2002. The antiquity of RNA-based evolution.
Nature 418: 214–220.
39. Chen, X., N. Li & A.D. Ellington. 2007. Ribozyme catalysis
of metabolism in the RNA World. Chem. Biodiv. 4: 633–
655.
40. Ellington, A.D. et al. 2009. Evolutionary origins and directed
evolution of RNA. Intl. J. Biochem. Cell Biol. 41:
254–265.
41. Landweber, L.F., P.J. Simon & T.A. Wagner. 1998. Ribozyme
engineering and early evolution. BioScience 48: 94–103.
42. Joyce, G.F. 2004. Directed evolution of nucleic acid enzymes.
Ann. Rev. Biochem. 73: 791–836.
43. Tsukiji, S., S.B. Pattnaik & H. Suga. 2003. An alcohol de-
hydrogenase ribozyme. Nat. Struct. Mol. Biol. 10: 713–
717.
44. Tsukiji, S., S.B. Pattnaik & H. Suga. 2004. Reduction of
an aldehyde by a NADH/Zn2+-dependent redox active ribozyme.
J. Am. Chem. Soc. 126: 5044–5045.
45. Hsiao, C. et al. 2013. RNA with iron(II) as a cofactor catal-
yses electron transfer. Nat. Chem. 5: 525–528.
46. Khvorova, A. et al. 1999. RNAs that bind and change the
permeability of phospholipid membranes. Proc. Natl. Acad.
Sci. U.S.A. 96: 10649–10654.
47. Janas, T., T. Janas & M. Yarus. 2004. A membrane trans-
porter for tryptophan composed of RNA. RNA 10: 1541–
1549.
48. Higgs, P.G. & N. Lehman. 2015. The RNA World: molecular
cooperation at the origins of life. Nat. Rev. Genet. 16: 7–17.
49. Bernhardt, H. 2012. The RNA world hypothesis: the worst
theory of the early evolution of life (except for all the oth-
ers). Biol. Direct 7: 23.
50. Benner, S.A., A.D. Ellington & A. Tauer. 1989. Modern
metabolism as a palimpsest of the RNA world. Proc. Natl.
Acad. Sci. U.S.A. 86: 7054–7058.
51. Orgel, L.E. 1998. The origin of life—a review of facts and
speculations. TIBS 23: 491–495.
52. Szathmáry, E., S. Mauro & C. Fernando. 2005. Evolutionary
potential and requirements for minimal protocells. Top.
Curr. Chem. 259: 167–211.
53. Robertson, M.P. & G.F. Joyce. 2012. The origins of the RNA
World. Cold Spring Harbor Perspect. Biol. 4: a003608.
54. Orgel, L.E. 2004. Prebiotic chemistry and the origin of
the RNA world. Crit. Rev. Biochem. Molec. Biol. 39: 99–
123.
55. Powner, M.W., B. Gerland & J.D. Sutherland. 2009. Synthe-
sis of activated pyrimidine ribonucleotides in prebiotically
plausible conditions. Nature 459: 239–242.
56. Huang, W. & J.P. Ferris. 2003. Synthesis of 35-40 mers of
RNA oligomers from unblocked monomers: a simple ap-
proach to the RNA world. Chem. Commun. (Camb.) 1458–
1459.
57. Ferris, J.P. 2006. Montmorillonite-catalysed formation of
RNA oligomers: the possible role of catalysis in the origins
of life. Philos. Trans. R. Soc. B. 361: 1777–1786.
58. Ma, W. et al. 2007. Nucleotide synthetase ribozymes may
have emerged first in the RNA world. RNA 13: 2012–2019.
59. Manrubia, S.C. & C. Briones. 2006. Modular evolution
and increase of functional complexity in replicating RNA
molecules. RNA 13: 97–107.
60. Copley, S.D., E. Smith & H.J. Morowitz. 2007. The origin
of the RNA world: co-evolution of genes and metabolism.
Bioorg. Chem. 35: 430–443.
61. Briones, C., M. Stich & S.C. Manrubia. 2009. The
dawn of the RNA World: toward functional complexity
through ligation of random RNA oligomers. RNA 15: 743–
749.
62. Fedor, M. 2000. Structure and function of the hairpin ri-
bozyme. J. Mol. Biol. 297: 269–291.
63. Lafontaine, D.A., D.G. Norman & D.M.J. Lilley. 2002. The
structure and active site of the Varkud satellite ribozyme.
Biochem. Soc. Trans. 30: 1170–1175.
64. Haslinger, C. & P.F. Stadler. 1999. RNA structure with
pseudo-knots: graph-theoretical and combinatorial prop-
erties. Bull. Math. Biol. 61: 437–467.
65. Schuster, P. et al. 1994. From sequences to shapes and back:
a case study in RNA secondary structures. Proc. R. Soc.
Lond. B 255: 279–284.
66. Schuster, P. 1997. Genotypes with phenotypes: adventures
in an RNA toy world. Biophys. Chem. 66: 75–110.
67. Geveretz, J., H.H. Gan & T. Schlick. 2005. In vitro RNA
random pools are not structurally diverse: a computational
analysis. RNA 11: 853–863.
68. Grüner, W. et al. 1996. Analysis of RNA sequence structure
maps by exhaustive enumeration. II. Structures of neutral
networks and shape space covering. Monatsh Chem. 127:
375–389.
69. Fontana, W. et al. 1993. Statistics of RNA secondary structures.
Biopolymers 33: 1389–1404.
70. Jörg, T., O. Martin & A. Wagner. 2008. Neutral network
sizes of biological RNA molecules can be computed and
are not atypically small. BMC Bioinformat. 9: 464.
71. Fernando, C., G. Von Kiedrowski & E. Szathmáry. 2007. A
stochastic model of nonenzymatic nucleic acid replication:
“elongators” sequester replicators. J. Mol. Evol. 64: 572–
585.
72. Yarus, M. 2011. Life from an RNA World: The Ancestor
Within. Harvard: Harvard University Press.
73. Ellington, A.D. & J.W. Szostak. 1992. Selection in vitro
of single-stranded DNA molecules that fold into specific
ligand-binding structures. Nature 355: 850–852.
74. von Kiedrowski, G. 1986. A self-replicating hexadeoxynucleotide.
Angew. Chem. Int. Ed. Engl. 25: 932–935.
75. Paul, N. & G.F. Joyce. 2002. A self-replicating ligase ri-
bozyme. Proc. Natl. Acad. Sci. U.S.A. 99: 12733–12740.
76. Vaidya, N. et al. 2012. Spontaneous network formation
among cooperative RNA replicators. Nature 491: 72–77.
77. Szathmáry, E. 2013. On the propagation of a conceptual error
concerning hypercycles and cooperation. J. Syst. Chem.
4: 1.
78. Hayden, E.J., G. von Kiedrowski & N. Lehman. 2008. Sys-
tems chemistry on ribozyme self-construction: evidence for
anabolic autocatalysis in a recombination network. Angew.
Chem. Int. Ed. Engl. 47: 8424–8428.
79. Vasas, V. et al. 2015. Primordial evolvability: impasses and
challenges. J. Syst. Chem. In press.
80. Meyer, A.J., J.W. Ellefson & A.D. Ellington. 2012. Abiotic
self-replication. Acc. Chem. Res. 45: 2097–2105.
81. Wu, M. & P.G. Higgs. 2011. Comparison of the roles of
nucleotide synthesis, polymerization, and recombination
in the origin of autocatalytic sets of RNAs. Astrobiology 11:
895–906.
82. Gause, G.F. 1935. The Struggle for Existence. Baltimore, MD:
William and Wilkins.
83. MacArthur, R. & R. Levins. 1964. Competition, habitat
selection, and character displacement in a patchy environment.
Proc. Natl. Acad. Sci. U.S.A. 51: 1207–1210.
84. Meszéna, G. et al. 2006. Competitive exclusion and limiting
similarity: a unified theory. Theor. Pop. Biol. 69: 68–87.
85. Eigen, M. 1971. Selforganization of matter and the evolution
of biological macromolecules. Naturwissenschaften 10:
465–523.
86. Szilágyi, A., I. Zachar & E. Szathmáry. 2013. Gause’s principle
and the effect of resource partitioning on the dynamical
coexistence of replicating templates. PLoS Computat. Biol.
9: e1003193.
87. Zielinski, W.S. & L.E. Orgel. 1987. Autocatalytic synthesis
of a tetranucleotide analogue. Nature 327: 346–347.
88. Szathmáry, E. & I. Gladkih. 1989. Sub-exponential growth
and coexistence of non-enzymatically replicating tem-
plates. J. Theor. Biol. 138: 55–58.
89. Lifson, S. & H. Lifson. 1999. A model of prebiotic replica-
tion: survival of the fittest versus extinction of the unfittest.
J. Theor. Biol. 199: 425–433.
90. Szathmáry, E. 1991. Simple growth laws and selection consequences.
TREE 6: 366–370.
91. Varga, Z. & E. Szathmáry. 1997. An extremum principle for
parabolic competition. Bull. Math. Biol. 59: 1145–1154.
92. von Kiedrowski, G. & E. Szathmáry. 2000. Selection versus
coexistence of parabolic replicators spreading on surfaces.
Selection 1: 173–179.
93. Wills, P.R. et al. 1998. Selection dynamics in autocatalytic
systems: templates replicating through binary ligation. Bull.
Math. Biol. 60: 1073–1098.
94. Scheuring, I. & E. Szathmáry. 2001. Survival of replicators
with parabolic growth tendency and exponential decay. J.
Theor. Biol. 212: 99–105.
95. Chesson, P. 1994. Multispecies competition in variable en-
vironments. Theor. Pop. Biol. 45: 227–276.
96. Chesson, P. 2000. General theory of competitive coexis-
tence in spatially-varying environments. Theor. Pop. Biol.
58: 211–237.
97. Scheffer, M. et al. 2003. Why plankton communities have
no equilibrium: solutions to the paradox. Hydrobiologia
491: 9–18.
98. Huisman, J. & F.J. Weissing. 2002. Oscillations and chaos
generated by competition for interactively essential re-
sources. Ecol. Res. 17: 175–181.
99. Károlyi, G. et al. 2000. Chaotic flow: the physics of species
coexistence. Proc. Natl. Acad. Sci. U.S.A. 97: 13661–13665.
100. Haldane, J.B.S. 1937. The effect of variation on fitness. Am.
Nat. 71: 337–349.
101. Eigen, M. & P. Schuster. 1977. The hypercycle: a princi-
ple of natural self-organisation. Part A: emergence of the
hypercycle. Naturwissenschaften 64: 541–565.
102. Friedberg, E.C., G.C. Walker & W. Siede. 1995. DNA Repair
and Mutagenesis. Washington, D.C.: ASM Press.
103. Johnston, W.K. et al. 2001. RNA-catalyzed RNA polymer-
ization: accurate and general RNA-templated primer ex-
tension. Science 292: 1319–1325.
104. Wochner, A. et al. 2011. Ribozyme-catalyzed transcription
of an active ribozyme. Science 332: 209–212.
105. Maynard Smith, J. 1983. Models of evolution. Proc. R. Soc.
Lond. B 219: 315–325.
106. Baake, E. & W. Gabriel. 2000. Biological evolution through
mutation, selection, and drift: an introductory