Preprint

Abstract

Evolution by natural selection is often viewed as an optimisation process in which an organism's phenotypic traits are gradually adapted to improve its fitness. Because many different traits, with potentially conflicting requirements, must be adapted simultaneously, among other factors, this optimisation process may appear onerous. Building on recent mathematical work connecting optima and simplicity, we here show that for certain generic phenotype fitness requirements --- those based on physics and engineering principles --- optimal phenotypic shapes will tend to be `simple', in the sense of low algorithmic or descriptional complexity. As a result, we argue that adapting to these types of generic fitness requirements will be a much `easier' task for natural selection than a null expectation based on arbitrary optimisation requirements would suggest. Further, selection's task may be easier still because optimal phenotypes for one set of generic fitness constraints may also be close to optimal for other generic constraints, such that adapting to one constraint yields the other `for free'.


... More broadly, many studies have shown that employing AIT as a theoretical framework, combined with estimates of Kolmogorov complexity, can be fruitful in the natural sciences. Example applications include thermodynamics [33][34][35], understanding the regularity of the laws of physics [36], entropy estimation [37,38], classifying biological structures [39], evolution theory [40,41], networks [42,43], in addition to data analysis [44][45][46] and time series analysis [47,48], among others [49]. ...
... Therefore, we leave these to future work. It is interesting, however, that studies of natural RNA shapes [21,22] and the shapes of protein complexes [23] have shown that GP map biases alone can be very good predictors of natural biological shape frequencies (see also [41,[77][78][79] for more on different types of biases and evolutionary outcomes). Therefore, it may be that the transition probability biases discussed here, resulting from conditional complexity constraints, are strong enough that their stamp is still observable even in natural data. ...
Article
Full-text available
Unravelling the structure of genotype–phenotype (GP) maps is an important problem in biology. Recently, arguments inspired by algorithmic information theory (AIT) and Kolmogorov complexity have been invoked to uncover simplicity bias in GP maps: an exponentially decaying upper bound on phenotype probability with increasing phenotype descriptional complexity. This means that phenotypes with many genotypes assigned via the GP map must be simple, while complex phenotypes must have few genotypes assigned. Here, we use similar arguments to bound the probability P(x → y) that phenotype x, upon random genetic mutation, transitions to phenotype y. The bound is P(x → y) ≲ 2^(−aK̃(y|x)−b), where K̃(y|x) is the estimated conditional complexity of y given x, quantifying how much extra information is required to make y given access to x. This upper bound is related to the conditional form of algorithmic probability from AIT. We demonstrate the practical applicability of our derived bound by predicting phenotype transition probabilities (and other related quantities) in simulations of RNA and protein secondary structures. Our work contributes to a general mathematical understanding of GP maps and may facilitate the prediction of transition probabilities directly from examining phenotypes themselves, without detailed knowledge of the GP map.
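The conditional complexity K̃(y|x) in this bound has to be estimated in practice. A minimal compression-based sketch (a generic proxy, not necessarily the estimator used in the paper; the dot-bracket strings below are invented for illustration) is K̃(y|x) ≈ C(xy) − C(x) for a standard compressor C:

```python
import zlib

def c_bits(s: str) -> int:
    """Compressed length in bits: a crude stand-in for Kolmogorov complexity."""
    return 8 * len(zlib.compress(s.encode(), 9))

def cond_complexity(y: str, x: str) -> int:
    """Estimate K(y|x) as C(x + y) - C(x): the extra bits needed to
    describe y once x is already available."""
    return max(c_bits(x + y) - c_bits(x), 0)

# Illustrative dot-bracket 'phenotypes' (hypothetical, not from the paper):
x = "((((...))))" * 8                        # current structure
y_near = "((((...))))" * 7 + "((((....)))"   # one small local change
y_far = "(.)((..))(()).(...)()((.))..(()..())...(.)(..)(())..((...))().((..()))..(()).()"

print(cond_complexity(y_near, x), cond_complexity(y_far, x))
# The bound P(x -> y) <~ 2^(-a*K(y|x) - b) then predicts that transitions to
# structurally close phenotypes (low conditional complexity) are
# exponentially more likely than transitions to unrelated ones.
```

Any off-the-shelf compressor works for the sketch; the qualitative ordering, not the absolute bit counts, is what the bound uses.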
... Another possible explanatory factor for the similarity between natural and random SS is that some fitness-related properties of phenotype shapes are linked to bias. Recently, it has been mathematically argued that certain generic fitness requirements based on physics and engineering principles (e.g., mutational robustness in molecules and efficiency in biological networks) may lead to highly optimal values for particular types of phenotype shapes, which may also have high probability or be favorably biased [71,72]. In addition to mathematical arguments, a large range of biological examples is presented in support of the theory. ...
Article
Full-text available
An important question in evolutionary biology is whether (and in what ways) genotype–phenotype (GP) map biases can influence evolutionary trajectories. Untangling the relative roles of natural selection and biases (and other factors) in shaping phenotypes can be difficult. Because RNA secondary structure (SS) can be analyzed in detail mathematically and computationally, is biologically relevant, and a wealth of bioinformatic data are available, it offers a good model system for studying the role of bias. For quite short RNA (length L ≤ 126), it has recently been shown that natural and random RNA types are structurally very similar, suggesting that bias strongly constrains evolutionary dynamics. Here, we extend these results with emphasis on much larger RNA with lengths up to 3000 nucleotides. By examining both abstract shapes and structural motif frequencies (i.e., the number of helices, bonds, bulges, junctions, and loops), we find that large natural and random structures are also very similar, especially when contrasted with typical structures sampled from the space of all possible RNA structures. Our motif frequency study yields a further result: the frequencies of different motifs can be used in machine learning algorithms to classify random and natural RNA with high accuracy, especially for longer RNA (e.g., ROC AUC 0.86 for L = 1000). The most important motifs for classification are the numbers of bulges, loops, and bonds. This finding may be useful in using SS to detect candidates for functional RNA within ‘junk’ DNA regions.
Article
Full-text available
Developing new ways to estimate probabilities can be valuable for science, statistics, engineering, and other fields. By considering the information content of different output patterns, recent work invoking arguments inspired by algorithmic information theory has shown that a priori probability predictions based on pattern complexities can be made in a broad class of input-output maps. These algorithmic probability predictions do not depend on detailed knowledge of how output patterns were produced, or on historical statistical data. Although quantitatively fairly accurate, a main weakness of these predictions is that they are given as an upper bound on the probability of a pattern, yet many low complexity, low probability patterns occur, for which the upper bound has little predictive value. Here, we study this low complexity, low probability phenomenon by looking at example maps, namely a finite state transducer, natural time series data, RNA molecule structures, and polynomial curves. Some mechanisms causing low complexity, low probability behaviour are identified, and we argue that this behaviour should be assumed as a default in real-world algorithmic probability studies. Additionally, we examine some applications of algorithmic probability and discuss some implications of low complexity, low probability patterns for several research areas, including simplicity in physics and biology, a priori probability predictions, Solomonoff induction and Occam's razor, machine learning, and password guessing.
Article
Full-text available
Physical roots, exemplifications and consequences of periodic and aperiodic ordering (represented by Fibonacci series) in biological systems are discussed. The physical and biological roots and role of symmetry and asymmetry appearing in biological patterns are addressed. A generalization of the Curie–Neumann principle as applied to biological objects is presented, briefly summarized as: “asymmetry is what creates a biological phenomenon”. The “top-down” and “bottom-up” approaches to the explanation of symmetry in organisms are presented and discussed in detail. The “top-down” approach implies that the symmetry of the biological structure follows the symmetry of the media in which this structure is functioning; the “bottom-up” approach, in turn, accepts that the symmetry of biological structures emerges from the symmetry of molecules constituting the structure. A diversity of mathematical measures applicable for quantification of order in biological patterns is introduced. The continuous, Shannon and Voronoi measures of symmetry/ordering and their application to biological objects are addressed. The fine structure of the notion of “order” is discussed. Informational/algorithmic roots of order inherent in the biological systems are considered. Ordered/symmetrical patterns provide an economy of biological information, necessary for the algorithmic description of a biological entity. The application of the Landauer principle bridging physics and theory of information to the biological systems is discussed.
Article
Full-text available
Significance Why does evolution favor symmetric structures when they only represent a minute subset of all possible forms? Just as monkeys randomly typing into a computer language will preferentially produce outputs that can be generated by shorter algorithms, so the coding theorem from algorithmic information theory predicts that random mutations, when decoded by the process of development, preferentially produce phenotypes with shorter algorithmic descriptions. Since symmetric structures need less information to encode, they are much more likely to appear as potential variation. Combined with an arrival-of-the-frequent mechanism, this algorithmic bias predicts a much higher prevalence of low-complexity (high-symmetry) phenotypes than follows from natural selection alone and also explains patterns observed in protein complexes, RNA secondary structures, and a gene regulatory network.
Article
Full-text available
How does morphological complexity evolve? This study suggests that the likelihood of mutations increasing phenotypic complexity becomes smaller when the phenotype itself is complex. In addition, the complexity of the genotype-phenotype map (GPM) also increases with the phenotypic complexity. We show that complex GPMs and the above mutational asymmetry are inevitable consequences of how genes need to be wired in order to build complex and robust phenotypes during development. We randomly wired genes and cell behaviors into networks in EmbryoMaker. EmbryoMaker is a mathematical model of development that can simulate any gene network, all animal cell behaviors (division, adhesion, apoptosis, etc.), cell signaling, cell and tissues biophysics, and the regulation of those behaviors by gene products. Through EmbryoMaker we simulated how each random network regulates development and the resulting morphology (i.e. a specific distribution of cells and gene expression in 3D). This way we obtained a zoo of possible 3D morphologies. Real gene networks are not random, but a random search allows a relatively unbiased exploration of what is needed to develop complex robust morphologies. Compared to the networks leading to simple morphologies, the networks leading to complex morphologies have the following in common: 1) They are rarer; 2) They need to be finely tuned; 3) Mutations in them tend to decrease morphological complexity; 4) They are less robust to noise; and 5) They have more complex GPMs. These results imply that, when complexity evolves, it does so at a progressively decreasing rate over generations. This is because as morphological complexity increases, the likelihood of mutations increasing complexity decreases, morphologies become less robust to noise, and the GPM becomes more complex. We find some properties in common, but also some important differences, with non-developmental GPM models (e.g. RNA, protein and gene networks in single cells).
Article
Full-text available
Fitness effects of mutations depend on environmental parameters. For example, mutations that increase fitness of bacteria at high antibiotic concentration often decrease fitness in the absence of antibiotic, exemplifying a tradeoff between adaptation to environmental extremes. We develop a mathematical model for fitness landscapes generated by such tradeoffs, based on experiments that determine the antibiotic dose-response curves of Escherichia coli strains, and previous observations on antibiotic resistance mutations. Our model generates a succession of landscapes with predictable properties as antibiotic concentration is varied. The landscape is nearly smooth at low and high concentrations, but the tradeoff induces a high ruggedness at intermediate antibiotic concentrations. Despite this high ruggedness, however, all the fitness maxima in the landscapes are evolutionarily accessible from the wild type. This implies that selection for antibiotic resistance in multiple mutational steps is relatively facile despite the complexity of the underlying landscape.
Article
Full-text available
For a broad class of input-output maps, arguments based on the coding theorem from algorithmic information theory (AIT) predict that simple (low Kolmogorov complexity) outputs are exponentially more likely to occur upon uniform random sampling of inputs than complex outputs are. Here, we derive probability bounds that are based on the complexities of the inputs as well as the outputs, rather than just on the complexities of the outputs. The more that outputs deviate from the coding theorem bound, the lower the complexity of their inputs. Since the number of low complexity inputs is limited, this behaviour leads to an effective lower bound on the probability. Our new bounds are tested for an RNA sequence to structure map, a finite state transducer and a perceptron. The success of these new methods opens avenues for AIT to be more widely used.
Article
Full-text available
Many systems in nature can be described using discrete input-output maps. Without knowing details about a map, there may seem to be no a priori reason to expect that a randomly chosen input would be more likely to generate one output over another. Here, by extending fundamental results from algorithmic information theory, we show instead that for many real-world maps, the a priori probability P(x) that randomly sampled inputs generate a particular output x decays exponentially with the approximate Kolmogorov complexity K̃(x) of that output. These input-output maps are biased towards simplicity. We derive an upper bound P(x) ≲ 2^(−aK̃(x)−b), which is tight for most inputs. The constants a and b, as well as many properties of P(x), can be predicted with minimal knowledge of the map. We explore this strong bias towards simple outputs in systems ranging from the folding of RNA secondary structures to systems of coupled ordinary differential equations to a stochastic financial trading model.
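One of the map families mentioned (polynomial curves) can be caricatured in a few lines: sample random cubics, record the up-down pattern of the curve on a grid, and observe that simple (few-switch) patterns dominate while highly alternating ones never occur. This is a toy variant of my own construction, with the coefficient range and grid chosen arbitrarily:

```python
import random
from collections import Counter

random.seed(0)

def updown_pattern(coeffs, n_points=10):
    """Binary string: '1' where the sampled curve rises, '0' where it falls."""
    xs = [-1 + 2 * i / (n_points - 1) for i in range(n_points)]
    ys = [sum(c * x**k for k, c in enumerate(coeffs)) for x in xs]
    return "".join("1" if b > a else "0" for a, b in zip(ys, ys[1:]))

counts = Counter()
for _ in range(20000):
    coeffs = [random.uniform(-1, 1) for _ in range(4)]  # random cubic
    counts[updown_pattern(coeffs)] += 1

def switches(p):
    """Rough complexity proxy: number of direction changes in the pattern."""
    return sum(a != b for a, b in zip(p, p[1:]))

for pattern, n in counts.most_common(5):
    print(pattern, n, "switches:", switches(pattern))
# A cubic's derivative is quadratic, so no attainable pattern has more than
# two switches: complex alternating patterns like '101010101' get probability 0,
# while simple monotone patterns soak up most of the probability mass.
```

The exponentially decaying envelope of P(x) against pattern complexity is exactly what the derived upper bound describes; here the bias is visible already in raw counts.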
Article
Full-text available
Symmetry is an eye-catching feature of animal body plans, yet its causes are not well enough understood. The evolution of animal form is mainly due to changes in gene regulatory networks (GRNs). Based on theoretical considerations regarding fundamental GRN properties, it has recently been proposed that the animal genome, on large time scales, should be regarded as a system which can construct both the main symmetries - radial and bilateral - simultaneously; and that the expression of any of these depends on functional constraints. Current theories explain biological symmetry as a pattern mostly determined by phylogenetic constraints, and more by chance than by necessity. In contrast to this conception, I suggest that physical effects, which in many cases act as proximate, direct, tissue-shaping factors during ontogenesis, are also the ultimate causes - i.e. the indirect factors which provide a selective advantage - of animal symmetry, from organs to body plan level patterns. In this respect, animal symmetry is a necessary product of evolution. This proposition offers a parsimonious view of symmetry as a basic feature of the animal body plan, suggesting that molecules and physical forces act in a beautiful harmony to create symmetrical structures, but that the concert itself is directed by the latter. Reviewers: This article was reviewed by Eugene Koonin, Zoltán Varga and Michaël Manuel.
Article
Full-text available
Mutational neighbourhoods in genotype-phenotype (GP) maps are widely believed to be more likely to share characteristics than expected from random chance. Such genetic correlations should, as John Maynard Smith famously pointed out, strongly influence evolutionary dynamics. We explore and quantify these intuitions by comparing three GP maps - RNA secondary structure, the HP model for protein tertiary structure, and Polyominoes for protein quaternary structure - to a simple random null model that maintains the number of genotypes mapping to each phenotype but assigns genotypes randomly. The mutational neighbourhood of a genotype in these GP maps is much more likely to contain (mutationally neutral) genotypes mapping to the same phenotype than in the random null model. These neutral correlations can increase the robustness to mutations by orders of magnitude over that of the null model, raising robustness above the critical threshold for the formation of large neutral networks that enhance the capacity for neutral exploration. We also study non-neutral correlations: compared to the null model, i) if a particular (non-neutral) phenotype is found once in the 1-mutation neighbourhood of a genotype, then the chance of finding that phenotype multiple times in this neighbourhood is larger than expected; ii) if two genotypes are connected by a single neutral mutation, then their respective non-neutral 1-mutation neighbourhoods are more likely to be similar; iii) if a genotype maps to a folding or self-assembling phenotype, then its non-neutral neighbours are less likely to be a potentially deleterious non-folding or non-assembling phenotype. Non-neutral correlations of types i) and ii) reduce the rate at which new phenotypes can be found by neutral exploration, and so may diminish evolvability, while non-neutral correlations of type iii) may instead facilitate evolutionary exploration and so increase evolvability.
Article
Full-text available
Author Summary The evolution of complexity, a central issue of evolutionary theory since Darwin's time, remains a controversial topic. One particular question of interest is how the complexity of an organism's body plan (morphology) is influenced by the complexity of the environment in which it evolved. Ideally, it would be desirable to perform investigations on living organisms in which environmental complexity is under experimental control, but our ability to do so in a limited timespan and in a controlled manner is severely constrained. In lieu of such studies, here we employ computer simulations capable of evolving the body plans of virtual organisms to investigate this question in silico. By evolving virtual organisms for locomotion in a variety of environments, we are able to demonstrate that selecting for locomotion causes more complex morphologies to evolve than would be expected solely due to random chance. Moreover, if increased complexity incurs a cost (as it is thought to do in biology), then more complex environments tend to lead to the evolution of more complex body plans than those that evolve in a simpler environment. This result supports the idea that the morphological complexity of organisms is influenced by the complexity of the environments in which they evolve.
Article
Full-text available
Native protein folds often have a high degree of symmetry. We study the relationship between the symmetries of native proteins, and their designabilities—how many different sequences encode a given native structure. Using a two-dimensional lattice protein model based on hydrophobicity, we find that those native structures that are encoded by the largest number of different sequences have high symmetry. However only certain symmetries are enhanced, e.g., x/y-mirror symmetry and 180° rotation, while others are suppressed. If there are many possible mutations which leave the native state of a particular protein stable, then, by definition, the state is highly designable. Hence, our findings imply that insensitivity to mutation implies high symmetry. It appears that the relationship between designability and symmetry results because protein substructures are also designable. Native protein folds may therefore be symmetric because they are composed of repeated designable substructures. © 2000 American Institute of Physics.
Article
Full-text available
Computer models are used to mimic the early evolution of ancient vascular plants (tracheophytes). These models have three components: (a) an N-dimensional domain of all mathematically conceivable ancient morphologies (a morphospace); (b) a numerical assessment of the ability (fitness) of each morphology to intercept light, maintain mechanical stability, conserve water, and produce and disperse spores; and (c) an algorithm that searches the morphospace for successively more fit variants (an adaptive walk). Beginning with the most ancient known plant form, evolution is simulated by locating neighboring morphologies that progressively perform one or more tasks more efficiently. The resulting simulated adaptive walks indicate that early tracheophyte evolution involved optimizing the performance of many tasks simultaneously rather than maximizing the performance of one or only a few tasks individually, and that the requirement for optimization accelerated the tempo of morphological evolution in the Silurian and Devonian.
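The multi-task adaptive walk idea can be caricatured with a greedy walk over bitstring "morphologies" scored on two toy tasks; both task definitions and the product fitness below are inventions of mine for illustration, not the paper's model. The walk climbs until no single-site variant improves the combined fitness, i.e., it halts at a local optimum:

```python
import random

random.seed(1)
L = 40
g = [random.randint(0, 1) for _ in range(L)]   # random starting 'morphology'

def task_light(g):
    """Toy 'light interception' score: fraction of 1s in the first half."""
    return sum(g[:L // 2]) / (L // 2)

def task_mech(g):
    """Toy 'mechanical stability' score: fraction of alternations in the second half."""
    half = g[L // 2:]
    return sum(a != b for a, b in zip(half, half[1:])) / (len(half) - 1)

def fitness(g):
    """Product of task scores: decent performance is required on BOTH tasks."""
    return task_light(g) * task_mech(g)

start = fitness(g)
steps = 0
improved = True
while improved:                       # greedy adaptive walk over 1-site variants
    improved = False
    for i in range(L):
        g2 = g[:]
        g2[i] ^= 1
        if fitness(g2) > fitness(g):
            g, improved, steps = g2, True, steps + 1
            break

print(f"fitness {start:.3f} -> {fitness(g):.3f} in {steps} accepted steps")
```

Using the product rather than a single task score is one crude way to encode "many tasks must be served simultaneously"; a Pareto-based acceptance rule would be closer in spirit to morphospace studies.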
Article
Full-text available
We investigate how scale-free (SF) and Erdős-Rényi (ER) topologies affect the interplay between evolvability and robustness of model gene regulatory networks with Boolean threshold dynamics. In agreement with Oikonomou and Cluzel (2006), we find that networks with SF(in) topologies, that is SF topology for incoming nodes and ER topology for outgoing nodes, are significantly more evolvable towards specific oscillatory targets than networks with ER topology for both incoming and outgoing nodes. Similar results are found for networks with SF(both) and SF(out) topologies. The functionality of the SF(out) topology, which most closely resembles the structure of biological gene networks (Babu et al., 2004), is compared to the ER topology in further detail through an extension to multiple target outputs, with either an oscillatory or a non-oscillatory nature. For multiple oscillatory targets of the same length, the differences between SF(out) and ER networks are enhanced, but for non-oscillatory targets both types of networks show fairly similar evolvability. We find that SF networks generate oscillations much more easily than ER networks do, and this may explain why SF networks are more evolvable than ER networks are for oscillatory phenotypes. In spite of their greater evolvability, we find that networks with SF(out) topologies are also more robust to mutations (mutational robustness) than ER networks. Furthermore, the SF(out) topologies are more robust to changes in initial conditions (environmental robustness). For both topologies, we find that once a population of networks has reached the target state, further neutral evolution can lead to an increase in both the mutational robustness and the environmental robustness to changes in initial conditions.
Article
Full-text available
Although most networks in nature exhibit complex topologies, the origins of such complexity remain unclear. We propose a general evolutionary mechanism based on global stability. This mechanism is incorporated into a model of a growing network of interacting agents in which each new agent's membership in the network is determined by the agent's effect on the network's global stability. It is shown that out of this stability constraint complex topological properties emerge in a self-organized manner, offering an explanation for their observed ubiquity in biological networks.
Article
Full-text available
An optimal model for the Prisoner's Dilemma game is suggested. The model is normative in the sense that, given a few assumptions about the way the game is perceived by the players, an optimal policy is prescribed to each player, maximizing his long-run expected gain. The dilemma is resolved by restructuring the game as a supergame composed of several component games such that transitions among them are possible. Dynamic programming is used to derive the optimal policy.
Article
Full-text available
The Prisoner's Dilemma is the leading metaphor for the evolution of cooperative behaviour in populations of selfish agents, especially since the well-known computer tournaments of Axelrod and their application to biological communities. In Axelrod's simulations, the simple strategy tit-for-tat did outstandingly well and subsequently became the major paradigm for reciprocal altruism. Here we present extended evolutionary simulations of heterogeneous ensembles of probabilistic strategies including mutation and selection, and report the unexpected success of another protagonist: Pavlov. This strategy is as simple as tit-for-tat and embodies the fundamental behavioural mechanism win-stay, lose-shift, which seems to be a widespread rule. Pavlov's success is based on two important advantages over tit-for-tat: it can correct occasional mistakes and exploit unconditional cooperators. This second feature prevents Pavlov populations from being undermined by unconditional cooperators, which in turn invite defectors. Pavlov seems to be more robust than tit-for-tat, suggesting that cooperative behaviour in natural situations may often be based on win-stay, lose-shift.
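The error-correcting difference between the two strategies is easy to reproduce. After one accidental defection, a tit-for-tat pair locks into an endless echo of alternating defections, whereas a Pavlov pair (cooperate iff both players chose the same move last round, i.e., win-stay, lose-shift) restores mutual cooperation within two rounds. A minimal deterministic sketch, with the round count and error timing chosen arbitrarily:

```python
C, D = "C", "D"

def tft(my_last, opp_last):
    return opp_last                          # copy the opponent's last move

def pavlov(my_last, opp_last):
    return C if my_last == opp_last else D   # win-stay, lose-shift

def play(strategy, rounds=8, error_round=2):
    """Both players use `strategy`; player 1 misimplements round `error_round`."""
    a, b, history = C, C, []
    for t in range(rounds):
        if t == error_round:
            a = D                            # a single unintended defection
        history.append((a, b))
        a, b = strategy(a, b), strategy(b, a)
    return history

print("TFT:   ", play(tft))      # alternating C/D echo persists after the error
print("Pavlov:", play(pavlov))   # one round of mutual defection, then back to C,C
```

Pavlov's recovery works because after the mismatched round both players "lose", both shift to defection, and the resulting matched round makes both cooperate again.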
Article
Full-text available
Protein structures in nature often exhibit a high degree of regularity (for example, secondary structure and tertiary symmetries) that is absent from random compact conformations. With the use of a simple lattice model of protein folding, it was demonstrated that structural regularities are related to high "designability" and evolutionary stability. The designability of each compact structure is measured by the number of sequences that can design the structure, that is, sequences that possess the structure as their nondegenerate ground state. Compact structures differ markedly in terms of their designability; highly designable structures emerge with a number of associated sequences much larger than the average. These highly designable structures possess "proteinlike" secondary structure and even tertiary symmetries. In addition, they are thermodynamically more stable than other structures. These results suggest that protein structures are selected in nature because they are readily designed and stable against mutations, and that such a selection simultaneously leads to thermodynamic stability.
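A designability census of this kind can be reproduced at toy scale: enumerate all compact conformations of a 9-monomer chain on a 3×3 lattice, score every HP sequence by (negative) H-H contact count, and record, for each distinct contact map, how many sequences have it as their unique ground state. The 3×3 size, the grouping by contact map, and the neglect of further symmetry reduction are simplifications of mine; the original studies use larger lattices and more careful bookkeeping:

```python
from itertools import product

N = 3  # 3x3 lattice, chain length 9

def compact_walks():
    """All self-avoiding walks that visit every site of the N x N grid."""
    walks = []
    def extend(path):
        if len(path) == N * N:
            walks.append(tuple(path))
            return
        i, j = path[-1]
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= ni < N and 0 <= nj < N and (ni, nj) not in path:
                extend(path + [(ni, nj)])
    for i in range(N):
        for j in range(N):
            extend([(i, j)])
    return walks

def contacts(walk):
    """Monomer index pairs adjacent on the lattice but not along the chain."""
    pos = {p: k for k, p in enumerate(walk)}
    out = []
    for (i, j), k in pos.items():
        for q in ((i + 1, j), (i, j + 1)):       # each lattice edge once
            if q in pos and abs(pos[q] - k) > 1:
                out.append((min(k, pos[q]), max(k, pos[q])))
    return out

walks = compact_walks()
# Distinct structures = distinct contact maps (symmetry-related walks coincide).
maps = sorted({frozenset(contacts(w)) for w in walks}, key=lambda m: sorted(m))

designability = [0] * len(maps)
for seq in product((0, 1), repeat=N * N):        # 1 = hydrophobic (H)
    energies = [-sum(seq[a] * seq[b] for a, b in m) for m in maps]
    ground = min(energies)
    winners = [k for k, e in enumerate(energies) if e == ground]
    if len(winners) == 1:                        # nondegenerate ground state only
        designability[winners[0]] += 1

print("structures:", len(maps), "designabilities:", sorted(designability, reverse=True))
```

Even at this tiny size the counts are unevenly distributed across structures, which is the qualitative point: a few structures are the unique ground state of disproportionately many sequences.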
Article
Full-text available
Allometric scaling relations, including the 3/4 power law for metabolic rates, are characteristic of all organisms and are here derived from a general model that describes how essential materials are transported through space-filling fractal networks of branching tubes. The model assumes that the energy dissipated is minimized and that the terminal tubes do not vary with body size. It provides a complete analysis of scaling relations for mammalian circulatory systems that are in agreement with data. More generally, the model predicts structural and functional properties of vertebrate cardiovascular and respiratory systems, plant vascular systems, insect tracheal tubes, and other distribution networks.
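The quarter-power prediction is easy to apply: under B ∝ M^(3/4), a 10^5-fold mass difference implies only a roughly 5,600-fold difference in whole-organism metabolic rate, and mass-specific rate falls as M^(−1/4). A short worked illustration; the masses are round illustrative figures, not measured values:

```python
def metabolic_ratio(m2, m1, exponent=0.75):
    """Predicted whole-organism metabolic rate ratio under B ~ M^exponent."""
    return (m2 / m1) ** exponent

mouse_kg, elephant_kg = 0.03, 3000.0   # illustrative round numbers
r = metabolic_ratio(elephant_kg, mouse_kg)
print(f"mass ratio 1e5 -> whole-body metabolic ratio ~{r:.0f}")
print(f"per-gram metabolic rate scales by ~{(elephant_kg / mouse_kg) ** -0.25:.4f}x")
```

The second line is why, gram for gram, large animals burn energy far more slowly than small ones under the 3/4 law.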
Article
Full-text available
In a cell or microorganism, the processes that generate mass, energy, information transfer and cell-fate specification are seamlessly integrated through a complex network of cellular constituents and reactions. However, despite the key role of these networks in sustaining cellular functions, their large-scale structure is essentially unknown. Here we present a systematic comparative mathematical analysis of the metabolic networks of 43 organisms representing all three domains of life. We show that, despite significant variation in their individual constituents and pathways, these metabolic networks have the same topological scaling properties and show striking similarities to the inherent organization of complex non-biological systems. This may indicate that metabolic organization is not only identical for all living organisms, but also complies with the design principles of robust and error-tolerant scale-free networks, and may represent a common blueprint for the large-scale organization of interactions among all cellular constituents.
Article
Full-text available
We study the diameter, or the mean distance between sites, in a scale-free network having N sites and degree distribution p(k) ∝ k^(−λ), i.e., the probability of having k links outgoing from a site. In contrast to the diameter of regular random networks or small-world networks, which is known to be d ≈ ln N, we show, using analytical arguments, that scale-free networks with 2 < λ < 3 have a much smaller diameter, behaving as d ≈ ln ln N. For λ = 3, our analysis yields d ≈ ln N / ln ln N, as obtained by Bollobás and Riordan, while for λ > 3, d ≈ ln N. We also show that, for any λ > 2, one can construct a deterministic scale-free network with d ≈ ln ln N, which is the lowest possible diameter.
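The practical gap between these growth regimes is easy to tabulate. The asymptotic forms omit constants, so the numbers below indicate growth rates only:

```python
import math

# Compare the three diameter scalings for increasingly large networks.
print(f"{'N':>14} {'ln ln N':>9} {'ln N / ln ln N':>15} {'ln N':>7}")
for exp in (3, 6, 9, 12):
    n = 10 ** exp
    ln_n = math.log(n)
    print(f"{n:>14} {math.log(ln_n):>9.2f} {ln_n / math.log(ln_n):>15.2f} {ln_n:>7.2f}")
# Going from a thousand to a trillion nodes raises ln ln N only from ~1.9 to
# ~3.3, while ln N quadruples (~6.9 to ~27.6): the 2 < lambda < 3 regime is an
# 'ultra-small world'.
```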
Article
Full-text available
We study the structural stability of models of proteins for which the selected folds are unusually stable to mutation, that is, designable. A two-dimensional hydrophobic-polar lattice model was used to determine designable folds and these folds were investigated through Langevin dynamics. We find that the phase diagram of these proteins depends on their designability. In particular, highly designable folds are found to be weaker, i.e., easier to unfold, than low designable ones. We expect this to be related to protein flexibility.
Article
Full-text available
Recent laboratory experiments suggest that a molecule's ability to evolve neutrally is important for its ability to generate evolutionary innovations. In contrast to laboratory experiments, life unfolds on time-scales of billions of years. Here, we ask whether a molecule's ability to evolve neutrally (a measure of its robustness) facilitates evolutionary innovation also on these large time-scales. To this end, we use protein designability, the number of sequences that can adopt a given protein structure, as an estimate of the structure's ability to evolve neutrally. Based on two complementary measures of functional diversity (catalytic diversity and molecular functional diversity in gene ontology), we show that more robust proteins have a greater capacity to produce functional innovations. Significant associations among structural designability, folding rate and intrinsic disorder also exist, underlining the complex relationship of the structural factors that affect protein evolution.
Article
Full-text available
In the framework of a lattice-model study of protein folding, we investigate the interplay between designability, thermodynamic stability, and kinetics. To be ``protein-like'', heteropolymers must be thermodynamically stable, stable against mutating the amino-acid sequence, and must be fast folders. We find two criteria which, together, guarantee that a sequence will be ``protein-like'': i) the ground state is a highly designable structure, i.e., the native structure is the ground state of a large number of sequences, and ii) the sequence has a large Δ/Γ ratio, Δ being the average energy separation between the ground state and the excited compact conformations, and Γ the dispersion in energy of excited compact conformations. These two criteria are not incompatible since, on average, sequences whose ground states are highly designable structures have large Δ/Γ values. These two criteria require knowledge only of the compact-state spectrum. These claims are substantiated by the study of 45 sequences, with various values of Δ/Γ and various degrees of designability, by means of a Bortz-Kalos-Lebowitz algorithm and the Ferrenberg-Swendsen histogram optimization method. Finally, we report on the reasons for slow folding. A comparison between a very slow folding sequence, an average folding one and a fast folding one suggests that slow folding originates from a proliferation of nearly compact low-energy conformations, not present for fast folders.
Article
Full-text available
A large number of complex networks, both natural and artificial, share the presence of highly heterogeneous, scale-free degree distributions. A few mechanisms for the emergence of such patterns have been suggested, optimization not being one of them. In this letter we present the first evidence for the emergence of scaling (and small-worldness) in software architecture graphs from a well-defined local optimization process. Although the rules that define the strategies involved in software engineering should lead to a tree-like structure, the final net is scale-free, perhaps reflecting the presence of conflicting constraints unavoidable in a multidimensional optimization process. The consequences for other complex networks are outlined.
Article
Full-text available
Many complex systems, such as communication networks, display a surprising degree of robustness: while key components regularly malfunction, local failures rarely lead to the loss of the global information-carrying ability of the network. The stability of these complex systems is often attributed to the redundant wiring of the functional web defined by the systems' components. In this paper we demonstrate that error tolerance is not shared by all redundant systems, but it is displayed only by a class of inhomogeneously wired networks, called scale-free networks. We find that scale-free networks, describing a number of systems, such as the World Wide Web, Internet, social networks or a cell, display an unexpected degree of robustness, the ability of their nodes to communicate being unaffected by even unrealistically high failure rates. However, error tolerance comes at a high price: these networks are extremely vulnerable to attacks, i.e. to the selection and removal of a few nodes that play the most important role in assuring the network's connectivity.
Article
Experiments show that evolutionary fitness landscapes can have a rich combinatorial structure due to epistasis. For some landscapes, this structure can produce a computational constraint that prevents evolution from finding local fitness optima, thus overturning the traditional assumption that local fitness peaks can always be reached quickly if no other evolutionary forces challenge natural selection. Here, I introduce a distinction between easy landscapes of traditional theory where local fitness peaks can be found in a moderate number of steps, and hard landscapes where finding local optima requires an infeasible amount of time. Hard examples exist even among landscapes with no reciprocal sign epistasis; on these semismooth fitness landscapes, strong-selection weak-mutation dynamics cannot find the unique peak in polynomial time. More generally, on hard rugged fitness landscapes that include reciprocal sign epistasis, no evolutionary dynamics (even ones that do not follow adaptive paths) can find a local fitness optimum quickly. Moreover, on hard landscapes, the fitness advantage of nearby mutants cannot drop off exponentially fast but has to follow a power-law that long-term evolution experiments have associated with unbounded growth in fitness. Thus, the constraint of computational complexity enables open-ended evolution on finite landscapes. Knowing this constraint allows us to use the tools of theoretical computer science and combinatorial optimization to characterize the fitness landscapes that we expect to see in nature. I present candidates for hard landscapes at scales from single genes, to microbes, to complex organisms with costly learning (Baldwin effect) or maintained cooperation (Hankshaw effect). Just how ubiquitous hard landscapes (and the corresponding ultimate constraint on evolution) are in nature becomes an open empirical question.
Article
Some preliminary work is presented on a very general new theory of inductive inference. The extrapolation of an ordered sequence of symbols is implemented by computing the a priori probabilities of various sequences of symbols. The a priori probability of a sequence is obtained by considering a universal Turing machine whose output is the sequence in question. An approximation to the a priori probability is given by the shortest input to the machine that will give the desired output. A more exact formulation is given, and it is made somewhat plausible that extrapolation probabilities obtained will be largely independent of just which universal Turing machine was used, providing that the sequence to be extrapolated has an adequate amount of information in it. Some examples are worked out to show the application of the method to specific problems. Applications of the method to curve fitting and other continuous problems are discussed to some extent. Some alternative
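The shortest-program idea above is often approximated in practice by compressed length: more regular sequences admit shorter descriptions and so receive higher a priori weight. A minimal sketch of this approximation (my illustration, not from the paper; `zlib` is only a crude stand-in for a universal Turing machine):

```python
import zlib

def complexity(s: bytes) -> int:
    """Compressed length in bytes: a crude upper bound on description length."""
    return len(zlib.compress(s, 9))

def apriori_weight(s: bytes) -> float:
    """Solomonoff-style weight 2^(-K), with compressed length in bits standing in for K."""
    return 2.0 ** (-8 * complexity(s))

regular = b"01" * 500  # highly patterned sequence
random_ = bytes([(37 * i * i + 11 * i) % 256 for i in range(1000)])  # irregular-looking bytes

# The patterned string compresses far better, so it receives a much larger prior weight.
print(complexity(regular), complexity(random_))
```

Under this approximation, extrapolation favours continuations that keep the overall sequence compressible, mirroring the paper's preference for short programs.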
Article
The relationship between optimization, evolutionary sequence selection, and structural symmetry is investigated for an elementary continuum model of proteins in which a complete correspondence between sequence and structure can be established. It is found that (i) kinetic optimization (minimal frustration) is strongly connected with ground state symmetry and (ii) the highest symmetry ground state is the least fragile to sequence mutations.
Article
All living things are remarkably complex, yet their DNA is unstable, undergoing countless random mutations over generations. Despite this instability, most animals do not grow two heads or die, plants continue to thrive, and bacteria continue to divide. Robustness and Evolvability in Living Systems tackles this perplexing paradox. The book explores why genetic changes do not cause organisms to fail catastrophically and how evolution shapes organisms' robustness. Andreas Wagner looks at this problem from the ground up, starting with the alphabet of DNA, the genetic code, RNA, and protein molecules, moving on to genetic networks and embryonic development, and working his way up to whole organisms. He then develops an evolutionary explanation for robustness. Wagner shows how evolution by natural selection preferentially finds and favors robust solutions to the problems organisms face in surviving and reproducing. Such robustness, he argues, also enhances the potential for future evolutionary innovation. Wagner also argues that robustness has less to do with organisms having plenty of spare parts (the redundancy theory that has been popular) and more to do with the reality that mutations can change organisms in ways that do not substantively affect their fitness. Unparalleled in its field, this book offers the most detailed analysis available of all facets of robustness within organisms. It will appeal not only to biologists but also to engineers interested in the design of robust systems and to social scientists concerned with robustness in human communities and populations.
Article
1. In the course of evolution, complicated organisms have descended from much simpler ones. Since the instructions to form an organism are contained in the nucleus of its fertilized egg, this means that the genetic constitution has become correspondingly more complex in evolution. If we express this complexity in terms of its improbability, defining the amount of genetic information as the negative logarithm of its probability of occurrence by chance, we may say that genetic information is increased in the course of progressive evolution, guided by natural selection of random mutations. 2. It was demonstrated that the rate of accumulation of genetic information in adaptive evolution is directly proportional to the substitutional load, i.e. the decrease of Darwinian fitness brought about by substituting for one gene its allelic form which is more fitted to a new environment. The rate of accumulation of genetic information is expressed in terms of the substitutional load L_e, measured in 'Malthusian parameters'. 3. Using L_e = 0.199, a value obtained from the application of the 'principle of minimum genetic load' (cf. Kimura, 1960b), it was estimated that the total amount of genetic information accumulated since the beginning of the Cambrian epoch (500 million years) may be of the order of 10^8 bits, if evolution has proceeded at the standard rate. Since the genetic information is transformed into phenotypic information in ontogeny, this figure (10^8 bits) must represent the amount of information which corresponds to the improved organization of higher animals as compared to their ancestors 500 million years back. 4. Problems involved in storage and transformation of genetic information thus acquired were discussed, and it was pointed out that the redundancy of information in the form of repetition in linear sequence of nucleotide pairs within a gene may play an important role in the storage of genetic information.
Article
Previous theoretical studies on the pentose phosphate cycle (Meléndez-Hevia et al., 1985, 1988, 1990) demonstrated that simplicity in metabolism, defined as the least possible number of enzyme reactions in a pathway, has been a target in biological evolution. Those results demonstrated that a process of optimization has occurred in the evolution of metabolism. However, the results also suggest a number of questions of general interest: (i) Why simplicity? What is the selective advantage of simplicity in metabolic pathways? (ii) How has simplicity been achieved? Can natural selection mechanisms solve the problems of combinatorial optimization in the design of metabolism? (iii) Are the reaction mechanisms of the pentose phosphate cycle (transketolase and transaldolase) the best suited for pentose-hexose interconversion? For example, could a simpler pathway be possible if other enzymes (e.g. one carbon transfer) were to exist? In this paper we analyze all these questions and present results which demonstrate that: (i) Simplicity (the least possible number of steps) in a metabolic pathway is a feature which supplies more catalytic efficiency. That is, for a given metabolic conversion, a short pathway yields more flux than a long one. (ii) Natural selection working at molecular level accounts for the selection of the shortest pathway. (iii) It is not possible to find any other set of enzyme mechanisms capable of producing a simpler solution for the pentose phosphate pathway; any other mechanism, such as one carbon transfer between sugars, leads to a more complicated solution. Therefore, our results demonstrate that both the design of this pathway and the enzyme mechanisms themselves have been optimized.
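The catalytic-efficiency argument can be illustrated with a back-of-the-envelope model (my simplification, not the authors' full treatment): for an unbranched chain of identical first-order steps sharing a fixed total enzyme budget, each step acts like a 'resistance' 1/(k·E_i), so flux falls off with the number of steps and the shortest pathway carries the most flux.

```python
def flux(n_steps: int, k: float = 1.0, total_enzyme: float = 1.0, driving: float = 1.0) -> float:
    """Steady-state flux through an unbranched chain of n identical linear steps.

    The fixed enzyme budget is split evenly, E_i = total_enzyme / n_steps, so
    the per-step resistance 1/(k * E_i) grows with n and the summed resistance
    scales as n**2 / (k * total_enzyme): flux drops as 1/n**2 in this toy model.
    """
    e_per_step = total_enzyme / n_steps
    total_resistance = n_steps * (1.0 / (k * e_per_step))
    return driving / total_resistance

print(flux(3), flux(7))  # the shorter chain carries the higher flux
```

The exact scaling depends on modelling choices (fixed total enzyme vs fixed enzyme per step), but the qualitative conclusion matches the abstract: for a given conversion, a short pathway yields more flux than a long one.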
Article
Only about 1000 qualitatively different protein folds are believed to exist in nature. Here, we review theoretical studies which suggest that some folds are intrinsically more designable than others, i.e. are lowest energy states of an unusually large number of sequences. The sequences associated with these folds are also found to be unusually thermally stable. The connection between highly designable structures and highly stable sequences is generally known as the ‘designability principle’. The designability principle may help explain the small number of natural folds, and may also guide the design of new folds.
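Designability is defined by counting: a structure's designability is the number of sequences whose lowest-energy state is that structure. A hedged toy illustration of the counting itself (my construction, not the review's lattice model; the 'phenotype' here is just the multiset of run lengths of a binary string, standing in for a fold that forgets some sequence detail):

```python
from collections import Counter
from itertools import product

def phenotype(seq):
    """Toy genotype-to-phenotype map: the multiset of run lengths of a binary sequence."""
    runs, count = [], 1
    for a, b in zip(seq, seq[1:]):
        if a == b:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return tuple(sorted(runs))

L = 12
# Designability of a 'structure' = number of sequences mapping to it; as in
# lattice-protein studies, a few structures soak up disproportionately many sequences.
designability = Counter(phenotype(seq) for seq in product("01", repeat=L))
print(designability.most_common(3))
```

Even in this toy map the designability distribution is highly skewed, which is the qualitative pattern the designability principle builds on.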
Article
Information is a key concept in evolutionary biology. Information stored in a biological organism's genome is used to generate the organism and to maintain and control it. Information is also that which evolves. When a population adapts to a local environment, information about this environment is fixed in a representative genome. However, when an environment changes, information can be lost. At the same time, information is processed by animal brains to survive in complex environments, and the capacity for information processing also evolves. Here, I review applications of information theory to the evolution of proteins and to the evolution of information processing in simulated agents that adapt to perform a complex task.
Article
Highly optimized tolerance (HOT) is a mechanism that relates evolving structure to power laws in interconnected systems. HOT systems arise where design and evolution create complex systems sharing common features, including (1) high efficiency, performance, and robustness to designed-for uncertainties, (2) hypersensitivity to design flaws and unanticipated perturbations, (3) nongeneric, specialized, structured configurations, and (4) power laws. We study the impact of incorporating increasing levels of design and find that even small amounts of design lead to HOT states in percolation.
Article
Branching morphogenesis is one of the earliest events essential for the success of metazoans. By branching out and forming cellular or tissue extensions, cells can maximize their surface area and overcome space constraints posed by organ size. Over the past decade, tremendous progress has been made toward understanding the branching mechanisms of various invertebrate and vertebrate organ systems. Despite their distinct origins, morphologies and functions, different cell and tissue types use a remarkably conserved set of tools to undergo branching morphogenesis. Recent studies have shed important light on the basis of molecular conservation in the formation of branched structures in diverse organs.
Article
Very little is known about the distribution of functional DNA, RNA, and protein molecules in sequence space. The question of how the number and complexity of distinct solutions to a particular biochemical problem varies with activity is an important aspect of this general problem. Here we present a comparison of the structures and activities of eleven distinct GTP-binding RNAs (aptamers). By experimentally measuring the amount of information required to specify each optimal binding structure, we show that defining a structure capable of 10-fold tighter binding requires approximately 10 additional bits of information. This increase in information content is equivalent to specifying the identity of five additional nucleotide positions and corresponds to an approximately 1000-fold decrease in abundance in a sample of random sequences. We observe a similar relationship between structural complexity and activity in a comparison of two catalytic RNAs (ribozyme ligases), raising the possibility of a general relationship between the complexity of RNA structures and their functional activity. Describing how information varies with activity in other heteropolymers, both biological and synthetic, may lead to an objective means of comparing their functional properties. This approach could be useful in predicting the functional utility of novel heteropolymers.
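The arithmetic behind the abstract's numbers can be checked directly: one nucleotide position carries log2(4) = 2 bits, so five positions carry 10 bits, and fixing 10 bits of information reduces expected abundance in a pool of random sequences by a factor of 2^10 = 1024, roughly the quoted 1000-fold. A quick sketch:

```python
import math

bits_per_nucleotide = math.log2(4)  # four possible bases -> 2 bits per position
extra_positions = 5
extra_bits = extra_positions * bits_per_nucleotide  # 10 bits

# Each specified bit halves the expected abundance of matching molecules
# in a pool of random sequences, so 10 bits cost a ~1000-fold decrease.
abundance_factor = 2 ** extra_bits

print(extra_bits, abundance_factor)
```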
Article
We present an analysis of the topologies of a class of networks which are optimal in terms of the requirements of having as short a route as possible between any two nodes while yet keeping the congestion in the network as low as possible. Strikingly, we find a variety of distinct topologies and novel phase transitions between them on varying the number of links per node. Our results suggest that the emergence of the topologies observed in nature may arise both from growth mechanisms and the interplay of dynamical mechanisms with a selection process.
Article
Developmental processes are thought to be highly complex, but there have been few attempts to measure and compare such complexity across different groups of organisms. Here we introduce a measure of biological complexity based on the similarity between developmental and computer programs. We define the algorithmic complexity of a cell lineage as the length of the shortest description of the lineage based on its constituent sublineages. We then use this measure to estimate the complexity of the embryonic lineages of four metazoan species from two different phyla. We find that these cell lineages are significantly simpler than would be expected by chance. Furthermore, evolutionary simulations show that the complexity of the embryonic lineages surveyed is near that of the simplest lineages evolvable, assuming strong developmental constraints on the spatial positions of cells and stabilizing selection on cell number. We propose that selection for decreased complexity has played a major role in moulding metazoan cell lineages.
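A hedged sketch of the underlying idea (the paper defines complexity via constituent sublineages; here `zlib`, which exploits repeated substrings, serves only as a crude stand-in for shortest-description length): encode a lineage tree as a bracketed string and compare its compressed length against a shuffled control over the same characters.

```python
import random
import zlib

def description_length(lineage: str) -> int:
    """Compressed length in bytes, a rough proxy for shortest-description length."""
    return len(zlib.compress(lineage.encode(), 9))

# A toy lineage in bracket notation, built by reusing one sublineage motif.
motif = "(a(bc))"
repetitive_lineage = "(" + motif * 40 + ")"

rng = random.Random(0)
chars = list(repetitive_lineage)
rng.shuffle(chars)
shuffled_control = "".join(chars)

# A lineage with reused sublineages needs a much shorter description than a
# scrambled string over the same characters, echoing 'simpler than chance'.
print(description_length(repetitive_lineage), description_length(shuffled_control))
```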
Article
Theoretical studies of RNA and lattice protein models suggest that mutationally robust or so-called designable phenotypes tend to have special geometric features such as being more compact and more geometrically regular. Such geometrical forms have also been linked to speed of folding and stability properties that may also assist in promoting mutational robustness. Here we test these theoretical predictions on a non-redundant collection of 2,660 experimentally determined structures from the PDB (Protein Data Bank) and CATH (Class Architecture Topology Homologous superfamily) database. We first developed an index summarizing the geometrical regularity of the structures and then used this index to show that the statistical pattern of empirical data is consistent with the theoretical predictions relating geometry to mutational robustness. Mutationally robust proteins tend to be more symmetric and compact. But the relationship between compactness and robustness cannot be explained simply by the geometrical packing of individual amino acids in proteins; rather, it is a property of the whole system that is related to the statistical characteristics of the folding landscape. Finally, we hypothesize that a triplet relationship between mutational robustness, stability and form is a general property of objects that optimize real-valued relationships between sequences and discrete structures.
Article
Mammalian lungs are branched networks containing thousands to millions of airways arrayed in intricate patterns that are crucial for respiration. How such trees are generated during development, and how the developmental patterning information is encoded, have long fascinated biologists and mathematicians. However, models have been limited by a lack of information on the normal sequence and pattern of branching events. Here we present the complete three-dimensional branching pattern and lineage of the mouse bronchial tree, reconstructed from an analysis of hundreds of developmental intermediates. The branching process is remarkably stereotyped and elegant: the tree is generated by three geometrically simple local modes of branching used in three different orders throughout the lung. We propose that each mode of branching is controlled by a genetically encoded subroutine, a series of local patterning and morphogenesis operations, which are themselves controlled by a more global master routine. We show that this hierarchical and modular programme is genetically tractable, and it is ideally suited to encoding and evolving the complex networks of the lung and other branched organs.
Article
A new definition of program-size complexity is made. H(A,B/C,D) is defined to be the size in bits of the shortest self-delimiting program for calculating strings A and B if one is given a minimal-size self-delimiting program for calculating strings C and D. This differs from previous definitions: (1) programs are required to be self-delimiting, i.e. no program is a prefix of another, and (2) instead of being given C and D directly, one is given a program for calculating them that is minimal in size. Unlike previous definitions, this one has precisely the formal properties of the entropy concept of information theory. For example, H(A,B) = H(A) + H(B/A) + O(1). Also, if a program of length k is assigned measure 2^{-k}, then H(A) = -log_2 (the probability that the standard universal computer will calculate A) + O(1). Key Words and Phrases: computational complexity, entropy, information theory, instantaneous code, Kraft inequality, minimal program, probab...
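Chaitin's chain rule H(A,B) = H(A) + H(B/A) + O(1) is often mimicked in practice with a compressor: C(AB) - C(A) serves as a rough stand-in for the conditional term. A hedged sketch of this approximation (my illustration; `zlib` is only a weak approximation of a universal machine):

```python
import zlib

def C(s: bytes) -> int:
    """Compressed length in bits, a crude upper bound on prefix complexity H."""
    return 8 * len(zlib.compress(s, 9))

A = b"0110" * 200
B = b"0110" * 200  # identical to A, so B carries little news given A

cond_B_given_A = C(A + B) - C(A)  # compression-based stand-in for H(B/A)

# When B repeats A, the conditional term is far smaller than C(B) alone,
# echoing the chain rule's split of joint complexity into marginal plus conditional.
print(C(A), C(B), cond_B_given_A)
```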
Optimization in evolution, limitations of
  • H G Spencer
H.G. Spencer. Optimization in evolution, limitations of. In Neil J. Smelser and Paul B. Baltes, editors, International Encyclopedia of the Social and Behavioral Sciences, pages 10882-10887. Pergamon, Oxford, 2001.
Optima and simplicity in nature
  • Kamaludin Dingle
Kamaludin Dingle. Optima and simplicity in nature.