Article

Ode to the Code


... Rather, errors are made, often repaired or discarded, but always tested as the source of blind innovation during the continuous adaptation to unpredictable environmental changes and challenges." Fourthly, the code is not frozen at twenty amino acids (Hayes, 2004) 29; both selenocysteine and pyrrolysine are used in some organisms. Fifthly, the anticodon in tRNAs is not always three nucleotides long but can extend to four or five (Hayes, 2004) 29. ...
... Fourthly, the code is not frozen at twenty amino acids (Hayes, 2004) 29; both selenocysteine and pyrrolysine are used in some organisms. Fifthly, the anticodon in tRNAs is not always three nucleotides long but can extend to four or five (Hayes, 2004) 29. Finally, as Massey (2008) 30 points out, much of evolutionary history is not always due to selection; in particular, what role has random genetic drift played in the evolution of genetic coding apparatuses over the course of history? ...
... For some time, the genetic code in nature was assumed to maximize efficiency and information density. Nowadays the code is investigated from the point of view of providing maximum fault tolerance or robustness [5]. Analogously, to make telecommunication systems fault-tolerant and hence robust, telecommunication technology applies error-detection and error-correction codes to verify or protect information transmitted over communication channels [8]. ...
Article
Many artificial intelligence (AI) techniques are inspired by problem-solving strategies found in nature. Robustness is a key feature in many natural systems. This paper studies robustness in artificial neural networks (ANNs) and proposes several novel, nature-inspired ANN architectures. The paper includes encouraging results from experimental studies on these networks showing increased robustness.
... A number of three-dimensional representations of the genetic code have also been proposed, including a dodecahedral version [7], and multiple tetrahedral ones [8][9][10][11][12][13][14][15][16][17]. Most of these latter ones either divide the tetrahedron into twenty parts representing the amino acids, or divide each of the four faces into nine triangular subdivisions, with the inner vertices or intersections representing the codons. ...
Article
The genetic code is a mapping of 64 codons to 22 actions, including polypeptide chain initiation, termination, and incorporation of the twenty amino acids. The standard tabular representation is useful for looking up which amino acid is encoded by a particular codon, but says little about functional relationships in the code. The possibility of making sense of the code rather than simply enumerating its codon-to-action pairings therefore is appealing, and many have attempted to find geometric representations of the code that illuminate its functional organization. Here, I show that a regular tetrahedron with each of its four faces divided into sixteen equilateral triangles (for a total of 64 triangular 'cells') is a particularly apt geometry for representing the code. I apply five principles of symmetry and balance in order to assign codons to the triangular cells of the tetrahedral faces. These principles draw on various aspects of the genetic code and the twenty amino acids, making the final construct a positional balance of the amino acids and their functions rather than a re-analysis of them. The potential significance of this exercise, and others like it, is that this way of organizing the biological facts may provide new insights into them.
... Feedback loops are pivotal in the emergence of a coding machinery that synthesizes itself [12]. I believe that this self-referential feature of the emergent code is also linked to the universality of the coding phase transition and to its generic topological features [13][14][15], including the 'magic number' of 20 amino-acids [16,17]. Concrete models, such as the TKY dynamical system [3], will allow stringent testing of the topological coding transition. ...
... Besides the traditional code-table (Fig. 1A), quite a few ingenious schemes were devised in order to draw the genetic code on a sheet of paper [74][75][76] or on a sphere [77] in a manner that would reflect the interrelations between codons and amino-acids. ...
Article
The genetic code maps the sixty-four nucleotide triplets (codons) to twenty amino-acids. While the biochemical details of this code were unraveled long ago, its origin is still obscure. We review information-theoretic approaches to the problem of the code's origin and discuss the results of a recent work that treats the code in terms of an evolving, error-prone information channel. Our model - which utilizes the rate-distortion theory of noisy communication channels - suggests that the genetic code originated as a result of the interplay of the three conflicting evolutionary forces: the needs for diverse amino-acids, for error-tolerance and for minimal cost of resources. The description of the code as an information channel allows us to mathematically identify the fitness of the code and locate its emergence at a second-order phase transition when the mapping of codons to amino-acids becomes nonrandom. The noise in the channel brings about an error-graph, in which edges connect codons that are likely to be confused. The emergence of the code is governed by the topology of the error-graph, which determines the lowest modes of the graph-Laplacian and is related to the map coloring problem.
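The error-graph idea in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that two codons are "likely to be confused" exactly when they differ at a single base; the abstract does not specify the noise model, and the names `neighbors`, `build_laplacian`, and `apply_laplacian` are hypothetical. Under that assumption every codon has nine neighbours, and the graph Laplacian (degree matrix minus adjacency matrix) can be built and probed directly.

```python
from itertools import product

BASES = "UCAG"
CODONS = ["".join(p) for p in product(BASES, repeat=3)]  # 64 triplets


def neighbors(codon):
    """Codons that differ from `codon` at exactly one base (assumed error model)."""
    out = []
    for pos, base in enumerate(codon):
        for alt in BASES:
            if alt != base:
                out.append(codon[:pos] + alt + codon[pos + 1:])
    return out


def build_laplacian():
    """Graph Laplacian of the assumed error-graph: degree on the diagonal, -1 per edge."""
    index = {c: k for k, c in enumerate(CODONS)}
    lap = [[0] * len(CODONS) for _ in CODONS]
    for codon in CODONS:
        i = index[codon]
        for other in neighbors(codon):
            lap[i][index[other]] = -1
            lap[i][i] += 1
    return lap


def apply_laplacian(lap, vec):
    """Matrix-vector product, enough to check candidate modes by hand."""
    return [sum(a * b for a, b in zip(row, vec)) for row in lap]


LAP = build_laplacian()

# The constant vector is the zero mode of any graph Laplacian.
zero_mode = apply_laplacian(LAP, [1] * 64)

# A vector that only distinguishes "first base is U" from the rest
# is an eigenvector of this particular graph with eigenvalue 4.
mode = [3 if c[0] == "U" else -1 for c in CODONS]
image = apply_laplacian(LAP, mode)
```

In this toy graph the low-lying modes are patterns that separate codons by a single position, which is one way to picture how the topology of an error-graph could constrain a nonrandom codon-to-amino-acid map. A full spectrum would normally be computed with a numerical linear-algebra library.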
Chapter
Intelligent Computing Everywhere reflects the current perception in various fields that modern computing applications are becoming increasingly challenged in terms of complexity and intelligence. It investigates the relevance and relationship artificial intelligence maintains with "modern strands of computing" i.e. pervasive computing and ambient intelligence, bioinformatics, neuroinformatics, computing and the mind, non-classical computing and novel computing models, as well as DNA computing and quantum computing. Each subject is examined from two main viewpoints, the first provides a state of the art introduction to a field such as DNA computing and the second investigates the subject from an artificial intelligence perspective.
Article
Robustness is a feature in many systems, natural and artificial alike. This chapter investigates robustness from a variety of perspectives including its appearances in nature and its application in modern environments. A particular focus investigates the relevance and importance of robustness in a discipline where many techniques are inspired by problem-solving strategies found in nature---artificial intelligence. The challenging field of artificial intelligence provides an opportunity to engage in a wider discussion on the subject of robustness.
Article
My previous theoretical research shows that the rotating circular genetic code is a viable tool that makes it easier to distinguish the rules of variation applied to amino acid exchange; it presents a precise and positional bio-mathematical balance of codons, according to the amino acids they encode. Here, I demonstrate that when using the conventional or classic circular genetic code, a clearer pattern for the human codon usage per amino acid and per genome emerges. The most used human codons per amino acid were the ones ending with the three-hydrogen-bond nucleotides: C for 12 amino acids and G for the remaining 8, plus one codon for arginine ending in A that was used with approximately the same frequency as the one ending in G for this same amino acid (plus *). The most used codons in man fall almost all the time at the rightmost position, clockwise, ending either in C or in G within the circular genetic code. The human codon usage per genome is compared to that of other organisms such as fruit flies (Drosophila melanogaster), squid (Loligo pealei), and many others. The biosemiotic codon usage of each genomic population or 'Theme' is equated to a 'molecular language'. The C/U choice or difference, and the G/A difference in the third nucleotide of the most used codons per amino acid are illustrated by comparing the most used codons per genome in humans and squids. The human distribution in the third position of most used codons is a 12-8-2, C-G-A, nucleotide ending signature, while the squid distribution in the same position is odd, or uneven: a 13-6-3, U-A-G, nucleotide ending signature. These findings may help to design computational tools to compare human genomes, to determine the exchangeability between compatible codons and amino acids, and for the early detection of incompatible changes leading to hereditary diseases.
Article
Darwin’s insight that species are mutable, and that they descend and originate by means of natural selection, is one of the most widely acknowledged strategies for the origin of species and their survival in nature. In his famous contribution, however, Darwin also writes that he is convinced that “... Natural Selection has been the main but not exclusive means of modification” (Darwin in The origin of species. Oxford University Press, Oxford, p. 7, 1996). This research suggests robustness as another fundamental strategy for survival in nature. The paper does not contradict the popular view, which usually sees robustness as a feature making systems fault-tolerant, thereby focusing on the identification of strategies and techniques for making systems robust (i.e., how to achieve robustness). Rather, the paper extends this view with an interpretation resting on the question: WHY is robustness omnipresent in the world around us? From this point of view, robustness is interpreted as a fundamental mechanism that is in place because of another fundamental feature in nature: the design and use of sub-optimal systems. The paper argues that, in a sense, nature under-specifies systems but compensates for this by providing systems with various degrees of robustness. We believe that this interpretation may lead to fundamentally new design approaches and insights in several fields.
Article
In this article, the pattern learned from the classic or conventional rotating circular genetic code is transferred to a 64-grid model. In this non-static representation, the codons for the same amino acid within each quadrant could be exchanged, wobbling or rotating in a quantum-like manner similar to electrons within an atomic orbit. Represented in this 64-grid format are the three rules of variation encompassing 4, 2, or 1 quadrant, respectively: 1) same position in four quadrants for the essential hydrophobic amino acids that have U at the center, 2) same or contiguous position for the same or related amino acids in two quadrants, and 3) equivalent amino acids within one quadrant. Also represented are the mathematical balance of the odd and even codons, and the most used codons per amino acid in humans compared to one diametrically opposed organism: the plant Arabidopsis thaliana, a comparison that depicts the difference in third nucleotide preferences: a C/U exchange for 11 amino acids, a G/A exchange for 2 amino acids, and G/U or C/A exchanges for one amino acid, respectively. By studying these codon usage preferences per amino acid we present our two hypotheses: 1) a slower translation in vertebrates and 2) a faster translation in invertebrates, possibly due to the aqueous environments where they live. These codon usage preferences may also be able to determine genomic compatibility by comparing individual mRNAs and their functional three-dimensional structure, transport and translation within cells and organisms. These observations are aimed at the design of bioinformatics computational tools to compare human genomes and to determine the exchange between compatible codons and amino acids, to preserve and/or to bring back extinct biodiversity, and for the early detection of incompatible changes that lead to genetic diseases.
Article
General guidelines for the molecular basis of functional variation are presented while focused on the rotating circular genetic code and allowable exchanges that make it resistant to genetic diseases under normal conditions. The rules of variation, bioinformatics aids for preventative medicine, are: (1) same position in the four quadrants for hydrophobic codons, (2) same or contiguous position in two quadrants for synonymous or related codons, and (3) same quadrant for equivalent codons. To preserve protein function, amino acid exchange according to the first rule takes into account the positional homology of essential hydrophobic amino acids with every codon with a central uracil in the four quadrants, the second rule includes codons for identical, acidic, or their amidic amino acids present in two quadrants, and the third rule, the smaller, aromatic, stop codons, and basic amino acids, each in proximity within a 90 degree angle. I also define codifying genes and palindromati, CTCGTGCCGAATTCGGCACGAG.
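The sequence quoted at the end of this abstract can be checked for palindromic structure in the molecular-biology sense, that is, whether it equals its own reverse complement. The following is a minimal sketch; the helper names `reverse_complement` and `is_palindromic` are illustrative and not taken from the source.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the opposite strand read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq):
    """A DNA palindrome reads the same on both strands."""
    return seq == reverse_complement(seq)


SEQ = "CTCGTGCCGAATTCGGCACGAG"  # sequence quoted in the abstract
print(is_palindromic(SEQ))  # True
```

The 22-base sequence is symmetric around its central GAATTC, which is itself a reverse-complement palindrome.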
Article
This paper presents a new version of a periodic table for genetic codes using a ‘Leibnitz Number’ as a codon number or anticodon number, which is a natural binary code number and hence outwardly similar to the Gray code binary number. In the obtained periodic table or in the reformed table (a cube-shaped periodic table), the proteinaceous amino acids not only have periodicity, but also occupy mirror-symmetrical positions with respect to the xy-plane. Moreover, the cube-shaped periodic table allows a partial explanation of non-standard genetic codes and some predictions of potential candidates for non-standard genetic codons to be discovered in the future. By making a new format of a two-dimensional periodic table for anticodons the primary reference point, all of the anticodon pairings with multiple codons can be intimately related to a mirror-symmetrical arrangement of amino acids with relation to the yz-plane in the two-dimensional periodic table. In a later section two new indexes, the Inversion Number and the Miracle Number, are introduced to show that the codon numbers and anticodon numbers play a fundamental role in the structure underlying the genetic code table. These characteristic features, such as periodicity and mirror symmetry of the indexes, hold true not just for the Watson–Crick base-pairs, but also for the non-Watson–Crick base-pairs. Furthermore, in the mammalian mitochondrial genetic code, some basic rules identical or similar to those of the standard genetic code can be disclosed. These results, including the symmetric quality of amino acids and Inversion Numbers, suggest the necessary conditions for the existence of life systems. Additionally, the proposed periodic table can successfully accommodate previous studies, such as the codon ring, the mutation ring, and biosynthetic pathways.
Article
The general features of the genetic code are described. It is considered that originally only a few amino acids were coded, but that most of the possible codons were fairly soon brought into use. In subsequent steps additional amino acids were substituted when they were able to confer a selective advantage, until eventually the code became frozen in its present form.
Article
The amino acid substitutions resulting from single-base substitution in the natural genetic code have been compared with those resulting from single-base substitutions in computer-generated random codes. Considering the amino acid properties of molecular weight, polar requirement, number of dissociating groups, pK(1)', isoelectric point, and alpha-helix forming ability, it is concluded that, for the natural code, single-base substitution in the first position of the codon tends to result in the substitution of an amino acid more similar to the original amino acid than would be expected from a random code. In the natural code, the second position of the codon plays the largest role in determining the properties of the amino acid.
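A much simpler proxy for the position dependence described in this abstract is to count, for each codon position, how many single-base substitutions in the standard code leave the encoded amino acid unchanged. This sketch is not the property-based comparison the abstract reports (it ignores molecular weight, polar requirement, and the other listed properties, and it generates no random codes); it treats the three stop codons as one class, and the function name `synonymous_counts` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_counts(code):
    """Number of synonymous single-base substitutions at positions 1, 2, 3."""
    counts = [0, 0, 0]
    for codon, aa in code.items():
        for pos in range(3):
            for alt in BASES:
                if alt == codon[pos]:
                    continue
                mutant = codon[:pos] + alt + codon[pos + 1:]
                if code[mutant] == aa:
                    counts[pos] += 1
    return counts


# Each position has 64 * 3 = 192 possible substitutions.
print(synonymous_counts(CODE))  # [8, 2, 128]
```

Almost all silent substitutions fall at the third position, and the second position has the fewest, which is consistent with the abstract's remark that the second position plays the largest role in determining amino acid properties.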
Article
The allocation of codons in the genetic code makes possible a moderate minimization of the chemical distances between pairs of neighboring amino acids in the code. However, the code is neither a global nor a local optimum with respect to distance minimization. These findings do not support the physicochemical postulate that distance minimization was a major factor shaping the evolution of the genetic code. They agree with the coevolution theory, which proposes that genetic code evolution was predominantly determined by the concession of codons from precursor to product amino acids in an expansion of the code to accommodate new varieties of amino acids, with distance minimization playing a subsidiary role in deciding the choice of codons to be acquired by the product amino acids from the codon domains of the precursor amino acids.
Article
The hypothesis that the universal genetic code is adapted to double-strand coding is supported by its remarkable compatibility with the RNY comma-less hypothesis. Coding by a triplet code on a polynucleotide double-strand allows for enciphering of five additional messages with reference to a chosen primary reading frame. Assuming the acceptance of coupled mutations on both strands, the best codon register for two overlapping messages can be inferred. The idea of evolutionarily compatible coding of two proteins by one nucleotide double-strand is extended to complementary coding for one protein in folded, single-stranded RNA.
Article
We argue that a primitive genetic code with only 20 separate words explains that there are 20 coded amino acids in modern life. The existence of 64 words on the modern genetic code requires modern life to read almost exclusively one strand of DNA in one direction. In our primitive code, both the original and the complementary sequence are read in either direction to give the same strings of amino acids. The algebra of complements forces synonymy of primitive codons so as to reduce the 64 independent codons of the modern code to exactly 20 independent separate words in the primitive condition. The synonymy in the modern code is the result of selection rather than algebraic forcing. The primitive code has almost no resilience to base mutations, unlike the third base redundancy of the modern code. Our primitive and the modern code are orthogonal. If palindromic proteins were coded by hairpin DNA or RNA, then (i) no punctuation would be needed; (ii) the reverse reading would give the same secondarily folded protein structure; and (iii) the sugar backbone would be read in the conventional 5' to 3' direction for the original arm and its complement. Modern copying of genetic material is almost always antiparallel. However, occasional parallel copying, as does occur in modern life, would give the complementary hairpin that would also read 5' to 3' along its entire length.(ABSTRACT TRUNCATED AT 250 WORDS)
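The count of exactly 20 words can be reproduced with a short enumeration. The sketch below assumes one reading of the abstract: a triplet, its reverse, its complement, and its reverse complement are all treated as the same primitive word. The names `readings` and `primitive_words` are illustrative only.

```python
from itertools import product

BASES = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def readings(codon):
    """All triplets obtained by reading either strand in either direction."""
    comp = "".join(COMPLEMENT[b] for b in codon)
    return {codon, codon[::-1], comp, comp[::-1]}


def primitive_words():
    """Group the 64 triplets into classes that are forced to be synonymous."""
    classes = set()
    for triplet in product(BASES, repeat=3):
        classes.add(frozenset(readings("".join(triplet))))
    return classes


WORDS = primitive_words()
print(len(WORDS))  # 20
```

The 16 triplets that read the same forwards and backwards (such as ACA) pair up with their complements into 8 classes of two, and the remaining 48 triplets fall into 12 classes of four, giving 20 classes in total.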
Article
Statistical and biochemical studies of the genetic code have found evidence of nonrandom patterns in the distribution of codon assignments. It has, for example, been shown that the code minimizes the effects of point mutation or mistranslation: erroneous codons are either synonymous or code for an amino acid with chemical properties very similar to those of the one that would have been present had the error not occurred. This work has suggested that the second base of codons is less efficient in this respect, by about three orders of magnitude, than the first and third bases. These results are based on the assumption that all forms of error at all bases are equally likely. We extend this work to investigate (1) the effect of weighting transition errors differently from transversion errors and (2) the effect of weighting each base differently, depending on reported mistranslation biases. We find that if the bias affects all codon positions equally, as might be expected were the code adapted to a mutational environment with transition/transversion bias, then any reasonable transition/transversion bias increases the relative efficiency of the second base by an order of magnitude. In addition, if we employ weightings to allow for biases in translation, then only 1 in every million random alternative codes generated is more efficient than the natural code. We thus conclude not only that the natural genetic code is extremely efficient at minimizing the effects of errors, but also that its structure reflects biases in these errors, as might be expected were the code the product of selection.
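The kind of comparison described in this abstract can be sketched as follows, under several loudly stated assumptions: the amino acid property is the Kyte-Doolittle hydropathy index (swapped in here for illustration, not the measure used in the abstract); random codes are made by shuffling the 20 amino acids among the synonymous blocks of the standard code with stop codons fixed; and `ts_weight` is a hypothetical parameter that weights transitions relative to transversions. The numbers it produces are not the abstract's results.

```python
import random
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
NATURAL = dict(zip(CODONS, AMINO))
TRANSITIONS = {("T", "C"), ("C", "T"), ("A", "G"), ("G", "A")}
# Kyte-Doolittle hydropathy values (assumed property scale for this sketch).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cost(code, ts_weight=1.0):
    """Weighted mean squared property change over single-base errors (stops skipped)."""
    total = weight_sum = 0.0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos, alt in product(range(3), BASES):
            if alt == codon[pos]:
                continue
            new = code[codon[:pos] + alt + codon[pos + 1:]]
            if new == "*":
                continue
            w = ts_weight if (codon[pos], alt) in TRANSITIONS else 1.0
            total += w * (HYDROPATHY[aa] - HYDROPATHY[new]) ** 2
            weight_sum += w
    return total / weight_sum


def random_code(rng):
    """Permute amino acids among the natural code's synonymous blocks."""
    aas = sorted(set(AMINO) - {"*"})
    shuffled = aas[:]
    rng.shuffle(shuffled)
    swap = dict(zip(aas, shuffled), **{"*": "*"})
    return {codon: swap[aa] for codon, aa in NATURAL.items()}


def fraction_better(n=1000, ts_weight=1.0, seed=0):
    """Share of random codes with a lower cost than the natural code."""
    rng = random.Random(seed)
    base = cost(NATURAL, ts_weight)
    return sum(cost(random_code(rng), ts_weight) < base for _ in range(n)) / n
```

Calling `fraction_better(ts_weight=2.0)` and comparing it with `fraction_better(ts_weight=1.0)` gives a rough feel for how a transition/transversion bias changes the natural code's standing among random alternatives in this toy setting.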
Article
The Standard Genetic Code is organized such that similar codons encode similar amino acids. One explanation suggested that the Standard Code is the result of natural selection to reduce the fitness "load" that derives from the mutation and mistranslation of protein-coding genes. We review the arguments against the mutational load-minimizing hypothesis and argue that they need to be reassessed. We review recent analyses of the organization of the Standard Code and conclude that under cautious interpretation they support the mutational load-minimizing hypothesis. We then present a deterministic asexual model with which we study the mode of selection for load minimization. In this model, individual fitness is determined by a protein phenotype resulting from the translation of a mutable set of protein-coding genes. We show that an equilibrium fitness may be associated with a population with the same genetic code and that genetic codes that assign similar codons to similar amino acids have a higher fitness. We also show that the number of mutant codons in each individual at equilibrium, which determines the strength of selection for load minimization, reflects a long-term evolutionary balance between mutations in messages and selection on proteins, rather than the number of mutations that occur in a single generation, as has been assumed by previous authors. We thereby establish that selection for mutational load minimization acts at the level of an individual in a single generation. We conclude with comments on the shortcomings and advantages of load minimization over other hypotheses for the origin of the Standard Code.
Article
As an approach to investigate the molecular mechanism of in vivo protein folding and the role of translation kinetics in specific folding pathways, we made codon substitutions in the EgFABP1 (Echinococcus granulosus fatty acid binding protein 1) gene that replaced five minor codons with their synonymous major ones. The altered region corresponds to a turn between two short alpha helices. One of the silent mutations of EgFABP1 markedly decreased the solubility of the protein when expressed in Escherichia coli. Expression of this protein also caused strong activation of a reporter gene designed to detect misfolded proteins, suggesting that the turn region has special translation kinetic requirements that ensure proper folding of the protein. Our results highlight the importance of codon usage in in vivo protein folding.
Article
Since discovering the pattern by which amino acids are assigned to codons within the standard genetic code, investigators have explored the idea that natural selection placed biochemically similar amino acids near to one another in coding space so as to minimize the impact of mutations and/or mistranslations. The analytical evidence to support this theory has grown in sophistication and strength over the years, and counterclaims questioning its plausibility and quantitative support have yet to transcend some significant weaknesses in their approach. These weaknesses are illustrated here by means of a simple simulation model for adaptive genetic code evolution. There remain ill-explored facets of the 'error minimizing' code hypothesis, however, including the mechanism and pathway by which an adaptive pattern of codon assignments emerged, the extent to which natural selection created synonym redundancy, its role in shaping the amino acid and nucleotide languages, and even the correct interpretation of the adaptive codon assignment pattern: these represent fertile areas for future research.