Article

Tracing the evolution of RNA structure in ribosomes

Abstract

The elucidation of ribosomal structure has shown that the function of ribosomes is fundamentally confined to dynamic interactions established between the RNA components of the ribosomal ensemble. These findings now enable a detailed analysis of the evolution of ribosomal RNA (rRNA) structure. The origin and diversification of rRNA was studied here using phylogenetic tools directly at the structural level. A rooted universal tree was reconstructed from the combined secondary structures of large (LSU) and small (SSU) subunit rRNA using cladistic methods and considerations in statistical mechanics. The evolution of the complete repertoire of structural ribosomal characters was formally traced lineage-by-lineage in the tree, showing a tendency towards molecular simplification and a homogeneous reduction of ribosomal structural change with time. Character tracing revealed patterns of evolution in inter-subunit bridge contacts and tRNA-binding sites that were consistent with the proposed coupling of tRNA translocation and subunit movement. These patterns support the concerted evolution of tRNA-binding sites in the two subunits and the ancestral nature and common origin of certain structural ribosomal features, such as the peptidyl (P) site, the functional relay of the penultimate stem helix of SSU rRNA, and other structures participating in ribosomal dynamics. Overall results provide a rare insight into the evolution of ribosomal structure.

... Upon the accurate description of the ribosomal structure, a new type of deep structure-phylogenetic sequence analysis has become possible, tracing the ribosomal functional centres back in the evolutionary timeline. A "coded character" approach was suggested, whereby rRNA nucleotides are represented according to their structural state (helices, stems, hairpins, loops etc.) and then analysed using phylogenetic methods (Caetano-Anollés, 2002). Notably, it was confirmed that the peptidyl-tRNA (P) site is the most ancient structure of the ribosome, along with the structures responsible for the subunit interaction and intersubunit dynamics (Caetano-Anollés, 2002). The rRNA regions responsible for tRNA interaction and operation have shown signs of later co-evolution and traceable sequential adjustment that occurred in the rRNA of both subunits of the ribosome concurrently, providing additional evidence towards the structural link between tRNA translocation and intersubunit movement (Caetano-Anollés, 2002). Importantly, this tracing of conserved structures and interactions suggested gradual simplification and "channelling" of the evolutionary changes, whereby the initial diversity of structures and interactions would be gradually simplified, including size optimisations and non-orthologous function replacement, and structures streamlined in relatively independent "units" to enable modularity and subsequent independent evolution (Caetano-Anollés, 2002; Roberts et al., 2008). ...
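As an illustration of the kind of structural coding described in the excerpt above, the following minimal Python sketch derives two simple geometrical characters, helical stem lengths and unpaired segment lengths, from a secondary structure written in dot-bracket notation. The dot-bracket input, the function name `segment_lengths`, and the example structure are assumptions made for illustration only; the sketch does not reproduce the character set or coding scheme of Caetano-Anollés (2002).

```python
def segment_lengths(dot_bracket):
    """Return (stem_lengths, unpaired_lengths) for a pseudoknot-free
    secondary structure given in dot-bracket notation.

    A stem is a run of directly stacked base pairs (i, j), (i+1, j-1), ...
    An unpaired segment is a maximal run of '.' characters.
    """
    # Pair every '(' with its matching ')'.
    stack = []
    partner = {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            j = stack.pop()
            partner[i] = j
            partner[j] = i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")

    # Walk inward from each unvisited opening base while pairs stay stacked.
    stems = []
    seen = set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(" and i not in seen:
            a, b = i, partner[i]
            length = 0
            while a < b and partner.get(a) == b:
                seen.add(a)
                length += 1
                a += 1
                b -= 1
            stems.append(length)

    # Maximal runs of unpaired positions.
    masked = dot_bracket.replace("(", " ").replace(")", " ")
    unpaired = [len(run) for run in masked.split()]
    return stems, unpaired


# Hypothetical toy structure: two stacked stems separated by internal loops.
stems, loops = segment_lengths("((..((...))..))")
print(stems, loops)  # [2, 2] [2, 3, 2]
```

Lengths of this kind can then be binned into discrete character states for a cladistic analysis; how the states are ordered and polarised is a modelling decision that this sketch leaves open.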
Preprint
Translation of the genetic code into proteins is the main process across all life, and ribosomes are ancient cellular machines uniquely enabling this information transformation. We provide a brief overview of the recent advances in linking ribosomal structure and evolution. Based on these insights into ribosomal organisation across time, we propose that early replication and protein biosynthesis functions were inseparable and in fact were performed by the same ancient RNA molecule, the riboreplisome. The riboreplisome hypothesis helps to address the issues of non-Darwinian evolution and a complicated starting point that are characteristic of the RNA world, protein world and RNA:protein mixed co-development theories. We suggest that the riboreplisome is the missing link and a molecular machine connecting chemical and biological evolution paths, by being capable of basic genetic and feature selection functions in a cell or cell-free setting. The riboreplisome hypothesis allows for the sequential, genetically uninterrupted emergence and sophistication of the genetic code and its decoding machinery, and provides plausible explanations for the origins of the three main RNA types involved in the decoding: ribosomal, transfer and messenger RNA. Furthermore, the riboreplisome can help explain the co-evolution of the aminoacylation machinery, the driving force behind selective gene transcription and expression, and cell-like compartmentalisation. While we may never find the original riboreplisome again, we might continue to discover different molecular remnants of its prior existence across the existing biological RNA, which, once identified or resurrected, can be useful in synthetic biology applications.
... The structure of RNA molecules has been used to improve sequence alignments (e.g., [4]) or generate phylogenetic trees describing the evolutionary relationship of organisms (beginning with [5][6][7]). However, the first use of structural information to reconstruct the history of RNA accretion began as either ancestral character state reconstructions (CSRs) along branches of a tree of life generated from rRNA [8] or directly as trees of molecular substructures describing their gradual addition to growing ribosomal molecules [9]. These novel approaches that embed "structure and function directly into phylogenetic analysis" point the way to "how structures evolve from one to the other" [10]. ...
... These novel approaches that embed "structure and function directly into phylogenetic analysis" point the way to "how structures evolve from one to the other" [10]. Their original application to evolutionary studies on different time scales (e.g., initial studies of mRNA and ITS rRNA to SRP RNA and rRNA [11][12][13]) was soon extended to the origin and evolution of ancient RNA molecules: tRNA [14][15][16][17], 5S RNA [18], RNase P RNA [19], SINE RNA [20], and rRNA [9,21]. In one remarkable example, the approach unfolded the translocation ('turnstile') origin and co-evolving history of the RNA and proteins that make up the entire ribosomal complex, the machinery responsible for protein biosynthesis [21]. ...
... Data matrices and rooted phylogenetic trees describing the evolution of tRNA, 5S rRNA, RNase P RNA, and rRNA were from published studies [9,14,18,19]. Original data came from the Bayreuth tRNA database (now at: http://trnadb.bioinf.uni-leipzig.de accessed on 26 May 2021), 5S rRNA Database (http://biobases.ibch.poznan.pl/5SData/ ...
Article
RNA evolves by adding substructural parts to growing molecules. Molecular accretion history can be dissected with phylogenetic methods that exploit structural and functional evidence. Here, we explore the statistical behaviors of lengths of double-stranded and single-stranded segments of growing tRNA, 5S rRNA, RNase P RNA, and rRNA molecules. The reconstruction of character state changes along branches of phylogenetic trees of molecules and trees of substructures revealed strong pushes towards an economy of scale. In addition, statistically significant negative correlations and strong associations between the average lengths of helical double-stranded stems and their time of origin (age) were identified with the Pearson’s correlation and Spearman’s rho methods. The ages of substructures were derived directly from published rooted trees of substructures. A similar negative correlation was detected in unpaired segments of rRNA but not for the other molecules studied. These results suggest a principle of diminishing returns in RNA accretion history. We show this principle follows a tendency of substructural parts to decrease their size when molecular systems enlarge, a tendency that follows the Menzerath–Altmann law of language in full generality and without interference from the details of molecular growth.
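The correlation tests named in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' analysis: the function names and the `ages` and `stem_lengths` values are hypothetical placeholders chosen only to show a negative association, and no significance test is included.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: relative time of origin (0 = oldest) of six substructures
# and the average length of their helical stems. Not taken from the paper.
ages = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
stem_lengths = [9.0, 8.0, 6.5, 6.0, 4.0, 3.5]

print(pearson(ages, stem_lengths))   # negative for these placeholder values
print(spearman(ages, stem_lengths))  # -1.0, since the trend is strictly decreasing
```

Pearson's coefficient measures linear association, while Spearman's rho only requires a monotonic trend, which is one reason for reporting both.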
... These features include the topology and thermodynamics of RNA molecules and the folds of the protein structural domains identified with hidden Markov models of structural recognition in thousands of genomes. 31 For example, phylogenomic tree-like statements (phylogenies) portraying the histories of molecular parts, including the structural domains of proteins 35 or the helical stems of RNA molecules, 36 make it possible to map the progression of accretion in the most central macromolecular complex of the cell, the ribosome. The ribosome is an essential molecular machine that is universally present in cells. ...
... The history of the entire ribosomal complex revealed piecemeal buildup of a universal structural core and later on ribosomal diversification. 31,34,36 Phylogenomic trees of rRNA helical stems and protein structural domains that are part of the small and large ribosomal subunits uncovered an evolutionary chronology of accretion (Figure 3). This timeline described the evolution of the universally conserved ribosomal core, which was visualized by coloring relative evolutionary ages (derived directly from the trees) in 3-dimensional (3D) atomic ribosomal models. A molecular clock of folds linked these chronologies to the geological record. ...
[Figure 3. Source: Data from previous studies. 31,34,36]
Article
Networks describe how parts associate with each other to form integrated systems which often have modular and hierarchical structure. In biology, network growth involves two processes, one that unifies and the other that diversifies. Here, we propose a biphasic (bow-tie) theory of module emergence. In the first phase, parts are at first weakly linked and associate variously. As they diversify, they compete with each other and are often selected for performance. The emerging interactions constrain their structure and associations. This causes parts to self-organize into modules with tight linkage. In the second phase, variants of the modules diversify and become new parts for a new generative cycle of higher level organization. The paradigm predicts the rise of hierarchical modularity in evolving networks at different timescales and complexity levels. Remarkably, phylogenomic analyses uncover this emergence in the rewiring of metabolomic and transcriptome-informed metabolic networks, the nanosecond dynamics of proteins, and evolving networks of metabolism, elementary functionomes, and protein domain organization.
... (2) Evolution of nucleic acids. Since RNA molecules carry deep phylogenetic signal and the arrow of time in their structures, we have been able to derive historical accounts of molecular evolution directly from structural topology and thermodynamics [8][41][42][43][44]. The evolutionary signal that we mine exists because the secondary structure is closely linked to structural conformation and dynamics [45]. ...
... These timelines define a 'natural history' of nucleic acids. The origin and evolution of the most ancient RNA molecules have been studied in this way, including tRNA [8,48,49], SINE elements [44], the large and small rRNA subunits [17,32,33,42,43], 5S rRNA [50] and RNase P RNA [51]. ...
... Similarly, SSU and LSU hold about 50 and 100 universal helical segments, respectively, which can also provide details about the evolutionary growth of the RNA molecules. Indeed, ToDs and ToSs enabled construction of detailed timelines of the history of r-proteins and nucleic acids, respectively [17,32,33,43,50]. More importantly, the structural interactions present in models of the atomic structure of the ribosome permitted mapping interactions in both timelines, effectively linking the two. ...
Chapter
The natural history of translation is mysterious but central to our understanding of the origin and evolution of biochemistry and life. tRNA is at the center of this biological process. Its interactions with aminoacyl-tRNA synthetase enzymes define the specificities of the genetic code and those with the ribosome their accurate biosynthetic interpretation. Here we review structural phylogenomic explorations of thousands of genomes and molecular structures that reveal a ‘metabolic-first’ origin of proteins, the early history of tRNA in interaction with cognate synthetase enzymes, the late appearance of a functional ribosome, and the co-evolutionary history of rRNA and proteins during ribosomal growth. We also discuss how the history of amino acid charging and codon specificities is embedded in tRNA and is encoded in genomes. Results uncover a hidden link between the genetic code and protein flexibility and suggest that tRNA molecules are building blocks of ribosomes and genomes. We make explicit the need to understand processes of molecular growth of macromolecules that would explain a primordial ribosome with both biocatalytic and genetic memory storage functions.
... RNA macromolecules also carry deep phylogenetic signal [74][75][76] and the arrow of time in their structures [77][78][79]. RNA base pairs associate and disassociate at rates <0.5 s−1 [80]. Furthermore, RNA folding rate is dependent on chain length [81]. ...
... The thermodynamic stability of evolved molecules also increases in a process known as 'structural canalization' [84,85]. This global trend to increase molecular persistence and stability can be exploited in phylogenetic reconstruction to produce rooted phylogenies of parts and wholes, taking advantage of considerable background knowledge from cladistics, morphometrics and statistical mechanics [77][78][79]. ...
... molecular mechanics, simulations, phylogenetic analysis, thermodynamics; [86]) and comply with Weston's generality criterion through positional and compositional correspondence. The method has been applied to the study of a number of molecules, including rRNA [58,77,78], 5S rRNA [87], tRNA [79,88], RNase P RNA [89], and SINE RNA [90], to study molecular evolution of closely or distantly related organisms spanning years (e.g. continental introduction of a plant pathogenic fungus [76], ascomycete population differentiation [91], or coral evolution [92]) to billions of years of evolution (rise of superkingdoms; e.g. ...
Article
Accretion occurs pervasively in nature at widely different timeframes. The process also manifests in the evolution of macromolecules. Here we review recent computational and structural biology studies of evolutionary accretion that make use of the ideographic (historical, retrodictive) and nomothetic (universal, predictive) scientific frameworks. Computational studies uncover explicit timelines of accretion of structural parts in molecular repertoires and molecules. Phylogenetic trees of protein structural domains and proteomes and their molecular functions were built from a genomic census of millions of encoded proteins and associated terminal Gene Ontology terms. Trees reveal a ‘metabolic-first’ origin of proteins, the late development of translation, and a patchwork distribution of proteins in biological networks mediated by molecular recruitment. Similarly, the natural history of ancient RNA molecules inferred from trees of molecular substructures built from a census of molecular features shows patchwork-like accretion patterns. Ideographic analyses of ribosomal history uncover the early appearance of structures supporting mRNA decoding and tRNA translocation, the coevolution of ribosomal proteins and RNA, and a first evolutionary transition that brings ribosomal subunits together into a processive protein biosynthetic complex. Nomothetic structural biology studies of tertiary interactions and ancient insertions in rRNA complement these findings, once concentric layering assumptions are removed. Patterns of coaxial helical stacking reveal a frustrated dynamics of outward and inward ribosomal growth possibly mediated by structural grafting. The early rise of the ribosomal ‘turnstile’ suggests an evolutionary transition in natural biological computation. Results make explicit the need to understand processes of molecular growth and information transfer of macromolecules.
... But not all La protein is present in the nuclear compartment. It has been demonstrated that 2-4% of the Xenopus La homologue accumulates in the cytoplasm (27) and that the human La (hLa) protein shuttles between nucleus and cytoplasm (28). Moreover, a major pool of La protein is redistributed to the cytoplasm under various stress conditions such as apoptosis (29) or ...
... The origin and association of a SINE to a described superfamily (t-SINE, CORE-SINE or V-SINE) is indicated. 28 Opinion TRENDS in Genetics Vol. 23 No.1 www.sciencedirect.com ...
... Use of a cladistic method to compare the different SINE RNA structures. To test more rigorously the relatedness of the different SINE RNA structures, we used a cladistic approach that recovers phylogenetic information directly from the structure of nucleic acid molecules and can be applied to the study of molecules of highly divergent lineage [27,28]. Maximum parsimony analyses of 37 substructural components derived from 20 representative eukaryotic SINE RNAs resulted in two minimal-length trees (see the supplementary material online, including Table S1 and Table S2, for methods and data), the strict consensus of which is presented in Figure 3. ...
Article
SINEs are mobile DNA elements found in almost all eukaryotes. Study of SINEs first focused on their retroposition mechanism, mutagenic effect and general impact on the structure and evolution of genomes. However, SINE RNAs have recently been proposed to act as cellular riboregulators. By studying cis and trans elements taking part in the metabolism of SINE RNA in the model plant Arabidopsis thaliana, we aim to better understand SINE element biology. First, we experimentally defined the secondary structure of two tRNA-related SINE RNAs: SB1 from Brassica napus and SB2 from Arabidopsis. Although unrelated at the primary sequence level, these RNAs present similar secondary structures. Following this observation, an in silico analysis including tRNA-related SINE RNAs from various eukaryotes was performed by FJ Sun, G Caetano-Anollés and JM Deragon. This study underlines the existence of common evolutionary trends for SINE RNA secondary structure that could be linked with the riboregulator function of SINE RNA. Searching for trans factors involved in SINE RNA metabolism, we chose to characterise the La protein, a ubiquitous RNA-binding protein involved in the metabolism of various RNAs, from non-coding RNA to cellular or viral mRNA. Unlike other eukaryotes, which have only one La protein, we identified in Arabidopsis two proteins with the phylogenetic and structural characteristics of genuine La protein: At32 and At79. We showed that At32 (renamed AtLa1) is able to fulfil La nuclear functions in non-coding RNA maturation, including SINE RNA. We also demonstrate that loss of AtLa1 function leads to embryonic lethality. Although AtLa1 and At79 have the same nuclear localisation, loss of At79 function did not affect viability. AtLa1 and At79 have differing levels and profiles of expression. Furthermore, the AtLa1 and At79 proteins apparently bind distinct sets of RNA. We thus propose that Arabidopsis possesses two functional homologues of the La protein, which have partially specialised to fulfil different aspects of the La function.
... Considering there is an intimate relationship between genus and energy, it would be unsurprising to find, for instance, that extremophiles possess a higher average genus. It has already been demonstrated, in fact, that RNA structure can be phylogenetically conserved even when sequences are not [10,17,34]. To test this hypothesis, and assess whether McGenus could detect such differences, 3D structures of homologs of the same RNA molecules should be computationally and experimentally investigated. ...
... 4.3 deserve further analyses, as the preliminary tests show an intriguing behaviour within different taxonomic classes. Previous works proved a relationship between RNA structure and biological function [10,17,20,25,34], thus similar questions apply to the concept of genus. This should be seen as the first step toward a new research line, which could lead to a more complete understanding of the relationship between the concept of genus and its biological significance, thus improving our comprehension of RNA 2D and 3D structure and its related functions. ...
Preprint
RNA folding prediction remains challenging, but it can also be studied using a topological mathematical approach. In the present paper, the mathematical method for computing the topological classification of RNA structures, which is based on matrix field theory, is briefly reviewed, together with McGenus, a software package used for topological and folding predictions. Additionally, two types of analysis are performed: the prediction results from McGenus are compared with topological information extracted from experimentally determined RNA structures, and the topology of RNA structures is investigated for biological significance, in both evolutionary and functional terms. Lastly, we advocate for more research efforts at the intersection of physics-mathematics and biology, and in particular on the possible contributions that topology can provide to the study of RNA folding and structure.
... While an almost universal consensus defines the PTC as the most ancestral part of the ribosome, Caetano and coworkers argue that the most ancestral part is the processivity center of the ribosome (mRNA decoding and mRNA helicase). Caetano-Anollés applied sophisticated methods to hierarchically organize RNA molecules based on cladistic principles [56]. He derived universal phylogenetic trees based on rRNA secondary structures. ...
... They argued that the addition of shells could have happened easily with every shell locked only if it made the previous structure more stable and kept the activity of transpeptidase. Caetano-Anollés applied sophisticated methods to hierarchically organize RNA molecules based on cladistic principles [56]. He derived universal phylogenetic trees based on rRNA secondary structures. ...
Article
Proteins are the workhorses of the cell and have been key players throughout the evolution of all organisms, from the origin of life to the present era. How might life have originated from the prebiotic chemistry of early Earth? This is one of the most intriguing unsolved questions in biology. Currently, however, it is generally accepted that amino acids, the building blocks of proteins, were abiotically available on primitive Earth, which would have made the formation of early peptides in a similar fashion possible. Peptides are likely to have coevolved with ancestral forms of RNA. The ribosome is the most evident product of this coevolution process, a sophisticated nanomachine that performs the synthesis of proteins codified in genomes. In this general review, we explore the evolution of proteins from their peptide origins to their folding and regulation based on the example of superoxide dismutase (SOD1), a key enzyme in oxygen metabolism on modern Earth.
... • RNA classification of various species (phylogeny) [4,3,16], ...
... 1. there is no base incident to more than one arc, 2. there are no crossing arcs, 3. there is no arc contained into another, 4. there is no arc. ...
... These gene sequences are functionally different and could have experienced differential mutational rates and selective constraints during evolution. 16SrRNA gene sequences have moderately well-conserved secondary structures among distantly related taxa [25] and are usually applied for distinguishing between well-resolved species and establishing relations between genera [26]. ...
Article
Rana temporaria is one of the most widespread Palearctic brown frogs. We aimed to clarify the distribution pattern of the two main genetic clades in the understudied Balkan Peninsula by using 16SrRNA and MT-CYTB sequences, already widely applied in analyses of populations from other parts of Europe, while focusing on the broad area along the Morava river (central Balkans) as a known gap in the species distribution. Additionally, we were interested in revealing the extent of haplotype diversity within the main genetic clades in the Balkans, particularly around the supposed suture zone. The results revealed a suture zone between the Western and Eastern Clades in the central part of the Balkan Peninsula. This indicated the existence of a historical barrier between the Balkan Mountain Belt and geographically close mountains surrounding the Vlasina Plateau (Rhodope/Serbian–Macedonian Massif). The overall observed haplotype diversity in populations of R. temporaria from the Balkan Peninsula seems high. Harboring both main genetic clades of R. temporaria qualifies the Balkan Peninsula as another important center of the species’ genetic diversity, and one that is rich in unique haplotypes. This points to the necessity of applying conservation measures focused on the common European frog populations and habitats in this part of the species’ distribution area.
... Phylogenomic analysis has shown that helical structures accrete in tRNA, 5S rRNA, RNase P RNA, SINE RNA, and rRNA as molecular structures unfold in evolution (reviewed in Caetano-Anollés and Caetano-Anollés 2015). The first studies of accretion made use of CSR along branches of a tree of life generated from rRNA (Caetano-Anollés 2002a) or direct reconstruction of trees of rRNA substructures describing ribosomal growth (Caetano-Anollés 2002b). These studies were later advanced to include evolution of ribosomal proteins, providing a very detailed phylogenomic-based model of ribosomal evolution and showing proteins and RNA were co-evolving in the ribosomal ensemble (Harish and Caetano-Anollés 2012). ...
Article
The principle of continuity demands the existence of prior molecular states and common ancestors responsible for extant macromolecular structure. Here, we focus on the emergence and evolution of loop prototypes – the elemental architects of protein domain structure. Phylogenomic reconstruction spanning superkingdoms and viruses generated an evolutionary chronology of prototypes with six distinct evolutionary phases defining a most parsimonious evolutionary progression of cellular life. Each phase was marked by strategic prototype accumulation shaping the structures and functions of common ancestors. The last universal common ancestor (LUCA) of cells and viruses and the last universal cellular ancestor (LUCellA) defined stem lines that were structurally and functionally complex. The evolutionary saga highlighted transformative forces. LUCA lacked biosynthetic ribosomal machinery, while the pivotal LUCellA lacked essential DNA biosynthesis and modern transcription. Early proteins therefore relied on RNA for genetic information storage but appeared initially decoupled from it, hinting at transformative shifts of genetic processing. Urancestral loop types suggest advanced folding designs were present at an early evolutionary stage. An exploration of loop geometric properties revealed gradual replacement of prototypes with α-helix and β-strand bracing structures over time, paving the way for the dominance of other loop types. AlphaFold2-generated atomic models of prototype accretion described patterns of fold emergence. Our findings favor a ‘processual’ model of evolving stem lines aligned with Woese’s vision of a communal world. This model prompts discussing the ‘problem of ancestors’ and the challenges that lie ahead for research in taxonomy, evolution and complexity.
... The mentioned methods have already proven to be excellent for the identification of BMR cryptic species [14][15][16]21]. The 16SrRNA gene is regularly used to identify well-differentiated species, genera and distantly related taxa [27][28][29]. Polymorphism of the MT-CYTB gene has been widely used to identify species and phylogenetic relationships in BMRs [30][31][32]. In addition to proving that N. l. syrmiensis still occurs in Serbia, we determined its greatly reduced distribution range. ...
Article
Blind mole rats (genus Nannospalax) attract a great deal of attention because of their cancer resistance and longevity. Due to the high rate of chromosome rearrangements, 74 Nannospalax chromosomal forms have been discovered. The convergence of their external morphology complicates their taxonomy, and many cryptic species remain unrecognized. Thus, the European N. leucodon supersp. is listed in the IUCN Red List of Threatened Species with “Data Deficient” status. It is crucial for the conservation of biodiversity to clarify its taxonomy, to recognize each cryptic species, and assign to them the correct conservation status. Of the more than 20 chromosomal forms described within N. leucodon, five cryptic species occur in Serbia. The most threatened among them—N. l. syrmiensis, described and named 50 years ago in the regions of Srem, Belgrade and Mačva—has been declared extinct in the literature, which may have negative consequences for the conservation of wildlife genetic diversity. Through five years of fieldwork and comparison of 16SrRNA and MT-CYTB gene segments between old, archived teeth and recently collected material, we show that N. l. syrmiensis is not extinct. However, its habitat has been fragmented and reduced, owing primarily to anthropogenic impact. Therefore, detailed surveillance, population-structure studies, risk assessment, and appropriate conservation measures are needed.
... Phylogenomic data-driven analyses allowed for assigning a time of origin to each of its molecular components and building a time series of accretion and diversification events. 6,44,45 The ribosome originated ∼3. This timeline was then mapped onto evolving networks of domain organization. ...
Article
Biomolecular communication demands that interactions between parts of a molecular system act as scaffolds for message transmission. It also requires an organized system of signs—a communicative agency—for creating and transmitting meaning. The emergence of agency, the capacity to act in a given context and generate end‐directed behaviors, has baffled evolutionary biologists for centuries. Here, I explore its emergence with knowledge grounded in over two decades of evolutionary genomic and bioinformatic exploration. Biphasic processes of growth and diversification exist that generate hierarchy and modularity in biological systems at widely ranging time scales. Similarly, a biphasic process exists in communication that constructs a message before it can be transmitted for interpretation. Transmission dissipates matter‐energy and information and involves computation. Agency emerges when molecular machinery generates hierarchical layers of vocabularies in an entangled communication network clustered around the universal Turing machine of the ribosome. Computations canalize biological systems to perform biological functions in a dissipative quest to structure long‐lived occurrents. This occurs within the confines of a “triangle of persistence” that maximizes invariance with trade‐offs between economy, flexibility, and robustness. Thus, learning from previous historical and circumstantial experiences unifies modules in a hierarchy that expands the agency of systems.
... The ITS1 and ITS2 regions are already well known to play important roles in the rRNA maturation process [4,13,22,56,61,63,91,93,94], apparently requiring secondary structure, despite dramatic nucleotide sequence variation. Also, the 5.8S rRNA plays a critical role in ribosome movement and protein translation and therefore, displays a high degree of pan-eukaryotic conservation [29,95]. ...
Article
Full-text available
This is the first study to systematically evaluate rRNA secondary structures of Hedysareae with an emphasis on Hedysarum. ITS2 and 5.8S regions of the genus shared a common secondary structure with a four-fingered central loop, whereas ITS1 possessed five distinct structures. The secondary structural features of the two regions provided advantageous data for clades, species groups, and closely related species. Hemi-CBCs were mostly observed in the reconstruction of species groups, and Nsts, mostly between closely related species. The investigations showed that ITS1 varied more than ITS2 in length, GC content, and most of the diversity indices within the tribe. Maximum likelihood analyses of the synchronized sequence-structure tree of ITS1 were performed. The accuracy and phylogenetic signals of ITS1 were higher than ITS2. The similar GC content, and no CBC, in both spacers, fortified the close relationship of CEGO and H. sections Stracheya and Hedysarum clades in the synchronized sequence-structure tree topology of ITS1. In both regions, no inter-generic CBCs were detected inside the CEGO clade and the inter-sectional level of Hedysarum. But, in the ITS2 region, a CBC was detected between H. section Multicaulia, and Taverniera versus H. sections Hedysarum, and Stracheya. The lowest inter-sectional genetic distance and structural features were found between H. sect. Hedysarum and H. sect. Stracheya clades in the ITS2 region.
... 20 While information in RNA structure can improve sequence alignments 21 or be used directly to build standard phylogenetic trees, 22,23 the first studies of molecular growth over evolutionary time (accretion) made use of CSRs along branches of a tree of life generated from ribosomal RNA (rRNA) 24 or reconstructed trees of rRNA substructures describing ribosomal growth. 25 These phylogenetic strategies make it possible to study molecular accretion in different RNA molecules, including RNA of ancient origin such as transfer RNA (tRNA), 5S rRNA, RNase P RNA, SINE RNA and rRNA (reviewed in ref. 26 ). Operationally, geometrical features (e.g., length of single-stranded or double-stranded RNA segments) or statistical features (e.g. ...
Preprint
Full-text available
Biomolecular communication demands that interactions between parts of a molecular system act as scaffolds for message transmission. It also requires an evolving and organized system of signs - a communicative agency - for creating and transmitting meaning. Here I explore the need to dissect biomolecular communication with retrodiction approaches that make claims about the past given information that is available in the present. While the passage of time restricts the explanatory power of retrodiction, the use of molecular structure in biology offsets information erosion. This allows description of the gradual evolutionary rise of structural and functional innovations in RNA and proteins. The resulting chronologies can also describe the gradual rise of molecular machines of increasing complexity and computation capabilities. For example, the accretion of rRNA substructures and ribosomal proteins can be traced in time and placed within a geological timescale. Phylogenetic, algorithmic and theoretical-inspired accretion models can be reconciled into a congruent evolutionary model. Remarkably, the time of origin of enzymes, functional RNA, non-ribosomal peptide synthetase (NRPS) complexes, and ribosomes suggest they gradually climbed Chomsky's hierarchy of formal grammars, supporting the gradual complexification of machines and communication in molecular biology. Future retrodiction approaches and in-depth exploration of theoretical models of computation will need to confirm such evolutionary progression.
... It is beyond the scope of this chapter to review the structural interactions and requirements for translation that involve the 16S rRNA because these have been better studied in other systems. Interested readers are referred to reviews on this topic (Caetano-Anollés, 2002;Demongeot & Seligmann, 2020;Korostelev, Trakhanov, Laurberg, & Noller, 2006;Polacek & Mankin, 2005;Prosdocimi, Zamudio, Palacios-Pérez, Torres de Farias, & José, M., 2020;Rodnina, Beringer, & Wintermeyer, 2007;Schmidt et al., 2016;Wilson & Nierhaus, 2007). ...
... Chaetognath mitogenomes apparently lack tRNA genes, but tRNA-like sequences occur in their 16S rRNAs, suggesting potential dual function for these rRNA stretches (Barthélémy and Seligmann, 2016). Several other authors independently report tRNA-like sequences within rRNAs (Bloch et al., 1983, 1984, 1985, 1989; Caetano-Anollés 2002; and Root-Bernstein 2015; ...
Article
tRNAs presumably accreted into modern ribosomal RNAs. Previous analyses showed similar secondary structures for ancient rRNA subelements and theoretical minimal RNA rings, candidate tRNA ancestors rationally designed from tRNA-unrelated principles. Here, analyses test which tRNA secondary structure subelements resemble ancient/recent rRNA subelements. Results show that ribosomal RNA subelements evolved from structures resembling 1. Upper half part of the tRNA secondary structure; and 2. Towards structures resembling (a) tRNA 5′ stem-loop hairpins in large rRNA subunit and (b) tRNA lower half part in small rRNA subunit (stop and start codons conservation model). tRNAs and rRNAs presumably originated from the tRNA upper half part including the acceptor stem. Modern split 5′ and 3′ tRNA genes (spliced at anticodons) apparently reproduce ancestral-like states, because the acceptor stem protocode suggests acceptor stems evolved from spliced anticodon-like stem-loop hairpins, strengthening central roles for acceptor stem CCA-addition at translation origins. The Root-Bernstein hypothesis on the existence of tRNA structural symmetries presumably reflects late 5’ tRNA stem-loop hairpin duplications, some integrating rRNAs. Analyses of tRNA subelements similarities with rRNA subelements suggest tRNAs evolved and re-evolved by different duplication-fusions, along different structural subdivision models. Hence, sequential/parallel processes, perhaps in the same ancestral organism(s) produced polyphyletic tRNAs. Results confirm RNA ring usefulness for understanding prebiotic and early life evolution, and their similarities with primordial protein coding and tRNA genes.
... To obtain more realistic phylogenetic patterns, we analysed the amount of nucleotide polymorphism in two mitochondrial gene sequences, 16S rRNA and MT-CYTB, which are functionally different and could have experienced differential mutational rates and selective constraints during evolution (for example, in Artiodactyla [39]). 16S rRNA gene sequences have moderately well-conserved secondary structures among distantly related taxa [40], and are usually applied for distinguishing between well-resolved species and establishing relations between genera [41]. However, this gene was successfully used for differentiating cattle breeds [42]. ...
Article
Full-text available
Simple Summary: Cryptic species, hidden by morphological uniformity, represent a significant part of the diversity in some taxonomic groups, and pose a real challenge for conservation planning. Here, we explore cryptic speciation in blind mole rats of the genus Nannospalax, mammals studied comprehensively for many unusual features (cancer resistance, longevity, etc.). Intensive chromosomal changes are one of these peculiarities. In the European N. leucodon species complex, 25 lineages with different karyotypes have been described, comprising undetected/cryptic species. As some of them are endangered, taxonomic revision is urgent for conservation purposes. Using 36–60-year-old archived teeth samples and newly captured animals, we analysed the nucleotide polymorphism of two mitochondrial gene sequences among 17 out of 25 chromosomal forms (the highest number studied so far) and provided molecular genetic records for 5 of them for the first time. Eleven chromosomal forms were separated into distinct clades in phylogenetic trees. High evolutionary divergence values among several chromosomal forms overlapped with those acquired for higher taxonomic categories. By integrating the results of previous karyological analyses and crossbreeding experiments that revealed complete reproductive isolation of seven chromosomal forms, with our new findings, we propose conservation strategies to preserve their genetic diversity.
Abstract: We explored the cryptic speciation of the Nannospalax leucodon species complex, characterised by intense karyotype evolution and reduced phenotypic variability that has produced different lineages, out of which 25 are described as chromosomal forms (CFs), so many cryptic species remain unnoticed. Although some of them should be classified as threatened, they lack the official nomenclature necessary to be involved in conservation strategies. Reproductive isolation between seven CFs has previously been demonstrated.
To investigate the amount and dynamics of genetic discrepancy that follows chromosomal changes, infer speciation levels, and obtain phylogenetic patterns, we analysed mitochondrial 16S rRNA and MT-CYTB nucleotide polymorphism among 17 CFs—the highest number studied so far. Phylogenetic trees delineated 11 CFs as separate clades. Evolutionary divergence values overlapped with acknowledged higher taxonomic categories, or sometimes exceeded them. The fact that CFs with higher 2n are evolutionary older corresponds to the fusion hypothesis of Nannospalax karyotype evolution. To participate in conservation strategies, N. leucodon classification should follow the biological species concept, and proposed cryptic species should be formally named, despite a lack of classical morphometric discrepancy. We draw attention towards the syrmiensis and montanosyrmiensis CFs, estimated to be endangered/critically endangered, and emphasise the need for detailed monitoring and population survey for other cryptic species.
... Even though a complete rigorous analysis and contextualization of this data is unfortunately out of scope, we believe these observations provide enough support to justify further investigations. This data could be useful for evolutionary studies of ribosomes [23,36-38], viroid structures [39] and the enhancement of motif libraries for RNA design [8,40]. As illustrated in sub-section 3.4.2, ...
Article
Full-text available
RNA tertiary structure is crucial to its many non-coding molecular functions. RNA architecture is shaped by its secondary structure composed of stems, stacked canonical base pairs, enclosing loops. While stems are precisely captured by free-energy models, loops composed of non-canonical base pairs are not. Nor are distant interactions linking together those secondary structure elements (SSEs). Databases of conserved 3D geometries (a.k.a. modules) not captured by energetic models are leveraged for structure prediction and design, but the computational complexity has limited their study to local elements, loops. Representing the RNA structure as a graph has recently allowed this work to be extended to pairs of SSEs, uncovering a hierarchical organization of these 3D modules, at great computational cost. Systematically capturing recurrent patterns on a large scale is a main challenge in the study of RNA structures. In this paper, we present an efficient algorithm to compute maximal isomorphisms in edge-colored graphs. We extend this algorithm to a framework well suited to identify RNA modules, and fast enough to considerably generalize previous approaches. To exhibit the versatility of our framework, we first reproduce results identifying all common modules spanning more than 2 SSEs, in a few hours instead of weeks. The efficiency of our new algorithm is demonstrated by computing the maximal modules between any pair of entire RNAs in the non-redundant corpus of known RNA 3D structures. We observe that the biggest modules our method uncovers compose large shared sub-structures spanning hundreds of nucleotides and base pairs between the ribosomes of Thermus thermophilus, Escherichia coli, and Pseudomonas aeruginosa.
... For this machinery to work, the tools that read and build (called ribosomes) need to already be built and carried along. Ribosomes are largely composed of RNA and are hypothesized to have been fully composed of it during early life on Earth [50][51][52]. Since RNA also has DNA-like properties, it is believed that the instruction manual (DNA) and the tool that reads and builds according to it (ribosomes) have a common precursor molecule. Selection itself has been described under certain circumstances to lead to replicators grouping themselves together in ways that favor cooperation and synergy [53], suggesting that the fittest is a cooperative and dynamic unit [54]. ...
Article
Full-text available
The details of abiogenesis, to date, remain a matter of debate and constitute a key mystery in science and philosophy. The prevailing scientific hypothesis implies an evolutionary process of increasing complexity on Earth starting from (self-) replicating polymers. Defining the cut-off point where life begins is another moot point beyond the scope of this article. We will instead walk through the known evolutionary steps that led from these first exceptional polymers to the vast network of living biomatter that spans our world today, focusing in particular on perception, from simple biological feedback mechanisms to the complexity that allows for abstract thought. We will then project from the well-known to the unknown to gain a glimpse into what the universe aims to accomplish with living matter, just to find that if the universe had ever planned to be comprehended, evolution still has a long way to go.
... For this machinery to work the tools that read and build (called ribosomes) need to already be built and carried along. Ribosomes are largely composed of RNA and are hypothesized to have been fully composed of it during early life on Earth (Petrov et al. 2015; Fox 2010; Caetano-Anollés 2002). Since RNA also has DNA-like properties it is believed that the instruction manual (DNA) and the tool that reads and builds according to it (ribosomes) have a common precursor molecule. ...
Preprint
Full-text available
The details of abiogenesis to date remain a matter of debate and constitute a key mystery in science and philosophy. The prevailing scientific hypothesis implies an evolutionary process of increasing complexity on earth starting from (self-) replicating polymers. Defining the cut-off point where life begins is another moot point beyond the scope of this article. We will instead walk through the known evolutionary steps that lead from these first exceptional polymers to the vast network of living biomatter that spans our world today, focusing in particular on perception, from simple biological feedback mechanisms to the complexity that allows for abstract thought. We then will project from the well-known to the unknown to gain a glimpse on what the universe aims to accomplish with living matter, just to find that if the universe had ever planned to be comprehended, evolution still has a long way to go.
... With a view to expanding molecular sequence data, we compare the mitochondrial 16S rRNA gene segment, which has moderately well-conserved secondary structures among distantly related taxa (Caetano-Anollés 2002) and may be suitable for species identification (Yang et al. 2014). Although 16S rRNA sequences can be used routinely to distinguish and establish relationships between genera and well-resolved species, newly diverged species may not be recognizable (Fox et al. 1992). ...
Article
The role of intraspecific karyotype variability in reproductive isolation and speciation has been widely studied. Among the 26 genera of Palaearctic mammals, the blind mole rats genus Nannospalax has the highest karyotype variability with 74 chromosomal forms (CFs). Although these CFs have been described in detail, taxonomic effects of chromosomal rearrangements are still lacking, especially among 25 recorded CFs of European N. leucodon superspecies. As genetic discrepancies for most of them are missing, we analyze nucleotide sequence polymorphism of the mitochondrial 16S rRNA gene between eight N. leucodon CFs. Here we provide for the first time nucleotide sequence data for three CFs: monticola, montanoserbicus and syrmiensis using 40–57-year-old archived samples from our mammalian collection and thus demonstrate the usefulness of archived/museum samples as starting material for DNA analysis. The topology of the phylogenetic tree is congruent with the traditional taxonomic separation of recent blind mole rats with high support. Diversification of N. leucodon cluster into discrete subclusters—CFs—and the extent of evolutionary divergence among them are in accordance with previous findings of complete reproductive isolation between six CFs analyzed here. Additionally, the level of evolutionary divergence among six N. leucodon CFs resembles those recorded among clearly distinct Spalax species and four proposed species of N. ehrenbergi. These facts suggest that they could be cryptic species and bring attention to their conservation and natural resource protection.
... The first evidence in this respect is that the RNA rings designed along these coding constraints resemble a consensus tRNA sequence (Demongeot and Moreira 2007). This similarity strengthens the status of these theoretical sequences as plausible candidate primordial RNAs because tRNAs are probably among life's most ancient molecules (Sun and Raoult 2016, 2018) whose accretion probably produced parts of ribosomal RNA (Bloch et al. 1983, 1984, 1985, 1989; Caetano-Anollés 2002; Barthélémy and Seligmann 2016; Farias et al. 2019), perhaps via an ancestral tRNA dimer (de Farias et al. 2014; Agmon 2016; Guimarães 2017). The hypothetical homology between RNA rings and tRNAs defines an anticodon for each RNA ring, and a corresponding cognate amino acid (Table 1). ...
Article
Full-text available
Deaminations (A->G, C->T) increase with DNA singlestrandedness during replication, presumably creating spontaneous genomic mutational and nucleotide frequency gradients. Alternatively, genes are positioned to avoid deaminations. Deamination gradients affect directly mitogene third codon positions; conserved vertebrate mitochondrial tRNA and protein coding gene arrangements minimize deaminations in anticodons, and first and second codon positions in mitogenes. Here we describe deamination gradients across theoretical minimal RNA rings, 22 nucleotide-long RNAs designed to simulate prebiotic RNAs. These RNA rings code for a start/stop codon and a single codon for each amino acid, and form stem-loop hairpins slowing degradation. They resemble consensus tRNAs, defining potential anticodons and cognate amino acids. Theoretical minimal RNA rings are not designed to include deamination gradients, yet deamination gradients occur in RNA rings. tRNA homology produces stronger evidence for deamination gradients than RNA ring homology defined by coding properties. Deamination gradients start at predicted RNA ring anticodons, corresponding to known homologies between mitochondrial tRNAs and replication origins, and between bacterial tRNA synthetases and mitochondrial DNA polymerase gamma. Deamination gradients are strongest for RNA rings with predicted anticodons matching cognate amino acids that integrated early the genetic code. Presumably protections against deaminations evolved while amino acids integrated the genetic code. Results confirm tRNA-RNA ring homologies. Coding constraints defining RNA rings presumably produce deamination gradients starting at predicted anticodons. Hence, the universal genetic code determines nucleotide deamination gradients in theoretical minimal RNA rings, suggesting adaptation to prevent consequences of deamination mutations. 
Results also indicate that the genetic code’s structure determined evolution of tRNAs, their cognates, tRNA synthetases, and polymerases.
... The ability to compare RNA structures is useful for the prediction of the RNA folding process taking as initial data a set of already known secondary structures [9]. It is also useful for the RNA classification of various species [10,11], for determining the RNA consensus structure of aligned sequences and for the identification of highly conserved structures during evolution [12,13]. Functional RNA families such as tRNA, rRNA, and RNAse P exhibit a highly conserved shape of secondary structure but little sequence similarity [14]. ...
Article
Full-text available
Background: RNA secondary structure comparison is a fundamental task for several studies, among which are RNA structure prediction and evolution. The comparison can currently be done efficiently only for pseudoknot-free structures due to their inherent tree representation. Results: In this work, we introduce an algebraic language to represent RNA secondary structures with arbitrary pseudoknots. Each structure is associated with a unique algebraic RNA tree that is derived from a tree grammar having concatenation, nesting and crossing as operators. From an algebraic RNA tree, an abstraction is defined in which the primary structure is neglected. The resulting structural RNA tree allows us to define a new measure of similarity calculated by exploiting classical tree alignment. Conclusions: The tree grammar with its operators makes it possible to uniquely represent any RNA secondary structure as a tree. Structural RNA trees allow us to perform comparison of RNA secondary structures with arbitrary pseudoknots without taking into account the primary structure.
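The three operators named in this abstract (concatenation, nesting and crossing) correspond to the three ways two base pairs can be arranged along a sequence. The following is a minimal illustrative sketch of that distinction only, not the authors' tree grammar or their similarity measure. It assumes structures are written in extended dot-bracket notation, where a second bracket type marks pseudoknotted pairs; the function names `parse_pairs`, `relation` and `has_pseudoknot` are hypothetical.

```python
from itertools import combinations

# Assumption: extended dot-bracket notation with up to three bracket types.
BRACKETS = {"(": ")", "[": "]", "{": "}"}


def parse_pairs(structure):
    """Return the sorted list of (i, j) base pairs, 0-indexed, i < j."""
    closers = {close: open_ for open_, close in BRACKETS.items()}
    stacks = {open_: [] for open_ in BRACKETS}
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol in stacks:
            stacks[symbol].append(pos)
        elif symbol in closers:
            stack = stacks[closers[symbol]]
            if not stack:
                raise ValueError(f"unmatched {symbol!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def relation(pair_a, pair_b):
    """Classify two distinct base pairs by their relative arrangement."""
    (i, j), (k, l) = sorted((pair_a, pair_b))  # now i < k
    if j < k:
        return "concatenation"  # the pairs lie side by side
    if l < j:
        return "nesting"        # the second pair lies inside the first
    return "crossing"           # the pairs interleave (pseudoknot)


def has_pseudoknot(structure):
    """True if any two base pairs of the structure cross."""
    pairs = parse_pairs(structure)
    return any(relation(a, b) == "crossing" for a, b in combinations(pairs, 2))
```

A pseudoknot-free structure contains only concatenation and nesting relations, which is why it maps naturally onto an ordinary tree; a single crossing pair is what breaks that representation.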
... Molecular components become parts of growing molecules and macromolecules, which also interact and merge with other growing molecules and macromolecules to form molecular complexes that make up higher levels of molecular and cellular structure 7. For example, we used phylogenetic methods to trace the evolutionary growth of the ribosome, the molecular complex responsible for protein synthesis in the cell [8][9][10]. Figure 1A shows a timeline describing how ribosomal RNA (rRNA) helices and ribosomal proteins (r-proteins) accrete in evolution to form the modern ribosomal biosynthetic complex. ...
Article
Full-text available
The evolution of structure in biology is driven by accretion and diversification. Accretion brings together disparate parts to form bigger wholes. Diversification provides opportunities for growth and innovation. Here, we review patterns and processes that are responsible for a 'double tale' of accretion and diversification at various levels of complexity, from proteins and nucleic acids to high-rise building structures in cities. Parts are at first weakly linked and associate variously. As they diversify, they compete with each other and are selected for performance. The emerging interactions constrain their structure and associations. This causes parts to self-organise into modules with tight linkage. In a second phase, variants of the modules evolve and become new parts for a new generative cycle of higher-level organisation. Evolutionary genomics and network biology support the 'double tale' of structural module creation and validate an evolutionary principle of maximum abundance that drives the gain and loss of modules.
... The utility of the internal transcribed spacers to differentiate closely related and cryptic species is well established (Collins and Paskewitz, 1996; Li and Wilkerson, 2005) and studies suggest that inferences from ITS2 sequences and secondary structures strongly correlate with taxonomic classification (Coleman, 2007). Further, secondary structures are particularly useful over primary sequences because they include information on species morphology which is not seen in primary sequences (Caetano-Anollés, 2002). Therefore, the current study was designed to describe the complete ITS2 sequence and secondary structure characteristics of the members of An. culicifacies species complex compared to the universal eukaryotic ITS2 secondary structure and to the vector competence of different sibling species of the An. ...
... To examine the possible structural effects of nucleotide differences in the variable positions, we generated secondary ITS1 and ITS2 rRNA structures from each cloned sequence. Secondary structures can provide information not found in the primary sequence (Caetano-Anollés, 2002). The ITS1 structures were very diverse (examples are presented in Figure 6) showing limited structural conservatism. ...
Article
Full-text available
The internal transcribed spacer (ITS) region (ITS1, 5.8S rDNA, and ITS2) separates the genes coding for the SSU 18S and the LSU 26S rRNAs in the rDNA units, which are organized into long tandem arrays in the overwhelming majority of fungi. As members of a multigenic family, these units are subject to concerted evolution, which homogenizes their sequences. Exceptions have been observed in certain groups of plants and in a few fungal species. In our previous study we described an exceptionally high degree of sequence diversity in the D1/D2 domains of two pulcherrimin-producing Metschnikowia (Saccharomycotina) species which appeared to evolve by reticulation. The major goals of this study were the examination of the diversity of the ITS segments and their evolution. We show that the ITS sequences of these species are not homogenized either, and differ from each other by up to 38 substitutions and indels which have dramatic effects on the predicted secondary structures of the transcripts. The high intragenomic diversity makes the D1/D2 domains and the ITS spacers unsuitable for barcoding of these species and therefore the taxonomic position of strains previously assigned to them needs revision. By analyzing the genome sequence of the M. fructicola type strain, we also show that the rDNA of this species is fragmented, contains pseudogenes and thus evolves by the birth-and-death mechanism rather than by homogenisation, which is unusual in yeasts. The results of the network analysis of the sequences further indicate that the ITS regions are also involved in reticulation. M. andauensis and M. fructicola can form interspecies hybrids and their hybrids segregate, thus providing possibilities for reticulation of the rDNA repeats.
... Molecular components become parts of growing molecules and macromolecules, which also interact and merge with other growing molecules and macromolecules to form molecular complexes that make up higher levels of molecular and cellular structure 7. For example, we used phylogenetic methods to trace the evolutionary growth of the ribosome, the molecular complex responsible for protein synthesis in the cell [8][9][10]. Figure 1A shows a timeline describing how ribosomal RNA (rRNA) helices and ribosomal proteins (r-proteins) accrete in evolution to form the modern ribosomal biosynthetic complex. ...
Preprint
The evolution of structure in biology is driven by accretion and change. Accretion brings together disparate parts to form bigger wholes. Change provides opportunities for growth and innovation. Here we review patterns and processes that are responsible for a 'double tale' of evolutionary accretion at various levels of complexity, from proteins and nucleic acids to high-rise building structures in cities. Parts are at first weakly linked and associate variously. As they diversify, they compete with each other and are selected for performance. The emerging interactions constrain their structure and associations. This causes parts to self-organize into modules with tight linkage. In a second phase, variants of the modules evolve and become new parts for a new generative cycle of higher-level organization. Evolutionary genomics and network biology support the 'double tale' of structural module creation and validate an evolutionary principle of maximum abundance that drives the gain and loss of modules.
... Several previous studies have shown that the secondary reconstruction of ribosomal RNA molecules can aid in improving recognition of primary homology [46,51,52], and refine the alignment process [46][47][48][49][50]. Some structural motifs are highly stable among distantly related taxa, which can provide potentially informative characters for estimating phylogeny [53]. ...
Article
Full-text available
It is well known that the rRNA structure information is important to assist phylogenetic analysis through identifying homologous positions to improve alignment accuracy. In addition, the secondary structure of some conserved motifs is highly stable among distantly related taxa, which can provide potentially informative characters for estimating phylogeny. In this paper, we applied the high-throughput pooled sequencing approach to the determination of neuropteran mitogenomes. Four complete mitogenome sequences were obtained: Micromus angulatus (Hemerobiidae), Chrysoperla nipponensis (Chrysopidae), Rapisma sp. (Ithonidae), and Thaumatosmylus sp. (Osmylidae). This allowed us to sample more complete mitochondrial RNA gene sequences. Secondary structure diagrams for the complete mitochondrial small and large ribosomal subunit RNA genes of eleven neuropterid species were predicted. Comparative analysis of the secondary structures indicated a closer relationship of Megaloptera and Neuroptera. This result was congruent with the resulting phylogeny inferred from sequence alignments of all 37 mitochondrial genes, namely the hypothesis of (Raphidioptera + (Megaloptera + Neuroptera)).
... Thus, CR is utilized in evolutionary analyses due to its high variability in hypervariable segments. 16s rRNA genes are much less diverse; however, an important aspect of rRNA genes is their secondary structures, which are moderately well conserved among distantly related taxa (Caetano-Anollés, 2002). The 16s rRNA gene has been widely used to explore the phylogenetic relationships in marine species at varying taxonomic levels (Li et al., 2008). ...
Article
Fenneropenaeus penicillatus is a widely distributed economically and ecologically important shrimp species, which is endangered in China. Sequence analysis of 16s rRNA and control region (CR) fragments from mitochondrial DNA was conducted to obtain information on genetic diversity and population structure. Individuals from 12 wild F. penicillatus populations located along the southeast coast of China were used. Polymerase chain reaction (PCR) fragments of the CR gene revealed high genetic diversity among the 12 populations; however, PCR fragments of the 16s rRNA gene revealed very low genetic diversity in the Hainan (HN) and Ningde (ND) populations and high genetic diversity in the DS, BH, PT, XM, and SZ populations. Data obtained from the CR and 16s rRNA genes suggested that high genetic differentiation exists among the 12 populations, which is mainly due to the high genetic differentiation between HN and all other 11 populations. These results may be useful for further sustainable management and utilization of this species.
... Compensatory base changes which alter the sequence of this region could restore the higher order structure of the stem resulting in processing of pre-rRNA (Peculis & Greer, 1998). It is already well established that rRNA structure is highly conserved throughout evolution as most of the folding is functionally important despite primary sequence divergence (Caetano-Anollés, 2002). The conservation of specific domains and nucleotide motifs in the ribosomal noncoding spacers is apparent across the eukaryotic kingdom (Coleman, 2007). ...
Article
The use of ribosomal DNA (rDNA) internal transcribed spacer (ITS) primary sequence based phylogeny is a conventional practice to estimate the evolutionary interspecies relationship. However, analysis of the functional folding patterns and higher order secondary structures of ITS regions can provide additional important information regarding species relatedness and interspecies variations. In the present study, we provide the first detailed information on the rDNA ITS secondary structure diversity in the four subclades of the subgenus Protasparagus. Several angiospermic conserved motifs were identified in each of the ITS1, 5.8S and ITS2 secondary structures of the studied taxon. Topological comparison of the ITS1 secondary structures showed variations in the helix-IV regions. Moreover, presence of unique sequence motifs and differences in the internal loop structures were found to be subclade specific. The present study suggests that comprehensive analysis of the ITS1, 5.8S and ITS2 structural elements including helices, loops and bulges can be used as an important tool for species delimitation. The present study investigated the evolution of the secondary structure of ITS marker (its phylogenetic utility), genome size, base chromosome number and phytochemicals, and identified a putative polyploid event shared by a number of Protasparagus species. The phytochemical analysis of two important active compounds, i.e., shatavarin-IV and sarsasapogenin, also reveals their presence in all the studied taxa constitutively even at the subclade level.
... An important aspect of ribosomal RNA genes is that they have secondary structures that are moderately well conserved among distantly related taxa (Caetano-Anolles, 2002). Similarly, Simons and Mayden (1998) used these mitochondrial 16s rRNA and Cyt b genes to resolve the phylogenetic relationships among subfamilies of the family Cyprinidae. ...
... An important aspect of ribosomal RNA genes is that they have secondary structures that are moderately well conserved among distantly related taxa (Caetano-Anolles, 2002). Simons and Mayden (1998) used 12S and 16S rRNA sequences to clarify the phylogenetic relationship of the western North American genus Phoxinus (Cyprinidae) with the western clade and concluded that they likely have Asian or European relatives. ...
Article
To confirm the taxonomic status of seven cyprinin fish species in Iraqi inland waters (Barbus xanthopterus, B. kersin, B. barbulus, B. grypus, B. sharpeyi, B. luteus and Cyprinus carpio), the mitochondrial 16S rRNA gene fragment was used as a molecular marker. The primer was modified to cover a limited 120 bp fragment. The PCR product was tested by 2% agarose gel electrophoresis. Band sizes were estimated using a 100 bp ladder; the correlation between the size of the ladder bands and their migration distance was high (R² > 0.99). The molecular profile of the mtDNA 16S rRNA gene fragment showed that the six Barbus species and Cyprinus carpio responded similarly to the modified primer. A second profile showed that the mugilids Liza abu and Liza klunzingeri (used as an out-group family) and the leuciscin A. vorax (out-group subfamily) did not respond to the modified primer, whereas the cyprinin Carassius auratus did respond, and its 120 bp marker band indicated that this species also belongs to the subfamily Cyprininae.
... Presumably they grew larger over time. Thus, the history of various secondary structural features has recently been traced [46] and exploited to develop a novel method of deducing phylogenetic relationships [47]. Perhaps even more to the point, conserved features in the RNA primary and secondary structure are being mapped against the three-dimensional structures thereby revealing spatial relationships that have changed or been conserved over time. ...
Chapter
Current theories on the origin of life envision an RNA World as the culmination of chemical evolution. The extent of this RNA World, and the biochemical complexity of the progenotes that populated it, is subject to much debate. It, nevertheless, is likely a point of agreement among workers in the field that the discovery of machinery for the chiral synthesis of defined sequence peptides would have paved the way for transition to the modern protein world. With the discovery of an RNA replicase, which might initially have been a catalytic RNA or an early peptide product, the stage would be set for the development of populations of progenotes that had both of these features in one enclosure. Such advanced progenotes would be the first entities capable of having the genetic couple between replication, transcription and translation that is the hallmark of life, as we know it. The modern day tmRNA at one stage is recognized as a tRNA by the ribosome while it subsequently serves as a mRNA during translation. This unusual RNA might be representative of the types of entities present in the late RNA World. The addition of DNA as a better storage medium for genetic information would finalize the transition from the progenotic world to the living systems that exist in the modern world.
... On the other hand, it fuels the debate on the universal common ancestor. This description would attribute a eukaryotic character to it, whereas a prokaryotic character was most often assumed (Caetano-Anolles, 2002). A universally conserved minimal ribosomal structure, comprising the elements essential for carrying out translation, can therefore be described. ...
Article
Translation initiation plays a central role in every living organism. Studying the proteins that assist the ribosome in this task, the initiation factors (IFs), provides insight into the complex molecular mechanisms that ensure the high fidelity and efficiency of translation initiation. Comparison of the initiation factors present in the three kingdoms of life identified three universally conserved initiation factors. Among these, the eukaryotic/archaeal factor e/aIF5B, the counterpart of the bacterial factor IF2, stimulates the ribosomal subunit joining step as in Bacteria. However, the universality of this factor was limited by the lack of a reported interaction between e/aIF5B and the initiator tRNA, whereas such an interaction was characterized in Bacteria. The first part of this thesis work aimed at extending the functional similarity between the factors by demonstrating that e/aIF5B binds methionylated initiator tRNA. The characteristics of this binding are very similar to those of the binding of formylated methionylated initiator tRNA by the bacterial factor IF2. The second part of my thesis work dealt with eIF3, the most complex factor of the eukaryotic translation initiation apparatus. This multimeric factor, composed of 13 subunits in humans and 5 in budding yeast, has no counterpart in the other kingdoms of life despite a central and essential role in Eukaryotes. In addition, eIF3 is involved in numerous cancers. However, the understanding of its functions is severely limited by the lack of structural information on subunit interactions and on interactions with other partners of the translation apparatus. This work led to the development of a plasmid library allowing the coexpression of Saccharomyces cerevisiae eIF3 subunits, or stabilized forms of them, in the bacterium Escherichia coli. The purification of the individual subunits and of different subcomplexes is paving the way toward determination of the factor's structure by an approach combining crystallography and electron microscopy.
... When a mutation occurs in one side of a pair, it is always correlated with substitution on the other side in order to retain the paired bond. This new substitution model, called compensatory base changes (CBCs) [49], is different from the currently widely used nucleotide substitution models, and can include additional information not found in the primary sequence [24,50]. In this study, we explored alternative methods of adding ITS2 secondary structures to increase phylogenetic information gathering without having to add nucleotides. ...
Article
DNA barcoding is a promising species identification method, but it has proved difficult to find a standardized DNA marker in plants. Although the ITS/ITS2 RNA transcript has been proposed as the core barcode for seed plants, it has been criticized for being too conserved in some species to provide enough information or too variable in some species to align it within the different taxa ranks. We selected 30 individuals, representing 16 species and four families, to explore whether ITS2 can successfully resolve species in terms of secondary structure. Secondary structure was predicted using Mfold software and sequence-structure was aligned by MARNA. RNAstat software transformed the secondary structures into 28 symbol code data for maximum parsimony (MP) analysis. The results showed that the ITS2 structures in our samples had a common four-helix folding type with some shared motifs. This conserved structure facilitated the alignment of ambiguous sequences from divergent families. The structure alignment yielded an MP tree, in which most topological relationships were congruent with the tree constructed using nucleotide sequence data. When the data were combined, we obtained a well-resolved and highly supported phylogeny, in which individuals of the same species were clustered together into a monophyletic group. As a result, the different species that are often referred to as the herb "Mu tong" were successfully identified using short fragments of 250 bp ITS2 sequences, together with their secondary structure. Thus our analysis strengthens the potential of ITS2 as a promising DNA barcode because it incorporates valuable secondary structure information that will help improve discrimination between species.
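The compensatory base change (CBC) idea quoted in the excerpt above can be illustrated with a minimal sketch. It assumes two pre-aligned, gap-free sequences that share one dot-bracket secondary structure; the function names `pair_table` and `count_cbcs` are hypothetical and are not taken from Mfold, MARNA or RNAstat.

```python
# Minimal CBC counter (illustrative sketch, not the cited tools' method).
# Assumption: both sequences are aligned, gap-free, and share one structure.

CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def pair_table(dot_bracket):
    """Return sorted (i, j) index pairs for matching brackets."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return sorted(pairs)


def count_cbcs(seq_a, seq_b, dot_bracket):
    """Count base pairs where both sides changed and pairing is retained."""
    cbcs = 0
    for i, j in pair_table(dot_bracket):
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        both_pair = pair_a in CANONICAL and pair_b in CANONICAL
        both_sides_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        if both_pair and both_sides_changed:
            cbcs += 1
    return cbcs
```

For instance, `count_cbcs("GGGAAACCC", "GCGAAACGC", "(((...)))")` compares a G-C pair with a C-G pair at the middle stem position, which this sketch treats as one CBC; a change on only one side (G-C to G-U) is not counted.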
... (1) Surviving molecular progeny do not have to be more or less "stable than their ancestors" in evolution. Accretion implies ensembles of structural modules of different age, both in the BS and phylogenetic models, with properties of molecular flexibility and robustness distributing along the nested lineages of the tree of life (e.g., Caetano-Anollés, 2002, 2005; Sun et al., 2010). ...
Article
Historical (ideographic) and non-historical (nomothetic) studies of ribosomal accretion appear to arrive at diametrically opposite conclusions. Phylogenetic analysis of thousands of RNA molecules and protein structures in hundreds of genomes supports the structural origin of the ribosome in RNA decoding and ribosomal mechanics. Predictions from extant features in a handful of rRNA structural models of the large ribosomal subunit support its origin in protein biosynthesis. In recent correspondence, one of us reported that correcting dismissals of conflicting data and avoiding unwarranted assumptions of the nomothetic method reconciled conclusions. In response, Petrov and Williams dismissed our arguments claiming we did not understand their algorithmic model of ribosomal apical growth. Instead, they controverted the historical approach. Here we show that their objections to the phylogenetic method are unjustified, that their algorithm subjectively guarantees back-in-time molecular deconstructions toward the protein biosynthetic core, and that processes of ribosomal growth are much more complex. We prompt abandoning apriorism, decreasing ad hoc hypotheses and integrating historical and non-historical scientific methods.
Preprint
Freshwater crabs (Potamiscus manipuriensis), commonly consumed as local delicacies by the native people in the state of Manipur, were found to harbour metacercariae of Microphallus sp. (family Microphallidae), which were morphologically different from metacercariae of Microphallus indicus reported earlier from a different host (Barytelphusa lugubris mansoniana) in Meghalaya, another state in Northeast India. PCR-based molecular characterization of this metacercaria was therefore carried out using rDNA marker regions: the large subunit (LSU, 28S) and internal transcribed spacer 2 (ITS2). Sequence and phylogenetic analyses confirmed that the taxon under study belongs to the family Microphallidae. Analysis of ITS2 secondary structure data also confirmed the primary sequence analysis. The analysis further revealed sequence differences at 119 bases (38 transitions, 35 transversions and 46 indels) in 28S and at 25 bases (10 transitions, 7 transversions and 8 indels) in ITS2 between the present microphallid and M. indicus.
Article
Studies with RNA enzymes (ribozymes) and protein enzymes have identified certain structural elements that are present in some cellular mRNAs and viral RNAs. These elements do not share a primary structure and, thus, are not phylogenetically related. However, they have common (secondary/tertiary) structural folds that, according to some lines of evidence, may have an ancient and common origin. The term 'mRNA archaeology' has been coined to refer to the search for such structural/functional relics that may be informative of early evolutionary developments in the cellular and viral worlds and have lasted to the present day. Such identified RNA elements may have developed as biological signals with structural and functional relevance (as if they were buried objects with archaeological value), and coexist with the standard linear information of nucleic acid molecules that is translated into proteins. However, there is a key difference between the methods that extract information from either the primary structure of mRNA or the signals provided by secondary and tertiary structures. The former (sequence comparison and phylogenetic analysis) requires strict continuity of the material vehicle of information during evolution, whereas the archaeological method does not require such continuity. The tools of RNA archaeology (including the use of ribozymes and enzymes to investigate the reactivity of the RNA elements) establish links between the concepts of communication and language theories that have not been incorporated into knowledge of virology, as well as experimental studies on the search for functionally relevant RNA structures.
Article
Introduction: While the origin and evolution of proteins remain mysterious, advances in evolutionary genomics and systems biology are facilitating the historical exploration of the structure, function and organization of proteins and proteomes. Molecular chronologies are series of time events describing the history of biological systems and subsystems and the rise of biological innovations. Together with time-varying networks, these chronologies provide a window into the past. Areas covered: Here, we review molecular chronologies and networks built with modern methods of phylogeny reconstruction. We discuss how chronologies of structural domain families uncover the explosive emergence of metabolism, the late rise of translation, the co-evolution of ribosomal proteins and rRNA, and the late development of the ribosomal exit tunnel; events that coincided with a tendency to shorten folding time. Evolving networks described the early emergence of domains and a late 'big bang' of domain combinations. Expert opinion: Two processes, folding and recruitment, appear central to the evolutionary progression. The former increases protein persistence. The latter fosters diversity. Chronologically, protein evolution mirrors folding by combining supersecondary structures into domains, developing translation machinery to facilitate folding speed and stability, and enhancing structural complexity by establishing long-distance interactions in novel structural and architectural designs.
Article
Ribosomal RNAs are complex structures that presumably evolved by tRNA accretions. Statistical properties of tRNA secondary structures correlate with genetic code integration orders of their cognate amino acids. Ribosomal RNA secondary structures resemble those of tRNAs with recent cognates. Hence, rRNAs presumably evolved from ancestral tRNAs. Here, analyses compare secondary structure subcomponents of small ribosomal RNA subunits with secondary structures of theoretical minimal RNA rings, presumed proto-tRNAs. Two independent methods determined different accretion orders of rRNA structural subelements: (a) classical comparative homology and phylogenetic reconstruction, and (b) a structural hypothesis assuming an inverted onion ring growth where the three-dimensional ribosome's core is most ancient and peripheral elements most recent. Comparisons between (a) and (b) accretion orders with RNA ring secondary structure scales show that recent rRNA subelements are: 1. more like RNA rings with recent cognates, indicating ongoing coevolution between tRNA and rRNA secondary structures; 2. less similar to theoretical minimal RNA rings with ancient cognates. Our method fits (a) and (b) in all examined organisms, more with (a) than (b). Results stress the need to integrate independent methods. Theoretical minimal RNA rings are potential evolutionary references for any sequence-based evolutionary analyses, independent of the focal data from that study.
Article
Stereochemical nucleotide-amino acid interactions, in the form of non-covalent nucleotide-amino acid interactions, potentially produced the genetic code's codon-amino acid assignments. Empirical estimates of single nucleotide-amino acid affinities on surfaces and in solution are used to test whether trinucleotide-amino acid affinities determined genetic code assignments pending the principle "first arrived, first served": presumed early amino acids have greater codon-amino acid affinities than ulterior ones. Here, these single nucleotide affinities are used to approximate all 64x20 trinucleotide-amino acid affinities. Analyses show that 1. on surfaces, genetic code codon-amino acid assignments tend to match high affinities for the amino acids that integrated earliest the genetic code (according to Wong's metabolic coevolution hypothesis between nucleotides and amino acids); and 2. in solution, the same principle holds for the anticodon-amino acid assignments. Affinity analyses match best genetic code assignments when assuming that trinucleotides competed for amino acids, rather than amino acids for trinucleotides. Codon-amino acid affinities stick better to genetic code assignments than anticodon-amino acid affinities. Presumably, two independent coding systems, on surfaces and in solution, converged, and formed the current translation system. Proto-translation on surfaces by direct codon-amino acid interactions without tRNA-like adaptors coadapted with a system emerging in solution by proto-tRNA anticodon-amino acid interactions. These systems assigned identical or similar cognates to codons on surfaces and to anticodons in solution. Results indicate that a prebiotic metabolism predated genetic code self-organization.
Article
Accretions of tRNAs presumably formed the large complex ribosomal RNA structures. Similarities of tRNA secondary structures with rRNA secondary structures increase with the integration order of their cognate amino acid in the genetic code, indicating tRNA evolution towards rRNA-like structures. Here analyses rank secondary structure subelements of three large ribosomal RNAs (Prokaryota: Archaea: Thermus thermophilus; Bacteria: Escherichia coli; Eukaryota: Saccharomyces cerevisiae) in relation to their similarities with secondary structures formed by presumed proto-tRNAs, represented by 25 theoretical minimal RNA rings. These ranks are compared to those derived from two independent methods (ranks provide a relative evolutionary age to the rRNA substructure), (a) cladistic phylogenetic analyses and (b) 3D-crystallography where core subelements are presumed ancient and peripheral ones recent. Comparisons of rRNA secondary structure subelements with RNA ring secondary structures show congruence between ranks deduced by this method and both (a) and (b) (more with (a) than (b)), especially for RNA rings with predicted ancient cognate amino acid. Reconstruction of accretion histories of large rRNAs will gain from adequately integrating information from independent methods. Theoretical minimal RNA rings, sequences deterministically designed in silico according to specific coding constraints, might produce adequate scales for prebiotic and early life molecular evolution.
Article
Frameshifting protein translation occasionally results from insertion of amino acids at isolated mono- or dinucleotide-expanded codons by tRNAs with expanded anticodons. Previous analyses of two different types of human mitochondrial MS proteomic data (Fisher and Waters technologies) detect peptides entirely corresponding to expanded codon translation. Here, these proteomic data are reanalyzed searching for peptides consisting of at least eight consecutive amino acids translated according to regular tricodons, and at least eight adjacent consecutive amino acids translated according to expanded codons. Both datasets include chimerically translated peptides (mono- and dinucleotide expansions, 42 and 37, respectively). The regular tricodon-encoded part of some chimeric peptides corresponds to standard human mitochondrial proteins (mono- and dinucleotide expansions, six (AT6, CytB, ND1, 2xND2, ND5) and one (ND1), respectively). Chimeric translation probably increases the diversity of mitogenome-encoded proteins, putatively producing functional proteins. These might result from translation by tRNAs with expanded anticodons, or from regular tricodon translation of RNAs where transcription/posttranscriptional edition systematically deleted mono- or dinucleotides after each trinucleotide. The pairwise matched combination of adjacent peptide parts translated from regular and expanded codons strengthens the hypothesis that translation of stretches of consecutive expanded codons occurs. Results indicate statistical translation producing distributions of alternative proteins. Genetic engineering should account for potential unexpected, unwanted secondary products.
Article
Ribosomes are the translational machineries having two unequal subunits, small subunit (SSU) and large subunit (LSU) across all the domains of life. Origin and evolution of ribosome are encoded in its structure, and the core of the ribosome is highly conserved. Here, we have used Shannon entropy to analyze the evolution of ribosomal proteins (r-proteins) across the three domains of life. Moreover, we have analyzed the residue conservation at protein-protein (PP) and protein-RNA (PR) interfaces in SSU and LSU. Furthermore, we have studied the evolution of early, intermediate and late binding r-proteins. We show that the r-proteins of Thermus thermophilus are better conserved during the evolution. Furthermore, we find the late binders are better conserved than the early and the intermediate binders. The residues at the interior of the r-proteins are the most conserved followed by those at the interface and the solvent accessible surface. Additionally, we show that the residues at the PP interfaces are better conserved than those at the PR interfaces. However, between PR and PP interfaces, the multi-interface residues at the former are better conserved than those at the latter ones. Our findings may provide insights into the evolution of r-proteins in ribosomal assembly and function.
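The per-position Shannon entropy mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a list of equal-length aligned sequences, gap characters are treated as ordinary symbols, and the function names `column_entropy` and `alignment_entropies` are hypothetical rather than taken from the cited study.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def alignment_entropies(alignment):
    """Entropy of every column in a list of equal-length aligned sequences."""
    # zip(*alignment) transposes rows (sequences) into columns (positions).
    return [column_entropy(col) for col in zip(*alignment)]
```

A fully conserved column gives an entropy of zero, and a column split evenly between two residues gives one bit, so lower values indicate stronger conservation at that position.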
Chapter
This review summarizes some major events in the evolution of body plans along the backbone of the arthropod tree, with a special focus on the origin of insects. The incompatibility among recent molecular phylogenies motivates a discussion about possible causes for failures: there is a worrisome lack of information in alignments, which can be visualized with spectra of split-supporting positions, and there are systematic errors occurring even when using correct models in maximum likelihood methods (Kück et al., this book). Currently, these problems cannot be avoided. Combining information from the fossil record and from extant arthropods, the morphology-based evolutionary scenario leads from worm-like stem-lineage arthropods via first euarthropods to the crown group of Mandibulata. The evolution of the mandibulate head is well documented in the Cambrian Orsten fossils. The evolution within crustaceans is also the evolution that leads to characters of the bauplan of myriapods and insects. It is argued that morphologically myriapods do not fit to the base of the mandibulatan tree and that this placement is also not plausible from a paleontological point of view. Available morphological evidence suggests that myriapods are the sister-group to Hexapoda and that tracheates evolved from a marine ancestor that was similar in many ways to Remipedia. In the extant fauna, the Remipedia are the sister-group of Tracheata.
Chapter
In contrast to theories arguing that cellular life has evolved to transmit genes, we propose instead that cellular life evolved to facilitate the full potential of self-replicating ribosomes. Our theory explicitly rejects "master molecule" theories such as Dawkins's "selfish gene" in favor of the emergence of life by means of systems of increasingly networked interactions that carried out metabolic and genetic functions concurrently within a complex chemical ecology. The critical role of networking chemical interactions within this ecology was (as it still is) mediated by all possible forms of molecular complementarity, of which base-pairing in RNA and DNA is just one. Selection for molecular complementarity functional and structural modules vastly increased the probability that networked systems would evolve, eventually resulting in the first self-replicating entity, which we believe was the ribosome. We make six predictions from our ribosome-first theory of cellular evolution that may seem, at first glance, heretical: (1) Ribosomal RNA (rRNA) contains genetic information encoding its own proteins, meaning that it also encodes messenger RNA (mRNA); (2) these proteins bind to the rRNA to form the functional ribosomal structure, but since the rRNA is also functioning as mRNA, the ribosomal proteins must bind to their own mRNA as well; (3) rRNA encodes all of the transfer RNAs (tRNA) required for the translation of its genetic information; (4) thus, tRNAs may be the precursor modules that gave rise to rRNA; (5) rRNA is pleiofunctional, integrating genetic, protein, translational, and structural information often in the same or overlapping sequences and in all reading frames; and (6) since the ribosome gave rise to cellular life, tRNA- and rRNA-like genetic information must be major building blocks from which cellular genomes evolved. We present evidence supporting all six of these apparently unlikely predictions. Our conclusion is that life is not about the evolution of genes, but the evolution of the kinds of networked interactions through complementarity that characterize ecologies: Genes evolved merely as storage units to "back up" ribosomal functions. This same complementarity-based approach may help to explain why functional traits, rather than genetic populations, appear to network interactions within higher-order systems such as ecosystems and holobionts.
Chapter
The modern ribosomal machinery is very complex, and its core subsystems and many of its individual components are universally found in all three domains of life. This indicates that much of the story of ribosome origins and its subsequent evolution predates the last universal common ancestor (LUCA). Thus, ribosome history relates to other early life issues such as the possibility and nature of an RNA World, the early history of chirality, and always most hopefully the origins of the genetic code. However, this is not the end of the story. As discussed elsewhere in this volume, important events have also occurred since the LUCA, especially in eukaryotic ribosomes that have served to integrate the machinery with other cellular systems. Ribosome origins and subsequent evolution are in reality somewhat separate problems. In addressing the former, this chapter initially examines the source and nature of the peptidyl transferase center (PTC), including where and how the peptide bond is made. This is followed by efforts to understand the subsequent evolution of the ribosome, which led to the addition and refinement of various other functional centers including the decoding center. This is being accomplished using what is in essence a reverse engineering approach to develop a timeline of major events in the ribosome history. Finally, significant events on the timeline are discussed in detail.
Article
Translational control functions in diverse biological processes including developmental pattern formation, intracellular protein targeting, the control of homeostasis, and a variety of stress responses. Since the discovery of genomes in plastids, Chlamydomonas research has played a central role in the development of our understanding of the plastid translation machinery and the mechanisms that regulate translation in chloroplasts, and this chapter reviews relevant findings and models. Protein synthesis rates in the Chlamydomonas chloroplast are routinely measured using pulse labeling of intact cells with 35SO4 or [14C] acetate, or isolated chloroplasts with [35S]methionine. This part, radioisotope pulse labeling of newly synthesized proteins, is described under methodologies of protein synthesis. Genetics has been used extensively for the identification and characterization of trans-acting factors and cis-acting sequences that function in the translation of chloroplast mRNAs. Chloroplast ribosomes are introduced and updated with recent research findings. Translation requires general translation factors (GTF), many of which are GTPases activated by the ribosome. The identities and properties of the Chlamydomonas GTFs are covered in this section. Chlamydomonas responds to light in a variety of ways, including changes in chloroplast gene expression at the translational level. This section reviews distinct translational regulatory responses during dark-to-light transitions and under high light stress. Reverse genetic approaches are used to address the functions of other translation factors originally identified through biochemical approaches.
Article
We present a set of results concerning two types of biological problems: (1) RNA structure comparison and (2) intergenomic distance computation considering non-trivial genomes. In this thesis, we determine the algorithmic complexity of a set of problems linked to either RNA structure comparison (edit distance, APS problem, 2-interval pattern extraction, RNA design), or genomic rearrangements (breakpoints and conserved intervals distances). For each studied problem, we try to find an exact and fast algorithm resolving it. If we do not find such an algorithm, we try to prove that it is impossible to find one. To do so, we prove that the corresponding problem is difficult. Finally, we continue the study of each difficult problem by proposing three types of results: (1) Approximation, (2) Parameterized complexity, (3) Heuristic. We use in this thesis notions of combinatorics, mathematics, graph theory and algorithmics.
Article
Over 73,000 projections of the E. coli ribosome bound with formyl-methionyl initiator tRNAfMet were used to obtain an 11.5 Å cryo-electron microscopy map of the complex. This map allows identification of RNA helices, peripheral proteins, and intersubunit bridges. Comparison of double-stranded RNA regions and positions of proteins identified in both cryo-EM and X-ray maps indicates good overall agreement but points to rearrangements of ribosomal components required for the subunit association. Fitting of known components of the 50S stalk base region into the map defines the architecture of the GTPase-associated center and reveals a major change in the orientation of the α-sarcin-ricin loop. Analysis of the bridging connections between the subunits provides insight into the dynamic signaling mechanism between the ribosomal subunits.
Article
The ribosome is a macromolecular assembly that is responsible for protein biosynthesis following genetic instructions in all organisms. It is composed of two unequal subunits: the smaller subunit binds messenger RNA and the anticodon end of transfer RNAs, and helps to decode the mRNA; and the larger subunit interacts with the amino-acid-carrying end of tRNAs and catalyses the formation of the peptide bonds. After peptide-bond formation, elongation factor G (EF-G) binds to the ribosome, triggering the translocation of peptidyl-tRNA from its aminoacyl site to the peptidyl site, and movement of mRNA by one codon. Here we analyse three-dimensional cryo-electron microscopy maps of the Escherichia coli 70S ribosome in various functional states, and show that both EF-G binding and subsequent GTP hydrolysis lead to ratchet-like rotations of the small 30S subunit relative to the large 50S subunit. Furthermore, our finding indicates a two-step mechanism of translocation: first, relative rotation of the subunits and opening of the mRNA channel following binding of GTP to EF-G; and second, advance of the mRNA/(tRNA)2 complex in the direction of the rotation of the 30S subunit, following GTP hydrolysis.
Article
Arguments for and against combined analysis of multiple data sets in phylogenetic inference are reviewed. Simultaneous analysis of combined data better maximizes cladistic parsimony than separate analyses, hence is to be preferred. Simultaneous analysis can allow "secondary signals" to emerge because it measures strength of evidence supporting disparate results. Separate analyses are useful and of interest to understanding the differences among data sets, but simultaneous analysis provides the greatest possible explanatory power, and should always be evaluated when possible. The mechanics of simultaneous analysis are discussed.
Article
Computer codes for computation and comparison of RNA secondary structures, the Vienna RNA package, are presented. They are based on dynamic programming algorithms and aim at predictions of structures with minimum free energies as well as at computations of the equilibrium partition functions and base pairing probabilities. An efficient heuristic for the inverse folding problem of RNA is introduced. In addition we present compact and efficient programs for the comparison of RNA secondary structures based on tree editing and alignment. All computer codes are written in ANSI C. They include implementations of modified algorithms on parallel computers with distributed memory. Performance analysis carried out on an Intel Hypercube shows that parallel computing becomes gradually more and more efficient the longer the sequences are.
Article
RadCon is a Macintosh® program for manipulating and analysing phylogenetic trees. The program can determine the Cladistic Information Content of individual trees, the stability of leaves across a set of bootstrap trees, produce the strict basic Reduced Cladistic Consensus profile of a set of trees and convert a set of trees into its matrix representation for supertree construction. Availability: The program is free and available at http://taxonomy.zoology.gla.ac.uk/~jthorley/radcon/radcon.html. Contact: j.l.thorley@bris.ac.uk
Article
A sample of transfer RNA molecules is compared to a sample of random sequences having the same length and same percentage composition of the different bases. For each sequence all possible secondary structures are constructed and a distribution of free energies for the states is obtained. It is found that the ground state free energies of tRNA molecules are significantly lower than for random sequences, and that tRNA molecules have significantly fewer alternative secondary structures at energies close to the ground state than do random sequences. A distance $D$ is defined which measures the average difference between molecular configurations and the ground state configuration. At realistic temperatures of order 300 K this distance is much larger for random sequences than for tRNA sequences. Thus the secondary structure of tRNA molecules at finite temperature is more stable than for random sequences. Sequences are considered which differ by a small number of mutations from real tRNA sequences. On average mutations destabilize the secondary structure. This suggests that a stable secondary structure is one of the factors selected for by natural selection. The thermodynamic behaviour of RNA sequences is compared to models for random heteropolymers which have a low temperature frozen phase.
Article
Thermodynamic parameters for prediction of RNA duplex stability are reported. One parameter for duplex initiation and 10 parameters for helix propagation are derived from enthalpy and free-energy changes for helix formation by 45 RNA oligonucleotide duplexes. The oligomer sequences were chosen to maximize reliability of secondary structure predictions. Each of the 10 nearest-neighbor sequences is well-represented among the 45 oligonucleotides, and the sequences were chosen to minimize experimental errors in ΔG° at 37 °C. These parameters predict melting temperatures of most oligonucleotide duplexes within 5 °C. This is about as good as can be expected from the nearest-neighbor model. Free-energy changes for helix propagation at dangling ends, terminal mismatches, and internal G·U mismatches, and free-energy changes for helix initiation at hairpin loops, internal loops, or internal bulges are also tabulated.
Article
The accuracy of computer predictions of RNA secondary structure from sequence data and free energy parameters has been increased to roughly 70%. Performance is judged by comparison with structures known from phylogenetic analysis. The algorithm also generates suboptimal structures. On average, the best structure within 10% of the lowest free energy contains roughly 90% of phylogenetically known helixes. The algorithm does not include tertiary interactions or pseudoknots and employs a crude model for single-stranded regions. The only favorable interactions are base pairing and stacking of terminal unpaired nucleotides at the ends of helixes. The excellent performance is consistent with these interactions being the primary interactions determining RNA secondary structure.
Article
Evolutionary trees were constructed, by distance methods, from an alignment of 225 complete large subunit (LSU) rRNA sequences, representing Eucarya, Archaea, Bacteria, plastids, and mitochondria. A comparison was made with trees based on sets of small subunit (SSU) rRNA sequences. Trees constructed on the set of 172 species and organelles for which the sequences of both molecules are known had a very similar topology, at least with respect to the divergence order of large taxa such as the eukaryotic kingdoms and the bacterial divisions. However, since there are more than ten times as many SSU as LSU rRNA sequences, it is possible to select many SSU rRNA sequence sets of equivalent size but different species composition. The topologies of these trees showed considerable differences according to the particular species set selected. The effect of the dataset and of different distance correction methods on tree topology was tested for both LSU and SSU rRNA by repetitive random sampling of a single species from each large taxon. The impact of the species set on the topology of the resulting consensus trees is much lower using LSU than using SSU rRNA. This might imply that LSU rRNA is a better molecule for studying wide-range relationships. The mitochondria behave clearly as a monophyletic group, clustering with the Proteobacteria. Gram-positive bacteria appear as two distinct groups, which are found clustered together in very few cases. Archaea behave as if monophyletic in most cases, but with a low confidence.
Article
Microsporidia are eukaryotic parasites lacking mitochondria, the ribosomes of which present prokaryote-like features. In order to better understand the structural evolution of rRNA molecules in microsporidia, the 5S and rDNA genes were investigated in Encephalitozoon cuniculi. The genes are not in close proximity. Non-tandemly arranged rDNA units are on every one of the 11 chromosomes. Such a dispersion is also shown in two other Encephalitozoon species. Sequencing of the 5S rRNA coding region reveals a 120 nt long RNA which folds according to the eukaryotic consensus structural shape. In contrast, the LSU rRNA molecule is greatly reduced in length (2487 nt). This dramatic shortening is essentially due to truncation of divergent domains, most of them being removed. Most variable stems of the conserved core are also deleted, reducing the LSU rRNA to only those structural features preserved in all living cells. This suggests that the E. cuniculi LSU rRNA performs only the basic mechanisms of translation. LSU rRNA phylogenetic analysis with the BASEML program favours a relatively recent origin of the fast-evolving microsporidian lineage. Therefore, the prokaryote-like ribosomal features, such as the absence of ITS2, may be derived rather than primitive characters.
Article
An improved dynamic programming algorithm is reported for RNA secondary structure prediction by free energy minimization. Thermodynamic parameters for the stabilities of secondary structure motifs are revised to include expanded sequence dependence as revealed by recent experiments. Additional algorithmic improvements include reduced search time and storage for multibranch loop free energies and improved imposition of folding constraints. An extended database of 151,503 nt in 955 structures determined by comparative sequence analysis was assembled to allow optimization of parameters not based on experiments and to test the accuracy of the algorithm. On average, the predicted lowest free energy structure contains 73% of known base-pairs when domains of fewer than 700 nt are folded; this compares with 64% accuracy for previous versions of the algorithm and parameters. For a given sequence, a set of 750 generated structures contains one structure that, on average, has 86% of known base-pairs. Experimental constraints, derived from enzymatic and flavin mononucleotide cleavage, improve the accuracy of structure predictions.
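As a rough illustration of the dynamic-programming idea behind secondary structure prediction, the sketch below uses the much simpler Nussinov-style recursion, which maximizes the number of nested base pairs instead of minimizing free energy. It is not the algorithm reported in the abstract above: it has no thermodynamic parameters, no stacking or loop energies, and no folding constraints. The function names and the `min_loop` parameter are assumptions made for this example.

```python
# Simplified stand-in: Nussinov-style base-pair maximization,
# NOT free energy minimization with nearest-neighbor parameters.

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True if nucleotides a and b can form a base pair."""
    return (a, b) in PAIRS


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired nucleotides that must
    separate the two partners of a pair (hypothetical default of 3).
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case 1: position i stays unpaired
            # case 2: position i pairs with some position k
            for k in range(i + min_loop + 1, j + 1):
                if can_pair(seq[i], seq[k]):
                    inside = dp[i + 1][k - 1]
                    outside = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + outside)
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, `max_base_pairs("GGGAAAUCC")` counts the three nested pairs of a small hairpin with an AAA loop. Energy-based algorithms follow the same interval decomposition but score each loop and stack with experimentally derived free energies.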
Article
In addition to characteristic structural properties imposed by evolutionary modification, evolved, single-stranded RNAs also display characteristic structural properties imposed by intrinsic physical constraints on RNA polymer folding. The balance of intrinsic and functionally selected characters in the folded conformation of evolved secondary structures was determined by comparing the predicted secondary structures of evolved and unevolved (random) RNA sequences. Though evolved conformations are significantly more ordered than conformations of random-sequence RNA, this analysis demonstrates that the majority of conformational order within evolved structures results not from evolutionary optimization but from constraints imposed by rules intrinsic to RNA polymer folding.
Article
The 54-kDa signal recognition particle and the receptor SR alpha, two proteins involved in the cotranslational translocation of proteins, are paralogs. They originate from a gene duplication that occurred prior to the last universal common ancestor, allowing one to root the universal tree of life. Phylogenetic analysis using standard methods supports the generally accepted cluster of Archaea and Eucarya. However, a new method increasing the signal-to-noise ratio strongly suggests that this result is due to a long-branch attraction artifact, with the Bacteria evolving fastest. In fact, the Archaea/Eucarya sisterhood is recovered only by the fast-evolving positions. In contrast, the most slowly evolving positions, which are the most likely to retain the ancient phylogenetic signal, support the monophyly of prokaryotes. Such a eukaryotic rooting provides a simple explanation for the high similarity of Archaea and Bacteria observed in complete-genome analysis, and should prompt a reconsideration of current views on the origin of eukaryotes.
Article
Structures of 70S ribosome complexes containing messenger RNA and transfer RNA (tRNA), or tRNA analogs, have been solved by x-ray crystallography at up to 7.8 angstrom resolution. Many details of the interactions between tRNA and the ribosome, and of the packing arrangements of ribosomal RNA (rRNA) helices in and between the ribosomal subunits, can be seen. Numerous contacts are made between the 30S subunit and the P-tRNA anticodon stem-loop; in contrast, the anticodon region of A-tRNA is much more exposed. A complex network of molecular interactions suggestive of a functional relay is centered around the long penultimate stem of 16S rRNA at the subunit interface, including interactions involving the “switch” helix and decoding site of 16S rRNA, and RNA bridges from the 50S subunit.
Article
The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.
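The resampling step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Felsenstein's implementation: the character matrix is assumed to be a dictionary mapping each taxon to a list of character states, and the function names are hypothetical. All taxa are kept, and characters (columns) are drawn with replacement; the equivalent weight vector records how many times each character was drawn.

```python
import random
from collections import Counter


def bootstrap_weights(n_characters, rng):
    """Draw n_characters column indices with replacement.

    Returns a list whose i-th entry is the number of times character i
    was drawn, i.e. the weight of that character in one bootstrap sample.
    """
    draws = [rng.randrange(n_characters) for _ in range(n_characters)]
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(n_characters)]


def bootstrap_matrix(matrix, rng):
    """Build one bootstrap replicate of a taxon -> character-states matrix.

    Every taxon is retained; the same resampled columns are applied to
    all taxa so that each replicate is still a valid character matrix.
    """
    n = len(next(iter(matrix.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {taxon: [states[c] for c in cols] for taxon, states in matrix.items()}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    toy = {"A": list("0011"), "B": list("0101"), "C": list("1100")}
    print(bootstrap_weights(4, rng))
    print(bootstrap_matrix(toy, rng))
```

Each replicate would then be passed to a tree-building program, and the frequency with which a group appears across replicates gives its bootstrap support.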
Article
The secondary structure of V4, the largest variable area of eukaryotic small subunit ribosomal RNA, was re-examined by comparative analysis of 3253 nucleotide sequences distributed over the animal, plant and fungal kingdoms and a diverse set of protist taxa. An extensive search for compensating base pair substitutions and for base covariation revealed that in most eukaryotes the secondary structure of the area consists of 11 helices and includes two pseudoknots. In one of the pseudoknots, exchange of base pairs between the two stems seems to occur, and covariation analysis points to the presence of a base triple. The area also contains three potential insertion points where additional hairpins or branched structures are present in a number of taxa scattered throughout the eukaryotic domain.
Book
We studied sequence variation in 16S rDNA in 204 individuals from 37 populations of the land snail Candidula unifasciata (Poiret 1801) across the core species range in France, Switzerland, and Germany. Phylogeographic, nested clade, and coalescence analyses were used to elucidate the species' evolutionary history. The study revealed the presence of two major evolutionary lineages that evolved in separate refuges in southeast France as a result of previous fragmentation during the Pleistocene. Applying a recent extension of the nested clade analysis (Templeton 2001), we inferred that range expansions along river valleys in independent corridors to the north led eventually to a secondary contact zone of the major clades around the Geneva Basin. There is evidence supporting the idea that the formation of the secondary contact zone and the colonization of Germany might be postglacial events. The phylogeographic history inferred for C. unifasciata differs from general biogeographic patterns of postglacial colonization previously identified for other taxa, and it might represent a common model for species with restricted dispersal.
Article
By studying large samples of molecules on a statistical basis useful information can be obtained regarding the type of properties which are important for biological molecules, and the properties which have been selected by evolution. Here, computational algorithms for RNA secondary-structure prediction are used to analyse the thermodynamic properties of the complete set of transfer RNA sequences in the tRNA database. Significant differences are observed between the different classes of tRNA in the database, and these are strongly correlated with the content of C + G bases. For each class of tRNA, a set of random sequences is generated having the same base composition and the same length distribution as the real sequences. In each case the mean value of the minimum Gibbs energy for the random sequences is substantially higher than for tRNA sequences. Within the random sample, sequences with properties comparable to real tRNA molecules are rare. The probability that each ground-state base pair is present in thermal equilibrium is calculated. These probabilities are much higher for real sequences than random ones, indicating that real sequences have a much more stable secondary structure than random ones. The main reason for this is that there are fewer alternative structures with Gibbs energies close to the ground state in real sequences. Secondary structure in real tRNA is found to melt at higher temperatures than in random sequences, and over a narrower temperature range, i.e. the melting process is more cooperative. These results suggest that one important feature selected by evolution is the stability of the ground-state structure.
Article
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/31361/1/0000273.pdf
Article
A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints, and free ends. Statistical properties of these elements are computed for small RNA molecules of chain lengths up to 100. The results of RNA structure statistics depend strongly on the particular alphabet chosen. The statistical reference is compared with the data derived from natural RNA molecules with similar base frequencies. Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance $d_t$ between two structures. We compute a structure density surface as the conditional probability of two structures having distance $t$ given that their sequences have distance $h$. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly small neighborhood of any typical (random) sequence. Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the mean sensitivity of structures to point mutations. © 1993 John Wiley & Sons, Inc.
Article
A data-dependent weighting procedure is developed to allow the comparison of phylogenetic trees based on nucleic acid sequence data. The sampling error of this cladogram "cost" is then examined, permitting statistical evaluation of the cost differential.
Article
A distance measure that reflects the dissimilarity among structures has been developed on the basis of the three-dimensional structures of similar proteins, this being totally independent of sequence in the sense that only the relative spatial positions of main-chain alpha-carbon atoms need be known. This procedure leads to phyletic relationships that are in general correlated with the sequence phylogenies based on residue type. Such relationships among known protein three-dimensional structures are also a useful aid to their classification and selection in knowledge-based modeling using homologous structures. We have applied this approach to six homologous sets of proteins: immunoglobulin fragments, globins, cytochromes c, serine proteinases, eye-lens gamma crystallins, and dinucleotide-binding domains.
Article
Natural selection processes tune genomes in the edge of the chaos imposed by mutation and drift, allowing an enduring exploration of fitter genetic networks within the constraints imposed by self-organization and the interactions of genotype and phenotype. Alternatively, evolution can be viewed from thermodynamic, kinetic or cybernetic perspectives. Regardless of insight, there is need to understand structure-function relationships at the molecular and holistic evolutionary levels. Strategies are here described that analyze genetic variation in time and trace the evolution of nucleic acid structure. Nucleic acid scanning techniques were used to measure sequence divergence and provide a direct inference of genome-wide mutation rate. This was tested for the first time in vegetatively propagating plants. The method is general and was also used in a study of mutational patterns in phytopathogenic fungi, showing there was a link between sequence and structural diversification of ribosomal gene spacers. In order to determine if this was a general phenomenon, the origin and diversification of nucleic acid secondary structure was traced using a cladistic method capable of producing rooted phylogenetic trees. Phylogenies reconstructed from primary and secondary RNA structure were congruent at all taxonomical levels, providing evidence of a strong link between phenotype and genotype favoring thermodynamic stability and dissipation of Gibbs free energy. Overall results suggest that thermodynamic principles are important driving forces of the evolutionary processes of the living world.
Article
Understanding which phenotypes are accessible from which genotypes is fundamental for understanding the evolutionary process. This notion of accessibility can be used to define a relation of nearness among phenotypes, independently of their similarity. Because of neutrality, phenotypes denote equivalence classes of genotypes. The definition of neighborhood relations among phenotypes relies, therefore, on the statistics of neighborhood relations among equivalence classes of genotypes in genotype space. The folding of RNA sequences (genotypes) into secondary structures (phenotypes) is an ideal case to implement these concepts. We study the extent to which the folding of RNA sequences induces a "statistical topology" on the set of minimum free energy secondary structures. The resulting nearness relation suggests a notion of "continuous" structure transformation. We can then rationalize major transitions in evolutionary trajectories at the level of RNA structures by identifying those transformations which are irreducibly discontinuous. This is shown by means of computer simulations. The statistical topology organizing the set of RNA shapes explains why neutral drift in sequence space plays a key role in evolutionary optimization.
Article
Complete genome sequences are accumulating rapidly, culminating with the announcement of the human genome sequence in February 2001. In addition to cataloguing the diversity of genes and other sequences, genome sequences will provide the first detailed and complete data on gene families and genome organization, including data on evolutionary changes. Reciprocally, evolutionary biology will make important contributions to the efforts to understand functions of genes and other sequences in genomes. Large-scale, detailed and unbiased comparisons between species will illuminate the evolution of genes and genomes, and population genetics methods will enable detection of functionally important genes or sequences, including sequences that have been involved in adaptive changes.
Article
An official journal of the Genetics Society, Heredity publishes high-quality articles describing original research and theoretical insights in all areas of genetics. Research papers are complemented by News & Commentary articles and reviews, keeping researchers and students abreast of hot topics in the field.
Article
To distinguish continuous from discontinuous evolutionary change, a relation of nearness between phenotypes is needed. Such a relation is based on the probability of one phenotype being accessible from another through changes in the genotype. This nearness relation is exemplified by calculating the shape neighborhood of a transfer RNA secondary structure and provides a characterization of discontinuous shape transformations in RNA. The simulation of replicating and mutating RNA populations under selection shows that sudden adaptive progress coincides mostly, but not always, with discontinuous shape transformations. The nature of these transformations illuminates the key role of neutral genetic drift in their realization.
Article
A theory is proposed which explains the peculiarities of biological evolution on the basis of classical thermodynamics. Notions are introduced of particular evolutions as components of the general evolution of the biosphere. Thermodynamic criteria of evolutions are formulated allowing experimental proof of the applicability of classical thermodynamics to biological evolution as a whole.
Article
Molecular structures and sequences are generally more revealing of evolutionary relationships than are classical phenotypes (particularly so among microorganisms). Consequently, the basis for the definition of taxa has progressively shifted from the organismal to the cellular to the molecular level. Molecular comparisons show that life on this planet divides into three primary groupings, commonly known as the eubacteria, the archaebacteria, and the eukaryotes. The three are very dissimilar, the differences that separate them being of a more profound nature than the differences that separate typical kingdoms, such as animals and plants. Unfortunately, neither of the conventionally accepted views of the natural relationships among living systems (i.e., the five-kingdom taxonomy or the eukaryote-prokaryote dichotomy) reflects this primary tripartite division of the living world. To remedy this situation we propose that a formal system of organisms be established in which above the level of kingdom there exists a new taxon called a "domain." Life on this planet would then be seen as comprising three domains, the Bacteria, the Archaea, and the Eucarya, each containing two or more kingdoms. (The Eucarya, for example, contain Animalia, Plantae, Fungi, and a number of others yet to be defined.) Although taxonomic structure within the Bacteria and Eucarya is not treated herein, Archaea is formally subdivided into the two kingdoms Euryarchaeota (encompassing the methanogens and their phenotypically diverse relatives) and Crenarchaeota (comprising the relatively tight clustering of extremely thermophilic archaebacteria, whose general phenotype appears to resemble most the ancestral phenotype of the Archaea).
Article
Comparative analyses have been instrumental in the elucidation of the structures of ribosomal RNAs (rRNAs), transfer RNAs (tRNAs), class I and class II introns, and small nuclear RNAs (snRNAs). This chapter describes the application of phylogenetic comparisons to the evaluation of RNA secondary structure, using as an example the catalytic RNA moiety of ribonuclease P (RNase P), a ribonucleoprotein, tRNA-processing endonuclease. It illustrates the sequences used for the analysis, which show the inferred secondary structures for the RNase P RNAs of Bacillus subtilis and Escherichia coli. The rRNAs are conservative molecules; homologous sequences are identifiable in the rRNAs of all organisms. These homologous sequences can be used to quantify evolutionary distances between organisms and hence to identify appropriate organisms for a comparative structure analysis of another RNA.
Article
How many primary lineages of life exist and what are their evolutionary relationships? These are fundamental but highly controversial issues. Woese and co-workers propose that archaebacteria, eubacteria and eukaryotes are the three primary lines of descent and their relationships can be represented by Fig. 1a (the 'archaebacterial tree') if one neglects the root of the tree. In contrast, Lake claims that archaebacteria are paraphyletic, and he groups eocytes (extremely thermophilic, sulphur-dependent bacteria) with eukaryotes, and halobacteria with eubacteria (the 'eocyte tree', Fig. 1b). Lake's view has gained considerable support as a result of an analysis of small subunit ribosomal RNA sequence data by a new approach, the evolutionary parsimony method. Here we report that analysis of small subunit data by the neighbour-joining and maximum parsimony methods favours the archaebacterial tree and that computer simulations using either the archaebacterial or the eocyte tree as a model tree show that the probability of recovering the model tree is very high (greater than 90 per cent) for both the neighbour-joining and maximum parsimony methods but is relatively low for the evolutionary parsimony method. Moreover, analysis of large subunit rRNA sequences by all three methods strongly favours the archaebacterial tree.
Article
Despite the availability of a rapidly growing ribosomal RNA database that now includes organisms in all three primary lines of descent (eubacteria, archaebacteria, and eukaryotes), theoretical treatment of the evolution of the ribosomal RNAs has lagged behind that of the protein genes. In this paper a theory is developed that applies current views of protein gene evolution to the ribosomal RNAs. The major topics addressed are the variability in size, gene arrangement, and processing of the rRNAs among the three primary lines of descent. Among the conclusions are that the rRNAs of eukaryotes retain some primitive features that were probably present in the rRNAs of the earliest cell (the progenote) and that the genes coding for the three major rRNA species were probably originally unlinked.
Article
It has been a tacit assumption of evolutionary theory that the closest surviving relatives of the first cellular organisms are to be found among prokaryotes. This paper draws attention to the fact that many stages of evolution appear to have been accompanied by physical loss of superfluous DNA. It is postulated that the genomes of prokaryotes—where almost every gene is represented by one copy only—represent the results of this process carried to its extreme. On this basis certain features of very early evolution which have been eliminated from prokaryotes may survive in eukaryotes. If correct, the hypothesis would require a careful re-evaluation of the assumptions underlying use of some sequence data to construct phylogenetic trees.
Article
We have determined the complete nucleotide sequence (4712 nucleotides) of the mouse 28S rRNA gene. Comparison with all other homologs indicates that the potential for major variations in size during evolution has been restricted to a unique set of a few sites within a largely conserved secondary structure core. The D (divergent) domains, responsible for the large increase in size of the molecule from procaryotes to higher eukaryotes, represent half the mouse 28S rRNA length. They show a clear potential to form self-contained secondary structures. Their high GC content in vertebrates is correlated with the folding of very long stable stems. Their comparison with the two other vertebrates, Xenopus and rat, reveals a history of repeated insertions and deletions. During the evolution of vertebrates, insertion or deletion of new sequence tracts preferentially takes place in the subareas of D domains where the more recently fixed insertions/deletions were located in the ancestor sequence. These D domains appear closely related to the transcribed spacers of the rRNA precursor, but a sizable fraction displays a much slower rate of sequence variation.
Article
A detailed substantiation is given of the thermodynamic theory of biological evolution which was formulated previously (Gladyshev, 1977, Gladyshev, 1978b). The thermodynamic approach to biological phenomena and the relationships between the thermodynamic method and kinetic, cybernetic and other methods are examined. Various components of the general evolution of the biosphere (molecular, supramolecular, organelle, cellular, etc.) are discussed in detail. The prognostic possibilities of the thermodynamic method in relation to organisms, biological populations and to the biosphere as a whole are considered.
Article
The 16S and 23S rRNA higher-order structures inferred from comparative analysis are now quite refined. The models presented here differ from their immediate predecessors only in minor detail. Thus, it is safe to assert that all of the standard secondary-structure elements in (prokaryotic) rRNAs have been identified, with approximately 90% of the individual base pairs in each molecule having independent comparative support, and that at least some of the tertiary interactions have been revealed. It is interesting to compare the rRNAs in this respect with tRNA, whose higher-order structure is known in detail from its crystal structure (36) (Table 2). It can be seen that rRNAs have as great a fraction of their sequence in established secondary-structure elements as does tRNA. However, the fact that the former show a much lower fraction of identified tertiary interactions and a greater fraction of unpaired nucleotides than the latter implies that many of the rRNA tertiary interactions remain to be located. (Alternatively, the ribosome might involve protein-rRNA rather than intramolecular rRNA interactions to stabilize three-dimensional structure.) Experimental studies on rRNA are consistent to a first approximation with the structures proposed here, confirming the basic assumption of comparative analysis, i.e., that bases whose compositions strictly covary are physically interacting. In the exhaustive study of Moazed et al. (45) on protection of the bases in the small-subunit rRNA against chemical modification, the vast majority of bases inferred to pair by covariation are found to be protected from chemical modification, both in isolated small-subunit rRNA and in the 30S subunit. The majority of the tertiary interactions are reflected in the chemical protection data as well (45). On the other hand, many of the bases not shown as paired in Fig. 1 are accessible to chemical attack (45). 
However, in this case a sizeable fraction of them are also protected against chemical modification (in the isolated rRNA), which suggests that considerable higher-order structure remains to be found (although all of it may not involve base-base interactions and so may not be detectable by comparative analysis). The agreement between the higher-order structure of the small-subunit rRNA and protection against chemical modification is not perfect, however; some bases shown to covary canonically are accessible to chemical modification (45). (ABSTRACT TRUNCATED AT 400 WORDS)
Article
The ribosome is a large multifunctional complex composed of both RNA and proteins. Biophysical methods are yielding low-resolution structures of the overall architecture of ribosomes, and high-resolution structures of individual proteins and segments of rRNA. Accumulating evidence suggests that the ribosomal RNAs play central roles in the critical ribosomal functions of tRNA selection and binding, translocation, and peptidyl transferase. Biochemical and genetic approaches have identified specific functional interactions involving conserved nucleotides in 16S and 23S rRNA. The results obtained by these quite different approaches have begun to converge and promise to yield an unprecedented view of the mechanism of translation in the coming years.
Article
An algorithm is presented for generating rigorously all suboptimal secondary structures between the minimum free energy and an arbitrary upper limit. The algorithm is particularly fast in the vicinity of the minimum free energy. This enables the efficient approximation of statistical quantities, such as the partition function or measures for structural diversity. The density of states at low energies and its associated structures are crucial in assessing from a thermodynamic point of view how well-defined the ground state is. We demonstrate this by exploring the role of base modification in tRNA secondary structures, both at the level of individual sequences from Escherichia coli and by comparing artificially generated ensembles of modified and unmodified sequences with the same tRNA structure. The two major conclusions are that (1) base modification considerably sharpens the definition of the ground state structure by constraining energetically adjacent structures to be similar to the ground state, and (2) sequences whose ground state structure is thermodynamically well defined show a significant tendency to buffer single point mutations. This can have evolutionary implications, since selection pressure to improve the definition of ground states with biological function may result in increased neutrality.
Article
From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a “universal tree of life,” taking it as the basis for a “natural” hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If “chimerism” or “lateral gene transfer” cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the “true tree,” not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished.
Article
The currently accepted universal tree of life based on molecular phylogenies is characterised by a prokaryotic root and the sisterhood of archaea and eukaryotes. The recent discovery that each domain (bacteria, archaea, and eucarya) represents a mosaic of the two others in terms of its gene content has suggested various alternatives in which eukaryotes were derived from the merging of bacteria and archaea. In all these scenarios, life evolved from simple prokaryotes to complex eukaryotes. We argue here that these models are biased by overconfidence in molecular phylogenies and prejudices regarding the primitive nature of prokaryotes. We propose instead a universal tree of life with the root in the eukaryotic branch and suggest that many prokaryotic features of the information processing mechanisms originated by simplification through gene loss and non-orthologous displacement.
Article
To date all attempts to derive a phyletic relationship among restriction endonucleases (ENases) from multiple sequence alignments have been limited by extreme divergence of these enzymes. Based on the approach of Johnson et al. (1990), I report for the first time the evolutionary tree of the ENase-like protein superfamily inferred from quantitative comparison of atomic coordinates of structurally characterized enzymes. The results presented are in harmony with previous comparisons obtained by crystallographic analyses. It is shown that lambda-exonuclease initially diverged from the common ancestor and then two "endonucleolytic" families branched out, separating "blunt end cutters" from "5' four-base overhang cutters." These data may contribute to a better understanding of ENases and encourage the use of structure-based methods for inference of phylogenetic relationship among extremely divergent proteins. In addition, the comparison of three-dimensional structures of ENase-like domains provides a platform for further clustering analyses of sequence similarities among different branches of this large protein family, rational choice of homology modeling templates, and targets for protein engineering.
Article
The completion of entire genome sequences of many experimental organisms, and the promise that the human genome will be completed in the next year, find biology suddenly awash in genome-based data. Scientists are scrambling to develop new technologies that exploit genome data to ask entirely new kinds of questions about the complex nature of living cells.