A periodic pattern of mRNA secondary structure created by the genetic code

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Nucleic Acids Research (Impact Factor: 9.11). 02/2006; 34(8):2428-37. DOI: 10.1093/nar/gkl287
Source: PubMed


Single-stranded mRNA molecules form secondary structures through complementary self-interactions. Several hypotheses have been proposed on the relationship between the nucleotide sequence, encoded amino acid sequence and mRNA secondary structure. We performed the first transcriptome-wide in silico analysis of the human and mouse mRNA foldings and found a pronounced periodic pattern of nucleotide involvement in mRNA secondary structure. We show that this pattern is created by the structure of the genetic code, and the dinucleotide relative abundances are important for the maintenance of mRNA secondary structure. Although synonymous codon usage contributes to this pattern, it is intrinsic to the structure of the genetic code and manifests itself even in the absence of synonymous codon usage bias at the 4-fold degenerate sites. While all codon sites are important for the maintenance of mRNA secondary structure, degeneracy of the code allows regulation of stability and periodicity of mRNA secondary structure. We demonstrate that the third degenerate codon sites contribute most strongly to mRNA stability. These results convincingly support the hypothesis that redundancies in the genetic code allow transcripts to satisfy requirements for both protein structure and RNA structure. Our data show that selection may be operating on synonymous codons to maintain a more stable and ordered mRNA secondary structure, which is likely to be important for transcript stability and translation. We also demonstrate that functional domains of the mRNA [5'-untranslated region (5'-UTR), CDS and 3'-UTR] preferentially fold onto themselves, while the start codon and stop codon regions are characterized by relaxed secondary structures, which may facilitate initiation and termination of translation.
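As an illustration of what a "periodic pattern of nucleotide involvement in mRNA secondary structure" means in practice, the following minimal sketch tallies the fraction of base-paired nucleotides at each of the three codon sites. It is not the authors' pipeline: it assumes a predicted fold is already available as a dot-bracket string, and the function name `paired_fraction_by_codon_site` and the `cds_start` parameter are hypothetical names introduced here. The input string is a toy example, not a realistic fold.

```python
def paired_fraction_by_codon_site(structure, cds_start=0):
    """Return the fraction of paired nucleotides at codon sites 1, 2 and 3.

    structure: dot-bracket string ('.' = unpaired, '(' or ')' = paired).
    cds_start: 0-based index of the first nucleotide of the first codon
               (hypothetical parameter; positions before it are ignored).
    """
    paired = [0, 0, 0]
    total = [0, 0, 0]
    for i in range(cds_start, len(structure)):
        site = (i - cds_start) % 3  # 0, 1, 2 -> codon sites 1, 2, 3
        total[site] += 1
        if structure[i] in "()":
            paired[site] += 1
    # Guard against empty sites so very short inputs do not divide by zero.
    return [p / t if t else 0.0 for p, t in zip(paired, total)]


# Toy dot-bracket string in which every third position is unpaired.
toy = "((.((.)).))"
print(paired_fraction_by_codon_site(toy))  # [1.0, 1.0, 0.0]
```

A difference between the three values, averaged over many transcripts, is the kind of three-nucleotide periodicity the abstract refers to; equal values would indicate no codon-site preference.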

Available from: Nikolay A Spiridonov, Oct 13, 2015
  • Source
    • "It is indicated that, in addition to translational selection, other factors that correlate with GC3, e.g. transcriptional selection [40,41], mRNA stability [11,12], biased gene conversion [8,9], may also have combined with translational selection to contribute to the positive correlation between CUB and gene expression level. As translational regulation rather than transcriptional regulation or mRNA stability is more pronounced in influencing protein level in mammals [42], future investigations might as well involve protein expression data to verify such strong translational selection in human HK genes and take account of translation initiation [13] and elongation as well as codon order [43]. "
    ABSTRACT: Background: Translational selection is a ubiquitous and significant mechanism to regulate protein expression in prokaryotes and unicellular eukaryotes. Recent evidence has shown that translational selection is weakly operative in highly expressed genes in human and other vertebrates. However, it remains unclear whether translational selection acts differentially on human genes depending on their expression patterns. Results: Here we report that human housekeeping (HK) genes, which are strictly defined as genes that are expressed ubiquitously and consistently in most or all tissues, are under stronger translational selection. Conclusions: These observations clearly show that translational selection is also closely associated with expression pattern. Our results suggest that human HK genes are more efficiently and/or accurately translated into proteins, which will inevitably open up a new understanding of HK genes and the regulation of gene expression. Reviewers: This article was reviewed by Yuan Yuan, Baylor College of Medicine; Han Liang, University of Texas MD Anderson Cancer Center (nominated by Dr Laura Landweber); Eugene Koonin, NCBI, NLM, NIH, United States of America; Sandor Pongor, International Centre for Genetic Engineering and Biotechnology (ICGEB), Italy.
    Biology Direct 07/2014; 9(1):17. DOI:10.1186/1745-6150-9-17 · 4.66 Impact Factor
  • Source
    • "These observations are compatible with the hypothesis that the low Ks values in alternative exons reflect the requirement for conservation of regulatory signals in RNA and/or DNA which are most abundant in the 5′ grey area (45,47,52–54). Our results are in good agreement with reports on elevated selective pressure on mRNA folding immediately downstream of the translation start codons (47,50,55) and with the increased density of transcription factor (TF) footprints within the translated portion of gene first coding exons, where TF-DNA recognition requirements constrain the third codon positions (56). "
    ABSTRACT: Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns.
    Nucleic Acids Research 05/2014; 42(11). DOI:10.1093/nar/gku342 · 9.11 Impact Factor
  • Source
    • "Codon bias can also be influenced by selection for mRNA stability. In humans and mice, optimal codons for translation are mostly GC-ending [44,45]; these codons are thought to decrease both mRNA degradation rates in vitro [46] and the Gibbs free energy of mRNA secondary structure [47,48]. Lastly, selective constraint for splicing control also seems to cause low synonymous substitution rates in splicing associated regions, such as purine-rich exonic splicing enhancers (ESEs) [49] and exon-intron junctions [50,51]. "
    ABSTRACT: Background: Synonymous codon usage can affect many cellular processes, particularly those associated with translation such as polypeptide elongation and folding, mRNA degradation/stability, and splicing. Highly expressed genes are thought to experience stronger selection pressures on synonymous codons. This should result in codon usage bias even in species with relatively low effective population sizes, like mammals, where synonymous site selection is thought to be weak. Here we use phylogenetic codon-based likelihood models to explore patterns of codon usage bias in a dataset of 18 mammalian rhodopsin sequences, the protein mediating the first step in vision in the eye, and one of the most highly expressed genes in vertebrates. We use these patterns to infer selection pressures on key translational mechanisms including polypeptide elongation, protein folding, mRNA stability, and splicing. Results: Overall, patterns of selection in mammalian rhodopsin appear to be correlated with post-transcriptional and translational processes. We found significant evidence for selection at synonymous sites using phylogenetic mutation-selection likelihood models, with C-ending codons found to have the highest relative fitness, and to be significantly more abundant at conserved sites. In general, these codons corresponded with the most abundant tRNAs in mammals. We found significant differences in codon usage bias between rhodopsin loops versus helices, though there was no significant difference in mean synonymous substitution rate between these motifs. We also found a significantly higher proportion of GC-ending codons at paired sites in rhodopsin mRNA secondary structure, and significantly lower synonymous mutation rates in putative exonic splicing enhancer (ESE) regions than in non-ESE regions. Conclusions: By focusing on a single highly expressed gene we both distinguish synonymous codon selection from mutational effects and analytically explore underlying functional mechanisms. Our results suggest that codon bias in mammalian rhodopsin arises from selection to optimally balance high overall translational speed, accuracy, and proper protein folding, especially in structurally complicated regions. Selection at synonymous sites may also be contributing to mRNA stability and splicing efficiency at exonic-splicing-enhancer (ESE) regions. Our results highlight the importance of investigating highly expressed genes in a broader phylogenetic context in order to better understand the evolution of synonymous substitutions.
    BMC Evolutionary Biology 05/2014; 14(1):96. DOI:10.1186/1471-2148-14-96 · 3.37 Impact Factor