A periodic pattern of mRNA secondary structure created by the genetic code

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Nucleic Acids Research (Impact Factor: 9.11). 02/2006; 34(8):2428-37. DOI: 10.1093/nar/gkl287
Source: PubMed

ABSTRACT Single-stranded mRNA molecules form secondary structures through complementary self-interactions. Several hypotheses have been proposed on the relationship between the nucleotide sequence, encoded amino acid sequence and mRNA secondary structure. We performed the first transcriptome-wide in silico analysis of the human and mouse mRNA foldings and found a pronounced periodic pattern of nucleotide involvement in mRNA secondary structure. We show that this pattern is created by the structure of the genetic code, and the dinucleotide relative abundances are important for the maintenance of mRNA secondary structure. Although synonymous codon usage contributes to this pattern, it is intrinsic to the structure of the genetic code and manifests itself even in the absence of synonymous codon usage bias at the 4-fold degenerate sites. While all codon sites are important for the maintenance of mRNA secondary structure, degeneracy of the code allows regulation of stability and periodicity of mRNA secondary structure. We demonstrate that the third degenerate codon sites contribute most strongly to mRNA stability. These results convincingly support the hypothesis that redundancies in the genetic code allow transcripts to satisfy requirements for both protein structure and RNA structure. Our data show that selection may be operating on synonymous codons to maintain a more stable and ordered mRNA secondary structure, which is likely to be important for transcript stability and translation. We also demonstrate that functional domains of the mRNA [5'-untranslated region (5'-UTR), CDS and 3'-UTR] preferentially fold onto themselves, while the start codon and stop codon regions are characterized by relaxed secondary structures, which may facilitate initiation and termination of translation.
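
The periodic pattern described in the abstract can be illustrated with a minimal sketch. It assumes a predicted secondary structure is already available in dot-bracket notation (the output format of common RNA folding programs) and that the coding frame is known. The function name `pair_fraction_by_codon_position` and the `cds_start` parameter are hypothetical and chosen for illustration; this is not the authors' pipeline, only one simple way to tabulate how often each codon position is base-paired.

```python
def pair_fraction_by_codon_position(structure, cds_start=0):
    """Return the fraction of base-paired nucleotides at codon positions 1, 2 and 3.

    structure : dot-bracket string, '.' = unpaired, '(' or ')' = paired
                (assumed to come from a folding program; not validated here).
    cds_start : 0-based index of the first nucleotide of the first codon
                (hypothetical parameter; the reading frame is assumed known).
    """
    paired = [0, 0, 0]  # paired nucleotides seen at each codon position
    total = [0, 0, 0]   # all nucleotides seen at each codon position
    for i in range(cds_start, len(structure)):
        pos = (i - cds_start) % 3  # 0, 1, 2 map to codon positions 1, 2, 3
        total[pos] += 1
        if structure[i] in "()":
            paired[pos] += 1
    # Guard against empty positions in very short inputs
    return [p / t if t else 0.0 for p, t in zip(paired, total)]


if __name__ == "__main__":
    # Toy structure: a 2-bp stem closing a 4-nt loop, plus one unpaired nucleotide
    fractions = pair_fraction_by_codon_position("((....)).")
    for codon_pos, frac in enumerate(fractions, start=1):
        print(f"codon position {codon_pos}: {frac:.2f} paired")
```

Averaging these per-position fractions over many transcripts would be one way to look for a three-nucleotide periodicity of the kind the abstract reports; unequal values across the three positions indicate a frame-dependent signal.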


Available from: Nikolay A Spiridonov, Jul 05, 2015
  • Source
    ABSTRACT: While translational stop codon readthrough is often used by viral genomes, it has been observed for only a handful of eukaryotic genes. We previously used comparative genomics evidence to recognize protein-coding regions in 12 species of Drosophila and showed that for 149 genes, the open reading frame following the stop codon has a protein-coding conservation signature, hinting that stop codon readthrough might be common in Drosophila. We return to this observation armed with deep RNA sequence data from the modENCODE project, an improved higher-resolution comparative genomics metric for detecting protein-coding regions, comparative sequence information from additional species, and directed experimental evidence. We report an expanded set of 283 readthrough candidates, including 16 double-readthrough candidates; these were manually curated to rule out alternatives such as A-to-I editing, alternative splicing, dicistronic translation, and selenocysteine incorporation. We report experimental evidence of translation using GFP tagging and mass spectrometry for several readthrough regions. We find that the set of readthrough candidates differs from other genes in length, composition, conservation, stop codon context, and in some cases, conserved stem-loops, providing clues about readthrough regulation and potential mechanisms. Lastly, we expand our studies beyond Drosophila and find evidence of abundant readthrough in several other insect species and one crustacean, and several readthrough candidates in nematode and human, suggesting that functionally important translational stop codon readthrough is significantly more prevalent in Metazoa than previously recognized.
    Genome Research 12/2011; 21(12):2096-113. DOI:10.1101/gr.119974.110 · 13.85 Impact Factor
  • Source
    ABSTRACT: Mammalian genomes contain numerous genes for long noncoding RNAs (lncRNAs). The functions of the lncRNAs remain largely unknown but their evolution appears to be constrained by purifying selection, albeit relatively weakly. To gain insights into the mode of evolution and the functional range of the lncRNA, they can be compared with much better characterized protein-coding genes. The evolutionary rate of the protein-coding genes shows a universal negative correlation with expression: highly expressed genes are on average more conserved during evolution than the genes with lower expression levels. This correlation was conceptualized in the misfolding-driven protein evolution hypothesis according to which misfolding is the principal cost incurred by protein expression. We sought to determine whether long intergenic ncRNAs (lincRNAs) follow the same evolutionary trend and indeed detected a moderate but statistically significant negative correlation between the evolutionary rate and expression level of human and mouse lincRNA genes. The magnitude of the correlation for the lincRNAs is similar to that for equal-sized sets of protein-coding genes with similar levels of sequence conservation. Additionally, the expression level of the lincRNAs is significantly and positively correlated with the predicted extent of lincRNA molecule folding (base-pairing), however, the contributions of evolutionary rates and folding to the expression level are independent. Thus, the anticorrelation between evolutionary rate and expression level appears to be a general feature of gene evolution that might be caused by similar deleterious effects of protein and RNA misfolding and/or other factors, for example, the number of interacting partners of the gene product.
    Genome Biology and Evolution 11/2011; 3:1390-404. DOI:10.1093/gbe/evr116 · 4.53 Impact Factor
  • Source
    ABSTRACT: Background: Aging is a complex process that involves the interplay of genetic, epigenetic, and environmental factors. Identifying aging-related biomarkers holds great potential for improving our understanding of complex physiological changes, thereby providing a means to investigate the mechanism by which aging influences various diseases. Method and Results: We performed a parallel study of microRNA and gene expression profiling of peripheral blood in a group of healthy young adult women, among which 13 were aged 22-25 and 9 were aged 36-39 years old. We identified a significantly distinct pattern of microRNA, but not gene expression profiling, between these two young adult women groups. We also performed correlation analysis of expression levels between all pairs of age-associated microRNAs and genes and identified a weak global correlation between these two types of expression levels. A significant involvement of estrogen regulation was observed by pathway analysis of the most differentially expressed microRNAs that included miR-155, -18a, -142, -340, -363, -195, and -24. Conclusion: Our results suggest that the change in global microRNA expression in the peripheral blood is associated with normal aging in young adult women. This change may precede global gene expression changes. Future studies are needed to investigate the regulatory mechanism of the estrogen-related microRNAs and associated diseases.
    Frontiers in Genetics 01/2011; 2:49. DOI:10.3389/fgene.2011.00049