Negative Correlation between Expression Level and Evolutionary Rate of Long Intergenic Noncoding RNAs

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
Genome Biology and Evolution (Impact Factor: 4.23). 11/2011; 3(1):1390-404. DOI: 10.1093/gbe/evr116
Source: PubMed


Mammalian genomes contain numerous genes for long noncoding RNAs (lncRNAs). The functions of the lncRNAs remain largely unknown, but their evolution appears to be constrained by purifying selection, albeit relatively weakly. To gain insights into the mode of evolution and the functional range of the lncRNAs, they can be compared with the much better characterized protein-coding genes. The evolutionary rate of protein-coding genes shows a universal negative correlation with expression: highly expressed genes are on average more conserved during evolution than genes with lower expression levels. This correlation was conceptualized in the misfolding-driven protein evolution hypothesis, according to which misfolding is the principal cost incurred by protein expression. We sought to determine whether long intergenic ncRNAs (lincRNAs) follow the same evolutionary trend and indeed detected a moderate but statistically significant negative correlation between the evolutionary rate and expression level of human and mouse lincRNA genes. The magnitude of the correlation for the lincRNAs is similar to that for equal-sized sets of protein-coding genes with similar levels of sequence conservation. Additionally, the expression level of the lincRNAs is significantly and positively correlated with the predicted extent of lincRNA molecule folding (base-pairing); however, the contributions of evolutionary rates and folding to the expression level are independent. Thus, the anticorrelation between evolutionary rate and expression level appears to be a general feature of gene evolution that might be caused by similar deleterious effects of protein and RNA misfolding and/or other factors, for example, the number of interacting partners of the gene product.
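The kind of analysis summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say which correlation statistic was used, so Spearman rank correlation and a first-order partial correlation are assumptions here, the data are synthetic, and the names (`rate`, `folding`, `expression`, `partial_spearman`) are hypothetical. The sketch shows how one might check that expression correlates negatively with evolutionary rate, positively with folding, and that the rate effect persists after controlling for folding.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after controlling for z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic toy data (assumption): rate and folding are generated
# independently, and each contributes to expression with opposite sign.
rng = random.Random(0)
n = 500
rate = [rng.gauss(0, 1) for _ in range(n)]
folding = [rng.gauss(0, 1) for _ in range(n)]
expression = [-0.3 * r + 0.3 * f + rng.gauss(0, 1)
              for r, f in zip(rate, folding)]

rho_rate = spearman(expression, rate)
rho_fold = spearman(expression, folding)
rho_rate_given_fold = partial_spearman(expression, rate, folding)

print(f"expression vs rate:           {rho_rate:+.3f}")
print(f"expression vs folding:        {rho_fold:+.3f}")
print(f"expression vs rate | folding: {rho_rate_given_fold:+.3f}")
```

With real data, `rate` would hold a per-gene evolutionary rate estimate, `folding` a predicted base-pairing measure, and `expression` a per-gene expression level; a partial correlation close to the raw one would be consistent with the two contributions being independent.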



Available from: David Managadze, Jun 26, 2014
  • Source
    • "Many of the putative functional ncRNAs are present at very low levels and thus unlikely to be of any importance with respect to cell or organism physiology. Additionally, the abundance of an ncRNA species shortly correlates with its level of conservation [10] "
    ABSTRACT: In the last decade, noncoding RNAs (ncRNAs) have emerged not only as key elements of posttranscriptional gene silencing but also as important players in epigenetic regulation. New kinds and new functions of ncRNAs are continuously being discovered, and one of their most important roles is the mediation of environmental signals, both physical and chemical. The activity of cytoplasmic short ncRNAs has been extensively studied, whereas their function and role in the nuclear compartment are not yet completely unraveled. The cell nucleus contains a multiplicity of long and short ncRNAs that control transcriptional and epigenetic processes at different levels. In addition, some ncRNAs are involved in RNA editing and quality control. In this paper we review the existing knowledge of how chemical stressors can influence the functionality of short nuclear ncRNAs. Furthermore, we perform bioinformatics analyses indicating that chemical environmental stressors not only induce DNA damage but also influence the mechanisms of ncRNA production and control.
    09/2015; 2015(9):761703. DOI:10.1155/2015/761703
  • Source
    • "Many of the putative functional ncRNAs are present at very low levels and thus unlikely to be of any importance with respect to cell or organismal physiology. Importantly, the abundance of an ncRNA species roughly correlates with its level of conservation (Managadze et al., 2011), which is a good proxy for function (Doolittle et al., 2014; Elliott et al., 2014; however, see below); thus, determining the relative abundance of a given ncRNA in the relevant cell type is an important piece of information. However, one should keep in mind that if the ncRNA has catalytic activity or if it acts as a scaffold to regulate chromosomal architecture near its site of transcription, the RNA may not need to be present at very high levels to be able to perform its task. "
    ABSTRACT: The genomes of large multicellular eukaryotes are mostly comprised of non-protein coding DNA. Although there has been much agreement that a small fraction of these genomes has important biological functions, there has been much debate as to whether the rest contributes to development and/or homeostasis. Much of the speculation has centered on the genomic regions that are transcribed into RNA at some low level. Unfortunately, these RNAs have been arbitrarily assigned various names, such as "intergenic RNA," "long non-coding RNAs," etc., which has led to some confusion in the field. Many researchers believe that these transcripts represent a vast, uncharted world of functional non-coding RNAs (ncRNAs), simply because they exist. However, there are reasons to question this Panglossian view because it ignores our current understanding of how evolution shapes eukaryotic genomes and how the gene expression machinery works in eukaryotic cells. Although there are undoubtedly many more functional ncRNAs yet to be discovered and characterized, it is also likely that many of these transcripts are simply junk. Here, we discuss how to determine whether any given ncRNA has a function. Importantly, we advocate that in the absence of any such data, the appropriate null hypothesis is that the RNA in question is junk.
    Frontiers in Genetics 01/2015; 6:2. DOI:10.3389/fgene.2015.00002
  • Source
    • "Recent studies in mammalian genomes have shown that lncRNAs are generally characterized by four interesting features: (i) eukaryotic genome codes a few thousand lincRNAs (Cabili et al., 2011; Dinger et al., 2008; Guttman et al., 2009); (ii) lncRNA genes are expressed in a temporal and/or spatial specific manner (Dinger et al., 2008; Managadze et al., 2011); (iii) genomic loci encoding lncRNAs are associated with epigenetic markers (Guttman et al., 2009; Khalil et al., 2009); (iv) sense and antisense transcripts double-stranded structure may be processed into siRNAs (Zhang et al., 2012)."
    ABSTRACT: The plant long non-coding RNA database (PLncDB) attempts to provide the following functions related to long non-coding RNAs (lncRNAs): (i) genomic information for a large number of lncRNAs collected from various resources; (ii) an online genome browser for plant lncRNAs based on a platform similar to that of the UCSC Genome Browser; (iii) integration of transcriptome datasets derived from various samples, including different tissues, developmental stages, mutants and stress treatments; and (iv) a list of epigenetic modification datasets and small RNA datasets. Currently, PLncDB provides a comprehensive genomic view of Arabidopsis lncRNAs for the plant research community. The database will be regularly updated with new plant genomes as they become available, so as to greatly facilitate future investigations of plant lncRNAs. Availability: PLncDB is freely accessible at and all results can be downloaded for free at the website.
    Bioinformatics 03/2013; 29(8). DOI:10.1093/bioinformatics/btt107 · 4.98 Impact Factor