Negative Correlation between Expression Level and Evolutionary Rate of Long Intergenic Noncoding RNAs

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
Genome Biology and Evolution (Impact Factor: 4.53). 11/2011; 3(1):1390-404. DOI: 10.1093/gbe/evr116
Source: PubMed

ABSTRACT Mammalian genomes contain numerous genes for long noncoding RNAs (lncRNAs). The functions of the lncRNAs remain largely unknown, but their evolution appears to be constrained by purifying selection, albeit relatively weakly. To gain insights into the mode of evolution and the functional range of the lncRNAs, they can be compared with the much better characterized protein-coding genes. The evolutionary rate of protein-coding genes shows a universal negative correlation with expression: highly expressed genes are on average more conserved during evolution than genes with lower expression levels. This correlation was conceptualized in the misfolding-driven protein evolution hypothesis, according to which misfolding is the principal cost incurred by protein expression. We sought to determine whether long intergenic ncRNAs (lincRNAs) follow the same evolutionary trend and indeed detected a moderate but statistically significant negative correlation between the evolutionary rate and expression level of human and mouse lincRNA genes. The magnitude of the correlation for the lincRNAs is similar to that for equal-sized sets of protein-coding genes with similar levels of sequence conservation. Additionally, the expression level of the lincRNAs is significantly and positively correlated with the predicted extent of lincRNA molecule folding (base-pairing); however, the contributions of evolutionary rates and folding to the expression level are independent. Thus, the anticorrelation between evolutionary rate and expression level appears to be a general feature of gene evolution that might be caused by similar deleterious effects of protein and RNA misfolding and/or other factors, for example, the number of interacting partners of the gene product.
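
The kind of negative correlation described in the abstract can be illustrated with a rank correlation between per-gene expression level and evolutionary rate. The sketch below is only an illustration under stated assumptions: the abstract does not say which correlation statistic was used, so the choice of Spearman's rank correlation is an assumption, the function names are hypothetical, and the numbers are synthetic toy values, not data from the study.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Synthetic toy values (NOT from the paper): one entry per hypothetical gene.
expression = [1.0, 2.0, 4.0, 8.0, 16.0]   # expression level, arbitrary units
evol_rate = [0.9, 0.7, 0.8, 0.4, 0.2]     # evolutionary rate, arbitrary units

rho = spearman(expression, evol_rate)
print(f"Spearman rho = {rho:.2f}")
```

A negative rho here means that genes with higher expression tend to have lower evolutionary rates, which is the direction of the trend the abstract reports for both lincRNAs and protein-coding genes.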


Available from: David Managadze, Jun 26, 2014
  • Source
    • "Recent studies in mammalian genomes have shown that lncRNAs are generally characterized by four interesting features: (i) eukaryotic genome codes a few thousand lincRNAs (Cabili et al., 2011; Dinger et al., 2008; Guttman et al., 2009); (ii) lncRNA genes are expressed in a temporal and/or spatial specific manner (Dinger et al., 2008; Managadze et al., 2011); (iii) genomic loci encoding lncRNAs are associated with epigenetic markers (Guttman et al., 2009; Khalil et al., 2009); (iv) sense and antisense transcripts double-stranded structure may be processed into siRNAs (Zhang et al., 2012)."
    ABSTRACT: PLncDB attempts to provide the following functions related to long noncoding RNAs (lncRNAs): (1) genomic information for a large number of lncRNAs collected from various resources; (2) an online genome browser for plant lncRNAs based on a platform similar to that of the UCSC Genome Browser; (3) integration of transcriptome datasets derived from various samples, including different tissues, developmental stages, mutants and stress treatments; and (4) a list of epigenetic modification datasets and small RNA datasets. Currently, PLncDB provides a comprehensive genomic view of Arabidopsis lncRNAs for the plant research community. This database will be regularly updated with new plant genomes when available so as to greatly facilitate future investigations on plant lncRNAs. AVAILABILITY: PLncDB is freely accessible at and all results can be downloaded for free at the website. CONTACT:
    Bioinformatics 03/2013; 29(8). DOI:10.1093/bioinformatics/btt107 · 4.62 Impact Factor
  • Source
    • "For convenience, we used the nomenclature "ATxNCxxxxxx," which is similar to the current TAIR identifier "ATxGxxxxxx," except with the change of G to NC denoting ncRNA. About 41% of the lincRNAs in human and mouse contain introns (Managadze et al., 2011). Among the 36 Arabidopsis lincRNAs annotated in TAIR9, 18 lincRNAs have introns (see Supplemental Data Set 14A online)."
    ABSTRACT: Long intergenic noncoding RNAs (lincRNAs) transcribed from intergenic regions of yeast and animal genomes play important roles in key biological processes. Yet, plant lincRNAs remain poorly characterized and how lincRNA biogenesis is regulated is unclear. Using a reproducibility-based bioinformatics strategy to analyze 200 Arabidopsis thaliana transcriptome data sets, we identified 13,230 intergenic transcripts of which 6480 can be classified as lincRNAs. Expression of 2708 lincRNAs was detected by RNA sequencing experiments. Transcriptome profiling by custom microarrays revealed that the majority of these lincRNAs are expressed at a level between those of mRNAs and precursors of miRNAs. A subset of lincRNA genes shows organ-specific expression, whereas others are responsive to biotic and/or abiotic stresses. Further analysis of transcriptome data in 11 mutants uncovered SERRATE, CAP BINDING PROTEIN20 (CBP20), and CBP80 as regulators of lincRNA expression and biogenesis. RT-PCR experiments confirmed these three proteins are also needed for splicing of a small group of intron-containing lincRNAs.
    The Plant Cell 11/2012; 24(11). DOI:10.1105/tpc.112.102855 · 9.58 Impact Factor
  • Source
    ABSTRACT: Long non-coding RNAs (lncRNAs) are emerging as an important class of regulatory transcripts that are implicated in a variety of biological functions. RNA-sequencing, along with other next-generation sequencing-based approaches, enables their study on a genome-wide scale, at maximal resolution, and across multiple conditions. This review discusses how sequencing-based studies are providing global insights into lncRNA transcription, post-transcriptional processing, expression regulation and sites of function. The next few years will deepen our insight into the overall contribution of lncRNAs to genome function and to the information flow from genotype to phenotype.
    Seminars in Cell and Developmental Biology 12/2011; 23(2):200-5. DOI:10.1016/j.semcdb.2011.12.003 · 5.97 Impact Factor