Negative correlation between expression level and evolutionary rate of long intergenic noncoding RNAs.

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
Genome Biology and Evolution (Impact Factor: 4.53). 11/2011; 3:1390-404. DOI: 10.1093/gbe/evr116
Source: PubMed

ABSTRACT Mammalian genomes contain numerous genes for long noncoding RNAs (lncRNAs). The functions of the lncRNAs remain largely unknown, but their evolution appears to be constrained by purifying selection, albeit relatively weakly. To gain insights into the mode of evolution and the functional range of the lncRNAs, they can be compared with the much better characterized protein-coding genes. The evolutionary rate of protein-coding genes shows a universal negative correlation with expression: highly expressed genes are on average more conserved during evolution than genes with lower expression levels. This correlation was conceptualized in the misfolding-driven protein evolution hypothesis, according to which misfolding is the principal cost incurred by protein expression. We sought to determine whether long intergenic ncRNAs (lincRNAs) follow the same evolutionary trend and indeed detected a moderate but statistically significant negative correlation between the evolutionary rate and expression level of human and mouse lincRNA genes. The magnitude of the correlation for the lincRNAs is similar to that for equal-sized sets of protein-coding genes with similar levels of sequence conservation. Additionally, the expression level of the lincRNAs is significantly and positively correlated with the predicted extent of lincRNA molecule folding (base-pairing); however, the contributions of evolutionary rates and folding to the expression level are independent. Thus, the anticorrelation between evolutionary rate and expression level appears to be a general feature of gene evolution that might be caused by similar deleterious effects of protein and RNA misfolding and/or other factors, for example, the number of interacting partners of the gene product.
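
The kind of analysis summarized in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the choice of Spearman rank correlation and of a first-order partial correlation is an assumption made here for illustration, the function names are hypothetical, and the six data points are invented toy values rather than measurements. The sketch shows how one might check that a negative rate-expression correlation persists after controlling for a third variable such as predicted folding.

```python
from math import sqrt

def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Rank correlation of x and y after controlling for z (first order)."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Invented toy values for six hypothetical genes (not real data).
expression = [5, 1, 4, 2, 3, 6]   # expression level
rate = [2, 6, 4, 5, 3, 1]         # evolutionary rate
folding = [3, 1, 6, 2, 4, 5]      # predicted extent of base-pairing

rho = spearman(rate, expression)
rho_partial = partial_spearman(rate, expression, folding)
print(f"rate vs expression: {rho:.3f}")
print(f"rate vs expression, controlling for folding: {rho_partial:.3f}")
```

If the partial coefficient stays clearly negative, the rate-expression relationship is not explained by folding alone, which is the sense in which the two contributions would be called independent.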

Available from: David Managadze, Jun 26, 2014
  • Source
    ABSTRACT: PLncDB attempts to provide the following functions related to long noncoding RNAs (lncRNAs): (1) genomic information for a large number of lncRNAs collected from various resources; (2) an online genome browser for plant lncRNAs based on a platform similar to that of the UCSC Genome Browser; (3) integration of transcriptome datasets derived from various samples, including different tissues, developmental stages, mutants and stress treatments; and (4) a list of epigenetic modification datasets and small RNA datasets. Currently, PLncDB provides a comprehensive genomic view of Arabidopsis lncRNAs for the plant research community. The database will be regularly updated with new plant genomes as they become available, so as to greatly facilitate future investigations of plant lncRNAs. AVAILABILITY: PLncDB is freely accessible, and all results can be downloaded for free at the website.
    Bioinformatics 03/2013; 29(8). DOI:10.1093/bioinformatics/btt107 · 4.62 Impact Factor
  • Source
    ABSTRACT: Long intergenic noncoding RNAs (lincRNAs) transcribed from intergenic regions of yeast and animal genomes play important roles in key biological processes. Yet, plant lincRNAs remain poorly characterized and how lincRNA biogenesis is regulated is unclear. Using a reproducibility-based bioinformatics strategy to analyze 200 Arabidopsis thaliana transcriptome data sets, we identified 13,230 intergenic transcripts of which 6480 can be classified as lincRNAs. Expression of 2708 lincRNAs was detected by RNA sequencing experiments. Transcriptome profiling by custom microarrays revealed that the majority of these lincRNAs are expressed at a level between those of mRNAs and precursors of miRNAs. A subset of lincRNA genes shows organ-specific expression, whereas others are responsive to biotic and/or abiotic stresses. Further analysis of transcriptome data in 11 mutants uncovered SERRATE, CAP BINDING PROTEIN20 (CBP20), and CBP80 as regulators of lincRNA expression and biogenesis. RT-PCR experiments confirmed these three proteins are also needed for splicing of a small group of intron-containing lincRNAs.
    The Plant Cell 11/2012; 24(11). DOI:10.1105/tpc.112.102855 · 9.58 Impact Factor
  • Source
    ABSTRACT: Long non-coding RNAs (lncRNAs) are emerging as an important class of regulatory transcripts that are implicated in a variety of biological functions. RNA-sequencing, along with other next-generation sequencing-based approaches, enables their study on a genome-wide scale, at maximal resolution, and across multiple conditions. This review discusses how sequencing-based studies are providing global insights into lncRNA transcription, post-transcriptional processing, expression regulation and sites of function. The next few years will deepen our insight into the overall contribution of lncRNAs to genome function and to the information flow from genotype to phenotype.
    Seminars in Cell and Developmental Biology 12/2011; 23(2):200-5. DOI:10.1016/j.semcdb.2011.12.003 · 5.97 Impact Factor