Identification and Properties of 1,119 Candidate LincRNA Loci in the Drosophila melanogaster Genome

MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, United Kingdom.
Genome Biology and Evolution (Impact Factor: 4.53). 03/2012; 4(4):427-42. DOI: 10.1093/gbe/evs020
Source: PubMed

ABSTRACT The functional repertoire of long intergenic noncoding RNA (lincRNA) molecules has begun to be elucidated in mammals. Determining the biological relevance and potential gene regulatory mechanisms of these enigmatic molecules would be expedited in a more tractable model organism, such as Drosophila melanogaster. To this end, we defined a set of 1,119 putative lincRNA genes in D. melanogaster using modENCODE whole transcriptome (RNA-seq) data. A large majority (1.1 of 1.3 Mb; 85%) of these bases were not previously reported by modENCODE as being transcribed. Significant selective constraint on the sequences of these loci predicts that virtually all have sustained functionality across the Drosophila clade. We observe biases in lincRNA genomic locations and expression profiles that are consistent with some of these lincRNAs being involved in the regulation of neighboring protein-coding genes with developmental functions. We identify lincRNAs that may be important in the developing nervous system and in male-specific organs, such as the testes. LincRNA loci were also identified whose positions, relative to nearby protein-coding loci, are equivalent between D. melanogaster and mouse. This study predicts that the genomes of not only vertebrates, such as mammals, but also an invertebrate (fruit fly) harbor large numbers of lincRNA loci. Our findings now permit exploitation of Drosophila genetics for the investigation of lincRNA mechanisms, including lincRNAs with potential functional analogues in mammals.

Available from: Andrew Bassett, Aug 24, 2015
  • Source
    • "From R5.24 to R6.03, 2313 new candidate non-coding genes were annotated (including those flagged as antisense, see below). We assessed the proposed lncRNAs described in the published literature (Tupy et al. 2005; Inagaki, et al. 2005; Hiller et al. 2009; Young, et al. 2012) and annotated many, but not all, of the lncRNAs proposed. Unless we had independent evidence that the region is transcribed (for example, RNA-Seq coverage data), we did not annotate the predicted lncRNA gene. "
    ABSTRACT: We report the current status of the FlyBase annotated gene set for Drosophila melanogaster and highlight improvements based on high-throughput data. The FlyBase annotated gene set consists entirely of manually annotated gene models, with the exception of some classes of small non-coding RNAs. All gene models have been reviewed using evidence from high-throughput datasets, primarily from the modENCODE project. These datasets include RNA-Seq coverage data, RNA-Seq junction data, transcription start site profiles, and translation stop-codon read-through predictions. New annotation guidelines were developed to take into account the use of the high-throughput data. We describe how this flood of new data was incorporated into thousands of new and revised annotations. FlyBase has adopted a philosophy of excluding low confidence and low frequency data from gene model annotations; we also do not attempt to represent all possible permutations for complex and modularly organized genes. This has allowed us to produce a high-confidence, manageable gene annotation dataset that is available at FlyBase. Interesting aspects of new annotations include new genes (coding, non-coding, and antisense), many genes with alternative transcripts with very long 3' UTRs (up to 15-18kb), and a stunning mismatch in the number of male-specific genes (approximately 13 percent of all annotated gene models) vs. female-specific genes (fewer than 1 percent). The number of identified pseudogenes and mutations in the sequenced strain also increased significantly. We discuss remaining challenges, for instance, identification of functional small polypeptides and detection of alternative translation starts. Copyright © 2015 Author et al.
    G3-Genes Genomes Genetics 06/2015; DOI:10.1534/g3.115.018929 · 2.51 Impact Factor
  • Source
    • "We defined a comprehensive yet conservative set of 2,935 single and multiexonic noncoding RNA transcripts, which includes lincRNAs, intronic lncRNAs, antisense overlapping lncRNAs, and precursors for sRNAs. This conservative estimate of A. queenslandica lncRNAs—the first lncRNAs catalog in an early-branching metazoan—shares many of the characteristics of their bilaterian counterparts (Guttman et al. 2009, 2010, 2011; Cabili et al. 2011; Nam and Bartel 2012; Pauli et al. 2012; Young et al. 2012; Brown et al. 2014; Zhou et al. 2014). Specifically, they are relatively short in length, have a low number of exons, display temporally restricted expression profiles throughout development, and have low sequence conservation in comparison to protein-coding genes. "
    ABSTRACT: Long non-coding RNAs (lncRNAs) are important developmental regulators in bilaterian animals. A correlation has been claimed between the lncRNA repertoire expansion and morphological complexity in vertebrate evolution. However, this claim has not been tested by examining morphologically simple animals. Here, we undertake a systematic investigation of lncRNAs in the demosponge Amphimedon queenslandica, a morphologically-simple, early-branching metazoan. We combine RNA-Seq data across multiple developmental stages of Amphimedon with a filtering pipeline to conservatively predict 2,935 lncRNAs. These include intronic overlapping lncRNAs, exonic antisense overlapping lncRNAs, long intergenic ncRNAs and precursors for small RNAs. Sponge lncRNAs are remarkably similar to their bilaterian counterparts in being relatively short with few exons and having low primary sequence conservation relative to protein-coding genes. As in bilaterians, a majority of sponge lncRNAs exhibit typical hallmarks of regulatory molecules, including high temporal specificity and dynamic developmental expression. Specific lncRNA expression profiles correlate tightly with conserved protein-coding genes likely involved in a range of developmental and physiological processes, such as the Wnt signaling pathway. Although the majority of Amphimedon lncRNAs appear to be taxonomically-restricted with no identifiable orthologues, we find a few cases of conservation between demosponges in lncRNAs that are antisense to coding sequences. Based on the high similarity in the structure, organisation and dynamic expression of sponge lncRNAs to their bilaterian counterparts, we propose that these non-coding RNAs are an ancient feature of the metazoan genome. These results are consistent with lncRNAs regulating the development of animals, regardless of their level of morphological complexity.
    Molecular Biology and Evolution 05/2015; DOI:10.1093/molbev/msv117 · 14.31 Impact Factor
  • Source
    • "The rather poor conservation of lincRNAs as measured by splice sites does not come as a surprise, since only a small fraction of the observed splice junctions were included in the multiple sequence alignments in the first place. Their level of sequence conservation was very low compared to other functional transcripts (Pang et al., 2006; Marques and Ponting, 2009), although there is good evidence that, at least as a group, mRNA‐like non‐coding RNAs are under stabilizing selection (Ponjavic et al., 2007; Guttman et al., 2009; Marques and Ponting, 2009; Young et al., 2012). "
    ABSTRACT: Circular and apparently trans-spliced RNAs have recently been reported as abundant types of transcripts in mammalian transcriptome data. Both types of non-colinear RNAs are also abundant in RNA-seq of different tissue from both the African and the Indonesian coelacanth. We observe more than 8,000 lincRNAs with normal gene structure and several thousands of circularized and trans-spliced products, showing that such atypical RNAs form a substantial contribution to the transcriptome. Surprisingly, the majority of the circularizing and trans-connecting splice junctions are unique to atypical forms, that is, are not used in normal isoforms. J. Exp. Zool. (Mol. Dev. Evol.) 9999B: 1-10, 2013. © 2013 Wiley Periodicals, Inc.
    Journal of Experimental Zoology Part B Molecular and Developmental Evolution 09/2014; 322(6). DOI:10.1002/jez.b.22542 · 1.88 Impact Factor