Article

Identification and Properties of 1,119 Candidate LincRNA Loci in the Drosophila melanogaster Genome

MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, United Kingdom.
Genome Biology and Evolution (Impact Factor: 4.23). 03/2012; 4(4):427-42. DOI: 10.1093/gbe/evs020
Source: PubMed

ABSTRACT

The functional repertoire of long intergenic noncoding RNA (lincRNA) molecules has begun to be elucidated in mammals. Determining the biological relevance and potential gene regulatory mechanisms of these enigmatic molecules would be expedited in a more tractable model organism, such as Drosophila melanogaster. To this end, we defined a set of 1,119 putative lincRNA genes in D. melanogaster using modENCODE whole transcriptome (RNA-seq) data. A large majority (1.1 of 1.3 Mb; 85%) of these bases were not previously reported by modENCODE as being transcribed. Significant selective constraint on the sequences of these loci predicts that virtually all have sustained functionality across the Drosophila clade. We observe biases in lincRNA genomic locations and expression profiles that are consistent with some of these lincRNAs being involved in the regulation of neighboring protein-coding genes with developmental functions. We identify lincRNAs that may be important in the developing nervous system and in male-specific organs, such as the testes. LincRNA loci were also identified whose positions, relative to nearby protein-coding loci, are equivalent between D. melanogaster and mouse. This study predicts that the genomes of not only vertebrates, such as mammals, but also an invertebrate (fruit fly) harbor large numbers of lincRNA loci. Our findings now permit exploitation of Drosophila genetics for the investigation of lincRNA mechanisms, including lincRNAs with potential functional analogues in mammals.

Available from: Andrew Bassett
  • Source
    • "For the H3K4Me1/H3K27Ac enrichments we restricted ourselves to three developmental stages (L2, L3, pupae), which we considered to be the most relevant interval for gene activity affecting growth of imaginal discs. We obtained a table with lincRNAs in the Drosophila genome from the study of Young et al. [67] and searched for enrichment of SNPs located in those lincRNA loci. Enrichment was tested using a hypergeometric test (function phyper() in R). "
    ABSTRACT: Organismal size depends on the interplay between genetic and environmental factors. Genome-wide association (GWA) analyses in humans have implied many genes in the control of height but suffer from the inability to control the environment. Genetic analyses in Drosophila have identified conserved signaling pathways controlling size; however, how these pathways control phenotypic diversity is unclear. We performed GWA of size traits using the Drosophila Genetic Reference Panel of inbred, sequenced lines. We find that the top associated variants differ between traits and sexes; do not map to canonical growth pathway genes, but can be linked to these by epistasis analysis; and are enriched for genes and putative enhancers. Performing GWA on well-studied developmental traits under controlled conditions expands our understanding of developmental processes underlying phenotypic diversity.
    Full-text · Article · Jan 2016 · PLoS Genetics
  • Source
    • "Considering their genomic locations, lncRNAs can be mainly classified as i) intergenic lncRNAs (lincRNAs) (Guttman et al., 2009), ii) intronic lncRNAs (incRNAs) (Braconi et al., 2011), and iii) natural antisense transcripts (NATs, as cis-NATs and trans-NATs) with their sequences complementary (or partially complementary) to other transcripts at the same (or different) genomic locus (Faghihi and Wahlestedt, 2009). In the past years, genome-wide explorations, e.g., tiling array, chromatin signature, and RNA-sequencing approach, have detected the expression of lncRNAs in many organisms, such as Homo sapiens, Drosophila melanogaster, Mus musculus, and Danio rerio (Guttman et al., 2010; Cabili et al., 2011; Pauli et al., 2012; Young et al., 2012). To date, a large body of evidence has demonstrated that lncRNAs play a critical role in transcriptional interference, cell differentiation, epigenetic modification, genomic imprinting, and other important biological processes (Dinger et al., 2008; Yu et al., 2008; Gupta et al., 2010; Gibb et al., 2011; Guttman and Rinn, 2012). "
    ABSTRACT: Accumulating published reports have confirmed the critical biological role (e.g., cell differentiation, gene regulation, stress response) for plant long non-coding RNAs (lncRNAs). However, a literature-derived database with the aim of lncRNA curation, data deposit and further distribution remains still absent for this particular lncRNA clade. PLNlncRbase has been designed as an easy-to-use resource to provide detailed information for experimentally identified plant lncRNAs. In the current version, PLNlncRbase has manually collected data from nearly 200 published literature, covering a total of 1187 plant lncRNAs in 43 plant species. The user can retrieve plant lncRNA entries from a well-organized interface through a keyword search by using the name of plant species or a lncRNA identifier. Each entry upon a query will be returned with detailed information for a specific plant lncRNA, including the species name, a lncRNA identifier, a brief description of the potential biological role, the lncRNA sequence, the lncRNA classification, an expression pattern of the lncRNA, the tissue/developmental stage/condition for lncRNA expression, the detection method for lncRNA expression, a reference literature, and the potential target gene(s) of the lncRNA extracted from the original reference. This database will be regularly updated to greatly facilitate future investigations of plant lncRNAs pertaining to their biological significance. The PLNlncRbase database is now freely available at http://bioinformatics.ahau.edu.cn/PLNlncRbase. Copyright © 2015. Published by Elsevier B.V.
    Full-text · Article · Jul 2015 · Gene
  • Source
    • "From R5.24 to R6.03, 2313 new candidate non-coding genes were annotated (including those flagged as antisense, see below). We assessed the proposed lncRNAs described in the published literature (Tupy et al. 2005; Inagaki, et al. 2005; Hiller et al. 2009; Young, et al. 2012) and annotated many, but not all, of the lncRNAs proposed. Unless we had independent evidence that the region is transcribed (for example, RNA-Seq coverage data), we did not annotate the predicted lncRNA gene. "
    ABSTRACT: We report the current status of the FlyBase annotated gene set for Drosophila melanogaster and highlight improvements based on high-throughput data. The FlyBase annotated gene set consists entirely of manually annotated gene models, with the exception of some classes of small non-coding RNAs. All gene models have been reviewed using evidence from high-throughput datasets, primarily from the modENCODE project. These datasets include RNA-Seq coverage data, RNA-Seq junction data, transcription start site profiles, and translation stop-codon read-through predictions. New annotation guidelines were developed to take into account the use of the high-throughput data. We describe how this flood of new data was incorporated into thousands of new and revised annotations. FlyBase has adopted a philosophy of excluding low confidence and low frequency data from gene model annotations; we also do not attempt to represent all possible permutations for complex and modularly organized genes. This has allowed us to produce a high-confidence, manageable gene annotation dataset that is available at FlyBase (flybase.org). Interesting aspects of new annotations include new genes (coding, non-coding, and antisense), many genes with alternative transcripts with very long 3' UTRs (up to 15-18kb), and a stunning mismatch in the number of male-specific genes (approximately 13 percent of all annotated gene models) vs. female-specific genes (fewer than 1 percent). The number of identified pseudogenes and mutations in the sequenced strain also increased significantly. We discuss remaining challenges, for instance, identification of functional small polypeptides and detection of alternative translation starts. Copyright © 2015 Author et al.
    Full-text · Article · Jun 2015 · G3-Genes Genomes Genetics
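
The first citing excerpt above mentions testing SNP enrichment in lincRNA loci with a hypergeometric test (phyper() in R). As a rough illustration only, the upper-tail probability such a test reports can be sketched in plain Python. The function name, argument names, and the example counts below are assumptions made for this sketch and are not taken from either study.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """Return P(X >= k) for a hypergeometric variable X.

    Hypothetical argument names for this sketch:
      N: total number of items in the background (e.g. all tested SNPs)
      K: number of background items in the category (e.g. SNPs in lincRNA loci)
      n: number of items drawn (e.g. top associated SNPs)
      k: number of drawn items that fall in the category
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    lo = max(k, 0)        # smallest overlap counted in the tail
    hi = min(n, K)        # largest overlap that is possible at all
    total = comb(N, n)    # number of ways to draw n items from N
    # comb() returns 0 for impossible splits, so no extra lower bound is needed.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return favourable / total


# Illustrative numbers only (not data from the cited papers).
p_value = hypergeom_upper_tail(k=5, N=10000, K=200, n=100)
print(f"P(X >= 5) = {p_value:.4g}")
```

In R, the same upper tail would correspond to `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`, since `phyper` with `lower.tail = FALSE` returns P(X > q). Exact integer arithmetic is used here for clarity; for very large counts a log-space implementation would be more efficient.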