A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome.

Howard Hughes Medical Institute and Department of Molecular and Cell Biology, University of California, Life Sciences Addition, Berkeley, CA 94720-3200, USA.
Proceedings of the National Academy of Sciences (Impact Factor: 9.81). 03/2005; 102(5):1566-71. DOI: 10.1073/pnas.0409421102
Source: PubMed

ABSTRACT Five years after the completion of the sequence of the Drosophila melanogaster genome, the number of protein-coding genes it contains remains a matter of debate; the number of computational gene predictions greatly exceeds the number of validated gene annotations. We have assembled a collection of >10,000 gene predictions that do not overlap existing gene annotations and have developed a process for their validation that allows us to efficiently prioritize and experimentally validate predictions from various sources by sequencing RT-PCR products to confirm gene structures. Our data provide experimental evidence for 122 protein-coding genes. Our analyses suggest that the entire collection of predictions contains only approximately 700 additional protein-coding genes. Although we cannot rule out the discovery of genes with unusual features that make them refractory to existing methods, our results suggest that the D. melanogaster genome contains approximately 14,000 protein-coding genes.

  • Source
    ABSTRACT: The emergence of RNA interference (RNAi) on the heels of the successful completion of the Drosophila genome project was seen by many as the ace in functional genomics: Its application would quickly assign a function to all genes in this organism and help delineate the complex web of interactions or networks linking them at the systemic level. A few years wiser and a number of genome-wide Drosophila RNAi screens later, we reflect on the state of high-throughput RNAi screens in Drosophila and ask whether the initial promise was fulfilled. We review the impact that this approach has had in the field of Drosophila research and chart out strategies to extract maximal benefit from the application of RNAi to gene discovery and pursuit of systems biology.
    Cold Spring Harbor Symposia on Quantitative Biology 02/2006; 71:141-8.
  • Source
    ABSTRACT: Many high-throughput loss-of-function analyses of the eukaryotic cell cycle have relied on the unicellular yeast species Saccharomyces cerevisiae and Schizosaccharomyces pombe. In multicellular organisms, however, additional control mechanisms regulate the cell cycle to specify the size of the organism and its constituent organs. To identify such genes, here we analysed the effect of the loss of function of 70% of Drosophila genes (including 90% of genes conserved in human) on cell-cycle progression of S2 cells using flow cytometry. To address redundancy, we also targeted genes involved in protein phosphorylation simultaneously with their homologues. We identify genes that control cell size, cytokinesis, cell death and/or apoptosis, and the G1 and G2/M phases of the cell cycle. Classification of the genes into pathways by unsupervised hierarchical clustering on the basis of these phenotypes shows that, in addition to classical regulatory mechanisms such as Myc/Max, Cyclin/Cdk and E2F, cell-cycle progression in S2 cells is controlled by vesicular and nuclear transport proteins, COP9 signalosome activity and four extracellular-signal-regulated pathways (Wnt, p38betaMAPK, FRAP/TOR and JAK/STAT). In addition, by simultaneously analysing several phenotypes, we identify a translational regulator, eIF-3p66, that specifically affects the Cyclin/Cdk pathway activity.
    Nature 03/2006; 439(7079):1009-13. · 38.60 Impact Factor

