Recent de novo origin of human protein-coding genes.

Smurfit Institute of Genetics, University of Dublin, Trinity College, Ireland.
Genome Research (Impact Factor: 13.85). 10/2009; 19(10):1752-9. DOI: 10.1101/gr.095026.109
Source: PubMed

ABSTRACT The origin of new genes is extremely important to evolutionary innovation. Most new genes arise from existing genes through duplication or recombination. The origin of new genes from noncoding DNA is extremely rare, and very few eukaryotic examples are known. We present evidence for the de novo origin of at least three human protein-coding genes since the divergence with chimp. Each of these genes has no protein-coding homologs in any other genome, but is supported by evidence from expression and, importantly, proteomics data. The absence of these genes in chimp and macaque cannot be explained by sequencing gaps or annotation error. High-quality sequence data indicate that these loci are noncoding DNA in other primates. Furthermore, chimp, gorilla, gibbon, and macaque share the same disabling sequence difference, supporting the inference that the ancestral sequence was noncoding over the alternative possibility of parallel gene inactivation in multiple primate lineages. The genes are not well characterized, but interestingly, one of them was first identified as an up-regulated gene in chronic lymphocytic leukemia. This is the first evidence for entirely novel human-specific protein-coding genes originating from ancestrally noncoding sequences. We estimate that 0.075% of human genes may have originated through this mechanism leading to a total expectation of 18 such cases in a genome of 24,000 protein-coding genes.
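
The closing estimate in the abstract can be checked with simple arithmetic. The sketch below uses only the two figures stated above (0.075% and 24,000 genes); the variable names are illustrative and do not come from the paper.

```python
# Figures taken from the abstract; variable names are illustrative only.
fraction_de_novo = 0.075 / 100   # 0.075% expressed as a fraction
total_genes = 24_000             # protein-coding genes assumed in the abstract

# Expected number of de novo genes under the stated rate.
expected_cases = fraction_de_novo * total_genes
print(round(expected_cases))
```

Rounding guards against floating-point noise in the product; the result matches the 18 cases quoted in the abstract.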

  • ABSTRACT: What makes us human is one of the most interesting and enduring questions in evolutionary biology. To assist in answering this question, we have identified insertions in the human genome which cannot be found in five comparison primate species: chimpanzee, gorilla, orangutan, gibbon, and macaque. We identified 21,269 non-polymorphic human-specific insertions, only 372 of which were found in exons; any function conferred by the remaining 20,897 is likely to be regulatory. Many of these insertions are likely to have been fitness neutral; however, a small number have been identified in genes showing signs of positive selection. Insertions found within positively selected genes show associations with neural phenotypes, which were also enriched in the whole data set. Other phenotypes found to be enriched in the data set include dental and sensory-perception-related phenotypes, features which are known to differ between humans and other apes. The analysis provides several likely candidates, either genes or regulatory regions, which may be involved in the processes that differentiate humans from other apes. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
    Genome Biology and Evolution 01/2015; 7(4). DOI:10.1093/gbe/evv012 · 4.53 Impact Factor
  • ABSTRACT: Although de novo gene birth is considered an extremely unlikely event, many genes emerge from previously noncoding genomic regions. This review covers the entire life cycle of such de novo genes. Two competing hypotheses about the process of de novo gene birth are discussed, as well as the high death rate of de novo genes. Despite the high death rate, some de novo genes are retained and remain functional, even in distantly related species, through their integration into gene networks. Further studies combining gene expression with ribosome profiling in multiple populations across different species will be instrumental for an improved understanding of the evolutionary processes operating on de novo genes. Copyright © 2015. Published by Elsevier Ltd.
    Trends in Genetics 03/2015; 31(4). DOI:10.1016/j.tig.2015.02.007 · 11.60 Impact Factor
  • ABSTRACT: Evolutionary conservation has been an accurate predictor of functional elements across the first decade of metazoan genomics. More recently, there has been a move to define functional elements instead from biochemical annotations. Evolutionary methods are, however, more comprehensive than biochemical approaches can be and can assess quantitatively, especially for subtle effects, how biologically important (how injurious after mutation) different types of elements are. Evolutionary methods are thus critical for understanding the large fraction (up to 10%) of the human genome that does not encode proteins and yet might convey function. These methods can also capture the ephemeral nature of much noncoding functional sequence, with large numbers of functional elements having been gained and lost rapidly along each mammalian lineage. Here, we review how different strengths of purifying selection have impacted on protein-coding and non-protein-coding loci and on transcription factor binding sites in mammalian and fruit fly genomes. Expected final online publication date for the Annual Review of Genomics and Human Genetics Volume 15 is September 01, 2014.
    Annual Review of Genomics and Human Genetics 04/2014; DOI:10.1146/annurev-genom-090413-025621 · 9.13 Impact Factor

