A high-resolution map of human evolutionary constraint using 29 mammals

Broad Institute of Harvard and Massachusetts Institute of Technology, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA.
Nature (Impact Factor: 41.46). 10/2011; 478(7370):476-82. DOI: 10.1038/nature10530
Source: PubMed


The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.


    • "Fieldwork on E. helvum and E. dupreanum was conducted under permits granted by national and local authorities. Research involving live animals followed guidelines for the capture, handling, and care of mammals approved by the American Society of Mammalogists (Sikes et al., 2011). FIG. 1. Map of sampling localities for E. helvum across sub-Saharan "
    ABSTRACT: The pteropodid fruit bat genus Eidolon is comprised of two extant species: E. dupreanum on Madagascar and E. helvum on the African mainland and offshore islands. Recent population genetic studies of E. helvum indicate widespread panmixia across the continent, although island populations off western Africa show genetic structure. Little is known about the genetic connectivity of E. dupreanum or the divergence time between these two sister species. We examine sequence data for one mitochondrial (cyt-b) and three nuclear regions (β-fib, RAG1, and RAG2) to assess population genetic structure within E. dupreanum and divergence between the two Eidolon spp. In addition, we characterize the demographic history of both taxa using coalescent-based methods. We find little evidence for population structure within E. dupreanum, and suggest that this reflects dispersal based on seasonal fruit availability and a preference for roosting sites in exposed rock outcrops. However, despite apparent panmixia in both Eidolon spp. and large dispersal distances reported in previous studies for E. helvum, these two taxa diverged in the mid-to-late Miocene. Both species are also characterized by population expansion and young, Pleistocene clade ages, although slower population growth in E. dupreanum is likely explained by its divergence via colonization from the mainland. Finally, we discuss the implications of population connectivity in E. dupreanum in the context of its potential role as a reservoir host for pathogens capable of infecting humans.
    Full-text · Article · Dec 2014 · Acta Chiropterologica
    • "Some tools for WGA are also capable of modeling unbalanced rearrangements that lead to copy number change, such as tandem and segmental duplications (Blanchette et al. 2004; Miller et al. 2007; Paten et al. 2008, 2011; Angiuoli and Salzberg 2011). WGA methods have been critical to understanding the selective forces acting across genomes, allowing evolutionary analysis of many potential functional elements (The ENCODE Project Consortium 2012), and in particular, the identification of conserved noncoding functional elements (Drosophila 12 Genomes Consortium 2007; Lindblad-Toh et al. 2011), including cis-regulatory elements (Kellis et al. 2003), enhancers, and noncoding RNAs. The lack of accepted gold standard reference alignments has made it hard to objectively assess the relative merits of WGA methods. "
    ABSTRACT: Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark datasets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole genome alignment (WGA). Using the same model as the successful Assemblathon competitions we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three datasets were used; two were simulated and based on primate and mammalian phylogenies and one was comprised of 20 real fly genomes. In total 35 submissions were assessed, submitted by ten teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable difference in the alignment quality of differently annotated regions and found few tools aligned the duplications analysed. We found many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all datasets, submissions and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.
    Full-text · Article · Oct 2014 · Genome Research
    • "Whether or not truncated splice variants are actually transcribed and translated in these species will need to be experimentally verified in each case. However, the regions of conservation identified here and previously by Lindblad-Toh et al. (2011) demonstrate the existence of selection pressure which is highly suggestive of a wider utilization of similar alternative transcripts beyond the mouse, rat and human. "
    ABSTRACT: Interleukin-18 (IL-18) is a pro-inflammatory cytokine which stimulates activation of the nuclear factor kappa beta (NF-κB) pathway via interaction with the IL-18 receptor. The receptor itself is formed from a dimer of two subunits, with the ligand-binding IL-18Rα subunit being encoded by the IL18R1 gene. A splice variant of murine IL18r1, which has been previously described, is formed by transcription of an unspliced intron (forming a 'type II' IL18r1 transcript) and is predicted to encode a receptor with a truncated intracellular domain lacking the capacity to generate downstream signalling. In order to examine the relevance of this finding to human IL-18 function, we assessed the presence of a homologous transcript by reverse transcription-polymerase chain reaction (RT-PCR) in the human and rat as another common laboratory animal. We present evidence for type II IL18R1 transcripts in both species. While the mouse and rat transcripts are predicted to encode a truncated receptor with a novel 5 amino acid C-terminal domain, the human sequence is predicted to encode a truncated protein with a novel 22 amino acid sequence bearing resemblance to the 'Box 1' motif of the Toll/interleukin-1 receptor (TIR) domain, in a similar fashion to the inhibitory interleukin-1 receptor 2. Given that transcripts from these three species are all formed by inclusion of homologous unspliced intronic regions, an analysis of homologous introns across a wider array of 33 species with available IL18R1 gene records was performed, which suggests similar transcripts may encode truncated type II IL-18Rα subunits in other species. This splice variant may represent a conserved evolutionary mechanism for regulating IL-18 activity.
    Full-text · Article · Sep 2014 · PeerJ