Evolution of Human-Specific Neural SRGAP2 Genes by Incomplete Segmental Duplication

Department of Genome Sciences, University of Washington School of Medicine, Seattle, 98195, USA.
Cell (Impact Factor: 32.24). 05/2012; 149(4):912-22. DOI: 10.1016/j.cell.2012.03.033
Source: PubMed


Gene duplication is an important source of phenotypic change and adaptive evolution. We leverage a haploid hydatidiform mole to identify highly identical sequences missing from the reference genome, confirming that the cortical development gene Slit-Robo Rho GTPase-activating protein 2 (SRGAP2) duplicated three times exclusively in humans. We show that the promoter and first nine exons of SRGAP2 duplicated from 1q32.1 (SRGAP2A) to 1q21.1 (SRGAP2B) ∼3.4 million years ago (mya). Two larger duplications later copied SRGAP2B to chromosome 1p12 (SRGAP2C) and to proximal 1q21.1 (SRGAP2D) ∼2.4 and ∼1 mya, respectively. Sequence and expression analyses show that SRGAP2C is the most likely duplicate to encode a functional protein and is among the most fixed human-specific duplicate genes. Our data suggest a mechanism where incomplete duplication created a novel gene function (antagonizing parental SRGAP2 function) immediately "at birth" 2-3 mya, which is a time corresponding to the transition from Australopithecus to Homo and the beginning of neocortex expansion.



    • "Interestingly, both gene families, FAM72 and SRGAP2, were composed of four human paralogues located as unique pairs on chr 1 (Fig. 1A). However, all other species investigated contained only one orthologue (Fig. 3) [31]. Thus, FAM72 and SRGAP2 appear to represent a unique gene couple that characterizes the emergence of vertebrates with a notochord (with one gene pair) and defines the human species containing four gene pairs (Fig. 3). "
    ABSTRACT: FAM72 is a novel neuronal progenitor cell (NPC) self-renewal supporting protein expressed under physiological conditions at low levels in other tissues. Accumulating data indicate the potential pivotal tumourigenic effects of FAM72. Our in silico human genome-wide analysis (GWA) revealed that the FAM72 gene family consists of four human-specific paralogous members, all of which are located on chromosome (chr) 1. Unique asymmetric FAM72 segmental gene duplications are most likely to have occurred in conjunction with the paired genomic neighbour SRGAP2 (SLIT-ROBO Rho GTPase activating protein), as both genes have four paralogues in humans but only one vertebra-emerging orthologue in all other species. No species with two or three FAM72/SRGAP2 gene pairs could be identified, and the four exclusively human-defining ohnologues, with different mutation patterns in Homo neanderthalensis and Denisova hominin, may remain under epigenetic control through long non-coding (lnc) RNAs.
    Genomics 10/2015; 106:278-285. · 2.28 Impact Factor
    • "To date, the genetic basis of human-specific phenotypes remains largely unknown, complicated by the difficulties in distinguishing between phenotypically significant and benign variation. Thus, evolutionary changes in protein-coding sequences have received considerable attention, as the phenotypic consequences of these mutations have historically been easier to interpret (Clark et al. 2003; Stedman et al. 2004; Chimpanzee Sequencing and Analysis Consortium 2005; Nielsen et al. 2005; Arbiza et al. 2006; Dennis et al. 2012; Sudmant et al. 2013). Although protein-coding evolution has clearly played a role in human evolution, proteins account for only ∼1.5% of the human genome, most of which exhibit high sequence similarity between humans and chimpanzees (Chimpanzee Sequencing and Analysis Consortium 2005). "
    ABSTRACT: It has long been hypothesized that changes in gene regulation have played an important role in human evolution, but regulatory DNA has been much more difficult to study compared with protein-coding regions. Recent large-scale studies have created genome-scale catalogs of DNase I hypersensitive sites (DHSs), which demark potentially functional regulatory DNA. To better define regulatory DNA that has been subject to human-specific adaptive evolution, we performed comprehensive evolutionary and population genetics analyses on over 18 million DHSs discovered in 130 cell types. We identified 524 DHSs that are conserved in nonhuman primates but accelerated in the human lineage (haDHS), and estimate that 70% of substitutions in haDHSs are attributable to positive selection. Through extensive computational and experimental analyses, we demonstrate that haDHSs are often active in brain or neuronal cell types; play an important role in regulating the expression of developmentally important genes, including many transcription factors such as SOX6, POU3F2, and HOX genes; and identify striking examples of adaptive regulatory evolution that may have contributed to human-specific phenotypes. More generally, our results reveal new insights into conserved and adaptive regulatory DNA in humans and refine the set of genomic substrates that distinguish humans from their closest living primate relatives.
    • "most HNR variants (N = 618) is PRIM2 that is part of interchromosomal duplications of Chromosomes 6 and 3 and represents cryptic SDs in the GRCh37 reference genome (Genovese et al. 2013). Additionally, two regions that were incorrectly represented in GRCh37 and subsequently resolved in GRCh38 using the CHM1 derived BAC library, SRGAP2 (Dennis et al. 2012) and IGH (Watson et al. 2013), both had high counts of HNR variants (39 and 54, respectively) providing additional support for the hypothesis that heterozygous calls are indicative of reference assembly errors. The majority of the heterozygous calls are errors that arise during variant detection due to paralogous sequences m"
    ABSTRACT: A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content show this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicate misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly.
    Genome Research 11/2014; 24(12). DOI:10.1101/gr.180893.114 · 14.63 Impact Factor