A Map of Recent Positive Selection in the Human Genome

Department of Human Genetics, University of Chicago, Chicago, Illinois, USA.
PLoS Biology (Impact Factor: 9.34). 04/2006; 4(3):e72. DOI: 10.1371/journal.pbio.0040072
Source: PubMed


The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest approximately 250 signals of recent selection in each population.

    • "This type of test may theoretically be able to distinguish between directional and balancing selection, since the former is expected to drive alleles toward fixation while the latter will maintain an elevated level of variability (Akey et al., 2002). In addition, selection on an individual locus is known to reduce genetic variability at linked sites (Maynard Smith and Haigh, 1974), which has led to an assortment of tests for selection based on data that are observed either within a single population (Sabeti et al., 2002; Voight et al., 2006) or between populations (Sabeti et al., 2007; Tang et al., 2007). This class of tests, however, is most powerful for detecting recent strong selection since the length of any haplotypes showing reduced variability, which are the basis of these tests, will decay over time. "
    ABSTRACT: A whole-genome scan for identifying selection acting on pairs of linked loci is proposed and implemented. The scan is based on , one of Ohta's 1982 measures of between-population linkage disequilibrium (LD). An approximate empirical null distribution for the statistic is suggested. Although the partitioning of LD into between-population components was originally used to investigate epistatic selection, we demonstrate that values of may also be influenced by single-locus selective sweeps with linkage but no epistasis. The proposed scan is implemented in a diverse panel of chickens including 72 distinct breeds genotyped at 538 298 single-nucleotide polymorphisms. In all, 1723 locus pairs are identified as putatively corresponding to a selective sweep or epistatic selection. These pairs of loci generally cluster to form overlapping or neighboring signals of selection. Known variants that were expected to have been under selection in the panel are identified, as well as an assortment of novel regions that have putatively been under selection in chickens. Notably, a promising pair of genes located 8 MB apart on chromosome 9 are identified based on as demonstrating strong evidence of dispersive epistatic selection between populations.
    Heredity advance online publication, 9 September 2015; doi:10.1038/hdy.2015.81.
    Heredity 09/2015; DOI:10.1038/hdy.2015.81 · 3.81 Impact Factor
    • "While it has long been appreciated that demographic perturbations (e.g., population size change, population structure, migration) may result in patterns of variation that are similar to those produced under positive selection, and should therefore be taken into account when identifying selected regions (e.g., Robertson, 1975; Andolfatto and Przeworski, 2000; Teshima et al., 2006; Thornton and Jensen, 2007; Siol et al., 2010; Jensen, 2014), it has also been specifically demonstrated that the assumption of an equilibrium population history may bias selection inference (e.g., Jensen et al., 2005). Further, Crisci et al. (2013) recently evaluated several proposed background site frequency spectrum based approaches [including Sweepfinder (Nielsen et al., 2005), Sweed (Pavlidis et al., 2013), OmegaPlus (Alachiotis et al., 2012), and iHS (Voight et al., 2006)]. Though they demonstrated the linkage disequilibrium based approaches to perform better, they also described a high false positive rate and low true positive rate under a great variety of models – most notably those including severe bottlenecks. "
    ABSTRACT: The ability to infer the parameters of positive selection from genomic data has many important implications, from identifying drug-resistance mutations in viruses to increasing crop yield by genetically integrating favorable alleles. Although it has been well-described that selection and demography may result in similar patterns of diversity, the ability to jointly estimate these two processes has remained elusive. Here, we use simulation to explore the utility of the joint site frequency spectrum to estimate selection and demography simultaneously, including developing an extension of the previously proposed Jaatha program (Mathew et al., 2013). We evaluate both complete and incomplete selective sweeps under an isolation-with-migration model with and without population size change (both population growth and bottlenecks). Results suggest that while it may not be possible to precisely estimate the strength of selection, it is possible to infer the presence of selection while estimating accurate demographic parameters. We further demonstrate that the common assumption of selective neutrality when estimating demographic models may lead to severe biases. Finally, we apply the approach we have developed to better characterize the within-host demographic and selective history of human cytomegalovirus (HCMV) infection using published next generation sequencing data.
    Frontiers in Genetics 09/2015; 6:268. DOI:10.3389/fgene.2015.00268
    • "We validated that these changes were previously described not to affect the sensitivity and specificity of the method through coalescent simulations (Pybus et al. 2014). Standardized iHS scores were obtained by grouping SNVs into 20 bins separated by a derived allele frequency (DAF) of 0.05, subtracting the mean, and dividing by the standard deviation for all SNVs in the same bin as in Voight et al. (2006). Extreme positive or negative values indicate high EHH of haplotypes carrying the ancestral or derived allele, respectively. "
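
The frequency-bin standardization described in the excerpt above can be sketched roughly as follows. This is a minimal illustration, not code from Voight et al. (2006) or from the citing paper: the function name `standardize_by_daf_bin`, its parameters, and the choice to leave a score undefined when its bin has fewer than two SNVs or zero variance are all assumptions made for the example. It takes unstandardized scores and derived allele frequencies, assigns each SNV to one of 20 bins of width 0.05, and converts each score to a z-score within its bin.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_by_daf_bin(raw_scores, dafs, n_bins=20):
    """Z-score each raw score within its derived-allele-frequency bin.

    Hypothetical helper: raw_scores and dafs are parallel sequences,
    one entry per SNV. Returns a list with None where no z-score
    can be computed.
    """
    # Group SNV indices by bin. With n_bins=20 each bin spans a DAF
    # range of 0.05; a DAF of exactly 1.0 is placed in the last bin.
    groups = defaultdict(list)
    for i, daf in enumerate(dafs):
        groups[min(int(daf * n_bins), n_bins - 1)].append(i)

    standardized = [None] * len(raw_scores)
    for indices in groups.values():
        if len(indices) < 2:
            continue  # assumption: a single SNV cannot be standardized
        values = [raw_scores[i] for i in indices]
        bin_mean, bin_sd = mean(values), stdev(values)
        if bin_sd == 0:
            continue  # assumption: skip bins with no variation
        for i in indices:
            standardized[i] = (raw_scores[i] - bin_mean) / bin_sd
    return standardized


# Toy input: three SNVs in the lowest bin, three in a mid-frequency bin.
raw = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
daf = [0.01, 0.02, 0.03, 0.51, 0.52, 0.53]
z = standardize_by_daf_bin(raw, daf)
```

Standardizing within bins rather than genome-wide keeps SNVs at different allele frequencies comparable, since the spread of the raw statistic depends on frequency. The sketch uses the sample standard deviation; the excerpt does not say which estimator was used.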