Zeng K, Fu YX, Shi SH, Wu CI. Statistical tests for detecting positive selection by utilizing high-frequency variants. Genetics 174: 1431-1439.

State Key Laboratory of Biocontrol, Ministry of Education, Sun Yat-sen University, Guangzhou, China.
Genetics (Impact Factor: 5.96). 12/2006; 174(3):1431-9. DOI: 10.1534/genetics.106.061432
Source: PubMed


By comparing the low-, intermediate-, and high-frequency parts of the frequency spectrum, we gain information on the evolutionary forces that influence the pattern of polymorphism in population samples. We emphasize the high-frequency variants on which positive selection and negative (background) selection exhibit different effects. We propose a new estimator of theta (the product of effective population size and neutral mutation rate), thetaL, which is sensitive to the changes in high-frequency variants. The new thetaL allows us to revise Fay and Wu's H-test by normalization. To complement the existing statistics (the H-test and Tajima's D-test), we propose a new test, E, which relies on the difference between thetaL and Watterson's thetaW. We show that this test is most powerful in detecting the recovery phase after the loss of genetic diversity, which includes the postselective sweep phase. The sensitivities of these tests to (or robustness against) background selection and demographic changes are also considered. Overall, D and H in combination can be most effective in detecting positive selection while being insensitive to other perturbations. We thus propose a joint test, referred to as the DH test. Simulations indicate that DH is indeed sensitive primarily to directional selection and no other driving forces.
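The abstract names the estimators but gives no formulas, so the following is only a minimal sketch of how the three contrasts could be computed from an unfolded site frequency spectrum. It assumes the standard definitions: thetaW = S / a_n, theta_pi as the pairwise-difference estimator, and thetaL = sum(i * xi_i) / (n - 1), where xi_i is the number of sites whose derived allele appears i times in a sample of n sequences. The function names and the `sfs` list layout are illustrative choices, not from the paper. Only the numerators of D, H, and E are computed; the published tests divide each by the standard deviation of the difference, which is omitted here.

```python
from math import comb


def theta_estimators(sfs):
    """Estimate theta three ways from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which the derived
    allele is carried by exactly i of the n sampled sequences (i = 1..n-1).
    This layout is an assumption made for this sketch.
    """
    n = len(sfs) + 1
    a_n = sum(1.0 / i for i in range(1, n))
    # Watterson's estimator: weights every segregating site equally.
    theta_w = sum(sfs) / a_n
    # Pairwise-difference estimator: weights intermediate frequencies most.
    theta_pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / comb(n, 2)
    # thetaL: weights sites in proportion to derived-allele count,
    # so it responds to high-frequency derived variants.
    theta_l = sum(i * x for i, x in enumerate(sfs, start=1)) / (n - 1)
    return {"theta_w": theta_w, "theta_pi": theta_pi, "theta_l": theta_l}


def unnormalized_contrasts(sfs):
    """Numerators only; the actual tests normalize by a standard deviation."""
    t = theta_estimators(sfs)
    return {
        "D_num": t["theta_pi"] - t["theta_w"],  # Tajima's D numerator
        "H_num": t["theta_pi"] - t["theta_l"],  # H numerator (thetaL form)
        "E_num": t["theta_l"] - t["theta_w"],   # E numerator
    }


# Under neutrality the expected spectrum is theta / i, so all three
# estimators should agree and every contrast should be close to zero.
neutral = [10.0 / i for i in range(1, 10)]
print(theta_estimators(neutral))
print(unnormalized_contrasts(neutral))
```

A spectrum dominated by high-frequency derived variants, for example `[0, 0, 0, 4]` with n = 5, pushes thetaL above both other estimators, giving a negative H numerator and a positive E numerator.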

    • "In order to detect a signature of selection in the sex region, Tajima’s D [43] and Fu and Li’s D* [44] statistics were calculated with the DnaSP v5 software [45] separately for the male, female and hermaphrodite haplogroups. To confirm traces of selection detected on the male haplogroups with the Tajima’s D and the Fu and Li’s D* tests, the E statistics and the DH test [46] were computed using the male haplotype of V. balanseana as an outgroup (Table 2). "
    ABSTRACT: Background: In Vitis vinifera L., domestication induced a dramatic change in flower morphology: the wild sylvestris subspecies is dioecious while hermaphroditism is largely predominant in the domesticated subsp. V. v. vinifera. The characterisation of polymorphisms in genes underlying the sex-determining chromosomal region may help clarify the history of domestication in grapevine and the evolution of sex chromosomes in plants. In the genus Vitis, sex determination is putatively controlled by one major locus with three alleles, male M, hermaphrodite H and female F, with an allelic dominance M > H > F. Previous genetic studies located the sex locus on chromosome 2. We used DNA polymorphisms of geographically diverse V. vinifera genotypes to confirm the position of this locus, to characterise the genetic diversity and traces of selection in candidate genes, and to explore the origin of hermaphroditism. Results: In V. v. sylvestris, a sex-determining region of 154.8 kb, also present in other Vitis species, spans less than 1% of chromosome 2. It displays haplotype diversity, linkage disequilibrium and differentiation that typically correspond to a small XY sex-determining region with XY males and XX females. In male alleles, traces of purifying selection were found for a trehalose phosphatase, an exostosin and a WRKY transcription factor, with strikingly low polymorphism levels between distant geographic regions. Both diversity and network analysis revealed that H alleles are more closely related to M than to F alleles. Conclusions: Hermaphrodite alleles appear to derive from male alleles of wild grapevines, with successive recombination events allowing import of diversity from the X into the Y chromosomal region and slowing down the expansion of the region into a full heteromorphic chromosome. Our data are consistent with multiple domestication events and show traces of introgression from other Asian Vitis species into the cultivated grapevine gene pool.
    BMC Plant Biology 09/2014; 14(1):229. DOI:10.1186/s12870-014-0229-z · 3.81 Impact Factor
    • "We calculated the following statistics for all simulations: 1) Tajima’s D (Tajima 1989); 2) Fay and Wu’s HFW (Fay and Wu 2000); 3) Zeng et al.’s E (Zeng et al. 2006); 4) number of distinct haplotypes K; 5) haplotype diversity H; and 6) count of the most frequent haplotype M. The first three statistics are separate estimators of the scaled mutation rate "
    ABSTRACT: The genome-wide scan for selection is an important method for identifying loci involved in adaptive evolution. However, theory that underlies standard scans for selection assumes a simple mutation model. In particular, recurrent mutation of the selective target is not considered. Although this assumption is reasonable for single nucleotide variants (SNVs), a microsatellite targeted by selection will reliably violate this assumption due to high mutation rate. Moreover, the mutation rate of microsatellites is generally high enough to ensure that recurrent mutation is pervasive rather than occasional. It is therefore unclear if positive selection targeting microsatellites can be detected using standard scanning statistics. Examples of functional variation at microsatellites underscore the significance of understanding the genomic effects of microsatellite selection. Here, we investigate the joint effects of selection and complex mutation on linked sequence diversity, comparing simulations of microsatellite selection and SNV-based selective sweeps. We find that selection on microsatellites is generally difficult to detect using popular summaries of the site frequency spectrum, and, under certain conditions, using popular methods such as iHS and SweepFinder. However, comparisons of the number of haplotypes (K) and segregating sites (S) often provide considerable power to detect selection on microsatellites. We apply this knowledge to a scan of autosomes in the human CEU population. In addition to the most commonly reported targets of selection in European populations, we identify numerous novel genomic regions that bear highly anomalous haplotype configurations. Using one of these regions - intron 1 of MAGI2 - as an example, we show that the anomalous configuration is coincident with a perfect CA repeat of length 22. We conclude that standard genome-wide scans will commonly fail to detect mutationally complex targets of selection but that comparisons of K and S will, in many cases, facilitate their identification.
    Genome Biology and Evolution 06/2014; 6(7). DOI:10.1093/gbe/evu134 · 4.23 Impact Factor
    • "Briefly, a large number of replicate simulations were performed for each demographic model, where the parameters of the model were drawn from prior distributions. Simulated data were summarized using θW (Watterson 1975), Tajima's D (Tajima 1989), the standardized Fay and Wu's H (Fay and Wu 2000; Zeng et al. 2006), and Kelly's ZnS (Kelly 1997) statistics. "
    ABSTRACT: Pinus krempfii Lecomte is a morphologically and ecologically unique pine, endemic to Vietnam. It is regarded as a vulnerable species, with a distribution limited to just two provinces: Khanh Hoa and Lam Dong. Although a few phylogenetic studies have included this species, almost nothing is known about its genetic features. In particular, there are no studies addressing the levels and patterns of genetic variation in natural populations of P. krempfii. In this study, we sampled 57 individuals from six natural populations of P. krempfii and analyzed their sequence variation in ten nuclear gene regions (approximately 9 kb) and 14 mitochondrial (mt) DNA regions (approximately 10 kb). We also analyzed variation at seven chloroplast (cp) microsatellite (SSR) loci. We found very low haplotype and nucleotide diversity at nuclear loci compared with other pine species. Furthermore, all investigated populations were monomorphic across all mitochondrial DNA (mtDNA) regions included in our study, which are polymorphic in other pine species. Population differentiation at nuclear loci was low (5.2%) but significant. However, structure analysis of nuclear loci did not detect genetically differentiated groups of populations. Approximate Bayesian computation (ABC) using nuclear sequence data and mismatch distribution analysis for cpSSR loci suggested recent expansion of the species. The implications of these findings for the management and conservation of P. krempfii genetic resources were discussed.
    Ecology and Evolution 05/2014; 4(11):2228-2238. DOI:10.1002/ece3.1091 · 2.32 Impact Factor