Performing the Exact Test of Hardy-Weinberg Proportion for Multiple Alleles

Department of Statistics, University of Washington Seattle, Seattle, Washington, United States
Biometrics (Impact Factor: 1.52). 07/1992; 48(2):361-72. DOI: 10.2307/2532296
Source: PubMed

ABSTRACT The Hardy-Weinberg law plays an important role in the field of population genetics and often serves as a basis for genetic inference. Because of its importance, much attention has been devoted to tests of Hardy-Weinberg proportions (HWP) over the decades. It has long been recognized that large-sample goodness-of-fit tests can sometimes lead to spurious results when the sample size and/or some genotypic frequencies are small. Although a complete enumeration algorithm for the exact test has been proposed, it is not of practical use for loci with more than a few alleles due to the amount of computation required. We propose two algorithms to estimate the significance level for a test of HWP. The algorithms are easily applicable to loci with multiple alleles. Both are remarkably simple and computationally fast. Relative efficiency and merits of the two algorithms are compared. Guidelines regarding their usage are given. Numerical examples are given to illustrate the practicality of the algorithms.
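
As an illustration of how a significance level for a test of HWP can be estimated by simulation rather than complete enumeration, the sketch below uses a plain permutation-style Monte Carlo scheme: the observed alleles are shuffled and re-paired into genotypes, and the estimate is the fraction of shuffled samples that are no more probable than the observed one. This is a minimal sketch under stated assumptions and is not claimed to be either of the two algorithms proposed in the paper. The function names `log_weight` and `mc_hwp_pvalue`, the default number of shuffles, and the seed are hypothetical choices made for the example.

```python
import math
import random
from collections import Counter


def log_weight(genotypes):
    """Log of the sample-dependent part of the genotype probability.

    Conditional on the allele counts, the probability of a genotype sample
    under HWP is proportional to 2^H / prod(n_ij!), where H is the number of
    heterozygotes and n_ij are the genotype counts. The remaining factor is
    the same for every sample with the same allele counts, so it is omitted.
    """
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    het = sum(c for (a, b), c in counts.items() if a != b)
    return het * math.log(2) - sum(math.lgamma(c + 1) for c in counts.values())


def mc_hwp_pvalue(genotypes, n_shuffles=10000, seed=0):
    """Estimate the exact-test p-value by shuffling alleles (assumed scheme)."""
    rng = random.Random(seed)
    observed = log_weight(genotypes)
    alleles = [allele for g in genotypes for allele in g]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(alleles)
        # Pair consecutive alleles to form a random genotype sample
        # with the same allele counts as the observed data.
        sample = list(zip(alleles[0::2], alleles[1::2]))
        # Small tolerance guards against floating-point ties.
        if log_weight(sample) <= observed + 1e-9:
            hits += 1
    return hits / n_shuffles


# Hypothetical data: a two-allele locus with no heterozygotes at all.
no_hets = [("A", "A")] * 10 + [("B", "B")] * 10
print(mc_hwp_pvalue(no_hets, n_shuffles=2000))
```

Because genotypes are stored as allele pairs, the same code handles any number of alleles at a locus. The precision of the estimate depends on the number of shuffles.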

Available from: Sun-Wei Guo, Aug 03, 2015
    • "Tests for pairwise linkage disequilibrium (LD) between all allelic pairs among loci from all sampling sites were calculated in ARLEQUIN. HWE significance was calculated using an exact test (Guo and Thompson, 1992) with a Markov chain of 10^6 steps and 10^5 dememorisation steps. Significance values for LD were calculated using a likelihood-ratio test (LRT) of 20,000 permutations (Slatkin and Excoffier, 1996). "
    ABSTRACT: We used mitochondrial DNA (mtDNA) control region (CR) sequences and genotypes from eight microsatellite DNA (msatDNA) loci to determine the genetic structure of the school shark (Galeorhinus galeus) in New Zealand, Australia and Chile. The estimates of mtDNA haplotype and nucleotide diversity were very similar in New Zealand (h = 0.735 ± 0.032, π = 0.001 ± 0.001) and Australia (h = 0.729 ± 0.027, π = 0.001 ± 0.001), but in Chile they were higher (h = 0.800 ± 0.089, π = 0.002 ± 0.001). The haplotype genealogy showed evidence of two distinct clades, New Zealand and Australia combined (clade 1), and Chile (clade 2). A power analysis suggested that sample sizes were large enough to detect any significant differences within clade 1. Neutrality test, mismatch distribution, and demographic reconstructions based on a coalescence approach, suggested that the Oceania population (clade 1) went through a period of population expansion, whereas the population size of the Chile population (clade 2) has been relatively stable over the last 20,000 years. Data from microsatellite loci also supported the separation of the Oceania and Chile populations. Principal component analysis suggested that there might also be a separation of groups within clade 1, which was not statistically significant (P = 0.434). The genetic data reported in this study supported the model of a single G. galeus stock in New Zealand and Australia. Our findings were consistent with previous tagging data that showed individual G. galeus migrate across the Tasman Sea between Australia and New Zealand, and at least some of these migration events result in successful reproduction.
    Fisheries Research 07/2015; 167:132-142. DOI:10.1016/j.fishres.2015.02.010 · 1.84 Impact Factor
    • "Prior to the further analyses, microsatellite genotype frequencies of all loci were tested for Hardy–Weinberg equilibrium (HWE) expectations to avoid the inclusion of loci that did not conform to panmixia, which would violate the assumptions of most test statistics. Genotypic data were checked across loci and populations according to Guo and Thompson (1992), with 100 000 Markov chain steps and 10 000 dememorisation steps. Linkage disequilibrium between pairs of loci was tested using the likelihood ratio test. "
    ABSTRACT: Indochina Peninsula is the primary centre of diversity of rice and lies partly in the centre of origin of cultivated rice (Oryza sativa) where the wild ancestor (Oryza rufipogon) is still abundant. The wild gene pool is potentially endangered by urbanisation and the expansion of agriculture, and by introgression hybridisation with locally cultivated rice varieties. To determine genetic diversity and structure of the wild rice of the region we genotyped nearly 1000 individuals using 20 microsatellite loci. We found ecological differentiation in 48 populations, distinguishable by their life-history traits and the country of origin. Geographical divergence was suggested by isolation of the perennial Myanmar populations from those of Cambodia, Laos and Thailand. The annual types would be most likely to have lost genetic variation because of genetic drift and inbreeding. The growing of cultivated and wild rice together, however, gives ample opportunities for hybridisation, which already shows signs of genetic mixing, and will ultimately lead to replacement of the original wild rice gene pool. For conservation we suggest that wild rice should be conserved ex situ in order to prevent introgression from cultivated rice, along with in situ conservation in individual countries for the recurrent evolutionary process through local adaptation, but with sufficient isolation from cultivated rice fields to preserve genetic integrity of the wild populations.
    Annals of Applied Biology 07/2015; DOI:10.1111/aab.12242 · 1.96 Impact Factor
    • "Specifically, selection, nonrandom mating, population subdivision, small population size, mutation, and null alleles all can result in deviations of genotype frequencies from Hardy-Weinberg frequencies; selection, mutation rate heterogeneity, and population expansions can cause significant departures of Tajima's D from zero (Tajima 1989a,b; Aris-Brosou and Excoffier 1996); and population bottlenecks and/or expansions, selective sweeps, and mutation rate heterogeneity can cause 'waves' in the frequency distributions of substitutional differences among pairs of individuals (mismatch distributions; Slatkin and Hudson 1991; Rogers and Harpending 1992; Rogers et al. 1996). Genotype frequencies for each locus were tested for deviations from Hardy-Weinberg equilibrium both for individual sampling sites and for the total sample (Guo and Thompson 1992). Tajima's D statistic was estimated for the combined data after randomizing the order in which alleles were recorded for each individual to avoid establishing pseudo-linkages across loci; because mutation rates vary from less than 0.25% per million years to more than 0.65% per million years (unpubl. "