A Markov Chain Monte Carlo Approach for Joint Inference of Population Structure and Inbreeding Rates From Multilocus Genotype Data

Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA.
Genetics (Impact Factor: 5.96). 08/2007; 176(3):1635-51. DOI: 10.1534/genetics.107.072371
Source: PubMed


Nonrandom mating induces correlations in allelic states within and among loci that can be exploited to understand the genetic structure of natural populations (Wright 1965). For many species, it is of considerable interest to quantify the contribution of two forms of nonrandom mating to patterns of standing genetic variation: inbreeding (mating among relatives) and population substructure (limited dispersal of gametes). Here, we extend the popular Bayesian clustering approach STRUCTURE (Pritchard et al. 2000) for simultaneous inference of inbreeding or selfing rates and population-of-origin classification using multilocus genetic markers. This is accomplished by eliminating the assumption of Hardy-Weinberg equilibrium within clusters and, instead, calculating expected genotype frequencies on the basis of inbreeding or selfing rates. We demonstrate the need for such an extension by showing that selfing leads to spurious signals of population substructure using the standard STRUCTURE algorithm, with a bias toward spurious signals of admixture. We gauge the performance of our method using extensive coalescent simulations and demonstrate that our approach can correct for this bias. We also apply our approach to understanding the population structure of the wild relative of domesticated rice, Oryza rufipogon, an important partially selfing grass species. Using a sample of n = 16 individuals sequenced at 111 random loci, we find strong evidence for the existence of two subpopulations that correlate well with the geographic location of sampling, and we estimate selfing rates for both groups that are consistent with estimates from experimental data (s ≈ 0.48–0.70).
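
The idea of replacing Hardy-Weinberg proportions with inbreeding-adjusted genotype frequencies can be illustrated with a minimal sketch. It assumes a single biallelic locus and the standard equilibrium relation F = s / (2 - s) between a selfing rate s and the inbreeding coefficient F under mixed selfing and random outcrossing. The function names are hypothetical and are not taken from the authors' software; the full method works with multilocus data inside an MCMC sampler, which this sketch does not attempt.

```python
def inbreeding_coefficient(s):
    """Equilibrium inbreeding coefficient F for selfing rate s.

    Assumes mixed mating (selfing with probability s, random
    outcrossing otherwise); this is the textbook relation, used
    here as an illustrative assumption.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("selfing rate must lie in [0, 1]")
    return s / (2.0 - s)


def genotype_frequencies(p, F):
    """Expected genotype frequencies at a biallelic locus.

    p is the frequency of allele A; F is the inbreeding coefficient.
    F = 0 recovers Hardy-Weinberg proportions; F > 0 shifts mass
    from heterozygotes to homozygotes.
    """
    q = 1.0 - p
    return {
        "AA": p * p + F * p * q,
        "Aa": 2.0 * p * q * (1.0 - F),
        "aa": q * q + F * p * q,
    }


if __name__ == "__main__":
    # Endpoints of the selfing-rate range quoted in the abstract.
    for s in (0.48, 0.70):
        F = inbreeding_coefficient(s)
        print(s, round(F, 3), genotype_frequencies(0.5, F))
```

Under these assumptions, a cluster with a high selfing rate shows a heterozygote deficit relative to Hardy-Weinberg expectations, which is the kind of signal a model without inbreeding may misread as substructure.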

    • "David et al. (2007) extend the approach of Enjalbert and David (2000) to accommodate errors in scoring heterozygotes as homozygotes. A primary objective of InStruct (Gao et al. 2007) is the estimation of admixture. It extends the widely-used program structure (Pritchard et al. 2000), which bases the estimation of admixture on disequilibria of various forms, by accounting for disequilibria due to selfing. "
    ABSTRACT: We present a Bayesian method for characterizing the mating system of populations reproducing through a mixture of self-fertilization and random outcrossing. Our method uses patterns of genetic variation across the genome as a basis for inference about reproduction under pure hermaphroditism, gynodioecy, and a model developed to describe the self-fertilizing killifish Kryptolebias marmoratus. We extend the standard coalescence model to accommodate these mating systems, accounting explicitly for multilocus identity disequilibrium, inbreeding depression, and variation in fertility among mating types. We incorporate the Ewens Sampling Formula (ESF) under the infinite-alleles model of mutation to obtain a novel expression for the likelihood of mating system parameters. Our Markov chain Monte Carlo (MCMC) algorithm assigns locus-specific mutation rates, drawn from a common mutation rate distribution that is itself estimated from the data using a Dirichlet Process Prior model. Our sampler is designed to accommodate additional information, including observations pertaining to the sex ratio, the intensity of inbreeding depression, and other aspects of reproduction. It can provide joint posterior distributions for the population-wide proportion of uniparental individuals, locus-specific mutation rates, and the number of generations since the most recent outcrossing event for each sampled individual. Further, estimation of all basic parameters of a given model permits estimation of functions of those parameters, including the proportion of the gene pool contributed by each sex and relative effective numbers.
    Genetics 09/2015; DOI:10.1534/genetics.115.179093 · 5.96 Impact Factor
    • "e frequencies. Considering K = 1 to 20, 10^5 burn-in steps and 10^6 iterations of MCMC algorithm were run 10 times per K. The optimal number of clusters was estimated using ΔK method described by Evanno et al. (2005) and computed with STRUCTURE HARVESTER online (Earl & vonHoldt, 2011). Contrary to STRUCTURE, the INSTRUCT software (Gao et al., 2007) takes inbreeding into account and does not require the assumption of HWE within clusters."
    ABSTRACT: 1. In the South-West Indian Ocean, the honeybee Apis mellifera is found on several islands including the Seychelles archipelago. This archipelago is located 1120 km North of Madagascar, where the endemic African subspecies A. m. unicolor occurs. The genetic diversity of the honeybee populations in the Seychelles islands has never been studied, yet this species interacts with highly endemic and indigenous flora. 2. A total of 186 honeybee colonies from the three main islands: Mahé, Praslin and La Digue were collected. In addition, 107 individuals from Madagascar (A. m. unicolor) and 49 from Italy (A. m. ligustica) were analysed as reference populations. The maternal lineages were assessed using PCR-RFLP (n = 342) and sequencing (n = 121) of the mtDNA COI–COII intergenic region. Intra-Seychelles nuclear genetic diversity and structure were analysed using 15 microsatellites while comparison with reference populations was done using 14 loci. 3. All Seychellian colonies had mtDNA sequences characteristic of the African evolutionary lineage. Two sub-lineages were detected: AI sub-lineage (A1) was dominant (96.7%) on all islands and mostly represented by the subspecies A. m. unicolor, while Z sub-lineage was observed in six colonies from two islands. No mtDNA characteristic of imported European lineages was detected. 4. Nuclear genetic diversity was high and structured, suggesting restricted gene flow between islands of the archipelago. High nuclear similarities were found among the Seychellian and A. m. unicolor populations, yet significant genetic differentiation was observed. The A. m. ligustica reference population was highly differentiated from the Seychellian honeybee populations.
    Insect Conservation and Diversity 08/2015; DOI:10.1111/icad.12138 · 2.17 Impact Factor
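
The ΔK criterion mentioned in the excerpt above can be sketched as follows. This is a minimal illustration, assuming the commonly used form ΔK = |L(K+1) - 2 L(K) + L(K-1)| / sd[L(K)], where L(K) is the mean ln P(D) over replicate runs at a given K and sd is the standard deviation across those runs. The function name and the input layout are hypothetical, and the numbers in the usage example are made up for illustration only.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute ΔK for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to a list of ln P(D) values, one per
    replicate run (at least two runs per K are needed for the
    standard deviation).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # ΔK is undefined at the smallest and largest K
        second_diff = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
        sd = stdev(lnp_by_k[k])
        result[k] = second_diff / sd if sd > 0 else float("inf")
    return result


if __name__ == "__main__":
    # Made-up ln P(D) values for K = 1..4, two runs each.
    runs = {
        1: [-100.0, -102.0],
        2: [-50.0, -52.0],
        3: [-48.0, -50.0],
        4: [-47.0, -49.0],
    }
    scores = delta_k(runs)
    print(max(scores, key=scores.get), scores)
```

Because the statistic needs L(K-1), no value exists for the smallest K tested, so K = 1 can never be selected by ΔK alone.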
    • "As the ΔK method cannot detect instances when K = 1, the lnP(D) was examined to determine whether K = 1 was the maximum. As a result of the significant levels of inbreeding reported in this species (Faulkes et al. 1990, 1997; Reeve et al. 1990; Honeycutt et al. 1991) and the nonrandom nature of the sampling scheme (# individuals/colony vs. # colonies/locality), we also applied the clustering program INSTRUCT v1.1 (Gao et al. 2007). The algorithm used in STRUCTURE is based on the principles of population genetics, detecting structure through deviations from the assumptions of Hardy–Weinberg equilibrium (Wahlund 1928; Pritchard et al. 2000)."
    ABSTRACT: The role of genetic relatedness in the evolution of eusociality has been the topic of much debate, especially when contrasting eusocial insects with vertebrates displaying reproductive altruism. The naked mole-rat, Heterocephalus glaber, was the first described eusocial mammal. Although this discovery was based on an ecological constraints model of eusocial evolution, early genetic studies reported high levels of relatedness in naked mole-rats, providing a compelling argument that low dispersal rates and consanguineous mating (inbreeding as a mating system) are the driving forces for the evolution of this eusocial species. One caveat to accepting this long-held view is that the original genetic studies were based on limited sampling from the species’ geographic distribution. A growing body of evidence supports a contrary view, with the original samples not representative of the species but rather reflecting a single founder event that established a small population south of the Athi River. Our study is the first to address these competing hypotheses by examining patterns of molecular variation in colonies sampled from north and south of the Athi and Tana Rivers, which, based on our results, serve to isolate genetically distinct populations of naked mole-rats. Although colonies south of the Athi River share a single mtDNA haplotype and are fixed at most microsatellite loci, populations north of the Athi River are considerably more variable. Our findings support the position that the low variation observed in naked mole-rat populations south of the Athi River reflects a founder event, rather than a consequence of this species’ unusual mating system.
    Molecular Ecology 08/2015; 24(19). DOI:10.1111/mec.13358 · 6.49 Impact Factor