Article

Inference of Population Structure Using Multilocus Genotype Data

Abstract

We describe a model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations. We assume a model in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are assigned (probabilistically) to populations, or jointly to two or more populations if their genotypes indicate that they are admixed. Our model does not assume a particular mutation process, and it can be applied to most of the commonly used genetic markers, provided that they are not closely linked. Applications of our method include demonstrating the presence of population structure, assigning individuals to populations, studying hybrid zones, and identifying migrants and admixed individuals. We show that the method can produce highly accurate assignments using modest numbers of loci—e.g., seven microsatellite loci in an example using genotype data from an endangered bird species. The software used for this article is available from http://www.stats.ox.ac.uk/~pritch/home.html.
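
The probabilistic assignment described in the abstract can be illustrated with a deliberately simplified sketch. It assumes the allele frequencies of each population are already known (the method itself estimates them jointly with the assignments), that genotypes within a population follow Hardy-Weinberg proportions, that loci are unlinked, and that the prior over populations is uniform. All frequencies, allele labels, and function names below are hypothetical.

```python
import math

def log_genotype_likelihood(genotype, freqs):
    """Log-probability of a diploid multilocus genotype in one population,
    assuming Hardy-Weinberg proportions within the population and unlinked loci."""
    total = 0.0
    for (a, b), locus_freqs in zip(genotype, freqs):
        p, q = locus_freqs[a], locus_freqs[b]
        # Homozygote: p^2; heterozygote: 2pq.
        total += math.log(p * p if a == b else 2.0 * p * q)
    return total

def assignment_probabilities(genotype, pop_freqs):
    """Posterior probability of each population of origin under a uniform prior."""
    logs = [log_genotype_likelihood(genotype, f) for f in pop_freqs]
    m = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in logs]
    s = sum(weights)
    return [w / s for w in weights]

# Hypothetical allele frequencies: K = 2 populations, two loci.
pop_freqs = [
    [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}],
    [{"A": 0.2, "a": 0.8}, {"B": 0.3, "b": 0.7}],
]
genotype = [("A", "A"), ("B", "b")]
probs = assignment_probabilities(genotype, pop_freqs)
print(probs)
```

With these made-up frequencies most of the posterior mass falls on the first population, because the individual carries alleles that are common there and rare in the second.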

... Bayesian clustering analysis in STRUCTURE v2.3.4 [91] was used to estimate the likelihood that a specimen belongs to an inferred cluster under a non-admixture model with correlated allele frequencies suitable for populations with minimal mixing [92][93][94]. Mitochondrial DNA haplotypes were used as priors to assist clustering [92,94]. ...
... This finding implies a broad genetic diversity among the turtles analysed, with specimens likely originating from different parental pairs. Bayesian analyses using Structure [91] identified an optimal number of clusters at K = 2 after filtering out spurious results with StructureSelector [95]. This finding was corroborated using the Puechmaille method and further supported using the Evanno method [96][97][98]. ...
Article
Background: The conservation of loggerhead sea turtles (Caretta caretta) in the central Mediterranean benefits from an in-depth understanding of its population genetic structure and diversity. Methods: This study, therefore, investigates C. caretta in Maltese waters by genetically analysing 63 specimens collected through strandings and in-water sampling, using mitochondrial DNA control region and microsatellites. Additionally, the two nests detected in Malta in 2023 were analysed for the same markers. Results: Mitochondrial data identified 10 haplotypes, with mixed stock analyses tracing 87.5% of the specimens to Mediterranean origins, primarily from Libyan rookeries, with contributions from Lebanon, Israel and Turkey. Three Atlantic haplotypes were identified in six specimens, with CC-A17.1 linking central Mediterranean foraging individuals to rookeries in Cape Verde. Five of these six Atlantic haplotype records were from recently sampled individuals (2022-2023), possibly indicating a recent eastward expansion of Atlantic haplotypes into the Mediterranean. Bayesian clustering (K = 2) of microsatellite data using haplotypes as priors revealed similar proportions for clusters across most specimens, except for three specimens with Atlantic haplotypes CC-A1.1 and CC-A1.3, which exhibited distinct patterns. The two nests examined here displayed Mediterranean haplotypes, with nuclear DNA matching the predominant Mediterranean profiles found in foraging individuals, suggesting that local clutches originated from Mediterranean parents. Conclusions: Increasing nesting activity on Maltese beaches and this archipelago's geographical position highlight the need for ongoing genetic monitoring to track changes in genetic diversity and develop conservation strategies that support the effective protection of this species and its habitats. https://doi.org/10.3390/genes15121565
... Assessment of the structure of the gene pool based on genetic data offers insights into the whole population and the role of evolutionary forces such as mutation, migration, selection, random drift, and geographical barriers that create diverse clusters, often termed subpopulations (Greenbaum et al. 2016). The number of subpopulations in a given population can be estimated by two different approaches, viz., model-based methods and methods based on genetic distance matrices obtained from the genetic data (Pritchard et al. 2000). The former relies on the likelihood of the genetic data being randomly drawn from predefined population models, assuming that each of the K subpopulations is in Hardy-Weinberg equilibrium. ...
... To understand the composition/structure and gene flow within the AP across 80 polymorphic DNA markers, population structure analysis was carried out by using the STRUCTURE v 2.3.4 software package (Pritchard et al. 2000) with the use of Bayesian model of clustering. Parameters were set to consider the admixture model with correlated allelic frequencies among populations. ...
... Model-based population genetic clustering approaches are widely used to identify hybrid individuals from genetic data, and serve as an important tool for understanding patterns of extant genetic variation. Often implemented within the maximum likelihood (Alexander et al., 2009) or Bayesian frameworks (Pritchard et al., 2000), these methods estimate contributions from a user-designated number of ancestral genetic pools (typically denoted by K) to an individual's ancestry through estimation of probabilistic quantities called ancestry coefficients. Despite the popularity of these methods in studies of hybridization on phylogenetic time scales, population clustering methods were not originally designed for this task but rather for the task of identifying population structure in contemporary populations. ...
... These methods are also sensitive to the choice of markers, the level of genetic differentiation between populations, and the amount of data utilized (Kalinowski, 2011;Latch et al., 2006;Vähä & Primmer, 2005). The algorithms implemented in some of these tools involve unsupervised clustering, and simulations (e.g., structure, Pritchard et al., 2000) have demonstrated that these methods can produce different outcomes in replicate analyses due to label-switching or multimodality (Kopelman et al., 2015). While the former issue can be detected and eliminated through post-processing, the latter issue is much more difficult to assess. ...
Article
Several methods have been developed to carry out a statistical test for hybridization at the species level, including the ABBA-BABA test and HyDe. Here, we propose a new method for detecting hybridization and quantifying the extent of hybridization. Our test computes the likelihood of a species tree that is possibly subject to hybridization using site pattern frequencies from genomic-scale datasets under the multispecies coalescent. To do this, we extend the calculation of the likelihood for site pattern frequency data for the 4-taxon symmetric and asymmetric species trees proposed in Chifman and Kubatko (2015) by incorporating an inheritance parameter, resulting in efficient computation of the likelihood under a scenario of hybridization. We use this likelihood computation to construct a likelihood ratio test that a given species is a hybrid of two parental species. Simulations demonstrate that our test is more powerful than existing tests of hybridization, including HyDe, and that it achieves the desired type I error rate. We apply the method to two empirical data sets, one for which hybridization is believed to have occurred and one for which previous methods have failed to detect hybridization.
... DNA was extracted from blood or tissue samples using the standard phenol-chloroform protocol (Sambrook et al., 1989) and from feces samples using a QIAmp DNA Stool Mini Kit (QIAGEN). For DNA extracted from fecal samples, PCR and sizing were repeated twice (in the case of a heterozygous genotype call) or four times (in the case of a homozygous genotype call) to minimize possible genotyping errors due to allelic dropout (Pritchard et al., 2000;Peakall et al., 2003). We only recorded an allele if it was observed at least twice in different amplifications from the same DNA extract. ...
... Genetic structure was evaluated using non-spatial Bayesian clustering with the Structure v.2.3.4 program (Pritchard et al., 2000). A series of 20 independent runs per K (ranging from 1 to 7) was conducted using the admixture model with correlated allele frequencies, without prior information about sampling locations and independent allele frequencies, and with 1,000,000 Monte Carlo-Markov iterations after a burn-in of 50,000 replicates. ...
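
A replicated design like the one quoted above (20 independent runs for each K from 1 to 7) is usually scripted rather than launched by hand. The sketch below only builds the argument lists and executes nothing. It assumes the command-line STRUCTURE binary is named `structure` and accepts `-m`, `-e`, `-K`, `-D`, and `-o` as described in the STRUCTURE documentation, and that the burn-in and chain length are set in the `mainparams` file. The file names, output paths, and seeding scheme are hypothetical.

```python
from itertools import product

def structure_run_plan(k_values, replicates, base_seed=1):
    """Return one argument list per (K, replicate) pair; nothing is executed."""
    plan = []
    pairs = product(k_values, range(1, replicates + 1))
    for i, (k, rep) in enumerate(pairs):
        plan.append([
            "structure",             # assumed name of the command-line binary
            "-m", "mainparams",      # assumed to hold BURNIN and NUMREPS
            "-e", "extraparams",     # assumed to hold the ancestry/frequency model
            "-K", str(k),            # number of clusters for this run
            "-D", str(base_seed + i),             # distinct seed per run (hypothetical scheme)
            "-o", f"results/K{k}_rep{rep}",       # hypothetical output path
        ])
    return plan

# 20 replicates for each K from 1 to 7, as in the quoted design.
plan = structure_run_plan(range(1, 8), 20)
print(len(plan), "runs planned; first:", " ".join(plan[0]))
```

Each argument list could then be handed to a job scheduler or to `subprocess.run`; giving every run its own seed keeps the replicates independent.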
Article
The brown howler, Alouatta guariba, endemic to the Atlantic Forest of Brazil and Argentina, is threatened by habitat loss and fragmentation, hunting, and its susceptibility to yellow fever. Two subspecies have been recognized, but their names, validity, and geographic ranges have been controversial. We obtained samples covering the species' entire distribution in Brazil and Argentina to clarify these issues by investigating their genetic diversity and structure and assessing their evolutionary history. We analyzed, for the first time, a set of ten microsatellite markers (N = 153), plus mitochondrial DNA (mtDNA) segments of the control region (N = 207) and cytochrome b gene (N = 116). The microsatellite data support two to three genetic clusters with biological significance. The southern populations (Argentina, Santa Catarina, and Rio Grande do Sul) presented a homogeneous genetic component, and populations from São Paulo (SP) to the north presented another component, although most presented ~20% of the southern component. With K = 3, SP emerged as a third component while sharing some ancestry with Rio de Janeiro and Argentina. The mtDNA phylogenies revealed three main clades that diverged almost simultaneously around 250 thousand years ago (kya). Clades A and B are from central SP to the north and east, while clade C is from SP to the south and southwest. Samples from SP presented haplotypes in all three clades, sometimes in the same population. The demographic history of the species estimated with the Bayesian skyline plot of the mtDNA showed a strong expansion ~40-20 kya and a strong reduction over the last ~4-2 kya.
Although the genetic clusters identified here deserve appropriate management strategies as conservation units, the absence of (i) concordance between the mtDNA and microsatellite data, (ii) reciprocal monophyly in the mtDNA, and (iii) clear-cut non-genetic diagnostic characters advises against considering them as different taxonomic entities. None of the previous taxonomic proposals were corroborated by our data. Our results elucidate the taxonomy of the Atlantic Forest brown howler, indicating it should be considered a monotypic species, A. guariba. We also clarify the evolutionary history of the species regarding its intraspecific genetic diversity, which is crucial information for its conservation and population management.
... Gene flow exists between these populations (FST = 0.07, p < 0.01). Despite the small sample sizes at intermediate sites, both the observed genetic diversity and the results from the Structure software (Pritchard et al., 2000) suggest a higher rate of dispersal from the high altitude population (300-1200 m, coastal region) to the low altitude population (0-900 m, inland region). This apparently directional gene flow has not been formally tested but would reinforce that tapirs prefer lowland areas (as the common name of the species suggests). ...
... Samples were analyzed using a panel of six microsatellite loci (Table 1.1), and samples from Belize were not included in the population analyses. While Structure (Pritchard et al., 2000) estimated a single panmictic population, F statistics indicated a moderate level of population differentiation between Costa Rica and Panama (FST = 0.114; 95% CI = 0.059-0.18). Norton and Ashley (2004a) recommended that the two study areas be treated as separate management units because Structure may have reduced accuracy under scenarios of moderate population differentiation (Maudet et al., 2002) and because of the lack of suitable habitat connecting these regions. ...
Chapter
Tapirs are among the oldest living large mammals, represented by only four widely recognized extant species: two in South America, one in Central America, and one in Southeast Asia. All species are in the genus Tapirus and are currently listed as vulnerable or endangered on the IUCN Red List of Threatened Species. This chapter builds on publications using genetic tools to reveal phylogenetic relationships among tapir species, intraspecific biogeography, and population patterns. We describe how genetic tools have been used by tapir researchers to date, discuss the limitations of published studies, and provide recommendations for future research.
... A priori group assignment is based on the analysis of microsatellite data according to Confalonieri (2009), Bracco et al. (2016), López et al. (2021) and Rivas et al. (2022) (Table S1). The 87 individuals selected for this study showed group assignment probabilities > 0.8, according to STRUCTURE analysis (Pritchard, Stephens, and Donnelly 2000). The map was made with QGIS v3.16.16-Hannover (https:// qgis. ...
... The DAPC itself was implemented with the 'xvalDapc' function using the previously inferred k groups and cross-validation to define the number of PCs. A Bayesian analysis of population structure was performed with the STRUCTURE software employing the admixture model with correlated allele frequencies (Pritchard, Stephens, and Donnelly 2000). Between two and six clusters (K) were evaluated, with three runs per K (burn-in: 50,000; iterations: 100,000). ...
Article
Maize (Zea mays ssp. mays L.) landraces are traditional American crops with high genetic variability that constitute a source of original alleles for conventional maize breeding. Northern Argentina, one of the southernmost regions of traditional maize cultivation in the Americas, harbours around 57 races traditionally grown in two regions with contrasting environmental conditions, namely, the Andean mountains in the Northwest and the tropical grasslands and Atlantic Forest in the Northeast. These races encounter diverse threats to their genetic diversity and persistence in their regions of origin, with climate change standing out as one of the major challenges. In this work, we use genome‐wide SNPs derived from ddRADseq to study the genetic diversity of individuals representing the five groups previously described for this area. This allowed us to distinguish two clearly differentiated gene pools, the highland northwestern maize (HNWA) and the floury northeastern maize (FNEA). Subsequently, we employed essential biodiversity variables at the genetic level, as proposed by the Group on Earth Observations Biodiversity Observation Network (GEO BON), to evaluate the conservation status of these two groups. This assessment encompassed genetic diversity (Pi), inbreeding coefficient (F) and effective population size (Ne). FNEA showed low Ne values and high F values, while HNWA showed low Ne values and low Pi values, indicating that further genetic erosion is imminent for these landraces. Outlier detection methods allowed identification of putative adaptive genomic regions, consistent with previously reported flowering‐time loci and chromosomal regions displaying introgression from the teosinte Zea mays ssp. mexicana. Finally, species distribution models were obtained for two future climate scenarios, showing a notable reduction in the potential planting area of HNWA and a shift in the cultivation areas of FNEA.
These results suggest that maize landraces from Northern Argentina may be unable to cope with climate change. Therefore, active conservation policies are advisable.
... An unweighted neighbor-joining unrooted tree was constructed from the calculated Nei coefficient of dissimilarity (Nei 1972) with 1000 bootstrap replicates using DARwin6 software (Perrier et al. 2006). Population structure analysis was performed using STRUCTURE software, version 2.3.4 (Pritchard et al. 2000) to determine the nature of genetic structure and the number of clusters. To estimate the Bayesian distributions, the programme was set up to run from K = 1 to K = 10 with 5 independent replications per K using the admixture model, correlated allele frequencies, a 200,000 burn-in period, and 200,000 MCMC iterations. ...
... The STRUCTURE program can detect two or more homogeneous groups that are likely to be present within a single population (Pritchard et al. 2000). The genotypes present in each group may be of pure type or admixture based on their genetic loci that contribute to differences among the subgroups. ...
Article
The loss of rice's genetic diversity caused by the replacement of landraces with superior cultivars in modern agriculture highlights the importance of genetic diversity assessment. Therefore, a set of 96 rice genotypes collected from different parts of Manipur, India was used for genetic diversity assessment in the present investigation. Nine agronomic characters and 60 simple sequence repeat (SSR) microsatellite markers were used to differentiate their genetic relationships. A correlation study reveals a positive relationship between plant tiller number and yield, which can be used as a selection criterion for the improvement of yield. The SSR profile depicts the presence of 118 alleles in the rice genotypes studied. The STRUCTURE analysis grouped the 96 rice genotypes into two major clusters, with four sub-clusters in Cluster-I and three sub-clusters in Cluster-II. Analysis of molecular variance (AMOVA) showed more variation within individuals than among individuals, and less variation among populations. A marker-trait association analysis identified 31 SSR markers significantly associated with eight yield-related traits. These markers would be useful for the identification of novel genes/alleles and could be deployed as a key indicator of selection for a marker-assisted breeding program of rice.
... Based on the calculated Prevosti distance values, an automated clustering analysis on the top 50 PCs (principal components) was conducted ( Figure 4A). STRUCTURE v 2.3.4 [36] was used to study the population structure and genetic relations among the 50 maize landraces from the Lazio region. The BIC (Bayesian information criterion) values were used to estimate the optimal number of clusters that define the population structure. ...
... To reveal the number of clusters, a Bayesian analysis under an admixture model with correlated allele frequencies was performed with the program STRUCTURE 2.3.4 [53,54]. The potential number of genetic clusters (K) varied from 1 to 20. ...
Article
Understanding genetic diversity and structure in natural populations and their suitable habitat response to environmental changes is critical for the protection and utilization of germplasm resources. We evaluated the genetic diversity and structure of 24 A. chinensis populations using simple sequence repeat (SSR) molecular markers. The potential suitable distribution of tetraploid A. chinensis estimated under the current climate and predicted for the future climate was generated with ecological niche modeling (ENM). The results indicated that the polyploid populations of A. chinensis have high levels of genetic diversity and that there are distinct eastern and western genetic clusters. The population structure of A. chinensis can be explained by an isolation-by-distance model. The results also revealed that potentially suitable areas of tetraploids will likely be gradually lost and the habitat will likely be increasingly fragmented in the future. This study provides an extensive overview of tetraploid A. chinensis across its distribution range, contributing to a better understanding of its germplasm resources. These results can also provide the scientific basis for the protection and sustainable utilization of kiwifruit wild resources.
... software with an internal size standard of GeneScan™ 500LIZ™. Analysis of the genetic structure and membership coefficient (Q) was carried out through an alternative model-based Bayesian clustering analysis using STRUCTURE [12], and the optimum ∆K value was calculated and visualized using the STRUCTURE Harvester program [13]. ...
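
The ∆K statistic mentioned above is a second-order rate of change in the log-likelihood across successive K values, scaled by the spread among replicate runs. The sketch below shows one common way of computing it: the absolute second difference of the mean log-likelihoods divided by the standard deviation at K. Both this exact form and the log-likelihood values are assumptions for illustration, not output from any of the studies quoted here.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """lnp_by_k maps K -> list of log-likelihood estimates from replicate runs.
    Returns K -> delta K for every K that has both neighbours K-1 and K+1."""
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    out = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            # Absolute second difference of the mean log-likelihoods.
            second_diff = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
            # Scale by the between-run standard deviation at this K.
            out[k] = second_diff / stdev(lnp_by_k[k])
    return out

# Hypothetical log-likelihoods from three replicate runs per K.
lnp = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -892.0, -888.0],
    4: [-885.0, -880.0, -890.0],
}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

By construction the statistic is undefined at the smallest and largest K tested, so it cannot favour K = 1. It is also undefined when all replicates at a given K return identical values, since the standard deviation is then zero.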
Preprint
Understanding the state of the swamp buffalo population on Calayan Island is important to strengthen the conservation and management program in the country. This study aimed to provide insights into the morphology, population structure, and health profile of the swamp buffaloes on the island. In total, 35 fresh blood samples were analyzed using 27 polymorphic microsatellite markers to determine the population structure. Data were gathered on the morphological features of Calayan swamp buffaloes and served as the baseline information for the descriptive traits. Furthermore, samples were tested for surra and brucellosis using PCR and serological tests, respectively. Results showed that Calayan swamp buffaloes were morphologically bigger, except for body length, compared with other populations. Genotype analysis using microsatellite markers showed remarkable discriminatory power to distinguish distinct populations within the tested population and could discriminate subspecies of swamp and river types plus crossbreds. The study also reports the first incidence of surra and brucellosis on the island. Overall, the new insights on the newly detected Philippine carabao lineage on Calayan Island support ex-situ conservation and an animal health control strategy. The conservation strategy would encompass collecting, cryopreserving, and storing viable germplasms from local swamp buffalo on Calayan Island.
... Plink and R software [30,32] were employed to perform principal component analysis (PCA) based on filtered SNPs. The Bayesian clustering analysis was conducted using STRUCTURE v2.3.4 [33], with 50,000 iterations and a burn-in phase. Different K values from 1 to 5 were examined. ...
Article
Understanding the genomic characteristics of livestock is crucial for improving breeding efficiency and conservation efforts. However, there is a relative lack of information on the genetic makeup of local goat breeds in Henan, China. In this study, we identified runs of homozygosity (ROH), genomic inbreeding coefficients (FROH), and selection signatures in four breeds including Funiu White (FNW), Huai (HG), Lushan Bullleg (LS), and Taihang black (THB). The genomic analysis utilized a dataset of 46,278 SNP markers and 102 animals. A total of 342, 567, 1285, and 180 ROH segments were detected in FNW, HG, LS, and THB, respectively, with an average of 15.55, 29.84, 32.95, and 8.18 segments per individual. The lengths of ROH segments varied from 69.36 Mb in THB to 417.06 Mb in LS, with the most common lengths being 2-4 Mb and 4-8 Mb. The highest number of longest ROH segments (> 16 Mb) were found in LS (328) and the highest average FROH value was observed in LS (0.173), followed by HG (0.128), while the lowest FROH values were in THB (0.029) and FNW (0.070). Furthermore, the analysis of ROH islands and Composite Likelihood Ratio (CLR) identified a total of 175 significant genes. Among these, 25 genes were found to overlap, detected by both methods. These genes were associated with a diverse range of traits including reproductive ability (GPRIN3), weight (CCSER1), immune response (HERC5 and TIGD2), embryo development (NAP1L5), environmental adaptation (KLHL3, TRHDE, and IFNGR1), and milk characteristics (FAM13A). Significant Gene Ontology (GO) terms related to embryo skeletal system morphogenesis, brain ventricle development, and growth were also identified. This study helps reveal the genetic architecture of Henan goat breeds and provides valuable insights for the effective conservation and breeding programs of local goat breeds in Henan. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-024-11098-0.
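
The genomic inbreeding coefficient FROH referred to above is commonly defined as the fraction of the genome covered by runs of homozygosity. The sketch below assumes that definition. The segment lengths, the genome length, and the optional minimum-length filter are hypothetical and are not taken from the study.

```python
def froh(roh_lengths_mb, genome_length_mb, min_length_mb=0.0):
    """Fraction of the genome covered by ROH segments of at least min_length_mb."""
    covered = sum(x for x in roh_lengths_mb if x >= min_length_mb)
    return covered / genome_length_mb

# Hypothetical individual: four ROH segments on a 2,500 Mb autosomal genome.
segments_mb = [2.5, 4.0, 8.5, 16.0]
f_all = froh(segments_mb, 2500.0)
f_long = froh(segments_mb, 2500.0, min_length_mb=8.0)
print(f_all, f_long)
```

Restricting the sum to long segments is a common way to focus on recent inbreeding, since long runs of homozygosity tend to trace back to recent common ancestors.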
... Structure analysis (Pritchard et al., 2000) was used to estimate genetic clusters using the Bayesian approach, whereas DAPC (discriminant analysis of principal components) was used to estimate genetic groups by the non-Bayesian method (Jombart 2008). The Structure procedure included an admixture model, a 100,000 burn-in, 1,000,000 MCMC replications, and ten independent runs for each K, with the maximum number of clusters K = 9. ...
... We used two different approaches to assess the population structure for A. trapezoides: STRUCTURE v 2.3.4 (Pritchard et al., 2000), and Discriminant Analysis of Principal Components (DAPC) as implemented in the R package adegenet v 2.2.10 (Jombart et al., 2010). We ran STRUCTURE with 100,000 MCMC iterations following a burn-in of 50,000 iterations, using the admixture ancestry model and no sampling location as a prior, setting a putative K from 1 to 6 in a total of 15 independent iterations for each run (Falush et al., 2003). ...
... Genetic distances among the adzuki bean accessions and the Neighbour-joining (NJ) tree were constructed using Power Marker v.3.25. The population structure of adzuki bean accessions was determined using STRUCTURE v.2.3.4 software, which is based on a Bayesian model [23]. The number of clusters (k) ranged from 2 to 10, and five replications were run for each k value. ...
Article
Adzuki bean, an underutilized grain legume, has significant potential for enhancing food and nutritional security. The main obstacles to developing new cultivars and promoting the adzuki bean as a mainstream pulse crop are a lack of awareness about its potential and insufficient information on its genetic diversity. Here, we aimed to explore the untapped potential of adzuki bean germplasm by evaluating its agro-morphological traits and diversity at the molecular level and also to identify trait-specific germplasm by utilizing 100 adzuki bean accessions conserved in the Indian National Genebank. Significant variation was recorded for the morphological traits, and promising accessions exhibiting desirable traits were identified, such as early flowering (IC341945, EC340257 and EC340283), number of primary branches (IC341945 and IC469175), number of clusters per plant (EC000264, IC167611 and IC341939), number of pods per plant (IC469175, EC34264, EC000264), early maturity (EC340283, EC120460 and IC341941) and number of seeds per pod (EC340240, IC455396 and IC341955). Molecular characterization of diverse accessions using 22 polymorphic SSR markers identified a total of 50 alleles, with a mean of 2.27 alleles per locus. The polymorphic information content (PIC) ranged from 0.03 to 0.46, indicating informativeness of markers in distinguishing diverse accessions. Further, the gene diversity among the accessions ranged from 0.03 to 0.57 with a mean of 0.19. Population structure analysis grouped the accessions into three genetic groups, supported by Principal Coordinate Analysis (PCoA) and a phylogenetic tree. Additionally, Analysis of Molecular Variance (AMOVA) confirmed a substantial genetic diversity among the adzuki bean accessions. Thus, the combined assessment of agro-morphological traits and molecular markers effectively distinguished adzuki bean accessions and provided valuable insights in understanding untapped variation at both morphological and molecular levels.
The promising accessions identified in the study hold potential for integration into legume improvement programs through introgression breeding, contributing to the development of adzuki bean varieties with target traits.
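The PIC values quoted in the abstract above are computed per marker from allele frequencies. As a rough illustration only, the sketch below uses the widely cited Botstein-style formula, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The function name `pic` and the example frequencies are hypothetical and are not taken from the study, which may have used different software or settings.

```python
def pic(freqs):
    """Polymorphism information content for one marker.

    freqs: allele frequencies at the marker (hypothetical input; must sum to 1).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross


# Illustrative values only: a balanced biallelic marker and a skewed one.
balanced = pic([0.5, 0.5])
skewed = pic([0.9, 0.1])
```

Under this formula a marker with two equally frequent alleles has the highest PIC a biallelic marker can reach, and skewed frequencies give lower values.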
... We used the 120 SNP panel to genotype all the individuals in the natural population. As tree spatial isolation and selfing can limit gene dispersion and create fine-scale spatial genetic structure within the population (Vekemans and Hardy 2004), we explored the population genetic structuring of the natural population using Structure v. 2.3.4 (Pritchard, Stephens, and Donnelly 2000). We estimated the number of genetic clusters (K) by assigning individuals in undefined mixture clusters under a Bayesian framework. ...
Article
Species with extremely small population sizes are critically endangered because of reduced genetic diversity, increased inbreeding and hybridisation threats. Genomic tools significantly advance conservation by revealing genetic insights into endangered species, notably in monitoring frameworks. Sicilian fir (Abies nebrodensis) is the most endangered conifer in Europe with only 30 adult trees in an 84‐ha area. Using 20,824 SNPs from RAD‐seq, employing genome assembly and a custom 120 SNP‐array, we evaluated genetic diversity, mating patterns, and effective population size in adult trees, 118 natural seedlings, and 2064 nursery seedlings from past conservation actions. We assessed introgression from neighbouring non‐native fir plantations (~6%) and established an intra‐population assisted gene flow (AGF) program selecting the most genetically dissimilar individuals and investigating the outcome through simulations. Genomic analysis unveiled significant genetic diversity among adult Sicilian firs, comparable to non‐endangered Mediterranean firs with larger populations. However, the genetic diversity of the forthcoming generation declined due to high self‐fertilisation, leading to marked inbreeding (FIS = 0.38) and an alarmingly low effective population size (Ne = 6). Nursery seedling monitoring revealed similar selfing rates and introgression (~2%) from non‐native firs. Although intra‐population AGF could help to mitigate genetic loss, it may not alleviate the species vulnerability to imminent environmental challenges, perpetuating the risk of an extinction vortex. Hence, investigating the impact of Sicilian fir population decline and selfing on inbreeding depression, along with exploring the potential of hybrids for genetic load alleviation and future adaptation, is crucial for effective conservation strategies.
... It represents each data point as a convex combination of representative points, i.e. 'archetypes' (Cutler & Breiman, 1994; Mørup & Hansen, 2012). Archetypal analysis and conceptually similar methods, such as ADMIXTURE and STRUCTURE, have been widely applied in population genetics and genomics (Pritchard et al., 2000; Li et al., 2008; Alexander et al., 2009; Alexander & Lange, 2011; Gimbernat-Mayol et al., 2022). After identifying cultural clusters, we assessed potential differences in their cultural transmission. ...
Article
With its linguistic and cultural diversity, Austronesia is important in the study of evolutionary forces that generate and maintain cultural variation. By analysing publicly available datasets, we have identified four classes of cultural features in Austronesia and distinct clusters within each class. We hypothesized that there are differing modes of transmission and patterns of variation in these cultural classes and that geography alone would be insufficient to explain some of these patterns of variation. We detected relative differences in the verticality of transmission and distinct patterns of cultural variation in each cultural class. There is support for pulses and pauses in the Austronesian expansion, a west-to-east increase in isolation with explicable exceptions, and correspondence between linguistic and cultural outliers. Our results demonstrate how cultural transmission and patterns of variation can be analysed using methods inspired by population genetics.
... Genetic Structure: The genetic structure of naked barley varieties was analyzed in the Bayesian clustering algorithm-based program STRUCTURE ver. 2.3.4 [45] to visualize the genetic components of the varieties in different gene pools, based on the SSR genotypic data matrix. The running parameters were set as 100,000 burn-in period and 100,000 replicates. ...
Article
Naked barley (Hordeum vulgare var. nudum) is a staple food crop, contributing significantly to global food security. Understanding genetic diversity will facilitate its effective conservation and utilization. To determine genetic diversity and its distribution within and among varieties, we characterized 30 naked barley varieties from Tibet, representing the traditional, modern, and germplasm-resources-bank gene pools, by analyzing SSR molecular fingerprints. The results demonstrate abundant genetic diversity in Tibetan naked barley varieties, particularly those in the traditional gene pool, which holds many more private (unique) alleles. Principal coordinates and STRUCTURE analyses indicate substantial deviation of the modern varieties from the traditional and germplasm-resources-bank varieties. A considerable amount of seed mixture is detected in the modern varieties, suggesting the practice of using mixed seeds in modern-variety cultivation. Cluster analyses further indicate the narrow genetic background of the modern varieties, likely due to the limited number of traditional/germplasm-resources-bank varieties applied in breeding. Relationships between increases in genetic diversity and sample sizes within naked barley varieties highlight the importance of effective sampling strategies for field collections. The findings from this study have important implications for the sustainable utilization and effective conservation of different types of naked barley germplasm, both in Tibet and in other regions around the world.
... For this purpose, the software TASSEL v3.0 was employed and the GWAS was computed implementing two models: the general linear model (GLM) and the mixed linear model (MLM). The GLM (Pritchard et al., 2000) was performed taking into consideration population structure (Q matrix) to correct for genetic stratification. The membership of each individual in each subpopulation, represented by principal components (PCs), was further added to the model as covariates. ...
... An Analysis of Molecular Variance (AMOVA) was performed in GenAlEx with 9999 permutations to assess the partitioning of variation within and between groups (established meadows and seed cohorts). A Bayesian clustering analysis was then performed with Structure 2.3.4 (Pritchard, Stephens, and Donnelly 2000) for K2-K10 to identify genetic structure among established meadows and seed groups, separately, with the options admixture model, run length 100,000, 100,000 MCMC iterations, and correlated allele frequencies. Each K consisted of 10 independent runs. ...
Article
Aim Seed dispersal plays a key role in shaping the distribution and genetic complexity of seagrass populations and affects their resilience capacity under disturbance. The endemic seagrass Posidonia oceanica is a key component of Mediterranean coastal ecosystems, but knowledge about movement ecology in this species is limited, especially regarding seed movement pathways and dispersal potential. Location Western coast of Sicily (central Mediterranean). Methods Beach‐cast fruits of the Mediterranean seagrass P. oceanica were collected from nine localities along the Western coast of Sicily, along with adult shoots from eight putative donor meadows. We determined pair‐wise genetic differentiation between established meadows and seed cohorts. Genetic assignment tests were used to infer the most likely meadow of origin of individual seeds and were complemented with forward and backward Lagrangian simulations of dispersal. Results A significant genetic differentiation was found between seed pools and the most‐likely meadow of origin. The genetic assignment confirmed that seeds from the same cohort originated from multiple meadows and emphasised the presence of long‐distance‐dispersal (LDD) events (up to hundreds of km). Genetic connectivity appeared to be greater than that predicted by oceanographic simulations, which may reflect the longer temporal scales on which gene flow is shaped, in contrast to contemporary dispersal patterns. Lagrangian simulations highlighted that fruits were physically capable of dispersing beyond the study area and that the north Tunisian coast could be a key source of propagules for the populations studied. Main Conclusions Our study represents a significant step forward in the understanding of P. oceanica movement ecology and could guide meadows' conservation and restoration actions. 
Our findings are significant in a broader context outside of the research area and could be the basis of similar studies in other regions, especially considering the increasing number of fruiting events recorded across the Mediterranean likely associated with ocean warming.
... software with an internal size standard of GeneScan™ 500LIZ™. Analysis of the genetic structure and membership coefficient (Q) was conducted using an alternative model-based Bayesian clustering analysis using STRUCTURE [12] and the optimum ∆K value was visualized and calculated using the STRUCTURE Harvester program [13]. ...
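The ∆K statistic mentioned in the excerpt above is usually computed from the log-likelihoods that replicate STRUCTURE runs report for each K. The following is a minimal sketch of that calculation in the commonly used mean-based form, ∆K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)). The function name `delta_k` and the example log-likelihood values are invented for illustration and do not come from any of the cited studies.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta-K per K from replicate log-likelihoods.

    lnp_by_k: dict mapping K to a list of ln P(D|K) values, one per replicate
    run (hypothetical input format). K values lacking both neighbours, or with
    zero variance across runs, are skipped.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            # Absolute second-order rate of change of the mean log-likelihood.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Made-up replicate log-likelihoods for K = 1..4 (three runs each).
example_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-788.0, -780.0, -796.0],
}
dk = delta_k(example_runs)
best_k = max(dk, key=dk.get)
```

Note that ∆K is undefined for the smallest and largest K tested, so it cannot by itself indicate K = 1.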
Article
Understanding the state of the swamp buffalo population in Calayan Island is important to strengthen the conservation and management program in the country. This study aimed to provide insights into the morphology, population structure, and health profile of the swamp buffaloes on the island. In total, 35 fresh blood samples were analyzed using 27 polymorphic microsatellite markers to determine the population structure. Data were gathered for the morphological features of Calayan swamp buffaloes and served as the baseline information for the descriptive traits. Furthermore, samples were tested for surra and brucellosis using PCR and serological tests, respectively. The results showed that Calayan swamp buffaloes were morphologically bigger, except for body length, compared with other populations. A genotype analysis using microsatellite markers showed remarkable discriminatory power to distinguish distinct populations within the tested population, and could discriminate subspecies of swamp and river types plus crossbreds. The study also reports the first incidence of surra and brucellosis on the island. Overall, the new insights provided on the newly detected Philippine carabao lineage in Calayan Island are highly relevant to the ex situ conservation and animal health control strategy. The conservation strategy would encompass collecting, cryopreserving, and storing viable germplasms from local swamp buffalo on Calayan Island.
... Markers showing a lack of variability were removed from the final data set and individuals missing data for four or more markers were removed prior to analysis. It should be noted that although the mitochondrial and nuclear microsatellite datasets contain individuals sampled from the same sites, only a subset of individuals have been genotyped for both sets of markers. To assess population structure from microsatellite data, we used the Bayesian clustering program, STRUCTURE v.2.3.4 [38]. Initially, the program was run using the no admixture model with location priors assigned by collection site for 20 iterations of K1-K8 (300,000 generation burn-in, 200,000 generation sample). ...
Article
Background The mosquito Culex annulirostris Skuse (Diptera: Culicidae) is an important arbovirus vector in Australasia. It is part of the Culex sitiens subgroup that also includes Cx. palpalis and Cx. sitiens. Single locus mitochondrial and nuclear DNA sequencing studies suggest that Cx. annulirostris consists of a complex of at least two species. We tested this hypothesis by analysing both nuclear microsatellite data and additional mitochondrial sequence data to describe the population genetics of Cx. annulirostris through Australia, Papua New Guinea (PNG) and the Solomon Archipelago. Methods Twelve novel microsatellite markers for Cx. annulirostris were developed and used on over 500 individuals identified as Cx. annulirostris by molecular diagnostics. Ten of the 12 microsatellites were then used for analysis using Discriminant Analysis of Principal Components, a Bayesian clustering software, STRUCTURE, along with estimates of Jost’s D statistic that is similar to FST but better suited to microsatellite data. Mitochondrial cytochrome oxidase I (COI) DNA sequences were also generated, complementing previous work, and analysed for sequence diversity (Haplotype diversity, Hd and Pi, π), Tajima’s D, and pairwise FST between populations. An allele specific molecular diagnostic with an internal control was developed. Results We confirm the existence of multiple genetically and geographically restricted populations. Within mainland Australia, our findings show that Cx. annulirostris consists of two genetically and geographically distinct populations. One population extends through northern Australia and into the south-east coast of Queensland and New South Wales (NSW). The second Australian population occurs through inland NSW, Victoria, South Australia, extending west to southern Western Australia. These two Australian populations show evidence of possible admixture in central Australia and far north Queensland.
Australia’s Great Dividing Range that runs down southeast Australia presents a strong gene-flow barrier between these two populations which may be driven by climate, elevation or river basins. In PNG we find evidence of reproductive isolation between sympatric cryptic species occurring through PNG and Australia’s northern Cape York Peninsula. A PCR-based molecular diagnostic was developed to distinguish these two cryptic species. Conclusion This study adds to the growing body of work suggesting that the taxon presently known as Cx. annulirostris now appears to consist of at least two cryptic species that co-occur in northern Australia and New Guinea and can be distinguished by an ITS1 PCR diagnostic. The Solomon Islands population may also represent a distinct species but, in light of its geographic isolation and lack of sympatry with other species, would require further study. Additionally, the mitochondrial and nuclear DNA evidence of population structure between geographic regions within Australia appears to be latitudinally and elevationally driven and may suggest additional subspecies that hybridise where they overlap.
... To test for both overall genetic structure and within-region structure, we ran the clustering analyses and the PCA on the entire dataset, and on subsets of the Canadian populations only, the US populations only, and the US populations with the commercial populations. To test for population differentiation, we used Bayesian clustering analyses in Structure v2.3.4 (Pritchard et al. 2000; https://docs.alliancecan. ...
Article
Many species-at-risk reach the edge of their range as disjunct and isolated populations. These peripheral populations may harbour unique genetic diversity crucial for future range shifts, or they may lack genetic diversity due to isolation or small population size. As such it is important to assess the genetic diversity and differentiation of these populations. We used 1,838 SNPs to evaluate the differentiation and genetic diversity in six Canadian (peripheral) and eight US (core) populations of wood-poppy (Stylophorum diphyllum), a perennial wildflower that is endangered at the northern limit of its range. We also compared these 14 populations to seeds from two commercial seed providers to determine if commercial sources are introgressing into wild populations. We found strong differentiation between core and peripheral populations, low levels of gene flow among both core and peripheral populations, and low to moderate levels of genetic diversity across the range with a decrease in heterozygosity in peripheral populations. We also noted that the commercial populations were genetically distinct from all natural sampled populations, with no evidence of introgression between commercial seeds and either Canadian or US populations. Our study indicates that peripheral and core populations form unique conservation units and therefore the conservation and recovery of the wood-poppy in Canada is necessary to conserve the full range of genetic diversity within the species.
... We calculated the observed number of alleles (Na), effective number of alleles (Ne), Shannon's information index (I), and Nei's genetic diversity (h) values from the ISSR primer data using GenAlEx software (Peakall and Smouse 2012). A model based on a clustering algorithm (STRUCTURE v.2.3.4), which genetically divides groups according to allele frequencies (Pritchard et al. 2000), determined the genetic structures of the genotypes used in the study. This value is expressed as the K value, and the STRUCTURE program was run by setting the number of populations (K value from 1 to 10). ...
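The indices named in the excerpt above (Na, Ne, I and h) all derive from per-locus allele frequencies. As a hedged illustration, the sketch below uses the standard textbook definitions: Na counts alleles with non-zero frequency, Ne = 1 / Σ p_i², I = -Σ p_i ln p_i, and h = 1 - Σ p_i². The function name `diversity_indices` and the example frequencies are hypothetical, and GenAlEx may apply additional steps (for instance, estimating frequencies from dominant band data) that are not shown here.

```python
from math import log


def diversity_indices(freqs):
    """Per-locus diversity indices from allele frequencies (must sum to 1)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    present = [p for p in freqs if p > 0]
    sum_sq = sum(p * p for p in present)
    return {
        "Na": len(present),                       # observed number of alleles
        "Ne": 1.0 / sum_sq,                       # effective number of alleles
        "I": -sum(p * log(p) for p in present),   # Shannon's information index
        "h": 1.0 - sum_sq,                        # Nei's gene diversity
    }


# Illustrative biallelic locus with equal allele frequencies.
even_locus = diversity_indices([0.5, 0.5])
```

For a biallelic locus these definitions cap Ne at 2, I at ln 2 (about 0.693) and h at 0.5, which is a useful sanity check on reported values for two-state markers.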
Article
This research was conducted over a three-year period from 2021 to 2023, utilizing the ISSR marker system to determine the kinship relationships among populations obtained from seeds of lesser burnet (Sanguisorba minor Scop.) collected from the natural pasture areas of the Erzurum region. The study revealed genetic diversity among 42 burnet genotypes, including 40 genotypes and 2 control cultivars (Bünyan 80 and Altınova). The work with ISSR primers yielded PIC values ranging from 0.24 to 0.34, with an average of 0.30. The highest PIC value (0.34) was obtained from the UBC811 primer, while the lowest value (0.24) was from the UBC812 primer. Among the ISSR primers, UBC811 showed the highest Na, Ne, I, and h values with 2, 1.63, 0.54, and 0.37, respectively, whereas the UBC812 primer displayed the lowest Na, Ne, I, and h values with 2, 1.40, 0.42, and 0.26, respectively. When the results were evaluated as a whole, it was determined that there was no relationship among the collected lesser burnet (S. minor Scop.) populations solely in terms of geographical isolation. Additionally, we recorded that almost all of the 40 collected genotypes significantly differed from the commercial cultivars used as controls, indicating potential for the development of new cultivars.
... ac.cn/StructureSelector/) to determine an appropriate K value to determine the number of subpopulations in the population (Pritchard et al. 2000). PCoA analyses and AMOVAs were performed via GenAlEx 6.5b3 (Excoffier et al. 1992). ...
Article
Hippophae rhamnoides L. subsp. mongolica ‘Shenqiuhong’ is an excellent cultivar of sea buckthorn unique to China that is characterised by large fruits, high yield and no fruit drop even in winter, but this cultivar is poorly adapted to areas south of China’s northern latitude of 40°. In this study, 358 free-pollinated offspring of ‘Shenqiuhong’ were evaluated for genetic diversity using simple sequence repeat (SSR) markers to comprehensively resolve the genetic diversity information of this population. The results of the present study revealed that the ‘Shenqiuhong’ half-sib family presented high genetic diversity. The 12 pairs of SSR primers amplified a total of 228 clearly identifiable loci, with an average of 19 loci per marker. The mean values of the observed number of alleles (Na), effective number of alleles (Ne), Shannon’s information index (I) and polymorphism information content (PIC) of the primers were 2.000, 1.361, 0.359, and 0.588, respectively. The results of the structural analysis, neighbour-joining (NJ) cluster analysis, and principal coordinate analysis (PCoA) were consistent, suggesting that the 358 offspring could be divided into two subgroups. The results provide a theoretical basis and material support for sea buckthorn breeding.
... For these analyses, 24 individuals with no missing data were used to avoid distortion and increase reliability. The program STRUCTURE 2.3.4 was used to visualize population structure under 10 iterations of 400,000 MCMC generations, with a 100,000 burn-in period [82]. The most likely number of clusters was determined based on the Evanno method [83], implemented in STRUCTURE HARVESTER web 0.6.94 ...
Article
Southwest Primorye hosts approximately 9% of the remaining wild Amur tiger population and represents hope for the revival of tigers in Northeast China and the Korean peninsula. Decades of conservation efforts have led to a significant increase in population size, from less than 10 individuals surviving in the region in 1996 to multiple folds today. However, while the population size has recovered since the mid-1900s, the effects of genetic depletion on evolutionary potential are not easily reversed. In this study, a non-invasive genetic analysis of the Amur tiger subpopulation in Southwest Primorye was conducted using microsatellite loci and mitochondrial genes to estimate genetic diversity, relatedness, and determine the impact of historical demographic dynamics. A total of 32 individuals (16 males, 15 females, and 1 unidentified sex) were identified, and signs of bottlenecks were detected, reflecting past demographic events. Low genetic variation observed in mitochondrial DNA also revealed genetic depletion within the population. Most individuals were found to be closely related to each other, raising concerns about inbreeding given the small population size and somewhat isolated environment from the main population in Sikhote-Alin. These findings emphasize the urgent need to establish ecological corridors to neighboring areas to restore genetic diversity and ensure the conservation of the Amur tiger population in Southwest Primorye.
... The genetic distance matrix between apple cultivars was calculated and shown in Tab. 2. The JMP Statistical Software which presents the distances between individuals in a two-dimensional diagram, was used to perform the PCA analysis. In addition, major allele frequency, Nei's H (gene diversity) and PIC (polymorphism 62 (Pritchard et al., 2000). ...
Article
In this study, 'Ak Sakı' and 'Kara Sakı' apple cultivars were collected from different locations in Erzincan province, Türkiye, and genetic diversity was determined using the Start Codon Targeted (SCoT) marker technique. The SCoT marker technique was chosen because its gene targeting, long primer, and high annealing temperature make it more effective than other marker techniques. Using ten SCoT primers, 60 bands were obtained, and 42 of them were polymorphic. The polymorphism rate was determined to be 70%. The UPGMA (Unweighted Pair Group Method with Arithmetic mean) dendrogram created using the PAUP 4.0b10 program consists of two clades. The genetic distance between apple cultivars varies between 0.13462 and 0.45614. Principal Component Analysis (PCA) results were compatible with the UPGMA dendrogram. With the SCoT marker technique, genetic diversity among apple cultivars can be determined in a shorter time and with more reliable results.
... Identification of individual genetic profiles (genotypes) based on microsatellite loci, of genotype matches (recaptures), estimation of the probability of identity and assignment tests were performed in GenAlEx (Peakall & Smouse, 2006, 2012a, 2012b). Probability of identity was estimated using a conservative approach for small and closely related populations, by assuming populations of siblings (Waits et al., 2001). Assessment of genetic structure and posterior probability of assignment of individual genotypes to inferred genetic clusters was performed using STRUCTURE (Pritchard et al., 2000). LIFE17 NAT/PT/554 Action D1: Wolf activity monitoring and feeding ecology analysis post implementation of conservation actions ...
Technical Report
From 2020 to 2024, we continued implementing a large-scale, non-invasive integrated monitoring program to assess the demographic, genetic, and ecological status of the Iberian wolf population south of the Douro River and evaluate the effects of the LIFE WolFlux project conservation actions. The results confirm the success of the monitoring approach, although gathering data remains challenging, particularly in regions with extremely low wolf abundance. Law changes related to damage compensation could have affected forensic analysis sample size and wolf detection. Our findings indicate an increasingly precarious wolf status, with a decreased confirmed wolf range from 2019 to 2024. Also, confirmed reproduction events showed the same negative trend. Evidence suggests that connectivity levels varied with covered distance and location. It was higher in the central region (comprised by Leomil, Lapa and Trancoso packs), and lower in the western (Arada and Montemuro packs) and eastern areas (Almeida/border). Moreover, long distance extraterritorial movements were not detected during the project timeframe. However, the detection of first-generation migrants, even if low, suggests potential for connectivity among regions and possibly with wolf populations in Spain. Regarding feeding ecology, a slight increase in wild prey depredation was registered, yet the wolf population still shows a high dependence on human activity. Supported by fine-scale habitat modelling, we identify practical conservation tasks that should be implemented together with the projects and partners' broader conservation actions. We also recommend the continuity of the monitoring program to support this population’s recovery actions closely. (https://rewilding-portugal.com/wp-content/uploads/sites/3/2024/11/Technical-Report-of-Action-D1.pdf)
... To detect the population structure within the Rhododendron keiskei complex, we employed the ADMIXTURE program (v.1.3.0) (Alexander & al., 2009), which applies the same population model as STRUCTURE (Pritchard & al., 2000) to multilocus SNP genotype datasets. The number of clusters (K) was set from K = 1, …,10 for the three-taxon dataset. ...
Article
Full-text available
Unraveling species boundaries is pivotal for evolutionary biology and conservation endeavors. However, it proves challenging in instances where recent speciation is intertwined with complex demographic histories and natural selection processes. The Rhododendron keiskei complex, an evergreen rhododendron distributed in East Asia, consists of a widespread variety (R. keiskei var. keiskei) and a more restricted R. keiskei var. hypoglaucum. Intriguingly, the latter is exceptionally rare yet displays a disjunction that spans approximately 1100 km. This study aimed to elucidate the evolutionary backgrounds of the enigmatic disjunctions of R. keiskei var. hypoglaucum and to propose species delimitation within the species complex. An integrative approach, combining genomic data (MIG‐seq and GBS‐derived SNPs) with Scanning Electron Microscopy analysis of leaf microstructures, was adopted in this study. Phylogenetic analyses revealed significant divergence among the studied rhododendrons. Genetic demographic analyses favored the population models that assumed non‐monophyly of two disjunct populations of R. keiskei var. hypoglaucum, indicating their independent origins. Recent gene flow between the widespread R. keiskei var. keiskei and “var. hypoglaucum” populations was limited due to geographic and habitat isolation factors, even in areas where their distributions overlap. Detailed morphological assessments detected distinctions between morphologically similar “var. hypoglaucum” populations based on leaf microstructures and flowering habits. Our study has shown that the apparent disjunctions of rare rhododendrons are more likely attributed to morphological convergence, possibly due to similar environmental selections in unrelated taxa. The finding highlights the importance of an integrative approach for resolving taxonomic challenges in plant species complexes.
Article
Background Corema album, a wild shrub endemic to the Atlantic coastal dunes of the Iberian Peninsula, has important ecological and medicinal value. Corema album has been listed as an endangered plant due to threats associated with anthropogenic events. Conservation of the species is therefore critically important. Objective The aim of this study was to assess the genetic diversity and population structure of four well-characterized populations to support pre-breeding programs aimed at enhancing species knowledge for a future berry crop culture, with the ultimate goal of protecting and conserving the threatened species C. album. Methods The set of 128 C. album accessions was genotyped using six EST-SSR primers and one genomic-SSR primer. Results A total of 65 alleles were obtained with a mean of 9.285 alleles per locus. The genetic diversity was found to be high, with mean values of observed heterozygosity (Ho) and expected heterozygosity (He) at 0.594 and 0.756, respectively. Moderate genetic differentiation among populations (FST = 0.095) was observed. STRUCTURE analysis and multivariate analyses roughly categorized the accessions into a complex mixed group. The STRUCTURE analysis revealed no correlation between genetic structure and geographical distribution; however, it facilitated the identification of pure accessions within each population. Conclusions The results aid in understanding the population structure and genetic diversity and are crucial to optimize conservation strategies as well as the utilization of genetic resources of C. album for a deeper genetic study of the species.
Article
The genus Baccharis in Chile is an extraordinary example of admixture, previously described only morphologically and chemically. In Chile, the genus forms a homoploid complex with at least 16 species and 21 hybrids. Genotyping‐by‐Sequencing (GBS) was used to clarify the hybrid character of Baccharis × intermedia, which originated from the species B. macraei and B. linearis. Additionally, B. vernalis, another species with a morphological resemblance to B. macraei, was subjected to analysis to ascertain its role in the hybridization process. A total of 11,006 SNPs and 72 individuals were analysed using Treemix, D‐ and f‐statistics, which revealed genetic evidence of hybridization between B. macraei and B. linearis. Furthermore, other genetic indicators, such as a high level of heterozygosity, also provided evidence of the hybrid nature of Baccharis × intermedia. Additionally, one individual exhibited strong genetic proportions derived from B. vernalis, B. macraei, and B. linearis. Distinct individuals were clustered using sparse Non‐Negative Matrix Factorization (sNMF) into five distinct groups, representing the described species and the hybrid. B. macraei exhibited division into a northern and a southern subpopulation. The morphological and chemical evidence of the hybrid character of Baccharis × intermedia is corroborated by genetic data. Further, the most likely evolutionary scenario is a hybrid swarm. Genetic differentiation between B. linearis and B. macraei indicates separation prior to secondary contact. The close relationship of B. macraei and B. vernalis was confirmed, suggesting that it may emerge as a vicariant species on different soil types.
Article
Genetic diversity is an important attribute of populations, essential for understanding the ecological and evolutionary processes affecting them and assessing their health status. In Hymenoptera, such as eusocial bees, colony management can influence genetic diversity in both natural and managed populations. Management can impact admixture, increasing the number of alleles due to colony displacement and decreasing the number of alleles in natural populations due to colony extraction. In this study, we analyzed genetic diversity in natural and managed colonies as well as in drone congregations of Scaptotrigona mexicana (Guérin), to assess genetic diversity, patterns of genetic structure and gene flow, and the presence of diploid males. We identified three distinct genetic groups: Northern, Central, and Southern. Although genetic differentiation and limited gene flow among genetic groups were evident, we detected significant gene flow from wild to managed populations, suggesting that natural populations can be an important reservoir of genetic diversity. The highest genetic diversity was found in the Northern group, composed of managed localities. This is likely due to the introduction of new alleles through colony translocation. Notably, some loci exhibited more than three alleles in localities where all analyzed individuals were from the same colony, indicating possible polyandry in the species. We also detected diploid males, which suggests inbreeding and/or inefficient mechanisms for their elimination from the colony. Our results provide an initial assessment of genetic diversity in both natural and managed populations, as well as in drone congregations of S. mexicana.
Article
Ecological restoration requires large-scale reintroductions of plants, but their genetic basis is a controversial issue. Formerly, non-local seed sourcing of naturally occurring herbaceous species was common practice. Here we test whether the genetic pattern of the earlier introduction of non-local seeds of Leucanthemum vulgare agg. (ox-eye daisy) can still be detected several years after the application and whether it differs from that of the regional gene pool. We collected leaf material of the ox-eye daisy in Central Germany on sites of indigenous populations (I) and those formerly restored with non-local seed sources (R). Genome sizes and population genetic patterns (AFLP) were analysed. Genome size estimates of most of the individuals studied suggest that most ox-eye daisies in the region have similar genome sizes regardless of their origin, while individuals from two indigenous populations from the most northwestern part of the study area had lower 1C values. All populations were genetically diverse, and the former use of non-local genotypes of the species could not be detected up to more than 8 years after the establishment of the populations. The results show that a recommendation for restoration purposes is unequivocal; it can only be concluded that it will be best to use seeds that are local and/or similar to the sites intended for sowing.
Article
Understanding how ecological, environmental and geographic features influence population genetic patterns provides crucial insights into a species' evolutionary history, as well as their vulnerability or resilience under climate change. In the Southern Ocean, population genetic variation is influenced across multiple spatial scales ranging from circum‐Antarctic, which encompasses the entire continent, to regional, with varying levels of geographic separation. However, comprehensive analyses testing the relative importance of different environmental and geographic variables on genomic variation across these scales are generally lacking in the Southern Ocean. Here, we examine genome‐wide single nucleotide polymorphisms of the Southern Ocean octopus Pareledone turqueti across the Scotia Sea and the Antarctic continental shelf, at depths between 102 and 1342 m, throughout most of this species' range. The circumpolar distribution of P. turqueti is biogeographically structured with a clear signature of isolation‐by‐geographical distance, but with long‐distance genetic connectivity also detected between East and West Antarctica. Genomic variation of P. turqueti was also associated with bottom water temperature at a circumpolar scale, driven by a genotype‐temperature association with the warmer sub‐Antarctic Shag Rocks and South Georgia. Within the Scotia Sea, geographic distance, oxygen and fine‐scale isolation‐by‐water depth were apparent drivers of genomic variation at regional scales. Putative positive selection of haemocyanin (oxygen transport protein), calcium ion transport and genes linked to RNA modification, detected within the Scotia Sea, suggest physiological adaptation to the regional sharp temperature gradient (~0–+2°C). Overall, we identified seascape drivers of genomic variation in the Southern Ocean at circumpolar and regional scales in P. turqueti and contextualised the role of environmental adaptations in the Southern Ocean.
Article
This study investigated the genetic diversity and pathogenicity of Pyricularia oryzae isolates collected from rice resistance lines in Lampung, Sukabumi, and Banyumas, Indonesia. A total of 38 isolates were collected, with the majority from Lampung. Pathotype variation analysis revealed diverse races across the regions, with significant differences in susceptibility observed among rice varieties. SSR markers showed four clusters based on host introgression types. Avr-ACE1 and mating type markers revealed clustering by geography and mating type. SNP markers indicated distinct relationships in rice hosts, particularly Banyumas and Sukabumi rice with Indica, Indica-Oryza rufipogon, and Japonica-Indica-O. rufipogon introgression, showing close relatedness. In contrast, Lampung rice, dominated by the J type of introgression, formed a separate cluster and exhibited the highest diversity of Avr-ACE1 alleles and mating types, suggesting strains with extensive resistance gene introgression are suitable for cultivation. Parasexual recombination was identified as the primary mechanism for DNA exchange in P. oryzae. Host responses indicated the Cisokan variety’s resistance to all blast pathogens, highlighting its potential as a donor for blast resistance in local rice improvement programs. These findings contribute to understanding the genetic interactions between P. oryzae and rice hosts, aiding in the development of more effective blast disease management strategies.
Article
Selection of parents for hybridization relies on information about their genetic relationship and diversity, which is essential in any breeding program. This study aimed to estimate the extent of genetic diversity and population structure of 76 sugarcane accessions from seven regions in the Philippines using 57 morphological characters and 50 microsatellite markers. The sugarcane collections exhibited moderate to high diversity, with a mean of H’ = 0.72 for qualitative and H’ = 0.75 for quantitative morphological characters. This is corroborated by the analysis of variance (ANOVA) of agronomic parameters, except for stalk length. Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis based on morphology subdivided the accessions into 31 clusters, which reveals phenotypic variability among sugarcane samples. The fingerprints of the 76 accessions were also evaluated using 45 Saccharum-based genomic SSR and 5 EST-SSR primer pairs to measure genetic diversity and population structure. Based on UPGMA, a total of six clusters were generated with a 0.65 coefficient of dissimilarity, and the sugarcane accessions were further subdivided into five major sub-populations. Out of 50 markers used, 41 (82%) were found to be highly informative, with a mean PIC value of 0.69. As expected, Saccharum-based genomic SSRs were more polymorphic (92%) than EST-SSRs (82%), since the latter preferably amplify in more conserved and expressed sequences in the genome. Out of 2,850 pairwise combinations based on the Jaccard coefficient index, a large number of diverse parental combinations (genetic dissimilarity = 0.51–0.70) were observed, indicating substantial diversity in the existing breeding pool of IPB-UPLB for genetic improvement. Cluster analysis based on UPGMA, STRUCTURE analysis, and Principal Coordinate Analysis (PCoA) were predominantly consistent. However, no association was observed between the geographical origin and genetic distance of the genotypes based on molecular data. The results showed that accessions were grouped into five sub-populations and that genetic differentiation within sub-populations was higher (85%) than between sub-populations (15%) based on the analysis of molecular variance (AMOVA), suggesting an active exchange of the genetic pool across provinces and regions of both Luzon and Visayas islands. The findings from this study will be useful for future breeding efforts by exploiting genetic variation existing in the current breeding population.
Article
Onion, a member of the Allium genus, stands out as the most extensively cultivated species in the Indian sub-continent, possessing remarkable potential for export. To enhance bulb yield, overall quality, and to fortify the resistance against both biotic and abiotic stresses, agro-morphological and molecular characterization is of utmost significance. Genetic diversity in 49 onion genotypes was assessed using six DUS descriptors, 19 quantitative traits along with 13 ISSR markers. Among DUS descriptors, bulb: basic color of dry skin exhibited the highest diversity index (1.44). Mahalanobis D2 statistic grouped the genotypes into seven clusters with the highest inter-cluster distance between clusters V and VII (364.35). A total of 78 fragments were produced from 13 polymorphic primers with a mean of six alleles per primer. The polymorphic information content (PIC) ranged from 0.42 (UBC 835) to 0.75 (UBC 825) with an average of 0.61 per primer. Cluster analysis using UPGMA algorithm divided genotypes into two major clusters, whereas the cluster tree identified three major groups. The structure analysis divided the population into two main groups. Based on genetic distance, Genotypes no. 13 (ON20-11) and 28 (ON20-49) were found most diverse and also exhibited considerable resistance to stemphylium blight and thrips incidence. Mantel’s test showed a moderate positive correlation between agro-morphological (DUS descriptors) and molecular data. Thus, integrated morphological and molecular characterization followed by hybridization can facilitate onion breeding programs to introgress desired traits like resistance to stemphylium blight and thrips incidence into elite cultivars.
Article
Full-text available
Factors such as life history traits, environmental conditions, and landscape characteristics influence genetic diversity and structure. Rivers act as corridors that aid dispersal and gene flow among riparian species, such as the bamboo Guadua trinii, commonly known as the “tacuara brava”, which grows along riversides and gallery forests in South America. We examined how the topography, river connectivity, environmental variables, and habitat suitability influence functional connectivity of G. trinii in the Atlantic Forest of Misiones, Argentina. We also assessed populations both inside and outside the confines of Iguazú National Park using nine microsatellite markers. Our findings revealed high genetic diversity (HE = 0.50) and low genetic structure (FST = 0.068), indicating substantial gene flow among populations. Genetic differentiation was primarily influenced by river connectivity, followed by precipitation during the wettest month (BIO13) and elevation; geographic distance did not have a significant effect on genetic differentiation. Within the study area, niche modeling showed the highest suitability of G. trinii, suggesting high connectivity between populations. Levels of genetic diversity and population differentiation did not significantly differ between protected and unprotected areas. These results underscore the pivotal role of river connectivity in preserving genetic diversity, despite ongoing forest degradation and landscape modification.
Article
The cultivation of nearly 10,000 indigenous rice landraces in the North-Eastern Hill (NEH) region by various ethnic groups creates opportunities for the utilization of unique landraces through systematic evaluation of genetic variability. In the present study, a set of 102 rice landraces were assessed based on morphological and SSR markers, and five checks in augmented design vis-à-vis high-yielding rice genotypes with stable performance were identified. The presence of high estimates of heritability, genotypic coefficient of variation, and genetic advance over mean indicated the predominance of additive gene action, which necessitated the effectiveness of selection in augmenting productivity. A total of 83.73% of the total variation was accounted for by the first five principal components. A total of 132 alleles were detected, with an average of 3 alleles per locus. The PIC values ranged from 0.01 to 0.70, with an average of 0.40. Based on FST value (5.1%), significant differences between the genotypes of Arunachal Pradesh and Sikkim were observed. The percentage of variation among the population, among individuals within the population, and within individuals was 5.14, 75.66, and 19.2%, respectively. Both Nei’s genetic distance and model-based clustering differentiated the genotypes into five distinct clusters. Principal coordinate analysis illustrated that the genotypes of Manipur were scattered in all quadrants, showing that they are highly diverse, while the genotypes of Nagaland, Sikkim, and Meghalaya were found together, which represents the chance of mixing of the population at a certain point in time. Markers, namely RM 474, OSR 13, RM 413, and RM 259, were found to be associated with key traits for increasing the yielding ability of the plant. In a stability evaluation based on AMMI analysis and the multi-trait genotype-ideotype distance index (MGIDI), genotypes, namely Jyotrirmayie, RCPL 1–411, Tsamum firri, Ching Phouren, Rato Bhan Joha, MN-47, and Tara bali, were selected with higher yield potential.
Article
Full-text available
While anadromous salmonids reproduce in fresh water, most harvests occur at sea. Effective genetic management requires knowledge of the stock (source population) composition of the harvest. This is accomplished with genetic stock identification (GSI), which compares the genotypes of harvested fish with those of freshwater stocks, assuming that all candidate stocks are identified and that their allele frequencies are known exactly. We develop methods that: (1) allow for sampling error in allele frequencies of candidate stocks, and (2) evaluate the possibility of unsampled contributing stocks. Composition analysis for chinook salmon (Oncorhynchus tshawytscha) collected for the Bonneville Dam egg bank program in 1980 and 1981 shows that about 10% of both harvests were from the Deschutes River and about 90% from the Hanford Reach area. Contributions from lower Columbia and Snake River stocks or from unidentified sources were limited.
Article
Full-text available
As currently defined, DNA fingerprint profiles do not uniquely identify individuals. For criminal cases involving DNA evidence, forensic scientists evaluate the conditional probability that an unknown, but distinct, individual matches the crime sample, given that the defendant matches. Estimates of the conditional probability of observing matching profiles are based on reference populations maintained by forensic testing laboratories. Each of these databases is heterogeneous, being composed of subpopulations of different heritages. This heterogeneity has an impact on the weight of the evidence. A hierarchical Bayes model is formulated that incorporates the key physical characteristics inherent in these data. With the help of Markov chain Monte Carlo sampling, levels of heterogeneity are estimated for three major ethnic groups in the database of Lifecodes Corporation.
Article
Full-text available
Genetic variation at hypervariable loci is being used extensively for linkage analysis and individual identification, and may be useful for inter-population studies. Here we show that polymorphic microsatellites (primarily CA repeats) allow trees of human individuals to be constructed that reflect their geographic origin with remarkable accuracy. This is achieved by the analysis of a large number of loci for each individual, in spite of the small variations in allele frequencies existing between populations. Reliable evolutionary relationships could also be established in comparisons among human populations but not among great ape species, probably because of constraints on allele length variation. Among human populations, diversity of microsatellites is highest in Africa, which is in contrast to other nuclear markers and supports the hypothesis of an African origin for humans.
Article
Full-text available
A method is proposed for allowing for the effects of population differentiation, and other factors, in forensic inference based on DNA profiles. Much current forensic practice ignores, for example, the effects of coancestry and inappropriate databases and is consequently systematically biased against defendants. Problems with the 'product rule' for forensic identification have been highlighted by several authors, but important aspects of the problems are not widely appreciated. This arises in part because the match probability has often been confused with the relative frequency of the profile. Further, the analogous problems in paternity cases have received little attention. The proposed method is derived under general assumptions about the underlying population genetic processes. Probabilities relevant to forensic inference are expressed in terms of a single parameter whose values can be chosen to reflect the specific circumstances. The method is currently used in some UK courts and has important advantages over the 'Ceiling Principle' method, which has been criticized on a number of grounds.
Article
Full-text available
Attempts to study the genetic population structure of large mammals are often hampered by the low levels of genetic variation observed in these species. Polar bears have particularly low levels of genetic variation with the result that their genetic population structure has been intractable. We describe the use of eight hypervariable microsatellite loci to study the genetic relationships between four Canadian polar bear populations: the northern Beaufort Sea, southern Beaufort Sea, western Hudson Bay, and Davis Strait-Labrador Sea. These markers detected considerable genetic variation, with average heterozygosity near 60% within each population. Interpopulation differences in allele frequency distribution were significant between all pairs of populations, including two adjacent populations in the Beaufort Sea. Measures of genetic distance reflect the geographic distribution of populations, but also suggest patterns of gene flow which are not obvious from geography and may reflect movement patterns of these animals. Distribution of variation is sufficiently different between the Beaufort Sea populations and the two more eastern ones that the region of origin for a given sample can be predicted based on its expected genotype frequency using an assignment test. These data indicate that gene flow between local populations is restricted despite the long-distance seasonal movements undertaken by polar bears.
Article
Full-text available
In DNA profile analysis, uncertainty arises due to a number of factors such as sampling error, single bands and correlations within and between loci. One of the most important of these factors is kinship: criminal and innocent suspect may share one or more bands through identity by descent from a common ancestor. Ignoring this uncertainty is consistently unfair to innocent suspects. The effect is usually small, but may be important in some cases. The report of the US National Research Committee proposed a complicated, ad-hoc and overly-conservative method of dealing with some of these problems. We propose an alternative approach which addresses directly the effect of kinship. Whilst remaining conservative, it is simple, logically coherent and makes efficient use of the data.
Article
Full-text available
Immigration is an important force shaping the social structure, evolution, and genetics of populations. A statistical method is presented that uses multilocus genotypes to identify individuals who are immigrants, or have recent immigrant ancestry. The method is appropriate for use with allozymes, microsatellites, or restriction fragment length polymorphisms (RFLPs) and assumes linkage equilibrium among loci. Potential applications include studies of dispersal among natural populations of animals and plants, human evolutionary studies, and typing zoo animals of unknown origin (for use in captive breeding programs). The method is illustrated by analyzing RFLP genotypes in samples of humans from Australian, Japanese, New Guinean, and Senegalese populations. The test has power to detect immigrant ancestors, for these data, up to two generations in the past even though the overall differentiation of allele frequencies among populations is low.
Article
Full-text available
We analyzed the European genetic contribution to 10 populations of African descent in the United States (Maywood, Illinois; Detroit; New York; Philadelphia; Pittsburgh; Baltimore; Charleston, South Carolina; New Orleans; and Houston) and in Jamaica, using nine autosomal DNA markers. These markers either are population-specific or show frequency differences >45% between the parental populations and are thus especially informative for admixture. European genetic ancestry ranged from 6.8% (Jamaica) to 22.5% (New Orleans). The unique utility of these markers is reflected in the low variance associated with these admixture estimates (SEM 1.3%-2.7%). We also estimated the male and female European contribution to African Americans, on the basis of informative mtDNA (haplogroups H and L) and Y Alu polymorphic markers. Results indicate a sex-biased gene flow from Europeans, the male contribution being substantially greater than the female contribution. mtDNA haplogroups analysis shows no evidence of a significant maternal Amerindian contribution to any of the 10 populations. We detected significant nonrandom association between two markers located 22 cM apart (FY-null and AT3), most likely due to admixture linkage disequilibrium created in the interbreeding of the two parental populations. The strength of this association and the substantial genetic distance between FY and AT3 emphasize the importance of admixed populations as a useful resource for mapping traits with different prevalence in two parental populations.
Article
Full-text available
We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. We follow Dempster in examining the posterior distribution of the log-likelihood under each model, from which we derive measures of fit and complexity (the effective number of parameters). These may be combined into a Deviance Information Criterion (DIC), which is shown to have an approximate decision-theoretic justification. Analytic and asymptotic identities reveal the measure of complexity to be a generalisation of a wide range of previous suggestions, with particular reference to the neural network literature. The contributions of individual observations to fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. The procedure is illustrated in a number of examples, and throughout it is emphasised that the required quantities are trivial to compute in a Markov chain Monte Carlo analysis, and require no analytic work for new...
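As a hedged illustration of how the fit and complexity measures described above are usually combined, the standard textbook form of the criterion can be written as follows. The symbols (deviance D, posterior mean deviance, posterior mean parameter value, effective number of parameters p_D) are notation assumed here for illustration and are not quoted from the abstract.

```latex
% Deviance for parameter \theta and data y (up to an additive constant)
D(\theta) = -2 \log p(y \mid \theta)

% Fit: posterior mean deviance
\bar{D} = \mathbb{E}_{\theta \mid y}\left[ D(\theta) \right]

% Complexity: effective number of parameters
p_D = \bar{D} - D(\bar{\theta}), \qquad \bar{\theta} = \mathbb{E}[\theta \mid y]

% Deviance Information Criterion
\mathrm{DIC} = \bar{D} + p_D = D(\bar{\theta}) + 2\, p_D
```

Both quantities are averages over posterior draws, which is why they are straightforward to obtain from Markov chain Monte Carlo output.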
Article
The utilization of DNA evidence in cases of forensic identification has become widespread over the last few years. The strength of this evidence against an individual standing trial is typically presented in court in the form of a likelihood ratio (LR) or its reciprocal (the profile match probability). The value of this LR will vary according to the nature of the genetic relationship between the accused and other possible perpetrators of the crime in the population. This paper develops ideas and methods for analysing data and evaluating LRs when the evidence is based on short tandem repeat profiles, with special emphasis placed on a Bayesian approach. These are then applied in the context of a particular quadruplex profiling system used for routine case-work by the UK Forensic Science Service.
Article
In the context of Bayes estimation via Gibbs sampling, with or without data augmentation, a simple approach is developed for computing the marginal density of the sample data (marginal likelihood) given parameter draws from the posterior distribution. Consequently, Bayes factors for model comparisons can be routinely computed as a by-product of the simulation. Hitherto, this calculation has proved extremely challenging. Our approach exploits the fact that the marginal density can be expressed as the prior times the likelihood function over the posterior density. This simple identity holds for any parameter value. An estimate of the posterior density is shown to be available if all complete conditional densities used in the Gibbs sampler have closed-form expressions. To improve accuracy, the posterior density is estimated at a high density point, and the numerical standard error of resulting estimate is derived. The ideas are applied to probit regression and finite mixture models.
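The identity mentioned above ("prior times the likelihood function over the posterior density") can be written out as a short worked equation. This is a sketch in assumed notation: m(y) for the marginal likelihood, f for the likelihood, π for prior and posterior densities, and θ* for the chosen high-density evaluation point; none of these symbols appear in the abstract itself.

```latex
% Bayes' theorem rearranged; holds for any parameter value \theta^*
m(y) = \frac{f(y \mid \theta^*)\, \pi(\theta^*)}{\pi(\theta^* \mid y)}

% On the log scale, with \hat{\pi} an estimate of the posterior ordinate
% obtained from the Gibbs output
\log \hat{m}(y) = \log f(y \mid \theta^*) + \log \pi(\theta^*) - \log \hat{\pi}(\theta^* \mid y)
```

Only the posterior ordinate in the denominator has to be estimated from the simulation output; the likelihood and prior are evaluated directly at θ*.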
Article
We provide a detailed, introductory exposition of the Metropolis-Hastings algorithm, a powerful Markov chain method to simulate multivariate distributions. A simple, intuitive derivation of this method is given along with guidance on implementation. Also discussed are two applications of the algorithm, one for implementing acceptance-rejection sampling when a blanketing function is not available and the other for implementing the algorithm with block-at-a-time scans. In the latter situation, many different algorithms, including the Gibbs sampler, are shown to be special cases of the Metropolis-Hastings algorithm. The methods are illustrated with examples.
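To make the algorithm named above concrete, here is a minimal sketch of a random-walk Metropolis-Hastings sampler in Python. It is an illustration under stated assumptions, not the exposition from the cited article: the function names (`metropolis_hastings`, `log_std_normal`), the Gaussian random-walk proposal, the step size, and the standard normal target are all hypothetical choices made for this example.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a one-dimensional target.

    log_target: function returning the log of an unnormalised target density.
    Returns the list of sampled states and the acceptance rate.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_target(proposal)
        # With a symmetric proposal the Hastings ratio reduces to the
        # ratio of target densities (compared on the log scale).
        log_alpha = log_p_new - log_p
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x, log_p = proposal, log_p_new
            accepted += 1
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples, accepted / n_steps


def log_std_normal(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


samples, acc_rate = metropolis_hastings(log_std_normal, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

In practice one would usually discard an initial stretch of the chain and tune `step_size` so that the acceptance rate is neither close to 0 nor close to 1.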
Article
New methodology for fully Bayesian mixture analysis is developed, making use of reversible jump Markov chain Monte Carlo methods that are capable of jumping between the parameter subspaces corresponding to different numbers of components in the mixture. A sample from the full joint distribution of all unknown variables is thereby generated, and this can be used as a basis for a thorough presentation of many aspects of the posterior distribution. The methodology is applied here to the analysis of univariate normal mixtures, using a hierarchical prior model that offers an approach to dealing with weak prior information while avoiding the mathematical pitfalls of using improper priors in the mixture context.
Article
Recently founded populations represent an enormous challenge for genetic analysis: new populations are often genetically impoverished, making it hard to find sufficiently variable markers, and what little variation is present tends to be ancestral, rendering phylogenetic methods inappropriate. Recently, novel genetic markers and new statistical analyses have made multilocus genotyping an invaluable tool in the fledgling field of nonequilibrium population genetics. Such advances are not of mere academic interest but address questions of great economic, medical and conservation significance.
Article
To test hypotheses about the origin of modern humans, we analyzed mtDNA sequences, 30 nuclear restriction-site polymorphisms (RSPs), and 30 tetranucleotide short tandem repeat (STR) polymorphisms in 243 Africans, Asians, and Europeans. An evolutionary tree based on mtDNA displays deep African branches, indicating greater genetic diversity for African populations. This finding, which is consistent with previous mtDNA analyses, has been interpreted as evidence for an African origin of modern humans. Both sets of nuclear polymorphisms, as well as a third set of trinucleotide polymorphisms, are highly consistent with one another but fail to show deep branches for African populations. These results, which represent the first direct comparison of mtDNA and nuclear genetic data in major continental populations, undermine the genetic evidence for an African origin of modern humans.
Article
Our goal is to infer, from human genetic data, general patterns as well as details of human evolutionary history. Here we present the results of an analysis of genetic data at the level of the individual. A tree relating 144 individuals from 12 human groups of Africa, Asia, Europe, and Oceania, inferred from an average of 75 DNA polymorphisms/individual, is remarkable in that most individuals cluster with other members of their regional group. In order to interpret this tree, we consider the factors that influence the tree pattern, including the number of genetic loci examined, the length of population isolation, the sampling process, and the extent of gene flow among groups. Understanding the impact of these factors enables us to infer details of human evolutionary history that might otherwise remain undetected. Our analyses indicate that some recent ancestor(s) of each of a few of the individuals tested may have immigrated. In general, the populations within regional groups appear to have been isolated from one another for <25,000 years. Regional groups may have been isolated for somewhat longer.
Article
We examine the issue of population stratification in association-mapping studies. In case-control studies of association, population subdivision or recent admixture of populations can lead to spurious associations between a phenotype and unlinked candidate loci. Using a model of sampling from a structured population, we show that if population stratification exists, it can be detected by use of unlinked marker loci. We show that the case-control-study design, using unrelated control individuals, is a valid approach for association mapping, provided that marker loci unlinked to the candidate locus are included in the study, to test for stratification. We suggest guidelines as to the number of unlinked marker loci to use.
Article
Richardson and Green (1997) present a method of performing a Bayesian analysis of data from a finite mixture distribution with an unknown number of components. Their method is a Markov Chain Monte Carlo (MCMC) approach, which makes use of the "reversible jump" methodology described by Green (1995). We describe an alternative MCMC method which views the parameters of the model as a (marked) point process, extending methods suggested by Ripley (1977) to create a Markov birth-death process with an appropriate stationary distribution. Our method is easy to implement, even in the case of data in more than one dimension, and we illustrate it on both univariate and bivariate data. Keywords: Bayesian analysis, Birth-death process, Markov process, MCMC, Mixture model, Model Choice, Reversible Jump, Spatial point process 1 Introduction Finite mixture models are typically used to model data where each observation is assumed to have arisen from one of k groups, each group being suitably modelle...
Article
Markov chain Monte Carlo methods for Bayesian computation have until recently been restricted to problems where the joint distribution of all variables has a density with respect to some fixed standard underlying measure. They have therefore not been available for application to Bayesian model determination, where the dimensionality of the parameter vector is typically not fixed. This paper proposes a new framework for the construction of reversible Markov chain samplers that jump between parameter subspaces of differing dimensionality, which is flexible and entirely constructive. It should therefore have wide applicability in model determination problems. The methodology is illustrated with applications to multiple change-point analysis in one and two dimensions, and to a Bayesian comparison of binomial experiments.
Article
In a Bayesian analysis of finite mixture models, parameter estimation and clustering are sometimes less straightforward than might be expected. In particular, the common practice of estimating parameters by their posterior mean, and summarising joint posterior distributions by marginal distributions, often leads to nonsensical answers. This is due to the so-called "label-switching" problem, which is caused by symmetry in the likelihood of the model parameters. A frequent response to this problem is to remove the symmetry using artificial identifiability constraints. We demonstrate that this fails in general to solve the problem, and describe an alternative class of approaches, relabelling algorithms, which arise from attempting to minimise the posterior expected loss under a class of loss functions. We describe in detail one particularly simple and general relabelling algorithm, and illustrate its success in dealing with the label-switching problem on two examples. KEYWORDS: ...
Detecting immigration by using multilocus genotypes
  • Rannala
The transmission/disequilibrium test: history, subdivision, and admixture
  • Ewens
Computing Bayes factors by posterior simulation and asymptotic approximations
  • DiCiccio
Effective population size and gene flow in the globally, critically endangered Taita thrush, Turdus helleri
  • P Galbusera
  • L Lens
  • E Waiyaki
  • T Schenck
  • E Matthysen
Hypothesis testing and model selection
  • A E Raftery
  • W R Gilks
  • S Richardson
  • D J Spiegelhalter
Bayesian deviance, the effective number of parameters, and the comparison of arbitrarily complex models
  • D J Spiegelhalter
  • N G Best
  • B P Carlin
Origins and affinities of modern humans: a comparison of mitochondrial and nuclear genetic data
  • L B Jorde
  • M J Bamshad
  • W S Watkins
  • R Zenger
Jorde, L. B., M. J. Bamshad, W. S. Watkins, R. Zenger, A. E. Fraley et al., 1995 Origins and affinities of modern humans: a comparison of mitochondrial and nuclear genetic data. Am. J. Hum. Genet. 57: 523-538.
Multilocus genotypes, a tree of individuals, and human evolutionary history
  • J L Mountain
  • L L Cavalli-Sforza
Mountain, J. L., and L. L. Cavalli-Sforza, 1997 Multilocus genotypes, a tree of individuals, and human evolutionary history. Am. J. Hum. Genet. 61: 705-718.
Hypothesis testing and model selection
  • A E Raftery
Raftery, A. E., 1996 Hypothesis testing and model selection, pp. [...] Gilks, S. Richardson and D. J. Spiegelhalter. Chapman & Hall, London.
Detecting immigration by using multilocus genotypes
  • B Rannala
  • J L Mountain
Rannala, B., and J. L. Mountain, 1997 Detecting immigration by using multilocus genotypes. Proc. Natl. Acad. Sci. USA 94: 9197-9201.
On Bayesian analysis of mixtures with an unknown number of components
  • S Richardson
  • P J Green
Richardson, S., and P. J. Green, 1997 On Bayesian analysis of mixtures with an unknown number of components. J. R. Stat. [...]
Dealing with label-switching in mixture models
  • M Stephens
Stephens, M., 2000b Dealing with label-switching in mixture models. J. R. Stat. Soc. Ser. B (in press).
Communicating editor: M. K. Uyenoyama
m is often referred to as the burn-in period of the chain; c is often referred to as the thinning interval. In general it is very difficult to know how large m and c should be. The values required to obtain reliable results depend heavily on the amount of correlation between successive states of the Markov chain. If successive [...] required, possibly rendering the method impracticable. [...] sufficiently large, and the strategy we adopt here, is to simulate several realizations of the Markov chain, each starting from a different value of θ^(0). If m and c are sufficiently large, then the results obtained should be independent of θ^(0) and should therefore be similar for [...]
Suppose that θ may be partitioned into θ = (θ_1, ..., θ_r), and that although it is not possible to simulate from π(θ) directly, it is possible to simulate a random value of θ_i directly from the full conditional distribution π(θ_i | θ_1, θ_2, ..., θ_{i-1}, θ_{i+1}, ..., θ_r) for i = 1, 2, ..., [...]
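The roles of the burn-in period m, the thinning interval c, and the full conditional distributions mentioned above can be illustrated with a toy sampler. The sketch below is not the authors' algorithm: as an assumption made purely for illustration, the target is a standard bivariate normal with correlation rho, whose two full conditionals are univariate normals, and the function names gibbs_bivariate_normal and chain_means are hypothetical. It keeps the states at iterations m, m + c, m + 2c, ... and runs several chains from widely dispersed starting values so that their summaries can be compared.

```python
import random


def gibbs_bivariate_normal(rho, start, m, c, n_keep, seed):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Returns the states at iterations m, m + c, m + 2c, ... (n_keep of them),
    i.e. everything before iteration m is discarded and the rest is thinned by c.
    The target distribution is an illustrative assumption, not the genetic model.
    """
    if m < 1 or c < 1 or n_keep < 1:
        raise ValueError("m, c and n_keep must all be at least 1")
    rng = random.Random(seed)
    x, y = start  # arbitrary starting value of the chain
    cond_sd = (1.0 - rho * rho) ** 0.5  # standard deviation of each full conditional
    kept = []
    last = m + (n_keep - 1) * c
    for t in range(1, last + 1):
        # Update each component from its full conditional given the other.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if t >= m and (t - m) % c == 0:
            kept.append((x, y))
    return kept


def chain_means(rho=0.9, m=500, c=10, n_keep=2000):
    """Run chains from dispersed starting values; return the mean of x for each."""
    starts = [(-10.0, -10.0), (0.0, 0.0), (10.0, 10.0)]
    means = []
    for seed, start in enumerate(starts):
        draws = gibbs_bivariate_normal(rho, start, m, c, n_keep, seed)
        means.append(sum(x for x, _ in draws) / len(draws))
    return means


if __name__ == "__main__":
    # Similar values across chains suggest m and c are large enough here.
    print(chain_means())
```

If the per-chain means differ noticeably, that is a sign that m or c is too small for the amount of correlation between successive states; with a very small m and rho close to 1, the chains started at (-10, -10) and (10, 10) would still reflect their starting values.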