Rapid assessment of genetic ancestry in populations of unknown origin by genome-wide genotyping of pooled samples.

Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America.
PLoS Genetics (Impact Factor: 8.17). 03/2010; 6(3):e1000866. DOI: 10.1371/journal.pgen.1000866
Source: PubMed

ABSTRACT As we move forward from the current generation of genome-wide association (GWA) studies, additional cohorts of different ancestries will be studied to increase power, fine map association signals, and generalize association results to additional populations. Knowledge of genetic ancestry as well as population substructure will become increasingly important for GWA studies in populations of unknown ancestry. Here we propose genotyping pooled DNA samples using genome-wide SNP arrays as a viable option to efficiently and inexpensively estimate admixture proportion and identify ancestry informative markers (AIMs) in populations of unknown origin. We constructed DNA pools from African American, Native Hawaiian, Latina, and Jamaican samples and genotyped them using the Affymetrix 6.0 array. Aided by individual genotype data from the African American cohort, we established quality control filters to remove poorly performing SNPs and estimated allele frequencies for the remaining SNPs in each panel. We then applied a regression-based method to estimate the proportion of admixture in each cohort using the allele frequencies estimated from pooling and populations from the International HapMap Consortium as reference panels, and identified AIMs unique to each population. In this study, we demonstrated that genotyping pooled DNA samples yields estimates of admixture proportion that are both consistent with our knowledge of population history and similar to those obtained by genotyping known AIMs. Furthermore, through validation by individual genotyping, we demonstrated that pooling is quite effective for identifying SNPs with large allele frequency differences (i.e., AIMs) and that these AIMs are able to differentiate two closely related populations (HapMap JPT and CHB).
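The abstract does not spell out the regression itself, so the following is only a minimal sketch of one way such an estimate could be set up, not the authors' implementation. It assumes the pooled allele frequency at each SNP is approximately a weighted average of the reference-panel frequencies, fits the weights by ordinary least squares with a sum-to-one constraint, and flags candidate AIMs by absolute allele frequency difference. The function names `estimate_admixture` and `candidate_aims`, the `min_delta` threshold of 0.5, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def estimate_admixture(pool_freq, ref_freqs):
    """Least-squares admixture weights constrained to sum to 1.

    pool_freq: shape (n_snps,), allele frequencies estimated from the pool.
    ref_freqs: shape (n_snps, n_refs), frequencies in the reference panels.
    Note: weights are not forced to be non-negative in this sketch.
    """
    pool_freq = np.asarray(pool_freq, dtype=float)
    ref_freqs = np.asarray(ref_freqs, dtype=float)
    last = ref_freqs[:, -1]
    # Substitute w_last = 1 - sum(other weights) to enforce the constraint.
    X = ref_freqs[:, :-1] - last[:, None]
    y = pool_freq - last
    w_head, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.append(w_head, 1.0 - w_head.sum())

def candidate_aims(freq_a, freq_b, min_delta=0.5):
    """Indices of SNPs whose absolute frequency difference is >= min_delta."""
    delta = np.abs(np.asarray(freq_a, dtype=float) - np.asarray(freq_b, dtype=float))
    return np.flatnonzero(delta >= min_delta)

# Simulated data (assumption: 500 SNPs, two reference panels, 80/20 mixture).
rng = np.random.default_rng(0)
ref = rng.uniform(0.05, 0.95, size=(500, 2))
true_w = np.array([0.8, 0.2])
pool = ref @ true_w + rng.normal(0.0, 0.02, size=500)  # noisy pooled estimates

weights = estimate_admixture(pool, ref)
aims = candidate_aims(ref[:, 0], ref[:, 1])
print("estimated weights:", weights)
print("number of candidate AIMs:", aims.size)
```

In practice the pooled frequencies would come from array intensity data after quality control filtering, and a constrained solver would be needed if negative weights are to be ruled out.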

Available from: Terrence Forrester, Jun 19, 2015
  •
    ABSTRACT: The availability of sequence information for many plants has opened the way to advanced genetic analysis in non-model plants. Nevertheless, exploration of genetic variation on a large scale and its use as a tool for the identification of traits of interest are still rare. In this study, we combined a bulk segregation approach with custom-designed microarrays to map the pH locus that influences fruit pH in melon. Using these technologies, we identified a set of markers that are genetically linked to the pH trait. Further analysis using a set of melon cultivars demonstrated that some of these markers are tightly linked to the pH trait throughout our germplasm collection. These results validate the utility of combining microarray technology with a bulk segregation approach in mapping traits of interest in non-model plants.
    Theoretical and Applied Genetics 10/2012; DOI:10.1007/s00122-012-1983-7 · 3.51 Impact Factor
  •
    ABSTRACT: Genome-wide genotyping of a cohort using pools rather than individual samples has long been proposed as a cost-saving alternative for performing genome-wide association (GWA) studies. However, successful disease gene mapping using pooled genotyping has thus far been limited to detecting common variants with large effect sizes, which tend not to exist for many complex common diseases or traits. Therefore, for DNA pooling to be a viable strategy for conducting GWA studies, it is important to determine whether commonly used genome-wide SNP array platforms such as the Affymetrix 6.0 array can reliably detect common variants of small effect sizes using pooled DNA. Taking obesity and age at menarche as examples of human complex traits, we assessed the feasibility of genome-wide genotyping of pooled DNA as a single-stage design for phenotype association. By individually genotyping the top associations identified by pooling, we obtained a 14- to 16-fold enrichment of SNPs nominally associated with the phenotype, but we likely missed the top true associations. In addition, we assessed whether genotyping pooled DNA can serve as an inexpensive screen as the second stage of a multi-stage design with a large number of samples by comparing the most cost-effective 3-stage designs with 80% power to detect common variants with genotypic relative risk of 1.1, with and without pooling. Given the current state of the specific technology we employed and the associated genotyping costs, we showed through simulation that a design involving pooling would be 1.07 times more expensive than a design without pooling. Thus, while a significant amount of information exists within the data from pooled DNA, our analysis does not support genotyping pooled DNA as a means to efficiently identify common variants contributing small effects to phenotypes of interest. 
    While our conclusions were based on the specific technology and study design we employed, the approach presented here will be useful for evaluating the utility of other or future genome-wide genotyping platforms in pooled DNA studies.
    Human Genetics 03/2011; 130(5):607-21. DOI:10.1007/s00439-011-0974-0 · 4.52 Impact Factor
  •
    ABSTRACT: The causes of oedematous vs non-oedematous childhood malnutrition (OM vs NOM) remain elusive. It is possible that inherited differences in handling oxidant stressors are a contributing factor. We tested for associations between polymorphisms in five genes and (i) risk of OM, in a case-control study, and (ii) percentage cytotoxicity in peripheral blood mononuclear cells (PBMCs) exposed to hydrogen peroxide (H₂O₂), in an in vitro cell challenge study. Participants had been admitted previously for treatment of OM (cases, n = 74) or NOM (controls, n = 50), or were an independent set of healthy pregnant women (n = 47) who donated PBMCs. We tested for associations between genetic variation and outcome using single markers or a bivariate score constructed by counting numbers of deleterious alleles for each of 15 possible pairs of markers. In the case-control study there were no significant single-marker associations with OM. We did find that higher bivariate scores were associated with OM for the pair of NAD(P)H:quinone oxidoreductase 1 and catalase (odds ratio 2.00, 95% CI 1.05-3.82). In the cell challenge experiments, there were no significant associations with percentage cytotoxicity. Variation in this small set of genes seems unlikely to have a large impact on either risk of OM or cytotoxicity after H₂O₂ exposure. Larger sample sizes and a much larger set of genetic variants will be required to determine whether genetic variation contributes to the risk of OM. Such studies have potential for improving our understanding of causal pathways in OM.
    Annals of Tropical Paediatrics International Child Health 02/2011; 31(1):27-36. DOI:10.1179/146532811X12925735813805 · 1.17 Impact Factor