Evaluating potential for whole-genome studies in Kosrae, an isolated population in Micronesia.

Rockefeller University, 1230 York Avenue, New York, New York 10021, USA.
Nature Genetics (Impact Factor: 29.65). 03/2006; 38(2):214-7. DOI: 10.1038/ng1712
Source: PubMed

ABSTRACT Whole-genome association studies are predicted to be especially powerful in isolated populations owing to increased linkage disequilibrium (LD) and decreased allelic diversity, but this possibility has not been empirically tested. We compared genome-wide data on 113,240 SNPs typed on 30 trios from the Pacific island of Kosrae to the same markers typed in the 270 samples from the International HapMap Project. The extent of LD is longer and haplotype diversity is lower in Kosrae than in the HapMap populations. More than 98% of Kosraen haplotypes are present in HapMap populations, indicating that HapMap will be useful for genetic studies on Kosrae. The long-range LD around common alleles and limited diversity result in improved efficiency in genetic studies in this population and augment the power to detect association of 'hidden SNPs'.

Available from: Itsik Pe'er, Aug 28, 2014
    • "Based on common SNPs, which comprised 74 and 66% of autosomal SNPs in the OOA and CEU, respectively, the distribution and extent of LD were remarkably similar between these two samples. These data are consistent with previous theoretical predictions [Kruglyak, 1999; Pritchard and Przeworski, 2001] and recent empirical data [Bonnen et al., 2006; Service et al., 2006; Navarro et al., 2009; Thompson et al., 2009], all of which point to modest differences in LD between isolated and cosmopolitan populations for common alleles. The situation for rare alleles, however, is likely to be different as has been demonstrated in applications of LD mapping for monogenic diseases and traits. "
    ABSTRACT: Knowledge of the extent and distribution of linkage disequilibrium (LD) is critical to the design and interpretation of gene mapping studies. Because the demographic history of each population varies and is often not accurately known, it is necessary to empirically evaluate LD on a population-specific basis. Here we present the first genome-wide survey of LD in the Old Order Amish (OOA) of Lancaster County, Pennsylvania, a closed population derived from a modest number of founders. Specifically, we present a comparison of LD between OOA individuals and US Utah participants in the International HapMap Project (abbreviated CEU) using a high-density single nucleotide polymorphism (SNP) map. Overall, the allele (and haplotype) frequency distributions and LD profiles were remarkably similar between these two populations. For example, the median absolute allele frequency difference for autosomal SNPs was 0.05, with an inter-quartile range of 0.02-0.09, and for autosomal SNPs 10-20 kb apart with common alleles (minor allele frequency ≥ 0.05), the LD measure r² was at least 0.8 for 15 and 14% of SNP pairs in the OOA and CEU, respectively. Moreover, tag SNPs selected from the HapMap CEU sample captured a substantial portion of the common variation in the OOA (approximately 88%) at r² ≥ 0.8. These results suggest that the OOA and CEU may share similar LD profiles for other common but untyped SNPs. Thus, in the context of the common variant-common disease hypothesis, genetic variants discovered in gene mapping studies in the OOA may generalize to other populations.
    Genetic Epidemiology 02/2010; 34(2):146-50. DOI:10.1002/gepi.20444 · 2.95 Impact Factor
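
The abstract above reports the share of SNP pairs whose LD measure r² reaches 0.8. As a rough illustration of what that statistic measures, here is a minimal sketch of the standard pairwise r² computed from phased two-SNP haplotypes. The function name `ld_r2` and the 0/1 allele coding are assumptions made for this example and are not taken from the cited study, whose own pipeline is not described here.

```python
def ld_r2(haplotypes):
    """Pairwise LD r^2 for two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased chromosome,
    with alleles coded 0/1 (an assumed coding for this sketch).
    Returns None when either SNP is monomorphic, since r^2 is
    undefined in that case.
    """
    n = len(haplotypes)
    if n == 0:
        return None
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


# Two SNPs whose alleles always travel together versus two that are independent.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(linked), ld_r2(independent))
```

A pair would count toward the reported percentages when this value is at least 0.8. Real analyses usually start from unphased genotypes and need a phasing or estimation step that this sketch leaves out.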
    • "We then quantified haplotype diversity by the proportion of chromosomes from each population that had been accounted for by a specific number of haplotypes. This procedure is similar to that established for quantifying haplotype diversity across multiple populations (Bonnen et al. 2006). In order to investigate the extent of haplotype sharing, chromosomes from the region in chromosome 11 were clustered and visualized with the use of haplosim and hapvisual from the R package haplosuite (Teo and Small 2009). "
    ABSTRACT: The Singapore Genome Variation Project (SGVP) provides a publicly available resource of 1.6 million single nucleotide polymorphisms (SNPs) genotyped in 268 individuals from the Chinese, Malay, and Indian population groups in Southeast Asia. This online database catalogs information and summaries on genotype and phased haplotype data, including allele frequencies, assessment of linkage disequilibrium (LD), and recombination rates in a format similar to the International HapMap Project. Here, we introduce this resource and describe the analysis of human genomic variation upon agglomerating data from the HapMap and the Human Genome Diversity Project, providing useful insights into the population structure of the three major population groups in Asia. In addition, this resource also surveyed across the genome for variation in regional patterns of LD between the HapMap and SGVP populations, and for signatures of positive natural selection using two well-established metrics: iHS and XP-EHH. The raw and processed genetic data, together with all population genetic summaries, are publicly available for download and browsing through a web browser modeled with the Generic Genome Browser.
    Genome Research 09/2009; 19(11):2154-62. DOI:10.1101/gr.095000.109 · 13.85 Impact Factor
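
The quoted passage for this entry describes measuring haplotype diversity as the proportion of chromosomes accounted for by a specific number of haplotypes. One plausible reading of that measure, sketched under assumptions, is the fraction of chromosomes covered by the k most frequent haplotypes. The function name `haplotype_coverage` and the string encoding of haplotypes are hypothetical, and the cited papers may define the procedure differently in detail.

```python
from collections import Counter


def haplotype_coverage(chromosomes, k):
    """Fraction of chromosomes carried by the k most common haplotypes.

    chromosomes: list of haplotype strings over the same SNPs,
    one string per phased chromosome (an assumed encoding).
    """
    if not chromosomes:
        raise ValueError("need at least one chromosome")
    counts = Counter(chromosomes)
    covered = sum(count for _, count in counts.most_common(k))
    return covered / len(chromosomes)


# Six chromosomes carrying three distinct haplotypes.
sample = ["AAG", "AAG", "AAG", "ACG", "ACG", "TCG"]
for k in (1, 2, 3):
    print(k, haplotype_coverage(sample, k))
```

Under this reading, a population with lower haplotype diversity reaches a given coverage with fewer haplotypes, which is the kind of contrast the Kosrae abstract draws against the HapMap populations.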
    • "Recent advances in genotyping and sequencing technologies have resulted in exciting discoveries of links between genes and diseases via whole-genome association studies (Bonnen et al. 2006). In these studies, cases and controls are collected and single nucleotide polymorphisms (SNPs) are genotyped across the entire genome of these two populations. "
    ABSTRACT: Inference of ancestral information in recently admixed populations, in which every individual is composed of a mixed ancestry (e.g., African Americans in the United States), is a challenging problem. Several previous model-based approaches to admixture have been based on hidden Markov models (HMMs) and Markov hidden Markov models (MHMMs). We present an augmented form of these models that can be used to predict historical recombination events and can model background linkage disequilibrium (LD) more accurately. We also study some of the computational issues that arise in using such Markovian models on realistic data sets. In particular, we present an effective initialization procedure that, when combined with expectation-maximization (EM) algorithms for parameter estimation, yields high accuracy at significantly decreased computational cost relative to the Markov chain Monte Carlo (MCMC) algorithms that have generally been used in earlier studies. We present experiments exploring these modeling and algorithmic issues in two scenarios: the inference of locus-specific ancestries in a population that is assumed to originate from two unknown ancestral populations, and the inference of allele frequencies in one ancestral population given those in another.
    Genome Research 05/2008; 18(4):668-75. DOI:10.1101/gr.072751.107 · 13.85 Impact Factor