Article

Inter-chromosomal variation in the pattern of human population genetic structure

Cincinnati Children's Hospital Medical Center, Division of Asthma Research, Department of Pediatrics, University of Cincinnati, OH 45229, USA.
Human genomics 05/2011; 5(4):220-40. DOI: 10.1186/1479-7364-5-4-220
Source: PubMed

ABSTRACT Emerging technologies now make it possible to genotype hundreds of thousands of genetic variations in individuals, across the genome. The study of loci at finer scales will facilitate the understanding of genetic variation at genomic and geographic levels. We examined global and chromosomal variations across HapMap populations using 3.7 million single nucleotide polymorphisms to search for the most stratified genomic regions of human populations and linked these regions to ontological annotation and functional network analysis. To achieve this, we used five complementary statistical and genetic network procedures: principal component (PC), cluster, discriminant, fixation index (FST) and network/pathway analyses. At the global level, the first two PC scores were sufficient to account for major population structure; however, chromosomal level analysis detected subtle forms of population structure within continental populations, and as many as 31 PCs were required to classify individuals into homogeneous groups. Using recommended population ancestry differentiation measures, a total of 126 regions of the genome were catalogued. Gene ontology and networks analyses revealed that these regions included the genes encoding oculocutaneous albinism II (OCA2), hect domain and RLD 2 (HERC2), ectodysplasin A receptor (EDAR) and solute carrier family 45, member 2 (SLC45A2). These genes are associated with melanin production, which is involved in the development of skin and hair colour, skin cancer and eye pigmentation. We also identified the genes encoding interferon-γ (IFNG) and death-associated protein kinase 1 (DAPK1), which are associated with cell death, inflammatory and immunological diseases. An in-depth understanding of these genomic regions may help to explain variations in adaptation to different environments. 
Our approach offers a comprehensive strategy for analysing chromosome-based population structure and differentiation, and demonstrates the application of complementary statistical and functional network analysis in human genetic variation studies.
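
As an illustration of the fixation index (FST) step named in the abstract, the following is a minimal sketch, assuming the textbook Wright definition FST = (H_T - H_S) / H_T for a single biallelic SNP with equally weighted populations. The abstract does not say which FST estimator the authors used, so the function name `wright_fst`, the SNP labels, and the allele frequencies below are hypothetical and chosen only for illustration.

```python
def wright_fst(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: allele frequency of the same allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("need at least one population frequency")
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity of the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # Locus is monomorphic everywhere, so there is no differentiation.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in three populations, per SNP.
example_snps = {
    "snp_a": [0.50, 0.50, 0.50],  # identical frequencies, no structure
    "snp_b": [0.10, 0.50, 0.90],  # strongly stratified
    "snp_c": [0.40, 0.45, 0.55],  # weakly stratified
}

# Rank SNPs from most to least differentiated.
ranked = sorted(example_snps, key=lambda s: wright_fst(example_snps[s]), reverse=True)
```

Ranking loci by a per-SNP score like this is one simple way to shortlist highly stratified regions. A real analysis would also account for sample sizes and use an estimator suited to the data.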

  • Source
    ABSTRACT: Next-generation sequencing technologies now make it possible to genotype and measure hundreds of thousands of rare genetic variations in individuals across the genome. Characterization of high-density genetic variation facilitates control of population genetic structure on a finer scale before large-scale genotyping in disease genetics studies. Population structure is a well-known, prevalent, and important factor in common variant genetic studies, but its relevance in rare variants is unclear. We perform an extensive population structure analysis using common and rare functional variants from the Genetic Analysis Workshop 17 mini-exome sequence. The analysis based on common functional variants required 388 principal components to account for 90% of the variation in population structure. However, an analysis based on rare variants required 532 significant principal components to account for similar levels of variation. Using rare variants, we detected fine-scale substructure beyond the population structure identified using common functional variants. Our results show that the level of population structure embedded in rare variant data is different from the level embedded in common variant data and that correcting for population structure is only as good as the level one wishes to correct.
    BMC proceedings 11/2011; 5 Suppl 9(Suppl 9):S8. DOI:10.1186/1753-6561-5-S9-S8
  • Source
    ABSTRACT: Admixture mapping is a powerful gene mapping approach for an admixed population formed from ancestral populations with different allele frequencies. The power of this method relies on the ability of ancestry informative markers (AIMs) to infer ancestry along the chromosomes of admixed individuals. In this study, more than one million SNPs from HapMap databases and simulated data have been interrogated in admixed populations using various measures of ancestry informativeness: Fisher Information Content (FIC), Shannon Information Content (SIC), F statistics (FST), Informativeness for Assignment Measure (In), and the Absolute Allele Frequency Differences (delta, δ). The objectives are to compare these measures of informativeness to select SNP markers for ancestry inference, and to determine the accuracy of AIM panels selected by each measure in estimating the contributions of the ancestors to the admixed population. FST and In had the highest Spearman correlation and the best agreement as measured by Kappa statistics based on deciles. Although the different measures of marker informativeness performed comparably well, analyses based on the top 1 to 10% ranked informative markers of simulated data showed that In was better in estimating ancestry for an admixed population. Although millions of SNPs have been identified, only a small subset needs to be genotyped in order to accurately predict ancestry with a minimal error rate in a cost-effective manner. In this article, we compared various methods for selecting ancestry informative SNPs using simulations as well as SNP genotype data from samples of admixed populations and showed that the In measure estimates ancestry proportion (in an admixed population) with lower bias and mean square error.
    BMC Genomics 12/2011; 12:622. DOI:10.1186/1471-2164-12-622 · 4.04 Impact Factor
  • Journal of clinical psychopharmacology 04/2012; 32(2):153-64. DOI:10.1097/JCP.0b013e3182491383 · 3.76 Impact Factor
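
Two of the ancestry informativeness measures named in the admixture-mapping abstract above, the absolute allele frequency difference (δ) and the Informativeness for Assignment Measure (In), can be sketched for a biallelic SNP as follows. This is a minimal sketch, assuming the commonly cited form In = Σ_j ( -p_j ln p_j + (1/K) Σ_i p_ij ln p_ij ) with K equally weighted source populations. The function names and example frequencies are hypothetical and do not come from the cited study.

```python
import math


def _xlogx(x):
    """Return x * ln(x), using the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0.0 else 0.0


def delta(p1, p2):
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def informativeness_for_assignment(freqs):
    """I_n for one biallelic SNP.

    freqs: frequency of the reference allele in each of K populations.
    Populations are weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("need at least one population frequency")
    k = len(freqs)
    total = 0.0
    # Loop over the two alleles: reference (p) and alternate (1 - p).
    for allele_freqs in (list(freqs), [1.0 - p for p in freqs]):
        mean_freq = sum(allele_freqs) / k
        total += -_xlogx(mean_freq) + sum(_xlogx(p) for p in allele_freqs) / k
    return total


# Hypothetical example: one SNP with reference allele frequency 0.2 in one
# ancestral population and 0.8 in the other.
d = delta(0.2, 0.8)
i_n = informativeness_for_assignment([0.2, 0.8])
```

Under these assumptions, In is 0 when all populations share the same allele frequency and reaches ln 2 for a fixed difference between two populations, so a larger value marks a more ancestry-informative marker.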