Inter-chromosomal variation in the pattern of human population genetic structure

Cincinnati Children's Hospital Medical Center, Division of Asthma Research, Department of Pediatrics, University of Cincinnati, OH 45229, USA.
Human genomics (Impact Factor: 2.15). 05/2011; 5(4):220-40. DOI: 10.1186/1479-7364-5-4-220
Source: PubMed


Emerging technologies now make it possible to genotype hundreds of thousands of genetic variations across the genome in individuals. The study of loci at finer scales will facilitate the understanding of genetic variation at genomic and geographic levels. We examined global and chromosomal variations across HapMap populations using 3.7 million single nucleotide polymorphisms to search for the most stratified genomic regions of human populations and linked these regions to ontological annotation and functional network analysis. To achieve this, we used five complementary statistical and genetic network procedures: principal component (PC), cluster, discriminant, fixation index (FST) and network/pathway analyses. At the global level, the first two PC scores were sufficient to account for major population structure; however, chromosomal level analysis detected subtle forms of population structure within continental populations, and as many as 31 PCs were required to classify individuals into homogeneous groups. Using recommended population ancestry differentiation measures, a total of 126 regions of the genome were catalogued. Gene ontology and network analyses revealed that these regions included the genes encoding oculocutaneous albinism II (OCA2), hect domain and RLD 2 (HERC2), ectodysplasin A receptor (EDAR) and solute carrier family 45, member 2 (SLC45A2). These genes are associated with melanin production, which is involved in the development of skin and hair colour, skin cancer and eye pigmentation. We also identified the genes encoding interferon-γ (IFNG) and death-associated protein kinase 1 (DAPK1), which are associated with cell death and with inflammatory and immunological diseases. An in-depth understanding of these genomic regions may help to explain variations in adaptation to different environments.
Our approach offers a comprehensive strategy for analysing chromosome-based population structure and differentiation, and demonstrates the application of complementary statistical and functional network analysis in human genetic variation studies.
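
As an illustration of the fixation index (FST) step named in the abstract, the following is a minimal sketch that computes a per-SNP FST from population allele frequencies. The abstract does not state which FST estimator was used, so this sketch assumes Wright's heterozygosity-based definition with equal population weights and no sample-size correction; the function name and the example frequencies are hypothetical.

```python
def wright_fst(freqs):
    """Wright's FST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    Assumption: populations are weighted equally and no correction
    for sample size is applied.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity of the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # The SNP is monomorphic everywhere, so there is no differentiation.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    return (h_total - h_within) / h_total


# Hypothetical frequencies in two populations.
print(wright_fst([0.2, 0.8]))  # strongly differentiated SNP
print(wright_fst([0.5, 0.5]))  # no differentiation
```

Under these assumptions, SNPs or windows with the highest values would be candidates for the most stratified regions; a production analysis would more likely use an estimator that accounts for sample size.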

Available from: Tesfaye M Baye, Feb 06, 2015

  • Source
    • "Ricans (19.6%) and African Americans (14.6%) is higher than among European Americans (8.2%), Mexican Americans (4.8%), and Asian Americans (4.2%) (Gupta et al., 2006; Moorman et al., 2007; Baye et al., 2011b; Silvers and Lang, 2012"

    ABSTRACT: Admixed populations arise when two or more previously isolated populations interbreed. Mapping asthma susceptibility loci in an admixed population using admixture mapping (AM) involves screening the genomes of individuals of mixed ancestry for chromosomal regions that have a higher frequency of alleles from the parental population with higher asthma risk than from the parental population with lower asthma risk. AM takes advantage of the admixture created in populations of mixed ancestry to identify genomic regions where an association exists between genetic ancestry and asthma (rather than between the genotype of the marker and asthma). The theory behind AM is that chromosomal segments of affected individuals contain a significantly higher-than-average proportion of alleles from the high-risk parental population and thus are more likely to harbor disease-associated loci. Criteria to evaluate the applicability of AM as a gene mapping approach include: (1) a difference in disease prevalence between the ancestral populations from which the admixed population was formed; (2) a measurable difference in disease-causing alleles between the parental populations; (3) reduced linkage disequilibrium (LD) between unlinked loci across chromosomes and strong LD between neighboring loci; (4) a set of markers with noticeable allele-frequency differences between the parental populations that contribute to the admixed population (single nucleotide polymorphisms (SNPs) are the markers of choice because they are abundant, stable, relatively cheap to genotype, and informative with regard to the LD structure of chromosomal segments); and (5) an understanding of the extent of segmental chromosomal admixtures and their interactions with environmental factors.
    Although genome-wide association studies have contributed greatly to our understanding of the genetic components of asthma, the large and increasing degree of admixture in populations across the world creates many challenges for further efforts to map disease-causing genes. This review summarizes the historical context of admixed populations and AM, and considers current opportunities to use AM to map asthma genes. In addition, we provide an overview of the potential limitations and future directions of AM in biomedical research, including joint admixture and association mapping for asthma and asthma-related disorders.
    Frontiers in Genetics 09/2015; 6. DOI:10.3389/fgene.2015.00292
  • Source
    • "Adjustment of global ancestry between study subjects may lead to false positives when chromosomal (local) population ancestry is an important confounding factor [34]. In a recent chromosome-based study by Baye (2011) [35], fine-scale substructure was detectable beyond the broad population level classifications that previously have been explored using genome-wide average estimates. The study of population ancestry in terms of local ancestry has broader practical relevance because genetic diversity is directly related to recombination rate (meiosis), which differs among regions of the genome, and genes are not randomly distributed along chromosomes."
    ABSTRACT: Admixture mapping is a powerful gene mapping approach for an admixed population formed from ancestral populations with different allele frequencies. The power of this method relies on the ability of ancestry informative markers (AIMs) to infer ancestry along the chromosomes of admixed individuals. In this study, more than one million SNPs from HapMap databases and simulated data were interrogated in admixed populations using various measures of ancestry informativeness: Fisher Information Content (FIC), Shannon Information Content (SIC), F statistics (FST), Informativeness for Assignment Measure (In), and the Absolute Allele Frequency Differences (delta, δ). The objectives were to compare these measures of informativeness for selecting SNP markers for ancestry inference, and to determine the accuracy of AIM panels selected by each measure in estimating the contributions of the ancestors to the admixed population. FST and In had the highest Spearman correlation and the best agreement as measured by Kappa statistics based on deciles. Although the different measures of marker informativeness performed comparably well, analyses based on the top 1 to 10% ranked informative markers of simulated data showed that In was better at estimating ancestry for an admixed population. Although millions of SNPs have been identified, only a small subset needs to be genotyped in order to accurately predict ancestry with a minimal error rate in a cost-effective manner. In this article, we compared various methods for selecting ancestry informative SNPs using simulations as well as SNP genotype data from samples of admixed populations and showed that the In measure estimates ancestry proportion (in an admixed population) with lower bias and mean square error.
    BMC Genomics 12/2011; 12(1):622. DOI:10.1186/1471-2164-12-622 · 3.99 Impact Factor
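
To make two of the informativeness measures named in the abstract above concrete, here is a minimal sketch of the absolute allele frequency difference (δ) and the Informativeness for Assignment Measure (In) for biallelic SNPs, followed by a simple top-fraction selection. The In formula follows the commonly used definition (Rosenberg et al., 2003) and assumes equally weighted source populations; the function names, SNP labels and frequencies are hypothetical and are not taken from the study.

```python
import math


def delta(p1, p2):
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def _xlogx(x):
    """x * ln(x), with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0.0 else 0.0


def informativeness_for_assignment(freqs):
    """In for one biallelic SNP.

    freqs: frequency of one allele in each of K source populations
    (assumed equally weighted). The other allele has frequency 1 - p.
    """
    k = len(freqs)
    total = 0.0
    for allele_freqs in (freqs, [1.0 - p for p in freqs]):
        mean_freq = sum(allele_freqs) / k
        total += -_xlogx(mean_freq) + sum(_xlogx(p) for p in allele_freqs) / k
    return total


def top_markers(snp_freqs, fraction):
    """Return the names of the top `fraction` of SNPs ranked by In."""
    ranked = sorted(
        snp_freqs,
        key=lambda name: informativeness_for_assignment(snp_freqs[name]),
        reverse=True,
    )
    n_keep = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:n_keep]


# Hypothetical allele frequencies in two parental populations.
snps = {"snp_a": [0.5, 0.5], "snp_b": [0.0, 1.0], "snp_c": [0.4, 0.6]}
print(top_markers(snps, 0.3))
```

For two populations, In ranges from 0 (identical frequencies) to ln 2 (fixed difference), so a marker such as the hypothetical `snp_b` ranks first.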