ABSTRACT: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
ABSTRACT: The genetic characterization of Native American groups provides insights into their history and demographic events. We sequenced the mitochondrial D-loop region (control region) of 520 samples from eight Mexican indigenous groups. In addition to an analysis of the genetic diversity, structure and genetic relationship between 28 Native American populations, we applied Bayesian skyline methodology for a deeper insight into the history of Mesoamerica. AMOVA tests applying cultural, linguistic and geographic criteria were performed. MDS plots showed a central cluster of Oaxaca and Maya populations, whereas those from the North and West were located on the periphery. Demographic reconstruction indicates higher values of the effective number of breeding females (Nef) in Central Mesoamerica during the Preclassic period, whereas this pattern moves toward the Classic period for groups in the North and West. Conversely, Nef minimum values are distributed either in the Lithic period (i.e. founder effects) or in recent periods (i.e. population declines). The Mesoamerican regions showed differences in population fluctuation as indicated by the maximum Inter-Generational Rate (IGRmax): i) Center-South from the Lithic period until the Preclassic; ii) West from the beginning of the Preclassic period until early Classic; iii) North characterized by a wide range of temporal variation from the Lithic to the Preclassic. Our findings are consistent with the genetic variations observed between central, South and Southeast Mesoamerica and the North-West region that are related to differences in genetic drift, structure, and temporal survival strategies (agriculture versus hunter-gathering, respectively). Interestingly, although the European contact had a major negative demographic impact, we detect a previous decline in Mesoamerica that had begun a few hundred years before.
PLoS ONE 08/2015; 10(8):e0131791. DOI:10.1371/journal.pone.0131791 · 3.23 Impact Factor
ABSTRACT: Background
IgE is a key mediator of allergic inflammation, and its levels are frequently increased in patients with allergic disorders.
We sought to identify genetic variants associated with IgE levels in Latinos.
We performed a genome-wide association study and admixture mapping of total IgE levels in 3334 Latinos from the Genes-environments & Admixture in Latino Americans (GALA II) study. Replication was evaluated in 454 Latinos, 1564 European Americans, and 3187 African Americans from independent studies.
We confirmed associations of 6 genes identified by means of previous genome-wide association studies and identified a novel genome-wide significant association of a polymorphism in the zinc finger protein 365 gene (ZNF365) with total IgE levels (rs200076616, P = 2.3 × 10^-8). We next identified 4 admixture mapping peaks (6p21.32-p22.1, 13p22-31, 14q23.2, and 22q13.1) at which local African, European, and/or Native American ancestry was significantly associated with IgE levels. The most significant peak was 6p21.32-p22.1, where Native American ancestry was associated with lower IgE levels (P = 4.95 × 10^-8). All but 22q13.1 were replicated in an independent sample of Latinos, and 2 of the peaks were replicated in African Americans (6p21.32-p22.1 and 14q23.2). Fine mapping of 6p21.32-p22.1 identified 6 genome-wide significant single nucleotide polymorphisms in Latinos, 2 of which replicated in European Americans. Another single nucleotide polymorphism was peak-wide significant within 14q23.2 in African Americans (rs1741099, P = 3.7 × 10^-6) and replicated in non–African American samples (P = .011).
We confirmed genetic associations at 6 genes and identified novel associations within ZNF365, HLA-DQA1, and 14q23.2. Our results highlight the importance of studying diverse multiethnic populations to uncover novel loci associated with total IgE levels.
The Journal of allergy and clinical immunology 12/2014; DOI:10.1016/j.jaci.2014.10.033 · 11.48 Impact Factor
ABSTRACT: Background
Childhood asthma prevalence and morbidity varies among Latinos in the United States, with Puerto Ricans having the highest and Mexicans the lowest.
To determine whether genetic ancestry is associated with the odds of asthma among Latinos, and secondarily whether genetic ancestry is associated with lung function among Latino children.
We analyzed 5493 Latinos with and without asthma from 3 independent studies. For each participant, we estimated the proportion of African, European, and Native American ancestry using genome-wide data. We tested whether genetic ancestry was associated with the presence of asthma and lung function among subjects with and without asthma. Odds ratios (OR) and effect sizes were assessed for every 20% increase in each ancestry.
Native American ancestry was associated with lower odds of asthma (OR = 0.72, 95% CI: 0.66-0.78, P = 8.0 × 10^-15), while African ancestry was associated with higher odds of asthma (OR = 1.40, 95% CI: 1.14-1.72, P = .001). These associations were robust to adjustment for covariates related to early life exposures, air pollution, and socioeconomic status. Among children with asthma, African ancestry was associated with lower lung function, including both pre- and post-bronchodilator measures of FEV1 (−77 ± 19 mL; P = 5.8 × 10^-5 and −83 ± 19 mL; P = 1.1 × 10^-5, respectively) and forced vital capacity (−100 ± 21 mL; P = 2.7 × 10^-6 and −107 ± 22 mL; P = 1.0 × 10^-6, respectively).
Differences in the proportions of genetic ancestry can partially explain disparities in asthma susceptibility and lung function among Latinos.
Journal of Allergy and Clinical Immunology 10/2014; 135(1). DOI:10.1016/j.jaci.2014.07.053 · 11.48 Impact Factor
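The odds ratios in the abstract above are reported per 20% increase in ancestry. As a rough illustration of what that scaling means, the sketch below rescales them to other increments. It assumes the log-odds of asthma are linear in the ancestry proportion, as in a standard logistic regression; the abstract does not spell out the model, and the helper name `rescale_odds_ratio` is hypothetical.

```python
import math


def rescale_odds_ratio(or_per_step: float, step: float, new_step: float) -> float:
    """Rescale an odds ratio reported per `step` to an odds ratio per `new_step`.

    Assumes log-odds are linear in the predictor (an assumption of this
    sketch, not a detail confirmed by the abstract).
    """
    return math.exp(math.log(or_per_step) * new_step / step)


# Odds ratios reported in the abstract, each per 20% increase in ancestry.
native_american_or = 0.72
african_or = 1.40

# Implied odds ratio for a 40% increase in Native American ancestry.
or_na_40 = rescale_odds_ratio(native_american_or, 0.20, 0.40)

# Implied odds ratio for a 10% increase in African ancestry.
or_af_10 = rescale_odds_ratio(african_or, 0.20, 0.10)

print(f"Native American ancestry, per 40%: OR = {or_na_40:.2f}")
print(f"African ancestry, per 10%: OR = {or_af_10:.2f}")
```

Under that assumption, doubling the increment squares the odds ratio and halving it takes the square root, so the direction of each association is unchanged and only its magnitude is rescaled.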
ABSTRACT: Mexico harbors great cultural and ethnic diversity, yet fine-scale patterns of human genome-wide variation from this region remain largely uncharacterized. We studied genomic variation within Mexico from over 1000 individuals representing 20 indigenous and 11 mestizo populations. We found striking genetic stratification among indigenous populations within Mexico at varying degrees of geographic isolation. Some groups were as differentiated as Europeans are from East Asians. Pre-Columbian genetic substructure is recapitulated in the indigenous ancestry of admixed mestizo individuals across the country. Furthermore, two independently phenotyped cohorts of Mexicans and Mexican Americans showed a significant association between subcontinental ancestry and lung function. Thus, accounting for fine-scale ancestry patterns is critical for medical and population genetic studies within Mexico, in Mexican-descent populations, and likely in many other populations worldwide.
ABSTRACT: Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples.
The American Journal of Human Genetics 11/2013; 93(5):852-864. DOI:10.1016/j.ajhg.2013.10.002 · 10.93 Impact Factor
ABSTRACT: The Caribbean basin is home to some of the most complex interactions in recent history among previously diverged human populations. Here, we investigate the population genetic history of this region by characterizing patterns of genome-wide variation among 330 individuals from three of the Greater Antilles (Cuba, Puerto Rico, Hispaniola), two mainland (Honduras, Colombia), and three Native South American (Yukpa, Bari, and Warao) populations. We combine these data with a unique database of genomic variation in over 3,000 individuals from diverse European, African, and Native American populations. We use local ancestry inference and tract length distributions to test different demographic scenarios for the pre- and post-colonial history of the region. We develop a novel ancestry-specific PCA (ASPCA) method to reconstruct the sub-continental origin of Native American, European, and African haplotypes from admixed genomes. We find that the most likely source of the indigenous ancestry in Caribbean islanders is a Native South American component shared among inland Amazonian tribes, Central America, and the Yucatan peninsula, suggesting extensive gene flow across the Caribbean in pre-Columbian times. We find evidence of two pulses of African migration. The first pulse, which today is reflected by shorter, older ancestry tracts, consists of a genetic component more similar to coastal West African regions involved in early stages of the trans-Atlantic slave trade. The second pulse, reflected by longer, younger tracts, is more similar to present-day West-Central African populations, supporting historical records of later transatlantic deportation. Surprisingly, we also identify a Latino-specific European component that has significantly diverged from its parental Iberian source populations, presumably as a result of small European founder population size. We demonstrate that the ancestral components in admixed genomes can be traced back to distinct sub-continental source populations with far greater resolution than previously thought, even when limited pre-Columbian Caribbean haplotypes have survived.
ABSTRACT: The primary rescue medication to treat acute asthma exacerbation is the short-acting β2-adrenergic receptor agonist; however, there is variation in how well a patient responds to treatment. Although these differences might be due to environmental factors, there is mounting evidence for a genetic contribution to variability in bronchodilator response (BDR).
To identify genetic variation associated with bronchodilator drug response in Latino children with asthma.
We performed a genome-wide association study (GWAS) for BDR in 1782 Latino children with asthma using standard linear regression, adjusting for genetic ancestry and ethnicity, and performed replication studies in an additional 531 Latinos. We also performed admixture mapping across the genome by testing for an association between local European, African, and Native American ancestry and BDR, adjusting for genomic ancestry and ethnicity.
We identified 7 genetic variants associated with BDR at a genome-wide significant threshold (P < 5 × 10^-8), all of which had frequencies of less than 5%. Furthermore, we observed an excess of small P values driven by rare variants (frequency, <5%) and by variants in the proximity of solute carrier (SLC) genes. Admixture mapping identified 5 significant peaks; fine mapping within these peaks identified 2 rare variants in SLC22A15 as being associated with increased BDR in Mexicans. Quantitative PCR and immunohistochemistry identified SLC22A15 as being expressed in the lung and bronchial epithelial cells.
Our results suggest that rare variation contributes to individual differences in response to albuterol in Latinos, notably in SLC genes that include membrane transport proteins involved in the transport of endogenous metabolites and xenobiotics. Resequencing in larger, multiethnic population samples and additional functional studies are required to further understand the role of rare variation in BDR.
The Journal of allergy and clinical immunology 08/2013; 133(2). DOI:10.1016/j.jaci.2013.06.043 · 11.48 Impact Factor
ABSTRACT: There is great scientific and popular interest in understanding the genetic
history of populations in the Americas. We wish to understand when different
regions of the continent were inhabited, where settlers came from, and how
current inhabitants relate genetically to earlier populations. Recent studies
unraveled parts of the genetic history of the continent using genotyping arrays
and uniparental markers. The 1000 Genomes Project provides a unique opportunity
for improving our understanding of population genetic history by providing over
a hundred sequenced low coverage genomes and exomes from Colombian (CLM),
Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore
the genomic contributions of African, European, and especially Native American
ancestry to these populations. Estimated Native American ancestry is 48% in
MXL, 25% in CLM, and 13% in PUR. Native American ancestry in PUR appears most
closely related to Equatorial-Tucanoan-speaking populations, supporting a
Southern America ancestry of the Taino people of the Caribbean. We present new
methods to estimate the allele frequencies in the Native American fraction of
the populations, and model their distribution using a three-population
demographic model. The ancestral populations to the three groups likely split
in close succession: the most likely scenario, based on a peopling of the
Americas 16 thousand years ago (kya), supports that the MXL ancestors split
12.2 kya, with a subsequent split of the ancestors to CLM and PUR 11.7 kya. The
model also features a Mexican population of 62,000, a Colombian population of
8,700, and a Puerto Rican population of 1,900. Modeling Identity-by-descent
(IBD) and ancestry tract length, we show that post-contact populations also
differ markedly in their effective sizes and migration patterns, with Puerto
Rico showing the smallest size and the earlier migration from Europe.
ABSTRACT: The Caribbean basin is home to some of the most complex interactions in
recent history among previously diverged human populations. Here, by making use
of genome-wide SNP array data, we characterize ancestral components of
Caribbean populations on a sub-continental level and unveil fine-scale patterns
of population structure distinguishing insular from mainland Caribbean
populations as well as from other Hispanic/Latino groups. We provide genetic
evidence for an inland South American origin of the Native American component
in island populations and for extensive pre-Columbian gene flow across the
Caribbean basin. The Caribbean-derived European component shows significant
differentiation from parental Iberian populations, presumably as a result of
founder effects during the colonization of the New World. Based on demographic
models, we reconstruct the complex population history of the Caribbean since
the onset of continental admixture. We find that insular populations are best
modeled as mixtures absorbing two pulses of African migrants, coinciding with
early and maximum activity stages of the transatlantic slave trade. These two
pulses appear to have originated in different regions within West Africa,
imprinting two distinguishable signatures in present day Afro-Caribbean genomes
and shedding light on the genetic impact of the dynamics occurring during the
slave trade in the Caribbean.
ABSTRACT: BACKGROUND: Atopy varies by ethnicity, even within Latino groups. This variation might be due to environmental, sociocultural, or genetic factors. OBJECTIVE: We sought to examine risk factors for atopy within a nationwide study of US Latino children with and without asthma. METHODS: Aeroallergen skin test responses were analyzed in 1830 US Latino subjects. Key determinants of atopy included country/region of origin, generation in the United States, acculturation, genetic ancestry, and site to which subjects migrated. Serial multivariate zero-inflated negative binomial regressions stratified by asthma status examined the association of each key determinant variable with the number of positive skin test responses. In addition, the independent effect of each key variable was determined by including all key variables in the final models. RESULTS: In baseline analyses African ancestry was associated with 3 times (95% CI, 1.62-5.57) as many positive skin test responses in asthmatic participants and 3.26 times (95% CI, 1.02-10.39) as many positive skin test responses in control participants. Generation and recruitment site were also associated with atopy in crude models. In final models adjusted for key variables, asthmatic patients of Puerto Rican (exp[β] [95% CI], 1.31 [1.02-1.69]) and mixed (exp[β] [95% CI], 1.27 [1.03-1.56]) ethnicity had a greater probability of positive skin test responses compared with Mexican asthmatic patients. Ancestry associations were abrogated by recruitment site but not region of origin. CONCLUSIONS: Puerto Rican ethnicity and mixed origin were associated with degree of atopy within US Latino children with asthma. African ancestry was not associated with degree of atopy after adjusting for recruitment site. Local environment variation, represented by site, was associated with degree of sensitization.
The Journal of allergy and clinical immunology 05/2013; 132(4). DOI:10.1016/j.jaci.2013.02.046 · 11.48 Impact Factor
ABSTRACT: By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.
ABSTRACT: The genetic characterization of Native Mexicans is important to understand multiethnic based features influencing the medical genetics of present Mexican populations, as well as to reconstruct the peopling of the Americas. We describe the Y-chromosome genetic diversity of 197 Native Mexicans from 11 populations and 1,044 individuals from 44 Native American populations after combining with publicly available data. We found extensive heterogeneity among Native Mexican populations and ample segregation of Q-M242* (46%) and Q-M3 (54%) haplogroups within Mexico. The northernmost sampled populations falling outside Mesoamerica (Pima and Tarahumara) showed a clear differentiation with respect to the other populations, which is in agreement with previous results from mtDNA lineages. However, our results point toward a complex genetic makeup of Native Mexicans whose maternal and paternal lineages reveal different narratives of their population history, with sex-biased continental contributions and different admixture proportions. At a continental scale, we found that Arctic populations and the northernmost groups from North America cluster together, but we did not find a clear differentiation within Mesoamerica and the rest of the continent, which coupled with the fact that the majority of individuals from Central and South American samples are restricted to the Q-M3 branch, supports the notion that most Native Americans from Mesoamerica southwards are descendants from a single wave of migration. This observation is compatible with the idea that present day Mexico might have constituted an area of transition in the diversification of paternal lineages during the colonization of the Americas.
American Journal of Physical Anthropology 07/2012; 148(3):395-405. DOI:10.1002/ajpa.22062 · 2.38 Impact Factor
ABSTRACT: Mesoamerica, defined as the broad linguistic and cultural area from middle southern Mexico to Costa Rica, might have played a pivotal role during the colonization of the American continent. The Mesoamerican isthmus has constituted an important geographic barrier that has severely restricted gene flow between North and South America in pre-historical times. Although the Native American component has been already described in admixed Mexican populations, few studies have been carried out in native Mexican populations. In this study, we present mitochondrial DNA (mtDNA) sequence data for the first hypervariable region (HVR-I) in 477 unrelated individuals belonging to 11 different native populations from Mexico. Almost all of the Native Mexican mtDNAs could be classified into the four pan-Amerindian haplogroups (A2, B2, C1, and D1); only two of them could be allocated to the rare Native American lineage D4h3. Their haplogroup phylogenies are clearly star-like, as expected from relatively young populations that have experienced diverse episodes of genetic drift (e.g., extensive isolation, genetic drift, and founder effects) and posterior population expansions. In agreement with this observation, Native Mexican populations show a high degree of heterogeneity in their patterns of haplogroup frequencies. Haplogroup X2a was absent in our samples, supporting previous observations where this clade was only detected in the American northernmost areas. The search for identical sequences in the American continent shows that, although Native Mexican populations seem to show a closer relationship to North American populations, they cannot be related to a single geographical region within the continent. Finally, we did not find significant population structure in the maternal lineages when considering the four main and distinct linguistic groups represented in our Mexican samples (Oto-Manguean, Uto-Aztecan, Tarascan, and Mayan), suggesting that genetic divergence predates linguistic diversification in Mexico.
Human Genetics 07/2009; 126(4):521-31. DOI:10.1007/s00439-009-0693-y · 4.82 Impact Factor
ABSTRACT: Before the arrival of Europeans to Cuba, the island was inhabited by two Native American groups, the Tainos and the Ciboneys. Most of the present archaeological, linguistic and ancient DNA evidence indicates a South American origin for these populations. In colonial times, Cuban Native American people were replaced by European settlers and slaves from Africa. It is still unknown however, to what extent their genetic pool intermingled with and was 'diluted' by the arrival of newcomers. In order to investigate the demographic processes that gave rise to the current Cuban population, we analyzed the hypervariable region I (HVS-I) and five single nucleotide polymorphisms (SNPs) in the mitochondrial DNA (mtDNA) coding region in 245 individuals, and 40 Y-chromosome SNPs in 132 male individuals.
The Native American contribution to present-day Cubans accounted for 33% of the maternal lineages, whereas Africa and Eurasia contributed 45% and 22% of the lineages, respectively. This Native American substrate in Cuba cannot be traced back to a single origin within the American continent, as previously suggested by ancient DNA analyses. Strikingly, no Native American lineages were found for the Y-chromosome, for which the Eurasian and African contributions were around 80% and 20%, respectively.
While the ancestral Native American substrate is still appreciable in the maternal lineages, the extensive process of population admixture in Cuba has left no trace of the paternal Native American lineages, mirroring the strong sexual bias in the admixture processes taking place during colonial times.
ABSTRACT: SNPs are one of the main sources of DNA variation among humans. Their unique properties make them useful polymorphic markers for a wide range of fields, such as medicine, forensics, and population genetics. Although several high-throughput techniques have been (and are being) developed for the vast typing of SNPs in the medical context, population genetic studies involve the typing of few and select SNPs for targeted research. This results in SNPs having to be typed in multiple reactions, consuming large amounts of time and of DNA. In order to improve the current situation in the area of human Y-chromosome diversity studies, we decided to employ a system based on a multiplex oligo ligation assay/PCR (OLA/PCR) followed by CE to create a Y multiplex capable of distinguishing, in a single reaction, all the major haplogroups and as many subhaplogroups on the Y-chromosome phylogeny as possible. Our efforts resulted in the creation of a robust and accurate 35plex (35 SNPs in a single reaction) that when tested on 165 human DNA samples from different geographic areas, proved capable of assigning samples to their corresponding haplogroup.