ABSTRACT: The consequences of the Neolithic transition in Europe, one of the most important cultural changes in human prehistory, are a subject of great interest. However, its effect on prehistoric and modern-day people in Iberia, the westernmost frontier of the European continent, remains unresolved. We present, to our knowledge, the first genome-wide sequence data from eight human remains, dated to between 5,500 and 3,500 years before present, excavated in the El Portalón cave at Sierra de Atapuerca, Spain. We show that these individuals emerged from the same ancestral gene pool as early farmers in other parts of Europe, suggesting that migration was the dominant mode of transferring farming practices throughout western Eurasia. In contrast to central and northern early European farmers, the Chalcolithic El Portalón individuals additionally mixed with local southwestern hunter-gatherers. The proportion of hunter-gatherer-related admixture into early farmers also increased over the course of two millennia. The Chalcolithic El Portalón individuals showed the greatest genetic affinity to modern-day Basques, who have long been considered linguistic and genetic isolates linked to the Mesolithic, whereas all other European early farmers show greater genetic similarity to modern-day Sardinians. These genetic links suggest that Basques and their language may be linked with the spread of agriculture during the Neolithic. Furthermore, all modern-day Iberian groups except the Basques display distinct admixture with Caucasus/Central Asian and North African groups, possibly related to historical migration events. The El Portalón genomes uncover important pieces of the demographic history of Iberia and Europe and reveal how prehistoric groups relate to modern-day people.
Proceedings of the National Academy of Sciences 09/2015; 112(38). DOI:10.1073/pnas.1509851112 · 9.67 Impact Factor
ABSTRACT: Background:
In ecology and forensics, some population assignment techniques use molecular markers to assign individuals to known groups. However, assigning individuals to known populations can be difficult if the level of genetic differentiation among populations is small. Most assignment studies handle independent markers, often by pruning markers in Linkage Disequilibrium (LD), ignoring the information contained in the correlation among markers due to LD.
To improve the accuracy of population assignment, we present an algorithm, implemented in the HaploPOP software, that combines markers into haplotypes without requiring independence. The algorithm is based on the Gain of Informativeness for Assignment, a measure used to decide whether a pair of markers should be combined into a haplotype in order to improve assignment. Because complete exploration of all possible solutions for constructing haplotypes is computationally prohibitive, our approach uses a greedy algorithm based on windows of fixed sizes. We evaluate the performance of HaploPOP in assigning individuals to populations using a split-validation approach. We investigate both simulated SNP data and dense genotype data from individuals from Spain and Portugal.
Our results show that constructing haplotypes with HaploPOP can substantially reduce assignment error. The HaploPOP software is freely available as command-line software at www.ieg.uu.se/Jakobsson/software/HaploPOP/.
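The idea of scoring a marker pair by how much a haplotype adds to assignment can be illustrated with a minimal sketch. It is not the HaploPOP implementation. It assumes that informativeness is measured as the mutual information between population and allele under equal population weights, and that the gain is the haplotype's informativeness minus the sum of the two single-marker values. Neither formula is stated in the abstract, and the function names `informativeness` and `gain` are hypothetical. The greedy window search is not shown.

```python
import math
from collections import Counter


def informativeness(alleles_by_pop):
    """Mutual information between population label and allele at one locus.

    alleles_by_pop: one list of observed alleles per population.
    Assumes equal weight for every population (an assumption of this sketch).
    """
    k = len(alleles_by_pop)
    # Per-population allele frequencies.
    freqs = [
        {allele: count / len(alleles) for allele, count in Counter(alleles).items()}
        for alleles in alleles_by_pop
    ]
    total = 0.0
    for allele in set().union(*freqs):
        mean_freq = sum(f.get(allele, 0.0) for f in freqs) / k
        total -= mean_freq * math.log(mean_freq)
        for f in freqs:
            p = f.get(allele, 0.0)
            if p > 0:
                total += (p / k) * math.log(p)
    return total


def gain(haplotypes_by_pop, i, j):
    """Assumed gain: haplotype informativeness minus the sum for markers i and j.

    haplotypes_by_pop: one list of phased haplotype strings per population.
    """
    marker_i = [[h[i] for h in pop] for pop in haplotypes_by_pop]
    marker_j = [[h[j] for h in pop] for pop in haplotypes_by_pop]
    pair = [[h[i] + h[j] for h in pop] for pop in haplotypes_by_pop]
    return informativeness(pair) - informativeness(marker_i) - informativeness(marker_j)


# Toy data: each marker alone has frequency 0.5 in both populations,
# but the two-marker haplotypes separate the populations completely.
pop_a = ["00", "11"]
pop_b = ["01", "10"]
print(gain([pop_a, pop_b], 0, 1))
```

In this toy case each marker on its own carries no information about population membership, so the whole gain comes from the correlation between markers, which is the information lost when markers in LD are pruned.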
ABSTRACT: How and when the Americas were populated remains contentious. Using ancient and modern genome-wide data, we found that the ancestors of all present-day Native Americans, including Athabascans and Amerindians, entered the Americas as a single migration wave from Siberia no earlier than 23 thousand years ago (ka) and after no more than an 8000-year isolation period in Beringia. After their arrival in the Americas, ancestral Native Americans diversified into two basal genetic branches around 13 ka, one that is now dispersed across North and South America and the other restricted to North America. Subsequent gene flow resulted in some Native Americans sharing ancestry with present-day East Asians (including Siberians) and, more distantly, Australo-Melanesians. Putative "Paleoamerican" relict populations, including the historical Mexican Pericúes and South American Fuego-Patagonians, are not directly related to modern Australo-Melanesians as suggested by the Paleoamerican Model.
ABSTRACT: The identification of the genetic structure of populations from multilocus genotype data has become a central component of modern population-genetic data analysis. Application of model-based clustering programs often entails a number of steps, in which the user considers different modeling assumptions, compares results across different pre-determined values of the number of assumed clusters (a parameter typically denoted K), examines multiple independent runs for each fixed value of K, and distinguishes among runs belonging to substantially distinct clustering solutions. Here, we present Clumpak (Cluster Markov Packager Across K), a method that automates the post-processing of results of model-based population structure analyses. For analyzing multiple independent runs at a single K value, Clumpak identifies sets of highly similar runs, separating distinct groups of runs that represent distinct modes in the space of possible solutions. This procedure, which generates a consensus solution for each distinct mode, is performed by the use of a Markov clustering algorithm that relies on a similarity matrix between replicate runs, as computed by the software Clumpp. Next, Clumpak identifies an optimal alignment of inferred clusters across different values of K, extending a similar approach implemented for a fixed K in Clumpp, and simplifying the comparison of clustering results across different K values. Clumpak incorporates additional features, such as implementations of methods for choosing K and comparing solutions obtained by different programs, models, or data subsets. Clumpak, available at http://clumpak.tau.ac.il, simplifies the use of model-based analyses of population structure in population genetics and molecular ecology. This article is protected by copyright. All rights reserved.
ABSTRACT: The origin of maize (Zea mays mays) in the US Southwest remains contentious, with conflicting archaeological data supporting either coastal [1-4] or highland [5, 6] routes of diffusion of maize into the United States. Furthermore, the genetics of adaptation to the new environmental and cultural context of the Southwest is largely uncharacterized [7]. To address these issues, we compared nuclear DNA from 32 archaeological maize samples spanning 6,000 years of evolution to modern landraces. We found that the initial diffusion of maize into the Southwest about 4,000 years ago is likely to have occurred along a highland route, followed by gene flow from a lowland coastal maize beginning at least 2,000 years ago. Our population genetic analysis also enabled us to differentiate selection during domestication for adaptation to the climatic and cultural environment of the Southwest, identifying adaptation loci relevant to drought tolerance and sugar content.
ABSTRACT: The majority of sub-Saharan Africans today speak a number of closely related languages collectively referred to as 'Bantu' languages. The current distribution of Bantu-speaking populations has been found to largely be a consequence of the movement of people rather than a diffusion of language alone. Linguistic and single marker genetic studies have generated various hypotheses regarding the timing and the routes of the Bantu expansion, but these hypotheses have not been thoroughly investigated. In this study, we re-analysed microsatellite markers typed for a large number of African populations that, owing to their fast mutation rates, capture signatures of recent population history. We confirm the spread of west African people across most of sub-Saharan Africa and estimate the expansion of Bantu-speaking groups, using a Bayesian approach, to around 5600 years ago. We tested four different divergence models for Bantu-speaking populations with a distribution comprising three geographical regions in Africa. We found that the most likely model for the movement of the eastern branch of Bantu-speakers involves migration of Bantu-speaking groups to the east followed by migration to the south. This model, however, is only marginally more likely than other models, which might indicate direct movement from the west and/or significant gene flow with the western branch of Bantu-speakers. Our study uses multi-locus genetic data to explicitly investigate the timing and mode of the Bantu expansion, and it demonstrates that west African groups rapidly expanded both in numbers and over a large geographical area, affirming that the Bantu expansion was one of the most dramatic demographic events in human history.
Proceedings of the Royal Society B: Biological Sciences 09/2014; 281(1793). DOI:10.1098/rspb.2014.1448 · 5.05 Impact Factor
ABSTRACT: The New World Arctic, the last region of the Americas to be populated by humans, has a relatively well-researched archaeology, but an understanding of its genetic history is lacking. We present genome-wide sequence data from ancient and present-day humans from Greenland, Arctic Canada, Alaska, Aleutian Islands, and Siberia. We show that Paleo-Eskimos (~3000 BCE to 1300 CE) represent a migration pulse into the Americas independent of both Native American and Inuit expansions. Furthermore, the genetic continuity characterizing the Paleo-Eskimo period was interrupted by the arrival of a new population, representing the ancestors of present-day Inuit, with evidence of past gene flow between these lineages. Despite periodic abandonment of major Arctic regions, a single Paleo-Eskimo metapopulation likely survived in near-isolation for more than 4000 years, only to vanish around 700 years ago.
ABSTRACT: The rapid advance of sequencing technology, coupled with improvements in molecular methods for obtaining genetic data from ancient sources, holds the promise of producing a wealth of genomic data from time-separated individuals. However, the population-genetic properties of time-structured samples have not been extensively explored. Here, we consider the implications of temporal sampling for analyses of genetic differentiation and use a temporal coalescent framework to show that complex historical events such as size reductions, population replacements, and transient genetic barriers between populations leave a footprint of genetic differentiation that can be traced through history using temporal samples. Our results emphasize explicit consideration of the temporal structure when making inferences and indicate that genomic data from ancient individuals will greatly increase our ability to reconstruct population history.
ABSTRACT: Background
Genome-wide scans for regions that demonstrate deviating patterns of genetic variation have become common approaches for finding genes targeted by selection. Several genomic patterns have been utilized for this purpose, including deviations in haplotype homozygosity, frequency spectra and genetic differentiation between populations.
We describe a novel approach based on the Maximum Frequency of Private Haplotypes (MFPH) to search for signals of recent population-specific selection. The MFPH statistic is straightforward to compute for phased SNP and sequence data. Using both simulated and empirical data, we show that MFPH can be a powerful statistic to detect recent population-specific selection, that it performs at the same level as other commonly used summary statistics (e.g. FST, iHS and XP-EHH), and that MFPH in some cases captures signals of selection that are missed by other statistics. For instance, in the Maasai, MFPH reveals a strong signal of selection in a region that contains the genes DOCK3, MAPKAPK3 and CISH, where the other investigated statistics fail to pick up a clear signal. This region has been suggested to affect height in many populations based on phenotype-genotype association studies. It has specifically been suggested to be targeted by selection in Pygmy groups, which are on the opposite end of the human height spectrum compared to the Maasai.
From the analysis of both simulated and publicly available empirical data, we show that MFPH represents a summary statistic that can provide further insight concerning population-specific adaptation.
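As a rough illustration of why such a statistic is simple to compute from phased data, the sketch below takes the name literally: within one genomic window, it finds the haplotypes seen in a focal population but in no comparison population, and returns the highest frequency among them. This reading, the function name `mfph`, and the toy haplotype strings are assumptions made for illustration. The abstract does not give the exact definition, window size, or handling of missing data.

```python
from collections import Counter


def mfph(focal_haplotypes, other_haplotypes):
    """Maximum frequency of a haplotype private to the focal population.

    focal_haplotypes: phased haplotype strings for one window, focal population.
    other_haplotypes: haplotype strings for the same window, all other populations.
    Returns 0.0 when the focal population has no private haplotype.
    """
    if not focal_haplotypes:
        raise ValueError("focal population has no haplotypes in this window")
    seen_elsewhere = set(other_haplotypes)
    counts = Counter(focal_haplotypes)
    # Keep only the counts of haplotypes never observed outside the focal population.
    private_counts = [n for hap, n in counts.items() if hap not in seen_elsewhere]
    if not private_counts:
        return 0.0
    return max(private_counts) / len(focal_haplotypes)


# Toy window: "0101" is private to the focal population and carried by 3 of 5 chromosomes.
focal = ["0101", "0101", "0101", "0110", "1111"]
others = ["0110", "0000", "1111", "1111"]
print(mfph(focal, others))  # 0.6
```

A high value in one window means that a single haplotype absent elsewhere has reached high frequency in the focal population, which is the pattern expected after a recent, population-specific sweep.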
ABSTRACT: Prehistoric population structure associated with the transition to an agricultural lifestyle in Europe remains a contentious idea. Population-genomic data from 11 Scandinavian Stone Age human remains suggest that hunter-gatherers had lower genetic diversity than that of farmers. Despite their close geographical proximity, the genetic differentiation between the two Stone Age groups was greater than that observed among extant European populations. Additionally, the Scandinavian Neolithic farmers exhibited a greater degree of hunter-gatherer-related admixture than that of the Tyrolean Iceman, who also originated from a farming context. In contrast, Scandinavian hunter-gatherers displayed no significant evidence of introgression from farmers. Our findings suggest that Stone Age foraging groups were historically in low numbers, likely owing to oscillating living conditions or restricted carrying capacity, and that they were partially incorporated into expanding farming groups.
ABSTRACT: The ability to digest milk into adulthood, lactase persistence (LP), as well as specific genetic variants associated with LP, is heterogeneously distributed in global populations [1-4]. These variants were most likely targets of selection when some populations converted from hunter-gatherer to pastoralist or farming lifestyles [5-7]. Specific LP polymorphisms are associated with particular geographic regions and populations [1-4, 8-10]; however, they have not been extensively studied in southern Africa. We investigate the LP-regulatory region in 267 individuals from 13 southern African populations (including descendants of hunter-gatherers, pastoralists, and agropastoralists), providing the first comprehensive study of the LP-regulatory region in a large group of southern Africans. The "East African" LP single-nucleotide polymorphism (SNP) (14010G>C) was found at high frequency (>20%) in a strict pastoralist Khoe population, the Nama of Namibia, suggesting a connection to East Africa, whereas the "European" LP SNP (13910C>T) was found in populations of mixed ancestry. Using genome-wide data from various African populations, we identify admixture (13%) in the Nama, from an Afro-Asiatic group dating to >1,300 years ago, with the remaining fraction of their genomes being from San hunter-gatherers. We also find evidence of selection around the LCT gene among Khoe-speaking groups, and the substantial frequency of the 14010C variant among the Nama is best explained by adaptation to digesting milk. These genome-local and genome-wide results support a model in which an East African group brought pastoralist practices to southern Africa and admixed with local hunter-gatherers to form the ancestors of Khoe people.
Current biology: CB 04/2014; 24(8). DOI:10.1016/j.cub.2014.02.041 · 9.57 Impact Factor
ABSTRACT: Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 ¹⁴C years before present (bp) (13,000 to 12,600 calendar years bp). Nearly 50 years of archaeological research point to the Clovis complex as having developed south of the North American ice sheets from an ancestral technology. However, both the origins and the genetic legacy of the people who manufactured Clovis tools remain under debate. It is generally believed that these people ultimately derived from Asia and were directly related to contemporary Native Americans. An alternative, Solutrean, hypothesis posits that the Clovis predecessors emigrated from southwestern Europe during the Last Glacial Maximum. Here we report the genome sequence of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana. The human bones date to 10,705 ± 35 ¹⁴C years bp (approximately 12,707-12,556 calendar years bp) and were directly associated with Clovis tools. We sequenced the genome to an average depth of 14.4× and show that the gene flow from the Siberian Upper Palaeolithic Mal'ta population into Native American ancestors is also shared by the Anzick-1 individual and thus happened before 12,600 years bp. We also show that the Anzick-1 individual is more closely related to all indigenous American populations than to any other group. Our data are compatible with the hypothesis that Anzick-1 belonged to a population directly ancestral to many contemporary Native Americans. Finally, we find evidence of a deep divergence in Native American populations that predates the Anzick-1 individual.
ABSTRACT: Ancestral relationships between populations separated by time represent an often neglected dimension in population genetics, a field which historically has focused on analysis of spatially distributed samples from the same point in time. Models are usually straightforward when two time-separated populations are assumed to be completely isolated from all other populations. However, this is usually an unrealistically stringent assumption when there is gene flow with other populations. Here we investigate continuity in the presence of gene flow from unknown populations. This set-up allows a more nuanced treatment of questions regarding population continuity in terms of "level of contribution" from a particular ancient population to a more recent population. We propose a statistical framework which makes use of a biallelic marker sampled at two different points in time to assess population contribution, and present two different interpretations of the concept. We apply the approach to published data from a prehistoric human population in Scandinavia (Malmström et al. 2009) and Pleistocene woolly mammoth (Barnes et al. 2007; Debruyne et al. 2008).