Unraveling the Complex Maternal History of
Southern African Khoisan Populations
Chiara Barbieri,*
Tom Güldemann,
Christfried Naumann,
Linda Gerlach,
Falko Berthold,
Hirosi Nakagawa,
Sununguko W. Mpoloka,
Mark Stoneking,
and Brigitte Pakendorf
Max Planck Research Group on Comparative Population Linguistics, MPI for Evolutionary
Anthropology, Leipzig 04103, Germany.
Department of African Studies, Humboldt University, Berlin 10099, Germany
Department of Linguistics, MPI for Evolutionary Anthropology, Leipzig 04103, Germany
Institute of Global Studies, Tokyo University of Foreign Studies, Tokyo 183-8534, Japan
Department of Biological Sciences, University of Botswana, Gaborone, Botswana
Department of Evolutionary Genetics, MPI for Evolutionary Anthropology, Leipzig 04103, Germany
KEY WORDS mtDNA; haplogroup; foragers
ABSTRACT The Khoisan populations of southern Africa are known to harbor some of the deepest-rooting lineages of human mtDNA; however, their relationships are as yet poorly understood. Here, we report the results of analyses of complete mtDNA genome sequences from nearly 700 individuals representing 26 populations of southern Africa who speak diverse Khoisan and Bantu languages. Our data reveal a multilayered history of the indigenous populations of southern Africa, who are likely to be the result of admixture of different genetic substrates, such as resident forager populations and pre-Bantu pastoralists from East Africa. We find high levels of genetic differentiation of the Khoisan populations, which can be explained by the effect of drift together with a partial uxorilocal/multilocal residence pattern. Furthermore, there is evidence of extensive contact, not only between geographically proximate groups, but also across wider areas. The results of this contact, which may have played a role in the diffusion of common cultural and linguistic features, are especially evident in the Khoisan populations of the central Kalahari. Am J Phys Anthropol 153:435–448, 2014. © 2013 Wiley Periodicals, Inc.
African populations are increasingly the focus of
genetic studies, in particular those characterized by the
simultaneous presence of an ancestral way of subsist-
ence (predominantly foraging) together with deep-
rooting genetic lineages, like the Pygmies of Central
Africa and the Khoisan of southern Africa (Tishkoff
et al., 2009; Batini et al., 2011; Henn et al., 2011; Veera-
mah et al., 2011; Lachance et al., 2012; Pickrell et al.,
2012; Schlebusch et al., 2012; Verdu et al., 2013). With
the term “Khoisan” we refer to the hunter-gatherer and
pastoralist populations of southern Africa that speak
indigenous non-Bantu languages characterized by heavy
use of click consonants, without any assumption about
their genetic or linguistic unity (cf. Barnard, 1992).
There is archeological evidence for the continuous
presence of foragers in the Kalahari region since the
Late Stone Age 30,000 years ago (Denbow, 1984; Dea-
con and Deacon, 1999). Much later in time, signals of
pastoralist and Iron Age agriculturalist cultures begin to
appear in the archeological record. Pottery and remains
of domesticated animals appear almost simultaneously
2000 years ago in the coastal regions of what is now
South Africa and Namibia, and in northern Botswana.
One hypothesis suggests that this pastoralist culture ori-
ginated in East Africa, where domesticated species are
found as early as 4000 years ago (Deacon and Deacon,
1999; Phillipson, 2005), and was brought to southern
Africa by an immigration of East African herders,
spreading rapidly over the entire territory (Deacon and
Deacon, 1999; Mitchell, 2002; Pleurdeau et al., 2012). A
contrasting explanation is cultural diffusion, according
to which “hunters with sheep” would have autonomously
embarked on the transition to a new way of subsistence
after coming into contact with populations of herders
from the north (Kinahan, 1991; Sadr, 1998). However, such a rapid shift in lifestyle and cultural paradigm is hard to reconcile with the ethnographic evidence (Smith, 1990; Barnard, 2008). The pastoralist tradition predates the arrival of the agriculturalist Bantu speakers, whose culture appears in the archeological record of southern Africa not earlier than 2000–1200 years ago (Reid et al., 1998; Phillipson, 2005; Kinahan, 2011).
Additional Supporting Information may be found in the online version of this article.
Grant sponsor: Deutsche Forschungsgemeinschaft (as part of the European Science Foundation EUROCORES Programme EuroBABEL); Grant sponsor: Japan Society for the Promotion of Science (B), Grant-in-Aid for Scientific Research; Grant number: 19401019; Grant sponsor: Max Planck Society.
*Correspondence to: Chiara Barbieri, MPI for Evolutionary Anthropology, Deutscher Platz 6, Leipzig 04103, Germany. E-mail: (or) Brigitte Pakendorf, Institut des Sciences de l’Homme, DDL, 14 avenue Berthelot, 69363 Lyon CEDEX 07, France. E-mail:
Present Affiliation of Chiara Barbieri: Department of Evolutionary Genetics, MPI for Evolutionary Anthropology, Leipzig, 04103, Germany.
Present Affiliation of Linda Gerlach and Falko Berthold: Department of Linguistics, MPI for Evolutionary Anthropology, Leipzig, 04103; Department of African Studies, Humboldt University, Berlin 10099, Germany.
Present Affiliation of Brigitte Pakendorf: Laboratoire Dynamique du Langage, UMR5596, CNRS and Université Lyon Lumière 2, Lyon, 69007, France.
Received 22 July 2013; revised 12 November 2013; accepted 15 November 2013
DOI: 10.1002/ajpa.22441
Published online 9 December 2013 in Wiley Online Library
Southern African Khoisan populations speak languages that belong to three distinct families (Fig. 1):
Kx’a (Heine and Honken, 2010), Tuu (Güldemann, 2005), and Khoe-Kwadi (Güldemann, 2004; Güldemann
and Elderkin, 2012). Kx’a and Tuu share some linguistic
features, and their distribution is mostly centered in the
Kalahari and its immediate surroundings. Speakers of
dialects belonging to the Ju branch of Kx’a are settled
mainly somewhat to the northwest of the Kalahari, in
northeast Botswana, northern Namibia, and southern
Angola. The Tuu language family was formerly more
widespread than today, covering most of South Africa as
well as parts of Botswana and Namibia; however, in
South Africa most Khoisan populations have assimilated
culturally and linguistically to neighboring populations.
Khoe-Kwadi languages are distributed over a large geo-
graphic area, including the Kalahari, western Namibia,
the Okavango river delta, and the salt pans to the east
of the Kalahari; the now extinct language Kwadi was
spoken in southern Angola. While all speakers of Kx’a
and Tuu languages are (or were) foragers, Khoe-Kwadi
speakers exhibit diverse subsistence strategies: the
majority are (or were) foragers (with a focus on fishing
in the Okavango river), but the now extinct Kwadi of
Angola and the Nama of Namibia were traditionally pas-
toralists, and the Damara had a mixed pattern of sub-
sistence involving hunting and gathering as well as
herding of small livestock (Barnard, 1992). Lastly, there
are phenotypic differences: while the majority of Khoi-
san populations have on average light skin pigmentation
and relatively short stature (a phenotype we here refer
to as the “Khoisan phenotype”), the Damara from Nami-
bia as well as populations of the eastern Kalahari and of
the Okavango region are characterized by on average
taller stature and darker skin pigmentation; the latter
two groups were therefore known as “Black Bushmen”
(e.g., Weiner et al., 1964; Jenkins, 1986).
From a genetic perspective, Khoisan populations are
known to harbor the deepest-rooting clades of uniparen-
tal lineages (Behar et al., 2008; Naidoo et al., 2010; Bar-
bieri et al., 2013a; Schlebusch et al., 2013), but until
recently not much was known about the relationships
between individual populations and the distribution of
genetic variation in these populations. Two novel studies
of autosomal DNA diversity in extended datasets of
Khoisan populations from southern Africa demonstrated
an ancient split that dates within the past 30,000
years, dividing Khoisan populations of the northwest
Kalahari Basin from those settled to the southeast or
south (Pickrell et al., 2012; Schlebusch et al., 2012). Fur-
thermore, both studies detect genetic links with East
Africa in the Nama and other Khoe-Kwadi speakers.
This is in good accordance with the hypothesis that the
ancestor of the Khoe-Kwadi languages was brought to
southern Africa by the pre-Bantu immigration of pastoralists detectable in the archeological record (Güldemann, 2008). A similar link of the Khwe from southern Angola and the Caprivi Strip, who speak a Khoe-Kwadi language, with East African pastoralists was detected in the shared presence at high frequency of the Y-chromosomal haplogroup E-M293 (Henn et al., 2008) that is rare elsewhere in Africa (de Filippo et al., 2011). Finally, autosomal and mtDNA studies display evidence of varying degrees of non-indigenous ancestry in all Khoisan populations, which could reflect contact between indigenous populations and immigrating pre-Bantu pastoralists and/or Bantu-speaking populations that took place at different periods of time in different areas (Pickrell et al., 2012; Schlebusch et al., 2013).
Fig. 1. Studied populations. (a) Map of approximate location of the 26 populations included in this study, with symbols indicating their linguistic affiliation. The gray area indicates the Kalahari semi-desert. (b) Schema of Khoisan linguistic relationships. [Color figure can be viewed in the online issue.]
American Journal of Physical Anthropology
The mtDNA variation of most Khoisan populations is
characterized by high frequencies of the deepest clades of
the mtDNA phylogeny, namely haplogroups L0d and L0k
(Behar et al., 2008; Barbieri et al., 2013a; Schlebusch
et al., 2013). A minor presence of these haplogroups in
neighboring Bantu-speaking populations can be explained
by gene flow after the ancestors of these populations
reached these southernmost areas of their migration; the
proportion of L0d and L0k in Bantu-speaking populations
is higher than that of the characteristic Khoisan Y-
chromosomal haplogroups A-M91 and B-M112, in line
with sex-biased gene flow after contact (Coelho et al.,
2009; Quintana-Murci et al., 2010; Schlebusch et al.,
2011; Barbieri et al., 2013b). The source of haplogroups
other than mtDNA L0d/L0k and Y-chromosome A-M91/B-
M112 in Khoisan foragers has been identified with Bantu
agriculturalists (Schlebusch et al., 2013); however, the
possibility of gene flow from pastoralists or other pre-
Bantu populations should not be dismissed out of hand.
This study is one of the first to investigate the history
of Khoisan populations using complete mtDNA genome
sequences from a large set of populations. We analyze a
total of nearly 700 complete mtDNA genome sequences
from 19 Khoisan populations covering the three linguistic
families Kx’a, Tuu and Khoe-Kwadi and including both
hunter-gatherers and pastoralists, as well as from seven
neighboring Bantu-speaking populations. Our dataset
covers most of the extant variability in Khoisan popula-
tions, but lacks samples from South African populations
whose heritage languages belonged to the Tuu family and
the Khoekhoe group of Khoe-Kwadi, as well as the
extinct Kwadi of Angola; for this reason we refer to the
“Khoe family” and “Khoe speakers” instead of the “Khoe-Kwadi family” and “Khoe-Kwadi speakers” in the remainder
of this article. With these data, we aim to investigate
the relationships among Khoisan populations as well
as evidence for gene flow among them. In particular, we
focus on the following research questions: 1) How is the
maternal genetic component structured in Khoisan, and
does it mirror the genetic structure emerging from the
genome-wide data? 2) How much contact was there
between different Khoisan populations and to what
extent does contact correlate with geographic proximity?
3) Can we detect traces of the hypothesized East African
ancestry of populations speaking Khoe languages?
The dataset
Samples were collected in Botswana and Namibia
between 2009 and 2011 in the framework of a multidisci-
plinary research project focusing on the prehistory of
southern African Khoisan (
kba/). The collection was approved by the ethical review
board of the University of Leipzig and authorized by the
governments of Botswana and of Namibia (Research per-
mit CYSC 1/17/2 IV (8) from the Ministry of Youth Sport
and Culture of Botswana, and 17/3/3 from the Ministry
of Health and Social Services of Namibia). Each individual gave written consent after the purpose of the study was explained with the help of local translators. Details of the sample collection and DNA extraction from saliva have been reported in the Supporting Information of Pickrell et al. (2012). While in that study a reduced set of 187 individuals was chosen for genome-wide SNP typing from a total of 22 African populations, in this study we consider almost all the unrelated individuals from the same sample collection from Botswana and Namibia. Relatives were excluded from the analysis as far as they could be ascertained from the information provided, as were individuals with unclear ethnolinguistic family background, resulting in a dataset of 665 individuals belonging to 19 Khoisan and five Bantu-speaking populations from Botswana and Namibia. This dataset was augmented with 22 Tonga and 12 Mbukushu sequences from Zambia (Barbieri et al., 2013b); these Mbukushu sequences were merged with data from Mbukushu samples obtained in Namibia, after checking for genetic homogeneity.
TABLE 1. Populations included in the study with values of diversity
Population      Linguistic affiliation  Subsistence  Phenotype    Cluster        n   Nuc. div (π)  Variance  Sequence diversity  sd
Taa East        Tuu                     Forager      Khoisan      SOUTH-CENTRAL  30  0.0015        0.000001  0.95                0.02
Taa North       Tuu                     Forager      Khoisan      SOUTH-CENTRAL  25  0.0022        0.000001  0.94                0.03
Taa West        Tuu                     Forager      Khoisan      SOUTH-CENTRAL  31  0.0028        0.000002  0.96                0.02
Hoan            Kx’a                    Forager      Khoisan      SOUTH-CENTRAL  13  0.0010        0.000000  0.79                0.11
G|ui            Khoe                    Forager      Khoisan      CENTRAL        31  0.0022        0.000001  0.92                0.03
G||ana          Khoe                    Forager      Khoisan      CENTRAL        15  0.0018        0.000001  0.98                0.03
Naro            Khoe                    Forager      Khoisan      CENTRAL        35  0.0029        0.000002  0.99                0.01
Ju|’hoan North  Kx’a                    Forager      Khoisan      NORTHWEST      40  0.0028        0.000002  0.92                0.03
Ju|’hoan South  Kx’a                    Forager      Khoisan      NORTHWEST      44  0.0029        0.000002  0.98                0.01
!Xuun           Kx’a                    Forager      Khoisan      NORTHWEST      27  0.0031        0.000002  0.99                0.02
Hai||om         Khoe                    Various      Khoisan      NORTHWEST      51  0.0035        0.000003  0.98                0.01
Nama            Khoe                    Pastoralist  Khoisan      NAMA           29  0.0033        0.000003  0.99                0.01
||Ani           Khoe                    Various      Non-Khoisan  OKAVANGO       18  0.0037        0.000004  0.96                0.03
Buga            Khoe                    Various      Non-Khoisan  OKAVANGO       14  0.0037        0.000004  0.90                0.06
||Xo            Khoe                    Various      Non-Khoisan  OKAVANGO       17  0.0041        0.000004  0.86                0.07
Tshwa           Khoe                    Various      Non-Khoisan  EAST           22  0.0039        0.000004  0.94                0.03
Tcire Tcire     Khoe                    Various      Non-Khoisan  EAST           12  0.0039        0.000004  0.97                0.04
Shua            Khoe                    Various      Non-Khoisan  EAST           42  0.0039        0.000004  0.95                0.02
Damara          Khoe                    Pastoralist  Non-Khoisan  NW-NAMIBIA     38  0.0028        0.000002  0.89                0.04
Herero          Bantu                   Pastoralist  –            NW-NAMIBIA     30  0.0025        0.000002  0.94                0.03
Himba           Bantu                   Pastoralist  –            NW-NAMIBIA     21  0.0024        0.000002  0.93                0.04
Kgalagadi       Bantu                   Various      –            BANTU          19  0.0037        0.000003  0.97                0.03
Tswana          Bantu                   Various      –            BANTU          17  0.0037        0.000004  0.99                0.02
Kalanga         Bantu                   Various      –            BANTU          17  0.0042        0.000005  1.00                0.02
Tonga           Bantu                   Various      –            BANTU          22  0.0042        0.000004  1.00                0.01
Mbukushu        Bantu                   Various      –            BANTU          20  0.0042        0.000005  0.99                0.02
Nuc. div: nucleotide diversity; sd: standard deviation.
Nineteen sequences were not included in analyses
based on population comparisons because they belong to
populations with sample sizes below 12 individuals;
these are speakers of Khoe languages from Botswana (8
individuals) and of Bantu languages from Namibia (11
individuals). These sequences were included only in com-
parisons of haplotypes (i.e., network analyses). We
assigned the remaining 680 individuals to 26 popula-
tions on the basis of ethnolinguistic self-affiliation; the
populations and their linguistic affiliation are provided
in Table 1 and Figure 1. Populations were grouped
together according to their geographic distribution, and
in some cases taking into consideration their linguistic
affiliation and way of subsistence, into eight clusters
(see Table 1). This was done to simplify the interpreta-
tion of sequence sharing and networks, and for analyses
performed in BEAST, where larger sample sizes improve
the performance of the methods.
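The sample counts given in this subsection can be cross-checked against Table 1 with a few lines of Python. The sketch below is only an illustrative consistency check: the per-population sizes are copied from Table 1, the remaining counts come from the text, and all variable names are hypothetical.

```python
# Sample sizes (n) per population, copied from Table 1
table1_n = {
    "Taa East": 30, "Taa North": 25, "Taa West": 31, "Hoan": 13,
    "G|ui": 31, "G||ana": 15, "Naro": 35,
    "Ju|'hoan North": 40, "Ju|'hoan South": 44, "!Xuun": 27, "Hai||om": 51,
    "Nama": 29,
    "||Ani": 18, "Buga": 14, "||Xo": 17,
    "Tshwa": 22, "Tcire Tcire": 12, "Shua": 42,
    "Damara": 38, "Herero": 30, "Himba": 21,
    "Kgalagadi": 19, "Tswana": 17, "Kalanga": 17, "Tonga": 22, "Mbukushu": 20,
}

collected = 665        # individuals sampled in Botswana and Namibia
zambia = 22 + 12       # Tonga and Mbukushu sequences from Zambia
small_groups = 8 + 11  # individuals from populations with fewer than 12 samples

total_sequences = collected + zambia               # size of the final alignment
population_level = total_sequences - small_groups  # used in population comparisons

print(total_sequences, population_level, len(table1_n), sum(table1_n.values()))
```

The totals agree with the text: 699 sequences overall, of which 680 are assigned to the 26 populations listed in Table 1.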
Sequence and data analysis
Genomic libraries were made from sheared DNA,
tagged with either single or double indexes, and
enriched for mtDNA following protocols described in
Meyer and Kircher (2010) and Maricic et al. (2010); see
also Supporting Information in Barbieri et al. (2012).
The libraries were sequenced on the Illumina GAIIx platform, using either single or paired end runs of 76 bp length, resulting in an average coverage of 400×. Read adaptors were trimmed, and reads were filtered for having at most 5 bases with a quality score <15 and indexes for having no base with a quality score <10. Sequences were manually checked with Bioedit and read alignments were screened with ma (Briggs et al., 2009) to exclude alignment errors and confirm indels. The sequences
belonging to haplogroups L0d and L0k were already sub-
mitted to Genbank (
bank/; Barbieri et al., 2013a) and given accession
numbers KC345764-KC346248; the remaining 218
sequences were given accession numbers KC622055-
KC622272. The two poly-C regions (np 303-315, 16183-
16194), which are prone to sequencing errors, were
trimmed from the final alignment used in the analysis.
In the final alignment of 699 sequences, 97 samples
have between one and eight missing nucleotides (result-
ing in a maximum of 0.05% missing data per sequence
and a total of 160 missing nucleotides in the dataset). Of
these missing nucleotides, 81 occurred among the 1,233
polymorphic positions detected in the dataset. To mini-
mize the impact of missing data on the polymorphic
positions, we applied imputation using stringent criteria,
replacing missing sites with the nucleotide that was
present in at least two otherwise identical haplotypes of
the dataset. One hundred thirty nine positions, 75 of
which were among the polymorphic sites, were imputed
in 79 individuals. After imputation, the maximum num-
ber of missing sites per sample was three (with 18 sam-
ples still containing missing sites), and in the final
alignment only a total of 21 sites with Ns in one or more
individuals were excluded from the analysis. Haplogroup
assignment was performed with the online tool Haplogrep (Kloss-Brandstätter et al., 2011).
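The imputation rule described above (fill a missing site only when at least two otherwise identical haplotypes carry the same nucleotide there) can be sketched as follows. This is a minimal illustration and not the procedure actually used: the function name, the `min_support` parameter, and the choice to skip a site when the matching haplotypes disagree are assumptions.

```python
from collections import Counter


def impute_missing(seqs, min_support=2, missing="N"):
    """Fill a missing site only if at least `min_support` other sequences
    match the target at every called site and agree on the base there."""
    imputed = []
    for i, target in enumerate(seqs):
        result = list(target)
        gaps = [pos for pos, base in enumerate(target) if base == missing]
        for pos in gaps:
            votes = Counter()
            for j, other in enumerate(seqs):
                if j == i or other[pos] == missing:
                    continue
                # "Otherwise identical": same base wherever the target is called
                if all(t == o for t, o in zip(target, other) if t != missing):
                    votes[other[pos]] += 1
            # Assumption: impute only when the supporting haplotypes are unanimous
            if len(votes) == 1:
                base, count = next(iter(votes.items()))
                if count >= min_support:
                    result[pos] = base
        imputed.append("".join(result))
    return imputed


toy = ["ACGT", "ACGT", "ACNT", "TCTT"]
print(impute_missing(toy))
```

In the toy alignment, the third sequence matches two complete haplotypes at all called sites, so its missing base is filled with G; a single supporting haplotype would leave the site untouched.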
Values of nucleotide diversity and variance were calculated in R with the package Pegas (Paradis, 2010). Correspondence analysis (CA) was performed with the package ca (Nenadic and Greenacre, 2007) using the haplogroup frequencies reported in the Supporting Information table. Nonmetric multidimensional scaling (MDS) analyses were performed with the function “isoMDS” from the package MASS (Venables and Ripley, 2002). AMOVA, values of sequence diversity and ΦST matrices of distances were computed in Arlequin ver. 3.11. A Mantel test was performed between genetic (ΦST) and geographic distances with the R package vegan (Oksanen et al., 2012); geographic distances between populations were averaged over GPS data from the individual sampling locations with a function of the package fields (Furrer et al., 2012). A neighbor-joining tree of the populations was generated from a ΦST matrix of distances with the function “nj” of the package ape (Paradis et al., 2004). A heatplot of haplotypes shared between at least two populations was generated in R, with the frequency of the respective haplotypes in each population indicated by variable shading.
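For readers unfamiliar with the two diversity measures reported in Table 1, the following Python sketch shows how they are commonly defined. It assumes the standard estimators (mean pairwise differences per site for nucleotide diversity, and Nei's unbiased estimator for sequence diversity) and ignores missing data. It is not a reimplementation of Pegas or Arlequin, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    total_diffs = sum(
        sum(1 for x, y in zip(a, b) if x != y)
        for a, b in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


def sequence_diversity(seqs):
    """Unbiased haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


toy = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(toy), sequence_diversity(toy))
```

In the toy example the three sequences differ at 1, 2, and 1 sites across the three pairs, giving 4 differences over 3 pairs and 4 sites; since all three haplotypes are distinct, the sequence diversity is at its maximum of 1.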
Median-joining networks (Bandelt et al., 1999) with all sites given equal weights and no pre- or post-processing steps were computed with Network 4.11 and visualized in Network Publisher. Branches showing starlike signals of expansions were dated using the rho statistic (Forster et al., 1996) implemented in Network, with the calculator provided as Supporting Information by Soares et al. (2009). In the L0d1 network, branches are labeled with
subhaplogroup names, according to the nomenclature
proposed in Barbieri et al. (2013a).
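The logic of rho dating can be illustrated with a short Python sketch. It is a simplification under stated assumptions: rho is the mean number of mutations separating each sampled sequence from the founder haplotype, the standard error uses the formula for a perfectly star-like phylogeny, and the conversion to years is linear. The calculator of Soares et al. (2009) additionally corrects for purifying selection, which this sketch does not attempt. The rate and genome length below are assumed values, and the function name is hypothetical.

```python
import math

# Assumed values: whole-mtDNA rate of 1.665e-8 substitutions per site per year
# (Soares et al., 2009) and a genome length of 16,569 bp.
RATE_PER_SITE_PER_YEAR = 1.665e-8
GENOME_LENGTH = 16569
YEARS_PER_MUTATION = 1.0 / (RATE_PER_SITE_PER_YEAR * GENOME_LENGTH)  # about 3,600 years


def rho_estimate(mutation_counts, years_per_mutation=YEARS_PER_MUTATION):
    """Rho and a linear age estimate for a star-like branch.

    mutation_counts: number of mutations separating each sampled sequence
    from the founder haplotype of the branch.
    """
    n = len(mutation_counts)
    rho = sum(mutation_counts) / n
    sigma = math.sqrt(rho / n)  # standard error, valid for a star-like phylogeny only
    return {
        "rho": rho,
        "sigma": sigma,
        "age_years": rho * years_per_mutation,
        "se_years": sigma * years_per_mutation,
    }


print(rho_estimate([1, 2, 3, 2]))
```

For branches that are not star-like, the standard error depends on the shape of the network and has to be taken from the Network output rather than from this shortcut.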
BEAST (v1.7.2; Drummond et al., 2012) was used to construct Bayesian Skyline Plots, based on the whole mtDNA sequence and using the mutation rate of 1.665 × 10^-8 substitutions per site per year from Soares et al. (2009). A Generalized Time Reversible model was applied, and multiple runs were performed for each dataset, each with a chain length of 30 million.
Simulations were performed in Serial Simcoal (Ander-
son et al., 2005) to estimate the probability of retaining
identical whole mtDNA sequence types after a given
number of generations following a population split,
starting from effective population sizes of 100, 1,000,
5,000 and 10,000 individuals. We based our simulations
on the two groups emerging from the autosomal data—
NW Kalahari and SE Kalahari—which are estimated to
have split within the last 30,000 years (Pickrell et al.,
2012). The populations included in the two groups were
chosen according to Supporting Information Figure S18
of Pickrell et al. (2012): the Northwest Kalahari group
(NW Kalahari) included the Ju|’hoan South, Ju|’hoan
North, !Xuun, and Hai||om (and thus corresponds to our NORTHWEST cluster), and the Southeast Kalahari group (SE Kalahari) included the Taa North, Taa East, Taa West, Hoan, G||ana, Shua and Tshwa. The resulting groups had sample sizes of 162 for the NW Kalahari and 209 for the SE Kalahari, with seven haplotypes shared between the groups. We proceeded as follows: the initial population was split into two populations, Ne was kept constant, and no migration was considered. The time
after the split was calculated applying a generation time
of 25 years (Fenner, 2005). The possibility of generating
new haplotypes was taken into account: mutations could
occur following a Kimura 2-Parameter mutation model
with the mutation rate for full mtDNA genomes from
Soares et al. (2009), which followed a gamma distribu-
tion. For each effective population size and split time we
ran 1000 iterations, and calculated both the probability
of retaining identical haplotypes and the average num-
ber of haplotypes retained, sampling 162 and 209
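The design of this simulation can be illustrated with a deliberately simplified Python sketch. It does not reproduce Serial Simcoal: instead of a coalescent simulation with a Kimura 2-parameter model and gamma-distributed rates, it runs a forward-time Wright-Fisher model in which any mutation creates a new haplotype. The starting condition (every ancestral individual carries a distinct haplotype), the per-sequence mutation probability `mu`, and all function names are assumptions made for illustration.

```python
import random


def shared_after_split(n_e, generations, mu, n_a, n_b, rng):
    """One replicate: count haplotypes present in both samples after the split."""
    # Assumption: each ancestral individual starts with its own haplotype
    ancestral = list(range(n_e))
    next_new = n_e  # label for the next haplotype created by mutation

    def evolve(pop):
        nonlocal next_new
        for _ in range(generations):
            offspring = []
            for _ in range(n_e):           # effective size is kept constant
                hap = rng.choice(pop)      # random mother from the previous generation
                if rng.random() < mu:      # any mutation yields a new haplotype
                    hap = next_new
                    next_new += 1
                offspring.append(hap)
            pop = offspring
        return pop

    pop_a = evolve(ancestral)              # the two populations evolve without migration
    pop_b = evolve(ancestral)
    sample_a = set(rng.choices(pop_a, k=n_a))  # sampling with replacement
    sample_b = set(rng.choices(pop_b, k=n_b))
    return len(sample_a & sample_b)


def retention(n_e, split_years, mu, n_a=162, n_b=209,
              iterations=1000, generation_time=25, seed=1):
    """Probability of sharing at least one haplotype, and mean number shared."""
    rng = random.Random(seed)
    generations = split_years // generation_time
    counts = [shared_after_split(n_e, generations, mu, n_a, n_b, rng)
              for _ in range(iterations)]
    prob = sum(c > 0 for c in counts) / iterations
    return prob, sum(counts) / iterations
```

A call such as `retention(n_e=100, split_years=30000, mu=0.007, iterations=100)` corresponds to 1,200 generations after the split; the value of `mu` here is only a placeholder. The pure-Python loop is slow for large effective sizes and is meant to show the logic rather than to serve as a production tool.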
Khoisan mtDNA variation, population size, and demography
The haplogroups L0d and L0k are the most common
haplogroups in our dataset: L0d1 is present at 38%,
L0d2 at 16% and L0k at 11%. As discussed in detail in
Barbieri et al. (2013a), these haplogroups are present
in higher proportions in most Khoisan populations than
in populations speaking Bantu languages (Supporting
Information Table). Apart from L0d and L0k, the other
haplogroups found in the dataset have a non-uniform
distribution. They mainly characterize and distinguish
Bantu-speaking populations from each other, although
some are also present in certain Khoisan populations,
especially in those of the OKAVANGO and EAST clusters (cf.
Supporting Information Fig. S1a).
The MDS analysis based on pairwise ΦST values (Fig. 2)
demonstrates a lack of clear structure in the data,
with no distinct linguistically or geographically defined
groupings emerging. The only apparent groups are
formed by the Bantu-speaking Himba and Herero with
the Khoe-speaking Damara, on the one hand, and the
Khoe-speaking G|ui and Kx’a-speaking Hoan on the
other; furthermore, the Taa East are another outlier.
Notwithstanding their geographic location in southern-
central Namibia and their pastoralist subsistence, the
Nama are genetically similar to foraging populations of
northern Namibia and central and eastern Botswana.
The separation of some populations and the striking
genetic proximity of others is also reflected in the matrix
of pairwise genetic distances (Supporting Information
Fig. S2): here, several populations are visibly distin-
guished as having large genetic distances from almost
all of the other populations, for example the Himba,
Herero, Damara, ||Xo, Tonga, and Mbukushu. In contrast,
populations of the SOUTH-CENTRAL, CENTRAL and
NORTHWEST clusters appear genetically close to each
other, with the exception of the G|ui, Taa East, and
Hoan; these are also separated in the MDS plot.
In the CA (Supporting Information Fig. S1a,b), the distinction between most Khoisan and the Bantu-speaking populations is emphasized more than in the MDS analysis, as is the distinction between the Khoisan populations of the Kalahari (NORTHWEST, SOUTH-CENTRAL, and CENTRAL) and the populations of the
OKAVANGO and EAST clusters (cf. Table 1 for a definition of
the clusters). The absence of genetic outliers among the
Khoisan populations of the Kalahari suggests that the
G|ui and Hoan, who are separated in the MDS, do not
differ from their Khoisan neighbors with respect to their
haplogroup composition. While strong genetic drift as
well as the small sample size might account for the dis-
tinction of the Hoan, the G|ui are characterized by high
frequencies of divergent sequence types belonging to
haplogroup L0d2 (Supporting Information Fig. S3).
The overall lack of ethnolinguistic or geographic dis-
tinctions between the populations evident in the MDS
and CA plots is confirmed by AMOVA analyses (Table 2).
These underline the considerable heterogeneity of the
maternal genepool in southern Africa, with a very high
and significant variance observed between populations,
both for the whole dataset of 26 populations (21%), as
well as for the set of 19 Khoisan populations (16.6%).
Focusing on the 19 Khoisan populations, different group-
ings were tested (Table 2). The variance between groups
is very low (3.4%) and nonsignificant when grouping by
the three language families, suggesting that simple lin-
guistic classification is not a good predictor of genetic
variation between populations. Dividing the populations
in four groups by rough geographic criteria results in a
significant between-group variance of 6.7%, but the
between-population variance is still higher (11.3%). When grouping by the two phenotypes, that is, “Khoisan phenotype” vs. “non-Khoisan phenotype,” the between-group variance is highly significant (16.7%) and exceeds that between populations (7.5%); phenotypic
variation therefore correlates with genetic structure.
This result is not unexpected, given that phenotypic
traits have a biological basis and are thus more likely to
be linked to populations than their linguistic affiliation
or geographic location. Nevertheless, the highest
between-group variance (19.4%, as opposed to only 3.9%
variance between populations) is found when grouping
the Khoisan populations by the clusters defined here
on geographic, linguistic and subsistence criteria (cf. Table 1), suggesting that all these factors contribute to structuring the genetic variation in Khoisan (cf. Schlebusch et al., 2012).
Fig. 2. Multidimensional scaling plot based on ΦST distances. Population symbols indicate their linguistic affiliation, as shown in Figure 1. Stress value: 7.97%. [Color figure can be viewed in the online issue.]
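The comparison of AMOVA groupings discussed in the text rests on simple arithmetic: the three variance components sum to 100%, and a grouping is informative when the between-group share is large relative to the share between populations within groups. The Python sketch below makes this explicit. The language-family values are taken from Table 2 and the other three pairs are the rounded percentages quoted in the text; the dictionary keys and function name are illustrative only.

```python
# (between groups, between populations within groups), in percent of variance
groupings = {
    "language families": (3.38, 14.37),
    "geographic groups": (6.7, 11.3),
    "phenotypes": (16.7, 7.5),
    "geolinguistic clusters": (19.4, 3.9),
}


def within_population_share(between_groups, between_pops):
    """Remaining percentage of variance found within populations."""
    return 100.0 - between_groups - between_pops


within = {name: within_population_share(*values)
          for name, values in groupings.items()}

# Grouping that captures the largest share of variance between groups
best = max(groupings, key=lambda name: groupings[name][0])

for name, (bg, bp) in groupings.items():
    print(f"{name}: between groups {bg}%, between pops {bp}%, "
          f"within pops {within[name]:.2f}%")
print("largest between-group share:", best)
```

Under every grouping, roughly three quarters or more of the variance remains within populations; the groupings differ mainly in how the remainder is split between the two higher levels.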
The high level of between-population variance at the
maternal level emerging from the AMOVA is an impor-
tant feature of our dataset. In fact, this value of
between-population diversity is strikingly different from
that found in other African datasets of full mtDNA
sequences (Barbieri et al., 2012; Barbieri et al., 2013b),
where the variance between distinct ethnolinguistic pop-
ulations is <2% of the total. These studies focused on
agriculturalist patrilocal societies with a social structure
that has been shown to homogenize the maternal gene
pool across different ethnolinguistic groups (Gunnarsdot-
tir et al., 2011; Barbieri et al., 2012, 2013b) in the pres-
ence of strict exogamy (Kumar et al., 2006). The
majority of Khoisan societies, however, are traditionally
foragers, and patrilocality is not the predominant sys-
tem. While the ethnographic record for the populations
included in this study is often incomplete (Barnard,
1992), uxorilocal postmarital residence is documented
for several foraging populations: it implies residence
with the bride’s band for the first years after marriage
and up to the birth of the third child, in association with
bride service that the husband has to provide for the
bride’s father (Silberbauer, 1981; Lee, 1984; Heinz, 1994;
Widlok, 1999). In addition, this extended period of stay
with the bride’s parents frequently results in permanent
settlement of the young couple with the woman’s band.
While not strictly uxorilocal, this social behavior results
in reduced female mobility in comparison to the more
common patrilocal practice, and could have influenced
the distribution of the maternal lineages through gener-
ations. Notably, Verdu et al. (2013) find a similar pattern
for Pygmy populations of Cameroon and Gabon using a
dataset of mtDNA HVR-I sequences; they also associate
this result with the less pronounced patrilocality typical of
these foraging populations. A comparison with the pater-
nal gene pool might shed further light on this hypothesis
and complete the genetic picture of a potentially sex-
biased social structure (cf. Oota et al., 2001; Gunnarsdot-
tir et al., 2011; Heyer et al., 2012).
Mitochondrial genetic drift might have further
increased the structure of the maternal genepool caused
by reduced female mobility, since most Khoisan popula-
tions traditionally led a nomadic lifestyle within a
restricted territory, where the core unit was represented
by small bands of related individuals (Barnard, 1992).
This is confirmed by the low nucleotide diversity values
found in some populations of the CENTRAL and SOUTH-
CENTRAL clusters (Table 1), like Hoan, Taa East, and
G||ana (values below 0.002), while the Bantu-speaking
sedentary agriculturalists Tonga, Mbukushu, and
Kalanga have the highest values (0.0042). Bayesian Sky-
line plots (Supporting Information Fig. S4), too, show
reduced effective population sizes in the populations of
the Kalahari area (especially for the SOUTH-CENTRAL and
CENTRAL clusters), and higher population sizes in the
Bantu speakers.
To summarize, the majority of Khoisan populations
are confirmed to be distinct in their mtDNA from their
Bantu-speaking neighbors and more generally from sub-
Saharan Africans. They are also quite heterogeneous in
their mtDNA composition, irrespective of the high frequency
of haplogroups L0d and L0k in several populations
living in the Kalahari and in contrast to the received
wisdom of their constituting a linguistically, culturally,
and biologically unified group: this population heteroge-
neity matches the autosomal data to a certain degree
(Pickrell et al., 2012; Schlebusch et al., 2012). The major
social factor that could have played a role in shaping
this high mtDNA diversity is the tendency for multilocal
postmarital residence patterns, with a strong uxorilocal
tradition in the first years after marriage, which charac-
terizes some of the populations. In addition, in the Kala-
hari populations in particular, low diversity values
reflect low effective population size, making it likely that
genetic drift has further increased population differences.
While there is genetic structure overall, Khoisan
populations cannot be split into distinct groups; however,
their genetic variability is best explained by the small
clusters defined here on the grounds of geographic, linguistic,
and subsistence variation, indicating that all
these factors helped shape the maternal diversity of
Khoisan populations.
TABLE 2. AMOVA analyses (percentage of variance)
1 Group                             Between pops    Within pops
All 26 populations 20.99
19 Khoisan populations 16.59
11 Kalahari forager populations
Ju dialect cluster 1.87 98.13
Grouping criteria (only Khoisan)    Between groups    Between pops/within groups    Within pops
3 Language families (Tuu, Kx’a, and Khoe) 3.38 14.37
4 Geographic groups (West, North, Center, and East)
2 Phenotypes (“Khoisan,” “non-Khoisan”)
7 Geolinguistic clusters
- excluding Bantu 19.39
2 Groups, NW Kalahari vs. SE Kalahari 0.86 11.15
P value <0.01.
P value <0.05.
Taa North, Taa East, Taa West, Hoan, Ju|’hoan North, Ju|’hoan South, !Xuun, Hai||om, G|ui, G||ana, Naro.
!Xuun, Ju|’hoan North, Ju|’hoan South (see Fig. 1b).
As shown in Table 1.
West: Damara, Nama, Hai||om; North: Ju|’hoan North, !Xuun, ||Ani, Buga, ||Xo; Center: Taa East, Taa North, Taa West, Hoan,
G|ui, G||ana, Naro, Ju|’hoan South; East: Tshwa, Tcire Tcire, Shua.
NW Kalahari: Ju|’hoan South, Ju|’hoan North, !Xuun, and Hai||om. SE Kalahari: Taa North, Taa East, Taa West, Hoan, G||ana,
Shua, and Tshwa (as indicated in main text).
The impact of geography on mtDNA variation
and the northwestern-southeastern split
There is a significant association between ΦST distances
and geographic distances for the 19 Khoisan populations
(Mantel test, Z = 0.33, P = 0.001), indicating that
geography plays a role in shaping genetic variation at a
local scale (cf. Schlebusch et al., 2013). The distribution
of sequence types as seen in networks and analyses of
haplotype sharing can provide further insights into the
geographic component of the mtDNA variation. A net-
work based on sequences belonging to haplogroup L0d1
(Fig. 3) highlights the presence of both long isolated
branches consistent with a considerable time depth and
development in isolation (cf. Barbieri et al., 2013a) as
well as common haplotypes shared between different
geographic/linguistic clusters. L0d1c1 is the most widely
represented subhaplogroup, with frequencies of 14% in
the NORTHWEST, 34% in the SOUTH-CENTRAL, and 31% in
the CENTRAL clusters. A branch of haplogroup L0d1c1 is
characterized by a haplotype shared by several clusters
Nama and one Tswana individual) surrounded by many
star-shaped pattern, with other Khoe and Bantu haplo-
types represented to a lesser extent. Out of a total of 40
haplotypes found on this branch, only nine are shared
(22.5%), with clear evidence of close ties between the
SOUTH-CENTRAL and the CENTRAL clusters.
The striking star-like pattern in L0d1c1 is consistent
with a population expansion, which is dated with the
rho statistic (Forster et al., 1996) and the calculator provided
in Soares et al. (2009) to be 5,247 (±2,700) years
old. An explanation for this genetically detectable expan-
sion is not obvious: the signal of expansion is restricted
to this branch of L0d1c1, which is hard to reconcile with
a demographic expansion that would have affected all of
the populations represented in this star-like cluster, and
that should thus have left a trace in several hap-
logroups. An alternative explanation for the expansion
detectable solely in L0d1c1 is positive selection. How-
ever, although there is one nonsynonymous mutation on
the branch leading to L0d1c1, this mutation is not exclu-
sive to this haplogroup; it is present eight additional
times in the entire human mtDNA phylogeny (according
to Phylotree v. 15, van Oven and Kayser, 2009), with
two events occurring within the African haplogroup L2.
It is thus not obvious why selection might have occurred
on L0d1c1.
Fig. 3. Network of L0d1 haplotypes. The nodes are colored by geo-linguistic clusters, as shown in Table 1.
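The idea behind rho dating can be sketched in a few lines: rho is the average number of mutations separating each sampled sequence from the root haplotype of the star-like cluster, and it is converted to years with a mutation rate. The Python sketch below makes several assumptions that should be read as such: it applies a simple linear conversion with the rate of one mutation every 3,624 years quoted elsewhere in this article, whereas the calculator of Soares et al. (2009) applies a further correction, so it will not reproduce the published dates; the standard error treats every sequence as an independent lineage, which understates the error when sequences share branches; and the mutation counts are invented.

```python
import math

# Rate quoted in this article (from Soares et al., 2009); linear use is an assumption.
YEARS_PER_MUTATION = 3624


def rho_age(mutation_counts, years_per_mutation=YEARS_PER_MUTATION):
    """Return (rho, age in years, approximate standard error in years).

    mutation_counts holds, for each sampled sequence, the number of
    mutations separating it from the root haplotype of the cluster.
    """
    n = len(mutation_counts)
    if n == 0:
        raise ValueError("need at least one sequence")
    rho = sum(mutation_counts) / n
    # Star-phylogeny approximation: every sequence is its own lineage.
    se = math.sqrt(rho / n)
    return rho, rho * years_per_mutation, se * years_per_mutation


# Hypothetical cluster: 10 sequences at the root, 8 one step away, 2 two steps away.
counts = [0] * 10 + [1] * 8 + [2] * 2
rho, age, age_se = rho_age(counts)
```

A cluster dominated by the root haplotype with many one-step derivatives, as in the star-like pattern described above, yields a small rho and therefore a young age estimate.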
From the heatplot of haplotypes shared between clusters
(Fig. 4), we can see that the majority of haplotype
sharing is between the NORTHWEST, SOUTH-CENTRAL, and
CENTRAL clusters. CENTRAL displays the most sharing,
with 29 haplotypes (53% of 55 haplotypes) shared with
other clusters. The SOUTH-CENTRAL populations share 18
of their 44 haplotypes (41%); of these, 66% are shared
with CENTRAL as opposed to only 22% shared with NORTHWEST.
In contrast, NORTHWEST populations share only 23%
of their 94 haplotypes with other populations; of these
22 shared haplotypes, they share 50% with CENTRAL and
18% with SOUTH-CENTRAL. These numbers indicate a
closer connection between SOUTH-CENTRAL and CENTRAL
than between SOUTH-CENTRAL and NORTHWEST, as was
also seen in the L0d1 network (Fig. 3). Furthermore, the
NORTHWEST cluster emerges as being somewhat isolated
from the other clusters, as evidenced by the relatively
low number of haplotypes they share with others (23%),
in spite of their having the largest sample size of the
dataset (162 individuals and 94 haplotypes); this pre-
dominance of exclusive haplotypes in the NORTHWEST
cluster can also be seen in Figure 3.
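The sharing proportions discussed above reduce to simple set operations on the haplotypes found in each cluster. The Python sketch below shows one way to compute them; the function name sharing_summary and the toy haplotype labels are hypothetical, while the counts in the reported dictionary are the ones quoted in the preceding paragraph.

```python
def sharing_summary(haplotypes_by_cluster):
    """For each cluster: (shared count, total count, shared fraction)."""
    summary = {}
    for name, own in haplotypes_by_cluster.items():
        # Pool the haplotypes of every other cluster.
        others = set().union(
            *(h for k, h in haplotypes_by_cluster.items() if k != name)
        )
        shared = own & others
        summary[name] = (len(shared), len(own), len(shared) / len(own))
    return summary


# Toy example with hypothetical haplotype labels.
toy = {
    "A": {"h1", "h2", "h3"},
    "B": {"h2", "h4"},
    "C": {"h3", "h5", "h6", "h7"},
}
toy_summary = sharing_summary(toy)

# Counts quoted in the text: (shared haplotypes, total haplotypes).
reported = {"CENTRAL": (29, 55), "SOUTH-CENTRAL": (18, 44), "NORTHWEST": (22, 94)}
percentages = {k: round(100 * s / t) for k, (s, t) in reported.items()}
```

The percentages dictionary recovers the rounded values given in the text (53%, 41%, and 23%), which is a quick consistency check on the quoted counts.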
Sharing is frequent between populations that belong
to the same geographic cluster (Supporting Information
Fig. S5), as expected from the positive correlation
between genetic and geographic distances emerging in
the Mantel test, which could easily derive from contact
between neighbors. However, many haplotypes are
shared more widely. For example, leaving aside the
most common haplotype, which is shared only between
populations from Namibia (Himba, Herero, Damara,
Nama, and Hai||om), the second most common haplotype
is shared among the Taa, Hoan, G|ui, Naro, Shua, and
Tshwa (thus connecting SOUTH-CENTRAL and CENTRAL
with EAST), and the third most common is found in
Buga, ||Xo, Nama, Damara, Himba, and Tonga, and is
therefore found mostly in the north (with the exception
of the Nama; Supporting Information Fig. S5). This close
proximity of populations belonging to different geo-
graphic clusters also emerges from the matrix of pair-
wise genetic distances (Supporting Information Fig. S2),
where nonsignificant genetic distances at a threshold of
0.05 (without any correction) are highlighted: they occur
between populations of the same cluster but also
between populations from different clusters, such as
between Buga and ||Ani (OKAVANGO) and the !Xuun and
Hai||om (NORTHWEST), who are geographically close, and
between the Bantu speakers from Botswana, especially
the Kalanga, and the EAST and OKAVANGO clusters.
Fig. 4. Heatplot of haplotype sharing. The plot displays the number of haplotypes shared between geo-linguistic clusters. The
most common haplotypes are at the bottom of the plot.
Autosomal DNA data indicate a clear split between
northwestern (NW) and southeastern (SE) Kalahari
Khoisan groups that dates to roughly 30,000 years ago
(Pickrell et al., 2012; Schlebusch et al., 2012). The NW
group in Pickrell et al. (2012) corresponds to the NORTHWEST
cluster defined here, while the SE group corresponds
roughly to our SOUTH-CENTRAL, EAST, and CENTRAL
clusters. In our data, the NW and SE Kalahari groups
each contain a total of 94 haplotypes, with a large
amount of haplotype sharing within each group (29% for
NW Kalahari, 50% for SE Kalahari); in contrast, only
seven haplotypes (7.5%) are shared between the two
groups. However, in other analyses the division of the
NW and SE Kalahari groups, based on mtDNA, is not so
clear-cut: i) an AMOVA performed with populations
grouped into NW and SE Kalahari as defined in Figure
S18 of Pickrell et al. (2012) gives a very low and non-sig-
nificant between-group variance of 0.86 (Table 2); ii) the
two groups are not separated as clearly in the MDS plot
(Fig. 2) as in the PCA plot based on the autosomal data;
iii) some populations falling into the NW Kalahari and
SE Kalahari group are not significantly differentiated
(for example the Taa West, which are not significantly
differentiated from any of the NORTHWEST populations, or
the Ju|’hoan North, which are not differentiated from
the Taa North or Taa West; cf. Supporting Information
Fig. S2); and iv) there is some sharing of haplotypes
between groups (Fig. 4).
The split between the NW and SE Kalahari populations
detected in the autosomal DNA data (Pickrell
et al., 2012) was based on analyses biased towards
genetic variation specific to central Kalahari Khoisan
populations (with a PC plot based on SNPs ascertained
in a Ju|’hoan and with a tree constructed after excluding
the effect of non-Khoisan admixture). We therefore constructed
a neighbor-joining tree based on ΦST distances
using only L0d and L0k sequences (Fig. 5) for those populations
with at least 10 individuals carrying L0d and
L0k haplogroups. This separates populations of the SE
Kalahari group (Taa North, Taa East, Hoan, G||ana,
G|ui, and Tshwa) from those of the NW Kalahari group
(Ju|’hoan North, Ju|’hoan South, !Xuun, and Hai||om).
However, differences between the mtDNA sequences and
the autosomal data emerge, too: the Taa West and the
Shua, who in the autosomal analyses fall into the SE
Kalahari group, fall on the branch with the NW Kalahari
populations in the tree based on L0d/L0k sequences.
Overall, the mtDNA analyses thus suggest an initial
population divergence between the NW and SE Kalahari
groups followed by more recent contact, which was not
captured in the previous autosomal DNA studies (Pick-
rell et al., 2012; Schlebusch et al., 2012).
To investigate whether the mtDNA sequences shared
between the NW and SE Kalahari groups are compatible
with a 30,000 year old separation, we performed simula-
tions to test how long shared haplotypes are retained
after a population split (Table 3). Since new mutations
(calculated as one every 3,624 years, with the rate of
Soares et al., 2009) will eventually erase the signal of
shared haplotypes, our simulations investigated how
long shared haplotypes are retained after two popula-
tions diverge, in the absence of any further contact. The
results show that the probability of keeping shared hap-
lotypes when the populations split more than 15,000
years ago is zero. Shared haplotypes are present with a
probability >0.05 only up to 7,500 years after the split.
If we take into consideration that there are seven unique
haplotypes shared between the NW and SE Kalahari
groups, the split would have had to occur 1,000–1,250
years ago in the absence of subsequent migration. Our
results thus suggest that some migration and exchange
throughout the area must have taken place after the
split that was inferred with autosomal data to have hap-
pened within the last 30,000 years. Distinguishing
shared ancestry from contact is difficult with autosomal
SNP data; mtDNA analyses can thus complement such
data, as shared mtDNA genome sequences provide a
clear signal of recent contact.
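The logic of these simulations can be illustrated with a small forward-in-time model: two populations start as identical copies, evolve in isolation, and every new mutation creates a haplotype that cannot be shared. The Python sketch below is only an illustration under loudly stated assumptions. The simulation settings of the study are not restated in this section, so the Wright-Fisher resampling, the starting number of haplotypes, the comparison of whole populations rather than samples, and the function names (evolve, retention_probability) are all hypothetical choices; only the mutation rate (one every 3,624 years) and the generation time of 25 years implied by Table 3 (30 generations for 750 years) come from the article.

```python
import random

YEARS_PER_GENERATION = 25      # implied by Table 3: 30 generations = 750 years
YEARS_PER_MUTATION = 3624      # rate quoted in the text (Soares et al., 2009)
MU = YEARS_PER_GENERATION / YEARS_PER_MUTATION  # mutations per lineage per generation


def evolve(pop, generations, mu, rng, counter):
    """Wright-Fisher resampling with infinite-alleles mutation (assumed model)."""
    n = len(pop)
    for _ in range(generations):
        pop = rng.choices(pop, k=n)          # each child copies a random mother
        for i in range(n):
            if rng.random() < mu:            # a mutation creates a brand-new haplotype
                pop[i] = counter[0]
                counter[0] += 1
    return pop


def retention_probability(n_eff, generations, mu=MU, replicates=100,
                          start_haplotypes=20, seed=1):
    """Return (P of at least one shared haplotype, mean number shared when any)."""
    rng = random.Random(seed)
    shared_counts = []
    for _ in range(replicates):
        # Hypothetical starting point: equally frequent ancestral haplotypes.
        ancestral = [i % start_haplotypes for i in range(n_eff)]
        counter = [start_haplotypes]         # next unused haplotype label
        a = evolve(list(ancestral), generations, mu, rng, counter)
        b = evolve(list(ancestral), generations, mu, rng, counter)
        shared = set(a) & set(b)
        if shared:
            shared_counts.append(len(shared))
    p = len(shared_counts) / replicates
    mean_n = sum(shared_counts) / len(shared_counts) if shared_counts else None
    return p, mean_n


# Example call: effective size 100, 30 generations (750 years) after the split.
p_30, n_30 = retention_probability(n_eff=100, generations=30)
```

Because the starting diversity and the sampling scheme are assumptions, the numbers produced by this sketch should not be expected to match Table 3; it only shows why drift and mutation together erode shared haplotypes as the time since the split increases.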
Nowadays the Kalahari and surrounding areas repre-
sent the core area of settlement of the indigenous popu-
lations of southern Africa (Barnard, 1992), but the
presence of these populations in the central Kalahari
itself can only be relatively recent: this area was covered
with water until 10,000 years ago, when postglacial
conditions dried the Makgadikgadi Lake (one of the larg-
est ancient basins) and filled it with alluvial debris
(Ebert and Hitchcock, 1978; Cooke, 1979). The lake
could have represented a geographic barrier dividing
northwestern populations [currently mainly speakers of
Ju dialects (Kx’a family)] from southeastern populations
[currently speakers of Taa (Tuu family), Hoan (Kx’a
family), and Khoe languages], resulting in the signal of
genetic structure observed in the autosomal data (Pick-
rell et al., 2012; Schlebusch et al., 2012). This deep divi-
sion may also be reflected in the divergent branches in
the L0d1 network, especially in L0d1b2, which makes
up 15% of the NORTHWEST haplotypes, which in turn
represent almost half of the total haplotypes of this
branch (Fig. 3). A subsequent colonization of the basin,
once it dried up, is compatible with the signal of recent
areal contact that emerges from the shared haplotypes.
Fig. 5. Neighbor Joining tree based on ΦST distances of L0d
and L0k sequences.
In conclusion, geography plays a role in connecting
neighboring populations, but the effect of contact also
involves populations that are distant geographically and
linguistically. Some differences emerge between northwest-
ern and southeastern Kalahari populations, with the
NORTHWEST cluster in particular appearing distinct from the
southeastern populations. The possibility of an early diver-
gence of the NW and SE Kalahari groups, which is strongly
supported by the autosomal data, is complemented by the
added signal of recent contact emerging from mtDNA.
Thus, comparing the structure emerging from the autoso-
mal and the mtDNA data reveals a highly complex pattern
of prehistoric population movements. However, for compre-
hensive insights into the prehistoric processes that may
have had an impact on Khoisan genetic structure, data
from extant representatives of South African and Angolan
Khoisan populations are needed.
Contact among the Kalahari foragers
In the previous section, a major signal of contact and
sharing emerged between three clusters: NORTHWEST,
CENTRAL and SOUTH-CENTRAL, confirmed by the sharing of
haplotypes in the L0d1 network and in the heatplot
(Figs. 3 and 4) and in mostly low and non-significant
genetic distances between populations (Supporting Infor-
mation Fig. S2). The populations from these three clus-
ters belong to the same geographic region: the core area
of the Kalahari Basin. They also share common traits
like the “Khoisan phenotype” and a traditional way of
subsistence based on foraging. Genetically, they are
characterized by very high frequencies of mtDNA hap-
logroups L0d and L0k and a common trend for low val-
ues of nucleotide diversity associated with not so low (or
even high) values of sequence diversity (with the excep-
tion of the Hoan, who are characterized by very low
sequence diversity; Table 1). Low nucleotide diversity
values indicate reduced admixture with populations
with a different genetic composition, such as the herders
who migrated to the area 2,000 years ago, or the Bantu-
speaking agriculturalists who arrived later. Neverthe-
less, the presence of a non-Khoisan genetic component
in the autosomal data (Pickrell et al., 2012) indicates
that some admixture must have occurred, probably in
the paternal line.
The common features displayed are probably the
result of areal contact. However, this contact is not
strong enough to make these populations genetically
homogeneous (when pooled together in one group, the
between-population variance of 7% is significant, cf.
Table 2). Further evidence of potential contact can be
revealed by comparisons of linguistic and genetic rela-
tionships. For instance, speakers of Ju languages of the
Kx’a family (Fig. 1), who are settled in the northwestern
Kalahari area, are genetically undifferentiated in the
maternal line (cf. Table 2). In contrast, their linguistic
relatives the Hoan, who live in southern Botswana, dif-
fer from the Ju|’hoan North and Ju|’hoan South, but
share haplotypes with the geographically neighboring
G|ui, Taa, Naro, and the Tshwa and Shua from the EAST
cluster (Supporting Information Fig. S5); furthermore,
they are not significantly differentiated from the G||ana
(Supporting Information Fig. S2). This proximity of the
Hoan to their geographic neighbors rather than to
their linguistic relatives mirrors the results from the
autosomal data (Pickrell et al., 2012) and is in good
agreement with linguistic evidence for contact among
these populations (Traill and Nakagawa, 2000;
Güldemann and Loughnane, 2012).
The CENTRAL cluster includes foragers of the Kalahari
who speak a West Kalahari Khoe language: these are
the G|ui, G||ana and Naro. The Naro are genetically
closely related to both the Ju and the Taa (Supporting
Information Fig. S2), which is in agreement with autoso-
mal evidence that they are the result of admixture
between northwestern and southeastern Kalahari popu-
lations (Pickrell et al., 2012). Irrespective of their genetic
affinities with the Taa and Hoan, the G|ui and G||ana
are distinct with respect to mtDNA from other popula-
tions speaking Khoe languages. This is in good accord-
ance with the hypothesis of a language shift of the G|ui
and G||ana to the Khoe languages they speak nowadays
(Güldemann, 2008). There is also linguistic and historical
evidence for contact between speakers of G|ui and
Taa (Traill and Nakagawa, 2000).
Summing up, similarities between Khoisan popula-
tions are particularly evident in the core area of the
Kalahari Basin, where contact has played a large role in
shaping their genetic makeup. Admixture with immi-
grants did not leave evident traces in the maternal
genetic material, in accordance with low levels of exog-
amy also emerging in the diversity values.
TABLE 3. Results of simulations
Generations              30     40     50    100    150    200    250    300     400     600     800
Years after split       750  1,000  1,250  2,500  3,750  5,000  6,250  7,500  10,000  15,000  20,000
N = 100      P         0.84   0.70   0.62   0.28   0.13   0.06   0.02   0.01       0       0       0
             n          1.4    1.2    1.2    1.0    1.0    1.0    1.0    1.0     1.0      NA      NA
N = 1000     P         1.00   0.99   0.97   0.70   0.35   0.19   0.07   0.03    0.01       0       0
             n          4.4    3.5    2.9    1.6    1.1    1.0    1.0    1.0     1.0      NA      NA
N = 5000     P         1.00   1.00   1.00   0.92   0.66   0.34   0.16   0.08    0.01       0       0
             n         10.2    8.2    6.5    2.6    1.6    1.2    1.1    1.1     1.1      NA      NA
N = 10,000   P         1.00   1.00   1.00   0.96   0.74   0.45   0.24   0.09    0.03       0       0
             n         14.0   11.1    8.9    3.5    1.8    1.3    1.1    1.1     1.0      NA      NA
P (probability of retaining shared haplotypes) and n (average number of haplotypes retained) were calculated for populations with
four different effective sizes (N).
Khoe pastoralists and a putative
East African origin
The majority of the Khoe-speaking populations live in
peripheral areas of the Kalahari, and it has been
hypothesized that they represent the descendants of a
migration of Khoe-Kwadi speakers with a herding economy
(Güldemann, 2008). The putative origin of these
Khoe-Kwadi populations is in East Africa, where live-
stock is present from 4,000 years ago (Phillipson, 2005;
Deacon and Deacon, 1999). There is some genetic evi-
dence in support of this hypothesis: the distribution of Y
chromosome haplogroup E-M293, in association with
microsatellite diversity, suggests an expansion from Tan-
zania to southern Africa that does not overlap with the
Bantu migration (Henn et al., 2008). Autosomal data
(Schlebusch et al., 2012) provide evidence of shared
ancestry between the Nama and East African Maasai,
together with the presence of the same haplotype associ-
ated with a lactase persistence allele in both popula-
tions, which supports the suggested pastoralist
character of this demographic event. Autosomal data
(Pickrell et al., 2012) also suggest a tentative link to
East Africa for the Nama as well as other Khoe popula-
tions, especially the Shua. Once the migrating pastoral-
ists reached the Kalahari, it is likely that there was
intensive exchange and sex-biased gene flow with resi-
dent foraging populations (Deacon and Deacon, 1999):
this would be reflected in a major contribution of
mtDNA haplogroups L0d and L0k to the immigrating
pastoralists, and a consequent homogenization of the for-
ager and pastoralist populations.
Can a genetic signature of the pastoralist Khoe migra-
tion be identified from the mtDNA data? A potential sig-
nature would be mtDNA haplogroups and haplotypes
shared among modern Khoe speakers if the pastoralist
migration included female migrants, since this is
assumed to have taken place not more than 2,000 years
ago. The lineages mostly shared by Khoe populations
are haplogroups L0d (present in all populations) and
L0k (present in most). These might represent retentions
from an original shared East African ancestor, which
would explain the traces of L0d in the Sandawe of Tanzania
(Tishkoff et al., 2007), who speak a language possibly
related to the Khoe languages (Güldemann and
Elderkin, 2010). However, L0d and L0k are rare outside
of southern Africa (Barbieri et al., 2013a), and are
highly characteristic of the NORTHWEST, CENTRAL, and
SOUTH-CENTRAL clusters. Thus, the presence of these lineages
in the Khoe populations might rather be the result
of contact with local foragers.
As found for the Y chromosome, some haplogroups
might retain traces of the putative East African origin of
the Khoe, assuming that not all of these lineages were
incorporated via direct contact with Bantu-speaking
agriculturalists. A potential East African candidate is
haplogroup L5, common in East Africa and present
exclusively in the Shua and Tshwa (at 5 and 18%,
respectively); however, this is notably absent in the OKA-
VANGO and NAMA populations.
A further trace of the East African migration might be
sought in the presence of a minimal common genetic
denominator that could be interpreted as a genetic sig-
nal of shared ancestry of these populations; however, the
Khoe clusters of putative East African origin (OKAVANGO,
EAST, NAMA) harbor different proportions of non-L0d/L0k
haplogroups (Supporting Information Table). Genetic
drift and/or subsequent contact with other Khoisan pop-
ulations may have played a role in increasing this differ-
entiation. A possible exception is represented by
haplogroup L3d, which is present in Khoe-speaking individuals
belonging to the EAST, NAMA, and OKAVANGO clusters,
and in three Hai||om and one G|ui individual (as
well as two !Xuun). However, haplogroup L3d is present
at highest frequency in NW-NAMIBIA, which comprises the
Khoe-speaking Damara and the Bantu-speaking Himba
and Herero. The L3d network (Fig. 6) shows a common
haplotype shared by 28 individuals (26 NW-NAMIBIA, one
Nama, and one Hai||om, indicated with an asterisk in the
network) and surrounded by 15 other haplotypes in a
star-shaped form, suggesting a recent expansion. The
time of this expansion is dated with the rho statistic (Forster
et al., 1996) and the calculator provided in Soares
et al. (2009) to 1,373 years ago (±700 years), which
would have followed the arrival of the pastoralist
migrants. This haplotype stems from a motif carried by
seven Khoe-speaking individuals from various regional
clusters (indicated by an arrow), suggesting that the
ancestors of the Khoe-Kwadi speakers could have initially
carried it to the area and subsequently spread it, creating
the resulting signal of expansion. Strong female gene flow
could then have incorporated L3d lineages into the gene
pool of the ancestors of the pastoralist Himba, Herero,
and Damara (NW-NAMIBIA cluster).
Among Khoisan populations, the pastoralist Nama
show the clearest signal of ancestry with East Africa in
the autosomal data (Schlebusch et al., 2012; Pickrell
et al., 2012), which strongly contrasts with the mtDNA
results: the Nama do not harbor any characteristic East
African mtDNA lineages, and they are genetically close
to the populations from the NORTHWEST, SOUTH-CENTRAL,
and CENTRAL clusters, especially to the linguistically
closely related Hai||om (Fig. 2, Supporting Information
Figs. S2 and S5). It is possible that high levels of contact
with local foragers in the maternal line erased any original
signal of East African maternal ancestry in the
Nama, while a signal of East African ancestry was
retained in the autosomal data, and/or that the pastoralist
migration was heavily male-mediated.
Fig. 6. Network of L3d haplotypes. The dashed line indicates
a branch that has been shortened for graphic purposes.
In summary, the variation present in the non-L0d/L0k
lineages (which are less likely to stem from contact with
autochthonous foragers) does not provide a strong
genetic link of the Khoe-speaking populations with east-
ern Africa. Haplogroup L5 might represent a relic of this
putative immigration of eastern African pastoralists, but
it is present in only two of the seven Khoe-speaking pop-
ulations most likely to have eastern African ancestry.
L3d is another genetic marker that may have been
brought to southern Africa by the Khoe-Kwadi immigra-
tion, but this signal, too, is not unequivocal. The puta-
tive genetic background carried by the maternal
ancestors of the Khoe-Kwadi may have been diluted
through gene flow from local foragers and Bantu-
speaking migrants and further been erased by drift in
some of the populations. A male-dominated migration
could also have played a role in leaving a more evident
signal of eastern African origin in the Y chromosome
(Henn et al., 2008), while the maternal genetic compo-
nent would stem from autochthonous foragers. The
hypothesized eastern African origin of the Khoe ances-
tors requires more investigation, and this line of
research would greatly benefit from the availability of
more representative samples, in particular from more
pastoralist populations of East Africa.
With this dataset of complete mtDNA genome sequen-
ces we greatly extend our knowledge about the history
and demography of Khoisan populations of southern
Africa. Most importantly, we show that they are geneti-
cally differentiated, with populations of the NORTHWEST
geolinguistic cluster somewhat isolated from other popu-
lations. However, in contrast to the deep split emerging
from previous analyses of genome-wide data, contact in
the maternal line between geographically distant popu-
lations can be shown to have taken place. This areal
contact, involving especially the populations of the cen-
tral Kalahari, has played a role in shaping their mtDNA
diversity and may have played a role in the diffusion of
common cultural and linguistic features. Furthermore,
gene flow in the maternal line is most probably the rea-
son why no strong and unambiguous signal of the
hypothesized pre-Bantu pastoralist immigration from
eastern Africa can be detected in the Khoe-speaking pop-
ulations. However, the picture presented here is limited
by our lack of comparable data from descendants of
Khoisan populations from South Africa and Angola. In
future work, analyses of the Y-chromosome will contrib-
ute to our understanding of the genetic variation of
these populations, and will complete the picture of the
socio-demographic factors (in particular, those that are
sex-biased) that have had an impact during Khoisan
This study focuses on the prehistory of populations as
reflected in their genetic variation. It does not intend to
evaluate the self-identification or cultural identity of
any group, which consists of much more than just genetic
ancestry. The authors sincerely thank all the sample
donors for their participation in this study, the govern-
ments of Botswana, Namibia, and Zambia for supporting
their research, Blesswell Kure, Justin Magabe, and
Berendt Nakwe for assistance with sample collection,
Mingkun Li for bioinformatics assistance, and Serena
Tucci, Vera Lede, Roland Schröder, and Anne Butthof for
assistance with sample preparation. They thank Gertrud
Boden for helpful comments on the manuscript.
Anderson CNK, Ramakrishnan U, Chan YL, Hadly EA. 2005.
Serial SimCoal: a population genetics model for data from
multiple populations and points in time. Bioinformatics 21:
Bandelt HJ, Forster P, Röhl A. 1999. Median-joining networks
for inferring intraspecific phylogenies. Mol Biol Evol 16:37–
Barbieri C, Vicente M, Rocha J, Mpoloka SW,
Stoneking M, Pakendorf B. 2013a. Ancient Substructure in
Early mtDNA Lineages of Southern Africa. Am J Hum Genet
Barbieri C, Butthof A, Bostoen K, Pakendorf B. 2013b. Genetic
perspectives on the origin of clicks in Bantu languages from
southwestern Zambia. Eur J Hum Genet 21:430–436.
Barbieri C, Whitten M, Beyer K, Schreiber H, Li M, Pakendorf
B. 2012. Contrasting maternal and paternal histories in the
linguistic context of Burkina Faso. Mol Biol Evol 29:1213–
Barnard A. 1992. Hunters and herders of southern Africa: a
comparative ethnography of the Khoisan peoples. Cambridge,
New York: Cambridge University Press.
Barnard A. 2008. Ethnographic analogy and the reconstruction
of early Khoekhoe society. South Afr Human 20:61–75.
Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der
Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas
D. 2011. Insights into the Demographic History of African
Pygmies from Complete Mitochondrial Genomes. Mol Biol
Evol 28:1099–1110.
Behar DM, Villems R, Soodyall H, Blue-Smith J, Pereira L,
Metspalu E, Scozzari R, Makkan H, Tzur S, Comas D,
Bertranpetit J, Quintana-Murci L, Tyler-Smith C, Wells SR,
Rosset S, The Genographic Consortium. 2008. The Dawn of
Human Matrilineal Diversity. Am J Hum Genet 82:1130–
Briggs AW, Good JM, Green RE, Krause J, Maricic T, Stenzel
U, Lalueza-Fox C, Rudan P, Brajkovic D, Kucan Z, Gušić I,
Schmitz R, Doronichev VB, Golovanova LV, de la Rasilla M,
Fortea J, Rosas A, Pääbo S. 2009. Targeted retrieval and
analysis of five Neandertal mtDNA genomes. Science 325:
Coelho M, Sequeira F, Luiselli D, Beleza S, Rocha J. 2009. On
the edge of Bantu expansions: mtDNA, Y chromosome and
lactase persistence genetic variation in southwestern Angola.
BMC Evol Biol 9:80.
Cooke HJ. 1979. The origin of the Makgadikgadi Pans. Bots
Notes Records 11:37–42.
de Filippo C, Barbieri C, Whitten M, Mpoloka SW,
Gunnarsdottir ED, Bostoen K, Nyambe T, Beyer K, Schreiber
H, de Knijff P, Luiselli D, Stoneking M, Pakendorf B. 2011.
Y-chromosomal variation in sub-Saharan Africa: insights into
the history of Niger-Congo groups. Mol Biol Evol 28:1255–
Deacon HJ, Deacon J. 1999. Human beginnings in South Africa:
uncovering the secrets of the Stone Age. Walnut Creek, CA:
Altamira Press.
Denbow J. 1984. Prehistoric herders and foragers of the Kala-
hari: the evidence for 1500 years of interaction. In: C Schrire,
editor. Past and Present in Hunter Gatherer Studies.
Orlando: Academic Press. p 175–193.
Drummond AJ, Suchard MA, Xie D, Rambaut A. 2012. Bayes-
ian Phylogenetics with BEAUti and the BEAST 1.7. Mol Biol
Evol 29:1969–1973.
Ebert JI, Hitchcock RK. 1978. Ancient Lake Makgadikgadi,
Botswana: mapping, measurement and palaeoclimatic signifi-
cance. Palaeoecology Africa 10:47–56.
Fenner JN. 2005. Cross-cultural estimation of the human gen-
eration interval for use in genetics-based population diver-
gence studies. Am J Phys Anthropol 128:415–423.
Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and
evolution of Native American mtDNA variation: a reappraisal.
Am J Hum Genet 59:935–945.
Furrer R, Nychka D, Sain S. 2012. Fields: tools for spatial data.
R package version 6.7
uldemann T. 2004. Reconstruction through de-construction:
the marking of person, gender, and number in the Khoe fam-
ily and Kwadi. Diachronica 21:251–306.
Güldemann T. 2005. Studies in Tuu (Southern Khoisan). Papers
on Africa, Languages and Literatures 23. Leipzig: Institut für
Afrikanistik, Universität Leipzig.
Güldemann T. 2008. A linguist’s view: Khoe-Kwadi speakers as
the earliest food-producers of southern Africa. South Afr
Human 20:93–132.
Güldemann T, Elderkin ED. 2010. On external genealogical
relationships of the Khoe family. In: M Brenzinger, C König,
editors. Khoisan languages and linguistics: proceedings of the
1st International Symposium January 4–8, 2003: Riezlern/
Kleinwalsertal. Quellen zur Khoisan-Forschung. Köln:
Rüdiger Köppe. p 15–52.
Güldemann T, Loughnane R. 2012. Are there “Khoisan” roots
in body-part vocabulary? On linguistic inheritance and contact
in the Kalahari Basin. Language Dynamics & Change 2:
Gunnarsdottir ED, Nandineni MR, Li M, Myles S, Gil D,
Pakendorf B, Stoneking M. 2011. Larger mitochondrial DNA
than Y-chromosome differences between matrilocal and patri-
local groups from Sumatra. Nat Commun 2:228.
Heine B, Honken H. 2010. The Kx’a family: a new Khoisan
Genealogy. J Asian Afr Stud 79:5–36.
Heinz HJ. 1994. Social organization of the !Kõ Bushmen. Köln:
R. Köppe.
Henn BM, Gignoux C, Lin AA, Oefner PJ, Shen P, Scozzari R,
Cruciani F, Tishkoff SA, Mountain JL, Underhill PA. 2008. Y-
chromosomal evidence of a pastoralist migration through Tan-
zania to southern Africa. Proc Natl Acad Sci USA 105:10693–
Henn BM, Gignoux CR, Jobin M, Granka JM, Macpherson JM,
Kidd JM, Rodriguez-Botigue L, Ramachandran S, Hon L,
Brisbin A, Lin AA, Underhill PA, Comas D, Kidd KK,
Norman PJ, Parham P, Bustamante CD, Mountain JL,
Feldman MW. 2011. Hunter-gatherer genomic diversity sug-
gests a southern African origin for modern humans. Proc
Natl Acad Sci USA 108:5154–5162.
Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific
demographic behaviours that shape human genomic varia-
tion. Mol Ecol 21:597–612.
Jenkins T. 1986. The prehistory of the San and Khoikhoi as
recorded in their blood. In: R Vossen, K Keuthmann, editors.
Contemporary Studies on Khoisan, Vol. 2. Hamburg: Helmut
Buske Verlag. p 51–77.
Kinahan J. 1991. Pastoral Nomads of the central Namib Desert:
the people history forgot. Windhoek: Namibia Archaeological
Kinahan J. 2011. From the beginning: the archaeological evi-
dence. In: M Wallace. A History of Namibia: From the Begin-
ning to 1990. London: Hurst and Company. p 15–43.
Kloss-Brandstätter A, Pacher D, Schönherr S, Weissensteiner
H, Binna R, Specht G, Kronenberg F. 2011. HaploGrep: a fast
and reliable algorithm for automatic classification of mito-
chondrial DNA haplogroups. Hum Mutat 32:25–32.
Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP,
Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global
patterns in human mitochondrial DNA and Y-chromosome
variation caused by spatial instability of the local cultural
processes. PLoS Genet 2:e53.
Lachance J, Vernot B, Elbers CC, Ferwerda B, Froment A, Bodo
JM, Lema G, Fu W, Nyambo TB, Rebbeck TR, Zhang K, Akey
JM, Tishkoff SA. 2012. Evolutionary history and adaptation
from high-coverage whole-genome sequences of diverse African
hunter-gatherers. Cell 150:457–469.
Lee RB. 1984. The Dobe !Kung. Case studies in cultural anthropology.
New York: Holt, Rinehart and Winston.
Maricic T, Whitten M, Pääbo S. 2010. Multiplexed DNA
sequence capture of mitochondrial genomes using PCR products.
PLoS ONE 5:e14004.
Meyer M, Kircher M. 2010. Illumina sequencing library preparation
for highly multiplexed target capture and sequencing.
Cold Spring Harbor Protoc 2010:pdb.prot5448.
Mitchell P. 2002. The Archaeology of Southern Africa. Cam-
bridge: Cambridge University Press.
Naidoo T, Schlebusch CM, Makkan H, Patel P, Mahabeer R,
Erasmus JC, Soodyall H. 2010. Development of a single base
extension method to resolve Y chromosome haplogroups in
sub-Saharan African populations. Investig Genet 1:6.
Nenadic O, Greenacre M. 2007. Correspondence analysis in R,
with two- and three-dimensional graphics: the ca package. J
Stat Software 20:1–13.
Oksanen J, Blanchet FG, Kindt R, Legendre P, Minchin PR,
O’Hara RB, Simpson GL, Solymos P, Stevens MRH, Wagner
H. 2012. vegan: Community Ecology Package. R package version
2.0-5. http://CRAN.R-project.org/package=vegan.
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T,
Stoneking M. 2001. Human mtDNA and Y-chromosome varia-
tion is correlated with matrilocal versus patrilocal residence.
Nat Genet 29:20–21.
Paradis E. 2010. pegas: an R package for population genetics
with an integrated–modular approach. Bioinformatics 26:
Paradis E, Claude J, Strimmer K. 2004. APE: analyses of phylo-
genetics and evolution in R language. Bioinformatics 20:289–
Phillipson DW. 2005. African archaeology. Cambridge: Cam-
bridge University Press.
Pickrell JK, Patterson N, Barbieri C, Berthold F, Gerlach L,
Lipson M, Po-Ru L, Lachance J, Güldemann T, Kure B, Wata
SM, Nakagawa H, Naumann C, Mountain JL, Bustamante
CD, Berger B, Stoneking M, Reich D, Pakendorf B. 2012. The
genetic prehistory of southern Africa. Nat Commun 3. doi:
Pleurdeau D, Imalwa E, Detroit F, Lesur J, Veldman A, Bahain
JJ, Marais E. 2012. "Of sheep and men": earliest direct evidence
of caprine domestication in southern Africa at Leopard
Cave (Erongo, Namibia). PLoS ONE 7:e40340.
Quintana-Murci L, Harmant C, Quach H, Balanovsky O,
Zaporozhchenko V, Bormans C, van Helden PD, Hoal EG,
Behar DM. 2010. Strong maternal Khoisan contribution to
the South African coloured population: a case of gender-
biased admixture. Am J Hum Genet 86:611–620.
Reid A, Sadr K, Hanson-James N. 1998. Herding traditions. In:
Lane P, Reid A, Segobye A, editors. Ditswa Mmung: The
Archaeology of Botswana. Gaborone: Pula Press and The
Botswana Society. p 81–100.
Sadr K. 1998. The first herders at the Cape of Good Hope. Afr
Archaeol Rev 15:101–132.
Schlebusch CM, de Jongh M, Soodyall H. 2011. Different contri-
butions of ancient mitochondrial and Y-chromosomal lineages
in ’Karretjie people’ of the Great Karoo in South Africa. J
Hum Genetics 56:623–630.
Schlebusch CM, Skoglund P, Sjodin P, Gattepaille LM,
Hernandez D, Jay F, Li S, De Jongh M, Singleton A, Blum
MG, Soodyall H, Jakobsson M. 2012. Genomic variation in
seven Khoe-San groups reveals adaptation and complex Afri-
can history. Science 338:374–379.
Schlebusch CM, Lombard M, Soodyall H. 2013. MtDNA control
region variation affirms diversity and deep sub-structure in
populations from southern Africa. BMC Evol Biol 13:1–21.
Silberbauer GB. 1981. Hunter and habitat in the central Kala-
hari Desert. Cambridge: Cambridge University Press.
Smith AB. 1990. On becoming herders: Khoikhoi and San eth-
nicity in southern Africa. Afr Studies 49:51–73.
Soares P, Ermini L, Thomson N, Mormina M, Rito T, Rohl A,
Salas A, Oppenheimer S, Macaulay V, Richards MB. 2009.
Correcting for purifying selection: an improved human mito-
chondrial molecular clock. Am J Hum Genet 84:740–759.
Tishkoff SA, Gonder MK, Henn BM, Mortensen H, Knight A,
Gignoux C, Fernandopulle N, Lema G, Nyambo TB,
Ramakrishnan U, Reed FA, Mountain JL. 2007. History of
click-speaking Populations of Africa inferred from mtDNA and
Y chromosome genetic variation. Mol Biol Evol 24:2180–2195.
Tishkoff SA, Reed FA, Friedlaender FR, Ehret C, Ranciaro A,
Froment A, Hirbo JB, Awomoyi AA, Bodo JM, Doumbo O,
Ibrahim M, Juma AT, Kotze MJ, Lema G, Moore JH,
Mortensen H, Nyambo TB, Omar SA, Powell K, Pretorius GS,
Smith MW, Thera MA, Wambebe C, Weber JL, Williams SM.
2009. The genetic structure and history of Africans and Afri-
can Americans. Science 324:1035–1044.
Traill A, Nakagawa H. 2000. A historical !Xóõ-|Gui contact
zone: linguistic and other relations. In: Batibo H, Tsonope J,
editors. The state of Khoesan languages in Botswana. Gabor-
one: Basarwa Languages Project. p 1–17.
van Oven M, Kayser M. 2009. Updated comprehensive phyloge-
netic tree of global human mitochondrial DNA variation.
Hum Mutat 30:E386–E394.
Veeramah KR, Wegmann D, Woerner A, Mendez FL, Watkins
JC, Destro-Bisol G, Soodyall H, Louie L, Hammer MF. 2011.
An early divergence of KhoeSan ancestors from those of other
modern humans is supported by an ABC-based analysis of
autosomal re-sequencing data. Mol Biol Evol 29:617–630.
Venables WN, Ripley BD. 2002. MASS: modern applied statis-
tics with S. New York: Springer.
Verdu P, Becker NSA, Froment A, Georges M, Grugni V,
Quintana-Murci L, Hombert JM, Van der Veen L, Le Bomin
S, Bahuchet S, Heyer E, Austerlitz F. 2013. Sociocultural
behavior, sex-biased admixture, and effective population sizes
in Central African Pygmies and Non-Pygmies. Mol Biol Evol
Weiner JS, Harris R, Harrison GA, Singer R, Jopp W. 1964.
Skin Colour in Southern Africa. Hum Biol 36:294–&.
Widlok T. 1999. Living on Mangetti: ’Bushman’ autonomy and
Namibian independence. Oxford: Oxford University Press.
... Archaeological data reveal traces of an agricultural way of subsistence in Namibia, Zambia and Botswana around 2000-1200 years ago [1, 13, 14], which was preceded by a few centuries by an immigration of pastoralist cultures [15, 16]. Thus, in these areas, the presumably Bantu-speaking agriculturalist immigrants would have met both populations of hunter-gatherers as well as pastoralists, whose descendants comprise the linguistically, culturally, and genetically diverse "Khoisan" populations [17, 18]. ...
... One hundred and ninety seven sequences from Botswana, Namibia, and Angola were previously included in studies focusing on haplogroups L0d and L0k as well as on the prehistory of Khoisan populations [18,33], while a subset of 169 sequences from Zambia were included in [20]; the GenBank accession numbers of these samples can be found in Table S1. The remaining 446 sequences from Zambia and 170 sequences from Our dataset includes speakers of several Bantu languages belonging to both the Western and the Eastern branches of Bantu according to the classification found on glottolog 2.2 ( ...
... Supplementary Table S1 provides details on the country of sampling, ethnolinguistic affiliation, and GenBank accession number for each sample. We also included the Damara, who speak a Khoe language rather than a Bantu language, because of their known genetic proximity to Herero and Himba [17,18]. The rough geographic location of the 23 populations included in the study can be seen in Figure 1, while Table 1 summarizes the information on their country of origin, linguistic affiliation, and subsistence. ...
Bantu speech communities expanded over large parts of sub-Saharan Africa within the last 4000-5000 years, reaching different parts of southern Africa 1200-2000 years ago. The Bantu languages subdivide in several major branches, with languages belonging to the Eastern and Western Bantu branches spreading over large parts of Central, Eastern, and Southern Africa. There is still debate whether this linguistic divide is correlated with a genetic distinction between Eastern and Western Bantu speakers. During their expansion, Bantu speakers would have come into contact with diverse local populations, such as the Khoisan hunter-gatherers and pastoralists of southern Africa, with whom they may have intermarried. In this study, we analyze complete mtDNA genome sequences from over 900 Bantu-speaking individuals from Angola, Zambia, Namibia, and Botswana to investigate the demographic processes at play during the last stages of the Bantu expansion. Our results show that most of these Bantu-speaking populations are genetically very homogenous, with no genetic division between speakers of Eastern and Western Bantu languages. Most of the mtDNA diversity in our dataset is due to different degrees of admixture with autochthonous populations. Only the pastoralist Himba and Herero stand out due to high frequencies of particular L3f and L3d lineages; the latter are also found in the neighboring Damara, who speak a Khoisan language and were foragers and small-stock herders. In contrast, the close cultural and linguistic relatives of the Herero and Himba, the Kuvale, are genetically similar to other Bantu-speakers. Nevertheless, as demonstrated by resampling tests, the genetic divergence of Herero, Himba, and Kuvale is compatible with a common shared ancestry with high levels of drift and differential female admixture with local pre-Bantu populations.
... It is likely that the tripartite structure between populations residing in the northwestern Kalahari, central Kalahari and South Africa was strongly influenced by ecological and climatic factors shaping the southern African landscape during the last 100 ky. As the ongoing dry period of the Kalahari Basin only started at around ~10 kya, Barbieri et al. (2014) have suggested that the ancient lake Makgadikgadi in Botswana acted as an important barrier to gene flow between populations from the northwest and southeast. During about 120 ky before the onset of the current climate, the Makgadikgadi mega-lakes might indeed have been a formidable obstacle, occasionally covering an area as wide as 66,000 square kilometers which encompassed present-day lake Ngami, the Mababe Depression, lake Liambezi and the Makgadikgadi pans (Mendelsohn et al. 2010). ...
... Contact relations appear to have been especially dense along the northern and eastern Kalahari Basin fringe, with the Khwe, Shua, Tshwa and Gǁana all displaying considerable Bantu-related autosomal ancestries (Fig. 6A). The genetic profile of the Khwe from the Okavango River Basin suggests a particular type of leveled interaction in which gene-flow with southwestern Bantu peoples is symmetrically reflected in both mtDNA and Y-chromosome lineages (Bajić et al. 2018; Barbieri et al. 2014). Our own ethnographic observations in Namibia and Botswana confirm the close relationship between the Khwe and their Bantu-speaking neighbors. ...
The present-day diversity of southern African populations was shaped by the confluence of three major pre-historic settlement layers associated with distinct linguistic strata: i) an early occupation by foragers speaking languages of the Kx'a and Tuu families; ii) a Late Stone Age migration of pre-Bantu pastoralists from eastern Africa associated with Khoe-Kwadi languages; iii) the Iron Age expansion of Bantu-speaking farmers from West-Central Africa who reached southern Africa from the western and eastern part of the continent. Uniting data and methodologies from linguistics and genetics, we review evidence for the origins, migration routes and internal diversification patterns of all three layers. By examining the impact of admixture and sex-biased forms of interaction, we show that southern Africa can be characterized as a zone of high contact between foraging and food-producing communities, involving both egalitarian interactions and socially stratified relationships. A special focus on modern groups speaking languages of the Khoe-Kwadi family further reveals how contact and admixture led to the generation of new ethnic identities whose diverse subsistence patterns and cultural practices have long puzzled scholars from various disciplines.
... Similarly, we use the term "Bantu" to refer to the language family that is diffused over vast areas of sub-Saharan Africa (Williamson and Blench 2000), without any racial connotation. Southern African Khoisan groups are known to harbor a remarkable level of genetic variability both for autosomal loci (Pickrell et al. 2012; Schlebusch et al. 2013) and mtDNA sequences (Barbieri et al. 2013, 2014a). However, very little is known about the Y-chromosomal variation in Khoisan groups, as previous studies included data from only a few such populations (Wood et al. 2005; Soodyall et al. 2008; Henn et al. 2011), missing most of the cultural and linguistic diversity subsumed under this generic label (Barnard 1992; Güldemann 2014). ...
... Bar-coded Illumina sequencing libraries prepared previously (Barbieri et al. 2013, 2014a) were enriched for ~500 kb of target NRY sequence using the Agilent Array and methods described previously (Lippold et al. 2014). Reads were generated from 7.5 lanes of the Illumina GAII (Solexa) ...
The recent availability of large-scale sequence data for the human Y chromosome has revolutionized analyses of and insights gained from this non-recombining, paternally inherited chromosome. However, the studies to date focus on Eurasian variation, and hence the diversity of early-diverging branches found in Africa has not been adequately documented. Here we analyze over 900 kb of Y chromosome sequence obtained from 547 individuals from southern African Khoisan and Bantu-speaking populations, identifying 232 new sequences from basal haplogroups A and B. We find new branches within haplogroups A2 and A3b1 and suggest that the prehistory of haplogroup B2a is more complex than previously suspected; this haplogroup is likely to have existed in Khoisan groups before the arrival of Bantu-speakers, who brought additional B2a lineages to southern Africa. Furthermore, we estimate older dates than obtained previously for both the A2-T node within the human Y chromosome phylogeny and for some individual haplogroups. Finally, there is pronounced variation in branch length between major haplogroups; haplogroups associated with Bantu-speakers have significantly longer branches. This likely reflects a combination of biases in the SNP calling process and demographic factors, such as an older average paternal age (hence a higher mutation rate), a higher effective population size, and/or a stronger effect of population expansion for Bantu-speakers than for Khoisan groups.
... Genome-wide coalescent analyses suggest that ancient populations began to take structure 200 kya, which led to a rift between Khoisan and non-hunter-gatherer groups (i.e., Niger-Congo, Nilo-Saharan, Afro-Asiatic) by 160 kya, followed shortly by a split between Khoisan and RFHG groups 120-100 kya [26,27]. Mitochondrial studies have reinforced this pattern of Stone Age divergences and subsequent admixture amongst rainforest [28] and Khoisan [29,30] hunter-gatherer groups. Much of the past 200 thousand years of human evolution has therefore been a story of population structuring and diffusion within the continent of Africa. ...
... For example, Khoisan ancestors are thought to be the outgroup to other modern humans, yet L0 (the mtDNA outgroup) is found among many African populations; however, the two subclades L0d and L0k are comprised almost entirely (82% and 83% respectively) of Khoisan-speakers. Typically, this is interpreted to reinforce a correspondence between Khoisan and L0 [30,43,52]. The high population diversity of other L0 subclades may represent ancient admixture with those groups, and the specificity of Khoisan-speakers in L0d/L0k may represent the drift within this shrinking population. ...
Archaeological and genomic evidence suggest that modern Homo sapiens have roamed the planet for some 300–500 thousand years. In contrast, global human mitochondrial (mtDNA) diversity coalesces to one African female ancestor (“Mitochondrial Eve”) some 145 thousand years ago, owing to the ¼ gene pool size of our matrilineally inherited haploid genome. Therefore, most of human prehistory was spent in Africa where early ancestors of Southern African Khoisan and Central African rainforest hunter-gatherers (RFHGs) segregated into smaller groups. Their subdivisions followed climatic oscillations, new modes of subsistence, local adaptations, and cultural-linguistic differences, all prior to their exodus out of Africa. Seven African mtDNA haplogroups (L0–L6) traditionally captured this ancient structure—these L haplogroups have formed the backbone of the mtDNA tree for nearly two decades. Here we describe L7, an eighth haplogroup that we estimate to be ~ 100 thousand years old and which has been previously misclassified in the literature. In addition, L7 has a phylogenetic sublineage L7a*, the oldest singleton branch in the human mtDNA tree (~ 80 thousand years). We found that L7 and its sister group L5 are both low-frequency relics centered around East Africa, but in different populations (L7: Sandawe; L5: Mbuti). Although three small subclades of African foragers hint at the population origins of L5'7, the majority of subclades are divided into Afro-Asiatic and eastern Bantu groups, indicative of more recent admixture. A regular re-estimation of the entire mtDNA haplotype tree is needed to ensure correct cladistic placement of new samples in the future.
... Since patrilocality is a frequent postmarital residence pattern among human societies, it is common to find women moving to the husband's ancestral territory [30][31][32]. Furthermore, it has been observed that in areas where farmers have a dominant position over foragers, genetic admixture is often sex-biased involving farmer males and forager females [33][34][35]. More specifically, a higher movement of women among groups reduces population differentiation on the maternally inherited mitochondrial DNA (mtDNA) and patrilocality will lead to an increase in population differentiation on the paternally inherited Y-chromosome (MSY) genetic variation. ...
Northwestern Amazonia is home to a great degree of linguistic diversity, and the human societies in that region are part of complex networks of interaction that predate the arrival of Europeans. This study investigates the population and language contact dynamics between two languages found within this region, Yukuna and Tanimuka, which belong to the Arawakan and Tukanoan language families, respectively. We use evidence from linguistics, ethnohistory, ethnography and population genetics to provide new insights into the contact dynamics between these and other human groups in NWA. Our results show that the interaction between these groups intensified in the last 500 years, to the point that it is difficult to differentiate between them genetically. However, this close interaction has led to more substantial contact-induced language changes in Tanimuka than in Yukuna, consistent with a scenario of language shift and asymmetrical power relations.
... Their presence and interaction with the Bantu-speaking communities in Southern Africa span centuries. Archaeological discoveries show evidence of some pottery and animal remains dating to 2000 years ago and link these remains to those discovered in Eastern African communities dating to 4000 years ago (Barbieri et al. 2014:1). The first archaeological evidence, from 2000 years ago, relates to the period of the early gentile house churches that Paul salutes in some of his epistles. ...
Paul usually ends his letters with salutations to believers who meet in someone else’s house. Far from being individualistic, these greetings also include people from different house churches. Considered from a functional angle, these greetings cement relationships between house churches. Within an ubuntu worldview, the oral praxis of sereto (Sepedi) or isiduko (IsiXhosa) (praise-poetry) establishes and confirms relationships between members of the same community (family, clan or tribe). The question is how such praxes affect women who belong to such communities. Contribution: This article is a comparative analysis of how some of the salutations used at the end of some of Paul’s epistles touch on gender relations in the same way as the ubuntu oral praxis of sereto or isiduko touches on gender relations among members of a community (family, clan or tribe).
... Contact among Bantu farmers, indigenous San hunter-gatherers, and Khoe pastoralists is indicated by isolated Shongwe and Bisoli pots at Late Stone Age sites and by genetic studies that have revealed intermarriage between the three communities around the Sua Pan (Barbieri et al. 2014; Pickrell et al. 2012). Small flaked stone tools, ostrich eggshell beads, and composite bone arrows, which are typical Late Stone Age and Khoesan artifacts, have also been found on Early Iron Age sites. ...
Agro-pastoralists arrived in Botswana in the 6th and 7th centuries as five groups of the Urewe Tradition of the Bantu migrations: two of the Nkope Branch or Central Stream, two of the Kwale Branch or Eastern Stream, and a fifth that settled in the zone of contact between these two branches. A sixth group had links with the Naviundu Tradition in the southeastern Democratic Republic of Congo. There is no evidence of a Western Stream of the Bantu migrations, also known as the Kalundu Tradition, in Botswana. During the second phase of the Early Iron Age, c. 800-1000 CE, east-central Botswana was dominated by the Zhizo culture.
The colonial-period arrival of Europeans in southern Africa is associated with strong sex-biased migration by which male settlers displaced indigenous Khoekhoe and San men. Simultaneously, the importation of South Asian, Indonesian and Eastern African slaves may have contributed female-biased migration to Cape Town and surrounding areas. We examine the spatial and temporal spread of sex-biased migration from the Cape northward into Namaqualand and the southern Kalahari using genetic data from more than 1,400 individuals. In all regions, admixture patterns were sex-biased, with evidence of a greater male contribution of European ancestry and greater female contribution of Khoe-San ancestry. While admixture among Khoe-San, European, equatorial African, and Asian groups has likely been continuous from the founding of Cape Town to present-day, we find that Khoe-San groups further north experienced a single pulse of European admixture 6-8 generations ago. European admixture was followed by additional Khoe-San gene flow, potentially reflecting an aggregation of indigenous groups due to disruption by colonial interlopers. Male migration into the northern frontier territories was not a homogenous group of expanding Afrikaners and slaves. The Nama show evidence of distinct founder effects and derive 15% of their male lineages from Asian men, a pattern absent in the ≠Khomani San. Khoe-San ancestry from the paternal line is greatly diminished in populations from Cape Town, the Cederberg Mountains and Upington, but remains more frequent in self-identified ethnically indigenous groups. Strikingly, we estimate that Khoe-San Y-chromosomes were experiencing unprecedented population growth at the time of European arrival. Our findings shed light on the patterns of admixture and the population history of South Africa as the colonial frontier expanded.
As the ancestral homeland of our species, Africa contains elevated levels of genetic diversity and substantial population structure. Importantly, African genomes are heterogeneous: they contain mixtures of multiple ancestries, each of which have experienced different evolutionary histories. In this review, we view population genetics through the lens of admixture, highlighting how multiple demographic events have shaped African genomes. Each of these historical vignettes paints a recurring picture of population divergence followed by secondary contact. First, we give a brief overview of African genetic variation and examine deep population structure within Africa, including evidence of ancient introgression from archaic "ghost" populations. Second, we describe the genetic legacies of admixture events that have occurred during the past 10,000 years. This includes gene flow between different click-speaking Khoe-San populations, the stepwise spread of pastoralism from eastern to southern Africa, multiple migrations of Bantu speakers across the continent, as well as admixture from the Middle East and Europe into the Sahel region and North Africa. Furthermore, the genomic signatures of more recent admixture can be found in the Cape Peninsula and throughout the African diaspora. Third, we highlight how natural selection has shaped patterns of genetic variation across the continent, noting that gene flow provides a potent source of adaptive variation and that selective pressures vary across Africa. Finally, we explore the biomedical implications of African population genetic structure on health and disease and call for more ethically conducted studies of African genetic variation.
The Khoisan were decimated, dispossessed and assimilated into the mixed-race “Coloured” group during colonialism and apartheid, spawning the notion of their supposed extinction. However, Cape Town, where colonial history runs deepest, became the epicentre of “Khoisan revivalism” after apartheid. Khoisan revivalists reject Coloured identity and campaign for cultural development, historical justice and indigenous rights. Many also claim land and traditional leadership titles. Drawing on ethnographic fieldwork among Khoisan revivalists, my PhD dissertation scrutinises Khoisan revivalism’s origins, appeal and political aspirations. It focuses on the various ways that historical events, figures and practices inform diverse articulations of indigeneity. Khoisan revivalists are primarily seeking a relatable connection with the past and select sources, mediums and content accordingly. Moreover, in simultaneously replicating, disregarding and appropriating colonial representations, they produce a “subversive authenticity”. While empowering to many, Khoisan revivalism has also emboldened some to mobilise a racialised identity politics based on prior occupancy, which today extends beyond the movement.
The Khoisan are a cluster of southern African peoples, including the famous Bushmen or San 'hunters', the Khoekhoe 'herders' (in the past called 'Hottentots'), and the Damara, also a herding people. Most Khoisan live in the Kalahari desert and surrounding areas of Botswana and Namibia. In spite of differences in their way of life, the various groups have much in common, and this book explores these similarities and the influence of environment and history on aspects of Khoisan culture. This is the first book on the Khoisan as a whole since the publication in 1930 of The Khoisan Peoples of South Africa, by Isaac Schapera, doyen of southern African studies.
A guide to using S environments to perform statistical analyses providing both an introduction to the use of S and a course in modern statistical methods. The emphasis is on presenting practical problems and full analyses of real data sets.
G/wi society and culture have been shaped by the rugged natural environment. The volume focusses on the interrelationships, the socio-cultural system and habitat of the hunter-gatherer G/wi bushmen of the central Kalahari Desert of Botswana. Drawing on ten years of field-experience, the author sets out the foundations of G/wi society, with descriptions of their social, political and economic organisation, living patterns, subsistence technology, and seasonal adaptations. -John Sheail
David Phillipson presents an illustrated account of African prehistory, from the origins of humanity through European colonization in this revised and expanded edition of his original work. Phillipson considers Egypt and North Africa in their African context, comprehensively reviewing the archaeology of West, East, Central and Southern Africa. His book demonstrates the relevance of archaeological research to understanding contemporary Africa and stresses the continent's contribution to the cultural heritage of humankind.