Complex population structure in African village dogs and its implications for inferring dog domestication history

Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA.
Proceedings of the National Academy of Sciences (Impact Factor: 9.67). 09/2009; 106(33):13903-8. DOI: 10.1073/pnas.0902129106
Source: PubMed
High genetic diversity of East Asian village dogs has recently been used to argue for an East Asian origin of the domestic dog. However, global village dog genetic diversity and the extent to which semiferal village dogs represent distinct, indigenous populations instead of admixtures of various dog breeds have not been quantified. Understanding these issues is critical to properly reconstructing the timing, number, and locations of dog domestication. To address these questions, we sampled 318 village dogs from 7 regions in Egypt, Uganda, and Namibia, measuring genetic diversity across 680 bp of the mitochondrial D-loop, 300 SNPs, and 89 microsatellite markers. We also analyzed breed dogs, including putatively African breeds (Afghan hounds, Basenjis, Pharaoh hounds, Rhodesian ridgebacks, and Salukis), Puerto Rican street dogs, and mixed-breed dogs from the United States. Village dogs from most African regions appear genetically distinct from non-native breed and mixed-breed dogs, although some individuals cluster genetically with Puerto Rican dogs or United States breed mixes instead of with neighboring village dogs. Thus, African village dogs are a mosaic of indigenous dogs descended from early migrants to Africa and non-native, breed-admixed individuals. Among putatively African breeds, Pharaoh hounds and Rhodesian ridgebacks clustered with non-native rather than indigenous African dogs, suggesting they have predominantly non-African origins. Surprisingly, we find similar mtDNA haplotype diversity in African and East Asian village dogs, potentially calling into question the hypothesis of an East Asian origin for dog domestication.


Adam R. Boyko, Ryan H. Boyko, Corin M. Boyko, Heidi G. Parker, Marta Castelhano, Liz Corey, Jeremiah D. Degenhardt, Adam Auton, Marius Hedimbi, Robert Kityo, Elaine A. Ostrander, Jeffrey Schoenebeck, Rory J. Todhunter, Paul Jones, and Carlos D. Bustamante
Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853;
Department of Anthropology and Graduate Group in Ecology, University of California, Davis, CA 95616;
National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892;
Department of Clinical Sciences and the Medical Genetic Archive, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853;
Department of Biological Sciences, University of Namibia, Windhoek, Namibia;
Department of Zoology, Makerere University, Kampala, Uganda; and
The Waltham Centre for Pet Nutrition, Waltham on the Wolds, Leicestershire LE14 4RT, United Kingdom
Edited by Tomoko Ohta, National Institute of Genetics, Mishima, Japan, and approved June 12, 2009 (received for review February 26, 2009)
Canis familiaris | microsatellites | principal component analysis | single nucleotide polymorphisms
In many respects, dogs have a unique relationship to humans. They
were the first domesticated species, serve as valuable companions
and service animals, and have been bred to exhibit more phenotypic
diversity than any other mammal (1–3). Dogs were probably
domesticated from Eurasian wolves at least 15,000–40,000 years
ago (4–6), although the process by which domestication took place,
including the specific selected traits and the manner in which
selection was performed, is very poorly understood (7, 8).
After domestication somewhere in Eurasia, dogs quickly spread
throughout the continent and into Africa, Oceania and the Amer-
icas (9). These early dogs, like modern day ‘‘village dogs’’ (7), almost
certainly lived as human commensals that were not subject to the
same degree of intense artificial selection and closed breeding
practices that characterize modern dog breeds. Like ancient human
populations, these ancient dog populations developed genetic signatures
characteristic of their geographic locale. These signatures
would persist in both modern day village dog populations that
descend from these ancient populations and in dog breeds that were
founded from them. We refer to such dogs as ‘‘indigenous’’ in the
sense that they carry characteristic genetic signatures appropriate
for their geographic region.
Today, semiferal village dogs are nearly ubiquitous around human
settlements in much of the world, and such animals comprise a large
proportion of the global dog population (7). However, the popularity of
modern breeds has led to the widespread transport of mostly European-
derived breed dogs into many areas containing village dogs, so it is likely
that many modern village dogs are not derived solely from indigenous
ancestors. We refer to village dogs that descend from these foreign dogs
as ‘‘non-native’’ and expect that genetic markers can differentiate these
village dogs from indigenous dogs. We believe most of these dogs will
be complex mixtures of several non-native breeds and/or mixtures of
both non-native breeds and indigenous village dogs (‘‘intermediate’’).
The distinction between indigenous and non-native dogs is
important because indigenous, but not non-native, village dogs are
likely to contain genetic variants that are not found in any of today’s
400 recognized dog breeds. Furthermore, they are expected to be
more informative regarding dog population history and are likely to
be more adapted to local environmental conditions and more
genetically related to the first prebreed domestic dogs than breed
or breed-admixed individuals. To our knowledge, the degree to
which village dogs consist of indigenous versus non-native individ-
uals has not been quantified.
In one of the most comprehensive surveys of village and breed
dogs to date, Savolainen et al. (6) examined mtDNA diversity in a
global panel of 654 dogs. Their results confirmed previous mtDNA
evidence of dog domestication from Eurasian wolves (5) and showed
that East Asian dogs had the highest mtDNA diversity of any
region, suggesting an East Asian origin of domestication. However,
subsequent work by Pires et al. (10) has shown that mtDNA does
not show significant population structure in village dogs. Because
Savolainen et al. included many East Asian village dogs but few
village dogs from other regions, their conclusion of high levels of
East Asian diversity is likely a consequence of high levels of
mitochondrial diversity in village dogs and not necessarily an
indication of East Asian domestication.
Author contributions: A.R.B., R.H.B., C.M.B., M.C., M.H., and C.D.B. designed research;
A.R.B., R.H.B., C.M.B., H.G.P., L.C., J.D.D., M.H., J.S., and P.J. performed research; H.G.P.,
M.C., L.C., J.D.D., A.A., R.K., E.A.O., R.J.T., and P.J. contributed new reagents/analytic tools;
A.R.B., R.H.B., H.G.P., A.A., and P.J. analyzed data; and A.R.B., R.H.B., and C.D.B. wrote the
paper.
Conflict of interest statement: For some of this project, we utilized the Wisdom MX product
(MARS Inc.) for detecting breed-admixed ancestry. P.G.J. was an employee of MARS
overseeing Wisdom development, C.D.B. was a paid consultant for MARS during its
development, and E.A.O. is a licenser of the patent.
This article is a PNAS Direct Submission.
Data deposition: The sequences reported in this paper have been deposited in the GenBank
database (accession nos. GQ375164–GQ375213).
To whom correspondence should be addressed. E-mail:
This article contains supporting information online at
www.pnas.org/cgi/doi/10.1073/pnas.0902129106
PNAS, August 18, 2009, vol. 106, no. 33
Other genetic markers have been shown to exhibit significant
population structure in village dogs. Microsatellites and MHC types
both separate Bali street dogs from New Guinea singing dogs,
dingoes, and breed dogs (11, 12). Both studies demonstrated high
diversity in the Bali dogs, consistent either with an indigenous,
prebreed ancestry or with a complex admixture history from a large
number of breeds. Therefore, given a large enough sample of village
and breed dogs, microsatellite and single nucleotide polymorphism
(SNP) markers seem well suited to studying population structure
and the possibility of breed admixture in village dogs.
In this study, we analyzed mtDNA, microsatellite, and SNP
markers in 318 African village dogs to characterize population
structure and genetic diversity. In addition, we analyzed 16 Puerto
Rican street dogs, 102 known mixed-breed dogs from the United
States, and several hundred dogs from 126 breeds, including 129
dogs from five African and Middle Eastern breeds, to determine the
degree of non-native admixture in African village dogs. Our sampling
effort concentrated on seven regions from three geographically
separated African countries (Fig. 1):
Egypt: We sampled three
distinct locales: a Giza animal shelter, a Luxor animal shelter and
surrounds, and a rural desert oasis (Kharga). Although the geo-
graphic distance between Giza and Luxor is greater than that
between Kharga and Luxor, we hypothesized that the desert would
be a strong barrier to gene flow, making the latter populations more
genetically distinct.
Uganda: We sampled 100 dogs from a cluster of villages east
of Kampala and 30 dogs from three neighboring isles of the Kome
Island group in Lake Victoria. Despite the islands being close to
each other and the mainland (20 km), we expected the lake might
act as a dispersal barrier.
Namibia: We sampled from over a dozen villages and urban areas
in the northern and central parts of the country. No natural
dispersal barriers existed between sampling locations, although a
cordon fence is maintained to keep livestock diseases out of the
southern part of the country. Dogs are permitted to be taken across
the cordon and likely have little difficulty getting through the fence
themselves, but the cordon is significant in that it demarcates the
extent of European colonization influence in the country [with
southern and central Namibia colonization history being roughly
similar to that of South Africa while northern Namibia resembles
the rest of sub-Saharan Africa (13)]. We sampled dogs within 100
km of both sides of the cordon, including populations within 10–20
km of the barrier.
For comparison, we also sampled from two shelters in Puerto
Rico, known mixed-breed dogs (see Methods) from the United
States, and dogs from 126 breeds, including five African and
near-African breeds (putative origin in parentheses): Afghan
hounds (Sinai, Egypt), Basenjis (Congo), Pharaoh hounds (near
Mediterranean), Rhodesian ridgebacks (Zimbabwe), and Salukis
Results
Inference of Population Structure and Degree of Breed Admixture in African Village Dogs. A subset of 223 unrelated African village dogs
from seven African locales were typed on a panel of 89 microsatellite
markers or 300 SNP markers (206 village dogs, 15 Puerto
Rican dogs, and two United States mixed-breed dogs were typed on
both panels). Using the Bayesian clustering program STRUCTURE
(14), we found that Puerto Rican street dogs clustered with
the mixed-breed dogs from the United States, indicating these dogs
are all breed admixtures. STRUCTURE analysis at K = 5 consistently
showed the same five groupings: Egyptian dogs, Ugandan
mainland dogs, Kome Island dogs, Northern Namibian dogs, and
admixed dogs (including all Puerto Rican and U.S. dogs, nearly all
Central Namibian dogs, and a few other African village dogs; Fig.
2 and Fig. S1). At K = 4, STRUCTURE clustered Ugandan dogs
together (mainland and Kome Islands), and at K > 5, STRUCTURE
subdivided Ugandan dogs further, although these clusters
were inconsistent (Fig. S2).
We quantified admixture in each village dog as the mean
proportion of the genome assigned to the American (United
States and Puerto Rico) cluster by STRUCTURE across 10 runs
at K = 5 (admixture estimates using K = 4 or K = 6 mean
proportions were nearly identical; R² = 0.984 and 0.992,
respectively). In total, 84% of African village dogs outside of
central Namibia showed little or no evidence of non-native
admixture (estimated admixture proportion <25% in 152 of
181 dogs), whereas all central Namibian dogs had >25%
admixture, and most had >60% (24 of 25; Table 1). Principal
component analysis showed a clear separation of Egyptian
from sub-Saharan populations in PC1 and separation between
Ugandan and Namibian populations in PC2 for indigenous
African village dogs for both SNP and microsatellite markers
(Fig. 3). When admixed African and American dogs were
included, PCA, like STRUCTURE, always clustered them
together, and the interpretation of the principal components
became more complicated (Fig. S2).
Fig. 1. Map of village dog sampling locations. Colors denote each distinct
region and dots show approximate range of sampling within each region. See
Table S1 for full description.
Fig. 2. STRUCTURE analysis across 389 SNP and microsatellite loci in African
village and American mixed breed dogs.
Table 1. Number of indigenous (<25% inferred admixture),
uncertain (25%–60% inferred admixture) and breed admixed
(>60% inferred admixture) village dogs by region from the 223
unrelated genotyped dogs
Country      Region    Indigenous  Uncertain  Admixed
Egypt        Giza               7          4        0
Egypt        Luxor             25          0        0
Egypt        Kharga             5          0        0
Uganda       mainland          34          4        7
Uganda       isles             19          3        0
Namibia      central            0          1       24
Namibia      north             62          7        4
Puerto Rico                     0          0       15
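The classification summarized in Table 1 can be sketched in a few lines. The following is a minimal illustration and not the authors' pipeline: it averages a dog's non-native cluster proportion over STRUCTURE runs, applies the <25% and >60% cut-offs, and recomputes the 84% figure from the Table 1 counts. The function names and the per-run values are hypothetical; only the thresholds and the counts come from the text.

```python
from statistics import mean

# Cut-offs taken from Table 1: <25% indigenous, 25%-60% uncertain, >60% admixed.
INDIGENOUS_MAX = 0.25
ADMIXED_MIN = 0.60


def mean_admixture(run_proportions):
    """Average one dog's non-native cluster proportion across runs."""
    return mean(run_proportions)


def classify(admixture):
    """Map a mean admixture proportion to a Table 1 category.

    Assumption: values exactly on a cut-off are treated as uncertain.
    """
    if admixture < INDIGENOUS_MAX:
        return "indigenous"
    if admixture > ADMIXED_MIN:
        return "admixed"
    return "uncertain"


# Table 1 counts (indigenous, uncertain, admixed) for the African regions.
TABLE_1 = {
    ("Egypt", "Giza"): (7, 4, 0),
    ("Egypt", "Luxor"): (25, 0, 0),
    ("Egypt", "Kharga"): (5, 0, 0),
    ("Uganda", "mainland"): (34, 4, 7),
    ("Uganda", "isles"): (19, 3, 0),
    ("Namibia", "central"): (0, 1, 24),
    ("Namibia", "north"): (62, 7, 4),
}


def indigenous_fraction(table, exclude=()):
    """Return (indigenous count, total count, fraction) over the kept regions."""
    kept = [counts for region, counts in table.items() if region not in exclude]
    indigenous = sum(counts[0] for counts in kept)
    total = sum(sum(counts) for counts in kept)
    return indigenous, total, indigenous / total


if __name__ == "__main__":
    # Hypothetical per-run proportions for a single dog (illustrative only).
    runs = [0.12, 0.10, 0.15, 0.11, 0.13, 0.12, 0.14, 0.10, 0.12, 0.11]
    print(classify(mean_admixture(runs)))
    print(indigenous_fraction(TABLE_1, exclude={("Namibia", "central")}))
```

How dogs falling exactly on 25% or 60% were handled is not stated in the text, so the boundary treatment above is an assumption.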
To clarify the relationship between the Puerto Rican and African
dogs that clustered with the two known mixed-breed dogs geno-
typed on the full 389 marker panel, we ran STRUCTURE on the
300-SNP dataset with an additional 100 known breed-admixed dogs
from the United States that were genotyped on this SNP panel (Fig.
S3). The groupings of African dogs and the inference of non-native
admixed individuals are highly consistent with the earlier analyses
until K 5, when STRUCTURE starts to detect groupings within
the admixed individuals. The substructure found within admixed
individuals may be a consequence of different ancestral breeds in
different individuals; STRUCTURE analysis of the village dogs
and dogs from 126 breeds shows that the putatively indigenous
village dogs cluster with ancient breeds (specifically Basenjis) while
the putatively non-native dogs cluster with modern breed groups in
various proportions (Fig. S4).
F_ST calculations confirm that central Namibian dogs show virtually
no genetic differentiation from American dogs (pairwise F_ST
based on SNP markers = 0.011; microsatellite F_ST = 0.0025). The
pairwise F_ST between Egyptian dogs from Giza and Luxor was also
low (SNP F_ST = 0.0024; microsatellite F_ST = 0.0057), whereas other
village dog populations had pairwise F_ST values of 0.025–0.133
(Table 2). Dogs from Kharga were the most distinct (F_ST =
0.0735–0.133), whereas dogs from mainland Uganda and northern
Namibia (2,900 km apart) show only moderate differentiation
(F_ST = 0.0237–0.0254). Heterozygosity was high across all genetic
marker types in all village dog populations except those of the
Kharga oasis and the Kome islands and low in all of the breed dogs
(Table 3).
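For readers unfamiliar with the statistic, a pairwise F_ST compares heterozygosity within populations to heterozygosity in the pooled pair. The sketch below is one simple way to compute it for biallelic SNPs, using F_ST = (H_T − H_S)/H_T summed over loci. The estimator the authors used is not stated in this section, so the choice of formula, the equal weighting of the two populations, the absence of a sample-size correction, and the function names are all assumptions made for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def pairwise_fst(freqs_a, freqs_b):
    """F_ST = (H_T - H_S) / H_T, computed as a ratio of sums over loci.

    freqs_a, freqs_b: per-locus frequencies of the same allele in two
    populations. Populations are weighted equally (an assumption).
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need the same number of loci")
    h_within = 0.0
    h_total = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        # Mean within-population heterozygosity at this locus.
        h_within += 0.5 * (heterozygosity(p_a) + heterozygosity(p_b))
        # Heterozygosity of the pooled population at this locus.
        h_total += heterozygosity(0.5 * (p_a + p_b))
    if h_total == 0.0:
        return 0.0  # no variation anywhere, so no measurable differentiation
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies at three SNPs in two populations.
    print(pairwise_fst([0.60, 0.30, 0.85], [0.40, 0.35, 0.80]))
```

Identical frequencies give 0 and fixed alternative alleles give 1, which brackets the values of roughly 0.01 to 0.13 reported in Table 2.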
Origin of Putatively African Breeds. We included individuals from
five breeds with presumed African or Middle Eastern ancestry in
our principal component analyses to see whether this approach
could detect which sampled village dog populations are closest to
the founding population for each breed. For the SNP loci, PC1 and
PC2 differentiated three breed groups—Basenjis, Salukis/Afghan
hounds, and Rhodesian ridgebacks/Pharaoh hounds—while village
dogs were clustered closer to the origin (Fig. 4). Notably, the village
dog cluster still exhibited geographical structuring with Egyptian
village dogs lying closest to the Saluki/Afghan hound cluster,
indigenous Namibian and Ugandan dogs lying closest to the Basenji
cluster, and breed-admixed Namibian and American dogs lying
closest to the Rhodesian ridgeback/Pharaoh hound cluster. PCA of
the microsatellite loci revealed the same clustering affinities (Egyp-
tian village dogs nearest to Salukis/Afghan hounds, etc.) as the SNP
PCA although the breed clusters were less well defined (Fig. S5).
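The logic of a genotype PCA can be illustrated with a small self-contained sketch. It is not the software used in the study (which this section does not name): it centers a dogs-by-SNPs matrix of 0/1/2 genotype codes and extracts the first principal component by power iteration on the sample-by-sample Gram matrix. The toy genotypes, function names, and iteration count are hypothetical.

```python
import math


def center_columns(matrix):
    """Subtract each SNP's mean genotype from its column."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def gram(matrix):
    """Sample-by-sample matrix of dot products between centered rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]


def top_component(g, iterations=500):
    """Leading eigenvalue and unit eigenvector of g by power iteration."""
    n = len(g)
    v = [1.0 / (i + 1) for i in range(n)]  # fixed, non-uniform start vector
    for _ in range(iterations):
        w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variation in the data
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(g[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v


def pc1_scores(genotypes):
    """Return PC1 scores per sample and the fraction of variance explained."""
    g = gram(center_columns(genotypes))
    eigenvalue, v = top_component(g)
    total = sum(g[i][i] for i in range(len(g)))
    scores = [math.sqrt(max(eigenvalue, 0.0)) * x for x in v]
    explained = eigenvalue / total if total > 0 else 0.0
    return scores, explained


if __name__ == "__main__":
    # Hypothetical genotypes: three dogs from each of two populations, four SNPs.
    toy = [[2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],
           [0, 0, 2, 2], [0, 0, 2, 1], [0, 1, 2, 2]]
    scores, explained = pc1_scores(toy)
    print([round(s, 2) for s in scores], round(explained, 2))
```

In the toy data the two groups fall on opposite sides of PC1; the sign of the axis itself is arbitrary.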
Analysis of Mitochondrial Diversity. We sequenced 680 bp of the
mitochondrial D-loop, including the 582-bp region described in ref.
6. We found 47 haplotypes in the African dogs as well as 9 hap-
lotypes in the Puerto Rican dogs, two of which were also found in
the sampled United States mixed breed dogs (see Table S1 and
Table S3). All haplotypes were in the A (33 African haplotypes), B
(6 African haplotypes), or C (8 African haplotypes) clades (Fig. S6),
the clades that are believed to contain 95% of domestic dogs (6).
Over the region sequenced in (6) and ignoring indels, we found 18
African haplotypes that were not described by (6); 14 in A clade
[one of which was found in Africa by (10)], one in B clade, and three
in C clade. The Puerto Rican and United States mixed-breed dogs
had 8 A clade and one B clade haplotypes (only one haplotype, a
Puerto Rican A clade haplotype, was not previously described in ref.
Surprisingly, local mtDNA diversity did not differ systematically
between African regions and similarly sized regions in East Asia,
the purported origin of domestic dogs. Across the 582-bp region
analyzed in refs. 6 and 10, and this study, the number of haplotypes
observed in a region closely matches the neutral expectation (Fig.
5). Differences in regional haplotype diversity appear to be driven
by sampling artifacts rather than by distance from a hypothetical
domestication origin, with the highly sampled and fractionated
subpopulations of Japan exhibiting the most diversity, and nearby
Sichuan (China) probably exhibiting the least (Fig. 5). Neither
Africa nor East Asia appears to contain private haplogroups
(haplotypes that are highly differentiated from those found on other
continents; Fig. S6).
Fig. 3. Principal component analysis of indigenous African village dogs. (A) PCA
with the 89 microsatellite loci (n = 152). (B) PCA with the 300 SNP loci (n = 126).
Axes: PC1 (4.85% variance explained) and PC2 (3.99% variance explained) in one panel;
PC1 (6.09% variance explained) and PC2 (5.08% variance explained) in the other.
Table 2. Pairwise F_ST in village dogs between regions based on 300 SNPs
          Giza    Kharga  Luxor   NA_cent  NA_north  UG_isles  UG_main
Kharga    8.98%
Luxor     0.62%   8.31%
NA_cent   3.48%   12.52%  5.92%
NA_north  3.87%   10.91%  4.75%   4.75%
UG_isles  4.90%   13.27%  6.28%   4.70%    4.93%
UG_main   3.38%   11.78%  4.95%   4.44%    2.54%     3.75%
America   2.70%   12.86%  5.15%   1.14%    4.97%     5.00%     4.51%
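The neutral expectation referred to for Fig. 5 is the expected number of distinct haplotypes under Ewens's sampling formula, E(K) = Σ_{j=0}^{n−1} θ/(θ + j), where n is the number of dogs sampled. The sketch below evaluates that sum; it does not reproduce the nonlinear regression used to estimate θ. The value 8.654 is the estimate reported in the Fig. 5 legend, and the function and constant names are hypothetical.

```python
def expected_haplotypes(n, theta):
    """Expected number of distinct haplotypes in a sample of size n.

    Ewens's sampling formula under the infinite alleles model:
    E(K) = sum over j = 0 .. n-1 of theta / (theta + j).
    """
    if n < 0:
        raise ValueError("sample size must be non-negative")
    if theta <= 0:
        raise ValueError("theta must be positive")
    return sum(theta / (theta + j) for j in range(n))


THETA_HAT = 8.654  # estimate reported in the Fig. 5 legend

if __name__ == "__main__":
    # Sample sizes matching the x-axis ticks of Fig. 5.
    for n in (5, 10, 20, 40, 80, 160):
        print(n, round(expected_haplotypes(n, THETA_HAT), 2))
```

Because each added dog contributes a term smaller than the last, the curve flattens as n grows, which is why heavily sampled regions can show more haplotypes without being more diverse in any deeper sense.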
Discussion
This study analyzed a large number of genetic markers to characterize
the level of non-native admixture in a geographically widespread
set of semiferal village dog populations. African village dogs
exhibit complex population structure because of the effects of
geography, gene flow barriers, and the presence of non-indigenous
dogs in some populations. Notably, the vast majority of the African
village dogs could be classified as indigenous (<25% non-African
ancestry) or non-native (>60% non-African ancestry), with only
7% showing intermediate levels of African ancestry (Table 1).
Classification of individuals as indigenous versus non-native was
consistent between runs, and remained consistent even when the
number of mixed-breed dogs included in the analysis was substan-
tially increased (Fig. S3).
With two exceptions, African village dogs did not exhibit a
region-specific level of non-African admixture, but rather con-
tained dogs with completely indigenous ancestry (or nearly so) that
were often intermingling with a few highly admixed individuals. The
lack of consistent levels of admixture within regions suggests that
non-indigenous dog genes are quickly removed from village dog
populations, or that admixture with non-indigenous dogs is a very
recent phenomenon in these areas. The two exceptions were central
Namibia, where every dog had significant levels of non-indigenous
admixture (see below), and Giza, where all dogs showed some,
usually low, level of admixture. This background level of admixture
in Giza could reflect older mixing with breed dogs around this
ancient city, or it could simply reflect the relative proximity of Giza
to Eurasia, the ancestral home of most modern breed dogs.
STRUCTURE analyses including dogs from 126 breeds suggest it
is the latter—Egyptian dogs cluster partially with ancient (mostly
Asian) breeds and the sub-Saharan (Basenji and village dog) cluster
and do not appear to cluster significantly with any of the (mostly
European) modern breed groups (Fig. S4).
Dispersal barriers significantly affected population structure.
The 230 km of desert separating the Kharga oasis from Luxor led
to much stronger population differentiation (F_ST = 0.084) than the
500 km Nile corridor between Luxor and Giza (F_ST
Likewise, the Kome islands, which lie 10–20 km from the mainland
in Lake Victoria, were much more differentiated from mainland
Uganda than were northern Namibian populations 2,900 km away
(F_ST = 0.051 vs. F_ST = 0.033). Most surprising, the 20–100 km
distance between northern and central Namibian populations that
coincided with that country’s Red Line veterinary cordon fence
represented a stark population boundary: dogs north of the cordon
averaged 87% indigenous African ancestry while those south
of the cordon were only 9% African. The cordon has separated the
indigenous human populations (to the north) from white settlement
areas (to the south) for the last 100 years and is currently used to
restrict livestock (but not humans or dogs) from crossing southward
(13). During this time, indigenous dogs have apparently been
extirpated from central Namibia, and the selective pressures on
dogs in each region must be strong and disparate enough to
maintain a sharp genetic boundary along this porous chain-link
fence. That Puerto Rico also seems to contain few, if any, indigenous
dogs highlights the degree to which colonization history
affects dog populations.
Fig. 4. Principal component analysis of village dogs and dogs from 5 putatively
African and Middle Eastern breeds across 300 SNP markers in 186 village dogs and
105 breed dogs. Axes: PC1 (23.2% variance explained) and PC2 (14.5% variance explained).
Fig. 5. Number of haplotypes (excluding indels) versus number of dogs sampled
within African and East Asian geographic regions. Note log scale of x axis. East
Asian samples from (6); African samples from this study or (10). See Table S4 for
a list of the areas used to construct this figure. The blue line depicts the expected
number of haplotypes from Ewens’s sampling formula (29), which assumes an
infinite alleles model; $E(K) = \sum_{j=0}^{n-1} \theta/(\theta + j)$. Using Levenberg-Marquardt nonlinear
regression, we estimate $\theta$ to be 8.654 (95% C.I. [7.41, 9.89]).
Table 3. Gene diversity (expected heterozygosity) at 89 microsatellite markers, 300 SNP markers, and the mitochondrial D-loop in
African village dogs and five breeds
                       Microsatellites             SNPs                        mtDNA
Population             All dogs     Indigenous     All dogs     Indigenous     All dogs
Egypt (Giza)           0.677 (11)   0.684 (8)      0.438 (11)   0.438 (8)      0.890 (11)
Egypt (Luxor)          0.666 (25)   0.664 (24)     0.419 (25)   0.417 (24)     0.936 (26)
Egypt (Kharga)         0.553 (5)    0.553 (5)      0.360 (5)    0.360 (5)      0.427 (5)
Uganda (mainland)      0.669 (43)   0.660 (30)     0.432 (19)   0.424 (16)     0.901 (118)
Uganda (isles)         0.633 (20)   0.630 (16)     0.435 (20)   0.429 (16)     0.858 (30)
Namibia (north)        0.638 (71)   0.631 (60)     0.429 (61)   0.415 (50)     0.929 (91)
Namibia (central)      0.637 (25)   0              0.466 (18)   0              0.916 (28)
America                0.648 (17)   0              0.459 (18)   0              0.909 (17)
Afghan Hound           0.333 (5)                   0.317 (18)
Basenji                0.356 (5)                   0.184 (19)
Pharaoh Hound          0.217 (4)                   0.283 (16)
Rhodesian Ridgeback    0.353 (5)                   0.368 (28)
Saluki                 0.330 (5)                   0.355 (24)
Sample sizes are given in parentheses.
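Table 3 reports gene diversity (expected heterozygosity) for each population. As a rough illustration of what that quantity measures, the sketch below computes it at one locus from a list of sampled allele copies and averages it over loci. Whether the authors applied the n/(n − 1) small-sample correction is not stated in this section, so the use of that correction, the function names, and the example alleles are assumptions.

```python
from collections import Counter


def gene_diversity(alleles):
    """Expected heterozygosity at one locus with a small-sample correction.

    alleles: list of sampled allele copies (any hashable labels).
    Computes n / (n - 1) * (1 - sum of squared allele frequencies).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    counts = Counter(alleles)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - homozygosity)


def mean_gene_diversity(loci):
    """Average single-locus gene diversity over a list of loci."""
    return sum(gene_diversity(alleles) for alleles in loci) / len(loci)


if __name__ == "__main__":
    # Hypothetical microsatellite allele sizes at two loci in one population.
    locus_1 = [148, 150, 150, 152, 154, 154, 156, 150]
    locus_2 = [201, 201, 203, 205, 205, 205, 207, 209]
    print(mean_gene_diversity([locus_1, locus_2]))
```

The same function applies to mtDNA if each dog's haplotype is treated as a single allele copy, in which case the result is the haplotype diversity.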
STRUCTURE and principal component analysis revealed strikingly
similar patterns of genetic variation: indigenous African
dogs clearly clustered by country and away from non-indigenous
dogs in each analysis (Figs. 2–4). PCA showed slight differences
between the SNP and microsatellite results: SNP but not micro-
satellite markers led to PC1 separating out dogs based on admixture
(Fig. S2), although PCA with only indigenous African dogs resulted
in the same axes of variation in both sets (Fig. 3). Breeds were
clustered more cleanly with the SNP dataset than the microsatellite
dataset, although this result could be an effect of the larger number
of breed dogs that were typed on the SNP panel rather than a
consequence of using SNPs versus microsatellites per se (Fig. 4 and
Fig. S5). Nevertheless, both marker sets clustered Salukis and
Afghan hounds nearest to Egyptian village dogs and Basenjis
nearest to indigenous Ugandan and Namibian dogs, as expected by
each breed’s history. In contrast, Rhodesian ridgebacks and Pharaoh
hounds clustered nearest to admixed dogs, suggesting these
breeds have been recreated from admixture with non-African dogs.
These results are consistent with the STRUCTURE results from
(15, 16), showing that Salukis, Afghan hounds, and Basenjis cluster
with ancient, non-European breeds, while Pharaoh hounds and
Rhodesian ridgebacks do not. Although this coarse sampling (3
countries) is suitable for detecting truly indigenous versus recon-
stituted ancestry in putatively African breeds, analysis including
village dogs from more regions will be necessary to better localize
the ancestral origins of these breeds.
Village dog populations had higher levels of diversity than
purebred dogs across all markers (see (17) for purebred mtDNA
diversity estimates), although for SNP markers, non-native/admixed
dogs had even higher diversity estimates. The high heterozygosity
found in breed-admixed dogs is likely because of SNP ascertain-
ment; by preferentially genotyping SNPs that are highly polymor-
phic in breed dogs, inferences based on SNP diversity in village dogs
may be biased. Microsatellite ascertainment bias is less likely to
have this effect since even microsatellites that are highly polymorphic
in breeds can exhibit new alleles when genotyped in other
populations. This suggests that careful control of ascertainment, or
a denser SNP marker set that enables haplotype-based inference, is
desirable for SNP markers. However, the high degree of concor-
dance of SNP and microsatellite markers in both PCA and STRUC-
TURE analyses shows that these methods are robust to these
African village dogs exhibited a similar level of mitochondrial
D-loop diversity to that of the dogs sampled by (6) in East Asia, the
putative site of dog domestication. Although we do not suggest that
Africa is actually the site of dog domestication, we do believe that
an East Asian origin of dogs should be further scrutinized, espe-
cially as Africa also has numerous private haplotypes and East Asia
has no private haplogroups, with the possible exception of clade E,
which is poorly represented numerically (1 haplotype, 3 individuals)
and is rather similar to clade C. The data appear consistent with a
rapid spread of dogs after original domestication and high effective
population sizes and gene flow between continents, as there is no
clear signal of decreasing haplotype diversity away from any origin.
Interestingly, Ugandan and northern Namibian populations that
appear relatively undifferentiated using nuclear markers also have
large overlap in their mitochondrial sequences. Thus, long-distance
gene flow may be occurring, leading to a lower total number of
haplotypes in these areas, whereas areas in Egypt with less chance
for gene flow between them may harbor more diversity in the
aggregate. This underscores the need to design a sampling and
interpretation scheme to compare populations as opposed to coarse
geographic areas. These areas could have features such as islands
and deserts that may increase the number of haplotypes found only
because one is sampling multiple populations.
Besides the discovery of 18 haplotypes, we have also expanded
the geographic range of some previously reported dog mtDNA
haplotypes. For example, we found haplotype A29, the predomi-
nant mtDNA haplotype of Australian dingoes, in a Puerto Rican
dog even though this haplotype has never been reported in a dog
outside of East Asia or the American Arctic (18). Either Puerto
Rican dogs descend from some non-European (probably Asian)
dogs that still carry this haplotype, or this is an indigenous New
World haplotype that has persisted in Puerto Rico despite widespread
historical European admixture.
Our results clearly demonstrate the need for further research
with indigenous village dogs. Indigenous dog populations can be
largely eliminated, as in Puerto Rico and central Namibia, by
European colonization, and it is unclear the degree to which other
populations will be able to maintain their genetic identity and
persist in the face of modernity. The dog, although certainly a
species uniquely suited as a model organism for genomics, can also
serve as an invaluable organism for comparative studies of evolution
and adaptation. Like other domesticated animals (e.g., cats,
horses, and pigeons), dogs consist of breeds intensely selected for
specific traits and feral populations that have been left to adapt to
local conditions with ‘‘random’’ breeding. Dense genotyping and
resequencing in these species should reveal genes underlying domestication
in random-bred populations, instead of just those that
have been under strong artificial selection in breed animals, and
whether the relaxation of selective constraint observed in these
species (19) is a product of recent breeding practices or domesti-
cation per se. Resequencing in indigenous village dogs will also be
necessary to obtain markers free of ascertainment bias to estimate
the amount of genetic variation in dogs that is absent in existing
modern breeds, and the degree to which present-day indigenous
village dogs represent populations that have been randomly breed-
ing since dog domestication versus remnants of ancient, indigenous
Mitochondrial sequencing alone does not seem well-suited to
determining the timing and location of domestication. Dog mitochondrial
haplogroups seem more or less cosmopolitan, and inferences
based on mtDNA diversity statistics can be easily skewed by
sampling effort and misled by the inability to distinguish indigenous
from non-native dogs. In the absence of finding multiple highly
diverged and highly localized mitochondrial haplogroups, genome-
wide autosomal markers will be needed to unravel the story of the
first domesticated species.
Materials and Methods
Sampling Protocol. Dogs were sampled from animal shelters or were brought to
the researchers for sampling by owners and villagers. In accordance with Cornell
IACUC protocol 2007–0076, 3–5 mL of blood was drawn from the cephalic or lateral
saphenous vein into K2-EDTA blood collection tubes. At the field site, blood cells
were lysed with an ammonium chloride solution and spun at 1,100 × g with a
portable centrifuge. After discarding the supernatant, cell pellets were resus-
pended in an EDTA-Tris-SDS solution for transport to the DNA Bank at Cornell
Baker Institute for Animal Health. DNA was isolated from the lysate using
ammonium acetate and alcohol and was suspended in Tris-EDTA buffer. Concentrations
were determined by A260 on a NanoDrop ND1000 spectrophotometer.
Stock DNA was stored in −20 °C freezers by the Cornell Medical Genetics
Archive. Dilutions were made from a 200 μg/mL working stock as needed for
sequencing and genotyping. A similar protocol was followed for the 102 United
States dogs, except that we also verified that they were mixtures of several
different breeds by using the Wisdom MX breed test (Mars Inc.).
Microsatellite Genotyping. Two hundred twenty-seven village dogs were typed
on a 96-microsatellite panel described in (15, 16). Microsatellites were amplified
individually in the presence of a fluorescently labeled universal primer and were
combined post-PCR into sets of 1 to 4 markers for capillary electrophoresis on an
ABI3730xl (ABI). Standard PCR conditions have been described in ref. 15 while
adjustments made to individual markers are listed in Table S5. Each 96-well plate
of samples included a previously genotyped control sample for size verification,
and alleles were binned using GeneMapper 4.0. All genotype calls were checked manually
and markers were scanned individually for the appearance of new alleles outside
the existing bins. After genotyping, 7 markers were excluded on the basis of high
missing rates (>20%) or heterozygote deficits (P < 0.01) in a majority of the 8
regional populations because this suggests the presence of null alleles at these
loci. These data were combined with dogs from 126 breeds previously genotyped
for breed structure studies (15, 16).
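The marker exclusion rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name exclude_marker, the input layout, and the reading of "majority" as strictly more than half of the populations (applied to each criterion separately) are all assumptions, and the example values are invented.

```python
def exclude_marker(missing_rates, deficit_p_values, max_missing=0.20, alpha=0.01):
    """Decide whether one microsatellite marker should be dropped.

    missing_rates    -- per-population fraction of missing genotype calls
    deficit_p_values -- per-population P values from a heterozygote-deficit test
    """
    n_pops = len(missing_rates)
    high_missing = sum(rate > max_missing for rate in missing_rates)
    het_deficit = sum(p < alpha for p in deficit_p_values)
    # Assumption: "majority" means strictly more than half of the populations,
    # and either criterion alone is enough to exclude the marker.
    return high_missing > n_pops / 2 or het_deficit > n_pops / 2


# Invented values for 8 regional populations.
clean = exclude_marker(
    [0.02, 0.05, 0.00, 0.10, 0.03, 0.25, 0.00, 0.01],
    [0.50, 0.30, 0.80, 0.02, 0.60, 0.40, 0.90, 0.70],
)
suspect = exclude_marker(
    [0.05] * 8,
    [0.001, 0.004, 0.2, 0.0005, 0.003, 0.009, 0.6, 0.002],
)
```

The first invented marker fails a threshold in only one population and is kept; the second shows a heterozygote deficit in 6 of 8 populations, the pattern expected of a null allele, and is excluded.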
SNP Genotyping. One hundred sixty-eight village dogs, 102 mixed-breed dogs,
and dogs from 126 breeds were genotyped using the Sequenom iPLEX platform
on a 321-SNP panel described in ref. 20. For each sample, 2 μL of dog genomic
DNA was aliquoted into 13 separate microtiter wells for PCR amplification. Each
genomic aliquot was amplified in a total volume of 10 μL for 45 cycles with up to
28 primer pairs. Each reaction was treated with shrimp alkaline phosphatase for
40 min before heat inactivation. Primer extension reactions were carried out in a
standard thermocycler according to the Sequenom iPLEX Gold protocol. Each
reaction was desalted before spotting and shooting a SpectroChip on the Com-
pact MassARRAY system (Sequenom). Results were interpreted automatically
using cluster plots with the Histogram tabular view active in
SpectroTyper-TyperAnalyzer (Sequenom). SNP genotypes were loaded into PLINK version 1.0.4
(21), and 15 SNPs with high missingness (>20%) and 1 SNP with an extreme
heterozygote deficiency (P 10
below Hardy-Weinberg equilibrium) were
removed from further analysis.
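The two SNP quality checks named above, missingness and heterozygote deficiency relative to Hardy-Weinberg expectations, can be sketched for a single SNP as below. This is a simplified illustration: the function names and the example genotypes are invented, and the textbook chi-square statistic used here is an assumption made for clarity, not necessarily the test the authors applied.

```python
def missing_rate(genotypes):
    """Fraction of missing calls; a missing genotype is represented by None."""
    return sum(g is None for g in genotypes) / len(genotypes)


def hwe_summary(genotypes):
    """Return (observed hets, expected hets, chi-square statistic) for one SNP.

    Genotypes are coded as the number of copies of one allele (0, 1, or 2).
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    observed = [called.count(k) for k in (0, 1, 2)]
    q = (observed[1] + 2 * observed[2]) / (2 * n)  # frequency of the counted allele
    p = 1 - q
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return observed[1], expected[1], chi2


# Invented SNP: 45 + 10 + 45 called genotypes and 10 missing calls.
snp = [0] * 45 + [1] * 10 + [2] * 45 + [None] * 10
rate = missing_rate(snp)
obs_het, exp_het, chi2 = hwe_summary(snp)
```

In this invented case the missing rate is about 9%, below the 20% cutoff, while only 10 heterozygotes are observed against 50 expected, the kind of strong deficit that would flag a SNP for removal.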
Mitochondrial Sequencing. A 680-bp fragment of the mitochondrial D-loop was
amplified in two overlapping reactions. Region-1 was amplified using forward
primer H15422: 5′-CTCTTGCTCCACCATCAGC-3′, and reverse primer L15781:
5′-GTAAGAACCAGATGCCAGG-3′. Region-2 was amplified using forward primer
H15693: 5′-AATAAGGGCTTAATCACCATGC-3′ and reverse primer L16106:
5′-AAACTATATGTCCTGAAACC-3′ (primer names correspond to the 3′-most position of
the primer, relative to the published dog mitochondrial genome as in (6)). PCR was
carried out under the following protocol using 10 ng genomic DNA: Denatur-
ation: 94 °C (40 s); annealing: 54 °C (1 min); amplification: 72 °C (1 min) for 35 total
cycles followed by a 5 min final extension step at 72 °C. Sequencing reactions
were carried out on an ABI 3730 sequencer using BigDye Terminator chemistry
with the Region-1 reverse primer and Region-2 forward primer. Any reads with
ambiguous bases were rerun in the opposite direction. Sequences were edited,
assembled, and aligned with Sequencher 4.8 (Gene Codes Corporation) and
submitted to GenBank with Sequin (
Statistical Analyses. We used two approaches, principal component analysis
with EIGENSOFT v2.0 (22) and clustering analysis with STRUCTURE v2.2 (14), to
classify individuals as indigenous or non-native and to describe the genetic
structure of indigenous African village dogs and their relationship to dogs from
putatively African breeds. We relied primarily on STRUCTURE to determine the
proportion of non-African admixture present in each village dog because STRUCTURE
allows for probabilistic assignment of individuals to classes and explicit
modeling of admixture (22). In contrast, PCA makes no assumptions regarding
discrete versus clinal population structure and is well suited for describing the
principal axes of genetic variation between populations. In practice, STRUCTURE
and PCA usually reveal very similar patterns of genetic variation (22).
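To give a sense of what describing the principal axes of genetic variation involves, the sketch below finds the leading axis among individuals in a small genotype matrix. It is a pure-Python illustration and not EIGENSOFT itself: the per-SNP normalization by sqrt(p(1 − p)) follows the general approach of ref. 22, but the power-iteration solver, the function names, and the toy genotypes are assumptions introduced here.

```python
import math


def normalize(genotypes):
    """Center each SNP and scale by sqrt(p * (1 - p)); drop monomorphic SNPs.

    `genotypes` is a list of individuals, each a list of allele counts (0, 1, 2).
    """
    n = len(genotypes)
    columns = []
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        mean = sum(column) / n
        p = mean / 2  # estimated allele frequency
        if 0 < p < 1:
            scale = math.sqrt(p * (1 - p))
            columns.append([(g - mean) / scale for g in column])
    if not columns:
        raise ValueError("no polymorphic SNPs")
    return [list(row) for row in zip(*columns)]


def first_pc(genotypes, iterations=200):
    """Approximate the leading principal axis among individuals by power iteration."""
    x = normalize(genotypes)
    n = len(x)
    # n x n matrix of genetic similarity between individuals
    sim = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
           for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(sim[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


# Invented genotypes: individuals 0-3 and 4-7 come from two differentiated groups.
toy = [
    [0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [2, 2, 2, 1, 2, 2],
    [2, 1, 2, 2, 2, 2],
    [2, 2, 1, 2, 2, 1],
    [1, 2, 2, 2, 2, 2],
]
pc1 = first_pc(toy)
```

On the toy data the first four individuals receive coordinates of one sign and the last four the opposite sign, so the two groups separate along the first axis without any population labels being supplied.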
Before running these clustering methods, we removed markers in high LD
with other markers [r² > 0.5, see (23)] using Arlequin v3.11 (24) and removed 9
village dogs that showed high relatedness to another dog in the genotyping
panel (relatedness > 0.3). All STRUCTURE runs were done using the admixture model
with correlated allele frequencies, no prior population information, and default
parameter settings with a burnin period of 100,000 iterations followed by
500,000 MCMC repetitions, with 10 runs per K, and averaged using CLUMPP
v1.1.2 (25). In contrast, PCA was carried out separately for the SNP and microsatellite
markers. Microsatellite loci with n > 2 alleles were recoded as n − 1 biallelic
loci before running PCA in EIGENSOFT.
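One way to turn a multiallelic microsatellite locus into n − 1 biallelic pseudo-loci is sketched below: each retained allele becomes its own locus whose genotype is the number of copies (0, 1, or 2) of that allele. The function name recode_locus, the choice to drop the largest allele as the redundant one, and the example fragment sizes are assumptions for illustration; missing genotypes are not handled.

```python
def recode_locus(genotypes):
    """Recode one microsatellite locus as n - 1 biallelic pseudo-loci.

    `genotypes` is a list of (allele, allele) tuples of fragment sizes.
    Returns a dict mapping each retained allele to per-individual copy counts.
    """
    alleles = sorted({allele for pair in genotypes for allele in pair})
    kept = alleles[:-1]  # assumption: the last (largest) allele is the redundant one
    return {allele: [pair.count(allele) for pair in genotypes] for allele in kept}


# Invented locus with three alleles (120, 124, 128) typed in four dogs.
locus = [(120, 124), (124, 124), (120, 128), (128, 128)]
recoded = recode_locus(locus)
```

One allele is left out because its copy number is fully determined by the others (the counts for each diploid individual sum to 2), so three alleles yield two pseudo-loci here.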
Expected heterozygosity (h) was calculated in Arlequin after removing 10 dogs
that appeared to be closely related (r ≈ 0.5). F_ST based on SNP loci was computed
with a custom C++ implementation of Eq. 6 from (26); microsatellite F_ST was computed
using Arlequin. Unless otherwise noted, statistical tests were performed in R
v2.6.2 (27). STRUCTURE results were plotted using Distruct v1.1 (28).
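For reference, a common estimator of expected heterozygosity (gene diversity) is h = n/(n − 1) × (1 − Σ p_i²), where n is the number of sampled gene copies and p_i is the frequency of allele i. The sketch below implements that estimator for a single locus. That the published values were computed with exactly this formula is an assumption, and the function name and example alleles are invented.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Unbiased gene diversity for one locus from a list of sampled gene copies."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two gene copies")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


# Invented sample of 10 gene copies: allele A at 0.6, allele B at 0.4.
h = expected_heterozygosity(["A"] * 6 + ["B"] * 4)
h_fixed = expected_heterozygosity(["A"] * 10)
```

The n/(n − 1) factor corrects for sample size, so small samples are not biased downward; a locus fixed for one allele gives h = 0.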
ACKNOWLEDGMENTS. We thank numerous volunteers and animal shelters for
their assistance in gathering samples, including Leonard Kuwale, Ahmed Sa-
maha, Kazhila Chinsembu, Animal Care in Egypt (Luxor), Animal Friends Shelter
(Giza), Albergue de Animales Villa Michelle (Mayaguez), and Albergue La Gab-
riella (Ponce); Jason Mezey, Fengfei Wang, Katarzyna Bryc, and Andy Reynolds
for their assistance with lab and computational resources; Bob Wayne, Niels
Pedersen, Ben Sacks, Sarah Brown, and Peter Savolainen for helpful comments
and discussion; and the intramural program of the National Human Genome
Research Institute. This work was supported by the Center for Vertebrate Genomics,
Department of Clinical Sciences and Baker Institute of Animal Health, Cornell
University; National Institutes of Health Center for Scientific Review and R24
research grant program; National Science Foundation Grant 0516310; and a Sloan
Foundation research fellowship.
1. Wayne R (2001) Consequences of domestication: Morphological diversity of the dog. In
The Genetics of the Dog, eds Ruvinsky A, Sampson J (CABI Publishing, Oxon, UK), pp 43–60.
2. Clutton-Brock J (1995) Origins of the dog: Domestication and early history. In The
Domestic Dog, Its Evolution, Behavior and Interactions with People, ed Serpell J (CUP,
Cambridge), pp 7–20.
3. Vilà C, Maldonado J, Wayne R (1999) Phylogenetic relationships, evolution, and
genetic diversity of the domestic dog. J Hered 90:71–77.
4. Germonpré M, et al. (2009) Fossil dogs and wolves from Palaeolithic sites in Belgium,
the Ukraine and Russia: Osteometry, ancient DNA and stable isotopes. J Arch Sci
5. Vilà C, et al. (1997) Multiple and ancient origins of the domestic dog. Science 276:1687–
6. Savolainen P, Zhang Y, Luo J, Lundeberg J, Leitner T (2002) Genetic evidence for an East
Asian origin of domestic dogs. Science 298:1610–1613.
7. Coppinger R, Coppinger L (2001) in Dogs: A Startling New Understanding of Canine
Origin, Behavior and Evolution (Scribner, New York).
8. Dobney K, Larson G (2006) Genetics and animal domestication: New windows on an
elusive process. J Zool 269:261–271.
9. Miklosi A (2008) in Dog Behaviour, Evolution, and Cognition (Oxford Univ Press,
Oxford), p 304.
10. Pires A, et al. (2006) Mitochondrial DNA sequence variation in Portuguese native breed
dogs: diversity and phylogenetic affinities. J Hered 97:318–330.
11. Irion D, Schaffer A, Grant S, Wilton A, Pedersen N (2005) Genetic variation analysis of
the Bali street dog using microsatellites. BMC Genet 6:6.
12. Runstadler J, Angles J, Pedersen N (2006) Dog leucocyte antigen class II diversity and
relationships among indigenous dogs of the island nations of Indonesia (Bali),
Australia and New Guinea. Tissue Antigens 68:418–426.
13. (2008) Police Zone. Encyclopædia Britannica. Online Ed.
14. Pritchard J, Stephens M, Donnelly P (2000) Inference of population structure using
multilocus genotype data. Genetics 155:945–959.
15. Parker HG, et al. (2004) Genetic structure of the purebred domestic dog. Science
16. Parker H, et al. (2007) Breed relationships facilitate fine-mapping studies: A 7.8-kb
deletion cosegregates with Collie eye anomaly across multiple dog breeds. Genome
Res 17:1562–1571.
17. Gundry R, et al. (2007) Mitochondrial DNA analysis of the domestic dog: Control region
variation within and among breeds. J Forensic Sci 52:562–572.
18. Savolainen P, Leitner T, Wilton A, Matisoo-Smith E, Lundeberg J (2004) A detailed
picture of the origin of the Australian dingo, obtained from the study of mitochondrial
DNA. Proc Natl Acad Sci USA 101:12387–12390.
19. Björnerfeldt S, Webster M, Vilà C (2006) Relaxation of selective constraint on dog
mitochondrial DNA following domestication. Genome Res 16:990–994.
20. Jones P, et al. (2008) Single-nucleotide-polymorphism-based association mapping of
dog stereotypes. Genetics 179:1033–1044.
21. Purcell S, et al. (2007) PLINK: A tool set for whole-genome association and
population-based linkage analyses. Am J Hum Genet 81:559–575.
22. Patterson N, Price A, Reich D (2006) Population structure and eigenanalysis. PLoS Genet
23. Kaeuffer R, Réale D, Coltman D, Pontier D (2007) Detecting population structure using
STRUCTURE software: Effect of background linkage disequilibrium. J Hered 99:374–380.
24. Excoffier L, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for
population genetics data analysis. Evol Bioinform Online 1:47–50.
25. Jakobsson M, Rosenberg N (2007) CLUMPP: A cluster matching and permutation
program for dealing with label switching and multimodality in analysis of population
structure. Bioinformatics 23:1801–1806.
26. Weir B, Cockerham C (1984) Estimating F-statistics for the analysis of population
structure. Evolution 38:1358–1370.
27. R Development Core Team (2008) in R: A language and environment for statistical
computing (R Foundation for Statistical Computing, Vienna, Austria).
28. Rosenberg N (2004) DISTRUCT: A program for the graphical display of population
structure. Mol Ecol Notes 4:137–138.
29. Ewens W (1972) The sampling theory of selectively neutral alleles. Theor Pop Biol