Complex population structure in African village
dogs and its implications for inferring dog
domestication history
Adam R. Boykoa,1, Ryan H. Boykob, Corin M. Boykob, Heidi G. Parkerc, Marta Castelhanod, Liz Coreyd,
Jeremiah D. Degenhardta, Adam Autona, Marius Hedimbie, Robert Kityof, Elaine A. Ostranderc, Jeffrey Schoenebeckc,
Rory J. Todhunterd, Paul Jonesg, and Carlos D. Bustamantea
aDepartment of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853; bDepartment of Anthropology and Graduate Group in
Ecology, University of California, Davis, CA 95616; cNational Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892;
dDepartment of Clinical Sciences and the Medical Genetic Archive, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853; eDepartment of
Biological Sciences, University of Namibia, Windhoek, Namibia; fDepartment of Zoology, Makerere University, Kampala, Uganda; and gThe Waltham Centre
for Pet Nutrition, Waltham on the Wolds, Leicestershire LE14 4RT, United Kingdom
Edited by Tomoko Ohta, National Institute of Genetics, Mishima, Japan, and approved June 12, 2009 (received for review February 26, 2009)
High genetic diversity of East Asian village dogs has recently been used to argue for an East Asian origin of the domestic dog. However, the extent to which village dogs represent distinct, indigenous populations instead of admixtures of various dog breeds has not been quantified. Understanding these issues is critical to properly reconstructing the timing, number, and locations of dog domestication. To address these questions, we sampled 318 village dogs from 7 regions in Egypt, Uganda, and Namibia, measuring genetic diversity across 680 bp of the mitochondrial D-loop, 300 SNPs, and 89 microsatellite markers. We also analyzed breed dogs, including putatively African breeds (Afghan hounds, Basenjis, Pharaoh hounds, Rhodesian ridgebacks, and Salukis), Puerto Rican street dogs, and mixed breed dogs from the United States. Village dogs from most African regions appear genetically distinct from non-native dogs, although some individuals cluster genetically with Puerto Rican dogs or United States breed mixes instead of with neighboring village dogs. Thus, African village dogs are a mosaic of indigenous dogs descended from early migrants to Africa, and non-native, breed-admixed individuals. Among putatively African breeds, Pharaoh hounds and Rhodesian ridgebacks clustered with non-native rather than indigenous African dogs, suggesting they have predominantly non-African origins. Surprisingly, we find similar mtDNA haplotype diversity in African and East Asian village dogs, potentially calling into question the hypothesis of an East Asian origin for dog domestication.
Canis familiaris | microsatellites | principal component analysis | single nucleotide polymorphisms
diversity than any other mammal (1–3). Dogs were probably
domesticated from Eurasian wolves at least 15,000–40,000 years
ago (4–6), although the process by which domestication took place,
including the specific selected traits and the manner in which
selection was performed, is very poorly understood (7, 8).
After domestication somewhere in Eurasia, dogs quickly spread
throughout the continent and into Africa, Oceania and the Amer-
certainly lived as human commensals that were not subject to the
same degree of intense artificial selection and closed breeding
populations, these ancient dog populations developed genetic sig-
natures characteristic of their geographic locale. These signatures
would persist in both modern day village dog populations that
founded from them. We refer to such dogs as ‘‘indigenous’’ in the
sense that they carry characteristic genetic signatures appropriate
for their geographic region.
Today, semiferal village dogs are nearly ubiquitous around human
settlements in much of the world, and such animals comprise a large
that many modern village dogs are not derived solely from indigenous
village dogs from indigenous dogs. We believe most of these dogs will
be complex mixtures of several non-native breeds and/or mixtures of
both non-native breeds and indigenous village dogs (‘‘intermediate’’
The distinction between indigenous and non-native dogs is
important because indigenous, but not non-native, village dogs are likely to
be more adapted to local environmental conditions and more
genetically related to the first prebreed domestic dogs than breed
or breed-admixed individuals. To our knowledge, the degree to
which village dogs consist of indigenous versus non-native individ-
uals has not been quantified.
In one of the most comprehensive surveys of village and breed
dogs to date, Savolainen et al. (6) examined mtDNA diversity in a
global panel of 654 dogs. Their results confirmed previous mtDNA
evidence of dog domestication from Eurasian wolves (5) and showed
that East Asian dogs had the highest mtDNA diversity of any
region, suggesting an East Asian origin of domestication. However,
subsequent work by Pires et al. (10) has shown that mtDNA does
not show significant population structure in village dogs. Because
Savolainen et al. included many East Asian village dogs but few
village dogs from other regions, their conclusion of high levels of
Author contributions: A.R.B., R.H.B., C.M.B., M.C., M.H., and C.D.B. designed research;
A.R.B., R.H.B., C.M.B., H.G.P., L.C., J.D.D., M.H., J.S., and P.J. performed research; H.G.P.,
M.C., L.C., J.D.D., A.A., R.K., E.A.O., R.J.T., and P.J. contributed new reagents/analytic tools;
A.R.B., R.H.B., H.G.P., A.A., and P.J. analyzed data; and A.R.B., R.H.B., and C.D.B. wrote the
(MARS Inc.) for detecting breed-admixed ancestry. P.G.J. was an employee of MARS
overseeing Wisdom development, C.D.B. was a paid consultant for MARS during its development, and E.A.O. is a licenser of the patent.
This article is a PNAS Direct Submission.
database (accession nos. GQ375164–GQ375213).
1To whom correspondence should be addressed. E-mail: firstname.lastname@example.org.
This article contains supporting information online at www.pnas.org/cgi/content/full/
August 18, 2009 | vol. 106 | no. 33
East Asian diversity is likely a consequence of high levels of
mitochondrial diversity in village dogs and not necessarily an
indication of East Asian domestication.
Other genetic markers have been shown to exhibit significant
both separate Bali street dogs from New Guinea singing dogs,
dingoes, and breed dogs (11, 12). Both studies demonstrated high
diversity in the Bali dogs, consistent either with an indigenous,
and breed dogs, microsatellite and single nucleotide polymorphism
(SNP) markers seem well suited to studying population structure
and the possibility of breed admixture in village dogs.
In this study, we analyzed mtDNA, microsatellite, and SNP
markers in 318 African village dogs to characterize population
structure and genetic diversity. In addition, we analyzed 16 Puerto
Rican street dogs, 102 known mixed-breed dogs from the United
States, and several hundred dogs from 126 breeds, including 129
degree of non-native admixture in African village dogs. Our sam-
pling effort concentrated on seven regions from three geographi-
distinct locales: a Giza animal shelter, a Luxor animal shelter and
surrounds, and a rural desert oasis (Kharga). Although the geo-
graphic distance between Giza and Luxor is greater than that
between Kharga and Luxor, we hypothesized that the desert would
Uganda: We sampled ?100 dogs from a cluster of villages east
of Kampala and 30 dogs from three neighboring isles of the Kome
Island group in Lake Victoria. Despite the islands being close to
act as a dispersal barrier.
in the northern and central parts of the country. No natural
dispersal barriers existed between sampling locations, although a
cordon fence is maintained to keep livestock diseases out of the
the cordon and likely have little difficulty getting through the fence
themselves, but the cordon is significant in that it demarcates the
extent of European colonization influence in the country [with
southern and central Namibia colonization history being roughly
similar to that of South Africa while northern Namibia resembles
the rest of sub-Saharan Africa (13)]. We sampled dogs within 100
km of both sides of the cordon, including populations within 10–20
km of the barrier.
For comparison, we also sampled from two shelters in Puerto
Rico, known mixed-breed dogs (see Methods) from the United
States, and dogs from 126 breeds, including five African and
near-African breeds (putative origin in parentheses): Afghan
hounds (Sinai, Egypt), Basenjis (Congo), Pharaoh hounds (near
Mediterranean), Rhodesian ridgebacks (Zimbabwe), and Salukis
Inference of Population Structure and Degree of Breed Admixture in
African Village Dogs. A subset of 223 unrelated African village dogs
from seven African locales were typed on a panel of 89 microsat-
ellite markers or 300 SNP markers (206 village dogs, 15 Puerto
both panels). Using the Bayesian clustering program STRUC-
TURE (14), we found that Puerto Rican street dogs clustered with
are all breed admixtures. STRUCTURE analysis at K = 5 consistently
showed the same five groupings: Egyptian dogs, Ugandan
mainland dogs, Kome Island dogs, Northern Namibian dogs, and
admixed dogs (including all Puerto Rican and U.S. dogs, nearly all
Central Namibian dogs, and a few other African village dogs; Fig.
2 and Fig. S1). At K = 4, STRUCTURE clustered Ugandan dogs
together (mainland and Kome Islands), and at K > 5, STRUCTURE
subdivided Ugandan dogs further, although these clusters
were inconsistent (Fig. S2).
We quantified admixture in each village dog as the mean
proportion of the genome assigned to the American (United
States + Puerto Rico) cluster by STRUCTURE across 10 runs
at K = 5 (admixture estimates using K = 4 or K = 6 mean
proportions were nearly identical; R² = 0.984 and 0.992,
respectively). In total, 84% of African village dogs outside of
central Namibia showed little or no evidence of non-native
admixture (estimated admixture proportion <25% in 152 of
181 dogs), whereas all central Namibian dogs had >25%
admixture, and most had >60% (24 of 25; Table 1). Principal
component analysis showed a clear separation of Egyptian
from sub-Saharan populations in PC1 and separation between
Fig. 1. Map of village dog sampling locations. Colors denote each distinct region and dots show approximate range of sampling within each region. See Table S1 for full description.
Fig. 2. STRUCTURE analysis across 389 SNP and microsatellite loci in African village and American mixed breed dogs. Panel labels: Giza, Luxor, Kharga, Uganda (main), Kome Is, Namibia.
Table 1. Number of indigenous (<25% inferred admixture), uncertain (25%–60% inferred admixture) and breed admixed (>60% inferred admixture) village dogs by region from the 223 unrelated genotyped dogs
country | region | indigenous | uncertain | admixed
www.pnas.org/cgi/doi/10.1073/pnas.0902129106 | Boyko et al.
Ugandan and Namibian populations in PC2 for indigenous
African village dogs for both SNP and microsatellite markers
(Fig. 3). When admixed African and American dogs were
included, PCA, like STRUCTURE, always clustered them
together, and the interpretation of the principal components
became more complicated (Fig. S2).
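The admixture binning summarized in Table 1 can be illustrated with a minimal sketch. It assumes the per-run fractions of each genome assigned to the American cluster have already been extracted from the STRUCTURE output. The function name `classify_dog` and the example values are hypothetical and are not taken from the study's data.

```python
from statistics import mean

def classify_dog(american_cluster_fractions):
    """Average the American-cluster fraction over STRUCTURE runs and bin it
    with the Table 1 thresholds: <25% indigenous, 25-60% uncertain,
    >60% breed admixed."""
    q = mean(american_cluster_fractions)
    if q < 0.25:
        label = "indigenous"
    elif q <= 0.60:
        label = "uncertain"
    else:
        label = "admixed"
    return q, label

# Hypothetical per-run values (the study averaged 10 runs; 3 are shown here).
example = {
    "dog_A": [0.02, 0.03, 0.01],
    "dog_B": [0.40, 0.45, 0.38],
    "dog_C": [0.91, 0.88, 0.93],
}
for name, runs in example.items():
    q, label = classify_dog(runs)
    print(f"{name}: mean admixture {q:.2f} -> {label}")
```

Averaging before binning mirrors the description of admixture as a mean over runs. Binning each run separately and taking a majority vote would be a different design and could disagree near the thresholds.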
dogs that clustered with the two known mixed-breed dogs geno-
typed on the full 389 marker panel, we ran STRUCTURE on the
S3). The groupings of African dogs and the inference of non-native
admixed individuals are highly consistent with the earlier analyses
until K ? 5, when STRUCTURE starts to detect groupings within
the admixed individuals. The substructure found within admixed
individuals may be a consequence of different ancestral breeds in
different individuals; STRUCTURE analysis of the village dogs
and dogs from 126 breeds shows that the putatively indigenous
village dogs cluster with ancient breeds (specifically Basenjis) while
the putatively non-native dogs cluster with modern breed groups in
various proportions (Fig. S4).
FST calculations confirm that central Namibian dogs show virtually
no genetic differentiation from American dogs (pairwise FST
based on SNP markers = 0.011; microsatellite FST = 0.0025). The
pairwise FST between Egyptian dogs from Giza and Luxor was also
low (SNP FST = 0.0024; microsatellite FST = 0.0057), whereas other
village dog populations had pairwise FST values of 0.025–0.133
(Table 2). Dogs from Kharga were the most distinct (FST of
0.0735–0.133) whereas dogs from mainland Uganda and northern
Namibia (approximately 2,900 km apart) show only moderate differentiation
(FST = 0.0237–0.0254). Heterozygosity was high across all genetic
marker types in all village dog populations except those of the
Kharga oasis and the Kome islands and low in all of the breed dogs
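For readers unfamiliar with how pairwise FST values like those above are obtained, the following is a minimal sketch of the Weir and Cockerham (26) variance-component estimator for biallelic loci, combined across loci as a ratio of sums. It is not the authors' implementation, and it has not been checked against the specific equation they cite. The function names and the input layout (sample size, allele frequency, and observed heterozygote proportion per population) are assumptions made for illustration.

```python
def wc_components(pops):
    """Variance components (a, b, c) of Weir & Cockerham (1984) for one
    biallelic locus. `pops` is a list of (n, p, h) tuples: number of sampled
    individuals, frequency of one allele, observed heterozygote proportion."""
    r = len(pops)
    n_tot = sum(n for n, _, _ in pops)
    n_bar = n_tot / r
    n_c = (n_tot - sum(n * n for n, _, _ in pops) / n_tot) / (r - 1)
    p_bar = sum(n * p for n, p, _ in pops) / n_tot
    s2 = sum(n * (p - p_bar) ** 2 for n, p, _ in pops) / ((r - 1) * n_bar)
    h_bar = sum(n * h for n, _, h in pops) / n_tot
    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = n_bar / n_c * (s2 - (core - h_bar / 4) / (n_bar - 1))
    b = n_bar / (n_bar - 1) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2
    return a, b, c

def wc_fst(loci):
    """Multilocus estimate: sum of a over loci divided by sum of (a + b + c)."""
    num = den = 0.0
    for pops in loci:
        a, b, c = wc_components(pops)
        num += a
        den += a + b + c
    return num / den

# Hypothetical two-population example with two SNPs.
loci = [
    [(20, 0.60, 0.45), (25, 0.40, 0.50)],
    [(20, 0.10, 0.20), (25, 0.30, 0.40)],
]
print(round(wc_fst(loci), 4))
```

The estimator can return slightly negative values when populations are undifferentiated, which is expected behavior and is usually interpreted as zero.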
Origin of Putatively African Breeds. We included individuals from
five breeds with presumed African or Middle Eastern ancestry in
our principal component analyses to see whether this approach
could detect which sampled village dog populations are closest to
the founding population for each breed. For the SNP loci, PC1 and
PC2 differentiated three breed groups (Basenjis, Salukis/Afghan
hounds, and Rhodesian ridgebacks/Pharaoh hounds), while the village
dog cluster still exhibited geographical structuring, with Egyptian
village dogs lying closest to the Saluki/Afghan hound cluster,
cluster, and breed-admixed Namibian and American dogs lying
closest to the Rhodesian ridgeback/Pharaoh hound cluster. PCA of
PCA although the breed clusters were less well defined (Fig. S5).
Analysis of Mitochondrial Diversity. We sequenced 680 bp of the
mitochondrial D-loop, including the 582-bp region described in ref.
6. We found 47 haplotypes in the African dogs as well as 9 hap-
lotypes in the Puerto Rican dogs, two of which were also found in
the sampled United States mixed breed dogs (see Table S1 and
Table S3). All haplotypes were in the A (33 African haplotypes), B
the clades that are believed to contain >95% of domestic dogs (6).
Over the region sequenced in (6) and ignoring indels, we found 18
African haplotypes that were not described by (6); 14 in A clade
in C clade. The Puerto Rican and United States mixed-breed dogs
had 8 A clade and one B clade haplotypes (only one haplotype, a
Puerto Rican A clade haplotype, was not previously described in ref.
Surprisingly, local mtDNA diversity did not differ systematically
between African regions and similarly sized regions in East Asia,
the purported origin of domestic dogs. Across the 582-bp region
analyzed in refs. 6 and 10, and this study, the number of haplotypes
observed in a region closely matches the neutral expectation (Fig.
5). Differences in regional haplotype diversity appear to be driven
by sampling artifacts rather than by distance from a hypothetical
domestication origin, with the highly sampled and fractionated
subpopulations of Japan exhibiting the most diversity, and nearby
Sichuan (China) probably exhibiting the least (Fig. 5). Neither
Africa nor East Asia appears to contain private haplogroups
Fig. 3. (A) PCA with the 89 microsatellite loci (n = 152). (B) PCA with the 300 SNP loci (n = 126). Axes: PC1 and PC2, labeled with percent variance explained.
Table 2. Pairwise FST in village dogs between regions based on 300 SNPs
Giza | Kharga | Luxor | NA_cent | NA_north | UG_isles | UG_main | America
continents; Fig. S6).
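The neutral expectation referred to above comes from Ewens's sampling formula (29), under which the expected number of distinct haplotypes in a sample of n sequences is the sum of θ/(θ + i) for i from 0 to n − 1. The sketch below evaluates that sum using the θ estimate reported with Fig. 5 (8.654) and sample sizes matching the figure's axis ticks. The function name is hypothetical, and the code is only an illustration of the formula, not the authors' regression procedure.

```python
def expected_haplotypes(n, theta):
    """Expected number of distinct haplotypes among n sampled sequences
    under Ewens's sampling formula: sum_{i=0}^{n-1} theta / (theta + i)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sum(theta / (theta + i) for i in range(n))

THETA = 8.654  # point estimate reported in the Fig. 5 caption

for n in (5, 10, 20, 40, 80, 160):
    print(f"n = {n:3d}: expected haplotypes = {expected_haplotypes(n, THETA):.1f}")
```

Because each added sequence contributes less than the one before, the expectation grows roughly with the logarithm of n, which is why the figure uses a log-scaled sample-size axis.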
This study analyzed a large number of genetic markers to charac-
terize the level of non-native admixture in a geographically wide-
exhibit complex population structure because of the effects of
geography, gene flow barriers, and the presence of non-indigenous
dogs in some populations. Notably, the vast majority of the African
village dogs could be classified as indigenous (<25% non-African
ancestry) or non-native (>60% non-African ancestry), with only
7% showing intermediate levels of African ancestry (Table 1).
Classification of individuals as indigenous versus non-native was
consistent between runs, and remained consistent even when the
number of mixed-breed dogs included in the analysis was substan-
tially increased (Fig. S3).
With two exceptions, African village dogs did not exhibit a
region-specific level of non-African admixture, but rather con-
tained dogs with completely indigenous ancestry (or nearly so) that
lack of consistent levels of admixture within regions suggests that
non-indigenous dog genes are quickly removed from village dog
populations, or that admixture with non-indigenous dogs is a very
Namibia, where every dog had significant levels of non-indigenous
admixture (see below), and Giza, where all dogs showed some,
usually low, level of admixture. This background level of admixture
in Giza could reflect older mixing with breed dogs around this
ancient city, or it could simply reflect the relative proximity of Giza
to Eurasia, the ancestral home of most modern breed dogs.
STRUCTURE analyses including dogs from 126 breeds suggest it
is the latter—Egyptian dogs cluster partially with ancient (mostly
Asian) breeds and the sub-Saharan (Basenji ? village dog) cluster
and do not appear to cluster significantly with any of the (mostly
European) modern breed groups (Fig. S4).
Dispersal barriers significantly affected population structure.
The 230 km of desert separating the Kharga oasis from Luxor led
to much stronger population differentiation (FST = 0.084) than the
500 km Nile corridor between Luxor and Giza (FST = 0.0024).
Likewise, the Kome islands, which lie 10–20 km from the mainland
in Lake Victoria, were much more differentiated from mainland
Uganda than were northern Namibian populations 2,900 km away
(FST = 0.051 vs. FST = 0.033). Most surprising, the 20–100 km
distance between northern and central Namibian populations that
coincided with that country’s Red Line veterinary cordon fence
Fig. 4. African breed + village dog SNP PCA. x axis: PC1 (23.2% variance explained). [...] 105 breed dogs.
Fig. 5. [...] within Africa and East Asian geographic regions. x axis: number of dogs sampled (n). Note log scale of x axis. East [...] number of haplotypes from Ewens’s sampling formula (29), which assumes an [...] regression, we estimate θ to be 8.654 (95% C.I. = [7.41, 9.89]).
Table 3. Gene diversity (expected heterozygosity) at 89 microsatellite markers, 300 SNP markers, and the mitochondrial D-loop in African village dogs and five breeds
all dogs | indigenous | all dogs | indigenous | all dogs
Sample sizes are given in parentheses.
represented a stark population boundary—dogs north of the cor-
don averaged 87% indigenous African ancestry while those south
of the cordon were only 9% African. The cordon has separated the
areas (to the south) for the last 100 years and is currently used to
(13). During this time, indigenous dogs have apparently been
extirpated from central Namibia, and the selective pressures on
dogs in each region must be strong and disparate enough to
maintain a sharp genetic boundary along this porous chain-link
fence. That Puerto Rico also seems to contain few, if any, indige-
nous dogs highlights the degree to which colonization history
affects dog populations.
STRUCTURE and principal component analysis revealed strikingly
similar patterns of genetic variation: indigenous African
dogs clearly clustered by country and away from non-indigenous
dogs in each analysis (Figs. 2–4). PCA showed slight differences
between the SNP and microsatellite results: SNP but not micro-
in the same axes of variation in both sets (Fig. 3). Breeds were
clustered more cleanly with the SNP dataset than the microsatellite
of breed dogs that were typed on the SNP panel rather than a
consequence of using SNPs versus microsatellites per se (Fig. 4 and
Fig. S5). Nevertheless, both marker sets clustered Salukis and
Afghan hounds nearest to Egyptian village dogs and Basenjis
nearest to indigenous Ugandan and Namibian dogs, as expected by
each breed’s history. In contrast, Rhodesian ridgebacks and Pha-
raoh hounds clustered nearest to admixed dogs, suggesting these
These results are consistent with the STRUCTURE results from
(15, 16), showing that Salukis, Afghan hounds, and Basenjis cluster
with ancient, non-European breeds, while Pharaoh hounds and
Rhodesian ridgebacks do not. Although this coarse sampling (3
countries) is suitable for detecting truly indigenous versus recon-
stituted ancestry in putatively African breeds, analysis including
village dogs from more regions will be necessary to better localize
the ancestral origins of these breeds.
Village dog populations had higher levels of diversity than
purebred dogs across all markers (see (17) for purebred mtDNA
dogs had even higher diversity estimates. The high heterozygosity
found in breed-admixed dogs is likely because of SNP ascertain-
ment; by preferentially genotyping SNPs that are highly polymor-
may be biased. Microsatellite ascertainment bias is less likely to
have this effect since even microsatellites that are highly polymor-
phic in breeds can exhibit new alleles when genotyped in other
populations. This suggests that careful control of ascertainment, or
a denser SNP marker set that enables haplotype-based inference, is
desirable for SNP markers. However, the high degree of concor-
TURE analyses shows that these methods are robust to these
African village dogs exhibited a similar level of mitochondrial
D-loop diversity to that of the dogs sampled by (6) in East Asia, the
Africa is actually the site of dog domestication, we do believe that
an East Asian origin of dogs should be further scrutinized, espe-
cially as Africa also has numerous private haplotypes and East Asia
has no private haplogroups, with the possible exception of clade E,
which is rather similar to clade C. The data appear consistent with a
rapid spread of dogs after original domestication and high effective
population sizes and gene flow between continents, as there is no
clear signal of decreasing haplotype diversity away from any origin.
Interestingly, Ugandan and northern Namibian populations that
appear relatively undifferentiated using nuclear markers also have
large overlap in their mitochondrial sequences. Thus, long-distance
gene flow may be occurring, leading to a lower total number of
haplotypes in these areas, whereas areas in Egypt with less chance
for gene flow between them may harbor more diversity in the
aggregate. This underscores the need to design a sampling and
geographic areas. These areas could have features such as islands
and deserts that may increase the number of haplotypes found only
because one is sampling multiple populations.
Besides the discovery of 18 previously undescribed haplotypes, we have also expanded
the geographic range of some previously reported dog mtDNA
haplotypes. For example, we found haplotype A29, the predomi-
nant mtDNA haplotype of Australian dingoes, in a Puerto Rican
dog even though this haplotype has never been reported in a dog
outside of East Asia or the American Arctic (18). Either Puerto
Rican dogs descend from some non-European (probably Asian)
dogs that still carry this haplotype, or this is an indigenous New
World haplotype that has persisted in Puerto Rico despite wide-
spread historical European admixture.
Our results clearly demonstrate the need for further research
with indigenous village dogs. Indigenous dog populations can be
largely eliminated, as in Puerto Rico and central Namibia, by
European colonization, and it is unclear to what degree other
populations will be able to maintain their genetic identity and
persist in the face of modernity. The dog, although certainly a
species uniquely suited as a model organism for genomics, can also
serve as an invaluable organism for comparative studies of evolu-
tion and adaptation. Like other domesticated animals (e.g., cats,
horses, and pigeons), dogs consist of breeds intensely selected for
specific traits and feral populations that have been left to adapt to
local conditions with ‘‘random’’ breeding. Dense genotyping and
resequencing in these species should reveal genes underlying do-
mestication in random-bred populations, instead of just those that
have been under strong artificial selection in breed animals, and
whether the relaxation of selective constraint observed in these
species (19) is a product of recent breeding practices or domesti-
cation per se. Resequencing in indigenous village dogs will also be
necessary to obtain markers free of ascertainment bias to estimate
the amount of genetic variation in dogs that is absent in existing
modern breeds, and the degree to which present-day indigenous
village dogs represent populations that have been randomly breed-
ing since dog domestication versus remnants of ancient, indigenous
Mitochondrial sequencing alone does not seem well-suited to
determining the timing and location of domestication. Dog mito-
chondrial haplogroups seem more or less cosmopolitan, and infer-
ences based on mtDNA diversity statistics can be easily skewed by
from non-native dogs. In the absence of finding multiple highly
diverged and highly localized mitochondrial haplogroups, genome-
wide autosomal markers will be needed to unravel the story of the
first domesticated species.
Materials and Methods
Sampling Protocol. Dogs were sampled from animal shelters or were brought to
IACUC protocol 2007–0076, 3–5 mL of blood drawn from the cephalic or lateral
were lysed with an ammonium chloride solution and spun at 1,100 × g with a
portable centrifuge. After discarding the supernatant, cell pellets were resus-
pended in an EDTA-Tris-SDS solution for transport to the DNA Bank at Cornell
Baker Institute for Animal Health. DNA was isolated from the lysate using
ammonium acetate and alcohol and was suspended in Tris-EDTA buffer. Concentrations
were determined by A260 on a NanoDrop ND1000 spectrophotometer.
Stock DNA was stored in −20 °C freezers by the Cornell Medical Genetics
Archive. Dilutions were made from a 200 µg/mL working stock as needed for
sequencing and genotyping. A similar protocol was followed for the 102 United
States dogs, except that we also verified that they were mixtures of several
different breeds by using the Wisdom MX breed test (Mars Inc.).
Microsatellite Genotyping. Two hundred twenty-seven village dogs were typed
on a 96-microsatellite panel described in (15, 16). Microsatellites were amplified
ABI3730xl (ABI). Standard PCR conditions have been described in ref. 15 while
of samples included a previously genotyped control sample for size verification
and binned using GeneMapper 4.0. All genotype calls were checked manually
missing rates (>20%) or heterozygote deficits (P < 0.01) in a majority of the 8
regional populations because this suggests the presence of null alleles at these
for breed structure studies (15, 16).
SNP Genotyping. One hundred sixty-eight village dogs, 102 mixed-breed dogs,
and dogs from 126 breeds were genotyped using the Sequenom iPLEX platform
on a 321-SNP panel described in ref. 20. For each sample, 2 µL of dog genomic
DNA was aliquoted into 13 separate microtiter wells for PCR amplification. Each
genomic aliquot was amplified in a total volume of 10 µL over 45 cycles with up to
28 primer pairs. Each reaction was treated with shrimp alkaline phosphatase for
standard thermocycler according to the Sequenom iPLEX Gold protocol. Each
reaction was desalted before spotting and shooting a SpectroChip on the Com-
pact MassARRAY system (Sequenom). Results were interpreted automatically
using cluster plots with the Histogram tabular view active in SpectroTyper-
TyperAnalyzer (Sequenom). SNP genotypes were loaded into PLINK version 1.0.4
(21) and 15 SNPs with high missingness (>20%) and 1 SNP with an extreme
heterozygote deficiency (P < 10⁻⁷ below Hardy-Weinberg equilibrium) were
removed from further analysis.
Mitochondrial Sequencing. A 680-bp fragment of the mitochondrial D-loop was
amplified in two overlapping reactions. Region-1 was amplified using forward
primer H15422: 5′-CTCTTGCTCCACCATCAGC-3′, and reverse primer L15781: 5′-GTAAGAACCAGATGCCAGG-3′.
Region-2 was amplified using forward primer
H15693: 5′-AATAAGGGCTTAATCACCATGC-3′, and reverse primer L16106: 5′-
primer, relative to the published dog mitochondrial genome as in (6)). PCR was
carried out under the following protocol using 10 ng genomic DNA: Denaturation:
94 °C (40 s); annealing: 54 °C (1 min); amplification: 72 °C (1 min) for 35 total
cycles followed by a 5 min final annealing step at 72 °C. Sequencing reactions
were carried out on an ABI 3730 sequencer using BigDye Terminator chemistry
ambiguous bases were rerun in the opposite direction. Sequences were edited,
assembled, and aligned with Sequencher 4.8 (Gene Codes Corporation) and
submitted to GenBank with Sequin (http://www.ncbi.nlm.nih.gov/Sequin/).
Statistical Analyses. We used two approaches—principal component analysis
with EIGENSOFT v2.0 (22) and clustering analysis with STRUCTURE v2.2 (14)—to
classify individuals as indigenous or non-native and to describe the genetic
structure of indigenous African village dogs and their relationship to dogs from
putatively African breeds. We relied primarily on STRUCTURE to determine the
proportion of non-African admixture present in each village dog because STRUCTURE
allows for probabilistic assignment of individuals to classes and explicit
modeling of admixture (22). In contrast, PCA makes no assumptions regarding
discrete versus clinal population structure and is well suited for describing the
and PCA usually reveal very similar patterns of genetic variation (22).
Before running these clustering methods, we removed markers in high LD
with other markers [r² > 0.5, see (23)] using Arlequin v3.11 (24) and removed 9
village dogs that showed high relatedness to another dog in the genotyping
panel (pi-hat > 0.3). All STRUCTURE runs were done using the admixture model
parameter settings with a burnin period of 100,000 iterations followed by
500,000 MCMC repetitions, with 10 runs per K, and averaged using CLUMPP
v1.1.2 (25). In contrast, PCA was carried out separately for the SNP and microsatellite
markers. Microsatellite loci with n > 2 alleles were recoded as n − 1 biallelic
loci before running PCA in EIGENSOFT.
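The recoding of a multiallelic microsatellite into biallelic pseudo-loci can be sketched as follows. The text does not state which allele is left out or how genotypes are scored, so two details here are assumptions: the last allele in sorted order is dropped, and each pseudo-locus stores the number of copies (0, 1, or 2) of its allele. The function name and the allele sizes in the example are hypothetical.

```python
def recode_microsatellite(genotypes):
    """Recode one microsatellite locus with n observed alleles as n - 1
    biallelic pseudo-loci.

    genotypes: list of (allele1, allele2) tuples, or None for a missing call.
    Returns (kept_alleles, matrix), where matrix[i][j] is the number of copies
    of kept_alleles[j] carried by individual i, or None if the call is missing.
    """
    alleles = sorted({a for g in genotypes if g is not None for a in g})
    kept = alleles[:-1]  # assumption: the last allele is the redundant one
    matrix = []
    for g in genotypes:
        if g is None:
            matrix.append([None] * len(kept))
        else:
            matrix.append([g.count(a) for a in kept])
    return kept, matrix

# Hypothetical allele sizes (in bp) for four dogs, one with a missing call.
calls = [(120, 124), (124, 124), (128, 120), None]
kept, matrix = recode_microsatellite(calls)
print(kept)
for row in matrix:
    print(row)
```

One allele can be dropped without losing information because the copy counts at a diploid locus sum to two, so the last column is determined by the others.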
Expected heterozygosity (h) was calculated in Arlequin after removing 10 dogs
with a custom C++ implementation of Eq. 6 from (26); microsatellite FST was computed
using Arlequin. Unless otherwise noted, statistical tests were performed in R
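As an illustration of the gene diversity statistic reported in Table 3, the sketch below computes expected heterozygosity at one locus with Nei's sample-size-corrected estimator, h = n/(n − 1) × (1 − Σ p_i²), where n is the number of sampled gene copies and p_i are allele (or haplotype) frequencies. That Arlequin uses exactly this form is an assumption here, and the function name and example alleles are hypothetical.

```python
from collections import Counter

def gene_diversity(alleles):
    """Unbiased expected heterozygosity for one locus.

    alleles: flat list of sampled gene copies (two per diploid individual for
    nuclear markers, one per individual for mtDNA haplotypes).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two gene copies")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

# Hypothetical SNP alleles from five diploid dogs (ten gene copies).
sample = list("AAGGAGAAGA")
print(round(gene_diversity(sample), 3))
```

For multilocus summaries, a common choice is to average this per-locus value across markers, although other weighting schemes exist.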
ACKNOWLEDGMENTS. We thank numerous volunteers and animal shelters for
their assistance in gathering samples, including Leonard Kuwale, Ahmed Sa-
maha, Kazhila Chinsembu, Animal Care in Egypt (Luxor), Animal Friends Shelter
(Giza), Albergue de Animales Villa Michelle (Mayaguez), and Albergue La Gab-
riella (Ponce); Jason Mezey, Fengfei Wang, Katarzyna Bryc, and Andy Reynolds
for their assistance with lab and computational resources; Bob Wayne, Niels
Pedersen, Ben Sacks, Sarah Brown, and Peter Savolainen for helpful comments
and discussion; and the intramural program of the National Human Genome
Research Institute. This work was supported by the Center for Vertebrate Genomics,
Department of Clinical Sciences and Baker Institute of Animal Health, Cornell
University; National Institutes of Health Center for Scientific Review and R24
Foundation research fellowship.
1. Wayne R (2001) Consequences of domestication: Morphological diversity of the dog. In
2. Clutton-Brock J (1995) Origins of the dog: Domestication and early history. In The
Domestic Dog, Its Evolution, Behavior and Interactions with People, ed Serpell J (CUP,
Cambridge), pp 7–20.
3. Vilà C, Maldonado J, Wayne R (1999) Phylogenetic relationships, evolution, and
genetic diversity of the domestic dog. J Hered 90:71–77.
4. Germonpré M, et al. (2009) Fossil dogs and wolves from Palaeolithic sites in Belgium,
the Ukraine and Russia: Osteometry, ancient DNA and stable isotopes. J Arch Sci
5. Vilà C, et al. (1997) Multiple and ancient origins of the domestic dog. Science 276:1687–
Asian origin of domestic dogs. Science 298:1610–1613.
7. Coppinger R, Coppinger L (2001) in Dogs: A Startling New Understanding of Canine
Origin, Behavior and Evolution (Scribner, New York).
8. Dobney K, Larson G (2006) Genetics and animal domestication: New windows on an
elusive process. J Zool 269:261–271.
9. Miklosi A (2008) in Dog Behaviour, Evolution, and Cognition (Oxford Univ Press,
Oxford), p 304.
dogs: diversity and phylogenetic affinities. J Hered 97:318–330.
11. Irion D, Schaffer A, Grant S, Wilton A, Pedersen N (2005) Genetic variation analysis of
the Bali street dog using microsatellites. BMC Genet 6:6.
12. Runstadler J, Angles J, Pedersen N (2006) Dog leucocyte antigen class II diversity and
relationships among indigenous dogs of the island nations of Indonesia (Bali), Australia
and New Guinea. Tissue Antigens 68:418–426.
13. (2008) Police Zone. Encyclopædia Britannica. Online Ed.
14. Pritchard J, Stephens M, Donnelly P (2000) Inference of population structure using
multilocus genotype data. Genetics 155:945–959.
15. Parker HG, et al. (2004) Genetic structure of the purebred domestic dog. Science
16. Parker H, et al. (2007) Breed relationships facilitate fine-mapping studies: A 7.8-kb
deletion cosegregates with Collie eye anomaly across multiple dog breeds. Genome
variation within and among breeds. J Forensic Sci 52:562–572.
18. Savolainen P, Leitner T, Wilton A, Matisoo-Smith E, Lundeberg J (2004) A detailed
DNA. Proc Natl Acad Sci USA 101:12387–12390.
19. Björnerfeldt S, Webster M, Vilà C (2006) Relaxation of selective constraint on dog
mitochondrial DNA following domestication. Genome Res 16:990–994.
20. Jones P, et al. (2008) Single-nucleotide-polymorphism-based association mapping of
dog stereotypes. Genetics 179:1033–1044.
21. Purcell S, et al. (2007) PLINK: A tool set for whole-genome association and population-
based linkage analyses. Am J Hum Genet 81:559–575.
23. Kaeuffer R, Réale D, Coltman D, Pontier D (2007) Detecting population structure using
STRUCTURE software: Effect of background linkage disequilibrium. J Hered 99:374–380.
24. Excoffier L, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for
population genetics data analysis. Evol Bioinform Online 1:47–50.
25. Jakobsson M, Rosenberg N (2007) CLUMPP: A cluster matching and permutation
program for dealing with label switching and multimodality in analysis of population
structure. Bioinformatics 23:1801–1806.
26. Weir B, Cockerham C (1984) Estimating F-statistics for the analysis of population
structure. Evolution 38:1358–1370.
27. R Development Core Team (2008) in R: A language and environment for statistical
computing (R Foundation for Statistical Computing, Vienna, Austria).
28. Rosenberg N (2004) DISTRUCT: A program for the graphical display of population
structure. Mol Ecol Notes 4:137–138.
29. Ewens W (1972) The sampling theory of selectively neutral alleles. Theor Pop Biol