
Genomic Analyses Reveal the Influence of Geographic Origin, Migration, and Hybridization on Modern Dog Breed Development


Abstract

There are nearly 400 modern domestic dog breeds with unique histories and genetic profiles. To track the genetic signatures of breed development, we have assembled the most diverse dataset of dog breeds, reflecting their extensive phenotypic variation and heritage. Combining genetic distance, migration, and genome-wide haplotype sharing analyses, we uncover geographic patterns of development and independent origins of common traits. Our analyses reveal the hybrid history of breeds and elucidate the effects of immigration, revealing for the first time a suggestion of New World dog within some modern breeds. Finally, we used cladistics and haplotype sharing to show that some common traits have arisen more than once in the history of the dog. These analyses characterize the complexities of breed development, resolving longstanding questions regarding individual breed origination, the effect of migration on geographically distinct breeds, and, by inference, transfer of trait and disease alleles among dog breeds.
Highlights
- Neighbor joining cladogram of 161 breeds establishes 23 supported clades
- Crossing between diverse clades was done recently to add new traits
- Migration of a breed to a new region alters both immigrant and indigenous breeds
- Tracking recent crosses can identify the source of mutations in multiple breeds
Heidi G. Parker, Dayna L. Dreger, Maud
Rimbault, Brian W. Davis, Alexandra B.
Mullen, Gretchen Carpintero-Ramirez,
Elaine A. Ostrander
In Brief
The domestic dog is divided into
hundreds of island-like populations
called breeds. Parker et al. examine 161
breeds and show that they were
developed through division and
admixture. The analyses define clades,
estimate admixture dates, distinguish
geographically diverse populations, and
help determine the source of shared
mutations among diverse populations.
Parker et al., 2017, Cell Reports 19, 697–708
April 25, 2017 © 2017 The Author(s).
Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health,
Bethesda, MD 20892, USA
The dog, Canis familiaris, was the first domesticate and has earned a place
within nearly every society across the globe for thousands of
years (Druzhkova et al., 2013; Thalmann et al., 2013; Vilà et al.,
1997, 1999). Over the millennia, dogs have assisted humans
with hunting and livestock management, guarded house and
field, and played crucial roles in major wars (Moody et al.,
2006). Providing a range of services from companionship to pro-
duction of fur and meat (Wilcox and Walkowicz, 1995), the diver-
sity of talents and phenotypes combined with an unequalled
emotional connection between dogs and humans has led to
the creation of more than 350 distinct breeds, each of which is
a closed breeding population that reflects a collage of defining
traits (
Previous studies have addressed the genomic makeup of a
limited number of breeds, demonstrating that dogs from the
same breed share common alleles and can be grouped using
measures of population structure (Irion et al., 2003; Koskinen,
2003; Parker et al., 2004), and breeds that possess similar
form and function often share similar allelic patterns (Parker
et al., 2004, 2007; Vonholdt et al., 2010). However, none of these
studies have effectively accounted for the variety of mechanisms
through which modern breeds may have developed, such as
geographic separation and immigration; the role of hybridization
in the history of the breeds; and the timeline of the formation of
breeds. In this study, we overcome these barriers by presenting
an expansive dataset, including pure breeds sampled from mul-
tiple sections of the globe and genotyped on a dense scale. By
applying both phylogenetic methods and a genome-wide anal-
ysis of recent haplotype sharing, we have unraveled common
population confounders for many breeds, leading us to propose
a two-step process of breed creation beginning with ancient
separation by functional employment followed by recent selection
for physical attributes. These data and analyses provide a
basis for understanding which of the numerous, sometimes
deleterious, mutations are shared across seemingly unrelated
breeds, and why.
We examined genomic data from the largest and most diverse
group of breeds studied to date, amassing a dataset of 1,346
dogs representing 161 breeds. Included are populations with
vastly different breed histories, originating from all continents
except Antarctica, and sampled from North America, Europe,
Africa, and Asia. We have specifically included breeds that
represent the full range of phenotypic variation present among
modern dogs, as well as three breeds sampled from both the
United States and their country of origin. Samples from 938
dogs representing 127 breeds and nine wild canids were geno-
typed using the Illumina CanineHD bead array following standard
protocols. Data were combined with publicly available information
from 405 dogs genotyped using the same chip (Hayward
et al., 2016; Vaysse et al., 2011). For three dogs from one breed,
genotypes were retrieved from publicly available sequence
files, and all were merged into a single dataset (Table S1). After
pruning for low quality or genotyping rate, 150,067 informative
SNPs were retained.
This is an open access article under the CC BY license.
Ascertainment bias has been shown to skew population ge-
netic calculations that require estimation of allele frequencies
and diversity measures (Lachance and Tishkoff, 2013). It has
also been shown that ascertainment based on a single individual
provides less bias than a mixed group (Patterson et al., 2012).
The SNPs used in this study were identified primarily within the
boxer or from boxer compared to another genome (Vaysse
et al., 2011), which has exaggerated the boxer minor allele frequency
(MAF; 0.351 in boxer compared to 0.260 overall) but
has little effect on the other breeds (MAF range, 0.247–0.284). To
minimize the effect this might have, we have chosen to use distance
measures based on allele sharing rather than frequency
and to enhance these analyses with unbiased haplotype sharing
for a robust assessment of canine population structure.
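The minor allele frequency comparison above can be illustrated with a small sketch. It assumes each genotype is coded as the count of one arbitrarily chosen allele (0, 1, or 2), with None for a missing call; this coding and the function name are illustrative assumptions, not the authors' pipeline.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF at one SNP from per-dog allele counts (0, 1, 2, or None)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes at this SNP")
    # Each diploid dog carries two alleles at the site.
    freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq, 1 - freq)


# Hypothetical genotypes for five dogs at one SNP (one missing call).
print(minor_allele_frequency([0, 0, 1, 2, None]))  # 0.375
```

Averaging this value over all SNPs within one breed, and then over all dogs, would give per-breed and overall figures comparable in kind to those quoted in the text.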
A bootstrapped cladogram was obtained using an identity-by-
state distance matrix and a neighbor-joining tree algorithm (Sup-
plemental Experimental Procedures). After 100 bootstraps, 91%
of breeds (146/161) formed single, breed-specific nodes with
100% bootstrap support (Figure 1). Of the 15 breeds that did
not meet these criteria, seven (Belgian tervuren, Belgian
sheepdog, cane corso, bull terrier, miniature bull terrier, rat ter-
rier, and American hairless terrier) were part of two- or three-
breed clades that were supported at 98% or greater, and two
breeds (Lhasa apso and saluki) formed single-breed clades
that were supported at 50% and 78%, respectively. Four breeds
(redbone coonhound, sloughi, cane paratore, and Jack Russell
terrier) were split within single multi-breed clades, and the last
two breeds (xoloitzcuintli and Peruvian hairless dog) were split
between divergent clades.
Figure 1. Cladogram of 161 Domestic Dog Breeds
Breeds that form unique clades supported by 100% of bootstraps are combined into triangles. For all other branches, a gold star indicates 90% or better, black
star 70%–89%, and silver star 50%–69% bootstrap support. Breeds are listed on the perimeter of the circle. A small number of dogs do not cluster with the rest
of their breed, indicated as follows: *cane paratore, +Peruvian hairless dog, #sloughi, @country-of-origin salukis, and ^miniature xoloitzcuintli.
Nine of the breeds that were not
monophyletic were either newly recognized by the American
Kennel Club (AKC) or not recognized at the time of sample
collection and likely represent a breed under development.
Two other non-monophyletic breeds are composed of dogs
collected in two countries; the cane corsos collected in Italy
form a fully supported, single clade, as do the salukis collected
in the United States. However, the cane corsos collected in the
United States form a paraphyletic clade near the Neapolitan
mastiffs, and the salukis collected in the Middle East form
multiple paraphyletic groups within a clade that includes the
US salukis and Afghan hounds.
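As a rough illustration of the identity-by-state distance matrix that feeds the neighbor-joining tree, the sketch below computes a pairwise distance as one minus the proportion of alleles shared by state. This definition, the 0/1/2 genotype coding, and the function names are assumptions made for illustration; the authors' exact procedure is given in their Supplemental Experimental Procedures.

```python
def ibs_distance(g1, g2):
    """1 - proportion of alleles shared identical-by-state between two dogs.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    """
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs not genotyped in both dogs
        shared += 2 - abs(a - b)  # 2, 1, or 0 alleles shared at this SNP
        compared += 2
    if compared == 0:
        raise ValueError("no SNPs genotyped in both dogs")
    return 1 - shared / compared


def distance_matrix(dogs):
    """Symmetric dict of pairwise IBS distances keyed by (name, name)."""
    names = list(dogs)
    return {(x, y): ibs_distance(dogs[x], dogs[y]) for x in names for y in names}


# Hypothetical genotypes at four SNPs.
dogs = {
    "dogA": [0, 1, 2, 2],
    "dogB": [0, 1, 2, 1],
    "dogC": [2, 1, 0, None],
}
matrix = distance_matrix(dogs)
print(matrix[("dogA", "dogB")])  # 0.125
```

A matrix of this kind could then be passed to any neighbor-joining implementation, with bootstrap replicates obtained by resampling SNPs with replacement.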
Not including those that are breed specific, this study defined
105 phylogenetic nodes supported by ≥90% of bootstrap
replicates, 133 by ≥70%, and 150 by ≥50% of
replicates. We identify 29 multi-breed clades that are supported
at ≥90%. Each of these clades includes 2–16 breeds, and
together they account for 78% of breeds in the dataset. 150 breeds,
or 93% of the dataset, can be divided into 23 clades of 2–18
breeds each, supported at >50%. These multi-breed clades
reflect common behaviors, physical appearance, and/or related
geographic origin (Figure 2).
Eleven breeds did not group with significance to any other
breeds. Five breeds form independent clades and six others
are paraphyletic to established clades with <50% bootstrap sup-
port (Table S2). The lack of grouping may indicate that we have
not sampled the closest relatives of these breeds or that these
breeds comprise outcrossings that are not shared by similar breeds.
To assess hybridization across the clades, identical-by-
descent (IBD) haplotype sharing was calculated between all
pairs of dogs from the 161 breeds. Haplotypes were phased
using the program Beagle (Browning and Browning, 2013) in
100-SNP windows, resulting in a minimum haplotype size of
232 kb, well above the shared background level established in
previous studies (Lindblad-Toh et al., 2005; Sutter et al., 2004).
The large haplotypes specifically target admixture resulting
from breed formation rather than domestication, which previous
studies have not addressed. The total length of the shared hap-
lotypes was summed for each pair of dogs. Individuals from
within the same breed clade share nearly four times more of their
genome within large IBD haplotype blocks than dogs in different
breed clades (median shared haplotype lengths of 9,742,000 bp
and 2,184,000 bp, respectively; p [Kolmogorov-Smirnov (K-S)
and Wilcoxon] < 2.2e-16; Figure 3A). Only 5% of the across-breed
pairings have a median greater than 9,744,974 bp. These excep-
tions argue for recent admixture events between breeds, as evi-
denced by the example of the Eurasier breed, created in the
1970s by mixing chow chow with other spitz-type breeds (Fogle,
2000) (Figure 3B). The data reveal not only the components of the
breed but also the explanation for its placement on the clado-
gram. The Eurasier (unclustered) shows significant haplotype
sharing with the samoyed (unclustered), keeshond (Nordic spitz),
and chow chow (Asian spitz) (Figure 3B). Because all three
breeds are located in different clades, unrelated to each other,
the Eurasier falls between the component breeds and forms its
own single-breed clade. Haplotype-sharing bar graphs for
each of 161 breeds, including 152 AKC breeds, are available in
Data S1. This provides a long-term resource for identifying
populations that likely share rare and common traits that will be
invaluable for mapping the origins of deleterious and beneficial
mutations.
Figure 2. Representatives from Each of the 23 Clades of Breeds
Breeds and clades are listed for each picture from left to right, top to bottom.
(A) Akita/Asian spitz.
(B) Shih tzu/Asian toy (by Mary Bloom).
(C) Icelandic sheepdog/Nordic spitz (by Veronica
(D) Miniature schnauzer/schnauzer.
(E) Pomeranian/small spitz.
(F) Brussels griffon/toy spitz (by Mary Bloom).
(G) Puli/Hungarian.
(H) Standard poodle/poodle.
(I) Chihuahua/American toy.
(J) Rat terrier/American terrier (by Stacy Zimmer-
(K) Miniature pinscher/pinscher.
(L) Irish terrier/terrier.
(M) German shepherd dog/New World (by Mary
(N) Saluki/Mediterranean (by Mary Bloom).
(O) Basset hound/scent hound (by Mary Bloom).
(P) American cocker spaniel/spaniel (by Mary
(Q) Golden retriever/retriever (by Mary Bloom).
(R) German shorthaired pointer/pointer setter (by Mary Bloom).
(S) Briard/continental herder (by Mary Bloom).
(T) Shetland sheepdog/UK rural.
(U) Rottweiler/drover.
(V) Saint Bernard/alpine.
(W) English mastiff/European mastiff (by Mary Bloom).
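The haplotype-sharing summary used in this analysis (total shared length per pair of dogs, then a median per pair of breeds) can be sketched as follows. The input format, a list of (dog, dog, segment length in bp) tuples, and all names are hypothetical; the sketch assumes IBD segments have already been called, and it treats dog pairs with no detected segment as sharing 0 bp.

```python
from collections import defaultdict
from statistics import median


def total_sharing(segments):
    """Sum shared IBD haplotype lengths (bp) for each unordered pair of dogs."""
    totals = defaultdict(int)
    for dog_a, dog_b, length_bp in segments:
        totals[frozenset((dog_a, dog_b))] += length_bp
    return totals


def median_breed_sharing(totals, breed_of):
    """Median total sharing across all dog pairs for each pair of distinct breeds."""
    per_breed_pair = defaultdict(list)
    dogs = sorted(breed_of)
    for i, a in enumerate(dogs):
        for b in dogs[i + 1:]:
            if breed_of[a] != breed_of[b]:
                key = frozenset((breed_of[a], breed_of[b]))
                # Pairs with no detected segment contribute a total of 0 bp.
                per_breed_pair[key].append(totals.get(frozenset((a, b)), 0))
    return {pair: median(values) for pair, values in per_breed_pair.items()}


# Hypothetical IBD segments and breed labels.
segments = [
    ("e1", "c1", 400_000),
    ("e1", "c1", 300_000),
    ("e2", "c1", 500_000),
    ("e1", "b1", 250_000),
]
breed_of = {"e1": "Eurasier", "e2": "Eurasier", "c1": "chow chow", "b1": "boxer"}

medians = median_breed_sharing(total_sharing(segments), breed_of)
print(medians[frozenset(("Eurasier", "chow chow"))])  # 600000.0
```

Breed pairs whose median exceeds a chosen percentile of all across-clade values would then be the candidates for recent admixture.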
Strong evidence of admixture across the clades was found in
117 breeds (Figure 4). A small number of these were identified in
previous studies using migration analysis (Pickrell and Pritchard,
2012; Shannon et al., 2015). 30% of these breeds share with only
one breed outside their clade. Therefore, more than half (54%) of
the breeds that make up the 23 established clades share large
haplotypes with one or zero breeds outside their clade, indi-
cating breed creation by selection based on the initial founder
population rather than recent admixture. Only 6 of the 161
breeds share extensive haplotypes with many (>8) different
groups, suggesting recent creation of these breeds from multiple
others or that they provide a popular modern breed component.
The overall low level of sharing across diverse breeds suggests
that interclade crosses are done thoughtfully and for specific
reasons, such as the introduction of a new trait or the immigra-
tion of a breed to a new geographic region.
As importation and establishment in a new country has been
shown to have a measurable effect on breed structure (Quignon
et al., 2007), we chose three breeds, the Tibetan mastiff, saluki,
and cane corso, for inclusion in the study, with each collected in
the country of origin as well as from established populations in
the United States. In each case, there is division of the breed
based on collection location.
Figure 3. Gross Haplotype Sharing across Breeds
(A) Boxplot of total haplotype sharing between all pairs of dogs from breeds within the same clade, across different clades, and within the same breed. The
difference between the distributions is highly significant (p < 2e-16).
(B) Example of haplotype sharing between three breeds (samoyed, chow chow, and keeshond) and a fourth (Eurasier) that was created as a composite of the
other three. Combined haplotype length is displayed on the y axis, and 169 breeds and populations are listed on the x axis in the order they appear on the
cladogram, starting with the jackal and continuing counterclockwise. Haplotype sharing of zero is set at 250,000 for graphing, a value just below what is detected
in this analysis. Breeds are colored by clade. 95% significance level is indicated by the horizontal line. Breed abbreviations are listed under the graph in the order
they appear and colored by clade. Definitions of the breed abbreviations can be found in Table S1.
The split between the US and
Chinese Tibetan mastiffs is likely due to independent lineage formation
stemming from an importation bottleneck, as is evident
from estimations of inbreeding coefficients (Chinese Tibetan
mastiffs average F = 0.07, and US Tibetan mastiffs average
F = 0.15). Similarly, the average inbreeding coefficient of salukis
collected in the United States is twice as high as that of salukis sampled
from the countries of origin (F = 0.21 and 0.10, respectively).
Since the US salukis form a more strongly bootstrapped clade
than the country-of-origin dogs, we suggest that there is a less
diverse gene pool in the United States. In comparison, the
cane corsos from Italy form a single clade, while the cane corsos
from the United States cluster with the Neapolitan mastiffs, also
collected in the United States. Significant shared haplotypes are
observed between the US cane corsos and the rottweiler that are
not evident in the Italian cane corsos, as well as increased shared
haplotypes with the other mastiffs. Cane corsos have been in the
United States for less than 30 years (American Kennel Club,
Our analyses were designed to detect recent admixture;
therefore, we were able to identify hybridization events that are
described in written breed histories and stud-book records.
Using the most reliably dated crosses that produced modern
breeds, we established a linear relationship between the total
length of haplotype sharing and the age of an admixture event,
occurring between 35 and 160 years before present (ybp) (Figure 5A).
Applying this equation to the total shared haplotypes
calculated from the genotyping data, we have validated this relationship
on a second set of recently created breeds, arriving at
historically accurate time estimations (Figure 5B).
Figure 4. Haplotype Sharing between Breeds from Different Phylogenetic Clades
The circos plot is ordered and colored to match the tree in Figure 1. Ribbons connecting breeds indicate a median haplotype sharing between all dogs of each
breed in excess of 95% of all haplotype sharing across clades. Definitions of the breed abbreviations can be found in Table S1.
Using the relationship
equation y = -1,613,084.67x + 262,137,843.89, where y
is the total shared haplotype length and x is the number of years,
we can estimate the time at which undocumented crosses or di-
visions from older breeds took place. For example, based on a
median haplotype sharing value of 66,993,738, the golden
retriever was separated from the flat-coated retriever in 1895,
and the written history of the golden retriever dates to crosses
between multiple breeds taking place between 1868 and 1890
(Figure 5B), a near-perfect match.
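The golden retriever estimate can be reproduced with a short calculation. The sketch below inverts the linear relationship, taking the slope as negative, consistent with the inverse correlation stated in the title of Figure 5. The reference year of 2016 is an assumption (the text does not state the baseline year), chosen because it reproduces the reported date; the function names are illustrative.

```python
SLOPE = -1_613_084.67        # change in shared bp per year before present (from the text)
INTERCEPT = 262_137_843.89   # shared bp at zero years before present (from the text)


def years_before_present(total_shared_bp):
    """Invert y = SLOPE * x + INTERCEPT to recover x, the years since admixture."""
    return (total_shared_bp - INTERCEPT) / SLOPE


def estimated_year(total_shared_bp, reference_year=2016):
    # reference_year is an assumed baseline, not a value given in the text.
    return reference_year - years_before_present(total_shared_bp)


# Median sharing between golden retriever and flat-coated retriever (from the text).
golden_vs_flatcoat = 66_993_738
print(round(years_before_present(golden_vs_flatcoat), 1))  # 121.0
print(round(estimated_year(golden_vs_flatcoat)))           # 1895
```

Because the relationship was fitted to crosses dated 35 to 160 ybp, estimates outside that range would be extrapolations.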
To determine if the multi-breed clades are formed through
recent admixture rather than through common ancestral sour-
ces, we examined migration in 18 groups of four or more breeds.
These include 16 of the clades established on the tree, including
nearby unclustered breeds, and two groups of small clades
(American terrier/American toy and small spitz/toy spitz/schnauzer)
that are monophyletic, but not well supported. Using the program
Treemix (Pickrell and Pritchard, 2012), and allowing 0–12
predicted migration events, we determined the effect of admixture
on clade formation by calculating the increase in maximum
likelihood score over a zero migration tree (Figure 6A).
Figure 5. Total Haplotype Sharing Is Inversely Correlated with the Time of Hybridization between Breeds that Have Developed within the Last 200 Years
(A) The time of hybridization in years before present is graphed on the x axis and the median total haplotype sharing on the y axis for six breeds of dog with reliable recent histories of admixture in breed formation or recovery. The trendline shows a linear correlation.
(B) The slope and intercept of the trendline from (A) were applied to the median haplotype sharing values from the data for four additional breeds with reliable breed creation dates to establish accuracy of estimated hybridization dates. Table columns: Breed 1, Breed 2, Total sharing, Estimated Yrs Ago, Predicted Year, Historical year.
Only 2 of the 18 clades, New
World and Asian toy (Figures 6B and 6C),
showed evidence of extensive hybridiza-
tion between the breeds. Thus, the mod-
ern breeds were likely created through se-
lection for unique traits within an ancient
breed type with possible admixture from
unrelated breeds to enhance the trait.
Our hybridization analysis reveals evi-
dence for disease sharing across the
clades. For instance, collie eye anomaly
(CEA) is a disease that affects the develop-
ment of the choroid in several herding
breeds, including the collie, Border collie,
Shetland sheepdog, and Australian shep-
herd, all members of the UK Rural clade
(Lowe et al., 2003). The mutation and
haplotype pattern displayed IBD across
all affected breeds, and we speculated
that all share a common obviously
affected ancestor (Parker et al., 2007).
We were unable to explain, however, the presence of the disease
in the Nova Scotia duck tolling retriever, a sporting dog devel-
oped in Canada from an unknown mixture of local breeds, which
also shares the same haplotype. This perplexing observation can
now be explained, as this analysis shows that collie and/or Shetland
sheepdog were strong, undocumented contributors to the
formation of the Nova Scotia duck tolling retriever and, therefore,
the likely source of the CEA mutation within that breed
(Figure 7A).
Similarly, a mutation in the MDR1 gene (multi-drug resistance 1),
which causes life-threatening reactions to multiple drugs in many
of the UK Rural breeds, has been reported in 10% of German
shepherd dogs (Mealey and Meurs, 2008). These data display a
link between the German shepherd dog and UK Rural breeds
through the Australian shepherd, highlighting the unexpected
role the Australian shepherd or its predecessor played in the development
of the modern German shepherd dog (Figure 6B). Earlier
this year, the MDR1 mutation was identified in the chinook at a frequency
of 15% (Donner et al., 2016). Our analysis reveals recent
admixture between this breed and the German shepherd dog as
well as a previously unknown addition of collie, both carriers of
the MDR1 mutation. Haplotype sharing with the same affected
breeds is found in the xoloitzcuintli, which allows us to predict
that this rare breed may also carry the deleterious allele, although it has
yet to be tested.
Figure 6. Assessment of Migration between Breeds within Clades
Admixture was measured in Treemix for 18 groups of breeds representing clades or combinations of small clades.
(A) Improvement to the maximum likelihood tree of each group as the result of admixture. The y axis shows fold improvement over the zero admixture tree.
(B) Cladogram of the New World breeds with European herders allowing four migration events. Arrows show estimated migration between breeds colored by
weight (yellow to red = 0–0.5).
(C) Cladogram showing migration within the Asian toy clade, including a neighboring breed, the Tibetan terrier. Pictures by Terri Gueck (TIBT), Yuri Hooker (INCA),
Mary Bloom (GSD and SHIH), Maurizio Marziali (CPAT), Mary Malkiel (COOK), and John and Debbie Caponetto (large and small XOLO/MXOL).
Figure 7. Haplotypes Shared with Breeds that Carry Known Deleterious Mutations
Breeds are connected if the median shared haplotype size
exceeds the 95% threshold for interclade sharing. Sharing
between breeds that are known to carry the mutation is
colored black, and sharing with other breeds is colored
according to the breed that carries the mutation.
(A) Collie eye anomaly is found in a number of herding
breeds developed in the United Kingdom and some
sporting breeds developed in the United States.
(B) Multi-drug resistance 1 mutation is carried by many UK
herding breeds as well as the German shepherd.
Phylogenetic analyses have often been applied to determine the
relationships between dog breeds with the understanding that a
tree structure cannot fully explain the development of breeds.
Prior studies have shown that single mutations produce recog-
nizable traits that are shared across breeds from diverse clades,
suggesting that admixture across clades is a notable source of
morphologic diversity (Cadieu et al., 2009; Parker et al., 2009;
Sutter et al., 2007). Studies of linkage disequilibrium and haplo-
type sharing suggest further that within regions of 10–15 kb,
there exist a small number of haplotypes that are shared by
the majority of breeds, while breed specificity is revealed only
in large haplotypes (Lindblad-Toh et al., 2005; Sutter et al., 2004).
In this study, we observe that the majority of dog breeds either
do not share large haplotypes outside their clade or share with
only one remote breed. The small number of breeds that share
excessively outside their assigned clade could be recently
created from multiple diverse breeds or may have been popular
contributors to other breeds. For example, the pug dog groups
closely with the European toy breed, Brussels griffon (Figure 2F),
in the toy spitz clade but also shares extensive haplotypes with
the Asian toy breeds (Figure 2B) as well as many small dog
breeds from multiple other clades. This likely indicates the
pug’s early exportation from Asia and subsequent contribution
to many small breeds (Watson, 1906). Consider also the exten-
sive cross-clade haplotype sharing in the chinook, a recently
created breed with multiple ancestors from different breeds.
Our data both recapitulate and enhance the written history
of this breed ( S1).
Extreme examples such as these underscore the complications
implicit in relying on phylogeny alone to describe breed relation-
ships. Overall, our data show that admixture has played an
important role in the development of many breeds and, as new
hybrids are added to phylogenetic analyses, the topology of
the cladogram will likely rearrange to accommodate them.
The ability to determine a time of hybridization for recent
admixture events can refine sparse historical accounts of breed
formation. For example, when dog fighting was a popular form of
entertainment, many combinations of terriers and mastiff or
bully-type breeds were crossed to create dogs that would excel
in that sport. In this analysis, all of the bull and terrier crosses
map to the terriers of Ireland and date to 1860–1870. This coin-
cides perfectly with the historical descriptions that, though they
do not clearly identify all breeds involved, report the popularity of
dog contests in Ireland and the lack of stud book veracity, hence
undocumented crosses, during this era of breed creation (Lee,
The dates estimated from these data are approximations, as
selection for or against traits that accompanied each cross, as
well as the size of the population at the time of the cross, would
have affected retention of the haplotypes within the genome.
Based on these estimates, the excess haplotype sharing that
we have identified represents the creation of breeds since the
Victorian era breed explosion. Most breeds within each clade
share haplotypes at this level (<200 ybp); however, the lack of
sharing across the clades, outside of very specific crosses, sug-
gests the clades were developed much earlier than the breed
registries. Dividing the data by clade, the median haplotype
sharing is lowest in the Asian spitz (median = 0) and the Mediter-
ranean clades (median = 516,900) (median range across all
clades = 0–3,459,000), indicating that these clades are most
divergent and possibly older than the rest. This fits well with pre-
vious studies that suggest the earliest dogs came from Central
and East Asia (Pang et al., 2009; Shannon et al., 2015). Interest-
ingly, the mean haplotype sharing is slightly higher in the Asian
spitz clade than it is in the Mediterranean clade (mean =
1,596,000 and 1,317,000, respectively) (Figure S1), implying
that the Asian spitz breeds have been used in recent crosses
while the Mediterranean breeds are currently more segregated.
These data describe a staggered pattern of dog breed creation
starting with separation by type based on required function
and the form necessary to carry out that function. This would
have taken place as the need arose during early human progression
from hunter-gatherer to pastoral, agricultural, and finally urban
lifestyles. During the last 200 years, these breed types
were refined into very specific breeds by dividing the original
functional dog into morphotypes based on small changes in
appearance and with occasional outcrosses to enhance appear-
ance or alter behavior (e.g., reduce aggression, increase
Though most breeds within a clade appear to be the result of
descent from a common ancestor, the New World dogs and the
Asian toys showed nearly 200% improvement in the maximum
likelihood score by allowing for admixture between the breeds
within the clade. Based on this analysis, the Asian toy dogs
were likely not considered separate breeds when first exported
from their country of origin resulting in multiple admixture events
(Figure 6C). Unexpectedly, the New World clade admixture
events center exclusively on the German shepherd dog, which
informs both the development of this breed and the immigration
of dog breeds to the New World (Figure 6B). The inclusion of
German shepherd dog with cane paratore, an Italian working
farm dog, likely indicates a recent common ancestor among
these breeds, as the German shepherd dog was derived from
a herding dog of unknown ancestry in the late 1800s (http:// However, the hybridization of the German shepherd
dog with the Peruvian hairless dog and the xoloitzcuintli, also a
hairless breed, is unexpected and could be the result of recent
admixture to enhance the larger varieties of these breeds or
could indicate admixture of generic herding dogs from Southern
Europe into South America during the Columbian Exchange.
Dogs have been in the Americas for more than 10,000 years,
likely traveling from East Asia with the first humans (Wang
et al., 2016). However, studies of mitochondrial DNA suggest
that the original New World dogs were almost entirely replaced
through European contact (Castroviejo-Fisher et al., 2011;
Wayne and Ostrander, 1999; Witt et al., 2014) and additional
Asian migrations (Brown et al., 2015). As colonists came to the
Americas from the 16th to the 19th centuries, they brought Old
World livestock, and therefore the dogs required to manage and
tend the livestock, to the New World (Crosby, 1972). Many of the
newly introduced animals outcompeted the native animals,
which may explain the surprising and very strong herding dog
signature in the native hairless breeds of South and Central
America that were not developed to herd. In this analysis, we
observe that the ancient hairless breeds show extensive hybrid-
ization with herding dogs from Europe and, to a lesser extent,
with each other. We also identify two additional clades of New
World breeds, the American terriers and the American toys (Fig-
ures 2I and 2J), two monophyletic clades of small-sized breeds
from North/Central America, which include a set of related ter-
riers, and the Chihuahua and Chinese crested. Written records
state that the terriers trace their ancestry to the feists, a
North American landrace dog bred for hunting (http://www., The Chihuahua and Chinese
crested are both believed to have originated in Central America
(American Kennel Club, 1998; Parker et al., 2017), despite the
nomenclature of the latter, which implies Asian ancestry. In
contrast, most new breeds developed in the Americas were
created from crosses of European breeds and cluster accordingly
(e.g., Boston terrier [European mastiff], Nova Scotia duck
tolling retriever [retriever], and Australian shepherd [UK Rural]).
The separation of the older American breeds on the cladogram,
despite recent European admixture, suggests that both clades
may retain the aboriginal New World dog genomic signatures in-
termixed with the European breed haplotypes, similar to the
admixture among European, African, and Native American ge-
nomes that can be found in modern South American human pop-
ulations (Mathias et al., 2016; Ruiz-Linares et al., 2014). This is
the first indication that the New World dog signature may not
be entirely extinct in modern dog breeds, as has been previously
suggested (Leonard et al., 2002).
In addition to the effects on the native population, our analysis
of geographically distinct subsets of the same breeds shows that
some degree of admixture also occurs in the imported breeds
when first introduced into a new country. These data suggest
two outcomes of breed immigration that mirror human immigra-
tion into a new region: the immigrant population is less diverse
than the founding population, and there is often admixture with
the native population in early generations (Baharian et al.,
2016; Zhai et al., 2016).
We observe further evidence of the role geography plays in the
distribution of breeds within the clades. For instance, the
UK Rural and the Mediterranean clades each include both sighthound
and working dog breeds, two highly divergent groups in terms of
physical and behavioral phenotype. Sighthounds are lithe and
leggy hunters, built to run fast, and have a strong prey drive.
Working dogs include both the tall and heavy flock guards that
are bred to live among herds without human interaction, prevent-
ing predator attacks, and mid-sized herders (Figure 2T), which
are agile and bred to work closely with humans to control the
movement of the flock without harming them. Despite the
opposing phenotypes under selection, both breed types form
single clades stemming from distinct geographical regions.
Haplotype analysis shows no recent admixture between the
geographically distinct clades, suggesting that these groups
arose independently (Figures S2A and S2B). Archeological
depictions show sighthound-type hunting dogs that date back
4,000 ybp (Alderton, 2002; Fogle, 2000), and one of the earliest
known writings regarding segregation of dogs by type
clearly delineates hunting dogs from working dogs (Columella,
1954). The new cladogram presented herein suggests that the
switch from hunting to agricultural pursuits may have initiated
early breed formation and that this occurred in multiple regions.
These data show that geographical region can define a founda-
tional canid population within which selection for universally rele-
vant behaviors occurred independently, separating the regional
groups also by function long ago.
The lack of admixture across clades that appear to share a
common trait suggests that these traits may have arisen inde-
pendently, multiple times. For example, these data show no
recent haplotype sharing between the giant flock guards of the
Mediterranean and the European mastiffs (Figure S2D). These
breed types required large size for guarding; however, each
used that size in a different way, a fact that was recognized at
least 2,000 years ago (Columella, 1954). The flock guards use
their size to defeat animal predators, while the mastiffs use their
size to keep human predators at bay, often through fierce coun-
tenance rather than action. The phylogenetic placement of these
breeds and the lack of recent admixture suggest that giant size
developed independently in the different clades and that it may
have been one of the earliest traits by which breeds were segre-
gated thousands of years ago.
The cladogram of 161 breeds presented here represents the
most diverse dataset of domestic dog breeds analyzed to
date, displaying 23 well-supported clades of breeds represent-
ing breed types that existed before the advent of breed clubs
and registries. While the addition of more rare or niche breeds
will produce a denser tree, the results here address many unan-
swered questions regarding the origins of breeds. We show that
many traits such as herding, coursing, and intimidating size,
which are associated with specific canine occupations, have
likely been developed more than once in different geographical
locales during the history of the modern dog. These data also
show that extensive haplotype sharing across clades is a likely
indicator of recent admixture that took place in the time since
the advent of breed registries, thus leading to the creation of
most of the modern breeds. However, the primary breed types
were developed well before this time, indicating selection and
segregation of dog populations in the absence of formal breed
recognition. Breed prototypes have been forming through selec-
tive pressures since ancient times depending on the job they
were most required to perform. A second round of hybridization
and selection has been applied within the last 200 years to create
the many unique combinations of traits that modern breeds
display. By combining genetic distance relationships with pat-
terns of haplotype sharing, we can now elucidate the complex
makeup of modern dog breeds and guide the search for genetic
variants important to canine breed development, morphology,
behavior, and disease.
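The idea of reading strong cross-clade haplotype sharing as a sign of recent admixture can be illustrated with a minimal sketch. It assumes that the total length of shared haplotypes has already been computed for each pair of dogs (for example, by an identity-by-descent detection tool). The function names, the record layout, and the percentile cut-off are hypothetical choices made for illustration and are not the pipeline used in this study.

```python
import math
from collections import defaultdict
from statistics import median


def breed_pair_medians(dog_pair_records):
    """Median shared haplotype length for every pair of distinct breeds.

    dog_pair_records: iterable of (breed_a, breed_b, shared_bp), one record
    per pair of dogs (hypothetical layout, assumed for this sketch).
    """
    by_pair = defaultdict(list)
    for breed_a, breed_b, shared_bp in dog_pair_records:
        if breed_a == breed_b:
            continue  # within-breed sharing is not informative here
        key = tuple(sorted((breed_a, breed_b)))
        by_pair[key].append(shared_bp)
    return {pair: median(values) for pair, values in by_pair.items()}


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile of a non-empty collection of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def flag_cross_clade_pairs(dog_pair_records, breed_to_clade, pct=95):
    """Return cross-clade breed pairs whose median sharing exceeds the
    chosen percentile of all cross-clade medians (pct is an assumption)."""
    medians = breed_pair_medians(dog_pair_records)
    cross = {
        pair: value
        for pair, value in medians.items()
        if breed_to_clade[pair[0]] != breed_to_clade[pair[1]]
    }
    if not cross:
        return {}
    threshold = nearest_rank_percentile(cross.values(), pct)
    return {pair: value for pair, value in cross.items() if value > threshold}


# Toy data: breeds A and B in clade X, breeds C and D in clade Y.
records = [
    ("A", "C", 100), ("A", "C", 300),  # two dog pairs, median 200
    ("A", "D", 10),
    ("B", "C", 20),
    ("B", "D", 30),
    ("A", "B", 1000),                  # same clade, never flagged
]
clades = {"A": "X", "B": "X", "C": "Y", "D": "Y"}
flagged = flag_cross_clade_pairs(records, clades, pct=75)
print(flagged)
```

In the toy data only the A and C pair stands out from the other cross-clade comparisons, so it is the only candidate for recent admixture. Within-clade pairs such as A and B are excluded by design, because high sharing inside a clade is expected from common ancestry.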
Further details and an outline of resources used in this work can be found in the
Supplemental Experimental Procedures.
The accession numbers for the raw data files for the SNP genotype arrays
reported in this paper are GEO: GSE90441, GSE83160, GSE70454, and
Supplemental Information includes Supplemental Experimental Procedures,
two figures, two tables, and two data files and can be found with this article
online at
H.G.P. conceived of project, performed analyses, created figures, and pre-
pared the manuscript. D.L.D. created figures and assisted in manuscript prep-
aration. M.R. ran SNP chips and worked on early analysis. B.W.D. and A.B.M.
performed experiments. G.C.-R. performed sample collection and DNA isola-
tion. E.A.O. organized and directed the study and contributed to manuscript
We gratefully acknowledge support from the Intramural Program of the Na-
tional Human Genome Research Institute. We thank Sir Terence Clark for col-
lecting DNA samples from multiple breeds of sighthounds from their countries
of origin in Africa and Asia; Mauricio Lima, Flavio Bruno, and Robert Gennari for
collecting samples from native Italian breeds; and Lei Song for collecting sam-
ples from native Tibetan mastiffs.
Received: January 3, 2017
Revised: February 10, 2017
Accepted: March 28, 2017
Published: April 25, 2017
Alderton, D. (2002). Dogs (Dorling Kindersley, Ltd.).
American Kennel Club (1998). The Complete Dog Book, 19th Edition Revised
(Howell Book House).
Baharian, S., Barakatt, M., Gignoux, C.R., Shringarpure, S., Errington, J., Blot,
W.J., Bustamante, C.D., Kenny, E.E., Williams, S.M., Aldrich, M.C., and
Gravel, S. (2016). The great migration and African-American genomic diversity.
PLoS Genet. 12, e1006059.
Brown, S.K., Darwent, C.M., Wictum, E.J., and Sacks, B.N. (2015). Using mul-
tiple markers to elucidate the ancient, historical and modern relationships
among North American Arctic dog breeds. Heredity (Edinb.) 115, 488–495.
Browning, B.L., and Browning, S.R. (2013). Improving the accuracy and effi-
ciency of identity-by-descent detection in population data. Genetics 194,
Cadieu, E., Neff, M.W., Quignon, P., Walsh, K., Chase, K., Parker, H.G., Von-
holdt, B.M., Rhue, A., Boyko, A., Byers, A., et al. (2009). Coat variation in the
domestic dog is governed by variants in three genes. Science 326, 150–153.
Castroviejo-Fisher, S., Skoglund, P., Valadez, R., Vilà, C., and Leonard, J.A.
(2011). Vanishing native American dog lineages. BMC Evol. Biol. 11, 73.
Columella, L.J.M. (1954). On Agriculture (De Re Rustica), Vol. Books 5–9 (E.S.
Forster and E.H. Heffner, Trans.) (Harvard University Press).
Crosby, A.W., Jr. (1972). The Columbian Exchange (Greenwood Publishing
Donner, J., Kaukonen, M., Anderson, H., Möller, F., Kyöstilä, K., Sankari, S.,
Hytönen, M., Giger, U., and Lohi, H. (2016). Genetic panel screening of nearly
100 mutations reveals new insights into the breed distribution of risk variants
for canine hereditary disorders. PLoS ONE 11, e0161005.
Druzhkova, A.S., Thalmann, O., Trifonov, V.A., Leonard, J.A., Vorobieva, N.V.,
Ovodov, N.D., Graphodatsky, A.S., and Wayne, R.K. (2013). Ancient DNA anal-
ysis affirms the canid from Altai as a primitive dog. PLoS ONE 8, e57754.
Fogle, B. (2000). The New Encyclopedia of the Dog, Second Edition (Dorling
Kindersley Publishing, Inc.).
Hayward, J.J., Castelhano, M.G., Oliveira, K.C., Corey, E., Balkman, C.,
Baxter, T.L., Casal, M.L., Center, S.A., Fang, M., Garrison, S.J., et al. (2016).
Complex disease and phenotype mapping in the domestic dog. Nat. Commun.
7, 10460.
Irion, D.N., Schaffer, A.L., Famula, T.R., Eggleston, M.L., Hughes, S.S., and
Pedersen, N.C. (2003). Analysis of genetic variation in 28 dog breed popula-
tions with 100 microsatellite markers. J. Hered. 94, 81–87.
Koskinen, M.T. (2003). Individual assignment using microsatellite DNA reveals
unambiguous breed identification in the domestic dog. Anim. Genet. 34,
Lachance, J., and Tishkoff, S.A. (2013). SNP ascertainment bias in population
genetic analyses: why it is important, and how to correct it. BioEssays 35,
Lee, R.B. (1894). A History and Description of the Modern Dogs of Great Britain
and Ireland (Horace Cox).
Leonard, J.A., Wayne, R.K., Wheeler, J., Valadez, R., Guillén, S., and Vilà, C.
(2002). Ancient DNA evidence for Old World origin of New World dogs. Science
298, 1613–1616.
Lindblad-Toh, K., Wade, C.M., Mikkelsen, T.S., Karlsson, E.K., Jaffe, D.B.,
Kamal, M., Clamp, M., Chang, J.L., Kulbokas, E.J., 3rd, Zody, M.C., et al.
(2005). Genome sequence, comparative analysis and haplotype structure of
the domestic dog. Nature 438, 803–819.
Lowe, J.K., Kukekova, A.V., Kirkness, E.F., Langlois, M.C., Aguirre, G.D.,
Acland, G.M., and Ostrander, E.A. (2003). Linkage mapping of the primary dis-
ease locus for collie eye anomaly. Genomics 82, 86–95.
Mathias, R.A., Taub, M.A., Gignoux, C.R., Fu, W., Musharoff, S., O’Connor,
T.D., Vergara, C., Torgerson, D.G., Pino-Yanes, M., Shringarpure, S.S.,
et al.; CAAPA (2016). A continuum of admixture in the Western Hemisphere re-
vealed by the African Diaspora genome. Nat. Commun. 7, 12522.
Mealey, K.L., and Meurs, K.M. (2008). Breed distribution of the ABCB1-1Delta
(multidrug sensitivity) polymorphism among dogs undergoing ABCB1 geno-
typing. J. Am. Vet. Med. Assoc. 233, 921–924.
Moody, J.A., Clark, L.A., and Murphy, K.E. (2006). Canine history and breed
clubs. In The Dog and Its Genome, E.A. Ostrander, U. Giger, and K. Lind-
blad-Toh, eds. (Cold Spring Harbor Laboratory Press), pp. 1–18.
Pang, J.F., Kluetsch, C., Zou, X.J., Zhang, A.B., Luo, L.Y., Angleby, H., Ardalan, A.,
Ekström, C., Sköllermo, A., Lundeberg, J., et al. (2009). mtDNA data
indicate a single origin for dogs south of Yangtze River, less than 16,300 years
ago, from numerous wolves. Mol. Biol. Evol. 26, 2849–2864.
Parker, H.G., Kim, L.V., Sutter, N.B., Carlson, S., Lorentzen, T.D., Malek, T.B.,
Johnson, G.S., DeFrance, H.B., Ostrander, E.A., and Kruglyak, L. (2004).
Genetic structure of the purebred domestic dog. Science 304, 1160–1164.
Parker, H.G., Kukekova, A.V., Akey, D.T., Goldstein, O., Kirkness, E.F.,
Baysac, K.C., Mosher, D.S., Aguirre, G.D., Acland, G.M., and Ostrander,
E.A. (2007). Breed relationships facilitate fine-mapping studies: a 7.8-kb dele-
tion cosegregates with Collie eye anomaly across multiple dog breeds.
Genome Res. 17, 1562–1571.
Parker, H.G., VonHoldt, B.M., Quignon, P., Margulies, E.H., Shao, S., Mosher,
D.S., Spady, T.C., Elkahloun, A., Cargill, M., Jones, P.G., et al. (2009). An ex-
pressed fgf4 retrogene is associated with breed-defining chondrodysplasia in
domestic dogs. Science 325, 995–998.
Parker, H.G., Harris, A., Dreger, D.L., Davis, B.W., and Ostrander, E.A. (2017).
The bald and the beautiful: hairlessness in domestic dog breeds. Philos. Trans.
R. Soc. Lond. B Biol. Sci. 372, 372.
Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y., Gen-
schoreck, T., Webster, T., and Reich, D. (2012). Ancient admixture in human
history. Genetics 192, 1065–1093.
Pickrell, J.K., and Pritchard, J.K. (2012). Inference of population splits and mix-
tures from genome-wide allele frequency data. PLoS Genet. 8, e1002967.
Quignon, P., Herbin, L., Cadieu, E., Kirkness, E.F., Hédan, B., Mosher, D.S.,
Galibert, F., André, C., Ostrander, E.A., and Hitte, C. (2007). Canine population
structure: assessment and impact of intra-breed stratification on SNP-based
association studies. PLoS ONE 2, e1324.
Ruiz-Linares, A., Adhikari, K., Acuña-Alonzo, V., Quinto-Sanchez, M., Jaramillo, C.,
Arias, W., Fuentes, M., Pizarro, M., Everardo, P., de Avila, F., et al.
(2014). Admixture in Latin America: geographic structure, phenotypic diversity
and self-perception of ancestry based on 7,342 individuals. PLoS Genet. 10,
Shannon, L.M., Boyko, R.H., Castelhano, M., Corey, E., Hayward, J.J.,
McLean, C., White, M.E., Abi Said, M., Anita, B.A., Bondjengo, N.I., et al.
(2015). Genetic structure in village dogs reveals a Central Asian domestication
origin. Proc. Natl. Acad. Sci. USA 112, 13639–13644.
Sutter, N.B., Eberle, M.A., Parker, H.G., Pullar, B.J., Kirkness, E.F., Kruglyak,
L., and Ostrander, E.A. (2004). Extensive and breed-specific linkage disequi-
librium in Canis familiaris. Genome Res. 14, 2388–2396.
Sutter, N.B., Bustamante, C.D., Chase, K., Gray, M.M., Zhao, K., Zhu, L., Pad-
hukasahasram, B., Karlins, E., Davis, S., Jones, P.G., et al. (2007). A single
IGF1 allele is a major determinant of small size in dogs. Science 316, 112–115.
Thalmann, O., Shapiro, B., Cui, P., Schuenemann, V.J., Sawyer, S.K., Greenfield, D.L.,
Germonpré, M.B., Sablin, M.V., López-Giráldez, F., Domingo-Roura, X., et al. (2013).
Complete mitochondrial genomes of ancient canids
suggest a European origin of domestic dogs. Science 342, 871–874.
Vaysse, A., Ratnakumar, A., Derrien, T., Axelsson, E., Rosengren Pielberg, G.,
Sigurdsson, S., Fall, T., Seppälä, E.H., Hansen, M.S., Lawley, C.T., et al.;
LUPA Consortium (2011). Identification of genomic regions associated with
phenotypic variation between dog breeds using selection mapping. PLoS
Genet. 7, e1002316.
Vilà, C., Savolainen, P., Maldonado, J.E., Amorim, I.R., Rice, J.E., Honeycutt,
R.L., Crandall, K.A., Lundeberg, J., and Wayne, R.K. (1997). Multiple and
ancient origins of the domestic dog. Science 276, 1687–1689.
Vilà, C., Maldonado, J.E., and Wayne, R.K. (1999). Phylogenetic relationships,
evolution, and genetic diversity of the domestic dog. J. Hered. 90, 71–77.
Vonholdt, B.M., Pollinger, J.P., Lohmueller, K.E., Han, E., Parker, H.G.,
Quignon, P., Degenhardt, J.D., Boyko, A.R., Earl, D.A., Auton, A., et al.
(2010). Genome-wide SNP and haplotype analyses reveal a rich history under-
lying dog domestication. Nature 464, 898–902.
Wang, G.D., Zhai, W., Yang, H.C., Wang, L., Zhong, L., Liu, Y.H., Fan, R.X., Yin,
T.T., Zhu, C.L., Poyarkov, A.D., et al. (2016). Out of southern East Asia: the nat-
ural history of domestic dogs across the world. Cell Res. 26, 21–33.
Watson, J. (1906). The Dog Book, Volume II (Doubleday, Page & Company).
Wayne, R.K., and Ostrander, E.A. (1999). Origin, genetic diversity, and genome
structure of the domestic dog. BioEssays 21, 247–257.
Wilcox, B., and Walkowicz, C. (1995). Atlas of Dog Breeds of the World, Fifth
Edition (T.F.H. Publications).
Witt, K.E., Judd, K., Kitchen, A., Grier, C., Kohler, T.A., Ortman, S.G., Kemp,
B.M., and Malhi, R.S. (2014). DNA analysis of ancient dogs of the Americas:
identifying possible founding haplotypes and reconstructing population his-
tories. J. Hum. Evol. 79, 105–118.
Zhai, G., Zhou, J., Woods, M.O., Green, J.S., Parfrey, P., Rahman, P., and
Green, R.C. (2016). Genetic structure of the Newfoundland and Labrador
population: founder effects modulate variability. Eur. J. Hum. Genet. 24,
708 Cell Reports 19, 697–708, April 25, 2017
... In addition, 49 free-breeding dogs from the Ukrainian city of Vinnytsia, which is located approximately 350 km southwest of Chernobyl, were also included. To provide a comparison to purebred populations, we used genetic data from 1324 dogs from 162 breeds recognized by the Fédération Cynologique International, which are largely of western European descent (15). The 162 breeds were organized into clades of related breeds based on a bootstrapped phylogenetic tree of only purebred dogs, and the clades were named according to their general function or breed type for ease of comparison (e.g., shepherds and related breeds, U.K. flock guardians, and related breeds) (see Materials and Methods and fig. ...
... A total of 406 samples were genotyped using Illumina CanineHD 170k SNP arrays at NHGRI, thus maintaining consistency with previously analyzed datasets of purebred dogs (1296 dogs from 157 breeds) (15) and free-breeding dogs (232 from 12 countries) (13). Genotype calls were made with GenomeStudio (v2011.1) ...
... The purebred dog dataset included 1324 individuals from 162 breeds, six of which originated in either Russia or Ukraine (table S1). Of these, 1296 purebred dogs from 157 breeds had been previously genotyped using the Illumina CanineHD 170k SNP array (table S1) (15). An additional 28 dogs from six breeds originating in either Russia or Ukraine were downsampled from publicly available whole-genome sequence data ( ...
Ceasium-137 and 90Sr are major artificial radionuclides that have been released into the environment. Soil-to-plant transfer of radionuclides is an important route to food contamination. The radionuclide activity concentrations in crops must be quantitatively predicted for estimating the internal radiation doses from food ingestion. In this study, soil and potato samples were collected from three study sites contaminated with different sources of 137Cs and 90Sr: Aomori Prefecture (global fallout) and two accidental release areas (Fukushima Prefecture and the Chornobyl exclusion zone). The 137Cs activity concentrations in the soil and potato samples widely ranged from 1.0 to 250,000 and from 0.048 to 200,000 Bq kg-1 dry weight, respectively. The soil-to-potato transfer factor of 137Cs also ranged widely (0.0015-1.1) and decreased with increasing concentration of exchangeable K. Meanwhile, the activity concentrations of 90Sr in the soil and potato samples were 0.50-64,000 and 0.027-18,000 Bq kg-1 dry weight respectively, and the soil-to-potato transfer factor of 90Sr was 0.023-0.74, decreasing with increasing concentration of exchangeable Ca. The specific activity ratios of 137Cs/Cs and 90Sr/Sr in the exchangeable fraction were similar to those in potatoes, with a factor of 3 in the ±95 % confidence intervals over six orders of magnitude and a factor of 2 in the ±95 % confidence intervals over five orders of magnitude, respectively. According to the data, the accuracy of predicting the activity concentrations of 137Cs and 90Sr in potatoes can be improved by applying the specific activity ratios of 137Cs/Cs and 90Sr/Sr in the exchangeable fraction. This approach accounts for variable factors such as the effects of K and Ca fertilization and soil characteristics. It also emphasizes the benefit of determining the stable Cs and Sr concentrations in potatoes and other crops prior to possible future contamination.
... We next assessed levels of haplotype sharing among breed dogs. Consistent with previous studies [49], the average haplotype sharing of dogs within a breed is > 40 times greater than the average among dogs from different breeds (average across breeds = 23.5 Mb). Dogs representing breeds within the same clade, as identified on the consensus neighbor joining cladogram, share haplotypes at 3.6 times the average observed in breeds from different clades, and sharing is seven-fold higher for breeds within subclades compared to breeds in distinct clades [(Mann-Whitney test for all above comparisons is p < 2.2 × 10 −16 ) (Fig. 3)]. ...
... Breeds within the German Shepherd clade are the only ones showing significant levels of haplotype sharing with wolves. Since a similar analysis with SNV genotyping arrays and the CanFam 3.1 Boxer reference genome revealed the same result [49], using a German Shepherd Dog reference genome is unlikely to contribute significantly to this observation. ...
Full-text available
Background The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20 × data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function. Results We report the analysis of > 48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities including for breeds not included in the Dog10K collection. Conclusions We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available.
... Furthermore, analysis of molecular markers further identified Afghan hounds and Salukis as two of the most ancient breeds (Parker et al. 2004). These findings imply that sighthounds may have played an important role in the transition from a hunter-gathering to an agrarian society in ancient human civilizations, potentially initiating the early formation of distinct breeds (Parker et al. 2017). Due to their ancient lineages and absence of written records, the precise origin of sighthound breeds remains ambiguous. ...
... Genomic evidence suggests that the switch from hunting to agricultural pursuits may have initiated early breed formation; this could occur in multiple regions (Parker et al. 2017). Ancient dog breeds, such as Greyhounds and mastiffs, may have emerged during periods when human migration was challenging (Wayne and Ostrander 1999). ...
Full-text available
Sighthounds, a distinctive group of hounds comprising numerous breeds, have their origins rooted in ancient artificial selection of dogs. In this study, we performed genome sequencing for 123 sighthounds, including one breed from Africa, six breeds from Europe, two breeds from Russia and four breeds and 12 village dogs from the Middle East. We gathered public genome data of five sighthounds and 98 other dogs as well as 31 grey wolves to pinpoint the origin and genes influencing the morphology of the sighthound genome. Population genomic analysis suggested that sighthounds originated from native dogs independently and were comprehensively admixed among breeds, supporting the multiple origins hypothesis of sighthounds. An additional 67 published ancient wolf genomes were added for gene flow detection. Results showed dramatic admixture of ancient wolves in African sighthounds, even more than with modern wolves. Whole genome-scan analysis identified 17 positively selected genes (PSGs) in the African population, 27 PSGs in the European population, and 54 PSGs in the Middle Eastern population. None of the PSGs overlapped in the three populations. Pooled PSGs of the three populations were significantly enriched in "regulation of release of sequestered calcium ion into cytosol" (GO:0051279), which is related to blood circulation and heart contraction. In addition, ESR1, JAK2, ADRB1, PRKCE, and CAMK2D were under positive selection in all three selected groups. This suggests that different PSGs in the same pathway contributed to the similar phenotype of sighthounds. We identified an ESR1 mutation (chr1: g.42,177,149 T>C) in the transcription factor (TF) binding site of Stat5a, and a JAK2 mutation (chr1: g.93,277,007 T>A) in the TF binding site of Sox5. Functional experiments confirmed that the ESR1 and JAK2 mutation reduced their expression. Our results provide new insights into the domestication history and genomic basis of sighthounds.
... modern breeds), while a minority originated ~500 years ago (i.e. ancient breeds; Lindblad-Toh et al., 2005;Parker et al., 2017;Vonholdt et al., 2010). In our study we used a dog type made up of ancient dog breeds. ...
... Breed average genetic similarity was represented by an identity-by-state (IBS) matrix calculated from publicly available genetic data collected using the Illumina CanineHD bead array (Parker et al., 2017). The proportion of single-nucleotide polymorphisms (SNPs) identical by state (IBS) between pairs of individual dogs was calculated using PLINK (Chang et al., 2015). ...
Full-text available
To promote collaboration across canine science, address replicability issues, and advance open science practices within animal cognition, we have launched the ManyDogs consortium, modeled on similar ManyX projects in other fields. We aimed to create a collaborative network that (a) uses large, diverse samples to investigate and replicate findings, (b) promotes open science practices of pre-registering hypotheses, methods, and analysis plans, (c) investigates the influence of differences across populations and breeds, and (d) examines how different research methods and testing environments influence the robustness of results. Our first study combines a phenomenon that appears to be highly reliable—dogs’ ability to follow human pointing—with a question that remains controversial: do dogs interpret pointing as a social communicative gesture or as a simple associative cue? We collected data (N = 455) from 20 research sites on two conditions of a 2-alternative object choice task: (1) Ostensive (pointing to a baited cup after making eye-contact and saying the dog’s name); (2) Non-ostensive (pointing without eye-contact, after a throat-clearing auditory control cue). Comparing performance between conditions, while both were significantly above chance, there was no significant difference in dogs’ responses. This result was consistent across sites. Further, we found that dogs followed contralateral, momentary pointing at lower rates than has been reported in prior research, suggesting that there are limits to the robustness of point-following behavior: not all pointing styles are equally likely to elicit a response. Together, these findings underscore the important role of procedural details in study design and the broader need for replication studies in canine science.
... Given the peculiar population structure of purebred dogs, even a small number of carefully chosen subjects can serve as representative samples of their respective breeds [41]; indeed, the sample size per breed analysed in the present study is consistent with that of other comparable genomic studies [10,15,42]. ...
Full-text available
Shepherd and hunting dogs have undergone divergent selection for specific tasks, resulting in distinct phenotypic and behavioural differences. Italy is home to numerous recognized and unrecognized breeds of both types, providing an opportunity to compare them genomically. In this study, we analysed SNP data obtained from the CanineHD BeadChip, encompassing 116 hunting dogs (representing 6 breeds) and 158 shepherd dogs (representing 9 breeds). We explored the population structure, genomic background, and phylogenetic relationships among the breeds. To compare the two groups, we employed three complementary methods for selection signature detection: FST, XP-EHH, and ROH. Our results reveal a clear differentiation between shepherd and hunting dogs as well as between gun dogs vs. hounds and guardian vs. herding shepherd dogs. The genomic regions distinguishing these groups harbour several genes associated with domestication and behavioural traits, including gregariousness (WBSRC17) and aggressiveness (CDH12 and HTT). Additionally, genes related to morphology, such as size and coat colour (ASIP and TYRP1) and texture (RSPO2), were identified. This comparative genomic analysis sheds light on the genetic underpinnings of the phenotypic and behavioural variations observed in Italian hunting and shepherd dogs.
Full-text available
Growing concerns over health and welfare impacts from extreme phenotypes in dogs have created an urgent need for reliable demographic information on the national breed structures of dogs. This study included all dogs under primary veterinary care in the UK during 2019 at practices participating in VetCompass. Demographic data on these dogs were analysed to report on the frequency of common breeds and also to report on conformation, bodyweight, sex and neuter associations with these breeds. The study included 2,237,105 dogs under UK veterinary care in 2019. Overall, 69.4% (n = 1,551,462) were classified as purebred, 6.7% (149,308) as designer-crossbred and 24.0% (536,335) as nondesigner-crossbred. Across 800 unique breed names, the most frequent breeds at any age were nondesigner-crossbred (n = 536,335, 24.0%), Labrador Retriever (154,222, 6.9%) and Jack Russell Terrier (101,294, 4.5%). Among 229,624 (10.3%) dogs aged under one year, the most frequent breeds were nondesigner-crossbred (n = 45,995, 20.0%), French Bulldog (16,036, 7.0%) and Cockapoo (14,321, 6.2%). Overall, based on breed characteristics, 17.6% (395,739) were classified as brachycephalic, 43.1% (969,403) as mesaticephalic and 8.3% (186,320) as dolichocephalic. Of 1,551,336 dogs that were classifiable based on breed, 52.6% (815,673) were chondrodystrophic. Of 1,462,925 dogs that were classifiable, there were 54.6% (n = 798,426) short haired, 32.6% (476,883) medium haired and 12.8% (186,934) long haired. Of 1,547,653 dogs that were classifiable for ear carriage, 24.5% (n = 379,581) were erect, 28.1% (434,273) were semi-erect, 19.7% (305,475) were v-shaped drop and 27.7% (428,324) were pendulous. Overall, there was a 1.09:1.00 ratio of male (n = 1,163,512; 52.2%) to female dogs (n = 1,067,552; 47.8%). Health and welfare issues linked to popular breeds with extreme phenotypes suggest that there is much work to do to help owners to make more welfare-friendly decisions when choosing which type of dog to own.
Full-text available
Background: Veterinarians hold distinct breed-specific pain sensitivity beliefs that differ from the general public but are highly consistent with one another. This is remarkable as there is no current scientific evidence for biological differences in pain sensitivity across dog breeds. Therefore, the present study evaluated whether pain sensitivity thresholds differ across a set of dog breeds and, if so, whether veterinarians' pain sensitivity ratings explain these differences or whether these ratings are attributed to behavioral characteristics. Methods: Pain sensitivity thresholds [using quantitative sensory testing (QST) methods] and canine behaviors (using owner questionnaires and emotional reactivity tests) were prospectively measured across selected dog breeds. Adult, healthy dogs from 10 dog breeds/breed types were recruited, representing breeds subjectively rated by veterinarians as high (chihuahua, German shepherd, Maltese, Siberian husky), average (border collie, Boston terrier, Jack Russell terrier), or low (golden retriever, pitbull, Labrador retriever) pain sensitivity. A final sample of 149 dogs was included in statistical analyses. Results: Veterinarians' pain sensitivity ratings provided a minimal explanation for pain sensitivity thresholds measured using QST in dogs; however, dog breeds did differ in their pain sensitivity thresholds across the QST methods evaluated. Breed differences were observed for some aspects of emotional reactivity tests; however, these behavioral differences did not explain the differences in pain sensitivity thresholds found. Veterinarians' pain sensitivity ratings were positively associated with dog approach scores for the disgruntled stranger test suggesting that the way dogs greet strangers may be a factor influencing veterinarians' ratings of pain sensitivity across dog breeds. 
Conclusions and clinical relevance: Overall, these findings highlight a need to investigate biological mechanisms that may explain breed differences in pain sensitivity because this may inform pain management recommendations. Further, future research should focus on when and how these breed-specific pain sensitivity beliefs developed in veterinarians, as veterinarians' beliefs could impact the recognition and treatment of pain for canine patients.
Williams-Beuren Syndrome (WBS) is a neurodevelopmental disorder in humans caused by a hemizygous deletion of 28–30 genes and characterized by hypersociability and cognitive deficits. In canines, the homologous chromosomal region shows a strong signature of selection in domestic dogs relative to gray wolves, and four structural variants derived from transposons have been associated with social behavior. To explore these genetic associations in more phenotypic detail, as well as their role in training success, we genotyped 1,001 assistance dogs from Canine Companions for Independence®, including both successful graduates and those released from the training program for behavioral problems. We collected phenotypes on each dog using puppy-raiser questionnaires, trainer questionnaires, and both cognitive and behavioral tests. Using Bayesian mixed models, we found strong associations between genotypes and certain behavioral measures, including separation-related problems, aggression when challenged or corrected, and reactivity to other dogs. Furthermore, we found moderate differences in the genotypes of dogs who graduated versus those who did not; insertions in GTF2I showed the strongest association (β = 0.23, CI 95% = -0.04, 0.49), translating to an odds-ratio of 1.25 for one insertion. Our results provide insight into the role of each of these loci in canine sociability and may inform breeding and training practices for working dog organizations. Furthermore, the observed importance of GTF2I supports the emerging consensus that GTF2I genotypes, dosage, and expression are particularly important for the social behavior phenotypes seen in WBS.
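The step from the reported coefficient to an odds ratio can be sketched as follows. This is a minimal illustration that assumes β is a log-odds coefficient from a logistic-type model, which the abstract does not state explicitly; the function name is hypothetical. Exponentiating 0.23 gives roughly 1.26, close to the reported 1.25, and the small gap is presumably due to rounding of β.

```python
import math


def odds_ratio_from_log_odds(beta: float) -> float:
    """Convert a log-odds coefficient to an odds ratio (assumes a logit link)."""
    return math.exp(beta)


# Values quoted in the abstract for one GTF2I insertion.
beta = 0.23
ci_low, ci_high = -0.04, 0.49

point = odds_ratio_from_log_odds(beta)
interval = (odds_ratio_from_log_odds(ci_low), odds_ratio_from_log_odds(ci_high))

print(f"odds ratio: {point:.2f}")
print(f"interval on the odds-ratio scale: {interval[0]:.2f} to {interval[1]:.2f}")
```

Because the interval for β includes zero, the interval on the odds-ratio scale includes 1, which fits the abstract's description of the difference as moderate.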
An extraordinary amount of genomic variation is contained within the chromosomes of domestic dogs, manifesting as dramatic differences in morphology, behaviour and disease susceptibility. Morphology, in particular, has been a topic of enormous interest as biologists struggle to understand the small window of dog domestication from wolves, and the division of dogs into pure breeding, closed populations termed breeds. Many traits related to morphology, including body size, leg length and skull shape, have been under selection as part of the standard descriptions for the nearly 400 breeds recognized worldwide. Just as important, however, are the minor traits that have undergone selection by fanciers and breeders to define dogs of a particular appearance, such as tail length, ear position, back arch and variation in fur (pelage) growth patterns. In this paper, we both review and present new data for traits associated with pelage including fur length, curl, growth, shedding and even the presence or absence of fur. Finally, we report the discovery of a new gene associated with the absence of coat in the American Hairless Terrier breed. This article is part of the themed issue ‘Evo-devo in the genomics era, and the origins of morphological diversity’.
In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior, and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation, and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide essential considerations influencing the technological and sampling cohort design of association and other genomic studies.
The African Diaspora in the Western Hemisphere represents one of the largest forced migrations in history and had a profound impact on genetic diversity in modern populations. To date, the fine-scale population structure of descendants of the African Diaspora remains largely uncharacterized. Here we present genetic variation from deeply sequenced genomes of 642 individuals from North and South American, Caribbean and West African populations, substantially increasing the lexicon of human genomic variation and suggesting much variation remains to be discovered in African-admixed populations in the Americas. We summarize genetic variation in these populations, quantifying the postcolonial sex-biased European gene flow across multiple regions. Moreover, we refine estimates on the burden of deleterious variants carried across populations and how this varies with African ancestry. Our data are an important resource for empowering disease mapping studies in African-admixed individuals and will facilitate gene discovery for diseases disproportionately affecting individuals of African ancestry.
Background: The growing number of identified genetic disease risk variants across dog breeds challenges the current state-of-the-art of population screening, veterinary molecular diagnostics, and genetic counseling. Multiplex screening of such variants is now technologically feasible, but its practical potential as a supportive tool for canine breeding, disease diagnostics, pet care, and genetics research is still unexplored. Results: To demonstrate the utility of comprehensive genetic panel screening, we tested nearly 7000 dogs representing around 230 breeds for 93 disease-associated variants using a custom-designed genotyping microarray (the MyDogDNA® panel test). In addition to known breed disease-associated mutations, we discovered 15 risk variants in a total of 34 breeds in which their presence was previously undocumented. We followed up on seven of these genetic findings to demonstrate their clinical relevance. We report additional breeds harboring variants causing factor VII deficiency, hyperuricosuria, lens luxation, von Willebrand's disease, multifocal retinopathy, multidrug resistance, and rod-cone dysplasia. Moreover, we provide plausible molecular explanations for chondrodysplasia in the Chinook, cerebellar ataxia in the Norrbottenspitz, and familial nephropathy in the Welsh Springer Spaniel. Conclusions: These practical examples illustrate how genetic panel screening represents a comprehensive, efficient and powerful diagnostic and research discovery tool with a range of applications in veterinary care, disease research, and breeding. We conclude that several known disease alleles are more widespread across different breeds than previously recognized. However, careful follow-up studies of any unexpected discoveries are essential to establish genotype-phenotype correlations, as is readiness to provide genetic counseling on their implications for the dog and its breed.
The island inhabitants of Sardinia have long been a focus for studies of complex human traits due to their unique ancestral background and population isolation reflecting geographic and cultural restriction. Population isolates share decreased genomic diversity, increased linkage disequilibrium, and increased inbreeding coefficients. In many regions, dogs and humans have been exposed to the same natural and artificial forces of environment, growth, and migration. Distinct dog breeds have arisen through human-driven selection of characteristics to meet an ideal standard of appearance and function. The Fonni’s Dog, an endemic dog population on Sardinia, has not been subjected to an intensive system of artificial selection, but rather has developed alongside the human population of Sardinia, influenced by geographic isolation and unregulated selection based on its environmental adaptation and aptitude for owner-desired behaviors. Through analysis of 28 dog breeds, represented with whole-genome sequences from 13 dogs and ~170K genome-wide single nucleotide variants from 155 dogs, we have produced a genomic illustration of the Fonni’s Dog. Genomic patterns confirm within-breed similarity, while population and demographic analyses provide spatial identity of Fonni’s Dog to other Mediterranean breeds. Investigation of admixture and fixation indices reveals insights into the Fonni’s Dog’s involvement in breed development throughout the Mediterranean. We describe how characteristics of population isolates are reflected in dog breeds that have undergone artificial selection, and are mirrored in the Fonni’s Dog through traditional isolating factors that affect human populations. Lastly, we show that the genetic history of Fonni’s Dog parallels demographic events in local human populations.
Dogs were an important element in many native American cultures at the time Europeans arrived. Although previous ancient DNA studies revealed the existence of unique native American mitochondrial sequences, these have not been found in modern dogs, mainly purebred, studied so far. We identified many previously undescribed mitochondrial control region sequences in 400 dogs from rural and isolated areas as well as street dogs from across the Americas. However, sequences of native American origin proved to be exceedingly rare, and we estimate that the native population contributed only a minor fraction of the gene pool that constitutes the modern population. The high number of previously unidentified haplotypes in our sample suggests that a lot of unsampled genetic variation exists in non-breed dogs. Our results also suggest that the arrival of European colonists to the Americas may have led to an extensive replacement of the native American dog population by the dogs of the invaders.
We present a comprehensive assessment of genomic diversity in the African-American population by studying three genotyped cohorts comprising 3,726 African-Americans from across the United States that provide a representative description of the population across all US states and socioeconomic status. An estimated 82.1% of ancestors to African-Americans lived in Africa prior to the advent of transatlantic travel, 16.7% in Europe, and 1.2% in the Americas, with increased African ancestry in the southern United States compared to the North and West. Combining demographic models of ancestry and those of relatedness suggests that admixture occurred predominantly in the South prior to the Civil War and that ancestry-biased migration is responsible for regional differences in ancestry. We find that recent migrations also caused a strong increase in genetic relatedness among geographically distant African-Americans. Long-range relatedness among African-Americans and between African-Americans and European-Americans thus track north- and west-bound migration routes followed during the Great Migration of the twentieth century. By contrast, short-range relatedness patterns suggest comparable mobility of ∼15-16km per generation for African-Americans and European-Americans, as estimated using a novel analytical model of isolation-by-distance.
The domestic dog is becoming an increasingly valuable model species in medical genetics, showing particular promise to advance our understanding of cancer and orthopaedic disease. Here we undertake the largest canine genome-wide association study to date, with a panel of over 4,200 dogs genotyped at 180,000 markers, to accelerate mapping efforts. For complex diseases, we identify loci significantly associated with hip dysplasia, elbow dysplasia, idiopathic epilepsy, lymphoma, mast cell tumour and granulomatous colitis; for morphological traits, we report three novel quantitative trait loci that influence body size and one that influences fur length and shedding. Using simulation studies, we show that modestly larger sample sizes and denser marker sets will be sufficient to identify most moderate- to large-effect complex disease loci. This proposed design will enable efficient mapping of canine complex diseases, most of which have human homologues, using far fewer samples than required in human studies.
The current genetic makeup of Latin America has been shaped by a history of extensive admixture between Africans, Europeans and Native Americans, a process taking place within the context of extensive geographic and social stratification. We estimated individual ancestry proportions in a sample of 7,342 subjects ascertained in five countries (Brazil, Chile, Colombia, México and Perú). These individuals were also characterized for a range of physical appearance traits and for self-perception of ancestry. The geographic distribution of admixture proportions in this sample reveals extensive population structure, illustrating the continuing impact of demographic history on the genetic diversity of Latin America. Significant ancestry effects were detected for most phenotypes studied. However, ancestry generally explains only a modest proportion of total phenotypic variation. Genetically estimated and self-perceived ancestry correlate significantly, but certain physical attributes have a strong impact on self-perception and bias self-perception of ancestry relative to genetically estimated ancestry.
The population of the province of Newfoundland and Labrador (NL) has been a resource for genetic studies because of its historical isolation and increased prevalence of several monogenic disorders. Controversy remains regarding the genetic substructure and the extent of genetic homogeneity, which have implications for disease gene mapping. Population substructure has been reported from other isolated populations such as Iceland, Finland and Sardinia. We undertook this study to further our understanding of the genetic architecture of the NL population. We enrolled 494 individuals randomly selected from NL. Genome-wide SNP data were analyzed together with that from 14 other populations including HapMap3, Ireland, Britain and Native American samples from the Human Genome Diversity Project. Using multidimensional scaling and admixture analysis, we observed that the genetic structure of the NL population resembles that of the British population but can be divided into three clusters that correspond to religious/ethnic origins: Protestant English, Roman Catholic Irish and North American aboriginals. We observed reduced heterozygosity and an increased inbreeding coefficient (mean = 0.005), which corresponds to that expected in the offspring of third-cousin marriages. We also found that the NL population has a significantly higher number of runs of homozygosity (ROH) and longer lengths of ROH segments. These results are consistent with our understanding of the population history and indicate that the NL population may be ideal for identifying recessive variants for complex diseases that affect populations of European origin. European Journal of Human Genetics advance online publication, 16 December 2015; doi:10.1038/ejhg.2015.256.
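The comparison between the mean inbreeding coefficient and third-cousin marriages can be checked with standard pedigree path counting. The sketch below is an illustration, not the study's method: it assumes full cousins who share exactly two common ancestors, and that those ancestors are themselves non-inbred. The function name is hypothetical. Under those assumptions the offspring of n-th cousins have F = 2 × (1/2)^(2n+3), which for third cousins is 1/256, or about 0.0039, the same order as the reported mean of 0.005.

```python
def offspring_inbreeding_coefficient(cousin_degree: int) -> float:
    """Expected inbreeding coefficient F for offspring of full n-th cousins.

    Assumes two shared, non-inbred common ancestors (a simplifying assumption).
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    # Individuals on the loop from one parent, up to a common ancestor,
    # and back down to the other parent.
    individuals_in_path = 2 * cousin_degree + 3
    # One path per common ancestor, and there are two common ancestors.
    return 2 * 0.5 ** individuals_in_path


for degree, label in [(1, "first"), (2, "second"), (3, "third")]:
    f_value = offspring_inbreeding_coefficient(degree)
    print(f"{label} cousins: F = {f_value:.4f}")
```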