Comparison of village dog and wolf genomes highlights the role of the neural crest in dog domestication

Springer Nature
BMC Biology
RESEARCH ARTICLE Open Access
Amanda L. Pendleton¹, Feichen Shen¹, Angela M. Taravella¹, Sarah Emery¹, Krishna R. Veeramah², Adam R. Boyko³ and Jeffrey M. Kidd¹,⁴*
Abstract
Background: Domesticated from gray wolves between 10 and 40 kya in Eurasia, dogs display a vast array of
phenotypes that differ from their ancestors, yet mirror other domesticated animal species, a phenomenon known
as the domestication syndrome. Here, we use signatures persisting in dog genomes to identify genes and
pathways possibly altered by the selective pressures of domestication.
Results: Whole-genome SNP analyses of 43 globally distributed village dogs and 10 wolves differentiated
signatures resulting from domestication rather than breed formation. We identified 246 candidate
domestication regions containing 10.8 Mb of genome sequence and 429 genes. The regions share haplotypes
with ancient dogs, suggesting that the detected signals are not the result of recent selection. Gene
enrichments highlight numerous genes linked to neural crest and central nervous system development as
well as neurological function. Read depth analysis suggests that copy number variation played a minor role in
dog domestication.
Conclusions: Our results identify genes that act early in embryogenesis and can confer phenotypes
distinguishing domesticated dogs from wolves, such as tameness, smaller jaws, floppy ears, and diminished
craniofacial development as the targets of selection during domestication. These differences reflect the
phenotypes of the domestication syndrome, which can be explained by alterations in the migration or
activity of neural crest cells during development. We propose that initial selection during early dog
domestication was for behavior, a trait influenced by genes which act in the neural crest, which secondarily
gave rise to the phenotypes of modern dogs.
Keywords: Domestication, Canine, Selection scan, Neural crest, Retinoic acid
Background
The process of animal domestication by humans was
complex and multi-staged, resulting in disparate appear-
ances and behaviors of domesticates relative to their wild
ancestors [1–3]. In 1868, Darwin noted that numerous
traits are shared among domesticated animals, an obser-
vation that has since been classified as the domestication
syndrome [4]. This syndrome describes the phenomenon
where diverse phenotypes are shared among phylogenet-
ically distinct domesticated species but absent in their
wild progenitors. Such traits include increased tameness,
shorter muzzles/snouts, smaller teeth, more frequent es-
trous cycles, floppy ears, reduced brain size, depigmen-
tation of skin or fur, and loss of hair.
During the domestication process, the most desired
traits are subject to selection. This selection process
may result in detectable genetic signatures such as
alterations in allele frequencies [5–11], amino acid
substitution patterns [12–14], and linkage disequilibrium
patterns [15, 16]. Numerous genome selection scans
have been performed within a variety of domesticated
* Correspondence: jmkidd@umich.edu
¹ Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
⁴ Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
Full list of author information is available at the end of the article
© Kidd et al. 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Pendleton et al. BMC Biology (2018) 16:64
https://doi.org/10.1186/s12915-018-0535-2
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
animal taxa [5–11, 17], and several genes are
highlighted as likely associated with the domestication
syndrome. This is not unexpected given that more
than a dozen diverse behavioral and complex physical
traits fall under the syndrome, making it likely that
numerous genes with pleiotropic effects contribute
through mechanisms which act early in organismal
development [18,19]. For this reason, the putative
role of the neural crest in domestication has gained
traction [18, 20, 21]. Alterations in neural crest cell
number and function can also influence behavior. For
example, the adrenal and pituitary systems, which are
derived from neural crest cells, influence aggression
and the “fight or flight” behavioral reactions, two
responses which are lessened in domesticates [22].
No domestic animal has shared more of its evolution-
ary history in direct contact with humans than the dog
(Canis lupus familiaris, also referred to as Canis famil-
iaris), living alongside humans for more than ten thou-
sand years since domestication from its ancestor the
gray wolf (Canis lupus). Despite numerous studies,
vigorous debate still persists regarding the location, timing,
and number of dog domestication events [23–27].
Several studies [5, 8, 26, 28, 29] using related approaches
have attempted to identify genomic regions which are
highly differentiated between dogs and wolves, with the
goal of identifying candidate targets of selection during
domestication (candidate domestication regions, CDRs
[5]). In these studies, breed dogs either fully or partially
represented dog genetic diversity. Most modern breeds
arose ~ 300 years ago [30] and contain only a small por-
tion of the genetic diversity found among the vast majority
of extant dogs. In contrast, semi-feral village dogs are
the most abundant and genetically diverse modern dog
populations and have undergone limited targeted selec-
tion by humans since initial domestication [24,31].
These two dog groups represent products of two bottle-
necks in the evolution of the domestic dog, the first
resulting from the initial domestication of gray wolves,
and the second from modern breed formation [32,33].
Selection scans including breed dog genetic data may
therefore confound signatures associated with these two
events. Indeed, we recently reported [34] that neither
ancient nor modern village dogs could be genetically dis-
tinguished from wolves at 18 of 30 previously identified
autosomal CDRs [5,8]. Furthermore, most of these stud-
ies employed empirical outlier approaches wherein the
extreme tail of differentiated loci is assumed to differ
due to the action of selection [35]. Freedman et al. [29]
extended these studies through the use of a simulated
demographic history to identify loci whose variability is
unlikely to result from a neutral population history of
bottlenecks and migration. When compared to previous
outlier-based studies, most of the regions identified in
[29] were novel, and harbored genes in neurological, be-
havioral, and metabolic pathways.
In this study, we reassess candidate domestication re-
gions in dogs using genome sequence data from a globally
diverse collection of village dogs and wolves. First, using
methods previously applied to breed dog samples, we
show that the use of semi-feral village dogs better captures
dog genetic diversity and identifies loci more likely to be
truly associated with domestication. Next, we perform a
scan for CDRs in village dogs utilizing the XP-CLR statis-
tic, refine our results by requiring shared haplotypes with
ancient dogs (> 5000 years old) and present a revised set
of pathways altered during dog domestication. Finally, we
perform a scan for copy number differences between vil-
lage dogs and wolves, and identify additional copy number
variation at the starch-metabolizing gene amylase-2b
(AMY2B) that is independent of the AMY2B tandem
expansion previously found in dogs [5, 36–38].
Results
Use of village dogs eliminates bias in domestication scans
associated with breed formation
Comparison using F_ST outlier approaches
Utilizing pooled F_ST calculations in sliding windows
along the genome, two previous studies [5, 8] isolated
candidate domestication regions from sample sets
consisting of mostly breed dogs and wolves. These loci were
classified as statistical outliers based on empirical
thresholds (arbitrary Z score cutoffs). In order to
demonstrate the impact of sample choice (i.e., breed vs
village dogs) on the detection of selective signatures
associated with early domestication pressures, rather
than breed formation, we adapted the methods from
these studies and identified outlier loci empirically [5,8].
First, through ADMIXTURE [39] and identity-by-state
(IBS) analyses, we identified a collection of 43 village
dog and 10 gray wolf samples (Additional file 1: Table S1)
that have less than 5% dog-wolf admixed ancestry and
exclude close relatives (Fig. 1a, b; see the “Methods”
section). Principal component analysis (PCA) illustrates the
genetic separation between village dogs and wolves along
PCs 1 and 2 (Fig. 1c), while positions along PC4 reflect
the east-west geographic distribution of the village dog
populations (Fig. 1d). To compare directly with previous
studies, we calculated average F_ST values in overlapping
200 kb sliding windows with a step-size of 50 kb across
the genome using a pooled approach. As in [5, 8], we
performed a Z transformation of F_ST values to normalize
the resulting values and identified windows with a ZF_ST
score greater than 5 (autosomes) or 3 (X chromosome) as
candidate domestication regions. Following merging, this
outlier procedure identified 31 CDRs encompassing
12.3 Mb of sequence (Additional file 1: Table S2). As in
previous studies, a 550 kb region on chromosome 6
(46.80–47.35 Mb) that contains the pancreatic amylase
2B (AMY2B) and RNA Binding Region Containing 3
(RNPC3) genes had the highest observed average ZF_ST
score (ZF_ST = 7.67).
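The outlier procedure described above (Z transformation of window F_ST values, a fixed ZF_ST cutoff, then merging of overlapping windows) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy F_ST values, and the single hypothetical chromosome are invented for the example, and per-window F_ST is assumed to have been computed already.

```python
import numpy as np

def z_transform(fst):
    """Standardize window F_ST values: Z = (x - mean) / standard deviation."""
    fst = np.asarray(fst, dtype=float)
    return (fst - fst.mean()) / fst.std()

def merge_outlier_windows(starts, ends, z, cutoff):
    """Merge overlapping windows whose Z score exceeds the cutoff.

    starts/ends are window coordinates on one chromosome, sorted by start.
    Returns a list of (start, end) merged regions.
    """
    regions = []
    for s, e, score in zip(starts, ends, z):
        if score <= cutoff:
            continue
        s, e = int(s), int(e)
        if regions and s <= regions[-1][1]:
            # Window overlaps the previous outlier region: extend it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], e))
        else:
            regions.append((s, e))
    return regions

# Toy data (hypothetical): 200 kb windows with a 50 kb step.
starts = np.arange(0, 5_000_000, 50_000)
ends = starts + 200_000
fst = np.full(len(starts), 0.10)
fst[5:7] = 0.90  # two adjacent, strongly differentiated windows

z = z_transform(fst)
# Autosomal cutoff of 5 as in the text; the X chromosome would use 3
# and would be standardized separately.
regions = merge_outlier_windows(starts, ends, z, cutoff=5.0)
print(regions)
```

Standardizing autosomal and X-linked windows in separate calls to `z_transform` mirrors the separate treatment of the X chromosome described in the text.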
Only 15 of these 31 regions intersect with those
reported in [5] and [8] (Fig. 2a). To further explore this
discrepancy, we visually assessed whether the dog or
wolf haplotype is present at the loci reported in these
earlier studies in 46 additional canine samples, including
three ancient European dogs ranging in age from 5000
to 7000 years old (see the “Methods” section; [23, 34]).
Likely due to the absence of village dogs in their study,
some loci identified in Axelsson et al. [5] appear to con-
tain selective sweeps associated with breed formation, as
evidenced by the presence of the wild haplotype in an-
cient and village dogs (example in Fig. 2b). Although all
autosomal sweeps identified by [8] intersected with
CDRs from our study, seven of their X chromosome
windows did not meet the thresholds of significance from
our SNP sets (example in Additional file 2: Figure S1).
Unlike [8], we performed F_ST scans and Z transformations
for windows on autosomes and the X chromosome
separately, which may limit false inflation of F_ST signals on the
X that arise due to smaller effective population sizes and
correspondingly higher expected levels of genetic drift on
the X chromosome. More detailed analysis of the loci
highlighted in these two earlier studies [5, 8] is presented
in the following section.
Refined assessment of previously identified candidate
differentiated loci using demographic models and ancient
genomes
The above results suggest that the use of village dogs,
rather than breed dogs, in selection scans identifies
novel candidate domestication regions that are not
confounded by breed formation. We developed a
statistical filtering strategy to explore systematically
the impact of sample choice on F_ST-based scans. First,
rather than setting an empirical threshold at a ZF_ST
score of 5, we created a neutral null model that captures
key aspects of dog and wolf demographic history
(Additional file 1: Table S3; Additional file 2: Figure S2;
Fig. 1 Origin and diversity of sampled village dogs and wolves. a The approximate geographic origin of the village dog (circles) and gray wolf
(triangles) genome samples included in our analysis. The numbers within each shape indicate the sample count from each population.
b Admixture plot at K = 3 for the filtered village dog (N = 43) and gray wolf set (N = 10). c, d Principal component analysis of the filtered
sample set at 7,657,272 sites. Results are projected on c PC1 and PC2 and d PC3 and PC4. Colors in all figures correspond to sample origins and
are explained in the PCA legends
[34, 40]). We identified 443 autosomal sliding windows
with F_ST values that exceed the 99th percentile of the
neutral simulations (F_ST = 0.308; Additional file 2: Figure S3a).
Second, reasoning that a true domestication sweep will be
largely fixed among extant dogs with no recent wolf
admixture, we calculated pooled heterozygosity (H_P) in
village dogs within the same window boundaries and
retained windows with an H_P lower than the 0.1th
percentile observed in our simulations (Additional file 2:
Figure S3b). This heterozygosity filter removed 199 of the
443 windows. Finally, we excluded regions where the pu-
tatively selected haplotype is not found in ancient dog
samples. To do this, we calculated the difference in dog
H_P (ΔH_P) with and without the inclusion of two ancient
dog samples (HXH, a 7-ky-old dog from Herxheim,
Germany [34] and NGD, a 5-ky-old dog from Newgrange,
Ireland [23]; see the “Methods” section). Windows with ΔH_P
greater than the 5th percentile of all windows genome-wide
(ΔH_P = 0.0036) were removed (Additional file 2: Figures S3c,
d and S4). Remaining overlapping windows were
merged, resulting in 58 autosomal F_ST CDRs that
encompass 18.65 Mbp of the genome and are within 50 kb
of 248 Ensembl gene models (Fig. 3; Additional file 1:
Table S4).
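The three-step filter described above can be sketched in a few lines. This is an illustrative sketch only: the pooled heterozygosity formula used here is the definition commonly applied in pooled domestication scans (an assumption, since the formula is not restated in this section), the sign convention for ΔH_P (with ancient samples minus without) is assumed, and the allele counts and the H_P cutoff of 0.10 are hypothetical. The F_ST and ΔH_P thresholds are the values printed in the text.

```python
import numpy as np

def pooled_heterozygosity(major_counts, minor_counts):
    """H_P = 2 * sum(n_maj) * sum(n_min) / (sum(n_maj) + sum(n_min))**2.

    Counts are per-SNP major and minor allele counts within one window.
    (Assumed definition; not restated in this section of the paper.)
    """
    n_maj = float(np.sum(major_counts))
    n_min = float(np.sum(minor_counts))
    return 2.0 * n_maj * n_min / (n_maj + n_min) ** 2

def passes_filters(fst, hp, delta_hp, fst_min, hp_max, delta_hp_max):
    """Keep a window only if it is differentiated, nearly fixed in village
    dogs, and not made more heterozygous by adding the ancient samples."""
    return fst > fst_min and hp < hp_max and delta_hp <= delta_hp_max

# Hypothetical window with three SNPs; 43 village dogs = 86 chromosomes.
hp_village = pooled_heterozygosity([86, 80, 84], [0, 6, 2])
# Same window after adding two ancient dogs that carry the major alleles.
hp_with_ancient = pooled_heterozygosity([90, 84, 88], [0, 6, 2])
delta_hp = hp_with_ancient - hp_village  # assumed sign convention

keep = passes_filters(
    fst=0.45, hp=hp_village, delta_hp=delta_hp,
    fst_min=0.308,        # 99th percentile of neutral simulations (from the text)
    hp_max=0.10,          # hypothetical cutoff for this toy example
    delta_hp_max=0.0036,  # 5th-percentile threshold as printed in the text
)
print(hp_village, delta_hp, keep)
```

In this toy window the ancient samples share the dominant village dog haplotype, so ΔH_P is negative and the window is retained.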
We applied the same filtration parameters to the
candidate domestication regions identified on the
autosomes in Axelsson et al. (N = 30; [5]) and Cagan and
Blass (N = 5; [8]) (Additional file 2: Figure S5a and b).
Since window coordinates of these studies may not
precisely match our own, we selected the maximum F_ST
value per locus from our village dog and wolf data. We
then removed any locus with F_ST, H_P, and ΔH_P levels not
passing our thresholds. Following these three filtration
steps, only 14 Axelsson and 4 Cagan and Blass loci
remained. In addition, we separately assessed the overlap
of our F_ST-based regions with the 349 loci identified by
[29] using various statistics and a simulation-based
significance threshold which is more comparable to our
approach. We found that only 41 of the 349 loci from [29]
passed our filtrations (Additional file 2: Figure S5c).
In total, 25/58 loci identified using F_ST in village dogs
intersected with a putative sweep identified from at least one
previous study (for specific overlaps, see Additional file 1:
Table S4). The fact that the majority of the previously
reported CDRs fail our thresholds when examined in village
dogs and ancient dogs suggests that these CDRs reflect
selection events that occurred in breeds after dog
domestication, rather than true domestication sweeps which should
be present in all dogs.
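Counting how many candidate regions intersect a previously published set reduces to an interval-overlap test. The sketch below shows one simple way to do this; the coordinates are hypothetical and the helper names are invented for illustration, so it should not be read as the procedure used in the paper.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]

def count_intersecting(query, reference):
    """Number of query intervals that overlap any reference interval."""
    return sum(any(overlaps(q, r) for r in reference) for q in query)

# Hypothetical coordinates, for illustration only.
our_regions = [
    ("chr1", 1_000_000, 1_200_000),
    ("chr6", 46_800_000, 47_350_000),
    ("chr7", 10_000_000, 10_150_000),
]
published = [
    ("chr6", 46_900_000, 47_000_000),
    ("chr7", 10_140_000, 10_300_000),
    ("chr9", 5_000_000, 5_100_000),
]
n_shared = count_intersecting(our_regions, published)
print(n_shared)
```

Because several published loci can fall inside one merged region, counting query regions (rather than pairs) matches the convention of reporting the number of genomic regions instead of individual loci.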
A scan for the targets of selection during domestication
using cross-population haplotype comparisons
To gain a better picture of the targets of selection during
dog domestication, we conducted a search for domesti-
cation regions in village dogs using XP-CLR, a statistic
developed to identify loci under selection based on pat-
terns of correlated multilocus allele frequency differ-
ences between two populations [41]. XP-CLR has several
Fig. 2 Comparison with previously published candidate domestication regions. a Venn diagram depicting counts of intersecting village dog
(current study), Axelsson et al. [5] (AX), and Cagan and Blass [8] (CB) candidate domestication regions. Note, some intersecting regions contain
multiple loci from a single study; therefore, the counts in this diagram represent the number of genomic regions, not individual loci counts.
b Genotype matrix for 130 SNPs within chr7: 24,632,211-25,033,464 in AX_14 for 99 canine samples. Sites homozygous for the reference (0/0;
blue) and alternate alleles (1/1; orange) are indicated along with heterozygous sites (0/1; white). Each column represents a single SNP, while each
row is a sample. Canid groupings are on the right of the matrix
advantages over other methods used to identify selection
signatures, as it is less biased by demographic history, by
uncertainty in recombination rates, and does not main-
tain strict window boundaries [41]. Instead, the method
considers patterns of contiguous SNPs to isolate loci
that, based on the size of the affected region, had more
rapid correlated changes in allele frequency than ex-
pected by genetic drift [41]. Since we are searching for
regions under selection in the dog genome, wolves were
set as our reference population and XP-CLR was run
on both simulated and real SNP datasets with a spa-
cing of 2 kb, and a window size of 50 kb. Average
XP-CLR values were calculated within 25 kb sliding
windows (10 kb step size) for both datasets, and we
retained 889 windows with scores greater than the
99th percentile obtained from simulations (XP-CLR =
19.78; Additional file 2: Figure S6a). Using methods
similar to those employed for the F_ST scans described
Fig. 3 Circos plot of genome-wide selection statistics. Statistics from multiple selection scans are provided across the autosomes (chromosome
identifiers are indicated in the inner circle). (A) Averaged XP-CLR scores in 25 kb windows across the genome. Windows with significant scores
(greater than 99th percentile from simulations) are in red, and those that passed filtration are in blue. Genes within significant windows are listed
above each region. (B) F_ST values calculated in 100 kb windows. Values greater than the 99th percentile of simulations are in red. Windows that
passed filtration are in green
above, windows were eliminated if village dog H_P was not
below the 0.1st simulation percentile (H_P = 0.0598) or if
the ancient dog samples carried a different haplotype
(ΔH_P filtration threshold at 5th percentile = 0.0066)
(Additional file 2: Figures S6b–d and S3c). This
resulted in 598 autosomal windows which we merged into
246 candidate loci, encompassing 10.81 Mb of genomic
sequence and within 50 kb of 429 unique genes (Fig. 3b;
Additional file 1: Table S5). Of these windows, 178 are
located within 50 kb of at least one Ensembl gene
model. No SNPs with high F_ST within these intervals had
predicted deleterious effects on coding sequence
(Additional file 1: Table S6; [42]). The vast majority of the
XP-CLR regions (204/246) were not found in previous
studies [5, 8, 29], with 4 also found in Axelsson et al. [5]
only, 33 in Freedman et al. [29] only and 5 in both
Axelsson et al. [5] and Freedman et al. [29]. No loci intersected
with the Cagan and Blass [8] findings. Thirty-four XP-CLR
regions overlap with 21 of the 58 loci we identified using
F_ST-based approaches, indicating that XP-CLR often
identifies selection signatures within narrower regions.
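The window-averaging step of the XP-CLR scan (grid scores every 2 kb, averaged in 25 kb windows with a 10 kb step, then compared against the simulation-derived threshold) can be illustrated as below. This sketch does not compute XP-CLR itself; it assumes grid scores are already available, and the toy scores and the function name are hypothetical. The threshold of 19.78 is the value reported in the text.

```python
import numpy as np

def window_means(positions, scores, window=25_000, step=10_000):
    """Average grid scores in sliding windows; returns (start, end, mean) tuples."""
    positions = np.asarray(positions)
    scores = np.asarray(scores, dtype=float)
    out = []
    start = 0
    last = int(positions.max())
    while start <= last:
        in_window = (positions >= start) & (positions < start + window)
        if in_window.any():
            out.append((start, start + window, float(scores[in_window].mean())))
        start += step
    return out

# Toy grid (hypothetical): one score every 2 kb along 100 kb.
positions = np.arange(0, 100_000, 2_000)
scores = np.ones(len(positions))
scores[(positions >= 40_000) & (positions < 60_000)] = 30.0  # elevated block

means = window_means(positions, scores)
threshold = 19.78  # 99th percentile of simulations, as reported in the text
significant = [(s, e) for s, e, m in means if m > threshold]
print(significant)
```

Windows that only partly cover the elevated block fall below the threshold after averaging, which is why a single 25 kb window is retained in this toy example.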
Gene content of 246 candidate domestication regions
We sought to identify gene sets and pathways enriched
within our candidate domestication regions. Based on
1000 randomized permutations (see the “Methods”
section), we found that the XP-CLR regions are not more
likely to localize near genes than expected (p = 0.07),
though the loci are near a greater total number of genes
than random permutations (p = 0.003; Additional file 2:
Figure S7a and b). We observed that our candidate loci
contain genes of similar average length to those in the
randomized set (p > 0.05; Additional file 2: Figure S7c).
The biological functions of numerous genes near the can-
didate domestication regions are consistent with the
neural crest hypothesis, linking this critical embryonic de-
velopment pathway to the domestication syndrome
(Table 1; [18, 20, 21]). Multiple genes are also involved in
retinoic acid signaling, neurotransmission, and RNA
splicing.
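A permutation test of gene proximity, of the general kind referred to above, can be sketched as follows. The randomization scheme here (uniform relocation of each region on a single chromosome, preserving its length) and the add-one p-value estimate are assumptions made for illustration; the actual procedure is described in the paper's “Methods” section, which is not reproduced here. All coordinates are hypothetical.

```python
import random

def near_gene(region, genes, flank=50_000):
    """True if a (start, end) region lies within `flank` bp of any gene."""
    start, end = region
    return any(g_start - flank <= end and start <= g_end + flank
               for g_start, g_end in genes)

def permutation_p_value(regions, genes, chrom_length, n_perm=1000, seed=0):
    """Estimate how often randomly placed regions are near genes at least
    as often as the observed regions (one-sided, add-one estimate)."""
    rng = random.Random(seed)
    observed = sum(near_gene(r, genes) for r in regions)
    at_least = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in regions:
            length = end - start
            new_start = rng.randrange(0, chrom_length - length)
            shuffled.append((new_start, new_start + length))
        if sum(near_gene(r, genes) for r in shuffled) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical 10 Mb chromosome with two genes and three candidate regions.
genes = [(1_000_000, 1_020_000), (5_000_000, 5_050_000)]
regions = [(1_010_000, 1_030_000), (5_040_000, 5_060_000), (8_000_000, 8_020_000)]
observed, p_value = permutation_p_value(regions, genes, chrom_length=10_000_000)
print(observed, p_value)
```

Fixing the seed makes the sketch reproducible; a genome-wide version would draw positions across all autosomes and avoid assembly gaps.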
Candidate genes influencing retinoic acid signaling
Retinoic acid (RA) is a signaling molecule that has nu-
merous critical roles in development at the embryonic
level, continuing into adult stages with roles such as
maintaining stem cell proliferation, tissue regeneration,
and regulation of circadian rhythm [43,44]. The highest
scoring XP-CLR locus centers upon RAI1 (retinoic
acid-induced 1; XP 52; Fig. 4), a gene that has not been
identified in previous domestication scans. RAI1 has
numerous developmental functions in the RA pathway, and
mutations in this gene are responsible for Smith-Magenis
and Potocki-Lupski syndromes in humans [45,46]. Other
genes with related functions include NR2C1 (XP 143),
essential for the development of early retina cells through
regulation of early transcription factors that govern retinal
progenitor cells such as RA receptors [47] and calreticulin,
a protein involved in inhibition of both androgen and RA
transcriptional activities [47,48]. Ncor2 (XP 209) increases
cell sensitivity to RA when knocked out in mice [49], and
CYP1B1 (XP 152) is a pathway component that can direct
embryonic patterning by RA [50].
Candidate genes regulating brain development and
behavior
Twelve XP-CLR candidate genes related to neurotrans-
mitter function include the serotonin transporter
SLC6A4 (XP 101) and dopamine signaling members
GNAQ (XP 16) and ADCY6 (XP 215). Genes associated
with glutamate, the excitatory neurotransmitter, include
DGKI (ranked 6th by XP-CLR; XP 145), which regulates
presynaptic release in glutamate receptors [51], and
GRIK3 (XP 141), a glutamate receptor [52]. Other genes
include UNC13B, which is essential for competence of
glutamatergic synaptic vesicles [53], and CACNA1A (XP
176), which influences glutamatergic synaptic transmission [54].
In contrast to glutamate, GABA is the nervous system’s
inhibitory neurotransmitter and has been linked to the
response to and memory of fear [55,56]. Genes in our
XP-CLR loci relating to GABA include one of the two
mammalian GABA biosynthetic enzymes GAD2 (or
GAD65; ranked 20th), the GABA receptor GABRA4,
auxiliary subunit of GABA-B receptors KCTD12 [57],
and the GABA inhibitor osteocalcin (or BGLAP; [58]).
Lastly, TLX3 (XP 48) is a key switch between gluta-
matergic and GABAergic cell fates [59].
Candidate genes related to RNA splicing
We also observe numerous candidate genes involved in
splicing of transcripts by both the major and minor spli-
cing pathways. The eighth highest XP-CLR region (XP 57)
harbors the gene RNPC3, the 65 kDa subunit of the U12
minor spliceosome, which is located ~ 55 kb downstream
of pancreatic amylase AMY2B (Fig. 5). Another core sub-
unit, SF3B1, belongs to both the minor and major (U2)
spliceosome. Additional XP-CLR genes related to splicing
and/or spliceosome function include FRG1 [60], DDX23
(alias PRP28; [61]), CELF1 [62], NSRP1 (alias NSrp70; [63,
64]), and SRSF11 (alias P54; [65]).
Survey of copy number variation between dogs and
wolves
Copy number variants have also been associated with
population-specific selection and domestication in a num-
ber of species [5,66,67]. Since regions showing extensive
copy number variation may not be uniquely localized in
the genome reference and may have a deficit of SNPs
passing our coverage thresholds, we directly estimated
Table 1 XP-CLR CDR genes with evidenced or putative roles in nervous system and neural crest pathways
Gene | XP-CLR locus (rank) | System | Phenotypes/effects
RAI1 | XP 52 (1st) | Xenopus | Mutants display craniofacial defects, improper migration of neural crest cells, decrease in facial cartilage components, axonal defects, and altered forebrain ventricle sizes [119].
NKAIN2 (TCBA1) | XP 9 (4th) | Human | Neurocristopathy-like phenotypes observed in patients with translocation breakpoint in NKAIN2 such as hair hypopigmentation, craniofacial and limb malformation, misdevelopment of eyes, and macrocephaly [172].
RNPC3 | XP 57 (8th) | Human | Mutations linked to isolated growth hormone deficiency and pituitary hypoplasia [128].
NPR2 | XP 127 (14th) | Human, mouse | Mutants exhibit dwarfism and impacted skeletal growth during embryogenesis [173, 174].
NPHP3 | XP 197 (24th) | Mouse, Xenopus | Left/right asymmetry, shortened body axes, and neural folds fail to close in mutants. Interacts with non-canonical Wnt pathway [84].
LIMCH1 | XP 135 (30th) | Human | Significantly altered methylation patterns in Chinese Han pedigrees exhibiting neural tube defects [175]. LIMCH1 depletion increased cell migration by spatiotemporally regulating non-muscular myosin II activity [176].
CCDC65 | XP 215 (45th) | Zebrafish | Critical for cilia and dynein function. Knockdowns cause left-right asymmetry and axis curvature embryos [177].
DAND5 (cerberus-like) | XP 177 (51st) | Mouse | Prevents signaling of the Nodal pathway on the right side of the developing mouse embryo, establishing left/right asymmetry during early somitogenesis [178].
GBF1 | XP 220 (67th) | Fly | Expressed in embryogenesis, contributes to cell polarity in tubular organs and chemotaxis of neutrophils [179, 180].
GDPD5 | XP 181 (102nd) | Zebrafish | Regulator of the notch signaling pathway, essential for neural crest pathway, linked to body axis determination [181], is induced by retinoids and drives motor neuron differentiation [182].
HAUS3 | XP 38 (111th) | Zebrafish | Essential regulator of embryonic hematopoietic stem/progenitor cell maintenance and cell cycle progression [183].
PAX9 | XP 80 (114th) | Mouse | Mutants displayed improper craniofacial development, lacked organs deriving from pharyngeal pouches, no teeth [184].
DIAPH1 | XP 21 (117th) | Human | Expressed in neural progenitors, linked to microcephaly in humans [185], impacts migration of glioma cells [186].
TCF4 | XP 3 (127th) | Mouse | Myelinates oligodendrocytes, antagonizes the Wnt signaling pathway, and interacts with SOX10 (a known neural crest gene [187]) to promote oligodendrocytic maturation gene expression [188].
TSPAN14 | XP 46 (129th) | Human | Promotes the activity of notch receptors and the expression of ADAM10 [189], both players in the neural crest signaling pathway [190].
SATB2 | XP 244 (131st) | Mouse | Mutants exhibit craniofacial abnormalities (e.g., cleft palate, dental misgrowth) and disrupted osteoblast differentiation [99].
FOXI1 | XP 49 (136th) | Zebrafish | Regulates inner ear and jaw development in embryogenesis, and hypothesized to influence neural crest cell migration and/or separation in the brachial arches [191].
PRKCAB | XP 61 (138th) | Mouse | Mutations yield improper development of the neural tube and spina bifida in mice, asymmetric expansion of hedgehog signaling in the neural tube, impact neuronal cell survival [192].
GNAQ | XP 16 (141st) | Mouse | Mutants exhibit heart malformations and shortened jaws [193].
Tlx3 | XP 48 (146th) | Mouse | Dorsal spinal cord development, specification of glutamatergic neurons [194], and is a target of Wnt signaling pathway [59].
SEMA4A | XP 70 (150th) | Xenopus | Expressed in neurogenic placodes in the developing neural tube, which along with neural crest cells, migrate to final cell locations [195].
TIAM1 | XP 225 (154th) | Mouse | With PAR3, gene is responsible for determination of front-rear and apical-basal polarity in migratory keratinocyte cells [196].
PITX1 | XP 124 (157th) | Mouse, Anolis | Transcription factor whose binding sites are near key neural crest signaling members (Wnt, Hedgehog, BMP) [197]. Mutants have improper hind limb development and patterning as well as craniofacial abnormalities [100].
copy number along the reference assembly and searched
for regions of extreme copy number differences (see the
“Methods” section). Using V_ST, a statistic analogous to F_ST
[66], we identified 67 regions of extreme copy number
difference between village dogs and wolves which are within
50 kb of 89 unique genes (Additional file 1: Table S7).
There was no overlap of these copy number outliers with
regions identified through F_ST or XP-CLR. Relative to
randomly permuted intervals, the 67 V_ST outliers are more
likely to be near genes (p < 0.01; Additional file 2: Figure
S8a) but do not encompass more total genes than
expected (p > 0.05; Additional file 2: Figure S8b).
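For readers unfamiliar with V_ST, the statistic is usually defined as (V_T − V_S) / V_T, where V_T is the variance in copy number across all samples and V_S is the within-population variance averaged with weights proportional to sample size. The sketch below implements that standard definition; it is assumed, rather than confirmed by this section, that the paper uses the same form, and the copy number values are hypothetical.

```python
import numpy as np

def vst(cn_pop1, cn_pop2):
    """V_ST = (V_T - V_S) / V_T for estimated copy numbers in two populations.

    V_T: variance across all samples combined.
    V_S: size-weighted average of the within-population variances.
    """
    a = np.asarray(cn_pop1, dtype=float)
    b = np.asarray(cn_pop2, dtype=float)
    v_total = np.var(np.concatenate([a, b]))
    if v_total == 0:
        return 0.0  # no copy number variation at this locus
    v_within = (np.var(a) * len(a) + np.var(b) * len(b)) / (len(a) + len(b))
    return float((v_total - v_within) / v_total)

# Hypothetical copy number estimates at one locus.
dogs = [8, 10, 12, 9, 11]
wolves = [2, 2, 2, 2, 2]
print(vst(dogs, wolves))
```

Values near 1 indicate that nearly all copy number variance lies between the two groups, while values near 0 indicate similar copy number distributions.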
The top locus identified through V_ST analysis
encompasses the AMY2B gene, which at increased copy number
confers greater starch metabolism efficiency due to higher
pancreatic amylase enzyme levels [5,37]. Quantitative
PCR results have suggested an ancient origin for the
AMY2B copy number expansion, as 7-ky-old Romanian
dogs exhibit elevated AMY2B copy number [38]. However,
read-depth analysis shows that the AMY2B tandem
expansion is absent in 5–7-ky-old ancient European dogs
[34]. We identified two large duplications, one of 1.9 Mb
and the other of 2.0 Mb, that encompass AMY2B
(Additional file 2: Figure S9). We quantified copy number
Table 1 XP-CLR CDR genes with evidenced or putative roles in nervous system and neural crest pathways (Continued)
Gene | XP-CLR locus (rank) | System | Phenotypes/effects
FKBP8 | XP 175 (167th) | Mouse | Critical for development of the neural tube, establishes dorso-ventral patterning, and prevents apoptosis in embryonic cells in the neural tube [198].
AMBRA1 | XP 161 (169th) | Mouse | Mutants show disrupted embryonic development, neural tube defects, cell cycle perturbations (unbalanced proliferation and high apoptosis) [199].
SCUBE1 | XP 115 (202nd) | Mouse | Required for proper development of the central nervous system, neural tube, brain regions, and the cranial vault formation [200].
CYP1B1 | XP 152 (229th) | Zebrafish | Mutants showed disrupted neural crest migration [201] and is associated with retinoic acid synthesis during the patterning of the developing embryo [50].
Genes within XP-CLR candidate domestication regions (with rank) that have experimental or clinical evidence that illustrate roles in early embryonic pathways,
especially in the developing central nervous system and components of the neural crest and its signaling pathways
Fig. 4 Selection scan statistics at the RAI1 locus. Selection scan statistics surrounding the retinoic acid-induced 1 (RAI1) locus (chr5: ~ 41.6-41.2
Mb). a Per site F_ST scores for all SNPs are indicated along with the F_ST significance threshold determined by the 99th percentile of simulations
(red dashed line). b Bars represent raw XP-CLR grid scores. Circles indicate the mean XP-CLR score calculated from averaging grid scores within
25 kb windows and are positioned within the center point window. Red bars and circles indicate that the score is significant (above the 99th
percentile significance threshold determined through simulations). The black line indicates the average pooled heterozygosity (H_P) values for the
same window boundaries. c The significant XP-CLR locus (gray box) is presented relative to Ensembl gene models (black). Direction of each gene
is indicated with blue arrows
at AMY2B itself and regions which discriminate the two
segmental duplications in 90 dogs using digital droplet
PCR (ddPCR). Copy number estimated through read
depth strongly correlated with estimates from ddPCR
(Additional file 2: Figure S10), confirming the presence of
standing copy number variation of AMY2B in dogs (range
of 2n_AMY2B = 2-18) and distinguishing the two large-scale
duplications (Additional file 2: Figure S11). The extreme
AMY2B copy number expansion appears to be independent
of the large-scale duplications, as ddPCR results show
that some dogs without the large duplications still have
very high AMY2B copy number. Read-depth patterns at
the duplication breakpoints indicated that NGD, the ancient
Irish dog, harbored the 2.0 Mb duplication resulting
in increased AMY2B copy number.
Gene ontology enrichment analysis
We performed enrichment tests using the parent-child
model [68] in the topGO R package [69] with the intersecting
429 unique genes as the test set. To control for biasing
factors such as gene size, function, and colocalization, we
calculated permutation-based p values (p_perm) for each GO
term by comparing the observed parent-child significance
score for each GO term with the distribution obtained by
applying the parent-child test to gene sets identified by 1000
randomly permuted genome intervals (see the "Methods"
section). We identified 636 enriched GO terms (p_perm < 0.05)
including 327 GO terms represented by more than one gene
and more than one XP-CLR locus (Additional file 1:
Table S8). The set supported by multiple loci includes
several categories related to the process noted above
including the regulation of retinoic acid receptors
(p_perm = 0.028), retinol metabolism (p_perm = 0.014), the
secretion (p_perm = 0.01), transport (p_perm = 0.01), and
signaling of GABA (p_perm = 0.03), dopamine receptor
signaling (p_perm = 0.04), and cell maturation (p_perm = 0.012).
Similar enrichment results were also observed using
EMBL-EBI ontology annotations (see the "Methods"
section; Additional file 1: Table S9). Seventy-one enriched
(p_perm < 0.05) categories were identified using the same
methods for the 89 genes intersecting the V_ST (copy number)
candidate loci (Additional file 1: Table S10). However,
these enrichments were largely driven by a handful of
genes with broad biological functions. No enrichments for
either XP-CLR or copy number results remain statistically
significant if one corrects for the 19,408 tests representing
all of the possible GO terms in our gene set, although
there are limitations to the application of multiple testing
corrections to correlated GO terms.
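The permutation-based p value described above can be sketched as follows, using the formula given in the Methods, p_perm = (X_perm + 1)/(N + 1), where X_perm counts the permutations whose parent-child p value is less than or equal to the observed one. The function names and the toy inputs are hypothetical; the Bonferroni-style threshold is shown only as one possible way to express a correction for 19,408 tests and is not necessarily the correction the authors had in mind.

```python
def permutation_p_value(observed_p, permuted_ps):
    """p_perm = (X_perm + 1) / (N + 1) for one GO term.

    observed_p:  parent-child p value from the real gene set.
    permuted_ps: parent-child p values from N permuted gene sets.
    """
    x_perm = sum(1 for p in permuted_ps if p <= observed_p)
    return (x_perm + 1) / (len(permuted_ps) + 1)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Toy example: 10 of 1000 permutations score as well as the observed set.
toy_permutations = [0.5] * 990 + [0.005] * 10
print(permutation_p_value(0.01, toy_permutations))

# Smallest p_perm attainable with N = 1000 permutations.
print(permutation_p_value(0.0, [0.5] * 1000))
print(bonferroni_threshold(0.05, 19408))
```

With N = 1000 the smallest attainable p_perm is 1/1001, which is larger than 0.05/19,408, so under this sketch no permutation-based p value could pass a Bonferroni threshold for 19,408 tests regardless of how strong the enrichment is.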
Discussion
Genetic and archaeological data indicate that the dog
was first domesticated from Eurasian gray wolves well
over 10 kya [23,27,34,40]. Evidence suggests that the
domestication process was complex and may have
spanned thousands of years [3,23]. Through multiple
analyses, we have identified regions that are strongly dif-
ferentiated between modern village dogs and wolves and
which may represent targets of selection during domesti-
cation. Our approach differs from previous studies in
several ways including the use of village dogs rather than
breed dogs, using neutral simulations to set statistical
Fig. 5 Selection scan statistics at the RNPC3 locus. Selection scan statistics surrounding the RNA-binding region (RNP1, RRM) containing 3 (RNPC3)
locus (chr5: ~ 46.9-47.3 Mb). a-c as in Fig. 4
cut-offs, and filtering candidate loci based on ancient
dog DNA data. Most (83%) of the 246 candidate domes-
tication regions we identified are novel to our study,
which we largely attribute to reduced signals associated
with post-domestication breed formation. We argue that
swept haplotypes identified in modern village dogs and
also present in Neolithic dogs more likely represent sig-
nals of ancient selection events. Although the 43 village
dogs sampled here do not represent the full spectrum of
genetic diversity of modern dogs, these samples largely
reflect the diversity found in an extensive panel of canids
sampled by SNP array and represent populations esti-
mated to have split over 15 kya (European vs Asian)
[34]. We expect true targets of selection associated with
domestication to be found across all dogs. Signals re-
stricted to breed dogs, although unlikely to reflect select-
ive pressures during domestication, identify genes and
pathways important for understanding the genetic basis
of modern dog biology and disease. Deeper sampling of
village dog diversity may reveal that the CDRs we identi-
fied are unique to the studied samples, perhaps as a po-
tential result of geographically restricted selection. As
more village dogs are sequenced, it is likely that these
candidate domestication regions will be refined and
narrowed.
While the use of neutral simulations accounts for
genetic diversity in both wild and domestic sampled
populations, and better controls false positive rates
than arbitrary empirical thresholds [29,70], several
limitations are still apparent in our approach. The
demographic model we used does not capture all as-
pects of dog history, does not include the X chromo-
some, and does not fit all aspects of the observed
data equally well. This likely represents unaccounted
for features of the data, such as unmodeled popula-
tion structure, as well as technical issues such as re-
duced ascertainment of low frequency alleles due to
sequencing depth. Although previous studies have
identified detectable jackal admixture ranging from 1
to 2% in the ancestral dog population, we did not in-
clude the jackal in our demographic model. Since this
gene flow occurred in the ancestral lineage of both
modern dogs and wolves (> 20 kya) [32,34,40], the
jackal ancestry is expected to be similarly represented
in all of our samples. This assumption may not hold
if the ancestral population had a high degree of popu-
lation structure, but suitable data to model such com-
plexities is not available.
Although the inclusion of ancient samples allows for
the removal of candidate domestication regions that are
unique to modern dogs, this approach is limited by the
narrow temporal (5-7 kya) and geographic (restricted to
Europe) sampling offered by the available data. Even
though most selected alleles likely preexisted in the
ancestral wolf population, our approach identifies re-
gions where modern village dogs share the same haplo-
type. However, even when selection acts on preexisting
mutation, a single haplotype often reaches fixation [71],
consistent with the variation patterns we identify across
village dog populations. As the number of ancient dogs
with genome data increases, it will become possible to
apply sophisticated tests that make direct use of ancient
genomes to discover sites of selection [72,73].
Our gene annotations were obtained directly through
established BLAST2GO pipelines [74]. Similar results,
although with fewer gene-function links, were obtained
when using the Ensembl Release 92 of the EMBL-EBI
GO gene annotations (Additional file 1: Table S10). After
correcting for a total of 19,408 possible tests, none of
our enrichments would be significant, even if the raw
parent-child p values were used. However, several factors
complicate these gene set enrichment tests. First, the na-
ture of the GO ontology relationships introduces
non-independence among related GO terms and genes,
a problem partially ameliorated by the parent-child
model [68]. Second, the underlying statistical tests as-
sume that every gene is equally likely to be a member of
the test set under the null hypothesis, an assumption
that may be reasonable for studies of gene expression.
Our permutation strategy attempts to control for the
non-random correlation between gene size, colocaliza-
tion, and gene function. However, since no GO term
survives a global multiple testing correction, these en-
richments must be viewed as tentative.
The role of the neural crest in dog domestication
Our XP-CLR candidate domestication regions include
52 genes that were also identified in analyses of other
domesticated or self-domesticated animals [9,11,17,
75-79], including four genes (RNPC3, CUEDC1, GBA2,
NPR2) in our top 20 XP-CLR loci. No gene was found in
more than three species, consistent with the hypothesis
that no single domestication gene exists [19]. Although
the overlap of specific genes across species is modest,
there are many enriched gene pathways and ontologies
shared in domesticates including neurological and ner-
vous system development, behavior, reproduction, me-
tabolism, and pigmentation [10,11,17,73,75,80]. We
attribute these patterns to the domestication syndrome,
a phenomenon where diverse traits, manifested in vastly
different anatomical zones, appear seemingly discon-
nected, yet are maintained across domesticates. Two
possible modes of action could generate the domestica-
tion syndrome phenotypes while still displaying the
genome-wide distribution of sweeps. The first would re-
quire independent selection events for distinct traits at
numerous loci. Alternatively, selection could have acted
on considerably fewer genes that are members of
early-acting developmental pathways with broad pheno-
typic effects.
For these reasons, the role of the neural crest in ani-
mal domestication has gained support from researchers
over recent years [18,20,21] (Table 1). In 2014, Wilkins
et al. [18] established that the vast array of phenotypes
displayed in the animal domestication syndrome mirror
those exhibited in mild human neurocristopathies,
whose pathology stems from aberrant differentiation,
division, survival, and altered migration of neural crest
cells (NCCs). These cells are multipotent, transient, em-
bryonic stem cells that are initially located at the crest
(or dorsal border) of the neural tube. The initiation and
regulation of neural crest development is a multi-stage
process requiring the actions of many early-expressed
genes including the fibroblast growth factor (Fgf), bone
morphogenic protein (Bmp), wingless (Wnt), and Zic
gene families [81]. Several of the genes identified in our
XP-CLR analysis are involved in this transition including
members of the Fgf (Fgf1) family as well as a transcription
factor (TCF4; [82]), inhibitors (RRM2; NPHP3; [83,84]),
and regulators (LGR5; [85]) of the Wnt signaling
pathways.
Following induction, NCCs migrate along defined
pathways to various sites in the developing embryo.
Assignment of identity and the determination of migra-
tion routes rely on positional information provided by
external signaling cues [86,87]. KCTD12, CLIC4, PAK1,
NCOR2, DOCK2, and EXOC7 are all examples of such
genes found in our candidate loci that are linked to the
determination of symmetry, polarity, and/or axis
specification [88-92]. Together, our results suggest that early
selection may have acted on genes essential to the initi-
ation of the neural crest and the definition of migration
routes for NCCs.
NCC-derived tissues linked to domestication syndrome
phenotypes
Once in their final destinations, NCCs further differentiate
into the precursors of many tissues in the developing
embryo. Most of the head, for example, arises
from NCCs including craniofacial bones, cartilage,
and teeth [93,94]. Ancient dog remains indicate that
body size, snout lengths, and cranial proportions of
dogs considerably decreased compared to the wolf an-
cestral state following early domestication [95]. Fur-
ther, these remains indicate jaw size reduction also
occurred, as evidenced by tooth crowding [95]. Such
alterations are consistent with the domestication syn-
drome and implicate aberrant NCC migration since
decreases in the number of NCCs in facial primordia
are directly correlated with reductions in mid-face
and jaw sizes [18,96]. Genes associated with both
craniofacial and tooth development in vertebrates are
found in our candidate loci including SCUBE1 (XP
115), which is essential in craniofacial development of
mice, and SATB2 (XP 244), which has roles in patterning
of the developing branchial arches, palate fusion,
and regulation of HOXa2 in the developing
neural crest [97-99]. Lastly, mice lacking the
Bicoid-related homeodomain factor PITX1 (XP 124)
not only show impaired hindlimb growth but also
display craniofacial abnormalities such as cleft palate
and branchial arch defects [100]; PITX1 also influences
vertebrate tooth development [101].
Insufficient cartilage, a NCC-derived tissue [94] that
consists of chondrocytes and collagen, in the outer ear
of humans results in a drooping ear phenotype linked to
numerous NC-associated neurocristopathies (e.g., Trea-
cher Collins and Mowat-Wilson) [102]. Analogously,
compared to the pricked ears of wolves, dogs predominantly
have "floppy" ears [103], a hallmark feature of
domesticates [18]. Ablation of SERPINH1 (XP 181), a
collagen-binding protein found in our list of CDRs, is
embryonically lethal in mice [104], and the protein
appears to be required for chondrocyte maturation [105].
Alterations of activity by genes such as SERPINH1 and
those regulating NCC migration may have reduced the
numbers of NCCs in dog ears, contributing to the floppy
phenotype [18].
Genes associated with neurological signaling, circadian
rhythms, and behavior
Tameness or reduced fear toward humans was likely the
earliest trait selected for by humans during domestica-
tion [3,106,107]. Recapitulating such selection, numer-
ous physiological and morphological characteristics,
including domestication syndrome phenotypes (i.e.,
floppy ears, altered craniofacial proportions, and unsea-
sonal timing for mating), appeared within 20 generations
when researchers selected only for tameness in a silver
fox breeding population [1,108]. Because NCCs are the
progenitors of the adrenal medulla, which produces hormones
associated with the "fight-or-flight" response, hypofunction of
NCCs can lead to changes in the tameness of animals
[18]. The link between tameness and the NC suggests
that changes in neural crest development could have
arisen first, either through direct selection by humans
for desired behaviors or via the self-domestication
[109,110] of wolves that were more docile around
humans. Genes contributing to neurological function
and behavioral responses were observed in our XP-CLR
candidate loci, suggesting these genes may influence
chemical and morphological differences associated with
tameness. Numerous candidate loci contain genes influ-
encing neurological function and behavioral responses
including genes in the dopamine, serotonin, glutamate,
and GABA neurotransmission pathways, as well as genes
contributing to the connectivity and development of
synapses and dendrites.
In addition to changes in behavior, alterations in sleep
patterns would also likely have occurred early in the do-
mestication process due to the shift from the ancestral noc-
turnal state of wolves, to that of the diurnal lifestyle also
exhibited by humans. Consistent with this, levels of circadian
rhythm determinants (e.g., melatonin and serotonin) were
significantly altered in domesticated silver foxes selected for
tameness compared to wild foxes [111-113]. We
hypothesize that genes under early selection for behavior
have additional functions in the establishment of circadian
rhythms, and that both can be explained by impaired
NC function. The Smith-Magenis syndrome is caused by
disrupted function of RAI1 [114], the gene with the highest
XP-CLR score in our study. Humans with Smith-Magenis
syndrome display increased aggression and altered circa-
dian rhythms, as well as craniofacial and skeletal deforma-
tions, developmental delays, and intellectual disabilities
[115]. Similarly, Williams-Beuren syndrome, another
neurodevelopmental disorder, affects sleep patterns and
contributes to hypersociability in humans [116]. A recent
study in canines linked behavioral changes in breed dogs to
structural variants near WBSCR17, a Williams-Beuren
syndrome gene [117]. Both syndromes display multiple
features associated with improper NCC development,
resembling phenotypes of neurocristopathies [115,118].
For example, disruption of the transcription factors RAI1
and WSTF (the latter also disrupted in Williams-Beuren
syndrome) in Xenopus negatively impacts NCC migration,
recapitulating the human craniofacial defects associated with
the syndromes [119,120]. RAI1 also regulates circadian
rhythms [121-124], a pathway within which other genes in
XP-CLR candidate loci also exhibit possible (RNPC3; [125,126])
and experimentally verified (FBXL3; [127]) roles.
Altogether, the top scoring locus, as well as others, indicates
overlap of gene functions in influencing behavior and circadian
rhythms, and these genes were likely early genetic components of
the domestication syndrome.
Misregulation of gene expression may contribute to
domestication syndrome phenotypes
Similar to other domestication scans [6,9,19], we did
not find SNPs deleteriously altering protein sequence in
our predicted sweeps, indicating that gene loss did not
have a significant role in dog domestication. Instead, we
hypothesize that alterations in gene regulatory pathways
or the regulation of transcriptional activity could con-
tribute to broad domestication syndrome phenotypes.
Our gene list includes two components of the minor
spliceosome, RNPC3 and Sf3b1. RNPC3, which affects
early development and is linked to dwarfism (isolated
growth hormone deficiency; [128]), is also under selec-
tion in cats and humans [17,77]. Absence of Sf3b1
disrupts proper NCC specification, survival, and migration
[129]. A further example of the role of splicing in
NC development is that mutations in U4atac, a U12
snRNA subunit gene missing in the current dog annotation,
cause Taybi-Linder syndrome (TALS) in humans.
Phenotypes of this syndrome resemble those of the
domestication syndrome including craniofacial, brain,
and skeletal abnormalities [130]. Thus, proper splicing,
particularly for transcripts processed by the minor spli-
ceosome, is required for proper NC function and
development.
Copy number variation was likely not a major driver during
dog domestication
Our scan for differentiated copy number states identi-
fied few regions that differentiate village dogs and
wolves. A previous study found that dogs and wolves
have a similar proportion of CNV loci [131]. This
suggests that copy number expansion or contraction
may not have made significant contributions to the
phenotypic changes associated with domestication.
The quantification of wolf copy number using a dog
genome reference limits the accuracy of the estimates
and prevents detection of wolf-specific insertions.
Therefore, reassessment of population-specific copy
number changes would be improved by the use of a
wolf genome reference [132]. Of note, the top hit
from the copy number selection scan corresponded to
AMY2B, a gene linked to increased efficiency of
starch digestion in dogs [5,36,37]. Previous studies
have concluded that the increase in AMY2B copy
number occurred post-domestication, since the timing
of domestication (> 10 kya) predates the introduction
of starch-rich diets in both humans and dogs [32,34,
36]. However, this study utilizes previously imple-
mented copy number estimation techniques [34,36]
to identify two independent large-scale duplications
(1.9 and 2.0 Mb) that are at least the age of the old-
est sampled dog genome (7 ky old). Significant selec-
tion signatures from XP-CLR are distal to AMY2B,
instead centered on RNPC3 (discussed above) which
also lies within the boundaries of both large duplications.
Since these large duplications are not fixed in
dogs, yet the RNPC3 selected haplotypes are, we
speculate that the initial target of selection may have
been on RNPC3 which could have global effects on
expression and phenotype (body size).
Conclusions
By comparing village dogs and wolves, we identified 246
candidate domestication regions in the dog genome.
Analysis of gene function in these regions suggests that
perturbation of crucial neural crest signaling pathways
could result in the broad phenotypes associated with the
domestication syndrome. Additionally, these findings
suggest that transcriptional regulation and splicing are
linked to alterations in cell differentiation, migration, and
neural crest development. Altogether, we conclude that
while primary selection during domestication likely tar-
geted tameness, genes that contribute to determination
of this behavioral change are also involved in critical,
far-reaching pathways that conferred drastic phenotypic
changes in dogs relative to their wild counterparts.
Methods
Sample processing and population structure analysis
The primary selection scans in this paper are based on
43 village dog and 10 gray wolf samples selected from a
larger sample set as described below. Additional analysis
of candidate genomic regions is based on genotype data
from two ancient European samples. For visualization
purposes, Fig. 1 also includes genotype data from a larger
collection of breed dogs and wild canid outgroups.
Canid genomes (Additional file 1: Table S1) were proc-
essed using the pipeline outlined in [34] to produce a
data set of single nucleotide polymorphisms (SNPs)
using GATK [133]. From this larger sample set, 37 breed
dogs, 45 village dogs, and 12 wolves were selected from
the samples described in [34], and ADMIXTURE [39]
was utilized to estimate the levels of wolf-dog admixture
within this subset. This sample set includes three New
Guinea Singing Dogs sequenced as described in [134].
To account for LD, the data was thinned with PLINK
v1.07 (--indep-pairwise 50 10 0.1; [135]), where SNPs
with an R^2 value over 0.1 were removed in 50 kb windows,
sliding 10 sites at a time. The remaining 1,030,234
SNPs were used in five independent ADMIXTURE runs
using different seeds, for up to five ancestral populations
(K = 1-5). K = 3 had the lowest average cross validation
error (0.0373) from the five runs and was therefore the
best fit for the data (Additional file 2: Figure S12). To
eliminate noise in subsequent analyses, we removed all
village dogs with greater than 5% wolf ancestry and
wolves with greater than 5% dog ancestry. Fifty-four
samples remained following this filtration.
Following elimination of admixed samples, we called SNPs
in 43 village dogs and 11 gray wolves (Additional file 1: Table
S1) using GATK (v. 3.4-46; [133]). Using the GATK VQSR
procedure, we identified a high quality variant set such that
99% of positions on the Illumina canine HD array were
retained. VQSR filtration was performed separately for the
autosomes + chrX pseudoautosomal region (PAR) and the
non-PAR region. SNPs within 5 bp of an indel identified by
GATK were also removed. We further excluded sites with
missing genotype calls in any sample, triallelic sites, and
X-nonPAR positions where any male sample was called as
heterozygous. The final SNP set contained 7,657,272 sites.
Using these SNPs, we removed samples that exhibited
over 30% relatedness following identity by state (IBS)
analysis with PLINK v1.90 (--min 0.05; [135]). Only one
sample (mxb) was removed from the sample set, a sam-
ple known to be related to another Mexican wolf in the
dataset. Principal component analyses were completed
on the remaining 53 samples (43 dogs and 10 wolves)
using smartpca, a component of Eigensoft package ver-
sion 3.0 [136] after randomly thinning the total SNP set
to 500,000 sites using PLINK v.1.90 [135]. Once PCA
confirmed clear genetic distinctions between these dogs
and wolves, this final sample set was used for subse-
quent analyses. For visualization of the final sample set
used in selection scans, a further ADMIXTURE plot was
generated for this filtered set of 53 samples (Fig. 1b).
The SNP set was further filtered for the selection scans
to remove rare alleles (minor allele frequencies < 3 out
of possible 106 alleles or 0.028). Finally, village dog and
wolf allele frequencies were calculated separately using
VCFtools [137].
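The rare-allele filter described above amounts to requiring at least 3 copies of the minor allele among the 106 alleles carried by 53 diploid samples (3/106 is approximately 0.028). A minimal sketch of that check is given below; the function name and its arguments are hypothetical, and the sketch assumes autosomal sites where every sample contributes two alleles.

```python
def passes_rare_allele_filter(alt_allele_count, n_samples=53, min_minor_count=3):
    """Keep a biallelic site only if the minor allele count is high enough.

    alt_allele_count: copies of the non-reference allele across all samples.
    n_samples:        number of diploid samples (53 in the text above).
    min_minor_count:  minimum number of minor allele copies (3 in the text).
    """
    total_alleles = 2 * n_samples
    minor_count = min(alt_allele_count, total_alleles - alt_allele_count)
    return minor_count >= min_minor_count


# The equivalent frequency threshold quoted in the text.
print(round(3 / 106, 3))

# A site with 2 minor allele copies is dropped; one with 3 copies is kept.
print(passes_rare_allele_filter(2), passes_rare_allele_filter(3))
```

Working with allele counts rather than frequencies avoids rounding at the threshold, which is why the sketch compares integers.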
Demographic model and simulations
Simulations of dog and wolf demographic history were
performed using msprime v.0.4.0 [138]. For each auto-
some, 75 independent simulations were performed using
independent random seeds and a pedigree-based genetic
map [139]. A mutation rate of 4 × 10^-9 per site per
generation with a generation time of 3 years was assumed.
The 53 samples were modeled as coming from 10 line-
ages with population histories adapted from [34,40]
(Additional file 1: Table S3; Additional file 2: Figure S2).
The simulation is designed to capture key aspects
impacting dog and wolf diversity, rather than a definitive
depiction of their demography. Resulting simulated SNP
sets were filtered for minor allele frequency and ran-
domly thinned to have the same number of SNPs per
chromosome as the real SNP datasets used in F_ST,
XP-CLR, and H_P calculations.
F_ST selection scans
Dog and wolf allele counts generated above were used to
calculate the fixation index (F_ST) using the Hudson
estimator derived in [140] with the following formula:

$$F_{ST} = \frac{(p_1 - p_2)^2 - \frac{p_1(1 - p_1)}{n_1 - 1} - \frac{p_2(1 - p_2)}{n_2 - 1}}{p_1(1 - p_2) + p_2(1 - p_1)}$$

where p_x is the allele frequency in population x, and n_x
is the number of individuals in population x, with village
dogs and wolves treated as separate populations. With this
equation, the X chromosome could be included in F_ST
calculations. A custom script [141] calculated the per site
F_ST across the genome for both the real and 75 simulated SNP
sets. Due to differences in effective population size and
corresponding expected levels of genetic drift, analyses
were performed separately for the chromosome X
non-pseudoautosomal region (non-PAR). Ratio of averages
for the resulting F_ST values were calculated in 200 kb
sliding windows with 50 kb step sizes, and we required
each window to contain at least 10 SNPs. Additionally,
we calculated per site F_ST for each SNP that did not
have missing data in any sample.
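The per site and windowed calculations described above can be sketched as follows. The sketch assumes the standard form of the Hudson estimator from [140], with a squared allele frequency difference in the numerator, and computes the window value as a ratio of averages (summed numerators over summed denominators). It is not the custom script [141]; the function names and the toy inputs are hypothetical, and n1 and n2 follow the text's definition of n_x.

```python
def hudson_terms(p1, p2, n1, n2):
    """Numerator and denominator of the Hudson F_ST estimator at one SNP."""
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1 - p1) / (n1 - 1)
        - p2 * (1 - p2) / (n2 - 1)
    )
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    return numerator, denominator


def per_site_fst(p1, p2, n1, n2):
    """Per site F_ST; undefined (NaN) when the site is fixed for one allele."""
    numerator, denominator = hudson_terms(p1, p2, n1, n2)
    return numerator / denominator if denominator > 0 else float("nan")


def window_fst(site_frequencies, n1, n2, min_snps=10):
    """Ratio-of-averages F_ST for one window.

    site_frequencies: list of (p1, p2) pairs for the SNPs in the window.
    Returns None when the window holds fewer than min_snps SNPs.
    """
    if len(site_frequencies) < min_snps:
        return None
    terms = [hudson_terms(p1, p2, n1, n2) for p1, p2 in site_frequencies]
    total_num = sum(t[0] for t in terms)
    total_den = sum(t[1] for t in terms)
    return total_num / total_den if total_den > 0 else None


# A SNP fixed for different alleles in 43 dogs and 10 wolves.
print(per_site_fst(1.0, 0.0, 43, 10))
```

Summing numerators and denominators before dividing keeps low-diversity SNPs from dominating the window estimate, which is the usual motivation for a ratio of averages over an average of ratios.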
F_ST loci filtration was completed differently for the
outlier and non-outlier approach. For the outlier F_ST
approach, the windows were Z-transformed and only
windows with Z scores ≥ 5 standard deviations were deemed
significant for autosomal and X-PAR loci, and ≥ 3 for the
X-NonPAR. Significance thresholds for the non-outlier
approach were determined as the 99th percentile from
F_ST score distributions from the simulated genomes.
Overlapping windows passing these thresholds were
merged.
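The two filtering approaches can be contrasted in a short sketch. It assumes that the outlier approach keeps windows whose Z score is at least the cutoff and that the simulation-based approach keeps windows exceeding the 99th percentile of simulated scores; the nearest-rank percentile, the use of the population standard deviation, and all function names are assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def z_transform(values):
    """Z scores: (value - mean) / standard deviation of the same values."""
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation (an assumption)
    return [(v - mu) / sd for v in values]


def nearest_rank_percentile(values, q):
    """q-th percentile by the nearest-rank method (an assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def outlier_windows(window_scores, z_cutoff=5.0):
    """Indices of windows whose Z score is at least z_cutoff."""
    z_scores = z_transform(window_scores)
    return [i for i, z in enumerate(z_scores) if z >= z_cutoff]


def simulation_windows(window_scores, simulated_scores, q=99):
    """Indices of windows above the q-th percentile of simulated scores."""
    cutoff = nearest_rank_percentile(simulated_scores, q)
    return [i for i, v in enumerate(window_scores) if v > cutoff]


# Toy data: one strongly differentiated window among 100.
toy_scores = [0.1] * 99 + [0.9]
print(outlier_windows(toy_scores))
```

The outlier approach sets its cutoff from the empirical distribution itself, whereas the simulation-based approach sets it from a neutral expectation, so the same window can pass one filter and fail the other.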
Pooled heterozygosity (H_P) and ΔH_P calculations
Per window, dog allele frequencies were used to calculate
pooled heterozygosity (H_P) using the following formula
from [6]:

$$H_P = \frac{2 \sum n_{MAJ} \sum n_{MIN}}{\left(\sum n_{MAJ} + \sum n_{MIN}\right)^2}$$

where Σn_MAJ and Σn_MIN are the sums of major and minor
dog alleles, respectively, for all sites in the window.
Significance threshold for window filtration was set as the
0.1th percentile of the H_P distribution from the simulated
genomes. The change in H_P (or ΔH_P) was calculated as the
difference in H_P with and without the inclusion of the
two ancient dog samples (HXH and NGD). Importantly,
genotypes in the ancient samples were determined for
the sites variable among the modern samples using an
approach that accounts for post-mortem ancient DNA
damage [34]. The 5-ky-old German dog (CTC) was not
included in this analysis due to known wolf admixture
[34]. Windows with ΔH_P greater than the 5th percentile
observed genome-wide were removed.
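A minimal sketch of the pooled heterozygosity calculation follows, using the formula H_P = 2 Σn_MAJ Σn_MIN / (Σn_MAJ + Σn_MIN)^2. The function names and toy allele counts are hypothetical. For ΔH_P the sketch takes H_P with the ancient samples minus H_P without them and re-assigns major and minor alleles from the combined counts; both the sign convention and that re-assignment are assumptions, since the text does not specify them.

```python
def pooled_heterozygosity(site_counts):
    """H_P for one window.

    site_counts: list of (allele_a_count, allele_b_count) per SNP; the
    larger count at each site is treated as the major allele.
    """
    counts = list(site_counts)
    sum_major = sum(max(a, b) for a, b in counts)
    sum_minor = sum(min(a, b) for a, b in counts)
    total = sum_major + sum_minor
    return 2 * sum_major * sum_minor / total ** 2


def delta_hp(modern_counts, ancient_counts):
    """Change in H_P when ancient allele counts are added site by site."""
    combined = [
        (m[0] + a[0], m[1] + a[1]) for m, a in zip(modern_counts, ancient_counts)
    ]
    return pooled_heterozygosity(combined) - pooled_heterozygosity(modern_counts)


# A window fixed in 43 modern dogs (86 alleles per site) has H_P = 0.
print(pooled_heterozygosity([(86, 0), (86, 0)]))

# Ancient samples carrying the other allele raise H_P for that window.
print(delta_hp([(86, 0)], [(2, 2)]) > 0)
```

A window that is swept in modern dogs but variable once ancient samples are added yields a large positive ΔH_P under this convention, which is the pattern the filter described above is meant to remove.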
XP-CLR selection scans
Cross-population comparative likelihood ratio (XP-CLR;
[41]) scores were calculated using pooled dog and wolf
allele frequencies at sites described above. This analysis
requires separate genotype files for each population, and
a single SNP file with positions of each SNP and their
genetic distance (in Morgans), which were determined
through linear extrapolation from the pedigree-based re-
combination map from [139]. Wolves were set as the
reference population, and XP-CLR was run on both the
real and simulated SNP sets with a grid size of 2 kb and
a window size of 50 kb. Windows that did not return a
value (failed) or did not have at least five grids were
removed. Average XP-CLR scores from passing grids
were calculated in 25 kb windows (step size = 10 kb).
Real windows with averages below the
99th percentile of averaged simulation scores were
removed. Remaining adjacent windows were merged if
they were within 50 kb distance (i.e., one sliding window
apart).
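The post-processing of XP-CLR grid scores described above (averaging in 25 kb windows with a 10 kb step, thresholding against simulations, and merging windows within 50 kb) can be sketched as below. This does not run XP-CLR itself and is not the authors' code; the function names, the choice to start windows at position 0, and the toy inputs are assumptions, and the removal of failed windows or windows with too few grids is omitted for brevity.

```python
def window_means(grid_scores, window=25_000, step=10_000):
    """Average grid scores in sliding windows.

    grid_scores: list of (position, score) tuples sorted by position.
    Returns a list of (start, end, mean_score) for windows holding scores.
    """
    if not grid_scores:
        return []
    last_position = grid_scores[-1][0]
    results = []
    start = 0
    while start <= last_position:
        end = start + window
        scores = [s for pos, s in grid_scores if start <= pos < end]
        if scores:
            results.append((start, end, sum(scores) / len(scores)))
        start += step
    return results


def merge_significant(windows, cutoff, max_gap=50_000):
    """Keep windows whose mean exceeds cutoff and merge those within max_gap."""
    kept = [(start, end) for start, end, score in windows if score > cutoff]
    merged = []
    for start, end in kept:
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous locus: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


toy_windows = [
    (0, 25_000, 10.0),
    (10_000, 35_000, 12.0),
    (100_000, 125_000, 11.0),
    (300_000, 325_000, 2.0),
]
print(merge_significant(toy_windows, cutoff=5.0))
```

In this toy example the first two windows overlap and are merged, the third lies more than 50 kb beyond them and forms its own locus, and the fourth falls below the cutoff.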
Visualization of candidate domestication regions
Forty-six additional canines (e.g., dog breeds, jackals, coyotes;
Additional file 1: Table S1) were genotyped at candidate loci
identified in this study, as well as those from [5,8,29] using
autosomal SNPs previously called in [34]. SNPs within CDRs
of interest were extracted from the SNP dataset using the
PLINK --make-bed tool with no missing data filter. Per sample,
each SNP was classified as 0/0, 0/1, or 1/1 at all loci (1
representing the non-reference allele), and this genotype data
was stored in Eigenstrat genotype files, which were generated
per window using convertf (Eigensoft package; [136]). A cus-
tom script [141] then converted the Eigenstrat genotype files
into matrices for visualization using matrix2png [142].
Gene enrichment and variant annotation
Coordinates and annotations of dog gene models were
obtained from Ensembl ([143,144], respectively), and a
non-redundant annotation set was determined. The se-
quence of each Ensembl protein was BLASTed against
the NCBI non-redundant database (blastp -outfmt 5 -evalue
1e-3 -word_size 3 -show_gis -max_hsps_per_subject 20
-num_threads 5 -max_target_seqs 20) and all blastp outputs
were processed through BLAST2GO [74] with the following
parameters: minimum annotation cut-off of 55, GO weight
equal to 5, BLASTp cut-off equal to 1e-6, HSP-hit cut-off of
0, and a hit filter equal to 55. Of the 19,017 autosomal genes
in our non-redundant gene set, 16,927 received BLAST2GO
annotations representing a total of 19,958 GO terms. To account
for effects of differential annotations, we also obtained
GO annotations from EMBL-EBI (Ensembl Release 92) for
the 19,017 gene models above. Predicted effects of SNP variants
were obtained by processing the total variant
VCF file of all canine samples with the Variant Effect Predictor
(VEP; [42]).
Positions of predicted domestication regions (XP-CLR
or V_ST) were intersected using BEDtools [145] (within a
window of 50 kb) with the coordinates of the annotated
Ensembl dog gene set to isolate genes within the puta-
tively swept regions, and we defined these as the ob-
served gene set. We performed 1000 randomized
shuffles of the loci of interest and, again, identified gene
models intersecting within 50 kb, and defined these as
the permuted gene sets. Gene enrichment analyses were
separately performed on the observed and permuted
gene sets using the parent-child model [68] in the
topGO R package [69]. Permutation-based p values
(p_perm) were produced for all GO terms by comparing
the observed parent-child test score with the results of
the 1000 permutations using the formula
p_perm = (X_perm + 1)/(N + 1), where X_perm is the number of
instances where a permutation obtained a parent-child p
value less than or equal to the observed p value, and N
is the number of permutations (N = 1000). One was
added to both the numerator and denominator in this
equation to avoid adjusted p values of zero. GO terms
with p_perm values less than 0.05 were further filtered to
produce our final enriched GO set. First, terms that were
not represented by more than one locus (XP-CLR or
V_ST) were removed, as these could have arisen due to
clustering of genes belonging to a given gene ontology.
Finally, terms were removed if they were represented by
only one gene, which can occur when a single gene is
spanned by more than one XP-CLR or V_ST locus.
Remaining GO terms are considered the enriched set.
This approach was performed separately for BLAST2GO
and EMBL-EBI GO annotation sets.
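The permutation-based p value and the subsequent term filters can be written compactly. The Python sketch below is illustrative only: the function names and the dictionary layout used to describe each GO term are assumptions, and the GO identifiers in any usage would be placeholders rather than results from this study.

```python
def permutation_p_value(observed_p, permuted_ps):
    """Return p_perm = (X_perm + 1) / (N + 1), where X_perm counts permutations
    whose parent-child p value is less than or equal to the observed p value."""
    n = len(permuted_ps)
    x_perm = sum(1 for p in permuted_ps if p <= observed_p)
    return (x_perm + 1) / (n + 1)

def filter_enriched_terms(terms, alpha=0.05):
    """Keep GO terms with p_perm < alpha that are supported by more than one
    locus and more than one gene. `terms` maps a GO ID to a dict with the keys
    'p_perm', 'loci', and 'genes' (a hypothetical layout)."""
    return sorted(
        go_id
        for go_id, t in terms.items()
        if t["p_perm"] < alpha
        and len(set(t["loci"])) > 1
        and len(set(t["genes"])) > 1
    )
```

With N = 1000 permutations, the smallest attainable value is 1/1001, which is reached when no permutation matches or beats the observed p value.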
Copy number estimation using QuicK-mer and fastCN
We implemented two copy number estimation pipelines to
assess copy number in village dogs and wolves using the
depth of sequencing reads. The first, fastCN, is a modified
version of existing pipelines that considers multi-mapping
reads to calculate copy number within 3 kb windows
(Additional file 3: Note 1; [5,23,24,32,34,36–38,66,145–171]).
Because multi-mapping reads are considered, copy number
profiles are shared among related gene paralogs, making
it difficult to identify specific sequences that are potentially
variable. The second pipeline we employed, QuicK-mer, is a
map-free approach based on k-mer counting that can
accurately assess copy number in a paralog-sensitive manner
(Additional file 3: Note 2; Additional file 4). Both pipelines
analyze sequencing read-depth within predefined windows,
apply GC-correction and other normalizations, and are able
to convert read depth to a copy-number estimate for each
window (Additional file 3: Note 3.1). The signal-to-noise ratio
(SNR), defined as the mean depth in autosomal control
windows divided by the standard deviation, was calculated
for each sample (Additional file 3: Note 3.2). The copy
number states called by both the QuicK-mer and fastCN
pipelines were validated through comparison with aCGH
data from [170] (Additional file 3: Note 3.3; Additional file 5).
Regions with copy number variation between samples in
the aCGH or WGS data were selected for correlation
analysis.
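The SNR definition given above, and one simple way of scaling depth to copy number, can be sketched in Python. The SNR function follows the definition in the text; the use of the sample standard deviation and the scaling of depth against diploid control windows are assumptions made for illustration, since the normalizations actually applied are described in Additional file 3. Function names are hypothetical.

```python
import numpy as np

def signal_to_noise(control_depths):
    """SNR = mean depth in control windows divided by their standard deviation.
    Assumption: the sample standard deviation (ddof=1) is used."""
    depths = np.asarray(control_depths, dtype=float)
    return float(depths.mean() / depths.std(ddof=1))

def depth_to_copy_number(depths, control_depths, control_copies=2):
    """Scale corrected depth so that control windows average `control_copies`.
    Assumption: a simple linear scaling after GC correction."""
    depths = np.asarray(depths, dtype=float)
    return control_copies * depths / float(np.mean(control_depths))
```

For example, a window with half the mean control depth would be assigned a copy number of one under this scaling.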
V_ST selection scans
Treating village dogs and wolves as separate populations,
V_ST values [66] were calculated for genomic windows
with evidence of copy number variation. V_ST values were
Z-transformed and we identified outlier regions as windows
exhibiting at least a 1.5 copy number range across
all samples, and ZV_ST scores greater than 5 on the autosomes
and the X-PAR, or greater than 3 in the
X-nonPAR. Prior to analysis, estimated copy numbers
for male samples on the non-PAR region of the X were
doubled. Outlier regions spanning more than one win-
dow were then classified as copy number outlier regions
(Additional file 1: Table S7). A similar analysis was per-
formed for the unplaced chromosomal contigs in the
CanFam3.1 assembly (Additional file 1: Table S11). See
Additional file 3: Note 3.4 for additional methods and
details.
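The V_ST scan can be illustrated with a short Python sketch. It assumes the definition of V_ST from [66], V_ST = (V_T - V_S)/V_T, where V_T is the variance across all samples and V_S is the within-population variance averaged with weights proportional to population size. The use of population (rather than sample) variances and all function names are assumptions, and the thresholds are those stated above.

```python
import numpy as np

def vst(cn_dogs, cn_wolves):
    """V_ST for one window from per-sample copy numbers in two populations."""
    a = np.asarray(cn_dogs, dtype=float)
    b = np.asarray(cn_wolves, dtype=float)
    total = np.concatenate([a, b])
    v_t = total.var()
    if v_t == 0:
        return 0.0  # no copy number variation in this window
    v_s = (a.var() * a.size + b.var() * b.size) / total.size
    return float((v_t - v_s) / v_t)

def z_transform(values):
    """Standardize V_ST values to zero mean and unit variance."""
    v = np.asarray(values, dtype=float)
    return (v - v.mean()) / v.std()

def double_male_x_nonpar(copy_numbers, is_male):
    """Double estimated copy number for male samples on the X non-PAR."""
    cn = np.asarray(copy_numbers, dtype=float)
    return np.where(np.asarray(is_male, dtype=bool), 2 * cn, cn)

def is_outlier(z_vst, cn_range, region="autosome"):
    """Apply the range and ZV_ST thresholds (5, or 3 on the X non-PAR)."""
    threshold = 3.0 if region == "X-nonPAR" else 5.0
    return cn_range >= 1.5 and z_vst > threshold
```

A window fixed for different copy numbers in the two populations gives V_ST = 1, whereas identical distributions in both populations give V_ST = 0.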
Amylase structural variant analysis
We estimated copy number using short-read sequencing
data from each canine listed in Additional file 1: Table S1.
Copy number estimates for the AMY2B gene using fastCN
were based on a single window located at
chrUn_AAEX03020568:4873-8379. See Supplementary Methods:
Note 3.5.1 (Additional file 3) for further methods and results.
Droplet digital PCR (ddPCR) primers were designed targeting
overlapping 1.9 and 2.0 Mb duplications, the AMY2B
gene and a copy number control region (chr18:
27,529,623-27,535,395) found to have a copy number of two
in all sampled canines by QuicK-mer and fastCN. Copy
number for each target was determined from ddPCR results
from a single replicate for 30 village dogs, 3 New Guinea
singing dogs, and 5 breed dogs (Additional file 1: Table S12),
and averaged from two replicates for 48 breed dogs
(Additional file 1: Table S13). For more details on primer de-
sign, methods, and results for the characterization of the
AMY2B locus, see Additional file 3: Note 3.5.
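As a rough illustration of how a ddPCR copy number is commonly derived from a target and a two-copy control region, consider the Python sketch below. The ratio-based formula is a widely used convention and is an assumption here; the exact calculation used in this study is described in Additional file 3. Function names and the example concentrations are hypothetical.

```python
def ddpcr_copy_number(target_conc, control_conc, control_copies=2):
    """Copy number from the ratio of target to control concentration,
    scaled by the known copy number of the control region (assumed formula)."""
    return control_copies * target_conc / control_conc

def mean_of_replicates(copy_numbers):
    """Average copy number estimates across technical replicates."""
    return sum(copy_numbers) / len(copy_numbers)
```

For instance, a target measured at three times the concentration of the two-copy control would be estimated at six copies.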
Additional files
Additional file 1: Table S1. Description and accession numbers for
canine genomes processed in this study. Indications for whether or not a
sample was used in the ADMIXTURE analysis or selection scans (F_ST, XP-CLR,
or V_ST) are provided. Table S2. Coordinates and annotations of outlier F_ST
loci identified through the empirical approach, including intersecting
Axelsson, Cagan and Blass, and Freedman loci [5,8,29]. Table S3.
Parameters incorporated into demographic model for neutral evolution
in village dog and wolf populations. Table S4. Coordinates and
annotations of outlier F_ST loci identified through the simulation informed
(non-empirical) approach, including genes as well as intersecting
Axelsson, Cagan and Blass, and Freedman loci [5,8,29]. Table S5. Coordinates
and annotations of outlier XP-CLR candidate domestication regions identified
through simulation approach, including intersecting genes as well as
intersecting Axelsson, Cagan and Blass, and Freedman loci [5,8,29]. Table S6.
Predicted SNP effects (per variant effect predictor [42]) for sites in XP-CLR
candidate domestication regions. Table S7. Coordinates of V_ST copy number
outliers on autosomes and chromosome X. Table S8. Gene enrichment results
for XP-CLR candidate domestication regions following p value adjustment
(BLAST2GO). Table S9. Gene enrichment for XP-CLR candidate domestication
regions following p value adjustment (EMBL-EBI). Table S10. Gene enrichment
results for V_ST copy number outliers following p value adjustment. Table S11.
Coordinates of V_ST copy number outliers on chromosome unknown (chrUn).
Table S12. ddPCR results from 30 village, 3 New Guinea singing, and 5 breed
dog samples of AMY2B segmental duplications. Table S13. ddPCR results from
48 breed dog samples of AMY2B segmental duplications. (XLSX 328 kb)
Additional file 2: Figure S1. Z-transformed F_ST scores for Cagan and
Blass locus. Figure S2. Demographic model for village dog and wolf
populations used in neutral simulations. Figure S3. Filtration pipeline
implemented for F_ST and XP-CLR windows. Figure S4. Distribution of
selection scan statistics for real and simulated F_ST windows. Figure S5.
Filtration pipeline implemented for Axelsson, Cagan and Blass, and Freedman
CDRs. Figure S6. Distribution of selection scan statistics for real and
simulated XP-CLR windows. Figure S7. Gene intersect statistics from
randomized permutations of XP-CLR gene positions. Figure S8. Gene intersect
statistics from randomized permutations of V_ST gene positions.
Figure S9. Read-depth profiles at the AMY2B locus highlight a large-scale
structural variant. Figure S10. Correlations between the ddPCR and
read-depth estimated copy number for AMY2B and associated segmental
duplications. Figure S11. ddPCR results for the AMY2B gene, 1.9 Mb
duplication, and the 2.0 Mb duplication.
Figure S12. Admixture plot for K = 2–5 for the full set of assayed canines for
sample filtration. This includes breed dogs, village dogs, as well as gray
wolves. (DOCX 3397 kb)
Additional file 3: Notes 1–3 providing supplementary methods and
results of copy number analysis. (DOCX 1850 kb)
Additional file 4: Supplemental QuicK-mer validation figures.
(PDF 1024 kb)
Additional file 5: Plots displaying aCGH probe intensity correlations with
in silico copy number estimates. (PDF 1916 kb)
Abbreviations
aCGH: Array comparative genomic hybridization; CDR: Candidate
domestication region; chrUn: Chromosome unknown; ddPCR: Droplet digital
polymerase chain reaction; GO: Gene ontology; H_P: Pooled heterozygosity;
NC: Neural crest; NCC: Neural crest cell; qPCR: Quantitative polymerase chain
reaction; SNP: Single-nucleotide polymorphism; XP-CLR: Cross-population
composite likelihood ratio
Acknowledgements
We thank Shiya Song for advice and assistance in the processing of
canid variation data and Laura Botigue for discussion of results utilizing
ancient DNA.
Funding
This work was supported by National Institutes of Health (R01GM103961 to
ARB and JMK, and T32HG00040 to AP). DNA samples and associated
phenotypic data were provided by the Cornell Veterinary Biobank, a resource
built with the support of NIH grant R24GM082910 and the Cornell University
College of Veterinary Medicine.
Availability of data and materials
Custom scripts and datasets supporting the conclusions of this article are
available in the article and its additional files, as well as a custom UCSC track
hub (https://raw.githubusercontent.com/KiddLab/Pendleton_2018_Selection_Scan/master/Selection_track_hub.txt).
Software (fastCN and QuicK-mer) implemented in this article is available for
download in a GitHub repository (https://github.com/KiddLab/). Pre-computed
30-mers from the dog, human, mouse, and chimpanzee genomes can be
downloaded from http://kiddlabshare.umms.med.umich.edu/public-data/QuicK-mer/Ref/
for QuicK-mer processing. Genome sequence data for three
New Guinea singing dogs were published under project ID SRP034749 in the
Short Read Archive.
Authors' contributions
JMK, ALP, and FS designed the study. JMK oversaw the study. Selection scans
were performed by ALP, AT, and FS. AT and ALP assessed population
structure. Copy number variation was estimated by FS and JMK. Functional
annotations and enrichment analyses were performed by ALP. FS processed
aCGH data. KV processed ancient dog samples. Samples and genome
sequences were provided by KV, ARB, and JMK. SE performed the DNA
extractions, library generation, and ddPCR analyses. ALP, FS, and JMK wrote
the paper with input from the other authors. All authors read and approved
the final manuscript.
Ethics approval
Not applicable.
Competing interests
ARB is a cofounder and officer of Embark Veterinary, Inc., a canine genetics
testing company.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Author details
¹Department of Human Genetics, University of Michigan, Ann Arbor, MI
48109, USA. ²Department of Ecology and Evolution, Stony Brook University,
Stony Brook, NY 11794, USA. ³Department of Biomedical Sciences, Cornell
University, Ithaca, New York 14853, USA. ⁴Department of Computational
Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109,
USA.
Received: 27 February 2018 Accepted: 23 May 2018
References
1. Trut LN. Early Canid Domestication: The Farm-Fox Experiment: foxes bred
for tamability in a 40-year experiment exhibit remarkable transformations
that suggest an interplay between behavioral genetics and development.
Am Sci. 1999;87(2):160–9.
2. Germonpré M, Sablin MV, Lázničková-Galetová M, Després V, Stevens RE,
Stiller M, Hofreiter M. Palaeolithic dogs and Pleistocene wolves revisited: a
reply to Morey (2014). J Archaeol Sci. 2015;54:210–6.
3. Larson G, Burger J. A population genetics view of animal domestication.
Trends Genet. 2013;29(4):197–205.
4. Harlan JR. Crops and man. Foundation for modern Crop Science. Madison:
American Society of Agronomy; 1975.
5. Axelsson E, Ratnakumar A, Arendt ML, Maqbool K, Webster MT, Perloski M,
Liberg O, Arnemo JM, Hedhammar A, Lindblad-Toh K. The genomic
signature of dog domestication reveals adaptation to a starch-rich diet.
Nature. 2013;495(7441):360–4.
6. Rubin CJ, Zody MC, Eriksson J, Meadows JR, Sherwood E, Webster MT,
Jiang L, Ingman M, Sharpe T, Ka S, et al. Whole-genome resequencing
reveals loci under selection during chicken domestication. Nature. 2010;
464(7288):587–91.
7. Li M, Tian S, Jin L, Zhou G, Li Y, Zhang Y, Wang T, Yeung CK, Chen L, Ma J,
et al. Genomic analyses identify distinct patterns of selection in
domesticated pigs and Tibetan wild boars. Nat Genet. 2013;45(12):1431–8.
8. Cagan A, Blass T. Identification of genomic variants putatively targeted by
selection during dog domestication. BMC Evol Biol. 2016;16:10.
9. Rubin C-J, Megens H-J, Barrio AM, Maqbool K, Sayyab S, Schwochow D,
Wang C, Carlborg Ö, Jern P, Jørgensen CB. Strong signatures of selection in
the domestic pig genome. Proc Natl Acad Sci. 2012;109(48):19529–36.
10. Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long
R, et al. Yak whole-genome resequencing reveals domestication signatures
and prehistoric population expansions. Nat Commun. 2015;6:10283.
11. Fariello MI, Servin B, Tosser-Klopp G, Rupp R, Moreno C, International Sheep
Genomics Consortium, San Cristobal M, Boitard S. Selection signatures in
worldwide sheep populations. PLoS One. 2014;9(8):e103813.
12. Fang M, Larson G, Ribeiro HS, Li N, Andersson L. Contrasting mode of
evolution at a coat color locus in wild and domestic pigs. PLoS Genet. 2009;
5(1):e1000341.
13. Wang Z, Yonezawa T, Liu B, Ma T, Shen X, Su J, Guo S, Hasegawa M, Liu J.
Domestication relaxed selective constraints on the yak mitochondrial
genome. Mol Biol Evol. 2010;28(5):1553–6.
14. Cheng T, Fu B, Wu Y, Long R, Liu C, Xia Q. Transcriptome sequencing and
positive selected genes analysis of Bombyx mandarina. PLoS One. 2015;
10(3):e0122837.
15. Gray MM, Granka JM, Bustamante CD, Sutter NB, Boyko AR, Zhu L, Ostrander
EA, Wayne RK. Linkage disequilibrium and demographic history of wild and
domestic canids. Genetics. 2009;181(4):1493–505.
16. Amaral AJ, Megens H-J, Crooijmans RP, Heuven HC, Groenen MA. Linkage
disequilibrium decay and haplotype block structure in the pig. Genetics.
2008;179(1):569–79.
17. Montague MJ, Li G, Gandolfi B, Khan R, Aken BL, Searle SM, Minx P,
Hillier LW, Koboldt DC, Davis BW, et al. Comparative analysis of the
domestic cat genome reveals genetic signatures underlying feline
biology and domestication. Proc Natl Acad Sci U S A. 2014;111(48):
17230–5.
18. Wilkins AS, Wrangham RW, Fitch WT. The “domestication syndrome” in
mammals: a unified explanation based on neural crest cell behavior and
genetics. Genetics. 2014;197(3):795–808.
19. Carneiro M, Rubin C-J, Di Palma F, Albert FW, Alföldi J, Barrio AM,
Pielberg G, Rafati N, Sayyab S, Turner-Maier J. Rabbit genome analysis
reveals a polygenic basis for phenotypic change during domestication.
Science. 2014;345(6200):1074–9.
20. Wilkins AS. Revisiting two hypotheses on the “domestication syndrome” in
light of genomic data. Vavilov J Genet Breed. 2017;21(4):435–42.
21. Sánchez-Villagra MR, Geiger M, Schneider RA. The taming of the neural
crest: a developmental perspective on the origins of morphological
covariation in domesticated mammals. R Soc Open Sci. 2016;3(6):160107.
22. Fallahsharoudi A, De Kock N, Johnsson M, Ubhayasekera SKA, Bergquist J,
Wright D, Jensen P. Domestication effects on stress induced steroid
secretion and adrenal gene expression in chickens. Sci Rep. 2015;5:15345.
23. Frantz LA, Mullin VE, Pionnier-Capitan M, Lebrasseur O, Ollivier M, Perri A,
Linderholm A, Mattiangeli V, Teasdale MD, Dimopoulos EA, et al. Genomic
and archaeological evidence suggest a dual origin of domestic dogs.
Science. 2016;352(6290):1228–31.
24. Shannon LM, Boyko RH, Castelhano M, Corey E, Hayward JJ, McLean C,
White ME, Abi Said M, Anita BA, Bondjengo NI, et al. Genetic structure in
village dogs reveals a Central Asian domestication origin. Proc Natl Acad Sci
U S A. 2015;112(44):13639–44.
25. Wang GD, Zhai W, Yang HC, Wang L, Zhong L, Liu YH, Fan RX, Yin TT, Zhu
CL, Poyarkov AD, et al. Out of southern East Asia: the natural history of
domestic dogs across the world. Cell Res. 2016;26(1):21–33.
26. Vonholdt BM, Pollinger JP, Lohmueller KE, Han E, Parker HG, Quignon P,
Degenhardt JD, Boyko AR, Earl DA, Auton A, et al. Genome-wide SNP and
haplotype analyses reveal a rich history underlying dog domestication.
Nature. 2010;464(7290):898–902.
27. Skoglund P, Ersmark E, Palkopoulou E, Dalen L. Ancient wolf genome
reveals an early divergence of domestic dog ancestors and admixture into
high-latitude breeds. Curr Biol. 2015;25(11):1515–9.
28. Wang G-D, Zhai W, Yang H-C, Fan R-X, Cao X, Zhong L, Wang L, Liu F, Wu
H, Cheng L-G. The genomics of selection in dogs and the parallel evolution
between dogs and humans. Nat Commun. 2013;4:1860.
29. Freedman AH, Schweizer RM, Ortega-Del Vecchyo D, Han E, Davis BW,
Gronau I, Silva PM, Galaverni M, Fan Z, Marx P, et al. Demographically-based
evaluation of genomic regions under selection in domestic dogs. PLoS
Genet. 2016;12(3):e1005851.
30. Boyko AR. The domestic dog: man's best friend in the genomic era.
Genome Biol. 2011;12(2):216.
31. Pilot M, Malewski T, Moura AE, Grzybowski T, Oleński K, Ruść A, Kamiński S,
Fadel FR, Mills DS, Alagaili AN. On the origin of mongrels: evolutionary
history of free-breeding dogs in Eurasia. Proc R Soc B. 2015;282(1820):
20152189.
32. Freedman AH, Gronau I, Schweizer RM, Ortega-Del Vecchyo D, Han E, Silva
PM, Galaverni M, Fan Z, Marx P, Lorente-Galdos B. Genome sequencing
highlights the dynamic early history of dogs. PLoS Genet. 2014;10(1):
e1004016.
33. Marsden CD, Ortega-Del Vecchyo D, O'Brien DP, Taylor JF, Ramirez O, Vilà C,
Marques-Bonet T, Schnabel RD, Wayne RK, Lohmueller KE. Bottlenecks and
selective sweeps during domestication have increased deleterious genetic
variation in dogs. Proc Natl Acad Sci. 2016;113(1):152–7.
34. Botigue LR, Song S, Scheu A, Gopalan S, Pendleton AL, Oetjens M,
Taravella AM, Seregely T, Zeeb-Lanz A, Arbogast RM, et al. Ancient
European dog genomes reveal continuity since the Early Neolithic. Nat
Commun. 2017;8:16082.
35. Kelley JL, Madeoy J, Calhoun JC, Swanson W, Akey JM. Genomic signatures
of positive selection in humans and the limits of outlier approaches.
Genome Res. 2006;16(8):980–9.
36. Arendt M, Cairns KM, Ballard JW, Savolainen P, Axelsson E. Diet adaptation
in dog reflects spread of prehistoric agriculture. Heredity (Edinb). 2016;
117(5):301–6.
37. Arendt M, Fall T, Lindblad-Toh K, Axelsson E. Amylase activity is associated
with AMY2B copy numbers in dog: implications for dog domestication, diet
and diabetes. Anim Genet. 2014;45(5):716–22.
38. Ollivier M, Tresset A, Bastian F, Lagoutte L, Axelsson E, Arendt ML, Balasescu
A, Marshour M, Sablin MV, Salanova L, et al. Amy2B copy number variation
reveals starch diet adaptations in ancient European dogs. R Soc Open Sci.
2016;3(11):160449.
39. Alexander DH, Novembre J, Lange K. Fast model-based estimation of
ancestry in unrelated individuals. Genome Res. 2009;19(9):1655–64.
40. Fan Z, Silva P, Gronau I, Wang S, Armero AS, Schweizer RM, Ramirez O,
Pollinger J, Galaverni M, Del-Vecchyo DO. Worldwide patterns of genomic
variation and admixture in gray wolves. Genome Res. 2016;26(2):163–73.
41. Chen H, Patterson N, Reich D. Population differentiation as a test for
selective sweeps. Genome Res. 2010;20(3):393–402.
42. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P,
Cunningham F. The Ensembl Variant Effect Predictor. Genome Biol.
2016;17(1):122.
43. Maden M. Retinoic acid in the development, regeneration and maintenance
of the nervous system. Nat Rev Neurosci. 2007;8(10):755.
44. Shirai H, Oishi K, Ishida N. Bidirectional CLOCK/BMAL1-dependent circadian
gene regulation by retinoic acid in vitro. Biochem Biophys Res Commun.
2006;351(2):387–91.
45. Slager RE, Newton TL, Vlangos CN, Finucane B, Elsea SH. Mutations in RAI1
associated with Smith–Magenis syndrome. Nat Genet. 2003;33(4):466.
46. Potocki L, Bi W, Treadwell-Deering D, Carvalho CM, Eifert A, Friedman EM,
Glaze D, Krull K, Lee JA, Lewis RA. Characterization of Potocki-Lupski
syndrome (dup (17)(p11. 2p11. 2)) and delineation of a dosage-sensitive
critical interval that can convey an autism phenotype. Am J Hum Genet.
2007;80(4):633–49.
47. Olivares AM, Han Y, Soto D, Flattery K, Marini J, Mollema N, Haider A, Escher
P, DeAngelis MM, Haider NB. The nuclear hormone receptor gene Nr2c1
(Tr2) is a critical regulator of early retina cell patterning. Dev Biol. 2017;
429(1):343–55.
48. Dedhar S, Rennie PS, Shago M, Hagesteijn C-YL, Yang H, Filmus J, Hawley
RG, Bruchovsky N, Cheng H, Matusik RJ. Inhibition of nuclear hormone
receptor activity by calreticulin. Nature. 1994;367(6462):480.
49. Takahashi H, Kanno T, Nakayamada S, Hirahara K, Sciumè G, Muljo SA,
Kuchen S, Casellas R, Wei L, Kanno Y. TGF-β and retinoic acid induce the
microRNA miR-10a, which targets Bcl-6 and constrains the plasticity of
helper T cells. Nat Immunol. 2012;13(6):587.
50. Chambers D, Wilson L, Maden M, Lumsden A. RALDH-independent
generation of retinoic acid during vertebrate embryogenesis by CYP1B1.
Development. 2007;134(7):1369–83.
51. Yang J, Seo J, Nair R, Han S, Jang S, Kim K, Han K, Paik SK, Choi J, Lee S.
DGKι regulates presynaptic release during mGluR-dependent LTD. EMBO J.
2011;30(1):165–80.
52. Puranam RS, Eubanks JH, Heinemann SF, McNamara JO. Chromosomal
localization of gene for human glutamate receptor subunit-7. Somat Cell
Mol Genet. 1993;19(6):581–8.
53. Augustin I, Rosenmund C, Südhof TC, Brose N. Munc13-1 is essential for
fusion competence of glutamatergic synaptic vesicles. Nature. 1999;
400(6743):457.
54. Caddick SJ, Wang C, Fletcher CF, Jenkins NA, Copeland NG, Hosford DA.
Excitatory but not inhibitory synaptic transmission is reduced in lethargic
(Cacnb4 lh) and tottering (Cacna1a tg) mouse thalami. J Neurophysiol. 1999;
81(5):2066–74.
55. Harris JA, Westbrook RF. Evidence that GABA transmission mediates
context-specific extinction of learned fear. Psychopharmacology. 1998;
140(1):105–15.
56. Stork O, Ji F-Y, Obata K. Reduction of extracellular GABA in the mouse
amygdala during and following confrontation with a conditioned fear
stimulus. Neurosci Lett. 2002;327(2):138–42.
57. Gassmann M, Bettler B. Regulation of neuronal GABA B receptor functions
by subunit composition. Nat Rev Neurosci. 2012;13(6):380.
58. Oury F, Khrimian L, Denny CA, Gardin A, Chamouni A, Goeden N,
Huang Y-Y, Lee H, Srinivas P, Gao X-B. Maternal and offspring pools of
osteocalcin influence brain development and functions. Cell. 2013;
155(1):228–41.
59. Cheng L, Arata A, Mizuguchi R, Qian Y, Karunaratne A, Gray PA, Arata S,
Shirasawa S, Bouchard M, Luo P. Tlx3 and Tlx1 are post-mitotic selector
genes determining glutamatergic over GABAergic cell fates. Nat Neurosci.
2004;7(5):510.
60. van Koningsbruggen S, Straasheijm KR, Sterrenburg E, de Graaf N,
Dauwerse HG, Frants RR, van der Maarel SM. FRG1P-mediated
aggregation of proteins involved in pre-mRNA processing.
Chromosoma. 2007;116(1):53–64.
61. Mathew R, Hartmuth K, Möhlmann S, Urlaub H, Ficner R, Lührmann R.
Phosphorylation of human PRP28 by SRPK2 is required for integration of
the U4/U6-U5 tri-snRNP into the spliceosome. Nat Struct Mol Biol. 2008;
15(5):435.
62. Xia H, Chen D, Wu Q, Wu G, Zhou Y, Zhang Y, Zhang L. CELF1 preferentially
binds to exon-intron boundary and regulates alternative splicing in HeLa
cells. Biochim Biophys Acta. 2017;1860(9):911–21.
63. Kim Y-D, Lee J-Y, Oh K-M, Araki M, Araki K, Yamamura K-i, Jun C-D. NSrp70 is
a novel nuclear speckle-related protein that modulates alternative pre-
mRNA splicing in vivo. Nucleic Acids Res. 2011;39(10):4300–14.
64. Kim C-H, Kim Y-D, Choi E-K, Kim H-R, Na B-R, Im S-H, Jun C-D. Nuclear
speckle-related protein 70 binds to serine/arginine-rich splicing factors 1
and 2 via an arginine/serine-like region and counteracts their alternative
splicing activity. J Biol Chem. 2016;291(12):6169–81.
65. Straub T, Grue P, Uhse A, Lisby M, Knudsen BR, Tange TØ, Westergaard O,
Boege F. The RNA-splicing factor PSF/p54 controls DNA-topoisomerase I
activity by a direct interaction. J Biol Chem. 1998;273(41):26261–4.
66. Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H,
Shapero MH, Carson AR, Chen W. Global variation in copy number in the
human genome. Nature. 2006;444(7118):444.
67. Zhou Z, Jiang Y, Wang Z, Gou Z, Lyu J, Li W, Yu Y, Shu L, Zhao Y, Ma Y.
Resequencing 302 wild and cultivated accessions identifies genes related to
domestication and improvement in soybean. Nat Biotechnol. 2015;33(4):408.
68. Grossmann S, Bauer S, Robinson PN, Vingron M. Improved detection of
overrepresentation of Gene-Ontology annotations with parent–child
analysis. Bioinformatics. 2007;23(22):3024–31.
69. Alexa A, Rahnenfuhrer J. topGO: enrichment analysis for gene ontology. R
Packag Ver. 2010;2(0):1-26.
70. Schlamp F, Made J, Stambler R, Chesebrough L, Boyko AR, Messer PW.
Evaluating the performance of selection scans to detect selective sweeps in
domestic dogs. Mol Ecol. 2016;25(1):342–56.
71. Jensen JD. On the unfounded enthusiasm for soft selective sweeps. Nat
Commun. 2014;5:5281.
72. Librado P, Gamba C, Gaunitz C, Der Sarkissian C, Pruvost M, Albrechtsen A,
Fages A, Khan N, Schubert M, Jagannathan V, et al. Ancient genomic
changes associated with domestication of the horse. Science. 2017;
356(6336):442–5.
73. Schubert M, Jonsson H, Chang D, Der Sarkissian C, Ermini L, Ginolhac A,
Albrechtsen A, Dupanloup I, Foucal A, Petersen B, et al. Prehistoric genomes
reveal the genetic foundation and cost of horse domestication. Proc Natl
Acad Sci U S A. 2014;111(52):E5661–9.
74. Gotz S, Garcia-Gomez J, Terol J, Williams T, Nagaraj S, Nueda M,
Robles M, Talon M, Dopazo J, Conesa A. High-throughput functional
annotation and data mining with the Blast2GO suite. Nucleic Acids Res.
2008;36(10):3420–35.
75. Qanbari S, Pausch H, Jansen S, Somel M, Strom TM, Fries R, Nielsen R,
Simianer H. Classic selective sweeps revealed by massive sequencing in
cattle. PLoS Genet. 2014;10(2):e1004148.
76. Kijas JW. Haplotype-based analysis of selective sweeps in sheep. Genome.
2014;57(8):433–7.
77. Peyrégne S, Boyle MJ, Dannemann M, Prüfer K. Detecting ancient positive
selection in humans using extended lineage sorting. Genome Res. 2017;
27(9):1563–72.
78. Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze A,
Renaud G, Sudmant PH, De Filippo C. The complete genome sequence of a
Neanderthal from the Altai Mountains. Nature. 2014;505(7481):43.
79. Racimo F. Testing for ancient selection using cross-population allele
frequency differentiation. Genetics. 2016;202(2):733–50.
80. Lin R, Du X, Peng S, Yang L, Ma Y, Gong Y, Li S. Discovering all
transcriptome single-nucleotide polymorphisms and scanning for selection
signatures in ducks (Anas platyrhynchos). Evol Bioinformatics Online. 2015;
11(Suppl 1):67–76.
81. Bronner ME, LeDouarin NM. Development and evolution of the neural crest:
an overview. Dev Biol. 2012;366(1):29.
82. van Es JH, Haegebarth A, Kujala P, Itzkovitz S, Koo B-K, Boj SF,
Korving J, van den Born M, van Oudenaarden A, Robine S. A critical
role for the Wnt effector Tcf4 in adult intestinal homeostatic self-
renewal. Mol Cell Biol. 2012;32(10):1918–27.
83. Tang L-Y, Deng N, Wang L-S, Dai J, Wang Z-L, Jiang X-S, Li S-J, Li L,
Sheng Q-H, Wu D-Q. Quantitative phosphoproteome profiling of
Wnt3a-mediated signaling network indicating the involvement of
ribonucleoside-diphosphate reductase M2 subunit phosphorylation at
residue serine 20 in canonical Wnt signal transduction. Mol Cell Proteomics.
2007;6(11):1952–67.
84. Bergmann C, Fliegauf M, Brüchle NO, Frank V, Olbrich H, Kirschner J,
Schermer B, Schmedding I, Kispert A, Kränzlin B. Loss of nephrocystin-3
function can cause embryonic lethality, Meckel-Gruber-like syndrome, situs
inversus, and renal-hepatic-pancreatic dysplasia. Am J Hum Genet. 2008;
82(4):959–70.
85. Carmon KS, Gong X, Lin Q, Thomas A, Liu Q. R-spondins function as ligands
of the orphan receptors LGR4 and LGR5 to regulate Wnt/β-catenin
signaling. Proc Natl Acad Sci. 2011;108(28):11452–7.
86. Bronner-Fraser M. Neural crest cell formation and migration in the
developing embryo. FASEB J. 1994;8(10):699–706.
87. Santagati F, Rijli FM. Cranial neural crest and the building of the vertebrate
head. Nat Rev Neurosci. 2003;4(10):806.
88. Lagadec R, Laguerre L, Menuet A, Amara A, Rocancourt C, Péricard P,
Godard BG, Rodicio MC, Rodriguez-Moldes I, Mayeur H. The ancestral role of
nodal signalling in breaking L/R symmetry in the vertebrate forebrain. Nat
Commun. 2015;6:6686.
89. Berryman MA, Goldenring JR. CLIC4 is enriched at cell-cell junctions and
colocalizes with AKAP350 at the centrosome and midbody of cultured
mammalian cells. Cytoskeleton. 2003;56(3):159–72.
90. de la Torre-Ubieta L, Gaudillière B, Yang Y, Ikeuchi Y, Yamada T, DiBacco S,
Stegmüller J, Schüller U, Salih DA, Rowitch D. A FOXO–Pak1 transcriptional
pathway controls neuronal polarity. Genes Dev. 2010;24(8):799–813.
91. Kunisaki Y, Nishikimi A, Tanaka Y, Takii R, Noda M, Inayoshi A, Watanabe K-i,
Sanematsu F, Sasazuki T, Sasaki T. DOCK2 is a Rac activator that
regulates motility and polarity during neutrophil chemotaxis. J Cell Biol.
2006;174(5):647–52.
92. Dupraz S, Grassi D, Bernis ME, Sosa L, Bisbal M, Gastaldi L, Jausoro I, Cáceres
A, Pfenninger KH, Quiroga S. The TC10–Exo70 complex is essential for
membrane expansion and axonal specification in developing neurons. J
Neurosci. 2009;29(42):13292–301.
93. Le Douarin NM, Dupin E. The neural crest in vertebrate evolution. Curr Opin
Genet Dev. 2012;22(4):381–9.
94. Minoux M, Rijli FM. Molecular mechanisms of cranial neural crest cell
migration and patterning in craniofacial development. Development. 2010;
137(16):2605–21.
95. Morey DF. Size, shape and development in the evolution of the domestic
dog. J Archaeol Sci. 1992;19(2):181–204.
96. Etchevers HC, Couly G, Vincent C, Le Douarin NM. Anterior cephalic neural
crest is required for forebrain viability. Development. 1999;126(16):3533–43.
97. Xavier GM, Sharpe PT, Cobourne MT. Scube1 is expressed during facial
development in the mouse. J Exp Zool B Mol Dev Evol. 2009;312(5):518–24.
98. FitzPatrick DR, Carr IM, McLaren L, Leek JP, Wightman P, Williamson K,
Gautier P, McGill N, Hayward C, Firth H. Identification of SATB2 as the cleft
palate gene on 2q32–q33. Hum Mol Genet. 2003;12(19):2491–501.
99. Dobreva G, Chahrour M, Dautzenberg M, Chirivella L, Kanzler B, Fariñas I,
Karsenty G, Grosschedl R. SATB2 is a multifunctional determinant of craniofacial
patterning and osteoblast differentiation. Cell. 2006;125(5):97186.
100. Szeto DP, Rodriguez-Esteban C, Ryan AK, OConnell SM, Liu F, Kioussi C,
Gleiberman AS, Izpisúa-Belmonte JC, Rosenfeld MG. Role of the Bicoid-
related homeodomain factor Pitx1 in specifying hindlimb morphogenesis
and pituitary development. Genes Dev. 1999;13(4):48494.
101. Amand TRS, Zhang Y, Semina EV, Zhao X, Hu Y, Nguyen L, Murray JC, Chen
Y. Antagonistic signals between BMP4 and FGF8 define the expression of
Pitx1 and Pitx2 in mouse tooth-forming anlage. Dev Biol. 2000;217(2):323
32.
102. Sandell L. Neural crest cells in ear development. In: Neural Crest Cells. 1st
ed. Cambridge: Academic Press, Elsevier; 2014. p. 16787.
103. Boyko AR, Quignon P, Li L, Schoenebeck JJ, Degenhardt JD, Lohmueller KE,
Zhao K, Brisbin A, Parker HG, Cargill M. A simple genetic architecture
underlies morphological variation in dogs. PLoS Biol. 2010;8(8):e1000451.
104. Nagai N, Hosokawa M, Itohara S, Adachi E, Matsushita T, Hosokawa N,
Nagata K. Embryonic lethality of molecular chaperone hsp47 knockout mice
is associated with defects in collagen biosynthesis. J Cell Biol. 2000;150(6):
1499506.
105. Wilson R, Norris EL, Brachvogel B, Angelucci C, Zivkovic S, Gordon L,
Bernardo BC, Stermann J, Sekiguchi K, Gorman JJ. Changes in the
chondrocyte and extracellular matrix proteome during post-natal mouse
cartilage development. Mol Cell Proteomics. 2012;11(1):M111. 014159.
Pendleton et al. BMC Biology (2018) 16:64
106. Agnvall B, Jöngren M, Strandberg E, Jensen P. Heritability and genetic
correlations of fear-related behaviour in red junglefowlpossible
implications for early domestication. PLoS One. 2012;7(4):e35162.
107. Lindberg J, Björnerfeldt S, Saetre P, Svartberg K, Seehuus B, Bakken M, Vilà C,
Jazin E. Selection for tameness has changed brain gene expression in silver
foxes. Curr Biol. 2005;15(22):R9156.
108. Trut LN, Plyusnina I, Oskina I. An experiment on fox domestication and
debatable issues of evolution of the dog. Russ J Genet. 2004;40(6):64455.
109. Hare B, Wobber V, Wrangham R. The self-domestication hypothesis:
evolution of bonobo psychology is due to selection against aggression.
Anim Behav. 2012;83(3):57385.
110. Morey DF, Jeger R. Paleolithic dogs: why sustained domestication then? J
Archaeol Sci Rep. 2015;3:4208.
111. Popova N, Voitenko N, Kulikov A, Avgustinovich D. Evidence for the
involvement of central serotonin in mechanism of domestication of silver
foxes. Pharmacol Biochem Behav. 1991;40(4):7516.
112. Popova N, Kulikov A, Avgustinovich D, Voĭtenko N, Trut L. Effect of
domestication of the silver fox on the main enzymes of serotonin
metabolism and serotonin receptors. Genetika. 1997;33(3):3704.
113. Kolesnikova L, Trut L, Beliaev D. Changes in the morphology of the
epiphysis of silver foxes during domestication. Zh Obshch Biol. 1988;
49(4):487.
114. Truong HT, Solaymani-Kohal S, Baker KR, Girirajan S, Williams SR, Vlangos CN,
Smith AC, Bunyan DJ, Roffey PE, Blanchard CL. Diagnosing SmithMagenis
syndrome and duplication 17p11. 2 syndrome by RAI1 gene copy number
variation using quantitative real-time PCR. Genet Test. 2008;12(1):6773.
115. Elsea SH, Girirajan S. SmithMagenis syndrome. Eur J Hum Genet. 2008;
16(4):412.
116. Jones W, Bellugi U, Lai Z, Chiles M, Reilly J, Lincoln A, Adolphs R. II.
Hypersociability in Williams syndrome. J Cogn Neurosci. 2000;
12(Supplement 1):3046.
117. Shuldiner E, Koch IJ, Kartzinel RY, Hogan A, Brubaker L, Wanser S, Stahler D,
Wynne CD, Ostrander EA, Sinsheimer JS. Structural variants in genes
associated with human Williams-Beuren syndrome underlie stereotypical
hypersociability in domestic dogs. Sci Adv. 2017;3(7):e1700398.
118. Adams MS, Gammill LS, Bronner-Fraser M. Discovery of transcription factors
and other candidate regulators of neural crest development. Dev Dyn. 2008;
237(4):102133.
119. Tahir R, Kennedy A, Elsea SH, Dickinson AJ. Retinoic acid induced-1 (Rai1)
regulates craniofacial and brain development in Xenopus. Mech Dev. 2014;
133:91104.
120. Barnett C, Yazgan O, Kuo H-C, Malakar S, Thomas T, Fitzgerald A, Harbour W,
Henry JJ, Krebs JE. Williams Syndrome Transcription Factor is critical for
neural crest cell function in Xenopus laevis. Mech Dev. 2012;129(9):32438.
121. Goldman S, Malow B, Newman K, Roof E, Dykens E. Sleep patterns and
daytime sleepiness in adolescents and young adults with Williams
syndrome. J Intellect Disabil Res. 2009;53(2):1828.
122. Sniecinska-Cooper AM, Iles RK, Butler SA, Jones H, Bayford R, Dimitriou D.
Abnormalsecretionofmelatoninandcortisolinrelationtosleep
disturbances in children with Williams syndrome. Sleep Med.
2015;16(1):94100.
123. Williams SR, Zies D, Mullegama SV, Grotewiel MS, Elsea SH. Smith-Magenis
syndrome results in disruption of CLOCK gene transcription and reveals an
integral role for RAI1 in the maintenance of circadian rhythmicity. Am J
Hum Genet. 2012;90(6):9419.
124. De Leersnyder H, de Blois M-C, Claustrat B, Romana S, Albrecht U, von
Kleist-Retzow J-C, Delobel B, Viot G, Lyonnet S, Vekemans M. Inversion of
the circadian rhythm of melatonin in the Smith-Magenis syndrome. J
Pediatr. 2001;139(1):1116.
125. Tian C, Liu D, Sun Q-L, Chen C, Xu Y, Wang H, Xiang W, Kretzschmar HA, Li
W, Chen C. Comparative analysis of gene expression profiles between
cortex and thalamus in Chinese fatal familial insomnia patients. Mol
Neurobiol. 2013;48(1):3648.
126. Nikonova EV, Gilliland JD, Tanis KQ, Podtelezhnikov AA, Rigby AM, Galante RJ,
Finney EM, Stone DJ, Renger JJ, Pack AI. Transcriptional profiling of cholinergic
neurons from basal forebrain identifies changes in expression of genes between
sleep and wake. Sleep. 2017;40(6):zsx059. https://doi.org/10.1093/sleep/zsx059.
127. Godinho SI, Maywood ES, Shaw L, Tucci V, Barnard AR, Busino L, Pagano M,
Kendall R, Quwailid MM, Romero MR. The after-hours mutant reveals a role
for Fbxl3 in determining mammalian circadian period. Science. 2007;
316(5826):897900.
128. Argente J, Flores R, Gutierrez-Arumi A, Verma B, Martos-Moreno GA, Cusco I,
Oghabian A, Chowen JA, Frilander MJ, Perez-Jurado LA. Defective minor
spliceosome mRNA processing results in isolated familial growth hormone
deficiency. EMBO Mol Med. 2014;6(3):299306.
129. An M, Henion PD. The zebrafish sf3b1b460 mutant reveals differential
requirements for the sf3b1 pre-mRNA processing gene during neural crest
development. Int J Dev Biol. 2012;56(4):223.
130. Edery P, Marcaillou C, Sahbatou M, Labalme A, Chastang J, Touraine R,
Tubacher E, Senni F, Bober MB, Nampoothiri S. Association of TALS
developmental disorder with defect in minor splicing component U4atac
snRNA. Science. 2011;332(6026):2403.
131. Serres-Armero A, Povolotskaya IS, Quilez J, Ramirez O, Santpere G, Kuderna
LF, Hernandez-Rodriguez J, Fernandez-Callejo M, Gomez-Sanchez D,
Freedman AH. Similar genomic proportions of copy number variation
within gray wolves and modern dog breeds inferred from whole genome
sequencing. BMC Genomics. 2017;18(1):977.
132. Gopalakrishnan S, Castruita JAS, Sinding M-HS, Kuderna LF, Räikkönen J,
Petersen B, Sicheritz-Ponten T, Larson G, Orlando L, Marques-Bonet T. The
wolf reference genome sequence (Canis lupus lupus) and its implications
for Canis spp. population genomics. BMC Genomics. 2017;18(1):495.
133. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A,
Garimella K, Altshuler D, Gabriel S, Daly M. The Genome Analysis Toolkit: a
MapReduce framework for analyzing next-generation DNA sequencing data.
Genome Res. 2010;20(9):1297303.
134. Auton A, Li YR, Kidd J, Oliveira K, Nadel J, Holloway JK, Hayward JJ, Cohen
PE, Greally JM, Wang J. Genetic recombination is targeted towards gene
promoter regions in dogs. PLoS Genet. 2013;9(12):e1003984.
135. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J,
Sklar P, De Bakker PI, Daly MJ. PLINK: a tool set for whole-genome association
and population-based linkage analyses. Am J Hum Genet. 2007;81(3):55975.
136. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS
Genet. 2006;2(12):e190.
137. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA,
Handsaker RE, Lunter G, Marth GT, Sherry ST. The variant call format and
VCFtools. Bioinformatics. 2011;27(15):21568.
138. Kelleher J, Etheridge AM, McVean G. Efficient coalescent simulation and
genealogical analysis for large sample sizes. PLoS Comput Biol. 2016;12(5):
e1004842.
139. Campbell CL, Bhérer C, Morrow BE, Boyko AR, Auton A. A pedigree-based
map of recombination in the domestic dog genome. G3 (Bethesda). 2016;
6(11):351724.
140. Bhatia G, Patterson N, Sankararaman S, Price AL. Estimating and interpreting
FST: the impact of rare variants. Genome Res. 2013;23(9):151421.
141. Kidd Lab - Selection Scan GitHub Repository. https://github.com/KiddLab/
Pendleton_2018_Selection_Scan/.
142. Pavlidis P, Noble WS. Matrix2png: a utility for visualizing matrix data.
Bioinformatics. 2003;19(2):2956.
143. Ensembl Canis familiaris database. ftp://ftp.ensembl.org/pub/release-81/.
Accessed 1 Mar 2016.
144. Ensembl Canis familiaris Index - Release 81. http://www.ensembl.org/Canis_
familiaris/Info/Index. Accessed 1 Mar 2016.
145. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing
genomic features. Bioinformatics. 2010;26(6):8412.
146. Sudmant PH, Mallick S, Nelson BJ, Hormozdiari F, Krumm N, Huddleston J,
Coe BP, Baker C, Nordenfelt S, Bamshad M. Global diversity, population
stratification, and selection of human copy-number variation. Science. 2015;
349(6253):aab3761.
147. Sudmant PH, Huddleston J, Catacchio CR, Malig M, Hillier LW, Baker C,
Mohajeri K, Kondova I, Bontrop RE, Persengiev S. Evolution and diversity of
copy number variation in the great ape lineage. Genome Res. 2013;23(9):
137382.
148. Sudmant PH, Kitzman JO, Antonacci F, Alkan C, Malig M, Tsalenko A,
Sampas N, Bruhn L, Shendure J, Eichler EE. Diversity of human copy number
variation and multicopy genes. Science. 2010;330(6004):6416.
149. Alkan C, Kidd JM, Marques-Bonet T, Aksay G, Antonacci F, Hormozdiari F,
Kitzman JO, Baker C, Malig M, Mutlu O. Personalized copy number and
segmental duplication maps using next-generation sequencing. Nat Genet.
2009;41(10):1061.
150. Hach F, Hormozdiari F, Alkan C, Hormozdiari F, Birol I, Eichler EE, Sahinalp
SC. mrsFAST: a cache-oblivious algorithm for short-read mapping. Nat
Methods. 2010;7(8):576.
Pendleton et al. BMC Biology (2018) 16:64 Page 19 of 21
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
151. fastCN. https://github.com/KiddLab/fastCN.
152. Alkan C, Coe BP, Eichler EE. Genome structural variation discovery and
genotyping. Nat Rev Genet. 2011;12(5):363.
153. Handsaker RE, Van Doren V, Berman JR, Genovese G, Kashin S, Boettger LM,
McCarroll SA. Large multiallelic copy number variations in humans. Nat
Genet. 2015;47(3):296.
154. Zhang Z, Wang W. RNA-Skim: a rapid method for RNA-Seq quantification at
transcript level. Bioinformatics. 2014;30(12):i28392.
155. Patro R, Mount SM, Kingsford C. Sailfish enables alignment-free isoform
quantification from RNA-seq reads using lightweight algorithms. Nat
Biotechnol. 2014;32(5):462.
156. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-
seq quantification. Nat Biotechnol. 2016;34(5):525.
157. QuicK-mer. https://github.com/KiddLab/QuicK-mer.
158. Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel
counting of occurrences of k-mers. Bioinformatics. 2011;27(6):76470.
159. Consortium GP. A map of human genome variation from population-scale
sequencing. Nature. 2010;467(7319):1061.
160. Consortium GP. An integrated map of genetic variation from 1,092 human
genomes. Nature. 2012;491(7422):56.
161. Consortium GP. A global reference for human genetic variation. Nature.
2015;526(7571):68.
162. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown
CG, Hall KP, Evers DJ, Barnes CL, Bignell HR. Accurate whole human
genome sequencing using reversible terminator chemistry. Nature. 2008;
456(7218):53.
163. Song S, Sliwerska E, Emery S, Kidd JM. Modeling human population
separation history using physically phased genomes. Genetics. 2016; https://
doi.org/10.1534/genetics.116.192963.
164. Hughes JF, Skaletsky H, Pyntikova T, Graves TA, van Daalen SK, Minx PJ,
Fulton RS, McGrath SD, Locke DP, Friedman C. Chimpanzee and human Y
chromosomes are remarkably divergent in structure and gene content.
Nature. 2010;463(7280):536.
165. Oetjens MT, Shen F, Emery SB, Zou Z, Kidd JM. Y-chromosome structural
diversity in the bonobo and chimpanzee lineages. Genome Biol Evol. 2016;
8(7):223140.
166. QuicK-mer Precomputed K-mers. http://kiddlabshare.umms.med.umich.edu/
public-data/QuicK-mer/Ref/.
167. Nicholas TJ, Baker C, Eichler EE, Akey JM. A high-resolution integrated map
of copy number polymorphisms within and between breeds of the modern
domesticated dog. BMC Genomics. 2011;12(1):414.
168. Nicholas TJ, Cheng Z, Ventura M, Mealey K, Eichler EE, Akey JM. The
genomic architecture of segmental duplications and associated copy
number variants in dogs. Genome Res. 2009;19(3):4919.
169. Chen W-K, Swartz JD, Rush LJ, Alvarez CE. Mapping DNA structural variation
in dogs. Genome Res. 2009;19(3):5009.
170. Ramirez O, Olalde I, Berglund J, Lorente-Galdos B, Hernandez-Rodriguez J,
Quilez J, Webster MT, Wayne RK, Lalueza-Fox C, Vilà C. Analysis of structural
diversity in wolf-like canids reveals post-domestication variants. BMC
Genomics. 2014;15(1):465.
171. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, Rozen SG.
Primer3new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115.
172. Bocciardi R, Giorda R, Marigo V, Zordan P, Montanaro D, Gimelli S, Seri M,
Lerone M, Ravazzolo R, Gimelli G. Molecular characterization of at (2; 6)
balanced translocation that is associated with a complex phenotype and
leads to truncation of the TCBA1 gene. Hum Mutat. 2005;26(5):42636.
173. Bartels CF, Bükülmez H, Padayatti P, Rhee DK, van Ravenswaaij-Arts C, Pauli RM,
Mundlos S, Chitayat D, Shih L-Y, Al-Gazali LI. Mutations in the transmembrane
natriuretic peptide receptor NPR-B impair skeletal growth and cause
acromesomelic dysplasia, type Maroteaux. Am J Hum Genet. 2004;75(1):2734.
174. Tsuji T, Kunieda T. A loss-of-function mutation in natriuretic peptide
receptor 2 (Npr2) gene is responsible for disproportionate dwarfism in cn/
cn mouse. J Biol Chem. 2005;280(14):1428892.
175. Zhang R, Cao L, Wang Y, Fang Y, Zhao L, Li W, Shi O-Y, Cai C-Q. A unique
methylation pattern co-segregates with neural tube defect statuses in Han
Chinese pedigrees. Neurol Sci. 2017;38(12):215364.
176. Lin Y-H, Zhen Y-Y, Chien K-Y, Lee I-C, Lin W-C, Chen M-Y, Pai L-M. LIMCH1
regulates nonmuscle myosin-II activity and suppresses cell migration. Mol
Biol Cell. 2017;28(8):105465.
177. Austin-Tse C, Halbritter J, Zariwala MA, Gilberti RM, Gee HY, Hellman N,
Pathak N, Liu Y, Panizzi JR, Patel-King RS. Zebrafish ciliopathy screen plus
human mutational analysis identifies C21orf59 and CCDC65 defects as
causing primary ciliary dyskinesia. Am J Hum Genet. 2013;93(4):67286.
178. Marques S, Borges AC, Silva AC, Freitas S, Cordenonsi M, Belo JA. The
activity of the Nodal antagonist Cerl-2 in the mouse node is required for
correct L/R body axis. Genes Dev. 2004;18(19):23427.
179. Wang S, Meyer H, Ochoa-Espinosa A, Buchwald U, Önel S, Altenhein B,
Heinisch JJ, Affolter M, Paululat A. GBF1 (Gartenzwerg)-dependent secretion
is required for Drosophila tubulogenesis. J Cell Sci. 2012;125(2):46172.
180. Mazaki Y, Nishimura Y, Sabe H. GBF1 bears a novel phosphatidylinositol-
phosphate binding module, BP3K, to link PI3Kγactivity with Arf1 activation
involved in GPCR-mediated neutrophil chemotaxis and superoxide
production. Mol Biol Cell. 2012;23(13):245767.
181. van Veen M, Mans LA, Matas-Rico E, van Pelt J, Perrakis A, Moolenaar WH,
Haramis A-PG. Glycerophosphodiesterase GDE2/GDPD5 affects pancreas
differentiation in zebrafish. Int J Biochem Cell Biol. 2018;94:718.
182. Rao M, Sockanathan S. Transmembrane protein GDE2 induces motor
neuron differentiation in vivo. Science. 2005;309(5744):22125.
183. Du L, Xu J, Li X, Ma N, Liu Y, Peng J, Osato M, Zhang W, Wen Z.
Rumba and Haus3 are essential factors for the maintenance of
hematopoietic stem/progenitor cells during zebrafish hematopoiesis.
Development. 2011;138(4):61929.
184. Peters H, Neubüser A, Kratochwil K, Balling R. Pax9-deficient mice lack
pharyngeal pouch derivatives and teeth and exhibit craniofacial and limb
abnormalities. Genes Dev. 1998;12(17):273547.
185. Ercan-Sencicek AG, Jambi S, Franjic D, Nishimura S, Li M, El-Fishawy P,
Morgan TM, Sanders SJ, Bilguvar K, Suri M. Homozygous loss of DIAPH1 is a
novel cause of microcephaly in humans. Eur J Hum Genet. 2015;23(2):165.
186. Bai SW, Herrera-Abreu MT, Rohn JL, Racine V, Tajadura V, Suryavanshi N,
Bechtel S, Wiemann S, Baum B, Ridley AJ. Identification and characterization
of a set of conserved and new regulators of cytoskeletal organization, cell
morphology and migration. BMC Biol. 2011;9(1):54.
187. Alkobtawi M, Ray H, Barriga EH, Moreno M, Kerney R, Monsoro-Burq A-H,
Saint-Jeannet J-P, Mayor R. Characterization of Pax3 and Sox10 transgenic
Xenopus laevis embryos as tools to study neural crest development. Dev
Biol. 2018. https://doi.org/10.1016/j.ydbio.2018.02.020.
188. Zhao C, Deng Y, Liu L, Yu K, Zhang L, Wang H, He X, Wang J, Lu C, Wu LN.
Dual regulatory switch through interactions of Tcf7l2/Tcf4 with stage-
specific partners propels oligodendroglial maturation. Nat Commun.
2016;7:10883.
189. Dornier E, Coumailleau F, Ottavi J-F, Moretti J, Boucheix C, Mauduit P,
Schweisguth F, Rubinstein E. TspanC8 tetraspanins regulate ADAM10/
Kuzbanian trafficking and promote Notch activation in flies and mammals.
J Cell Biol. 2012;199(3):48196.
190. Theveneau E, Mayor R. Neural crest delamination and migration: from
epithelium-to-mesenchyme transition to collective cell migration. Dev Biol.
2012;366(1):3454.
191. Solomon KS, Kudoh T, Dawid IB, Fritz A. Zebrafish foxi1 mediates otic
placode formation and jaw development. Development. 2003;130(5):92940.
192. Huang Y, Roelink H, McKnight GS. Protein kinase A deficiency causes axially
localized neural tube defects in mice. J Biol Chem. 2002;277(22):1988996.
193. Offermanns S, Zhao LP, Gohla A, Sarosi I, Simon MI, Wilkie TM. Embryonic
cardiomyocyte hypoplasia and craniofacial defects in Gαq. Gα11-mutant
mice. EMBO J. 1998;17(15):430412.
194. Kondo T, Matsuoka AJ, Shimomura A, Koehler KR, Chan RJ, Miller JM,
Srour EF, Hashino E. Wnt signaling promotes neuronal differentiation
from mesenchymal stem cells through activation of Tlx3. Stem Cells.
2011;29(5):83646.
195. Koestner U, Shnitsar I, Linnemannstöns K, Hufton AL, Borchers A.
Semaphorin and neuropilin expression during early morphogenesis of
Xenopus laevis. Dev Dyn. 2008;237(12):385363.
196. Pegtel DM, Ellenbroek SI, Mertens AE, van der Kammen RA, de Rooij J,
Collard JG. The Par-Tiam1 complex controls persistent migration by
stabilizing microtubule-dependent front-rear polarity. Curr Biol. 2007;17(19):
162334.
197. Wang JS, Infante CR, Park S, Menke DB. PITX1 promotes chondrogenesis
and myogenesis in mouse hindlimbs through conserved regulatory targets.
Dev Biol. 2017;434(1):186-95.
198. Wong RLY, Wlodarczyk BJ, Min KS, Scott ML, Kartiko S, Yu W, Merriweather
MY, Vogel P, Zambrowicz BP, Finnell RH. Mouse Fkbp8 activity is required to
inhibit cell death and establish dorso-ventral patterning in the posterior
neural tube. Hum Mol Genet. 2007;17(4):587601.
Pendleton et al. BMC Biology (2018) 16:64 Page 20 of 21
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
199. Fimia GM, Stoykova A, Romagnoli A, Giunta L, Di Bartolomeo S, Nardacci R,
Corazzari M, Fuoco C, Ucar A, Schwartz P. Ambra1 regulates autophagy and
development of the nervous system. Nature. 2007;447(7148):1121.
200. Tu C-F, Yan Y-T, Wu S-Y, Djoko B, Tsai M-T, Cheng C-J, Yang R-B. Domain
and functional analysis of a novel platelet-endothelial cell surface protein,
SCUBE1. J Biol Chem. 2008;283(18):1247888.
201. Williams AL, Eason J, Chawla B, Bohnsack BL. Cyp1b1 regulates ocular fissure
closure through a retinoic acidindependent pathway. Invest Ophthalmol
Vis Sci. 2017;58(2):108497.
Pendleton et al. BMC Biology (2018) 16:64 Page 21 of 21
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
1.
2.
3.
4.
5.
6.
... We applied fastCN to high-coverage Illumina data for 2,609 unrelated individuals from the 1KG to estimate NPIP copy number (36,93,94). Short-read shotgun sequences from each individual are split into 36 bp segments and aligned to a reference genome (up to two single-nucleotide mismatches) allowing copy number to be estimated (94). ...
... We applied fastCN to high-coverage Illumina data for 2,609 unrelated individuals from the 1KG to estimate NPIP copy number (36,93,94). Short-read shotgun sequences from each individual are split into 36 bp segments and aligned to a reference genome (up to two single-nucleotide mismatches) allowing copy number to be estimated (94). Windows overlapping the exon 8 VNTR were excluded from copy number estimation to avoid biasing the estimate. ...
Preprint
Full-text available
The NPIP (nuclear pore interacting protein) gene family has expanded to high copy number in humans and African apes where it has been subject to an excess of amino acid replacement consistent with positive selection (1). Due to the limitations of short-read sequencing, NPIP human genetic diversity has been poorly understood. Using highly accurate assemblies generated from long-read sequencing as part of the human pangenome, we completely characterize 169 human haplotypes (4,665 NPIP paralogs and alleles). Of the 28 NPIP paralogs, just three ( NPIPB2 , B11 , and B14 ) are fixed at a single copy, and only a single locus, B2 , shows no structural variation. Four NPIP paralogs map to large segmental duplication blocks that mediate polymorphic inversions (355 kbp-1.6 Mbp) corresponding to microdeletions associated with developmental delay and autism. Haplotype-based tests of positive selection and selective sweeps identify two paralogs, B9 and B15 , within the top percentile for both tests. Using full-length cDNA data from 101 tissue/cell types, we construct paralog-specific gene models and show that 56% (31/55 most abundant isoforms) have not been previously described in RefSeq. We define six distinct translation start sites and other protein structural features that distinguish paralogs, including a variable number tandem repeat that encodes a beta helix of variable size that emerged ~3.1 million years ago in human evolution. Among the 28 NPIP paralogs, we identify distinct tissue and developmental patterns of expression with only a few maintaining the ancestral testis-enriched expression. A subset of paralogs ( NPIPA1 , A5 , A6-9 , B3-5 , and B12/B13 ) show increased brain expression. Our results suggest ongoing positive selection in the human population and rapid diversification of NPIP gene models.
... There is a consensus that the key to dog domestication lies in changes in the rate of developmental phases, and due to genetic effects, dogs retain paedomorphic features (Geiger et al., 2017;Goodwin et al., 1997). It is hypothesised that the domestication syndrome results from mild neural crest cell deficits during embryonic development (Pendleton et al., 2018;Wilkins et al., 2014). ...
Article
Full-text available
While many societies worldwide are experiencing demographic transitions characterized by declining birth rates and shrinking kinship networks, the rise in pet ownership, particularly dog keeping, is most pronounced in Western and East Asian urbanized societies, where pets increasingly fulfill companionship roles. Dogs, one of the most often kept pets, are largely considered integral members of the human family. An increasing number of owners have even begun to regard their dogs as their children. This phenomenon can be explained by cultural evolutionary hypotheses, which suggest that due to changes in their environment, humans have culturally redirected their biological needs to nurture and care for children towards animals. Why are dogs good candidates for this child-like role in Western societies? The aim of this theoretical review is to describe the child-like morphological, behavioural and physiological features of pet dogs and explore the similarities and differences in dog and child parenting. We also examine the motivations behind “dog parenting” and conclude that “dog parents” constitute a heterogeneous group of people who attribute child-like roles to their dogs to various degrees and for various reasons. Both are highly dependent on socio-cultural contexts, among other factors. While some owners might see their dog as a child surrogate to spoil, others actively choose to have dogs and not children, bearing in mind that they have species-specific characteristics and needs. Dog parenting can also coexist with child parenting, enhancing the idea that humans might have evolved to care for others regardless of species.
... Under neutral evolution, 90% sequence identity allows us to identify SDs that occurred ~35-40 million years ago, and a length threshold >1 kb excludes the effective insertion length of most retrotransposons other than some full-length elements. We matched Illumina short-read sequencing data for all 170 haplotypes, which were used for additional read-depth support of the putative duplicated regions (fastCN) 75 . ...
Article
Full-text available
Segmental duplications (SDs) contribute significantly to human disease, evolution and diversity but have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies (from 85 samples representing 38 Africans and 47 non-Africans) in which the majority of autosomal SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms and sex chromosomes, we identify 173.2 Mb of duplicated sequence (47.4 Mb not present in the telomere-to-telomere reference) distinguishing fixed from structurally polymorphic events. We find that intrachromosomal SDs are among the most variable, with rare events mapping near their progenitor sequences. African genomes harbor significantly more intrachromosomal SDs and are more likely to have recently duplicated gene families with higher copy numbers than non-African samples. Comparison to a resource of 563 million full-length isoform sequencing reads identifies 201 novel, potentially protein-coding genes corresponding to these copy number polymorphic SDs.
... Dogs have lived in and around human settlements for well over 12,000 years (3)(4)(5). To survive as scavengers of human food and waste, they evolved foraging, hazard avoidance, and reproductive behaviors useful for their niche as scavengers of human waste (6)(7)(8)(9)(10) and lost hunting behaviors that in wolves are essential for survival (2). Domestication-related behaviors are those that are common to all dogs and differentiate them from wolves. ...
Article
Dogs have played an outsized role in the field of behavioral genetics since its earliest days. Their unique evolutionary history and ubiquity in the modern world make them a potentially powerful model system for discovering how genetic changes lead to changes in behavior. Genomic technology has supercharged this potential by enabling scientists to sequence the DNA of thousands of dogs and test for correlations with behavioral traits. However, fractures in the early history of animal behavior between biological and psychological subfields may be impeding progress. In addition, canine behavioral genetics has included almost exclusively dogs from modern breeds, who represent just a small fraction of all dog diversity. By expanding the scope of dog behavior studies, and incorporating an evolutionary perspective on canine behavioral genetics, we can move beyond associations to understanding the complex interactions between genes and environment that lead to dog behavior.
Preprint
Climate and land use change have increased human-wildlife interactions, potentially reducing wild species density and prompting behavioural adaptations to urbanised environments. It is still debated whether behavioural responses are mainly the result of phenotypic plasticity or whether they were driven by anthropic selective pressures, especially in small populations. Our study focused on the Apennine brown bear population (Ursus arctos marsicanus), which has coexisted with humans in Central Italy for millennia. We characterised genomic diversity and identified adaptation signals distinctive to this population by comparing whole genome resequencing data across the Holarctic species range. We show that Apennine brown bears possess a unique genomic diversity pattern including selective signatures at genes associated with reduced aggressiveness, possibly involving an alternative splicing mechanism. Our findings suggest that even in small and long-isolated populations, selection may shape behavioural traits. We hypothesise that human-induced selection has influenced these changes, reducing conflicts and contributing to the long-term persistence of the Apennine bear and its coexistence with humans.
Preprint
NOTCH2NL (NOTCH2-N-terminus-like) genes arose from incomplete, recent chromosome 1 segmental duplications implicated in human brain cortical expansion. Genetic characterization of these loci and their regulation is complicated by the fact they are embedded in large, nearly identical duplications that predispose to recurrent microdeletion syndromes. Using nearly complete long-read assemblies generated from 67 human and 12 ape haploid genomes, we show independent recurrent duplication among apes with functional copies emerging in humans ~2.1 million years ago. We distinguish NOTCH2NL paralogs present in every human haplotype (NOTCH2NLA) from copy number variable ones. We also characterize large-scale structural variation, including gene conversion, for 28% of haplotypes leading to a previously undescribed paralog, NOTCH2tv. Finally, we apply Fiber-seq and long-read transcript sequencing to human cortical neurospheres to characterize the regulatory landscape and find that the most fixed paralogs, NOTCH2 and NOTCH2NLA, harbor the greatest number of paralog-specific elements potentially driving their regulation.
Article
The house mouse X and Y chromosomes have recently acquired multicopy, rapidly evolving gene families representing an evolutionary arms race. This arms race between proteins encoded by X-linked Slxl1/Slx and Y-linked Sly gene families can distort offspring sex ratio, but how these proteins compete remains unknown. Here, we report how Slxl1/Slx and Sly encoded proteins compete in a protein family–specific and dose-dependent manner using yeast. Specifically, SLXL1 competes with SLY1 and SLY2 for binding to the Spindlin SPIN1. Similarly, SLX competes with SLY2 for binding the Spindlin SSTY2. These competitions are driven by the N termini of SLXL1, SLX, SLY1, and SLY2 binding to the third Tudor domains of SPIN1 and SSTY2. SLY1 and SLY2 form homo- and heterodimers, suggesting that the competition is between complex multimers. Residues under positive selection mapping to the interaction domains and rapid exon gain/loss are consistent with competition between the X- and Y-linked gene families. Our findings support a model in which dose-dependent competition of these X- and Y-linked encoded proteins to bind Spindlins occurs in haploid X- and Y-spermatids to influence X- versus Y-sperm fitness and thus sex ratio.
Article
How populations adapt to their environment is a fundamental question in biology. Yet, we know surprisingly little about this process, especially for endangered species, such as nonhuman great apes. Chimpanzees, our closest living relatives, are particularly notable because they inhabit diverse habitats, from rainforest to woodland-savannah. Whether genetic adaptation facilitates such habitat diversity remains unknown, despite it having wide implications for evolutionary biology and conservation. By using newly sequenced exomes from 828 wild chimpanzees (388 postfiltering), we found evidence of fine-scale genetic adaptation to habitat, with signatures of positive selection in forest chimpanzees in the same genes underlying adaptation to malaria in humans. This work demonstrates the power of noninvasive samples to reveal genetic adaptations in endangered populations and highlights the importance of adaptive genetic diversity for chimpanzees.
Article
Largemouth bass (Micropterus salmoides, LMB) is an important aquaculture species due to its excellent flesh quality and environmental adaptability. It has been continuously introduced to many countries and cultured for decades. Here, an LMB population was used for selective breeding to improve growth rate and feed adaptability. After five generations of breeding, the growth rate improved by 38%, and feed adaptability improved by 22% compared to the non-breeding population. To study the underlying genetic mechanism, 100 LMB from the breeding population and 100 from the non-breeding population were sampled for whole-genome resequencing. The population genetics analysis shows that the breeding population has a higher inbreeding coefficient and linkage disequilibrium (LD) level, a lower nucleic acid diversity and effective population size (Ne). Using $F_{ST}$ (fixation index), we found that the average $F_{ST}$ value between the two populations was 0.07, with the highest $F_{ST}$ value reaching 0.38, which overlaps with the trypsin gene. Additionally, other genes exhibiting high $F_{ST}$ values are associated with functions such as neural development, glucose metabolism, and growth. Using $F_{ST}$ and nucleic acid diversity as criteria, we identified 698 genes that are positively selected in the breeding population, and gene functional enrichment analysis shows that 36 genes are related to the olfactory receptor pathway. Overall, our study found that multiple genes were selected in the LMB breeding population. These genes may be associated with adaptation and digestion of artificial feed in fish.
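For readers unfamiliar with the fixation index, the sketch below shows one common way to compute it from per-SNP allele frequencies in two populations. The abstract does not say which estimator was used, so Hudson's estimator (combined across SNPs as a ratio of sums) is an assumption here, and the function name `hudson_fst` and the toy frequencies are purely illustrative.

```python
from typing import Sequence


def hudson_fst(p1: Sequence[float], p2: Sequence[float], n1: int, n2: int) -> float:
    """Hudson's F_ST estimator, combined over SNPs as a ratio of sums.

    p1, p2: sample allele frequencies per SNP in populations 1 and 2.
    n1, n2: number of sampled chromosomes per population
            (e.g. 200 for 100 diploid individuals).
    Assumption: this estimator is used for illustration only; the
    study summarized above does not name its estimator.
    """
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must have the same length")
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for sampling noise.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0:
        raise ValueError("no between-population variation; F_ST is undefined")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical frequencies for three SNPs in two populations.
    breeding = [0.10, 0.55, 0.80]
    control = [0.15, 0.50, 0.40]
    print(hudson_fst(breeding, control, n1=200, n2=200))
```

Values near 0 indicate little allele-frequency differentiation between the two populations and values near 1 indicate nearly fixed differences; with this estimator, slightly negative values can occur when the populations are effectively identical.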
Preprint
The house mouse X and Y chromosomes have recently acquired high copy number, rapidly evolving gene families representing an evolutionary arms race. This arms race between proteins encoded by X-linked Slxl1/Slx and Y-linked Sly gene families can distort male offspring sex ratio, but how these proteins compete remains unknown. Here, we report how Slxl1/Slx and Sly encoded proteins compete in a protein family-specific and dose-dependent manner using yeast. Specifically, SLXL1 competes with SLY1 and SLY2 for binding to the Spindlin SPIN1. Similarly, SLX competes with SLY2 for binding the Spindlin SSTY2. These competitions are driven by the N-termini of SLXL1, SLX, SLY1, and SLY2 binding to the third Tudor domains of SPIN1 and SSTY2. SLY1 and SLY2 form homo- and heterodimers, suggesting the competition is between complex multimers. Residues under positive selection mapping to the interaction domains and rapid exon gain/loss are consistent with competition between the X- and Y-linked gene families. Our findings support a model in which dose-dependent competition of these X- and Y-linked encoded proteins to bind Spindlins occurs in haploid X- and Y-spermatids to influence X- versus Y-sperm fitness and thus sex ratio. Significance Statement: In house mouse, an evolutionary arms race between proteins encoded by the X-linked Slxl1/Slx and Y-linked Sly gene families during spermatogenesis can distort male offspring sex ratio, but how these proteins compete remains unknown. We report how SLXL1/SLX competes with SLY1/SLY2 by demonstrating their dose-dependent competitive binding to Spindlins, the key protein domains and rapidly evolving residues and exons that drive the competition, and how the competition is likely between complex multimers. Our findings have broad implications for the mechanics of evolutionary arms races and how competition between sex chromosomes influences X- versus Y-sperm fitness and sex ratio.
Article
Background: Whole genome re-sequencing data from dogs and wolves are now commonly used to study how natural and artificial selection have shaped the patterns of genetic diversity. Single nucleotide polymorphisms, microsatellites and variants in mitochondrial DNA have been interrogated for links to specific phenotypes or signals of domestication. However, copy number variation (CNV), despite its increasingly recognized importance as a contributor to phenotypic diversity, has not been extensively explored in canids. Results: Here, we develop a new accurate probabilistic framework to create fine-scale genomic maps of segmental duplications (SDs), compare patterns of CNV across groups and investigate their role in the evolution of the domestic dog by using information from 34 canine genomes. Our analyses show that duplicated regions are enriched in genes and hence likely possess functional importance. We identify 86 loci with large CNV differences between dogs and wolves, enriched in genes responsible for sensory perception, immune response, metabolic processes, etc. In striking contrast to the observed loss of nucleotide diversity in domestic dogs following the population bottlenecks that occurred during domestication and breed creation, we find a similar proportion of CNV loci in dogs and wolves, suggesting that other dynamics are acting to particularly select for CNVs with potentially functional impacts. Conclusions: This work is the first comparison of genome-wide CNV patterns in domestic and wild canids using whole-genome sequencing data, and our findings contribute to the study of the impact of novel kinds of genetic changes on the evolution of the domestic dog.
Article
Notch signaling plays an essential role in the proliferation, differentiation and cell fate determination of various tissues, including the developing pancreas. One regulator of the Notch pathway is GDE2 (or GDPD5), a transmembrane ecto-phosphodiesterase that cleaves GPI-anchored proteins at the plasma membrane, including a Notch ligand regulator. Here we report that Gdpd5-knockdown in zebrafish embryos leads to developmental defects, particularly, impaired motility and reduced pancreas differentiation, as shown by decreased expression of insulin and other pancreatic markers. Exogenous expression of human GDE2, but not catalytically dead GDE2, similarly leads to developmental defects. Human GDE2 restores insulin expression in Gdpd5a-depleted zebrafish embryos. Importantly, zebrafish Gdpd5 orthologues localize to the plasma membrane where they show catalytic activity against GPI-anchored GPC6. Thus, our data reveal functional conservation between zebrafish Gdpd5 and human GDE2, and suggest that strict regulation of GDE2 expression and catalytic activity is critical for correct embryonic patterning. In particular, our data uncover a role for GDE2 in regulating pancreas differentiation.
Article
Domesticated mammals of many different species share a set of physical and physiological traits that are not displayed by any of their wild progenitors. This suite of traits, now termed the "domestication syndrome" (DS), has been a puzzle since Charles Darwin discovered it. Two general explanations of its basis have been proposed, which in principle, could also apply to other vertebrates, such as fish and birds, whose domesticated varieties show some of its elements. The two ideas are termed here, respectively, the thyroid hormone hypothesis or the THH, and the neural crest cell hypothesis, the NCCH. The two ideas make distinctly different genetic predictions. Here, the current relevant evidence from genomics is evaluated and it is concluded that the NCCH has more support. Nevertheless, one set of observations, from chickens, suggest a potentially important role of altered thyroid metabolism in domestication. In addition, recent studies indicate the possibility of additional genetic factors in domestication, affecting tameness and sociality, that may go beyond either hypothesis. The tasks that lie ahead to fully ascertain the genetic bases of the "domestication syndrome" and the behaviors that characterize mammalian domestication are discussed briefly.
Article
Although considerable progress has been made in understanding the genetic basis of morphologic traits (for example, body size and coat color) in dogs and wolves, the genetic basis of their behavioral divergence is poorly understood. An integrative approach using both behavioral and genetic data is required to understand the molecular underpinnings of the various behavioral characteristics associated with domestication. We analyze a 5-Mb genomic region on chromosome 6 previously found to be under positive selection in domestic dog breeds. Deletion of this region in humans is linked to Williams-Beuren syndrome (WBS), a multisystem congenital disorder characterized by hypersocial behavior. We associate quantitative data on behavioral phenotypes symptomatic of WBS in humans with structural changes in the WBS locus in dogs. We find that hypersociability, a central feature of WBS, is also a core element of domestication that distinguishes dogs from wolves. We provide evidence that structural variants in GTF2I and GTF2IRD1, genes previously implicated in the behavioral phenotype of patients with WBS and contained within the WBS locus, contribute to extreme sociability in dogs. This finding suggests that there are commonalities in the genetic architecture of WBS and canine tameness and that directional selection may have targeted a unique set of linked behavioral genes of large phenotypic effect, allowing for rapid behavioral divergence of dogs and wolves, facilitating coexistence with humans.
Article
Natural selection that affected modern humans early in their evolution has likely shaped some of the traits that set present-day humans apart from their closest extinct and living relatives. The ability to detect ancient natural selection in the human genome could provide insights into the molecular basis for these human-specific traits. Here, we introduce a method for detecting ancient selective sweeps by scanning for extended genomic regions where our closest extinct relatives, Neandertals and Denisovans, fall outside of the present-day human variation. Regions that are unusually long indicate the presence of lineages that reached fixation in the human population faster than expected under neutral evolution. Using simulations we show that the method is able to detect ancient events of positive selection and that it can differentiate those from background selection. Applying our method to the 1000 Genomes dataset, we find evidence for ancient selective sweeps favoring regulatory changes and present a list of genomic regions that are predicted to underlie positively selected human specific traits.
Article
Europe has played a major role in dog evolution, harbouring the oldest uncontested Palaeolithic remains and having been the centre of modern dog breed creation. Here we sequence the genomes of an Early and End Neolithic dog from Germany, including a sample associated with an early European farming community. Both dogs demonstrate continuity with each other and predominantly share ancestry with modern European dogs, contradicting a previously suggested Late Neolithic population replacement. We find no genetic evidence to support the recent hypothesis proposing dual origins of dog domestication. By calibrating the mutation rate using our oldest dog, we narrow the timing of dog domestication to 20,000–40,000 years ago. Interestingly, we do not observe the extreme copy number expansion of the AMY2B gene characteristic of modern dogs that has previously been proposed as an adaptation to a starch-rich diet driven by the widespread adoption of agriculture in the Neolithic.
Article
The neural crest is a multipotent population of cells that originates a variety of cell types. Many animal models are used to study neural crest induction, migration and differentiation, with amphibians and birds being the most widely used systems. A major technological advance to study neural crest development in mouse, chick and zebrafish has been the generation of transgenic animals in which neural crest specific enhancers/promoters drive the expression of either fluorescent proteins for use as lineage tracers, or modified genes for use in functional studies. Unfortunately, no such transgenic animals currently exist for the amphibians Xenopus laevis and tropicalis, key model systems for studying neural crest development. Here we describe the generation and characterization of two transgenic Xenopus laevis lines, Pax3-GFP and Sox10-GFP, in which GFP is expressed in the pre-migratory and migratory neural crest, respectively. We show that Pax3-GFP could be a powerful tool to study neural crest induction, whereas Sox10-GFP could be used in the study of neural crest migration in living embryos.
Article
The PITX1 transcription factor is expressed during hindlimb development, where it plays a critical role in directing hindlimb growth and the specification of hindlimb morphology. While it is known that PITX1 regulates hindlimb formation, in part, through activation of the Tbx4 gene, other transcriptional targets remain to be elucidated. We have used a combination of ChIP-seq and RNA-seq to investigate enhancer regions and target genes that are directly regulated by PITX1 in embryonic mouse hindlimbs. In addition, we have analyzed PITX1 binding sites in hindlimbs of Anolis lizards to identify ancient PITX1 regulatory targets. We find that PITX1-bound regions in both mouse and Anolis hindlimbs are strongly associated with genes implicated in limb and skeletal system development. Gene expression analyses reveal a large number of misexpressed genes in the hindlimbs of Pitx1-/- mouse embryos. By intersecting misexpressed genes with genes that have neighboring mouse PITX1 binding sites, we identified 440 candidate targets of PITX1. Of these candidates, 68 exhibit ultra-conserved PITX1 binding events that are shared between mouse and Anolis hindlimbs. Among the ancient targets of PITX1 are important regulators of cartilage and skeletal muscle development, including Sox9 and Six1. Our data suggest that PITX1 promotes chondrogenesis and myogenesis in the hindlimb by direct regulation of several key members of the cartilage and muscle transcriptional networks.
Article
Neural tube defects (NTDs) are a complex trait associated with gene–environment interactions. Folic acid deficiency and planar cell polarity gene mutations account for some NTD cases; however, the etiology of NTDs is still little understood. In this study, in three Han Chinese NTD pedigrees (two with multiple affected children), with no information on folic acid deficiency or supplement, we examined genome-wide methylation profiles of each individual in these families. We further compared methylation status among cases and normal individuals within the pedigrees. A unique methylation pattern co-segregated with affected status: NTD cases had more hypermethylated than hypomethylated CpG islands; genes with different methylations clustered in pathways associated with epithelial-to-mesenchymal transition (ZEB2, SMAD6, and CDH23), folic acid/homocysteine metabolism (MTHFD1L), transcription/nuclear factors (HDAC4, HOXB7, SOX18), cell migration/motility/adhesion, insulin and cell growth, and neuron/axon development. Although the genetics of NTD are likely complex, epigenetic changes may concentrate in certain key pathways.
Article
The current RIP-seq approach has been developed for the identification of genome-wide interaction between RNA binding protein (RBP) and the bound RNA transcripts, but still rarely for identifying its binding sites. In this study, we performed RIP-seq experiments in HeLa cells using a monoclonal antibody against CELF1. Mapping of the RIP-seq reads showed a biased distribution at the 3′UTR and intronic regions. A total of 15,285 and 1384 CELF1-specific sense and antisense peaks were identified using the ABLIRC software tool. Our bioinformatics analyses revealed that 5′ and 3′ splice site motifs and GU-rich motifs were highly enriched in the CELF1-bound peaks. Furthermore, transcriptome analyses revealed that alternative splicing was globally regulated by CELF1 in HeLa cells. For example, the inclusion of exon 16 of LMO7 gene, a marker gene of breast cancer, is positively regulated by CELF1. Taken together, we have shown that RIP-seq data can be used to decipher RBP binding sites and reveal an unexpected landscape of the genome-wide CELF1-RNA interactions in HeLa cells. In addition, we found that CELF1 globally regulates the alternative splicing by binding the exon-intron boundary in HeLa cells, which will deepen our understanding of the regulatory roles of CELF1 in the pre-mRNA splicing process.