
The case of an arctic wild ass highlights the utility of ancient DNA for validating problematic identifications in museum collections



Museum collections are essential for reconstructing and understanding past biodiversity. Many museum specimens are, however, challenging to identify. Museum samples may be incomplete, have an unusual morphology, or represent juvenile individuals, all of which complicate accurate identification. In some cases, inaccurate identification can lead to false biogeographic reconstructions with cascading impacts on paleontological and paleoecological research. Here we analyze an unusual Equid mandible found in the Far North of the Taymyr peninsula that was identified morphologically as Equus hemionus, an ancestor of present-day Asiatic wild asses. If correct, this identification represents the only finding of a putative Late Pleistocene hemione in the Arctic region, and is therefore critical to understanding wild ass evolution and paleoecology. To confirm the accuracy of this specimen's taxonomic assignment, we used ancient DNA and mitochondrial hybridization capture to identify and place this specimen in the larger equid phylogeny. We find that the specimen is actually a member of E. caballus, the ancestor of domestic horses. Our study demonstrates the utility of ancient DNA to validate morphological identification, in particular of incomplete, otherwise problematic, or taxonomically unusual museum specimens.
Mol Ecol Resour. 2020;00:1–9.
© 2019 John Wiley & Sons Ltd
Natural history museums are biomolecular biobanks. Museum col-
lections maintain fossil and biomolecular archives of the evolution-
ary history of life, including species that are rare, threatened with
extinction, or already extinct (Johnson et al., 2011). Museum col-
lections can be used, for example, to reconstruct evolutionary rela-
tionships and geographic distribution of species (Foote et al., 2013;
White, Mitchell, & Austin, 2018). Historically, specimens archived in
museums have been identified based on their morphological char-
acters. Recently, however, advances in biomolecular techniques
including ancient DNA (aDNA), RNA, and protein sequencing have
provided other sources of information that can be recovered from
museum specimens (Cappellini et al., 2019; Keller et al., 2017; Meyer
et al., 2016). These approaches have expanded the potential utility
of museum collections by making it possible to provide taxonomic
assignments to fragmentary and otherwise difficult to identify spec-
imens (Brown et al., 2016). In some instances, ancient DNA data have
contradicted morphological identifications, leading to both correction
of accidental misidentifications and discovery of unknown taxonomic
groups (Heintzman et al., 2017; Krause et al., 2010).
Among the most widely studied taxonomic groups using museum-preserved
remains is the family Equidae. Equids, whose living members
include horses, zebras, donkeys, and asses, were
among the first taxonomic groups for which evolutionary history
was inferred through examination of archived fossils and bones
Received: 18 September 2019
Revised: 25 November 2019
Accepted: 10 December 2019
DOI: 10.1111/1755-0998.13130
Alisa O. Vershinina1 | Joshua D. Kapp1 | Gennady F. Baryshnikov2 | Beth Shapiro1,3
1Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
2Laboratory of Theriology, Zoological Institute of the Russian Academy of Sciences, St. Petersburg, Russia
3Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA, USA
Correspondence
Alisa O. Vershinina, Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA.
Funding information
Russian Academy of Sciences Presidium and the Russian Ministry of Education and
Science, Grant/Award Number: "Evolution of the organic world"; National Science
Foundation, Grant/Award Number: 1417036; Institute of Museum and Library
Services, Grant/Award Number: MG-30-17-0045-17
ancient DNA, Asiatic wild ass, Equus caballus, Equus ferus, Equus hemionus
(MacFadden, 1992). Thanks in part to their abundance and history
of living in colder climates, horse evolution has also been a common
theme in paleogenomics research. In 1984, the first ancient DNA sequences
were recovered from the skin of a museum-preserved subspecies
of zebra, the quagga (Higuchi, Bowman, Freiberger, Ryder,
& Wilson, 1984), and the oldest genome yet sequenced is from an
early Middle Pleistocene horse from Canada's Yukon (Orlando et al.,
2013). The abundance and diversity of equid fossils has also made
their taxonomy contentious, with genetic and morphological anal-
yses often leading to taxonomic reassignments and redesignations
(Barron-Ortiz et al., 2019). For example, Przewalski's horse, E. ferus
przewalskii, was considered for many decades to be the only remain-
ing truly wild horse (Der Sarkissian et al., 2015), but has recently
been linked to the lineage of Botai horses that were tamed and
herded five thousand years ago (Gaunitz et al., 2018). In another example,
DNA from North American horse remains dating to the Late
Pleistocene was recently used to name a new genus, Haringtonhippus
francisci, the New World stilt-legged horse (Heintzman et al., 2017).
While this lineage was known, its phylogenetic placement within
the equids was uncertain, with different authors assigning it to at
least five different species (Eisenmann, Howe, & Pichardo, 2008;
Heintzman et al., 2017; Weinstock et al., 2005).
Living horses can be broadly subdivided into two major lineages
that diverged 4–4.5 million years ago (Orlando et al., 2013): caballine
horses, which include domestic horses (most often referred to as
either E. ferus caballus or E. caballus) and Przewalski's horses (E. f. przewalskii
or E. c. przewalskii); and noncaballine horses, which include
hemiones (Asiatic wild asses E. hemionus and kiangs E. kiang), zebras
(E. zebra), and donkeys (E. asinus). While caballine horses have been
well characterized using ancient DNA (Fages et al., 2019; Gaunitz
et al., 2018; Heintzman et al., 2017; Orlando et al., 2013), less is
known about the geographic distribution and evolutionary relation-
ships among extinct and extant hemiones. Morphologically, hemi-
ones have both caballine horse features, such as gracile and slender
bodies, and features similar to donkeys, such as a relatively small
body size and a large head. Today, kiangs are found across the Tibetan
Plateau (St-Louis & Côté, 2009) and Asiatic wild asses are found in
the deserts and arid steppes of southern Mongolia, Kazakhstan,
Iran, and India (Kaczensky, Lkhagvasuren, Pereladova, Hemami, &
Bouskila, 2015). While fossils of kiang-like wild asses are known from
Pleistocene deposits in Alaska (Harington, 1980), the Pleistocene
range of kiangs remains unknown. Wild asses during the Pleistocene
spanned from present-day France, where they were known as the
European wild ass (E. hydruntinus), to China (Figure 1a) in regions south
of 50°N (Bennett et al., 2017).
During the 1980s, a complete mandible, sample ZIN-35608 (the
Zoological Institute of St. Petersburg, Russia), was discovered on the
Begichev Islands in the Laptev sea to the east of the Taymyr penin-
sula, Russia (Figure 1b). The mandible, which was identified as be-
longing to an equid, was small, with both the length of the mandible
bone and the lower tooth row 5 cm smaller than what is typical of
extinct caballine horses. Although the other measurements of the
mandible were inconclusive, it had a curved lower jaw ridge and a
V-shaped linguaflexid of the lower teeth, both of which are char-
acteristics of hemionid horses (Cucchi et al., 2017). Based on these
morphological characters, the mandible was assigned to E. hemionus,
or Asiatic wild ass (Kuzmina, 1997). The discovery of a wild ass in
the Taymyr peninsula was paleontologically significant, as it expands
the range of wild asses far to the north and adds them to the list of
arctic Siberian fauna that lived contemporaneously with mammoths
(Figure 1a) (Markova, Smirnov, Kozharinov, Kazantseva, & Kitaev,
There are several reasons to suspect that the taxonomic identifi-
cation of sample ZIN-35608 as a wild ass may have been inaccurate.
First, the location where the specimen was recovered is far outside
the known range of Late Pleistocene hemiones. Although wild asses
were widespread during the Late Pleistocene, no other wild ass sam-
ples are known from northerly regions of Eurasia. To date, the only
equid known from the Pleistocene of the Siberian Far North is the caballine
horse; assigning the specimen in question to caballines is thus
an alternative hypothesis. Second, the identification of equid fossils
is challenging, given their extensive morphological variation both
within and between taxonomic groups (Bennett, 1980; Forsten,
1998; Geigl & Grange, 2012; Orlando et al., 2009; Twiss et al., 2017).
Taxonomic identification is particularly complicated if the specimen
is a subadult, as suggested for ZIN-35608 based on its small size, as
diagnostic features may not yet have formed.
To confirm the identification of a Pleistocene wild ass in the
Russian Far North, we extracted and captured ancient mitochondrial
DNA and placed sample ZIN-35608 in a phylogenetic tree of the
Equidae. Our analyses revealed the specimen not to be a wild ass,
but instead a caballine horse, the ancestor of the domestic horse.
The unusually small size of the animal, combined with the standard
challenges of morphological identification in equids, probably led to
its misinterpretation as a member of the smaller species, E. hemionus.
Our results underscore the utility of ancient DNA as a paleontolog-
ical tool, in particular when specimens are challenging to identify
morphologically, and highlight the important role that museum
collections play in understanding evolutionary and biogeographic
ZIN-35608 is a partial mandible found on the Begichev Islands,
Russia, during the 1980s and currently held in the collection of the
Zoological Institute of St. Petersburg (Figure 1b). To estimate the age
of the individual at time of death, we followed the protocol outlined
in Hillson (2005). We then collected ~ 1 g of bone surrounding the
M2 tooth socket of the ZIN-35608 mandible, which we subdivided
for radiocarbon dating at the Keck Radiocarbon facility at UC Irvine
and ancient DNA analysis in the purpose-built, sterile, ancient DNA
facility at the University of California Santa Cruz Paleogenomics
Laboratory. We performed DNA extraction and subsequent pro-
cessing following standard protocols for working with degraded
DNA (Fulton & Shapiro, 2019). Briefly, the sterile laboratory is
located in a building isolated from other molecular research laboratories.
Laboratory personnel are instructed not to enter any areas
of campus that have a risk of PCR contamination prior to entering
the aDNA facility. Once inside, they wear coverall suits with hoods,
face masks, and a double layer of gloves. All equipment and surfaces
are washed with bleach and ethanol.
We extracted DNA following the Dabney et al. (2013) protocol
with modifications for the recovery of short molecules
(Campos et al., 2012), and a sodium hypochlorite pretreatment
(Boessenkool et al., 2017) to reduce the amount of contaminat-
ing DNA potentially adhered to the surface of the bone. Following
extraction, we prepared the extract into Illumina DNA sequenc-
ing libraries following Meyer and Kircher (2010), using Sera-Mag
SPRI SpeedBeads (ThermoScientific) in 18% PEG-8000 between
each step for library clean-up. To enrich libraries for mitochondrial
DNA, we performed in-solution hybridization capture using
MyBaits Mito E. caballus RNA bait set (Arbor Biosciences, Ann
Arbor, MI), following the manufacturer's protocol version 3.01. We
incubated the hybridization reactions for 36 hr at 65°C, and then
isolated DNA from the probes using Dynabeads magnetic streptavidin-coated
beads. We amplified captured libraries with KAPA
HiFi 2X master mix using IS5 and IS6 primers, and purified the
enriched libraries with SPRI beads as above. We then pooled and
sequenced pre- and post-capture libraries on two Illumina MiSeq
runs (v3 chemistry, 75 bp paired end reads).
We used in-house scripts to process the recovered data, map
reads to the reference genomes, and assemble a mitochondrial ge-
nome (https :// genom ics/DNA-Post-Proce ssing/
blob/maste r/mito_assem bly_pipel We removed adapters
and merged reads with a minimum overlap of 15 bp and minimum
length of 27 bp using SeqPrep ( hn/SeqPrep).
After verifying the presence of ancient DNA damage at the end
of the reads using MapDamage2 (Jónsson, Ginolhac, Schubert,
Johnson, & Orlando, 2013), we mapped merged reads to previously
published genomes of E. caballus (EquCab3, GenBank accession
GCF_002863925, mitochondrial NC_001640), E. asinus (nuclear
GCA_003033725), and E. hemionus (mitochondrial NC_016061). For
alignments to nuclear genomes, we used bwa aln with seed disabled
(Li & Durbin, 2009). To assemble the mitochondrial genome, we used
mia (https :// a/mappi ng-itera tive-assem bler), call-
ing bases with at least 10X independent read coverage and > 90%
consensus among reads, so as to avoid false nucleotide calls due to
ancient DNA damage. We imported the BAM file of the mitochon-
drial alignment into Geneious R11 (Biomatters, NZ) to create a con-
sensus nucleotide sequence, which we deposited in GenBank as
accession number MN503280.
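The base-calling rule described above (at least 10X coverage and more than 90% agreement among reads) can be illustrated with a minimal Python sketch. This is not the mia implementation. The function names call_base and call_consensus, the use of "N" for unsupported sites, and the idea of passing each alignment column as a string of observed bases are assumptions made here for illustration only.

```python
from collections import Counter

MIN_DEPTH = 10          # at least 10 independent reads must cover the site
MIN_AGREEMENT = 0.90    # strictly more than 90% of those reads must agree
VALID_BASES = set("ACGT")


def call_base(column, min_depth=MIN_DEPTH, min_agreement=MIN_AGREEMENT):
    """Return the consensus base for one alignment column, or 'N' if unsupported.

    `column` is a string (or list) of the bases observed at one reference
    position, one character per read. Gaps and ambiguity codes are ignored.
    """
    bases = [b.upper() for b in column if b.upper() in VALID_BASES]
    if len(bases) < min_depth:
        return "N"
    base, count = Counter(bases).most_common(1)[0]
    # A site is called only if the majority base exceeds the agreement threshold.
    return base if count / len(bases) > min_agreement else "N"


def call_consensus(columns):
    """Concatenate per-column calls into a consensus sequence."""
    return "".join(call_base(col) for col in columns)
```

Under these assumptions, a site covered by ten reads of which nine agree is left uncalled, because 9/10 does not exceed 90%. A strict threshold of this kind keeps isolated damage-derived mismatches, which affect a minority of reads at any one site, out of the final sequence.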
To create a data set for comparison to ZIN-35608, we added its
mitochondrial genome to a previously published alignment of 30
equid mitochondrial genomes (Table 1) (Heintzman et al., 2017). We
aligned the assembled mitochondrial genome to this data set using
MAFFT v.7 (Nakamura, Yamada, Tomii, & Katoh, 2018), and manually
checked the alignment for discrepancies. Using the annotation of E.
caballus mitochondrial genome (GenBank ID JN398421), we subdi-
vided the mitochondrial alignment into six partitions (first, second,
and third codon positions for protein coding genes, concatenated
tRNAs, rRNA genes, and the control region). We estimated the ap-
propriate evolutionary models for each partition using jModeltest2
(Darriba, Taboada, Doallo, & Posada, 2012).
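The six-partition scheme described above can be sketched in Python as a mapping from annotated features to alignment columns. This is a simplified illustration, not the procedure used in the study: the feature tuples, the names partition_columns and extract, and the toy sequence are hypothetical, and the sketch assumes forward-strand, non-overlapping genes whose coding frames start at the first annotated base.

```python
PARTITIONS = ("codon1", "codon2", "codon3", "tRNA", "rRNA", "control")


def partition_columns(features):
    """Map each partition name to a list of 0-based alignment column indices.

    `features` is a list of (start, end, kind) tuples with half-open
    coordinates; `kind` is one of 'CDS', 'tRNA', 'rRNA', or 'control'.
    """
    parts = {name: [] for name in PARTITIONS}
    for start, end, kind in features:
        if kind == "CDS":
            # Assign each coding column to codon position 1, 2, or 3.
            for offset, col in enumerate(range(start, end)):
                parts[f"codon{offset % 3 + 1}"].append(col)
        else:
            parts[kind].extend(range(start, end))
    return parts


def extract(sequence, columns):
    """Pull the characters of one aligned sequence that belong to a partition."""
    return "".join(sequence[i] for i in columns)
```

A real mitochondrial annotation would also need to handle genes encoded on the opposite strand and overlapping reading frames, which this sketch deliberately leaves out.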
FIGURE 1 (a) A map of Eurasia
highlighting the distribution of extinct
European (E. hydruntinus) and present-day
Asiatic (E. hemionus) wild asses.
The black dot shows the location of the
Begichev Islands where ZIN-35608 was
found. (b) A mandible found on Begichev
Islands (ZIN-35608) and identified as E. hemionus.
We estimated phylogenetic trees describing the relation-
ships among the 31 equids in our data set using both a Maximum
Likelihood (ML) and a Bayesian approach. For the ML reconstruction,
we specified the Malayan tapir, Tapirus indicus, as the outgroup
lineage (GenBank ID NC_023838) and ran three instances of RAxML
v.8.2.4 (Stamatakis, 2014) with the GTRGAMMAI nucleotide model
on each partition. We then selected the best ML tree and estimated
branch supports using 1,000 rapid bootstrap iterations. Bayesian inference
does not require an outgroup; we therefore did not include
the tapir sequence in this analysis. Following Heintzman et al. (2017),
we excluded the third codon position and the control region to
account for possible convergent mutations in rapidly evolving positions
of the mitochondrial genome. For the first and second codon positions
and the rRNA partition, we specified the TN93 + I + G nucleotide
model, and for the tRNA partition we specified HKY + I. We ran two
instances of BEAST 1.8.4 for 100 million iterations per run, sampling
model parameters and trees every 1,000 iterations (Drummond &
Rambaut, 2007). We assumed the uncorrelated relaxed clock model
and calibrated the molecular clock using the radiocarbon dates of
ancient samples as priors, with ages of the present-day samples set
to 0, and a divergence of the crown caballine group of 4–4.5 Ma
(normal prior, mean 4.25 M, SD: 1.5 M; Orlando et al., 2013). We cal-
ibrated radiocarbon ages of the Late Pleistocene samples reported
in Table 1 using the IntCal13 curve (Reimer et al., 2013) and OxCal
v4.2 (Ramsey, 2009), and assigned the median calibrated age of each
sample as prior information. We used the birth-death process with
serial samples as a tree prior (Drummond, Ho, Phillips, & Rambaut,
2006; Stadler, 2010). We discarded the first 25% of MCMC itera-
tions from each run as burnin, and analyzed parameters for conver-
gence in Tracer (Rambaut, Drummond, Xie, Baele, & Suchard, 2018).
All parameters reached an effective sample size > 1,000. We then
combined trees from the two BEAST runs in logCombiner and cal-
culated the maximum clade credibility (MCC) tree in Tree Annotator
(Rambaut & Drummond, 2010). We visualized the BEAST MCC tree
and the estimated RAxML phylogeny in FigTree v1.4.2 (Rambaut,
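The burn-in and run-combination steps described above amount to simple bookkeeping on the sampled states, which the following Python sketch illustrates. It is not LogCombiner's implementation; the function names discard_burnin and combine_runs are hypothetical, and each run is represented as a plain list of sampled states.

```python
def discard_burnin(samples, fraction=0.25):
    """Drop the first `fraction` of the samples from a single MCMC run."""
    n_burn = int(len(samples) * fraction)
    return samples[n_burn:]


def combine_runs(runs, fraction=0.25):
    """Remove burn-in from each run separately, then pool what remains."""
    combined = []
    for run in runs:
        combined.extend(discard_burnin(run, fraction))
    return combined


# 100 million iterations sampled every 1,000 iterations gives roughly
# 100,000 stored samples per run (ignoring the initial state).
samples_per_run = 100_000_000 // 1_000
```

With a 25% burn-in, about 75,000 samples per run remain, so two pooled runs contribute about 150,000 samples to the parameter summaries and to the MCC tree.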
DNA extraction and mitochondrial enrichment were both successful
for specimen ZIN-35608. We sequenced the unenriched library to
a depth of 841,010 reads. When these reads were mapped to the
nuclear reference genomes of E. caballus and E. asinus, 68,276 (8.1%)
of recovered reads mapped to the E. caballus reference nuclear ge-
nome, and 63,916 (7.6%) of reads mapped to E. asinus. Although this
number of reads is small and is not indicative of a final taxonomic as-
signment, it suggests a closer relationship between ZIN-35608 and
caballine horses than to donkeys. The median length of aligned DNA sequences
was 55 bp, and we observed an elevated frequency of G > A
and C > T substitutions at the ends of molecules, consistent with
degraded DNA. After mitochondrial capture, we sequenced the cap-
tured library to a depth of 249,469 reads. Of these, 35,271 (14.1%)
unique reads mapped to the E. hemionus mitochondrial genome, and
44,270 (17.8%) unique reads mapped to E. caballus, resulting in mito-
chondrial genomes with an average coverage of 206x using E. hemio-
nus as the starting reference for mapping and 217x when assembling
on E. caballus. The final assemblies were identical except for three
sections of the control region that were not assembled with E. he-
mionus as the starting reference. We therefore chose the assembly
seeded with E. caballus for phylogenetic reconstruction.
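The damage signal reported above, an excess of C > T and G > A substitutions at the ends of molecules, can be illustrated by tallying substitutions by distance from the read ends. The Python sketch below is a simplified stand-in for what a tool such as MapDamage2 reports, not its actual algorithm. The function name terminal_substitution_rates and the input format (gap-free pairs of aligned reference and read strings) are assumptions made for illustration.

```python
def terminal_substitution_rates(pairs, ref_base, read_base,
                                n_positions=5, from_3prime=False):
    """Per-position rate of ref_base > read_base substitutions near read ends.

    `pairs` is a list of (reference, read) strings of equal length without
    gaps. Position 0 is the first base at the 5' end, or the last base of
    the read when `from_3prime` is True. Positions where the reference base
    never occurs yield None.
    """
    opportunities = [0] * n_positions
    substitutions = [0] * n_positions
    for ref, read in pairs:
        if from_3prime:
            ref, read = ref[::-1], read[::-1]
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == ref_base:
                opportunities[i] += 1
                if read[i] == read_base:
                    substitutions[i] += 1
    return [s / o if o else None for s, o in zip(substitutions, opportunities)]
```

In authentic ancient DNA, the rates returned for C > T from the 5' end and for G > A from the 3' end are expected to be highest at position 0 and to decline toward the interior of the molecule.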
Radiocarbon dating, which was performed at the Keck facility at
UC Irvine, provided an uncalibrated age of 19,470 ± 70 years before
present (UCIAMS-199226). To incorporate this age into the Bayesian
analysis described above, we calibrated it using the IntCal13 radio-
carbon curve as implemented in OxCal 4.3 (Ramsey, 2009; Reimer
et al., 2013). This provided a median age of 23,457 calibrated years
before present (Cal BP 23697–23138, 95.4% probability range).
Based on the state of tooth eruption and microwear analysis (Hillson,
2005), we estimate the age of the individual to be 4–4.5 years at the
time of death.
The mitochondrial phylogenies reconstructed with Maximum
Likelihood and Bayesian approaches were topologically concor-
dant with respect to the major clades within the Equidae family,
although branching order slightly differed within the noncaballine
clade (Figure 2). The major difference is that in the ML phylogeny, E.
ovodovi falls outside of the diversity of zebras, donkeys, and asses,
whereas in the Bayesian phylogeny it is sister to E. asinus, although
with low statistical support. In both the ML and BEAST analyses, ZIN-35608
falls within the clade of caballine horses with strong statistical
support (Figure 2b). Based on these results, we conclude that
ZIN-35608 is a caballine horse.
Our ancient mitochondrial DNA data indicate that ZIN-35608 is a
member of the caballine clade of ancient horses, rather than a hemi-
one as assigned based on morphology. Although we generated too
few nuclear reads for a confident taxonomic assignment based on
nuclear genomic data, a greater proportion of reads from the shot-
gun library mapped to the E. caballus genome than to the E. asinus
genome, suggesting that the former is a closer evolutionary match.
Both ML and BEAST phylogenies reconstructed with complete high
coverage mitochondrial genomes place ZIN-35608 confidently
within the past and present diversity of caballine horses (Figure 2).
Our results do not support the proposed expansion of wild
asses to the North of Eastern Siberia, but instead indicate that caballine
horses were present in the Begichev Islands during the Late
Pleistocene (Figure 1a). With a calibrated age of 23,457 years ago,
ZIN-35608 lived during the peak cold interval of the Last Glacial
Maximum. At that time, the sea level was significantly lower than
today, which would have allowed the mainland horse population
to expand to what is today the Begichev Islands. Adjacent to the
Barents-Kara Ice Sheet, the region was at the arctic edge of the dry
Mammoth steppe, dominated by graminoid vegetation (Binney et
al., 2017; Mangerud et al., 2004; Tarasov et al., 2000). Such habi-
tats were widely populated by caballine horses at that time (Zimov,
Zimov, Tikhonov, & Chapin, 2012).
The erroneous taxonomic assignment of ZIN-35608 highlights
the challenge of generating an accurate taxonomic identification
from some paleontological remains, which can be problematic when,
TABLE 1 Samples used in the current study, their NCBI numbers, taxonomic position, uncalibrated radiocarbon age, and calendar dates used for BEAST
GenBank ID   Species                      Age (uncalibrated)    Calendar date BP (IntCal13 median)
Noncaballine equids
NC016061     Equus hemionus               Present day           0
NC018782     E. hemionus                  Present day           0
JX312732     E. kiang                     Present day           0
KM881681     E. asinus somalicus          Present day           0
NC001788     E. asinus                    Present day           0
NC018780     E. zebra hartmannae          Present day           0
NC020476     E. zebra                     Present day           0
NC020432     E. grevyi                    Present day           0
JX312729     E. burchellii chapmani       Present day           0
KM881680     E. burchellii quagga         Present day           0
Extinct noncaballine equids
NC018783     E. ovodovi                   45,000*               45,000*
Caballine equids
MN503280     E. caballus - ZIN35608       19,470 ± 70           23,457
KT757749     E. caballus                  28,800 ± 1,100        32,918
KT757757     E. caballus                  34,460 ± 240          38,961
KT757759     E. caballus                  13,940 ± 55           16,908
KT757761     E. caballus                  Present day           0
NC001640     E. caballus                  Present day           0
KT757763     E. scotti                    560,000–780,000*      650,000*
KT168318     E. lambei                    33,760 ± 400          38,138
KT168322     E. lambei                    21,420 ± 80           25,749
New World Stilt Legged horses
KT168320     Haringtonhippus francisci    28,740 ± 570          32,767
KT168326     H. francisci                 46,500 ± 1,900        47,770
KT168332     H. francisci                 33,400 ± 430          37,664
KT168333     H. francisci                 33,560 ± 440          37,858
KT168335     H. francisci                 14,450 ± 90           17,598
KT168336     H. francisci                 28,390 ± 240          32,311
Extinct South American horses
KM881671     Hippidion saldiasi           13,990 ± 150          16,980
KM881672     H. saldiasi                  11,900 ± 60           13,709
KM881673     H. saldiasi                  13,890 ± 60           16,830
KM881675     H. saldiasi                  10,680 ± 40           12,653
KM881677     H. sp                        13,275 ± 30           15,961
Note: The star (*) marks specimen ages estimated by stratigraphic dating. All other nonpresent day
specimens are radiocarbon dated.
for example, specimens are fragmentary or come from animals that
have not reached full adult size. In this and other cases, ancient
DNA has proven to be a useful tool for resolving such questions.
Barbanera, Moretti, Guerrini, Al-Sheikhly, and Forcina (2016), for
example, amplified the mitochondrial cytochrome b gene from spec-
imens stored in three museum collections that had been identified as
smooth-coated otters (Lutrogale perspicillata). Ancient DNA revealed
these to belong to three different species, none of which were L. per-
spicillata (Barbanera et al., 2016). Similarly, Cappellini et al. (2014)
used a combination of ancient DNA and proteomics to correct the
taxonomic assignment of the original type material of the Asian el-
ephant Elephas maximus. The biomolecular data revealed that this
sample, a complete ethanol-preserved fetus originally described by
Linnaeus, was in fact an African elephant, Loxodonta sp. (Cappellini
et al., 2014).
In addition to resolving incorrect taxonomic identifications, pa-
leogenomic data can augment the contribution of museum collec-
tions to our understanding of evolutionary history, paleoecology,
and conservation. Samples from organisms that lived before envi-
ronmental shifts or periods of population decline can be used to es-
timate evolutionary changes that occur as a consequence of those
events. For example, ancient DNA from museum preserved speci-
mens collected within the last few centuries has revealed a dramatic
reduction in genetic diversity of the critically endangered Western
Australian woylie and Eastern gorilla, both of which are associated
with anthropogenic habitat changes (Pacioni et al., 2015; van der
Valk et al., 2018). Ancient DNA from much older museum specimens,
coupled with radiocarbon dating and stable isotope analysis, has
also been used to reconstruct ecological changes during the early
Holocene megafaunal mass extinction event (Lorenzen et al., 2011;
Shapiro et al., 2004). Finally, museum-preserved specimens have tremendous
potential value for conservation. Recently, museum specimens
of Eurasian beaver, Castor fiber, from the last 10,000 years
helped to identify potential source populations for this species’ rein-
troduction to Britain (Marr, Brace, Schreve, & Barnes, 2018). In the
European beaver study, museum specimens allowed the reconstruc-
tion of past diversity and ancient dispersal events across geographic
locations, which is essential for finding a suitable source population
for controlled genetic rescue (Dietl & Flessa, 2011; Leonard, 2008).
Although the preservation of ZIN-35608 is poor, future advances in
DNA recovery efficiency may allow a complete genome sequence to
be isolated from this specimen, which would help to reveal whether
FIGURE 2 Molecular phylogenies of 31 mitochondrial genomes of various Equid groups sampled worldwide. Colour coding corresponds
to different continents. (a) Maximum clade credibility tree reconstructed with BEAST. Each node has a bar corresponding to a 95% HPD
height interval. (b) Maximum likelihood phylogeny estimated using tapir as an outgroup (the outgroup is not shown). Posterior probabilities
and bootstrap supports higher than 0.95 and 95, respectively are indicated with numbers above and below branches
the Begichev populations were genetically isolated from the main-
land Yakutiya and how horse population structure changed with
the separation of the Begichev Islands from the continent. While
the present study of ZIN-35608 is a single example, it highlights the
potential power of museum specimens, combined with increasingly
sophisticated biomolecular approaches, to reveal the pattern and
process of biodiversity change over time.
While not all museum specimens retain DNA, advances in paleog-
enomic approaches continue to expand the range of material from
which DNA can be recovered, increasing the value of museum speci-
mens. New methods have been developed to extract DNA from sam-
ples fixed in formalin (Hykin, Bi, & McGuire, 2015; Ruane & Austin,
2017) and ethanol (McGuire et al., 2018) and to decontaminate sam-
ples with sodium hypochlorite prior to extraction (Korlević et al., 2015)
thereby increasing the fraction of useful DNA recovered. In the current
study, we used in-solution hybridization capture, which is an approach
developed to efficiently recover short fragments of a targeted region
of the genome. This method allows recovery of sufficient quantities
of data for population genetic analyses even when samples are poorly
preserved. Such data can be generated for US$50–$100 per sample, although
costs vary depending on sample preservation, experimental
approach to DNA extraction and library preparation, and local costs of
consumables and sequencing. Other strategies to reduce cost and/or
generate new biological information from ancient specimens include
DNA barcoding and metabarcoding of bulk-extracted bone fragments,
which enables taxonomic identification of a corecovered community
of organisms even when remains are too fragmentary to attempt mor-
phological analysis (Grealy et al., 2015).
Finally, while ancient DNA recovered from museum specimens
has broad utility in ecological and evolutionary analyses, this ap-
proach is most powerful when used in combination with other meth-
ods, including morphological analysis, radiocarbon dating, stable
isotope analysis, and other techniques. Many of these approaches
are inherently destructive, and it is therefore important to consider
the long-term impact on the collections when deciding what ap-
proaches are best for any particular sample.
In summary, our results reveal that a museum specimen recovered
in the Russian Far North and identified based on morphological
characters as a wild ass is actually a caballine horse. This incorrect
identification is probably a consequence of the unusually small size of
the horse combined with problematic tooth characteristics. Our
finding therefore disputes the hypothesis that wild
asses expanded to the north of Eurasia during the Pleistocene. Our
study demonstrates how ancient DNA can be used to validate the
taxonomic identity of problematic museum specimens. The growing
diversity of approaches that can be used to analyze the preserved
remains of organisms highlights the important role of museum col-
lections in advancing our understanding of evolutionary history, pa-
leoecology, and conservation.
This work was funded by National Science Foundation grant
NSF ANS 1417036 and Institute of Museum and Library Services
grant MG-30-17-0045-17. GB is supported by the Program of the
Russian Academy of Sciences Presidium and the Russian Ministry of
Education and Science "Evolution of the organic world. The role and
significance of planetary processes" (2019).
G.B., A.V., and B.S. conceived the study; A.V. obtained the sample;
J.K. extracted DNA, prepared the sequencing library, and performed hybridization
capture; A.V. conducted analyses of the sequencing data;
A.V. and B.S. wrote the manuscript. All authors discussed the results
and contributed to the final version of the manuscript.
The assembled mitochondrial genome of ZIN-35608 has been deposited in
GenBank (accession MN503280). The XML file used for the BEAST analysis
and the mitochondrial genome alignment are available for download on
the Dryad Digital Repository (https ://
Barbanera, F., Moretti, B., Guerrini, M., Al-Sheikhly, O. F., & Forcina,
G. (2016). Investigation of ancient DNA to enhance natural history
museum collections: Misidentification of smooth-coated otter
(Lutrogale perspicillata) specimens across multiple museums. Belgian
Journal of Zoology, 146(2), 101–112.
Barron-Ortiz, C., Avilla, L., Jass, C., Bravo-Cuevas, V., Machado, H., &
Mothé, D. (2019). What is Equus? Reconciling taxonomy and phylogenetic
analyses. Frontiers in Ecology and Evolution, 7, 343. https ://
Bennett, D. K. (1980). Stripes do not a Zebra Make, Part I: A Cladistic
Analysis of Equus. Systematic Zoology, 29(3), 272–287. https ://doi.
Bennett, E. A., Champlot, S., Peters, J., Arbuckle, B. S., Guimaraes, S.,
Pruvost, M., … Geigl, E.-M. (2017). Taming the late Quaternary phylogeography
of the Eurasiatic wild ass through ancient and modern
DNA. PLoS ONE, 12(4), e0174216. https ://
Binney, H., Edwards, M., Macias-Fauria, M., Lozhkin, A., Anderson, P.,
Kaplan, J. O., … Zernitskaya, V. (2017). Vegetation of Eurasia from
the last glacial maximum to present: Key biogeographic patterns.
Quaternary Science Reviews, 157, 80–97. https :// /10.1016/j.quascirev.2016.11.022
Boessenkool, S., Hanghøj, K., Nistelberger, H. M., Der Sarkissian, C., Gondek,
A. T., Orlando, L., … Star, B. (2017). Combining bleach and mild predigestion
improves ancient DNA recovery from bones. Molecular Ecology
Resources, 17(4), 742–751. https ://
Brown, S., Higham, T., Slon, V., Pääbo, S., Meyer, M., Douka, K., … Buckley,
M. (2016). Identification of a new hominin bone from Denisova Cave,
Siberia using collagen fingerprinting and mitochondrial DNA analysis.
Scientific Reports, 6, 23559. https :// 3559
Campos, P. F., Craig, O. E., Turner-Walker, G., Peacock, E., Willerslev,
E., & Gilbert, M. T. P. (2012). DNA in ancient bone - where is it located
and how should we extract it? Annals of Anatomy, 194(1), 7–16.
https ://
Cappellini, E., Gentry, A., Palkopoulou, E., Ishida, Y., Cram, D., Roos,
A.-M., … Gilbert, M. T. P. (2014). Resolution of the type material of
the Asian elephant, Elephas maximus Linnaeus, 1758 (Proboscidea,
Elephantidae). Zoological Journal of the Linnean Society, 170(1), 222–
232. https ://doi. org/10.1111/zo j12084
Cappellini, E., Welker, F., Pandolfi, L., Ramos-Madrigal, J., Samodova,
D., Rüther, P. L., … Willerslev, E. (2019). ). Early Pleis tocene enamel
proteome from Dmanisi resolves Stephanorhinus phylogeny. Nature,
1–5. ht tps ://do 38/s41586 -019-1555-y
Cucchi, T., Mohaseb, A., Peigné, S., Debue, K., Orlando, L ., & Mashkour,
M. (2017). Detecting taxonomic and phylogenetic signals in equid
cheek teeth: Towards new palaeontological and archaeologi-
cal proxies. Royal Society Open Science, 4(4), 160997. https ://doi.
org /10.1098/rso s.160997
Dabney, J., Knapp, M., Glocke, I., Gansauge, M.-T., Weihmann, A., Nickel,
B., … Meyer, M. (2013). Complete mitochondrial genome sequence
of a Middle Pleistocene cave bear reconstructed from ultrashor t
DNA fragments. Proceedings of the National Academy of Sciences of
the United States of America, 110 (39), 15758–15763. https ://doi.
org/10.1073/pnas.13144 45110
Darriba, D., Taboada, G . L., Doallo, R., & Posada, D. (2012). jModelT-
est 2: More models, new heuristics and parallel computing. Nature
Methods, 9(8), 772. https ://
Der Sarkissian, C., Ermini, L., Schuber t, M., Yang, M. A., Librado, P.,
Fumagalli, M., Orlando, L. (2015). Evolutionar y Genomics
and Conservation of the Endangered Przewalski’s Horse.
Current Biology, 25(19), 2577–2583. https ://
Dietl, G . P., & Flessa, K. W. (2011). Conservation paleobiology: Putting
the dead to work. Trends in Ecology & Evolution, 26(1), 30–37. https ://
Drummond, A. J., Ho, S. Y. W., Phillips, M. J., & Rambaut, A . (2006).
Relaxed phylogenetics and dating with confidence. PLoS Biolog y, 4(5),
e88. https :// al.pbio.0040088
Drummond, A. J., & Rambaut, A. (2007). BEAST: Bayesian evolutionary
analysis by sampling trees. BMC Evolutionary Biology, 7, 214. https ://
doi .org/10.1186/1471-214 8-7-214
Eisenmann, V., Howe, J., & Pichardo, M. (2008). Old world hemiones and
new world slender species (Mammalia, Equidae). Palaeovertebrata,
36(1–4), https :// /10.18563/ pv.36.1-4.159-233
Fages, A., Hanghøj, K., Khan, N., G aunitz, C., Seguin-Orlando, A.,
Leonardi, M., … Orlando, L . (2019). Tracking Five Millennia of Horse
Management with Extensive Ancient Genome Time Series. Cell,
177(6), 14191435. https ://
Foote, A. D., Kaschner, K., Schultze, S. E., Garilao, C., Ho, S. Y. W., Post, K.,
… Gilbert, M. T. P. (2013). Ancient DNA reveals that bowhead whale
lineages survived Late Pleistocene climate change and habitat shifts.
Nature Communications, 4, 1677. https :// s2714
Forste n, A. (1998). The Late P leistocene- Holocene hor ses, not asse s, from
Japan (Mammalia, Perissodactyla, Equus). Geobios. Memoire Special,
31(4), 545–548. ht tps ://
Fulton, T. L., & Shapiro, B. (2019). Settin g Up an Ancient DNA L aboratory.
In B. Shapiro, A. Barlow, P. D. Heintzman, M. Hofreiter, J. L. A.
Paijmans, & A. E. R . Soares (Eds.), An cient DNA: Methods and Protocol s
(pp. 1–13). New York, NY: Springer.
Gaunit z, C., Fage s, A., Hanghøj, K., Albrechtsen, A., Khan, N., Schubert,
M., … Orlando, L. (2018). Ancient genomes revisit the ancestry of
domestic and Przewalski’s horses. Science, 360(63 84) , 111–114.
Geigl, E.-M., & Grange, T. (2012). Eurasian wild asses in time and
space: Morphological versus genetic diversity. Annals of Anatomy =
Anatomis cher Anzeiger: Of ficial Organ of the A natomische. Gesellschaft,
194(1), 88–102. https ://
Grealy, A. C., McDowell, M. C ., Scofield, P., Murray, D. C., Fusco, D. A.,
Haile, J., … Bunce, M. (2015). A critical evaluation of how ancient
DNA bulk bone metabarcoding complements traditional morpholog-
ical analysis of fossil assemblages. Quaternary Science Reviews, 128,
37–47. https :// irev.2015.09.014
Haring ton, C. R. (1980). Pleistocene mammals from Lost Chicken Creek.
Alaska. Canadian Journal of Earth Sciences, 17(2), 168–198. https ://
Heintzman, P. D., Zazula, G . D., MacPhee, R. D. E., Scott, E., Cahill, J.
A., McHorse, B. K., Shapiro, B.. (2017). A new genus of horse from
Pleistocene North America. eLife, 6, https :// /10.7554/
eLife.2994 4
Higuchi, R., Bowman, B., Freiberger, M., Ryder, O. A., & Wilson, A . C.
(1984). DNA sequences from the quag ga, an ex tinct member of the
horse family. Nature, 312(5991), 282–284.
Hillson, S. (2005). Teeth (Cambridge Manuals in Archaeology). Cambridge:
Cambridge University Press.
Hykin, S. M., Bi, K ., & McGuire, J. A. (2015). Fixing Formalin: A Method
to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed
Museum Specimens Using High-Throughput Sequencing. PLoS ONE,
10(10), e0141579. https :// al.pone.0141579
Johnson, K. G., Brooks, S. J., Fenberg, P. B., Glover, A. G., James, K. E.,
Lister, A. M ., … Stewar t, J. R. (2011). Climate Change and Biosphere
Response: Unlock ing the Collections Vault. BioScience, 61( 2) , 147–
153. https ://
Jónsson , H., Ginolhac, A ., Schubert , M., Johnson, P. L. F., & Orlando, L.
(2013). mapDamage2. 0: Fast approximate Bayesian estimates of an-
cient DNA damage parameters. Bioinformatics, 29(13), 16 82–1684.
Kaczensky, P., Lkhagvasuren, B., Pereladova, O., Hemami, M., & Bouskila,
A. (2015). Equus hemionus. The IUCN Red List of Threatened
Species 2015: e.T7951A45171204. https ://
UK.2015-4.RLTS.T7951 A4517 1204.en.
Keller, A., Kreis, S., Leidinger, P., Maixner, F., Ludwig, N., Backes, C.,
… Meese, E . (2017). miRNAs in Ancient Tissue Specimens of the
Tyrolean Iceman. Molecular Biology and Evolution, 34(4), 793–801.
Korlević, P., Gerber, T., Gansauge, M.-T., Hajdinjak, M., Nagel, S., Aximu-
Petri, A., & Meyer, M. (2015). Reducing microbial and human con-
tamination in DNA extractions from ancient bones and teeth.
BioTechniques, 59(2), 87–93 . ht tps ://doi.o rg /10. 214 4/0 0 011 4320
Krause, J., Fu, Q., Good, J. M., Viola, B., Shunkov, M. V., Derevianko, A. P.,
& Pääbo, S. (2010). The complete mitochondrial DNA genome of an
unknown hominin from southern Siberia. Nature, 464(7290), 894–897.
Kuzmina, I . E. (1997). Horses of North Eurasia from The Pliocene to The
Present. Zoological Institute: RAS Press, Sankt-Petersburg.
Leonard, J. A. (20 08). Ancient DNA applications for wildlife con-
servation. Molecular Ecology, 17(19), 4186–4196. https ://doi.
org /10.1111/j.1365 -294X .2008.03891 .x
Li, H., & Durbin, R . (2009). Fast and accurate short read alignment with
Burrows-Wheeler transform. Bioinformatics, 25(14), 175 4–1760.
https :// forma tics/btp324
Lorenzen, E. D., Nogués-Bravo, D., Orlando, L., Weinstock, J., Binladen,
J., Marske, K. A., … Willerslev, E. (2011). Species-specific responses
of Late Quaternar y megafauna to climate and humans. Nature,
479(7373), 359–364.
MacFadden, B. J. (1992). Fossil Horses: Systematics, Paleobiology, and
Evolution of the Family Equidae. New York, NY: Cambridge University
Mangerud, J., Jakobsson, M., Alexanderson, H., Astakhov, V., Clarke,
G. K. C ., Henriksen, M ., … Svendsen, J. I. (20 04). Ice-dammed lakes
and rerouting of the drainage of northern Eurasia during the Last
Glaciation. Quaternary Science Reviews, 23(11), 1313–1332. ht tps :// irev.2003.12.009
Markova, A . K., Smirnov, N. G., Kozharinov, A. V., Kazant seva, N. E.,
& Kitaev, L. M. (1995). Late Pleistocene distribution and diversity of
mammals in Northern Eurasia (PALEOFAUNA Database) ( Vol. 28–29).
Paleontologia i Evolucio, t.28-29, pp. 5–143.
Marr, M. M., Brace, S., Schreve, D. C ., & Barnes, I. (2018). Identif ying
source populations for the reintroduction of the Eurasian beaver,
Castor fiber L. 1758, into Britain: Evidence from ancient. DNA.
Scientific Reports, 8(1), 2708.
McGuire, J. A., Cotoras, D. D., O’Connell, B., Lawalata , S. Z. S., Wang-
Claypool, C. Y., Stubbs, A., … Iskandar, D. T. (2018). Sque ezing water
from a stone: High-throughput sequencing from a 145-year old
holotype resolves (barely) a cryptic spe cies problem in flying lizards.
Pee rJ, 6, e4 470. htt ps :// j.4 470
Meyer, M., Arsuaga, J.-L ., de Filippo, C., Nagel, S., Aximu-Petri, A.,
Nickel, B., … Pääbo, S. (2016). Nuclear DNA sequences from the
Middle Pleistocene Sima de los Huesos hominins. Nature, 531(7595),
Meyer, M., & Kircher, M. (2010). Illumina sequencing librar y preparation
for highly multiplexed target capture and sequencing. Cold Spring
Harbor Protocols, 2010 (6), db.prot54 48. https ://
pdb.prot54 48
Nakamu ra, T., Yamada, K. D., Tomii , K., & Katoh , K. (2018). Par allelizat ion of
MAFFT for large-scale multiple sequence alignments. Bioinformatics,
34(14), 2490–2492. https :// forma tics/bty121
Orlando, L., Ginolhac, A., Zhang, G., Froese, D., Albrecht sen, A ., Stiller,
M., … Willerslev, E. (2013). Recalibrating Equus evolution using the
genome sequence of an early Middle Pleistocene horse. Nature,
499(7456), 74–78.
Orlando, L., Metcalf, J. L., Alberdi, M. T., Telles-Antunes, M., Bonjean,
D., Otte, M., … Cooper, A. (2009). Revising the recent evolutionary
histor y of equids using ancient DNA . Proceedings of the National
Academy of Sciences of the United States of America, 106 (51 ), 2175 4–
21759. https ://doi.o rg/10.1073/pn as.090 36 72106
Pacioni, C., Hunt, H., Allentoft , M. E., Vaughan, T. G., Wayne, A. F.,
Baynes, A., … Bunce, M. (2015). Genetic diversity loss in a biodiver-
sity hot spot: A ncient DNA quantifies genetic decline and former
connectivity in a critically endangered marsupial. Molecular Ecology,
24(23), 5813–5828. ht tps ://
Rambau t, A. (2014). FigTree 1.4. 2 softwa re. Institute of Evolu tionary Bio logy,
University of Edinburgh. Retrieved from https ://schol
schol ar?clust er=51296 45832 77081 3008&hl=en&as_sdt=0,5&sciod
Rambaut, A., & Drummond, A. J. (2010). TreeAnnotator version
1.6. 1. Universit y of Edinburgh, Edinburgh, UK. Retrieved from
h t t p s : / / s c h o l a r . g o o g l e . c a / s c h o l a r ? c l u s t e r = 1 6 0 3 4 9 9 7 3 5 6 6 1 2 4 4 6 9 4 5
&hl=en&as_sdt=0,5&sciod t=0,5.
Rambaut, A., Drummond, A. J., Xie, D., Baele, G., & Suchard, M. A. (2018).
Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7.
Systematic Biology, 67(5), 901–904. https :// o/
Ramsey, C. B. (2009). Bayesian Analysis of Radiocarbon Dates.
Radiocarbon, 51(1), 337–360 . https ://doi .or g/10 .1017/S0 03 3 82 220
Reimer, P. J., Bard, E., Bayliss, A ., Beck, J. W., Black well, P. G., Ramsey, C.
B., … van der Plicht, J. (2013). IntCal13 and Marine13 Radiocarbon
Age Calibration Curves 0–50,000 Years cal BP. Radiocarbon, 55(4),
1869–1887. https ://
Ruane, S., & Austin , C. C. (2017). Phylogenomics using forma-
lin-fixed and 100+ year-old intrac table natural history speci-
mens. Molecular Ecology Resources, 17(5), 1003–1008. https ://doi.
org /10.1111/1755-0998.12655
Shapiro, B., Drummond, A . J., Rambaut, A ., Wilson, M. C., Matheus, P. E.,
Sher, A. V., … Cooper, A. (2004). Rise and fall of the Beringian steppe
bison. Science, 306(5701), 1561–1565.
Stadler, T. (2010). Sampling-through-time in birth-death trees. Journal
of Theoretical Biology, 267(3), 396–404. https ://
Stamatakis, A . (2014). RA xML version 8: A tool for phylogenetic analysis
and post-analysis of large phylogenies. Bioinformatics, 30(9), 131 2–
1313. https :// forma tics/btu033
St-Louis, A., & Côté, S. D. (2009). Equus kiang (Perissodactyla: Equidae).
Mammalian Species, 835, 1–11. https ://
Tarasov, P. E., Volkova, V. S., Webb, T., Guiot, J., Andreev, A. A., Bezusko,
L. G., … Sevastyanov, D. V. (2000). Last glacial maximum biomes
reconstructed from pollen and plant macrofossil data from nor th-
ern Eurasia. Journal of Biogeography, 27(3), 609–620. https ://doi.
Twiss, K. C., Wolfhagen, J., Madgwick, R ., Foster, H., Demirergi, G. A .,
Russell, N., … Mulville, J. (2017). Horses , Hemiones, Hydruntines?
Assessing the Reliabilit y of Dental Criteria for Assigning Species
to Southwe st Asian Equid Remains: Assessing the Reliability of SW
Asian Equid Dental Criteria. International Journal of Osteoarchaeology,
27(2), 298–304. https ://
van der Valk, T., Sandoval-Castellanos, E., Caillaud, D., Ngobobo, U.,
Binyinyi, E., Nishuli, R., … Guschanski, K. (2018). Significant loss of
mitochondrial diversit y within the last century due to extinction
of peripheral populations in eastern gorillas. Scientific Repor ts, 8(1),
6551. https ://
Weinstock , J., Willerslev, E., Sher, A., Tong, W., Ho, S. Y. W., Rubenstein,
D., … Cooper, A. (2005). Evolution, systematics, and phylogeog-
raphy of pleistocene horses in the new world: A molecular per-
spective. PLOS Biology, 3(8), e241. https ://
White, L . C., Mitchell, K . J., & Austin, J. J. (2018). Ancient mitochondrial
genomes reveal the demographic history and phylogeography of the
extinct, enigmatic thylacine (Thylacinus cynocephalus). Journal of
Biogeography, 45(1 ) , 1–13 .
Zimov, S. A., Zimov, N. S., Tikhonov, A. N., & Chapin, F. S. (2012).
Mammoth steppe: A high-productivity phenomenon. Quaternary
Science Reviews, 57, 26–45. https ://
How to cite this article: Vershinina AO, Kapp JD, Baryshnikov
GF, Shapiro B. The case of an arctic wild ass highlights the
utility of ancient DNA for validating problematic
identifications in museum collections. Mol Ecol Resour.
2020;00:1–9. https://doi.org/10.1111/1755-0998.13130

Supplementary resource (1)

... To analyse the taxonomic position of the hair source, we analysed the first extract by processing the sequencing reads using standard protocols (Vershinina et al. 2020;Vershinina et al. 2021). For MOE166, we mapped the merged quality-filtered reads on a variety of reference genomes; however, the results did not indicate DNA library affinity to any particular organism or reference and suggested human contamination. ...
... Next, we prepared the second DNA extract, merged the read data (MOE166 and BAN039) and mapped it to the human reference genome, hg38. In addition to nuclear genome analysis, we used the online github DNA post processing program 1 (Vershinina et al. 2020) to assemble the mitochondrial genome recovered in libraries MOE166 combined with BAN039, using NC 012920 (rCRS) as a seed for reference-free MIA v1.0 assembly. We then ran the assembly through HaploGrep 2.0 (Weissensteiner et al. 2016) to determine the haplogroup of the assembled mitochondrial genome. ...
... Note 1. (Vershinina et al. 2020). ...
Full-text available
Ancient hair and remnant plant DNA are important environmental proxies that preserve for millennia in specific archaeological contexts. However, recovery has been rare from late Pleistocene sites and more may be found if deliberately sought. Once discovered, singular hair fragments are not easily identified to taxa through comparative analyses and environmental DNA (eDNA) extraction can be difficult depending on preservation or contamination. In this paper, we present our methods for the combined recovery of ancient hair specimens and eDNA from sediments to improve our understanding of late Pleistocene environments from the Holzman site along Shaw Creek in interior Alaska. The approach serves as a useful case study for learning more about local environmental changes.
... Biological collections provide an invaluable window into the past, creating an irreplaceable record of biodiversity (Suarez & Tsutsui 2004;Yeates et al. 2016;Vershinina et al., 2019). Often, samples have been collected over long time periods from the same localities, providing time series to study evolution at multiple spatial and temporal scales (Splendiani et al. 2017;Schmitt et al. 2019;Shultz et al. 2020). ...
Until recently many historical museum specimens were largely inaccessible to genomic inquiry, but high‐throughput sequencing (HTS) approaches have allowed researchers to successfully sequence genomic DNA from dried and fluid‐preserved museum specimens. In addition to preserved specimens, many museums contain large series of allozyme supernatant samples but the amenability of these samples to HTS has not yet been assessed. Here, we compared the performance of a target‐capture approach using alternative sources of genomic DNA from ten specimens of spring salamanders (Plethodontidae: Gyrinophilus porphyriticus) collected between 1985 and 1990: allozyme supernatants, allozyme homogenate pellets, and formalin‐fixed tissues. We designed capture probes based on double‐digest restriction‐site associated sequencing (RADseq) derived loci from frozen blood samples available for seven of the specimens and assessed the success and consistency of capture and RADseq approaches. This study design enabled direct comparisons of data quality and potential biases among the different datasets for phylogenomic and population genomic analyses. We found that in phylogenetic analyses, all enrichment types for a given specimen clustered together. In principal component space all capture‐based samples clustered together, but RADseq samples did not cluster with corresponding capture‐based samples. SNP calls were on average 18.3% different between enrichment types for a given individual, but these discrepancies were primarily due to differences in heterozygous/homozygous SNP calls. We demonstrate that both allozyme supernatant and formalin‐fixed samples can be successfully used for population genomic analyses and we discuss ways to identify and reduce biases associated with combining capture and RADseq data.
... However, these protocols are primarily optimized to deal with poor DNA quality due to contamination and fragmentation (Rohland et al. 2015), designed to enrich endogenous DNA over contaminations (Horn 2012) and therefore target only specific parts of the genome (Suchan et al. 2016;Knyshov et al. 2019). Many of the protocols also still require higher amounts of input DNA or tissue than used in this study (Gamba et al. 2016;Tsai et al. 2019Tsai et al. , 2019Vershinina et al. 2020). ...
... Twenty of the 78 well-preserved samples yielded >4-fold coverage mitochondrial genomes from the screening data (Table S1). For the remaining 58, we performed RNA-bait hybridizationbased target enrichment using the horse "myBaits Expert Mito'' kit (Daicel Arbor Biosciences, previously MYcroarray) and following manufacturer's protocols v. 1 to v. 4, with hybridization at 65℃ for 36 hours as described by Vershinina et al. (2019) We augmented our data set with 110 previously published ancient and present-day horse mitogenomes from across their Holarctic Pleistocene range (Fages et al., 2019;Librado et al., 2015;Lippold, Matzke, Reissmann, & Hofreiter, 2011); see Table S2 for the complete list of references). ...
Full-text available
The Bering Land Bridge (BLB) last connected Eurasia and North America during the Late Pleistocene. Although the BLB would have enabled transfers of terrestrial biota in both directions, it also acted as an ecological filter whose permeability varied considerably over time. Here we explore the possible impacts of this ecological corridor on genetic diversity within, and connectivity among, populations of a once wide‐ranging group, the caballine horses (Equus spp.). Using a panel of 187 mitochondrial and eight nuclear genomes recovered from present‐day and extinct caballine horses sampled across the Holarctic, we found that Eurasian horse populations initially diverged from those in North America, their ancestral continent, around 1.0‐0.8 million years ago. Subsequent to this split our mitochondrial DNA analysis identified two bi‐directional long‐range dispersals across the BLB ~875‐625 and ~200‐50 thousand years ago, during the Middle and Late Pleistocene. Whole genome analysis indicated low levels of gene flow between North American and Eurasian horse populations, which likely occurred as a result of these inferred dispersals. Nonetheless, mitochondrial and nuclear diversity of caballine horse populations retained strong phylogeographic structuring. Our results suggest that barriers to gene flow, currently unidentified but possibly related to habitat distribution across Beringia or ongoing evolutionary divergence, played an important role in shaping the early genetic history of caballine horses, including the ancestors of living horses within Equus ferus.
... Затем мы амплифицировали ее с помощью 2X KAPA HIFI полимеразного микса и секвенировали в Университете Калифорнии Санта-Круз на Illumina MiSeq (парными прочтениями, по 75 нуклеотидов каждый). С помощью парных коротких прочтений (ридов) мы собрали целый митогеном по протоколу Вершининой с соавторами (Vershinina et al., 2019). Мы использовали Stephanorhinus cf. ...
Full-text available
Merck’s rhino (Stephanorhinus kirchbergensis (Jäger 1839)), one of the extinct members of the Pleistocene megafauna, is scarce in Russia’s geological record. According to the previous research paradigm, that large rhinoceros inhabited forest environments during interglacials, consumed mostly branch- and leaf-containing food, and went extinct across most of its range during the Middle Pleistocene, still persisting in southern Siberia until the Late Pleistocene. No direct evidence of this species associated with Late Pleistocene deposits and based on 14C dating has hitherto been obtained in Russia. Our studies on the mandible of Merck’s rhino from the South of western Siberia confirm that the species was present in the Altai region until the second half of the Late Pleistocene (MIS3), but much later than previously thought, until about 40000 years before present. Tooth enamel microwear shows that this rhino ate branches and leaves of various trees and shrubs. Merck’s rhino from the Chondon River (North of the Indigirka-Kolyma Lowlands) inhabited open larch forests and grassland landscapes. Considering the habitats, this species had a chance to survive there at least until the beginning of the Late Pleistocene (MIS5e), that is, their time lasted longer than previously thought. A phylogenetic analysis of complete mitochondrial genomes of extinct and extant rhinoceroses confirms the taxonomic morphological identification of the Altai and Chondon rhinos.
... Biological collections provide an invaluable window into the past, creating an irreplaceable record of biodiversity (Suarez & Tsutsui 2004;Yeates et al. 2016;Vershinina et al., 2019). Often, samples have been collected over long time periods from the same localities, providing time series to study evolution at multiple spatial and temporal scales (Splendiani et al. 2017;Schmitt et al. 2019;Shultz et al. 2020). ...
Until recently many historical museum specimens were largely inaccessible to genomic inquiry, but high-throughput sequencing (HTS) approaches have allowed researchers to successfully sequence genomic DNA from dried and fluid-preserved museum specimens. In addition to preserved specimens, many museums contain large series of allozyme supernatant samples but the amenability of these samples to HTS has not yet been assessed. Here, we compared the performance of a target-capture approach using alternative sources of genomic DNA from ten specimens of spring salamanders (Plethodontidae: Gyrinophilus porphyriticus) collected 1985–1990: allozyme supernatants, allozyme homogenate pellets, and formalin-fixed tissues. We designed capture probes based on double-digest restriction-site associated (RADseq) sequencing derived loci from seven of the specimens and assessed the success and consistency of capture and RADseq technical replicates. This study design enabled direct comparisons of data quality and potential biases among the different datasets for phylogenomic and population genomic analyses. We found that in phylogenetic analyses, all replicates for a given specimen clustered together, but in principal component space, RADseq replicates did not cluster with corresponding capture-based replicates. SNP calls were on average 18.3% different between technical replicates, but these discrepancies were primarily due to differences in heterozygous/homozygous SNP calls. We demonstrate that both allozyme supernatant and formalin-fixed samples can be successfully used for population genomic analyses and we discuss ways to identify and reduce biases associated with combining capture and RADseq data.
Objective: To understand the domestication and spread of horses in history, genetic information is essential. However, mitogenetic traits of ancient or medieval horses have yet to be comprehensively revealed, especially for East Asia. This study thus set out to reveal the maternal lineage of skeletal horse remains retrieved from a 15th century archaeological site (Gongpyeongdong) at Old Seoul City in South Korea. Methods: We extracted DNA from the femur of Equus caballus (SNU-A001) from Joseon period Gongpyeongdong site. Mitochondrial (mt) DNA (HRS 15128-16116) of E. caballus was amplified by PCR. Cloning and sequencing was conducted for the mtDNA amplicons. The sequencing results were analyzed by NCBI/BLAST and phylogenetic tool of MEGA7 software. Results: The horse is unique among domesticated animals for the remarkable impact it has on human civilization in terms of transportation and trade. Utilizing the Joseon-period horse remains, we can obtain clues to reveal the genetic traits of Korean horse that existed before the introduction of Western horses. Conclusion: By means of mtDNA cytochrome b and D-loop analysis, we found that the 15th century Korean horse belonged to haplogroup Q representing those horses that have historically been raised widely in East Asia.
Full-text available
(1) Within evolutionary biology, mitochondrial genomes (mitogenomes) provide useful insights at both population and species level. Several approaches are available to assemble mitogenomes. However, most are not suitable for divergent, extinct species, due to the requirement of a reference mitogenome from a conspecific or close relative, and relatively high-quality DNA. (2) Iterative mapping can overcome the lack of a close reference sequence, and has been applied to an array of extinct species. Despite its widespread use, the accuracy of the reconstructed assemblies are yet to be comprehensively assessed. Here, we investigated the influence of mapping software (BWA or MITObim), parameters, and bait reference phylogenetic distance on the accuracy of the reconstructed assembly using two simulated datasets: (i) spotted hyena and various mammalian bait references, and (ii) southern cassowary and various avian bait references. Specifically, we assessed the accuracy of results through pairwise distance (PWD) to the reference conspecific mitogenome, number of incorrectly inserted base pairs (bp), and total length of the reconstructed assembly. (3) We found large discrepancies in the accuracy of reconstructed assemblies using different mapping software, parameters, and bait references. PWD to the reference conspecific mitogenome, which reflected the level of incorrect base calls, was consistently higher with BWA than MITObim. The same was observed for the number of incorrectly inserted bp. In contrast, the total sequence length was lower. Overall, the most accurate results were obtained with MITObim using mismatch values of 3 or 5, and the phylogenetically closest bait reference sequence. Accuracy could be further improved by combining results from multiple bait references. (4) We present the first comprehensive investigation of how mapping software, parameters, and bait reference influence mitogenome reconstruction from ancient DNA through iterative mapping. 
Our study provides information on how mitogenomes are best reconstructed from divergent, short-read data. By obtaining the most accurate reconstruction possible, one can be more confident as to the reliability of downstream analyses, and the evolutionary inferences made from them.
Natural history collections are invaluable repositories of biological information that provide an unrivaled record of Earth's biodiversity. Museum genomics—genomics research using traditional museum and cryogenic collections and the infrastructure supporting these investigations—has particularly enhanced research in ecology and evolutionary biology, the study of extinct organisms, and the impact of anthropogenic activity on biodiversity. However, leveraging genomics in biological collections has exposed challenges, such as digitizing, integrating, and sharing collections data; updating practices to ensure broadly optimal data extraction from existing and new collections; and modernizing collections practices, infrastructure, and policies to ensure fair, sustainable, and genomically manifold uses of museum collections by increasingly diverse stakeholders. Museum genomics collections are poised to address these challenges and, with increasingly sensitive genomics approaches, will catalyze a future era of reproducibility, innovation, and insight made possible through integrating museum and genome sciences. Expected final online publication date for the Annual Review of Genetics, Volume 55 is November 2021. Please see for revised estimates.
Full-text available
Interest in the origin and evolution of Equus dates back to over a century, but there is still no consensus on the definition of the genus or its phylogenetic position. We review the placement of Equus within several phylogenetic frameworks and present a phylogenetic analysis of derived Equini, including taxa referred to Equus, Haringtonhippus, Dinohippus, Astrohippus, Hippidion, and Boreohippidion. A new, morphology-based phylogenetic tree was used as an initial hypothesis for discussing what taxa Equus encompasses, using four criteria previously used to define the genus category in mammals: phylogenetic gaps, uniqueness of adaptive zone, crown group definition, and divergence time. According to the phylogenetic gaps criterion, Equus encompasses clade 6 (Ha. francisci = E. francisci, E. conversidens, E. quagga, E. hemionus, E. mexicanus, E. ferus, E. occidentalis, and E. neogeus) based on morphological synapomorphies. Equus is assigned to clade 6, or possibly clade 7, according to the uniqueness of adaptive zone criterion. The crown group criterion places Equus at clade 6. Based on the time-calibrated phylogeny of Equini, the divergence time criterion suggests that Equus encompasses clade 9. This clade comprises all taxa traditionally assigned to Equus analyzed in our study, including the eight taxa listed above as well as E. stenonis, E. idahoensis, and E. simplicidens; the latter two are sometimes referred to the subgenus Plesippus and the former to the subgenus Allohippus. With the exception of the divergence time criterion, the results of our evaluation are congruent in identifying clade 6 as the most suitable position for Equus. The taxonomic implications of delimiting Equus to clade 6 in our phylogenetic tree include elevation of Allohippus and Plesippus to generic rank, assignment of a new genus to "Dinohippus" mexicanus, and synonymy of Haringtonhippus with Equus.
The sequencing of ancient DNA has enabled the reconstruction of speciation, migration and admixture events for extinct taxa¹. However, the irreversible post-mortem degradation² of ancient DNA has so far limited its recovery (outside permafrost areas) to specimens that are not older than approximately 0.5 million years (Myr)³. By contrast, tandem mass spectrometry has enabled the sequencing of approximately 1.5-Myr-old collagen type I⁴, and suggested the presence of protein residues in fossils of the Cretaceous period⁵, although with limited phylogenetic use⁶. In the absence of molecular evidence, the speciation of several extinct species of the Early and Middle Pleistocene epoch remains contentious. Here we address the phylogenetic relationships of the Eurasian Rhinocerotidae of the Pleistocene epoch⁷⁻⁹, using the proteome of dental enamel from a Stephanorhinus tooth that is approximately 1.77-Myr old, recovered from the archaeological site of Dmanisi (South Caucasus, Georgia)¹⁰. Molecular phylogenetic analyses place this Stephanorhinus as a sister group to the clade formed by the woolly rhinoceros (Coelodonta antiquitatis) and Merck’s rhinoceros (Stephanorhinus kirchbergensis). We show that Coelodonta evolved from an early Stephanorhinus lineage, and that this latter genus includes at least two distinct evolutionary lines. The genus Stephanorhinus is therefore currently paraphyletic, and its systematic revision is needed. We demonstrate that sequencing the proteome of Early Pleistocene dental enamel overcomes the limitations of phylogenetic inference based on ancient collagen or DNA. Our approach also provides additional information about the sex and taxonomic assignment of other specimens from Dmanisi.
Our findings reveal that proteomic investigation of ancient dental enamel—which is the hardest tissue in vertebrates¹¹, and is highly abundant in the fossil record—can push the reconstruction of molecular evolution further back into the Early Pleistocene epoch, beyond the currently known limits of ancient DNA preservation.
Horse domestication revolutionized warfare and accelerated travel, trade, and the geographic expansion of languages. Here, we present the largest DNA time series for a non-human organism to date, including genome-scale data from 149 ancient animals and 129 ancient genomes (≥1-fold coverage), 87 of which are new. This extensive dataset allows us to assess the modern legacy of past equestrian civilizations. We find that two extinct horse lineages existed during early domestication, one at the far western (Iberia) and the other at the far eastern range (Siberia) of Eurasia. None of these contributed significantly to modern diversity. We show that the influence of Persian-related horse lineages increased following the Islamic conquests in Europe and Asia. Multiple alleles associated with elite-racing, including at the MSTN “speed gene,” only rose in popularity within the last millennium. Finally, the development of modern breeding impacted genetic diversity more dramatically than the previous millennia of human management.
Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) (Drummond et al., 2002; Mau et al., 1999; Rannala and Yang, 1996) flourishes as a popular approach to uncover the evolutionary relationships among taxa, such as genes, genomes, individuals or species. MCMC approaches generate samples of model parameter values, including the phylogenetic tree, drawn from their posterior distribution given molecular sequence data and a selection of evolutionary models. Visualising, tabulating and marginalising these samples is critical for approximating the posterior quantities of interest that one reports as the outcome of a Bayesian phylogenetic analysis. To facilitate this task, we have developed the Tracer (version 1.7) software package to process MCMC trace files containing parameter samples and to interactively explore the high-dimensional posterior distribution. Tracer works automatically with sample output from BEAST (Drummond et al., 2012), BEAST2 (Bouckaert et al., 2014), LAMARC (Kuhner, 2006), Migrate (Beerli, 2006), MrBayes (Ronquist et al., 2012), RevBayes (Höhna et al., 2016) and possibly other MCMC programs from other domains.
Species and populations are disappearing at an alarming rate as a direct result of human activities. Loss of genetic diversity associated with population decline directly impacts species' long-term survival. Therefore, preserving genetic diversity is of considerable conservation importance. However, to assist in conservation efforts, it is important to understand how genetic diversity is spatially distributed and how it changes due to anthropogenic pressures. In this study, we use historical museum and modern faecal samples of two critically endangered eastern gorilla taxa, Grauer's (Gorilla beringei graueri) and mountain gorillas (Gorilla beringei beringei), to directly infer temporal changes in genetic diversity within the last century. Using over 100 complete mitochondrial genomes, we observe a significant decline in haplotype and nucleotide diversity in Grauer's gorillas. By including historical samples from now extinct populations we show that this decline can be attributed to the loss of peripheral populations rather than a decrease in genetic diversity within the core range of the species. By directly quantifying genetic changes in the recent past, our study shows that human activities have severely impacted eastern gorilla genetic diversity within only four to five generations. This rapid loss calls for dedicated conservation actions, which should include preservation of the remaining peripheral populations.
We used Massively Parallel High-Throughput Sequencing to obtain genetic data from a 145-year-old holotype specimen of the flying lizard, Draco cristatellus. Obtaining genetic data from this holotype was necessary to resolve an otherwise intractable taxonomic problem involving the status of this species relative to closely related sympatric Draco species that cannot otherwise be distinguished from one another on the basis of museum specimens. Initial analyses suggested that the DNA present in the holotype sample was so degraded as to be unusable for sequencing. However, we used a specialized extraction procedure developed for highly degraded ancient DNA samples and MiSeq shotgun sequencing to obtain just enough low-coverage mitochondrial DNA (721 base pairs) to conclusively resolve the species status of the holotype as well as a second known specimen of this species. The holotype was prepared before the advent of formalin-fixation and therefore was most likely originally fixed with ethanol and never exposed to formalin. Whereas conventional wisdom suggests that formalin-fixed samples should be the most challenging for DNA sequencing, we propose that evaporation during long-term alcohol storage and consequent water-exposure may subject older ethanol-fixed museum specimens to hydrolytic damage. If so, this may pose an even greater challenge for sequencing efforts involving historical samples.
We report an update for the MAFFT multiple sequence alignment program to enable parallel calculation of large numbers of sequences. The G-INS-1 option of MAFFT was recently reported to have higher accuracy than other methods for large data, but this method has been impractical for most large-scale analyses, due to the requirement of large computational resources. We introduce a scalable variant, G-large-INS-1, that has equivalent accuracy to G-INS-1 and is applicable to 50,000 or more sequences. Availability: This feature is available in MAFFT versions 7.355 or later. Supplementary information: Supplementary data are available at Bioinformatics online.
Revisiting the origins of modern horses. The domestication of horses was very important in the history of humankind. However, the ancestry of modern horses and the location and timing of their emergence remain unclear. Gaunitz et al. generated 42 ancient-horse genomes. Their source samples included the Botai archaeological site in Central Asia, considered to include the earliest domesticated horses. Unexpectedly, Botai horses were the ancestors not of modern domestic horses, but rather of modern Przewalski's horses. Thus, in contrast to current thinking on horse domestication, modern horses may have been domesticated in other, more Western, centers of origin. Science, this issue p. 111
Establishing true phylogenetic relationships between populations is a critical consideration when sourcing individuals for translocation. This presents huge difficulties with threatened and endangered species that have become extirpated from large areas of their former range. We utilise ancient DNA (aDNA) to reconstruct the phylogenetic relationships of a keystone species which has become extinct in Britain, the Eurasian beaver Castor fiber. We sequenced seventeen 492 bp partial tRNAPro and control region sequences from Late Pleistocene and Holocene age beavers and included these in network, demographic and genealogy analyses. The mode of postglacial population expansion from refugia was investigated by employing tests of neutrality and a pairwise mismatch distribution analysis. We found evidence of a pre-Late Glacial Maximum ancestor for the Western C. fiber clade which experienced a rapid demographic expansion during the terminal Pleistocene to early Holocene period. Ancient British beavers were found to originate from the Western phylogroup but showed no phylogenetic affinity to any one modern relict population over another. Instead, we find that they formed part of a large, continuous, pan-Western European clade that harbored little internal substructure. Our study highlights the utility of aDNA in reconstructing population histories of extirpated species which has real-world implications for conservation planning.
Entering into the world of ancient DNA research is nontrivial. Because the DNA in most ancient specimens is degraded to some extent, the potential is high for contamination of ancient samples, ancient DNA extracts, and genomic sequencing libraries prepared from these extracts with non-degraded DNA from the present-day environment. To minimize the risk of contamination in ancient DNA environments, experimental protocols specific to handling ancient specimens, including those that outline the design and layout of laboratory space, have been introduced. Here, we outline challenges associated with working with ancient samples, including providing guidelines for setting up a new ancient DNA laboratory. We also discuss steps that can be taken at the sample collection and preparation stage to minimize the potential for contamination of ancient DNA experiments with exogenous sources of DNA.