This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For
commercial re-use, please contact journals.permissions@oup.com
© The Author(s) 2021. Published by Oxford University Press on behalf of Entomological Society of America.
Molecular Phylogenetics, Phylogenomics, and Phylogeography
Museomics: Phylogenomics of the Moth Family
Epicopeiidae (Lepidoptera) Using Target Enrichment
ElsaCall,1,5, ChristophMayer,2, VictoriaTwort,1,3, LarsDietz,2, NiklasWahlberg,1, and
MarianneEspeland4,
1Department of Biology, Lund University, 22362 Lund, Sweden, 2Statistical Phylogenetics and Phylogenomics, Zoological Research Museum Alexander
Koenig, 53113 Bonn, Germany, 3University of Helsinki, Finnish Natural History Museum, Luomus, Helsinki, Finland, 4Arthropoda Department, Zoological
Research Museum Alexander Koenig, 53113 Bonn, Germany, and 5Corresponding author, e-mail: elsa.call.fr@gmail.com
Subject Editor: MarkoMutanen
Received 25 May 2020; Editorial decision 27 October 2020
Abstract
Billions of specimens can be found in natural history museum collections around the world, holding potential
molecular secrets to be unveiled. Among them are intriguing specimens of rare families of moths that, while
represented in morphology-based works, are only beginning to be included in genomic studies: Pseudobistonidae,
Sematuridae, and Epicopeiidae. These three families are part of the superfamily Geometroidea, which has
recently been defined based on molecular data. Here we chose to focus on these three moth families to
explore the suitability of a genome reduction method, target enrichment (TE), on museum specimens. Through
this method, we investigated the phylogenetic relationships of these families of Lepidoptera, in particular the
family Epicopeiidae. We successfully sequenced 25 samples, collected between 1892 and 2001. We use 378
nuclear genes to reconstruct a phylogenetic hypothesis from the maximum likelihood analysis of a total of 36
different species, including 19 available transcriptomes. The hypothesis that Sematuridae is the sister group
of Epicopeiidae + Pseudobistonidae had strong support. This study thus adds to the growing body of work,
demonstrating that museum specimens can successfully contribute to molecular phylogenetic studies.
Key words: Museomics, museum sample, target enrichment, phylogenomics, Lepidoptera
Over 3 billion specimens are estimated to be found in natural history
museum collections around the world, representing one of the most
important biobanks in the world (Duckworth et al. 1993, Suarez
and Tsutsui 2004, Chapman 2005). Until recently, this vast
biological resource was mainly used for morphological studies
because the DNA from these specimens was thought to be too
degraded to be used for molecular studies (Shapiro and Hofreiter
2012). As a result, DNA work was for a long time limited
to species for which freshly collected samples could be obtained,
while molecular work from collections was restricted to Sanger
sequencing of short fragments of DNA (Hajibabaei et al. 2006,
Lozier and Cameron 2009, Strutzenberger et al. 2012, Hebert et al.
2013, Cameron et al. 2016). Moreover, the methods were often
destructive for the specimens (Hajibabaei et al. 2006, Strutzenberger
et al. 2012, Hebert et al. 2013). Recently, high-throughput
sequencing technologies have made the DNA in museum specimens
more accessible, either through whole-genome sequencing (Cong
et al. 2017, Sproul and Maddison 2017, Allio et al. 2019, Li et al.
2019, Zhang et al. 2019) or through genome reduction methods
(Suchan et al. 2016, Breinholt et al. 2018, Toussaint et al. 2018).
These advanced sequencing approaches have opened up a new field
with great potential for studying the evolutionary history of taxa
that are difficult to collect: museomics.
The family Epicopeiidae is a small Asian family of Lepidoptera
represented by 25 species (Minet 2002, Wei and Yen 2017, Zhang
et al. 2020). Many of them are large diurnal species mimicking
butterflies in the families Papilionidae and Pieridae. The history
of the family has been dynamic. Epicopeiidae was originally
described to harbor only one genus, Epicopeia Westwood, 1841
(Laithwaite and Whalley 1975). The pierid-like moths Nossa Kirby,
1892, were previously assigned to the family Epiplemidae (now
considered a subfamily of Uraniidae), but were then rightly placed
in Epicopeiidae by Fletcher (1979), a placement later confirmed by Minet
(1983, 1986). In these later studies, Minet (1983, 1986) added five
genera to Epicopeiidae: Amana Walker, 1855; Chatamla Moore,
1881; Parabraxas Leech, 1897; Psychostrophia Butler, 1877; and
Schistomitra Butler, 1881. In 2002, Minet described two new genera,
Deuveia and Burmeia. Finally, in 2017, the number of genera
increased to 10 with the description of Mimaporia by Wei and Yen.
The family was thought to be related to Drepanidae and was placed
in the superfamily Drepanoidea (Minet 2002), until recent
molecular data suggested that they are in fact related to the superfamily
Insect Systematics and Diversity, (2021) 5(2): 6; 1–10
doi: 10.1093/isd/ixaa021
Research
Copyedited by: OUP
Geometroidea (Regier etal. 2009, Bazinet etal. 2013, Rajaei et al.
2015). The sister group of Epicopeiidae has been suggested to be
the recently described Pseudobistonidae (Rajaei etal. 2015, Wang
etal. 2019). Minet (2002) studied the relationships of genera within
Epicopeiidae based on 34 morphological characters obtained from
the head, thorax, pregenital abdomen, and male genitalia. He found
that Deuveia was sister to the rest of Epicopeiidae and that the rela-
tionships of the other genera were relatively clear (Fig.1, left side).
However, the position of Amana was not stable; it was either sister
to Chatamla + Parabraxas or sister to a clade containing Chatamla,
Parabraxas, Schistomitra, Nossa, and Epicopeia.
The rst attempt to infer the phylogeny of the family based on
genetic markers was done by Wei and Yen (2017). They used se-
quence data for three gene regions (COI, EF-1α, and 28S) and 14
species. Their study was mainly focused on describing a new genus,
Mimaporia, but they sampled widely throughout the family. The re-
sults of their analyses are highly incongruent with those of Minet
(2002), but showed poor or no support on many branches. Wei and
Yen (2017) showed that Epicopeia and Nossa likely are paraphy-
letic, and they were not able to resolve the relationships of the new
genus Mimaporia with any condence (Fig.1).
Recently, Zhang etal. (2020) used PCR-generated baits to infer
a multilocus phylogenetic hypothesis for Epicopeiidae based on 18
species and 94 loci. Their results were highly congruent with Minet’s
(2002) results based on morphology and also found that Epicopeia
and Nossa both were paraphyletic with regard to each other. In add-
ition to using fresh specimens, Zhang etal. (2020) used older speci-
mens with some degree of success, although they were able to recover
a signicantly smaller number of loci from the older specimens.
Epicopeiidae species are generally rare and difficult to collect,
as they are mainly distributed in areas that are not easy to access;
nevertheless, they can be found in natural history museums. Here
we investigate the use of target enrichment (TE) methods to study
the phylogenetic relationships of this family of Lepidoptera based
only on museum specimens. Genome reduction methods, such as
TE, aim to sequence only specific segments of the genome. In the
case of highly fragmented genomes (e.g., museum specimens), such
genome reduction methods might be a very useful way of gathering
data for phylogenetic studies. To study phylogenetic relationships
among species, one usually analyzes an a priori known set of
genetic markers, e.g., a set of single-copy, protein-coding, homologous
genes. By targeting specific genes of interest, the TE method can be
particularly relevant for phylogenetic studies. However, it has
generally been thought that such reduction methods require good-quality
DNA from fresh or properly stored tissue (Lemmon and Lemmon
2013, Jones and Good 2016). Regardless, TE methods have been
used successfully on stored DNA extractions (Faircloth et al. 2012,
McCormack et al. 2013), as well as museum specimens (Bi et al.
2013, Cruz-Dávalos et al. 2017, St Laurent et al. 2018).
Materials andMethods
Taxon Sampling
Specimens were taken from the collection at the Zoological
Research Museum Alexander Koenig (ZFMK, Bonn, Germany).
We sampled 16 available species of Epicopeiidae, including at most
four specimens per species. In addition, we sampled two species
of Sematuridae (Anurapteryx interlineata and Mania empedocles)
and two specimens of Pseudobistonidae (Pseudobiston pinratanai)
to investigate the relationships between these three families. In
total, 33 museum specimens collected between 1892 and 2001
were included (Table 1). The oldest sample is a Parabraxas davidi
(Oberthür, 1885) specimen from 1892, whereas the most recent one
is Parabraxas flavomarginaria (Leech, 1897) from 2001 (Table 1).
We were not able to acquire samples of the genera Chatamla
(Moore, 1881), Burmeia (Minet, 2002), Mimaporia (Wei and Yen,
2017), or Amana (Walker, 1855), or samples of Heracula discivitta,
Fig. 1. Simplified representation of Epicopeiidae phylogenetic relationships according to Minet (2002) (left) and Wei and Yen (2017) (right). Each genus has a
specific color. Minet’s alternative hypothesis about the position of Amana is represented by gray lines.
which was recently moved to the family Pseudobistonidae (Wang
et al. 2019). Details for all the specimens included can be found on
Zenodo (doi:10.5281/zenodo.3769000).
Sample Preparation and DNA Extractions
We used a semidestructive approach, i.e., we removed the abdomen
for DNA extraction without grinding the tissue, thus preserving the
genitalia for future preparation (Hundsdoerfer and Kitching 2010).
Genitalia dissections are routinely done for Lepidoptera by boiling
abdomens in KOH to remove soft tissue, thus destroying the DNA in
the process, so our approach is less destructive than what is normally
done. For large specimens (like Nossa or Epicopeia), the abdomen
was cut in half above the genitalia to ensure that it fit inside a
1.5-ml Eppendorf tube. Abdomens were first soaked in 180 µl H2O for
about 5 min to rehydrate tissues. Water was removed before starting
DNA extractions. Samples were lysed overnight (approximately 12–18 h)
at 56°C with shaking at 350 rpm in a thermomixer.
We used the DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany)
and followed the standard DNA extraction protocol for tissues, with
the following modifications: we included an RNase-digestion step
and eluted the DNA in Milli-Q water. Finally, the DNA concentration of
each sample was quantified using a Quantus Fluorometer (Promega,
Madison, WI), and fragment lengths were measured with a Fragment
Analyzer (Advanced Analytical, now Agilent Technologies Inc.,
Santa Clara, CA).
Library Preparation, TE, and Sequencing
There is still no consensus on how the DNA in museum specimens
is best accessed. Here we used TE, a genome reduction approach. TE
methods use probes designed to target specific regions of the genome
(Breinholt et al. 2018, Toussaint et al. 2018, Espeland et al. 2019).
In the case of phylogenetic studies, this approach has the main
advantage of recovering exactly the loci of interest, and as long as a probe
kit exists for a group, no previous knowledge about the genomes of
the group of interest is required. This approach follows three major
steps: 1) bait design, 2) library preparation and sequencing, and
3) filtering and processing of the data.
Regarding the bait design, new genes were selected and added
to the Butterfly1.0 kit by Espeland et al. (2018). Mayer et al. (2021)
designed hybrid enrichment baits with BaitFisher software version
1.2.8 (Mayer et al. 2016). A bait length of 120 bp was specified
with a clustering threshold of 0.15, and a tiling design of 3 baits
per bait region with an overlap of 60 bp between two consecutive baits,
resulting in bait regions with a total length of 240 bp. Individual
coding sequences (CDS) from Danaus plexippus (Linnaeus),
Melitaea cinxia (Linnaeus), Heliconius melpomene (Linnaeus),
Papilio glaucus (Linnaeus), Plutella xylostella (Linnaeus), Bombyx
mori (Linnaeus), and Manduca sexta (Linnaeus) were used as
references. The LepZFMK1.0 kit includes 2,954 probe regions in
different CDS regions belonging to 1,754 genes and is compatible with
BUTTERFLY1.0 (Espeland et al. 2018) and partially compatible
Table 1. Number of raw recovered loci and selected loci per specimen
Family            Species                                   Specimen  Collection year  Raw loci  Loci found in at least 20 specimens  Reference
Epicopeiidae      Deuveia banghaasi (Hering, 1932)          S35       1936             12        11                                   This study
                  D. banghaasi                              S37       1936             666       353                                  This study
                  Epicopeia hainseii (Holland, 1889)        S51       1932             549       327                                  This study
                  E. hainseii                               S53       1932             936       373                                  This study
                  E. hainseii (Moore, 1874)                 S43       1951             1,063     374                                  This study
                  E. hainseii                               S45       2001             1,383     376                                  This study
                  E. philenora (Westwood, 1841)             S55       1937             736       358                                  This study
                  E. philenora                              S57       1938             467       306                                  This study
                  E. polydora (Westwood, 1841)              S47       1992             1,270     378                                  Mayer et al. (2021)
                  E. polydora                               S49       1932             9         8                                    This study
                  Nossa moorei (Elwes, 1890)                S11       1931             6         5                                    This study
                  N. moorei                                 S13       1931             210       185                                  This study
                  N. nagaensis (Elwes, 1890)                S9        1991             1         365                                  This study
                  N. nelcinna (Moore, 1875)                 S3        1932             0         0                                    This study
                  N. palaearctica (Staudinger, 1887)        S5        1989             1         1                                    This study
                  N. palaearctica                           S7        1990             1,202     375                                  This study
                  N. palaearctica chinensis                 S1        1937             4         3                                    This study
                  Parabraxas davidi (Oberthür, 1885)        S17       1892             516       330                                  Mayer et al. (2021)
                  P. davidi                                 S19       1957             1         1                                    This study
                  P. davidi                                 S21       1906             0         0                                    This study
                  P. flavomarginaria (Leech, 1897)          S23       2001             982       351                                  This study
                  P. nigromacularia (Leech, 1897)           S25       1999             1,215     378                                  This study
                  Psychostrophia endoi (Inoue, 1992)        S27       1995             1,275     373                                  This study
                  P. melanargia (Butler, 1877)              S39       1956             1,115     372                                  This study
                  P. melanargia                             S41       1934             780       362                                  This study
                  P. nymphidiaria (Oberthür, 1893)          S31       1938             848       367                                  This study
                  P. nymphidiaria                           S33       1946             739       364                                  Mayer et al. (2021)
                  P. picaria (Leech, 1897)                  S29       2001             1,219     375                                  This study
                  Schistomitra funeralis (Butler, 1881)     S15       1966             151       131                                  This study
Pseudobistonidae  Pseudobiston pinratanai (Inoue, 1994)     S2        1999             232       170                                  Mayer et al. (2021)
                  P. pinratanai                             S63       1999             890       361                                  Mayer et al. (2021)
Sematuridae       Anurapteryx interlineata (Walker, 1854)   S61       ?                928       376                                  Mayer et al. (2021)
                  Mania empedocles (Cramer, 1782)           S59       1960             431       283                                  This study
with LEP1 (Breinholt etal. 2018). For more details on the kits, see
Mayer etal. (2021). In many cases, multiple exons of single genes
were targeted when they were longenough.
Library preparation was performed at the Zoological Research
Museum Alexander Koenig (Bonn, Germany). Most of our samples
contained less than 100 ng of genomic DNA, which is the amount
called for by the standard protocol, but we included them
anyway. With the Fragment Analyzer we found that many fragments
of our samples were around 140 bp; therefore, no fragmentation was
necessary for these samples. Other samples with higher quality and
longer fragments were fragmented with a Bioruptor PICO sonicator
(Diagenode, Seraing, Belgium) to obtain DNA fragments with an
approximate length of 350 bp.
We repaired the DNA with NEBNext FFPE DNA Repair Mix
(NEB, Ipswich, United Kingdom), following the manufacturer’s
protocol. We purified the reactions with Agencourt AMPure
XP beads with a ratio of (1:3). We quantified the resulting
libraries with a Quantus Fluorometer (Promega) and checked their quality
with a Fragment Analyzer (Advanced Analytical, now Agilent
Technologies Inc.).
We proceeded to the enrichment and capture steps with the
Agilent SureSelect XT2 protocol, with additional modifications
following Bank et al. (2017). Enrichment and sequencing were done
at StarSEQ GmbH (Mainz, Germany) on Illumina NextSeq 500
Systems with a read length of 150 bp. Exons found in at least 20 of
the 33 specimens (with an average of 254 loci per specimen) were
used for downstream phylogenetic analyses (Table 1). Sequencing
data are available at the NCBI under BioProject PRJNA684488.
Data Clean up and Assembly
Reads were trimmed with fastq-mcf (Aronesty 2011) using
default parameters to remove adapters and low-quality regions. Data
cleaning and assembly were done using the iterated bait assembly
(IBA) pipeline (Breinholt et al. 2018) with default parameters,
except that the paired gap length was set to 100 (-g 100). Genomic
sequences of the target regions from D. plexippus were used as a
reference for the IBA pipeline. In brief, reads similar to the reference
sequence were identified with USEARCH (Edgar 2010) and assembled
with Bridger (Chang et al. 2015). The resulting assembly was then
used as a reference sequence for another run of USEARCH, and this
process was repeated three times.
Alignments
The loci were aligned using the FFT-NS-i algorithm with two
iterations in MAFFT v.7 (Katoh and Standley 2013) prior to
phylogenetic analyses. Alignments were trimmed to the probe regions
by using TrimAl (Capella-Gutierrez et al. 2009), with the options
‘-gapthreshold’ and ‘-conserve’. These options were used
to remove gaps. The alignment cleanup was performed with
HmmCleaner (Di Franco et al. 2019), which allows the detection
and removal of primary sequence errors in multiple alignments. We
used the options ‘-costs’ and ‘--noX’ and defined the four costs
as follows: −0.15, −0.08, 0.15, and 0.45. We subsequently manually
checked for frame shifts, gaps, and codon positions. Finally,
alignments containing fewer than 20 samples (excluding references) were
discarded from the downstream analysis. The final filtered data set
consisted of 378 genes.
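The occupancy filter described above (discard alignments with fewer than 20 non-reference samples) might be sketched as follows. The data layout (a dictionary of locus name to sample sequences), the function names, and the rule that an all-gap or all-N sequence does not count as present are assumptions for illustration, not the authors' implementation.

```python
def count_samples(sequences, references):
    """Count non-reference samples that carry at least one real base."""
    return sum(
        1
        for name, seq in sequences.items()
        if name not in references and seq.replace("-", "").replace("N", "")
    )


def filter_loci_by_occupancy(alignments, references=frozenset(), min_samples=20):
    """Keep only loci represented by at least `min_samples` non-reference samples."""
    return {
        locus: seqs
        for locus, seqs in alignments.items()
        if count_samples(seqs, references) >= min_samples
    }


# Toy data (hypothetical); a threshold of 2 is used only to keep the example small.
toy = {
    "locus_A": {"ref": "ATGC", "S1": "ATGC", "S2": "AT-C", "S3": "----"},
    "locus_B": {"ref": "ATGC", "S1": "ATGC", "S2": "----", "S3": "----"},
}
kept = filter_loci_by_occupancy(toy, references={"ref"}, min_samples=2)
print(sorted(kept))  # ['locus_A']
```

With the study's threshold, the call would use `min_samples=20`.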
Screening Available Genomes and Transcriptomes
An additional 19 taxa were added to our data set by mining
available genomes and transcriptomes, including one epicopeiid and
one sematurid (Table 2). Twelve of the transcriptomes were from
the superfamily Geometroidea; the remaining ones were from other
macroheteroceran superfamilies. Raw reads were downloaded from
the NCBI Sequence Read Archive (Leinonen et al. 2011). Reads were
first processed to remove low-quality regions (Q < 30), adapters,
and homopolymer stretches using Cutadapt 1.4.1 (Martin 2011;
minimum read length 50 bp) and Prinseq 0.20.4 (Schmieder and
Edwards 2011), respectively. De novo assembly was carried out with
Trinity 2.0.6 (Grabherr et al. 2011, Haas et al. 2013), with default
parameters, including a minimum contig length of 100 bp and a
minimum kmer coverage of 5.
Identification of the 378 genes was carried out with a BLAST
approach. A reference sequence set was created from the TE alignments
from one representative per gene. A tblastn (Gertz et al. 2006) search
of the reference set against the transcriptomes (e-value threshold
Table 2. List of the 19 available transcriptomes and genomes added to this study
Family         Subfamily     Species                                    Source/accession numbers
Bombycidae     Bombycinae    Bombyx mori (Linnaeus, 1758)               SilkDB
Crambidae      Crambinae     Chilo suppressalis (Walker, 1863)          LepBase v4
Erebidae       Arctiinae     Callimorpha dominula (Linnaeus, 1758)      SRR1191023
Epicopeiidae   -             Epicopeia hainseii (Holland, 1889)         SRR1021610
Geometridae    Larentiinae   Operophtera brumata (Linnaeus, 1758)       LepBase v4
               Ennominae     Biston betularia (Linnaeus, 1758)          SRR1021599
                             Biston suppressaria (Guenée, 1858)         SRR1777716
                             Ectropis obliqua (Prout, 1915)             SRR3056076
                             Macaria distribuaria (Hübner, 1825)        SRR1299213
               Geometrinae   Chlorosea margaretaria (Sperry, 1944)      SRR1021603
                             Nemoria lixaria (Guenée, 1858)             SRR1299347
               Sterrhinae    Idaea eremiata (Hulst, 1887)               SRR1021615
Noctuidae      Hadeninae     Spodoptera frugiperda (Smith, 1797)        SRR3406055
Notodontidae   Nystaleinae   Notoplusia minuta (Druce, 1900)            SRR1299746
Pyralidae      Phycitinae    Amyelois transitella (Walker, 1863)        LepBase v4
Sematuridae    -             Mania lunus (Linnaeus, 1758)               SRR1299318
Sphingidae     Sphinginae    Manduca sexta (Linnaeus, 1763)             LepBase v4
Uraniidae      Uraniinae     Lyssa zampa (Butler, 1869)                 SRR1299769
               Epipleminae   Calledapteryx dryopterata (Grote, 1868)    SRR1021601
Published transcriptomes have their SRA accession numbers listed. SRA, NCBI Sequence Read Archive.
10e-5) was carried out. The resulting BLAST output was used to
extract the coding regions from each assembly using a set of open
access Python scripts from Dr. C. Peña (PyPhyloGenomics,
https://github.com/carlosp420/PyPhyloGenomics). The extracted sequences
were aligned to the existing alignment with MAFFT 7.266 (Katoh
and Standley 2013) using the ‘add fragments’ and ‘auto’ options,
to preserve existing gaps in the alignment and choose the most
appropriate alignment strategy, respectively. The resulting alignments
were manually screened to ensure accurate alignment and frame
preservation.
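As an illustration of the hit-filtering step, the sketch below keeps the best-scoring hit per reference gene from BLAST tabular output. It assumes the search was written in the standard 12-column tabular format (`-outfmt 6`, with the e-value in column 11 and the bit score in column 12), which the text does not state; the function name and the example rows are hypothetical, and this is not the PyPhyloGenomics code. The threshold is passed exactly as written in the text (`10e-5`).

```python
import csv
import io


def best_hits(blast_tabular_text, max_evalue):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query."""
    best = {}
    for row in csv.reader(io.StringIO(blast_tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # hit too weak
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical rows: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
EXAMPLE = (
    "gene1\tcontig_7\t85.0\t120\t18\t0\t1\t120\t300\t659\t1e-40\t150\n"
    "gene1\tcontig_9\t60.0\t80\t32\t0\t1\t80\t10\t249\t2e-10\t60\n"
    "gene2\tcontig_3\t40.0\t50\t30\t0\t1\t50\t5\t154\t0.5\t25\n"
)
hits = best_hits(EXAMPLE, max_evalue=10e-5)
print(hits["gene1"][0])  # contig_7
```

Here `gene2` is dropped because its only hit exceeds the e-value threshold.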
Phylogenetic Analyses
To partition our data set, we calculated the relative rates of
evolution for each site in the alignment using TIGER (Cummins and
McInerney 2011) and created partitions using the RatePartitions
algorithm (Rota et al. 2018). We tested a range of d values (1.1,
1.5, 2.0, 3.0, and 4.0), which affects the number of partitions, and
calculated the Bayesian information criterion (BIC) values for each
partitioning scheme in PartitionFinder2 (Guindon et al. 2010,
Frandsen et al. 2015, Lanfear et al. 2017). The partitioning scheme
with the highest BIC value was found for d = 2.0, which resulted in
14 subsets.
Using the optimal partitioning scheme, we inferred the
phylogenetic relationships with IQ-TREE 1.6.10 (Nguyen et al.
2015, Chernomor et al. 2016) under the maximum likelihood
(ML) criterion. We used the model finding option in IQ-TREE
(Kalyaanamoorthy et al. 2017) to find the optimal model for each
partition. To investigate the robustness of our inferences, we used
1,000 ultrafast bootstraps (-bb; Hoang et al. 2018) and 1,000
replicates for SH-aLRT (-alrt; Guindon et al. 2010), which is the
minimum recommended number.
Results
Genes
We recovered a total of 2,131 raw loci. From our total of 33
specimens, two (6%) provided no data: Nossa nelcinna (S3) and P. davidi
(S21). Six specimens provided only 1–12 raw loci (18%); for 16
specimens, we obtained between 150 and 1,000 loci (48%); finally, nine
specimens gave more than 1,000 loci, with a maximum of 1,383 loci
recovered (27%; Table 1, Fig. 2).
There is a positive correlation between the collection date of the
specimens and the number of recovered loci (rho = 0.46, P = 0.008;
Fig. 2). As expected, the younger a specimen is, the more loci we can
recover from it. However, there is a lot of variation, meaning some
recently collected specimens can give fewer loci than specimens
collected a long time ago. This is the case, for example, in two specimens of
P. davidi. We recovered 516 raw loci from the older of the two,
collected in 1892, whereas the more recent one (1957) provided only a
single raw locus.
We obtained on average 254 loci and a median of 353 loci per
specimen (Table 1). For our phylogenetic analyses, we first used all the 31
specimens that produced some data, including the 6 from Mayer et al.
(2021). The samples Epicopeia philenora (S57) and Nossa palaearctica
(S5) appeared to be contaminated, as their phylogenetic position in
preliminary analyses was highly doubtful, and thus they were excluded
from the rest of our analyses.
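The two summary statistics can be reproduced from the "Loci found in at least 20 specimens" column of Table 1. The sketch below simply transcribes that column in table order (33 specimens) and uses Python's standard `statistics` module; the variable name is arbitrary.

```python
from statistics import mean, median

# "Loci found in at least 20 specimens" column of Table 1, in table order.
selected_loci = [
    11, 353, 327, 373, 374, 376, 358, 306, 378, 8,
    5, 185, 365, 0, 1, 375, 3, 330, 1, 0,
    351, 378, 373, 372, 362, 367, 364, 375, 131, 170,
    361, 376, 283,
]

print(len(selected_loci))          # 33
print(round(mean(selected_loci)))  # 254
print(median(selected_loci))       # 353
```

The large gap between mean and median reflects the handful of specimens that yielded almost no loci.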
Our nal data set comprised 37 species, including 20 species
sequenced for this study and 17 outgroup species with published
transcriptomes. The data matrix included 378 nuclear loci (327
genes), for a total alignment of 134,881 base pairs. The average
length of the 378 loci involved in this study is 367bp.
Model Selection and Phylogenetic Analyses
The ML analyses for the different models tested gave the same
phylogenetic relationships, and there were no conflicting nodes.
The taxon data set, extended with 17 outgroup species, analyzed in
IQ-TREE resulted in a highly supported ML tree (Fig. 3). We also
performed the same phylogenetic analyses after excluding
specimens with fewer than 10 loci, and we obtained the same topology (Supp
Material 1 [online only]), indicating that the necessarily somewhat
limited data recovered from old specimens are of sufficient quality
for phylogenetic analysis. Although our data set gave strong support
for many of the branches, the relationships among the Noctuoidea,
Bombycoidea, and Geometroidea were weakly supported.
The monophyly of Epicopeiidae is strongly supported, and the
sister group is Pseudobistonidae, with Sematuridae being sister to
these two, also with strong support (SH-like = 100, UFBoot = 100).
Within Epicopeiidae, almost all relationships are strongly
supported, with the exception of the position of Schistomitra funeralis
(SH-like = 54.8, UFBoot = 81). Relationships of genera are
congruent with Minet (2002) and Zhang et al. (2020), i.e., Deuveia is
sister to the rest of Epicopeiidae, with Psychostrophia branching off
next, then Schistomitra, and finally Parabraxas being sister to a clade
containing paraphyletic Epicopeia and Nossa (Fig. 3).
Within genera, species for which two or more individuals were
included were mainly monophyletic, with the exception of Epicopeia
hainseii and Epicopeia polydora, which were intermixed in a clade
with very short branches (Fig. 3). The branch leading to P. davidi has
weak support values (66.1/94), and this species appears to be
genetically very closely related to P. flavomarginaria. In addition, Nossa
moorei is not genetically differentiated from Nossa nagaensis, and the
two are also morphologically very similar (Fig. 3).
Discussion
Phylogenetic Relationships
Within Epicopeiidae, our results strongly support and are almost
entirely congruent with the relationships suggested by Minet (2002)
and Zhang et al. (2020) and thus highly incongruent with the
results of Wei and Yen (2017). We find Deuveia to be sister to the
rest of Epicopeiidae, with the monophyletic Psychostrophia being
sister to all remaining taxa excluding Deuveia (Fig. 3). Wei and Yen
(2017) found Parabraxas to be sister to Psychostrophia, but our
Fig. 2. Number of raw loci recovered for each sample per year of collection.
The dashed line is for reference and represents the trend. Plot made in R.
results place Parabraxas in a clade with Schistomitra and (Epicopeia
+ Nossa) with strong support.
The position of S.funeralis (which has 131 loci in our dataset)
was incongruent with the hypothesis by Minet (2002), but with low
support. In our study, we found Schistomitra to be the sister group
of Parabraxas + (Epicopeia + Nossa) (Fig.3), whereas Minet (2002)
found it to be the sister group of Epicopeia + Nossa (Fig.1), and
Zhang etal. (2020) found it to be sister to Parabraxas + Chatamla.
Wei and Yen (2017) found it to be sister to Chatamla + the newly
described genus Mimapora, and this clade to be closer to Parabraxas
+ Psychostrophia than to Epicopeia + Nossa (Fig.1). However, we
are not able to condently resolve the relationships of Schistomitra,
Parabraxas, and (Epicopeia + Nossa). Our study does not include the
taxa Amana, Chatamla, or Mimapora, which are all potentially re-
lated to Schistomitra and Parabraxas (Minet, 2002). All four genera,
Schistomitra, Amana, Chatamla, and Mimapora, are currently being
considered to be monotypic, and their relationships based on morph-
ology are somewhat enigmatic (Minet 2002, Wei and Yen 2017).
Zhang etal. (2020) did include all four genera, and they were able to
resolve their phylogenetic positions with condence.
As in Zhang et al. (2020), we nd that Nossa and Epicopeia
are paraphyletic with regard to each other. Indeed, E.philenora ap-
pears to be the sister group to N.moorei and N.nagaensis, whereas
N.palaeartica comes out as related to E.hainseii and E.polydora.
Fig. 3. Phylogenetic tree from maximum likelihood analysis of 36 taxa based on 378 loci. If the support values are not displayed on the branch, it means it is
equal to 100/100. When displayed, numbers are the SH-aLRT support (%)/ultrafast bootstrap support (%). The images are representative species (indicated with
numbers; not to scale). The three families are represented by an arrow and a letter. S, Sematuridae; P, Pseudobistonidae; and E, Epicopeiidae.
Furthermore, these relationships are well supported. Minet (2002)
also finds the two genera to be closely related and sharing six
apomorphic character states, despite being superficially quite distinct,
with Epicopeia species tending to mimic papilionids, and Nossa
species tending to mimic pierid species (Fig. 3). Clearly, these two genera
need to be studied in more detail by including all 12 described
species. It is possible that the genera should be synonymized, in which
case Epicopeia would have priority. Also, we found E. hainseii and
E. polydora to be genetically inseparable based on our dataset. In
contrast, Zhang et al. (2020) find these two taxa to be completely
separate, with E. polydora being sister to N. moorei, in a similar
position to our E. philenora. Zhang et al. (2020) did not sample
E. philenora, but E. polydora and E. philenora are morphologically
very similar, suggesting that our sequences may be contaminants.
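The statement that Epicopeia and Nossa are not monophyletic with regard to each other can be checked mechanically on a rooted tree. In the sketch below the tree is a nested tuple that encodes only the two groupings described in the text (E. philenora with N. moorei + N. nagaensis; N. palaearctica with E. hainseii + E. polydora); joining those two groups as sister clades, the internal arrangement of the second group, and the tip labels are simplifying assumptions for illustration, not the full topology of Fig. 3.

```python
def leaves(node):
    """Return the set of tip labels below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the tips in `group`."""
    group = set(group)

    def visit(node):
        if leaves(node) == group:
            return True
        if isinstance(node, str):
            return False
        return any(visit(child) for child in node)

    return visit(tree)


# Simplified, assumed arrangement of the Epicopeia + Nossa clade.
tree = (
    ("E_philenora", ("N_moorei", "N_nagaensis")),
    ("N_palaearctica", ("E_hainseii", "E_polydora")),
)
epicopeia = {"E_philenora", "E_hainseii", "E_polydora"}
nossa = {"N_moorei", "N_nagaensis", "N_palaearctica"}

print(is_monophyletic(tree, epicopeia))          # False
print(is_monophyletic(tree, nossa))              # False
print(is_monophyletic(tree, epicopeia | nossa))  # True
```

Neither genus forms a clade on its own, while the two together do, which is the pattern that motivates the possible synonymy discussed above.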
The paraphyly of Epicopeia and Nossa is surprising. These
two genera are superficially very different morphologically, with
Epicopeia species showing distinct tails on the hindwings, whereas
Nossa species lack these tails. Indeed, Minet separated these two
genera on morphological characters, including their genitalia (Minet
2002). However, one should keep in mind that Epicopeia species
mimic butterflies in the genera Papilio and Byasa (which
have tails on the hindwings), whereas Nossa is thought to mimic
species of Pieridae (which do not have tails; Wei and Yen 2017, Zhang
et al. 2020). It has been suggested that mimicry might be one of
the causes of the rapid divergence of phenotypes (Turner 1976,
Counterman et al. 2010, Kozak et al. 2015). Thus, in further work,
we need to investigate this aspect by including more species and
individuals of Epicopeia and Nossa.
Within Epicopeiidae, specimens with few loci explain most
branches with low support (the exception being Schistomitra, described
above). When we removed the four specimens with fewer
than 10 loci (see Table 1) from our analyses, the relationships did
not change, whereas support improved greatly, reaching the maximum
value of 100/100 on some branches, such as those for N. moorei and
N. nagaensis and for the relationship between Epicopeia hainseii and
E. polydora (Supp Material 1 [online only]). This indicates
that specimens with few loci affect only the support values,
not the general topology.
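The sensitivity check described above amounts to a simple filter on per-specimen locus counts. The following is a minimal sketch of that idea, not the pipeline used in this study: the function name, the specimen labels, and the counts are hypothetical placeholders, and only the threshold of 10 loci is taken from the text.

```python
def drop_low_coverage(locus_counts, min_loci=10):
    """Return only the specimens with at least `min_loci` recovered loci."""
    return {name: n for name, n in locus_counts.items() if n >= min_loci}


# Hypothetical counts for illustration only (not values from Table 1).
example_counts = {
    "specimen_A": 3,
    "specimen_B": 9,
    "specimen_C": 10,
    "specimen_D": 412,
}

kept = drop_low_coverage(example_counts)
print(sorted(kept))
```

The tree would then be re-estimated on the retained specimens and compared with the full analysis, both in topology and in branch support.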
Here we obtain strong support for the hypothesis that Sematuridae
is the sister group of Epicopeiidae + Pseudobistonidae. Even with
few representatives for Sematuridae and Pseudobistonidae, the support
for this hypothesis is compelling (100/100) and in line with
previous studies (Rajaei et al. 2015, Kawahara et al. 2019, Wang
et al. 2019). Furthermore, we confirmed that Epicopeiidae is monophyletic
with regard to Pseudobistonidae, strengthening the case for
the latter family.
The rst attempt to resolve the position of Pseudobistonidae
was made when the family was described by Rajaei etal. (2015) to
accommodate P. pinratanai. Rajaei et al. (2015) found the family
to be the sister group of Epicopeiidae. Recently, the position of
Pseudobistonidae was corroborated with the addition of another spe-
cies in the family: H.discivitta (Wang etal. 2019). However, Wang
etal. (2019) only included three Epicopeiidae and two Sematuridae
species. Furthermore, the support for the branches leading to these
three families was quite low, e.g., the branch supporting Sematuridae
as the sister group of Epicopeiidae + Pseudobistonidae had a boot-
strap value of 33. Zhang et al. (2020) include Heracula in their
dataset and nd it to be sister to Epicopeiidae with strong support;
thus, it would appear that Pseudobistonidae is indeed the sister lin-
eage to Epicopeiidae, with Sematuridae being sister to these two.
Old Material and Contamination
We see a tendency for old museum specimens to yield fewer loci
than more recently collected ones (Fig. 2). Overall, the older a
specimen is, the lower the chances of recovering DNA from it with the
TE approach. Nevertheless, some old specimens provide more loci
than younger ones. For instance, of the two specimens of P. davidi,
the older, collected in 1892, provided 516 raw loci, whereas the
younger, collected in 1957, provided only a single raw locus. There
is no clear explanation for these kinds of outliers, but they might
be due to different treatments during curation (Espeland et al.
2010, Burrell et al. 2015, Vaudo et al. 2018). Unfortunately, records
of such treatments, and of how specimens
were collected and curated, are usually unavailable, making it
impossible here to infer which factors other than age affect the
quality of DNA. Even though, as expected, older samples tend to
yield less DNA of poorer quality, this is only a trend.
Therefore, we should not discount these specimens just because they
are old, as they can still turn out to be real genetic treasure troves.
Unfortunately, two specimens were definitely contaminated,
E. philenora (S57) and N. palaeartica (S5), and therefore were not
analyzed further. If they had been of good quality, they could have
helped us to confirm the position of E. philenora in the case of S57,
as well as the separation of Nossa into two groups, with N. moorei +
N. nagaensis on one side and N. palaeartica (S5) on the other.
In addition, our E. polydora specimens were found to be genetically
identical to E. hainseii, in stark contrast to Zhang et al. (2020). One
of our specimens (S47) yielded 1,270 raw loci (Table 1), suggesting
large amounts of DNA in the extract. The two species cannot be confused
morphologically (see doi:10.5281/zenodo.3769000). Clearly,
this needs to be investigated in more detail, but for the moment, we
do not have a good explanation for these results.
The Importance of Museomics
Since their creation, natural history museums have been an essential
source of biological knowledge and resources for both the scientific
community and the public (Duckworth et al. 1993, Suarez and
Tsutsui 2004). These collections of biological specimens are vital for
the study of systematics, global climate change research, biological
invasion studies, as well as for many other scientific disciplines (Bi
et al. 2013, Bradley et al. 2014, Bakker et al. 2020). Curated specimens
in museums have several advantages compared with collecting
fresh specimens: they can be easy to access, most of them are identified,
and they often carry information such as the date and location
of collection. Moreover, nowadays, researchers in biology and
ecology face many challenges before being able to sample in the
field. These challenges can be financial (e.g., lack of funding), stochastic
(inaccessibility of species of interest, adverse weather conditions,
pandemics, etc.), or administrative, with bureaucratic
hurdles being erected at an increasing pace (Neumann et al.
2018). Natural history museums also hold extinct taxa as well as rare and
difficult-to-collect species, which can be a crucial asset to studies.
However, until recently, this vast amount of biological resources was
mainly used for morphological studies because the DNA from these
specimens was thought to be too degraded to be used for molecular
studies (Shapiro and Hofreiter 2012). As a result, DNA work has for
a long time mainly been limited to species for which freshly collected
samples could be obtained, whereas DNA work from collections has
been limited to sequencing short DNA fragments (Hajibabaei et al.
2006, Lozier and Cameron 2009, Strutzenberger et al. 2012, Hebert
et al. 2013, Cameron et al. 2016).
We have taken advantage of recent advances in sequencing
technologies, which have opened up access to genomic data from museum
specimens. Within the past few years, various studies have
applied these methods to a wide variety of taxa: from birds
(Anmarkrud and Lifjeld 2017, Cloutier et al. 2018) and mammals
(Fabre et al. 2014, Hawkins et al. 2016) to insects (Kanda et al. 2015,
Sproul and Maddison 2017) and plants (Zedane et al. 2016, Silva
et al. 2017). Some of these studies used whole-genome sequencing
(Kanda et al. 2015, Zedane et al. 2016, Sproul and Maddison
2017, Cloutier et al. 2018), whereas others employed diverse
genome reduction methods, such as exon capture (Bi et al. 2013)
and TE (Hawkins et al. 2016). These studies differ not only in their
sequencing methods but also in their scientific
questions: from systematics (Silva et al. 2017), to the origin
and diversification of a taxon (Fabre et al. 2014), to population
genomics (Bi et al. 2013).
Here, we used a genome reduction method, TE, on curated museum
specimens of rare and difficult-to-collect moth species to
refine our knowledge of their phylogenetic relationships. We
recovered on average 566 nuclear loci per species using the
TE method. The present study also shows that it is possible to extract
substantial amounts of DNA sequence data from specimens
collected up to 127 yr ago. Hence, our study contributes to the field
of museomics by demonstrating the application of this sequencing
method to museum specimens, increasing the value of such specimens
even further. Museomics opens a window to the past, providing
possibilities for testing new hypotheses and for casting new
light on old ones.
Conclusion
In summary, we conducted a phylogenetic analysis on small and rare
families of Lepidoptera, using museum specimens. We successfully
sequenced samples that were collected between 1892 and 2001. By
utilizing a TE approach, we were able to recover between 150 and
1,383 loci per specimen for 75% of our samples. From all these raw
loci, we used 378 genes (each present in at least 20 samples) to reconstruct
a phylogenetic hypothesis based on ML analysis of 36 taxa.
This analysis corroborates, with strong support, the hypothesis that
Sematuridae are the sister group of Epicopeiidae + Pseudobistonidae.
Within Epicopeiidae, our study finds Deuveia as sister group of the
rest of the Epicopeiidae genera. The position of Schistomitra is incongruent
with the central hypothesis suggested by Minet (2002) for
this family; however, the support for this branch is low. This low support
might be explained by the absence from our study of
some genera (Amana, Chatamla, and Mimaporia). Indeed, these taxa
may help to clarify the phylogenetic position of Schistomitra, as seen
in Zhang et al. (2020). Although we showed that Psychostrophia
and Parabraxas are monophyletic, we also found that Nossa and
Epicopeia are paraphyletic. Overall, the genera of Epicopeiidae require
more work to reveal their phylogenetic relationships.
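The locus selection summarized at the start of this Conclusion (keeping only genes present in at least 20 samples) can be illustrated as an occupancy filter. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the input layout (a mapping from sample name to recovered locus names), and the toy data are hypothetical; only the threshold of 20 samples comes from the text.

```python
from collections import Counter


def loci_by_occupancy(sample_loci, min_samples=20):
    """Return, sorted, the loci recovered in at least `min_samples` samples."""
    # Count each locus at most once per sample.
    counts = Counter(
        locus for loci in sample_loci.values() for locus in set(loci)
    )
    return sorted(locus for locus, n in counts.items() if n >= min_samples)


# Toy data for illustration only; a lower threshold keeps the example small.
toy = {
    "s1": ["locusA", "locusB"],
    "s2": ["locusA"],
    "s3": ["locusA", "locusB", "locusC"],
}
print(loci_by_occupancy(toy, min_samples=2))
```

Raising the threshold reduces missing data in the alignment at the cost of fewer loci, so the chosen value is a trade-off rather than a fixed rule.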
Museum collections represent a varied and essential biobank of
samples for studying the diversity of life on Earth. The availability of
specimens of not only rare but also extinct species within museum
collections worldwide is a fantastic asset. Nowadays, sequencing techniques
are powerful enough to allow scientists to recover DNA from old
museum specimens. This is the beginning of an exciting era for
molecular studies. Our study contributes to the field
of museomics by successfully demonstrating that researchers can
use museum samples at a molecular level for phylogenetic studies.
Consequently, this study paves the way for more molecular work
using museum specimens.
SupplementaryData
Supplementary data are available at Insect Systematics and
Diversityonline.
Supplementary Material 1.Phylogenetic tree from ML analysis
of 36 taxa based on 378 loci, specimens with less than 10 loci were
excluded.
Acknowledgments
We are thankful to Claudia Etzbauer for help with ordering the kit and
to Sandra Kukowka for assistance in the molecular lab. We highly appre-
ciate the effort of everyone depositing samples at the ZFMK. The study
was funded by the Zoological Research Museum Alexander Koenig, and
received funding from the European Union’s Horizon 2020 research and
innovation program under the Marie Skłodowska-Curie Grant Agreement
No. 6422141.
ReferencesCited
Allio, R., C. Scornavacca, B. Nabholz, A.-L. Clamens, F. A. H. Sperling, and
F. L. Condamine. 2019. Whole genome shotgun phylogenomics resolves
the pattern and timing of swallowtail butterfly evolution. Syst. Biol. 69:
38–60.
Anmarkrud, J. A., and J. T. Lifjeld. 2017. Complete mitochondrial genomes
of eleven extinct or possibly extinct bird species. Mol. Ecol. Resour. 17:
334–341.
Aronesty, E. 2011. Fastq-mcf sequence quality filter, clipping and processor.
(Com/p/ea-utils/wiki/fastqmcf).
Bakker, F. T., A. Antonelli, J. A. Clarke, J. A. Cook, S. V. Edwards,
P.G.P.Ericson, S.Faurby, N.Ferrand, M.Gelang, and R.G.Gillespie,
etal. 2020. The Global Museum: natural history collections and the future
of evolutionary science and public education. PeerJ 8: e8225.
Bank,S., M.Sann, C.Mayer, K.Meusemann, A.Donath, L.Podsiadlowski,
A. Kozlov, M. Petersen, L. Krogmann, and R. Meier, et al. 2017.
Transcriptome and target DNA enrichment sequence data provide new
insights into the phylogeny of vespid wasps (Hymenoptera: Aculeata:
Vespidae). Mol. Phylogenetics Evol. 116: 213–226. doi:10.1016/j.
ympev.2017.08.020.
Bazinet,A. L., M.P.Cummings, K.T.Mitter, and C.W.Mitter. 2013. Can
RNA-Seq resolve the rapid radiation of advanced moths and butteries
(Hexapoda: Lepidoptera: Apoditrysia)? An exploratory study. PLoS One
8: e82615.
Bi,K., T.Linderoth, D.Vanderpool, J.M.Good, R.Nielsen, and C.Moritz.
2013. Unlocking the vault: next-generation museum population genomics.
Mol. Ecol. 22: 6018–6032.
Bradley,R. D., L.C.Bradley, H.J.Garner, and R.J.Baker. 2014. Assessing
the value of natural history collections and addressing issues regarding
long-term growth and care. BioScience 64: 1150–1158.
Breinholt, J. W., C. Earl, A. R. Lemmon, E. M. Lemmon, L. Xiao, and
A. Y.Kawahara. 2018. Resolving relationships among the megadiverse
butteries and moths with a novel pipeline for anchored phylogenomics.
Syst. Biol. 67: 78–93.
Burrell,A. S., T.R. Disotell, and C.M. Bergey. 2015. The use of museum
specimens with high-throughput DNA sequencers. J. Hum. Evol. 79:
35–44.
Cameron,S.A., H.C.Lim, J.D.Lozier, M.A.Duennes, and R.Thorp. 2016.
Test of the invasive pathogen hypothesis of bumble bee decline in North
America. Proc. Natl. Acad. Sci. USA 113: 4386–4391.
Capella-Gutierrez,S., J.M.Silla-Martinez, and T.Gabaldon. 2009. trimAl: a
tool for automated alignment trimming in large-scale phylogenetic ana-
lyses. Bioinformatics 25: 1972–1973.
Chang, Z., G. Li, J. Liu, Y.Zhang, C. Ashby, D. Liu, C. L. Cramer, and
X.Huang. 2015. Bridger: a new framework for de novo transcriptome
assembly using RNA-seq data. Genome Biol. 16: 30.
Chapman, A. D. 2005. Uses of primary species-occurrence data. Global
Biodiversity Information Facility, Copenhagen, Denmark.
Chernomor, O., A. von Haeseler, and B. Q. Minh. 2016. Terrace aware data
structure for phylogenomic inference from supermatrices. Syst. Biol. 65:
997–1008.
Cloutier, A., T. B. Sackton, P. Grayson, S. V. Edwards, and A. J. Baker. 2018.
First nuclear genome assembly of an extinct moa species, the little bush
moa (Anomalopteryx didiformis). bioRxiv, doi:10.1101/262816, 9
February 2018, preprint.
Cong, Q., J. Shen, D. Borek, R. K. Robbins, P. A. Opler, Z. Otwinowski, and
N. V. Grishin. 2017. When COI barcodes deceive: complete genomes reveal
introgression in hairstreaks. Proc. R. Soc. B Biol. Sci. 284: 20161735.
Counterman, B. A., F. Araujo-Perez, H. M. Hines, S. W. Baxter, C. M.
Morrison, D. P. Lindstrom, R. Papa, L. Ferguson, M. Joron, and
R. H. ffrench-Constant, et al. 2010. Genomic hotspots for adaptation:
the population genetics of Müllerian Mimicry in Heliconius erato. PLoS
Genet. 6: e1000796.
Cruz-Dávalos, D. I., B. Llamas, C. Gaunitz, A. Fages, C. Gamba, J. Soubrier,
P. Librado, A. Seguin-Orlando, M. Pruvost, A. H. Alfarhan, et al. 2017.
Experimental conditions improving in-solution target enrichment for ancient
DNA. Mol. Ecol. Resour. 17: 508–522.
Cummins, C. A., and J. O. McInerney. 2011. A method for inferring the rate of
evolution of homologous characters that can potentially improve phylogenetic
inference, resolve deep divergence and correct systematic biases.
Syst. Biol. 60: 833–844.
Di Franco, A., R. Poujol, D. Baurain, and H. Philippe. 2019. Evaluating the
usefulness of alignment filtering methods to reduce the impact of errors on
evolutionary inferences. BMC Evol. Biol. 19: 21.
Duckworth, W. D., H. H. Genoways, and C. L. Rose. 1993. Preserving natural
science collections: chronicle of our environmental heritage. Mammology
Papers: University of Nebraska State Museum No. 271. National Institute
for the Conservation of Cultural Property, Washington, DC. p 153.
Edgar, R. 2010. Usearch. Lawrence Berkeley National Lab. (LBNL), Berkeley,
CA. (https://www.osti.gov/sciencecinema/biblio/1137186).
Espeland, M., M. Irestedt, K. A. Johanson, M. Åkerlund, J.-E. Bergh, and
M. Källersjö. 2010. Dichlorvos exposure impedes extraction and amplification
of DNA from insects in museum collections. Front. Zool. 7: 2.
Espeland, M., J. Breinholt, K. R. Willmott, A. D. Warren, R. Vila,
E. F. A. Toussaint, S. C. Maunsell, K. Aduse-Poku, G. Talavera, and
R.Eastwood, etal. 2018. A comprehensive and dated phylogenomic ana-
lysis of butteries. Curr. Biol. 28: 770–778.e5.
Espeland,M., J.W.Breinholt, E.P.Barbosa, M.M.Casagrande, B.Huertas,
G.Lamas, M.A.Marín, O.H.H.Mielke, J.Y.Miller, S.Nakahara, etal.
2019. Four hundred shades of brown: higher level phylogeny of the prob-
lematic Euptychiina (Lepidoptera, Nymphalidae, Satyrinae) based on hy-
brid enrichment data. Mol. Phylogenet. Evol. 131: 116–124.
Fabre,P.-H., J.T. Vilstrup, M.Raghavan, C.Der Sarkissian, E.Willerslev,
E.J.P.Douzery, and L.Orlando. 2014. Rodents of the Caribbean: origin
and diversication of hutias unravelled by next-generation museomics.
Biol. Lett. 10: 20140266–20140266.
Faircloth, B. C., J. E. McCormack, N. G. Crawford, M. G. Harvey,
R.T.Brumeld, and T.C.Glenn. 2012. Ultraconserved elements anchor
thousands of genetic markers spanning multiple evolutionary timescales.
Syst. Biol. 61: 717–726.
Fletcher,D. S. 1979. Geometroidea: apoprogonidae, Axiidae, Callidulidae,
Cyclidiidae, Drepanidae, Epicopeiidae, Epiplemidae, Geometridae,
Pterothysanidae, Sematuridae, Thyatiridae, Uraniidae, p. 243. In
I.W.B.Nye (ed.), The generic names of moths of the world, Vol. 3. British
Museum (Natural History), London, UK.
Frandsen,P.B., B.Calcott, C.Mayer, and R.Lanfear. 2015. Automatic se-
lection of partitioning schemes for phylogenetic analyses using iterative
k-means clustering of site rates. BMC Evol. Biol. 15: 13.
Gertz, E. M., Y.-K. Yu, R. Agarwala, A. A. Schäffer, and S. F. Altschul.
2006. Composition-based statistics and translated nucleotide searches:
improving the TBLASTN module of BLAST. BMC Biol. 4: 41.
Grabherr,M.G., B.J.Haas, M.Yassour, J.Z.Levin, D.A.Thompson, I.Amit,
X.Adiconis, L.Fan, R.Raychowdhury, and Q.Zeng, etal. 2011. Full-
length transcriptome assembly from RNA-Seq data without a reference
genome. Nat. Biotechnol. 29: 644–652.
Guindon, S., J.-F. Dufayard, V. Lefort, M. Anisimova, W. Hordijk, and
O.Gascuel. 2010. New algorithms and methods to estimate maximum-
likelihood phylogenies: assessing the performance of PhyML 3.0. Syst.
Biol. 59: 307–321.
Haas, B. J., A. Papanicolaou, M. Yassour, M. Grabherr, P. D. Blood, J. Bowden,
M. B. Couger, D. Eccles, B. Li, and M. Lieber, et al. 2013. De novo transcript
sequence reconstruction from RNA-seq using the Trinity platform
for reference generation and analysis. Nat. Protoc. 8: 1494–1512.
Hajibabaei, M., M. A. Smith, D. H. Janzen, J. J. Rodriguez, J. B. Whitfield,
and P. D. N. Hebert. 2006. A minimalist barcode can identify a specimen
whose DNA is degraded. Mol. Ecol. Notes 6: 959–964.
Hawkins, M. T. R., C. A. Hofman, T. Callicrate, M. M. McDonough,
M. T. N. Tsuchiya, E. E. Gutiérrez, K. M. Helgen, and J. E. Maldonado.
2016. In-solution hybridization for mammalian mitogenome enrichment:
pros, cons and challenges associated with multiplexing degraded DNA.
Mol. Ecol. Resour. 16: 1173–1188.
Hebert, P. D. N., J. R. Dewaard, E. V. Zakharov, S. W. J. Prosser, J. E. Sones,
J. T. A. McKeown, B. Mantle, and J. La Salle. 2013. A DNA ‘barcode
blitz’: rapid digitization and sequencing of a natural history collection.
PLoS One 8: e68535.
Hoang, D. T., O. Chernomor, A. von Haeseler, B. Q. Minh, and L. S. Vinh.
2018. UFBoot2: improving the ultrafast bootstrap approximation. Mol.
Biol. Evol. 35: 518–522.
Hundsdoerfer, A. K., and I. J. Kitching. 2010. A method for improving DNA
yield from century-plus old specimens of large Lepidoptera while minimizing
damage to external and internal abdominal characters. Arthropod
Syst. Phylog. 68: 151–155.
Jones, M. R., and J. M. Good. 2016. Targeted capture in evolutionary and
ecological genomics. Mol. Ecol. 25: 185–202.
Kalyaanamoorthy, S., B. Q. Minh, T. K. F. Wong, A. von Haeseler, and
L. S. Jermiin. 2017. ModelFinder: fast model selection for accurate phylogenetic
estimates. Nat. Methods 14: 587–589.
Kanda, K., J. M. Pflug, J. S. Sproul, M. A. Dasenko, and D. R. Maddison.
2015. Successful recovery of nuclear protein-coding genes from small insects
in museums using Illumina sequencing. PLoS One 10: e0143929.
Katoh, K., and D. M. Standley. 2013. MAFFT multiple sequence alignment
software version 7: improvements in performance and usability. Mol. Biol.
Evol. 30: 772–780.
Kawahara, A. Y., D. Plotkin, M. Espeland, K. Meusemann, E. F. A. Toussaint,
A. Donath, F. Gimnich, P. B. Frandsen, A. Zwick, and M. Reis, et al. 2019.
Phylogenomics reveals the evolutionary timing and pattern of butterflies
and moths. Proc. Natl. Acad. Sci. USA 116: 22657–22663.
Kozak, K. M., N. Wahlberg, A. F. E. Neild, K. K. Dasmahapatra, J. Mallet,
and C. D. Jiggins. 2015. Multilocus species trees show the recent adaptive
radiation of the mimetic Heliconius butterflies. Syst. Biol. 64: 505–524.
Laithwaite, E. R., and P. E. S. Whalley. 1975. Dictionary of butterflies and
moths in color. Michael Joseph, London, United Kingdom.
(http://agris.fao.org/agris-search/search.do?recordID=US201300521786).
Lanfear, R., P. B. Frandsen, A. M. Wright, T. Senfeld, and B. Calcott. 2017.
PartitionFinder 2: new methods for selecting partitioned models of evolution
for molecular and morphological phylogenetic analyses. Mol. Biol.
Evol. 34: 772–773.
Leinonen, R., H. Sugawara, and M. Shumway. 2011. The sequence read
archive. Nucl. Acids Res. 39: D19–D21.
Lemmon, E. M., and A. R. Lemmon. 2013. High-throughput genomic data
in systematics and phylogenetics. Ann. Rev. Ecol. Evol. Syst. 44: 99–121.
Li, W., Q. Cong, J. Shen, J. Zhang, W. Hallwachs, D. H. Janzen, and
N. V. Grishin. 2019. Genomes of skipper butterflies reveal extensive convergence
of wing patterns. Proc. Natl. Acad. Sci. USA 116: 6232–6237.
Lozier, J. D., and S. A. Cameron. 2009. Comparative genetic analyses of historical
and contemporary collections highlight contrasting demographic
histories for the bumble bees Bombus pensylvanicus and B. impatiens in
Illinois. Mol. Ecol. 18: 1875–1886.
Martin, M. 2011. Cutadapt removes adapter sequences from high-throughput
sequencing reads. EMBnet J. 17: 10–12.
Mayer, C., M. Sann, A. Donath, M. Meixner, L. Podsiadlowski, R. S. Peters,
M. Petersen, K. Meusemann, K. Liere, and J. W. Wägele, et al. 2016.
BaitFisher: a software package for multispecies target DNA enrichment
probe design. Mol. Biol. Evol. 33: 1875–1886.
Mayer, C., L. Dietz, E. Call, S. Kukowka, S. Martin, and M. Espeland. 2021.
Adding leaves to the Lepidoptera tree: capturing hundreds of nuclear genes
from old museum specimens. Syst. Entomol. doi:10.1111/syen.12481.
McCormack, J. E., S. M. Hird, A. J. Zellmer, B. C. Carstens, and
R. T. Brumfield. 2013. Applications of next-generation sequencing to
phylogeography and phylogenetics. Mol. Phylogenet. Evol. 66: 526–538.
Minet, J. 1983. Étude morphologique et phylogénétique des organes
tympaniques des Pyraloidea. 1– Généralités et homologies (Lep. Glossata).
Ann. La Société Entomol. France 19: 175–207.
Minet, J. 1986. Ébauche d’une classification moderne de l’ordre des
Lepidoptères. Alexanor 14: 291–313.
Minet, J. 2002. The Epicopeiidae: phylogeny and a redefinition, with the
description of new taxa (Lepidoptera: Drepanoidea). Ann. La Société
Entomol. France 38: 463–487.
Neumann, D., A. V. Borisenko, J. A. Coddington, C. L. Häuser, C. R. Butler,
A. Casino, J. C. Vogel, G. Haszprunar, and P. Giere. 2018. Global biodiversity
research tied up by juridical interpretations of access and benefit
sharing. Organ. Divers. Evol. 18: 1–12.
Nguyen, L.-T., H. A. Schmidt, A. von Haeseler, and B. Q. Minh. 2015.
IQ-TREE: a fast and effective stochastic algorithm for estimating
maximum-likelihood phylogenies. Mol. Biol. Evol. 32: 268–274.
Rajaei, H., C. Greve, H. Letsch, D. Stüning, N. Wahlberg, J. Minet, and
B. Misof. 2015. Advances in Geometroidea phylogeny, with characterization
of a new family based on Pseudobiston pinratanai (Lepidoptera,
Glossata). Zool. Scr. 44: 418–436.
Regier, J. C., A. Zwick, M. P. Cummings, A. Y. Kawahara, S. Cho, S. Weller,
A. Roe, J. Baixeras, J. W. Brown, and C. Parr, et al. 2009. Toward reconstructing
the evolution of advanced moths and butterflies (Lepidoptera:
Ditrysia): an initial molecular study. BMC Evol. Biol. 9: 280.
Rota, J., T. Malm, N. Chazot, C. Peña, and N. Wahlberg. 2018. A simple
method for data partitioning based on relative evolutionary rates. PeerJ
6: e5498.
Schmieder, R., and R. Edwards. 2011. Quality control and preprocessing of
metagenomic datasets. Bioinformatics 27: 863–864.
Shapiro, B., and M. Hofreiter. 2012. Ancient DNA: methods and protocols.
Humana Press, New York, NY.
Silva, C., G. Besnard, A. Piot, J. Razanatsoa, R. P. Oliveira, and
M. S. Vorontsova. 2017. Museomics resolve the systematics of an endangered
grass lineage endemic to north-western Madagascar. Ann. Bot. 119:
339–351.
Sproul, J. S., and D. R. Maddison. 2017. Sequencing historical specimens:
successful preparation of small specimens with low amounts of degraded
DNA. Mol. Ecol. Resour. 17: 1183–1201.
St Laurent, R. A., C. A. Hamilton, and A. Y. Kawahara. 2018. Museum specimens
provide phylogenomic data to resolve relationships of sack-bearer
moths (Lepidoptera, Mimallonoidea, Mimallonidae). Syst. Entomol. 43:
729–761.
Strutzenberger, P., G. Brehm, and K. Fiedler. 2012. DNA barcode sequencing
from old type specimens as a tool in taxonomy: a case study in the diverse
genus Eois (Lepidoptera: Geometridae). PLoS One 7: e49710.
Suarez, A. V., and N. D. Tsutsui. 2004. The value of museum collections for
research and society. BioScience 54: 66.
Suchan, T., C. Pitteloud, N. S. Gerasimova, A. Kostikova, S. Schmid,
N. Arrigo, M. Pajkovic, M. Ronikier, and N. Alvarez. 2016.
Hybridization capture using RAD Probes (hyRAD), a new tool for
performing genomic analyses on collection specimens. PLoS One 11:
e0151651.
Toussaint, E. F. A., J. W. Breinholt, C. Earl, A. D. Warren, A. V. Z. Brower,
M. Yago, K. M. Dexter, M. Espeland, N. E. Pierce, and D. J. Lohman,
et al. 2018. Anchored phylogenomics illuminates the skipper butterfly tree
of life. BMC Evol. Biol. 18: 101.
Turner, J. R. G. 1976. Adaptive radiation and convergence in subdivisions of
the butterfly genus Heliconius (Lepidoptera: Nymphalidae). Zool. J. Linn.
Soc. 58: 297–308.
Vaudo, A. D., M. L. Fritz, and M. M. López-Uribe. 2018. Opening the door
to the past: accessing phylogenetic, pathogen, and population data from
museum curated bees. Insect Syst. Divers. 2: 4. doi:10.1093/isd/ixy014.
Wang, H., J. D. Holloway, N. Wahlberg, M. Wang, and S. Nylin. 2019.
Molecular phylogenetic and morphological studies on the systematic
position of Heracula discivitta reveal a new subfamily of
Pseudobistonidae (Lepidoptera: Geometroidea). Syst. Entomol. 44:
211–225.
Wei, C.-H., and S.-H. Yen. 2017. Mimaporia, a new genus of Epicopeiidae
(Lepidoptera), with description of a new species from Vietnam. Zootaxa
4254: 537.
Zedane, L., C. Hong-Wa, J. Murienne, C. Jeziorski, B. G. Baldwin, and
G. Besnard. 2016. Museomics illuminate the history of an extinct,
paleoendemic plant lineage (Hesperelaea, Oleaceae) known from an
1875 collection from Guadalupe Island, Mexico. Biol. J. Linn. Soc.
117: 44–57.
Zhang, J., Q. Cong, J. Shen, E. Brockmann, and N. Grishin. 2019. Genomes
reveal drastic and recurrent phenotypic divergence in firetip skipper
butterflies (Hesperiidae: Pyrrhopyginae). Proc. R. Soc. B Biol. Sci. 286:
20190609.
Zhang, Y., S. Huang, D. Liang, H. Wang, and P. Zhang. 2020. A multilocus
analysis of Epicopeiidae (Lepidoptera, Geometroidea) provides new insights
into their relationships and the evolutionary history of mimicry.
Mol. Phylogenet. Evol. 149: 106847.
Insect Systematics and Diversity, 2021, Vol. 5, No. 2
Copyedited by: OUP
Downloaded from https://academic.oup.com/isd/article/5/2/6/6244279 by elsa.call.fr@gmail.com on 22 April 2021
... These methods are particularly useful in phylogenomics, as they focus on specific genomic regions that have been shown to be phylogenetically informative (Kadlec et al. 2017;Toussaint et al. 2018). In particular, TE has been successfully applied to phylogenomics of butterflies and moths Espeland et al. 2019;Homziak et al. 2019;Mayer et al. 2021). ...
... The resulting reads tend to range from 50 to 150 bp (Sproul and Maddision 2017;Korlević et al. 2021;Mullin et al. 2022), leading to difficulties with assembly. Recent studies have shown that it is possible to extract phylogenetically useful data from such fragmented genomes using both TE St Laurent et al. 2018;Espeland et al. 2019;Call et al. 2021) and WGS (Cong et al. 2017;Sproul and Maddision 2017;Li et al. 2019;Zhang et al. 2019;Cong et al. 2021;Grewe et al. 2021;Twort et al. 2021;Mullin et al. 2022). However, as museomics is such a young field, there is no consensus on which method to access genomic information is the most suitable. ...
... The relationships, not only within these two clades but also between them and Pseudobistonidae, a close family, were unclear until recently (Minet 2002;Rajaei et al. 2015;Wei and Yen 2017;Wang et al. 2019;Zhang et al. 2020). In a previous study, we conducted a phylogenetic analysis using TE on museum specimens of Epicopeiidae, Sematuridae and Pseudobistonidae (Call et al. 2021). Here, we performed a WGS study on Epicopeiidae and Sematuridae. ...
Preprint
Full-text available
There are various possibilities for sequencing highly degraded DNA, such as target enrichment (TE), or whole-genome sequencing (WGS). Here we compare TE and WGS methods using old museum specimens of two families of moths in the superfamily Geometroidea: Epicopeiidae and Sematuridae. Until recently, the relationships of these two families were unclear, as few studies had been done. Recently two studies used the TE approach, either on relatively fresh specimens, or on old museum specimens. Here, we aim to increase the sampling of the families Epicopeiidae and Sematuridae from museum specimens using the WGS method. We show that both sequencing methods give comparable results, but, unsurprisingly, WGS recovers more data. By combining TE and WGS data, we confirm that Sematuridae are sister to Pseudobistonidae+Epicopeiidae. Relationships of genera within the families are well supported. With the costs of WGS decreasing, we suggest that using low-coverage whole genome sequencing is becoming an increasingly viable option in the phylogenomics of insects.
... Within museomics studies, more cost-effective genome reduction methods, such as Ultra Conserved Elements (UCEs) and Anchored Hybrid Enrichment (AHE) are rising in popularity due to their ability to target specific informative loci within a focal group [17][18][19][20]. While standard extraction from dry-preserved specimens may yield adequate DNA for UCE and AHE sequencing [20,21], its success varies based upon specimen preservation. ...
... While standard extraction from dry-preserved specimens may yield adequate DNA for UCE and AHE sequencing [20,21], its success varies based upon specimen preservation. Therefore, exploring the effectiveness of more sensitive DNA extraction methods is essential, especially since their application in entomological collections remains poorly investigated [18]. ...
Article
Although several methods exist for extracting and sequencing historical DNA originating from dry-preserved insect specimens deposited in natural history museums, no consensus exists as to what is the optimal approach. We demonstrate that a customized, low-cost archival DNA extraction protocol (∼€10 per sample), in combination with Ultraconserved Elements (UCEs), is an effective tool for insect phylogenomic studies. We successfully tested our approach by sequencing DNA from scarab dung beetles preserved in both wet and dry collections, including unique primary type and rare historical specimens from internationally important natural history museums in London, Paris and Helsinki. The focal specimens comprised enigmatic dung beetle genera (Nesosisyphus, Onychothecus and Helictopleurus) and varied in age and preservation. The oldest specimen, the holotype of the now possibly extinct Mauritian endemic Nesosisyphus rotundatus, was collected in 1944. We obtained high-quality DNA from all studied specimens to enable the generation of a UCE-based dataset that revealed an insightful and well-supported phylogenetic tree of dung beetles. The resulting phylogeny propounded the reclassification of Onychothecus (previously incertae sedis) within the tribe Coprini. Our approach demonstrates the feasibility and effectiveness of combining DNA data from historic and recent museum specimens to provide novel insights. The proposed archival DNA protocol is available at DOI 10.17504/protocols.io.81wgbybqyvpk/v3.
... Similar cases abound in other groups of Lepidoptera (e.g. Call et al. 2021). The radiation of Nivaliodes occurred mostly throughout the Pleistocene and was likely to be related to consecutive climatic shifts, dispersal, and local adaptations to high puna habitats in central Peru. ...
Article
A new genus of satyrine butterflies, Nivaliodes gen. nov., is described for three species, all new: Nivaliodes negrobueno sp. nov., Nivaliodes viracocha sp. nov., and Nivaliodes puriq sp. nov. (Lepidoptera: Nymphalidae), with the support of molecular data and adult morphology. A target enrichment-based phylogeny indicates that Nivaliodes gen. nov. is sister to the genus Pherepedaliodes within an extremely diverse Pedaliodes clade of the predominantly Andean subtribe Pronophilina. Although an overwhelming majority of species of this group occur in tropical montane forests, N. negrobueno sp. nov. was discovered in a central Peruvian desert puna at some 4600–4800 m a.s.l., the highest elevation reported for any species of the Pronophilina. Individuals were observed overflying rocky slopes and resting directly on snow-covered surfaces, which is an exceptionally unusual behaviour among butterflies. The other two species of the new genus were found at lower elevations, some 3300–4200 m a.s.l., at the timberline and in puna grassland.
... The field is heavily DNA-driven, but purely molecular approaches can be seen as "reductionistically mistaken" (Casacci et al. 2014). Further, even though studying 'historical' or 'ancient' DNA from old specimens in various collections has recently become more and more promising, such approaches are not always straightforward and easy to adopt for every potential study object (Anderung et al. 2008; Ellis 2008; Lis et al. 2011; Call et al. 2021; Raxworthy and Smith 2021). Morphology represents the primary data we use to infer species boundaries and to describe biodiversity (Schlick-Steiner et al. 2007; Steiner et al. 2009; Yazdi et al. 2012), and morphological trait variation shows the highest heritability among various trait types (e.g. ...
Article
For effective conservation management of endangered taxa, it is important to define operational units for conservation. In the absence of detailed genetic analyses, morphology-based taxonomy is often used as a surrogate. The Apollo butterfly, Parnassius apollo, is one of the most endangered butterfly species in Europe (considered as a flagship species) with 26 subspecies rank taxa described from the Carpatho-Pannonian region (Central Europe), often based on old, one-by-one descriptions. We applied landmark-based geometric morphometrics on wing shape to determine the number of morphologically distinguishable groups in the region, based on 949 males and 477 females from 20 Carpatho-Pannonian putative subspecies (both extant and potentially extinct). We found a single division between the Eastern Carpathian populations (described as two subspecies: ssp. transsylvanicus and ssp. rosenius) and the rest of the populations (including our outgroup from the Swiss Alps). Since P. apollo was not observed in the Eastern Carpathians in the last two decades, and the currently known extant populations in the Carpatho-Pannonian region are all located in the Northern Carpathians, our results support a single conservation unit in the region. We suggest that (i) extensive monitoring is needed to reveal whether the unique Eastern Carpathian populations have really gone extinct and (ii) more taxonomical/phylogenetic studies on Central European P. apollo are needed for establishing the taxonomy of the species and efficient conservation strategies. We emphasize that modern integrative taxonomy is not only important for clarifying taxonomical issues, but also for providing basis for sound conservation management.
... Furthermore, smaller specimens may need to be pooled together to attain sufficient amounts of mRNA, and such practice risks mixing up individuals with undetected genetic variation. Unfortunately, a large amount of collected specimens only exist in natural history museum collections, and most of these are ethanol preserved and thus not usable for transcriptomic studies (Call et al. 2021). As taxon sampling is considered one of the most important factors for accurate phylogenetic tree reconstruction (Heath et al. 2008), it would be missing an opportunity to leave the potential of natural history collections untapped. ...
Article
Low-coverage whole-genome sequencing (also known as “genome skimming”) is becoming an increasingly affordable approach to large-scale phylogenetic analyses. While already routinely used to recover organellar genomes, genome skimming is rather rarely utilized for recovering single-copy nuclear markers. One reason might be that only few tools exist to work with this data type within a phylogenomic context, especially to deal with fragmented genome assemblies. We here present a new software tool called Patchwork for mining phylogenetic markers from highly fragmented short-read assemblies as well as directly from sequence reads. Patchwork is an alignment-based tool that utilizes the sequence aligner DIAMOND and is written in the programming language Julia. Homologous regions are obtained via a sequence similarity search, followed by a “hit stitching” phase, in which adjacent or overlapping regions are merged into a single unit. The novel sliding window algorithm trims away any noncoding regions from the resulting sequence. We demonstrate the utility of Patchwork by recovering near-universal single-copy orthologs within a benchmarking study, and we additionally assess the performance of Patchwork in comparison with other programs. We find that Patchwork allows for accurate retrieval of (putatively) single-copy genes from genome skimming data sets at different sequencing depths with high computational speed, outperforming existing software targeting similar tasks. Patchwork is released under the GNU General Public License version 3. Installation instructions, additional documentation, and the source code itself are all available via GitHub at https://github.com/fethalen/Patchwork.
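The "hit stitching" phase described in the abstract above, in which adjacent or overlapping regions are merged into a single unit, can be pictured as a simple interval merge. The following is a minimal sketch under stated assumptions, not Patchwork's actual implementation (Patchwork is written in Julia and works on DIAMOND alignments). The function name `stitch_hits`, the `max_gap` parameter, and the inclusive 1-based coordinates are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a hit is (start, end) on the reference,
# with inclusive, 1-based coordinates.
Interval = Tuple[int, int]


def stitch_hits(hits: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge hits that overlap or are separated by at most max_gap positions.

    With max_gap=0, directly adjacent hits such as (1, 10) and (11, 20)
    are merged into (1, 20).
    """
    stitched: List[Interval] = []
    for start, end in sorted(hits):
        if stitched and start <= stitched[-1][1] + max_gap + 1:
            # The hit touches or overlaps the previous unit: extend it.
            prev_start, prev_end = stitched[-1]
            stitched[-1] = (prev_start, max(prev_end, end))
        else:
            # The hit starts a new, separate unit.
            stitched.append((start, end))
    return stitched


if __name__ == "__main__":
    print(stitch_hits([(11, 20), (1, 10), (30, 40), (5, 12)]))
```

Sorting first means each hit only needs to be compared with the most recently stitched unit, so the merge runs in a single pass after the sort.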
... In phylogenetic studies with historical DNA, cost-effective methods such as Ultra Conserved Elements (UCEs) and Anchored Hybrid Enrichment (AHE) are rising in popularity due to their ability to target specific informative loci within a focal group (Faircloth, 2017; Call et al., 2021; Faircloth et al., 2012; Mayer et al., 2021). While standard extraction from dry-preserved specimens may yield adequate DNA for UCE and AHE sequencing (Gustafson et al., 2020; Mayer et al., 2021), its success varies based on preservation. ...
Preprint
Although several methods exist for extracting and sequencing historical DNA originating from dry-preserved insect specimens deposited in natural history museums, no consensus exists as to what is the optimal approach. We demonstrate that a customized, low-cost archival DNA extraction protocol (∼€10 per sample), in combination with Ultraconserved Elements (UCEs), is an effective tool for insect phylogenomic studies. We successfully tested our approach by sequencing DNA from scarab dung beetles preserved in both wet and dry collections, including unique primary type and rare historical specimens from internationally important natural history museums in London, Paris and Helsinki. The focal specimens comprise enigmatic dung beetle genera (Nesosisyphus, Onychothecus and Helictopleurus) that varied in age and preservation. The oldest specimen, the holotype of the now possibly extinct Mauritian endemic Nesosisyphus rotundatus, was collected in 1944. We obtained high-quality DNA from all studied specimens to enable the generation of a UCE-based dataset that revealed an insightful and well-supported phylogenetic tree of dung beetles. The resulting phylogeny suggested the reclassification of Onychothecus (previously incertae sedis) within the tribe Coprini. Our approach demonstrates the feasibility and effectiveness of combining DNA data from historic and recent museum specimens to provide novel insights.
The proposed archival DNA protocol is available at DOI 10.17504/protocols.io.81wgbybqyvpk/v1.
Highlights:
We combined custom low-cost archival DNA extractions and Ultraconserved Element phylogenomics.
DNA from rare museum specimens of enigmatic dung beetles revealed their phylogenetic connections.
Genomic data was obtained from the holotype of a potentially extinct monoinsular endemic species.
Genomic data allowed a rare and enigmatic species of previously unknown affinity to be classified.
The morphology of museum specimens remained intact following non-destructive DNA extraction.
... Consequently, molecular approaches have been restricted, on one hand, to sequencing technologies (e.g., Sanger sequencing) (Hajibabaei et al., 2006; Hebert et al., 2013) and, on the other hand, were limited to quality materials (e.g., freshly collected samples, proper killing agent, etc.). However, recently, next-generation sequencing (NGS) technologies have made the DNA in museum specimens more accessible, either through whole-genome sequencing (WGS) (Allio et al., 2020; Call et al., 2021; Sproul & Maddison, 2017; Twort et al., 2021) or genome reduction methods (Breinholt et al., 2018; Mayer et al., 2021; Suchan et al., 2016). These advanced sequencing approaches have opened a new field with great potential for studying the evolutionary history of taxa that are difficult to collect: museomics. ...
Article
Here, we present multi-locus sequencing results from the enigmatic Afrotropical monotypic genus Egybolis Boisduval (occurring in East and South Africa; previously placed in the subfamily Catocalinae, Noctuidae). Model-based phylogenetic analysis places Egybolis within a strongly supported clade comprising four Old World Tropical genera Cocytia Boisduval, Avatha Walker, Anereuthina Hübner, and Serrodes Guenée from the family Erebidae, subfamily Erebinae. Hence, we propose to formally assign the monotypic genus Egybolis to the subfamily Erebinae and the tribe Cocytiini. Timing of divergence analysis reveals a late Oligocene origin around 25 million years ago (Ma) for the tribe Cocytiini, and an early Miocene age (~21 Ma) for the split between Cocytia and Egybolis.
Article
The rapid advancement of molecular biodiversity monitoring tools, particularly DNA metabarcoding, has improved specimen identification in bulk samples, such as those from Malaise traps, where traditional morphological identification is impractical. While not yet standardized, a typical first step in insect bulk sample analysis is the extraction of DNA from homogenized specimens. While this step yields reliable metabarcoding results, it destroys the specimens, preventing further use in monitoring and taxonomic analysis. Non-destructive lysis, which preserves specimen integrity, is still being evaluated for its effectiveness in accurately assessing bulk sample biodiversity. In this study, we assessed the suitability of non-destructive lysis for Malaise trap samples and compared its performance with homogenization using an established metabarcoding workflow. Five bulk samples were collected with Malaise traps. Samples were first incubated in a lysis buffer containing Proteinase K (non-destructive lysis) and then homogenized. DNA was extracted from both treatments and metabarcoding was performed to compare OTU richness, accumulation, and beta diversity. On average, homogenized samples yielded 3.8% more OTUs than non-destructive lysis samples. Although homogenization provides a more comprehensive and cost-effective assessment of Malaise trap bulk samples, non-destructive lysis still recovered at least 80% of the OTUs identified through homogenization and revealed similar patterns of community change. Even though our results show that both methods yield comparable data on insect biodiversity and can be used for monitoring, we consider non-destructive lysis as not suitable for integration into automated workflows or large-scale biomonitoring due to the much higher costs. Nonetheless, this method remains important in cases where morphological integrity needs to be preserved and additional sampling is not possible.
Article
Gerromorpha Popov, 1971 is a fascinating and diverse insect lineage that evolved about 200 Mya to spend their entire life cycle on the air-water interface and have since colonized all types of aquatic habitats. The subfamily Halobatinae Bianchi, 1896 is particularly interesting because some species have adapted to life on the open ocean, a habitat where insects are very rarely found. Several attempts have been made to reconstruct the phylogenetic hypotheses of this subfamily, but the use of a few partial gene sequences recovered only a handful of well-supported relationships, thus limiting evolutionary inferences. Fortunately, the emergence of high-throughput sequencing technologies has enabled the recovery of more genetic markers for phylogenetic inference. We applied genome skimming to obtain mitochondrial and nuclear genes from low-coverage whole-genome sequencing of 85 specimens for reconstructing a well-supported phylogeny, with particular emphasis on Halobatinae. Our study confirmed that Metrocorini Matsuda, 1960, is paraphyletic, whereas Esakia Lundblad, 1933, and Ventidius Distant, 1910, are more closely related to Halobatini Bianchi, 1896, than Metrocoris Mayr, 1865, and Eurymetra Esaki, 1926. We also found that Ventidius is paraphyletic and in need of a taxonomic revision. Ancestral state reconstruction suggests that Halobatinae evolved progressively from limnic to coastal habitats, eventually attaining a marine lifestyle, especially in the genus Halobates Eschscholtz, 1822, where the oceanic lifestyle evolved thrice. Our results demonstrate that genome skimming is a powerful and straightforward approach to recover genetic loci for robust phylogenetic analysis in non-model insects.
Article
Natural History Collections (NHCs) represent the world’s largest repositories of long-term biodiversity datasets. Specimen collection and voucher deposition has been the backbone of NHCs since their inception, but recent decades have seen a drastic decline in rates of growth via active collecting. Amphibians and reptiles are amongst the most threatened zoological groups on the planet and are historically underrepresented in most worldwide NHCs. As part of an ongoing project to review the Portuguese zoological collections in the country’s NHCs, herpetological data from its three major museums and smaller collections was gathered and used to examine the coverage and representation of the different taxa extant in Portugal. These collections are not taxonomically, geographically, or temporally complete. Approximately 90% of the Portuguese herpetological taxa are represented in the country’s NHCs, and around half of the taxa are represented by less than 50 specimens. Geographically, the collections cover less than 30% of the country’s territory and almost all of the occurring taxa have less than 10% of their known distribution represented in the collections. A discussion on the implications for science of such incomplete collections and a review of the current status of Portuguese NHCs is presented.
Article
Museum collections around the world contain billions of specimens, including rare and extinct species. If their genetic information could be retrieved at a large scale, this would dramatically increase our knowledge of genetic and taxonomic diversity information, and support evolutionary, ecological and systematic studies. We here present a target enrichment kit for 2953 loci in 1753 orthologous nuclear genes + the barcoding region of cytochrome C oxidase 1, for Lepidoptera and demonstrate its utility to obtain a large number of nuclear loci from dry, pinned museum material collected from 1892 to 2017. We sequenced enriched libraries of 37 museum specimens across the order Lepidoptera, many from higher taxa not yet included in high‐throughput molecular studies, showing that our kit can be used to generate comparable data across the order, and provides resolution both for shallower and deeper nodes. The filtered datasets (172 taxa, 234 464 amino acid positions and corresponding nucleotides from 1835 CDS regions) were used to infer a phylogeny of Lepidoptera, which is largely congruent in topology to recent phylogenomic studies, but with the addition of some key taxa. We furthermore present our TEnriAn (Target Enrichment Analysis) workflow for processing and combining target enrichment, transcriptomic and genomic data. A target enrichment kit and workflow to sequence and process hundreds to thousands of loci for recent and old Lepidoptera specimens is presented. Taxa are added to the Lepidoptera phylogeny based on museum specimens collected between 1892 and 2017. Lepidoptera systematic relationships are discussed.
Article
Natural history museums are unique spaces for interdisciplinary research and educational innovation. Through extensive exhibits and public programming and by hosting rich communities of amateurs, students, and researchers at all stages of their careers, they can provide a place-based window to focus on integration of science and discovery, as well as a locus for community engagement. At the same time, like a synthesis radio telescope, when joined together through emerging digital resources, the global community of museums (the 'Global Museum') is more than the sum of its parts, allowing insights and answers to diverse biological, environmental, and societal questions at the global scale, across eons of time, and spanning vast diversity across the Tree of Life. We argue that, whereas natural history collections and museums began with a focus on describing the diversity and peculiarities of species on Earth, they are now increasingly leveraged in new ways that significantly expand their impact and relevance. These new directions include the possibility to ask new, often interdisciplinary questions in basic and applied science, such as in biomimetic design, and by contributing to solutions to climate change, global health and food security challenges. As institutions, they have long been incubators for cutting-edge research in biology while simultaneously providing core infrastructure for research on present and future societal needs.
Here we explore how the intersection between pressing issues in environmental and human health and rapid technological innovation have reinforced the relevance of museum collections. We do this by providing examples as food for thought for both the broader academic community and museum scientists on the evolving role of museums. We also identify challenges to the realization of the full potential of natural history collections and the Global Museum to science and society and discuss the critical need to grow these collections. We then focus on mapping and modelling of museum data (including place-based approaches and discovery), and explore the main projects, platforms and databases enabling this growth. Finally, we aim to improve relevant protocols for the long-term storage of specimens and tissues, ensuring proper connection with tomorrow's technologies and hence further increasing the relevance of natural history museums.
How to cite this article: Bakker FT, Antonelli A, Clarke JA, Cook JA, Edwards SV, Ericson PGP, Faurby S, Ferrand N, Gelang M, Gillespie RG, Irestedt M, Lundin K, Larsson E, Matos-Maraví P, Müller J, von Proschwitz T, Roderick GK, Schliep A, Wahlberg N, Wiedenhoeft J, Källersjö M. 2020. The Global Museum: natural history collections and the future of evolutionary science and public education. PeerJ 8:e8225 http://doi.org/10.7717/peerj.8225
Article
Butterflies and moths (Lepidoptera) are one of the major super-radiations of insects, comprising nearly 160,000 described extant species. As herbivores, pollinators, and prey, Lepidoptera play a fundamental role in almost every terrestrial ecosystem. Lepidoptera are also indicators of environmental change and serve as models for research on mimicry and genetics. They have been central to the development of coevolutionary hypotheses, such as butterflies with flowering plants and moths' evolutionary arms race with echolocating bats. However, these hypotheses have not been rigorously tested, because a robust lepidopteran phylogeny and timing of evolutionary novelties are lacking. To address these issues, we inferred a comprehensive phylogeny of Lepidoptera, using the largest dataset assembled for the order (2,098 orthologous protein-coding genes from transcriptomes of 186 species, representing nearly all superfamilies), and dated it with carefully evaluated synapomorphy-based fossils. The oldest members of the Lepidoptera crown group appeared in the Late Carboniferous (∼300 Ma) and fed on nonvascular land plants. Lepidoptera evolved the tube-like proboscis in the Middle Triassic (∼241 Ma), which allowed them to acquire nectar from flowering plants. This morphological innovation, along with other traits, likely promoted the extraordinary diversification of superfamily-level lepidopteran crown groups. The ancestor of butterflies was likely nocturnal, and our results indicate that butterflies became day-flying in the Late Cretaceous (∼98 Ma). Moth hearing organs arose multiple times before the evolutionary arms race between moths and bats, perhaps initially detecting a wide range of sound frequencies before being co-opted to specifically detect bat sonar. Our study provides an essential framework for future comparative studies on butterfly and moth evolution.
Keywords: Lepidoptera | coevolution | phylogeny | angiosperms | bats
Article
Biologists marvel at the powers of adaptive convergence, when distantly related animals look alike. While mimetic wing patterns of butterflies have fooled predators for millennia, entomologists inferred that mimics were distant relatives despite similar appearance. However, the obverse question has not been frequently asked. Who are the close relatives of mimetic butterflies and what are their features? As opposed to close convergence, divergence from a non-mimetic relative would also be extreme. When closely related animals look unalike, it is challenging to pair them. Genomic analysis promises to elucidate evolutionary relationships and shed light on molecular mechanisms of divergence. We chose the firetip skipper butterfly as a model due to its phenotypic diversity and abundance of mimicry. We sequenced and analysed whole genomes of nearly 120 representative species. Genomes partitioned this subfamily Pyrrhopyginae into five tribes (1 new), 23 genera and, additionally, 22 subgenera (10 new). The largest tribe Pyrrhopygini is divided into four subtribes (three new). Surprisingly, we found five cases where a uniquely patterned butterfly was formerly placed in a genus of its own and separately from its close relatives. In several cases, extreme and rapid phenotypic divergence involved not only wing patterns but also the structure of the male genitalia. The visually striking wing pattern difference between close relatives frequently involves disappearance or suffusion of spots and colour exchange between orange and blue. These differences (in particular, a transition between unspotted black and striped wings) happen recurrently on a short evolutionary time scale, and are therefore probably achieved by a small number of mutations.
Article
For centuries, biologists have used phenotypes to infer evolution. For decades, a handful of gene markers have given us a glimpse of the genotype to combine with phenotypic traits. Today, we can sequence entire genomes from hundreds of species and gain yet closer scrutiny. To illustrate the power of genomics, we have chosen skipper butterflies (Hesperiidae). The genomes of 250 representative species of skippers reveal rampant inconsistencies between their current classification and a genome-based phylogeny. We use a dated genomic tree to define tribes (six new) and subtribes (six new), to overhaul genera (nine new) and subgenera (three new), and to display convergence in wing patterns that fooled researchers for decades. We find that many skippers with similar appearance are distantly related, and several skippers with distinct morphology are close relatives. These conclusions are strongly supported by different genomic regions and are consistent with some morphological traits. Our work is a forerunner to genomic biology shaping biodiversity research.
Article
Background: Multiple Sequence Alignments (MSAs) are the starting point of molecular evolutionary analyses. Errors in MSAs generate a non-historical signal that can lead to incorrect inferences. Therefore, numerous efforts have been made to reduce the impact of alignment errors, by improving alignment algorithms and by developing methods to filter out poorly aligned regions. However, MSAs do not only contain alignment errors, but also primary sequence errors. Such errors may originate from sequencing errors, from assembly errors, or from erroneous structural annotations (such as incorrect intron/exon boundaries). Even though their existence is acknowledged, the impact of primary sequence errors on evolutionary inference is poorly characterized. Results: In a first step to fill this gap, we have developed a program called HmmCleaner, which detects and eliminates these errors from MSAs. It uses profile hidden Markov models (pHMM) to identify sequence segments that poorly fit their MSA and selectively removes them. We assessed its performance using > 700 amino-acid MSAs from prokaryotes and eukaryotes, in which we introduced several types of simulated primary sequence errors. The sensitivity of HmmCleaner towards simulated primary sequence errors was > 95%. In a second step, we compared the impact of segment-filtering software (HmmCleaner and PREQUAL) relative to commonly used block-filtering software (BMGE and trimAl) on evolutionary analyses. Using real data from vertebrates, we observed that segment-filtering methods improve the quality of evolutionary inference more than the currently used block-filtering methods. The former were especially effective at improving branch length inferences, and at reducing the false positive rate during detection of positive selection. Conclusions: Segment-filtering methods such as HmmCleaner accurately detect simulated primary sequence errors. Our results suggest that these errors are more detrimental than alignment errors.
However, they also show that stochastic (sampling) error is predominant in single-gene evolutionary inferences. Therefore, we argue that MSA filtering should focus on segment instead of block removal and that more studies are required to find the optimal balance between accuracy improvement and stochastic error increase brought by data removal.
Article
Heracula discivitta Moore is an uncommon moth species currently recorded from India, Nepal and China. Although this species has traditionally been placed in Lymantriinae, its systematic position in Macroheterocera has been enigmatic due to its unique morphological features. Here we used molecular and morphological data to explore the systematic position of H. discivitta. Our molecular phylogenetic analyses indicate that this species is sister to Pseudobiston pinratanai Inoue, a member of a recently established monotypic family Pseudobistonidae. The examinations of morphological features further show that H. discivitta shares synapomorphies with Pseudobistonidae. Based on the analysis results, we propose a new subfamily of Pseudobistonidae (Heraculinae subfam.n.) to accommodate H. discivitta. The resemblance of the habitus to that of the brahmaeid genus Calliprogonos Mell & Hering is discussed. This published work has been registered on ZooBank, http://zoobank.org/urn:lsid:urn:lsid:zoobank.org:pub:63D17850‐6D51‐4E03‐A5D6‐F9EF6E7AF402.
Preprint
High throughput sequencing (HTS) has revolutionized the field of ancient DNA (aDNA) by facilitating recovery of nuclear DNA for greater inference of evolutionary processes in extinct species than is possible from mitochondrial DNA alone. We used HTS to obtain ancient DNA from the little bush moa (Anomalopteryx didiformis), one of the iconic species of large, flightless birds that became extinct following human settlement of New Zealand in the 13th century. In addition to a complete mitochondrial genome at 249.9X depth of coverage, we recover almost 900 Mb of the moa nuclear genome by mapping reads to a high quality reference genome for the emu (Dromaius novaehollandiae). This first nuclear genome assembly for moa covers approximately 75% of the 1.2 Gb emu reference with sequence contiguity sufficient to identify more than 85% of bird universal single-copy orthologs. From this assembly, we isolate 40 polymorphic microsatellites to serve as a community resource for future population-level studies in moa. We also compile data for a suite of candidate genes associated with vertebrate limb development and show that the wingless moa phenotype is likely not attributable to gene loss or pseudogenization among this candidate set. We also identify potential function-altering coding sequence variants in moa for future experimental assays.
Article
Evolutionary relationships have remained unresolved in many well-studied groups, even though advances in next-generation sequencing and analysis, using approaches such as transcriptomics, anchored hybrid enrichment, or ultraconserved elements, have brought systematics to the brink of whole genome phylogenomics. Recently, it has become possible to sequence the entire genomes of numerous non-biological models in parallel at reasonable cost, particularly with shotgun sequencing. Here we identify orthologous coding sequences from whole-genome shotgun sequences, which we then use to investigate the relevance and power of phylogenomic relationship inference and time-calibrated tree estimation. We study an iconic group of butterflies - swallowtails of the family Papilionidae - that has remained phylogenetically unresolved, with continued debate about the timing of their diversification. Low-coverage whole genomes were obtained using Illumina shotgun sequencing for all genera. Genome assembly coupled to BLAST-based orthology searches allowed extraction of 6,621 orthologous protein-coding genes for 45 Papilionidae species and 16 outgroup species (with 32% missing data after cleaning phases). Supermatrix phylogenomic analyses were performed with both maximum-likelihood (IQ-TREE) and Bayesian mixture models (PhyloBayes) for amino acid sequences, which produced a fully resolved phylogeny providing new insights into controversial relationships. Species tree reconstruction from gene trees was performed with ASTRAL and SuperTriplets and recovered the same phylogeny. We estimated gene site concordant factors to complement traditional node-support measures, which strengthens the robustness of inferred phylogenies. 
Bayesian estimates of divergence times based on a reduced dataset (760 orthologs and 12% missing data) indicate a mid-Cretaceous origin of Papilionoidea around 99.2 million years ago (Ma) (95% credibility interval: 68.6-142.7 Ma) and Papilionidae around 71.4 Ma (49.8-103.6 Ma), with subsequent diversification of modern lineages well after the Cretaceous-Paleogene event. These results show that shotgun sequencing of whole genomes, even when highly fragmented, represents a powerful approach to phylogenomics and molecular dating in a group that has previously been refractory to resolution.