
Interrogating Genomic-Scale Data to Resolve Recalcitrant Nodes in the Spider Tree of Life


Siddharth Kulkarni,1,2,* Robert J. Kallal,2 Hannah Wood,2 Dimitar Dimitrov,3 Gonzalo Giribet,4
and Gustavo Hormiga1
1Department of Biological Sciences, The George Washington University, Washington, D.C.
2Department of Entomology, National Museum of Natural History, Washington, D.C.
3Department of Natural History, University Museum of Bergen, University of Bergen, Bergen,
4Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology,
Harvard University, Cambridge, MA
*Corresponding author: E-mail:
Downloaded from by guest on 08 November 2020
Genome-scale data sets are converging on robust, stable phylogenetic hypotheses for many
lineages; however, some nodes have shown disagreement across classes of data. We use
spiders (Araneae) as a system to identify the causes of incongruence in phylogenetic signal
between three classes of data: exons (as in phylotranscriptomics), non-coding regions (included
in ultraconserved elements [UCE] analyses), and a combination of both (as in UCE analyses).
Gene orthologs, coded as amino acids and nucleotides (with and without third codon positions),
were generated by querying published transcriptomes for UCEs, recovering 1,931 UCE loci
(codingUCEs). We expected that congeners represented in the codingUCE and UCEs data
would form clades in the presence of phylogenetic signal. Non-coding regions derived from UCE
sequences were recovered to test the stability of relationships. Phylogenetic relationships
resulting from all analyses were largely congruent. All nucleotide data sets from transcriptomes,
UCEs, or a combination of both recovered similar topologies in contrast with results from
transcriptomes analyzed as amino acids. Most relationships inferred from low occupancy data
sets, containing several hundreds of loci, were congruent across Araneae, as opposed to high
occupancy data matrices with fewer loci, which showed more variation. Furthermore, we found
that low occupancy data sets analyzed as nucleotides (as is typical of UCE data sets) can result
in more congruent relationships than high occupancy data sets analyzed as amino acids (as in
phylotranscriptomics). Thus, omitting data, through amino acid translation or via retention of
only high occupancy loci, may have a deleterious effect in phylogenetic reconstruction.
Key words: Araneae, non-coding regions, phylogeny, target-capture, transcriptomics
Massive parallel sequencing and the exponential increase in the size of data sets have enabled
researchers to use a variety of genomic data types (whole genomes, transcribed gene regions,
introns, fast/slow evolving loci, etc.) to address specific evolutionary questions. These data sets
have rapidly dwarfed Sanger sequencing-based studies in terms of amounts of data (Mardis
2011); however, they have proven challenging to analyze. Although genome-scale data were once
celebrated as the gold standard for inferring evolutionary histories (Rokas et al. 2003; Gee 2003),
it is now clear that sheer quantity of data will not unequivocally resolve all problematic nodes in
a phylogeny.
Conflicting but highly supported phylogenetic relationships have emerged in many data sets,
even when containing hundreds or thousands of loci.
Furthermore, the objective quantification of branch support is obfuscated by widespread
reliance on the bootstrap support metric (in a maximum likelihood framework), among a few
others like posterior probability in a Bayesian framework. Bootstrap values are often inflated
when comparable numbers of sites indicate conflicting relationships for a given branch
(Felsenstein 1985). Such conflicts are common among large-scale data sets and therefore
bootstrap values are generally high. This conundrum has impacted phylogenetic studies of
many groups of organisms, including birds (Jarvis et al. 2014; Prum et al. 2015; Walker et al.
2018; Cloutier et al. 2019), placental mammals (Morgan et al. 2013; Romiguier et al. 2013),
extant angiosperms (Zanis et al. 2002; Wickett et al. 2014; Xi et al. 2014) and arachnids (e.g.,
Sharma et al. 2014; Ballesteros and Sharma 2019; Lozano-Fernández et al. 2019). In the
present study, we focus on the nature of the systematic conflict (with high bootstrap support for
alternative hypotheses) across genomic data sets addressing a yet to be satisfactorily resolved
problem in spider phylogenetics.
In recent studies on the spider tree of life, phylogenies resulting from the analysis of
either transcriptomes or ultraconserved elements (UCEs) have largely converged on similar
topologies (e.g., Garrison et al. 2016; Fernández et al. 2018; Kulkarni et al. 2020; Dimitrov and
Hormiga 2021; Kallal et al. in press). However, incongruence persists at some recalcitrant
nodes, which receive high support for contradicting hypotheses. Some of these incongruences, in
the context of spider systematics, include: (a) the placement of the RTA Clade (a group of
spiders characterized by the presence of a retrolateral tibial apophysis in the male palp, the
appendage that male spiders use for copulation) with respect to the “UDOH grade” (an
assemblage containing the spider families Uloboridae, Deinopidae, Oecobiidae and Hersiliidae);
(b) the placement of Nicodamoidea with respect to Araneoidea (the ecribellate orb weavers);
and (c) the interfamilial relationships of the miniature orb weaving families, a group informally
known as “symphytognathoids.” The “symphytognathoids” (Griswold et al. 1998) include the
families Anapidae, Mysmenidae, Theridiosomatidae, and Symphytognathidae (which includes
the smallest adult spider in the world, Patu digua; Forster and Platnick 1977). Few studies have
found support for the monophyly of “symphytognathoids”, and a particular study suggests that
Synaphridae also belongs to this group (Lopardo et al. 2011). Here, we focus on the
relationships of the “symphytognathoid” families as a major area of conflict in the spider tree of
life by comparing a diversity of approaches and data classes and their effects on this particular
The monophyly of “symphytognathoid” families has been supported, although not
formalized as a taxon, by morphological and behavioral characters (Griswold et al. 1998; Schutt
2003; Lopardo and Hormiga 2008; Lopardo et al. 2011; Hormiga and Griswold 2014), but these
families have appeared as either paraphyletic or polyphyletic in molecular phylogenies based on
standard Sanger markers (Dimitrov et al. 2017; Wheeler et al. 2017) or transcriptomes
(Fernández et al. 2018; Kallal et al. in press). Lopardo et al.’s (2011) extensive Sanger-based
data set supported “symphytognathoid” monophyly only when the nucleotide data were
analyzed in combination with phenotypic data. Recently, an analysis using target enrichment
methods to capture ultraconserved elements (UCEs) provided the first molecular support for the
monophyly of “symphytognathoids” (ultrafast bootstrap >95), although only with the analyzed
low occupancy data sets (Kulkarni et al. 2020). This result was surprising, given the lack of
support for symphytognathoid monophyly in all prior molecular analyses, including
phylogenomic data sets analyzed as amino acid data in a maximum likelihood framework (Kallal
et al. in press). In that study, the parsimony analysis of the amino acid data set recovered
Theridiosomatidae as the sister group of Araneidae, with the remaining “symphytognathoids”
forming a monophyletic group (Kallal et al. in press).
The paradox of highly supported but incongruent relationships requires a critical
assessment of the nature of the data being analyzed, in our case in the context of the high
bootstrap support for both the monophyly and the polyphyly of “symphytognathoids” in different
analyses. The phylogenetic relationships of the miniature orb-weavers offer an excellent system
to explore the nature of conflict between these two types of genomic data sets. One possible
approach, albeit unexplored up to this point, is to identify the phylogenetic signal common to
transcriptomic and UCE data sets. Transcriptomes, which are sequenced from mRNA, are often
analyzed as amino acids, and include only exonic regions. UCEs, on the other hand, are
sequenced from the genome, are typically analyzed as nucleotides, and include both exons
and non-coding regions. The possibility of combining the vast data sets of UCEs and
transcriptomes would not only enable an expanded taxon sampling, but also allow reconciliation
of the existing UCE and transcriptome data sets (e.g., Bossert et al. 2019). Furthermore,
because a recent study has shown that currently sequenced UCEs in Arachnida are mostly
exonic (Hedin et al. 2019), it should be possible to combine UCEs and transcriptomes in a
meaningful manner (Bossert et al. 2019; Hedin et al. 2019).
The present study aims to identify the causes of incongruence among transcriptome-based
and UCE-based sequences in phylogenetic analyses of spiders by leveraging data from
recent studies (e.g., Garrison et al. 2016; Fernández et al. 2018; Kulkarni et al. 2020; Kallal et
al. in press). Our approach was to reconstruct phylogenies using sequences from
transcriptomes, UCEs, and a combination of data sources, at both the amino acid and
nucleotide level. We then analyzed these data sets using different phylogenetic methods at
different occupancy levels, while also exploring the phylogenetic signal of non-coding regions,
something rarely attempted in this kind of phylogenetic analysis.
First, we hypothesize that transcriptomes contain ultraconserved regions. On targeting
these coding ultraconserved regions using the Spider2Kv1 probe set (Kulkarni et al. 2020), we
reconstruct a phylogeny to resolve a number of selected recalcitrant nodes. The efficacy of the
transcriptome-derived UCEs for resolving phylogenetic relationships is tested by adding multiple
congeneric or confamilial taxa that represent coding UCEs, UCEs from previous studies and
UCEs obtained from genomes. We hypothesize that analyzing data as amino acids versus
nucleotides can influence the inferred phylogenetic relationships. To test this, we reconstruct
and compare phylogenies using nucleotide and amino acid data sets from sequences derived
from both transcriptomes and ultraconserved regions of the genome. We found that nucleotide
data sets converge on a similar topology including the recovery of the symphytognathoid
representatives as a clade while amino acid data sets did not. This outcome suggests that
reducing the number of characters included in nucleotide data sets via translation to amino
acids is detrimental to the topological stability of phylogenetic inference.
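The loss of characters through translation can be illustrated with a minimal Python sketch. It is not part of the original analyses: the two sequences are hypothetical, and the standard genetic code is assumed. Synonymous substitutions that distinguish the nucleotide sequences disappear once the codons are translated.

```python
# Illustrative sketch only: sequences are hypothetical, standard genetic code assumed.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Map every codon to its amino acid using the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate an in-frame nucleotide sequence into amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def count_differences(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


seq_a = "CTGGCAAAA"  # hypothetical taxon A
seq_b = "CTAGCGAAG"  # hypothetical taxon B, differing only at third codon positions

nucleotide_diffs = count_differences(seq_a, seq_b)                       # 3
amino_acid_diffs = count_differences(translate(seq_a), translate(seq_b))  # 0
```

In this toy case three variable nucleotide sites collapse to zero variable amino acid sites, which is the kind of character reduction discussed above.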
Results and Discussion
Statistics for all analyzed data sets are listed in Supplementary Table 1. A few clarifications are
provided here.
With the current taxon sample, 2,019 loci were obtained (before occupancy filtering), out of
which 1,931 UCEs were recovered from the transcriptomes analyzed in Fernández et al. (2018).
This means that the transcriptomic analysis of Fernández et al. (2018) contained at least 1,931
coding UCE regions, out of the 2,021 possible UCEs targeted by the spider probe set of
Kulkarni et al. (2020) (95.5%), making both data sets nearly identical in gene composition, and
thus straightforward to combine. The number of UCEs recovered from individual transcriptomes
(i.e., taxon-wise) ranged from 62 to 897 (µ = 436.18) (Supplementary Table 2). Two taxa out of
a total of six non-spider outgroup taxa, Phrynus marginemaculatus and Limulus polyphemus,
yielded too few UCE loci, so they were omitted from the final data set.
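The recovery percentage quoted above can be checked with a short Python sketch. The function name is hypothetical; the two counts are taken from the text.

```python
def recovery_percentage(recovered, targeted):
    """Percentage of targeted loci that were recovered."""
    return 100.0 * recovered / targeted


# 1,931 coding UCEs recovered out of 2,021 loci targeted by the probe set.
pct = recovery_percentage(1931, 2021)
print(f"{pct:.1f}%")  # 95.5%
```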
This data set included a combination of the taxon sample of UCEs recovered from the
transcriptomes (Fernández et al. 2018) and UCEs (Kulkarni et al. 2020). Three ingroup species
(Amaurobius ferox, Deinopis longipes and Nesticus cooperi) were removed from the AllUCEs50
data set because they did not have any locus represented in the final alignment. This data set
(AllUCEs50), with only 21 loci, resulted in a phylogeny in which many families were polyphyletic
and thus we have excluded this tree topology (see Supplementary trees) from our further
analyses and discussion.
Six terminals (Bothriurus keyserlingi, Centruroides sculpturatus, Sofanapis antillanca, Euryopis
sp., Nesticus gertschi and Chediminae sp.) were likewise removed from the phylogenetic
analyses because they were represented by very few (fewer than 30) non-coding regions.
Efficiency of the spider probes in capturing codingUCEs
Out of 248 taxa in the AllUCEs data set, 40 genera had multiple representatives obtained from
transcriptomes or UCEs. Although the UCE sequences were mapped to the spider probe set,
their library preparations were enriched with either the same (Kulkarni et al. 2020) or the
Arachnida probe set of Starrett et al. (2017) and Wood et al. (2018). All such genera were
monophyletic, except Segestria (Segestriidae) and Novanapis (Anapidae), which were
Phylogenetic relationships
The AllUCEs data sets had the highest taxon representation of all data sets, including 88 out of
120 known spider families (World Spider Catalog 2020). Topology tests were conducted
between different occupancies of the AllUCEs set. AllUCEs25 was significantly rejected
(Supplementary Table 3) and thus we base our discussion mainly on the AllUCEs10 data set
(Figure 1, Supplementary Figure 2) and highlight relevant aspects of other topologies briefly
below, except for non-coding regions, which are discussed in a separate section. Nodal
support values (SH-aLRT and ultrafast bootstrap [UFBoot], respectively) are given in
parentheses for each relationship. For gene and site concordance factors, refer to
Figure 1 and Supplementary Figure 2.
All data sets (except non-coding) yielded unanimously strong UFBoot support (>95%)
for the major Araneae lineages such as Mesothelae, Opisthothelae, Mygalomorphae and
Araneomorphae (Figures 1, 2, Supplementary Table 2). Within Araneomorphae, conflicting
relationships were recovered within the family Leptonetidae, among the
UDOH families, and with Araneoidea and the RTA Clade (Figures 1, 2, Supplementary Table 2,
see Suppl. trees). To briefly describe these conflicts, the UDOH families formed a clade with
AllAAUCEs, but constituted a grade in the analyses of all other data sets. Araneoidea was
recovered as the sister group to Nicodamoidea plus Eresidae in the analyses of all the data sets
except AllUCEs10 and its amino acid data sets (Figure 2). The placement of the long
Senoculidae branch varied across analyses from nesting within the RTA Clade to being the sister
group to the Araneae branch. This recalcitrance may be indicative of poor sequence quality.
Phylogenomic data as amino acids vs. nucleotides
Phylogenies resulting from the transcriptome data analyzed as amino acids (Fernández et al.
2018; Figure 3A of this study) and as nucleotide sequences (nucT67 data set, Figure 3B) at an
occupancy of 67% were congruent at many nodes. Notable differences were found among the
UDOH families and in the internal arrangement of Araneoidea. Although Deinopidae was sister
group to the RTA Clade in both trees, Hersiliidae was either the sister group of Oecobiidae
(amino acid data) or the sister group to Oecobiidae plus Uloboridae (nucleotide data; Figure 3).
Within Araneoidea, Theridiidae plus Anapidae formed a clade sister group to all remaining
araneoid families with amino acid data; however, with nucleotides, Theridiidae was the sister
group of a clade that included all the remaining araneoid families. This latter placement is
consistently recovered with all other data sets (see supplementary files).
Leptonetidae was recovered as monophyletic in recently published phylogenomic analyses
using amino acid data (Fernández et al. 2018; Michalik et al. 2019) and with all our amino acid
data sets (AAUCE and AllAAUCE), but the family was
paraphyletic with the nucleotide data sets (Figures 2, 3 and Supplementary Figure 3). This is
notable given that Archoleptoneta species are cribellate while all other leptonetids, including
other archoleptonetines (namely, Darkoneta), are ecribellate (Ledford and Griswold 2010). A
recent UCE study (analyzed as nucleotides) using a dense sample of leptonetids also recovered
diphyly with Archoleptonetinae separate from Leptonetinae (Ramírez et al. 2020).
The linyphioids (Linyphiidae and Pimoidae) were monophyletic with nucT data sets
(>95% UFBoot), codingUCEs (>95% UFBoot) and AAUCEs10 (<95% UFBoot); however, other
data sets recovered linyphioids as paraphyletic, although the pertinent nodes were poorly supported. The
monophyly of linyphioids has been supported with morphology (Hormiga 1994, 2008; Hormiga
and Tu 2008), six standard Sanger markers (Arnedo et al. 2009; Wheeler et al. 2016; Dimitrov
et al. 2017) and transcriptomes (Fernández et al. 2018).
Gnaphosidae was paraphyletic in both Fernández et al. (2018) (Supplementary Figure
3A) and the current study (Figures 1, 3B, Supplementary Figure 3B). In the current study,
Lamponidae nested within Gnaphosidae whereas in Fernández et al. (2018), Trachelidae,
Liocranidae and Lamponidae nested within Gnaphosidae. Optimized taxon sampling in this part
of the tree would be required to stabilize these relationships.
Removal of third codon positions
Including third codon positions in phylogenetic analyses may influence inferred relationships
because of saturation of synonymous nucleotide substitutions and rate heterogeneity, which could
explain differences between analyzing data as amino acids and as nucleotides. Some
authors therefore recommend excluding saturated third codon positions (e.g., Breinholt and Kawahara
2013; O’Connor et al. 2014). In our study, the trees resulting from the analyses with
(codingUCEs and nucT data sets) and without (3RcodingUCEs and 3RnucT data sets) third
codon positions were congruent at most nodes. The differences were as follows: the
3RcodingUCEs10 data set yielded Eresidae as the sister group of Uloboridae, whereas in all the
other data sets with the third codon positions removed, Eresidae was sister group to
Nicodamoidea; and the 3RcodingUCEs50 data set yielded a paraphyletic Palpimanoidea.
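Removing third codon positions amounts to dropping every third column of an in-frame coding alignment. The following Python sketch shows the operation on a hypothetical alignment. It assumes that every sequence starts at the first codon position and is not the pipeline used to build the 3RcodingUCEs or 3RnucT data sets.

```python
def remove_third_positions(seq):
    """Drop every third site of a sequence assumed to start at codon position 1."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


def strip_alignment(alignment):
    """Apply third-position removal to each sequence of an alignment (name -> sequence)."""
    return {name: remove_third_positions(seq) for name, seq in alignment.items()}


# Hypothetical in-frame alignment of two taxa.
alignment = {
    "taxon_A": "ATGGCAAAA",
    "taxon_B": "ATGGCGAAG",
}
stripped = strip_alignment(alignment)
# Both sequences become "ATGCAA": the differences were confined to third positions.
```

A real alignment would also need reading-frame information for each locus, since gaps or partial codons shift the position of third sites.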
Non-coding regions
All spider families were monophyletic with good support (>95% UFBoot), however most
interfamilial relationships and deeper nodes received poor support (see Supplementary trees).
Many groups that were corroborated with all other data sets were recovered differently when
non-coding regions were analyzed alone. For example, mygalomorphs were the sister group of
a paraphyletic Synspermiata that included Hypochilidae, and the austrochiloids were nested
within Palpimanoidea and polyphyletic UDOH families (Figure 2). These unusual relationships
could be an artifact due to the overall small amount of data included in this data set; a similar
pattern was also observed when analyzing high occupancy (>70%) coding region data sets
(Supplementary file). The high variability in sizes of non-coding regions between distantly
related taxa also requires an evaluation of the potential effect of alignment schemes on resulting
relationships. Analyzing them together with exons, as in AllUCEs, could be a useful strategy
since the conserved coding regions may alleviate the effects of alignment procedures. The use
of appended exonic regions to align non-coding regions needs further exploration. HybPiper
recovers non-exonic regions which may also include intergenic regions in addition to non-coding
regions, which are difficult to parse.
Monophyly of the miniature orb weavers
The “symphytognathoids” were monophyletic in the trees resulting from the analyses of the
codingUCEs, AAUCEs, AllUCEs, AllAAUCEs and nucT, except AAUCEs50 and nucT67 which
recovered Theridiosomatidae as sister group to Araneidae while the remaining
“symphytognathoids” formed a clade. In the AllUCEs tree, this clade included the families
Anapidae, Mysmenidae, Symphytognathidae, Synaphridae and Theridiosomatidae (100/100
UFBoot/SH-aLRT for the whole clade), while the codingUCEs included all these families except
Symphytognathidae (not sampled). The family Synaphridae was sister group to Mysmenidae in
AllUCEs (100/100%), whereas it was sister group to Anapidae in codingUCEs phylogenies.
Only 2.29% of loci (~24 loci) and 29.5% of sites (~68,655 sites) support the monophyly of
“symphytognathoids” in the AllUCEs10 data set (Figure 1), meaning that the remaining sites and
loci support alternative relationships in lower fractions. In the trees resulting from the analyses
of the other data sets, AllUCEs, AllAAUCEs, codingUCEs and nucT, Theridiosomatidae was the
sister group of the remaining “symphytognathoids” with two exceptions of high occupancies, as
mentioned above (AAUCEs50 and nucT67). The AllAAUCEs recovered Theridiosomatidae as
sister group to Synaphridae plus Mysmenidae and this clade was sister group to
Symphytognathidae plus Anapidae (see supplementary files). The removal of third codon
positions from the transcriptomes analyzed as nucleotides (3RnucT data sets) supported
“symphytognathoid” monophyly at occupancies of 10, 25 and 50%, whereas at 67% occupancy,
Theridiosomatidae was the sister group of Araneidae and the other “symphytognathoid” families
formed a clade. The removal of third codon positions from UCEs derived from transcriptomes
(3RcodingUCEs data sets) rendered the “symphytognathoid” families polyphyletic (Table 2;
Supplementary trees).
The inclusion of Synaphridae within “symphytognathoids” had been suggested before
(Lopardo and Hormiga 2008; Lopardo et al. 2011), although these studies were cautious about
such placement due to the absence of Cyatholipidae representatives in their analyses.
Fernández et al. (2018) found Synaphridae to be the sister group of the linyphioid clade.
Because Kulkarni et al. (2020) did not include any synaphrid, its position using strictly UCE data
could not be tested. We included a synaphrid exemplar, Cepheia longiseta (from Fernández et
al. 2018), and our results corroborate the placement of Synaphridae within the
“symphytognathoid” clade.
The monophyly of “symphytognathoids” is supported by several morphological
synapomorphies (Lopardo et al. 2011). While morphology and UCEs support the monophyly of
“symphytognathoids”, six-gene Sanger-based data and sequences from transcriptomes
analyzed as amino acids do not support “symphytognathoid” monophyly (Lopardo et al. 2011;
Dimitrov et al. 2012, 2017; Wheeler et al. 2017; Fernández et al. 2018; Kallal et al. in press).
Unstable and conflicting “symphytognathoid” familial relationships hinder addressing questions
about the evolution of their unique diversity of web architectures, transformations in female
pedipalps (reduction and loss) and transformations of their respiratory systems. For example,
although anapids are referred to as miniature “orb weavers”, their web architecture is quite variable:
they are known to build typical orb webs and their modifications, sheet webs, or theridiid-like
cobwebs. Most mysmenids build spherical or planar orbs, symphytognathids build a
two-dimensional horizontal orb web, at least some synaphrids build sheet or irregular webs, and
theridiosomatids build orb webs, some of them highly modified (e.g., sticky lines connected to
water surface) (Coddington and Valerio 1980; Eberhard 1987; Rix and Harvey 2010; Lopardo et
al. 2011). In each of these “symphytognathoid” families (except Synaphridae), there is at least
one genus with a kleptoparasitic lifestyle accompanied by loss of the foraging web in all its
constituent species. Adult anapid females have either reduced segments in the pedipalp, a
knob-like protuberance, or have lost the palp entirely, like their putative sister family
Symphytognathidae. Female pedipalps in the remaining “symphytognathoid” families bear all
the segments, like all other spiders.
Our results and those from Kulkarni et al. (2020) indicate that “symphytognathoids” are
monophyletic when analyzed as nucleotide data and when about a hundred or more loci are
available. There is also a clear tradeoff between occupancy and phylogenetic signal. Low
occupancy data matrices contain more missing data than high occupancy data sets, and
missing data can influence the outcome of phylogenetic analyses, both topologically and in
branch lengths (Lemmon et al. 2009). In the case of “symphytognathoids”, a high occupancy
data set of 70% with 433 loci (“500Spid_70” data set of Kulkarni et al. 2020) also supported
“symphytognathoid” monophyly, suggesting that miniature orb weaving spiders are indeed a
Unstable nodes in the Spider Tree of Life
The phylogenetic relationships of the UDOH group of families relative to the RTA Clade and the
interfamilial relationships of Araneoidea vary across analytical conditions, depending on the type
(coding or coding plus non-coding) and amounts of data. For example, in the case of
Araneoidea, coding data (codingUCE, AAUCE, nucT) exclusively recover this clade as sister
group to Nicodamoidea plus Eresidae. However, when combined with non-exonic data,
Araneoidea is sister group to a clade consisting of Nicodamoidea plus Eresidae, the RTA Clade
and the UDOH families, with the exception of the AllUCEs25 data set. The UDOH grade
consists of Uloboridae, Deinopidae, Oecobiidae and Hersiliidae, of which the first two families
are the only cribellate orb weaving groups, while all remaining orb weaving spider families are
ecribellate and placed within Araneoidea. On the other hand, exploration of molecular data
across a variety of analytical treatments has shown that many nodes in the spider tree of life are
stable across different occupancies. For example, the sister group relationship of Nicodamoidea
and Eresidae, the Hypochilidae plus Filistatidae clade, the monophyly of Synspermiata, and the
“symphytognathoid” clade are all robust hypotheses.
Nodal support values
Overall, we found that the gene concordance and site concordance factor values were
correlated (Supplementary Figure 1a, c). The UFBoot was 100% for most nodes and the SH-aLRT
was mostly above 85% (Figure 1, Supplementary Figure 2). Both concordance factors
were above 50% for congeneric taxa (Figure 1), meaning that more than 50% of the sites and
loci support the monophyly of those genera. Gene and site concordance values ranged between
1 and 95%. These values were generally >50% for congeneric taxa and were lower between
families and deeper nodes (Figure 1). Several alternative placements, including that of
leptonetids, nicodamoids with respect to Araneoidea and the UDOH families, had high UFBoot
within our trees (see Suppl. files) and also compared to the trees of Fernández et al. (2018).
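The idea behind a gene concordance factor can be sketched in Python as the percentage of decisive gene trees that contain a given clade. This is a deliberately simplified, rooted-clade version with hypothetical taxa and trees. It does not reproduce the branch-based definition implemented in phylogenetic software, and it is not the procedure used to obtain the values reported here.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def gene_concordance(gene_trees, target):
    """Percentage of decisive gene trees in which the sampled target taxa form a clade.

    Simplifying assumption: a gene tree is decisive if it samples at least two
    target taxa and at least one taxon outside the target group.
    """
    target = frozenset(target)
    decisive = concordant = 0
    for tree in gene_trees:
        leaves, found = clades(tree)
        sampled = target & leaves
        if len(sampled) < 2 or sampled == leaves:
            continue  # this locus cannot inform the question
        decisive += 1
        if sampled in found:
            concordant += 1
    return 100.0 * concordant / decisive if decisive else float("nan")


# Four hypothetical gene trees over the taxa A, B, C and D.
gene_trees = [
    ((("A", "B"), "C"), "D"),    # A+B form a clade
    (("A", "C"), ("B", "D")),    # A+B do not form a clade
    (("A", "B"), ("C", "D")),    # A+B form a clade
    ("C", "D"),                  # not decisive: A and B are not sampled
]
gcf = gene_concordance(gene_trees, {"A", "B"})  # 2 of 3 decisive trees
```

Under this simplification a clade can receive 100% bootstrap support in a concatenated analysis while only a small fraction of individual loci recover it, which is why concordance factors and bootstrap values can differ so sharply.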
Occupancy and missing data
Our results show that high occupancy data sets may yield unstable relationships due to the
small number of genes often represented in such data sets (Figure 2, Supplementary Table 2,
Supplementary trees). A similar phenomenon of unusual relationships at high occupancies was
observed in phylogenetic analyses of spider transcriptomes (Kallal et al. in press). Low
occupancy data sets contain larger amounts of data but also contain larger amounts of missing
data. An increase in the proportion of missing data is known to increase the risk of systematic
error (Roure et al. 2013). However, recent empirical studies with genome scale data have
shown that excluding genes with high amounts of missing data may weaken the resolution and
consistency of the resulting tree (Prasanna et al. 2020). Chan et al. (2020) found that different
data classes such as UCEs, exons and introns contain different phylogenetic signal; however,
an unfiltered combination (low occupancy) of such data converged on a similar topology. One
study suggests that if allowing more missing data improves taxon and gene sampling,
lower occupancy matrices should be preferred (Streicher et al. 2016). In addition,
allowing missing data may make it possible to detect gene gains/losses specific to certain lineages. Such
information may be lost in high occupancy data sets due to the exclusion of genes present in
some clade versus sequencing failures. CAT+Γ models may alleviate systematic error (Roure et
al. 2013) but this was not tested in the present study. Evaluation of model adequacy (Ripplinger
and Sullivan 2010; Duchêne et al. 2018) may be a potential next step to further improve the
phylogenetic inference of the evolutionary history of spiders, but our goal here was to evaluate
for the first time the use of amino acids versus DNA.
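The tradeoff between occupancy and the number of retained loci can be illustrated with a minimal Python sketch of occupancy filtering. The loci, taxa and thresholds are hypothetical, and the sketch assumes that occupancy is the fraction of taxa represented at a locus; it is not the filtering code used in this study.

```python
def filter_by_occupancy(loci, n_taxa, min_occupancy):
    """Keep loci whose fraction of represented taxa is at least min_occupancy.

    loci maps a locus name to the set of taxa with a sequence for that locus.
    """
    return {
        name: taxa
        for name, taxa in loci.items()
        if len(taxa) / n_taxa >= min_occupancy
    }


# Hypothetical data set of four taxa and three loci.
loci = {
    "locus1": {"A", "B", "C", "D"},  # occupancy 1.00
    "locus2": {"A", "B"},            # occupancy 0.50
    "locus3": {"C"},                 # occupancy 0.25
}

low = filter_by_occupancy(loci, n_taxa=4, min_occupancy=0.25)   # all three loci
mid = filter_by_occupancy(loci, n_taxa=4, min_occupancy=0.50)   # locus1, locus2
high = filter_by_occupancy(loci, n_taxa=4, min_occupancy=0.75)  # locus1 only
```

Raising the threshold reduces missing data but also shrinks the matrix, which mirrors the pattern described above, where high occupancy data sets retain few loci.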
We have used spiders (Araneae) as a study system to address incongruence among different
classes of genomic data in phylogenetic analyses. We scrutinized sequence data from different
sources (i.e., mRNA and DNA) and analyzed the protein coding regions either as amino acids or
as nucleotides, with and without third codon positions; we also analyzed non-coding regions. All
data sets, except the non-coding data, converged upon a similar pattern of phylogenetic
relationships, which was also similar to the trees derived from low occupancy matrices resulting
from the analysis of UCEs from genomic data (Kulkarni et al. 2020). It is clear that lower
amounts of data, due to amino acid translation, increasing matrix occupancy, or both, can
cause topological conflicts at some nodes in the spider tree of life with the sequencing
strategies employed here. Although a threshold cannot be established as to how much data are
optimal to resolve such topological conflicts, at least 500 loci seem necessary, based on our
results. Our results suggest that using nucleotide data and/or low occupancies to analyze
thousands of loci may prove to be a better strategy for studying higher level phylogenetic
relationships than using amino acids and high occupancies which would yield a much smaller
data set.
Conflicting results are more difficult to interpret when mutually exclusive alternative
relationships are highly supported, particularly when using bootstrapping as a measure of
support on large data sets. Hence, alternative branch support measures that are
computationally tractable for genome-scale data sets, like concordance factors, need to be
further explored.
In the interest of spider systematics, we demonstrate that phylogenetic incongruences
can be reduced by analyzing genome-scale nucleotide data sets, especially at low occupancies.
Some of the contentious hypotheses, such as the phylogeny of “symphytognathoids”, were
impacted by the data class, composition and taxon sampling used. We recovered congruent
support for their monophyly across a range of low occupancy data sets. This robustly supported
hypothesis on the phylogenetic relationships of the miniature orb weaving families will provide
an opportunity to unravel the evolutionary history of foraging webs.
Materials and Methods
Taxon sampling
The ultraconserved sequences (UCEs) for this study were obtained from a series of studies
focusing on arachnids, including Starrett et al. (2017), Wood et al. (2018) and Kulkarni et al.
(2020). Transcriptomes were obtained from Bond et al. (2014), Fernández et al. (2014, 2018),
Garrison et al. (2016), Sharma et al. (2014), and Zhao et al. (2014). Ultraconserved loci were
also retrieved from publicly available spider genomes of Latrodectus hesperus (Theridiidae; i5k
Consortium 2013), Loxosceles reclusa (Sicariidae; i5k Consortium 2013), Trichonephila clavipes
(Araneidae; Babb et al. 2017), Parasteatoda tepidariorum (Theridiidae; Schwager et al. 2017)
and Stegodyphus mimosarum (Eresidae; Sanggaard et al. 2014). Outgroups include the
horseshoe crabs Limulus polyphemus and Tachypleus tridentatus (Xiphosura); the scorpions
Bothriurus keyserlingi, Centruroides sculpturatus, Chaerilus celebensis and Pandinus imperator
(Scorpiones); the whip-spiders Damon variegatus, Damon sp. and Phrynus marginemaculatus
(Amblypygi); the vinegaroon Mastigoproctus giganteus (Uropygi) and the short-tailed whip-
scorpion Stenochrus portoricensis (Schizomida). The analysis was rooted using Xiphosura
since it is the only sampled lineage outside Arachnopulmonata, irrespective of whether we follow the
traditional hypothesis of Xiphosura being an outgroup to Arachnida (e.g., Lozano-Fernández et
al. 2019), or the alternative hypothesis placing them within Arachnida (see Ballesteros and
Sharma 2019; Ballesteros et al. 2019).
Transcriptome Assembly
Raw sequences were corrected for read errors using Rcorrector (Song and Florea 2015). Low
quality reads and adapters were trimmed with Trim Galore! 0.2.6 (Babraham Bioinformatics),
setting the quality parameter to 30 and the phred cut-off to 33; reads shorter than 25 bp
were discarded. Ribosomal RNA was filtered using the default
settings in Bowtie 2.9.9 (Langmead and Salzberg 2012). De novo strand-specific assemblies
were generated using Trinity 2.0.6 (Grabherr et al. 2011; Haas et al. 2013) with the path
reinforcement distance set to 75. Redundancy reduction was done using CD-HIT-EST (Fu et al. 2012)
with 95% global similarity. Assemblies were completed using the Colonial One High
Performance Computing Cluster at The George Washington University and the Smithsonian
Institution High Performance Cluster at the Smithsonian Institution. Unlike in previous
phylotranscriptomic analyses of spiders (Bond et al. 2014; Fernández et al. 2014, 2018;
Garrison et al. 2016; Sharma et al. 2014; Zhao et al. 2014), the final DNA sequences were not
translated to amino acids.
Recovering UCEs from Transcriptomes
The FASTA files of transcriptomes resulting from CD-HIT-EST were converted to 2bit format
using faToTwoBit (Kent 2002). Then, in the PHYLUCE environment (publicly available), we created
a temporary relational database summarizing probe-to-assembly matches by running the
phyluce_probe_run_multiple_lastzs_sqlite function on the 2bit files.
The ultraconserved loci were recovered by the
phyluce_probe_slice_sequence_from_genomes command. The resulting FASTA files were
treated as contigs and used to match the reads to the Spider2Kv1 probes.
Analyzing UCEs as amino acids
The nucleotide reads from UCE and transcriptome contigs were assembled, aligned, trimmed
and processed to obtain selected loci with taxon occupancies of 10%, 25% and 50%
using PHYLUCE. All locus files in nexus format were converted to FASTA format and translated to
amino acids using seqmagick. These translated
UCE loci were concatenated using HybPiper (Johnson et al. 2016).
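
The translation step can be illustrated with a minimal sketch. It is not the seqmagick implementation; it assumes the standard genetic code and that each sequence is already in frame from its first position, which may not hold for every UCE locus. The function name `translate` and the handling of gaps and ambiguous codons are illustrative choices.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1) in TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an in-frame nucleotide sequence to amino acids.

    Assumptions (not from the paper): reading frame starts at position 0,
    a fully gapped codon becomes '-', and any codon containing ambiguity
    codes or partial gaps becomes 'X'. Trailing bases that do not fill a
    codon are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    protein = []
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if codon == "---":
            protein.append("-")
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)
```

Translation collapses synonymous codons into one state, which is one way in which amino acid coding reduces the amount of data available for tree inference.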
Analyzing transcriptomes as nucleotides
The FASTA files of transcriptomes resulting from CD-HIT-EST were translated to amino acids
using TransDecoder (Haas et al. 2013). Orthologs were recovered from the peptide sequences using
BUSCO (Simão et al. 2015). Nucleotide data with ortholog indices and gene files were obtained
using NOrthGen (Supplementary Fig. 4). Gene files
were aligned using MAFFT v7 (Katoh and Standley 2013) and trimmed using trimAl v1.2
(Capella-Gutiérrez et al. 2009). All orthologs were concatenated using HybPiper (Johnson et
al. 2016). Third codon positions were removed using rmThirdCodon.
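
As a rough illustration of third codon position removal (this is not the rmThirdCodon script), the sketch below drops every third column from an aligned sequence. It assumes the alignment is in frame, starting at the first codon position; the function name is hypothetical.

```python
def remove_third_positions(aligned_seq: str) -> str:
    """Return the sequence with every third codon position removed.

    Assumes the aligned sequence starts at codon position 1 and that
    gaps were introduced in whole-codon units, so column index modulo 3
    identifies the codon position.
    """
    return "".join(base for i, base in enumerate(aligned_seq) if i % 3 != 2)
```

For example, `remove_third_positions("ATGGCCTAA")` keeps the first two bases of each codon and returns `"ATGCTA"`.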
Obtaining non-coding regions
Non-coding regions were extracted from the raw UCE sequence files obtained from Starrett et
al. (2017), Wood et al. (2018) and Kulkarni et al. (2020). A target file database of exons was
compiled using UCEs extracted from the transcriptomes of Damon variegatus, Loxosceles
deserta, Nicodamidae sp., Trichonephila clavipes, Hebestatis theveneti, Palpimanus gibbulus,
Kukulcania hibernalis, Stegodyphus mimosarum, Liphistius malayanus, Anahita punctulata and
Megahexura fulva from Fernández et al. (2018) and the genome of Parasteatoda tepidariorum
(Schwager et al. 2017). These taxa were chosen to represent Araneae-wide samples and their
closest relatives used as outgroups. HybPiper (Johnson et al. 2016) was run on the raw UCE
sequence files and matched against the target file. After exon matching was completed, we
used the retriever pipeline to extract the non-coding sequences from the raw UCE sequences.
Sequences shorter than 50 bp (an arbitrary threshold) were deleted and the remaining
non-coding sequences were aligned using MAFFT v7 (Katoh and Standley 2013) and
concatenated using HybPiper (Johnson et al. 2016).
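
The length filter described above can be sketched as follows. This is an illustrative helper, not part of the pipeline used in the study; it assumes sequences are held in a dictionary keyed by sequence name and that gap characters should not count toward length.

```python
def drop_short_sequences(seqs: dict[str, str], min_len: int = 50) -> dict[str, str]:
    """Keep only sequences with at least `min_len` non-gap characters.

    The 50 bp default mirrors the arbitrary threshold stated in the text;
    ignoring '-' characters is an assumption made for this sketch.
    """
    return {
        name: seq
        for name, seq in seqs.items()
        if len(seq.replace("-", "")) >= min_len
    }
```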
Phylogenomic analyses
The ultraconserved loci recovered from the transcriptomes are referred to as codingUCEs in the
following text. We built eight data sets (Supplementary Table 2), as follows. All data sets (Figure
5) were analyzed at different occupancies, for a total of 15 different analyses (Supplementary
Table 2):
1. codingUCEs data set: The ultraconserved elements recovered from transcriptomes and
analyzed as nucleotide sequences with all codon positions at occupancies of 10%, 25% and
50%. This data set contains only exons that are ultraconserved.
2. AAUCEs data set: Sequences from codingUCEs, above, were translated to amino acids and
analyzed at occupancies of 10%, 25% and 50%.
3. AllUCEs data set: The codingUCEs data set was combined with the UCEs from taxa included
in Kulkarni et al. (2020) and analyzed at occupancies of 10%, 25% and 50%. This data set of
ultraconserved elements contains both exons and non-coding regions.
4. AllAAUCEs data set: The amino acid sequences for a taxon sampling similar to that of the
AllUCEs data set, analyzed at occupancies of 10%, 25% and 50%. This data set contains only
exons that are ultraconserved.
5. nucT data set: Transcriptomes analyzed as nucleotides with all codon positions at
occupancies of 10%, 25%, 50% and 67%. This data set contains only exons that may
or may not be ultraconserved.
6. non-coding regions data set: Non-coding regions obtained from the UCE data set of Kulkarni
et al. (2020).
7. 3RcodingUCEs data set: Third codon positions removed from the codingUCEs data set.
8. 3RnucT data set: Third codon positions removed from the nucT data set.
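
The occupancy thresholds used for the data sets above can be illustrated with a small sketch. It is not the PHYLUCE implementation; it assumes that occupancy means the fraction of sampled taxa in which a locus was recovered, and the function and variable names are hypothetical.

```python
def filter_by_occupancy(
    loci: dict[str, set[str]], n_taxa: int, min_occupancy: float
) -> list[str]:
    """Return the names of loci present in at least `min_occupancy` of taxa.

    `loci` maps a locus name to the set of taxa in which it was recovered;
    `min_occupancy` is a fraction (0.10, 0.25, 0.50, ...). A locus exactly
    at the threshold is retained, which is an assumption of this sketch.
    """
    if n_taxa <= 0:
        raise ValueError("n_taxa must be positive")
    return sorted(
        name for name, taxa in loci.items() if len(taxa) / n_taxa >= min_occupancy
    )
```

Lowering the threshold retains more loci at the cost of more missing data per locus, which is the trade-off the low versus high occupancy comparisons explore.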
Contigs from all DNA sequences were matched to the Spider2Kv1 probe set (Kulkarni et al.
2020) at minimum coverage and minimum identity of 65 each. Phylogenetic analyses were
performed on the unpartitioned concatenation of loci using IQ-TREE v.1.6.9 (Nguyen et al.
2015). Model selection was performed for each data set using the TEST option of ModelFinder in
IQ-TREE (Kalyaanamoorthy et al. 2017; Hoang et al. 2018).
Nodal support was estimated via 1000 UFBoot replicates (Hoang et al. 2018) and
Shimodaira-Hasegawa-like approximate likelihood ratio test (SH-aLRT) (Guindon et al. 2010).
To reduce the risk of overestimating branch support with UFBoot due to model violations, we
added the -bnni option. With this option, UFBoot further optimizes each bootstrap tree
using a hill-climbing nearest neighbor interchange (NNI) search based on the corresponding
bootstrap alignment (Hoang et al. 2018). We also used concordance factors, a metric focusing on
whether the best tree represents the signal well, as implemented in IQ-TREE v1.7-betaX (Minh
et al. 2020). The gene concordance factor (gCF) indicates the percentage of gene trees containing
a given branch of the maximum likelihood tree, and the site concordance factor (sCF) indicates the
percentage of decisive alignment sites supporting a branch (Minh et al. 2020). The sCF also
provides insights into incomplete lineage sorting, which may cause discordance between the sites
and the resulting trees (Zhang et al. 2019). We plotted the gCF against the sCF with respect to
UFBoot and SH-aLRT support using R version 3.6.0 (R Core Team 2019).
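
The gene concordance factor defined above can be sketched in simplified form. This is not the IQ-TREE implementation: it assumes every gene tree contains all taxa (so every gene tree is decisive for the branch) and that each tree has already been reduced to a set of clades, each stored as a frozenset of taxon names. All names here are illustrative.

```python
def gene_concordance_factor(
    branch: frozenset[str], gene_trees: list[set[frozenset[str]]]
) -> float:
    """Percentage of gene trees that contain `branch`.

    `branch` is the set of taxa on one side of a branch in the reference
    tree; each gene tree is given as the set of such taxon sets. Gene trees
    with missing taxa, which IQ-TREE handles through decisiveness, are
    outside the scope of this sketch.
    """
    if not gene_trees:
        raise ValueError("at least one gene tree is required")
    concordant = sum(1 for clades in gene_trees if branch in clades)
    return 100.0 * concordant / len(gene_trees)
```

A branch recovered in three of four gene trees would therefore receive a gCF of 75, regardless of how strongly the concatenated alignment supports it.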
We chose our preferred tree to guide the discussion of the results by conducting
topology tests, namely, approximately unbiased (AU), bootstrap proportion (BP), SH-aLRT,
Kishino-Hasegawa (KH), and expected likelihood weight (ELW), using 10,000 resampling of
estimated log-likelihoods (RELL) replicates in IQ-TREE on the AllUCEs data set.
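
To make the RELL idea concrete, the sketch below computes bootstrap proportions (the BP test named above) from per-site log-likelihoods. It is a simplified illustration rather than the IQ-TREE procedure: the per-site log-likelihoods for each candidate tree are assumed to be already available, ties go to the first tree, and the function name and the default seed are hypothetical.

```python
import random


def rell_bootstrap_proportions(
    site_lnl: list[list[float]], n_replicates: int = 10000, seed: int = 1
) -> list[float]:
    """Estimate the RELL bootstrap proportion of each candidate tree.

    `site_lnl[t][s]` is the log-likelihood of site s under tree t. Each
    replicate resamples sites with replacement, sums the resampled site
    log-likelihoods per tree without re-optimizing any parameters, and
    credits the tree with the highest total.
    """
    n_trees = len(site_lnl)
    n_sites = len(site_lnl[0])
    rng = random.Random(seed)
    wins = [0] * n_trees
    for _ in range(n_replicates):
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = [sum(tree[i] for i in sample) for tree in site_lnl]
        best = max(range(n_trees), key=totals.__getitem__)
        wins[best] += 1
    return [w / n_replicates for w in wins]
```

Because the likelihoods are resampled rather than re-estimated, a large number of replicates such as the 10,000 used here remains computationally cheap even for genome-scale alignments.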
Supplementary Material
Supplementary data are available in the online version of this study.
Data availability
Sequences from the data sets of Fernández et al. (2018) and Kulkarni et al. (2020) were
analyzed in this study. No new data were generated in support of this research. The scripts for
NOrthGen are available at
All analyses were conducted on the Colonial One High Performance Computing Facility at The
George Washington University. SK was supported by a Weintraub Fellowship and by the Harlan
Research Fund. This study was supported by the US National Science Foundation grants (DEB
1457300, 1457539) to GH and GG and through multiple Putnam Expedition Grants from the
Museum of Comparative Zoology to GG. Additional support was provided by US National
Science Foundation grants DEB 1754289, 1754278, and DEB 1754262 to GH, GG and Sarah
Boyer. The authors are grateful to Silas Bossert, Amey Uchgaonkar, Nicolas Hazzi, Ligia
Benavides and Rosa Fernández for discussions. The authors also thank Martín J. Ramírez
and two anonymous reviewers as well as the Editors for their time and effort reviewing this
Author Contributions
All authors contributed to designing the study and writing the manuscript. S.K. and R.J.K.
conducted the analyses.
Arnedo MA, Hormiga G, Scharff N. 2009. Higher-level phylogenetics of linyphiid spiders
(Araneae, Linyphiidae) based on morphological and molecular evidence. Cladistics 25:231–262.
Babb PL, Lahens NF, Correa-Garhwal SM, Nicholson DN, Kim EJ, Hogenesch JB, et al. 2017.
The Nephila clavipes genome highlights the diversity of spider silk genes and their complex
expression. Nat Genet. 49:895–903.
Babraham Bioinformatics. Trim Galore! Accessed in January 2020.
Ballesteros JA, Sharma PP. 2019. A critical appraisal of the placement of Xiphosura
(Chelicerata) with account of known sources of phylogenetic error. Syst Biol. 68:896–917.
Bond JE, Garrison NL, Hamilton CA, Godwin RL, Hedin M, Agnarsson I. 2014. Phylogenomics
resolves a spider backbone phylogeny and rejects a prevailing paradigm for orb web evolution.
Curr Biol. 24:1765–1771.
Bossert S, Danforth BN. 2018. On the universality of target-enrichment baits for phylogenomic
research. Methods Ecol Evol. 9:1453–1460.
Bossert S, Murray EA, Almeida EAB, Brady SG, Blaimer BB, Danforth BN. 2019. Combining
transcriptomes and ultraconserved elements to illuminate the phylogeny of Apidae. Mol
Phylogenet Evol. 130:121–131.
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, Jones G, Knowles LL,
Lamichhaney S, Marcussen T, Morlon H, Nakhleh LK, Oxelman B, Pfeil B, Schliep A, Wahlberg
N, Werneck FP, Wiedenhoeft J, Willows-Munro S, Edwards SV. 2019. Embracing
heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ 7:e6399.
Breinholt JW, Kawahara AY. 2013. Phylotranscriptomics: saturated third codon positions
radically influence the estimation of trees based on next-gen data. Genome Biol Evol. 5:2082
Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. 2009. trimAl: a tool for automated
alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972–1973.
Chan KO, Hutter CR, Wood PL Jr, Grismer LL, Brown RM. 2019. Larger, unfiltered datasets are
more effective at resolving phylogenetic conflict: Introns, exons, and UCEs resolve ambiguities
in Golden-backed frogs (Anura: Ranidae; genus Hylarana). Mol Phylogenet Evol. 151:106899.
Cloutier A, Sackton TB, Grayson P, Clamp M, Baker AJ, Edwards SV. 2019. Whole-genome
analyses resolve the phylogeny of flightless birds (Palaeognathae) in the presence of an
empirical anomaly zone. Syst Biol. 68:937–955.
Coddington J, Valerio C. 1980. Observations on the web and behavior of Wendilgarda spiders
(Araneae: Theridiosomatidae). Psyche 87:93–105.
Dimitrov D, Hormiga G. 2021. Spider Diversification Through Space and Time. Annu. Rev.
Entomol. 66:1
Dimitrov D, Benavides LR, Arnedo MA, Giribet G, Griswold CE, Scharff N, Hormiga G. 2017.
Rounding up the usual suspects: a standard target-gene approach for resolving the interfamilial
phylogenetic relationships of ecribellate orb-weaving spiders with a new family-rank
classification (Araneae, Araneoidea). Cladistics 33:221–250.
Duchêne DA, Sebastian D, Ho SYW. 2018. Differences in performance among test statistics for
assessing phylogenomic model adequacy. Genome Biol Evol. 10:3751388.
Dunn CW, Giribet G, Edgecombe GD, Hejnol A. 2014. Animal phylogeny and its evolutionary
implications. Annu Rev Ecol Evol Syst. 45:371–395.
Eberhard WG. 1987. Web-building behavior of anapid, symphytognathid, and mysmenid
spiders. J Arachnol. 14:339–358.
Felsenstein J. 1985. Confidence limits on phylogenies: An approach using the bootstrap.
Evolution 39:783–791.
Fernández R, Hormiga G, Giribet G. 2014. Phylogenomic analysis of spiders reveals
nonmonophyly of orb weavers. Curr Biol. 24:1772–1777.
Fernández R, Kallal RJ, Dimitrov D, Ballesteros JA, Arnedo M, Giribet G, Hormiga G. 2018.
Phylogenomics, diversification dynamics, and comparative transcriptomics across the spider
tree of life. Curr Biol. 28:1489–1497.e5.
Forster RR, Platnick NI. 1977. A review of the spider family Symphytognathidae (Arachnida,
Araneae). Am Mus Novit. 2619:1–29.
Fu L, Niu B, Zhu Z, Wu S, Li W. 2012. CD-HIT: accelerated for clustering the next-generation
sequencing data. Bioinformatics 28:3150–3152.
Garb JE, Haney R, Schwager E, Gregorič M, Kuntner M, Agnarsson I, Blackledge T. 2019. The
transcriptome of Darwin’s bark spider silk glands predicts proteins contributing to dragline silk
toughness. Commun Biol. 2:275.
Garrison NL, Rodriguez J, Agnarsson I, Coddington JA, Griswold CE, Hamilton CA, Hedin M,
Kocot KM, Ledford JM, Bond JE. 2016. Spider phylogenomics: untangling the Spider Tree of
Life. PeerJ 4:e1719.
Gee H. 2003. Ending incongruence. Nature 425:782.
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L,
Raychowdhury R, Zeng Q, et al. 2011. Full-length transcriptome assembly from RNA-Seq data
without a reference genome. Nat Biotechnol. 29:644–652.
Griswold CE, Coddington JA, Hormiga G, Scharff N. 1998. Phylogeny of the orb-web building
spiders (Araneae, Orbiculariae: Deinopoidea, Araneoidea). Zool J Linnean Soc. 123:1–99.
Guindon S, Dufayard J, Lefort V, Anisimova M, Hordijk W, Gascuel O. 2010. New algorithms
and methods to estimate maximum-likelihood phylogenies: Assessing the performance of
PhyML 3.0. Syst Biol. 59:307–321.
Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, et al. 2013. De novo
transcript sequence reconstruction from RNA-seq using the Trinity platform for reference
generation and analysis. Nat Protoc. 8:1494–1512.
Hedin M, Derkarabetian S, Alfaro A, Ramírez MJ, Bond JE. 2019. Phylogenomic analysis and
revised classification of atypoid mygalomorph spiders (Araneae, Mygalomorphae), with notes on
arachnid ultraconserved element loci. PeerJ 7:e6864.
Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. 2018. UFBoot2: Improving the
ultrafast bootstrap approximation. Mol Biol Evol. 35:518–522.
Hormiga G. 1994. Cladistics and the comparative morphology of linyphiid spiders and their
relatives (Araneae, Araneoidea, Linyphiidae). Zool J Linnean Soc. 111:1–71.
Hormiga G. 2008. On the spider genus Weintrauboa (Araneae, Pimoidae), with a description of
a new species from China and comments on its phylogenetic relationships. Zootaxa 1814:1–20.
Hormiga G, Tu L. 2008. On Putaoa, a new genus of the spider family Pimoidae (Araneae) from
southern China, with a cladistic test of its monophyly and phylogenetic placement. Zootaxa
Hormiga G, Griswold CE. 2014. Systematics, phylogeny and evolution of orb-weaving spiders.
Annu Rev Entomol. 59:487–512.
i5K Consortium. 2013. The i5K Initiative: Advancing arthropod genomics for knowledge, human
health, agriculture, and the environment. J Hered. 104:595–600.
Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, Ho SYW, Faircloth BC, Nabholz B,
Howard JT et al. 2014. Whole-genome analyses resolve early branches in the tree of life of
modern birds. Science 346:1320–1331.
Johnson MG, Gardner EM, Liu Y, Medina R, Goffinet B, Shaw AJ, Zerega NJC, Wickett NJ.
2016. HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput
sequencing reads using target enrichment. Appl Plant Sci. 4:1600016.
Kallal R, Kulkarni S, Dimitrov D, Benavides LR, Arnedo M, Giribet G, Hormiga G. (in press)
Converging on the orb: denser taxon sampling elucidates spider phylogeny and new analytical
methods support repeated evolution of the orb web. Cladistics
Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. 2017. ModelFinder: fast
model selection for accurate phylogenetic estimates. Nat Meth. 14:587–589.
Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7:
Improvements in performance and usability. Mol Biol Evol. 30:772–780.
Kent WJ. 2002. BLAT – the BLAST-like alignment tool. Genome Res. 12:656–664.
Kulkarni S, Wood H, Lloyd M, Hormiga G. 2020. Spider-specific probe set for ultraconserved
elements offers new perspectives on the evolutionary history of spiders (Arachnida, Araneae).
Mol Ecol Resour. 20:185–203.
Langmead B, Salzberg S. 2012. Fast gapped-read alignment with Bowtie 2. Nat Meth. 9:357
Lozano-Fernández J, Tanner AR, Giacomelli M, Carton R, Vinther J, Edgecombe GD, Pisani D.
2019. Increasing species sampling in chelicerate genomic-scale data sets provides support for
monophyly of Acari and Arachnida. Nat Commun. 10:2295.
Mardis ER. 2011. A decade’s perspective on DNA sequencing technology. Nature 470:198
Michalik P, Kallal R, Dederichs TM, Labarque FM, Hormiga G, Giribet G, Ramírez MJ. 2019.
Phylogenomics and genital morphology of cave raptor spiders (Araneae, Trogloraptoridae)
reveal an independent origin of a flow-through female genital system. J Zool Syst Evol Res.
Laumer CE, Fernández R, Lemer S, Combosch D, Kocot KM, Riesgo A, Andrade SCS, Sterrer W,
Sørensen MV, Giribet G. 2019. Revisiting metazoan phylogeny with genomic sampling of all
phyla. Proc R Soc B 286:20190831.
Ledford JM, Griswold CE. 2010. A study of the subfamily Archoleptonetinae (Araneae,
Leptonetidae) with a review of the morphology and relationships for the Leptonetidae. Zootaxa
Lemmon AR, Emme SA, Lemmon EM. 2012. Anchored hybrid enrichment for massively
high-throughput phylogenomics. Syst Biol. 61:727–744.
Lemmon AR, Brown JB, Stanger-Hall K, Lemmon EM. 2009. The effect of ambiguous data on
phylogenetic estimates obtained by maximum likelihood and Bayesian inference. Syst Biol.
Lopardo L, Hormiga G. 2008. Phylogenetic placement of the Tasmanian spider Acrobleps
hygrophilus (Araneae, Anapidae) with comments on the evolution of the capture web in
Araneoidea. Cladistics 24:1–33.
Lopardo L, Giribet G, Hormiga G. 2011. Morphology to the rescue: molecular data and the
signal of morphological characters in combined phylogenetic analyses – a case study from
mysmenid spiders (Araneae, Mysmenidae), with comments on the evolution of web
architecture. Cladistics 27:278–330.
Minh BQ, Hahn M, Lanfear R. 2020. New methods to calculate concordance factors for
phylogenomic datasets. Mol Biol Evol. msaa106
Morgan CC, Foster PG, Webb AE, Pisani D, McInerney JO, O’Connell MJ. 2013.
Heterogeneous models place the root of the placental mammal phylogeny. Mol Biol Evol.
O'Connor DL, Runions A, Sluis A, Bragg J, Vogel JP, Prusinkiewicz P, et al. 2014. A division in
PIN-mediated auxin patterning during organ initiation in grasses. PLoS Comput Biol. 10:
Prasanna A, Gerber D, Kijpornyongpan T, Catherine Aime M, Doyle V, Nagy LG. 2020. Model
choice, missing data, and taxon sampling impact phylogenomic inference of deep
Basidiomycota relationships. Syst Biol. 69:17–37.
Prum RO, Berv JS, Dornburg A, Field DJ, Townsend JP, Lemmon EM, Lemmon AR. 2015. A
comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing.
Nature 526:569–573.
R Core Team. 2019. R: A language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria. URL
Ramírez MJ, Magalhaes ILF, Derkarabetian S, Ledford J, Griswold CE, Wood HW, Hedin M.
2020. Sequence-capture phylogenomics of true spiders reveals convergent evolution of
respiratory systems. Syst Biol.
Ripplinger J, Sullivan J. 2010. Assessment of substitution model adequacy using frequentist and
Bayesian methods. Mol Biol Evol. 27(12):2790–2803.
Rix M, Harvey M. 2010. The spider family Micropholcommatidae (Arachnida, Araneae,
Araneoidea): a relimitation and revision at the generic level. ZooKeys 36:1–321.
Rokas A, Williams B, King N, Carroll SB. 2003. Genome-scale approaches to resolving
incongruence in molecular phylogenies. Nature 425:798–804.
Romiguier J, Ranwez V, Delsuc F, Galtier N, Douzery EJP. 2013. Less is more in mammalian
phylogenomics: AT-rich genes minimize tree conflicts and unravel the root of placental
mammals. Mol Biol Evol. 30:2134–2144.
Roure B, Baurain D, Philippe H. 2013. Impact of missing data on phylogenies inferred from
empirical phylogenomic data sets. Mol Biol Evol. 30:197–214.
Sanggaard KW, Bechsgaard JS, Fang X, Duan J, Dyrlund TF, Gupta V, Jiang X, Cheng L, Fan
D, Feng Y. 2014. Spider genomes provide insight into composition and evolution of venom and
silk. Nat Commun. 5:3765.
Schütt K. 2003. Phylogeny of Symphytognathidae s.l. (Araneae, Araneoidea). Zool Scr. 32:129
Schwager EE, Sharma PP, Clarke T, Leite DJ, Wierschin T, Pechmann M, Akiyama-Oda Y,
Esposito L, Bechsgaard J, Bilde T et al. 2017. The house spider genome reveals an ancient
whole-genome duplication during arachnid evolution. BMC Biol. 15:62.
Sharma PP, Kaluziak S, Pérez-Porro AR, González VL, Hormiga G, Wheeler WC, Giribet G.
2014. Phylogenomic interrogation of Arachnida reveals systemic conflicts in phylogenetic signal.
Mol Biol Evol. 31:2963–2984.
Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO:
assessing genome assembly and annotation completeness with single-copy orthologs.
Bioinformatics 31:3210–3212.
Song L, Florea L. 2015. Rcorrector: efficient and accurate error correction for Illumina RNA-seq
reads. GigaScience 4:s13742-015-0089-y.
Starrett J, Derkarabetian S, Hedin M, Bryson RW, McCormack JE, Faircloth BC. 2017. High
phylogenetic utility of an ultraconserved element probe set designed for Arachnida. Mol Ecol
Resour. 17:812–823.
Streicher JW, Schulte JA, Wiens JJ. 2016. How should genes and taxa be sampled for
phylogenomic analyses with missing data? An empirical study in iguanian lizards. Syst Biol.
Walker JF, Brown JW, Smith SA. 2018. Analyzing contentious relationships and outlier genes in
phylogenomics. Syst Biol. 67:916–924.
Wheeler WC, Coddington JA, Crowley LM, Dimitrov D, Goloboff PA, Griswold CE, Hormiga G,
Prendini L, Ramírez MJ, Sierwald P, et al. 2017. The spider tree of life: phylogeny of Araneae
based on target-gene analyses from an extensive taxon sampling. Cladistics 33:574–616.
Wickett NJ, Mirarab S, Nguyen N, Warnow T, Carpenter E, Matasci N, Ayyampalayam S,
Barker MS, Burleigh JG, Gitzendanner MA, Ruhfel BR, Wafula E, et al. 2014.
Phylotranscriptomic analysis of the origin and early diversification of land plants. Proc Natl Acad
Sci USA 111:E4859–E4868.
Wood HM, González V, Lloyd M, Coddington J, Scharff N. 2018. Next-generation museum
genomics: Phylogenetic relationships among palpimanoid spiders using sequence capture
techniques (Araneae: Palpimanoidea). Mol Phylogenet Evol. 127:907–918.
World Spider Catalog. 2020. World Spider Catalog. Version 20.5. Natural History Museum
Bern, online, accessed on 16 January 2020.
Xi Z, Liu L, Rest JS, Davis CC. 2014. Coalescent versus concatenation methods and the
placement of Amborella as sister to water lilies. Syst Biol. 63:919–932.
Zanis MJ, Soltis DE, Soltis PS, Mathews S, Donoghue MJ. 2002. The root of the angiosperms
revisited. Proc Natl Acad Sci USA 99:6848–6853.
Zhao YJ, Zeng Y, Chen L, Dong Y, Wang W. 2014. Analysis of transcriptomes of three orb-web
spider species reveals gene profiles involved in silk and toxin. Insect Sci. 21:687–698.
Zhang MY, Williams JL, Lucky A. 2019. Understanding UCEs: a comprehensive primer on using
ultraconserved elements for arthropod phylogenomics. Insect Syst Div. 3:1–12.
Fig. 1. Maximum likelihood phylogeny of spiders resulting from the AllUCEs10 data set
(occupancy 10, 1,060 loci) collapsed to family level. Paraphyly is indicated by violet bars. A. All
major lineages of spiders at family level except the RTA Clade and Araneoidea; B. RTA Clade;
C. All 17 families of superfamily Araneoidea. The rhombi at the nodes indicate four support
values: Shimodaira-Hasegawa-like approximate likelihood ratio test (left top), ultrafast bootstrap
(right top), gene concordance factor (gCF) (left bottom) and site concordance factor (sCF) (right
bottom). The numbers at the nodes indicate clades as described. Branch lengths are not to
scale. For the original sampled tree, see Supplementary Fig. 2.
Fig. 2. Maximum likelihood phylogenies of spiders resulting from different data sets at various
occupancies. Each colored box indicates a data set corresponding to Supplementary Table 2.
The first and second rows represent phylogenies resulting from data analyzed as nucleotides
and amino acids respectively, of codingUCEs (outlined red) and AllUCEs (outlined blue).
Fig. 3. Comparison of phylogenetic relationships between A. transcriptomic phylogeny as
published by Fernández et al. (2018) using amino acids, and B. nucT (Fernández et al. 2018,
transcriptome data set analyzed as nucleotides). Both phylogenies were constructed using
occupancy of 67%. The highlighted blue box indicates Araneoidea families.
Fig. 4. Comparison of interfamilial relationships of Araneoidea. A. AllAAUCEs tree, B. AllUCEs
tree. Occupancy of both phylogenies was 10 %. Coloured branches indicate family relationships
that are congruent in both trees.
Fig. 5. Schematic representation of data classes analyzed in this study in a maximum likelihood
framework. Squares indicate original data sets from Fernández et al. (2018) and Kulkarni et al.
(2020), and circles indicate matrices analyzed in our study. Circles with red outline indicate
amino acid data set, black outline indicates non-coding region data set and the circles with
outline indicate nucleotide data sets. Abbreviation: UCE, ultraconserved elements.
Supplementary files
Supplementary Fig. 1. Site concordance factors mapped against gene concordance factors
with respect to ultrafast bootstrap (A,C) and Shimodaira-Hasegawa-like approximate likelihood
ratio test (B,D). Figures A and B indicate plots for the AllUCEs10 data set, and C and D indicate
plots for the AllUCEs25 data set.
Supplementary Fig. 2. Maximum likelihood phylogeny of spiders resulting from the AllUCEs10
data set (occupancy 10, 1,060 loci). A. A phylogeny with 248 taxa with taxon names in black
indicating taxa from UCE studies, blue indicating UCEs recovered from Fernández et al. (2018)
and red indicating UCEs recovered from genomes. B. A summary of tree A. The rhombi at the
nodes indicate four support values: Shimodaira-Hasegawa-like approximate likelihood ratio test
(left top), ultrafast bootstrap (right top), gene concordance factor (gCF) (left bottom) and site
concordance factor (sCF) (right bottom). The numbers at nodes indicate clades as described.
Supplementary Fig. 3. Comparison of phylogenetic relationships between A. transcriptomes as
published by Fernández et al. (2018), and B. codingUCEs (UCEs retrieved from the Fernández
et al. (2018) transcriptome data set, with occupancy of 10) analyzed as DNA. The highlighted
blue box indicates Araneoidea families. Note that the miniature orb weaving spider families
(“symphytognathoids”), namely Anapidae, Mysmenidae, Synaphridae and Theridiosomatidae
(Symphytognathidae not sampled), are polyphyletic in the left tree but monophyletic in the right tree.
Supplementary Fig. 4. Schematic representation of the workflow of the NOrthGen (Nucleotide
Ortholog Generator) modules used to map
ortholog identifiers from amino acids to nucleotide data. ABC and DEF are taxon name
exemplars. Dotted arrows indicate processes and red arrows indicate mandatory input files for
using NOrthGen.
Supplementary Table 1. List of taxa with their source of data and the number of UCE loci
recovered using the spider probe set. Abbreviations in the column datatype mean as follows: G-
Genome, T- Transcriptomes and UCE- Ultraconserved elements.
Supplementary Table 2. Settings for different data sets used in phylogenetic analyses at
minimum identity 65 and minimum coverage 65 used to match the Spider2Kv1 probe set
(Kulkarni et al. 2020) to contigs. Appended columns with black cells indicate relationships
written in newick format recovered from the respective data set and the white cells indicate an
alternative relationship.
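The identity and coverage thresholds in this caption amount to a simple filter on probe-to-contig matches, sketched below in Python. The `ProbeMatch` record and its field names are hypothetical and do not reflect the output format of any particular matching tool. Treating the thresholds as inclusive percentages is also an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeMatch:
    """Hypothetical record for one probe-to-contig alignment."""
    probe: str
    contig: str
    identity: float  # percent identity of the aligned region (assumed scale 0-100)
    coverage: float  # percent of the probe covered by the alignment (assumed scale 0-100)


def filter_matches(
    matches: List[ProbeMatch],
    min_identity: float = 65.0,
    min_coverage: float = 65.0,
) -> List[ProbeMatch]:
    """Keep matches that meet both thresholds (assumed inclusive)."""
    return [
        m for m in matches
        if m.identity >= min_identity and m.coverage >= min_coverage
    ]


# Hypothetical matches: only the first passes both thresholds.
candidates = [
    ProbeMatch("uce-1_p1", "contig_12", identity=92.0, coverage=88.0),
    ProbeMatch("uce-2_p1", "contig_40", identity=60.0, coverage=95.0),
    ProbeMatch("uce-3_p1", "contig_77", identity=80.0, coverage=50.0),
]
kept = filter_matches(candidates)
```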
Supplementary Table 3. Results of the topology tests conducted on the AllUCEs data set.
Supplementary Trees. Phylogenetic relationships obtained for all data sets.
... No extant mysmenid species, however, is known to have this hypothetical ancestral tracheal system. The mysmenid ancestral reconstruction based on the phylogenetic hypothesis of Kulkarni et al. (2021) renders a similar anterior respiratory arrangement; but due to the placement of Synaphridae as sister to Mysmenidae, the reconstruction of the posterior respiratory system is rendered ambiguous (median tracheae as third entapophyses, but either a simple or a complex lateral tracheal system) (not shown, refer to Figs. 5 and 7). The respiratory system for Brasilionata, Chanea, Gaoligonga, Mosu, Mysmeniola, Phricotelus, Simaoa, and Yamaneta remains unknown. ...
... We inferred the evolutionary transformations of the respiratory system of Mysmenidae and other symphytognathoid families based on the optimization of such characters onto the preferred optimal trees from Lopardo et al. (2011) and Kulkarni et al. (2021). Our optimizations offer different evolutionary hypotheses for both the anterior and posterior respiratory systems compared to the aforementioned hypotheses for symphytognathoids (Figs. 6 and 7). ...
... The plesiomorphic symphytognathoid arrangement seems to have re-evolved independently in the mysmenid Maymena. Furthermore, the reconstruction of the ancestral respiratory system of the ANTS (Anterior Tracheal System) clade on both the phylogenetic hypotheses of Lopardo et al. (2011) and Kulkarni et al. (2021) results in a similar arrangement to that reported as the mysmenid ancestral reconstruction (i.e., anterior tracheae extending into prosoma and connected by a transverse duct; posterior tracheae comprising a single narrow spiracle adjacent to spinnerets and internally a small atrium, median entapophyses and lateral tracheae restricted to the opisthosoma; Figs. 5-7; see above). ...
Spiders are unique in having a dual respiratory system with book lungs and tracheae, and most araneomorph spiders breathe simultaneously via book lungs and tracheae, or tracheae alone. The respiratory organs of spiders are diverse but relatively conserved within families. The small araneoid spiders of the symphytognathoid clade exhibit a remarkably high diversity of respiratory organs and arrangements, unparalleled by any other group of ecribellate orb weavers. In the present study, we explore and review the diversity of symphytognathoid respiratory organs. Using a phylogenetic comparative approach, we reconstruct the evolution of the respiratory system of symphytognathoids based on the most comprehensive phylogenetic frameworks to date. There are no fewer than 22 different respiratory system configurations in symphytognathoids. The phylogenetic reconstructions suggest that the anterior tracheal system evolved from fully developed book lungs and, conversely, reduced book lungs have originated independently at least twice from its homologous tracheal conformation. Our hypothesis suggests that structurally similar book lungs might have originated through different processes of tracheal transformation in different families. In symphytognathoids, the posterior tracheal system has either evolved into a highly branched and complex system or it is completely lost. No evident morphological or behavioral features satisfactorily explain the exceptional variation of the symphytognathoid respiratory organs.
... Oecobiidae (wall spiders) have historically been considered to be closely related to Hersiliidae (long-spinneret spiders) on the basis of both morphological and behavioral characters, such as the capture behavior in which the spider circles rapidly around the prey while placing threads on it, which attaches the prey to the substrate (details in Glatz 1967), and molecular data (Coddington & Levi 1991; Hormiga & Griswold 2014). However, some recent studies have suggested that Oecobiidae is more closely allied to orb-weaver spiders (Garrison et al. 2016; Wheeler et al. 2016; Fernández et al. 2018; Coddington et al. 2019; Kallal et al. 2020; Kulkarni et al. 2021), either maintaining a close relationship with Hersiliidae (Wheeler et al. 2016; Fernández et al. 2018; Kallal et al. 2020; Kulkarni et al. 2021) or placing it closer to the cribellate orb weavers Uloboridae (Garrison et al. 2016), though in this case Hersiliidae was not included in the analysis (Fig. 1). These phylogenetic hypotheses raise questions about the homology of web structures and web building behaviors in these families. ...
... The tent is attached to the substrate at several pillars, forming arched structures between each pair of pillars, which the spiders use to move in and out of the tent. The web of Oecobius annulipes also has long radial threads that extend beyond the carpet, and cribellate silk threads Kallal et al. (2020) and Kulkarni et al. (2021); (B) shows a close phylogenetic relationship between Oecobiidae and Hersiliidae, following Wheeler et al. (2016); (C) shows a close relationship between Oecobiidae and Uloboridae, following Garrison et al. (2016). The asterisk indicates groups with orb webs, and the dashed lines indicate other lineages not included. ...
... In other spheres of evolution we recollect that the Basque and Etruscan systems of speech, which can claim kindred with no existing family of language, are excellent instances of the same phenomenon. Mello-Leitão (1946) and Lehtinen (1986) considered filistatids as the sister group of Pholcidae; Eskov and Zonshtein (1990) placed them next to mygalomorphs; the morphological data from Platnick et al. (1991) recovered them as a sister group to other spiders with simple genitalia; and NGS data recovered a close relationship to Hypochilidae (Fernández et al., 2018; Kulkarni et al., 2020, 2021; Ramírez et al., 2021). For a long time, they were considered closely related to the clade of spiders with simple genitalia that is now called Synspermiata (Michalik and Ramírez, 2014), a position supported by classic characters, such as the fusion of tegulum and subtegulum and a cheliceral lamina (Platnick et al., 1991). ...
... However, filistatids present typical plesiomorphic characters, such as an M-shaped midgut (Griswold et al., 2005), retention of the posterior lungs in the first instars (Ramírez, 2014), coenospermia (several sperm cells in the same capsule; Michalik et al., 2003), ecdysis after sexual maturity in females, copulation and spermatic web similar to that of mygalomorphs (Barrantes and Ramírez, 2013) and the behaviour of combing cribellate silk using leg III as support (Lopardo and Ramírez, 2007). Finally, independent sources of NGS data strongly suggest Filistatidae are closely related to Hypochilidae, another family that has many plesiomorphic characters (Fernández et al., 2018; Kulkarni et al., 2020, 2021; Ramírez et al., 2021). All this suggests an adequate knowledge of the morphology and internal relationships of filistatids could have a great impact on our knowledge of the evolution of early-diverging araneomorphs. ...
Filistatids, the crevice weavers, are an ancient family of cribellate spiders without extant close relatives. As one of the first lineages of araneomorph spiders, they present a complicated mixture of primitive and derived characters that make them a key taxon to elucidate the phylogeny of spiders, as well as the evolution of phenotypic characters in this group. Their moderate diversity (187 species in 19 genera) is distributed mainly in arid and semi‐arid subtropical zones of all continents, except Antarctica. The objective of this paper is to generate a comprehensive phylogenetic hypothesis for this family to advance the understanding of its morphological evolution and biogeography, as well as lay the basis for a natural classification scheme. By studying the morphology using optical and electronic microscopy techniques, we produced a matrix of 302 morphological characters coded for a sample of 103 species of filistatids chosen to represent the phylogenetic diversity of the family. In addition, we included sequences of four molecular markers (COI, 16S, H3 and 28S; 3787 aligned positions) of 70 filistatid species. The analysis of the data (morphological, molecular, and combined) consistently indicates the separation of the Filistatidae into two subfamilies, Prithinae and Filistatinae, in addition to supporting several groups of genera: Filistata, Zaitunia and an undescribed genus from Madagascar; Sahastata and Kukulcania; all Prithinae except Filistatinella and Microfilistata; Antilloides and Filistatoides; a large Old World group including Pritha, Tricalamus, Afrofilistata, Labahitha, Yardiella, Wandella and putative new genera; and a South American group formed by Lihuelistata, Pikelinia and Misionella. Pholcoides is transferred to Filistatinae and Microfilistata is transferred to Prithinae, and each represents the sister group to the remaining genera of its own subfamily. 
Most genera are valid, although Pikelinia is paraphyletic with respect to Misionella, so we consider the two genera as synonyms and propose a few new generic combinations. Considering the new phylogenetic hypothesis, we discuss the evolution of some morphological character systems and the biogeography of the family. The ages of divergence between clades were estimated using a total‐evidence tip‐dating approach by including fossils of Filistatidae and early spider clades; this approach resulted in younger age estimates than those obtained with traditional node‐dating. Filistatidae is an ancient family that started diversifying in the Mesozoic and most genera date to the Cretaceous. Clades displaying transcontinental distributions were most likely affected by continental drift, but at least one clade shows unequivocal signs of transoceanic long‐distance dispersal.
... Each marker was aligned with MAFFT (Katoh et al. 2019) with default parameters. We made a list of topological constraints for clades supported in the most recent phylogenomic studies of spiders (Kulkarni et al. 2021; Maddison et al. 2017; Opatova et al. 2020; Ramírez et al. 2020) (see S5 of the Supplementary material available on Zenodo for details). The sequence data were analyzed with IQ-TREE 2.1.2 ...
... In spiders, a low body mass may also facilitate long-range dispersal via ballooning or thread-based locomotion (bridging) (Corcobado et al. 2010). The smallest known spiders belong to the Symphytognathidae, Mysmenidae, and Anapidae, all of which build aerial webs (orbs or suspended sheets) (Cardoso and Scharff 2009); these three families are members of the "symphytognathoid" clade (Kulkarni et al. 2021; but see also Kallal et al. 2020). It has been suggested that miniaturization is constrained by minimal organ sizes, most prominently the size of the central nervous system and sensory organs (Eberhard 2007; Quesada et al. 2011). ...
A prominent question in animal research is how the evolution of morphology and ecology interact in the generation of phenotypic diversity. Spiders are some of the most abundant arthropod predators in terrestrial ecosystems and exhibit a diversity of foraging styles. It remains unclear how spider body size and proportions relate to foraging style, and if the use of webs as prey capture devices correlates with changes in body characteristics. Here we present the most extensive dataset to date of morphometric and ecological traits in spiders. We used this dataset to estimate the change in spider body sizes and shapes over deep time and to test if and how spider phenotypes are correlated with their behavioural ecology. We found that phylogenetic variation of most traits best fitted an Ornstein-Uhlenbeck model, which is a model of stabilizing selection. A prominent exception was body length, whose evolutionary dynamics were best explained with a Brownian Motion (free trait diffusion) model. This was most expressed in the araneoid clade (ecribellate orb-weaving spiders and allies) that showed bimodal trends towards either miniaturization or gigantism. Only few traits differed significantly between ecological guilds, most prominently leg length and thickness, and although a multivariate framework found general differences in traits among ecological guilds, it was not possible to unequivocally associate a set of morphometric traits with the relative ecological mode. Long, thin legs have often evolved with aerial webs and a hanging (suspended) locomotion style, but this trend is not general. Eye size and fang length did not differ between ecological guilds, rejecting the hypothesis that webs reduce the need for visual cue recognition and prey immobilization. For the inference of the ecology of species with unknown behaviours, we propose not to use morphometric traits, but rather consult (micro-)morphological characters, such as the presence of certain podal structures. 
These results suggest that, in contrast to insects, the evolution of body proportions in spiders is unusually stabilized, and ecological adaptations are dominantly realized by behavioural traits and extended phenotypes in this group of predators. This work demonstrates the power of combining recent advances in phylogenomics with trait-based approaches to better understand global functional diversity patterns through space and time.
... The monotypic genus Hickmania is currently placed in the superfamily Austrochiloidea. This group of spiders forms an early diverging lineage in the evolution of araneomorph spiders (Platnick 1977; Forster et al. 1987; Wheeler et al. 2017; Fernández et al. 2018; Kallal et al. 2021; Kulkarni et al. 2021; Ramírez et al. 2021). Austrochiloids consist of two families, Austrochilidae and Gradungulidae, and are distributed in the southern hemisphere. ...
... In the six-marker Sanger sequencing-based phylogeny of Araneae, Wheeler et al. (2017) recovered Hickmania as a sister group to Archoleptonetidae and this whole clade formed a sister group to Gradungulidae; however, these relationships received low bootstrap support. Various genomic-scale datasets, such as transcriptomes analysed as amino acids (Fernández et al. 2018), and as nucleotides and ultra-conserved elements (UCEs; Kulkarni et al. 2021; Ledford et al. 2021; Ramírez et al. 2021), recovered Hickmania as a sister group to Gradungulidae with strong bootstrap support. ...
Hickmania troglodytes is an emblematic cave spider representing a monotypic cribellate spider genus. This is the only Australian lineage of Austrochilidae while the other members of the family are found in southern South America. In addition to being the largest spider in Tasmania, Hickmania is an oddity in Austrochilidae because this is the only lineage in the family bearing posterior book lungs, tarsal spines and an embolar process on male pedipalps. Six-gene Sanger sequences and genome scale data such as ultraconserved elements (UCEs) and transcriptomes have suggested that Hickmania troglodytes is not nested with the family of current classification, Austrochilidae. We studied the phylogenetic placement of Hickmania troglodytes using an increased taxon sample by combining publicly available UCE and UCEs recovered from transcriptomic data in a parsimony and maximum likelihood framework. Based on our phylogenetic results we formally transfer Hickmania troglodytes from Austrochilidae to the family Gradungulidae. The cladistic placement of Hickmania in the family Gradungulidae fits the geographic distribution of both gradungulids (restricted to Australia and New Zealand) and austrochilids (restricted to southern South America) more appropriately.
... The 13 PCGs were extracted and concatenated using PhyloSuite v1.2.1 (Zhang et al. 2020). The phylogenetic tree (Figure 1) shows that P. insolens is clustered within the RTA clade of the infraorder Araneomorphae, which is consistent with the result of a recent UCE phylogenomic study (Kulkarni et al. 2021). ...
Plator insolens Simon, 1880 belongs to the family Trochanteriidae and is distributed in China. Herein, we report the complete mitochondrial genome of P. insolens reconstructed from Illumina sequencing data, which is the first published mitochondrial genome for the family. The mitogenome is 14,519 bp in length and contains 13 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes. The phylogenetic analysis indicates that P. insolens is clustered within the RTA clade of the infraorder Araneomorphae. This study provides useful genetic information for future studies on the taxonomy, phylogeny and evolution of trochanteriid species.
... Future efforts to integrate morphology into the new phylogeny of Chelicerata may be aided by parametric tests for phylogenetic signal across anatomical character systems, with the goal of quantifying informativeness and assessing noise in anatomical partitions (e.g., Bieler et al. 2014; King 2019). Exploration of signal within both morphological and molecular data sets, in tandem with alternative recoding strategies, may be key to identifying congruence between dissonant data classes (e.g., Kulkarni et al. 2021; Lopardo et al. 2021; Redmond and McLysaght 2021). More generally, a multidimensional, modern view of morphological evolution should emphasize implementation of comparative genetic techniques for testing the shared developmental basis of putative homologies (e.g., Smith et al. 2016; Nakamura et al. 2017; Bruce and Patel 2020; Clark-Hachtel and Tomoyasu 2020), especially as it pertains to body plan diversification and the evolution of anatomical disparity. ...
Deciphering the evolutionary relationships of Chelicerata (arachnids, horseshoe crabs, and allied taxa) has proven notoriously difficult, due to their ancient rapid radiation and the incidence of elevated evolutionary rates in several lineages. While conflicting hypotheses prevail in morphological and molecular datasets alike, the monophyly of Arachnida is nearly universally accepted, despite historical lack of support in molecular datasets. Some phylotranscriptomic analyses have recovered arachnid monophyly, but these did not sample all living orders, whereas analyses including all orders have failed to recover Arachnida. To understand this conflict, we assembled a dataset of 506 high-quality genomes and transcriptomes, sampling all living orders of Chelicerata with high occupancy and rigorous approaches to orthology inference. Our analyses consistently recovered the nested placement of horseshoe crabs within a paraphyletic Arachnida. This result was insensitive to variation in evolutionary rates of genes, complexity of the substitution models, and alternative algorithmic approaches to species tree inference. Investigation of sources of systematic bias showed that genes and sites that recover arachnid monophyly are enriched in noise and exhibit low information content. To test the impact of morphological data, we generated a 514-taxon morphological data matrix of extant and fossil Chelicerata, analyzed in tandem with the molecular matrix. Combined analyses recovered the clade Merostomata (the marine orders Xiphosura, Eurypterida, and Chasmataspidida), but merostomates appeared nested within Arachnida. Our results suggest that morphological convergence resulting from adaptations to life in terrestrial habitats has driven the historical perception of arachnid monophyly, paralleling the history of numerous other invertebrate terrestrial groups.
... Studies have demonstrated that it is possible to retrieve the traditional Sanger sequencing markers (such as Cytochrome C Oxidase I, 28S and others) from genomic libraries obtained with high throughput sequencing methods (Do Amaral et al., 2015; Zarza et al., 2016). Additionally, ultraconserved elements (UCEs) can be obtained from transcriptome libraries, enabling the combination of data produced with the two different methodologies (Bossert et al., 2019; Kulkarni et al., 2021). The possibility of combining these four data types in one analysis seems very compelling, since it could allow the best use of already available data, permitting a more complete sampling and yielding a more comprehensive (and possibly more accurate) view of the evolution of a group of organisms. ...
The importance of morphology in the phylogenomic era has recently gained attention, but relatively few studies have combined both types of information when inferring phylogenetic relationships. Sanger sequencing legacy data can also be important for understanding evolutionary relationships. The possibility of combining genomic, morphological and Sanger data in one analysis seems compelling, permitting a more complete sampling and yielding a comprehensive view of the evolution of a group. Here we used these three data types to elucidate the systematics and evolution of the Dionycha, a highly diverse group of spiders relatively underrepresented in phylogenetic studies. The datasets were analyzed separately and combined under different inference methods, including a novel approach for analyzing morphological matrices with commonly used evolutionary models. We tested alternative hypotheses of relationships and performed simulations to investigate the accuracy of our findings. We provide a comprehensive and thorough phylogenetic hypothesis for Dionycha that can serve as a robust framework to test hypotheses about the evolution of key characters. We also show that morphological data might have a phylogenetic impact, even when massively outweighed by molecular data. Our approach to analyze morphological data may serve as an alternative to the proposed practice of arbitrarily partitioning, weighting, and choosing between parsimony and stochastic models. As a result of our findings, we propose Trachycosmidae new rank for a group of Australian genera formerly included in Trochanteriidae and Gallieniellidae, and consider Ammoxenidae as a junior synonym of Gnaphosidae. We restore the family rank for Prodidomidae, but transfer the subfamily Molycriinae to Gnaphosidae. Drassinella is transferred to Liocranidae, Donuea to Corinnidae, and Mahafalytenus to Viridasiidae.
The arachnid order Schizomida is a relatively understudied group of soil-dwelling predators found on all continents except Antarctica. While efforts to understand their biology are growing, there is still much to know about them. A curious aspect of their morphology is the male flagellum, a sexually dimorphic, tail-like structure which differs in shape across the order and functions in their courtship rituals. The flagellar shape is important for taxonomic classification, yet few efforts have been made to examine shape diversity across the group. Using elliptical Fourier analysis, a type of geometric morphometrics based on shape outline, we quantified shape differences across a combined nearly 550 outlines in the dorsal and lateral views, categorizing them based on genus, family, biogeographic realm, and habitat, with special emphasis on Caribbean and Cuban fauna. We tested for allometric relationships, differences in disparity based on locations and sizes in morphospace among these categories, and for clusters of shapes in morphospace. We found multiple differences in all categories despite apparent overlaps in morphospace, evolutionary allometry, and evidence for discrete clusters in some flagellum shapes. This study can serve as a foundation for further study on the evolution, diversification, and taxonomic utility of the male flagellum.
The tetragnathid genus Leucauge includes some of the most common orb-weaving spiders in the tropics. Although some species in this genus have attained relevance as model systems for several aspects of spider biology, our understanding of the generic diversity and evolutionary relationships among the species is poor. In this study we present the first attempt to determine the phylogenetic structure within Leucauge and the relationship of this genus with other genera of Leucauginae. This is based on DNA sequences from the five loci commonly used and Histone H4, used for the first time in spider phylogenetics. We also assess the informativeness of the standard markers and test for base composition biases in the dataset. Our results suggest that Leucauge is not monophyletic since species of the genera Opas, Opadometa, Mecynometa and Alcimosphenus are included within the current circumscription of the genus. Based on a phylogenetic re-circumscription of the genus to fulfil the requirement for monophyly of taxa, Leucauge White, 1841 is deemed to be a senior synonym of the genera Opas Pickard-Cambridge, 1896 revalidated synonymy, Mecynometa Simon, 1894 revalidated synonymy, Opadometa Archer, 1951 new synonymy and Alcimosphenus Simon, 1895 new synonymy. We identify groups of taxa critical for resolving relationships within Leucauginae and describe the limitations of the standard loci for accomplishing these resolutions.
The common ancestor of spiders likely used silk to line burrows or make simple webs, with specialized spinning organs and aerial webs originating with the evolution of the megadiverse "true spiders" (Araneomorphae). The base of the araneomorph tree also concentrates the greatest number of changes in respiratory structures, a character system whose evolution is still poorly understood, and that might be related to the evolution of silk glands. Emphasizing a dense sampling of multiple araneomorph lineages where tracheal systems likely originated, we gathered genomic-scale data and reconstructed a phylogeny of true spiders. This robust phylogenomic framework was used to conduct maximum likelihood and Bayesian character evolution analyses for respiratory systems, silk glands, and aerial webs, based on a combination of original and published data. Our results indicate that in true spiders, posterior book lungs were transformed into morphologically similar tracheal systems six times independently, after the evolution of novel silk gland systems and the origin of aerial webs. From these comparative data we put forth a novel hypothesis that early-diverging web building spiders were faced with new energetic demands for spinning, which prompted the evolution of similar tracheal systems via convergence; we also propose tests of predictions derived from this hypothesis.
We implement two measures for quantifying genealogical concordance in phylogenomic datasets: the gene concordance factor (gCF) and the novel site concordance factor (sCF). For every branch of a reference tree, gCF is defined as the percentage of "decisive" gene trees containing that branch. This measure is already in wide usage, but here we introduce a package that calculates it while accounting for variable taxon coverage among gene trees. sCF is a new measure defined as the percentage of decisive sites supporting a branch in the reference tree. gCF and sCF complement classical measures of branch support in phylogenetics by providing a full description of underlying disagreement among loci and sites. An easy-to-use implementation and tutorial are freely available in the IQ-TREE software package.
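The gCF definition above can be made concrete with a small Python sketch. It is not the IQ-TREE implementation. It assumes that a reference branch separates clades A and B from clades C and D, that a gene tree is "decisive" when it samples at least one taxon from each of the four clades, and that it is concordant when the branch, restricted to the taxa the gene tree contains, appears among that gene tree's bipartitions. The data structures and names (`make_split`, `gcf`, the toy taxa) are illustrative assumptions.

```python
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

Split = FrozenSet[FrozenSet[str]]            # unordered pair of taxon sets
GeneTree = Tuple[FrozenSet[str], Set[Split]]  # (taxa sampled, internal bipartitions)


def make_split(side1: Iterable[str], side2: Iterable[str]) -> Split:
    """Canonical, order-independent representation of a bipartition."""
    return frozenset({frozenset(side1), frozenset(side2)})


def gcf(
    clades: Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str], FrozenSet[str]],
    gene_trees: List[GeneTree],
) -> Optional[float]:
    """Percentage of decisive gene trees containing the branch AB|CD.

    Returns None when no gene tree is decisive for the branch.
    """
    a, b, c, d = clades
    decisive = 0
    concordant = 0
    for taxa, splits in gene_trees:
        # Decisive: at least one taxon sampled from each of the four clades.
        if not all(taxa & clade for clade in (a, b, c, d)):
            continue
        decisive += 1
        # Restrict the reference branch to the taxa present in this gene tree.
        restricted = make_split((a | b) & taxa, (c | d) & taxa)
        if restricted in splits:
            concordant += 1
    if decisive == 0:
        return None
    return 100.0 * concordant / decisive


# Toy reference branch separating {a1, a2, b1} from {c1, d1, d2}.
ref = (frozenset({"a1", "a2"}), frozenset({"b1"}),
       frozenset({"c1"}), frozenset({"d1", "d2"}))

tree1 = (frozenset({"a1", "b1", "c1", "d1"}),
         {make_split({"a1", "b1"}, {"c1", "d1"})})            # concordant
tree2 = (frozenset({"a1", "a2", "b1", "c1", "d1"}),
         {make_split({"a1", "a2"}, {"b1", "c1", "d1"}),
          make_split({"a1", "a2", "c1"}, {"b1", "d1"})})      # decisive, discordant
tree3 = (frozenset({"a1", "a2", "b1", "c1"}),
         {make_split({"a1", "a2"}, {"b1", "c1"})})            # no clade D taxon: not decisive

value = gcf(ref, [tree1, tree2, tree3])
```

Restricting the reference branch to each gene tree's own taxa is what lets the measure tolerate variable taxon coverage: `tree3` is excluded from the denominator rather than counted as disagreement.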
Targeted enrichment of ultraconserved elements (UCEs) has emerged as a promising tool for inferring evolutionary history in many taxa, with utility ranging from phylogenetic and biogeographic questions at deep time scales to population level studies at shallow time scales. However, the methodology can be daunting for beginners. Our goal is to introduce UCE phylogenomics to a wider audience by summarizing recent advances in arthropod research, and to familiarize readers with background theory and steps involved. We define terminology used in association with the UCE approach, evaluate current laboratory and bioinformatic methods and limitations, and, finally, provide a roadmap of steps in the UCE pipeline to assist phylogeneticists in making informed decisions as they employ this powerful tool. By facilitating increased adoption of UCEs in phylogenomics studies that deepen our comprehension of the function of these markers across widely divergent taxa, we aim to ultimately improve understanding of the arthropod tree of life.
Darwin's bark spider (Caerostris darwini) produces giant orb webs from dragline silk that can be twice as tough as other silks, making it the toughest biological material. This extreme toughness comes from increased extensibility relative to other draglines. We show C. darwini dragline-producing major ampullate (MA) glands highly express a novel silk gene transcript (MaSp4) encoding a protein that diverges markedly from closely related proteins and contains abundant proline, known to confer silk extensibility, in a unique GPGPQ amino acid motif. This suggests C. darwini evolved distinct proteins that may have increased its dragline's toughness, enabling giant webs. Caerostris darwini's MA spinning ducts also appear unusually long, potentially facilitating alignment of silk proteins into extremely tough fibers. Thus, a suite of novel traits from the level of genes to spinning physiology to silk biomechanics are associated with the unique ecology of Darwin's bark spider, presenting innovative designs for engineering biomaterials.
High throughput sequencing and phylogenomic analyses focusing on relationships among spiders have both reinforced and upturned long‐standing hypotheses. Likewise, the evolution of spider webs—perhaps their most emblematic attribute—is being understood in new ways. With a matrix including 272 spider species and close arachnid relatives, we analyze and evaluate the relationships among these lineages using a variety of orthology assessment methods, occupancy thresholds, tree inference methods and support metrics. Our analyses include families not previously sampled in transcriptomic analyses, such as Symphytognathidae, the only araneoid family absent in such prior works. We find support for the major established spider lineages, including Mygalomorphae, Araneomorphae, Synspermiata, Palpimanoidea, Araneoidea and the Retrolateral Tibial Apophysis Clade, as well as the uloborids, deinopids, oecobiids and hersiliids Grade. Resulting trees are evaluated using bootstrapping, Shimodaira–Hasegawa approximate likelihood ratio test, local posterior probabilities and concordance factors. Using structured Markov models to assess the evolution of spider webs while accounting for hierarchically nested traits, we find multiple convergent occurrences of the orb web across the spider tree‐of‐life. Overall, we provide the most comprehensive spider tree‐of‐life to date using transcriptomic data and use new methods to explore controversial issues of web evolution, including the origins and multiple losses of the orb web.
Spiders (Araneae) make up a remarkably diverse lineage of predators that have successfully colonized most terrestrial ecosystems. All spiders produce silk, and many species use it to build capture webs with an extraordinary diversity of forms. Spider diversity is distributed in a highly uneven fashion across lineages. This strong imbalance in species richness has led to several causal hypotheses, such as codiversification with insects, key innovations in silk structure and web architecture, and loss of foraging webs. Recent advances in spider phylogenetics have allowed testing some of these hypotheses, but results are often contradictory, highlighting the need to consider additional drivers of spider diversification. The spatial and historical patterns of diversity and diversification remain contentious. Comparative analyses of spider diversification will advance only if we continue to make progress with studies of species diversity, distribution, and phenotypic traits, together with finer-scale phylogenies and genomic data. Expected final online publication date for the Annual Review of Entomology, Volume 66 is January 11, 2020.
Using FrogCap, a recently-developed sequence-capture protocol, we obtained more than 12,000 highly informative exons, introns, and Ultraconserved elements (UCEs), which we used to illustrate variation in evolutionary histories of these classes of markers, and to resolve long-standing systematic problems in Southeast Asian Golden-backed frogs of the genus-complex Hylarana. We also performed a comprehensive suite of analyses to assess the relative performance of different genetic markers, data filtering strategies, tree inference methods, and different measures of branch support. To reduce gene tree estimation errors, we filtered the data using different thresholds of taxon completeness (missing data) and parsimony informative sites (PIS). We then estimated species trees using concatenated datasets and Maximum Likelihood (IQ-TREE) in addition to summary (ASTRAL-III), distance-based (ASTRID), and site-based (SVDQuartets) multispecies coalescent methods. Topological congruence and branch support were examined using traditional bootstrap, local posterior probabilities, gene concordance factors, quartet frequencies, and quartet scores. Our results showed that separate analyses did not yield a single concordant topology. Instead, introns, exons, and UCEs clearly possessed individual categories of phylogenetic signal, resulting in conflicting, yet strongly-supported phylogenetic estimates. However, a combined analysis comprising the most informative introns, exons, and UCEs converged on a similar topology across all analyses, with the exception of SVDQuartets. Bootstrap values were consistently high despite high levels of incongruence and high proportions of gene trees supporting conflicting topologies. Although low bootstrap values did indicate low heuristic support, high bootstrap support did not necessarily reflect congruence or support for the correct topology. 
This study reiterates findings of some previous studies, which demonstrated that traditional bootstrap values can produce positively misleading measures of support in large phylogenomic datasets. We also showed a remarkably strong positive relationship between branch length and topological congruence across all datasets, implying that very short internodes remain a challenge to resolve, even with orders of magnitude more data than ever before. Overall, our results demonstrate that more data from unfiltered or combined datasets produced demonstrably superior results. Although data filtering reduced gene tree incongruence, decreased amounts of data also biased phylogenetic estimation. A point of diminishing returns was evident, at which higher congruence (from more stringent filtering) at the expense of amount of data led to topological error as assessed by comparison to more complete datasets across different genomic markers. Additionally, we showed that applying a parameter-rich model to a partitioned analysis of concatenated data produces better results compared to unpartitioned, or even partitioned analysis using model selection. Despite some lingering uncertainties, a combined analysis of our genomic data and taxa supplemented from GenBank (on the basis of a few gene regions) sequences revealed highly supported novel systematic arrangements. Based on these new findings, we transfer Amnirana nicobariensis into the genus Indosylvirana; and I. milleti and Hylarana celebensis to the genus Papurana. We also provisionally place H. attigua in the genus Papurana pending verification from positively identified (voucher substantiated) samples.
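The locus-filtering step described in this abstract (thresholds on taxon completeness and on parsimony informative sites) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the default thresholds, and the treatment of `-`, `?`, `N` and `X` as missing data are assumptions made for illustration, and the sketch assumes nucleotide alignments stored as plain dictionaries mapping taxon names to aligned sequences.

```python
from collections import Counter

# Characters treated as missing data (an assumption suited to nucleotide
# alignments; in amino acid data "N" would be asparagine, not missing).
MISSING = set("-?NnXx")


def parsimony_informative_sites(alignment):
    """Count columns in which at least two states each occur in >= 2 taxa."""
    seqs = list(alignment.values())
    if not seqs:
        return 0
    count = 0
    for column in zip(*seqs):
        states = Counter(c.upper() for c in column if c not in MISSING)
        if sum(1 for n in states.values() if n >= 2) >= 2:
            count += 1
    return count


def taxon_completeness(alignment, all_taxa):
    """Fraction of the full taxon set with at least one non-missing character."""
    present = sum(
        1
        for taxon in all_taxa
        if any(c not in MISSING for c in alignment.get(taxon, ""))
    )
    return present / len(all_taxa)


def filter_loci(loci, all_taxa, min_completeness=0.5, min_pis=1):
    """Keep loci that meet both thresholds (hypothetical default values)."""
    return {
        name: aln
        for name, aln in loci.items()
        if taxon_completeness(aln, all_taxa) >= min_completeness
        and parsimony_informative_sites(aln) >= min_pis
    }


# Toy data: locus1 has all four taxa and two informative columns;
# locus2 has only two taxa and no variation.
taxa = ["A", "B", "C", "D"]
loci = {
    "locus1": {"A": "ACGT", "B": "ACGA", "C": "ATGA", "D": "ATGT"},
    "locus2": {"A": "AAAA", "B": "AAAA"},
}
kept = filter_loci(loci, taxa, min_completeness=0.75, min_pis=1)
```

Raising either threshold retains fewer loci, which is the trade-off the abstract reports: stricter filtering lowered gene tree incongruence but eventually biased the estimate as data were lost.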
Resolving deep divergences in the tree of life is challenging even for analyses of genome-scale phylogenetic data sets. Relationships between Basidiomycota subphyla, the rusts and allies (Pucciniomycotina), smuts and allies (Ustilaginomycotina), and mushroom-forming fungi and allies (Agaricomycotina) were found particularly recalcitrant both to traditional multigene and genome-scale phylogenetics. Here, we address basal Basidiomycota relationships using concatenated and gene tree-based analyses of various phylogenomic data sets to examine the contribution of several potential sources of bias. We evaluate the contribution of biological causes (hard polytomy, incomplete lineage sorting) versus unmodeled evolutionary processes and factors that exacerbate their effects (e.g., fast-evolving sites and long-branch taxa) to inferences of basal Basidiomycota relationships. Bayesian Markov Chain Monte Carlo and likelihood mapping analyses reject the hard polytomy with confidence. In concatenated analyses, fast-evolving sites and oversimplified models of amino acid substitution favored the grouping of smuts with mushroom-forming fungi, often leading to maximal bootstrap support in both concatenation and coalescent analyses. On the contrary, the most conserved data subsets grouped rusts and allies with mushroom-forming fungi, although this relationship proved labile, sensitive to model choice, to different data subsets and to missing data. Excluding putative long-branch taxa, genes with high proportions of missing data and/or with strong signal failed to reveal a consistent trend toward one or the other topology, suggesting that additional sources of conflict are at play. 
While concatenated analyses yielded strong but conflicting support, individual gene trees mostly provided poor support for any resolution of rusts, smuts, and mushroom-forming fungi, suggesting that the true Basidiomycota tree might be in a part of tree space that is difficult to access using both concatenation and gene tree-based approaches. Inference-based assessments of absolute model fit strongly reject best-fit models for the vast majority of genes, indicating a poor fit of even the most commonly used models. While this is consistent with previous assessments of site-homogeneous models of amino acid evolution, this does not appear to be the sole source of confounding signal. Our analyses suggest that topologies uniting smuts with mushroom-forming fungi can arise as a result of inappropriate modeling of amino acid sites that might be prone to systematic bias. We speculate that improved models of sequence evolution could shed more light on basal splits in the Basidiomycota, which, for now, remain unresolved despite the use of whole genome data.
Phylogenomic methods have proven useful for resolving deep nodes and recalcitrant groups in the spider tree of life. Across arachnids, transcriptomic approaches may generate thousands of loci, and target‐capture methods, using the previously designed arachnid‐specific probe‐set, can target a maximum of about 1,000 loci. Here, we develop a specialized target‐capture probe set for spiders that contains over 2,000 ultraconserved elements (UCEs) and then demonstrate the utility of this probe set through sequencing and phylogenetic analysis. We designed the “spider‐specific” probe set using three spider genomes (Loxosceles, Parasteatoda and Stegodyphus) and ensured that the newly designed probe set includes UCEs from the previously designed Arachnida probe set. The new “spider‐specific” probes were used to sequence UCE loci in 51 specimens. The remaining samples included five spider genomes and taxa that were enriched using the Arachnida probe set. The “spider‐specific” probes were also used to gather loci from a total of 84 representative taxa across Araneae. On mapping these 84 taxa to the Arachnida probe set, we captured at most 710 UCE loci, while the spider‐specific probe set captured up to 1,547 UCE loci from the same taxon sample. Phylogenetic analyses using Maximum Likelihood and coalescent methods corroborate most nodes resolved by recent transcriptomic analyses, but not all (e.g., UCE data suggest monophyly of “symphytognathoids”). Our preferred analysis, based on topology tests, suggests monophyly of the “symphytognathoids” (the miniature orb‐weavers), which in previous studies has only been supported by a combination of morphological and behavioral characters.
The monotypic family Trogloraptoridae was only recently described from caves and old‐growth forest of Oregon and California (Western USA). These enigmatic spiders are characterized by striking raptorial claws, and based on their spinneret morphology, a close relationship to dysderoid spiders, a large clade within Synspermiata, was suggested. Here, we used a phylogenomic framework using transcriptomes to test the phylogenetic position of Trogloraptor marchingtoni. Our analysis placed this taxon within Synspermiata, which is supported by the presence of synspermia. Furthermore, a sister group relationship with Dysderoidea is strongly supported. In a second step, we reinvestigated the female genitalia using a non‐destructive approach. Our data revealed that Trogloraptor has a flow‐through genital system (entelegyne condition) and is not haplogyne as previously described based on dissections. The Trogloraptor female genital system consists of paired large spermathecae, which connect by a fertilization duct to a wide bursa. The copulatory duct arises from the sclerotized anterior margin of the bursa, and its organization is likely related to the organization of the male intromittent organ. Based on our phylogenetic data, we show that the entelegyne condition evolved at least six times independently within spiders. Moreover, our results indicate that the peculiar organization of the dysderoid female genitalia with an additional posterior sperm storage site is a synapomorphy of this Synspermiata clade.