
Machine learning approaches to assess microendemicity and conservation risk in cave-dwelling arachnofauna

Authors:

Abstract

The biota of cave habitats faces heightened conservation risks, due to geographic isolation and high levels of endemism. Molecular datasets, in tandem with ecological surveys, have the potential to delimit precisely the nature of cave endemism and identify conservation priorities for microendemic species. Here, we sequenced ultraconserved elements of Tegenaria collected within, and at the entrances of, 25 cave sites to test phylogenetic relationships, and applied an unsupervised machine learning approach to delimit species. Our data identified clear species limits, as well as the incidence of previously unidentified, potential cryptic species. We employed Categorical Analysis of Neo- and Paleo-Endemism (CANAPE), as implemented in the R package canaper, to generate conservation metrics that are informative for future policy, in tandem with conservation assessments for the troglobitic Israeli species of this genus.

bioRxiv preprint doi: https://doi.org/10.1101/2023.12.19.572471; this version posted December 20, 2023. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Introduction

Cave-dwelling taxa are at heightened risk of extinction due to the limited ranges imposed by a single cave system or, in extreme cases, a single cave. These taxa, sometimes referred to as short-range endemics or microendemics, face an outsized threat from habitat disturbance and climate change (Harvey et al., 2011; Mammola et al., 2018). With limited individuals to sample, it is a challenge both to delimit endemic cave species and to develop management strategies for endangered taxa (Paquin & Hedin, 2004). Broadly, cave ecosystems share core abiotic features, such as reduction or complete absence of light, high relative humidity, and buffered temperature ranges compared to their surrounding terrestrial surface climates (Barr & Holsinger, 1985). The existence and maintenance of biodiversity in cave habitats is predicated on the ability of biota to adapt to such conditions. Because of this, unique phenotypic changes can be observed in cave-dwelling organisms across the animal tree of life. These changes comprise both reductive features (e.g., atrophy of structures not required for subterranean life) and constructive adaptations (e.g., compensatory gains in tactile appendages or olfactory capacity; Re et al., 2018; Riddle et al., 2018). One of the more conspicuous examples of this phenomenon is the partial or complete loss of eyes in cave-dwelling species. The Mexican cavefish (Astyanax mexicanus) is a well-studied exemplar of eye loss in cave-dwelling species. The blind morph of A. mexicanus is thought to have evolved as recently as 20,000 years ago, exemplifying phenotypic change over rapid timescales and without the requirement of reproductive isolation (Fumey et al., 2018). Rapid evolution of disparate phenotypes allows for the study of how speciation begins in cave populations versus surface populations. Over time, speciation may establish troglobites, or species that are obligate cave-dwellers; these can occur in proximity to epigean (surface) counterparts or to troglophiles, species that are facultative cave-dwellers (Howarth, 1983). Because of this, inferring the phylogenetic relationships between closely related troglobites and troglophiles presents a unique challenge.

The Levant, in particular, harbors a high diversity of cave fauna in a small geographic area (Gavish-Regev et al., 2021; Peel et al., 2007), due to the incidence of 3 distinct climate zones (Mediterranean, semi-arid, and desert; sensu the Köppen-Geiger climate classification system) at the margin of 3 continents. Climatic shifts during the Pleistocene saw the Southern Levant act as an unglaciated refuge for species, with animals colonizing the region from surrounding glaciated areas (Tchernov & Belmaker, 2004). With an abundance of geologically diverse caves, Israel is an ideal locale for the study of phenotypic evolution and speciation of endemic groups in cave ecosystems. Recent surveys of cave sites around Israel have yielded the discovery of several microendemic spider species with troglobite and troglophile representatives (Gavish-Regev et al., 2021). These spiders exhibit a spectrum of eye loss ranging from “normal,” fully developed eyes to complete absence of eyes.

In a recent work, 7 new species of troglobitic Tegenaria were described using traditional morphological approaches, with barcoding (COI sequencing) and ddRAD sequencing reinforcing interpretations of species boundaries (Aharon et al., 2023). The Israeli troglobitic species were recovered as a clade distinct from local troglophilic species. These data suggested that the current cave species are relicts that descended from a single, surface-dwelling ancestor. While these data supported species recognition and a hypothesis for the evolutionary history of this group, interspecies relationships conflicted between the 2 sequencing methods, possibly due to high locus drop-out rates in RAD sequencing across previously unrecognized species boundaries. The resulting data matrices thus bore few sites that were present across all taxa, a known driver of phylogenetic uncertainty (Roure et al., 2013).

Here, based on a target-capture sequencing approach that leverages ultraconserved elements, we inferred a phylogeny to resolve these relationships, applied an unsupervised machine learning approach to delimit species, and assessed the conservation priority of Israeli Tegenaria. Considering the ensuing inferences of species boundaries, we generated metrics of phylogenetic diversity to identify areas of unexpectedly high endemism across Israeli cave sites that warrant conservation priority. We then classified the conservation status of each troglobitic species according to the IUCN Red List Categories and Criteria workflow.

Materials and Methods

Species sampling and sequencing

We sequenced samples from 161 Tegenaria specimens freshly collected during field campaigns across 25 caves of Israel (2018-2020). Outgroup taxa consisted of representatives of the confamilial genera Agelena (N = 4) and Lycosoides (N = 4). DNA was extracted from specimens using a Qiagen DNeasy Blood & Tissue Kit and eluted in 10 mM Tris-HCl, with additional bead-based purification of previously generated EDTA-preserved extractions (from Aharon et al., 2023). DNA was quantified using a Qubit 3 fluorometer with a High Sensitivity dsDNA Assay Kit. Enzymatic fragmentation, end repair/A-tailing, adapter ligation, and library amplification were performed with a KAPA HyperPlus Kit and dual index primers (i5 and i7) for multiplex sequencing. Additional details of the library preparation procedures are provided in S3.

Samples were pooled in groups of 8, each with 125 ng of DNA, for subsequent target capture, following the myBaits protocol v5.02 (Arbor Biosciences) for targeted enrichment using a spider-specific probe set (Kulkarni et al., 2020). Pools were sequenced at the University of Wisconsin Biotechnology Center on an Illumina NovaSeq 6000 2 × 150 bp S1 flow cell. Cleaning and trimming of raw read data were performed using illumiprocessor v2.0, followed by assembly of libraries using ABySS 2.0. We employed the software package PHYLUCE v1.6 for subsequent data processing and analysis (Faircloth, 2015). Individual scripts used through PHYLUCE are indicated in S3.

Phylogenomic analysis

To assess sensitivity to data completeness, we applied successive gene occupancy thresholds of 50% (777 loci; 161 taxa), 90% (227 loci; 161 taxa), and 95% (88 loci; 161 taxa). Concatenation-based maximum likelihood analyses were performed using IQ-TREE v. 2 (Nguyen et al., 2014). Automated model selection was performed using ModelFinderPlus (Hoang et al., 2018; Kalyaanamoorthy et al., 2017). Nodal support was estimated via 1500 ultrafast bootstrap replicates and 1500 replicates of the SH-like approximate likelihood ratio test (Guindon et al., 2010; Hoang et al., 2018). For phylogenetic analyses using multispecies coalescent methods, species trees were estimated with ASTRAL v. 3, using individual gene trees as inputs, with gene tree topologies estimated using IQ-TREE v. 2 under automated model selection.
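
The occupancy thresholds described above can be illustrated with a minimal Python sketch. It assumes that each locus is summarized simply as the set of taxa in which it was recovered; the function name `filter_by_occupancy` and the toy data are hypothetical and are not part of PHYLUCE, which handles this step in the actual workflow.

```python
def filter_by_occupancy(locus_taxa, n_taxa, threshold):
    """Return the loci recovered in at least `threshold` (0-1) of `n_taxa` taxa.

    `locus_taxa` maps a locus name to the set of taxa in which it was recovered.
    This is an illustrative helper, not a PHYLUCE function.
    """
    kept = []
    for locus, taxa in sorted(locus_taxa.items()):
        occupancy = len(taxa) / n_taxa
        if occupancy >= threshold:
            kept.append(locus)
    return kept


# Toy example (hypothetical data): 4 taxa, 3 loci.
toy_loci = {
    "uce-1": {"t1", "t2", "t3", "t4"},  # recovered in all 4 taxa
    "uce-2": {"t1", "t2", "t3"},        # recovered in 3 of 4 taxa
    "uce-3": {"t1", "t2"},              # recovered in 2 of 4 taxa
}

for threshold in (0.50, 0.90, 0.95):
    print(threshold, filter_by_occupancy(toy_loci, n_taxa=4, threshold=threshold))
```

Raising the threshold trades the number of loci for matrix completeness, which is the pattern seen in the locus counts reported for the 50%, 90%, and 95% matrices.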

To test the robustness of the ensuing tree topologies, we additionally assessed phylogenetic signal across UCE loci using genesortR. This method, which implements a principal components-based approach to quantifying phylogenetic informativeness, requires a resolved species tree a priori for the computation of Robinson-Foulds distances for each gene tree. To limit the influence of unresolved parts of the species tree on the ranking of phylogenetically useful genes, we collapsed 2 nodes in the species tree that were not resolved with maximal nodal support. After sorting loci by phylogenetic signal, we inferred the maximum likelihood tree topology from a concatenated matrix comprising the 100 highest-ranked (most informative) loci.

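As a rough illustration of the Robinson-Foulds comparison mentioned above, the following Python sketch computes the distance between two rooted trees that are represented, by assumption, as sets of their non-trivial clades. The tree encoding, the function name `rf_distance`, and the toy trees are hypothetical; genesortR performs its own computation in R, and Robinson-Foulds similarity is only one of the inputs to its principal components analysis.

```python
def rf_distance(clades_a, clades_b):
    """Robinson-Foulds distance between two rooted trees.

    Each tree is given as a collection of clades (frozensets of tip names).
    The distance is the number of clades present in exactly one of the trees.
    """
    return len(set(clades_a) ^ set(clades_b))


# Hypothetical five-taxon example.
# Species tree: (((A,B),C),(D,E))   Gene tree: (((A,B),D),(C,E))
species_tree = {frozenset("AB"), frozenset("ABC"), frozenset("DE")}
gene_tree = {frozenset("AB"), frozenset("ABD"), frozenset("CE")}

print(rf_distance(species_tree, gene_tree))

# Collapsing a poorly supported node removes its clade from the reference tree,
# so gene trees are no longer scored against that clade.
collapsed_species_tree = species_tree - {frozenset("ABC")}
print(rf_distance(collapsed_species_tree, gene_tree))
```

In this encoding, collapsing a node in the species tree simply means dropping the corresponding clade from the set, which mirrors the rationale for collapsing the 2 weakly supported nodes before ranking loci.
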
Machine learning-based validation of species boundaries

To independently validate the species limits suggested by our phylogeny, we employed a variational autoencoder (VAE), an unsupervised machine learning technique that was previously demonstrated to be informative for species delimitation in a cryptic genus of harvestmen (Derkarabetian et al., 2019). To implement VAE, we first called single nucleotide polymorphisms (SNPs) from our UCE data using the Genome Analysis Toolkit (McKenna et al., 2010). These data were then encoded in the one-hot format, which represents categorical data as binary vectors. In the case of sequence data, the vector [1, 0, 0, 0] represented an A, [0, 1, 0, 0] a C, [0, 0, 1, 0] a G, and [0, 0, 0, 1] a T. Importantly, sites where data were missing were encoded as [0, 0, 0, 0], which prevented the downstream model from erroneously grouping terminals solely on the basis of shared missing data. Following the encoding step, VAE was implemented using the TensorFlow (Abadi et al., 2016) and Keras Python libraries. A script by Derkarabetian et al. (2019) was used to build the model and plot the results.

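The encoding scheme described above can be sketched in a few lines of plain Python. This is a simplified illustration and not the script of Derkarabetian et al. (2019): the function names are hypothetical, and any character other than A, C, G, or T (including ambiguity codes for heterozygous sites) is treated as missing here, which is an assumption of this sketch.

```python
# One-hot codes for the four nucleotides, following the scheme in the text.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}
# Missing data carry no signal in any of the four positions.
MISSING = [0, 0, 0, 0]


def one_hot_encode(sequence):
    """Encode a string of SNP calls as a flat list of 0/1 integers."""
    encoded = []
    for base in sequence.upper():
        # Assumption of this sketch: anything that is not A/C/G/T is missing.
        encoded.extend(ONE_HOT.get(base, MISSING))
    return encoded


def shared_signal(vec_a, vec_b):
    """Count positions where both vectors carry a 1 (same observed base)."""
    return sum(a * b for a, b in zip(vec_a, vec_b))


sample_1 = one_hot_encode("ACNT")
sample_2 = one_hot_encode("ANNT")
print(sample_1)
print(shared_signal(sample_1, sample_2))
```

Because a missing site is all zeros, two samples that share only missing data have no overlapping 1s, which is the property the text relies on to avoid clustering terminals by shared missingness.
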
CANAPE

To assess the conservation priority of Israeli Tegenaria, we conducted Categorical Analysis of Neo- and Paleo-Endemism (CANAPE) on our full dataset (Mishler et al., 2014). We used canaper, a package that runs CANAPE in its entirety in R (Nitta et al., 2023). As inputs for canaper, we used our 50% taxon occupancy phylogeny and a data frame consisting of a column with each terminal name and 2 columns with the latitude and longitude at which the corresponding samples were collected. To be read by canaper, the locality data frame had to be converted to a community matrix, which was further converted to a tibble. The first step of the CANAPE analysis was to run cpr_rand_test, which generates a set of random communities and a series of metrics against which the input data are compared. For our purposes, we opted for 500 random communities and used the null model “curveball” for randomization (Strona et al., 2014). The next step was to run the cpr_classify_endem function to classify each of our communities as paleoendemic, neoendemic, or mixed (both). We then visualized these data as a map of our study site with color-coded grid cells corresponding to the endemism type.

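The conversion of a locality table into a community matrix, described above, can be sketched as follows. The analysis itself was run in R with canaper; this Python version only illustrates the data structure (grid cells as rows, terminals as columns, presence/absence as entries). The function name, the 0.1-degree cell size, and the coordinates are hypothetical choices made for the example.

```python
import math


def community_matrix(records, cell_size_deg=0.1):
    """Build a presence/absence matrix (grid cells x terminals).

    `records` is a list of (terminal_name, latitude, longitude) tuples.
    Returns (cells, terminals, matrix), where matrix[i][j] is 1 if
    terminal j was collected in grid cell i and 0 otherwise.
    """
    occupied = {}
    for name, lat, lon in records:
        # Assign each record to a square cell by flooring its coordinates.
        cell = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        occupied.setdefault(cell, set()).add(name)

    cells = sorted(occupied)
    terminals = sorted({name for name, _, _ in records})
    matrix = [[1 if t in occupied[c] else 0 for t in terminals] for c in cells]
    return cells, terminals, matrix


# Hypothetical records: coordinates are illustrative, not real localities.
toy_records = [
    ("sp_A_01", 31.72, 35.02),
    ("sp_B_01", 31.73, 35.03),  # falls in the same cell as sp_A_01
    ("sp_A_02", 32.95, 35.31),
]
cells, terminals, matrix = community_matrix(toy_records)
for cell, row in zip(cells, matrix):
    print(cell, dict(zip(terminals, row)))
```

Each row of the resulting matrix corresponds to one grid cell, which is the unit that the randomization and classification steps then operate on.
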
Conservation assessments and IUCN designation

To perform conservation assessments of troglobitic Israeli Tegenaria, we used data from the Israel National Arachnid Collection, where Tegenaria specimens from our cave surveys were deposited, and from our previously published studies of this system (Aharon et al., 2023; Gavish-Regev et al., 2021). Distribution records of all Tegenaria species were extracted from the original descriptions (Aharon et al., 2023), as well as from our ecological surveys. To generate measurements of Extent of Occurrence (EOO) and Area of Occupancy (AOO), we performed spatial analysis using the R package red (Cardoso, 2017).

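For readers unfamiliar with the two range metrics, the sketch below shows one way to compute them in Python, following the standard IUCN definitions as we understand them: EOO as the area of the minimum convex polygon enclosing all records, and AOO as the summed area of occupied 2 × 2 km grid cells. It is not the implementation used in the R package red. The function names are hypothetical, and the input is assumed to be planar coordinates already projected to kilometres.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def eoo_km2(points):
    """Extent of occurrence: area of the minimum convex polygon (shoelace formula)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0  # one or two (or collinear) localities enclose no area
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def aoo_km2(points, cell_km=2.0):
    """Area of occupancy: number of occupied grid cells times the area of one cell."""
    cells = {(math.floor(x / cell_km), math.floor(y / cell_km)) for x, y in points}
    return len(cells) * cell_km ** 2


# Hypothetical planar coordinates in km (not real localities).
toy_sites = [(0.5, 0.5), (3.5, 0.5), (3.5, 4.5), (0.5, 4.5), (2.0, 2.0)]
print(eoo_km2(toy_sites))
print(aoo_km2(toy_sites))
```

For a species known from a single cave, this sketch returns an EOO of zero and an AOO of one grid cell (4 km²), which illustrates why single-site troglobites fall at the extreme low end of both metrics.
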
Results

Relationships of Tegenaria inferred from UCE-based datasets

Maximum likelihood analyses recovered 2 robustly supported clades of Israeli Tegenaria, corresponding to a clade of eye-bearing epigean species and a clade of species with eye reduction or complete loss (Fig. 2). Relationships between species were robustly supported across analyses (bootstrap resampling frequency [BR] >99%), barring the placement of Tegenaria trogalil (BR = 62%). Whereas the epigean clade exhibited deep genetic distances between species (i.e., long patristic distances between clades), the base of the troglobitic clade exhibited short branch lengths, potentially corresponding to a rapid radiation. Within the troglobitic clade, species inhabiting caves in the north of Israel formed a grade with respect to a nested clade of southern species (BR = 99.9%).

VAE-based inference of species clusters

We used VAE to delimit species within Israeli Tegenaria, with an emphasis on delimiting the seven recently described troglobitic species. For computational efficiency, species delimitation was performed separately for the epigean clade and for the troglobitic clade. Broadly, VAE inferred population clusters corresponding to species inferred by the phylogeny (Fig. 3). In the epigean clade, clusters were well-defined and exhibited no overlap in standard deviation. The T. pagana cluster appeared discontinuous but distinct from the rest of the clusters.

In the troglobitic clade, species limits were again well-defined, with the notable exception of T. trogalil, which resolved as three separate clusters. Two of these corresponded to the East and West Galilee caves, respectively, and the third comprised specimens from both regions. There was also a tendency for VAE to group some of the reduced-eye species from more isolated caves (e.g., T. ornit, T. naasane) into tighter clusters (i.e., low variance).

Identification of biodiversity hotspots across Levantine cave sites

Five of the 25 caves surveyed each harbored a single species found in no other cave. Out of 17 10 km² grid cells in our study area, 4 were classified as sites that exhibit paleo-endemism and one was classified as mixed (paleo- and neo-endemism). The species represented in the paleo-endemic sites are T. pagana, T. trogalil (West caves), T. angustipalpis, and T. sp. 1 (Fig. 4). One such species, T. trogalil from Namer cave, is the sole representative of the troglobitic clade that occurs in a site of paleo-endemism. The single site of mixed endemism, Te’omim Cave, is home to T. pagana (fully developed eyes) and T. yaaranford (reduced eyes) (Fig. 4).

Conservation assessment using IUCN workflow

Upon evaluating the conservation status of the 7 troglobitic Tegenaria species found in Israel, we classified five as Critically Endangered (CR) due to extremely small EOO and AOO, as well as isolation and very low population size (<50 mature individuals) observed in some of the caves. One troglobitic species that is found in several (likely connected) neighboring caves was classified as Endangered (EN), due to threats of development that have the potential to destroy its habitat. We classified only one species as Vulnerable (VU), due to projected declines in habitat extent and quality and the low number of mature individuals that we observed across six sites.

Discussion

As global biodiversity continues to diminish due to rapidly changing climatic conditions and human factors, the conservation of species has never been more important. Currently, 41,000 species are threatened with extinction (International Union for Conservation of Nature, 2022). If biodiversity loss persists at an increasing rate, the stability of ecosystems is in jeopardy. Notably, conservation status statistics published by the International Union for Conservation of Nature (IUCN) only reflect species that have been assessed and therefore may not serve as an accurate representation of the conservation status of many small-bodied and enigmatic taxa. For instance, the class Arachnida is proportionately the least represented animal taxon that is currently assessed by the IUCN, with only 0.4% of described species assessed (IUCN Red List version 2022-2: Table 1a). Therefore, it is imperative that specific efforts be made to assess the conservation status and priority of underrepresented groups such as arachnids.

Assessing the conservation status of spiders is difficult due to insufficient taxonomic and biological information on many lineages, as well as difficulties in evaluating population size. Few long-term monitoring schemes are in place, and data on species occurrences are often anecdotal or derive from a single sampling event in a specific locality. Having studied and monitored spiders in caves in Israel for over a decade, we have gained detailed knowledge of their distribution. The result that five out of seven new species described are classified as CR is unsurprising, given their small population sizes and limited ranges.

Short-range endemic (i.e., microendemic) species are highly vulnerable to environmental degradation and anthropogenic disturbance. Effective recognition and conservation assessment of short-range endemics requires a combination of taxonomy and genomic datasets for phylogenetically informed assessment of biodiversity metrics (Agnarsson & Kuntner, 2007; Harvey et al., 2011; Mishler et al., 2014). Here, we deployed unsupervised machine learning to identify species limits, prior to assessing conservation priority for several microendemic spiders of Levantine cave habitats. Our results suggest that troglobitic Israeli Tegenaria are both genetically isolated from each other and distantly related to their surface-dwelling counterparts. Contrary to established models of cave adaptation (i.e., repeated colonization of caves by epigean taxa), the Israeli Tegenaria comprise a unique evolutionary case wherein troglobitic species likely resulted from a rapid radiation of a single ancestral species that had a proclivity for cave-blindness, with possible extirpation of their epigean ancestor. The epigean species found in Israel today are more closely related to Mediterranean Tegenaria, suggesting a second wave of colonization of shallower parts of caves in Israel (Aharon et al., 2023).

The clustering proposed by VAE corroborated species limits inferred by our phylogenetic analyses. However, it also detected clusters not recovered as species in our phylogeny, suggesting that there may be as yet undescribed species within both the cave-dwelling and the surface-dwelling clades. In the case of the troglobitic clade, we observed a new cluster within T. trogalil comprising exemplars of both the eastern and western caves, whereas standard phylogenetic analyses tended to recover the eastern and western caves as separate and monophyletic groups. The detection of this mixed cluster within T. trogalil could represent complex dynamics stemming from introgression or incomplete lineage sorting. Within the eye-bearing clade, VAE detected a possible new cluster within T. pagana, which corresponds to specimens from five localities in the north of Israel. This lineage was recovered as the sister group to the remaining T. pagana and may constitute a new cryptic species.

Taken together, the high regional fidelity and deep genetic distances between species suggest that Israeli caves serve as both “museums” of ancient diversity and “cradles” of recent diversification for Levantine arachnofauna. Single cave sites can be host to faunal antiquity from 2 separate sources (epigean and troglobitic), resulting in high metrics for paleoendemism. However, biodiversity metrics like those implemented for CANAPE are grounded in the expectation that species in the occurrence matrix are neither too rare nor too widespread. The value of microendemic species is difficult to capture using algorithmic approaches to biodiversity assessment, because sites with a single microendemic species that harbors low genetic diversity (e.g., T. ornit) are not highly ranked for metrics like phylogenetic diversity or phylogenetic endemism. Moreover, sites that harbor a microendemic of one taxon may harbor other such endemics that are outside of the focal study system. As an example, Ornit cave, which has one of the smallest dark chambers of all the caves surveyed, harbors the entirely blind species T. ornit, as well as an undescribed species of a palpigrade (microwhip scorpion) and a pseudoscorpion. The use of analyses like CANAPE alone to assess conservation priorities thus comes with a tradeoff.

Our results substantiate the need for regulations surrounding access to, and conduct within, caves in Israel by tourists and visitors. Moreover, we suggest that abiotic conditions as well as populations in those caves be monitored, to assess any changes in population size. It is imperative that the public be made aware of the extreme rarity and uniqueness of the species within the Israeli Tegenaria system. We recommend that caves containing type-locality endemics be identified with signage indicating to visitors that, if they intend to enter the cave, they must use extreme care when maneuvering about the cave so as not to disturb any spiders, webs, or egg sacs they may encounter. Only some of those type-locality caves are protected by Israeli law, as they are located in nature reserves (e.g., Ornit, Te’omim, Sarah and Namer caves) or national parks (Soreq cave), while others are not protected (e.g., En Sarig spring tunnel, Yir'on cave). But even the protected caves host many visitors who affect abiotic conditions, or even actively alter microhabitats by lighting fires inside the caves. Broadly, all these caves should be identified by appropriate government agencies as sites of natural historical significance and receive protections commensurate with this status, including a ban on the use of fire inside the caves and closure of the deep chambers to visitors. With 5 out of the 7 new species classified as CR under IUCN guidelines, it is imperative that these regulations are put in place with deliberate haste.

References
321
322
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G.,
323
Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B.,
324
Tucker, P., Vasudevan, V., Warden, P., Zheng, X. (2016).
TensorFlow: A system for
325
large-scale machine learning
.
326
Agnarsson, I., & Kuntner, M. (2007). Taxonomy in a Changing World: Seeking Solutions for
327
a Science in Crisis.
Systematic Biology
,
56
(3), 531–539.
328
https://doi.org/10.1080/10635150701424546
329
Aharon, S., Ballesteros, J. A., Gainett, G., Hawlena, D., Sharma, P. P., & Gavish-Regev, E.
330
(2023). In the land of the blind: Exceptional subterranean speciation of cryptic
331
troglobitic spiders of the genus Tegenaria (Araneae: Agelenidae) in Israel.
Molecular
332
Phylogenetics and Evolution
, 107705.
333
https://doi.org/10.1016/j.ympev.2023.107705
334
Barr, T. C., & Holsinger, J. R. (1985). Speciation in Cave Faunas.
Annual Review of Ecology
335
and Systematics
,
16
(1), 313–337.
336
https://doi.org/10.1146/annurev.es.16.110185.001525
337
Derkarabetian, S., Castillo, S., Koo, P. K., Ovchinnikov, S., & Hedin, M. (2019). A
338
demonstration of unsupervised machine learning in species delimitation.
Molecular
339
Phylogenetics and Evolution
,
139
, 106562.
340
https://doi.org/10.1016/j.ympev.2019.106562
341
Faircloth, B. C. (2015).
PHYLUCE is a software package for the analysis of conserved genomic
342
loci
.
343
.CC-BY-NC-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprintthis version posted December 20, 2023. ; https://doi.org/10.1101/2023.12.19.572471doi: bioRxiv preprint
Fumey, J., Hinaux, H., Noirot, C., Thermes, C., Rétaux, S., & Casane, D. (2018). Evidence for
344
late Pleistocene origin of Astyanax mexicanus cavefish.
BMC Evolutionary Biology
,
345
18
(1), 43. https://doi.org/10.1186/s12862-018-1156-7
346
Gavish-Regev, E., Aharon, S., Armiach Steinpress, I., Seifan, M., & Lubin, Y. (2021). A Primer on Spider Assemblages in Levantine Caves: The Neglected Subterranean Habitats of the Levant—A Biodiversity Mine. Diversity, 13(5), Article 5. https://doi.org/10.3390/d13050179
Guindon, S., Dufayard, J.-F., Lefort, V., Anisimova, M., Hordijk, W., & Gascuel, O. (2010). New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology, 59(3), 307–321. https://doi.org/10.1093/sysbio/syq010
Harvey, M. S., Rix, M. G., Framenau, V. W., Hamilton, Z. R., Johnson, M. S., Teale, R. J., Humphreys, G., & Humphreys, W. F. (2011). Protecting the innocent: Studying short-range endemic taxa enhances conservation outcomes. Invertebrate Systematics, 25(1), 1. https://doi.org/10.1071/IS11011
Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q., & Vinh, L. S. (2018). UFBoot2: Improving the Ultrafast Bootstrap Approximation. Molecular Biology and Evolution, 35(2), 518–522. https://doi.org/10.1093/molbev/msx281
Howarth, F. G. (1983). Ecology of Cave Arthropods. Annual Review of Entomology, 28(1), 365–389. https://doi.org/10.1146/annurev.en.28.010183.002053
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., Haeseler, A. von, & Jermiin, L. S. (2017). ModelFinder: Fast model selection for accurate phylogenetic estimates. Nature Methods, 14(6), 587–589. https://doi.org/10.1038/nmeth.4285
Kulkarni, S., Wood, H., Lloyd, M., & Hormiga, G. (2020). Spider-specific probe set for ultraconserved elements offers new perspectives on the evolutionary history of spiders (Arachnida, Araneae). Molecular Ecology Resources, 20(1), 185–203. https://doi.org/10.1111/1755-0998.13099
Mammola, S., Goodacre, S. L., & Isaia, M. (2018). Climate change may drive cave spiders to extinction. Ecography, 41(1), 233–243. https://doi.org/10.1111/ecog.02902
McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., Garimella, K., Altshuler, D., Gabriel, S., Daly, M., & DePristo, M. A. (2010). The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20(9), 1297–1303. https://doi.org/10.1101/gr.107524.110
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo- and paleo-endemism in Australian Acacia. Nature Communications, 5(1), Article 1. https://doi.org/10.1038/ncomms5473
Nguyen, L.-T., Schmidt, H. A., Haeseler, A. von, & Minh, B. Q. (2014). IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Molecular Biology and Evolution, 32(1), 268–274. https://doi.org/10.1093/molbev/msu300
Nitta, J. H., Laffan, S. W., Mishler, B. D., & Iwasaki, W. (2023). canaper: Categorical analysis of neo- and paleo-endemism in R. Ecography, 2023(9), e06638. https://doi.org/10.1111/ecog.06638
Paquin, P., & Hedin, M. (2004). The power and perils of ‘molecular taxonomy’: A case study of eyeless and endangered Cicurina (Araneae: Dictynidae) from Texas caves. Molecular Ecology, 13(10), 3239–3255. https://doi.org/10.1111/j.1365-294X.2004.02296.x
Peel, M. C., Finlayson, B. L., & McMahon, T. A. (2007). Updated world map of the Köppen-Geiger climate classification. Hydrology and Earth System Sciences, 11(5), 1633–1644. https://doi.org/10.5194/hess-11-1633-2007
Re, C., Fišer, Ž., Perez, J., Tacdol, A., Trontelj, P., & Protas, M. E. (2018). Common Genetic Basis of Eye and Pigment Loss in Two Distinct Cave Populations of the Isopod Crustacean Asellus aquaticus. Integrative and Comparative Biology, 58(3), 421–430. https://doi.org/10.1093/icb/icy028
Riddle, M. R., Aspiras, A. C., Gaudenz, K., Peuß, R., Sung, J. Y., Martineau, B., Peavey, M., Box, A. C., Tabin, J. A., McGaugh, S., Borowsky, R., Tabin, C. J., & Rohner, N. (2018). Insulin resistance in cavefish as an adaptation to a nutrient-limited environment. Nature, 555(7698), 647–651. https://doi.org/10.1038/nature26136
Roure, B., Baurain, D., & Philippe, H. (2013). Impact of Missing Data on Phylogenies Inferred from Empirical Phylogenomic Data Sets. Molecular Biology and Evolution, 30(1), 197–214. https://doi.org/10.1093/molbev/mss208
Strona, G., Nappo, D., Boccacci, F., Fattorini, S., & San-Miguel-Ayanz, J. (2014). A fast and unbiased procedure to randomize ecological binary matrices with fixed row and column totals. Nature Communications, 5(1), Article 1. https://doi.org/10.1038/ncomms5114
Tchernov, E., & Belmaker, M. (2004). The biogeographic history of the fauna in the Southern Levant and the correlation with climatic change (pp. 21–25).
Figures:
Figure 1. Live habitus of Israeli Tegenaria. A. Eyeless T. ornit from the dark zone of Ornit Cave. B. Eyeless T. naasane from 'Arak Na'asane Cave. C. T. yaaranford with reduced eyes from the dark zone of Te’omim Cave. D. The epigean T. pagana with fully developed eyes at the entrance of Te’omim Cave. All pictures by S. Aharon.
Figure 2. Phylogeny and inferred species groups of Israeli Tegenaria based on UCEs and an unsupervised machine learning approach (VAE). A. Maximum likelihood phylogeny of 161 terminals based on 777 UCE loci. The upper group (cool colors) corresponds to the troglobitic clade, and the bottom group (warm colors) corresponds to the epigean clade. B. Visualization of VAE analysis of the troglobitic clade. Black circles represent the mean position of individuals and colored circles represent standard deviations. Note the complexity of the T. trogalil species cluster, whose standard deviations overlap as a third, separate cluster. C. Visualization of VAE analysis of the epigean clade. Note the detection of an undescribed species.
Figure 3. Maps of localities and CANAPE analysis. A. Map of localities from which all specimens used for sequencing were collected. Colors correspond to colors of tip labels in the phylogenetic tree and VAE figures. B. Results from CANAPE analysis. Boxes are 10 km² grid cells that encompass all localities sampled. Note the four sites of paleo-endemism and the single site of mixed endemism. The sites of paleo-endemism are home to epigean species, with the exception of T. trogalil (West). The site of mixed endemism is home to Te’omim Cave, which harbors the troglobitic T. yaaranford and the epigean T. pagana.