ORIGINAL ARTICLE
Species delimitation with gene flow: A methodological
comparison and population genomics approach to elucidate
cryptic species boundaries in Malaysian Torrent Frogs
Kin Onn Chan¹ | Alana M. Alexander¹ | L. Lee Grismer² | Yong-Chao Su³ | Jesse L. Grismer⁴,⁵ | Evan S. H. Quah⁶ | Rafe M. Brown¹
¹ Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
² Department of Biology, La Sierra University, Riverside, CA, USA
³ Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung City, Taiwan
⁴ Department of Biological Sciences, Auburn University, Auburn, AL, USA
⁵ La Kretz Center for Californian Conservation Science, Institute of the Environment and Sustainability, University of California Los Angeles, Los Angeles, CA, USA
⁶ School of Biological Sciences, Universiti Sains Malaysia, Penang, Malaysia
Correspondence
Kin Onn Chan, University of Kansas,
Lawrence, KS, USA.
Email: chanko@ku.edu
Funding information
National Geographic Society, Grant/Award
Number: 9722-15
Abstract
Accurately delimiting species boundaries is a nontrivial undertaking that can have
significant effects on downstream inferences. We compared the efficacy of com-
monly used species delimitation methods (SDMs) and a population genomics
approach based on genomewide single-nucleotide polymorphisms (SNPs) to assess
lineage separation in the Malaysian Torrent Frog Complex currently recognized as a
single species (Amolops larutensis). First, we used morphological, mitochondrial DNA
and genomewide SNPs to identify putative species boundaries by implementing
noncoalescent and coalescent-based SDMs (mPTP, iBPP, BFD*). We then tested
the validity of putative boundaries by estimating spatiotemporal gene flow (FASTSIMCOAL2,
ABBA-BABA) to assess the extent of genetic isolation among putative species.
Our results show that the A. larutensis complex runs the gamut of the
speciation continuum from highly divergent, genetically isolated lineages (mean
$F_{st}$ = 0.9) to differentiating populations involving recent gene flow (mean $F_{st}$ = 0.05;
$N_m$ > 5). As expected, SDMs were effective at delimiting divergent lineages in the
absence of gene flow but overestimated species in the presence of marked popula-
tion structure and gene flow. However, using a population genomics approach and
the concept of species as separately evolving metapopulation lineages as the only
necessary property of a species, we were able to objectively elucidate cryptic spe-
cies boundaries in the presence of past and present gene flow. This study does not
discount the utility of SDMs but highlights the danger of violating model assump-
tions and the importance of carefully considering methods that appropriately fit the
diversification history of a particular system.
KEYWORDS
Amolops, FASTSIMCOAL2, gene flow, migration rate, single-nucleotide polymorphism, site
frequency spectrum
Received: 3 March 2017 | Revised: 12 June 2017 | Accepted: 1 August 2017
DOI: 10.1111/mec.14296
Molecular Ecology. 2017;1–16. wileyonlinelibrary.com/journal/mec © 2017 John Wiley & Sons Ltd
1 | INTRODUCTION
Delimiting species boundaries is a fundamental component of systematic
biology that forms the framework for understanding the
evolutionary processes that generate biodiversity (Mayr, 1968). As
such, accurately delimiting species boundaries is a nontrivial step
that can have cascading ramifications (Veach, Di Minin, Pouzols, &
Moilanen, 2017). Species delimitation is usually performed using
certain properties of a species as criteria for assessing lineage independence.
The most commonly used criteria include phenotypic distinctiveness,
molecular divergence and phylogenetic placement
(Brown & Stuart, 2012; Leavitt, Moreau, & Lumbsch, 2015; Leliaert
et al., 2014; de Queiroz, 2007; Tobias et al., 2010; Wiens & Penkrot,
2002). These “traditional” properties can be useful in delimiting allopatric
(Chan, Grismer, & Grismer, 2011), phenotypically distinct (Grismer
et al., 2010) and genetically distant lineages where barriers to
gene flow exist or sufficient time has passed for fixed character differences
to accumulate (Chan, Grismer, Zachariah, Brown, & Abraham,
2016; Grismer et al., 2013). However, cryptic lineages that occur in
sympatry, have similar niches and are not readily distinguishable phe-
notypically such as those characterized by recent/rapid radiations
can be harder to diagnose because divergent lineages no longer con-
nected by gene flow cannot be easily distinguished from the local
population structure within such lineages, forming a hierarchy of
genetic differentiation and divergence (Barley, White, Diesmos, &
Brown, 2013; Carstens, Pelletier, Reid, & Satler, 2013; Rannala,
2015; Sukumaran & Knowles, 2017). In such cases, traditional crite-
ria are limited in utility for assessing lineage separation (de Queiroz,
2005) and, if not implemented with caution, can lead to erroneous
results (Carstens et al., 2013).
Advances in genomic sequencing and bioinformatics have led to
the ability to detect population structure between closely related
populations at unprecedented resolution (Benestan et al., 2015;
Candy et al., 2015; Larson et al., 2014; Leslie et al., 2015). One of
the challenges with such data sets is to distinguish between struc-
ture that is associated with intraspecific variation from that resulting
from speciation (Sukumaran & Knowles, 2017). Model-based meth-
ods make simplifying assumptions about certain parameters (e.g.,
gene flow, population size) during the speciation process, and range
in complexity from noncoalescent, sequence-based methods that
model speciation in terms of number of substitutions (Zhang, Kapli,
Pavlidis, & Stamatakis, 2013) to highly parameterized Bayesian mod-
els based on the multispecies coalescent (Yang & Rannala, 2010) that
allow for the integration of multiple data types into a single model-based
framework (Solís-Lemus, Knowles, & Ané, 2015). The efficacy
of each method depends on how well the model fits the data, and
processes that violate model assumptions such as gene flow (Bur-
brink & Guiher, 2015; Sousa & Hey, 2013; Streicher et al., 2014) or
spatial autocorrelation (Meirmans, 2012; Reeves & Richards, 2011)
can yield inaccurate species delimitations.
Implicit within most species definitions—that species are sepa-
rately evolving metapopulation lineages (de Queiroz, 2007; Simpson,
1961; Wiley, 1978)—is the expectation that populations within a
metapopulation lineage are connected by gene flow, but remain dis-
tinct from other such lineages (Frost & Hillis, 1990; Petit & Excoffier,
2009; de Queiroz, 2005). Levels of gene flow among populations are
not only influenced by intrinsic traits (e.g., dispersal ability) but also
extrinsic spatial and temporal processes that shape genetic patterns
across a landscape (Richardson, Brady, Wang, & Spear, 2016). If
these processes are overlooked, inferences of lineage boundaries
may fail to recognize historical population associations (Knowles &
Carstens, 2007) or may be unable to distinguish true discontinuities
(i.e., lineage separation) from variation that occurs within a species
as a result of other phenomena such as continuous geographic clines
or isolation by distance (Medrano, López-Perea, & Herrera, 2014; de
Queiroz, 2007; Sexton, Hangartner, & Hoffmann, 2014). Moreover,
standard phylogenetic estimation methods have been shown to pro-
duce highly supported, erroneous topologies when gene flow is pre-
sent, thereby invalidating downstream inferences that are based on
these phylogenies, such as the identification of terminal mono-
phyletic groups (Reeves & Richards, 2007; Rosenberg, 2007).
Here, we compared a wide range of commonly used species
delimitation methods and a population genomics approach to eluci-
date cryptic species boundaries in an understudied South-East Asian
frog complex. The South-East Asian Sundaland Biodiversity Hotspot
harbours one of the highest concentrations of endemic plants and
vertebrates on the planet (Mittermeier, Myers, Thomsen, da Fonseca,
& Olivieri, 1998; Myers, Mittermeier, Mittermeier, da Fonseca, &
Kent, 2000). Unfortunately, only 7.8% of Sundaland’s original pri-
mary forest remains and some estimates suggest that up to 42% of
its biodiversity could be lost by 2100 (Myers et al., 2000; Sodhi,
Koh, Brook, & Ng, 2004). Consequently, systematic research in this
region has focused heavily on discovering and describing new spe-
cies before they are lost. This is epitomized by the rapid surge of
new amphibian and reptile descriptions over the last 15 years,
resulting in more than a 20% increase in species richness (Brown &
Stuart, 2012; Giam et al., 2012; Grismer, 2011). In virtually every
one of these descriptions, species boundaries were delimited based
on morphology and/or mitochondrial DNA (mtDNA) (e.g., Chan,
Brown, Lim, & Grismer, 2014; Chan & Grismer, 2010; Chan, Grismer,
& Brown, 2014; Grismer et al., 2012, 2014). Given the present
South-East Asian biodiversity crisis (Miettinen, Shi, & Liew, 2011;
Sodhi et al., 2004; Wilcove, Giam, Edwards, Fisher, & Koh, 2013)
and the need to rapidly inventory the region’s diversity, it is impor-
tant to tackle this problem using all available methods, including not
only traditional morphology and mtDNA-based approaches but also
species delimitation approaches that are better suited to elucidating
cryptic lineage diversity. Such methods, which importantly can take
gene flow into consideration, are best implemented using genomic-
scale data—increasingly available even for nonmodel organisms. Tor-
rent frogs of the genus Amolops are represented by 51 species that
collectively range from Tibet, northeastern India, southern China,
southward throughout Indochina and the Thai-Malay Peninsula
(Frost, 2015). The bulk of species diversity lies in southern China
and Indochina, yet only one species, Amolops larutensis, occurs in
extreme southern Thailand and Peninsular Malaysia. This species has
never been studied outside the context of higher-level phylogenetics
where it was represented by samples from only two named localities
(Hasan et al., 2014; Matsui et al., 2006; Stuart, 2008), and as a
result, little is known of its intraspecific phenotypic and genetic
diversity.
Our sampling of A. larutensis from new localities throughout
Peninsular Malaysia revealed subtle phenotypic variation among geo-
graphic populations, leading us to hypothesize that A. larutensis
constitutes a complex of cryptic lineages. Because no prior data
were available, we used a two-step approach to species delimitation
by applying widely used species delimitation analyses (Carstens
et al., 2013; Rannala, 2015) using a variety of data types including
morphology, mtDNA and genomewide single-nucleotide polymor-
phisms (SNPs) to form preliminary hypotheses of species boundaries.
We then tested these putative boundaries using a rigorous popula-
tion genomics framework. Specifically, we diagnose lineage separa-
tion by assessing spatiotemporal gene flow under the general
concept of species as a separately evolving metapopulation lineage
and treat this as the only necessary property of a species (de
Queiroz, 2007). Therefore, the objectives of this study are twofold:
(i) evaluate the efficacy of commonly used species delimitation meth-
ods in assessing lineage separation in cryptic species; (ii) determine
whether population genomics can be an effective tool in elucidating
cryptic species boundaries.
2 | METHODS
2.1 | Sampling, data collection and accessibility
Our total data set consisted of 225 samples for which some combi-
nation of morphological, mtDNA and SNP data was available
(Tables S1 and S2). Morphological data were obtained from a subset
of 141 vouchered museum specimens examined from collections at
La Sierra University Herpetological Collection, Riverside, California;
Zoological Reference Collection, Lee Kong Chian Natural History
Museum, Singapore; and University of Kansas Natural History
Museum (KU), Lawrence, Kansas (Table S1).
DNA for mitochondrial and genomic sequencing was extracted
from liver tissue using the Qiagen DNeasy Blood & Tissue Kit. A
total of 117 samples (including 79 of the 141 samples scored for
morphology) were sequenced for mtDNA and genomewide SNPs
(Table S2). These samples were chosen from populations that maxi-
mized geographic coverage and altitudinal variation across all major
mountain ranges. For mtDNA, we sequenced the 16S rRNA-encod-
ing gene using primers from Evans et al. (2003) and sequencing pro-
tocol from McLeod (2010). Raw sequence data were aligned using
the MUSCLE algorithm, and resulting alignments were subsequently
refined by eye in GENEIOUS PRO version 5.3 (Kearse et al., 2012). In
addition to these samples, 49 16S rRNA Amolops sequences were
obtained from GenBank to assess the monophyly and phylogenetic
placement of Peninsular Malaysian populations. Samples and corresponding
GenBank accession numbers are listed in Table S2.
A subset of 95 samples (including 67 samples scored for mor-
phology, 18 with mtDNA data; Table S2) were selected for genomic
sequencing of nuclear DNA in the form of genomewide SNPs using
a single-end multiplexed shotgun genotyping protocol (Andolfatto
et al., 2011). Briefly, 500 ng of DNA from each sample was digested
with NdeI (New England Biolabs), ligated with a sample-specific
barcode and then pooled in sets of 48 samples and run through a
Pippin Prep™ (Sage) to select fragments between 400 and 500 bp.
Genomic samples were sequenced in one lane of the Illumina HiSeq
2500 platform at the Genome Sequencing Core Facility at the
University of Kansas. Loci were subsequently assembled and filtered
using the program PYRAD v.3.0.5 (Eaton, 2014). The maximum number
of low quality, undetermined sites (“N”) in filtered sequences was set
to 4, and proportion of shared polymorphic sites in a locus was set
at 10% (Eaton, 2014). Because overly stringent similarity thresholds
have been shown to cause “oversplitting” by splitting orthologous
reads into multiple loci (Catchen, Hohenlohe, Bassham, Amores, &
Cresko, 2013; Harvey et al., 2015; Ilut, Nydam, & Hare, 2014),
whereas more liberal thresholds were found to have minimal bias
effects on inference (Ilut et al., 2014; Rubin, Ree, & Moreau, 2012),
we employed a relatively relaxed threshold of 88% similarity
between reads when clustering loci. We then used two different settings
for minimum depth of coverage (min. read depth = 5 and 10)
and minimum taxon coverage (30% and 50% missing samples per
locus) to produce four SNP data sets (Table 1). To avoid linkage
across sites within the same locus, the single SNP with the highest
sample coverage was selected from each locus.
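The unlinked-SNP selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PYRAD implementation: the input layout (a mapping from locus name to per-SNP sample counts) and the function name `pick_unlinked_snps` are hypothetical.

```python
def pick_unlinked_snps(loci):
    """Return, for each locus, the index of the SNP genotyped in the most samples.

    `loci` maps a locus name to a list of sample counts, one entry per SNP
    (hypothetical layout). Ties go to the first SNP encountered.
    """
    chosen = {}
    for name, coverage in loci.items():
        if not coverage:  # locus without variable sites: nothing to select
            continue
        chosen[name] = max(range(len(coverage)), key=lambda i: coverage[i])
    return chosen


example = {"locus_1": [40, 55, 55], "locus_2": [12], "locus_3": []}
print(pick_unlinked_snps(example))
```

Keeping a single SNP per locus trades data volume for approximate independence among sites, which is what downstream SFS-based and clustering analyses generally assume.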
2.2 | Establishing putative species boundaries
Putative species boundaries were established using the following
species delimitation framework based on traditional and widely used
criteria:
1. We estimated mtDNA phylogenies and identified monophyletic
lineages. Each monophyletic lineage that corresponded to a dis-
crete or recognizable geographic region was defined as an opera-
tional taxonomic unit (OTU).
2. We calculated uncorrected genetic distance between OTUs
based on the mtDNA sequence alignment.
3. We assessed morphological variation and distinctiveness between
OTUs using multivariate analyses.
4. Finally, once the OTUs defined using mtDNA were corroborated by
geographic data, by clades recovered in our concatenated SNP
phylogenies and by the sNMF population structure assignments
(methods and results discussed below), we performed noncoalescent
and coalescent-based species delimitation analyses to
establish putative species boundaries for downstream hypothesis
testing.
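Step 2 of the framework above relies on uncorrected p-distances. A minimal sketch of that calculation is given below; the pairwise-deletion treatment of gaps and ambiguous bases is an assumption made for this example and may differ from the settings used in PAUP*.

```python
def p_distance(seq_a, seq_b, skip_chars="-?N"):
    """Uncorrected p-distance between two aligned sequences.

    Returns the proportion of compared sites that differ. Sites where either
    sequence has a gap or ambiguous base are skipped (assumed pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


print(p_distance("ACGTACGT", "ACGTACGA"))  # one difference in eight sites
```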
TABLE 1 Summaries of the four different SNP data sets generated using different values for minimum read depth coverage and percentage of missing data

Min. depth | % missing | Total variable sites | Total PIS | Min. # loci | Max. # loci | Unlinked SNPs
5          | 50        | 94,313               | 70,833    | 1,761       | 17,831      | 17,123
5          | 25        | 53,208               | 43,191    | 1,572       | 7,572       | 7,544
10         | 50        | 65,541               | 49,821    | 112         | 12,478      | 11,951
10         | 25        | 32,253               | 26,122    | 97          | 4,826       | 4,744

PIS, parsimony informative sites.
Min. # and Max. # loci give the minimum and maximum number of loci observed within an individual for each data set.
2.2.1 | Phylogenetic estimation
Bayesian and maximum-likelihood (ML) methods were used to infer
phylogenetic relationships from mtDNA. Bayesian inference was
implemented in the program MRBAYES 3.2.6 (Ronquist et al., 2012)
using a reversible jump MCMC + Γ substitution model and default
priors. Four independent MCMC runs (10,000,000 generations and
four chains per run) were combined and assessed for convergence
using the program TRACER v1.6 (Rambaut, Suchard, Xie, & Drummond,
2014), and a 50% majority rule consensus tree was produced by
excluding the first 25% of sampled trees as burn-in. The program IQ-
TREE (Nguyen, Schmidt, von Haeseler, & Minh, 2014) was used to
perform ML analyses. The Bayesian information criterion was used
to select the most appropriate substitution model, and branch sup-
port was assessed using 10,000 ultrafast bootstrap approximation
replicates (Minh, Nguyen, & von Haeseler, 2013).
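The burn-in and majority-rule steps described above can be illustrated with a small sketch. It assumes each sampled tree has already been reduced to a set of clades (each clade a frozenset of tip names); that representation and the function name are hypothetical and this is not how MRBAYES stores or summarizes trees internally.

```python
from collections import Counter


def majority_rule_clades(sampled_trees, burnin_fraction=0.25, threshold=0.5):
    """Clade frequencies after burn-in, keeping clades found in > threshold of trees.

    `sampled_trees` is a list of trees in sampling order; each tree is a set of
    clades, and each clade is a frozenset of tip names (assumed representation).
    """
    start = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[start:]
    if not kept:
        raise ValueError("no trees left after burn-in")
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()
            if n / len(kept) > threshold}


ab, cd, ac = frozenset("AB"), frozenset("CD"), frozenset("AC")
trees = [{ac}, {ac}] + [{ab, cd}] * 3 + [{ab}] * 2 + [{ac}]
print(majority_rule_clades(trees))  # only the (A, B) clade passes the 50% rule
```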
For SNP data, a ML phylogeny was also estimated using IQ-TREE.
We applied an ascertainment bias correction using the ASC model
(Lewis, 2001), and branch support was assessed using 10,000 ultra-
fast bootstrap approximation replicates. Additionally, we estimated a
species tree under the Bayesian multispecies coalescent framework
implemented in the program SNAPP (Bryant, Bouckaert, Felsenstein,
Rosenberg, & Roychoudhury, 2012) through BEAST v.2.3.1 (Drum-
mond & Bouckaert, 2015). We used the previously identified OTUs
as a priori species assignments and the following parameter settings:
mutation rates (u and v) and the shape parameter for the gamma distribution
prior on population sizes (alpha) were set at 1.0; the beta
scale parameter was set at 333 (calculated from the mean value of
the total number of polymorphic sites); and the speciation rate prior
(lambda) was sampled from a broad gamma distribution of alpha = 2
and beta = 200. The MCMC chain was run for 10,000,000 generations,
sampling every 1,000 states, and stationarity was assessed in
the program TRACER v1.6. The posterior distribution was considered
adequately sampled when effective sample size values for parame-
ters were >200.
2.2.2 | Morphological variation and species delimitation
Nine continuous morphological characters were measured from adult
specimens: snout-vent-length (SVL), head length, head width,
internarial distance, snout length, forearm length, femur length, tibia
length and third finger disc width following Chan et al. (2016). Due
to pronounced sexual size dimorphism, male and female measurements
were analysed separately. Characters were adjusted for allometric
growth using the following equation:
$X_{adj} = X - b(SVL - SVL_{mean})$, where $X_{adj}$ = adjusted value; $X$ = measured value;
$b$ = unstandardized regression coefficient for each OTU; $SVL$ = measured
snout-vent-length; $SVL_{mean}$ = overall average SVL of all OTUs
(Lleonart, Salat, & Torres, 2000; Thorpe, 1975, 1983; Turan, 1999).
Adjusted variables were then log-transformed prior to downstream
analyses. We performed a principal components analysis (PCA) on
this adjusted morphological data set to find the best low-dimensional
representation of variation in the data. Components with eigenvalues
above 1.0 were retained in accordance with Kaiser’s criterion (Kaiser,
1960). To further characterize population clustering, a discriminant
analysis (DA) of principal components (DAPC) was performed to find
the linear combinations of variables that have the largest between-
group variance and the smallest within-group variance. The DAPC
analysis relies on data transformation using PCA as a prior step to a
DA, ensuring that variables submitted to the DA analysis are uncor-
related and that their number is less than that of analysed individuals
(Jombart, Devillard, & Balloux, 2010).
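The allometric adjustment and Kaiser's criterion described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the slope is taken from an ordinary least-squares fit of each character on SVL within each OTU, and running the PCA on the correlation matrix is an assumption, since the text does not state whether the covariance or correlation matrix was used.

```python
import numpy as np


def allometric_adjust(x, svl, otu):
    """X_adj = X - b * (SVL - SVL_mean).

    b is the within-OTU least-squares slope of X on SVL; SVL_mean is the
    overall mean SVL across all OTUs.
    """
    x, svl, otu = np.asarray(x, float), np.asarray(svl, float), np.asarray(otu)
    svl_mean = svl.mean()
    x_adj = np.empty_like(x)
    for group in np.unique(otu):
        mask = otu == group
        b = np.polyfit(svl[mask], x[mask], 1)[0]  # unstandardized slope
        x_adj[mask] = x[mask] - b * (svl[mask] - svl_mean)
    return x_adj


def kaiser_components(data):
    """Number of principal components with eigenvalue > 1.

    `data` has individuals in rows and characters in columns; eigenvalues come
    from the correlation matrix (assumption, see lead-in).
    """
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
    return int((eigenvalues > 1.0).sum())
```

In a workflow like the one described, the adjusted values would then be log-transformed before being passed to the PCA and DAPC.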
Uncorrected pairwise p-distances were calculated from the mitochondrial
sequence alignments using the program PAUP* (Swofford,
2002). We then carried out species delimitation based on mtDNA
using a method that has been shown to perform well with single-
locus data (Tang, Humphreys, Fontaneto, & Barraclough, 2014). The
multirate Poisson tree processes (mPTP) is a noncoalescent, ML,
sequence-based method that models speciation in terms of number
of substitutions (Zhang et al., 2013). This method identifies changes
in the tempo of branching events, where the number of substitutions
between species is assumed to be significantly higher than the num-
ber of substitutions within species. Additionally, the model incorpo-
rates different levels of intraspecific genetic diversity deriving from
differences in either evolutionary history or sampling of each species
(Zhang et al., 2013). During phylogenetic inference, identical
sequences are assigned very short nonzero branch lengths to retain
the binary shape of the tree. Because this program requires a binary
phylogeny, the 50% majority consensus tree estimated from the
Bayesian analysis was used as the input tree. Confidence of the
delimitation scheme was assessed using two independent MCMC
chains at 5,000,000 generations each. Support values indicate the
fraction of sampled delimitations in which a node was part of the
speciation process.
We jointly analysed morphological and mtDNA data in a common
coalescent Bayesian framework using the program iBPP. This
method has been shown to improve the accuracy of species delimitation
by integrating phenotypic and genetic data (Solís-Lemus et al.,
2015). The iBPP analysis was performed with and without mtDNA
data to maximize the signal derived from phenotypic variation and
to evaluate the influence of mtDNA sequence data on this inte-
grated analysis. Male and female data sets were analysed separately
to avoid biases from sexual dimorphism. We used three different
combinations of priors for ancestral population size ($\theta$) and root age
($t_0$) drawn from a gamma distribution specified as $G(\alpha, \beta)$, where $\alpha$ is
the shape and $\beta$ is the rate parameter. All other divergence time
parameters were assigned the uniform Dirichlet prior (Fujita, Leaché,
Burbrink, McGuire, & Moritz, 2012; Pyron, Hsieh, Lemmon, Emily, &
Hendry, 2016; Yang & Rannala, 2010). We chose a diffuse shape
parameter ($\alpha$ = 1 or 2) and parameterized $\beta$ for large ancestral populations
and deep divergences, $\theta \sim G(1, 10)$, $t_0 \sim G(1, 10)$; small ancestral
populations and shallow divergences, $\theta \sim G(2, 2000)$, $t_0 \sim G(2, 2000)$;
and large ancestral populations with shallow divergences, $\theta \sim G(1, 10)$,
$t_0 \sim G(2, 2000)$. Both rjMCMC algorithms were implemented:
Algorithm 0 with $\varepsilon = 5$; Algorithm 1 with $\varepsilon = 2$ and $m = 1$.
Two independent runs were performed for each algorithm and prior
combination with a chain length of 50,000 sampled every 50 genera-
tions, discarding 1,000 generations as burn-in. MCMC convergence
was assumed when results were the same between multiple runs
using the two algorithms (Yang, 2015).
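To make the gamma priors above more concrete, the short sketch below computes the mean and variance implied by a gamma distribution in the shape/rate parameterization described in the text (mean = shape / rate, variance = shape / rate²). The helper name is hypothetical; the two parameter pairs are those listed above.

```python
def gamma_mean_var(shape, rate):
    """Mean and variance of a gamma distribution G(shape, rate)."""
    return shape / rate, shape / rate ** 2


priors = {"G(1, 10)": (1, 10), "G(2, 2000)": (2, 2000)}
for label, (shape, rate) in priors.items():
    mean, var = gamma_mean_var(shape, rate)
    print(f"{label}: mean = {mean:g}, variance = {var:g}")
```

Under this parameterization G(1, 10) has a prior mean of 0.1 and G(2, 2000) a prior mean of 0.001, which is why the former corresponds to large populations or deep divergences and the latter to small populations or shallow divergences.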
Species delimitation analysis on genomic SNPs was performed
using the Bayes factor delimitation method (BFD*; Leaché, Fujita,
Minin, & Bouckaert, 2014). Different species delimitation models
were constructed by lumping and splitting OTUs based on plausible
biogeographic scenarios and phylogenetic topologies derived from
prior phylogenetic analyses (Table 2). The marginal likelihood of each
model was estimated via path sampling using 48 steps, an alpha of
0.3, and an MCMC chain length of 100,000 with a preburnin of
100,000 (Leaché et al., 2014). Natural log Bayes factors (BF) were
used to compare the log marginal likelihoods (MLE) of competing
models using the equation $BF = 2 \times [MLE(\text{model 1}) - MLE(\text{model 2})]$,
where model 1 was the model with the largest number of species. A
positive BF value indicates support for model 1 and a negative value
support for model 2.
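The Bayes factor comparison described above reduces to simple arithmetic on log marginal likelihoods. The sketch below applies the stated equation to invented values; the model names and numbers are hypothetical and are not taken from the study.

```python
def bayes_factor(mle_model1, mle_model2):
    """BF = 2 * (MLE1 - MLE2), where MLE is a log marginal likelihood."""
    return 2.0 * (mle_model1 - mle_model2)


def rank_models(mles):
    """Order model names from highest to lowest log marginal likelihood."""
    return sorted(mles, key=mles.get, reverse=True)


# Hypothetical log marginal likelihoods for three competing delimitation models.
mles = {"five_species": -1000.0, "four_species": -1012.5, "two_species": -1040.0}
print(rank_models(mles))
print(bayes_factor(mles["five_species"], mles["four_species"]))  # positive: favours model 1
```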
2.3 | Validation of putative species boundaries using genomewide SNPs
2.3.1 | Population structure and differentiation
Population structure was characterized by estimating individual
ancestry coefficients that represent the proportions of an individual
genome that originate from multiple ancestral gene pools. Calcula-
tions were implemented in the program sNMF based on sparse non-
negative matrix factorization and least-squares optimization (Frichot,
Mathieu, Trouillon, Bouchard, & François, 2014; Kim & Park, 2007).
Ancestry coefficients estimated using the sNMF method have been
shown to produce results that are comparable to other widely used
programs such as ADMIXTURE and STRUCTURE, but have the advantage of
estimating homozygote and heterozygote frequencies and avoiding
Hardy–Weinberg equilibrium assumptions (Frichot et al., 2014). We
calculated ancestry coefficients for 1–16 ancestral populations (K)
using 100 replicates for each K. The preferred number of K was chosen
using a cross-entropy criterion based on the prediction of
masked genotypes to evaluate the error of ancestry estimation. The
sNMF method was implemented in the R package LEA (Frichot &
François, 2015).
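Choosing K from cross-entropy values can be sketched as below. The summary rule used here (take the lowest cross-entropy among replicates for each K, then pick the K with the smallest value) is one common convention and is an assumption of this example; the values are invented and the function is not part of LEA.

```python
def best_k(cross_entropy):
    """Pick K with the lowest cross-entropy.

    `cross_entropy` maps each K to the cross-entropy values of its replicate
    runs. Each K is summarized by its best (lowest) replicate (assumed rule).
    """
    summary = {k: min(values) for k, values in cross_entropy.items()}
    return min(summary, key=summary.get), summary


# Hypothetical cross-entropy values for K = 1..4 with two replicates each.
runs = {1: [0.80, 0.81], 2: [0.62, 0.60], 3: [0.55, 0.57], 4: [0.58, 0.56]}
k, summary = best_k(runs)
print(k, summary)
```

In practice the cross-entropy curve is often inspected for a plateau as well, since the minimum alone can favour slightly over-split solutions.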
To determine whether genetic structure was spatially autocorre-
lated, we conducted a Mantel test by examining the correlation
between genetic distance and Euclidean geographic distance. Corre-
lation values were compared against a distribution of permuted val-
ues based on 1,000 replicates simulated under the absence of spatial
structure. The Mantel test was performed using the R package
ADEGENET 2.0.1 (Jombart, 2008).
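The permutation logic of a Mantel test can be sketched with NumPy as follows. This is a generic illustration, not the routine used in the study: the function name, the one-sided alternative (observed correlation greater than permuted) and the fixed random seed are assumptions made for the example.

```python
import numpy as np


def mantel_test(gen_dist, geo_dist, permutations=1000, seed=0):
    """Mantel test between two square, symmetric distance matrices.

    Returns the observed Pearson correlation of the upper triangles and a
    one-sided permutation p-value obtained by jointly shuffling the rows and
    columns of the genetic distance matrix.
    """
    gen = np.asarray(gen_dist, float)
    geo = np.asarray(geo_dist, float)
    n = gen.shape[0]
    upper = np.triu_indices(n, k=1)
    observed = np.corrcoef(gen[upper], geo[upper])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(permutations):
        order = rng.permutation(n)
        permuted = gen[np.ix_(order, order)]  # relabel individuals
        r = np.corrcoef(permuted[upper], geo[upper])[0, 1]
        hits += int(r >= observed)
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value
```

Shuffling rows and columns together keeps each matrix internally consistent while breaking any association between genetic and geographic position, which is the null hypothesis of no spatial structure.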
Genetic distances between population pairs were estimated using
Wright’s $F_{st}$ and Jost’s $D$ (Jost, 2008; Meirmans & Hedrick, 2011;
Whitlock, 2011; Wright, 1951). Population differentiation was tested
with analysis of molecular variance, AMOVA (Excoffier, Smouse, &
Quattro, 1992) using the number of different alleles ($F_{st}$) based on
the infinite allele model (Weir & Cockerham, 1984), nesting individuals
within populations and populations within the eastern (East) vs.
central + western (West) mountain ranges. Significance was assessed
using 1,000 permutations. These calculations were performed using
the program GENODIVE v2.0b27 (Meirmans & Van Tienderen, 2004).
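For intuition, Jost's D at a single biallelic locus can be computed from population allele frequencies as sketched below, using D = [(H_T − H_S) / (1 − H_S)] × [n / (n − 1)]. The sketch assumes equal population weights and omits the small-sample corrections that dedicated software applies, so it will not reproduce GENODIVE output exactly; the function name is hypothetical.

```python
def jost_d(freqs):
    """Jost's D for one biallelic locus.

    `freqs` holds the frequency of one allele in each of n populations.
    Equal weights and no sampling correction are assumed.
    """
    n = len(freqs)
    if n < 2:
        raise ValueError("need at least two populations")
    h_s = sum(2 * p * (1 - p) for p in freqs) / n  # mean within-population heterozygosity
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)                  # heterozygosity of the pooled population
    return (h_t - h_s) / (1 - h_s) * n / (n - 1)


print(jost_d([1.0, 0.0]))  # populations fixed for different alleles
print(jost_d([0.3, 0.3]))  # identical allele frequencies
```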
2.3.2 | Hybridization and demographic analyses
Hybridization at the contact zone was investigated by calculating the
hybrid index (Buerkle, 2005). East 1 and Larutensis were selected as
parental populations while West 1 was designated as the putative
hybrid population.
TABLE 2 Results of the BFD* analysis based on nine species delimitation models ranging from 2 to 5 species within the western clade

# Species | Model                    | MLE            | BF        | Rank
5         | (L) (W1) (W2) (W3) (W4)  | -99,906.30527  | –         | 1
4         | (L) (W1) (W2) (W3 + W4)  | -103,884.8744  | 3,978.57  | 2
4         | (L + W1) (W2) (W3) (W4)  | -107,029.7769  | 7,123.47  | 3
3         | (L) (W1) (W2 + W3 + W4)  | -107,736.139   | 7,829.83  | 4
3         | (L) (W1 + W2) (W3 + W4)  | -108,921.1603  | 9,014.86  | 5
3         | (L + W1) (W2) (W3 + W4)  | -111,397.7147  | 11,491.41 | 6
2         | (L) (W1 + W2 + W3 + W4)  | -113,053.1408  | 13,146.84 | 7
2         | (L + W1) (W2 + W3 + W4)  | -115,914.2886  | 16,007.98 | 8
2         | (L + W1 + W2) (W3 + W4)  | -116,802.3151  | 16,896.01 | 9

Models were split or lumped according to plausible biogeographic scenarios and phylogenetic
topologies. Competing models were compared and ranked using log marginal likelihood estimates
(MLE) and Bayes factors (BF) following the equation: BF = 2 × (MLE of model 1 − MLE of model
2), where model 1 was the model with five species. A positive BF value indicates support for model
1 over model 2 and vice versa. Model abbreviations are L = Larutensis, W = West.

Population connectivity was assessed by estimating the effective
number of migrants exchanged between populations per generation
($N_m$) using FASTSIMCOAL2 v.52.21 (Excoffier, Dupanloup, Huerta-Sánchez,
Sousa, & Foll, 2013). Due to computational constraints, we
only analysed populations from the western clade. A folded site frequency
spectrum (SFS) was obtained with custom R-code (Alexander,
2017) and ∂a∂i v1.7.0 (Gutenkunst, Hernandez, Williamson, & Bustamante,
2009), projecting down population sizes to maximize the
number of segregating sites using custom R-code (Alexander, 2017).
Four scenarios were examined: contemporary and historical migra-
tion, contemporary migration only, historical migration only, no
migration (Fig. S5; Table S5). Population divergence times followed
Chan and Brown (2017), with the exception of the timing of the
divergence of West 2 from West 3/West 4, which was set as half-
way between the coalescence of all populations and the divergence
of West 3/West 4 (as the SNP topology differs from the mtDNA for
this lineage). For each scenario, 50 replicate FASTSIMCOAL2 runs were
carried out with the following settings: -n 100,000 -N 100,000 -m
--multiSFS -q -M 0.001 -l 10 -L 40. Initial prior distributions followed
a uniform distribution based on the population-specific theta
(distribution range: one order of magnitude lower and higher than
theta estimate) estimated using the program GENEPOP (Rousset, 2008),
and the data were modelled as FREQ, with the number of independent
chromosomes equal to the number of nonmonomorphic SNPs
in the SFS (4018). We used a mutation rate of $1.91 \times 10^{-8}$ following
Crawford (2003) and assumed vicariant splits between populations
(i.e., the number of simulated individuals remained constant
through time). The ranges of parameter estimates from the initial 50
runs were used as the prior distributions for the next run, and this
process continued until no further increase in likelihood was
detected. Using the parameter values from the run with the highest
likelihood, an additional run with -n/-N = 1,000,000 was carried
out to more accurately estimate the likelihood. The best-fitting scenario
was then assessed by Akaike’s information criterion (AIC)
score. The parameter estimates for the best-fitting scenario were
used to simulate 100 parametric bootstraps of the SFS. The data
type was changed to DNA, with a mutation rate of $1.91 \times 10^{-8}$,
and the number of chromosomal segments equalling the total number
of sites in the SFS (including monomorphic sites: 5695). The
length of the chromosomal segments was set at 100 bp, and the
mutation rate (to three significant figures) was adjusted by trial and
error until the closest match to the number of nonmonomorphic SNPs in
the observed SFS was obtained. After the bootstrap replicates were
generated, the *.tpl and *.est files that led to the run with the high-
est likelihood in the initial screening runs of the best scenario were
then used with the bootstrap replicates to obtain confidence inter-
vals for the parameter estimates, discarding the 2.5% lowest and
highest estimates for each parameter.
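The model-selection and confidence-interval steps described above can be sketched in a few lines. The function names are hypothetical. The conversion helper assumes the likelihood is reported in log10 units, which should be checked against the program output before use; the percentile interval simply discards the lowest and highest 2.5% of bootstrap estimates.

```python
import math

import numpy as np


def ln_from_log10(log10_likelihood):
    """Convert a log10 likelihood to a natural-log likelihood (assumed input units)."""
    return log10_likelihood * math.log(10)


def aic(ln_likelihood, n_params):
    """Akaike's information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * ln_likelihood


def percentile_ci(bootstrap_estimates, level=0.95):
    """Percentile confidence interval from parametric bootstrap estimates."""
    tail = (1 - level) / 2 * 100
    low, high = np.percentile(bootstrap_estimates, [tail, 100 - tail])
    return float(low), float(high)
```

The scenario with the lowest AIC would be treated as best fitting, and `percentile_ci` would be applied separately to each parameter's bootstrap estimates.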
To differentiate between introgression and incomplete lineage
sorting, we used Patterson’sD-statistic (ABBA-BABA test), based on
the frequencies of discordant SNP genealogies in a pectinate four-
taxon tree [(((P1,P2),P3),O)]. This test assumes that two SNP patterns,
“ABBA”and “BABA,”should be equally frequent under a scenario of
incomplete lineage sorting without gene flow, where “A”denotes the
ancestral allele and “B,”the derived allele. An excess of ABBA or
BABA patterns would therefore be indicative of introgression (Dur-
and, Patterson, Reich, & Slatkin, 2011; Patterson et al., 2012). We
calculated Dover combinations of four taxa that fitted the four-taxon
tree configuration across all plausible topologies inferred from our
phylogenetic analyses. Population pairs that did not conform to any
plausible relationships were not included in the test. Four samples
from each population were randomly chosen to form taxon sets.
Ingroup taxa (P1–P3) were then iterated over all possible combina-
tions of individuals that were chosen, while samples were pooled into
groups for the outgroup population (O). This approach allows the use
of any locus shared by the three sampled ingroup taxa and at least
one outgroup, effectively down-weighting D if the ancestral allele
was not fixed across multiple outgroup samples, thus making it a
more conservative test. The standard deviation of D was calculated
from 200 bootstrap replicates, and the observed D was converted to
a Z-score measuring the number of standard deviations it deviated
from 0. Significance was assessed using a p-value at α = .01 after the
Holm–Bonferroni correction for multiple testing (number of possible
combinations fitting the given species tree hypothesis; Eaton & Ree,
2013; Eaton, Hipp, González-Rodríguez, & Cavender-Bares, 2015).
The D-statistic test was implemented in PYRAD.
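The logic of the test can be sketched in a few lines of Python. This is an illustrative sketch only and not the PYRAD implementation: it assumes per-locus ABBA and BABA counts are already available as two lists, resamples loci with replacement for the bootstrap, and converts Z-scores to two-sided p-values under a normal approximation. All function names are hypothetical.

```python
import math
import random
import statistics


def d_statistic(abba, baba):
    """Patterson's D = (ABBA - BABA) / (ABBA + BABA), summed over loci."""
    total_abba, total_baba = sum(abba), sum(baba)
    denom = total_abba + total_baba
    return (total_abba - total_baba) / denom if denom else 0.0


def bootstrap_z(abba, baba, n_boot=200, seed=1):
    """Return observed D and |D| divided by its bootstrap standard deviation."""
    rng = random.Random(seed)
    n_loci = len(abba)
    replicates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n_loci) for _ in range(n_loci)]
        replicates.append(d_statistic([abba[i] for i in idx],
                                      [baba[i] for i in idx]))
    sd = statistics.stdev(replicates)
    d_obs = d_statistic(abba, baba)
    # A zero SD leaves Z undefined; treat it as non-significant here.
    z = abs(d_obs) / sd if sd > 0 else 0.0
    return d_obs, z


def two_sided_p(z):
    """Two-sided p-value for a Z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


def holm_bonferroni(p_values, alpha=0.01):
    """Return a reject/keep flag per test using Holm's step-down rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p-values are also retained
    return reject
```

Equal ABBA and BABA totals give D = 0, the expectation under incomplete lineage sorting alone; the Holm step-down rule compares the smallest p-value against α/m, the next against α/(m − 1), and so on.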
3 | RESULTS
3.1 | Phylogenetic relationships
3.1.1 | Mitochondrial DNA
Both Bayesian and ML phylogenetic analyses on mtDNA produced
congruent topologies at most major nodes, inferred Peninsular
Malaysian Amolops as a monophyletic clade sister to A. cremnobatus
from Indochina, and had identical topologies within the Peninsular
Malaysian subclade (Fig. S1). Within the Peninsular Malaysian sub-
clade, all nodes were highly supported in the ML tree (bootstrap
>90%, Figure 1a), whereas in the Bayesian tree, one node received
relatively low support (posterior probability 0.4; Fig. S1). Two highly
divergent (14–16% p-distance, 16S; Fig. S2), reciprocally mono-
phyletic Peninsular Malaysian Amolops clades were recovered with
high support by both methods. These corresponded to populations
from the eastern vs. western + central mountain ranges (hereafter
referred to simply as eastern and western clades; Figure 1). In the
eastern clade, we defined two genetically distinct and reciprocally
monophyletic OTUs (separated by 7–8% p-distance, 16S) that corre-
sponded to populations from the northeastern mountain range (East
1) and southeastern mountain range (East 2). In the western clade,
five subclades were recovered (1–5% p-distance, 16S). We desig-
nated these OTUs as Larutensis (type locality of A. larutensis), West
1, West 2, West 3 and West 4 (Figure 1).
3.1.2 | Genomewide SNPs
After quality control filtering of the initial 153 million reads obtained
across all samples, a total of c. 130 million reads were retained. The
total number of unlinked SNPs in the final data sets that were used
for downstream analyses ranged from 4,744 to 17,123 (Table 1).
Maximum-likelihood analyses on concatenated SNP data sets led
to four different phylogenies depending on how loci were filtered
(Fig. S3). At a minimum depth of 5 and 50% missing data, all major
splits were highly supported; however, a topology differing from the
mtDNA tree was produced (Figure 1b). The SNP data set at a mini-
mum depth of five and less missing data (30%) inferred a similar phy-
logeny, albeit with low support for the relationships among
populations within the western clade (Fig. S3). Phylogenies con-
structed from the SNP data sets with a minimum depth of 10 failed
to recover the West 2, West 3 and West 4 populations as monophyletic
groups. Furthermore, support for deeper nodes was significantly
lower. Topological placement of the East 1 and East 2
populations was congruent and highly supported throughout all phylogenetic
analyses and data sets, including mtDNA. One sample from
the contact zone (denoted by an asterisk in Figure 1) was embedded
within West 1 in the mtDNA phylogeny but recovered as a distinct
lineage within the eastern clade across all SNP phylogenies, indicating
a putative hybrid. The SNAPP analysis failed to converge when includ-
ing populations from both eastern and western clades. As the rela-
tionships of populations in the eastern clade were highly supported
in all other analyses, we performed a separate analysis on a data set
that only included populations from the western clade. This analysis
converged and produced a maximum clade credibility tree topology
similar to the concatenated SNP ML phylogeny (min. depth = 5, 50%
missing data) with 1.0 posterior probability at each node (not shown).
3.2 | Morphological variation and putative species boundaries
In both the male and female morphological data sets, the first three
principal components (PCs) had eigenvalues above 1.0 and were
retained for subsequent analyses. These PCs captured 69% (males)
and 78% (females) of the total variance (Table S3). In males, East 2
showed some separation along the first and third (but not second)
axes, whereas in females, the East 2 formed a distinct, nonoverlap-
ping cluster along the first axis but was undifferentiated along the
second and third axes (Figure 2). When variances between OTUs
were maximized, the DAPC analysis also isolated the East 2 as a dis-
tinct cluster in both males and females but showed no separation
for the other populations. No clear separation was detected in either
sex across the other populations in the PCA or DAPC analysis (Fig-
ure 2 and Fig. S4).
FIGURE 1 Ultrametric maximum-likelihood phylogenies inferred from (a) 1,466 bp of the 16S rRNA-encoding mitochondrial gene. All major
nodes were highly supported with >90% bootstrap; (b) 17,123 unlinked SNP loci filtered at a minimum depth of 5 and allowing for 50%
missing data. All major nodes were highly supported with >90% bootstrap. The asterisk (*) denotes the putative hybrid sample that was placed
in the western clade in the mtDNA phylogeny and the eastern clade in the SNP phylogeny. The distribution map (right) shows sampling
localities and examples of phenotypic differences between populations from the western (circles) and eastern (triangles) clades. The star
represents the type locality of Amolops larutensis, and the red box indicates the location of the putative contact zone between the eastern and
western clades. Inset: location of Peninsular Malaysia within South-East Asia
A total of five species were delimited using the mPTP species
delimitation method. East 1, East 2, West 3 and West 4 were delim-
ited as separate species with maximum average support values of
1.0, whereas Larutensis, West 1 and West 2 received low support
(0.003), suggesting that these OTUs should be lumped as a single
species (Figure 2).
All independent iBPP runs under both rjMCMC algorithms and
all combinations of priors produced the same results, indicating con-
vergence (Yang, 2015). Species delimitation results were similar
regardless of whether sequence data were included or excluded in
the analyses. In the male data set (with sequences included), all
OTUs were highly supported as distinct species (posterior probability,
pp = 1.0). For the female data set, all OTUs were supported as distinct
species (pp = 1.0) with the exception of the split between West
1 and Larutensis, which was moderately supported (pp = 0.7,
Figure 2).
Due to the previous lack of convergence in SNAPP analyses includ-
ing both western and eastern clade individuals, and because relation-
ships were unambiguous for the eastern clade, we restricted the
BFD* analysis to populations from the western clade only. Marginal
likelihood estimates improved as the number of species increased
and favoured the model that defined each population as a distinct
species. The second-ranked model favoured four species by lumping
West 3 and West 4 as a single species. However, when compared
to the five-species model, the BF value was high (+3,978), indicating
strong support for the five-species model (Table 2).
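The model comparison above can be sketched as follows. This is a minimal illustration that assumes the Bayes factor is computed as twice the difference in log marginal likelihoods and interpreted on the Kass and Raftery scale for 2 ln(BF); the sign convention (positive values favour the first model) is an assumption of the sketch, and the two marginal likelihood values are hypothetical placeholders chosen only to reproduce the reported magnitude, not the values in Table 2.

```python
def bayes_factor(ln_ml_a, ln_ml_b):
    """Return 2 * (ln ML_a - ln ML_b); positive values favour model a."""
    return 2.0 * (ln_ml_a - ln_ml_b)


def strength_of_evidence(bf):
    """Classify |2 ln(BF)| on the Kass and Raftery scale."""
    magnitude = abs(bf)
    if magnitude < 2:
        return "not worth more than a bare mention"
    if magnitude < 6:
        return "positive"
    if magnitude < 10:
        return "strong"
    return "very strong"


# Hypothetical log marginal likelihoods (not the values from Table 2).
ln_ml_five_species = -10000.0
ln_ml_four_species = -11989.0

bf = bayes_factor(ln_ml_five_species, ln_ml_four_species)
verdict = strength_of_evidence(bf)
```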
3.3 | Validation of putative species boundaries
3.3.1 | Population structure and differentiation
The population structure analysis (sNMF) on both SNP data sets at a
minimum depth of five inferred similar patterns of population structure
and admixture, where K = 2 split individuals into eastern and
western clusters. For the data set with 50% missing data, K = 7 had
the lowest cross-entropy value (Figure 3), whereas the data set with
30% missing data inferred K = 6 as the preferred number of genetic
clusters. Signatures of admixture were detected among populations
from the western clade. At the contact zone, the putative hybrid
sample appeared admixed between East 1 and West 1 genotypes.
Apart from the putative hybrid (further investigated below), no fur-
ther admixture was detected between the eastern and western
clades. This analysis also inferred an additional population at the
central region of the eastern mountain range (the southernmost East
1 population in Figure 3). We refer to this subpopulation as East 1.2
in subsequent analyses, to distinguish it from the north East 1.1
subpopulation.
FIGURE 2 Left: Principal components scores of morphological variables visualized as three-dimensional hypervolumes constructed using
multidimensional kernel density estimation. Geometry of hypervolumes is based on minimum convex polytopes, and axes show the first three
principal components and their proportion of variance. Right: Results of the mPTP and iBPP species delimitation analyses depicted on an
mtDNA cladogram. Values at internal nodes denote the average support value for the mPTP analysis followed by posterior probabilities from
the iBPP analysis for males and females, respectively
Within the western clade, Fst and Jost’s D values based on SNP
data were low among populations, ranging from 0.03 to 0.09
(mean = 0.053) for Fst and from 0.001 to 0.003 (mean = 0.002) for
Jost’s D. These values were higher among populations within the eastern
clade, ranging from 0.57 to 0.93 (mean = 0.7) for Fst and from 0.02 to
0.14 (mean = 0.08) for Jost’s D. Similarly, Fst and Jost’s D values were also
high when western and eastern populations were compared with
each other (Table 3). Results of the AMOVA analyses on populations
from the western clade showed that most of the variation (74%)
occurred within individuals, whereas in the eastern clade, most of the
variation (65%) occurred among populations. When populations were
nested within the western and eastern clades, most of the variation
(53%) was attributed to the eastern vs. western groupings (Table 4).
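To make the two differentiation measures concrete, the sketch below computes both for a single biallelic locus from population allele frequencies. It is a simplified illustration under stated assumptions: equal population weights, no sample-size correction, and Nei's G_ST used as the Fst analogue, which may differ from the estimators used for Table 3. The function name is hypothetical.

```python
def differentiation(freqs):
    """Return (G_ST, Jost's D) for one biallelic locus.

    `freqs` holds the frequency of one allele in each population
    (at least two populations, equally weighted).
    """
    n = len(freqs)
    # Mean within-population expected heterozygosity.
    h_s = sum(2 * p * (1 - p) for p in freqs) / n
    # Expected heterozygosity of the pooled populations.
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0, 0.0  # locus is monomorphic overall
    g_st = (h_t - h_s) / h_t
    jost_d = (h_t - h_s) / (1 - h_s) * n / (n - 1)
    return g_st, jost_d


# Two populations fixed for alternative alleles: complete differentiation.
fixed = differentiation([1.0, 0.0])
# Identical frequencies: no differentiation.
same = differentiation([0.3, 0.3])
```

Both measures are built from the same heterozygosities but are scaled differently, which is why their values are not expected to agree in magnitude.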
3.3.2 | Hybridization and demographic analyses
The hybrid index analysis showed that one of seven samples (sample
ID 21011, previously identified as a putative hybrid above) within
the West 1 population was a hybrid between Larutensis and the
combined East 1 parental populations (h = 0.549). The other six samples
had h-values close to zero, indicating a strong affinity with
Larutensis and that it was very unlikely they were of hybrid origin
(Table S4).
For the FASTSIMCOAL2 analysis, the full migration model was the
best fit according to AIC. We examined a version of this model
where migration rates between populations were constrained to be
symmetrical, but it had a poorer fit to the data than the full migra-
tion model that allowed for asymmetrical migration rates (Table S5).
We therefore restrict our discussion to the full asymmetrical migration
model only. Contemporary migration rates between Larutensis
and all other populations were low (Nm = 0.2–0.6), suggesting reproductive
isolation. Gene flow was highest between West 1 and West 2
(Nm = 5.5) and West 3 and West 4 (Nm = 5.8), whereas relatively
low levels of gene flow were detected between West 2 and West 3
(Nm = 1.0; Figure 4, Table 3). However, it should be pointed out that
the confidence intervals of all point estimates associated with this
model were wide (Table S5), suggesting denser sampling of the gen-
ome would be needed to accurately estimate parameters of this
parameter-rich model. Among the historical migration rates, an out-
lier was the very high rate of migration from West 2 into the ances-
tor of the Larutensis/West 1 populations (Table S5). This high
inferred gene flow could explain the discrepancy between the SNP
and mtDNA phylogenies, with the sister relationship of West 2 and
Larutensis/West 1 in the latter due to this introgression event.
FIGURE 3 Estimated population structure as inferred by the sNMF analysis. Each individual is partitioned into K coloured segments that
represent the proportions of an individual’s genome that originate from one or multiple inferred genetic clusters coloured consistently with the
other figures (note the light yellow cluster was not detected using morphological PCA). Asterisk (*) indicates the putative hybrid sample at the
inferred contact zone between eastern and western clades (red box). The inset graph plots cross-entropy values (y-axis) vs. number of
ancestral populations (x-axis). K = 2 splits individuals into eastern and western genotypes. Population structure for K = 7 (the number of
clusters with the lowest cross-entropy value) is plotted on the map using the average ancestry coefficient values for each estimated population
The D-statistic was used to differentiate between introgression
and incomplete lineage sorting among adjacent populations within
the western clade, eastern clade, and between both western and
eastern clades (Figure 4, Table 3). Within the western clade, high
levels of introgression (significant for all 63/63 combinations) were
detected between West 1 and West 2, while low levels of introgression
were detected between West 2 and West 3 (significant for 28/63
combinations). No introgression was detected between Larutensis
and West 1 or Larutensis and West 2. These results are congruent
with estimates from the FASTSIMCOAL2 analysis (Table 3). Introgression
between the sister lineages Larutensis–West 1 and West 3–West 4
was not assessed because the D-statistic is unable to test for gene
flow between sister lineages P1 and P2 in a pectinate four-taxon
tree [(((P1,P2),P3),O)].
Within the eastern clade, low levels of introgression were
detected between East 1.1 and East 1.2 (significant in 8/23 combi-
nations), whereas introgression was not detected between East 1.2
and East 2. Introgression was also absent among adjacent
populations from the western and eastern clades, even between syn-
topic populations at the contact zone, excluding the hybrid sample
(Table 3; Figure 4).
Spatial autocorrelation was not detected when the Mantel test
was performed on the entire SNP data set (p = .242), but was significant
when the test was performed separately on the eastern
(p = .014) and western (p = .009) clades (Fig. S6). Although spatial
autocorrelation can result in a correlation of genetic and geographic
distances, distant and divergent populations can also result in such a
pattern. To distinguish between these two scenarios, we used a non-
parametric approach by plotting both genetic and geographic dis-
tances and using two-dimensional kernel density estimation (KDE) to
measure local densities. Continuous genetic clines such as those
caused by spatial autocorrelation would result in a single cloud of
points without discontinuities, whereas distant and divergent popula-
tions would be represented by separate high density patches. The
KDE plots show that the western clade consists mostly of a single
cloud, with a few outliers (samples from the Larutensis populations
located on a different mountain range). The eastern clade was repre-
sented by two distinct patches (Fig. S6), indicating that the East 1
and East 2 populations are both distant and divergent and are not
spatially autocorrelated.
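The permutation logic behind a Mantel test can be sketched in plain Python. This is a minimal, one-sided illustration and not the implementation used in the study: it assumes two symmetric distance matrices given as nested lists, correlates their upper triangles, and permutes the rows and columns of one matrix together. The function names and the toy transect are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic, geographic, n_perm=999, seed=1):
    """Return (r, one-sided p) for two symmetric distance matrices."""
    n = len(genetic)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo = [geographic[i][j] for i, j in pairs]
    r_obs = pearson([genetic[i][j] for i, j in pairs], geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [genetic[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, geo) >= r_obs:
            hits += 1
    return r_obs, hits / (n_perm + 1)


# Hypothetical sites along a transect; genetic distance mirrors geography.
positions = [0, 1, 3, 7, 15, 31]
dist = [[abs(a - b) for b in positions] for a in positions]
r, p = mantel(dist, dist)
```

A significant correlation alone does not separate a continuous cline from two distant, divergent groups, which is why the text supplements the test with kernel density plots.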
4 | DISCUSSION
Our results show that commonly used species delimitation methods
were effective at assessing lineage separation in highly divergent lin-
eages where gene flow was absent (East 1 and East 2) but overesti-
mated the number of species in younger lineages where gene flow
was prevalent but populations were markedly structured genetically.
“Splitting” of lineages within a metapopulation occurred even when
genomic data were used. We attribute this to the violation of the
underlying assumptions of the models implemented by these pro-
grams: the guide tree is assumed to be correct (Zhang et al., 2013);
speciation is modelled as an instantaneous event (Nee, 2006; Suku-
maran & Knowles, 2017); and divergence is assumed to occur without
gene flow (Yang & Rannala, 2010). Using these methods on a system
that violated these assumptions led to model misspecification and
inaccurate estimation of species boundaries (Camargo, Morando,
Avila, & Sites, 2012; Carstens et al., 2013; Ence & Carstens, 2011;
Jackson, Carstens, Morales, & O’Meara, 2016; Sukumaran & Knowles,
2017). On the other hand, we showed that a population genomics
approach can be an effective tool for delimiting species boundaries
both when gene flow is absent and when it is present at varying levels.
By considering lineage independence as the only necessary property
of a species, we can shift our focus away from traditional criteria (phe-
notypic distinctness, monophyly, genetic divergence, etc.) and recast
the species delimitation framework as one that strictly focuses on
assessing lineage cohesion/separation. Using this approach, we
demonstrate that Peninsular Malaysian Amolops comprise at
least three species: the true A. larutensis (i.e., the western clade) and
two unnamed lineages from the eastern clade (East 1 and East 2).
TABLE 3 Pairwise comparisons of demographic parameters within and between eastern and western populations

Pop 1        Pop 2      Fst      Jost's D   Nm       D-statistic Z-range (nSig.)
Within West
Larutensis   West 1     0.0860   0.0030     0.5814   NT
Larutensis   West 2     0.0670   0.0020     0.3331   0.0–3.0 (0/63)
West 1       West 2     0.0630   0.0020     5.4939   2.8–7.3 (63/63)
West 2       West 3     0.0290   0.0010     1.0291   0.9–5.6 (28/63)
West 3       West 4     0.0340   0.0010     5.7762   NT
Mean                    0.0530   0.0017
Within East
East 1       East 1.2   0.5700   0.0150     NT       0.8–5.3 (8/23)
East 1.2     East 2     0.8880   0.1380     NT       0.4–3.4 (0/47)
Mean                    0.7290   0.0765
Between West/East
East 1       West 1     0.8970   0.1260     NT       0.0–4.0 (0/191)
East 1       West 2     0.9140   0.1350     NT       0.0–1.9 (0/63)
East 1.2     West 3     0.8820   0.1380     NT       0.0–2.7 (0/71)
East 1.2     West 4     0.8890   0.1380     NT       0.0–2.7 (0/71)
East 3       West 4     0.9320   0.2210     NT       0.0–1.6 (0/63)
Mean                    0.9028   0.1516

NT, not tested.
Examined parameters include genetic distances (Fst and Jost’s D), migration
rates (Nm) and D-statistic scores represented by the range of Z-scores
followed by the number of significant location comparisons assessed using
a p-value at α = .01 after the Holm–Bonferroni correction.
4.1 | Support for lineage separation
All analyses unanimously supported at least three separately evolving
lineages. The western and eastern lineages were separated by very
large uncorrected mitochondrial distances and Fst values. The differentiation
between these two lineages was also supported by the
majority of AMOVA variance being explained by these lineages as
opposed to populations within these lineages. Furthermore, the D-
statistic test showed no evidence of introgression between the west-
ern and eastern lineages (with the exception of a single hybrid sample
discussed below). Similar results were obtained when comparing the
populations East 1 and East 2 within the eastern lineage, thereby
supporting the divergence and isolation of these two species.
Despite the presence of a single hybrid sample, our analyses indi-
cated that all other samples from the eastern/western contact zone
(n=21) consisted of either eastern or western genotypes with no
genetic intermediates. We view this as evidence of strong reproduc-
tive isolation and hypothesize that hybridization events between
these separately evolving lineages are rare and produce hybrids of
low fitness that do not subsequently reproduce successfully. How-
ever, denser sampling will be required to better understand the
extent and viability of hybrids at this contact zone.
FIGURE 4 Left: A phylogram depicting contemporary and historical migration rates (Nm) for populations from the western clade estimated
using FASTSIMCOAL2 under a full asymmetrical migration model. Right: Contemporary gene flow scenarios based on plausible phylogenetic
relationships, population structure and geography. High gene flow: Nm > 1 and all sample location comparisons significant for D-statistic; low
gene flow: 0.1 < Nm < 1 and some comparisons significant for D-statistic; no gene flow: Nm < 0.1 or no significant comparisons for D-statistic
TABLE 4 AMOVA results showing the proportion of variation and Fst analogues calculated for different hierarchical levels of population
structure under the infinite allele model

Source of variation   Nested in    %var    F-stat   F-value   SD      p-Value   F′-value
Within West
Within individual     –            0.744   Fit      0.256     0.011   –         –
Among individual      Population   0.12    Fis      0.139     0.008   .001      –
Among population      –            0.136   Fst      0.136     0.009   .001      0.138
Within East
Within individual     –            0.203   Fit      0.797     0.007   –         –
Among individual      Population   0.15    Fis      0.425     0.012   .001      –
Among population      –            0.647   Fst      0.647     0.01    .001      0.655
Between West/East
Within individual     –            0.262   Fit      0.738     0.005   –         –
Among individual      Population   0.114   Fis      0.303     0.004   .001      –
Among population      East/West    0.091   Fsc      0.195     0.005   .001      0.2

p-Values were assessed using 1,000 permutations.
These multiple lines of congruent evidence from different sources
of data provide strong support for the recognition of at least three
distinct species of Amolops in Peninsular Malaysia: the true A. laruten-
sis, consisting of populations from the western lineage; and two unde-
scribed species represented by the lineages East 1 and East 2.
4.2 | Support for population cohesion
Within the highly structured western clade, the relatively high mitochondrial
distances were consistent with interspecific distances
among other amphibian species (Fouquet et al., 2007; Vences, Thomas,
Bonett, & Vieites, 2005; Vences, Thomas, van der Meijden,
Chiari, & Vieites, 2005), and the populations were identified as separate
species based on traditional species delimitation methods. However, we
reject this hypothesis based on results from population genomic
analyses (sNMF, Fst, AMOVA, H-index, FASTSIMCOAL2, D-statistic, Mantel
test) that detected different levels of gene flow among these
populations (discussed in further detail below). Disturbingly, the populations
that had the highest mitochondrial divergences (West 1,
West 2, West 3 and West 4) were also the populations that were
most undifferentiated and showed the highest levels of gene flow
based on genomic data, potentially as a result of sex-biased gene
flow. Genetic variation within the western clade is more reflective of
intraspecific population structure than divergence associated with
speciation events. As such, we consider the entire western lineage
to be a single, cohesive metapopulation lineage represented by the
taxon name A. larutensis.
Within the eastern clade, the D-statistic showed low levels of
gene flow between the subpopulations East 1.1 and East 1.2 but not
between East 1.2 and East 2. We attribute the low levels of gene
flow and migration between the subpopulations East 1.1 and East
1.2 to the lack of samples from the region spanning those popula-
tions. We hypothesize that as samples from that area become avail-
able, populations from the entire northeastern mountain range will
form a cohesive metapopulation lineage (East 1), separate from the
southeastern population East 2 due to the lack of contiguous habitat
between East 1 and East 2.
4.3 | Biogeography and the speciation continuum
The different levels of genetic differentiation within Peninsular
Malaysian Amolops illustrate the complex nature of speciation, rang-
ing from the presence of continuous variation within a group with-
out reproductive isolation, to complete and irreversible reproductive
isolation between groups (Hendry, Bolnick, Berner, & Peichel, 2009).
Deep divergence coupled with strong reproductive isolation could be
caused by divergent selection (McKinnon et al., 2004; Rundle &
Nosil, 2005; Schluter, 2009) or allopatric speciation (Coyne & Orr,
2004; Wiley, 1978). In this study, ecological conservatism in Amolops
and the discontinuous genetic variation between western and east-
ern lineages are more indicative of the latter as opposed to the for-
mer. At the other end of the divergence spectrum, populations from
the western lineage were highly structured and showed varying
levels of historical and contemporary migration consistent with a
complex history involving gene flow between recently diverging lin-
eages. Because speciation with gene flow can occur in nature (Nie-
miller, Fitzpatrick, & Miller, 2008; Nosil, 2008; Zarza et al., 2016),
we applied a migration threshold for genetic isolation of one individ-
ual per 10 generations as a cut-off to determine the level of gene
flow below which we consider populations to be separately evolving
lineages (Rannala, 2015; Zhang, Zhang, Zhu, & Yang, 2011). Using
this threshold, gene flow among populations of the western clade
has not been reduced enough for these populations to be considered genetically
isolated, distinct species. However, it is worth noting
that gene flow between Larutensis on the northwestern
mountain range and the geographically proximate West 1 and West 2
populations on the central range was the most reduced (Nm = 0.3). We
interpret this as an indication of incipient speciation triggered by
recent and rapid human development and the disruption of habitat
corridors along the Bintang–Kledang range, a small mountain range
situated between the northwestern and central mountain ranges
(Jamaluddin, Pau, & Siti-Azizah, 2011; Khoo & Lubis, 2005). Given
sufficient time, Larutensis’s lower long-term migration rates could
lead this population to qualify as a species separate from the other
western populations. However, at this point, the data do not support
this split and we therefore consider these populations as belonging
to a single species.
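The threshold argument above can be checked directly against the contemporary migration rates reported in Table 3. The sketch below is illustrative only: it takes the point estimates at face value (their confidence intervals were wide) and treats one migrant per 10 generations as Nm = 0.1. The variable and function names are hypothetical.

```python
# Contemporary migration rates (Nm) for western-clade pairs, from Table 3.
NM_WEST = {
    ("Larutensis", "West 1"): 0.5814,
    ("Larutensis", "West 2"): 0.3331,
    ("West 1", "West 2"): 5.4939,
    ("West 2", "West 3"): 1.0291,
    ("West 3", "West 4"): 5.7762,
}

# One migrant per 10 generations, expressed as migrants per generation.
ISOLATION_THRESHOLD = 0.1


def isolated_pairs(rates, threshold=ISOLATION_THRESHOLD):
    """Return population pairs whose Nm falls below the threshold."""
    return sorted(pair for pair, nm in rates.items() if nm < threshold)


isolated = isolated_pairs(NM_WEST)
lowest_pair = min(NM_WEST, key=NM_WEST.get)
```

No western pair falls below the threshold, and the lowest estimate involves Larutensis, in line with the interpretation given in the text.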
4.4 | Effects of genomic filtering parameters
Filtering parameters for SNP assembly can have a significant impact
on downstream analyses and inferences. The correlation between
including sites with more missing taxa and better bipartition support
is consistent with previous simulation (Huang & Knowles, 2016) and
empirical studies (Eaton & Ree, 2013; Wagner et al., 2013). Con-
versely, allowing large amounts of missing data can also result in
high bootstrap support for incorrect clades (Leaché, Banbury, Felsenstein,
de Oca, & Stamatakis, 2015). Since a consensus has yet to be
reached on the best ways to process large data sets, our preference
for the data set with lower minimum read depth and higher allow-
ance for missing data should be interpreted with caution, especially
as our preferred phylogeny might not actually represent the true
species tree. However, our use of population genomic methods to
delimit the number of Peninsular Malaysian Amolops species means
our conclusions are relatively robust to errors in reconstructing the
true topology of lineages included in this study.
In a separate study on the trade-off between coverage depth and
the number of individuals in a sample, Buerkle and Gompert (2013)
showed that low coverage sequencing (as low as 1× coverage) is not
only sufficient, but also could be optimal to accurately estimate popu-
lation parameters, as this allows the inclusion of greater numbers of
individuals or sites in the genome. Lower coverage was also optimal
for phylogenetic estimation in our data sets, as our higher minimum
depth data sets had low branch support, and failed to recover some
monophyletic groups. However, these recommendations depend
on the overall level of sequencing depth in our project. Our study
supports previous research in suggesting that general rules of thumb for SNP
filtering are unlikely to exist; instead, suitable settings may depend on the
properties of the data set and species biology, which should be evaluated on a
case-by-case basis (Huang & Knowles, 2016; Leaché et al., 2015).
5 | CONCLUSIONS
This study does not discount the utility of traditional species delimi-
tation methods but instead highlights the importance of choosing
the right tool for the right task. Using methods that do not account
for gene flow to delimit cryptic species boundaries where gene flow
occurs will inherently yield erroneous results. We therefore caution
against using these methods to delimit recent and rapidly diverging
populations where gene flow may be prevalent. For such cases, we
demonstrate that a population genomics approach can be used to
objectively assess lineage separation in line with the general lineage
concept of species.
Our findings are especially significant for systematic research in
regions where new species are being described at a high rate. Malay-
sia stands as a particularly relevant test case (i.e., a potential future
study system for evaluating the performance of species delimitation
procedures) in that numerous newly described species, codistributed
throughout the range of localities studied here, have been split into
multiple, formally named species using traditional species delimitation
methods. This study does not invalidate those descriptions but pro-
vides evidence that gene flow is present among co-occurring popula-
tions in one taxonomic group (Amolops). Our findings suggest that
other codistributed and taxonomically diverse taxa could provide
compelling examples for future genomic species delimitation studies.
ACKNOWLEDGEMENTS
Field and molecular work for this study was supported by the
National Geographic Explorer’s Grant (9722-15) to LLG and KOC. We
thank the Advanced Computing Facility staff at the University of Kan-
sas for computational resources; and the KU Genome Sequencing
Core Laboratory (supported by the National Institute of General Medi-
cal Sciences of the National Institutes of Health: P20GM103638). We
thank J. Kelly for assisting with genomic data collection and R. Glor, L.
Welton, S. Travers, K. Olson, C. Hutter, R. Abraham, P. De Mello, K.
Allen, K. Chovanec, J. Wienell and W. Tapondjou for providing intel-
lectual input. We thank K. Lim at Lee Kong Chian Museum of Natural
History, Singapore, for specimen loans. For assistance in the field, we
thank T. Daugherty, M. Muin and S. Anuar.
DATA ACCESSIBILITY
1. Sampling localities, morphological data and mtDNA sequences
(GenBank accessions) uploaded as Tables S1 and S2.
2. Genomic data and associated scripts and files for analyses are
available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.p928v.
AUTHOR CONTRIBUTIONS
K.O.C. designed the study, collected samples, performed laboratory
work and analyses and wrote the manuscript. A.M.A. helped with
genomic analyses and provided valuable comments and edits
throughout numerous rounds of revisions. L.L.G. provided funding
for fieldwork, collected samples and contributed to many of the
ideas in the project design. Y.-C.S. provided valuable insights and
ideas during many sessions of discussion. J.L.G. helped with initial
project design and sampling. E.S.H.Q. helped with sampling. R.M.B.
funded laboratory work and sequencing and provided intellectual
input throughout the entire study.
ORCID
Kin Onn Chan http://orcid.org/0000-0001-6270-0983
REFERENCES
Alexander, A. (2017). Creating_dadi_SNP_input_from_structure. Retrieved
from https://github.com/laninsky/creating_dadi_SNP_input_from_
structure
Andolfatto, P., Davison, D., Erezyilmaz, D., Hu, T. T., Mast, J., Sunayama-
morita, T., & Stern, D. L. (2011). Multiplexed shotgun genotyping for
rapid and efficient genetic mapping. Genome Research,21(4), 610–
617.
Barley, A. J., White, J., Diesmos, A. C., & Brown, R. M. (2013). The challenge
of species delimitation at the extremes: Diversification without mor-
phological change in Philippine Sun Skinks. Evolution,67, 3556–3572.
Benestan, L., Gosselin, T., Perrier, C., Sainte-Marie, B., Rochette, R., &
Bernatchez, L. (2015). RAD genotyping reveals fine-scale genetic
structuring and provides powerful population assignment in a widely
distributed marine species, the American lobster (Homarus ameri-
canus). Molecular Ecology,24(13), 3299–3315.
Brown, R. M., & Stuart, B. L. (2012). Patterns of biodiversity discovery
through time: An historical analysis of amphibian species discoveries
in the Southeast Asian mainland and adjacent island archipelagos. In
D. J. Gower, K. Johnson, J. Richardson, B. Rosen, L. Ruber, & S. Wil-
liams (Eds.), Biotic evolution and environmental change in Southeast
Asia (pp. 348–389). Cambridge: Cambridge University Press.
Bryant, D., Bouckaert, R., Felsenstein, J., Rosenberg, N. A., & Roychoud-
hury, A. (2012). Inferring species trees directly from biallelic genetic
markers: Bypassing gene trees in a full coalescent analysis. Molecular
Biology and Evolution,29, 1917–1932.
Buerkle, C. A. (2005). Maximum-likelihood estimation of a hybrid index
based on molecular markers. Molecular Ecology Notes,5, 684–687.
Buerkle, A. C., & Gompert, Z. (2013). Population genomics based on low
coverage sequencing: How low should we go? Molecular Ecology,22,
3028–3035.
Burbrink, F. T., & Guiher, T. J. (2015). Considering gene flow when using
coalescent methods to delimit lineages of North American pitvipers
of the genus Agkistrodon. Zoological Journal of the Linnean Society,
173, 505–526.
Camargo, A., Morando, M., Avila, L. J., & Sites, J. W. (2012). Species
delimitation with ABC and other coalescent-based methods: A
test of accuracy with simulations and an empirical example with
lizards of the Liolaemus darwinii complex (Squamata: Liolaemidae).
Evolution,66, 2834–2849.
Candy, J. R., Campbell, N. R., Grinnell, M. H., Beacham, T. D., Larson, W.
A., & Narum, S. R. (2015). Population differentiation determined from
putative neutral and divergent adaptive genetic markers in Eulachon
(Thaleichthys pacificus, Osmeridae), an anadromous Pacific smelt.
Molecular Ecology Resources,15(6), 1421–1434.
Carstens, B. C., Pelletier, T. A., Reid, N. M., & Satler, J. D. (2013). How to
fail at species delimitation. Molecular Ecology,22, 4369–4383.
Catchen, J., Hohenlohe, P. A., Bassham, S., Amores, A., & Cresko, W. A.
(2013). Stacks: An analysis tool set for population genomics. Molecu-
lar Ecology,22, 3124–3140.
Chan, K. O., & Brown, R. M. (2017). Did true frogs ‘dispersify’? Biology
Letters,13, 20170299.
Chan, K. O., Brown, R. M., Lim, K. K. P., & Grismer, L. L. (2014). A new
species of frog (Amphibia: Anura: Ranidae) of the Hylarana signata
Complex from Peninsular Malaysia. Herpetologica,70, 228–240.
Chan, K. O., & Grismer, L. L. (2010). Re-assessment of the Reinwardt’s
Gliding Frog, Rhacophorus reinwardtii (Schlegel 1840) (Anura: Rha-
cophoridae) in Southern Thailand and Peninsular Malaysia and its re-
description as a new species. Zootaxa,2505,40–50.
Chan, K. O., Grismer, L. L., & Brown, R. M. (2014). Reappraisal of the
Javanese Bullfrog complex, Kaloula baleata (Müller, 1836) (Amphibia:
Anura: Microhylidae), reveals a new species from Peninsular Malaysia.
Zootaxa,3900, 569–580.
Chan, K. O., Grismer, L. L., Zachariah, A., Brown, R. M., & Abraham, R. K.
(2016). Polyphyly of Asian Tree Toads, genus Pedostibes Günther,
1876 (Anura: Bufonidae), and the description of a new genus from
Southeast Asia. PLoS ONE,11, e0145903.
Chan, K. O., Grismer, L. L., & Grismer, J. L. (2011). A new insular, endemic
frog of the genus Kalophrynus Tschudi, 1838 (Anura: Microhylidae)
from Tioman Island, Pahang, Peninsular Malaysia. Zootaxa,68,60–68.
Coyne, J. A., & Orr, H. A. (2004). Speciation. Sunderland, MA: Sinauer
Associates.
Crawford, A. J. (2003). Relative rates of nucleotide substitution in frogs.
Journal of Molecular Evolution,57(6), 636–641.
Drummond, A. J., & Bouckaert, R. R. (2015). Bayesian evolutionary analysis
with BEAST (p. 260). Cambridge: Cambridge University Press.
Durand, E. Y., Patterson, N., Reich, D., & Slatkin, M. (2011). Testing for
ancient admixture between closely related populations. Molecular
Biology and Evolution,28, 2239–2252.
Eaton, D. A. R. (2014). PyRAD: Assembly of de novo RADseq loci for
phylogenetic analyses. Bioinformatics,30, 1844–1849.
Eaton, D. A. R., Hipp, A. L., González-Rodríguez, A., & Cavender-Bares, J.
(2015). Historical introgression among the American live oaks and
the comparative nature of tests for introgression. Evolution,69,
2587–2601.
Eaton, D. A. R., & Ree, R. H. (2013). Inferring phylogeny and introgression
using RADseq data: An example from flowering plants (Pedicularis:
Orobanchaceae). Systematic Biology,62, 689–706.
Ence, D. D., & Carstens, B. C. (2011). SpedeSTEM: A rapid and accurate
method for species delimitation. Molecular Ecology Resources,11,
473–480.
Evans, B. J., Brown, R. M., McGuire, J. A., Supriatna, J., Andayani, N.,
Diesmos, A., ... Cannatella, D. C. (2003b). Phylogenetics of fanged
frogs: Testing biogeographical hypotheses at the interface of the
Asian and Australian faunal zones. Systematic Biology,52(6), 794–819.
Excoffier, L., Dupanloup, I., Huerta-Sánchez, E., Sousa, V. C., & Foll, M.
(2013). Robust Demographic Inference from Genomic and SNP data.
PLoS Genetics,9(10). https://doi.org/10.1371/journal.pgen.1003905
Excoffier, L., Smouse, P. E., & Quattro, J. M. (1992). Analysis of molecular
variance inferred from metric distances among DNA haplotypes:
Application to human mitochondrial DNA restriction data. Genetics,
131, 479–491.
Fouquet, A., Gilles, A., Vences, M., Marty, C., Blanc, M., & Gemmell, N. J.
(2007). Underestimation of species richness in neotropical frogs
revealed by mtDNA analyses. PLoS ONE,2(10),
https://doi.org/10.1371/journal.pone.0001109
Frichot, E., & François, O. (2015). LEA: An R package for landscape and
ecological association studies. Methods in Ecology and Evolution,6,
925–929.
Frichot, E., Mathieu, F., Trouillon, T., Bouchard, G., & François, O. (2014).
Fast and efficient estimation of individual ancestry coefficients.
Genetics,196, 973–983.
Frost, D. R. (2015). Amphibian species of the world: An online reference.
Version 6.0. New York, NY: American Museum of Natural History.
Electronic database. Retrieved from
http://research.amnh.org/herpetology/amphibia/index.html
Frost, D. R., & Hillis, D. M. (1990). Species in concept and practice: Her-
petological applications. Herpetologica,46,86–104.
Fujita, M. K., Leaché, A. D., Burbrink, F. T., McGuire, J. A., & Moritz, C.
(2012). Coalescent-based species delimitation in an integrative taxon-
omy. Trends in Ecology and Evolution,27, 480–488.
Giam, X., Scheffers, B. R., Sodhi, N. S., Wilcove, D. S., Ceballos, G., & Ehr-
lich, P. R. (2012). Reservoirs of richness: Least disturbed tropical for-
ests are centres of undescribed species diversity. Proceedings of the
Royal Society B: Biological Sciences,279(1726), 67–76.
Grismer, L. L. (2011). Lizards of Peninsular Malaysia, Singapore and their
adjacent archipelagos. Frankfurt, Germany: Edition Chimaira.
Grismer, L. L., Anuar, S., Quah, E. S. H., Muin, M. A., Chan, K. O., Grismer,
J. L., & Ahmad, N. (2010). A new spiny, prehensile-tailed species of
Cyrtodactylus (Squamata: Gekkonidae) from Peninsular Malaysia with
a preliminary hypothesis of relationships based on morphology. Zoo-
taxa,52(2625), 40–52.
Grismer, L. L., Wood, P. L., Anuar, S., Muin, M. A., Quah, E. S. H.,
McGuire, J. A., ... Pham, H. T. (2013). Integrative taxonomy uncovers
high levels of cryptic species diversity in Hemiphyllodactylus Bleeker,
1860 (Squamata: Gekkonidae) and the description of a new species
from Peninsular Malaysia. Zoological Journal of the Linnean Society,
169(4), 849–880.
Grismer, L. L., Wood, P. L. J., Anuar, S., Riyanto, A., Ahmad, N., Muin, M.
A., ... Pauwels, O. S. G. (2014). Systematics and natural history of
Southeast Asian Rock Geckos (genus Cnemaspis Strauch, 1887) with
descriptions of eight new species from Malaysia, Thailand and
Indonesia. Zootaxa,3880(1), 1–147.
Grismer, L. L., Wood, P. L. J., Quah, E. S. H., Anuar, S., Muin, M. A.,
Sumontha, M., ... Pauwels, O. S. G. (2012). A phylogeny and taxon-
omy of the Thai-Malay Peninsula Bent-toed Geckos of the Cyrto-
dactylus pulchellus complex (Squamata: Gekkonidae): Combined
morphological and molecular analyses with descriptions of seven new
species. Zootaxa,3520,1–55.
Gutenkunst, R. N., Hernandez, R. D., Williamson, S. H., & Bustamante, C.
D. (2009). Inferring the joint demographic history of multiple popula-
tions from multidimensional SNP frequency data. PLoS Genetics,5
(10). https://doi.org/10.1371/journal.pgen.1000695
Harvey, M. G., Duffie-Judy, C., Seeholzer, G. F., Maley, J. M., Graves, G.
R., & Brumfield, R. T. (2015). Similarity thresholds used in short read
assembly reduce the comparability of population histories across spe-
cies. PeerJ,3, e895. https://doi.org/10.7717/peerj.895
Hasan, M., Islam, M. M., Khan, M. M. R., Igawa, T., Alam, M. S., Djong, H.
T., ... Sumida, M. (2014). Genetic divergences of South and South-
east Asian frogs: A case study of several taxa based on 16S riboso-
mal RNA gene data with notes on the generic name Fejervarya.
Turkish Journal of Zoology,38(4), 389–411.
Hendry, A. P., Bolnick, D. I., Berner, D., & Peichel, C. L. (2009). Along the
speciation continuum in sticklebacks. Journal of Fish Biology,75,
2000–2036.
Huang, H., & Knowles, L. L. (2016). Unforeseen consequences of exclud-
ing missing data from next-generation sequences: Simulation study of
RAD sequences. Systematic Biology,65(3), 357–365.
Ilut, D. C., Nydam, M. L., & Hare, M. P. (2014). Defining loci in restric-
tion-based reduced representation genomic data from nonmodel spe-
cies: Sources of bias and diagnostics for optimal clustering. BioMed
Research International,2014(675158). https://doi.org/10.1155/2014/675158
Jackson, N. D., Carstens, B. C., Morales, A. E., & O’Meara, B. C. (2016).
Species delimitation with gene flow. Systematic Biology, syw117.
https://doi.org/10.1093/sysbio/syw117
Jamaluddin, J. A. F., Pau, T. M., & Siti-Azizah, M. N. (2011). Genetic
structure of the Snakehead Murrel, Channa striata (Channidae) based
on the Cytochrome-c Oxidase Subunit I gene: Influence of historical
and geomorphological factors. Genetics and Molecular Biology,34,
152–160.
Jombart, T. (2008). Adegenet: A R package for the multivariate analysis
of genetic markers. Bioinformatics,24, 1403–1405.
Jombart, T., Devillard, S., & Balloux, F. (2010). Discriminant analysis of
principal components: A new method for the analysis of genetically
structured populations. BMC Genetics,11, 94.
Jost, L. (2008). GST and its relatives do not measure differentiation.
Molecular Ecology,17, 4015–4026.
Kaiser, H. F. (1960). The application of electronic computers to factor
analysis. Educational and Psychological Measurement,20, 141–151.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock,
S., ... Drummond, A. (2012). Geneious basic: An integrated and
extendable desktop software platform for the organization and analy-
sis of sequence data. Bioinformatics,28(12), 1647–1649.
Khoo, S. N., & Lubis, A. R. (2005). Kinta Valley: Pioneering Malaysia’s mod-
ern development. Penang: Areca Books.
Kim, H., & Park, H. (2007). Sparse non-negative matrix factorizations via
alternating non-negativity-constrained least squares for microarray
data analysis. Bioinformatics,23, 1495–1502.
Knowles, L. L., & Carstens, B. C. (2007). Estimating a geographically expli-
cit model of population divergence. Evolution,61, 477–493.
Larson, W. A., Seeb, L. W., Everett, M. V., Waples, R. K., Templin, W. D.,
& Seeb, J. E. (2014). Genotyping by sequencing resolves shallow pop-
ulation structure to inform conservation of Chinook salmon (Oncor-
hynchus tshawytscha). Evolutionary Applications,7(3), 355–369.
Leaché, A. D., Banbury, B. L., Felsenstein, J., de Oca, A. N., & Stamatakis,
A. (2015). Short tree, long tree, right tree, wrong tree: New acquisi-
tion bias corrections for inferring SNP phylogenies. Systematic Biol-
ogy,64, 1032–1047.
Leaché, A. D., Fujita, M. K., Minin, V. N., & Bouckaert, R. R. (2014). Species
delimitation using genome-wide SNP data. Systematic Biology,
63, 534–542.
Leavitt, S. D., Moreau, C. S., & Lumbsch, H. T. (2015). The dynamic disci-
pline of species delimitation: Progress toward effectively recognizing
species boundaries in natural populations. In D. K. Upreti, P. K. Diva-
kar, V. Shukla, & R. Bajpai (Eds.), Recent advances in lichenology: Mod-
ern methods and approaches in lichen systematics and culture
techniques, Vol. 2 (pp. 11–44). New Delhi, India: Springer.
Leliaert, F., Verbruggen, H., Vanormelingen, P., Steen, F., López-Bautista,
J. M., Zuccarello, G. C., & De Clerck, O. (2014). DNA-based species
delimitation in algae. European Journal of Phycology,49(2), 179–196.
Leslie, S., Winney, B., Hellenthal, G., Davison, D., Boumertit, A., Day, T.,
... Bodmer, W. (2015). The fine-scale genetic structure of the British
population. Nature,519(7543), 309–314.
Lewis, P. O. (2001). A likelihood approach to estimating phylogeny from
discrete morphological character data. Systematic Biology,50, 913–925.
Lleonart, J., Salat, J., & Torres, G. J. (2000). Removing allometric effects
of body size in morphological analysis. Journal of Theoretical Biology,
205,85–93.
Matsui, M., Shimada, T., Liu, W. Z., Maryati, M., Khonsue, W., & Orlov,
N. (2006). Phylogenetic relationships of Oriental torrent frogs in the
genus Amolops and its allies (Amphibia, Anura, Ranidae). Molecular
Phylogenetics and Evolution,38(3), 659–666.
Mayr, E. (1968). The role of systematics in biology. Science,159, 595–
599.
McKinnon, J. S., Mori, S., Blackman, B. K., David, L., Kingsley, D. M.,
Jamieson, L., ... Schluter, D. (2004). Evidence for ecology’s role in
speciation. Nature,429, 294–298.
McLeod, D. S. (2010). Of least concern? Systematics of a cryptic species
complex: Limnonectes kuhlii (Amphibia: Anura: Dicroglossidae). Molec-
ular Phylogenetics and Evolution,56, 991–1000.
Medrano, M., López-Perea, E., & Herrera, C. M. (2014). Population genetics
methods applied to a species delimitation problem: Endemic trumpet
daffodils (Narcissus Section Pseudonarcissi) from the Southern
Iberian Peninsula. International Journal of Plant Sciences,175, 501–517.
Meirmans, P. G. (2012). The trouble with isolation by distance. Molecular
Ecology,21(12), 2839–2846.
Meirmans, P. G., & Hedrick, P. W. (2011). Assessing population structure:
FST and related measures. Molecular Ecology Resources,11,5–18.
Meirmans, P. G., & Van Tienderen, P. H. (2004). GENOTYPE and GEN-
ODIVE: Two programs for the analysis of genetic diversity of asexual
organisms. Molecular Ecology Notes,4, 792–794.
Miettinen, J., Shi, C., & Liew, S. C. (2011). Deforestation rates in insular
Southeast Asia between 2000 and 2010. Global Change Biology,17,
2261–2270.
Minh, B. Q., Nguyen, M. A. T., & von Haeseler, A. (2013). Ultrafast
approximation for phylogenetic bootstrap. Molecular Biology and Evo-
lution,30, 1188–1195.
Mittermeier, R. A., Myers, N., Thomsen, J. B., da Fonseca, G. A. B., & Oli-
vieri, S. (1998). Biodiversity hotspots and major tropical Wilderness
areas: Approaches to setting conservation priorities. Conservation
Biology,12, 516–520.
Myers, N., Mittermeier, R. A., Mittermeier, C. G., da Fonseca, G. A. B., &
Kent, J. (2000). Biodiversity hotspots for conservation priorities. Nat-
ure,403, 853–858.
Nee, S. (2006). Birth-death models in macroevolution. Annual Review of
Ecology, Evolution, and Systematics,37,1–17.
Nguyen, L.-T., Schmidt, H. A., von Haeseler, A., & Minh, B. Q. (2014). IQ-
TREE: A fast and effective stochastic algorithm for estimating maxi-
mum likelihood phylogenies. Molecular Biology and Evolution,32, 268–
274.
Niemiller, M. L., Fitzpatrick, B. M., & Miller, B. T. (2008). Recent diver-
gence with gene flow in Tennessee cave salamanders (Plethodonti-
dae: Gyrinophilus) inferred from gene genealogies. Molecular Ecology,
17, 2258–2275.
Nosil, P. (2008). Speciation with gene flow could be common. Molecular
Ecology,17, 2006–2008.
Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y., ...
Reich, D. (2012). Ancient admixture in human history. Genetics,192
(3), 1065–1093.
Petit, R. J., & Excoffier, L. (2009). Gene flow and species delimitation.
Trends in Ecology and Evolution,24, 386–393.
Pyron, R. A., Hsieh, F. W., Lemmon, A. R., Emily, M., & Hendry, C. R.
(2016). Integrating phylogenomic and morphological data to assess
candidate species-delimitation models in brown and red-bellied snakes
(Storeria). Zoological Journal of the Linnean Society,177,937–949.
de Queiroz, K. (2005). Ernst Mayr and the modern concept of species.
Proceedings of the National Academy of Sciences of the United States
of America,102, 6600–6607.
de Queiroz, K. (2007). Species concepts and species delimitation. System-
atic Biology,56, 879–886.
Rambaut, A., Suchard, M. A., Xie, D., & Drummond, A. J. (2014). Tracer
v1.6. Retrieved from http://beast.bio.ed.ac.uk/Tracer
Rannala, B. (2015). The art and science of species delimitation. Current
Zoology,61, 846–853.
Reeves, P. A., & Richards, C. M. (2007). Distinguishing terminal
monophyletic groups from reticulate taxa: Performance of phenetic,
tree-based, and network procedures. Systematic Biology,56,
302–320.
Reeves, P. A., & Richards, C. M. (2011). Species delimitation under the gen-
eral lineage concept: An empirical example using wild North American
hops (Cannabaceae: Humulus lupulus). Systematic Biology,60,45–59.
Richardson, J. L., Brady, S. P., Wang, I. J., & Spear, S. F. (2016). Navigat-
ing the pitfalls and promise of landscape genetics. Molecular Ecology,
25, 849–863.
Ronquist, F., Teslenko, M., van der Mark, P., Ayres, D. L., Darling, A.,
Höhna, S., ... Huelsenbeck, J. P. (2012). MrBayes 3.2: Efficient Bayesian
phylogenetic inference and model choice across a large model
space. Systematic Biology,61(3), 539–542.
Rosenberg, N. A. (2007). Statistical tests for taxonomic distinctiveness
from observations of monophyly. Evolution,61, 317–323.
Rousset, F. (2008). GENEPOP’007: A complete re-implementation of the
GENEPOP software for Windows and Linux. Molecular Ecology
Resources,8, 103–106.
Rubin, B. E. R., Ree, R. H., & Moreau, C. S. (2012). Inferring phylogenies
from RAD sequence data. PLoS ONE,7,1–12.
Rundle, H. D., & Nosil, P. (2005). Ecological speciation. Ecology Letters,8,
336–352.
Schluter, D. (2009). Evidence for ecological speciation and its alternative.
Science,323, 737–741.
Sexton, J. P., Hangartner, S. B., & Hoffmann, A. A. (2014). Genetic isola-
tion by environment or distance: Which pattern of gene flow is most
common? Evolution,68,1–15.
Simpson, G. G. (1961). Principles of animal taxonomy. New York: Columbia
University Press.
Sodhi, N. S., Koh, L. P., Brook, B. W., & Ng, P. K. L. (2004). Southeast
Asian biodiversity: An impending disaster. Trends in Ecology and Evo-
lution,19, 654–660.
Solís-Lemus, C., Knowles, L. L., & Ané, C. (2015). Bayesian species delimitation
combining multiple genes and traits in a unified framework.
Evolution,69, 492–507.
Sousa, V., & Hey, J. (2013). Understanding the origin of species with gen-
ome-scale data: Modelling gene flow. Nature Reviews. Genetics,14,
404–414.
Streicher, J. W., Devitt, T. J., Goldberg, C. S., Malone, J. H., Blackmon, H.,
& Fujita, M. K. (2014). Diversification and asymmetrical gene flow
across time and space: Lineage sorting and hybridization in polytypic
barking frogs. Molecular Ecology,23(13), 3273–3291.
Stuart, B. L. (2008). The phylogenetic problem of Huia (Amphibia: Rani-
dae). Molecular Phylogenetics and Evolution,46,49–60.
Sukumaran, J., & Knowles, L. L. (2017). Multispecies coalescent delimits
structure, not species. Proceedings of the National Academy of
Sciences,114, 1607–1612.
Swofford, D. L. (2002). PAUP*. Phylogenetic Analysis Using Parsimony
(*and Other Methods). Sunderland, MA: Sinauer Associates.
Tang, C. Q., Humphreys, A. M., Fontaneto, D., & Barraclough, T. G.
(2014). Effects of phylogenetic reconstruction method on the robust-
ness of species delimitation using single-locus data. Methods in Ecol-
ogy and Evolution,5, 1086–1094.
Thorpe, R. S. (1975). Quantitative handling of characters useful in snake sys-
tematics with particular reference to intraspecific variation in the Ringed
Snake Natrix natrix. Biological Journal of the Linnean Society,7,27–43.
Thorpe, R. S. (1983). A review of the numerical methods for recognizing
and analyzing racial differentiation. In: J. Felsenstein (Ed.), Numerical
taxonomy: Proceedings of a NATO advanced studies institute NATO ASI
series (pp. 404–423). Berlin, Heidelberg: Springer Verlag.
Tobias, J. A., Seddon, N., Spottiswoode, C. N., Pilgrim, J. D., Fishpool, L.
D. C., & Collar, N. J. (2010). Quantitative criteria for species delimita-
tion. Ibis,152(4), 724–746.
Turan, C. (1999). A note on the examination of morphometric differentia-
tion among fish populations: The Truss System. Turkish Journal of
Zoology,23, 259–263.
Veach, V., Di Minin, E., Pouzols, F. M., & Moilanen, A. (2017). Species
richness as criterion for global conservation area placement leads to
large losses in coverage of biodiversity. Diversity and Distributions,23,
715–726.
Vences, M., Thomas, M., Bonett, R. M., & Vieites, D. R. (2005). Decipher-
ing amphibian diversity through DNA barcoding: Chances and chal-
lenges. Philosophical Transactions of the Royal Society of London, Series
B, Biological Sciences,360, 1859–1868.
Vences, M., Thomas, M., van der Meijden, A., Chiari, Y., & Vieites, D. R.
(2005). Comparative performance of the 16S rRNA gene in DNA bar-
coding of amphibians. Frontiers in Zoology,2,5.
Wagner, C. E., Keller, I., Wittwer, S., Selz, O. M., Mwaiko, S., Greuter, L.,
... Seehausen, O. (2013). Genome-wide RAD sequence data provide
unprecedented resolution of species boundaries and relationships in
the Lake Victoria Cichlid adaptive radiation. Molecular Ecology,22(3),
787–798.
Weir, B. S., & Cockerham, C. C. (1984). Estimating F-statistics for the
analysis of population structure. Evolution,38, 1358–1370.
Whitlock, M. C. (2011). G’ST and D do not replace FST. Molecular Ecol-
ogy,20, 1083–1091.
Wiens, J. J., & Penkrot, T. A. (2002). Delimiting species using DNA and
morphological variation and discordant species limits in spiny lizards
(Sceloporus). Systematic Biology,51,69–91.
Wilcove, D. S., Giam, X., Edwards, D. P., Fisher, B., & Koh, L. P. (2013).
Navjot’s nightmare revisited: Logging, agriculture, and biodiversity in
Southeast Asia. Trends in Ecology and Evolution,28, 531–540.
Wiley, E. O. (1978). The evolutionary species concept reconsidered. Sys-
tematic Zoology,27,17–26.
Wright, S. (1951). The genetical structure of populations. Annals of
Eugenics,15, 322–354.
Yang, Z. (2015). A tutorial of BPP for species tree estimation and species
delimitation. Current Zoology,61, 854–865.
Yang, Z., & Rannala, B. (2010). Bayesian species delimitation using multi-
locus sequence data. Proceedings of the National Academy of Sciences
of the United States of America,107, 9264–9269.
Zarza, E., Faircloth, B. C., Tsai, W. L. E., Bryson, R. W., Klicka, J., &
McCormack, J. E. (2016). Hidden histories of gene flow in highland
birds revealed with genomic markers. Molecular Ecology,25, 5144–
5157.
Zhang, J., Kapli, P., Pavlidis, P., & Stamatakis, A. (2013). A general species
delimitation method with applications to phylogenetic placements.
Bioinformatics,29(22), 2869–2876.
Zhang, C., Zhang, D., Zhu, T., & Yang, Z. (2011). Evaluation of a Bayesian
coalescent method of species delimitation. Systematic Biology,60,
747–761.
SUPPORTING INFORMATION
Additional Supporting Information may be found online in the sup-
porting information tab for this article.
How to cite this article: Chan KO, Alexander AM, Grismer
LL, et al. Species delimitation with gene flow: A
methodological comparison and population genomics
approach to elucidate cryptic species boundaries in Malaysian
Torrent Frogs. Mol Ecol. 2017;00:1–16. https://doi.org/10.1111/mec.14296