ArticlePDF Available

Transcriptome Sequences Resolve Deep Relationships of the Grape Family

PLOS
PLOS One
Authors:
  • Jishou University, Jishou, China

Abstract and Figures

Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417 orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships, showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous. The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated.
Content may be subject to copyright.
Transcriptome Sequences Resolve Deep Relationships of
the Grape Family
Jun Wen1*, Zhiqiang Xiong2, Ze-Long Nie3, Likai Mao2, Yabing Zhu2, Xian-Zhao Kan4, Stefanie M. Ickert-
Bond5, Jean Gerrath6, Elizabeth A. Zimmer1, Xiao-Dong Fang2*
1 Department of Botany, National Museum of Natural History, MRC166, Smithsonian Institution, Washington, D.C., United States of America, 2 BGI-Shenzhen,
Shenzhen, China, 3 Key Laboratory of Biodiversity and Biogeography, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, China,
4 College of Life Sciences, Anhui Normal University, Wuhu, Anhui, China, 5 UA Museum of the North Herbarium and Department of Biology and Wildlife,
University of Alaska Fairbanks, Fairbanks, Alaska, United States of America, 6 Department of Biology, University of Northern Iowa, Cedar Falls, Iowa, United
States of America
Abstract
Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus
impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein
coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417
orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its
phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships,
showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous.
The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated.
Citation: Wen J, Xiong Z, Nie Z-L, Mao L, Zhu Y, et al. (2013) Transcriptome Sequences Resolve Deep Relationships of the Grape Family. PLoS ONE
8(9): e74394. doi:10.1371/journal.pone.0074394
Editor: Hector Candela, Universidad Miguel Hernández de Elche, Spain
Received February 2, 2013; Accepted August 1, 2013; Published September 17, 2013
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by
anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Funding: This study was funded by grants from the Office of the Smithsonian Undersecretary of Science, the US National Science Foundation (grant DEB
0743474 to S.R. Manchester and J. Wen, and grant DEB 0743499 to J. Gerrath), the Small Grants Program of the National Museum of Natural History of
the Smithsonian Institution to JW, and trust funds generated by EAZ. The funders had no role in study design, data collection and analysis, decision to
publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
* E-mail: wenj@si.edu (JW); fangxd@genomics.org.cn (XDF)
Introduction
The grape family (Vitaceae) has been widely recognized for
its economic importance as the source of table grapes, wine,
and raisins. The family consists of 14 genera and ~ 900
species [1]. Many species of the family are dominant lianas in
lowland tropical forests, while species in Parthenocissus
Planchon, Ampelopsis Michx. and Vitis L. are primarily from the
temperate zone. Previous phylogenetic analyses support five
major clades within Vitaceae: (i) the Vitis Ampelocissus
clade (180 spp.), (ii) the AmpelopsisRhoicissus clade (43
spp.), (iii) the Parthenocissus -Yua clade (15 spp.), (iv) the core
Cissus clade (300 spp.), and (v) the Cayratia Tetrastigma
Cyphostemma – clade (350 spp.) [2,3]. Parthenocissus and
Yua are supported as closely related to Vitis (3, 4). However, in
spite of several recent efforts [2,3,5,6,7,8] that effectively
resolved the relationships within each of the main clades, the
deep relationships of the family remained poorly resolved.
Recently, it has been demonstrated for a number of plant and
animal lineages that uncertainty of deep relationships among
taxonomic groups hinders progress in understanding their
evolution including their temporal and spatial origins as well as
their morphological changes over time [9,10]. In particular,
biogeographic reconstructions, especially at the family level,
are a major challenge for plant biologists [2,3,11–16], even
though methods have been developed to account for
phylogenetic uncertainty in biogeographic inferences [17–19].
Transcriptome sequences, generated using high throughput
techniques, have been shown to provide a rich set of
characters to produce phylogenies in eukaryotes and are more
efficient and cost-effective than traditional PCR-based and
EST-based methods (20). Recent studies have demonstrated
the utility of transcriptome data for resolving the relationships of
mosquitoes [20], mollusks [9,21], and the large tetrapod group
consisting of turtles, birds and crocodiles [22]. For example,
even though mollusks have an excellent fossil record, deep
relationships of the phyllum have been uncertain when
molecular phylogenies used a few genes. With a transcriptome
approach, the major clades were resolved with highly
significant statistical support. Given its potential, we decided to
take a phylogenomics approach to resolve the deep
relationships of the Vitaceae. This represents the first study in
PLOS ONE | www.plosone.org 1 September 2013 | Volume 8 | Issue 9 | e74394
plants to use RNA-Seq data to reconstruct phylogenies in
flowering plants. Several previous studies employed RNA-Seq
data to explore the evolution of paleopolyploidy (e.g. [23-25];
also see 26). The 1KP collaborative project has also generated
large-scale gene sequence information for many different
species of plants (http://www.onekp.com/).
Results and Discussion
Backbone relationships of the grape family
Transcriptome (RNA-Seq) data were obtained from 14
species of the grape family and one species of its sister family
Leeaceae (Table S1, Figure S1), and augmented with publicly
available whole genome data of the domesticated grape Vitis
vinifera [27]. Each of the five major lineages of the grape family
[3] was represented in the data. We obtained about twenty
million 90 bp paired-end DNA sequence reads from non-
normalized cDNA libraries for each of the 15 species using an
Illumina HiSeq 2000, assembled the sequence reads de novo
and retained all contigs ≥ 150 bp for further analysis (Table
S2). This strategy identified 417 orthologous genes suitable for
concatenation and phylogenetic inference (Table S3, also see
Figure S2), totaling 770,922 nucleotide and 256,974 amino
acid positions. After filtering out any gene where each taxon
contained no more than 50% of the data as missing, a 229
gene data set resulted, totaling 334,317 nucleotide and
111,439 amino acid positions.
Initial maximum likelihood analysis of the nucleotide
sequences of the 417 gene matrix using PhyML [28] produced
robust support for relationships of the grape family (Figure 1; all
nodes with 100% bootstrap support values). However, to
minimize the impact of missing data, we subsequently
employed the 229 gene data set to explore various
phylogenetic inference methods. Maximum likelihood estimates
(ML [28,29]) and Bayesian inference (BI) with a phylogenetic
mixture model [30] of the 229 gene data set also supported the
topology shown in Figure 1. The maximum parsimony (MP)
analyses [31], however, placed the Cissus clade at the base,
even though the unrooted relationships within Vitaceae were
identical with all three different analytical strategies (Figure 2).
When we examined the data set closely, we noted that Cissus
is the most divergent taxon within Vitaceae. The parsimony
method has been known to be problematic under conditions of
greatly unequal branch lengths, referred to as the long-branch
attraction phenomenon [32]. Our analyses using maximum
likelihood with both PhyML (28) and RAxML [29], and Bayesian
inference [30] all yielded an identical topology of Vitaceae
(Figure 1) that showed all nodes with 100% bootstrap support
and posterior probabilities of 1.00, suggesting that all taxa of
Vitaceae were represented by sufficient data to be reliably
placed.
Thus, using the model-based analytical methods, we
produced a transcriptome phylogeny (Figure 1) that supports
the AmpelopsisRhoicissus clade as the basally diverged
clade in Vitaceae. Vitis, Ampelocissus, Pterisanthes, and
Nothocissus form a clade, which is sister to Parthenocissus.
The taxa Cissus, Cayratia, Cyphostemma and Tetrastigma
form a separate clade, with the latter three genera forming a
subclade sister to core Cissus. These four genera possess two
morphological synapomorphies: 4-merous flowers and very
well-developed thick floral discs. This backbone relationship of
Vitaceae is similar to the results of Ren et al. [3] using three
chloroplast markers, but support values were relatively low for
several major clades in that earlier study. It is of interest to
mention that the deep clades, such as the Parthenocissus-
Vitis-Ampelocissus-Nothocissus-Pterisanthus (PVANP) clade,
as well as the clade of PVANP and Cayratia, Tetrastigma,
Cyphostemma and Cissus, lack detectable morphological
synapomorphies. Morphological convergence is the most likely
reason for such a pattern at the deep level. All relationships at
the shallower level are consistent with the results of the
previous analyses of various clades of Vitaceae [4,6-8].
The biogeographic origin of the grape family has never been
explored with analytical methods. With the phylogeny of
Vitaceae unavailable at that time, in their seminal paper, Raven
and Axelrod [33] considered Vitaceae as a relatively ancient
family and proposed that it might have originated in the
Laurasian region and subsequently reached the Southern
Hemisphere subsequently. The first diverged clade, i.e., the
Ampelopsis-Rhoicissus clade, consists of ca. 43 species
disjunctly distributed over six continents (Asia, Europe, North
America, South America, Africa, and Australia), and represents
a rare example in angiosperms with such a widely disjunct
distribution in both the Northern and the Southern Hemisphere.
The Ampelopsis - Rhoicissus clade is composed of two distinct
Laurasian lineages, each disjunct between the Old and the
New World, and one Southern Hemisphere group with a
Gondwana-like intercontinental disjunction: (Africa (Australia,
and South America)). The biogeographic analyses of the 28
species sampled by Nie et al. [34] suggested that the
Ampelopsis Rhoicissus clade had an early diversification in
the Northern Hemisphere and subsequently migrated into the
Southern Hemisphere and diversified there. Our results also
support the hypothesis that the primarily North Temperate
grape genus Vitis forms a clade with the pantropical
Ampelocissus, and the tropical Asian Pterisanthes and
Nothocissus (Figure 1). This large clade of four genera
consisting of the close relatives of grapes is sister to
Parthenocissus, a North Temperate genus disjunct in eastern
Asia and eastern North America. Even though a biogeographic
analysis of the family is beyond the scope of the current paper,
the establishment of the backbone phylogeny (Figure 1) will
ultimately facilitate our inference of the family at the global
scale and help elucidate the diversification processes involving
both the temperate and tropical floristic elements. In particular,
the placement of the Ampelopsis-Rhoicissus clade as the first
diverged clade, the Northern Hemisphere taxa forming a grade,
and the Southern Hemisphere taxa (e.g., Rhoicissus) nested
within the Northern Hemisphere grade (also see [34]) are
consistent with the Northern Hemisphere origin of the family. A
detailed biogeographic analysis with a broad taxon sampling
scheme will be attempted in the near future.
Given the strong support for the Vitaceae backbone
phylogeny, we further tested its topological stability by
producing new data sets via randomly reducing the gene
number in multiples of 10, starting from the 229 gene tree. To
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 2 September 2013 | Volume 8 | Issue 9 | e74394
automatically obtain bootstrap support scores on the nodes of
large numbers of trees, we used the following strategy. For
each specified gene number N, we obtained a random set of N
genes from the 229 orthologous genes. Then we built an ML
tree based on this set and compared the topology of the tree
with that of the standard tree (Figure 2). If the topologies were
the same, the set and tree were kept, or else they were
discarded, and new sets and trees would be created and
followed by comparison of the new topology to the 229 gene
data set. The process was repeated until a tree with standard
topology was obtained. For each N, the program built 30 trees
based on 30 random sets of N genes. N was set to 10, 20, 30..
220. The average numbers of repeats for each N were
tabulated and plotted (Figure 3). With just 30 genes, all nodes
had bootstrap support (BS) of more than 95% using the
likelihood approach in PhyML; with 40 genes, all nodes had BS
of at least 99% (Figure 2; Table S4). We also examined the
average bootstrap support and the resampled nucleotide
positions in the phylogenetic analyses to show the topological
stability (Figure S3). Our results thus indicate that RNA-Seq
[35], even with non-normalized transcriptomes, offers access to
protein coding sequences very easily and quickly and
represents a data-rich, accurate, and cost-effective source of
orthologous sequences for phylogenetic inference.
Our data sets also showed that only 48 of the 229 gene trees
had exactly the topology found in Figure 1, and in fact, the
gene trees were quite diverse in topology. Nevertheless, the
concatenated gene tree had all clades strongly supported. This
result is reminiscent of the study of Rokas et al. [36], who
demonstrated that concatenation of a sufficient number of
randomly selected genes overwhelms conflicting signals
present in different genes.
Figure 1. Maximum likelihood tree of Vitaceae using nucleotide sequences of 229 genes from the 15 transcriptomes of
Vitaceae. The same topology was recovered from the 417 gene data set. Bootstrap support for all nodes was 100%, and posterior
probabilities in the Bayesian inference for all nodes were 1.00.
doi: 10.1371/journal.pone.0074394.g001
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 3 September 2013 | Volume 8 | Issue 9 | e74394
Utility of transcriptome data for phylogenetic inference
A practical disadvantage of using the transcriptome
approach is that it requires high quality RNA from fresh
material, while silica gel dried plant tissue samples and
herbarium specimens will rarely yield good RNA. In fact,
Hittinger et al. [20] have shown that large phylogenetic data
matrices can be assembled accurately from even short (50 bp
average) transcript sequences, so even non-optimal plant
material, for example, that was preserved in “RNAlater” may
eventually be used for transcriptome data generation. Our data
demonstrate that the transcriptomes can yield resolution for
previously difficult to resolve radiations, especially at the family
level in plants, in the time frame since the mid-Cretaceous
(Figure 4). This may be true, even though these are coding
sequences, since their third positions and the 5’ and 3’
untranslated regions do evolve relatively rapidly [37].
Transcriptome data can effectively lead to identification of truly
single copy transcripts and offer the conserved sequences
necessary to generate primer pairs that can be used to amplify
and sequence rapidly evolving intron regions for studies at and
below the species level, generally without cloning steps [38].
The amplifications may then be standard ones followed by
Sanger sequencing, or may be ones employed in the next-
generation sequencing approaches generally referred to as
targeted sequence capture [39,40]. Nevertheless, as the RNA-
Seq approach is still relatively costly, extensive taxon sampling
is not presently feasible. Our sampling in the grape family
emphasized the backbone relationships and represents an
example of what we can accomplish using transcriptomes and
a first step toward resolving the deep phylogenetic
relationships of Vitaceae.
Transcriptomic phylogenetic analyses do face some
challenges due to the complications associated with
pseudogenes and paralogous comparisons [38,41]. Because
Figure 2. Unrooted tree of 15 species of Vitaceae based on nucleotide sequences of 229 genes. The node numbers
correspond to those in Table S4.
doi: 10.1371/journal.pone.0074394.g002
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 4 September 2013 | Volume 8 | Issue 9 | e74394
many plant lineages may have experienced reticulate evolution
and allopolyploidy [42], it is also challenging to tease apart the
plant diversification history given the hundreds of genes
available. Our grape data set shows that transcriptome data
can be an important source for phylogenetic inference.
However, we may have been highly fortunate that the grape
family was not seriously impacted by reticulate evolution in its
early history (see Figure S4, based on the Ks value distribution
Figure 3. Average bootstrap support and the gene number in the phylogenetic analyses based on data from Table S4.
doi: 10.1371/journal.pone.0074394.g003
Figure 4. Divergence time estimation of Vitaceae clades using the program mcmctree in PAML and 4-fold degenerate
sites. The red dots correspond to calibration points.
doi: 10.1371/journal.pone.0074394.g004
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 5 September 2013 | Volume 8 | Issue 9 | e74394
of paralogs of species across the grape family), which allowed
us to recover the highly robust topology (Figure 1).
With respect to data analyses, species tree approaches [43]
may need to be explored more thoroughly and other
partitioning strategies may be applied [44]. Testing and
selecting genes with strong phylogenetic signals will be an
important next step with our data set (see [45]). New analytical
strategies will be needed to handle these large data sets, and
to deal with realistic assumptions about the complications of
molecular evolution as well as differences in nucleotide
substitution rates. A number of common computer programs
such as MrBayes [46], BEAST [47] and even PhyloBayes [44]
cannot accommodate large data sets like ours with over
300,000 aligned nucleotide positions at present. Clearly, the
systematic biology community needs to invest in the
bioinformatics front more aggressively, as large data sets now
are being generated at a rapid rate. Our study also
demonstrates that the non-parametric parsimony method [31]
may be misleading when handling genomic datasets with
hundreds of thousands of characters when the sequence
evolution is highly unequal across taxa in the study group.
Furthermore, our case study on the grape family is within the
time frame of 100 million years of evolution (Figure 4). If we
move deeper into the time scale of the tree of life, we expect
additional complications concerning homology of gene
sequences. Nevertheless, plant biologists have experienced
enormous difficulties in resolving deep relationships among
taxa in the time frame of the last 100 million years, and our
data demonstrate the power of transcriptome data over this
evolutionary time scale.
Materials and Methods
Ethics Statement
No specific permits were required for the collection of
samples as they were all grown in the greenhouse, which
complied with all relevant regulations. None of the samples
represents endangered or protected species.
RNA extraction and transcriptome sequencing
Total RNA was isolated from finely ground mixed tissue
samples of stems, leaves, tendrils and sometimes flowers of
plants growing in the Botany Department greenhouse of the
Smithsonian Institution. Voucher information for the species
used is given in Table S1. We used the Sigma Spectrum™
Plant Total RNA Kit for the extractions. The transcriptome
library construction and sequencing were performed at BGI and
followed the protocols in Peng et al. [48].
De novo assembly and transcript annotation
After we obtained raw sequencing data, we first filtered out
reads of low quality, including cases of weak signal, large
number of N’ s and PCR duplication. The reads with more than
40 bases of low quality, i.e., 71 or lower Illumina scores or with
more than 20% of N (unknown) bases were all filtered out.
Three software packages, Trinity [49], Velvet-Oases [50] and
SOAPdenovo-Trans (http://soap.genomics.org.cn/
SOAPdenovo-Trans.html) were evaluated for the initial
assembly. The genes from the grape whole genome annotation
were used as calibration to check the performance of the
programs. We also used a gene set of conservative proteins in
eukaryotes to evaluate the assemblies. After comparing the
assemblies to check for completeness, redundancy, and the
coverage of some essential or housekeeping genes in the
grape genome, we selected the software SOAPdenovo-Trans
as our assembler for its overall best performance.
After the de novo assembly of each sample with
SOAPdenovo-Trans, we filtered out highly similar transcripts
that may represent alternatively spliced transcripts. We then
aligned the remaining transcripts to the reference grape
genome in the Swiss-Prot database using BLAST [51] with the
parameters “-e 1e-5 -F F -a 5”. The transcripts that could be
aligned to reference sequences were selected and scanned to
define coding regions (CDS). Length distribution of the coding
regions extracted from the 14 transcriptomes and one
reference genome (Vitis vinifera) of Vitaceae is shown in Figure
S1.
Gene orthologs
Self-to-self BLASTP [51] was conducted for all protein
sequences with an E-value of 1E-5. We assigned a connection
(edge) between two nodes (genes) if the aligned length was
longer than 1/3 for both genes. An H-score that ranged from 0
to 100 was used to weight the edges. For genes G1 and G2, H-
score is defined as Score (G1, G2) *100 / max(Score(G1, G1),
Score(G2, G2)), where Score(A, B) is BLAST raw score of
genes A and B.
To define gene families, we used average distance for a
hierarchical clustering algorithm implemented in Hcluster_sg
(part of TreeFam) [52]. It required the minimum edge weight
(H-score) to be larger than 5 and the minimum edge density
(total number of edges / theoretical number of edges) to be
larger than 1/3. One-to-one single-copy orthologous gene
families were then selected. The length distribution of 417
ortholog gene sequences (data including 6672 sequences, the
total of 417 genes x 16 samples) is shown in Figure S2. The
grape transcriptome sequence data have been deposited in
GenBank (submission ID: Grape Transcriptome; submission
content: Transcriptome analysis of 16 grapes; Submission:
Grape Transcriptome; Created SUBMISSION: ACC =
SRA081731 subid = 14992).
Phylogenetic reconstruction
MUSCLE [53] was used to obtain multi-sequence alignments
for each orthologous gene family. All alignments were
concatenated for phylogenetic analyses using the optimality
criteria of maximum parsimony (MP), maximum likelihood (ML),
and Bayesian inference (BI), as implemented in PAUP 4.0b10
[31], PhyML 7.2.6 [28] and RAxML [29] and BayesPhylogenies
[30], respectively. For the MP analyses, we used heuristic
searches with tree-bisection-reconnection (TBR) branch
swapping, MULTREES option on, and 1000 random additions.
All characters were unordered and equally weighted, and gaps
were treated as missing data in the analyses. For the ML
analysis, the ML tree was calculated assuming a GTR + CAT
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 6 September 2013 | Volume 8 | Issue 9 | e74394
model of sequence evolution. Robustness of inference was
assessed by running 1000 fast bootstrap replicates. For the
Bayesian analysis, we employed a joint model that
accommodates both rate-heterotachy and pattern-
heterogeneity as implemented in the program
BayesPhylogenies [30]. We performed two runs of 2 million
generations, sampling every 1000 generations, using 4 chains
with the default heating scheme. After discarding the first
200,000 trees in the chain as a ‘‘burn-in’’ period, we sampled
1000 trees to ensure that successive trees in our sample were
independent.
Divergence time estimation
We used the program mcmctree in PAML [54] and 4-fold
degenerate sites to estimate divergence time. The fossil record
of Vitaceae is rich, and seed fossils can be differentiated at the
generic level [55,56]. The oldest confirmed vitaceous seed
fossil is unambiguously assigned to Ampelocissus s.l. (A.
parvisemina) and dates back to the late Paleocene in North
Dakota of North America [56]. Furthermore, Ampelocissus has
been shown not to be monophyletic, but clearly forms a clade
with Vitis, Pterisanthes, and Nothocissus [2,3]. The stem of the
Vitis-Ampelocissus-Pterisanthes-Nothocissus clade was thus
fixed at 58.5 ± 5.0 million yeas ago (Ma). For the root age of
the family Vitaceae, Nie et al. [4,34] and Zecca et al. [57] fixed
the split between Vitaceae and its sister lineage, Leea, as 85 ±
4.0 Ma based on the estimated age of 78-92 Ma by Wikström
et al. [58]. However, Magallón and Castillo [59] reported a pre-
Tertiary origin at 90.65 to 90.82 Ma for Vitaceae. The estimated
ages from Magallón and Castillo [48] and Wikström et al. [58]
are close, but the latter was criticized for using nonparametric
rate smoothing and for calibrating the tree using only a single
calibration point [50]. We herein use the estimate from
Magallón and Castillo [59] and set the normal prior distribution
of 90.7±1.0 Ma for the stem age of the family.
Supporting Information
Figure S1. Length distribution of the coding regions extracted
from the 14 transcriptomes and one reference genome (Vitis
vinifera) of Vitaceae. Sample numbers are shown in Table S1.
(TIF)
Figure S2. Length distribution of 417 ortholog gene
sequences (data include 6672 sequences, the total of 417
genes x 16 samples).
(TIF)
Figure S3. Average bootstrap support and the resampled
nucleotide positions in the phylogenetic analyses to show the
topological stability.
(TIF)
Figure S4. Ks value distributions for paralogs of Vitaceae
species and the outgroup.
(TIF)
Table S1. Vitaceae species sampled for the grape
transcriptome analyses. Voucher specimens are deposited at
the US National Herbarium (US).
(DOCX)
Table S2. Statistical information of the transcriptomes of 15
species of Vitaceae and Leeaceae.
(DOCX)
Table S3. The 1:1:1 orthlog genes selected for phylogenetic
analysis of the grape family.
(DOCX)
Table S4. Topological stability as estimated by bootstrap
support of nodes with the maximum likelihood method by
randomly reducing the gene number by 10, starting from the
229 gene data set.
(DOCX)
Acknowledgements
We thank Mike Barker and Sarah Mathews for advice and
discussions and Tim Utteridge for the photo of Pterisanthes.
Author Contributions
Conceived and designed the experiments: JW EAZ XDF.
Performed the experiments: JW ZQX ZLN LKM XZK YBZ.
Analyzed the data: ZQX ZLN LKM JW. Wrote the manuscript:
JW ZQX ZLN LKM JG XDF SMI EAZ.
References
1. Wen J (2007) Vitaceae in K Kubitzki, The families and genera of
vascular plants, vol. 9. Berlin: Springer-Verlag. pp 466–478.
2. Wen J, Nie Z-L, Soejima A, Meng Y (2007) Phylogeny of Vitaceae
based on the nuclear GAI1 gene sequences. Can J Bot 85: 731–745.
doi:10.1139/B07-071.
3. Ren H, Lu L-M, Soejima A, Luke Q, Zhang D-X et al. (2011)
Phylogenetic analysis of the grape family (Vitaceae) based on the
noncoding plastid trnC-petN, trnH-psbA, and trnL-F sequences. Taxon
60: 629–637.
4. Nie ZL, Sun H, Chen ZD, Meng Y, Manchester SR et al. (2010)
Molecular phylogeny and biogeographic diversification of
Parthenocissus (Vitaceae) disjunct between Asia and North America.
Am J Bot 97: 1342–1353. doi:10.3732/ajb.1000085. PubMed:
21616887.
5. Soejima A, Wen J (2006) Phylogenetic analysis of the grape family
(Vitaceae) based on three chloroplast markers. Am J Bot 93: 278–287.
doi:10.3732/ajb.93.2.278. PubMed: 21646189.
6. Lu L, Wen J, Chen Z (2012) A combined morphological and molecular
phylogenetic analysis of Parthenocissus (Vitaceae) and taxonomic
implications. Bot J Linn Soc, 168: 43–63. doi:10.1111/j.
1095-8339.2011.01186.x.
7. Lu L, Wang W, Chen Z, Wen J (2013) Phylogeny of the non-
monophyletic Cayratia Juss. (Vitaceae) and implications for character
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 7 September 2013 | Volume 8 | Issue 9 | e74394
evolution and biogeography. Mol Phylogenet Evol 68: 502–515. doi:
10.1016/j.ympev.2013.04.023. PubMed: 23669013.
8. Liu XQ, Ickert-Bond SM, Chen LQ, Wen J (2013) Molecular phylogeny
of Cissus L. of Vitaceae (the grape family) and evolution of its
pantropical intercontinental disjunctions. Mol Phylogenet Evol 66(1):
43–53. doi:10.1016/j.ympev.2012.09.003. PubMed: 23000818.
9. Kocot KM, Cannon JT, Todt C, Citarella MR, Kohn AB et al. (2011)
Phylogenomics reveals deep molluscan relationships. Nature 477:
452–456. doi:10.1038/nature10382. PubMed: 21892190.
10. Bell CD, Kutschker A, Arroyo MTK (2012) Phylogeny and diversification
of Valerianaceae (Dipsacales) in the southern Andes. Mol Phylogenet
Evol 63: 724–737. doi:10.1016/j.ympev.2012.02.015. PubMed:
22421085.
11. Wen J, Plunkett GM, Mitchell A, Wagstaff S (2001) Evolution of
Araliaceae: a phylogenetic analysis based on the ITS sequences of
nrDNA. Syst Bot 26: 144–167.
12. Plunkett GM, Wen J, Lowry PP (2004) Infrafamilial relationships in
Araliaceae: Insights from nuclear (ITS) and plastid (trnL-trnF) sequence
data. Plant Syst Evol 245: 1–39.
13. Gernandt DS, Magallón S, Lopez GG, Flores OZ, Willyard A et al.
(2008) Use of simultaneous analyses to guide fossil-based calibrations
of Pinaceae phylogeny. Int J Plant Sci 169: 1086–1099. doi:
10.1086/590472.
14. Bremer B, Eriksson T (2009) Time tree of Rubiaceae: Phylogeny and
dating the family, subfamilies and tribes. Int J Plant Sci 170: 766–793.
doi:10.1086/599077.
15. Franzke A, German D, Al-Shehbaz IA, Mummenhoff K (2009)
Arabidopsis family ties: molecular phylogeny and age estimates in
Brassicaceae. Taxon 58: 425–437.
16. Olmstead RG, Zjhra ML, Lohmann LG, Grose SO, Eckert AJ (2009) A
molecular phylogeny and classification of Bignoniaceae. Am J Bot 96:
1731–1743. doi:10.3732/ajb.0900004. PubMed: 21622359.
17. Nylander JAA, Olsson U, Alström P, Sanmartín I (2008) Accounting for
phylogenetic uncertainty in biogeography: A Bayesian approach to
dispersal-vicariance analysis of the thrushes (Aves: Turdus). Syst Biol
57: 257–268. doi:10.1080/10635150802044003. PubMed: 18425716.
18. Ree RH, Smith SA (2008) Maximum likelihood inference of geographic
range evolution by dispersal, local extinction, and cladogenesis. Syst
Biol 57: 4–14. doi:10.1080/10635150701883881. PubMed: 18253896.
19. Yu Y, Harris AJ, He X (2010) S-DIVA (Statistical Dispersal-Vicariance
Analysis): A tool for inferring biogeographic histories. Mol Phylogenet
Evol 56: 848–850. doi:10.1016/j.ympev.2010.04.011. PubMed:
20399277.
20. Hittinger CT, Johnston M, Tossberg JT, Rokas A (2010) Leveraging
skewed transcript abundance by RNA-Seq to increase the genomic
depth of the tree of life. Proc Natl Acad Sci U S A 107: 1476–1481. doi:
10.1073/pnas.0910449107. PubMed: 20080632.
21. Smith SA, Wilson NG, Goetz FE, Feehery C, Andrade SCS et al.
(2011) Resolving the evolutionary relationships of molluscs with
phylogenomic tools. Nature 480: 364–367. doi:10.1038/nature10526.
PubMed: 22031330.
22. Chiari Y, Cahais V, Galtier N, Delsuc F (2012) Phylogenomic analyses
support the position of turtles as the sister group of birds and crocodiles
(Archosauria). BMC Biol 10: 65. doi:10.1186/1741-7007-10-65.
PubMed: 22839781.
23. Barker MS, Kane NC, Matvienko M, Kozik A, Michelmore RW et al.
(2008) Multiple paleopolyploidizations during the evolution of the
Compositae reveal parallel patterns of duplicate gene retention after
millions of years. Mol Biol Evol 25: 2445–2455. doi:10.1093/molbev/
msn187. PubMed: 18728074.
24. Barker MS, Vogel H, Schranz ME (2009) Paleopolyploidy in the
Brassicales: Analyses of the Cleome transcriptome elucidate the
history of genome duplications in Arabidopsis and other Brassicales.
Genome Biol Evolution 1(1): 391–399. PubMed: 20333207.
25. McKain MR, Wickett N, Zhang Y, Ayyampalayam S, McCombie WR et
al. (2012) Phylogenomic analysis of transcriptome data elucidates co-
occurrence of a paleopolyploid event and the origin of bimodal
karyotypes in Agavoideae (Asparagaceae). Am J Bot 99: 397–406. doi:
10.3732/ajb.1100537. PubMed: 22301890.
26. Cronn R, Knaus BJ, Liston A, Maughan PJ, Parks M et al. (2012)
Targeted enrichment strategies for next-generation plant biology. Am J
Bot 99: 291–311. doi:10.3732/ajb.1100356. PubMed: 22312117.
27. Jaillon O, Aury JM, Noel B, Policriti A, Clepet C et al. (2007) The
grapevine genome sequence suggests ancestral hexaploidization in
major angiosperm phyla. Nature 449: 463–467. doi:10.1038/
nature06148. PubMed: 17721507.
28. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W et al.
(2010) New algorithms and methods to estimate maximum-likelihood
phylogenies: assessing the performance of PhyML. Syst Biol 3.0 59:
307–321.
29. Stamatakis A (2006) RAxML-VI-HPC: Maximum likelihood-based
phylogenetic analyses with thousands of taxa and mixed models.
Bioinformatics 22: 2688–2690. doi:10.1093/bioinformatics/btl446.
PubMed: 16928733.
30. Pagel M, Meade A (2004) A phylogenetic mixture model for detecting
pattern-heterogeneity in gene sequence or character-state data. Syst
Biol 53: 571–581. doi:10.1080/10635150490468675. PubMed:
15371247.
31. Swofford DL (2003) PAUP*. Phylogenetic Analysis Using Parsimony (*
and Other Methods), version 4. Sunderland, MA: Sinauer Associates.
32. Felsenstein J (1978) Cases in which parsimony or compatibility
methods will be positively misleading. Syst Zool 27: 401-410. doi:
10.2307/2412923.
33. Raven PH, Axelrod DI (1974) Angiosperm biogeography and past
continental movements. Ann Mo Bot Gard 61: 539–673. doi:
10.2307/2395021.
34. Nie ZL, Sun H, Manchester SR, Meng Y, Luke Q et al. (2012) Evolution
of the intercontinental disjunctions in six continents in the Ampelopsis
clade of the grape family (Vitaceae). BMC Evol Biol 12: 17. doi:
10.1186/1471-2148-12-17.
35. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: A revolutionary tool
for transcriptomics. Nat Rev Genet 10: 57–63. doi:10.1038/nrg2484.
PubMed: 19015660.
36. Rokas A, Williams BL, King N, Carroll SB (2003) Genome-scale
approaches to resolving incongruence in molecular phylogenies.
Nature 425: 798–804. doi:10.1038/nature02053. PubMed: 14574403.
37. Whittall JB, Medina-Marino A, Zimmer EA, Hodges SA (2006)
Generating single-copy nuclear gene data in a recent adaptive
radiation. Mol Phylogenet Evol 39: 124–134. doi:10.1016/j.ympev.
2005.10.010. PubMed: 16314114.
38. Zimmer EA, Wen J (2012) Using nuclear gene data for plant
phylogenetics: progress and prospects. Mol Phylogenet Evol 65:
774-785. doi:10.1016/j.ympev.2012.07.015. PubMed: 22842093.
39. Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT
et al. (2012) Ultraconserved elements anchor thousands of genetic
markers for target enrichment spanning multiple evolutionary
timescales. Syst Biol 61: 717-726. doi:10.1093/sysbio/sys004. PubMed:
22232343.
40. Grover CE, Salmon A, Wendel JF (2012) Targeted sequence capture
as a powerful tool for evolutionary analysis. Am J Bot 99: 312–319. doi:
10.3732/ajb.1100323. PubMed: 22268225.
41. Franssen SU, Shrestha RP, Bräutigam A, Bornberg-Bauer E, Weber
APM (2011) Comprehensive transcriptome analysis of the highly
complex Pisum sativum genome using next generation sequencing.
BMC Genomics 12: 227. doi:10.1186/1471-2164-12-227. PubMed:
21569327.
42. Soltis DE, Albert VA, Leebens-Mack J, Bell CD, Paterson AH et al.
(2009) Polyploidy and angiosperm diversification. Am J Bot 96: 336–
348. doi:10.3732/ajb.0800079. PubMed: 21628192.
43. Liu L, Yu L, Edwards SV (2010) A maximum pseudo-likelihood
approach for estimating species trees under the coalescent model.
BMC Evol Biol 10: 302. doi:10.1186/1471-2148-10-302. PubMed:
20937096.
44. Lartillot N, Blanquart S, Lepage T (2012) PhyloBayes. p. 3.3, a
Bayesian software for phylogenetic reconstruction and molecular dating
using mixture models. Available: www.phylobayes.org. Accessed 2013
January 15.
45. Salichos L, Rokas A (2013) Inferring ancient divergences requires
genes with strong phylogenetic signals. Nature 497: 327-333. doi:
10.1038/nature12130. PubMed: 23657258.
46. Huelsenbeck JP, Ronquist R (2001) MRBAYES: Bayesian inference of
phylogenetic trees. Bioinformatics 17: 754-755. doi:10.1093/
bioinformatics/17.8.754. PubMed: 11524383.
47. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary
analysis by sampling trees. BMC Evol Biol 7: 214. doi:
10.1186/1471-2148-7-214. PubMed: 17996036.
48. Peng Z, Cheng Y, Tan BC, Kang L, Tian Z et al. (2012) Comprehensive
analysis of RNA-seq data reveals extensive RNA editing in a human
transcriptome. Nat Biotechnol 30: 253–260. doi:10.1038/nbt.2122.
PubMed: 22327324.
49. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA et al.
(2011) Full-length transcriptome assembly from RNA-seq data without
a reference genome. Nat Biotechnol 29: 644–652. doi:10.1038/nbt.
1883. PubMed: 21572440.
50. Schulz MH, Zerbino DR, Vingron M, Birney E (2012) Oases: Robust de
novo RNA-seq assembly across the dynamic range of expression
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 8 September 2013 | Volume 8 | Issue 9 | e74394
levels. Bioinformatics 28: 1086–1092. doi:10.1093/bioinformatics/
bts094. PubMed: 22368243.
51. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z et al. (1997)
Gapped BLAST and PSI-BLAST: A new generation of protein database
search programs. Nucleic Acids Res 25: 3389–3402. doi:10.1093/nar/
25.17.3389. PubMed: 9254694.
52. Li H, Coghlan A, Ruan J, Coin LJ, Hériché JK et al. (2006) TreeFam: A
curated database of phylogenetic trees of animal gene families. Nucleic
Acids Res 34: D572–D580. doi:10.1093/nar/gkj118. PubMed:
16381935.
53. Edgar RC (2004) MUSCLE: multiple sequence alignment with high
accuracy and high throughput. Nucleic Acids Res 32: 1792–1797. doi:
10.1093/nar/gkh340. PubMed: 15034147.
54. Yang Z (2007) PAML 4: Phylogenetic analysis by maximum likelihood.
Mol Biol Evol 24: 1586–1591. doi:10.1093/molbev/msm088. PubMed:
17483113.
55. Chen I, Manchester SR (2007) Seed morphology of modern and fossil
Ampelocissus (Vitaceae) and implications for phytogeography. Am J
Bot 94: 1534–1553. doi:10.3732/ajb.94.9.1534. PubMed: 21636520.
56. Chen I, Manchester SR (2011) Seed morphology of Vitaceae. Int J
Plant Sci 172: 1–35. doi:10.1086/657283.
57. Zecca G, Abbott JR, Sun WB, Spada A, Sala F et al. (2012) The timing
and the mode of evolution of wild grapes (Vitis). Mol Phylogenet Evol
62: 736–747. doi:10.1016/j.ympev.2011.11.015. PubMed: 22138159.
58. Wikström N, Savolainen V, Chase MW (2001) Evolution of the
angiosperms: Calibrating the family tree. Proc R Soc Lond B 268:
2211–2220. doi:10.1098/rspb.2001.1782. PubMed: 11674868.
59. Magallón SA, Castillo A (2009) Angiosperm diversification through time.
Am J Bot 96: 349–365. doi:10.3732/ajb.0800060. PubMed: 21628193.
Transcriptome Phylogeny of Grape Plant Family
PLOS ONE | www.plosone.org 9 September 2013 | Volume 8 | Issue 9 | e74394
... In the last decade, phylogenomics has revolutionized plant systematics Wen et al. 2013b;Wickett et al. 2014;Xiang et al. 2017;Baker et al. 2022). In the genomic era, phylogenies can be estimated from hundreds to thousands of genes with relative computational ease depending on the phylogenomic approach used . ...
... Another major sequencing approach for plant phylogenomics is transcriptome sequencing (Wen et al. 2013b(Wen et al. , 2015aWickett et al. 2014;Landis et al. 2017;Zeng et al. 2017; One Thousand Plant Transcriptomes Initiative 2019). Transcriptomes provide a rich source of data for not only phylogenomics, but also studies of ancient polyploidy and gene expression (Yang et al. 2018;Pease et al. 2022). ...
Article
Full-text available
Systematics has experienced major recent developments in biodiversity informatics and genomics. Digitization efforts and developments in biodiversity informatics are allowing the systematics community to explore ways to discover new species and new sets of characters for taxonomy via machine learning. Large-scale genomic information has been increasingly used to construct the tree of life, improving taxonomic classification and species delimitations, disentangling complex (often network-like) evolutionary histories across deep and shallow levels, and discovering new species. In the informatics and genomics era, systematics has an incredible capacity to integrate with computational and exploratory platforms for biodiversity discovery and synthesis. Integrative systematics requires the training of the next-generation systematists with integrative taxonomic, phylogenomic, informatic, and ecological skills to address diverse questions about the assembly of biodiversity while continuing to discover and describe biodiversity before it is too late. The systematics community needs to encourage grassroots efforts in major biodiversity hotspots of the world.
... Vitis species are mainly restricted to temperate regions in the Northern Hemisphere (Zecca et al., 2012;Ma et al., 2016;Wen et al., 2018a). This genus has been classified into two subgenera, Muscadinia and Vitis, which have been supported by morphological, anatomical, cytological and molecular evidence (Wen, 2007;Wen et al., 2013;Klein et al., 2017Klein et al., , 2018Ma et al., 2018). Nonetheless, artificial hybrids made between these subgenera have been utilized in the wine grape industry for disease resistance (Bouquet, 1980;Olmo, 1986). ...
... A new set of target enrichment baits was designed based on nuclear genes from 15 transcriptomes representing all the main clades in Vitaceae (Wen et al., 2013) in conjunction with the complete Vitis vinifera genome (Jaillon et al., 2007). Putatively loci were identified with MarkerMiner 1.0, which uses reciprocal BLAST searches of input transcriptome sequences and curated lists of putatively single-copy loci from reference genomes (De Smet et al., 2013;Chamala et al., 2015). ...
Article
A set of newly designed Vitaceae baits targeting 1013 genes was employed to explore phylogenetic relationships among North American Vitis. Eurasian Vitis taxa including Vitis vinifera were found to be nested within North American Vitis subgenus Vitis. North American Vitis subgenus Vitis can be placed into nine main groups: the Monticola group, the Occidentales group, the Californica group, the Vinifera group (introduced from Eurasia), the Mustangensis group, the Palmata group, the Aestivalis group, the Labrusca group, and the Cinerea group. Strong cytonuclear discordances were detected in North American Vitis, with many species non-monophyletic in the plastid phylogeny, while monophyletic in the nuclear phylogeny. The phylogenomic analyses support recognizing four distinct species in the Vitis cinerea complex in North America: V. cinerea, V. baileyana, V. berlandieri, and V. simpsonii. Such treatment will better serve the conservation of wild Vitis diversity in North America. Yet the evolutionary history of Vitis is highly complex, with the concordance analyses indicating conflicting signals across the phylogeny. Cytonuclear discordances and Analyses using the Species Networks applying Quartets (SNaQ) method support extensive hybridizations in North American Vitis. The results further indicate that plastid genomes alone are insufficient for resolving the evolutionary history of plant groups that have undergone rampant hybridization, like the case in North American Vitis. Nuclear gene data are essential for species delimitation, identification and reconstructing evolutionary relationships; therefore, they are imperative for plant phylogenomic studies.
... We hope that our amended descriptions of the genera, summarizing their morphological variability, will be helpful in updating the taxonomy of these groups as more representatives are incorporated in future molecular studies. We anticipate that the use of new molecular techniques involving the sequencing of increasingly large regions of DNA, such as whole plastomes (e.g., Orton et al., 2021) and expanded nuclear gene sequencing, particularly from herbarium specimens (e.g., Wen et al., 2013;Zhang et al., 2022;Zuntini et al., 2024), will verify the phylogenetic relationships within koelerioid clades. In addition, a more inclusive sample of species is likely to change the topology of the trees in our study, and subsequently, affect our phylogenetic inferences. ...
Article
Full-text available
Koelerioid grasses (subtribe Aveninae, tribe Poeae; Pooideae) resolve into two major clades, here called Koelerioid Clades A and B. Phylogenetic relationships among koelerioid grasses are investigated using plastid DNA sequences of rpl32-trnL, rps16-trnK, rps16 intron, and ITS regions, focusing on Trisetum, Acrospelion, and some annual species (Rostraria p.p. and Trisetaria p.p.) closely related to Trisetum flavescens in Koelerioid Clade A. Phylogenetic analyses of several selected data sets performed for 80 taxa and using maximum likelihood and Bayesian methods, revealed mostly congruent topologies in the nuclear and plastid trees, but also reticulation affecting several lineages. Trisetum is restricted to one species, T. flavescens, which is a sister to the clade formed by Trisetum gracile and Trisetaria aurea. The latter two species are classified here in the genus Graciliotrisetum gen. nov. The sister clade includes three species of Rostraria and Trisetaria lapalmae, all of which are classified here in a resurrected genus, Aegialina, which includes four species. Acrospelion is enlarged to include 13 species after the addition of other species formerly classified in Trisetum sect. Trisetum and T. sect. Acrospelion. We also transfer Trisetum ambiguum, Trisetum longiglume, and Koeleria mendocinensis to Graphephorum; and Helictotrichon delavayi to Tzveleviochloa, expanding these genera to eight and six species, respectively. We evaluate cases of reticulate evolution between Koelerioid Clades A and B and within Koelerioid Clade A, which probably gave rise to Graphephorum, Rostraria cristata, and Rostraria obtusiflora. Finally, we comment on polyploidy and biogeographic patterns in koelerioid grasses. We propose 26 new combinations. Lectotypes are designated for 14 names and a neotype for Alopecurus litoreus.
... Certain body plan organizations appear to have inbuilt mechanical constraints which may have profound effects on the subsequent evolution of growth forms. Vitis is inferred to have originated in the New World during the late Eocene (39.4 Ma), then migrated to Eurasia in the late Eocene (37.3 Ma) while the stem age of the Vitaceae family was proposed to be around 90.7 Ma (Wen et al. 2013). Ranunculaceae became differentiated in forests and buttercup crown group was estimated around 109 Ma (Wang et al. 2016) and Clematis (Clematidaceae) around 29 Ma (He et al. 2022). ...
Article
Full-text available
This paper presents a biomechanical study of stems of two liana species, Clematis vitalba and Vitis vinifera, investigates the mechanical performance of these two liana species and attempts to enhance the understanding of structure–function relationships. The investigation involved mechanical testing of whole plant stems, supplemented by X-ray micro-CT (X-ray computed tomography at micron voxel size) imaging and 2D microscopic images of stained cross sections of the plant stems, to derive structure–function relationships with potential for application in bioinspired composite materials. The micro-CT images were compared to the microscopic images of stained cross sections, in order to show benefits and potential drawbacks of the X-ray micro-CT method with respect to traditional methods. The high-resolution 3D imaging capacity of micro-CT is exploited to explain the structural functionality derived from the mechanical testing. A simple finite element model is developed based on the plant topology derived by the micro-CT images and proved accurate enough to model the plant’s mechanical behaviour and assess the influence of their structural differences. The two plants exhibit different to each other physical and mechanical properties (density, strength and stiffness) due to their common growth form. Anatomical cross-sectional observation and X-ray micro-CT provide complementary information. The first method allows the identification of the lignified parts, supposedly more resistant mechanically, of these structures, while the second one provides a full 3D model of the structure, admittedly less detailed but providing the spatial distribution of density contrasts supposed to be important in the mechanical properties of the plant. The proposed methodological approach opens new perspectives to better understand the mechanical behaviour of the complex structure of plants and to draw inspiration from it in structural engineering.
... Phylogenomic analyses based on transcriptomic data have been applied to many groups with conflicting phylogenetic relationships, such as the Diptera (Wen et al., 2013;Kang et al., 2017;Gillung et al., 2018;Kutty et al., 2018Kutty et al., , 2019Pauli et al., 2018). The systematics of the family Tachinidae is particularly challenging because of the high number of species and few phylogenetic studies dealing with subfamily relationships (Cerretti et al., 2014;Stireman et al., 2019). ...
Article
Tachinidae is the second most species-rich family of Diptera. It comprises four subfamilies, and all of its members have para-sitoid habits. We present the first phylogenomic analysis of Tachinidae using transcriptomic data, based on 30 species. We constructed four datasets: three using translated data at the amino acid level (100% coverage, with 106 single-copy protein-coding genes; 75% coverage, with 1359 genes; and 50% coverage, with 1942 genes). The trees were estimated by analysing four matrices using maximum likelihood and maximum parsimony inferences, and only minor differences were found among them. Overall, our topologies are well resolved, with high node support. Polleniidae is corroborated as a sister group to Tachinidae. Within Tachinidae, our results confirm the hypothesis (Phasiinae + Dexiinae) + (Tachininae + Exoristinae). Phasiinae, Dexiinae and Exoristinae are recovered as monophyletic, and Tachininae as polyphyletic. Once again, the tribe Myiophasiini (Tachininae) composes a fifth lineage, clade sister to all the remaining Tachinidae. The Neotropical tribe Iceliini, formerly in Tachininae, is recovered within Exoristinae, sister to Winthemiini. In general, our results are congruent with recent phylogenetic studies that include tachinids, with the important confirmation of the subfamilial relationships and the existence of a fifth lineage of Tachinidae.
... For non-monophyletic species, one representative was selected for each clade of this species. As the missing rates of gene sequences extracted from the WGS data and transcriptome data vary, we used the 229 single-copy nuclear genes identified from transcriptomes covering the phylogeny of Vitaceae in Wen et al. [123] as targets for HybPiper pipeline to extract nuclear gene sequences of 76 species representatives. Considering the great computation demand in divergence time estimation, to reduce the scale of the dataset, we used Python package SortaDate [124] to retain clock-like genes having discernible information content and the least topological conflicts with the MSC species tree. ...
Article
Full-text available
Background Explaining contrasting patterns of distribution between related species is crucial for understanding the dynamics of biodiversity. Despite instances where hybridization and whole genome duplication (WGD) can yield detrimental outcomes, a role in facilitating the expansion of distribution range has been proposed. The Vitaceae genus Causonis exhibits great variations in species’ distribution ranges, with most species in the derived lineages having a much wider range than those in the early-diverged lineages. Hybridization and WGD events have been suggested to occur in Causonis based on evidence of phylogenetic discordance. The genus, therefore, provides us with an opportunity to for explore different hybridization and polyploidization modes in lineages with contrasting species’ distribution ranges. However, the evolutionary history of Causonis incorporating potential hybridization and WGD events remains to be explored. Results With plastid and nuclear data from dense sampling, this study resolved the phylogenetic relationships within Causonis and revealed significant cyto-nuclear discordance. Nuclear gene tree conflicts were detected across the genus, especially in the japonica-corniculata clade, which were mainly attributed to gene flow. This study also inferred the allopolyploid origin of the core Causonis species, which promoted the accumulation of stress-related genes. Causonis was estimated to have originated in continental Asia in the early Eocene, and experienced glaciation in the early Oligocene, shortly after the divergence of the early-divergent lineages. The japonica-corniculata clade mainly diversified in the Miocene, followed by temperature declines that may have facilitated secondary contact. Species distribution modeling based on current climate change predicted that the widespread C. japonica tends to be more invasive, while the endemic C. ciliifera may be at risk of extinction. Conclusions This study presents Causonis, a genus with complex reticulate evolutionary history, as a model of how hybridization and WGD modes differ in lineages of contrasting species’ geographic ranges. It is important to consider specific evolutionary histories and genetic properties of the focal species within conservation strategies. Keywords Distribution range, Causonis, Hybridization, Whole genome duplication, Allopolyploidization, Reticulate evolution
... Traditionally, phylogenetic analyses use fragments of certain plastid and/or nuclear genes to reconstruct phylogenies, but these trees often reflect the history of a subset of genes rather than the history of the species (Zou and Ge, 2008). Phylogenomic approaches pave a promising path for utilizing genomic data to clarify relationships between species, infer rapid radiations and hybridization events in diverse lineages, and reconstruct the evolutionary history of organisms (Wen et al., 2013Wickett et al., 2014). For example, Deep Genome Skimming (DGS, Liu et al., 2021) has often been used to target the high-copy fractions of genomes (Straub et al., 2012;Zimmer and Wen, 2015), including plastomes, mitochondrial genomes (mitogenomes), and nuclear ribosomal DNA (nrDNA) repeats, as well as single-copy nuclear genes (SCNs). ...
Article
Phylogenetic studies in the phylogenomics era have demonstrated that reticulate evolution greatly impedes the accuracy of phylogenetic inference, and consequently can obscure taxonomic treatments. However, the systematics community lacks a broadly applicable strategy for taxonomic delimitation in groups characterized by pervasive reticulate evolution. The red-fruit genus, Stranvaesia, provides an ideal model to examine the influence of reticulation on generic circumscription, particularly where hybridization and allopolyploidy dominate the evolutionary history. In this study, we conducted phylogenomic analyses integrating data from hundreds of single-copy nuclear (SCN) genes and plastomes, and interrogated nuclear paralogs to clarify the inter/intra-generic relationship of Stranvaesia and its allies in the framework of Maleae. Analyses of phylogenomic discord and phylogenetic networks showed that allopolyploidization and introgression promoted the origin and diversification of the Stranvaesia clade, a conclusion further bolstered by cytonuclear and gene tree discordance. With a well-inferred phylogenetic backbone, we propose an updated generic delimitation of Stranvaesia and introduce a new genus, Weniomeles. This new genus is distinguished by its purple-black fruits, thorns trunk and/or branches, and a distinctive fruit core anatomy characterized by multilocular separated by a layer of sclereids and a cluster of sclereids at the top of the locules. Through this study, we highlight a broadly-applicable workflow that underscores the significance of reticulate evolution analyses in shaping taxonomic revisions from phylogenomic data.
Article
Background and Aims Nekemias is a small genus of the grape family, with nine species discontinuously distributed in temperate to subtropical zones of the Northern Hemisphere but mostly in East Asia. Previous phylogenetic studies on Nekemias have mainly based on a few chloroplast markers, and the phylogenetic framework and systematic relationships are still highly contested. Methods We carried out a systematic framework reconstruction of Nekemias and intra-generic reticulate evolutionary analyses based on extensive single-copy nuclear and chloroplast genomic data obtained by the Hyb-Seq approach, combining genome skimming and target enrichment. Key Results Both nuclear and chloroplast genomic data strongly support the monophyly of Nekemias with its division into two major lineages from East Asia and North America, respectively. There are strong and extensive topological conflicts among nuclear gene trees and between nuclear and chloroplast topologies within the genus, especially within the East Asian clade. Conclusions Rapid radiation through predominant incomplete lineage sorting (ILS) throughout the evolutionary history of the East Asian taxa is supported to explain the relatively high species diversity of Nekemias in East Asia. This study highlights the important role of short periods of rapid evolutionary radiations accompanied by ILS as a mechanism for the complex and fast species diversifications in the grape family as well as potentially in many other plant lineages in East Asia and beyond.
Article
Full-text available
We describe a new species of Ampelopsideae (Vitaceae), Nekemias mucronata sp. nov., from the Rupelian of Cervera (Spain) and revise another fossil species, Ampelopsis hibschii, originally described from Germany. Comparison with extant Ampelopsideae suggests that the North American species Nekemias arborea is most similar to Nekemias mucronata while the East Mediterranean Ampelopsis orientalis is the closest living relative of A. hibschii. Our review of fossil data indicates that, during the Eocene, four species of Ampelopsideae occurred in Eurasia, that is, N. mucronata in the Czech Republic, A. hibschii in Kazakhstan, and two fossil species in the Far East ( Ampelopsis cercidifolia and Ampelopsis protoheterophylla ). In the Oligocene, a new species, Ampelopsis schischkinii , appeared in Kazakhstan; meanwhile, N. mucronata spread eastwards and southwards, and A. hibschii mainly grew in Central Europe. In the late Oligocene, N. mucronata became a relict in the Iberian Peninsula and Nekemias might have persisted in Western Eurasia until the latest Miocene (“ Ampelopsis ” abkhasica ). The last occurrence of A. hibschii was in the Middle Miocene in Bulgaria, probably a refuge of humid temperate taxa, along with Ampelopsis aff. cordata . Carpological remains suggest that this lineage persisted in Europe at least until the Pleistocene. Our data confirm previous notions of the North Atlantic and Bering land bridges being important dispersal routes for Ampelopsideae. However, such dispersion probably occurred during the Paleogene rather than the Neogene, as previously suggested. A single species of Ampelopsideae, A. orientalis , has survived in Western Eurasia, which appears to have been linked to a biome shift.
Article
Full-text available
Selaginellaceae, originated in the Carboniferous and survived the Permian–Triassic mass extinction, is the largest family of lycophyte, which is sister to other tracheophytes. It stands out from tracheophytes by exhibiting extraordinary habitat diversity and lacking polyploidization. The organelle genome-based phylogenies confirmed the monophyly of Selaginella, with six or seven subgenera grouped into two superclades, but the phylogenetic positions of the enigmatic Selaginella sanguinolenta clade remained problematic. Here, we conducted a phylogenomic study on Selaginellaceae utilizing large-scale nuclear gene data from RNA-seq to elucidate the phylogeny and explore the causes of the phylogenetic incongruence of the S. sanguinolenta clade. Our phylogenetic analyses resolved three different positions of the S. sanguinolenta clade, which were supported by the sorted three nuclear gene sets, respectively. The results from the gene flow test, species network inference, and plastome-based phylogeny congruently suggested a probable hybrid origin of the S. sanguinolenta clade involving each common ancestor of the two superclades in Selaginellaceae. The hybrid hypothesis is corroborated by the evidence from rhizophore morphology and spore micromorphology. The chromosome observation and Ks distributions further suggested hybridization accompanied by polyploidization. Divergence time estimation based on independent datasets from nuclear gene sets and plastid genome data congruently inferred that allopolyploidization occurred in the Early Triassic. To our best knowledge, the allopolyploidization in the Mesozoic reported here represents the earliest record of tracheophytes. Our study revealed a unique triad of phylogenetic positions for a hybrid-originated group with comprehensive evidence and proposed a hypothesis for retaining both parental alleles through gene conversion.
Article
Full-text available
Seeds of Vitaceae can be easily recognized by their unique features, a pair of ventral infolds and a dorsal chalaza knot, but the inter- and intrageneric morphological variation in seed morphology has not been explored in detail. To facilitate identification of genera based on seed morphology, a study of 252 extant seeds representing all 15 genera of Vitaceae was conducted with attention to morphological characters and seed coat anatomy. Principal-component analyses were employed to assess the variation of seed morphology. Shape and position of ventral infolds and chalaza, shape of ventral-infold cavities, and testa anatomy are characters that can generically differentiate vitaceous seeds. Typically, seeds of Leea, Cissus, Cyphostemma, Tetrastigma, Rhoicissus, and Cayratia have a long or linear chalaza, whereas those of the rest of the family usually have an oval chalaza. Nevertheless, some seed forms occur in more than one genus, and some genera possess more than one seed form. The overlapping seed morphology implies, in some cases, a closer relationship among these genera, but in other cases it probably reflects convergent or parallel evolution.
Article
Full-text available
Ore mineral and host lithologies have been sampled with 89 oriented samples from 14 sites in the Naica District, northern Mexico. Magnetic parameters permit to charac- terise samples: saturation magnetization, density, low- high-temperature magnetic sus- ceptibility, remanence intensity, Koenigsberger ratio, Curie temperature and hystere- sis parameters. Rock magnetic properties are controlled by variations in titanomag- netite content and hydrothermal alteration. Post-mineralization hydrothermal alter- ation seems the major event that affected the minerals and magnetic properties. Curie temperatures are characteristic of titanomagnetites or titanomaghemites. Hysteresis parameters indicate that most samples have pseudo-single domain (PSD) magnetic grains. Alternating filed (AF) demagnetization and isothermal remanence (IRM) ac- quisition both indicate that natural and laboratory remanences are carried by MD-PSD spinels in the host rocks. The trend of NRM intensity vs susceptibility suggests that the carrier of remanent and induced magnetization is the same in all cases (spinels). The Koenigsberger ratio range from 0.05 to 34.04, indicating the presence of MD and PSD magnetic grains. Constraints on the geometry of the intrusive source body devel- oped in the model of the magnetic anomaly are obtained by quantifying the relative contributions of induced and remanent magnetization components.
Article
Full-text available
Traditional classifications of Araliaceae have stressed a relatively small number of morphological characters in the circumscription of infrafamilial groups (usually recognized as tribes). These systems remain largely untested from a phylogenetic perspective, and only a single previous study has explicitly explored intergeneric relationships throughout this family. To test these infrafamilial classification systems, parsimony and Bayesian-inference analyses were conducted using a broad sampling of 107 taxa representing 37 (of the 41) genera currently recognized in core Araliaceae, plus five outgroup genera. Data were collected from two molecular markers, the internal transcribed spacers (ITS) of the nuclear rRNA genes and the intron and intergenic spacer found in the trnL -trnF region of the chloroplast genome. The results suggest that there are three major lineages of Araliaceae, and that these lineages generally correspond with the centers of diversity for the family. The Aralia and Asian Palmate groups are centered primarily in eastern and southeastern Asia, whereas the Polyscias-Pseudopanax group is found throughout the Pacific and Indian Ocean basins. Several poorly resolved lineages are placed at the base of core Araliaceae, and the geographic distributions of these clades are consistent with a hypothesized rapid radiation of Araliaceae, possibly correlated with the breakup of Gondwanaland. Comparison of molecular results with the traditional systems of classification shows almost no congruence, indicating that they inadequately reflect phylogenetic relationships. Moreover, the morphological characters employed in these classifications appear to be highly homoplastic, and are thus of little utility at the infrafamilial level.
Article
Full-text available
Cayratia consists of ca. 60 species primarily distributed in the tropical and subtropical regions of Asia, Australia, and Africa. It is an excellent candidate for exploring the evolution of intercontinental disjunct distributions in the Old World. Previous phylogenetic work of Vitaceae with a few species of Cayraita sampled showed that Cayratia was not monophyletic and was closely related to Cyphostemma and Tetrastigma. We herein expanded taxon sampling of Cayratia (25/60 species) with its allied genera Cyphostemma (39/150 species), Tetrastigma (27/95 species), and other related genera from Vitaceae represented, employing five plastid markers (atpB-rbcL, rps16, trnC-petN, trnH-psbA, and trnL-F), to investigate the phylogeny, character evolution and biogeography of Cayratia. The phylogenetic analyses have confirmed the monophyly of the Cayratia-Cyphostemma-Tetrastigma (CCT) clade and resolved Cayratia into three lineages: the African Cayratia clade, subg. Cayratia, and subg. Discypharia. The African Cayratia was supported as the first diverging lineage within the CCT clade and Tetrastigma is resolved as sister to subg. Discypharia. Character optimizations suggest that the presence/absence of a membrane enclosing the ventral infolds in seeds is an important character for the taxonomy of Cayratia. The presence of bracts on the lower part of the inflorescence axis is inferred to have arisen only once in Cayratia, but this character evolved several times in Tetrastigma. Both the branching pattern of tendrils and the leaf architecture are suggested as important infrageneric characters, but should be used cautiously because some states evolved multiple times. Ancestral area reconstruction and molecular dating suggest that the CCT clade originated from continental Africa in the late Cretaceous, and it then reached Asia twice independently in the late Cretaceous and late Oligocene, respectively. Several dispersals are inferred from Asia to Australia since the Eocene.
Article
Full-text available
To tackle incongruence, the topological conflict between different gene trees, phylogenomic studies couple concatenation with practices such as rogue taxon removal or the use of slowly evolving genes. Phylogenomic analysis of 1,070 orthologues from 23 yeast genomes identified 1,070 distinct gene trees, which were all incongruent with the phylogeny inferred from concatenation. Incongruence severity increased for shorter internodes located deeper in the phylogeny. Notably, whereas most practices had little or negative impact on the yeast phylogeny, the use of genes or internodes with high average internode support significantly improved the robustness of inference. We obtained similar results in analyses of vertebrate and metazoan phylogenomic data sets. These results question the exclusive reliance on concatenation and associated practices, and argue that selecting genes with strong phylogenetic signals and demonstrating the absence of significant incongruence are essential for accurately reconstructing ancient divergences.
Book
— We studied sequence variation in 16S rDNA in 204 individuals from 37 populations of the land snail Candidula unifasciata (Poiret 1801) across the core species range in France, Switzerland, and Germany. Phylogeographic, nested clade, and coalescence analyses were used to elucidate the species evolutionary history. The study revealed the presence of two major evolutionary lineages that evolved in separate refuges in southeast France as result of previous fragmentation during the Pleistocene. Applying a recent extension of the nested clade analysis (Templeton 2001), we inferred that range expansions along river valleys in independent corridors to the north led eventually to a secondary contact zone of the major clades around the Geneva Basin. There is evidence supporting the idea that the formation of the secondary contact zone and the colonization of Germany might be postglacial events. The phylogeographic history inferred for C. unifasciata differs from general biogeographic patterns of postglacial colonization previously identified for other taxa, and it might represent a common model for species with restricted dispersal.
Article
A simple method of counting the number of possible evolutionary trees is presented. The trees are assumed to be rooted, with labelled tips but unlabelled root and unlabelled interior nodes. The method generalizes the previous work of Edwards and Cavalli-Sforza by allowing forks to be multifurcations as well as bifurcations. It makes use of a simple recurrence relation for T(n,m), the number of trees with n labelled tips and m unlabelled interior nodes. A table of the total number of trees is presented up to n = 22. There are 282,137,824 different trees having 10 tip species, and over 8.87 x 10²³ different trees having 20 tip species. The method is extended to count trees some of whose interior nodes may be labelled. The principal uses of these numbers will be to double-check algorithms and to frighten taxonomists.
Article
The paper reviews the current state of low and single copy nuclear markers that have been applied successfully in plant phylogenetics to date, and discusses case studies highlighting the potential of massively parallel high throughput or next-generation sequencing (NGS) approaches for molecular phylogenetic and evolutionary investigations. The current state, prospects and challenges of specific single- or low-copy plant nuclear markers as well as phylogenomic case studies are presented and evaluated.