Genome sequence of Apostasia ramifera provides insights into the adaptive evolution in orchids

RESEARCH ARTICLE (Open Access)
Weixiong Zhang¹, Guoqiang Zhang²,³,⁴, Peng Zeng¹, Yongqiang Zhang²,³,⁴,⁵, Hao Hu¹, Zhongjian Liu⁵ and Jing Cai⁶*
Abstract
Background: The Orchidaceae family is one of the most diverse among flowering plants and serves as an
important research model for plant evolution, especially “evo-devo” studies of floral organs. Recently,
sequencing of several orchid genomes has greatly improved our understanding of the genetic basis of
orchid biology. To date, however, most sequenced genomes are from the Epidendroideae subfamily. To
better elucidate orchid evolution, greater attention should be paid to other orchid lineages,
especially basal lineages such as Apostasioideae.
Results: Here, we present a genome sequence of Apostasia ramifera, a terrestrial orchid species from
the Apostasioideae subfamily. The genomes of A. ramifera and other orchids were compared to explore
the genetic basis underlying orchid species richness. Genome-based population dynamics revealed a
continuous decrease in population size over the last 100 000 years in all studied orchids, although
the epiphytic orchids generally showed larger effective population sizes than the terrestrial orchids
over most of that period. We also found more genes of the terpene synthase gene family, the pathogen
resistance (R) gene family, and LOX1/LOX5 homologs in the epiphytic orchids.
Conclusions: This study provides new insights into the adaptive evolution of orchids. The A. ramifera
genome sequence reported here should be a helpful resource for future research on orchid biology.
Keywords: Orchidaceae, Apostasia ramifera, Comparative genomics, Adaptive evolution
Background
The Orchidaceae family is one of the largest among flowering plants, with many species exhibiting
great ornamental value due to their colorful and distinctive flowers. At present, there are more than
28 000 orchid species assigned to 763 genera [1]. According to their phylogeny, orchids can be divided
into five subfamilies, i.e., Apostasioideae, Vanilloideae, Cypripedioideae, Epidendroideae, and
Orchidoideae. It has been proposed that whole-genome duplication occurred in the ancestor of all
orchid species, which contributed to their survival under significant climatic change [2, 3]. Orchids
are both diverse and widespread. Notably, several orchid species with specialized floral structures,
such as labella and gynostemia, appear to have co-evolved with animal pollinators to facilitate
reproductive success. In addition to their role in research on evolution and pollination biology,
orchids are invaluable to the horticultural industry due to their elegant and distinctive flowers [4].
© The Author(s). 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if
changes were made. The images or other third party material in this article are included in the article's Creative Commons
licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons
licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the
data made available in this article, unless otherwise stated in a credit line to the data.
* Correspondence: jingcai@nwpu.edu.cn
Weixiong Zhang, Guoqiang Zhang, and Peng Zeng contributed equally to this work and are co-first authors.
⁶ School of Ecology and Environment, Northwestern Polytechnical University, 710129 Xi'an, China
Full list of author information is available at the end of the article
Zhang et al. BMC Genomics (2021) 22:536
https://doi.org/10.1186/s12864-021-07852-3
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
The genome sequences of several orchid species have been published recently, thereby greatly
improving our understanding of orchid biology and evolution. The first reported orchid genome
(Phalaenopsis equestris) showed evidence of an ancient whole-genome duplication event in the orchid
lineage and revealed that expansion of MADS-box genes may be related to the diverse morphology of
orchid flowers [2]. The subsequent publication of other orchid genome sequences, such as those of
Dendrobium officinale, Dendrobium catenatum, Phalaenopsis aphrodite, Apostasia shenzhenica, and
Vanilla planifolia, has provided data for further investigations of the genetic mechanisms
underlying orchid species richness [3, 5–8].
The Apostasioideae subfamily consists of terrestrial orchid species [9]. Species within
Apostasioideae exhibit various primitive traits, such as radially symmetrical flowers and no labella,
supporting the placement of this subfamily as a sister clade to all other orchids [10]. These
primitive features are considered ancient characteristics of the orchid lineage [10]. Thus,
Apostasioideae species can serve as an important outgroup for evolutionary study of all other orchid
subfamilies. Recently, Zhang et al. [3] published the A. shenzhenica genome and identified an
orchid-specific whole-genome duplication event as well as changes in the MADS-box gene family
associated with different orchid characteristics. To date, this is the only genome reported for the
Apostasioideae subfamily, whereas most published genomes belong to the Epidendroideae subfamily.
Obtaining genomes for other orchid lineages, especially basal lineages, will greatly facilitate our
understanding of orchid evolution. Here, we performed de novo assembly and analysis of the Apostasia
ramifera genome sequence, the second Apostasia genome after A. shenzhenica. Comparative genomic
analyses were carried out with six other published orchid genomes to provide insight into orchid
evolution.
Results
Genome sequencing and assembly
The genomic DNA of A. ramifera was sequenced using the Illumina HiSeq 2000 platform. Sequencing of
five libraries with insert sizes ranging from 250 to 5 000 bp generated more than 57 Gb of clean
data, corresponding to approximately 156× coverage of the genome (Additional file 1, Table S1). Based
on the clean reads, we generated a 365.59-Mb assembly with a scaffold N50 of 287.45 kb (Table 1 and
Additional file 1, Table S2). To assess the quality of the final assembly, clean reads were mapped to
the genome sequence, resulting in a mapping ratio of 99.7 %. The completeness of the gene regions in
the assembly was examined by BUSCO (Benchmarking Universal Single-Copy Orthologs) assessment [11]. In
total, 94.9 % (1 304/1 375) of the universal single-copy orthologs were found in our assembly
(Additional file 1, Table S3).
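As a rough illustration of the two assembly statistics quoted above (not part of the original analysis), the following Python sketch shows how a scaffold N50 and an approximate sequencing depth can be computed. The function names and the toy scaffold lengths are hypothetical; the depth calculation uses the rounded figures given in the text (about 57 Gb of clean data and the 365 588 417-bp assembly), so it only approximates the reported 156× value.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")


def sequencing_depth(total_read_bases, assembly_size):
    """Approximate depth as total clean bases divided by assembly size."""
    return total_read_bases / assembly_size


# Toy scaffold lengths (hypothetical): total 300, so half is 150.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)

# Rounded values taken from the text; "more than 57 Gb" makes this a lower bound.
approx_depth = sequencing_depth(57e9, 365_588_417)
```

With the toy lengths, the two longest scaffolds (80 and 70) already cover half of the total, so the N50 is 70; the depth estimate from the rounded figures is close to 156.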
Genome annotation
Using both de novo and library-based repetitive sequence annotation, 164.49 Mb of repetitive elements
were uncovered, accounting for 44.99 % of the total assembly (Additional file 1, Table S4). The
proportion of repetitive DNA in A. ramifera was similar to that in A. shenzhenica (43.74 %) but lower
than that in P. equestris (62 %) and D. catenatum (78 %). Among the repetitive sequences,
transposable elements (TEs) were the most abundant (43.1 %), among which long terminal repeats (LTRs)
were dominant, accounting for 24.07 % of the total genome (Additional file 1, Table S5 and Fig. S1).
The protein-coding gene models were predicted through a combination of de novo and homology-based
annotation. In total, 22 841 putative genes were identified in the A. ramifera genome, similar to the
number in A. shenzhenica (21 831) but fewer than in V. planifolia (28 279), P. equestris (29 545),
and D. catenatum (29 257) (Additional file 1, Table S6). Further functional annotation of the
predicted genes was carried out by homology searches against various databases, including Gene
Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), SwissProt, TrEMBL, the nr database,
and InterPro. Results showed that 19 551 (85.6 %) predicted genes could be annotated (Additional
file 1, Table S7). In addition, we identified 40 microRNA, 616 transfer RNA, 1 450 ribosomal RNA, and
108 small nuclear RNA genes in the A. ramifera genome (Additional file 1, Table S8).
Synteny comparison based on gene annotations of A. ramifera and A. shenzhenica identified 927
synteny blocks with an average block size of 12.89 genes (Additional file 1, Table S9). A total of
11 950 gene pairs were covered by these synteny blocks, accounting for 61 and 66 % of the genome
sequences of A. ramifera and A. shenzhenica, respectively (Additional file 1, Table S9). The high
co-linearity between their genomes suggested a close relationship between these two species.
Table 1 Statistics related to A. ramifera genome assembly
Feature             Summary
Genome Size         365 588 417 bp
Scaffold N50        287 449 bp
Contig N50          30 765 bp
Longest Scaffold    1 388 560 bp
GC Rate             33.38 %
Repeat Content      44.99 %
BUSCO Assessment    94.9 %
Gene Number         22 841
Gene family identification
Gene family identification was carried out for the predicted protein-coding genes in A. ramifera,
together with genes from other species, including P. equestris, P. aphrodite, D. officinale,
D. catenatum, A. shenzhenica, V. planifolia, Asparagus officinalis, and Oryza sativa. A total of
19 422 putative genes in the A. ramifera assembly were assigned to 13 251 gene families (Fig. 1A and
Additional file 1, Table S10). The remaining 3 419 genes could not be grouped with other genes and
were considered orphans. Among the compared species, 266 gene families were shared only by orchid
species. KEGG and GO enrichment analyses of those orchid-specific gene families revealed various
significantly enriched pathways and terms, including “Stilbenoid, diarylheptanoid and gingerol
biosynthesis” (ko00945), “Zeatin biosynthesis” (ko00908), “Flavonoid biosynthesis” (ko00941),
“Circadian rhythm - plant” (ko04712), “Regulation of gene expression” (GO:0010468), and “Aromatic
compound biosynthetic process” (GO:0019438) (Additional file 1, Table S11 and S12). Furthermore, a
total of 1 145 gene families were specifically expanded in Apostasia (see Methods), and were
significantly enriched in several pathways, such as “Ribosome biogenesis in eukaryotes” (ko03008),
“mRNA surveillance pathway” (ko03015) and “Plant-pathogen interaction” (ko04626) (Additional
file 1, Table S13 and S14).
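The enrichment statistic is not described in this section, so the following is only a minimal sketch of one common choice, a one-sided hypergeometric test, and should not be read as the procedure actually used here. The function name and the toy numbers are hypothetical: N is the number of annotated background genes, K the number of background genes carrying a given KEGG or GO term, n the size of the gene set of interest, and k the number of genes in that set carrying the term.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """One-sided hypergeometric p-value P(X >= k).

    X counts term-annotated genes in a sample of n genes drawn without
    replacement from N background genes, K of which carry the term.
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the exact integer counts first, then divide once to limit rounding.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy example (hypothetical numbers): 20 background genes, 5 with the term,
# 5 genes in the set of interest, 3 of which carry the term.
p_toy = hypergeom_enrichment_p(20, 5, 5, 3)
```

In practice, p-values from many terms would also need a multiple-testing correction, which is omitted from this sketch.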
Phylogenetic analysis
We constructed a phylogenetic tree using MrBayes with gene sequences of 381 single-copy genes shared
by 16 plant species, including A. ramifera. The divergence times among these species were estimated
using PAML MCMCTree based on our phylogeny. Results showed that the Apostasia species separated from
other orchids 82 million years ago (Fig. 1B), consistent with previously published results [3]. The
divergence time between A. ramifera and A. shenzhenica was estimated to be 8 million years ago
(Fig. 1B). Gene family expansions and contractions on each phylogenetic branch of the 16 species were
estimated using CAFE [12] (Fig. 1B). We further carried out GO/KEGG enrichment analyses on the
significantly expanded gene families in A. ramifera and found some functionally enriched pathways and
terms, including “Zeatin biosynthesis” (ko00908), “Glycerophospholipid metabolism” (ko00564),
“Flavin adenine dinucleotide binding” (GO:0050660), and “UDP-N-acetylmuramate dehydrogenase
activity” (GO:0008762) (Additional file 1, Table S15 and S16). In addition, the significantly
contracted gene families were enriched in “Homologous recombination” (ko03440), “Glycosphingolipid
biosynthesis” (ko00604), “Transferase activity, transferring phosphorus-containing groups”
(GO:0016772), and “Transferase activity” (GO:0016740) (Additional file 1, Table S17 and S18).
Fig. 1 Gene family and phylogenetic relationship analysis. (A) Venn diagram showing distribution of shared gene families among five orchid
species, i.e., A. ramifera (Ara), A. shenzhenica (Ash), P. equestris (Peq), D. catenatum (Dca), and V. planifolia (Vpl). (B) Phylogenetic tree showing
relationship and divergence times for 16 species. Purple bars at internal nodes represent 95% confidence interval of divergence times. Numbers
of expanded and contracted gene families are presented as green and red values, respectively. MRCA, most recent common ancestor
[Figure not reproduced. Species in panel B: Amborella trichopoda, Ananas comosus, Apostasia ramifera, Apostasia shenzhenica,
Arabidopsis thaliana, Asparagus officinalis, Brachypodium distachyon, Dendrobium catenatum, Musa acuminata, Oryza sativa,
Phalaenopsis equestris, Phoenix dactylifera, Populus trichocarpa, Sorghum bicolor, Spirodela polyrhiza, and Vitis vinifera;
time axis in million years ago.]
History of orchid population size
Population size history is important for understanding the underlying mechanisms leading to current
patterns of species and population diversity [13]. Several investigations on orchid population size
have been published [14, 15]. Here, the pairwise sequentially Markovian coalescent (PSMC) model,
which uses the coalescent approach to estimate population size changes [13], was applied to infer
population size history based on the genome sequences of seven orchid species, i.e., A. ramifera,
A. shenzhenica, P. equestris, P. aphrodite, D. officinale, D. catenatum, and V. planifolia. For the
Apostasia species, population size changes were recovered between 10 000 and 250 000 years ago, with
similar dynamics in both species (Fig. 2). Earlier history could not be recovered because the low
heterozygosity of the A. ramifera and A. shenzhenica genome sequences provided limited information
on ancient changes in population size. For the other orchids, population size histories showed
similar patterns, especially D. catenatum, D. officinale, and P. equestris (Fig. 2). First, a period
of population growth was observed for each of these orchid species. Then, all orchid populations
experienced a severe contraction (bottleneck) over the last 100 000 years, from which they have not
recovered (Fig. 2). Over the period covered for all species (10 000 to 250 000 years ago), the
Apostasia species had the smallest population sizes among the orchids studied. The population size
of Vanilla was slightly higher than that of Apostasia, but lower than that of all Epidendroideae
orchids.
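To make the time and size axes of such a PSMC plot concrete, the sketch below applies the conventional PSMC rescaling: N0 = θ0 / (4μs), time in years = 2·N0·t_k·g, and effective population size = N0·λ_k. The generation time (g = 4 years) and per-generation mutation rate (μ = 0.5 × 10⁻⁸) are those stated for Fig. 2; the bin size s = 100 bp is an assumption (the usual PSMC window size), and the function name and input values are hypothetical rather than taken from the actual PSMC output of this study.

```python
def scale_psmc(theta0, t_k, lambda_k, mu=0.5e-8, gen_years=4, bin_size=100):
    """Convert scaled PSMC estimates to years and effective population sizes.

    theta0    -- scaled mutation rate per bin reported by PSMC
    t_k       -- interval boundaries in units of 2*N0 generations
    lambda_k  -- relative population sizes in units of N0
    mu        -- mutation rate per site per generation (Fig. 2 value)
    gen_years -- generation time in years (Fig. 2 value)
    bin_size  -- bases per PSMC bin (assumed to be 100)
    """
    n0 = theta0 / (4 * mu * bin_size)
    years = [2 * n0 * t * gen_years for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return n0, years, ne


# Hypothetical inputs chosen so that N0 is about 10 000.
n0, years, ne = scale_psmc(theta0=0.02, t_k=[0.5, 1.0], lambda_k=[2.0, 1.0])
```

Because both axes scale inversely with μ, and the time axis also scales with g, the absolute values in Fig. 2 depend directly on the assumed mutation rate and generation time, whereas the relative ordering of the species' curves at a given time is less sensitive to these choices when the same values are used for all species.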
Gene family evolutionary analysis
MADS-box transcription factors
In plants, MADS-box transcription factors are involved in various developmental processes, such as
floral development, flowering control, and root growth. All MADS-box gene family members are
categorized as type I or type II based on their gene tree. Using HMMER software and a MADS-box domain
profile (PF00319), we identified 30 putative MADS-box genes in the A. ramifera genome, fewer than
detected in the other sequenced orchids (Additional file 1, Table S19). Phylogenetic analysis of the
putative MADS-box genes revealed that 23 belonged to the type II MADS-box clade (Fig. 3A), again
fewer than found in other orchids, e.g., A. shenzhenica (27 members) [3], V. planifolia (30 members,
Additional file 1, Fig. S2A), P. equestris (29) [2], and D. catenatum (35) [5]. Compared to
P. equestris, there were fewer members in the A-class, B-class, E-class, and AGL6-class in
A. ramifera and V. planifolia (Additional file 1, Table S19). In contrast, there were more SVP-class,
ANR1-class, and AGL12-class members in A. ramifera and V. planifolia than in P. equestris
(Additional file 1, Table S19).
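As an illustration of how candidate genes might be collected from a profile search such as the one described above, the sketch below parses the per-sequence table written by HMMER's hmmsearch with the --tblout option and keeps targets whose full-sequence E-value passes a cutoff. This is a hedged example, not the pipeline used here: the cutoff of 1e-5, the function name, and the two sample records (geneA, geneB) are hypothetical.

```python
def tblout_hits(lines, max_evalue=1e-5):
    """Return target names from hmmsearch --tblout lines passing an E-value cutoff.

    In this whitespace-delimited format, column 1 is the target name and
    column 5 is the full-sequence E-value; comment lines start with '#'.
    """
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.add(target)
    return hits


# Hypothetical records in tblout layout (query: the PF00319 MADS-box profile).
sample = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "geneA - SRF-TF PF00319 1.2e-20 70.1 0.1 2.0e-20 69.4 0.1 1.0 1 0 0 1 1 1 1 hypothetical",
    "geneB - SRF-TF PF00319 0.5 10.2 0.0 0.7 9.8 0.0 1.0 1 0 0 1 1 1 1 hypothetical",
]
candidates = tblout_hits(sample)
```

Here only geneA passes the cutoff. Candidates obtained this way would still need to be placed in a gene tree, as described above, before being assigned to type I or type II classes.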
Type I MADS-box transcription factors are involved in plant reproduction and endosperm development
[16]. Here, we identified seven and six type I MADS-box genes in A. ramifera and V. planifolia,
respectively (Fig. 3B and Additional file 1, Fig. S2B and Table S19). Phylogenetic analysis showed
that genes in the Mβ-class were absent in A. ramifera and V. planifolia (Fig. 3B and Additional
file 1, Fig. S2B).
Fig. 2 Population size histories of seven orchid species, including P. aphrodite (yellow), D. catenatum (green), P. equestris (purple), D. officinale
(dark blue), V. planifolia (pink), A. shenzhenica (light blue) and A. ramifera (red), between 10 000 and 10 million years ago. Generation times of
orchids were assumed to be four years, and mutation rate per generation was 0.5 × 10⁻⁸
[Figure not reproduced. x-axis: years (g = 4, μ = 0.5 × 10⁻⁸), logarithmic scale; y-axis: effective population size (× 10⁴).]
Terpene synthase (TPS) gene family
In plants, TPS family members are responsible for the biosynthesis of terpenoids, which are involved
in various physiological processes such as primary metabolism and development [17]. The architecture
of the TPS gene family is proposed to be modulated by natural selection for adaptation to specific
ecological niches [18]. We used both the terpene_synth and terpene_synth_C domains to search for TPS
genes in the orchid genomes. The TPS gene family was smaller in the two Apostasia species than in the
other orchids studied (Fig. 4). Only eight and six copies of TPS genes were found in A. shenzhenica
and A. ramifera, respectively (Fig. 4 and Additional file 1, Table S20). A small TPS family size in
Apostasia may indicate a loss of chemical diversity of terpenoid compounds. To resolve the
phylogenetic relationship of TPS genes in orchids, a gene tree was constructed using the TPS gene
sequences derived from orchids and Arabidopsis. Phylogenetic analysis showed that four TPS
subfamilies were found in Apostasia (Fig. 4). In Apostasia, members of both the TPS-c and TPS-f
subfamilies, which encode enzymes responsible for the synthesis of 20-carbon diterpenes, were lost
(Fig. 4 and Additional file 1, Table S20). In addition, fewer members of the TPS-a and TPS-b
subfamilies were observed in Apostasia compared with other orchids (Fig. 4 and Additional file 1,
Table S20). Genes from these two subfamilies are reportedly involved in the biosynthesis of 10- and
15-carbon volatile terpenoids [19], which are components of floral scent.
Pathogen resistance genes
Pathogen resistance-related genes are closely associated with plant fitness and adaptive evolution
[20]. Here, the NB-ARC domain profile was used to search for R genes in the predicted gene models of
A. ramifera and other orchids, including A. shenzhenica, V. planifolia, P. equestris, P. aphrodite,
D. catenatum, and D. officinale. We identified 71 R genes in A. ramifera and 66 in A. shenzhenica,
considerably fewer than found for P. equestris (114), P. aphrodite (109), D. officinale (172),
D. catenatum (182), and V. planifolia (86) (Fig. 5). Thus, the size of the R gene family varied
greatly among the different Orchidaceae genera (Fig. 5).
In Apostasia, in addition to the small R gene family size, we also discovered lower copy numbers in
both the NAC and WRKY gene families (Fig. 5), which are known to play important roles in plant
immune response [21, 22]. We identified 55 and 64 NAC transcription factor members in A. ramifera
and A. shenzhenica, respectively, markedly fewer than found in Dendrobium, Phalaenopsis, and Vanilla
(77 to 113) (Fig. 5). We also identified 56 and 50 WRKY transcription factors in A. ramifera and
A. shenzhenica, respectively, again fewer than found in other orchids (64 to 83) (Fig. 5).
Fig. 3 Phylogenetic analysis of MADS-box genes in A. ramifera. (A) Type II MADS-box genes. (B) Type I MADS-box genes. Neighbor-joining gene
trees were constructed using MADS-box genes from A. ramifera and Arabidopsis. Genes from A. ramifera are marked in red. Different MADS-box
classes are indicated. Numbers above branches are bootstrap support values of at least 50
[Figure not reproduced. Classes labelled in panel A: FLC, MIKC*, SVP, ANR1, C/D, SOC1, A, AGL12, AGL6, E, B, and AGL15.]
Fig. 4 Phylogenetic tree for TPS genes predicted in six orchid species and Arabidopsis. Numbers above branches are bootstrap support values of
at least 50
[Figure not reproduced. Species: Arabidopsis thaliana, Apostasia ramifera, Apostasia shenzhenica, Dendrobium catenatum,
Phalaenopsis equestris, Phalaenopsis aphrodite, and Vanilla planifolia; subfamilies: TPS-a, TPS-b, TPS-c, TPS-e, TPS-f, and TPS-g.]
Fig. 5 Number of members of R genes and NAC and WRKY gene families in different orchids. These gene families are marked in blue, green, and
yellow, respectively. Sizes of circles are directly proportional to number of members in gene family
[Figure not reproduced. Values shown in the figure:]
Species                   R     NAC   WRKY
Phalaenopsis aphrodite    109   77    64
Phalaenopsis equestris    114   82    65
Dendrobium officinale     172   113   83
Dendrobium catenatum      182   91    71
Vanilla planifolia        86    92    77
Apostasia shenzhenica     66    64    50
Apostasia ramifera        71    55    56
Apostasia LOX1/LOX5 genes may contribute to lateral root
development, an important trait for terrestrial growth
LOX1 and LOX5 are involved in the development of lateral roots in Arabidopsis, and loss of these two
genes causes a significant increase in lateral root emergence [23]. Here, we searched for homologs of
LOX1 and LOX5 in six published orchid genomes using protein sequences from Arabidopsis as the query,
and then constructed a gene tree to elucidate the phylogenetic relationship among these genes. We
detected multiple copies of LOX1/LOX5 homologs in the epiphytic orchid genomes (Fig. 6 and
Additional file 1, Table S21). However, only one homologous gene was found in A. ramifera, and the
LOX1/LOX5 homologs were completely lost in A. shenzhenica (Fig. 6 and Additional file 1, Table S21).
We also found one copy of the LOX1/LOX5 genes in the hemi-epiphytic orchid V. planifolia (Fig. 6 and
Additional file 1, Table S21).
Discussion
With worldwide distribution, orchids are one of the largest flowering plant families and their
extraordinary diversity provides an excellent opportunity to explore plant evolution. Certain
evolutionary innovations in orchids, e.g., pollinia, labella and epiphytism, are proposed to have
played key roles in their adaptive evolution and radiation. However, the genetic basis underlying
those innovations remains incompletely known. In the current study, we sequenced the genome of
A. ramifera, a terrestrial orchid from the basal Apostasioideae lineage, and carried out comparative
genomic analyses of seven orchid genomes including that of A. ramifera. Several gene families related
to adaptations in orchids (e.g., MADS-box, pathogen resistance, TPS, and LOX genes) were compared
among different orchid lineages.
MADS-box transcription factors
Compared with other orchids, we found smaller gene families in the B- and E-classes of type II MADS
genes in Apostasia and Vanilla. Genes in these classes of type II MADS are involved in floral
development [24]. Furthermore, it has been proposed that the small size of these gene families may
be related to the maintenance of the ancestral state in Apostasia flowers, which exhibit radial
symmetry and no specialized labellum [3]. However, small gene families in the B- and E-classes of
the type II MADS family were also found in V. planifolia, which has bilaterally symmetrical flowers
and a specialized labellum. These results indicate that the number of members in the B- and
E-classes may not account for the different flower morphologies found among Apostasioideae and
other orchids.
Recent research has suggested that genes from the MIKC* family are involved in pollen development
[25, 26]. Here, we found a MIKC* P-subclass member in the A. ramifera genome. Furthermore, P- and
S-subclass members of MIKC* were identified in A. shenzhenica, while P-subclass genes were lost in
P. equestris [3]. It has been proposed that loss of P-subclass genes is associated with the
evolution of pollinia [3]. However, both P- and S-subclass members have been identified in the
genome assemblies of P. aphrodite [7] and V. planifolia (Additional file 1, Fig. S2). Thus, loss of
MIKC* genes in some orchids might not be relevant to the evolution of pollinia.
Fig. 6 LOX gene tree showing LOX1/LOX5 genes in orchids. Phylogenetic analysis was conducted using LOX gene sequences from A. ramifera,
A. shenzhenica, D. catenatum, P. equestris, P. aphrodite, V. planifolia, and Arabidopsis. Branches leading to orchid LOX1/LOX5 genes are marked in
green. Numbers above branches are bootstrap support values of at least 50
[Figure not reproduced.]
A lack of Mβ genes has been reported in some orchid genomes, including A. shenzhenica, P. equestris,
and D. catenatum [2, 3, 5]. Here, we found that the Mβ gene was also absent in A. ramifera and
V. planifolia. Zhang et al. [3] suggested that loss of Mβ-class type I MADS-box transcription
factors is related to the absence of endosperm in the seeds of all orchids. However, Mβ genes have
been discovered in the genome of P. aphrodite and the transcriptome of Orchis italica [7, 27]. Thus,
instead of Mβ genes, other genes or mechanisms may contribute to the absence of endosperm in orchid
seeds.
TPS gene family
Compared with Apostasia, more members of the TPS-a and TPS-b clades of the TPS gene family were
found in Vanilla, Dendrobium, and Phalaenopsis. Members of these clades are involved in the
biosynthesis of volatile terpenoids, which are components of floral scent [19]. In addition, it has
been proposed that expansion of TPS subfamilies may promote the emergence of novel compounds [18].
As the flowers of orchids in the Epidendroideae and Vanilloideae subfamilies are highly adapted to
animal pollination via many pollination syndromes, including the development of various volatile
compounds, this result may provide new insight into the genetic basis of adaptation to insect
pollination in epiphytic orchids. Gene duplication and divergence are more effective ways of
evolving new enzymatic functions than de novo evolution of a new gene [18]. Thus, the larger number
of TPS-a and TPS-b subfamily members in these orchids may facilitate the emergence of novel volatile
compounds, which may contribute to their adaptation to diverse animal pollinators via the production
of diverse flower scents.
Lateral root development
For higher land-based plants, roots play a significant role
in their successful colonization of the terrestrial environ-
ment by providing mechanical support as well as water
and nutrient uptake from the soil (or air for epiphytic
plants) [28]. Root architecture, i.e., the spatial
organization of roots, also has a significant impact on
the functional performance of the root system and is im-
portant for plant survival [28,29]. Environmental fac-
tors, such as water and nutrient availability, contribute
to the shaping of root architecture [30]. Significant root
system differences have been reported between Aposta-
sia and other orchids [3]. Among them, branch roots
have been found in Apostasia but not in epiphytic or-
chids, such as Phalaenopsis [3]. In land plants, the for-
mation of lateral roots plays a crucial role in root
architecture, uptake, and anchoring. Following their
adaptation to soil-free environments, however, various
orchids have lost the ability to develop lateral roots, in-
stead forming specialized root structures, such as spongy
epidermis, to help preserve nutrients. Zhang et al. [3] re-
ported that variation in the copy number of ANR1 sub-
family MADS-box genes results in different lateral root
formation between A. shenzhenica and epiphytic P.
equestris and D. catenatum. However, as the develop-
ment of lateral roots is a complicated process that in-
volves intricate regulation and phytohormone
interactions [31,32], the genetic mechanisms controlling
the emergence of lateral roots in orchids await further
investigation. In this study, we found fewer copies of the
LOX1/LOX5 homologous genes in the Apostasia species
and hemi-epiphytic V. planifolia than in the epiphytic
orchids. Given the function of LOX1 and LOX5 in
Arabidopsis [23], we propose that copy number variation
in these genes may contribute to the differences in
lateral root development between terrestrial and epiphytic
orchids. In addition, according to the phylogenetic
relationships of LOX genes in orchids, six different
subclades of LOX genes were present in the common ancestor of
orchids. The variation in copy number among the different
orchid lineages may therefore be due to different degrees of
gene retention, rather than gene duplication.
Conclusions
In this study, we performed de novo assembly and
analysis of the genome of A. ramifera, a terrestrial orchid
from the Apostasioideae subfamily. We reconstructed the
population size histories of different orchid species
from their genomes and found a continuous decrease in
population size over the last 100 000
years. In addition, the gene family size and subfamily
architecture of TPS genes varied greatly among species
from different orchid subfamilies, which may be
associated with the adaptive evolution of orchids. Genes
associated with pathogen resistance were significantly
reduced in number in the genomes of Apostasia compared with
those of other orchids. In Apostasia, we also found genes
that were likely involved in the regulation of lateral root
development, which is an important trait for terrestrial
growth. The A. ramifera genome sequence reported here
should be an important resource for further
investigations of orchid biology. Comparative genomic analysis
of A. ramifera and other orchids should provide new
insights into the adaptive evolution of these species.
Methods
Sample preparation and sequencing
The A. ramifera samples were collected from Jianfeng
Mountain, Hainan Province, China. No permission was
required to collect these samples. The formal identifica-
tion of plant material was conducted by Prof. Zhongjian
Liu. A voucher specimen of the material was deposited
at the National Orchid Conservation Center of China
under deposition number Liu.JZ6475. For genome se-
quencing, we collected fresh leaves from A. ramifera. Ex-
traction of genomic DNA was carried out using the
modified cetyltrimethylammonium bromide protocol
[33]. Five DNA libraries with different insert sizes were
constructed using an Illumina library construction kit
(NEB DNA Library Rapid Prep Kit for Illumina) and
then sequenced using the Illumina HiSeq 2000 platform.
After filtering the raw reads according to sequencing
quality and adaptor contamination, a total of 57.4 Gb of
clean data were retained for assembling the Apostasia
genome.
Genome assembly
The estimated genome size of A. ramifera was
332.24 Mb according to k-mer frequency distribution.
Only one peak was observed in the k-mer distribution,
indicating high homozygosity of the Apostasia genome.
For genome assembly, SOAPdenovo2 [34] was used for
contig construction and scaffolding, and GapCloser was
used for extending the length of the final contigs. In
total, 57.4 Gb of clean reads derived from the DNA li-
braries with five insert sizes (Additional file 1, Table S1)
were used by SOAPdenovo2 assembler and GapCloser
for de novo genome assembly.
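As a rough illustration of the k-mer-based size estimate mentioned above, genome size is commonly approximated as the total number of k-mer occurrences divided by the depth at the main peak of the k-mer frequency histogram. The sketch below assumes that standard relation. The histogram values, the error cutoff, and the function name estimate_genome_size are hypothetical and are not taken from the study.

```python
def estimate_genome_size(histogram, min_depth=3):
    """Estimate genome size from a k-mer depth histogram.

    histogram: dict mapping k-mer depth -> number of distinct k-mers at that depth.
    min_depth: depths below this are treated as sequencing errors (assumed cutoff).
    Returns (estimated_size, peak_depth).
    """
    # Keep only depths that are unlikely to come from error k-mers.
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers above the error cutoff")
    # The main peak is the depth with the most distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences = sum over depths of depth * count.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram with a single peak at depth 20 (hypothetical numbers).
toy = {1: 5000, 2: 800, 18: 900, 19: 1500, 20: 2000, 21: 1500, 22: 900}
size, peak = estimate_genome_size(toy)
```

A single dominant peak, as in this toy histogram, is the pattern the text associates with high homozygosity; a second peak at roughly half the main depth would point to heterozygous regions.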
Repeat annotation
Repeat sequences consist of tandem repeats, such as
small and micro-satellite DNA, and interspersed repeats
(also known as transposable elements, TEs). In the A.
ramifera genome, tandem repeat sequences were identi-
fied by TRF software [35]. Identification of TEs was con-
ducted by homology searches of the RepBase database
[36] and de novo prediction. Briefly, RepeatMasker [37]
and RepeatProteinMask [38] were applied to identify
TEs in the Apostasia genome with a RepBase-derived li-
brary of known repeat elements. For de novo prediction,
we used RepeatModeler and LTR-FINDER [39] to con-
struct a de novo repetitive element library for the A.
ramifera genome. RepeatMasker was then applied to
search the genome for TEs with the constructed data-
base. Finally, these results were combined, and the re-
dundant sequences were removed to generate a
complete repeat annotation.
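The text does not state how redundant repeat calls were removed. One common approach, assumed here purely for illustration, is to merge overlapping intervals reported by the different tools on each scaffold before summing the masked length. The tuple layout and the function names below are hypothetical.

```python
from collections import defaultdict


def merge_repeat_calls(calls):
    """Merge overlapping or adjacent repeat intervals per sequence.

    calls: iterable of (seq_id, start, end) with 1-based inclusive coordinates.
    Returns a dict mapping seq_id -> list of merged (start, end) tuples.
    """
    by_seq = defaultdict(list)
    for seq_id, start, end in calls:
        by_seq[seq_id].append((start, end))

    merged = {}
    for seq_id, intervals in by_seq.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = out[-1]
            if start <= last_end + 1:  # overlapping or directly adjacent
                out[-1] = (last_start, max(last_end, end))
            else:
                out.append((start, end))
        merged[seq_id] = out
    return merged


def masked_length(merged):
    """Total number of bases covered by the merged repeat intervals."""
    return sum(end - start + 1 for ivs in merged.values() for start, end in ivs)


calls = [
    ("scaffold1", 100, 200),  # e.g. a homology-based call
    ("scaffold1", 150, 260),  # e.g. a de novo library call overlapping it
    ("scaffold1", 500, 550),
    ("scaffold2", 10, 20),
]
merged = merge_repeat_calls(calls)
```

Merging before counting avoids double-counting bases that two tools annotate independently.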
Gene and non-coding RNA prediction
Because previously published work on the V. planifolia draft
genome [8] did not include gene prediction, we carried out
protein-coding gene prediction for both the A. ramifera and V.
planifolia genomes. First, we used AUGUSTUS [40] and
GlimmerHMM [41] to generate de novo predicted gene
sets for our assembly and the V. planifolia genome
(BioProject: PRJNA507095). Protein sets derived from five plant
genomes (Arabidopsis thaliana, Phalaenopsis
equestris, Oryza sativa, Sorghum bicolor, and Zea mays) were
then searched against the Apostasia and Vanilla
genomes using TBLASTN with an E-value cutoff of 1e-5 and
minimum query coverage of 25 %. GeneWise [42] was used
to annotate the gene structures. The RNA-seq datasets
(SRR1509356, SRR1509370, and SRR1509674) for V. planifolia
were downloaded from NCBI SRA and assembled de novo
using Trinity. The Vanilla transcripts were used
to annotate the V. planifolia genome with the PASA
program. The annotation results derived from the different methods
were then combined with MAKER [43] to generate integrated
protein-coding gene sets for A. ramifera and V. planifolia.
Non-coding RNAs are not translated into proteins
but play significant roles in cellular metabolism,
and include microRNAs (miRNAs), transfer RNAs
(tRNAs), ribosomal RNAs (rRNAs), and small nuclear
RNAs (snRNAs). Here, we applied previously described
methods to search for non-coding RNAs in the Apostasia
genome [3]. The miRNA- and snRNA-coding genes
were predicted using INFERNAL [44] and the tRNA-coding
genes were identified using tRNAscan-SE [45].
Genes encoding rRNAs were annotated by searching the
genome with the rRNA sequences of Arabidopsis.
Functional annotation
Functional analysis of the predicted genes in the Aposta-
sia genomes was performed by searching their protein-
coding regions against sequences derived from publicly
available databases, including Gene Ontology (GO) [46,
47], Kyoto Encyclopedia of Genes and Genomes (KEGG)
[48], SwissProt [49], TrEMBL [49], non-redundant (nr)
protein database, and InterProScan [50].
Gene family identification
Gene family clustering was conducted using OrthoFinder [51]
with complete protein sets from seven species (P.
equestris, P. aphrodite, D. officinale, D. catenatum, A.
shenzhenica, A. officinalis, and O. sativa), as well as the predicted
protein sequences from A. ramifera. To limit the influence
of alternative splicing variants on gene family clustering, the
longest transcript of each gene was selected for analysis.
Gene families in which the number of genes from Apostasia
(including A. ramifera and A. shenzhenica) was 1.5 times
higher than that from other orchids were considered
expanded in Apostasia.
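The expansion criterion above can be sketched as a simple filter over per-species gene counts. The text does not say whether total or per-species counts were compared, so this sketch assumes mean copy number per species and reads "1.5 times higher" as a ratio of at least 1.5. The species keys, family names, and function names are hypothetical.

```python
# Hypothetical species keys; the orchid sets follow the species named in the text.
APOSTASIA = {"A_ramifera", "A_shenzhenica"}
OTHER_ORCHIDS = {"P_equestris", "P_aphrodite", "D_officinale", "D_catenatum"}


def mean_count(counts, species):
    """Average gene count per species for one gene family."""
    return sum(counts.get(sp, 0) for sp in species) / len(species)


def expanded_in_apostasia(counts, ratio=1.5):
    """Return True if the family passes the 1.5x rule as interpreted here."""
    apo = mean_count(counts, APOSTASIA)
    other = mean_count(counts, OTHER_ORCHIDS)
    if other == 0:
        # Family absent from the other orchids: expanded only if present in Apostasia.
        return apo > 0
    return apo >= ratio * other


families = {
    "OG1": {"A_ramifera": 6, "A_shenzhenica": 6, "P_equestris": 2,
            "P_aphrodite": 3, "D_officinale": 2, "D_catenatum": 3},
    "OG2": {"A_ramifera": 2, "A_shenzhenica": 2, "P_equestris": 2,
            "P_aphrodite": 2, "D_officinale": 2, "D_catenatum": 2},
}
expanded = [fam for fam, counts in families.items() if expanded_in_apostasia(counts)]
```

Using means rather than totals keeps the comparison fair when the two groups contain different numbers of species.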
Phylogenetic analysis
To build a high-confidence phylogenetic tree, we constructed
a multi-species protein set containing protein sequences
from A. ramifera and 15 other species: 11 monocots
(Spirodela polyrhiza, D. catenatum, P. equestris, A.
shenzhenica, A. officinalis, Ananas comosus, Musa acuminata,
Phoenix dactylifera, Brachypodium distachyon, S. bicolor,
and O. sativa), three eudicots (Vitis vinifera, A.
thaliana, and Populus trichocarpa), and the outgroup
Amborella trichopoda. Protein sequences that contained fewer
than 50 amino acids were removed from the constructed
dataset. The pairwise similarities between protein sequences
were calculated through all-against-all BLASTP with the following cutoff
criteria: (i) E-value < 1e-5, (ii) query coverage > 30 %, and (iii)
alignment identity > 30 %. The results were then entered into
OrthoMCL [52] (v2.0.9) to construct orthologous groups. In
total, 381 single-copy gene families shared by all 16 species
were used to construct a species tree using MrBayes [53]
with the GTR + invgamma model. PAML MCMCTree [54]
was used to estimate the species divergence times with the
following time calibrations: (i) O. sativa and B. distachyon
divergence time (40–54 million years ago) [55], (ii) P.
trichocarpa and A. thaliana divergence time (100–120 million
years ago) [56], (iii) lower boundary of monocot and eudicot
divergence time (140 million years ago) [57], and (iv) upper
boundary for angiosperm divergence time (200 million years
ago) [58]. Gene family expansions and contractions were
identified using CAFE [12].
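The three BLASTP cutoffs above can be applied to tabular output with a small filter. This is a minimal sketch, assuming the default 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and a separately supplied table of query lengths. The function names and the example hits are hypothetical, and only the query side of the length filter is shown.

```python
def passes_cutoffs(fields, query_lengths,
                   max_evalue=1e-5, min_coverage=30.0, min_identity=30.0):
    """Apply the three cutoffs to one tabular BLASTP hit.

    fields: list of the 12 default tabular columns as strings.
    query_lengths: dict mapping query ID -> protein length in amino acids.
    """
    qseqid = fields[0]
    pident = float(fields[2])
    qstart, qend = int(fields[6]), int(fields[7])
    evalue = float(fields[10])
    # Query coverage as a percentage of the full query length.
    coverage = 100.0 * (qend - qstart + 1) / query_lengths[qseqid]
    return evalue < max_evalue and coverage > min_coverage and pident > min_identity


def filter_hits(lines, query_lengths, min_length=50):
    """Yield hits that pass the cutoffs, skipping queries shorter than min_length."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if query_lengths.get(fields[0], 0) < min_length:
            continue  # short proteins are excluded before clustering
        if passes_cutoffs(fields, query_lengths):
            yield fields


lengths = {"q1": 100, "q2": 200, "q3": 40}
hits = [
    "q1\ts1\t45.0\t80\t44\t0\t1\t80\t5\t84\t1e-20\t150",  # passes all cutoffs
    "q1\ts2\t25.0\t80\t60\t0\t1\t80\t5\t84\t1e-20\t90",   # identity too low
    "q2\ts3\t90.0\t20\t2\t0\t1\t20\t1\t20\t1e-8\t40",     # coverage too low
    "q3\ts4\t99.0\t40\t0\t0\t1\t40\t1\t40\t1e-30\t80",    # query too short
]
kept = list(filter_hits(hits, lengths))
```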
Heterozygosity analysis and estimation of effective
population size
Identification of heterozygous loci was performed via a
previously described method [59]. Briefly, clean reads
were aligned to the genome sequence of A. ramifera
using the BWA tool [60]. Duplicate reads were then re-
moved by Picard. SAMtools [61] was used for calling
heterozygous loci, and bcftools was used for generating
consensus sequences. The effective population sizes of
the orchid species were estimated using the PSMC
program [13]. The parameters for PSMC analysis were set
to default except for -g 4 and -u 0.5 × 10⁻⁸.
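To show how a generation time of 4 years and a mutation rate of 0.5 × 10⁻⁸ per site per generation enter the analysis, the sketch below applies the usual PSMC scaling: N0 = θ0 / (4μs), time in years = 2·N0·t·g, and effective population size = N0·λ. It assumes a bin size s of 100 bp, which is the usual PSMC input setting but is not stated in the text. The function name and the input values are hypothetical.

```python
def scale_psmc(theta0, intervals, mu=0.5e-8, generation_years=4.0, bin_size=100):
    """Convert PSMC's scaled output to years and effective population size.

    theta0: scaled mutation rate per bin reported by PSMC.
    intervals: list of (t_k, lambda_k) pairs in PSMC's scaled units.
    mu: mutation rate per site per generation (the -u value).
    generation_years: generation time in years (the -g value).
    bin_size: bases per bin in the PSMC input (assumed to be 100).
    Returns a list of (time_in_years, effective_population_size) pairs.
    """
    # Reference population size implied by theta0 = 4 * N0 * mu * bin_size.
    n0 = theta0 / (4.0 * mu * bin_size)
    return [(2.0 * n0 * t_k * generation_years, n0 * lam_k)
            for t_k, lam_k in intervals]


# Hypothetical scaled values, not results from the study.
scaled = scale_psmc(0.02, [(0.5, 2.0)])
```

Because both axes scale with 1/μ, halving the assumed mutation rate doubles every inferred time and population size, so the choice of -u shifts the curves without changing their shape.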
Identification of MADS-box, TPS, NAC, WRKY, R, and LOX genes
Hidden Markov model profiles [62] were used to
search for MADS-box (Pfam accession: PF00319), TPS
(Pfam accessions: PF01397 and PF03936), NAC (Pfam
accession: PF02365), WRKY (Pfam accession: PF03106), and R
(Pfam accession: PF00931) genes using HMMER [63]
(v3.2.1). MADS-box genes in A. thaliana reported in [3] were
used to reconstruct gene trees with the MADS-box genes
identified in A. ramifera and V. planifolia. EvolView [64] was
used to visualize the number of members in the NAC,
WRKY, and R gene families for the selected species. For the
TPS genes, the protein sequences that possessed both Pfam
domains and contained more than 500 amino acids were
considered functional genes and used for further analysis.
To identify LOX genes, protein sequences of the LOX gene
family in A. thaliana (Gene IDs: AT1G55020, AT1G72520,
AT1G67560, AT1G17420, AT3G22400, and AT3G45140)
were used to search for homologous genes in orchids. The
identified protein sequences of each gene family were aligned
using MUSCLE [65] (v3.8.31) with default settings. MEGA7
[66] was then used to construct an unrooted neighbor-joining
tree for each gene family with 500 bootstrap
replicates.
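The TPS selection rule above (both Pfam domains present and more than 500 amino acids) can be expressed as a short filter. This is a minimal sketch that assumes the domain hits have already been parsed into a set of Pfam accessions per protein. The protein IDs, the input layout, and the function name are hypothetical.

```python
# The two TPS Pfam accessions named in the text.
TPS_DOMAINS = {"PF01397", "PF03936"}


def functional_tps(proteins, min_length=500):
    """Select candidate functional TPS proteins.

    proteins: dict mapping protein ID -> (length_aa, set of Pfam accessions hit).
    Keeps proteins that carry both TPS domains and exceed min_length residues.
    """
    return sorted(
        pid for pid, (length, domains) in proteins.items()
        if TPS_DOMAINS <= domains and length > min_length
    )


candidates = {
    "g1": (560, {"PF01397", "PF03936"}),             # kept
    "g2": (620, {"PF03936"}),                        # only one domain
    "g3": (480, {"PF01397", "PF03936"}),             # too short
    "g4": (501, {"PF01397", "PF03936", "PF00067"}),  # kept, extra domain allowed
}
selected = functional_tps(candidates)
```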
Abbreviations
TEs: Transposable elements; GO: Gene Ontology; KEGG: Kyoto Encyclopedia
of Genes and Genomes; PSMC: Pairwise sequential Markov coalescent;
TPS: Terpene synthase
Supplementary Information
The online version contains supplementary material available at
https://doi.org/10.1186/s12864-021-07852-3.
Additional file 1.
Acknowledgements
Not applicable.
Authors' contributions
G.Z., H.H., Z.L., and J.C. designed and managed the project. J.C. and W.Z.
wrote and revised the manuscript. W.Z., G.Z., P.Z., and Y.Z. contributed to
genome sequencing and data analysis. The final manuscript was read and
approved by all authors.
Funding
This study was funded by the Science and Technology Development Fund
Macau SAR (File no. 031/2017/A1) to H.H., Talents Team Construction Fund
of Northwestern Polytechnical University (NWPU) to J.C., Fundamental
Research Funds for the Central Universities (3102019JC007) to J.C., and
National Thousand Youth Talents Plan to J.C. The funders played no role in
the study.
Availability of data and materials
Raw data and the genome assembly from this study were deposited in NCBI
under the BioProject ID: PRJNA635894. The datasets supporting the
conclusions of this article are included within the article and its additional
files.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Author details
1 State Key Laboratory of Quality Research in Chinese Medicine, Institute of
Chinese Medical Sciences, University of Macau, 999078 Macau, China.
2 Key Laboratory of National Forestry and Grassland Administration for Orchid
Conservation and Utilization, 518114 Shenzhen, China.
3 Shenzhen Key Laboratory for Orchid Conservation and Utilization, 518114 Shenzhen, China.
4 National Orchid Conservation Center of China and Orchid Conservation and
Research Center of Shenzhen, 518114 Shenzhen, China.
5 Key Laboratory of NFGA for Orchid Conservation and Utilization, Fujian Agriculture and Forestry
University, 350002 Fuzhou, China.
6 School of Ecology and Environment, Northwestern Polytechnical University, 710129 Xi'an, China.
Received: 19 June 2020 Accepted: 23 June 2021
References
1. Christenhusz MJM, Byng JW. The number of known plants species in the
world and its annual increase. Phytotaxa. 2016;261(3).
2. Cai J, Liu X, Vanneste K, Proost S, Tsai WC, Liu KW, et al. The genome
sequence of the orchid Phalaenopsis equestris. Nat Genet. 2015;47(1):65–72.
3. Zhang GQ, Liu KW, Li Z, Lohaus R, Hsiao YY, Niu SC, et al. The Apostasia
genome and the evolution of orchids. Nature. 2017;549(7672):379–83.
4. Wei S, Shih C-C, Chen N-H, Tung S-J, editors. Value chain dynamics in the
Taiwan orchid industry. I International Orchid Symposium 878; 2010.
5. Zhang GQ, Xu Q, Bian C, Tsai WC, Yeh CM, Liu KW, et al. The Dendrobium
catenatum Lindl. genome sequence provides insights into polysaccharide
synthase, floral development and adaptive evolution. Sci Rep. 2016;6:19029.
6. Yan L, Wang X, Liu H, Tian Y, Lian J, Yang R, et al. The Genome of
Dendrobium officinale Illuminates the Biology of the Important Traditional
Chinese Orchid Herb. Mol Plant. 2015;8(6):922–34.
7. Chao YT, Chen WC, Chen CY, Ho HY, Yeh CH, Kuo YT, et al. Chromosome-
level assembly, genetic and physical mapping of Phalaenopsis aphrodite
genome provides new insights into species adaptation and resources for
orchid breeding. Plant Biotechnol J. 2018;16(12):2027–41.
8. Hu Y, Resende MF, Bombarely A, Brym M, Bassil E, Chambers AH. Genomics-
based diversity analysis of Vanilla species using a Vanilla planifolia draft
genome and Genotyping-By-Sequencing. Scientific reports. 2019;9(1):1–16.
9. Kocyan A, Qiu Y-L, Endress P, Conti E. A phylogenetic analysis of
Apostasioideae (Orchidaceae) based on ITS, trnL-F and matK sequences.
Plant Syst Evol. 2004;247(3–4):203–13.
10. Kocyan A, Endress PK. Floral structure and development of Apostasia and
Neuwiedia (Apostasioideae) and their relationships to other Orchidaceae. Int
J Plant Sci. 2001;162(4):847–67.
11. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM.
BUSCO: assessing genome assembly and annotation completeness with
single-copy orthologs. Bioinformatics. 2015;31(19):3210–2.
12. De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool
for the study of gene family evolution. Bioinformatics. 2006;22(10):1269–71.
13. Li H, Durbin R. Inference of human population history from individual
whole-genome sequences. Nature. 2011;475(7357):493–6.
14. Gillman MP, Dodd M. The variability of orchid population size. Botanical
journal of the Linnean Society. 1998;126(1–2):65–74.
15. Alexandersson R, Ågren J. Population size, pollinator visitation and fruit
production in the deceptive orchid Calypso bulbosa. Oecologia. 1996;107(4):533–40.
16. Masiero S, Colombo L, Grini PE, Schnittger A, Kater MM. The emerging
importance of type I MADS box transcription factors for plant reproduction.
Plant Cell. 2011;23(3):865–72.
17. Pazouki L, Niinemets Ü. Multi-substrate terpene synthases: their occurrence
and physiological significance. Front Plant Sci. 2016;7:1019.
18. Karunanithi PS, Zerbe P. Terpene synthases as metabolic gatekeepers in the
evolution of plant terpenoid chemical diversity. Frontiers in plant science.
2019;10:1166.
19. Chen F, Tholl D, Bohlmann J, Pichersky E. The family of terpene synthases in
plants: a mid-size family of genes for specialized metabolism that is highly
diversified throughout the kingdom. Plant J. 2011;66(1):212–29.
20. Tian D, Traw M, Chen J, Kreitman M, Bergelson J. Fitness costs of
R-gene-mediated resistance in Arabidopsis thaliana. Nature. 2003;423(6935):74–7.
21. Yuan X, Wang H, Cai J, Li D, Song F. NAC transcription factors in plant
immunity. Phytopathology Research. 2019;1(1):1–13.
22. Pandey SP, Somssich IE. The role of WRKY transcription factors in plant
immunity. Plant physiology. 2009;150(4):1648–55.
23. Vellosillo T, Martínez M, López MA, Vicente J, Cascón T, Dolan L, et al.
Oxylipins produced by the 9-lipoxygenase pathway in Arabidopsis regulate
lateral root development and defense responses through a specific
signaling cascade. Plant Cell. 2007;19(3):831–46.
24. Tsai W-C, Chen H-H. The orchid MADS-box genes controlling floral
morphogenesis. The Scientific World Journal. 2006;6:1933–44.
25. Liu Y, Cui S, Wu F, Yan S, Lin X, Du X, et al. Functional conservation of
MIKC*-Type MADS box genes in Arabidopsis and rice pollen maturation.
Plant Cell. 2013;25(4):1288–303.
26. Kwantes M, Liebsch D, Verelst W. How MIKC* MADS-box genes originated and
evidence for their conserved function throughout the evolution of vascular plant
gametophytes. Molecular biology evolution. 2012;29(1):293–302.
27. Valoroso MC, Censullo MC, Aceto S. The MADS-box genes expressed in the
inflorescence of Orchis italica (Orchidaceae). PloS one. 2019;14(3).
28. Smith S, De Smet I. Root system architecture: insights from Arabidopsis and
cereal crops. The Royal Society; 2012.
29. Li X, Zeng R, Liao H. Improving crop nutrient efficiency through root
architecture modifications. Journal of integrative plant biology. 2016;58(3):193–202.
30. Kiba T, Krapp A. Plant nitrogen acquisition under low availability: regulation
of uptake and root architecture. Plant Cell Physiol. 2016;57(4):707–14.
31. Du Y, Scheres B. Lateral root formation and the multiple roles of auxin. J
Exp Bot. 2018;69(2):155–67.
32. Fukaki H, Tasaka M. Hormone interactions during lateral root formation.
Plant molecular biology. 2009;69(4):437.
33. Murray M, Thompson WF. Rapid isolation of high molecular weight plant
DNA. Nucleic acids research. 1980;8(19):4321–6.
34. Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. SOAPdenovo2: an
empirically improved memory-efficient short-read de novo assembler.
Gigascience. 2012;1(1):2047-217X-1-18.
35. Benson G. Tandem repeats finder: a program to analyze DNA sequences.
Nucleic acids research. 1999;27(2):573–80.
36. Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J.
Repbase Update, a database of eukaryotic repetitive elements. Cytogenet
Genome Res. 2005;110(1–4):462–7.
37. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive
elements in genomic sequences. Current protocols in bioinformatics. 2009;
25(1):4.10.1–4.10.14.
38. Tempel S. Using and understanding RepeatMasker. Mobile Genetic
Elements: Springer; 2012. pp. 29–51.
39. Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length
LTR retrotransposons. Nucleic acids research. 2007;35(suppl_2):W265-W8.
40. Stanke M, Keller O, Gunduz I, Hayes A, Waack S, Morgenstern B. AUGUSTUS:
ab initio prediction of alternative transcripts. Nucleic acids research. 2006;
34(suppl_2):W435-W9.
41. Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open
source ab initio eukaryotic gene-finders. Bioinformatics. 2004;20(16):2878–9.
42. Birney E, Clamp M, Durbin R. GeneWise and genomewise. Genome
research. 2004;14(5):988–95.
43. Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database
management tool for second-generation genome projects. BMC Bioinform.
2011;12(1):491.
44. Nawrocki EP, Kolbe DL, Eddy SR. Infernal 1.0: inference of RNA alignments.
Bioinformatics. 2009;25(10):1335–7.
45. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of
transfer RNA genes in genomic sequence. Nucleic acids research. 1997;25(5):955–64.
46. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene
ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25–9.
47. Consortium GO. The gene ontology resource: 20 years and still GOing
strong. Nucleic acids research. 2019;47(D1):D330-D8.
48. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes.
Nucleic acids research. 2000;28(1):27–30.
49. Bairoch A, Apweiler R. The SWISS-PROT protein sequence database and its
supplement TrEMBL in 2000. Nucleic acids research. 2000;28(1):45–8.
50. Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, et al.
InterPro: the integrative protein signature database. Nucleic acids research.
2009;37(suppl_1):D211-D5.
51. Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole
genome comparisons dramatically improves orthogroup inference accuracy.
Genome biology. 2015;16(1):157.
52. Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for
eukaryotic genomes. Genome research. 2003;13(9):2178–89.
53. Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic
trees. Bioinformatics. 2001;17(8):754–5.
54. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Molecular
biology evolution. 2007;24(8):1586–91.
55. Initiative IB. Genome sequencing and analysis of the model grass
Brachypodium distachyon. Nature. 2010;463(7282):763.
56. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, et al.
The genome of black cottonwood, Populus trichocarpa (Torr. & Gray).
Science. 2006;313(5793):1596–604.
57. Chaw S-M, Chang C-C, Chen H-L, Li W-H. Dating the monocot–dicot
divergence and the origin of core eudicots using whole chloroplast
genomes. Journal of molecular evolution. 2004;58(4):424–41.
58. Magallón S, Hilu KW, Quandt D. Land plant evolutionary timeline: gene
effects are secondary to fossil constraints in relaxed clock estimation of age
and substitution rates. Am J Bot. 2013;100(3):556–73.
59. Chaw S-M, Liu Y-C, Wu Y-W, Wang H-Y, Lin C-YI, Wu C-S, et al. Stout
camphor tree genome fills gaps in understanding of flowering plant
genome evolution. Nature plants. 2019;5(1):63–73.
60. Li H, Durbin R. Fast and accurate short read alignment with
Burrows–Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
61. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The
sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
62. El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, et al. The
Pfam protein families database in 2019. Nucleic acids research. 2019;47(D1):
D427-D32.
63. Eddy SR. Accelerated profile HMM searches. PLoS computational biology.
2011;7(10).
64. Subramanian B, Gao S, Lercher MJ, Hu S, Chen W-H. Evolview v3: a
webserver for visualization, annotation, and management of phylogenetic
trees. Nucleic acids research. 2019;47(W1):W270-W5.
65. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and
high throughput. Nucleic acids research. 2004;32(5):1792–7.
66. Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics
analysis version 7.0 for bigger datasets. Molecular biology evolution. 2016;
33(7):1870–4.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Zhang et al. BMC Genomics (2021) 22:536 Page 12 of 12
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
1.
2.
3.
4.
5.
6.
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
apply.
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
not:
use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access
control;
use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is
otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in
writing;
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal
content.
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at
onlineservice@springernature.com
... Even within a specific range, the results may sometimes be erroneous or contradictory to traditional morphological classification. Moreover, the majority of studies [69][70][71][72][73] on orchid morphology or genetic evolution primarily focus on epiphytic orchids within the entire orchid family or subfamily, with limited analysis conducted on species diversity within Cymbidium species and particularly the evolutionary disparities among Cymbidium species from different regions. ...
Article
Full-text available
The genus Cymbidium, with its intricate floral elements, pronounced endemicity, and patchy distribution, evolves a rich diversity of morphological forms and a wide variety of species while causing an indistinctness in the classification of its species. To elucidate the phylogenetic relationships among Cymbidium species and enhance their taxonomic classification by DNA barcoding, this study conducted amplification and sequence results of nuclear (ITS) and chloroplast genes (matK, rbcL, trnL-F, psbA-trnH) with phenotypic genetic diversity analysis, genetic distance analysis, and phylogenetic analysis from 48 samples of Cymbidium species. The comparison of genetic distance variations showed that psbA-trnH, ITS + psbA-trnH, and ITS + matK + psbA-trnH exhibit minimal overlap and significant genetic variation within Cymbidium species. The phylogenetic analysis indicated that the combination, ITS + matK + psbA-trnH, has the highest identification rate. Notably, both the phylogenetic analysis and the genetic diversity analysis of phenotypic traits consistently indicated a clear divergence between epiphytic and terrestrial orchids, with epiphytic orchids forming a distinct clade. This provides reference evidence for studying the ecological adaptations and evolutionary differences between epiphytic and terrestrial orchids, as well as a scientific basis for the classification and identification, germplasm conservation, resource utilization, and phylogenetic evolution of orchids.
... These genomes indicate the orchids have undergone two whole-genome duplication (WGD) events, the most recent of which was shared by all orchids, whereas the older event was shared by most monocots (Van de Peer et al., 2017; Zhang et al., 2017). Changes within MADS-box gene classes, identified through these genomes, might have contributed to variations in labellum and pollinium morphology and accessory structures (Chao et al., 2018; Ai et al., 2021; Sun et al., 2021; Zhang et al., 2021b, 2021c). ...
Article
Orchidaceae are one of the largest families of angiosperms in terms of species richness. In the last decade, numerous studies have delved into reconstructing the phylogenetic framework of Orchidaceae, leveraging data from plastid, mitochondrial and nuclear sources. These studies have provided new insights into the systematics, diversification and biogeography of Orchidaceae, establishing a robust foundation for future research. Nevertheless, pronounced controversies persist regarding the precise placement of certain lineages within these phylogenetic frameworks. To address these discrepancies and deepen our understanding of the phylogenetic structure of Orchidaceae, we provide a comprehensive overview and analysis of phylogenetic studies focusing on contentious groups within Orchidaceae since 2015, delving into discussions on the underlying reasons for observed topological conflicts. We also provide a novel phylogenetic framework at the subtribal level. Furthermore, we examine the tempo and mode underlying orchid species diversity from the perspective of historical biogeography, highlighting factors contributing to extensive speciation. Ultimately, we delineate avenues for future research aimed at enhancing our understanding of Orchidaceae phylogeny and diversity.
... There are more than 28,000 species and 850 genera in Orchidaceae, which represents approximately 10% of all flowering plants worldwide and has the largest number of species (Chase et al., 2015). Orchids are remarkable for shedding light on plant evolution; with more complete orchid genomes now available, researchers have gained significant insight into the genetic foundations of orchid biology (Zhang et al., 2021a). Extensive research has been conducted on CYP75s in model plants, but there is currently limited knowledge about the characteristics of these genes in the Orchidaceae. ...
Article
With a great diversity of species, Orchidaceae stands out as an essential component of plant biodiversity, making it a primary resource for studying angiosperm evolution and genomics. This study focuses on 13 published orchid genomes to identify and analyze the CYP75 gene family belonging to the cytochrome P450 superfamily, which is closely related to flavonoid biosynthetic enzymes and pigment regulation. We found 72 CYP75s in the 13 orchid genomes and further classified them into two subfamilies, CYP75A and CYP75B: the former synthesizes blue anthocyanins, while the latter is involved in the production of red anthocyanins. Furthermore, the number of CYP75Bs (53/72) greatly exceeds the number of CYP75As (19/72) in orchids. Our findings suggest that CYP75B genes have a more important evolutionary role, as red plants are more common in nature than blue plants. We also discovered unique conserved motifs in each subfamily that serve as specific recognition features (motif 19 belongs to CYP75A; motif 17 belongs to CYP75B). Two diverse-colored varieties of C. goeringii were selected for qRT-PCR experiments. The expression of CgCYP75B1 was significantly higher in the purple-red variant compared to the yellow-green variant, while CgCYP75A1 showed no significant difference. Based on transcriptomic expression analysis, CYP75Bs are more highly expressed than CYP75As in floral organs, especially in colorful petals and lips. These results provide valuable information for future studies on CYP75s in orchids and other angiosperms.
... Climate changes during the Quaternary glacial and interglacial periods directly influenced the distribution of plant taxa and the size of plant habitats [24]. The RR and RH effective population sizes tended to decrease sharply during the glacial period and stabilized or increased during the interglacial period, which is in accordance with the related findings for orchids [25], Rhododendron [26,27], and other plant groups. ...
Preprint
Rose is an important aromatic plant and produces flowers that are used in medicine and food. We herein present a haplotype-resolved genome for Rosa rugosa cultivar Hanxiang. Analyses of allele-specific expression identified a potential mechanism underlying floral scent biosynthesis. Population genomic analyses involving 133 Rosa accessions elucidated evolutionary histories and a single R. rugosa domestication event. Pathways mediating the synthesis of scent-related metabolites were enriched according to the analyses of the transcriptomes, haplotype variations, and allelic imbalances during the flower development stages of Hanxiang and Guomeigui (R. rugosa accessions with diverse fragrances). The enzyme-encoding ASE genes RrHX1G119800 and RrHX1G204700 (primary amine oxidases) and RrHX2G284700 (L-tryptophan decarboxylase) in the phenylethylamine pathway were tentatively designated as core genes useful for improving 2-phenylethanol production in rose flowers. Our results provide molecular insights into the formation of R. rugosa floral fragrances and genome-level data that are useful for enhancing rose traits via genetic engineering.
Article
Repetitive sequences can lead to variation in DNA quantity and composition among species. The Orchidaceae, the largest angiosperm family, is divided into five subfamilies, with Apostasioideae as the basal group and Orchidoideae and Epidendroideae showing high diversification rates. Despite their different evolutionary paths, some species in these groups have similar nuclear DNA content. This study focuses on one example to understand the dynamics of major repetitive DNAs in the nucleus. We used Next-Generation Sequencing (NGS) data from Apostasia wallichii (Apostasioideae) and Ludisia discolor (Orchidoideae) to identify and quantify the most abundant repeats. The repetitive fraction varied in abundance (27.5% in L. discolor and 60.6% in A. wallichii) and composition, with LTR retrotransposons of different lineages being the most abundant repeats in each species. Satellite DNAs showed varying organization and abundance. Despite the unbalanced ratio between single-copy and repetitive DNA sequences, the two species had the same genome size, possibly due to the elimination of non-essential genes. This phenomenon has been observed in other Apostasia and likely led to the proliferation of transposable elements in A. wallichii. Deep genome information in the future will aid in understanding the contraction/expansion of gene families and the evolution of sequences in these genomes.
Article
There are nearly 30,000 species of orchids globally, of which over 1,700 species are found in China. Orchids share a profound and intimate connection with Chinese society. With the rapid development of science and technology, China's orchid industry has flourished with many scientific and technological achievements. Here, we summarize the developmental history, current situation, latest research achievements, and industrialization technology of the orchid industry in China, and present a discussion and outlook on the future development direction of orchid research in China. This review unveils new prospects for the high-quality advancement of China's orchid industry.
Chapter
Genome sequences and gene expression provide important insights into the evolution and function of gene families. A database of complete genome sequences for many plant species, including orchids, is now available. Additionally, transcriptomics via next-generation sequencing can be used to analyze the regulatory mechanisms of various biological processes at the molecular level in many plant species, even nonmodel and wild plants. Recently, whole-genome sequencing and transcriptomic studies have been conducted on some orchids, unveiling the mechanisms underlying orchid mycorrhizal (OM) symbiosis, one of the most important features of Orchidaceae. Because orchids obtain nutrients from their symbiotic fungi during seed germination or even throughout their whole life cycle (mycoheterotrophy), OM symbiosis differs from mutualism, such as arbuscular mycorrhizal (AM) symbiosis. The genetic information of orchids provides a better understanding of how OM symbiosis has evolved, how orchids maintain a delicate balance of immune control during symbiosis, and how OM and AM symbioses differ. This knowledge will help establish a method for maintaining OM symbiosis, which is essential for orchids, and for conserving threatened orchids. The objectives of this chapter are (i) to review genetic study methodologies because practical guidelines of orchid species’ genome sequence and transcriptome analysis are unavailable and (ii) to summarize studies on OM symbiosis. Keywords: Genomics; Orchid mycorrhizal symbiosis; Transcriptomics
Article
We present the latest version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which contains many sophisticated methods and tools for phylogenomics and phylomedicine. In this major upgrade, MEGA has been optimized for use on 64-bit computing systems for analyzing bigger datasets. Researchers can now explore and analyze tens of thousands of sequences in MEGA. The new version also provides an advanced wizard for building timetrees and includes a new functionality to automatically predict gene duplication events in gene family trees. The 64-bit MEGA is made available in two interfaces: graphical and command line. The graphical user interface (GUI) is a native Microsoft Windows application that can also be used on Mac OSX. The command line MEGA is available as native applications for Windows, Linux, and Mac OSX. They are intended for use in high-throughput and scripted analysis. Both versions are available from www.megasoftware.net free of charge.
Article
Terpenoids comprise tens of thousands of small molecule natural products that are widely distributed across all domains of life. Plants produce by far the largest array of terpenoids with various roles in development and chemical ecology. Driven by selective pressure to adapt to their specific ecological niche, individual species form only a fraction of the myriad plant terpenoids, typically representing unique metabolite blends. Terpene synthase (TPS) enzymes are the gatekeepers in generating terpenoid diversity by catalyzing complex carbocation-driven cyclization, rearrangement, and elimination reactions that enable the transformation of a few acyclic prenyl diphosphate substrates into a vast chemical library of hydrocarbon and, for a few enzymes, oxygenated terpene scaffolds. The seven currently defined clades (a-h) forming the plant TPS family evolved from ancestral triterpene synthase- and prenyl transferase–type enzymes through repeated events of gene duplication and subsequent loss, gain, or fusion of protein domains and further functional diversification. Lineage-specific expansion of these TPS clades led to variable family sizes that may range from a single TPS gene to families of more than 100 members that may further function as part of modular metabolic networks to maximize the number of possible products. Accompanying gene family expansion, the TPS family shows a profound functional plasticity, where minor active site alterations can dramatically impact product outcome, thus enabling the emergence of new functions with minimal investment in evolving new enzymes. This article reviews current knowledge on the functional diversity and molecular evolution of the plant TPS family that underlies the chemical diversity of bioactive terpenoids across the plant kingdom.
Article
Evolview is an interactive tree visualization tool designed to help researchers visualize phylogenetic trees and annotate them with additional information. It offers users a platform to upload trees in most common tree formats, such as Newick/Phylip, Nexus, Nhx and PhyloXML, and provides a range of visualization options, using fifteen types of custom annotation datasets. The new version of Evolview was designed to provide simple tree uploads, manipulation and viewing options with additional annotation types. The 'dataset system' used for visualizing tree information has evolved substantially from the previous version, and the user can draw on a wide range of additional example visualizations. Developments since the last public release include a complete redesign of the user interface, new annotation dataset types, additional tree visualization styles, full-text search of the documentation, and some backend updates. The project management aspect of Evolview was also updated, with a unified approach to tree and project management and sharing. Evolview is freely available at: https://www.evolgenius.info/evolview/.
Article
The Gene Ontology resource (GO; http://geneontology.org) provides structured, computable knowledge regarding the functions of genes and gene products. Founded in 1998, GO has become widely adopted in the life sciences, and its contents are under continual improvement, both in quantity and in quality. Here, we report the major developments of the GO resource during the past two years. Each monthly release of the GO resource is now packaged and given a unique identifier (DOI), enabling GO-based analyses on a specific release to be reproduced in the future. The molecular function ontology has been refactored to better represent the overall activities of gene products, with a focus on transcription regulator activities. Quality assurance efforts have been ramped up to address potentially out-of-date or inaccurate annotations. New evidence codes for high-throughput experiments now enable users to filter out annotations obtained from these sources. GO-CAM, a new framework for representing gene function that is more expressive than standard GO annotations, has been released, and users can now explore the growing repository of these models. We also provide the ‘GO ribbon’ widget for visualizing GO annotations to a gene; the widget can be easily embedded in any web page.
Article
Demand for all-natural vanilla flavor is increasing, but its botanical source, Vanilla planifolia, faces critical challenges arising from a narrow germplasm base and supply limitations. Genomics tools are the key to overcoming these limitations by enabling advanced genetics and plant breeding for new cultivars with improved yield and quality. The objective of this work was to establish the genomic resources needed to facilitate analysis of diversity among Vanilla accessions and to provide a resource to analyze other Vanilla collections. A V. planifolia draft genome was assembled and used to identify 521,732 single nucleotide polymorphism (SNP) markers using Genotyping-By-Sequencing (GBS). The draft genome had a size of 2.20 Gb representing 97% of the estimated genome size. A filtered set of 5,082 SNPs was used to genotype a living collection of 112 Vanilla accessions from 23 species including native Florida species. Principal component analysis of the genetic distances, population structure, and the maternally inherited rbcL gene identified putative hybrids, misidentified accessions, significant diversity within V. planifolia, and evidence for 12 clusters that separate accessions by species. These results validate the efficiency of genomics-based tools to characterize and identify genetic diversity in Vanilla and provide a significant tool for genomics-assisted plant breeding.
Article
The Orchidaceae family, which is one of the most species-rich flowering plant families, includes species with highly diversified and specialized flower shapes. The aim of this study was to analyze the MADS-box genes expressed in the inflorescence of Orchis italica, a wild Mediterranean orchid species. MADS-box proteins are transcription factors involved in various plant biological processes, including flower development. In the floral tissues of O. italica, 29 MADS-box genes are expressed that are classified as both class I and II. Class I MADS-box genes include one Mβ-type gene, thereby confirming the presence of this type of MADS-box genes in orchids. The class II MIKC* gene is highly expressed in the column, which is consistent with the conserved function of the MIKC* genes in gametophyte development. In addition, homologs of the SOC, SVP, ANR1, AGL12 and OsMADS32 genes are expressed. Compared with previous knowledge on class II MIKCc genes of O. italica involved in the ABCDE model of flower development, the number of class B and D genes has been confirmed. In addition, 4 class A (AP1/FUL) transcripts, 2 class E (SEP) transcripts, 2 new class C (AG) transcripts and 1 new AGL6 transcript have been identified. Within the AP1/FUL genes, the sequence divergence, relaxation of purifying selection and expression profiles suggest a possible functional diversification within these orchid genes. The detection of only two SEP transcripts in O. italica, in contrast with the 4 genes found in other orchids, suggests that only two SEP genes could be present in the subfamily Orchidoideae. The expression pattern of the MIKCc genes of O. italica indicates that low levels at the boundary of the domain of a given MADS-box gene can overlap with the expression of genes belonging to a different functional A-E class in the adjacent domain, thereby following a “fading borders” model.
Article
The NAC (NAM, ATAF and CUC) family is one of the largest plant-specific transcription factor (TF) families. Members of this family are implicated in plant growth, development and stress responses. Recent functional studies demonstrate that a number of NAC TFs function as positive or negative regulators of plant immunity to biotrophic, hemibiotrophic or necrotrophic pathogens, as modulators of the hypersensitive responses and stomatal immunity or as virulence targets of pathogen effectors. They affect plant immunity through their regulatory impact on signaling of plant hormones, which in turn are key players in plant immune responses. This review summarizes current knowledge and recent progress in our understanding of the biological functions of NAC TFs in plant immunity and discusses perspectives and directions for further study to elucidate the molecular mechanisms of NAC TF functions in immunity and potential application in improvement of crop disease resistance.
Article
We present reference-quality genome assembly and annotation for the stout camphor tree (Cinnamomum kanehirae (Laurales, Lauraceae)), the first sequenced member of the Magnoliidae comprising four orders (Laurales, Magnoliales, Canellales and Piperales) and over 9,000 species. Phylogenomic analysis of 13 representative seed plant genomes indicates that magnoliid and eudicot lineages share more recent common ancestry than monocots. Two whole-genome duplication events were inferred within the magnoliid lineage: one before divergence of Laurales and Magnoliales and the other within the Lauraceae. Small-scale segmental duplications and tandem duplications also contributed to innovation in the evolutionary history of Cinnamomum. For example, expansion of the terpenoid synthase gene subfamilies within the Laurales spawned the diversity of Cinnamomum monoterpenes and sesquiterpenes.