
The complete chloroplast genome sequence of Goodyera schlechtendaliana in Korea (Orchidaceae)


Abstract

Goodyera schlechtendaliana is a common orchid species in East Asia, providing a case for studying the phylogeographic structure of understory plants in warm temperate forests. Here, we present the complete chloroplast genome of the Korean G. schlechtendaliana. The genome is 153,801 bp long and has four subregions: a large single-copy region of 82,683 bp and a small single-copy region of 18,048 bp are separated by two inverted repeat regions of 26,535 bp each. It contains 133 genes (86 protein-coding genes, eight rRNAs, and 39 tRNAs). Phylogenetic analyses suggest that the chloroplast genomic data should be useful in future phylogeographic and phylogenetic studies of Goodyera.
Mitochondrial DNA Part B
ISSN: (Print) 2380-2359 (Online)
Sang-Hun Oh, Hwa Jung Suh, Jongsun Park, Yongsung Kim & Sangtae Kim
To cite this article: Sang-Hun Oh, Hwa Jung Suh, Jongsun Park, Yongsung Kim & Sangtae Kim
(2019) The complete chloroplast genome sequence of Goodyera schlechtendaliana in Korea
(Orchidaceae), Mitochondrial DNA Part B, 4:2, 2692-2693, DOI: 10.1080/23802359.2019.1641439
© 2019 The Author(s). Published by Informa
UK Limited, trading as Taylor & Francis
Published online: 24 Jul 2019.
Department of Biology, Daejeon University, Daejeon, Korea;
Department of Biology, Sungshin University, Seoul, Korea;
InfoBoss Co., Ltd., Seoul, Korea;
InfoBoss Research Center, Seoul, Korea
Received 17 June 2019
Accepted 22 June 2019
Keywords: Goodyera; chloroplast genome; Goodyera schlechtendaliana; Korea
Goodyera schlechtendaliana Rchb. f. (Orchidaceae) is a common orchid widely distributed in the Himalayas, Sumatra, China, Taiwan, Korea, and Japan, occupying shady places with moist and well-drained soils. Within Orchidaceae, it is characterized by creeping rhizomes, white variegated markings on the adaxial surfaces of the leaves, a saccate labellum, two sectile pollinia attached to a viscidium, and a single stigmatic lobe (Chen et al. 2009; Hu et al. 2016). It is often cultivated as an ornamental because of the patterns on its leaves. Both sexual and clonal reproduction occur in G. schlechtendaliana (Brzosko et al. 2013). Despite the wide distribution range of the species, its phylogeographic structure, that is, the differentiation among populations, has not been studied. Because the chloroplast genome is typically maternally inherited in angiosperms, it is useful for tracing seed movement and inferring geographic structure.
The complete chloroplast genome of G. schlechtendaliana from the southern part of Korea (34°41′23.28″N, 125°11′48.49″E) was determined for use in understanding infraspecific variation. Total DNA was extracted from fresh leaves collected on Hongdo Island in Shinan-gun, Jeollanam-do, Korea (voucher in the herbarium of Daejeon University (TUT); Oh 7171) using a DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany). Paired-end sequencing was performed on a HiSeq4000 (Illumina, San Diego, USA) at Macrogen Inc., Korea. De novo assembly was performed using Velvet 1.2.10 (Zerbino and Birney 2008), and gaps were filled using SOAPGapCloser 1.12 (Zhao et al. 2011), BWA 0.7.17 (Li 2013), and SAMtools 1.9 (Li et al. 2009). Geneious R11 11.0.5 (Biomatters Ltd., Auckland, New Zealand) was used for genome annotation based on the G. schlechtendaliana chloroplast genome (MK134679; Oh et al. 2019).
The chloroplast genome of Korean G. schlechtendaliana (GenBank accession: MK144665) is 153,801 bp long (GC ratio, 37.2%) and has four subregions: a large single-copy region of 82,683 bp (GC ratio, 34.9%) and a small single-copy region of 18,048 bp (GC ratio, 29.7%) are separated by two inverted repeats (IR) of 26,535 bp each (GC ratio, 43.3%). It contains 133 genes (86 protein-coding genes, eight rRNAs, and 39 tRNAs) with 19 genes (seven protein-coding genes, four rRNAs, and eight tRNAs) duplicated in the IR regions.
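The reported figures can be cross-checked with simple arithmetic. The sketch below is an illustration, not part of the original analysis: it sums the large single-copy region, the small single-copy region, and two IR copies, and computes a length-weighted GC ratio from the regional values quoted above. The function and variable names are assumptions introduced here, and the GC check is only approximate because the regional ratios are rounded to one decimal place.

```python
# Region lengths (bp) and GC ratios (%) as reported in the text.
LSC = (82_683, 34.9)
SSC = (18_048, 29.7)
IR = (26_535, 43.3)  # one IR copy; a quadripartite genome carries two


def genome_length(lsc, ssc, ir):
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc[0] + ssc[0] + 2 * ir[0]


def weighted_gc(lsc, ssc, ir):
    """Length-weighted mean GC ratio (%) over LSC, SSC, and both IR copies."""
    total = genome_length(lsc, ssc, ir)
    gc_sum = lsc[0] * lsc[1] + ssc[0] * ssc[1] + 2 * ir[0] * ir[1]
    return gc_sum / total


total_bp = genome_length(LSC, SSC, IR)
overall_gc = round(weighted_gc(LSC, SSC, IR), 1)
gene_total = 86 + 8 + 39  # protein-coding genes + rRNAs + tRNAs

print(total_bp, overall_gc, gene_total)
```

Under these inputs the sums agree with the reported totals of 153,801 bp, 37.2% GC, and 133 genes.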
Twelve complete chloroplast genomes, including eight from four species of Goodyera, two from closely allied groups, and two outgroups, were aligned using MAFFT 7.388 (Katoh and Standley 2013). Phylogenetic trees were constructed using the neighbor-joining (with 10,000 bootstrap replicates) and maximum likelihood (with 1000 bootstrap replicates) methods in MEGA X (Kumar et al. 2018).
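As a rough illustration of what a bootstrap replicate means here, the sketch below resamples alignment columns with replacement and asks how often a focal sequence remains closest to an expected partner. This is not the MEGA X procedure: it replaces tree building with a nearest-neighbor check on uncorrected p-distances, and the toy alignment, sequence names, and function names are all hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def nearest_neighbor_support(alignment, focal, expected, replicates=1000, seed=1):
    """Fraction of replicates in which `expected` is the sequence closest to `focal`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_alignment(alignment, rng)
        others = [name for name in rep if name != focal]
        closest = min(others, key=lambda name: p_distance(rep[focal], rep[name]))
        hits += closest == expected
    return hits / replicates


# Hypothetical toy alignment: A and B are identical, C differs at every site.
toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAC", "C": "TGCATGCATG"}
support = nearest_neighbor_support(toy, focal="A", expected="B", replicates=200)
print(support)
```

In a real analysis each replicate would be used to rebuild a full tree, and support for a clade would be the fraction of replicate trees containing it.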
The phylogenetic tree shows that G. schlechtendaliana from Korea forms a strongly supported clade with other accessions of G. schlechtendaliana from China (Figure 1A). The result agrees with morphology and a previous phylogenetic analysis based on nuclear ITS regions (Hu et al. 2016). Comparison of five chloroplast genomes of G. schlechtendaliana showed 200–844 single nucleotide polymorphisms and 414–2,133 insertions and deletions among accessions (Figure 1B), suggesting a high level of infraspecific variation compared with those in Pseudostellaria (Kim et al. 2019) and Coffea (Park et al. 2019). The chloroplast genome will be a useful resource for investigating phylogeographic structure within G. schlechtendaliana and for understanding the phylogenetic relationships of Goodyera.
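Pairwise variant counts of this kind can be derived from an alignment of two genomes. The following sketch is one possible way to do it, not the method used in the study: the text does not state how SNPs and INDELs were tallied, nor whether INDELs were counted as events or as gapped bases, so the hypothetical function `count_variants` reports both.

```python
def count_variants(ref, qry):
    """Count SNPs, gapped bases, and indel events in a pairwise alignment.

    Both inputs are aligned strings of equal length that use '-' for gaps.
    Returns a tuple (snps, indel_bases, indel_events).
    """
    if len(ref) != len(qry):
        raise ValueError("aligned sequences must have equal length")
    snps = indel_bases = indel_events = 0
    gap_side = None  # which sequence carried the gap in the previous column
    for r, q in zip(ref, qry):
        if r == "-" and q == "-":
            continue  # column gapped in both sequences carries no information
        side = "ref" if r == "-" else "qry" if q == "-" else None
        if side is None:
            if r.upper() != q.upper():
                snps += 1
        else:
            indel_bases += 1
            if side != gap_side:
                indel_events += 1  # a new run of gaps starts here
        gap_side = side
    return snps, indel_bases, indel_events


# Hypothetical toy alignment: one substitution, one 2-bp gap, one 1-bp gap.
result = count_variants("ACGT--ACGTA", "ACCTGGAC-TA")
print(result)
```

Counting events rather than gapped bases gives smaller INDEL numbers for the same alignment, which is worth keeping in mind when comparing figures across studies.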
CONTACT Sang-Hun Oh Department of Biology, Daejeon University, Daejeon 34520, Korea; Sangtae Kim
Department of Biology, Sungshin University, Seoul 01133, Korea
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use,
distribution, and reproduction in any medium, provided the original work is properly cited.
2019, VOL. 4, NO. 2, 2692–2693
Disclosure statement
No potential conflict of interest was reported by the authors.
This work was supported by a research grant from the National Research
Foundation of Korea [NRF-2016R1D1A1B03934663] to the first author.
Brzosko E, Wroblewska A, Jermakowicz E, Hermaniuk A. 2013. High level of genetic variation within clonal orchid Goodyera repens. Plant Syst Evol. 299:1537–1548.
Chen X, Lang K, Gale S, Cribb P, Ormerod P. 2009. Goodyera. In Wu ZY, Raven PH, Hong DY, editors. Flora of China, vol. 25. Beijing: Science Press. p. 45–54.
Hu C, Tian H, Li H, Hu A, Xing F, Bhattacharjee A, Hsu T, Kumar P, Chung S. 2016. Phylogenetic analysis of a Jewel Orchid genus Goodyera (Orchidaceae) based on DNA sequence data from nuclear and plastid regions. PLoS ONE. 11:e0150366.
Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30:772–780.
Kim Y, Heo K-I, Park J. 2019. The second complete chloroplast genome sequence of Pseudostellaria palibiniana (Takeda) Ohwi (Caryophyllaceae): intraspecies variations based on geographical distribution. Mitochondrial DNA B. 4:1310–1311.
Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018. MEGA X: Molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 35:1547–1549.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The sequence alignment/map format and SAMtools. Bioinformatics. 25:2078–2079.
Li H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:1303.3997.
Oh S-H, Suh H-J, Park J, Kim Y, Kim S. 2019. The complete chloroplast
genome sequence of a morphotype of Goodyera schlechtendaliana
(Orchidaceae) with the column appendages. Mitochondrial DNA B. 4:
Park J, Kim Y, Heo K-I, Xi H. 2019. The complete chloroplast genome of ornamental coffee tree, Coffea arabica L. (Rubiaceae). Mitochondrial DNA B. 4:1059–1060.
Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.
Zhao Q-Y, Wang Y, Kong Y-M, Luo D, Li X, Hao P. 2011. Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study. BMC Bioinforma. 12:S2.
[Figure 1. Panel A: phylogenetic tree of the twelve chloroplast genomes (scale bar: 0.05 substitutions/site). Panel B: pairwise numbers of SNPs/INDELs among the five G. schlechtendaliana chloroplast genomes.]
Figure 1. (A) A maximum-likelihood tree using chloroplast genomes of G. schlechtendaliana from Korea (MK144665 in this study and MK134679) and previously published related taxa: G. schlechtendaliana from China (AB892949, LC085346, and NC_029364), G. fumata (KJ501999), G. procera (KT886429), G. velutina (KT886432), Ludisia discolor (NC_030540), Anoectochilus emeiensis (NC_033895), and two outgroup species, Vanilla planifolia (NC_026778) and Apostasia odorata (NC_030722). Bootstrap values using the neighbor-joining and maximum-likelihood methods are indicated above the branch. (B) Pairwise comparisons of five chloroplast genomes of G. schlechtendaliana. Numbers of single nucleotide polymorphisms (SNPs) and insertions and deletions (INDELs) between each pair are indicated on the branch. Filled ellipses indicate G. schlechtendaliana originating from Korea and open ellipses indicate G. schlechtendaliana originating from China.
... This is also caused by the inserted region of the Tsu0 chloroplast genome. Other plant chloroplasts of which intraspecific variations of the GC contents are identical to those of A. thaliana are Goodyera schlechtendaliana (37.1% and 37.2%) [53][54][55] and Gastrodia elata (26.7% and 26.8%) [56][57][58] which are same to those of Arabidopsis thaliana, while Coffea arabica [50,[59][60][61][62][63], Viburnum erosum [64,65], Duchesnea chrysantha [20,21], Salix koriyanagi [66,67], Pseudostellaria palibiniana [23,25], and Pyrus ussuriensis [68,69] present no difference in the intraspecific GC contents. ...
... These studies cover 23 families showing relatively large coverage, so that we expected that some characteristics of these sequence variations on chloroplast genomes can be rescued. In addition, we used number of SNPs and INDELs directly during comparison of sequence variations for better understanding intuitively because their complete chloroplast genome lengths are around 150 kb except genera Marchantia, Selaginella, Gastrodia, Illicium, Pseudostellaria, and Daphne ( [75], some cases of Cucumis melo [76] and Chenopodium quinoa [70], all of Dioscorea polystachya [77], Oryza sativa among cultivars [78], G. schlechtendaliana [53], and G. elata [56] ( [15,80], and Nymphaea (586 SNPs and 1,150 INDELs between Nymphaea capensis and Nymphaea ampla) [21], no clear levels pertaining to the number of intraspecific or interspecific variations exist. However, the numbers of SNPs and INDELs between 180404IB4 and Col0 are relatively small considering the intercontinental distance between two samples of the same plant species. ...
... This case is similar to those of C. arabica, showing one 84 bp insertion region [50] and D. chrysantha, presenting three insertion regions [21]. In terms of the number of INDELs, it is also in relation to high intraspecific variations that only P. ussuriensis [68], G. schlechtendaliana [53], and G. elata [56] present higher numbers of INDELs (Table 3). In addition, two out of the three Orchidaceae species shows high rates of divergence in terms of flower morphologies as well as the number of species [81][82][83]. ...
Full-text available
Arabidopsis thaliana (L.) Heynh. is a model organism of plant molecular biology. More than 1,700 whole genome sequences have been sequenced, but no Korean isolate genomes have been sequenced thus far despite the fact that many A. thaliana isolated in Japan and China have been sequenced. To understand the genetic background of Korean natural A. thaliana (named as 180404IB4), we presented its complete chloroplast genome, which is 154,464 bp long and has four subregions: 85,164 bp of large single copy (LSC) and 17,781 bp of small single copy (SSC) regions are separated by 26,257 bp of inverted repeat (IRs) regions including 130 genes (85 protein-coding genes, eight rRNAs, and 37 tRNAs). Fifty single nucleotide polymorphisms (SNPs) and 14 insertion and deletions (INDELs) are identified between 180404IB4 and Col0. In addition, 101 SSRs and 42 extendedSSRs were identified on the Korean A. thaliana chloroplast genome, indicating a similar number of SSRs on the rest five chloroplast genomes with a preference of sequence variations toward the SSR region. A nucleotide diversity analysis revealed two highly variable regions on A. thaliana chloroplast genomes. Phylogenetic trees with three more chloroplast genomes of East Asian natural isolates show that Korean and Chinese natural isolates are clustered together, whereas two Japanese isolates are not clustered, suggesting the need for additional investigations of the chloroplast genomes of East Asian isolates.
... Although plastomes have the properties of uniparental inheritance and a low frequency of recombination, these features have also prompted plastomes to maintain highly conserved features; however, mutations at the intraspecific level have still occurred, such as in Eragrostis tef (12 InDels and 19 SNPs) [43], Selaginella tamariscina (1641 InDels and 1213 SNPs) [44], and five Goodyera schlechtendaliana (414-2133 InDels and 200-844 SNPs) [45]. A total of 1915 high-quality SNPs and 346 InDels were detected in 16 accessions of P. obconica subsp. ...
The genus Primula (Primulaceae) comprises more than 500 species, with 300 species distributed in China. The contradictory results between systematic analyses and morphology-based taxonomy make taxonomy studies difficult. Furthermore, frequent introgression between closely related species of Primula can result in non-monophyletic species. In this study, the complete chloroplast genome of sixteen Primula obconica subsp. obconica individuals were assembled and compared with 84 accessions of 74 species from 21 sections of the 24 sections of the genus in China. The plastome sizes of P. obconica subsp. obconica range from 153,584 bp to 154,028 bp. Genome-wide variations were detected, and 1915 high-quality SNPs and 346 InDels were found. Most SNPs were detected in downstream and upstream gene regions (45.549% and 41.91%). Two cultivated accessions, ZP1 and ZP2, were abundant with SSRs. Moreover, 12 SSRs shared by 9 accessions showed variations that may be used as molecular markers for population genetic studies. The phylogenetic tree showed that P. obconica subsp. obconica cluster into two independent clades. Two subspecies have highly recognizable morphological characteristics, isolated geographical distribution areas, and distinct phylogenetic relationships compared with P. obconica subsp. obconica. We elevate the two subspecies of P. obconica to separate species. Our phylogenetic tree is largely inconsistent with morphology-based taxonomy. Twenty-one sections of Primula were mainly divided into three clades. The monophyly of Sect. Auganthus, Sect. Minutissimae, Sect. Sikkimensis, Sect. Petiolares, and Sect. Ranunculoides are well supported in the phylogenetic tree. The Sect. Obconicolisteri, Sect. Monocarpicae, Sect. Carolinella, Sect. Cortusoides, Sect. Aleuritia, Sect. Denticulata, Sect. Proliferae Pax, and Sect. Crystallophlomis are not a monophyletic group. 
The possible explanations for non-monophyly may be hybridization, polyploidization, recent introgression, incorrect taxonomy, or chloroplast capture. Multiple genomic data and population genetic studies are therefore needed to reveal the evolutionary history of Primula. Our results provided valuable information for intraspecific variation and phylogenetic relationships within Primula.
... Based on pair-wise alignments against chloroplast genomes of C. fargesii and C. concinna, distributed in Southern China (Sun et al. 2014 Oh et al. 2019aOh et al. , 2019bPark, Kim, Xi, Oh, et al. 2019;Park, Suh, et al. 2020;Heo et al. 2020;Oh and Park 2020). These results indicate that the numbers of interspecific variations identified from the three Castanopsis species are at a moderate level. ...
Full-text available
Castanopsis sieboldii (Makino) Hatus is an evergreen tree that distributes in Eastern Asia including Islands of Korea and Japan. The chloroplast genome of C. sieboldii was successfully sequenced. Its length is 160,705 bp long (GC ratio is 36.8%) and has four subregions: 90,821 bp of large single copy (34.6%) and 19,014 bp of small single copy (30.8%) regions are separated by 25,075 bp of inverted repeat (42.8%) regions including 134 genes (89 protein-coding genes, eight rRNAs, and 37 tRNAs). Interspecific variations of Castanopsis are at a moderate level in comparison to those of the other genera. Phylogenetic trees show that C. sieboldii chloroplast genome was clustered with the other two Castanopsis species.
... However, the variations in these cases are less than those of Goodyera schlechtendaliana Rchb. f. (844 SNPs, 0.55% and 2,045 INDELs, 1.33%) (Oh et al. 2019a(Oh et al. , 2019b and Gastrodia elata Blume (493 SNPs, 1.40% and 650 INDELs, 1.85%) (Kang et al., 2020;Park et al., 2020a) from Orchidaceae. ...
Full-text available
Daphne genkwa (Thymelaeaceae) is a small deciduous shrub widely cultivated as an ornamental. The complete chloroplast genome of this species is presented here. The genome is 132,741 bp long and has four subregions: 85,668 bp of large single-copy and 28,365 bp of small single-copy regions are separated by 9,354 bp of inverted repeat regions with 107 genes (71 protein-coding genes, four rRNAs, and 31 tRNAs) and one pseudogene. The phylogenetic tree shows that D. genkwa is nested within Wikstroemia and is not closely related to other species of Daphne, suggesting that it should be recognized as a species of Wikstroemia.
... In addition, 6 SNPs and 7 INDEL regions (19 bp) were found against Sang Jae and Ok Hwang 1ho sequences (MN127986 and MK616470) and 99 SNPs and 18 INDEL regions (72 bp) and 6 SNPs and 8 INDEL regions (21 bp) were identified against NC_031445 and MF407183, respectively. These numbers of intraspecific variations except those against NC_031445 are smaller than those identified between Korean isolates of other plant species (Cho et al. 2019;Oh et al. 2019aOh et al. , 2019bPark et al. 2019aPark et al. , 2019bChoi et al. 2020;Park et al. 2021), suggesting that genetic diversity in A. distichum may not be sufficient enough so that morphological features are not tightly linked to those intraspecific variations on chloroplast genomes. ...
Full-text available
The chloroplast genome of Abeliophyllum distichum f. lilacinum Nakai, classified to a monotypic in this genus, and an endemic species in Korea, was sequenced to understand the genetic differences among intraspecies and cultivars of A. distichum. The chloroplast genome length is 156,015 bp (GC ratio is 37.8%) and has a typical quadripartite structure: 86,779 bp large single copy (35.8%) and 17,828 bp small single copy (31.9%) regions separated by two 25,704 bp inverted repeat (43.2%) regions. The genome encodes for 133 genes (88 protein-coding genes, eight rRNAs, and 37 tRNAs). Six to 99 SNPs and seven to 18 INDEL regions (19 bp to 72 bp) were identified against available chloroplast genomes of A. distichum. Phylogenetic trees show that A. distichum f. lilacinum is clustered with the Dae Ryun cultivar which has a larger fruit body. Our analyses suggest additional research, such as Genotyping-By-Sequencing, for understanding relationship between morphology and genotype of A. distichum.
... Numbers of intraspecific variations are relatively large based on the intraspecific variation analysis of chloroplast genomes . High levels of intraspecific variations were found in many plant species, such as some species in Orchidaceae (Oh et al. 2019a(Oh et al. , 2019bPark, Suh, et al. 2020;Kang et al. 2020) and Rosaceae (Cho et Thirteen chloroplast genomes to represent the major lineages of tribe Spiraeeae (Potter et al. 2007) were used in phylogenetic analysis of the maximum likelihood (ML) and Bayesian inference (BI). Seventy-eight genes of LSC, SSC, and IRb regions were included in the analyses. ...
Full-text available
Aruncus dioicus var. kamtschaticus is an economically important herb in the cold temperate regions of East Asia, and displays highly variable morphological features. Completed chloroplast genome of A. dioicus var. kamtschaticus isolated in Korea is 157,859 bp long with four subregions: 85,972 bp of large single copy and 19,185 bp of small single-copy regions separated by 26,351 bp of inverted repeat regions. The genome includes 131 genes (86 protein-coding genes, eight rRNAs, and 37 tRNAs). Phylogenetic analyses show that our chloroplast genome was clustered with two partial chloroplast genomes of A. dioicus.
... Chilo suppressallis , Spodoptera frugiperda , and Laodelphax striatellus (Park, Jung, et al. 2019)) and plant chloroplast genomes (e.g. Goodyera schlechtendaliana (Oh et al. 2019), Gastrodia elata (Park et al. 2020), Abeliophyllum distichum , and Pyrus ussuriensis (Cho et al. 2019)), indicating a relatively lower level of inter-subspecific variations on its mitogenomes. Moreover, phylogenetic trees displayed that the two major Neighbor-joining (bootstrap repeat is 10,000) and maximum-likelihood (bootstrap repeat is 1,000) phylogenetic trees of 30 complete mitogenomes: Cervus canadensis nannodes (MT430939 used in this study), Cervus elaphus alxaicus (KU942399), Cervus elaphus (NC_007704 and KP172593), Cervus elaphus kansuensis (NC_039923), Cervus elaphus songaricus (NC_014703), Cervus elaphus yarkandensis (NC_013840), Cervus elaphus hippelaphus (KT290948), Cervus elaphus macneilli (KX449334), Cervus elaphus (MF872248 and MF872247), Cervus nippon yesoensis (NC_006973), Cervus nippon centralis (NC_006993), Cervus nippon yakushimae (NC_007179), Cervus nippon hortulorum (NC_013834), Cervus nippon hortulorum (HQ191428), Cervus nippon hortulorum (KR868807), Cervus nippon taiouanus (NC_008462), Cervus nippon taiouanus (EF058308), Cervus nippon sichuanicus (NC_018595), Cervus nippon kopschi (NC_016178), Cervus nippon kopschi (JN389444), and Muntiacus vuquangensis (NC_016920) as an outgroup. ...
Full-text available
Cervus canadensis nannodes (Merriam, 1905) is one of the subspecies of elk distributed only in California, USA. We completed the first mitogenome of C. canadensis nannodes. Its length is 16,428 bp, which is in middle among 24 available Cervus mitogenomes. It contains 37 genes (13 protein-coding genes, 2 rRNAs, and 22 tRNAs). Phylogenetic trees show that C. c. nannodes was clustered with some subspecies of C. elaphus. Number of inter-subspecific variations between C. c. nannodes and C. e. alxaicus are relatively small in comparison to intraspecific variations of insect and fish mitogenomes and plant chloroplast genomes.
Full-text available
The chloroplast genome of Glycyrrhiza uralensis Fisch was sequenced to investigate intraspecific variations on the chloroplast genome. Its length is 127,689 bp long (34.3% GC ratio) with atypical structure of chloroplast genome, which is congruent to those of Glycyrrhiza genus. It includes 110 genes (76 protein-coding genes, four rRNAs, and 30 tRNAs). Intronic region of ndhA presented the highest nucleotide diversity based on the six G. uralenesis chloroplast genomes. A total of 150 single nucleotide polymorphisms and 10 insertion and deletion (INDEL) regions were identified from the six G. uralensis chloroplast genomes. Phylogenetic trees show that the six chloroplast genomes of G. uralensis formed the two clades, requiring additional studies to understand it.
Full-text available
GATA transcription factors (TFs) are widespread eukaryotic regulators whose DNA-binding domain is a class IV zinc finger motif (CX2CX17-20CX2C) followed by a basic region. Due to the low cost of genome sequencing, multiple strains of specific species have been sequenced: e.g., number of plant genomes in the Plant Genome Database (http://www. is 2,174 originated from 713 plant species. Thus, we investigated GATA TFs of 19 Arabidopsis thaliana genome-widely to understand intraspecific features of Arabi- dopsis GATA TFs with the pipeline of GATA database ( Num- bers of GATA genes and GATA TFs of each A. thaliana genome range from 29 to 30 and from 39 to 42, respectively. Four cases of different pattern of alternative splicing forms of GATA genes among 19 A. thaliana genomes are identified. 22 of 2,195 amino acids (1.002%) from the alignment of GATA domain amino acid sequences display variations across 19 ecotype genomes. In addition, maximally four different amino acid sequences per each GATA domain identified in this study indicate that these position-specific amino acid variations may invoke intraspecific functional variations. Among 15 functionally character- ized GATA genes, only five GATA genes display variations of amino acids across ecotypes of A. thaliana, implying variations of their biological roles across natural isolates of A. thali- ana. PCA results from 28 characteristics of GATA genes display the four groups, same to those defined by the number of GATA genes. Topologies of bootstrapped phylogenetic trees of Arabidopsis chloroplasts and common GATA genes are mostly incongruent. More- over, no relationship between geographical distribution and their phylogenetic relationships was found. Our results present that intraspecific variations of GATA TFs in A. 
thaliana are conserved and evolutionarily neutral along with 19 ecotypes, which is congruent to the fact that GATA TFs are one of the main regulators for controlling essential mechanisms, such as seed germination and hypocotyl elongation.
Full-text available
The complete chloroplast genome of Euscaphis japonica (Thunb.) Kanitz isolated in Korea is 160,606 bp long and has four subregions: 89,232 bp of large single-copy and 18,734 bp of small single-copy regions are separated by 26,320 bp of inverted repeat regions including 129 genes (84 CDS, 8 rRNAs, and 37 tRNAs) and three pseudogenes. There were 424 SNPs and 809 INDELs compared with the Chinese E. japonica, useful to develop markers for phylogeographic study of the species. Phylogenetic trees show that E. japonica, representing Crossosomatales, is nested within the Malvids clade, confirming pre�vious studies.
Full-text available
Pseudostellaria palibiniana which belongs to subseries Verticilatae is one of the species in Pseudostellaria palibiniana species complex (PPSC). To uncover intraspecies variation of P. palibiniana, we presented its second complete chloroplast genome which is 149,639 bp long and has four subregions: 81,286 bp of large single copy and 16,977 bp of small single copy regions are separated by 25,688 bp of inverted repeat regions including 126 genes (81 protein-coding genes, 8 rRNAs, and 37 tRNAs). The overall GC content is 36.5% and those in the LSC, SSC, and IR regions are 34.3%, 29.4%, and 42.4%, respectively. Eighty-four single nucleotide polymorphisms and 125 insertions and deletions are identified between two individuals of P. palibiniana. Phylogenetic tree presents that P. palibiniana isolated from the same place of Pseudostellaria longipedicellata is clustered with another P. palibiniana, showing that PPSC can be solved using complete chloroplast genomes. © 2019, © 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
Full-text available
Coffea arabica L. taking 70% of world coffee production is also used as an ornamental species. One imported coffee tree from Indonesia near to thirty years ago of which leaf is wrinkled (named as IN1) was chosen to know its genetic background. Here, we presented complete chloroplast genome of Coffea arabica IN1 which is 155,277 bp long and has four subregions: 85,248 bp of large single copy (LSC) and 18,137 bp of small single copy (SSC) regions are separated by 25,946 bp of inverted repeat (IR) regions including 131 genes (86 protein-coding genes, eight rRNAs, and 37 tRNAs). The overall GC content of the chloroplast genome is 37.4% and those in the LSC, SSC, and IR regions are 35.4%, 31.3%, and 43.0%, respectively. In comparison to three available coffee chloroplast genomes, 84 bp insertion on IN1 chloroplast genome is found, which is big differences in comparison to other available coffee chloroplast genomes. Even though relatively low number of sequence variations on coffee chloroplast genomes, these results can be used as a corner stone for establishing molecular markers to identify its origin or cultivars.
Full-text available
Goodyera schlechtendaliana is a common orchid species in East Asia. A distinctive population of G. schlechtendaliana with the lateral appendages of the column in their flowers was found in southwestern Korea. In this study, we presented complete chloroplast genome of this morphotype as a part of systematic study of the Goodyera. The chloroplast genome is 153,882 bp in length and contains 134 genes (83 CDSs, 8 rRNAs, and 39 tRNAs). Phylogenetic analysis showed that the morphotype of G. schlechtendaliana with column appendages is sister to a normal form of G. schlechtendaliana with long branch, supporting that this distinctive morphotype has a potential to be a new species.
Full-text available
The molecular evolutionary genetics analysis (Mega) software implements many analytical methods and tools for phylogenomics and phylomedicine. Here, we report a transformation of Mega to enable cross-platform use on Microsoft Windows and Linux operating systems. Mega X does not require virtualization or emulation software and provides a uniform user experience across platforms. Mega X has additionally been upgraded to use multiple computing cores for many molecular evolutionary analyses. Mega X is available in two interfaces (graphical and command line) and can be downloaded from free of charge.
Full-text available
A molecular phylogeny of Asiatic species of Goodyera (Orchidaceae, Cranichideae, Goodyerinae) based on the nuclear ribosomal internal transcribed spacer (ITS) region and two chloroplast loci (matK and trnL-F) was presented. Thirty-five species represented by 132 samples of Goodyera were analyzed, along with other 27 genera/48 species, using Pterostylis longifolia and Chloraea gaudichaudii as outgroups. Bayesian inference, maximum parsimony and maximum likelihood methods were used to reveal the intrageneric relationships of Goodyera and its intergeneric relationships to related genera. The results indicate that: 1) Goodyera is not monophyletic; 2) Goodyera could be divided into four sections, viz., Goodyera, Otosepalum, Reticulum and a new section; 3) sect. Reticulum can be further divided into two subsections, viz., Reticulum and Foliosum, whereas sect. Goodyera can in turn be divided into subsections Goodyera and a new subsection.
Type of reproduction has an important effect on the maintenance of particular populations and on species persistence in time and space. This trait significantly influences the ecological and genetic structure of populations and, in consequence, the evolution of species. The primary objectives of this study were to estimate genetic diversity within and among populations of the clonal species Goodyera repens in northeastern Poland, and to identify the factors shaping the genetic structure of this orchid. Based on 451 rosettes of G. repens from 11 localities in northeastern Poland, we conducted a genetic population analysis using allozymes. We included information on population size, flowering, fruit set and seed dispersal to elucidate their influences on the genetic diversity of this species. Populations differed in their demographic properties. The majority of seeds (86.4–94.8%) were found at a distance of 0.2 m. We observed a high level of genetic (P_PL = 50%, A = 1.68, H_O = 0.210, H_E = 0.204) and genotypic diversity (G = 163, G/N_S = 0.66, G_U = 30.2%), and low but statistically significant genetic differentiation among populations (F_ST = 0.060; P < 0.001). We suggest that the genetic diversity of G. repens is mainly an effect of the abundance of pine and spruce forest communities suitable for this species in NE Poland and of the high level of sexual reproduction.
We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.
With the fast advances in next-generation sequencing technology, high-throughput RNA sequencing has emerged as a powerful and cost-effective way to study transcriptomes. De novo assembly of transcripts provides an important solution to transcriptome analysis for organisms with no reference genome. However, it was poorly understood how different variables affect assembly outcomes, and there was no consensus on how to approach an optimal solution by selecting a software tool and a suitable strategy based on the properties of the RNA-Seq data. To reveal the performance of different programs for transcriptome assembly, this work analyzed some important factors, including k-mer values, genome complexity, coverage depth, directional reads, etc. Seven program conditions were tested: four single k-mer assemblers (SK: SOAPdenovo, ABySS, Oases and Trinity) and three multiple k-mer methods (MK: SOAPdenovo-MK, trans-ABySS and Oases-MK). While small and large k-mer values performed better for reconstructing lowly and highly expressed transcripts, respectively, the MK strategy worked well for almost all ranges of expression quintiles. Among SK tools, Trinity performed well across various conditions but took the longest running time. Oases consumed the most memory, whereas SOAPdenovo required the shortest runtime but worked poorly to reconstruct full-length CDS. ABySS showed a good balance between resource usage and quality of assemblies. Our work compared the performance of publicly available transcriptome assemblers and analyzed important factors affecting de novo assembly. Some practical guidelines for transcript reconstruction from short-read RNA-Seq data were proposed. De novo assembly of the C. sinensis transcriptome was greatly improved using some optimized methods.
BWA-MEM is a new alignment algorithm for aligning sequence reads or long query sequences against a large reference genome such as human. It automatically chooses between local and end-to-end alignments, supports paired-end reads and performs chimeric alignment. The algorithm is robust to sequencing errors and applicable to a wide range of sequence lengths from 70 bp to a few megabases. For mapping 100 bp sequences, BWA-MEM shows better performance than several state-of-the-art read aligners to date. Availability and implementation: BWA-MEM is implemented as a component of BWA, which is available at Contact: