Access to this full-text is provided by Springer Nature.
Content available from Scientific Reports
This content is subject to copyright. Terms and conditions apply.
Scientific Reports | (2023) 13:7237 | https://doi.org/10.1038/s41598-023-34083-1
www.nature.com/scientificreports
Comparative and phylogenetic
analysis of the complete
chloroplast genomes
of six Polygonatum species
(Asparagaceae)
Dongjuan Zhang 1,2,3,4, Jing Ren 1,2,3,5, Hui Jiang 1,2,3,4, Vincent Okelo Wanga 1,2,3,4,
Xiang Dong 1,2,3,4 & Guangwan Hu 1,2,3,4*
Polygonatum Miller belongs to the tribe Polygonateae of Asparagaceae. The horizontal creeping
fleshy roots of several species in this genus serve as traditional Chinese medicine. Previous studies
have mainly reported the size and gene contents of the plastomes, with little information on the
comparative analysis of the plastid genomes of this genus. Additionally, there are still some species
whose chloroplast genome information has not been reported. In this study, the complete plastomes
of six Polygonatum species were sequenced and assembled; among them, the chloroplast genome of
P. campanulatum is reported for the first time. Comparative and phylogenetic analyses were then
conducted with the published plastomes of three related species. Results indicated that the whole
plastome length of the Polygonatum species ranged from 154,564 bp (P. multiflorum) to 156,028 bp
(P. stenophyllum), with a quadripartite structure of LSC and SSC separated by two IR regions. A total
of 113 unique genes were detected in each of the species. Comparative analysis revealed that gene
content and total GC content in these species were highly similar. No significant contraction or
expansion was observed in the IR boundaries among all the species except P. sibiricum1, in which the
rps19 gene was pseudogenized owing to incomplete duplication. Abundant long dispersed repeats and
SSRs were detected in each genome. Five remarkably variable regions and 14 positively
selected genes were identified among Polygonatum and Heteropolygonatum. Phylogenetic results
based on the chloroplast genome strongly supported the placement of P. campanulatum, which has
alternate leaves, in sect. Verticillata, a group characterized by whorled leaves. Moreover, P. verticillatum
and P. cyrtonema were recovered as paraphyletic. This study revealed that the characters of plastomes
in Polygonatum and Heteropolygonatum maintained a high degree of similarity. Five highly variable
regions were found to be potential specific DNA barcodes in Polygonatum. Phylogenetic results
suggested that leaf arrangement is not suitable as a basis for delimitation of subgeneric groups in
Polygonatum and that the definitions of P. cyrtonema and P. verticillatum require further study.
Abbreviations
LSC Large single-copy
SSC Small single-copy
IR Inverted repeat
BI Bayesian inference
ML Maximum Likelihood
PCGs Protein-coding genes
CDS Coding sequence
1CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture Wuhan Botanical Garden,
Chinese Academy of Sciences, Wuhan 430074, China. 2Center of Conservation Biology, Core Botanical Gardens,
Chinese Academy of Sciences, Wuhan 430074, China. 3Sino-Africa Joint Research Center, Chinese Academy of
Sciences, Wuhan 430074, China. 4University of Chinese Academy of Sciences, Beijing 100049, China. 5College of
Life Sciences, Hunan Normal University, Changsha 410081, China. *email: guangwanhu@wbgcas.cn
Content courtesy of Springer Nature, terms of use apply. Rights reserved
IGS Intergenic spacer
rRNA Ribosomal RNA
tRNA Transfer RNA
GC Guanine–cytosine
SSR Simple sequence repeats
cp Chloroplast
CTAB Cetyltrimethylammonium bromide
RSCU Relative synonymous codon usage
NGS Next-generation sequencing
Polygonatum Miller belongs to the tribe Polygonateae Benth. & Hook. f. of the family Asparagaceae [1]. The
species in this genus are perennial herbs with horizontal creeping fleshy roots and unbranched stems [2]. This
genus comprises approximately 80 species worldwide (https://wcsp.science.kew.org/, accessed 30 March 2022).
According to Chen and Tamura [2], 39 species have been recorded in China, 20 of them endemic. Polygonatum
is widely distributed in the Northern Hemisphere, with the center of diversity in East Asia, especially in the
Hengduan Mountains of southwest China and the eastern Himalayas [3,4]. This genus is highly valued for
its medicinal properties, with species such as Polygonatum kingianum and P. sibiricum being used as traditional
Chinese medicine due to their properties of tonifying Qi, nourishing Yin, strengthening the spleen, moistening
the lung and benefiting the kidney [5].
Phylogenetic relationships reconstructed using ribosomal ITS and plastid DNA sequences suggested the
monophyly of Polygonatum and its sister relationship to Heteropolygonatum M.N. Tamura & Ogisu [6–9]. The
infrageneric classification of this genus has received considerable attention from researchers owing to
the wide phenotypic variation within and among the species. Baker subdivided Polygonatum into three sections
according to leaf arrangement: sect. Alternifolia with alternate leaves, sect. Oppositifolia with opposite
leaves and sect. Verticillata with whorled leaves [10]. However, phyllotaxy types in this genus were considered
unstable in subsequent studies [7]. On the basis of morphological traits such as leaf arrangement, bract size and
texture, length of the perianth tube, perianth shape, anther length and ovary shape, Tang et al. proposed eight
series for Polygonatum distributed in China [11]. Based on karyological and micromorphological characters,
Tamura subdivided Polygonatum into sect. Polygonatum and sect. Verticillata [12]. Recently, Meng and Nie
reconstructed the phylogenetic relationships within this genus using four chloroplast (cp) regions, rbcL, trnK,
trnC-petN and psbA-trnH, and proposed a new group on the basis of Tamura's work, namely sect. Sibirica [7]. As
a result, Polygonatum was divided into sect. Polygonatum, sect. Verticillata and sect. Sibirica. This infrageneric
classification system is the most widely accepted and was supported by Floden's research based on the complete
cp genomes of Polygonatum [13,14].
The chloroplast is a unique organelle found in green plants that is responsible for photosynthesis. It has
a genome separate from the nuclear and mitochondrial genomes, and is mostly inherited maternally in
angiosperms. Compared to the nuclear and mitochondrial genomes, plastomes are small, less prone to
recombination, have low nucleotide substitution rates and are generally more conserved in terms of gene
structure and organization, and can therefore provide unique genetic information [15,16]. Among most higher
plants, the cp genome possesses a typical quadripartite structure comprising a small single-copy (SSC) region, a
large single-copy (LSC) region, and two inverted repeats (IRs) [17]. Most cp genomes examined in plants have a
constrained size varying from 120 to 160 kb, and this variation is mainly related to expansion/contraction or even
loss of the IR [15,18,19]. The cp genome encodes about 120–130 genes [20], which can be classified
into three groups: genes involved in chloroplast gene expression, genes related to photosynthesis, and those with
unclear functions [21]. The rate of molecular evolution differs noticeably between the coding and non-coding
regions of chloroplast genomes, which makes them suitable for systematic studies at different levels [22].
Benefiting from advances in next-generation sequencing technologies, cp genomes can be obtained more
efficiently and economically. In the National Center for Biotechnology Information (NCBI) organelle genome
database, about 40,000 plant cp genomes are currently published (accessed: 2023/1/21). Among
angiosperms, many cp genomes have been successfully employed to address questions of phylogenetic
relationships and species identification at different taxonomic levels [23–27]. About 156 complete cp genome
sequences (ca. 40 species) of Polygonatum have been reported in NCBI (accessed: 2022/11/07). However, previous
studies have mainly been concerned with the size and gene contents of the plastid genome, with insufficient
attention to comparative genomic analysis [13,28]. Although chloroplast gene fragments and complete cp genomes
of Polygonatum species have recently been adopted for phylogenetic analysis [7,13,28], there are still some
species whose complete chloroplast genome data have not been published, and thus their phylogenetic placement
is not well understood.
In this study, we report the first complete chloroplast genome of Polygonatum campanulatum, together
with the complete plastome sequences of P. franchetii, P. cyrtonema1, P. filipes1, P. zanlanscianense1 and
P. sibiricum1, and compare them with three related species, i.e., P. kingianum (MW373517), Heteropolygonatum
alternicirrhosum (MZ150832) and H. ginfushanicum (MW363694). P. campanulatum is a critically endangered
species discovered by Professor Guangwan Hu in Yunnan Province in 2011. No molecular information about this
species had been reported before; such data can provide essential information for its conservation strategies and
restoration practices. In the present study, sequences of 9 species were selected for comparative plastid genome
analysis, including two species of Heteropolygonatum and seven species of Polygonatum, which
cover the three subgroups of Polygonatum as well as the major branches. There are 11 plastomes of Polygonatum
species that have not been verified by NCBI (Table S1). Manual checking of the 11 unverified plastomes found
that the two IR regions had different lengths, and this discrepancy mainly occurred in the non-coding regions.
Therefore, these unverified plastomes were only used to reconstruct phylogenetic relationships and collect
general information, and not for in-depth comparative analysis of the cp genome. A total of 56 published cp genome
sequences (51 from Polygonatum; 4 from Heteropolygonatum; Maianthemum henryi was chosen as outgroup)
obtained from the NCBI database were employed to reconstruct the phylogenetic tree. The aims of this study were
to (1) conduct a comprehensive analysis of the chloroplast genomes of the six Polygonatum species and their
related species; (2) explore hotspot regions of Polygonatum cp genomes; and (3) infer the phylogenetic
relationships of Polygonatum species and determine the taxonomic status of P. campanulatum, P. franchetii,
P. cyrtonema, P. filipes, P. zanlanscianense and P. sibiricum based on the cp genome.
Materials and methods
Sample collection, total DNA extraction and sequencing. The six newly sequenced Polygonatum
species (Polygonatum campanulatum, P. filipes, P. franchetii, P. zanlanscianense, P. cyrtonema, P. sibiricum) were
collected by Guangwan Hu in China between 2019 and 2021. Detailed field collection information
is given in Table 1. The collected species were identified and verified by Professor Guangwan Hu of
Wuhan Botanical Garden, Chinese Academy of Sciences. Voucher specimens were deposited at the Herbarium of
Wuhan Botanical Garden, CAS (HIB) (China), with voucher specimen numbers listed in Table 1. Total genomic
DNA was extracted from dry leaves preserved in silica gel using a modified cetyltrimethylammonium
bromide (CTAB) method, and then sequenced on the Illumina HiSeq X Ten platform with 150 bp paired-end
reads (PE150) at Novogene Co., Ltd. (Beijing, China).
Assembly and annotation of chloroplast genome. Chloroplast genomes were assembled using
GetOrganelle v1.7.5 [29] with default parameters. Genes were annotated with PGA (Plastid Genome
Annotator) software [30], using Amborella trichopoda as a reference [31,32]. To ensure the reliability of the data
used for subsequent analysis, all chloroplast genomes downloaded from NCBI were re-annotated with PGA. Manual
checking and adjustment of the annotation results, including positions of initiation and termination codons
and boundaries of the IR regions, were performed in Geneious v10.2.3 [33]. Annotated chloroplast genome
sequences of the six species were submitted to GenBank (Table S1) in NCBI. The circular chloroplast
genome map was drawn online with OGDRAW [34].
Comparative analysis of the whole chloroplast genome. Geneious v10.2.3 [33] was employed to
determine the length and guanine-cytosine (GC) content of the whole chloroplast genome and of the LSC, SSC and
IR regions, together with the numbers of genes and gene categories. Multiple genome alignment was performed in
MAFFT [35]. Chloroplast genome divergence was compared and visualized with mVISTA [36] in Shuffle-LAGAN
mode, with the annotation of Polygonatum campanulatum as a reference. To detect contraction
or expansion at the boundaries, the SC/IR boundary analysis of the chloroplast genomes was carried
out with IRscope [37]. Mauve was used with default settings [38] to analyze cp genome rearrangements, after
one of the IR regions had been removed from every sequence.
Codon usage, and repeated sequences analysis. Relative synonymous codon usage (RSCU) values
were calculated using MEGA v7.0 [39]. RSCU is defined as the ratio of the observed frequency of a codon to the
frequency expected if all synonymous codons for the same amino acid were used equally. Values greater than 1.0
mean that the codon is used more frequently than expected, while values below 1.0 indicate the opposite [40].
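The RSCU definition above can be illustrated with a minimal Python sketch. This is not the MEGA implementation; the function name `rscu`, the use of the standard genetic code, and the exclusion of stop codons are assumptions made here for illustration only.

```python
from collections import Counter
from itertools import product

# Standard genetic code (assumption), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} for a list of in-frame coding sequences.

    RSCU = observed codon count / (amino-acid total / number of synonymous codons).
    Stop codons and amino acids that never occur are skipped (illustrative choice).
    """
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # ignores codons with ambiguous bases
                counts[codon] += 1

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, codons in families.items():
        if aa == "*":
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)  # equal usage of all synonymous codons
        for c in codons:
            result[c] = counts[c] / expected
    return result
```

For amino acids encoded by a single codon (Met, Trp) the ratio is always 1, which is why AUG and UGG show no usage preference under this definition.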
Long dispersed repeats were identified using REPuter [41] with a Hamming distance of 3 and a minimum
repeat size of 30 bp. Simple sequence repeats (SSRs) were identified using the MicroSatellite identification tool
(MISA) [42], with the minimum number of repeat units set to 10, 5, 4, 3, 3, and 3 for mono-, di-, tri-, tetra-,
penta-, and hexanucleotide SSR motifs, respectively.
Nucleotide diversity analysis and selective pressure. DnaSP [43] was used to analyze nucleotide
diversity (Pi) with a window length of 600 bp and a step size of 200 bp. Given that DnaSP v6 cannot
recognize degenerate bases, such as M, K, and Y, these letters were replaced with dashes. The figure was
generated in Excel and optimized in Adobe Illustrator.
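The sliding-window calculation described above can be sketched in Python as follows. This is an illustrative approximation, not DnaSP: the function names are hypothetical, and the assumption that any alignment column containing a gap or non-ACGT character is excluded is made here for simplicity.

```python
from collections import Counter


def window_pi(aligned, start, end):
    """Mean pairwise differences per site over columns [start, end)."""
    n = len(aligned)  # number of aligned sequences, must be >= 2
    per_site_sum, valid_sites = 0.0, 0
    for col in range(start, end):
        column = [seq[col].upper() for seq in aligned]
        if any(base not in "ACGT" for base in column):
            continue  # assumption: skip columns with gaps or ambiguous bases
        counts = Counter(column)
        # Fraction of sequence pairs that differ at this column.
        per_site_sum += (n * n - sum(c * c for c in counts.values())) / (n * (n - 1))
        valid_sites += 1
    return per_site_sum / valid_sites if valid_sites else 0.0


def sliding_pi(aligned, window=600, step=200):
    """Return (first_site, last_site, Pi) per window, 1-based inclusive."""
    length = len(aligned[0])
    out = []
    for start in range(0, max(length - window, 0) + 1, step):
        end = min(start + window, length)
        out.append((start + 1, end, window_pi(aligned, start, end)))
    return out
```

With the defaults, consecutive windows overlap by 400 bp, so a single variable stretch contributes to up to three adjacent Pi values.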
To identify positively selected sites in the coding sequences (CDS) of the cp genome, dN/dS values were
calculated with EasyCodeML v1.12 [44]. Each single-copy CDS was extracted from the complete chloroplast
genome using Geneious v10.2.3 [33]; after alignment under the codon model, the sequences were finally combined into
Table 1. Specimen collection information of the six Polygonatum samples.
Species | Voucher specimen number | Date | Locality | Decimal latitude | Decimal longitude
Polygonatum franchetii | HGW-1223 | 2019-09-02 | Gaowangjie National Nature Reserve, Guzhang County, Hunan Province, China | – | –
P. zanlanscianense | HGW-1357 | 2021-05-03 | Malinyaozu Village, Xinning County, Hunan Province, China | 26° 27ʹ 16ʹʹ N | 110° 38ʹ 02ʹʹ E
P. filipes | HGW-1359 | 2021-05-03 | Malinyaozu Village, Xinning County, Hunan Province, China | 26° 27ʹ 12ʹʹ N | 110° 38ʹ 52ʹʹ E
P. sibiricum | HGW-1379 | 2021-05-19 | Donggang District, Rizhao City, Shandong Province, China | – | –
P. campanulatum | HGW-Z-2259 | 2019-07-27 | Xima Township, Yingjiang County, Yunnan Province, China | 24° 47ʹ 49ʹʹ N | 97° 40ʹ 12ʹʹ E
P. cyrtonema | HGW-Z-2364 | 2020-08-19 | Tiantangzhai Township, Jinzhai County, Anhui Province, China | 31° 13ʹ 17ʹʹ N | 115° 42ʹ 53ʹʹ E
one matrix. The input tree was an ML tree reconstructed by IQ-TREE [45]. Four site-model comparisons (M0 vs.
M3, M1a vs. M2a, M7 vs. M8, and M8a vs. M8) were evaluated with likelihood ratio tests (LRTs).
Naive Empirical Bayes (NEB) and Bayes Empirical Bayes (BEB) [46] analyses were conducted under the M8 model
to identify positively selected sites and the corresponding genes.
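The likelihood ratio test for a pair of nested site models can be sketched as below. This is a minimal illustration using only the Python standard library, not EasyCodeML output: the degrees of freedom listed in `DF` are the values conventionally used for these comparisons and are an assumption here (the source does not state them), and the function names are hypothetical.

```python
import math

# Conventional degrees of freedom for each comparison (assumption, not from the text).
DF = {("M0", "M3"): 4, ("M1a", "M2a"): 2, ("M7", "M8"): 2, ("M8a", "M8"): 1}


def chi2_sf(x, df):
    """Chi-square survival function P(X > x) for df = 1 or any even df."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k  # (x/2)^k / k!
            total += term
        return math.exp(-half) * total
    raise ValueError("only df = 1 or even df are supported in this sketch")


def lrt(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a null model nested in an alternative."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)
```

For example, with hypothetical log-likelihoods of -1000.0 (M7) and -996.0 (M8), `lrt(-1000.0, -996.0, DF[("M7", "M8")])` gives a test statistic of 8.0, and the null model would be rejected at the 5% level.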
Phylogenetic analysis. The phylogenetic analysis was performed based on the complete chloroplast
genomes of 57 Polygonatum sequences and 4 Heteropolygonatum taxa. Maianthemum henryi was set as the
outgroup. The chloroplast genomes of all species were obtained from GenBank (Table S1), except for Polygonatum
campanulatum, P. filipes1, P. franchetii, P. zanlanscianense1, P. cyrtonema1, and P. sibiricum1. The total matrix
was aligned using MAFFT [35]. ModelFinder [47] was used to select the best-fit model according to the Bayesian
information criterion (BIC). The maximum likelihood (ML) phylogenetic tree was reconstructed using IQ-TREE [45]
under the GTR+I+G model with 5000 ultrafast bootstrap replicates [48]. Bayesian inference (BI) analysis was
conducted using MrBayes v3.2.6 [49] based on the GTR+F+I+G4 model. Two independent Markov Chain Monte Carlo
(MCMC) runs were performed for 1,000,000 generations, with trees sampled every 100 generations and the initial
25% of sampled trees discarded as burn-in. The two output trees were visualized and edited in FigTree v1.4
(http://github.com/rambaut/figtree/).
Ethical approval and consent to participate. The authors have complied with the relevant institutional,
national and international guidelines in collecting biological materials for the study. The study contributes
to facilitating future studies in population genetics and species identification.
Results
Chloroplast genome structure and characteristics analyses. The complete chloroplast genomes of
the six newly sequenced species in Polygonatum displayed closed circular and typical quadripartite structures
(Fig. 1). The length of the 57 cp genomes in Polygonatum ranged from 154,564 bp (P. multiflorum) to 156,028 bp
(P. stenophyllum), while the length of the 4 cp genomes in Heteropolygonatum ranged from 155,436 bp (H. pendulum)
to 155,944 bp (H. alternicirrhosum) (Table 2). Each plastome included a large single-copy (LSC) region, a small
single-copy (SSC) region and a pair of inverted repeats (IRa and IRb) that separated the LSC and SSC regions
(Fig. 1). The LSC regions of the Polygonatum species ranged from 83,486 bp (P. odoratum) to 94,843 bp
(P. sibiricum3), while SSC regions varied from 18,210 bp (P. cyrtonema6) to 18,570 bp (P. kingianum2). The
combined sizes of the two IR regions ranged from 42,290 bp (P. sibiricum3) to 52,830 bp (P. cirrhifolium,
P. curvistylum, P. hookeri, P. prattii, P. verticillatum3, P. zanlanscianense1, P. zanlanscianense2,
P. zanlanscianense3) (Table 2). The total guanine-cytosine (GC) content
of the plastomes ranged from 37.6 to 37.8%. Further, GC content was unevenly distributed among
the regions of the cp genomes in both Polygonatum and Heteropolygonatum. The SSC regions had the
lowest GC content (31.4–31.7%), followed by the LSC regions (35.6–36.1%), whereas the IRs had the highest
GC content, ranging from 42.9 to 43.0% (Table 2).
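As a quick consistency check on the region lengths reported in Table 2, the short Python sketch below verifies that LSC + SSC + IR equals the total genome length for three of the newly sequenced plastomes (which implies that the IR column gives the combined length of both copies), and shows one simple way to compute GC content. The helper name `gc_content` and the choice to ignore non-ACGT characters are assumptions made for illustration.

```python
def gc_content(seq):
    """GC percentage over unambiguous bases (A, C, G, T) only."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt if acgt else 0.0


# Lengths in bp copied from Table 2: (total, LSC, SSC, IR).
TABLE2_LENGTHS = {
    "P. campanulatum": (155487, 84458, 18373, 52656),
    "P. franchetii": (155962, 84722, 18566, 52674),
    "P. sibiricum1": (155514, 84537, 18415, 52562),
}


def regions_sum_to_total(lengths):
    """True if LSC + SSC + IR equals the total length for every entry."""
    return all(lsc + ssc + ir == total
               for total, lsc, ssc, ir in lengths.values())
```

The same check can be applied to any other row of Table 2 to spot transcription errors in the region lengths.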
A total of 131–132 genes (113 unique genes) were detected, in the same order, in the complete cp genomes of
the 57 Polygonatum sequences. One copy of the rps19 gene was pseudogenized in P. stewartianum, P. sibiricum1 and
P. sibiricum2, and both copies of the ycf1 gene were pseudogenized in H. ogisui. The whole genomes included 87
protein-coding genes (PCGs), 38 transfer RNA (tRNA) genes and 8 ribosomal RNA (rRNA) genes (Table 2).
Moreover, a total of 19 genes, comprising 7 PCGs (rps19, rpl2, rpl23, ycf2, ndhB, rps7, rps12), 8 tRNA genes
(trnN-GUU, trnR-ACG, trnA-UGC, trnI-GAU, trnV-GAC, trnL-CAA, trnI-CAU, trnH-GUG) and 4 rRNA genes (rrn5,
rrn4.5, rrn23, rrn16), were duplicated in the pair of inverted repeats. In addition, a total of 18 genes (trnA-UGC,
trnG-UCC, trnI-GAU, trnK-UUU, trnL-UAA, trnV-UAC, rps12, rps16, rpl2, rpl16, rpoC1, petB, petD, atpF, ndhA,
ndhB, clpP, ycf3), six of them tRNA genes, contained at least one intron in the complete cp genome; of these, clpP
and ycf3 contained two introns. In particular, rps12 was a trans-spliced gene with the 5’ exon situated in the LSC
region and the two copies of the 3’ exon and intron located in the IRs. The longest intron was identified in
trnK-UUU, with a length of 2,568–2,586 bp, and the matK gene was located inside this intron (Table S2). All of the
functional genes can be divided into three categories, i.e., self-replication genes, photosynthesis genes, and other
genes (Table 3).
Relative synonymous codon usage analysis. Given that codon usage is closely related to genome-wide
protein and mRNA levels, it is an essential feature of gene expression. The same codon occurs at different
frequencies in different organisms. The codon usage frequencies of Polygonatum campanulatum, P. filipes1,
P. franchetii, P. zanlanscianense1, P. cyrtonema1, P. sibiricum1, P. kingianum2, Heteropolygonatum alternicirrhosum
and H. ginfushanicum were computed based on the protein-coding genes of the complete chloroplast genome. The
total number of codons in these nine species varied from 26,453 (P. kingianum2) to 26,651 (P. zanlanscianense1).
The most abundant amino acid (AA) was leucine (Leu), with proportions ranging between 10.2 and
10.3%, followed by serine (Ser), accounting for 7.8–7.9% (Table S3). In contrast, cysteine (Cys) had the
lowest number of codons (306–309 codons) in all nine species when termination codons were not considered. The
AGA codon, encoding arginine (Arg), presented the highest RSCU (relative synonymous codon usage) value of
1.92–1.96, while the CGC codon, also encoding Arg, and the AGC codon, encoding serine (Ser), showed the lowest
RSCU values of 0.31–0.32 and 0.31–0.33, respectively (Table S3).
Figure 2 illustrates the summary statistics for amino acid frequency and
relative synonymous codon usage. Among the 64 codons, 31 had RSCU values less than 1
(RSCU < 1), indicating a lower usage frequency than expected. Meanwhile, 30 codons were used more
frequently than expected (RSCU > 1) in P. campanulatum and P. filipes1, compared with 31
codons in the other seven species. Furthermore, the RSCU values of AUG and UGG in all nine species were
equal to one (RSCU = 1), showing no usage preference, while UCC only showed the same characteristics
in P. campanulatum and P. filipes1. Notably, methionine (AUG) and tryptophan (UGG) were each encoded by
only one codon. All codons with RSCU > 1 ended in adenine or thymine (A/T), apart from UUG and, in
P. franchetii, P. zanlanscianense1, P. cyrtonema1, P. sibiricum1, P. kingianum2,
H. alternicirrhosum and H. ginfushanicum, UCC. In contrast, 28 of the 31 codons with RSCU < 1
ended in guanine or cytosine (G/C) in each species. Across the nine species, there were almost no
differences in RSCU values, indicating that the codon usage bias of Polygonatum is rather stable (Fig. 2).
Long dispersed repeats and microsatellites analysis. A total of 378 long dispersed repeats were
observed in the seven Polygonatum and two Heteropolygonatum species, consisting of 191 palindromic repeats,
177 forward repeats, nine reverse repeats and one complementary repeat (the palindromic repeat formed by the IR
regions themselves was excluded in all nine species) (Table S4). Palindromic repeats were the dominant repeat
type (from 47.2% in P. filipes1 to 53.5% in P. zanlanscianense1), while complementary repeats were the least
frequent, being detected only in P. campanulatum (2.7%). P. franchetii and H. ginfushanicum
did not possess any reverse repeats. The species with the highest number of long
repeats was P. zanlanscianense1 (49), and the species with the lowest number was P. kingianum2 (35) (Fig. 3A).
In H. ginfushanicum, the longest repeat sequence was 66 bp, whereas in the other eight species it was
Figure 1. Gene map of the chloroplast genome among the Polygonatum species. Genes inside and outside the
circle are transcribed counter-clockwise and clockwise, respectively. The dark gray and light gray areas inside the
inner circle indicate GC content and AT content, respectively. LSC (large single-copy), SSC (small single-copy)
and the inverted repeats (IRa, IRb) are indicated inside the circle.
Species; Genome length (bp): Total, LSC, SSC, IR; Gene number (unique genes in parentheses): Total, PCG, tRNA, rRNA; GC content (%): Total, LSC, SSC, IR
H. alternicirrhosum 155,944 84,968 18,520 52,456 131 (113) 85 (79) 38 (30) 8 (4) 37.6 35.6 31.5 43.0
H. ginfushanicum 155,508 84,552 18,528 52,428 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.6 31.4 43.0
H. ogisui 155,665 84,784 18,533 52,348 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.6 31.4 43.0
H. pendulum 155,436 84,609 18,365 52,462* 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.7 43.0
P. acuminatifolium 155,354 84,271 18,455 52,628 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.8 31.6 43.0
P. annamense 155,277 84,340 18,422 52,515* 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. biflorum 155,470 84,291 18,469 52,710* 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. campanulatum 155,487 84,458 18,373 52,656 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.7 31.7 42.9
P. cirrhifolium 155,944 84,568 18,546 52,830 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.7 31.5 42.9
P. curvistylum 155,939 84,563 18,546 52,830 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.7 31.5 42.9
P. cyrtonema1 155,509 84,448 18,303 52,758 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.7 43.0
P. cyrtonema2 155,512 84,462 18,292 52,758 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.7 42.9
P. cyrtonema3 155,164 84,023 18,495 52,646 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. cyrtonema4 155,614 84,452 18,420 52,742 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.7 43.0
P. cyrtonema5 155,044 83,896 18,498 52,650 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. cyrtonema6 155,205 84,457 18,210 52,538 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.7 43.0
P. filipes1 155,361 84,307 18,454 52,600 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. filipes2 155,334 84,280 18,454 52,600 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. filipes3 155,317 84,262 18,455 52,600 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. filipes4 155,337 84,262 18,455 52,620 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. franchetii 155,962 84,722 18,566 52,674 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 43.0
P. govanianum 155,089 84,212 18,228 52,649* 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. hirtum 155,490 84,385 18,419 52,686 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 42.9
P. hookeri 155,976 84,600 18,546 52,830 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 42.9
P. hunanense1 155,618 84,448 18,426 52,744 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 42.9
P. hunanense2 155,609 84,438 18,427 52,744 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 42.9
P. hunanense3 155,608 84,437 18,427 52,744 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 42.9
P. hunanense4 155,618 84,448 18,426 52,744 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 42.9
P. inflatum 154,898 84,270 18,454 52,174 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. involucratum1 155,370 84,280 18,450 52,640 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. involucratum2 155,372 84,282 18,450 52,640 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. kingianum1 155,826 84,627 18,547 52,652 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. kingianum2 155,824 84,632 18,570 52,622 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 43.0
P. kingianum3 155,824 84,626 18,546 52,652 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. macropodum 154,610 83,554 18,464 52,592 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.8 31.6 43.0
P. mengtzense 155,498 84,498 18,469 52,531* 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 43.0
P. multiflorum 154,564 83,525 18,457 52,582 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.8 31.5 43.0
P. nodosum 155,205 84,143 18,422 52,640 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. odoratum 154,569 83,486 18,459 52,624 132 (113) 86 (79) 38 (30) 8 (4) 37.8 35.8 31.6 43.0
P. oppositifolium 155,760 84,471 18,544 52,745* 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.7 31.4 42.9
P. orientale 155,386 84,225 18,456 52,705* 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 43.0
P. prattii 155,915 84,538 18,547 52,830 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 42.9
P. punctatum 155,657 84,542 18,423 52,692 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 43.0
P. sibiricum1 155,514 84,537 18,415 52,562 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.7 43.0
P. sibiricum2 155,514 84,542 18,416 52,556 131 (113) 85 (79) 38 (30) 8 (4) 37.7 35.7 31.7 43.0
P. sibiricum3 155,549 94,843 18,416 42,290 132 (113) 86 (79) 38 (30) 8 (4) 37.7 36.1 31.7 43.0
P. sibiricum4 155,512 84,533 18,417 52,562 131 (113) 85 (79) 38 (30) 8 (4) 37.7 35.7 31.7 43.0
P. stenophyllum 156,028 84,677 18,561 52,790 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 43.0
P. stewartianum 155,867 84,540 18,559 52,768* 131 (113) 85 (79) 38 (30) 8 (4) 37.7 35.7 31.5 42.9
P. tessellatum1 155,688 84,488 18,564 52,636 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.7 31.5 42.9
P. tessellatum2 155,688 84,488 18,564 52,636 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.7 31.5 42.9
P. tessellatum3 155,724 84,485 18,497 52,742* 132 (113) 86 (79) 38 (30) 8 (4) 37.6 35.7 31.4 42.9
P. urceolatum 155,504 84,492 18,435 52,577* 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 43.0
P. verticillatum1 155,589 84,242 18,523 52,824 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.6 42.9
P. verticillatum2 155,856 84,545 18,523 52,788 132 (113) 86 (79) 38 (30) 8 (4) 37.7 35.7 31.5 42.9
71 bp, and all of them were forward repeats. Furthermore, among all repeats detected in the nine species,
repeats of 30 to 34 bp accounted for the majority (260, 68.1%) (Fig. 3B, Table S5). Most
repeats were detected in the CDS, followed by the IGS regions; some repeats were also shared between CDS, IGS,
tRNA and introns (Fig. 3C, Table S6). Most of the repeat sequences were located in the IR regions, except in
P. campanulatum and P. filipes1, which harbored the highest number of repeats in the LSC region (Fig. 3D, Table S7).
In this study, we observed 507 SSRs among the nine species in total, comprising 303 mono-, 91 di-, 27 tri-,
63 tetra-, 20 penta-, and two hexa-nucleotide repeats (TableS8). Moreover, a total of two mono-, three di-, four
tri-, eight tetra-, four penta-types and two hexa-nucleotide repeats types were identied. And one tri-, two tetra-,
three penta- and two hexa-nucleotide types were observed only once in only one species (TableS9). Most SSRs
were mononucleotide and dinucleotide repeats, besides, the rest of SSRs showed lower frequencies. As shown in
Fig.4a, mono-nucleotide repeats were the most frequent type ranging from 55.9% (Polygonatum kingianum2) to
61.8% (Heteropolygonatum ginfushanicum). e number of SSRs of H. alternicirrhosum reached a peak value of
64 among the nine species. On the other hand, P. sibiricum1 possessed the least number of SSRs of 50 (Fig.4a,
TableS9). e most dominant SSRs were A/T polymers (Fig.4b–j), suggesting a remarkable base preference. And
the majority of the microsatellites were located in the LSC region (TableS10). ese results indicate that there
were no distinctive dierences in SSRs between Polygonatum and Heteropolygonatum. e identied SSRs will
provide valuable genetic information for the phylogeny and population genetics of Polygonatum in the future.
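The SSR tally described above can be illustrated with a small Python sketch. It is not the tool or the settings used in this study: the minimum repeat counts (10, 5, 4, 3, 3 and 3 copies for mono- to hexanucleotide motifs) and the function name find_ssrs are assumptions made for illustration, and only perfect repeats are reported.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per motif length (1-6 nt); illustrative only,
# not taken from the study.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-nt motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // k))
    return sorted(hits)


# Toy sequence: (A)12, (AT)6 and (AAG)4 separated by short spacers.
toy = "GC" + "A" * 12 + "CG" + "AT" * 6 + "CC" + "AAG" * 4 + "C"
ssrs = find_ssrs(toy)
by_class = Counter(len(motif) for _, motif, _ in ssrs)
print(ssrs)
print(by_class)
```

Counting hits by motif length, as in by_class, gives the kind of per-class totals (mono-, di-, tri-nucleotide and so on) that the text reports for each genome.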
Comparative genome analysis and sequence variation. To identify highly variable regions among the seven species of Polygonatum and two species of Heteropolygonatum, multiple sequence alignment of the cp genomes was carried out, with the annotation of Polygonatum campanulatum set as the reference. Figure 5 shows that coding regions were much more conserved than non-coding regions, with almost no significant variation except in ycf1. Additionally, some intergenic spacers and introns showed considerable variation, including rps16-trnQ, trnS-trnG, atpF-atpH, atpH-atpI, petA-psbJ,
Species | Genome length (bp): Total | LSC | SSC | IR | Gene number: Total | PCG | tRNA | rRNA | GC content (%): Total | LSC | SSC | IR
P. verticillatum3 | 155,505 | 84,207 | 18,468 | 52,830 | 132 (113) | 86 (79) | 38 (30) | 8 (4) | 37.7 | 35.7 | 31.5 | 42.9
P. yunnanense | 155,363 | 84,229 | 18,427 | 52,707* | 132 (113) | 86 (79) | 38 (30) | 8 (4) | 37.7 | 35.7 | 31.6 | 43.0
P. zanlanscianense1 | 155,787 | 84,418 | 18,539 | 52,830 | 132 (113) | 86 (79) | 38 (30) | 8 (4) | 37.7 | 35.7 | 31.6 | 42.9
P. zanlanscianense2 | 155,911 | 84,650 | 18,431 | 52,830 | 132 (113) | 86 (79) | 38 (30) | 8 (4) | 37.6 | 35.6 | 31.5 | 42.9
P. zanlanscianense3 | 155,911 | 84,650 | 18,431 | 52,830 | 132 (113) | 86 (79) | 38 (30) | 8 (4) | 37.6 | 35.6 | 31.5 | 42.9
Table 2. General information and comparison of the 57 cp genomes of Polygonatum and 4 cp genomes of Heteropolygonatum. *The length of the two IR regions is different in this sequence. Newly sequenced species in this study are highlighted in bold.
Table 3. The annotated genes in the chloroplast genomes of Polygonatum. ^a Genes with one intron. ^b Genes with two introns. ^c Two genes copied in IR regions.
Category | Gene group | Gene name
Self-replication | Ribosomal RNA | rrn4.5^c, rrn5^c, rrn16^c, rrn23^c
 | Transfer RNA | trnA-UGC^a,c, trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnG-GCC, trnG-UCC^a, trnH-GUG^c, trnI-CAU^c, trnI-GAU^a,c, trnK-UUU^a, trnL-CAA^c, trnL-UAA^a, trnL-UAG, trnM-CAU, trnfM-CAU, trnN-GUU^c, trnP-UGG, trnQ-UUG, trnR-UCU, trnR-ACG^c, trnS-UGA, trnS-GCU, trnS-GGA, trnT-GGU, trnT-UGU, trnV-UAC^a, trnV-GAC^c, trnW-CCA, trnY-GUA
 | Small subunit of ribosome | rps2, rps3, rps4, rps7^c, rps8, rps11, rps12^a,c, rps14, rps15, rps16^a, rps18, rps19^c
 | Large subunit of ribosome | rpl2^a,c, rpl14, rpl16^a, rpl20, rpl22, rpl23^c, rpl32^a, rpl33, rpl36
 | RNA polymerase subunits | rpoA, rpoB, rpoC1^a, rpoC2
Photosynthesis | Photosystem I | psaA, psaB, psaC, psaI, psaJ
 | Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
 | Subunits of cytochrome | petA, petB^a, petD^a, petG, petL, petN
 | ATP synthase | atpA, atpB, atpE, atpF^a, atpH, atpI
 | NADH-dehydrogenase | ndhA^a, ndhB^a,c, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
Other genes | Rubisco large subunit | rbcL
 | Translational initiation | infA
 | Maturase | matK
 | Envelope membrane protein | cemA
 | Acetyl-CoA-carboxylase | accD
 | Proteolysis | clpP^b
 | c-type cytochrome synthesis gene | ccsA
 | Conserved open reading frames | ycf1, ycf2^c, ycf3^b, ycf4
ndhF-rpl32, rpl32-trnL and rpl16. Another significant result was that, compared with the IR regions, the LSC and SSC regions showed higher variation, consistent with the result of the nucleotide polymorphism analysis (Fig. 8). Apart from ycf1, all the highly divergent regions mentioned above were in single-copy regions. The tRNA and rRNA genes were strongly conserved, without evident variation. Additionally, collinearity analysis found no interspecific or intraspecific rearrangements in the nine species (Fig. 6).
Expansion and contraction of IRs. A comprehensive comparison of the boundaries between the single-copy and IR regions was carried out. We observed that the cp genome structure of the nine species varied only slightly. Except in Polygonatum sibiricum, the LSC/IRb junction lay between the rpl22 and rps19 genes in the other eight species. The rpl22 gene was located entirely in the LSC region, 26 bp to 34 bp away from the LSC/IRb border, while the rps19 genes within the IR regions were close to the two IR/LSC boundaries. Furthermore, in P. sibiricum, the two rps19 genes extended into the LSC region due to the contraction of the IRs (Fig. 7), so that the copy located at the IRa/LSC junction was a pseudogene. Apart from this special case, rps19 was highly conserved in the other species, with the same length of 279 bp. Likewise, the rpl22 gene was also highly conserved, with the same length of 366 bp in all nine species. Moreover, the ndhF gene was located at the IRb/SSC boundary and extended into the IRb region by 22, 29 or 34 bp. The trnN gene was close to the IR/SSC boundaries, with the whole gene within the IR regions. The ycf1 gene ranged from 4454 to 4573 bp in length and straddled the SSC/IRa boundary, with 883–895 bp in the IRa region and the rest in the SSC region (Fig. 7). At the IRa/LSC boundary, the rps19 gene was located on the left side and the psbA gene on the right; psbA was highly conserved, with a constant length of 1062 bp. The distance between psbA and the IRa/LSC junction varied from 87 to 94 bp.
Together, these results provide important insights into the contraction and expansion of IR region borders in Polygonatum and Heteropolygonatum. The structures and gene orders of the two genera were relatively conserved except in P. sibiricum, in which a slight expansion and contraction occurred between the IRs and the LSC.
Nucleotide diversity and selective pressure analysis. The nucleotide diversity of the nine chloroplast genomes of Polygonatum and Heteropolygonatum was calculated to detect divergence hotspots. The pair of inverted repeats were relatively conserved regions, with an average Pi value of 0.00113, whereas the LSC and SSC showed higher nucleotide diversity, with mean Pi values of 0.00492 and 0.00674, respectively. Significant
Figure 2. Relative synonymous codon usage (RSCU) values of the 20 amino acids and stop codons of seven Polygonatum and two Heteropolygonatum species, based on protein-coding sequences in the chloroplast genomes. The colors of the bars correspond to the colors of the codons. Each amino acid corresponds to nine histograms, and the y-axis represents the RSCU value. The order of the nine columns from left to right is P. campanulatum, P. filipes1, P. franchetii, P. zanlanscianense1, P. cyrtonema1, P. sibiricum1, P. kingianum2, H. alternicirrhosum and H. ginfushanicum.
variations (Pi > 0.014) were found in the following regions: trnK-UUU-rps16, trnC-GCA-petN, trnT-UGU-trnL-UAA, ccsA-ndhD and ycf1 (Fig. 8), of which the most divergent was trnK-UUU-rps16, with a Pi value of 0.01565. Of these five regions, four (80%) were intergenic regions and one (20%) was a protein-coding region, indicating that non-coding regions harbored more variation and coding regions were more stable and conserved. Moreover, all five divergent hotspots are potential molecular markers for DNA barcodes for species identification and phylogenetic studies in the future.
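The window-based diversity scan described above (600 bp windows moved in 200 bp steps, as stated in the caption of Fig. 8) can be sketched in Python as follows. This is a simplified illustration, not the software used in the study: Pi is computed here as the mean proportion of differing sites over all sequence pairs, positions with gaps or ambiguous bases are skipped for each pair, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_pi(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    total, pairs = 0.0, 0
    for a, b in combinations(seqs, 2):
        compared = diffs = 0
        for x, y in zip(a, b):
            if x in "ACGT" and y in "ACGT":  # skip gaps and ambiguous bases
                compared += 1
                diffs += x != y
        if compared:
            total += diffs / compared
            pairs += 1
    return total / pairs if pairs else 0.0


def sliding_pi(aligned, window=600, step=200):
    """Return (window_start, Pi) for each window along an alignment."""
    length = len(aligned[0])
    starts = range(0, max(length - window, 0) + 1, step)
    return [(s, pairwise_pi([seq[s:s + window] for seq in aligned])) for s in starts]


def hotspots(windows, threshold=0.014):
    """Window starts whose Pi exceeds the threshold (0.014 in the text)."""
    return [s for s, pi in windows if pi > threshold]


# Toy alignment: three 8-bp sequences, scanned with 4-bp windows.
toy = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT"]
windows = sliding_pi(toy, window=4, step=4)
print(windows)
print(hotspots(windows))
```

In the toy alignment the first window is invariant and the second carries all the differences, so only the second window start is returned as a hotspot.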
Synonymous nucleotide substitutions preserve the encoded amino acid, whereas non-synonymous substitutions change it. The rates of non-synonymous (dN) and synonymous (dS) substitution have been widely used for quantifying adaptive molecular evolution in the chloroplast genome50. In the current study, according to the BEB method, a total of 14 genes corresponding to 65 sites were detected under positive selection. Among them, four genes (rpoC2, rpoB, psaA, ndhK) were identified as under significant positive selection, and ten genes (psbA, psbK, atpA, rpoC1, psbD, psbC, psbZ, psaB, rps4, ndhJ) as under positive selection (Table S11). All the selected genes were located in the LSC region, and 10 were related to photosynthesis. We observed that rpoC2 harbored the highest number of sites under positive selection (13), followed by psaA (12) and rpoB (11).
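The distinction between synonymous and non-synonymous substitutions can be made concrete with a short Python sketch. It only counts codon-level differences between two aligned coding sequences under the standard genetic code; it does not estimate dN/dS or perform the BEB analysis reported here (no normalisation by the number of synonymous and non-synonymous sites, and no correction for multiple substitutions). The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(ref, alt):
    """Count codons differing synonymously and non-synonymously between two aligned CDS."""
    syn = nonsyn = 0
    for i in range(0, min(len(ref), len(alt)) - 2, 3):
        c1, c2 = ref[i:i + 3].upper(), alt[i:i + 3].upper()
        if c1 == c2 or c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue  # identical codons, gaps or ambiguous bases
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Toy example: CTT -> CTC keeps leucine; GAA -> GAT changes Glu to Asp.
print(classify_codon_changes("ATGCTTGAA", "ATGCTCGAT"))
```

A ratio-based test of selection additionally needs the numbers of synonymous and non-synonymous sites in each gene, which is why dedicated codon-model software is normally used for that step.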
Phylogenetic analysis of Polygonatum. A total of 62 cp sequences of Polygonatum and its related species were selected to reconstruct phylogenetic relationships within this genus. Maianthemum henryi was chosen as the outgroup owing to its close relationship and more basal position relative to Polygonatum and Heteropolygonatum. The 62 cp sequences comprise six newly sequenced genomes (i.e., Polygonatum campanulatum, P. filipes1, P. franchetii, P. zanlanscianense1, P. cyrtonema1, P. sibiricum1) and 56 cp genomes published in NCBI (Table S1). The topologies of the maximum likelihood (ML) and Bayesian inference (BI) trees were highly consistent in both tree structure and species position, with generally strong support (Fig. 9). The difference is that the BI analysis could not resolve the branching among some samples belonging to the same species (Fig. 9). Both Polygonatum
Figure 3. Analysis of long dispersed repeats in the cp genomes of seven Polygonatum and two Heteropolygonatum species. (A) The number of the four types of long repeats. (B) Distribution ratio of repeats in regions of the cp genome. (C) Distribution ratio of repetitive sequences in functional regions. (D) Proportion of repeats in different length intervals of the chloroplast genome.
and Heteropolygonatum were monophyletic and shared a most recent common ancestor. Polygonatum was divided into two main lineages: sect. Verticillata and a clade consisting of sect. Polygonatum and sect. Sibirica. Phylogenetic analysis suggested that sect. Sibirica comprises only one species, P. sibiricum. Moreover, we also observed that P. verticillatum and P. cyrtonema were paraphyletic. P. verticillatum1 was sister to P. zanlanscianense (BS = 100, PP = 1.00), while P. verticillatum2 appeared as the sister clade to P. curvistylum + P. pratti + P. stewartianum (BS = 100, PP = 1.00), and P. verticillatum3 was located at the base of the branch composed of P. curvistylum + P. pratti + P. stewartianum + P. verticillatum2 + P. hookeri + P. cirrhifolium + P. verticillatum3 (BS = 100, PP = 1.00). Four samples of P. cyrtonema, including the newly sequenced one, appeared as sister to P. hunanense (BS = 100, PP = 1.00), and this clade was located at the base of sect. Polygonatum. The other two samples were recovered as the sister clade to P. hirtum with high bootstrap support and Bayesian posterior probability (BS = 100, PP = 1.00). P. franchetii was the sister clade to P. stenophyllum (BS = 100, PP = 1.00). Furthermore, P. filipes was strongly supported as belonging to sect. Polygonatum and as sister to P. yunnanense plus P. nodosum (BS = 99, PP = 1.00). Surprisingly, P. campanulatum, which has alternate leaves, was located in sect. Verticillata, a group characterized by whorled leaves, and formed a sister clade with Polygonatum tessellatum plus Polygonatum oppositifolium (BS = 100, PP = 1.00), which suggests that leaf arrangement is not a suitable basis for delimiting subgeneric groups in Polygonatum.
Figure 4. Simple sequence repeat (SSR) analysis of the complete chloroplast genomes of the seven Polygonatum and two Heteropolygonatum species. (a) Numbers of mono-, di-, tri-, tetra-, penta- and hexa-nucleotide repeats. (b–j) Frequencies of SSR motifs in different repeat class types.
Discussion
Features of complete chloroplast genome and comparative analyses. In the current study, we report the first complete cp genome of a critically endangered Polygonatum species, Polygonatum campanulatum. Additionally, the complete cp genomes of five other species (P. cyrtonema1, P. franchetii, P. filipes1, P. zanlanscianense1, P. sibiricum1) were newly sequenced using Illumina sequencing technology. Comparative analyses of the plastomes were carried out among the six species plus another three related species (P. kingianum2, Heteropolygonatum alternicirrhosum, H. ginfushanicum) to understand the potential genetic information of Polygonatum. The cp genomes showed a typical quadripartite structure, with lengths between 154,564 and 156,028 bp in Polygonatum and 155,436–155,944 bp in Heteropolygonatum. The range of chloroplast genome length variation in these two genera was similar to that of other Asparagaceae and higher plants reported previously51–55. The size changes are partially caused by elongation or contraction of the inverted repeat regions.
Our study revealed that gene content and gene order in the cp genomes of Polygonatum and Heteropolygonatum were highly conserved, with only slight variations in gene size, gene position and gene number. This result is similar to other species of Asparagaceae56. All plastomes contained 131–132 genes, comprising 85–86 protein-coding genes, 38 tRNA genes and eight rRNA genes. Among these genes, 18 contained introns and 19 were duplicated in the IR regions. The difference in gene number is due to pseudogenization of rps19 and ycf1 in some sequences. In detail, one of the rps19 genes in P. stewartianum, P. sibiricum1 and P. sibiricum2 was a pseudogene. The first is attributed to a mutation and the others to the gene’s location at the IR/LSC boundary, which prevents it from being duplicated completely. In addition, both ycf1 genes were pseudogenized in H. ogisui due to the insertion of a sequence. Expression of the rps19 gene is relatively unstable among species of Asparagaceae;
Figure 5. Alignment of chloroplast genomes of Heteropolygonatum alternicirrhosum, H. ginfushanicum, Polygonatum campanulatum, P. filipes1, P. franchetii, P. zanlanscianense1, P. cyrtonema1, P. sibiricum1 and P. kingianum2. The grey arrows at the top represent the direction of gene translation, and the y-axis indicates the percentage identity between 50 and 100%. (Exon: protein codes; UTR: tRNAs and rRNAs; CNS: conserved noncoding sequences).
the pseudogenization of rps19 has also been reported in Behnia reticulata, Hesperaloe parviflora and Hosta ventricosa, while Camassia scilloides and Chlorophytum rhizopendulum lack this gene completely57. The rps2, infA and other pseudogenes reported previously in Asparagaceae were not detected in this study57,58. In addition, although there were no remarkable variations in GC content among different species, the distribution of GC content was asymmetrical. The higher GC content in the IRs implies a more stable structure, since GC pairs form three hydrogen bonds whereas AT pairs form two59. Moreover, the high GC content of the IRs may be attributed to the four rRNA genes, which have a high GC percentage. Similar results have been found in the chloroplast genomes of other angiosperms60–62.
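The regional GC comparison discussed above amounts to computing the GC percentage separately for each part of the quadripartite genome. A minimal Python sketch is given below; the toy sequence, the region coordinates and the function names are all made up for illustration and do not correspond to any genome in this study.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases in seq."""
    seq = seq.upper()
    gc = sum(seq.count(b) for b in "GC")
    total = gc + sum(seq.count(b) for b in "AT")
    return 100.0 * gc / total if total else 0.0


def region_gc(genome, regions):
    """GC percentage per region; regions maps name -> (start, end), 0-based, end exclusive."""
    return {name: round(gc_content(genome[start:end]), 1)
            for name, (start, end) in regions.items()}


# Toy genome and hypothetical coordinates, for illustration only.
toy_genome = "ATATATGC" + "ATATATAT" + "GCGCATAT"
toy_regions = {"LSC": (0, 8), "SSC": (8, 16), "IR": (16, 24)}
print(region_gc(toy_genome, toy_regions))
```

Applied to real LSC, SSC and IR coordinates from an annotation, the same function would reproduce the kind of per-region percentages listed in Table 2.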
The pattern of codon usage is a vital genetic characteristic of an organism, related to mutation, selection and other molecular evolutionary phenomena63. Our results demonstrated that leucine (Leu) had the highest frequency of all amino acids in Polygonatum campanulatum, P. filipes1, P. franchetii, P. zanlanscianense1, P. cyrtonema1, P. sibiricum1, P. kingianum2, Heteropolygonatum alternicirrhosum and H. ginfushanicum. On the contrary, cysteine (Cys) was the least abundant amino acid apart from stop codons, which has also been found in other angiosperm taxa24,64. Furthermore, the RSCU analysis showed that most codons with an RSCU value greater than one ended with A or U, whereas most codons with an RSCU value less than one ended with C or G. This phenomenon revealed that codon usage was biased towards A and U at the third codon position in Polygonatum, which coincided with previous studies56,61,65.
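For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below illustrates this definition under the standard genetic code; it is not the program used in this study, and the function name and toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} for every codon family observed in the coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


# Toy CDS with four leucine codons: TTA x3 and CTG x1.
values = rscu(["TTATTATTACTG"])
preferred = sorted(c for c, v in values.items() if v > 1)
print(values["TTA"], values["CTG"], preferred)
```

Codons with RSCU above one are used more often than expected; checking their third base is how an A/U-ending bias of the kind described above would be assessed.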
Long dispersed repeats are essential for the rearrangement and stability of the chloroplast genome and are relevant to copy number differences among species66. Identifying their number and distribution plays a key role in genomic studies67. The current study found that palindromic repeats were the most common repeat type, followed by forward repeats. A complementary repeat was identified only in P. campanulatum, whereas P. franchetii and H. ginfushanicum did not harbor any reverse repeats. In the plastomes of the nine species reported here, repeats of 30 to 39 bp were dominant, which is commonly observed in other angiosperm lineages31,52,68. Our study also revealed that the repetitive sequences were not randomly distributed in the seven cp genomes of Polygonatum and two cp genomes of Heteropolygonatum; they were mainly found in the LSC region (48.7%) and in CDS (51.9%).
SSRs (simple sequence repeats) are important codominant DNA molecular markers with the advantages of high abundance, random distribution throughout the genome and ample polymorphism information69,70. They therefore provide essential insights in many fields, such as species identification, phylogeography and population genetics71,72. A total of 507 SSRs were detected in the current study, with H. alternicirrhosum containing the most. Further, among the seven cp genomes of Polygonatum and two cp genomes of Heteropolygonatum, six categories of SSRs were observed in total. Mononucleotide SSRs showed the highest frequency in each genome, with A/T as the predominant motif type. Similar results have been reported in numerous taxa53,61,73. By contrast, hexanucleotide SSRs were the rarest type, with only one element observed in each of P. cyrtonema1 and P. filipes1. In addition, SSRs lying within LSC regions accounted for the majority (72.4%), in agreement with previous studies65,68. In summary, the microsatellites identified in this study can be developed as markers for Polygonatum and will contribute to species identification and evolutionary studies of this genus in the future.
Multiple sequence alignment revealed the similarity of the cp genomes in structure, content and order among Polygonatum and its related species. Consistent with previous reports74–76, we also found that non-coding regions harbored more distinctive variation than coding regions in this study. The two single-copy regions exhibited higher sequence divergence than the IRs. The following seven intergenic regions, i.e., rps16-trnQ, trnS-trnG, atpF-atpH, atpH-atpI, petA-psbJ, ndhF-rpl32 and rpl32-trnL, and two genes, i.e., ycf1 and rpl16, were detected as the most divergent. Comparative analysis of Polygonatum and its related species showed that the cp genomes were highly conserved, and no interspecific or intraspecific rearrangement was detected.
Figure 6. Genomic rearrangement of the seven Polygonatum and two Heteropolygonatum species. Blocks in different colors correspond to different gene types. Black: transfer RNA (tRNA); green: intron-containing tRNA; red: ribosomal RNA; white: protein-coding genes (PCGs).
Contraction and expansion of the IR regions lead to variations in cp genome size, which have commonly been observed in the evolutionary history of terrestrial plants62. The size of the IR regions was relatively similar in Polygonatum and Heteropolygonatum, ranging from 26,214 bp in H. ginfushanicum to 26,415 bp in P. zanlanscianense1. Although all the cp genomes were similar in overall gene order and structure, several variations were identified at the IR/SC junctions. The current study demonstrated that the boundary genes in Polygonatum were mainly rpl22, rps19, trnN, ndhF, ycf1 and psbA, the same as in Heteropolygonatum and Hosta56. This further confirms that boundary features are relatively stable across closely related species77. The LSC/IRb boundary was traversed by the rps19 gene in P. sibiricum1, whereas the junction was located between rpl22 and rps19 in the other species. Incomplete duplication of the normal copy resulted in pseudogenization of the rps19 gene located at the IRa/LSC boundary, a phenomenon that has also been reported in Polygonatum cyrtonema (MZ029094)14 and other taxa of Asparagaceae, such as Behnia reticulata, Hesperaloe parviflora and Hosta ventricosa57. Excluding rps19, the other genes situated at SC/IR boundaries were relatively stable across the six Polygonatum and two Heteropolygonatum species studied in this work. Only ndhF and ycf1 varied slightly in size. The high resemblance of the SC/IR boundaries also demonstrates that all the species share the same genes. Besides, the total number of genes does not change due to IR contraction and expansion78.
Figure 7. Comparative analysis of the LSC, IR and SSC boundary regions in the nine chloroplast genomes.
We found that trnK-UUU-rps16, trnC-GCA-petN, trnT-UGU-trnL-UAA, ccsA-ndhD and ycf1 were prominent divergent regions, with nucleotide diversity greater than 0.014. Three loci (matK-rps16, trnC-GCA-petN and ccsA) are consistent with a previous study14. The results indicated that most divergent regions were located in the LSC, and the IR regions displayed relatively low diversity, which agreed with the results of the multiple sequence alignment conducted with mVISTA. The same phenomenon has been observed in many taxa24,31. The regions detected in the nucleotide diversity analysis might also provide additional genetic information for DNA barcodes in Polygonatum, but this requires the support of further experiments.
The non-synonymous (dN) and synonymous (dS) substitution rates are useful for inferring the adaptive evolution of genes25,79. The analysis of dN/dS was carried out owing to its popularity and reliability in quantifying selective pressure80,81. In this study, a total of 14 positively selected genes (four under significant positive selection and ten under positive selection) were detected with the BEB method: atpA, ndhJ, ndhK, psaA, psaB, psbA, psbC, psbD, psbK, psbZ, rpoB, rpoC1, rpoC2 and rps4. Ten of the 14 positively selected genes are relevant to photosynthesis (Table S11). Plants of Polygonatum are mainly distributed in shady places in forests, scrub or on mountain slopes11. The weak sunlight may exert selective pressure, which could leave a trace of natural selection in chloroplast genes engaged in adaptation to the environment. It can be speculated that photosynthesis-related genes drive the successful adaptation of Polygonatum to diverse environmental conditions, considering its extensive distribution range in the northern hemisphere. Photosynthesis-related genes were also found to undergo positive selection in other taxa that are widely distributed or live in shady environments82–86.
Phylogenetic analysis. Phylogenetic analysis based on complete cp genomes demonstrated that both Polygonatum and Heteropolygonatum were monophyletic. In agreement with previous studies7,13,28, Polygonatum was composed of three major clades: sect. Verticillata, sect. Sibirica and its sister clade sect. Polygonatum. In the current study, we observed that sect. Sibirica contained only one species, P. sibiricum, which was consistent with the findings of Xia, Meng and Wang7,14,28. However, data from Floden13 suggest that one sample of P. verticillatum was sister to P. sibiricum within sect. Sibirica. Moreover, previous studies indicated that P. verticillatum was paraphyletic, potentially as a result of its wide geographic distribution and diverse morphological variation13,28. A similar result was obtained in this study. P. verticillatum1 was recovered as the sister clade to P. zanlanscianense, while P. verticillatum2 was sister to P. curvistylum + P. pratti + P. stewartianum, and P. verticillatum3 was located at the base of the branch composed of P. curvistylum + P. pratti + P. stewartianum + P. verticillatum2 + P. hookeri + P. cirrhifolium + P. verticillatum3. In line with previous findings28, P. cyrtonema was also recovered as paraphyletic in this study, given that four samples, including the newly sequenced one, appeared as sister to P. hunanense, while the other two samples were sister to P. hirtum. All the clades were
Figure 8. Nucleotide diversity analysis of the complete chloroplast genomes of the seven Polygonatum and two Heteropolygonatum species (window length: 600 bp; step size: 200 bp).
Figure 9. Phylogenetic relationships of the 57 cp sequences of Polygonatum and 4 of Heteropolygonatum, with Maianthemum henryi set as the outgroup. Maximum likelihood (ML) and Bayesian inference (BI) methods were used to reconstruct the tree. Only the ML tree is shown because the topologies of the ML and BI trees were highly consistent. ML support values and Bayesian posterior probabilities are shown above the branches. The cp genomes newly sequenced in this study are highlighted with red triangles.
highly supported. This suggests that the circumscription of these two broadly distributed species, P. cyrtonema and P. verticillatum, requires further study.
Few studies have addressed the systematic position of P. franchetii, and even fewer its cp genome. Meng’s team7 first reported phylogenetic relationships that included P. franchetii, using four chloroplast fragments (rbcL, psbA-trnH, trnK and trnC-petN). Regrettably, the branch to which P. franchetii belonged was poorly resolved, making it difficult to recognize the relationship between P. franchetii and its close relatives. Wang-Jing14 reported the cp genome of P. franchetii for the first time. However, that sample clustered with P. hirtum + P. multiflorum and was located in sect. Polygonatum, which differs from this study. Our study suggests that P. franchetii is strongly supported as the sister clade to P. stenophyllum and is situated in sect. Verticillata. Furthermore, P. filipes was the sister clade to P. yunnanense plus P. nodosum within sect. Polygonatum in this study. Xia et al.28 found that P. filipes was sister to the clade consisting of P. inflatum + P. multiflorum + P. odoratum + P. macropodum + P. involucratum + P. acuminatifolium + P. arisanense + P. orientale + P. yunnanense + P. nodosum with high support. However, the clade composed of P. yunnanense + P. nodosum was weakly supported as sister to the remaining species in the sister clade of P. filipes. Besides, P. filipes is the sister clade to P. cyrtonema, and this branch clusters with P. jinzhaiense and P. hunanense in Wang-Jing’s study14. This suggests that the voucher specimens of P. filipes and P. franchetii in Wang’s study14 should be checked further.
One unanticipated finding was that the phylogenetic tree strongly supported the placement of Polygonatum campanulatum in sect. Verticillata, even though P. campanulatum has alternate leaves whereas sect. Verticillata is characterized by whorled or opposite leaves. P. campanulatum was compared to P. gongshanense and P. franchetii when it was first published, but material of P. gongshanense was not available for this work. Furthermore, phylogenetic analysis indicated that P. franchetii and P. campanulatum fell on separate branches, whereas P. tessellatum + P. oppositifolium was highly supported as sister to P. campanulatum (BS = 100, PP = 1.00). Although P. campanulatum, P. tessellatum and P. oppositifolium share similar lustrous, lanceolate leaves2,87, they differ in leaf arrangement, filament structure, florescence, etc. In detail, P. campanulatum is characterized by alternate leaves, a retrorse spur at the filament apex and flowering in October, while P. tessellatum and P. oppositifolium have whorled or opposite leaves, lack a retrorse spur at the filament apex and flower in May2,87. Moreover, previous studies found that leaf arrangement is labile and that whorled leaves have arisen from the alternate state at least twice7,88. In conclusion, we infer that the use of phyllotaxis to define subgenera within Polygonatum is inappropriate. Additionally, flower color and pollen exine sculpture were also used as features to subdivide Polygonatum in previous studies7,12,89. Whereas sect. Verticillata typically displays reticulate pollen exines and purple or pink perianths, sect. Polygonatum is distinguished by perforated pollen exines and greenish-white or yellow perianths7,89. In contrast, P. campanulatum, placed in Verticillata, has perforate-reticulate exine ornamentation and perianths that are either yellowish green or greenish white87. The controversy over flower color has been reported in the study of Xia and her team28. From this, we can see that flower color and pollen exine sculpture may be unrelated to phylogeny and are likewise not an ideal basis for the subgeneric classification of Polygonatum. Moreover, further research is required on the base chromosome numbers and karyotypes of P. campanulatum. This work will contribute to a better understanding of the infrageneric classification of Polygonatum and demonstrates that the cp genome is an efficient tool for resolving species-level phylogeny.
Conclusion
In the current study, we sequenced and annotated the cp genomes of Polygonatum campanulatum, P. franchetii, P. filipes1, P. zanlanscianense1, P. cyrtonema1 and P. sibiricum1. Comparative analyses of the chloroplast genomes of the six taxa and three related species were conducted. Genome size, gene content, gene order and GC content were highly similar among the cp genomes of Polygonatum and Heteropolygonatum. No interspecific or intraspecific rearrangements were detected. Five highly variable regions were identified as potential specific DNA barcodes. Fourteen genes were found to be under positive selection, and a large variety of repetitive sequences were identified. Sixty-two cp sequences of Polygonatum and its related species were used for phylogenetic analyses. The phylogenetic results showed that Polygonatum can be divided into two major clades, sect. Verticillata and sect. Sibirica plus sect. Polygonatum. Further, P. campanulatum and P. tessellatum + P. oppositifolium were strongly supported as sister groups and located in sect. Verticillata, suggesting that leaf arrangement is not a suitable basis for delimiting subgeneric groups in Polygonatum. Additionally, P. franchetii is also sister to P. stenophyllum within sect. Verticillata. With its high morphological and karyological diversity, Polygonatum has attracted much attention in phylogenetic and taxonomic research. Our analysis provides more chloroplast genomic information for Polygonatum and contributes to improving species identification and phylogenetic studies in further work.
Data availability
All data generated or analyzed during this study are included in this published article, and the complete
chloroplast genome sequences of Polygonatum campanulatum, P. cyrtonema1, P. filipes1, P. franchetii, P. sibiricum1 and
P. zanlanscianense1 are deposited in GenBank under accession numbers ON534060, ON534061, ON534062, ON534063,
ON534064 and ON534059, respectively. Information on the other samples used for phylogenetic analysis, downloaded
from GenBank, can be found in Additional Table 1: Table S1.
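For readers who want to retrieve the six deposited plastomes, the accession list above can be turned into download links. The following is a minimal sketch, not part of the authors' workflow: it pairs each species with its accession in the order given ("respectively") and builds request URLs for the public NCBI E-utilities `efetch` endpoint. The names `ACCESSIONS` and `efetch_url` are hypothetical helpers introduced only for illustration, and the sketch constructs URLs without performing any network request.

```python
from urllib.parse import urlencode

# Species-to-accession pairs, in the order listed in the Data availability statement.
ACCESSIONS = {
    "Polygonatum campanulatum": "ON534060",
    "P. cyrtonema1": "ON534061",
    "P. filipes1": "ON534062",
    "P. franchetii": "ON534063",
    "P. sibiricum1": "ON534064",
    "P. zanlanscianense1": "ON534059",
}

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="gb"):
    """Build an efetch URL for one nucleotide record (hypothetical helper).

    rettype="gb" requests a GenBank flat file; "fasta" requests the sequence only.
    """
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


if __name__ == "__main__":
    # Print one tab-separated line per deposited plastome.
    for species, accession in ACCESSIONS.items():
        print(species, accession, efetch_url(accession), sep="\t")
```

Each printed URL can be opened in a browser or passed to a download tool; NCBI's usage guidelines on request rates apply to any automated retrieval.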
Received: 10 November 2022; Accepted: 24 April 2023
References
1. Iv, A. P. G. An update of the angiosperm phylogeny group classification for the orders and families of flowering plants: APG IV. Bot. J. Linn. Soc. 181, 1–20 (2016).
2. Chen, S. C. & Tamura, M. N. Flora of China Vol. 24, 223–232 (Science Press & Missouri Botanical Garden Press, 2000).
3. Therman, E. Chromosomal evolution in the genus Polygonatum. Hereditas 39, 277–288 (1953).
4. Tamura, M. N., Schwarzbach, A. E., Kruse, S. & Reski, R. Biosystematic studies on the genus Polygonatum (Convallariaceae) IV. Molecular phylogenetic analysis based on restriction site mapping of the chloroplast gene trnK. Feddes Repertorium 108, 159–168 (1997).
5. Pharmacopoeia Commission. The Pharmacopoeia of the People’s Republic of China 319 (China Medical Science Press, 2015).
6. Zhao, L. H., Zhou, S. D. & He, X. J. A phylogenetic study of Chinese Polygonatum (Polygonateae, Asparagaceae). Nord. J. Bot. 37, 02019. https://doi.org/10.1111/njb.02019 (2019).
7. Meng, Y., Nie, Z. L., Deng, T., Wen, J. & Yang, Y. P. Phylogenetics and evolution of phyllotaxy in the Solomon’s seal genus Polygonatum (Asparagaceae: Polygonateae). Bot. J. Linn. Soc. 176, 435–451. https://doi.org/10.1111/boj.12218 (2014).
8. Tamura, M. N., Ogisu, M. & Xu, J. M. Heteropolygonatum, a new genus of the tribe Polygonateae (Convallariaceae) from West China. Kew Bull. 52, 949–956. https://doi.org/10.2307/4117821 (1997).
9. Szczecinska, M., Sawicki, J., Polok, K., Holdynski, C. & Zielinski, R. Comparison of three Polygonatum species from Poland based on DNA markers. Ann. Bot. Fenn. 43, 379–388 (2006).
10. Baker, J. G. Revision of the species and genera of Asparagaceae. Bot. J. Linn. Soc. 14, 508–629. https://doi.org/10.1111/j.1095-8339.1875.tb00349.x (1875).
11. Wang, F. T. & Tang, T. Flora Reipublicae Popularis Sinicae Vol. 15, 52–80 (Science Press, 1978).
12. Tamura, M. N. Biosystematic studies on the genus Polygonatum (Liliaceae): III. Morphology of staminal filaments and karyology of eleven Eurasian species. Bot. Jahrb. Syst. 115, 1–26 (1993).
13. Floden, A. J. & Schilling, E. E. Using phylogenomics to reconstruct phylogenetic relationships within tribe Polygonateae (Asparagaceae), with a special focus on Polygonatum. Mol. Phylogenet. Evol. 129, 202–213. https://doi.org/10.1016/j.ympev.2018.08.017 (2018).
14. Wang, J. et al. Comparative analysis of chloroplast genome and new insights into phylogenetic relationships of Polygonatum and tribe Polygonateae. Front. Plant Sci. 13, 882189. https://doi.org/10.3389/fpls.2022.882189 (2022).
15. Zheng, X. M. et al. Inferring the evolutionary mechanism of the chloroplast genome size by comparing whole-chloroplast genome sequences in seed plants. Sci. Rep. 7, 1555. https://doi.org/10.1038/s41598-017-01518-5 (2017).
16. Ravi, V., Khurana, J. P., Tyagi, A. K. & Khurana, P. An update on chloroplast genomes. Plant Syst. Evol. 271, 101–122. https://doi.org/10.1007/s00606-007-0608-0 (2007).
17. Jansen, R. K. et al. Methods in Enzymology Vol. 395, 348–384 (Academic Press, 2005).
18. Jiang, H. et al. Comparative and phylogenetic analyses of six Kenya Polystachya (Orchidaceae) species based on the complete chloroplast genome sequences. BMC Plant Biol. 22, 1. https://doi.org/10.1186/s12870-022-03529-5 (2022).
19. Palmer, J. D. Comparative organization of chloroplast genomes. Ann. Rev. Genet. 19, 325–354 (1985).
20. Ruhlman, T. A. & Jansen, R. K. The plastid genomes of flowering plants. Methods Mol. Biol. 1132, 3–38. https://doi.org/10.1007/978-1-62703-995-6_1 (2014).
21. Zhang, Y. J. & Li, D. Z. Advances in phylogenomics based on complete chloroplast genomes. Plant Divers. Resour. 33, 365–375. https://doi.org/10.3724/SP.J.1143.2011.10202 (2011).
22. Clegg, M. T., Gaut, B. S., Learn, G. H. & Morton, B. R. Rates and patterns of chloroplast DNA evolution. Proc. Natl. Acad. Sci.
U.S.A. 91, 6795–6801 (1994).
23. Yan, M. H. et al. Plastid phylogenomics resolves infrafamilial relationships of the Styracaceae and sheds light on the backbone relationships of the Ericales. Mol. Phylogenet. Evol. 121, 198–211. https://doi.org/10.1016/j.ympev.2018.01.004 (2018).
24. Yang, J. X., Hu, G. X. & Hu, G. W. Comparative genomics and phylogenetic relationships of two endemic and endangered species (Handeliodendron bodinieri and Eurycorymbus cavaleriei) of two monotypic genera within Sapindales. BMC Genom. 23, 27. https://doi.org/10.1186/s12864-021-08259-w (2022).
25. Xie, D. F. et al. Insights into phylogeny, age and evolution of Allium (Amaryllidaceae) based on the whole plastome sequences. Ann. Bot. 125, 1039–1055. https://doi.org/10.1093/aob/mcaa024 (2020).
26. Zhang, R. et al. Exploration of plastid phylogenomic conflict yields new insights into the deep relationships of Leguminosae. Syst. Biol. 69, 613–622. https://doi.org/10.1093/sysbio/syaa013 (2020).
27. Liu, B., Liu, G., Hong, D. & Wen, J. Eriobotrya belongs to Rhaphiolepis (Maleae, Rosaceae): Evidence from chloroplast genome and nuclear ribosomal DNA data. Front. Plant Sci. 10, 1731. https://doi.org/10.3389/fpls.2019.01731 (2019).
28. Xia, M. Q. et al. Out of the Himalaya-Hengduan Mountains: Phylogenomics, biogeography and diversification of Polygonatum Mill. (Asparagaceae) in the Northern Hemisphere. Mol. Phylogenet. Evol. 169, 107431. https://doi.org/10.1016/j.ympev.2022.107431 (2022).
29. Jin, J. J. et al. GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 21, 241. https://doi.org/10.1186/s13059-020-02154-5 (2020).
30. Qu, X. J., Moore, M. J., Li, D. Z. & Yi, T. S. PGA: A software package for rapid, accurate, and flexible batch annotation of plastomes. Plant Methods 15, 50. https://doi.org/10.1186/s13007-019-0435-7 (2019).
31. Ren, J. et al. Comparative and phylogenetic analysis based on the chloroplast genome of Coleanthus subtilis (Tratt.) Seidel, a protected rare species of monotypic genus. Front. Plant Sci. 13, 828467. https://doi.org/10.3389/fpls.2022.828467 (2022).
32. Ding, S. X. et al. Complete chloroplast genome of Clethra fargesii Franch., an original sympetalous plant from central China: Comparative analysis, adaptive evolution, and phylogenetic relationships. Forests 12, 040441. https://doi.org/10.3390/f12040441 (2021).
33. Kearse, M. et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649. https://doi.org/10.1093/bioinformatics/bts199 (2012).
34. Greiner, S., Lehwark, P. & Bock, R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: Expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Res. 47, W59–W64. https://doi.org/10.1093/nar/gkz238 (2019).
35. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780. https://doi.org/10.1093/molbev/mst010 (2013).
36. Mayor, C. et al. VISTA: Visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 16, 1046–1047. https://doi.org/10.1093/bioinformatics/16.11.1046 (2000).
37. Amiryousefi, A., Hyvonen, J. & Poczai, P. IRscope: An online program to visualize the junction sites of chloroplast genomes. Bioinformatics 34, 3030–3031. https://doi.org/10.1093/bioinformatics/bty220 (2018).
38. Darling, A. E., Mau, B. & Perna, N. T. progressiveMauve: Multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE 5, e11147. https://doi.org/10.1371/journal.pone.0011147 (2010).
39. Kumar, S., Stecher, G. & Tamura, K. MEGA7: Molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870–1874. https://doi.org/10.1093/molbev/msw054 (2016).
40. Gupta, S. K., Bhattacharyya, T. K. & Ghosh, T. C. Synonymous codon usage in Lactococcus lactis: Mutational bias versus translational selection. J. Biomol. Struct. Dyn. 21, 527–536. https://doi.org/10.1080/07391102.2004.10506946 (2004).
41. Kurtz, S. et al. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 29, 4633–4642. https://doi.org/10.1093/nar/29.22.4633 (2001).
42. Beier, S., Thiel, T., Munch, T., Scholz, U. & Mascher, M. MISA-web: A web server for microsatellite prediction. Bioinformatics 33, 2583–2585. https://doi.org/10.1093/bioinformatics/btx198 (2017).
43. Rozas, J. et al. DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol. Biol. Evol. 34, 3299–3302. https://doi.org/10.1093/molbev/msx248 (2017).
44. Gao, F. L. et al. EasyCodeML: A visual tool for analysis of selection using CodeML. Ecol. Evol. 9, 3891–3898. https://doi.org/10.1002/ece3.5015 (2019).
45. Nguyen, L. T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274. https://doi.org/10.1093/molbev/msu300 (2015).
46. Yang, Z., Wong, W. S. & Nielsen, R. Bayes empirical bayes inference of amino acid sites under positive selection. Mol. Biol. Evol. 22, 1107–1118. https://doi.org/10.1093/molbev/msi097 (2005).
47. Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: Fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589. https://doi.org/10.1038/nmeth.4285 (2017).
48. Minh, B. Q., Nguyen, M. A. & von Haeseler, A. Ultrafast approximation for phylogenetic bootstrap. Mol. Biol. Evol. 30, 1188–1195. https://doi.org/10.1093/molbev/mst024 (2013).
49. Ronquist, F. et al. MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61, 539–542. https://doi.org/10.1093/sysbio/sys029 (2012).
50. dos Reis, M. How to calculate the non-synonymous to synonymous rate ratio of protein-coding genes under the Fisher–Wright mutation-selection framework. Biol. Lett. 11, 20141031. https://doi.org/10.1098/rsbl.2014.1031 (2015).
51. Wu, Z. et al. Analysis of six chloroplast genomes provides insight into the evolution of Chrysosplenium (Saxifragaceae). BMC Genom. 21, 621. https://doi.org/10.1186/s12864-020-07045-4 (2020).
52. Wu, L. et al. Comparative and phylogenetic analysis of the complete chloroplast genomes of three Paeonia section Moutan species (Paeoniaceae). Front. Genet. 11, 980. https://doi.org/10.3389/fgene.2020.00980 (2020).
53. Alzahrani, D. A., Yaradua, S. S., Albokhari, E. J. & Abba, A. Complete chloroplast genome sequence of Barleria prionitis, comparative chloroplast genomics and phylogenetic relationships among Acanthoideae. BMC Genom. 21, 393. https://doi.org/10.1186/s12864-020-06798-2 (2020).
54. Singh, N. V. et al. Chloroplast genome sequencing, comparative analysis, and discovery of unique cytoplasmic variants in Pomegranate (Punica granatum L.). Front. Genet. 12, 704075. https://doi.org/10.3389/fgene.2021.704075 (2021).
55. Raman, G. & Park, S. The complete chloroplast genome sequence of the Speirantha gardenii: Comparative and adaptive evolutionary analysis. Agronomy 10, 091405. https://doi.org/10.3390/agronomy10091405 (2020).
56. Lee, S. R., Kim, K., Lee, B. Y. & Lim, C. E. Complete chloroplast genomes of all six Hosta species occurring in Korea: Molecular structures, comparative, and phylogenetic analyses. BMC Genom. 20, 833. https://doi.org/10.1186/s12864-019-6215-y (2019).
57. McKain, M. R. et al. Timing of rapid diversification and convergent origins of active pollination within Agavoideae (Asparagaceae). Am. J. Bot. 103, 1717–1729. https://doi.org/10.3732/ajb.1600198 (2016).
58. Munyao, J. N. et al. Complete chloroplast genomes of Chlorophytum comosum and Chlorophytum gallabatense: Genome structures, comparative and phylogenetic analysis. Plants 9, 296. https://doi.org/10.3390/plants9030296 (2020).
59. Jia, Q. et al. A “GC-rich” method for mammalian gene expression: A dominant role of non-coding DNA GC content in regulation of mammalian gene expression. Sci. China Life Sci. 53, 94–100. https://doi.org/10.1007/s11427-010-0003-x (2010).
60. Du, Z. et al. The chloroplast genome of Amygdalus L. (Rosaceae) reveals the phylogenetic relationship and divergence time. BMC Genom. 22, 645. https://doi.org/10.1186/s12864-021-07968-6 (2021).
61. Luo, C. et al. Comparative chloroplast genome analysis of Impatiens species (Balsaminaceae) in the karst area of China: Insights into genome evolution and phylogenomic implications. BMC Genom. 22, 571. https://doi.org/10.1186/s12864-021-07807-8 (2021).
62. Wen, F. et al. The complete chloroplast genome of Stauntonia chinensis and compared analysis revealed adaptive evolution of subfamily Lardizabaloideae species in China. BMC Genom. 22, 161. https://doi.org/10.1186/s12864-021-07484-7 (2021).
63. Dong, S. J. et al. Complete chloroplast genome of Stephania tetrandra (Menispermaceae) from Zhejiang Province: Insights into molecular structures, comparative genome analysis, mutational hotspots and phylogenetic relationships. BMC Genom. 22, 1. https://doi.org/10.1186/s12864-021-08193-x (2021).
64. Somaratne, Y., Guan, D. L., Wang, W. Q., Zhao, L. & Xu, S. Q. The complete chloroplast genomes of two Lespedeza species: Insights into codon usage bias, RNA editing sites, and phylogenetic relationships in Desmodieae (Fabaceae: Papilionoideae). Plants 9, 010051. https://doi.org/10.3390/plants9010051 (2019).
65. Xu, J., Liu, C., Song, Y. & Li, M. Comparative analysis of the chloroplast genome for four Pennisetum species: Molecular structure and phylogenetic relationships. Front. Genet. 12, 687844. https://doi.org/10.3389/fgene.2021.687844 (2021).
66. Park, M., Park, H., Lee, H., Lee, B. H. & Lee, J. The complete plastome sequence of an Antarctic bryophyte Sanionia uncinata (Hedw.) Loeske. Int. J. Mol. Sci. 19, 030709. https://doi.org/10.3390/ijms19030709 (2018).
67. Timme, R. E., Kuehl, J. V., Boore, J. L. & Jansen, R. K. A comparative analysis of the Lactuca and Helianthus (Asteraceae) plastid genomes: Identification of divergent regions and categorization of shared repeats. Am. J. Bot. 94, 302–312. https://doi.org/10.3732/ajb.94.3.302 (2007).
68. Li, D. M., Zhao, C. Y. & Liu, X. F. Complete chloroplast genome sequences of Kaempferia galanga and Kaempferia elegans: Molecular structures and comparative analysis. Molecules 24, 030474. https://doi.org/10.3390/molecules24030474 (2019).
69. Hirano, R. et al. Propagation management methods have altered the genetic variability of two traditional Mango varieties in Myanmar, as revealed by SSR. Plant Genet. Resour. C 9, 404–410. https://doi.org/10.1017/S1479262111000049 (2011).
70. Chen, C. X., Zhou, P., Choi, Y. A., Huang, S. & Gmitter, F. G. Mining and characterizing microsatellites from citrus ESTs. Theor. Appl. Genet. 112, 1248–1257. https://doi.org/10.1007/s00122-006-0226-1 (2006).
71. Provan, J., Powell, W. & Hollingsworth, P. M. Chloroplast microsatellites: New tools for studies in plant ecology and evolution. Trends Ecol. Evol. 16, 142–147. https://doi.org/10.1016/S0169-5347(00)02097-8 (2001).
72. Jiao, Y. et al. Development of simple sequence repeat (SSR) markers from a genome survey of Chinese bayberry (Myrica rubra). BMC Genom. 13, 201. https://doi.org/10.1186/1471-2164-13-201 (2012).
73. Zhou, T. et al. The complete chloroplast genome of Euphrasia regelii, pseudogenization of ndh genes and the phylogenetic relationships within Orobanchaceae. Front. Genet. 10, 444. https://doi.org/10.3389/fgene.2019.00444 (2019).
74. Gu, C. H., Tembrock, L. R., Johnson, N. G., Simmons, M. P. & Wu, Z. Q. The complete plastid genome of Lagerstroemia fauriei and loss of rpl2 intron from Lagerstroemia (Lythraceae). PLoS ONE 11, 0150752. https://doi.org/10.1371/journal.pone.0150752 (2016).
75. Khayi, S. et al. Complete chloroplast genome of Argania spinosa: Structural organization and phylogenetic relationships in Sapotaceae. Plants 9, 101354. https://doi.org/10.3390/plants9101354 (2020).
76. Dong, F., Lin, Z., Lin, J., Ming, R. & Zhang, W. Chloroplast genome of Rambutan and comparative analyses in Sapindaceae. Plants 10, 020283. https://doi.org/10.3390/plants10020283 (2021).
77. Liu, L. et al. Chloroplast genome analyses and genomic resource development for epilithic sister genera Oresitrophe and Mukdenia (Saxifragaceae), using genome skimming data. BMC Genom. 19, 235. https://doi.org/10.1186/s12864-018-4633-x (2018).
78. Abdullah, et al. Comparative plastome analysis of Blumea, with implications for genome evolution and phylogeny of Asteroideae. Ecol. Evol. 11, 7810–7826. https://doi.org/10.1002/ece3.7614 (2021).
79. Li, X., Zuo, Y., Zhu, X., Liao, S. & Ma, J. Complete chloroplast genomes and comparative analysis of sequences evolution among seven Aristolochia (Aristolochiaceae) medicinal species. Int. J. Mol. Sci. 20, 051045. https://doi.org/10.3390/ijms20051045 (2019).
80. Kryazhimskiy, S. & Plotkin, J. B. The population genetics of dN/dS. PLoS Genet. 4, e1000304. https://doi.org/10.1371/journal.pgen.1000304 (2008).
81. Mugal, C. F., Wolf, J. B. & Kaj, I. Why time matters: Codon evolution and the temporal dynamics of dN/dS. Mol. Biol. Evol. 31, 212–231. https://doi.org/10.1093/molbev/mst192 (2014).
82. Yang, J., Kim, S. H., Pak, J. H. & Kim, S. C. Infrageneric plastid genomes of Cotoneaster (Rosaceae): Implications for the plastome evolution and origin of C. wilsonii on Ulleung Island. Genes 13, 050728. https://doi.org/10.3390/genes13050728 (2022).
83. Raman, G., Nam, G.-H. & Park, S. Extensive reorganization of the chloroplast genome of Corydalis platycarpa: A comparative analysis of their organization and evolution with other Corydalis plastomes. Front. Plant Sci. 13, 1043740. https://doi.org/10.3389/fpls.2022.1043740 (2022).
84. Zhang, D. Q., Ren, Y. & Zhang, J. Q. Nonadaptive molecular evolution of plastome during the speciation of Actaea purpurea and its relatives. Ecol. Evol. 12, e9321. https://doi.org/10.1002/ece3.9321 (2022).
85. Raman, G. & Park, S. Structural characterization and comparative analyses of the chloroplast genome of Eastern Asian species Cardamine occulta (Asian C. flexuosa With.) and other Cardamine species. Front. Biosci. 27, 124. https://doi.org/10.31083/j.fbl2704124 (2022).
86. Wang, Y. et al. Comparative chloroplast genome analyses of Paraboea (Gesneriaceae): Insights into adaptive evolution and phylogenetic analysis. Front. Plant Sci. 13, 1019831. https://doi.org/10.3389/fpls.2022.1019831 (2022).
87. Cai, X. Z., Hu, G. W., Kamande, E. M., Ngumbau, V. M. & Wei, N. Polygonatum campanulatum (Asparagaceae), a new species from Yunnan, China. Phytotaxa 236, 94–96. https://doi.org/10.11646/phytotaxa.236.1.10 (2015).
88. Noltie, H. J. Flora of Bhutan Vol. 3, 38–46 (Royal Botanic Garden, 1994).
89. Jeffrey, C. The genus Polygonatum (Liliaceae) in Eastern Asia. Kew Bull. 34, 36. https://doi.org/10.2307/4109822 (1980).
Acknowledgements
The authors acknowledge Jiaxin Yang and Miao Liao for giving suggestions on the paper.
Author contributions
G.W.H. collected the materials and identified the species. G.W.H. and D.J.Z. designed and supervised the research.
D.J.Z. performed the experiment, conducted the analyses and wrote the manuscript. J.R., H.J. and V.O.W.
repeatedly proofread the manuscript. All authors read and approved the final manuscript.
Funding
This work was supported by grants from the National Science & Technology Fundamental Resources Investigation
Program of China (2019FY101800), the Second Tibetan Plateau Scientific Expedition and Research (STEP)
program (2019QZKK0502), the National Natural Science Foundation of China (32270228, 31970211) and the
Sino-Africa Joint Research Center, CAS (SAJC202101).
Competing interests
The authors declare no competing interests.
Additional information
Supplementary Information The online version contains supplementary material available at
https://doi.org/10.1038/s41598-023-34083-1.
Correspondence and requests for materials should be addressed to G.H.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2023