Characterization of the complete chloroplast genome and development of molecular markers of Salix

Springer Nature
Scientific Reports
Pu Wang1,2, Jiahui Guo1,2, Jie Zhou1 & Yixuan Wang1
Salix is an economically and ecologically multifunctional genus of trees widely distributed in China. The five ornamental species sequenced in this study are highly beneficial for phytoremediation because of their ability to absorb heavy metals. This research used high-throughput sequencing to obtain chloroplast genome sequences of Salix, analyze their gene composition and structural characteristics, identify potential molecular markers, and lay a foundation for Salix identification and resource classification. Chloroplast DNA was extracted from the leaves of Salix argyracea, Salix dasyclados, Salix eriocephala, Salix integra ‘Hakuro Nishiki’, and Salix suchowensis using an optimized CTAB method. Sequencing was conducted on the Illumina NovaSeq PE150 platform, and bioinformatics tools were used to compare the structural features and variations within the Salix chloroplast genomes. The chloroplast genome sequences of the five Salix species were highly similar, and subsequent examination identified 276, 269, 270, 273, and 273 SSR loci, respectively, along with unique simple repeat sequences in each variety. Comparison of chloroplast genomes across 22 Salix highlighted variations in regions such as matK-trnQ, ndhC-trnV, psbE-petL, rpl36-rps8, and ndhB-rps7, which may serve as valuable molecular markers for willow resource classification studies. The chloroplast genome sequencing and structural analysis in this study not only enhance the genetic resources of Salix but also form a critical basis for the development of molecular markers and the exploration of interspecific phylogeny in the genus.
Keywords Salix, Chloroplast genome, Structure characteristics, Locus of variation, Molecular markers
The chloroplast (cp) is an essential organelle for photosynthesis and energy supply in green plant cells [1–3]. It is vital for starch synthesis, nitrogen metabolism, sulfate reduction, and fatty acid synthesis [4,5]. Chloroplast DNA (cpDNA) is a single, circular molecule with four regions, namely a large single-copy (LSC) region, a small single-copy (SSC) region, and two copies of an inverted repeat (IRa and IRb) [6,7]. Because of its small size, highly conserved structure, low substitution rate, and haploid nature, cpDNA has become an ideal tool for studies of diversity and evolution at lower taxonomic levels [8–10]. Chloroplast DNA is maternally inherited, thus providing essential information for molecular markers, breeding of new varieties, and plant phylogeny [11–13].
Willow is a collective term for Salix and Chosenia arbutifolia (Pall.) A. Skv. in the Salicaceae family. The genus Salix is composed of 520 species with worldwide distribution [14,15]. The taxonomy and phylogeny of Salix based on traditional morphological characteristics have been controversial and unreliable because of the genus's dioecious reproduction, simple flowers, large intraspecific phenotypic variation, frequent hybridization, and easy propagation [16–18]. Argus (1997) recognized four subgenera within the North American Salix species: Salix, Longifoliae, Vetrix, and Chamaetia. Ohashi proposed a classification system for the willow genus based on Japanese plants, dividing the genus into six subgenera: Pleuradenia, Chosenia, Protitea, Chamaetia, Salix, and Vetrix [19,20]. The classification and placement of willows have been debated for a long time, and views of their evolutionary relationships differ.
This study selected five willows with high ornamental value and high biomass, namely Salix argyracea, Salix dasyclados, Salix eriocephala, Salix integra ‘Hakuro Nishiki’, and Salix suchowensis. The complete chloroplast (cp) genomes of S. argyracea, S. dasyclados, S. eriocephala, S. integra ‘Hakuro Nishiki’, and S. suchowensis were assembled de novo and characterized. The 16 available cp genomes of other Salix species and Chosenia arbutifolia were also annotated. Potential molecular markers were mined by analysis of the simple sequence repeat (SSR) markers, repetitive sequences, nucleotide diversity, positively selected genes, and highly divergent regions, which could be used for interspecific identification. The aim is to lay the foundation for further research on the phylogeny, species identification, and evolution of the willow genus, and to provide reference material for the development of DNA barcodes in the genus.
1Jiangsu Academy of Forestry, Nanjing, China. 2These authors contributed equally to this work. email: zjwin718@126.com
Scientific Reports | (2024) 14:28528 | https://doi.org/10.1038/s41598-024-79604-8
Materials and methods
Plant materials
e ve Salix species of S. argyracea, S. dasyclados, S. eriocephala, S. integra ‘Hakuro Nishiki’, and S. suchowensis
were preserved and deposited in the willow collection at Jiangsu Academy of Forestry (31.861947°N,
118.777145°E). e voucher specimens were deposited at the herbarium of Jiangsu Academy of Forestry under
the voucher numbers P102, P126, 87, P646, and P63, respectively. Fresh leaves were collected for DNA isolation
and library construction. Genomic sequencing was performed using the Illumina Novaseq PE150 platform (San
Diego, CA, USA).
CpDNA sequencing and de novo assembly
The raw sequencing data were filtered with fastp (version 0.20.0, https://github.com/OpenGene/fastp) to obtain clean data. De novo assembly was then performed with SPAdes v3.10.1 (http://cab.spbu.ru/software/spades/) to obtain the complete pseudo genome. Five high-quality, complete Salix cp genomes were deposited in NCBI under these accession numbers: MT551159 (S. argyracea), MT551160 (S. dasyclados), MT551161 (S. eriocephala), MT551162 (S. integra ‘Hakuro Nishiki’), and MT551163 (S. suchowensis).
Chloroplast gene annotation and chloroplast mapping
The cpDNA coding sequences (CDS) were annotated with Prodigal v2.6.3 [21]. The rRNA and tRNA genes were predicted with HMMER v3.1b2 (http://hmmer.org/) and ARAGORN v1.2.38 [22]. The sequences were submitted to the NCBI for the final annotation by BLAST v2.6 (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The cp genome maps of the five Salix species were drawn in OGDRAW [23]. Chloroplast SSRs, ranging from mono- to octa-nucleotide repeats, were identified using MISA v1.0 [24]. RSCU was analyzed with MEGA 7 [25]. The sequences were aligned with MAFFT v7.427 (https://mafft.cbrc.jp/alignment/software/), and the synonymous and nonsynonymous substitution rates were calculated with KaKs_Calculator v2.0 (https://sourceforge.net/projects/kakscalculator2/). The nucleotide diversity (Pi) was calculated with DnaSP v5 (https://dnasp.software.informer.com/5.1/) [26]. CGView (http://stothard.afns.ualberta.ca/cgview_server/) was used with default parameters for comparative analysis of chloroplast genome structure among closely related species.
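Nucleotide diversity (Pi) is the average number of pairwise differences per site among aligned sequences. The following is a minimal sketch of that calculation, not the DnaSP implementation: the function name `nucleotide_diversity` is hypothetical, the input is assumed to be an already aligned set of equal-length sequences (for example MAFFT output), and columns containing gaps or ambiguous bases are assumed to be skipped.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site over columns with only A/C/G/T."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must have equal aligned length")
    seqs = [s.upper() for s in aligned_seqs]

    # Assumption: columns with gaps or ambiguous bases are excluded entirely.
    usable = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not usable:
        return 0.0

    pairs = list(combinations(seqs, 2))
    total_diff = sum(sum(a[i] != b[i] for i in usable) for a, b in pairs)
    return total_diff / (len(pairs) * len(usable))


if __name__ == "__main__":
    # Toy alignment, not data from this study.
    print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))
```

Applying the same function to successive windows of a whole-genome alignment would give a per-region Pi profile, which is one way highly variable regions can be located.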
Identication of Simple Sequence Repeats markers
e genomic sequences were analyzed to identify potential microsatellites (SSRs. i.e., mono-, di-, tri-, tetra-,
penta-, and hexanucleotide repeats) using MISA soware (http://pgrc.ipk-gatersleben.de/misa/) with thresholds
of 10 repeat units for mononucleotide SSRs and ve repeat units for di-, tri-, tetra-, penta-, and hexanucleotide
SSRs. e web-based soware REPuter (http://bibiserv.techfak.uni-bielefeld.de/reputer/)27 was used to analyze
the repeat sequences, which included forward, reverse, complement, palindromic, and tandem repeats with
minimal lengths of 30bp and edit distances of less than 3bp.
Phylogenetic analysis and genome homology analysis
The multiple alignment of the cp genomes of 32 species was conducted with MAFFT v7.427 for phylogenetic analysis. The phylogenetic tree was constructed using the maximum-likelihood (ML) method [28–30] in RAxML v8.2.10 [31].
The cp genomes for the following species were retrieved from the NCBI database: Chosenia arbutifolia (NC_036718.1), S. babylonica (NC_028350.1), S. chaenomeloides (NC_037422.1), S. tetrasperma (NC_035744.1), S. hypoleuca (NC_037423.1), S. interior (NC_024681.1), S. magnifica (NC_037424.1), S. minjiangensis (NC_037425.1), S. oreinoma (NC_035743.1), S. paraplesia (NC_037426.1), S. purpurea (NC_029693.1), S. rehderiana (NC_037427.1), S. rorida (NC_037428.1), S. taoensis (NC_037429.1), S. tetrasperma (NC_035744.1), S. gracilistyla (NC_043878.1), S. koriyanagi (NC_044419.1), Eucalyptus spathulata (NC_022400.1), Quercus bawanglingensis (NC_046583.1), Populus cathayana (NC_040874.1), Populus yunnanensis (MK267299.1), Populus tremula (KP861984.1), Populus alba (NC_008235.1), Populus balsamifera (NC_024735.1), Populus fremontii (NC_024734.1), Populus trichocarpa (NC_009143.1), and Populus euphratica (NC_024747.1).
Results
Characterization of chloroplast genomes in Salix
Using the Illumina NovaSeq PE150 platform, 20 913 346, 19 041 713, 22 544 659, 18 602 676, and 20 582 680 paired-end clean reads were obtained for S. argyracea, S. dasyclados, S. eriocephala, S. integra, and S. suchowensis, respectively, with GC content ranging from 36.67% to 36.71% (Table 2). After the de novo assembly, the complete cp genomes were 155 605 bp, 155 763 bp, 155 552 bp, 155 538 bp, and 155 550 bp in size, respectively (Table 2 and Fig. 1). The genomes exhibited a typical quadripartite structure with the LSC region (84 414–84 588 bp), SSC region (16 214–16 275 bp), and IRs (27 384–27 479 bp) (Table 2 and Fig. 1). The slight differences in the size of the SSC region and IRs indicate expansion or contraction of these regions between species. The GC content of the IR, LSC, and SSC regions was about 42%, 34%, and 31%, respectively (Table 2).
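In a quadripartite genome the total length should equal LSC + SSC + 2 × IR. The short check below applies that arithmetic to the region sizes reported in the summary table of genome features; the function names are hypothetical and the script is only a consistency sketch.

```python
# Region lengths in bp: (LSC, SSC, one IR copy, reported genome size),
# copied from the summary table of the five Salix chloroplast genomes.
GENOMES = {
    "S. argyracea": (84468, 16219, 27459, 155605),
    "S. dasyclados": (84588, 16217, 27479, 155763),
    "S. eriocephala": (84414, 16220, 27459, 155552),
    "S. integra 'Hakuro Nishiki'": (84495, 16275, 27384, 155538),
    "S. suchowensis": (84418, 16214, 27459, 155550),
}


def quadripartite_length(lsc, ssc, ir):
    """Total genome length implied by one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


def check_genomes(genomes):
    """Map each species to True if the regions sum to the reported size."""
    return {
        name: quadripartite_length(lsc, ssc, ir) == reported
        for name, (lsc, ssc, ir, reported) in genomes.items()
    }


if __name__ == "__main__":
    for name, consistent in check_genomes(GENOMES).items():
        print(name, "consistent" if consistent else "MISMATCH")
```

For S. argyracea, for example, 84 468 + 16 219 + 2 × 27 459 = 155 605 bp, which matches the reported genome size.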
Among the annotated genes, 14 genes (ndhA, ndhB, petB, petD, atpF, rpl16, rpl2, rpoC1, trnA-UGC, trnG-GCC, trnI-GAU, trnK-UUU, trnL-UAA, trnV-UAC) have one intron, and 3 genes (ycf3, rps12, clpP) have two introns (Table 1). The gene rps12 is located in the IR region, while ycf3 and clpP are located in the LSC region. The ndhA gene, with only one intron, is located in the SSC region, while the other genes are located in the LSC and IR regions (Table 1).
Repetitive sequences and cpSSR analysis
Relative synonymous codon usage (RSCU) was used to evaluate the codon usage frequency. Codon usage bias is an indicator of natural selection, species mutation, and genetic fluctuation. In the five Salix species, Arg, Leu, and Ser are the most frequent amino acids. Trp is the only amino acid whose codon exhibits no bias (RSCU = 1.00) in the five cp genomes (Fig. 2A). There are 19–26 forward repeats, 5–7 reverse repeats, 5–7 complement repeats, and 15–19 palindromic repeats in the five species (Fig. 2B). The total number of repeats is lower in S. integra than in the other four species. The largest repeat sequence was 104 bp, in the IGS-rpl16 region of S. argyracea.
SSRs were found in different regions of the five species; most of them were located in the LSC region. The total numbers of SSRs differed slightly (Table 3); SSR-related primers and results are available in the supplementary materials. Mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide SSRs were found in the five species. The SSRs (A)18, (C)12, (G)10, (AT)11, and (TAATAT)3 were unique to S. dasyclados, and (T)19 and (TTGATA)3 were found only in S. integra ‘Hakuro Nishiki’. S. eriocephala, S. suchowensis, and S. argyracea shared the same di-, tri-, tetra-, and penta-nucleotide SSRs, and each was repeated three times. These species lacked unique SSRs. (A)16, (G)13, and (G)12 were present in S. eriocephala. (A)16 and (G)13 were lost, but (G)12 was present in S. suchowensis and S. argyracea. The two latter species differed in the number of repeats at (T)15 and (T)16 (Fig. 3). These specific SSRs provide valuable information for Salix taxonomy.
Variation among the cp genomes
The following highly divergent regions among the 22 species were detected with Mauve (http://darlinglab.org/mauve): matK–trnQ, ndhC–trnV, psbE–petL, rpl36–rps8, and ndhB–rps7 (Fig. 4). They are potentially suitable markers for species delimitation within Salix.
The border junctions were compared between IR/SSC and IR/LSC for the 21 Salix species and Chosenia arbutifolia. In all the species, rpl22 was located at the LSC/IRb junction; rps19 was located within the IRb, 52 bp from the LSC region; ycf1 and ndhF were in the IRb/SSC region; ycf1 and trnN were at the SSC/IRa boundary; and rps19/trnH were in the IRa/SSC region. Fifteen species, including Chosenia arbutifolia, shared the same border genes and the same junction length. The remaining seven species exhibited fragment deletions or site variation in the border region. At the LSC/IRb border, the length of rpl22 in the LSC region (348–350 bp) and in the IRb region (50–52 bp) varied slightly among S. tetrasperma, S. babylonica, and S. interior. At the IRb/SSC junction, ycf1 exhibited long fragment deletions in S. integra ‘Hakuro Nishiki’ (1673 bp in the IRb and 25 bp in the LSC region) and in S. chaenomeloides (31 bp in the SSC region), but there was only a 9 bp deletion in S. minjiangensis (1739 bp in the IRb region). The gene ndhF was conserved in all species except S. integra ‘Hakuro Nishiki’, in which it had a relatively short total length and only 2238 bp in the SSC region. At the SSC/IRa border, ycf1 was 3694–3721 bp long in the IRa region in S. interior, S. chaenomeloides, S. tetrasperma, S. babylonica, and S. paraplesia. In the other species, however, the length of ycf1 in the IRa region was shorter, at 3676 bp. The distance of ycf1 and trnN from the SSC/IRa border also differed in S. integra ‘Hakuro Nishiki’. In S. tetrasperma, S. chaenomeloides, S. brachista, and S. babylonica, a gene insertion was present at the IRa/SSC junction (Fig. 5). The lengths of rps19 and trnH were conserved.
Gene      Location  Exon I (bp)     Intron I (bp)             Exon II (bp)    Intron II (bp)  Exon III (bp)
trnK-UUU  LSC       38,37,37,37,37  2544,2524,2545,2545,2545  37,36,36,36,36
trnG-GCC  LSC       23              697,695,697,697,697       48
atpF      LSC       144             740,742,739,741,741       399
rpoC1     LSC       453             763,773,762,763,764       1617
ycf3      LSC       126             680,678,679,679,680       228             725             153
trnL-UAA  LSC       35              585,586,586,586,586       50
trnV-UAC  LSC       39              609                       35
clpP      LSC       71              586,584,585,584,585       292             838             228
petB      LSC       6               811                       642
petD      LSC       9               778,780,779,779,778       489
rpl16     LSC       9               1114,1143,1114,1120,1114  399
rpl2      IRb       396             668                       435
ndhB      IRb       777             682                       756
trnI-GAU  IRb       37              949                       35
trnA-UGC  IRb       38              802                       35
rps12     IRb       114             -                         30              539             231
rps12     IRa       114             -                         231             539             30
trnA-UGC  IRa       38              802                       35
trnI-GAU  IRa       37              949                       35
ndhB      IRa       777             682                       756
rpl2      IRa       396             668                       435
ndhA      SSC       552             1112,1115,1114,1107,1108  543
Table 1. Genes with exons and introns annotated in the chloroplast genomes of five Salix species. (1) Multiple numbers in a cell give the values for S. argyracea, S. dasyclados, S. eriocephala, S. integra ‘Hakuro Nishiki’, and S. suchowensis, respectively. A single number means the value is the same in the five species. (2) LSC, large single-copy region. SSC, small single-copy region. IR, inverted repeat region.
Phylogenetic analysis
The phylogenetic relationships among species were elucidated by aligning the chloroplast genome sequences of 8 poplar species, 20 willow species, Eucalyptus spathulata, and Quercus bawanglingensis, with the poplars, Eucalyptus spathulata, and Quercus bawanglingensis used as outgroups. The results show that the phylogenetic tree divides the willow genus into 3 branches, and that the 20 willow species cluster together, indicating a closer relationship between poplars and willows (Fig. 6).
Fig. 1. Gene map of the five Salix chloroplast genomes. Genes shown outside the circle are transcribed clockwise, and those inside are transcribed counterclockwise. The gray circle depicts GC content. The known functional genes are marked with colored bars.
Discussion
The Salicaceae chloroplast genome structure is usually highly conserved, with sizes ranging from 150 to 159 kb [32]. The results revealed that the structure and synteny of the Salix cp genomes were highly conserved; all have the typical double-stranded quadripartite structure, comprising one LSC region, one SSC region, and two IR regions (IRa and IRb).
Chloroplast SSRs have been widely used in population genetics, polymorphism investigations, and evolutionary biology [33–35]. The number of SSRs (269–276) in the cp genomes of the five Salix species was similar to that reported for other species, but greater than that in S. wilsonii. The presence of mono-, di-, tri-, tetra-, and pentanucleotide repeats was confirmed. The number of poly(A)/(T) repeats was far greater than the number of poly(G)/(C) repeats, which is consistent with other angiosperms [36]. The SSRs (A)18, (C)12, (G)10, (AT)11, and (TAATAT)3 were unique to S. dasyclados, and (T)19 and (TTGATA)3 were found only in S. integra ‘Hakuro Nishiki’. S. eriocephala, S. suchowensis, and S. argyracea shared the same di-, tri-, tetra-, and penta-nucleotide SSRs, and each was repeated three times. These species lacked unique SSRs. (A)16, (G)13, and (G)12 were present in S. eriocephala. (A)16 and (G)13 were lost, but (G)12 was present in S. suchowensis and S. argyracea. Hexanucleotide repeats (TTGATA)3, (TAATAT)3, and (TTGATA)4 were found only in S. dasyclados and S. integra ‘Hakuro Nishiki’, whereas (T)19 and (TTGATA)3 were unique to S. integra ‘Hakuro Nishiki’ and located in ycf3 and rpl16. Repetitive sequences have been found to participate in cp genome arrangement and sequence variation [37–39]. The gain or loss of the repetitive sequences located in the intergenic spacer (IGS) regions and in the protein-coding genes ndhA, rpl16, and psbL makes the five species distinct from others. The genes clpP and ycf1 commonly contain repeat sequences in other Salix species. The repeats and SSRs identified in the Salix cp genome can be used to develop lineage-specific markers for studying the evolution and taxonomy of the genus Salix.
Extension and contraction of the border regions are regarded as the main reasons for differences in the length of chloroplast genomes [40–42]. The sequence and structural variation of chloroplast genes provide a basis for plant evolution [43,44]. In addition, comparative analysis showed that all willow chloroplast genomes differed in regions such as matK–trnQ, ndhC–trnV, psbE–petL, rpl36–rps8, and ndhB–rps7. This indicates that the different sizes of the cp genomes are mainly due to the contraction and expansion of the IR, LSC, and SSC regions, similar to previously reported findings [45,46].
The results of this study indicate that there are sequence insertions and deletions of IR/SSC and IR/LSC boundary genes in the gene coding regions and intergenic spacer regions of the Salix chloroplast genome. Meanwhile, the deletion in ycf1 in the SSC region at the SSC/IRa border and the insertion at the IRa/LSC junction placed the old-world species within one subclade [47]. The pseudo-infA, pseudo-ycf68, orf42, and orf56 present in S. wilsonii were lost in these sequenced Salix species. The IRb/SSC boundary genes in the chloroplast genome of willows are highly conserved, while the IRa/SSC boundary genes in the chloroplast genome of poplars are highly conserved, showing significant differences [48].
The taxonomy and systematic phylogeny of the genus Salix have been obscure because of its dioecious reproduction, common natural hybridization, large intraspecific phenotypic variation, and scarcity of informative morphological characteristics [49–51]. Chen conducted a phylogenetic study on the Salicaceae family using certain genes or gene fragments from the chloroplast genome, such as rbcL, atpB-rbcL, and trnD-T [52]. In this study, analysis of the evolutionary direction of protein-coding genes in the Salix chloroplast genome found that the ycf1, psaI, ycf2, rpoC2, rpl22, atpF, and ndhF genes were under positive selection, providing new evidence for further in-depth research on the Salix phylogeny. Traditional taxonomy suggests that the Salicaceae can be divided into three genera, namely Populus, Salix, and Chosenia arbutifolia [53].
Genome features                         S. argyracea  S. dasyclados  S. eriocephala  S. integra ‘Hakuro Nishiki’  S. suchowensis
Genome size (bp)                        155 605       155 763        155 552         155 538                      155 550
LSC size (bp)                           84 468        84 588         84 414          84 495                       84 418
SSC size (bp)                           16 219        16 217         16 220          16 275                       16 214
IR size (bp)                            27 459        27 479         27 459          27 384                       27 459
Number of genes (number of unigenes)    131 (77)      131 (77)       131 (77)        131 (77)                     131 (77)
tRNA genes                              37            37             37              37                           37
rRNA genes                              8             8              8               8                            8
mRNA genes                              86            86             86              86                           86
Duplicated genes in IR                  36            36             36              36                           36
GC content of LSC (%)                   34.44         36.67          34.45           34.41                        34.44
GC content of IR (%)                    41.87         41.86          41.87           41.93                        41.87
GC content of SSC (%)                   30.98         31.00          31.00           30.91                        30.99
GC content (%)                          36.7          36.67          36.71           36.69                        36.7
Total reads                             20 913 346    19 041 713     22 544 659      18 602 676                   20 582 680
Assembled reads                         506 120       1 681 045      2 921 757       1 979 809                    2 099 682
Average insert size (bp)                1008.44       3293.85        5613.27         3887.25                      4125.15
Table 2. Summary characteristics of the five Salix chloroplast genomes. LSC, large single-copy region. SSC, small single-copy region. IR, inverted repeat region.
Salix species and Chosenia arbutifolia gather within large monophyletic branches. One branch consists of S. wilsonii, S. babylonica, S. tetrasperma, and S. interior; the other species form another branch. At the same time, the ycf1 deletion in the SSC region at the SSC/IRa junction and the insertion at the IRa/LSC junction group the old-world species into one subclade. Conserved boundary genes group the other species under the genus Salix.
Fig. 2. e relative synonymous codon usage (RSCU) of amino acids and repeat sequences. (A) Amino acid
usage frequency calculated by RSCU (B) Repeat sequence analysis of chloroplast genomes of the ve Salix
species for positive selection.
Scientic Reports | (2024) 14:28528 6
| https://doi.org/10.1038/s41598-024-79604-8
www.nature.com/scientificreports/
Content courtesy of Springer Nature, terms of use apply. Rights reserved
Conclusion
Using the Illumina NovaSeq PE150 platform, we successfully sequenced and assembled the complete chloroplast genomes of S. argyracea, S. dasyclados, S. eriocephala, S. integra ‘Hakuro Nishiki’, and S. suchowensis, and compared them with chloroplast genomes of other genera. The results show that the chloroplast sequences and gene arrangements of the five Salix species are highly conserved, with sizes, overall structures, gene orders, and contents similar to those of other genera. By comparing the chloroplast genomes of Salix species, differences in regions such as matK-trnQ, ndhC-trnV, psbE-petL, rpl36-rps8, and ndhB-rps7 were identified, which can serve as important molecular markers for willow resource classification research. The phylogenetic relationships strongly support the known classification of the Salicaceae family. Furthermore, the high conservation of the entire Salix cpDNA sequences reinforces the concept of shared evolutionary history among these species. These genes provide a promising avenue for further research to deepen our understanding of Salix evolution.
Fig. 3. e number of SSR repeats in the ve Salix species.
Species                      Region  Exon  Intron  Intergenic  Total in region  Total markers  Proportion
S. dasyclados                LSC     31    25      124         180              276            14.10%
                             SSC     19    6       13          38                              65.20%
                             IR      34    6       18          58                              21.00%
S. argyracea                 LSC     30    23      120         173              269            13.80%
                             SSC     19    6       13          38                              65.20%
                             IR      34    6       18          58                              21.00%
S. eriocephala               LSC     30    24      120         174              270            14.10%
                             SSC     19    6       13          38                              64.40%
                             IR      34    6       18          58                              21.50%
S. integra ‘Hakuro Nishiki’  LSC     32    25      120         177              273            13.90%
                             SSC     19    6       13          38                              64.80%
                             IR      34    6       18          58                              21.20%
S. suchowensis               LSC     32    25      120         173              273            14.10%
                             SSC     19    6       13          38                              64.30%
                             IR      34    6       18          58                              21.60%
Table 3. Simple sequence repeats (SSRs) found in five Salix species. "Total in region" is the total number of markers in each region.
Fig. 4. e alignment of 21 Salix species and Chosenia arbutifolia. e long, red rectangle represents the
similarity among the genomes. e white bar indicates the annotated gene coding sequences.
Scientic Reports | (2024) 14:28528 8
| https://doi.org/10.1038/s41598-024-79604-8
www.nature.com/scientificreports/
Content courtesy of Springer Nature, terms of use apply. Rights reserved
Fig. 5. Comparison of the borders of large single-copy (LSC), small single-copy (SSC), and inverted repeat
(IR) regions among the 22 chloroplast genomes of 21 Salix species and Chosenia arbutifolia.
Scientic Reports | (2024) 14:28528 9
| https://doi.org/10.1038/s41598-024-79604-8
www.nature.com/scientificreports/
Content courtesy of Springer Nature, terms of use apply. Rights reserved
Data availability
The raw sequencing data for the Illumina and Nanopore platforms and the mitogenome sequences have been deposited in NCBI (https://www.ncbi.nlm.nih.gov/) with accession numbers MT551159 (S. argyracea), MT551160 (S. dasyclados), MT551161 (S. eriocephala), MT551162 (S. integra ‘Hakuro Nishiki’), and MT551163 (S. suchowensis).
Received: 5 July 2024; Accepted: 11 November 2024
References
1. Lee, J. et al. The complete chloroplast genome sequence of Zanthoxylum piperitum. Mitochondrial DNA A DNA Mapp. Seq. Anal. 27, 3525–3526. https://doi.org/10.3109/19401736.2015.1074201 (2016).
2. Liu, X. F., Zhu, G. F., Li, D. M. & Wang, X. J. Complete chloroplast genome sequence and phylogenetic analysis of Spathiphyllum ‘Parrish’. PLoS ONE 14, e0224038. https://doi.org/10.1371/journal.pone.0224038 (2019).
3. Xia, M. & Li, Y. Complete chloroplast genome sequence of Adenostemma lavenia (Asteraceae) and phylogenetic analysis with related species. Mitochondrial DNA B Resour. 6, 2134–2136. https://doi.org/10.1080/23802359.2021.1944369 (2021).
4. Prabhudas, S. K., Prayaga, S., Madasamy, P. & Natarajan, P. Shallow whole genome sequencing for the assembly of complete chloroplast genome sequence of Arachis hypogaea L. Front. Plant Sci. 7, 1106. https://doi.org/10.3389/fpls.2016.01106 (2016).
5. Jo, I. H. et al. Complete chloroplast genome of the inverted repeat-lacking species Vicia bungei and development of polymorphic simple sequence repeat markers. Front. Plant Sci. 13, 891783. https://doi.org/10.3389/fpls.2022.891783 (2022).
6. Li, X. et al. Complete chloroplast genome sequence of Magnolia grandiflora and comparative analysis with related species. Sci. China Life Sci. 56, 189–198. https://doi.org/10.1007/s11427-012-4430-8 (2013).
7. Hao, J. et al. The complete chloroplast genome sequence of Plectranthus hadiensis (Lamiaceae) and phylogenetic analysis. Mitochondrial DNA B Resour. 8, 1049–1053. https://doi.org/10.1080/23802359.2023.2262689 (2023).
8. Xue, S. et al. Comparative analysis of the complete chloroplast genome among Prunus mume, P. armeniaca, and P. salicina. Hortic. Res. 6, 89. https://doi.org/10.1038/s41438-019-0171-1 (2019).
9. Lin, J., Lin, Z., Chen, Y. & Xu, H. The complete chloroplast genome sequence of Lemna turionifera (Araceae). Mitochondrial DNA B Resour. 9, 971–975. https://doi.org/10.1080/23802359.2024.2384577 (2024).
10. Li, X. Y. Complete chloroplast genome sequence of Mahonia duclouxiana (Berberidaceae), a medicinal plant in China. Mitochondrial DNA B Resour. 6, 3023–3024. https://doi.org/10.1080/23802359.2021.1978888 (2021).
Fig. 6. e phylogenetic tree based on the 32 complete chloroplast genome sequences.
Scientic Reports | (2024) 14:28528 10
| https://doi.org/10.1038/s41598-024-79604-8
www.nature.com/scientificreports/
Content courtesy of Springer Nature, terms of use apply. Rights reserved
11. Njuguna, A. W. et al. Comparative analyses of the complete chloroplast genomes of nymphoides and menyanthes species
(menyanthaceae). Aquatic Bot. 156, 73–81. https://doi.org/10.1016/j.aquabot.2019.05.001 (2019).
12. Cui, Y. et al. Complete chloroplast genome and comparative analysis of three Lycium (Solanaceae) species with medicinal and
edible properties. Gene Rep. 17, 100464. https://doi.org/10.1016/j.genrep.2019.100464 (2019).
13. Zhang, Z. et al. e complete chloroplast genome sequence and phylogenetic relationship analysis of Eomecon chionantha, one
species unique to China. J. Plant Res. 137, 575–587. https://doi.org/10.1007/s10265-024-01539-y (2024).
14. Villette, C., Maurer, L. & Heintz, D. Investigation of xenobiotics metabolism in Salix alba Leaves via Mass spectrometry imaging.
J. Vis. Exp. https://doi.org/10.3791/61011 (2020).
15. Gulyaev, S. et al. e phylogeny of Salix revealed by whole genome re-sequencing suggests dierent sex-determination systems in
major groups of the genus. Ann. Bot. 129, 485–498. https://doi.org/10.1093/aob/mcac012 (2022).
16. Wu, J. et al. Phylogeny of Salix subgenus Salix s.l. (Salicaceae): Delimitation, biogeography, and reticulate evolution. BMC Evol.
Biol. 15, 31. https://doi.org/10.1186/s12862-015-0311-7 (2015).
17. Kersten, B. et al. Genome sequences of Populus tremula chloroplast and mitochondrion: Implications for holistic poplar breeding.
PLoS ONE 11, e0147209. https://doi.org/10.1371/journal.pone.0147209 (2016).
18. Ren, W. et al. The chloroplast genome of Salix floderusii and characterization of chloroplast regulatory elements. Front Plant Sci.
13, 987443. https://doi.org/10.3389/fpls.2022.987443 (2022).
19. Ohashi, H. A systematic enumeration of Japanese Salix (Salicaceae). J. Jpn. Bot. 75, 1–41 (2000).
20. Qiao, S. et al. Responses of growth and photosynthesis to alkaline stress in three willow species. Sci. Rep. 14, 14672.
https://doi.org/10.1038/s41598-024-65004-5 (2024).
21. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinform. 11, 119.
https://doi.org/10.1186/1471-2105-11-119 (2010).
22. Laslett, D. & Canback, B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids
Res. 32, 11–16. https://doi.org/10.1093/nar/gkh152 (2004).
23. Greiner, S., Lehwark, P. & Bock, R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical
visualization of organellar genomes. Nucleic Acids Res. 47, W59–W64. https://doi.org/10.1093/nar/gkz238 (2019).
24. Thiel, T., Michalek, W., Varshney, R. K. & Graner, A. Exploiting EST databases for the development and characterization of
gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 106, 411–422. https://doi.org/10.1007/s00122-002-1031-0 (2003).
25. Kumar, S., Stecher, G. & Tamura, K. MEGA7: Molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol.
Evolut. 33, 1870–1874. https://doi.org/10.1093/molbev/msw054 (2016).
26. Librado, P. & Rozas, J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25, 1451–1452.
https://doi.org/10.1093/bioinformatics/btp187 (2009).
27. Kurtz, S. et al. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 29, 4633–4642.
https://doi.org/10.1093/nar/29.22.4633 (2001).
28. Luo, Z. et al. Molecular characteristics and phylogenetic definition on the complete chloroplast genome of Petrocodon longitubus.
Plant Biotechnol. Rep. https://doi.org/10.1007/s11816-024-00919-z (2024).
29. Miao, X. et al. Assembly and comparative analysis of the complete mitochondrial and chloroplast genome of Cyperus stoloniferus
(Cyperaceae), a coastal plant possessing saline-alkali tolerance. BMC Plant Biol. 24, 628. https://doi.org/10.1186/s12870-024-05333-9 (2024).
30. Shen, Z. et al. The complete chloroplast genome sequence of the medicinal moss Rhodobryum giganteum (Bryaceae, Bryophyta):
Comparative genomics and phylogenetic analyses. Genes 15, 900 (2024).
31. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30,
1312–1313. https://doi.org/10.1093/bioinformatics/btu033 (2014).
32. Alzahrani, D. A., Yaradua, S. S., Albokhari, E. J. & Abba, A. Complete chloroplast genome sequence of Barleria prionitis,
comparative chloroplast genomics and phylogenetic relationships among Acanthoideae. BMC Genomics 21, 393.
https://doi.org/10.1186/s12864-020-06798-2 (2020).
33. Phumichai, C., Phumichai, T. & Wongkaew, A. Novel chloroplast microsatellite (cpSSR) markers for genetic diversity assessment
of cultivated and Wild Hevea Rubber. Plant Mol. Biol. Rep. 33, 1486–1498. https://doi.org/10.1007/s11105-014-0850-x (2015).
34. Honig, J. A. et al. Classification of bentgrass (Agrostis) cultivars and accessions based on microsatellite (SSR) markers. Genet.
Resour. Crop Evol. 63, 1139–1160. https://doi.org/10.1007/s10722-015-0307-6 (2016).
35. López, K. E. R., Armijos, C. E., Parra, M. & Torres, M. d. L. The first complete chloroplast genome sequence of Mortiño (Vaccinium
floribundum) and comparative analyses with other Vaccinium species. Horticulturae 9, 302 (2023).
36. Melotto-Passarin, D. M., Tambarussi, E. V., Dressano, K., De Martin, V. F. & Carrer, H. Characterization of chloroplast DNA
microsatellites from Saccharum spp and related species. Genet. Mol. Res. 10, 2024–2033. https://doi.org/10.4238/vol10-3gmr1019
(2011).
37. Bai, D., Luo, X. & Yang, Y. Complete chloroplast genome sequence of “Field Muskmelon”, an invasive weed to China. Mitochondrial
DNA B Resour. 6, 3352–3353. https://doi.org/10.1080/23802359.2021.1994888 (2021).
38. Bozkurt, A., Kaymaz, Y., Ateş, D. & Tanyolaç, M. B. The complete sequence of Lens tomentosus chloroplast genome. Acta
Physiologiae Plantarum 46, 2. https://doi.org/10.1007/s11738-023-03628-2 (2023).
39. Zhao, M., Wu, Y. & Ren, Y. Complete Chloroplast Genome Sequence Structure and Phylogenetic Analysis of Kohlrabi (Brassica
oleracea var. gongylodes L.). Genes (Basel) 15. https://doi.org/10.3390/genes15050550 (2024).
40. Lloyd Evans, D., Joshi, S. V. & Wang, J. Whole chloroplast genome and gene locus phylogenies reveal the taxonomic placement and
relationship of Tripidium (Panicoideae: Andropogoneae) to sugarcane. BMC Evol. Biol. 19, 33. https://doi.org/10.1186/s12862-019-1356-9 (2019).
41. Long, J., Tian, Y., Zhang, J. & Wang, Z. The complete chloroplast genome sequence of Olea dioica Roxb, 1820 (Oleaceae).
Mitochondrial DNA B Resour. 9, 748–752. https://doi.org/10.1080/23802359.2024.2366373 (2024).
42. Ahmad, W. et al. Complete chloroplast genome sequencing and comparative analysis of threatened dragon trees Dracaena serrulata
and Dracaena cinnabari. Sci. Rep. 12, 16787. https://doi.org/10.1038/s41598-022-20304-6 (2022).
43. Wei, F. et al. The complete chloroplast genome sequence of the medicinal plant Sophora tonkinensis. Sci. Rep. 10, 12473.
https://doi.org/10.1038/s41598-020-69549-z (2020).
44. Wu, F. Y., Ma, S. C., Ye, P. M., Ye, H. & Ma, J. L. The complete chloroplast genome sequence of Camellia zhaiana (Theaceae), a
critically endangered species from China. Mitochondrial DNA B Resour. 6, 2425–2426. https://doi.org/10.1080/23802359.2021.1955027 (2021).
45. Kim, K. J. & Lee, H. L. Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative
analysis of sequence evolution among 17 vascular plants. DNA Res. 11, 247–261. https://doi.org/10.1093/dnares/11.4.247 (2004).
46. Guo, L., Zhai, J. & Gu, Y. The complete chloroplast genome sequence of Isoetes baodongii (Isoetaceae). Mitochondrial DNA B
Resour. 9, 667–671. https://doi.org/10.1080/23802359.2024.2356128 (2024).
47. Huang, Y., Wang, J., Yang, Y., Fan, C. & Chen, J. Phylogenomic analysis and dynamic evolution of chloroplast genomes in Salicaceae.
Front Plant Sci. 8, 1050. https://doi.org/10.3389/fpls.2017.01050 (2017).
48. Chen, Y., Hu, N. & Wu, H. Analyzing and characterizing the chloroplast genome of Salix wilsonii. Biomed. Res. Int. 2019, 5190425.
https://doi.org/10.1155/2019/5190425 (2019).
Scientic Reports | (2024) 14:28528 11
| https://doi.org/10.1038/s41598-024-79604-8
www.nature.com/scientificreports/
Content courtesy of Springer Nature, terms of use apply. Rights reserved
49. Percy, D. M. et al. Understanding the spectacular failure of DNA barcoding in willows (Salix): Does this result from a trans-specific
selective sweep? Mol. Ecol. 23, 4737–4756. https://doi.org/10.1111/mec.12837 (2014).
50. Marinček, P. et al. Challenge accepted: Evolutionary lineages versus taxonomic classification of North American shrub willows
(Salix). Am. J. Bot. 111, e16361. https://doi.org/10.1002/ajb2.16361 (2024).
51. Nie, L. et al. Complete chloroplast genome sequence of the medicinal plant Arctium lappa. Genome 63, 53–60.
https://doi.org/10.1139/gen-2019-0070 (2020).
52. Jia-Hui, C., Hang, S. & Yong-Ping, Y. Cladistic analysis of the genus Salix (Salicaceae). Acta Botanica Yunnanica (2008).
53. Zhou, J., Jiao, Z., Guo, J., Wang, B. S. & Zheng, J. Complete chloroplast genome sequencing of five Salix species and its application
in the phylogeny and taxonomy of the genus. Mitochondrial DNA B Resour. 6, 2348–2352. https://doi.org/10.1080/23802359.2021.1950055 (2021).
Author contributions
Conceptualization, P.W. and J.Z.; Methodology, J.Z. and J.H.G.; Validation, J.Z. and Y.X.W.; Writing the original
draft, J.Z.; Supervision, J.Z.; Funding acquisition, J.Z.
Funding
This research was funded by the Independent Research Projects of Jiangsu Academy of Forestry [ZZKY202201] and
the Key Research and Development Plan Projects of the National Forestry and Grassland Administration [GLM202183].
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Supplementary Information The online version contains supplementary material available at
https://doi.org/10.1038/s41598-024-79604-8.
Correspondence and requests for materials should be addressed to J.Z.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional aliations.
Open Access is article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide
a link to the Creative Commons licence, and indicate if you modied the licensed material. You do not have
permission under this licence to share adapted material derived from this article or parts of it. e images or
other third party material in this article are included in the articles Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence
and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder. To view a copy of this licence, visit h t t p : / / c r e a t i v e c o m m o
n s . o r g / l i c e n s e s / b y - n c - n d / 4 . 0 / .
© e Author(s) 2024
Scientic Reports | (2024) 14:28528 12
| https://doi.org/10.1038/s41598-024-79604-8
www.nature.com/scientificreports/
Content courtesy of Springer Nature, terms of use apply. Rights reserved
1.
2.
3.
4.
5.
6.
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
apply.
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
not:
use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access
control;
use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is
otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in
writing;
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal
content.
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at
onlineservice@springernature.com