Fig 1 - available via license: Creative Commons Attribution 4.0 International
Overview of the genomic features of M. suaveolens. a Image of M. suaveolens. b Circos plot of M. suaveolens haplotype-resolved gap-free genomic features. I: Chromosome length. II: LTR/Copia coverage. III: LTR/Gypsy elements. IV: Gene density (red). V: Repeat sequence density. VI: GC content. The innermost part of the plot represents the collinear ...
Source publication
Mentha is a commonly used spice worldwide, which possesses medicinal properties and fragrance. These characteristics are conferred, at least partially, by essential oils such as menthol. In this study, a gap-free assembly with a genome size of 414.3 Mb and 31,251 coding genes was obtained for Mentha suaveolens ‘Variegata’. Based on its high heteroz...
Contexts in source publication
Context 1
... of genes encoding 104 key enzymes in the secondary metabolite synthesis pathway 15 ). The final genome size was consistent with the genome size estimated by the K-mer analysis. Based on HiFi reads, we generated two fully resolved haplotypes (termed hapA and hapB) using Hi-C and ONT ultralong reads for assisted haplotyping (Fig. 1b). The assembled haplotypes contained 12 pseudomolecules with a total length of 401.9 Mb and 405.7 Mb, ...
Context 2
... Table 1). The final genome size was similar to that estimated by the K-mer analysis. three telomeres remain to be determined (Fig. 1c, Table S2). Compared with the 3,266 gaps that remained in the M. longifolia genome, the absence of gaps in all chromosomes represented the continuity of the M. suaveolens genome assembly ( ...
Context 3
... lengths generally <5,000 bp) (Fig. 3a, b). Large SVs, such as duplications and inversions, were observed in the M. suaveolens genome (Fig. S13) genes, were performed to link functions and metabolic pathways with these ...
Context 4
... of volatile terpenoids (Fig. 3e, f). Piperitenone oxide is the main volatile compound in M. suaveolens. The analysis of volatile metabolites revealed that terpenoids accounted for the largest proportion (26.65%) (Fig. 4a). The downstream product of isopiperitenone, PO, was predominant among all terpenoid compounds (14.77%) (Fig. 4b) (Fig. S15, S16). (Fig. 5a, b). The expression of two established of these gene and compounds were detected in the leaf, followed by the stem and ...
Context 5
... dot-plot analysis between M. suaveolens and M. longifolia, T. quinquecostatus, and S. tenuifolia showed a 1:1 pattern (Fig. S18). The peaks observed at Ks = ~0.07 ...
Context 6
... quinquecostatus and at Ks = 0.1-0.2 in S. tenuifolia were identified as suspicious peaks. Subsequently, we extracted the corresponding collinear regions of S. tenuifolia and T. quinquecostatus at Ks = ~0.07 (Fig. S19). The results showed that the corresponding peaks are more likely to be caused by tandem repeats, replication, and insertion of certain genomic regions, rather than a WGD ...
Citations
... The high-quality assembly of the F. macrophylla genome is largely attributable to the use of ONT Ultra-long data (N50 > 50 K) (Table S2). Several studies have shown that ONT Ultra-long data significantly improve genome continuity [55,56]. Moreover, both assembly and annotation BUSCO scores exceed 99%, which is higher than those of the C. cajan and P. vulgaris assemblies [57,58]. ...
Flemingia macrophylla, a prominent shrub species within the Fabaceae family, is widely distributed across China and Southeast Asia. In addition to its ecological importance, it possesses notable medicinal value, with its roots traditionally used for treating rheumatism, enhancing blood circulation, and alleviating joint pain. We employed Nanopore sequencing platforms to generate a high-quality reference genome for F. macrophylla, with an assembled genome size of 1.01 Gb and a contig N50 of 59.43 Mb. A total of 33,077 protein-coding genes were predicted, and BUSCO analysis indicated a genome completeness of 99%. Phylogenomic analyses showed that F. macrophylla is most closely related to Cajanus cajan among the sampled taxa, with an estimated divergence time of 13.2–20.0 MYA. Evidence of whole-genome duplication (WGD) events was detected in F. macrophylla, C. cajan, and P. vulgaris, with these species sharing two WGD events. The unique gene families in F. macrophylla are associated with strong resistance to both abiotic and biotic stress, supporting its remarkable ecological adaptability. Furthermore, gene family expansion analysis revealed a significant enrichment of genes related to secondary metabolite biosynthesis, providing a molecular basis for its high medicinal value. In summary, this study provides a foundational genomic resource for F. macrophylla, offering valuable insights into its genetic architecture, evolutionary history, and potential applications in medicine and agriculture. The comprehensive analyses lay the groundwork for future research into the species’ medicinal properties and evolutionary biology.
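The contig N50 quoted in the abstract above is a standard continuity metric: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly. A minimal Python sketch of that definition follows; the function name `contig_n50` and the toy lengths are illustrative assumptions, not data from the cited study.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not real contig lengths.
toy_lengths = [2, 3, 4, 5, 6]
print(contig_n50(toy_lengths))
```

A higher N50 for the same total length means the assembly is split into fewer, longer pieces.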
... With the advancement of sequencing technology, genome survey sequencing can efficiently predict genome size, heterozygosity, and the proportion of repetitive sequences through k-mer analysis. Consequently, to improve genome-sizing accuracy, flow cytometry and genome survey sequencing are often combined for cross-validation before genome sequencing, enabling appropriate sequencing strategies (Gregory, 2005; Leng et al., 2024; Yang et al., 2024). ...
Background
Karyotype and genome size are critical genetic characteristics with significant value for cytogenetics, taxonomy, phylogenetics, evolution, and molecular biology. The Lycosidae family, known for its diverse spiders with varying ecological habits and behavioral traits, has seen limited exploration of its karyotype and genome size.
Methods
We utilized an improved tissue drop technique to prepare chromosome slides and compare the features of male and female karyotypes for two wolf spiders with different habits of Lycosidae. Furthermore, we predicted their genome sizes using flow cytometry (FCM) and K-mer analysis.
Results
The karyotypes of female and male Hippasa lycosina were 2n♀ = 26 = 14 m + 12 sm and 2n♂ = 24 = 10 m + 14 sm, respectively, and were composed of metacentric (m) and submetacentric (sm) chromosomes. In contrast, the karyotypes of Lycosa grahami consisted of telocentric (t) and subtelocentric (st) chromosomes (2n♀ = 20 = 20th and 2n♂ = 18 = 12th + 6t, for females and males). The sex chromosomes were both X1X2O. The estimated sizes of the H. lycosina and L. grahami genomes were 1966.54–2099.89 Mb and 3692.81–4012.56 Mb, respectively. Flow cytometry yielded slightly smaller estimates for genome size compared to k-mer analysis. K-mer analysis revealed a genome heterozygosity of 0.42% for H. lycosina and 0.80% for L. grahami, along with duplication ratios of 21.39% and 54.91%, respectively.
Conclusion
This study describes the first analysis of the genome sizes and karyotypes of two spiders from the Lycosidae that exhibit differential habits and provides essential data for future phylogenetic, cytogenetic, and genomic studies.
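The k-mer analysis mentioned in the abstract above rests on a simple relation: genome size is approximately the total number of k-mers observed in the reads divided by the depth at the main peak of the k-mer frequency histogram. The Python sketch below illustrates only that relation. The function names, the `min_depth` cutoff for discarding likely error k-mers, and the toy inputs are assumptions made for illustration; the sketch ignores reverse complements and heterozygosity, and it is not the pipeline used in the cited study.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_histogram(kmer_counts):
    """Map depth -> number of distinct k-mers seen exactly that many times."""
    return Counter(kmer_counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Estimate genome size as (total k-mers) / (peak depth).

    Depths below min_depth are treated as sequencing errors and dropped
    (an assumed, simplistic filter).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Peak depth: the depth shared by the largest number of distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Hypothetical histogram: 500 error k-mers at depth 1, a main peak at
# depth 30, and a small repeat peak at depth 60.
toy_histogram = {1: 500, 30: 1000, 60: 50}
print(estimate_genome_size(toy_histogram))
```

Dedicated genome-survey tools fit a statistical model to the histogram rather than taking a single peak, which is also how they obtain heterozygosity and repeat estimates.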
... To identify the genes directly involved in DA synthesis in A. pendulum, a more accurate but relatively smaller set of candidate genes was obtained. In the last decades, herb genomics has seen remarkable progress with the successful creation of high-quality assemblies for medicinal plants (Leng et al., 2024; Yang et al., 2024). Therefore, a high-quality genome assembly of plants such as A. pendulum should be carried out as soon as possible, which would facilitate a more thorough understanding of the DA biosynthesis process. ...
Introduction
Aconitum pendulum is a well-known Tibetan medicine that possesses abundant diterpenoid alkaloids (DAs) with high medicinal value. However, due to the complicated structures of DAs and the challenges that in vitro synthesis presents, plants like Aconitum pendulum remain the primary source of DAs.
Methods
Given the underutilization of A. pendulum, a thorough metabolomic and transcriptomic analysis was conducted on its flowers, leaves, and stems to elucidate the regulatory network underlying DA biosynthesis.
Results
Metabolomic profiling (utilizing UPLC-QQQ-MS/MS) identified 198 alkaloids, of which 61 were DAs; the relative abundance of DAs differed among tissues. Without a reference genome, we performed de novo assembly of the transcriptome of A. pendulum. We generated 181,422 unigenes, among which 411 candidate enzyme genes related to the DA synthesis pathway were identified, including 34 differentially expressed genes (DEGs). Through joint analysis of transcriptome and metabolome data, we found a correlation between the detected metabolite levels in various tissues and the expression of related genes. Specifically, it was found that ApCYP1, ApCYP72, and ApCYP256 may be related to turupellin accumulation, while ApBAHD9, ApBAHD10, and ApBAHD12 are positively associated with the accumulation of aconitine. Furthermore, our study also revealed that genes involved in the diterpene skeleton synthesis pathway tend to be highly expressed in flowers, whereas genes related to DA skeleton synthesis and their subsequent modifications are more likely to be highly expressed in leaf and stem tissues. Functional analysis of gene families identified 77 BAHD acyltransferases, 12 O-methyltransferases, and 270 CYP450 enzyme genes potentially involved in the biosynthesis of DAs. The co-expression network between metabolites and related genes revealed 116 significant correlations involving 30 DAs and 58 enzyme genes.
Discussion
This study provides valuable resources for in-depth research on the secondary metabolism of A. pendulum, not only deepening our understanding of the regulatory mechanisms of DA biosynthesis but also providing valuable genetic resources for subsequent genetic improvement and metabolic engineering strategies.
... Currently, the main obstacle in the Aconitum omics field is the lack of high-quality assemblies. In the last decades, herb genomics has seen remarkable progress with the successful creation of high-quality assemblies for medicinal plants, including both telomere-to-telomere and phase-resolved assemblies [115][116][117]. We would expect more high-quality Aconitum genome assemblies in the near future, which could lay the groundwork for other omics works, such as transcriptomics and proteomics. ...
Aconitum stands out among the Ranunculaceae family for its notable use as an ornamental and medicinal plant. Diterpenoid alkaloids (DAs), the characteristic compounds of Aconitum, have been found to have effective analgesic and anti-inflammatory effects. Despite their medicinal potential, the toxicity of most DAs restricts the direct use of Aconitum in traditional medicine, necessitating complex processing before use. The use of high-throughput omics allows for the investigation of Aconitum plant genetics, gene regulation, metabolic pathways, and growth and development. We have collected comprehensive information on the omics studies of Aconitum medicinal plants, encompassing genomics, transcriptomics, metabolomics, proteomics, and microbiomics, from internationally recognized electronic scientific databases such as Web of Science, PubMed, and CNKI. In light of this, we identified research gaps and proposed potential areas and key objectives for Aconitum omics research, aiming to establish a framework for quality improvement, molecular breeding, and a deeper understanding of specialized metabolite production in Aconitum plants.
... Recent advancements in third-generation sequencing technology, notably the successful integration of Oxford Nanopore ultra-long sequencing with PacBio HiFi sequencing, have significantly eased the challenging process of assembling centromeric and other highly repetitive genomic regions. Recently, telomere-to-telomere level assemblies have been completed for several medicinal plants, including Scutellaria baicalensis (Pei et al., 2023), Rhodomyrtus tomentosa (Li F. et al., 2023), Mentha suaveolens (Yang et al., 2024a), Isodon rubescens (Yang et al., 2024b), Peucedanum praeruptorum (Bai et al., 2024), and Rheum officinale, paving the way for accelerated and more comprehensive functional analyses. ...
Medicinal plants are important sources of bioactive specialized metabolites with significant therapeutic potential. Advances in multi-omics have accelerated the understanding of specialized metabolite biosynthesis and regulation. Genomics, transcriptomics, proteomics, and metabolomics have each contributed new insights into biosynthetic gene clusters (BGCs), metabolic pathways, and stress responses. However, single-omics approaches often fail to fully address these complex processes. Integrated multi-omics provides a holistic perspective on key regulatory networks. High-throughput sequencing and emerging technologies like single-cell and spatial omics have deepened our understanding of cell-specific and spatially resolved biosynthetic dynamics. Despite these advancements, challenges remain in managing large datasets, standardizing protocols, accounting for the dynamic nature of specialized metabolism, and effectively applying synthetic biology for sustainable specialized metabolite production. This review highlights recent progress in omics-based research on medicinal plants, discusses available bioinformatics tools, and explores future research trends aimed at leveraging integrated multi-omics to improve the medicinal quality and sustainable utilization of plant resources.
... Additionally, Lamiaceae stands out as the largest Lamiales family, comprising over 240 genera and more than 7800 species globally (Zhao et al., 2021). Many species in the Lamiaceae family contain various secondary metabolites and are commonly used as medicinal plants, such as mint (Vining et al., 2017;Yang et al., 2024), oriental motherwort (Li, Yan, et al., 2024), and Baikal skullcap (Xu, Gao, et al., 2020). ...
Lamiales is one of the largest orders of angiosperms with a complex evolutionary history and plays a significant role in human life. However, the polyploidization and chromosome evolution histories within this group remain a mystery. Among Lamiales, Isodon serra (Maxim.) Kudô shines for its abundance of diterpenes, notably tanshinones, long used in East Asia to combat toxicity and inflammation. Yet, the genes driving its biosynthesis and the factors governing its regulation linger in obscurity. Here, we present the telomere‐to‐telomere genome assembly of I. serra and, through gene‐to‐metabolite network analyses, pinpoint the pivotal tanshinone biosynthesis genes and their co‐expressed transcription factors. Particularly, through luciferase (LUC) assays, we speculate that IsMYB‐13 and IsbHLH‐8 may upregulate IsCYP76AH101, which is the key step in the biosynthesis of the tanshinone precursor. Among Lamiales, Oleaceae, Gesneriaceae and Plantaginaceae are successively sister to a clade of seven Lamiales families, all sharing a recent whole‐genome duplication (designated as α event). By reconstructing the ancestral Lamiales karyotypes (ALK) and post‐α event (ALKα), we trace chromosomal evolution trajectories across Lamiales species. Notably, one chromosomal fusion is detected from ALK to ALKα, and three shared chromosomal fusion events are detected sequentially from ALKα to I. serra, which fully supports the phylogeny constructed using single‐copy genes. This comprehensive study illuminates the genome evolution and chromosomal dynamics of Lamiales, further enhancing our understanding of the biosynthetic mechanisms underlying the medicinal properties of I. serra.
... An increasing number of medicinal plant genomes have been published, including Artemisia argyi, Mentha suaveolens, and C. roseus, which will provide a foundation for the identification of ERF families and functional genomics research (Chen et al., 2023; Yang et al., 2024b; Sun et al., 2023; Pei et al., 2024). ERF protein identification and characterization have been studied in various plant species, including Arabidopsis thaliana (Nakano et al., 2006), barley (Taketa et al., 2008), Fagopyrum tataricum (Liu et al., 2019), grape (Zhuang et al., 2009; Zhu et al., 2019), apple (Girardi et al., 2013), and ginger (Xing et al., 2021). ...
Introduction
Cepharanthine (CEP), a bisbenzylisoquinoline alkaloid (bisBIA) extracted from Stephania japonica, has received significant attention for its anti-coronavirus properties. While ethylene response factors (ERFs) have been reported to regulate the biosynthesis of various alkaloids, their role in regulating CEP biosynthesis remains unexplored.
Methods
Genome-wide analysis of the ERF genes was performed with bioinformatics technology, and the expression patterns in different tissues were analyzed by transcriptome sequencing and verified by real-time quantitative PCR. The nuclear-localized ERF gene cluster was shown to directly bind to the promoters of several CEP-associated genes, as demonstrated by yeast one-hybrid assays and subcellular localization assays.
Results
In this work, 59 SjERF genes were identified in the S. japonica genome and further categorized into ten subfamilies. Notably, a SjERF gene cluster containing three SjERF genes was found on chromosome 2. Yeast one-hybrid assays confirmed that the SjERF gene cluster can directly bind to the promoters of several CEP-associated genes, suggesting their crucial role in CEP metabolism. The SjERFs cluster-YFP fusion proteins were observed exclusively in the nuclei of Nicotiana benthamiana leaves. Tissue expression profiling revealed that 13 SjERFs exhibit high expression levels in the root, and the qRT-PCR results of six SjERFs were consistent with the RNA-Seq data. Furthermore, a co-expression network analysis demonstrated that 24 SjERFs were highly positively correlated with the contents of various alkaloids and expression levels of CEP biosynthetic genes.
Conclusion
This study provides the first systematic identification and analysis of ERF transcription factors in the S. japonica genome, laying the foundation for future functional research on SjERF transcription factors.
... Due to the remarkable advancements in sequencing technology, a vast array of species has been sequenced (Yang et al., 2024a), and a total of 2,836 genomes from 1,410 plant species were available by 2023 (Xie et al., 2024). Of course, the genome assembly quality has also improved rapidly (Yang et al., 2024b). These advances enabled the emergence of several databases dedicated to housing their genomes, such as the 1 K medicinal plant genome database (Su et al., 2022), the Rosaceae genome database (Jung et al., 2019), the cucurbit genomics database, the Portal of Juglandaceae (Guo et al., 2020), and the Traditional Chinese Medicine Plant Genome database (TCMPG; http://cbcb.cdutcm.edu.cn/TCMPG/) ...
Asteraceae, the largest family of angiosperms, has attracted widespread attention for its exceptional medicinal, horticultural, and ornamental value. However, research on Asteraceae plants faces challenges due to their intricate genetic background. With the continuous advancement of sequencing technology, a vast number of genomes and genetic resources from Asteraceae species have been accumulated. This has spurred a demand for comprehensive genomic analysis within this diverse plant group. To meet this need, we developed the Asteraceae Genomics Database (AGD; http://cbcb.cdutcm.edu.cn/AGD/). The AGD serves as a centralized and systematic resource, empowering researchers in various fields such as gene annotation, gene family analysis, evolutionary biology, and genetic breeding. AGD not only encompasses high-quality genomic sequences and organelle genome data but also provides a wide range of analytical tools, including BLAST, JBrowse, SSR Finder, HmmSearch, Heatmap, Primer3, PlantiSMASH, and CRISPRCasFinder. These tools enable users to conveniently query, analyze, and compare genomic information across various Asteraceae species. The establishment of AGD holds great significance in advancing Asteraceae genomics, promoting genetic breeding, and safeguarding biodiversity by providing researchers with a comprehensive and user-friendly genomics resource platform.
Bottle gourd (Lagenaria siceraria (Molina) Standl) is a widely distributed Cucurbitaceae species, but gaps and low-quality assemblies have limited its genomic study. To address this, we assembled a nearly complete, high-quality genome of the bottle gourd (Pugua) using PacBio HiFi sequencing and Hi-C correction. The genome, being 298.67 Mb long with a contig N50 of 28.55 Mb, was identified to possess 11 chromosomes, 11 centromeres, 18 telomeres, and 24,439 predicted protein-coding genes; notably, gap-free telomere-to-telomere assembly was accomplished for seven chromosomes. Based on the Pugua genome, the transcriptomic and metabolomic combined analyses revealed that amino acids and lipids accumulate during the expansion stage, while sugars and terpenoids increase during ripening. GA4 and genes of the Aux/IAA family mediate fruit expansion and maturation, while cell wall remodeling is regulated by factors such as XTHs, EXPs, polyphenols, and alkaloids, contributing to environmental adaptation. GGAT2 was positively correlated with glutamate, a source of umami, and SUS5 and SPS4 expression aligned with sucrose accumulation. This study provides a valuable genetic resource for bottle gourd research, enhancing the understanding of Cucurbitaceae evolution and supporting further studies on bottle gourd development, quality, and genetic improvement.
The black wolfberry (Lycium ruthenicum; 2n = 2x = 24) is an important medicinal plant with ecological and economic value. Its fruits have numerous beneficial pharmacological activities, especially those of anthocyanins, polysaccharides, and alkaloids, and have high nutritional value. However, the lack of available genomic resources for this species has hindered research on its medicinal and evolutionary mechanisms. In this study, we developed the telomere-to-telomere (T2T) nearly gapless genome of L. ruthenicum (2.26 Gb) by integrating PacBio HiFi, Nanopore Ultra-Long, and Hi-C technologies. The assembled genome comprised 12 chromosomes with 37,149 protein-coding genes functionally annotated. Approximately 80% of the repetitive sequences were identified, of which long terminal repeats (LTRs) were the most abundant, accounting for 73.01%. The abundance of LTRs might be the main reason for the larger genome of this species compared to that of other Lycium species. The species-specific genes of L. ruthenicum were related to defense mechanisms, salt tolerance, drought resistance, and oxidative stress, further demonstrating their superior adaptability to arid environments. Based on the assembled genome and fruit transcriptome data, we further constructed an anthocyanin biosynthesis pathway and identified 19 candidate structural genes and seven transcription factors that regulate anthocyanin biosynthesis in the fruit developmental stage of L. ruthenicum, most of which were highly expressed at a later stage in fruit development. Furthermore, 154 potential disease resistance-related nucleotide-binding genes have been identified in the L. ruthenicum genome. The whole-genome and proximal, dispersed, and tandem duplication genes in the L. ruthenicum genome enriched the number of genes involved in anthocyanin synthesis and resistance-related pathways. These results provide an important genetic basis for understanding genome evolution and biosynthesis of pharmacologically active components in the Lycium genus.