Article https://doi.org/10.1038/s41467-024-48771-7
Discovery of fungal onoceroid triterpenoids
through domainless enzyme-targeted global
genome mining
Jia Tang¹ & Yudai Matsuda¹
Genomics-guided methodologies have revolutionized the discovery of natural
products. However, a major challenge in the field of genome mining is
determining how to selectively extract biosynthetic gene clusters (BGCs) for
untapped natural products from numerous available genome sequences. In
this study, we developed a fungal genome mining tool that extracts BGCs
encoding enzymes that lack a detectable protein domain (i.e., domainless
enzymes) and are not recognized as biosynthetic proteins by existing bioin-
formatic tools. We searched for BGCs encoding a homologue of Pyr4-family
terpene cyclases, which are representative examples of apparently domainless
enzymes, in approximately 2000 fungal genomes and discovered several
BGCs with unique features. The subsequent characterization of selected BGCs
led to the discovery of fungal onoceroid triterpenoids and unprecedented
onoceroid synthases. Furthermore, in addition to the onoceroids, a previously
unreported sesquiterpene hydroquinone, of which the biosynthesis involves a
Pyr4-family terpene cyclase, was obtained. Our genome mining tool has broad
applicability in fungal genome mining and can serve as a beneficial platform
for accessing diverse, unexploited natural products.
Recent years have witnessed an exponential increase in data accu-
mulation across all fields, ushering us into the era of big data. Research
on naturally occurring organic compounds, typically referred to as
natural products, is no exception to this global trend1,2. Traditional
methods used to isolate natural products often lead to the rediscovery
of known compounds, and therefore, scientists are now required to
discover hidden natural products or synthesize their analogues using
alternative strategies. Leveraging big data presents a promising solu-
tion to this problem. The rapid accumulation of microbial genome
sequence data has revealed that microbial genomes harbor more
natural product biosynthetic gene clusters (BGCs) than one can pre-
dict based on the number of metabolites produced under standard
laboratory cultivation conditions3–5. Thus, the activation and char-
acterization of these silent or orphan BGCs present in microbial gen-
omes can lead to the discovery of previously undescribed natural
products. A method known as genome mining has been demonstrated
as useful for discovering unexploited metabolites in recent decades6–8.
However, a major challenge in the field of genome mining is deter-
mining how to prioritize BGCs identified in available genome sequence
data for the efficient discovery of untapped natural products.
Fungi are prolific producers of natural products with diverse
molecular structures and biological activities, establishing them as
promising sources of unexplored metabolites. To facilitate genome
mining in fungi, several bioinformatic tools, such as antiSMASH9,
SMURF10, DeepBGC11, and TOUCAN12, can be employed to detect BGCs
in a specific fungal genome. Among the available tools, antiSMASH has
gained the most popularity, possibly because of its user-friendly online
platform and versatile functions. However, some challenges have been
encountered while using antiSMASH for BGC detection and genome
mining. First, antiSMASH classifies many genes as “other genes”, even
when they encode a biosynthetic enzyme. For example, when antiSMASH
(version 7.0.0) was employed to analyze the BGC of the fungal
meroterpenoid novofumigatonin13, a hybrid molecule of polyketide
and terpenoid origin, it categorized 6 out of 13 genes as other genes.
Received: 10 November 2023
Accepted: 9 May 2024
¹Department of Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR, China. e-mail: ymatsuda@cityu.edu.hk
Nature Communications | (2024) 15:4312 1
This may be because some of these genes lacked a detectable Pfam
domain14 (Supplementary Fig. 1). Moreover, one of these six over-
looked genes encodes the terpene cyclase NvfL, which has widely
distributed homologues in natural product pathways15. Thus, although
antiSMASH should theoretically categorize this BGC as “T1PKS, terpene”
(type I polyketide synthase + terpene synthase), it recognized
only T1PKS. In addition, these enzymes without a detectable protein
domain (hereafter termed domainless enzymes) included the α-
ketoglutarate-dependent dioxygenase NvfI and the methyltransfer-
ase NvfJ, indicating that antiSMASH failed to recognize various bio-
synthetic proteins. Furthermore, although antiSMASH and the other
aforementioned tools can be used to extract all possible BGCs from a
genome sequence, they do not facilitate the selective extraction of
BGCs with a desired or specific feature. Thus, users need to select BGCs
based on the criteria set for subsequent wet lab experiments. In this
study, we aimed to overcome these challenges by developing a widely
applicable genome mining methodology for the rapid discovery of
unexplored natural products and biosynthetic mechanisms.
Results
Development of a fungal genome mining tool
Recent studies on the biosynthesis of fungal natural products have
identified many biosynthetic enzymes lacking a detectable Pfam
domain, such as NvfI and NvfL, which are not recognized by anti-
SMASH. Because these domainless biosynthetic enzymes occasionally
drive intriguing chemical reactions, we hypothesized that a genome
mining strategy focusing on BGCs encoding weak homologues of such
enzymes would offer a promising approach to obtaining untapped
metabolites. To identify biosynthetic proteins without detectable
Pfam domains, we first collected known fungal biosynthetic proteins.
Although these proteins can be obtained from the UniProtKB/Swiss-
Prot or MIBiG database16, these databases contain gene or protein
sequences that can be mispredicted and are not manually corrected.
Thus, we created our own fungal BGC database by manually collecting,
reviewing, correcting where necessary, and systematically annotating
approximately 700 fungal BGCs (Supplementary Data 1). We named
this database the FunBGCs database.
Next, we extracted all protein sequences from the created data-
base and conducted a Pfam domain search on the resulting 5070
proteins. This led to the identification of 520 Pfam domains. However,
572 proteins lacked a detectable Pfam domain. These domainless
proteins were divided into groups based on their sequence similarity,
and their hidden Markov model (HMM) profiles were generated for
HMMER17 analysis. The representative members of these protein
groups include the terpene cyclase Pyr418, the Diels–Alderase Fsa219,
the hetero Diels–Alderase AsR520, the epoxide hydrolase CtvD21, and
the isomerase Trt1422. Furthermore, several additional HMM profiles
were created for the domains of polyketide synthases (PKSs) and
nonribosomal peptide synthetases (NRPSs). The identified Pfam
domains, the manually added protein families or domains, and a few
additional protein domains from the SMART23 and TIGRFAMs24 data-
bases (Supplementary Data 2 and 3) were combined for use in bio-
synthetic protein detection. Moreover, a DIAMOND25 database was
constructed with all of the extracted biosynthetic proteins.
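
The split into domain-containing and domainless proteins, and the grouping of the latter before HMM building, can be sketched as follows. This is a minimal illustration and not the FunBGCeX implementation: the text does not state how the similarity groups were formed, so single-linkage grouping (connected components over pairwise hits) is an assumption, as are the function names, the toy Pfam hit lists, and the single similarity pair.

```python
from collections import defaultdict

def split_by_domain(pfam_hits):
    """Partition protein IDs by whether any Pfam domain was detected."""
    with_domain = {p for p, hits in pfam_hits.items() if hits}
    domainless = set(pfam_hits) - with_domain
    return with_domain, domainless

def group_by_similarity(proteins, similar_pairs):
    """Single-linkage grouping (an assumption): connected components
    over pairwise similarity edges, computed with union-find."""
    parent = {p: p for p in proteins}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in similar_pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

# Toy input: protein names come from the text, hit lists are hypothetical.
pfam_hits = {
    "PKS_example": ["PF00109", "PF02801"],
    "Pyr4": [], "NvfL": [], "Fsa2": [], "CtvD": [],
}
# Assumed output of an all-vs-all similarity search (illustrative only).
similar_pairs = [("Pyr4", "NvfL")]

with_domain, domainless = split_by_domain(pfam_hits)
families = group_by_similarity(domainless, similar_pairs)
print(families)
```

Each resulting group would then be aligned and turned into one HMM profile; that step depends on external tools and is left out of the sketch.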
BGCs were extracted as follows (Fig. 1a). Initially, genes whose
products belonged to a protein family of interest were identified from
a given fungal genome. Subsequently, genes in the flanking regions of
each identified gene were examined to determine if they encoded a
core protein (e.g., PKSs, NRPSs, terpene synthases or cyclases, and
prenyltransferases; refer to Supplementary Table 1 for more details) or
another type of biosynthetic protein using the previously created in-
house HMM library and DIAMOND database. If a target gene was found
to be colocalized with a core protein gene, then the genomic region
was extracted as a BGC by applying a similar strategy used for other
fungal genome mining tools, such as antiSMASH9 and SMURF10. To
determine whether a gene in flanking regions was clustered with the
target, the following factors were considered: (i) whether the gene
encoded a biosynthetic protein, (ii) whether the gene was duplicated in
the genome (to include a possible self-resistance enzyme gene26), (iii)
whether the gene was biosynthetically related to nearby genes, (iv)
whether the gene encoded a small protein (to ignore mispredicted
small proteins), and (v) the distance between a given gene and the next
one (refer to the Methods section for the detailed procedure). In
addition, the extracted BGCs were visualized using a web browser,
which displayed general information on each BGC and each gene
(Fig. 1b). The fungal genome mining tool developed in this study was
named FunBGCeX (Fungal BGC eXtractor).
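
The extraction workflow described above can be sketched in simplified form. The example below is a sketch under stated assumptions and is not the FunBGCeX code: it applies only criteria (i) and (v), that is, whether a flanking gene is biosynthetic and how far it lies from its neighbor, and it omits criteria (ii) to (iv). The `Gene` class, the `extract_cluster` function, the gene names, and the 3000 bp gap limit are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    name: str
    start: int
    end: int
    biosynthetic: bool = False  # assumed: hit in the HMM library or DIAMOND database
    core: bool = False          # e.g., PKS, NRPS, terpene synthase/cyclase
    target: bool = False        # member of the protein family of interest

def extract_cluster(genes, max_gap=3000):
    """Grow a cluster outward from the target gene.

    A flanking gene is added only if it is biosynthetic and the intergenic
    gap to the current cluster edge is at most max_gap (hypothetical value).
    """
    ordered = sorted(genes, key=lambda g: g.start)
    idx = next(i for i, g in enumerate(ordered) if g.target)
    left = right = idx
    # Walk upstream of the target.
    while left > 0:
        cand = ordered[left - 1]
        if not cand.biosynthetic or ordered[left].start - cand.end > max_gap:
            break
        left -= 1
    # Walk downstream of the target.
    while right < len(ordered) - 1:
        cand = ordered[right + 1]
        if not cand.biosynthetic or cand.start - ordered[right].end > max_gap:
            break
        right += 1
    cluster = ordered[left:right + 1]
    # Is the target colocalized with another core protein gene?
    has_other_core = any(g.core and not g.target for g in cluster)
    return cluster, has_other_core

# Toy contig with hypothetical coordinates.
contig = [
    Gene("gene1", 0, 1000),
    Gene("geneM", 2000, 3500, biosynthetic=True),
    Gene("geneS", 4000, 6500, biosynthetic=True, core=True),
    Gene("geneB", 7000, 7800, biosynthetic=True, core=True, target=True),
    Gene("gene5", 20000, 21000, biosynthetic=True),
]
cluster, has_other_core = extract_cluster(contig)
print([g.name for g in cluster], has_other_core)
```

In this toy contig, `gene1` is excluded because it is not biosynthetic and `gene5` because it lies too far from the cluster edge.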
Discovery of fungal BGCs with unusual features
To examine whether our genome mining tool allows for the efficient
discovery of BGCs that potentially synthesize an unreported class of
natural products, we extracted BGCs that encoded a homologue of
Pyr4 (Fig. 2a)18,27,28, which is a noncanonical transmembrane terpene
cyclase and does not have a conserved domain, as inferred from the
Fig. 1 | Fungal genome mining tool. a General workflow to extract biosynthetic
gene clusters (BGCs) with a target gene from a given fungal genome. b Example of
output from the genome mining tool. For this panel, BGCs encoding a Pyr4
homologue were extracted from the fungus Aspergillus alliaceus CBS 536.65,
yielding five BGCs. The top table provides general information on a BGC. A sche-
matic representation of BGC is displayed in the middle. On clicking each gene,
information on that gene is provided in the bottom table. The BGC shown here is
identical to the alli cluster (Fig. 3a).
National Center for Biotechnology Information (NCBI) Conserved
Domain Search29. Thus, such BGCs were extracted from 1990 anno-
tated fungal reference genomes downloaded from the NCBI database
(Supplementary Data 4) using an HMM profile created from Pyr4 and
its homologues15,30. This process resulted in the identification of 1050
BGCs (including those with a single gene; Supplementary Data 5). More
than half (690) of the identified BGCs did not contain an additional
core protein gene, whereas 182 BGCs possessed at least one PKS gene,
and 131 BGCs were found to encode a PaxC-like prenyltransferase,
which is specifically required for the biosynthesis of indole sesqui-
terpenoids and diterpenoids31. Thus, except for standalone pyr4-like
genes, the majority of the pyr4-like genes were clustered with either a
Fig. 2 | Discovery of atypical BGCs encoding a Pyr4-family terpene cyclase.
a Reactions catalyzed by the selected members of Pyr4-family terpene cyclases.
b Phylogenetic analysis of known Pyr4 homologues and those identified in this
study, along with their associated biosynthetic gene clusters. Red asterisks indicate
proteins not detected by antiSMASH analysis. The gene and protein sequences of
several identified Pyr4 homologues were manually revised upon creating the
phylogenetic tree (refer to Supplementary Data 7 for their protein sequences).
PKS gene or a paxC-like gene. This observation was consistent with the
fact that, in fungi, Pyr4-family terpene cyclases have been found only in
the biosynthetic pathways of polyketide–terpenoid hybrids and indole
terpenoids32,33.
We then focused on BGCs that show low similarity with known
fungal BGCs and encode a Pyr4-like protein as well as a core enzyme
different from PKS or PaxC-like prenyltransferase, with the expectation
that such BGCs are responsible for the synthesis of unusual metabo-
lites. After manual inspection of the raw output from the genome
mining tool, we found that 16 BGCs encoded one or two protein(s)
homologous to squalene–hopene cyclases (SHCs) or oxidosqualene
cyclases (OSCs), which are involved in the biosynthesis of triterpe-
noids. Intriguingly, Pyr4-like proteins encoded by these BGCs formed a
distinct clade when subjected to phylogenetic analysis with known
Pyr4-family terpene cyclases (Fig. 2b). During our study, we also
observed that the genome of Aspergillus fumigatus A1163, a non-
reference genome of A. fumigatus, contained a similar BGC, which was
reserved for further investigation. To date, no metabolic pathway
involving both a Pyr4-family terpene cyclase and an SHC/OSC-like
enzyme has been identified. Furthermore, the nature of these BGCs
suggests that their products are triterpenoids whose biosynthesis
requires two cyclization events. However, to the best of our knowl-
edge, none of the known fungal metabolites appear to be synthesized
by these BGCs. Therefore, we speculate that these BGCs are respon-
sible for the biosynthesis of untapped triterpenoid molecules. In
addition, we found seven BGCs that encoded a UbiA-like pre-
nyltransferase but lacked a PKS gene (Fig. 2b). Thus, it is predicted that
these UbiA-like prenyltransferases accept a non-polyketide substrate
to attach a prenyl group, which would then be cyclized by the Pyr4-
family terpene cyclase encoded by these BGCs. However, such a bio-
synthetic mechanism has never been reported, and therefore, these
BGCs might be involved in the formation of meroterpenoids with a
non-polyketide moiety.
In terms of the sources of the Pyr4-like proteins identified in this
study, most of them (1010/1050) are of ascomycete origin, which is
consistent with the fact that all of the characterized fungal Pyr4-family
terpene cyclases are from ascomycete fungi. However, we also noted
the presence of Pyr4 homologues from fungi in other divisions, namely
Chytridiomycota (12/1050), Mucoromycota (3/1050), and Basidiomy-
cota (25/1050), some of which are clustered with other potential bio-
synthetic genes (Fig. 2b). Two basidiomycete fungi, Marasmius
tenuissimus GH-37 and Psilocybe cf. subviscida CBS 101986, possess
homologous BGCs with a pyr4-like gene. These two BGCs apparently
lack a prenyltransferase gene or other well-known core enzyme genes,
and therefore, the Pyr4 homologues encoded by these BGCs might
directly accept a ubiquitously present polyprenyl molecule. Further-
more, the chytrid fungus Rhizophlyctis rosea JEL0764 harbors a BGC
that encodes a Pyr4-like protein and a PaxC-like prenyltransferase.
Since this Pyr4 homologue is distantly related to any of the known
Pyr4-family terpene cyclases, the BGC could potentially synthesize an
unprecedented meroterpenoid species. Collectively, our global gen-
ome mining analysis could readily identify several different groups of
unexploited BGCs in diverse fungi, which are worth studying further.
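
The counts reported in this section can be tallied as a quick consistency check. The snippet below only restates numbers given in the text; the variable and function names are illustrative. The text does not say that the core-gene categories are mutually exclusive, so they are not summed.

```python
TOTAL_BGCS = 1050  # BGCs encoding a Pyr4 homologue, as reported in the text

# Counts by additional core gene (not assumed to be mutually exclusive).
core_counts = {
    "no additional core gene": 690,
    "PKS": 182,
    "PaxC-like prenyltransferase": 131,
}

# Counts by fungal division, as reported in the text.
division_counts = {
    "Ascomycota": 1010,
    "Basidiomycota": 25,
    "Chytridiomycota": 12,
    "Mucoromycota": 3,
}

def percent(count, total=TOTAL_BGCS):
    """Share of the total, in percent, rounded to one decimal place."""
    return round(100 * count / total, 1)

# The four divisions account for every identified BGC.
assert sum(division_counts.values()) == TOTAL_BGCS

for label, n in {**core_counts, **division_counts}.items():
    print(f"{label}: {n}/{TOTAL_BGCS} = {percent(n)}%")
```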
To compare the performance of our genome mining methodol-
ogy with that of antiSMASH (fungiSMASH) analysis, the same set of
1990 fungal genomes used in our genome mining were analyzed by the
standalone version of antiSMASH. The detected BGCs were then
examined to see whether they encoded a Pyr4 homologue, resulting in
the extraction of 325 BGCs (Supplementary Data 6). Although these
BGCs include those encoding an (oxido)squalene cyclase, most of the
other uncharacterized BGCs mentioned above could not be detected
by antiSMASH. Furthermore, antiSMASH analysis is unable to tell
which BGCs encode a Pyr4-like protein (or another protein of the user’s
interest), and therefore, the extraction of BGCs with a pyr4 homologue
by antiSMASH requires (i) detection of all BGCs from given genome
sequences and (ii) manual inspection to determine whether each BGC
encodes a Pyr4 homologue. Altogether, the antiSMASH analysis fails to
detect certain types of BGCs that can be detected by our approach
and also requires a much longer time to complete. In contrast, our
genome mining method allows for the automated extraction of BGCs of
the user’s interest within a short period of time.
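
Step (ii) above, deciding which already-detected BGCs encode a homologue of a protein of interest, can be illustrated with a short sketch. It assumes that the proteins of each BGC have been searched against a profile HMM with HMMER's `hmmsearch --tblout`, whose table lists the target name in the first column and the full-sequence E-value in the fifth. The E-value cutoff, the protein and BGC identifiers, the profile name `Pyr4_like`, and the truncated table are hypothetical, and this is not how antiSMASH or FunBGCeX is implemented.

```python
def parse_tblout(text, max_evalue=1e-5):
    """Return {target: best full-sequence E-value} for hits passing the cutoff.

    Assumes the hmmsearch --tblout column order: target name, accession,
    query name, accession, full-sequence E-value, score, bias, ...
    """
    hits = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits

def bgcs_with_hit(bgc_proteins, hits):
    """Select BGCs that contain at least one protein with a passing hit."""
    return sorted(b for b, prots in bgc_proteins.items() if hits.keys() & set(prots))

# Toy table, truncated to the first seven columns (hypothetical values).
tblout = """\
# target name  accession  query name  accession  E-value  score  bias
protA          -          Pyr4_like   -          2.1e-40  135.2  0.1
protC          -          Pyr4_like   -          0.5      10.1   0.0
"""
bgc_proteins = {"BGC1": ["protA", "protB"], "BGC2": ["protC"], "BGC3": ["protD"]}

hits = parse_tblout(tblout)
selected = bgcs_with_hit(bgc_proteins, hits)
print(selected)
```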
Characterization of selected triterpenoid BGCs
To examine whether the unusual BGCs identified by our global gen-
ome mining analysis can yield untapped natural products, we then
focused on those encoding an SHC/OSC-like protein and selected
three BGCs for experimental investigation. The three BGCs were from
Aspergillus homomorphus CBS 101889, A. fumigatus A1163 (CBS
144.89), and Aspergillus alliaceus CBS 536.65, which were designated as
the homo, fumi, and alli clusters, respectively (Fig. 3a, Supplementary
Table 2, and Supplementary Data 8). The homo cluster was the simplest
of the three and encoded the squalene or oxidosqualene cyclase
HomoS, the Pyr4-family terpene cyclase HomoB, and the FAD-
dependent monooxygenase (FMO) HomoM, the last of which is
homologous to epoxidases involved in fungal meroterpenoid
pathways32,33. In addition, the homologues of the three enzymes were
conserved in the other two BGCs, although the fumi cluster contained
two homoS homologues, fumiS1 and fumiS2. On the basis of the predicted
functions of Homo enzymes, the product of the homo cluster
could be synthesized as follows. First, HomoS cyclizes squalene or
oxidosqualene to yield a cyclized product. Then, HomoM epoxidizes
one of the olefinic double bonds present in the cyclized product, and
HomoB finally performs a second round of cyclization to produce the
end pathway product. The other two BGCs would employ a similar
biosynthetic mechanism; however, they each encoded one or two
tailoring enzymes predicted to modify the triterpenoid scaffold. The
fumi cluster contained a cytochrome P450 monooxygenase gene
fumiP, whereas the alli cluster encoded the P450 AlliP and the
acetyltransferase AlliA. Collectively, all three BGCs were expected to
yield different triterpenoid species.
To analyze the functions of the three BGCs and identify the
metabolites produced, we performed heterologous expression
experiments in Aspergillus oryzae NSAR134, which has been widely utilized
for the characterization of orphan BGCs27,35–37. We first individually
expressed the four putative triterpene synthase genes (homoS,
fumiS1, fumiS2, and alliS) in A. oryzae and then analyzed metabolites
from the resulting transformants using gas chromatography–mass
spectrometry (GC–MS). The findings revealed that all of the enzymes,
except FumiS2, yielded specific metabolite(s) (Fig. 3b, traces i to v).
The homoS-transformed strain produced a major product 1 with m/z
410 [M]+, which was not present in the host strain. After isolation,
metabolite 1 was identified as bicyclic triterpene α-polypodatetraene
based on the comparison of its nuclear magnetic resonance (NMR)
spectra and specific rotation with reported data38 (Fig. 3c). Compound
1 was also obtained from the A. oryzae transformant harboring fumiS1;
however, the transformant produced another product 2, which was
identified to be α-polypodatetraen-3β-ol39, the C-3 hydroxy analogue
of 1 (Fig. 3c). The metabolic profile of the A. oryzae strain with alliS
differed from those of the other two. A specific metabolite 3 was
detected as a major peak and was identified as 8α-hydroxypolypoda-
13,17,21-triene40 (Fig. 3c).
We then identified the downstream metabolites of the three
pathways and first focused on the end product of the homo pathway.
The co-expression of the epoxidase gene homoM and the pyr4
homologue homoB with homoS resulted in the formation of an additional
metabolite 4 with m/z 426 [M]+ (Fig. 3b, trace vi). Analysis of the
NMR data of 4 revealed the presence of two additional ring systems,
whose absolute structure was established using the modified Mosher’s
method41 (Fig. 3c and Supplementary Fig. 2). Thus, compound 4,
hereby named homomonoceroid A, is synthesized through the cyclization
of squalene from both termini and is classified as an onoceroid.
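
The reported molecular ions can be checked against nominal masses with a short calculation. The sketch below assumes the formula C30H50 for compound 1 (a cyclized isomer of squalene) and C30H50O for compound 4 (one added oxygen from the epoxidation step); the formula for 4 is an inference from the proposed pathway rather than a value stated in the text, and the function name is illustrative.

```python
import re

# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}

def nominal_mass(formula):
    """Sum nominal atomic masses for a simple formula such as 'C30H50O'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL_MASS[element] * (int(count) if count else 1)
    return total

compound_1 = "C30H50"   # α-polypodatetraene, isomeric with squalene
compound_4 = "C30H50O"  # assumed formula: compound 1 plus one oxygen

print(nominal_mass(compound_1))  # compare with the reported m/z 410 [M]+
print(nominal_mass(compound_4))  # compare with the reported m/z 426 [M]+
```

The 16-unit difference between the two ions matches the incorporation of a single oxygen atom.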
Next, the co-expression system of fumiS1, fumiM, and fumiB was
constructed, which yielded two additional metabolites, 5 and 6
(Fig. 3b, trace vii). The major product 5, designated as fumionoceroid
A, was determined to be another onoceroid; its structure was established
through NMR and X-ray crystallographic analyses (Fig. 3c and
Supplementary Fig. 3; CCDC: 2279439). Compound 5 was also found to
be a tetracyclic onoceroid like 4, and both 4 and 5 appeared to be
derived from 1. However, the newly formed bicyclic system of 5 differed
from that of 4. The other product 6, fumionoceroid B, was
determined to be the C-3 hydroxy form of 5 and appeared to be synthesized
in the same manner as 5, using oxidosqualene as a starting
material (Fig. 3c). The absolute structure of 6 was deduced based on
that of 5. Subsequently, the P450 gene fumiP was introduced, yielding
an additional product 7 (Fig. 3b, trace viii). NMR and X-ray crystallographic
analyses revealed that 7, named fumionoceroid C, was a
hydroxylated form of 5 at C-24 (Fig. 3c and Supplementary Fig. 3;
CCDC: 2279440).
When alliS, alliM, and alliB were introduced into A. oryzae, the
transformant with the three genes produced compound 8 (Fig. 3b,
trace ix). In contrast to the products obtained from the other BGCs, 8
was identified as a pentacyclic onoceroid and named alliaonoceroid A
(Fig. 3c). Although 8 has never been isolated from natural sources, it
was predicted to be a biosynthetic precursor of cupacinoxepin42,
which had been isolated from the Ecuadorian plant Cupania cinerea43.
We investigated steps involved in the biosynthetic processes catalyzed
by the P450 AlliP and the acetyltransferase AlliA. We created four-gene
expression systems with either alliP or alliA to determine the enzyme
required first in the biosynthetic process. Both transformants
yielded a specific product. The introduction of alliP and alliA led to the
formation of 9 and 10, respectively (Fig. 3b, traces x and xi), which
were characterized to be the C-21 keto and C-6α hydroxy analogue of 8
and the O-acetylated form of 8, respectively, through NMR and X-ray
crystallographic analyses (Fig. 3c and Supplementary Fig. 3; CCDC:
2279441 and 2279442). Finally, the transformant harboring all five alli
genes produced 11 (Fig. 3b, trace xii), which was determined to possess
two acetoxy groups at the C-16β and C-21α positions (Fig. 3c and
Supplementary Fig. 3; CCDC: 2279443). Compounds 9–11 were
designated as alliaonoceroids B–D, respectively.
We then investigated whether the fungal strains that originally
harbor the onoceroid BGCs yield the onoceroid species obtained in
this study. We found that, based on LC–APCI–MS/MS analysis, A.
homomorphus CBS 101889 and A. alliaceus CBS 536.65 produced 4 and
11, respectively (Supplementary Fig. 4), which are the predicted end
pathway products of the homo and alli clusters, respectively. Accord-
ingly, the expression of the biosynthetic genes was also confirmed by
RT-PCR (Supplementary Fig. 5). Meanwhile, the metabolite analysis of
A. fumigatus CBS 144.89 suggested that 7, the product from the fumi
cluster, is formed by the fungus when cultivated on a potato
dextrose agar (PDA) plate (Supplementary Fig. 4); however, the MS/MS
spectrum of the concerned compound could not be obtained due to
its low productivity, and therefore, we were unable to confirm the
production of 7 in A. fumigatus.
Discovery of 4-hydroxybenzoate-derived meroterpenoids
To further demonstrate the utility of our genome mining strategy, we
then aimed to discover another type of fungal metabolites by char-
acterizing an atypical BGC that is not detected by antiSMASH. To this
end, we focused on a BGC derived from Neoarthrinium moseri CBS
164.80, which was designated as the mos cluster (Figs. 2b and 4a and
Supplementary Table 3). The mos cluster encoded the two FAD-
dependent monooxygenases/oxidoreductases MosA and MosD, the
Pyr4-family terpene cyclase MosB, the UbiA-like prenyltransferase
MosC, the N-acetyltransferase MosE, and the hypothetical protein
MosF. To obtain the metabolite synthesized by the mos cluster, the six
mos genes were heterologously expressed in A. oryzae NSARU144. The
resultant A. oryzae transformant yielded metabolites 12 and 13, which
were absent in the host fungal strain (Fig. 4b, traces i and ii), both of
which possess the molecular formula C21H32O3. However, the productivity
of the two metabolites was considerably low, probably due to
the insufficient supply of their precursor molecule. Interestingly, we
noted that the FMO MosD is highly homologous to the
[Fig. 3 graphic. a Gene maps of the homo, fumi, and alli clusters (genes shown: homoM, homoS, homoB; fumiM, fumiS1, fumiP, fumiT, fumiB, fumiS2; alliS, alliB, alliP, alliA, alliM), colored by predicted function ((oxido)squalene cyclase, Pyr4-family terpene cyclase, FAD-dependent monooxygenase, cytochrome P450 monooxygenase, acetyltransferase, transporter), with a 30% to 100% shading scale. b Traces (i) NSAR1, (ii) homoS, (iii) fumiS1, (iv) fumiS2, (v) alliS, (vi) homoS+M+B, (vii) fumiS1+M+B, (viii) fumiS1+M+B+P, (ix) alliS+M+B, (x) alliS+M+B+P, (xi) alliS+M+B+A, and (xii) alliS+M+B+P+A; x-axis, Time (min). c Structures of compounds 1 to 11.]
Fig. 3 | Functional analysis of fungal onoceroid BGCs. a Schematic representa-
tions of fungal onoceroid biosynthetic gene clusters and BLASTp comparisons of
each gene product. b Gas chromatography–mass spectrometry profiles of the
metabolites obtained from Aspergillus oryzae transformants. c Structures of
triterpenoids obtained in this study. The numbers of previously undescribed
compounds are given in blue (note that 8 had only been obtained as a synthetic
compound42).
4-hydroxybenzoate decarboxylase BisD (60% amino acid sequence
identity), which is involved in the biosynthesis of biscognienyne B45.
Thus, we reasoned that 4-hydroxybenzoic acid (4-HBA) served as the
precursor for 12 and 13. To investigate this hypothesis, the A. oryzae
transformant was cultivated in the presence of 4-HBA, which resulted
in the production of 12 and 13 in a much higher yield (Fig. 4b, trace iii).
Subsequently, both compounds were isolated from large-scale culti-
vation. The minor product 13 was identified as ent-yahazunol based on
its NMR spectra and specific rotation (Fig. 4c)46,47, which was originally
isolated from a sponge of the genus Dysidea. The NMR spectra of 12
revealed that 12 possesses the same planar structure as 13 and is the
C-8 epimer of 13, and 12 was named moserinol (Fig. 4c). As discussed
later, the structures of 12 and 13 suggested that MosB, MosC, and
MosD are sufficient to synthesize these compounds. Thus, we created
another A. oryzae transformant harboring mosB, mosC, and mosD,
which, as expected, afforded 12 and 13 (Fig. 4b, trace iv). The metabolite
analysis of N. moseri CBS 164.80 revealed the presence of 12 upon
feeding 4-HBA (Supplementary Fig. 4), and the RT-PCR analysis con-
firmed the expression of all mos genes except for mosA (Supplemen-
tary Fig. 5).
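
As a quick consistency check on the molecular formula C21H32O3 reported above for 12 and 13, the degree of unsaturation (double-bond equivalents, DBE) can be computed from the carbon and hydrogen counts; oxygen does not enter the formula. The reading of the result given after the equation is an illustrative interpretation, not an analysis taken from the source.

```latex
\[
\mathrm{DBE} \;=\; \frac{2C + 2 - H}{2}
\;=\; \frac{2(21) + 2 - 32}{2}
\;=\; 6
\]
```

A value of six is consistent with one benzene-type ring (four equivalents) plus two further rings, as would be expected for a sesquiterpene hydroquinone carrying a bicyclic terpenoid moiety.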
We then evaluated the antibacterial activity of compounds 12 and
13 by a disk diffusion assay using a 5 mm paper disk loaded with 25 µg
of each compound. As a result, only 12 exhibited weak activity against
Bacillus cereus, Staphylococcus epidermidis, Staphylococcus faecalis,
and Staphylococcus aureus (inhibition zones of 11, 11, 9, and 11 mm,
respectively; values are the averages from three independent
experiments). Meanwhile, ampicillin, which was used as a positive
control, formed inhibition zones of 23, 29, 34, and 47 mm against the
four bacterial species, respectively, under the same experimental
conditions.
Discussion
In this study, we developed a fungal genome mining tool, which was
based on a manually curated fungal BGC database and custom-made
HMM profiles. We demonstrated that our genome mining tool could
effectively identify BGCs for previously undiscovered natural pro-
ducts. The onoceroid BGCs characterized in this study could also be
identified using antiSMASH; however, the BGCs extracted by anti-
SMASH contained substantially more genes outside the boundary of a
BGC than those extracted in this study (Supplementary Fig. 6). In
addition, antiSMASH could not detect the presence of pyr4-like genes
in these BGCs. Alternatively, these BGCs could be discovered by
standard BLAST search using Pyr4 as a query and subsequent manual
investigation of the flanking regions of each identified pyr4-like gene,
which, however, requires tedious procedures and might take weeks to
months for completion. Therefore, the onoceroid BGCs would not
have been effectively discovered using antiSMASH or conventional
genome mining methodologies. On the other hand, our genome
mining tool has allowed the automated extraction and visualization of
all possible BGCs encoding a Pyr4 homologue from approximately
2000 fungal genomes within a single day, which also facilitated the
selection of BGCs with unusual features. Although we focused only on
BGCs with a pyr4-like terpene cyclase gene in this study, our strategy
can be readily applied to extract BGCs that encode a homologue of
user-selected proteins. For example, a similar global genome mining
analysis can be conducted to extract BGCs encoding a homologue of
PydY, which is a domainless enzyme and serves as the pericyclase in
the biosynthesis of pyrrocidines (Supplementary Fig. 7)48. PydY and its
homologues, namely ScpY48, PN3-2049, G7350, and MGG_1509651, have
been found in the biosynthetic pathways of a few fungal
polyketide–nonribosomal peptide hybrid molecules; however, their
prevalence in fungal natural product pathways has not been investi-
gated. The global genome analysis resulted in the detection of more
than 300 BGCs with a pydY homologue (Supplementary Fig. 7 and
Supplementary Data 5), some of which displayed interesting features,
including those with an SHC/OSC gene and others with both UbiA-like
prenyltransferase and polyprenyl pyrophosphate synthase genes.
Intriguingly, one of the identified BGCs encodes a Pyr4 homologue and
is of chytrid origin, which was mentioned earlier (Fig. 2b). The PydY
homologues encoded by these BGCs might possess a catalytic function
other than as a pericyclase. Collectively, our genome mining strategy
has general applicability and will facilitate and accelerate the discovery
of unexploited natural products, particularly those synthesized with
the involvement of a domainless enzyme.
In our global fungal genome mining study, we discovered three
types of onoceroids, which are triterpenoids synthesized from squa-
lene or oxidosqualene through cyclization at both ends of the prenyl
chain. Onoceroids have been isolated from bacteria52, ferns53, higher
plants54,55, and animals56. However, to the best of our knowledge,
onoceroids have never been isolated from fungi. Thus, the present
study provides landmark examples of fungal onoceroids and their
biosynthetic pathways. In terms of the biosynthesis of onoceroids,
BmeTC from the bacterium Bacillus megaterium was the first onoceroid
synthase to be characterized (in 2013), and this enzyme solely transforms
squalene into onoceroids52. In the fern Lycopodium clavatum, a
pair of homologous enzymes (LCC and LCD or LCE) convert oxidosqualene
into onoceroid species57,58. These known enzymes for onoceroid
biosynthesis are all homologous to SHCs and OSCs. In contrast,
the biosynthesis of fungal onoceroids requires two families of terpene
cyclases (i.e., SHC/OSC-like enzyme and Pyr4-family terpene cyclase),
introducing an unprecedented biosynthetic mechanism of onocer-
oids. This study demonstrated that Pyr4-family terpene cyclases are
also involved in the biosynthesis of pure (not mero-) terpenoids. The
[Figure 4. Panel a: gene map of the mos cluster from Neoarthrinium moseri CBS 164.80 (ca. 10.5 kb; mosA to mosF), with genes annotated as UbiA-like prenyltransferase, Pyr4-family terpene cyclase, FAD-dependent monooxygenase/oxidoreductase, acetyltransferase, and hypothetical protein. Panel b: HPLC traces (x-axis: Time (min)) for NSARU1 and for the mosA+B+C+D+E+F and mosB+C+D transformants, the latter two also with 4-HBA supplementation, with peaks 12 and 13 labeled. Panel c: structures of moserinol (12) and ent-yahazunol (13).]
Fig. 4 | Characterization of the BGC for 4-hydroxybenzoate-derived meroterpenoids.
a Schematic representations of the mos cluster. b High-performance
liquid chromatography profiles of the metabolites obtained from Aspergillus oryzae
transformants. The chromatograms were monitored at 300 nm. c Structures of
meroterpenoids obtained in this study.
majority of fungal triterpenoids are synthesized through the cycliza-
tion of oxidosqualene59–61, and hexaprenyl pyrophosphate is known to
serve as the precursor of a few fungal triterpenoids62. However, the
fungal onoceroid pathways identified in this study represent rare
examples in which squalene is directly cyclized to produce fungal
triterpenoids.
The biosynthetic pathways of fungal onoceroids discovered in
this study can be proposed as follows (Fig. 5). The biosynthesis of
homomonoceroid A (4) begins with squalene being cyclized into α-
polypodatetraene (1) by the squalene cyclase HomoS through the
carbocationic intermediate 14. Subsequently, the FMO HomoM
epoxidizes the terminal olefin of 1 to yield epoxide 15, which was not
isolated in this study. The Pyr4-family terpene cyclase HomoB then
protonates epoxide 15 to initiate the second round of cyclization, and
the resulting tetracyclic carbocationic species 16 is neutralized by a
water attack, yielding 4. In addition, the bicyclic intermediate 15 is
involved in the biosynthesis of fumionoceroid C (7), where FumiB
accepts 15 to produce a differently cyclized product fumionoceroid A
(5). FumiB first cyclizes 15 into the carbocationic intermediate 17.
Instead of being quenched by water, the reaction concludes with a
1,2-hydride shift, 1,2-methyl shift, and deprotonation from C-19.
Subsequently, the P450 FumiP hydroxylates 5 at C-24 to complete the
biosynthesis. Oxidosqualene can also be accepted by Fumi enzymes,
except for FumiP, to produce the C-3 hydroxy analogue of 5,
fumionoceroid B (6), through α-polypodatetraen-3β-ol (2) (Supplementary
Fig. 8). Meanwhile, the biosynthetic pathway leading to alliaonoceroid
D (11) branches from the other two pathways in the first step. AlliS
transforms squalene into 8α-hydroxypolypoda-13,17,21-triene (3)
instead of 1. After the epoxidation of 3 by AlliM to yield 18, which was
not obtained in our work, AlliB cyclizes 18 into alliaonoceroid A (8).
The cyclization mode followed by AlliB is similar to that employed by
HomoB, but the AlliB-catalyzed reaction uses the C-8 hydroxy group
instead of a water molecule to produce the pentacyclic onoceroid 8.
Compound 8 then undergoes acetylation at the C-21 hydroxy group;
this reaction is catalyzed by the acetyltransferase AlliA, yielding
alliaonoceroid C (10). The P450 AlliP then installs a hydroxy group at
C-16 to produce 19, which is again acetylated by AlliA to yield the end
product 11. In the absence of AlliA, AlliP might cause the hydroxylation
of 3 at C-6 to yield a shunt pathway product 19, which is then oxidized
by an endogenous enzyme of A. oryzae to give alliaonoceroid B (9)
(Supplementary Fig. 8). The three characterized fungal onoceroid
pathways suggest that fungi synthesize diverse onoceroids owing to
several factors that drive structural diversification, such as (i) the
cyclization mode of (oxido)squalene, (ii) the reaction mechanisms of
Pyr4-family terpene cyclases, and (iii) tailoring reactions. The
characterization of other onoceroid BGCs not examined in this study or the
construction of artificial pathways with onoceroid biosynthetic genes
from different pathways can further expand the molecular diversity of
fungal onoceroids.
Finally, we obtained two 4-HBA-derived meroterpenoids by the
heterologous expression of the mos cluster, which was not detectable
by antiSMASH. The major product from the mos cluster, moserinol
(12), was determined to be a previously unreported molecule, further
demonstrating the utility of our genome mining tool in accessing
unexploited compounds. Unfortunately, 12 lacks structural novelty,
as its diastereomers, yahazunol and ent-yahazunol (13), have already
[Figure 5. Reaction scheme leading from squalene to α-polypodatetraene (1), 8α-hydroxypolypoda-13,17,21-triene (3), homomonoceroid A (4), fumionoceroids A (5) and C (7), and alliaonoceroids A (8), C (10), and D (11) via intermediates 14 to 19, with steps labeled HomoS/FumiS1/AlliS, HomoM/FumiM, HomoB, FumiB, FumiP, AlliM, AlliB, AlliA, and AlliP.]
Fig. 5 | Proposed biosynthetic pathways of the onoceroids obtained in this study. Predicted biosynthetic/reaction intermediates are shown in brackets.
been isolated from nature46,63. However, these two compounds have
been obtained from a brown alga and a sponge, respectively, and, to
the best of our knowledge, these sesquiterpene hydroquinones, and
their analogues, have never been reported from fungi. In addition,
although 4-HBA is a precursor for diverse meroterpenoids both from
primary and secondary metabolism, such as ubiquinones64,
xiamenmycin65, and biscognienyne B45, our study provides an intriguing
example in which a Pyr4-family terpene cyclase is involved in the
biosynthesis of 4-HBA-derived meroterpenoids. Although the com-
plete biosynthetic pathway of 12 and the functions of each Mos protein
await elucidation in future studies, the biosynthetic route to 12
could be proposed as follows (Supplementary Fig. 9). Initially, the
UbiA-like prenyltransferase MosC farnesylates 4-HBA, which is fol-
lowed by the oxidative decarboxylation by the FMO MosD. Subse-
quently, the terpene cyclase MosB cyclizes the farnesyl moiety, in
which the cyclization is completed by the attack of a water molecule
from both sides, resulting in the formation of a pair of epimers, 12 and
13. The proposed pathway is consistent with the fact that mosB, mosC,
and mosD are sufficient to yield the two meroterpenoids (Fig. 4b, trace
iv). It should also be mentioned that our recent study based on the
genome mining analysis reported herein unraveled another unprece-
dented biosynthetic mechanism for fungal meroterpenoids, in which
Pyr4 homologues cyclize the prenyl moiety installed by a
dimethylallyltryptophan synthase (DMATS)-type prenyltransferase66. Overall,
our global genome mining has so far identified three previously
undescribed forms of biosynthetic mechanisms with Pyr4-family ter-
pene cyclases.
In conclusion, we successfully isolated several previously unre-
ported natural products using our genome mining tool. We believe
that this tool can be widely applied for the discovery of natural pro-
ducts with unprecedented scaffolds. Currently, our genome mining
tool mainly targets known–unknown BGCs67, which produce unknown
natural products synthesized by the known classes of core enzymes.
Recent studies have highlighted the genome mining-driven discovery
of unknown–unknown natural products68. We are now enhancing the
genome mining platform by incorporating additional functions to
readily extract BGCs encoding self-resistance enzymes26 or BGCs that
lack a known core protein. This would facilitate the discovery of
unexploited bioactive or unknown–unknown natural products
from fungi.
Methods
General experimental procedures
Organic solvents were purchased from Anaqua (Hong Kong) Co. Ltd.,
and other chemicals were purchased from Wako Chemicals Ltd.,
Thermo Fisher Scientific, Sigma–Aldrich, or J&K Scientific Ltd., unless
noted otherwise. Oligonucleotide primers (Supplementary Data 9)
were purchased from Beijing Genomics Institute. PCR was performed
using a T100 Thermal Cycler (Bio-Rad Laboratories, Inc.) with Phanta
Max Super-Fidelity DNA Polymerase (Vazyme Biotech Co., Ltd).
GC–MS analyses were performed with an Agilent 7890A/5975C system
using an HP-5ms capillary column (0.25 mm i.d. × 30 m, 0.25 μm
film thickness). Preparative HPLC was performed on a Waters 1525
Binary HPLC pump with a 2998 photodiode array detector (Waters
Corporation), using a COSMOSIL 5C18-AR-II column (10 mm i.d. × 250 mm,
Nacalai Tesque, Inc). Flash chromatography was performed using an
Isolera Spektra One flash purification system (Biotage). NMR spectra
were obtained at 600 MHz (1H)/150 MHz (13C) with a Bruker Ascend
Avance III HD spectrometer at 25 °C, and chemical shifts were recorded
with reference to solvent signals (1H NMR: CDCl3 7.26 ppm, C6D6 7.16 ppm,
acetone-d6 2.05 ppm; 13C NMR: CDCl3 77.0 ppm, C6D6 128.06 ppm,
acetone-d6 29.8 ppm). HR-APCI-MS and HR-ESI-MS spectra were obtained
with a SCIEX X500R Q-TOF mass spectrometer.
Samples for LC-MS analysis were injected into a SCIEX ExionLC AD
System with a SCIEX X500R Q-TOF mass spectrometer, using an
Accucore C18 column (4.6 mm i.d. × 100 mm; Thermo Scientific) for
triterpenoids and a Luna Omega 1.6 μm C18 100 Å column (2.1 mm i.d. ×
100 mm; Phenomenex) for sesquiterpene hydroquinones. Optical
rotations were measured with a P-2000 Digital Polarimeter (JASCO
Corporation). X-ray diffraction data were collected on a Bruker D8
Venture Photon II diffractometer. UV spectra were measured using a
LAMBDA 1050+ UV/Vis/NIR Spectrophotometer (PerkinElmer). IR data
were recorded on a Spectrum 100 FT-IR Spectrometer (PerkinElmer).
Microbial strains
Aspergillus alliaceus CBS 536.65, Aspergillus fumigatus CBS 144.89,
Aspergillus homomorphus CBS 101889, and Neoarthrinium moseri CBS
164.80 were purchased from the Westerdijk Fungal Biodiversity Institute.
Aspergillus oryzae NSAR1 (niaD–, sC–, ΔargB, adeA–)34 and NSARU1
(niaD–, sC–, ΔargB, adeA–, pyrG–)44 were utilized as the fungal
heterologous expression hosts. Standard DNA engineering was performed
with Escherichia coli DH5α (Takara Bio Inc).
Collection and curation of fungal biosynthetic gene clusters
When genomic sequence data corresponding to a fungal BGC were
publicly available, we obtained these data from the NCBI or the Joint
Genome Institute (JGI) databases. For sequence data lacking gene
annotations, gene prediction was performed using AUGUSTUS69
(version 3.5.0). All of the gene products were compared with their
homologous proteins available in the NCBI database, and their
sequences were manually revised based on this comparison when
necessary. All of the BGC sequences were stored in a GenBank format,
with each biosynthetic protein represented as a coding DNA sequence
(CDS) feature and assigned a unique identification number that starts
with FBGC (e.g., FBGC00001). Each CDS feature was labeled with its
gene name and given a product qualifier consisting of the protein
classification and protein name (e.g., terpene cyclase Pyr4).
In cases where the complete BGC sequence was unavailable but
individual gene or protein sequence data were available, all of the
genes/proteins for a BGC were collected in a directory with a unique
identification number starting with FPROT (e.g., FPROT00001).
Information on all of the collected BGCs is summarized in Supple-
mentary Data 1, which comprises their representative metabolite, source
organism, and primary reference. The FBGC GenBank files and FPROT
directories were deposited at Zenodo under the DOI 10.5281/
zenodo.8126803. The database was named FunBGCs (Collection of
Manually Curated Fungal Biosynthetic Gene Clusters) and is available
online at http://staffweb1.cityu.edu.hk/ymatsuda/funbgcs/funbgcs.html.
Development of a fungal genome mining tool
Creation of a custom Pfam database for fungal biosynthetic protein
detection. All of the protein sequences (5,070 proteins) were initially
extracted from the collected fungal BGCs and analyzed using the
hmmscan tool in HMMER software17 (version 3.3.2). The analysis was
conducted against the Pfam-A HMM library (downloaded from
https://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.hmm.gz)
using an E-value and conditional E-value cutoff of 1e-5. We detected
520 Pfam domains, which were used for subsequent bioinformatic
analyses.
No Pfam domain was detected in 572 proteins, and these protein
sequences were employed to create a BLAST database using the
makeblastdb function in NCBI BLAST+70 (version 2.11.0). The 572
proteins were classified into groups based on the findings of BLASTp
analysis (E-value cutoff: 0.01) against the database created
above. If a protein aligned with another protein in the BLASTp
analysis, then these two proteins were placed in the same group. If a
protein group had more than one protein, then sequence alignment
was performed using Clustal Omega71 (version 1.2.3). These sequence
alignments were then converted into an HMM file using the hmmbuild
tool in HMMER. Each manually created HMM file was named after a
representative member of the respective protein group (e.g.,
Pyr4.hmm). This process yielded 59 additional HMM profiles.
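The grouping rule described above (two proteins share a group whenever one aligns with the other) can be sketched as single-linkage grouping with a union-find structure. This is only an illustrative sketch: the function name, the protein IDs, and the assumption that grouping is applied transitively are not taken from the original tool, and the hit pairs are assumed to have already passed the BLASTp E-value cutoff.

```python
from collections import defaultdict


def group_by_blast_hits(protein_ids, hit_pairs):
    """Single-linkage grouping: proteins joined by any BLASTp hit share a group."""
    parent = {pid: pid for pid in protein_ids}

    def find(x):
        # Walk to the root, compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject in hit_pairs:
        if query == subject:
            continue  # self-hits carry no grouping information
        root_q, root_s = find(query), find(subject)
        if root_q != root_s:
            parent[root_s] = root_q

    groups = defaultdict(list)
    for pid in protein_ids:
        groups[find(pid)].append(pid)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical protein IDs and hit pairs, for illustration only.
ids = ["protA", "protB", "protC", "protD", "protE"]
hits = [("protA", "protB"), ("protB", "protC"), ("protD", "protD")]
groups = group_by_blast_hits(ids, hits)
# Only groups with more than one member would go on to alignment and hmmbuild.
multi_member = [g for g in groups if len(g) > 1]
```

In this sketch, singleton groups are kept in the output so that the caller can decide how to handle proteins without any homologue in the set.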
For the enhanced detection of PKS/NRPS domains, some HMM
profiles were adopted from the SMART23 and TIGRFAMs24 databases.
Furthermore, several HMM files for PKS/NRPS domains were manually
created using the PKSs and NRPSs available in the FunBGCs database.
Several Pfam HMM files were replaced with these additional HMM files.
All of the HMM profiles used in this study are summarized in Supple-
mentary Data 2 with their sources and accession numbers. The
manually created HMM files were deposited at Zenodo under the DOI
10.5281/zenodo.8126803, and their descriptions are provided in Sup-
plementary Data 3.
Extraction of BGCs of interest from given fungal genome
sequence data. To extract BGCs encoding the homologues of proteins
of interest from a given fungal genome (which needed to be provided
as a GenBank file containing CDS features with a translation qualifier),
the following steps were implemented. First, all protein sequences
from the genome were extracted and then analyzed using hmmscan
against the HMM profile created from the proteins of interest; an
E-value and conditional E-value cutoff of 1e-5 were applied.
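As a rough illustration of the cutoff step, the sketch below filters hmmscan results that were written in HMMER's per-domain tabular format (`--domtblout`). It assumes that "E-value" refers to the full-sequence E-value column and "conditional E-value" to the c-Evalue column of that table, and that hits at exactly the cutoff are kept; the function name and the sample rows are hypothetical and are not part of the published tool.

```python
def filter_domtblout(lines, evalue_cutoff=1e-5, c_evalue_cutoff=1e-5):
    """Keep domain hits passing both the full-sequence and conditional E-value cutoffs."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        # 22 whitespace-separated columns, then a free-text description.
        fields = line.split(maxsplit=22)
        profile, protein = fields[0], fields[3]  # hmmscan: target = profile, query = protein
        full_evalue = float(fields[6])   # full-sequence E-value
        cond_evalue = float(fields[11])  # conditional E-value (c-Evalue)
        if full_evalue <= evalue_cutoff and cond_evalue <= c_evalue_cutoff:
            hits.append((protein, profile, full_evalue, cond_evalue))
    return hits


# Hypothetical rows, for illustration only.
sample = [
    "# target name accession tlen query name ...",
    "Pyr4 - 242 protA - 250 1.2e-40 140.1 0.3 1 1 3.4e-44 1.5e-40 139.8 0.3 2 240 5 245 4 247 0.95 terpene cyclase",
    "PydY - 300 protB - 410 0.003 12.0 0.1 1 1 0.0001 0.004 11.5 0.1 10 60 20 70 15 80 0.70 hypothetical",
]
passing = filter_domtblout(sample)
```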
Subsequently, products from 20 genes (if available) in the
flanking regions of each gene identified in the previous step were
extracted. The protein sequences were analyzed using hmmscan
against the HMM library created for the detection of core enzymes
(refer to Supplementary Table 1 for detection criteria). We then
conducted a search for possible precursor peptides involved in the
biosynthesis of ribosomally synthesized and posttranslationally
modified peptides (RiPPs)72. For this purpose, a protein was con-
sidered a candidate for a RiPP precursor peptide (RiPP PP) if its
sequence contained more than two partial sequences that were
≥90% identical, with each containing ≥10 amino acids, with ≥3
different amino acids, and either KR, KK, RK, or RR. The extracted
protein sequences were analyzed using hmmscan against the HMM
library containing all HMM profiles for fungal biosynthetic pro-
teins; an E-value and conditional E-value cutoff of 1e-5 were
applied. In addition, a BLASTp search was performed using
DIAMOND25 (version 2.1.7) to compare the extracted proteins
against the protein database containing all proteins from the
FunBGCs database. Furthermore, we looked for the presence of a
close homologue of each extracted protein with ≥50% identity in
the given fungal genome; some BGCs are known to encode a self-
resistant enzyme that is a close homologue of a housekeeping
protein but is a resistant version that is not inhibited by the
metabolite synthesized by the BGC26. Proteins identified in at least
one of the aforementioned analyses were classified as biosynthetic
proteins. Small proteins with <200 amino acid residues were
removed from the subsequent analysis unless they were classified
as biosynthetic proteins.
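The RiPP precursor peptide criterion above leaves some implementation details open, so the following is only one possible reading of it, not the tool's actual code. It assumes fixed-length windows of 10 residues, position-wise identity between windows, non-overlapping copies, and "more than two" interpreted as at least three copies; all function and parameter names are hypothetical.

```python
DIBASIC_MOTIFS = ("KR", "KK", "RK", "RR")


def is_repeat_unit(segment):
    """A usable repeat has >= 3 distinct residues and a dibasic motif."""
    return len(set(segment)) >= 3 and any(m in segment for m in DIBASIC_MOTIFS)


def identity_ok(a, b, min_identity_pct):
    """Position-wise identity of two equal-length segments (integer arithmetic)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches * 100 >= min_identity_pct * len(a)


def is_ripp_pp_candidate(seq, window=10, min_identity_pct=90, min_repeats=3):
    """Return True if seq holds >= min_repeats non-overlapping similar windows."""
    starts = [i for i in range(len(seq) - window + 1)
              if is_repeat_unit(seq[i:i + window])]
    for seed in starts:
        reference = seq[seed:seed + window]
        count, next_free = 0, 0
        for i in starts:
            if i < next_free:
                continue  # would overlap the previously counted copy
            if identity_ok(reference, seq[i:i + window], min_identity_pct):
                count += 1
                next_free = i + window
        if count >= min_repeats:
            return True
    return False


# Hypothetical sequences, for illustration only.
three_repeats = "MKT" + "AEDKRGWLPT" * 3 + "GG"
two_repeats = "AEDKRGWLPT" * 2 + "G" * 10
no_dibasic = "ACDEFGHILMNPQSTVWY" * 3
```

A real implementation would likely also allow repeats longer than 10 residues and gapped alignments, which this sketch does not attempt.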
Next, we predicted whether two adjacently located genes were
clustered, indicating that they were responsible for the biosynthesis of
the same metabolite. We performed a BLASTp analysis of the two
proteins with DIAMOND and used the protein database containing all
of the proteins from the FunBGCs database. BLAST hits with ≥25%
identity and ≥50% sequence coverage were used in this step. If the
origin (parental BGC) of a protein aligned with one of the query pro-
teins was identical to that of at least one of the proteins aligned with
the other query and the distance between the two genes encoding the
two proteins was less than a set length (default: 15,000 bp), then the
two proteins were considered clustered. However, as an exception, if
the two proteins both aligned with a protein from the same origin with
≥50% identity, then they were considered clustered regardless of the
distance between the genes encoding the two proteins. If two adja-
cently located genes were not considered clustered, then the same
analysis was performed on the next gene until no correlation was
observed in three consecutive genes. If two genes were considered
clustered, all of the genes between the two genes were also considered
clustered.
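The pairwise decision rule in this paragraph can be sketched as a small predicate. The sketch assumes that each protein's hits have already been filtered at ≥25% identity and ≥50% coverage and are supplied as (parental BGC ID, percent identity) tuples; the function name, argument names, and the example BGC IDs are hypothetical, and the look-ahead over the next three genes is not modelled here.

```python
def are_clustered(hits_a, hits_b, distance_bp,
                  max_distance_bp=15000, strong_identity=50.0):
    """Decide whether two genes are clustered from their FunBGCs-style hits.

    hits_a, hits_b: iterables of (parental_bgc_id, percent_identity).
    """
    origins_a = {origin for origin, _ in hits_a}
    origins_b = {origin for origin, _ in hits_b}
    strong_a = {origin for origin, ident in hits_a if ident >= strong_identity}
    strong_b = {origin for origin, ident in hits_b if ident >= strong_identity}

    # Exception: both proteins match the same known BGC at >= 50% identity,
    # so the distance between the genes is ignored.
    if strong_a & strong_b:
        return True
    # General rule: a shared parental BGC and genes closer than the set length.
    return bool(origins_a & origins_b) and distance_bp < max_distance_bp


# Hypothetical hits, for illustration only.
hits_a = [("FBGC00001", 62.0), ("FBGC00007", 30.0)]
hits_b = [("FBGC00001", 55.0)]
hits_c = [("FBGC00007", 28.0)]
```

For example, `are_clustered(hits_a, hits_b, 40000)` is true through the exception, whereas `are_clustered(hits_a, hits_c, 20000)` is false because the shared origin is only weakly matched and the genes are too far apart.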
To extract BGCs from each scaffold, genes were individually
examined from one end to identify those encoding a biosynthetic
protein (as defined above). When a gene encoding a biosynthetic
protein was identified, it was considered a starting point of a possible
BGC. Subsequent genes were included in the BGC as long as they met
the following criteria: a gene of interest (i) was considered clustered
with the previous gene or (ii) encoded a biosynthetic protein and was
closely located with the previous gene (default: within 2500 bp). If
such a candidate BGC encoded a core enzyme or both a RiPP precursor
peptide candidate and a UstY-like protein, then the BGC was extracted
as a GenBank file.
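The scan along a scaffold described above might look roughly like the following sketch. It is a simplification under stated assumptions: the `Gene` record and its field names are hypothetical, the "clustered with the previous gene" decision is assumed to be precomputed, and the final filter checks only for a core enzyme (the alternative RiPP precursor peptide plus UstY-like protein criterion is omitted).

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int
    end: int
    biosynthetic: bool = False         # classified as a biosynthetic protein
    core: bool = False                 # encodes a core enzyme
    clustered_with_prev: bool = False  # precomputed pairwise clustering result


def extract_candidate_bgcs(genes, max_gap_bp=2500):
    """Walk genes in scaffold order and collect candidate BGCs with a core enzyme."""
    bgcs, current = [], []
    for gene in genes:
        if current:
            gap = gene.start - current[-1].end
            if gene.clustered_with_prev or (gene.biosynthetic and gap <= max_gap_bp):
                current.append(gene)
                continue
            bgcs.append(current)  # the open candidate ends here
            current = []
        if gene.biosynthetic:
            current = [gene]      # a biosynthetic gene starts a new candidate
    if current:
        bgcs.append(current)
    return [bgc for bgc in bgcs if any(g.core for g in bgc)]


# Hypothetical scaffold, for illustration only.
scaffold = [
    Gene("g1", 0, 1000),
    Gene("g2", 1500, 3000, biosynthetic=True, core=True),
    Gene("g3", 3500, 5000, biosynthetic=True),
    Gene("g4", 9000, 10000, clustered_with_prev=True),
    Gene("g5", 20000, 21000),
    Gene("g6", 30000, 31000, biosynthetic=True),
]
found = [[g.name for g in bgc] for bgc in extract_candidate_bgcs(scaffold)]
```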
Extraction of all BGCs from given fungal genome sequence data.
The aforementioned methods can be used to extract all possible BGCs
from a given fungal genome with slight modifications, although these
modified methods were not used in the current study. After the
extraction of all of the proteins, we searched for core enzymes and
RiPP precursor peptides instead of identifying the homologues of
target proteins. Other BGC extraction procedures were the same as
described above.
Calculation of the similarity score. To examine whether a given BGC
was similar to a known BGC, the similarity score was calculated using
the following Eq. (1):
$$
\text{similarity score} = \frac{\text{average identity\%} \times \text{number of homologous genes in a given BGC}}{\text{number of total genes in a known BGC}} \qquad (1)
$$
where the average identity% is the average percentage value of the
identities of each pair of corresponding proteins in the two BGCs. A
pair of proteins with ≥45% identity and ≥50% sequence coverage was
considered homologous in this calculation.
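A minimal sketch of the similarity score calculation follows, assuming that each gene of the given BGC has already been paired with its best match in the known BGC and that the pairs are supplied as (percent identity, percent coverage) tuples. The function and argument names are hypothetical.

```python
def similarity_score(pairs, n_known_genes, min_identity=45.0, min_coverage=50.0):
    """Average identity% of homologous pairs, scaled by the fraction of known genes matched."""
    homologous = [ident for ident, cov in pairs
                  if ident >= min_identity and cov >= min_coverage]
    if not homologous or n_known_genes <= 0:
        return 0.0
    average_identity = sum(homologous) / len(homologous)
    return average_identity * len(homologous) / n_known_genes


# Hypothetical protein pairs: two pass both thresholds, two do not.
pairs = [(80.0, 90.0), (60.0, 70.0), (40.0, 95.0), (90.0, 30.0)]
score = similarity_score(pairs, n_known_genes=4)  # (70 x 2) / 4
```

Because the average identity multiplied by the number of homologous genes equals the sum of their identities, the score reaches 100 only when every gene of the known BGC has an identical counterpart in the given BGC.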
The genome mining tool developed herein was designated as
FunBGCeX (Fungal Biosynthetic Gene Cluster eXtractor). FunBGCeX
has been deposited at Zenodo under the DOI 10.5281/zenodo.8126797
and is also available at https://github.com/ydmatsd/funbgcex.
Extraction of biosynthetic gene clusters encoding a Pyr4 or
PydY homologue
To extract BGCs encoding a Pyr4 homologue from available fungal
genomes, we downloaded 1990 annotated fungal reference genomes
from the NCBI database. BGCs were extracted using the HMM profile
for Pyr4 homologues (Pyr4.hmm) with default parameters. This pro-
cess yielded 1050 BGCs (including those with a single gene). Infor-
mation on all the extracted BGCs is provided in Supplementary Data 5.
The extraction of BGCs with a pydY-like gene was performed in a
similar manner using the HMM profile for PydY homologues (PydY.hmm).
Phylogenetic analysis of Pyr4 homologues
The sequences of known Pyr4-family terpene cyclases and their
homologues identified in this study were first aligned using MUSCLE73
(version 5.1), and the conserved sequences were extracted with
Gblocks74 (version 0.91b). A maximum likelihood (ML) phylogenetic
tree was generated with FastTree75 (version 2.1.11), and the resultant
phylogeny was used as a starting tree to infer an ML tree using RAxML76
(version 8.0.0) under the LG + G + F model, as identified by ProtTest77
(version 3.4.2). The Pyr4 homologues of bacterial origin were used as
outgroups. The phylogenetic tree was visualized with Geneious Prime
2023.2.1 (https://www.geneious.com).
Analysis of fungal genomes using antiSMASH
The 1990 fungal genomes mentioned above were analyzed by the
standalone version of antiSMASH (version 6.1.1) using the default
parameters. Subsequently, all protein sequences from all the detected
BGCs were extracted and analyzed using hmmscan against the HMM
profile of Pyr4 to extract BGCs with a pyr4 homologue; an E-value and
conditional E-value cutoff of 1e-5 were applied.
Construction of fungal transformation plasmids
Initially, each gene in the homo, fumi, alli, and mos clusters was
amplified from the genomic DNA of A. homomorphus CBS 101889, A.
fumigatus CBS 144.89, A. alliaceus CBS 536.65, and N. moseri CBS
164.80, respectively, with the primers shown in Supplementary Data 9 and
Supplementary Table 4, and then ligated to the pTAex3-HR vector78 or
pPyrG-HR vector44, using a ClonExpress Ultra One Step Cloning Kit (Vazyme
Biotech Co., Ltd). To construct fungal transformation plasmids with
multiple genes, DNA fragments with the amyB promoter (PamyB) and
the amyB terminator (TamyB) were amplified from the pTAex3-HR-based
plasmids, and further introduced into the already constructed
single gene-containing vector or another vector, pAdeA-HR79. Details
of the plasmid construction are provided in Supplementary Table 4.
Fungal transformation
Transformation of A. oryzae NSAR1 was performed by the
protoplast–polyethylene glycol method13 coupled with CRISPR–Cas9-guided
homologous recombination44,78,79. Initially, A. oryzae NSAR1 or
NSARU1 was cultivated in 10 mL of DPY medium [2% dextrin, 1%
hipolypepton (Nihon Pharmaceutical Co., Ltd.), 0.5% yeast extract,
0.5% KH2PO4, 0.05% MgSO4•7H2O] supplemented with 0.01% adenine,
0.2% uracil, and 0.5% uridine (uracil and uridine were only added for
the NSARU1 strain) for two days at 30 °C and 160 rpm. The resulting
preculture was then transferred to 100 mL of DPY medium with the
necessary supplements and incubated at 30 °C and 160 rpm for 24 h.
Mycelia were then collected and incubated in TF solution 1 [1% Yatalase
(Takara Bio Inc.) and 0.6 M (NH4)2SO4 in 50 mM maleic acid (pH 5.5)]
for 2 h at 30 °C and at 160 rpm. The resultant protoplast-containing
mixture was filtered, and the filtrate was centrifuged at 1,500 rpm for
10 min at room temperature. The protoplast pellet was washed with TF
solution 2 (10 mM Tris-HCl, pH 7.5, 1.2 M sorbitol, 50 mM CaCl2, 35 mM
NaCl) and resuspended in TF solution 2 to a concentration of
1–5 × 10^7 cells/mL. Meanwhile, Cas9 ribonucleoprotein (RNP) complexes
were assembled as follows: (i) dissolve lyophilized Alt-R CRISPR-Cas9
RNA [crRNA; Integrated DNA Technologies (IDT)]
[5′-AGTGTTGCAATCCAAGGATA-3′ (for the HS801 locus80),
5′-CAAGGACCACGTCCTTCAAC-3′ (for the sC locus81),
5′-TGTCGGAAGTTTAGTACCAA-3′ (for the HS401 locus80)] and
trans-activating crRNA (tracrRNA; IDT) in
the nuclease-free duplex buffer to a concentration of 100 μM; (ii) mix
each crRNA separately with equal volumes of tracrRNA and nuclease-
free duplex buffer, boil the resulting mixtures at 95 °C for 5 min, and
then cool at room temperature for 15 min to generate guide RNA
(gRNA); (iii) dilute the Alt-R S.p. Cas9 Nuclease V3 (IDT) solution with
1 × phosphate-buffered saline (PBS) to a final concentration of 1 μg/μL;
and (iv) combine 1.5 μL of each gRNA solution with 0.75 μL of Cas9
(1 μg/μL) and 11 μL of 1 × PBS (final volume: 13.25 μL), and incubate the
resulting mixture at room temperature for 5 min to generate dual RNP
complexes. Subsequently, 5 μg of each plasmid, 13.25 μL of the
corresponding dual RNP complexes, and 400 μL of protoplast suspension
were mixed and incubated at room temperature for 30 min. A total of
2.7 mL of TF solution 3 (10 mM Tris-HCl, pH 7.5, 60% PEG4000, 50 mM
CaCl2) was then added in three portions (500, 500, and 1700 μL,
respectively) to the resulting mixture. After 20 min incubation at room
temperature, the mixture was diluted with 10 mL of TF solution 2 and
then centrifuged at 1,500 rpm for 10 min at room temperature. The
protoplast-cell pellet was resuspended in 500 μL of TF solution 2 and
then mixed with 5 mL of M-sorbitol [0.2% NH4Cl, 0.1% (NH4)2SO4,
0.05% KCl, 0.05% NaCl, 0.1% KH2PO4, 0.05% MgSO4·7H2O, 0.002%
FeSO4·7H2O, 2% glucose, 1.2 M sorbitol, and supplemented with 0.1%
arginine, 0.15% methionine, 0.01% adenine, when necessary, pH 5.5]
top agar medium (1% agar), which was poured on the top of the pre-
made M-sorbitol bottom agar plate (2% agar). The plate was further
incubated for 3–7 days at 30 °C until transformant colonies appeared.
Finally, the resultant colonies were transferred to M agar plates with
the necessary supplements (and without sorbitol) and incubated for
2–3 days. The transformants created in this study and the plasmids
used for the transformation are provided in Supplementary Table 5.
The successful construction of a transformant was confirmed by
PCR-amplifying each gene in the transformant (Supplementary
Fig. 10). To confirm the expression of the biosynthetic genes, the A.
oryzae transformants with homoS+M+B, fumiS1+B+M+P, alliS+M+B+P+A,
or mosA+B+C+D+E+F were cultivated in DPY liquid
medium [2% dextrin, 1% hipolypepton (Nihon Pharmaceutical Co.,
Ltd.), 0.5% yeast extract, 0.5% KH2PO4, and 0.05% MgSO4•7H2O], and
their total RNA was extracted by using a GeneJET Plant RNA Purifica-
tion Kit (Thermo Scientific). Subsequently, complementary DNA
(cDNA) synthesis was performed by using a HiScript III 1st Strand cDNA
Synthesis Kit (+gDNA wiper) (Vazyme Biotech), and the expression of
each gene was confirmed by PCR using the synthesized cDNA as a
template (Supplementary Fig. 11).
Analysis of metabolites derived from A. oryzae transformants
To analyze the metabolites produced by each A. oryzae transformant,
the transformants were cultivated on a DPY agar plate (1.5% agar) for
seven days at 30 °C. When necessary, 0.01% 4-HBA was supplemented
to the DPY plate. After cultivation, a small piece of fungal mycelia and
agar was cut from the plate, soaked in ethyl acetate, and extracted
using an ultrasonic bath. The ethyl acetate layer was transferred to
another tube, and the solvent was removed using nitrogen gas flow.
The residue was dissolved in methanol and then subjected to a GC–MS
or an HPLC analysis. For the GC–MS analysis, the temperature of the
ionization chamber was 260 °C, with electron impact ionization at
70 eV. Helium was used as a carrier gas, and its average velocity was
29.374 cm/sec. The initial temperature of the program was 150 °C, and
it increased to 250 °C at a rate of 40 °C/min, then increased at a rate of
5 °C/min to 320 °C and held at 320 °C for 5 min. The HPLC analysis was
performed with a solvent system of 20 mM formic acid (solvent A) and
acetonitrile containing 20 mM formic acid (solvent B), at a flow rate of
0.4 mL/min and a column temperature of 40 °C. Separation was
performed using a linear gradient from 10:90 (solvent B/solvent A) to
100:0 over 10 min, 100:0 for the following 3 min, a linear gradient
from 100:0 to 10:90 over the following 2.0 min, and then 10:90 for
2.5 min of re-equilibration.
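For readers who want to reproduce the instrument programs, the HPLC gradient and the GC oven ramp described above can be written out as simple piecewise functions. This is a sketch under the assumption that every segment is linear and that the GC program has no initial hold (none is stated); the function names are hypothetical.

```python
def percent_b(t_min):
    """Percentage of solvent B at time t_min (minutes) in the HPLC program."""
    if t_min < 0 or t_min > 17.5:
        raise ValueError("time outside the 17.5 min program")
    if t_min <= 10:                      # linear 10% -> 100% B over 10 min
        return 10 + 90 * t_min / 10
    if t_min <= 13:                      # hold at 100% B for 3 min
        return 100.0
    if t_min <= 15:                      # linear 100% -> 10% B over 2 min
        return 100 - 90 * (t_min - 13) / 2
    return 10.0                          # hold at 10% B for 2.5 min


def gc_oven_runtime_min():
    """Total GC oven program time: two ramps followed by a final hold."""
    ramps = [(150, 250, 40), (250, 320, 5)]  # (start degC, end degC, degC/min)
    final_hold_min = 5
    return sum((end - start) / rate for start, end, rate in ramps) + final_hold_min
```

Under these assumptions, the HPLC program lasts 17.5 min in total and the GC oven program 21.5 min.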
Isolation of each metabolite from A. oryzae transformants
To isolate each metabolite, A. oryzae transformants were cultivated
either on DPY agar plates (the volume of medium in one plate was ca.
20 mL) for seven days at 30 °C or in DPY liquid medium at 30 °C/
160 rpm for five days.
When cultivated on the agar plates, the resultant fungal cultures,
including agar medium, were crushed into small pieces, soaked in
acetone, and extracted three times using an ultrasonic bath. After fil-
tration, acetone was removed in vacuo. The residue was then parti-
tioned between water and ethyl acetate, and the water layer was further
extracted with hexane. Both ethyl acetate and hexane extracts were
combined, fractionated by flash chromatography, and then purified by
open silica-gel column chromatography or preparative HPLC.
When cultivated in the liquid medium, medium and mycelia were
first separated by filtration. The mycelia were extracted with acetone
with sonication for one hour, concentrated, and reextracted with ethyl
acetate. The resultant crude extract was fractionated by flash chro-
matography, and further purified by preparative HPLC.
The purification methods for each compound are described in
detail below.
Purification conditions for α-polypodatetraene (1). The extract of the
A. oryzae strain with homoS cultivated on 50 DPY agar plates was
subjected to flash chromatography and eluted stepwise using a
hexane:ethyl acetate gradient (100:0 to 100:19). Fractions that contained 1
were then purified by open silica-gel column chromatography and
eluted with hexane to yield 20.5 mg of 1. Compound 1 (16.7 mg) was
also isolated from the A. oryzae strain with fumiS1 cultivated on 150
DPY agar plates, using a similar method.
Purification conditions for α-polypodatetraen-3β-ol (2). The extract
of the A. oryzae strain with fumiS1 cultivated on 150 DPY agar plates was
subjected to flash chromatography and eluted stepwise using a
hexane:ethyl acetate gradient (100:0 to 100:21). Fractions that contained 2
were then purified by preparative HPLC (100% acetonitrile, 3.0 mL/min)
to yield 25.7 mg of 2.
Purification conditions for 8α-hydroxypolypoda-13,17,21-triene (3).
The extract of the A. oryzae strain with alliS cultivated on 50 DPY agar
plates was subjected to flash chromatography and eluted stepwise
using a hexane:ethyl acetate gradient (100:0 to 100:24). Fractions that
contained 3 were then purified by preparative HPLC (100% acetonitrile,
3.0 mL/min) to yield 58.6 mg of 3.
Purification conditions for homomonoceroid A (4). The extract of
the A. oryzae strain with homoS, homoB, and homoM cultivated on 150
DPY agar plates was subjected to flash chromatography and eluted
stepwise using a hexane:ethyl acetate gradient (100:0 to 100:24).
Fractions that contained 4 were then purified by reverse-phase
preparative HPLC (90% aqueous acetonitrile, 3.0 mL/min) to yield 80.5 mg
of 4.
Purification conditions for fumionoceroids A (5) and B (6). The
extract of the A. oryzae strain with fumiS1, fumiB, and fumiM cultivated
on 50 DPY agar plates was subjected to flash chromatography and
eluted stepwise using a hexane:ethyl acetate gradient (100:0 to
100:31). Fractions that contained 5 were then purified by preparative
HPLC (100% acetonitrile, 3.0 mL/min) to yield 21.2 mg of 5. Fractions
that contained 6 were then purified by preparative HPLC (100%
acetonitrile, 3.0 mL/min) to yield 8.3 mg of 6.
Purification conditions for fumionoceroid C (7). The extract of the A. oryzae strain with fumiS1, fumiB, fumiM, and fumiP cultivated on 50 DPY agar plates was subjected to flash chromatography and eluted stepwise using a hexane:ethyl acetate gradient (100:0 to 100:32). Fractions that contained 7 were then purified by preparative HPLC (95% aqueous acetonitrile, 3.0 mL/min) to yield 19.2 mg of 7.
Purification conditions for alliaonoceroids A (8) and B (9). The extract of the A. oryzae strain with alliS, alliB, alliM, and alliP cultivated on 200 DPY agar plates was subjected to flash chromatography and eluted stepwise using a hexane:ethyl acetate gradient (100:0 to 100:17). Fractions that contained 8 were then purified by open silica-gel column chromatography and eluted with hexane:ethyl acetate (50:1) to yield 25.5 mg of 8. Fractions that contained 9 were then purified by preparative HPLC (85% aqueous acetonitrile, 3.0 mL/min) to yield 18.8 mg of 9.
Purification conditions for alliaonoceroids C (10) and D (11). The extract of the A. oryzae strain with alliS, alliB, alliM, alliP, and alliA cultivated on 200 DPY agar plates was subjected to flash chromatography and eluted stepwise using a hexane:ethyl acetate gradient (100:0 to 100:15). Fractions that contained 10 and 11 were then purified by open silica-gel column chromatography and eluted with hexane:ethyl acetate (120:1) to yield 21.5 mg of 10 and 15.3 mg of 11.
Purification conditions for moserinol (12) and ent-yahazunol (13). The extract of the A. oryzae strain with mosA, mosB, mosC, mosD, mosE, and mosF cultivated in 3 L of DPY liquid medium was subjected to flash chromatography and eluted stepwise using a dichloromethane:ethyl acetate gradient (100:0 to 0:100). Fractions that contained 12 were then purified by reverse-phase preparative HPLC (60% aqueous acetonitrile, 3.0 mL/min) to yield 30.2 mg of 12. Fractions that contained 13 were then purified by reverse-phase preparative HPLC (55% aqueous acetonitrile, 3.0 mL/min) to yield 12.8 mg of 13.
Synthesis of (S)-MTPA ester of compound 4
(R)-MTPA chloride (13.5 mg) was added to a solution of 4 (5.6 mg) in dry pyridine (1 mL) at room temperature. After 2 h, the reaction mixture was concentrated to dryness and purified by reverse-phase preparative HPLC (100% acetonitrile) to yield 7.1 mg of the (S)-MTPA ester of 4. 1H NMR (600 MHz, CDCl3): δ [ppm] 4.83 (1H, brd, J = 1.0 Hz, H-26), 4.71 (1H, dd, J = 11.7, 4.4 Hz, H-21), 4.68 (1H, brs, H-26), 2.39 (1H, ddd, J = 12.6, 4.0, 2.3 Hz, H-7β), 1.97 (1H, td, J = 12.7, 4.8 Hz, H-7α), 1.86 (1H, dt, J = 12.3, 3.0 Hz, H-15β), 1.81 (1H, dq, J = 12.7, 4.1 Hz, H-20α), 1.74 (1H, m, H-1β), 1.72 (1H, m, H-6α), 1.68 (1H, m, H-19β), 1.65 (1H, m, H-16α), 1.63 (1H, m, H-20β), 1.60 (1H, m, H-12), 1.56 (1H, m, H-2β), 1.54 (1H, m, H-9), 1.51 (1H, m, H-2α), 1.49 (2H, m, H-11), 1.41 (1H, m, H-15α), 1.40 (1H, m, H-3β), 1.32 (1H, m, H-16β), 1.31 (1H, m, H-6β), 1.23 (1H, td, J = 13.3, 3.2 Hz, H-19α), 1.17 (1H, td, J = 13.5, 4.0 Hz, H-3α), 1.12 (3H, s, H-27), 1.07 (1H, dd, J = 12.6, 2.6 Hz, H-5), 1.03 (1H, t, J = 3.5 Hz, H-13), 1.02 (1H, dd, J = 12.5, 2.0 Hz, H-17), 0.97 (1H, td, J = 12.9, 3.8 Hz, H-1α), 0.92 (1H, m, H-12), 0.91 (3H, s, H-29), 0.87 (3H, s, H-24), 0.80 (3H, s, H-23), 0.78 (3H, s, H-30), 0.76 (3H, s, H-28), 0.65 (3H, s, H-25); for the NMR spectrum, see Supplementary Fig. 25.
Synthesis of (R)-MTPA ester of compound 4
In the same manner as described for the (S)-MTPA ester of 4, 4 (5.3 mg) was treated with (S)-MTPA chloride (13.5 mg) to yield 7.0 mg of the (R)-MTPA ester of 4. 1H NMR (600 MHz, CDCl3): δ [ppm] 4.83 (1H, brd, J = 0.9 Hz, H-26), 4.74 (1H, dd, J = 11.8, 4.5 Hz, H-21), 4.68 (1H, brs, H-26), 2.39 (1H, ddd, J = 12.7, 4.0, 2.3 Hz, H-7β), 1.97 (1H, td, J = 12.7, 4.8 Hz, H-7α), 1.88 (1H, m, H-20α), 1.86 (1H, dt, J = 12.2, 3.0 Hz, H-15β), 1.77 (1H, dq, J = 12.4, 2.7 Hz, H-20β), 1.73 (1H, m, H-1β), 1.72 (1H, m, H-6α), 1.71 (1H, m, H-19β), 1.64 (1H, dq, J = 13.3, 2.2 Hz, H-16α), 1.60 (1H, m, H-12), 1.56 (1H, m, H-2β), 1.54 (1H, m, H-9), 1.49 (1H, m, H-2α), 1.49 (2H, m, H-11), 1.41 (1H, m, H-15α), 1.39 (1H, m, H-3β), 1.31 (1H, m, H-16β), 1.31 (1H, m, H-6β), 1.24 (1H, td, J = 13.3, 3.1 Hz, H-19α), 1.17 (1H, td, J = 13.4, 3.9 Hz, H-3α), 1.12 (3H, s, H-27), 1.07 (1H, dd, J = 12.6, 2.6 Hz, H-5), 1.03 (1H, t, J = 3.4 Hz, H-13), 1.02 (1H, dd, J = 12.2, 1.8 Hz, H-17), 0.96 (1H, td, J = 13.0, 3.9 Hz, H-1α), 0.93 (1H, m, H-12), 0.87 (3H, s, H-24), 0.83 (3H, s, H-29), 0.79 (3H, s, H-23), 0.79 (3H, s, H-28), 0.78 (3H, s, H-30), 0.65 (3H, s, H-25); for the NMR spectrum, see Supplementary Fig. 26.
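Ester pairs of this kind are usually compared proton by proton through the shift difference Δδ = δ(S-ester) − δ(R-ester), the convention of the modified Mosher method. The following is a minimal sketch of that bookkeeping, assuming this sign convention; the dictionary keys ("a"/"b" for α/β protons) and the function name are illustrative, and only a subset of the shifts listed above is transcribed. The sketch tabulates differences only and does not assign a configuration.

```python
# Selected 1H shifts (ppm) transcribed from the two MTPA ester data sets above.
# Keys use "a"/"b" for the alpha/beta protons (illustrative naming).
delta_S = {"H-21": 4.71, "H-20a": 1.81, "H-20b": 1.63, "H-19a": 1.23,
           "H-19b": 1.68, "H-29": 0.91, "H-30": 0.78, "H-28": 0.76}
delta_R = {"H-21": 4.74, "H-20a": 1.88, "H-20b": 1.77, "H-19a": 1.24,
           "H-19b": 1.71, "H-29": 0.83, "H-30": 0.78, "H-28": 0.79}


def delta_sr(shifts_s, shifts_r):
    """Return delta(S) - delta(R) in ppm for every proton present in both sets."""
    return {p: round(shifts_s[p] - shifts_r[p], 2) for p in shifts_s if p in shifts_r}


diffs = delta_sr(delta_S, delta_R)
for proton, d in diffs.items():
    # Protons with opposite signs lie on opposite sides of the MTPA plane.
    print(f"{proton:6s} {d:+.2f}")
```

In this subset, the H-29 methyl shows a positive difference, whereas the H-19 and H-20 ring protons show negative differences.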
X-ray crystallographic analysis
Single crystals of compounds 5, 7, and 10 were grown in CH3OH/CH2Cl2 (1:2, v/v), whereas those of 9 and 11 were grown in CH3CN/CH2Cl2 (1:2, v/v), by a slow evaporation process at room temperature. Single crystal X-ray diffraction measurements were performed on a Bruker D8 Venture diffractometer using Cu Kα radiation at 213 K (for 5 and 7), 203 K (for 9 and 11), or 233 K (for 10). The data collection was performed with the APEX3 program, and cell refinement and data reduction were carried out using the SAINT program. The structures of 5, 7, 9, 10, and 11 were solved by the direct method with the SHELXT program and refined using the SHELXL program. All non-hydrogen atoms were refined anisotropically, whereas hydrogen atoms were placed by geometrical calculations. The absolute configurations of 5, 7, 9, 10, and 11 were determined by the Flack parameters.
Analysis of the metabolites produced by A. homomorphus, A. fumigatus, A. alliaceus, and N. moseri and the expression of biosynthetic genes
A. homomorphus CBS 101889, A. fumigatus CBS 144.89, and A. alliaceus CBS 536.65 were cultivated on five different agar plates, namely DPY agar, PDA (0.4% potato extract, 2% dextrose, 1.5% agar), MEA (1.7% malt extract, 0.3% mycological peptone, 2% agar), YES agar (2% yeast extract, 15% sucrose, 0.05% MgSO4·7H2O, 0.001% ZnSO4·7H2O, 0.0005% CuSO4·5H2O, 2% agar, pH 6.4–6.6), and CZYA (0.3% NaNO3, 0.1% K2HPO4, 0.05% KCl, 0.05% MgSO4·7H2O, 0.001% FeSO4·7H2O, 0.5% yeast extract, 3% sucrose, 0.001% ZnSO4·7H2O, 0.0005% CuSO4·5H2O, 1.5% agar, pH 6.0–6.5), and in five different liquid media, namely DPY, PDB (0.4% potato extract, 2% dextrose), MEB (1.7% malt extract, 0.3% mycological peptone), YES (2% yeast extract, 15% sucrose, 0.05% MgSO4·7H2O, 0.001% ZnSO4·7H2O, 0.0005% CuSO4·5H2O, pH 6.4–6.6), and CZY (0.3% NaNO3, 0.1% K2HPO4, 0.05% KCl, 0.05% MgSO4·7H2O, 0.001% FeSO4·7H2O, 0.5% yeast extract, 3% sucrose, 0.001% ZnSO4·7H2O, 0.0005% CuSO4·5H2O, pH 6.0–6.5), for seven days at 25 °C. N. moseri CBS 164.80 was cultivated in the same manner as described above, but each medium was supplemented with 0.01% 4-HBA.
Metabolite extraction from the agar plates was performed as described above. For liquid cultures, the medium and mycelia were first separated by filtration. The mycelia were extracted with acetone under sonication for one hour, and the extract was concentrated and re-extracted with ethyl acetate. The resultant crude extract was dissolved in methanol and analyzed by LC–MS.
For the analysis of triterpenoids, a solvent system of 20 mM formic acid (solvent A) and acetonitrile containing 20 mM formic acid (solvent B) was used, at a flow rate of 1.0 mL/min and a column temperature of 40 °C. Separation was performed using a linear gradient from 10:90 (solvent B/solvent A) to 70:30 over 2 min, a linear gradient from 70:30 to 100:0 over the following 3 min, a hold at 100:0 for the following 5 min, a linear gradient from 100:0 to 10:90 within the following 1 min, and then 10:90 for 3 min of equilibration. An APCI source was used in positive polarity for mass spectrometry. The ion source temperature was set to 350 °C, and the curtain gas was 30 psi. For TOF-MS, the parameters were set as follows: nebulizer current 3 μA, CAD gas 8, declustering potential 80 V, declustering potential spread 0 V, collision energy 10 V, collision energy spread 0 V, TOF mass range 400–500 Da, and accumulation time 0.25 s. For TOF-MS/MS, the parameters were set as follows: nebulizer current 3 μA, CAD gas 8, declustering potential 80 V, declustering potential spread 0 V, collision energy 35 V, collision energy spread 15 V, TOF mass range 50–1000 Da, and accumulation time 0.1 s.
For the analysis of sesquiterpene hydroquinones, a solvent system of 20 mM formic acid (solvent A) and acetonitrile containing 20 mM formic acid (solvent B) was used, at a flow rate of 0.4 mL/min and a column temperature of 40 °C. Separation was performed using a linear gradient from 10:90 (solvent B/solvent A) to 100:0 over 10 min, a hold at 100:0 for the following 3 min, a linear gradient from 100:0 to 10:90 within the following 2 min, and then 10:90 for 2.5 min of equilibration. An ESI source was used in negative polarity for mass spectrometry. The ion source temperature was set to 450 °C, and the curtain gas was 25 psi. For TOF-MS, the parameters were set as follows: spray voltage −4.5 kV, CAD gas 7, declustering potential −80 V, declustering potential spread 0 V, collision energy −10 V, collision energy spread 0 V, TOF mass range 320–340 Da, and accumulation time 0.25 s. For TOF-MS/MS, the parameters were set as follows: spray voltage −4.5 kV, CDS gas 7, declustering potential −80 V, declustering potential spread 0 V, collision energy −35 V, collision energy spread 0 V, TOF mass range 50–1000 Da, and accumulation time 0.1 s.
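The two gradient programs described above can be written as piecewise-linear breakpoint tables, which makes the mobile-phase composition at any time point easy to check. The following is a minimal sketch; the breakpoints are transcribed from the text, whereas the variable and function names are illustrative and not part of any instrument software.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints transcribed from the methods text.
TRITERPENOID_GRADIENT = [(0.0, 10.0), (2.0, 70.0), (5.0, 100.0),
                         (10.0, 100.0), (11.0, 10.0), (14.0, 10.0)]
SESQUITERPENE_HQ_GRADIENT = [(0.0, 10.0), (10.0, 100.0), (13.0, 100.0),
                             (15.0, 10.0), (17.5, 10.0)]


def percent_b(gradient, t_min):
    """Linearly interpolate the % solvent B at time t_min (minutes)."""
    times = [t for t, _ in gradient]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(gradient) - 1:  # exactly at the final breakpoint
        return gradient[-1][1]
    (t0, b0), (t1, b1) = gradient[i], gradient[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Composition midway through the first ramp of each method.
print(percent_b(TRITERPENOID_GRADIENT, 1.0))
print(percent_b(SESQUITERPENE_HQ_GRADIENT, 5.0))
```

With these tables, the total run times are 14 min for the triterpenoid method and 17.5 min for the sesquiterpene hydroquinone method.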
Compounds 4, 11, and 12 were detected in YES liquid medium, DPY liquid medium, and MEB, respectively. Mycelia of A. homomorphus CBS 101889, A. alliaceus CBS 536.65, and N. moseri CBS 164.80 from YES liquid medium, DPY liquid medium, and MEB, respectively, were collected, and subsequently, their total RNA was extracted using a GeneJET Plant RNA Purification Kit (Thermo Scientific). Complementary DNA (cDNA) synthesis was then performed with a HiScript III 1st Strand cDNA Synthesis Kit (+ gDNA wiper) (Vazyme Biotech). The expression of each gene was confirmed by PCR using the synthesized cDNA as a template (Supplementary Fig. 5).
Antibacterial assay
Compounds 12 and 13 were tested for their antimicrobial activity using the Bauer-Kirby method. Five bacterial strains (Staphylococcus epidermidis ATCC 12228, Staphylococcus aureus ATCC 6538, Bacillus cereus, Streptococcus faecalis, and Escherichia coli ATCC 10536) were cultivated in Luria-Bertani (LB) broth at 37 °C for 12 h and then diluted to a concentration of ~5 × 10^5 colony-forming units (CFU)/mL using Mueller Hinton broth (2 g/L beef infusion solids, 1.5 g/L starch, 17.5 g/L casein hydrolysate). Subsequently, 180 μL of the prepared bacteria-containing medium was spread onto LB plates. After the surfaces of the plates dried, 5 mm filter papers impregnated with 5 µL of a two-fold serial dilution of the compounds (from 10 to 1.25 mg/mL) were placed on each plate, and the LB plates were incubated at 37 °C. After 24 h, the inhibition zones around each filter paper were measured. The experiment was performed in triplicate. Ampicillin was used as a positive control.
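The quantities implied by this protocol follow from simple arithmetic: the two-fold series from 10 to 1.25 mg/mL has four concentrations, and 5 µL of each corresponds to a fixed compound mass per filter paper. The following is a minimal sketch of that calculation; the function and variable names are illustrative, and the approximate inoculum (~5 × 10^5 CFU/mL) is taken at face value.

```python
def serial_dilution(start, factor, n_steps):
    """Concentrations of an n-step dilution series starting at `start`."""
    return [start / factor**i for i in range(n_steps)]


# Two-fold series from 10 down to 1.25 mg/mL (four concentrations).
conc_mg_per_ml = serial_dilution(10.0, 2, 4)

# 1 mg/mL equals 1 µg/µL, so mass per paper (µg) = concentration x volume (µL).
volume_ul = 5.0
mass_ug_per_paper = [c * volume_ul for c in conc_mg_per_ml]

# Approximate number of cells spread per plate: (CFU/mL) x (µL) / 1000.
cfu_per_ml = 5e5
spread_volume_ul = 180.0
cfu_per_plate = cfu_per_ml * spread_volume_ul / 1000.0

print(conc_mg_per_ml)
print(mass_ug_per_paper)
print(cfu_per_plate)
```

Under these assumptions, each filter paper carries between 6.25 and 50 µg of compound, and each plate receives roughly 9 × 10^4 CFU.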
Analytical data
α-polypodatetraene (1). Colorless oil; [α]D^22.2 +28.6 (c 0.94, CHCl3); UV (CHCl3) λmax (log ε) 200 (3.55) nm; IR (KBr) νmax 2955, 2924, 2850, 1461, 1377 cm−1; 1H NMR (600 MHz, CDCl3): δ [ppm] 5.11 (m, 3H), 4.82 (q, J = 1.4 Hz, 1H), 4.54 (brd, J = 1.1 Hz, 1H), 2.39 (ddd, J = 12.8, 4.2, 2.5 Hz, 1H), 2.08 (m, 5H), 1.99 (m, 5H), 1.82 (m, 1H), 1.75 (m, 1H), 1.72 (m, 1H), 1.68 (s, 3H), 1.60 (s, 3H), 1.60 (s, 3H), 1.59 (m, 1H), 1.56 (s, 3H), 1.56 (m, 3H), 1.48 (m, 2H), 1.31 (qd, J = 13.0, 4.3 Hz, 1H), 1.17 (td, J = 13.4, 3.9 Hz, 1H), 1.08 (dd, J = 12.6, 2.7 Hz, 1H), 1.00 (td, J = 13.0, 3.9 Hz, 1H), 0.87 (s, 3H), 0.80 (s, 3H), 0.66 (s, 3H); 13C NMR (150 MHz, CDCl3): δ [ppm] 148.8, 134.9, 134.9, 131.2, 125.1, 124.4, 124.3, 106.1, 56.2, 55.6, 42.2, 39.8, 39.7, 39.6, 39.1, 38.4, 33.6, 33.6, 26.9, 26.8, 26.7, 25.7, 24.5, 23.8, 21.7, 19.4, 17.7, 16.0, 16.0, 14.5; for NMR spectra, see Supplementary Figs. 12 and 13; HRMS (APCI) m/z: [M+H]+ Calcd for C30H51 411.3985; Found 411.3966. The NMR data are in good agreement with the reported data38.
α-polypodatetraen-3β-ol (2). Colorless oil; [α]D^18.4 +21.9 (c 0.99, CHCl3); UV (CHCl3) λmax (log ε) 200 (3.58) nm; IR (KBr) νmax 3446, 2956, 2926, 2852, 1461, 1378 cm−1; 1H NMR (600 MHz, CDCl3): δ [ppm] 5.11 (m, 3H), 4.84 (q, J = 1.4 Hz, 1H), 4.55 (brd, J = 0.9 Hz, 1H), 3.25 (dd, J = 11.8, 4.5 Hz, 1H), 2.40 (ddd, J = 12.8, 4.2, 2.5 Hz, 1H), 2.07 (m, 5H), 1.98 (m, 5H), 1.82 (m, 1H), 1.78 (dt, J = 13.2, 3.6 Hz, 1H), 1.73 (dddd, J = 12.9, 4.9, 2.5, 2.5 Hz, 1H), 1.69 (m, 1H), 1.68 (s, 3H), 1.60 (s, 3H), 1.60 (s, 3H), 1.58 (m, 2H), 1.56 (s, 3H), 1.41 (m, 4H), 1.15 (td, J = 13.3, 3.6 Hz, 1H), 1.07 (dd, J = 12.6, 2.7 Hz, 1H), 0.99 (s, 3H), 0.77 (s, 3H), 0.67 (s, 3H); 13C NMR (150 MHz, CDCl3): δ [ppm] 148.1, 135.1, 134.9, 131.3, 124.9, 124.4, 124.2, 106.6, 78.9, 55.9, 54.6, 39.8, 39.7, 39.2, 39.1, 38.2, 37.1, 28.3, 27.9, 26.8, 26.7, 26.8, 25.7, 24.0, 23.9, 17.7, 16.0, 16.0, 15.4, 14.5; for NMR spectra, see Supplementary Figs. 14 and 15; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H49 409.3829; Found 409.3810. The NMR data are in good agreement with the reported data39.
8α-hydroxypolypoda-13,17,21-triene (3). Colorless oil; [α]D^20.0 +0.0 (c 1.00, CHCl3); UV (CHCl3) λmax (log ε) 200 (3.59) nm; IR (KBr) νmax 2955, 2925, 2852, 1460, 1378 cm−1; 1H NMR (600 MHz, CDCl3): δ [ppm] 5.18 (tq, J = 7.2, 1.1 Hz, 1H), 5.10 (m, 2H), 2.07 (m, 6H), 1.98 (m, 4H), 1.86 (dt, J = 12.4, 3.3 Hz, 1H), 1.68 (s, 3H), 1.64 (m, 1H), 1.61 (s, 3H), 1.60 (s, 3H), 1.60 (s, 3H), 1.57 (dt, J = 13.7, 3.7 Hz, 1H), 1.43 (m, 2H), 1.37 (m, 2H), 1.30 (m, 2H), 1.26 (m, 1H), 1.15 (td, J = 13.5, 4.2 Hz, 1H), 1.13 (s, 3H), 1.03 (t, J = 3.9 Hz, 1H), 0.98 (td, J = 12.9, 3.6 Hz, 1H), 0.91 (dd, J = 12.3, 2.4 Hz, 1H), 0.89 (m, 1H), 0.87 (s, 3H), 0.78 (s, 3H), 0.78 (s, 3H); 13C NMR (150 MHz, CDCl3): δ [ppm] 135.2, 135.0, 131.2, 125.0, 124.4, 124.2, 74.1, 61.5, 56.2, 44.5, 42.0, 39.7, 39.7, 39.7, 39.1, 33.4, 33.2, 31.4, 26.8, 26.7, 25.7, 25.5, 23.8, 21.5, 20.6, 18.5, 17.7, 16.2, 16.0, 15.4; for NMR spectra, see Supplementary Figs. 16 and 17; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H51 411.3985; Found 411.3966. The NMR data are in good agreement with the reported data40.
homomonoceroid A (4). Colorless oil; [α]D^19.6 +28.9 (c 0.45, CHCl3); UV (CHCl3) λmax (log ε) 205 (4.08) nm; IR (KBr) νmax 3420, 2938, 2870, 2846, 1459, 1388 cm−1; for NMR data, see Supplementary Figs. 18–24; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H51O 427.3934; Found 427.3923.
fumionoceroid A (5). Colorless crystal; [α]D^18.7 +14.5 (c 0.89, CHCl3); UV (CHCl3) λmax (log ε) 207 (4.68) nm; IR (KBr) νmax 3290, 2959, 2928, 2867, 1459, 1379 cm−1; for NMR data, see Supplementary Figs. 27–33; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H49 409.3829; Found 409.3811.
fumionoceroid B (6). White amorphous solid; [α]D^23.9 +8.2 (c 0.69, CHCl3); UV (CHCl3) λmax (log ε) 207 (4.09) nm; IR (KBr) νmax 3394, 2957, 2927, 2853, 1459, 1379 cm−1; for NMR data, see Supplementary Figs. 34–40; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H49O 425.3778; Found 425.3758.
fumionoceroid C (7). Colorless crystal; [α]D^19.9 +18.1 (c 1.00, CHCl3); UV (CHCl3) λmax (log ε) 200 (3.60) nm; IR (KBr) νmax 3302, 2954, 2926, 2852, 1465, 1381 cm−1; for NMR data, see Supplementary Figs. 41–47; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H49O 425.3778; Found 425.3762.
alliaonoceroid A (8). White amorphous solid; [α]D^23.2 +5.3 (c 0.71, CHCl3); UV (CHCl3) λmax (log ε) 206 (4.08) nm; IR (KBr) νmax 3338, 2955, 2925, 2852, 1460, 1378 cm−1; 1H NMR (600 MHz, C6D6): δ [ppm] 3.01 (m, 1H), 1.95 (tt, J = 13.2, 3.2 Hz, 2H), 1.79 (td, J = 13.1, 3.9 Hz, 1H), 1.75 (m, 2H), 1.69 (m, 1H), 1.63 (dt, J = 12.7, 3.6 Hz, 1H), 1.59 (m, 1H), 1.56 (m, 1H), 1.51 (ddd, J = 13.9, 6.1, 3.6 Hz, 1H), 1.46 (m, 2H), 1.41 (m, 1H), 1.46 (m, 3H), 1.37 (s, 3H), 1.36 (m, 3H), 1.35 (s, 3H), 1.25 (m, 2H), 1.16 (m, 3H), 0.96 (s, 3H), 0.86 (m, 1H), 0.86 (s, 3H), 0.83 (m, 1H), 0.80 (s, 3H), 0.75 (s, 3H), 0.75 (s, 3H), 0.74 (m, 1H), 0.69 (s, 3H); 13C NMR (150 MHz, C6D6): δ [ppm] 80.1, 79.9, 78.4, 61.1, 61.1, 56.5, 55.5, 45.8, 45.6, 42.3, 40.6, 39.1, 39.1, 38.9, 38.7, 33.7, 33.6, 28.4, 28.1, 25.7, 25.5, 25.5, 25.3, 21.7, 21.2, 20.7, 19.3, 16.1, 16.0, 15.7; for NMR spectra, see Supplementary Figs. 48 and 49; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H51O 427.3934; Found 427.3915. The NMR data are in good agreement with the reported data42.
alliaonoceroid B (9). Colorless crystal; [α]D^23.5 +48.2 (c 0.81, CHCl3); UV (CHCl3) λmax (log ε) 200 (3.61) nm; IR (KBr) νmax 3445, 2954, 2926, 2856, 1694, 1459, 1378 cm−1; for NMR data, see Supplementary Figs. 50–56; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C30H49O2 441.3727; Found 441.3715.
alliaonoceroid C (10). Colorless crystal; [α]D^23.0 +15.5 (c 1.00, CHCl3); UV (CHCl3) λmax (log ε) 200 (3.64) nm; IR (KBr) νmax 2955, 2926, 2856, 1728, 1463, 1375, 1248 cm−1; for NMR data, see Supplementary Figs. 57–63; HRMS (APCI) m/z: [M–H2O+H]+ Calcd for C32H53O2 469.4040; Found 469.4036.
alliaonoceroid D (11). Colorless crystal; [α]D^24.3 +40.0 (c 1.00, CHCl3); UV (CHCl3) λmax (log ε) 200 (3.67) nm; IR (KBr) νmax 2958, 2926, 2871, 1737, 1462, 1367, 1244 cm−1; for NMR data, see Supplementary Figs. 64–70; HRMS (APCI) m/z: [M–AcOH+H]+ Calcd for C32H53O3 485.3989; Found 485.3989.
moserinol (12). White amorphous solid; [α]D^22.8 +2.1 (c 1.00, CHCl3); UV (CHCl3) λmax (log ε) 215 (3.54), 233 (3.41), 294 (3.54) nm; IR (KBr) νmax 3391, 2925, 1505, 1459, 1388, 1200 cm−1; for NMR data, see Supplementary Figs. 71–77; HRMS (ESI) m/z: [M–H]– Calcd for C21H31O3 331.2279; Found 331.2283.
ent-yahazunol (13). White amorphous solid; [α]D^21.8 +7.3 (c 0.10, CHCl3); UV (CHCl3) λmax (log ε) 206 (3.33), 215 (3.37), 237 (3.24), 294 (3.25) nm; IR (KBr) νmax 3382, 2935, 1506, 1458, 1392, 1234 cm−1; 1H NMR (600 MHz, acetone-d6): δ [ppm] 6.64 (d, J = 2.8 Hz, 1H), 6.52 (d, J = 8.5 Hz, 1H), 6.49 (dd, J = 8.5, 2.8 Hz, 1H), 2.85 (dd, J = 14.9, 2.0 Hz, 1H), 2.39 (dd, J = 15.0, 6.1 Hz, 1H), 1.92 (dt, J = 12.4, 3.2 Hz, 1H), 1.83 (brd, J = 13.0 Hz, 1H), 1.67 (m, 1H), 1.63 (m, 1H), 1.58 (td, J = 14.8, 3.9 Hz, 1H), 1.58 (dd, J = 5.9, 2.0 Hz, 1H), 1.36 (m, 1H), 1.34 (m, 1H), 1.32 (m, 1H), 1.30 (s, 3H), 1.10 (td, J = 13.5, 4.2 Hz, 1H), 0.97 (s, 3H), 0.96 (dd, J = 12.3, 2.6 Hz, 1H), 0.86 (s, 3H), 0.82 (s, 3H), 0.72 (td, J = 13.2, 3.5 Hz, 1H); 13C NMR (150 MHz, acetone-d6): δ [ppm] 150.3, 149.7, 131.0, 118.8, 117.4, 114.2, 75.0, 62.2, 56.9, 44.4, 42.4, 41.3, 40.5, 33.7, 33.7, 27.8, 24.5, 21.8, 21.1, 19.0, 15.8; for NMR spectra, see Supplementary Figs. 78 and 79; HRMS (ESI) m/z: [M–H]– Calcd for C21H31O3 331.2279; Found 331.2288. The NMR data are in good agreement with the reported data for yahazunol47, and the absolute configuration was determined by comparing the specific rotation of 13 with that reported for ent-yahazunol46.
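The calculated m/z values quoted above can be reproduced from monoisotopic atomic masses, and the agreement between calculated and found values is conveniently expressed as a ppm error. The following is a minimal sketch; the function names and the simple formula parser (C, H, and O only) are illustrative assumptions, and the atomic masses are standard monoisotopic values rounded to eight decimals.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858


def ion_mz(formula, charge):
    """m/z of an ion with the given elemental formula and integer charge."""
    neutral = sum(MONOISOTOPIC[el] * int(n or 1)
                  for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula))
    # A cation has lost electrons; an anion has gained them.
    return (neutral - charge * ELECTRON) / abs(charge)


def ppm_error(found, calcd):
    """Relative deviation of a measured m/z from the calculated one, in ppm."""
    return (found - calcd) / calcd * 1e6


# [M+H]+ of compound 1 (C30H51) and [M-H]- of compound 12 (C21H31O3).
print(round(ion_mz("C30H51", +1), 4))
print(round(ion_mz("C21H31O3", -1), 4))
print(round(ppm_error(411.3966, 411.3985), 2))
```

For compound 1, the found value of 411.3966 deviates from the calculated 411.3985 by about −4.6 ppm.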
Crystallographic data for 5. C30H50O, M = 426.70, a = 19.8821(4) Å, b = 19.8821(4) Å, c = 7.7762(2) Å, α = 90°, β = 90°, γ = 90°, V = 3073.92(15) Å3, T = 213(2) K, space group P4(3), Z = 4, μ(Cu Kα) = 0.395 mm−1, 38497 reflections measured, 6229 independent reflections (Rint = 0.0575). The final R1 values were 0.0403 (I > 2σ(I)). The final wR(F2) values were 0.1037 (I > 2σ(I)). The final R1 values were 0.0413 (all data). The final wR(F2) values were 0.1054 (all data). The goodness of fit on F2 was 1.034. Flack parameter 0.03(8). The crystallographic information file (CIF) for this crystal structure was submitted to The Cambridge Crystallographic Data Centre (CCDC) under reference number 2279439.
Crystallographic data for 7. C30.5H52O2.5, M = 458.72, a = 12.7561(3) Å, b = 25.8142(7) Å, c = 8.2201(2) Å, α = 90°, β = 90°, γ = 90°, V = 2706.78(12) Å3, T = 213(2) K, space group P2(1)2(1)2, Z = 4, μ(Cu Kα) = 0.523 mm−1, 36412 reflections measured, 5545 independent reflections (Rint = 0.1262). The final R1 values were 0.0536 (I > 2σ(I)). The final wR(F2) values were 0.1306 (I > 2σ(I)). The final R1 values were 0.0684 (all data). The final wR(F2) values were 0.1411 (all data). The goodness of fit on F2 was 1.054. Flack parameter 0.1(2). The crystallographic information file (CIF) for this crystal structure was submitted to The Cambridge Crystallographic Data Centre (CCDC) under reference number 2279440.
Crystallographic data for 9. C30H50O3, M = 458.70, a = 6.3025(2) Å, b = 25.5737(8) Å, c = 16.2273(5) Å, α = 90°, β = 99.296(2)°, γ = 90°, V = 2581.14(14) Å3, T = 203(2) K, space group P2(1), Z = 4, μ(Cu Kα) = 0.564 mm−1, 43044 reflections measured, 10447 independent reflections (Rint = 0.0974). The final R1 values were 0.0544 (I > 2σ(I)). The final wR(F2) values were 0.1293 (I > 2σ(I)). The final R1 values were 0.0587 (all data). The final wR(F2) values were 0.1325 (all data). The goodness of fit on F2 was 1.055. Flack parameter −0.10(12). The crystallographic information file (CIF) for this crystal structure was submitted to The Cambridge Crystallographic Data Centre (CCDC) under reference number 2279441.
Crystallographic data for 10. C30.5H52O2.5, M = 486.75, a = 11.6771(3) Å, b = 7.3892(2) Å, c = 33.1184(8) Å, α = 90°, β = 93.8920(10)°, γ = 90°, V = 2851.01(13) Å3, T = 233(2) K, space group P2(1), Z = 4, μ(Cu Kα) = 0.537 mm−1, 43244 reflections measured, 10785 independent reflections (Rint = 0.0437). The final R1 values were 0.0403 (I > 2σ(I)). The final wR(F2) values were 0.1176 (I > 2σ(I)). The final R1 values were 0.0417 (all data). The final wR(F2) values were 0.1197 (all data). The goodness of fit on F2 was 1.022. Flack parameter −0.01(6). The crystallographic information file (CIF) for this crystal structure was submitted to The Cambridge Crystallographic Data Centre (CCDC) under reference number 2279442.
Crystallographic data for 11. C34H56O5, M = 544.78, a = 7.8398(3) Å, b = 10.0596(4) Å, c = 10.6376(4) Å, α = 74.8350(10)°, β = 75.3740(10)°, γ = 83.1520(10)°, V = 782.21(5) Å3, T = 203(2) K, space group P1, Z = 1, μ(Cu Kα) = 0.590 mm−1, 20283 reflections measured, 6036 independent reflections (Rint = 0.0496). The final R1 values were 0.0415 (I > 2σ(I)). The final wR(F2) values were 0.1142 (I > 2σ(I)). The final R1 values were 0.0420 (all data). The final wR(F2) values were 0.1150 (all data). The goodness of fit on F2 was 1.096. Flack parameter 0.17(8). The crystallographic information file (CIF) for this crystal structure was submitted to The Cambridge Crystallographic Data Centre (CCDC) under reference number 2279443.
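As a consistency check on the cell parameters listed above, the unit-cell volume can be recomputed from a, b, c, α, β, and γ with the general triclinic formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The following is a minimal sketch; the function name is illustrative, and the standard uncertainties in parentheses are omitted from the inputs.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume (Å^3) from edge lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Cell parameters transcribed from the crystallographic data above.
cells = {
    "5": (19.8821, 19.8821, 7.7762, 90.0, 90.0, 90.0),        # tetragonal
    "9": (6.3025, 25.5737, 16.2273, 90.0, 99.296, 90.0),      # monoclinic
    "11": (7.8398, 10.0596, 10.6376, 74.835, 75.374, 83.152),  # triclinic
}
for compound, params in cells.items():
    print(compound, round(cell_volume(*params), 2))
```

The recomputed volumes agree with the reported values of 3073.92, 2581.14, and 782.21 Å3 for 5, 9, and 11, respectively, to within rounding.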
Reporting summary
Further information on research design is available in the Nature
Portfolio Reporting Summary linked to this article.
Data availability
The manually curated fungal BGCs and custom-made HMM profiles are deposited at Zenodo under the https://doi.org/10.5281/zenodo.8126803 (ref. 82). The online version of the FunBGCs database can currently be accessed at http://staffweb1.cityu.edu.hk/ymatsuda/funbgcs/funbgcs.html. The corresponding genomic regions for the biosynthetic gene clusters reported in this study can be accessed at the National Center for Biotechnology Information (NCBI) database with the accession numbers NW_020291903 [https://www.ncbi.nlm.nih.gov/nuccore/NW_020291903.1?report=genbank&from=95123&to=101061&strand=true] (the homo cluster; region: 95,123…101,061), DS499601 (the fumi cluster; region: 75,155…91,115), NW_022474693 [https://www.ncbi.nlm.nih.gov/nuccore/NW_022474693.1?report=genbank&from=128036&to=138909] (the alli cluster; region: 128,036…138,909), and NW_026055164 [https://www.ncbi.nlm.nih.gov/nuccore/NW_026055164.1?report=genbank&from=259950&to=270410] (the mos cluster; region: 259,950…270,410). The crystallographic data obtained in this study have been deposited at the Cambridge Crystallographic Data Centre (CCDC) under deposition numbers 2279439 (5), 2279440 (7), 2279441 (9), 2279442 (10), and 2279443 (11). Copies of the crystallographic data can be obtained free of charge via https://www.ccdc.cam.ac.uk/structures/. The raw NMR data obtained in this study have been deposited at the Natural Product Magnetic Resonance Database (NP-MRD) under deposition numbers NP0332710 (1), NP0332711 (2), NP0332712 (3), NP0332713 (4), NP0332714 (5), NP0332715 (6), NP0332716 (7), NP0332717 (8), NP0332718 (9), NP0332719 (10), NP0332720 (11), NP0332721 (12), and NP0332722 (13). Copies of the NMR data can be obtained free of charge via https://np-mrd.org/. Source data are provided with this paper.
Code availability
All original codes for FunBGCeX are deposited at Zenodo under the https://doi.org/10.5281/zenodo.8126797 (ref. 83) and are also available at https://github.com/ydmatsd/funbgcex (ref. 84).
References
1. Navarro-Muñoz, J. C. et al. A computational framework to explore large-scale biosynthetic diversity. Nat. Chem. Biol. 16, 60–68 (2020).
2. Chevrette, M. G. et al. The confluence of big data and evolutionary genome mining for the discovery of natural products. Nat. Prod. Rep. 38, 2024–2040 (2021).
3. Bergmann, S. et al. Genomics-driven discovery of PKS-NRPS hybrid metabolites from Aspergillus nidulans. Nat. Chem. Biol. 3, 213–217 (2007).
4. Nett, M., Ikeda, H. & Moore, B. S. Genomic basis for natural product biosynthetic diversity in the actinomycetes. Nat. Prod. Rep. 26, 1362–1384 (2009).
5. Brakhage, A. A. Regulation of fungal secondary metabolism. Nat. Rev. Microbiol. 11, 21–32 (2013).
6. Rutledge, P. J. & Challis, G. L. Discovery of microbial natural products by activation of silent biosynthetic gene clusters. Nat. Rev. Microbiol. 13, 509 (2015).
7. Covington, B. C., Xu, F. & Seyedsayamdost, M. R. A natural product chemist’s guide to unlocking silent biosynthetic gene clusters. Annu. Rev. Biochem. 90, 763–788 (2021).
8. Bauman, K. D., Butler, K. S., Moore, B. S. & Chekan, J. R. Genome mining methods to discover bioactive natural products. Nat. Prod. Rep. 38, 2100–2129 (2021).
9. Blin, K. et al. antiSMASH 7.0: New and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic Acids Res. 51, W46–W50 (2023).
10. Khaldi, N. et al. SMURF: Genomic mapping of fungal secondary metabolite clusters. Fungal Genet. Biol. 47, 736–741 (2010).
11. Hannigan, G. D. et al. A deep learning genome-mining strategy for biosynthetic gene cluster prediction. Nucleic Acids Res. 47, e110 (2019).
12. Almeida, H., Palys, S., Tsang, A. & Diallo, A. B. TOUCAN: A framework for fungal biosynthetic gene cluster discovery. NAR Genom. Bioinform. 2, lqaa098 (2020).
13. Matsuda, Y. et al. Novofumigatonin biosynthesis involves a non-heme iron-dependent endoperoxide isomerase for orthoester formation. Nat. Commun. 9, 2587 (2018).
14. Mistry, J. et al. Pfam: The protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2020).
15. Barra, L. & Abe, I. Chemistry of fungal meroterpenoid cyclases. Nat. Prod. Rep. 38, 566–585 (2021).
16. Terlouw, B. R. et al. MIBiG 3.0: A community-driven effort to annotate experimentally validated biosynthetic gene clusters. Nucleic Acids Res. 51, D603–D610 (2022).
17. HMMER: biosequence analysis using profile hidden Markov models
18. Itoh, T. et al. Reconstitution of a fungal meroterpenoid biosynthesis reveals the involvement of a novel family of terpene cyclases. Nat. Chem. 2, 858–864 (2010).
19. Fujiyama, K. et al. Molecular basis for two stereoselective Diels–Alderases that produce decalin skeletons. Angew. Chem. Int. Ed. 60, 22401–22410 (2021).
20. Schor, R., Schotte, C., Wibberg, D., Kalinowski, J. & Cox, R. J. Three previously unrecognised classes of biosynthetic enzymes revealed during the production of xenovulene A. Nat. Commun. 9, 1963 (2018).
21. Lin, T.-S., Chiang, Y.-M. & Wang, C. C. C. Biosynthetic pathway of the reduced polyketide product citreoviridin in Aspergillus terreus var. aureus revealed by heterologous expression in Aspergillus nidulans. Org. Lett. 18, 1366–1369 (2016).
22. Matsuda, Y., Iwabuchi, T., Wakimoto, T., Awakawa, T. & Abe, I. Uncovering the unusual D-ring construction in terretonin biosynthesis by collaboration of a multifunctional cytochrome P450 and a unique isomerase. J. Am. Chem. Soc. 137, 3393–3401 (2015).
23. Letunic, I., Khedkar, S. & Bork, P. SMART: Recent updates, new
developments and status in 2020. Nucleic Acids Res. 49,
D458–D460 (2020).
24. Haft, D. H. et al. TIGRFAMs and genome properties in 2013. Nucleic
Acids Res. 41,D387–D395 (2012).
25. Buchfink, B., Reuter, K. & Drost, H.-G. Sensitive protein alignments
at tree-of-life scale using DIAMOND. Nat. Methods 18,
366–368 (2021).
26. Yan, Y., Liu, N. & Tang, Y. Recent developments in self-resistance
gene directed natural product discovery. Nat. Prod. Rep. 37,
879–892 (2020).
27. Tsukada,K.etal.Syntheticbiology based construction of biological
activity-related library of fungal decalin-containing diterpenoid
pyrones. Nat. Commun. 11,1830(2020).
28. Tagami, K. et al. Reconstitution of biosynthetic machinery for
indole-diterpene paxilline in Aspergillus oryzae.J. Am. Chem. Soc.
135,1260–1263 (2013).
29. Marchler-Bauer, A. et al. CDD/SPARCLE: Functional classification of
proteins via subfamily domain architectures. Nucleic Acids Res. 45,
D200–D203 (2016).
30. Tang, J. & Matsuda, Y. Dissection of the catalytic mechanisms of
transmembrane terpene cyclases involved in fungal meroterpenoid
biosynthesis. Angew. Chem. Int. Ed. 62, e202306046 (2023).
31. Ozaki, T., Minami, A. & Oikawa, H. Biosynthesis of indole diterpenes:
A reconstitution approach in a heterologous host. Nat. Prod. Rep.
40,202–213 (2023).
32. Matsuda, Y. & Abe, I. Biosynthesis of fungal meroterpenoids. Nat.
Prod. Rep. 33,26–53 (2016).
33. Matsuda, Y. & Abe, I. 1.14 - Fungal meroterpenoids. in Comprehen-
sive Natural Products III (eds. Liu, H.-W. & Begley, T. P.) 445-478
(Elsevier, Oxford, 2020).
34. Jin, F. J., Maruyama, J., Juvvadi, P. R., Arioka, M. & Kitamoto, K.
Development of a novel quadruple auxotrophic host transformation
system by argB gene disruption using adeA gene and exploiting
adenine auxotrophy in Aspergillus oryzae.FEMS Microbiol. Lett.
239,79–85 (2004).
35. Ye, Y. et al. Genome mining for sesterterpenes using bifunctional
terpene synthases reveals a unified intermediate of di/ses-
terterpenes. J. Am. Chem. Soc. 137,11846–11853 (2015).
36. Yuan, Y. et al. Efficient exploration of terpenoid biosynthetic gene
clusters in filamentous fungi. Nat. Catal. 5,277–287 (2022).
37. Chen, L., Wei, X. & Matsuda, Y. Depside bond formation by the
starter-unit acyltransferase domain of a fungal polyketide synthase.
J. Am. Chem. Soc. 144, 19225–19230 (2022).
38. Kinoshita, M., Ohtsuka, M., Nakamura, D. & Akita, H. First synthesis
of (+)-α-and(+)-γ-polypodatetraenes. Chem.Pharm.Bull.50,
930–934 (2002).
39. Bennett, G. J., Harrison, L. J., Sia, G.-L. & Sim, K.-Y. Triterpenoids,
tocotrienols and xanthones from the bark of Cratoxylum
Cochinchinense.Phytochemistry 32,1245–1251 (1993).
40. Arai, Y., Hirohara, M., Ageta, H. & Hsű,H.Y.Fernconstituents:Two
new triterpenoid alcohols with mono- and bi-cyclic skeletons, iso-
lated from Polypodiodes formosana.Tetrahedron Lett. 33,
1325–1328 (1992).
41. Ohtani, I., Kusumi, T., Kashman, Y. & Kakisawa, H. High-field FT NMR
application of Mosher’s method. The absolute configurations of
marine terpenoids. J. Am. Chem. Soc. 113,4092–4096 (1991).
42. Bartels, F. et al. Bioinspired synthesis of pentacyclic onocerane
triterpenoids. Chem. Sci. 8,8285–8290 (2017).
43. Gachet, M. S. et al. Antiparasitic compounds from Cupania cinerea
with activities against Plasmodium falciparum and Trypanosoma
brucei rhodesiense. J. Nat. Prod. 74, 559–566 (2011).
44. Tang, J. & Matsuda, Y. Discovery of branching meroterpenoid bio-
synthetic pathways in Aspergillus insuetus: Involvement of two
terpene cyclases with distinct cyclization modes. Chem. Sci. 13,
10361–10369 (2022).
45. Lv, J.-M. et al. Biosynthesis of biscognienyne B involving a
cytochrome P450-dependent alkynylation. Angew. Chem. Int. Ed. 59,
13531–13536 (2020).
46. Pérez-García, E., Zubía, E., Ortega, M. J. & Carballo, J. L.
Merosesquiterpenes from two sponges of the genus Dysidea. J. Nat.
Prod. 68, 653–658 (2005).
47. Laube, T., Schröder, J., Stehle, R. & Seifert, K. Total synthesis of
yahazunol, zonarone and isozonarone. Tetrahedron 58,
4299–4309 (2002).
48. Ohashi, M. et al. Biosynthesis of para-cyclophane-containing hir-
sutellone family of fungal natural products. J. Am. Chem. Soc. 143,
5605–5609 (2021).
49. Qi, J. et al. Chaetoglobosins and azaphilones from Chaetomium
globosum associated with Apostichopus japonicus. Appl. Microbiol.
Biotechnol. 104, 1545–1553 (2020).
50. Perlatti, B. et al. Identification of the antifungal metabolite chae-
toglobosin P from Discosia rubi using a Cryptococcus neoformans
inhibition assay: Insights into mode of action and biosynthesis.
Front. Microbiol. 11, 1766 (2020).
51. Zhong, Z. et al. Emergence of a hybrid PKS-NRPS secondary
metabolite cluster in a clonal population of the rice blast fungus
Magnaporthe oryzae. Environ. Microbiol. 22, 2709–2723 (2020).
52. Ueda, D., Hoshino, T. & Sato, T. Cyclization of squalene from both
termini: Identification of an onoceroid synthase and enzymatic
synthesis of ambrein. J. Am. Chem. Soc. 135, 18335–18338 (2013).
53. Ageta, H., Shiojima, K. & Masuda, K. Fern constituents; onoceroid,
α-onoceradiene, serratene and onoceranoxide, isolated from Lemmaphyllum
microphyllum varieties. Chem. Pharm. Bull. (Tokyo) 30,
2272–2274 (1982).
54. Tanaka, T. et al. New onoceranoid triterpene constituents from
Lansium domesticum. J. Nat. Prod. 65, 1709–1711 (2002).
55. Abbet, C. et al. Phyteumosides A and B: New saponins with unique
triterpenoid aglycons from Phyteuma orbiculare L. Org. Lett. 13,
1354–1357 (2011).
56. Ohloff, G. 15 - The fragrance of ambergris. in Fragrance Chemistry
(ed. Theimer, E. T.) 535–573 (Academic Press, San Diego, 1982).
57. Araki, T. et al. Onocerin biosynthesis requires two highly dedicated
triterpene cyclases in a fern Lycopodium clavatum. ChemBioChem
17, 288–290 (2016).
58. Saga, Y. et al. Identification of serratane synthase gene from the
fern Lycopodium clavatum. Org. Lett. 19, 496–499 (2017).
59. Quin, M. B., Flynn, C. M. & Schmidt-Dannert, C. Traversing the
fungal terpenome. Nat. Prod. Rep. 31, 1449–1473 (2014).
60. Gao, Y. et al. Biosynthesis of fungal triterpenoids and steroids. Chin.
J. Org. Chem. 38, 2335–2347 (2018).
61. Zhao, P. et al. Structural diversity, fermentation production, bioac-
tivities and applications of triterpenoids from several common
medicinal fungi: Recent advances and future perspectives. Fito-
terapia 166, 105470 (2023).
62. Tao, H. et al. Discovery of non-squalene triterpenes. Nature 606,
414–419 (2022).
63. Ochi, M., Kotsuki, H., Muraoka, K. & Tokoroyama, T. The Structure of
yahazunol, a new sesquiterpene-substituted hydroquinone from
the brown seaweed Dictyopteris undulata Okamura. Bull. Chem.
Soc. Jpn. 52, 629–630 (1979).
64. Meganathan, R. Ubiquinone biosynthesis in microorganisms. FEMS
Microbiol. Lett. 203, 131–139 (2001).
65. He, B.-B. et al. Enzymatic pyran formation involved in xiamenmycin
biosynthesis. ACS Catal. 9, 5391–5399 (2019).
66. Yan, D. & Matsuda, Y. Global genome mining-driven discovery of an
unusual biosynthetic logic for fungal polyketide–terpenoid hybrids.
Chem. Sci. 15, 3011–3017 (2024).
67. Chiang, C.-Y., Ohashi, M. & Tang, Y. Deciphering chemical logic of
fungal natural product biosynthesis through heterologous expression
and genome mining. Nat. Prod. Rep. 40, 89–127 (2023).
68. Yee, D. A. et al. Genome mining for unknown–unknown natural
products. Nat. Chem. Biol. 19, 633–640 (2023).
69. Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native
and syntenically mapped cDNA alignments to improve de novo
gene finding. Bioinformatics 24, 637–644 (2008).
70. Camacho, C. et al. BLAST+: Architecture and applications. BMC
Bioinforma. 10, 421 (2009).
71. Sievers, F. & Higgins, D. G. Clustal Omega for making accurate
alignments of many protein sequences. Protein Sci. 27,
135–145 (2018).
72. Kessler, S. C. & Chooi, Y.-H. Out for a RiPP: Challenges and
advances in genome mining of ribosomal peptides from fungi. Nat.
Prod. Rep. 39, 222–230 (2022).
73. Edgar, R. C. MUSCLE: Multiple sequence alignment with high
accuracy and high throughput. Nucleic Acids Res. 32,
1792–1797 (2004).
74. Castresana, J. Selection of conserved blocks from multiple align-
ments for their use in phylogenetic analysis. Mol. Biol. Evol. 17,
540–552 (2000).
75. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately
maximum-likelihood trees for large alignments. PLOS ONE 5,
e9490 (2010).
76. Stamatakis, A. RAxML version 8: A tool for phylogenetic analysis
and post-analysis of large phylogenies. Bioinformatics 30,
1312–1313 (2014).
77. Darriba, D., Taboada, G. L., Doallo, R. & Posada, D. ProtTest 3: Fast
selection of best-fit models of protein evolution. Bioinformatics 27,
1164–1165 (2011).
78. Wei, X. et al. Molecular and computational bases for spirofuranone
formation in setosusin biosynthesis. J. Am. Chem. Soc. 143,
17708–17715 (2021).
79. Chen, L., Tang, J.-W., Liu, Y. Y. & Matsuda, Y. Aspcandine: A pyrro-
lobenzazepine alkaloid synthesized by a fungal nonribosomal
peptide synthetase-polyketide synthase hybrid. Org. Lett. 24,
4816–4819 (2022).
80. Liu, C. et al. Efficient reconstitution of Basidiomycota diterpene
erinacine gene cluster in Ascomycota host Aspergillus oryzae based
on genomic DNA sequences. J. Am. Chem. Soc. 141,
15519–15523 (2019).
81. Tamano, K. et al. Heterologous production of free dihomo-γ-
linolenic acid by Aspergillus oryzae and its extracellular release via
surfactant supplementation. J. Biosci. Bioeng. 127, 451–457 (2019).
82. Matsuda, Y. Data Sources for FunBGCeX: v0.0.0 Zenodo
https://doi.org/10.5281/zenodo.8126803 (2023).
83. Matsuda, Y. FunBGCeX: v0.0.0 Zenodo
https://doi.org/10.5281/zenodo.8126797 (2023).
84. Matsuda, Y. FunBGCeX GitHub
https://github.com/ydmatsd/funbgcex (2023).
Acknowledgements
We thank Prof. Katsuya Gomi (Tohoku University) and Profs. Katsuhiko
Kitamoto and Jun-ichi Maruyama (University of Tokyo) for the expression
vectors and fungal strain. We are also grateful to Dr. Man Kit Tse and Dr.
Kwok Chung Law (City University of Hong Kong) for the NMR spectra
acquisition and Dr. Shek Man Yiu (City University of Hong Kong) for X-ray
diffraction data collection and analysis. This work was supported by
General Research Fund grants from the Research Grants Council of
Hong Kong (Project Nos. 11301321 and 11309022 to Y.M.).
Author contributions
Y.M. designed the research and conducted the bioinformatic analysis.
J.T. performed experiments. Both authors analyzed the data and co-
wrote the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Supplementary information The online version contains
supplementary material available at
https://doi.org/10.1038/s41467-024-48771-7.
Correspondence and requests for materials should be addressed to
Yudai Matsuda.
Peer review information Nature Communications thanks the anon-
ymous reviewer(s) for their contribution to the peer review of this work. A
peer review file is available.
Reprints and permissions information is available at
http://www.nature.com/reprints
Publisher’s note Springer Nature remains neutral with regard to jur-
isdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as
long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons licence, and indicate if
changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless
indicated otherwise in a credit line to the material. If material is not
included in the article’s Creative Commons licence and your intended
use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright
holder. To view a copy of this licence, visit http://creativecommons.org/
licenses/by/4.0/.
© The Author(s) 2024