Available via license: CC BY-NC-ND 4.0
The Transcriptional Architecture of Bacterial
Biosynthetic Gene Clusters
Silvia Ribeiro Monteiro1, Yasmine Kerdel1, Julianne Gathot1, and Sébastien Rigali1,*.
1InBioS – Center for Protein Engineering, University of Liège, Institut de Chimie, Liège B-4000, Belgium
*Corresponding author: srigali@uliege.be
Abstract
Bacteria produce diverse bioactive metabolites with ecological and pharmaceutical importance. These compounds
are synthesized by biosynthetic gene clusters (BGCs), whose expression is tightly regulated. While many studies have
examined the factors influencing BGC expression, including transcription factors (TFs) and environmental signals, the
regulatory architecture governing BGC expression remains largely unexplored. In this meta-analysis, we collected
experimental datasets of bacterial transcription factor binding sites (TFBSs) to unveil i) the functional gene categories
preferentially targeted by TFs, ii) the regulatory coverage based on cluster organization, iii) the positional distribution of
TFBSs, and iv) the binding strength of TFs. Our analysis reveals a regulatory strategy in which global TFs primarily target
pathway-specific TFs when present, consistent with a "one-for-all" strategy that ensures cluster-wide expression control.
Additionally, examination of the organization of TFBS-associated genes identified distinct transcriptional strategies:
regulatory genes are frequently monocistronic, while biosynthetic genes tend to be co-transcribed in operons to
guarantee biosynthesis efficiency. The positional distribution of TFBSs shows a strong enrichment in the upstream
regions of genes, consistent with their role in gene regulation. Finally, assessment of TF-TFBS interaction strength suggests
that TFBSs within BGCs exhibit lower binding affinities than those associated with core regulon genes that
reside outside BGCs, allowing greater regulatory flexibility in response to multiple environmental cues. These findings
provide new insights into the regulatory principles shaping BGC expression and should help predict conditions for
activating cryptic BGCs, facilitating the discovery of novel bioactive compounds through targeted culture and
engineering strategies.
Introduction
Bacteria are remarkable producers of specialized
metabolites (polyketides, nonribosomal peptides,
terpenoids, aminoglycosides, quinones…) which are
small organic compounds critical for survival in natural
habitats. These compounds display an extraordinary
structural and bioactive diversity (e.g. antimicrobials,
toxins, antioxidants, metal chelators, signaling
molecules…) and have provided useful drugs or leads
for treating infections and for cancer chemotherapy,
amongst other applications 1,2. Only an estimated 3% of
the natural products encoded in bacterial genomes
have been experimentally characterized, highlighting
the immense potential for discovering numerous
valuable compounds in the future 3. Their production
depends on biosynthetic gene clusters (BGCs), which
are groups of genes that work together to coordinate
biosynthesis, transport, regulation, and, when
required, self-resistance mechanisms against the
compound they produce. The Minimum Information
about a Biosynthetic Gene cluster (MIBiG) provides a
standardized framework for annotating gene groups,
enabling the systematic classification of genes
according to their role (Figure 1A) 4. The genes for
biosynthesis in BGCs can be of two types, either “core”
or “additional”, each playing distinct roles in metabolite
production. Core biosynthetic genes encode the
enzymes directly responsible for constructing the
backbone of the metabolite, which can then be
chemically modified by enzymes encoded by additional
biosynthetic genes. In some cases, additional
biosynthetic genes also encode enzymes involved in
metabolic pathways that generate precursors and
building blocks used by core biosynthetic enzymes,
ensuring the timely supply of substrates to
construct the metabolite’s primary structure 5,6. The
presence of genes for building block biosynthesis can
also help predict structural features of compounds
associated with cryptic BGCs, thereby facilitating the
discovery of novel natural products 7. Coordinated
expression in BGCs also includes genes encoding
transport-related proteins with diverse functions such
as i) secreting the natural product to fulfill its role in the
surrounding environment, ii) importing the metabolite
for intracellular processes, iii) providing a resistance
mechanism when the metabolite is toxic to the producer
8,9, iv) importing building blocks 10, and v) importing
signaling molecules to trigger or repress the expression
of the BGC.
The interplay between biosynthetic and transport-
related genes illustrates how bacteria tightly regulate
their secondary metabolism. By definition, specialized
metabolites are produced in response to specific
environmental cues and/or at a precise stage of
bacterial growth or developmental program to meet
particular needs 11,12. Consequently, the expression of
BGCs is subject to intricate multilevel transcriptional
13–15 and post-transcriptional controls 16, reflecting both
the diverse ecological niches in which microorganisms
evolve and the evolutionary pressures they face.
Understanding the regulation of BGCs is essential for
multiple reasons. Since BGC-associated compounds
often contribute to microbial competition and survival,
identifying the signals and molecular mechanisms that
control their expression provides insight into how
bacteria interact with their environment. To ensure
bioRxiv preprint doi: https://doi.org/10.1101/2025.03.18.644061; this version posted March 19, 2025. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
proper spatiotemporal production of bioactive natural
products, BGCs frequently include genes encoding
transcription factors (TFs) that regulate the expression
of the BGC they reside in, referred to as cluster-situated
and/or pathway-specific regulators 17. In addition to
these specialized TFs, BGC expression is also directly
influenced by pleiotropic and global TFs, which exert
broader control over multiple biological processes 13–
15,18,19. Global regulators often act as molecular bridges,
linking environmental cues, such as host-
related signals, to the production of metabolites,
including those involved in host colonization and
virulence 20. From an applied perspective, many
bioactive compounds are produced inefficiently in their
native hosts. Modulating BGC regulation can enhance
yields, making large-scale production more cost-
effective. Additionally, many bacteria harbor silent or
weakly expressed BGCs that remain inactive under
standard laboratory conditions. Understanding their
regulation allows researchers to activate these clusters,
unlocking novel secondary metabolites with potential
pharmaceutical applications 21,22. With more than 2,000
bacterial BGCs experimentally characterized 23,
substantial data now exist to deepen our understanding
of how BGC expression is controlled. Key questions
remain about the functional gene categories targeted by
global TFs within a BGC, the positional distribution of
transcription factor binding sites (TFBSs) across the
different types of genomic regions, and the interaction
strength between TFs and their binding sites. In this
meta-analysis, we collected experimental datasets of
hundreds of TFBSs to shed light on the general and
specific features of the regulatory architecture
governing the expression control of bacterial BGCs.
Figure 1. Occurrence of TFBSs according to the gene functional categories in BGCs. A. Functional categories and the roles of genes
in BGCs. The gene functions are color-coded according to the scheme applied across MIBiG entries. B. Percentage of gene functional
categories targeted by both global and BGC-situated TFs in all BGCs. C. Percentage of gene functional categories targeted by BGC-
situated TFs. D. Percentage of gene functional categories targeted by global TFs for all BGCs (left panel), for BGCs that include pathway-
specific TFs (center panel), and for BGCs that do not include pathway-specific TFs (right panel).
Functional targeting: gene functions in BGCs
targeted by transcription factors
Coordinated expression of all gene types within a BGC
makes sense for several reasons. First, overexpression of core
biosynthetic enzymes alone is ineffective if the uptake
or synthesis of their substrates by transporters or
additional biosynthetic enzymes is not simultaneously
enhanced. Second, failing to concomitantly enhance
the expression of export-related genes could lead to
intracellular metabolite accumulation, potentially
overwhelming the organism’s self-resistance
mechanisms. But how do global TFs coordinate the
expression of all genes within BGCs? Do they evenly
distribute TFBSs across all transcriptional units, or do
they primarily target the BGC-situated TF, which
typically governs the entire cluster’s expression? To
address this, we analyzed experimental datasets of
TFBSs identified within BGCs using techniques that
provide direct evidence of TF-TFBS interactions, such as
DNase footprinting assays, Electrophoretic Mobility Shift Assays
(EMSA), Chromatin ImmunoPrecipitation on Chip (ChIP-
on-chip), and/or Chromatin ImmunoPrecipitation
sequencing (ChIP-seq) assays. A literature
survey of 91 TFs (38 global and 53 pathway-specific TFs)
associated with the control of 75 BGCs across 17
bacterial genera allowed us to collect 328 TFBSs. The
distribution of TFBSs according to the different
functional categories in BGCs is presented in Figure 1B.
Overall, the binding sites of TFs are most frequently
associated with the genes of the regulatory functional
category (36%), followed by additional biosynthetic
genes (31%), core biosynthetic genes (22%), genes with
“other or unknown” functions (7%), and finally genes for
transport-related proteins (4%) (Figure 1B). When
analyzing only the TFBSs of cluster-situated TFs, the
distribution of the binding sites is very similar, with 34%,
31%, and 23% of TFBSs being associated with additional
biosynthetic genes, regulatory genes, and core
biosynthetic genes, respectively (Figure 1C). Among the
31% of TFBSs associated with regulatory genes, 38% were located near the
gene encoding the TF that binds them, suggesting autoregulation of their
own expression via positive or negative feedback loops,
depending on whether they function as activators or
repressors. The remaining 62% of these TFBSs target genes encoding other
regulatory proteins, which confirms the
existence of regulatory cascades, where several TFs
within a BGC occupy distinct hierarchical roles to
coordinate the expression control of the entire cluster 19.
Interestingly, when considering only global TFs, the
proportion of TFBSs targeting the regulatory category
increases to 42% (Figure 1D, left panel). However, not all
BGCs contain a dedicated pathway-specific TF. When
we reanalyzed only BGCs that include a regulatory
protein, the bias toward targeting regulatory genes
became even more pronounced, rising to 57% (Figure
1D, center panel). This enrichment is especially striking
given the relatively low proportion of regulatory genes
compared to biosynthetic genes within BGCs. These
findings suggest a “one-for-all” regulatory strategy:
since pathway-specific regulators often activate the
entire BGC 17, targeting the “regulation” functional
category allows global TFs to exert comprehensive
transcriptional control over the whole cluster. When
instead there are no regulatory proteins in a BGC, TFBSs
of global TFs are almost equally associated with
additional biosynthetic genes (39%) and core
biosynthetic genes (37%), followed by genes with other
or yet unknown functions (24%) (Figure 1D, right panel).
Overall, our results highlight a preferential regulatory
mechanism in which global TFs primarily target pathway-
specific regulators when present but shift their focus to
biosynthetic genes in their absence.
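The category shares reported in this section amount to a simple tally over TFBS records. The sketch below illustrates that tally in Python under stated assumptions: the record layout (one `(tf_class, gene_category)` pair per TFBS), the function name `category_shares`, and the toy records are all hypothetical and do not reproduce the curated dataset.

```python
from collections import Counter


def category_shares(records, tf_class=None):
    """Return {gene_category: percentage of TFBSs} for a list of records.

    records  : iterable of (tf_class, gene_category) tuples (assumed layout)
    tf_class : optional filter, e.g. "global" or "cluster-situated"
    """
    kept = [cat for cls, cat in records if tf_class is None or cls == tf_class]
    if not kept:
        return {}
    counts = Counter(kept)
    total = len(kept)
    return {cat: 100.0 * n / total for cat, n in counts.items()}


# Toy records for illustration only; these are NOT the data of the study.
toy_records = [
    ("global", "regulatory"),
    ("global", "regulatory"),
    ("global", "core biosynthetic"),
    ("global", "additional biosynthetic"),
    ("cluster-situated", "additional biosynthetic"),
    ("cluster-situated", "regulatory"),
]

shares_global = category_shares(toy_records, tf_class="global")
shares_all = category_shares(toy_records)
```

Restricting the same tally to BGCs with or without a pathway-specific TF would only require an additional field per record and a second filter.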
Organization of genes targeted by transcription
factors
Bacterial genes involved in the same biological process
are often organized into operons or co-expressed
transcription units (TUs) within a cluster. Consequently,
a TFBS associated with one gene can influence
downstream genes transcribed in the same direction,
ensuring synchronized expression. This organization
enables bacteria to regulate entire biosynthetic
processes in a coordinated manner, and understanding
operon structures helps predict how regulatory inputs
affect multiple genes simultaneously. Using existing
expression data (e.g., RT-PCR), we determined the
number of genes across the various functional
categories that are co-transcribed with the gene where
TFBSs were identified. Among the 400 genes retrieved
from 69 BGCs, we identified 151 TUs, 80 (53%) being
monocistronic genes and 71 (47%) organized in operons
(Figure 2A). The number of genes included in an operon
varied from 2 to a maximum of 19 genes (Figure 2B).
Regulatory genes are the most abundant monocistronic
genes (Figure 2A), as they often require precise and
independent control, separate from biosynthetic
operons. Monocistronic transcription allows finely
tuned expression in response to environmental signals
before affecting other genes in the cluster. As noted
above (Figure 1C), many pathway-specific TFs autoregulate their
own expression through positive or negative feedback.
This mechanism is more efficient when the TF is
transcribed independently, enabling a rapid response to
fluctuations in metabolite levels or environmental cues.
In contrast, additional biosynthetic genes belong to the
functional category most commonly found within
operons (Figure 2A). Many of these genes encode
tailoring enzymes that introduce structural
modifications essential for the final bioactive
compound. Organizing them in operons alongside core
biosynthetic genes ensures their coordinated
expression, preventing the accumulation of incomplete
intermediates and optimizing metabolic efficiency.
These findings highlight distinct transcriptional
strategies within BGCs: regulatory genes often function
independently for precise control, while additional
biosynthetic genes are co-transcribed in operons to
ensure progression through the whole biosynthetic pathway.
The data collected allowed us to assess the typical distance
between adjacent genes within operons of BGCs.
The median distance between the end and start of two
adjacent co-transcribed genes is 19 nucleotides (nt),
with the inter-quartile range (the distance of 50% of start
codons from the upstream stop codon) lying between
nucleotide positions -4 and +66 nt relative to the
translation stop codon (Figure 2C). Interestingly, there is
a frequency peak at position -4, with nearly 25% of all
collected distances between adjacent co-transcribed
genes occurring at this specific position. The four
quartile groups, ranging from -89 to +164 nt (the
lower and upper limits, respectively), define the distance
between most (90%) co-transcribed genes in BGCs.
Only 24 (10%) outlier values were found, including an
exceptional case of a 1441 nt intergenic distance due to
the insertion of an integrase gene (in the opposite direction)
within a TU 24. These distance frequency
distributions provide useful information for predicting
TU organization.
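As an illustration of how such a distance window could support TU prediction, the following Python sketch groups adjacent same-strand genes into candidate TUs when their distance falls within the -89 to +164 nt limits reported above. It is a heuristic under stated assumptions, not the procedure used in this study, where co-transcription was taken from expression data. The function names, the tuple layout, and the toy coordinates are hypothetical; the distance formula is the one given in the Figure 2 legend, and genes are assumed to be listed in the direction of transcription.

```python
def intergenic_distance(upstream_end, downstream_start):
    """Distance as written in the Figure 2 legend (start - end + 1)."""
    return downstream_start - upstream_end + 1


def group_into_tus(genes, lower=-89, upper=164):
    """Group adjacent genes into candidate transcription units (TUs).

    genes : list of (name, strand, start, end) tuples, ordered along the
            direction of transcription (hypothetical input layout).
    lower, upper : distance window in nt; defaults are the limits
            reported in the text for most co-transcribed genes.
    """
    tus = []
    for name, strand, start, end in genes:
        if tus:
            _, prev_strand, _, prev_end = tus[-1][-1]
            dist = intergenic_distance(prev_end, start)
            # Same strand and distance inside the window: extend the current TU.
            if strand == prev_strand and lower <= dist <= upper:
                tus[-1].append((name, strand, start, end))
                continue
        # Otherwise start a new candidate TU.
        tus.append([(name, strand, start, end)])
    return [[gene[0] for gene in tu] for tu in tus]


# Toy coordinates for illustration only (not taken from any real BGC).
toy_genes = [
    ("geneA", "+", 100, 1000),
    ("geneB", "+", 997, 2000),   # overlaps geneA, distance -2
    ("geneC", "+", 2500, 3000),  # distance 501, outside the window
    ("geneD", "-", 3010, 3500),  # close, but on the opposite strand
]
predicted_tus = group_into_tus(toy_genes)
```

A distance-only rule cannot detect internal promoters or terminators, so any grouping obtained this way would still need experimental confirmation.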
Figure 2. Organization of BGCs’ genes targeted by TFs. A. Percentage of transcription units associated with a TFBS that are either
organized as monocistronic genes or in operons. B. Number of genes and their frequency found in BGCs’ operons. C. Distribution of
intergenic distances between pairs of adjacent or overlapping genes in operons (distance = Start coordinate of downstream gene -
End coordinate of upstream gene +1). The edges in the boxplot indicate the 1st and 3rd quartiles (Q), and the center line indicates the median.
IQR, Inter-quartile range.
Positional distribution of TFBSs in BGCs
Another key question concerns the positional
distribution of TFBSs across different types of regions in
a BGC, i.e. the gene coding sequences and the
intergenic regions, the latter comprising the genes’
upstream regions and the “terminator” regions
(between two stop codons) (Figure 3A). Another region,
called “regulatory”, is also occasionally mentioned and refers
to the positions where promoters and TFBSs are likely to
be found (Figure 3A). Genes’ coding and intergenic
regions are easily determined based on the positions of
the start and stop codons. In contrast,
the boundaries of the regulatory region cannot be
predicted in advance, given that the exact position of TFBSs
relative to the transcriptional and translational start
sites varies widely for each TF, and according to the
possible presence of one or more internal promoters
within the transcriptional units they control.
The positional distribution of 328 experimentally
validated BGC-associated TFBSs reveals a strong
density in the genes’ upstream region (Figure 3B). The
TFBS distribution allowed us to determine the inter-
quartile range (the position of 50% of TFBSs), which lies between
nucleotide positions -192 and -66 relative to the
translation start codon, with the median at position -109
nt. The four quartile groups, ranging from -371 to +94 nt
(the lower and upper limits, respectively), define the
“regulatory” region of BGCs where most TFBSs (92%)
are located. Only 14 (4%) outlier TFBSs are found
upstream of the regulatory region, and another 14 downstream of
it. These results contrast with recent
genome-scale studies showing that model
bacterial TFs exhibit similar binding patterns in both
coding sequences and regulatory regions, with a
widespread presence of internal cryptic promoters
within coding sequences 25. Since BGCs often exceed
200 kb and core biosynthetic genes are also
exceptionally large and organized in operons, such
features may favor the presence of internal
promoters, and therefore internal TFBSs, in the coding
sequences. However, the positional distribution of
TFBSs according to the genes’ functional categories did
not reveal such internal positions of TFBSs, either for core
biosynthetic genes or for any other functional category
(Figure 3C).
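The quartiles, limits, and outliers quoted in this section are standard boxplot statistics. The Python sketch below shows one way to derive them from a list of TFBS positions, assuming the common Tukey convention in which whiskers extend to the most extreme values within 1.5 × IQR of the quartiles. That convention is an assumption, since the text does not state which one was used; the function name and the toy positions are hypothetical.

```python
import statistics


def boxplot_summary(positions, k=1.5):
    """Quartiles, whisker limits, and outliers for TFBS positions (in nt).

    Assumes Tukey-style whiskers: the most extreme observations lying
    within k * IQR of the first and third quartiles.
    """
    q1, median, q3 = statistics.quantiles(positions, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [p for p in positions if low_fence <= p <= high_fence]
    outliers = sorted(p for p in positions if p < low_fence or p > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Toy positions relative to the start codon; NOT the data of the study.
toy_positions = [-500, -200, -190, -120, -110, -90, -70, -40, 250]
summary = boxplot_summary(toy_positions)
```

Applied to the reported quartiles (-192 and -66 nt, IQR of 126 nt), this convention would place the fences at -381 and +123 nt. The reported limits of -371 and +94 nt fall inside those fences, which is compatible with the assumption but does not confirm it.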
Figure 3. Positional distribution of TFBSs in BGCs. A. Definition of the different types of regions in a gene cluster. B. Histogram (lower
panel) and Boxplot (upper panel) showing the positional distribution of all TFBSs. The edges in the boxplot indicate the 1st and 3rd
quartiles (Q), and the median as center line. IQR, Inter-quartile range. C. Boxplots showing the positional distribution of TFBSs
according to the genes’ functional categories.
Strength assessment of the TFBS-TF interactions in
BGCs
Finally, we aimed to compare the strength of interaction
between a global TF and its TFBSs in BGCs versus those
associated with genes outside BGCs, which are
involved in broader biological processes and conserved
across organisms. The rationale behind this
comparison is that TFBSs of core regulon genes
typically exhibit nucleotide sequences optimized for
specific and tight recognition. In contrast, in BGCs,
whose presence is often limited to a series of species
within a bacterial genus, TFBSs may be less tightly
bound by the TF. This reduced affinity could stem from
the need for these genes to respond to multiple
environmental signals rather than being
exclusively regulated by a single TF, leading to
deviations from the canonical TFBS sequences and
therefore reducing binding affinity. To assess interaction
strength, we used statistical models of TFBS
preferences, applying position weight matrices to score
sequence similarity to the consensus sequence
following the scoring methodology described in 26 and
calculated via the PREDetector software 27. These
scores, normalized between 0 and 1 (with 1
representing the consensus sequence), were used to
compare the strength of TFBSs within BGCs and those
associated with non-BGC genes.
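To make the idea of a normalized position weight matrix score concrete, the Python sketch below builds a log-odds matrix from aligned sites and rescales each candidate score between the lowest and highest achievable values, so that the best-scoring sequence obtains 1. This is a generic illustration and should not be read as the scoring formula of reference 26 or of PREDetector. The pseudocount, the uniform background, the function names, and the two variant sites are assumptions introduced for the example; only the 14 nt palindrome is taken from the text.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def normalized_score(pwm, seq):
    """Score seq against the PWM, rescaled to [0, 1] (1 = best possible)."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match the matrix length")
    raw = sum(column[base] for column, base in zip(pwm, seq))
    best = sum(max(column.values()) for column in pwm)
    worst = sum(min(column.values()) for column in pwm)
    return (raw - worst) / (best - worst)


# First site: the palindromic sequence quoted in the text.
# The other two sites are hypothetical variants added for illustration.
toy_sites = ["TGGGAGCGCTCCCA", "TGGGAGCGCTCCCA", "TGGGAGCGCTCCCT"]
toy_pwm = build_pwm(toy_sites)

score_consensus = normalized_score(toy_pwm, "TGGGAGCGCTCCCA")
score_variant = normalized_score(toy_pwm, "TGGAAGCGCTCCCA")  # hypothetical site
```

With this rescaling, every mismatch at a conserved position lowers the score, which is the property the comparison between BGC and non-BGC sites relies on.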
Figure 4. Strength assessment of the TFBS-TF interactions in BGCs and non-BGC genes. A. Boxplot showing the interaction
strength of TFBSs within BGCs and those associated with non-BGC genes. The edges in the boxplot indicate the 1st and 3rd quartiles
(Q), and the median as center line. B. Specific distribution of interaction strength scores of 11 global TFs. Blue and red circles are for
TFBSs within BGCs and those associated with non-BGC genes, respectively.
Overall, TFBSs within BGCs exhibit lower strength
scores compared to those outside BGCs, with median
scores of 0.53 and 0.84, respectively (Figure 4A).
However, strength score distributions vary depending on
the biological process regulated by each analyzed TF.
TFs principally involved in primary metabolism, such as
CebR (cellulose utilization) 28,29, BxlR (xylan utilization)
30, and DasR (chitin metabolism) 31–34, show higher
strength scores for TFBSs of core regulon genes involved
in carbon source utilization than for those within BGCs.
For instance, the CebR TFBSs found upstream of genes
of the cellulolytic system display the perfect
palindromic 14 nt sequence TGGGAGCGCTCCCA 20. In
contrast, CebR TFBSs associated with the thaxtomin-
production BGC in plant-pathogenic species
consistently contain at least one mutation 35. These
mutations weaken CebR binding, leading to a lower
expression fold-change of thaxtomin biosynthetic genes
upon sensing the cellotriose elicitor, compared to the
stronger response of primary metabolism genes
involved in cellulose byproduct import and
catabolism 36. The contrast in strength score distribution
is even more pronounced for the chitin utilization
regulator DasR. While TFBSs upstream of chitinase and
N-acetylglucosamine utilization genes accept a
maximum of three mismatches, those within BGCs
frequently contain up to six mismatches, further
reducing DasR binding (Figure 4B). A similar contrast in
strength scores is observed for the redox-sensitive
transcriptional regulator SoxR 37, and for TFs involved in
nutrient utilization, such as PhoP 38, GlnR 39,40, and
Zur 41–43, which regulate the uptake/utilization of phosphate, nitrogen, and zinc
sources, respectively (Figure 4B).
In contrast, TFs typically involved in morphological and/or
physiological differentiation in Streptomyces species,
such as AdpA 44,45, AfsQ1 46, Crp 47,48, and MtrA 49,50, exhibit
a more homogeneous distribution of interaction
strength scores, consistent with their role in the control of specialized
metabolite production (Figure 4B). These results
support our hypothesis that core regulon genes require
strict regulation, relying on high-affinity TFBSs and
transcriptional activation only in response to their
specific substrate. In contrast, BGC-associated genes,
which are occasionally regulated by these TFs, utilize
sub-optimal TFBSs. This reduced binding affinity
enables a more flexible expression pattern, allowing
adaptation to multiple environmental signals.
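The notions of a perfect palindrome and of mismatches relative to a reference site, used above for CebR and DasR, can be checked with a few lines of Python. The sketch below is illustrative: the helper names are hypothetical, and the variant site is an invented example rather than a sequence from the thaxtomin cluster. Only the 14 nt CebR sequence is taken from the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_perfect_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def count_mismatches(site, reference):
    """Number of positions at which site differs from reference."""
    if len(site) != len(reference):
        raise ValueError("site and reference must have the same length")
    return sum(a != b for a, b in zip(site.upper(), reference.upper()))


CEBR_BOX = "TGGGAGCGCTCCCA"        # sequence quoted in the text
variant_site = "TGGAAGCGCTCCCA"    # hypothetical single-mutation variant

palindromic = is_perfect_palindrome(CEBR_BOX)
n_mismatches = count_mismatches(variant_site, CEBR_BOX)
```

A mismatch count is only a coarse proxy for binding strength, because positions within a site do not contribute equally to recognition; a matrix-based score weights them individually.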
Conclusions
This study sheds light on the complex regulatory
architecture governing bacterial BGCs, reflecting a fine-
tuned balance between metabolic efficiency,
environmental adaptability, and genetic organization.
Through a complex interplay of global and pathway-
specific transcription factors, bacteria ensure
coordinated expression of biosynthetic, transport, and
regulatory genes, optimizing resource utilization while
maintaining flexibility in response to environmental
cues. The observed regulatory patterns of global TFs
highlight a “one-for-all” regulatory strategy in which
targeting pathway-specific TFs ensures inclusive
transcriptional control over the whole cluster.
Additionally, the variability in transcription factor
binding affinities within BGCs suggests a mechanism
that allows greater regulatory plasticity, potentially
enabling dynamic responses to shifting ecological
conditions. A comprehensive understanding of the
regulatory networks and the signaling molecules that
influence BGC expression is crucial for
activating pathways leading to novel compound
discovery. Bioinformatic tools such as MiniMotif 51
facilitate the discovery of signaling pathways that control BGC
expression by cracking the regulatory codes 21. To
reliably predict how BGCs are controlled, our study
shows that parameters other than the similarity of a
TFBS to sequences known to be bound by a TF must be
considered. The predicted interaction strength is itself a
parameter, whose importance varies according to the
flexibility of the TF to bind to degenerate sequences. The
other criteria that together improve prediction reliability
are: i) the functional category of the genes targeted by a
TF (Functional targeting), ii) the position of the TFBS for
optimal expression control (Positional distribution of
TFBSs), and iii) the gene organization, to assess the
number of genes of the BGC that might be under the
control of a single TFBS. These insights not only deepen
our understanding of BGC regulation but also provide a
foundation for future efforts to optimize culture
conditions and manipulate gene expression, paving the
way for the discovery and production of novel bioactive
compounds.
Author contributions
Conceptualization: S.R.M., S.R. – Data curation: All
authors. – Software: S.R.M. – Formal Analysis: S.R.M.,
S.R. – Visualization: S.R.M., S.R. – Supervision: S.R. –
Writing – original draft: S.R. – Writing – review & editing:
All authors.
Conflicts of interest
There are no conflicts to declare.
Acknowledgments
The work was supported by an FNRS aspirant Grant to
S.R.M., and an FNRS-PDR T.0195.23 (40013674) Grant
to Y.K.
References
1 J. Clardy and C. Walsh, Nature, 2004, 432, 829–837.
2 A. G. Atanasov, S. B. Zotchev, V. M. Dirsch and C. T.
Supuran, Nat Rev Drug Discov, 2021, 20, 200–216.
3 A. Gavriilidou, S. A. Kautsar, N. Zaburannyi, D. Krug,
R. Müller, M. H. Medema and N. Ziemert, Nat
Microbiol, 2022, 7, 726–735.
4 M. H. Medema, R. Kottmann, P. Yilmaz, M.
Cummings, J. B. Biggins, K. Blin, I. de Bruijn, Y. H.
Chooi, J. Claesen, R. C. Coates, P. Cruz-Morales, S.
Duddela, S. Düsterhus, D. J. Edwards, D. P. Fewer,
N. Garg, C. Geiger, J. P. Gomez-Escribano, A. Greule,
M. Hadjithomas, A. S. Haines, E. J. N. Helfrich, M. L.
Hillwig, K. Ishida, A. C. Jones, C. S. Jones, K.
Jungmann, C. Kegler, H. U. Kim, P. Kötter, D. Krug, J.
Masschelein, A. V. Melnik, S. M. Mantovani, E. A.
Monroe, M. Moore, N. Moss, H.-W. Nützmann, G.
Pan, A. Pati, D. Petras, F. J. Reen, F. Rosconi, Z. Rui,
Z. Tian, N. J. Tobias, Y. Tsunematsu, P. Wiemann, E.
Wyckoff, X. Yan, G. Yim, F. Yu, Y. Xie, B. Aigle, A. K.
Apel, C. J. Balibar, E. P. Balskus, F. Barona-Gómez,
A. Bechthold, H. B. Bode, R. Borriss, S. F. Brady, A. A.
Brakhage, P. Carey, Y.-Q. Cheng, J. Clardy, R. J.
Cox, R. De Mot, S. Donadio, M. S. Donia, W. A. van
der Donk, P. C. Dorrestein, S. Doyle, A. J. M.
Driessen, M. Ehling-Schulz, K.-D. Entian, M. A.
Fischbach, L. Gerwick, W. H. Gerwick, H. Gross, B.
Gust, C. Hertweck, M. Höfte, S. E. Jensen, J. Ju, L.
Katz, L. Kaysser, J. L. Klassen, N. P. Keller, J.
Kormanec, O. P. Kuipers, T. Kuzuyama, N. C.
Kyrpides, H.-J. Kwon, S. Lautru, R. Lavigne, C. Y. Lee,
B. Linquan, X. Liu, W. Liu, A. Luzhetskyy, T. Mahmud,
Y. Mast, C. Méndez, M. Metsä-Ketelä, J. Micklefield,
D. A. Mitchell, B. S. Moore, L. M. Moreira, R. Müller,
B. A. Neilan, M. Nett, J. Nielsen, F. O’Gara, H.
Oikawa, A. Osbourn, M. S. Osburne, B. Ostash, S. M.
Payne, J.-L. Pernodet, M. Petricek, J. Piel, O. Ploux, J.
M. Raaijmakers, J. A. Salas, E. K. Schmitt, B. Scott, R.
F. Seipke, B. Shen, D. H. Sherman, K. Sivonen, M. J.
Smanski, M. Sosio, E. Stegmann, R. D. Süssmuth, K.
Tahlan, C. M. Thomas, Y. Tang, A. W. Truman, M.
Viaud, J. D. Walton, C. T. Walsh, T. Weber, G. P. van
Wezel, B. Wilkinson, J. M. Willey, W. Wohlleben, G.
D. Wright, N. Ziemert, C. Zhang, S. B. Zotchev, R.
Breitling, E. Takano and F. O. Glöckner, Nat Chem
Biol, 2015, 11, 625–631.
5 P. Cimermancic, M. H. Medema, J. Claesen, K.
Kurita, L. C. Wieland Brown, K. Mavrommatis, A.
Pati, P. A. Godfrey, M. Koehrsen, J. Clardy, B. W.
Birren, E. Takano, A. Sali, R. G. Linington and M. A.
Fischbach, Cell, 2014, 158, 412–421.
6 K. D. Morgan, R. J. Andersen and K. S. Ryan, Nat.
Prod. Rep., 2019, 36, 1628–1653.
7 L. Martinet, A. Naômé, L. C. D. Rezende, D. Tellatin,
B. Pignon, J.-D. Docquier, F. Sannio, D. Baiwir, G.
Mazzucchelli, M. Frédérich and S. Rigali, Int J Mol Sci,
2023, 24, 1114.
8 E. Tenconi and S. Rigali, Curr. Opin. Microbiol., 2018,
45, 100–108.
9 J. F. Martín, J. Casqueiro and P. Liras, Current
Opinion in Microbiology, 2005, 8, 282–293.
10 A. Crits-Christoph, N. Bhattacharya, M. R. Olm, Y. S.
Song and J. F. Banfield, Genome Res, 2021, 31, 239–
250.
11 J. R. McCormick and K. Flärdh, FEMS Microbiology
Reviews, 2012, 36, 206–231.
12 E. Tenconi, M. F. Traxler, C. Hoebreck, G. P. van
Wezel and S. Rigali, Front Microbiol, 2018, 9, 1742.
13 M. Urem, M. A. Świątek-Połatyńska, S. Rigali and G.
P. van Wezel, Mol Microbiol, 2016, 102, 183–195.
14 A. Romero-Rodríguez, I. Robledo-Casados and S.
Sánchez, Biochimica et Biophysica Acta (BBA) -
Gene Regulatory Mechanisms, 2015, 1849, 1017–
1039.
15 J. F. Martín, F. Santos-Beneit, A. Sola-Landa and P.
Liras, in Stress and Environmental Regulation of
Gene Expression and Adaptation in Bacteria, John
Wiley & Sons, Ltd, 2016, pp. 257–267.
16 A. Gessner, T. Heitzler, S. Zhang, C. Klaus, R. Murillo,
H. Zhao, S. Vanner, D. L. Zechel and A. Bechthold,
Chembiochem, 2015, 16, 2244–2252.
17 Y. Yan and H. Xia, Front Microbiol, 2024, 15,
1368809.
18 X. Pei, Y. Lei and H. Zhang, World J Microbiol
Biotechnol, 2024, 40, 156.
19 H. Xia, X. Zhan, X.-M. Mao and Y.-Q. Li, World J
Microbiol Biotechnol, 2020, 36, 13.
20 B. Deflandre, N. Stulanovic, S. Planckaert, S.
Anderssen, B. Bonometti, L. Karim, W. Coppieters,
B. Devreese and S. Rigali, Microb Genom, 2022, 8,
000760.
21 S. Rigali, S. Anderssen, A. Naômé and G. P. van
Wezel, Biochem Pharmacol, 2018, 153, 24–34.
22 M. H. Medema and G. P. van Wezel, PLOS Biology,
2025, 23, e3003058.
23 M. M. Zdouc, K. Blin, N. L. L. Louwen, J. Navarro, C.
Loureiro, C. D. Bader, C. B. Bailey, L. Barra, T. J.
Booth, K. A. J. Bozhüyük, J. D. D. Cediel-Becerra, Z.
Charlop-Powers, M. G. Chevrette, Y. H. Chooi, P. M.
D’Agostino, T. de Rond, E. Del Pup, K. R. Duncan, W.
Gu, N. Hanif, E. J. N. Helfrich, M. Jenner, Y.
Katsuyama, A. Korenskaia, D. Krug, V. Libis, G. A.
Lund, S. Mantri, K. D. Morgan, C. Owen, C.-S. Phan,
B. Philmus, Z. L. Reitz, S. L. Robinson, K. S. Singh, R.
Teufel, Y. Tong, F. Tugizimana, D. Ulanova, J. M.
Winter, C. Aguilar, D. Y. Akiyama, S. A. A. Al-Salihi, M.
Alanjary, F. Alberti, G. Aleti, S. A. Alharthi, M. Y. A.
Rojo, A. A. Arishi, H. E. Augustijn, N. E. Avalon, J. A.
Avelar-Rivas, K. K. Axt, H. B. Barbieri, J. C. J. Barbosa,
L. G. Barboza Segato, S. E. Barrett, M. Baunach, C.
Beemelmanns, D. Beqaj, T. Berger, J. Bernaldo-
Agüero, S. M. Bettenbühl, V. A. Bielinski, F.
Biermann, R. M. Borges, R. Borriss, M. Breitenbach,
K. M. Bretscher, M. W. Brigham, L. Buedenbender, B.
W. Bulcock, C. Cano-Prieto, J. Capela, V. J. Carrion,
R. S. Carter, R. Castelo-Branco, G. Castro-Falcón, F.
O. Chagas, E. Charria-Girón, A. A. Chaudhri, V.
Chaudhry, H. Choi, Y. Choi, R. Choupannejad, J.
Chromy, M. S. C. Donahey, J. Collemare, J. A.
Connolly, K. E. Creamer, M. Crüsemann, A. A. Cruz,
A. Cumsille, J.-F. Dallery, L. C. Damas-Ramos, T.
Damiani, M. de Kruijff, B. D. Martín, G. D. Sala, J.
Dillen, D. T. Doering, S. R. Dommaraju, S. Durusu, S.
Egbert, M. Ellerhorst, B. Faussurier, A. Fetter, M.
Feuermann, D. P. Fewer, J. Foldi, A. Frediansyah, E.
A. Garza, A. Gavriilidou, A. Gentile, J. Gerke, H.
Gerstmans, J. P. Gomez-Escribano, L. A. González-
Salazar, N. E. Grayson, C. Greco, J. E. G. Gomez, S.
Guerra, S. G. Flores, A. Gurevich, K. Gutiérrez-
García, L. Hart, K. Haslinger, B. He, T. Hebra, J. L.
Hemmann, H. Hindra, L. Höing, D. C. Holland, J. E.
Holme, T. Horch, P. Hrab, J. Hu, T.-H. Huynh, J.-Y.
Hwang, R. Iacovelli, D. Iftime, M. Iorio, S.
Jayachandran, E. Jeong, J. Jing, J. J. Jung, Y. Kakumu,
E. Kalkreuter, K. B. Kang, S. Kang, W. Kim, G. J. Kim,
H. Kim, H. U. Kim, M. Klapper, R. A. Koetsier, C.
Kollten, Á. T. Kovács, Y. Kriukova, N. Kubach, A. M.
Kunjapur, A. K. Kushnareva, A. Kust, J. Lamber, M.
Larralde, N. J. Larsen, A. P. Launay, N.-T.-H. Le, S.
Lebeer, B. T. Lee, K. Lee, K. L. Lev, S.-M. Li, Y.-X. Li,
C. Licona-Cassani, A. Lien, J. Liu, J. A. V. Lopez, N. V.
Machushynets, M. I. Macias, T. Mahmud, M.
Maleckis, A. M. Martinez-Martinez, Y. Mast, M. F.
Maximo, C. M. McBride, R. M. McLellan, K. M. Bhatt,
C. Melkonian, A. Merrild, M. Metsä-Ketelä, D. A.
Mitchell, A. V. Müller, G.-S. Nguyen, H. T. Nguyen, T.
H. J. Niedermeyer, J. H. O’Hare, A. Ossowicki, B. O.
Ostash, H. Otani, L. Padva, S. Paliyal, X. Pan, M.
Panghal, D. S. Parade, J. Park, J. Parra, M. P. Rubio,
H. T. Pham, S. J. Pidot, J. Piel, B. Pourmohsenin, M.
Rakhmanov, S. Ramesh, M. H. Rasmussen, A. Rego,
R. Reher, A. J. Rice, A. Rigolet, A. Romero-Otero, L. R.
Rosas-Becerra, P. Y. Rosiles, A. Rutz, B. Ryu, L.-A.
Sahadeo, M. Saldanha, L. Salvi, E. Sánchez-Carvajal,
C. Santos-Medellin, N. Sbaraini, S. M. Schoellhorn,
C. Schumm, L. Sehnal, N. Selem, A. D. Shah, T. K.
Shishido, S. Sieber, V. Silviani, G. Singh, H. Singh, N.
Sokolova, E. C. Sonnenschein, M. Sosio, S. T. Sowa,
K. Steen, E. Stegmann, A. B. Streiff, A. Strüder, F.
Surup, T. Svenningsen, D. Sweeney, J. Szenei, A.
Tagirdzhanov, B. Tan, M. J. Tarnowski, B. R. Terlouw,
T. Rey, N. U. Thome, L. R. Torres Ortega, T. Tørring,
M. Trindade, A. W. Truman, M. Tvilum, D. W. Udwary,
C. Ulbricht, L. Vader, G. P. van Wezel, M. Walmsley,
R. Warnasinghe, H. G. Weddeling, A. N. M. Weir, K.
Williams, S. E. Williams, T. E. Witte, S. M. W. Rocca,
K. Yamada, D. Yang, D. Yang, J. Yu, Z. Zhou, N.
Ziemert, L. Zimmer, A. Zimmermann, C.
Zimmermann, J. J. J. van der Hooft, R. G. Linington, T.
Weber and M. H. Medema, Nucleic Acids Research,
2024, gkae1115.
24 A. R. Reeves, R. S. English, J. S. Lampel, D. A. Post
and T. J. Vanden Boom, Journal of Bacteriology,
1999, 181, 7098–7106.
25 C. Hua, J. Huang, T. Wang, Y. Sun, J. Liu, L. Huang
and X. Deng, mBio, 2022, 13, e01643-22.
26 G. Z. Hertz and G. D. Stormo, Bioinformatics, 1999,
15, 563–577.
27 S. Hiard, R. Marée, S. Colson, P. A. Hoskisson, F.
Titgemeyer, G. P. van Wezel, B. Joris, L. Wehenkel
and S. Rigali, Biochem Biophys Res Commun, 2007,
357, 861–864.
28 I. M. Francis, S. Jourdan, S. Fanara, R. Loria and S.
Rigali, mBio, 2015, 6, e02018.
29 K. Marushima, Y. Ohnishi and S. Horinouchi, J
Bacteriol, 2009, 191, 5930–5940.
30 H. Tsujibo, M. Kosaka, S. Ikenishi, T. Sato, K.
Miyamoto and Y. Inamori, J Bacteriol, 2004, 186,
1029–1037.
31 S. Rigali, H. Nothaft, E. E. E. Noens, M. Schlicht, S.
Colson, M. Müller, B. Joris, H. K. Koerten, D. A.
Hopwood, F. Titgemeyer and G. P. van Wezel, Mol
Microbiol, 2006, 61, 1237–1251.
32 S. Colson, J. Stephan, T. Hertrich, A. Saito, G. P. van
Wezel, F. Titgemeyer and S. Rigali, J Mol Microbiol
Biotechnol, 2007, 12, 60–66.
33 B. Nazari, A. Saito, M. Kobayashi, K. Miyashita, Y.
Wang and T. Fujii, FEMS Microbiol Ecol, 2011, 77,
623–635.
34 M. A. Świątek-Połatyńska, G. Bucca, E. Laing, J.
Gubbens, F. Titgemeyer, C. P. Smith, S. Rigali and G.
P. van Wezel, PLoS One, 2015, 10, e0122479.
35 F. Kerff, S. Jourdan, I. M. Francis, B. Deflandre, S.
Ribeiro Monteiro, N. Stulanovic, R. Loria and S.
Rigali, Microbiol Spectr, 2023, 11, e0197523.
36 I. M. Francis, D. Bergin, B. Deflandre, S. Gupta, J. J.
C. Salazar, R. Villagrana, N. Stulanovic, S. Ribeiro
Monteiro, F. Kerff, R. Loria and S. Rigali, Biology
(Basel), 2023, 12, 234.
37 Q. Wang, X. Lu, H. Yang, H. Yan and Y. Wen, Microb
Biotechnol, 2022, 15, 561–576.
38 J. F. Martín, F. Santos-Beneit, A. Rodríguez-García, A.
Sola-Landa, M. C. M. Smith, T. E. Ellingsen, K.
Nieselt, N. J. Burroughs and E. M. H. Wellington, Appl
Microbiol Biotechnol, 2012, 95, 61–75.
39 J. Reuther and W. Wohlleben, J Mol Microbiol
Biotechnol, 2007, 12, 139–146.
40 Y. Zhu, J. Wang, W. Su, T. Lu, A. Li and X. Pang, Microb
Biotechnol, 2022, 15, 1795–1810.
41 M. Lyu, Y. Cheng, Y. Dai, Y. Wen, Y. Song, J. Li and Z.
Chen, Appl Environ Microbiol, 2022, 88, e0027822.
42 J.-H. Shin, S.-Y. Oh, S.-J. Kim and J.-H. Roe, J
Bacteriol, 2007, 189, 4070–4077.
43 D. Kallidas, B. Pascoe, G. A. Owen, C. M. Strain-
Damerell, H.-J. Hong and M. S. B. Paget, J Bacteriol,
2010, 192, 608–611.
44 Y. Ohnishi, H. Yamazaki, J.-Y. Kato, A. Tomono and S.
Horinouchi, Biosci Biotechnol Biochem, 2005, 69,
431–439.
45 H. Yamazaki, A. Tomono, Y. Ohnishi and S.
Horinouchi, Mol Microbiol, 2004, 53, 555–572.
46 D. Shu, L. Chen, W. Wang, Z. Yu, C. Ren, W. Zhang,
S. Yang, Y. Lu and W. Jiang, Appl Microbiol
Biotechnol, 2009, 81, 1149–1160.
47 A. Derouaux, S. Halici, H. Nothaft, T. Neutelings, G.
Moutzourelis, J. Dusart, F. Titgemeyer and S. Rigali, J
Bacteriol, 2004, 186, 1893–1897.
48 C. Gao, Hindra, D. Mulder, C. Yin and M. A.
Elliot, mBio, 2012, 3, e00407-12.
49 N. F. Som, D. Heine, N. Holmes, F. Knowles, G.
Chandra, R. F. Seipke, P. A. Hoskisson, B. Wilkinson
and M. I. Hutchings, Microbiology (Reading), 2017,
163, 1415–1419.
50 Y. Zhu, P. Zhang, J. Zhang, J. Wang, Y. Lu and X. Pang,
Appl Environ Microbiol, 2020, 86, e01201-20.
51 H. E. Augustijn, D. Karapliafis, K. M. M. Joosten, S.
Rigali, G. P. van Wezel and M. H. Medema, Journal of
Molecular Biology, 2024, 436, 168558.
bioRxiv preprint doi: https://doi.org/10.1101/2025.03.18.644061; this version posted March 19, 2025. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.