Purple photosynthetic bacteria (PPB) are versatile microorganisms capable of producing various value-added chemicals, e.g., biopolymers and biofuels. They employ diverse metabolic pathways, allowing them to adapt to various growth conditions and even extreme environments. Thus, they are ideal organisms for the Next Generation Industrial Biotechnology concept of reducing the risk of contamination by using naturally robust extremophiles. Unfortunately, the potential of PPB for use in biotechnology is hampered by missing knowledge on regulations of their metabolism. Although Rhodospirillum rubrum represents a model purple bacterium studied for polyhydroxyalkanoate and hydrogen production, light/chemical energy conversion, and nitrogen fixation, little is known regarding the regulation of its metabolism at the transcriptomic level. Using RNA sequencing, we compared gene expression during the cultivation utilizing fructose and acetate as substrates in case of the wild-type strain R. rubrum DSM 467T and its knock-out mutant strain that is missing two polyhydroxyalkanoate synthases PhaC1 and PhaC2. During this first genome-wide expression study of R. rubrum, we were able to characterize cultivation-driven transcriptomic changes and to annotate non-coding elements as small RNAs.
Cultivation driven transcriptomic changes in the wild-type and mutant
strains of Rhodospirillum rubrum
Rhodospirillum rubrum
Depolymerase knock-out
Gene ontology
1. Introduction
Bacteria represent a remarkably diverse group of organisms, which is
not surprising, as, according to estimations, there are 10
to 10
prokaryotic genospecies on Earth [1]. While the majority of these
microorganisms living in a wide range of natural environments seem to be
uncultivable with current techniques, some organisms are very versatile
and can prosper under various cultivation conditions and grow on
various substrates. The ideal example is Rhodospirillum rubrum, a purple,
non-sulfur, Gram-negative facultative anaerobe from the class
Alphaproteobacteria, which was observed for the first time by Esmarch in
1887 [2]. It was shown to grow both aerobically and anaerobically. The
absence of oxygen triggers the photosynthetic apparatus to synthesize
membrane proteins, bacteriochlorophylls, and carotenoids, which turn
the culture purple. Its metabolic versatility is further supported
by the fact that it prospers under both heterotrophic and autotrophic
conditions. Besides utilizing various organic substrates such as
saccharides or organic ions, R. rubrum can fix and metabolize inorganic
compounds such as CO and CO₂ and is therefore considered a prospective
strain for valorization of waste gases such as combustion products or
syngas [3,4]. On the other hand, it is capable of producing other gaseous
products utilizable as fuels, particularly hydrogen [5]. Last but not
least, R. rubrum is a potent producer of biopolymers of the class of
polyhydroxyalkanoates (PHAs) in the form of intracellular granules from
many carbon sources. When supplemented with butyrate, for instance, it
can accumulate up to 50 w/w % of dry weight as PHA [6]. Hence, it
hosts a large variety of metabolic pathways that can be leveraged to
produce sustainable carbon substrates.
Computational and Structural Biotechnology Journal 23 (2024) 2681–2694
R. rubrum is considered to be a model strain for studying the
conversion of light energy to chemical energy [7], hydrogen biosynthesis
[8], formation of photosynthetic membranes (PM) [9], and regulatory
pathways of the nitrogen fixation system [10,11]. Therefore, it is not
surprising that there are currently 10 genome assemblies of several
R. rubrum strains in the GenBank database (accessed March 8th, 2024),
including the genome presented in this study. The genes involved in
various metabolic pathways are therefore known. Apart from the synthetic
pathways mentioned above, PHA synthesis occurs, and three genes
coding for PHA synthases were identified in the R. rubrum genome [12].
While one of these genes is part of the biosynthetic phaCAB operon, the
other two are located separately at different positions in the genome. The
main challenges related to R. rubrum and its capacity to produce
value-added chemicals are a low specific growth rate, low biomass and
PHA volumetric productivity, and the need for an organic co-substrate to
increase the productivity of the autotrophic pathways in the case of PHA
synthesis. These hurdles could be overcome by adopting genome
engineering strategies, such as one already applied to R. rubrum that
consists of overexpressing genes coding for PHA synthases [13].
Despite relatively good genome characterization and even the
availability of various mutant strains, little is known about gene
regulation in R. rubrum, as studies exploring gene expression on a
genome-wide scale using RNA sequencing (RNA-Seq) are missing and only two
studies based on microarrays are available [14,15]. Thus, in our study,
we compared R. rubrum transcriptomes from cultivation on fructose and
on acetate to describe basic changes in various pathways observed during
the cultivation or between substrates. Moreover, to explore the stability
of expression in engineered strains, we also analyzed transcriptomes of a
mutant strain with two PHA synthases deleted and compared them to those
of the wild-type strain on both substrates. Furthermore, as the genome
assembly of the type strain was created a relatively long time ago, we
re-sequenced the genome of R. rubrum DSM 467T, which was used for the
experiments, to account for the influence of possible mutations that
might have accumulated over time. We also sequenced a ΔphaC1ΔphaC2 mutant
strain to confirm the deletions of PHA synthases and to capture other
genomic changes. Additionally, we improved the genome annotation by
small RNA (sRNA) gene inference using RNA-Seq data.
2. Material and methods
2.1. Growth conditions and experiments
A freeze-dried bacterial culture of the type strain Rhodospirillum
rubrum DSM 467T (WT strain) was purchased from the Leibniz Institute
DSMZ-German Collection of Microorganisms and Cell Cultures,
Braunschweig, Germany. The double mutant R. rubrum ΔphaC1ΔphaC2
knock-out (KO) strain was obtained from the team of Professor Kevin E.
O'Connor and Professor Tanja Narancic (University College Dublin,
Ireland). This PHA-negative mutant was designed and constructed as
reported previously in [16].
The rst cultivation step was incubation of both wild-type and
mutant strains on LB broth (tryptone 10.0 g/L, yeast extract 5.0 g/L,
NaCl 5.0 g/L) in Petri dishes at 30 C in the dark for ve days. The
second part of cultivation was inoculated by loop of the bacterial culture
from Petri dishes and the cultivation was performed in 500 mL Erlen-
meyer asks containing 100 mL of LB broth at 30 C in the dark under
shaking at 160 rpm for approximately 72 h till OD
=1.5. During the
main part of the cultivation, the cultures were inoculated to OD
by culture grown in liquid LB broth and cultivated in SYN medium. Its
composition for 1 L of medium was: 250 mg of MgSO
7 H
O, 132 mg of
2 H
O, 10 g of NH
Cl, 21 g of MOPS buffer, 10 mL NiCl
(20 µM),
100 mL of a chelated iron-molybdenum solution (0.28 g H
, 2.1 g of
EDTA, 0.7 g of FeSO
7 H
O and 0.1 g of Na
per Liter of
distilled water). In this medium, two carbon sources were used: 2.4 mL
of 1.5 M fructose (18 mM in total) with 4 mL of 50 g/L yeast extract (1 g/
L in total) and 11 mL of 1 M acetate (55 mM in total) with 4 mL of 50 g/L
yeast extract (1 g/L in total). They will be referred to as FY and AY,
respectively. Cultivation was performed in triplicates in 200 mL of
medium in 1 L Erlenmeyer ask at 30 C in the dark under 160 rpm until
the last sample of culture in stationary phase was taken. All the culti-
vations were performed under aerobic conditions.
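The totals quoted for the carbon sources follow from the dilution relation C1·V1 = C2·V2. The short sketch below reproduces them; it assumes that the final volume is the 200 mL working volume per flask (the small volume added by the stock solutions is neglected), which is an inference from the quoted numbers rather than something the text states. The function name is illustrative only.

```python
def final_concentration(stock_concentration, stock_volume_ml, final_volume_ml):
    """Concentration after dilution, from C1 * V1 = C2 * V2."""
    return stock_concentration * stock_volume_ml / final_volume_ml


# Assumption: the final volume is the 200 mL working volume per flask.
FINAL_VOLUME_ML = 200.0

fructose_mM = final_concentration(1500.0, 2.4, FINAL_VOLUME_ML)  # 1.5 M stock, in mM
acetate_mM = final_concentration(1000.0, 11.0, FINAL_VOLUME_ML)  # 1 M stock, in mM
yeast_extract_g_per_l = final_concentration(50.0, 4.0, FINAL_VOLUME_ML)  # 50 g/L stock

print(f"fructose: {fructose_mM:.1f} mM")
print(f"acetate: {acetate_mM:.1f} mM")
print(f"yeast extract: {yeast_extract_g_per_l:.1f} g/L")
```

Under this assumption the calculation returns 18 mM fructose, 55 mM acetate, and 1 g/L yeast extract, in agreement with the values given for the FY and AY media.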
During the cultivation of wild-type R. rubrum (WT) and R. rubrum
knock-out (KO), samples were taken at three time-points, at different
growth phases. At each sampling time-point, the biological triplicates
were spectrophotometrically screened at λ = 660 nm and used for
further DNA and RNA analysis. The culture-specific growth rates (µmax)
and doubling times (Td) were calculated using optical density data. The
OD660 measurements for calculating µmax were taken every hour. The µmax
values were calculated using a standard method: OD values were
transformed into ln(OD660), and the linear part of the ln(OD660) vs.
time curve was used for regression analysis to determine µmax.
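The regression described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that the points passed in already belong to the linear (exponential) part of the curve, uses an ordinary least-squares slope, and derives the doubling time as ln(2)/µ. The function names and the synthetic data are hypothetical.

```python
import math


def fit_growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) versus time, i.e. the specific growth rate in 1/h."""
    log_od = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_od))
    return sxy / sxx


def doubling_time(growth_rate):
    """Doubling time in hours for a given specific growth rate."""
    return math.log(2) / growth_rate


# Synthetic hourly readings generated with a known rate of 0.094 1/h.
times = [0, 1, 2, 3, 4, 5]
ods = [0.1 * math.exp(0.094 * t) for t in times]

mu = fit_growth_rate(times, ods)
print(f"mu = {mu:.3f} 1/h, Td = {doubling_time(mu):.2f} h")
```

With real data, the choice of which time-points count as the linear part has a direct effect on the fitted slope, so that selection should be reported together with the rate.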
2.2. Transmission electron microscopy
Cultures of wild-type R. rubrum cultivated on both fructose and
acetate as carbon sources were fixed using the high-pressure freezing
method. Samples were pipetted on the 200 µm side of a 3 mm copper-gold
carrier type A, which was closed using the flat side of the type B carrier.
Both carriers were pretreated with a 1 % solution of lecithin in
chloroform. Vitrification of the samples was performed using a
high-pressure freezer EM ICE (Leica Microsystems, Austria). Frozen samples
were transferred under liquid nitrogen into a freeze substitution unit (EM
AFS2, Leica Microsystems, Austria). The substitution solution contained
1.5 % OsO₄ in acetone and the protocol was set as previously described
in [17]. Freeze substitution was followed by resin embedding (Epoxy
Embedding Medium kit, Sigma Aldrich, Germany) and curing at 62 °C
for 48 h. Cured samples were cut into ultrathin sections (~75 nm) using a
diamond knife (ultra 45, DiATOME, Switzerland) and an ultramicrotome
(EM UC7, Leica Microsystems, Vienna, Austria). Since the ultrathin sections
were imaged using a low-voltage transmission electron microscope
(LVEM 25, Delong Instruments, Czech Republic) at an electron beam voltage
of 25 kV, no post-staining procedure was necessary to achieve
sufficient contrast [18].
2.3. DNA and RNA extraction and sequencing
High molecular weight genomic DNA of the WT strain for long-read
sequencing was extracted using a MagAttract HMW DNA kit (Qiagen,
NL) in accordance with the manufacturer's instructions. The DNA purity
was checked using NanoDrop (Thermo Fisher Scientific, USA), while the
concentration was measured using a Qubit 4.0 Fluorometer (Thermo
Fisher Scientific, USA), and the length was measured using an Agilent 4200
TapeStation (Agilent Technologies, USA). The Ligation Sequencing 1D
Kit (Oxford Nanopore Technologies, UK) was used for library
preparation, and the sequencing was performed using an R9.4.1 Flow Cell on
the Oxford Nanopore Technologies (ONT) MinION platform.
Genomic DNA of the WT and KO strains for high-throughput
short-read sequencing was extracted using the GenElute Bacterial Genomic
DNA Kit (Sigma-Aldrich, USA) in accordance with the manufacturer's
instructions. Sequencing libraries were prepared using the KAPA
HyperPlus kit, and sequencing was carried out using the MiSeq Reagent kit
v2 (500 cycles) on the MiSeq platform (Illumina, USA).
RNA isolation followed an optimized extraction protocol consisting
of a combination of procedures focused on RNA isolation, where the
crucial point was the addition of 1 mL of TRIzol per 40 mg of wet biomass
followed by incubation for 5 min to permit complete dissociation of the
nucleoprotein complex. Then, 0.1 mL of 1-bromo-3-chloropropane per
1 mL of TRIzol Reagent used for cell lysis was added, and the tube was
securely capped and incubated for 2–3 min at room temperature.
Afterwards, the samples were centrifuged at 11,000 × g at 4 °C for 15 min.
Subsequently, the supernatant containing the RNA was transferred to a
new tube where 70 % (v/v) EtOH was added at a ratio of 1:1. Then, the
K. Jureckova et al.
samples were transferred to spin columns and the procedure continued
according to the manual of the NucleoSpin RNA Plus isolation kit
(Macherey-Nagel), with washing and drying steps of the silica membrane
and elution of the RNA, which was stored at −80 °C until sequencing.
Ribodepletion was performed with the QIAseq FastSelect 5S/16S/23S Kit
(Qiagen, NL) for WT samples cultivated on fructose (no. 1–9, see
Supplementary Table S1) or with the RiboCop rRNA Depletion Kit for
Bacteria Mixed bacterial samples (Lexogen, AT) for the remaining samples
(no. 10–33). Strand-specific sequencing libraries were prepared with the
NEBNext Ultra II Directional RNA Library Prep Kit (New England
Biolabs, USA) and sequenced with Illumina NextSeq550 to produce
reversely stranded reads. For samples no. 10–33, Unique Molecular
Identifiers (UMIs) were added using xGen Duplex Seq Adapters (IDT).
2.4. Genome assembly
Nanopore sequencing data of the wild-type strain were basecalled using
Guppy (v3.4.4), and the data quality was checked using PycoQC (v2.2.3,
[19]). De novo assembly by Flye (v2.8.1, [20]) was performed, and the
obtained contigs were polished using Minimap2 (v2.17, [21]) combined
with Racon (v1.4.13, [22]); final polishing was then performed by
Medaka (v1.1.2).
Both WT and KO strains were sequenced using Illumina MiSeq. The
obtained reads were first checked for quality using FastQC (v0.11.5)
combined with MultiQC (v1.7, [23]); second, the low-quality ends
of reads together with sequencing adapters were removed by
Trimmomatic (v0.36, [24]) with the following settings: ILLUMINACLIP:TruSeq3-PE.
LEN:36. Next, the set of reads was checked for contamination by human
DNA, and detected reads were removed using BBMap (v39.01) and the
human genome GCF_000001405.40 available in the NCBI RefSeq
database. In the case of the WT strain, the reads were mapped to the
nanopore contigs using BWA (v0.7.17, [25]), and Samtools (v1.10, [26])
was employed for managing the obtained assembly. Final assembly
polishing was done with Pilon (v1.24, [27]). In the last step, the assembled
genome and plasmid were rearranged so that the dnaA gene was the first
gene in the genome and repB was the first gene in the plasmid. For the
KO strain assembly, BWA and Samtools were used again, but in this case,
the pre-processed reads were mapped to the previously assembled WT
genome.
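Rearranging a circular replicon so that a chosen gene comes first amounts to rotating the sequence and shifting all coordinates by the same offset. The sketch below illustrates only that idea; it assumes 0-based coordinates and a start gene on the forward strand (a gene on the reverse strand would additionally require reverse-complementing the sequence). The function names are hypothetical and are not taken from any tool named in the text.

```python
def rotate_circular(sequence, new_start):
    """Rotate a circular sequence so that 0-based index new_start becomes position 0."""
    if not sequence:
        return sequence
    offset = new_start % len(sequence)
    return sequence[offset:] + sequence[:offset]


def shift_coordinate(position, new_start, length):
    """Map an old 0-based coordinate to its coordinate in the rotated sequence."""
    return (position - new_start) % length


# Toy replicon in which the gene of interest starts at index 3.
toy = "AAACCCGGGTTT"
rotated = rotate_circular(toy, 3)
print(rotated)
print(shift_coordinate(0, 3, len(toy)))
```

Annotation coordinates have to be shifted with the same offset as the sequence; otherwise features spanning the old origin end up misplaced.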
Variant calling was conducted to find differences between the two
analyzed strains (WT and KO). For this purpose, GATK4 (v4.3) was used.
Underrepresented variants, variants with low coverage, and false
positive calls were filtered out, and the remaining ones underwent
additional analysis to ascertain their presence within coding regions and,
if applicable, their impact on the phenotype.
2.5. Genome annotation
NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [28] was
used for the wild-type strain chromosome and plasmid annotation.
Functional annotation of protein coding genes was performed by
classifying them into clusters of orthologous groups (COG) categories from
the eggNOG database via eggNOG-mapper (v2.1.6, [29]).
DNAPlotter [30], which is integrated into Artemis (v18.2.0, [31]), was
used to create the chromosome and plasmid circular maps including GC
content and GC skew plots, which were calculated in a window of size
10,000 bp with a 200 bp step. Clustered regularly interspaced short
palindromic repeats (CRISPR) arrays were searched by the CRISPRDetect
tool (v2.4, [32]), and Cas genes were manually searched in the annotated
genome. PhiSpy (v4.2.6, [33]) and the online tools Prophage Hunter and
Phaster were used for prophage DNA identification. The
restriction-modification systems were located using the REBASE database [34].
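For orientation, the windowed GC content and GC skew can be sketched as below, using the window and step sizes given above as defaults. This is an illustrative reimplementation, not DNAPlotter's code: it assumes the common definitions GC content = (G + C) / window length and GC skew = (G − C) / (G + C), and it treats the sequence as linear, so windows wrapping around the origin of a circular replicon are not produced.

```python
def gc_windows(sequence, window=10_000, step=200):
    """Return (start, gc_content, gc_skew) for each full window along the sequence."""
    seq = sequence.upper()
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        # The skew is undefined without any G or C; 0.0 is reported in that case.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        rows.append((start, gc_content, gc_skew))
    return rows


# Toy sequence with small window and step sizes for readability.
for start, content, skew in gc_windows("GGCA" * 5, window=4, step=4):
    print(start, round(content, 2), round(skew, 2))
```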
The GenBank database was searched for genomes of R. rubrum using
Entrez [35]. Then Roary (v3.13, [36]) was used to identify the core
genome, which consists of genes present in all analyzed strains, the
accessory genome formed by genes present in at least two strains but
not in all of them, and the number of unique genes. The minimum
percentage identity to assess whether two genes are similar was 95 %,
and all other parameters for Roary were left at their default settings.
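The three gene categories defined above can be derived from a gene presence/absence table. The following sketch applies exactly the definitions given in the text (all strains, at least two but not all, exactly one); it does not reproduce Roary's clustering, and the input structure, function name, and toy strain and gene labels are assumptions made for illustration.

```python
def classify_pangenome(presence, strains):
    """Split genes into core, accessory, and unique sets.

    presence maps a gene name to the set of strains carrying it.
    """
    strain_set = set(strains)
    groups = {"core": set(), "accessory": set(), "unique": set()}
    for gene, carriers in presence.items():
        n = len(set(carriers) & strain_set)
        if n == len(strain_set):
            groups["core"].add(gene)
        elif n >= 2:
            groups["accessory"].add(gene)
        elif n == 1:
            groups["unique"].add(gene)
    return groups


# Toy example with three hypothetical strains.
strains = ["s1", "s2", "s3"]
presence = {
    "geneA": {"s1", "s2", "s3"},
    "geneB": {"s1", "s3"},
    "geneC": {"s2"},
}
groups = classify_pangenome(presence, strains)
print({name: sorted(genes) for name, genes in groups.items()})
```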
2.6. Transcriptomic analysis
The raw RNA-Seq reads for both strains were checked for their
quality using FastQC and MultiQC. Next, read trimming was
performed to discard low-quality bases and adapters using Trimmomatic
with the following parameters for samples no. 10–33: ILLUMINACLIP:
TruSeq3-SE.fa:2:30:10 LEADING:3 TRAILING:3
SLIDINGWINDOW:4:15 MINLEN:36; for samples no. 1–9, the HEADCROP
parameter was added with a value of 5, because the first five bases of the
reads contained randomized 5 bp long adapters. Remaining contamination by
rRNA was removed using SortMeRNA (v4.3.4, [37]) together with the
default database (smr_v4.3_default_db.fasta) and the SILVA database
[38] with known 16S and 23S rRNA sequences. Processed reads were
mapped to the genome of the wild-type strain using STAR (v2.7.10a,
[39]), and mapped reads were deduplicated using UMI-tools (v1.1.4,
[40]). Finally, the reads were counted using the featureCounts function
[41] from the Rsubread package (R/Bioconductor). Read counting considered
two options: uniquely mapped reads and multimapping reads. For the
multimapping reads, the contribution of each read to the final count
was always divided by the number of genomic loci to which the read was
mapped; therefore, the total number of reads remained unchanged.
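The fractional treatment of multimapping reads can be illustrated with a few lines of code. The sketch below is not the featureCounts implementation; it assumes that each read is given as the list of features hit at its mapped loci and assigns each locus a weight of 1 divided by the number of loci. The read and gene identifiers are made up.

```python
from collections import defaultdict


def fractional_counts(alignments):
    """Count reads per feature, splitting each read equally among its mapped loci.

    alignments maps a read id to the list of features hit, one entry per locus.
    """
    counts = defaultdict(float)
    for features in alignments.values():
        if not features:
            continue  # unassigned reads contribute nothing
        weight = 1.0 / len(features)
        for feature in features:
            counts[feature] += weight
    return dict(counts)


# r1 and r3 map uniquely; r2 maps to two loci and contributes 0.5 to each.
alignments = {"r1": ["geneA"], "r2": ["geneA", "geneB"], "r3": ["geneB"]}
counts = fractional_counts(alignments)
print(counts)
```

Because the weights of each read sum to one, the column total equals the number of assigned reads, which is the property the text refers to.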
The count tables created for all samples were further normalized by
calculating RPKM (Reads Per Kilobase per Million mapped reads) and by
using the built-in function of the R/Bioconductor package DESeq2 [42],
which was also used for differential expression analysis. Normalized count
tables were used for dimension reduction using Barnes-Hut t-SNE
implemented in the R package Rtsne [43]. The results were visualized using
the ggplot2 [44] R package, which was also employed for the creation of
Volcano plots. Gene ontology (GO) enrichment analysis was performed with
the topGO [45] R/Bioconductor package together with a GO map that was
created with a custom Python script based on the available GO annotation
of the wild-type strain in the NCBI RefSeq database (NZ_CP077803.1 for the
chromosome and NZ_CP077804.1 for the plasmid). Finally, the expression
profiles of selected genes were visualized using heatmaps created using
the R packages gplots, RColorBrewer, magick and openxlsx.
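As a reminder of what the RPKM normalization does, the sketch below applies the standard definition, count × 10^9 / (gene length in bp × total mapped reads). The helper names and the toy numbers are illustrative; the threshold of 1 in the second helper is passed in as a parameter rather than hard-coded.

```python
def rpkm(count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads for a single gene."""
    return count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_silent(rpkm_values, threshold=1.0):
    """True if a gene stays below the RPKM threshold in every sample."""
    return all(value < threshold for value in rpkm_values)


# A 1,000 bp gene with 500 reads in a library of 10 million mapped reads.
value = rpkm(500, 1_000, 10_000_000)
print(value)
print(is_silent([0.2, 0.0, 0.7]))
```

RPKM corrects for gene length and library size only, which makes it convenient for ranking genes within a sample; between-sample testing in the text relies on the DESeq2 normalization instead.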
To gain insight into the regulatory processes of R. rubrum, mapped
reads were further used for non-coding RNA (ncRNA) prediction with
baerhunter (v0.9, [46]). The sample depths were normalized using the
sizeFactors function from the DESeq2 package and visualized using
boxplots. Subsequently, to obtain a count table of a collapsed annotation
file with newly predicted features, baerhunter's count_features script,
which uses the featureCounts function, was applied. Furthermore, attention
was paid to putative sRNAs, and their length distribution was visualized
by a histogram. These predictions were also categorized as trans- or
cis-acting elements using the IRanges R package [47]. Finally, differential
expression analysis was performed on the count table using DESeq2.
Counts of the differentially expressed (up or down) sRNAs between
specific conditions were visualized by a bar chart using the R package
ggplot2, which was used for all of the graphs mentioned above.
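Size factors in DESeq2 are estimated with the median-of-ratios method, which the following sketch illustrates in plain Python. It is a simplified reimplementation for explanation only: each sample's count is divided by the gene's geometric mean across samples, genes with a zero count in any sample are skipped, and the median ratio per sample is taken as its size factor. The function name and the toy matrix are assumptions.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts is a list of genes, each a list of raw counts with one value per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue  # the geometric mean is zero for such genes, so they are skipped
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(ratios)) for ratios in log_ratios]


# The second sample is sequenced twice as deep as the first one.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
factors = size_factors(toy_counts)
print([round(f, 3) for f in factors])
```

Dividing each sample's counts by its size factor puts the samples on a common scale; in the toy example the two factors differ by a factor of two, mirroring the difference in depth.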
3. Results
3.1. Genome assembly
The Oxford Nanopore Technologies MinION produced 135,804
reads; of these, 104,955 had Q > 7 and were used for the WT
strain assembly. The mean read length was approximately 3.5 kbp. The
Illumina MiSeq provided 2.5 million 250 bp-long paired reads with an
average Phred score of 35. Of these, 296 reads were mapped to the
human genome and were therefore discarded from further analysis. The
assembly process of the wild-type strain resulted in the final assembly of
one circular chromosome and one circular plasmid with a coverage of 370×.
The sequences were deposited under accession numbers CP077803.1 for the
chromosome and CP077804.1 for the plasmid at DDBJ/EMBL/GenBank.
The obtained chromosome sequence length was 4,352,570 bp with a
GC content of about 65.4 %, and the plasmid was 53,835 bp long with a
GC content of around 59.8 %. In total, 3,968 open reading frames (ORFs)
divided into 2,146 operons were identified for both sequences. Most of
the genes were protein coding; however, 49 pseudogenes were also
found. Furthermore, 521 genes overlapped with another neighboring
gene, and 3 overlaps were found between a gene and a pseudogene. In
390 cases, the overlap size was equal to three nucleotides, and the longest
overlap was 87 bp, between genes KUL73_19715 and KUL73_19720 on the
plasmid. See Table 1 for complete statistics for the chromosome and
plasmid.
Illumina MiSeq sequencing of the KO strain provided about 2.2 million
250 bp-long reads with an average Phred score of 33. Of these, 260 reads
that mapped to the human genome were removed. Variant calling of the
KO strain confirmed the deletion of PHA polymerases. Moreover, it revealed
three changes in its genome: two single nucleotide mutations and one
insertion. The first mutation was found at position 823,861 (A→G) in
the promoter of the cimA gene encoding citramalate synthase, the second
mutation was identified at position 1,305,962 (C→T) in the rpoH gene
encoding RNA polymerase sigma factor RpoH, and the third change
was localized at position 3,782,354
(C→CGCTTCAGGGGAAACACGTTATGAAG) in the promoter of a gene encoding a
hypothetical protein.
Ten genomes obtained from GenBank were used for R. rubrum core
genome determination. In total, 1,732 genes were found in all analyzed
strains and thus formed the core genome. Another 2,859 genes were
identified in at least two genomes, and thus, they comprised the accessory
genome. Together these 4,591 genes form the R. rubrum pangenome. In
addition, 920 unique genes identified in only one genome were found.
The number of unique genes per genome ranged from 4 to 467, with a
median of 44 (see Supplementary Table S2).
3.2. Wild-type strain functional annotation
Protein coding genes and pseudogenes were classified according to
COG into 20 categories. Only 2 CDSs were not assigned to any COG;
however, 1,093 CDSs were classified into class S with unknown function.
The remaining 2,804 CDSs (out of all 3,850 protein coding genes and 49
pseudogenes) were assigned to the other COG classes. Individual
chromosomal and plasmid features are shown in Fig. 1 together with GC
content and GC skew plots for the whole genome. A substantial drop in the
average GC content was observed around position 3,810,000 bp on the
chromosome, where the 5S rRNA (KUL73_17050), 23S rRNA (KUL73_17055),
and 16S rRNA (KUL73_17070) genes are located. More detailed results can
be seen in Supplementary Table S3.
The genome of R. rubrum DSM 467T was annotated in terms of gene
ontology. Together, 3,047 GO terms were assigned to 1,312 genomic
loci on the chromosome, and 30 GO terms were connected to 14
genomic elements on the plasmid. The most common terms for the three
GO categories were: GO:0006412 "translation" for biological process,
assigned 58 times; GO:0005524 "ATP binding" for molecular function,
assigned to 86 genomic loci; and GO:0016020 "membrane" for cellular
component, assigned 107 times.
CRISPR analysis showed 11 arrays with lengths from 337 to 2,467 bp
and a number of spacers from 4 to 40 (see Supplementary Table S4).
The CRISPRDetect tool did not identify any Cas genes; a manual search of
the annotated genes found 34 Cas-like genes, but no Cas9 protein. In the
case of prophage identification, PhiSpy did not identify any viral DNA,
Prophage Hunter found one ambiguous prophage candidate, and
Phaster localized four incomplete prophages, all of them with a
low score (see Supplementary Table S5). The restriction-modification
systems analysis revealed two systems containing a restriction
endonuclease and a methylase, of which one belonged to type II and one to
type III. In addition, another gene coding for a type II methylase was
also found (see Supplementary Table S6).
3.3. Cultivation & growth kinetics
To induce cultivation-driven transcriptome changes in the wild-type
strain of R. rubrum and its knock-out strain, these microorganisms were
cultivated on two substrates, namely acetate and fructose. Growth
curves were determined throughout cultivation, with samples taken
over time and their optical density measured at 660 nm. Except for the
cultivation of the KO strain on acetate, samples were collected in each
cultivation in the mid-exponential phase, at the end of the exponential
phase, and in the stationary phase (for precise sampling times see
Supplementary Table S1). For the KO strain on acetate, only two time
points were selected for further characterization (see Fig. 2).
From the obtained growth curves, the growth rate and doubling time
for each cultivation were also calculated (see Table 2 and
Supplementary Fig. S1).
Additionally, pH was monitored during the cultivations.
The pH of both cultures cultivated on fructose quickly reached values of
7.5–7.7 and remained almost constant until the end of cultivation. In
the case of acetate, the pH reached more alkaline values of about
8.6–8.8 during the initial stage of cultivation and remained constant
until the end of the experiment.
3.4. Ultrastructural analysis
The presence of cytoplasmic PHA granules in the cultures cultivated
on two different carbon sources as well as the overall ultrastructure of
the cells of wild-type R. rubrum were determined using low voltage
transmission electron microscopy. As seen in Fig. 3, the spiral-shaped
cells of both cultures contain a substantial amount of small
electron-lucent granules, the chromatophores. However, only cultures
grown on acetate as a carbon source were also able to produce PHA in
their cells, which formed bigger electron-lucent granules.
3.5. RNA-Seq transcriptome
RNA sequencing (Supplementary Table S1) produced 713 million
single-end reads, averaging 21.6 million reads per sample. After the
first part of pre-processing (adapter and quality trimming), the read
lengths were either 70 bp (samples no. 1–9) or 66 bp (samples no.
10–33) depending on the library construction method used, and the average
Phred score was 35 for all samples. The subsequent pre-processing step
removed the remaining rRNA contamination from the data. The
proportion of reads corresponding to 16S and 23S rRNA differed among
samples; e.g., all three samples for the WT strain grown on fructose at the
third time-point contained around 46 % of rRNA reads, while
23 samples had less than 5 % of rRNA contamination (see
Supplementary Fig. S2).
Finally, reads were mapped to the reference genome of the wild-type
strain. The majority of reads (81–99 %) were mapped uniquely, yet
multimapping reads that persisted after deduplication were also
considered due to the existence of overlapping genes in the genome (see
Supplementary Fig. S3). In further analysis, the multimapping reads were
down-weighted by the number of associated genomic features; therefore, the
original number of reads in each sample remained the same. Overall,
only two pseudogenes located on the plasmid (KUL73_19640 and
KUL73_19670) remained completely silent with RPKM < 1 in all tested
conditions.

Table 1
Genomic features of Rhodospirillum rubrum DSM 467T.
Feature                       Chromosome    Plasmid
Length [bp]                   4,352,570     53,835
GC content [%]                65.4          59.8
Number of ORFs                3,919         49
Number of operons             2,116         30
Protein coding genes          3,807         43
Pseudogenes                   43            6
rRNA genes (5S, 16S, 23S)     4, 4, 4       -
tRNA                          54            -
ncRNA                         3             -

Reproducibility of our experiments was verified using three biological
replicates for each sampling time-point and all tested conditions, and
visualized through dimension reduction of normalized read counts by the
t-Distributed Stochastic Neighbor Embedding (t-SNE) method. In most
cases, the data points formed individual clusters representing a specific
cultivation condition and sampling time-point (see Fig. 4). Exceptions
can be found in the WT AY and KO AY samples, where two bigger clusters
were formed based on the type of cultivated strain and the substrate used,
yet within them smaller clusters representing the time-points can be
located.
The main differences between cultivation conditions were identied
through differential expression analysis, which was performed between
cultivations on fructose and on acetate separately for wild-type (WT AY
vs WT FY) and for knock-out strain (KO AY vs KO FY), and between wild-
type and knock-out strain, separately for cultivation on fructose (KO FY
vs WT FY) and on acetate (KO AY vs WT AY). Therefore, four different
combinations of cultivation conditions were tested and for each of them,
comparisons between available time-points were conducted, e.g., for
WT AY vs WT FY differences were found between WT AY T1 vs WT FY
T1, WT AY T2 vs WT FY T2, and WT AY T3 vs WT FY T3, and similarly
for the rest of comparisons (see Supplementary Figs. S4-S13 for corre-
sponding Volcano plots). Results for each combination of cultivation
Fig. 1. Chromosomal maps of Rhodospirillum rubrum DSM 467
chromosome and plasmid. The rst, second, and third outermost circles represent CDSs on the
forward and backward strands, and pseudogenes, respectively. Classication of COGs is represented by colors. Next, RNA genes, distinguishing among tRNA, rRNA,
and ncRNA, are represented in the fourth outermost circle. The inner area represents the GC content and GC skew (window size 10,000 bp, step size 200 bp).
Fig. 2. Growth dynamics of R. rubrum WT and KO on acetate and fructose,
obtained in SYN medium at 30
C, 160 rpm in dark.
Table 2
Growth rate and doubling time for R. rubrum WT and KO strains grown on ac-
etate (AY) and fructose (FY).
Sample µ max [h
] Td [h]
WT FY 0.057 ±0.001 12.082 ±0.280
WT AY 0.094 ±0.000 7.356 ±0.039
KO FY 0.037 ±0.002 18.992 ±1.299
KO AY 0.017 ±0.001 40.312 ±1.946
K. Jureckova et al.
Computational and Structural Biotechnology Journal 23 (2024) 2681–2694
conditions were then separately searched for genes with any statistically
signicant changes in the gene expression levels (either up- or down-
regulated) (adjusted p value <0.05, Benjamini-Hochberg correction)
and they were sorted into two groups based on the number of signicant
regulations. The rst group contained genes with at least one regulation,
i.e., statistically signicant differential expression, among tested pairs of
time-points. The second group consisted of genes that had statistically
signicant change in every performed comparison of available time-
points. The rst group was used as gene universum in GO enrichment
analysis and the second group was used as a set of interesting genes, that
Fig. 3. Morphology of R. rubrum wild-type grown on A) acetate or B) fructose as carbon source, imaged using low voltage transmission electron microscopy. PHA
granules are marked with arrows, smaller electron-lucent granules represent chromophores.
Fig. 4. Comparison of RNA-Seq samples through dimensionality reduction by t-SNE. All samples are represented as points color-coded according to strain (WT or
KO), substrate (FY or AY), time-point (T1, T2 or T3) and text label indicates biological replicate (sfA, sfB or sfC).
were used to identify GO terms that were significantly enriched (p value
<0.05, Fisher's exact test) between tested conditions.
GO enrichment analysis revealed that for the comparison WT AY vs
WT FY, 18 GO terms were enriched in the biological process (BP) category
and 11 terms in the molecular function (MF) category. BP terms were
related to "disaccharide metabolic process", "cell cycle", "DNA recombination"
and more, and MF terms corresponded to "acetyltransferase
activity", "transposase activity" or "isomerase activity" (see Supplementary
File 1 and Table 3 for results of differential expression analysis
of selected genes).
Comparison between KO AY and KO FY showed 10 enriched BP GO
terms connected to "regulation of biological process", "maintenance of
DNA repeat elements" and "organic acid catabolic process", and
a single MF term, "intramolecular transferase activity" (see Supplementary
File 2).
The difference between WT and KO strains grown on fructose was
characterized by 12 BP terms and 5 MF terms that were significantly
enriched. The main BP terms described "archaeal or bacterial-type
flagellum-dependent cell motility" or "vitamin metabolic process",
and MF terms were related to "NAD binding", "methyltransferase
activity" or "oxidoreductase activity, acting on CH-OH group of donors"
(see Supplementary File 3 and Table 4 for results of differential
expression analysis of selected genes).
Finally, GO enrichment analysis between KO AY and WT AY identified
21 BP and 6 MF enriched terms. The main BP terms were
"aspartate family amino acid metabolic process", "regulation of
biological process" and "regulation of cellular metabolic process". MF
terms were "pyrophosphatase activity" or "hydrolase activity, acting on
acid anhydride" (see Supplementary File 4 and Table 5 for results of
differential expression analysis of selected genes).
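The enrichment procedure described above (a gene universe of genes regulated at least once, an interesting set of genes regulated in every comparison, and Fisher's exact test per GO term) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `split_gene_groups` and `fisher_enrichment_p` are hypothetical, and a one-sided test for over-representation is assumed because the text does not state the alternative hypothesis.

```python
from math import comb


def split_gene_groups(padj_by_comparison, alpha=0.05):
    """Build the gene universe and the interesting set from adjusted p-values.

    padj_by_comparison maps a comparison name to {gene: adjusted p-value}.
    Universe: significant in at least one comparison.
    Interesting: significant in every comparison.
    """
    tables = list(padj_by_comparison.values())
    genes = set().union(*(t.keys() for t in tables))
    universe = {g for g in genes if any(t.get(g, 1.0) < alpha for t in tables)}
    interesting = {g for g in universe if all(t.get(g, 1.0) < alpha for t in tables)}
    return universe, interesting


def fisher_enrichment_p(universe, interesting, term_genes):
    """One-sided Fisher's exact test (over-representation) for one GO term.

    Assumption: the p-value is the upper hypergeometric tail.
    """
    universe = set(universe)
    interesting = set(interesting) & universe
    term = set(term_genes) & universe
    n_universe, n_interesting, n_term = len(universe), len(interesting), len(term)
    overlap = len(interesting & term)
    total = comb(n_universe, n_interesting)
    tail = sum(
        comb(n_term, i) * comb(n_universe - n_term, n_interesting - i)
        for i in range(overlap, min(n_term, n_interesting) + 1)
    )
    return tail / total


# Hypothetical toy data, not taken from the study.
universe = {f"g{i}" for i in range(10)}
p = fisher_enrichment_p(universe, {"g0", "g1", "g2"}, {"g0", "g1", "g5"})
```

A GO term would then be reported as enriched when its p-value falls below the 0.05 threshold given in the text.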
3.6. Small RNA prediction
For the prediction of small RNAs from the strand-specific RNA-Seq
data, a coverage-based detection with the baerhunter tool, which requires
setting three input values, was performed. While the low_cut_off
threshold was left at the default value of 5, the high_cut_off threshold was set
to 25 as inferred from our dataset (see Supplementary Fig. S14). The last
parameter, min_sRNA_length of predicted elements, was set to 40 bp. The
length distribution of putative sRNAs (see Supplementary Fig. S15)
contains a wide variety of lengths ranging from tens to thousands of bp.
Table 6 then contains counts of predicted elements in the chromosome
and plasmid sequences of R. rubrum. Small RNAs were further classified
into two groups based on their overlap with other CDSs. Those predictions
that did not overlap with a CDS on either strand, and thus were in
intergenic regions, were labeled as trans-encoded sRNAs. On the contrary,
if there was even a slight overlap with any annotated element on
the strand opposite to an inferred sRNA, the prediction was considered
a cis-encoded sRNA.
Additionally, differential expression analysis revealed that the majority
(>99 %) of predicted sRNAs were differentially expressed at least once
among various combinations of available conditions, see Table 6. The
Table 3
Differential expression analysis results of selected genes related to significantly enriched GO terms for comparison between cultivation on fructose and on acetate for
wild-type strain (WT AY vs WT FY).
WT AY T1 vs WT FY T1 WT AY T2 vs WT FY T2 WT AY T3 vs WT FY T3
Putative physiological function Locus tag log2 fold
p-adj log2 fold
p-adj log2 fold
MF GO term: GO:0009055 electron transfer activity
nuoB NADH-quinone oxidoreductase subunit NuoB KUL73_07380 -1.65 4.95E-
-1.52 5.11E-
-1.04 2.43E-
NADH-quinone oxidoreductase subunit C KUL73_08070 -0.33 3.31E-
0.54 2.38E-
-1.08 2.95E-
nuoF NADH-quinone oxidoreductase subunit NuoF KUL73_08085 0.66 6.42E-
2.15 5.34E-
1.51 3.40E-
NADH-quinone oxidoreductase subunit J KUL73_08105 -0.84 7.79E-
1.00 1.18E-
3.49 4.78E-
nuoL NADH-quinone oxidoreductase subunit L KUL73_08115 -1.36 2.39E-
0.61 8.42E-
1.98 2.17E-
NADH-quinone oxidoreductase subunit M KUL73_08120 -2.27 7.70E-
-0.35 1.58E-
1.23 6.61E-
nuoN NADH-quinone oxidoreductase subunit NuoN KUL73_08125 -1.12 1.87E-
0.87 1.75E-
2.02 6.22E-
grxD Grx4 family monothiol glutaredoxin KUL73_03670 -1.67 1.44E-
-3.93 1.46E-
-2.82 7.84E-
c-type cytochrome KUL73_05315 -1.51 1.16E-
-1.72 1.44E-
-2.66 1.14E-
petA ubiquinol-cytochrome c reductase iron-sulfur
KUL73_06225 0.14 7.21E-
-1.77 2.11E-
-2.18 9.80E-
cytochrome b/b6 domain-containing protein KUL73_06485 -0.55 8.36E-
-2.70 5.84E-
-2.79 4.51E-
ccoP cytochrome-c oxidase, cbb3-type subunit III KUL73_17205 -0.74 7.63E-
1.55 6.66E-
1.93 3.56E-
ccoO cytochrome-c oxidase, cbb3-type subunit II KUL73_17215 -1.91 1.81E-
1.25 8.00E-
1.03 1.97E-
ccoN cytochrome-c oxidase, cbb3-type subunit I KUL73_17220 -0.79 9.68E-
1.73 3.21E-
1.90 3.12E-
pseudoazurin KUL73_05930 -0.51 1.35E-
3.25 1.64E-
1.42 1.02E-
BP GO term: GO:0006950 response to stress
ahpC peroxiredoxin KUL73_07360 -1.56 2.05E-
-1.80 2.39E-
-1.56 1.57E-
msrA peptide-methionine (S)-S-oxide reductase MsrA KUL73_12215 -1.64 1.64E-
-5.21 1.09E-
-4.50 5.38E-
nth endonuclease III KUL73_00795 -1.41 8.15E-
-3.08 5.56E-
-2.03 2.29E-
further statistics showing counts of differentially expressed sRNAs between
these various combinations also distinguish whether these elements
are up- or down-regulated (see Supplementary Fig. S16).
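The trans/cis grouping used for the predicted sRNAs can be sketched with a simple interval-overlap check. The function names are hypothetical, coordinates are assumed to be inclusive, and the label for predictions that overlap only same-strand annotations is an assumption, since the text does not describe that case.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_srna(srna, annotations):
    """Classify a predicted sRNA given annotated elements.

    srna and each annotation are (start, end, strand) tuples.
    - "trans": no overlap with any annotation on either strand (intergenic)
    - "cis": any overlap with an annotation on the opposite strand
    - "unclassified": overlaps only same-strand annotations (assumed label)
    """
    s_start, s_end, s_strand = srna
    hits = [a for a in annotations if overlaps(s_start, s_end, a[0], a[1])]
    if not hits:
        return "trans"
    if any(a[2] != s_strand for a in hits):
        return "cis"
    return "unclassified"


# Hypothetical annotations, not taken from the R. rubrum genome.
cds = [(100, 200, "+"), (500, 600, "-")]
labels = [classify_srna(s, cds) for s in [(150, 250, "-"), (300, 400, "+")]]
```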
4. Discussion
4.1. Genome and transcriptome
R. rubrum DSM 467T hybrid assembly combining long Oxford
Table 4
Differential expression analysis results of selected genes related to significantly enriched GO terms for comparison between wild-type vs knock-out strain for cultivation
on fructose (KO FY vs WT FY).
KO FY T1 vs WT FY T1 KO FY T2 vs WT FY T2 KO FY T3 vs WT FY T3
Putative physiological function Locus tag log2 fold
p-adj log2 fold
p-adj log2 fold
BP GO term: GO:0009110 vitamin biosynthetic process
thiE thiamine phosphate synthase KUL73_05625 -1.19 1.32E-
-0.44 8.13E-
0.88 4.37E-
thiD bifunctional hydroxymethylpyrimidine kinase/
phosphomethylpyrimidine kinase
KUL73_05735 0.09 4.91E-
0.75 5.81E-
-0.74 2.96E-
thiL thiamine-phosphate kinase KUL73_09425 1.17 1.44E-
3.08 5.67E-
1.23 1.96E-
thiC phosphomethylpyrimidine synthase ThiC KUL73_10385 1.06 1.49E-
0.34 2.45E-
-0.65 3.02E-
pdxY pyridoxal kinase KUL73_06260 1.05 4.41E-
1.91 5.21E-
1.74 3.95E-
pyridoxine 5'-phosphate synthase KUL73_09595 1.05 4.41E-
1.91 5.21E-
1.74 3.95E-
pdxH pyridoxamine 5'-phosphate oxidase KUL73_13555 0.05 8.79E-
0.89 5.05E-
-0.71 1.15E-
cobS cobaltochelatase subunit CobS KUL73_01095 -0.40 3.57E-
-0.79 3.57E-
-0.97 2.89E-
cobN cobaltochelatase subunit CobN KUL73_17390 0.46 8.99E-
2.40 5.11E-
1.07 1.19E-
cobS adenosylcobinamide-GDP ribazoletransferase KUL73_03465 1.58 1.63E-
2.38 5.65E-
1.42 1.66E-
cobU bifunctional adenosylcobinamide kinase/adenosylcobinamide-
phosphate guanylyltransferase
KUL73_03475 0.42 5.80E-
2.44 7.94E-
0.52 1.77E-
cobW cobalamin biosynthesis protein CobW KUL73_03480 0.33 3.57E-
1.68 1.83E-
-0.93 1.43E-
cobF precorrin-6A synthase (deacetylating) KUL73_15320 -0.13 6.83E-
0.73 1.12E-
0.22 6.50E-
cobalt-precorrin-6A reductase KUL73_15410 0.36 1.54E-
0.90 2.06E-
-0.79 1.60E-
precorrin-2 C(20)-methyltransferase KUL73_15420 1.33 2.04E-
2.30 1.57E-
2.27 4.18E-
cobJ precorrin-3B C(17)-methyltransferase KUL73_15415 1.49 1.35E-
2.75 2.30E-
2.19 2.69E-
BP GO term: GO:0071973 bacterial-type flagellum-dependent cell motility
BP GO term: GO:0097588 archaeal or bacterial-type flagellum-dependent cell motility
fliJ flagellar export protein FliJ KUL73_02755 0.84 5.36E-
1.88 1.39E-
1.28 2.55E-
flagellin KUL73_13080 -1.35 1.00E-
-2.56 5.22E-
-2.46 2.30E-
flagellar basal body-associated FliL family protein KUL73_14640 1.06 2.11E-
1.13 1.12E-
2.72 1.03E-
flgG flagellar basal-body rod protein FlgG KUL73_14650 -5.77 2.36E-
-7.48 2.46E-
-1.41 5.68E-
flagellar basal body L-ring protein FlgH KUL73_14660 -1.21 5.01E-
-1.41 8.07E-
1.12 1.26E-
hypothetical protein KUL73_14700 -0.97 1.51E-
-3.52 2.04E-
-0.18 5.87E-
flagellin KUL73_14725 -1.46 7.41E-
-2.47 1.91E-
-2.37 3.48E-
MF GO term: GO:0016614 oxidoreductase activity, acting on CH-OH group of donors
MF GO term: GO:0016616 oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor
guaB IMP dehydrogenase KUL73_01290 -1.21 1.53E-
-0.64 6.71E-
-0.99 5.12E-
NADP-dependent isocitrate dehydrogenase KUL73_01875 -2.10 3.10E-
-1.24 1.05E-
-1.75 1.80E-
mdh malate dehydrogenase KUL73_06315 -1.13 2.58E-
-0.72 2.82E-
-0.04 8.98E-
3-hydroxybutyryl-CoA dehydrogenase KUL73_15880 -0.68 4.10E-
-1.41 4.66E-
-0.92 1.85E-
MF GO term: GO:0051287 NAD binding
NADP-dependent isocitrate dehydrogenase KUL73_01875 -2.10 3.10E-
-1.24 1.05E-
-1.75 1.80E-
nuoF NADH-quinone oxidoreductase subunit NuoF KUL73_08085 0.41 3.86E-
2.17 2.66E-
0.45 2.96E-
Nanopore and short Illumina reads revealed one circular chromosome and
one plasmid contig, as was expected based on the previously published
type strain genome [10]. The chromosome exhibits a GC content of
65.4 %, and the plasmid 59.8 %, which is more than the average for
Gram-negative bacteria [48]; nevertheless, it again corresponds to the
previously published type strain, as do the chromosome length of 4.35
Mbp and the plasmid length of 53.84 kbp. In comparison with the type strain, the
overall number of genes is similar; there is only a slight difference between
the number of predicted protein-coding genes (DSM 467T: 3807 vs
S1T: 3850), the number of RNAs (DSM 467T: 71 vs S1T: 83) and predicted
pseudogenes (DSM 467T: 49 vs S1T: 9). The differences may be due to the
use of different sequencing platforms for strain sequencing and various
tools for their assemblies and annotations.
RNA-Seq samples for WT cultivation on fructose (samples no. 1–9)
were prepared slightly differently from the remaining ones (samples no.
10–33). In particular, a different rRNA depletion kit was used for the
latter group, which resulted in very low contamination by reads
belonging to 16S and 23S rRNA genes (see Supplementary Fig. S2), and
UMIs were added to allow deduplication, which caused the difference
in read lengths after pre-processing. While the reads without UMIs
collected from the WT cultivation on fructose were 70 bp long, the
remaining ones were only 66 bp long. Yet, these dissimilarities did not
cause any significant differences in the overall high number of reads per
sample or in the quality of the reads. Furthermore, reads were mapped to
the reference wild-type strain and the results also confirmed the high quality
of our data, as we were able to uniquely map the majority of the reads. In
the worst case, the WT_FY_sfB_T3 sample had 81 % of uniquely mapped
reads. This particular sample also had the highest rRNA contamination,
with almost 47 % of reads. On the other hand, many samples had
over 98 % of the reads with a unique hit within the genome, and
multimapping reads were identified in only 1–2 % of the reads.
Multimapping reads could be a consequence of the 523 existing overlaps between
neighboring genes within the genome of R. rubrum (see Supplementary
Fig. S3). Nevertheless, our RNA-Seq data proved to be of very high
quality, thus enabling further genome-wide study of changes in the
expression levels between different cultivation conditions. As the main
differences were observed between substrates, we first used the RNA-Seq
data to detect differential expression on a genome-wide scale at
particular time-points for acetate and fructose cultivation for the WT
(Supplementary Figs. S4–S6) and KO (Supplementary Figs. S7 and S8)
strains. Additionally, we compared differences between the KO and WT
strains at particular time-points on fructose (Supplementary Figs. S9–S11)
and acetate (Supplementary Figs. S12 and S13). All volcano plots
showed the expected relation between fold changes and statistically significant
differences in expression of protein-coding genes, with roughly equal
numbers of down-regulated and up-regulated genes.
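The balance between up- and down-regulated genes seen in the volcano plots can be tallied directly from differential expression results. The sketch below is illustrative only: the function names are hypothetical, the adjusted p-value cut-off of 0.05 follows the text, and the optional minimum absolute log2 fold change is an assumption (set to 0 by default because no fold-change threshold is stated).

```python
def classify_regulation(log2_fc, padj, alpha=0.05, min_abs_log2_fc=0.0):
    """Label one gene as "up", "down" or "ns" (not significant)."""
    if padj >= alpha or abs(log2_fc) < min_abs_log2_fc or log2_fc == 0:
        return "ns"
    return "up" if log2_fc > 0 else "down"


def count_regulated(results, alpha=0.05, min_abs_log2_fc=0.0):
    """Count labels over an iterable of (log2 fold change, adjusted p-value) pairs."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for log2_fc, padj in results:
        counts[classify_regulation(log2_fc, padj, alpha, min_abs_log2_fc)] += 1
    return counts


# Hypothetical values for illustration; not taken from Tables 3-5.
example = [(1.5, 0.001), (-2.0, 0.01), (0.3, 0.2), (-0.1, 0.04)]
summary = count_regulated(example)
```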
4.2. Comparative analysis of acetate and fructose growth
The GO enrichment analysis comparing acetate and fructose cultures
Table 5
Differential expression analysis results of selected genes related to significantly enriched GO terms for comparison between wild-type vs knock-out strain for cultivation
on acetate (KO AY vs WT AY).
KO AY T1 vs WT AY T1 KO AY T2 vs WT AY T2
Putative physiological function Locus tag log2 fold
p-adj log2 fold
MF GO term: GO:0016887 ATP hydrolysis activity
bchI magnesium chelatase ATPase subunit I KUL73_02545 -1.00 4.95E-
-0.68 1.98E-
AFG1 family ATPase KUL73_06305 -0.51 2.44E-
-0.28 9.87E-
arsA arsenical pump-driving ATPase KUL73_07505 1.89 4.00E-
1.50 9.17E-
tsaE tRNA (adenosine(37)-N6)-threonylcarbamoyltransferase complex ATPase subunit
type 1 TsaE
KUL73_17750 1.32 1.32E-
1.58 2.37E-
AAA family ATPase KUL73_19335 -0.63 3.15E-
-0.68 7.10E-
mfd transcription-repair coupling factor KUL73_08940 -0.96 6.99E-
-1.07 7.44E-
uvrA excinuclease ABC subunit UvrA KUL73_09080 1.09 1.59E-
1.53 1.04E-
mutL DNA mismatch repair endonuclease MutL KUL73_15195 -1.16 1.75E-
-1.02 1.40E-
htpG molecular chaperone HtpG KUL73_00370 1.06 1.54E-
1.00 2.13E-
groL chaperonin GroEL KUL73_00840 1.71 4.34E-
1.88 1.06E-
clpB ATP-dependent chaperone ClpB KUL73_03910 1.89 3.75E-
2.55 2.95E-
dnaK molecular chaperone DnaK KUL73_18350 0.77 1.04E-
1.42 1.44E-
lon endopeptidase La KUL73_08040 1.04 3.86E-
1.14 3.20E-
hslU ATP-dependent protease ATPase subunit HslU KUL73_18580 0.76 1.54E-
1.06 9.12E-
Table 6
Number of predicted small RNAs in Rhodospirillum rubrum DSM 467T.
Feature Predicted # in chromosome Differentially expressed Predicted # in plasmid Differentially expressed
Total sRNAs 2,329 2,321 10 10
trans 55 55 0 0
cis 2,274 2,266 10 10
for the R. rubrum wild-type strain revealed insights into the different
pathways of substrate assimilation. Fructose is catabolized
mainly via the Embden-Meyerhof-Parnas (EMP) pathway [49]. This
was highlighted by enriched GO terms involved in carbohydrate metabolism,
such as carbohydrate biosynthetic process (GO:0016051),
disaccharide metabolic process (GO:0005984) and trehalose metabolic
process (GO:0005991). In contrast, acetate is assimilated via the
ethylmalonyl-CoA (EMC) and methylbutanoyl-CoA (MBC) pathways [50–52],
both of which require an electron-transferring flavoprotein
(ETF) to channel electrons to the membrane. The EMP pathway, in the
step of glyceraldehyde-3-phosphate oxidation, produces NADH [53],
which is then oxidized in the membrane through electron transport
coupled phosphorylation. Therefore, the type of electron carrier used is
a key difference between the assimilation of acetate and fructose. The
term NADH dehydrogenase (ubiquinone) activity (GO:0008137) could
refer to the initial step in the electron transport chain [54,55] and reflect
this difference in the assimilation pathway. This change in electron
delivery to the membrane also seems to be reflected in the term electron
transfer activity (GO:0009055). Investigation of the specific genes
involved in these terms revealed that most of the genes for the subunits
of the NADH-quinone oxidoreductase are more transcribed and regulated
under fructose conditions, except for the subunit F (nuoF)
(KUL73_08085), which is upregulated for acetate (see Table 3 and Supplementary
File 1). The NADH-quinone oxidoreductase is an important
element of the respiratory chain; therefore, a different efficiency of the
electron transport chain (ETC) between both carbon sources could be
expected, potentially leading to different energy levels. This hypothesis
is also supported by the different expression and regulation of genes
involved in the electron transport chain (c-type cytochrome
(KUL73_05315), ubiquinol-cytochrome c reductase complex
(KUL73_06225), cytochrome b/b6 domain (KUL73_06485)) found
under GO:0009055 (see Table 3). It is also noteworthy that
the genes for the cbb3-type cytochrome c oxidase complex [56], usually
associated with low oxygen levels, seemed to be transcribed for
both carbon sources. This observation may appear counterintuitive
considering the aerobic culture conditions, but the poor gas exchange
expected in shake flasks without baffles could have led to lower dissolved
oxygen levels. Furthermore, the gene ppaZ (KUL73_05930) coding for
pseudoazurin [57], which is positively regulated
by the Reg/Prr system related to sensing low oxygen levels, was activated
exclusively under acetate culture.
The GO enrichment analysis also identified genes that could be
relevant in the interpretation of the lower maximum specific growth rate
(µmax) measured for fructose (see Table 2). Genes involved in reactive
oxygen species (ROS) scavenging and management of the oxidative
response (peroxiredoxin (KUL73_07360), peptide-methionine (S)-S-oxide
reductase MsrA (KUL73_12215) and endonuclease III
(KUL73_00795)) were more expressed during growth on fructose
(see GO:0006950 response to stress in Table 3 and Supplementary File
1). While this upregulation of a limited set of three genes suggests a
possible increase in ROS production, this alone does not establish a
direct causal relationship with the lower µmax. Further investigations are
required to understand the nature of this upregulation. However, it has
been observed that ROS production can result from electron
leakage from the ETC [58,59]. The increased presence of ROS may
indicate that electrons are diverted from the ETC, diminishing the efficiency
of the proton motive force (PMF) and subsequently leading to
reduced efficiency of energy (ATP) production. Moreover, ROS are known to
damage cells and negatively impact their metabolism. A study on Rhodobacter
sphaeroides, a closely related organism, has shown that, in the
absence of oxygen and aerobic respiratory chain, a higher growth rate
was associated with lower ROS generation in autotrophically growing
cells [60]. Consequently, higher ROS levels could negatively affect cell
metabolism and indicate a less efficient respiratory chain, resulting in
diminished PMF, both contributing to the lower µmax.
The enrichment of the high-level BP terms cell cycle (GO:0007049)
and DNA recombination (GO:0006310) could be the result of the
different assimilation pathways between acetate and fructose, explaining
the different measured specific growth rates (µmax AY = 0.094 h⁻¹ and
µmax FY = 0.057 h⁻¹). This observation suggests a different metabolic
reorganization around DNA transcription and RNA translation for acetate
and fructose cultures. Such a profound difference also seems to be highlighted
under the term GO:0006950 response to stress. Activation of various stress
response mechanisms was identified under both conditions,
regarding DNA repair functions, transcription repair, recombination
mediators, elements related to DNA methylation and chaperone synthesis.
Nevertheless, it is worth noting that the carbon source was most
probably not the only parameter that changed between both cultures.
For example, cultivation using acetate has been shown to result in the
basification of the medium throughout the culture (data not shown). As
mentioned previously, different levels of dissolved oxygen and carbon
dioxide should also be expected. These additional changes could nuance
the role of fructose and acetate metabolization in these growth rate and
transcriptomics observations.
Another important difference between acetate and fructose is that
acetate is an excellent substrate for growth-associated PHB biosynthesis,
which is not the case for fructose. Although the GO analysis did not
directly reflect any changes in PHB cycle activity, PHB granules were
observed only under acetate conditions (see Fig. 3). Therefore, these
observations led us to orient our interpretation of the GO terms also
towards changes induced by the presence of PHB. For instance, the
changes in DNA recombination could have been the result of the presence
of PHB granules associated with growth on acetate. A multifunctional
protein (PhaM) has been discovered in C. necator [61,62] whose
role is to link PHB granules to the nucleoid (via PhaC), among other
functions. It was further proposed that PHB granules are associated with
the DNA and are segregated with the nucleoid during cell division [63].
However, no phaM homologs have so far been found in R. rubrum.
4.3. Consequences of the deletion of the PHB biosynthesis on fructose
The comparison of the cultivation and transcriptomics data
between the polymerase mutant and wild-type strains grown on fructose
led to unexpected results. It started with a significant decrease in
the maximum specific growth rate for the mutant strain (see Table 2).
This finding is intriguing given the observation that fructose-grown cells
did not present PHB granules. Thus, cells' growth was not expected to be
affected by knocking out the PHA synthases. This suggests an unanticipated
activity of the PHB metabolism during aerobic growth with
fructose. Indeed, R. rubrum has been described to produce PHB from
fructose mainly in case of nutrient (nitrogen) or oxygen limitation [64,
65], suggesting that activation of the PHB biosynthesis from fructose
necessitates these specific conditions.
The GO enrichment analysis (see Supplementary File 3) indicates
that vitamin biosynthesis (GO:0009110) was influenced by the presence
of both phaC1 and phaC2. Vitamins are important cofactors in numerous
enzymatic reactions and mutant cells could react to the disrupted PHB
biosynthesis by adjusting their pool of vitamins to favor alternative
metabolic pathways. Transcription of genes associated with vitamin
biosynthesis (see Table 4) suggested that the pools of thiamine, pyridoxine,
and cobalamin could be modified. Additionally, the terms
GO:0009236 cobalamin biosynthetic process and GO:0033014 tetrapyrrole
biosynthetic process, which are closely related, were also
enriched upon deletion of the PHA synthases. Overall, it seems that a
majority of cobalamin biosynthesis genes presented higher expression
in the KO strain, suggesting that the cell could potentially try to favor
cobalamin production (Supplementary File 3). Cobalamin (vitamin B12)
participates in various important reactions (e.g., as a cofactor of reductase,
acetyltransferase, and isomerase) and can influence transcriptional
regulation [66]. Interestingly, the cobalamin metabolism was also
modified by the CO adaptation of R. rubrum [67]. Regarding
tetrapyrrole-related genes, it also seems that a majority of them were
more expressed for the mutant strain compared to the WT strain (Sup-
plementary File 3). Tetrapyrroles [68] are essential cofactors notably for
light absorption, oxidative stress and electron transport. Therefore, the
mutant cells may also redirect their metabolic effort towards tetrapyr-
role production as a consequence of the altered PHB biosynthesis.
Tetrapyrrole is an intermediate of the bacteriochlorophyll biosynthetic
pathway and has been observed to be accumulated and excreted during
high cell density culture of R. rubrum [69], growing on succinate and
fructose. It can be speculated that the accumulation of these pigments
may contribute to photosynthetic membrane repression [70]. Thus,
these observations suggest that shifts in cobalamin and tetrapyrrole
metabolism could hint at changes in the regulation of the photosynthetic
apparatus [71]. Supporting this idea further, the literature also suggests
that pigment synthesis and PHB production may be interconnected via
the cell's energy and redox status. A recent study [72] indeed showed
that the pigment synthesis regulatory protein HP1 is able to sense the
intracellular redox state and adjust the pigment synthesis. However, the
interpretation of these results remains challenging as R. rubrum was
cultivated in the presence of oxygen and in the dark. It is known that in
this organism, pigment synthesis is repressed by oxygen to avoid
oxidative stress and therefore occurs at low or zero oxygen levels, which
is not the case in our experiment. We thus hypothesize that an intra-
cellular redox imbalance may have been generated by disrupting the
PHB cycle, which leads to a redox state relatively similar to the one
observed in the absence of oxygen, which consequently impacted
pigment-related regulations.
Changes in oxidoreductase activity were also detected in the oxidation
of the CH-OH group (GO:0016614) and reduction of NAD(P)
(GO:0016616), suggesting again that a perturbation of the PHB biosynthesis
could influence redox-related cellular mechanisms. When looking
closely into the transcription of genes representing these GO terms
(Table 4 and Supplementary File 3), some of them were downregulated
in the KO strain (encoding IMP dehydrogenase (KUL73_01290),
NADP-dependent isocitrate dehydrogenase (KUL73_01875), malate dehydrogenase
(KUL73_06315), and 3-hydroxybutyryl-CoA dehydrogenase
(KUL73_15880)). This interpretation is also supported by the
enriched term NAD binding (GO:0051287). In particular, the wild-type
strain presented an upregulation of the NADP-dependent isocitrate dehydrogenase
gene (KUL73_01875) during the exponential growth phase,
compared to the mutant strain (Table 4). This enzyme is responsible for
the production of NADPH in the tricarboxylic acid (TCA) cycle and is
also important with respect to the oxidative stress response [73].
Consequently, changes in the
regulation of KUL73_01875 upon deletion of the PHA synthases could
indicate a different intracellular redox state between the WT and KO
strains. This difference could also be one of the reasons why different
maximum specific growth rates were observed, as the lower growth rate
of the mutant strain could be the result of an altered activation of the
TCA cycle. In addition, the KO strain presented a higher transcription of
nuoF (KUL73_08085) (Table 4), encoding NADH-quinone oxidoreductase
subunit NuoF, which was previously found to be upregulated for acetate
in the comparison between both carbon sources for the WT strain
(Table 3 and Supplementary File 1). The relationship between the PHB
cycle and redox metabolism could originate from a modification of the
activity of acetoacetyl-CoA reductase (PhaB), which uses a reduced
cofactor to produce PHA precursors. The control mechanism of the PhaB
activity would need to be elucidated to establish the link between the
absence of PHA synthases and the overall redox state of the cell.
Moreover, the enrichment analysis also revealed an unexpected
impact of the deletion of both PHA synthases on flagellum-mediated cell
motility (GO:0071973 and GO:0097588). Further analysis of the genes
involved (see Table 4 and Supplementary File 3) suggested that the assembly
or composition of flagellar components is modified, impacting
cell motility. Interestingly, the relationship between the presence of
flagella and PHB accumulation has been studied in the past. For
instance, in the nutrient-limited accumulation of PHB in C. necator, cells
appeared to be flagellated during the exponential growth phase and
flagellation became stagnant during the PHB accumulation phase [74].
This was followed by the complete loss of flagella during the subsequent
PHB mobilization (after addition of a nitrogen source). The PHB⁻
strain [75], unable to produce PHB, also showed a complete absence of
flagellation under all conditions [74]. Additionally, other studies
showed that the deletion of genes involved in flagellum formation
resulted in enhanced PHA production in natural and non-natural PHA
producers [76,77]. Our observation in R. rubrum is therefore consistent
with the literature on other strains and indicates that the link between
flagellum-mediated cell motility and PHB formation deserves
further exploration.
4.4. Consequences of the deletion of the PHB biosynthesis on acetate
The strong growth inhibition observed in the KO strain (see Table 2) is
expected to be related to the enrichment of the high-level GO terms regulation
of biological process (GO:0050789) and regulation of cellular
metabolic process (GO:0031323). Our findings also suggested that the
growth defect correlates with changes in ATP hydrolysis activity
(GO:0016887). Investigating the genes representing this term (see
Table 5 and Supplementary File 4) suggested variations in expression of
ATPase elements between strains (magnesium chelatase ATPase subunit
I, AFG1 family ATPase, arsenical pump-driving ATPase, tRNA
(adenosine(37)-N6)-threonylcarbamoyltransferase complex ATPase subunit
type 1 TsaE, and AAA family ATPase). Genes involved in the dysfunction
of DNA repair (transcription-repair coupling factor, excinuclease ABC
subunit UvrA, and DNA mismatch repair endonuclease MutL) were also
differently transcribed in the KO strain. In addition, genes encoding
chaperones (molecular chaperone HtpG, chaperonin GroEL, ATP-dependent
chaperone ClpB, and molecular chaperone DnaK) and proteases
(endopeptidase La and ATP-dependent protease ATPase subunit
HslU) were upregulated in the KO strain, indicative of stress response
elements. These genes were also present in the GO:0016817 hydrolase
activity, acting on acid anhydride. These observations are expected to be
more related to an impaired substrate assimilation and carbon metabolism
in the mutant strain, rather than to the PHB metabolism per se.
Indeed, the similar R. rubrum ΔphaC1ΔphaC2 strain has been reported to
poorly assimilate acetate under aerobic conditions [52], compared with
the wild-type strain. It was also observed elsewhere that the presence of
intracellular PHB was associated with higher growth rates [12,16].
Thus, a perturbation of the EMC pathway, and possibly also of the
methylbutanoyl-CoA (MBC) pathway [50], acting under acetate assimilation
as an anaplerotic pathway, could negatively impact the TCA cycle
when its biosynthetic precursors are depleted. This diminished TCA
cycle activity would lead to a weaker production of reduced electron
carriers and a lower energy generation despite harbouring a functional
aerobic respiratory chain. This could subsequently lead to the observed
defect in ATP hydrolysis. It is also worth noting that the EMC pathway
shares its first two steps with the PHB biosynthesis.
Based on the absence of expression differences in the terms related to
redox regulation, the redox metabolism may not be the reason for the
growth defect, as had previously been suggested for Rhizobium etli [78] and
Azotobacter beijerinckii [79] phaC mutants presenting NADH build-up.
This supports the hypothesis that an impaired carbon flow is central to
explaining the differences in acetate assimilation between PHB⁻
and wild-type R. rubrum. Indeed, several studies on PHB⁻ mutant
organisms [80–83] observed a disrupted carbon flow under conditions
conducive to PHB accumulation. This illustrates the role of the PHB
cycle in the central metabolism and provides a framework for understanding
how the phaC deletion affects acetate assimilation.
4.5. Small RNAs
Post-transcriptional bacterial regulation by small RNAs has been
described as the fastest response to external stimuli under specific conditions
related to the availability of regulatory elements at a given time
[84]. In addition, Reyer et al. [85] showed by modelling that sRNAs
may also be able to act co-transcriptionally on nascent mRNA molecules,
thus making the regulation by sRNAs even more efficient. Their
regulatory role is primarily associated with the bacterial response to
stress conditions [86], which makes sRNAs the subject of studies aiming
to understand the principles of stress responses in organisms.
Direct inference of sRNAs from batch RNA-Seq is very sensitive to the
choice of the main thresholds used for peak detection in the coverage
signal. We left the low_cut_off threshold at its default value, as it
seems a reasonable value to avoid low-level sequencing noise while not
excluding true positives with lower transcription. The high_cut_off
threshold for the baerhunter prediction was inferred from the sample
depth distributions, as this threshold affects the number of predicted
features and thus should be experiment-specific. Therefore, using
normalized sample depth, as in our study, seems ideal. The length of sRNAs can vary
but it typically falls within the interval of 40–500 bp [87]. Our
predictions exceed both boundaries, which is expected. While the shortest
length is given by the min_sRNA_length threshold for predicted elements,
the longest length does not explicitly refer to the length of a predicted
element; rather, the element is situated within that region, as the exact
prediction from standard RNA-Seq can be misleading. The high proportion (>99 %) of
regulated sRNAs (see Table 6) is consistent with the regulatory role of
these elements. However, given the principle of baerhunter detection and
the fact that differential expression of an element does not exclude
noise in the data, further analyses and experiments are needed to obtain
more precise information about the sRNA content. On the other hand, a
larger number of non-coding regulatory elements in the R. rubrum genome
is expected, as versatile bacteria were shown to possess more regulatory
elements [88]. As our results demonstrated, intergenic regions in the
R. rubrum genome probably hide regulatory potential that remains to be
further studied and understood.
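To illustrate why the number of predicted features depends on the chosen thresholds, the following minimal Python sketch calls candidate elements from a per-base coverage vector. It is not the baerhunter implementation; the function name call_features, its parameters (named after the thresholds discussed above), and the toy coverage values are assumptions made for illustration only. A region is reported when coverage stays at or above the low cut-off, reaches the high cut-off at least once, and spans at least the minimum length.

```python
from typing import List, Sequence, Tuple


def call_features(
    coverage: Sequence[float],
    low_cut_off: float,
    high_cut_off: float,
    min_length: int,
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of candidate features.

    Hypothetical two-threshold peak caller (illustrative only):
    a region is kept if every position has depth >= low_cut_off,
    its maximum depth is >= high_cut_off, and it is at least
    min_length positions long.
    """
    features: List[Tuple[int, int]] = []
    start = None  # start of the currently open region, if any
    peak = 0.0    # maximum depth seen in the open region

    def close(end: int) -> None:
        # Keep the open region only if it passes both filters.
        if peak >= high_cut_off and end - start >= min_length:
            features.append((start, end))

    for pos, depth in enumerate(coverage):
        if depth >= low_cut_off:
            if start is None:
                start, peak = pos, depth
            else:
                peak = max(peak, depth)
        elif start is not None:
            close(pos)
            start = None

    if start is not None:  # region running to the end of the signal
        close(len(coverage))
    return features


# Toy coverage signal (made-up values, not data from this study).
toy_coverage = [0, 2, 6, 12, 15, 9, 6, 1, 0, 6, 7, 8, 6, 0]

# A stricter high cut-off reports fewer candidate features.
for high in (8, 10, 20):
    print(high, call_features(toy_coverage, 5, high, 3))
```

In this toy example, lowering the high cut-off from 10 to 8 adds a second, weakly transcribed region, while raising it to 20 removes all candidates, which is why this threshold is best tied to the sequencing depth of the experiment.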
5. Conclusions
In summary, our study focused on the genomic, transcriptomic, and
metabolic aspects of R. rubrum DSM 467T and its ΔphaC1ΔphaC2 knock-out
strain, which is unable to accumulate PHA, shedding light on their
responses to different carbon sources, namely acetate and fructose.
Comparative analysis between cultures grown on these substrates
unveiled distinct pathways for substrate assimilation. Growth on
fructose resulted in an upregulation of carbohydrate metabolism, most
likely due to the activity of the Embden-Meyerhof-Parnas pathway.
In contrast, acetate assimilation is believed to utilize the
ethylmalonyl-CoA and methylbutanoyl-CoA pathways. These differences
extended to electron transport chain components, impacting energy
production efficiency and growth rates. Generally, the absence of PHA
biosynthetic capability in the KO strain affected fructose metabolism
unexpectedly, indicating a broader impact on vitamin biosynthesis and
tetrapyrrole metabolism. Changes in redox-related mechanisms and
flagellum-mediated cell motility emphasized the intricate connections
within cellular processes. Similarly, the consequences of the PHB
synthesis deletions on acetate metabolism elucidated growth inhibition
mechanisms. It was suggested that impaired substrate assimilation
resulted in reduced energy generation and, as a consequence, activated
systems for cellular repair and stress response. Finally, the study
delved into metabolism and stress response regulation by small RNAs. A
comprehensive analysis of differentially expressed sRNAs underscored
their regulatory role, emphasizing the need for further experiments to
refine our
The genome assembly referred to in this paper is CP077803.1 for the
chromosome and CP077804.1 for the plasmid. All sequencing data have
been deposited in the NCBI Sequence Read Archive (SRA) under the
project accession number PRJNA742260. The SRA accession numbers for
genome sequencing are SRR28268536 (WT strain ONT), SRR28268535 (WT
strain Illumina), and SRR28268534 (KO strain Illumina). The SRA
accession numbers for RNA-Seq samples are listed in Supplementary
Table S1.
This study was supported by the bilateral grant project GACR
21-15958L / SNSF 205321L_197275/1. The authors would like to thank
Professor Kevin E. O'Connor and Professor Tanja Narancic, University
College Dublin, Ireland, for kindly providing the ΔphaC1ΔphaC2 R.
rubrum mutant strain and Jana Hanslikova for help with sequencing this
knock-out strain. The authors thank Delong Instruments a.s., Czech
Republic, namely Jaromir Bacovsky, for providing the low voltage
transmission electron microscopy analysis.
