All content in this area was uploaded by Dr R. Prathiviraj on Nov 19, 2024
Medicine in Omics 12 (2024) 100040
Available online 23 July 2024
2590-1249/© 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Functional prediction and assignment of Clostridium botulinum type A1
operome: A quest for prioritizing drug targets
B. Roja, S. Saranya, R. Prathiviraj, P. Chellapandi *
Industrial Systems Biology Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620024, Tamil Nadu, India
ARTICLE INFO
Keywords:
Clostridium botulinum
Protein function
Molecular machinery
Bioinformatics
Drug target
Food spoiling
Virulence
ABSTRACT
Clostridium botulinum strain Hall produces the potent botulinum neurotoxin type A1, which causes
food-borne, infant, and wound botulism in humans. Antibiotics and botulinum antitoxins can control
growth and prevent botulinum toxicity. However, limited information on proteins with unknown
functions hinders the discovery of new drug targets for this disease. In this study, a combined
bioinformatics approach with literature support was applied to predict, assign, and validate
operome functions. Our functional annotation scheme was based on sequence motifs, conserved
domains, structures, protein folds, and evolutionary relationships. Approximately 14.62 % of the
operome exhibited sequence similarity to known proteins, with 6.65 % predicted functions for
293 proteins, including 121 proteins exclusive to C. botulinum. Structural analysis revealed a
significant presence of the Rossmann fold (26 %) and miscellaneous folds (43 %) among the operome.
Transporters (>85) and transcriptional regulators (>45) were prevalent, underscoring their
importance in C. botulinum adaptive strategies. The newly identified operome contributed to the
diverse cellular and metabolic processes of this organism. The operome functions were mainly
involved in amino acid metabolism and botulinum neurotoxin biosynthesis. In this study, we
identified and characterized 13 new virulence proteins from the operome to determine their
structure–function relationships. These new metabolic and virulence proteins allow the organism
to colonize and interact with the human gastrointestinal tract. This study supports the search for
new drugs and targets for treating the diseases caused by C. botulinum in humans.
Introduction
Clostridium botulinum is a food-borne bacterium that produces eight
distinct types of botulinum neurotoxin (BoNT/A-H) [1,2]. Botulism is a
life-threatening neuroparalytic syndrome characterized by acute afebrile
symmetric descending flaccid paralysis [3]. It is a public health emergency
with a high fatality rate (5–10 %) in cases of suspected ingestion
of homemade, packed, and canned foods (Sobel et al. 2004). Food-borne,
infant, and wound botulism are the clinical forms most frequently
reported in humans [4]. According to the Centers for Disease
Control and Prevention, C. botulinum type A accounts for 42 % of infant
botulism and 79 % of wound botulism cases. Contamination of home-prepared
or preserved foods with type A or B strains causes 90 % of
food-borne botulism cases. About 110 cases of botulism are reported annually
in the United States, with the majority being females aged 41 years. Approximately
half of these cases are caused by toxin type A, with the remaining cases
divided between toxin types E and B. Botulism prevalence also varies with
underlying conditions, for example among stroke survivors and patients with
multiple sclerosis, traumatic brain injury, or cerebral palsy.
Botulinum neurotoxins are categorized as a Class I bioweapon [5].
Consumption of 30–100 ng of BoNT/A is estimated to cause food-borne
botulism, with high economic and medical costs associated with treating
type A botulism strains [6,7].
Food-borne botulism does not result from infection but reflects a direct
functional link between metabolism and virulence. C. botulinum type A
vegetative cells produce BoNT/A to kill the host rapidly for subsequent
saprophytic utilization. In addition to bont/A, this organism contains
two adherence genes (fbp and groEL) and three toxin-coding genes: cloSI,
hlyA, and colA [8,9]. This organism also produces unknown virulence
factors that are required for full virulence in the host. Understanding its
pathophysiological mechanisms is vital for controlling toxicity.
Genome-scale studies on virulence and metabolic crosstalk have been of
great concern in recent systems biology research [5,10–14]. C. botulinum
strain Hall (CBOA) has a genome consisting of a circular chromosome
(3,886,916 bp) and a plasmid (16,344 bp). The chromosome carries
3650 predicted genes, of which 28.5 % have unknown biological functions, while
the plasmid contains 19 predicted genes [15]. It exhibits a proteolytic
phenotype, breaking down proteins with secreted proteases and enzymes
involved in amino acid uptake and metabolism. Additionally, it
has an active chitinolytic system that allows it to colonize environments
inhabited by chitin-containing organisms. This genomic makeup reflects
its proteolytic nature and adaptive strategies.
* Corresponding author at: Industrial Systems Biology Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620024, Tamil Nadu, India.
E-mail address: pchellapandi@bdu.ac.in (P. Chellapandi).
Journal homepage: www.elsevier.com/locate/meomic
https://doi.org/10.1016/j.meomic.2024.100040
Received 6 March 2024; Received in revised form 26 June 2024; Accepted 26 June 2024
The term operome refers to proteins with unknown biological in-
formation (hypothetical proteins or HPs) in a genome [16–18]. Putative
genes with known orthologs and no orthologs are referred to as
conserved hypothetical proteins and uncharacterized proteins, respectively
[19,20]. Automated genome annotation tools have successfully
annotated 50–70 % of coding genes in most bacterial genomes with
confidence [21]. Conserved domain-based functional assignment has
been used for genome-wide annotation of poorly understood genomes
such as Pongo abelii and Sus scrofa [22]. The structure-based approach
has been applied to predict operome function in Mycoplasma
hyopneumoniae [23]. The Mycobacterium tuberculosis H37Rv operome has been
annotated using various methods, including functional and structural
domain analysis [24], integrated genomic context analysis [25], literature
mining [26], functional enrichment analysis [19], and genome-scale
fold-recognition [27]. Sequence-based and structure-based approaches
have been used to prioritize operome from Candida dubliniensis,
Vibrio cholerae O139, and Staphylococcus aureus as therapeutic
targets in human infections [28–31].
Functional annotation of the operome is crucial for drug target
implementation, genome refinement, and improved microbial genome-scale
reconstruction [14,18,19,32]. A precise operome annotation can
reveal new functions relevant to veterinary and human therapeutics [33].
Despite the various methods developed to infer operome function from
prokaryotic genomes, no combined bioinformatics prediction approach has
been employed for C. botulinum strains [10–13,34–36]. Such a combined
bioinformatics approach has been used to functionally annotate,
characterize, and categorize the operome of other prokaryotes [18,37,38]. Hence,
our study aimed to utilize a combined bioinformatics approach for the
prediction, characterization, and categorization of comprehensive
functional contexts in the operome of the CBOA genome. Newly annotated
functions of this genome can support high-quality genome-scale
metabolic networks, potentially aiding the discovery of novel drugs or
vaccines against botulinum intoxication in humans [18].
Materials and methods
Dataset
We analyzed 1052 HPs from the CBOA genome (Accession
NC_009495), retrieved by searching the Kyoto Encyclopedia of Genes and
Genomes (KEGG) database v103 with specific text phrases ("hypothetical
proteins, unknown, uncharacterized, and putative") [39]. FASTA sequences
and NCBI accession numbers were used for the sequence analysis. Six
prediction tasks were employed to functionally annotate and assign the
CBOA operome, and the summative information about the predicted contexts
obtained from these approaches was manually curated. Supplementary File 1
contains the comprehensive dataset used for predicting operome function.
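The keyword-based selection of hypothetical proteins can be illustrated with a minimal Python sketch. The record layout, the function names, and the example definitions are assumptions made for illustration only; the phrase list is adapted from the quoted search terms, using the singular "hypothetical protein" so that it also matches the plural.

```python
# Search phrases adapted from the Dataset description (assumed to be matched
# case-insensitively against the gene definition text).
PHRASES = ("hypothetical protein", "unknown", "uncharacterized", "putative")


def is_operome_candidate(definition: str) -> bool:
    """Return True if the definition contains any of the search phrases."""
    text = definition.lower()
    return any(phrase in text for phrase in PHRASES)


def select_operome(records):
    """Keep (locus_tag, definition) pairs whose definition matches a phrase."""
    return [(tag, text) for tag, text in records if is_operome_candidate(text)]


# Illustrative records only; tags and definitions are invented for the example.
records = [
    ("gene_a", "hypothetical protein"),
    ("gene_b", "DNA gyrase subunit A"),
    ("gene_c", "Putative membrane protein"),
]
selected = select_operome(records)
```

Simple substring matching is used here because the text describes a phrase search; a real query against KEGG would rely on that database's own search interface.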
Conserved motif analysis
A conserved motif is closely linked to enzyme catalytic activity,
which helps determine the function and family of a protein [40]. To
determine the specific role of the CBOA operome from motifs, sequences
were examined using the KEGG-Motif search engine
(https://www.genome.jp/tools/motif/) and InterProScan [41]. We excluded
operome members that matched DUF domains or unspecific profiles, or whose
similarity hits did not meet the E-value cutoff of 10⁻⁵. Motif similarity
matches were found for 584 HPs, of which 320 were selected for further study.
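The exclusion criteria for motif hits can be sketched as a simple filter. This is a minimal illustration, assuming each hit has already been parsed into a record with a motif name and an E-value; the `MotifHit` type, the field names, and the example hits are hypothetical and do not reflect the output format of KEGG-Motif or InterProScan.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 1e-5  # cutoff stated in the text


@dataclass
class MotifHit:
    protein_id: str
    motif_name: str
    e_value: float


def keep_hit(hit: MotifHit) -> bool:
    """Discard DUF matches and hits weaker than the E-value cutoff."""
    is_duf = hit.motif_name.upper().startswith("DUF")
    return (not is_duf) and hit.e_value <= E_VALUE_CUTOFF


# Invented example hits for illustration.
hits = [
    MotifHit("hp_1", "DUF1234", 1e-20),   # dropped: domain of unknown function
    MotifHit("hp_2", "Pirin", 3e-12),     # kept
    MotifHit("hp_3", "HTH_ArsR", 2e-3),   # dropped: E-value above the cutoff
]
kept = [hit.protein_id for hit in hits if keep_hit(hit)]
```

Treating a smaller E-value as a stronger hit is the usual convention, so the filter keeps hits at or below the cutoff.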
Conserved domain analysis
Conserved domains are essential for understanding gene function, as
they provide insights into potential roles in cellular processes. They are
distinct structural and functional units within proteins and are particularly
useful for proteins that lack comprehensive annotations or are not well
conserved across species. We used NCBI-CDD v3.16 to search for
conserved domains in the protein query sequences [42]. This resource
contains domain models derived from tertiary protein structures [43].
Each query was compared with position-specific scoring matrices to
identify the relevant domains. The RPS-BLAST 2.2.28 tool was used to
predict domains from the HP sequences against the conserved domain
models. The SMART tool was used to identify the conserved domain
architecture and profiles [44]. PROSITE profiles were scanned to
identify structurally and functionally important protein domains
[45]. InterPro was used to classify each HP based on its predicted
domains and significant sites in the sequence [46].
Structural analysis
Protein structure prediction is a crucial task that connects protein
sequences to their three-dimensional structures. Accurate predictions
enhance our understanding of biological processes. We predicted
secondary structural features (helices, sheets, extended coils, and loops)
using SOPMA [47]. Homologous crystal structures were identified using
PSI-BLAST to infer structural and functional characteristics [48].
The parameters were as follows: threshold 0.005, BLOSUM62 matrix with
conditional compositional score matrix adjustment, gap existence penalty 11,
and gap extension penalty 1. Similarity hits were chosen from the Protein
Data Bank, and functional residue alignment was determined using ClustalW [49].
MEGA11 software [50] was used to create phylogenetic trees for the predicted
virulence proteins using the maximum likelihood method with 1000 bootstrap
replicates, and the trees were visualized using the iTOL v3.0 viewer [51].
Three-dimensional structures were constructed from the protein sequences
using SWISS-MODEL and the corresponding templates [52]. The
structural integrity and precision of the homology models were assessed
using potential functions [53].
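\[
\begin{aligned}
\text{QMEAN5 score} ={}& 0.3 \times \text{Score}_{\text{torsion 3-residue}}
 + 0.17 \times \text{Score}_{\text{pairwise } C_\beta/\text{SSE}}
 + 0.7 \times \text{Score}_{\text{solvation } C_\beta} \\
 &+ 80 \times \text{Score}_{\text{SSE PSIPRED}}
 + 45 \times \text{Score}_{\text{ACCpro}}
\end{aligned}
\]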
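The QMEAN5 composite is a weighted sum of five terms, which can be sketched in a few lines of Python. The weights are transcribed from the expression as printed here; the dictionary keys and the function name are hypothetical, and the sign convention and scaling of each term are assumed to be whatever the scoring tool reports.

```python
# Weights transcribed from the QMEAN5 expression in the text.
QMEAN5_WEIGHTS = {
    "torsion_3_residue": 0.3,
    "pairwise_cbeta_sse": 0.17,
    "solvation_cbeta": 0.7,
    "sse_psipred": 80.0,
    "acc_pro": 45.0,
}


def qmean5(terms: dict) -> float:
    """Return the weighted sum of the five QMEAN5 terms."""
    missing = set(QMEAN5_WEIGHTS) - set(terms)
    if missing:
        raise ValueError(f"missing terms: {sorted(missing)}")
    return sum(weight * terms[name] for name, weight in QMEAN5_WEIGHTS.items())


# With every term set to 1.0 the result is simply the sum of the weights.
example = qmean5({name: 1.0 for name in QMEAN5_WEIGHTS})
```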
Protein fold analysis
The class, architecture, topology, and homologous superfamily (CATH)
database is a hierarchical classification system for protein structures that
can predict protein folds with high accuracy, particularly at the topology
level [54]. It focuses on protein superfamilies whose fold members
overlap by at least 80 %. It contains a wide range of distinct fold
patterns, including structurally similar proteins with minimal
sequence similarity [55]. We used it to predict the functional properties
of HPs by analyzing their protein folds against the CATH database.
Evolutionary analysis
Protein function prediction through molecular evolution uses phy-
logenomic principles to accurately infer function even with sparse or
noisy data. A simple statistical model encodes knowledge of how mo-
lecular function evolves within a phylogenetic tree based on protein
sequences. SIFTER servers use a sequence-based method to predict
protein function and analyze evolutionary connections and annotation
quality [56]. It was used to determine domain family and Gene Ontology
functions, providing confidence scores for predictions. For a protein $g$ with
$k$ domains $g_i$ ($i = 1, \dots, k$), $S_{g_i}(f)$ denotes the probability that
the $i$th domain has function $f$. The probability $S_g(f)$ that protein $g$ has
function $f$ is then calculated as follows [57]:
\[
S_g(f) = 1 - \prod_{i=1}^{k} \left(1 - S_{g_i}(f)\right)
\]
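The combination of per-domain probabilities into a protein-level score is a noisy-OR: the protein is assigned function f unless every one of its k domains fails to carry it. A minimal Python sketch follows; the function name is hypothetical and the input is assumed to be a list of per-domain probabilities between 0 and 1.

```python
from math import prod


def protein_function_score(domain_scores):
    """Combine per-domain probabilities S_gi(f) into S_g(f) = 1 - prod(1 - S_gi(f))."""
    for score in domain_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("domain scores must lie in [0, 1]")
    return 1.0 - prod(1.0 - score for score in domain_scores)


# Two domains, each with probability 0.5 of carrying the function.
two_domains = protein_function_score([0.5, 0.5])
```

Because each factor (1 − S) is at most 1, adding a domain can only raise or maintain the protein-level score, and a single domain with probability 1 fixes the result at 1.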
Analysis of functional protein association networks
Protein function and activity are often influenced by other proteins,
so interaction partners provide insights for predicting protein function.
The STRING v10.5 server was used to predict the protein–protein
interactions (PPIs) of all HPs, assigning a confidence score based on
functional similarity [58,59].
Analysis of physicochemical properties
Physicochemical properties of amino acids significantly influence
protein function, affecting folding, stability, and interactions. Using
Expasy's ProtParam (http://web.expasy.org/protparam/), we calculated the
molecular weight, theoretical pI, total number of residues, instability
index, aliphatic index, and grand average of hydropathicity (GRAVY) for the
CBOA operome. The instability index estimates protein stability, with a score
below 40 indicating stability and a score above 40 indicating instability.
The aliphatic index measures the relative volume occupied by the aliphatic
side chains of amino acids. The GRAVY value was calculated by adding the
hydropathy values of all amino acids and dividing the sum by the number of
residues [60].
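Two of these indices are simple enough to compute directly, which makes their definitions concrete. The sketch below is an independent illustration rather than the ProtParam implementation: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and the commonly cited aliphatic index formula (mole percent of Ala, plus 2.9 times Val, plus 3.9 times Ile and Leu). The function names are hypothetical, and non-standard residue letters raise a `KeyError` in `gravy`.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: summed hydropathy divided by length."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Relative volume of aliphatic side chains (Ala, Val, Ile, Leu)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")

    def mole_percent(residue: str) -> float:
        return 100.0 * sequence.count(residue) / len(sequence)

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def is_predicted_stable(instability_index: float) -> bool:
    """Apply the threshold from the text: below 40 is classed as stable."""
    return instability_index < 40.0
```

A positive GRAVY value indicates an overall hydrophobic sequence and a negative value a hydrophilic one.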
Analysis of protein subcellular localization
Subcellular localization prediction determines the location of a protein
within a cell, providing valuable insights into its function. Localization
was predicted for all HPs using PSORTb v3.0.2 [61]. The propensity of a
protein to be a membrane protein was predicted using SOSUI v2.0 [62]. The
transmembrane helices and topology of each protein were detected using
TMHMM v2.0 [63] and HMMTOP v2.0 [64]. The signal peptide and the location
of its cleavage site in the peptide chain were predicted using SignalP v4.0 [65].
Virulence factor analysis
Virulence factors, which contribute to infection severity, are potential
drug/vaccine targets [66]. VICMpred [67] and MP3 [68] are Support Vector
Machine-based methods used to identify potential virulence factors from
HP sequences with 70.75 % accuracy. These servers use a five-fold
cross-validation technique. We assigned the maximum confidence score
of two to an HP if both servers predicted it as virulent.
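The two-server consensus can be expressed as a small scoring function. This sketch assumes the output of each server has been reduced to a yes/no virulence call; the function name is hypothetical, and the intermediate scores of 0 and 1 are an assumption, since the text only states that agreement between both servers yields the maximum score of two.

```python
def virulence_consensus(vicmpred_virulent: bool, mp3_virulent: bool) -> int:
    """Count how many of the two predictors call the protein virulent (0, 1, or 2)."""
    return int(vicmpred_virulent) + int(mp3_virulent)


# Invented calls for illustration: (VICMpred, MP3) per protein.
calls = {"hp_1": (True, True), "hp_2": (True, False), "hp_3": (False, False)}
high_confidence = [
    name for name, (vicm, mp3) in calls.items()
    if virulence_consensus(vicm, mp3) == 2
]
```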
Literature search
Knowledge-based discovery is the systematic extraction of valuable
information from bioinformatics and literature databases [69]. We
collected empirical data for each predicted protein from NCBI PubMed
(https://www.ncbi.nlm.nih.gov/pubmed/). We established a maximum
confidence level of 12, with 50 % based on computational prediction and
50 % based on manual annotation. If the predicted function of a protein was
the same across all methodologies, a score of 6 was assigned. Literature-based
confidence scores were assigned according to the source of the supporting
evidence (6–Same organism, 5–Phylogenetic neighbors, 4–Bacillus, 3–Bacteria,
2–Archaea, and 1–Eukaryotes), and a confidence score window of 3–6 was
established for both in silico prediction and knowledge-based techniques.
Proteins with a confidence score <3 were excluded from the dataset.
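The scoring scheme can be sketched as follows. The literature scores and the 3–6 window are taken from the text; how the in silico score is derived for partial agreement between methods is not specified, so this sketch assumes it is supplied by the caller as an integer from 0 to 6. The function and variable names are hypothetical.

```python
# Literature-based scores by source of supporting evidence (from the text).
LITERATURE_SCORE = {
    "same organism": 6,
    "phylogenetic neighbors": 5,
    "bacillus": 4,
    "bacteria": 3,
    "archaea": 2,
    "eukaryotes": 1,
}
MIN_SCORE, MAX_SCORE = 3, 6  # accepted window for each component


def total_confidence(in_silico_score: int, evidence_source: str):
    """Return the combined score (maximum 12), or None if either component
    falls outside the 3-6 window and the protein is excluded."""
    literature_score = LITERATURE_SCORE[evidence_source.lower()]
    for component in (in_silico_score, literature_score):
        if not MIN_SCORE <= component <= MAX_SCORE:
            return None
    return in_silico_score + literature_score
```

For example, a protein whose function agrees across all methods (score 6) and is supported by literature from the same organism (score 6) reaches the maximum of 12, whereas evidence only from archaea (score 2) leads to exclusion.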
Functional categorization
The genomic organization and order of HP gene clusters were
determined using the KEGG genome database, together with the annotations
of adjacent genes [70]. HPs were identified and classified based on protein
fold, functional properties, and metabolic subsystems using metabolic data
from the MetaCyc (Metabolic Pathways from all Domains of Life)
database, which contains pathways from various life forms [71].
Results
Functional classification and categorization
We predicted the functions of all HPs based on their sequence and
structural characteristics and categorized them into corresponding
molecular functions and metabolic subsystems. Approximately 14.62 % of
the operome (28 %) showed significant sequence similarity to known
proteins, with 6.65 % predicted using a combined bioinformatics
approach. Of these, 121 HPs were found to be exclusive to the CBOA
genome. Approximately 26 % of the operome contained the Rossmann
fold and 43 % consisted of miscellaneous folds (Fig. 1a). The Rossmann
fold is a tertiary structure found in proteins that bind nucleotides such as
the enzyme cofactors FAD, NAD+, and NADP+. It is composed of alternating
β-strands and α-helical segments, with the most conserved segment being the
initial β-α-β fold. It occurs in about 20 % of known protein structures, in
proteins that function as metabolic enzymes, DNA/RNA binders, and regulatory
proteins. The Arc repressor mutant fold and α-β plaits occupy 4 % of the
operome. The Arc repressor, a small homodimeric protein, contains two
mutants. The Arc homodimer has monomers that wrap around each other, forming
a globular structure with three regular secondary elements: a β-strand,
α-helix A, and α-helix B. In this fold, side chains form the hydrophobic core,
which is crucial for structure and stability. The α/β-plait fold is a protein
structural domain in which the secondary structure alternates between
α-helices and β-strands along the backbone. Common examples include
the flavodoxin fold and TIM barrel. This fold combines the stability of
β-sheets with the versatility of α-helices, allowing for diverse
ligand-binding sites and functional roles. Immunoglobulin, jelly roll, and
TIM barrel folds were detected in 3 % of the operome. The operome was
categorized based on metabolic subsystems, with most functions being
involved in amino acid metabolism, defense, and virulence (Fig. 1b). A
high proportion of the operome was predicted to be transporters (>85) and
transcriptional regulators (>45), with significant operome coverage for
hydrolase and transferase activities (Fig. 1c). Binding proteins (DNA,
RNA, and metals) covered the operome moderately. This study presents
the predicted functions of HPs in the genome annotation data, including
those not included in Supplementary File 2 (Tables S1-S4). It also
discusses the roles of the operome in transcriptional regulation,
metabolic subsystems, virulence, host defense, and cell wall architecture,
along with the relevant literature on CBOA.
Cellular process
The CBOA operome was predicted to contain proteins involved
in chromosome segregation and cell skeleton architecture similar to
those in yeast (Table 1). The Pirin-like protein might act as a
transcriptional regulator of cell death [72]. The NGG1p interacting factor 3
protein (NIF3-like protein 1) is found in animals and shares sequence
similarity with Helicobacter pylori GTP cyclohydrolase 1 type 2
[73–75]. The tRNA C32 thiolase is required to form the modified nucleoside
2-thiocytidine in this organism, as in archaea and other bacteria
(Jäger et al. 2004). The alanyl-tRNA synthetase predicted in this organism
catalyzes the attachment of alanine to its cognate tRNA molecule [76,77].
Fig. 1. Functional classification of operome from C. botulinum type A1 based on the protein fold. Abbreviations: AD-Aldehyde Degradation; AAB-Amino Acids
Biosynthesis; AAD-Amino Acids Degradation; ATC-Aminoacyl-tRNA Charging; CB-Carbohydrates Biosynthesis; CD-Carbohydrates Degradation; CSB-Cell Structures
Biosynthesis; CPEB-Cofactors, Prosthetic Groups, Electron Carriers Biosynthesis; DRS-DNA Reactions; DR-DNA Repair; FLB-Fatty Acid and Lipid Biosynthesis;
HD-Hormones Degradation; INM-Inorganic Nutrients Metabolism; MD-Mercury Detoxification; MMS-Miscellaneous; NNB-Nucleosides and Nucleotides Biosynthesis;
NND-Nucleosides and Nucleotides Degradation; PDT-Phosphoenolpyruvate (PEP)-Dependent Transport; PS-Photosynthesis; PMR-Protein-Modification Reactions;
PRS-Protein-Reactions; RRS-RNA-Reactions; SMB-Secondary Metabolites Biosynthesis; SMD-Secondary Metabolites Degradation; SMR-Small-Molecule Reactions;
TC-TCA cycle; TRR-tRNA-Reactions.
Table 1
Functional annotation of operome of C. botulinum type A1 involved in cellular process.
Locus tag | Assigned function | Gene
CBO0860 | Pirin-like protein | yhhW
CBO2935 | NGG1p interacting factor 3 protein | niF3
CBO0165 | tRNA C32 thiolase | ttcA
CBO1509 | Alanyl-tRNA synthetase | alaS
Metabolic subsystems
We assigned precise functions to the HPs involved in electron
transfer, carbohydrate and lipid metabolism, and phosphate and sulfate
assimilation (Table 2). NAD(P)H-binding flavin reductase from CBOA
produces reduced flavin for bacterial bioluminescence [78]. Ferredoxin
reductase, a member of the flavoprotein pyridine nucleotide cytochrome
reductase family, is involved in electron-transfer reactions (Hyde et al.
1991). The bifunctional coenzyme pyrroloquinoline quinone (PQQ)
synthesis protein in the CBOA is required for PQQ synthesis, but its
function remains unclear [79]. Quinoproteins, a class of
dehydrogenases, catalyze the oxidation of compounds in electron
transfer reactions [80]. Phospho-L-lactate guanylyltransferase activates
2-phospho-L-lactate via a pyrophosphate linkage for coenzyme F420
biosynthesis (Grochowski et al. 2008). Phosphoenolpyruvate carboxykinase
and lichenan-specific phosphotransferase are also involved in
carbohydrate metabolism [81]. Subtilisin-like serine protease has an
alpha/beta fold containing a 7-stranded parallel beta-sheet and
contributes to protein degradation in this organism [82]. Phosphatidic
acid phosphatase type 2 is a 5-helical enzyme found in this organism
that dephosphorylates phosphatidate into diacylglycerol and inorganic
phosphate [83,84]. This is similar to phosphoglycolate phosphatases,
which catalyze the dephosphorylation of 2-phosphoglycolate [85]. Acid
phosphatase/phosphotransferase is one of several unrelated acid
phosphatase families found in this organism, similar to those found in
humans and other mammals [86]. The CBS domain is located in cysteine
synthase, which is responsible for the formation of cysteine from
O-acetyl-serine and H2S with the concomitant release of acetic acid.
Host defense responses
We predicted the functions of HPs in the host defense responses of this
organism (Table 3). The predicted functions were categorized into cell
wall biogenesis, biofilm formation, starvation response, and metal
detoxification. N-acetylglucosaminyltransferase II catalyzes an essential
step in cell wall biosynthesis (D'Agostaro et al. 1995). The swim zinc
finger domain protein, which is highly conserved among Gram-positive
bacteria, stimulates biofilm formation by inducing the transcription of
the tapA-sipW-tasA operon [87]. S-Adenosyl-L-methionine (SAM)-dependent
methyltransferases utilize SAM as a cofactor to methylate
proteins, small molecules, lipids, and nucleic acids, contributing to
quorum sensing-dependent metabolic homeostasis of the activated
methyl cycle in the CBOA, similar to that in Burkholderia glumae [88].
Phasins, granule-associated proteins found in the CBOA operome, store
carbon and energy and confer stress resistance [89]. The cupin domain
protein, which contains a conserved barrel domain of the 'cupin' superfamily,
is a major nitrogen source for the survival of CBOA, as in plants [90]. The
nitrogen metabolite repression protein controls nitrogen metabolite
repression, similar to that in fungi [91]. Nucleoside triphosphate
pyrophosphohydrolase regulates the oxidative and nutritional stress responses
[92–94]. Enterocin A is a soluble cytoplasmic immunity protein that
confers bacteriocin resistance to CBOA by disorienting and closing
membrane pores [95,96]. The LURP1-related protein domain, similar to
the C-terminal domain of the Tubby protein, plays a role in the defense
against competing microorganisms. These proteins play crucial roles in
the survival and defense of various microorganisms [97,98]. Mercuric
reductase from CBOA is a flavoprotein that detoxifies Hg compounds by
reducing Hg(II) to Hg(0) using NADPH [99]. It is responsible for the
reduction and volatilization of mercury compounds [100]. The ArsR-type
HTH domain is a transcriptional regulator that is involved in the
stress response to heavy metal ions [101].
Transporter proteins
We successfully annotated 18 transporter proteins in the CBOA
operome, including type II and III secretion proteins, 2-hydroxycarboxylate
transporters, cell wall-active antibiotic response proteins, an inner
membrane putative ABC superfamily transporter permease, ECF transporter,
sulfur transporter, apolipoprotein A-I, sulfite exporter TauE/SafE family
protein, thiamine-binding periplasmic protein, bacterial PH domain
protein, and QueT transporter protein (Table 4). The predicted functions
primarily involve secretory, sulfur/sulfate, and carbohydrate transport
across the CBOA membrane.
Table 2
Functional annotation of operome of C. botulinum type A1 involved in metabolic subsystems.
Locus tag | Assigned function | EC | Gene
Electron transfer
CBO0286 | NAD(P)H-binding flavin reductase | 1.5.1.30 | cysJ/fre
CBO1829 | Ferredoxin | 1.18.1.2 | fpR
CBO2230 | Bifunctional coenzyme PQQ synthesis protein C/D | 1.3.3.11 | pqqCD
CBO2633 | Quinoprotein | 1.1.5.2 | gcD
CBO2613 | 2-Phospho-L-lactate guanylyltransferase | 2.7.7.68 | cofC
Carbohydrate
CBO1145 | Phosphoenolpyruvate carboxykinase | 4.1.1.49 | pck
CBO1241 | Lichenan-specific phosphotransferase | 2.7.1.69 | licC
CBO2322 | Subtilisin-like serine protease | 3.4.21.62 | sdD1
Lipid
CBO0364 | Phosphatidic acid phosphatase type 2 | – | plpP4
CBO0388 | Phosphoglycolate phosphatase | 3.1.3.18 | gph
Phosphate
CBO0464 | Acid phosphatase/Phosphotransferase | 3.1.3.2 | aphA
Sulfate
CBO0199 | Cystathionine beta-synthase (CBS domain) | 4.2.1.22 | cbS
Table 3
Functional annotation of operome of C. botulinum type A1 involved in host defence systems.
Locus tag | Assigned function | EC | Gene
Cell wall
CBO0127 | N-Acetylglucosaminyltransferase II | 2.4.1.141 | alG13/alG14
CBO0014 | Swim zinc finger domain protein | – | znF
Biofilm
CBO0116 | Veg protein | – | veg
CBO3144 | SAM-dependent methyltransferase | 2.1.1.176 | rsmB
Starvation
CBO0027 | Phasin | – | phaP
CBO2926, CBO0731 | Cupin domain protein | – | rmlC/oxdD
CBO3367 | Nitrogen metabolite repression protein | – | nmrA
CBO2373 | Nucleoside triphosphate pyrophosphohydrolase | 3.6.1.8 | mazG
CBO2563 | Enterocin A | – | entA
CBO1233 | LURP1-related protein domain | – | lurP1
Metal
CBO0058 | Mercuric reductase | 1.16.1.1 | merA
CBO3253 | Alkylmercury lyase | 4.99.1.2 | merB
CBO0617 | ArsR-type HTH domain | – | arsR
Discovery of new virulence proteins
As in other bacteria, new virulence proteins were predicted in the
CBOA operome, including proteins with DNA-binding and winged helix-turn-helix
domains in the transcription regulators of the crp-fnr family (Table 5).
These proteins are involved in regulating virulence factors and nitrogen
metabolism in CBOA, similar to several pathogens [102].
Bacteriocin-processing endopeptidase cleaves an N-terminal leader peptide in
bacteriocins, enabling their activation, as in other Gram-positive bacteria.
Calcineurin is involved in intracellular calcium signaling and regulates
various cellular processes, including the expression of virulence factors
and spore formation [103,104]. Prolyl oligopeptidase cleaves short
peptides on the C-terminal side of proline residues and is associated with
virulence in several pathogens by modulating host immune responses
and processing other virulence factors [105,106]. YbjY-like metal-binding
proteins are involved in metal ion binding and homeostasis,
protecting bacteria from metal ion toxicity and contributing to virulence
[107,108]. β-Alanyl aminopeptidase is a biomarker for Pseudomonas
aeruginosa in cystic fibrosis patients and a virulence factor in C. chauvoei
[109,110]. Quinoprotein glucose dehydrogenase from the CBOA catalyzes
the oxidation of glucose to gluconic acid, which is involved in
inorganic phosphorus-dissolving metabolism, virulence, and prodigiosin
antibiotic biosynthesis, similar to that in Proteobacteria (Fender et al.
2012) [111]. F5/8 type C domain-containing proteins from this
organism have high antigenicity indices, as described for C. perfringens
type A and C strains [112]. DNA methylation regulates virulence gene
expression in C. difficile, and the presence of DNA adenine
methyltransferase in the CBOA suggests a role in controlling spore formation
and colonization in response to virulence [113,114]. Fucose-specific
lectins may support host-pathogen interactions via protein glycosylation,
enhancing the attachment of spores to human cell membranes and
contributing to the pathogenicity of CBOA [115]. The YbbR domain of
the CBOA is an important activator of non-ribosomal peptide virulence
factor biosynthesis [116]. Based on our analyses, we suggest that these
virulence proteins are important targets in drug and vaccine
discovery.
PPI networks of virulence-associated proteins
We constructed PPI networks of the predicted virulence proteins in the
CBOA, highlighting their roles in maintaining bacterial virulence,
survival, and adaptation in the host environment (Fig. 2). The networks
indicated that bacteriocin-processing endopeptidase is central to bacteriocin
activation and resistance. Calcineurin, which displays specific
interactions, is involved in calcium signaling and the regulation of
virulence factors. YbjY-like metal-binding proteins are crucial for metal-ion
homeostasis and virulence-related processes. Alanyl aminopeptidase
interacts with several proteins, potentially aiding in tissue invasion and
infection. RNA-binding proteins regulate the expression of virulence
genes, and multimeric flavodoxin is involved in electron transfer and
redox reactions. These interactions may serve as potential therapeutic
targets for mitigating the pathogenic effects of this organism.
Phylogenetic analysis of virulence-associated proteins
The phylogeny of the virulence-associated proteins and their related
organisms (Fig. 3) was inferred to understand the evolutionary
relationships of the predicted proteins from the CBOA operome.
Multimeric flavodoxin, YbbR-like protein, calcineurin, and prolyl
oligopeptidase of the CBOA are phylogenetically related to those of
C. sporogenes. The fucose-specific lectin, RNA-binding protein, and
quinoprotein glucose dehydrogenase of this organism closely resemble those of
C. combesii. Alanyl aminopeptidase and NGG1p interacting factor 3
proteins are evolutionarily related to those found in C. botulinum CDC
297 and F str. 230613, respectively. DNA adenine methyltransferase and
YbjY-like metal-binding proteins showed phylogenetic proximity to
those found in Clostridium spp. The F5/8 type C domain protein and
bacteriocin-processing endopeptidase showed phylogenetic relationships
with those of C. sulfidigenes and Paraclostridium bifermentans, respectively.
The phylogenetic analysis in this study offers insights into the evolutionary
relationships and functional roles of virulence-associated proteins in
CBOA and related species. This revealed that some proteins are
conserved across species, whereas others have specialized roles related
to virulence and host interactions.
Discussion
The role of the operome remains unclear because of gaps in the
genomic biology of prokaryotes. Proteins with unclear functions have
been identified, characterized, and validated using biochemical and
genetic tests [117]. In silico approaches describe the operome of an
organism based on its genome, thereby helping to understand its physiological
activities [5]. This approach uses summative data from conserved
domains, structures, folds, protein–protein interactions, subcellular
localization, phylogenetic inference, and gene expression profiles to predict
molecular function. The tertiary structure of a protein is more conserved
than its sequence, and correct folding is crucial for accurate protein
function prediction [118–120]. Total proteomic information from operome data
provides valuable prediction metrics for protein-binding motifs,
catalytic cores, and functional classification. Functional annotation of the
operome has uncovered more protein pathways and novel domains and
motifs [33,121]. This method incorporates supplementary literature and
bioinformatics resources to assign protein functions in prokaryotes,
highlighting their key roles in cellular processes and metabolic
subsystems.
Predicting operome function is crucial for understanding molecular
pathogenesis and identifying drug targets in pathogenic organisms.
Bioinformatics tools have been used to functionally annotate proteins
from many pathogenic bacteria [20,30,31,36,122–124], but no such reports
have yet been made for C. botulinum. This study is the first to predict and
characterize unknown proteins in CBOA, a strain associated with human
botulism. We used various predictive measures to identify, characterize,
and validate the functions of HPs in the CBOA operome [125]. Combining
these predictions with the literature helps to explain growth physiology and
virulence in the human gastrointestinal tract, in line with previous
investigations [18,37,38]. The results of our study emphasize the importance
of metabolic subsystems, virulence mechanisms, and therapeutic
targets, based on the assigned functions of the CBOA operome. This helps
Table 4
Functional annotation of operome of C. botulinum type A1 involved in transporter systems.
Locus tag   Assigned function                                              TC number     Gene
CBO0180     Type III secretion system substrate exporter                   −             flhB/hrpN/yscU/spaS
CBO0363     2-Hydroxycarboxylate transporter                               2.A.24        yadS
CBO0551     Cell wall-active antibiotics response protein                  9.B.116.2.1   liaF
CBO0778     Inner membrane putative ABC superfamily transporter permease   3.A.1.5.11    ybhR
CBO0790     ECF transporter protein                                        2.A.88.1.1    −
CBO1577|    Sulphur transporter protein                                    2.A.1         dsrE
CBO1575|
CBO1581
CBO1758     Apolipoprotein A-I                                             5.B.2.2.4     apoa1
CBO1904     Type II secretory pathway, pseudopilin                         3.A.1.143.1   pulG
CBO2862|    Sulfite exporter TauE/SafE family protein                      2.A.52        tauE/safE
CBO3180|
CBO2460|
CBO2473|
CBO2467|
CBO2910     Thiamine-binding periplasmic protein                           2.A.102.4.5   thiB
CBO2937     Bacterial PH domain protein                                    3.A.1.19.1    −
CBO3177     QueT transporter protein                                       −             queT
Table 5
Functional annotation of prioritized virulence-associated proteins from operome of C. botulinum type A1.
Locus     Score      Gene   Assigned function
CBO0747   0.785111   −      Bacteriocin-processing endopeptidase
CBO1758   0.900486   gp66   Calcineurin
CBO1781   1.837087   pop    Prolyl oligopeptidase
CBO1828   0.810445   ybiY   YbjY-like metal-binding protein
CBO2138   1.655908   pepN   Alanyl aminopeptidase
CBO2578   1.016458   khpB   RNA-binding protein
CBO2603   0.486305   wrbA   Multimeric flavodoxin
CBO2633   0.794207   gcd    Quinoprotein glucose dehydrogenase
CBO2935   0.1128     nif3   NGG1p interacting factor 3 protein
CBO3022   0.710851   −      F5/8 type C domain protein
CBO3144   0.476182   camA   DNA adenine methyltransferase
CBO3353   2.351369   fleA   Fucose-specific lectin
CBO3430   0.54771    ybbR   YbbR-like protein
Fig. 2. PPI networks for inferring functional prediction of virulence proteins from C. botulinum type A1 operome.
predict the precise function of the bacterial operome [126].
Approximately 14.62 % of the CBOA operome showed sequence similarity to known
proteins, and we predicted functions for 6.65 % of the operome.
The proteins identified included Rossmann fold-containing proteins (26 %),
miscellaneous folds (43 %), arc repressor mutant folds (4 %), and
immunoglobulins, jelly rolls, and TIM barrels (3 % each). The CBOA operome
contains over 85 transporters and over 45 transcriptional regulators, as well as
121 unique proteins, highlighting its unique genomic repertoire. A significant
portion of the operome (28 %) contained newly identified functions, reflecting
ongoing discoveries in protein function prediction. These proteins display diverse
structural folds, consistent with their versatility across biological processes.
CBOA proteins play prominent roles in metabolic subsystems, including
amino acid metabolism, defense mechanisms, and virulence pathways.
The newly identified proteins play crucial roles in cellular and metabolic
processes. Rossmann-fold proteins are essential metabolic enzymes
involved in nucleotide binding. The key proteins involved in amino acid
metabolism include tRNA C32 thiolase, alanyl-tRNA synthetase, and
cysteine synthase. The electron transfer and metabolic subsystems
include NAD(P)H-binding flavin reductase and ferredoxin reductase.
Botulinum neurotoxin biosynthesis may be influenced by transcriptional
regulation, protein processing, virulence-factor modification, and immune
evasion. Newly identified proteins with DNA-binding and winged
helix-turn-helix domains are predicted to regulate the genes associated with
virulence and neurotoxin production.
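As a quick arithmetic check on the fold distribution quoted in this paragraph, the following Python sketch sums the itemized percentages and reports the share left unlisted. The figures are taken from the text. The reading that the remainder belongs to folds not itemized here (or reflects rounding) is an assumption, and the names FOLD_SHARES and listed_and_remaining are illustrative.

```python
# Fold shares (percent of the operome) as quoted in the Discussion.
FOLD_SHARES = {
    "Rossmann fold": 26,
    "Miscellaneous folds": 43,
    "Arc repressor mutant fold": 4,
    "Immunoglobulin": 3,
    "Jelly roll": 3,
    "TIM barrel": 3,
}


def listed_and_remaining(shares):
    """Return (sum of listed percentages, percentage not itemized)."""
    listed = sum(shares.values())
    return listed, 100 - listed


if __name__ == "__main__":
    listed, remaining = listed_and_remaining(FOLD_SHARES)
    print(f"Itemized folds: {listed} %; not itemized: {remaining} %")
```

The six itemized categories therefore cover 82 % of the operome, leaving 18 % that this section does not break down further.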
We annotated 18 transporter proteins involved in secretion and
transport across the CBOA membrane, including type II and type III secretion
system components and carbohydrate transporters. Transporter proteins,
transcriptional regulators, hydrolases, and transferases constitute a significant
proportion of the operome. CBOA proteins are implicated in chromosome
segregation, cytoskeletal architecture, transcriptional regulation, electron
transfer, carbohydrate and lipid metabolism, and cofactor synthesis.
HPs such as N-acetylglucosaminyltransferase II, bacteriocin-related proteins,
and the SWIM zinc finger domain protein were predicted to be involved in cell
wall biogenesis, biofilm formation, and stress responses. New virulence proteins
were identified, including those involved in DNA binding, calcium signaling, metal ion
homeostasis, and protein glycosylation. Key virulence-associated proteins,
such as bacteriocin-processing endopeptidase, calcineurin, prolyl
oligopeptidase, and RNA-binding proteins, are potential therapeutic
targets for combating CBOA infections.
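The count of 18 transporter proteins can be cross-checked against the locus tags printed in Table 4. The Python sketch below is illustrative only: the locus-tag cells are copied as printed, with "|" separating loci that share one annotation, and the helper name split_loci is an assumption introduced here.

```python
# Locus-tag cells of Table 4, copied as printed.
# "|" separates loci sharing one annotation; a trailing "|" is kept as in the source.
TABLE4_LOCUS_CELLS = [
    "CBO0180",
    "CBO0363",
    "CBO0551",
    "CBO0778",
    "CBO0790",
    "CBO1577|CBO1575|CBO1581",
    "CBO1758",
    "CBO1904",
    "CBO2862|CBO3180|CBO2460|CBO2473|CBO2467|",
    "CBO2910",
    "CBO2937",
    "CBO3177",
]


def split_loci(cells):
    """Expand '|'-joined cells into individual locus tags, skipping empty pieces."""
    loci = []
    for cell in cells:
        loci.extend(tag for tag in cell.split("|") if tag)
    return loci


if __name__ == "__main__":
    loci = split_loci(TABLE4_LOCUS_CELLS)
    print(f"{len(TABLE4_LOCUS_CELLS)} annotation rows, {len(loci)} loci")
```

Twelve annotation rows expand to 18 distinct loci, which agrees with the number of transporter proteins stated in the text.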
Our PPI network analysis identified virulence proteins in CBOA and
highlighted their roles in bacterial virulence, survival, and adaptation.
Phylogenetic analysis in this study revealed that
virulence-associated proteins in the CBOA operome are related to those of
various organisms, including C. sporogenes, C. combesii, C. botulinum, and
C. sulfidigens. These proteins have specialized roles in virulence and host
interactions, providing insights into their evolutionary relationships and
functional roles. Therefore, this study enhances our understanding of the
genomic and proteomic structure of CBOA, offering insights into its metabolic
versatility, defense strategies, and virulence mechanisms at the systems
level [5,10–14].
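To illustrate how the prioritized virulence-associated proteins in Table 5 might be ordered, the following Python sketch sorts the table rows by their reported score. This section does not restate how the score was computed or whether higher values indicate higher priority, so the descending order used here is an assumption made purely for illustration. The names TABLE5 and rank_by_score are hypothetical.

```python
# Rows of Table 5: (locus, score, gene, assigned function).
# None marks a gene name printed as "−" in the table.
TABLE5 = [
    ("CBO0747", 0.785111, None, "Bacteriocin-processing endopeptidase"),
    ("CBO1758", 0.900486, "gp66", "Calcineurin"),
    ("CBO1781", 1.837087, "pop", "Prolyl oligopeptidase"),
    ("CBO1828", 0.810445, "ybiY", "YbjY-like metal-binding protein"),
    ("CBO2138", 1.655908, "pepN", "Alanyl aminopeptidase"),
    ("CBO2578", 1.016458, "khpB", "RNA-binding protein"),
    ("CBO2603", 0.486305, "wrbA", "Multimeric flavodoxin"),
    ("CBO2633", 0.794207, "gcd", "Quinoprotein glucose dehydrogenase"),
    ("CBO2935", 0.1128, "nif3", "NGG1p interacting factor 3 protein"),
    ("CBO3022", 0.710851, None, "F5/8 type C domain protein"),
    ("CBO3144", 0.476182, "camA", "DNA adenine methyltransferase"),
    ("CBO3353", 2.351369, "fleA", "Fucose-specific lectin"),
    ("CBO3430", 0.54771, "ybbR", "YbbR-like protein"),
]


def rank_by_score(rows, top_n=None):
    """Sort rows by score, highest first (assumed ordering); optionally keep top_n."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    for locus, score, gene, function in rank_by_score(TABLE5, top_n=3):
        print(f"{locus}\t{score:.6f}\t{gene or '-'}\t{function}")
```

Under this assumed ordering, the fucose-specific lectin (CBO3353), prolyl oligopeptidase (CBO1781), and alanyl aminopeptidase (CBO2138) carry the three highest scores.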
Conclusions
This study provides insights into the molecular and metabolic functions of
the CBOA operome, which are crucial for understanding this organism.
Our annotation scheme considers sequence motifs, conserved domains,
protein structures, and evolutionary relationships to predict protein
functions. The operome of C. botulinum contains 521 HPs, with 293 previously
annotated and 228 newly identified HPs. Its operome contains 96
metabolic enzymes that recognize various elements such as DNA, RNA,
metals, and membranes for cellular and metabolic purposes. Our
approach assigned and categorized the functions of 74 metabolic enzymes,
80 transporter proteins, and eight cell division proteins from this
organism. Unique proteins contribute to cellular processes such as chromosome
segregation, electron transfer, and carbohydrate metabolism.
Transporter proteins play essential roles in cellular homeostasis and
Fig. 3. A phylogeny for inferring evolutionary relationships of predicted virulence-associated proteins from C. botulinum type A1 operome.
environmental adaptation. The discovery of novel virulence factors
underscores its potential pathogenicity and adaptation mechanisms in
mammalian hosts. However, these predictions rely on bioinformatics
tools and existing databases, and some proteins remain poorly characterized.
The functions of certain HPs should be evaluated through protein
expression, purification, crystallization, and structural studies for
further therapeutic development [127–132]. Further exploration of the
regulatory networks and molecular interactions between C. botulinum
and humans could provide critical insights into its pathogenesis.
Therefore, functional predictions of the operome have the potential to
identify novel drug targets and vaccine candidates for this emerging
pathogen.
CRediT authorship contribution statement
B. Roja: Validation, Methodology, Formal analysis, Data curation,
Conceptualization. S. Saranya: Formal analysis. R. Prathiviraj: Formal
analysis, Data curation, Conceptualization. P. Chellapandi: Writing –
original draft, Writing – review & editing, Validation, Supervision,
Methodology.
Declaration of competing interest
The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence
the work reported in this paper.
Acknowledgments
The authors express their gratitude to the Tamil Nadu State Council
for Higher Education (RGP/2019-20/BDU/HECP-0042), Government of
Tamil Nadu, for their financial support.
Ethics approval and consent to participate
The need for ethical approval and individual consent was waived.
Appendix A. Supplementary data
Supplementary data to this article can be found online at
https://doi.org/10.1016/j.meomic.2024.100040.
References
[1] Prisilla A, Deena Remin M, Roja B, Chellapandi P. A human-food web-animal
interface on the prevalence of food-borne pathogens (Clostridia and Enterococcus)
in mixed veterinary farms. Food Sci Biotechnol 2019;28:1583–91.
[2] Pernu N, Keto-Timonen R, Lindström M, Korkeala H. High prevalence of
Clostridium botulinum in vegetarian sausages. Food Microbiol 2020;91:103512.
[3] Lonati D, Schicchi A, Crevani M, Buscaglia E, Scaravaggi G, Maida F, et al.
Foodborne botulism: Clinical diagnosis and medical treatment. Toxins (Basel)
2020;12:509.
[4] Hatami F, Shokouhi S, Mardani M, Shabani M, Gachkar L, AlaviDarazam I. Early
recovery of botulism: One decade of experience. Clin Toxicol (Phila) 2021;59:
628–32.
[5] Chellapandi P, Prisilla A. Clostridium botulinum type A-virulome-gut interactions:
A systems biology insight. Hum Microbiome J 2018;7–8:15–22.
[6] Dhaked RK, Singh MK, Singh P, Gupta P. Botulinum toxin: Bioweapon & magic
drug. Indian J Med Res 2010;132:489–503.
[7] Lúquez C, Edwards L, Griffin C, Sobel J. Foodborne botulism outbreaks in the
United States, 2001–2017. Front Microbiol 2021;12:713101.
[8] Popoff MR. Clostridial pore-forming toxins: Powerful virulence factors. Anaerobe
2014;30:220–38.
[9] Muhammad SA, Ahmed S, Ali A, Huang H, Wu X, Yang XF, et al. Prioritizing drug
targets in Clostridium botulinum with a computational systems biology approach.
Genomics 2014;104:24–35.
[10] Roja B, Chellapandi P. Design and characterization of a multi-epitope vaccine
against Clostridium botulinum A3 Loch Maree intoxication in humans. Gene
2024;892:147865.
[11] Roja B, Saranya S, Thamanna L, Chellapandi P. Inferring molecular mechanisms
of host-microbe-drug interactions in the human gastrointestinal tract. Med Omics
2024;10:100027.
[12] Roja B, Saranya S, Chellapandi P. Discovery of novel virulence mechanisms in
Clostridium botulinum type A3 using genome-wide analysis. Gene 2023;869:
147402.
[13] Roja B, Thamanna L, Chellapandi P. Reconstruction and analysis of the
transcriptome regulatory network of Clostridium botulinum type A3 str. Loch
Maree. Asian J Microbiol Biotechnol Environ Exp Sci 2023;25(2):102–8.
[14] Saranya S, Thamanna L, Chellapandi P. Unveiling the potential of systems biology
in biotechnology and biomedical research. Syst Microbiol Biomanuf 2024.
https://doi.org/10.1007/s43393-024-00286-4.
[15] Sebaihia M, Peck MW, Minton NP, Thomson NR, Holden MT, Mitchell WJ, et al.
Genome sequence of a proteolytic (Group I) Clostridium botulinum strain Hall A
and comparative analysis of the clostridial genomes. Genome Res 2007;17:
1082–92.
[16] Greenbaum D, Luscombe NM, Jansen R, Qian J, Gerstein M. Interrelating
different types of genomic data, from proteome to secretome: ’oming in on
function. Genome Res 2001;11:1463–8.
[17] Chellapandi P, Mohamed Khaja Hussain M, Prathiviraj R. CPSIR-CM: A database
for structural properties of proteins identified in cyanobacterial C1 metabolism.
Algal Res 2017;22:135–9.
[18] Prathiviraj R, Chellapandi P. Functional annotation of operome from
Methanothermobacter thermautotrophicus ΔH: An insight to metabolic gap filling.
Int J Biol Macromol 2019;123:350–62.
[19] Mazandu GK, Mulder NJ. Function prediction and analysis of Mycobacterium
tuberculosis hypothetical proteins. Int J Mol Sci 2012;13:7283–302.
[20] Shahbaaz M, Hassan MI, Ahmad F. Functional annotation of conserved
hypothetical proteins from Haemophilus influenzae Rd KW20. PloS One 2013;8:
e84263.
[21] Lobb B, Tremblay BJ, Moreno-Hagelsieb G, Doxey AC. An assessment of genome
annotation coverage across the bacterial tree of life. Microb Genom 2020;6:
e000341.
[22] Jitendra S, Narula R, Agnihotri S, Singh M. Annotation of hypothetical proteins
orthologous in Pongo abelii and Sus scrofa. Bioinformation 2011;6:297–9.
[23] da Fonsêca MM, Zaha A, Caffarena ER, Vasconcelos AT. Structure-based
functional inference of hypothetical proteins from Mycoplasma hyopneumoniae.
J Mol Model 2012;18:1917–25.
[24] Namboori S, Mhatre N, Sujatha S, Srinivasan N, Pandit SB. Enhanced functional
and structural domain assignments using remote similarity detection procedures
for proteins encoded in the genome of Mycobacterium tuberculosis H37Rv. J Biosci
2004;29:245–59.
[25] Yellaboina S, Goyal K, Mande SC. Inferring genome-wide functional linkages in E.
coli by combining improved genome context methods: comparison with high-
throughput experimental data. Genome Res 2007;17:527–35.
[26] Doerks T, van Noort V, Minguez P, Bork P. Annotation of the M. tuberculosis
hypothetical orfeome: adding functional information to more than half of the
uncharacterized proteins. PloS One 2012;7:e34302.
[27] Mao C, Shukla M, Larrouy-Maumus G, Dix FL, Kelley LA, Sternberg MJ, et al.
Functional assignment of Mycobacterium tuberculosis proteome revealed by
genome-scale fold-recognition. Tuberculosis (Edinb) 2013;93:40–6.
[28] McAdow M, Kim HK, Dedent AC, Hendrickx AP, Schneewind O, Missiakas DM.
Preventing Staphylococcus aureus sepsis through the inhibition of its agglutination
in blood. PloS Pathog 2011;7:e1002307.
[29] McAdow M, DeDent AC, Emolo C, Cheng AG, Kreiswirth BN, Missiakas DM, et al.
Coagulases as determinants of protective immune responses against
Staphylococcus aureus. Infect Immun 2012;80:3389–98.
[30] Varma PBS, Adimulam YB, Kodukula S. Insilico functional annotation of a
hypothetical protein from Staphylococcus aureus. J Infect Public Health 2015;8:
526–32.
[31] Islam MS, Shahik SM, Sohel M, Patwary NI, Hasan MA. In silico structural and
functional annotation of hypothetical proteins of Vibrio cholerae O139. Genomics
Inform 2015;13:53–9.
[32] Poulsen C, Akhter Y, Jeon AH, Schmitt-Ulms G, Meyer HE, Stefanski A, et al.
Proteome-wide identification of mycobacterial pupylation targets. Mol Syst Biol
2010;6:386.
[33] Ijaq J, Chandrasekharan M, Poddar R, Bethi N, Sundararajan VS. Annotation and
curation of uncharacterized proteins- challenges. Front Genet 2015;6:119.
[34] Sivashankari S, Shanmughavel P. Functional annotation of hypothetical proteins -
A review. Bioinformation 2006;1:335–8.
[35] Chellapandi P, Mohamed Khaja Hussain M, Prathiviraj R. CPSIR-CM: A database
for structural properties of proteins identified in cyanobacterial C1 metabolism.
Algal Res 2017;22:135–9.
[36] Singh G, Singh V. Functional elucidation of hypothetical proteins for their
indispensable roles toward drug designing targets from Helicobacter pylori strain
HPAG1. J Biomol Struct Dyn 2018;36:906–18.
[37] Bharathi M, Chellapandi P. Comparative analysis of differential proteome-wide
protein-protein interaction network of Methanobrevibacter ruminantium M1.
Biochem Biophys Rep 2019;20:100698.
[38] Sangavai C, Prathiviraj R, Chellapandi P. Functional prediction, characterization
and categorization of operome from Acetoanaerobium sticklandii DSM 519.
Anaerobe 2020;61:102088.
[39] Kanehisa M, Furumichi M, Sato Y, Ishiguro-Watanabe M, Tanabe M. KEGG:
integrating viruses and cellular organisms. Nucleic Acids Res 2021;49(D1):D545–51.
[40] Bork P, Koonin EV. Protein sequence motifs. Curr Opin Struct Biol 1996;6(3):
366–76.
[41] Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, et al.
InterProScan: protein domains identifier. Nucleic Acids Res 2005;33:W116–20.
[42] Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, et al.
CDD: NCBI’s conserved domain database. Nucleic Acids Res 2015;43:D222–6.
[43] Marchler-Bauer A, Lu S, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-
Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI,
Jackson JD, Ke Z, Lanczycki CJ, Lu F, Marchler GH, Mullokandov M,
Omelchenko MV, Robertson CL, Song JS, Thanki N, Yamashita RA, Zhang D,
Zhang N, Zheng C, Bryant SH. CDD: a Conserved Domain Database for the
functional annotation of proteins. Nucleic Acids Res 2011;(Database issue):
D225–9.
[44] Letunic I, Doerks T, Bork P. SMART 7: recent updates to the protein domain
annotation resource. Nucleic Acids Res 2012;40:D302–5.
[45] de Castro E, Sigrist CJ, Gattiker A, Bulliard V, Langendijk-Genevaux PS,
Gasteiger E, et al. ScanProsite: Detection of PROSITE signature matches and
ProRule-associated functional and structural residues in proteins. Nucleic Acids
Res 2006;34(Web server issue):W362–5.
[46] Finn RD, Attwood TK, Babbitt PC, et al. InterPro in 2017-beyond protein family
and domain annotations. Nucleic Acids Res 2016;45:D190–9.
[47] Geourjon C, Deléage G. SOPMA: Significant improvements in protein secondary
structure prediction by consensus prediction from multiple alignments. Comput
Appl Biosci 1995;11:681–4.
[48] Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped
BLAST and PSI-BLAST: a new generation of protein database search programs.
Nucleic Acids Res 1997;25:3389–402.
[49] Thompson JD, Gibson TJ, Higgins DG. Multiple sequence alignment using
ClustalW and ClustalX. Curr Protoc Bioinformatics 2002;2. Chapter 2:Unit 2.3.
[50] Tamura K, Stecher G, Kumar S. MEGA11: Molecular evolutionary genetics analysis
version 11. Mol Biol Evol 2021;38(7):3022–7.
[51] Letunic I, Doerks T, Bork P. SMART: Recent updates, new developments and
status in 2015. Nucleic Acids Res 2015;43:D257–60.
[52] Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, et al. SWISS-
MODEL: Modelling protein tertiary and quaternary structure using evolutionary
information. Nucleic Acids Res 2014;42(Web server issue):W252–8.
[53] Benkert P, Tosatto SC, Schomburg D. QMEAN: A comprehensive scoring function
for model quality assessment. Proteins 2008;71:261–77.
[54] Micsonai A, Wien F, Kernya L, Lee YH, Goto Y, Réfrégiers M, et al. Accurate
secondary structure prediction and fold recognition for circular dichroism
spectroscopy. Proc Natl Acad Sci 2015;112(24):E3095–103.
[55] Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM. CATH–a
hierarchic classification of protein domain structures. Structure 1997;5(8):
1093–109.
[56] Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, et al.
A large-scale evaluation of computational protein function prediction. Nat
Methods 2013;10(3):221–7.
[57] Sahraeian SM, Luo KR, Brenner SE. SIFTER search: A web server for accurate
phylogeny-based protein function prediction. Nucleic Acids Res 2015;43:
W141–7.
[58] Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J,
et al. STRING v10: protein-protein interaction networks, integrated over the tree
of life. Nucleic Acids Res 2015;43(Database issue):D447–52.
[59] Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, et al. The
STRING database in 2017: quality-controlled protein-protein association
networks, made broadly accessible. Nucleic Acids Res 2017;45(D1):D362–8.
[60] Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of
a protein. J Mol Biol 1982;157:105–32.
[61] Yu NY, Wagner JR, Laird MR, Melli G, Rey S, Lo R, et al. PSORTb 3.0: improved
protein subcellular localization prediction with refined localization subcategories
and predictive capabilities for all prokaryotes. Bioinformatics 2010;26:1608–15.
[62] Mitaku S, Hirokawa T, Tsuji T. Amphiphilicity index of polar amino acids as an
aid in the characterization of amino acid preference at membrane-water
interfaces. Bioinformatics 2002;18:608–16.
[63] Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane
protein topology with a hidden Markov model: Application to complete genomes.
J Mol Biol 2001;305:567–80.
[64] Tusnády GE, Simon I. The HMMTOP transmembrane topology prediction server.
Bioinformatics 2001;17:849–50.
[65] Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating
signal peptides from transmembrane regions. Nat Methods 2011;8:785–6.
[66] Baron C, Coombes B. Targeting bacterial secretion systems: benefits of
disarmament in the microcosm. Infect Disord Drug Targets (Formerly Current
Drug Targets-Infectious Disorders) 2007;7(1):19–27.
[67] Saha S, Raghava GPS. VICMpred: an SVM-based method for the prediction of
functional proteins of Gram-negative bacteria using amino acid patterns and
composition. Genomics Proteomics Bioinformatics 2006;4(1):42–7.
[68] Gupta A, Kapil R, Dhakan DB, Sharma VK. MP3: a software tool for the prediction
of pathogenic proteins in genomic and metagenomic data. PloS One 2014;9(4):
e93907.
[69] D’Agostaro GA, Zingoni A, Moritz RL, Simpson RJ, Schachter H, Bendiak B.
Molecular cloning and expression of cDNA encoding the rat UDP-N-
acetylglucosamine:alpha-6-D-mannoside beta-1,2-N-
acetylglucosaminyltransferase II. J Biol Chem 1995;270:15211–21.
[70] Kanehisa M, Sato Y, Furumichi M, Morishima K, Tanabe M. New approach for
understanding genome variations in KEGG. Nucleic Acids Res 2018.
https://doi.org/10.1093/nar/gky962.
[71] Caspi R, Altman T, Billington R, et al. The MetaCyc database of metabolic
pathways and enzymes and the BioCyc collection of pathway/genome databases.
Nucleic Acids Res 2014;42:D459–71.
[72] Licciulli S, Luise C, Scafetta G, Capra M, Giardina G, Nuciforo P, et al. Pirin
inhibits cellular senescence in melanocytic cells. Am J Pathol 2011;178:
2397–406.
[73] Tascou S, Uedelhoven J, Dixkens C, Nayernia K, Engel W, Burfeind P. Isolation
and characterization of a novel human gene, NIF3L1, and its mouse ortholog,
Nif3l1, highly conserved from bacteria to mammals. Cytogenet Cell Genet 2000;
90:330–6.
[74] Tascou S, Kang TW, Trappe R, Engel W, Burfeind P. Identication and
characterization of NIF3L1 BP1, a novel cytoplasmic interaction partner of the
NIF3L1 protein. Biochem Biophys Res Commun 2003;309:440–8.
[75] Choi WY, Gemberling M, Wang J, Holdway JE, Shen MC, Karlstrom RO, et al. In
vivo monitoring of cardiomyocyte proliferation to identify chemical modifiers of
heart regeneration. Development 2013;140:660–6.
[76] Woese CR, Olsen GJ, Ibba M, Söll D. Aminoacyl-tRNA synthetases, the genetic
code, and the evolutionary process. Microbiol Mol Biol Rev 2000;64:202–36.
[77] Francklyn C, Perona JJ, Puetz J, Hou YM. Aminoacyl-tRNA synthetases: Versatile
players in the changing theater of translation. RNA 2002;8:1363–72.
[78] Russell TR, Tu SC. Aminobacter aminovorans NADH:flavin oxidoreductase His140:
A highly conserved residue critical for NADH binding and utilization.
Biochemistry 2004;43:12887–93.
[79] Toyama H, Fukumoto H, Saeki M, Matsushita K, Adachi O, Lidstrom ME. PqqC/D,
which converts a biosynthetic intermediate to pyrroloquinolinequinone. Biochem
Biophys Res Commun 2002;299:268–72.
[80] Oubrie A, Rozeboom HJ, Dijkstra BW. Active-site structure of the soluble
quinoprotein glucose dehydrogenase complexed with methylhydrazine: A
covalent cofactor-inhibitor complex. Proc Natl Acad Sci U S A 1999;96:11787–91.
[81] Villarreal JM, Bueno C, Arenas F, Jabalquinto AM, González-Nilo FD,
Encinas MV, et al. Nucleotide specificity of Saccharomyces cerevisiae
phosphoenolpyruvate carboxykinase. Kinetics, fluorescence spectroscopy, and
molecular simulation studies. Int J Biochem Cell Biol 2006;38:576–88.
[82] Bergeron F, Leduc R, Day R. Subtilase-like pro-protein convertases: From
molecular specificity to therapeutic applications. J Mol Endocrinol 2000;24:1–22.
[83] Ishikawa K, Mihara Y, Gondoh K, Suzuki E, Asano Y. X-ray structures of a novel
acid phosphatase from Escherichia blattae and its complex with the transition-state
analog molybdate. EMBO J 2000;19:2412–23.
[84] Carman GM, Han GS. Roles of phosphatidate phosphatase enzymes in lipid
metabolism. Trends Biochem Sci 2006;31:694–9.
[85] Kim Y, Yakunin AF, Kuznetsova E, Xu X, Pennycooke M, Gu J, et al. Structure- and
function-based characterization of a new phosphoglycolate phosphatase from
Thermoplasma acidophilum. J Biol Chem 2004;279:517–26.
[86] Ling P, Roberts RM. Uteroferrin and intracellular tartrate-resistant acid
phosphatases are the products of the same gene. J Biol Chem 1993;268:
6896–902.
[87] Lei Y, Oshima T, Ogasawara N, Ishikawa S. Functional analysis of the protein Veg,
which stimulates biofilm formation in Bacillus subtilis. J Bacteriol 2013;195:
1697–705.
[88] Chun H, Choi O, Goo E, Kim N, Kim H, Kang Y, et al. The quorum sensing-
dependent gene katG of Burkholderia glumae is important for protection from
visible light. J Bacteriol 2009;191:4152–7.
[89] de Almeida A, Nikel PI, Giordano AM, Pettinari MJ. Effects of granule-associated
protein PhaP on glycerol-dependent growth and polymer production in poly(3-
hydroxybutyrate)-producing Escherichia coli. Appl Environ Microbiol 2007;73:
7912–6.
[90] Dunwell JM. Cupins: A new superfamily of functionally diverse proteins that
include germins and plant storage proteins. Biotechnol Genet Eng Rev 1998;15:
1–32.
[91] Stammers DK, Ren J, Leslie K, Nichols CE, Lamb HK, Cocklin S, et al. The
structure of the negative transcriptional regulator NmrA reveals a structural
superfamily which includes the short-chain dehydrogenase/reductases. EMBO J
2001;20:6619–26.
[92] Gross M, Marianovsky I, Glaser G. MazG - A regulator of programmed cell death
in Escherichia coli. Mol Microbiol 2006;59:590–601.
[93] Lu LD, Sun Q, Fan XY, Zhong Y, Yao YF, Zhao GP. Mycobacterial MazG is a novel
NTP pyrophosphohydrolase involved in oxidative stress response. J Biol Chem
2010;285:28076–85.
[94] Lee S, Kim MH, Kang BS, Kim JS, Kim GH, Kim YG, et al. Crystal structure of
Escherichia coli MazG, the regulator of nutritional stress response. J Biol Chem
2008;283:15232–40.
[95] Fimland G, Eijsink VGH, Nissen-Meyer J. Comparative studies of immunity
proteins of pediocin-like bacteriocins. Microbiol 2002;148:3661–70.
[96] Soliman W, Bhattacharjee S, Kaur K. Molecular dynamics simulation study of
interaction between a class IIa bacteriocin and its immunity protein. Biochim
Biophys Acta 2007;1774:1002–13.
[97] Bateman A, Finn RD, Sims PJ, Wiedmer T, Biegert A, Söding J. Phospholipid
scramblases and Tubby-like proteins belong to a new superfamily of membrane
tethered transcription factors. Bioinformatics 2009;25:159–62.
[98] Knoth C, Eulgem T. The oomycete response gene LURP1 is required for defense
against Hyaloperonospora parasitica in Arabidopsis thaliana. Plant J 2008;55:
53–64.
[99] Blaghen M, Vidon DJ, el Kebbaj MS. Purification and properties of mercuric
reductase from Yersinia enterocolitica 138A14. Can J Microbiol 1993;39:193–200.
[100] Kiyono M, Omura T, Inuzuka M, Fujimori H, Pan-Hou H. Nucleotide sequence and
expression of the organomercurial-resistance determinants from a Pseudomonas K-
62 plasmid pMR26. Gene 1997;189:151–7.
[101] Busenlehner LS, Pennella MA, Giedroc DP. The SmtB/ArsR family of
metalloregulatory transcriptional repressors: Structural insights into prokaryotic
metal resistance. FEMS Microbiol Rev 2003;27:131–43.
[102] Irvine AS, Guest JR. Lactobacillus casei contains a member of the CRP-FNR family.
Nucleic Acids Res 1993;21:753.
[103] Lee JT, Whitson BA, Kelly RF, D’Cunha J, Dunitz JM, Hertz MI, et al. Calcineurin
inhibitors and Clostridium difficile infection in adult lung transplant recipients: The
effect of cyclosporine versus tacrolimus. J Surg Res 2013;184:599–604.
[104] Juvvadi PR, Lee SC, Heitman J, Steinbach WJ. Calcineurin in fungal virulence and
drug resistance: Prospects for harnessing targeted inhibition of calcineurin for an
antifungal therapeutic approach. Virulence 2017;8:186–97.
[105] García-Horsman JA. The role of prolyl oligopeptidase, understanding the puzzle.
Ann Transl Med 2020;8:983.
[106] Lasse C, Azevedo CS, de Araújo CN, Motta FN, Andrade MA, Rocha AP, et al.
Prolyl Oligopeptidase from Leishmania infantum: Biochemical characterization
and involvement in macrophage infection. Front Microbiol 2020;11:1060.
[107] Vercruysse M, Köhrer C, Davies BW, Arnold MFF, Mekalanos JJ, RajBhandary UL,
et al. The highly conserved bacterial RNase YbeY is essential in Vibrio cholerae,
playing a critical role in virulence, stress regulation, and RNA processing. PloS
Pathog 2014;10:e1004175.
[108] Sharma D, Sharma A, Singh B, Verma SK. Bioinformatic exploration of metal-
binding proteome of zoonotic pathogen Orientia tsutsugamushi. Front Genet 2019;
10:797.
[109] Frey J, Falquet L. Patho-genetics of Clostridium chauvoei. Res Microbiol 2015;166:
384–92.
[110] Thompson R, Stephenson D, Sykes HE, Perry JD, Stanforth SP, Dean JR. Detection
of β-alanyl aminopeptidase as a biomarker for Pseudomonas aeruginosa in the
sputum of patients with cystic fibrosis using exogenous volatile organic
compound evolution. RSC Adv 2020;10:10634–45.
[111] Li Y, Zhang J, Gong Z, Xu W, Mou Z. Gcd gene diversity of quinoprotein glucose
dehydrogenase in the sediment of Sancha Lake and its response to the
environment. Int J Environ Res Public Health 2018;16:1.
[112] SenGupta N, Alam SI, Kumar B, Kumar RB, Gautam V, Kumar S, et al.
Comparative proteomic analysis of extracellular proteins of Clostridium perfringens
type A and type C strains. Infect Immun 2010;78:3957–68.
[113] Heusipp G, Fälker S, Schmidt MA. DNA adenine methylation and bacterial
pathogenesis. Int J Med Microbiol 2007;297:1–7.
[114] Zhou J, Horton JR, Yu D, Ren R, Blumenthal RM, Zhang X, et al. Repurposing
epigenetic inhibitors to target the Clostridioides difficile-specific DNA adenine
methyltransferase and sporulation regulator CamA. Epigenetics 2022;17:970–81.
[115] Kuboi S, Ishimaru T, Tamada S, Bernard EM, Perlin DS, Armstrong D. Molecular
characterization of AfuFleA, an L-fucose-specific lectin from Aspergillus
fumigatus. J Infect Chemother 2013;19:1021–8.
[116] Yasgar A, Foley TL, Jadhav A, Inglese J, Burkart MD, Simeonov A. A strategy to
discover inhibitors of Bacillus subtilis surfactin-type phosphopantetheinyl
transferase. Mol Biosyst 2010;6:365–75.
[117] Mills CL, Beuning PJ, Ondrechen MJ. Biochemical functional predictions for
protein structures of unknown or uncertain function. Comput Struct Biotechnol J
2015;13:182–91.
[118] Eisenstein E, Gilliland GL, Herzberg O, Moult J, Orban J, Poljak RJ, et al.
Biological function made crystal clear—annotation of hypothetical proteins via
structural genomics. Curr Opin Biotechnol 2000;11(1):25–30.
[119] Shapiro L, Harris T. Finding function through structural genomics. Curr Opin
Biotechnol 2000;11:31–5.
[120] Dobson PD, Cai Y-D, Stapley BJ, Doig AJ. Prediction of protein function in the
absence of significant sequence similarity. Curr Med Chem 2004;11(16):2135–42.
[121] Zhang T, Tan P, Wang L, Jin N, Li Y, Zhang L, et al. RNALocate: a resource for
RNA subcellular localizations. Nucleic Acids Res 2017;45:D135–8.
[122] Naqvi AAT, Ahmad F, Hassan MI. Identification of functional candidates amongst
hypothetical proteins of Mycobacterium leprae Br 4923, a causative agent of
leprosy. Genome 2015;58(1):25–42.
[123] Naqvi AAT, Shahbaaz M, Ahmad F, Hassan MI. Identification of functional
candidates amongst hypothetical proteins of Treponema pallidum ssp. pallidum.
PloS One 2015;10(4):e0124177.
[124] Marklevitz J, Harris LK. Prediction driven functional annotation of hypothetical
proteins in the major facilitator superfamily of S. aureus NCTC 8325.
Bioinformation 2016;12(4):254.
[125] Kotze HL, Armitage EG, Sharkey KJ, Allwood JW, Dunn WB, Williams KJ, et al.
A novel untargeted metabolomics correlation-based network analysis
incorporating human metabolic reconstructions. BMC Syst Biol 2013;7:107.
[126] Kumar K, Prakash A, Tasleem M, Islam A, Ahmad F, Hassan MI. Functional
annotation of putative hypothetical proteins from Candida dubliniensis. Gene
2014;543:93–100.
[127] George Priya Doss C, Rajasekaran R, Sudandiradoss C, Ramanathan K, Purohit R,
Sethumadhavan R. A novel computational and structural analysis of nsSNPs in
CFTR gene. Genomic Med 2008;2(1–2):23–32.
[128] Kumar A, Rajendran V, Sethumadhavan R, Purohit R. Evidence of colorectal
cancer-associated mutation in MCAK: a computational report. Cell Biochem
Biophys 2013;67(3):837–51.
[129] Kumar A, Rajendran V, Sethumadhavan R, Purohit R. Molecular dynamic
simulation reveals damaging impact of RAC1 F28L mutation in the switch I
region. PloS One 2013;8(10):e77453.
[130] Kumar A, Rajendran V, Sethumadhavan R, Shukla P, Tiwari S, Purohit R.
Computational SNP analysis: current approaches and future prospects. Cell
Biochem Biophys 2014;68(2):233–9.
[131] Sharma J, Bhardwaj VK, Das P, Purohit R. Identification of naturally originated
molecules as γ-aminobutyric acid receptor antagonist. J Biomol Struct Dyn 2021;
39(3):911–22.
[132] Singh R, Bhardwaj VK, Sharma J, Das P, Purohit R. Identification of selective
cyclin-dependent kinase 2 inhibitor from the library of pyrrolone-fused
benzosuberene compounds: an in silico exploration. J Biomol Struct Dyn 2022;40
(17):7693–701.