Transcription factor family-based reconstruction of singleton regulons: study of the Crp/Fnr, ArsR and GntR families in Desulfovibrionales genomes.

Lawrence Berkeley National Laboratory, Berkeley, CA 94710, USA.
Journal of Bacteriology (Impact Factor: 2.69). 10/2012; DOI: 10.1128/JB.01977-12
Source: PubMed

ABSTRACT: Accurate detection of transcriptional regulatory elements is essential for high-quality genome annotation, metabolic reconstruction and modeling of regulatory networks. We developed a computational approach for reconstruction of regulons operated by transcription factors (TFs) from large protein families and applied this novel approach to three TF families in ten Desulfovibrionales genomes. Phylogenetic analyses of 125 regulators from the ArsR, Crp/Fnr, and GntR families revealed that 65% of these regulators (termed reference TFs) are well conserved in Desulfovibrionales, while the remaining 35% of regulators (termed singleton TFs) are species-specific and show a mosaic distribution. For regulon reconstruction in the group of singleton TFs, the standard orthology-based approach was inefficient, and thus we developed a novel approach based on the simultaneous study of all homologous TFs from the same family in a group of genomes. As a result, we identified binding motifs for 21 singleton TFs and for all reference TFs in all three analyzed families. Within each TF family we observed structural similarities between DNA binding motifs of different reference and singleton TFs. The collection of reconstructed regulons is available at the RegPrecise database.


Available from: Alexey Kazakov, Jul 13, 2015
  • Source
    • "Frontiers in Microbiology | Terrestrial Microbiology July 2014 | Volume 5 | Article 382 | 2 2004; Kazakov et al., 2013b "
    ABSTRACT: We surveyed the eight putative cyclic-di-GMP-modulating response regulators (RRs) in Desulfovibrio vulgaris Hildenborough that are predicted to function via two-component signaling. Using purified proteins, we examined cyclic-di-GMP (c-di-GMP) production or turnover in vitro of all eight proteins. The two RRs containing only GGDEF domains (DVU2067, DVU0636) demonstrated c-di-GMP production activity in vitro. Of the remaining proteins, three RRs with HD-GYP domains (DVU0722, DVUA0086, and DVU2933) were confirmed to be Mn(2+)-dependent phosphodiesterases (PDEs) in vitro and converted c-di-GMP to its linear form, pGpG. DVU0408, containing both c-di-GMP production (GGDEF) and degradation domains (EAL), showed c-di-GMP turnover activity in vitro also with production of pGpG. No c-di-GMP related activity could be assigned to the RR DVU0330, containing a metal-dependent phosphohydrolase HD-OD domain, or to the HD-GYP domain RR, DVU1181. Studies included examining the impact of overexpressed cyclic-di-GMP-modulating RRs in the heterologous host E. coli and led to the identification of one RR, DVU0636, with increased cellulose production. Evaluation of a transposon mutant in DVU0636 indicated that the strain was impaired in biofilm formation and demonstrated an altered carbohydrate:protein ratio relative to the D. vulgaris wild type biofilms. However, grown in liquid lactate/sulfate medium, the DVU0636 transposon mutant showed no growth impairment relative to the wild-type strain. Among the eight candidates, only the transposon disruption mutant in the DVU2067 RR presented a growth defect in liquid culture. Our results indicate that, of the two diguanylate cyclases (DGCs) that function as part of two-component signaling, DVU0636 plays an important role in biofilm formation while the function of DVU2067 has pertinence in planktonic growth.
    Frontiers in Microbiology 07/2014; 5:382. DOI:10.3389/fmicb.2014.00382 · 3.94 Impact Factor
  • Source
    • "An important issue in such studies is to connect TFs to the cognate TF binding sites (TFBSs) identified by phylogenetic footprinting and other computational techniques (Conlan et al., 2005; Wels et al., 2006; Liu et al., 2008). This problem is either solved experimentally or addressed computationally, for instance for regulons controlled by local TF from specific protein families (Rigali et al., 2004; Francke et al., 2008; Sahota and Stormo, 2010; Ahn et al., 2012; Kazakov et al., 2013). Phylogenetic profiling of TF genes and motifs upstream of candidate regulon members is an alternative bioinformatics approach for assigning TFs to putative regulons (Rodionov and Gelfand, 2005). "
    ABSTRACT: DNA-binding transcription factors (TFs) are essential components of transcriptional regulatory networks in bacteria. LacI-family TFs (LacI-TFs) are broadly distributed among certain lineages of bacteria. The majority of characterized LacI-TFs sense sugar effectors and regulate carbohydrate utilization genes. Comparative genomics approaches enable in silico identification of TF-binding sites and regulon reconstruction. To study the function and evolution of LacI-TFs, we performed genomics-based reconstruction and comparative analysis of their regulons. For over 1300 LacI-TFs from over 270 bacterial genomes, we predicted their cognate DNA-binding motifs and identified target genes. Using the genome context and metabolic subsystem analyses of reconstructed regulons, we tentatively assigned functional roles and predicted candidate effectors for 78 and 67% of the analyzed LacI-TFs, respectively. Nearly 90% of the studied LacI-TFs are local regulators of sugar utilization pathways, whereas the remaining 125 global regulators control large and diverse sets of metabolic genes. The global LacI-TFs include the previously known regulators CcpA in Firmicutes, FruR in Enterobacteria, and PurR in Gammaproteobacteria, as well as three novel regulators (GluR, GapR, and PckR) that are predicted to control the central carbohydrate metabolism in three lineages of Alphaproteobacteria. Phylogenetic analysis of regulators combined with the reconstructed regulons provides a model of evolutionary diversification of the LacI protein family. The obtained genomic collection of in silico reconstructed LacI-TF regulons in bacteria is available in the RegPrecise database. It provides a framework for future structural and functional classification of the LacI protein family and identification of molecular determinants of the DNA and ligand specificity. The inferred regulons can also be used for functional gene annotation and reconstruction of sugar catabolic networks in diverse bacterial lineages.
    Frontiers in Microbiology 06/2014; 5:294. DOI:10.3389/fmicb.2014.00294 · 3.94 Impact Factor
  • Source
    ABSTRACT: Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. Comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). Description: RegPrecise is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes. RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in bacterial genomes. Analytical capabilities include exploration of: regulon content, structure and function; TF binding site motifs; conservation and variations in genome-wide regulatory networks across all taxonomic groups of Bacteria. RegPrecise 3.0 was selected as a core resource on transcriptional regulation of the Department of Energy Systems Biology Knowledgebase, an emerging software and data environment designed to enable researchers to collaboratively generate, test and share new hypotheses about gene and protein functions, perform large-scale analyses, and model interactions in microbes, plants, and their communities.
    BMC Genomics 11/2013; 14(1):745. DOI:10.1186/1471-2164-14-745 · 4.04 Impact Factor