Figure 4 - uploaded by Andrew Tolonen
TSS in carbon source-specific clusters share DNA sequence motifs. TSS clusters differentially expressed on (a,b) glucose, (c) galacturonic acid and HG, (d) xylan and (e) stover and cellulose are shown along with their associated sequence motifs. Rows are expression of a TSS cluster member and columns are duplicate glucose (Glc), xylose (Xlo), galacturonic acid (GalA), homogalacturonan (HG), xylan (Xya), stover (Sto) and cellulose (Cel) cultures. Colours show TSS expression as log2-transformed read counts scaled to a median of zero for each TSS.
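
The colour scale in the caption (log2-transformed read counts scaled to a median of zero for each TSS) can be illustrated with a minimal Python sketch. The function name, the pseudocount of 1 and the example counts are assumptions made for illustration; the caption does not say how zero counts were handled, and none of these numbers come from the publication.

```python
import math
from statistics import median


def median_centered_log2(counts, pseudocount=1.0):
    """Log2-transform read counts and shift them so their median is zero.

    The pseudocount is an assumption of this sketch (it avoids log2(0));
    the figure caption does not state how zero counts were treated.
    """
    logs = [math.log2(c + pseudocount) for c in counts]
    centre = median(logs)
    return [value - centre for value in logs]


# Hypothetical read counts for one TSS across five cultures (not real data).
example_counts = [3, 7, 15, 31, 63]
scaled = median_centered_log2(example_counts)
```

Centring each TSS on its own median puts weakly and strongly expressed TSS on the same colour scale, so a heatmap row shows relative change across carbon sources rather than absolute abundance.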

Source publication
Article
Full-text available
Bacteria respond to their environment by regulating mRNA synthesis, often by altering the genomic sites at which RNA polymerase initiates transcription. Here, we investigate genome-wide changes in transcription start site (TSS) usage by Clostridium phytofermentans, a model bacterium for fermentation of lignocellulosic biomass. We quantify expressio...

Contexts in source publication

Context 1
... the Pribnow hexamer (TATAAT) has an upstream TG di-nucleotide (Fig. 2i), which enhances transcription in certain other bacteria [19][20][21] by interacting with the RNAP sigma-A subunit 22. In contrast, searching upstream of IntraS TSS identified an AT-rich stretch ~10 bp upstream of the TSS lacking RNAP binding sites (Supplementary Fig. 4A), suggesting IntraS TSS often result from promiscuous initiation at AT-rich sequences. We observed that IntraS TSS comprised more than 50% of TSS (Fig. 2d), albeit with fewer reads per site than InterS TSS. ...
Context 2
... studies have rationalized similarly abundant intragenic TSS as resulting from incomplete TEX degradation 12, but our data support that these TSS bear a 5′-PPP indicative of transcription initiation. IntraS TSS are preferentially found in the 5′ end of genes (Supplementary Fig. 4B), supporting that they are under selective pressure and may have roles including expression of alternative protein isoforms or as mimicry molecules to sequester other RNA and ribonucleases from their mRNA targets 9. ...
Context 3
... associated with TSS clusters. We clustered TSS based on expression across carbon sources and searched sequences surrounding TSS for overrepresented motifs (Supplementary Fig. 7; Supplementary Data 3), revealing TSS clusters that share motifs with potential regulatory functions (Fig. 4). For example, the TSS cluster up-regulated on galacturonic acid and homogalacturonan (HG) (Fig. 4c) has a palindromic motif resembling the cre operator (TGAAAGCGCTTTCA) bound by B. subtilis CcpA 26,27, a LacI/GalR regulator of numerous carbon metabolism genes. LacI/GalR genes often have upstream copies of their operators to ...
Context 4
... We clustered TSS based on expression across carbon sources and searched sequences surrounding TSS for overrepresented motifs (Supplementary Fig. 7; Supplementary Data 3), revealing TSS clusters that share motifs with potential regulatory functions (Fig. 4). For example, the TSS cluster up-regulated on galacturonic acid and homogalacturonan (HG) (Fig. 4c) has a palindromic motif resembling the cre operator (TGAAAGCGCTTTCA) bound by B. subtilis CcpA 26,27, a LacI/GalR regulator of numerous carbon metabolism genes. LacI/GalR genes often have upstream copies of their operators to auto-repress transcription 28, and we found three copies of the galacturonic acid cluster motif in the 5′ ...
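
The excerpt describes the cre operator sequence (TGAAAGCGCTTTCA) as palindromic, which for DNA means the sequence equals its own reverse complement. The following Python sketch checks that property; the helper names are hypothetical and this is not the motif-search method used in the publication.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


# cre operator sequence quoted in the excerpt above.
cre_operator = "TGAAAGCGCTTTCA"
cre_is_palindrome = is_palindromic(cre_operator)

# The Pribnow hexamer, by contrast, is not its own reverse complement.
pribnow_is_palindrome = is_palindromic("TATAAT")
```

Palindromic operators are typical of regulators that bind DNA as dimers, each monomer contacting one half-site, which fits the excerpt's comparison to the CcpA operator.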
Context 5
... C. phytofermentans genome may encode significantly more genes than in the NCBI Genbank annotation. Classifying TSS using the MaGe annotation showed 735 (7%) TSS map to MaGe-specific clops genes of unknown function (Supplementary Data 4), including 64 clops genes with InterS TSS. We examined which of these novel TU encode proteins by mapping C. phytofermentans MS/MS peptide spectra to the genome translated in all frames, identifying peptides outside the predicted proteome in 21 InterS, 13 IntraS, 5 InterA and 25 IntraA regions (Supplementary Data 5). ...

Citations

... For those genes with two TSSs, TSS with a higher value was considered. This result is in agreement with Boutard et al. [29] where most genes were expressed from a single TSS. Identification of transcription start sites enables identification of promoter regions [30]. ...
Article
Full-text available
Background Mycobacterium colombiense is an acid-fast, non-motile, rod-shaped mycobacterium confirmed to cause respiratory disease and disseminated infection in immune-compromised patients, and lymphadenopathy in immune-competent children. It has virulence mechanisms that allow it to adapt, survive, replicate, and produce disease in the host. To tackle the diseases caused by M. colombiense, an understanding of the regulation mechanisms of its genes is important. This paper, therefore, analyzes transcription start sites, promoter regions, motifs, transcription factors, and CpG islands in TetR family transcriptional regulatory (TFTR) genes of M. colombiense CECT 3035 using neural network promoter prediction, MEME, TOMTOM algorithms, and evolutionary analysis with the help of MEGA-X.
Results The analysis of 22 protein coding TFTR genes of M. colombiense CECT 3035 showed that 86.36% and 13.64% of the gene sequences had one and two TSSs, respectively. Using MEME, we identified five motifs (MTF1, MTF2, MTF3, MTF4, and MTF5) and MTF1 was revealed as the common promoter motif for 100% TFTR genes of M. colombiense CECT 3035 which may serve as binding site for transcription factors that shared a minimum homology of 95.45%. MTF1 was compared to the registered prokaryotic motifs and found to match with 15 of them. MTF1 serves as the binding site mainly for AraC, LexA, and Bacterial histone-like protein families. Other protein families such as MATP, RR, σ-70 factor, TetR, LytTR, LuxR, and NAP also appear to be the binding candidates for MTF1. These families are known to have functions in virulence mechanisms, metabolism, quorum sensing, cell division, and antibiotic resistance. Furthermore, it was found that TFTR genes of M. colombiense CECT 3035 have many CpG islands with several fragments in their CpG islands. Molecular evolutionary genetic analysis showed close relationship among the genes.
Conclusion We believe these findings will provide a better understanding of the regulation of TFTR genes in M. colombiense CECT 3035 involved in vital processes such as cell division, pathogenesis, and drug resistance and are likely to provide insights for drug development important to tackle the diseases caused by this mycobacterium. We believe this is the first report of in silico analyses of the transcriptional regulation of M. colombiense TFTR genes.
... Genome-wide determination of prokaryotic TSSs has been greatly facilitated by RNA-seq-derived methods that take advantage of the characteristic presence of a 5′ triphosphate on the initiating nucleotide of unprocessed RNAs (such as mRNAs). In the most recent approaches, these RNAs are selectively targeted by the vaccinia capping enzyme, which adds either a desthiobiotinylated (Cappable-seq) or a biotinylated (Capp-Switch seq) guanosine cap on triphosphorylated RNAs, permitting their capture on streptavidin-coupled beads (2)(3)(4). ...
... Hence, we reasoned that the final threshold set on reads per million (RPM) values of TSSs (i.e., 10 RPM for Capp-Switch sequencing [3]) needed to be dynamically adjusted to consider the local gene expression downstream from each TSS. Each TSS is associated with a single gene based on its localization (intragenic, positioned inside the associated gene; intergenic, positioned outside the associated gene, defined as having the closest gene start or end relative to the TSS). ...
... Libraries were next obtained by PCR Capp-Switch library preparation. The Capp-switch sequencing library preparation protocol was used (3). Briefly, a 5′ biotinylated cap was first added to the 5′-PPP RNAs using vaccinia capping enzyme (New England Biolabs). ...
Article
Full-text available
Solventogenic clostridia have been employed in industry for more than a century, initially being used in the acetone-butanol-ethanol (ABE) fermentation process for acetone and butanol production. Interest in these bacteria has recently increased in the context of green chemistry and sustainable development.
... A previous study using a technical variant of Cappable-seq (Capp-switch) investigated genome-wide patterns of C. phy transcription initiation on various plant substrates and demonstrated condition-dependent transcription regulation modifications. Amongst these changes, interesting regulatory mechanisms such as antisense transcription, leaderless transcription and non-coding RNA were identified (Boutard et al., 2016). ...
Thesis
Full-text available
The development of high-throughput DNA sequencing revolutionized the study of complex bacterial communities called "microbiomes" in diverse environments, from the central oceans to the human intestine. The research aim of this thesis is to develop new sequencing-based technologies and apply them to provide further insights into changes to the composition and activities of microbiomes. Specifically, Chapter One presents RIMS-seq (Rapid Identification of Methylase Specificity), a method to simultaneously obtain the DNA sequence and 5-methylcytosine (m5C) profile of bacterial genomes. Modification by m5C has been described in the genomes of many bacterial species to modulate gene expression and protect from viral infection. Chapter Two introduces ONT-cappable-seq and Loop-Cappable-seq, two new techniques to reveal operon architecture through full-length transcript sequencing, using Nanopore and LoopSeq sequencing, respectively. In Chapter Three, we applied a multiomics approach using some of the tools developed in the previous chapters to study the dynamics of the response of a model human intestinal microbiome after treatment with ciprofloxacin, a widely used broad-spectrum antibiotic. Antibiotics are critical treatments to prevent pathogenic infections, but they also kill commensal species that promote health, enhance the spread of resistant strains, and may degrade the protective effect of microbiota against invasion by pathogens. Therefore, it is crucial to be able to characterize both the composition and the functional response of a microbial community to antibiotic treatment. We examined both the short and long-term transcriptional and genomic responses of the synthetic community and explored how the immediate transcriptomic response correlates and potentially predicts the later changes of the microbiome composition.
The goal is to try to identify a marker appearing a few minutes/hours after the treatment that could be used to potentially predict the outcome of an antibiotic treatment, opening up the path to a more personalized medicine.
... The genes in the RefSeq annotation divided the TSS into the categories InterS (intergenic TSS with downstream gene in same orientation), InterA (intergenic TSS with downstream gene in opposite orientation), IntraS (intragenic TSS in gene with same orientation), or IntraA (intragenic TSS in gene with opposite orientation) according to Boutard et al. [72]. The Bedtools toolset was used to search for the closest gene and determine its distance from the TSS [73]. ...
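
The four TSS categories defined in this excerpt can be expressed as a small Python sketch. The `Gene` record, the coordinate convention (1-based, inclusive) and the tie-breaking rule for a TSS that overlaps genes on both strands are assumptions of this illustration, and the gene coordinates are hypothetical. The cited work used Bedtools to find the closest gene; this plain-Python version only mirrors the category definitions.

```python
from typing import NamedTuple, Optional, Sequence


class Gene(NamedTuple):
    start: int   # 1-based, inclusive (assumed convention)
    end: int
    strand: str  # "+" or "-"


def classify_tss(pos: int, strand: str, genes: Sequence[Gene]) -> Optional[str]:
    """Assign a TSS to InterS, InterA, IntraS or IntraA.

    Returns None when the TSS is intergenic and no gene lies downstream.
    """
    overlapping = [g for g in genes if g.start <= pos <= g.end]
    if overlapping:
        # Assumption: a same-strand overlap takes priority over an antisense one.
        same = any(g.strand == strand for g in overlapping)
        return "IntraS" if same else "IntraA"

    # "Downstream" follows the direction of transcription from the TSS.
    if strand == "+":
        candidates = [g for g in genes if g.start > pos]
        nearest = min(candidates, key=lambda g: g.start, default=None)
    else:
        candidates = [g for g in genes if g.end < pos]
        nearest = max(candidates, key=lambda g: g.end, default=None)

    if nearest is None:
        return None
    return "InterS" if nearest.strand == strand else "InterA"


# Hypothetical annotation: one forward gene and one reverse gene.
genes = [Gene(100, 200, "+"), Gene(300, 400, "-")]
labels = [
    classify_tss(50, "+", genes),
    classify_tss(150, "+", genes),
    classify_tss(150, "-", genes),
    classify_tss(250, "+", genes),
]
```

The linear scan over all genes keeps the sketch short; for a full genome annotation, sorted coordinates with binary search or an interval tool such as Bedtools would be the more practical choice.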
Article
Full-text available
The bacterial pathogen Salmonella enterica, which causes enteritis, has a broad host range and extensive environmental longevity. In water and soil, Salmonella interacts with protozoa and multiplies inside their phagosomes. Although this relationship resembles that between Salmonella and mammalian phagocytes, the interaction mechanisms and bacterial genes involved are unclear. Here, we characterized global gene expression patterns of S. enterica serovar Typhimurium within Acanthamoeba castellanii at the early stage of infection by Cappable-Seq. Gene expression features of S. Typhimurium within A. castellanii were presented with downregulation of glycolysis-related, and upregulation of glyoxylate cycle-related genes. Expression of Salmonella Pathogenicity Island-1 (SPI-1), chemotaxis system, and flagellar apparatus genes was upregulated. Furthermore, expression of genes mediating oxidative stress response and iron uptake was upregulated within A. castellanii as well as within mammalian phagocytes. Hence, global S. Typhimurium gene expression patterns within A. castellanii help better understand the molecular mechanisms of Salmonella adaptation to an amoeba cell and intracellular persistence in protozoa inhabiting water and soil ecosystems.
... These findings suggest that similar evolutionary tuning may enable GC-adaptation of σ70 stringency in concert with evolving GC content. In line with our proposed mechanism, promoters from GC-rich Rhizobium and Streptomyces species (61.5 and 72.1 %GC, respectively) [31,32] have higher levels of degeneracy in their σ70 motifs than low GC organisms like Clostridium fermentans (GC% = 35) [33], consistent with less stringent requirements for initiation of transcription. A study of synthetic promoter sequences in industrial Clostridium species with similarly low GC contents identified a strong preference for AT-rich promoters [23], providing further support for GC-adaptation of promoter stringency. ...
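
The %GC values compared in this excerpt are the fraction of G and C among the called bases of a sequence. A minimal Python sketch follows; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption of this example rather than something stated in the cited work.

```python
def gc_percent(seq):
    """Percentage of G and C among the A, C, G and T bases of a sequence.

    Ambiguous characters (for example N) are ignored, which is an
    assumption of this sketch.
    """
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / called


# The AT-only Pribnow hexamer versus a balanced toy sequence.
pribnow_gc = gc_percent("TATAAT")
balanced_gc = gc_percent("ATGC")
```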
Article
While horizontal gene transfer is prevalent across the biosphere, the regulatory features that enable expression and functionalization of foreign DNA remain poorly understood. Here, we combine high-throughput promoter activity measurements and large-scale genomic analysis of regulatory regions to investigate the cross-compatibility of regulatory elements (REs) in bacteria. Functional characterization of thousands of natural REs in three distinct bacterial species revealed distinct expression patterns according to RE and recipient phylogeny. Host capacity to activate foreign promoters was proportional to their genomic GC content, while many low GC regulatory elements were both broadly active and had more transcription start sites across hosts. The difference in expression capabilities could be explained by the influence of the host GC content on the stringency of the AT-rich canonical σ70 motif necessary for transcription initiation. We further confirm the generalizability of this model and find widespread GC content adaptation of the σ70 motif in a set of 1,545 genomes from all major bacterial phyla. Our analysis identifies a key mechanism by which the strength of the AT-rich σ70 motif relative to a host’s genomic GC content governs the capacity for expression of acquired DNA. These findings shed light on regulatory adaptation in the context of evolving genomic composition.
... Genome-wide determination of prokaryotic TSSs has been greatly facilitated by RNA-seq-derived methods that take advantage of the characteristic presence of a 5′ triphosphate on the initiating nucleotide of unprocessed RNAs (such as mRNAs). In the most recent approaches, these RNAs are selectively targeted by the vaccinia capping enzyme, which adds either a desthiobiotinylated (Cappable-seq) or a biotinylated (Capp-Switch seq) guanosine cap on triphosphorylated RNAs, permitting their capture on streptavidin-coupled beads (2)(3)(4). ...
... Hence, we reasoned that the final threshold set on reads per million (RPM) values of TSSs (i.e., 10 RPM for Capp-Switch sequencing [3]) needed to be dynamically adjusted to consider the local gene expression downstream from each TSS. Each TSS is associated with a single gene based on its localization (intragenic, positioned inside the associated gene; intergenic, positioned outside the associated gene, defined as having the closest gene start or end relative to the TSS). ...
... Libraries were next obtained by PCR Capp-Switch library preparation. The Capp-switch sequencing library preparation protocol was used (3). Briefly, a 5′ biotinylated cap was first added to the 5′-PPP RNAs using vaccinia capping enzyme (New England Biolabs). ...
Article
Full-text available
Innovative processes to transform plant biomass into renewable chemicals are needed to replace fossil fuels and limit climate change. Clostridium acetobutylicum is of industrial interest because it ferments sugars into acetone, butanol and ethanol (ABE). However, this organism is unable to depolymerize cellulose, limiting its use for the direct transformation of lignocellulose. In this study, the freely secreted family 5 cellulase cphy2058 from Clostridium phytofermentans was efficiently expressed in C. acetobutylicum to enhance cellulolysis. RT-qPCR confirmed the transcription of the cellulase gene, and galactomannan and carboxymethyl cellulose plate assays showed endoglucanase and mannanase activity of cphy2058. ABE titers by the recombinant, cellulase-expressing C. acetobutylicum reached 85.83 mM in cellobiose cultures, whereas ABE titers by wild-type were 44.52 mM. ABE titers by the cellulase-expressing strain were similarly elevated relative to wild-type in cellulose and galactomannan cultures. These results demonstrate a novel strategy to enhance complex substrate utilization by heterologous cellulase expression in C. acetobutylicum.
... As such, cphy2976 is among the top 10 most highly expressed genes in the cell on cellobiose and galactan. The cphy2976 gene is cotranscribed with cphy2975 (29) in an operon that opposes all other genes in the prophage region (Fig. S2). Neither gene encodes domains of known function nor has significant similarity to genes in the NCBI database, and so we cannot postulate their roles in the cell. ...
Article
Full-text available
Clostridia are a group of Gram-positive anaerobic bacteria of medical and industrial importance for which limited genetic methods are available. Here, we demonstrate an approach to make large genomic deletions and insertions in the model Clostridium phytofermentans by combining designed group II introns (targetrons) and Cre recombinase. We apply these methods to delete a 50-gene prophage island by programming targetrons to position markerless lox66 and lox71 sites, which mediate deletion of the intervening 39-kb DNA region using Cre recombinase. Gene expression and growth of the deletion strain showed that the prophage genes contribute to fitness on nonpreferred carbon sources. We also inserted an inducible fluorescent reporter gene into a neutral genomic site by recombination-mediated cassette exchange (RMCE) between genomic and plasmid-based tandem lox sites bearing heterospecific spacers to prevent intracassette recombination. These approaches generally enable facile markerless genome engineering in clostridia to study their genome structure and regulation. IMPORTANCE Clostridia are anaerobic bacteria with important roles in intestinal and soil microbiomes. The inability to experimentally modify the genomes of clostridia has limited their study and application in biotechnology. Here, we developed a targetron-recombinase system to efficiently make large targeted genomic deletions and insertions using the model Clostridium phytofermentans. We applied this approach to reveal the importance of a prophage to host fitness and introduce an inducible reporter by recombination-mediated cassette exchange.
... We have developed ReCappable-seq (Figure 1a) to comprehensively identify TSS of all eukaryotic genes transcribed by Pol-I, Pol-II, Pol-III and POLRMT RNA polymerases. We used the Vaccinia Capping Enzyme (VCE) which can add a biotinylated G-cap structure to either a 5' triphosphate or 5' diphosphate RNA and has been used previously to identify TSS and primary transcripts in prokaryotes [9][12][13]. In eukaryotes the 5' ends of transcripts derived from Pol-II are capped and therefore cannot be directly biotinylated with VCE. ...
Preprint
Full-text available
Methodologies for determining eukaryotic Transcription Start Sites (TSS) rely on the selection of the 5 prime canonical cap structure of Pol-II transcripts and are consequently ignoring entire classes of TSS derived from other RNA polymerases which play critical roles in various cell functions. To overcome this limitation, we developed ReCappable-seq and identified TSS from Pol-II and non-Pol-II transcripts at nucleotide resolution. Applied to the human transcriptome, ReCappable-seq identifies Pol-II TSS with higher specificity than CAGE and reveals a rich landscape of TSS associated notably with Pol-III transcripts which have been previously not possible to study on a genome-wide scale. Novel TSS consistent with non-Pol-II transcripts can be found in the nuclear and mitochondrial genomes. By identifying TSS derived from all RNA-polymerases, ReCappable-seq reveals distinct epigenetic marks among Pol-II and non-Pol-II TSS and provides a unique opportunity to concurrently interrogate the regulatory landscape of coding and non-coding RNA.
... The C. phytofermentans genome is predicted to carry 572 transporter genes by TransportDB (6) and 485 transporter genes by KEGG (17), with 438 genes common to both databases (see Table S1 in the supplemental material). Similar to other bacteria, C. phytofermentans genes encoding a transporter are often cotranscribed as an operon to facilitate their expression at similar levels (18). Among the 173 genes annotated by TransportDB as sugar transporters, we observed that distinct and often single ABC transporter operons are transcriptionally upregulated on each carbon source (Fig. 1; see also Table S2 in the supplemental material). ...
... Fermentation of glucose and galactose produces ethanol/acetate ratios of ~3:1 (2), supporting the hypothesis that the need for NADH oxidation outweighs that of ATP production, even though metabolism of a molecule of glucose or galactose to pyruvate yields only 1 ATP (Fig. 7). Assimilation of galacturonic acid to pyruvate yields only 1 ATP (Fig. 7), and concomitantly, the primary fermentation product shifts to acetate (18), which boosts ATP production by 2 ATPs per sugar equivalent. C. phytofermentans can likely produce additional ATP using the F1F0-ATPase (Cphy3735-Cphy3742) to harness the ion gradient generated by the Rnf complex (Cphy0211-Cphy0216), which couples ferredoxin oxidation to NAD+ reduction to pump Na+ or H+ outside the cell (Fig. 7). ...
Article
Full-text available
Plant-fermenting Clostridia are anaerobic bacteria that recycle plant matter in soil and promote human health by fermenting dietary fiber in the intestine. Clostridia degrade plant biomass using extracellular enzymes and then take up the liberated sugars for fermentation. The main sugars in plant biomass are hexoses, and here, we identify how hexoses are taken into the cell by the model organism Clostridium phytofermentans. We show that this bacterium takes up hexoses using a set of highly specific, nonredundant ABC transporters. Once in the cell, the hexoses are phosphorylated by intracellular hexokinases. This study provides insight into the functioning of abundant members of soil and intestinal microbiomes and identifies gene targets to engineer strains for industrial lignocellulosic fermentation.
... Identification of TUs is a new task that has to be resolved to understand transcriptional regulation. Exact mapping of TUs can facilitate identification of new riboswitches, non-coding RNA, antisense transcripts (Sharma et al., 2010;Boutard et al., 2016). The same TU may have multiple TSSs (transcription start sites) and transcription ends. ...
Article
Full-text available
Prokaryotes are actively studied in the context of genomic regulation. Microbiologists need specialized tools for complex data analysis to study and identify regulatory mechanisms in bacteria and archaea. We developed a tool, BAC-BROWSER, specifically for visualization and analysis of small prokaryotic genomes. BAC-BROWSER provides tools for different types of analysis to study a wide set of regulatory mechanisms of prokaryotes:
- transcriptional regulation by transcription factors (TFs): analysis of TFs, their targets, and binding sites
- other regulatory motifs, promoters, terminators and ribosome binding sites
- transcriptional regulation by variation of operon structure, alternative starts or ends of transcription
- non-coding RNAs, antisense RNAs
- RNA secondary structure, riboswitches
- GC content, GC skew, codon usage
BAC-BROWSER incorporates free programs that accelerate verification of the results obtained: primer design and oligocalculator, vector visualization, and a tool for synthetic gene construction. The program is designed for the Windows operating system and is freely available for download at http://smdb.rcpcm.org/tools/index.html.