Figure 1 - uploaded by Andrew Tolonen
Overview of the Capp-Switch sequencing approach. Capp-Switch includes (a-c) capture of 5′ mRNA fragments and (d-f) cDNA synthesis and sequencing. (a) The mRNA 5′ triphosphate is capped with biotin-GTP by the vaccinia capping enzyme (VCE). (b) RNA is fragmented and (c) the capped 5′ mRNA fragments are captured on streptavidin magnetic beads and separated from other RNA. (d) The 5′ mRNA fragments are reverse transcribed to single-stranded cDNA using MMLV reverse transcriptase. An oligonucleotide hybridizes to the 3′ overhang and the complementary sequence is synthesized by the MMLV template-switching activity. (e) Double-stranded cDNA is synthesized using primers that hybridize to the single-stranded cDNA termini. (f) The cDNA is sequenced on a high-throughput platform.


Source publication
Article
Full-text available
Bacteria respond to their environment by regulating mRNA synthesis, often by altering the genomic sites at which RNA polymerase initiates transcription. Here, we investigate genome-wide changes in transcription start site (TSS) usage by Clostridium phytofermentans, a model bacterium for fermentation of lignocellulosic biomass. We quantify expressio...

Contexts in source publication

Context 1
... in other bacteria report many leaderless mRNA without 5′ UTR and ribosome binding sites (RBS) 11. Four per cent of InterS TSS are potentially leaderless in C. phytofermentans, but these genes generally have another upstream TSS and retain a typical RBS similar to highly expressed C. phytofermentans genes (Supplementary Fig. 1). ...
Context 2
... examined which of these novel TU encode proteins by mapping C. phytofermentans MS/MS peptide spectra to the genome translated in all frames, identifying peptides outside the predicted proteome in 21 InterS, 13 IntraS, 5 InterA and 25 IntraA regions (Supplementary Data 5). The combination of TSS and expressed peptides supports ORFs with N-terminal extensions such as cphy0891 (Supplementary Fig. 10A) and the existence of novel ORFs. For example, clops3461, which overlaps with cphy2929 on the opposite strand (Fig. 6d), and an antisense overlapping ORF in cphy1953 encoding the ComEA competence protein (Supplementary Fig. 10B). ...
Context 3
... combination of TSS and expressed peptides supports ORFs with N-terminal extensions such as cphy0891 (Supplementary Fig. 10A) and the existence of novel ORFs. For example, clops3461, which overlaps with cphy2929 on the opposite strand (Fig. 6d), and an antisense overlapping ORF in cphy1953 encoding the ComEA competence protein (Supplementary Fig. 10B). ...
Context 4
... TSS mapping with motif searching could be broadly applied to LacI/GalR regulators and other types of transcription factors. For example, each of the 4 TetR regulators for which we detected TSS also has conserved, TSS-associated palindromes that resemble operator sites (Supplementary Fig. 11). ...

Citations

... Changes in the transcription start site depending on two different electron acceptors have been reported in studies on Geobacter [35]. Also, genome-wide analysis of transcription start sites in Clostridium identified several metabolism-related genes with multiple transcription start sites that change depending on the substrate [36]. Although the importance of having multiple transcription start sites has not been fully elucidated, it is considered an important regulatory mechanism of gene expression because it largely influences transcription efficiency, translation initiation, and protein abundance [37]. ...
Preprint
Full-text available
N₂O is a major greenhouse gas influencing global warming, and agricultural land is the predominant (anthropogenic) source of N₂O emissions. Here, we report the high N₂O-reducing activity of Bradyrhizobium ottawaense, suggesting the potential for efficiently mitigating N₂O emission from agricultural lands. Among the 15 B. ottawaense isolates examined, the N₂O-reducing activities of most (13) strains were approximately 5-fold higher than that of Bradyrhizobium diazoefficiens USDA110T under anaerobic free-living conditions. This robust N₂O-reducing activity of B. ottawaense was confirmed by N₂O reductase (NosZ) protein levels and in the soybean rhizosphere after nodule decomposition. While the NosZ of B. ottawaense and B. diazoefficiens showed high homology, nosZ gene expression in B. ottawaense was over 150-fold higher than that in B. diazoefficiens USDA110T, suggesting the high N₂O-reducing activity of B. ottawaense is achieved by high nos expression. Furthermore, we examined the nos operon transcription start sites and found that, unlike B. diazoefficiens, B. ottawaense has two transcription start sites under N₂O-respiring conditions, which may contribute to the high nosZ expression. Our study proposes the potential of B. ottawaense for effective N₂O reduction and unique regulation of nos gene expression that contributes to the high performance of N₂O mitigation in the soil.
... database to construct the training and independent test datasets. These 13 bacterial species include B. amyloliquefaciens [35], C. jejuni [36,37], C. phytofermentans [38], C. pneumoniae [39], E. coli [40,41], H. pylori [36,42], L. interrogans [43], M. smegmatis [44], R. capsulatus [45], S. coelicolor [46], S. oneidensis [47], S. pyogenes [48] and S. Typhimurium [49]. The experimentally verified annotations of bacterial transcription start sites (TSSs) were taken from the corresponding references. ...
Article
Background: Promoters are DNA regions that initiate the transcription of specific genes near the transcription start sites. In bacteria, promoters are recognized by RNA polymerases and associated sigma factors. Effective promoter recognition is essential for synthesizing the gene-encoded products by bacteria to grow and adapt to different environmental conditions. A variety of machine learning-based predictors for bacterial promoters have been developed; however, most of them were designed specifically for a particular species. To date, only a few predictors are available for identifying general bacterial promoters with limited predictive performance. Results: In this study, we developed TIMER, a Siamese neural network-based approach for identifying both general and species-specific bacterial promoters. Specifically, TIMER uses DNA sequences as the input and employs three Siamese neural networks with the attention layers to train and optimize the models for a total of 13 species-specific and general bacterial promoters. Extensive 10-fold cross-validation and independent tests demonstrated that TIMER achieves a competitive performance and outperforms several existing methods on both general and species-specific promoter prediction. As an implementation of the proposed method, the web server of TIMER is publicly accessible at http://web.unimelb-bioinfortools.cloud.edu.au/TIMER/.
... For those genes with two TSSs, the TSS with the higher value was considered. This result is in agreement with Boutard et al. [29] where most genes were expressed from a single TSS. Identification of transcription start sites enables identification of promoter regions [30]. ...
Article
Full-text available
Background Mycobacterium colombiense is an acid-fast, non-motile, rod-shaped mycobacterium confirmed to cause respiratory disease and disseminated infection in immune-compromised patients, and lymphadenopathy in immune-competent children. It has virulence mechanisms that allow it to adapt, survive, replicate, and produce diseases in the host. To tackle the diseases caused by M. colombiense, understanding of the regulation mechanisms of its genes is important. This paper, therefore, analyzes transcription start sites, promoter regions, motifs, transcription factors, and CpG islands in TetR family transcriptional regulatory (TFTR) genes of M. colombiense CECT 3035 using neural network promoter prediction, MEME, TOMTOM algorithms, and evolutionary analysis with the help of MEGA-X. Results The analysis of 22 protein coding TFTR genes of M. colombiense CECT 3035 showed that 86.36% and 13.64% of the gene sequences had one and two TSSs, respectively. Using MEME, we identified five motifs (MTF1, MTF2, MTF3, MTF4, and MTF5) and MTF1 was revealed as the common promoter motif for 100% TFTR genes of M. colombiense CECT 3035 which may serve as binding site for transcription factors that shared a minimum homology of 95.45%. MTF1 was compared to the registered prokaryotic motifs and found to match with 15 of them. MTF1 serves as the binding site mainly for AraC, LexA, and Bacterial histone-like protein families. Other protein families such as MATP, RR, σ-70 factor, TetR, LytTR, LuxR, and NAP also appear to be the binding candidates for MTF1. These families are known to have functions in virulence mechanisms, metabolism, quorum sensing, cell division, and antibiotic resistance. Furthermore, it was found that TFTR genes of M. colombiense CECT 3035 have many CpG islands with several fragments in their CpG islands. Molecular evolutionary genetic analysis showed close relationship among the genes.
Conclusion We believe these findings will provide a better understanding of the regulation of TFTR genes in M. colombiense CECT 3035 involved in vital processes such as cell division, pathogenesis, and drug resistance and are likely to provide insights for drug development important to tackle the diseases caused by this mycobacterium. We believe this is the first report of in silico analyses of the transcriptional regulation of M. colombiense TFTR genes.
... While environmental conditions change, several genes are differentially expressed in bacteria. TSSs of these genes may be changed accordingly (Mendoza-Vargas et al. 2009; Boutard et al. 2016; Vera et al. 2020). For the primary and secondary TSSs corresponding to 530 genes, we investigated the distribution of the TSSs number per gene (Fig. 7A). ...
... There were even up to ten TSSs for a single gene. The distribution was similar to that observed in E. coli and Clostridium phytofermentans (Mendoza-Vargas et al. 2009; Boutard et al. 2016). Some of these genes have been reported to be essential for the growth and virulence of Brucella species, such as quorum sensing-dependent transcriptional regulator VjbR, two-component system genes, and lipopolysaccharide-related genes (Guzman-Verri et al. 2002; Kleinman et al. 2017; Smith 2018). ...
... Bacteria could regulate the expression of genes in response to different environments by changing the position of transcription initiation (Mendoza-Vargas et al. 2009; Boutard et al. 2016; Vera et al. 2020). We found that 57% of genes are expressed from more than one TSS, some of which are essential for Brucella growth and virulence (Fig. 7A and B). ...
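The per-gene TSS counts described in the snippets above can be illustrated with a minimal Python sketch. It assumes each TSS has already been assigned to one gene; the function names and the toy data are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def tss_count_distribution(tss_to_gene):
    """Map 'number of TSSs per gene' to 'number of genes with that count'.

    tss_to_gene: iterable of (tss_position, gene_id) pairs (assumed input format).
    """
    per_gene = Counter(gene for _pos, gene in tss_to_gene)
    return dict(Counter(per_gene.values()))


def fraction_multi_tss(tss_to_gene):
    """Fraction of genes that are associated with more than one TSS."""
    per_gene = Counter(gene for _pos, gene in tss_to_gene)
    if not per_gene:
        return 0.0
    multi = sum(1 for n in per_gene.values() if n > 1)
    return multi / len(per_gene)


# Hypothetical toy data: geneA has 2 TSSs, geneB has 1, geneC has 3.
example = [
    (100, "geneA"), (150, "geneA"),
    (900, "geneB"),
    (2000, "geneC"), (2050, "geneC"), (2100, "geneC"),
]
distribution = tss_count_distribution(example)
multi_fraction = fraction_multi_tss(example)
```

With real data, a value such as the reported 57% would correspond to `fraction_multi_tss` returning about 0.57.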
Article
Brucella melitensis (B. melitensis) is an important facultative intracellular bacterium that causes global zoonotic diseases. Continuous intracellular survival and replication are the main obstacles to the prevention and treatment of brucellosis. Bacteria respond to complex environments by regulating gene expression. Many regulatory factors function at loci where RNA polymerase initiates messenger RNA synthesis. However, limited gene annotation is a current obstacle for the research on expression regulation in bacteria. To improve annotation and explore potential functional sites, we proposed a novel genome-wide method called Capping-seq for transcription start site (TSS) mapping in B. melitensis. This technique combines capture of capped primary transcripts with Single Molecule Real-Time (SMRT) sequencing technology. We identified 2,369 TSSs at single nucleotide resolution by Capping-seq. TSSs analysis of Brucella transcripts showed a preference of purine on the TSS positions. Our results revealed that -35 and -10 elements of promoter contained consensus sequences of TTGNNN and TATNNN, respectively. The 5' ends analysis showed that 57% of genes are associated with more than one TSS and 47% of genes contain long leader regions, suggesting potential complex regulation at the 5' ends of genes in B. melitensis. Moreover, we identified 52 leaderless genes that are mainly involved in the metabolic processes. Overall, Capping-seq technology provides a unique solution for TSS determination in prokaryotes. Our findings develop a systematic insight into the primary transcriptome characterization of B. melitensis. This study represents a critical basis for investigating gene regulation and pathogenesis of Brucella.
... Genome-wide determination of prokaryotic TSSs has been greatly facilitated by RNA-seq-derived methods that take advantage of the characteristic presence of a 5′ triphosphate on the initiating nucleotide of unprocessed RNAs (such as mRNAs). In the most recent approaches, these RNAs are selectively targeted by the vaccinia capping enzyme, which adds either a desthiobiotinylated (Cappable-seq) or a biotinylated (Capp-Switch seq) guanosine cap on triphosphorylated RNAs, permitting their capture on streptavidin-coupled beads (2)(3)(4). ...
... Hence, we reasoned that the final threshold set on reads per million (RPM) values of TSSs (i.e., 10 RPM for Capp-Switch sequencing [3]) needed to be dynamically adjusted to consider the local gene expression downstream from each TSS. Each TSS is associated with a single gene based on its localization (intragenic, positioned inside the associated gene; intergenic, positioned outside the associated gene, defined as having the closest gene start or end relative to the TSS). ...
... Libraries were next obtained by PCR Capp-Switch library preparation. The Capp-switch sequencing library preparation protocol was used (3). Briefly, a 5′ biotinylated cap was first added to the 5′-PPP RNAs using vaccinia capping enzyme (New England Biolabs). ...
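The fixed 10 RPM threshold mentioned in the snippets above can be sketched in Python as follows. This is only an illustration of reads-per-million normalization followed by a cutoff; the function names, the input format, and the use of an inclusive (`>=`) comparison are assumptions, and the dynamic, expression-dependent adjustment that the citing authors describe is not reproduced here.

```python
def reads_per_million(tss_read_counts, total_mapped_reads):
    """Convert raw read counts per TSS position into reads per million (RPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return {pos: count * scale for pos, count in tss_read_counts.items()}


def filter_tss_by_rpm(tss_read_counts, total_mapped_reads, min_rpm=10.0):
    """Keep TSS positions whose RPM reaches min_rpm (inclusive cutoff is an assumption)."""
    rpm = reads_per_million(tss_read_counts, total_mapped_reads)
    return {pos: value for pos, value in rpm.items() if value >= min_rpm}


# Hypothetical toy data: 2 million mapped reads, three candidate TSS positions.
counts = {1200: 50, 3400: 5, 8800: 20}
kept = filter_tss_by_rpm(counts, total_mapped_reads=2_000_000)
```

In this toy example the positions with 50 and 20 reads correspond to 25 and 10 RPM and are kept, while the position with 5 reads (2.5 RPM) is dropped.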
Article
Full-text available
Solventogenic clostridia have been employed in industry for more than a century, initially being used in the acetone-butanol-ethanol (ABE) fermentation process for acetone and butanol production. Interest in these bacteria has recently increased in the context of green chemistry and sustainable development.
... A previous study using a technical variant of Cappable-seq (Capp-switch) investigated genome-wide patterns of C. phy transcription initiation on various plant substrates and demonstrated condition-dependent transcription regulation modifications. Amongst these changes, interesting regulatory mechanisms such as antisense transcription, leaderless transcription and non-coding RNA were identified (Boutard et al., 2016). ...
Thesis
Full-text available
The development of high-throughput DNA sequencing revolutionized the study of complex bacterial communities called "microbiomes", in diverse environments, from the central oceans to the human intestine. The research aim of this thesis is to develop new sequencing-based technologies and apply them to provide further insights into changes to the composition and activities of microbiomes. Specifically, Chapter One presents RIMS-seq (Rapid Identification of Methylase Specificity), a method to simultaneously obtain the DNA sequence and 5-methylcytosine (m5C) profile of bacterial genomes. Modification by m5C has been described in the genomes of many bacterial species to modulate gene expression and protect from viral infection. Chapter Two introduces ONT-cappable-seq and Loop-Cappable-seq, two new techniques to reveal operon architecture through full-length transcript sequencing, using Nanopore and LoopSeq sequencing, respectively. In Chapter Three, we applied a multiomics approach using some of the tools developed in the previous chapters to study the dynamics of the response of a model human intestinal microbiome after treatment with ciprofloxacin, a widely used broad-spectrum antibiotic. Antibiotics are critical treatments to prevent pathogenic infections, but they also kill commensal species that promote health, enhance the spread of resistant strains, and may degrade the protective effect of microbiota against invasion by pathogens. Therefore, it is crucial to be able to characterize both the composition and the functional response of a microbial community to antibiotic treatment. We examined both the short and long-term transcriptional and genomic responses of the synthetic community and explored how the immediate transcriptomic response correlates and potentially predicts the later changes of the microbiome composition.
The goal is to try to identify a marker appearing a few minutes/hours after the treatment that could be used to potentially predict the outcome of an antibiotic treatment, opening up the path to a more personalized medicine.
... The genes in the RefSeq annotation divided the TSS into the categories InterS (intergenic TSS with downstream gene in same orientation), InterA (intergenic TSS with downstream gene in opposite orientation), IntraS (intragenic TSS in gene with same orientation), or IntraA (intragenic TSS in gene with opposite orientation) according to Boutard et al. [72]. The Bedtools toolset was used to search for the closest gene and determine its distance from the TSS [73]. ...
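The four TSS categories defined in the snippet above (InterS, InterA, IntraS, IntraA) can be illustrated with a small Python sketch. It is a simplified interpretation, not the cited pipeline: the `Gene` type and `classify_tss` function are hypothetical, "downstream" is assumed to mean the nearest gene in the direction of transcription from the TSS, and when several genes overlap a TSS a same-strand gene is assumed to take priority.

```python
from typing import NamedTuple, Optional, Sequence


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate, inclusive
    end: int     # rightmost coordinate, inclusive
    strand: str  # "+" or "-"


def classify_tss(pos: int, strand: str, genes: Sequence[Gene]) -> Optional[str]:
    """Return 'IntraS', 'IntraA', 'InterS', 'InterA', or None if no downstream gene exists."""
    # Intragenic: the TSS lies inside at least one annotated gene.
    overlapping = [g for g in genes if g.start <= pos <= g.end]
    if overlapping:
        same = any(g.strand == strand for g in overlapping)
        return "IntraS" if same else "IntraA"

    # Intergenic: find the nearest gene downstream in the direction of transcription.
    if strand == "+":
        candidates = [g for g in genes if g.start > pos]
        nearest = min(candidates, key=lambda g: g.start - pos, default=None)
    else:
        candidates = [g for g in genes if g.end < pos]
        nearest = min(candidates, key=lambda g: pos - g.end, default=None)

    if nearest is None:
        return None
    return "InterS" if nearest.strand == strand else "InterA"


# Hypothetical two-gene annotation used only for illustration.
annotation = [Gene("geneA", 100, 500, "+"), Gene("geneB", 800, 1200, "-")]
labels = [
    classify_tss(50, "+", annotation),    # upstream of geneA, same strand
    classify_tss(300, "-", annotation),   # inside geneA, opposite strand
    classify_tss(600, "+", annotation),   # next gene on the + side is geneB (-)
    classify_tss(1300, "-", annotation),  # next gene on the - side is geneB (-)
]
```

In practice the closest-gene lookup would be done with a tool such as Bedtools, as the snippet notes; the linear scan here is only meant to make the category logic explicit.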
Article
Full-text available
The bacterial pathogen Salmonella enterica, which causes enteritis, has a broad host range and extensive environmental longevity. In water and soil, Salmonella interacts with protozoa and multiplies inside their phagosomes. Although this relationship resembles that between Salmonella and mammalian phagocytes, the interaction mechanisms and bacterial genes involved are unclear. Here, we characterized global gene expression patterns of S. enterica serovar Typhimurium within Acanthamoeba castellanii at the early stage of infection by Cappable-Seq. Gene expression features of S. Typhimurium within A. castellanii were presented with downregulation of glycolysis-related, and upregulation of glyoxylate cycle-related genes. Expression of Salmonella Pathogenicity Island-1 (SPI-1), chemotaxis system, and flagellar apparatus genes was upregulated. Furthermore, expression of genes mediating oxidative stress response and iron uptake was upregulated within A. castellanii as well as within mammalian phagocytes. Hence, global S. Typhimurium gene expression patterns within A. castellanii help better understand the molecular mechanisms of Salmonella adaptation to an amoeba cell and intracellular persistence in protozoa inhabiting water and soil ecosystems.
... These findings suggest that similar evolutionary tuning may enable GC-adaptation of σ70 stringency in concert with evolving GC content. In line with our proposed mechanism, promoters from GC-rich Rhizobium and Streptomyces species (61.5 and 72.1 %GC, respectively) [31,32] have higher levels of degeneracy in their σ70 motifs than low GC organisms like Clostridium fermentans (GC% = 35) [33], consistent with less stringent requirements for initiation of transcription. A study of synthetic promoter sequences in industrial Clostridium species with similarly low GC contents identified a strong preference for AT-rich promoters [23], providing further support for GC-adaptation of promoter stringency. ...
Article
While horizontal gene transfer is prevalent across the biosphere, the regulatory features that enable expression and functionalization of foreign DNA remain poorly understood. Here, we combine high-throughput promoter activity measurements and large-scale genomic analysis of regulatory regions to investigate the cross-compatibility of regulatory elements (REs) in bacteria. Functional characterization of thousands of natural REs in three distinct bacterial species revealed distinct expression patterns according to RE and recipient phylogeny. Host capacity to activate foreign promoters was proportional to their genomic GC content, while many low GC regulatory elements were both broadly active and had more transcription start sites across hosts. The difference in expression capabilities could be explained by the influence of the host GC content on the stringency of the AT-rich canonical σ70 motif necessary for transcription initiation. We further confirm the generalizability of this model and find widespread GC content adaptation of the σ70 motif in a set of 1,545 genomes from all major bacterial phyla. Our analysis identifies a key mechanism by which the strength of the AT-rich σ70 motif relative to a host’s genomic GC content governs the capacity for expression of acquired DNA. These findings shed light on regulatory adaptation in the context of evolving genomic composition.
... Genome-wide determination of prokaryotic TSSs has been greatly facilitated by RNA-seq-derived methods that take advantage of the characteristic presence of a 5′ triphosphate on the initiating nucleotide of unprocessed RNAs (such as mRNAs). In the most recent approaches, these RNAs are selectively targeted by the vaccinia capping enzyme, which adds either a desthiobiotinylated (Cappable-seq) or a biotinylated (Capp-Switch seq) guanosine cap on triphosphorylated RNAs, permitting their capture on streptavidin-coupled beads (2)(3)(4). ...
... Hence, we reasoned that the final threshold set on reads per million (RPM) values of TSSs (i.e., 10 RPM for Capp-Switch sequencing [3]) needed to be dynamically adjusted to consider the local gene expression downstream from each TSS. Each TSS is associated with a single gene based on its localization (intragenic, positioned inside the associated gene; intergenic, positioned outside the associated gene, defined as having the closest gene start or end relative to the TSS). ...
... Libraries were next obtained by PCR Capp-Switch library preparation. The Capp-switch sequencing library preparation protocol was used (3). Briefly, a 5′ biotinylated cap was first added to the 5′-PPP RNAs using vaccinia capping enzyme (New England Biolabs). ...
Article
Full-text available
Innovative processes to transform plant biomass into renewable chemicals are needed to replace fossil fuels and limit climate change. Clostridium acetobutylicum is of industrial interest because it ferments sugars into acetone, butanol and ethanol (ABE). However, this organism is unable to depolymerize cellulose, limiting its use for the direct transformation of lignocellulose. In this study, the freely secreted family 5 cellulase cphy2058 from Clostridium phytofermentans was efficiently expressed in C. acetobutylicum to enhance cellulolysis. RT-qPCR confirmed the transcription of the cellulase gene, and galactomannan and carboxymethyl cellulose plate assays showed endoglucanase and mannanase activity of cphy2058. ABE titers by the recombinant, cellulase-expressing C. acetobutylicum reached 85.83 mM in cellobiose cultures, whereas ABE titers by wild-type were 44.52 mM. ABE titers by the cellulase-expressing strain were similarly elevated relative to wild-type in cellulose and galactomannan cultures. These results demonstrate a novel strategy to enhance complex substrate utilization by heterologous cellulase expression in C. acetobutylicum.
... As such, cphy2976 is among the top 10 most highly expressed genes in the cell on cellobiose and galactan. The cphy2976 gene is cotranscribed with cphy2975 (29) in an operon that opposes all other genes in the prophage region (Fig. S2). Neither gene encodes domains of known function nor has significant similarity to genes in the NCBI database, and so we cannot postulate their roles in the cell. ...
Article
Full-text available
Clostridia are a group of Gram-positive anaerobic bacteria of medical and industrial importance for which limited genetic methods are available. Here, we demonstrate an approach to make large genomic deletions and insertions in the model Clostridium phytofermentans by combining designed group II introns (targetrons) and Cre recombinase. We apply these methods to delete a 50-gene prophage island by programming targetrons to position markerless lox66 and lox71 sites, which mediate deletion of the intervening 39-kb DNA region using Cre recombinase. Gene expression and growth of the deletion strain showed that the prophage genes contribute to fitness on nonpreferred carbon sources. We also inserted an inducible fluorescent reporter gene into a neutral genomic site by recombination-mediated cassette exchange (RMCE) between genomic and plasmid-based tandem lox sites bearing heterospecific spacers to prevent intracassette recombination. These approaches generally enable facile markerless genome engineering in clostridia to study their genome structure and regulation. IMPORTANCE Clostridia are anaerobic bacteria with important roles in intestinal and soil microbiomes. The inability to experimentally modify the genomes of clostridia has limited their study and application in biotechnology. Here, we developed a targetron-recombinase system to efficiently make large targeted genomic deletions and insertions using the model Clostridium phytofermentans . We applied this approach to reveal the importance of a prophage to host fitness and introduce an inducible reporter by recombination-mediated cassette exchange.