Figure 5 - uploaded by Andrew Tolonen
(A) Bacterial abundance in the ATCC gut microbiome calculated from bisulfite-seq data (left) and RIMS-seq (right) normalized to DNA-seq(+3H). The normalized abundance is plotted relative to the GC content of each bacterium. (B) Methylation levels in Acinetobacter johnsonii (ATCC skin microbiome). The methylation level was calculated for cytosine positions in the context of ACGT (yellow) and randomly selected positions in other contexts (blue). These bisulfite-seq data suggest some sites are methylated in the context of ACGT, but they are not fully methylated. (C) Methylation level in Streptococcus mitis (ATCC skin microbiome) calculated from bisulfite-seq data. The methylation level was calculated for cytosine positions in the context of ACGT and GCNGC (yellow) as well as for randomly selected positions in other contexts (blue). (D) Methylation level in Helicobacter pylori (ATCC gut microbiome) calculated from bisulfite-seq data. The methylation level was calculated for cytosine positions in the context of GCGC and CCTC (yellow) as well as for randomly selected positions in other contexts (blue).
Source publication
DNA methylation is widespread amongst eukaryotes and prokaryotes to modulate gene expression and confer viral resistance. 5-Methylcytosine (m5C) methylation has been described in genomes of a large fraction of bacterial species as part of restriction-modification systems, each composed of a methyltransferase and cognate restriction enzyme. Methylas...
Contexts in source publication
Context 1
... The phylogenetic tree and abundance data obtained from DNA-seq, RIMS-seq and bisulfite-seq were visualized using iTOL (34) for each synthetic community (see Supplementary Figure S5). ...
Context 2
... therefore compared RIMS-seq sequencing performance with DNA-seq(+3H) and bisulfite sequencing. We found that bisulfite sequencing elevates abundances of AT-rich species such as Clostridioides difficile (71% AT), Enterococcus faecalis (63% AT) and Fusobacterium nucleatum (73% AT) (Figure 5A, Supplementary Figure S5). For example, bisulfite sequencing over-estimated the presence of Clostridioides difficile by a factor of 2.65 and Staphylococcus epidermidis by a factor of 3.9 relative to DNA-seq. ...
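The fold over-estimation quoted above can be read as a ratio of relative abundances between two library types. The following is a minimal sketch under that assumption; the function names and read counts are hypothetical and not taken from the paper, and the sketch ignores any genome-size correction the authors may have applied.

```python
def relative_abundance(counts):
    """Convert per-species mapped-read counts into fractions of the total."""
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()}


def fold_vs_reference(method_counts, reference_counts):
    """Ratio of relative abundance in one method to that in a reference method."""
    method = relative_abundance(method_counts)
    reference = relative_abundance(reference_counts)
    return {
        species: method[species] / reference[species]
        for species in method
        if species in reference and reference[species] > 0
    }


# Hypothetical counts, chosen only to illustrate the calculation.
dna_seq = {"AT_rich_species": 100, "GC_rich_species": 900}
bisulfite = {"AT_rich_species": 250, "GC_rich_species": 750}

fold = fold_vs_reference(bisulfite, dna_seq)
for species, value in fold.items():
    print(f"{species}: {value:.2f}x relative to DNA-seq")
```

A value above 1 for an AT-rich species would correspond to the kind of over-estimation described for bisulfite sequencing.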
Context 3
... order to confirm the RIMS-seq motif, we investigated the bisulfite-seq data and compared the methylation level in cytosines present in the CCTC context versus cytosines in any other context. We see a methylation level above 90% at the cytosines in the CCTC context confirming the existence of this methylated motif in Helicobacter pylori (Figure 5D). Interestingly, m4C methylation in Helicobacter pylori has been shown to also occur at TCTTC (26), resulting in the composite motif CYTC (TCTTC and NCCTC) found in the bisulfite data. ...
Context 4
... interestingly, bisulfite-seq results indicate that the ACGT motif in Acinetobacter johnsonii and Streptococcus mitis from the ATCC synthetic skin microbiome are not fully methylated (Figure 5B). Most of the sites in Acinetobacter johnsonii show a methylation of about 10% while in Streptococcus mitis, the average methylation per site is 23% (Figure 5C). ...
Context 5
... interestingly, bisulfite-seq results indicate that the ACGT motif in Acinetobacter johnsonii and Streptococcus mitis from the ATCC synthetic skin microbiome are not fully methylated (Figure 5B). Most of the sites in Acinetobacter johnsonii show a methylation of about 10% while in Streptococcus mitis, the average methylation per site is 23% (Figure 5C). These results highlight that despite the low methylation levels, RIMS-seq is able to detect the ACGT motif at high significance (P-value < 1e-100). ...
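The per-site methylation levels compared above (motif context versus other contexts) can be sketched as follows. This is an illustrative calculation, not the authors' pipeline: it assumes the usual bisulfite reading in which unconverted C reads count as methylated and converted (T) reads as unmethylated, it scans only the forward strand, and the genome string and read counts are invented.

```python
import re


def motif_cytosines(genome, motif="ACGT", c_offset=1):
    """Forward-strand positions of the cytosine inside each motif occurrence."""
    # A lookahead lets overlapping occurrences be reported as well.
    return [m.start() + c_offset for m in re.finditer(f"(?={motif})", genome)]


def methylation_level(unconverted, converted):
    """Fraction of reads at a cytosine that still read as C after conversion."""
    depth = unconverted + converted
    return unconverted / depth if depth else None


def mean_level(positions, calls):
    """Average methylation over the positions that have read counts."""
    levels = [methylation_level(*calls[p]) for p in positions if p in calls]
    levels = [x for x in levels if x is not None]
    return sum(levels) / len(levels) if levels else None


# Hypothetical toy genome and (unconverted, converted) counts per cytosine.
genome = "TTACGTAACCGGACGTTT"
calls = {3: (2, 8), 13: (1, 9), 8: (0, 10), 9: (0, 10)}

in_motif = motif_cytosines(genome)
other = [p for p in calls if p not in in_motif]

print("ACGT context:", mean_level(in_motif, calls))
print("other contexts:", mean_level(other, calls))
```

Partial methylation, as reported for ACGT, would appear as a motif average well below 1 but above the background of the other contexts.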
Context 6
... results highlight that despite the low methylation levels, RIMS-seq is able to detect the ACGT motif at high significance (P-value < 1e-100). We took advantage of the fact that Streptococcus mitis has two methylated motifs, ACGT and GCNGC with an average methylation per site at 23% and 91% respectively (Figure 5C) to evaluate the sequencing depth required for RIMS-seq to de novo identify both motifs. As expected, the fully methylated GCNGC motif is found using 4 times fewer sequencing reads than the ACGT motif, with a required 1 million and 4 million mapped reads respectively (Supplementary Figure S6A and B). ...
Citations
... Recently, RIMS-seq (Baum et al. 2021) has been developed to sequence non-model organisms, such as bacterial genomes, and simultaneously determine m5C methyltransferase specificities without requiring a reference genome. However, RIMS-seq is limited in its application to native microbiomes because short-read assemblies do not result in long enough contigs to assemble complete genomes from complex metagenomes. ...
... Earlier versions of RIMS-seq and RIMS-seq2 were employed on mock microbiomes and human genomic DNA, respectively (Baum et al. 2021; Yan, Wang, and Ettwiller 2024). We first tested RIMS-seq2 on the mock gut microbiome (ATCC® MSA-1006™) spiked with XP12 bacteriophage genomic DNA, where all cytosines have been replaced by m5C. RIMS-seq2 has a higher deamination rate compared to RIMS-seq and would require significantly less coverage per genome. ...
... Sequencing reads were downsampled to 1 million paired-end reads and were mapped to the mock community reference genomes. We calculated the imbalance of C-to-T transition between paired-end read 1 and 2, established previously to be linearly correlated with methylation (Baum et al. 2021; Yan, Wang, and Ettwiller 2024). Using XP12 as our deamination control, we achieved around 1.42% deamination rate on m5C which is consistent with the previous report of RIMS-seq2 (Yan, Wang, and Ettwiller 2024) (Supplementary Figure 1A). ...
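One way to picture the read 1 versus read 2 imbalance described above is as a difference of C-to-T rates, with read 2 serving as the background. The sketch below rests on that assumption and on a linear scaling against a fully methylated control; the function names and all counts are hypothetical, and the published pipelines may use a different statistic.

```python
def c_to_t_rate(c_to_t, c_total):
    """Fraction of reference cytosines read as T."""
    return c_to_t / c_total if c_total else 0.0


def read_imbalance(r1_c_to_t, r1_c_total, r2_c_to_t, r2_c_total):
    """Excess C-to-T rate in read 1 over read 2 (assumed difference form)."""
    return c_to_t_rate(r1_c_to_t, r1_c_total) - c_to_t_rate(r2_c_to_t, r2_c_total)


def estimated_methylation(motif_imbalance, control_imbalance):
    """Scale a motif's imbalance by that of a fully methylated control.

    Assumes the imbalance is proportional to methylation, with the control
    (for example a fully m5C-modified phage genome) representing 100%.
    """
    return motif_imbalance / control_imbalance


# Hypothetical counts; the control values are chosen only to land on the
# same order of magnitude as the deamination rate quoted in the text.
control = read_imbalance(1500, 100_000, 80, 100_000)
motif = read_imbalance(435, 100_000, 80, 100_000)

print(f"control imbalance: {control:.4f}")
print(f"estimated motif methylation: {estimated_methylation(motif, control):.2f}")
```

Because the signal per site is small, an estimate of this kind is only meaningful when aggregated over many cytosines in the same context.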
Methylation patterns in bacteria can be used to study Restriction-Modification (RM) or other defense systems with novel properties. While m4C and m6A methylation is well characterized mainly through PacBio sequencing, the landscape of m5C methylation is under-characterized. To bridge this gap, we performed RIMS-seq2 on microbiomes composed of resolved assemblies of distinct genomes through proximity ligation. This high-throughput approach enables the identification of m5C methylated motifs and links them to cognate methyltransferases directly on native microbiomes without the need to isolate bacterial strains. Methylation patterns can also be identified on viral DNA and compared to host DNA, strengthening evidence for virus-host interaction. Applied to three different microbiomes, the method unveils over 1900 motifs that were deposited in REBASE. The motifs include a novel 8-base recognition site (CATm5CGATG) that was experimentally validated by characterizing its cognate methyltransferase. Our findings suggest that microbiomes harbor arrays of untapped m5C methyltransferase specificities, providing insights into bacterial biology and biotechnological applications.
... We have used EM-Seq methodology and addressed some open questions in the highly methylation-prone bacterium H. pylori, adapting the method for single-base quantitative resolution of methylation to bacterial whole genomes. Before, high-throughput Rapid Identification of Methylase Specificity (RIMS)-Seq sequencing, which is similar but not identical to EM-Seq, was employed to obtain some single-base-resolution quantitative information on human and bacterial genomes, including H. pylori [57]; however, the latter method was not able to detect m4C methylation. We succeeded in obtaining high-quality whole-genome cytosine methylomes from two different H. pylori strains. ...
Background
Bacterial epigenetics is a rapidly expanding research field. DNA methylation by diverse bacterial methyltransferases (MTases) contributes to genomic integrity and replication, and many recent studies extended MTase function also to global transcript regulation and phenotypic variation. Helicobacter pylori is currently one of those bacterial species which possess the highest number and the most variably expressed set of DNA MTases. Next-generation sequencing technologies can directly detect DNA base methylation. However, they still have limitations in their quantitative and qualitative performance, in particular for cytosine methylation.
Results
As a complementing approach, we used enzymatic methyl sequencing (EM-Seq), a technology recently established that has not yet been fully evaluated for bacteria. Thereby, we assessed quantitatively, at single-base resolution, whole genome cytosine methylation for all methylated cytosine motifs in two different H. pylori strains and isogenic MTase mutants. EM-Seq reliably detected both m5C and m4C methylation. We demonstrated that three different active cytosine MTases in H. pylori provide considerably different levels of average genome-wide single-base methylation, in contrast to isogenic mutants which completely lost specific motif methylation. We found that strain identity and changed environmental conditions, such as growth phase and interference with methyl donor homeostasis, significantly influenced quantitative global and local genome-wide methylation in H. pylori at specific motifs. We also identified significantly hyper- or hypo-methylated cytosines, partially linked to overlapping MTase target motifs. Notably, we revealed differentially methylated cytosines in genome-wide coding regions under conditions of methionine depletion, which can be linked to transcript regulation.
Conclusions
This study offers new knowledge on H. pylori global and local genome-wide methylation and establishes EM-Seq for quantitative single-site resolution analyses of bacterial cytosine methylation.
... 1.4 × 10^7 reads from the asp7IM clone and 1.5 × 10^7 reads from the suaIIM clone were obtained. Methylation at m5C sites was determined by comparing the C>T deamination rates of read1 and read2 [21]. Motifs were determined by searching for over-represented sequences around these sites using pipelines based on both MoSDi [22] and DiNAMO [23], with similar results. ...
... From these data, one can readily identify DNA motifs around m6A and m4C methyl marks; m5C-associated motifs can also sometimes be identified, but with less efficiency and accuracy [25,26]. Alternative methods such as bisulfite sequencing, EM-seq [27], TAPS-seq [28], and RIMS-seq [21] are better suited to identifying m5C motifs, but they have not yet been applied to archaeal genomes at a large scale. Table 2 shows the number of genomes in each taxonomic group that have associated methylome data derived from SMRT sequencing, as well as the mean number of genes and observed motifs of each methylation type. ...
... Prior to this work, two examples from this clade had predicted recognition sites, although neither had been tested directly: M.SuaII had been predicted to modify RGATCY based on SMRT sequencing of Sulfolobus acidocaldarius DSM639 [14] and M.Asp7I was predicted to modify GGCAC in Acidilobus species 7A. To address the conflicting predictions, we cloned and expressed both genes in a methyl-deficient strain of E. coli and, using RIMS-seq [21], found both to modify the heterologous host chromosome in vivo at CCWGG sites, the same site modified in wild-type E. coli strains by the product of dcm. In other words, both predictions were incorrect. ...
When compared with bacteria, relatively little is known about the restriction–modification (RM) systems of archaea, particularly those in taxa outside of the haloarchaea. To improve our understanding of archaeal RM systems, we surveyed REBASE, the restriction enzyme database, to catalog what is known about the genes and activities present in the 519 completely sequenced archaeal genomes currently deposited there. For 49 (9.4%) of these genomes, we also have methylome data from Single-Molecule Real-Time (SMRT) sequencing that reveal the target recognition sites of the active m6A and m4C DNA methyltransferases (MTases). The gene-finding pipeline employed by REBASE is trained primarily on bacterial examples and so will look for similar genes in archaea. Nonetheless, the organizational structure and protein sequence of RM systems from archaea are highly similar to those of bacteria, with both groups acquiring systems from a shared genetic pool through horizontal gene transfer. As in bacteria, we observe numerous examples of “persistent” DNA MTases conserved within archaeal taxa at different levels. We experimentally validated two homologous members of one of the largest “persistent” MTase groups, revealing that methylation of C(m5C)WGG sites may play a key epigenetic role in Crenarchaea. Throughout the archaea, genes encoding m6A, m4C, and m5C DNA MTases, respectively, occur in approximately the ratio 4:2:1.
... combination with DNA substrates to detect methylation in vitro are feasible approaches, although optimization of experimental conditions may be required. [26,84] In sum, although assignment of prokaryotic enzyme specificity is now largely achieved via high-throughput analysis of genomes and methylomes, [85,86] proper assignment of specificity to novel eukaryotic MTases requires a thorough manual curation of database search outputs, as well as experimental verification of sequence specificity and DNA modification activity. ...
DNA methylation constitutes one of the pillars of epigenetics, relying on covalent bonds for addition and/or removal of chemically distinct marks within the major groove of the double helix. DNA methyltransferases, enzymes which introduce methyl marks, initially evolved in prokaryotes as components of restriction-modification systems protecting host genomes from bacteriophages and other invading foreign DNA. In early eukaryotic evolution, DNA methyltransferases were horizontally transferred from bacteria into eukaryotes several times and independently co-opted into epigenetic regulatory systems, primarily via establishing connections with the chromatin environment. While C5-methylcytosine is the cornerstone of plant and animal epigenetics and has been investigated in much detail, the epigenetic role of other methylated bases is less clear. Recent addition of N4-methylcytosine of bacterial origin as a metazoan DNA modification highlights the prerequisites for foreign gene co-option into the host regulatory networks, and challenges the existing paradigms concerning the origin and evolution of eukaryotic regulatory systems.
https://onlinelibrary.wiley.com/share/author/V54H5NWFQS34M42VDGAQ?target=10.1002/bies.202200232
... Genomic DNA was extracted from frozen tissues using the PureLink™ genomic DNA mini kit (K182001; Invitrogen), and the extracted DNA was treated with sodium bisulfite as described above [18,19]. The primers (forward, 5′-GGA TTT GTA TTG AGG TTT TGG AG-3′, reverse, 5′-TAA CCC ATC ACC TCC ACC AC-3′) were designed to amplify the promoter region of OCT4 and sequence the bisulfite genome from −234 to +46 in exon 1. ...
There is an ongoing debate regarding whether gliomas originate due to functional or genetic changes in neural stem cells (NSCs). Genetic engineering has made it possible to use NSCs to establish glioma models with the pathological features of human tumors. Here, we found that RAS, TERT, and p53 mutations or abnormal expression were associated with the occurrence of glioma in the mouse tumor transplantation model. Moreover, EZH2 palmitoylation mediated by ZDHHC5 played a significant role in this malignant transformation. EZH2 palmitoylation activates H3K27me3, which in turn decreases miR-1275, increases glial fibrillary acidic protein (GFAP) expression, and weakens the binding of DNA methyltransferase 3A (DNMT3A) to the OCT4 promoter region. Thus, these findings are significant because RAS, TERT, and p53 oncogenes in human neural stem cells are conducive to a fully malignant and rapid transformation, suggesting that gene changes and specific combinations of susceptible cell types are important factors in determining the occurrence of gliomas.
... We first characterized human thymine-DNA glycosylase hTDG repair on G:T and G:hmU mismatch activity on various sequence contexts using a NGS-based assay. For this, we used fully modified XP12 and T4gt phage genomic DNAs that harbor 5mC (16) and 5hmC (17) respectively and were subjected to a limited deamination using heat alkaline treatment following similar conditions as previously published (18) with some minor changes as described in Material and Methods. Using such treatment, the deamination rate of XP12 has been previously shown to be even across all contexts (18). ...
... For this, we used fully modified XP12 and T4gt phage genomic DNAs that harbor 5mC (16) and 5hmC (17) respectively and were subjected to a limited deamination using heat alkaline treatment following similar conditions as previously published (18) with some minor changes as described in Material and Methods. Using such treatment, the deamination rate of XP12 has been previously shown to be even across all contexts (18). We further demonstrate that the deamination rate of 5hmC is proportional to the heat alkaline treatment time (Supplementary Figure S2A) and evenly distributed across all analyzed contexts (Supplementary Figure S2B). ...
Avoiding damage-induced sequencing errors is a critical step for the accurate identification of medium to rare frequency mutations in DNA samples. In the case of FFPE samples, deamination of cytosine moieties represents a major damage resulting in the loss of DNA material and sequencing errors. In this study, we demonstrated that, while damage from deamination of both cytosine and methylated cytosine moieties results in elevated C to T transition, the error profiles and mediation strategies are different and easily distinguishable. While damage-induced sequencing errors from cytosine deamination are driven by the end-repair step commonly used in NGS workflow, DNA damage resulting from deamination of methylated cytosine is another major contributor to sequencing errors at CpG sites. Uracil DNA glycosylase and human thymine DNA glycosylase can respectively eliminate and mitigate both damages in FFPE DNA samples, therefore increasing sequencing accuracy notably for the identification of moderate allelic frequency variants.
... Three forms of DNA methylation are common in bacterial genomes: 5-methylcytosine (m5C), N4-methylcytosine (m4C), and N6-methyladenine (m6A) [23]. We profiled the C. phytofermentans genome for these three methylation patterns using two complementary sequencing methods: RIMS-seq identifies m5C as C-to-T mutations on Read 1 [23], and SMRT sequencing identifies m4C and m6A based on nucleotide incorporation rates measured as pulse width and interpulse duration [24]. RIMS-seq of the C. phytofermentans genome revealed m5C modification at 5′-GATC-3′ based on elevated C-to-T mutations (p = 1.23 × 10^−4061) (Figure 1A). ...
... We identified DNA methylation patterns in the C. phytofermentans ISDg genome using SMRT sequencing [53] and RIMS-seq [23]. SMRT DNA sequencing was performed at the U.S. Department of Energy Joint Genome Institute (DOE JGI Project ID 1100006), and sequence files are available in the NCBI SRA (accession code SRP123768). RIMS-seq was performed at Genoscope-CEA, and sequence files are available in the European Nucleotide Archive (accession code ERA17973853). ...
Control of gene expression is fundamental to cell engineering. Here we demonstrate a set of approaches to tune gene expression in Clostridia using the model Clostridium phytofermentans. Initially, we develop a simple benchtop electroporation method that we use to identify a set of replicating plasmids and resistance markers that can be cotransformed into C. phytofermentans. We define a series of promoters spanning a >100-fold expression range by testing a promoter library driving the expression of a luminescent reporter. By insertion of tet operator sites upstream of the reporter, its expression can be quantitatively altered using the Tet repressor and anhydrotetracycline (aTc). We integrate these methods into an aTc-regulated dCas12a system with which we show in vivo CRISPRi-mediated repression of reporter and fermentation genes in C. phytofermentans. Together, these approaches advance genetic transformation and experimental control of gene expression in Clostridia.
Mining phages for new enzymatic activities continues to be important for the development of new tools for biotechnology. In this study, we used MetaGPA—a method linking genotype to phenotype in metagenomic data—to identify deoxycytidine deaminases, a protein family highly associated with cytosine modifications in metaviromes. Unexpectedly, a subset of these deaminases exhibited a preference for 5-methylcytosine (5mC) over cytosine (C) in both mononucleotide and single-stranded DNA substrates. In a methylome sequencing workflow, preferential deamination of 5mC by these enzymes enabled direct conversion of methylated cytosine while completely eliminating any background deamination of unmodified cytosine. This direct conversion allows for precise identification of methylated sites at single-base resolution with unmatched sensitivity enabling broad applications for the simultaneous sequencing of genome and methylome.
Multiomics requires concerted recording of independent information, ideally from a single experiment. In this study, we introduce RIMS-seq2, a high-throughput technique to simultaneously sequence genomes and overlay methylation information while requiring only a small modification of the experimental protocol for high-throughput DNA sequencing to include a controlled deamination step. Importantly, the rate of deamination of 5mC is negligible and thus does not interfere with standard DNA sequencing and data processing. Thus, RIMS-seq2 libraries from whole or targeted genome sequencing show the same germline variation calling accuracy and sensitivity as compared to standard DNA-seq. Additionally, regional methylation levels provide an accurate map of the human methylome.
Although restriction-modification systems are found in both Eubacterial and Archaeal kingdoms, comparatively less is known about patterns of DNA methylation and genome defense systems in archaea. Here we report the complete closed genome sequence and methylome analysis of Methanococcus aeolicus PL15/Hp, a strain of the CO2-reducing methanogenic archaeon and a commercial source for MaeI, MaeII, and MaeIII restriction endonucleases. The M. aeolicus PL15/Hp genome consists of a 1.68 megabase circular chromosome predicted to contain 1,615 protein coding genes and 38 tRNAs. A combination of methylome sequencing, homology-based genome annotation, and recombinant gene expression identified five restriction-modification systems encoded by this organism, including the methyltransferase and site-specific endonuclease of MaeIII. The MaeIII restriction endonuclease was recombinantly expressed, purified and shown to have site-specific DNA cleavage activity in vitro.