ABSTRACT: The methylation of DNA bases plays an important role in numerous biological processes including development, gene expression, and DNA replication. Salmonella is an important foodborne pathogen, and methylation in Salmonella is implicated in virulence. Using single molecule real-time (SMRT) DNA-sequencing, we sequenced and assembled the complete genomes of eleven Salmonella enterica isolates from nine different serovars, and analysed the whole-genome methylation patterns of each genome. We describe 16 distinct N6-methyladenine (m6A) methylated motifs, one N4-methylcytosine (m4C) motif, and one combined m6A-m4C motif. Eight of these motifs are novel, i.e., they have not been previously described. We also identified the methyltransferases (MTases) associated with 13 of the motifs. Some motifs are conserved across all Salmonella serovars tested, while others were found only in a subset of serovars. Eight of the nine serovars contained a unique methylated motif that was not found in any other serovar (most of these motifs were part of Type I restriction modification systems), indicating the high diversity of methylation patterns present in Salmonella.
PLoS ONE 04/2015; 10(4):e0123639. DOI:10.1371/journal.pone.0123639 · 3.23 Impact Factor
ABSTRACT: TET/JBP enzymes oxidize 5-methylpyrimidines in DNA. In mammals, the oxidized methylcytosines (oxi-mCs) function as epigenetic marks and likely intermediates in DNA demethylation. Here we present a method based on diglucosylation of 5-hydroxymethylcytosine (5hmC) to simultaneously map 5hmC, 5-formylcytosine, and 5-carboxylcytosine at near-base-pair resolution. We have used the method to map the distribution of oxi-mC across the genome of Coprinopsis cinerea, a basidiomycete that encodes 47 TET/JBP paralogs in a previously unidentified class of DNA transposons. Like 5-methylcytosine residues from which they are derived, oxi-mC modifications are enriched at centromeres, TET/JBP transposons, and multicopy paralogous genes that are not expressed, but rarely mark genes whose expression changes between two developmental stages. Our study provides evidence for the emergence of an epigenetic regulatory system through recruitment of selfish elements in a eukaryotic lineage, and describes a method to map all three different species of oxi-mCs simultaneously.
Proceedings of the National Academy of Sciences 11/2014; DOI:10.1073/pnas.1419513111 · 9.67 Impact Factor
ABSTRACT: The Campylobacter lari group is a phylogenetic clade within the epsilon subdivision of the Proteobacteria and is part of the thermotolerant Campylobacter spp., a division within the genus that includes the human pathogen Campylobacter jejuni. The C. lari group is currently composed of five species (C. lari, C. insulaenigrae, C. volucris, C. subantarcticus and C. peloridis), as well as a group of strains termed the urease-positive thermophilic Campylobacter (UPTC) and other C. lari-like strains. Here we present the complete genome sequences of 11 C. lari group strains, including the five C. lari group species, four UPTC strains and a lari-like strain isolated in this study. The genome of C. lari subsp. lari strain RM2100 was described previously. Analysis of the C. lari group genomes indicates that this group is highly related at the genome level. Furthermore, these genomes are strongly syntenic, with minor rearrangements occurring in only four of the twelve genomes studied. The C. lari group can be bifurcated based on the flagella and flagellar modification genes. Genomic analysis of the UPTC strains indicated that these organisms are variable but highly similar, closely related to but distinct from C. lari. Additionally, the C. lari group contains multiple genes encoding hemagglutination domain proteins, which are either contingency genes or linked to conserved contingency genes. Many of the features identified in strain RM2100, such as major deficiencies in amino acid biosynthesis and energy metabolism, are conserved across all 12 genomes, suggesting that these common features may play a role in the association of the C. lari group with coastal environments and watersheds.
ABSTRACT: Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem resistance genes on a wide array of plasmids. K. pneumoniae and E. cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, indicating that plasmid transfer between organisms was unlikely within this patient. We did, however, find evidence of horizontal transfer of carbapenemase-encoding plasmids between K. pneumoniae, E. cloacae, and C. freundii in the hospital environment. Our data, including full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by K. pneumoniae, E. coli, E. cloacae, and Pantoea species, in unrelated patients and in the hospital environment.
Science Translational Medicine 09/2014; 6(254):254ra126. DOI:10.1126/scitranslmed.3009845 · 15.84 Impact Factor
ABSTRACT: Bacterial phosphorothioate (PT) DNA modifications are incorporated by Dnd proteins A-E and often function with DndF-H as a restriction-modification (R-M) system, as in Escherichia coli B7A. However, bacteria such as Vibrio cyclitrophicus FF75 lack dndF-H, which points to other PT functions. Here we report two novel, orthogonal technologies to map PTs across the genomes of B7A and FF75 with >90% agreement: single molecule, real-time sequencing and deep sequencing of iodine-induced cleavage at PT (ICDS). In B7A, we detect PT on both strands of GpsAAC/GpsTTC motifs, but with only 12% of 40,701 possible sites modified. In contrast, PT in FF75 occurs as a single-strand modification at CpsCA, again with only 14% of 160,541 sites modified. Single-molecule analysis indicates that modification could be partial at any particular genomic site even with active restriction by DndF-H, with direct interaction of modification proteins with GAAC/GTTC sites demonstrated with oligonucleotides. These results point to highly unusual target selection by PT-modification proteins and rule out known R-M mechanisms.
ABSTRACT: The genome of Helicobacter pylori is remarkable for its large number of restriction-modification (R-M) systems, and strain-specific diversity in R-M systems has been suggested to limit natural transformation, the major driving force of genetic diversification in H. pylori. We have determined the comprehensive methylomes of two H. pylori strains at single base resolution, using Single Molecule Real-Time (SMRT®) sequencing. For strains 26695 and J99-R3, 17 and 22 methylated sequence motifs were identified, respectively. For most motifs, almost all sites occurring in the genome were detected as methylated. Twelve novel methylation patterns corresponding to nine recognition sequences were detected (26695, 3; J99-R3, 6). Functional inactivation, correction of frameshifts as well as cloning and expression of candidate methyltransferases (MTases) permitted not only the functional characterization of multiple, yet undescribed, MTases, but also revealed novel features of both Type I and Type II R-M systems, including frameshift-mediated changes of sequence specificity and the interaction of one MTase with two alternative specificity subunits resulting in different methylation patterns. The methylomes of these well-characterized H. pylori strains will provide a valuable resource for future studies investigating the role of H. pylori R-M systems in limiting transformation as well as in gene regulation and host interaction.
Nucleic Acids Research 12/2013; 42(4). DOI:10.1093/nar/gkt1201 · 9.11 Impact Factor
ABSTRACT: The Caulobacter DNA methyltransferase CcrM is one of five master cell-cycle regulators. CcrM is transiently present near the end of DNA replication when it rapidly methylates the adenine in hemimethylated GANTC sequences. The timing of transcription of two master regulator genes and two cell division genes is controlled by the methylation state of GANTC sites in their promoters. To explore the global extent of this regulatory mechanism, we determined the methylation state of the entire chromosome at every base pair at five time points in the cell cycle using single-molecule, real-time sequencing. The methylation state of 4,515 GANTC sites, preferentially positioned in intergenic regions, changed progressively from full to hemimethylation as the replication forks advanced. However, 27 GANTC sites remained unmethylated throughout the cell cycle, suggesting that these protected sites could participate in epigenetic regulatory functions. An analysis of the time of activation of every cell-cycle regulatory transcription start site, coupled to both the position of a GANTC site in their promoter regions and the time in the cell cycle when the GANTC site transitions from full to hemimethylation, allowed the identification of 59 genes as candidates for epigenetic regulation. In addition, we identified two previously unidentified N(6)-methyladenine motifs and showed that they maintained a constant methylation state throughout the cell cycle. The cognate methyltransferase was identified for one of these motifs as well as for one of two 5-methylcytosine motifs.
Proceedings of the National Academy of Sciences 11/2013; 110(48). DOI:10.1073/pnas.1319315110 · 9.67 Impact Factor
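As an illustration of what "GANTC sites" means in the abstract above, the following minimal sketch scans a sequence for a motif written with IUPAC degenerate codes (N stands for any base). The function names and the toy sequence are hypothetical and are not taken from the study; the sketch only locates candidate sites and says nothing about their methylation state.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + pattern + "))")

def find_sites(seq, motif):
    """Return 0-based start positions of every motif occurrence on the given strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]

# Toy sequence (made up for illustration) containing GAATC, GAGTC and GACTC.
toy = "TTGAATCCGAGTCAAGACTCGG"
sites = find_sites(toy, "GANTC")
print(sites)
```

Because GANTC is its own reverse complement, each hit on one strand corresponds to a site on the opposite strand as well, so a single-strand scan is enough to enumerate candidate positions for this particular motif.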
ABSTRACT: We performed whole genome analyses of DNA methylation in Shewanella oneidensis MR-1 to examine its possible role in regulating gene expression and other cellular processes. Single-Molecule Real Time (SMRT) sequencing revealed extensive methylation of adenine (N6mA) throughout the genome. These methylated bases were located in five sequence motifs, including three novel targets for Type I restriction/modification enzymes. The sequence motifs targeted by putative methyltransferases were determined via SMRT sequencing of gene knockout mutants. In addition, we found that S. oneidensis MR-1 cultures grown under various culture conditions displayed different DNA methylation patterns. However, the small number of differentially methylated sites could not be directly linked to the much larger number of differentially expressed genes in these conditions, suggesting DNA methylation is not a major regulator of gene expression in S. oneidensis MR-1. The enrichment of methylated GATC motifs in the origin of replication indicates that DNA methylation may regulate genome replication in a manner similar to that seen in Escherichia coli. Furthermore, comparative analyses suggest that many Gammaproteobacteria, including all members of the Shewanellaceae family, may also utilize DNA methylation to regulate genome replication.
Journal of Bacteriology 08/2013; 195(21). DOI:10.1128/JB.00935-13 · 2.81 Impact Factor
ABSTRACT: Haplogroup H dominates present-day Western European mitochondrial DNA variability (>40%), yet was less common (~19%) among Early Neolithic farmers (~5450 BC) and virtually absent in Mesolithic hunter-gatherers. Here we investigate this major component of the maternal population history of modern Europeans and sequence 39 complete haplogroup H mitochondrial genomes from ancient human remains. We then compare this ‘real-time’ genetic data with cultural changes taking place between the Early Neolithic (~5450 BC) and Bronze Age (~2200 BC) in Central Europe. Our results reveal that the current diversity and distribution of haplogroup H were largely established by the Mid Neolithic (~4000 BC), but with substantial genetic contributions from subsequent pan-European cultures such as the Bell Beakers expanding out of Iberia in the Late Neolithic (~2800 BC). Dated haplogroup H genomes allow us to reconstruct the recent evolutionary history of haplogroup H and reveal a mutation rate 45% higher than current estimates for human mitochondria.
ABSTRACT: Author Summary
DNA modifications have been found in a wide range of living organisms, from bacteria to humans. Many existing studies have shown that they play important roles in development, disease, bacterial virulence, etc. However, for many types of DNA modification, for example N6-methyladenine and 8-oxoG, there is no efficient and accurate detection method. Single molecule real time (SMRT) sequencing not only generates DNA sequences, but also generates DNA polymerase kinetic information. The kinetic information is sensitive to DNA modifications in the sequenced DNA template, and can therefore be used to detect a wide range of DNA modification types. The usual detection strategy is a case-control method, which compares kinetic information between a native sample and a control sample whose modifications have been removed. However, generating a control sample doubles the cost. We propose a hierarchical model that can incorporate existing SMRT sequencing data to increase detection accuracy and reduce the coverage required of the control sample, or even avoid the need for a control sample in some cases. We tested our method on SMRT sequencing data from plasmids with known modified sites and from E. coli strain K-12, demonstrating that it can greatly increase detection accuracy and reduce sequencing cost.
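The case-control strategy described above (comparing kinetic information between a native sample and a modification-free control) can be sketched roughly as follows. This is not the hierarchical model the authors propose; it is a plain per-position Welch t-statistic, chosen here as an assumed stand-in. The function names, the threshold, and the numeric values are all hypothetical and invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(native, control):
    """Welch's t-statistic for two lists of per-read kinetic values at one position."""
    se = sqrt(variance(native) / len(native) + variance(control) / len(control))
    if se == 0:
        return 0.0  # identical, zero-variance samples: treat as no evidence
    return (mean(native) - mean(control)) / se

def flag_positions(native_by_pos, control_by_pos, threshold=4.0):
    """Return positions whose native kinetics differ strongly from the control.

    The threshold is an arbitrary illustrative cutoff, not a calibrated value.
    """
    return [
        pos for pos in sorted(native_by_pos)
        if pos in control_by_pos
        and abs(welch_t(native_by_pos[pos], control_by_pos[pos])) > threshold
    ]

# Synthetic kinetic values (made up): position 11 is slowed in the native sample.
native = {10: [1.0, 1.1, 0.9, 1.0], 11: [3.0, 3.2, 2.9, 3.1]}
control = {10: [1.0, 0.9, 1.1, 1.0], 11: [1.0, 1.1, 0.9, 1.0]}
print(flag_positions(native, control))
```

A per-position test like this needs adequate coverage in both samples at every site, which is the cost the abstract says a model that borrows information from existing data aims to reduce.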
ABSTRACT: We report a closed genome of Salmonella enterica subsp. enterica serovar Javiana (S. Javiana). This serotype is a common food-borne pathogen and is often associated with fresh-cut produce. Complete (finished) genome assemblies will support pilot studies testing the utility of next-generation sequencing (NGS) technologies in public health laboratories.
ABSTRACT: Background
DNA methylation serves as an important epigenetic mark in both eukaryotic and prokaryotic organisms. In eukaryotes, the most common epigenetic mark is 5-methylcytosine, whereas prokaryotes can have 6-methyladenine, 4-methylcytosine, or 5-methylcytosine. Single-molecule, real-time sequencing is capable of directly detecting all three types of modified bases. However, the kinetic signature of 5-methylcytosine is subtle, which presents a challenge for detection. We investigated whether conversion of 5-methylcytosine to 5-carboxylcytosine using the enzyme Tet1 would enhance the kinetic signature, thereby improving detection.
Results
We characterized the kinetic signatures of various cytosine modifications, demonstrating that 5-carboxylcytosine has a larger impact on the local polymerase rate than 5-methylcytosine. Using Tet1-mediated conversion, we show improved detection of 5-methylcytosine using in vitro methylated templates and apply the method to the characterization of 5-methylcytosine sites in the genomes of Escherichia coli MG1655 and Bacillus halodurans C-125.
Conclusions
We have developed a method that enhances the direct detection of 5-methylcytosine during single-molecule, real-time sequencing. Using Tet1 to convert 5-methylcytosine to 5-carboxylcytosine improves the detection rate of this important epigenetic marker, thereby complementing the set of readily detectable microbial base modifications, and enhancing the ability to interrogate eukaryotic epigenetic markers.
ABSTRACT: In the bacterial world, methylation is most commonly associated with restriction-modification systems that provide a defense mechanism against invading foreign genomes. In addition, it is known that methylation plays functionally important roles, including timing of DNA replication, chromosome partitioning, DNA repair, and regulation of gene expression. However, full DNA methylome analyses are scarce due to a lack of a simple methodology for rapid and sensitive detection of common epigenetic marks (i.e., N(6)-methyladenine (6mA) and N(4)-methylcytosine (4mC)) in these organisms. Here, we use Single-Molecule Real-Time (SMRT) sequencing to determine the methylomes of two related human pathogen species, Mycoplasma genitalium G-37 and Mycoplasma pneumoniae M129, with single-base resolution. Our analysis identified two new methylation motifs not previously described in bacteria: a widespread 6mA methylation motif common to both bacteria (5'-CTAT-3'), as well as a more complex Type I 6mA sequence motif in M. pneumoniae (5'-GAN(7)TAY-3'/3'-CTN(7)ATR-5'). We identify the methyltransferase responsible for the common motif and suggest the one involved in M. pneumoniae only. Analysis of the distribution of methylation sites across the genome of M. pneumoniae suggests a potential role for methylation in regulating the cell cycle, as well as in regulation of gene expression. To our knowledge, this is one of the first direct methylome profiling studies with single-base resolution from a bacterial organism.
ABSTRACT: "Candidatus Microthrix" bacteria are deeply branching filamentous actinobacteria which occur at the water-air interface of biological wastewater treatment plants, where they are often responsible for foaming and bulking. Here, we report the first draft genome sequence of a strain from this genus: "Candidatus Microthrix parvicella" strain Bio17-1.
Journal of Bacteriology 12/2012; 194(23):6670-1. DOI:10.1128/JB.01765-12 · 2.81 Impact Factor
ABSTRACT: Single-molecule real-time (SMRT) DNA sequencing allows the systematic detection of chemical modifications such as methylation but has not previously been applied on a genome-wide scale. We used this approach to detect 49,311 putative 6-methyladenine (m6A) residues and 1,407 putative 5-methylcytosine (m5C) residues in the genome of a pathogenic Escherichia coli strain. We obtained strand-specific information for methylation sites and a quantitative assessment of the frequency of methylation at each modified position. We deduced the sequence motifs recognized by the methyltransferase enzymes present in this strain without prior knowledge of their specificity. Furthermore, we found that deletion of a phage-encoded methyltransferase-endonuclease (restriction-modification; RM) system induced global transcriptional changes and led to gene amplification, suggesting that the role of RM systems extends beyond protecting host genomes from foreign DNA.
ABSTRACT: Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date, no statistical framework has been proposed to enhance the power to detect these events while also controlling for false positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites, which helps enhance the power to detect KV events. The performance of this and related models is explored, with the best performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events while others represent putative chemically modified sites of unknown types.
Genome Research 10/2012; 23(1). DOI:10.1101/gr.136739.111 · 14.63 Impact Factor
ABSTRACT: Six bacterial genomes, Geobacter metallireducens GS-15, Chromohalobacter salexigens, Vibrio breoganii 1C-10, Bacillus cereus ATCC 10987, Campylobacter jejuni subsp. jejuni 81-176 and C. jejuni NCTC 11168, all of which had previously been sequenced using other platforms, were re-sequenced using single-molecule, real-time (SMRT) sequencing specifically to analyze their methylomes. In every case a number of new N(6)-methyladenine ((m6)A) and N(4)-methylcytosine ((m4)C) methylation patterns were discovered and the DNA methyltransferases (MTases) responsible for those methylation patterns were assigned. In 15 cases, it was possible to match MTase genes with MTase recognition sequences without further sub-cloning. Two Type I restriction systems required sub-cloning to differentiate their recognition sequences, while four MTase genes that were not expressed in the native organism were sub-cloned to test for viability and recognition sequences. Two of these proved active. No attempt was made to detect 5-methylcytosine ((m5)C) recognition motifs from the SMRT® sequencing data because this modification produces weaker signals using current methods. However, all predicted (m6)A and (m4)C MTases were detected unambiguously. This study shows that the addition of SMRT sequencing to traditional sequencing approaches gives a wealth of useful functional information about a genome, showing not only which MTase genes are active but also revealing their recognition sequences.
Nucleic Acids Research 10/2012; 40(22). DOI:10.1093/nar/gks891 · 9.11 Impact Factor
ABSTRACT: Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.
ABSTRACT: We exploit the optical and spatial features of subwavelength nanostructures to examine individual receptors on the plasma membrane of living cells. Receptors were sequestered in portions of the membrane projected into zero-mode waveguides. Using single-step photobleaching of green fluorescent protein incorporated into individual subunits, the resulting spatial isolation was used to measure subunit stoichiometry in α4β4 and α4β2 nicotinic acetylcholine and P2X2 ATP receptors. We also show that nicotine and cytisine have differential effects on α4β2 stoichiometry.