The essential genome of a bacterium. Mol Syst Biol 7:528

Department of Developmental Biology, Stanford University, Stanford, CA, USA.
Molecular Systems Biology (Impact Factor: 10.87). 08/2011; 7(1):528. DOI: 10.1038/msb.2011.58


The regulatory events that control polar differentiation and cell-cycle progression in the bacterium Caulobacter crescentus are highly integrated, and they have to occur in the proper order (McAdams and Shapiro, 2011). Components of the core regulatory circuit are largely known. Full discovery of its essential genome, including non-coding, regulatory and coding elements, is a prerequisite for understanding the complete regulatory network of this bacterial cell. We have identified all the essential coding and non-coding elements of the Caulobacter chromosome using a hyper-saturated transposon mutagenesis strategy that is scalable and can be readily extended to obtain rapid and accurate identification of the essential genome elements of any sequenced bacterial species at a resolution of a few base pairs.
We engineered a Tn5 derivative transposon (Tn5Pxyl) that carries at one end an inducible, outward-pointing Pxyl promoter (Christen et al, 2010). We showed that this transposon construct inserts into the genome randomly, where it can activate or disrupt transcription at the site of integration, depending on the insertion orientation. DNA from hundreds of thousands of transposon insertion sites reading outward into flanking genomic regions was PCR amplified in parallel and sequenced by Illumina paired-end sequencing to locate the insertion site in each mutant strain (Figure 1). A single sequencing run on DNA from a mutagenized cell population yielded 118 million raw sequencing reads. Of these, >90 million (>80%) read outward from the transposon element into adjacent genomic DNA regions and the insertion site could be mapped with single nucleotide resolution. This yielded the location and orientation of 428 735 independent transposon insertions in the 4-Mbp Caulobacter genome.
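As a rough illustration of what this insertion density implies, the sketch below computes the mean spacing between insertions and the chance that a fixed window stays free of insertions purely by chance. It assumes a genome of exactly 4,000,000 bp (the "4-Mbp" figure rounded) and insertions that land uniformly and independently, which is an idealization of real Tn5 behavior. The function names and the 100-bp window are hypothetical choices for illustration, not part of the published analysis.

```python
# Back-of-the-envelope model: uniform, independent insertions (an assumption).
GENOME_LENGTH_BP = 4_000_000  # "4-Mbp" rounded; the exact length is not given here
N_INSERTIONS = 428_735        # independent insertions reported in the text


def mean_spacing_bp(genome_length: int, n_insertions: int) -> float:
    """Average distance in bp between neighboring insertions."""
    return genome_length / n_insertions


def prob_window_empty(window_bp: int, genome_length: int, n_insertions: int) -> float:
    """Chance that one fixed window receives none of the insertions."""
    return (1.0 - window_bp / genome_length) ** n_insertions


if __name__ == "__main__":
    spacing = mean_spacing_bp(GENOME_LENGTH_BP, N_INSERTIONS)
    p_empty = prob_window_empty(100, GENOME_LENGTH_BP, N_INSERTIONS)
    print(f"mean spacing: {spacing:.2f} bp")
    print(f"P(100-bp window empty by chance): {p_empty:.2e}")
```

Under these assumptions the mean spacing is a little over 9 bp, and a 100-bp window is very unlikely to be empty by chance, which is why long insertion-free stretches are informative.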
Within non-coding sequences of the Caulobacter genome, we detected 130 non-disruptable DNA segments between 90 and 393 bp long in addition to all essential promoter elements. Among 27 previously identified and validated sRNAs (Landt et al, 2008), three were contained within non-disruptable DNA segments and another three were partially disruptable, that is, insertions caused a notable growth defect. Two additional small RNAs found to be essential are the transfer-messenger RNA (tmRNA) and the ribozyme RNAseP (Landt et al, 2008). In addition to the 8 non-disruptable sRNAs, 29 out of the 130 intergenic essential non-coding sequences contained non-redundant tRNA genes; duplicated tRNA genes were non-essential. We also identified two non-disruptable DNA segments within the chromosomal origin of replication. Thus, we resolved essential non-coding RNAs, tRNAs and essential replication elements within the origin region of the chromosome. An additional 90 non-disruptable small genome elements of currently unknown function were identified. Eighteen of these are conserved in at least one closely related species. Only 2 could encode a protein of over 50 amino acids.
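The search for non-disruptable segments can be pictured as a scan for insertion-free gaps between mapped insertion sites. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `insertion_free_gaps` and the `min_gap` threshold are hypothetical, coordinates are 1-based, and the chromosome is treated as linear even though the Caulobacter chromosome is circular, so a gap spanning the coordinate origin would be reported as two pieces.

```python
from typing import Iterable, List, Tuple


def insertion_free_gaps(
    positions: Iterable[int], genome_length: int, min_gap: int
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) intervals of at least
    min_gap bp that contain no insertion site."""
    # Deduplicate and sort the sites, ignoring any outside the genome.
    sites = sorted({p for p in positions if 1 <= p <= genome_length})
    gaps: List[Tuple[int, int]] = []
    prev = 0  # sentinel just before the first base
    # A sentinel just past the last base closes the final gap.
    for pos in sites + [genome_length + 1]:
        if pos - prev - 1 >= min_gap:
            gaps.append((prev + 1, pos - 1))
        prev = pos
    return gaps


if __name__ == "__main__":
    # Toy data, not real insertion coordinates.
    print(insertion_free_gaps([5, 10, 11, 30], genome_length=40, min_gap=5))
```

A real analysis would also need to decide whether a gap is longer than expected by chance given the local insertion density, and whether it overlaps annotated features.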
For each of the 3876 annotated open reading frames (ORFs), we analyzed the distribution, orientation, and genetic context of transposon insertions. There are 480 essential ORFs and 3240 non-essential ORFs. In addition, there were 156 ORFs that severely impacted fitness when mutated. The 8-bp resolution allowed a dissection of the essential and non-essential regions of the coding sequences. Sixty ORFs had transposon insertions within a significant portion of their 3′ region but lacked insertions in the essential 5′ coding region, allowing the identification of non-essential protein segments. For example, transposon insertions in the essential cell-cycle regulatory gene divL, a tyrosine kinase, showed that the last 204 C-terminal amino acids did not impact viability, confirming previous reports that the C-terminal ATPase domain of DivL is dispensable for viability (Reisinger et al, 2007; Iniesta et al, 2010). In addition, we found that 30 out of 480 (6.3%) of the essential ORFs appear to be shorter than the annotated ORF, suggesting that these are probably mis-annotated.
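The idea of reading a dispensable C-terminal segment from insertions in the 3′ region can be sketched as follows. This is an illustrative simplification, assuming that every insertion inside the ORF is tolerated and that the 5′-most insertion marks the start of the dispensable tail. The function name and coordinate conventions are hypothetical, and the criteria actually used to call essential and non-essential segments are not given in this text.

```python
from typing import Iterable, Optional


def dispensable_c_terminal_codons(
    orf_start: int, orf_end: int, strand: str, insertions: Iterable[int]
) -> Optional[int]:
    """Count codons from the 5'-most insertion to the end of the ORF.

    Coordinates are 1-based inclusive genome positions with
    orf_start <= orf_end; strand is '+' or '-'. The count includes the
    stop codon if the ORF coordinates include it. Returns None when no
    insertion falls inside the ORF.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    inside = [p for p in insertions if orf_start <= p <= orf_end]
    if not inside:
        return None
    # 0-based nucleotide offset of the 5'-most insertion within the ORF.
    if strand == "+":
        offset = min(inside) - orf_start
    else:
        offset = orf_end - max(inside)
    total_codons = (orf_end - orf_start + 1) // 3
    first_hit_codon = offset // 3  # 0-based index of the disrupted codon
    return total_codons - first_hit_codon


if __name__ == "__main__":
    # Toy ORF of 300 nt (100 codons) with insertions only near its 3' end.
    print(dispensable_c_terminal_codons(101, 400, "+", [50, 341, 380]))
```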
Among the 480 ORFs essential for growth on rich media, there were 10 essential transcriptional regulatory proteins, including 5 previously identified cell-cycle regulators (McAdams and Shapiro, 2003; Holtzendorff et al, 2004; Collier and Shapiro, 2007; Gora et al, 2010; Tan et al, 2010) and 5 uncharacterized predicted transcription factors. In addition, two RNA polymerase sigma factors RpoH and RpoD, as well as the anti-sigma factor ChrR, which mitigates rpoE-dependent stress response under physiological growth conditions (Lourenco and Gomes, 2009), were also found to be essential. Thus, a set of 10 transcription factors, 2 RNA polymerase sigma factors and 1 anti-sigma factor are the core essential transcriptional regulators for growth on rich media. To further characterize the core components of the Caulobacter cell-cycle control network, we identified all essential regulatory sequences and operon transcripts. Altogether, the 480 essential protein-coding and 37 essential RNA-coding Caulobacter genes are organized into operons such that 402 individual promoter regions are sufficient to regulate their expression. Of these 402 essential promoters, the transcription start sites (TSSs) of 105 were previously identified (McGrath et al, 2007).
The essential genome features are non-uniformly distributed on the Caulobacter genome and are enriched near the origin and terminus regions. In contrast, the published E. coli essential coding sequences (Rocha, 2004) are preferentially located on either side of the origin (Figure 4A). This indicates that there are selective pressures on the chromosomal positioning of some essential elements.
The strategy described in this report could be readily extended to quickly determine the essential genome for a large class of bacterial species.


Available from: Beat Christen
    • "When Tn-seq profiles of a library are quantitatively compared with an appropriate normalization and statistical method between before and after a selection, the genetic factors that are required for optimal growth or survival under the selection process can easily be identified on a genomic scale. Since the first versions of Tn-seq methods were reported (Gawronski et al. 2009; Goodman et al. 2009; Langridge et al. 2009; van Opijnen et al. 2009), several variations on the method have been described (Christen et al. 2011; Dawoud et al. 2014; Gallagher et al. 2011; Khatiwara et al. 2012; Klein et al. 2012). These variations differ mainly in the manner in which specific amplification of the transposon junction sequences is accomplished. "
    ABSTRACT: A comprehensive understanding of genotype-phenotype links in bacteria is the primary theme of bacterial functional genomics. Transposon sequencing (Tn-seq) or its equivalent methods that combine random transposon mutagenesis and next-generation sequencing (NGS) represent a powerful approach to understand gene functions in bacteria on a genome-wide scale. This approach has been utilized in a variety of bacterial species to provide comprehensive information on gene functions related to various phenotypes or biological processes of significance. With further improvements in the molecular protocol for specific amplification of transposon junction sequences and increasing capacity of next generation sequencing technologies, the applications of Tn-seq have been expanding to tackle questions that are important yet difficult to address in the past. In this review, we will discuss the technical aspects of different Tn-seq methods along with their pros and cons to provide a helpful guidance for those who want to implement or improve Tn-seq for their own research projects. In addition, we also provide a comprehensive summary of recent published studies based on Tn-seq methods to give an updated perspective on the current and emerging applications of Tn-seq.
    Article · Oct 2015 · Applied Microbiology and Biotechnology
    • "This illustrates the limitation of transposon essentiality studies using deep sequencing for fitness genes. Notably, we found that the insertions were not evenly distributed along the entire ORFs as previously observed in Caulobacter crescentus (Christen et al, 2011). In this respect, it is important to note that our mini-transposon has an internal promoter that could allow expression of downstream genes or domains if there is a start codon for translation. "
    ABSTRACT: Identifying all essential genomic components is critical for the assembly of minimal artificial life. In the genome-reduced bacterium Mycoplasma pneumoniae, we found that small ORFs (smORFs; < 100 residues), accounting for 10% of all ORFs, are the most frequently essential genomic components (53%), followed by conventional ORFs (49%). Essentiality of smORFs may be explained by their function as members of protein and/or DNA/RNA complexes. In larger proteins, essentiality applied to individual domains and not entire proteins, a notion we could confirm by expression of truncated domains. The fraction of essential non-coding RNAs (ncRNAs) non-overlapping with essential genes is 5% higher than of non-transcribed regions (0.9%), pointing to the important functions of the former. We found that the minimal essential genome is comprised of 33% (269,410 bp) of the M. pneumoniae genome. Our data highlight an unexpected hidden layer of smORFs with essential functions, as well as non-coding regions, thus changing the focus when aiming to define the minimal essential genome. © 2015 The Authors. Published under the terms of the CC BY 4.0 license.
    Article · Jan 2015 · Molecular Systems Biology
    • "Transposon insertion sequencing, a technique first reported in 2009 that allows genome-wide analysis of insertions impairing bacterial fitness, was however suitable for identifying sRNA and other non-coding regulatory sequences (reviewed in Van Opijnen and Camilli, 2013). High-density transposon libraries identified non-coding RNAs important for growth of Caulobacter crescentus (Christen et al., 2011) and the pathogens Mycobacterium tuberculosis and Streptococcus pneumoniae (Mann et al., 2012; Zhang et al., 2012). Analysis of S. pneumoniae libraries in in vivo infection models also revealed sRNAs required for virulence (Mann et al., 2012). "
    ABSTRACT: Intracellular bacterial pathogens have evolved distinct lifestyles inside eukaryotic cells. Some pathogens coexist with the infected cell in an obligate intracellular state, whereas others transit between the extracellular and intracellular environment. Adaptation to these intracellular lifestyles is regulated in both space and time. Non-coding small RNAs (sRNAs) are post-transcriptional regulatory molecules that fine-tune important processes in bacterial physiology including cell envelope architecture, intermediate metabolism, bacterial communication, biofilm formation, and virulence. Recent studies have shown production of defined sRNA species by intracellular bacteria located inside eukaryotic cells. The molecules targeted by these sRNAs and their expression dynamics along the intracellular infection cycle remain, however, poorly characterized. Technical difficulties linked to the isolation of "intact" intracellular bacteria from infected host cells might explain why sRNA regulation in these specialized pathogens is still a largely unexplored field. Transition from the extracellular to the intracellular lifestyle provides an ideal scenario in which regulatory sRNAs are intended to participate; so much work must be done in this direction. This review focuses on sRNAs expressed by intracellular bacterial pathogens during the infection of eukaryotic cells, strategies used with these pathogens to identify sRNAs required for virulence, and the experimental technical challenges associated to this type of studies. We also discuss varied techniques for their potential application to study RNA regulation in intracellular bacterial infections.
    Article · Nov 2014 · Frontiers in Cellular and Infection Microbiology