The essential genome of a bacterium. Mol Syst Biol 7:528

Department of Developmental Biology, Stanford University, Stanford, CA, USA.
Molecular Systems Biology (Impact Factor: 10.87). 08/2011; 7(1):528. DOI: 10.1038/msb.2011.58
Source: PubMed


The regulatory events that control polar differentiation and cell-cycle progression in the bacterium Caulobacter crescentus are highly integrated, and they have to occur in the proper order (McAdams and Shapiro, 2011). Components of the core regulatory circuit are largely known. Full discovery of its essential genome, including non-coding, regulatory and coding elements, is a prerequisite for understanding the complete regulatory network of this bacterial cell. We have identified all the essential coding and non-coding elements of the Caulobacter chromosome using a hyper-saturated transposon mutagenesis strategy that is scalable and can be readily extended to obtain rapid and accurate identification of the essential genome elements of any sequenced bacterial species at a resolution of a few base pairs.
We engineered a Tn5-derivative transposon (Tn5Pxyl) that carries at one end an inducible outward-pointing Pxyl promoter (Christen et al, 2010). We showed that this transposon construct inserts randomly into the genome, where it can activate or disrupt transcription at the site of integration, depending on the insertion orientation. DNA from hundreds of thousands of transposon insertion sites reading outward into flanking genomic regions was amplified in parallel by PCR and sequenced by Illumina paired-end sequencing to locate the insertion site in each mutant strain (Figure 1). A single sequencing run on DNA from a mutagenized cell population yielded 118 million raw sequencing reads. Of these, >90 million (>80%) read outward from the transposon element into adjacent genomic DNA regions, and the insertion site could be mapped with single-nucleotide resolution. This yielded the location and orientation of 428 735 independent transposon insertions in the 4-Mbp Caulobacter genome.
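As a rough illustration of what "hyper-saturated" means here, the average spacing between insertions can be worked out from the two figures quoted above. This is a back-of-the-envelope sketch only: the genome length is rounded to 4 Mbp as in the text, and the variable names are illustrative rather than taken from the paper.

```python
# Figures quoted in the abstract; the genome size is the rounded "4-Mbp" value.
GENOME_LENGTH_BP = 4_000_000      # assumption: rounded genome size
INDEPENDENT_INSERTIONS = 428_735  # mapped independent transposon insertions

# Mean distance between neighbouring insertions if they were spread evenly.
mean_spacing_bp = GENOME_LENGTH_BP / INDEPENDENT_INSERTIONS

# Insertions per kilobase of genome.
insertions_per_kb = INDEPENDENT_INSERTIONS / (GENOME_LENGTH_BP / 1000)

print(f"about one insertion every {mean_spacing_bp:.1f} bp")
print(f"about {insertions_per_kb:.0f} insertions per kb")
```

On these assumptions the library averages roughly one insertion every 9 bp, which is in line with the 8-bp resolution mentioned for the coding-sequence analysis.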
Within non-coding sequences of the Caulobacter genome, we detected 130 non-disruptable DNA segments between 90 and 393 bp long in addition to all essential promoter elements. Among 27 previously identified and validated sRNAs (Landt et al, 2008), three were contained within non-disruptable DNA segments and another three were partially disruptable, that is, insertions caused a notable growth defect. Two additional small RNAs found to be essential are the transfer-messenger RNA (tmRNA) and the ribozyme RNase P (Landt et al, 2008). In addition to the 8 non-disruptable sRNAs, 29 out of the 130 intergenic essential non-coding sequences contained non-redundant tRNA genes; duplicated tRNA genes were non-essential. We also identified two non-disruptable DNA segments within the chromosomal origin of replication. Thus, we resolved essential non-coding RNAs, tRNAs and essential replication elements within the origin region of the chromosome. An additional 90 non-disruptable small genome elements of currently unknown function were identified. Eighteen of these are conserved in at least one closely related species. Only two could encode a protein of over 50 amino acids.
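The idea of a non-disruptable segment can be sketched as a search for long insertion-free gaps in the sorted list of mapped insertion sites. The function below is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the 1-based coordinates, the linear (non-circular) treatment of the chromosome and the toy data are all hypothetical. Only the 90-bp lower bound comes from the text.

```python
def find_insertion_free_segments(positions, genome_length, min_length):
    """Return (start, end) intervals, 1-based and inclusive, that contain
    no insertion and are at least min_length bp long.

    Assumption: the genome is treated as linear, so a gap spanning the
    coordinate origin of a circular chromosome is reported as two pieces.
    """
    # Sentinels at 0 and genome_length + 1 let the ends be handled uniformly.
    bounds = [0] + sorted(set(positions)) + [genome_length + 1]
    segments = []
    for left, right in zip(bounds, bounds[1:]):
        start, end = left + 1, right - 1
        if end - start + 1 >= min_length:
            segments.append((start, end))
    return segments


# Toy example (made-up positions, not data from the study).
toy_insertions = [5, 10, 120, 125, 300]
segments = find_insertion_free_segments(toy_insertions, genome_length=400, min_length=90)
print(segments)
```

In practice a real analysis would also need to account for local insertion density and sequencing coverage before calling a gap essential; this sketch only locates the candidate gaps.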
For each of the 3876 annotated open reading frames (ORFs), we analyzed the distribution, orientation, and genetic context of transposon insertions. There are 480 essential ORFs and 3240 non-essential ORFs. In addition, there are 156 ORFs that severely impacted fitness when mutated. The 8-bp resolution allowed a dissection of the essential and non-essential regions of the coding sequences. Sixty ORFs had transposon insertions within a significant portion of their 3′ region but lacked insertions in the essential 5′ coding region, allowing the identification of non-essential protein segments. For example, transposon insertions in the essential cell-cycle regulatory gene divL, which encodes a tyrosine kinase, showed that the last 204 C-terminal amino acids did not impact viability, confirming previous reports that the C-terminal ATPase domain of DivL is dispensable for viability (Reisinger et al, 2007; Iniesta et al, 2010). In addition, we found that 30 out of 480 (6.3%) of the essential ORFs appear to be shorter than the annotated ORF, suggesting that these are probably mis-annotated.
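The divL example suggests a simple way to estimate a dispensable C-terminal segment: take the insertion closest to the start codon and count the codons from there to the end of the ORF. The sketch below assumes this simplified rule; the function name, the 0-based strand-corrected offsets and the toy numbers are hypothetical and are not taken from the paper.

```python
def dispensable_tail_codons(orf_length_codons, insertion_offsets_bp):
    """Estimate how many C-terminal codons tolerate insertions.

    insertion_offsets_bp: 0-based offsets of insertions measured from the
    first base of the start codon, already corrected for strand.
    Returns 0 when the ORF carries no insertions at all.
    """
    if not insertion_offsets_bp:
        return 0
    # The 5'-most insertion marks the first codon that can be disrupted.
    first_disrupted_codon = min(insertion_offsets_bp) // 3
    return orf_length_codons - first_disrupted_codon


# Toy ORF of 100 codons with insertions only near its 3' end (made-up data).
tail = dispensable_tail_codons(100, [270, 240, 290])
print(tail)
```

A single stray insertion would dominate this estimate, so a real analysis would presumably require several independent insertions before calling a region dispensable.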
Among the 480 ORFs essential for growth on rich media, there were 10 essential transcriptional regulatory proteins, including 5 previously identified cell-cycle regulators (McAdams and Shapiro, 2003; Holtzendorff et al, 2004; Collier and Shapiro, 2007; Gora et al, 2010; Tan et al, 2010) and 5 uncharacterized predicted transcription factors. In addition, two RNA polymerase sigma factors, RpoH and RpoD, as well as the anti-sigma factor ChrR, which mitigates the rpoE-dependent stress response under physiological growth conditions (Lourenco and Gomes, 2009), were found to be essential. Thus, a set of 10 transcription factors, 2 RNA polymerase sigma factors and 1 anti-sigma factor are the core essential transcriptional regulators for growth on rich media. To further characterize the core components of the Caulobacter cell-cycle control network, we identified all essential regulatory sequences and operon transcripts. Altogether, the 480 essential protein-coding and 37 essential RNA-coding Caulobacter genes are organized into operons such that 402 individual promoter regions are sufficient to regulate their expression. Of these 402 essential promoters, the transcription start sites (TSSs) of 105 were previously identified (McGrath et al, 2007).
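The counts in this paragraph can be combined into a few summary figures. The short calculation below uses only numbers stated above; the derived ratios are simple arithmetic and are not reported in the abstract itself.

```python
# Counts quoted in the text.
essential_protein_genes = 480
essential_rna_genes = 37      # 8 sRNAs + 29 tRNA-containing segments
essential_promoters = 402
promoters_with_known_tss = 105

total_essential_genes = essential_protein_genes + essential_rna_genes

# Average number of essential genes served by one essential promoter.
genes_per_promoter = total_essential_genes / essential_promoters

# Share of essential promoters whose TSS had already been mapped.
known_tss_percent = 100 * promoters_with_known_tss / essential_promoters

print(total_essential_genes)
print(f"{genes_per_promoter:.2f} essential genes per promoter")
print(f"{known_tss_percent:.1f}% of essential promoters had a known TSS")
```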
The essential genome features are non-uniformly distributed on the Caulobacter genome and are enriched near the origin and terminus regions. In contrast, the published E. coli essential coding sequences (Rocha, 2004) are preferentially located on either side of the origin (Figure 4A). This indicates that there are selective pressures on the chromosomal positioning of some essential elements.
The strategy described in this report could be readily extended to quickly determine the essential genome for a large class of bacterial species.

Available from: Beat Christen,
    • "This illustrates the limitation of transposon essentiality studies using deep sequencing for fitness genes. Notably, we found that the insertions were not evenly distributed along the entire ORFs as previously observed in Caulobacter crescentus (Christen et al, 2011). In this respect, it is important to note that our mini-transposon has an internal promoter that could allow expression of downstream genes or domains if there is a start codon for translation. "
    ABSTRACT: Identifying all essential genomic components is critical for the assembly of minimal artificial life. In the genome-reduced bacterium Mycoplasma pneumoniae, we found that small ORFs (smORFs; < 100 residues), accounting for 10% of all ORFs, are the most frequently essential genomic components (53%), followed by conventional ORFs (49%). Essentiality of smORFs may be explained by their function as members of protein and/or DNA/RNA complexes. In larger proteins, essentiality applied to individual domains and not entire proteins, a notion we could confirm by expression of truncated domains. The fraction of essential non-coding RNAs (ncRNAs) non-overlapping with essential genes is 5% higher than of non-transcribed regions (0.9%), pointing to the important functions of the former. We found that the minimal essential genome is comprised of 33% (269,410 bp) of the M. pneumoniae genome. Our data highlight an unexpected hidden layer of smORFs with essential functions, as well as non-coding regions, thus changing the focus when aiming to define the minimal essential genome. © 2015 The Authors. Published under the terms of the CC BY 4.0 license.
    Molecular Systems Biology 01/2015; 11(1):780. DOI:10.15252/msb.20145558 · 10.87 Impact Factor
    • "Transposon insertion sequencing, a technique first reported in 2009 that allows genome-wide analysis of insertions impairing bacterial fitness, was however suitable for identifying sRNA and other non-coding regulatory sequences (reviewed in Van Opijnen and Camilli, 2013). High-density transposon libraries identified non-coding RNAs important for growth of Caulobacter crescentus (Christen et al., 2011) and the pathogens Mycobacterium tuberculosis and Streptococcus pneumoniae (Mann et al., 2012; Zhang et al., 2012). Analysis of S. pneumoniae libraries in in vivo infection models also revealed sRNAs required for virulence (Mann et al., 2012). "
    ABSTRACT: Intracellular bacterial pathogens have evolved distinct lifestyles inside eukaryotic cells. Some pathogens coexist with the infected cell in an obligate intracellular state, whereas others transit between the extracellular and intracellular environment. Adaptation to these intracellular lifestyles is regulated in both space and time. Non-coding small RNAs (sRNAs) are post-transcriptional regulatory molecules that fine-tune important processes in bacterial physiology including cell envelope architecture, intermediate metabolism, bacterial communication, biofilm formation, and virulence. Recent studies have shown production of defined sRNA species by intracellular bacteria located inside eukaryotic cells. The molecules targeted by these sRNAs and their expression dynamics along the intracellular infection cycle remain, however, poorly characterized. Technical difficulties linked to the isolation of "intact" intracellular bacteria from infected host cells might explain why sRNA regulation in these specialized pathogens is still a largely unexplored field. Transition from the extracellular to the intracellular lifestyle provides an ideal scenario in which regulatory sRNAs are intended to participate; so much work must be done in this direction. This review focuses on sRNAs expressed by intracellular bacterial pathogens during the infection of eukaryotic cells, strategies used with these pathogens to identify sRNAs required for virulence, and the experimental technical challenges associated to this type of studies. We also discuss varied techniques for their potential application to study RNA regulation in intracellular bacterial infections.
    Frontiers in Cellular and Infection Microbiology 11/2014; 4:162. DOI:10.3389/fcimb.2014.00162 · 3.72 Impact Factor
    • "Similarly, B. subvibrioides has a MipZ homologue and no apparent homologues to Min proteins, Noc or SlmA. MipZ was characterized as essential in C. crescentus along with the ParA and ParB proteins necessary for MipZ gradient formation (Thanbichler and Shapiro, 2006; Christen et al., 2011). It was later shown that mipZ can be deleted from C. crescentus, but deletion results in severely filamentous and slow-growing cells (Radhakrishnan et al., 2010). "
    ABSTRACT: The cell cycle of Caulobacter crescentus is controlled by a complex signaling network that coordinates events. Genome sequencing has revealed many C. crescentus cell cycle genes are conserved in other Alphaproteobacteria, but it is not clear to what extent their function is conserved. As many cell cycle regulatory genes are essential in C. crescentus, the essential genes of two Alphaproteobacteria, Agrobacterium tumefaciens (Rhizobiales) and Brevundimonas subvibrioides (Caulobacterales), were elucidated to identify changes in cell cycle protein function over different phylogenetic distances as demonstrated by changes in essentiality. The results show the majority of conserved essential genes are involved in critical cell cycle processes. Changes in component essentiality reflect major changes in lifestyle, such as divisome components in A. tumefaciens resulting from that organism's different growth pattern. Larger variability of essentiality was observed in cell cycle regulators, suggesting regulatory mechanisms are more customizable than the processes they regulate. Examples include variability in the essentiality of divJ and divK spatial cell cycle regulators, and non-essentiality of the highly conserved and usually essential DNA methyltransferase CcrM. These results show that while essential cell functions are conserved across varying genetic distance, much of a given organism's essential gene pool is specific to that organism.
    Molecular Microbiology 06/2014; 93(4). DOI:10.1111/mmi.12686 · 4.42 Impact Factor