A genome-wide survey of short coding sequences in

Unité de Biochimie Bactérienne, UR477, INRA, 78350 Jouy-en-Josas, France.
Microbiology (Impact Factor: 2.56). 12/2007; 153(Pt 11):3631-44. DOI: 10.1099/mic.0.2007/006205-0
Source: PubMed


Identification of short genes that encode peptides of fewer than 60 aa is challenging, both experimentally and in silico. As a consequence, the universe of these short coding sequences (CDSs) remains largely unknown, although some are acknowledged to play important roles in cell-cell communication, particularly in Gram-positive bacteria. This paper reports a thorough search for short CDSs across streptococcal genomes. Our bioinformatic approach relied on a combination of advanced intrinsic and extrinsic methods. In the first step, intrinsic sequence information (nucleotide composition and presence of RBSs) served to identify new short putative CDSs (spCDSs) and to eliminate the differences between annotation policies. In the second step, pseudogene fragments and false predictions were filtered out. The last step consisted of screening the remaining spCDSs for lines of extrinsic evidence involving sequence and gene-context comparisons. A total of 789 spCDSs across 20 complete genomes (19 Streptococcus and one Enterococcus) received the support of at least one line of extrinsic evidence, which corresponds to an average of 20 short CDSs per million base pairs. Most of these have no known function, and a significant fraction (31%) are not even annotated as hypothetical genes in GenBank records. As an illustration of the value of this list, we describe a new family of CDSs, encoding very short hydrophobic peptides (20-23 aa), situated just upstream of some of the positive transcriptional regulators of the Rgg family. The expression of seven other short CDSs from Streptococcus thermophilus CNRZ1066, encoding peptides ranging in length from 41 to 56 aa, was confirmed by real-time quantitative RT-PCR and revealed a variety of expression patterns. Finally, one peptide from this list, encoded by a gene that is not annotated in GenBank, was identified in a cell-envelope-enriched fraction of S. thermophilus CNRZ1066.
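
The intrinsic first step described in the abstract (finding short candidate CDSs from sequence features such as RBS presence) can be illustrated with a minimal Python sketch. This is not the authors' method, which the abstract does not specify; it is a naive stand-in that scans the forward strand for open reading frames encoding fewer than 60 aa and keeps those with a Shine-Dalgarno-like motif shortly upstream of the start codon. The function names, the start-codon set, the motif list, the upstream window and the 10 aa lower bound are all assumptions made for illustration, and nucleotide-composition scoring is omitted entirely.

```python
import re

# Assumptions for illustration only (not taken from the paper):
STARTS = {"ATG", "GTG", "TTG"}            # common bacterial start codons
STOPS = {"TAA", "TAG", "TGA"}
RBS_CORE = re.compile(r"GGAGG|AGGAG|GGAG|GAGG|AGGA")  # Shine-Dalgarno-like cores


def has_rbs(seq, start, near=5, far=20):
    """Return True if an SD-like motif lies between `far` and `near` nt upstream of `start`."""
    lo = max(0, start - far)
    hi = max(0, start - near)
    return bool(RBS_CORE.search(seq[lo:hi]))


def find_short_cds(seq, min_aa=10, max_aa=59):
    """Scan the forward strand for short ORFs preceded by an RBS-like motif.

    Returns (start, end, n_aa) tuples with 0-based start, exclusive end
    (stop codon included) and the peptide length in amino acids.
    Every in-frame start codon is reported, so nested candidates that
    share a stop codon can appear more than once.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] not in STARTS:
                continue
            # Walk codon by codon to the first in-frame stop.
            for j in range(i + 3, len(seq) - 2, 3):
                if seq[j:j + 3] in STOPS:
                    n_aa = (j - i) // 3  # codons before the stop
                    if min_aa <= n_aa <= max_aa and has_rbs(seq, i):
                        hits.append((i, j + 3, n_aa))
                    break
    return hits


if __name__ == "__main__":
    # Toy sequence: SD-like motif, spacer, then a 13-codon ORF.
    toy = "TTTTAGGAGGAATTAT" + "ATG" + "GCT" * 12 + "TAA"
    print(find_short_cds(toy))
```

A real pipeline would also scan the reverse complement, score coding potential, and then apply the filtering and comparative (extrinsic) steps that the abstract describes; none of that is attempted here.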

Available from: Véronique Monnet, May 04, 2015
    • "After the discovery that the deletion of a small pre-peptide gene inhibited the regulatory activity of an Rgg protein in S. thermophilus, it was recognized that activity of Rgg regulators was modulated by short peptides, constituting putative QS circuits (Ibrahim et al., 2007a). Commonly, Rgg genes are located next to a short open reading frame that encodes the propeptide of their cognate pheromone, short genes which are usually overlooked in genome annotation processes but have been predicted by in silico analysis (Ibrahim et al., 2007b; Fleuchot et al., 2011). Rgg pheromones have been classified in two groups to date, short hydrophobic peptides (SHPs) and peptides involved in competence pathways, termed XIPs (Table 1) (Mashburn-Warren et al., 2010; Fleuchot et al., 2011). "
    ABSTRACT: Quorum sensing (QS) is a widespread phenomenon in the microbial world that has important implications in the coordination of population-wide responses in several bacterial pathogens. In Group A Streptococcus (GAS), many questions surrounding QS systems remain to be solved pertaining to their function and their contribution to the GAS lifestyle in the host. The QS systems of GAS described to date can be categorized into four groups: regulator gene of glucosyltransferase (Rgg), Sil, lantibiotic systems, and LuxS/AI-2. The Rgg family of proteins, a conserved group of transcription factors that modify their activity in response to signaling peptides, has been shown to regulate genes involved in virulence, biofilm formation and competence. The sil locus, whose expression is regulated by the activity of signaling peptides and a putative two-component system (TCS), has been implicated in regulating genes involved in invasive disease in GAS isolates. Lantibiotic regulatory systems are involved in the production of bacteriocins and their autoregulation, and some of these genes have been shown to target both bacterial organisms and processes of survival inside the infected host. Finally, AI-2 (dihydroxy pentanedione, DPD), synthesized by the LuxS enzyme in several bacteria including GAS, has been proposed to be a universal bacterial communication molecule. In this review, we discuss the mechanisms of these four systems and the putative functions of their targets, and pose critical questions for future studies.
    Frontiers in Cellular and Infection Microbiology 09/2014; 4. DOI:10.3389/fcimb.2014.00127 · 3.72 Impact Factor
    • "Most of them characterized up to now are encoded by short open reading frames (ORFs), which have been only heterogeneously and incompletely annotated in genome sequences. In a survey of 20 streptococcal genomes, purpose-built detection software predicted the putative occurrence of 20 short ORFs per million bp; of these, nearly one-third lacked any form of annotation (Ibrahim et al., 2007a)."
    ABSTRACT: Within Gram-positive bacteria, the expression of target genes is controlled at the population level via signaling peptides, also known as pheromones. Pheromones control a wide range of functions, including competence, virulence, and others that remain unknown. Until now, their role in bacterial gene regulation has probably been underestimated; indeed, bacteria are able to produce, by ribosomal synthesis or surface protein degradation, an extraordinary variety of peptides that are released outside the cell, some of which are pheromones that mediate cell-to-cell communication. The review aims at giving an updated overview of these peptide-dependent communication pathways. More specifically, it follows the whole peptide circuit, from peptide production and secretion into the extracellular medium to its interaction with sensors at the bacterial surface or its re-import into the bacteria, where it plays its regulatory role. In recent years, as we have accumulated more knowledge about these systems, it has become apparent that they are more complex than they first appeared. For this reason, more research on peptide-dependent pathways is needed to develop new strategies for controlling functions of interest in Gram-positive bacteria. In particular, such research could lead to alternatives to the use of antibiotics against pathogenic bacteria. In perspective, the review identifies new research questions that emerge in this field and that have to be addressed.
    Critical Reviews in Microbiology 09/2014; 8:1-13. DOI:10.3109/1040841X.2014.948804 · 6.02 Impact Factor
    • ": 1) Known RNA genes identified individually by 5′ end mapping and Northern blot analysis, reported in the literature. 2) New RNA genes, so called independent transcription units identified through the large scale tiling array analysis described in [31]; 3) New CDSs, unannotated expressed regions described in [31] and containing a coding sequence detected with high confidence by SHOW, based on an enriched HMM model of DNA sequences [45]; 4) 5′ cis-acting structures compiled from [30], and reliable predictions against the RFAM database [46], reported in the Genome Reviews version of annotated genomes from the EBI [47]; 5) Newly expressed antisense segments, complementary to expressed regions corresponding to 5′ UTRs, 3′ UTRs, and intergenic regions of annotated features; 6) Newly expressed segments not antisense, classified as in the previous item, but not antisense, see [31] for criteria distinguishing antisense regions from those that are not. We refer to all RNAs described here using the same «segment» (S) number and classification as in [31] which corresponds to their order of appearance on the B. subtilis genome. "
    ABSTRACT: RNase Y is a key endoribonuclease affecting global mRNA stability in Bacillus subtilis. Its characterization provided the first evidence that endonucleolytic cleavage plays a major role in the mRNA metabolism of this organism. RNase Y shares important functional features with the RNA decay initiating RNase E from Escherichia coli, notably a similar cleavage specificity and a preference for 5' monophosphorylated substrates. We used high-resolution tiling arrays to analyze the effect of RNase Y depletion on RNA abundance covering the entire genome. The data confirm that this endoribonuclease plays a key role in initiating the decay of a large number of mRNAs as well as non-coding RNAs. The downstream cleavage products are likely to be degraded by the 5' exonucleolytic activity of RNases J1/J2, as we show for a specific case. Comparison of the data with that of two other recent studies revealed very significant differences. About two thirds of the mRNAs upregulated following RNase Y depletion were different when compared to either one of these studies, and only about 10% were in common in all three studies. This highlights that experimental conditions and data analysis play an important role in identifying RNase Y substrates by global transcriptional profiling. Our data confirmed already known RNase Y substrates and, owing to the precision and reproducibility of the profiles, allow an exceptionally detailed view of the turnover of hundreds of new RNA substrates.
    PLoS ONE 01/2013; 8(1):e54062. DOI:10.1371/journal.pone.0054062 · 3.23 Impact Factor