Properties of overlapping genes are conserved across microbial genomes

Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
Genome Research (Impact Factor: 13.85). 12/2004; 14(11):2268-72. DOI: 10.1101/gr.2433104
Source: PubMed

ABSTRACT There are numerous examples from the genomes of viruses, mitochondria, and chromosomes that adjacent genes can overlap, sharing at least one nucleotide. Overlaps have been hypothesized to be involved in genome size minimization and as a regulatory mechanism of gene expression. Here we show that overlapping genes are a consistent feature (approximately one-third of all genes) across all microbial genomes sequenced to date, have homologs in more microbes than do non-overlapping genes, and are therefore likely more conserved. In addition, the size, phase (reading frame offset), and distribution, among other characteristics, of overlapping genes are most consistent with the hypothesis that overlaps function in the regulation of gene expression. The upstream sequences and conservation of overlapping orthologs of two model organisms from the genus Prochlorococcus that have significantly different GC-content, and therefore different nucleotide sequences for orthologs, are also consistent with small overlapping sequence regions and programmed shifts in reading frame as a common mechanism in the regulation of microbial gene expression.
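As a rough illustration of the quantities the abstract refers to (overlap size and reading frame offset), the following minimal Python sketch finds overlaps between neighbouring genes from their coordinates. It is not the authors' method. The `Gene` record, the function names, the example coordinates, and the convention of taking the frame offset as the difference in start positions modulo 3 (reported for same-strand pairs only) are all assumptions made for this example; the paper may define phase differently.

```python
from typing import List, NamedTuple, Optional, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # '+' or '-'


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of nucleotides shared by two genes (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def adjacent_overlaps(
    genes: List[Gene],
) -> List[Tuple[str, str, int, bool, Optional[int]]]:
    """Report overlaps between genes that are neighbours in coordinate order.

    Each result is (left name, right name, overlap size, same strand?, offset).
    The offset is an illustrative convention (difference in start coordinates
    modulo 3) and is only reported for same-strand pairs.
    Only consecutive genes are compared, so an overlap between
    non-consecutive genes (e.g. a long gene spanning several others)
    would be missed.
    """
    ordered = sorted(genes, key=lambda g: (g.start, g.end))
    results = []
    for left, right in zip(ordered, ordered[1:]):
        size = overlap_length(left, right)
        if size == 0:
            continue
        same_strand = left.strand == right.strand
        offset = (right.start - left.start) % 3 if same_strand else None
        results.append((left.name, right.name, size, same_strand, offset))
    return results


# Made-up coordinates: A and B share 4 nucleotides; C overlaps nothing.
example = [
    Gene("A", 1, 99, "+"),
    Gene("B", 96, 395, "+"),
    Gene("C", 400, 699, "-"),
]
overlaps = adjacent_overlaps(example)
```

In this made-up example, genes A and B share positions 96 to 99, the kind of short same-strand overlap the abstract describes as consistent with a regulatory role.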

  • Source
    • "This arrangement is thought to be responsible for maintaining the ∼1:35 ratio between the two proteins, a ratio likely maintained to prevent undesired cross-talk between the numerous two-component systems (Siryaporn and Goulian, 2008). Accordingly, overlapping genes are primarily found in operons and their patterns are strongly conserved between phylogenetically distant bacteria (Johnson and Chisholm, 2004). More strikingly, operons can also have internal regulatory elements (e.g. "
    ABSTRACT: The proper functioning of bacteria is encoded in their genome at multiple levels or scales, each of which is constrained by specific physical forces. At the smallest spatial scales, interatomic forces dictate the folding and function of proteins and nucleic acids. On longer length scales, stochastic forces emerging from the thermal jiggling of proteins and RNAs impose strong constraints on the organization of genes along chromosomes, more particularly in the context of the building of nucleoprotein complexes and the operational mode of regulatory agents. At the cellular level, transcription, replication and cell division activities generate forces that act on both the internal structure and cellular location of chromosomes. The overall result is a complex multi-scale organization of genomes that reflects the evolutionary tinkering of bacteria. The goal of this review is to highlight avenues for deciphering this complexity by focusing on patterns that are conserved among evolutionarily distant bacteria. To this end, I discuss three different organizational scales: the protein structures, the chromosomal organization of genes and the global structure of chromosomes.
    Computational Biology and Chemistry 08/2014; 53. DOI:10.1016/j.compbiolchem.2014.08.017 · 1.60 Impact Factor
  • Source
    • "Overlapping genes are pairs of adjacent genes with a certain extent of overlap in genomic locations. They are known to be common in viruses, mitochondria, and bacteria, and may compose a compact genome organization to facilitate gene regulation efficiency [1], like operons. Overlapping genes have recently been found in the human genome as well [2] [3] [4] [5] [6]. "
    ABSTRACT: Overlapping genes are pairs of adjacent genes whose genomic regions partially overlap. They are notable by their potential intricate regulation, such as cis-regulation of nested gene-promoter configurations, and post-transcriptional regulation of natural antisense transcripts. The originations and consequent detailed regulation remain obscure. Herein, we propose a unified framework comprising biological classification rules followed by extensive analyses, namely, exon-sharing analysis, a human-mouse conservation study, and transcriptome analysis of hundreds of microarrays and transcriptome sequencing data (mRNA-Seq). We demonstrate that the tail-to-tail architecture would result from sharing functional elements in 3'-untranslated regions (3'-UTRs) of pre-existing genes. Dissimilarly, we illustrate that the other gene overlaps would originate from a new gene arising in a pre-existing gene locus. Interestingly, these types of coupled overlapping genes may influence each other synergistically or competitively during transcription, depending on the promoter configurations. This framework discloses distinctive characteristics of overlapping genes to be a foundation for a further comprehensive understanding of them.
    Genomics 07/2012; 100(4):231-9. DOI:10.1016/j.ygeno.2012.06.011 · 2.79 Impact Factor
  • Source
    • "In support of this model, Rogozin et al. (2002) found that among opposite-strand overlaps in bacteria, the most evolvable overlap phase (phase 1) was the most abundant. However, this model failed to explain the phase-distribution of same-strand overlaps in bacteria (Johnson and Chisholm 2004; Cock and Whitworth 2007). Cock and Whitworth (2007) attributed the unexpected phase-distribution to either gene location or to an unspecified selective advantage. "
    ABSTRACT: INTRODUCTION The phenomenon of viral proteins encoded by overlapping reading frames has attracted the attention of evolutionary biologists since its discovery. Overlaps have been hypothesized to be involved in genome size minimization and as a regulatory mechanism of gene expression. One question of evolutionary interest raised by this phenomenon is how natural selection can act simultaneously on two different reading frames encoded by the same DNA sequence. OBJECTIVE 1) To study the rate and pattern of evolution of HIV-1 within and among individuals through long-term follow-up. 2) To study the effect of evolution on viral proteins encoded by overlapping reading frames. METHODOLOGY Near-full-length genomes were amplified from HIV-positive PBMCs or plasma and cloned into the pCR2.1 vector. Positive clones were sequenced by the primer-walking method. All four overlapping regions (p6gag/pro, tat1/rev1, tat2/rev2/gp41, nef/LTR) were aligned with Clustal W, and phylogenetic distances were calculated with a maximum likelihood model using the Kimura 2-parameter method. RESULTS & CONCLUSION Among the four overlapping regions, p6/pro was found to be the most variable; the motif it encodes is responsible for Vpr binding and for virus budding through interaction with two host proteins, Tsg101 and ESCRT-I (AIP1). Functional study of these variations will reveal the effect of this kind of variation on the viral life cycle.
    ICGEB-IUBMB Workshop on “Human RNA Viruses”; 01/2010

