Purifying and directional selection in overlapping prokaryotic genes

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Trends in Genetics (Impact Factor: 9.92). 06/2002; 18(5):228-32. DOI: 10.1016/S0168-9525(02)02649-5
Source: PubMed


In overlapping genes, the same DNA sequence codes for two proteins using different reading frames. Analysis of overlapping genes can help in understanding the mode of evolution of a coding region from noncoding DNA. We identified 71 pairs of convergent genes, with overlapping 3' ends longer than 15 nucleotides, that are conserved in at least two prokaryotic genomes. Among the overlap regions, we observed a statistically significant bias towards the 123:132 phase (i.e. the second codon base in one gene facing the degenerate third position in the second gene). This phase ensures the least mutual constraint on nonconservative amino acid replacements in both overlapping coding sequences. The excess of this phase is compatible with directional (positive) selection acting on the overlapping coding regions. This could be a general evolutionary mode for genes emerging from noncoding sequences, in which the protein sequence has not been subject to selection.
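The phase notation used in the abstract can be made concrete with a short sketch. It assumes 1-based inclusive genome coordinates, a plus-strand gene whose first codon base is at its lowest coordinate, and a minus-strand gene whose first codon base is at its highest coordinate. The function names and the example coordinates are hypothetical illustrations, not taken from the paper. For a convergent pair, the codon position (1, 2 or 3) of each overlap base is computed in both genes, and the result is written as `123:xyz`, where `xyz` lists the codon positions of the second gene that face positions 1, 2 and 3 of the first.

```python
# Illustrative sketch only: names and coordinates are assumptions, not the authors' code.

def codon_position_plus(x, start):
    """Codon position (1-3) of genome coordinate x in a plus-strand gene
    whose first codon base is at `start`."""
    return (x - start) % 3 + 1


def codon_position_minus(x, end):
    """Codon position (1-3) of genome coordinate x in a minus-strand gene
    whose first codon base is at `end` (its highest coordinate)."""
    return (end - x) % 3 + 1


def overlap_phase(plus_start, plus_end, minus_start, minus_end):
    """Return the phase label ('123:132', '123:213' or '123:321') for a
    convergent pair of genes with overlapping 3' ends."""
    lo = max(plus_start, minus_start)
    hi = min(plus_end, minus_end)
    if hi - lo + 1 < 3:
        raise ValueError("overlap must span at least one full codon")
    facing = {}
    for x in range(lo, hi + 1):
        # Record which codon position of the minus-strand gene faces
        # each codon position of the plus-strand gene.
        facing[codon_position_plus(x, plus_start)] = codon_position_minus(x, minus_end)
    return "123:" + "".join(str(facing[p]) for p in (1, 2, 3))


if __name__ == "__main__":
    # Hypothetical gene pair: plus-strand gene at 1..30, minus-strand gene at 14..46.
    print(overlap_phase(1, 30, 14, 46))
```

In the `123:132` label, position 2 of one gene faces position 3 of the other, which is the arrangement described in the abstract. Under these coordinate assumptions the phase depends only on the distance between the two start codons modulo 3, so exactly three phases are possible for antiparallel overlaps.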

    • "When two genes overlap, the same portion of the DNA codes for the constituent amino acids of the two, typically different, proteins involved in the overlap. These overlapping structures can be observed in viruses [1], prokaryotes [2], and also eukaryotes [3]. Several previous studies have already made attempts at characterizing overlapping genes (OGs) in bacteria."
    ABSTRACT: Background: The forces underlying genome architecture and organization are still only poorly understood in detail. Overlapping genes (genes partially or entirely overlapping) represent a genomic feature that is shared widely across biological organisms ranging from viruses to multi-cellular organisms. In bacteria, a third of the annotated genes are involved in an overlap. Despite the widespread nature of this arrangement, its evolutionary origins and biological ramifications have so far eluded explanation. Results: Here we present a comparative approach using information from 699 bacterial genomes that sheds light on the evolutionary dynamics of overlapping genes. We show that these structures exhibit high levels of plasticity. Conclusions: We propose a simple model allowing us to explain the observed properties of overlapping genes based on the importance of initiation and termination of transcriptional and translational processes. We believe that taking into account the processes leading to the expression of protein-coding genes holds the key to understanding overlapping gene structures.
    BMC Genomics 08/2014; 15(1):721. DOI:10.1186/1471-2164-15-721 · 3.99 Impact Factor
    • "NATs are transcribed from the strand opposite to the template DNA strand and may hybridize with the sense transcripts of the same genomic loci (cis-NATs) or with the complementary transcripts of separate genomic loci (trans-NATs). This has been implicated in regulating gene expression in both prokaryotes and eukaryotes through diverse postulated mechanisms (Faghihi and Wahlestedt, 2009; Werner and Berdal, 2005; Lavorgna et al., 2004; Rogozin et al., 2002; Wagner and Simons, 1994). Gene expression regulation by NATs in different organisms includes genomic imprinting, transcriptional collision, X chromosome inactivation, alternative splicing and termination, RNA interference, translational regulation, and RNA editing (Faghihi and Wahlestedt, 2009; Lavorgna et al., 2004)."
    ABSTRACT: Mechanisms regulating gene expression in malaria parasites are not well understood. Little is known about how the parasite regulates its gene expression during transition from one developmental stage to another and in response to various environmental conditions. Parasites in a diseased host face environments that differ from the static, well-adapted in vitro conditions. Parasites thus need to adapt quickly and effectively to these conditions by establishing the transcriptional states best suited for survival. With the discovery of natural antisense transcripts (NATs) in this parasite, and considering the various proposed mechanisms by which NATs might regulate gene expression, it has been speculated that these might play a critical role in gene regulation. We report here the diversity of NATs in this parasite, using isolates taken directly from patients with differing clinical symptoms caused by malaria infection. Using a custom-designed, strand-specific whole-genome microarray, a total of 797 NATs targeted against annotated loci have been detected. Of these, 545 NATs are unique to this study. The majority of NATs were positively correlated with the expression pattern of the sense transcript. However, 96 genes showed a change in sense/antisense ratio when uncomplicated and complicated disease conditions were compared. The antisense transcripts map to a broad range of biochemical/metabolic pathways, especially pathways pertaining to central carbon metabolism and stress-related pathways. Our data strongly suggest that a large group of the NATs detected here are unannotated transcription units antisense to annotated gene models. The results reveal a previously unknown set of NATs that prevails in this parasite, their differential regulation in disease conditions, and their mapping to functionally well-annotated genes. The results detailed here call for studies to deduce the possible mechanism of action of NATs, which would further help in understanding the in vivo pathological adaptations of these parasites.
    Experimental Parasitology 03/2014; 141(1). DOI:10.1016/j.exppara.2014.03.008 · 1.64 Impact Factor
    • "For example, purifying selection on synonymous sites was found in 9.4% of all yeast genes and 5.1% of all worm genes using a likelihood-based model that distinguishes between synonymous substitutions between preferred and unpreferred codons [12]. Purifying selection on synonymous sites was suggested to result from considerations such as mRNA stability [13-17], splicing regulatory elements [e.g., 18], cis regulatory elements, and overlapping genes [11,19,20]. Furthermore, in bacteria and yeast, codon bias in highly expressed genes is the most documented type of selection on synonymous sites [e.g., 21]."
    ABSTRACT: Synonymous or silent mutations are usually thought to evolve neutrally. However, accumulating recent evidence has demonstrated that silent mutations may destabilize RNA structures or disrupt cis regulatory motifs superimposed on coding sequences. Such observations suggest the existence of stretches of codon sites that are evolutionarily conserved at both DNA-RNA and protein levels. Such stretches may point to functionally important regions within protein-coding sequences that do not necessarily reflect functional constraints on the amino-acid sequence. The HIV-1 genome is highly compact, and often harbors overlapping functional elements at the protein, RNA, and DNA levels. This superimposition of functions leads to complex selective forces acting on all levels of the genome and proteome. Considering the constraints on HIV-1 to maintain such a highly compact genome, we hypothesized that stretches of synonymous conservation would be common within its genome. We used a combined computational-experimental approach to detect and characterize regions exhibiting strong purifying selection against synonymous substitutions along the HIV-1 genome. Our methodology is based on advanced probabilistic evolutionary models that explicitly account for synonymous rate variation among sites and rate dependencies among adjacent sites. These models are combined with a randomization procedure to automatically identify the most statistically significant regions of conserved synonymous sites along the genome. Using this procedure we identified 21 conserved regions. Twelve of these are mapped to regions within overlapping genes, seven correlate with known functional elements, while the functions of the remaining four are yet unknown. Among these four regions, we chose the one that deviates most from synonymous rate homogeneity for in-depth computational and experimental characterization. In our assays aiming to quantify viral fitness in both early and late stages of the replication cycle, no differences were observed between the mutated and the wild-type virus following the introduction of synonymous mutations. The contradiction between the inferred purifying selective forces and the lack of effect of these mutations on viral replication may be explained by the fact that the phenotype was measured in single-cycle infection assays in cell culture. Such a system does not account for the complexity of HIV-1 infections in vivo, which involve multiple infection cycles and interaction with the host immune system.
    BMC Evolutionary Biology 08/2013; 13(1):164. DOI:10.1186/1471-2148-13-164 · 3.37 Impact Factor