Overlapping genes in the human and mouse genomes

Article in BMC Genomics 9(1):169 · February 2008
DOI: 10.1186/1471-2164-9-169 · Source: PubMed
Abstract
Increasing evidence suggests that overlapping genes are much more common in eukaryotic genomes than previously thought. In this study we identified and characterized the overlapping genes in a set of 13,484 pairs of human-mouse orthologous genes. About 10% of the genes under study are overlapping genes, the majority of which are different-strand overlaps. The majority of the same-strand overlaps are embedded forms, whereas most different-strand overlaps are not embedded and in the convergent transcription orientation. Most of the same-strand overlapping gene pairs show at least a tenfold difference in length, much larger than the length difference between non-overlapping neighboring gene pairs. The length difference between the two different-strand overlapping genes is less dramatic. Over 27% of the different-strand-overlap relationships are shared between human and mouse, compared to only approximately 8% conservation for same-strand-overlap relationships. More than 96% of the same-strand and different-strand overlaps that are not shared between human and mouse have both genes located on the same chromosomes in the species that does not show the overlap. We examined the causes of transition between the overlapping and non-overlapping states in the two species and found that 3' UTR change plays an important role in the transition. Our study contributes to the understanding of the evolutionary transition between overlapping genes and non-overlapping genes and demonstrates the high rates of evolutionary changes in the un-translated regions.
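The categories used in the abstract (same-strand versus different-strand, embedded versus not embedded, convergent versus divergent) can be made concrete with a small sketch. This is an illustration under stated assumptions, not the authors' pipeline: the `Gene` record, the function names, and the 1-based inclusive coordinates are all hypothetical choices made for this example, and the "partial" label for non-embedded same-strand overlaps is likewise an assumed name.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based, inclusive, start <= end."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def classify_overlap(a: Gene, b: Gene) -> Optional[Tuple[str, str]]:
    """Return (strand class, arrangement) for two genes, or None if they do not overlap."""
    if a.chrom != b.chrom or a.end < b.start or b.end < a.start:
        return None

    strand_class = "same-strand" if a.strand == b.strand else "different-strand"

    # Embedded: one gene lies entirely within the boundaries of the other.
    a_contains_b = a.start <= b.start and b.end <= a.end
    b_contains_a = b.start <= a.start and a.end <= b.end
    if a_contains_b or b_contains_a:
        return strand_class, "embedded"

    if strand_class == "same-strand":
        return strand_class, "partial"

    # Different strands, partial overlap: look at the gene that starts first.
    # If it is on "+", its 3' end meets the 3' end of the "-" gene (convergent);
    # if it is on "-", the two 5' ends overlap instead (divergent).
    left = a if a.start < b.start else b
    arrangement = "convergent" if left.strand == "+" else "divergent"
    return strand_class, arrangement


def length_ratio(a: Gene, b: Gene) -> float:
    """Length of the longer gene divided by the length of the shorter one."""
    longer, shorter = max(a.length, b.length), min(a.length, b.length)
    return longer / shorter


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    host = Gene("hostGene", "chr1", 100, 5000, "+")
    nested = Gene("nestedGene", "chr1", 200, 400, "+")
    tail_a = Gene("geneA", "chr1", 100, 1000, "+")
    tail_b = Gene("geneB", "chr1", 900, 2000, "-")

    print(classify_overlap(host, nested))
    print(classify_overlap(tail_a, tail_b))
    print(length_ratio(host, nested) >= 10)
```

The length ratio helper mirrors the abstract's observation that same-strand overlapping pairs often differ at least tenfold in length; the threshold itself is applied by the caller, so other cutoffs can be tried without changing the function.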


Available from: PubMed Central · License: CC BY
    • "Such an organization of the genome in viruses, bacteria and mitochondria is understandable and provides compactness, as well as increasing the efficiency of gene regulation [92, 93]. Overlapping genes are also found in the human genome, and up to 10% of all genes are such overlapping genes [88, 89]. The purpose of overlapping genes is unclear, but it may be assumed that the overlapping genes in a pair mutually influence each other, at least at the level of transcription [32]. "
    ABSTRACT: Although a relatively small part of the human genome contains protein encoding genes, the latest data on the discovery of alternative open reading frames (ORFs) in conventional mRNAs has highlighted the expanded coding potential of these genes. Until recently, it was believed that each mRNA transcript encodes a single protein. Recent proteogenomics data indicate the existence of exceptions to this rule, which greatly changes the usual meaning of the term “gene.” The topology of a gene with overlapping ORFs resembles a Russian “matreshka” toy. There are two levels of “matreshka” genetic systems. First, the chromosomal level, when the “nested” gene is located within introns and exons of the main chromosomal gene, both in the sense and antisense orientation relative to the external gene. The second level is a mature mRNA molecule containing overlapping ORFs or an ORF with an alternative start codon. In this review, we will focus on the properties of “matreshka” genes of the second type and methods for their detection and verification. Particular attention is paid to the biological properties of the polypeptides encoded by these genes.
    Full-text · Article · Feb 2016
    • "A strand-specific RNA-sequencing protocol was used to distinguish sense and antisense transcripts. Sequencing RNA strand-specifically is important considering that genes can be encoded on different strands of the DNA and a considerable part of these genes is known to overlap [20,21]. Strand-specific information, therefore, will improve the accuracy of the gene expression analysis. "
    ABSTRACT: Cellular processes underlying memory formation are evolutionarily conserved, but natural variation in memory dynamics between animal species or populations is common. The genetic basis of this fascinating phenomenon is poorly understood. Closely related species of Nasonia parasitic wasps differ in long-term memory (LTM) formation: N. vitripennis will form transcription-dependent LTM after a single conditioning trial, whereas the closely related species N. giraulti will not. Genes that were differentially expressed (DE) after conditioning in N. vitripennis, but not in N. giraulti, were identified as candidate genes that may regulate LTM formation. RNA was collected from heads of both species before and immediately, 4 or 24 hours after conditioning, with 3 replicates per time point. It was sequenced strand-specifically, which allows distinguishing sense from antisense transcripts and improves the quality of expression analyses. We determined conditioning-induced DE compared to naïve controls for both species. These expression patterns were then analysed with GO enrichment analyses for each species and time point, which demonstrated an enrichment of signalling-related genes immediately after conditioning in N. vitripennis only. Analyses of known LTM genes and genes with an opposing expression pattern between the two species revealed additional candidate genes for the difference in LTM formation. These include genes from various signalling cascades, including several members of the Ras and PI3 kinase signalling pathways, and glutamate receptors. Interestingly, several other known LTM genes were exclusively differentially expressed in N. giraulti, which may indicate an LTM-inhibitory mechanism. Among the DE transcripts were also antisense transcripts. Furthermore, antisense transcripts aligning to a number of known memory genes were detected, which may have a role in regulating these genes.
This study is the first to describe and compare expression patterns of both protein-coding and antisense transcripts, at different time points after conditioning, of two closely related animal species that differ in LTM formation. Several candidate genes that may regulate differences in LTM have been identified. This transcriptome analysis is a valuable resource for future in-depth studies to elucidate the role of candidate genes and antisense transcription in natural variation in LTM formation.
    Full-text · Article · Dec 2015
    • "A study in human cells correlated the expression of the RevErb messenger to the regulation of erbAα2 mRNA splicing (Hastings et al., 1997; Salato et al., 2010) via an mRNA-mRNA interaction. In this case, at least 600 additional overlapping coding genes have been identified (Sanna et al., 2008). We wished to address the question of the fate of 3'-overlapping messengers in the model organism Saccharomyces cerevisiae, where hundreds of 3'-overlapping mRNA result from convergent gene transcription and can theoretically form mRNA duplexes (Wilkening et al., 2013). "
    ABSTRACT: Transcriptome analyses have revealed that convergent gene transcription can produce many 3'-overlapping mRNAs in diverse organisms. Few studies have examined the fate of 3'-complementary mRNAs in double-stranded RNA-dependent nuclear phenomena, and nothing is known about the cytoplasmic destiny of 3'-overlapping messengers or their impact on gene expression. Here, we demonstrate that the complementary tails of 3'-overlapping mRNAs can interact in the cytoplasm and promote post-transcriptional regulatory events including no-go decay (NGD) in Saccharomyces cerevisiae. Genome-wide experiments confirm that these messenger-interacting mRNAs (mimRNAs) form RNA duplexes in wild-type cells and thus have potential roles in modulating the mRNA levels of their convergent gene pattern under different growth conditions. We show that the post-transcriptional fate of hundreds of mimRNAs is controlled by Xrn1, revealing the extent to which this conserved 5'-3' cytoplasmic exoribonuclease plays an unexpected but key role in the post-transcriptional control of convergent gene expression.
    Full-text · Article · Sep 2015