Article

Expression of conjoined genes: another mechanism for gene regulation in eukaryotes.

MetaSystems Research Team, Computational Systems Biology Research Group, Advanced Computational Sciences Department, RIKEN Advanced Science Institute, Yokohama, Japan.
PLoS ONE (impact factor: 4.09). 01/2010; 5(10):e13284. DOI:10.1371/journal.pone.0013284
Source: PubMed

ABSTRACT The ENCODE project has shown that almost every base of the human genome is transcribed. One class of the resulting transcripts arises from the conjoined gene, which is formed by combining the exons of two or more distinct (parent) genes lying on the same strand of a chromosome. Only a very limited number of such genes are known, and the definitions and terminology used for them vary widely across the public databases. In this work, we have computationally identified and manually curated 751 conjoined genes (CGs) in the human genome that are supported by at least one mRNA or EST sequence available in the NCBI database. 353 representative CGs were subjected to experimental validation using RT-PCR and sequencing, and 291 (82%) of them could be confirmed. We speculate that these genes arise from novel functional requirements and are not merely artifacts of transcription, since more than 70% of them are conserved in other vertebrate genomes. The unique splicing patterns exhibited by CGs point to possible roles in protein evolution or gene regulation. Novel CGs, for which no transcript is available, could be identified in 80% of randomly selected potential CG-forming regions, indicating that their formation is a routine process. Formation of CGs is not limited to humans, as we have also identified 270 CGs in mouse and 227 in Drosophila using our approach. Additionally, we propose a novel mechanism for the formation of CGs. Finally, we developed a database, ConjoinG, which contains detailed information about all the CGs (800 in total) identified in the human genome. In summary, our findings offer new insights into the functionality of CGs as another possible mechanism for gene regulation and genomic evolution, and into the mechanism leading to their formation.
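
The definition in the abstract (a transcript that combines exons of two or more parent genes on the same strand of a chromosome) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Gene` class, the function names, the overlap criterion, and the toy coordinates are all hypothetical, chosen only to make the definition concrete.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive; an assumed convention


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene annotation record."""
    name: str
    chrom: str
    strand: str  # "+" or "-"
    exons: Tuple[Exon, ...]


def overlaps(a: Exon, b: Exon) -> bool:
    """True if two closed intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def parent_genes(chrom: str, strand: str, transcript_exons: Sequence[Exon],
                 genes: Sequence[Gene]) -> List[str]:
    """Names of annotated genes on the same chromosome and strand
    whose exons overlap at least one exon of the transcript."""
    parents = []
    for gene in genes:
        if gene.chrom != chrom or gene.strand != strand:
            continue  # parent genes must lie on the same strand
        if any(overlaps(te, ge) for te in transcript_exons for ge in gene.exons):
            parents.append(gene.name)
    return parents


def is_conjoined(chrom: str, strand: str, transcript_exons: Sequence[Exon],
                 genes: Sequence[Gene]) -> bool:
    """A transcript is a conjoined-gene candidate if it shares exons
    with two or more distinct parent genes."""
    return len(parent_genes(chrom, strand, transcript_exons, genes)) >= 2


# Toy annotation (made-up coordinates, not real loci).
GENES = [
    Gene("GENE_A", "chr1", "+", ((100, 200), (300, 400))),
    Gene("GENE_B", "chr1", "+", ((1000, 1100), (1200, 1300))),
    Gene("GENE_C", "chr1", "-", ((500, 600),)),
]

# Transcript joining exons of GENE_A and GENE_B on the "+" strand.
joined = [(100, 200), (300, 400), (1200, 1300)]
# Transcript touching GENE_A plus a region annotated only on the "-" strand.
single = [(100, 200), (500, 600)]

print(parent_genes("chr1", "+", joined, GENES))
print(is_conjoined("chr1", "+", single, GENES))
```

In practice the exon coordinates of a transcript would come from aligning mRNA or EST sequences to the genome; the sketch starts from coordinates that are already known and only applies the same-strand, multiple-parent criterion.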

  • Article: Tandem chimerism as a means to increase protein complexity in the human genome.
    ABSTRACT: The "one-gene, one-protein" rule, coined by Beadle and Tatum, has been fundamental to molecular biology. The rule implies that the genetic complexity of an organism depends essentially on its gene number. The discovery, however, that alternative gene splicing and transcription are widespread phenomena dramatically altered our understanding of the genetic complexity of higher eukaryotic organisms; in these, a limited number of genes may potentially encode a much larger number of proteins. Here we investigate yet another phenomenon that may contribute to generate additional protein diversity. Indeed, by relying on both computational and experimental analysis, we estimate that at least 4%-5% of the tandem gene pairs in the human genome can be eventually transcribed into a single RNA sequence encoding a putative chimeric protein. While the functional significance of most of these chimeric transcripts remains to be determined, we provide strong evidence that this phenomenon does not correspond to mere technical artifacts and that it is a common mechanism with the potential of generating hundreds of additional proteins in the human genome.
    Genome Research 02/2006; 16(1):37-44. · 13.61 Impact Factor
  • Article: ChimerDB--a knowledgebase for fusion sequences.
    ABSTRACT: Chromosome translocation and gene fusion are frequent events in the human genome and are often the cause of many types of tumor. ChimerDB is a database of fusion sequences that combines bioinformatics analysis of mRNA and expressed sequence tag (EST) sequences in GenBank, manual collection of literature data, and integration with other known databases such as OMIM. Our bioinformatics analysis identifies fusion transcripts that have non-overlapping alignments at multiple genomic loci. Fusion events at exon-exon borders are selected to filter out cloning artifacts introduced during cDNA library preparation. The results are classified into two groups: genuine chromosome translocation, and fusion between neighboring genes owing to intergenic splicing. We also integrated manually collected literature and OMIM data for chromosome translocation as an aid to assess the validity of each fusion event. The database is available at http://genome.ewha.ac.kr/ChimerDB/ for human, mouse and rat genomes.
    Nucleic Acids Research 02/2006; 34(Database issue):D21-4. · 8.03 Impact Factor
  • Article: Short homologous sequences are strongly associated with the generation of chimeric RNAs in eukaryotes.
    ABSTRACT: Chimeric RNAs have been reported in a variety of organisms and are conventionally thought to be produced by trans-splicing of two or more distinct transcripts. Here, we conducted a large-scale search for chimeric RNAs in the budding yeast, fruit fly, mouse, and human. Thousands of chimeric transcripts were identified in these organisms except in yeast, in which five chimeric RNAs were observed. RT-PCR experiments on a sample of yeast and fly chimeric transcripts using specific primers show that about one-third of these chimeric RNAs can be reproduced. The results suggest that a considerable fraction of chimeric RNAs is unlikely to result from aberrant transcription or splicing, and thus that formation of chimeric RNAs is probably a widespread process that can contribute greatly to the complexity of the transcriptome and proteome of organisms. However, only a small fraction (<20%) of these chimeric RNAs have GU-AG at the junction sequences, which fits the classical trans-splicing model. In contrast, we observed that about half of the chimeric RNAs have short homologous sequences (SHSs) at the junction sites of the source sequences. Our sequence mutation experiments in yeast showed that disruption of SHSs resulted in the disappearance of the corresponding chimeric RNAs, suggesting that SHSs are essential for generating this kind of chimeric RNA. In addition to the classical trans-splicing model, we propose a new model, the transcriptional slippage model, to explain the generation of those chimeric RNAs synthesized from templates with SHSs.
    Journal of Molecular Evolution 12/2008; 68(1):56-65. · 2.27 Impact Factor

Keywords

353 representative CGs
conjoined gene
ENCODE project
entire human genome
genomic evolution
human genome
manually curated 751 conjoined genes
NCBI database
Novel CGs
novel functional requirements
novel mechanism
one mRNA
possible mechanism
possible roles
protein evolution
public databases
routine process
transcription
unique splicing patterns exhibited
vertebrate genomes