ABSTRACT: Fused genes are important sources of data for studies of evolution and protein function. To date no service has been made available online to aid in the large-scale identification of fused genes in sequenced genomes. We have developed a program, Gene deFuser, that analyzes uploaded protein sequence files for characteristics of gene fusion events and presents the results in a convenient web interface.
To test the ability of this software to detect fusions on a genome-wide scale, we analyzed the 24,725 gene models predicted for the ciliated protozoan Tetrahymena thermophila. Gene deFuser detected members of eight of the nine families of gene fusions known or predicted in this species and identified nineteen new families of fused genes, each containing between one and twelve members. In addition to these genuine fusions, Gene deFuser also detected a particular type of gene misannotation, in which two independent genes were predicted as a single transcript by gene annotation tools. Twenty-nine of the artifacts detected by Gene deFuser in the initial annotation have been corrected in subsequent versions, with a total of 25 annotation artifacts (about 1/3 of the total fusions identified) remaining in the most recent annotation.
The newly identified Tetrahymena fusions belong to classes of genes involved in processes such as phospholipid synthesis, nuclear export, and surface antigen generation. These results highlight the potential of Gene deFuser to reveal a large number of novel fused genes in evolutionarily isolated organisms. Gene deFuser may also prove useful as an ancillary tool for detecting fusion artifacts during gene model annotation.
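As a rough illustration of what screening for "characteristics of gene fusion events" can look like, the sketch below flags a protein as a fusion candidate when two different reference proteins align to largely non-overlapping regions of it. This is a generic heuristic written for illustration only; the abstract does not describe Gene deFuser's actual algorithm, and the `Hit` record, the `fusion_candidates` function, the `max_overlap` threshold, and all protein names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment of a reference protein onto a query protein (hypothetical record)."""
    query: str    # candidate fused protein
    subject: str  # reference protein that aligns to it
    q_start: int  # 1-based start of the aligned region on the query
    q_end: int    # inclusive end of the aligned region on the query


def overlap(a: Hit, b: Hit) -> int:
    """Number of query residues covered by both hits."""
    return max(0, min(a.q_end, b.q_end) - max(a.q_start, b.q_start) + 1)


def fusion_candidates(hits, max_overlap=10):
    """Return {query: [(subject1, subject2), ...]} for queries where two
    different subjects align to regions sharing at most max_overlap residues.
    The threshold is an arbitrary assumption, not a published value."""
    by_query = {}
    for hit in hits:
        by_query.setdefault(hit.query, []).append(hit)

    candidates = {}
    for query, query_hits in by_query.items():
        pairs = set()
        for i, a in enumerate(query_hits):
            for b in query_hits[i + 1:]:
                if a.subject != b.subject and overlap(a, b) <= max_overlap:
                    pairs.add(tuple(sorted((a.subject, b.subject))))
        if pairs:
            candidates[query] = sorted(pairs)
    return candidates


# Hypothetical input: protA has two separate regions matching different proteins,
# protB has two hits covering essentially the same region.
example_hits = [
    Hit("protA", "refX", 1, 200),
    Hit("protA", "refY", 210, 400),
    Hit("protB", "refX", 1, 200),
    Hit("protB", "refZ", 5, 190),
]
result = fusion_candidates(example_hits)
```

A screen like this cannot by itself separate genuine fusions from annotation artifacts in which two independent genes were merged into one gene model, so flagged candidates would still need inspection.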
ABSTRACT: The methionine salvage pathway is responsible for regenerating methionine from its derivative, methylthioadenosine. The complete set of enzymes of the methionine salvage pathway has been previously described in bacteria. Despite its importance, the pathway has only been fully described in one eukaryotic organism, yeast. Here we use a computational approach to identify the enzymes of the methionine salvage pathway in another eukaryote, Tetrahymena thermophila. In this organism, the pathway has two fused genes, MTNAK and MTNBD. Each of these fusions involves two different genes whose products catalyze two different single steps of the pathway in other organisms. One of the fusion proteins, mtnBD, is formed by enzymes that catalyze non-consecutive steps in the pathway, mtnB and mtnD. Interestingly, the gene that codes for the intervening enzyme in the pathway, mtnC, is missing from the genome of Tetrahymena. We used complementation tests in yeast to show that the fusion of mtnB and mtnD from Tetrahymena is able to do in one step what yeast does in three, since it can rescue yeast knockouts of mtnB, mtnC, or mtnD. Fusion genes have proved to be very useful in aiding phylogenetic reconstructions and in the functional characterization of genes. Our results highlight another characteristic of fusion proteins, namely that these proteins can serve as biochemical shortcuts, allowing organisms to completely bypass steps in biochemical pathways.
ABSTRACT: We used the recently sequenced genomes of the ciliates Tetrahymena thermophila and Paramecium tetraurelia to analyze codon usage patterns in both organisms, examining codon usage bias, Gln codon usage, GC content, and the nucleotide contexts of initiation and termination codons. We also studied how these trends change along the length of the genes and in a subset of highly expressed genes. Our results corroborate some of the trends previously described in Tetrahymena, but also contradict some specific observations. In both genomes we found a strong bias toward codons with low GC content; however, in highly expressed genes this bias is smaller and codons ending in GC tend to be more frequent. We also found that codon bias increases along gene segments and in highly expressed genes, and that the contexts surrounding initiation and termination codons are always AT-rich. Our results also suggest differences in the efficiency of translation of the reassigned stop codons between the two species and between the reassigned codons. Finally, we discuss some of the possible causes for such translational efficiency differences.
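The basic quantities named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names and the toy sequence are hypothetical, "codons ending in GC" is read here as codons whose third base is G or C (an assumption), and the glutamine codon set assumes the ciliate nuclear genetic code, in which TAA and TAG are reassigned from stop to Gln alongside the standard CAA and CAG.

```python
from collections import Counter

# Assumption: ciliate nuclear code, where TAA and TAG encode Gln
# in addition to the standard CAA and CAG.
GLN_CODONS = ("CAA", "CAG", "TAA", "TAG")


def codons(cds):
    """Split a coding sequence into codons, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc3(cds):
    """Fraction of codons whose third base is G or C."""
    cs = codons(cds)
    return sum(c[2] in "GC" for c in cs) / len(cs) if cs else 0.0


def gln_codon_usage(cds):
    """Count how often each Gln codon (standard and reassigned) occurs in frame."""
    counts = Counter(codons(cds))
    return {codon: counts[codon] for codon in GLN_CODONS}


# Toy sequence (hypothetical): ATG CAA TAA CAG TAG TGA
toy_cds = "ATGCAATAACAGTAGTGA"
toy_gc = gc_content(toy_cds)
toy_gc3 = gc3(toy_cds)
toy_gln = gln_codon_usage(toy_cds)
```

Comparing `gc3` and the `gln_codon_usage` counts between all genes and a highly expressed subset, or between successive segments of each gene, would be one simple way to look at the kinds of trends the abstract reports.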