The Transcriptome of the Human Pathogen Trypanosoma brucei at Single-Nucleotide Resolution

School of Public Health, Yale University, New Haven, Connecticut, United States of America.
PLoS Pathogens (Impact Factor: 7.56). 09/2010; 6(9):e1001090. DOI: 10.1371/journal.ppat.1001090
Source: PubMed


The genome of Trypanosoma brucei, the causative agent of African trypanosomiasis, was published five years ago, yet identification of all genes and their transcripts remains to be accomplished. Annotation is challenged by the organization of genes transcribed by RNA polymerase II (Pol II) into long unidirectional gene clusters with no knowledge of how transcription is initiated. Here we report a single-nucleotide resolution genomic map of the T. brucei transcriptome, adding 1,114 new transcripts, including 103 non-coding RNAs, confirming and correcting many of the annotated features and revealing an extensive heterogeneity of 5' and 3' ends. Some of the new transcripts encode polypeptides that are either conserved in T. cruzi and Leishmania major or were previously detected in mass spectrometry analyses. High-throughput RNA sequencing (RNA-Seq) was sensitive enough to detect transcripts at putative Pol II transcription initiation sites. Our results, as well as recent data from the literature, indicate that transcription initiation is not solely restricted to regions at the beginning of gene clusters, but may occur at internal sites. We also provide evidence that transcription at all putative initiation sites in T. brucei is bidirectional, a recently recognized fundamental property of eukaryotic promoters. Our results have implications for gene expression patterns in other important human pathogens with similar genome organization (Trypanosoma cruzi, Leishmania sp.) and revealed heterogeneity in pre-mRNA processing that could potentially contribute to the survival and success of the parasite population in the insect vector and the mammalian host.

Available from: Christian Tschudi,
    • "Moreover, these reads did not map to ORFs, indicating that they do not reflect the presence of translating ribosomes (data not shown). Of the >1100 novel transcripts previously described [11], we found that fewer than 10% showed clear evidence of protein production, suggesting they play other roles in parasite biology. In contrast, only 22 annotated CDSs with mRNA levels above the lowest quartile had negligible ribosome footprints (<180 total summed across all libraries). "
    ABSTRACT: Since the initial publication of the trypanosomatid genomes, curation has been ongoing. Here we make use of existing Trypanosoma brucei ribosome profiling data to provide evidence of ribosome occupancy (and likely translation) of mRNAs from 225 currently unannotated coding sequences (CDSs). A small number of these putative genes correspond to extra copies of previously annotated genes, but 85% are novel. The median size of these novel CDSs is small (81 aa), indicating that past annotation work has excelled at detecting large CDSs. Of the unique CDSs confirmed here, over half have candidate orthologues in other trypanosomatid genomes, most of which were not yet annotated as protein-coding genes. Nonetheless, approximately one-third of the new CDSs were found only in T. brucei subspecies. Using ribosome footprints, RNA-Seq and spliced leader mapping data, we updated previous work to definitively revise the start sites for 414 CDSs as compared to the current gene models. The data pointed to several regions of the genome that had sequence errors that altered coding region boundaries. Finally, we consolidated these data with our previous work to propose elimination of 683 putative genes as protein-coding and arrive at a view of the translatome of slender bloodstream and procyclic culture form T. brucei.
    Molecular and Biochemical Parasitology 09/2015; 202(2). DOI:10.1016/j.molbiopara.2015.09.002 · 1.79 Impact Factor
    • "Although RBP33 binds to mRNAs encoding known or conserved proteins, most of its RNA targets are likely to be non-coding and/or present at minimal levels in the cell. Indeed, 90% of the RBP33-associated RNAs annotated as coding for ‘hypothetical proteins’ or ‘hypothetical proteins, unlikely’ were not detected by ribosome profiling [14], and 75% were not detected in a global transcriptome survey [13] (Table S1). Moreover, over one-third of the corresponding genes are located next to SSRs, close to the ends of chromosomes or have an antisense orientation within a transcription unit (Fig. 3 and Table S1). "
    ABSTRACT: We have characterized the RNA-binding protein RBP33 in Trypanosoma brucei, and found that it localizes to the nucleus and is essential for viability. The subset of RNAs bound to RBP33 was determined by immunoprecipitation of ribonucleoprotein complexes followed by deep sequencing. Most RBP33-bound transcripts are predicted to be non-coding. Among these, over one-third are located close to the end of transcriptional units (TUs) or have an antisense orientation within a TU. Depletion of RBP33 resulted in an increase in the level of RNAs derived from regions that are normally silenced, such as strand-switch regions, retroposon and repeat sequences. Our work provides the first example of an RNA-binding protein involved in the regulation of gene silencing in trypanosomes.
    PLoS ONE 09/2014; 9(9):e107608. DOI:10.1371/journal.pone.0107608 · 3.23 Impact Factor
    • "Recent RNA-sequencing studies of T. brucei and L. major have revealed that many genes can harbour multiple alternative processing sites [11] [12]. While the functional significance of these sites has yet to be determined on a genome wide scale, it is likely that some of these alternative sites are important to the regulation and/or function of the final transcript. "
    ABSTRACT: The Kinetoplastida are a diverse and globally distributed class of free-living and parasitic single-celled eukaryotes that collectively cause a significant burden on human health and welfare. In kinetoplastids individual genes do not have promoters, but rather all genes are arranged downstream of a small number of RNA polymerase II transcription initiation sites and are thus transcribed in polycistronic gene clusters. Production of individual mRNAs from this continuous transcript occurs co-transcriptionally by trans-splicing of a ∼39 nucleotide capped RNA and subsequent polyadenylation of the upstream mRNA. SLaP mapper (Spliced-Leader and Polyadenylation mapper) is a fully automated web-service for identification, quantitation and gene-assignment of both spliced-leader and polyadenylation addition sites in Kinetoplastid genomes. SLaP mapper only requires raw read data from paired-end Illumina RNAseq and performs all read processing, mapping, quality control, quantification, and analysis in a fully automated pipeline. To provide usage examples and estimates of the quantity of sequence data required, we use RNAseq obtained from two different library preparations from both Trypanosoma brucei and Leishmania mexicana to show the number of expected reads that are obtained from each preparation type. SLaP mapper is an easy to use, platform independent webserver that is freely available for use at Example files are provided on the website.
    Molecular and Biochemical Parasitology 08/2014; 196(2). DOI:10.1016/j.molbiopara.2014.07.012 · 1.79 Impact Factor