The Transcriptome of the Human Pathogen Trypanosoma brucei at Single-Nucleotide Resolution

School of Public Health, Yale University, New Haven, Connecticut, United States of America.
PLoS Pathogens (Impact Factor: 7.56). 09/2010; 6(9):e1001090. DOI: 10.1371/journal.ppat.1001090
Source: PubMed

ABSTRACT The genome of Trypanosoma brucei, the causative agent of African trypanosomiasis, was published five years ago, yet identification of all genes and their transcripts remains to be accomplished. Annotation is challenged by the organization of genes transcribed by RNA polymerase II (Pol II) into long unidirectional gene clusters with no knowledge of how transcription is initiated. Here we report a single-nucleotide resolution genomic map of the T. brucei transcriptome, adding 1,114 new transcripts, including 103 non-coding RNAs, confirming and correcting many of the annotated features and revealing an extensive heterogeneity of 5' and 3' ends. Some of the new transcripts encode polypeptides that are either conserved in T. cruzi and Leishmania major or were previously detected in mass spectrometry analyses. High-throughput RNA sequencing (RNA-Seq) was sensitive enough to detect transcripts at putative Pol II transcription initiation sites. Our results, as well as recent data from the literature, indicate that transcription initiation is not solely restricted to regions at the beginning of gene clusters, but may occur at internal sites. We also provide evidence that transcription at all putative initiation sites in T. brucei is bidirectional, a recently recognized fundamental property of eukaryotic promoters. Our results have implications for gene expression patterns in other important human pathogens with similar genome organization (Trypanosoma cruzi, Leishmania sp.) and revealed heterogeneity in pre-mRNA processing that could potentially contribute to the survival and success of the parasite population in the insect vector and the mammalian host.

Available from: Christian Tschudi, Sep 25, 2015
  • Source
    • "Although RBP33 binds to mRNAs encoding known or conserved proteins, most of its RNA targets are likely to be non-coding and/or present at minimal levels in the cell. Indeed, 90% of the RBP33-associated RNAs annotated as coding for ‘hypothetical proteins’ or ‘hypothetical proteins, unlikely’ were not detected by ribosome profiling [14], and 75% were not detected in a global transcriptome survey [13] (Table S1). Moreover, over one-third of the corresponding genes are located next to SSRs, close to the ends of chromosomes or have an antisense orientation within a transcription unit (Fig. 3 and Table S1). "
    ABSTRACT: We have characterized the RNA-binding protein RBP33 in Trypanosoma brucei, and found that it localizes to the nucleus and is essential for viability. The subset of RNAs bound to RBP33 was determined by immunoprecipitation of ribonucleoprotein complexes followed by deep sequencing. Most RBP33-bound transcripts are predicted to be non-coding. Among these, over one-third are located close to the end of transcriptional units (TUs) or have an antisense orientation within a TU. Depletion of RBP33 resulted in an increase in the level of RNAs derived from regions that are normally silenced, such as strand-switch regions, retroposon and repeat sequences. Our work provides the first example of an RNA-binding protein involved in the regulation of gene silencing in trypanosomes.
    PLoS ONE 09/2014; 9(9):e107608. DOI:10.1371/journal.pone.0107608 · 3.23 Impact Factor
  • Source
    • "Recent RNA-sequencing studies of T. brucei and L. major have revealed that many genes can harbour multiple alternative processing sites [11] [12]. While the functional significance of these sites has yet to be determined on a genome wide scale, it is likely that some of these alternative sites are important to the regulation and/or function of the final transcript. "
    ABSTRACT: The Kinetoplastida are a diverse and globally distributed class of free-living and parasitic single-celled eukaryotes that collectively cause a significant burden on human health and welfare. In kinetoplastids individual genes do not have promoters, but rather all genes are arranged downstream of a small number of RNA polymerase II transcription initiation sites and are thus transcribed in polycistronic gene clusters. Production of individual mRNAs from this continuous transcript occurs co-transcriptionally by trans-splicing of a ∼39 nucleotide capped RNA and subsequent polyadenylation of the upstream mRNA. SLaP mapper (Spliced-Leader and Polyadenylation mapper) is a fully automated web-service for identification, quantitation and gene-assignment of both spliced-leader and polyadenylation addition sites in Kinetoplastid genomes. SLaP mapper only requires raw read data from paired-end Illumina RNAseq and performs all read processing, mapping, quality control, quantification, and analysis in a fully automated pipeline. To provide usage examples and estimates of the quantity of sequence data required we use RNAseq obtained from two different library preparations from both Trypanosoma brucei and Leishmania mexicana to show the number of expected reads that are obtained from each preparation type. SLaP mapper is an easy to use, platform independent webserver that is freely available for use at Example files are provided on the website.
    Molecular and Biochemical Parasitology 08/2014; 196(2). DOI:10.1016/j.molbiopara.2014.07.012 · 1.79 Impact Factor
  • Source
    • "Transcript lengths vary from 333 to 4,100 nucleotides and the average 5’ UTR length is 119 nucleotides with a median of 110 nucleotides. This is similar to the global analysis of the transcriptome [28,30], where a median length of 128 to 130 nucleotides was reported. On the other hand, the 3’ UTR is on average 390 nucleotides long with a median of 237 nucleotides, with the latter being notably smaller than the medians reported in the aforementioned transcriptome studies, namely 400 nucleotides [30] and 388 nucleotides [28]. "
    ABSTRACT: Although technical advances in genomics and proteomics research have yielded a better understanding of the coding capacity of a genome, one major challenge remaining is the identification of all expressed proteins, especially those less than 100 amino acids in length. Such information can be particularly relevant to human pathogens, like Trypanosoma brucei, the causative agent of African trypanosomiasis, since it will provide further insight into the parasite biology and life cycle. Starting with 993 T. brucei transcripts, previously shown by RNA-Seq not to coincide with annotated coding sequences (CDS), homology searches revealed that 173 predicted short open reading frames in these transcripts are conserved across kinetoplastids with 13 also conserved in representative eukaryotes. Mining mass spectrometry data sets revealed 42 transcripts encoding at least one matching peptide. RNAi-induced down-regulation of these 42 transcripts revealed seven to be essential in insect-form trypanosomes with two also required for the bloodstream life cycle stage. To validate the specificity of the RNAi results, each lethal phenotype was rescued by co-expressing an RNAi-resistant construct of each corresponding CDS. These previously non-annotated essential small proteins localized to a variety of cell compartments, including the cell surface, mitochondria, nucleus and cytoplasm, inferring diverse biological roles they are likely to play in T. brucei. We also provide evidence that one of these small proteins is required for replicating the kinetoplast (mitochondrial) DNA. Our studies highlight the presence and significance of small proteins in a protist and expose potential new targets to block the survival of trypanosomes in the insect vector and/or the mammalian host.
    BMC Biology 02/2014; 12(1):14. DOI:10.1186/1741-7007-12-14 · 7.98 Impact Factor