ABSTRACT: Recent research has uncovered extensive variability in the boundaries of transcript isoforms, yet the functional consequences of this variation remain largely unexplored. Here, we systematically discriminate between the molecular phenotypes of overlapping coding and non-coding transcriptional events from each genic locus using a novel genome-wide, nucleotide-resolution technique to quantify the half-lives of 3' transcript isoforms in yeast. Our results reveal widespread differences in stability among isoforms for hundreds of genes in a single condition, and that variation of even a single nucleotide in the 3' untranslated region (UTR) can affect transcript stability. While previous instances of negative associations between 3' UTR length and transcript stability have been reported, here, we find that shorter isoforms are not necessarily more stable. We demonstrate the role of RNA-protein interactions in conditioning isoform-specific stability, showing that PUF3 binds and destabilizes specific polyadenylation isoforms. Our findings indicate that although the functional elements of a gene are encoded in DNA sequence, the selective incorporation of these elements into RNA through transcript boundary variation allows a single gene to have diverse functional consequences.
Molecular Systems Biology 02/2014; 10(2):719. · 14.10 Impact Factor
ABSTRACT: Active human promoters produce promoter-upstream transcripts (PROMPTs). Why these RNAs are coupled to decay, whereas their neighboring promoter-downstream mRNAs are not, is unknown. Here high-throughput sequencing demonstrates that PROMPTs generally initiate in the antisense direction closely upstream of the transcription start sites (TSSs) of their associated genes. PROMPT TSSs share features with mRNA-producing TSSs, including stalled RNA polymerase II (RNAPII) and the production of small TSS-associated RNAs. Notably, motif analyses around PROMPT 3' ends reveal polyadenylation (pA)-like signals. Mutagenesis studies demonstrate that PROMPT pA signals are functional but linked to RNA degradation. Moreover, pA signals are under-represented in promoter-downstream versus promoter-upstream regions, thus allowing for more efficient RNAPII progress in the sense direction from gene promoters. We conclude that asymmetric sequence distribution around human gene promoters serves to provide a directional RNA output from an otherwise bidirectional transcription process.
ABSTRACT: The use of alternative poly(A) sites is common and affects the post-transcriptional fate of mRNA, including its stability, subcellular localization and translation. Here, we present a method to identify poly(A) sites in a genome-wide and strand-specific manner. This method, termed 3'T-fill, initially fills in the poly(A) stretch with unlabeled dTTPs, allowing sequencing to start directly after the poly(A) tail into the 3'-untranslated regions (UTR). Our comparative analysis demonstrates that it outperforms existing protocols in quality and throughput and accurately quantifies RNA levels as only one read is produced from each transcript. We use this method to characterize the diversity of polyadenylation in Saccharomyces cerevisiae, showing that alternative RNA molecules are present even in a genetically identical cell population. Finally, we observe that overlap of convergent 3'-UTRs is frequent but sharply limited by coding regions, suggesting factors that restrict compression of the yeast genome.
Nucleic Acids Research 01/2013; · 8.81 Impact Factor
ABSTRACT: Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes) Illumina produced the best assemblies and more correctly resembled the expected functional composition. For the most complex community (400 genomes) there was very little assembly of reads from any sequencing technology. However, due to the longer read length the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities. Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available.
PLoS ONE 02/2012; 7(2):e31386. · 3.53 Impact Factor
ABSTRACT: Alternative polyadenylation site usage gives rise to variation in 3' ends of transcripts in diverse organisms ranging from yeast to human. Accurate mapping of polyadenylation sites of transcripts is of major biological importance, since the length of the 3'UTR can have a strong influence on transcript stability, localization, and translation. However, reads generated using total mRNA sequencing mostly lack the very 3' end of transcripts. Here, we present a method that allows simultaneous analysis of alternative 3' ends and transcriptome dynamics at high throughput. By using transcripts produced in vitro, the high precision of end mapping during the protocol can be controlled. This method is illustrated here for budding yeast. However, this method can be applied to any natural or artificially polyadenylated RNA.
Methods in enzymology 01/2012; 513:271-96. · 1.90 Impact Factor
ABSTRACT: Several studies have shown that promoters of protein-coding genes are origins of pervasive non-coding RNA transcription and can initiate transcription in both directions. However, only recently have researchers begun to elucidate the functional implications of this bidirectionality and non-coding RNA production. Increasing evidence indicates that non-coding transcription at promoters influences the expression of protein-coding genes, revealing a new layer of transcriptional regulation. This regulation acts at multiple levels, from modifying local chromatin to enabling regional signal spreading and more distal regulation. Moreover, the bidirectional activity of a promoter is regulated at multiple points during transcription, giving rise to diverse types of transcripts.
Trends in Genetics 07/2011; 27(7):267-76. · 11.60 Impact Factor