A High-Resolution Map of Transcription in the Yeast Genome

Department of Biochemistry, Stanford University, Palo Alto, California, United States
Proceedings of the National Academy of Sciences (Impact Factor: 9.67). 05/2006; 103(14):5320-5. DOI: 10.1073/pnas.0601091103
Source: PubMed


There is abundant transcription from eukaryotic genomes unaccounted for by protein coding genes. A high-resolution genome-wide survey of transcription in a well annotated genome will help relate transcriptional complexity to function. By quantifying RNA expression on both strands of the complete genome of Saccharomyces cerevisiae using a high-density oligonucleotide tiling array, this study identifies the boundary, structure, and level of coding and noncoding transcripts. A total of 85% of the genome is expressed in rich media. Apart from expected transcripts, we found operon-like transcripts, transcripts from neighboring genes not separated by intergenic regions, and genes with complex transcriptional architecture where different parts of the same gene are expressed at different levels. We mapped the positions of 3' and 5' UTRs of coding genes and identified hundreds of RNA transcripts distinct from annotated genes. These nonannotated transcripts, on average, have lower sequence conservation and lower rates of deletion phenotype than protein coding genes. Many other transcripts overlap known genes in antisense orientation, and for these pairs global correlations were discovered: UTR lengths correlated with gene function, localization, and requirements for regulation; antisense transcripts overlapped 3' UTRs more than 5' UTRs; UTRs with overlapping antisense tended to be longer; and the presence of antisense associated with gene function. These findings may suggest a regulatory role of antisense transcription in S. cerevisiae. Moreover, the data show that even this well studied genome has transcriptional complexity far beyond current annotation.
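As a rough illustration of how a genome-wide expressed fraction such as the 85% figure could be computed from transcript segments, the sketch below merges segment intervals pooled from both strands and divides the covered length by the genome length. The function names, the half-open interval convention, and the toy coordinates are assumptions made for illustration; the abstract does not say how the figure was actually derived.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs (assumed convention)


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def expressed_fraction(segments: Iterable[Interval], genome_length: int) -> float:
    """Fraction of genome positions covered by at least one segment."""
    covered = sum(end - start for start, end in merge_intervals(segments))
    return covered / genome_length


# Hypothetical toy data: segments from both strands pooled on a 100 bp "genome".
toy_segments = [(0, 40), (30, 60), (80, 100)]
fraction = expressed_fraction(toy_segments, genome_length=100)
```

Pooling both strands before merging counts a position as expressed if either strand is transcribed there; a per-strand figure would need the segments kept separate.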



Available from: Lior David,
    • "The protein-coding part of the yeast genome includes verified genes, which are conserved open reading frames (ORFs) typically associated with a biological process, uncharacterized genes, which play no known roles, and dubious genes, which are assumed not to encode functional proteins [8]. The yeast transcriptome has been studied in numerous conditions using microarrays and RNA-Sequencing (RNA-Seq) [9] [10] [11] [12]. Furthermore, protein-profiling experiments detected nearly all theoretically predicted yeast proteins in pre-fractionated extracts from growing cells [13] [14]. "
    ABSTRACT: Diploid budding yeast undergoes rapid mitosis when it ferments glucose, and in the presence of a non-fermentable carbon source and the absence of a nitrogen source it triggers sporulation. Rich medium with acetate is a commonly used pre-sporulation medium, but our understanding of the molecular events underlying the acetate-driven transition from mitosis to meiosis is still incomplete. We identified 263 proteins for which mRNA and protein synthesis are linked or uncoupled in fermenting and respiring cells. Using motif predictions, interaction data and RNA profiling we find among them 28 likely targets for Ume6, a subunit of the conserved Rpd3/Sin3 histone deacetylase-complex regulating genes involved in metabolism, stress response and meiosis. Finally, we identify 14 genes for which both RNA and proteins are detected exclusively in respiring cells but not in fermenting cells in our sample set, including CSM4, SPR1, SPS4 and RIM4, which were thought to be meiosis-specific. Our work reveals intertwined transcriptional and post-transcriptional control mechanisms acting when a MATa/α strain responds to nutritional signals, and provides molecular clues as to how the carbon source primes yeast cells for entering meiosis. Biological significance: Our integrated genomics study provides insight into the interplay between the transcriptome and the proteome in diploid yeast cells undergoing vegetative growth in the presence of glucose (fermentation) or acetate (respiration). Furthermore, it reveals novel target genes involved in these processes for Ume6, the DNA binding subunit of the conserved histone deacetylase Rpd3 and the co-repressor Sin3. We have combined data from an RNA profiling experiment using tiling arrays that cover the entire yeast genome, and a large-scale protein detection analysis based on mass spectrometry in diploid MATa/α cells. This distinguishes our study from most others in the field, which investigate haploid yeast strains, because only diploid cells can undergo meiotic development in the simultaneous presence of a non-fermentable carbon source and absence of nitrogen. Indeed, we report molecular clues as to how respiration of acetate might prime diploid cells for efficient spore formation, a phenomenon that is well known but poorly understood.
    Journal of Proteomics 02/2015; 119. DOI:10.1016/j.jprot.2015.01.015 · 3.89 Impact Factor
    • "We also defined those nucleosomes ordered towards the upstream region of the TSS as the −1st, −2nd, …, −kth nucleosomes, and the nucleosomes ordered towards the downstream region of the TSS as the 1st, 2nd, …, kth nucleosomes. We obtained microarray gene expression data in yeast from David et al. [30]. This dataset contains 5736 UCSC-annotated genes, of which 5300 genes were identified as high-confidence transcripts, meaning that their transcript segments overlap by more than 50% with a nondubious annotated coding region on the 5′ end [17]. "
    ABSTRACT: The identification of important factors that affect nucleosome formation is critical to clarify nucleosome-forming mechanisms and the role of the nucleosome in gene regulation. Various features reported in the literature led to our hypothesis that multiple features can together contribute to nucleosome formation. Therefore, we compiled 779 features and developed a pattern discovery and scoring algorithm, FFN (Finding Features for Nucleosomes), to identify feature patterns that are differentially enriched in nucleosome-forming sequences and nucleosome-depletion sequences. Applying FFN to genome-wide nucleosome occupancy data in yeast and human, we identified statistically significant feature patterns that may influence nucleosome formation, many of which are common to the two species. We found that both sequence and structural features are important in nucleosome occupancy prediction. We discovered that, even for the same feature combinations, variations in feature values may lead to differences in predictive power. We demonstrated that the identified feature patterns could be used to assist nucleosomal sequence prediction.
    Genomics 08/2014; 104(2). DOI:10.1016/j.ygeno.2014.07.002 · 2.28 Impact Factor
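The nucleosome numbering described in the excerpt above (−1st, −2nd, … moving upstream from the TSS; 1st, 2nd, … moving downstream) can be sketched as follows. This is a minimal illustration, not the citing authors' code: representing each nucleosome by its center coordinate, treating a center exactly at the TSS as downstream, and the function name `label_nucleosomes` are all assumptions.

```python
from typing import Dict, List


def label_nucleosomes(centers: List[int], tss: int, strand: str = "+") -> Dict[int, int]:
    """Map each nucleosome center to its ordinal relative to the TSS.

    Upstream nucleosomes get -1, -2, ... moving away from the TSS;
    downstream nucleosomes get 1, 2, ... moving away from the TSS.
    A center exactly at the TSS is counted as downstream (assumption).
    """
    # Orient coordinates so that positive offsets are downstream of the TSS.
    sign = 1 if strand == "+" else -1
    offsets = {c: sign * (c - tss) for c in centers}
    # Nearest-to-TSS first on each side.
    upstream = sorted((c for c in centers if offsets[c] < 0), key=lambda c: -offsets[c])
    downstream = sorted((c for c in centers if offsets[c] >= 0), key=lambda c: offsets[c])
    labels = {c: -(i + 1) for i, c in enumerate(upstream)}
    labels.update({c: i + 1 for i, c in enumerate(downstream)})
    return labels


# Hypothetical nucleosome centers around a TSS at position 1000.
plus_labels = label_nucleosomes([700, 850, 1080, 1250], tss=1000, strand="+")
minus_labels = label_nucleosomes([700, 850, 1080, 1250], tss=1000, strand="-")
```

On the minus strand, upstream lies at higher genomic coordinates, so the same centers receive mirrored labels.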
    • "Knowledge of transcriptional start sites (TSSs) and promoter architecture is thus crucial for understanding the transcriptional regulation underlying these fundamental processes. For genes transcribed by RNA polymerase II (pol-II), the architecture of promoter regions has been extensively studied in many prokaryotes and eukaryotes, such as bacteria, yeast, and humans (David et al., 2006; Yamashita et al., 2011; Jorjani and Zavolan, 2014; Park et al., 2014). A major component of promoter architecture identified in animal species are DNA sequence elements, which are bound by different components of the pol-II transcription initiation machinery (Kadonaga, 2004, 2012; Thomas and Chiang, 2006). "
    ABSTRACT: Understanding plant gene promoter architecture has long been a challenge due to the lack of relevant large-scale data sets and analysis methods. Here, we present a publicly available, large-scale transcription start site (TSS) data set in plants using a high-resolution method for analysis of 5' ends of mRNA transcripts. Our data set is produced using the paired-end analysis of transcription start sites (PEAT) protocol, providing millions of TSS locations from wild-type Columbia-0 Arabidopsis thaliana whole root samples. Using this data set, we grouped TSS reads into "TSS tag clusters" and categorized clusters into three spatial initiation patterns: narrow peak, broad with peak, and weak peak. We then designed a machine learning model that predicts the presence of TSS tag clusters with outstanding sensitivity and specificity for all three initiation patterns. We used this model to analyze the transcription factor binding site content of promoters exhibiting these initiation patterns. In contrast to the canonical notions of TATA-containing and more broad "TATA-less" promoters, the model shows that, in plants, the vast majority of transcription start sites are TATA free and are defined by a large compendium of known DNA sequence binding elements. We present results on the usage of these elements and provide our Plant PEAT Peaks (3PEAT) model that predicts the presence of TSSs directly from sequence.
    The Plant Cell 07/2014; 26(7). DOI:10.1105/tpc.114.125617 · 9.34 Impact Factor