ABSTRACT: By integrating genome-wide maps of RNA polymerase II (Polr2a) binding with gene expression data and H3ac and H3K4me3 profiles, we characterized promoters with enriched activity in mouse embryonic stem cells (mES) as well as adult brain, heart, kidney, and liver. We identified approximately 24,000 promoters across these samples, including 16,976 annotated mRNA 5' ends and 5153 additional sites validating cap-analysis of gene expression (CAGE) 5' end data. We showed that promoters with CpG islands are typically non-tissue specific, with the majority associated with Polr2a and the active chromatin modifications in nearly all the tissues examined. By contrast, the promoters without CpG islands are generally associated with Polr2a and the active chromatin marks in a tissue-dependent way. We defined 4396 tissue-specific promoters by adapting a quantitative index of tissue-specificity based on Polr2a occupancy. While there is a general correspondence between Polr2a occupancy and active chromatin modifications at the tissue-specific promoters, a subset of them appear to be persistently marked by active chromatin modifications in the absence of detectable Polr2a binding, highlighting the complexity of the functional relationship between chromatin modification and gene expression. Our results provide a resource for exploring promoter Polr2a binding and epigenetic states across pluripotent and differentiated cell types in mammals.
Genome Research 02/2008; 18(1):46-59. DOI:10.1101/gr.6654808 · 14.63 Impact Factor
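The abstract above mentions a quantitative index of tissue-specificity based on Polr2a occupancy but does not give its formula. As an illustration only, the sketch below assumes one entropy-based formulation: occupancy values are normalized to relative proportions across tissues, the Shannon entropy measures how evenly a promoter is occupied overall, and a per-tissue score combines that entropy with the tissue's own proportion (lower means more specific to that tissue). The function name, the input layout, and the example occupancy values are hypothetical and are not taken from the paper.

```python
import math


def tissue_specificity(occupancy):
    """Entropy-based specificity scores for one promoter (assumed formulation).

    occupancy: dict mapping tissue name -> non-negative occupancy signal.
    Returns (entropy, scores), where scores[t] = entropy - log2(p_t) and
    p_t is the tissue's share of the total signal. Lower scores indicate
    occupancy concentrated in that tissue.
    """
    total = sum(occupancy.values())
    if total <= 0:
        raise ValueError("at least one tissue must have positive occupancy")

    # Relative occupancy of the promoter in each tissue.
    probs = {tissue: value / total for tissue, value in occupancy.items()}

    # Shannon entropy in bits; terms with p = 0 contribute nothing.
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)

    # Per-tissue score; a tissue with no signal gets an infinite score.
    scores = {
        tissue: (entropy - math.log2(p)) if p > 0 else math.inf
        for tissue, p in probs.items()
    }
    return entropy, scores


if __name__ == "__main__":
    # Hypothetical occupancy values for a single promoter in five samples.
    example = {"mES": 0.1, "brain": 0.1, "heart": 0.2, "kidney": 0.3, "liver": 8.0}
    entropy, scores = tissue_specificity(example)
    print(f"entropy = {entropy:.3f} bits")
    for tissue, score in sorted(scores.items(), key=lambda item: item[1]):
        print(f"{tissue:>6}: {score:.3f}")
```

Under this assumed definition, a promoter occupied equally in all five samples has the maximum entropy of log2(5) bits, while a promoter occupied in a single sample has entropy 0 and a score of 0 for that sample. Calling a promoter tissue-specific would still require a cutoff on the score, which the abstract does not state.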
ABSTRACT: ChIP-chip (or ChIP-on-chip) is a technology for isolation and identification of genomic sites occupied by specific DNA-binding proteins in living cells. The ChIP-chip signals can be obtained over the whole genome by tiling arrays, where a peak shape is generally observed around a protein-binding site. In this article, we describe the ChIP-chip process and present a probability model for ChIP-chip data. We then propose a model-based method for recognizing the peak shapes for the purpose of detecting protein-binding sites. We also investigate the issue of bandwidth in the nonparametric kernel smoothing method.
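To make the bandwidth issue raised in this abstract concrete, here is a minimal sketch of Gaussian kernel smoothing (a Nadaraya-Watson estimator) applied to probe-level signal along a tiled region, followed by a simple local-maximum peak call. This is not the model-based method the article proposes, which the abstract does not specify; the function names, the threshold, and the synthetic probe data are all assumptions made for illustration.

```python
import math


def kernel_smooth(positions, signals, bandwidth):
    """Nadaraya-Watson smoother with a Gaussian kernel.

    positions: probe coordinates (e.g. base pairs).
    signals:   one ChIP-chip signal value per probe.
    bandwidth: kernel standard deviation, in the same units as positions.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    smoothed = []
    for x0 in positions:
        # Weight every probe by its distance from the evaluation point x0.
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in positions]
        weighted_sum = sum(w * y for w, y in zip(weights, signals))
        smoothed.append(weighted_sum / sum(weights))
    return smoothed


def local_maxima(positions, values, threshold):
    """Positions of interior local maxima whose value exceeds the threshold."""
    peaks = []
    for i in range(1, len(values) - 1):
        if (values[i] > threshold
                and values[i] > values[i - 1]
                and values[i] >= values[i + 1]):
            peaks.append(positions[i])
    return peaks


if __name__ == "__main__":
    # Synthetic data: probes every 50 bp, a triangular peak centered at
    # 500 bp, plus alternating probe-to-probe noise of +/- 0.4.
    positions = list(range(0, 1001, 50))
    signals = [
        max(0.0, 3.0 - abs(p - 500) / 100.0) + (0.4 if i % 2 else -0.4)
        for i, p in enumerate(positions)
    ]
    for bandwidth in (10, 100, 400):
        smoothed = kernel_smooth(positions, signals, bandwidth)
        print(bandwidth, local_maxima(positions, smoothed, threshold=1.0))
```

The bandwidth controls the trade-off: a value much smaller than the probe spacing leaves the noise in place and tends to split one binding site into several spurious maxima, while a very large value flattens the peak until it may no longer clear the threshold.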
ABSTRACT: ChIP-chip combines chromatin immunoprecipitation (ChIP) with microarrays (chip) to determine protein-DNA interactions occurring in living cells. The high-throughput nature of this method makes it an ideal approach for identifying transcription factor targets or chromatin modification sites along the genome. UNIT 21.9 describes a protocol for analysis of protein-DNA interactions in yeast cells. This unit introduces an alternative protocol developed for mammalian cells.
Current protocols in molecular biology / edited by Frederick M. Ausubel ... [et al.] 08/2007; Chapter 21:Unit 21.13. DOI:10.1002/0471142727.mb2113s79
ABSTRACT: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
ABSTRACT: Eukaryotic gene transcription is accompanied by acetylation and methylation of nucleosomes near promoters, but the locations and roles of histone modifications elsewhere in the genome remain unclear. We determined the chromatin modification states in high resolution along 30 Mb of the human genome and found that active promoters are marked by trimethylation of Lys4 of histone H3 (H3K4), whereas enhancers are marked by monomethylation, but not trimethylation, of H3K4. We developed computational algorithms using these distinct chromatin signatures to identify new regulatory elements, predicting over 200 promoters and 400 enhancers within the 30-Mb region. This approach accurately predicted the location and function of independently identified regulatory elements with high sensitivity and specificity and uncovered a novel functional enhancer for the carnitine transporter SLC22A5 (OCTN2). Our results give insight into the connections between chromatin modifications and transcriptional regulatory activity and provide a new tool for the functional annotation of the human genome.
ABSTRACT: Control of eukaryotic gene expression involves combinatorial interactions between transcription factors and regulatory sequences in the genome. In addition, chromatin structure and modification states play key roles in determining the competence of transcription. The term 'transcriptional regulatory code' has been used to describe the interplay of these events in the complex control of transcription. With the maturation of methods for detecting in vivo protein-DNA interactions on a genome-wide scale, detailed maps of chromatin features and transcription factor localization over entire genomes of eukaryotic cells are enriching our understanding of the properties and nature of this transcriptional regulatory code. The rapidly growing number of maps has revealed the dynamic nature of nucleosome composition and chromatin remodeling at regulatory regions and highlighted some unexpected properties of transcriptional regulatory networks in eukaryotic cells.
Current Opinion in Cell Biology 07/2006; 18(3):291-8. DOI:10.1016/j.ceb.2006.04.002 · 8.47 Impact Factor
ABSTRACT: In eukaryotic cells, transcription of every protein-coding gene begins with the assembly of an RNA polymerase II preinitiation complex (PIC) on the promoter. The promoters, in conjunction with enhancers, silencers and insulators, define the combinatorial codes that specify gene expression patterns. Our ability to analyse the control logic encoded in the human genome is currently limited by a lack of accurate information regarding the promoters for most genes. Here we describe a genome-wide map of active promoters in human fibroblast cells, determined by experimentally locating the sites of PIC binding throughout the human genome. This map defines 10,567 active promoters corresponding to 6,763 known genes and at least 1,196 un-annotated transcriptional units. Features of the map suggest extensive use of multiple promoters by the human genes and widespread clustering of active promoters in the genome. In addition, examination of the genome-wide expression profile reveals four general classes of promoters that define the transcriptome of the cell. These results provide a global view of the functional relationships among transcriptional machinery, chromatin structure and gene expression in human cells.
ABSTRACT: Transcriptional regulatory elements play essential roles in gene expression during animal development and cellular response to environmental signals, but our knowledge of these regions in the human genome is limited despite the availability of the complete genome sequence. Promoters mark the start of every transcript and are an important class of regulatory elements. A large, complex protein structure known as the pre-initiation complex (PIC) is assembled on all active promoters, and the presence of these proteins distinguishes promoters from other sequences in the genome. Using components of the PIC as tags, we isolated promoters directly from human cells as protein-DNA complexes and identified the resulting DNA sequences using genomic tiling microarrays. Our experiments in four human cell lines uncovered 252 PIC-binding sites in 44 semirandomly selected human genomic regions comprising 1% (30 megabase pairs) of the human genome. Nearly 72% of the identified fragments overlap or immediately flank 5' ends of known cDNA sequences, while the remainder is found in other genomic regions that likely harbor putative promoters of unannotated transcripts. Indeed, molecular analysis of the RNA isolated from one cell line uncovered transcripts initiated from over half of the putative promoter fragments, and transient transfection assays revealed promoter activity for a significant proportion of fragments when they were fused to a luciferase reporter gene. These results demonstrate the specificity of a genome-wide analysis method for mapping transcriptional regulatory elements and also indicate that a small, yet significant number of human genes remains to be discovered.
Genome Research 07/2005; 15(6):830-9. DOI:10.1101/gr.3430605 · 14.63 Impact Factor