Genome-wide mapping of in vivo protein-DNA interactions

Department of Genetics, Stanford University School of Medicine, Stanford, CA, 94305-5120, USA.
Science (Impact Factor: 31.48). 07/2007; 316(5830):1497-502. DOI: 10.1126/science.1141319
Source: PubMed

ABSTRACT In vivo protein-DNA interactions connect each transcription factor with its direct targets to form a gene network scaffold. To map these protein-DNA interactions comprehensively across entire mammalian genomes, we developed a large-scale chromatin immunoprecipitation assay (ChIPSeq) based on direct ultrahigh-throughput DNA sequencing. This sequence census method was then used to map in vivo binding of the neuron-restrictive silencer factor (NRSF; also known as REST, for repressor element-1 silencing transcription factor) to 1946 locations in the human genome. The data display sharp resolution of binding position [±50 base pairs (bp)], which facilitated our finding motifs and allowed us to identify noncanonical NRSF-binding motifs. These ChIPSeq data also have high sensitivity and specificity [ROC (receiver operator characteristic) area ≥ 0.96] and statistical confidence (P < 10^-4), properties that were important for inferring new candidate interactions. These include key transcription factors in the gene network that regulates pancreatic islet cell development.
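The ROC area quoted in the abstract summarizes how well a score separates true binding sites from non-sites. As a minimal sketch only, the area can be computed with the rank-sum identity: it equals the fraction of (positive, negative) pairs in which the positive scores higher. The function name `roc_auc` and the toy scores and labels are hypothetical illustrations and are not taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count pairs in which the positive outscores the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: read counts at six candidate sites,
# with 1 marking a known binding site and 0 a known non-site.
scores = [42, 30, 25, 8, 5, 3]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An area of 1.0 means every true site outscores every non-site, while 0.5 corresponds to random ranking.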


Available from: Richard M Myers, Jul 06, 2015
  • Source
    ABSTRACT: CarD is an essential mycobacterial protein that binds the RNA polymerase (RNAP) and affects the transcriptional profile of Mycobacterium smegmatis and Mycobacterium tuberculosis (6). We predicted that CarD was directly regulating RNAP function, but our prior experiments had not determined at what stage of transcription CarD was functioning or at which genes CarD interacted with the RNAP. To begin to address these open questions, we performed chromatin immunoprecipitation sequencing (ChIP-seq) to survey the distribution of CarD throughout the M. smegmatis chromosome. The distributions of RNAP subunits β and σA were also profiled. Based on work in Escherichia coli (3), we expected that RNAP β would be present throughout transcribed regions and that RNAP σA would be predominantly enriched at promoters; however, this had yet to be determined in mycobacteria. The ChIP-seq analyses revealed that CarD was never present on the genome in the absence of RNAP, was primarily associated with promoter regions, and was highly correlated with the distribution of RNAP σA. The colocalization of σA and CarD led us to propose that, in vivo, CarD associates with RNAP initiation complexes at most promoters and is therefore a global regulator of transcription initiation. Here we describe in detail the data from the ChIP-seq experiments associated with the study published by Srivastava and colleagues in the Proceedings of the National Academy of Sciences in 2013 (5), and we discuss the findings from this dataset in relation to both CarD and mycobacterial transcription as a whole. The ChIP-seq data have been deposited in the Gene Expression Omnibus (GEO) database (accession no. GSE48164).
    12/2014; 2. DOI:10.1016/j.gdata.2014.05.012
  • Source
    ABSTRACT: Single nucleotide polymorphisms (SNPs) in both coding and non-coding regions govern gene function, prompting differential vulnerability to diseases and heterogeneous responses to pharmaceutical regimes and environmental anomalies. These genetic variations may alter an individual's susceptibility to alcohol dependence by remodeling DNA-protein interaction patterns in the prodynorphin (PDYN) and κ-opioid receptor (OPRK1) genes. To elaborate the molecular mechanism underlying these susceptibility differences, we used bioinformatics tools to retrieve differential DNA-protein interactions at PDYN and OPRK1 SNPs significantly associated with alcohol dependence. Our results show allele-specific DNA-protein interactions, pointing to allele-specific mechanisms implicated in differential regulation of gene expression. Several transcription factors, for instance VDR, RXR-alpha, NFYA, the CTF family, USF-1, USF2, ER, AR and predominantly the SP family, show allele-specific binding affinity for the PDYN gene; likewise, GATA, TBP, AP-1, USF-2, C/EBPbeta, Cart-1 and ER interact with OPRK1 SNPs in intron 2 in an allele-specific manner. In short, a single-nucleotide transition may modify DNA-protein interactions at OPRK1 and PDYN SNPs significantly associated with pathology, which may alter individual vulnerability to alcohol dependence.
    Computers in Biology and Medicine 10/2014; 53. DOI:10.1016/j.compbiomed.2014.07.021 · 1.46 Impact Factor
  • Source
    ABSTRACT: Peak calling is a critical step in ChIPseq data analysis. Choosing the correct algorithm, as well as optimized parameters for a specific biological system, is an essential task. In this article, we present an original peak calling method (bPeaks) specifically designed to detect transcription factor (TF) binding sites in small eukaryotic genomes, such as those of yeasts. As TF interactions with DNA are strong and generate high binding signals, bPeaks uses simple parameters to compare the sequences (reads) obtained from the IP (immunoprecipitation) with those from the control DNA (input). Because yeasts have small genomes (<20 Mb), our program has the advantage of using ChIPseq information at the single-nucleotide level and can explore, in a reasonable computational time, results obtained with different sets of parameter values. Graphical outputs and text files are provided to rapidly assess the relevance of the detected peaks. Taking advantage of the simple promoter structure in yeasts, additional functions were implemented in bPeaks to automatically assign the peaks to promoter regions and to retrieve peak coordinates on the DNA sequence for further prediction of regulatory motifs enriched in the list of peaks. Applications of the bPeaks program to three different ChIPseq datasets from Saccharomyces cerevisiae, Candida albicans and Candida glabrata are presented. Each time, bPeaks correctly predicted the DNA binding sequence of the studied TF and provided relevant lists of peaks. The bioinformatics tool bPeaks is freely distributed to academic users. Supplementary data together with detailed tutorials are available online.
    Yeast 10/2014; 31(10). DOI:10.1002/yea.3031 · 1.74 Impact Factor
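
The last abstract above describes comparing IP reads with control (input) reads at single-nucleotide resolution. The following is a rough sketch of that general idea only, not the bPeaks implementation: the function names, the thresholds `min_ip` and `min_log2_ratio`, the pseudocount, and the toy reads are all assumptions made for illustration.

```python
import math


def coverage(reads, genome_len):
    """Per-nucleotide read depth from half-open (start, end) intervals."""
    depth = [0] * genome_len
    for start, end in reads:
        for pos in range(max(start, 0), min(end, genome_len)):
            depth[pos] += 1
    return depth


def call_peaks(ip_reads, control_reads, genome_len,
               min_ip=3, min_log2_ratio=1.0, pseudocount=1.0):
    """Return half-open regions where IP depth is high and enriched over control."""
    ip = coverage(ip_reads, genome_len)
    ctrl = coverage(control_reads, genome_len)
    peaks, start = [], None
    for pos in range(genome_len):
        # The pseudocount avoids division by zero where the control has no reads.
        ratio = math.log2((ip[pos] + pseudocount) / (ctrl[pos] + pseudocount))
        enriched = ip[pos] >= min_ip and ratio >= min_log2_ratio
        if enriched and start is None:
            start = pos                    # open a new region
        elif not enriched and start is not None:
            peaks.append((start, pos))     # close the current region
            start = None
    if start is not None:
        peaks.append((start, genome_len))
    return peaks


# Hypothetical toy reads on a 60-nucleotide "genome".
ip_reads = [(10, 20), (12, 22), (14, 24), (15, 25),
            (40, 50), (40, 50), (40, 50)]
control_reads = [(0, 14), (40, 50), (40, 50), (40, 50)]
peaks = call_peaks(ip_reads, control_reads, genome_len=60)
print(peaks)
```

In this toy input, positions 40 to 49 have high IP depth but equally high control depth, so only the region enriched relative to the control is reported. Per-nucleotide scanning of this kind is practical mainly for small genomes, which is the setting the abstract describes.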