Nucleosome positioning signals in genomic DNA.

Bioinformatics Program, Boston University, Boston, MA 02215, USA.
Genome Research (Impact Factor: 13.85). 09/2007; 17(8):1170-7. DOI: 10.1101/gr.6101007
Source: PubMed

ABSTRACT Although histones can form nucleosomes on virtually any genomic sequence, DNA sequences show considerable variability in their binding affinity. We have used DNA sequences of Saccharomyces cerevisiae whose nucleosome binding affinities have been experimentally determined (Yuan et al. 2005) to train a support vector machine to identify the nucleosome formation potential of any given sequence of DNA. The DNA sequences whose nucleosome formation potential is most accurately predicted are those that contain strong nucleosome-forming or nucleosome-inhibiting signals and are found within nucleosome-length stretches of genomic DNA with continuous nucleosome formation or inhibition signals. We have accurately predicted the experimentally determined nucleosome positions across a well-characterized promoter region of S. cerevisiae and identified strong periodicity within 199 center-aligned mononucleosomes studied recently (Segal et al. 2006), even though no periodicity information was used to train the support vector machine. Our analysis suggests that only a subset of nucleosomes are likely to be positioned by intrinsic sequence signals. This observation is consistent with the available experimental data and is inconsistent with the proposal of a nucleosome positioning code. Finally, we show that intrinsic nucleosome positioning signals are both more inhibitory and more variable in promoter regions than in open reading frames in S. cerevisiae.
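
The abstract does not say how sequences were encoded for the support vector machine. As a minimal sketch only, assuming a k-mer frequency representation (a common choice for sequence classifiers, not confirmed here as the authors' method), a DNA sequence could be turned into a fixed-length feature vector as follows. The function name `kmer_frequencies` and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies of a DNA sequence.

    Assumption: features are overlapping k-mers over the alphabet ACGT.
    Windows containing any other symbol (e.g. N) are ignored.
    """
    seq = seq.upper()
    # Enumerate all 4**k possible k-mers so every vector has the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize only over valid ACGT k-mers.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


# Hypothetical usage: a 16-dimensional dinucleotide profile for one sequence.
features = kmer_frequencies("AATTAATTGC", k=2)
```

Vectors of this kind, computed for sequences with measured high and low nucleosome affinity, are the sort of input a classifier such as a support vector machine could be trained on.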


Available from: Robert E Thurman, Jun 20, 2015
  • Source
    ABSTRACT: The identification of important factors that affect nucleosome formation is critical to clarify nucleosome-forming mechanisms and the role of the nucleosome in gene regulation. Various features reported in the literature led to our hypothesis that multiple features can together contribute to nucleosome formation. Therefore, we compiled 779 features and developed a pattern discovery and scoring algorithm, FFN (Finding Features for Nucleosomes), to identify feature patterns that are differentially enriched in nucleosome-forming sequences and nucleosome-depletion sequences. Applying FFN to genome-wide nucleosome occupancy data in yeast and human, we identified statistically significant feature patterns that may influence nucleosome formation, many of which are common to the two species. We found that both sequence and structural features are important in nucleosome occupancy prediction. We discovered that, even for the same feature combinations, variations in feature values may lead to differences in predictive power. We demonstrated that the identified feature patterns could be used to assist nucleosomal sequence prediction.
    Genomics 08/2014; 104(2). DOI:10.1016/j.ygeno.2014.07.002 · 2.79 Impact Factor
  • Source
    ABSTRACT: The nucleosome is a fundamental structural and functional chromatin unit that affects nearly all DNA-templated events in eukaryotic genomes. It is also a biochemical substrate for higher-order, cis-acting gene-expression codes and the monomeric structural unit for chromatin packaging at multiple scales. To predict the nucleosome landscape of a model plant genome, we used a Support Vector Machine (SVM) computational algorithm trained on human chromatin to predict the nucleosome occupancy likelihood (NOL) across the maize (Zea mays L.) genome. Experimentally validated NOL plots provide a novel genomic annotation that highlights gene structures, repetitive elements, and chromosome-scale domains likely to reflect regional gene density. We established a new genome browser ( for viewing SVM-based NOL scores. This annotation provides sequence-based comprehensive coverage across the entire genome, including repetitive genomic regions typically excluded from experimental genomics data. We find that transposable elements often displayed family-specific NOL profiles that included distinct regions, especially near their termini, predicted to have strong affinities for nucleosomes. We examined transcription-start-site consensus NOL plots for maize gene sets and discovered that most maize genes display a typical +1 nucleosome positioning signal just downstream of the start site, but not upstream. This overall lack of a -1 nucleosome positioning signal was also predicted by our method for Arabidopsis genes, and verified by additional analysis of previously published Arabidopsis MNase-Seq data, revealing a general feature of plant promoters. Our study advances plant chromatin research by defining the potential contribution of DNA sequence to observed nucleosome positioning, and provides an invariant baseline annotation against which other genomic data can be compared.
    Plant Physiology 04/2013; DOI:10.1104/pp.113.216432 · 7.39 Impact Factor
  • Source
    ABSTRACT: Gene regulation at functional elements (e.g., enhancers, promoters, insulators) is governed by an interplay of nucleosome remodeling, histone modifications, and transcription factor binding. To enhance our understanding of gene regulation, the ENCODE Consortium has generated a wealth of ChIP-seq data on DNA-binding proteins and histone modifications. We additionally generated nucleosome positioning data on two cell lines, K562 and GM12878, by MNase digestion and high-depth sequencing. Here we relate 14 chromatin signals (12 histone marks, DNase, and nucleosome positioning) to the binding sites of 119 DNA-binding proteins across a large number of cell lines. We developed a new method for unsupervised pattern discovery, the Clustered AGgregation Tool (CAGT), which accounts for the inherent heterogeneity in signal magnitude, shape, and implicit strand orientation of chromatin marks. We applied CAGT on a total of 5084 data set pairs to obtain an exhaustive catalog of high-resolution patterns of histone modifications and nucleosome positioning signals around bound transcription factors. Our analyses reveal extensive heterogeneity in how histone modifications are deposited, and how nucleosomes are positioned around binding sites. With the exception of the CTCF/cohesin complex, asymmetry of nucleosome positioning is predominant. Asymmetry of histone modifications is also widespread, for all types of chromatin marks examined, including promoter, enhancer, elongation, and repressive marks. The fine-resolution signal shapes discovered by CAGT unveiled novel correlation patterns between chromatin marks, nucleosome positioning, and sequence content. Meta-analyses of the signal profiles revealed a common vocabulary of chromatin signals shared across multiple cell lines and binding proteins.
    Genome Research 09/2012; 22(9):1735-47. DOI:10.1101/gr.136366.111 · 13.85 Impact Factor