DNA physical properties determine nucleosome occupancy from yeast to fly

Laboratoire Statistique et Génome, CNRS/INRA/UEVE, 523 place des Terrasses, 91000 Evry, France.
Nucleic Acids Research (Impact Factor: 9.11). 06/2008; 36(11):3746-56. DOI: 10.1093/nar/gkn262
Source: PubMed


Nucleosome positioning plays an essential role in cellular processes by modulating accessibility of DNA to proteins. Here, using only sequence-dependent DNA flexibility and intrinsic curvature, we predict the nucleosome occupancy along the genomes of Saccharomyces cerevisiae and Drosophila melanogaster and demonstrate the predictive power and universality of our model through its correlation with experimentally determined nucleosome occupancy data. In yeast promoter regions, the computed average nucleosome occupancy closely superimposes with experimental data, exhibiting a <200 bp region unfavourable for nucleosome formation bordered by regions that facilitate nucleosome formation. In the fly, our model faithfully predicts promoter strength as encoded in distinct chromatin architectures characteristic of strongly and weakly expressed genes. We also predict that nucleosomes are repositioned by active mechanisms at the majority of fly promoters. Our model uses only basic physical properties to describe the wrapping of DNA around the histone core, yet it captures a substantial part of chromatin's structural complexity, thus leading to a much better prediction of nucleosome occupancy than methods based merely on periodic curved DNA motifs. Our results indicate that the physical properties of the DNA chain, and not just the regulatory factors and chromatin-modifying enzymes, play key roles in eukaryotic transcription.
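The abstract evaluates the model by its correlation with experimentally determined occupancy. As a minimal sketch of that kind of comparison, the snippet below computes a Pearson correlation coefficient between two occupancy profiles sampled at the same genomic positions. The function name `pearson` and the toy profiles are hypothetical illustrations, not the authors' code or data, and the source does not state which correlation measure was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy profiles (NOT data from the paper): occupancy values
# at six consecutive positions, with a depleted region in the middle.
predicted = [0.8, 0.7, 0.2, 0.1, 0.3, 0.9]
experimental = [0.9, 0.6, 0.3, 0.1, 0.2, 0.8]

r = pearson(predicted, experimental)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 means the predicted profile rises and falls with the measured one; it says nothing about absolute occupancy levels, which is why profiles are often also compared visually, as in the superimposed promoter averages the abstract mentions.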

Available from: Thierry Grange
  • Source
    • "It is also possible that the observed widespread transcription activity of interband regions results from the static physical properties of interband DNA, such as sequence-dependent DNA flexibility, which may create nucleosome-free regions at promoters. Such regions may serve as “entry points” to recruit proteins promoting further binding of transcription factors, chromatin remodelers, etc. [66]."
    ABSTRACT: Drosophila melanogaster polytene chromosomes display a specific banding pattern; the underlying genetic organization of this pattern has remained elusive for many years. In the present paper, we analyze 32 cytology-mapped polytene chromosome interbands. We estimate the molecular locations of these interbands, describe their molecular and genetic organization, and demonstrate that polytene chromosome interbands contain the 5' ends of housekeeping genes. As a rule, interbands display a preferential "head-to-head" orientation of genes. They are enriched for "broad" class promoters characteristic of housekeeping genes and associate with open chromatin proteins and Origin Recognition Complex (ORC) components. In two regions, 10A and 100B, coding sequences of genes whose 5' ends reside in interbands map to constantly loosely compacted, early-replicating, so-called "grey" bands. Comparison of the expression patterns of genes mapping to late-replicating dense bands with those of genes whose promoter regions map to interbands shows that the former are generally tissue-specific, whereas the latter are ubiquitously active. Analysis of RNA-seq data (modENCODE-FlyBase) indicates that transcripts from interband-mapping genes are present in most tissues and cell lines studied, across most developmental stages and under various treatment conditions. We developed a special algorithm to computationally process protein localization data generated by the modENCODE project and show that the Drosophila genome has about 5700 sites that demonstrate all the features shared by the interbands cytologically mapped to date.
    PLoS ONE 07/2014; 9(7):e101631. DOI:10.1371/journal.pone.0101631 · 3.23 Impact Factor
  • Source
    • "Their model captures a substantial part of chromatin's structural complexity, thus leading to a much better prediction of nucleosome occupancy than the methods based only on periodic curved DNA motifs (Miele et al., 2008). Inspired by Miele's work (Miele et al., 2008), in this paper, the DNA local structural properties were considered to define PseKNC. Generally speaking, the spatial arrangements of two neighboring base pairs are characterized by six parameters (Dickerson, 1989), of which three are local translational parameters and the other three are local angular parameters, as summarized in "
    ABSTRACT: Nucleosome positioning participates in many cellular activities and plays significant roles in regulating cellular processes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying nucleosome positioning. Although some computational methods have been proposed, most of them are species-specific and neglect the intrinsic local structural properties that might play important roles in determining nucleosome positioning on a DNA sequence. Here a predictor called "INUC-PSEKNC" was developed for predicting nucleosome positioning in the Homo sapiens, Caenorhabditis elegans, and Drosophila melanogaster genomes. In the new predictor, the DNA sequence samples were formulated by a novel feature vector called "pseudo k-tuple nucleotide composition", into which six DNA local structural properties were incorporated. Rigorous cross-validation tests on the three stringent benchmark datasets showed that the overall success rates achieved by INUC-PSEKNC in predicting the nucleosome positioning of the aforementioned three genomes were 86.27%, 86.90% and 79.97%, respectively. Meanwhile, the results obtained by INUC-PSEKNC on various benchmark datasets used by previous investigators for different genomes also indicated that the current predictor remarkably outperformed its counterparts. A user-friendly web server for INUC-PSEKNC is freely accessible.
    Bioinformatics 02/2014; 30(11). DOI:10.1093/bioinformatics/btu083 · 4.98 Impact Factor
  • Source
    • "Furthermore, promoter regions are constrained by factors other than transcription factor binding alone. For example, it was shown that promoter regions possess special structural properties (melting temperature, curvature, bendability, and stability) that are conserved in evolution (Kanhere and Bansal, 2005) and that have been shown to influence gene expression, for example, via determining nucleosome occupancies of promoter regions (Miele et al., 2008). Thus, promoter regions are, in general, more conserved (Fig. 2), likely explaining also the only small difference in SNP density within and outside motifs."
    ABSTRACT: Identifying regulatory elements and revealing their role in gene expression regulation remains a central goal of plant genome research. We exploited the detailed genomic sequencing information of a large number of Arabidopsis thaliana accessions to characterize known and to identify novel cis-regulatory elements in gene promoter regions of Arabidopsis by relying on conservation as the hallmark signal of functional relevance. Based on the genomic layout and the obtained density profiles of single nucleotide polymorphisms (SNPs) in sequence regions upstream of transcription start sites, the average length of promoter regions in Arabidopsis could be established at 500 bp. Genes associated with a high degree of variability in their respective upstream regions are preferentially involved in environmental response and signaling processes, while low levels of promoter SNP density are common among housekeeping genes. Known cis-elements were found to exhibit a lower SNP density than sequence regions not associated with known motifs. For 15 known cis-element motifs, strong positional preferences relative to the transcription start site were detected based on their promoter SNP density profiles. Five novel candidate cis-element motifs were identified as consensus motifs of 17 sequence hexamers exhibiting increased sequence conservation combined with evidence of positional preferences, annotation information, and functional relevance for inducing correlated gene expression. Our study demonstrates that the currently available resolution of SNP data offers novel ways for the identification of functional genomic elements and the characterization of gene promoter sequences.
    Plant Physiology 11/2013; 164(1). DOI:10.1104/pp.113.229716 · 6.84 Impact Factor