The role of DNA shape in protein-DNA recognition

Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 Saint Nicholas Avenue, New York, New York 10032, USA.
Nature (Impact Factor: 41.46). 10/2009; 461(7268):1248-53. DOI: 10.1038/nature08473
Source: PubMed


The recognition of specific DNA sequences by proteins is thought to depend on two types of mechanism: one that involves the formation of hydrogen bonds with specific bases, primarily in the major groove, and one involving sequence-dependent deformations of the DNA helix. By comprehensively analysing the three-dimensional structures of protein-DNA complexes, here we show that the binding of arginine residues to narrow minor grooves is a widely used mode for protein-DNA recognition. This readout mechanism exploits the phenomenon that narrow minor grooves strongly enhance the negative electrostatic potential of the DNA. The nucleosome core particle offers a prominent example of this effect. Minor-groove narrowing is often associated with the presence of A-tracts, AT-rich sequences that exclude the flexible TpA step. These findings indicate that the ability to detect local variations in DNA shape and electrostatic potential is a general mechanism that enables proteins to use information in the minor groove, which otherwise offers few opportunities for the formation of base-specific hydrogen bonds, to achieve DNA-binding specificity.
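The abstract characterizes A-tracts as AT-rich sequences that exclude the flexible TpA step. A minimal Python sketch of how such stretches might be located in a sequence follows. The function name `find_a_tracts` and the `min_len=4` threshold are illustrative assumptions, not definitions taken from the abstract, and the sketch says nothing about the actual minor-groove width or electrostatic potential, which the study derives from three-dimensional structures.

```python
import re


def find_a_tracts(seq, min_len=4):
    """Return (start, end, tract) tuples for A/T-only runs with no TpA step.

    min_len is an assumed threshold chosen for illustration; the abstract
    does not state a minimum length. Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    tracts = []
    for run in re.finditer(r"[AT]+", seq):
        start = run.start()
        text = run.group()
        seg_start = 0
        # Walk the A/T run and cut it at every TpA step: the T closes
        # one segment and the A opens the next.
        for i in range(1, len(text) + 1):
            at_end = i == len(text)
            if at_end or (text[i - 1] == "T" and text[i] == "A"):
                if i - seg_start >= min_len:
                    tracts.append((start + seg_start, start + i, text[seg_start:i]))
                seg_start = i
    return tracts
```

For example, `find_a_tracts("GCAAAATTTTGC")` reports the single stretch `AAAATTTT`, whereas `find_a_tracts("GCTTTTAAAAGC")` reports `TTTT` and `AAAA` separately because the TpA step between them splits the run.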



Available from: Remo Rohs
  • Source
    • "These previous applications of the CSO model relied on expectation values of the conformational parameters that are generated by considering DNA as either a homogenous ideally straight fragment or a periodic curved fragment. Importantly, based on the idea that the physicochemical properties of DNA play an important role in protein-DNA interaction [33, 34, 35], and this approach recently shed light on the interplay between DNA flexibility and protein binding [7, 36]. In this study, we evaluate the role of static bending in computationally determined DNA cyclization rates by applying the CSO model with several sets of expectation values of the conformational parameters. "
    ABSTRACT: The intrinsic bendability of DNA plays an important role in a myriad of essential cellular mechanisms. The flexibility of a DNA fragment can be experimentally and computationally examined by its propensity for cyclization, quantified by the Jacobson-Stockmayer J factor. In this study, we use a well-established coarse-grained three-dimensional model of DNA and seven distinct sets of experimentally and computationally derived conformational parameters of the double helix to evaluate the role of structural parameters in calculating DNA cyclization. We calculate the cyclization rates of 86 DNA sequences with previously measured J factors and lengths between 57 and 325 bp as well as of 20,000 randomly generated DNA sequences with lengths between 350 and 4000 bp. Our comparison with experimental data is complemented with analysis of simulated data. Our data demonstrate that all sets of parameters yield very similar results for longer DNA fragments, regardless of the nucleotide sequence, which are in agreement with experimental measurements. However, for DNA fragments shorter than 100 bp, all sets of parameters performed poorly, yielding results with several orders of magnitude difference from the experimental measurements. Our data show that DNA cyclization rates calculated using conformational parameters based on nucleosome packaging data are most similar to the experimental measurements. Overall, our study provides a comprehensive large-scale assessment of the role of structural parameters in calculating DNA cyclization rates.
    Full-text · Article · Feb 2016 · BMC Bioinformatics
  • Source
    • "Arg154, Arg157, Asn160, Arg167, Ser183, and Ser186 were some of the major DNA-binding residues of Sox2, and Gln44, Thr45, Arg49, Ser56, Arg95, Arg105, and Gln146 were some of the major Oct4 residues that participated in DNA-binding interactions (S3B Fig). In addition, the residues Arg20, Arg225, Lys223, and Arg228 were involved in hydrogen bonding with bases DG11, DG10, DT9 and DT9, respectively (at the 3 base pairs separated binding sites) and may play an important role in heterodimerization of the Oct4/Sox2 3bp complex (S4 Fig). Hydrogen-bond interactions were studied to explore DNA-protein interactions, as summarized in S2 Table. AT-rich sequences are generally more flexible [45] and, hence, bending occurred at the AT-rich site for both complexes. Experimental evidence indicated that the 3 base pairs separated complex Oct1/Sox2/FGF4 exhibited bending in DC3.DG47 and DT6. "
    ABSTRACT: The octamer-binding transcription factor 4 (Oct4) and sex-determining region Y (SRY)-box 2 (Sox2) proteins induce various transcriptional regulators to maintain cellular pluripotency. Most Oct4/Sox2 complexes have either 0 base pairs (Oct4/Sox2 0bp) or 3 base pairs (Oct4/Sox2 3bp) separation between their DNA-binding sites. Results from previous biochemical studies have shown that the complexes separated by 0 base pairs are associated with a higher pluripotency rate than those separated by 3 base pairs. Here, we performed molecular dynamics (MD) simulations and calculations to determine the binding free energy and per-residue free energy for the Oct4/Sox2 0bp and Oct4/Sox2 3bp complexes to identify structural differences that contribute to differences in induction rate. Our MD simulation results showed substantial differences in Oct4/Sox2 domain movements, as well as secondary-structure changes in the Oct4 linker region, suggesting a potential reason underlying the distinct efficiencies of these complexes during reprogramming. Moreover, we identified key residues and hydrogen bonds that potentially facilitate protein-protein and protein-DNA interactions, in agreement with previous experimental findings. Consequently, our results confirm that differential spacing of the Oct4/Sox2 DNA binding sites can determine the magnitude of transcription of the targeted genes during reprogramming.
    Full-text · Article · Jan 2016 · PLoS ONE
  • Source
    • "Remarkably, the CXC domain uses a single arginine to directly read out dinucleotide sequences from the minor groove of DNA, distinct from other DNA-binding domains that commonly recognize DNA sequence from the major groove with large secondary structure elements (Freemont et al. 1991). Arginine has been documented to interact with the minor groove but, in most cases, indirectly reads out DNA sequences by binding narrow minor grooves adopted by AT-rich sequences (Rohs et al. 2009). A single small CXC domain offers only limited binding specificity and affinity. "
    ABSTRACT: The male-specific lethal dosage compensation complex (MSL-DCC) selectively assembles on the X chromosome in Drosophila males and activates gene transcription by twofold through histone acetylation. An MSL recognition element (MRE) sequence motif nucleates the initial MSL association, but how it is recognized remains unknown. Here, we identified the CXC domain of MSL2 specifically recognizing the MRE motif and determined its crystal structure bound to specific and nonspecific DNAs. The CXC domain primarily contacts one strand of DNA duplex and employs a single arginine to directly read out dinucleotide sequences from the minor groove. The arginine is flexible when bound to nonspecific sequences. The core region of the MRE motif harbors two binding sites on opposite strands that can cooperatively recruit a CXC dimer. Specific DNA-binding mutants of MSL2 are impaired in MRE binding and X chromosome localization in vivo. Our results reveal multiple dynamic DNA-binding modes of the CXC domain that target the MSL-DCC to X chromosomes.
    Full-text · Article · Dec 2014 · Genes & Development