Variation in Homeodomain DNA Binding Revealed by High-Resolution Analysis of Sequence Preferences

Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA.
Cell (Impact Factor: 32.24). 07/2008; 133(7):1266-76. DOI: 10.1016/j.cell.2008.05.024
Source: PubMed


Most homeodomains are unique within a genome, yet many are highly conserved across vast evolutionary distances, implying strong selection on their precise DNA-binding specificities. We determined the binding preferences of the majority (168) of mouse homeodomains to all possible 8-base sequences, revealing rich and complex patterns of sequence specificity and showing that there are at least 65 distinct homeodomain DNA-binding activities. We developed a computational system that successfully predicts binding sites for homeodomain proteins as distant from mouse as Drosophila and C. elegans, and we infer full 8-mer binding profiles for the majority of known animal homeodomains. Our results provide an unprecedented level of resolution in the analysis of this simple domain structure and suggest that variation in sequence recognition may be a factor in its functional diversity and evolutionary success.
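To give a sense of the size of the "all possible 8-base sequences" space mentioned above, the following is a minimal illustrative sketch, not the authors' pipeline. It assumes that a sequence and its reverse complement are treated as the same double-stranded site; the function names are hypothetical.

```python
from itertools import product

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase ACGT only)."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k=8):
    """Enumerate all k-mers, merging each with its reverse complement.

    Assumption for illustration: a k-mer and its reverse complement describe
    the same double-stranded site, so only the lexicographically smaller of
    the pair is kept.
    """
    seen = set()
    for bases in product("ACGT", repeat=k):
        kmer = "".join(bases)
        seen.add(min(kmer, reverse_complement(kmer)))
    return sorted(seen)


if __name__ == "__main__":
    print(4 ** 8)                   # number of 8-mers on a single strand
    print(len(canonical_kmers(8)))  # number after merging reverse complements
```

For k = 8 there are 4^8 = 65,536 single-stranded sequences; merging reverse complements leaves (4^8 + 4^4)/2 = 32,896 distinct sites, because the 4^4 palindromic 8-mers are their own reverse complements.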



    • "Even though NKX2-5Y191C cannot bind DNA, the most overrepresented motifs were 5′AAGTGT3′ (NKE), 5′GATAA3′ (GATA), and 5′TGCCAA3′ (NF1-like), exactly as for NKX2-5 WT fusion (Supplementary file 1). NKX2-5Y191C peaks also showed overrepresentation of the motif 5′TAATC3′, which is similar to the binding sites of many non-NK-2 class HD proteins, including HOX proteins, as well as that of NK-2 proteins that lack HD tyrosine 54, such as NKX1-2 (Berger et al., 2008). NK-2 class proteins which do carry Y54 in their HDs, including NKX2-5, also bind to this HOX-like site, albeit with a 10-fold-reduced affinity compared to that of the NKE (Chen and Schwartz, 1995)."
    ABSTRACT: To model cardiac gene regulatory networks in health and disease we used DamID to establish robust target gene sets for the cardiac homeodomain factor NKX2-5 and two congenital heart disease-associated mutants carrying a crippled homeodomain, which normally functions as DNA- and protein-binding interface. Despite compromised direct DNA-binding, NKX2-5 mutants retained partial functionality and bound hundreds of targets, including NKX2-5 wild type targets and unique sets of 'off-targets'. NKX2-5∆HD, which lacks the entire homeodomain, could still dimerise with wild type NKX2-5 and its cofactors, including newly-discovered cofactors of the ETS family, through the conserved tyrosine-rich domain (YRD). NKX2-5∆HD off-targets showed overrepresentation of many binding motifs, including ETS motifs, the majority co-occupied by ETS proteins as determined by DamID. Off-targets of an NKX2-5 YRD mutant were not enriched in ETS targets. Our study reveals off-target binding and transcriptional activity for NKX2-5 mutations driven in part by cofactor interactions, suggesting a novel type of gain-of-function in congenital heart disease.
    eLife Sciences 07/2015; 4. DOI:10.7554/eLife.06942 · 9.32 Impact Factor
    • "These differentially expressed genes represent both direct and indirect targets of Rx3. The presence of RAX-binding motifs as defined by a position weight matrix in the de-regulated genes was shown to further support the contention that some of the genetic network is directly regulated by Rx3 [8, 56]. The core of the RAX binding motif is a short sequence “TAATTA”, thus it is relatively common to find this sequence in the zebrafish genome (about 1 in every 4 kb)."
    ABSTRACT: Background: The genetic cascades underpinning vertebrate early eye morphogenesis are poorly understood. One gene family essential for eye morphogenesis encodes the retinal homeobox (Rx) transcription factors. Mutations in the human retinal homeobox gene (RAX) can lead to gross morphological phenotypes ranging from microphthalmia to anophthalmia. Zebrafish rx3 null mutants produce a similar striking eyeless phenotype with an associated expanded forebrain. Thus, we used zebrafish rx3-/- mutants as a model to uncover an Rx3-regulated gene network during early eye morphogenesis. Results: Rx3-regulated genes were identified using whole transcriptomic sequencing (RNA-seq) of rx3-/- mutants and morphologically wild-type siblings during optic vesicle morphogenesis. A gene co-expression network was then constructed for the Rx3-regulated genes, identifying gene cross-talk during early eye development. Genes highly connected in the network are hub genes, which tend to exhibit higher expression changes between rx3-/- mutants and normal phenotype siblings. Hub genes down-regulated in rx3-/- mutants encompass homeodomain transcription factors and mediators of retinoid-signaling, both associated with eye development and known human eye disorders. In contrast, genes up-regulated in rx3-/- mutants are centered on Wnt signaling pathways, associated with brain development and disorders. The temporal expression pattern of Rx3-regulated genes was further profiled during early development from maternal stage until visual function is fully mature. Rx3-regulated genes exhibited synchronized expression patterns, and a transition of gene expression during the early segmentation stage when Rx3 was highly expressed. Furthermore, most of these deregulated genes are enriched with multiple RAX-binding motif sequences on the gene promoter. Conclusions: Here, we assembled a comprehensive model of Rx3-regulated genes during early eye morphogenesis. Rx3 promotes optic vesicle morphogenesis and represses brain development through a highly correlated and modulated network, exhibiting repression of genes mediating Wnt signaling and concomitant enhanced expression of homeodomain transcription factors and retinoid-signaling genes. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-825) contains supplementary material, which is available to authorized users.
    BMC Genomics 09/2014; 15(1):825. DOI:10.1186/1471-2164-15-825 · 3.99 Impact Factor
    • "The inference scheme described here relies on the high degree of conservation among DBDs. Indeed, our analyses confirm the 'deep homology' that has been described for metazoan developmental processes and the TFs that regulate them (e.g., homeodomains) (Berger et al., 2008; Carroll, 2008; Noyes et al., 2008) and furthermore indicate that deep homology is a property of the sequence preferences of many TFs in all eukaryotic kingdoms. Our initial analyses (data not shown) suggest that many motifs likely date to the base of metazoans, land plants, angiosperms (flowering plants), or euteleostomi (bony vertebrates), consistent with well-established TF expansions in these lineages (de Mendoza et al., 2013; Weirauch and Hughes, 2011)."
    ABSTRACT: Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.
    Cell 09/2014; 158(6):1431-43. DOI:10.1016/j.cell.2014.08.009 · 32.24 Impact Factor