Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

Neural Cell-Fate Determinants Section, NINDS, NIH, Bethesda, Maryland 20892, USA.
Developmental Dynamics (Impact Factor: 2.38). 01/2012; 241(1):169-89. DOI: 10.1002/dvdy.22728
Source: PubMed


Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs.
We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and to analyze their structural organization.
cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process.
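The search strategy summarized above (find repeated elements in an input CSC, then score database CSCs by shared repeats, uniquely shared matches, and balance) can be illustrated with a toy sketch. This is not the cis-Decoder implementation: the fixed word length `k`, the 2:1 weighting of repeats over single-copy matches, the definition of `balance`, and the function names `kmers`, `score_csc`, and `rank_database` are all assumptions made here for illustration only.

```python
from collections import Counter


def kmers(seq, k=6):
    """Count every overlapping word of length k in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def score_csc(query, candidate, k=6):
    """Toy comparison of two conserved sequences (all scoring rules are assumptions)."""
    q, c = kmers(query, k), kmers(candidate, k)
    shared = set(q) & set(c)
    # A shared word that occurs more than once in the query is treated as a "repeat".
    shared_repeats = sum(1 for w in shared if q[w] > 1)
    shared_unique = len(shared) - shared_repeats
    # Hypothetical balance: ratio of shared-word occurrences in the two sequences.
    occ_q = sum(q[w] for w in shared)
    occ_c = sum(c[w] for w in shared)
    balance = min(occ_q, occ_c) / max(occ_q, occ_c) if shared else 0.0
    return {
        "shared_repeats": shared_repeats,
        "shared_unique": shared_unique,
        "balance": balance,
        # Assumed weighting: repeats count twice as much as single-copy matches.
        "score": 2 * shared_repeats + shared_unique,
    }


def rank_database(query, database, k=6):
    """Return (name, result) pairs for a dict of name -> sequence, best score first."""
    results = [(name, score_csc(query, seq, k)) for name, seq in database.items()]
    return sorted(results, key=lambda item: item[1]["score"], reverse=True)
```

A call such as `rank_database(input_csc, {"csc_a": "...", "csc_b": "..."})` would return the database entries ordered by the assumed score, with the balance value reported alongside so that lopsided matches can be inspected separately.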

  • ABSTRACT: In the developing CNS, unique functional identities among neurons and glia are, in part, established as a result of successive transitions in gene expression programs within neural precursor cells. One of the temporal-identity windows within Drosophila CNS neural precursor cells, or neuroblasts (NBs), is marked by the expression of a zinc-finger transcription factor (TF) gene, castor (cas). Our analysis of cis-regulatory DNA within a cas loss-of-function rescue fragment has identified seven enhancers that independently activate reporter transgene expression in specific sub-patterns of the wild-type embryonic cas gene expression domain. Most of these enhancers also regulate different aspects of cas expression within the larval and adult CNS. Phylogenetic footprinting reveals that each enhancer is made up of clusters of highly conserved DNA sequence blocks that are flanked by less-conserved inter-cluster spacer sequences. Comparative analysis of the conserved DNA also reveals that cas enhancers share different combinations of sequence elements and that many of these shared elements contain core DNA-binding recognition motifs for characterized temporal-identity TFs. Intra-species alignments show that two of the sub-pattern enhancers originated from an inverted duplication and that this repeat is unique to the cas locus in all sequenced Drosophila species. Finally, we show that three of the enhancers differentially require cas function for their wild-type regulatory behavior: Cas limits the expression of one enhancer, while two others require cas function for full expression. These studies represent a starting point for the further analysis of cas gene expression and the TFs that regulate it.
    Gene Expression Patterns 06/2012; 12(7-8):261-72. DOI:10.1016/j.gep.2012.05.004 · 1.38 Impact Factor
  • ABSTRACT: The field of regulatory genomics today is characterized by the generation of high-throughput data sets that capture genome-wide transcription factor (TF) binding, histone modifications, or DNase I hypersensitive regions across many cell types and conditions. In this context, a critical question is how to make optimal use of these publicly available datasets when studying transcriptional regulation. Here, we address this question in Drosophila melanogaster, for which a large number of high-throughput regulatory datasets are available. We developed i-cisTarget (where the 'i' stands for integrative), which for the first time enables the discovery of different types of enriched 'regulatory features' in a set of co-regulated sequences in one analysis, these being either TF motifs or 'in vivo' chromatin features, or combinations thereof. We have validated our approach on 15 co-expressed gene sets, 21 ChIP data sets, 628 curated gene sets and multiple individual case studies, and show that meaningful regulatory features can be confidently discovered; that bona fide enhancers can be identified, both by in vivo events and by TF motifs; and that combinations of in vivo events and TF motifs further increase the performance of enhancer prediction.
    Nucleic Acids Research 06/2012; 40(15):e114. DOI:10.1093/nar/gks543 · 9.11 Impact Factor
  • ABSTRACT: Coding region alterations of ZIC2 are the second most common type of mutation in holoprosencephaly (HPE). Here we use several complementary bioinformatic approaches to identify ultraconserved cis-regulatory sequences potentially driving the expression of human ZIC2. We demonstrate that an 804 bp element in the 3' untranslated region (3'UTR) is highly conserved across the evolutionary history of vertebrates from fish to humans. Furthermore, we show that while genetic variation of this element is unexpectedly common among holoprosencephaly subjects (6/528 or >1%), it is not present in control individuals. Two of six proband-unique variants are de novo, supporting their pathogenic involvement in HPE outcomes. These findings support a general recommendation that the identification and analysis of key ultraconserved elements should be incorporated into the genetic risk assessment of holoprosencephaly cases.
    PLoS ONE 07/2012; 7(7):e39026. DOI:10.1371/journal.pone.0039026 · 3.23 Impact Factor