Evidence for stabilizing selection in a eukaryotic enhancer element.
ABSTRACT: Eukaryotic gene expression is mediated by compact cis-regulatory modules, or enhancers, which are bound by specific sets of transcription factors. The combinatorial interaction of these bound transcription factors determines time- and tissue-specific gene activation or repression. The even-skipped stripe 2 element controls the expression of the second transverse stripe of even-skipped messenger RNA in Drosophila melanogaster embryos, and is one of the best characterized eukaryotic enhancers. Although even-skipped stripe 2 expression is strongly conserved in Drosophila, the stripe 2 element itself has undergone considerable evolutionary change in its binding-site sequences and the spacing between them. We have investigated this apparent contradiction, and here we show that two chimaeric enhancers, constructed by swapping the 5' and 3' halves of the native stripe 2 elements of two species, no longer drive expression of a reporter gene in the wild-type pattern. Sequence differences between species have functional consequences, therefore, but they are masked by other co-evolved differences. On the basis of these results, we present a model for the evolution of eukaryotic regulatory sequences.
ABSTRACT: The basic body plan and major physiological axes have been highly conserved during mammalian evolution, yet only a small fraction of the human genome sequence appears to be subject to evolutionary constraint. To quantify cis- versus trans-acting contributions to mammalian regulatory evolution, we performed genomic DNase I footprinting of the mouse genome across 25 cell and tissue types, collectively defining ∼8.6 million transcription factor (TF) occupancy sites at nucleotide resolution. Here we show that mouse TF footprints conjointly encode a regulatory lexicon that is ∼95% similar with that derived from human TF footprints. However, only ∼20% of mouse TF footprints have human orthologues. Despite substantial turnover of the cis-regulatory landscape, nearly half of all pairwise regulatory interactions connecting mouse TF genes have been maintained in orthologous human cell types through evolutionary innovation of TF recognition sequences. Furthermore, the higher-level organization of mouse TF-to-TF connections into cellular network architectures is nearly identical with human. Our results indicate that evolutionary selection on mammalian gene regulation is targeted chiefly at the level of trans-regulatory circuitry, enabling and potentiating cis-regulatory plasticity.
Nature 11/2014; 515(7527):365-70. DOI:10.1038/nature13972 · 42.35 Impact Factor
ABSTRACT: Gene expression is regulated through the activity of transcription factors (TFs) and chromatin-modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods have led to an explosion of both computational and empirical methods for CRM discovery in model and nonmodel organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against TFs or histone post-translational modifications, identification of nucleosome-depleted 'open' chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted TF-binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. WIREs Dev Biol 2015, 4:59-84. 
doi: 10.1002/wdev.168. The authors have declared no conflicts of interest for this article. © 2014 Wiley Periodicals, Inc. 12/2014; 4(2).