Massively parallel functional dissection of mammalian enhancers

Department of Genome Sciences, University of Washington, Seattle, Washington, USA.
Nature Biotechnology (Impact Factor: 41.51). 02/2012; 30(3):265-70. DOI: 10.1038/nbt.2136
Source: PubMed


The functional consequences of genetic variation in mammalian regulatory elements are poorly understood. We report the in vivo dissection of three mammalian enhancers at single-nucleotide resolution through a massively parallel reporter assay. For each enhancer, we synthesized a library of >100,000 mutant haplotypes with 2-3% divergence from the wild-type sequence. Each haplotype was linked to a unique sequence tag embedded within a transcriptional cassette. We introduced each enhancer library into mouse liver and measured the relative activities of individual haplotypes en masse by sequencing the transcribed tags. Linear regression analysis yielded highly reproducible estimates of the effect of every possible single-nucleotide change on enhancer activity. The functional consequence of most mutations was modest, with ∼22% affecting activity by >1.2-fold and ∼3% by >2-fold. Several, but not all, positions with higher effects showed evidence for purifying selection, or co-localized with known liver-associated transcription factor binding sites, demonstrating the value of empirical high-resolution functional analysis.
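The regression step described in the abstract can be illustrated with a toy model. The sketch below is not the authors' pipeline: it assumes a purely additive model in which the log activity of a haplotype is the sum of per-mutation effects, and fits those effects by ordinary least squares. All function names, the 6-bp wild-type sequence, the mutation rate, and the noise level are hypothetical, chosen so that a short sequence carries enough mutations to fit.

```python
import random

BASES = "ACGT"

def feature_names(wild_type):
    """Names like 'A1C': reference base, 1-based position, mutant base."""
    names = ["intercept"]
    for pos, ref in enumerate(wild_type):
        names += [f"{ref}{pos + 1}{alt}" for alt in BASES if alt != ref]
    return names

def encode(haplotype, wild_type):
    """Indicator row: 1 for the intercept, then 3 columns per position."""
    row = [1.0]
    for base, ref in zip(haplotype, wild_type):
        row += [1.0 if base == alt else 0.0 for alt in BASES if alt != ref]
    return row

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_effects(haplotypes, log_activities, wild_type, ridge=1e-9):
    """Least-squares fit via the normal equations (tiny ridge for stability)."""
    rows = [encode(h, wild_type) for h in haplotypes]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_activities)) for i in range(k)]
    for i in range(k):
        xtx[i][i] += ridge
    return dict(zip(feature_names(wild_type), solve(xtx, xty)))

def mutate(wild_type, rate, rng):
    """Mutate each position independently to a random non-reference base."""
    return "".join(
        rng.choice([b for b in BASES if b != ref]) if rng.random() < rate else ref
        for ref in wild_type
    )

def log_activity(haplotype, wild_type, effects, noise_sd=0.0, rng=None):
    """Assumed additive model: sum of per-mutation effects plus optional noise."""
    total = sum(
        effects[f"{ref}{pos + 1}{base}"]
        for pos, (base, ref) in enumerate(zip(haplotype, wild_type))
        if base != ref
    )
    if noise_sd > 0 and rng is not None:
        total += rng.gauss(0.0, noise_sd)
    return total

if __name__ == "__main__":
    rng = random.Random(0)
    wild_type = "ACGTAC"  # hypothetical toy enhancer
    true = {n: rng.gauss(0.0, 0.3) for n in feature_names(wild_type)[1:]}
    library = [mutate(wild_type, 0.2, rng) for _ in range(500)]
    ys = [log_activity(h, wild_type, true, 0.05, rng) for h in library]
    estimated = fit_effects(library, ys, wild_type)
    for name in list(true)[:5]:
        print(f"{name}: true={true[name]:+.3f} estimated={estimated[name]:+.3f}")
```

Because every haplotype carries several mutations, each measurement informs several coefficients at once, which is what lets a regression recover single-nucleotide effects from a library of multiply mutated sequences. A real analysis would use a numerical library and tag-count normalization rather than this hand-rolled solver.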

  • Source
    • "We used a high-throughput multiplexed reporter assay (Kwasnieski et al. 2012; Melnikov et al. 2012; Patwardhan et al. 2012; Sharon et al. 2012) to characterize the regulatory activity of 2100 randomly chosen sequences annotated as Enhancer, Weak Enhancer, or Repressed. Specifically, we tested sequences with the following annotations in the K562 cell line: 600 Enhancer regions, 600 Weak Enhancer regions, and 300 Repressed regions. "
    ABSTRACT: The histone modification state of genomic regions is hypothesized to reflect the regulatory activity of the underlying genomic DNA. Based on this hypothesis, the ENCODE Project Consortium measured the status of multiple histone modifications across the genome in several cell types and used these data to segment the genome into regions with different predicted regulatory activities. We measured the cis-regulatory activity of more than 2000 of these predictions in the K562 leukemia cell line. We tested genomic segments predicted to be Enhancers, Weak Enhancers, or Repressed elements in K562 cells, along with other sequences predicted to be Enhancers specific to the H1 human embryonic stem cell line (H1-hESC). Both Enhancer and Weak Enhancer sequences in K562 cells were more active than negative controls, although surprisingly, Weak Enhancer segmentations drove expression higher than did Enhancer segmentations. Lower levels of the covalent histone modifications H3K36me3 and H3K27ac, thought to mark active enhancers and transcribed gene bodies, associate with higher expression and partly explain the higher activity of Weak Enhancers over Enhancer predictions. While DNase I hypersensitivity (HS) is a good predictor of active sequences in our assay, transcription factor (TF) binding models need to be included in order to accurately identify highly expressed sequences. Overall, our results show that a significant fraction (∼26%) of the ENCODE enhancer predictions have regulatory activity, suggesting that histone modification states can reflect the cis-regulatory activity of sequences in the genome, but that specific sequence preferences, such as TF-binding sites, are the causal determinants of cis-regulatory activity.
    Genome Research 07/2014; 24(10). DOI:10.1101/gr.173518.114 · 14.63 Impact Factor
  • Source
    • "PAR-CLIP (Hafner et al., 2010), and HITS-CLIP (Darnell, 2010), are labor intensive and require knowledge of specific RBPs. Recently, high-throughput reporter assays have been developed to determine the functionality of regulatory elements in yeast promoters (Sharon et al., 2012) and human enhancers (Kheradpour et al., 2013; Melnikov et al., 2012; Patwardhan et al., 2012). These studies allowed the experimental dissection of transcriptional regulatory roles for thousands of sequences in parallel. "
    ABSTRACT: Posttranscriptional regulatory programs governing diverse aspects of RNA biology remain largely uncharacterized. Understanding the functional roles of RNA cis-regulatory elements is essential for decoding complex programs that underlie the dynamic regulation of transcript stability, splicing, localization, and translation. Here, we describe a combined experimental/computational technology to reveal a catalog of functional regulatory elements embedded in 3' UTRs of human transcripts. We used a bidirectional reporter system coupled with flow cytometry and high-throughput sequencing to measure the effect of short, noncoding, vertebrate-conserved RNA sequences on transcript stability and translation. Information-theoretic motif analysis of the resulting sequence-to-gene-expression mapping revealed linear and structural RNA cis-regulatory elements that positively and negatively modulate the posttranscriptional fates of human transcripts. This combined experimental/computational strategy can be used to systematically characterize the vast landscape of posttranscriptional regulatory elements controlling physiological and pathological cellular state transitions.
    Cell Reports 03/2014; 7(1). DOI:10.1016/j.celrep.2014.03.001 · 8.36 Impact Factor
  • Source
    • "This lack of knowledge strongly restrains the practical applications of ab initio design. Innovative experimental methodologies based on high-throughput technologies are scaling the characterization process up to tens of thousands of designed sequence variants, providing larger datasets to better understand sequence/activity relationships (Dvir et al., 2013; Kinney et al., 2010; Patwardhan et al., 2009, 2012; Sharon et al., 2012; Smith et al., 2013). However dramatic, this increase in throughput remains limited in comparison to the sheer immensity of the sequence space. "
    ABSTRACT: Current advances in DNA synthesis, cloning and sequencing technologies afford high throughput implementation of artificial sequences into living cells. However, flexible computational tools for multi-objective sequence design are lacking, limiting the potential of these technologies. We developed DNA-Tailor (D-Tailor), a fully extendable software framework, for property-based design of synthetic DNA sequences. D-Tailor permits the seamless integration of multiple sequence analysis tools into a generic Monte-Carlo simulation that evolves sequences toward any combination of rationally defined properties. As proof of principle, we show that D-Tailor is capable of designing sequence libraries comprising all possible combinations among three different sequence properties influencing translation efficiency in E. coli. The capacity to design artificial sequences that systematically sample any given parameter space should support the implementation of more rigorous experimental designs. Source code is available for download. Supplementary information: Supplementary data are available at Bioinformatics online (D-Tailor Tutorial).
    Bioinformatics 01/2014; 30(8). DOI:10.1093/bioinformatics/btt742 · 4.98 Impact Factor
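
The Monte-Carlo idea mentioned in the D-Tailor abstract above, evolving a sequence toward a defined property, can be sketched in a few lines. This is not D-Tailor's code or API: it is a minimal single-property illustration that assumes GC content as the target property and a Metropolis-style acceptance rule. The function names, temperature, and step count are hypothetical.

```python
import math
import random

BASES = "ACGT"

def gc_content(seq):
    """Fraction of G or C bases in the sequence."""
    return sum(b in "GC" for b in seq) / len(seq)

def score(seq, target_gc):
    """Distance from the target property value; lower is better."""
    return abs(gc_content(seq) - target_gc)

def evolve(seq, target_gc, steps=2000, temperature=0.01, seed=0):
    """Propose point mutations and return the best sequence seen and its score."""
    rng = random.Random(seed)
    current, current_score = seq, score(seq, target_gc)
    best, best_score = current, current_score
    for _ in range(steps):
        pos = rng.randrange(len(current))
        new_base = rng.choice([b for b in BASES if b != current[pos]])
        candidate = current[:pos] + new_base + current[pos + 1:]
        cand_score = score(candidate, target_gc)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with small probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, cand_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score

if __name__ == "__main__":
    start = "A" * 40
    designed, final_score = evolve(start, target_gc=0.5)
    print(designed, gc_content(designed), final_score)
```

A multi-objective design would replace `score` with a combination of several property distances; the acceptance loop itself would stay the same.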