Xiao-Yong Li

Lawrence Berkeley National Laboratory, Berkeley, CA, USA

Publications (13) · 104.78 Total impact

  • Article: Gene expression in early Drosophila embryos is highly conserved despite extensive divergence of transcription factor binding
    ABSTRACT: To better characterize how variation in regulatory sequences drives divergence in gene expression, we undertook a systematic study of transcription factor binding and gene expression in the blastoderm embryos of four species that sample much of the diversity in the 60-million-year-old genus Drosophila: D. melanogaster, D. yakuba, D. pseudoobscura and D. virilis. We compared gene expression, as measured by mRNA-seq, to the genome-wide binding of four transcription factors involved in early development, as measured by ChIP-seq (Bicoid, Giant, Hunchback and Krüppel). Surprisingly, we found that mRNA levels are much better conserved than individual binding events. We looked at binding characteristics that may explain such evolutionary disparity. As expected, we found that binding divergence increases with phylogenetic distance. Interestingly, binding events in non-coding regions that were bound strongly by single factors, or bound by multiple factors, were more likely to be conserved. As this class of sites is most likely to be involved in gene regulation, the divergence of other bound regions may simply reflect their lack of function. We used a model of quantitative trait evolution to compare the changes of gene expression with nearby regulatory TF binding. We found that changes in gene expression were poorly explained by changes in associated TF binding. These results suggest that some of the differences in sequence and binding have limited effect on gene expression or act in a compensatory manner to maintain the overall expression levels of regulated genes.
    03/2013
  • Article: Genome-wide in vivo cross-linking of sequence-specific transcription factors.
    Xiao-Yong Li, Mark D Biggin
    ABSTRACT: Immunoprecipitation of cross-linked chromatin in combination with microarrays (ChIP-chip) or ultra-high-throughput sequencing (ChIP-seq) is widely used to map genome-wide in vivo transcription factor binding. Both methods employ initial steps of in vivo cross-linking, chromatin isolation, DNA fragmentation, and immunoprecipitation. For ChIP-chip, the immunoprecipitated DNA samples are then amplified, labeled, and hybridized to DNA microarrays. For ChIP-seq, the immunoprecipitated DNA is prepared for a sequencing library, and then the library DNA fragments are sequenced using an ultra-high-throughput sequencing platform. The protocols described here have been developed for ChIP-chip and ChIP-seq analysis of sequence-specific transcription factor binding in Drosophila embryos. A series of controls establish that these protocols have high sensitivity and reproducibility and provide a quantitative measure of relative transcription factor occupancy. The quantitative nature of the assay is important because regulatory transcription factors bind to highly overlapping sets of thousands of genomic regions and the unique regulatory specificity of each factor is determined by relatively moderate differences in occupancy between factors at commonly bound regions.
    Methods in molecular biology (Clifton, N.J.) 01/2012; 809:3-26.
  • Article: Zelda binding in the early Drosophila melanogaster embryo marks regions subsequently activated at the maternal-to-zygotic transition.
    ABSTRACT: The earliest stages of development in most metazoans are driven by maternally deposited proteins and mRNAs, with widespread transcriptional activation of the zygotic genome occurring hours after fertilization, at a period known as the maternal-to-zygotic transition (MZT). In Drosophila, the MZT is preceded by the transcription of a small number of genes that initiate sex determination, patterning, and other early developmental processes; and the zinc-finger protein Zelda (ZLD) plays a key role in their transcriptional activation. To better understand the mechanisms of ZLD activation and the range of its targets, we used chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-Seq) to map regions bound by ZLD before (mitotic cycle 8), during (mitotic cycle 13), and after (late mitotic cycle 14) the MZT. Although only a handful of genes are transcribed prior to mitotic cycle 10, we identified thousands of regions bound by ZLD in cycle 8 embryos, most of which remain bound through mitotic cycle 14. As expected, early ZLD-bound regions include the promoters and enhancers of genes transcribed at this early stage. However, we also observed ZLD bound at cycle 8 to the promoters of roughly a thousand genes whose first transcription does not occur until the MZT and to virtually all of the thousands of known and presumed enhancers bound at cycle 14 by transcription factors that regulate patterned gene activation during the MZT. The association between early ZLD binding and MZT activity is so strong that ZLD binding alone can be used to identify active promoters and regulatory sequences with high specificity and selectivity. This strong early association of ZLD with regions not active until the MZT suggests that ZLD is not only required for the earliest wave of transcription but also plays a major role in activating the genome at the MZT.
    PLoS Genetics 10/2011; 7(10):e1002266. · 8.69 Impact Factor
  • Article: Dynamic reprogramming of chromatin accessibility during Drosophila embryo development.
    ABSTRACT: The development of complex organisms is believed to involve progressive restrictions in cellular fate. Understanding the scope and features of chromatin dynamics during embryogenesis, and identifying regulatory elements important for directing developmental processes remain key goals of developmental biology. We used in vivo DNaseI sensitivity to map the locations of regulatory elements, and explore the changing chromatin landscape during the first 11 hours of Drosophila embryonic development. We identified thousands of conserved, developmentally dynamic, distal DNaseI hypersensitive sites associated with spatial and temporal expression patterning of linked genes and with large regions of chromatin plasticity. We observed a nearly uniform balance between developmentally up- and down-regulated DNaseI hypersensitive sites. Analysis of promoter chromatin architecture revealed a novel role for classical core promoter sequence elements in directing temporally regulated chromatin remodeling. Another unexpected feature of the chromatin landscape was the presence of localized accessibility over many protein-coding regions, subsets of which were developmentally regulated or associated with the transcription of genes with prominent maternal RNA contributions in the blastoderm. Our results provide a global view of the rich and dynamic chromatin landscape of early animal development, as well as novel insights into the organization of developmentally regulated chromatin features.
    Genome biology 05/2011; 12(5):R43. · 6.63 Impact Factor
  • Article: The role of chromatin accessibility in directing the widespread, overlapping patterns of Drosophila transcription factor binding.
    ABSTRACT: In Drosophila embryos, many biochemically and functionally unrelated transcription factors bind quantitatively to highly overlapping sets of genomic regions, with much of the lowest levels of binding being incidental, non-functional interactions on DNA. The primary biochemical mechanisms that drive these genome-wide occupancy patterns have yet to be established. Here we use data resulting from the DNaseI digestion of isolated embryo nuclei to provide a biophysical measure of the degree to which proteins can access different regions of the genome. We show that the in vivo binding patterns of 21 developmental regulators are quantitatively correlated with DNA accessibility in chromatin. Furthermore, we find that levels of factor occupancy in vivo correlate much more with the degree of chromatin accessibility than with occupancy predicted from in vitro affinity measurements using purified protein and naked DNA. Within accessible regions, however, the intrinsic affinity of the factor for DNA does play a role in determining net occupancy, with even weak affinity recognition sites contributing. Finally, we show that programmed changes in chromatin accessibility between different developmental stages correlate with quantitative alterations in factor binding. Based on these and other results, we propose a general mechanism to explain the widespread, overlapping DNA binding by animal transcription factors. In this view, transcription factors are expressed at sufficiently high concentrations in cells such that they can occupy their recognition sequences in highly accessible chromatin without the aid of physical cooperative interactions with other proteins, leading to highly overlapping, graded binding of unrelated factors.
    Genome biology 04/2011; 12(4):R34. · 6.63 Impact Factor
  • Article: Quantitative models of the mechanisms that control genome-wide patterns of transcription factor binding during early Drosophila development.
    ABSTRACT: Transcription factors that drive complex patterns of gene expression during animal development bind to thousands of genomic regions, with quantitative differences in binding across bound regions mediating their activity. While we now have tools to characterize the DNA affinities of these proteins and to precisely measure their genome-wide distribution in vivo, our understanding of the forces that determine where, when, and to what extent they bind remains primitive. Here we use a thermodynamic model of transcription factor binding to evaluate the contribution of different biophysical forces to the binding of five regulators of early embryonic anterior-posterior patterning in Drosophila melanogaster. Predictions based on DNA sequence and in vitro protein-DNA affinities alone achieve a correlation of ∼0.4 with experimental measurements of in vivo binding. Incorporating cooperativity and competition among the five factors, and accounting for spatial patterning by modeling binding in every nucleus independently, had little effect on prediction accuracy. A major source of error was the prediction of binding events that do not occur in vivo, which we hypothesized reflected reduced accessibility of chromatin. To test this, we incorporated experimental measurements of genome-wide DNA accessibility into our model, effectively restricting predicted binding to regions of open chromatin. This dramatically improved our predictions to a correlation of 0.6-0.9 for various factors across known target genes. Finally, we used our model to quantify the roles of DNA sequence, accessibility, and binding competition and cooperativity. Our results show that, in regions of open chromatin, binding can be predicted almost exclusively by the sequence specificity of individual factors, with a minimal role for protein interactions. We suggest that a combination of experimentally determined chromatin accessibility data and simple computational models of transcription factor binding may be used to predict the binding landscape of any animal transcription factor with significant precision.
    PLoS Genetics 01/2011; 7(2):e1001290. · 8.69 Impact Factor
  • Article: Binding site turnover produces pervasive quantitative changes in transcription factor binding between closely related Drosophila species.
    ABSTRACT: Changes in gene expression play an important role in evolution, yet the molecular mechanisms underlying regulatory evolution are poorly understood. Here we compare genome-wide binding of the six transcription factors that initiate segmentation along the anterior-posterior axis in embryos of two closely related species: Drosophila melanogaster and Drosophila yakuba. Where we observe binding by a factor in one species, we almost always observe binding by that factor to the orthologous sequence in the other species. Levels of binding, however, vary considerably. The magnitude and direction of the interspecies differences in binding levels of all six factors are strongly correlated, suggesting a role for chromatin or other factor-independent forces in mediating the divergence of transcription factor binding. Nonetheless, factor-specific quantitative variation in binding is common, and we show that it is driven to a large extent by the gain and loss of cognate recognition sequences for the given factor. We find only a weak correlation between binding variation and regulatory function. These data provide the first genome-wide picture of how modest levels of sequence divergence between highly morphologically similar species affect a system of coordinately acting transcription factors during animal development, and highlight the dominant role of quantitative variation in transcription factor binding over short evolutionary distances.
    PLoS Biology 01/2010; 8(3):e1000343. · 11.45 Impact Factor
  • Article: Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions.
    ABSTRACT: We previously established that six sequence-specific transcription factors that initiate anterior/posterior patterning in Drosophila bind to overlapping sets of thousands of genomic regions in blastoderm embryos. While regions bound at high levels include known and probable functional targets, more poorly bound regions are preferentially associated with housekeeping genes and/or genes not transcribed in the blastoderm, and are frequently found in protein coding sequences or in less conserved non-coding DNA, suggesting that many are likely non-functional. Here we show that an additional 15 transcription factors that regulate other aspects of embryo patterning show a similar quantitative continuum of function and binding to thousands of genomic regions in vivo. Collectively, the 21 regulators show a surprisingly high overlap in the regions they bind given that they belong to 11 DNA binding domain families, specify distinct developmental fates, and can act via different cis-regulatory modules. We demonstrate, however, that quantitative differences in relative levels of binding to shared targets correlate with the known biological and transcriptional regulatory specificities of these factors. It is likely that the overlap in binding of biochemically and functionally unrelated transcription factors arises from the high concentrations of these proteins in nuclei, which, coupled with their broad DNA binding specificities, directs them to regions of open chromatin. We suggest that most animal transcription factors will be found to show a similar broad overlapping pattern of binding in vivo, with specificity achieved by modulating the amount, rather than the identity, of bound factor.
    Genome biology 08/2009; 10(7):R80. · 6.63 Impact Factor
  • Article: Association of cohesin and Nipped-B with transcriptionally active regions of the Drosophila melanogaster genome.
    ABSTRACT: The cohesin complex is a chromosomal component required for sister chromatid cohesion that is conserved from yeast to man. The similarly conserved Nipped-B protein is needed for cohesin to bind to chromosomes. In higher organisms, Nipped-B and cohesin regulate gene expression and development by unknown mechanisms. Using chromatin immunoprecipitation, we find that Nipped-B and cohesin bind to the same sites throughout the entire non-repetitive Drosophila genome. They preferentially bind transcribed regions and overlap with RNA polymerase II. This contrasts sharply with yeast, where cohesin binds almost exclusively between genes. Differences in cohesin and Nipped-B binding between Drosophila cell lines often correlate with differences in gene expression. For example, cohesin and Nipped-B bind the Abd-B homeobox gene in cells in which it is transcribed, but not in cells in which it is silenced. They bind to the Abd-B transcription unit and downstream regulatory region and thus could regulate both transcriptional elongation and activation. We posit that transcription facilitates cohesin binding, perhaps by unfolding chromatin, and that Nipped-B then regulates gene expression by controlling cohesin dynamics. These mechanisms are likely involved in the etiology of Cornelia de Lange syndrome, in which mutation of one copy of the NIPBL gene encoding the human Nipped-B ortholog causes diverse structural and mental birth defects.
    Chromosoma 03/2008; 117(1):89-102. · 3.85 Impact Factor
  • Article: Transcription factors bind thousands of active and inactive regions in the Drosophila blastoderm.
    ABSTRACT: Identifying the genomic regions bound by sequence-specific regulatory factors is central both to deciphering the complex DNA cis-regulatory code that controls transcription in metazoans and to determining the range of genes that shape animal morphogenesis. We used whole-genome tiling arrays to map sequences bound in Drosophila melanogaster embryos by the six maternal and gap transcription factors that initiate anterior-posterior patterning. We find that these sequence-specific DNA binding proteins bind with quantitatively different specificities to highly overlapping sets of several thousand genomic regions in blastoderm embryos. Specific high- and moderate-affinity in vitro recognition sequences for each factor are enriched in bound regions. This enrichment, however, is not sufficient to explain the pattern of binding in vivo and varies in a context-dependent manner, demonstrating that higher-order rules must govern targeting of transcription factors. The more highly bound regions include all of the over 40 well-characterized enhancers known to respond to these factors as well as several hundred putative new cis-regulatory modules clustered near developmental regulators and other genes with patterned expression at this stage of embryogenesis. The new targets include most of the microRNAs (miRNAs) transcribed in the blastoderm, as well as all major zygotically transcribed dorsal-ventral patterning genes, whose expression we show to be quantitatively modulated by anterior-posterior factors. In addition to these highly bound regions, there are several thousand regions that are reproducibly bound at lower levels. However, these poorly bound regions are, collectively, far more distant from genes transcribed in the blastoderm than highly bound regions; are preferentially found in protein-coding sequences; and are less conserved than highly bound regions. Together these observations suggest that many of these poorly bound regions are not involved in early-embryonic transcriptional regulation, and a significant proportion may be nonfunctional. Surprisingly, for five of the six factors, their recognition sites are not unambiguously more constrained evolutionarily than the immediate flanking DNA, even in more highly bound and presumably functional regions, indicating that comparative DNA sequence analysis is limited in its ability to identify functional transcription factor targets.
    PLoS Biology 03/2008; 6(2):e27. · 11.45 Impact Factor
  • Article: Large-scale turnover of functional transcription factor binding sites in Drosophila.
    ABSTRACT: The gain and loss of functional transcription factor binding sites has been proposed as a major source of evolutionary change in cis-regulatory DNA and gene expression. We have developed an evolutionary model to study binding-site turnover that uses multiple sequence alignments to assess the evolutionary constraint on individual binding sites, and to map gain and loss events along a phylogenetic tree. We apply this model to study the evolutionary dynamics of binding sites of the Drosophila melanogaster transcription factor Zeste, using genome-wide in vivo (ChIP-chip) binding data to identify functional Zeste binding sites, and the genome sequences of D. melanogaster, D. simulans, D. erecta, and D. yakuba to study their evolution. We estimate that more than 5% of functional Zeste binding sites in D. melanogaster were gained along the D. melanogaster lineage or lost along one of the other lineages. We find that Zeste-bound regions have a reduced rate of binding-site loss and an increased rate of binding-site gain relative to flanking sequences. Finally, we show that binding-site gains and losses are asymmetrically distributed with respect to D. melanogaster, consistent with lineage-specific acquisition and loss of Zeste-responsive regulatory elements.
    PLoS Computational Biology 11/2006; 2(10):e130. · 5.22 Impact Factor
  • Article: Genome-wide analysis of Polycomb targets in Drosophila melanogaster.
    ABSTRACT: Polycomb group (PcG) complexes are multiprotein assemblages that bind to chromatin and establish chromatin states leading to epigenetic silencing. PcG proteins regulate homeotic genes in flies and vertebrates, but little is known about other PcG targets and the role of the PcG in development, differentiation and disease. Here, we determined the distribution of the PcG proteins PC, E(Z) and PSC and of trimethylation of histone H3 Lys27 (me3K27) in the D. melanogaster genome. At more than 200 PcG target genes, binding sites for the three PcG proteins colocalize to presumptive Polycomb response elements (PREs). In contrast, H3 me3K27 forms broad domains including the entire transcription unit and regulatory regions. PcG targets are highly enriched in genes encoding transcription factors, but they also include genes coding for receptors, signaling proteins, morphogens and regulators representing all major developmental pathways.
    Nature Genetics 07/2006; 38(6):700-5. · 35.53 Impact Factor
  • Article: Correction: Transcription Factors Bind Thousands of Active and Inactive Regions in the Drosophila Blastoderm