Cell type-specific binding patterns reveal that TCF7L2 can be tethered to the genome by association with GATA3

Department of Biochemistry and Molecular Biology, Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA 90089, USA.
Genome Biology (Impact Factor: 10.81). 09/2012; 13(9):R52. DOI: 10.1186/gb-2012-13-9-r52
Source: PubMed


The TCF7L2 transcription factor is linked to a variety of human diseases, including type 2 diabetes and cancer. One mechanism by which TCF7L2 could influence expression of genes involved in diverse diseases is by binding to distinct regulatory regions in different tissues. To test this hypothesis, we performed ChIP-seq for TCF7L2 in six human cell lines.

We identified 116,000 non-redundant TCF7L2 binding sites, with only 1,864 sites common to the six cell lines. Using ChIP-seq, we showed that many genomic regions that are marked by both H3K4me1 and H3K27Ac are also bound by TCF7L2, suggesting that TCF7L2 plays a critical role in enhancer activity. Bioinformatic analysis of the cell type-specific TCF7L2 binding sites revealed enrichment for multiple transcription factors, including HNF4alpha and FOXA2 motifs in HepG2 cells and the GATA3 motif in MCF7 cells. ChIP-seq analysis revealed that TCF7L2 co-localizes with HNF4alpha and FOXA2 in HepG2 cells and with GATA3 in MCF7 cells. Interestingly, in MCF7 cells the TCF7L2 motif is enriched in most TCF7L2 sites but is not enriched in the sites bound by both GATA3 and TCF7L2. This analysis suggested that GATA3 might tether TCF7L2 to the genome at these sites. To test this hypothesis, we depleted GATA3 in MCF7 cells and showed that TCF7L2 binding was lost at a subset of sites. RNA-seq analysis suggested that TCF7L2 represses transcription when tethered to the genome via GATA3.

Our studies demonstrate a novel relationship between GATA3 and TCF7L2, and reveal important insights into TCF7L2-mediated gene regulation.
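The counts of non-redundant sites and of sites common to all cell lines come from comparing peak sets across cell lines. The abstract does not state how overlaps were defined, so the following is only a minimal sketch under one assumption: peaks that overlap by at least one base are merged into a single site. The function names `merge_peaks` and `common_sites` and the toy coordinates are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def merge_peaks(peaks_by_cell):
    """Merge overlapping peaks from all cell lines into non-redundant sites.

    peaks_by_cell maps a cell line name to a list of (chrom, start, end)
    tuples with half-open coordinates. Returns a list of
    (chrom, start, end, cell_lines) tuples, where cell_lines is a frozenset
    of the cell lines contributing at least one peak to the merged site.
    Assumption: any overlap of one base or more merges two peaks.
    """
    by_chrom = defaultdict(list)
    for cell, peaks in peaks_by_cell.items():
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, cell))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        cells = set()
        for start, end, cell in sorted(by_chrom[chrom]):
            if cur_end is not None and start < cur_end:
                # Peak overlaps the site being built: extend it.
                cur_end = max(cur_end, end)
                cells.add(cell)
            else:
                # No overlap: close the previous site and start a new one.
                if cur_end is not None:
                    merged.append((chrom, cur_start, cur_end, frozenset(cells)))
                cur_start, cur_end, cells = start, end, {cell}
        if cur_end is not None:
            merged.append((chrom, cur_start, cur_end, frozenset(cells)))
    return merged


def common_sites(merged, n_cell_lines):
    """Keep only merged sites that contain a peak from every cell line."""
    return [site for site in merged if len(site[3]) == n_cell_lines]


# Toy input with invented coordinates (not data from the study).
toy = {
    "MCF7": [("chr1", 100, 200), ("chr1", 500, 600)],
    "HepG2": [("chr1", 150, 250), ("chr2", 10, 50)],
    "PANC1": [("chr1", 180, 220)],
}
sites = merge_peaks(toy)
shared = common_sites(sites, len(toy))
print(len(sites), "non-redundant sites;", len(shared), "common to all cell lines")
```

Because merging is transitive, a chain of partially overlapping peaks collapses into one wide site; a stricter overlap rule (for example, a minimum overlap fraction) would give different counts.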



Available from: Peggy J Farnham, Dec 18, 2013
  • Source
    • "For breast cell lines, a total of 42 high-throughput sequencing data, including 10 RNA sequencing (RNA-seq) data from two data sets [20,21], 18 ChIP sequencing (ChIP-seq) data, 4 corresponding input DNA (control) from five data sets, and 10 bisulfite sequencing (BS-seq) data from two data sets [19,20,21,22], were collected in this study (Additional file 1: Table S1). Among ten RNA-seq data, eight are from breast cancer cell lines and two are from normal breast cell lines. "
    ABSTRACT: Background: The human APOBEC protein family plays critical but distinct roles in host defense. Recent studies revealed that APOBECs mediate C-to-T mutagenesis in multiple cancers, including breast cancer. It is still unclear whether APOBEC gene family shows functional diversification involved in cancer mutagenesis. Results: We performed an integrated analysis to characterize the functional diversification of APOBEC gene family associated with breast cancer mutagenesis relative to estrogen receptor (ER) status. Among the APOBEC family, we found that both APOBEC3B and APOBEC3C mRNA levels were significantly higher in estrogen receptor negative (ER-) subtype compared with estrogen receptor positive (ER+) subtype (P < 2.2 × 10^-16 and P < 3.1 × 10^-5, respectively). Epigenomic data further reflected the distinct chromatin states of APOBEC3B and APOBEC3C relative to ER status. Notably, we observed the significantly positive correlation between the APOBEC3B-mediated mutagenesis and APOBEC3B expression levels in ER+ cancers but not in ER- cancers. In contrast, we discovered the negative correlation of APOBEC3C mRNA levels with base-substitution mutations in ER- tumors. Meanwhile, we observed that breast cancers in carriers of germline deletion of APOBEC3B gene harbor similar mutation patterns, but higher mutation rates in the TCW motif (W corresponds to A or T) than cancers in non-carriers, indicating additional factors may also induce carcinogenic mutagenesis. Conclusions: These results suggest that functional potential of APOBEC3B and APOBEC3C involved in cancer mutagenesis is associated with ER status.
    Full-text · Article · Dec 2015 · Human genomics
  • Source
    • "(A) The locations of all SNPs that are correlated (r² ≥ 0.1) with the published risk SNP (rs13387042) are shown, with the two SNPs that are strongly correlated (r² ≥ 0.8) in red. SNPs are aligned with CTCF and RAD21 binding sites, active (H3K27ac, H3K4me1, and H3K4me3) and repressive (H3K27me3) histone modification marks (in black) generated in the breast cancer cell line MCF7 by the ENCODE Project and by Frietze et al. (2012), and ESR1 and FOXA1 binding peaks generated in MCF7 (blue), T-47D (green), and ZR75-1 cells (red) by Hurtado et al. (2011). All three breast cancer cell lines are homozygous for the G-allele of rs4442975. "
    ABSTRACT: Genome-wide association studies have identified over 70 common variants that are associated with breast cancer risk. Most of these variants map to non-protein-coding regions and several map to gene deserts, regions of several hundred kb lacking protein-coding genes. We hypothesised that gene deserts harbour long range regulatory elements that can physically interact with target genes to influence their expression. To test this, we developed Capture Hi-C (CHi-C) which, by incorporating a sequence capture step into a Hi-C protocol, allows high resolution analysis of targeted regions of the genome. We used CHi-C to investigate long range interactions at three breast cancer gene deserts mapping to 2q35, 8q24.21 and 9q31.2. We identified interaction peaks between putative regulatory elements ('bait fragments') within the captured regions and 'targets' that included both protein coding genes and long non-coding (lnc)RNAs, over distances of 6.6kb to 2.6Mb. Target protein-coding genes were IGFBP5, KLF4, NSMCE2 and MYC and target lncRNAs included DIRC3, PVT1 and CCDC26. For one gene desert we were able to define two SNPs (rs12613955 and rs4442975) that were highly correlated with the published risk variant and that mapped within the bait end of an interaction peak. In vivo ChIP-qPCR data show that one of these, rs4442975, affects the binding of FOXA1 and implicate this SNP as a putative functional variant.
    Full-text · Article · Aug 2014 · Genome Research
  • Source
    • "ChIP-seq of H3K27me3 and H3K9me2 in AKT1-transfected MCF10A cells is obtained from our previous study [21]. The ChIP-seq of TCF7L2 in MCF7 and PANC1 cells were obtained from our previous study [32]. "
    ABSTRACT: Many computational programs have been developed to identify enriched regions for a single biological ChIP-seq sample. Given that many biological questions are often asked to compare the difference between two different conditions, it is important to develop new programs that address the comparison of two biological ChIP-seq samples. Despite several programs designed to address this question, these programs suffer from some drawbacks, such as inability to distinguish whether the identified differential enriched regions are indeed significantly enriched, lack of distinguishing binding patterns, and neglect of the normalization between samples. In this study, we developed a novel quantitative method for comparing two biological ChIP-seq samples, called QChIPat. Our method employs a new global normalization method: nonparametric empirical Bayes (NEB) correction normalization, utilizes pre-defined enriched regions identified from single-sample peak calling programs, uses statistical methods to define differential enriched regions, then defines binding (histone modification) pattern information for those differential enriched regions. Our program was tested on a benchmark data: histone modifications data used by ChIPDiffs. It was then applied on two study cases: one to identify differential histone modification sites for ChIP-seq of H3K27me3 and H3K9me2 data in AKT1-transfected MCF10A cells; the other to identify differential binding sites for ChIP-seq of TCF7L2 data in MCF7 and PANC1 cells. Several advantages of our program include: 1) it considers a control (or input) experiment; 2) it incorporates a novel global normalization strategy: nonparametric empirical Bayes correction normalization; 3) it provides the binding pattern information among different enriched regions. QChIPat is implemented in R, Perl and C++, and has been tested under Linux. The R package is available at
    Full-text · Article · Dec 2013 · BMC Genomics