ABSTRACT: Alu repeats, which account for ~10% of the human genome, were originally considered to be junk DNA. Recent studies, however, suggest that they may contain transcription factor binding sites and hence possibly play a role in regulating gene expression.
Here, we show that binding sites for a highly conserved member of the nuclear receptor superfamily of ligand-dependent transcription factors, hepatocyte nuclear factor 4alpha (HNF4α, NR2A1), are highly prevalent in Alu repeats. We employ high-throughput protein binding microarrays (PBMs) to show that HNF4α binds >66 unique sequences in Alu repeats that are present in ~1.2 million locations in the human genome. We use chromatin immunoprecipitation (ChIP) to demonstrate that HNF4α binds Alu elements in the promoters of target genes (ABCC3, APOA4, APOM, ATPIF1, CANX, FEMT1A, GSTM4, IL32, IP6K2, PRLR, PRODH2, SOCS2, TTR) and luciferase assays to show that at least some of those Alu elements can modulate HNF4α-mediated transactivation in vivo (APOM, PRODH2, TTR, APOA4). HNF4α-Alu elements are enriched in promoters of genes involved in RNA processing, and a sizeable fraction are in regions of accessible chromatin. Comparative genomics analysis suggests that there may have been a gain in HNF4α binding sites in Alu elements during evolution and that non-Alu repeats, such as Tiggers, also contain HNF4α sites.
Our findings suggest that HNF4α, in addition to regulating gene expression via high affinity binding sites, may also modulate transcription via low affinity sites in Alu repeats.
ABSTRACT: Hepatocyte nuclear factor 4 alpha (HNF4alpha), a member of the nuclear receptor superfamily, is essential for liver function and is linked to several diseases including diabetes, hemophilia, atherosclerosis, and hepatitis. Although many DNA response elements and target genes have been identified for HNF4alpha, the complete repertoire of binding sites and target genes in the human genome is unknown. Here, we adapt protein binding microarrays (PBMs) to examine the DNA-binding characteristics of two HNF4alpha species (rat and human) and isoforms (HNF4alpha2 and HNF4alpha8) in a high-throughput fashion. We identified approximately 1400 new binding sequences and used this dataset to successfully train a Support Vector Machine (SVM) model that predicts an additional approximately 10,000 unique HNF4alpha-binding sequences; we also identify new rules for HNF4alpha DNA binding. We performed expression profiling of an HNF4alpha RNA interference knockdown in HepG2 cells and compared the results to a search of the promoters of all human genes with the PBM and SVM models, as well as published genome-wide location analysis. Using this integrated approach, we identified approximately 240 new direct HNF4alpha human target genes, including new functional categories of genes not typically associated with HNF4alpha, such as cell cycle, immune function, apoptosis, stress response, and other cancer-related genes. CONCLUSION: We report the first use of PBMs with a full-length liver-enriched transcription factor and greatly expand the repertoire of HNF4alpha-binding sequences and target genes, thereby identifying new functions for HNF4alpha. We also establish a web-based tool, HNF4 Motif Finder, that can be used to identify potential HNF4alpha-binding sites in any sequence.
ABSTRACT: The core promoter of eukaryotic genes is the minimal DNA region that recruits the basal transcription machinery to direct efficient and accurate transcription initiation. The fraction of human and yeast genes that contain specific core promoter elements such as the TATA box and the initiator (INR) remains unclear and core promoter motifs specific for TATA-less genes remain to be identified. Here, we present genome-scale computational analyses indicating that approximately 76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1-binding sites. We further identify two motifs - M3 (SCGGAAGY) and M22 (TGCGCANK) - that occur preferentially in human TATA-less core promoters. About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only approximately 10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR). In contrast, approximately 46% of human core promoters contain the consensus INR (YYANWYY) and approximately 30% are INR-containing TATA-less genes. Significantly, approximately 46% of human promoters lack both TATA-like and consensus INR elements. Surprisingly, mammalian-type INR sequences are present - and tend to cluster - in the transcription start site (TSS) region of approximately 40% of yeast core promoters and the frequency of specific core promoter types appears to be conserved in yeast and human genomes. Gene Ontology analyses reveal that TATA-less genes in humans, as in yeast, are frequently involved in basic "housekeeping" processes, while TATA-containing genes are more often highly regulated, such as by biotic or stress stimuli. These results reveal unexpected similarities in the occurrence of specific core promoter types and in their associated biological processes in yeast and humans and point to novel vertebrate-specific DNA motifs that might play a selective role in TATA-independent transcription.
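The consensus strings quoted in this abstract (TATAWAWR, YYANWYY, SCGGAAGY, TGCGCANK) are written in IUPAC nucleotide ambiguity codes. As a minimal sketch of how such a consensus can be matched against a sequence, the Python below expands each code into a regular-expression character class. The function names and the toy sequence are illustrative assumptions and do not come from the study, whose actual scanning pipeline is not described here.

```python
import re

# Standard IUPAC ambiguity codes needed for the motifs quoted in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Compile an IUPAC consensus into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")

def find_motif(motif, sequence):
    """Return 0-based start positions of every match on the given strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]

# Consensus strings as written in the abstract.
MOTIFS = {
    "TATA": "TATAWAWR",
    "INR": "YYANWYY",
    "M3": "SCGGAAGY",
    "M22": "TGCGCANK",
}

# Hypothetical toy sequence, invented for illustration only.
toy = "GGCTATAAAAGGCCTCAGTCTTGCGCAGT"
hits = {name: find_motif(consensus, toy) for name, consensus in MOTIFS.items()}
```

This sketch scans one strand only; a promoter survey would normally also scan the reverse complement and restrict the search to a window around the TSS.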
ABSTRACT: Epithelial formation is a central facet of organogenesis that relies on intercellular junction assembly to create functionally distinct apical and basal cell surfaces. How this process is regulated during embryonic development remains obscure. Previous studies using conditional knockout mice have shown that loss of hepatocyte nuclear factor 4alpha (HNF4alpha) blocks the epithelial transformation of the fetal liver, suggesting that HNF4alpha is a central regulator of epithelial morphogenesis. Although HNF4alpha-null hepatocytes do not express E-cadherin (also called CDH1), we show here that E-cadherin is dispensable for liver development, implying that HNF4alpha regulates additional aspects of epithelial formation. Microarray and molecular analyses reveal that HNF4alpha regulates the developmental expression of a myriad of proteins required for cell junction assembly and adhesion. Our findings define a fundamental mechanism through which generation of tissue epithelia during development is coordinated with the onset of organ function.
Proceedings of the National Academy of Sciences 06/2006; 103(22):8419-24. · 9.74 Impact Factor
ABSTRACT: Hepatocyte nuclear factor 4 alpha (HNF4alpha) is a transcription factor that has been shown to be required for hepatocyte differentiation and development of the liver. It has also been implicated in regulating expression of genes that act in the epithelium of the lower gastrointestinal tract. This implied that HNF4alpha might be required for development of the gut.
Mouse embryos were generated in which Hnf4a was ablated in the epithelial cells of the fetal colon by using Cre-loxP technology. Embryos were examined by using a combination of histology, immunohistochemistry, DNA microarray, reverse-transcription polymerase chain reaction, electrophoretic mobility shift assays, and chromatin immunoprecipitation analyses to define the consequences of loss of HNF4alpha on colon development.
Embryos were recovered at E18.5 that lacked HNF4alpha in their colons. Although early stages of colonic development occurred, HNF4alpha-null colons failed to form normal crypts. In addition, goblet-cell maturation was perturbed and expression of an array of genes that encode proteins with diverse roles in colon function was disrupted. Several genes whose expression in the colon was dependent on HNF4alpha contained HNF4alpha-binding sites within putative transcriptional regulatory regions and a subset of these sites were occupied by HNF4alpha in vivo.
HNF4alpha is a transcription factor that is essential for development of the mammalian colon, regulates goblet-cell maturation, and is required for expression of genes that control normal colon function and epithelial cell differentiation.
ABSTRACT: Even though every cell in an organism contains the same genetic material, each cell does not express the same cohort of genes. Therefore, one of the major problems facing genomic research today is to determine not only which genes are differentially expressed and under what conditions, but also how the expression of those genes is regulated. The first step in determining differential gene expression is the binding of sequence-specific DNA binding proteins (i.e. transcription factors) to regulatory regions of the genes (i.e. promoters and enhancers). An important aspect of understanding how a given transcription factor functions is to know the entire gamut of binding sites and, subsequently, the potential target genes that the factor may bind or regulate. In this study, we have developed a computer algorithm to scan genomic databases for transcription factor binding sites, based on a novel Markov chain optimization method, and used it to scan the human genome for sites that bind to hepatocyte nuclear factor 4 alpha (HNF4alpha). A list of 71 known HNF4alpha binding sites from the literature was used to train our Markov chain model. By looking at the window of 600 nucleotides around the transcription start site of each confirmed gene in the human genome, we identified 849 sites with varying binding potential and experimentally tested 109 of those sites for binding to HNF4alpha. Our results show that the program was very successful in identifying 77 new HNF4alpha binding sites with varying binding affinities (i.e. a 71% success rate). Therefore, this computational method for searching genomic databases for potential transcription factor binding sites is a powerful tool for investigating mechanisms of differential gene regulation.
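The abstract does not spell out the Markov chain optimization method, so the Python below is not the authors' algorithm. It is a generic sketch of the underlying idea: estimate first-order transition probabilities from a set of known sites, then slide a window along a sequence and score each window by its log-odds against a uniform background. The three training strings, the pseudocount, the threshold, and all function names are hypothetical placeholders, not the 71 literature sites or parameters from the study.

```python
import math

BASES = "ACGT"

def train_markov(sites, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | current) from sites."""
    counts = {a: {b: pseudocount for b in BASES} for a in BASES}
    for site in sites:
        for a, b in zip(site, site[1:]):
            counts[a][b] += 1
    return {
        a: {b: counts[a][b] / sum(counts[a].values()) for b in BASES}
        for a in BASES
    }

def log_odds(window, model, background=0.25):
    """Sum of log(P_model / P_background) over adjacent base pairs in the window."""
    return sum(math.log(model[a][b] / background) for a, b in zip(window, window[1:]))

def scan(sequence, model, width, threshold):
    """Return (position, score) for every window scoring at or above the threshold."""
    hits = []
    for i in range(len(sequence) - width + 1):
        score = log_odds(sequence[i:i + width], model)
        if score >= threshold:
            hits.append((i, score))
    return hits

# Hypothetical 13-mers used only to exercise the code.
toy_sites = ["AGGTCAAAGGTCA", "AGGGCAAAGTTCA", "GGGTCAAAGTCCA"]
model = train_markov(toy_sites)

# Toy "promoter": one training-like site embedded in a poly-T flank.
promoter = "TTTTTTTT" + "AGGTCAAAGGTCA" + "TTTTTTTT"
hits = scan(promoter, model, width=13, threshold=7.0)
```

In this toy input only the window that starts at the embedded site clears the threshold. A real scan would use a background estimated from genomic sequence and a threshold calibrated against experimentally tested sites.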