Binding sites for metabolic disease related
transcription factors inferred at base pair
resolution by chromatin immunoprecipitation
and genomic microarrays
Alvaro Rada-Iglesias1, Ola Wallerman1, Christoph Koch3, Adam Ameur2, Stefan Enroth2,
Gayle Clelland3, Kenneth Wester1, Sarah Wilcox3, Oliver M. Dovey3, Peter D. Ellis3,
Vicki L. Wraight3, Keith James3, Rob Andrews3, Cordelia Langford3, Pawandeep Dhami3,
Nigel Carter3, David Vetrie3, Fredrik Pontén1, Jan Komorowski2, Ian Dunham3
and Claes Wadelius1,*
1Department of Genetics and Pathology, Rudbeck Laboratory and 2Linnaeus Centre for Bioinformatics, Uppsala
University, SE-75185 Uppsala, Sweden and 3Wellcome Trust Sanger Institute, Cambridge, UK
Received July 4, 2005; Revised August 19, 2005; Accepted September 28, 2005
We present a detailed in vivo characterization of hepatocyte transcriptional regulation in HepG2 cells, using
chromatin immunoprecipitation and detection on PCR fragment-based genomic tiling path arrays covering
the Encyclopedia of DNA Elements (ENCODE) regions. Our data suggest that HNF-4a and HNF-3b, which
were commonly bound to distal regulatory elements, may cooperate in the regulation of a large fraction of
the liver transcriptome and that both HNF-4a and USF1 may promote H3 acetylation at many of their targets.
Importantly, bioinformatic analysis of the sequences bound by each transcription factor (TF) shows an over-
representation of motifs highly similar to the in vitro established consensus sequences. On the basis of these
data, we have inferred tentative binding sites at base pair resolution. Some of these sites have been previously
found by in vitro analysis and some were verified in vitro in this study. Our data suggest that a similar
approach could be used for the in vivo characterization of all predicted/uncharacterized TFs and that the
analysis could be scaled to the whole genome.
Transcriptional control is achieved through a complex inter-
play between cis-acting regulatory DNA elements (promoters,
enhancers, locus control regions, etc.) and trans-acting
proteins. The assembly of these proteins to their binding
sites displays both synergy and cooperativity, which increases
the specificity and flexibility of the process (1). However, at
a genomic level, we have very poor knowledge of how
transcription regulation is achieved, and most of it comes
from the analysis of a limited number of regulatory elements.
Recent experiments have used chromatin immunoprecipita-
tion and detection on genomic microarrays (ChIP-chip) to
investigate transcriptional control processes at a large scale.
This has revealed that transcription factors (TFs) bind more
targets than previously suspected. In some cases, promoter
or CpG island arrays have been used, resulting in the identifi-
cation of numerous genes that could be under the control of
particular TFs (2–4). However, this approach cannot detect
cis-regulatory elements, which are sometimes located several
kilobases from transcription start sites (TSS) (5). Other
studies have employed high-resolution tiling path arrays,
mainly focusing on chromosomes 21 and 22 (6,7). In these
cases, single or not clearly related TFs have been analysed,
which does not allow the investigation of transcriptional networks.
© The Author 2005. Published by Oxford University Press. All rights reserved.
For Permissions, please email: firstname.lastname@example.org
*To whom correspondence should be addressed. Tel: +46 184714076; Fax: +46 184714808; Email: email@example.com
Human Molecular Genetics, 2005, Vol. 14, No. 22
Advance Access published on October 13, 2005
This study has two main objectives. First, to increase the
knowledge of the complex network of TFs and histone modifications
that act on a gene, by deciphering the interactions and
connections between a set of TFs and histone H3 acetylation.
Secondly, to identify the actual base pairs that these TFs
are interacting with to exert their regulatory effects. Our aim
is to achieve these goals in vivo at a genomic scale, using
the human hepatocytes as a model and by studying disease-related TFs.
Hepatocyte differentiation and metabolism are controlled by
ubiquitous and liver-specific TFs. HNF-4a belongs to the
nuclear receptor family and is considered to be the major regu-
lator of the hepatocyte phenotype (8). Furthermore, HNF-4a
has been associated with both an autosomal dominant form of
diabetes, MODY1 (9), and the common form of type 2 diabetes
hierarchy, e.g. through activation of both HNF-4a and HNF-1a
(11). In the adult liver, HNF-3b regulates lipid metabolism and
ketogenesis during fasting and diabetes, and its subcellular
localization is regulated by insulin (12). The ubiquitous TF
USF1 was recently implicated in familial combined hyperlipi-
daemia, which is characterized by elevated levels of either total
serum cholesterol or triglycerides or both (13), making the
identification of novel USF1 targets a critical issue. TFs need
other proteins known as coactivators (14) to activate transcrip-
tion, many of which possess histone acetyl transferase (HAT)
activity. One of their best characterized targets, histone H3,
has been found to be acetylated at lysines 9 and 14 near transcription
start sites in active genes (15,16).
The Encyclopedia of DNA Elements (ENCODE) project was
initiated to evaluate strategies for identifying all functional
elements in the human genome (17). To this end, a PCR-based
tiling path array covering 1% of the genome was constructed.
Using HepG2 cells, we present an exhaustive characterization
of hepatocyte transcriptional modules. The in vivo binding
sites of HNF-4a, HNF-3b and USF1 and the distribution of
H3 acetylated at lysines 9 and 14 (AcH3) were interrogated
by ChIP-chip. Some of our major observations were the corre-
lation of HNF-3b and HNF-4a binding sites, indicating
cooperativity between these proteins, and the identification
of several potential enhancers located far from annotated
genes. Although USF1 binding sites were scarce, most of them
occurred at proximal promoters, which were usually acetylated
on H3. H3 acetylation was generally found near the 5′ end
of genes, in agreement with recent observations. Most importantly,
analysis of the sequences from sites bound by HNF-4a,
HNF-3b and USF1 showed that we were able to reliably
identify consensus motifs similar to those previously reported.
We also inferred tentative individual binding sites at a base
pair resolution, and some of these sites have been experimen-
tally verified by us and others. This opens the possibility for
similar analyses of all other human TFs. The results have
important implications for the strategies to construct a tiling
path array over the entire human genome.
Assessing quality of antibodies, ChIP protocol and tissue
distribution of proteins
The antibodies against HNF-4a, HNF-3b, USF1 and acetyl-
ated H3 showed high specificity by western-blot analysis
using HepG2 nuclear extracts (Supplementary Material,
Fig. S1). The USF1 antibody recognizes a C-terminal
epitope, which occurs in two isoforms (31 and 43 kDa) (18).
We detected a band of 70–80 kDa that could correspond to
heterodimers of the two isoforms (18). A second antibody
against the N-terminal part of USF1 detected only the larger
isoform, which is mainly nuclear when compared with the
shorter isoform, which is more abundant in the cytoplasm.
HepG2 cells were treated with sodium butyrate, a histone dea-
cetylase inhibitor that increases levels of histone acetylation.
After 2 h of treatment, a clear increase in the band detected
by the anti-acetylated H3 antibody was observed. Furthermore,
all antibodies provided consistent results by immunohistochemistry
in our tissue microarray (TMA) experiments,
where the expression pattern of the proteins was analysed in
48 healthy tissues, 20 tumours and 50 cell lines. As expected,
AcH3 showed a strong nuclear signal in all cells. The larger
USF1 isoform gave a medium-to-strong nuclear staining in
most tissues, whereas the antibody detecting both the large
and small isoforms gave a signal of varying intensity from
both the cytoplasm and nucleus in most tissues. The HNF pro-
teins displayed a more restricted expression pattern and were
mostly detected in abdominal organs. HNF-3b had a nuclear
location with mainly medium intensity. HNF-4a showed
nuclear and some cytoplasmic staining with the strongest
signals in liver, pancreas, kidney, stomach and intestine,
which could be of importance for future studies of type 2 dia-
betes (Supplementary Material, Fig. S2 and Table S1).
Several well-characterized enhancers known to bind HNF-
4a and/or HNF-3b in HepG2 cells were selected (19–21),
and ChIP DNAs were evaluated by PCR (Supplementary
Material, Fig. S3a). Clear enrichments were observed in a
number of genes for both proteins compared with a negative
control, whereas the HNF-1a promoter was only bound by
HNF-4a. The AcH3 antibody has been extensively employed
for ChIP, and we verified the enrichment in the promoter of
HNF-1a. Finally, the USF1 antibody was not pre-evaluated
by ChIP, in order to represent a scenario where no previous
knowledge is available for a certain TF. This approach was
chosen to investigate the potential for using ChIP-chip for
identifying in vivo targets of uncharacterized/predicted TFs,
and if possible, to establish consensus binding sequences on
the basis of in vivo experiments.
Genome-wide localization results
We conducted three independent biological replicates for each
of the proteins studied by ChIP-chip. Furthermore, three ChIPs
without antibody were analysed, as negative controls. No
amplification of the ChIP or input DNA was performed
before labelling, to avoid possible bias introduced by this pro-
cedure. The reproducibility of our biological replicates was
verified by principal component analysis (PCA) (Supplemen-
tary Material, Fig. S4).
The sonicated enriched DNA can often hybridize to neigh-
bouring spots in a tiling path array, resulting in positive signals
from consecutive spots (Fig. 1). For TFs, the spot with the
highest ChIP DNA/input ratio in such blocks was defined as
a unique enriched spot (UES). For AcH3, all enriched spots
were generally considered (Materials and Methods).
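The UES definition above can be illustrated with a minimal sketch. It assumes that each spot carries an integer position along the tiling path, a log2(ChIP DNA/input) ratio and a flag stating whether it passed the enrichment threshold; the `Spot` type and the function name are hypothetical and are not taken from the Materials and Methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spot:
    index: int         # position of the spot along the tiling path (assumed layout)
    log2_ratio: float  # log2(ChIP DNA / input)
    enriched: bool     # True if the spot passed the enrichment threshold


def unique_enriched_spots(spots):
    """Return one spot per block of consecutive enriched spots:
    the spot with the highest log2 ratio in that block."""
    ues = []
    block = []
    for spot in sorted(spots, key=lambda s: s.index):
        if not spot.enriched:
            continue
        # A gap in the indices closes the current block.
        if block and spot.index != block[-1].index + 1:
            ues.append(max(block, key=lambda s: s.log2_ratio))
            block = []
        block.append(spot)
    if block:
        ues.append(max(block, key=lambda s: s.log2_ratio))
    return ues


# Toy data: one block of three enriched spots and one isolated enriched spot.
example = [
    Spot(0, 0.1, False),
    Spot(1, 1.2, True),
    Spot(2, 2.4, True),
    Spot(3, 1.5, True),
    Spot(4, 0.2, False),
    Spot(5, 1.9, True),
    Spot(6, 0.0, False),
]
ues_indices = [s.index for s in unique_enriched_spots(example)]
```

In this toy example the block formed by spots 1 to 3 collapses to spot 2, and the isolated spot 5 is kept as its own UES.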
Genomic regions bound by AcH3 were the most common
finding in our experiments (513 enriched spots), followed by
binding of HNF-4a (194 UESs) and HNF-3b (154 UESs). Significantly
lower numbers were identified for USF1 (31 UESs).
When overlaps between the different sets were calculated, a
clear co-occurrence in binding between HNF-3b and HNF-4a
was observed (40% for HNF-3b, 31% for HNF-4a;
P-value: 6.716E−80). Furthermore, most of the USF1 sites were
heavily acetylated (58%; P-value: 3.523E−20) (Table 1). All
UESs, including those for AcH3, were mapped to the closest
known gene and can be visualized with the UCSC genome
browser using Supplementary Material, File 1.
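The statistical test behind the overlap P-values is not described in this section. As an illustration only, one common way to assess such an overlap is a hypergeometric upper-tail probability; the sketch below assumes that test, and the function name, its arguments and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total_spots, set_a, set_b, overlap):
    """P(X >= overlap) when set_b spots are drawn without replacement
    from total_spots spots, of which set_a belong to the other data set.
    This is an assumed test, not necessarily the one used in the study."""
    max_overlap = min(set_a, set_b)
    favourable = sum(
        comb(set_a, k) * comb(total_spots - set_a, set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favourable / comb(total_spots, set_b)


# Toy numbers only: 10 spots, 4 in set A, 3 in set B, 2 shared.
p_toy = hypergeom_upper_tail(10, 4, 3, 2)
```

To apply this to the HNF-3b/HNF-4a comparison, one would pass the 154 and 194 UESs reported above together with the number of shared spots and the total number of spots on the array, which are not restated here.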
In order to verify the robustness of our genome-wide findings,
we selected 8–14 identified targets from each set of ChIP-chip
experiments and analysed new independent ChIPs by PCR.
We confirmed 11 of 12 UESs for HNF-4a, 12 of 14 for
HNF-3b, eight of eight for USF1 and eight of eight for
AcH3 (Supplementary Material, Fig. S3b).
One of the regions included in the ENCODE arrays contains
the apolipoprotein C3/A4/A1 cluster, which has been exten-
sively studied in liver and HepG2 cells. Several promoters
and enhancers in this region have been characterized in
HepG2 cells, mainly through in vitro approaches (22).
We were able to identify all previously known regulatory
elements, with the exception of the APOC3 enhancer located
0.8 kb upstream of this gene, because this element was
covered by a low quality spot. Importantly, we identified
new potential regulatory elements, which usually correspond
to regions showing some level of evolutionary conservation.
Especially interesting was the identification of new binding
sites for HNFs at the proximal promoter of APOA5 and in
the intergenic region between APOA5 and APOA4. The
latter element could be equally important in APOA4 and
Figure 1. Identification of UESs. Hybridization of sonicated enriched DNA for a particular binding site onto tiling path arrays
results in groups of neighbouring spots displaying high log2(ChIP DNA/input) ratios. One particular example from HNF-4a experiments is presented.
Table 1. Overall number of identified binding sites and overlaps between the
different data sets
Total number of enriched spots and UES (in bold) is presented. The
different overlaps were calculated using UES for the three TFs and
total enriched spots for AcH3, because longer DNA sequences (more
than one spot) can be bound by this modified histone. The significance
of the different overlaps is indicated by the P-values presented
as subscripts within parentheses.
APOA5 transcriptional control. Intriguingly, the 3′ end of the
APOC3 gene was found to be occupied by both HNF-4a and
HNF-3b (Fig. 2).
Odom et al. (4) identified gene targets of HNF-1a, HNF-4a
and HNF-6 in human liver and pancreas, using ChIP-chip with
an array comprising 13 000 promoters. They discovered that
HNF-4a regulated more genes than previously expected, by
binding ∼12% of the promoters. In agreement with this observation,
we have identified a large number of HNF-4a binding
sites. However, most of our binding sites occur far from any TSS,
and only ∼3% of the RefSeq promoters are directly
bound. Of the promoters included in both arrays, we detected
8/11 genes assigned a P-value of <0.05 in the Odom study.
Some of the variation can probably be explained by differences
in cells studied, laboratory protocols or statistical
The modification of histone H3 through acetylation occurs
near a gene’s TSS and is positively associated with gene
activity in model organisms (15). Bernstein et al. (16)
confirmed these findings in humans, using the same antibody
and cells as in our study, and high-resolution tiling path oligo-
nucleotide arrays of chromosomes 21 and 22. Three of the
ENCODE regions are located on these chromosomes, and
there is a convincing concordance in the results, with 87%
of the acetylated sites passing Bernstein’s highest cut-off
located within 5 kb of an entry in our data set. Only five of
the acetylated sites in our study are >5 kb from any entry in
the Bernstein study. However, Bernstein et al. found three
times as many acetylated spots using the lower cut-off,
which could be due to their higher resolution or a higher
rate of false positives (Fig. 3).
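The concordance figure quoted above (the fraction of sites in one data set lying within 5 kb of an entry in the other) can be computed along the following lines. This is a minimal sketch that assumes each site is reduced to a single coordinate on one chromosome; the function name and the toy coordinates are hypothetical.

```python
from bisect import bisect_left


def fraction_within(query_positions, reference_positions, max_distance=5000):
    """Fraction of query positions that lie within max_distance bp
    of at least one reference position (single chromosome assumed)."""
    ref = sorted(reference_positions)
    if not query_positions or not ref:
        return 0.0
    hits = 0
    for pos in query_positions:
        i = bisect_left(ref, pos)
        # Only the reference entries flanking pos can be the nearest ones.
        neighbours = ref[max(i - 1, 0): i + 1]
        if any(abs(pos - r) <= max_distance for r in neighbours):
            hits += 1
    return hits / len(query_positions)


# Toy coordinates: only the first query site has a reference entry within 5 kb.
concordance = fraction_within([1000, 20000, 50000], [3000, 26000])
```

With real data the calculation would be repeated per chromosome and in both directions, since the two percentages need not be symmetric.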
Identification of consensus sequence and tentative binding sites
Our effort to identify motifs representing consensus binding
sequences for a certain TF started by defining a strict set of
bona fide enriched spots. We hypothesized that each TF
Figure 2. Apolipoprotein A1/C3/A4 cluster regulatory elements as quality controls. Different well-characterized cis-regulatory elements in the ApoA1/C3/A4
cluster were collected from the literature, and their positions and some of the TFs binding to them are shown (upper panel: coloured circles). The UES identified
in our ChIP-chip experiments are presented using the UCSC browser (lower panel). UESs for each of our four proteins are displayed as black squares, relative to
the genomic position, and the rest of significantly enriched spots are displayed in grey. Those elements reproduced in our analysis are surrounded by circles,
following the same colour code as in the upper panel. The apolipoprotein genes are also represented (in blue), as well as the level of conservation of the
whole genomic region displayed (bottom). USF1 bindings in APOA5 (49) and APOC3 (50) promoters were not detected in our analysis; however, our low-
glucose conditions in the first case, and mutually exclusive binding with HNF-4a in the second case, might explain these discrepancies.
binding could result in enrichment of several neighbouring
spots, as shown in Figure 1. The definition of UES was
found to be a critical issue to achieve high quality consensus
sequences. Using a motif-finding program (BioProspector)
(23) on the sequences of our UES, we reproducibly found a
consensus highly similar to the previously established consen-
sus binding sequences for USF1, HNF-4a and HNF-3b
(Fig. 4A). The robustness of our consensus sequences is
shown by the high similarity between the best motifs obtained
in 10 independent BioProspector runs (Fig. 4B). We calculated
the probability of obtaining the consensus sequence by chance
and found that this was highly unlikely (Fig. 4C). Finally,
when similar analysis was performed in randomly selected
spot sequences, a motif contained in Alu repeats was consist-
ently obtained. Thus, our data indicate that in vivo ChIP-chip
experiments are able to detect consensus sequences similar to
those found in the TRANSFAC database (24). To eliminate
false positives, we required that the same base pairs should
be used to create the consensus in at least five of the 10 iter-
ations using BioProspector. Using these criteria, we inferred
the following number of tentative binding sites (TBSs) for
the TFs: 159 for HNF-4a, 132 for HNF-3b and 36 for
USF1. Our large number of in vivo generated TBSs should
be compared with the TRANSFAC motifs, which are based on 32,
24 and 81 in vitro observations, respectively. The genome-wide
mapped locations of the TBSs at base pair resolution
are presented in Supplementary Material, Table S2.
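The reproducibility filter described above (a site must contribute to the consensus in at least five of the 10 BioProspector iterations) can be sketched as a simple vote across runs. The representation of a site as a (spot identifier, start position) tuple is an assumption made for illustration, as are the function name and the toy data; BioProspector output would first have to be parsed into this form.

```python
from collections import Counter


def stable_sites(runs, min_runs=5):
    """Keep the sites reported in at least min_runs of the motif-finder runs.
    Each run is an iterable of hashable site identifiers."""
    counts = Counter()
    for run in runs:
        counts.update(set(run))  # count each site at most once per run
    return sorted(site for site, n in counts.items() if n >= min_runs)


# Toy data: site_a appears in 10 runs, site_b in 5 and site_c in 4.
site_a, site_b, site_c = ("spot1", 120), ("spot2", 40), ("spot3", 77)
runs = []
for i in range(10):
    run = [site_a]
    if i < 5:
        run.append(site_b)
    if i < 4:
        run.append(site_c)
    runs.append(run)
tbs = stable_sites(runs)
```

Here only `site_a` and `site_b` survive the filter, mirroring the way tentative binding sites are retained only when they recur across iterations.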
We were able to confirm several previously established
HNF-4a binding sites at base pair resolution, most of them
located in the apolipoprotein C3/A4/A1 cluster. We identified
TBS in the APOC3 promoter, the APOC3/APOA4 intergenic
enhancer and the F10 promoter, which were exactly as defined
in the previous studies. Furthermore, we detected one out of
two binding sites in the APOA4 promoter, but did not find
the reported HNF-3b binding site in the APOA1 promoter.
This promoter has been suggested to contain two HNF-4a
binding sites, which are relatively different from the estab-
lished consensus and they were not identified in our study.
Instead, we found a TBS in the first intron, close to the TSS
(Supplementary Material, Table S3).
Furthermore, electrophoresis mobility shift assay (EMSA)
experiments were performed using oligonucleotides designed
from TBSs identified by BioProspector. For each of the three
proteins investigated, we could confirm that they bound these
Figure 3. AcH3 regions that we identified in chromosomes 21 and 22 are very similar to those previously reported. Part of the ENCODE region ENm005 is
presented, displaying from top to bottom: the results from our ChIP-chip experiments for all the four proteins investigated, with black and grey corresponding
to UES and the rest of significantly enriched spots. The same region was recently analysed by Bernstein et al. and their analysis of AcH3 is presented with the
low- and high-stringency cut-offs used (P-value 20 = P < 10^-2 and P-value 40 = P < 10^-4). The known-genes contained in this region, as well as the level of
evolutionary conservation, are displayed.
Figure 4. In vivo ChIP-chip experiments can find the consensus binding sequence of a particular TF. (A) Predicted motifs compared with TRANSFAC weight
matrices. Logos were created from TBSs for HNF-4a (159 sites), HNF-3b (132 sites) and USF-1 (36 sites), found by BioProspector in at least five out of 10 iter-
one TBS in 31 of 31 USF1 targets. Those numbers were 124/154 for HNF-3b and 152/194 for HNF-4a. (B) Our predicted binding motifs are highly stable and
reproducible. Consensus logos for the 10 separate runs of BioProspector on the UES for each TF. Each consensus logo is built from all predicted binding sites
motifs from Figure 3A. Each result from BioProspector is associated with a motif score that is calculated with respect to a null distribution of the score. The red line
shows the null distribution, calculated from 100 Monte Carlo simulations by setting the flag ‘-r’ to 100, and the blue crosses show scores for our consensus
matrices in each of the 10 iterations. (D) EMSA experiments were performed with oligonucleotides containing TBSs identified by BioProspector in our ChIP-
chip data sets. From left to right, the results for labelled HNF-4a consensus (sequence from spot stSG634982), labelled USF1 (sequence from spot
together with a positive reaction (pos) containing nuclear extracts. The rest of the reactions contained nuclear extracts, together with 100× excess self-competitor
(self.comp) (unlabelled consensus oligonucleotide), 100× excess of unrelated competitor (unr.comp) (unlabelled Sp1 consensus), 1 μg antibody against protein of
interest or 1 μg of unrelated antibody (Sp1, Ap2a). Specific shifted bands are indicated with filled arrows and supershifted bands with open arrows.
sequences. This indicates that our identified consensus
sequences are highly accurate (Fig. 4D). In conclusion, our
data suggest that a similar approach can be used for the in vivo
characterization of the perhaps 2000 TFs with unknown
DNA binding sequences, provided that specific antibodies
are available. The results for USF1 illustrate this clearly,
Recent reports indicate that long-range enhancers and prox-
imal promoters are in close proximity in the cellular context,
owing to formation of chromatin loops (25,26). When our
HNF-4a binding sequences in proximal promoters, defined
as within 5 kb from TSS, were analysed with BioProspector,
the HNF-4a consensus sequence was not found among the
top motifs (Fig. 5A). There could be several explanations
for this observation. First, there might be a fraction of HNF-
4a binding sites that are different from the established consen-
sus, even though it is difficult to explain why they should be
more frequent in promoters. Secondly, HNF-4a might act as
a coactivator, interacting with another TF(s) but not with the
DNA. Thirdly, it might be the case that some of the HNF-
4a interactions with proximal promoters are indirect through
formation of enhancer/promoter loops with HNF-4a binding,
occurring mainly in distal regulatory elements (Fig. 5B).
The motif presented in Figure 5A could be the combined
result of these three alternatives, but we favour the last
hypothesis, because we identified many binding sites far
from proximal promoters. Our data indicate that some of the
promoters positive for HNF-4a in the study by Odom et al.
might be enriched on the basis of indirect interactions, and
this might explain why they found HNF-4a consensus
sequences in only 9% of such promoters. It is likely that posi-
tive signals are generated in every ChIP-chip study as a con-
sequence of indirect interaction, but we believe that by
identifying TBS, it is possible to identify and distinguish
some of them.
HNF-3b and HNF-4a are major regulators
of hepatocyte phenotype
Out of the spots enriched for HNF-4a and HNF-3b, a fraction
was enriched for both proteins. The shared spots had higher
enrichments (log2-ratios) than those that bound only one of
the two factors (2.37 versus 1.97 for HNF-3b and 2.28 versus
1.83 for HNF-4a), indicating cooperativity in their bindings.
This is also suggested by the observed proximity between
TBSs for the two proteins. We determined the distances
between the TBS for one protein and the closest TBS for the
other, and observed a clear over-representation of HNF-4a
within the typical size of an enhancer, and a trend towards colo-
calization within 100 bp can be observed (Fig. 6A).
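The distance analysis described above, in which each TBS for one protein is paired with the closest TBS for the other and the distances below 5 kb are grouped in 100 bp windows (Fig. 6A), can be sketched as follows. The sketch assumes that TBSs are given as single coordinates on one chromosome; the function names and toy coordinates are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def nearest_distance(pos, sorted_ref):
    """Distance in bp from pos to the closest entry of a sorted, non-empty list."""
    i = bisect_left(sorted_ref, pos)
    candidates = sorted_ref[max(i - 1, 0): i + 1]
    return min(abs(pos - r) for r in candidates)


def distance_histogram(positions_a, positions_b, window=100, max_distance=5000):
    """Count sites in positions_a by distance to the nearest site in positions_b,
    keeping only distances below max_distance, binned in windows of `window` bp.
    Keys are the lower bound of each window."""
    ref = sorted(positions_b)
    if not ref:
        return {}
    hist = Counter()
    for pos in positions_a:
        d = nearest_distance(pos, ref)
        if d < max_distance:
            hist[(d // window) * window] += 1
    return dict(sorted(hist.items()))


# Toy coordinates: distances of 30 bp, 450 bp and 7500 bp (the last is dropped).
hist = distance_histogram([100, 1050, 9000], [130, 1500, 20000])
```

A trend towards colocalization would show up as an excess of counts in the lowest windows compared with the counts obtained from randomly placed sites.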
HNF-4a and HNF-3b showed similar genomic distribution
of their binding sites, with many of them occurring at long dis-
tances from TSS, both upstream and downstream (Fig. 6B
and C), as could be expected for enhancer elements. Among
the genes closest to the binding sites for these hepatocyte
nuclear factors, there were representatives of various gene
ontology (GO) (27) biological function categories, but inter-
estingly genes involved in lipid metabolism and transport
were significantly over-represented when compared with a
random distribution (Supplementary Material, Fig. S5).
USF1 binds to proximal promoters and is associated
with acetylated H3: a model for investigating
The total number of enriched spots found in the USF1 exper-
iments was considerably lower than that for the other proteins
investigated. However, when an enrichment was detected, it
was at similar levels as those observed for HNFs, indicating
Figure 5. A fraction of HNF-4a binding sites located in proximal promoters can be indirect as the result of protein–protein interactions. (A) Logos were created
using either HNF-4a UES located within 5 kb from TSS or the rest of HNF-4a UESs that are not in proximity of TSS. (B) From top to bottom, four different
scenarios are presented: (1 and 2) represent the case where HNF-4a binding sites are located in proximal promoters (defined as within 5 kb from TSS), with some
of them being similar to the established consensus sequence (1) and others being rather different from such consensus (2). (3) represents HNF-4a acting as a
coactivator, interacting with another TF which in fact binds directly to the promoter DNA. (4) shows a scenario implying a looping model, where the detection of
HNF-4a enrichment in proximal promoters could be the result of HNF-4a binding to distal regulatory elements and interacting with coactivators and/or other
TFs bound to the promoter regions. We have shown that among promoters bound by HNF-4a, there was an over-representation of a motif not corresponding to
HNF-4a consensus (Fig. 5A). When that sequence was analysed by Match (http://www.gene-regulation.com/cgi-bin/pub/programs/match/bin/match.cgi), several
TFs could potentially bind to it, some of them previously reported to be able to interact with HNF-4a (SREBP-1, SREBP-2, SMAD3, SMAD4 and Sp1) (36–39).
HNF-4a is represented in yellow, other potential TFs in orange and different coactivators in green.
that our antibody against USF1 worked efficiently in ChIP
(Supplementary Material, Fig. S3b and c). This suggests that
USF1 has a restricted number of targets in HepG2 cells,
more similar to proteins like HNF1 and HNF6 (4). The
genomic distribution of USF1 targets was different from
HNF-4a and HNF-3b, because most of USF1 UESs were
located in proximal promoters (Fig. 6B and C). It has been
recently suggested that USF1 exerts its trans-activation
effects through recruitment of coactivators that possess HAT
activity (28). In agreement with this, the overlap between
USF1 binding and AcH3 (58%; P-value: 3.523E−20), as
well as the mean level of acetylation of USF1 targets
(log2-ratio = 1.56), was higher than the overall level of
AcH3 (normalized log2-ratio = 0).
As previously stated, we started our ChIP-chip experiments
for USF1 without including the previously known positive
controls. In spite of this, the newly identified USF1 UESs
were verified to be the bona fide targets of this protein, by
various methods. First, some of the new targets were con-
firmed as true positives by PCR analysis of ChIP DNA (Sup-
plementary Material, Fig. S3b). Secondly, the same targets
were analysed by PCR analysis of ChIP DNA obtained
using a second antibody against USF1, which confirmed all
tested enriched spots (Supplementary Material, Fig. S3c).
Thirdly, the sequence analysis of USF1 UES resulted in the
identification of an over-represented motif highly similar to
(Fig. 5A). This suggests that the same strategy could be
used in the case of completely uncharacterized TFs. Immuno-
histochemistry can be used on TMA to determine the cell type/
tissue expressing the TF, and western blot can characterize the
specificity of the antibody. ChIP-chip can then be performed
to identify in vivo targets of the protein, as well as provide a
predicted consensus binding sequence on the basis of in vivo
experiments and TBSs. Analysis of GO of genes near
binding sites may give indications of which biological pro-
cesses the TF is regulating.
Histone H3 acetylation is a histone modification that
frequently occurs near TSS and is associated
with TF binding sites
Our genome-wide identification of regions acetylated in
histone H3 confirms that this is a common modification.
In general, the genomic location of AcH3 displayed a clear
preference for regions near TSS (Fig. 6B and C). Regions
immediately downstream of TSS displayed the highest levels
of AcH3 (Supplementary Material, Fig. S6). Our results are
in agreement with recent reports both in humans and other
eukaryotic organisms (15,16) but extend the knowledge of
this distribution to previously uncharacterized regions.
Most of the USF1 targets were acetylated on histone H3 (58%;
P-value: 3.523E−20). This was also true for a relatively high
number of HNF-4a (31%; P-value: 4.353E−44) and, to a
lesser degree, HNF-3b UESs (14%; P-value: 7.551E−10).
Interestingly, when the AcH3 log2-ratios for HNF targets
were investigated, the highest level was found for unique
HNF-4a UESs (1.18), followed by shared HNF-4a/HNF-3b
UESs (0.93), whereas the acetylation level for unique HNF-3b
UESs (0.50) was roughly half that of the other
Figure 6. Characteristics of the identified protein binding sites. (A) Distances
between HNF-4a TBS and nearest HNF-3b TBS (Supplementary Material,
Table S2) were calculated, and the cases where such distance was <5 kb
are presented in windows of 100 bp. (B) Distribution of UES compared
with the UCSC ‘known genes’. Each spot’s midpoint was used to calculate
the distance to the closest gene (5′ or 3′) within the boundaries given; spots
>−20 kb from any TSS and >+5 kb from any 3′ end were grouped as intergenic.
For AcH3, UESs were considered for calculating distances to known
genes. (C) Distribution of the ‘intergenic’ UES from (B) around entries in
the GenScan data set. Although not as obvious as for the curated data set,
an over-representation of acetylation in proximity of TSS and within the pre-
dicted genes can be seen.
two groups. These observations suggest that coactivator
recruitment (HAT) is a trans-activating mechanism employed
by HNF-4a, but not by HNF-3b.
After the completion of the human genome sequence, attention
has been directed towards determining the function of non-
coding sequences. High levels of evolutionary conservation
have been observed in numerous non-coding elements, and
it has been proposed that a major function of them may be
to regulate gene activity (29). Determining which TFs
bind to each conserved, and possibly non-conserved, element
in every tissue is a long-term goal in biology. The ultimate
map would contain binding sites at base pair resolution, but
currently we are far from this and mostly have to resort to
comparison to in vitro characterized consensus sequences.
Therefore, in vivo identification of TF targets and refinement
of consensus sequences are important aims in the post-
sequencing era. The major finding of our work is the develop-
ment of an in vivo strategy to define consensus motifs for
USF1, HNF-4a and HNF-3b. These motifs are highly
similar to the previously in vitro established consensus
binding sequences, which confirms that most of our UESs are
bona fide binding sites for the investigated proteins. In
addition, the finding of over-represented motifs opens the
possibility to infer the exact base pair where in vivo DNA–
protein interactions occur in a particular cis-regulatory
element. We present the putative location, at base pair resol-
ution, of suggested DNA–protein interactions for each ident-
ified TBS. Some of these predicted binding sites have been
experimentally confirmed by us and others. Further analysis,
combined with more powerful statistical and bioinformatic
approaches, will improve these predictions. These novel
binding sites and nearby SNPs may be further investigated,
e.g. by genetic analysis in diseases affecting glucose, lipid
or cholesterol metabolism.
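To illustrate how an over-represented motif can narrow a roughly 1 kb array element down to a base pair position, the following Python sketch scans a spot sequence on both strands for a consensus and converts each match to genomic coordinates. The canonical E-box CACGTG used in the example, the IUPAC matching scheme, the function names and the coordinates are all assumptions made for illustration; they are not the motifs or the code used in this study, where motifs were derived with BioProspector.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]",
         "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYSWKMN", "TGCAYRSWMKN")


def reverse_complement(motif):
    """Reverse complement of a DNA string written with IUPAC codes."""
    return motif.translate(COMPLEMENT)[::-1]


def scan_spot(sequence, spot_start, consensus):
    """Return sorted (genomic_start, genomic_end, strand) tuples for every match.

    spot_start is the 1-based genomic coordinate of the first base of the
    array element (a hypothetical value in the example below).
    """
    hits = set()
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # A lookahead is used so that overlapping matches are also reported.
        pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
        for match in pattern.finditer(sequence.upper()):
            start = spot_start + match.start()
            hits.add((start, start + len(motif) - 1, strand))
    return sorted(hits)


# Hypothetical 10 bp fragment starting at genomic position 1000.
print(scan_spot("TTCACGTGAA", 1000, "CACGTG"))
```

Because CACGTG is palindromic, the same position is reported on both strands; a non-palindromic consensus yields strand-specific hits.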
Our data have important implications for strategies to
construct high-resolution arrays covering the whole genome.
The resolution in a ChIP-chip experiment is determined by
the size of the sonicated enriched DNA hybridized to the
array and by the size of the array elements (Fig. 1). The
DNA in our experiments was sonicated to a range between
500 and 2000 bp, and the average size of the spots in this
array is 1100 bp. We have shown that when applying opti-
mized protocols and strict statistical evaluation, we can ident-
ify consensus sequences and TBSs at base pair resolution. This
means that a tiling path array at 1000 bp resolution may be
enough to map many of the binding sites for sequence-specific
TFs. Such an array may contain around 2 000 000 elements
for the whole human genome. For these purposes, the high-
resolution arrays with 51 874 388 probes covering the
genome at 46 bp resolution (30) and 74 180 611 probe pairs
covering 30% of the genome at 5 bp resolution (31) may not
be necessary. Future experiments will determine which
arrays give the best balance between cost and resolution.
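The order of magnitude of such an element count can be checked with a back-of-envelope calculation. The Python sketch below assumes a haploid genome of about 3.1 Gb (a figure not taken from this paper) tiled end to end without overlap; the real number depends on tile length and on how much of the genome is actually tiled.

```python
def tiles_needed(genome_bp, tile_bp):
    """Number of non-overlapping tiles of tile_bp needed to cover genome_bp."""
    return -(-genome_bp // tile_bp)  # ceiling division on integers


GENOME_BP = 3_100_000_000  # assumed approximate haploid human genome size

# Tile lengths spanning the 1-1.5 kb amplicon range used for this array.
for tile_bp in (1000, 1500):
    print(tile_bp, tiles_needed(GENOME_BP, tile_bp))
```

Under these assumptions, 1.5 kb tiles give roughly 2 million elements and 1 kb tiles roughly 3 million, both far fewer than the probe counts of the oligonucleotide arrays cited above.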
The second major goal of our study is to understand how a
tissue-specific transcriptional program is constructed, by
analysing the interconnections between different TFs and
epigenetic modifications. Tissue-specific gene expression is
under the control of several ubiquitous and tissue-specific
TFs, and liver is the best-characterized mammalian tissue in
this aspect. HNF-4a and HNF-3b play major roles in liver
function and differentiation (8,12), and their collaborative
relationship in the transcriptional control of several liver
genes is well established (32). In concordance with these
observations, we found that HNF-4a and HNF-3b are
common binders across the genome, with a high similarity
in their binding patterns and many of their targets located
far from known genes. Proximity between their TBSs further
suggests regulatory cooperation. Despite these similarities,
individual HNF-4a UESs displayed higher AcH3 levels than
those occupied by HNF-3b only, indicating that HNF-4a
acts more often through recruitment of HATs. These results
together with previous knowledge about the two proteins
suggest a sequential and cooperative model of transcriptional
control. HNF-3b is already detectable in the early gastrula,
playing a major role in visceral endoderm differentiation.
HNF-3 proteins occupy the albumin enhancer early in
development, even before the gene is activated (33). They
are capable of binding directly to chromatin, thereby initiating
chromatin opening events and are therefore considered
‘pioneer’ factors (34). Applied to liver transcriptional
control, we could hypothesize that HNF-3b, which is
expressed at a very early stage, creates chromatin marks at
cis-regulatory elements, making them accessible to other
TFs, sequentially expressed during development. Among
these TFs, HNF-4a has been shown to establish a myriad of
protein–protein interactions including those with other
important TFs (35–39), several coactivators (40,41), some
with HAT activity and members of the RNA PolII machinery
(42). All these interactions identify HNF-4a as a good
example of a protein present in enhanceosomes (Supplemen-
tary Material, Fig. S7).
In our genome-wide interrogation for USF1 targets, we
observed a clear preference of USF1 binding to proximal pro-
moters, and furthermore, a high correlation with AcH3. This is
in full agreement with a recent study, where West et al. (28)
reported that USF proteins interact with histone modifying
enzymes (Set7/9, PCAF, p300/CBP) and promote H3K4
methylation and AcH3. We found that USF1 binding was
much less common than binding of the HNFs, even though the
enrichment levels in its targets were comparable to those of
HNFs. This might be explained by the fact that our HepG2
cells were cultivated in a constant relatively low glucose
(2 g/l) medium. Several reports suggest that USF1 and USF2
are involved in promoting the liver glucose response, in
which a change from low- to high-glucose conditions activates
certain genes (43). In this respect, further experiments should
elucidate how different stimuli, resembling the metabolic
stress encountered in certain diseases, e.g. familial combined
hyperlipidaemia and type 2 diabetes with the accompanying
metabolic syndrome, modulate the hepatocyte transcriptome.
In conclusion, we have investigated some aspects of how
liver-specific gene expression is achieved and highlighted
some of the mechanisms implicated. Furthermore, our in
vivo genome-wide identification and inference of TF binding
sites should serve as an example of how ChIP-chip technology
can be used in the identification of targets and consensus
binding sequences for uncharacterized TFs. This could dramatically
increase our understanding of the complex transcriptional
networks acting in a cell, where the role of most key
players still remains elusive.
MATERIALS AND METHODS
Cell culture and nuclear extracts preparation
HepG2 cells were grown in RPMI-1640 medium (Sigma-
Aldrich), supplemented with 10% FBS (Gibco, Invitrogen),
1% PEST (Gibco, Invitrogen) and 1% glutamine (Gibco,
Invitrogen), at 37 °C with 5% CO2. For nuclear extract
preparation, cells were treated with cell lysis buffer for
10 min on ice. Nuclei were resuspended in 1× RIPA buffer
for 10 min on ice. Samples were centrifuged, and supernatants
were kept at −70 °C until used.
Western blot and antibodies
HepG2 nuclear extracts were separated on NuPAGE 4–12%
Bis–Tris gel (Invitrogen) and transferred to polyvinylidene
fluoride membrane (Amersham Biosciences), which was
developed using ECL Advance Western Blotting Detection
Kit (Amersham Biosciences). Antibodies against USF1 (C-20 and H-86) and
HNF-3b were purchased from Santa Cruz Biotechnology;
antibody against HNF-4a was purchased from Active Motif
and antibody against AcH3 was purchased from Upstate Biotechnology.
Chromatin immunoprecipitation
HepG2 cells were grown as described earlier. Around 10^8
subconfluent cells were used per ChIP experiment. Cells were
crosslinked with 0.37% formaldehyde for 10 min and resuspended
in cell lysis buffer for 10 min on ice. Nuclei were
resuspended in 1× RIPA buffer and kept on ice for another
10 min. Chromatin was sonicated to a size of 0.5–2 kb and
pre-cleared by incubating with protein G-agarose (Roche)
for at least 1 h at 4 °C with slow rotation. At this step, a
fraction of the pre-cleared chromatin was kept as input DNA and
the rest was incubated with 10 µg antibody at 4 °C overnight,
and 100 µl of protein G-agarose were used for each ChIP
reaction. Protein G-agarose was washed four times with 1× RIPA
buffer, once with ChIP washing buffer and once with 1× TE
buffer. DNA–protein complexes were eluted, treated with
RNaseA (Amersham Biosciences) and incubated at 65 °C for
6 h in order to reverse crosslinks. Proteins were degraded by
Proteinase K (Amersham Biosciences), and DNA was
extracted by phenol/chloroform/isoamyl alcohol extraction,
purified and resuspended in water.
Microarray construction: primer design, PCR reactions
Each array element was generated by PCR using specific
primer pairs tiling through all ENCODE regions. Primers
were selected, so that the resulting amplicons were 1–1.5 kb
long, minimally overlapping and were allowed to contain
repetitive elements. Gaps in the tiling array were filled with
relaxed parameters chiefly allowing 180 bp to 1.5 kb and
30–70% GC. All the forward primers are normalized to the
same length to format for Illumina synthesis. For each array
element, one PCR reaction was performed. As a template,
we used mainly genomic DNA (Roche/Sigma). A minor set
of amplicons was amplified with BAC, PAC or fosmid
DNA. Typically, the failure rate was <20%. Failed PCR
reactions were repeated. Spotting buffer was added to the PCR
products at a final concentration of 250 mM sodium phosphate,
pH 8.5, 0.00025% sarkosyl, followed by spin filtration using
96-well filtration plates (Millipore). These array elements
were printed without any further purification onto activated
amine-binding slides (Codelink, Amersham), using a
BioRobotics TAS arrayer with a 48-pin tool. Most array elements
were printed once onto each slide (about 19 000 spots/slide);
only the X-chromosomal regions (ENm006 and ENr324) were
printed in duplicate. The final array covers 75% of the
ENCODE regions.
DNA labelling and microarray hybridization
The DNA obtained from a single ChIP reaction was labelled
with Cy5, and a fraction of the total input was labelled with
Cy3 (1/5 of total input DNA for HNF-4a, HNF-3b, USF1
and no-antibody samples and 1/3 of total input for acetylated
H3). For labelling reactions, the Bioprime Labelling system
(Invitrogen) was used. Labelled DNA was purified using
Amersham G50 columns. ChIP/Cy5 and total input/Cy3
DNAs were combined and ethanol precipitated together with
human Cot-1 DNA, and the resulting pellet was resuspended
in hybridization buffer. The arrays were pre-hybridized with
human Cot-1 and salmon-sperm DNA, followed by addition
of the hybridization solution containing the labelled DNAs.
The arrays were then washed, dried and scanned in a
GenePix 4000B scanner (Axon Instruments, Molecular Devices).
Microarray data analysis, motifs discovery
and GO categories
The computational data analysis was divided into three major
parts. The first was performed in the LCB-Data WareHouse
(http://www.lcb.uu.se/lcbdw.php), where data were pre-processed
through various spot filters and normalizations. Visualization
through PCA was used to assess the data quality of each
replicate before and after pre-processing. In addition, a log-odds
(B-score) for differential enrichment with respect to the nega-
tive control was calculated using an empirical Bayes method
(44). Each spot then becomes associated with four B-scores,
which represent the log-odds of it being enriched by
USF1, HNF-3b, HNF-4a and/or AcH3, respectively.
Empirically, spots were considered enriched when the B-score was
> 0 and the log2-ratio was > 1.25.
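The enrichment rule above amounts to a two-threshold filter, which can be sketched in Python as follows. The `Spot` record, its field names and the example values are hypothetical; only the two cut-offs come from the text, and the log2-ratio is assumed here to be log2(ChIP Cy5 / input Cy3). The B-scores themselves would come from the empirical Bayes analysis (44), which is not reproduced.

```python
from dataclasses import dataclass

B_SCORE_MIN = 0.0       # cut-offs quoted in the text
LOG2_RATIO_MIN = 1.25


@dataclass
class Spot:
    spot_id: str        # hypothetical identifier of the array element
    b_score: float      # log-odds of enrichment versus the negative control
    log2_ratio: float   # assumed to be log2(ChIP Cy5 / input Cy3)


def is_enriched(spot):
    """A spot is called enriched only if it exceeds both cut-offs."""
    return spot.b_score > B_SCORE_MIN and spot.log2_ratio > LOG2_RATIO_MIN


def enriched_spots(spots):
    """Return the subset of spots that pass both thresholds."""
    return [s for s in spots if is_enriched(s)]


spots = [
    Spot("spot_a", b_score=2.1, log2_ratio=1.80),
    Spot("spot_b", b_score=-0.4, log2_ratio=1.60),  # fails the B-score cut-off
    Spot("spot_c", b_score=3.0, log2_ratio=1.10),   # fails the log2-ratio cut-off
]
print([s.spot_id for s in enriched_spots(spots)])
```

Requiring both conditions guards against spots with a large ratio but poor reproducibility, and against reproducible but weak signals.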
To determine the overlaps in binding between the different
TFs and AcH3, UESs were used for TFs and total enriched
spots for AcH3. In all cases, only bindings occurring in the
same spot were taken into consideration. The P-values for
the overlaps were calculated under the hypothesis that the
size of the overlaps is hypergeometrically distributed, which
would be the case if selection of UES was done at random.
The background used for each comparison is taken to be the
common spots of the two proteins tested before the selection
of UES was made.
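As a minimal sketch of this calculation, the function below computes a hypergeometric tail probability for the overlap between two sets of spots, using only the Python standard library. The use of a one-sided upper tail, the argument names and the toy numbers are assumptions for illustration and are not taken from the analysis code of this study.

```python
from math import comb


def overlap_p_value(n_background, n_a, n_b, n_overlap):
    """P(X >= n_overlap) for a hypergeometric X.

    n_background: spots common to both experiments before UES selection
    n_a, n_b:     number of UESs for factor A and factor B in that background
    n_overlap:    number of spots that are UESs for both factors
    """
    total = comb(n_background, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_background - n_a, n_b - k)
        for k in range(n_overlap, min(n_a, n_b) + 1)
    )
    return tail / total


# Toy example: 10 background spots, 3 UESs per factor, all 3 shared.
print(overlap_p_value(10, 3, 3, 3))
```

A small P-value indicates that the two factors share more spots than would be expected if the UESs had been selected at random from the common background.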
Motif discovery was done in several steps. The first, written
in R (45), consisted of detecting enriched spots by both the
log2-ratio and the B-score. Then, a set of UESs among all
enriched spots was created by filtering out adjacent spots
with lower log2-ratios. For the AcH3 data set, all enriched
spots were generally counted, because longer DNA sequences
can be bound by this modified histone, and UESs were only
considered for calculating distances to the closest genes. To
identify the binding sites, the corresponding DNA sequences
were analysed using BioProspector. As BioProspector is
non-deterministic, we repeated the analysis and kept all
binding sites occurring in each top scoring motif to generate
a set of candidates. From these, a set of TBS was obtained
by selecting those present in at least five out of 10 runs. The
motif logos were created using the WebLogo (46) service.
Distances between binding sites were obtained by mapping
TBS on the assembled genome.
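The two filtering steps described above (selecting UESs among enriched spots, and keeping only binding sites that recur across repeated motif-finder runs) can be sketched in Python as follows. This is an illustrative reading, not the authors' R code: "adjacent" is assumed to mean consecutive positions in the tiling path, ties between neighbours are kept, BioProspector itself is not invoked, and all names and example values are hypothetical.

```python
from collections import Counter


def select_ues(enriched):
    """Pick UESs from a dict {tiling-path index: log2-ratio} of enriched spots.

    A spot is dropped if an immediately adjacent enriched spot has a higher
    log2-ratio; tied neighbours are both kept.
    """
    ues = []
    for idx, ratio in enriched.items():
        neighbours = [enriched[j] for j in (idx - 1, idx + 1) if j in enriched]
        if all(ratio >= r for r in neighbours):
            ues.append(idx)
    return sorted(ues)


def reproducible_sites(runs, min_runs=5):
    """Keep candidate sites reported in at least min_runs motif-finder runs.

    runs is a list with one set of candidate sites per run.
    """
    counts = Counter(site for run in runs for site in run)
    return sorted(site for site, n in counts.items() if n >= min_runs)


# Spots 10-12 are adjacent; spot 20 is isolated.
print(select_ues({10: 1.5, 11: 2.4, 12: 1.9, 20: 1.3}))

# Ten hypothetical runs: site_A appears in 7 of them, site_B in 4.
runs = [{"site_A", "site_B"}] * 4 + [{"site_A"}] * 3 + [set()] * 3
print(reproducible_sites(runs, min_runs=5))
```

Requiring a site to recur in at least half of the runs is a simple way to stabilise the output of a stochastic motif finder.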
Finally, each spot on the array was mapped to its closest
gene and the corresponding GO (27) ids, allowing a signifi-
cance score (P-value) to be calculated for each GO-term
under the null hypothesis that GO-terms in the UES-sets are
distributed as on the whole array.
A new set of ChIPs was performed with (ChIP DNA) and
without antibody (no-Ab DNA) to serve as templates for
PCR verification of newly identified binding sites. PCRs
were performed using the same volume from ChIP DNA,
no-Ab DNA and a dilution of the input (generally 1/30).
Different numbers of cycles (25–35) were used for the differ-
ent primers tested in order to determine the conditions where
linear amplification occurs. Enrichment was scored visually by
comparing the PCR amplification from ChIP DNA to no-Ab
and input DNAs.
Electrophoretic mobility shift assay
HepG2 nuclear extracts were prepared as previously described
(47). Nuclear extracts were mixed with binding buffer,
… 32P-labelled probe. For competition reactions,
a certain excess of unlabelled probe was added to the
mixture. For supershift assays, 1 µg of antibody was used.
TMA design and immunohistochemistry
The TMAs were designed as described previously (48). A
spectrum of 48 normal tissues, 20 different cancers and 50
cell-lines were sampled. Immunohistochemistry was done
according to the instructions from the manufacturer of the
EnVision kit (DAKO Cytomation, Glostrup, Denmark)
using an automated immunostaining instrument, Autostainer
Supplementary Material is available at HMG Online.
ACKNOWLEDGEMENTS
We thank Ulf Landegren for critical reading of the manuscript.
This work was supported by the Swedish Research Council,
the Wellcome Trust, the US National Human Genome
Research Institute (grant no. 5 U01 HG003168), the Markus
Borgström foundation and Knut and Alice Wallenberg Foundation.
Conflict of Interest statement: The authors declare that they
have no competing financial interests.
REFERENCES
1. Carey, M. (1998) The enhanceosome and transcriptional synergy. Cell,
2. Weinmann, A.S., Yan, P.S., Oberley, M.J., Huang, T.H. and Farnham, P.J.
(2002) Isolating human transcription factor targets by coupling chromatin
immunoprecipitation and CpG island microarray analysis. Genes Dev., 16,
3. Li, Z., Van Calcar, S., Qu, C., Cavenee, W.K., Zhang, M.Q. and Ren, B.
(2003) A global transcriptional regulatory role for c-Myc in Burkitt’s
lymphoma cells. Proc. Natl Acad. Sci. USA, 100, 8164–8169.
4. Odom, D.T., Zizlsperger, N., Gordon, D.B., Bell, G.W., Rinaldi, N.J.,
Murray, H.L., Volkert, T.L., Schreiber, J., Rolfe, P.A., Gifford, D.K. et al.
(2004) Control of pancreas and liver gene expression by HNF
transcription factors. Science, 303, 1378–1381.
5. Kleinjan, D.A. and van Heyningen, V. (2005) Long-range control of gene
expression: emerging mechanisms and disruption in disease. Am. J. Hum.
Genet., 76, 8–32.
6. Martone, R., Euskirchen, G., Bertone, P., Hartman, S., Royce, T.E.,
Luscombe, N.M., Rinn, J.L., Nelson, F.K., Miller, P., Gerstein, M. et al.
(2003) Distribution of NF-kappaB-binding sites across human
chromosome 22. Proc. Natl Acad. Sci. USA, 100, 12247–12252.
7. Cawley, S., Bekiranov, S., Ng, H.H., Kapranov, P., Sekinger, E.A.,
Kampa, D., Piccolboni, A., Sementchenko, V., Cheng, J., Williams, A.J.
et al. (2004) Unbiased mapping of transcription factor binding sites along
human chromosomes 21 and 22 points to widespread regulation of
noncoding RNAs. Cell, 116, 499–509.
8. Parviz, F., Matullo, C., Garrison, W.D., Savatski, L., Adamson, J.W.,
Ning, G., Kaestner, K.H., Rossi, J.M., Zaret, K.S. and Duncan, S.A.
(2003) Hepatocyte nuclear factor-4 alpha controls the development of a
hepatic epithelium and liver morphogenesis. Nat. Genet., 34, 292–296.
9. Yamagata, K., Furuta, H., Oda, N., Kaisaki, P.J., Menzel, S., Cox, N.J.,
Fajans, S.S., Signorini, S., Stoffel, M. and Bell, G.I. (1996) Mutations in
the hepatocyte nuclear factor-4 alpha gene in maturity-onset diabetes of
the young (MODY1). Nature, 384, 458–460.
10. Silander, K., Mohlke, K.L., Scott, L.J., Peck, E.C., Hollstein, P.,
Skol, A.D., Jackson, A.U., Deloukas, P., Hunt, S., Stavrides, G. et al.
(2004) Genetic variation near the hepatocyte nuclear factor-4 alpha gene
predicts susceptibility to type 2 diabetes. Diabetes, 53, 1141–1149.
11. Duncan, S.A., Navas, M.A., Dufort, D., Rossant, J. and Stoffel, M. (1998)
Regulation of a transcription factor network required for differentiation
and metabolism. Science, 281, 692–695.
12. Wolfrum, C., Asilmaz, E., Luca, E., Friedman, J.M. and Stoffel, M. (2004)
Foxa2 regulates lipid metabolism and ketogenesis in the liver during
fasting and in diabetes. Nature, 432, 1027–1032.
13. Pajukanta, P., Lilja, H.E., Sinsheimer, J.S., Cantor, R.M., Lusis, A.J.,
Gentile, M., Duan, X.J., Soro-Paavonen, A., Naukkarinen, J., Saarela, J.
et al. (2004) Familial combined hyperlipidemia is associated with
upstream transcription factor 1 (USF1). Nat. Genet., 36, 371–376.
14. Spiegelman, B.M. and Heinrich, R. (2004) Biological control through
regulated transcriptional coactivators. Cell, 119, 157–167.
15. Roh, T.Y., Ngau, W.C., Cui, K., Landsman, D. and Zhao, K. (2004)
High-resolution genome-wide mapping of histone modifications. Nat.
Biotechnol., 22, 1013–1016.
16. Bernstein, B.E., Kamal, M., Lindblad-Toh, K., Bekiranov, S.,
Bailey, D.K., Huebert, D.J., McMahon, S., Karlsson, E.K., Kulbokas, E.J.,
III, Gingeras, T.R. et al. (2005) Genomic maps and comparative analysis
of histone modifications in human and mouse. Cell, 120, 169–181.
17. ENCODE Project Consortium. (2004) The ENCODE (Encyclopedia of
DNA Elements) Project. Science, 306, 636–640.
18. Saito, T., Oishi, T., Yanai, K., Shimamoto, Y. and Fukamizu, A. (2003)
Cloning and characterization of a novel splicing isoform of USF1.
Int. J. Mol. Med., 12, 161–167.
19. Rouet, P., Raguenez, G., Tronche, F., Mfou’ou, V. and Salier, J.P. (1995)
Hierarchy and positive/negative interplays of the hepatocyte nuclear
factors HNF-1, -3 and -4 in the liver-specific enhancer for the human
alpha-1-microglobulin/bikunin precursor. Nucleic Acids Res., 23,
20. Ceelie, H., Spaargaren-Van Riel, C.C., De Jong, M., Bertina, R.M. and
Vos, H.L. (2003) Functional characterization of transcription factor
binding sites for HNF1-alpha, HNF3-beta (FOXA2), HNF4-alpha, Sp1
and Sp3 in the human prothrombin gene enhancer. J. Thromb. Haemost.,
21. Cooper, A.D., Chen, J., Botelho-Yetkinler, M.J., Cao, Y., Taniguchi, T.
and Levy-Wilson, B. (1997) Characterization of hepatic-specific
regulatory elements in the promoter region of the human cholesterol 7
alpha-hydroxylase gene. J. Biol. Chem., 272, 3444–3452.
22. Zannis, V.I., Kan, H.Y., Kritis, A., Zanni, E.E. and Kardassis, D. (2001)
Transcriptional regulatory mechanisms of the human apolipoprotein genes
in vitro and in vivo. Curr. Opin. Lipidol., 12, 181–207.
23. Liu, X., Brutlag, D.L. and Liu, J.S. (2001) BioProspector: discovering
conserved DNA motifs in upstream regulatory regions of co-expressed
genes. Pac. Symp. Biocomput., 127–138.
24. Wingender, E. (1988) Compilation of transcription regulating proteins.
Nucleic Acids Res., 16, 1879–1902.
25. Horike, S., Cai, S., Miyano, M., Cheng, J.F. and Kohwi-Shigematsu, T.
(2005) Loss of silent-chromatin looping and impaired imprinting of DLX5
in Rett syndrome. Nat. Genet., 37, 31–40.
26. Murrell, A., Heeson, S. and Reik, W. (2004) Interaction between
differentially methylated regions partitions the imprinted genes Igf2 and
H19 into parent-specific chromatin loops. Nat. Genet., 36, 889–893.
27. Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H.,
Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T. et al.
(2000) Gene ontology: tool for the unification of biology. The Gene
Ontology Consortium. Nat. Genet., 25, 25–29.
28. West, A.G., Huang, S., Gaszner, M., Litt, M.D. and Felsenfeld, G. (2004)
Recruitment of histone modifications by USF proteins at a vertebrate
barrier element. Mol. Cell, 16, 453–463.
29. Xie, X., Lu, J., Kulbokas, E.J., Golub, T.R., Mootha, V.,
Lindblad-Toh, K., Lander, E.S. and Kellis, M. (2005) Systematic
discovery of regulatory motifs in human promoters and 3′ UTRs by
comparison of several mammals. Nature, 434, 338–345.
30. Bertone, P., Stolc, V., Royce, T.E., Rozowsky, J.S., Urban, A.E., Zhu, X.,
Rinn, J.L., Tongprasit, W., Samanta, M., Weissman, S. et al. (2004)
Global identification of human transcribed sequences with genome tiling
arrays. Science, 306, 2242–2246.
31. Cheng, J., Kapranov, P., Drenkow, J., Dike, S., Brubaker, S., Patel, S.,
Long, J., Stern, D., Tammana, H., Helt, G. et al. (2005) Transcriptional
maps of 10 human chromosomes at 5-nucleotide resolution. Science, 308,
32. Harnish, D.C., Malik, S., Kilbourne, E., Costa, R. and Karathanasis, S.K.
(1996) Control of apolipoprotein AI gene expression through synergistic
interactions between hepatocyte nuclear factors 3 and 4. J. Biol. Chem.,
33. Bossard, P. and Zaret, K.S. (1998) GATA transcription factors as
potentiators of gut endoderm differentiation. Development, 125,
34. Cirillo, L.A., Lin, F.R., Cuesta, I., Friedman, D., Jarnik, M. and
Zaret, K.S. (2002) Opening of compacted chromatin by early
developmental transcription factors HNF3 (FoxA) and GATA-4. Mol.
Cell, 9, 279–289.
35. Eeckhoute, J., Formstecher, P. and Laine, B. (2004) Hepatocyte nuclear
factor-4 alpha enhances the hepatocyte nuclear factor 1alpha-mediated
activation of transcription. Nucleic Acids Res., 32, 2586–2593.
36. Yamamoto, T., Shimano, H., Nakagawa, Y., Ide, T., Yahagi, N.,
Matsuzaka, T., Nakakuki, M., Takahashi, A., Suzuki, H., Sone, H. et al.
(2004) SREBP-1 interacts with hepatocyte nuclear factor-4 alpha and
interferes with PGC-1 recruitment to suppress hepatic gluconeogenic
genes. J. Biol. Chem., 279, 12027–12035.
37. Misawa, K., Horiba, T., Arimura, N., Hirano, Y., Inoue, J., Emoto, N.,
Shimano, H., Shimizu, M. and Sato, R. (2003) Sterol regulatory element-
binding protein-2 interacts with hepatocyte nuclear factor-4 to enhance
sterol isomerase gene expression in hepatocytes. J. Biol. Chem., 278,
38. Kardassis, D., Pardali, K. and Zannis, V.I. (2000) SMAD proteins
transactivate the human ApoCIII promoter by interacting physically and
functionally with hepatocyte nuclear factor 4. J. Biol. Chem., 275,
39. Kardassis, D., Falvey, E., Tsantili, P., Hadzopoulou-Cladaras, M. and
Zannis, V. (2002) Direct physical interactions between HNF-4 and Sp1
mediate synergistic transactivation of the apolipoprotein CIII promoter.
Biochemistry, 41, 1217–1228.
40. Yoshida, E., Aratani, S., Itou, H., Miyagishi, M., Takiguchi, M.,
Osumu, T., Murakami, K. and Fukamizu, A. (1997) Functional
association between CBP and HNF4 in trans-activation. Biochem.
Biophys. Res. Commun., 241, 664–669.
41. Wang, J.C., Stafford, J.M. and Granner, D.K. (1998) SRC-1 and GRIP1
coactivate transcription with hepatocyte nuclear factor 4. J. Biol. Chem.,
42. Malik, S. and Karathanasis, S.K. (1996) TFIIB-directed transcriptional
activation by the orphan nuclear receptor hepatocyte nuclear factor 4.
Mol. Cell. Biol., 16, 1824–1831.
43. Casado, M., Vallet, V.S., Kahn, A. and Vaulont, S. (1999) Essential role
in vivo of upstream stimulatory factors for a normal dietary response of
the fatty acid synthase gene in the liver. J. Biol. Chem., 274, 2009–2013.
44. Smyth, G.K. (2004) Linear models and empirical Bayes methods for
assessing differential expression in microarray experiments. Stat. Appl.
Genet. Mol. Biol., 3. Article 3.
45. R Development Core Team (2004) A Language and Environment for
Statistical Computing. R Foundation for Statistical Computing, Vienna,
Austria. ISBN 3-900051-00-3. http://www.R-project.org.
46. Crooks, G.E., Hon, G., Chandonia, J.M. and Brenner, S.E. (2004)
WebLogo: a sequence logo generator. Genome Res., 14, 1188–1190.
47. Andrews, N.C. and Faller, D.V. (1991) A rapid micropreparation
technique for extraction of DNA-binding proteins from limiting numbers
of mammalian cells. Nucleic Acids Res., 19, 2499.
48. Uhlen, M. and Ponten, F. (2005) Antibody-based proteomics for human
tissue profiling. Mol. Cell. Proteom., 4, 384–393.
49. Nowak, M., Helleboid-Chapman, A., Jakel, H., Martin, G.,
Duran-Sandoval, D., Staels, B., Rubin, E.M., Pennacchio, L.A.,
Taskinen, M.R., Fruchart-Najib, J. et al. (2005) Insulin-mediated
down-regulation of apolipoprotein A5 gene expression through the
phosphatidylinositol 3-kinase pathway: role of upstream stimulatory
factor. Mol. Cell. Biol., 25, 1537–1548.
50. Pastier, D., Lacorte, J.M., Chambaz, J., Cardot, P. and Ribeiro, A. (2002)
Two initiator-like elements are required for the combined activation of the
human apolipoprotein C-III promoter by upstream stimulatory factor and
hepatic nuclear factor-4. J. Biol. Chem., 277, 15199–15206.