JOURNAL OF VIROLOGY, Sept. 1992, p. 5631-5634
Copyright © 1992, American Society for Microbiology
Interactions of HTF4 with E-Box Motifs in the Long Terminal
Repeat of Human Immunodeficiency Virus Type 1
YI ZHANG, KENNETH DOYLE, AND MINOU BINA*
Department of Chemistry, Purdue University, West Lafayette, Indiana 47907-1393
Received 24 March 1992/Accepted 28 May 1992
We have identified three consensus E-box motifs in the long terminal repeat of human immunodeficiency
virus type 1. One of these E boxes interacts selectively with representative members of the class A group of basic
helix-loop-helix proteins, including HTF4, E47, and their heterodimers. Our analyses implicate the helix-loop-helix
proteins in regulation of human immunodeficiency virus type 1 gene expression.
The binding sites for a number of regulatory proteins share
a common core motif (CANNTG) known as an E box
(reviewed in reference 26). For example, CACGTG defines
the core binding motif for heterodimers
of max and c-myc implicated in controlling cell proliferation
(6), and selected E-box motifs in the promoters and enhanc-
ers of cellular genes have been shown to interact with basic
helix-loop-helix (bHLH) proteins which function in activa-
tion of gene expression (reviewed in references 9 and 26).
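As an illustration of the CANNTG core described above, the following minimal Python sketch locates candidate E boxes in a DNA string. The function name `find_e_boxes` and the demonstration sequence are hypothetical; this is not the pattern recognition program used in this study, and it matches only the six-base core without considering flanking sequence.

```python
import re

# CANNTG core: C, A, any two bases, T, G (assumption: plain A/C/G/T input).
# The zero-width lookahead allows overlapping matches to be reported.
E_BOX_CORE = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_e_boxes(sequence):
    """Return (0-based start, hexamer) for every CANNTG match in `sequence`."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in E_BOX_CORE.finditer(seq)]

if __name__ == "__main__":
    # Hypothetical test sequence, not an HIV-1 LTR sequence.
    demo = "ttCACGTGaaCAGCTGcc"
    for start, hexamer in find_e_boxes(demo):
        print(start, hexamer)
```

Because the pattern CANNTG is its own reverse complement, scanning one strand is sufficient to find every occurrence of the core on either strand.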
There are three classes of bHLH proteins: A, B, and C
(30). The class A proteins appear to be ubiquitous (9, 10, 17,
19, 26, 30, 41). They include several transcription factors
expressed from distinct but evolutionarily related genes (41).
The human E2A gene encodes several related proteins
(E47/HE47, E12, and ITF1), produced from differentially
spliced mRNAs (17, 22, 29); a second human gene encodes
proteins related to ITF2 (10, 17); a third gene encodes
proteins related to HTF4 (19, 39, 41).
The class B proteins are cell type and lineage specific (see,
for example, references 5, 9, 18, and 30). The members of
this class interact relatively weakly with DNA, but their
affinity for their cognate sites is dramatically increased when
they form heterodimers with members of class A (see, for
example, references 5 and 30). Gene activation mediated by
bHLH proteins appears to play key roles in differentiation of
lymphocytes, muscle cells, pancreatic β cells, and many
other cell types which acquire their functional capacities
through oligomeric complexes of bHLH proteins (reviewed
in reference 26). In addition, the biological activity of bHLH
proteins is highly regulated by Id proteins (4, 24, 36, 37). By
forming heterodimers with bHLH proteins, Id proteins abolish
their capacity to interact with DNA and thus negatively
regulate their transcriptional activities (4, 36, 37).
To investigate whether bHLH proteins could contribute to
the regulation of human immunodeficiency virus type 1
(HIV-1) gene expression, we have applied a pattern recog-
nition program (1) and identified three E-box motifs in the
long terminal repeat (LTR) of an HIV-1 strain in the data
base. These motifs are schematically shown as boxes I, II,
and III in Fig. 1. Also shown in Fig. 1 are a predicted
negative regulatory element (NRE) in the LTR, the binding
sites for two of the known activators of HIV-1 gene expression
(Sp1 and NF-κB), and the TATA element which mediates
the LTR promoter function (reviewed in reference 15).
One of the E-box motifs in the LTR (E box I) is located
between the TATA element and the transcription initiation
site (Fig. 1). This box appears to be conserved among the
various HIV strains reported in data bases. Its sequence
(CAGCTG) is identical to that of the μE2 core E box in the
enhancer of the immunoglobulin heavy-chain gene (26) and
to that in the AP4 site implicated in controlling the expres-
sion of the simian virus 40 late genes (20, 39). E box II is
within the NRE (Fig. 1). Its sequence is strain dependent but
conforms to a consensus (CACRTG, where R is a purine).
One element (CACGTG) within the consensus is identical to
the E-box recognition sites of max and c-myc heterodimers;
the other (CACATG) is identical to the core κE3 and μE3
elements in the enhancers of the immunoglobulin genes
(reviewed in reference 26). E box III is upstream of the
NRE, and comparative analysis indicates that its sequence
(CAGTTG) is conserved among the various known strains of HIV-1.
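The consensus notation used above (for example, CACRTG, where R is a purine) can be expanded mechanically. The sketch below labels a hexamer with the E-box variants named in the text. The names `consensus_to_regex`, `E_BOX_VARIANTS`, and `classify_hexamer` are hypothetical, and the label strings merely paraphrase this section; the mapping is illustrative and says nothing about binding affinity.

```python
import re

# IUPAC codes needed here: R = purine (A or G), N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}

def consensus_to_regex(consensus):
    """Translate a consensus such as 'CACRTG' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))

# Consensus -> label, following the descriptions in the text (illustrative).
E_BOX_VARIANTS = {
    "CAGCTG": "E box I (muE2 / AP4 core)",
    "CACRTG": "E box II (NRE consensus)",
    "CAGTTG": "E box III (upstream of NRE)",
}

def classify_hexamer(hexamer):
    """Return the labels of every listed consensus that the hexamer satisfies."""
    return [label for consensus, label in E_BOX_VARIANTS.items()
            if consensus_to_regex(consensus).fullmatch(hexamer.upper())]

if __name__ == "__main__":
    for hexamer in ("CAGCTG", "CACGTG", "CACATG", "CAGTTG"):
        print(hexamer, classify_hexamer(hexamer))
```

Both strain-dependent forms of E box II (CACGTG and CACATG) fall under the single CACRTG entry, which is why the consensus rather than a literal hexamer is stored for that motif.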
In our initial analysis, we investigated the interactions of E
box I with HTF4 (Fig. 2), which is a member of the class A
group of bHLH proteins (19, 40, 41). Box I contains a μE2
motif found also in the simian virus 40 AP4 site which
interacts specifically with HTF4 (40). This protein includes
the DNA binding domain and a major portion of a longer
protein (HTF4a) whose sequence has been deduced from
cloned cDNA (41). A transcription factor whose predicted
sequence is nearly identical to that of HTF4a has been
recently described and named HEB (19).
For DNA binding studies, HTF4 fused to glutathione
S-transferase was expressed in Escherichia coli, isolated by
affinity chromatography, and cleaved with thrombin (16, 35)
in order to obtain a relatively pure HTF4. Figure 1 shows the
map locations of three DNA fragments prepared for band
shift analysis from recombinant constructs. Fragment F1
includes the upstream and the NRE E boxes. F2 contains the
μE2 core sequence, the TATA element, and the binding sites
for Sp1 and NF-κB. F3 spans the μE2 core and the TATA
element (Fig. 1). In addition, since HTF4 interacts strongly
with oligomers of the μE2 core motif in the simian virus 40
AP4 site (39, 40), we constructed and cloned multimers of
the AP4 element and multimers of an unrelated element
(AP3) for competition experiments.
Incubation of radiolabelled F3 with purified HTF4 pro-
duced a prominent complex in band shift assays (Fig. 2A).
The formation of this complex can be inhibited with unla-
belled multimer of AP4 but not with unlabelled multimer of
AP3 (Fig. 2A), suggesting that HTF4 interacts specifically
with the μE2 core motif in the LTR. HTF4 does not show a
high affinity for E box II or E box III, since a relatively high
Vol. 66, No. 9
FIG. 1. (A) Locations of selected binding sites for regulatory proteins in the HIV-1 LTR. Shown are the NRE (13, 27, 32) and binding sites
for Sp1 (21) and NF-κB (23, 31). The three E-box motifs described in the text are positioned at nucleotides -303 to -298 (E box III), -166
to -161 (E box II), and -21 to -16 (E box I). (B) DNA fragments used in band shift and footprinting analyses, obtained as restriction
fragments (F1 and F2) or from cloned synthetic oligonucleotides (F3) by standard techniques (33, 34).
protein concentration is required to produce a shifted band
with radiolabelled F1 which contains both motifs (data not
shown). Thus, the binding assays indicate that in the LTR,
the μE2 core represents the preferred HTF4 binding site.
This analysis was further extended by two other binding
studies. In the first, we added unlabelled F2, which contains
the μE2 core, and F1, which contains the NRE and upstream
E boxes, as competitor DNAs to reaction mixtures contain-
ing HTF4 and radiolabelled F2 as the probe (Fig. 2B). The
results show that the NRE and upstream E boxes do not
compete effectively with the probe for binding HTF4. In the
second analysis, we used 1,10-phenanthroline copper ion as
a footprinting reagent (25). HTF4 produced a prominent
footprint on radiolabelled F2 (Fig. 2C) in a region (nucleotides
-26 to -13) which includes the μE2 core motif (Fig.
2C) and extends to include a perfect palindrome
(AAGCAGCTGCTT) surrounding the E box. This finding lends
support to the suggestion that the flanking sequence of the E
box contributes to the specificity of protein binding (19).
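The palindrome noted above can be checked with a short Python sketch: a DNA sequence is palindromic in this sense when it equals its own reverse complement. The helper names (`reverse_complement`, `is_palindrome`, `spans`) are hypothetical. The coordinates used in the second check are the footprint region given in the text and the E box I position given in the legend to Fig. 1.

```python
# Watson-Crick complements for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Reverse complement of an unambiguous DNA string."""
    return sequence.upper().translate(COMPLEMENT)[::-1]

def is_palindrome(sequence):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return sequence.upper() == reverse_complement(sequence)

def spans(outer, inner):
    """True if interval `inner` lies within `outer` (inclusive coordinates)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

if __name__ == "__main__":
    # E box I with its flanking bases, as quoted in the text.
    print(is_palindrome("AAGCAGCTGCTT"))
    # Footprint (-26 to -13) versus E box I (-21 to -16).
    print(spans((-26, -13), (-21, -16)))
```

By the same test, E box III (CAGTTG) is not palindromic, since its reverse complement is CAACTG.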
Since the μE2 core motif in the LTR is the only E box that
interacts detectably with HTF4, we examined the interac-
tions of this motif with several other bHLH proteins of class
A (Fig. 3). The proteins were synthesized in rabbit reticulo-
cyte extracts, since this system provides a relatively efficient
way for producing heterodimers of bHLH proteins for
binding assays (29, 30). The selected class A proteins were
translated from in vitro-transcribed RNA encoding HTF4,
ITF2, E12, HE47, ITF1, and ITF1S (see Fig. 3 for the
constructs used for transcription).
In band shift assays, the in vitro-translated HTF4 pro-
duces a shifted band with radiolabelled F3 (Fig. 3, lane 2), as
observed for the purified protein (Fig. 2). The probe also
formed a detectable complex with HE47 and ITF1S (Fig. 3,
lanes 3 and 7) but not with E12, ITF2, and ITF1 (Fig. 3).
Cotranslation and DNA binding experiments further re-
vealed interactions between the probe and HTF4-HE47 and
Multiple E-box motifs appear to act cooperatively or
synergistically in the activation of cellular genes (26). Therefore,
similar mechanisms might also contribute to the regulation
of HIV-1 gene expression by the three E-box motifs in
the viral LTR (Fig. 1). The results described above have
revealed interactions between E box I and at least two
members of bHLH proteins of class A: HTF4, HE47, and
their heterodimers. E box II in the NRE has previously been
shown to bind USF (14, 27). This box may also interact with
TFE3 and TFEB (2, 3, 7, 14) and with heterodimers of c-myc
and max. E box III conforms to the consensus binding site
for the product of the cellular proto-oncogene c-myb (11).
The expression of c-myb is induced in mitogen-stimulated
peripheral blood lymphocytes and is constitutive in several
CD4+ T-cell and myeloid cell lines, all of which represent
potential targets for HIV-1 infection (11).
A systematic linker-scanning mutational analysis indicates
that sequences between the TATA element and transcription
start site (-21 to -4) are required for optimum LTR-
mediated gene expression in unstimulated, stimulated, and
particularly tat-expressing Jurkat cells (38). Interestingly, in
the mutated sequences, E box I is replaced by an E box
(CATATG) that does not appear to constitute a high-affinity
binding site for HTF4. Potential significance of E box I in the
wild-type sequence can also be inferred from the results of
methylation protection analysis showing that in HIV-1-
infected H9 cells, a G residue in E box I is protected against
chemical modification (12). Generally, G residues in E-box
motifs represent critical contact sites for specific interactions
of DNA with bHLH proteins, including E47, E12 (30), and
HTF4 (data not shown).
Since the results described above indicate that the class A
group of proteins might contribute to the activation of HIV-1
gene expression, it would be of interest to identify their
potential lymphoid-specific class B partners and to deter-
mine the physiological conditions under which their biolog-
ical activities are regulated (4, 36, 37). The protein Tal-1
appears to be a likely candidate for a lymphoid-specific class
B protein. Heterodimers of Tal-1 with E47 and E12 have
FIG. 2. HTF4 binding specificity on the HIV-1 LTR. (A) HTF4
was expressed in E. coli, purified, and subsequently incubated with
end-labelled F3 (0.5 ng) and 0.5 μg of poly(dI-dC) in binding buffer
(HEPES [pH 7.8], 50 mM KCl, 1 mM dithiothreitol, 1 mM EDTA,
8% glycerol) for 30 min at room temperature. The products were
fractionated on a 5% native polyacrylamide gel containing 2.5%
glycerol in TGE (12.5 mM Tris [pH 8.3], 95 mM glycine, 0.5 mM
EDTA). Lane 1 shows the electrophoretic mobility of the probe in
the absence of any protein. The binding reactions contained no
competitor (lane 2), 25 and 80 ng of unlabelled AP4 multimer (lanes
3 and 4), and equivalent amounts of unlabelled AP3 multimer (lanes
3' and 4'). (B) Band shift assays were performed by using HTF4 and
0.5 ng of F2 probe as described above. Lane 1 shows the mobility of
the probe in the absence of added protein. The reactions contained
no competitor (lane 2), 25 and 75 ng of unlabelled F2 (lanes 3 and 4),
and equivalent amounts of unlabelled F1 (lanes 3' and 4'). (C) For
footprinting analysis, HTF4 was incubated with singly-end-labelled
F2 probe as described above; the free and protein-bound DNAs
were separated by gel electrophoresis. Chemical footprinting was
performed within the gel slices (25) containing free and HTF4-bound
DNA identified by autoradiography. The cleavage products were
isolated and analyzed on a 10% denaturing polyacrylamide gel. Lane
G represents a Maxam-Gilbert (28) G reaction with the probe; lanes
F and B represent, respectively, the cleavage products of 1,10-
phenanthroline copper ion reactions with free and HTF4-bound
DNA. The sequence of the protected area is indicated.
been shown to interact specifically with an E-box motif in
vitro (18); in patients with T-cell acute lymphoblastic leuke-
mia, abnormalities in the tal-1 gene result in accumulation of
immature lymphoblasts in bone marrow and peripheral
Id protein represses the activity of several bHLH pro-
teins. DNA binding assays have shown that Id inhibits
complex formation between E47 and muscle-specific factors
(4, 36, 37) and also abolishes the DNA binding activity of
HTF4 and its complexes with myogenic factors (data not
shown). It has been postulated that Id functions as a general
inhibitor of cell differentiation (4, 24). Recent studies indi-
cate that during myeloid differentiation, there is a correlation
between a decrease in Id mRNA and a concomitant appear-
ance of E-box binding activities in nuclear extracts (24).
These findings considered in the context of the results
described above indicate that the potential effects of bHLH
proteins on the activation of HIV-1 gene expression may be
detectable only in appropriately differentiated cells and may
be of significance during development.
FIG. 3. Interaction of E box I with bHLH proteins of class A.
The proteins indicated above the lanes were translated separately or
cotranslated in vitro in rabbit reticulocyte lysates (Promega). For
band shift analysis, a sample (5 μl) of the translation reactions was
incubated with radiolabelled F3 probe (0.2 ng), and the products
were analyzed by electrophoresis as described in the legend to Fig.
2. Lane F shows the mobility of the probe in the absence of added
protein. NS represents a nonspecific band derived from a compo-
nent in reticulocyte lysates. For in vitro protein synthesis, E12 RNA
was transcribed from an E12R cDNA containing plasmid obtained
from D. Baltimore (22, 30); ITF1, ITF1S (a shorter version of ITF1),
and ITF2 RNAs were transcribed from plasmids T7βE2-5, T7βE2-5S,
and T7βE2-2, respectively, obtained from T. Kadesch (17).
HE47 and HTF4 RNAs were transcribed from plasmids which
contained the proteins' cDNAs cloned into the pBSATG vector
described in reference 30. HE47 represents the carboxy terminus
(141 residues) of a HeLa protein (40) related to E47, an immuno-
globulin enhancer-binding protein (29).
We thank D. Baltimore and T. Kadesch for providing plasmids
used in the in vitro transcription-translation experiments.
This research was supported by grants awarded by NIH.
1. Ambrose, C., and M. Bina. 1990. Strategy for statistical-map-
ping of potential regulatory regions in the human genome. J.
Mol. Biol. 216:485-490.
2. Beckmann, H., and T. Kadesch. 1991. The leucine zipper of TFE3
dictates helix-loop-helix dimerization specificity. Genes Dev.
3. Beckmann, H., L.-K. Su, and T. Kadesch. 1990. TFE3: a
helix-loop-helix protein that activates transcription through the
immunoglobulin enhancer μE3 motif. Genes Dev. 4:167-179.
4. Benezra, R., R. L. Davis, D. Lockshon, D. L. Turner, and H.
Weintraub. 1990. The protein Id: a negative regulator of helix-
loop-helix DNA binding proteins. Cell 61:49-59.
5. Blackwell, T. K., and H. Weintraub. 1990. Differences and
similarities in DNA-binding preferences of MyoD and E2A
protein complexes revealed by binding site selection. Science
6. Blackwood, E. M., and R. N. Eisenman. 1991. Max: a helix-loop-
helix zipper protein that forms a sequence-specific DNA-bind-
ing complex with myc. Science 251:1211-1217.
7. Carr, C. S., and P. A. Sharp. 1990. A helix-loop-helix protein
related to the immunoglobulin E box-binding proteins. Mol.
Cell. Biol. 10:4384-4388.
8. Chen, Q., J.-T. Cheng, L.-H. Tsai, N. Schneider, G. Buchanan,
A. Carroll, W. Crist, B. Ozanne, M. J. Siciliano, and R. Baer.
1990. The tal gene undergoes chromosome translocation in T cell
leukemia and potentially encodes a helix-loop-helix protein.
EMBO J. 9:415-424.
9. Cline, T. W. 1989. The affairs of daughterless and the promis-
cuity of developmental regulators. Cell 59:231-234.
10. Corneliussen, B., A. Thornell, B. Hallberg, and T. Grundstrom.
1991. Helix-loop-helix transcriptional activators bind to a se-
quence in glucocorticoid response elements of retrovirus en-
hancers. J. Virol. 65:6084-6093.
11. Dasgupta, P., P. Saikumar, C. D. Reddy, and E. P. Reddy. 1990.
Myb protein binds to human immunodeficiency virus 1 long
terminal repeat (LTR) sequences and transactivates LTR-medi-
ated transcription. Proc. Natl. Acad. Sci. USA 87:8090-8094.
12. Demarchi, F., P. D'Agaro, A. Falaschi, and M. Giacca. 1992.
Probing protein-DNA interactions at the long terminal repeat of
human immunodeficiency virus type 1 by in vivo footprinting. J.
13. Garcia, J. A., F. K. Wu, R. Mitsuyasu, and R. B. Gaynor. 1987.
Interactions of cellular proteins involved in the transcriptional
regulation of the human immunodeficiency virus. EMBO J.
14. Giacca, M., M. I. Gutierrez, S. Menzo, F. D. Di Fagagna, and A.
Falaschi. 1992. A human binding site for transcription factor
USF/MLTF mimics the negative regulatory element of human
immunodeficiency virus type 1. Virology 186:133-147.
15. Greene, W. C. 1990. Regulation of HIV-1 gene expression.
Annu. Rev. Immunol. 8:453-475.
16. Guan, K.-L., and J. E. Dixon. 1991. Eukaryotic proteins ex-
pressed in Escherichia coli: an improved thrombin cleavage and
purification procedure of fusion proteins with glutathione-S-
transferase. Anal. Biochem. 192:262-267.
17. Henthorn, P., M. Kiledjian, and T. Kadesch. 1990. Two distinct
transcription factors that bind the immunoglobulin enhancer
μE5/κE2 motif. Science 247:467-470.
18. Hsu, H.-L., J.-T. Cheng, Q. Chen, and R. Baer. 1991. Enhancer-
binding activity of the tal-1 oncoprotein in association with the
E47/E12 helix-loop-helix proteins. Mol. Cell. Biol. 11:3037-
19. Hu, J.-S., E. N. Olson, and R. E. Kingston. 1992. HEB, a
helix-loop-helix protein related to E2A and ITF2 that can
modulate the DNA-binding ability of myogenic regulatory fac-
tors. Mol. Cell. Biol. 12:1031-1042.
20. Hu, Y.-F., B. Luscher, N. Mermod, and R. Tjian. 1990. Tran-
scription factor AP-4 contains multiple dimerization domains
that regulate dimer specificity. Genes Dev. 5:1741-1752.
21. Jones, K. A., J. T. Kadonaga, P. A. Luciw, and R. Tjian. 1986.
Activation of the AIDS retrovirus promoter by the cellular
transcription factor, Sp1. Science 232:755-759.
22. Kamps, M. P., C. Murre, X.-H. Sun, and D. Baltimore. 1990. A
new homeobox gene contributes the DNA binding domain of the
t(1;19) translocation protein in pre-B ALL. Cell 60:547-555.
23. Kawakami, K., C. Scheidereit, and R. G. Roeder. 1988. Identi-
fication and purification of a human immunoglobulin-enhancer-
binding protein (NF-κB) that activates transcription from a
human immunodeficiency virus type 1 promoter in vitro. Proc.
Natl. Acad. Sci. USA 85:4700-4704.
24. Kreider, B. L., R. Benezra, G. Rovera, and T. Kadesch. 1992.
Inhibition of myeloid differentiation by the helix-loop-helix
protein Id. Science 255:1700-1702.
25. Kuwabara, M. D., and D. S. Sigman. 1987. Footprinting DNA-
protein complexes in situ following gel retardation assays using
1,10-phenanthroline-copper ion: Escherichia coli RNA poly-
merase-lac promoter complexes. Biochemistry 26:7234-7238.
26. Libermann, T. A., and D. Baltimore. 1991. Transcriptional
regulation of immunoglobulin gene expression. Mol. Aspects
Cell Regul. 6:399-421.
27. Lu, Y., N. Touzjian, M. Stenzel, T. Dorfman, J. G. Sodroski, and
W. A. Haseltine. 1990. Identification of cis-acting repressive
sequences within the negative regulatory element of human
immunodeficiency virus type 1. J. Virol. 64:5226-5229.
28. Maxam, A. M., and W. Gilbert. 1977. A new method for
sequencing DNA. Proc. Natl. Acad. Sci. USA 74:560-564.
29. Murre, C., P. S. McCaw, and D. Baltimore. 1989. A new DNA
binding and dimerization motif in immunoglobulin enhancer
binding, daughterless, MyoD, and myc proteins. Cell 56:777-
30. Murre, C., P. S. McCaw, H. Vaessin, M. Caudy, L. Y. Jan, Y. N.
Jan, C. V. Cabrera, J. N. Buskin, S. D. Hauschka, A. B. Lassar,
H. Weintraub, and D. Baltimore. 1989. Interactions between
heterologous helix-loop-helix proteins generate complexes that
bind specifically to a common DNA sequence. Cell 58:537-544.
31. Nabel, G., and D. Baltimore. 1987. An inducible transcription
factor activates expression of human immunodeficiency virus in
T cells. Nature (London) 326:711-713.
32. Rosen, C. A., J. G. Sodroski, and W. A. Haseltine. 1985. The
location of cis-acting regulatory sequences in the human T cell
lymphotropic virus type III (HTLVIII/LAV) long terminal re-
peat. Cell 41:813-823.
33. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular
cloning: a laboratory manual, 2nd ed. Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y.
34. Singh, H., J. H. LeBowitz, A. S. Baldwin, Jr., and P. A. Sharp.
1988. Molecular cloning of an enhancer binding protein: isola-
tion by screening of an expression library with a recognition site
DNA. Cell 52:415-423.
35. Smith, D. B., and K. S. Johnson. 1988. Single step purification of
polypeptides expressed in Escherichia coli as fusions with
glutathione-S-transferase. Gene 67:31-40.
36. Sun, X.-H., N. G. Copeland, N. A. Jenkins, and D. Baltimore.
1991. Id proteins Id1 and Id2 selectively inhibit DNA binding by
one class of helix-loop-helix proteins. Mol. Cell. Biol. 11:5603-
37. Wilson, R. B., M. Kiledjian, C.-P. Sen, R. Benezra, P. Zwollo,
S. M. Dymecki, S. V. Desiderio, and T. Kadesch. 1991. Repres-
sion of immunoglobulin enhancers by the helix-loop-helix pro-
tein Id: implications for B-lymphoid-cell development. Mol.
Cell. Biol. 11:6185-6191.
38. Zeichner, S. L., J. Y. H. Kim, and J. C. Alwine. 1991. Linker-
scanning mutational analysis of the transcriptional activity of
the human immunodeficiency virus type 1 long terminal repeat.
J. Virol. 65:2436-2444.
39. Zhang, Y., J. Babin, A. L. Feldhaus, H. Singh, P. A. Sharp, and
M. Bina. 1991. HTF4: a new human helix-loop-helix protein.
Nucleic Acids Res. 19:4555.
40. Zhang, Y., and M. Bina. 1991. Sequence of a HeLa cDNA
provides the DNA binding domain and carboxy terminus of
HE47: a human helix-loop-helix protein related to the enhancer
binding factor E47. DNA Sequence 2:197-202.
41. Zhang, Y., and M. Bina. 1992. The nucleotide sequence of the
human transcription factor HTF4a cDNA. DNA Sequence