Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 4513-4517, May 1993
Biochemistry

Adaptability at the protein-DNA interface is an important aspect of sequence recognition by bZIP proteins
(DNA-binding protein/yeast GCN4/transcription factor/gene regulation/leucine zipper)

JOON KIM*, DIMITRIS TZAMARIAS*, THOMAS ELLENBERGER†, STEPHEN C. HARRISON†‡, AND KEVIN STRUHL*§

*Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115; †Department of Biochemistry and Molecular Biology and ‡Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138

Communicated by Bert Vogelstein, February 19, 1993

ABSTRACT The related AP-1 and ATF/CREB families of transcriptional regulatory proteins bind as dimers to overlapping or adjacent DNA half-sites by using a bZIP structural motif. Using genetic selections, we isolated derivatives of yeast GCN4 that affect DNA-binding specificity at particular positions of the AP-1 target sequence. In general, altered DNA-binding specificity results from the substitution of larger hydrophobic amino acids for GCN4 residues that contact base pairs. However, in several cases, DNA binding by the mutant proteins cannot be simply explained in terms of the GCN4-AP-1 structure; movement of the protein and/or DNA structural changes are required to accommodate the amino acid substitutions. The quintet of GCN4 residues that make base-pair contacts does not entirely determine DNA-binding specificity, because these residues are highly conserved in the bZIP family, yet many of the bZIP proteins bind to distinct DNA sites. The α-helical fork between the GCN4 DNA-binding and dimerization surfaces is important for half-site spacing preferences, because mutations in the fork alter the relative affinity for AP-1 and ATF/CREB sites. The basic region in the protein-DNA complex is a long isolated α-helix, with no constraints from other parts of a folded domain. From all of these considerations, we suggest that small shifts in position and orientation or local deformations in the α-helical backbone distinguish one bZIP complex from another.

The DNA-binding domains of most eukaryotic transcriptional regulatory proteins can be classified into a relatively small number of distinct structural classes. The bZIP motif (50-60 amino acid residues) consists of two distinct segments, the leucine zipper and the basic region (1). The C-terminal 30 residues form a two-stranded parallel coiled coil (the leucine zipper), which mediates dimerization (2). This leucine zipper symmetrically positions a divergent pair of basic-region α-helices, which pass through the major groove of each DNA half-site (3-6). Upon specific DNA-complex formation, the bZIP segment undergoes a folding transition. The previously unfolded basic region becomes α-helical (7-9), and a quintet of conserved basic-region residues is positioned to make contacts with the DNA (6).

Yeast GCN4 belongs to the AP-1 family of transcription factors that includes the Jun and Fos oncoproteins. The optimal AP-1-GCN4 recognition sequence, ATGA(C/G)TCAT, consists of overlapping half-sites, which are nonequivalent because of the asymmetry imposed by the central C·G base pair (defined as position 0) (6, 10-12). GCN4 also binds with only slightly reduced affinity to the ATF/CREB sequence (ATGACGTCAT), in which the half-sites abut rather than overlap (13). In contrast, the structurally and immunologically related ATF/CREB transcription factors bind much more efficiently to ATF/CREB sites than to AP-1 sites (14). We have therefore proposed that AP-1 and ATF/CREB proteins make similar DNA sequence-specific contacts but differ in their half-site spacing requirements (13).
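
The difference between overlapping and abutting half-sites can be made concrete with a small sketch. The sequences are those quoted in the text; the function names are hypothetical and chosen only for illustration, and the construction assumes that each half-site is read 5' to 3' on its own strand.

```python
# Illustrative sketch only: builds the two site types from the ATGAC half-site.
# Function names are hypothetical; sequences come from the text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def abutting_site(half_site: str) -> str:
    """ATF/CREB-like arrangement: the two half-sites sit end to end."""
    return half_site + reverse_complement(half_site)


def overlapping_site(half_site: str) -> str:
    """AP-1-like arrangement: the half-sites share the central base pair."""
    return half_site + reverse_complement(half_site)[1:]


HALF_SITE = "ATGAC"
atf_creb = abutting_site(HALF_SITE)   # 10 bp, perfectly palindromic
ap1 = overlapping_site(HALF_SITE)     # 9 bp, asymmetric at the center

# Reading the AP-1 site from the other strand gives a different left half-site,
# which is why the two AP-1 half-sites are nonequivalent.
other_strand_half = reverse_complement(ap1)[:5]

print(atf_creb, ap1, other_strand_half)
```

In this sketch the ATF/CREB site equals its own reverse complement, whereas the AP-1 site does not: one strand presents ATGAC and the other ATGAG at the shared central base pair.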

We previously isolated a specificity mutant of GCN4 by a genetic selection for derivatives activating transcription from promoters containing mutant binding sites (15). The mutant protein contains a tryptophan in place of the invariant basic-region asparagine (Asn-235), and it affects specificity at the ±4 position of the AP-1 site. Here, we address the basis of DNA-binding specificity at the more critical positions of the AP-1 site by isolating additional GCN4 specificity mutants. Furthermore, we address the specificity of half-site spacing by generating GCN4 derivatives with altered preferences for the AP-1 and ATF/CREB sites. The resulting changes in DNA-binding specificity are interpreted in terms of the x-ray structure of the GCN4-DNA complex (6), and the implications of these results for the DNA-binding specificities of other bZIP proteins are discussed.

MATERIALS AND METHODS

The methods for degenerate oligonucleotide mutagenesis, genetic selections for GCN4 specificity mutants, phenotypic analysis, and DNA-binding specificity determinations have been described (15).

RESULTS

Isolation of GCN4 Mutants That Functionally Interact with Altered DNA Sites. GCN4 proteins that activate transcription from altered AP-1 target sites were isolated by the genetic selection described (15). A library of 10⁴ GCN4 derivatives averaging 2-bp substitutions in the basic region (residues 236-246) was introduced into yeast strains containing symmetrically mutated AP-1 sites (AGGACTCCT; ATTACTAAT) upstream of the his3 TATA element and structural gene (Fig. 1). Yeast transformants with increased levels of his3 expression were selected by their growth in the presence of aminotriazole. GCN4 plasmids were recovered from these strains, sequenced, and analyzed for their ability to stimulate transcription from a variety of mutant AP-1 sites (Fig. 2).

Upon selection for proteins that could activate transcription from ACGACTCGT, we isolated a derivative in which Ala-238 is changed to tyrosine. This Tyr-238 protein also stimulates transcription from a his3 promoter containing AGGACTCCT, but it is unable to function on AAGACTCTT. Although the Tyr-238 protein only weakly activates transcription from the mutant target sites, it is clearly more effective than wild-type GCN4.
The Tyr-238 protein retains the ability to activate transcription efficiently from the optimal AP-1 site. Thus, the Tyr-238 substitution in the GCN4 basic region broadens the specificity at the ±3 position.

§To whom reprint requests should be addressed.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

[Figure 1: (Upper) GCN4 basic region adjacent to the leucine zipper, with residues 225, 235, 238, 239, 242, 246, and 247 marked. (Lower) his3 promoter carrying the optimal AP-1 site ATGACTCAT, with positions numbered -4 to +4, and the symmetrically mutated sites ACGACTCGT, AAGACTCTT, ATTACTAAT, and ATGTCACAT. Image not reproduced.]

FIG. 1. Isolation of GCN4 specificity mutants. (Upper) Amino acid sequence of the GCN4 basic region adjacent to the leucine zipper is shown with residues identified by specificity mutants underlined. The library of mutant GCN4 proteins subjected to genetic selection was generated by replacing the region between the AlwNI and Pst I sites with a degenerate oligonucleotide containing an average of 2-bp substitutions. (Lower) Target promoters used for genetic selection are derived from a molecule containing the optimal GCN4 binding site upstream of the his3 TR element and structural gene. The central C·G base pair of the optimal binding site is defined as position 0, base pairs to the right are defined as +1 to +4, and base pairs to the left are defined as -1 to -4 (12). Symmetrically mutated derivatives of the GCN4 binding site that respond to the various specificity mutants are shown below with nonoptimal bases underlined.

Two other derivatives were selected for their ability to activate transcription strongly from a promoter containing ATTACTAAT. Both of these derivatives stimulate transcription from the optimal AP-1 site, but they are inactive on sites with other symmetric substitutions at the ±2 position. Both of these ±2 specificity mutants contain two amino acid substitutions. In one case, Ala-239 and Ser-242 are changed to Val-239 and Leu-242. Examination of the individual substitutions indicates that the Val-239 protein functions at ATTACTAAT, although less efficiently than the double-mutant protein, whereas the Leu-242 protein does not. In the other case, Ser-242 and Lys-246 are changed to Cys-242 and Gln-246; both of these substitutions contribute to transcriptional activity on ATTACTAAT.

DNA-Binding Specificities of the Mutant GCN4 Proteins. The various proteins were synthesized in vitro and incubated with 13 target DNA sequences (the optimal site and the 12 possible symmetric mutations at positions ±1, ±2, ±3, and ±4) (Figs. 3 and 4; Table 1). In general, the mutant proteins have gained the ability to bind specific sites to which wild-type GCN4 does not bind. Otherwise, the mutant proteins retain normal GCN4 sequence recognition properties, including binding the optimal site with near wild-type affinity. In all cases, the DNA-binding properties of the mutant proteins are in excellent agreement with their transcriptional activation properties in vivo.
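
The set of 13 target sequences can be enumerated mechanically, which also makes the position numbering explicit. The following is a minimal sketch, assuming that a "symmetric mutation" at ±k means changing the base at -k and placing the complementary base at +k so that the half-site symmetry is preserved; the function name is hypothetical.

```python
# Illustrative sketch only: enumerates symmetric variants of the optimal site.
# Assumes a symmetric mutation at +/-k changes the base at -k and places its
# complement at +k; the function name is hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
OPTIMAL_SITE = "ATGACTCAT"  # positions -4 .. +4, central C at position 0


def symmetric_mutants(site: str) -> dict:
    """Map each distance k from the center to its three symmetric variants."""
    center = len(site) // 2
    variants = {}
    for k in range(1, center + 1):
        wild_type_left = site[center - k]
        for base in "ACGT":
            if base == wild_type_left:
                continue
            mutated = list(site)
            mutated[center - k] = base              # base at position -k
            mutated[center + k] = COMPLEMENT[base]  # complementary base at +k
            variants.setdefault(k, []).append("".join(mutated))
    return variants


mutants = symmetric_mutants(OPTIMAL_SITE)
all_sites = [OPTIMAL_SITE] + [s for k in sorted(mutants) for s in mutants[k]]

for k in sorted(mutants):
    print(f"+/-{k}: {', '.join(mutants[k])}")
print(len(all_sites), "sites in total")
```

Under this assumption, four positions with three alternative bases each give 12 variants, and adding the optimal site gives the 13 sequences tested.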

The Tyr-238 protein behaves similarly to GCN4 except that it binds detectably (although weakly) to AGGACTCCT and ACGACTCGT, confirming that it only affects sequence recognition at the ±3 position. The Cys-242/Gln-246 double-mutant protein also behaves similarly to GCN4, except that it binds to ATTACTAAT with an affinity comparable to that of the optimal site. Both amino acid substitutions are important for high-affinity binding to ATTACTAAT, because the Cys-242 and Gln-246 single-mutant proteins bind this site very weakly. Thus, the dual DNA-binding specificity of the Cys-242/Gln-246 protein is restricted to the ±2 position.

[Figure 2: growth of strains on aminotriazole plates; image not reproduced.]

FIG. 2. Phenotypic analysis. Strains containing the indicated GCN4 and his3 promoter derivatives were plated on medium containing various concentrations of aminotriazole (AT). From top to bottom are shown analyses of specificity at positions ±1, ±2, and ±3.

In accord with their genetic properties, the Val-239 and Val-239/Leu-242 proteins have a slightly reduced affinity for the optimal site, and they efficiently bind ATTACTAAT; the double-mutant protein is more active. Although selected by a change at ±2, the Val-239 and double-mutant proteins also bind to ATGTCACAT as well as they do to the optimal site. Consistent with this observation, these proteins activate transcription from a promoter containing ATGTCACAT upstream of the his3 TATA element. Thus, substitution of Ala-239 with valine affects DNA-binding specificity at both the ±1 and ±2 positions.

[Figure 3: binding panels for GCN4 and the W235, Y238, V239, L242, V239/L242, and C242/Q246 derivatives; image not reproduced.]

FIG. 3. DNA-binding specificities of GCN4 and the mutant proteins. Protein-DNA complexes formed by incubating equivalent amounts of in vitro synthesized ³⁵S-labeled proteins (determined by SDS/PAGE) with the indicated target sequences (mutated residues are underlined). From top to bottom are indicated DNA-binding specificities at positions ±4, ±3, ±2, and ±1.

[Figure 4: binding panels for GCN4 and the C242, Q246, C242/Q246, V239, L242, and V239/L242 derivatives; image not reproduced.]

FIG. 4. DNA-binding specificity at the ±2 position of single- and double-mutant proteins. Protein-DNA complexes formed by incubating equivalent amounts of in vitro synthesized ³⁵S-labeled proteins (determined by SDS/PAGE) with the target sequences containing mutated residues (underlined) at position ±2.

We also carried out detailed DNA-binding specificity experiments on the Trp-235 protein that had previously been shown to affect recognition at the ±4 position (15). The Trp-235 protein binds extremely strongly to AAGACTCTT but not to AGGACTCCT, ACGACTCGT, or any sequence variants at the ±1 or ±2 positions. Indeed, the affinity for AAGACTCTT is higher than for the optimal site, indicating that the Trp-235 substitution alters sequence recognition at both positions ±3 and ±4, with the more pronounced effect being at ±3.
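
The gain-of-binding pattern summarized in Table 1 can be restated programmatically. The sketch below is an illustration under stated assumptions: the qualitative scores are encoded as integers (0 for "not detectable" up to 3 for "+++"), which is a convenience rather than a quantitative scale; site sequences are written with the underlined bases restored; and the CTGACTCAG row is omitted because its column assignment is not recoverable from this copy of the table. The variable and function names are hypothetical.

```python
# Illustrative sketch only: Table 1 scores encoded as integers
# (0 = not detectable, 1 = "+", 2 = "++", 3 = "+++"). This encoding is an
# assumption made for illustration and is not a quantitative affinity scale.
DERIVATIVES = [
    "wild type", "Trp-235", "Tyr-238",
    "Val-239", "Val-239/Leu-242", "Cys-242/Gln-246",
]

# The CTGACTCAG row is omitted (column assignment unclear in this copy).
SCORES = {
    "ATGACTCAT": [3, 2, 3, 2, 2, 3],
    "GTGACTCAC": [3, 1, 3, 1, 1, 2],
    "TTGACTCAA": [2, 2, 1, 0, 0, 1],
    "AAGACTCTT": [0, 3, 0, 0, 0, 0],
    "ACGACTCGT": [0, 0, 1, 0, 0, 0],
    "AGGACTCCT": [0, 0, 1, 0, 0, 0],
    "ATAACTTAT": [0, 0, 0, 0, 0, 0],
    "ATCACTGAT": [0, 0, 0, 0, 0, 0],
    "ATTACTAAT": [0, 0, 0, 1, 2, 3],
    "ATGCCGCAT": [0, 0, 0, 0, 0, 0],
    "ATGGCCCAT": [0, 0, 0, 0, 0, 0],
    "ATGTCACAT": [0, 0, 0, 2, 1, 0],
}


def gained_sites(derivative: str) -> list:
    """Sites bound by the derivative but not detectably by wild-type GCN4."""
    column = DERIVATIVES.index(derivative)
    return sorted(
        site for site, row in SCORES.items()
        if row[0] == 0 and row[column] > 0
    )


for name in DERIVATIVES[1:]:
    print(f"{name}: {gained_sites(name)}")
```

Read this way, each derivative gains binding only at sites altered at one or two neighboring positions, which matches the description in the text.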

Mutations of GCN4 That Affect Half-Site Spacing Specificity. We previously suggested that AP-1 and ATF/CREB proteins make similar DNA contacts but differ in half-site spacing preferences, and we predicted that the connection between the leucine zipper and basic region (residues 244-250) determines the flexibility and specificity of half-site spacing (13). In this regard, there is a consistent difference at the position corresponding to GCN4 residue 247: ATF/CREB proteins have a positively charged residue (nearly always lysine), whereas AP-1 proteins do not (GCN4 contains a leucine) (16). We therefore analyzed Lys-247 and Arg-247 derivatives of GCN4 for their relative binding to AP-1 and ATF/CREB sites (Fig. 5).

Table 1. Binding of GCN4 mutant proteins to various target sequences

                                    GCN4 derivative
Target site   Wild type   Trp-235   Tyr-238   Val-239   Val-239/Leu-242   Cys-242/Gln-246
ATGACTCAT     +++         ++        +++       ++        ++                +++
CTGACTCAG     (+ + + ++; column assignment unclear)     -         -                 +
GTGACTCAC     +++         +         +++       +         +                 ++
TTGACTCAA     ++          ++        +         -         -                 +
AAGACTCTT     -           +++       -         -         -                 -
ACGACTCGT     -           -         +         -         -                 -
AGGACTCCT     -           -         +         -         -                 -
ATAACTTAT     -           -         -         -         -                 -
ATCACTGAT     -           -         -         -         -                 -
ATTACTAAT     -           -         -         +         ++                +++
ATGCCGCAT     -           -         -         -         -                 -
ATGGCCCAT     -           -         -         -         -                 -
ATGTCACAT     -           -         -         ++        +                 -

Relative DNA-binding abilities (based on data in Figs. 3 and 4 and in additional experiments) are indicated as follows: +++, wild-type affinity; ++, somewhat weaker than wild-type affinity; +, weak binding; -, not detectable. When tested, transcriptional activation of the GCN4 derivatives on the indicated target sites in vivo was in excellent accord with the DNA-binding properties in vitro.

[Figure 5: binding panels for GCN4 and the Lys-247, Arg-247, and Lys-247/Tyr-249/Val-250 derivatives; image not reproduced.]

FIG. 5. GCN4 derivatives that alter half-site spacing specificity. Protein-DNA complexes formed by incubating equivalent amounts of in vitro synthesized ³⁵S-labeled proteins (determined by SDS/PAGE) with individual DNA fragments, or a mixture of DNA fragments, containing the ATF/CREB or AP-1 target sequences. Differences in electrophoretic mobility between complexes with the ATF/CREB and AP-1 binding sites are due to the length differences of the two DNA fragments.

Unlike GCN4, which prefers the AP-1 site over the ATF/CREB site by a factor of 5 (13), the Lys-247 and Arg-247 proteins bind the AP-1 and ATF/CREB sites with comparable affinities. In comparison to GCN4, these proteins bind with reduced affinity to the AP-1 site but with wild-type affinity to the ATF/CREB site. A GCN4 derivative in which residues 247, 249, and 250 are replaced by the corresponding residues in CREB appears to show a further reduction in AP-1 binding activity such that the ATF/CREB site is preferred by a factor of 2. Thus, these substitutions alter half-site spacing specificity, but they do not fully convert GCN4 into a protein with typical ATF/CREB DNA-binding properties. Nevertheless, the results indicate that the region between the leucine zipper and DNA-binding surface is critical for half-site spacing specificity, with position 247 playing an important but not fully determinative role.
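
To give a sense of scale for these preference factors, they can be converted into approximate free-energy differences. This is a back-of-the-envelope sketch and not part of the original analysis: it assumes that the reported preference factors behave like ratios of equilibrium binding constants, uses the standard relation ΔΔG = RT ln(ratio), and assumes a temperature of 298.15 K. The function name is hypothetical.

```python
# Illustrative sketch only: converts site-preference ratios into approximate
# free-energy differences. Treating the reported factors as ratios of
# equilibrium constants, and the 298.15 K temperature, are both assumptions.
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)


def preference_energy(ap1_over_atf_ratio: float, temperature_k: float = 298.15) -> float:
    """Return RT*ln(ratio) in kcal/mol; positive values favor the AP-1 site."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(ap1_over_atf_ratio)


wild_type = preference_energy(5.0)           # AP-1 preferred by a factor of 5
creb_like_fork = preference_energy(1 / 2.0)  # ATF/CREB preferred by a factor of 2
net_shift = wild_type - creb_like_fork       # equals RT*ln(10)

print(f"wild type:      {wild_type:+.2f} kcal/mol")
print(f"CREB-like fork: {creb_like_fork:+.2f} kcal/mol")
print(f"net shift:      {net_shift:.2f} kcal/mol")
```

Under these assumptions the fork substitutions shift the relative preference by roughly 10-fold, a modest energetic change, consistent with the statement that position 247 is important but not fully determinative.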

Modeling of the Mutant Protein-DNA Complexes. The crystal structure of the GCN4 bZIP-AP-1 DNA complex (6) demonstrates that Asn-235, Ala-238, Ala-239, Ser-242, and Arg-243 are in contact with the central 7 bp of the binding site. In addition, numerous basic residues anchor GCN4 to its binding site by hydrogen bonds and electrostatic interactions with the phosphodiester backbone. This structure provides a framework for interpreting the functional consequences of amino acid substitutions in the GCN4 mutant proteins. Because these proteins function well on the optimal AP-1 site present in the crystallized protein-DNA complex, we have tried to configure the mutant side chains in orientations that minimally disrupt the wild-type structure. As described below, some of the substituted residues cannot be accommodated by the wild-type orientation of the GCN4 basic region on DNA.

Tyr-238. In the GCN4 complex, the thymine methyl group at ±3 interacts with the methyl group of Ala-238. The Tyr-238 substitution creates a steric clash with the phosphates of bases 4 and 5. This clash can be relieved by a local adjustment of the DNA backbone conformation, for example, as observed in complexes of the bacteriophage 434 repressor with different operators (17). The tyrosine hydroxyl group might then donate a hydrogen bond to the phosphate of the +5 pyrimidine residue. However, the tyrosine ring would still crowd the DNA at position ±4, requiring further adjustment. A distributed set of small, local structural changes, relative to wild-type, may contribute to broadened specificity at position ±3.

Trp-235. Asn-235 interacts directly with both strands of the optimal target site through hydrogen bonds with the ±3 thymine and the ±2 cytosine, and it may also communicate with position ±4 through a hydrogen bond to an intervening water molecule. If Trp-235 is oriented with the long axis of its indole ring pointing away from the DNA, it can be accommodated in the wild-type structure without interfering with DNA contacts made by other residues. The crystal structure is consistent with the possibility that in a complex of the Trp-235 protein with AAGACTCTT, the tryptophanyl side chains might stack against the ±3 thymine methyl groups in the mutant site. The Trp-235 residue is relatively close to ±4, but the basis of the observed specificity change at this position is unclear. The Trp-235 substitution eliminates two hydrogen bonds, including the only direct contact to ±2. It is therefore surprising that the Trp-235 protein binds the optimal GCN4 site with only slightly diminished affinity and that it shows the wild-type preferences at ±2.

Cys-242/Gln-246. In GCN4, Lys-246 is not in close contact with DNA, but Ser-242 directly interacts with the ±3 thymine methyl. Thus, the Gln-246 substitution must affect DNA-binding specificity indirectly, probably by altering the position of the basic region to accommodate new bases at position ±2. It is unclear why Cys-242 is more effective than Ser-242 in allowing dual specificity at position ±2.

Val-239 and Val-239/Leu-242. Ala-239 contacts the ±1 thymine methyl group in the wild-type GCN4 complex. Substitution of the larger Val-239 residue would not affect the thymine contact, but it would crowd the Arg-243 side chain that contacts the central base pair. Arg-243 is invariant in the set of known bZIP proteins, and its contact to the central guanine is energetically significant because GCN4 will bind to an ATGAC half-site but not to ATGAG (13). Crowding of Arg-243 by Val-239 requires some conformational adjustment, which might account for the reduction in affinity of the Val-239 and Val-239/Leu-242 proteins for an optimal AP-1 site. Although Val-239 is located near the ±1 base pair, it is unknown how it tolerates the T·A but not the G·C or the C·G substitution.

Lys-247 and Arg-247. Residue 247 of each monomer lies within the "fork" region where the basic regions diverge from the leucine zipper. If we assume that the protein contacts AP-1 and ATF/CREB half-sites in a similar manner, the fork will be more widely spread in ATF/CREB complexes. Because the Lys-247 and Arg-247 proteins lose affinity for AP-1 sites but bind ATF/CREB sites with normal affinity, we suggest that the Lys-247 and Arg-247 substitutions interfere with the configuration of the fork necessary for AP-1 site binding. It is unlikely, however, that such interference reflects electrostatic repulsion between Lys-247/Lys-247 or Arg-247/Arg-247 pairs, because the corresponding Leu-247 residues in GCN4 are not in close proximity.

DISCUSSION

Functional Analyses of the GCN4-AP-1 Complex. The genetic selection of GCN4 derivatives that function on mutant DNA sequences provides a method for identifying amino acid residues that contribute to DNA-binding specificity. The mutant proteins described here generally retain activity on the optimal AP-1 sequence while gaining the ability to bind specific mutant target sites. Because these GCN4 derivatives were isolated from complex libraries of mutant proteins rather than by directed mutagenesis, it is likely that the residues identified here are important determinants of the strict DNA sequence specificity of GCN4. Indeed, of the five residues that contact the central 7 bp (6), four (Asn-235, Ala-238, Ala-239, and Ser-242) were identified by the GCN4 specificity mutants.

If amino acid substitutions cause only local structural changes, then amino acids and nucleotides identified by the specificity mutants might be predicted to interact in the wild-type complex. Several of the specific contacts inferred in this way (Asn-235 and ±3, Ala-238 and ±3, and Ala-239 and ±1) are indeed observed in the crystal structure. However, GCN4 mutants affecting specificity at ±2 contain substitutions at residues 239, 242, and 246 that are not in contact with base pair ±2 in the wild-type complex. These observations imply that complexes of some of the variant proteins differ from the wild-type structure in more ways than just local perturbations in the vicinity of the altered residues.

Relationship Between the GCN4 Specificity Mutants and Other bZIP Proteins. Although the specificity mutants were sought primarily to understand the basis of GCN4 DNA-binding specificity, some of them are relevant to other bZIP proteins. First, C/EBP and several other bZIP proteins contain a valine at the position corresponding to Ala-239, where a valine substitution in GCN4 affects specificity at ±1 and ±2, and Schizosaccharomyces pombe PAP1 and Saccharomyces cerevisiae YAP1 contain a glutamine at this position. Thus, position 239 is likely to play a role in the distinct DNA-binding specificities of GCN4, C/EBP, and YAP1 (3, 10, 18). Second, two of the GCN4 specificity mutants bind with high affinity to ATTACTAAT. Several AP-1 and ATF/CREB proteins also recognize this sequence, and T(G/T)AC has been proposed as the half-site consensus. It is possible that the mutant and natural bZIP proteins recognize ATTACTAAT in the same way. Third, position 247 plays an important role in half-site spacing and is likely to account for some of the differences between AP-1 and ATF/CREB factors; it may also be important for determining half-site relationships in other bZIP proteins.

Adaptability at the Protein-DNA Interface Is a Critical Determinant for DNA-Binding by bZIP Proteins. The basic region of GCN4 and of other bZIP proteins forms an extended α-helix when it binds to DNA, and no other tertiary interactions within the protein stabilize its conformation. By contrast, most of the other well-studied prokaryotic and eukaryotic DNA-binding domains contain compact, globular modules. Constraints within their folded structures restrict adaptability in the DNA recognition surface. Flexibility is instead built into elements such as the arm of λ repressor and the linker segment of GAL4, which fold when the protein binds DNA, or the joints in zinc-finger proteins, which allow successive fingers to wrap around DNA in the major groove. Moreover, the globular modules are generally tightly anchored to the DNA backbone through peptide-NH groups or small polar residues. As a result, among proteins with a common structural motif, there is a strong relationship between the amino acid residues on the recognition surface and DNA-binding specificity. Proteins containing similar amino acid residues on the recognition surface generally have similar DNA-binding specificities, whereas proteins with distinct specificities differ at these crucial amino acid positions. Thus, substitutions of amino acid residues that normally contact base pairs usually cause large decreases in affinity, because an altered protein cannot adapt to an unaltered site, and efficient binding of a mutant protein to an altered site can often be explained by new interactions between the substituted amino acids and base pairs.

Our results suggest that adaptability in the local conformation and/or positioning of the basic region is an important aspect of sequence recognition by bZIP proteins. For many of the GCN4 specificity mutants, the substituted residues cannot be accommodated by the structure of the wild-type complex. The Val-239 substitution, which affects specificity at ±1 and ±2, requires some adjustment in the protein in order to relieve steric clash with the invariant Arg-243 residue. The Tyr-238 substitution, which broadens specificity at ±3, requires movement of the DNA backbone away from the protein. Other substitutions of larger, hydrophobic residues are permitted at positions 238 and 239 (19), and these presumably cause some perturbation of the protein-DNA interface. The Trp-235 substitution eliminates the only contact to ±2, yet the protein retains normal DNA-binding specificity at this position. Given the central role of Asn-235 in the wild-type complex (hydrogen bonds to ±2 and ±3 and a possible H2O-mediated hydrogen bond to ±4), it is striking that some substitutions have relatively modest effects on DNA-binding affinity (15, 19). Finally, two GCN4 specificity mutants alter specificity at ±2 even though the original (and possibly the substituted) amino acids do not contact ±2. These observations are not simply artifacts of the mutant proteins because, as discussed above, most of them have counterparts in other bZIP proteins.

Adjustments in the α-helical geometry of the GCN4 fork segment and basic region are also likely to accommodate the different half-site spacings of the AP-1 and ATF/CREB sites. Residue 247, which does not contact DNA but lies at the fork between the leucine zipper and basic region, is important for half-site spacing. Assuming that AP-1 and ATF/CREB half-sites are contacted by the same GCN4 residues, the protein must be sufficiently flexible to allow a rotation of 36° and a translation of ~3.3 Å between half-sites, while maintaining these protein-DNA contacts. This amount of flexibility is unprecedented among DNA-binding proteins, presumably because tertiary folding constraints limit movement within other DNA-binding domains.
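
The rotation and translation quoted here correspond to one base-pair step of the double helix, because the ATF/CREB site contains one more central base pair than the AP-1 site. A rough check, assuming idealized B-form DNA with about 10 bp per helical turn and a helical pitch of roughly 33-34 Å (textbook values, not taken from this paper; the `\text` macro assumes the amsmath package):

```latex
\[
\Delta\theta \approx \frac{360^\circ}{10\ \text{bp per turn}} = 36^\circ,
\qquad
\Delta z \approx \frac{33\text{--}34\ \text{\AA\ per turn}}{10\ \text{bp per turn}}
\approx 3.3\text{--}3.4\ \text{\AA}.
\]
```

With the slightly larger helical repeat often quoted for DNA in solution (about 10.5 bp per turn), the twist per step would be closer to 34°, so these figures should be read as approximate.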

Comparative analysis of bZIP protein sequences and their DNA-binding specificities provides an independent argument for conformational variations in basic regions. The five GCN4 residues that make base-pair contacts (6) are very highly conserved in bZIP proteins; Asn-235 and Arg-243 are invariant, whereas Ala-238/Ala-239 and Ser/Cys-242 are present in >80% of bZIP domains (19). Nevertheless, bZIP proteins can differ considerably in their DNA-binding specificities. This situation is in marked contrast to that observed in helix-turn-helix proteins, in which amino acid similarity at the recognition surface is strongly correlated with DNA-binding specificity. Nonconserved residues in the basic region must play a crucial role in the different DNA-binding specificities of bZIP proteins, either by direct base-pair interactions or by indirect effects on the conserved quintet. Both mechanisms require conformational variation in the DNA recognition surface from that of the GCN4-AP-1 complex. These differences may result from variations in the α-helical geometry and/or overall orientation in the major groove of the basic region.

The basic regions of bZIP domains become ordered only upon association with target DNA (7-9) and are not constrained by tertiary interactions within the protein. The absence of a rigid, globular structure makes it plausible that basic regions of bZIP proteins adopt different conformations along DNA. Variable conformations of a given basic region are likely to allow the dual specificities of C/EBP and of the GCN4 derivatives described here. In the case of GCN4, the basic region is held in the major groove primarily by long arginine and lysine side chains, and there is likely some flexibility in the way it is anchored. As these basic residues are highly conserved, such flexibility is likely to be a general feature of bZIP domains.

A precise description of individual protein-DNA complexes will require high-resolution structures. However, the combined evidence from the GCN4-AP-1 complex structure, the sequence comparison of bZIP proteins, and the structural and functional interpretation of our GCN4 specificity mutants provides a strong case that adaptation at the protein-DNA interface is an important aspect of DNA-binding specificity in bZIP proteins.

This work was supported by postdoctoral fellowships from the Damon Runyon-Walter Winchell Foundation (J.K.), Human Frontiers of Sciences (D.T.), and the National Institutes of Health and the Lucille P. Markey Charitable Trust (T.E.) and by research grants from the National Institutes of Health (GM30186 and GM46555 to K.S.).

1. Landschulz, W. H., Johnson, P. F. & McKnight, S. L. (1989) Science 243, 1681-1688.
2. O'Shea, E. K., Klemm, J. D., Kim, P. S. & Alber, T. (1991) Science 254, 539-544.
3. Agre, P., Johnson, P. F. & McKnight, S. L. (1989) Science 246, 922-926.
4. Talanian, R. V., McKnight, C. J. & Kim, P. S. (1990) Science 249, 769-771.
5. Pu, W. T. & Struhl, K. (1991) Proc. Natl. Acad. Sci. USA 88, 6901-6905.
6. Ellenberger, T. E., Brandl, C. J., Struhl, K. & Harrison, S. C. (1992) Cell 71, 1223-1237.
7. O'Neil, K. T., Hoess, R. H. & DeGrado, W. F. (1990) Science 249, 774-778.
8. Weiss, M. A., Ellenberger, T., Wobbe, C. R., Lee, J. P., Harrison, S. C. & Struhl, K. (1990) Nature (London) 347, 575-578.
9. Patel, L., Abate, C. & Curran, T. (1990) Nature (London) 347, 572-575.
10. Hill, D. E., Hope, I. A., Macke, J. P. & Struhl, K. (1986) Science 234, 451-457.
11. Hope, I. A. & Struhl, K. (1987) EMBO J. 6, 2781-2784.
12. Oliphant, A. R., Brandl, C. J. & Struhl, K. (1989) Mol. Cell. Biol. 9, 2944-2949.
13. Sellers, J. W., Vincent, A. C. & Struhl, K. (1990) Mol. Cell. Biol. 10, 5077-5086.
14. Hai, T., Liu, F., Allegretto, E. A., Karin, M. & Green, M. R. (1988) Genes Dev. 2, 1216-1226.
15. Tzamarias, D., Pu, W. T. & Struhl, K. (1992) Proc. Natl. Acad. Sci. USA 89, 2007-2011.
16. Vincent, A. C. & Struhl, K. (1992) Mol. Cell. Biol. 12, 5394-5405.
17. Harrison, S. C. & Aggarwal, A. K. (1990) Ann. Rev. Biochem. 59, 933-969.
18. Moye-Rowley, W. S., Harshman, K. D. & Parker, C. S. (1989) Genes Dev. 3, 283-292.
19. Pu, W. T. & Struhl, K. (1991) Mol. Cell. Biol. 11, 4918-4926.