Structure of the DNA deaminase domain of the HIV-1 restriction factor APOBEC3G.
ABSTRACT The human APOBEC3G (apolipoprotein B messenger-RNA-editing enzyme, catalytic polypeptide-like 3G) protein is a single-strand DNA deaminase that inhibits the replication of human immunodeficiency virus-1 (HIV-1), other retroviruses and retrotransposons. APOBEC3G anti-viral activity is circumvented by most retroelements, such as through degradation by HIV-1 Vif. APOBEC3G is a member of a family of polynucleotide cytosine deaminases, several of which also target distinct physiological substrates. For instance, APOBEC1 edits APOB mRNA and AID deaminates antibody gene DNA. Although structures of other family members exist, none of these proteins has elicited polynucleotide cytosine deaminase or anti-viral activity. Here we report a solution structure of the human APOBEC3G catalytic domain. Five alpha-helices, including two that form the zinc-coordinating active site, are arranged over a hydrophobic platform consisting of five beta-strands. NMR DNA titration experiments, computational modelling, phylogenetic conservation and Escherichia coli-based activity assays combine to suggest a DNA-binding model in which a brim of positively charged residues positions the target cytosine for catalysis. The structure of the APOBEC3G catalytic domain will help us to understand functions of other family members and interactions that occur with pathogenic proteins such as HIV-1 Vif.
Structure of the DNA deaminase domain of the HIV-1
restriction factor APOBEC3G
Kuan-Ming Chen1,2*, Elena Harjes1,2*, Phillip J. Gross1,2,3*, Amr Fahmy4, Yongjian Lu1,2, Keisuke Shindo1,2,3,
Reuben S. Harris1,2,3 & Hiroshi Matsuo1,2
The human APOBEC3G (apolipoprotein B messenger-RNA-
editing enzyme, catalytic polypeptide-like 3G) protein is a single-
strand DNA deaminase that inhibits the replication of human
immunodeficiency virus-1 (HIV-1), other retroviruses and
retrotransposons1–6. APOBEC3G anti-viral activity is circum-
vented by most retroelements, such as through degradation by
HIV-1 Vif7. APOBEC3G is a member of a family of polynucleotide
cytosine deaminases, several of which also target distinct
physiological substrates. For instance, APOBEC1 edits APOB mRNA and
AID deaminates antibody gene DNA8–10. Although structures of
other family members exist, none of these proteins has elicited
polynucleotide cytosine deaminase or anti-viral activity11–16.
Here we report a solution structure of the human APOBEC3G
catalytic domain. Five α-helices, including two that form the
zinc-coordinating active site, are arranged over a hydrophobic
platform consisting of five β-strands. NMR DNA titration
experiments, computational modelling, phylogenetic conservation and
Escherichia coli-based activity assays combine to suggest a DNA-
binding model in which a brim of positively charged residues
positions the target cytosine for catalysis. The structure of the
APOBEC3G catalytic domain will help us to understand functions
of other family members and interactions that occur with patho-
genic proteins such as HIV-1 Vif.
Full-length human APOBEC3G (also known as A3G) is prone to
aggregation and precipitation, especially at high concentrations1,17.
Residues 198–384 are sufficient for DNA deamination but are
similarly insoluble18. To circumvent this problem, we tested 31
individual lysine substitution derivatives of A3G(198–384) for
activity and solubility. Activity was measured using an E. coli-based
rifampicin-resistance (Rifr) mutation assay, which provides a
sensitive genetic readout of DNA cytosine deamination activity
(for example, see refs 12 and 18). As observed previously for alanine
substitutions at these positions18, many of the lysine substitution
mutants retained activity (Supplementary Fig. 1). Several variants,
including L234K and F310K, had improved solubility. L234K and
F310K were combined to yield a protein that was 2.4-fold more
active and 4-fold more soluble (Fig. 1a and Supplementary Fig. 1,
data not shown). Three additional non-detrimental substitutions18,
C243A, C321A and C356A, were incorporated to minimize
the possibility of intermolecular disulphide bond formation and to
maximize long-term stability (Supplementary Fig. 2). The resulting
variant was dubbed A3G-2K3A, and it was 2.7-fold more active
and 4-fold more soluble than the parental protein (Fig. 1a, b).
Importantly, the DNA cytosine deamination activity of A3G-2K3A
was fully dependent on the catalytic glutamic acid E259 (refs 17, 19
and 20, Fig. 1a).
Gel filtration assays were used previously to show that A3G(198–
384) is monomeric18. To bolster this finding and to assess the integ-
rity of A3G-2K3A, the parental protein and the five-substitution
derivative were compared using circular dichroism spectroscopy.
The circular dichroism spectra of A3G(198–384) and A3G-2K3A
virtually superimposed, indicating that the five-substitution derivative
retains the secondary structure of the parental protein (Fig. 1c). In addition,
sedimentation velocity analytical ultracentrifugation profiles were
nearly identical over a range of concentrations, providing strong
evidence that a monomer–dimer or higher order equilibrium is not
*These authors contributed equally to this work.
1Department of Biochemistry, Molecular Biology and Biophysics, 2Institute for Molecular Virology and 3Arnold and Mabel Beckman Center for Genome Engineering, University of
Minnesota, Minneapolis, Minnesota 55455, USA. 4Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston,
Massachusetts 02115, USA.
[Figure 1 panel labels. a: Rifr mutation frequency (×10–7) for VEC, CTD, 2K, 3A and 2K3A. c: mean residue ellipticity (×10–3) from 190 to 250 nm. d: sedimentation coefficient distribution function g(s*) against sedimentation coefficient s*, at 0.15, 0.40, 0.80 and 1.20 mg ml–1.]
Figure 1 | Functional and biophysical properties of A3G-2K3A. a, Capacity
of A3G(198–384) (CTD), its substitution derivatives and the
vector control (VEC) to trigger Rifr mutations in E. coli. Each × represents
the mutation frequency of an independent culture, and the median values
are indicated. b, Solubility of GST, GST–A3G(198–384) (CTD) and
GST–A3G-2K3A, as monitored by SDS–PAGE and Coomassie blue staining
(top panels) or immunoblotting (anti-GST middle panel and anti-A3G
bottom panel). c, Circular dichroism spectra of A3G(198–384) (CTD), 2K
and 2K3A derivatives. d, Sedimentation velocity analytical
ultracentrifugation profiles for A3G-2K3A. The sedimentation coefficient
distribution function g(s*) is shown for various concentrations of
A3G-2K3A. The single peak of the g(s*) distribution indicates that A3G-2K3A is
homogeneous and monomeric.
to calculate an A3G-2K3A molecular weight of 22.3 kDa (which is
within error of the theoretical 22.6 kDa).
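The comparison between fitted and theoretical mass can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the fitted masses and 95% confidence limits quoted in the Methods for the two analysis programs, and asks whether the theoretical monomer mass of 22.6 kDa falls inside each interval. The function and variable names are our own and are not part of either program.

```python
# Fitted molecular weights (kDa) with 95% confidence limits, as quoted in the
# Methods for the two analysis programs. The layout of this table is our own.
THEORETICAL_KDA = 22.6

FITS = {
    "Sedphat": (22.15, 21.85, 22.45),
    "Sedanal": (22.3, 21.9, 22.8),
}


def within_limits(value, lower, upper):
    """Return True if value lies inside the closed interval [lower, upper]."""
    return lower <= value <= upper


def percent_deviation(fitted, theoretical):
    """Signed deviation of the fitted mass from the theoretical mass, in percent."""
    return 100.0 * (fitted - theoretical) / theoretical


if __name__ == "__main__":
    for program, (mass, lower, upper) in FITS.items():
        inside = within_limits(THEORETICAL_KDA, lower, upper)
        deviation = percent_deviation(mass, THEORETICAL_KDA)
        print(f"{program}: {mass} kDa ({deviation:+.1f}% vs theoretical); "
              f"theoretical mass inside 95% limits: {inside}")
```

For scale, a dimer of the same polypeptide would have a theoretical mass of 2 × 22.6 = 45.2 kDa, far outside either interval.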
A3G-2K3A was used for NMR spectroscopy experiments (see
Methods). A total of 2,008 distance constraints were obtained and
used to calculate a solution structure (Fig. 2 and Supplementary
Tables 1 and 2). The superimposition of the ten lowest-energy struc-
comprised of five β-strands and five α-helices, arranged from amino
the platform of β-strands. The catalytic site is further supported by
the α4- and α5-helices, which make extensive stabilizing
hydrophobic contacts with the β-strand platform (Fig. 2d). The secondary
structural elements are connected by loops of varying lengths, with
the β3-to-α2 loop being remarkably well-defined (blue in Fig. 2b).
This loop consists of S284, W285, S286 and P287, residues that are
conserved among DNA deaminases and that are probably important
for the integrity of the active site (see Supplementary Fig. 3 and
The A3G catalytic domain shares some features with previously
known structures. First, the α-β-α Zn2+-binding motif, α1-β3-α2
in A3G-2K3A, is the clearest structural feature of this deaminase
superfamily11,13–16,21 (Fig. 3, top). Second, a subset of the superfamily
members, including human A3G, Staphylococcus aureus transfer
RNA adenosine-editing protein TadA21 and human APOBEC2 (ref.
β-strand of the zinc-coordinating motif and the two subsequent
β-strands arranged in parallel (Fig. 3, bottom). As hypothesized
previously, this organization is probably a key determinant of
substrate specificity, enabling a loop and additional structural
elements to be accommodated between the latter two β-strands8,15,18,22.
In contrast, cytidine deaminases of E. coli, Bacillus subtilis,
Saccharomyces cerevisiae and humans have an anti-parallel β4-β5
organization separated by a small loop11,13,16,23 (Fig. 3, bottom right).
Finally, closer family members, such as APOBEC2 (ref. 15), have a
common overall fold and similar secondary structures (Fig. 3 and
Supplementary Fig. 4). Several previous reports have discussed and
modelled this likelihood8,15,16,18,22,24.
However, A3G-2K3A differs significantly from all previously
reported structures. For instance, the closest family member for
which we have structural information15, APOBEC2, shares only
31% identity overall (Supplementary Fig. 3). As inferred pre-
viously8,15,16,18,22,24, most of these residues are located within the pro-
tein core (35 out of the 86 total core residues), consistent with the
likelihood that these amino acids are critical for forming the overall
scaffold (Supplementary Fig. 4). In contrast, much less identity
occurs among solvent-accessible residues (11 out of 68 total solvent-
accessible residues), which mediate substrate recognition, catalysis
and interactions with other macromolecules (Supplementary Fig. 4).
years have passed since these two proteins were encoded by a single
gene (before vertebrate radiation)8. Thus, as described below, the
A3G-2K3A structure will help us to understand why A3G and other
family members (but apparently not APOBEC2, refs 2, 8, 12, 15 and
25) are endowed with DNA cytosine deaminase and retrovirus restriction activities.
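The contrast between core and surface conservation is easier to see as percentages. The short calculation below uses only the residue counts given in the text for the A3G-2K3A and APOBEC2 comparison (35 of 86 core residues and 11 of 68 solvent-accessible residues identical). The function name is our own, and the pooled figure is simply the two classes added together, which need not reproduce the 31% overall identity exactly because the alignment may count positions differently.

```python
def percent_identity(identical, total):
    """Percentage of positions that are identical between two aligned proteins."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * identical / total


# Residue counts quoted in the text for A3G-2K3A versus APOBEC2.
core = percent_identity(35, 86)
surface = percent_identity(11, 68)
pooled = percent_identity(35 + 11, 86 + 68)

if __name__ == "__main__":
    print(f"core residues:               {core:.1f}% identical")
    print(f"solvent-accessible residues: {surface:.1f}% identical")
    print(f"both classes pooled:         {pooled:.1f}% identical")
```

On these counts, identity in the core is roughly 2.5 times that on the surface, which is the asymmetry the paragraph above describes.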
In addition to surface residue differences, A3G-2K3A has several
remarkable structural features. First, A3G-2K3A (or a derivative in
which L234 is restored) has a unique β2 strand, which is interrupted
by a bulge of six residues (Fig. 2c and Supplementary Figs 4 and 5;
see Supplementary Discussion). In contrast, APOBEC2 has a
continuous 11-residue β2 strand, which mediates dimerization through
by the β2-bulge-β2′ suggest that different contacts will connect N-
and C-terminal domains of A3G. Alternatively, the β2-bulge-β2′
may mediate interactions with RNA and/or other proteins (of
for DNA deamination activity18 (Supplementary Figs 1 and 2).
However, additional data will be needed to fully discount the possibility
that the β2 bulge has a different conformation in the context of the
full-length protein. Second, A3G-2K3A begins with β1, whereas
APOBEC2 has a small α-helix preceding its first β-strand15. Amino
acid alignments suggest that residues 198–202 of A3G may form an
analogous α-helix (ExPASy proteomics tools, http://ca.expasy.org/),
Figure 2 | NMR structure of A3G-2K3A. a, Superimposition of ten NMR
structures showing α-helices in red, β-sheets in yellow and Zn2+ in purple.
b, c, Ribbon diagrams of the NMR structure shown in a from the same
(b) and 180° (c) angles, respectively. The β3-to-α2 and β4-to-α3 loops are
coloured blue in b, and the β2-bulge-β2′ is coloured orange in
c. d, Hydrophobic contacts between α4 and the β-strands and loops of the
Amino acid side chain atoms are coloured yellow (sulphur), red (oxygen),
blue (nitrogen) and white (carbon). Zn2+-binding side chains are coloured
Figure 3 | The relationship of the catalytic domain of A3G to selected
family members. a–d, Human A3G (2jyw), S. aureus TadA (2b3j), human
APOBEC2 (2nyt) and E. coli cytidine deaminase (CDA, 1ctu) Zn2+-binding
motifs (top row) and β-strand organization (bottom row). The amino acid
side chains of the catalytic glutamic acid (E) as well as the Zn2+-binding
histidine (H) and cysteines (C) are indicated.
but this prediction awaits experimental confirmation. Finally, there
(Supplementary Figs 3 and 4). For instance, the zinc-coordinating
helix in APOBEC2, and the conserved S-W285-S motif in A3G and
other DNA deaminases is an S-S-S motif in all known APOBEC2
proteins. Given the prominence of W285 within the A3G catalytic
site (discussed further below), it is likely that the S-S-S motif of
APOBEC2 contributes to this protein’s substrate specificity.
A fundamental question is how A3G and related family members
recognize single-strand DNA (ssDNA). Like many other nucleic-
acid-interacting proteins, we imagined that A3G-2K3A would have
a prominent positively charged surface that would define the DNA-
interacting region. However, the electrostatic potential of the active-
charged residues arranged on an apparent brim surrounding the
concave active-site region (Fig. 4a). To test directly whether any of
these residues interacted with DNA, NMR chemical shift perturbation
experiments were conducted with 15N-labelled A3G-2K3A and
varying concentrations of a 21-base ssDNA oligonucleotide, which
contained an APOBEC3G 5′-CC deamination hotspot (the second C, at the
3′ position, is heavily preferred as a deamination substrate). As expected,
significant chemical shift perturbations occurred predominantly on
the active-site face of A3G-2K3A (Fig. 4b and Supplementary Fig. 6).
Notable perturbations were detected for conserved arginines R215
and R313 and for the catalytic glutamic acid E259. Residues adjacent
to R313 (within the β4-to-α3 loop) and E259 also showed strong
chemical shift perturbations. The two other brim-domain arginines,
R213 and R320, could not be detected with this technique.
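The selection of "significant" perturbations can be sketched as a simple threshold. The Figure 4 legend colours residues whose chemical shift perturbation lies more than 1 s.d. above the average; the code below applies that rule to a table of per-residue values. Everything specific here is an assumption for illustration: the function name, the use of the population standard deviation, and the example numbers, which are invented and are not the measured data.

```python
from statistics import mean, pstdev


def perturbed_residues(shifts, n_sd=1.0):
    """Return (residues above mean + n_sd * s.d., cutoff).

    shifts maps residue labels to chemical shift perturbations (p.p.m.).
    The population standard deviation is an assumption; the source does not
    say which estimator was used.
    """
    values = list(shifts.values())
    cutoff = mean(values) + n_sd * pstdev(values)
    flagged = sorted(res for res, value in shifts.items() if value > cutoff)
    return flagged, cutoff


# Hypothetical per-residue perturbations (p.p.m.); NOT measured values.
EXAMPLE_SHIFTS = {
    "R215": 0.14, "E259": 0.13, "R313": 0.15, "L220": 0.01,
    "A246": 0.02, "N302": 0.015, "K334": 0.02, "D370": 0.01,
}

if __name__ == "__main__":
    flagged, cutoff = perturbed_residues(EXAMPLE_SHIFTS)
    print(f"cutoff = {cutoff:.3f} p.p.m.; flagged residues: {flagged}")
```

A residue that cannot be observed in the spectra (as for R213 and R320 above) simply has no entry in such a table, so its absence from the flagged list says nothing about whether it contacts DNA.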
The NMR titration data were used to build a model for ssDNA
binding (Fig. 4b, c and Supplementary Fig. 6, see Methods). First we
selected an A3G hotspot containing the trinucleotide 5′-C1-C2-T3-3′
to model the DNA interaction. This short sequence was selected
because ssDNA interactions were detected predominantly around
the active site and this sequence spans that region. Second, the target
cytosine (C2) was positioned under H257, analogous to how it ori-
ents in cytidine deaminase crystal structures26,27. Finally, we used all
residues that showed significant chemical shift perturbations to
calculate the lowest-energy structure of an A3G-2K3A–trinucleotide
5′-C1-C2-T3-3′ complex (Fig. 4c).
One notable feature of the DNA-binding model is that the target
bone (that is, without flipping, it cannot access the catalytic glutam-
that contributes significantly to the overall predicted trinucleotide-
to explain the observed specificity of A3G for 5′-CC dinucleotides,
mutation bias (for example, see ref. 3). We hypothesize that DNA
deaminases with different dinucleotide preferences such as AID (5′-RC)
or APOBEC3F (5′-TC) will make similarly robust contacts with
3′ nucleotide T3 would contact both R215 and R213, and that the C2
phosphate would interact with R320.
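The dinucleotide preferences mentioned above can be made concrete with a small scan of a DNA sequence. The sketch below flags every cytosine whose 5′ neighbour matches the stated preference: C for APOBEC3G (5′-CC), T for APOBEC3F (5′-TC) and a purine, A or G, for AID (5′-RC). It is a deliberately crude illustration: the function name, the lookup table and the example sequence are our own, and real enzymes show graded rather than all-or-nothing context preferences.

```python
# Preferred base immediately 5' of the target cytosine, from the text.
# R is the standard code for a purine (A or G).
FIVE_PRIME_PREFERENCE = {
    "APOBEC3G": "C",   # 5'-CC
    "APOBEC3F": "T",   # 5'-TC
    "AID": "AG",       # 5'-RC
}


def target_cytosines(sequence, enzyme):
    """Return 0-based positions of cytosines preceded by the preferred base."""
    allowed = FIVE_PRIME_PREFERENCE[enzyme]
    seq = sequence.upper()
    return [i for i in range(1, len(seq))
            if seq[i] == "C" and seq[i - 1] in allowed]


if __name__ == "__main__":
    example = "ATTCCCTAGCA"  # hypothetical ssDNA, written 5' to 3'
    for enzyme in FIVE_PRIME_PREFERENCE:
        print(enzyme, target_cytosines(example, enzyme))
```

In the trinucleotide used for modelling, 5′-C1-C2-T3-3′, this scan picks out C2 for APOBEC3G, matching the target cytosine placed in the active site.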
To test this brim-domain model for DNA binding, we first asked
whether conserved residues would be required for activity. The
model predicted that R215 and R313 would promote DNA binding,
W285 would help to form the hydrophobic active site, and E259, as
shown previously, would mediate catalysis. As expected, all of these
2). Second, because R213 and R320 were predicted to interact with
the phosphate backbone of ssDNA, we hypothesized that they would
be influential but non-essential for activity. Accordingly, a non-
invasive substitution at these positions might be tolerated, but a
negatively charged substitution might render the protein inactive
by repelling the phosphate backbone. Indeed, R213A and R320A
derivatives still retained 20% of wild-type activity, whereas R213E
and R320E derivatives were nearly inactive (Fig. 4d and Supplementary
Fig. 2). Thus, the A3G-2K3A solution structure, NMR DNA titration
cytosine deaminase activity data combined to support the brim-
domain model for ssDNA binding.
The catalytic domain of the HIV-1 restriction factor A3G repre-
sents the first high-resolution ssDNA deaminase structure. This
structure will facilitate studies on related proteins such as the
mRNA editor APOBEC1, the antibody gene deaminase AID and
other family members that elicit retroelement restriction activity.
Although our data strongly support a novel model for DNA inter-
contacts and to explain major substrate differences such as the spe-
such as APOBEC1). As a practical consideration, we anticipate that
similar mutagenesis strategies may be used to improve the solubility
[Figure 4d panel labels: Rifr mutation frequency (×10–7); significance annotations P < 0.0075 and P < 2 × 10–5.]
Figure 4 | A3G catalytic domain DNA interaction model. a, Surface
representation of A3G-2K3A, highlighting positions of positive (blue),
negative (red) or neutral (white) charge. Arginines that brim the concave
active site are labelled. The hypothesized position and polarity of ssDNA is
indicated (green dashed line). b, NMR ssDNA-titration data summary (for
details, see Supplementary Fig. 6). Residues with chemical shift
perturbations more than 1 s.d. above average are coloured green (E259 is
perturbed but hidden by H257). H257, C288 and C291 are shaded purple.
c, Model depicting the interaction between A3G-2K3A and ssDNA
(5′-C1-C2-T3-3′). H257 (purple) is shown partially stacked with the ring of the
flipped-out target cytosine (C2). W285 (grey) helps to form a hydrophobic
catalytic cavity. Arginines surrounding the positively charged brim of the
active site are indicated (see text for discussion). Single-stranded DNA is
coloured white (carbon), blue (nitrogen), red (oxygen) and yellow
(phosphate). d, DNA deaminase activity of A3G-2K3A derivatives. Each ×
represents the mutation frequency of an independent culture, and key
median values are indicated (others were at background levels). The y axis
splits to accommodate the high activity of A3G-2K3A, and therefore one
CTD data point (52.7) is not shown. The significance of the A versus E
substitution at R213 or R320 is indicated (Student’s t-test).
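The comparison named in the legend can be sketched as a two-sample Student's t statistic. The version below is the classical pooled-variance form for two independent groups; the legend does not state whether the test was paired or unpaired, or one- or two-tailed, so that choice is an assumption, as are the function name and the example numbers (which are invented, not the measured mutation frequencies). Converting t to a P value requires the t distribution, for which a statistics library would normally be used.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


if __name__ == "__main__":
    # Hypothetical frequencies for an alanine and a glutamate derivative.
    alanine = [4.0, 5.0, 6.0, 5.0]
    glutamate = [1.0, 2.0, 1.0, 2.0]
    t, df = student_t(alanine, glutamate)
    print(f"t = {t:.2f} with {df} degrees of freedom")
```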
used to build accurate models of the N-terminal, Vif-interacting
domain of A3G and therefore also models of the full-length protein.
The structure presented here therefore provides a crucial step
towards a molecular definition of the A3G–Vif interaction, which
will benefit the development of AIDS therapeutics that function by
modulating this battle between host and pathogen.
All expression constructs were based on pGEX6P2-A3G(198–384) (ref. 18),
modified by site-directed mutagenesis and confirmed by DNA sequencing. E.
coli Rifr mutation assays were used to report the intrinsic activity of A3G(198–
384) derivatives12,18. Glutathione S-transferase (GST)-based constructs were
expressed in E. coli strain BL21 DE3 RIL (Stratagene). Unlabelled proteins were
produced by expression for 17 h at 17 °C in Luria broth containing 1 mM IPTG.
Isotope-labelled proteins were produced by expression for 17 h at 17 °C in M9 supplemented with 15NH4Cl, 13C-labelled D-glucose
and 2H water as described28. Proteins were purified by sonicating cell pellets in
lysis buffer (100 mM NaCl, 50 mM Na2HPO4/NaH2PO4 (pH 7.0), protease
inhibitor (Roche)), separating the soluble (supernatant) and insoluble (pellet)
fractions by centrifugation (12,110g, 20 min, 4 °C), binding to glutathione
sepharose (GE Healthcare), washing with lysis buffer and eluting with
PreScission protease (GE Healthcare) in 1 mM dithiothreitol (DTT) and
50 mM Na2HPO4/NaH2PO4 (pH 7.4), and, finally, concentrating with
Centricon filters (Millipore). Solubility was monitored by SDS–PAGE, Coomassie
blue staining and/or immunoblotting (anti-GST (GE Healthcare) or
anti-A3G20). Circular dichroism spectroscopy, velocity sedimentation and NMR
experiments were conducted as described previously (for example, refs 29 and
30) and details can be found online in Supplementary Information and in the
Methods. The ssDNA-binding model was calculated by searching all possible
residues that showed significant chemical shift perturbations for the lowest-
energy complex. The model was only constrained by positioning the target
cytosine within the conserved active site, which was estimated from substrate-
containing crystal structures26,27.
Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.
Received 11 September 2007; accepted 21 December 2007.
Published online 20 February 2008.
acts processively 3′ → 5′ on single-stranded DNA. Nat. Struct. Mol. Biol. 13,
2. Esnault, C. et al. APOBEC3G cytidine deaminase inhibits retrotransposition of
endogenous retroviruses. Nature 433, 430–433 (2005).
3. Harris, R. S. et al. DNA deamination mediates innate immunity to retroviral
infection. Cell 113, 803–809 (2003).
4. Mangeat, B. et al. Broad antiretroviral defence by human APOBEC3G through
lethal editing of nascent reverse transcripts. Nature 424, 99–103 (2003).
5. Sheehy, A. M., Gaddis, N. C., Choi, J. D. & Malim, M. H. Isolation of a human gene
that inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature 418,
6. Zhang, H. et al. The cytidine deaminase CEM15 induces hypermutation in newly
synthesized HIV-1 DNA. Nature 424, 94–98 (2003).
7. Yu, X. et al. Induction of APOBEC3G ubiquitination and degradation by an HIV-1
Vif–Cul5–SCF complex. Science 302, 1056–1060 (2003).
8. Conticello, S. G., Langlois, M. A., Yang, Z. & Neuberger, M. S. DNA deamination in
immunity: AID in the context of its APOBEC relatives. Adv. Immunol. 94, 37–73
9. Di Noia, J. M. & Neuberger, M. S. Molecular mechanisms of antibody somatic
hypermutation. Annu. Rev. Biochem. 76, 1–22 (2007).
in mammals: new members of the APOBEC family seeking roles in the family
business. Trends Genet. 19, 207–216 (2003).
11. Betts, L., Xiang, S., Short, S. A., Wolfenden, R. & Carter, C. W. Jr. Cytidine
deaminase. The 2.3 Å crystal structure of an enzyme: transition-state analog
complex. J. Mol. Biol. 235, 635–656 (1994).
12. Harris, R. S., Petersen-Mahrt, S. K. & Neuberger, M. S. RNA editing enzyme
APOBEC1 and some of its homologs can act as DNA mutators. Mol. Cell 10,
13. Johansson, E., Mejlhede, N., Neuhard, J. & Larsen, S. Crystal structure of the
tetrameric cytidine deaminase from Bacillus subtilis at 2.0 Å resolution.
Biochemistry 41, 2563–2570 (2002).
14. Ko, T. P. et al. Crystal structure of yeast cytosine deaminase. Insights into enzyme
mechanism and evolution. J. Biol. Chem. 278, 19111–19117 (2003).
15. Prochnow, C., Bransteitter, R., Klein, M. G., Goodman, M. F. & Chen, X. S. The
APOBEC-2 crystal structure and functional implications for the deaminase AID.
Nature 445, 447–451 (2007).
16. Xie, K. et al. The structure of a yeast RNA-editing deaminase provides insight into
the fold and function of activation-induced deaminase and APOBEC-1. Proc. Natl
Acad. Sci. USA 101, 8114–8119 (2004).
17. Iwatani, Y., Takeuchi, H., Strebel, K. & Levin, J. G. Biochemical activities of highly
purified, catalytically active human APOBEC3G: correlation with antiviral effect.
J. Virol. 80, 5992–6002 (2006).
18. Chen, K. M. et al. Extensive mutagenesis experiments corroborate a structural
model for the DNA deaminase domain of APOBEC3G. FEBS Lett. 581, 4761–4766
19. Navarro, F. et al. Complementary function of the two catalytic domains of
APOBEC3G. Virology 333, 374–386 (2005).
20. Newman, E. N. et al. Antiviral function of APOBEC3G can be dissociated from
cytidine deaminase activity. Curr. Biol. 15, 166–170 (2005).
aureus tRNA adenosine deaminase TadA in complex with RNA. Nat. Struct. Mol.
Biol. 13, 153–159 (2006).
22. Huthoff, H. & Malim, M. H. Cytidine deamination and resistance to retroviral
infection: towards a structural understanding of the APOBEC proteins. Virology
334, 147–153 (2005).
bound to a potent inhibitor. J. Med. Chem. 48, 658–660 (2005).
24. Zhang, K. L. et al. Model structure of human APOBEC3G. PLoS ONE 2, e378
25. Mariani, R. et al. Species-specific exclusion of APOBEC3G from HIV-1 virions by
Vif. Cell 114, 21–31 (2003).
26. Teh, A. H. et al. The 1.48 Å resolution crystal structure of the homotetrameric
cytidine deaminase from mouse. Biochemistry 45, 7825–7833 (2006).
27. Xiang, S., Short, S. A., Wolfenden, R. & Carter, C. W. Jr. The structure of the
cytidine deaminase-product complex provides evidence for efficient proton
transfer and ground-state destabilization. Biochemistry 36, 4768–4774 (1997).
domain of the human protein DEK. Protein Sci. 13, 2252–2259 (2004).
29. Matsuo, H. et al. Structure of translation factor eIF4E bound to m7GDP and
interaction with 4E-binding protein. Nat. Struct. Biol. 4, 717–724 (1997).
30. Kim, S., Cullis, D. N., Feig, L. A. & Baleja, J. D. Solution structure of the Reps1 EH
domain and characterization of its binding to NPF target sequences. Biochemistry
40, 6776–6785 (2001).
Supplementary Information is linked to the online version of the paper at
Acknowledgements We thank R. LaRue, N. Martemyanova, M. Stenglein and
S. Wagner for assistance, laboratory members for discussions, V. Pathak for
the manuscript. Key instrumentation was provided by the University of Minnesota
NMR Facility (NSF) and Supercomputing Institute, the University of Wisconsin
NMRfam (NIH) and the University of Connecticut Analytical Ultracentrifugation
Facility. This work was supported by grants from the National Institutes of Health
and Medical Genomics (H.M. and R.S.H.)), the University of Minnesota (H.M. and
R.S.H.) and the Searle Scholarship Program (R.S.H.).
Author Contributions H.M. and R.S.H. conceived the experimental designs, wrote
the manuscript and assisted with experimentation. K.C., E.H., P.G., Y.L., A.F. and
K.S. primarily contributed to protein purification, NMR data analyses, activity
assays, site-directed mutagenesis, A3G-2K3A–DNA complex modelling and
immunoblotting/purification optimization experiments, respectively. All authors
contributed to data analyses, figure constructions and manuscript revisions.
information is available at www.nature.com/reprints. Correspondence and
requests for materials should be addressed to H.M. (firstname.lastname@example.org) or
DNA constructs. The GST expression vectors, pGEX6P1-A3G(1–384) (full
described previously18. All amino acid substitution derivatives were constructed
using the QuikChange protocol (Stratagene) and verified by DNA sequencing.
Rifr mutation experiments. The E. coli-based Rifr mutation assay has been used
Each construct was expressed constitutively in E. coli strain BW310 (uracil-
excision-defective), and at least eight independent cultures were used to
determine the median Rifr mutation frequency. Representative raw frequencies and
median frequencies were shown (Figs 1a and 4d) or, to facilitate comparisons,
the median values were normalized and data from a minimum of two
independent experiments were averaged (Supplementary Figs 1 and 2).
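The data reduction described for the mutation assay can be sketched in a few lines. The code below takes per-culture Rifr mutation frequencies for each construct, requires at least eight independent cultures, reports the median, and expresses each median relative to a chosen reference construct. The function names, the choice of reference and the example numbers are assumptions for illustration; the text does not specify what the medians were normalized to, and the values shown are invented.

```python
from statistics import median

MIN_CULTURES = 8  # "at least eight independent cultures" per construct


def median_frequency(cultures):
    """Median mutation frequency across independent cultures."""
    if len(cultures) < MIN_CULTURES:
        raise ValueError(f"need at least {MIN_CULTURES} independent cultures")
    return median(cultures)


def normalize(medians, reference):
    """Express each construct's median relative to the reference construct."""
    ref = medians[reference]
    if ref == 0:
        raise ValueError("reference median must be non-zero")
    return {name: value / ref for name, value in medians.items()}


if __name__ == "__main__":
    # Hypothetical Rifr mutation frequencies (x10^-7); NOT measured data.
    raw = {
        "VEC": [1, 2, 1, 3, 2, 1, 2, 2],
        "CTD": [10, 12, 9, 14, 11, 13, 10, 12],
    }
    medians = {name: median_frequency(values) for name, values in raw.items()}
    print(medians)
    print(normalize(medians, reference="CTD"))
```

The median is the natural summary here because mutation frequencies in independent cultures are typically skewed by occasional early mutational events, so a single "jackpot" culture does not dominate the result.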
Protein expression and purification. See Methods Summary. Image J software
was used to quantify protein levels (http://rsb.info.nih.gov/ij/).
diluted to 6 µM in solution containing 50 mM Na2HPO4/NaH2PO4 (pH 7.4)
and 50 µM ZnCl2. Circular dichroism spectra were collected on a Jasco 710
dichrograph using 10-mm-thick quartz cells at 10 °C. Data were acquired
between 190 nm and 250 nm at 50 nm min–1 with a bandwidth of 1 nm.
Sedimentation velocity analytical ultracentrifugation experiments. A3G-
2K3A was diluted to 0.15, 0.4, 0.8 or 1.2mgml21in a buffer containing
50mM Na2HPO4/NaH2PO4 (pH7.4), 0.005% Tween 20, 5mM DTT and
50mM ZnCl2. Samples were then sedimented using a four-hole rotor at 20uC
were kept on ice during protein preparation and dilution. Synthetic boundary
cells were loaded with 430ml of buffer and 420ml of the appropriate sample
solution. The cells were placed in the rotor and accelerated to 42,000g while
monitoring the transfer of the excess buffer in each cell. Subsequently, the rotor
The rotor was then equilibrated under vacuum at 20 °C and, after a period of
~1 h at 20 °C, the rotor was accelerated to 220,000g. Interference scans were
acquired at 1 min intervals for 6 h. The data for each loading concentration were
analysed using the program DCDT+ (version 2.0.7)31,32. The normalized
sedimentation coefficient distribution function, g(s*), plots of all four
concentrations of A3G-2K3A are shown in Fig. 1d. The complete data set for A3G-2K3A
was analysed with Sedphat v4.4b using the model of a hybrid local continuous
distribution and global discrete species33. The fit yielded a value of 22.15 kDa
(with 95% confidence limits of 21.85 and 22.45) for the molecular weight, and
a corrected sedimentation coefficient, S20,w, of 2.42 S, with an r.m.s.d. error of
0.0034 mg ml−1. A similar analysis using Sedanal v4.37 gave a value of 22.3 kDa
(with 95% confidence limits of 21.9 and 22.8) for the molecular weight, and a
corrected sedimentation coefficient, S20,w, of 2.39 S, with an r.m.s.d. error of
NMR spectroscopy and structure determination. Five amino acid substitu-
tions, L234K, C243A, F310K, C321A and C356A, were required to increase the
solubility and stability of A3G(198–384) for NMR experiments. The backbone
1H, 13C and 15N resonances of the uniformly 13C- and 15N-labelled and 90%
perdeuterated protein were assigned using triple resonance HNCA35–37,
HNCO37, HNCACB38, HNCOCACB39, HNCOCA35 and HNCACO40–42 experiments.
The side-chain assignments were completed using three-dimensional
(3D) CCONH43, HCCH-TOCSY44 and 15N-, 13C-edited nuclear Overhauser
enhancement spectroscopy (NOESY)-HSQC45 with 80 ms mixing time.
NOE-derived distance restraints were obtained from 15N- or 13C-edited
NOESY-HSQC and two-dimensional (2D) NOESY spectra. These spectra were acquired
with a mixing time of 200 ms (for 15N-edited NOESY of perdeuterated protein),
150 ms (for 15N-edited NOESY of non-deuterated protein) or 100 ms (for
13C-edited NOESY and 2D NOESY). To collect NOEs between amide proton
aromatic proton, these protons were selectively protonated in an otherwise fully
NMR spectra were processed with NMRPipe46 and analysed with CARA47
(http://www.nmr.ch). Torsion angle restraints (200) were taken from TALOS
prediction48. Hydrogen bonds (142) were set for residues consistent with the
chemical shift deviations and NOE-pattern-defined secondary structure. NOE
distance restraints (1,008) were picked manually from NOESY data, and 1,004
additional NOEs were assigned using ATNOS49/CANDID50 structure-dependent
cycles. The final calculation used 242 intra-residue, 604 sequential, 506
medium-range and 656 long-range NOEs. One hundred structures were calculated
with CNS51 torsion angle molecular dynamics. Ten of the calculated
structures were chosen based on energy and Ramachandran plots of dihedral
angles for the Fig. 2 ensemble. NMR calculation statistics are summarized in
Supplementary Tables 1 and 2.
how it is coordinated in existing deaminase superfamily member crystal struc-
that were used to link Zn2+ to His257, Cys288 and Cys291 in the A3G-2K3A
A3G-2K3A at protein:DNA molar ratios of 1:0, 1:1, 1:2, 1:4, 1:8 and 1:16 (see
Supplementary Discussion for specific sequences). A heteronuclear single
quantum coherence (HSQC) spectrum was recorded at each molar ratio, which
enabled specific amino acid chemical shift perturbations to be detected. The
chemical shift perturbation was calculated using the following equation,
ical shifts and 15N chemical shifts, respectively.
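The equation used by the authors is not preserved in this text. As a hedged illustration of how a combined 1H/15N chemical shift perturbation is commonly computed from HSQC peak positions, the sketch below assumes a weighted Euclidean form, sqrt(ΔδH^2 + (w·ΔδN)^2). The weighting factor `n_weight` and its default of 0.2 are assumptions of this example, as are the function names and the input layout; they should not be read as the values used in the study.

```python
import math


def combined_csp(delta_h, delta_n, n_weight=0.2):
    """Combined perturbation (ppm) from 1H and 15N shift changes.

    n_weight scales the 15N change to the 1H range; 0.2 is an assumed,
    commonly quoted value, not one taken from the source.
    """
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def perturbations(free, bound, n_weight=0.2):
    """Per-residue perturbation between two HSQC peak lists.

    `free` and `bound` map residue labels to (1H ppm, 15N ppm) tuples.
    Residues missing from either list are skipped.
    """
    result = {}
    for residue, (h_free, n_free) in free.items():
        if residue in bound:
            h_bound, n_bound = bound[residue]
            result[residue] = combined_csp(
                h_bound - h_free, n_bound - n_free, n_weight
            )
    return result
```

Applied to the spectrum at each protein:DNA molar ratio relative to the 1:0 spectrum, a calculation of this kind gives one perturbation value per assigned residue, which can then be ranked to identify the residues most affected by DNA.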
Single-strand-DNA-binding model. The target cytosine was positioned in the
catalytic active site (under His257 and adjacent to Glu259) on the basis of
existing crystal structures of active cytidine deaminases26,27. This positioning
fixed the target cytosine base and enabled calculations of all possible rotamer
configurations of the 5′-C (C1), the 3′-T (T3) and all 25 of the amino acid side
chains that showed significant NMR chemical shift perturbations (R215, T218,
Y219, C221, H228, L242, A246, E254, R256, A258, E259, V265, C281, N302,
both the DNA and the affected A3G-2K3A residues. Although these calculations
resulted in a very large number of possible configurations, the number was
efficiently reduced using the dead-end elimination method, which eliminates
configurations of side-chains or nucleotides that are unlikely to be part of the
global minimum structure52,53. The reduced number of configurations was
enumerated systematically to arrive at a minimum energy model.
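The pruning-then-enumeration strategy described above can be illustrated with a small sketch. It applies the original dead-end elimination criterion (a rotamer r at position i is discarded if some alternative t at i has a worst-case energy lower than the best-case energy of r) and then enumerates the surviving combinations exhaustively. The energy tables, names and data layout are toy values invented for the example; the energy function and implementation used in the study are not given in this text.

```python
from itertools import product

# Toy energies (arbitrary units). SELF_E[i][r]: self energy of rotamer r at
# position i. PAIR_E[(i, j)][r][s]: interaction of rotamer r at i with s at j.
SELF_E = [[0.0, 5.0], [1.0, 0.0, 4.0], [0.0, 2.0]]
PAIR_E = {
    (0, 1): [[0.0, -1.0, 0.5], [0.2, 0.1, 0.0]],
    (0, 2): [[-0.5, 0.3], [0.0, 0.1]],
    (1, 2): [[0.0, 0.2], [-0.3, 0.1], [0.5, 0.0]],
}


def pair_energy(pair_e, i, r, j, s):
    """Look up a pair energy regardless of the order of the two positions."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def total_energy(self_e, pair_e, combo):
    """Energy of one full assignment (combo[i] is the rotamer at position i)."""
    n = len(self_e)
    energy = sum(self_e[i][combo[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair_energy(pair_e, i, combo[i], j, combo[j])
    return energy


def dead_end_elimination(self_e, pair_e):
    """Repeatedly remove rotamers that cannot be part of the global minimum."""
    n = len(self_e)
    alive = [set(range(len(rotamers))) for rotamers in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]

            def bound(r, pick):
                # pick=min gives the best case for r, pick=max the worst case.
                return self_e[i][r] + sum(
                    pick(pair_energy(pair_e, i, r, j, s) for s in alive[j])
                    for j in others
                )

            for r in sorted(alive[i]):
                rivals = [t for t in alive[i] if t != r]
                if any(bound(r, min) > bound(t, max) for t in rivals):
                    alive[i].discard(r)
                    changed = True
    return alive


def global_minimum(self_e, pair_e):
    """Enumerate the combinations that survive pruning and return the best."""
    alive = dead_end_elimination(self_e, pair_e)
    candidates = product(*[sorted(a) for a in alive])
    best = min(candidates, key=lambda c: total_energy(self_e, pair_e, c))
    return best, total_energy(self_e, pair_e, best)
```

Because the criterion only removes rotamers that are provably worse than an alternative in every context, the minimum found after pruning matches the minimum of the full search space, while the number of combinations to enumerate can shrink substantially.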
31. Philo, J. S. Improved methods for fitting sedimentation coefficient distributions
derived by time-derivative techniques. Anal. Biochem. 354, 238–246 (2006).
32. Philo, J. S. A method for directly fitting the time derivative of sedimentation
velocity data and an alternative algorithm for calculating sedimentation
coefficient distribution functions. Anal. Biochem. 279, 151–163 (2000).
33. Schuck, P. On the analysis of protein self-association by sedimentation velocity
analytical ultracentrifugation. Anal. Biochem. 320, 104–124 (2003).
34. Stafford, W. F. & Sherwood, P. J. Analysis of heterologous interacting systems by
sedimentation velocity: curve fitting algorithms for estimation of sedimentation
coefficients, equilibrium and kinetic constants. Biophys. Chem. 108, 231–243
35. Matsuo, H., Kupce, E., Li, H. & Wagner, G. Increased sensitivity in HNCA and
HN(CO)CA experiments by selective C beta decoupling. J. Magn. Reson. B. 113,
36. Ikura, M., Kay, L. E. & Bax, A. A novel approach for sequential assignment of 1H,
13C, and 15N spectra of proteins: heteronuclear triple-resonance three-
dimensional NMR spectroscopy. Application to calmodulin. Biochemistry 29,
spectroscopy of isotopically enriched proteins. J. Magn. Reson. 89, 496–514
38. Wittekind, M. & Mueller, J. HNCACB, a high-sensitivity 3D NMR experiment to
resonances in proteins. J. Magn. Reson. B. 101, 201–205 (1993).
triple resonance NMR experiments for the backbone assignment of 15N, 13C, 2H
labeled proteins with high sensitivity. J. Am. Chem. Soc. 116, 11655–11666 (1994).
40. Matsuo, H., Kupce, E., Li, H. & Wagner, G. Use of selective C alpha pulses for
improvement of HN(CA)CO-D and HN(COCA)NH-D experiments. J. Magn.
Reson. B. 111, 194–198 (1996).
41. Matsuo, H., Li, H. & Wagner, G. A sensitive HN(CA)CO experiment for
deuterated proteins. J. Magn. Reson. B. 110, 112–115 (1996).
resonance pulse scheme to correlate intraresidue 1HN, 15N, and 13C′ chemical
shifts in 15N-13C-labeled proteins. J. Magn. Reson. 97, 213–217 (1992).
43. Grzesiek, S., Anglister, J. & Bax, A. Correlation of backbone amide and aliphatic
side-chain resonances in 13C/15N-enriched proteins by isotropic mixing of 13C
magnetization. J. Magn. Reson. B. 101, 114–119 (1993).
44. Clore, G. M., Bax, A., Driscoll, P. C., Wingfield, P. T. & Gronenborn, A. M.
Assignment of the side-chain 1H and 13C resonances of interleukin-1 beta using
double- and triple-resonance heteronuclear three-dimensional NMR
spectroscopy. Biochemistry 29, 8172–8184 (1990).
assignments of the N-terminal SH3 domain of drk in folded and unfolded states
using enhanced-sensitivity pulsed field gradient NMR techniques. J. Biomol. NMR
4, 845–858 (1994).
46. Delaglio, F. et al. NMRPipe: a multidimensional spectral processing system based
on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995).
47. Keller, R. Optimizing the process of nuclear magnetic resonance spectrum
analysis and computer aided resonance assignment. PhD thesis, Swiss Fed. Inst.
Tech. Zurich (2004).
48. Cornilescu, G., Delaglio, F. & Bax, A. Protein backbone angle restraints from
searching a database for chemical shift and sequence homology. J. Biomol. NMR
13, 289–302 (1999).
49. Herrmann, T., Guntert, P. & Wuthrich, K. Protein NMR structure determination
ATNOS. J. Biomol. NMR 24, 171–189 (2002).
50. Herrmann, T., Guntert, P. & Wuthrich, K. Protein NMR structure determination
with automated NOE assignment using the new software CANDID and the
torsion angle dynamics algorithm DYANA. J. Mol. Biol. 319, 209–227 (2002).
51. Brunger, A. T. et al. Crystallography & NMR system: A new software suite for
macromolecular structure determination. Acta Crystallogr. D Biol. Crystallogr. 54,
52. Desmet, J., De Maeyer, M., Hazes, B. & Lasters, I. The dead-end elimination
theorem and its use in protein side-chain positioning. Nature 356, 539–542
53. Goldstein, R. F. Efficient rotamer elimination applied to protein side-chains and
related spin glasses. Biophys. J. 66, 1335–1340 (1994).