© 2009 Nature America, Inc. All rights reserved.
nature structural & molecular biology advance online publication
Structural insight into
Francesco V Rao1–3, Jamie R Rich3, Bojana Rakić3, Sai Buddai4,
Marc F Schwartz4, Karl Johnson4, Caryn Bowe4,
Warren W Wakarchuk5, Shawn DeFrees4, Stephen G Withers1,3
& Natalie C J Strynadka1,2
Mammalian cell surfaces are modified by complex arrays of
glycoproteins, glycolipids and polysaccharides, many of which
terminate in sialic acid and have central roles in essential processes
including cell recognition, adhesion and immunogenicity.
Sialylation of glycoconjugates is performed by a set of sequence-
related enzymes known as sialyltransferases (STs). Here we
present the crystal structure of a mammalian ST, porcine
ST3Gal-I, providing a structural basis for understanding the
mechanism and specificity of these enzymes and for the
design of selective inhibitors.
Negatively charged sialic acids (including the common variant
N-acetylneuraminic acid (Neu5Ac)) feature prominently at the non-
reducing termini of mammalian cell-surface glycans. STs catalyze the
transfer of the sialic acid moiety from a cytidine-5′-monophospho-
N-acyl-neuraminic acid donor (CMP-Neu5Ac) to the various accep-
tor glycoconjugates terminating in either galactose (Gal), N-acetyl-
galactosamine (GalNAc) or another sialic acid. STs are classified on
the basis of the position of attachment of the donor sialic acid to the
acceptor, being either α2,3 (ST3), α2,6 (ST6) or α2,8 (ST8), and on
their detailed acceptor specificity (for example, ST3Gal-I, ST3Gal-II
and so on)1,2. The human genome contains at least 20 genes encoding
putative STs, all belonging to the mammalian family of glycosyltrans-
ferases (GTs) defined by the CAZY classification as GT29 (ref. 3);
nonmammalian STs cluster in CAZY families GT38, 42, 52 and 80.
The differential expression and biological importance of GT29 STs
during mammalian development4 has been illustrated by knockout
studies in mice5,6. For example, inactivation of an α2,6-ST (ST6Gal-I)
gene led to a severe decrease in immune function related to defective
B lymphocyte activation (redundant acceptor specificity among the vari-
ous STs hinders complete elimination of sialylation of a given acceptor,
and thus a decrease rather than a complete loss of function is typically
observed)5. In separate studies, inactivation of α2,3 sialyltransferase-I
(ST3Gal-I) led to a deficiency of mature cytotoxic T lymphocytes6,
whereas inactivation of either of the α2,8-poly-sialyltransferases
ST8Sia-II or ST8Sia-IV led to defects in the formation of neurologi-
cal synapses7. Additionally, increased ST expression, evident in some
cancers, results in an aberrant, tumor-associated glycosylation profile8.
For instance, ST3Gal-I is upregulated in primary breast carcinomas,
where MUC1 mucin glycoproteins bear sialylated core 1 oligosac-
charides, in contrast to the asialo core 2 structures that are evident in
healthy tissues8. Structures for bacterial ST representatives are available,
but their sequences are too disparate (typically 10–15% identity) to
provide useful models for their mammalian GT29 counterparts.
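The naming scheme described above (a digit for the linkage position, then the acceptor sugar, then a family member numeral) can be illustrated with a small parser. This is only a sketch: the function name `parse_st_name` and the regular expression are hypothetical and cover only names written in the style used in this text, such as ST3Gal-I or ST8Sia-IV.

```python
import re

# Linkage implied by the digit in the enzyme name, as stated in the text.
LINKAGE = {"3": "α2,3", "6": "α2,6", "8": "α2,8"}

# Hypothetical pattern for names such as "ST3Gal-I" or "ST8Sia-IV".
# "GalNAc" is listed before "Gal" so the longer acceptor name is tried first.
NAME_RE = re.compile(r"^ST([368])(GalNAc|Gal|Sia)-([IVX]+)$")


def parse_st_name(name):
    """Split a sialyltransferase name into linkage, acceptor and member."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized sialyltransferase name: {name!r}")
    digit, acceptor, member = match.groups()
    return {"linkage": LINKAGE[digit], "acceptor": acceptor, "member": member}


if __name__ == "__main__":
    for st in ("ST3Gal-I", "ST6Gal-I", "ST8Sia-IV"):
        print(st, parse_st_name(st))
```

Names written without the hyphen (for example ST8SiaII) are deliberately rejected by this pattern rather than guessed at.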
Mammalian STs share a predicted N-terminal membrane-anchoring
region and four conserved sequence motifs denoted as L (large),
S (small), VS (very small)9–12 and motif 3 (ref. 13). Although these
‘sialyl motifs’ have been instrumental in the identification and clon-
ing of new STs over the last three decades, their structural context has
remained unknown, with the only insights being those obtained by
kinetic analysis of mutants9,11. In this limited context, one of the best
characterized mammalian STs to date, and the focus of this study, is
ST3Gal-I, which transfers donor sialic acids to O3 of the galactosyl
residue of either Galβ1,3GalNAc-Ser/Thr O-glycan or ganglio-series
glycolipid (Galβ1-3GalNAcβ1-4Galβ1-4Glc-ceramide) acceptors.
A major hurdle for GT29 characterization has been obtaining suit-
able levels of stable and soluble protein. We have overcome this with a
porcine variant of ST3Gal-I (pST3Gal-I, which has 85% identity with
human ST3Gal-I). The sequence span of the construct, overexpression
in reductase-deficient (Origami2) Escherichia coli and the choice of
fusion partner all seem to be crucial for soluble protein production
(Supplementary Methods), as does the microbatch method for
crystallization. Furthermore, steady-state kinetic analyses of our
pST3Gal-I samples yielded apparent KM values on the micromolar scale
(Supp. Table 1), in general agreement with previous values published
for GT29 enzymes obtained via eukaryotic expression systems and
different animal species13–16.
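For readers less familiar with the meaning of an apparent KM, the following sketch evaluates the standard Michaelis–Menten rate law, v = Vmax[S]/(KM + [S]). It assumes simple single-substrate Michaelis–Menten behavior, which is an illustrative simplification for a two-substrate enzyme; the function name and all numerical values are hypothetical and are not taken from the supplementary kinetic data.

```python
def michaelis_menten_rate(substrate_um, km_um, vmax):
    """Initial rate v = Vmax * [S] / (KM + [S]).

    substrate_um and km_um share the same concentration unit (here µM);
    the result carries the units of vmax.
    """
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("concentrations must be non-negative and KM positive")
    return vmax * substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    km = 50.0    # hypothetical apparent KM in µM (illustrative only)
    vmax = 1.0   # arbitrary rate units
    for s in (5.0, 50.0, 500.0):
        print(f"[S] = {s:6.1f} µM  ->  v = {michaelis_menten_rate(s, km, vmax):.3f}")
```

At a substrate concentration equal to KM the rate is half of Vmax, which is why a micromolar KM indicates that the enzyme approaches saturation at micromolar substrate levels.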
The lack of substantial sequence homology between ST3Gal-I
and other GTs of known structure required that the ST3Gal-I
crystallographic data be phased using a selenomethionine deriva-
tive. Cocrystallization with a modified disaccharide acceptor
(Galβ1,3GalNAcα−PhNO2) and soaking with the product CMP
yielded complexes at 1.25-Å and 1.55-Å resolution, respectively
(Supplementary Table 2). The structure of pST3Gal-I consists of a mixed
αβ fold, composed of 7 twisted β-strands (topology 7612345) flanked
by 12 α-helices (Fig. 1). The segment (residues 305–316) that
presumably spans the β-core is disordered, lacking interpretable
electron density, in the native and ligand-bound structures. Examples of such a missing
‘lid’ have been noted in other GT structures; they typically become
structured when bound to the donor substrate17–19. The structure
of pST3Gal-I reveals three disulfide bonds: Cys62-Cys67 (C1) and
Cys65-Cys142 (C2) are located in a loop region near the N terminus
(Fig. 1), and the third, Cys145-Cys284 (C3), connects β-strand 1 with
α-helix 11 (Fig. 1). Mammalian STs are normally N-glycosylated
1Department of Biochemistry and Molecular Biology, 2Centre for Blood Research and 3Department of Chemistry, University of British Columbia, Vancouver, Canada.
4Neose Technologies, Inc., Horsham, Pennsylvania, USA. 5Institute for Biological Sciences, National Research Council, Ottawa, Ontario, Canada. Correspondence
should be addressed to N.C.J.S. (firstname.lastname@example.org).
Received 20 April; accepted 1 September; published online 11 October 2009; doi:10.1038/nsmb.1685
in vivo, with pST3Gal-I predicted to be
modified at four asparagine residues. Indeed,
three of these sites map to the protein sur-
face (Fig. 1a); the remaining site is on the
N-terminal stem. Our structural and kinetic
characterization of bacterially expressed
pST3Gal-I clearly shows that glycosylation
is not essential for folding or activity in vitro and is therefore likely
to have alternative roles in stabilization and trafficking in vivo13.
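Candidate N-glycosylation sites are commonly identified by scanning for the N-X-S/T sequon, where X is any residue except proline. The sketch below shows such a scan. It is not necessarily the prediction method used for pST3Gal-I, and the sequence in the example is a made-up toy string, not the enzyme sequence.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so
# overlapping sequons (e.g. "NNTS") are all reported.
SEQUON_RE = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON_RE.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "MNGSAANPTLLNNTS"  # hypothetical sequence for illustration
    print(find_sequons(toy_sequence))
```

A sequon match indicates only that a site could be glycosylated; whether it is modified in vivo depends on accessibility and cellular context.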
GTs that use a nucleotide-activated donor sugar have thus far been
found to adopt either a so-called GT-A or GT-B topology, consisting
of a single or double Rossmann-like fold, respectively20. pST3Gal-I
has a single Rossmann domain; however, a search with DALI21 identi-
fied only weak homology with a single GT-A member, the bacterial
ST CstII (Z-score of 9.4, with an r.m.s. deviation of 3.6 Å for 275 Cα
atoms) (Supp. Fig. 1), which itself demonstrates sufficient differences
in topology from other GT-A proteins to warrant distinction as a
separate subclass (which we define here as GT-A (variant 1); Supp.
Fig. 1)18. Despite having less than 10% sequence identity with CstII
and several additional insertions, pST3Gal-I shares a similar β-sheet
core (Supp. Fig. 1). Beyond this, however, there is little similarity, with
most helices occupying completely different positions (Supp. Fig. 1),
which explains why previous bacterial ST structures could not be used
to understand molecular features of the mammalian counterparts. For
these reasons we define the mammalian GT29 STs as a second distinct
subclass, which we term GT-A (variant 2).
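The r.m.s. deviation quoted for the DALI comparison is the square root of the mean squared distance between equivalent Cα atoms after superposition. A minimal sketch of that final step is given below. It assumes the two coordinate sets are already superimposed and paired one to one; it does not perform the structural alignment that DALI carries out, and the coordinates in the example are invented.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates in Å.

    Assumes both sets are already superimposed and listed in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical Cα positions, each pair displaced by 3 Å.
    set_a = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    set_b = [(3.0, 0.0, 0.0), (1.0, 1.0, 4.0)]
    print(rmsd(set_a, set_b))
```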
Our structure of a ternary complex of pST3Gal-I involving the
acceptor disaccharide Galβ1,3GalNAcα−PhNO2 and product CMP
reveals the catalytic site (Fig. 1 and Supp. Fig. 2). The nucleotide
resides in a cleft adjacent to the Rossmann fold–containing β-core
(Figs. 1 and 2), with the cytidine moiety oriented by multiple hydro-
gen bonds to the protein backbone (Fig. 2). The ribose ring of CMP
adopts a C2′ exo conformation, with the phosphate stabilized by
interactions with His302, Asn150 and Asn173, the last two being
well conserved amongst all eukaryotic α2,3 sialyltransferases (ST3)
and the α2,6 sialyltransferase, ST6Gal-I (Supp. Fig. 3). His302 is also
conserved in all ST3 members, whereas in ST6Gal-I the equivalent
interaction with CMP is likely to be mediated by a tyrosine residue,
similarly to what is seen in the bacterial CstII–CMP3F-Neu5Ac
complex (Supp. Figs. 2 and 3).
The active sites of other GT-A members that use a UDP-
activated donor sugar often bind a divalent cation (Mn2+ or Mg2+)
that coordinates the diphosphate moiety of the UDP through
a ‘DxD’ motif. However, our data show that pST3Gal-I, as with the
bacterial STs, lacks this metal-binding motif, and its activity shows
no dependence on metal ions. These observations are consistent
with the CMP-activated donor substrate containing only a single
phosphate group and a negatively charged sialic carboxylate, which
an adjacent DxD motif could repel electrostatically.
The structure of pST3Gal-I answers many questions about the
four conserved mammalian sialyl motifs. The largest, sialyl motif
L, encompasses four out of the seven core β-strands of pST3Gal-I
(Fig. 1b) and forms part of the donor-binding site, as supported
by previous mutagenesis studies11. Sialyl motif S comprises helix
α11 and strand β6, the latter occupying a position parallel to β1
from motif L (Fig. 1b and Supp. Fig. 3). α11 is positioned in line
with the bound electronegative donor sugar, potentially contribut-
ing to binding through the positive end of its helix dipole (Fig. 1).
Supporting this is the observation that disulfide C3, which fixes the
end of α11 relative to β1 from sialyl motif L (Fig. 1), is conserved in
all eukaryotic STs. The recently identified sialyl motif 3 is also located
near the phosphate-binding site, just before the lid domain (Fig. 1b).
A histidine-to-alanine mutation in this region (equivalent to His302
in pST3Gal-I) has been shown to abolish human ST3Gal-I activity,
a result that is now explained by our structure, which shows that
His302 forms a direct interaction with the donor phosphate13. The
last sialyl motif, VS, comprises part of helix α12 and the loop region
in which the proposed catalytic base resides (see below; Fig. 1).
The mammalian STs are expected to act via the direct displacement
mechanism used by other inverting GTs16. The reaction proceeds
through an oxocarbenium ion–like transition state that is analogous
to that of the inverting glycosidases, with a general base assisting in
the deprotonation of the acceptor hydroxyl20 (Supp. Fig. 2e). Our
structure of pST3Gal-I in complex with Galβ1,3GalNAcα−PhNO2
provides support for this mechanism and identifies His319 of the
sialyl motif VS as the catalytic base (2.8 Å between the C3 hydroxyl of
the galactose moiety and the imidazole nitrogen (Fig. 2a)). This his-
tidine is conserved in all GT29 members, and an earlier study showed
that its mutation in the human polysialyltransferases ST8Sia-II and
ST8Sia-IV abrogates catalytic activity22. Notably, despite the observed
major structural differences and the complete absence of the eukaryo-
tic sialyl motif VS, a spatially (but not sequentially) equivalent his-
tidine (His188) is used as the catalytic base in the bacterial ST CstII
(Supp. Fig. 2). Superimposition of the donor analog CMP3F-Neu5Ac
from the CstII–CMP3F-Neu5Ac cocrystal structure onto the
pST3Gal-I–CMP–disaccharide cocomplex places the acceptor nucleophile
(Gal-O3) within 2.8 Å of the anomeric reaction center (Fig. 2b). The
structure of pST3Gal-I in complex with CMP, and by extension the
modeled CMP3F-Neu5Ac, reveals that the leaving group phosphate
is in a near axial position, suitable for departure upon attack by the
nucleophilic hydroxyl. An axially oriented leaving group is a generally
Figure 1 panel labels: sialyl motif VS, sialyl motif S, sialyl motif L, sialyl motif 3.
Figure 1 The GT29 fold. Cartoon of pST3Gal-I
in complex with product CMP and a
disaccharide sugar acceptor (yellow stick
model). A model of the missing lid (residues
305–316) is shown as a dashed line (magenta)
based on an equivalent loop in CstII. The
catalytic base (His319) is highlighted in
cyan. (a) The GT29 catalytic domain is linked
to the transmembrane helix by a protease-
sensitive stem region. Residues predicted to be
glycosylated are shown as black sticks. (b) The
four conserved sialyl motifs of GT29 STs.
observed feature for both GTs and glycosidases at each step in catalysis.
The conservation of histidine as the catalytic base in the mammalian
STs rather than the more typical aspartate or glutamate of other GT
families may reflect the need to accommodate the anionic sialic acid
moiety of the donor substrate.
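Distances such as the 2.8 Å separation between Gal O3 and the His319 imidazole nitrogen are obtained directly from atomic coordinates. The sketch below computes an interatomic distance and applies a simple donor–acceptor cutoff. The 3.5 Å cutoff is an assumed rule of thumb rather than a value from this study, and the coordinates are invented so that they lie 2.8 Å apart; they are not taken from the deposited structures.

```python
import math

# Assumed heuristic cutoff (Å) for a potential hydrogen bond between
# heavy atoms; not a value taken from the paper.
HBOND_CUTOFF = 3.5


def distance(atom_a, atom_b):
    """Euclidean distance in Å between two (x, y, z) positions."""
    return math.dist(atom_a, atom_b)


def within_hbond_distance(atom_a, atom_b, cutoff=HBOND_CUTOFF):
    """True if two heavy atoms are close enough to form a hydrogen bond."""
    return distance(atom_a, atom_b) <= cutoff


if __name__ == "__main__":
    # Hypothetical positions for an acceptor hydroxyl oxygen and an
    # imidazole nitrogen.
    gal_o3 = (0.0, 0.0, 0.0)
    his_n = (2.8, 0.0, 0.0)
    print(distance(gal_o3, his_n), within_hbond_distance(gal_o3, his_n))
```

A distance criterion alone does not establish a hydrogen bond; geometry and protonation state also matter, which is why the figure legend refers to potential hydrogen bonds.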
Our structures provide insights into the underlying molecular basis
of substrate specificity. Comparison of the apo form of pST3Gal-I and
the two complexes reveals few differences in conformation, suggesting
that the acceptor site is largely preformed, consistent with the random
order mechanism determined for this enzyme16. In our ternary com-
plex, the acceptor sugar Galβ1,3GalNAcα−PhNO2 binds in such a way
that the Gal (the key determinant of specificity) lies in a shallow pocket
on the enzyme surface with the remainder projecting toward solvent
(Fig. 1 and Supp. Fig. 4). The acceptor is oriented for catalysis by
several hydrogen bonds, including the aforementioned interaction
between Gal OH-3 and the catalytic His319 (Fig. 2a). Gal OH-4
interacts with the Tyr269 hydroxyl, whereas Gal OH-6 interacts with
Gln108 and Tyr233 and, via a water molecule, with the side chains
of Glu196, Lys213 and Asp216 (Fig. 2a). The GalNAc moiety engages
only in water-mediated hydrogen bonds, with the exception of the
axial hydroxyl at C-4, which interacts directly with the hydroxyl group
of Tyr269 (Fig. 2a).
The observed interactions between Tyr269 and the hydroxyl
groups at C-4 of both the Gal and GalNAc units suggest that
this phenolic residue is probably a key determinant of acceptor
specificity, consistent with its presence in both ST3Gal-I and
ST3Gal-II, enzymes that preferentially use Galβ1,3GalNAc. Other
ST3Gal enzymes (ST3Gal-III to ST3Gal-VI) (Supp. Fig. 3) seem not to
have an equivalent tyrosine, consistent with these enzymes using
different acceptor disaccharides and sugar linkages—for exam-
ple, ST3Gal-III transfers to Galβ1,4GlcNAc. The significance
of the dual hydrogen bonding role for Tyr269 is reinforced by
the strongly reduced rate of sialylation of either galactose alone
or Galβ1,3GlcNAc by ST3Gal-I16. The specificity for the galac-
tose moiety in ST3Gal-I species probably further derives from an
additional hydrogen bond to O6 of the sugar, both directly from
Tyr233 and via a water molecule to Gln108, Lys213 and Asp216, the
last two being conserved in all ST3Gal species (Fig. 2a and Supp.
Fig. 3). Finally, even though murine and porcine ST3Gal-I prefer
Galβ1,3-GalNAc over related acceptors lacking the acetamide15,16,
no direct hydrogen bonding interactions between the protein and
this acetamide are observed (Fig. 2), suggesting that interactions
of the disordered lid that are not modeled in our structures
may have a role.
Specificity for the donor substrate probably resides in a number of
interactions that hold the donor sugar in position for transfer. On the
basis of our pST3Gal-I–CMP3F-Neu5Ac model, the C1 carboxylate
of the sialic acid interacts with residues at the end of helix α11 and is
stabilized electrostatically by its N-terminal helix dipole, whereas the
O7 of Neu5Ac hydrogen-bonds with the side chain hydroxyl of Tyr194
(Fig. 2b). Additional interactions that confer specificity for Neu5Ac
may also be contributed by the flexible lid domain. A final possible
specificity element is the protein or lipid moiety that bears the
acceptor sugar. Notably, pST3Gal-I has an extended polar cleft
of suitable dimensions to bind a peptide or lipid that extends from
the terminal sugar of our disaccharide acceptor toward the solvent
(Supp. Fig. 4).
The data presented here provide the first detailed structural and
mechanistic insights into a mammalian ST, indicating a new GT
topology and active site architecture. The structure provides a foun-
dation for understanding specificity and for the design of inhibitors
of these and other vertebrate STs. Such inhibitors will be useful in
probing the crucial roles of individual STs in cellular processes or
in modulating the levels of their product sialyl glycoconjugates in
diseases such as cancer.
Accession codes. Protein Data Bank: Coordinates and structure
factors for the native, disaccharide and CMP-disaccharide com-
plexes were deposited with accession numbers 2WML, 2WNB and
Note: Supplementary information is available on the Nature Structural & Molecular
Published online at http://www.nature.com/nsmb/.
Reprints and permissions information is available online at http://npg.nature.com/
1. Harduin-Lepers, A. et al. Biochimie 83, 727–737 (2001).
2. Tsuji, S., Datta, A.K. & Paulson, J.C. Glycobiology 6, 647 (1996).
3. Cantarel, B.L. et al. Nucleic Acids Res. 37, D233–D238 (2009).
4. Paulson, J.C. & Colley, K.J. J. Biol. Chem. 264, 17615–17618 (1989).
5. Hennet, T., Chui, D., Paulson, J.C. & Marth, J.D. Proc. Natl. Acad. Sci. USA 95,
6. Martin, L.T., Marth, J.D., Varki, A. & Varki, N.M. J. Biol. Chem. 277, 32930–32938
7. Galuska, S.P. et al. J. Biol. Chem. 281, 31605–31615 (2006).
8. Burchell, J.M., Mungul, A. & Taylor-Papadimitriou, J. J. Mammary Gland Biol.
Neoplasia 6, 355–364 (2001).
9. Datta, A.K., Sinha, A. & Paulson, J.C. J. Biol. Chem. 273, 9608–9614 (1998).
10. Datta, A.K. & Paulson, J.C. J. Biol. Chem. 270, 1497–1500 (1995).
11. Datta, A.K., Chammas, R. & Paulson, J.C. J. Biol. Chem. 276, 15200–15207
12. Geremia, R.A., Harduin-Lepers, A. & Delannoy, P. Glycobiology 7, 161 (1997).
13. Jeanneau, C. et al. J. Biol. Chem. 279, 13461–13468 (2004).
14. Vallejo-Ruiz, V. et al. Biochim. Biophys. Acta 1549, 161–173 (2001).
15. Kono, M. et al. Glycobiology 7, 469–479 (1997).
16. Rearick, J.I., Sadler, J.E., Paulson, J.C. & Hill, R.L. J. Biol. Chem. 254, 4444–4451
17. Unligil, U.M. et al. EMBO J. 19, 5269–5280 (2000).
18. Chiu, C.P. et al. Nat. Struct. Mol. Biol. 11, 163–170 (2004).
19. Charnock, S.J. & Davies, G.J. Biochemistry 38, 6380–6385 (1999).
20. Lairson, L.L., Henrissat, B., Davies, G.J. & Withers, S.G. Annu. Rev. Biochem. 77,
21. Holm, L., Ouzounis, C., Sander, C., Tuparev, G. & Vriend, G. Protein Sci. 1,
22. Kitazume-Kawaguchi, S., Kabata, S. & Arita, M. J. Biol. Chem. 276, 15696–15703
Figure 2 The GT29 active site. (a) Active site of pST3Gal-I, with CMP
occupying the donor site and Galβ1,3GalNAcα-PhNO2 disaccharide
defining the acceptor site (yellow carbon atoms). Amino acids of
interest are shown as sticks, with the catalytic base represented in
cyan. Black dotted lines indicate potential hydrogen bonds. For
Galβ1,3GalNAcα-PhNO2, the unbiased 1.25-Å |Fo| − |Fc|, φcalc electron
density map is shown, contoured at 2σ. For CMP, the unbiased 1.55-Å
|Fo| − |Fc|, φcalc electron density map is shown, contoured at 2σ.
(b) Observed disaccharide acceptor binding in pST3Gal-I (yellow) with
a model of CMP3F-Neu5Ac (green) based on the CstII–CMP3F-Neu5Ac
complex (PDB 1RO7)18.