Hyaluronan‐protein interactions: Lilliput revisited

Wiley
Proteoglycan Research
Abstract and Figures

Hyaluronan (HA) is a huge linear polysaccharide composed entirely of a simple repeating disaccharide that has been preserved unchanged since the evolution of vertebrates ~520 million years ago. It is present in all mammalian tissues, being synthesised within the cell membrane and extruded into the extracellular space where it dictates tissue elasticity, hydration and permeability. HA also directs cell behaviour via engagement with cell surface receptors. These properties allow it to mediate diverse functions in a wide range of physiological and pathological processes, including development, mammalian reproduction and inflammation. There is growing evidence that the way in which the HA biopolymer is differentially organised, through its interaction with a repertoire of HA‐binding proteins (HABPs), is key to the diversity of its biological functions. This review summarises current knowledge of HA‐protein interactions and how their diversity may lead to the formation of HA/protein complexes with distinct molecular architectures that in turn underpin different physical properties and receptor‐mediated effects. Potentially controversial areas such as the pro‐inflammatory effects of low molecular weight HA fragments and how short HA‐binding peptides modulate HA function are considered from a molecular perspective.
Domain organisation of human HA‐binding proteins. The 14 members of the Link module superfamily are shown, along with the non‐Link module‐containing proteins (in right hand box) for which there is sufficient molecular evidence that they are functional HA‐binding proteins. All proteins are human apart from SpnHL, which is from Streptococcus pneumoniae. Amino acid sequences for the mature proteins were obtained from Uniprot or Protein NCBI databases and analysed using the Sequence SMART programme within the SMART Protein Domain database (http://smart.embl-heidelberg.de/)²⁵; proteins are shown to scale relative to the scale bar at the bottom of the figure showing the numbers of amino acid residues in 100 s. Domains are as defined in SMART (CCP, complement control protein module; CUB, complement C1r/C1s, Uegf, Bmp1 module; CLECT, C‐type lectin module; EGF, epidermal growth factor module; EGF Ca, EGF with calcium ion binding site; EGF‐Lam, laminin‐type EGF module; EGF‐like, unclassified subfamily of EGF module; FAS1, Fasciclin‐1 family domain; IG, immunoglobulin module; IGc2, C‐2 type IG module; IGv, V‐type IG module; Link, Link module; VIT, Vault protein inter‐alpha‐trypsin domain; VWA, von Willebrand Factor A domain; blue boxes are transmembrane regions, green boxes are coiled coil sequences, grey lines are regions of unspecified sequence and pink boxes regions of low complexity). GAG‐attachment regions (GAG) are denoted by orange boxes. For versican both V0 (containing both α‐ and β‐GAG attachment regions) and V3 isoforms (without either α‐GAG or β‐GAG) are shown with G1 and G3 domain regions, common to the lectican proteoglycans, indicated on the former; the G2 region of aggrecan is also highlighted. Brevican is shown in both its extracellular matrix and cell surface attached forms, where the latter is alternatively spliced and contains a glycosylphosphatidylinositol (GPI)‐linkage. 
For CD44 the standard form is shown (CD44s) along with CD44 (V1‐10) where all 10 variant exons are spliced in; the GAG attachment region within the V3 exon is shown. LYVE‐1 is shown as a disulphide (–ss–) linked dimer. The position of the proteolytic processing site in stabilin‐2 that leads to the generation of HARE (HA Receptor for Endocytosis) is indicated with an arrow. HC1 is shown as a representative heavy chain, where HC2, HC3 and HC5 also become covalently attached to HA. These HCs are likely to have an identical domain structure to HC1 (see Figure 4B) for which a crystal structure has been determined.²⁶ For RHAMM and SpnHL the regions that have been reported as mediating their interactions with HA are denoted by cyan bars and labelled ‘HABD’ (HA‐binding domain); see Figure 5 for further details on RHAMM HABD. All of the sequence analysis underpinning this figure was carried out by the author (A.J. Day, unpublished data).
Structures and models of Types‐A, ‐B and ‐C HA‐binding domains of the Link module superfamily. (A) The solution structure of the Link module from human TSG‐6 (Link_TSG6), a Type‐A HA‐binding domain (HABD), showing secondary structural elements (including 2 α‐helices and 2 triple‐stranded β‐sheets (composed of β‐strands 1, 2 and 6, and 3, 4 and 5)) and the positions of N‐ and C‐termini; pdb: 1O7B.³⁷ (B) The Type‐B HABD of human CD44 determined by NMR spectroscopy (pdb: 1POZ³⁸) is aligned relative to Link_TSG6 in (A) to show the similarity of the Link module regions (colour coded the same); additional β‐strands, shown in pink (0, 7, 8 and 9), form an extended β‐sheet with β‐strands 1, 2 and 6. (C) A model (in space‐filling representation) of Link_TSG6 (blue) with a bound HA octasaccharide (HA8; carbons in green); here the octasaccharide was modelled onto the Link module structure in its HA‐bound conformation (pdb: 1O7C³⁷) based on experimental restraints (see¹⁹). (D) The crystal structure of the HABD of mouse CD44 in complex with HA8 (carbons in orange), where the protein is in its ‘high affinity’ conformation (pdb: 2JCR³⁹). The Link module is shown in blue (oriented the same as Link_TSG6 in (C)) with the non‐link module regions shown in grey. (C, D) are adapted from the original figures made by A.J. Day and colleagues for use in.¹⁹ (E) A homology model of human HAPLN1 (a Type‐C HABD) with an HA12 (blue sticks) docked into the HA‐binding groove that spans the surface of the 2 Link modules (Link1 and Link2).³⁵ The amino acids predicted to form the inter‐module interface between Link1 and Link2 are shown in magenta (space‐filling) and the residues on the external faces of the Link modules predicted to mediate protein‐protein (e.g. HAPLN‐lectican) interactions are shown in cyan; the positions of the cyan and magenta amino acids within the Link module sequences are shown in Figure 3.
(F) A model (space filling) of how the Type‐C HABDs of aggrecan (red) and HAPLN1 (yellow) could associate to generate a helical HA (blue) structure.³⁵ (E, F) are adapted from the original figures made by A.J. Day and colleagues for use in.³⁵
Multiple sequence alignment of Link modules from the human Link module superfamily. The alignment of the Link module sequences from the 14 members of the human Link module superfamily was generated as described in Blundell et al., 2003 and 2005.³⁵,³⁷ Where proteins have more than one Link module these are denoted by a hyphen and the number of the Link module from the N‐terminal end of the protein (e.g. HAPLN1‐1 and HAPLN1‐2); for the Link modules that are not thought to be involved in HA binding their names are depicted in grey. Secondary structure elements for Link_TSG6, matching those in Figure 2A, are shown above the alignment along with the disulphide bonds (green lines) connecting the four consensus cysteines (C1, C2, C3 and C4 in black boxes); all Cys in the alignment are coloured green. The predicted disulphide bond between two additional non‐consensus cysteines within SUSD5 is also shown. In the sequences for TSG‐6, CD44 and LYVE‐1 amino acids for which there is experimental evidence to demonstrate their role in HA binding (see text) are coloured red. In the other sequences, chemically equivalent amino acids are coloured in orange, indicating residues that could potentially be involved in the interaction with HA (e.g. based on analysis of homology models³⁵). From this analysis the Link modules from stabilin‐1 and SUSD5 are predicted to not participate in HA binding. Residues in TSG‐6 (His4, Tyr12, His45 and Asp89 in Link_TSG6) that mediate the pH‐dependent interaction with HA are shown in the dark grey boxes. Amino acids predicted to form inter‐modular interactions between Link modules within the Type‐C HABDs (i.e. of HAPLNs and lecticans) are coloured magenta (within magenta boxes). Residues within these proteins that are predicted to mediate inter‐protein interactions (e.g. between an HAPLN and lectican) are coloured blue, within blue boxes.
Formation of HC•HA complexes and their roles in HA crosslinking. (A) Schematic showing TSG‐6‐mediated transfer of heavy chains (HCs) from the inter‐α‐inhibitor (IαI) family of proteoglycans onto HA (diamonds: GlcA; squares: GlcNAc) to form covalent HC•HA complexes. IαI contains 2 HCs (HC1 and HC2) that are linked via ester bonds (red circles) to the chondroitin sulphate (CS) chain of the bikunin core protein; PαI has only one HC (HC3). The identity of the proteoglycan containing HC5 is not yet known (?), however, based on available data it seems likely to be attached to bikunin•CS (see text for details). While HC1, HC2, HC3 and HC5 can all become covalently attached to GlcNAc residues within HA (via ester bonds) it is unknown if they could all be associated with the same HA chain where available data indicates they are sparsely arranged along the HA polymer (i.e. the depiction here is purely for illustrative purposes). Details on the molecular basis of the HC transfer reaction have been reviewed elsewhere.³² (B) Domain organisation of HC1 as determined from the crystal structure for this protein with amino acid residue numbering from P19827.²⁶ The N‐ (N; blue) and C‐terminal (C; yellow) regions together form the HC‐Hybrid2 domain and the von Willebrand Factor A (vWFA) domain is flanked by ‘H’ sequences (dark green) that form the HC‐Hybrid1 domain; the structures of these regions (from pdb: 6FPY) are shown above and below the schematic. The sequences at the extreme N‐ and C‐terminal ends (depicted in grey) are disordered in the structure. This domain organisation for HC1 is likely conserved for the other HCs (HCs 2‐6). (C) The 3D structure of HC1 showing the HC‐Hybrid1, HC‐Hybrid2 and vWFA domains (pdb: 6FPY²⁶). The vWFA domain contains a Mg²⁺ ion (pink) (or Mn²⁺ at the same position) that is required for mediating HC1‐HC1 interactions based on the analysis of a D298A mutant (position shown in (B)), for which a crystal structure is also available (pdb: 6FPZ). 
Residues 631‐638 (which form a loop within the HC‐Hybrid2 domain) and residues 653‐672, including the C‐terminal (COOH) aspartic acid that becomes ester bonded to HA, are disordered and shown as dotted red lines; the disordered region at the N‐terminal end (residues 35‐44) is not included in the figure. (D) Schematic showing how HC1 and HC2 attached to HA (via ester bonds; red circles) can participate in protein‐protein interactions that can crosslink multiple HA chains. Metal ion‐dependent HC1‐HC1 interactions have been demonstrated to occur,²⁶ albeit with very weak affinity, whereas the affinities of the predicted interactions between HC1 and HC2 have not yet been published. HC1 and HC2 are thought to interact with PTX3 (see text), an octameric protein, where this leads to HA crosslinking,¹⁷⁵,¹⁷⁶ although the molecular details have not yet been described.
Received: 20 July 2024 | Accepted: 14 October 2024
DOI: 10.1002/pgr2.70007
REVIEW ARTICLE
Hyaluronan‐protein interactions: Lilliput revisited
Anthony J. Day¹,²
¹Wellcome Centre for Cell‐Matrix Research, Faculty of Biology, Medicine & Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
²Lydia Becker Institute of Immunology and Inflammation, Manchester Academic Health Science Centre, Manchester, UK
Correspondence
Anthony J. Day, Wellcome Centre for Cell‐Matrix Research, Faculty of Biology, Medicine & Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.
Email: anthony.day@manchester.ac.uk
Funding information
Versus Arthritis, Grant/Award Number: 22277
KEYWORDS
HA, HABPs, hyaladherins, hyaluronan, hyaluronan‐binding proteins, hyaluronan‐protein interactions
Proteoglycan Research. 2024;2:e70007. wileyonlinelibrary.com/journal/pgr2
https://doi.org/10.1002/pgr2.70007
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium,
provided the original work is properly cited.
© 2024 The Authors. Proteoglycan Research published by Wiley Periodicals LLC.
Abbreviations and Acronyms: ADAMTS, A disintegrin and metalloproteinase with thrombospondin motifs; BRAL, brain‐specific link protein; C0S, Non‐sulfated CS disaccharide; C4S, Chondroitin‐4‐sulfate; CBD, Carbohydrate binding domain; CEMIP, Cell migration‐inducing and hyaluronan‐binding protein; CLECT, C‐type lectin; CS, Chondroitin sulphate; CSPG, Chondroitin sulphate proteoglycan; CUB, Complement C1r/C1s, Uegf, Bmp1; DS, Dermatan sulphate; ECM, Extracellular matrix; FRET, Förster Resonance Energy Transfer; GAG, Glycosaminoglycan; GlcA, Glucuronic acid; GlcNAc, N‐acetyl glucosamine; gpi, Glycosylphosphatidylinositol; GST, Glutathione S‐transferase; HA, Hyaluronan; HABD_CD44, HABD of CD44; HABD, Hyaluronan binding domain; HABP, Hyaluronan‐binding protein; HABP1, Hyaluronic Acid Binding Protein‐1; HAPLN, HA and Proteoglycan LiNk Protein family; HARE, HA Receptor for Endocytosis; HAS, Hyaluronan synthase; HC, Heavy chain; HC•HA, Hyaluronan‐heavy chain complex; HMW, High molecular weight; HS, Heparan sulphate; Hyal1, Hyaluronidase‐1; IG, Immunoglobulin; ITC, Isothermal titration calorimetry; ITIH, Inter‐α‐trypsin inhibitor heavy chain; IαI, Inter‐α‐inhibitor; LECs, Lymph vessel endothelial cells; Link_TSG6, Link module from human TSG‐6; LMW, Low molecular weight; LYVE‐1, Lymphatic vessel endothelial hyaluronan receptor 1; MIDAS, Metal ion‐dependent adhesion site; MMP, Matrix metalloproteinase; MyD88, Myeloid differentiation primary response 88; NMR, Nuclear Magnetic Resonance; pdb, Protein Data Bank; poly(I:C), Polyinosinic acid:polycytidylic acid; PTX3, Pentraxin‐3; PαI, Pre‐α‐inhibitor; RHAMM, Receptor for hyaluronan‐mediated motility; rhTSG‐6, Recombinant human TSG‐6; SAXS, Small‐angle X‐ray scattering; SpnHL, Hyaluronate lyase from Streptococcus pneumoniae; SUSD5, Sushi domain‐containing protein 5; TGF‐β, Transforming growth factor‐β; TK, Tissue kallikrein; TLR, Toll‐like receptor; TMEM2, Transmembrane protein 2; TSG‐6, Tumour necrosis factor‐stimulated gene‐6; VG1, G1 domain from human versican; VIT, Vault protein inter‐alpha‐trypsin domain; vWFA, von Willebrand Factor A.
INTRODUCTION
The title of my 2002 review article ‘Hyaluronan‐binding proteins: tying up the giant’¹ alluded to Gulliver's Travels,² first published in 1726, where the giant Gulliver was overcome and restrained by the many small people of Lilliput. This image still feels appropriate as a metaphor for how the giant polysaccharide hyaluronan (HA) can be tied up by HA‐binding proteins (HABPs), sometimes referred to as hyaladherins. Much has been learned about HA‐protein interactions over the 25 years since I wrote my first review on the topic;³ however, some fundamental questions remain on how exactly HABPs regulate HA biology. The aim here is to capture the current state of play around this topic and highlight areas where there is controversy or the need for further research.
HA is a large (MDa) linear glycosaminoglycan (GAG) composed entirely of repeating disaccharides of glucuronic acid (GlcA) and N‐acetyl glucosamine (GlcNAc) connected by alternate β1‐3 and β1‐4 linkages with a contour length of up to ~10 µm.⁴ HA is subject to no biosynthetic modifications, having been preserved unchanged as a chemically simple polysaccharide⁵ since its appearance in vertebrates, likely evolving from chondroitin sulphate (CS), over 400 million years ago.⁶ However, HA may be older than this, perhaps being present in amphioxus (see⁶), where chordates likely split from this cephalochordate around 520 million years ago.⁷
Despite its simplicity, HA has diverse functional activities. It is made by essentially all cell types in humans, being a key component of the extracellular matrix, with crucial roles in reproduction, development, tissue homoeostasis, wound healing, and the regulation of stromal and immune cell migration.⁸⁻¹⁰ In fact, HA is involved in almost every aspect of mammalian biology, and in most contexts is synthesised in the cell membrane and extruded into the extracellular compartment by hyaluronan synthase (HAS) enzymes, a process that is now well characterised at the molecular level.¹¹ Surprisingly, HA can also be found inside cells; however, its origin and intracellular functions are currently less well understood.¹²,¹³
In solution phase (e.g. in healthy synovial fluid, and the pleural and peritoneal cavities), HA is present as a free sugar without proteins bound, where each molecule occupies a very large volume, due to ionic repulsion between the carboxylic acid moieties of the GlcA residues and HA's locally extended conformation (i.e. forming a stiffened random coil).⁵,¹⁴ In this context, data strongly indicate that, under physiological conditions, there are no intermolecular interactions between HA chains.¹⁵ Its solution properties can be explained by the rapid dynamic switching (on the picosecond timescale) between a large number of equi‐energetic conformations (i.e. with a restricted range of psi and phi angles, stabilised by transient intramolecular hydrogen bonds formed across the glycosidic linkages) and by molecular entanglement when HA is present above the critical overlap concentration⁵,¹⁴,¹⁶; moreover, HA is highly hydrated in aqueous solution, as expected for a hydrophilic polysaccharide, with ~15 water molecules dynamically associated with each disaccharide.¹⁷
HA, in most other contexts, is associated with HABPs, where the composition and organisation of a particular HA/protein complex likely underpin its physical and functional properties.¹,⁵ In other words, the diverse functional roles of HA arise from the diversity in the structure and function of HA‐binding proteins and the differential ways in which they interact with HA and each other. This provides a plausible explanation for how such a chemically simple polysaccharide can mediate a broad spectrum of biological activities. Importantly, HA represents a highly versatile, and unique, scaffold on which to build HA‐protein complexes.⁴,¹⁸ This is due to its inherent flexibility (i.e. dynamically sampling a wide range of low energy conformations in solution phase; see above), where different HABPs might capture, propagate and stabilise different HA conformers.⁵,¹⁸⁻²⁰ Furthermore, because HA has no biosynthetic modifications it is the perfect template on which to generate periodic arrays.⁵,¹⁸ In this regard, the large size of HA in tissues, ranging from 10⁵ to 10⁷ Da²¹ (e.g. an HA chain of 2 MDa is composed of ~5000 disaccharides), allows it to form enormous complexes with proteins, such that hundreds (and in some cases >1000) of HABPs can associate with a single HA molecule.⁴,¹⁸ Depending on the type of HABP(s) that are associated with HA, very different molecular architectures can be formed, either built on individual HA polysaccharides or via crosslinking multiple (sometimes vast numbers of) HA chains together.⁵,¹⁸,²² The location of HA/protein complexes (e.g. on the cell surface or in the extracellular matrix) will also have an impact on the molecular architectures formed.⁴
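The scale of these complexes can be checked with simple arithmetic. The Python sketch below is illustrative only: the ~400 Da mass per disaccharide is an assumption chosen to be consistent with the example of a 2 MDa chain containing ~5000 disaccharides, the ~1 nm contour length per disaccharide is a commonly quoted approximation that is not stated in the text, and the function names and the four‐disaccharide (octasaccharide) footprint used in the usage lines are hypothetical.

```python
# Back-of-envelope estimates for an HA chain (all constants are assumptions).
DISACCHARIDE_MASS_DA = 400.0    # assumed average mass of one GlcA-GlcNAc repeat
DISACCHARIDE_LENGTH_NM = 1.0    # assumed contour length contributed by one repeat


def disaccharide_count(mass_da: float) -> int:
    """Approximate number of disaccharide repeats in a chain of the given mass."""
    return round(mass_da / DISACCHARIDE_MASS_DA)


def contour_length_um(mass_da: float) -> float:
    """Approximate contour length in micrometres."""
    return disaccharide_count(mass_da) * DISACCHARIDE_LENGTH_NM / 1000.0


def max_binding_sites(mass_da: float, footprint_disaccharides: int) -> int:
    """Idealised upper bound on non-overlapping protein footprints along one chain."""
    return disaccharide_count(mass_da) // footprint_disaccharides


print(disaccharide_count(2e6))      # 5000 repeats in a 2 MDa chain
print(contour_length_um(4e6))       # 10.0 (micrometres) for a 4 MDa chain
print(max_binding_sites(2e6, 4))    # 1250 sites for a four-disaccharide footprint
```

Under these assumptions a single 2 MDa chain could in principle accommodate more than 1000 proteins that each occupy an octasaccharide, which is in line with the numbers quoted above; this is an idealised ceiling that ignores steric crowding between neighbouring proteins.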
Interestingly, in synovial fluid it is HA's interaction with phospholipids, rather than proteins, that appears responsible for the ultralow friction and boundary lubrication of articular cartilage.²³,²⁴ However, non‐protein ligands for HA will not be discussed further here.
THE LINK MODULE SUPERFAMILY AND ITS INTERACTION WITH HA
The majority of HABPs contain one or more Link modules (Figure 1), a structural domain of approximately 90 amino acid residues that is often, but not always, associated with HA‐binding activity. The initial determination of the Link module fold revealed a strong structural similarity to the C‐type lectin (CLECT) domain, and hence the suggestion that they have a common evolutionary origin.²⁷ The CLECT domain is found in invertebrate proteins, whereas the Link module has largely been identified in vertebrates (see below). On this basis it has been suggested²⁸ that the Link module evolved from the CLECT domain before the divergence of the cartilaginous and bony fish/land vertebrates approximately 400 million years ago (see²⁹ for the current understanding of the evolution of vertebrates). The CLECT domain contains a Ca²⁺‐binding loop usually required for ligand binding,³⁰,³¹ which is not present in the Link module, where the latter binds HA in a metal ion‐independent manner.²⁷
There are fourteen Link module‐containing proteins in humans (see Figure 1), including CD44 (the major cell‐surface receptor for HA), four lecticans (aggrecan, brevican, neurocan and versican), four link proteins (HAPLN (HA and Proteoglycan LiNk protein family)‐1, ‐2, ‐3 and ‐4), LYVE‐1, Stabilin‐1 and ‐2, Sushi domain‐containing protein 5 (SUSD5), also known as KIAA0527, and TSG‐6, the protein product of Tumour Necrosis Factor‐stimulated gene‐6.²² These proteins are highly conserved in other mammals, and many have also been identified (e.g. from sequence data) in amphibians, birds, fish (including coelacanths, eels, lampreys, and sharks) and reptiles (including crocodilians and turtles), based on information extracted from the SMART Protein Domain Database.²⁵ There are also some potential Link module‐containing proteins in ascidia (sea squirts) and lingulids (marine brachiopods), which might indicate that this domain has an origin older than previously thought. Additionally, there is a TSG‐6‐related sequence in Ixodes ricinus, the common tick, which is 98% identical to that of the Jamaican fruit bat, presumably resulting from misidentification from a contaminating blood meal. Moreover, proteins with apparent Link module sequences are present in a few bacteria (e.g. bdellovibrio, flavobacteriaceae and parcubacteria), where it seems possible that these derive from vertebrate hosts via horizontal gene transfer.
FIGURE 1 Domain organisation of human HA‐binding proteins.
As described previously,¹,²² the Link module superfamily can be divided into at least three subgroups based on the size of their HA‐binding domains (HABD).
Type‐A HA‐binding proteins: TSG‐6 and stabilin‐2
TSG‐6 contains a single independently folded Link module that is sufficient to mediate the interaction with HA³² and has been designated a Type‐A HABD.¹,²² Stabilin‐1, stabilin‐2 and SUSD5 all also contain individual Link modules (Figure 1), but of these only stabilin‐2 is known to have HA‐binding activity³³,³⁴; based on sequence comparisons and molecular modelling, it seems very unlikely that the Link modules from stabilin‐1 and SUSD5 will be able to bind HA.³⁵ A 190‐kDa form of stabilin‐2, generated by proteolytic processing, is termed HARE (HA Receptor for Endocytosis), where both HARE and stabilin‐2 have important roles as endocytic scavenger receptors in a wide range of tissues,³³ involved for example in the systemic clearance of HA when expressed on liver sinusoidal endothelial cells.³⁶
The tertiary structure, HA binding site, and dynamics of the TSG‐6 Link module
The 3D fold for the Link module was first elucidated from human TSG‐6, a protein that protects tissues during inflammation.³² An initial structure determined by nuclear magnetic resonance (NMR) spectroscopy²⁷ was superseded by a higher quality solution structure based on a larger number of experimental restraints.³⁷ As shown in Figure 2A, the Link module is composed of two α‐helices and two triple‐stranded, antiparallel β‐sheets, with four cysteine residues forming two intramolecular disulphide bonds (Cys23‐Cys92 and Cys47‐Cys68 as numbered in the isolated Link module). These consensus cysteines (linked in a pattern of C1‐C4 and C2‐C3) are fully conserved within the Link module superfamily,³⁵ with the exception of SUSD5, which lacks the C2‐C3 disulphide bridge but has two additional cysteines that are predicted to form a disulphide bond²⁸; see the Link module alignment in Figure 3.
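The relationship between the residue numbering and the C1‐C4/C2‐C3 nomenclature can be made explicit with a short Python sketch. It is illustrative only: the function name is hypothetical, and the only data used are the two disulphide bonds quoted above for the isolated Link module of TSG‐6.

```python
def disulphide_pattern(bonds):
    """Label cysteines C1..Cn in sequence order and express each bond with those labels."""
    positions = sorted({residue for bond in bonds for residue in bond})
    label = {residue: f"C{index}" for index, residue in enumerate(positions, start=1)}
    return sorted((label[min(a, b)], label[max(a, b)]) for a, b in bonds)


# Disulphide bonds of Link_TSG6, numbered as in the isolated Link module.
LINK_TSG6_BONDS = [(23, 92), (47, 68)]

print(disulphide_pattern(LINK_TSG6_BONDS))  # [('C1', 'C4'), ('C2', 'C3')]
```

Ordering the four cysteines by position (23, 47, 68, 92) gives C1 to C4, so Cys23‐Cys92 is the C1‐C4 bond and Cys47‐Cys68 is the C2‐C3 bond.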
The NMR structure for the Link module of TSG‐6 (often referred to as Link_TSG6) was determined in both the absence and presence of an HA octasaccharide.³⁷ The determination of the HA‐bound conformation for the Link module, in combination with data from site‐directed mutagenesis, and NMR spectroscopy and isothermal titration calorimetry with different lengths of HA, allowed the generation of an experimentally‐validated model for the Link_TSG6/HA complex.¹⁹,³⁵,⁴⁰,⁴¹ This large body of work identified the amino acids in TSG‐6 involved in mediating its interaction with HA, revealing that these residues are present in a shallow groove on the Link module surface. In the latest (refined) model for the Link_TSG6/HA complex,¹⁹ three basic residues (Lys11, Lys63 and Arg81) can make salt bridges with the carboxylate groups of three glucuronic acid moieties, where these ionic interactions are likely to be transient. In addition, His45, Tyr59 and Tyr78 form ring stacking (CH‐π) interactions with three sugar rings, with Phe70 perhaps also making an occasional stacking interaction, and the hydroxyls of Tyr12 and Tyr78 making hydrogen bond contacts with the HA; methyl groups from two GlcNAc residues are accommodated within hydrophobic pockets located at the bottom of the binding groove. Thus a complex interaction network is formed, where it has been determined that ionic interactions only contribute ~25% of the free energy of binding.⁴² An HA octasaccharide (HA₈ᴬᴺ, with a non‐reducing terminal GlcA (A) and reducing terminal GlcNAc (N)) was the minimum length of oligosaccharide found to make the full complement of these interactions, where seven of the eight sugar rings make contact with the protein.¹⁹ There is a large contact area involved, with the octasaccharide wrapping round two faces of the Link module (see Figure 2C). This requires two of the glycosidic bonds in the bound HA₈ to be highly kinked, although still with energetically favoured conformations.
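To give a sense of what a ~25% ionic contribution means in energetic terms, the Python sketch below converts a dissociation constant into a standard binding free energy (ΔG° = RT ln K_d, with K_d expressed relative to a 1 M standard state) and splits it according to that fraction. The K_d of 1 µM and the temperature of 298.15 K are placeholder values chosen purely for illustration; they are not taken from the text, and the function names are hypothetical.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def binding_free_energy_kj(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard free energy of binding, dG = RT ln(Kd), in kJ/mol (1 M standard state)."""
    return GAS_CONSTANT_KJ * temperature_k * math.log(kd_molar)


def split_ionic(delta_g_kj: float, ionic_fraction: float = 0.25):
    """Split a binding free energy into ionic and non-ionic parts (fraction from the text)."""
    ionic = ionic_fraction * delta_g_kj
    return ionic, delta_g_kj - ionic


# Hypothetical example: Kd = 1 µM is an assumed placeholder, not a measured value.
delta_g = binding_free_energy_kj(1e-6)
ionic, non_ionic = split_ionic(delta_g)
print(f"total {delta_g:.1f} kJ/mol, ionic {ionic:.1f}, non-ionic {non_ionic:.1f}")
```

Whatever the actual affinity, the split shows that about three quarters of the binding energy would come from the stacking, hydrogen bonding and hydrophobic contacts rather than from the salt bridges.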
Interestingly, the TSG‐6 Link module undergoes a small but significant conformational perturbation on HA binding.³⁷ In the absence of HA, the Link module is in a ‘closed’ state such that the HA‐binding groove is not visible, whereas in its presence the groove becomes ‘open’, exposing the residues that mediate binding. This is largely due to a change in the conformation of the β4‐β5 loop (connecting the β4 and β5 strands; see Figure 2A), which has a high degree of mobility in the free protein, with motion on the nanosecond to picosecond timescale.³⁷,⁴³ This is consistent with the crystal structure for Link_TSG6, which has 5 molecules in the asymmetric unit, with little or no electron density visible for the β4‐β5 loop and the presence of multiple conformations of the Cys47‐Cys68 disulphide bond.⁴³ On HA binding there is a change in the geometry of the Cys47‐Cys68 disulphide and the dynamic behaviour of the Link module is greatly dampened, such that the conformation of the β4‐β5 loop becomes stabilised in its open position. In the free protein, while the binding groove is usually closed, it is thought the β4‐β5 loop can undergo slower timescale motions, thereby sampling a variety of conformations that may include the open conformation, i.e. facilitating the capture of HA within the binding groove. In the context of the Link_TSG6/HA complex there is still motion within the Link module that might allow
FIGURE 2 Structures and models of Types‐A, ‐B and ‐C HA‐binding domains of the Link module superfamily.
35
PROTEOGLYCAN RESEARCH
|
5of33
FIGURE 3 Multiple sequence alignment of Link modules from the human Link module superfamily. The alignment of the Link module sequences from the 14 members of the human Link module
superfamily was generated as described in Blundell et al., 2003 and 2005.
35,37
Where proteins have more than one Link module these are denoted by a hyphen and the number of the Link module
from the Nterminal end of the protein (e.g. HAPLN11 and HAPLN12); for the Link modules that are not thought to be involved in HA binding their names are depicted in grey. Secondary structure
elements for Link_TSG6, matching those in Figure 2A, are shown above the alignment along with the disulphide bonds (green lines) connecting the four consensus cysteines (C1, C2, C3 and C4 in
black boxes); all Cys in the alignment are coloured green. The predicted disulphide bond between two additional nonconsensus cysteines within SUSD5 is also shown. In the sequences for TSG6,
CD44 and LYVE1 amino acids for which there is experimental evidence to demonstrate their role in HA binding (see text) are coloured red. In the other sequences, chemically equivalent amino acids
are coloured in orange, indicating residues that could potentially be involved in the interaction with HA (e.g. based on analysis of homology models
35
). From this analysis the Link modules from
stabilin1 and SUSD5 are predicted to not participate in HA binding. Residues in TSG6 (His4, Tyr12, His45 and Asp89 in Link_TSG6) that mediate the pHdependent interaction with HA are shown
in the dark grey boxes. Amino acids predicted to form intermodular interactions between Link modules within the TypeC HABDs (i.e. of HAPLNs and lecticans) are coloured magenta (within
magenta boxes). Residues within these proteins that are predicted to mediate interprotein interactions (e.g. between an HAPLN and lectican), are coloured blue, within blue boxes.
6of33
|
DAY
some conformational flexibility in the bound HA, thereby decreasing
the entropic penalty of capturing a singleHA conformation.
43
The pH-dependency of the TSG-6-HA interaction

The interaction of the TSG-6 Link module with HA is highly pH dependent, with maximum binding at pH 6.0 and a dramatic loss of binding both above and below this optimum.⁴¹,⁴⁴,⁴⁵ For example, there is a 100-fold increase in the binding affinity as the pH is lowered from pH 7.4 to pH 6.0, whereas aggrecan's binding to HA decreases 75-fold over the same range.⁴⁵ The loss in Link_TSG6's HA-binding affinity below pH 6.0 is due to the change in protonation state of His45.⁴¹ In its uncharged state (at pH 6.0) His45 plays a role in stabilising the β4-β5 loop that forms one side of the HA binding groove³⁷,⁴³ and also makes a direct ring stacking interaction with HA.¹⁹ On the other hand, the decrease in HA binding above pH 6.0 is caused by the loss of protonation of a histidine (His4) that is not within the HA-binding site.⁴¹ When protonated (e.g. at pH 6.0), His4 can make a salt bridge to Asp89, which is buried in the hydrophobic core of the protein, where Asp89 is simultaneously hydrogen bonded to Tyr12, a residue that has an important role in HA binding.⁴¹ Thus, the resulting molecular network transmits the change in protonation state of His4 to the HA binding site, where increasing pH is thought to lead to a destabilisation of the β1-α1 loop, on which Lys11 and Tyr12 are located, and that forms the other side of the binding groove. The reduction of HA binding above pH 6.0 is also a property of the full-length protein,⁴¹ and it has been suggested that TSG-6 may be regulated by pH gradients, e.g. in tissues such as cartilage.³²,⁴³,⁴⁵ His4, His45 and Asp89 are present in the TSG-6 sequence from a wide range of species including other mammals, chicken, fish and Xenopus, suggesting that the pH regulation of TSG-6's HA-binding activity has been a conserved feature of this protein throughout its evolution.⁴¹ Moreover, given that Link_TSG6 is being developed as a biological drug in its own right, e.g. for dry eye disease⁴⁶ and osteoarthritis,⁴⁷ the pH-dependency of its HA binding (and other activities⁴⁴,⁴⁸) could contribute to the mechanisms underlying its therapeutic activities.
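The bell-shaped pH profile described above can be illustrated with a minimal two-site sketch, in which binding is taken to require His45 to be uncharged and His4 to be protonated. The pKa values used here (5.0 and 7.0) are hypothetical placeholders chosen only so that the optimum falls at pH 6.0; they are not measured values from the text, and treating the two histidines as independent titrating groups is a simplifying assumption. The sketch also converts the reported 100-fold affinity change into an approximate free-energy difference.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch fraction of a titratable group that carries its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def active_fraction(ph: float, pka_his45: float = 5.0, pka_his4: float = 7.0) -> float:
    """Fraction of molecules with His45 uncharged AND His4 protonated.

    Both pKa defaults are hypothetical, and the two sites are assumed
    to titrate independently of one another.
    """
    his45_uncharged = 1.0 - fraction_protonated(ph, pka_his45)
    his4_protonated = fraction_protonated(ph, pka_his4)
    return his45_uncharged * his4_protonated


def ddg_kj_per_mol(fold_change: float, temperature_k: float = 298.15) -> float:
    """Free-energy difference equivalent to a fold change in binding affinity."""
    return R * temperature_k * math.log(fold_change) / 1000.0


if __name__ == "__main__":
    for ph in (5.0, 6.0, 7.4):
        print(f"pH {ph:.1f}: active fraction = {active_fraction(ph):.2f}")
    print(f"100-fold affinity change = {ddg_kj_per_mol(100):.1f} kJ/mol at 25 degrees C")
```

With these placeholder pKa values the model peaks midway between them. It does not reproduce the full 100-fold difference between pH 7.4 and pH 6.0, which suggests that a simple population model of this kind is too coarse to capture the reported effect.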
The GAG-binding specificity of the TSG-6 Link module

Link_TSG6 interacts with other GAGs in addition to HA; i.e. CS, dermatan sulphate (DS) and heparan sulphate (HS).⁴⁸⁻⁵⁰ While heparin/HS binds at a surface that is on the opposite face from the HA interaction site,⁴⁸ chondroitin-4-sulphate (C4S) can be accommodated within the HA-binding groove, based on an NMR structure determined for Link_TSG6 in its C4S-bound conformation.⁵¹ This is consistent with earlier experiments showing that C4S can compete for HA binding⁴⁵,⁴⁹; heparin can also inhibit the interaction with HA in competition experiments,⁴⁸ but this is via an allosteric mechanism.⁴³ These data indicate that the composition of the matrix within particular microenvironments could differentially regulate TSG-6's HA-binding function.⁴⁸
TSG-6-mediated crosslinking of HA

Surface-sensitive biophysical experiments with polymeric HA end-grafted on to a solid support to form a film showed that TSG-6 can crosslink HA chains.⁵² TSG-6 caused a dramatic condensation (collapsing) and rigidification of the HA network at physiologically relevant concentrations of the protein; the HA film thickness was reduced, up to fourfold, on TSG-6 binding. The interaction of HA with the full-length protein is cooperative in nature (unlike its binding to Link_TSG6, which is non-cooperative) and induces the formation of TSG-6 dimers and higher-order oligomers (likely tetramers) that can directly connect HA chains together. The oligomerisation (that only occurs in the presence of HA) may be mediated by TSG-6's CUB module. The dramatic effect of TSG-6 on HA structure is thought to be due to a combination of its crosslinking activity along with a conformational perturbation of HA (bending) caused by binding to two faces of the Link module (see Figure 2C). Consistent with this, Link_TSG6, which does not oligomerise in the presence of HA, and binds with lower affinity compared to the full-length protein, has a smaller effect on HA film thickness. Furthermore, a lower occupancy of TSG-6 is required for condensation of HA. For example, a twofold reduction in thickness occurs when a TSG-6 molecule is bound every 88 saccharides on the HA chain, whereas binding of Link_TSG6 every 12 saccharides is necessary to achieve the same decrease⁵²; in the latter, the HA will be close to saturation, given that HA₂₀ is the minimum length of HA oligosaccharide that can accommodate two Link_TSG6 molecules.⁴¹
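The occupancy comparison above can be made explicit with some simple arithmetic. The sketch below assumes a footprint of about 10 saccharides per bound Link_TSG6, which is an inference from HA₂₀ being the shortest oligosaccharide that holds two molecules and not a value stated in the text; applying the same footprint to full-length TSG-6 is a further assumption made only for comparison.

```python
# Assumed footprint (saccharides per bound Link module), inferred from
# HA20 being the minimum length that accommodates two Link_TSG6 molecules.
ASSUMED_FOOTPRINT = 10


def fractional_occupancy(footprint: int, spacing: int) -> float:
    """Fraction of the HA chain covered when one protein binds every `spacing` saccharides."""
    if footprint <= 0 or spacing < footprint:
        raise ValueError("spacing must be at least as large as a positive footprint")
    return footprint / spacing


# Spacings reported for a twofold reduction in HA film thickness.
SPACING_FULL_LENGTH = 88  # saccharides per bound TSG-6
SPACING_LINK = 12         # saccharides per bound Link_TSG6

if __name__ == "__main__":
    link = fractional_occupancy(ASSUMED_FOOTPRINT, SPACING_LINK)
    full = fractional_occupancy(ASSUMED_FOOTPRINT, SPACING_FULL_LENGTH)
    print(f"Link_TSG6 coverage:  {link:.0%}")
    print(f"TSG-6 coverage:      {full:.0%}")
    print(f"Fold fewer TSG-6 molecules needed: {SPACING_FULL_LENGTH / SPACING_LINK:.1f}")
```

On these assumptions Link_TSG6 covers roughly five sixths of the chain, in line with the statement that the HA is close to saturation, while roughly sevenfold fewer molecules of the full-length protein achieve the same condensation.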
This crosslinking activity of TSG-6 could be biologically relevant, serving to collapse/organise HA and thereby modulate its mechanical properties.³² However, it should be noted that inter-α-inhibitor (IαI) inhibits the interaction of TSG-6 with HA,⁵³ such that the TSG-6-mediated crosslinking of HA may be restricted to tissue locations where IαI, a plasma proteoglycan that leaks into tissues e.g. during inflammation (see⁵⁴), is at a low concentration or absent. One context where TSG-6-mediated crosslinking could occur is within the HA-rich cumulus matrix that forms around the oocyte before ovulation (see⁵²), where TSG-6 expression is rapidly upregulated following the gonadotropin surge⁵⁵,⁵⁶; i.e. before IαI ingresses into the ovarian follicle. Here, TSG-6 could become concentrated in a dense pericellular HA network around the cumulus cells, providing a reservoir of the protein ready to interact with IαI, which switches TSG-6 from an HA-binding protein into an enzyme that mediates the covalent transfer of Heavy Chains (HCs) from IαI onto HA and thereby drives cumulus matrix expansion.⁵³,⁵⁷ This catalytic function of TSG-6 leads to a different mode of HA crosslinking, which is discussed in detail below (see "Heavy chains of the inter-α-inhibitor family").
Stabilin-2/HARE-HA interactions

Stabilin-2/HARE, which has a single Link module towards the C-terminus of the protein (Figure 1), plays an important role in the endocytic clearance of HA and CS (and other GAGs) from blood and lymph fluid.³³,⁵⁸ Recombinant human HARE (190-hHARE) was expressed on Flp-In HEK293 cells and shown to mediate the uptake of ¹²⁵I-HA (average MW of ~133 kDa), with Scatchard analysis indicating a K_D of ~10 nM for the HA-HARE interaction⁵⁹; expression of a truncation mutant of 190-hHARE with the Link module deleted had greatly decreased uptake of HA,⁶⁰ providing evidence that it is this domain of HARE that binds HA. Microtitre plate assays with the purified ectodomains from stabilin-2 or 190-hHARE provided similar affinities, with estimated K_D values of ~10 nM and ~20 nM, respectively⁶¹; ligand blotting experiments for the interaction of the 190-hHARE ectodomain with ¹²⁵I-labelled oligosaccharides indicated that a HA 20-mer bound with ~35-fold higher (apparent) affinity than a decasaccharide.⁶²
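For readers unfamiliar with the Scatchard analysis mentioned above, the following sketch shows how a dissociation constant is recovered from binding data by linearisation. The data are synthetic and noiseless, generated for a single-site model with K_D = 10 nM purely for illustration; the concentrations, the B_max value and the function names are all hypothetical and do not come from the cited studies.

```python
def bound_ligand(free_nm: float, bmax: float, kd_nm: float) -> float:
    """Single-site binding isotherm: B = Bmax * F / (KD + F)."""
    return bmax * free_nm / (kd_nm + free_nm)


def scatchard_fit(free, bound):
    """Least-squares line through (B, B/F); slope = -1/KD, intercept = Bmax/KD."""
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


if __name__ == "__main__":
    # Hypothetical free-ligand concentrations (nM) and synthetic bound values.
    free = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
    bound = [bound_ligand(f, bmax=1.0, kd_nm=10.0) for f in free]
    kd, bmax = scatchard_fit(free, bound)
    print(f"KD = {kd:.2f} nM, Bmax = {bmax:.2f} (arbitrary units)")
```

With real data, a curved Scatchard plot would indicate departure from this simple single-site model, e.g. through cooperativity or multiple classes of site, and direct non-linear fitting of the isotherm is generally preferred for parameter estimation.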
Interestingly, HARE has been found to recognise and internalise HA ranging in size from an octasaccharide to ~10 MDa (see⁶²). However, only HA in the range of 40–400 kDa is able to induce NF-κB-mediated gene expression (via ERK1/2 activation⁶⁰) in a HARE-dependent manner.⁶² Thus, a relatively narrow size range of HA can activate the HARE signalling system (with ~140 kDa HA being the optimal size⁶²), suggesting that while HARE is involved in the continuous clearance of HA from the circulation (e.g. via liver sinusoids) it is also able to monitor and react to the extent/nature of HA turnover throughout the body.⁶² Mechanisms explaining how HARE recognises some HA sizes but not others have been proposed.⁶³
CS is able to compete for ¹²⁵I-HA binding to 190-hHARE, where C6S and CS-E (4- and 6-sulphated CS) were better competitors than C4S,⁵⁹ suggesting that, like Link_TSG6, CS may potentially be accommodated within HARE's HA-binding groove. The Link module of stabilin-2/HARE is 43% identical to Link_TSG6,³⁵ and interestingly, contains all of the amino acids responsible for the pH dependency in TSG-6, including His2322 (equivalent to His45 in Link_TSG6). As such, this might explain how stabilin-2/HARE is able to release HA when it is internalised to endocytic compartments with low pH.³³,⁶⁴
Type B HA-binding proteins: CD44 and LYVE-1

The 3D structure of CD44's HA-binding domain

CD44 is a ubiquitous cell surface molecule and the best characterised receptor for HA. The structure of the HA-binding domain from human CD44 (HABD_CD44) has been determined by both NMR spectroscopy and X-ray crystallography, revealing it is composed of a single Link module with flanking N- and C-terminal sequences that are required for folding and functional activity.³⁸ As such, CD44 contains an "extended" Link module forming a domain of about 150 amino acids,²⁰ which has been designated as a Type B HABD.¹,³ As can be seen from Figure 2B, the Link module of CD44 is very similar in overall structure to that of TSG-6 apart from having 4 additional β-strands (β0, β7, β8 and β9) that form a seven-stranded β-sheet with three of the β-strands from the Link module. In addition to the disulphide bridges formed by the consensus cysteines in the Link module, Cys28 and Cys129 (numbered as in the preprotein) are disulphide bonded, linking together the N- and C-terminal extensions.
The HABD from mouse CD44 has also been analysed by crystallography, with structures determined of the unliganded apoprotein and the protein in complex with HA₈³⁹; the HA binds in a shallow groove in a similar location on the Link module to the HA-binding site in Link_TSG6.¹⁹,³⁹ From this structural analysis, the HA-binding surface is entirely located within the Link module of the HABD_CD44, with thirteen amino acids of mouse CD44 (Arg45, Tyr46, Cys81, Arg82, Tyr83, Ile92, Asn98, Ile100, Cys101, Ala102, Ala103, His105 and Tyr109; see alignment in Figure 3) making prominent contact with five rings of the HA. Unexpectedly, four basic amino acids within the C-terminal extension (Arg150, Arg145, Lys158 and Arg162), which had been predicted to play a role in binding,⁶⁵ do not appear to be involved in the interaction, i.e. based on extensive analysis of site-directed mutants.³⁹ However, Arg45 (equivalent to Arg41 in human CD44) was found to play a critical role in the interaction with HA, with its guanidino group forming a water-mediated hydrogen bond with the carbonyl oxygen of a GlcNAc sugar; the β1-α1 loop carrying this residue can exist in two different conformations within the CD44/HA₈ complex (one in which the arginine makes contact with HA and the other where it does not) that are thought to represent high- and low-affinity binding states, respectively. It had long been presumed that this arginine would make a salt bridge to the HA (e.g. see⁶⁵,⁶⁶), but in fact, the binding of CD44 to HA does not involve any ionic interactions,³⁹ and consistent with this, is not inhibited by salt.⁶⁷ Moreover, there are no ring stacking interactions formed between HA and the CD44_HABD, with a network of hydrogen bonds and van der Waals forces mediating the binding instead³⁹; the methyl group from a single GlcNAc residue sits within a hydrophobic pocket (formed by the side chains of Tyr83 and Ile92 and the disulphide between Cys81-Cys101) at a similar position to one of the pockets in Link_TSG6. While only five sugar rings of HA make prominent contact with the protein,³⁹ HA₆ᴬᴺ oligosaccharides bind more weakly than HA₈ᴬᴺ,²⁰ consistent with the conclusion that oligomers shorter than eight cannot make the full complement of interactions within the HA binding groove.³⁸
Extensive reordering of the β0 and β8 strands, with the β9 strand becoming disordered, has been reported to occur on HA binding, based on NMR experiments on CD44 in the absence and presence of HA₆,⁶⁸ i.e. using an essentially identical human HABD_CD44 construct to that investigated in our studies.³⁸,³⁹ However, this major structural perturbation is incompatible with the more subtle ligand-induced conformational changes in the β1-α1 loop observed in the crystal structure for mouse HABD_CD44.³⁹ Moreover, our NMR experiments on the human HABD_CD44³⁸ indicated that while there are some alterations in the hydrogen bonding network on HA binding, these serve to stabilise the association between the β3 and β4 strands and destabilise the interaction of the β2 and β6 strands, with no evidence for a significant change in the topology of the secondary structure (e.g. in the C-terminal region). The suggestion by Shimada and colleagues that in the presence of HA₆ the human HABD_CD44 is mostly present in solution in a partially disordered form (>90% of molecules)⁶⁹ is seemingly inconsistent with the similarities in the H-bond network of the liganded and unliganded protein we observed, as determined from slow exchanging amides.³⁸ Therefore, it is difficult to understand the origin of the structural differences between our findings and those of the Shimada lab.
An alternative position for the HA-binding site on human CD44 was suggested³⁸ to reconcile the widespread distribution of amino acids implicated from the original site-directed mutagenesis studies (e.g. Lys38, Arg154 and Arg162⁶⁵,⁶⁶); i.e. with HA being accommodated in two mutually exclusive orientations on the surface of the HABD.³⁸,⁷⁰ A study using molecular dynamics simulations has provided support for this possibility, where this alternative "parallel" mode (and a third dramatically different "upright" orientation) is proposed to represent metastable configurations adopted during the early stages of binding.⁷¹ However, this seems unlikely, given that the amino acids identified as being important in mediating these alternative modes (i.e. Arg154 (parallel) and Arg162 (upright)⁷¹) have been shown in the context of mouse CD44 (Arg159 and Arg167, respectively) not to contribute to HA binding.³⁹
Comparison of the HA-binding sites in CD44 and TSG-6

From the above, it is apparent that CD44 mediates its binding to HA in a very different way compared to TSG-6 (see¹⁹). While the locations of the binding grooves are similar for HABD_CD44 and Link_TSG6 (Figure 2C, D), most of the residues that are involved in the interaction with HA are found in different sequence positions within these Link modules (Figure 3). The considerable differences between the HABD_CD44-HA and Link_TSG6-HA interaction networks may explain why the latter has higher affinity for HA.¹⁹,³⁹ Recent work has investigated whether the different modes of HA binding for CD44 and TSG-6 can be exploited, i.e. by making chemically modified HA oligosaccharides that differentially target one protein over the other.²⁰ It was found that addition of certain acidic chemical groups to the reducing termini of HA could enhance binding to Link_TSG6 (but not CD44) via the formation of a salt bridge with Arg81 in the former.

While the interaction networks mediating binding are completely different in CD44 and TSG-6, the conformations of the bound HA molecules are reasonably similar¹⁹,⁶⁷; in both cases, the HA wraps around 2 faces of the Link module (Figure 2C, D), albeit with a less pronounced kink on binding to CD44 compared to Link_TSG6. Interestingly, the conformation of HA captured by CD44³⁹ is similar to one of the main conformers of HA in solution (see⁷²; Charles D. Blundell, personal communication). This is consistent with the general finding that the bioactive conformations of ligands are often very similar to their most common conformers in solution, where this concept is being used to facilitate drug discovery.⁷³⁻⁷⁵ Importantly, the HA polysaccharide is highly conformationally dynamic,¹⁴,¹⁶,⁷⁶ such that, while different HA-binding proteins may be able to capture different HA conformations,⁵ these will likely be conformations that are frequently visited in solution (see²⁰).
LYVE-1's HA-binding domain

LYVE-1 is an HA receptor found on lymph vessel endothelial cells and on some macrophage subsets.⁶⁷,⁷⁷ It plays an important role, through its HA-binding activity, in the regulation of leucocyte trafficking, for example acting as a gatekeeper for immune cell entry into the lymphatic system,⁷⁸ and in macrophage function, e.g. with regard to the turnover of collagen.⁷⁹ In both of these contexts, LYVE-1 recognises HA present in the glycocalyx, e.g. surrounding dendritic and vascular smooth muscle cells, respectively.⁷⁸,⁷⁹ LYVE-1 on lymphatic endothelium is also thought to take up HA (of intermediate and large size) from tissues and act as a receptor for low molecular weight (LMW) HA-induced lymphangiogenesis.⁸⁰⁻⁸²
LYVE-1, like CD44, is an integral membrane protein having a Type B HABD that is comprised of a single Link module along with flanking N- and C-terminal sequences.⁶⁷ This extended Link module has three disulphide bonds in equivalent sequence positions to those in CD44 and has been modelled based on the CD44 structure. Truncation mutagenesis of the LYVE-1 extracellular domain indicates that the HABD of LYVE-1 may be somewhat smaller than that of CD44. Site-directed mutagenesis identified seven residues that mediate the interaction of LYVE-1 with HA (i.e. Arg37, Tyr87, Ile97, Arg99, Asn103, Lys105 and Lys108; numbered as in the preprotein), where all of these are within the Link module (see Figure 3), with the exception of Arg37, which is part of the N-terminal extension. Unlike CD44, the interaction of LYVE-1 with HA is highly salt-dependent, indicating that the interaction is mediated predominantly by ionic bonds. While an HA octasaccharide is the minimum length that can compete for high molecular weight HA binding (suggesting this is the size of sugar accommodated in the binding site), longer oligosaccharides are more effective competitors.⁶⁷ This is consistent with LYVE-1 existing as a disulphide-linked homodimer on the cell surface,⁸³ where the interaction with HA only occurs if the polysaccharide is appropriately organised⁸⁴; see further discussion below (in section "Regulation of HA-receptor interactions").
Crystal structures of the HABDs from human and mouse LYVE-1, and their HA-bound complexes, have been determined and should soon be published (David G. Jackson, personal communication). Given that no other high-resolution structures are available for Link module-containing HABDs (apart from CD44 and TSG-6; as described above), this new structural information will make a highly important contribution to the field.
Regulation of HA-receptor interactions

The association of HA with cell surface receptors, such as CD44 and LYVE-1, involves multiple (tens of) weak interactions.⁸⁴⁻⁸⁶ For example, this is important in the adhesion of leucocytes to endothelial surfaces, which are coated with an HA-rich glycocalyx,⁸⁷ underpinning numerous cell migration events during tissue homoeostasis and inflammation.⁷⁷,⁸⁸ CD44-HA interactions can mediate the rolling of leucocytes in situations where the shear forces are low (e.g. in liver sinusoids⁸⁸,⁸⁹); CD44 can also interact with E-selectin, and P-selectin glycoligand-1, to mediate slow- and fast-rolling, respectively, independent of HA (see⁸⁸). Interestingly, a chimeric CD44, where its Link module was replaced by the Link module of TSG-6, which binds HA more tightly (see¹⁹), no longer supported cell rolling on HA where, instead, cells became firmly adherent⁹⁰; i.e. illustrating the likely requirement of multivalent interactions of low affinity for cells to roll on HA under flow (see⁸⁸).
The CD44 gene can be alternatively spliced to generate transcripts with up to 10 variant exons incorporated.⁹¹,⁹² This leads to the formation of over twenty isoforms of CD44⁸⁸,⁹³ and, when all of these variant exons are spliced in, results in a mature protein of 721 residues (see Figure 1). Some isoforms (e.g. v3–10 and v3,8–10), despite having the HA-binding domain, do not interact with HA; this is likely related to the extent/nature of the N-linked glycosylation present in the CD44 and/or the presence of CS/HS modification on the alternatively spliced v3 exon (see⁹⁴,⁹⁵). The major isoform found on leucocytes, the so-called "standard" or "hematopoietic" form (CD44s), does not include any of these additional sequences (being only 341 residues in length) and will be the main focus here; this is composed of an extracellular domain of 248 amino acids, a 21-residue transmembrane domain and 72-residue intracytoplasmic domain (Figure 1).
Circulating immune cells, although expressing the CD44s isoform, do not constitutively bind to HA,⁹⁵,⁹⁶ but can be activated to do so in response to inflammatory signals,⁹⁷,⁹⁸ where this leads to de novo CD44 synthesis with an altered glycosylation pattern⁹⁹; removal of sialic acid moieties from cell-surface CD44,¹⁰⁰,¹⁰¹ or indeed mutation of particular N-linked attachment sites,¹⁰² can also lead to activation of HA binding. It has been suggested, based on the 3D structure of the HABD of CD44, that a sialic acid-containing glycan linked to Asn25 in the human protein could obstruct the HA binding site (i.e. via steric inhibition), whereas a glycan attached to Asn120 (on the opposite face) could inhibit receptor clustering.³⁸ Consistent with this, antibodies that crosslink CD44 molecules (e.g. IRAWB14) can also trigger binding on non-constitutively active cell backgrounds,⁹⁶,¹⁰³ whereas some non-crosslinking antibodies can inhibit constitutive binding,¹⁰⁴ indicating that receptor clustering plays an important role in the regulation of CD44's interaction with HA.³⁸,¹⁰⁵ The structural studies described above (see section: "The 3D structure of CD44's HA-binding domain") revealed that CD44 undergoes a conformational change that likely switches the receptor from a low- to a high-affinity HA-binding state.³⁹ While this change in conformation can be triggered by the interaction with HA,³⁸,³⁹ it seems likely that it will also be influenced by receptor clustering, where particular glycoforms are more readily activated.
Another way in which CD44 on "inducible" cell backgrounds can become activated is via crosslinking of HA chains.⁵²,⁵³,¹⁰⁶,¹⁰⁷ Preincubation of HA with full-length recombinant human TSG-6 (rhTSG-6), and to a lesser extent Link_TSG6, was found to enhance or induce the binding of HA, respectively, to cell backgrounds on which CD44 was constitutively or non-constitutively active¹⁰⁶; it was concluded that HA was being crosslinked by Link_TSG6/rhTSG-6, thereby promoting CD44 clustering. While Link_TSG6 was somewhat less potent than the full-length protein, it was clearly able to form a stable complex with HA that could enhance/induce binding to CD44⁺ cells. Based on biophysical experiments (described in section: "TSG-6-mediated crosslinking of HA"), which compared the effects of Link_TSG6 and rhTSG-6 in surface-anchored HA films, Link_TSG6 might not be expected to directly crosslink HA chains, i.e. given its 1:1 and non-cooperative mode of binding.⁵² However, the conditions under which Link_TSG6 was preincubated with HA, i.e. 50 mM Na-HEPES at pH 6.0 in Lesley et al. (2004),¹⁰⁶ could explain this, since in very low salt (5 mM Na⁺ ions, pH 7.4) Link_TSG6 has been seen to cause total collapse of an HA film, consistent with the occurrence of crosslinking.⁵² The mechanism underlying this Link_TSG6-mediated collapse of the HA network, which can lead to phase separation,⁴ has not been determined.
The TSG-6-mediated enhancement of the HA interaction with CD44-expressing cells occurred under shear forces equivalent to those observed in post-capillary venules,¹⁰⁶ indicating that if HA/TSG-6 complexes were formed on the vascular endothelium they could perhaps support rolling in situations where HA alone cannot. However, this mechanism is probably unlikely given that the serum proteoglycan IαI can inhibit (and partially reverse) TSG-6's crosslinking activity and also suppress its enhancement of HA binding to CD44⁺ cells.⁵³ That said, as described below (in "The molecular basis of HC-HA-mediated crosslinking"), the interaction of IαI with TSG-6 leads to the formation of a different kind of crosslinked HA (referred to as HC-HA or SHAP-HA) that also has been found to be pro-adhesive for CD44-expressing leucocytes in some,¹⁰⁷ but not other,⁵³ circumstances.
Initial studies revealed that while LYVE-1 overexpressed in certain cell backgrounds (e.g. in a fibroblast cell line) could bind HMW HA, the native LYVE-1 protein on lymph vessel endothelial cells (LECs) had little or no binding activity in vitro.⁸²,¹⁰⁸ However, LYVE-1 does bind to HA displayed on the surface of dendritic cells,⁷⁸ macrophages⁸⁴ and Group A streptococci¹⁰⁹ that use their interactions to gain entry to the lymphatics. The explanation for this discrepancy is that both LYVE-1 and HA are required to be correctly organised for a "stable" interaction to occur.⁸³,⁸⁴ In this regard, the HA needs to be multimerised (e.g. crosslinked), where "free" HA is not recognised by LYVE-1⁸⁴; this cell-surface receptor has to be present as a disulphide-linked dimer,⁸³ or be induced to self-associate (i.e. clustered), e.g. by divalent anti-LYVE-1 monoclonal antibodies.⁸⁴ Importantly, for the HA-LYVE-1 interaction to occur there needs to be a threshold density of receptor molecules on the cell surface,⁸⁴ something that has been described previously for CD44⁸⁶,¹⁰²; in the case of LYVE-1, binding can also occur when it is organised into discrete microclusters.⁸⁴ HA complexed with rhTSG-6 was able to bind to LYVE-1 on primary LECs (unlike "free" HA), as was biotinylated HA artificially crosslinked with streptavidin, whereas the enhancing effect of Link_TSG6 was 10-fold weaker⁸⁴; moreover, the HA/rhTSG-6 complexes induced clustering of LYVE-1 on the surface of the LECs.
LYVE-1 has been found to form disulphide-linked dimers via a cysteine residue (Cys201) within its membrane-proximal domain (see Figure 1), where these homodimers are the predominant configuration of the receptor on lymph vessel endothelial cells analysed in vitro.⁸³ Biophysical analysis of the recombinant dimer demonstrated that it binds to HA with ~15-fold higher affinity than monomeric LYVE-1 and with a much slower off rate (~70-fold). However, disruption of dimer formation (by mutagenesis of Cys201) abolished HA binding in primary LECs, where binding could not be recovered by increasing the level of surface expression or by "activating" antibodies. This indicates that there is a requirement for LYVE-1 to be present as a covalent dimer for it to be functionally active and that the divalent anti-LYVE-1 antibodies enhance binding through clustering of the LYVE-1 homodimers.⁸³,¹⁰⁹ The disulphide bond linking the monomers was found to be highly sensitive to reduction, thus providing a potential mechanism for regulation of LYVE-1's HA-binding activity via extracellular reducing agents (see⁸³). CD44 can also be redox-regulated, but in this case via reduction of the Cys77-Cys97 disulphide bond (equivalent to Cys81-Cys101 in the mouse protein), which when intact stabilises the HA-binding groove.¹¹⁰ However, CD44 can also exist as a covalent dimer, which has increased avidity for HA, via the formation of two inter-chain disulphide bonds involving cysteine residues, one within the transmembrane region (Cys286) and one within the intracytoplasmic domain (Cys295).¹¹¹ These cysteine residues can also become palmitoylated, which prevents dimer formation, e.g. in lipid rafts, modulating HA-binding and also the association of CD44 with the cytoskeleton (see¹¹²,¹¹³).
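The two fold-changes quoted above for the LYVE-1 dimer (~15-fold higher affinity, ~70-fold slower off rate) can be related through the standard expression K_D = k_off/k_on. The short sketch below does this arithmetic under the assumption that both monomer and dimer binding can be described by simple single-step kinetics, which is unlikely to hold strictly for a bivalent receptor; the function name is hypothetical and the result should be read only as an apparent on-rate.

```python
def apparent_on_rate_ratio(affinity_gain: float, off_rate_slowdown: float) -> float:
    """Return k_on(dimer) / k_on(monomer), assuming K_D = k_off / k_on for both.

    affinity_gain:      K_D(monomer) / K_D(dimer)
    off_rate_slowdown:  k_off(monomer) / k_off(dimer)
    """
    if affinity_gain <= 0 or off_rate_slowdown <= 0:
        raise ValueError("fold changes must be positive")
    return affinity_gain / off_rate_slowdown


if __name__ == "__main__":
    ratio = apparent_on_rate_ratio(affinity_gain=15.0, off_rate_slowdown=70.0)
    print(f"Apparent k_on(dimer)/k_on(monomer) = {ratio:.2f}")
    print(f"i.e. an apparent on rate about {1 / ratio:.1f}-fold slower for the dimer")
```

On this simplified reading, the gain in affinity arises entirely from the longer lifetime of the complex and is partly offset by a slower apparent association, which is at least compatible with the idea that the dimer requires appropriately organised HA in order to engage.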
A lowresolution solution structure of the disulphidelinked
LYVE1 extracellular region, determined by SmallAngle Xray Scat-
tering (SAXS), has informed the generation of a model for the
homodimer.
83
Here the LYVE1 is organised in an antiparallel 'open
scissors' arrangement, where the two HABDs are separated from one
another and would be expected to lie close to the cell surface; this is
consistent with the finding that HMW HA, and long oligosaccharides,
were much better competitive inhibitors than short oligomers (e.g.
22- and 8-mers, respectively).
67
An important consequence of this
particular dimeric organisation is that LYVE1 would not be able to
engage both of its HABDs with the same HA strand, without a tor-
tuous, and energetically unfavourable, conformational rearrangement
of the polysaccharide,
83
which likely explains its preferred binding to
HA structures where multiple chains are brought together (e.g.
through association with crosslinking proteins). Another potential
consequence is that depending on the exact organisation of HA
chains (e.g. in crosslinked HA/protein complexes) the LYVE1 dimer
may be able to adopt a range of conformations from fully open to fully
closed; this would affect the relative positioning of the in-
tracytoplasmic domains and could therefore mediate differential
signalling, i.e. depending on the structure of the bound HA.
The positioning of LYVE1 in microclusters in LECs, which are
located within 0.1-1.5 µm-sized actin 'corrals' (but not tethered
directly to actin), may also be an important determinant in the reg-
ulation of HA binding, by limiting LYVE1's lateral diffusion such that
it associates preferentially with supramolecular HA configurations
114
;
disruption of the cortical actin network increases the mobility of
LYVE1 on the cell surface and increases its HA-binding activity.
Somewhat similarly, CD44, when associated with bound HA, has
been likened to a picket fence, restricting the movement and func-
tional activity of other molecules (e.g. phagocytic receptors) on the
cell surface
115
; in this context, CD44 is attached to the actin cyto-
skeleton through the binding of its intracytoplasmic region to ezrin.
Remodelling of the actin/CD44/HA 'fence' allows clustering of Fcγ
receptors and the initiation of phagocytosis.
Superselectivity of HA-receptor interactions
As discussed above it has long been recognised that HA binding to
receptors is highly sensitive to their density on the cell surface, where
an interaction is only seen if the number of receptor molecules is
above a threshold level.
84,102
For non-cooperative, weak and multivalent
interactions (such as those with CD44 and LYVE1) this can be
explained by a concept taken from soft-matter physics that has
been termed 'superselectivity'.
86,116
For example, it has been found
that a twofold change in the number of cell surface CD44 molecules
leads to a sixteenfold increase in binding
86
; this is because CD44 has
a 'superselective quality' (denoted α) of 4, where the change in
density (D) raised to the power α ($D^{\alpha}$) provides the predicted
increase in binding avidity (here, $2^{4} = 16$). Based on available flow cytometry data,
84
LYVE1 has an α
value ranging from 3 to 5 depending on cell type (Ralf P. Richter,
personal communication), indicating that its interaction with HA may be
even more susceptible to changes in receptor density under certain
circumstances.
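The scaling described above can be illustrated with a minimal sketch. It assumes only the simple power-law relationship stated in the text (fold change in binding = fold change in density raised to the power α); the function name `predicted_fold_change` is hypothetical and introduced purely for illustration, and the α values are those quoted above rather than fitted parameters.

```python
def predicted_fold_change(density_fold_change: float, alpha: float) -> float:
    """Predicted fold change in binding for a given fold change in receptor
    density, assuming the simple power law D**alpha described in the text."""
    if density_fold_change <= 0:
        raise ValueError("density fold change must be positive")
    return density_fold_change ** alpha


# CD44: alpha = 4 (value quoted in the text), twofold change in density.
cd44 = predicted_fold_change(2.0, 4.0)

# LYVE1: alpha is quoted as ranging from 3 to 5 depending on cell type,
# so the same twofold change is evaluated across that range.
lyve1 = {alpha: predicted_fold_change(2.0, alpha) for alpha in (3, 4, 5)}

print(f"CD44, twofold density change: {cd44:g}-fold change in binding")
for alpha, fold in lyve1.items():
    print(f"LYVE1, alpha = {alpha}: {fold:g}-fold change in binding")
```

Under this assumption, the same twofold change in density spans an eightfold to thirty-twofold predicted change in binding across the quoted LYVE1 range, which is why a higher α implies greater sensitivity to receptor density.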
The concept of superselectivity explains how the size of HA, and
crosslinking of HA chains, affect receptor binding and why LMW HA
preparations are such effective competitors of the interaction
between receptors and high molecular weight (HMW) HA.
86,117,118
Short HA chains, including oligosaccharides that only engage with a
single receptor molecule (e.g. HA$_{8}$), bind transiently to the cell surface
because individual HA/receptor interactions are weak, and the short
chains cannot engage with a sufficiently large number of receptors to
stabilise cell surface engagement (see
85
). Long HA chains, on the
other hand, present numerous receptor binding sites and thus can
associate stably with the cell surface through multivalent binding. The
transition between these two regimes is relatively sharp, where the
sensitivity to changes in the number of interactions between an HA
chain and a particular receptor depends on the value of the super-
selective quality. Thus, an increase in the size of HA (which will lead
to a larger number of receptor binding sites), or indeed crosslinking
(that effectively increases the HA's apparent MW), shifts towards a
higher number of interactions and higher avidity.
118
Competition of
HMW HA by LMW HA species has a dramatic effect on the avidity
through reducing the total number of interactions that can be made
along the HA polymer chain. It should be noted that my (rather
simplistic) description above is firmly underpinned theoretically, and
well supported by the available experimental data, e.g. on the inter-
action of CD44 with HA. This includes the effect of glycosylation on
CD44-HA binding,
102
such that a modest change in $K_D$ of the
interaction for a particular glycoform, can switch HA binding on or off
(see
118
). Furthermore, the superselectivity of HA-receptor interactions
has profound implications for our understanding of the different
biological activities of HA molecules of different sizes; see further
discussion below (in section: Differential effects of high and low
molecular weight HA).
Type-C HA-binding proteins: the HAPLN and lectican families
Type-C HABDs are composed of two contiguous Link modules
1,3
and
are present in the four HAPLNs (link proteins) and four lectican proteins
(Figure 1). The HAPLNs, which share 45-52% overall sequence
identity,
119
have the same modular organisation as the G1 domains of
the lecticans, comprising an N-terminal immunoglobulin (IG) module
followed by the Link modules. The Link module was first identified in
cartilage link protein,
120
which despite its name is widely expressed
(see The Human Protein Atlas
121
), and is often now referred to as
HAPLN1. As well as binding to HA, HAPLN1 interacts with the G1
domains of the lecticans (e.g. aggrecan and versican
122
), which are CS
proteoglycans (CSPGs), stabilising their interaction with HA; for ex-
ample, aggrecan and HAPLN1 form huge multimolecular complexes
that have a critical role in the organisation and function of carti-
lage
4,123,124
and perineuronal nets
4,125
; see further discussion below.
Modelling of the HABD from HAPLN1
While a structure has not been determined for HAPLN1 (or any of
the other Type-C HABDs), as shown in Figure 2E, a model for the
tandem pair of Link modules for the human protein has been gen-
erated.
35
The HA-binding sites of the individual Link modules
(modelled on the basis of Link_TSG6 in its open, HA-bound,
conformation
37
) were aligned to generate a contiguous HA-binding
groove extending over the surface of the two Link modules
35
; this is
likely to be a more accurate representation of the Type-C HABD than
an earlier model,
126
which was based on a previous unliganded Link
module structure.
27
Hydrophobic residues at the Link module inter-
face (magenta on Figure 2E and Figure 3) were identified that are
highly conserved within the HAPLN and lectican families,
35
indicating
that this organisation is likely to be representative of Type-C HABDs
in general. The model of HAPLN1 indicates that nine sugar residues
of HA are the maximum number that could make contact with the
protein,
35
which is consistent with competition studies on the
recombinant human HAPLN1 protein.
122
The modelling
35
allowed
the prediction of the amino acids that are likely to be involved in
binding to HA; however, this has not been tested experimentally. These
include arginine/lysine residues at equivalent positions to basic
residues in Link_TSG6 (Figure 3), where these are likely to make ionic
interactions with HA, given that HAPLN1-HA binding is salt-strength
dependent.
127
The model of HAPLN1 also identified residues that
may mediate intermolecular interactions between the HAPLN
proteins and lecticans (cyan on Figure 2E and Figure 3), e.g. in the
context of ternary complexes with HA, where again these are highly
conserved across the two families.
35
The position of the cyan-coloured
residues in the model for HAPLN1 (Figure 2E) indicates that
lectican molecules could potentially bind on either side of the Type-C
HABD, allowing a repeated array to be formed (Figure 2F); see further
discussion in 'Macromolecular lectican/HA complexes' below.
Modelling of the V3 isoform of human versican
Versican, which is very widely expressed (The Human Protein Atlas),
is both the largest lectican and the largest member of the Link module
superfamily, with its V0 isoform (see Figure 1) being composed of
3376 residues in the mature human protein. Alternative splicing of
exons VII and VIII within the versican gene, which encode the α-GAG
and β-GAG CS-containing regions, respectively, generates four main
isoforms (reviewed in
128
). The V0 isoform has both regions, V1 has
only the β-GAG, V2 has only the α-GAG and V3 has neither GAG
attachment domain; there is also a V4 isoform, highly related to V3,
but containing a portion of exon VIII and with attachment of CS
chains.
129
Recombinant expression of the G1 domain from human
versican (VG1),
122,130
a region common to all of the variants, has
provided useful insights into the HA-binding properties of versican,
122
and has also been utilised as a probe for HA detection.
130
A model of the (human) V3 isoform with HA$_{10}$ bound to the two
Link modules that constitute its Type-C HABD has recently been
published
128
; this was informed, in part, by SAXS of VG1 in complex
with an HA decasaccharide. As is the case for HAPLN1 (described
above), up to nine sugar rings can be accommodated within versican's
HA-binding site, which extends over the two Link modules.
128
Versi-
can, like HAPLN1,
127
binds to HA in a salt-dependent manner (S.J.
Foulcer and A.J. Day, unpublished) indicating that ionic interactions
play a major role in HA binding. In contrast, aggrecan's interaction with
HA has been reported to be salt-strength independent,
131
suggesting
that aggrecan utilises a mode of binding that is more similar to CD44.
In the model of the V3 isoform,
128
the IG module of versican is
not positioned close to the bound HA, and is not required for binding,
i.e. based on findings from domain deletion studies
126
; this is
apparently different to aggrecan, which does require the IG module
for its HA-binding activity. Moreover, while aggrecan can interact
with HAPLN1 in the absence of HA, versican cannot (see
122
). No
equivalent modelling for the entire aggrecan G1 domain has been
carried out, so it is not feasible, at present, to rationalise these dif-
ferences at a molecular level. However, it is possible that rather than
the IG module in aggrecan playing a direct role in HA binding, it is
necessary to stabilise the structure of the Link modules.
122
Insights into aggrecan, brevican and neurocan
The expression of neurocan is mostly restricted to the brain and
central nervous system (The Human Protein Atlas). Studies with