Overview of the Matrisome – an Inventory of
Extracellular Matrix Constituents and Functions
Richard O. Hynes* and Alexandra Naba
Howard Hughes Medical Institute
Koch Institute for Integrative Cancer Research
Massachusetts Institute of Technology
Cambridge, MA 02139, USA
Running title: The Matrisome – an overview of ECM Constituents
Completion of genome sequences for many organisms allows a reasonably complete definition
of the complement of extracellular matrix (ECM) proteins. In mammals this “core matrisome”
comprises ~300 proteins. In addition there are large numbers of ECM-modifying enzymes,
ECM-binding growth factors and other ECM-associated proteins. These different categories of
ECM and ECM-associated proteins cooperate to assemble and remodel extracellular matrices
and bind to cells through ECM receptors. Together with receptors for ECM-bound growth
factors, they provide multiple inputs into cells to control survival, proliferation, differentiation,
shape, polarity and motility of cells. The evolution of ECM proteins was key in the transition to
multicellularity, the arrangement of cells into tissue layers and the elaboration of novel structures
during vertebrate evolution. This key role of ECM is reflected in the diversity of ECM proteins.
The modular domain structures of ECM proteins allow both their multiple interactions and,
during evolution, the development of novel protein architectures by exon shuffling.
The term extracellular matrix (ECM) means somewhat different things to different people (Hay,
1981, 1991; Mecham, 2011). Light and electron microscopy show that extracellular matrices are
widespread in metazoa, underlying and surrounding many cells, and comprising distinct
morphological arrangements. The initial biochemical studies on extracellular matrix concentrated
on large, structural extracellular matrices such as cartilage and bone. In the 1980s, the
availability of model systems such as the EHS sarcoma opened the way to biochemical analyses
of basement membranes and led to the discovery of the distinct group of ECM proteins that
make up basement membranes. Biochemistry of native ECM was, and still is, impeded by the
fact that the ECM is, by its very nature, insoluble and is frequently crosslinked. Furthermore,
ECM proteins tend to be large and early work was frequently on proteolytic fragments. The
application of molecular biology to studies of ECM proteins and their genes uncovered many
previously unknown ECM molecules and defined their structures. The protein chemistry and
molecular biology revealed that ECM proteins are typically made up of repeated domains, often
encoded in the genome as separate exonic units. The completion of the sequences of many
genomes now allows description of the entire list of proteins and, potentially, the definition of
the complete repertoire of ECM proteins, based on homologies with known ECM proteins.
Comparative analyses of the genomes of different organisms allow deductions about the
evolution of this repertoire, which we term the matrisome. Newer methods such as mass
spectrometry are also beginning to allow more detailed biochemical characterization of
extracellular matrices. In this article we will give an overview of the mammalian matrisome and
briefly discuss certain aspects of the evolution of the matrisome and of the ECM.
Definition of the Matrisome
In analyzing the structure and functions of extracellular matrices, one would like to have a
complete “parts list” – a list of all the proteins in any given matrix and a larger list of all the
proteins that can contribute to matrices in different situations (the “matrisome”). As mentioned,
the biochemistry of ECM is challenging because of the insolubility of most ECMs. However,
the availability of complete genome sequences coupled with our accumulated knowledge about
ECM proteins does now make it possible to come up with a reasonably complete list of ECM
proteins. ECM proteins typically contain repeats of a characteristic set of domains (LamG,
TSPN, FN3, VWA, Ig, EGF, collagen pro domains, etc.; see figures and Table 1). Many of these
domains are not unique to ECM proteins but their arrangements are highly characteristic. That
is, the architecture of ECM proteins is diagnostic – they are built from assemblies of many
ancient, and a few more recent, protein domains, each of which is typically encoded by one or a
few exons in the genome. ECM proteins represent one of the earliest recognized and most
elaborate examples of exon (domain) shuffling during evolution (Engel, 1996; Patthy, 1999;
Hohenester and Engel, 2002; Whittaker et al., 2006; Adams and Engel, 2010). This
characteristic of ECM proteins allows bioinformatic sweeps of the proteome encoded by any
given genome, using a list of 50 or so domains to identify a list of candidate ECM proteins.
Negative sweeps of that list using domains from other protein families (e.g., tyrosine kinases,
which share FN3 and Ig domains with ECM proteins) and screens for transmembrane domains
allow refinement of the list. A very few known ECM proteins do not have readily recognizable
domains (e.g., elastin, dermatopontin and some dentin matrix proteins) although, increasingly,
even those are now being incorporated into protein analysis sites such as SMART and InterPro,
allowing their routine capture in the sweeps. Using such methods plus manual annotation, we
have been able to define a robust list of the proteins defining the mammalian matrisome by
analysis of the human and mouse genomes (Naba et al., 2011). We call this list of “core” ECM
proteins the core matrisome. It comprises 1-1.5% of the mammalian proteome (without
considering the contribution of alternatively spliced isoforms, which are prevalent in transcripts of
matrisome genes). This list comprises almost 300 proteins, including 42 collagen subunits, three
dozen or so proteoglycans and around 200 glycoproteins.
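The sweep strategy described above (a positive sweep for diagnostic domains, a negative sweep for domains of other protein families, and a screen for transmembrane domains) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used by Naba et al. (2011): the domain sets are small hypothetical stand-ins for the list of 50 or so domains, the toy proteome and its annotations are invented for the example, and transmembrane proteins are simply excluded, which is a simplification of the refinement step.

```python
from typing import Dict, List, NamedTuple


class Protein(NamedTuple):
    domains: List[str]        # domain annotations, e.g. from SMART or InterPro
    has_transmembrane: bool   # result of a transmembrane-domain prediction


# Hypothetical, abbreviated domain lists (assumptions for illustration only).
ECM_DOMAINS = frozenset({"LamG", "TSPN", "FN3", "VWA", "Ig", "EGF"})
EXCLUDING_DOMAINS = frozenset({"TyrKc"})  # e.g. a tyrosine kinase catalytic domain


def sweep(proteome: Dict[str, Protein]) -> List[str]:
    """Return the sorted names of candidate ECM proteins."""
    candidates = []
    for name, protein in proteome.items():
        found = set(protein.domains)
        if not found & ECM_DOMAINS:
            continue  # positive sweep: needs at least one diagnostic domain
        if found & EXCLUDING_DOMAINS:
            continue  # negative sweep: belongs to another protein family
        if protein.has_transmembrane:
            continue  # transmembrane screen (simplified to a hard exclusion)
        candidates.append(name)
    return sorted(candidates)


# Invented toy annotations; none of these entries is a real protein record.
TOY_PROTEOME = {
    "toy_matrix_glycoprotein": Protein(["EGF", "FN3", "FN3", "LamG"], False),
    "toy_receptor_kinase": Protein(["Ig", "FN3", "TyrKc"], True),
    "toy_adhesion_receptor": Protein(["Ig", "Ig", "FN3"], True),
    "toy_cytosolic_enzyme": Protein(["Pkinase"], False),
}

print(sweep(TOY_PROTEOME))
```

In practice the candidate list produced by such a sweep would still require manual annotation, as noted above, because a few ECM proteins lack readily recognizable domains and some domain architectures are ambiguous.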
This core matrisome list does not include mucins, secreted C-type lectins, galectins, semaphorins
and plexins and certain other groups of proteins that plausibly do associate with the ECM but are
not commonly viewed as ECM proteins; lists of these “ECM-affiliated” proteins are given in
Naba et al. (2011). The core matrisome list also does not include ECM-modifying enzymes,
such as proteases, or enzymes involved in crosslinking, or growth factors and cytokines,
although these are well known to bind to ECMs (see below).
Two useful databases provide information on the expression and distribution of various ECM
proteins (http://www.matrixome.com/bm/Home/home/home.asp, The Matrixome Project,
maintained by Kiyotoshi Sekiguchi and http://www.proteinatlas.org/; Human Protein Atlas;
Ponten et al., 2008; Uhlen et al., 2010). A third database (MatrixDB, http://matrixdb.ibcp.fr/,
Chautard et al., 2009, 2010) collates information about interactions among ECM proteins.
Collagens are found in all metazoa and provide structural strength to all forms of extracellular
matrices, including the strong fibers of tendons, the organic matrices of bones and cartilages, the
laminar sheets of basement membranes, the viscous matrix of the vitreous humor and the
interstitial ECMs of the dermis and of capsules around organs. Collagens are typified by the
presence of repeats of the triplet Gly-X-Y, where X is frequently proline and Y is frequently 4-
hydroxyproline. This repeating structure forms stable, rod-like, trimeric, coiled coils, which can
be of varying lengths. A primordial collagen exon encoded 6 of these triplets (18 amino acids)
encoded in 54 base pairs and, during evolution, this original motif has been duplicated, modified
and incorporated into many genes (Figure 1A). Collagen subunits assemble as homotrimers or
as restricted sets of heterotrimers and, in general, collagen subunits are very restricted in the
partnerships they can form, although occasional promiscuity has been noted (for more detail see
Ricard-Blum, 2011; Yurchenco, 2011).
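The Gly-X-Y repeat lends itself to a simple sequence scan. The sketch below, a hypothetical helper rather than an established tool, counts the longest uninterrupted run of triplets that begin with glycine in a one-letter amino acid sequence, and restates the arithmetic of the primordial exon (6 triplets, 18 amino acids, 54 base pairs). Because hydroxylation of proline occurs after translation, the Y position appears as ordinary proline (P) in a translated sequence. The function name and the test sequences are assumptions made for illustration.

```python
TRIPLETS_PER_PRIMORDIAL_EXON = 6
AMINO_ACIDS_PER_EXON = TRIPLETS_PER_PRIMORDIAL_EXON * 3   # 18 amino acids
BASE_PAIRS_PER_EXON = AMINO_ACIDS_PER_EXON * 3            # 54 base pairs


def longest_gxy_run(sequence: str) -> int:
    """Return the number of triplets in the longest uninterrupted Gly-X-Y run.

    Every start position (and hence every reading frame) is considered:
    run[i] is the number of consecutive G-x-y triplets starting at index i.
    """
    seq = sequence.upper()
    n = len(seq)
    run = [0] * (n + 3)
    best = 0
    # Work backwards so that run[i + 3] is already known when run[i] is set.
    for i in range(n - 3, -1, -1):
        if seq[i] == "G":
            run[i] = 1 + run[i + 3]
            best = max(best, run[i])
    return best


# A made-up peptide equivalent to one primordial exon: six Gly-Pro-Pro triplets.
exon_like = "GPP" * TRIPLETS_PER_PRIMORDIAL_EXON
print(len(exon_like), longest_gxy_run(exon_like))

# A single inserted residue interrupts the repeat, as in non-fibrillar collagens.
print(longest_gxy_run("GPPGPPAGPP"))
```

A scan of this kind only flags candidate collagenous segments; whether a protein is viewed as a collagen or as collagen-like depends on the additional criteria discussed in the following paragraphs.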
Some of these genes are viewed as collagens, sensu stricto, whereas others that contain only
short collagen segments are often referred to as “collagen-like” or “collagen-related.” The
distinction is to some extent arbitrary since many proteins viewed as “true” collagens also
contain significant portions made up of other domains. The original type I collagen of bones and
tendons consists almost entirely of a long (~1000 amino acids) and rigid uninterrupted collagen
triple helix (plus terminal non-collagenous pro-domains that are removed during biosynthetic
processing of the protein; Figure 1A). The rod-like trimers assemble into higher-order oligomers
and fibrils and become crosslinked by various enzymatic and non-enzymatic reactions conferring
considerable structural strength. Several other collagens with similar fibrillar structure are found
in various tissues. Many other collagen types have interruptions in the Gly-X-Y repeating
structure, introducing flexibility into the molecules. All collagen genes also encode additional
non-collagenous domains, some of which are the characteristic collagen N- and C- pro-domains,
whereas others are domains shared with other ECM proteins and retained in the mature proteins
(Figure 1B,C). These additional protein domains confer specific binding affinities, allowing
collagen molecules to interact with each other and with other proteins to assemble the various
structures. The diversity of collagen structures, genes and assemblies is discussed by Ricard-
Blum (2011) and the assembly of type IV collagen into the laminar structure of basement
membranes is reviewed by Yurchenco (2011). Other reviews of the collagen family cover
additional aspects (Eyre and Wu, 2005; Robins, 2007; Gordon and Hahn, 2009).
Among the collagen-like or collagen-related proteins (see table in Ricard-Blum, 2011), a few are
membrane proteins; others, such as complement component C1q and related proteins, are
secreted but their main functions do not involve ECM and they are not considered as part of the
ECM or matrisome; yet others, such as the collagen-like domain of acetylcholinesterase, serve to
anchor other proteins into the ECM and some, such as EMIDs, are true ECM proteins. It is
worth keeping in mind the possibility that the presence of collagen-like domains could act to
bind some of these non-ECM proteins to the ECM, at least part of the time; in that sense they are
Proteoglycans are interspersed among the collagen fibrils in different ECMs. Rather than
providing structural strength, they confer additional properties. Proteoglycans are glycoproteins
with attached glycosaminoglycans (GAGs; repeating polymers of disaccharides with carboxyl
and sulfate groups appended). The addition of GAGs confers on proteoglycans a high negative
charge, leading them to be extended in conformation and able to sequester both water and
divalent cations such as calcium. These properties confer space-filling and lubrication functions.
GAGs, especially heparan sulfates, also bind many growth factors and other secreted factors into the ECM
(see Sarrazin et al., 2011 for more details).
There are around three dozen extracellular matrix proteoglycans encoded in mammalian
genomes; they fall into several families (Table 1; see also Iozzo and Murdoch, 1996). The two
largest are those based on LRR repeats (Merline et al., 2009; Schaefer and Schaefer, 2010) and
those containing LINK and C-type lectin domains (hyalectans). Many of the LRR proteoglycans
bind to various collagens and to growth factors and the hyalectan family members bind to
various ECM glycoproteins such as tenascins and, through the LINK domain, to hyaluronic acid.
These binding functions contribute to regulation of protein complexes in the ECM.
In addition, there are around a dozen proteoglycans that do not fall into these two families (e.g.,
lubricin/PRG4, endocan/ESM1, serglycin, and three testicans related to SPARC/osteonectin, see
Table 1). Perhaps the most significant of all is perlecan (HSPG2), a multi-domain protein that is
a core proteoglycan of all basement membranes (see Table 1 and below). There are also many
examples of proteins falling into other categories (e.g., some collagens, agrin, betaglycan, CD44,
other glycoproteins) that are sometimes or always modified by attachment of GAGs, which could
lead one to consider them also as proteoglycans. The boundary between proteoglycans and
glycoproteins is thus somewhat a matter of definition. The consensus view is to consider as
proteoglycans those that have a significant fraction of their total mass made up by GAGs.
There are also two small families of integral membrane proteoglycans, glypicans (Filmus et al.,
2008) and syndecans (Couchman, 2010; Xian et al., 2010), both of which bear heparan sulfate
side chains, as does CD44, and there are a few additional transmembrane chondroitin sulfate
proteoglycans. Further details of the structure and functions of various heparan sulfate
proteoglycans are discussed by Bishop et al. (2007) and Sarrazin et al. (2011).
In addition to the collagens and proteoglycans that provide strength and space-filling functions
(among others), there are around 200 complex glycoproteins in the mammalian matrisome (see
Table 2 and Naba et al., 2011). These confer myriad functions including interactions allowing
ECM assembly, domains and motifs promoting cell adhesion and also signaling into cells and
other domains that bind growth factors. The bound growth factors can serve as reservoirs that
can be released (e.g., by proteolysis) or can be presented as solid-phase ligands by the ECM
proteins (Hynes, 2009).
The best studied ECM glycoproteins are the laminins (11 genes; 5α, 3β, 3γ) and fibronectins (1
gene encoding multiple splice isoforms). These are reviewed in detail by Aumailley et al.
(2005) and Yurchenco (2011) and by Schwarzbauer and DeSimone (2011), respectively. Also
well studied are the thrombospondins and tenascins, reviewed by Bentley and Adams (2010) and
Adams and Lawler (2011) and by Chiquet-Ehrismann and Tucker (2011), respectively. The
structures of these glycoproteins are well known and exemplify the typical multiple repeating
domain structure and extended multimeric forms of ECM proteins (Figure 2). The same is true
for fibulins (de Vega et al., 2009) and nidogens (Ho et al., 2008; Yurchenco, 2011) and many
others. Two subgroups of ECM glycoproteins have been studied particularly in the context of
the nervous system (netrins, slits, reelin, agrin, SCO-spondin – see article by Barros et al., 2011
and Figure 3) and the hemostatic system (von Willebrand factor, vitronectin and fibrinogen – a
facultative ECM protein; Bergmeier et al., 2008). These two biological systems also involve
roles for more widely distributed ECM proteins such as thrombospondins, fibronectins, laminins,
collagens, proteoglycans, etc. Similarly, the matrices of other tissues typically contain both
ubiquitous and tissue-restricted ECM glycoproteins. Another group of ECM glycoproteins that
has been studied in the context of disease and the regulation of TGFβ functions includes the
fibrillins and LTBPs (Ramirez and Dietz, 2009; Ramirez and Rifkin, 2009; see article by Munger
and Sheppard, 2011).
However, as can be seen in Table 2, there are multiple other ECM glycoproteins about which
much less (in some cases, almost nothing) is known. These include some enormous glycoproteins
with impressive arrays of domains, such as SCO-spondin (59 domains of 7 types) and
hemicentin-1, also known as fibulin-6 (61 domains of 6 types) and many that are affected in
disease (Aszódi et al., 2006; Nelson and Bissell, 2006; Bateman et al., 2009). It will be of
considerable interest to learn the distributions and functions of this diverse set of ECM
glycoproteins and we can expect that the approaches that have been effective for the better-
studied proteins will provide many insights into the roles of those less well known and novel.
ECM-Bound Growth and Secreted Factors
As mentioned above and elsewhere (Hynes, 2009; Ramirez and Rifkin, 2009; Rozario and
DeSimone, 2010), many growth factors bind to ECM proteins and must be considered also as
constituents of extracellular matrices. One popular idea is that growth and other secreted factors
bind to GAGs, especially heparan sulfates. While this is undoubtedly true, there are clear
examples of growth factors binding to specific domains of ECM proteins. Fibronectin binds
specifically to a variety of growth factors (VEGF, HGF, PDGF, etc.; Rahman et al., 2005;
Wijelath et al., 2006; Lin et al., 2010) and the VWC/chordin and follistatin domains found in
many ECM proteins (see Figures 1-3) are known to bind BMPs (Wang et al., 2008; Banyai et al.,
2010). TGFβ binds specifically to TB domains in LTBPs, which bind in turn to fibrillins and to
fibronectin-rich matrices (Ramirez and Rifkin, 2009; Munger and Sheppard, 2011). These
ECM-TGFβ interactions have significant consequences for genetic diseases; mutations in
fibrillins affect the regulation of TGFβ function in Marfan’s syndrome and in other diseases
(Ramirez and Dietz, 2009).
It seems virtually certain that the known examples of growth factor binding to ECM, including
directly to ECM proteins, presage many more such cases and this aspect of ECM function is in
great need of further investigation. The ECM can act as a reservoir or sink of such factors and
there are many examples of this for chemokines and for many of the most important
developmental signals (e.g., VEGFs, Wnts, Hhs, BMPs and FGFs). Such factors form gradients
that control pattern formation during developmental processes and it is clear that some of those
gradients are markedly affected by ECM binding (Yan and Lin, 2009). Indeed, it seems probable
that many more gradients incorporate ECM binding as part of their regulation. Investigation of
this concept will be greatly aided by our current fairly complete inventory of ECM proteins and
their constituent domains.
Modifiers of ECM Structure and Function
Another aspect of ECM function is that ECM proteins and the fibrils into which they assemble
are often significantly modified subsequently. Collagens have long been known to become
crosslinked by disulfide bonding, transglutaminase crosslinking and through the action of lysyl
oxidases and hydroxylases (Eyre and Wu, 2005; Robins, 2007; Ricard-Blum, 2011). Laminins
and other basement membrane proteins also become crosslinked by disulfide bonding (see
Yurchenco, 2011 for further details) and the same is true of fibronectin, which also undergoes
further processing to a state characterized by insolubility in deoxycholate (DOC; Choi and
Hynes, 1979; Schwarzbauer and DeSimone, 2011). The exact basis for this insolubility is not
known but fibronectin and other ECM proteins are also substrates for transglutaminase 2, which
undoubtedly contributes to the insolubility of ECM (Lorand and Graham, 2003; Iisma et al.,
Proteolytic enzymes also modify the ECM – indeed procollagen propeptidases are necessary to
process collagens so that they can polymerize. Collagens and other ECM proteins are also
substrates for matrix metalloproteases (MMPs; Page-McCaw et al., 2007; Cawston and Young,
2010), ADAMs (Murphy, 2008) and ADAMTS proteases (Porter et al., 2005; Apte, 2009) and
many other proteolytic enzymes (elastases, cathepsins, various serine esterase proteases, etc.) can
also act on many ECM proteins (see article by Lu et al., 2011). These various proteolytic
processes play roles in ECM turnover and are thought to release ECM-bound growth factors and
also to expose cryptic activities in the ECM (Mott and Werb, 2004; Ricard-Blum, 2011),
including the release of antiangiogenic inhibitors (Nyberg et al., 2005; Bix and Iozzo, 2005;
Hynes, 2007). Similarly, enzymes that degrade GAGs, such as heparanases and sulfatases, can
also alter the properties of ECM proteoglycans (see articles by Sarrazin et al., 2011; Lu et al.,
2011). The remodeling of ECM by these various processes has major effects on development
and pathology (Daley et al., 2008; Kessenbrock et al., 2010; Lu et al., 2011). Lists of these ECM
modifying enzymes can be found in the reviews cited and in Naba et al. (2011).
Cellular Receptors for Extracellular Matrix
In order for the ECM to affect cellular functions, it is obvious that there must be receptors for
ECM proteins. The major receptors are the integrin family, comprising 24 αβ heterodimers
(Figure 4). These have been extensively reviewed elsewhere and specific aspects are covered in
other articles in this volume (Campbell and Humphries, 2011; Geiger and Yamada, 2011;
Huttenlocher and Horwitz, 2011; Schwartz, 2010; Watt and Fujimura, 2011; Wickstrom et al.,
2011). Another receptor for ECM proteins is dystroglycan, which binds to laminin, agrin and
perlecan in basement membranes as well as to the transmembrane neurexins (Barresi and
Campbell, 2006). Each of these dystroglycan ligands contains LamG domains, which bind to
dystroglycan in a glycosylation-dependent manner (see Figure 3), probably by binding
carbohydrate side chains on dystroglycan. Mutations in dystroglycan or its associated proteins in
the membrane or the cytoskeleton (or in laminin) can all produce various forms of muscular
dystrophy, because of the loss of the transmembrane connection to the basement membrane
surrounding the muscle cells. Other cellular receptors for ECM include GPVI on platelets and
the DDR (discoidin domain) tyrosine kinase receptors, all of which are receptors for collagens
(Leitinger and Hohenester, 2007), the GPIb/V/IX complex which forms a receptor for von
Willebrand factor on platelets (Bergmeier et al., 2008) and CD44 which binds to hyaluronan and
is expressed on many cells. As noted in Figure 3, Slits bind to Robo receptors of the Ig
superfamily and netrins bind to Unc5-related tyrosine-kinase receptors or to DCC, an Ig
superfamily receptor, while agrin binds to the MuSK tyrosine kinase receptor. Thus, although
integrins comprise the dominant class of ECM receptors and are present on most cells, numerous
other receptors for ECM proteins are expressed on specific cell types.
In addition to binding extracellular ligands, these ECM receptors provide transmembrane links to
the cytoskeleton and to signal transduction pathways. The cytoplasmic domains of ECM
receptors assemble large and dynamic complexes of proteins, which regulate cytoskeletal
assembly and activate many signaling cascades within cells (Geiger and Yamada, 2011). In the
case of integrins, these submembranous complexes also regulate the extracellular affinity of the
receptors (so-called “inside-out” signaling) and the same may be true of other classes of ECM
receptors. It has become clear that the signaling functions of ECM adhesion receptors are at
least as complicated as those of canonical growth factor receptors and that engagement of ECM
receptors provides signals regulating cellular survival, proliferation and differentiation as well as
adhesive and physical connections involved in cell shape, organization, polarity and motility.
Evolution of the Matrisome and the Extracellular Matrix
The ~300 proteins that make up the core matrisome in mammals are a mixture of very ancient
proteins and some much newer ones (Figure 5). Comparative analyses of the genomes of
different taxa have revealed that some ECM proteins are shared by almost all metazoa, even
simple organisms such as sponges, coelenterates and cnidaria (Huxley-Jones et al., 2007; Ozbek
et al., 2010). Most notable are the proteins that make up the core of basement membranes: type
IV collagens (2 subunits), laminin (4 genes: 2α, 1β and 1γ), nidogen and perlecan (1 gene each;
see Yurchenco, 2011). We call this set of genes the basement membrane toolkit and it is found
in all protostome and deuterostome genomes and must therefore have been present in the
common ancestor of all bilateria (Hynes and Zhao, 2000; Whittaker et al., 2006). Many, but not
all, of these genes are found also in more primitive metazoan organisms such as cnidaria and
sponges (Chapman et al., 2010; Putnam et al., 2007; Srivastava et al., 2010). It is plausible to
argue that the evolution of multilayered organisms with their different cell layers separated by
basement membranes was dependent on this basement membrane toolkit that has been
maintained ever since. Fibrillar collagens are also found in early metazoa, including Hydra and
sponges. Interestingly, another collagen, the paralog of collagens XV and XVIII, is also ancient,
being found in both protostomes and deuterostomes, although the key functions of this class of
collagens are not fully understood. Most other collagens are later evolutionary developments,
for example the cuticular collagens of C. elegans (Hutter et al., 2000) and the complex collagens
with VWA and FN3 domains (see Figure 1C and Ricard-Blum, 2011) found in vertebrates. Also
found in all bilateria are the neuronal guidance ECM proteins, netrins, slits and agrin (Figure 3).
One characteristic feature of the evolution of ECM proteins, as for other genes, is an increase in
numbers of homologous genes as one ascends the tree of life (Figure 5). Thus, mammals have
six type IV collagen genes (see Ricard-Blum, 2011), 2 nidogen genes and 11 laminin genes (see
Yurchenco, 2011) that have arisen by gene duplications and subsequent divergence without
altering the basic structures of the proteins. This diversification accompanies the diversification
of basement membranes in vertebrates. Similar evolution by duplication and diversification
from a primordial gene shared by all bilateria is seen in the case of thrombospondins (see Adams
and Lawler, 2011), although in this case the diversification has involved more extensive
evolution of the domain architecture than is the case for the basement membrane toolkit. This
suggests that thrombospondins have evolved to fulfill a more diverse set of functions, whereas
basement membranes have retained many of their basic structure-function requirements during
the more than half a billion years of their evolution.
Other ECM proteins, in contrast, are more recent developments. Two clear examples are
tenascins and fibronectins (Tucker and Chiquet-Ehrismann, 2009; Chiquet-Ehrismann and
Tucker, 2011). Both are restricted to chordates, as are many of the more complex collagen
genes. A tenascin gene is found in all the chordate genomes that have been sequenced and
vertebrates have expanded the tenascin family. Tenascins represent a novel architectural
assembly of preexisting domains (EGF and FN3, see Figure 2). In contrast, fibronectin contains
domains that do not appear until quite late in evolution; while FN3 domains are ancient, being
found in cell surface receptors in all metazoa, FN1 and FN2 domains are restricted to chordates.
The earliest fibronectin-like gene so far reported (although lacking the precise, characteristic
domain organization of vertebrate fibronectin) appears in urochordates (ascidians, sea squirts)
while vertebrates all have the canonical structure found in mammals (Hynes, 1990;
Schwarzbauer and DeSimone, 2011; see Figure 2); once assembled, this gene appears to have
been strongly selected (it is essential for life) and has remained unchanged. Reelin, a protein that
controls aspects of brain development in mammals, also appears to be a deuterostome-specific
gene (Whittaker et al., 2006), using one old domain (EGF) and two new ones (Reeler, BNR).
Analyses of proteoglycans reveal a similar story. While perlecan is ancient (as are the
transmembrane proteoglycans, syndecan and glypican), proteoglycans containing the LINK
domain are confined to deuterostomes, indeed largely to vertebrates (there are two genes
containing that domain in sea urchins, Whittaker et al., 2006).
In general, it seems clear that the fraction of the proteome that is ECM proteins has expanded
disproportionately during the evolution of the deuterostome lineage, both by duplication and
divergence of existing genes and by the appearance of novel gene architectures and even some
new domains. It is interesting to speculate on the reasons for this. One obvious explanation is
the development of cartilage, bones and teeth in vertebrates and that undoubtedly accounts for
some of the elaboration of novel collagens, proteoglycans and ECM glycoproteins. However,
proteins such as tenascins, fibronectin and reelin (as well as other neural ECM proteins) have no
obvious strong connections to the development of structural ECMs and it is tempting to
hypothesize that their emergence was more closely tied to the emergence of novel structures such
as the neural crest, endothelial-lined vasculature and more complex nervous systems. Consonant
with this model of key roles for ECM proteins in evolution, the matrisome is one of the most
plastic and rapidly evolving compartments of the proteome.
We now have a reasonably complete inventory of ECM proteins and their associated modifiers.
Some ECM proteins have been well studied and we have a good picture of their basic functions –
other ECM proteins are virtually unstudied. Even in the case of the well-studied proteins, many
of the constituent domains, all of which are well conserved and must, therefore, have important
functions, still lack assigned functions. Presumably many of them, like those that we do
understand, serve to bind other proteins in ways that contribute to ECM assembly, binding and
presentation of growth factors and interactions with cells to influence their behavior. There is
now a pressing need to describe the changes in ECM composition in development and pathology,
to understand better the interactions of individual domains and to probe the cooperation of these
multi-protein assemblies in modulating the functions of cells and tissues. The techniques for
such analyses (biophysical, imaging, etc.) continue to advance and there is every prospect that
studies of ECM structure and function will yield important insights into the roles played by
this vital component of metazoan organization, and genetic analyses and studies of human
disease are revealing the biological relevance of individual ECM proteins and of specific
We would like to thank Charlie Whittaker and Sebastian Hoersch for their assistance and
collaboration in the bioinformatic mining of genomes during our development of the ECM
inventory discussed here. The work in our laboratory was supported by the National Cancer
Institute and the Howard Hughes Medical Institute.
Adams J, Engel J. 2007. Bioinformatic Analysis of Adhesion Proteins. In Adhesion Protein
Protocol (ed. AS Coutts), Vol. 370 of Methods in Molecular Biology, pp. 147-172, Humana
Press, New York.
Adams JC and Lawler J. 2011. The Thrombospondins. In Extracellular Matrix Biology (eds.
RO Hynes and KM Yamada). Cold Spring Harbor Perspectives in Biology. (Ms in press)
Apte SS. 2009. A disintegrin-like and metalloprotease (reprolysin-type) with
thrombospondin type 1 motif (ADAMTS) superfamily: functions and mechanisms. J.
Biol. Chem 284: 31493-31497.
Aszódi A, Legate KR, Nakchbandi I, and Fässler R. 2006. What Mouse Mutants Teach Us
About Extracellular Matrix Function. Annu. Rev. Cell Dev. Biol. 22: 591-621.
Aumailley M, Bruckner-Tuderman L, Carter WG, Deutzmann R, Edgar D, Ekblom P, Engel J,
Engvall E, Hohenester E, Jones JCR, et al. 2005. A simplified laminin nomenclature.
Matrix Biol 24: 326-332.
Bányai L, Sonderegger P, and Patthy L. 2010. Agrin Binds BMP2, BMP4 and TGFβ1 ed. B.
Kobe. PLoS ONE 5: e10758.
Barresi R and Campbell K. 2006. Dystroglycan: from biosynthesis to pathogenesis of human
disease. Journal of Cell Science 119: 199-207.
Barros CS, Franco SJ, and Muller U. 2011. Extracellular Matrix: Functions in the Nervous
System. In Extracellular Matrix Biology (eds. RO Hynes and KM Yamada). Cold Spring
Harbor Perspectives in Biology 3: a005108. doi/10.1101/cshperspect.a005108
Bateman JF, Boot-Handford RP, and Lamandé SR. 2009. Genetic diseases of connective
tissues: cellular and extracellular effects of ECM mutations. Nat Rev Genet 10: 173-183.
Bentley AA and Adams JC. 2010. The evolution of thrombospondins and their
ligand-binding activities. Mol. Biol. Evol 27: 2187-2197.
Bergmeier W, Chauhan AK, and Wagner DD. 2008. Glycoprotein Ibalpha and von
Willebrand factor in primary platelet adhesion and thrombus formation: lessons from
mutant mice. Thromb. Haemost 99: 264-270.
Bishop JR, Schuksz M, and Esko JD. 2007. Heparan sulphate proteoglycans fine-tune
mammalian physiology. Nature 446: 1030-1037.
Bix G and Iozzo RV. 2005. Matrix revolutions: “tails” of basement-membrane components
with angiostatic functions. Trends Cell Biol 15: 52-60.
Campbell ID and Humphries MJ. 2011. Integrin Structure, Activation, and Interactions. In
Extracellular Matrix Biology (eds. RO Hynes and KM Yamada). Cold Spring Harbor
Perspectives in Biology 3: a004994. doi/10.1101/cshperspect.a004994.
Cawston TE and Young DA. 2010. Proteinases involved in matrix turnover during cartilage
and bone breakdown. Cell Tissue Res 339: 221-235.
Chapman JA, Kirkness EF, Simakov O, Hampson SE, Mitros T, Weinmaier T, Rattei T,
Balasubramanian PG, Borman J, Busam D, et al. 2010. The dynamic genome of Hydra.
Nature 464: 592-596.
Chautard E, Ballut L, Thierry-Mieg N, and Ricard-Blum S. 2009. MatrixDB, a database
focused on extracellular protein-protein and protein-carbohydrate interactions.
Bioinformatics 25: 690-691.
Chautard E, Fatoux-Ardore M, Ballut L, Thierry-Mieg N, and Ricard-Blum S. 2010. MatrixDB,
the extracellular matrix interaction database. Nucleic Acids Research 39: D235-D240.
Chen C-C and Lau LF. 2009. Functions and mechanisms of action of CCN matricellular
proteins. Int. J. Biochem. Cell Biol 41: 771-783.
Chiquet-Ehrismann R and Tucker RP. 2011. Tenascins and the Importance of Adhesion
Modulation. In Extracellular Matrix Biology (eds. RO Hynes and KM Yamada). Cold
Spring Harbor Perspectives in Biology 3: a004960. doi/10.1101/cshperspect.a004960.
Choi MG, and Hynes RO. 1979. Biosynthesis and processing of fibronectin in NIL.8 hamster
cells. J. Biol. Chem 254: 12050-12055.
Couchman JR. 2010. Transmembrane Signaling Proteoglycans. Annu. Rev. Cell Dev. Biol. 26: