Review

Proteoglycan form and function: A comprehensive nomenclature of proteoglycans

Renato V. Iozzo 1 and Liliana Schaefer 2

1 - Department of Pathology, Anatomy and Cell Biology and the Cancer Cell Biology and Signaling Program, Kimmel Cancer Center, Sidney Kimmel Medical College at Thomas Jefferson University, Philadelphia, PA 19107, USA
2 - Pharmazentrum Frankfurt/ZAFES, Institut für Allgemeine Pharmakologie und Toxikologie, Klinikum der Goethe-Universität Frankfurt am Main, Frankfurt am Main, Germany

Correspondence to Renato V. Iozzo and Liliana Schaefer: [email protected]; [email protected]
http://dx.doi.org/10.1016/j.matbio.2015.02.003
Edited by R. Sanderson

Abstract

We provide a comprehensive classification of the proteoglycan gene families and respective protein cores. This updated nomenclature is based on three criteria: cellular and subcellular location, overall gene/protein homology, and the utilization of specific protein modules within their respective protein cores. These three signatures were utilized to design four major classes of proteoglycans with distinct forms and functions: the intracellular, cell-surface, pericellular and extracellular proteoglycans. The proposed nomenclature encompasses forty-three distinct proteoglycan-encoding genes and many alternatively-spliced variants. The biological functions of these four proteoglycan families are critically assessed in development, cancer and angiogenesis, and in various acquired and genetic diseases where their expression is aberrant.

© 2015 Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Please cite this article as: Iozzo Renato V., Schaefer Liliana, Proteoglycan form and function: A comprehensive nomenclature of proteoglycans, Matrix Biol (2015), http://dx.doi.org/10.1016/j.matbio.2015.02.003

Introduction

It has been nearly 20 years since the original publication of a comprehensive classification of proteoglycan gene families [1]. For the most part, these classes have been widely accepted. However, a broad and current taxonomy of the various proteoglycan gene families and their products is not available. In contrast to the classification of glycosaminoglycans (GAGs), primarily based on the chemical structure of their repeating disaccharide units, classifying proteoglycans is a much more complex task [2]. We propose a comprehensive and simplified nomenclature of proteoglycans based on three criteria including: cellular and subcellular location, overall gene/protein homology, and the presence of specific protein modules within their respective protein cores. Whereas the first two attributes have been utilized in the past for various nomenclatures, the third attribute is of more recent development and represents a sort of intrinsic signature for various protein cores. Indeed, modular design is based on the simple concept that protein cores are made up of finite units, like pieces of Lego. The units represent a minimum level of organization and a module can be thought of as a functional domain that affects cell–matrix dynamics. Another key feature is that each module/functional unit can be stable and can fold on its own, without being part of the large precursor protein. Thus, a module is a self-contained component. An example of this is the LG3 domain of endorepellin, the C-terminal globular-like domain of perlecan, which has recently been crystallized [3]. Below, we will critically assess the field of proteoglycans, which now encompass forty-three distinct genes and a much higher number of proteoglycans due to alternative splicing, thereby providing a very rich and biologically-active group of molecules. As hyaluronan and the enzymes involved in the synthesis and degradation of various GAGs are not covered in this review, readers are referred to recent reviews covering these closely-related subjects [4–18].
General features

Four major proteoglycan classes encompass nearly all the known proteoglycans of the mammalian genome (Fig. 1). Observing the types of proteoglycans based on cellular and subcellular localization, we can see that there is only one intracellular proteoglycan, serglycin. This unique proteoglycan forms a class on its own as it is the only proteoglycan that carries heparin side chains. Serglycin is packaged in the granules of mast cells and serves as biological glue for most of the intracellular proteases stored within the granules [19]. Another general observation is that heparan sulfate proteoglycans (HSPGs) are prevalently associated with the cell surface or the pericellular matrix. The HSPGs are intimately associated with the plasma membranes of cells, either directly via an intercalated protein core or via a glycosyl-phosphatidyl-inositol (GPI) anchor, and function as major biological modifiers of growth factors such as FGF, VEGF and PDGF among others. Similar functions are also performed by the HSPGs located in the basement membrane zone, in addition to their ability to interact with each other and with key constituents of the basement membrane, including various laminins, collagen type IV, and nidogen. Presentation of growth factors to their cognate receptors in a biologically-favorable form is a major function of cell surface and pericellular HSPGs. Another key role is participating in the generation and long-range maintenance of gradients for morphogens during embryogenesis and regenerative processes.

Fig. 1. A comprehensive classification of proteoglycans. The four families are based on their cellular and subcellular location, homology at the protein and genomic levels and the presence of unique protein modules which are often shared by members of a given class. The key for the various modules is provided in the bottom panel. For additional details about structure and function, please consult the text.

As we move away from the cells in a centrifugal manner, chondroitin- and dermatan sulfate-containing proteoglycans (CSPGs and DSPGs, respectively) predominate. These proteoglycans function as structural constituents of complex matrices such as cartilage, brain, intervertebral discs, tendons and corneas. Thus, among other functions, they provide viscoelastic properties, retain water and keep osmotic pressure, dictate proper collagen organization and are the main molecules responsible for corneal transparency. The extracellular matrix also contains the largest class of proteoglycans, the so-called small leucine-rich proteoglycans (SLRPs), which are the most abundant products in terms of gene number. These SLRPs can function both as structural constituents and as signaling molecules, especially when tissues are remodeled during cancer, diabetes, inflammation and atherosclerosis. SLRPs interact with several receptor tyrosine kinases (RTKs) and Toll-like receptors, thereby regulating fundamental processes including migration, proliferation, innate immunity, apoptosis, autophagy and angiogenesis. Below we will discuss the rationale for grouping certain proteoglycans in the same class and their overall biological function.

Intracellular proteoglycans

It is quite amazing that since the original cloning of serglycin, the first proteoglycan-encoding gene to be sequenced, no other true intracellular proteoglycan has been discovered. Serglycin occupies a class of its own insofar as it is the only proteoglycan that is covalently substituted with heparin due to its consecutive (and quite unique) Ser-Gly repeats, essentially a silk-like sequence. Serglycin has been utilized primarily by mast cells for the proper assembly and packaging of the numerous proteases that are released upon inflammation [19]. The defects in the formation of mast cell granules observed in Srgn−/− mice are remarkably similar to those observed in mast cells derived from mice lacking N-deacetylase/N-sulfotransferase 2, a key enzyme involved in the sulfation of heparin [19]. Thus, serglycin promotes granular storage via electrostatic interaction between its highly-anionic heparin chains and basic residues within the various proteases of the secretory granules. It is becoming evident, however, that all inflammatory cells express serglycin and store it within intracytoplasmic granules where, in addition to proteases, serglycin binds and modulates the bioactivity of several inflammatory mediators, chemokines, cytokines and growth factors [20].

More recently, serglycin has been found in a wide variety of non-immune cells such as endothelial cells, chondrocytes and smooth muscle cells [21]. Cell-surface serglycin promotes adhesion of myeloma cells to collagen I and affects the expression of MMPs [22]. These findings have been corroborated by in vivo studies where serglycin knockdown attenuates multiple myeloma growth in immunocompromised mice [23]. It has been proposed that some of these effects are mediated by a specific interaction between serglycin and cell-surface CD44 [23], a known receptor for hyaluronan [24,25]. It has been recently shown that serglycin is a key component of the cell inflammatory response in activated primary human endothelial cells as both LPS and IL-1 increase its synthesis and secretion [26]. Notably, serglycin can be substituted with chondroitin sulfate (CS), and in several circulating cells serglycin contains lower-sulfated CS-4 chains [21]. In contrast, several hematopoietic cells (mucosal mast cells, macrophages etc.) express serglycin with highly sulfated CS-E. Although the significance of this phenomenon is not fully appreciated, it is likely that these isoforms of serglycin have different functions in a cell-context specific manner. Serglycin is a marker of immature myeloid cells and interacts with many bioactive components including histamine, TNF-α and proteases [27]. In general, serglycin expression correlates with a more aggressive malignant phenotype and it has been recently proposed that serglycin protects breast cancer cells from complement attack, thereby supporting cancer cell survival and progression [28].
Cell surface proteoglycans

In this class, there are thirteen genes, seven encoding transmembrane proteoglycans and six encoding GPI-anchored proteoglycans. With the exception of two gene products, NG2 and phosphacan, all contain heparan sulfate side chains.

Fig. 2. Schematic representation of the cell surface proteoglycans, which comprise transmembrane type I (the N-terminus is outside of the plasma membrane) proteoglycans (four syndecans, CSPG4/NG2, betaglycan and phosphacan) and six GPI-anchored proteoglycans, glypicans 1–6. The type of GAG chain and the major protease sensitive sites are indicated. The key for the various modules is provided in the bottom panel.

Syndecans

The name syndecan was coined by the late Merton Bernfield [29] to define a class of transmembrane proteoglycans that would connect (from the Greek syndein, to bind together) the surface of the cells to the underlying extracellular matrix. The syndecan family now comprises four distinct genes encoding single-pass transmembrane protein cores which include an ectodomain, a transmembrane region and an intracellular domain [4,30] (Fig. 2). The ectodomains exhibit the lowest amount of amino acid sequence conservation, no more than 10–20%, in contrast to the transmembrane and cytoplasmic domains which are 60–70% identical. A recent study has shown that the ectodomain of syndecans is natively disordered and this characteristic allows syndecans to interact with a variety of proteins and ligands, thereby providing enrichment in their biological function [31]. The ectodomain contains the GAG attachment sites, which are often covalently linked to HS and sometimes to CS, making syndecans hybrid proteoglycans. Several cell types shed syndecan into the pericellular environment through the action of MMPs. For example, it has recently been shown that shed syndecan-2 retards angiogenesis by inhibiting endothelial cell migration [32], a key step in neovascularization [33]. The transmembrane domain contains a dimerization motif (GxxxG) that mediates both homo-dimerization and hetero-dimerization [30]. The intracellular domain is composed of two regions of conserved amino acid sequence (C1 and C2), separated by a central variable sequence of amino acids that is distinct for each family member (V) [34].
Notably, the C-terminus of all four syndecans harbors a unique signature (EFYA) that binds PDZ-containing proteins. Generally, PDZ-containing proteins contribute to the proper anchoring of transmembrane proteins to the cytoskeleton, thereby holding together large signaling complexes.

Syndecans are involved in a wide variety of biological functions, too vast to be covered here but reviewed recently elsewhere [5,30,34]. Briefly, syndecans bind numerous growth factors, especially through their HS chains, and dictate morphogen gradients during development. In concert with other cell-surface HSPGs, syndecans can act as endocytosis receptors and are also involved in the uptake of exosomes [35]. Syndecans play key roles as co-receptors for many RTKs and can also function as receptors for atherogenic lipoproteins [36]. Indeed, there is strong genetic evidence that syndecan-1 is the main HSPG mediating clearance of triglyceride-rich lipoproteins derived from either the liver or intestinal absorption [37].

Many, if not all, of the syndecans can also act as soluble HSPGs via partial proteolysis of their juxtamembrane region, which releases their whole ectodomains. This shedding is considered a powerful post-translational modification that can regulate the amount of HSPG linked to the cell surface and that present in the pericellular microenvironment [30]. Several inflammatory cytokines can induce syndecan shedding by triggering outside-in signaling and by activating several metalloproteinases. In the case of hepatocytes, shedding of syndecan-1 occurs via PKC-dependent activation of ADAM17, and this impairs VLDL catabolism and promotes hypertriglyceridemia [38]. Importantly, soluble syndecan-1 promotes the growth of myeloma tumors in vivo [39], and this process, i.e. the shedding of syndecan-1, is enhanced by heparanase [40], thereby offering a novel mechanism for promoting cancer growth and metastasis [41,42]. Notably, chemotherapy stimulates syndecan-1 shedding, a potential drawback of the treatment that could favor tumor progression [43]. The biological interplay between heparanase-evoked shedding of syndecan-1 and myeloma cells leads to enhanced angiogenesis [44], further supporting cancer growth. As mentioned above, however, shed syndecan-2 inhibits angiogenesis via a paracrine interaction with the protein tyrosine phosphatase receptor CD148, which in turn deactivates β1-containing integrins [32], presumably α1β1 and α2β1, two main angiogenesis receptors. In contrast, the zebrafish ortholog of syndecan-2 is required for angiogenic sprouting during development [45].

An emerging new role for syndecan-1 is linked to its ability to reach the nuclei in a variety of cells. Initial observations showed that myeloma and mesothelioma cells contain syndecan-1 in their nuclei [46,47] and this nuclear translocation is also regulated by heparanase [46], indicating that there must be a cellular receptor for shed syndecan-1 that could mediate its nuclear targeting and transport. In support of these studies are previous observations that exogenous HS can translocate to the nuclei and modulate the activity of DNA topoisomerase I [48] and histone acetyl transferase (HAT) [49]. N-terminal acetylation of histones by HAT is linked to transcriptional activation, and this process is finely tuned by its counteracting enzyme, histone deacetylase (HDAC). Heparanase-evoked loss of nuclear syndecan-1 causes an increase in HAT enzymatic activity and enhances transcription of pro-tumorigenic genes [50]. Syndecan-1 that is shed from myeloma tumor cells is taken up by bone marrow stromal cells and is transported to the nuclei by a mechanism that requires its HS chains, as this process is inhibited by heparin and chlorate [51]. Once nuclear, soluble syndecan-1 binds to HAT p300 and inhibits its activity, thereby providing a new mechanism for tumor–host cell interaction and cross-talk [52].
CSPG4/NG2

The melanoma-associated chondroitin sulfate proteoglycan (MCSP) was discovered over 30 years ago as a transmembrane proteoglycan and a highly immunogenic tumor antigen of melanoma tumor cells. This proteoglycan has been subsequently detected in various species, with many names designating the same gene product. The rat ortholog of MCSP is called nerve/glial antigen 2 (NG2) [53], while the term CSPG4 designates the human gene. We will use the CSPG4/NG2 terminology with the understanding that some of the functional properties have not been fully described in both the human and rat species [54]. CSPG4/NG2 is a single-pass, type I transmembrane proteoglycan carrying one chondroitin sulfate chain, and harboring a large ectodomain composed of three subdomains (Fig. 2). The N-terminal domain (D1 subdomain) contains two laminin-like globular (LG) repeats. It is likely that the LG domains, as in other proteoglycans (e.g. perlecan and agrin, see below), mediate ligand binding, cell–matrix and cell–cell interactions, as well as interaction with integrins and receptor tyrosine kinases (RTKs). The central subdomain D2 contains 15 tandem repeats of a new module called CSPG [54]. The CSPG repeat is a cadherin-like and tumor-relevant module which is predicted to be involved in cell–matrix interaction, further modulated by the CS chain covalently attached to this module. Indeed, CSPG modules bind to collagens V and VI, FGF and PDGF. The juxtamembrane subdomain D3 contains a carbohydrate modification able to bind integrins and galectin, as well as numerous protease cleavage sites. Accordingly, the intact ectodomain and fragments thereof can be detected in sera from healthy individuals and from patients with melanoma [54]. The transmembrane domain of CSPG4/NG2 is quite interesting insofar as it has a unique Cys residue, generally not found in transmembrane regions. The intracellular domain harbors a proximal region with numerous Thr phospho-acceptor sites for PKC and ERK1/2, and a distal region encompassing a PDZ-binding module similar to that of the syndecan family. The latter can bind to the PDZ domain of several scaffold proteins involved in intracellular signaling, including syntenin, MUPP1 and GRIP1.

Functionally, CSPG4/NG2 proteoglycan promotes tumor vascularization [55] and, because of its predominant perivascular localization, CSPG4/NG2 may modulate the availability of FGF at the cell surface as well as the bioactivity and signal transduction of FGF receptors [56]. This CSPG binds to collagen VI in the tumor microenvironment and promotes cell survival and adhesion via the PI3K pathway [57]. Indeed, targeting CSPG4/NG2 in two animal models of highly-malignant brain tumors reduces tumor growth and angiogenesis [58]. Moreover, a combinatorial treatment using activated natural killer cells and a monoclonal antibody toward CSPG4/NG2 is capable of eradicating glioblastoma xenografts more efficiently than single therapies [59].

It has recently been discovered that NG2 controls the directional migration of oligodendrocyte precursor cells by constitutively stimulating RhoA GTPases [60]. Based on the ability of NG2 to regulate adhesion, RhoA GTPase and growth factor activities, it is likely that this transmembrane proteoglycan plays a key role in regulating cell polarity in response to extracellular cues [61].

Perdido/Kon-tiki, the Drosophila ortholog of mammalian CSPG4, genetically interacts with integrins during Drosophila embryogenesis, and its loss is embryonic lethal [62]. RNAi-mediated suppression of Perdido/Kon-tiki in the muscles, just before adult myogenesis starts, induces misorientation and detachment of Drosophila adult abdominal muscle, generating a phenotype similar to the embryonic lethal one [63]. Thus, it is possible that, based on its high conservation across species, mammalian CSPG4 could also play a role in myogenesis and muscle function.

A recent study has added another function to CSPG4 by involving this cell surface proteoglycan in the pathogenesis of severe pseudomembranous colitis. CSPG4 acts as a receptor for Clostridium difficile toxin B, one of the key toxins secreted by this gram-positive and spore-forming anaerobic bacillus [64]. The interaction occurs between the N-terminus of CSPG4 and the C-terminus of toxin B. This discovery, if confirmed in future studies, could provide new therapeutic targets for the treatment of this severe and often lethal form of enterocolitis.
Betaglycan/TGF-β type III receptor

In 1991, two back-to-back papers reported on the isolation and cloning of a membrane-anchored proteoglycan with high affinity for TGF-β, and thus named betaglycan [65,66]. Betaglycan, also known as TGF-β type III receptor (TGFBR3), is a single-pass transmembrane proteoglycan that belongs to the TGF-β superfamily of co-receptors (Fig. 2). The extracellular domain contains several potential GAG attachment sites and protease-sensitive sequences near the plasma membrane. The short intracellular domain is highly enriched in Ser/Thr (>40%) and some of these residues are candidate sites for PKC-mediated phosphorylation [65]. The amino acid sequence of betaglycan is highly similar to that of endoglin, a close member of the same superfamily.

The membrane-proximal ectodomain of betaglycan contains a unique module called zona pellucida (ZP)-C [67]. The ZP module is a structural element typically found in the ectodomain of eukaryotic proteins, composed of a Cys-rich bipartite structure joined by a linker. Generally, proteins harboring ZP modules tend to polymerize and assemble into long fibrils of specialized extracellular matrices [67]. In the case of betaglycan and endoglin these ZP modules are not utilized for polymerization; rather, they function as membrane co-receptors for the TGF-β superfamily members [68]. The intracellular domain contains a PDZ-binding element similar to that observed in the syndecan family of proteoglycans (Fig. 1).

Betaglycan is a ubiquitously-expressed cell surface proteoglycan that acts as a co-receptor for members of the TGF-β superfamily of Cys-knot growth factors, which also include activins, inhibins, GDFs and BMPs [69,70]. For example, betaglycan enhances the binding of all the TGF-β isoforms to the signaling TGF-β complex [71] and is needed for TGF-β2 high-affinity interaction with the receptor complex. Betaglycan also blocks the aggressiveness of ovarian granulosa cell tumors by suppressing NF-κB-evoked MMP2 expression [72]. Betaglycan, together with the TGF-β-binding SLRPs decorin and biglycan (see below), can be cleaved by granzyme B, thereby releasing an active form of TGF-β [73]. Ectodomain shedding of betaglycan is indeed necessary for betaglycan-mediated suppression of TGF-β signaling and breast cancer migration and invasion [74]. The ability of betaglycan to affect epithelial–mesenchymal transformation [70], together with genetic evidence of embryonic lethality in Tgfbr3−/− mice, suggests that betaglycan may play a unique and non-redundant function during development.

Another important feature of betaglycan is its ability to modulate the subcellular topology of the signaling receptor complex via its PDZ-binding domain, which interacts with PDZ-containing proteins such as β-arrestin [75]. This interaction, as well as that between the betaglycan intracellular domain and GIPC, would stabilize betaglycan at the cell surface and potentiate its bioactivity. Finally, betaglycan is involved in regulating many functions including reproduction and fetal growth [75], and is a putative tumor suppressor in many forms of cancer [76]. Several additional betaglycan-evoked activities have been recently reviewed elsewhere [75].
Phosphacan/receptor-type protein tyrosine phosphatase β

Phosphacan, originally isolated from rat brain, is a CSPG that interacts with neurons and neural cell-adhesion molecules (N-CAM) and corresponds to the soluble ectodomain of a receptor-type protein tyrosine phosphatase β (RPTPβ) [77]. The phosphacan gene (PTPRZ1) encodes a single-pass type I membrane protein with a relatively large ectodomain harboring an N-terminal module homologous to the alpha-carbonic anhydrase (Fig. 2). Distal to this, there is a fibronectin type III domain. The ectodomain contains six Ser-Gly repeats, at least four of which are flanked by acidic residues, suggesting potential glycanation sites. Sporadically, phosphacan can also be substituted with keratan sulfate chains. Notably, alternative splice variants encoding different protein isoforms have been described but their full-length nature has not yet been established.
Functionally, the ectodomain of phosphacan mediates cell–cell adhesion by homophilic binding. In addition, phosphacan's ability to bind N-CAM and tenascin in a calcium-dependent manner suggests that RPTPs may also modulate cellular interactions via heterophilic mechanisms [77]. Indeed, phosphacan blocks the growth-promoting ability of N-CAM, axonin-1/TAG-1 and tenascin, and is crucial in the oriented movement of post-mitotic cells during cortical development of the brain [78]. Moreover, phosphacan binds contactin, another member of the Ig superfamily like N-CAM, and the extracellular portion of the voltage-gated sodium channel [79]. The latter interaction appears to be mediated by the carbonic anhydrase-like module of phosphacan's ectodomain. It has been proposed that phosphacan, as an integral extracellular matrix constituent of the neural stem cell compartment, would contribute to the privileged microenvironment that supports self-renewal and maintenance of the neural stem cell niche [80].

Glypicans/GPI-anchored proteoglycans

Glypicans (GPC) are HSPGs that are bound to the plasma membrane via a C-terminal lipid moiety known as a glycosylphosphatidylinositol (GPI) linkage or anchor (Fig. 2). There are six independent genes in the mammalian genome, which can be subdivided into two broad classes, GPC1/2/4/6 and GPC3/5, with orthologs present across Metazoa including Dally and Dlp in Drosophila melanogaster [81]. Although most of the protein core is unique to this family, there is a stretch of amino acids in the ectodomain of the protein core with similarity to the Cys-rich domain of Frizzled proteins. There are two unique features in the structural organization of all glypicans, with potentially important functional implications.

First, and in contrast to syndecans, the attachment of the GAG chains (mostly HS chains) is located near the juxtamembrane region. This allows the three linear HS chains to span a great deal of plasma membrane surface, thereby presenting various morphogens and growth factors in an active configuration to their cognate receptors. Indeed, glypicans bind to and modulate the activity of Hedgehog (Hh), Wnt, and FGFs [82–84]. More recently, it has been shown that glypican-3 binds to Frizzled, thereby acting directly in the modulation of canonical Wnt signaling [85].

Second, glypicans are dually processed by proteases and lipases. In the former case, the ectodomain of glypicans is processed via endoproteolytic cleavage by a furin-like convertase. This processing generates two subunits that are then bound via disulfide bonds, in a way similar to the Met receptor. In the latter case, the entire glypican proteoglycan is released from the plasma membrane via an extracellular lipase (Notum) that cleaves the GPI anchor. Drosophila studies have shown that the Notum-mediated release of glypican can regulate morphogen gradients, including Wnt, BMP and Hh gradients [84].

Notably, the anchorless GPC-1, devoid of the GPI anchor, is a stable α-helical protein that resists high concentrations of urea and guanidine HCl [86]. Unfolding data are consistent with a two-state model, suggesting that the GPC-1 protein core is a densely-packed globular protein. In agreement with these data, the crystal structure of the Drosophila glypican Dally-like protein has revealed an extended α-helical fold [87]. The crystal structure of human GPC-1 is very similar to that of Drosophila Dally-like, and consists of a stable α-helical domain with 14 conserved Cys residues, followed by a GAG attachment site that is exclusively substituted with HS chains [88]. Of interest, removal of the α-helical domain leads to substitution with CS chains instead of HS chains, indicating that there is a message embedded in the α-helical domain that drives a different post-translational modification [88].

Functionally, glypicans have been implicated in the control of tumor growth and angiogenesis. For example, glypican-3 has been implicated in cancer and growth control. Mutations in human GPC3 cause the rare X-linked Simpson–Golabi–Behmel (SGB) syndrome, characterized by both pre- and post-natal overgrowth, abnormal craniofacial features, cardiovascular anomalies, renal dysplasia and urinary tract malformations [84]. Originally, it was hypothesized that GPC3 was an inhibitor of IGF-II, given the prominent function of IGF-II in developmental growth. However, it was later found that the levels of IGF-II do not change in Gpc3−/− mice, nor does GPC3 interact with IGF-II. It appears that GPC3 is an inhibitor of Hh signaling, insofar as the Hh-dependent signaling activity is elevated in Gpc3−/− mice. Moreover, purified glypican-3 binds with high affinity to Indian and Sonic Hh and competes with Patched for Hh binding [83,89]. A recent study has shown that processing by convertases is required for GPC3-evoked suppression of Hh signaling, and this process is dependent on the HS chains and their degree of sulfation [90]. Thus, the glypican family is not only complex in nature, but is also under the control of various modifying enzymes (proteases and lipases) that modulate its biological activity. We are positive that many surprises will emerge in the future regarding unsuspected biological functions of various glypicans.

Pericellular and basement membrane zone proteoglycans

This group of four proteoglycans is closely associated with the surface of many cell types, anchored via integrins or other receptors, but they can also be a part of most basement membranes. Pericellular proteoglycans are mostly HSPGs and include perlecan and agrin, which share homology especially at their C-termini, and collagens XVIII and XV, which share homology at their N- and C-terminal noncollagenous domains (Fig. 1).
Perlecan
Perlecan is a modular HSPG encoded by a large gene [91,92] with a
complex promoter [93–95]. The ~500-kDa protein core is composed of 5
domains with homology to SEA, N-CAM, IgG, LDL receptor and laminin
[96,97] (Fig. 3). The terminal LG3 domain has been crystallized and
reveals a jellyroll fold characteristic of other LG modules [3].
Perlecan is expressed by both vascular and avascular tissues
[97–101], and is ubiquitously located at the apical cell surface
[102,103] and basement membranes [98,104–106]. Perlecan regulates
various biological processes primarily because of its widespread
distribution [101,105], its ability to interact with various ligands
and RTKs [107] and, as reported more recently, the potential
utilization of perlecan splice variants in mast cells [108].
Perlecan is an early responsive gene and is induced by TGFβ [109]
and repressed by interferon γ [95]. The heparan sulfate chains of
perlecan and the protein core can be cleaved by heparanase and
various proteases [110–112], respectively, releasing various
pro-angiogenic factors [113].

Fig. 3. Schematic representation of the pericellular proteoglycans,
which comprise perlecan, agrin, and collagens XVIII and XV. The
collagenous (COL) and non-collagenous (NC) domains of collagen XVIII
are numbered on the top and bottom of the lower schematics. For
brevity only the structure of collagen XVIII is shown. The key for
the various modules is provided in the bottom panel.

Perlecan is involved in modulating cell adhesion [114,115], lipid
metabolism [116], thrombosis and cell death [117,118], biomechanics
of blood vessels and cartilage [119–121], skin and endochondral
bone formation [122,123], and osteophyte formation [124]. Perlecan
binds and modulates the activity of several growth factors and
morphogens [106,125–129] and its expression is often deregulated in
several types of cancer [130–134]. In Drosophila, perlecan, known
as Trol (for terribly reduced optic lobes), regulates Fgf and Hh
signaling to activate neural stem cells [135,136]. In addition,
Trol is essential for the architecture and maintenance of the lymph
gland and for the proliferation of blood progenitor cells [137].
Loss of Trol is associated with premature differentiation of
hemocytes and this phenotype can be rescued by ectopic expression
of Hh [137]. In mice, Hspg2 controls neurogenesis in the developing
telencephalon [138]. Moreover, perlecan can act as a lipoprotein
receptor and mediate its endocytosis and catabolism [116].
Specifically, domain II of perlecan has been shown to bind low
density lipoproteins and this interaction is mediated by the
O-linked oligosaccharides [139], suggesting an important role for
perlecan in atherogenesis and lipid retention.

Perlecan is a complex regulator of vascular biology and tumor
angiogenesis [33,140,141] by performing a dual function: via the
N-terminal HS chains, perlecan is pro-angiogenic [96] by binding
and presenting VEGFA and various FGFs to their cognate receptors
[33,141–152]. Moreover, heparanase-mediated cleavage of basement
membrane perlecan releases FGF10 and enhances salivary gland
branching morphogenesis [153]. Indeed, ablating Hspg2 or preventing
Hspg2 expression in early embryogenesis causes severe
cardiovascular defects [154–157]. The critical role for the
N-terminal HS chains of perlecan has been elegantly demonstrated by
the generation of mice harboring a genomic deletion of exon 3,
which encodes the SGDs responsible for the covalent attachment of
HS chains; these mice are designated Hspg2Δ3/Δ3 [158]. These
mutant mice have impaired angiogenesis, delayed healing after
experimental wounding and suppression of tumor growth [159]. When
challenged with flow cessation of the carotid artery, the
Hspg2Δ3/Δ3 mice show an enhanced intimal hyperplasia and smooth
muscle cell proliferation [160,161]. Moreover, during mouse
hind-limb ischemia, the HS chains of perlecan are key regulators of
the angiogenic response [162]. Collectively, these studies reaffirm
the role of HS perlecan in modulating pro-angiogenic factors such as
FGF2, VEGFA and PDGF.

More recently other functions of perlecan have
been discovered. Using a lethality-rescued Hspg2−/− mouse in which
perlecan was reintroduced into the cartilage, it was found that
perlecan deficiency leads to significant depression of endothelial
nitric oxide synthase [163]. This leads to endothelial cell
dysfunction, as shown by attenuated endothelial relaxation, likely
as a consequence of reduced endothelial nitric oxide synthase
expression. This is another example of how a secreted HSPG affects
the biology of vascular endothelial cells, likely through a
receptor-mediated signaling pathway. Another recently unveiled
function of perlecan is its ability to bind the clustering molecule
gliomedin [164]. In this case, perlecan binds dystroglycan at nodes
of Ranvier, which are required for fast conduction and accumulation
of Na+ channels. Perlecan seems to enhance clustering of nodes of
Ranvier components via a specific interaction with gliomedin.
Thus, perlecan may have specific roles in the biology and
pathophysiology of peripheral nodes [164].

In contrast to the pro-angiogenic N-terminal domain I, the
C-terminal processed form of perlecan domain V, named endorepellin
[165], has a nearly opposite function: it inhibits endothelial cell
migration, capillary morphogenesis, and in vivo angiogenesis
[166–169]. A global proteomic analysis of human serum has
identified endorepellin as a major circulating protein [170].
Moreover, endorepellin has been detected in extracts of fetal
cartilage, exclusively in the hypertrophic zone, and it was
speculated that processing of perlecan protein core in the growth
plate could play a role in inhibiting blood vessel invasion or
formation in cartilage [171]. Elevated endorepellin/LG3 peptides
were found in the plasma proteome of patients with refractory
cytopenia with multilineage dysplasia [172], and in the urine of
end-stage renal failure patients [173]. These LG3 fragments had
N-terminal residues (i.e., cleaved by BMP-1) identical to those
reported by us [174]. Similar LG3 fragments are elevated in the
urine of patients with chronic allograft nephropathy [175,176], in
the amniotic fluid of pregnant women [177] with a marked increase in
women with premature rupture of fetal membranes [178,179] and those
carrying trisomy 21 fetuses [180]. Recently, LG3 peptides have been
proposed to represent a potential marker of physical activity [181].
Endorepellin fragments have also been detected in the urine of
children with sleep apnea [182], in the media conditioned by
apoptotic endothelial cells [118,183,184], and in the secretome of
pancreatic and colon carcinoma cells [174,185–188]. Endorepellin can
be pro-angiogenic in brain infarcts due to the lack of
anti-angiogenic α2β1 integrin and the presence of the pro-angiogenic
α5β1 integrin receptor for endorepellin in brain microvascular
endothelial cells [189]. In this context, LG3 can be released by
oxygen-glucose deprivation and can be neuroprotective [190,191].
Finally, circulating LG3 levels are reduced in patients with breast
cancer, suggesting that reduced LG3 titers might be a useful
biomarker for cancer progression and invasion [192].

Mast cells produce shorter forms of perlecan
including functional endorepellin, suggesting a
potential role of endorepellin in inflammation and tissue repair
[193]. Moreover, MMP-7 processing of perlecan in the prostate cancer
stroma acts as a molecular switch to favor cancer invasion [112].
Thus, processed forms of perlecan protein core harboring domains
III and IV can function as protumorigenic factors. Endorepellin
binds to the α2β1 integrin receptor [140,166,194], and tumor
xenografts generated in α2β1−/− mice are insensitive to systemic
delivery of endorepellin [168]. Endorepellin triggers the
activation of the tyrosine phosphatase SHP-1 which, in turn,
dephosphorylates and inactivates various RTKs including VEGFR2
[195]. Soluble endorepellin alters the proteomic profile of human
endothelial cells [196], and exerts a dual receptor antagonism by
concurrently targeting VEGFR2 and the α2β1 integrin [197]. Notably,
the proximal LG1/2 domains bind the Ig3–5 domain of VEGFR2 while the
terminal LG3 domain, released by BMP-1/Tolloid-like
metalloproteinases [174], binds the α2β1 integrin [198]. This dual
signaling causes: (a) disassembly of actin filaments and focal
adhesions, via the α2β1 integrin, leading to suppression of
endothelial cell migration [198,199], and (b) activation of SHP-1,
which dephosphorylates Tyr1175, a key residue in the cytoplasmic
tail of VEGFR2, with consequent transcriptional inhibition of VEGFA
[200].

More recently, we have discovered that endorepellin induces
autophagy in endothelial cells via VEGFR2 signaling [201], similar
to decorin (see below). This novel function could contribute to the
angiostatic properties of this interesting fragment of perlecan
protein core.
Agrin
The second pericellular/basement membrane HSPG is agrin. A
C-terminal portion of agrin lacking HS chains was first isolated
from the Torpedo electric organ as an agent responsible for
acetylcholine receptor (AChR) clustering, hence the eponym agrin,
from the Greek ageirein, meaning to assemble [202]. The majority of
the research on agrin in mammals has focused on agrin's
contribution to the control of the postsynaptic apparatus in the
neuromuscular junction. However, after many years of research, it
was serendipitously discovered that agrin was indeed an HSPG
interacting with N-CAM in the avian brain [203]. Subsequently,
orthologs of agrin have been cloned from multiple species and are
all highly homologous.

Agrin has a multimodular structural organization that is
homologous to that of perlecan with potential generation of several
splice isoforms. The N-terminal region can be spliced to generate
either a Type II transmembrane form (TM) of agrin, highly expressed
in nervous tissue, or an isoform associated with most basement
membranes that contains the N-terminal-agrin (NtA) domain (Fig. 3).
In the central nervous system, TM agrin is highly expressed by axons
and dendrites; thus, neurite-associated TM agrin could potentially
function as receptor or co-receptor for neurite function. The NtA
domain has high affinity for the laminin γ1 chain's coiled-coil
domain, thereby functioning as a link between the cell surface and
the basement membrane. Following the N-terminal domain is a stretch
of nine follistatin-like (FS) repeats, also known as Kazal-type
protease inhibitor domains [204]. The last two repeats are separated
by an insertion of two laminin EGF-like (LE) domains. Notably,
overexpression of TM agrin in non-neuronal cells induces
filopodia-like processes similar to those induced in CNS neurites,
and this bioactivity was localized to FS repeat seven [205]. Thus FS
modules can modulate an important biological activity of neurons by
affecting the reorganization of the actin cytoskeleton during active
neurite growth.

Following the FS repeats, there are two Ser/Thr (S/T)-rich domains
which can be alternatively spliced (especially the second S/T
module) to generate an X+/− form [204]. The two S/T modules are
separated by a SEA module, similar to that of perlecan (see above),
known to be involved in regulating O-glycosylation of mucins and
glycoproteins. The N-terminal and central regions of agrin protein
core contain the attachment sites for HS chains, and rotary
shadowing electron microscopy has revealed three attachment sites
for HS chains [206]. However, agrin can be a hybrid HS/CSPG with two
clusters of Ser-Gly sequences, one primarily carrying HS chains
located between FS repeats 7 and 8, and one carrying mostly CS
chains, located in the first S/T module [207]. An agrin fragment
harboring all protein modules described so far inhibits neuronal
outgrowth independently of HS or CS [208]. The HS chains of agrin,
however, bind FGF2, thrombospondin, β-amyloid peptide, N-CAM, and
the protein tyrosine phosphatase σ [209].

The C-terminus of agrin is structurally organized as perlecan
domain V/endorepellin, with three LG domains separated by EGF-like
modules (Fig. 3). The only difference is the position of the EGF
repeats vis-à-vis the LG domains. The LG domains of agrin bind
α-dystroglycan in skeletal muscle and low-density lipoprotein
receptor-related protein 4 (LRP4) [210]. The latter interaction
activates the RTK MuSK which initiates a signaling cascade that
leads to the formation of pre- and post-synaptic specializations.
The terminal LG3 domain of agrin can be alternatively spliced with
inserts of 8, 11 and 19 residues and their bioactivity is influenced
by Ca2+ binding [211]. Moreover, the overall function of agrin is
regulated by site-specific processing via MMPs [212]. Agrin is a
good example, together with perlecan, of the evolved mechanisms in
molecular recognition and function achieved through utilization of
common protein folds, such as LG modules [211].
Thus, both agrin and perlecan bind, via their LG-rich C-termini,
multiple cell surface receptors including RTKs, and can potently
modulate cardiovascular and musculoskeletal systems. Importantly,
conjugation of LG modules of agrin and perlecan to polymerizing
laminin-2 evokes clustering of acetylcholine receptors [213]. These
data provide strong support for a cooperative function of basement
membrane HSPGs in AChR assembly and function.

Of interest, recessive missense mutations in the AGRN gene cause
congenital myasthenic syndromes characterized by defective
neuromuscular transmission [214]. More recently, AGRN recessive
missense mutations have been identified as a causative factor for a
congenital myasthenic syndrome with distal muscle weakness and
atrophy, resembling distal myopathy [215]. Given the large number
and heterogeneous groups of neuromuscular disorders, it is likely
that in the future new syndromes will be identified that are linked
to genetic abnormalities of the AGRN gene.
Collagens XVIII and XV
Collagens XVIII and XV, two members of the multiplexin gene family
[216–220], harbor structural features of collagens and
proteoglycans, being substituted with HS and CS, respectively [221].
Like agrin, collagen XVIII was serendipitously discovered to be an
HSPG when monoclonal antibodies were used against an unidentified
avian HSPG [222]. Subsequent cloning and sequencing of the cDNA
showed that this avian HSPG protein core shows high homology to the
mammalian collagen XVIII. Collagen XVIII is a homotrimer comprised
of three identical α1 chains and consists of ten interrupted
collagenous domains, flanked by eleven noncollagenous domains at
their respective N- and C-termini. Collagen XVIII also harbors three
Ser-Gly consensus binding sites for the attachment of HS chains
[223] (Fig. 3). The human COL18A1 gene can generate three protein
variants derived from alternative promoter usage and splicing
events [221]. Specifically, COL18A1 can produce a short variant, a
middle variant containing a TSP-1 module, and a long variant
containing an additional Frizzled repeat. The latter is missing in
collagen XV. Both collagens XVIII and XV contain a C-terminal
noncollagenous domain harboring the antiangiogenic endostatin and
endostatin-like modules. Specifically, the NC1 domain consists of an
N-terminal trimerization region, a central hinge region sensitive
to proteolytic activity and the C-terminal endostatin domain
(Fig. 3). Endostatin interacts with numerous receptors including
integrins α5β1, αvβ3 and αvβ5 [224,225] and VEGFR2 [226].
Interestingly, endostatin, in analogy to endorepellin, is capable of
inducing autophagy in endothelial cells by modulating Beclin 1 and
β-catenin levels [227]. These findings suggest that C-terminal
anti-angiogenic fragments of pericellular HSPGs may evoke
endothelial cell autophagy which could contribute to their
angiostatic properties.

The signaling network evoked by soluble endostatin leads to a
downregulation of several key components of the VEGF signaling
cascade and, concurrently, to a stimulation of the synthesis of
thrombospondin [228], a powerful angiostatic protein [229,230].

Both collagens XVIII and XV are ubiquitously expressed in all
vascular and epithelial basement membranes of human and mouse
tissues, with an overall topography reminiscent of that of perlecan
and agrin. Notably, Col18a1−/− mice show multiple ocular
abnormalities, especially affecting the anterior portion of the
eyes [231,232]. In humans, mutations in the COL18A1 gene cause
Knobloch syndrome, a rare autosomal recessive disease characterized
by high myopia, vitreoretinal degeneration and retinal detachment
[233,234].

Col18a1−/− mice show enhanced neovascularization and vascular
permeability during atherosclerotic disease progression [235], and
loss of this gene in both mice and humans leads to
hypertriglyceridemia [236]. Moreover, Col18a1−/− mice display
enhanced angiogenesis during wound healing [237]. In contrast to
Col18a1−/−, Col15a1−/− mice show normal vascular formation but
primarily develop a skeletal myopathy [238]. However, microscopic
changes in the small arterioles with collapsed capillaries and
endothelial cell degeneration in heart and skeletal muscles are
also noted [238]. Collectively, these findings implicate collagen
XVIII as a negative regulator of angiogenesis and as an
anti-atherosclerotic factor. Collagen XV may function as a key
structural constituent required for the stabilization of skeletal
muscle cells and microvessels [238], and recently both collagens XV
and XVIII have been implicated in mediating the influx of
leukocytes in renal ischemia/reperfusion [239]. Of interest, mice
lacking the long form of collagen XVIII (i.e. the N-terminal
frizzled-like sequence) but producing the short form exhibit a
decreased number of pre-adipocytes, hepatic steatosis and elevated
VLDL and triglyceride levels [240]. Thus collagen XVIII is directly
implicated in the generation of adipose tissue and in
hyperlipidemia associated with visceral obesity and fatty liver.
Extracellular proteoglycans
This is the largest class, encompassing 25 distinct genes. Four
genes encode the hyalectans, key structural components of cartilage,
blood vessels and central nervous systems. They all bind hyaluronan
and form supramolecular complexes of high viscosity. The second
class encompasses 18 SLRPs, which have a multitude of functions and
often signal through various receptors as many members are now
found in the circulation and in various body fluids. The third
class, the SPOCK family, encompasses 3 testicans which are
calcium-binding HSPGs.
Hyaluronan- and lectin-binding proteoglycans (hyalectans)
Hyalectans comprise a distinct family of proteoglycans with
structural similarities at both the genomic and protein levels. This
family contains four distinct genes, namely aggrecan, versican,
neurocan, and brevican (Figs. 1 and 4). A shared feature of these
proteoglycans is their tridomain structure: an N-terminal domain
that binds hyaluronan, a central domain harboring the GAG side
chains, and a C-terminal region that binds lectins [2]. Based on
this dual activity at the N- and C-termini, the term hyalectans, an
acronym for hyaluronan- and lectin-binding proteoglycans, has been
proposed [1]. Alternate exon usage and variability in the degree of
glycanation and glycosylation provide diverse functional attributes
for these proteoglycans which often act as molecular bridges between
cell surfaces and extracellular matrices.

Fig. 4. Schematic representation of the hyaluronan- and
lectin-binding proteoglycans (hyalectans), which comprise aggrecan,
versican, neurocan and brevican. The full-length versican (V0) and
the three splice variants lacking GAGα (V1), GAGβ (V2) or both GAGα
and GAGβ (V3) are shown. A new variant, V4, containing a portion of
GAGβ is not shown. A GPI-anchored form of brevican is also not shown
in the graphic. The dotted circles specify the globular domains
(G1–G3) shared by the other hyalectans. These modules are composed
of ~100 amino acids and have a characteristic consensus sequence
with four disulfide-bonded Cys residues. The key for the various
modules is provided in top right panel.

Aggrecan
Aggrecan, as its eponym indicates, has the propensity to
aggregate into large supramolecular complexes >200 MDa together
with hyaluronan
and link protein, and is the principal load-bearing proteoglycan
of cartilage [241]. These large aggregates generate a densely
packed, hydrated gel enmeshed in a network of reinforcing collagen
fibrils and other proteoglycans and glycoproteins [242]. The
N-terminal domain contains four link protein-like modules or
proteoglycan tandem repeats in addition to the Ig-like repeat
(Fig. 4). The entire link module is ~100 amino acids in length and
has a characteristic consensus sequence with four disulfide-bonded
Cys residues. These modules form two globular domains known as G1
and G2 [243]. The G1 domain is related to link protein and to the
other G1 domains of the hyalectans, both in terms of structural
domains and subdomains [243]. The G1/hyaluronan/link protein ternary
complex is very stable, thereby immobilizing the aggrecan into
enormous complexes that maintain a stable network and provide
mechanical properties to cartilage. An interglobular region, between
G1 and G2, has a rod-like structure and harbors several
protease-sensitive sites involved in the partial degradation of
aggrecan in arthritis and other inflammatory diseases.

Following the G2 domain is a relatively small region containing
numerous KS chains. This domain is not well conserved and its size
varies significantly among species. Next is the largest domain of
aggrecan, which contains the GAG-binding region. This protein domain
is encoded by a single, very large (~4 kb) exon with ~120 Ser-Gly
dipeptide repeats, which can generate >100 covalently linked CS
chains. The concentration of negative charges within aggrecan
accounts for its ability to hold large amounts of water, not only in
cartilage, but also in the intervertebral disc and brain. Moreover,
electrostatic repulsion forces generated by the numerous negatively
charged CS and KS chains of aggrecan provide the equilibrium
compressive modulus (a measure of stiffness) of cartilage. In
humans, a variable number of tandem repeats can generate different
alleles in the general population, ranging between 13 and 33
repeats, causing a great variability in the aggrecan degree of
glycanation and negative charge (due to sulfation) within cartilage.

The G3 module of aggrecan contains 2 EGF-like repeats, a C-type
lectin domain and a complement regulatory protein (CRP) domain.
Notably, the EGF repeats can be alternatively spliced, in part
because in rodents exon 13 is a pseudoexon. Moreover, in rodent
brain, the most common aggrecan species lacks both EGF repeats
[244]. As in the case of other hyalectans, the C-type lectin domain
of aggrecan binds simple sugars, such as fucose and galactose, in a
Ca2+-dependent manner. Thus, aggrecan G3 may serve as a binding
domain for the galactose present on collagen type II or other
extracellular
matrix or cell surface constituents. Moreover, the G3 domain of
aggrecan interacts with tenascins, fibulins and sulfated
glycolipids [245]. Thus, aggrecan could bridge and interconnect
various constituents of the cell surface and extracellular matrix
via its C-terminal G3 domain, thereby providing a mechanosensitive
feedback to the chondrocytes. Indeed, epiphyseal chondrocytes grown
on hydrogel substrata can maintain their phenotype for up to six
months with proper secretion of cartilage-specific constituents,
such as aggrecan, and collagens type II and IX, but without
expressing collagen type I [246].

The essential role of aggrecan in cartilage is underscored by
several genetic defects including two autosomal recessive
chondrodystrophies, nanomelia in chickens and cartilage matrix
deficiency (cmd) in mice [247]. In nanomelia, the defect leads to
the formation of a C-terminal truncated aggrecan, while in cmd mice
there is an even larger C-terminal truncation. In both mutant
animals, there is little or no aggrecan in cartilage leading to
shortened long bones and lethality, most likely due to respiratory
failure arising from tracheal collapse [247]. Aggrecan is also
involved in the morphogenesis of limb synovial joints and articular
cartilage [248], and fragments of aggrecan represent biomarkers for
osteoarthritis [249].

Aggrecan is also expressed in the brain, and unlike other
hyalectans, is expressed primarily in the perineuronal nets [79]. A
relatively small number of cortical neurons express aggrecan,
especially the cortical interneurons [244]. One of the hypothesized
functions of brain aggrecan is its potential regulation of neural
maturation, in addition to its physical ability to adduct cations
and regulate osmotic imbalances. Thus, aggrecan could affect
high-rate synaptic transmission, mechanical stabilization of
synaptic contacts and neuroprotection by counteracting oxidative
stress via scavenging redox-active cations [244].
Versican
Versican, an eponym that signifies its highly versatile function
[250], is the largest member of the hyalectan family when expressed
as a whole molecule, designated V0 (Fig. 4). Versican is the
mammalian counterpart of the so-called PG-M, a large chondroitin
sulfate proteoglycan expressed during chondrogenesis in chick limb
buds [251,252]. The VCAN gene, originally called CSPG2 [253–255],
encompasses 15 exons encoding a full-length (V0 variant) protein
core of ~400 kDa, with 3396 amino acid residues. The overall
structural organization of versican is similar to that of aggrecan,
with a few exceptions. At the N-terminus there is only one globular
domain instead of two. Specifically, the N-terminal domain of
versican contains one IgG fold followed by two consecutive link
protein modules similar to G1, which are involved in mediating the
binding of proteins to hyaluronan. Recombinant versican and a
truncated form of versican containing the N-terminal domain bind to
hyaluronan with high affinity, KD ~ 4 nM, in the same range as the
other major aggregating CSPG, aggrecan [256]. The central domain of
versican comprises two relatively large subdomains, designated GAGα
(encoded by exon 7) and GAGβ (encoded by exon 8), which can be
alternatively spliced to generate the three main variants V1, V2 and
V3 [255], with significant CS polymorphism in the different versican
isoforms. These large regions lack Cys residues and contain ~30
potential consensus sequences for GAG attachment as well as several
binding sites for N- and O-linked oligosaccharides. There is also
variability in tissue expression of the isoforms, with V0 and V1
representing the most ubiquitous isoforms, expressed in the
developing heart and limbs, vascular smooth muscle cells and several
non-neuronal tissues, whereas the V2 isoform is mainly present in
the brain [79]. Expression of the V3 isoform in arterial smooth
muscle cells regulates multiple signaling pathways, including TGFβ,
EGF and NF-κB pathways, thereby creating a microenvironment
resistant to monocyte adhesion [257]. Recently, a new splice variant
of versican, V4, which contains up to five CS chains, has been
identified in human breast cancer [258]. This isoform comprises only
the first 1194 bp of exon 8 (encoding the GAGβ) sandwiched between
exons 6 and 9, and is highly expressed in breast cancer in contrast
to normal breast tissue where it is undetectable [258]. Notably, the
avian versican ortholog harbors an additional exon, known as PLUS,
in the N-terminal region that is developmentally regulated [259].
This exon can be alternatively spliced, giving rise to two
additional isoforms. Although no similar region is present in the
mammalian genome, sequence homology suggests that the PLUS domain of
avian versican may correspond to the KS attachment region in
aggrecan.

The C-terminal domain of versican is also very
similar to that of aggrecan and other hyalectans in that it
harbors similar structural motifs, including two EGF-like repeats, a
C-type lectin domain, and a complement regulatory protein-like
module (Fig. 4). These motifs are generally found in the selectin
family of glycoproteins, which include several adhesion receptors
regulating leukocyte homing and extravasation during inflammation.
Given the fact that the various C-type lectin modules may have
different saccharide-binding specificity, the presence of these
domains at the C-terminal ends of hyalectans could provide
specialized and refined functions for these CSPGs. Moreover, these
findings suggest that versican may form a molecular link between
lectin-containing glycoproteins at the cell surface and
extracellular hyaluronan. Because hyaluronan is bound to the cell
surface via its CD44 receptor [241,260], versican may also stabilize
a large supramolecular complex at the plasma membrane zone [2].

The functional roles of versican are multiple and complex.
Versican is involved in the regulation of cell adhesion, migration
and inflammation [260–262]. During an inflammatory response,
leukocytes need to emigrate from the blood vessels into the damaged
surrounding tissues. During this process, leukocytes encounter a
provisional matrix highly enriched in versican, which in turn is
capable of interacting with many receptors on the surface of immune
cells including CD44, P-selectin glycoprotein ligand-1, and
Toll-like receptors [261]. Another important role of versican
derives from the multiple processing of its protein core. Versican
is degraded and partially processed by several MMPs, plasmin and
members of the ADAMTS family [263,264]. Versican is also involved in
the biology of leiomyosarcomas insofar as its levels are markedly
increased vis-à-vis benign leiomyomas, and suppression of versican
expression attenuates malignant growth and tumor progression [265].

Two autosomal dominant eye disorders, Wagner syndrome and erosive
vitreo-retinopathy, which both show optically empty vitreous
cavities, are caused by mutations in the VCAN gene [266].
Interestingly, the mutant alleles contain mutations around the
splice sites flanking exon 8, which encodes the GAGβ domain, likely
producing exon skipping. The ultimate consequence of exon skipping
is that most tissues, and especially the eye, would lack the GAGβ
domain and carry far fewer CS chains, resulting in a less charged
environment.
Neurocan and brevican
The third member of the hyalectans is neurocan, a developmentally
regulated CSPG originally cloned from rat brain, hence its eponym
signifying neuronal origin [267]. Rotary shadowing
electron microscopy of neurocan has revealed two globular domains
interconnected by a 60–90 nm rod [268], similar to the predicted
organization of other hyalectans derived from biochemical and
genomic analyses. Like other hyalectans, neurocan has an N-terminal
domain with structural homology to the typical arrangements found in
link protein, harboring a G1 domain and an Ig repeat (Fig. 4).
Functionally, the recombinant N-terminal module of neurocan interacts
with hyaluronan in solution, as shown by gel permeation assays, and
isolated complexes comprise hyaluronan and globular profiles [268].
Therefore, it is highly likely that all the N-terminal domains of
the hyalectans bind and interact with hyaluronan and link protein in
vivo, forming gigantic supramolecular aggregates. The next
interglobular region of neurocan, with little homology to other
proteins, contains ~seven potential CS binding sites. The
C-terminal module of neurocan shares significant homology to the G3
domain of aggrecan and versican, with ~60% identity between the rat
neurocan and human
versican/aggrecan. By analogy to the
other hyalectan members, this domain could bind several
brain glycoproteins including Ng-CAM, N-CAM, and
tenascin. Neurocan is known to inhibit
neurite outgrowth in vitro and, in keeping with this function, the
expression of neurocan is increased at the site of mechanical and
ischemic injury in the adult central nervous system [78,269].
Neurocan has been implicated in path finding during
development. However, Ncan−/− mice develop normally with only a mild
deficiency in long-term potentiation, suggesting that neurocan might
have only a redundant role during development.

Brevican is one of the most important hyalectans
of the central nervous system. It takes its eponym from the Latin
word brevis (for short) as it harbors a typical hyalectan
configuration with N- and C-terminal homologous domains, but with
a shorter GAG-binding domain (Fig. 4) [270,271]. Brevican was
simultaneously discovered by three laboratories searching for
hyaluronan-binding proteoglycans in the brain [271,272] and for
synapse-associated proteins [273]. The eponym BEHAB, which is
sometimes used for brevican as they are the same gene product,
refers to brain-enriched hyaluronan-binding protein [272]. Although
sequence homology with the other hyalectan members is quite
uniform (~60% overall), the GAG-binding domain is poorly
conserved and contains a high content of acidic amino acid
residues (mainly glutamic acid). This structural feature, shared
with the link protein-like module of versican, could mediate binding
to cationic proteins and minerals. Similar to neurocan, brevican
exists in vivo either as a full-length CSPG or as a proteolytically
processed form lacking the GAG-binding region and the N-terminal
domain. The C-terminal G3-like domain is structurally organized like
that of the other hyalectans, although it harbors only one EGF-like
repeat instead of two as in all the other members (Fig. 4).

In addition to secreted full-length brevican, an
isoform of brevican encoded by a shorter 3.3 kb mRNA and highly
expressed during post-natal development is linked to the plasma
membrane via a GPI anchor [273]. Notably, the GPI-anchored brevican
lacks the EGF, C-type lectin and CRP modules but contains a stretch of
hydrophobic amino acids resembling the GPI anchor. Brevican is
located at the outer surface of neurons and is enriched at
perisynaptic sites. Brevican interacts with tenascin-R and
fibulin-2 via its G3-like domain [274].
Functionally, brevican has been implicated in glioma
tumorigenesis, nervous tissue injury and repair, and in
Alzheimer's disease [274]. However, many more studies need to be
performed before a clear picture of brevican's biology can be
drawn.

Small leucine-rich proteoglycans/SLRPs
General considerations
This is the largest family of proteoglycans, encompassing 18
distinct gene products and numerous splice variants and processed
forms. The eponym SLRP, for small leucine-rich proteoglycans [1], is
now a widely used abbreviation. SLRPs designate a class of
proteoglycans characterized by a relatively small protein core (as
compared to the larger aggregating proteoglycans) of 36–42 kDa and
encompassing a central region constituted by leucine-rich repeats
(LRRs) (Fig. 5) [275]. The SLRPs are ubiquitously expressed in most
extracellular matrices and are highly expressed during development
in the thin membranes enveloping all the major organs such as
meninges, pericardium, pleura, periosteum, perichondrium, perimysium
and endomysium [276–278]. This strategic topology suggests that SLRPs
would be directly involved in regulating organ size and shape during
embryonic development and homeostasis [279,280].

The 18 SLRP members are grouped into five
classes: Classes I–III are canonical genes, whereas Classes IV and
V are non-canonical (Fig. 1). Although eight non-canonical members
do not carry glycosaminoglycan side chains, they have been
included because they share close structural homology and several
functional properties with the full-time proteoglycans. This
classification is based on several considerations, including
evolutionary conservation, homology at both the protein and genomic
level, and chromosomal organization (Fig. 5A) [281]. It is important
to note that SLRPs share many biological functions in terms of
binding to various collagens [282–286], RTKs [287–290], innate immune
receptors [291,292] and in terms of modulating the bioactivity of
various signaling pathways when in soluble form [293–295].
Moreover, several SLRPs bind TGFβ and bone morphogenetic protein
(BMP), and several members of this family inhibit cell growth
[296,297].

Fig. 5. Phylogenetic tree of the small leucine-rich proteoglycans
(SLRPs) and crystal structure of decorin. (A) Dendrogram of the five
human SLRP classes, numbered and color-coded. Protein sequences
were first aligned with CLUSTALW before an unrooted dendrogram was
generated by a neighbor-joining method using GenomeNet. (B) Cartoon
ribbon diagram of the crystal structure of monomeric bovine decorin
rendered with PyMOL v1.7 (PDB accession number 1XKU). Vertical arrows
indicate β-strands, while coiled ribbons indicate α-helices. The
leucine-rich repeats (LRRs) are numbered above the diagram. The
sequence (SYIRIADTNIT) involved in binding to collagen type I
[306,307] is highlighted in yellow. The terminal LRR Cys capping
motif, known as the ear repeat, is also indicated [299].

The crystal structure of bovine decorin [298] shows
a solenoid fold structure typical of LRRs (Fig. 5B). Each LRR
unit is composed of ~24 amino acids, characterized by a conserved
pattern of hydrophobic residues, with a short parallel β-sheet on
the concave face interwoven with loops containing short β-strands,
3₁₀ helices and polyproline II helices on the convex
(outer) side of the protein core
(Fig. 5B). The LRRs form a curved, solenoid structure where
protein/protein interactions occur primarily via the side chains of
variable residues protruding from the short parallel
β-strands that form the inner (concave) face of the
solenoid. The LRRs are flanked at the N- and
C-termini by disulfide-bonded caps which define the
various classes [277]. At the N-terminus, there are four Cys
residues with a variable number of intervening amino acids, whereas
the C-terminal capping motif encompasses two LRRs and includes the
so-called ear repeat (Fig. 5B). This Cys-capping motif, designated
LRRCE, is present in the canonical SLRPs (Classes I–III) but absent
in the other two non-canonical classes [299]. Likely, both capping
motifs at either end of Class I–III SLRPs would function to stabilize
the LRR central domain, as in the case of other LRR proteins and
receptors.

Another characteristic feature of Class I–III SLRPs
is the presence of a long penultimate LRR (LRR XI in decorin)
that has been called the ear repeat [300].
Typically, the ear repeats contain 30 or more amino
acid residues including an atypical sequence harboring
a Cys located about 10 residues after the
asparagine residue in the consensus LRR [300].
Genetic mutations in the decorin gene leading to a
C-terminal truncation of the decorin protein core, lacking
the ear repeat, cause congenital stromal corneal
dystrophy [301]. This syndrome has been faithfully reproduced
in mice where this truncated decorin was specifically expressed in
the cornea [302,303].

Although bovine decorin has been crystallized
as an
anti-parallel dimer [298] and reported to be a dimer in solution
[304], there is strong evidence that decorin acts as a monomer in
solution [293], especially when interacting with the small binding
site on the EGFR ectodomain in vivo, where a dimer could not fit the
cavity [305]. Also supportive of concave-face binding is the
identification of the sequence (SYIRIADTNIT) in
LRR VII (highlighted in yellow in Fig. 5B) of the decorin
protein core that is directly involved in binding to
collagen type I [306,307]. A recent study utilizing
mutant forms of mouse decorin, where engineered
glycosylation sites in the concave face prevent
dimerization, has shown that the monomeric mutants are as
stable as the wild-type in solution [308]. The
concave-face mutants fail to bind collagen, regardless
of the dimerization state, thus providing robust
biological evidence for concave face-mediated
binding (i.e., by monomeric decorin) to collagen [308].

A hallmark shared by nearly all SLRPs, and by most
LRR-containing proteins, is their propensity to interact with
other proteins and to regulate collagen fibrillogenesis
[282,283,309,310]. For example, several SLRPs interact with fibrils
of collagen types I, II, III, V, VI and XI. Indeed, the eponym
decorin derives
from its ability to decorate fibrillar (banded) collagen in
a periodic fashion; that is, decorin protein core
binds non-covalently, about every 67 nm, to an
intraperiod site on the surface of collagen fibrils, every
D period [311,312]. In highly purified α1(I) procollagen molecules,
decorin protein core binds close to an intermolecular cross-linking
site near the C terminus [313]. SLRP coating of various types of
collagen serves a dual function: it regulates the lateral association
of collagen molecules into proper fibrils, and protects collagen
fibrils from proteolysis by sterically limiting the access of
collagenases to their cleavage sites. Importantly, during
evolution, these dual functional properties of SLRPs have come to be
shared by both their sulfated GAGs and protein cores. Notably, a
few SLRP members contain stretches of amino acids
that can be sulfated, such as the poly-Tyr sulfate in
fibromodulin or the poly-Asp region in asporin. Often,
the GAGs are located in the N-terminus, in a location
that is similar to that of these poly-sulfated amino acid
stretches, and can be directly involved in collagen
interaction [314,315]. An additional degree of complexity
is provided by the heterogeneous structure of
the GAG chains. For instance, Class I SLRPs contain
CS or DS chains, with the exception of asporin, ECM2,
and ECMX. In contrast, Class II members contain
poly-lactosamine or KS chains in their LRRs and sulfated Tyr
residues at their N-termini. Class III members contain CS/DS
(epiphycan), KS (osteoglycin), or no GAG (opticin). Finally, the
non-canonical Class IV and V members lack GAG chains with the
exception of chondroadherin, which is substituted with
KS.

The biological functions of SLRPs are vast, and
there are over 3000 published papers on decorin alone, the
archetypal and most studied SLRP. Thus, we refer the readers to
recent comprehensive and specialized reviews on SLRPs
[275,281–283,294,307,316–325]. Moreover, it has been proposed that
SLRPs can be transcriptionally co-regulated through utilization of
HOX-Runx modules in their promoters and genomic regions, including
proximal exons and intergenic regions [326]. Below is a brief
overview of each family with emphasis on recent discoveries of
their multiple functional roles in physiological and perturbed
states.
Class I SLRP
Decorin, also known as PG40 and DSPG1, was originally cloned from
a fibroblast cDNA library [327], and subsequently named decorin
because of its ability to decorate collagen fibrils [328].
Specifically, decorin protein core is a Zn2+ metalloprotein [329,330]
that is biologically active in solution as a monomer [293]. As
mentioned above, decorin protein core binds non-covalently to an
intraperiod site on the surface of collagen fibrils about every
67 nm, at the D period [312]. Using purified collagen
and procollagen molecules, which can be
visualized by their C-terminal globular regions, it has been
shown that decorin protein core binds near the C terminus of
collagen α1(I), near an
intermolecular cross-linking site [313]. Not only the protein core
but also the N-terminal GAG chain of decorin plays a role in
collagen fibrillogenesis and structure [285,314,315,331–334].
The strategic location of the GAG-binding domain in
the N-terminus of decorin allows a higher degree of
mobility for the DS chain, which presumably could
align orthogonally or parallel to the axis of the collagen
fibrils. This dual function of decorin could help in
maintaining corneal transparency and the biomechanical
properties of various connective tissues [282,284,335].

The decorin gene exhibits
a complex genomic
organization and transcriptional control [276,336–338], and its
transcription can be induced by quiescence and suppressed by TNFα
[339,340]. It was known for many years that the small DSPG of
tendon, mostly decorin, is capable of inhibiting lateral growth of
collagen fibrils [309]. Thus, when the decorin-null mice were
generated, the first targeted deletion of a proteoglycan-encoding
gene, the abnormal collagen structure in the dermis and the skin
fragility phenotype [310] provided the first genetic evidence for a
regulatory role for the prototype member of the SLRP gene family in
collagen fibrillogenesis. The phenotype of the decorin-deficient
mice includes abnormal collagen fibril morphology in the skin and
tail tendon; the fibrils are presumably less stable during
development due to abnormal cross-linking or enhanced susceptibility
to collagenase. The prevalent phenotype of the
decorin-null mice is skin fragility caused by a thinning
of the dermis with concurrent reduced tensile strength,
a biomechanical impairment directly linked to the
abnormal collagen network. Overall, the Dcn−/− mice
recapitulate the cutaneous defects observed in the
Ehlers–Danlos syndrome, characterized by skin
hyperextensibility and tissue fragility [341], in a way
opposite to fibrosis [342]. Due to their mild phenotype,
the Dcn−/− mice have been utilized by a large number
of investigators using many experimental challenges
and have provided strong genetic evidence for
decorin roles in Lyme disease [343,344], lung
mechanics and asthma [345,346], diabetic nephropathy and
tubulointerstitial fibrosis [347–350], myocardial infarction [351],
corneal transparency and tendon biomechanical properties [352–356],
dentin mineralization and periodontal homeostasis [357–359], hepatic
fibrosis and hepatocellular carcinoma [318,360–362], collagen
fibrillogenesis [314,363,364], fetal membrane biology [365–367],
wound healing and angiogenesis [368–373], innate immunity and
inflammation [291,374,375], adhesion and migration [376], and
mesenchymal stem cell biology [377]. Decorin plays an important role
during zebrafish development insofar as zDcn knockdown causes a
severe phenotype characterized by abnormal convergent extension,
craniofacial abnormalities, and cyclopia [278].
As these genetic defects are reminiscent of several zebrafish
mutants affecting the non-canonical Wnt signaling pathway, it is
possible that decorin might
also play a role in this pathway in mammals. Indeed, a recent
study has shown that decorin is directly involved in modulating the
signaling pathway of Wnt3a, shaping niches supportive of
hematopoiesis [378].

Mutations in the decorin gene have been linked to
congenital stromal corneal dystrophy (CSCD) syndrome [301,379],
where a truncated form of decorin lacking the ear repeat, the
C-terminal 33 amino acids, acts in a dominant-negative fashion. A
corneal knock-in transgenic mouse lacking the C-terminal 33 amino
acid residues (952delT Dcn) faithfully recapitulates the human
phenotype of corneal opacities [302]. Mechanistically, the
C-terminal truncated form of decorin is retained in the cytoplasm of
keratinocytes, triggering ER stress and an unfolded protein
response [380]. These data provide a cell-based, rather than
ECM-based, interpretation of the CSCD phenotype whereby a truncated
SLRP protein core, by inducing ER stress, causes abnormal
processing and secretion of decorin and other SLRPs, eventually
generating an abnormal matrix assembly and corneal opacities.

Decorin was the first proteoglycan to be directly
involved in the control of cell growth. Two seminal papers
identified decorin as a growth suppressor, via
a mechanism involving decorin's binding to and
inhibiting TGFβ in Chinese hamster ovary cells
[381,382]. Concurrently, decorin was identified as a
proteoglycan highly expressed in the tumor stroma
of colon carcinomas [383], primarily via hypomethylation
of its promoter regions [384]. It was soon
recognized, however, that the growth of most
malignant cells does not depend on the availability
of TGFβ. Thus, there had to be other signaling
receptors for the growth-suppressive function of
decorin. The existence of such receptor(s) was
supported by an emerging body of literature describing
that ectopic expression of decorin or its protein
core suppresses the malignant phenotype in a variety
of histogenetic malignant backgrounds [385,386].
Utilizing A431 cells, a squamous carcinoma cell line
which overexpresses EGFR, it was discovered that
exogenous decorin proteoglycan or protein core
transiently activated the EGFR to induce growth
inhibition via expression of the cyclin-dependent
kinase inhibitor p21WAF1 [287,387,388]. Indeed,
decorin binds to a narrow region of the EGFR,
partially overlapping with but distinct from the
EGF-binding epitope [305]. Mechanistically, decorin
transiently activates the EGFR and elevates cytosolic
Ca2+ in A431 cells [389], but it causes a sustained
down-regulation of this RTK, thereby providing a
plausible mechanism for controlling tumor growth
in vivo in various forms of cancer [390–392].
Specifically, soluble decorin evokes protracted
internalization and degradation of the EGFR via caveolar
endocytosis [393]. An anti-oncogenic role for decorin
has also been demonstrated in its ability to inhibit
another member of the ErbB family,
namely ErbB2/Neu, in this case by inhibiting heterodimerization
of ErbB4 with ErbB2, thereby leading to growth
suppression and cytodifferentiation of mammary
carcinoma cells [394]. It was subsequently found
that decorin binds specifically and with higher affinity
(KD ~ 2 nM) to the hepatocyte growth factor receptor known as
Met [288] and causes proteasomal degradation of Myc and β-catenin,
two critical downstream effectors of Met [395]. An important
downstream effect of the decorin/Met interaction is induction of two
anti-angiogenic proteins, Thrombospondin 1 and
TIMP3, with concurrent inhibition of two powerful
pro-angiogenic factors, HIF-1α and VEGFA [371,372].
Moreover, decorin binds and suppresses both the
IGF-IR [289,396,397] and VEGFR2 [371,398].

Loss of decorin in the tumor stroma correlates with
poor survival of patients with invasive breast
carcinomas [275,399,400] and of mice with spontaneous
breast cancer [401]. Moreover, decorin is
markedly reduced in the stroma of many solid tumors
[402–404], as well as in low- and high-grade bladder
carcinomas, but is highly expressed in the normal
bladder stroma [397]. Decorin levels are also
decreased in multiple myeloma [405,406], soft tissue
sarcomas [407], and prostatic [408], urothelial [409–411]
and hepatic [362,412] carcinomas, together with a
complete loss of decorin expression by several
tumor cells [413,414]. Additional proof for an oncostatic
role of decorin as a soluble tumor repressor
stems from genetic models wherein ablation of
decorin under conditions of a high-fat, western-type
diet is linked to the spontaneous appearance of
intestinal tumors [415,416]. Moreover, compound
Dcn−/−;Tp53−/− mice die of aggressive T-cell lymphomas much
sooner than mice lacking only the tumor suppressor Tp53 [417].
Notably, systemic delivery of decorin, either as a soluble factor or
via adenoviral gene delivery, significantly retards tumorigenic
and angiogenic growth in a wide variety of
malignant solid tumors [413,418–424]. Collectively,
these findings provide strong support for the concept
that decorin could act as a guardian from the matrix, in
analogy to p53, a guardian of the genome [414].
Thus, decorin could become a potent therapeutic
factor, either alone or in combination with traditional
chemotherapy, in preventing tumor progression and
metastasis [297].

Recently, it was discovered that soluble decorin
evokes excessive autophagy in endothelial cells,
independently of nutrient deprivation, through partial
agonistic activity on VEGFR2 [425]. This signaling
cascade emanating from the decorin/VEGFR2 interaction
leads to two effects. First, it activates AMPK
and Vps34, which in turn stimulate the synthesis of
Peg3 [426], a recently identified master regulator of
autophagy [422]. Peg3 recruits LC3 and
Beclin 1, which evoke autophagy, and concurrently
induces transcription of both genes, while inhibiting VEGFA
production [425]. These multiple biological roles of decorin
would converge on oncostasis by suppressing RTK signaling in the
growing cancer cells and inhibiting the supply of oxygen and
nutrients by hindering angiogenesis and inducing a protracted, and in
this case deleterious, stromal cell autophagy
[427]. In view of the fact that decorin has been found
in the circulation in nanomolar amounts [428–430],
at concentrations similar to those used in the
experimental studies mentioned above, and as
plasma decorin is significantly increased in cancer
patients [291], it is plausible that this endogenous
tumor repressor might have a physiological role
in vivo.

Biglycan, the proteoglycan most closely related to decorin, was
originally isolated from bovine bone and then, following its
cloning and sequencing, was found to contain two
Ser-Gly attachment sites in the N-terminal region, thus
its eponym meaning two GAG chains [431]. Both the
human and mouse genes have an overall similar
exonic arrangement [432,433]. It is highly homologous
to decorin, with >65% overall homology. Similar
to decorin, biglycan binds TGFβ [434] and modulates
its bioactivity [435]. Ablation of the biglycan gene,
Bgn−/0 (this genetic symbol designates the presence
of the Bgn gene on the X chromosome), a
gene with a ubiquitous tissue distribution and a
pronounced expression in bone [433,436], reveals a
key function for this SLRP in regulating postnatal
skeletal growth [437]. In general, the long bones in
Bgn−/0 mice grow slower than those of wild-type littermates
and eventually are shorter and exhibit reduced bone
mass. The latter is secondary to the marked decline in the
number of osteoblasts with concurrent progressive
depletion of the bone marrow stromal cells [437].
These mutant mice also display delayed osteogenesis
after marrow ablation [438], broader metadentin,
and altered dentin mineralization, causing significant
enamel structural defects. Thus, biglycan-deficient
mice could be a promising animal model to study
skeletal diseases and osteoporosis [439]. Although
Dcn−/− mice also show abnormalities in bone
collagen fibril size and organization, they show neither
overt bone mass defects nor abnormal osteoblast
growth as in the case of biglycan deficiency. These
findings underline non-overlapping functions that have
evolved for these two homologous Class I SLRPs.

Biglycan modulates BMP-4-induced osteoblast
BMP-4-induced osteoblast
differentiation [440], and it also binds Chordin andBMP-4
inXenopus embryos, thereby blocking BMP-4activity [441]. Moreover,
biglycan affects the Wntsignaling pathway [442], in analogy to
decorin (seeabove). However, a recent study has shown thatbiglycan
acts as a pro-angiogenic stimulus in contrastto dec