Mészáros et al., Sci. Signal. 14, eabd0334 (2021) 12 January 2021
SCIENCE SIGNALING | RESEARCH RESOURCE
BIOCHEMISTRY
Short linear motif candidates in the cell entry system used by
SARS-CoV-2 and their potential therapeutic implications
Bálint Mészáros1*, Hugo Sámano-Sánchez1, Jesús Alvarado-Valverde1,2,
Jelena Čalyševa1,2, Elizabeth Martínez-Pérez1,3, Renato Alves1,
Denis C. Shields4, Manjeet Kumar1*, Friedrich Rippmann5, Lucía B.
Chemes6*, Toby J. Gibson1*
The first reported receptor for SARS-CoV-2 on host cells was the
angiotensin-converting enzyme 2 (ACE2). However, the viral spike
protein also has an RGD motif, suggesting that cell surface
integrins may be co-receptors. We examined the sequences of ACE2
and integrins with the Eukaryotic Linear Motif (ELM) resource and
identified candidate short linear motifs (SLiMs) in their short,
unstructured, cytosolic tails with potential roles in endocytosis,
membrane dynamics, autophagy, cytoskeleton, and cell signaling.
These SLiM candidates are highly conserved in vertebrates and may
interact with the μ2 subunit of the endocytosis-associated AP2
adaptor complex, as well as with various protein domains (namely,
I-BAR, LC3, PDZ, PTB, and SH2) found in human signaling and
regulatory proteins. Several motifs overlap in the tail sequences,
suggesting that they may act as molecular switches, such as in
response to tyrosine phosphorylation status. Candidate
LC3-interacting region (LIR) motifs are present in the tails of
integrin β3 and ACE2, suggesting that these proteins could directly
recruit autophagy components. Our findings identify several
molecular links and testable hypotheses that could uncover
mechanisms of SARS-CoV-2 attachment, entry, and replication
against which it may be possible to develop host-directed therapies
that dampen viral infection and disease progression. Several of
these SLiMs have now been validated to mediate the predicted
peptide interactions.
INTRODUCTION
The coronavirus disease 2019 (COVID-19) pandemic is
caused by severe acute respiratory syndrome coronavirus 2
(SARS-CoV-2), an enveloped, single-stranded RNA virus. It had
infected more than 68 million people and caused over 1.5 million
deaths globally by mid-December 2020. SARS-CoV-2 belongs to the
Coronaviridae family, whose members are common human pathogens
responsible for the common cold, as well as for some emerging
severe respiratory diseases. Among them are SARS-CoV and the
Middle East respiratory syndrome coronavirus (MERS-CoV); the former
caused over 8000 cases in 2003 with a fatality rate of ~10%, and
the latter caused about 2500 infections in 2012 with a
fatality rate of 37% (1). Another coronavirus, infectious
bronchitis virus (IBV), infects birds and has been used as a model
in coronavirus research (2). SARS-CoV-2, like SARS-CoV (3), uses
the angiotensin-converting enzyme 2 (ACE2) as a receptor (4–6) to
attach to host cells. ACE2 is a single-pass type I membrane protein
with a short cytosolic C-terminal region whose function, however,
is mostly unknown.
Earlier results show that the SARS-CoV-2 receptor-binding domain
(RBD) of the spike protein interacts with ACE2 for cellular entry.
In 2004, ACE2 was shown to be highly expressed in lungs by
anti-ACE2 antibody staining (7). However, several 2020 papers
using both antibodies and single-cell mRNA sequencing now find
that there is very little ACE2 gene expression in normal lungs
(8–11). This suggests that the ACE2 receptor is insufficient to
establish severe lung disease and that SARS-CoV-2 can bind other
cell surface receptors on human lung cells. One group of candidate
co-receptors is the integrins, which bind a large variety of ligands
harboring an RGD (Arg-Gly-Asp) sequence motif, as recent analysis
of the RBD identified a possibly functional RGD motif (12).
Integrins are major cell attachment receptors, which are known
to be targeted by a range of viruses—including HIV, herpes simplex
virus-2, Epstein-Barr virus (EBV), and the foot and mouth disease
virus (FMDV)—for cell entry and activation of linked intracellular
pathways (13–15). Integrins are special types of receptors, as they
propagate signals in both directions; extracellular ligands can
induce cytoplasmic pathway activation, but intracellular
interactions with the cytosolic tails can influence the structure
of the ectodomains and hence ligand-binding affinity. The
complexity of integrin signaling stems from the dimeric structure
of integrins, as they are composed of two subunits, α and β. For the
RGD-binding integrins, the ligand-binding surface lies at the
interface of the two integrin subunits, with both subunits making
contacts with the ligand. These RGD motifs are recognized by at
least 8 of the 24 human integrins, and the flanking residues next
to the core RGD motif are known to play a decisive role in
selectivity (16). Several viral proteins contain RGD (or RGD-like)
short linear motifs (SLiMs) for integrin modulation; in addition,
not only can some viruses use integrins on the host cell surface,
but HIV/SIV (simian immunodeficiency virus) can also incorporate
integrins into their own membranes for mediating interactions with
the host (17). Therefore, integrins can potentially be targeted
at both the extracellular and the intracellular side to combat
pathogenic hijacking.

1Structural and Computational Biology Unit, European Molecular
Biology Laboratory, Heidelberg 69117, Germany. 2Collaboration for
joint PhD degree between EMBL and Heidelberg University, Faculty of
Biosciences. 3Laboratorio de bioinformática estructural, Fundación
Instituto Leloir, C1405BWE Buenos Aires, Argentina. 4School of
Medicine, University College Dublin, Dublin 4, Ireland.
5Computational Chemistry & Biology, Merck KGaA, Frankfurter
Str. 250, 64293 Darmstadt, Germany. 6Instituto de Investigaciones
Biotecnológicas “Dr. Rodolfo A. Ugalde”, IIB-UNSAM, IIBIO-CONICET,
Universidad Nacional de San Martín, CP1650 San Martín, Buenos
Aires, Argentina.
*Corresponding author. Email:
[email protected] (B.M.); [email protected] (M.K.);
[email protected] (L.B.C.); [email protected] (T.J.G.)
Copyright © 2021 The Authors, some rights reserved; exclusive
licensee American Association for the Advancement of Science. No
claim to original U.S. Government Works

Viruses, as obligate intracellular entities, need to interfere
with major cellular processes like vesicular trafficking, cell
cycle, cellular transport, protein degradation, or signal
transduction to satisfy their replication, enzymatic, metabolic,
and transport needs (18). To achieve this, a large number of host
processes are hijacked using SLiMs often located in intrinsically
disordered regions to establish protein-protein interactions with
host proteins or undergo post-translational modifications (PTMs)
such as tyrosine phosphorylation. For example, cellular signaling
relies heavily on the use of SLiMs (19, 20). The low affinity
and cooperativity of SLiM-based molecular processes allow
reversible and transient interactions that can work as switches
between distinct functional states and are regulated in both time
and space (21, 22). Conditional switching of SLiMs, for
example, through phosphorylation, can induce the exchange of
binding partners for a protein, thus mediating molecular
decision-making in response to signals reporting on the cell state
(20). The Eukaryotic Linear Motif (ELM) resource
(http://elm.eu.org/) is a dedicated database and exploratory server
for over 280 manually curated SLiM classes with experimental
evidence, each of them defined by a POSIX regular expression
(23).
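As a minimal sketch of how a regular-expression motif definition can be scanned against a protein sequence, the example below uses Python's `re` module. The two patterns (`RGD_like` and `YxxPhi_like`) and the toy sequence are simplified assumptions made for illustration; they are not the curated ELM class definitions, and the ELM server applies additional filtering that is not reproduced here.

```python
import re

# Hypothetical, simplified patterns for illustration only; these are
# NOT the curated ELM class definitions.
MOTIF_PATTERNS = {
    "RGD_like": r"RGD",
    "YxxPhi_like": r"Y..[LMVIF]",
}

def scan_motifs(sequence, patterns=MOTIF_PATTERNS):
    """Return (name, start, end, match) tuples, 1-based inclusive positions."""
    hits = []
    for name, pattern in patterns.items():
        # finditer reports non-overlapping matches of a single pattern
        for m in re.finditer(pattern, sequence):
            hits.append((name, m.start() + 1, m.end(), m.group()))
    return hits

# Toy sequence invented for this example; it is not a real protein.
toy_sequence = "MKIRGDEVAAYSKLQ"
for hit in scan_motifs(toy_sequence):
    print(hit)
```

Because such patterns are short, a match on its own is only a candidate; conservation and cellular context still have to be checked before a motif is considered plausible.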
As explained above, a major strategy of viruses is to abuse the
host system by using mimics of eukaryotic SLiMs to compete with
extracellular or intracellular binding partners or to sequester
host proteins (18). This dependence of viruses and many other
pathogens on SLiM-mediated functions suggests that there is an
opportunity to drug the cell systems where these interactions are
being hijacked (24). For example, tyrosine kinase inhibitors, often
used in anticancer therapy, have shown promising coronavirus
replication inhibition in infectious cell culture systems
(2, 25–27). In the remainder of the introduction, we will
describe some of the major pathways hijacked by viruses to
accomplish cell attachment, entry, and replication, which are
suggested by our results to be relevant to SARS-CoV-2
infection.
Receptor-mediated endocytosis (RME) is a cellular import process
triggered by cell surface receptor proteins, including any cargoes
attached to them, in which a large vesicular structure is
assembled entirely through cooperative low-affinity interactions
of SLiMs and phospholipid head groups with their globular protein
domain partners. The vesicles are strong and stable, yet flexible
and dynamically assembled and disassembled. The external
triggering of surface receptors (many of which have the YxxPhi or
NPxY tyrosine sorting motifs) is transmitted across the plasma
membrane, inducing local enzymatic modification of lipid head
groups from phosphatidylinositol-4-phosphate (PI4P) to
phosphatidylinositol 4,5-bisphosphate [PI(4,5)P2] by the PIPK1
kinase. The local enrichment of PI(4,5)P2 enables binding of
domains such as ENTH in epsins that can begin to curve the membrane
and assemble clathrin cages using their clathrin box motif and also
attract additional adapter proteins via yet more SLiMs. In turn,
additional sets of SLiM-bearing proteins stimulate the actin
filament formation and attachment, necessary to fold and pull the
invagination into the cytosol. Later, dynamin binds directly to
PI(4,5)P2 on the membrane to complete the scission process. Once in
the cytosol, the clathrin-coated vesicles are soon dismantled and
the contents are included into the early endosomes. [For recent
reviews of the process, see (28–30).] Many viruses enter the cell
via endocytosis, using many different cell surface receptors (31).
Viruses such as HIV and hepatitis C virus depend on the recognition
of more than one receptor for entry, but in many cases, the
stoichiometry of receptor engagement is unknown. Coronaviruses can
enter cells
through different routes that include RME and cell-cell fusion
(32). In the case of SARS-CoV, the main entry route is endocytic
and depends on endosome acidification (33, 34). However,
protease-mediated activation of the spike protein relieves the pH
dependence of viral entry, indicating that acidification is not a
requirement per se, but acts by inducing the endosomal cleavage of
the spike protein required for viral fusion (35, 36). The
spike protein is cleaved either by the transmembrane protease
serine 2 (TMPRSS2) at the cell surface or by cathepsin L within
endosomes (37). The same entry route and proteases are used by
SARS-CoV-2, and the use of endocytosis inhibitors indicates that
the main entry route also seems to be endocytic (4, 38).
Autophagy is an evolutionarily conserved process in eukaryotes
with multiple cellular roles that include the regulation of
cellular homeostasis through the catabolism of cell components,
immune development, and the host cell response to infection through
pathogen phagocytosis (39). Viruses have evolved mechanisms to
block the host cell antiviral response and can further hijack
autophagy components to promote their survival and replication.
This can be done through viral mimicry of host proteins
coordinating autophagy or through the direct inhibition of the
host autophagy machinery (40). Coronaviruses exploit the autophagy
machinery through different mechanisms (41, 42). For example,
MERS-CoV targets the BECN1 autophagy regulator for degradation,
blocking the fusion of autophagosomes and lysosomes and protecting
the virus from degradation (43). Coronaviruses repurpose cellular
membranes to create double-membrane vesicles (DMVs) onto which the
replication-transcription complex (RTC) is assembled, a process
that involves recruitment of multiple autophagy components
(41, 44, 45). DMVs in SARS-CoV-2 confine viral
double-stranded RNA (dsRNA), concealing the viral genome from the
innate immune system (46). RTCs of the betacoronavirus mouse
hepatitis virus (MHV) assemble by recruiting LC3-I, a nonlipidated
form of the autophagy-associated protein LC3 (microtubule-associated
protein 1A/1B–light chain 3) (41, 47), and SARS-CoV RTCs also
colocalize with LC3 (44). Proximity-based mass spectrometry on the
MHV replication complex further revealed that the RTC environment
repurposes components from the host autophagy, vesicular
trafficking, and translation machineries (45).
In the present work, we identify a set of conserved SLiM
candidates in the ACE2 and integrin proteins, which are likely to
act in the cell entry system of SARS-CoV-2 and provide molecular
links to understand how the virus recognizes target membranes,
enters into cells, and repurposes intracellular membrane components
to drive its replication. These molecular links might provide
previously unidentified clues toward drugging SARS-CoV-2
infections. We first focus on the extracellular SLiMs, before
moving across the membrane to examine the cytosolic potential of
the receptor tails. In a concurrently published paper, experimental
testing of several motifs in the receptor tails is presented
(48).
RESULTS
Extracellular receptor interplay and viral hijacking in the
ACE2/integrin system
The identified RGD motif in the spike
protein marks integrins as candidates for acting as co-receptors
for SARS-CoV-2 entry. However,
similarly to most SLiMs, the integrin-binding RGD motif has a
low sequence information content, and the chance of random
occurrence in protein sequences is relatively high. Therefore, the
mere presence of an RGD motif in a sequence is not a strong
indication of actual integrin binding. However, there are several
features that make the spike-integrin interaction via the RGD motif
plausible, including sequence- and structure-level information,
gene expression profiles, the presence of accessory motifs, and
protein-protein interactions. In the next sections, we review how
this information gives credibility to the functional nature of the
spike protein RGD as an integrin-binding motif and, more generally,
to the existence of integrin hijacking by SARS-CoV-2.
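The point about low information content can be made concrete with a back-of-the-envelope estimate. The sketch below assumes independent, uniformly distributed residues (a deliberate simplification, since real amino acid frequencies are not uniform) and an illustrative sequence length of 1273 residues, which is assumed here as the approximate size of the spike protein and is not taken from the text above.

```python
import math

def expected_random_matches(seq_length, motif_length=3, alphabet_size=20):
    """Expected number of chance matches of one fixed motif of the given
    length, assuming independent and uniformly distributed residues."""
    positions = max(seq_length - motif_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** motif_length

def prob_at_least_one(seq_length, motif_length=3, alphabet_size=20):
    """Poisson approximation for the chance of seeing one or more matches."""
    lam = expected_random_matches(seq_length, motif_length, alphabet_size)
    return 1.0 - math.exp(-lam)

# Assumed length for illustration (roughly spike-protein sized).
length = 1273
print(expected_random_matches(length))
print(prob_at_least_one(length))
```

Under these assumptions a fixed tripeptide is expected by chance about once in every 8000 positions, which is why supporting evidence beyond the sequence match is needed.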
The evolution of integrin-binding motif candidates within RBDs
in the spike protein highlights that while the RGD motif is not
conserved, the integrin-binding capacity might have evolved
convergently in several betacoronaviruses. Owing to the high rate
of recombination in coronaviruses (49), it is challenging to build
proper phylogenies to trace their evolution. However, simply
aligning homologs of the RBD from the Betacoronavirus genus
(Fig. 1A) already shows that the RGD motif candidate is
located in a locally less conserved region, hinting at the rapid
evolvability of the site. The closest known homolog of SARS-CoV-2
is the RaTG13 bat coronavirus containing TGD instead of RGD, which
is incompatible with integrin binding. However, while the RGD motif
itself is not conserved, several other members of the
Betacoronavirus genus harbor other possible integrin-binding
motifs. SARS-CoV and several of its close homologs, such as
BM48-31/BGR/2008, contain KGD at this site. KGD can bind integrin
as part of disintegrin binding, such as in the snake venom
barbourin (50), but because disintegrins lacking KGD also bind
integrin (51), and there is no evidence of KGD binding independent
of disintegrins, we think that SARS-CoV KGD is less likely to be an
active integrin ligand.
Considering more distant homologs of SARS-CoV-2, it becomes
evident that the presence of an RGD/KGD site is not a universal
feature of betacoronaviruses. The RBD of a moderately related
Rousettus bat coronavirus does not contain any of the three
residues of the RGD (Fig. 1B). However, other even more
distant coronavirus sequences show a different potential integrin
targeting motif at the same site. OC43 is a betacoronavirus that is
one of the pathogens causing the common cold. Several OC43 RBD
sequences show an NGR motif in nearly the same position as the
SARS-CoV-2 RGD. NGR is an integrin interaction motif that becomes
active upon the nonenzymatic natural deamidation of the asparagine
residue preceding a glycine to isoaspartic acid, forming an
L-isoDGR site, which is recognized by several αv integrins, as well
as integrin α5β1 (52). The parallel evolutionary emergence of
potential integrin-binding motifs at this location indicates that,
despite
the lack of conservation at the site, the SARS-CoV-2 RGD motif
might be functional.
Fig. 1. The RGD motif of the SARS-CoV-2 spike protein. (A)
Multiple sequence alignment of a part of the SARS-CoV-2 spike RBD
region using homologous sequences from betacoronaviruses of various
evolutionary distances and showing the location of potential
integrin-binding motifs in black. Virus names together with the
host organisms, UniProt accessions (*or GenBank accession in the
case of RaTG13), and sequence region numberings are shown on the
left side of the alignment. The location of the region shown in the
alignment is indicated in a representative diagram of the spike
protein, together with the location of the RGD motif and the region
responsible for ACE2 binding. (B) Neighbor-joining tree of the
multiple sequence alignment, with this particular set of sequences
containing the potential high affinity, low affinity, and reverse
integrin-binding motifs (RGD, KGD, and NGR) shown in red, orange,
and green boxes, respectively. Only the sequence regions shown in
(A) were used in the calculation of the tree. (C) Structure of the
SARS-CoV-2 RBD as seen in the ACE2-bound form (PDB:6m17). The RGD
motif is shown in red sticks. Regions in direct contact with ACE2
are shown in blue. Residues with missing atomic coordinates
(indicating flexibility) in the unbound trimeric spike protein
structures (PDB:6vsb, 6vxx, and 6vyb) are shown in transparency.
Alignment and tree were prepared in Jalview (226) with Clustal
colors. Structure was visualized using UCSF Chimera (228).

Normally, the functional importance of a protein region
correlates with its conservation. Checking for sequence variances
in the SARS-CoV-2 spike protein RGD motif across isolates showed
that all 8841 (when checked on 9 June 2020) high-quality full spike
protein sequences in GISAID (Global Initiative on Sharing Avian
Influenza Database) (53, 54) contain the RGD region
together with the two flanking residues. While normally a fully
conserved site would indicate functional importance, the full spike
protein sequence shows very little variation among isolates, with
some standard conservation scores (55) giving a value of 1
uniformly across the whole spike protein sequence.
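To illustrate why a conservation score of 1 across near-identical isolates carries little information, the sketch below computes a simple per-column score from an alignment as one minus the normalized Shannon entropy. This particular score is an assumption chosen for illustration and is not necessarily the score used in (55); the two five-residue alignments are toy data, not real isolate or homolog sequences, and gaps are not treated specially.

```python
import math
from collections import Counter

def column_conservation(column, alphabet_size=20):
    """1 - normalized Shannon entropy of one alignment column (1.0 = invariant)."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)

def alignment_conservation(aligned_sequences):
    """Score every column of an alignment given as equal-length strings."""
    return [column_conservation(col) for col in zip(*aligned_sequences)]

# Toy alignments (hypothetical): identical isolates vs. diverged homologs.
toy_isolates = ["IRGDE", "IRGDE", "IRGDE", "IRGDE"]
toy_homologs = ["IRGDE", "IRGDE", "IKGDE", "ITGDE"]

print(alignment_conservation(toy_isolates))
print(alignment_conservation(toy_homologs))
```

When every sequence in the set is nearly identical, all columns score 1 and the motif cannot be distinguished from its surroundings; only a set with more divergence reveals which positions are selectively maintained.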
The structural features of the SARS-CoV-2 spike protein RGD
motif are compatible with integrin binding. At the time of
reporting the RGD motif, no SARS-CoV-2 spike protein structures
were available, so the authors used structural homology modeling to
determine that the RGD motif is surface accessible (12). Since
then, several RBD structures have been determined, in both unbound
(5, 56) and ACE2 complexed forms using electron microscopy
(57) and X-ray diffraction (58), allowing for the direct structural
assessment of the possibility of binding to integrins. In the
sequence, the RGD motif and the ACE2 binding site do not overlap
(see the schematic in Fig. 1A); however, in the RBD
structural fold, the RGD motif is largely surrounded by residues
binding to ACE2 (Fig. 1C). This indicates that ACE2 binding
obscures the RGD motif and the two interactions would be mutually
exclusive on a single copy of the RBD. However, in the uncomplexed
structures, the residues that surround the RGD site are flexible,
whereas the RGD motif is surface accessible and is in the
appropriate β-turn conformation for binding integrins. Thus, without
ACE2, the interaction with integrins is not sterically
blocked.
The spike protein is heavily glycosylated in its functional
form. A comprehensive glycosylation analysis of the spike protein
showed that the ACE2 binding site can be partially shielded by
structurally nearby glycans located at Asn165, Asn234, and Asn343.
However, the spike protein RBD has two alternative conformations,
and this shielding by glycans only happens in the “down”
conformation. Similarly, the glycans do not shield the RGD motif in
the binding-competent “up” conformation (5, 59), and
therefore, the RGD is accessible for interaction.
Given that the spike protein exists as a trimer on the virion
surface, different copies of the RBD can, in theory, interact with
ACE2 and integrins at the same time. Under the right structural
settings, even two copies of the RBD in the same spike protein
trimer can bind to ACE2 and integrins. The feasibility of such an
interaction depends on the spatial orientation of the integrin:ACE2
complex, which has been shown to form naturally (60). Although we
know that the interaction is between ACE2 and the subunit of the
integrin dimer, there is no solved structure of the ACE2-integrin
complex. However, further structural consideration may indicate
whether the spike-ACE2 and the spike-integrin interaction can
coexist within the same spike protein trimer (fig. S1). The
ectodomains of both ACE2 and integrins in the open conformation
are roughly the same length measured from the membrane, being about
100 Å, depending on the conformation of the integrin dimer [based
on available structures; PDB:6m17 (57) and PDB:6avr (46)]. This
means that the RGD-binding site of integrins and the RBD-binding
regions of ACE2 are relatively close in space. In addition, in the
ACE2 binding-competent up conformation of the RBDs, the distance
between pairs of RBDs is about 66 Å [based on the structure
PDB:6x2b reported in (61)]. Thus, the simultaneous binding of an
integrin dimer and an ACE2 dimer to the same spike protein trimer
would orient ACE2 and the integrin to have the correct distance and
orientation for the integrin subunit to bind ACE2.
The sequence and structure context of the RGD motif can
indicate possible target integrins. RGD motifs are recognized by
several integrins, and specificity is determined mostly by the
flanking residues of the core motif. As evidenced by crystallized
integrin dimer-ligand complexes, the residue preceding RGD is in
contact with the α subunit, whereas the residue after the core motif
interacts with the β subunit. The immediate context of the SARS-CoV-2
RGD motif is 402-IRGDE-406 (Fig. 1A), which can give an
indication about possible integrin targets. IRGD can be found in
several native integrin-binding partners, including FREM1 (62),
MFAP4 (63), and IGFBP1/2 (64, 65). These extracellular matrix
proteins target integrins with αv, α5, and α8 subunits. RGDE is
present in the native human integrin ligands TGFBI, osteolectin,
collagen α-1(VI) chain, PSBG-9, and polydom, and in vitro and
in vivo binding studies of the specificity profiles of these
proteins (66–71) highlighted a post-RGD Glu to be efficient in
binding to β1, β2, and β3 integrin subunits. Correlating these
preferences with possible α- and β-integrin subunit pairings points
to the most likely candidate target integrins for SARS-CoV-2 being
αvβ1, αvβ3, α5β1, and α8β1. However, in vivo and in vitro
integrin-binding studies have indicated that various αv and α5β1
integrins share a large overlap in binding specificity for ligands,
and therefore, any of these integrins might play a role in
SARS-CoV-2 cell attachment and infection.
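The step of correlating flank-based subunit preferences with possible subunit pairings amounts to a filter over the known heterodimers. The sketch below assumes the commonly cited list of eight RGD-binding integrin heterodimers, which is supplied here as an assumption and is not enumerated in the text; the preference sets restate the subunits named in the paragraph above.

```python
# Assumed list of RGD-binding integrin heterodimers (alpha, beta), taken
# from general integrin literature rather than from this article.
RGD_BINDING_INTEGRINS = [
    ("αv", "β1"), ("αv", "β3"), ("αv", "β5"), ("αv", "β6"), ("αv", "β8"),
    ("α5", "β1"), ("α8", "β1"), ("αIIb", "β3"),
]

# Subunit preferences suggested by the residues flanking the RGD core.
alpha_preferred = {"αv", "α5", "α8"}   # from the pre-RGD residue (IRGD)
beta_preferred = {"β1", "β2", "β3"}    # from the post-RGD residue (RGDE)

def candidate_integrins(pairs, alphas, betas):
    """Keep heterodimers whose alpha AND beta subunit are both preferred."""
    return [(a, b) for a, b in pairs if a in alphas and b in betas]

for alpha, beta in candidate_integrins(
        RGD_BINDING_INTEGRINS, alpha_preferred, beta_preferred):
    print(alpha + beta)
```

With these inputs the filter retains four heterodimers, matching the candidates listed in the text; a different assumed pairing list would change the outcome.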
Most RGD-binding integrin dimers recognize the partner RGD motif
in a long loop conformation that fits into the deep binding pocket
of the receptor (fig. S2A), including the integrin candidates
identified by the RGD-flanking residues. However, available
structures highlight that αvβ6 integrins have a different structural
preference in their ligands. In this binding mode, the ligand is
only in contact with the integrin α subunit via the Arg residue of
the RGD motif. Therefore, the α subunit plays little role in specific
ligand recognition. In contrast, the region following the RGD motif
adopts an α helix and binds to the β-integrin subunit (fig. S2B). In
most known cases, this interaction is stabilized by two small
hydrophobic residues fitting into two hydrophobic pockets on the
surface of integrin β6, establishing contacts with the three
specificity-determining loops (72), conforming to a pattern of
xRGDφxxφ, where φ indicates a hydrophobic residue and x indicates
any residue. This binding mode is known to be used by the growth
factors transforming growth factor–β1 (TGF-β1) and TGF-β3 (72), and it
is also mimicked by the cell attachment loop of the FMDV for cell
entry (73). In its unbound state, the RGD motif of the SARS-CoV-2
spike protein RBD resides in a loop, followed by a helical
structure containing two small hydrophobic residues, reminiscent
of bound structures of αvβ6 ligands (fig. S2C). While the RBD is
stabilized via three disulfide bridges, the RGD motif–containing
region is on the far side of the domain. In addition, this
region, together with the ACE2 binding site, has the highest average
B-factor of the whole spike protein trimer (fig. S2D), hinting at a
possible structural rearrangement to accommodate the binding.
A major difference between the TGF-β–type ligands and the RBD
sequence is that the RBD contains an extra residue between the RGD
and the two hydrophobic residues, conforming to a pattern of
RGDxφxxφ instead. On the basis of current knowledge, it is unclear
how this would influence integrin binding; however, there are known
αvβ6 ligands that also deviate from the TGF-β subtype. Fibrillin-1
contains an integrin-binding region with the sequence RGDNGDTACSN,
and it is a known ligand for integrins α5β1, αvβ3, and αvβ6 (74).
The deviation from the canonical TGF-β–type motif is possibly a
compromise between the hitherto undescribed specificity
determinants of the three integrins, resulting in binding to
several receptors with reduced affinity.
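The two sequence patterns discussed above, xRGDφxxφ and RGDxφxxφ, can be written as regular expressions and compared on short peptides. This is a sketch under stated assumptions: the hydrophobic class φ is approximated as [AVLIMF], and the example peptides are short fragments assumed for illustration (a TGF-β1-like peptide, a spike fragment extending the 402-IRGDE-406 context, and the fibrillin-1 region quoted in the text). A regular expression match says nothing about the helical structure that the binding mode also requires.

```python
import re

# Assumed approximation of the hydrophobic class φ; the true set of
# tolerated residues is not specified in the text.
PHI = "[AVLIMF]"
CANONICAL = re.compile(rf".RGD{PHI}..{PHI}")   # xRGDφxxφ
SHIFTED = re.compile(rf"RGD.{PHI}..{PHI}")     # RGDxφxxφ

def classify_rgd_context(peptide):
    """Report which of the two sequence patterns a peptide matches."""
    if CANONICAL.search(peptide):
        return "xRGDφxxφ"
    if SHIFTED.search(peptide):
        return "RGDxφxxφ"
    return "neither"

# Peptides assumed for illustration only.
examples = {
    "TGF-β1-like": "GRRGDLATIHG",
    "spike fragment": "VIRGDEVRQIAP",
    "fibrillin-1 region": "RGDNGDTACSN",
}
for name, peptide in examples.items():
    print(name, classify_rgd_context(peptide))
```

Under these assumptions the fibrillin-1 region matches neither pattern, in line with the statement that known ligands can deviate from the TGF-β subtype.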
Motif-domain interactions are typically under heavy spatiotemporal
regulation. Hence, the SARS-CoV-2 RBD-integrin binding can
only occur if the possible target integrins are expressed on the
infected host cells. Integrins α5β1 (75) and αvβ3 (76–78), at least,
have been observed in lung epithelial cells (the primary cells of
infection in the lung) and are implicated in the emergence and
progression of various diseases, including emphysema, non–small
cell lung cancer, and mechanical injury of the lungs (79).
SARS-CoV-2 infection has been observed to cause damage in various
other tissues as well, including the heart, blood vessels, liver,
and kidney (80). αv integrins are near ubiquitous in major human
tissues (81) and have been observed in all organs with observed
damage from SARS-CoV-2 infections.
There are several other factors that point to an interplay
between ACE2 and various integrins under normal cellular
conditions. It has been shown that in heart tissues, ACE2 is able
to bind the 1 and 5 subunits of integrins in an RGD-independent
manner, enhanc-ing cell adhesion and regulating integrin signaling
via the focal ad-hesion kinase (FAK) (60). It is unclear whether
ACE2 interacts with integrins from the same cell, suppressing
integrins by locking them in an inactive conformation, or adherent
cells, acting as a direct in-hibitor of integrins. However, the
functional link indicates that in-tegrins and ACE2 are expressed on
the surface of the same cells in certain tissues, further
corroborated by large-scale expression data (81). Furthermore, the
RGD independence of the interaction means that while ACE2 and
integrins are in complex, the RGD-binding site of the integrin is
unoccupied, leaving it available for a potential interaction with a
spike protein trimer.
Apart from the known interplay between ACE2 and integrins, there
are additional features that indicate an even tighter cross-talk
between the two receptors. RGD-mediated interaction with integrins is
metal-mediated (via divalent cations like Mg2+ or Mn2+), and all
integrins have a so-called “metal ion–dependent adhesion site”
(MIDAS) motif (DxSxS) (82). The integrin MIDAS structural motif is
located near the ligand-binding site on the β subunit and is
essential for binding, as side chains belonging to the motif and
an acidic residue from the ligand coordinate the metal ion together
(83). ACE2 also has a similar DxSxS motif (see Table 1) that
might facilitate interactions with ligands that are recognized by
integrins, creating an overlap between the ligand-binding profiles
and regulation of the two receptors. In the known structures where
spike protein is bound to ACE2, the RGD motif is not in contact
with the ACE2 MIDAS (57). However, the MIDAS motif is highly
conserved across species (see Fig. 2) and surface exposed. The
conserved ACE2 MIDAS motif partially overlaps with a semiconserved
NxT glycosylation motif, and the attached carbohydrate is present
in solved ACE2 structures (57). This glycosylation does not
directly affect the MIDAS’s acidic residue, which might play the
main role in ligand binding. Consequently, the ACE2 MIDAS may still
be involved in mediating an interaction with an RGD-like motif,
potentially serving as a parallel mechanism for binding the spike
protein.
Extracellular proteases are native modulators of cell surface
receptors, and the SARS-CoV-2 spike protein uses these proteases
to enhance infection. ACE2 and several integrin subunits require
proteolytic cleavage for biological activity. Integrin subunits α3,
α5, α6, and αv are cleaved by furin or furin-like proprotein
convertases (PCs) during maturation (84, 85). Nearly all PCs
contain an RGD motif, and while its role in integrin binding is not
clear, the
motif has been shown to be required for proper functioning for
several PCs (86–88). The SARS-CoV-2 spike protein contains a
furin-like cleavage site that is absent from closely related spike
proteins, immediately following the RBD (89). This cleavage is
essential for infection of human lung cells (90) and results in
increased virulence. A structural effect of the cleavage might be
to allow greater movement of the RBD, potentially aiding in
exploring a larger space around the RBD-binding region of ACE2. The
cleavage by furin has also been shown to create a new SLiM in the
spike protein, conforming to the C-end rule ([RK]xx[R]$ CendR motif
where $ indicates the C-terminus of the protein,
ELM:LIG_NRP_CendR_1; see Table 1) and mediating attachment to
host cell surface via neuropilin-1 and neuropilin-2 (NRP1 and NRP2)
(91). Similarly to ACE2, NRP1 physically interacts with integrin β1
and regulates integrin signaling (text S1 and fig. S8, A and B)
(92, 93). The binding of NRP1 to peptide C termini may be
associated with cooperative heparin binding (94); the SARS-CoV-2
S1/S2 cleavage site contains a heparin-binding motif (RRxR) that
may partly explain the higher binding affinity of the SARS-CoV-2
spike protein for heparin, compared with SARS-CoV and MERS (95),
and the inhibition of SARS-CoV-2 infection by heparin (96).
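The way furin cleavage exposes a C-terminal CendR motif can be illustrated with a minimal Python sketch. It uses the regular expression and the 682 to 687 fragment (RRAR|SV) listed in Table 1; the helper name `has_cendr` is an illustrative assumption, and a regular-expression match only flags a candidate, not a validated interaction.

```python
import re

# Regular expression for the CendR motif as listed in Table 1; "$" anchors
# the match to the C terminus of the peptide.
CENDR = re.compile(r"[RK].{0,2}[R]$")

# Spike residues 682-687 as written in Table 1; "|" marks the cleavage point.
SITE = "RRAR|SV"


def has_cendr(peptide: str) -> bool:
    """Return True if the peptide ends in a CendR-like motif (hypothetical helper)."""
    return CENDR.search(peptide) is not None


uncleaved = SITE.replace("|", "")        # "RRARSV": motif is internal
n_terminal_product = SITE.split("|")[0]  # "RRAR": motif ends the chain

print(has_cendr(uncleaved))           # False
print(has_cendr(n_terminal_product))  # True
```

Before cleavage, the basic residues are followed by Ser-Val, so the `$` anchor fails; after cleavage, the same residues sit at the new C terminus and the pattern matches.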
ACE2 is cleaved by several proteases, including TMPRSS2 (97).
ACE2 binds to TMPRSS2, forming a receptor-protease complex (98).
TMPRSS2 is also known to cleave the spike protein of both SARS-CoV
and MERS-CoV (99), augmenting their entry into the host cell (97).
Similar results have been reported for SARS-CoV-2, for which TMPRSS2
was found to be fundamental for cell entry (4). This dependence is
most probably twofold: On one hand, TMPRSS2 is needed for ACE2
activation; on the other hand, the SARS-CoV-2 spike protein also
contains a TMPRSS2 cleavage site (100).
SLiM candidates in the ACE2 receptor intrinsically disordered tail
Recent structural analysis provided experimental evidence that the
ACE2 tail is intrinsically disordered across the region following the
transmembrane helix (residues 769 to 805) (57), as is also predicted
from sequence analysis. The ACE2 sequence (UniProt: ACE2_HUMAN) was
entered in the ELM server (23), which returned several relevant
candidate SLiMs in the short cytosolic C-terminal tail. Because SLiMs
are so short, it is difficult to obtain reliable results in sequence
searches. Contextual information, including cell compartment
localization and functional relevance, is important in deciding
whether a motif candidate is worth testing experimentally (101).
Furthermore, in intrinsically unstructured protein sequences, amino
acid conservation is usually indicative of functional interactions.
Therefore, an alignment of vertebrate ACE2 proteins was prepared. The
most deeply diverged organism with a sequenced ACE2 gene is the
hagfish, a jawless fish included in the subphylum Vertebrata, although
it lacks vertebrae (102). All of the detected motif matches in human
ACE2 [shown in Table 1 together with potential binding partner domains
defined using Pfam (103) and InterPro (104)] were conserved in
mammals, most were conserved in birds and mammals, and some were
conserved in extant reptiles (Fig. 3). These groups diverged from one
another >300 million years ago (105). However, whereas the NPY motif,
for example, is absent in reptiles, it is present in bony fish ACE2
sequences and also in the hagfish, indicating that NPY has been lost
in the reptile lineage. The hagfish sequence shares all of the
candidate
motifs present in the human ACE2 tail, although it is >500 million
years since their lineages diverged (102). In addition to the strong
evolutionary conservation of these candidate motifs, their functional
contexts are also biologically coherent, involving signaling by
tyrosine kinases, endocytosis, autophagy, and actin filament induction
(Table 1). In the following subsections, we briefly summarize each of
the conserved motifs and their possible role in the viral entry
mechanism.
The ACE2 tail contains a candidate YxxPhi endocytic sorting signal.
The YxxPhi motif binds the μ2 subunit (UniProt: AP2M1_HUMAN) of the
endocytosis AP2 adaptors by β-augmentation (106). It is found in
numerous cell surface receptors that have intrinsically disordered
C-terminal tails (107). A small selection is listed in the database
entry ELM:TRG_ENDOCYTIC_2, and although the motif has not been
validated in ACE2, it is highly conserved (Fig. 3). When the Tyr is
phosphorylated, this motif becomes an SH2-binding site, whereas in the
apo form, it binds the μ2 adapter. Therefore, this motif can operate
as a molecular switch. The residue following the Tyr makes a β-strand
interaction and therefore cannot be a proline (PDB:1bxx). The phi
position requires a bulky hydrophobic residue. The motif pattern can
be represented by the regular expression Y[^P].[LMVIF], and this motif
is conserved in ACE2 of all mammals except monotremes. Thus, the
mammalian ACE2, which internalizes the coronavirus, has a SLiM
candidate for internalization appropriately located within its
cytosolic tail. The ACE2 tail sequence was found to bind with moderate
affinity to the AP2 μ2 subunit (48), well within the 30 to 100 μM
range of biologically relevant affinities.
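As a minimal sketch of how the YxxPhi pattern maps onto the ACE2 tail, the regular expression above can be applied to the tail fragment given in Table 1 (ENPYASIDI, residues 778 to 786). The helper `find_motif` and the proline variant are illustrative assumptions, not part of the ELM tooling.

```python
import re

# YxxPhi pattern as given in the text: Tyr, any residue except Pro,
# any residue, then a bulky hydrophobic residue.
YXXPHI = re.compile(r"Y[^P].[LMVIF]")

# ACE2 tail fragment and the residue number of its first position (Table 1).
FRAGMENT, FIRST_RESIDUE = "ENPYASIDI", 778


def find_motif(pattern, seq, first_residue):
    """Return (start, end, matched text) in protein numbering, or None."""
    m = pattern.search(seq)
    if m is None:
        return None
    return first_residue + m.start(), first_residue + m.end() - 1, m.group()


print(find_motif(YXXPHI, FRAGMENT, FIRST_RESIDUE))     # (781, 784, 'YASI')

# Hypothetical variant with Pro after the Tyr: excluded by [^P].
print(find_motif(YXXPHI, "ENPYPSIDI", FIRST_RESIDUE))  # None
```

The reported coordinates agree with the 781 to 784 YASI instance listed in Table 1.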
The region encompassing the YxxPhi motif overlaps with a candidate
SRC homology 2 (SH2) domain–binding motif (Fig. 3) that is created
upon phosphorylation of Tyr781. SH2-binding motifs are characterized
by an invariant phosphotyrosine (pY) that is created following
tyrosine kinase activation and allows binding to more than 100 types
of SH2 domains present in human proteins (108). The pY residue is
accompanied by additional binding determinants that frequently involve
hydrophobic residues at the pY + 3 position but can also involve other
combinations, such as Asn at pY + 2 in Grb2-specific SH2 motifs or
hydrophobic residues at pY + 4 in STAP-1 SH2 motifs (110, 112). Most
SH2 motifs are also characterized by the exclusion of residues at
certain positions following the pY, and in general, SH2-binding motifs
show a high degree of cross-specificity (109, 112), limiting the power
of bioinformatics predictions.
Cell culture infection assays with different coronaviruses, including
SARS-CoV, have shown susceptibility to tyrosine kinase inhibitors,
indicating the involvement of host tyrosine phosphorylation
(2, 25–27). The sequence found in ACE2 (781-YASID-785) matched the
regular expression (Y)[DESTNA][^GWFY][VPAI][DENQSTAGYFP] defined in
the ELM database for the SH2 domain present in NCK1/2 proteins, which
belong to the class IA SH2 domains (110). No other SH2 entry
catalogued in ELM matched the
Table 1. Known and predicted SLiMs in SARS-CoV-2 host-entry interactions. Previously identified motifs are marked with (✓). Regular expressions follow POSIX definitions (23). The symbols ‘x’ and ‘.’ mark any residues in the definition of main residues and regular expressions.
Region; Protein (UniProt accession); Motif; ELM class*; Main residues; Regular expression; Start; End; Sequence†; Binding domain‡; Interaction partner§; Interaction type
Extracellular; SARS-CoV-2 spike protein (P0DTC2); RGD; LIG_RGD; RGD; RGD; 403; 405; RGD; PF00362 and PF01839; RGD-binding integrins, most probably α5β1 and αvβ3; Host:virus
Extracellular; SARS-CoV-2 spike protein (P0DTC2); Multibasic cleavage sites (✓); –; RRxR; –; 682; 687; RRAR|SV; PF00082 or IPR001254; Furin-like PCs/TMPRSS2; Host:virus
Extracellular; SARS-CoV-2 spike protein (P0DTC2); Multibasic cleavage sites (✓); –; KxxKR; –; 811; 817; KPSKR|SF; PF00082 or IPR001254; Furin-like PCs/TMPRSS2; Host:virus
Extracellular; SARS-CoV-2 spike protein (P0DTC2); CendR (✓); LIG_NRP_CendR_1; RxxR; [RK].{0,2}[R]$; 682; 685; RRAR; PF00754; Neuropilin-1; Host:virus
Extracellular; Integrin αv (similar for other α chains) (P06756); Multibasic cleavage sites (✓); –; xKR; –; 888; 892; TKR|DL; PF00082; Furin-like PCs; Host
Extracellular; Integrin β3 (similar for other β chains) (P05106); MIDAS║ (✓); –; DxSxS; D.[TS].S; 145; 149; DLSYS; –; The acidic part of RGD-like ligands; Host
Extracellular; Furin (P09958); RGD; LIG_RGD; RGD; RGD; 498; 500; RGD; PF00362 and PF01839; Possibly RGD-binding integrin dimers; Host
Extracellular; ACE2 (Q9BYF1); MIDAS║; –; DxSxS; D.[TS].S; 543; 547; DISNS; –; Unknown partner with acidic residue via metal ion coordination; Host
Extracellular; ACE2 (Q9BYF1); Multibasic cleavage site (✓); –; R; –; 697; 716; RTEVEKAIRMSRSRINDAFR; IPR001254; TMPRSS2; Host
Region; Protein (UniProt accession); Motif; ELM class*; Main residues; Regular expression; Start; End; Sequence†; Binding domain‡; Interaction partner§; Interaction type
Intracellular; ACE2 (Q9BYF1); I-BAR binding; LIG_IBAR_NPY_1; NPY; NPY; 779; 781; NPY; IPR027681; I-BAR domain–containing proteins like IRSp53 or IRTKS; Host
Intracellular; ACE2 (Q9BYF1); Endocytic sorting signal; TRG_ENDOCYTIC_2; YxxPhi (P excluded after Y); Y[^P].[LMVIF]; 781; 784; YASI; PF00928; Adapter protein complex 2 μ subunit; Host
Intracellular; ACE2 (Q9BYF1); SH2 binding; –; YxxD; ((Y)[DE][^KRHG][DESTAPILVMFYW][^KR])|((Y)[NQSTAILVMFY][^KRHG][ILV][^KR]); 781; 785; YASID; PF00017; SH2 domain of SFKs; Host
Intracellular; ACE2 (Q9BYF1); LIR autophagy; LIG_LIR_Gen_1; ExxYxxx; [EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFM]; 778; 786; ENPYASIDI; PF02991; Related proteins LC3, Atg8, GABARAP. There may be some variation in LIR motif specificity; Host
Intracellular; ACE2 (Q9BYF1); apoPTB; LIG_PTB_Apo_2; Nxx[FY]; (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].); 789; 796; GENNPGFQ; PF08416; PTB-containing protein with a preference for NxxF core motifs; Host
Intracellular; ACE2 (Q9BYF1); PBM; LIG_PDZ_Class_1; TxF$; [ST].[ACVILF]$; 800; 805; DVQTSF; PF00595; PDZ-containing proteins with TxF$ preferences such as NHERF3 and SHANK1; Host
Intracellular; Integrin β3 (P05106); apoPTB (✓); LIG_PTB_Apo_2; Nxx[FY]; (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].); 767; 774; TANNPLYK; PF00373, PF00630; Talins (high affinity), Dok1 (low affinity), Filamin-A (binding to both apoPTB motifs simultaneously); Host
Intracellular; Integrin β3 (P05106); apoPTB (✓); LIG_PTB_Apo_2; Nxx[FY]; (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].); 779; 786; TFTNITYR; PF00373, PF00630; Kindlin, Filamin-A (binding to both apoPTB motifs simultaneously); Host
Intracellular; Integrin β3 (P05106); PTB (✓); LIG_PTB_Phospho_1; Nxx(Y); (.[^P].NP.(Y))|(.[ILVMFY].N..(Y)); 767; 773; TANNPLY; PF08416, PF00640, PF02174; Talins (low affinity), Dok1 (high affinity), Shc (binding to both PTB motifs simultaneously); Host
Intracellular; Integrin β3 (P05106); PTB (✓); LIG_PTB_Phospho_1; Nxx(Y); (.[^P].NP.(Y))|(.[ILVMFY].N..(Y)); 779; 785; TFTNITY; PF00640; Shc (binding to both apoPTB motifs simultaneously); Host
Intracellular; Integrin β3 (P05106); LIR autophagy; LIG_LIR_Gen_1; ExxYxxx; [EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFM]; 777; 783; TSTFTNI; PF02991; Atg8 protein family; Host
Intracellular; Integrin β1 (P05556); ApoPTB (✓); LIG_PTB_Apo_2; Nxx[FY]; (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].); 777; 784; TGENPIYK; PF00373, PF10480, PF00630; Talins (high affinity), Dok1 (low affinity), ICAP-1, Filamin-A (binding to both apoPTB motifs simultaneously); Host
Intracellular; Integrin β1 (P05556); ApoPTB (✓); LIG_PTB_Apo_2; Nxx[FY]; (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].); 789; 796; TVVNPKYE; PF00373, PF00630; Kindlin, Filamin-A (binding to both apoPTB motifs simultaneously); Host
Intracellular; Integrin β1 (P05556); PTB (✓); LIG_PTB_Phospho_1; Nxx(Y); (.[^P].NP.(Y))|(.[ILVMFY].N..(Y)); 777; 783; TGENPIY; PF10480, PF00640, PF02174; Talins (low affinity), Dok1 (high affinity), ICAP-1, Shc (binding to both PTB motifs simultaneously); Host
Intracellular; Integrin β1 (P05556); PTB (✓); LIG_PTB_Phospho_1; Nxx(Y); (.[^P].NP.(Y))|(.[ILVMFY].N..(Y)); 789; 795; TVVNPKY; PF00640; Shc (binding to both PTB motifs simultaneously); Host
*Motif identifier as in the ELM resource. †“|” denotes cleavage points for protease-recognition motifs. ‡Defined through use of Pfam (103) or InterPro (104), where applicable. §PC, proprotein convertases. ║Not a SLiM but a structural motif.
tail. Proteins known to contain this motif are listed in entry
ELM:LIG_SH2_NCK1_1. We have since learned that an ACE2 tail peptide
phosphorylated at Tyr781 (pTyr781) does not bind to NCK1 (48). Upon
re-examination of the SPOT arrays in (111, 112), we noted that the
strong preference at pY + 3 is for Val and Pro. Although Ile is
tolerated at pY + 3 in the context of the high-affinity EPEC Tir
(enteropathogenic Escherichia coli translocated intimin receptor)
sequence (111), it is not tolerated in the context of random peptide
pools (112). This would indicate that NCK can only tolerate a weak Ile
residue at pY + 3 when a strong residue such as Glu or Asp is found at
pY + 1, such as Asp in EPEC Tir. The presence of the weak aliphatic
residue Ala at pY + 1 in ACE2 would explain the lack of binding for
the ACE2 tail motif. This evidence indicates that the ELM pattern
needs correcting to allow only one weak amino acid, at either pY + 1
or pY + 3, in the regular expression.
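The suggested correction can be sketched as follows. The first pattern is the NCK1/2 expression quoted above; the second is a hypothetical revision written only for illustration, under the assumption that the "strong" residues are Asp/Glu at pY + 1 and Val/Pro at pY + 3, with the remaining positions left unchanged. It is not a pattern taken from ELM. In ELM notation "(Y)" marks the phosphotyrosine; in Python it is simply a capturing group around Y.

```python
import re

# NCK1/2 SH2 pattern as quoted in the text.
NCK_ELM = re.compile(r"(Y)[DESTNA][^GWFY][VPAI][DENQSTAGYFP]")

# Hypothetical revision (assumption): at most one weak residue across
# pY+1 and pY+3, i.e. either a strong pY+1 (D/E) or a strong pY+3 (V/P).
NCK_REVISED = re.compile(
    r"(Y)[DE][^GWFY][VPAI][DENQSTAGYFP]"      # strong pY+1, any allowed pY+3
    r"|(Y)[DESTNA][^GWFY][VP][DENQSTAGYFP]"   # any allowed pY+1, strong pY+3
)

ace2 = "YASID"  # ACE2 residues 781-785; weak Ala at pY+1, weak Ile at pY+3

print(bool(NCK_ELM.fullmatch(ace2)))         # True: original pattern matches
print(bool(NCK_REVISED.fullmatch(ace2)))     # False: two weak positions
print(bool(NCK_REVISED.fullmatch("YDSID")))  # True: hypothetical peptide, Asp at pY+1
print(bool(NCK_REVISED.fullmatch("YASVD")))  # True: hypothetical peptide, Val at pY+3
```

Under these assumed residue classes, the revised pattern no longer matches the ACE2 sequence, in line with the reported lack of NCK1 binding.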
Other class IA SH2 domains with a strong preference for Ile at the +3
position in SPOT arrays include the SH2 domains of the SRC family
kinases (SFKs). A regular expression for SRC family SH2 domains that
allows for weak or strong residues at the +1 and +3 positions and is
compatible with the SPOT arrays could be
((Y)[DE][^KRHG][DESTAPILVMFYW][^KR])|((Y)[NQSTAILVMFY][^KRHG][ILV][^KR])
(Table 1). This pattern matches the ACE2 tail. The ACE2 YASID sequence
has a weak Ala at pY + 1, neutral Ser at pY + 2, and strong Ile and
Asp at pY + 3 and pY + 4, making this a plausible motif for binding
SFKs. Because all human cells have at least one SFK, and they are
involved in regulating endocytosis and actin filament formation
(113–115), their SH2 domains are plausible candidates for binding the
ACE2 tail. For example, Abl kinases have specialized cytoskeletal
remodeling capacity mediated through their actin binding and actin
bundling domains (113), whereas SRC enhances receptor endocytosis and
focal adhesion (FA) remodeling through the phosphorylation of Eps8 and
dynamin2 (115). We also turned to the ModPepInt server, which uses
machine learning techniques to train SH2-binding motif prediction.
ModPepInt has models for 51 SH2 domains (116). A run of the ACE2 tail
sequence returned best matches with several nonreceptor tyrosine
kinases, most harboring class IA SH2 domains that largely overlap with
expectations from the SPOT arrays (the kinases Abl1/2, BLK, FGR, FRK,
HCK, LCK, SRC, FYN, and TEC), plus other predicted binders, such as
the kinase FES and the adaptor proteins GRB10 and GRB14 (table S1).
Kliche et al. then tested the revised SH2 motif assignment to the
SFKs, measuring a low micromolar affinity of the Fyn SH2 domain for
the tyrosine-phosphorylated ACE2 peptide (48).
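A short sketch shows which branch of the proposed SFK pattern the ACE2 sequence satisfies. The function `sfk_branch` and its branch labels are illustrative assumptions, and the two extra peptides are hypothetical test inputs rather than sequences from this study.

```python
import re

# Proposed SRC family SH2 pattern from the text (also listed in Table 1).
# Group 1 is the branch with Asp/Glu at pY+1; group 3 is the branch with
# Ile/Leu/Val at pY+3.
SFK_SH2 = re.compile(
    r"((Y)[DE][^KRHG][DESTAPILVMFYW][^KR])"
    r"|((Y)[NQSTAILVMFY][^KRHG][ILV][^KR])"
)


def sfk_branch(peptide: str):
    """Name the branch a 5-residue pY..pY+4 peptide matches, or None."""
    m = SFK_SH2.fullmatch(peptide)
    if m is None:
        return None
    return "acidic pY+1" if m.group(1) else "hydrophobic pY+3"


print(sfk_branch("YASID"))  # 'hydrophobic pY+3' (ACE2 781-785)
print(sfk_branch("YEEIP"))  # 'acidic pY+1' (hypothetical peptide)
print(sfk_branch("YAKID"))  # None: Lys is excluded at pY+2
```

The ACE2 YASID sequence enters through the second alternative, which tolerates the weak Ala at pY + 1 because Ile occupies pY + 3.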
The residues present at pY + 1, pY + 2, and pY + 4 should rule out
that the ACE2 YASID motif can be a strong Grb2, CRK, or STAP-1 SH2
domain binder, and binding to SH2 domains in the transcription factors
signal transducer and activator of transcription 1 (STAT1), STAT3, and
STAT5 is also unlikely because of the lack of adequate specificity
determinants. However, other SH2 domains, particularly ones with low
observed specificity (e.g., PTPN11_N, PLCgamma1_C, and SH2D1A), could
be recruited by ACE2 when there is coexpression in the same cell type.
Experimental validation will be required to test these hypotheses.
Tyr781 in ACE2 also overlaps with a candidate
phosphorylation-independent NPY I-BAR–binding motif
(ELM:LIG_IBAR_NPY_1). This motif was initially described in the
bacterial secreted protein
Fig. 2. Alignment of ACE2 illustrating conservation of the MIDAS
motif. Multiple sequence alignment of a part of the ACE2
extracellular domain using 25 homologous sequences from different
vertebrate lineages (mammals, birds, reptiles, and fish) and
showing the conservation of the Dx[ST]xS motif as well as an NxT
glycosylation site (main residues displayed above). A red box marks
the conservation range of the MIDAS motif in all sequences but the
hagfish. Organism names, UniProt IDs (UniParc for hagfish), and
sequence numberings are listed on the left side of the alignment.
The location of the region shown in the alignment is indicated in a
representative diagram of the ACE2 protein. Figure was prepared
with Jalview using Clustal colors. TM, transmembrane; C-ter,
C-terminal.
Tir from pathogenic strains of Escherichia coli, such as
enterohaemorrhagic E. coli (EHEC). The NPY tripeptide recognizes and
binds with a 60 μM affinity to inverse Bin-Amphiphysin-Rvs (I-BAR)
domains in adaptor proteins like insulin receptor substrate protein of
53 kDa (IRSp53) and its homolog insulin receptor tyrosine kinase
substrate (IRTKS) (117, 118). I-BAR domains bind to the plasma
membrane to favor weak membrane protrusions, and the preference of
I-BAR domains for negative membrane curvatures enables a positive
feedback loop that can result in the formation of lamellipodia,
filopodia, and other types of membrane protrusions (119–121). IRSp53
and IRTKS are modular proteins that contain SH3 domains that, in turn,
recognize PxxP SLiMs in actin filament regulators like Mena, Eps8, and
mDia1 (122), resulting in the formation of membrane protrusions
through actin filament formation (117, 119–121). Moreover, IRSp53 has
an additional Cdc42-binding motif that can result in direct activation
of neural Wiskott-Aldrich syndrome protein (122). During EHEC
infection, the bacteria use the NPY motif in the transmembrane protein
Tir to recruit IRSp53 (117). IRSp53 acts as a scaffold to localize the
injected bacterial protein EspFU to the cytosolic side of the
bacterial attachment site through the binding of a PxxP motif in EspFU
to the IRSp53 SH3 domain. Through the use of the same helical SLiM
present in NCK (ELM:LIG_GBD_CHELIX_1), EspFU acts as a potent
Wiskott-Aldrich syndrome protein activator, inducing the actin
polymerization that contributes to the pedestal formation
characteristic of EHEC infections (123, 124). The NPY SLiM, although
not yet experimentally validated in any human protein, is potentially
functional in proteins like SHANK2 or the microtubule-binding
CLIP-associating protein 1 (CLASP1), based on protein conservation and
functional association (118). The putative NPY motif in ACE2 is
conserved in all analyzed mammalian and bird homologs (Fig. 3),
suggesting a direct interaction with host I-BAR–containing proteins
such as IRSp53 or IRTKS, which are expressed in lung tissues (81).
The I-BAR domain–binding motif in the cytosolic region of ACE2 could
be relevant for SARS-CoV-2 infection in the following scenario. During
viral cell entry, the NPY motif could recruit I-BAR–containing
proteins such as IRSp53 or IRTKS, resulting in membrane protrusion
formation that could be exploited for viral entry or for cell-to-cell
transmission. It is known that hijacking the filopodia formation
network is beneficial for the entry and spreading of many enveloped
viruses (125), but whether this process is active during coronavirus
infection is still unclear. A second route might cooperate with the
NPY motif in the recruitment of actin cytoskeleton components. A
direct interaction between the cytosolic C-terminal domain of the
SARS-CoV spike protein and the ezrin FERM (4.1 protein, ezrin,
radixin, moesin) domain can occur during the opening of the viral
fusion pore and has been proposed to restrain viral infection (126).
Ezrin is a protein involved in cell morphology and apical membrane
remodeling that acts as a membrane-cytoskeleton linker. Ezrin recruits
F-actin through its C-terminal domain and can also bind to IRSp53
located at negatively curved membranes (127, 128), suggesting that
whereas the NPY motif acts at earlier stages of viral attachment, the
spike protein–ezrin interaction might work during or after viral
fusion to promote the recruitment of actin-regulatory components to
viral fusion sites.
Apart from the endocytic sorting signal, the SH2-binding motif, and
the I-BAR–binding motif, Tyr781 is also part of an LC3-interacting
region (LIR) autophagy motif candidate (Fig. 3). Autophagy, the
recycling of cellular material, is vital for cellular homeostasis.
Many pathogens must control the autophagy response to establish
productive
Fig. 3. Alignment of ACE2 illustrating conserved motifs in the
cytosolic C-terminal tail following the transmembrane helix.
Multiple sequence alignment of ACE2 transmembrane and C-terminal
regions using 25 homologous sequences from different vertebrate
lineages (mammals, birds, reptiles, and fish) and showing their
motif conservation. The names (bold) and key residues of the motifs
are displayed above the alignment (ɸ stands for a bulky hydrophobic
residue), including a conserved tyrosine (bold) and excluded
positions (red and crossed). Red boxes mark the conservation range
of the PDZ-binding motif (PBM) (all sequences) and NPY motif (in
mammals, birds, and some fish). Organism names, UniProt IDs
(UniParc for hagfish), and sequence numberings are listed on the
left side of the alignment. The location of the region shown in the
alignment is indicated in a representative diagram of the ACE2
protein. Figure was prepared with Jalview using Clustal colors.
infection (39). It has been shown that coronaviruses, including those
that infect humans, subvert autophagy components to promote viral
replication at DMVs associated with the RTC (43, 47, 129, 130). The
LIR motif is required for the interaction of a target protein with
autophagy-related protein Atg8 in yeast, or its homologs LC3 and
GABARAP in humans, to facilitate autophagy of the target via the
autophagosome (131). The LIR motif has been catalogued in the ELM
resource entry ELM:LIG_LIR_Gen_1, and ELM detected a candidate motif
in the human ACE2 cytosolic tail sequence (Fig. 3). After the LIR
motif was annotated in ELM, a more recently solved LC3-LIR structure
(PDB:5cx3) showed that the interacting peptide is longer, with one or
two additional hydrophobic interactions (132). The LIR enters a
hydrophobic groove bordered by positively charged residues. A core
[WFY]xx[ILMV] enters the deepest part of the groove. On either side of
the core, the interacting residues can be flexibly spaced. The core
must be preceded by a negatively charged residue (which might be
enabled by phosphorylation). Furthermore, the motif core is followed
by a flexibly spaced hydrophobic residue. There is often a negatively
charged residue preceding this hydrophobic position: It can make
favorable interactions with counter charges but is not an absolute
requirement and so is not included in the revised motif pattern. On
the basis of the structure (PDB:5cx3) and some SPOT arrays (132–134),
the updated regular expression
[EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFM] matches the motif
instances annotated in ELM. This revised motif is conserved in the
mammalian ACE2 cytosolic tail as well as in hagfish and ghost shark,
but not in birds, reptiles, or bony fish. The ACE2 LIR motif candidate
can potentially enable the incoming coronavirus to attract autophagy
elements such as LC3 to the structures where the virus replicates and
assembles. In line with this, a nonlipidated form of the LC3 protein
has been shown to be associated with the RTCs of MHV and SARS-CoV
(41, 44, 47). This raises the interesting possibility that ACE2
remains associated with the membranous structures where SARS-CoV-2
replicates at later infection stages, assisting in the repurposing of
autophagy components required for viral replication. Technical issues
hampered the comprehensive testing of phosphorylated ACE2 peptide
sequences containing the LIR candidate, but the unphosphorylated
peptide did not show meaningful binding (48). However, phosphorylation
of Ser783 seems to induce weak binding to MAP1LC3A and GABARAPL2
domains, albeit with affinities not reaching physiological relevance
(48). So far, the evidence is not enough to support LIR functionality,
but perhaps multiphosphorylation and/or a longer tail sequence could
deliver a stronger interaction.
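The updated LIR expression can be checked against the ACE2 tail fragment from Table 1 with a minimal sketch. The helper `find_lir` and the Glu-to-Ala variant are illustrative assumptions; the variant is included only to show the role of the leading [EDST] position in the pattern.

```python
import re

# Updated LIR pattern from the text: acidic or phosphorylatable residue,
# flexible spacer, core [WFY]xx[ILMV], flexible spacer, hydrophobic residue.
LIR = re.compile(r"[EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFM]")

# ACE2 tail fragment and the residue number of its first position (Table 1).
FRAGMENT, FIRST_RESIDUE = "ENPYASIDI", 778


def find_lir(seq, first_residue):
    """Return (start, end, matched text) in protein numbering, or None."""
    m = LIR.search(seq)
    if m is None:
        return None
    return first_residue + m.start(), first_residue + m.end() - 1, m.group()


print(find_lir(FRAGMENT, FIRST_RESIDUE))     # (778, 786, 'ENPYASIDI')

# Hypothetical variant lacking the leading Glu: no match.
print(find_lir("ANPYASIDI", FIRST_RESIDUE))  # None
```

The match spans residues 778 to 786, with Tyr781 filling the aromatic core position, as listed in Table 1.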
The ACE2 tail region C-terminal to the overlapping motifs centered
around Tyr781 contains two additional motif candidates. The first such
candidate is an apoPTB domain–binding motif. Certain members of the
large PTB domain family were initially discovered to bind to
phosphorylated NPxY motifs, hence the designation “phosphotyrosine
binding domain” (135). The NPxY motifs in cytosolic tails of
receptors, including integrins, are regarded as endocytosis sorting
signals (107). It was later discovered that PTB domains in the
endocytic internalization adapter protein Dab1 could also bind
nonphosphorylated Nxx[FY] motifs (apoPTB motif) and that this might be
the case for the majority of PTBs (136). Representative receptors with
apoPTB motifs are in the database entry ELM:LIG_PTB_Apo_2. The core
Nxx[FY] motif is conserved in all the vertebrate ACE2s (Fig. 3). For
the Dab1 endocytic adapter class of apoPTB motifs, there is a
hydrophobic requirement two residues before the Asn. In ACE2 of fishes
such as the hagfish and coelacanth (Latimeria chalumnae), the residue
is hydrophobic (Fig. 3), suggesting that this motif is present.
However, in most other species including human, Glu predominates at
this position: Therefore, if this notably conserved Nxx[FY] is an
apoPTB motif, it should then bind a PTB protein other than the Dab1
class. The apoPTB motif binds as a short β-strand (β-augmentation)
followed by a β-turn. Proline is rejected at the first position of the
motif, which is a strand-forming residue, and therefore, the minimal
regular expression for this motif is [^P].N..[FY]. As with the
phosphorylated versions, the apo-motifs are tightly connected to
endocytosis (136). The conservation of this motif in the homologous
position of the cytoplasmic chain of the partially collinear
collectrin protein (UniProt: CLTRN_HUMAN; fig. S3) indicates that this
motif instance has an even earlier evolutionary origin than the origin
of ACE2 itself, hinting at a key role in internalization. As expected,
because the specificity is not yet defined, Dab1 and four other tested
PTB domains did not bind to the ACE2 tail region (48). A poorly
soluble sorting nexin 17 (SNX17) FERM domain was found to bind with
≈100 μM affinity, providing an ambiguous result.
The very C-terminal region of ACE2 contains a TxF$ PDZ-binding motif
(PBM) candidate. Among other motif-binding modules, PDZ domains are
highly abundant in humans and other multicellular animals (137). PDZ
domains take part in a variety of biological processes including
cellular signaling and activity at the neuronal synapse (138). These
domains bind by β-strand augmentation to SLiMs that are called PBMs,
which are most commonly found at the C terminus of fully or partially
disordered proteins. These interactions are widely studied, and their
link to various diseases and infections has been previously
established (139). A PBM candidate is also found at the very C
terminus of the cytosolic tail of all vertebrate ACE2 proteins
(Fig. 3). Motifs following the pattern [ST].[ACVILF]$ are a common PBM
variant, described in the ELM resource entry ELM:LIG_PDZ_Class_1.
There are multiple functional examples of this motif. However, in the
ACE2 protein, the matching sequence has not been characterized.
Because the tail of ACE2 is facing the cytosol, it is available to
interact with PDZ domains with the appropriate specificity (138).
Two PDZs in two different adapter proteins, Na(+)/H(+) exchange
regulatory cofactor NHERF3 and SH3 and multiple ankyrin repeat domains
protein 1 (SHANK1), have been previously identified to be able to bind
TxF$ sequences (140), which makes both of them candidates for an
interaction with the ACE2 C terminus. NHERF3 is colocalized with ACE2
in intestinal tissue, and its PDZ domains were previously validated to
interact with PBMs in transmembrane proteins on the cytosolic side of
the membrane (141), so it is possible that they come into proximity
with the ACE2 tail containing the TxF$ motif and possibly bind it as a
part of ion exchange regulation of small-molecule transport
activities. NHERF3 is known for its involvement in sodium
ion–dependent transporter activity (142), and ACE2 was also shown to
interact with a sodium-dependent transporter (57), which could be one
of the leads toward unraveling the possible interaction between NHERF3
and ACE2. Kliche et al. (48) confirmed ACE2 tail binding with good
affinity for both NHERF3 and SHANK1. They also measured low micromolar
affinity for the PDZ domain of SNX27, which is involved in retrograde
transport from the endosome to the plasma membrane. Although
plausible, whether or not NHERF3 and SNX27 are PDZ domain–containing
proteins interacting with ACE2 is an open question that will
require follow-up experiments in the cell.
Tyr781 in the ACE2 tail creates a potential multiway molecular switch regulated via phosphorylation
The tyrosine at residue 781 in ACE2 is a part of the motif patterns
for four of the motifs listed above (Fig. 3 and Table 1) but must be
phosphorylated to act as an SH2-binding motif. We searched the
ACE2-related literature for reports of phosphorylation but were unable
to find any with strong site identification. Examination of the human
ACE2 entry in the database PhosphoSitePlus (143) revealed that
high-throughput (HTP) phosphoproteomic studies, but no low-throughput
(LTP) studies, identify pTyr781. Thirteen HTP measurements identified
phosphorylation at Tyr781, and this residue is the only ACE2
phosphosite that is reproducible across multiple HTP datasets
(Fig. 4). For example, pTyr781 was one of 318 unique phosphopeptides
belonging to 215 proteins analyzed from an erlotinib-treated breast
cancer cell line model (144). Therefore, this site fulfills the
phosphorylation requirement to be an SH2-binding motif.
As outlined above, four candidate sequence motifs overlap in the
region surrounding Tyr781: the YxxPhi endocytic sorting signal
(ELM:TRG_ENDOCYTIC_2), an SH2 motif that mediates binding to SFKs, an
NPY I-BAR–binding motif (ELM:LIG_IBAR_NPY_1), and the LIR autophagy
motif (ELM:LIG_LIR_Gen_1). Whereas the YxxPhi, NPY, and LIR motifs
require an unphosphorylated state of Tyr781, the SH2 motif requires
Tyr781 phosphorylation, creating the opportunity for a multiway
phospho-switch acting in this region of ACE2 that directs different
steps of the SARS-CoV-2 infection cycle. In support of this proposal,
Kliche et al. (48) confirmed that the ACE2-YxxPhi interaction is
negatively regulated by phosphorylation and that binding to the FYN
SH2 domain requires Tyr781 phosphorylation. The relative affinities of
the ACE2 tail binders, which are still to be fully established, will
dictate the competition between the interactions and the functional
output. Current results indicate that the phosphorylated ACE2 tail can
reach low micromolar affinity for SFKs and that the unphosphorylated
state can bind to the AP2 μ2 subunit with moderate affinity, whereas
physiologically relevant interactions with autophagy components and
I-BAR domains are still to be demonstrated. The state of this switch
could be controlled by protein localization and by tyrosine kinase
activity involving SRC/Abl and other tyrosine kinases, which are known
to have increased abundance during endosomal processes (115) and viral
infection (18), including in coronaviruses (2, 25–27). Similar
switches have been described before, as with the cytotoxic T
lymphocyte–associated protein 4 (CTLA-4) receptors, where SRC tyrosine
kinases dictate the binding preferences of overlapping YxxPhi and
SH2-binding motifs. In the unphosphorylated state, endocytosis is
favored, whereas T cell activation brings about Tyr phosphorylation,
shutting down endosomal recycling and initiating signaling through the
recruitment of SH2 domain–containing proteins (106, 145–148). The CagA
effector from Helicobacter pylori provides an example of a multiway
molecular phospho-switch, where the choice for senescence versus cell
proliferation is dictated by the SH2 domain–containing protein that
forms a complex with phosphorylated CagA (24). Additional regulation
can create a temporal gradient of the phospho-signal: CagA leads to
remodeling of the actin cytoskeleton through its sequential
phosphorylation by tyrosine kinases. Initial phosphorylation by SRC
creates a negative feedback loop that terminates SRC signaling through
activation of the SRC inhibitor Csk in the early stages of infection,
whereas phosphorylation by Abl kinases leads to concerted changes in
the phosphorylation of actin-regulatory proteins that drive
actin-cytoskeletal rearrangements at later time points of infection
(149).
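The logic of the proposed Tyr781 switch can be summarized in a small sketch. Motif coordinates are those listed in Table 1, and the phosphorylation requirement of each motif follows the description above; the class `Motif`, the function `available`, and the boolean encoding are illustrative assumptions that ignore affinities, competition, and localization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Motif:
    name: str
    start: int        # first residue (Table 1 numbering)
    end: int          # last residue
    needs_ptyr: bool  # True if the motif requires the phosphorylated Tyr


ACE2_TAIL_MOTIFS = [
    Motif("NPY I-BAR binding", 779, 781, False),
    Motif("YxxPhi endocytic sorting", 781, 784, False),
    Motif("SH2 binding (SFK)", 781, 785, True),
    Motif("LIR autophagy", 778, 786, False),
    Motif("PBM", 800, 805, False),  # does not overlap Tyr781
]


def available(motifs, residue, phosphorylated):
    """Motifs that contain `residue` and are compatible with its state."""
    return [
        m.name
        for m in motifs
        if m.start <= residue <= m.end and m.needs_ptyr == phosphorylated
    ]


print(available(ACE2_TAIL_MOTIFS, 781, phosphorylated=False))
# ['NPY I-BAR binding', 'YxxPhi endocytic sorting', 'LIR autophagy']
print(available(ACE2_TAIL_MOTIFS, 781, phosphorylated=True))
# ['SH2 binding (SFK)']
```

In this simplified view, phosphorylation of a single residue exchanges three candidate interactions for one; which of the unphosphorylated candidates is actually used would depend on the relative affinities and availability of the binders.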
Fig. 4. The summary for the ACE2 C-terminal tail provided by
PhosphoSitePlus. No low-throughput (LTP) studies have been recorded
in the database for ACE2. Thirteen high-throughput (HTP) studies
have identified phosphorylation on Tyr781. Phosphosites reported in
the extracellular part of ACE2 have only been reported once each
and therefore are likely to be misidentified peptides.
A similar temporal regulation could be at play in SARS-CoV-2
endocytosis. This might be enacted by a Tyr781 phospho-switch. The
early attachment phase could be characterized by unphosphorylated
Tyr781 that allows the YxxPhi and NPY motifs to be active. During
this phase, the YxxPhi motif could initiate RME by binding the AP2
complex μ2 subunit, recruiting clathrin and other endocytic
components to the viral attachment sites. In addition, some viruses
can “surf” along filopodia by myosin-mediated actin cytoskeleton
movements that transport the viral particles to the entry sites at
the cell body, ultimately increasing their entry rate (125). The
formation of these membrane protrusions could be promoted by the
I-BAR–binding NPY motif. The relative affinity and availability of
binders might dictate the sequential or concerted use of the YxxPhi
and NPY motifs during the initial stage. Following the initial
steps of membrane attachment and clathrin coat formation, actin
polymerization is required to internalize the endocytic vesicles.
This second step could be brought about by SFK-mediated Tyr781
phosphorylation that leads to disengagement of the AP2 μ2 subunit
and I-BAR–containing proteins and to activation of actin-regulatory
proteins through SFK recruitment. SRC and Abl, two of the SFKs
predicted to bind the SH2 motif, are known to promote RME and actin
cytoskeletal rearrangements (113, 115).
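The proposed Tyr781 phospho-switch can be summarized as a small state table. The sketch below is a qualitative illustration only: it reduces phosphorylation to a boolean and ignores affinities and kinetics. The names MOTIF_PARTNERS and active_motifs are hypothetical and introduced here for illustration. It encodes just the two statements made above, namely that unphosphorylated Tyr781 leaves the YxxPhi and NPY motifs available and that phosphorylated Tyr781 favors the SH2-binding motif.

```python
# Qualitative sketch of the hypothesized ACE2 Tyr781 phospho-switch.
# All names are illustrative assumptions, not part of any library.

# Candidate binding partner for each motif, as proposed in the text.
MOTIF_PARTNERS = {
    "YxxPhi": "AP2 mu2 subunit (receptor-mediated endocytosis)",
    "NPY": "I-BAR domain proteins (membrane protrusions)",
    "SH2": "SH2 domain proteins such as SFKs (actin regulation)",
}


def active_motifs(tyr781_phosphorylated: bool) -> list[str]:
    """Return the motifs predicted to be usable in a given Tyr781 state."""
    if tyr781_phosphorylated:
        # Phosphorylation disengages AP2 mu2 and I-BAR binders and
        # creates the docking site for SH2 domains.
        return ["SH2"]
    # Unphosphorylated Tyr781 leaves both overlapping motifs available.
    return ["YxxPhi", "NPY"]


if __name__ == "__main__":
    for state in (False, True):
        label = "pTyr781" if state else "Tyr781"
        for motif in active_motifs(state):
            print(f"{label}: {motif} -> {MOTIF_PARTNERS[motif]}")
```

The candidate LIR motif is deliberately left out of this table because its functionality has not been established.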
An alternative scenario that is not mutually exclusive with
temporal regulation might be enabled by the multimeric nature of
the spike protein and by attachment of several viral particles to a
membrane domain, leading to adjacent ACE2 tails on the
intracellular
intracellular side that expose both phosphorylated and
unphosphorylated motifs, allowing these three signaling steps to
take place simultaneously. The separation between the RBD-binding
sites in the ACE2 dimer is 68 Å calculated from PDB:6m17 (57), in
close agreement with the distance between RBDs in the up
conformation (~66 Å) measured from PDB:6x2b (61) (fig. S1). While
the outward orientation of the RBD-binding sites in ACE2 might
preclude stable contacts between two RBDs and an ACE2 dimer, the
spatial proximity implies that both ACE2 subunits are likely
activated by the dynamic interaction of a spike protein trimer with
an ACE2 dimer. The presence of several parallel routes for the
recruitment of cytoskeleton components involving the NPY and SH2
motifs could provide the robustness needed to ensure the actin
reorganization required for the uptake of virus-containing vesicles
into the cytosol. Following endocytosis and fusion, viral
components are released into the cell and viral replication takes
place. SFKs have been shown to be inactive at the endosomal
compartments, which would lead to dephosphorylation of Tyr781
following endocytosis (115). During this phase, the last component
of the switch could come into play, when the ACE2 protein that
remains bound to spike protein–coated membranes could promote the
hijacking of autophagy components necessary to assemble the viral
replication factories. However, the functionality of the LIR motif
has not yet been established and might require other PTMs of the
ACE2 tail, as suggested by Kliche et al. (48).
Known and candidate motifs in the β-integrin tails
Integrin tails are short cytosolic C-terminal intrinsically
disordered regions, similar to the analyzed region of ACE2. The three
most probable integrin subunit candidates at play in SARS-CoV-2
viral entry are β3, β6, and β1. The C-terminal tails of all three
subunits share a high degree of sequence similarity (with β3 and β6
being almost identical) and, similarly to ACE2, contain several
known and candidate SLiMs (Table 1 and Fig. 5, A and B) that
propagate signals in the cytoplasm and regulate integrin
activity not only through intracellular pathways but also by changing
the structural state of the ectodomains, which determines
ligand-binding capacity (150). In addition, all three integrin tails
are very highly conserved (figs. S4 to S6), hinting at their high
functional importance.
Integrin tails contain a highly charged patch in their
membrane-proximal region (Fig. 5A). This region is indispensable for
the interaction between integrins and tyrosine kinases, including
the SRC kinase Fyn (151) and FAK, most probably via the direct
interaction with paxillin (152). Through these interactions,
integrins regulate cytoskeletal remodeling (153), the promotion
of cell survival (154), and FA assembly and cell protrusion
formation (155). In turn, FAK regulates integrin recycling and
endosomal trafficking (156, 157).
Currently, there is no consensus sequence motif describing these
interactions, although a definition of HDR[KR]E has been proposed
(158), matching integrins β1, β3, β5, and β6. This motif is under
heavy regulation by several mechanisms. First, the interaction with
tyrosine kinases seems to involve additional residues N-terminal of
the charged motif core, most notably the conserved lysine
preceding the hydrophobic patch (159), that are only accessible in
the active state of the integrin dimer, as these regions are buried
in the membrane otherwise (160). Second, the D residue of the motif
forms a salt bridge with the cytosolic tail of the α subunit of the
integrin in the inactive conformation of the receptor. Thus, this
motif region is dependent on integrin activation regulated by
ligand binding and intracellular interactions mediated by the
downstream NPxY motifs.
The tails of integrins β1, β3, and β6 contain two regions that
match the apoPTB motif (Table 1 and Fig. 5A) as either
NPxY (with two matches in integrin β1 and one match each in
integrins β3 and β6) or φxNxxY (with one match each in integrins β3
and β6). Furthermore, these regions are known to have Tyr
phosphorylation, matching the phosphorylated motif definition as well
(ELM:LIG_PTB_Phospho_1). These regions are known to be able to form
β-turns and are recognition sites for PTB domains. In addition,
NPxY motifs are the major sorting signals mediating interactions
with FERM domains for regulating endosomal trafficking (161). In
β-integrin tails, these motifs recruit adaptor proteins and
clathrin, serving as sorting signals (162), and the NPxY motifs in
the β1 tail have a direct connection to viral entry for reovirus
(163).
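As a rough illustration of how such motif matches can be located, the sketch below scans a tail sequence with two simplified patterns. Several things here are assumptions rather than statements from the analysis above: the sequence given for the human integrin β3 cytoplasmic tail and its starting residue number (742, precursor numbering) are quoted from memory and should be checked against the UniProt entry, the patterns are reduced forms of the motif and not the ELM regular expressions, and φ is approximated by the class [ILVMFYW]. The helper name find_motifs is hypothetical.

```python
import re

# ASSUMPTION: cytoplasmic tail of human integrin beta-3 and the residue
# number of its first position (precursor numbering); verify in UniProt.
BETA3_TAIL = "KLLITIHDRKEFAKFEEERARAKWDTANNPLYKEATSTFTNITYRGT"
TAIL_START = 742

# Simplified patterns, not the ELM definitions; "phi" is approximated
# by a generic hydrophobic residue class.
PATTERNS = {
    "NPxY": re.compile(r"NP.Y"),
    "phi-x-N-x-x-Y": re.compile(r"[ILVMFYW].N..Y"),
}


def find_motifs(seq, first_residue):
    """Yield (motif name, matched text, residue number of the final Tyr)."""
    for name, pattern in PATTERNS.items():
        for m in pattern.finditer(seq):
            # m.end() is one past the last matched character (0-based).
            tyr_number = first_residue + m.end() - 1
            yield name, m.group(), tyr_number


HITS = list(find_motifs(BETA3_TAIL, TAIL_START))

if __name__ == "__main__":
    for name, text, tyr in HITS:
        print(f"{name}: {text} (Tyr{tyr})")
```

Under these assumptions the scan reports one NPxY-type and one φxNxxY-type match, ending at Tyr773 and Tyr785, respectively, which agrees with the residue numbers used in this section for integrin β3.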
The NPxY motif switches mediate several interactions. The
membrane-proximal NPxY motif binds talin-1, serving as a
connection between the plasma membrane and the major cytoskeletal
structures (164). Considering the expression profiles of talins,
the most likely interaction partner of lung-expressed integrins is
talin-1. Talin-1 contains a FERM domain, similarly to Ezrin, which
establishes a direct interaction with the SARS-CoV spike protein
upon viral fusion (126). However, the interaction between the RBD
and integrins offers the virus an earlier point of interference
with the cytoskeletal system, being able to modulate it
cooperatively with the ACE2 actin-regulatory elements (NPY and SH2
motifs) before and during cellular entry. The talin/integrin
interaction, however, presents a feedback loop: The binding of
talin on the cytoplasmic side induces a structural rearrangement
on the ectodomains of integrins, enabling a higher affinity
interaction with RGD motif–containing ligands (165).
The membrane-proximal NPxY motif is also a binding site for
docking protein 1 (DOK1), a negative regulator of integrin
activation. DOK1 is in direct competition with talin for binding
integrins (165). The competition is fundamentally influenced by
phosphorylation
on Tyr783 (for integrin β1; fig. S7A), Tyr773 (for integrin β3;
Fig. 5B), and Tyr762 (for integrin β6; fig. S7B) of the NPxY
motif. The unphosphorylated motif has a higher affinity toward
talin, whereas the phosphorylated motif prefers DOK1 (166); thus, the
tyrosine acts as a phospho-switch regulating integrin
activation.
The membrane-proximal NPxY motif also presents a binding site
for a largely phosphorylation-independent interaction with the
integrin cytoplasmic domain–associated protein-1 (ICAP-1). ICAP-1
is a fundamental regulator of the assembly of FAs, and ICAP-1
knockdown reduced FA assembly (167), possibly working in
conjunction with the membrane-proximal charged region. ICAP-1
seems to be specific for β1, and hence, the therapeutic
considerations for targeting this pathway require the verification
of the type of integrins expressed on AT2 cells (and other related
cell types).
The membrane-distal NPxY motif is a binding site for the FERM
domain of kindlin (168). This interaction requires the integrin
tail to be nonphosphorylated, and phosphorylation on Tyr795 (for
integrin β1) or Tyr785 (for integrin β3) can switch off the
interaction with kindlin-2 (169) (no corresponding Tyr
phosphorylation has been identified in β6 tails as yet). Kindlin
binding (together with talin binding) is a crucial step in integrin
activation and hence regulates the availability of integrins for
extracellular ligands (170) and was also suggested to play a role
in TGF-β1 signaling (171).
The two NPxY(-like) motifs in the integrin tails not only
constitute two separate phospho-switches (Fig. 5, fig. S7,
and Table 1) but also act in synergy to give rise to more
complex regulation. Filamin and the PTB domain region of Shc1 each
bind to both NPxY motifs (172, 173). Shc is an adaptor protein
playing a key role in mitogen-activated protein kinase (MAPK) and
Ras signaling pathways, and its interaction with integrin β3
requires phosphorylation of both Tyr773 and Tyr785 (172, 174). In
contrast, binding of the immunoglobulin domain of filamin-A requires
both tyrosines to be in a nonphosphorylated state. The filamin-A
interaction can be considered as a main shutdown switch in integrin
signaling, as this interaction induces the closed conformation of
the integrin ectodomains, decreasing the chance of ligand binding
(173). In addition, binding partners using both NPxY motifs may
also serve as stronger modulators of endosomal trafficking,
switching on enhanced signals.
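The combined behavior of the two tyrosine switches described in the preceding paragraphs can be written out as a truth table. The following sketch is a qualitative simplification under stated assumptions: graded affinity preferences are reduced to booleans, the largely phosphorylation-independent ICAP-1 interaction is omitted, and the function name favored_partners is hypothetical. It is meant only to make the four phosphorylation states easy to compare.

```python
# Qualitative sketch: which partners of the beta-integrin tail are favored
# for each phosphorylation state of the two NPxY(-like) tyrosines
# (Tyr773 and Tyr785 in integrin beta-3). Booleans stand in for graded
# affinities; all names are illustrative assumptions.


def favored_partners(proximal_py: bool, distal_py: bool) -> set[str]:
    """Return the binding partners favored in a given phospho-state."""
    partners = set()
    # Membrane-proximal switch: talin (unphosphorylated) vs. DOK1 (phospho).
    partners.add("DOK1" if proximal_py else "talin")
    # Membrane-distal switch: kindlin needs the unphosphorylated tyrosine.
    if not distal_py:
        partners.add("kindlin")
    # Partners that read both motifs at once.
    if proximal_py and distal_py:
        partners.add("Shc")
    if not proximal_py and not distal_py:
        partners.add("filamin-A")
    return partners


if __name__ == "__main__":
    for proximal in (False, True):
        for distal in (False, True):
            state = f"proximal pY={proximal}, distal pY={distal}"
            print(state, "->", sorted(favored_partners(proximal, distal)))
```

In this simplified view, the doubly unphosphorylated tail favors talin, kindlin, and filamin-A, whereas the doubly phosphorylated tail favors DOK1 and Shc.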
Integrins are known to be connected to autophagy regulation, and
therefore, motif identification and analysis might help suggest
possible underlying molecular mechanisms. The connection between
autophagy and cell adhesion has already been described, showing
that both reduced FAK signaling (175) and detachment from the
extracellular matrix via integrins (176) enhance autophagy.
Atg-deficient cells have enhanced migration properties, and at the
molecular level, there seems to be a direct connection between Atg
proteins and integrins as well: autophagy stimulation increases the
colocalization of β1 integrin–containing vesicles with LC3-stained
autophagic vacuoles, whereas autophagy inhibition decreases the
degradation of internalized β1 integrins (177). In Drosophila cells,
it has been shown that the Wiskott-Aldrich syndrome protein and
SCAR homolog (WASH) plays a connecting role between integrin
Fig. 5. Alignment of human integrins illustrating conserved
motifs in the cytosolic C-terminal tail. (A) Multiple sequence
alignment of human integrin C-terminal regions, not including the
two most divergent tails (β4 and β8). The alignment shows motif
conservation of the NPxY and LIR motifs (key residues displayed
above). Red boxes mark the conservation range of the PTB motif in
all sequences and the location of the LIR motif in integrin β3.
Protein names, UniProt IDs, and sequence numberings are listed on
the left side of the alignment. (B) Summary of the PTMs on the
C-terminal tail of integrin β3. Details of the experimental evidence
for the PTB tyrosine phosphorylations are highlighted: pTyr773
(pY773) and pTyr785 (pY785). Graph was obtained from
PhosphoSitePlus.
recycling and the efficiency of phagocytic and autophagic
clearance (178). However, molecular details about how this
connection is brought about are unclear.
Sequence analysis of integrin β3 and β6 tails shows a potential
Atg-targeting LIR motif (Fig. 5A), similarly to the ACE2 tail.
However, neither β-integrin tail conforms to the regular expression
introduced in earlier sections, as the hydrophobic residue
following the core motif is a tyrosine (Tyr785 for β3 and Tyr774 for
β6). Thus, to capture this instance as well, the regular expression
needs to be modified to
[EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFMY].
LTP phosphorylation assays have determined that both Tyr773 and
Tyr785 of β3 are phosphorylated in live cells (Fig. 5B). However,
such assays have also determined additional phosphorylation sites
in the β3 tail: Thr777, Ser778, Thr779, and Thr784. These
phosphorylations are not connected to the NPxY motif switches in
any known way but could serve as charge-based switches for the LIR
motif. The peptide binding assays presented in the accompanying
paper by Kliche et al. (48) show that phosphorylations
introduced in the N-terminal tandem sites yielded low micromolar
binding affinities. In addition, phosphorylation of Tyr785 further
increases affinity, showing that the loss of the favorable
interaction mediated by the C-terminal hydrophobic residue can be
well compensated for by electrostatic interactions. While the
current motif definition does not exactly fit the β1 tail, there are
also LTP phosphorylation assay data (179) for the existence of
these phosphorylations in the corresponding residues, hinting at
the possibility of the presence of a slightly modified motif. For
β3, as well as for β1 and β6 tails, the phosphorylation provides the
negative charge required upstream of the FxxIxY LIR motif
hydrophobic core. Phosphopeptides spanning the candidate region
should also reveal whether the LIR motif-like region is a
functional Atg-binding site in integrin β1. Such experiments can
also shed light on the existence of a rheostat-like behavior of
multi-phosphorylation, already demonstrated to a certain extent
for the β3 LIR. The motif found in integrin β3 is also present in
integrin β2, and the motif candidate identified in integrin β1 is
also present in integrin β6.
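The effect of allowing tyrosine at the final hydrophobic position can be checked with a short regular-expression test. This is a minimal sketch under stated assumptions: the peptide used is an assumed C-terminal fragment of the human integrin β3 tail (to be verified against UniProt), and the stricter pattern without Tyr in the final class is a hypothetical variant included only for contrast, not a quotation of the earlier motif definition. Phosphorylation itself is not modeled; the [EDST] class simply admits Ser/Thr at the acidic position.

```python
import re

# Modified LIR expression quoted in the text (final class includes Y).
LIR_MODIFIED = re.compile(r"[EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFMY]")

# ASSUMPTION: a stricter variant without Y in the final class, used only
# for contrast; it is not claimed to be the earlier motif definition.
LIR_NO_TYR = re.compile(r"[EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFM]")

# ASSUMPTION: C-terminal fragment of the human integrin beta-3 tail.
FRAGMENT = "KEATSTFTNITYRGT"

modified_hit = LIR_MODIFIED.search(FRAGMENT)
strict_hit = LIR_NO_TYR.search(FRAGMENT)

if __name__ == "__main__":
    print("modified:", modified_hit.group() if modified_hit else None)
    print("without Tyr:", strict_hit.group() if strict_hit else None)
```

On this fragment, the modified expression matches a stretch that contains the FxxIxY core, while the variant lacking Tyr in the final class finds no match, which illustrates why the final residue class had to be widened.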
Potential synergy between the ACE2 and integrin intracellular motifs
Bringing together the candidate SLiMs identified in the integrin
and ACE2 tails potentially strengthens the functional links between
them and provides an emergent picture of SLiM-driven cooperative
switches driving viral attachment, entry, and replication
(Fig. 6). Following attachment of the spike protein to the
receptors, the two NPxY motifs in the integrin β subunit could act
cooperatively with the apoPTB and YxxPhi motifs in ACE2 as sorting
signals that mediate the internalization of viral particles into
endosomes. The presence of several endocytic motifs in close
proximity would strengthen the interaction with the endocytosis
apparatus, creating a high-avidity environment for recruitment of
RME components (107). During this time, the phosphorylated integrin
NPxY motifs would also reinforce viral attachment through
inside-out signaling, stabilizing the integrin ectodomain in the
open, high ligand affinity conformation. As discussed previously,
RME also involves the recruitment of adaptor molecules that
activate rearrangements of the actin cytoskeleton required for the
internalization of the endocytic vesicle. At this stage, the NPY
and SH2 motifs in ACE2 would recruit several molecules that mediate
actin polymerization signaling, prominently the I-BAR–containing
proteins IRSp53 and IRTKS as well as actin cytoskeleton regulators
activated by SFKs. While most of this actin signaling would serve
to allow viral entry, additional actin recruitment processes could
occur following viral fusion, such as that initiated by the
interaction between the spike protein and Ezrin. Last, at later
stages of infection, both integrins and ACE2 might remain attached
to virus-associated DMVs and other replication-competent membranes
where the RTC assembles. At this stage, ACE2 and integrins might
cooperatively mediate the recruitment of autophagy components such
as LC3 through the LIR motifs located in the cytosolic tails of
both molecules.
SLiMs and their potential therapeutic implications
The analysis of candidate SLiMs in ACE2 and integrins suggests that
SARS-CoV-2 hijacks both receptors, co-opting their SLiMs to drive
viral attachment, entry, and replication. This creates an
opportunity for drugging these interactions, or the processes they
control, through host-directed therapies (HDTs) to prevent viral
entry. On the basis of the identified candidate interactions, we
collected a list of potentially useful drugs (Table 2) together
with ChEMBL accessions (180); several are already registered for
clinical trials (181).
The RGD sequence is used by a large number of viruses for cell
attachment via integrins (13). RGD mimics have been developed as
inhibitors of integrin–extracellular matrix protein interaction for
a variety of diseases. A cyclic RGD peptide [c-RGDf(NMe)V,
cilengitide] has been developed clinically for glioblastoma
treatment and other cancers. It proved safe but did not enhance the
survival benefit (182). SARS-CoV-2 has a unique RGD sequence in
the vicinity of the ACE2 binding region of its spike protein. It
has been proposed that integrins may have a potential role for
infectivity (12). If so, RGD mimetics might be able to block the
RGD-binding site(s) on target cells and block the attachment of the
virus. Another application that has been suggested is bacterial
sepsis (sepsis is also a dreaded complication in COVID-19
patients), and experimental evidence in animals is available (183).
Cilengitide is relatively specific for integrin αvβ3 but also active
on αvβ5, αvβ1, αvβ6, αvβ8, αIIbβ3, α4β1, and α5β1 (in decreasing
order of activity). The antibody abituzumab (DI-17E6) is a pan-αv
antibody, meaning it is also active against other αv integrins and,
consequently, may be better suited for blocking virus entry. It
has been clinically tested in several cancer indications
(184, 185).
As discussed above, tyrosine kinase–mediated phosphorylation
plays an important role in virus entry and maturation, and several
tyrosine kinase inhibitors have entered the clinic and some show
effects on viral infection in cell culture. For example,
saracatinib, an SRC and Abl inhibitor that has completed several
clinical trials, mainly targeting cancers, inhibited replication of
different coronaviruses including MERS-CoV, SARS-CoV, and
HCoV-229E in cell culture infection experiments (27). After
internalization and endosomal trafficking, imatinib, an Abl
inhibitor, prevented fusion of SARS-CoV and MERS-CoV virions at the
endosomal membrane in infected cell culture experiments (25). Using
the avian model virus IBV, imatinib and two other Abl inhibitors
(GNF2 and GNF5) prevented the fusion of the spike protein to the
membrane of the target cell as well as cell-cell fusion and
syncytia formation (2). More recently, tyrphostin A9, a
platelet-derived growth factor receptor (PDGFR) tyrosine kinase
inhibitor, emerged from an HTP screen using cytopathic effect
as readout and also showed in vitro inhibitory capacity against
transmissible gastroenteritis virus (TGEV), an alphacoronavirus
that infects pigs (26). The authors also showed
that tyrphostin A9 has a broad antiviral spectrum, being active
against three other tested coronaviruses: MHV in murine L929 cells,
porcine epidemic diarrhea virus in primate Vero cells, and feline
infectious peritonitis virus in feline CCL-94 cells. The mode of
action was found to be through p38 MAPK, at the post-adsorption
stage. As FAK has been implicated in viral entry for other viruses
including influenza A (186), experimental drugs targeting FAK,
including some in clinical trials (187), can be considered for
studying potential spike protein–induced integrin signaling.
Currently, 39 tyrosine kinase inhibitors are approved by the U.S.
Food and Drug Administration (FDA): 11 target nonreceptor
protein–tyrosine kinases and 28 inhibit receptor protein–tyrosine
kinases (188). Consequently, tyrosine kinase inhibitors may be good
candidates to test for their effect on SARS-CoV-2. For example, an
inhibitor of the Abl and PDGFR kinases, flumatinib mesylate, showed
42% reduction of SARS-CoV-2 infection of Vero E6 cells at 2.5 μM
(189). As part of the United Kingdom’s ACCORD (Accelerating
COVID-19 Research & Development) program, a clinical trial is
underway to evaluate bemcentinib, a specific inhibitor of the
receptor tyrosine kinase AXL, in COVID-19 (190). AXL acts as a
pleiotropic inhibitor of innate immunity (191) and is also a
receptor for Ebola virus (192).
A number of protease inhibitors are now being discussed for
SARS-CoV-2 treatment. The serine protease inhibitor camostat
mesylate is active against TMPRSS2 and blocks cell entry (4).
Nafamostat mesylate, originally developed as a tryptase inhibitor
(193), also has been shown to inhibit TMPRSS2. Nafamostat mesylate
is an approved anticoagulant in Japan, with clinical testing for
COVID-19 infections now being conducted. The spike protein of
SARS-CoV-2 contains a furin cleavage sequence (PRRARS|V).
Consequently, furin convertase inhibitors are considered as
antiviral agents (194). A prime example of such inhibitors is
decanoyl-RVKR-CMK, which has been shown to inhibit cleavage of the
SARS-CoV-2 spike protein at the S1/S2 site by furin (90). A large
drug screen identified four drugs that targeted host cysteine
proteases in SARS-CoV-2–infected human cells including VBY-825
(cathepsin B/L), ZLVG CHN2, ONO 5334 (cathepsin K), and MDL-28170
(cathepsin B and calpain I/II), with the latter two inhibiting
SARS-CoV-2 replication in human induced pluripotent stem cell
(iPSC) pneumocytes (189).
Many viruses enter the cell via endocytosis, and a number of
candidate SLiMs relevant for SARS-CoV-2 infection are related to
endocytosis (see above). Chlorpromazine, an antipsychotic
dopamine D2 antagonist developed in the 1950s, is a potent endocytosis
inhibitor (which likely explains its reputation as a “dirty drug”
and some of its marked side effects, which can include low white
blood cell levels). Like other tricyclic antipsychotics, the drug
specifically inhibits the dynamin motor protein that is required to
close off the endocytic vesicle at the plasma membrane (195)