REVIEW ARTICLE
published: 20 June 2013
doi: 10.3389/fneur.2013.00076
Trinucleotide repeats: a structural perspective
Bruno Almeida, Sara Fernandes†, Isabel A. Abreu† and Sandra
Macedo-Ribeiro*
Instituto de Biologia Molecular e Celular, Universidade do
Porto, Porto, Portugal
Edited by: Thomas M. Durcan, McGill University, Canada
Reviewed by: Denis Soulet, Laval University, Canada; Thomas M.
Durcan, McGill University, Canada
*Correspondence: Sandra Macedo-Ribeiro, Instituto de Biologia
Molecular e Celular, Universidade do Porto, Rua do Campo Alegre 823,
4150-180 Porto, Portugal. e-mail: [email protected]
†Present address: Sara Fernandes, Shannon ABC, Limerick Institute of
Technology, Limerick, Ireland; Isabel A. Abreu, GplantS, Instituto
de Tecnologia Química e Biológica, Oeiras, Portugal.
Trinucleotide repeat (TNR) expansions are present in a wide
range of genes involved in several neurological disorders and are
directly involved in the molecular mechanisms underlying
pathogenesis through modulation of gene expression and/or the
function of the encoded RNA or protein. Structural and functional
information on the role of TNR sequences in RNA and protein is
crucial to understanding the effect of TNR expansions in
neurodegeneration. This review therefore provides a structural and
functional view of TNR and the encoded homopeptide expansions, with
particular emphasis on polyQ expansions and their role in inducing
the self-assembly, aggregation, and functional alteration of the
carrier protein, which culminate in neuronal toxicity and cell
death. Particular attention is given to ataxin-3, the
polyQ-containing protein that causes Machado-Joseph disease,
providing clues to the impact of the polyQ expansion and its
flanking regions on the modulation of ataxin-3 molecular
interactions, function, and aggregation.
Keywords: amino acid-repeats, microsatellites, protein
complexes, protein aggregation, amyloid, protein structure
TRINUCLEOTIDE REPEATS AND HUMAN DISEASE
Trinucleotide repeat (TNR) expansions and their association with
neurological disorders have been known for the past 20 years
(La Spada et al., 1991). Expansions of CAG, GCG, CTG, CGG, and GAA
repeats located in coding or non-coding sequences of different genes
(summarized in Table 1; Figures 1 and 2) are associated with a
diverse range of human monogenic diseases such as Spinobulbar
Muscular Atrophy (SBMA, a.k.a. Kennedy disease), Huntington Disease
(HD), Spinocerebellar Ataxias (SCAs), Oculopharyngeal Muscular
Dystrophy (OPMD), Myotonic Dystrophy Type 1 (DM1), Fragile
X-Associated Tremor Ataxia Syndrome (FXTAS), and Friedreich Ataxia
(FRDA) (for a review see Orr and Zoghbi, 2007), with longer repeats
being correlated with earlier age at onset and increased disease
severity. These TNR are highly unstable: the repeat tract length can
change between affected individuals within the same family and can
differ between tissues (La Spada, 1997; Brouwer et al., 2009). More
interestingly, in the brain of patients affected by CAG expansions,
differences in repeat instability have been found between specific
cell types (Pearson et al., 2005; Gonitel et al., 2008; Lopez Castel
et al., 2010). GCG repeats are usually shorter and more stable than
CAG repeats, both in different tissues and across generations. The
dynamic nature of these DNA repeat expansions is a consequence of
their capacity to form different secondary structures, which
interfere with the cellular mechanisms of replication, repair,
recombination, and transcription (for a recent review see Lopez
Castel et al., 2010). The molecular mechanisms underlying
pathogenesis in these disorders, whether associated with mental
retardation, neuronal degeneration, or muscular degeneration, might
result from alterations in the levels of gene expression and/or in
the function of the encoded RNA or protein, mechanisms that likely
act in concert to influence the pattern of selective cell toxicity.
Some of these toxicity mechanisms are briefly discussed below.
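To make the notion of repeat tract length concrete, the following minimal Python sketch counts the longest uninterrupted run of a trinucleotide unit in a DNA string and compares it with a disease threshold. The function name, the example sequence, and the simple threshold test are illustrative assumptions and are not taken from the studies cited above; the lower bound of 36 CAG repeats for HD is the value listed in Table 1.

```python
import re

def longest_repeat_tract(dna: str, unit: str = "CAG") -> int:
    """Return the largest number of consecutive copies of `unit` in `dna`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(dna.upper())]
    return max(runs, default=0)

# Hypothetical sequence: 40 uninterrupted CAG units between arbitrary flanks.
example = "ATGGCG" + "CAG" * 40 + "CCGCCA"
n = longest_repeat_tract(example, "CAG")

# Table 1 lists 36-121 CAG repeats as the disease range for HD.
HD_DISEASE_MIN = 36
print(n, "repeats;", "expanded" if n >= HD_DISEASE_MIN else "not expanded")
```

The sketch treats any interruption as the end of a tract, which is a simplification: interrupted tracts are biologically relevant and would need a more tolerant definition.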
TRINUCLEOTIDE REPEATS AND RNA STRUCTURE
The formation of hairpin structures within the TNR RNA is related to
the gain of toxic RNA function, the major pathogenic mechanism
associated with CUG and CGG repeat expansions in non-coding regions
of DM1 and FXTAS transcripts, which was also shown to contribute to
pathogenesis in CAG repeat disorders such as HD and Machado-Joseph
disease (MJD, a.k.a. SCA3) (reviewed in Krzyzosiak et al., 2012).
These duplex structures, whose stability is positively correlated
with the repeat size (Napierala and Krzyzosiak, 1997), sequester
dsRNA-binding proteins involved in mRNA splicing such as CUG-binding
protein (CUGBP) and muscleblind protein 1 (MBNL1) (Miller et al.,
2000), inducing aberrant splicing in affected cells, compromising
multiple intracellular pathways, affecting cell-quality control
regulation, and ultimately resulting in cell dysfunction (Li and
Bonini, 2010). Structural studies on model trinucleotide CUG, CAG,
and CGG repeats forming double-stranded chains revealed the features
induced by periodic U-U, A-A, and G-G mismatches, and provided hints
about the structural details of pathogenic RNAs that are recognized
by RNA-binding proteins (Mooers et al., 2005; Kiliszek et al., 2010,
2011; Kumar et al., 2011; Parkesh et al., 2011). MBNL1 is composed
of four zinc-containing RNA-binding domains arranged in two tandem
segments, with the C-terminal zinc-finger pair displaying a
GC-sequence recognition motif (Teplova and Patel, 2008) and
interacting with the stem region of expanded CUG RNAs (Yuan et al.,
2007). Electron microscopy analysis of MBNL1:CUG136 complexes showed
that the pathogenic dsRNA forms a scaffold
www.frontiersin.org June 2013 | Volume 4 | Article 76 | 1
Almeida et al. Structure and function of trinucleotide
repeats
Table 1 | Human diseases associated with nucleotide repeat expansions (adapted from Messaed and Rouleau, 2009; Lopez Castel et al., 2010; Matos et al., 2011).
Disease name | Repeat type | Repeat location | Gene | Protein (UniProt identifier, number of residues) | Biological process (a) | Normal repeat length | Disease repeat length | Protein structure determined?
Spinal and bulbar muscular atrophy (SBMA) | CAG | Protein coding region (polyQ) | AR | Androgen receptor (P10275, 919 residues) | Transcription, transcription regulation | 9–36 | 38–62 | Residues 20–30 and 671–919 (PDB code 1xow)
Huntington’s disease (HD) | CAG | Protein coding region (polyQ) | HTT | Huntingtin (P42858, 3142 residues) | Apoptosis | 6–34 | 36–121 | Residues 5–18 (3lrh), Residues 1–17 (2ld0, 2ld2), Residues 1–64 (3io4, 3io6, 3ior, 3iot, 3iou, 3iov, 3iow)
Dentatorubral-pallidoluysian atrophy (DRPLA) | CAG | Protein coding region (polyQ) | ATN1 | atrophin 1 (P54259, 1190 residues) | Transcription, transcription regulation | 7–34 | 49–88 | No structural information
Spinocerebellar ataxia 1 (SCA1) | CAG | Protein coding region (polyQ) | ATXN1 | ataxin 1 (P54253, 815 residues) | Transcription, transcription regulation | 6–39 | 40–82 | Residues 563–693 (1oa8)
Spinocerebellar ataxia 2 (SCA2) | CAG | Protein coding region (polyQ) | ATXN2 | ataxin 2 (Q99700, 1313 residues) | No associated GO keywords for biological process | 15–24 | 32–200 | Residues 912–928 (3ktr)
Spinocerebellar ataxia 3 (SCA3) | CAG | Protein coding region (polyQ) | ATXN3/MJD | ataxin 3 (P54252, 364 residues) | Transcription, transcription regulation, Ubl conjugation pathway | 10–51 | 55–87 | Residues 1–182 (1yzb), Residues 222–263 (2klz)
Spinocerebellar ataxia 6 (SCA6) | CAG | Protein coding region (polyQ) | CACNA1A | CACNA1A, P/Q-type α1A calcium channel subunit (O00555, 2505 residues) | Calcium transport, ion transport, transport | 4–20 | 20–29 | Residues 1955–1975 (3bxk)
Spinocerebellar ataxia 7 (SCA7) | CAG | Protein coding region (polyQ) | ATXN7 | ataxin 7 (O15265, 892 residues) | Transcription, transcription regulation | 4–35 | 37–306 | Residues 330–401 (2kkr)
Spinocerebellar ataxia 17 (SCA17) | CAG | Protein coding region (polyQ) | ATXN17 | TATA box binding protein (TBP) (P20226, 339 residues) | Transcription, transcription regulation, Host-virus interaction | 25–42 | 47–63 | Residues 159–337 (1cdw, 1c9b, 1jfi, 1nvp, 1tgh)
Multiple skeletal dysplasias (COMP) | GAC | Protein coding region (polyaspartate) | COMP | cartilage oligomeric matrix protein (a.k.a. Thrombospondin-5) (P49747, 757 residues) | Apoptosis, cell adhesion | 5 | 4, 6, 7 | Residues 225–757 (3fby)
Synpolydactyly (HOXD13) | GCG | Protein coding region (polyA) | HOXD13 | homeobox D13 (P35453, 343 residues) | Transcription, transcription regulation | 15 | 22–29 | No structural information
(Continued)
Frontiers in Neurology | Neurodegeneration June 2013 | Volume 4
| Article 76 | 2
Table 1 | Continued
Disease name | Repeat type | Repeat location | Gene | Protein (UniProt identifier, number of residues) | Biological process (a) | Normal repeat length | Disease repeat length | Protein structure determined?
Oculopharyngeal Muscular Dystrophy (OPMD) | GCG | Protein coding region (polyA) | PABPN1 | Polyadenylate-binding protein 2 (Q86U42, 306 residues) | mRNA processing | 10 | 12–17 | Residues 167–254 (3b4d, 3b4m, 3ucg)
Cleidocranial dysplasia (CBFA1) | GCG | Protein coding region (polyA) | RUNX2 | Runt-related transcription factor 2 (Q13950, 521 residues) | Transcription; transcription regulation | 17 | 27 | No structural information
Holoprosencephaly (ZIC2) | GCG | Protein coding region (polyA) | ZIC2 | Zinc-finger protein ZIC 2 (O95409, 532 residues) | Differentiation, neurogenesis, transcription, transcription regulation | 15 | 25 | No structural information
Hand-Foot-Genital Syndrome (HOXA13) | GCG | Protein coding region (polyA) | HOXA13 | homeobox A13 (P31271, 388 residues) | Transcription, transcription regulation | 18 | 24–26 | No structural information
Blepharophimosis/ptosis/epicanthus inversus syndrome type II (FOXL2) | GCG | Protein coding region (polyA) | FOXL2 | Forkhead box like 2 (P58012, 376 residues) | Differentiation, transcription, transcription regulation | 14 | 22–24 | Residues 322–328 (2l7z)
Infantile spasm syndrome (ARX) | GCG | Protein coding region (polyA) | ARX | Aristaless-related homeobox (Q96QS3, 562 residues) | Differentiation, neurogenesis, transcription, transcription regulation | 10–16 | 17–23 | No structural information
Myotonic dystrophy type 1 (DM1) | CTG | 3′UTR | DMPK | Myotonic dystrophy protein kinase (DMPK) (Q09013, 639 residues) | No associated GO keywords for biological process | 5–37 | 90–6500 | Residues 11–420 (2vd5), Residues 460–537 (1wt6)
Friedreich ataxia (FRDA) | GAA | Intron | FXN | Frataxin (Q16595, 210 residues) | Heme biosynthesis, Ion transport, Iron storage, Iron, transport | 6–32 | >200 | Residues 88–210 (1ekg), Residues 91–210 (1ly7), Residues 82–210 (3s4m, 3s5d, 3s5e, 3s5f, 3t3j, 3t3k, 3t3l, 3t3t, 3t3x)
Spinocerebellar ataxia 8 (SCA8) | CTG | 3′UTR | ATXN8 | Ataxin-8 (a.k.a. protein 1C2; present in SCA8-specific 1C2-positive intranuclear inclusions) (Q156A1, 80 residues) | Cell death | 2–130 | >110 | No structural information
(Continued)
Table 1 | Continued
Disease name | Repeat type | Repeat location | Gene | Protein (UniProt identifier, number of residues) | Biological process (a) | Normal repeat length | Disease repeat length | Protein structure determined?
Spinocerebellar ataxia 12 (SCA12) | CAG | 5′UTR | PPP2R2B | Serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B β isoform (Q00005, 443 residues) | Apoptosis | 7–45 | 55–78 | No structural information
Huntington disease-like 2 (HDL2) | CAG | Alternative splice isoform 2 – polyA-expansion | JPH3 | Junctophilin 3 (Q8WXH2, 748 residues) | No associated GO keywords for biological process | 6–27 | 51–57 | No structural information
FRAXA: fragile X syndrome | CGG | 5′UTR | FMR1 | Fragile X mental retardation 1 protein (Q06787, 632 residues) | Transport; mRNA transport | 6–52 | 230–2000 | Residues 1–134 (2bkd), Residues 216–280 (2fmr), Residues 216–425 (2qnd), Residues 527–541 (2la5)
FXTAS: fragile X tremor/ataxia syndrome | CGG | 5′UTR | FMR1 | Fragile X mental retardation 1 protein (Q06787, 632 residues) | Transport; mRNA transport | 6–52 | 59–230 | Residues 1–134 (2bkd), Residues 216–280 (2fmr), Residues 216–425 (2qnd), Residues 527–541 (2la5)
FRAXE: fragile X syndrome | CGG | 5′UTR | FMR2 | Fragile X mental retardation 2 protein (P51816, 1311 residues) | mRNA processing, mRNA splicing | 4–39 | 200–900 | No structural information
UTR, untranslated region. (a) Biological function based on Gene Ontology as annotated in UniProt.
FIGURE 1 | Structural variability of proteins encoded by
TNR-containing genes. Illustrative domain graphics of the
multi-domain structure of proteins associated with polyQ-expansion
diseases. All proteins shown are referenced by their name as
annotated in UniProt. The protein domains for which information is
annotated in the Pfam database are shown as colored boxes, with the
Pfam family accession code referenced above the domain box. Complete
names of domains can be accessed by searching the specific Pfam
accession code at http://pfam.sanger.ac.uk/. Numbers below the
domain schemes represent amino acid residue numbers. Regions
containing the amino acid repeats and regions predicted to form
coiled-coils (as annotated in UniProt) are shown, as well as regions
with known 3D structure (boxed in red, with PDB accession codes
shown). Notice the predominant location of the repeat regions within
the N-terminal regions of the proteins.
with tandemly spaced MBNL1 binding sites where MBNL1 oligomers with
a ring-like structure can assemble, possibly leading to the
formation of the ribonuclear foci identified in cell models of these
TNR diseases (Yuan et al., 2007; de Mezer et al., 2011). The
structure and stability of the TNR hairpins depend on the presence
of interruptions as well as on the nature of the flanking regions.
This might be related to the ability of individual repeats to
participate in the RNA toxicity mechanisms (Krzyzosiak et al.,
2012).
In FRDA and FXTAS, pathogenesis results predominantly from
decreased expression of the associated genes (FXN and FMR1/FMR2)
caused by the expansion of GAA and CGG repeats, respectively, which
results in loss of function of key proteins involved in iron-sulfur
cluster biogenesis and mRNA translation
FIGURE 2 | Structural variability of proteins encoded by
TNR-containing genes. Illustrative domain graphics of the
multi-domain structure of proteins associated with polyD- and
polyA-expansion diseases. All proteins shown are referenced by their
name as annotated in UniProt. The protein domains for which
information is annotated in the Pfam database are shown as colored
boxes, with the Pfam family accession code referenced above the
domain box. Complete names of domains can be accessed by searching
the specific Pfam accession code at http://pfam.sanger.ac.uk/.
Numbers below the domain schemes represent amino acid residue
numbers. Regions containing the amino acid repeats and regions
predicted to form coiled-coils (as annotated in UniProt) are shown,
as well as regions with known 3D structure (boxed in red, with PDB
accession codes shown). Notice the predominant location of the
repeat regions within the N-terminal regions of the proteins.
at synapses. Nevertheless, in FXTAS RNA toxicity is also proposed
to play a role in pathogenesis (Li and Bonini, 2010). The recently
discovered mechanisms of pathogenesis in spinocerebellar ataxia
type 8 (SCA8) uncovered the extreme complexity of TNR disorders.
SCA8 is caused by expansion of CTG/CAG repeats in the affected gene,
which are transcribed bi-directionally, generating expanded CUG- and
CAG-containing transcripts that are further translated into
homopolymeric proteins, so that pathogenesis can be mediated by both
RNA and protein toxicity (Merienne and Trottier, 2009). Curiously,
recent data have highlighted the possibility of non-ATG translation
across expanded TNR in all possible reading frames, which might
further contribute to the generation of novel toxic proteins and
RNAs, adding to the multi-parametric character of the pathogenic
mechanisms associated with TNR diseases (Li and Bonini, 2010;
Pearson, 2011; Sicot et al., 2011).
TRINUCLEOTIDE REPEATS WITHIN PROTEIN CODING REGIONS
Over 20 years ago, the finding that the expansion of CAG repeats
within the coding sequence of the androgen receptor gene was the
genetic basis of SBMA (La Spada et al., 1991) represented a landmark
in the discovery of these novel dynamic mutations and their
association with human disease. Some years later, the identification
of intracellular inclusions containing the expanded proteins
(Paulson et al., 1997) provided a clue to pathogenesis, directing
research in the field toward an extensive search for the mechanisms
of polyQ-induced protein aggregation. The moderate expansion of GCG
and CAG repeats, which are translated into polyA and polyQ tracts in
the affected proteins (Figures 1 and 2), results in protein
misfolding and aggregation, in accordance with a general, although
not always unique, toxic gain-of-function mechanism of pathogenesis
(Williams and Paulson, 2008). The appearance of insoluble
cytoplasmic or nuclear inclusions enriched in the expanded polyA- or
polyQ-containing protein constitutes a characteristic fingerprint of
these diseases (Messaed and Rouleau, 2009; Orr, 2012a), regardless
of their controversial role in pathogenesis. While the proteins
containing polyA repeats are predominantly transcription factors
with a role in development (see Table 1 and Amiel et al., 2004;
Messaed and Rouleau, 2009), most of the proteins linked to
polyQ-expansion diseases are involved in DNA-dependent regulation of
transcription or neurogenesis and often have multiple intermolecular
partners (Butland et al., 2007). Despite the overall lack of
sequence or structural homology, both polyQ- and polyA-repeat
expansions are associated with the formation of β-rich amyloid-like
protein inclusions, and with the wider group of protein misfolding
disorders. These inclusions are enriched in ubiquitin, proteasome
subunits, and chaperones, and often recruit macromolecules that are
part of the macromolecular interaction networks associated with the
proteins’ native functions (Williams and Paulson, 2008). As an
example, the poly(A)-binding protein PABPN1 forms insoluble
inclusions upon alanine expansion, co-aggregating with
poly(A)-mRNA, proteasome subunits, ubiquitin, heat-shock proteins,
and SKIP, a transcription factor associated with muscle-specific
gene expression (Brais, 2003; Tavanez et al., 2009; Winter et al.,
2013).
The simplistic view of the predominant role of the inclusions in
polyQ-induced pathogenesis was later challenged by the failure of
this mechanism to explain the cell-specific vulnerability
characteristic of each disease and by the identification of numerous
examples of neuronal toxicity in the absence of visible
intracellular inclusions (Arrasate et al., 2004). Indeed, the
inclusions were shown to be fibrillar and to display amyloid-like
properties both in vivo and in vitro (Huang et al., 1998; Bevivino
and Loll, 2001;
Sathasivam et al., 2010) and, in a mechanistic parallel with the
pathogenic mechanisms proposed for “classical” amyloids, many
studies suggested that the insoluble inclusions played a protective
role, sequestering toxic and misfolded protein conformers (Arrasate
et al., 2004; Rub et al., 2006; Miller et al., 2010). Indeed,
soluble intermediates in the aggregation pathway such as misfolded
β-sheet-rich polyQ protein monomers and oligomers have later been
identified and proposed to represent the major toxic species (Kayed
et al., 2003; Gales et al., 2005; Nagai et al., 2007; Miller et al.,
2011). Also, in OPMD, the primary toxic species are proposed to be
the soluble variants of the expanded polyA-repeat protein PABPN1
(Messaed et al., 2007). It is currently accepted that in polyQ
disorders the expanded region plays a role in inducing the
self-assembly of the carrier protein, which engages in pathogenic
interactions and leads to the formation of toxic monomers or
oligomers (Takahashi et al., 2008; Weiss et al., 2008) that are
later converted to insoluble intracellular amyloid-like oligomers
where both expanded and “normal” protein are sequestered along with
other macromolecular partners (reviewed in Williams and Paulson,
2008; Matos et al., 2011; Costa and Paulson, 2012). As more
biochemical data are gathered, more is understood about the role of
amino acid expansions in modulating the interaction with
macromolecular partners. As an example, expansion of the polyA tract
in PABPN1 results in increased association with Hsp70 chaperones and
type I arginine methyl transferases (Tavanez et al., 2009). This
indicates that the distinct neuropathological features arising from
this amino acid-repeat expansion might at least partially result
from alterations in the native biological functions and
macromolecular interactions of the carrier protein, which might vary
in different intracellular environments.
Recent data have shown that expansion of polyA repeats is
frequently associated with loss of normal function, altering a
multitude of cellular pathways with consequences for cell
functionality (Amiel et al., 2004; Messaed and Rouleau, 2009),
although protein aggregation might also play a dominant role in some
of the polyA-associated disorders (Messaed and Rouleau, 2009; Winter
et al., 2013). Studies with polyQ proteins have shown that
pathogenesis might result from a subtle imbalance in the association
of the mutant protein with multiple cellular partners and that
toxicity and neuronal death could result from a combination of
protein self-assembly and functional alterations (Friedman et al.,
2007; Li et al., 2007b; Lim et al., 2008; Kratter and Finkbeiner,
2010; Orr, 2012b; Pastore and Temussi, 2012). In fact, neuronal
death as a result of polyQ expansion seems to resemble that of the
linker cell in C. elegans (Pilar and Landmesser, 1976; Chu-Wang and
Oppenheim, 1978; Blum et al., 2012, 2013), which involves the polyQ
protein pqn-41, pointing to a common mechanism for linker cell death
and neuronal death in polyQ diseases (Blum et al., 2013).
Polyglutamine diseases constitute a representative and widely
studied group of neurodegenerative disorders for which considerable
amounts of data have been collected on the role of expanded polyQ in
disease pathogenesis. However, given the proposed function of polyQ
regions in mediating protein–protein interactions, which might be
modulated by polyQ expansion (Schaefer et al., 2012), the
information on the role of these regions in native protein function,
structure, and dynamics is still limited. Structural and functional
information on the role of these repeat sequences in protein
function is crucial to better understand how expansion affects
selected neuronal subpopulations. Below, we briefly discuss the
current knowledge on the function and structure of polyQ repeats and
their role in macromolecular interactions, and finally focus on the
known structural and functional information on ataxin-3, the protein
whose mutation causes MJD.
FUNCTION OF PolyQ ON PROTEIN–PROTEIN INTERACTIONS AND EVOLUTION
Until recently, the function of many amino acid-repeat-containing
proteins and the role of homopeptide regions were somewhat obscure.
However, several global analysis studies on single amino
acid-repeat-containing proteins shed light onto their function and
onto the biological significance of the repeated region, in
particular of polyQ, the most prevalent amino acid repetition in
humans (Alba and Guigo, 2004). It is now accepted that TNR,
particularly those located within protein-coding regions, are
important mutators providing the genetic variability required for
driving evolution (King, 1994; Kashi et al., 1997; Kashi and King,
2006; Nithianantharajah and Hannan, 2007). In fact, simple or
low-complexity amino acid-repeats are rare within prokaryotic
proteins but extremely abundant within eukaryotic proteins, and are
particularly over-represented in Plasmodium (49–90% of the total
proteome), D. discoideum (52%), D. melanogaster (20%), C. elegans
(9%), and H. sapiens (14%) (Haerty and Golding, 2010). Among all
homopolymeric repeats, the most common in eukaryotic proteins are
glutamine, asparagine, alanine, and glutamate repeats (Faux et al.,
2005). This seems to indicate that there has been strong negative
selection against the appearance of hydrophobic amino acid-repeats
with a high tendency to aggregate, such as polyisoleucine,
polyleucine, polyphenylalanine, and polyvaline (Oma et al., 2005,
2007).
The homopeptide regions seem to be particularly relevant for
brain development and function, since these repeated regions can be
found in various neurodevelopmental genes (Nithianantharajah and
Hannan, 2007). Indeed, the sexual behavior of prairie voles (Hammock
and Young, 2005), as well as human pair-bonding (Walum et al.,
2008), seems to be dependent on the repeat length in the vasopressin
1A receptor gene. A wide study of the distribution and function of
homopeptide-containing proteins also demonstrated a clear trend in
humans, D. melanogaster, and C. elegans, with the majority of
homopeptide-containing proteins performing roles in
transcription/translation and signaling processes and, to a lesser
extent, in transport and adhesion processes (Faux et al., 2005). A
similar profile was also found in a comparative analysis of proteins
with amino acid-repeats in human and rodents (Alba and Guigo, 2004)
and in a comparative genomic study in domestic dogs, which unveiled
an association between morphological variations and the length of
the repeated region in the transcription factor-encoding genes ALX4
and RUNX2 (Fondon and Garner, 2004). Analysis of the human genome
also revealed the existence of 64 CAG repeat-containing genes
involved in biological processes such as regulation of
transcription, binding of transcriptional co-activators and
transcription factors, and neurogenesis in general (Butland et al.,
2007). Additionally, a detailed analysis of the human polyQ database
(http://pxgrid.med.monash.edu.au/polyq/) (Robertson et al., 2011)
also indicated that the
majority of polyQ-containing proteins display domains involved in
development (Homeobox domain-containing proteins, Fibroblast growth
factor receptor), chromatin remodeling (Bromodomain and
PHD-containing proteins), and signal transduction (PDZ
domain-containing proteins), all biological processes that are
highly dependent on protein–protein interactions and associated with
the formation of multicomponent protein complexes. As for humans,
analysis of bovine polyQ proteins revealed an enrichment for large
multi-domain transcriptional regulators (Whan et al., 2010).
It is currently accepted that the majority of repeat-containing
proteins perform roles in processes that require the assembly of
large multiprotein or protein/nucleic acid complexes (Faux et al.,
2005; Hancock and Simon, 2005; Whan et al., 2010). Supporting this
notion is the fact that homopolymeric amino acid-repeats are
considered to be unstructured (Gojobori and Ueda, 2011) and that
intrinsically unstructured regions are suggested to constitute
macromolecular docking sites, which become structured only when
bound to cognate ligand partners (Huntley and Golding, 2002; Simon
and Hancock, 2009). In fact, “hub proteins” contain significantly
longer and more frequent repeats or disordered regions, which
facilitate binding to multiple partners (Dosztanyi et al., 2006).
Recently, Fiumara et al. (2010) found an overrepresentation of
coiled-coil domains in polyQ-containing proteins and in their
interaction partners, which are able to form α-helical
supersecondary structures, often inducing protein oligomerization
(Parry et al., 2008). Thus, polyQ tracts, owing to their intrinsic
structural flexibility, which is largely influenced by the flanking
residues (see PolyQ: A Simple Sequence Repeat with a Polymorphic
Structure below), may act as stabilizers of intra- and
intermolecular protein interactions, possibly by extending a
neighboring coiled-coil region to promote its interaction with a
coiled-coil region in an interacting protein partner (Schaefer et
al., 2012). A detailed analysis revealed heptad repeats typical of
coiled-coils in regions flanking or overlapping polyQ stretches,
whose disruption is sufficient to impair the CHIP-huntingtin
interaction, indicating that coiled-coils are crucial for
polyQ-mediated protein contacts. Importantly, coiled-coils also seem
to be important for the regulation of aggregation and insolubility
of polyQ-containing proteins (see below and Fiumara et al., 2010),
as recently proposed by Petrakis et al. (2012), who discovered a
recurrent presence of coiled-coil domains in ataxin-1 misfolding
enhancers, while such domains were not present in suppressors.
Based on these observations on the function of polyQ-containing
proteins, it is suggested that a general function of polyQ, as for
the majority of repeat sequences, is to aid in the assembly of
macromolecular complexes, either by tethering distant domains or
through interactions with the polyQ itself (Gerber et al., 1994;
Korschen et al., 1999; Faux et al., 2005). By affecting protein
interactions, and being present in particular functional classes
such as transcription factors, polyQ is considered central to the
evolution of this type of protein and consequently crucial to the
evolution of cellular signaling pathways (Hancock and Simon, 2005).
A structural analysis of polyQ repeats and their flanking
domains, as well as their role in protein aggregation, is discussed
in greater detail in the next sections.
STRUCTURAL STUDIES ON PolyQ REPEATS
Since the discovery that polyQ repeats are associated with human neurodegenerative diseases, a huge effort has been made to determine the structure of polyQ and to understand how expansion of the repeat affects the structure of the carrier protein and/or the normal interaction with molecular partners. The first evidence of the aggregation-prone character of polyQ-rich proteins came from studies with glutamine-rich cereal storage proteins and synthetic glutamine polypeptides (Beckwith et al., 1965; Krull et al., 1965). After the discovery that a number of neurological disorders were triggered by expansion of a polyQ tract in different and unrelated proteins (La Spada et al., 1994), and before intracellular inclusions enriched in the polyQ-expanded protein were identified as a major fingerprint of these diseases (Davies et al., 1997; Paulson et al., 1997), Perutz (1994) anticipated that the expanded polyQ tract could mediate protein–protein interactions, causing protein aggregation in neurons and recruiting other polyQ-rich proteins such as transcription factors, leading to cellular dysfunction. Below, the structural features and self-assembly properties of polyQ sequences are briefly discussed (for a detailed review on the biophysical and structural features of polyQ, see Wetzel, 2012).
PolyQ: A SIMPLE SEQUENCE REPEAT WITH A POLYMORPHIC STRUCTURE
In order to elucidate the structure of the glutamine repeat and to uncover the structural changes induced by polyQ expansion, several strategies have been put forward, including (a) the structural analysis of polyQ-containing peptides of different lengths, (b) the characterization of proteins of well-known structure after insertion of an exogenous polyQ repeat, and the structural determination of (c) polyQ-antibody complexes, or (d) natural polyQ-rich proteins.
Using synthetic peptides containing 15 glutamines, Perutz and coworkers proposed that polyQ stretches could self-associate, forming hydrogen bonds between their side-chain amide groups and the main chain of a neighboring β-strand, to form cross-β structures (polar zippers) (Perutz, 1994). This study was followed by many reports where synthetic polyQ peptides were used as models of the biophysical properties of polyQ-rich proteins, which established that polyQ-containing peptides have a tendency toward self-assembly into amyloid-like structures (Chen et al., 2002a). Moreover, the results obtained in vitro reflected disease features observed in vivo, such as the correlation between larger polyQ size, increased protein aggregation, and earlier disease onset (Chen et al., 2002b; Kar et al., 2011). Circular dichroism studies of polyQ peptides in solution have shown that their monomeric forms lack regular secondary structure (Altschuler et al., 1997; Klein et al., 2007), and additional biophysical experiments proposed that these peptides can adopt collapsed (Crick et al., 2006; Dougan et al., 2009; Peters-Libeu et al., 2012) or extended (Singh and Lapidus, 2008) coils in solution, whose compactness was strongly correlated with the polyQ size (Walters and Murphy, 2009). The determination of the structure of monomeric polyQ peptides in atomic detail is, however, still lacking as a result of their intrinsic conformational flexibility and tendency to aggregate into heterogeneously sized β-rich oligomers. From the combination of experimental and theoretical methods, a picture of polyQ structure and aggregation is emerging, in which monomeric polyQ adopts an ensemble of conformations lacking regular secondary structure that assemble into β-structures in a polyQ-length dependent fashion (Vitalis et al., 2009; Walters and Murphy, 2009, 2011; Williamson et al., 2010; Kar et al., 2011). Divergent results proposing the existence of predominantly extended or collapsed conformations, or the minimum size for polyQ aggregation, are likely due to differences in the flanking residues introduced (Kar et al., 2011). They might also result from the insertion of different polyQ tract-interrupting residues (Walters and Murphy, 2011), or be a consequence of the protocols used for the preparation and disaggregation of the peptides used for the biophysical studies (Jayaraman et al., 2011). Most results obtained with these peptides generally do not take into account the possible effects of the protein context on the structural properties of the polyQ stretches, a particularly relevant feature considering that a role of non-polyQ domains in protein aggregation has been reported for ataxin-1 (de Chiara et al., 2005), ataxin-3 (Gales et al., 2005), and huntingtin (Tam et al., 2009; Thakur et al., 2009; Liebman and Meredith, 2010).
In a pioneering work, Stott et al. (1995) inserted a G-Q10-G peptide into the inhibitory loop of chymotrypsin inhibitor 2 (CI2), a small soluble protein from barley seeds, showing that this CI2-polyQ chimera has an increased tendency for self-assembly. Even though a CI2 variant with four glutamines crystallized, the structure of the CI2-Q4 dimer showed that the polyQ region was disordered and that oligomerization was mediated by domain swapping (Figure 3A) and not by direct polyQ association (Chen et al., 1999). A structure resembling the proposed polar zipper was later observed between two asparagines in the hinge loop of the major domain-swapped dimer of bovine pancreatic ribonuclease A (Liu et al., 2001) (Figure 3B). Insertion of a 10-glutamine repeat within this hinge loop of ribonuclease A resulted in domain swapping, oligomerization, and amyloid-like fiber formation, but strikingly the enzyme within the fibers was catalytically active, retaining its native fold (Sambashivan et al., 2005). However, although the structure of the domain-swapped dimer was solved by X-ray crystallography, the repeat region was not visible in the electron density maps.
FIGURE 3 | Structure of proteins/protein domains containing polyQ regions. (A) Cartoon representation of the domain-swapped dimer of chymotrypsin inhibitor 2 with a four-glutamine insertion [(Chen et al., 1999); PDB accession code 1cq4]; dotted lines represent the polyQ linker not visible in the X-ray crystal structure. (B) Cartoon representation of domain-swapped major dimers of ribonuclease A. Inset shows a short segment resembling the polar zipper formed by asparagine residues in the linker region [(Liu et al., 2001); PDB accession code 1f0v]. (C) Surface representation of the Fv fragment of a monoclonal antibody in complex with a polyQ peptide shown as sticks [(Li et al., 2007a); PDB accession code 2otu]. (D) Cartoon representation of the glutamine-rich domain from HDAC4 showing details of the polar interactions (dotted lines) at the oligomer interfaces involving glutamine residues [(Guo et al., 2007); PDB accession code 2o94]. (E) Cartoon representation of the crystal structures of huntingtin exon-1 fragments observed in different crystal forms, highlighting the different orientations of the C-terminal polyQ residues shown as sticks. The 17-glutamine stretch adopts variable conformations in the structures: α-helix, random coil, and extended loop [(Kim et al., 2009); PDB accession codes 3io4, 3iow, 3iov, 3iou, 3iot, 3ior, 3io6].
A first view of a short polyQ stretch at atomic resolution resulted from the structure of a polyQ10 peptide (GQ10G) (Figure 3C) bound to MW1, an antibody against polyQ. This structure reveals that polyQ adopts an extended, coil-like structure in which contacts are made between side-chain and/or main-chain atoms of all 10 glutamines and the antibody-combining site (Li et al., 2007a). The peculiar structural features of these repeat-containing regions were also revealed by the crystallographic structure of a glutamine-rich domain of human histone deacetylase 4 (HDAC4), which folds into a tetramer-forming straight α-helix (Figure 3D). The protein interfaces consist of multiple hydrophobic patches separated by polar interaction networks, in which clusters of glutamines engage in extensive intra- and interhelical interactions (Guo et al., 2007). Further details on the structure of polyQ were unveiled by the high-resolution crystal structures of huntingtin (HD) exon 1, containing 17 glutamines (Htt17Q) (Kim et al., 2009). Htt17Q in fusion with maltose-binding protein (MBP) folds into an amino-terminal α-helix followed by a polyQ17 region that adopts multiple conformations in the different crystal forms, including α-helix, random coil, and extended loop, and a polyproline helix formed by the polyP11 and mixed P/Q regions (Figure 3E). The authors suggested that the shallow equilibrium between α-helical, random coil, and extended conformations can be subtly altered by the size of the polyQ sequence, the neighboring protein context, protein interactions, or by changes in the cellular environment, and that this polymorphic behavior is a common characteristic of many amyloidogenic proteins (Kim et al., 2009).
SELF-ASSEMBLY AND AGGREGATION OF PolyQ REPEATS
The first approaches to characterize polyQ-induced protein aggregation and pathogenesis in the context of a full-length protein included the insertion of polyQ peptides into well-known non-pathogenic protein carriers such as hypoxanthine phosphoribosyltransferase (HPRT), which resulted in a neurological phenotype mimicking that observed in mice expressing the mutant HD truncated protein (Ordway et al., 1997). In vitro studies aiming at better characterizing the structure and function of polyQ repeats in the context of full-length soluble proteins included the insertion of ectopic polyQ stretches into well-characterized soluble proteins such as CI2 (Stott et al., 1995; Chen et al., 1999), myoglobin (Mb) (Tanaka et al., 2001; Tobelmann and Murphy, 2011), glutathione S-transferase (GST) (Masino et al., 2002; Bulone et al., 2006), and the B domain from Staphylococcus aureus Protein A (SpA) (Saunders et al., 2011). Fusion of the polyQ sequences with stable and soluble proteins moderates the intrinsic polyQ peptide aggregation propensity, but induces the self-assembly of carrier proteins into fibrillar amyloid-like structures, a nucleation-dependent process whose kinetics scale with the size of the inserted polyQ repeat. Likewise, polyQ peptides are able to seed the aggregation of intracellular soluble polyQ-containing proteins when added to cell cultures, conferring a heritable phenotype of self-sustaining seeding, resembling a prion-like mechanism (Ren et al., 2009; reviewed in Cushman et al., 2010).
The impact of the polyQ tract and its expansion on the perturbation of the structure of flanking sequences and domains is critically dependent on the location of the amino acid repeats, with impressive location-dependent changes in the structural stability and fibril morphology of the host proteins (Robertson et al., 2008; Saunders et al., 2011; Tobelmann and Murphy, 2011). Curiously, the studies with these model proteins showed that the stability and structure of the carrier protein remained unaltered by polyQ expansion when the repeat was inserted at the N- or C-terminus of the structured domain (Robertson et al., 2008), mimicking the location of polyQ tracts in most disease-related proteins (Figure 1).
The role of the flanking regions in modulating protein fibril formation in polyQ disease proteins is well supported by experimental data (de Chiara et al., 2005; Gales et al., 2005; Bhattacharyya et al., 2006; Saunders and Bottomley, 2009; Tam et al., 2009; Thakur et al., 2009; Liebman and Meredith, 2010), in agreement with the knowledge that different polyQ-containing proteins have different thresholds for aggregation. For example, addition of a polyproline extension after the polyQ repeat slows down aggregation (Bhattacharyya et al., 2006), while protein domains outside the polyQ tract [e.g., the Josephin domain (JD) of ataxin-3 and the AXH domain of ataxin-1] have been shown to contribute to protein aggregation (Masino et al., 2004; de Chiara et al., 2005; Gales et al., 2005; Ellisdon et al., 2006, 2007). The multitude of data on the polyQ-induced aggregation of disease and non-disease proteins highlights the complex interplay between the polyQ region and the adjacent protein domains. In light of the polymorphic nature of the polyQ and the modulation of its structural features by the protein context, two general mechanisms have been proposed for polyQ-mediated toxicity (Kim et al., 2009): (a) the expanded polyQ stretch adopts a novel conformation that mediates toxicity or is the precursor to toxic species; (b) intra- or intermolecular protein interactions mediated by expanded polyQ in the random coil conformation are sufficient to result in pathological effects. In both cases the affinity of the interactions involving the expanded polyQ region could be higher with selected target proteins, leading to a preference of the disease proteins for some of their protein partners. This is in agreement with the hypothesis raised by Zuchner and Brundin (2008), who postulate that the resistance to NMDA receptor-mediated excitotoxicity occurring in some mouse models of HD is a consequence of differential binding of partner proteins, in a polyQ tract size-dependent manner, to the proline-rich domain of huntingtin. In this context, differences in molecular interactions occurring in a cell- and tissue-specific manner would result in different toxicities according to the particular cellular environment.
Given the above-mentioned studies, it is now clear that the polyQ region influences the aggregation of proteins, but that this process is highly dependent on the surrounding protein context. Therefore, even though the structural information on peptides and proteins with polyQ expansions is a useful guideline for investigating the pathogenic effects of polyQ expansion, each of the proteins involved in polyQ diseases shows distinctive characteristics, cellular roles, and structural properties, which makes it difficult to formulate structural hypotheses that could explain how different monomeric conformations of polyQ lead to various aggregated species and how these contribute to neurotoxicity.
PolyQ REPEATS IN ATAXIN-3 FUNCTION AND DYSFUNCTION
Machado-Joseph disease (MJD) is an inherited neurodegenerative disorder of adult onset originally described in people of Portuguese Azorean descent but later shown to be the most common autosomal dominant spinocerebellar ataxia worldwide. Clinically, it is characterized by ataxia, ophthalmoplegia, and pyramidal signs, associated in variable degree with dystonia, spasticity, peripheral neuropathy, and amyotrophy (Coutinho and Andrade, 1978). Pathologically, the disorder is associated with degeneration of the deep nuclei of the cerebellum, pontine nuclei, subthalamic nuclei, substantia nigra, and spinocerebellar nuclei (Coutinho et al., 1982; Rosenberg, 1992; Margolis and Ross, 2001). It is caused by an expansion of a repetitive CAG tract within the ATXN3 gene (Kawaguchi et al., 1994). While in the healthy population the number of CAG repeats ranges between 10 and 51, in MJD patients the length of the ataxin-3 polyQ tract exceeds 55 consecutive residues. Ataxin-3 is a modular protein, located both in the nucleus and the cytoplasm (Perez et al., 1999; Antony et al., 2009; Macedo-Ribeiro et al., 2009), encompassing an N-terminal globular JD, with structural similarity to cysteine proteases (Scheel et al., 2003; Albrecht et al., 2004), followed by an extended tail composed of two ubiquitin interaction motifs (UIMs), the expandable polyQ tract, and a C-terminal region (Matos et al., 2011). The C-terminal region of ataxin-3 may contain a third UIM, depending on the splice variant (Goto et al., 1997), with the 3UIM isoform of ataxin-3 being predominantly found in the brain (Harris et al., 2010). Currently, the physiological function of ataxin-3, as well as the molecular mechanism by which expanded polyQ sequences cause selective neurodegeneration, remain mostly unknown. However, since ataxin-3 is ubiquitously expressed whereas cell death is region specific, neurodegeneration is currently viewed as depending on sequence and structural features outside the ataxin-3 polyQ tract [reviewed in Matos et al. (2011) and references therein].
ATAXIN-3 BIOLOGICAL ROLES
ATXN3 orthologs have been identified in eukaryotic organisms including protozoans, plants, fungi, and animals (Albrecht et al., 2004; Costa et al., 2004; Rodrigues et al., 2007). Several functions have been ascribed to ataxin-3 based on studies with orthologs. Specifically, a role in cell structure and/or motility was proposed for mouse ataxin-3, as it is highly abundant in all types of muscle and in ciliated epithelial cells (Costa et al., 2004). In fact, ataxin-3 is able to interact with tubulin through its JD (Figure 4) with nM affinity (Mazzucchelli et al., 2009), which supports a role in cell structure. Interestingly, data on the C. elegans ortholog of ataxin-3 not only reinforce a function in structure/motility and signal transduction (Rodrigues et al., 2007), but also indicate a function in development, as absence of ATXN3 strongly modifies the expression of several development-related genes. ATXN3 knock-out animals showed no obvious deleterious phenotype, probably due to a putative redundant function between ataxin-3 and other JD-containing proteins, such as ataxin-3-like protein, Josephin 1, and Josephin 2, all containing a typical cysteine protease catalytic triad. However, the studies with ATXN3 knock-out animals revealed an overall increase in the levels of ubiquitinated proteins (Schmitt et al., 2007) and signs of altered expression of core sets of genes associated with the ubiquitin-proteasome and signal transduction pathways (Rodrigues et al., 2007), pointing to a dual function of ataxin-3 in the ubiquitin-proteasome system and transcriptional regulation (Matos et al., 2011; Orr, 2012a).
Ataxin-3 function as transcriptional regulator
The putative role of ataxin-3 in transcriptional regulation is proposed to entail the modulation of histone acetylation and deacetylation at selected promoters. Ataxin-3 interacts with the major histone acetyltransferases cAMP-response-element binding protein (CREB)-binding protein (CBP), p300, and p300/CREB-binding protein-associated factor (KAT2B/PCAF; Figures 4 and 5), and is proposed to inhibit transcription at specific promoters (e.g., the MMP-2 promoter) either by blocking access to histone acetylation sites or through recruitment of histone deacetylase 3 (HDAC3) and nuclear receptor co-repressor (NCOR1; Figures 4 and 5) (Li et al., 2002; Evert et al., 2006). Although the interaction sites have not been mapped in detail for all these proteins, co-immunoprecipitation experiments showed that KAT2B/PCAF, p300, and CBP bind exclusively to the polyQ-containing C-terminal region of ataxin-3 (Figure 4), apparently in a polyQ-size dependent manner (Li et al., 2002). Experimental evidence also indicates that ataxin-3 forms part of a CREB-containing complex, although no direct interaction has been observed between the two proteins (Li et al., 2002). In contrast, the N-terminal region of ataxin-3 directly binds histones H3 and H4 (Table 2; Figure 4) (Li et al., 2002). Of note, p300 and CBP, as well as NCOR1, also encompass amino acid repetitions in their sequences. Interestingly, in huntingtin and in ataxin-1, polyQ interferes with CBP-activated gene transcription via interaction of their glutamine-rich domains (Shimohata et al., 2000; Nucifora et al., 2001), and mutant huntingtin targets specific components of the core transcriptional machinery in a glutamine-tract length-sensitive manner (Zhai et al., 2005), pinpointing once again the role of the amino acid-repeat region in the establishment of protein–protein interactions.
Ataxin-3 molecular function: ubiquitin hydrolase
A role for ataxin-3 in ubiquitin-dependent pathways was proposed by bioinformatic analysis (Scheel et al., 2003; Albrecht et al., 2004), and its ability to bind and cleave poly-ubiquitin chains and polyubiquitinated proteins was later demonstrated experimentally (Burnett et al., 2003; Chai et al., 2004). Importantly, inhibition of ataxin-3 catalytic activity results in an increase of polyubiquitinated proteins, resembling the effects of proteasome inhibition (Berke et al., 2005), indicating that ataxin-3 is involved with proteins targeted for proteasomal degradation. The function of ataxin-3 in the ubiquitin-proteasome system was further supported by the identification of its association with the ubiquitin-like domain of the human homologs of the yeast DNA repair protein Rad23, HHR23A and HHR23B (Wang et al., 2000; Doss-Pepe et al., 2003; Nicastro et al., 2005, 2009), with valosin-containing protein (VCP)/p97 (Hirabayashi et al., 2001; Doss-Pepe et al., 2003; Boeddrich et al., 2006; Zhong and Pittman, 2006), and with the ubiquitin ligase E4B (Matsumoto et al., 2004) (Figures 4 and 5). Strikingly, the weak direct association between ataxin-3 and E4B is strongly reinforced by the addition of VCP/p97, indicating that these proteins form part of a higher order macromolecular complex to regulate the degradation of misfolded ER proteins (Matsumoto et al., 2004; Zhong and Pittman, 2006) (Figure 5).
FIGURE 4 | Overview of ataxin-3 structural information. Schematic illustration of ataxin-3 (isoform 2; a.k.a. 3UIM isoform) domain structure highlighting the regions involved in protein–protein interactions. The solution structures of the Josephin domain (PDB accession code 1yzb) and UIMs 1-2 (PDB accession code 2klz) are shown colored from N- (blue) to C-terminus (red). JD-, UIM-, NLS-, and polyQ-mediated interactions are represented by blue, red, green, and purple arrows, respectively; blue arrows indicate the location of post-translational modification sites, resulting from the interaction and phosphorylation by CK2 and GSK3. Representative multi-subunit complexes where ataxin-3 participates are boxed (Li et al., 2002; Matsumoto et al., 2004; Scaglione et al., 2011; Durcan et al., 2012). One of the main questions in the quest for ataxin-3 interacting proteins is whether polyQ expansion of the disease protein modulates the binding affinities. Current data indicate that polyQ expansion increases the affinity of ataxin-3 for CHIP (Scaglione et al., 2011), VCP/p97 (Matsumoto et al., 2004; Boeddrich et al., 2006; Zhong and Pittman, 2006), and the transcription regulators p300, CBP, and PCAF (Li et al., 2002) (interactions represented by broken lines). Strikingly, all these interactions are mediated by the flexible tail of ataxin-3, which includes the polyQ tract. Moreover, the transcriptional regulators p300, CBP, and NCOR all contain amino acid repeats.
Biochemical studies showed that ataxin-3 displays a strong preference for chains containing four or more ubiquitins (Chai et al., 2004) and that full-length ataxin-3 and its JD both display proteolytic activity toward either linear substrates containing a single ubiquitin molecule (Burnett et al., 2003; Chow et al., 2004b; Weeks et al., 2011) or K48/K63-linked poly-ubiquitin chains (Winborn et al., 2008; Todi et al., 2009), and also have the capacity to bind the ubiquitin-like protein NEDD8 in a substrate-like fashion (Ferro et al., 2007). Moreover, ataxin-3-like protein, Josephin 1, and Josephin 2 also display ubiquitin protease activity (Tzvetkov and Breuer, 2007; Weeks et al., 2011), although the relative activities are highly variable in spite of their high sequence similarity. Characterization of ataxin-3 ubiquitin hydrolase activity has also revealed that the full-length protein preferentially cleaves Lys-63-linked and mixed-linkage chains with more than four ubiquitins (Burnett et al., 2003; Winborn et al., 2008). This specificity is dictated by the UIMs, as the isolated JD shows a preference toward the disassembly of Lys-48-linked chains (Nicastro et al., 2009, 2010). Altogether, this indicates that ataxin-3 ubiquitin hydrolase activity is likely to be associated with the delivery of target substrates to the proteasome rather than with their rescue from degradation, as happens with most of the other deubiquitinases (Ventii and Wilkinson, 2008; Matos et al., 2011; Scaglione et al., 2011). Interestingly, the ubiquitin hydrolase activity of ataxin-3 is not affected by polyQ expansion, and both normal and expanded ataxin-3 are able to increase the cellular levels of a short-lived GFP normally degraded by the ubiquitin-proteasome pathway (Burnett et al., 2003).
The 3D structures for JD alone or in the presence of ubiquitin
aswell as that of the tandem UIM1-UIM2 have already been
deter-mined (Mao et al., 2005; Nicastro et al., 2005, 2009; Song et
al.,2010), giving a structural perspective on the ubiquitin
hydrolasefunction of ataxin-3. The JD contains two ubiquitin
binding sites,both of hydrophobic nature, with site 1 being
negatively charged to
Frontiers in Neurology | Neurodegeneration June 2013 | Volume 4
| Article 76 | 12
http://www.frontiersin.org/Neurodegenerationhttp://www.frontiersin.org/Neurodegeneration/archive
-
Almeida et al. Structure and function of trinucleotide
repeats
Tab
le2
|Hu
man
atax
in-3
asso
ciat
edp
rote
ins.
Ata
xin
-3in
tera
ctin
gp
rote
in
(Un
iPro
tac
cess
ion
cod
e)
Pro
tein
nam
eD
irec
tin
tera
ctio
n?
Inte
ract
ion
do
mai
ns
Ref
eren
ce
Ata
xin
-3Pa
rtn
erp
rote
in
CE
LL-Q
UA
LITY
CO
NT
RO
L(P
RO
TE
INH
OM
EO
STA
SIS
)
HH
R23
A/B
(P54
725/
P54
727)
UV
exci
sion
repa
irpr
otei
n
RA
D23
hom
olog
A/B
Yes,
kD(J
D:U
bl)=
12µ
MJD
Ubi
quiti
n-lik
e(U
bl)
N-t
erm
inal
dom
ain
Wan
get
al.(
2000
),D
oss-
Pepe
etal
.
(200
3),N
icas
tro
etal
.(20
05,2
009)
Poly
-ubi
quiti
n(P
0CG
48/P
0CG
47)
Poly
ubiq
uitin
-
C/P
olyu
biqu
itin-
B
Yes,
kD(a
txn3
:K48
-
tetr
aUb)=
0.2
µM
,kD
(atx
n3:U
b)=
50µ
M
UIM
s,JD
K48
-and
K63
-link
edU
b
(≥4
Ub)
,K48
-link
eddi
Ub
Bur
nett
etal
.(20
03),
Dos
s-Pe
pe
etal
.(20
03),
Cha
iet
al.(
2004
),
Nic
astr
oet
al.(
2009
,201
0)
Ubi
quili
n-1
(Q9U
MX
0)Pr
otei
nlin
king
IAP
with
cyto
skel
eton
1
n.d.
n.d.
n.d.
Hei
ret
al.(
2006
)
NE
DD
8(Q
1584
3)U
biqu
itin-
like
prot
ein
Ned
d8
Yes
JDN
ED
D8
Ferr
oet
al.(
2007
)
Park
in(O
6026
0)E
3ub
iqui
tin-p
rote
inlig
ase
park
in
Yes
JD,U
IMs
IBR
dom
ain,
Ubi
quiti
n-lik
e
(Ubl
)dom
ain
Dur
can
etal
.(20
11,2
012)
Ubc
7(P
6225
3)U
biqu
itin-
conj
ugat
ing
enzy
me
E2
G1
Yes
(tra
nsie
ntin
tera
ctio
n
dete
cted
usin
g
cros
s-lin
king
reag
ents
)
n.d.
n.d.
Dur
can
etal
.(20
12)
p45
(P62
195)
26S
prot
easo
me
regu
lato
rysu
buni
t8
Yes
N-t
erm
inal
atxn
3re
gion
(res
idue
s1–
133)
n.d.
Wan
get
al.(
2007
)
20S
Prot
easo
me
(P25
786,
P25
787,
P25
788,
P25
789,
P28
066,
P60
900,
O14
818,
P20
618,
P49
721,
P49
720,
P28
070,
P28
074,
P28
072,
Q99
436)
Prot
easo
me
subu
nits
α
type
s1-
7an
dβ
type
s1-
7
n.d.
N-t
erm
inal
atxn
3re
gion
(res
idue
s1–
150)
n.d.
Dos
s-Pe
peet
al.(
2003
)
CH
IP(Q
9UN
E7)
E3
ubiq
uitin
-pro
tein
ligas
e
CH
IP
Yes,
kD
(atx
n3:C
HIP
)=2.
2µ
M,k
D
(atx
n3:U
b-C
HIP
)=0.
1µ
M
Atx
n3C
-ter
min
us
(res
idue
s13
3–35
7)
CH
IPN
-ter
min
usJa
naet
al.(
2005
),S
cagl
ione
etal
.
(201
1)
VCP
/p97
(P55
072)
Tran
sitio
nale
ndop
lasm
ic
retic
ulum
ATPa
se
Yes
Res
idue
s27
7–28
1
(incl
udes
argi
nine
/lysi
ne-r
ich
NLS
)
Ndo
mai
n,re
sidu
es1-
199
Hira
baya
shie
tal
.(20
01),
Dos
s-Pe
pe
etal
.(20
03),
Mat
sum
oto
etal
.
(200
4,?)
Boe
ddric
het
al.(
2006
),an
d
Zhon
gan
dP
ittm
an(2
006)
E4B
(O95
155)
Ubi
quiti
nco
njug
atio
n
fact
orE
4B
Yes
(with
79Q
-ata
xin-
3)n.
d.n.
d.M
atsu
mot
oet
al.(
2004
)
(Con
tinue
d)
www.frontiersin.org June 2013 | Volume 4 | Article 76 | 13
http://www.frontiersin.orghttp://www.frontiersin.org/Neurodegeneration/archive
-
Almeida et al. Structure and function of trinucleotide
repeats
Tab
le2
|Co
nti
nu
ed
Ata
xin
-3in
tera
ctin
gp
rote
in
(Un
iPro
tac
cess
ion
cod
e)
Pro
tein
nam
eD
irec
tin
tera
ctio
n?
Inte
ract
ion
do
mai
ns
Ref
eren
ce
Ata
xin
-3Pa
rtn
erp
rote
in
OTU
B2
(Q96
DC
9)U
biqu
itin
thio
este
rase
OTU
B2
n.d.
n.d.
n.d.
Sow
aet
al.(
2009
)
US
P13
(Q92
995)
Ubi
quiti
nca
rbox
yl-t
erm
inal
hydr
olas
e13
n.d.
n.d.
n.d.
Sow
aet
al.(
2009
)
KC
TD10
(Q9H
3F6)
BTB
/PO
Z
dom
ain-
cont
aini
ngad
apte
r
for
CU
L3-m
edia
ted
Rho
A
degr
adat
ion
prot
ein
3
n.d.
n.d.
n.d.
Sow
aet
al.(
2009
)
Tubu
lindi
mer
(Q71
U36
/P68
363)
Tubu
linα-1
A,T
ubul
inβ-2
BYe
s,kD
(atx
n3:tu
bulin
)=50
–70
nM
JDn.
d.M
azzu
cche
lliet
al.(
2009
)
Dyn
ein
(Q9Y
6G9)
Cyt
opla
smic
dyne
in1
light
inte
rmed
iate
chai
n1
n.d.
n.d
n.d.
Bur
nett
and
Pitt
man
(200
5)
HD
AC
6(Q
9UB
N7)
His
tone
deac
etyl
ase
6n.
d.n.
d.n.
d.B
urne
ttan
dP
ittm
an(2
005)
TR
AN
SC
RIP
TIO
NA
LR
EG
ULA
TIO
N
p300
(Q09
472)
His
tone
acet
yltr
ansf
eras
e
p300
Yes
Poly
Q-c
onta
inin
gC
term
inus
ofat
xn3
(res
idue
s28
8–35
4)
n.d.
Liet
al.(
2002
)
CB
P(Q
9279
3)cA
MP-
resp
onse
-ele
men
t
bind
ing
prot
ein
(CR
EB
)-bin
ding
prot
ein
Yes
Poly
Q-c
onta
inin
gC
term
inus
ofat
xn3
(res
idue
s28
8–35
4)
n.d.
Liet
al.(
2002
)
PC
AF
(Q92
831)
p300
/CR
EB
-bin
ding
prot
ein-
asso
ciat
edfa
ctor
:
hist
one
acet
yltr
ansf
eras
e
KAT
2B
Yes
Poly
Q-c
onta
inin
gC
term
inus
ofat
xn3
(res
idue
s28
8–35
4)
n.d.
Liet
al.(
2002
)
His
tone
H3/
H4
(P68
431/
P62
805)
His
tone
Yes
JD+
UIM
1an
d2
(res
idue
s1–
288)
n.d.
Liet
al.(
2002
)
HD
AC
3(O
1537
9)hi
ston
ede
acet
ylas
e3
Yes
n.d.
n.d.
Eve
rtet
al.(
2006
)
NC
OR
1(O
7537
6)N
ucle
arre
cept
or
core
pres
sor
1
n.d.
n.d.
n.d.
Eve
rtet
al.(
2006
)
MA
ML3
(Q96
JK9)
Mas
term
ind-
like
prot
ein
3n.
d.n.
d.n.
d.R
avas
iet
al.(
2010
)
EW
SR
1(Q
0184
4)R
NA
-bin
ding
prot
ein
EW
Sn.
d.n.
d.Vi
naya
gam
etal
.(20
11)
(Con
tinue
d)
Frontiers in Neurology | Neurodegeneration June 2013 | Volume 4
| Article 76 | 14
http://www.frontiersin.org/Neurodegenerationhttp://www.frontiersin.org/Neurodegeneration/archive
-
Almeida et al. Structure and function of trinucleotide
repeats
Tab
le2
|Co
nti
nu
ed
Ata
xin
-3in
tera
ctin
gp
rote
in
(Un
iPro
tac
cess
ion
cod
e)
Pro
tein
nam
eD
irec
tin
tera
ctio
n?
Inte
ract
ion
do
mai
ns
Ref
eren
ce
Ata
xin
-3Pa
rtn
erp
rote
in
SIG
NA
LT
RA
NS
DU
CT
ION
CK
2(P
1978
4)C
asei
nki
nase
IIsu
buni
tα
Yes
n.d.
n.d.
Tao
etal
.(20
08),
Mue
ller
etal
.
(200
9)
GS
K3B
(P49
841)
Gly
coge
nsy
ntha
se
kina
se-3
β
Yes
n.d
n.d
Feie
tal
.(20
07),
Vina
yaga
met
al.
(201
1)
DN
M2
(P50
570)
Dyn
amin
-2n.
d.n.
d.n.
d.Vi
naya
gam
etal
.(20
11)
CD
KN
1A(P
3893
6)C
yclin
-dep
ende
ntki
nase
inhi
bito
r1
n.d.
n.d.
n.d.
Vina
yaga
met
al.(
2011
)
AN
XA
7(P
2007
3)A
nnex
inA
7n.
d.n.
d.n.
d.Vi
naya
gam
etal
.(20
11)
RP
S6A
K1
(Q15
418)
Rib
osom
alpr
otei
nS
6
kina
seα-1
n.d.
n.d.
n.d.
Vina
yaga
met
al.(
2011
)
TK1
(P04
183)
Thym
idin
eki
nase
,cyt
osol
icn.
d.n.
d.n.
d.Vi
naya
gam
etal
.(20
11)
MK
NK
1(Q
9BU
B5)
MA
Pki
nase
-inte
ract
ing
serin
e/th
reon
ine-
prot
ein
kina
se1
n.d.
n.d.
n.d.
Vina
yaga
met
al.(
2011
)
ATA
XIO
ME
TEX
11(Q
8IY
F3)
Test
is-e
xpre
ssed
sequ
ence
11pr
otei
n
n.d.
n.d.
n.d.
Lim
etal
.(20
06)
C16
orf7
0(Q
9BS
U1)
UP
F018
3pr
otei
nC
16or
f70
n.d.
n.d.
n.d.
Lim
etal
.(20
06)
AR
HG
AP
19(Q
14C
B8)
Rho
GTP
ase-
activ
atin
g
prot
ein
19
n.d.
n.d.
n.d.
Lim
etal
.(20
06)
PIC
K1
(Q9N
RD
5)P
RK
CA
-bin
ding
prot
ein
n.d.
n.d.
n.d.
Lim
etal
.(20
06)
Boxes shaded in gray represent associations identified in high-throughput interactome screenings.
Atxn3, ataxin-3; IBR, In Between Ring fingers; JD, Josephin domain; n.d., not determined; NLS, nuclear localization sequence; Ub, ubiquitin; UBA, ubiquitin associated domain; Ubl, ubiquitin-like domain; UIM, ubiquitin-interacting motifs.
FIGURE 5 | Overview of the ataxin-3 protein interaction network.
Data on the ataxin-3 interactors were obtained by analysis of the
Interactome3D (Mosca et al., 2012), MINT (Ceol et al., 2010), and
Dr. PIAS (Sugaya and Furuya, 2011) protein interaction databases,
and completed with data compiled from the current literature on
ataxin-3 protein associations obtained with a diverse set of
experimental approaches (see complete information in Table 2).
Red arrows indicate interactions for which structural data have
been obtained, while orange arrows indicate that biophysical data
on interaction affinity in vitro are available (Table 2). Broken
arrows represent interactions that result from high-throughput
interactome analysis and still require detailed biochemical and
functional analysis. Proteins are grouped according to their
biological role.
facilitate docking of the positively charged ubiquitin C-terminus
close to the catalytic site. Binding of ubiquitin to site 1 is of
crucial importance for the ubiquitin hydrolase activity of both the
JD and full-length ataxin-3 (Nicastro et al., 2010). Site 2 confers
ubiquitin-chain linkage preference to ataxin-3 and overlaps with the
surface for interaction with the ubiquitin-like domain of HHR23B
(Nicastro et al., 2005, 2010). The solution structure of the two
UIMs (UIM1 and UIM2), which are separated by a short two-amino-acid
spacer, revealed that they fold into two α-helices separated by a
flexible linker (Song et al., 2010). Upon ubiquitin binding, this
structure adopts a typical helix-loop-helix folding pattern, in
which hydrophobic interactions dominate complex formation (Song et
al., 2010). In tandem, UIM1 and UIM2 show higher binding affinity
for mono- or poly-ubiquitin than the individual UIMs, suggesting a
cooperative binding mechanism (Song et al., 2010). The effect of
UIM3 on the binding affinity of ataxin-3 for ubiquitin has not been
shown, but its role in ubiquitin chain binding and recognition is
unlikely to be relevant to ataxin-3 activity, since no differences
in proteolytic activity were identified when the 2UIM and 3UIM
isoforms were compared. In the model proposed for ataxin-3 ubiquitin
chain proteolysis, the UIMs (UIM1-UIM2) select and recruit
poly-ubiquitin substrates, presenting them to the catalytic JD for
cleavage (Mao et al., 2005).
Even though ataxin-3 functions as a ubiquitin hydrolase, its
proteolytic activity is rather low, indicating that either
ataxin-3/JD requires additional factors (post-translational
modifications, cofactors, intracellular interactions) to exhibit
significant proteolytic activity or the substrates used in vitro so
far are not optimal. Interestingly, only three amino acid mutations
are sufficient to significantly increase the proteolytic activity of
ataxin-3, to a value close to that of ataxin-3-like protein (Weeks
et al., 2011). Under physiological conditions, one candidate for an
activating signal is mono-ubiquitination at K117, which has been
shown to increase the enzyme's rate of cleavage of Lys-63-linked
substrates (Todi et al., 2009). However, the molecular mechanism by
which ubiquitination increases enzyme activity is still not clear,
nor is it known whether other cellular signals (e.g.,
phosphorylation by CK2 or GSK3β; Fei et al., 2007; Tao et al., 2008)
may also modulate the activity of ataxin-3. Interestingly, the
JD-containing protein Josephin 1 was also demonstrated to cleave
ubiquitin chains only after it is mono-ubiquitinated (Seki et al.,
2013). The regulation of ataxin-3 activity through ubiquitination
might depend on the interaction of ataxin-3 with several E3
ubiquitin ligases (Durcan and Fon, 2013), such as the C-terminus of
70 kDa heat-shock protein (Hsp70)-interacting protein (CHIP),
parkin, and E4B (Figure 5), since all were shown to promote ataxin-3
ubiquitination and regulate its degradation by the proteasome
(Matsumoto et al., 2004; Jana et al., 2005; Miller et al., 2005).
Association of ataxin-3 with CHIP is a multistep process regulated
by mono-ubiquitination of the N-terminal region of
CHIP by the E2-conjugating enzyme Ube2w, and occurs through the
region encompassing the polyQ tract and UIM1 and 2 (Jana et al.,
2005) (Figure 4). As observed for other interactions involving the
C-terminal region of ataxin-3, the ataxin-3:CHIP complex is affected
by polyQ expansion, and the polyQ-expanded protein displays a
sixfold increase in binding affinity (Scaglione et al., 2011). The
presence of ataxin-3 in multicomponent E3-ligase complexes is also
supported by the identification of a direct interaction with parkin,
an association that stabilizes the interaction between parkin and
the E2-conjugating enzyme Ubc7 (Durcan et al., 2011). In contrast
with what is observed for the ataxin-3:CHIP complex, ataxin-3
association with parkin remains unaltered by polyQ expansion (Durcan
et al., 2012) (Figure 4). However, we still do not understand the
mechanisms that regulate shuttling of ataxin-3 between these
functional complexes or how its distribution is modulated by polyQ
expansion. Further biochemical studies are required to establish the
correlation between these macromolecular interactions and their
relevance for ataxin-3 aggregation and neurodegeneration in MJD
patients.
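To put the sixfold affinity increase reported for the polyQ-expanded ataxin-3:CHIP complex on an energetic scale, the fold change can be converted into a difference in binding free energy with the standard relation ΔΔG = −RT ln(fold change). The short Python sketch below is an illustration only. It assumes that a sixfold increase in affinity corresponds to a sixfold decrease in the dissociation constant and that the temperature is 298.15 K, neither of which is stated in the text, and the function name `ddg_from_affinity_fold` is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL_PER_MOL_K = 1.987204e-3


def ddg_from_affinity_fold(fold_increase, temperature_k=298.15):
    """Return the change in binding free energy (kcal/mol) for a given
    fold increase in binding affinity.

    Assumption (not from the source): an n-fold increase in affinity is
    treated as an n-fold decrease in the dissociation constant Kd, so that
    ddG = -R * T * ln(n). A negative value means tighter binding.
    """
    if fold_increase <= 0:
        raise ValueError("fold_increase must be positive")
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(fold_increase)


# Sixfold increase in affinity, at an assumed temperature of 298.15 K
ddg_kcal = ddg_from_affinity_fold(6.0)
ddg_kj = ddg_kcal * 4.184  # convert kcal/mol to kJ/mol

print(f"ddG = {ddg_kcal:.2f} kcal/mol ({ddg_kj:.2f} kJ/mol)")
```

Under these assumptions the sixfold change corresponds to roughly −1.1 kcal/mol (about −4.4 kJ/mol), which is a modest shift in energetic terms.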
ATAXIN-3 AGGREGATION: A MULTISTEP PATHWAY MODULATED BY THE PROTEIN CONTEXT
A characteristic hallmark of MJD and other polyQ-expansion diseases
is the appearance of intracellular inclusions enriched in the
disease protein and containing components of the cell
quality-control machinery (e.g., ubiquitin, proteasome subunits, and
chaperones), indicating that these diseases form part of the larger
family of protein misfolding disorders (Williams and Paulson, 2008).
Early in vitro studies showed that expansion of the polyQ tract
within the pathological range induced formation of insoluble β-rich
fibrils with the capacity to bind amyloid-specific dyes (Bevivino
and Loll, 2001). Later it was demonstrated that non-pathological
ataxin-3 could also form insoluble fibrillar aggregates upon
destabilization of its structure by temperature, pressure or
denaturing agents (Marchal et