Combining prediction, computation and experiment for the characterization of protein disorder

Clay Bracken1, Lilia M Iakoucheva2, Pedro R Romero3 and A Keith Dunker4

Several computational and experimental methods exist for identifying disordered residues within proteins. Computational algorithms can now identify these disordered sequences and predict their occurrence within genomes with relatively high accuracy. Recent advances in NMR and mass spectrometry permit faster and more detailed studies of disordered states at atomic resolution. Combining prediction, computation and experimentation is proposed to accelerate and enhance the characterization of intrinsically disordered protein.

Addresses
1 Department of Biochemistry, Weill Medical College of Cornell University, 1300 York Avenue, New York, NY 10021, USA
e-mail: [email protected]
2 Laboratory of Statistical Genetics, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA
3 Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, and Indiana University School of Informatics, Indianapolis, IN 46202, USA
4 Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine and Molecular Kinetics Inc, Indiana University Emerging Technology Center, Indianapolis, IN 46202, USA
e-mail: [email protected]

Current Opinion in Structural Biology 2004, 14:570–576

This review comes from a themed issue on

Biophysical methods

Edited by Arthur G Palmer III and Randy J Read

Available online 15th September 2004

0959-440X/$ – see front matter

# 2004 Elsevier Ltd. All rights reserved.

DOI 10.1016/j.sbi.2004.08.003

Abbreviations

HXMS   hydrogen/deuterium exchange MS
MoRE   molecular recognition element
MS     mass spectrometry
NOE    nuclear Overhauser effect
NOESY  nuclear Overhauser effect spectroscopy
PDB    Protein Data Bank
RDC    residual dipolar coupling

Introduction

Historically, the unfolded state of proteins has been studied by examining protein denaturation [1]. Yet, many unfolded proteins perform biological functions [2]. Successful disorder predictors support the hypothesis that the amino acid sequence encodes disorder [3]. Predictions on major protein sequence databases indicate that unfolded proteins and protein segments occur in around 25% of all proteins [4]. This observation has contributed to a reassessment of the assumption that tertiary structure is necessary for function [5].

Since these seminal studies, several disorder predictors have been developed [6–11]. In addition, the experiments conducted in the 2002 Critical Assessment of Structure Prediction (CASP5) support the value of disorder prediction [12]. The increased activity in this area is likely to produce increasingly accurate predictors.

Numerous experimental methods have been developed or adapted to study disordered proteins and regions [5,13•,14–17]. A growing body of experimental data now indicates that disorder does not generally comprise entirely random conformations, but is biased toward a particular type of secondary structure or clusterings of hydrophobic residues [18,19,20•,21,22•,23–26]. Given the frequency [4] and functional importance [2,27,28] of intrinsic disorder, the accurate identification of these regions is of significant biological interest. This review surveys experimental methods that provide atomic-level details about the disordered states, computational approaches that explore disorder at the same atomic level and bioinformatics predictors of disorder. Since the experimental studies on proteins use markedly different approaches for ordered and disordered proteins, it is proposed that disorder and disorder-related predictions provide an important method of guiding protein experimental studies.

Experimental approaches and applications

NMR chemical shifts

Sequence-specific chemical shift assignments enable most NMR analyses. Deviations of the assigned shifts from random-coil reference values indicate secondary structure and disorder within proteins. Pulse sequences have been specifically tailored for assigning backbone and sidechain resonances for disordered proteins [16,29]. Recently, pulse sequences have been devised for resolving sidechain chemical shifts in the unfolded state, to allow measurement of pKa values [29]. Furthermore, TROSY (transverse relaxation-optimized spectroscopy) and CRINEPT (cross relaxation enhanced polarization transfer)-TROSY NMR pulse sequences have dramatically extended the size limit for chemical shift analysis [30], as demonstrated by the NMR assignment of the 72 kDa GroES protein in both the free state and a 900 kDa GroES–GroEL complex [31]. These assignments revealed an important protein–protein binding region that undergoes a disorder-to-order transition as GroES binds to GroEL [31].
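The comparison with random-coil reference values can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the cited work: the function names, the 0.5 ppm cutoff and the reference values supplied by the caller are all assumptions, and a real analysis would use a published random-coil table with sequence and temperature corrections. The labels follow the carbonyl convention stated in the Figure 1 legend, where positive deviations are characteristic of helix and values near zero suggest random coil.

```python
from typing import Dict, List, Tuple


def secondary_shifts(
    observed: List[Tuple[str, float]],
    random_coil: Dict[str, float],
) -> List[float]:
    """Observed minus random-coil shift (ppm) for each (residue, shift) pair."""
    return [shift - random_coil[residue] for residue, shift in observed]


def label_carbonyl(delta_ppm: float, cutoff_ppm: float = 0.5) -> str:
    """Label one carbonyl deviation; cutoff_ppm is an assumed threshold."""
    if delta_ppm > cutoff_ppm:
        return "helix-like"
    if abs(delta_ppm) <= cutoff_ppm:
        return "coil-like"
    # Strongly negative deviations are left unlabeled in this sketch.
    return "other"
```

Calling `secondary_shifts` with a list of assigned carbonyl shifts and a reference table returns one deviation per residue, which can then be passed to `label_carbonyl`. Runs of consecutive "coil-like" residues would be candidates for disordered segments, subject to confirmation by the other measurements described in this section.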

Temperature- and pressure-dependent changes in chemical shift are generally small and linear. The appearance of curvature is likely to be the result of low energy changes in conformational populations [18]. The temperature-dependent changes in the 1H, 13CO and 15N chemical shifts and resonance line-shapes can identify conformational exchange and localize unfolded or nascent structure [18,19,20•] (Figure 1). Corresponding changes in the 13CO(i) to 1HN(i+4) temperature coefficients have revealed the initial hydrogen bonds that are formed in helix initiation [20•].

Figure 1

[Figure: three panels plotted against residue number (0 to 50). (a) J(0) (ns); (b) ΔδC′ (ppm); (c) dδC′/dT (ppb/K).]

The Ala14 protein undergoes a rapid helix-coil transition and is populated by ~30% helix in solution. The structure and disorder of Ala14 were investigated using NMR. (a) The values of J(0) were obtained from analysis of the 15N relaxation data using spectral density mapping. Large values of J(0) indicate slower motions and are centered near residues 37–42, where helix initiation occurs. (b) The NMR chemical shift deviations for the carbonyl resonances are shown. Positive values are characteristic of helix, whereas values close to zero suggest random coil. The blue shaded region indicates chemical shift deviations similar to those of the random-coil model peptides. (c) The temperature coefficients of the carbonyl resonances are shown. The direction of change is consistent with the peptide adopting increasing helical content with reduction in temperature. The temperature coefficients are significantly larger than expected for either stably folded or unfolded states (blue shaded region), indicating that significant changes in conformational populations occur as a function of temperature throughout the peptide. Adapted from [20•].
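The temperature coefficient of a resonance is the slope of its chemical shift against temperature, and curvature appears as systematic residuals from a straight-line fit. The Python sketch below shows one simple way to compute both quantities; the function names are hypothetical, and any threshold for calling a residual "curved" would have to be chosen against the measurement error of the actual experiment.

```python
from typing import Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("temperatures must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def temperature_coefficient(temps_k: Sequence[float],
                            shifts_ppm: Sequence[float]) -> float:
    """Slope of chemical shift against temperature, converted to ppb/K."""
    slope, _ = linear_fit(temps_k, shifts_ppm)
    return 1000.0 * slope


def max_residual_ppb(temps_k: Sequence[float],
                     shifts_ppm: Sequence[float]) -> float:
    """Largest deviation from the fitted line, in ppb, as a crude curvature check."""
    slope, intercept = linear_fit(temps_k, shifts_ppm)
    return 1000.0 * max(
        abs(y - (slope * t + intercept)) for t, y in zip(temps_k, shifts_ppm)
    )
```

A residue whose shifts fall on a straight line gives a residual near zero, whereas a residue whose conformational populations change with temperature gives a residual well above the measurement noise.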

Interpreting pressure-dependent chemical shift changes requires an understanding of the random-coil behavior. The 1H chemical shifts in all 20 amino acids were measured in a series of GGXA tetrapeptides to characterize pressure-dependent random-coil shifts [32]. The largest changes were associated with backbone hydrogen bonding interactions; the smallest with aliphatic sidechains [32]. Pressure-dependent chemical shift studies have recently been employed to examine the aggregation/fibrillization process [33,34]. The pressure-dependent reversal of the fibril assembly process was followed for a fibril-forming variant of lysozyme [33]. Examination of pressure-dependent changes in the hamster prion protein has indicated conformational fluctuations that suggest a transition toward the pathological scrapie form [34].

NMR spin relaxation

NMR spin relaxation processes depend on global and local motions, and therefore characterize flexibility and disorder. Subtle rate changes can highlight motional differences due to small variations in nascent structure population. The most commonly employed method for examining residual structure is the analysis of 15N relaxation rates [15,33,35]. For example, upon mutation of Trp62 to glycine, the unfolded state of lysozyme exhibits significant changes in the 15N relaxation rates and chemical shifts of distant hydrophobic residues. Both of these changes indicate the presence of long-range, non-native interactions [21].

Notably, spin relaxation measurements have been extended to study methyl sidechain dynamics using pulse sequences that are designed specifically for measuring deuterium spin relaxation rates of 13CH2D methyl groups [22•]. Methyl relaxation rate variations revealed sidechain mobility differences that were abolished upon pH titration, thus indicating a potential site of hydrophobic clustering [22•,23].

Recently, a new treatment for interpreting the spectral density function has been proposed that avoids invoking the decoupling approximation. The method allows better estimation of the parameter errors associated with the order parameter when internal, τe, and overall, τc, motions no longer satisfy the conditions τe/τc >> 1 or τe/τc << 1 [36•]. The treatment provides better motional interpretation of flexible loops and disordered sequences within proteins.
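For orientation, the spectral density that such treatments start from can be written down compactly. The sketch below implements the widely used Lipari-Szabo model-free expression, J(ω) = (2/5)[S²τc/(1 + ω²τc²) + (1 − S²)τ/(1 + ω²τ²)] with 1/τ = 1/τc + 1/τe. It is not the new treatment of [36•], whose equations are not given here; the function name and argument order are choices made for this example.

```python
def model_free_spectral_density(
    omega: float, s2: float, tau_c: float, tau_e: float
) -> float:
    """Lipari-Szabo J(omega) in seconds, for angular frequency omega in rad/s."""
    if not 0.0 <= s2 <= 1.0:
        raise ValueError("order parameter S^2 must lie in [0, 1]")
    if tau_c <= 0.0 or tau_e < 0.0:
        raise ValueError("tau_c must be positive and tau_e non-negative")
    # Effective time for the internal term: 1/tau = 1/tau_c + 1/tau_e.
    tau = tau_c * tau_e / (tau_c + tau_e)
    overall = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (overall + internal)
```

At zero frequency and with S² = 1 the expression reduces to (2/5)τc, so a rigid site with τc = 5 ns gives J(0) = 2 ns. When τe approaches τc, the two terms can no longer be separated cleanly, which is the regime the text above describes as problematic for the decoupling approximation.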

Distributions of correlation times have been applied to interpret relaxation data using the Cole–Cole model. Comparison of the Cole–Cole analysis to the original model-free formalism demonstrated that Cole–Cole provided better agreement with field-dependent 15N relaxation data collected on the unfolded subtilisin pro-peptide [37].

Relaxation rates have been modeled by estimating the effective correlation times in the denatured state. In denatured apomyoglobin, correlation times were calculated based on the amino acid sequence, using a simple equation that incorporates the hydrodynamic volume of each residue and assumes a seven-residue persistence length [24]. Remarkably, this simple model closely reproduced the 15N transverse relaxation rates. Several regions displayed significantly slower motions than predicted, indicating the presence of hydrophobic clustering in the unfolded state [24].

Residual dipolar coupling

Residual dipolar coupling (RDC) interactions occur when a pair of magnetic nuclei orient with respect to an external magnetic field. This can be achieved by placing proteins in anisotropic alignment media or by the addition of suitable paramagnetic tags for inducing alignment. The RDC strength of a spin pair depends on their separation and their orientation with respect to the magnetic field. Alignment fluctuations attenuate the RDC values, whereas the absence of net alignment has the effect of averaging RDC values to zero. The orientation dependence of RDCs constitutes an important source of motional information, which has been reviewed recently [13•].
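The dependence on separation and orientation described above can be illustrated numerically. The sketch below assumes the standard form of the dipolar interaction, in which the coupling scales with the inverse cube of the internuclear distance and with the factor (3cos²θ − 1)/2, where θ is the angle between the internuclear vector and the field. The function names are illustrative, and the physical prefactor (gyromagnetic ratios and constants) is deliberately left out.

```python
import math


def rdc_angular_term(theta_rad: float) -> float:
    """Orientation factor (3 cos^2(theta) - 1) / 2 for one internuclear vector."""
    return 0.5 * (3.0 * math.cos(theta_rad) ** 2 - 1.0)


def relative_coupling(r: float, r_ref: float) -> float:
    """Scaling of the dipolar coupling with distance, proportional to r^-3."""
    return (r_ref / r) ** 3


def isotropic_average(n_steps: int = 10000) -> float:
    """Average the orientation factor over uniformly distributed directions.

    Uniform directions on a sphere are uniform in cos(theta), so the
    midpoint rule is applied to cos(theta) on the interval [-1, 1].
    """
    total = 0.0
    for k in range(n_steps):
        cos_theta = -1.0 + (2.0 * k + 1.0) / n_steps
        total += 0.5 * (3.0 * cos_theta ** 2 - 1.0)
    return total / n_steps
```

The isotropic average comes out at essentially zero, which is the numerical counterpart of the statement that RDC values vanish in the absence of net alignment. A measurable RDC therefore requires that some orientations be sampled more often than others.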

Staphylococcal nuclease and eglin C exhibit statistically significant correlations between the dipolar couplings obtained in the unfolded and folded states. This suggests a degree of structural congruence in the orientation of these states [25,26]. These observations indicate either an unfolded equilibrium state with native-like orientations or the presence of small populations that adopt native or native-like conformations. By contrast, a recent study of the unfolded state of the GB1 protein displayed vanishingly small RDC values, indicating no correlation with the native structure [38]. A recent theoretical study indicates that non-vanishing RDC values occur in the unfolded states because of local volume exclusion effects among neighboring amino acids [39•]. This suggests that structural correlations may persist as a result of intrinsic sequence effects.

NOE distance analysis

NOESY (nuclear Overhauser effect spectroscopy) experiments are the primary method for establishing atomic distances. In unstructured proteins, the rapidly interconverting conformations display averaged distances between local neighboring residues. The interpretation of NOESY spectra in unstructured proteins is complicated by significant spectral overlap. To surmount these problems, a variety of NOESY-based methods have been developed to improve resolution and selectivity [17]. Reductions in the complexity of NOESY spectra have been achieved through selective labeling and deuteration to simplify the spectra and reduce relaxation pathways [40]. Recently, NOE methods have been applied to study the unfolded state of the N-terminal drk SH3 domain; fully 2H,13C,15N-labeled samples with selective incorporation of protonated methyl-Leu-Cδ, Ile-Cδ and aromatic residues were used to help overcome problems associated with spectral overlap [41]. Using these selectively labeled samples, long-range native-like and non-native hydrophobic clusters were defined.

Long-range distance analysis via paramagnetic probes

Often, long-range NOE distances are unobservable in unfolded and partially folded states. A complementary method for observing long-range distances follows NMR relaxation enhancement upon incorporation of paramagnetic spin labels. By following the relaxation changes in the presence and absence of the spin label, distances of up to 20 Å can be measured. This approach was initially applied to disordered states of staphylococcal nuclease [42]. More recently, this technique has been used to provide evidence of long-range interactions in the unstructured states of the acetyl coenzyme A binding protein [43] and apomyoglobin [44].
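The conversion from a relaxation enhancement to a distance is commonly done with a Solomon-Bloembergen type expression, in which the paramagnetic contribution to the transverse rate falls off as r⁻⁶. The sketch below assumes that form; the equation is not given in the text above, the function names are hypothetical, and the constant `k` (which lumps together the physical constants for the spin label) must be supplied by the caller in units consistent with `r`.

```python
def _spectral_factor(tau_c: float, omega_h: float) -> float:
    """Correlation-time factor 4*tau_c + 3*tau_c / (1 + omega_h^2 * tau_c^2)."""
    return 4.0 * tau_c + 3.0 * tau_c / (1.0 + (omega_h * tau_c) ** 2)


def pre_rate(r: float, tau_c: float, omega_h: float, k: float) -> float:
    """Paramagnetic contribution to the transverse relaxation rate at distance r."""
    if r <= 0.0:
        raise ValueError("distance must be positive")
    return k * _spectral_factor(tau_c, omega_h) / r ** 6


def pre_distance(rate: float, tau_c: float, omega_h: float, k: float) -> float:
    """Invert pre_rate: the distance that produces the given enhancement."""
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    return (k * _spectral_factor(tau_c, omega_h) / rate) ** (1.0 / 6.0)
```

Because of the sixth-power dependence, a 64-fold change in the measured enhancement corresponds to only a two-fold change in distance. This steep falloff is why the enhancement becomes too small to measure beyond a limit of the order quoted above.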

H/D exchange, mass spectrometry and limited proteolysis

The rate of amide H/D exchange is related to the pH value, solvent exposure and hydrogen bonding state. H/D exchange occurs orders of magnitude faster in unfolded proteins in comparison to folded proteins. H/D exchange, coupled with NMR or with mass spectrometry (MS) (i.e. HXMS; hydrogen/deuterium exchange MS), is important for studying protein folding and dynamics [45•,46,47]. Compared to NMR, the HXMS technique provides a dramatic reduction in sample requirements and can be routinely applied to the analysis of significantly larger proteins.
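The contrast between fast and slow exchange is often summarized as a protection factor, the ratio of the intrinsic exchange rate expected for an unprotected amide to the observed rate. The sketch below assumes simple single-exponential exchange for one amide; the function names are illustrative, and the intrinsic rate is taken as an input rather than computed from sequence, pH and temperature.

```python
import math


def fraction_exchanged(k_obs: float, time: float) -> float:
    """Fraction of an amide population exchanged after a given time."""
    return 1.0 - math.exp(-k_obs * time)


def protection_factor(k_intrinsic: float, k_obs: float) -> float:
    """Ratio of intrinsic to observed exchange rate for one amide."""
    if k_obs <= 0.0:
        raise ValueError("observed rate must be positive")
    return k_intrinsic / k_obs
```

A protection factor near 1 indicates an amide that exchanges as fast as it would in an unstructured peptide, while values of several orders of magnitude indicate stable hydrogen bonding or burial.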

The combination of protease digestion with H/D exchange and/or MS has been utilized to determine the exact location and dynamics of disordered regions in proteins [48,49]. The use of different proteases in the HXMS experiments increases both coverage for the peptide mapping and spatial resolution in large proteins [50]. Recent advancements of HXMS that include direct coupling of solid-phase proteolysis with liquid chromatography (LC)-MS mass determination further improve its resolution and sensitivity [51••].

Recently, HXMS has been used for the rapid, high-throughput identification of unstructured protein regions that may inhibit crystallization [52••]. Fragmentation maps indicating the positions of both fast and slow exchanging amides were produced for 21 Thermotoga maritima proteins. When tested on control proteins with known three-dimensional structures, HXMS was able to reliably detect even short regions of disorder. The amount of detected disorder was inversely correlated with crystallization success.

Computational analysis of disordered protein regions

The combined use of Monte Carlo sampling and experimental NMR restraints has been employed to estimate the ensemble of structures populated by human α-lactalbumin in the presence of increasing concentrations of urea [53]. Parts of the α- and β-domains of the native protein were preserved even when the interactions defining them were substantially weakened. The relative probabilities of the conformations, determined from the NMR data, were used to construct a coarse-grained free energy landscape for α-lactalbumin in the absence of urea.

High-temperature Monte Carlo simulations have been used to investigate disorder-to-order transitions of the intrinsically unstructured cyclin-dependent kinase inhibitor p27Kip1 [54]. The transition was found to be dependent on the intermolecular binding interface, indicating that complex formation overwhelms any local folding preferences.

A protocol for calculating ensembles of structures representing the unfolded state has been developed. Starting from the folded protein structure, several unfolding trajectories are generated. Experimental NMR data are back-calculated, using the ENSEMBLE program for weighted populations of the structures sampled in the unfolding trajectories [55]. Initial analysis of the N-terminal drk SH3 domain suggested that a limited number of conformations can adequately represent the unfolded state [55]. However, subsequent experimental studies have observed non-native conformations that were not represented, indicating the significant challenge in computationally characterizing unfolded states [41].

A distinctly different computational approach for examining disorder in native proteins has been applied to the prediction of proteolytic cleavage sites. This algorithm analyzes the sequence, hydrophobicity and degree of compactness, as well as the estimated change in surface area that is exposed upon protease cleavage. Significant agreement was observed between predicted cleavage sites and limited proteolysis results in five model proteins with locally unfolded regions [56].

Computational prediction of disorder from protein sequence

A variety of neural network predictors of disordered regions have been developed and distributed as the program PONDR® (Predictors Of Natural Disordered Regions) [3,6,8]. On the basis of local amino acid composition, flexibility, hydropathy, coordination number and other factors, these predictors classify each residue within a sequence as either ordered or disordered. The training set, built using PDB files and information from published experiments, is labeled as ordered/disordered, following the definition for ‘intrinsic disorder’ as used in this work, namely, lack of tertiary structure in the native state, either globule like (‘collapsed’ disorder) or random-coil like (‘extended’ disorder).

The predictor developed by Uversky et al. [7] consists of a linear discriminant that relies on the relative abundance of hydrophobic and charged residues to classify entire sequences (not regions) as folded or ‘natively unfolded’, a term that denotes proteins or isolated domains that present an extended, random-coil-like configuration in their native state.
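A linear discriminant of this kind can be sketched as a comparison of mean net charge against mean hydropathy. The version below is a simplified illustration: it uses the Kyte-Doolittle scale rescaled to 0..1 and the boundary line R = 2.785 H − 1.151 that is commonly quoted for this charge-hydropathy plot. Those coefficients do not appear in the text above and should be checked against [7]; the plain averaging over the whole sequence, the fixed charges for K, R, D and E, and the function names are assumptions of this example.

```python
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy rescaled to the range 0..1."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq: str) -> float:
    """Absolute net charge per residue, counting K/R as +1 and D/E as -1."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def predicted_natively_unfolded(seq: str) -> bool:
    """True when the sequence lies on the high-charge side of the boundary."""
    return mean_net_charge(seq) > 2.785 * mean_hydropathy(seq) - 1.151
```

A highly charged, weakly hydrophobic sequence such as `"EEEEKKKKEEEE"` falls on the unfolded side of the line, whereas a purely aliphatic sequence such as `"LLLLVVVVIIII"` does not. Because the classifier uses whole-sequence averages, it cannot locate a disordered region within an otherwise folded protein, which matches the "entire sequences (not regions)" caveat in the text.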

DISOPRED (Disorder Predictor) [10] is also based on a neural network, but with inputs derived from sequence profiles generated by PSI-BLAST, a commonly used tool for local sequence alignment. The predictor’s output is filtered by using secondary structure predictions, so that regions confidently predicted as helix or sheet are not predicted as disordered. DISOPRED2 [57•] uses a support vector machine and a neural network, connected in a cascade classifier, and also takes advantage of PSI-BLAST profiles to improve the predictor’s performance. ‘Native disorder’, as used in the DISOPRED training sets, has the same connotation as ‘intrinsic disorder’, although it is limited to missing regions in PDB structures.

GlobPlot (Predictor of Intrinsic Protein Disorder, Domain & Globularity) [58] predicts disordered and globular regions on the basis of propensities for disorder assigned to each amino acid, ‘disordered regions’ being defined as non-globular domains that lack regular ‘secondary’ structure. DisEMBL, a more recent predictor from this group [11], uses an ensemble of five neural networks to predict any one of three disorder types, defined as loop regions (no helix, no sheets); ‘hot’, or highly mobile, loops; and missing coordinates in X-ray structures.

NORSp predicts ‘regions of no regular secondary structure’ or NORS, defined as long stretches of consecutive residues (>70) with few helix or sheet residues (<12%). These ‘loopy’ regions [59], though compared to disordered regions, are not necessarily disordered, because the protein segments can have a fixed tertiary structure, despite lacking regular secondary structure.
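The NORS definition translates directly into a window scan over a per-residue secondary structure string. The sketch below is one possible reading of that definition, not the NORSp implementation: it assumes the input uses `H` for helix and `E` for sheet (any other character counts as neither), slides a window of 71 residues (the shortest stretch longer than 70), and merges overlapping windows whose helix plus sheet content is below 12%.

```python
from typing import List, Tuple


def find_nors(ss: str, min_len: int = 71,
              max_frac: float = 0.12) -> List[Tuple[int, int]]:
    """Return merged zero-based, half-open regions that satisfy the NORS rule."""
    # Prefix sums give the number of H/E residues in any window in O(1).
    prefix = [0]
    for ch in ss:
        prefix.append(prefix[-1] + (1 if ch in "HE" else 0))
    regions: List[Tuple[int, int]] = []
    for start in range(0, len(ss) - min_len + 1):
        end = start + min_len
        structured = prefix[end] - prefix[start]
        if structured / min_len < max_frac:
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region, so extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

For a chain with 30 helical residues, 80 residues of neither type and 30 sheet residues, the scan returns a single region spanning positions 22 to 118. The merged region includes a few helix and sheet residues at its edges because each contributing window is allowed up to 12% structured content.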

Prediction-guided experimental studies

Predictions of disorder or disorder-related properties can help to guide experimental approaches for the study of specific proteins. A recent example of this approach has been demonstrated for the measles virus nucleoprotein. This protein was predicted to contain a folded N-terminal region, followed by a highly disordered C terminus. Guided by these predictions, these regions were separately cloned and expressed. Experiments confirmed the predicted disordered regions [60••].

Figure 2

Prediction of binding elements within disordered regions. (a) The domain organization of the measles virus nucleoprotein (MV N) consists of two regions, NCORE (amino acids 1–400) and NTAIL (amino acids 401–525). The approximate location of the phosphoprotein-binding site (Pb) is shown below the NTAIL region [61]. (b) PONDR® prediction of structural disorder within MV N, aligned with (a). Disorder prediction values for a given residue are plotted against the residue number. PONDR® scores above 0.5 are considered to be predictions of disorder. The predicted MoRE is shown as a thick green segment below the plot. Deletion of the C terminus of NTAIL, starting at the MoRE position, prevents binding of the phosphoprotein.

Association of viral nucleoprotein with a phosphoprotein plays an important role in regulating measles replication. Short regions of predicted order, flanked by predictions of disorder, may correspond to binding sites [61]. This pattern, termed molecular recognition element (MoRE), is present in the nucleoprotein C terminus, which mediates binding to the phosphoprotein, as shown in Figure 2. Deletion of the predicted MoRE region in the nucleoprotein precluded such binding [62]. These results suggest that disorder prediction can be usefully combined with experimentation not only to study structure but also to identify regions of functional importance.
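The MoRE pattern, a short dip of predicted order inside a longer disordered region, can be searched for in any per-residue disorder profile. The sketch below is a simple illustration and not the procedure used in [61]: it takes a list of scores on a 0..1 scale, treats scores above 0.5 as disorder (the threshold quoted in the Figure 2 legend), and reports ordered runs that are short and flanked on both sides by disorder. The `max_len` and `flank` values are arbitrary placeholders.

```python
from typing import List, Sequence, Tuple


def find_more_candidates(
    scores: Sequence[float],
    threshold: float = 0.5,
    max_len: int = 30,
    flank: int = 10,
) -> List[Tuple[int, int]]:
    """Return zero-based, half-open ordered runs flanked by predicted disorder."""
    disordered = [score > threshold for score in scores]
    n = len(disordered)
    hits: List[Tuple[int, int]] = []
    i = 0
    while i < n:
        if disordered[i]:
            i += 1
            continue
        # Extend over the whole run of predicted order starting at i.
        j = i
        while j < n and not disordered[j]:
            j += 1
        left_ok = i >= flank and all(disordered[i - flank:i])
        right_ok = j + flank <= n and all(disordered[j:j + flank])
        if (j - i) <= max_len and left_ok and right_ok:
            hits.append((i, j))
        i = j
    return hits
```

Applied to a profile with a 12-residue dip inside a disordered tail followed by a long ordered domain, the function reports only the dip. Candidates found this way are hypotheses for deletion or binding experiments of the kind described above, not confirmed binding sites.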

Conclusions

Lack of fixed tertiary structure in proteins occurs not only in denaturing buffers but also in physiological conditions. Experimental and computational methods for studying both the denatured forms of structured proteins and intrinsically disordered proteins and regions are advancing rapidly. Many of the disordered protein examples discussed here are involved in signaling and regulation, suggesting that identification of natively disordered sequences will increase as more signaling and regulatory proteins are structurally characterized. Intrinsically disordered proteins represent a distinct category of protein conformation.

Update

The structure of the predicted MoRE in the NTAIL region (Figure 2) has recently been determined in association with its partner, protein P, by X-ray crystallography [63]. Although prediction suggested that residues 489 to 504 formed a helix, with a MoRE from 488 to 499 [60••], the actual binding helix was observed to be from residues 486 to 504, but with the direction opposite to the orientation predicted earlier [60••].

The RNA degradosome-organizing domain of RNAse E was recently shown to contain four MoRE regions that bind to four different partners, one of which is RNA rather than protein [64]. The structure of one of these regions when bound to its protein partner has just been determined to be helical by X-ray crystallography (BF Luisi, personal communication).

Acknowledgements

We would like to thank Zoran Obradovic for his continuing collaboration on the bioinformatics investigations of intrinsically disordered proteins. Support for this work was provided by National Institutes of Health grant R01 LM007688-01, the Indiana Genomics Initiative (INGEN), which is funded, in part, by the Lilly Endowment Inc, and the American Heart Association.

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

• of special interest

•• of outstanding interest

1. Rose GD (Ed): Unfolded proteins. In Advances in ProteinChemistry, vol 62. Amsterdam: Academic Press; 2002.

2. Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM, Obradovic Z:Intrinsic disorder and protein function. Biochemistry 2002,41:6573-6582.

3. Romero P, Obradovic Z, Kissinger CR, Villafranca JE, Dunker AK:Identifying disordered regions in proteins from amino acidsequences. IEEE Int Conf Neural Netw 1997, 1:90-95.

4. Romero P, Obradovic Z, Kissinger CR, Villafranca JE, Garner E,Guilliot S, Dunker AK: Thousands of proteins likely to have longdisordered regions. Pac Symp Biocomput 1998:437-448.

5. Wright PE, Dyson HJ: Intrinsically unstructured proteins:re-assessing the protein structure-function paradigm.J Mol Biol 1999, 293:321-331.

6. Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK:Sequence complexity of disordered protein. Proteins 2001,42:38-48. [URL: http://www.PONDR.com].

7. Uversky V, Gillespie J, Fink A: Why are ‘natively unfolded’proteins unstructured under physiologic conditions? Proteins2000, 41:415-427.

www.sciencedirect.com

Page 6: Combining prediction, computation and experiment for the characterization of protein disorder

Characterizing protein disorder Bracken et al. 575

8. Vucetic S, Brown CJ, Dunker AK, Obradovic Z: Flavors of protein disorder. Proteins 2003, 52:573-584.

9. Liu J, Rost B: NORSp: predictions of long regions without regular secondary structure. Nucleic Acids Res 2003, 31:3833-3835. [URL: http://cubic.bioc.columbia.edu/services/NORSp].

10. Jones DT, Ward JJ: Prediction of disordered regions in proteins from position specific score matrices. Proteins 2003, 53 (suppl 6):573-578.

11. Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, Russell RB: Protein disorder prediction: implications for structural proteomics. Structure (Camb) 2003, 11:1453-1459. [URL: http://dis.embl.de/].

12. Melamud E, Moult J: Evaluation of disorder predictions in CASP5. Proteins 2003, 53 (suppl 6):561-565.

13. • Bax A: Weak alignment offers new NMR opportunities to study protein structure and dynamics. Protein Sci 2003, 12:1-16.

This is an excellent introductory review on dipolar coupling.

14. Palmer AG III, Kroenke CD, Loria JP: Nuclear magnetic resonance methods for quantifying microsecond-to-millisecond motions in biological macromolecules. Methods Enzymol 2001, 339:204-238.

15. Bracken C: NMR spin relaxation methods for characterization of disorder and folding in proteins. J Mol Graph Model 2001, 19:3-12.

16. Dyson HJ, Wright PE: Nuclear magnetic resonance methods for elucidation of structure and dynamics in disordered states. Methods Enzymol 2001, 339:258-270.

17. Zhang O, Forman-Kay JD, Shortle D, Kay LE: Triple-resonance NOESY-based experiments with improved spectral resolution: applications to structural characterization of unfolded, partially folded and folded proteins. J Biomol NMR 1997, 9:181-200.

18. Williamson MP: Many residues in cytochrome c populate alternative states under equilibrium conditions. Proteins 2003, 53:731-739.

19. Tang Y, Rigotti DJ, Fairman R, Raleigh DP: Peptide models provide evidence for significant structure in the denatured state of a rapidly folding protein: the villin headpiece subdomain. Biochemistry 2004, 43:3264-3272.

20. • Cao W, Bracken C, Kallenbach NR, Lu M: Helix formation and the unfolded state of a 52-residue helical protein. Protein Sci 2004, 13:177-189.

An examination of spin relaxation, scalar coupling and chemical shift changes to detail helix initiation. Evidence of i–i+4 hydrogen bonding involvement in helix initiation was obtained by examining deviations in the chemical shift temperature coefficients of C′(i) and HN(i+4) atoms.

21. Klein-Seetharaman J, Oikawa M, Grimshaw SB, Wirmer J, Duchardt E, Ueda T, Imoto T, Smith LJ, Dobson CM, Schwalbe H: Long-range interactions within a nonnative protein. Science 2002, 295:1719-1722.

22. • Choy WY, Shortle D, Kay LE: Side chain dynamics in unfolded protein states: an NMR based 2H spin relaxation study of delta131delta. J Am Chem Soc 2003, 125:1748-1758.

This is the first examination of sidechain methyl relaxation dynamics within an unfolded state. The authors describe a pulse sequence for measuring deuterium spin relaxation rates in 13CH2D methyl groups specific for unfolded proteins. A correlation between backbone and sidechain dynamics is observed. Motional differences between methyl groups suggest the presence of hydrophobic clustering.

23. Choy WY, Kay LE: Probing residual interactions in unfolded protein states using NMR spin relaxation techniques: an application to delta131delta. J Am Chem Soc 2003, 125:11988-11992.

24. Schwarzinger S, Wright PE, Dyson HJ: Molecular hinges in protein folding: the urea-denatured state of apomyoglobin. Biochemistry 2002, 41:12681-12686.

25. Shortle D, Ackerman MS: Persistence of native-like topology in a denatured protein in 8 M urea. Science 2001, 293:487-489.

26. Ohnishi S, Lee AL, Edgell MH, Shortle D: Direct demonstration of structural similarity between native and denatured eglin C. Biochemistry 2004, 43:4064-4070.

27. Iakoucheva LM, Brown CJ, Lawson JD, Obradovic Z, Dunker AK: Intrinsic disorder in cell-signaling and cancer-associated proteins. J Mol Biol 2002, 323:573-584.

28. Iakoucheva LM, Radivojac P, Brown CJ, O’Connor TR, Sikes JG, Obradovic Z, Dunker AK: The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res 2004, 32:1037-1049. [URL: http://www.ist.temple.edu/DISPHOS].

29. Tollinger M, Forman-Kay JD, Kay LE: Measurement of side-chain carboxyl pK(a) values of glutamate and aspartate residues in an unfolded protein by multinuclear NMR spectroscopy. J Am Chem Soc 2002, 124:5714-5717.

30. Riek R, Fiaux J, Bertelsen EB, Horwich AL, Wuthrich K: Solution NMR techniques for large molecular and supramolecular structures. J Am Chem Soc 2002, 124:12144-12153.

31. Fiaux J, Bertelsen EB, Horwich AL, Wuthrich K: NMR analysis of a 900K GroEL GroES complex. Nature 2002, 418:207-211.

32. Arnold MR, Kremer W, Ludemann HD, Kalbitzer HR: 1H-NMR parameters of common amino acid residues measured in aqueous solutions of the linear tetrapeptides Gly-Gly-X-Ala at pressures between 0.1 and 200 MPa. Biophys Chem 2002, 96:129-140.

33. Niraula TN, Konno T, Li H, Yamada H, Akasaka K, Tachibana H: Pressure-dissociable reversible assembly of intrinsically denatured lysozyme is a precursor for amyloid fibrils. Proc Natl Acad Sci USA 2004, 101:4089-4093.

34. Kuwata K, Kamatari YO, Akasaka K, James TL: Slow conformational dynamics in the hamster prion protein. Biochemistry 2004, 43:4439-4446.

35. Palmer AG III: NMR probes of molecular dynamics: overview and comparison with other techniques. Annu Rev Biophys Biomol Struct 2001, 30:129-155.

36. • Vugmeyster L, Raleigh DP, Palmer AG III, Vugmeister BE: Beyond the decoupling approximation in the model free approach for the interpretation of NMR relaxation of macromolecules in solution. J Am Chem Soc 2003, 125:8400-8404.

This work provides better estimation of the accuracy of the motional parameters and the influence of correlated motions on the interpretation of NMR relaxation analysis.

37. Buevich AV, Shinde UP, Inouye M, Baum J: Backbone dynamics of the natively unfolded pro-peptide of subtilisin by heteronuclear NMR relaxation studies. J Biomol NMR 2001, 20:233-249.

38. Ding K, Louis JM, Gronenborn AM: Insights into conformation and dynamics of protein GB1 during folding and unfolding by NMR. J Mol Biol 2004, 335:1299-1307.

39. • Louhivuori M, Paakkonen K, Fredriksson K, Permi P, Lounila J, Annila A: On the origin of residual dipolar couplings from denatured proteins. J Am Chem Soc 2003, 125:15647-15650.

The authors’ calculation and simulations demonstrate that RDC values for a random-coil polymer are non-zero. Steric interactions within the amino acid sequence introduce local orientation within the random flight polymer chain. The net effect is that RDC at the termini will tend toward zero, whereas internal sequences will have non-vanishing values.

40. Goto NK, Kay LE: New developments in isotope labeling strategies for protein solution NMR spectroscopy. Curr Opin Struct Biol 2000, 10:585-592.

41. Crowhurst KA, Forman-Kay JD: Aromatic and methyl NOEs highlight hydrophobic clustering in the unfolded state of an SH3 domain. Biochemistry 2003, 42:8687-8695.

42. Gillespie JR, Shortle D: Characterization of long-range structure in the denatured state of staphylococcal nuclease. I. Paramagnetic relaxation enhancement by nitroxide spin labels. J Mol Biol 1997, 268:158-169.

43. Teilum K, Kragelund BB, Poulsen FM: Transient structure formation in unfolded acyl-coenzyme A-binding protein observed by site-directed spin labelling. J Mol Biol 2002, 324:349-357.

44. Lietzow MA, Jamin M, Dyson HJ, Wright PE: Mapping long-range contacts in a highly unfolded protein. J Mol Biol 2002, 322:655-662.

45. • Ferraro DM, Robertson AD: EX1 hydrogen exchange and protein folding. Biochemistry 2004, 43:587-594.

This review describes the application of slow amide hydrogen exchange to study the kinetics of protein folding and unfolding. The advantages of EX1-type hydrogen exchange over traditional folding experiments are emphasized.

46. Yan X, Watson J, Ho PS, Deinzer ML: Mass spectrometric approaches using electrospray ionization charge states and hydrogen-deuterium exchange for determining protein structures and their conformational changes. Mol Cell Proteomics 2004, 3:10-23.

47. Lanman J, Prevelige PE: High-sensitivity mass spectrometry for imaging subunit interactions: hydrogen/deuterium exchange. Curr Opin Struct Biol 2004, 14:181-188.

48. Yamamoto T, Izumi S, Gekko K: Mass spectrometry on segment-specific hydrogen exchange of dihydrofolate reductase. J Biochem (Tokyo) 2004, 135:17-24.

49. Iakoucheva LM, Kimzey AL, Masselon CD, Bruce JE, Garner EC, Brown CJ, Dunker AK, Smith RD, Ackerman EJ: Identification of intrinsic order and disorder in the DNA repair protein XPA. Protein Sci 2001, 10:560-571.

50. Cravello L, Lascoux D, Forest E: Use of different proteases working in acidic conditions to improve sequence coverage and resolution in hydrogen/deuterium exchange of large proteins. Rapid Commun Mass Spectrom 2003, 17:2387-2393.

51. •• Hamuro Y, Coales SJ, Southern MR, Nemeth-Cawley JF, Stranz DD, Griffin PR: Rapid analysis of protein structure and dynamics by hydrogen/deuterium exchange mass spectrometry. J Biomol Tech 2003, 14:171-182.

An excellent review describing recent advances in protein structure and dynamics characterization using HXMS.

52. •• Pantazatos D, Kim JS, Klock HE, Stevens RC, Wilson IA, Lesley SA, Woods VL Jr: Rapid refinement of crystallographic protein construct definition employing enhanced hydrogen/deuterium exchange MS. Proc Natl Acad Sci USA 2004, 101:751-756.

This paper describes the application of amide hydrogen high-throughput and high-resolution deuterium exchange MS for rapid identification of disordered protein regions. Examples showing the improved crystallization of poorly crystallizing/diffracting proteins after the removal of such regions are also given.

53. Vendruscolo M, Paci E, Karplus M, Dobson CM: Structures and relative free energies of partially folded states of proteins. Proc Natl Acad Sci USA 2003, 100:14817-14821.

54. Choy WY, Forman-Kay JD: Calculation of ensembles of structures representing the unfolded state of an SH3 domain. J Mol Biol 2001, 308:1011-1032.

55. Verkhivker GM, Bouzida D, Gehlhaar DK, Rejto PA, Freer ST, Rose PW: Simulating disorder-order transitions in molecular recognition of unstructured proteins: where folding meets binding. Proc Natl Acad Sci USA 2003, 100:5148-5153.

56. Tsai CJ, Polverino de Laureto P, Fontana A, Nussinov R: Comparison of protein fragments identified by limited proteolysis and by computational cutting of proteins. Protein Sci 2002, 11:1753-1770.

57. • Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT: Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 2004, 337:635-645.

The greater occurrence of disorder in eukaryotes, compared to prokaryotes and archaea, and the frequent use of disorder in signaling and regulation are further supported in this paper. A novel and interesting feature of this work is the use of gene ontology (GO) terms to estimate the over- and under-representation of predicted disorder in the molecular function, biological process and cellular component ontologies. DISOPRED2 is available at http://bioinf.cs.ucl.ac.uk/disopred/.

58. Linding R, Russell RB, Neduva V, Gibson TJ: GlobPlot: Exploring protein sequences for globularity and disorder. Nucleic Acids Res 2003, 31:3701-3708. [URL: http://globplot.embl.de/].

59. Liu J, Tan H, Rost B: Loopy proteins appear conserved in evolution. J Mol Biol 2002, 322:53-64.

60. •• Longhi S, Receveur-Brechot V, Karlin D, Johansson K, Darbon H, Bhella D, Yeo R, Finet S, Canard B: The C-terminal domain of the measles virus nucleoprotein is intrinsically disordered and folds upon binding to the C-terminal moiety of the phosphoprotein. J Biol Chem 2003, 278:18638-18648.

This paper demonstrates the potential advantage of using prediction to guide experimental analysis of proteins with both ordered and disordered regions.

61. Garner E, Romero P, Dunker AK, Brown C, Obradovic Z: Predicting binding regions within disordered proteins. Genome Inform Ser Workshop Genome Inform 1999, 10:41-50.

62. Bourhis JM, Johansson K, Receveur-Brechot V, Oldfield CJ, Dunker KA, Canard B, Longhi S: The C-terminal domain of measles virus nucleoprotein belongs to the class of intrinsically disordered proteins that fold upon binding to their physiological partner. Virus Res 2004, 99:157-167.

63. Kingston RL, Hamel DJ, Gay LS, Dahlquist FW, Matthews BW: Structural basis for the attachment of a paramyxoviral polymerase to its template. Proc Natl Acad Sci USA 2004, 101:8301-8306.

64. Callaghan AJ, Aurikko JP, Ilag LL, Grossman JG, Chandran V, Kuhnel K, Poljak L, Carpousis AJ, Robinson CV, Symmons MF, Luisi BF: Studies of the RNA degradosome-organizing domain of the Escherichia coli ribonuclease RNAse E. J Mol Biol 2004, 340:965-969.
