Experimental and theoretical methods to study protein folding


Experiments

• Thermal denaturation

• Chemical denaturation

• Mechanical unfolding

• Kinetic experiments

• Mutational studies

Techniques

• Differential scanning calorimetry (DSC)

• Spectroscopy
  – Circular dichroism (CD)
  – Fluorescence
  – Nuclear magnetic resonance (NMR)
  – Small-angle X-ray scattering (SAXS) and small-angle neutron scattering (SANS)

• Atomic force microscopy (AFM)

[Figure legend: wild type; acid-denatured wild type; L16A mutant; C-terminal peptide]

Religa et al., J. Mol. Biol., 333, 977-991 (2003)

Φ-values

• Φ = 0: the mutation affects the folded state but not the transition state

• Φ = 1: the mutation affects both the folded state and the transition state

Matouschek A, Kellis JT, Serrano L, Fersht AR. (1989). Mapping the transition state and pathway of protein folding by protein engineering. Nature 340:122

Millet et al., Biochemistry 41, 321-325 (2002)

Structure of closed and open form of the DnaK (Hsp70) chaperone

Fluorescence studies of closing and opening of Hsp70

Mapa et al., Molecular Cell 38, 89, 2010.

Theoretical studies of protein structure and protein folding

• Need to express the energy of a system as a function of its coordinates

• Need an algorithm to explore the conformational space

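Energy expression in empirical force fields

$$E = E_s + E_b + E_{tor} + E_{nb} + E_{el}$$

$$E_s = \frac{1}{2}\sum_i k_i^d \left(d_i - d_i^0\right)^2$$

$$E_b = \frac{1}{2}\sum_i k_i^\theta \left(\theta_i - \theta_i^0\right)^2$$

$$E_{tor} = \sum_i \left[\frac{V_i^{(1)}}{2}\left(1 + \cos\varphi_i\right) + \frac{V_i^{(2)}}{2}\left(1 - \cos 2\varphi_i\right) + \frac{V_i^{(3)}}{2}\left(1 + \cos 3\varphi_i\right)\right]$$

$$E_{nb} = \sum_i \sum_{j<i} \varepsilon_{ij}\left[\left(\frac{r_{ij}^0}{r_{ij}}\right)^{12} - 2\left(\frac{r_{ij}^0}{r_{ij}}\right)^{6}\right]$$

$$E_{el} = 332 \sum_i \sum_{j<i} \frac{q_i q_j}{r_{ij}}$$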

Partition of the energy of interactions with respect to topological distance

• Bonding interactions: E_s only

• 1,3-interactions: E_b only

• Torsional interactions: E_tor

• 1,4-nonbonded interactions: E_el + E_nb

• 1,5-interactions: E_el + E_VdW

Bond distortion energy

$$E_s(d) = \frac{1}{2} k_d \left(d - d_0\right)^2$$

[Figure: E_s(d) versus d; the minimum lies at d = d0]

Typical values of d0 and kd

Bond           d0 [Å]    kd [kcal/(mol Å²)]
Csp3-Csp3      1.523     317
Csp3-Csp2      1.497     317
Csp2=Csp2      1.337     690
Csp2=O         1.208     777
Csp2-Nsp3      1.438     367
C-N (amide)    1.345     719
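As a worked illustration of the harmonic bond term, the sketch below evaluates E_s(d) for a Csp3-Csp3 bond using the d0 and kd values from the table. The function name and the chosen distortions (0.05 Å and 0.10 Å) are illustrative assumptions, not part of the lecture.

```python
# Harmonic bond-stretching energy, E_s(d) = 1/2 * k_d * (d - d0)**2.

def bond_energy(d, d0, kd):
    """Return the harmonic bond distortion energy in kcal/mol.

    d and d0 are in angstroms, kd in kcal/(mol A^2).
    """
    return 0.5 * kd * (d - d0) ** 2

D0_CC = 1.523   # equilibrium Csp3-Csp3 bond length [A], from the table
KD_CC = 317.0   # Csp3-Csp3 force constant [kcal/(mol A^2)], from the table

for stretch in (0.00, 0.05, 0.10):  # hypothetical distortions [A]
    energy = bond_energy(D0_CC + stretch, D0_CC, KD_CC)
    print(f"stretch {stretch:.2f} A -> E_s = {energy:.3f} kcal/mol")
```

Stretching the bond by only 0.1 Å already costs about 1.6 kcal/mol, which is why bond lengths stay close to d0 at room temperature.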

Comparison of the actual bond-energy curve with that of the harmonic approximation

Potentials that take into account the asymmetry of the bond-energy curve

Anharmonic potential: the harmonic term in $(d - d_0)^2$ supplemented with a cubic term in $(d - d_0)^3$

Morse potential:

$$E_s(d) = D_e\left[1 - e^{-b\,(d - d_e)}\right]^2$$

[Figure: E [kcal/mol] versus d [Å] for the harmonic, anharmonic, and Morse potentials]

Energy of bond-angle distortion

$$E_b(\theta) = \frac{1}{2} k_\theta \left(\theta - \theta_0\right)^2$$

[Figure: E_b(θ) versus θ; the minimum lies at θ = θ0]

Typical values of θ0 and kθ

Angle             θ0 [degrees]    kθ [kcal/(mol degree²)]
Csp3-Csp3-Csp3    109.47          0.0099
Csp3-Csp3-H       109.47          0.0079
H-Csp3-H          109.47          0.0070
Csp3-Csp2-Csp3    117.2           0.0099
Csp3-Csp2=Csp2    121.4           0.0121
Csp3-Csp2=O       122.5           0.0101

Basic types of torsional potentials

Single bond between sp3 carbons or between an sp3 carbon and nitrogen

Example: C-C-C-C quadruplet

$$E_{tor} = 1.6\,(1 + \cos 3\varphi)$$

Double or partially double bonds

Example: C-C=C-C quadruplet

$$E_{tor} = 30\,(1 - \cos 2\varphi)$$

Single bond between electronegative atoms (oxygens, sulfurs, etc.)

Example: C-S-S-C quadruplet

$$E_{tor} = 3.5\,(1 + \cos 2\varphi) + 0.6\,(1 + \cos\varphi)$$

[Figure: E_tor [kcal/mol], axis from 0 to 60, versus dihedral angle [deg]]

Potentials imposed on improper torsional angles

[Figure: improper torsional angles; labels A, B, X, X]

$$E_{tor} = V_2\,(1 - \cos 2\varphi) + V_3\,(1 + \cos 3\varphi)$$

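Nonbonded Lennard-Jones (6-12) potential

$$E_{nb}(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] = \varepsilon\left[\left(\frac{r^0}{r}\right)^{12} - 2\left(\frac{r^0}{r}\right)^{6}\right]$$

[Figure: E_nb [kcal/mol] versus r [Å]; the minimum of depth -ε lies at r = r0]

$$\varepsilon_{ij} = \sqrt{\varepsilon_i \varepsilon_j}, \qquad r_{ij}^0 = r_i^0 + r_j^0, \qquad \sigma = 2^{-1/6}\, r^0$$

Sample values of εi and r0i

Atom type           r0 [Å]    ε [kcal/mol]
C(carbonyl)         1.85      0.12
C(sp3)              1.80      0.06
N(sp3)              1.85      0.12
O(carbonyl)         1.60      0.20
H(bonded with C)    1.00      0.02
S                   2.00      0.20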
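The 6-12 potential and the per-atom parameters can be combined into a pair energy as sketched below. This is a minimal illustration under stated assumptions: the table columns are read as r0 in Å followed by ε in kcal/mol, pair parameters are built as the sum of the atomic radii and the geometric mean of the well depths, and all function names are hypothetical.

```python
import math

# (r0_i [A], eps_i [kcal/mol]) copied from the sample table; column order is an assumption.
ATOM_TYPES = {
    "C(sp3)": (1.80, 0.06),
    "O(carbonyl)": (1.60, 0.20),
}

def pair_parameters(type_i, type_j):
    """Assumed combination rules: r0_ij = r0_i + r0_j, eps_ij = sqrt(eps_i * eps_j)."""
    r0_i, eps_i = ATOM_TYPES[type_i]
    r0_j, eps_j = ATOM_TYPES[type_j]
    return r0_i + r0_j, math.sqrt(eps_i * eps_j)

def lennard_jones(r, r0, eps):
    """6-12 energy in the r0 form: eps * [(r0/r)**12 - 2 * (r0/r)**6]."""
    x6 = (r0 / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)

r0_ij, eps_ij = pair_parameters("C(sp3)", "O(carbonyl)")
sigma_ij = r0_ij / 2.0 ** (1.0 / 6.0)   # distance at which the energy crosses zero

for r in (sigma_ij, r0_ij, 2.0 * r0_ij):
    print(f"r = {r:.3f} A -> E_nb = {lennard_jones(r, r0_ij, eps_ij):+.4f} kcal/mol")
```

The energy is zero at r = σ, reaches its minimum of -ε at r = r0, and decays quickly at larger separations.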

Other nonbonded potentials

Buckingham potential:

$$E_{nb}(r) = A \exp(-\alpha r) - \frac{C}{r^6}$$

10-12 potential used in some force fields (e.g., ECEPP) for hydrogen-bonding proton…proton acceptor pairs:

$$E_{hb}(r) = \frac{C}{r^{12}} - \frac{D}{r^{10}}$$

Sources of parameters

Energy contribution               Source of parameters
Bond and bond angle distortion    Crystal and neutron-diffraction data, IR spectroscopy
Torsional                         NMR and FTIR spectroscopy
Nonbonded interactions            Polarizabilities, crystal and neutron-diffraction data
Electrostatic energy              Molecular electrostatic potentials
All                               Energy surfaces of model systems calculated with molecular quantum mechanics

Solvent in simulations

Explicit water

• TIP3P

• TIP4P

• TIP5P

• SPC

Implicit water

• Solvent accessible surface area (SASA) models

• Molecular surface area models

• Poisson-Boltzmann approach

• Generalized Born surface area (GBSA) model

• Polarizable continuum model (PCM)

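TIP3P and TIP4P water models

Parameter       TIP3P              TIP4P
q(H)            0.417 e            0.520 e
q(O)            -0.834 e           0.00 e
q(M)            (no M site)        -1.040 e
O–M distance    (no M site)        0.15 Å
σ(O)            3.1507 Å           3.1535 Å
ε(O)            0.1521 kcal/mol    0.1550 kcal/mol

Geometry shown in the figure: O–H bond length 0.9572 Å, H–O–H angle 104.52°.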

Solvent accessible surface area (SASA) models

$$F_{solw} = \sum_{i}^{atoms} \sigma_i A_i$$

$\sigma_i$ – free energy of solvation of atom i per unit area

$A_i$ – solvent-accessible surface area of atom i

Vila et al., Proteins: Structure, Function, and Genetics, 1991, 10, 199-218.

Comparison of the lowest-energy conformations of [Met5]enkephalin (H-Tyr-Gly-Gly-Phe-Met-OH) obtained with the ECEPP/3 force field in vacuo and with the SRFOPT model

[Figure panels: vacuum; SRFOPT]

Comparison of the molecular surfaces of the lowest-energy conformations of [Met5]enkephalin obtained without and with the SRFOPT model

[Figure panels: vacuum; SRFOPT]

Molecular surface area model

$$F_{cav} = \gamma A$$

$\gamma$ – surface tension

$A$ – molecular surface area

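Generalized Born molecular surface (GBSA) model

$$F_{solw} = F_{cav} + E_{pol}^{GB}$$

$$E_{pol}^{GB} = -\frac{332}{2}\left(\frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon_{out}}\right)\sum_{i,j}\frac{q_i q_j}{f_{GB}(r_{ij})}$$

$$f_{GB}(r_{ij}) = \sqrt{r_{ij}^2 + R_i R_j \exp\left(-\frac{r_{ij}^2}{4 R_i R_j}\right)}$$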
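The behavior of the generalized Born function is easiest to see numerically. The sketch below assumes the Still-type form f_GB = sqrt(r² + R_i R_j exp(-r²/(4 R_i R_j))) and a polarization energy written as a double sum over all atom pairs including i = j, with the factor 332/2; the function names, the Born radius of 2.0 Å, and the dielectric constants (1 inside, 78.5 outside) are illustrative assumptions.

```python
import math

def f_gb(r_ij, R_i, R_j):
    """Still-type GB function: sqrt(r^2 + R_i R_j exp(-r^2 / (4 R_i R_j)))."""
    rr = R_i * R_j
    return math.sqrt(r_ij ** 2 + rr * math.exp(-r_ij ** 2 / (4.0 * rr)))

def gb_polarization_energy(charges, radii, coords, eps_in=1.0, eps_out=78.5):
    """E_pol^GB in kcal/mol; the double sum runs over all i, j, including i == j."""
    total = 0.0
    for q_i, R_i, x_i in zip(charges, radii, coords):
        for q_j, R_j, x_j in zip(charges, radii, coords):
            total += q_i * q_j / f_gb(math.dist(x_i, x_j), R_i, R_j)
    return -0.5 * 332.0 * (1.0 / eps_in - 1.0 / eps_out) * total

# Single ion of charge +1 e with a hypothetical Born radius of 2.0 A.
born = gb_polarization_energy([1.0], [2.0], [(0.0, 0.0, 0.0)])
print(f"Born energy of a +1 e ion, R = 2.0 A: {born:.2f} kcal/mol")
```

For a single ion the expression reduces to the Born formula, at r = 0 the function equals sqrt(R_i R_j), and at large separations it approaches r, so the pair term becomes a screened Coulomb interaction.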

All-atom representation of polypeptide chains

Coarse-grained representation of polypeptide chains

Coarse-grained force fields

Physics-based potentials (statistical-mechanical formulation)

X : primary variables present in the model

Y : secondary variables not present in the model (solvent, side-chain dihedral angles, etc.)

E(X,Y) : all-atom energy function

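$$U(X) = F(X) = -RT \ln\left\{\frac{1}{V_Y}\int_{\Omega_Y} \exp\left[-\frac{E(X,Y)}{RT}\right] dV_Y\right\}, \qquad V_Y = \int_{\Omega_Y} dV_Y$$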

Statistical potentials

$$W(X; c; s) = -RT \ln\frac{N_{obs}(X; c; s)}{N_{ref}(X; c; s)}$$

X – geometric variables

c – residue types

s – sequence context

[Figure: Leu-Leu pair. A – radial correlation function; B – reference distribution function; C –]

Searching the conformational space

Basic methods:

• Local energy minimization

• Monte Carlo

• Molecular dynamics

Methods for finding low (lowest)-energy conformations:

• Monte Carlo with minimization (MCM)

• Basin hopping

• Smoothing the energy surface

• Diffusion equation method (DEM)

• Simulated annealing

• Genetic algorithms

Methods for generating canonical conformational ensembles:

• Canonical MC

• Canonical MD

• Replica-exchange MC (REMC)

• Replica-exchange MD (REMD)

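Local vs. global minimization

[Figure: f(x) versus x, showing the start point, a local minimum, and the global minimum]

[Figure: successive iterates x(0), x(1), x(2) approaching the minimum x* in the (x1, x2) plane along the search directions d(1) and d(2); each step minimizes f(x(p) + αd(p)) along the search direction]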

General scheme of local minimization of multivariate functions:

1. Choose the initial approximation x(0).

2. In the p-th iteration, compute the search direction d(p).

3. Locate x(p+1) as a minimum along the search direction (minimization of a function of one variable).

4. Terminate when convergence has been achieved or maximum number of iterations exceeded.
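The four steps of the general scheme can be sketched in Python as follows. This is a minimal illustration, assuming the steepest-descent choice of search direction (d(p) = -∇f) and a golden-section line search restricted to step lengths in [0, 1]; the function names and the quadratic test function are hypothetical and are not taken from the lecture.

```python
import math

def golden_section(phi, a=0.0, b=1.0, tol=1e-8):
    """Minimize a one-variable function phi on [a, b] by golden-section search."""
    inv_gr = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - inv_gr * (b - a), a + inv_gr * (b - a)
    while abs(b - a) > tol:
        if phi(c) < phi(d):
            b = d
        else:
            a = c
        c, d = b - inv_gr * (b - a), a + inv_gr * (b - a)
    return 0.5 * (a + b)

def minimize(f, grad, x0, max_iter=500, gtol=1e-6):
    x = list(x0)                                        # step 1: initial approximation x(0)
    for _ in range(max_iter):                           # step 4: iteration limit
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < gtol:  # step 4: convergence test
            break
        d = [-gi for gi in g]                           # step 2: search direction d(p)
        line = lambda s: f([xi + s * di for xi, di in zip(x, d)])
        s_min = golden_section(line)                    # step 3: 1-D minimization along d(p)
        x = [xi + s_min * di for xi, di in zip(x, d)]
    return x

# Hypothetical test function with its minimum at (1, -2).
f_quad = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
grad_quad = lambda x: [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

x_min = minimize(f_quad, grad_quad, x0=[-3.0, 4.0])
print(x_min)
```

Like any local minimizer, this procedure stops in the minimum of the basin in which x(0) lies; it does not search for the global minimum.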

Deformation methods

Lowest-energy structure of gramicidin S computed with the ECEPP force field (M. Dygert, N. Go, H.A. Scheraga, Macromolecules, 8, 750-761 (1975)). This structure turned out to be identical with the NMR structure determined later.

The C-terminal part of HDEA protein found by global minimization of the UNRES coarse-grained effective energy function.

The N-terminal part of HDEA

Liwo et al., PNAS, 96, 5482–5485 (1999)

Comparison of the experimental structure of bacteriocin AS-48 from E. faecalis with the structure obtained by global minimization of the UNRES force field (Pillardy et al., Proc. Natl. Acad. Sci. USA, 98, 2329-2333 (2001))

Nature (and a canonical simulation) finds the basin with the lowest free energy at a given temperature. This basin might, but does not have to, contain the conformation with the lowest potential energy.

Global-optimization methods are designed to find the structures with the lowest potential energy and thus ignore conformational entropy. Technically, this corresponds to canonical simulations at 0 K.

“Potential energy” or “free energy”?

Comparison of minimum potential energies obtained in MD runs with the lowest values of the potential energy

PDB ID code (number of residues)    Emin (MD) [kcal/mol]    Eglob [kcal/mol]
1BDD (46)                           -409 (-414)             -597
1GAB (47)                           -461 (-501)             -669
1LQ7 (67)                           -658 (-652)             -937
1CLB (75)                           -740 (-709)             -1053
1E0G (48)                           -405 (-380)             -632

Results of Langevin dynamics simulations are in parentheses.

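Basic scheme of the Metropolis (canonical) Monte Carlo algorithm

1. Start from conformation Xo with energy Eo.

2. Perturb Xo: X1 = Xo + ΔX.

3. Compute the new energy (E1).

4. If E1 < Eo, accept the new conformation: Xo = X1, Eo = E1.

5. Otherwise, sample Y from U(0,1) and compute W = exp[-(E1-Eo)/kT]. If W > Y, accept the new conformation (Xo = X1, Eo = E1); if not, keep Xo.

6. Return to step 2.

[Figure: a move to a lower energy is accepted unconditionally; a move from E1 to a higher energy E2 is accepted with probability exp[-(E2-E1)/kBT]]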

Calculation of averages

$$\langle A \rangle = \frac{1}{N}\sum_{i=1}^{N} A_i$$

The index i runs through all MC steps, including those in which new conformations have not been accepted.
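The Metropolis scheme and the averaging rule can be sketched together as below. This is a minimal one-dimensional illustration, assuming a uniform random perturbation and a harmonic test potential E(x) = x²/2 in reduced units with kT = 1; all names and parameter values are hypothetical.

```python
import math
import random

def metropolis(energy, x0, n_steps, kT=1.0, max_shift=1.0, seed=0):
    """Canonical Metropolis MC in one dimension; returns the sampled x values."""
    rng = random.Random(seed)
    x_old, e_old = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x_old + rng.uniform(-max_shift, max_shift)  # X1 = Xo + dX
        e_new = energy(x_new)                               # E1
        if e_new < e_old:
            accept = True                                   # downhill: accept unconditionally
        else:
            w = math.exp(-(e_new - e_old) / kT)             # W = exp[-(E1 - Eo)/kT]
            accept = w > rng.random()                       # Y sampled from U(0,1)
        if accept:
            x_old, e_old = x_new, e_new                     # Xo = X1, Eo = E1
        samples.append(x_old)   # a rejected step counts the old conformation again
    return samples

harmonic = lambda x: 0.5 * x * x   # hypothetical test potential
xs = metropolis(harmonic, x0=0.0, n_steps=200_000, max_shift=2.0)
mean_x2 = sum(x * x for x in xs) / len(xs)   # <A> = (1/N) sum_i A_i with A = x^2
print(f"<x^2> = {mean_x2:.3f}")
```

For this potential the canonical average of x² equals kT, so the printed value should be close to 1. Dropping the rejected steps from the sum would bias the average.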

Conformational space representation in Monte Carlo methods

• Lattice representation; the centers of interactions are on lattice nodes.

• Continuous; the centers are located in 3D space.

Sample MC trajectory of a good folder; Model 1a

An example of lattice Monte Carlo trajectory

A pathway of thermal unfolding of protein G simulated with the CABS model and lattice Monte Carlo dynamics

Kmiecik and Koliński, Biophys. J., 94, 726-736 (2008)

Molecular dynamics

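$$m_i \frac{d^2 x_i}{dt^2} = -\frac{\partial V}{\partial x_i}, \qquad i = 1, 2, \ldots, n$$

$$\frac{d^2 \mathbf{r}_i}{dt^2} = \mathbf{a}_i(t) = \frac{1}{m_i}\mathbf{F}_i(t) = -\frac{1}{m_i}\nabla_{\mathbf{r}_i} V, \qquad i = 1, 2, \ldots, n$$

$$\frac{d\mathbf{r}}{dt} = \mathbf{v}(t), \qquad \frac{d\mathbf{v}}{dt} = \mathbf{a}(t)$$

$$\mathbf{r}(t_0 + \Delta t) \approx \mathbf{r}(t_0) + \mathbf{v}(t_0)\,\Delta t + \frac{1}{2}\mathbf{a}(t_0)\,\Delta t^2$$

$$\mathbf{v}(t_0 + \Delta t) \approx \mathbf{v}(t_0) + \mathbf{a}(t_0)\,\Delta t$$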

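The Verlet algorithm:

$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\,\Delta t + \frac{1}{2}\mathbf{a}(t)\,\Delta t^2 + \ldots$$

$$\mathbf{r}(t - \Delta t) = \mathbf{r}(t) - \mathbf{v}(t)\,\Delta t + \frac{1}{2}\mathbf{a}(t)\,\Delta t^2 - \ldots$$

$$\mathbf{r}(t + \Delta t) + \mathbf{r}(t - \Delta t) = 2\mathbf{r}(t) + \mathbf{a}(t)\,\Delta t^2 + O(\Delta t^4)$$

$$\mathbf{r}(t + \Delta t) = 2\mathbf{r}(t) - \mathbf{r}(t - \Delta t) + \mathbf{a}(t)\,\Delta t^2, \qquad e = O(\Delta t^4)$$

$$\mathbf{v}(t) = \frac{1}{2\Delta t}\left[\mathbf{r}(t + \Delta t) - \mathbf{r}(t - \Delta t)\right]$$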

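The Velocity Verlet algorithm

Step 1:

$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\,\Delta t + \frac{1}{2}\mathbf{a}(t)\,\Delta t^2$$

$$\mathbf{v}\left(t + \frac{\Delta t}{2}\right) = \mathbf{v}(t) + \frac{1}{2}\mathbf{a}(t)\,\Delta t$$

Step 2:

$$\mathbf{a}_i(t + \Delta t) = -\frac{1}{m_i}\nabla_{\mathbf{r}_i} U\left(\mathbf{r}(t + \Delta t)\right)$$

$$\mathbf{v}(t + \Delta t) = \mathbf{v}\left(t + \frac{\Delta t}{2}\right) + \frac{1}{2}\mathbf{a}(t + \Delta t)\,\Delta t$$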
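The two steps of the velocity Verlet algorithm can be sketched for a single particle in one dimension as follows. The test system (a harmonic oscillator with m = 1 and k = 1 in reduced units), the time step, and the function names are illustrative assumptions.

```python
def velocity_verlet(r, v, accel, dt, n_steps):
    """Integrate one particle in 1-D; accel(r) returns a(r) = -(1/m) dU/dr."""
    a = accel(r)
    trajectory = [(r, v)]
    for _ in range(n_steps):
        r = r + v * dt + 0.5 * a * dt * dt   # step 1: new position
        v_half = v + 0.5 * a * dt            # step 1: half-step velocity
        a = accel(r)                         # step 2: acceleration at t + dt
        v = v_half + 0.5 * a * dt            # step 2: full-step velocity
        trajectory.append((r, v))
    return trajectory

# Hypothetical test system: harmonic oscillator, U(r) = r^2 / 2, m = 1.
traj = velocity_verlet(r=1.0, v=0.0, accel=lambda r: -r, dt=0.01, n_steps=10_000)
energies = [0.5 * v * v + 0.5 * r * r for r, v in traj]
drift = max(energies) - min(energies)
print(f"total-energy spread over the run: {drift:.2e}")
```

Only one force evaluation is needed per step, and the total energy fluctuates within a narrow band instead of drifting, which is the practical benefit of this integrator.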

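The leapfrog algorithm:

$$\mathbf{v}\left(t + \frac{\Delta t}{2}\right) = \mathbf{v}\left(t - \frac{\Delta t}{2}\right) + \mathbf{a}(t)\,\Delta t$$

$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}\left(t + \frac{\Delta t}{2}\right)\Delta t$$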

All three algorithms are symplectic; as a consequence, the total energy oscillates about a constant value (that of the "shadow Hamiltonian"), which is close but not equal to the initial energy. Many higher-order algorithms that are more accurate in a single step (e.g., the Gear algorithm) lack this property.

Symplectic algorithms have also been designed for isokinetic (constant temperature) and isobaric (constant pressure) simulations; an extended Hamiltonian is used in these cases.

[Figure: kinetic, potential, and total energy [kcal/mol] versus time [ns], from 0.0 to 5.0 ns]

Dependence of the kinetic, potential, and total energy on time for coarse-grained Ac-Ala10-NHMe (Khalili et al., J. Phys. Chem. B, 2005, 109, 13785-13797)

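Berendsen's thermostat (weak coupling with temperature bath)

$$\mathbf{v}_i' = \mathbf{v}_i \sqrt{1 + \frac{\Delta t}{\tau}\left(\frac{f k T}{2 E_k} - 1\right)}$$

$$E_k = \frac{1}{2}\sum_{i=1}^{n} m_i \left(v_{ix}^2 + v_{iy}^2 + v_{iz}^2\right)$$

f – number of degrees of freedom (3n)

τ – coupling parameter

Δt – time step

T – temperature of the bath

Ek – kinetic energy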
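The velocity rescaling of the weak-coupling thermostat can be sketched as below. The sketch assumes the scaling factor λ = sqrt(1 + (Δt/τ)(f k T/(2 E_k) - 1)) with T the bath temperature; the function names, masses, and velocities are hypothetical, and the units are simply assumed to be consistent with k in kcal/(mol K).

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

def kinetic_energy(masses, velocities):
    """E_k = 1/2 sum_i m_i (v_ix^2 + v_iy^2 + v_iz^2)."""
    return 0.5 * sum(m * (vx * vx + vy * vy + vz * vz)
                     for m, (vx, vy, vz) in zip(masses, velocities))

def berendsen_scale(masses, velocities, t_bath, dt, tau):
    """Rescale all velocities by lambda = sqrt(1 + (dt/tau) * (f k T / (2 E_k) - 1))."""
    f = 3 * len(masses)                        # degrees of freedom, 3n
    e_k = kinetic_energy(masses, velocities)
    lam = math.sqrt(1.0 + (dt / tau) * (f * K_B * t_bath / (2.0 * e_k) - 1.0))
    return [(lam * vx, lam * vy, lam * vz) for vx, vy, vz in velocities]

# Hypothetical two-particle system.
masses = [12.0, 16.0]
velocities = [(0.1, 0.0, -0.2), (0.0, 0.3, 0.1)]
scaled = berendsen_scale(masses, velocities, t_bath=300.0, dt=1.0, tau=1.0)
t_after = 2.0 * kinetic_energy(masses, scaled) / (3 * len(masses) * K_B)
print(f"temperature after one rescaling with dt = tau: {t_after:.1f} K")
```

With Δt = τ the kinetic energy is reset to its bath value in a single step; with Δt much smaller than τ each step removes only the fraction Δt/τ of the deviation, which gives the weak coupling.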

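Langevin dynamics (for implicit solvent)

$$m_i \frac{d^2 x_i}{dt^2} = -\frac{\partial V}{\partial x_i} - \gamma_i \frac{dx_i}{dt} + f_i^{rand}$$

Stokes' law:

$$\gamma_i = 6\pi\eta_w\,(r_i + r_w)$$

Wiener process:

$$f_i^{rand} = \sqrt{\frac{2\gamma_i RT}{\Delta t}}\; N(0,1)$$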
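How the friction and random-force terms of Langevin dynamics jointly set the temperature can be checked on a free particle (V = 0). The sketch below is a simple Euler-type integration in hypothetical reduced units (m = γ = kT = 1, with kT per particle playing the role of RT); it is an illustration under these assumptions, not the integrator used in the lecture.

```python
import math
import random

def langevin_free_particle(mass, gamma, kT, dt, n_steps, seed=1):
    """Integrate m dv/dt = -gamma v + f_rand for a free particle (V = 0).

    f_rand = sqrt(2 gamma kT / dt) * N(0,1) is redrawn at every step.
    Returns the list of velocities.
    """
    rng = random.Random(seed)
    sigma_f = math.sqrt(2.0 * gamma * kT / dt)
    v, velocities = 0.0, []
    for _ in range(n_steps):
        f_rand = sigma_f * rng.gauss(0.0, 1.0)   # random force for this step
        v += dt * (-gamma * v + f_rand) / mass   # friction plus random kick
        velocities.append(v)
    return velocities

vs = langevin_free_particle(mass=1.0, gamma=1.0, kT=1.0, dt=0.05, n_steps=400_000)
equilibrated = vs[1000:]                         # discard the initial relaxation
mean_v2 = sum(v * v for v in equilibrated) / len(equilibrated)
print(f"m <v^2> = {mean_v2:.3f} (equipartition predicts about kT = 1)")
```

The average of m v² settles near kT, up to a small bias that comes from the finite time step of this simple scheme.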

Brownian dynamics

$$\gamma_i \frac{dx_i}{dt} = -\frac{\partial E}{\partial x_i} + f_i^{rand}$$

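[Figure: time scales of protein motions on an axis from 10^-15 s (femto) through 10^-12 s (pico), 10^-9 s (nano), 10^-6 s (micro), and 10^-3 s (milli) to 10^0 s (seconds). Events marked along the axis: all-atom MD step, bond vibration, side-chain rotation, loop closure, helix formation, folding of β-hairpins, protein folding]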

MD algorithm references:

1. Frenkel, D.; Smit, B. Understanding Molecular Simulation; Academic Press, 1996; Chapter 4.

2. Calvo, M. P.; Sanz-Serna, J. M. Numerical Hamiltonian Problems; Chapman & Hall: London, U. K., 1994.

3. Verlet, L. Phys. Rev. 1967, 159, 98.

4. Swope, W. C.; Andersen, H. C.; Berens, P. H.; Wilson, K. R. J. Chem. Phys. 1982, 76, 637.

5. Tuckerman, M.; Berne, B. J.; Martyna, G. J. J. Chem. Phys. 1992, 97, 1990.

6. Ciccotti, G.; Kalibaeva, G. Philos. Trans. R. Soc. London, Ser. A 2004, 362, 1583.

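Regular and multiplexed replica exchange algorithm

• N replicas are simulated independently, each at its own temperature, for a reasonably long time using standard canonical MC or MD.

• An exchange of two neighboring replicas is then attempted with the following probability:

$$W(X, \beta_m \mid Y, \beta_n) = \begin{cases} 1 & \text{for } \Delta \le 0 \\ \exp(-\Delta) & \text{for } \Delta > 0 \end{cases}, \qquad \Delta = \left(\beta_m - \beta_n\right)\left[E(Y) - E(X)\right]$$

[Figure: regular and multiplexed replica-exchange schemes]

Y. Rhee, V. Pande, Biophys. J. 84, 775, 2003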
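The exchange criterion can be sketched as a small function. The sketch assumes Δ = (β_m - β_n)[E(Y) - E(X)], where X is the conformation currently at temperature T_m, Y is the one at T_n, and β = 1/(k_B T); the function names, energies, and temperatures are hypothetical.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

def exchange_probability(e_x, t_m, e_y, t_n):
    """Probability of swapping conformation X (at T_m) with conformation Y (at T_n)."""
    beta_m, beta_n = 1.0 / (K_B * t_m), 1.0 / (K_B * t_n)
    delta = (beta_m - beta_n) * (e_y - e_x)
    return 1.0 if delta <= 0.0 else math.exp(-delta)

def attempt_exchange(e_x, t_m, e_y, t_n, rng=random):
    """Return True if the swap is accepted."""
    return rng.random() < exchange_probability(e_x, t_m, e_y, t_n)

# The hotter replica (320 K) holds the lower energy: the swap is always accepted.
p_down = exchange_probability(e_x=-100.0, t_m=300.0, e_y=-105.0, t_n=320.0)
# The colder replica (300 K) already holds the lower energy: the swap is accepted only sometimes.
p_up = exchange_probability(e_x=-105.0, t_m=300.0, e_y=-100.0, t_n=320.0)
print(p_down, round(p_up, 3))
```

Low-energy conformations therefore migrate toward low temperatures, while the high-temperature replicas keep crossing energy barriers.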
