EXPERIMENTAL AND THEORETICAL METHODS TO STUDY PROTEIN FOLDING
Jan 11, 2016
Experiments
• Thermal denaturation
• Chemical denaturation
• Mechanical unfolding
• Kinetic experiments
• Mutational studies
Techniques
• Differential scanning calorimetry (DSC)
• Spectroscopy
  – Circular dichroism (CD)
  – Fluorescence
  – Nuclear magnetic resonance (NMR)
  – Small-angle X-ray scattering (SAXS) and small-angle neutron scattering (SANS)
• Atomic force microscopy (AFM)
[Figure legend: wild type; acid-denatured wild type; L16A mutant; C-terminal peptide]
Religa et al., J. Mol. Biol., 333, 977-991 (2003)
Φ-values
Φ = 0: Mutation affects the folded state but not the transition state
Φ = 1: Mutation affects both the folded state and the transition state
Matouschek A, Kellis JT, Serrano L, Fersht AR. (1989). Mapping the transition state and pathway of protein folding by protein engineering. Nature 340:122
Millet et al., Biochemistry 41, 321-325 (2002)
Structure of closed and open form of the DnaK (Hsp70) chaperone
Fluorescence studies of closing and opening of Hsp70
Mapa et al., Molecular Cell 38, 89, 2010.
Theoretical studies of protein structure and protein folding
• Need to express energy of a system as function of coordinates
• Need an algorithm to explore the conformational space
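Energy expression in empirical force fields
$$E = \sum_i E_s(d_i) + \sum_i E_b(\theta_i) + \sum_i E_{tor}(\varphi_i) + \sum_{i<j} E_{el}(r_{ij}) + \sum_{i<j} E_{nb}(r_{ij})$$
$$E_s(d_i) = \tfrac{1}{2} k_{d_i} \left(d_i - d_i^0\right)^2$$
$$E_b(\theta_i) = \tfrac{1}{2} k_{\theta_i} \left(\theta_i - \theta_i^0\right)^2$$
$$E_{tor}(\varphi_i) = \frac{V_i^{(1)}}{2}\left(1 + \cos\varphi_i\right) + \frac{V_i^{(2)}}{2}\left(1 - \cos 2\varphi_i\right) + \frac{V_i^{(3)}}{2}\left(1 + \cos 3\varphi_i\right)$$
$$E_{el}(r_{ij}) = 332\,\frac{q_i q_j}{\varepsilon\, r_{ij}}$$
$$E_{nb}(r_{ij}) = \varepsilon_{ij}\left[\left(\frac{r_{ij}^0}{r_{ij}}\right)^{12} - 2\left(\frac{r_{ij}^0}{r_{ij}}\right)^{6}\right]$$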
Partition of the energy of interactions with respect to topological distance
• Bonding (1,2) interactions: Es only
• 1,3-interactions: Eb only
• 1,4-interactions: torsional interactions, Etor, and 1,4-nonbonded interactions, Eel + Enb
• 1,5-interactions: Eel + EVdW
Bond distortion energy
$$E_s(d) = \tfrac{1}{2} k_d \left(d - d_0\right)^2$$
[Figure: Es(d) plotted against d; the minimum lies at d = d0]

Typical values of d0 and kd
Bond           d0 [Å]    kd [kcal/(mol Å^2)]
Csp3-Csp3      1.523     317
Csp3-Csp2      1.497     317
Csp2=Csp2      1.337     690
Csp2=O         1.208     777
Csp2-Nsp3      1.438     367
C-N (amide)    1.345     719
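As a quick illustration of the harmonic bond-stretching term, the sketch below evaluates E_s(d) with the tabulated d0 and kd values. The function and dictionary names are made up for this example, and the 0.05 Å displacement in the demo is an arbitrary choice.

```python
# Harmonic bond-stretching energy: E_s(d) = 0.5 * k_d * (d - d0)**2.
# (d0 [angstrom], k_d [kcal/(mol angstrom^2)]) copied from the table of typical values.
BOND_PARAMS = {
    "Csp3-Csp3": (1.523, 317.0),
    "Csp3-Csp2": (1.497, 317.0),
    "Csp2=Csp2": (1.337, 690.0),
    "Csp2=O": (1.208, 777.0),
    "Csp2-Nsp3": (1.438, 367.0),
    "C-N (amide)": (1.345, 719.0),
}


def bond_energy(bond_type, d):
    """Return the stretching energy in kcal/mol for a bond length d in angstroms."""
    d0, kd = BOND_PARAMS[bond_type]
    return 0.5 * kd * (d - d0) ** 2


if __name__ == "__main__":
    # Cost of stretching each bond by 0.05 angstrom (illustrative displacement).
    for name, (d0, _) in BOND_PARAMS.items():
        print(f"{name:12s} {bond_energy(name, d0 + 0.05):6.3f} kcal/mol")
```

Stiffer bonds (larger kd, e.g. the carbonyl double bond) cost more energy for the same displacement.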
Comparison of the actual bond-energy curve with that of the harmonic approximation
Potentials that take into account the asymmetry of the bond-energy curve
• Anharmonic potential: the harmonic term supplemented with a term cubic in (d − d0)
• Morse potential:
$$E_s(d) = D_e\left\{1 - \exp\left[-b\,(d - d_0)\right]\right\}^2$$
[Figure: E [kcal/mol] plotted against d [Å] for the harmonic, anharmonic, and Morse potentials]
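The asymmetry of the Morse curve relative to the harmonic one can be checked numerically. The following is a minimal sketch: the well depth D_e = 88 kcal/mol is a hypothetical value chosen only for illustration, and b is derived by matching the curvature at d0 (k_d = 2 D_e b^2), which is an assumption of this example rather than a parameterization taken from the slides.

```python
import math


def harmonic(d, d0, kd):
    """Harmonic bond energy, 0.5 * k_d * (d - d0)**2."""
    return 0.5 * kd * (d - d0) ** 2


def morse(d, d0, de, b):
    """Morse bond energy, D_e * (1 - exp(-b * (d - d0)))**2."""
    return de * (1.0 - math.exp(-b * (d - d0))) ** 2


def morse_b_from_kd(kd, de):
    """Choose b so that both curves have the same curvature at d0: k_d = 2 * D_e * b**2."""
    return math.sqrt(kd / (2.0 * de))


if __name__ == "__main__":
    d0, kd = 1.523, 317.0   # Csp3-Csp3 values from the table of typical values
    de = 88.0               # hypothetical well depth [kcal/mol], illustration only
    b = morse_b_from_kd(kd, de)
    for d in (1.2, 1.4, 1.523, 1.7, 2.0, 3.0):
        print(f"d = {d:5.3f}  harmonic = {harmonic(d, d0, kd):8.2f}  morse = {morse(d, d0, de, b):8.2f}")
```

On compression the Morse energy rises faster than the harmonic one, while on stretching it levels off at D_e instead of growing without bound.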
Energy of bond-angle distortion
$$E_b(\theta) = \tfrac{1}{2} k_\theta \left(\theta - \theta_0\right)^2$$
[Figure: Eb(θ) plotted against θ; the minimum lies at θ = θ0]

Typical values of θ0 and kθ
Angle              θ0 [degrees]   kθ [kcal/(mol degree^2)]
Csp3-Csp3-Csp3     109.47         0.0099
Csp3-Csp3-H        109.47         0.0079
H-Csp3-H           109.47         0.0070
Csp3-Csp2-Csp3     117.2          0.0099
Csp3-Csp2=Csp2     121.4          0.0121
Csp3-Csp2=O        122.5          0.0101
Basic types of torsional potentials
• Single bond between sp3 carbons or between sp3 carbon and nitrogen. Example: C-C-C-C quadruplet
$$E_{tor} = 1.6\,(1 + \cos 3\varphi)$$
• Double or partially double bonds. Example: C-C=C-C quadruplet
$$E_{tor} = 30\,(1 - \cos 2\varphi)$$
• Single bond between electronegative atoms (oxygens, sulfurs, etc.). Example: C-S-S-C quadruplet
$$E_{tor} = 3.5\,(1 + \cos 2\varphi) + 0.6\,(1 + \cos\varphi)$$
[Figure: Etor [kcal/mol] plotted against the dihedral angle [deg] for each type]
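The three basic torsional types can be compared by evaluating them at a few dihedral angles. This is a sketch only: the coefficients (1.6, 30, 3.5 and 0.6 kcal/mol) are those of the three examples, while the signs inside the parentheses are assumed to follow the usual convention in which φ = 0° is the cis (eclipsed) arrangement. The function names are invented for this example.

```python
import math


def e_tor_single(phi_deg):
    """C-C-C-C type: threefold potential, 1.6 * (1 + cos 3*phi)."""
    return 1.6 * (1.0 + math.cos(3.0 * math.radians(phi_deg)))


def e_tor_double(phi_deg):
    """C-C=C-C type: twofold potential, 30 * (1 - cos 2*phi)."""
    return 30.0 * (1.0 - math.cos(2.0 * math.radians(phi_deg)))


def e_tor_disulfide(phi_deg):
    """C-S-S-C type: 3.5 * (1 + cos 2*phi) + 0.6 * (1 + cos phi)."""
    phi = math.radians(phi_deg)
    return 3.5 * (1.0 + math.cos(2.0 * phi)) + 0.6 * (1.0 + math.cos(phi))


if __name__ == "__main__":
    print(" phi  single  double  disulfide")
    for phi in range(0, 181, 30):
        print(f"{phi:4d} {e_tor_single(phi):7.2f} {e_tor_double(phi):7.2f} {e_tor_disulfide(phi):10.2f}")
```

With these sign conventions the single-bond term has minima at the staggered angles (60°, 180°, 300°), the double-bond term at the planar angles (0°, 180°), and the disulfide term near ±90°.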
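Potentials imposed on improper torsional angles
[Figure: definition of an improper torsional angle; atoms labeled A, B, X, X]
$$E_{tor} = V_2\,(1 - \cos 2\varphi) + V_3\,(1 + \cos 3\varphi)$$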
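Nonbonded Lennard-Jones (6-12) potential
$$E_{nb}(r) = \varepsilon\left[\left(\frac{r^0}{r}\right)^{12} - 2\left(\frac{r^0}{r}\right)^{6}\right] = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]$$
$$\varepsilon_{ij} = \sqrt{\varepsilon_i \varepsilon_j}, \qquad r^0_{ij} = r^0_i + r^0_j, \qquad r^0 = 2^{1/6}\,\sigma$$
[Figure: Enb [kcal/mol] plotted against r [Å]; the minimum, of depth −ε, lies at r = r0]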
Sample values of εi and r0i
Atom type           r0 [Å]   ε [kcal/mol]
C(carbonyl)         1.85     0.12
C(sp3)              1.80     0.06
N(sp3)              1.85     0.12
O(carbonyl)         1.60     0.20
H(bonded with C)    1.00     0.02
S                   2.00     0.20
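A short sketch of how the per-atom values combine into a pair potential is given below. It assumes the tabulated r0 are in Å and ε in kcal/mol, and it uses the combination rules ε_ij = sqrt(ε_i ε_j) and r0_ij = r0_i + r0_j. All function names are hypothetical.

```python
import math

# (r0 [angstrom], eps [kcal/mol]) copied from the table of sample values.
LJ_PARAMS = {
    "C(carbonyl)": (1.85, 0.12),
    "C(sp3)": (1.80, 0.06),
    "N(sp3)": (1.85, 0.12),
    "O(carbonyl)": (1.60, 0.20),
    "H(bonded with C)": (1.00, 0.02),
    "S": (2.00, 0.20),
}


def combine(type_i, type_j):
    """Pair parameters: r0_ij = r0_i + r0_j, eps_ij = sqrt(eps_i * eps_j)."""
    r0_i, eps_i = LJ_PARAMS[type_i]
    r0_j, eps_j = LJ_PARAMS[type_j]
    return r0_i + r0_j, math.sqrt(eps_i * eps_j)


def lj_energy(r, r0, eps):
    """6-12 energy in the r0 form: eps * ((r0/r)**12 - 2 * (r0/r)**6)."""
    x6 = (r0 / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)


def sigma_from_r0(r0):
    """Distance at which the energy crosses zero: sigma = r0 / 2**(1/6)."""
    return r0 / 2.0 ** (1.0 / 6.0)


if __name__ == "__main__":
    r0, eps = combine("C(sp3)", "O(carbonyl)")
    print(f"r0_ij = {r0:.2f} A, eps_ij = {eps:.4f} kcal/mol, sigma_ij = {sigma_from_r0(r0):.3f} A")
    for r in (3.0, 3.4, 4.0, 6.0):
        print(f"r = {r:.1f} A  E_nb = {lj_energy(r, r0, eps):8.4f} kcal/mol")
```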
Other nonbonded potentials
• Buckingham potential:
$$E_{nb}(r) = A \exp(-\alpha r) - \frac{C}{r^{6}}$$
• 10-12 potential used in some force fields (e.g., ECEPP) for proton…proton-acceptor pairs:
$$E_{hb}(r) = \frac{C}{r^{12}} - \frac{D}{r^{10}}$$
Sources of parameters
Energy contribution               Source of parameters
Bond and bond angle distortion    Crystal and neutron-diffraction data, IR spectroscopy
Torsional                         NMR and FTIR spectroscopy
Nonbonded interactions            Polarizabilities, crystal and neutron-diffraction data
Electrostatic energy              Molecular electrostatic potentials
All                               Energy surfaces of model systems calculated with molecular quantum mechanics
Solvent in simulations
Explicit water
• TIP3P
• TIP4P
• TIP5P
• SPC
Implicit water
• Solvent accessible surface area (SASA) models
• Molecular surface area models
• Poisson-Boltzmann approach
• Generalized Born surface area (GBSA) model
• Polarizable continuum model (PCM)
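TIP3P model: charges of 0.417 e on each H and -0.834 e on O; H-O-H angle 104.52°; O-H bond length 0.9572 Å; σO = 3.1507 Å; εO = 0.1521 kcal/mol
TIP4P model: charges of 0.520 e on each H, 0.00 e on O, and -1.040 e on the additional site M, placed 0.15 Å from O; σO = 3.1535 Å; εO = 0.1550 kcal/mol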
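One way to sanity-check the TIP3P geometry and charges is to compute the dipole moment of a single model water molecule. The sketch below does this from the three point charges. The conversion factor 1 e·Å ≈ 4.8032 D is a standard constant, and the function names are invented for this example.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.8032  # 1 e*angstrom expressed in debye

# TIP3P parameters as listed above.
Q_H = 0.417        # charge on each hydrogen [e]
Q_O = -0.834       # charge on oxygen [e]
D_OH = 0.9572      # O-H bond length [angstrom]
ANGLE_HOH = 104.52  # H-O-H angle [degrees]


def net_charge():
    """Total charge of the model molecule; it should vanish."""
    return Q_O + 2.0 * Q_H


def dipole_debye():
    """Dipole of the three-point model.

    With oxygen at the origin, only the hydrogens contribute; their
    components perpendicular to the bisector cancel, leaving
    2 * q_H * d_OH * cos(angle / 2) along the bisector.
    """
    half_angle = math.radians(ANGLE_HOH / 2.0)
    return 2.0 * Q_H * D_OH * math.cos(half_angle) * E_ANGSTROM_TO_DEBYE


if __name__ == "__main__":
    print(f"net charge = {net_charge():+.3f} e")
    print(f"dipole     = {dipole_debye():.2f} D")
```

The result, about 2.35 D, is larger than the gas-phase dipole of water (about 1.85 D), as expected for a fixed-charge model meant to represent the polarized liquid.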
Solvent accessible surface area (SASA) models
$$F_{solw} = \sum_{i \in \text{atoms}} \sigma_i A_i$$
σi – free energy of solvation of atom i per unit area
Ai – solvent-accessible surface of atom i
Vila et al., Proteins: Structure, Function, and Genetics, 1991, 10, 199-218.
Comparison of the lowest-energy conformations of [Met5]enkephalin (H-Tyr-Gly-Gly-Phe-Met-OH) obtained with the ECEPP/3 force field in vacuo and with the SRFOPT model
[Figure panels: vacuum; SRFOPT]
Comparison of the molecular surfaces of the lowest-energy conformation of [Met5]enkephalin obtained without and with the SRFOPT model
[Figure panels: vacuum; SRFOPT]
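Molecular surface area model
$$F_{cav} = \gamma A$$
γ – surface tension
A – molecular surface area

Generalized Born surface area (GBSA) model
$$F_{solw} = F_{cav} + E_{pol}^{GB}$$
$$E_{pol}^{GB} = -\frac{332}{2}\left(\frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon_{out}}\right)\sum_{i,j}\frac{q_i q_j}{f_{GB}(r_{ij})}$$
$$f_{GB}(r_{ij}) = \sqrt{r_{ij}^2 + R_i R_j \exp\left(-\frac{r_{ij}^2}{4 R_i R_j}\right)}$$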
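The generalized Born polarization term can be sketched in a few lines. This example assumes the commonly used form in which the double sum runs over all pairs including i = j, with a prefactor of 332/2 (kcal Å mol⁻¹ e⁻²). It takes the effective Born radii R_i as given inputs rather than computing them, and the dielectric constants 1 and 80 are illustrative defaults.

```python
import math

COULOMB = 332.0  # kcal*angstrom/(mol*e^2), the constant used in the energy expression


def f_gb(r, radius_i, radius_j):
    """Interpolating function sqrt(r^2 + Ri*Rj*exp(-r^2 / (4*Ri*Rj)))."""
    rr = radius_i * radius_j
    return math.sqrt(r * r + rr * math.exp(-r * r / (4.0 * rr)))


def gb_polarization_energy(coords, charges, born_radii, eps_in=1.0, eps_out=80.0):
    """Polarization energy [kcal/mol]; the sum includes the self terms i == j."""
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r = math.dist(coords[i], coords[j])  # zero for i == j
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out) * total


if __name__ == "__main__":
    # Single ion of charge +1 e and Born radius 2 angstroms: the Born formula.
    print(gb_polarization_energy([(0.0, 0.0, 0.0)], [1.0], [2.0]))
    # Hypothetical ion pair 3 angstroms apart.
    print(gb_polarization_energy([(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)], [1.0, -1.0], [2.0, 2.0]))
```

For a single ion f_GB reduces to the Born radius, so the expression recovers the Born solvation energy. At large separations f_GB approaches r, giving the Coulomb-like screening term.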
All-atom representation of polypeptide chains
Coarse-grained representation of polypeptide chains
Coarse-grained force fields
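Physics-based potentials (statistical-mechanical formulation)
X – primary variables present in the model
Y – secondary variables not present in the model (solvent, side-chain dihedral angles, etc.)
E(X,Y) – all-atom energy function
$$U(\mathbf{X}) = F(\mathbf{X}) = -RT \ln\left\{\frac{1}{V_{\mathbf{Y}}}\int \exp\left[-\frac{E(\mathbf{X},\mathbf{Y})}{RT}\right] dV_{\mathbf{Y}}\right\}, \qquad V_{\mathbf{Y}} = \int dV_{\mathbf{Y}}$$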
Statistical potentials
$$W(x; c; s) = -RT \ln\frac{N_{obs}(x; c; s)}{N_{ref}(x; c; s)}$$
x – geometric variables
c – residue types
s – sequence context
[Figure: Leu-Leu pair. A – radial correlation function; B – reference distribution function; C –]
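The statistical-potential formula amounts to a Boltzmann inversion of observed versus reference counts. A minimal sketch follows; the counts in the demo are made-up numbers for a hypothetical distance histogram, and the temperature of 298.15 K is an assumed default.

```python
import math

R_GAS = 1.987204e-3  # gas constant [kcal/(mol K)]


def statistical_potential(n_obs, n_ref, temperature=298.15):
    """W = -RT * ln(N_obs / N_ref) for each bin; all counts must be positive."""
    rt = R_GAS * temperature
    return [-rt * math.log(obs / ref) for obs, ref in zip(n_obs, n_ref)]


if __name__ == "__main__":
    # Hypothetical counts of a residue pair in four distance bins.
    observed = [5.0, 40.0, 20.0, 10.0]
    reference = [10.0, 10.0, 20.0, 10.0]
    for k, w in enumerate(statistical_potential(observed, reference)):
        print(f"bin {k}: W = {w:+.3f} kcal/mol")
```

Bins populated more often than in the reference distribution get a negative (favorable) W, and under-populated bins get a positive one.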
Searching the conformational space
• Local energy minimization
• Monte Carlo
• Molecular dynamics
Low (lowest)-energy conformations
• Monte Carlo with minimization (MCM)
• Basin hopping
• Smoothing energy surface: diffusion equation method (DEM)
• Simulated annealing
• Genetic algorithms
Canonical conformational ensembles
• Canonical MC
• Canonical MD
• Replica-exchange MC (REMC)
• Replica-exchange MD (REMD)
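Local vs. global minimization
[Figure: f(x) plotted against x, showing the starting point, a local minimum, and the global minimum]
[Figure: contour plot in the (x1, x2) plane showing the iterates x(0), x(1), x(2), the search directions d(1), d(2), the minimum x*, and the one-dimensional function f(x(p) + αd(p)) minimized along the search direction]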
General scheme of local minimization of multivariate functions:
1. Choose the initial approximation x(0).
2. In pth iteration, compute the search direction d(p).
3. Locate x(p+1) as a minimum along the search direction (minimization of a function of one variable).
4. Terminate when convergence has been achieved or maximum number of iterations exceeded.
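The four steps above can be sketched with the simplest choice of search direction, the negative gradient (steepest descent). Two simplifications are assumed here and are not part of the general scheme: the gradient is estimated by finite differences, and step 3 uses an inexact backtracking (Armijo) line search instead of an exact one-dimensional minimization. All names are hypothetical.

```python
import math


def numerical_gradient(f, x, h=1e-6):
    """Central-difference gradient of f at x (x is a list of floats)."""
    grad = []
    for k in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


def backtracking_step(f, x, d, grad, step=1.0, shrink=0.5, c=1e-4):
    """Shrink the step along d until f decreases sufficiently (Armijo rule)."""
    fx = f(x)
    slope = sum(g * di for g, di in zip(grad, d))
    while step > 1e-12:
        trial = [xi + step * di for xi, di in zip(x, d)]
        if f(trial) <= fx + c * step * slope:
            break
        step *= shrink
    return step


def minimize(f, x0, tol=1e-6, max_iter=10000):
    x = list(x0)                                        # step 1: initial approximation
    for _ in range(max_iter):                           # step 4: iteration limit
        grad = numerical_gradient(f, x)
        if math.sqrt(sum(g * g for g in grad)) < tol:   # step 4: convergence test
            break
        d = [-g for g in grad]                          # step 2: search direction
        step = backtracking_step(f, x, d, grad)         # step 3: line search
        x = [xi + step * di for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    # Illustrative two-variable quadratic with its minimum at (1, -0.5).
    print(minimize(lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2, [3.0, 3.0]))
```

Like any local method, this converges to the minimum of the basin containing x(0); which basin that is depends entirely on the starting point.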
Deformation methods
Lowest-energy structure of gramicidin S computed with the ECEPP force field (M. Dygert, N. Go, H.A. Scheraga, Macromolecules, 8, 750-761 (1975)). This structure turned out to be identical with the NMR structure determined later.
The C-terminal part of HDEA protein found by global minimization of the UNRES coarse-grained effective energy function.
The N-terminal part of HDEA
Liwo et al., PNAS, 96, 5482–5485 (1999)
Comparison of the experimental structure of bacteriocin AS-48 from E. faecalis with the structure obtained by global minimization of the UNRES force field (Pillardy et al., Proc. Natl. Acad. Sci. USA, 98, 2329-2333 (2001))
“Potential energy” or “free energy”?
Nature (and a canonical simulation) finds the basin with the lowest free energy at a given temperature, which might, but does not have to, contain the conformation with the lowest potential energy.
Global-optimization methods are designed to find structures with the lowest potential energy, thus ignoring conformational entropy. Technically, this corresponds to canonical simulations at 0 K.
Comparison of minimum potential energies obtained in MD runs with the lowest values of the potential energy. Results of Langevin dynamics simulations are in parentheses.
PDB ID code (number of residues)   Emin (MD) [kcal/mol]   Eglob [kcal/mol]
1BDD (46)                          -409 (-414)            -597
1GAB (47)                          -461 (-501)            -669
1LQ7 (67)                          -658 (-652)            -937
1CLB (75)                          -740 (-709)            -1053
1E0G (48)                          -405 (-380)            -632
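Basic scheme of the Metropolis (canonical) Monte Carlo algorithm
1. Start from conformation Xo with energy Eo.
2. Perturb Xo: X1 = Xo + ΔX.
3. Compute the new energy, E1.
4. If E1 < Eo, accept the new conformation (Xo = X1, Eo = E1) and return to step 2.
5. Otherwise, sample Y from U(0,1) and compute W = exp[-(E1-Eo)/kT].
6. If W > Y, accept the new conformation (Xo = X1, Eo = E1); otherwise keep Xo and Eo. Return to step 2.
[Figure: a move to a lower energy is accepted unconditionally; a move to a higher energy, from E1 to E2, is accepted with probability exp[-(E2-E1)/kBT]]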
Calculation of averages
$$\langle A \rangle = \frac{1}{N}\sum_{i=1}^{N} A_i$$
The index i runs through all MC steps, including those in which new conformations have not been accepted.
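The Metropolis acceptance rule and the averaging convention (rejected steps are counted again with the old conformation) can be sketched for a one-dimensional toy system. The harmonic energy E(x) = x²/2, the move size, and the seed are illustrative assumptions; for this energy the exact canonical average of x² equals kT.

```python
import math
import random


def metropolis(energy, x0, n_steps, kT=1.0, max_move=1.5, seed=0):
    """Canonical Monte Carlo sampling of a one-dimensional energy function."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_move, max_move)   # perturb the conformation
        e_new = energy(x_new)
        # Accept downhill moves always, uphill moves if W = exp(-dE/kT) > Y.
        if e_new < e or math.exp(-(e_new - e) / kT) > rng.random():
            x, e = x_new, e_new
        samples.append(x)   # recorded at every step, also after a rejection
    return samples


def average(values):
    """Arithmetic mean over all MC steps."""
    return sum(values) / len(values)


if __name__ == "__main__":
    traj = metropolis(lambda x: 0.5 * x * x, 0.0, 200000)
    print("<x^2> =", average([x * x for x in traj]), "(exact value: kT = 1)")
```

Dropping the repeated entries for rejected moves would bias the average toward high-energy conformations, which is why every step contributes.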
Conformational space representation in Monte Carlo methods
• Lattice representation; the centers of interactions are on lattice nodes.
• Continuous; the centers are located in 3D space.
Sample MC trajectory of a good folder; Model 1a
An example of lattice Monte Carlo trajectory
A pathway of thermal unfolding of protein G simulated with the CABS model and lattice Monte Carlo dynamics
Kmiecik and Koliński, Biophys. J., 94, 726-736 (2008)
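Molecular dynamics
$$m_i \frac{d^2 x_i}{dt^2} = -\frac{\partial V}{\partial x_i}$$
$$\frac{d\mathbf{r}_i}{dt} = \mathbf{v}_i(t), \qquad i = 1, 2, \ldots, n$$
$$\frac{d\mathbf{v}_i}{dt} = \mathbf{a}_i(t) = \frac{1}{m_i}\mathbf{F}_i(\mathbf{r}(t)) = -\frac{1}{m_i}\nabla_{\mathbf{r}_i} V, \qquad i = 1, 2, \ldots, n$$
$$\mathbf{r}(t_0) = \mathbf{r}_0, \qquad \mathbf{v}(t_0) = \mathbf{v}_0$$
$$\mathbf{r}(t_0 + \Delta t) = \mathbf{r}(t_0) + \mathbf{v}(t_0)\,\Delta t + \tfrac{1}{2}\,\mathbf{a}(t_0)\,\Delta t^2 + \ldots$$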
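The Verlet algorithm:
$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\,\Delta t + \tfrac{1}{2}\,\mathbf{a}(t)\,\Delta t^2 + \ldots$$
$$\mathbf{r}(t - \Delta t) = \mathbf{r}(t) - \mathbf{v}(t)\,\Delta t + \tfrac{1}{2}\,\mathbf{a}(t)\,\Delta t^2 - \ldots$$
$$\mathbf{r}(t + \Delta t) + \mathbf{r}(t - \Delta t) = 2\,\mathbf{r}(t) + \mathbf{a}(t)\,\Delta t^2 + O(\Delta t^4)$$
$$\mathbf{r}(t + \Delta t) = 2\,\mathbf{r}(t) - \mathbf{r}(t - \Delta t) + \mathbf{a}(t)\,\Delta t^2, \qquad e \sim O(\Delta t^4)$$
$$\mathbf{v}(t) = \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t - \Delta t)}{2\,\Delta t}$$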
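The velocity Verlet algorithm
Step 1:
$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\,\Delta t + \tfrac{1}{2}\,\mathbf{a}(t)\,\Delta t^2$$
$$\mathbf{v}\left(t + \tfrac{\Delta t}{2}\right) = \mathbf{v}(t) + \tfrac{1}{2}\,\mathbf{a}(t)\,\Delta t$$
Step 2:
$$\mathbf{a}_i(t + \Delta t) = -\frac{1}{m_i}\nabla_{\mathbf{r}_i} U\left(\mathbf{r}(t + \Delta t)\right)$$
$$\mathbf{v}(t + \Delta t) = \mathbf{v}\left(t + \tfrac{\Delta t}{2}\right) + \tfrac{1}{2}\,\mathbf{a}(t + \Delta t)\,\Delta t$$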
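The leapfrog algorithm:
$$\mathbf{v}\left(t + \tfrac{\Delta t}{2}\right) = \mathbf{v}\left(t - \tfrac{\Delta t}{2}\right) + \mathbf{a}(t)\,\Delta t$$
$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}\left(t + \tfrac{\Delta t}{2}\right)\Delta t$$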
All three algorithms are symplectic; as a result, the total energy oscillates about a constant value (the “shadow Hamiltonian”), which is close but not equal to the initial energy. Many higher-order algorithms that are more accurate in a single step (e.g., the Gear algorithm) lack this property.
Symplectic algorithms have also been designed for isokinetic (constant temperature) and isobaric (constant pressure) simulations; an extended Hamiltonian is considered in these cases.
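The bounded energy oscillation of a symplectic integrator is easy to observe on a toy system. The sketch below applies the two velocity Verlet steps to a one-dimensional harmonic oscillator. The system, the reduced units (m = k = 1), and the time step are illustrative assumptions.

```python
def velocity_verlet(force, mass, r, v, dt, n_steps):
    """Integrate one degree of freedom; returns a list of (r, v) after each step."""
    a = force(r) / mass
    trajectory = []
    for _ in range(n_steps):
        # Step 1: new position and half-step velocity from the current acceleration.
        r = r + v * dt + 0.5 * a * dt * dt
        v_half = v + 0.5 * a * dt
        # Step 2: new acceleration at the new position, then complete the velocity.
        a = force(r) / mass
        v = v_half + 0.5 * a * dt
        trajectory.append((r, v))
    return trajectory


def total_energy(r, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * r * r


if __name__ == "__main__":
    traj = velocity_verlet(lambda r: -r, 1.0, 1.0, 0.0, 0.05, 10000)
    energies = [total_energy(r, v) for r, v in traj]
    print("initial energy: 0.5")
    print("min/max energy:", min(energies), max(energies))
```

The total energy stays within a narrow band of order Δt² around its initial value over the whole run instead of drifting.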
[Figure: kinetic, potential, and total energy [kcal/mol] plotted against time [ns], from 0.0 to 5.0 ns]
Dependence of the kinetic, potential, and total energy on time for coarse-grained Ac-Ala10-NHMe (Khalili et al., J. Phys. Chem. B, 2005, 109, 13785-13797)
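Berendsen’s thermostat (weak coupling with temperature bath)
$$\mathbf{v}_i \leftarrow \mathbf{v}_i \sqrt{1 + \frac{\Delta t}{\tau}\left(\frac{f k T}{2 E_k} - 1\right)}$$
$$E_k = \frac{1}{2}\sum_{i=1}^{n} m_i \left(v_{ix}^2 + v_{iy}^2 + v_{iz}^2\right)$$
f – number of degrees of freedom (3n)
τ – coupling parameter
Δt – time step
Ek – kinetic energy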
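The velocity rescaling of the Berendsen thermostat can be written as a small helper. This is a sketch assuming the scaling factor sqrt(1 + (Δt/τ)(f kT / (2 E_k) − 1)). The function names are hypothetical, and kT is passed in directly so that any consistent unit system can be used.

```python
import math


def kinetic_energy(masses, velocities):
    """E_k = 1/2 * sum_i m_i * (v_ix^2 + v_iy^2 + v_iz^2)."""
    return 0.5 * sum(m * (vx * vx + vy * vy + vz * vz)
                     for m, (vx, vy, vz) in zip(masses, velocities))


def berendsen_factor(e_kin, n_dof, kT, dt, tau):
    """Velocity scaling factor for one time step of weak coupling."""
    return math.sqrt(1.0 + (dt / tau) * (n_dof * kT / (2.0 * e_kin) - 1.0))


def rescale(velocities, factor):
    """Multiply every velocity component by the scaling factor."""
    return [tuple(factor * c for c in v) for v in velocities]


if __name__ == "__main__":
    masses = [1.0, 2.0]                               # illustrative values
    velocities = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]
    n_dof = 3 * len(masses)                           # f = 3n
    factor = berendsen_factor(kinetic_energy(masses, velocities), n_dof,
                              kT=0.6, dt=0.002, tau=0.1)
    print("scaling factor:", factor)
    print("E_k after scaling:", kinetic_energy(masses, rescale(velocities, factor)))
```

With Δt much smaller than τ the factor stays close to 1, so the kinetic energy relaxes gradually toward f kT/2. Setting τ = Δt would rescale to the target temperature in a single step.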
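Langevin dynamics (for implicit solvent)
$$m_i \frac{d^2 x_i}{dt^2} = -\frac{\partial V}{\partial x_i} - \gamma_i \frac{dx_i}{dt} + f_i^{rand}$$
Stokes’ law:
$$\gamma_i = 6\pi\,\eta_w\,(r_i + r_w)$$
Wiener process:
$$f_i^{rand} = \sqrt{\frac{2\gamma_i RT}{\Delta t}}\; N(0,1)$$
Brownian dynamics
$$\gamma_i \frac{dx_i}{dt} = -\frac{\partial E}{\partial x_i} + f_i^{rand}$$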
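The Brownian dynamics equation can be integrated with a simple explicit (Euler-Maruyama) step, as sketched below for one particle in a harmonic well. The integrator choice, the reduced units, and all parameter values are assumptions made for this illustration. For E(x) = k x²/2 the stationary average of x² should approach RT/k.

```python
import math
import random


def stokes_friction(viscosity, radius, solvent_radius):
    """Stokes' law: gamma = 6 * pi * eta_w * (r_i + r_w)."""
    return 6.0 * math.pi * viscosity * (radius + solvent_radius)


def brownian_step(x, grad_e, gamma, rt, dt, rng):
    """One explicit step of gamma * dx/dt = -dE/dx + f_rand."""
    f_rand = math.sqrt(2.0 * gamma * rt / dt) * rng.gauss(0.0, 1.0)
    return x + (-grad_e(x) + f_rand) * dt / gamma


def run(n_steps, k=1.0, gamma=1.0, rt=0.6, dt=0.02, seed=0):
    """Trajectory of a particle in the harmonic well E(x) = 0.5 * k * x^2."""
    rng = random.Random(seed)
    x = 0.0
    xs = []
    for _ in range(n_steps):
        x = brownian_step(x, lambda y: k * y, gamma, rt, dt, rng)
        xs.append(x)
    return xs


if __name__ == "__main__":
    xs = run(400000)
    print("<x^2> =", sum(x * x for x in xs) / len(xs), "(expected about RT/k = 0.6)")
```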
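[Figure: time-scale axis running from 10^-15 s (femto) through 10^-12 (pico), 10^-9 (nano), 10^-6 (micro), and 10^-3 (milli) to 10^0 s (seconds), marking the all-atom MD step, bond vibration, side-chain rotation, loop closure, helix formation, folding of β-hairpins, and protein folding]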
MD algorithm references:
1. Frenkel, D.; Smit, B. Understanding molecular simulations, Academic Press, 1996, Chapter 4.
2. Calvo, M. P.; Sanz-Serna, J. M. Numerical Hamiltonian Problems; Chapman & Hall: London, U. K., 1994.
3. Verlet, L. Phys. Rev. 1967, 159, 98.
4. Swope, W. C.; Andersen, H. C.; Berens, P. H.; Wilson, K. R. J. Chem. Phys. 1982, 76, 637.
5. Tuckerman, M.; Berne, B. J.; Martyna, G. J. J. Chem. Phys. 1992, 97, 1990.
6. Ciccotti, G.; Kalibaeva, G. Philos. Trans. R. Soc. London, Ser. A 2004, 362, 1583.
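Regular and multiplexed replica exchange algorithm
• N replicas are simulated independently for a reasonably long time using standard canonical MC or MD
• An exchange of two neighboring replicas is attempted with the following probability:
$$W(\{X,\beta_m\} \mid \{Y,\beta_n\}) = \begin{cases} 1 & \text{for } \Delta \le 0 \\ \exp(-\Delta) & \text{for } \Delta > 0 \end{cases}, \qquad \Delta = (\beta_n - \beta_m)\,[E(X) - E(Y)]$$
[Figure: regular and multiplexed replica-exchange schemes]
Y. Rhee, V. Pande, Biophys. J. 84, 775, 2003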
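The exchange criterion can be sketched as follows, assuming the usual convention Δ = (β_n − β_m)[E(X) − E(Y)], where conformation X is currently simulated at inverse temperature β_m and Y at β_n. The function names and the numbers in the demo are hypothetical.

```python
import math
import random


def exchange_probability(beta_m, beta_n, e_x, e_y):
    """Probability of swapping X (at beta_m) with Y (at beta_n)."""
    delta = (beta_n - beta_m) * (e_x - e_y)
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_exchange(beta_m, beta_n, e_x, e_y, rng):
    """Return True if the two replicas should swap temperatures."""
    return rng.random() < exchange_probability(beta_m, beta_n, e_x, e_y)


if __name__ == "__main__":
    rng = random.Random(0)
    beta_cold, beta_hot = 1.0 / 0.6, 1.0 / 0.8   # illustrative 1/(RT) values [mol/kcal]
    # The hot replica has found a lower energy than the cold one: always swap.
    print(exchange_probability(beta_cold, beta_hot, -100.0, -105.0))
    # The cold replica already has the lower energy: swap only occasionally.
    print(exchange_probability(beta_cold, beta_hot, -105.0, -100.0))
    print(attempt_exchange(beta_cold, beta_hot, -105.0, -100.0, rng))
```

A low-energy conformation found at a high temperature is thus always passed down to the colder replica, while the reverse swap is accepted with a Boltzmann-like probability, which preserves the canonical distribution at each temperature.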