INTRODUCTION TO MOLECULAR MODELING
Rajmund Kaźmierkiewicz, PhD
Laboratory of Biomolecular Systems Simulations
IFB UG-MUG
Course book prepared as part of the project "Educating the Best: a comprehensive development programme for doctoral students, young doctors and the academic teaching staff of the University of Gdańsk" ("Kształcimy najlepszych – kompleksowy program rozwoju doktorantów, młodych doktorów oraz akademickiej kadry dydaktycznej Uniwersytetu Gdańskiego")
Project no: UDA-POKL.04.01.01-00-017/10-00
Intercollegiate Faculty of Biotechnology UG-MUG
Gdańsk 2011
Introduction to molecular modeling by Rajmund Kaźmierkiewicz
2. LINUX
3. WHAT IS MODELING?
7.2. THE EMPIRICAL ENERGY FUNCTION (OR FORCE FIELD)
7.3. BOND STRETCHING
9.4.1. Conjugate gradient method without explicit knowledge of the Hessian matrix
9.5. THE BFGS ALGORITHM FOR UNCONSTRAINED OPTIMIZATION
9.6. TESTING MINIMA
9.7. MINIMIZATION AND MOLECULAR MECHANICS
10.4. THE TIME STEP
10.5. THE NEED FOR FASTER COMPUTERS
10.6. PHASE SPACE
10.7. CALCULATION OF AVERAGE PROPERTIES
10.9. OTHER WAYS OF EXPERIMENTAL VERIFICATION OF RESULTS OF MOLECULAR MECHANICS
10.10. THE PRESSURE
10.11. THE RADIAL DISTRIBUTION FUNCTION
10.12. CALCULATION OF DYNAMIC PROPERTIES FROM MOLECULAR DYNAMICS SIMULATIONS
10.13. CORRELATIONS AND THE CORRELATION TIME
10.14. CORRELATION FUNCTIONS AND PROPERTIES
10.15. THE TIME CORRELATION FUNCTION
10.16. VELOCITY AUTOCORRELATION FUNCTION
10.17. CALCULATING THE DIFFUSION COEFFICIENT FROM THE MEAN-SQUARE DISPLACEMENT
10.17.1. The Mean Square Displacement
10.17.2. What is the mean square distance and why is it significant?
10.17.3. The Mean Squared Displacement and the Velocity Autocorrelation Function
10.18. MOLECULAR DYNAMICS SIMULATIONS OF LIQUID WATER
10.25. THE "HEATING" DYNAMICS STAGE, THE TEMPERATURE CONTROL
10.25.1. Temperature
10.25.2. MD simulations with a temperature bath
10.25.3. Barriers, Temperature and Timescales
11.2.3. Markov chain Monte Carlo
11.3. IMPLEMENTATION OF THE METROPOLIS ALGORITHM (IT IS A KIND OF MARKOV CHAIN)
11.4. IMPLEMENTATION OF THE METROPOLIS ALGORITHM
11.5. ADVANTAGES OF METROPOLIS MC SIMULATIONS
11.6. MONTE CARLO MOVES
11.7. GENETIC ALGORITHMS IN MOLECULAR MODELING
11.7.4. Creation of a population of chromosomes
11.7.5. Definition of a fitness function
11.7.6. Genetic manipulation of the chromosomes
11.7.7. Applications of genetic algorithms in quantitative structure-activity relationships (QSAR) and drug design
12.1. TYPES OF COMPATIBILITIES
12.2. FINDING THE PLACE AND THE ORIENTATION OF THE INTERACTIONS
12.3. COMPUTATIONAL TIME
12.4. RIGIDITY VS. FLEXIBILITY
12.15. RIGID PROTEIN DOCKING
12.16. PARTIAL PROTEIN FLEXIBILITY
12.17. FULL PROTEIN FLEXIBILITY
12.18. EXAMPLES OF DOCKING PROGRAMS
13.2. IMPROVED BOND TREATMENTS
13.3. OTHER APPROACHES
13.4. AVAILABLE SOFTWARE
13.5. AN EXAMPLE - A DIELS-ALDER REACTION
13.6. GEOMETRY OPTIMIZATION AFTER MM DYNAMICS
13.7. OSCILLATIONS OF ACTIVE SITES AFTER MM DYNAMICS
13.8. COMPLEX REACTION AFTER MM DYNAMICS
13.9. ELECTRONIC EXCITATION IN FIXED MM MATRIX
14. NORMAL MODES AND PRINCIPAL COMPONENT ANALYSIS
14.1. ONE MASS AND TWO SPRINGS
14.2. TWO MASSES
14.3. N ATOMS AND POTENTIAL ENERGY FUNCTION
14.4. FOR THE MULTI-ATOM MOLECULE
14.5. HARMONIC APPROXIMATION IN ANHARMONIC SYSTEMS AND IN REAL PROTEINS
14.6. FORCE CONSTANTS
14.7. NMA USING MOLECULAR MECHANICS, REDUCING THE NUMBER OF VARIABLES
14.8. USING NORMAL MODE ANALYSIS TO MODEL PROTEIN DYNAMICS
14.9. THE EQUILIBRIUM CORRELATION BETWEEN FLUCTUATIONS
14.10. CALCULATION OF PROTEIN B-FACTORS
14.11. EXAMPLES OF APPLICATIONS USING NORMAL MODE ANALYSIS TO MODEL PROTEIN DYNAMICS
14.11.1. Collective dynamics of protofilaments in microtubules
14.11.2. Applications of NMA: ribosome (Application to EM Data)
14.11.3. Applications of Normal Mode Analysis to experimental EM maps
14.12. WHAT ARE THE LIMITATIONS OF NMA
14.13. THE PRINCIPAL COMPONENT ANALYSIS (PCA) METHOD
15.11.4. Binding free energy of protein-ligand
15.11.5. Binding free energy of protein-RNA
16. MOLECULAR DISTANCE GEOMETRY PROBLEM
16.1. CURRENT APPROACHES
17. PROTEIN FOLDING
17.1. ENERGY MINIMIZATION
17.2. SOME RELATED METHODS
17.3. MONTE CARLO-MINIMIZATION (MCM)
17.5. PREDICTING PROTEIN SECONDARY STRUCTURE
17.6. PROTEIN THREADING
17.7. REDUCED OR SIMPLIFIED PROTEIN MODELS
1. Introduction
Given the simplicity and the interpretative power of molecular structural models, apparent to chemists as early as the 1800s, it was only natural that scientists would develop mathematical tools to aid in understanding molecular structure and the molecular structural changes associated with chemical reactivity.
Models currently available to scientists for understanding molecular structure are numerous.
They range from simple, plastic, physical molecular representations to sophisticated mathematical
models. Mathematical models include molecular mechanics, the semiempirical quantum methods, the local density functional approach, and large-scale, computer-intensive ab initio structure procedures. Each has been usefully applied to (bio)chemical problems and each has practical
limitations. The present text is primarily focused on the use of molecular mechanics models in
(bio)chemistry. Most of these models are more complex than molecular images displayed on a
computer screen but substantially less sophisticated than electronic structure approaches.
Molecular mechanics deals with a simple, empirical “ball-and-spring” model of molecular
structure. Atoms (balls) are connected by springs (bonds) that can be stretched or compressed due
to intra- or intermolecular forces. The sizes of balls and the stiffness of the springs are determined
empirically, that is, they are chosen to reproduce experimental data.
Figure 1. Molecular mechanics is simply the best available method to classically model biomolecules, the building blocks of living systems
Molecular mechanics is often confused with bioinformatics. Both deal with molecular structure, but bioinformatics focuses mainly on deriving structural information from the sequence, while molecular mechanics focuses mainly on working with the structure once it is obtained. Both methodologies have some common fields of interest but approach them from different
directions. The aim of this textbook is to show the Reader that molecular mechanics is not about thoughtlessly running programs and clicking on pretty menu icons. I hope that it will help the Reader understand the basic concepts of molecular mechanics.
2. Linux
In molecular mechanics the Linux operating system is treated simply as a tool, a way of running the computer hardware. The choice of a specific operating system depends on its usability for that purpose. Linux has several advantages that make it a more convenient tool for molecular mechanics applications than other operating systems:
It is a true multi-user, multi-tasking operating system for PC hardware
It is flexible and powerful
It is stable; no blue screen of death
It is free; it includes much free software (most of the software is open-source, free and readily
available)
It offers a lot of “flavors”, which are called distributions, to suit most user needs
Currently Linux is a very mature operating system and, according to the TOP500 list from 11/2010, about 91.8% of supercomputers run Linux; to put this in perspective, the MS Windows share is about 1%. This means that most of the top 500 (www.top500.org) supercomputers run Linux or other Unix-like operating systems. Anyone who wishes to run calculations on supercomputers probably needs to become familiar with the basics of the Linux operating system. Some simple tasks in molecular mechanics do not require running jobs on supercomputers, but it is advisable to learn Linux on such "simple tasks" rather than to begin by sending jobs to supercomputers and risking a lot of trouble, of which an administrator's reprimand will probably be the least, if something goes wrong.
The contemporary Linux distributions are as simple to use as other operating systems. In addition to the usual, simple, window-based tools, Linux has an extensive set of commands. They are a convenient supplement to the graphical tools and are especially useful when organized into so-called "shell scripts". The real power of Linux reveals itself when one needs to run calculations on supercomputers remotely. Usually, after being sent through the remote access tools, the "computing jobs" are submitted to a queue where they await available computing resources. The results of finished jobs are stored on personal accounts in the supercomputer center. Most computer centers do not allow interactive analysis of the results, so they need to be downloaded and analyzed locally on the user's personal PC.
3. What is Modeling?
The term "computer modeling" has a very broad meaning in (bio)chemistry. It is not limited to "molecular mechanics". Usually it involves one or more of the following tasks: a) construction of a virtual 3-D molecule (a computer model), b) computation of some properties expected to be associated with the real molecule of which this is a representation, c) a virtual, microscopic experiment. Modeling has sometimes also been described as a "computational spectrometer". The computer modeling approaches can be divided into two classes:
Bond-based: bonds between atoms and properties of the bonds are part of the model; the methodologies are molecular mechanics, molecular dynamics and docking
Atom-based: positions of the atoms and their electronic structures are the input; the methodologies are quantum mechanical
Usually the molecular modeling strategy is applied to answer one (or more) specific questions: 1) what does a molecule look like? 2) what do its neighbors look like? 3) what does the neighborhood (the potential energy surface) look like? 4) how do we get from one neighborhood to another (what are the transition states)? 5) how do the structure and its neighborhood change with time? 6) how do two or more molecules interact with each other?
The applications of molecular modeling techniques are described throughout the whole textbook, but selected (bio)chemical applications can be listed here:
Protein folding landscapes
Interactions, such as: enzyme-substrate, drug-DNA
Interpretation of X-ray diffraction patterns, NOE spectra
Site-directed mutagenesis, the easy way
Homology modeling
Solvation models
4. Molecular mechanics: limitations
Force fields are generally not reactive: bond breaking and formation are not possible in the simulations. A commonly suggested solution is to replace the harmonic bond term with a dissociative Morse potential, but this does not provide the necessary changes in atomic hybridization.
Although long-range interactions (electrostatic, van der Waals) are included in the force fields, the electrostatic terms rely on the concept of partial charges associated with the nuclei. These charges keep fixed values throughout the simulations, so polarization effects are not included.
The transferability of the typical atom parameter sets between different force fields should always be questioned. In practice, the parameterization of different molecules relies on experimental data and quantum-mechanical calculations.
5. Graphical representations of molecular structure
CH4O: a chemical formula reflecting only the elemental composition of the compound.
CH3OH: a "condensed" chemical formula, used most often in organic chemistry.
Figure 2. Chemical formula depicting full topology (atom names and the organization of chemical
bonds) within the molecule
Figure 3. The projection of the 3D ball-and-stick model representation of the real molecule
6. Molecular structure description
6.1. Cartesian coordinates
Since Euclidean Space has no preferred origin or direction we need to add a coordinate
system before we can assign numerical values to points and objects in the space. Each point in the
three-dimensional coordinate system can be specified by 3 real numbers, the X, Y and Z coordinates
of that point.
Table 1. The molecular structure described by Cartesian coordinates
Atom number atom name X Y Z
1 C1 3.108 0.653 -8.526
2 C2 4.597 0.674 -8.132
3 1H1 2.815 -0.349 -8.761
4 2H1 2.517 1.015 -7.711
5 3H1 2.956 1.278 -9.381
6 1H2 4.748 0.049 -7.277
7 2H2 5.187 0.312 -8.947
8 3H2 4.890 1.676 -7.897
They determine an absolute location of each atom in a three-dimensional coordinate system. The
molecular structure described by Cartesian coordinates is rather hard to imagine and usually requires
a computer program to draw the molecule on a computer screen. The structural information, including atom names and Cartesian coordinates, can be arranged (formatted) in many possible ways. One specific arrangement was proposed by the authors of the Protein Data Bank and is called the PDB format.
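Reading such a fixed-column record can be sketched in a few lines. The column positions below follow the PDB ATOM record definition (atom name in columns 13-16, coordinates in columns 31-54, 1-based); the record itself is hand-made for illustration in the spirit of Table 1, not taken from a real PDB file:

```python
def parse_atom_line(line):
    """Extract the atom name and Cartesian coordinates from a PDB ATOM record.

    PDB is a fixed-column format: the atom name occupies columns 13-16 and
    the x, y, z coordinates columns 31-38, 39-46 and 47-54 (1-based)."""
    name = line[12:16].strip()
    x = float(line[30:38])
    y = float(line[38:46])
    z = float(line[46:54])
    return name, (x, y, z)

# A hand-made record (hypothetical residue name "LIG"):
record = "ATOM      1  C1  LIG A   1       3.108   0.653  -8.526  1.00  0.00           C"
print(parse_atom_line(record))   # ('C1', (3.108, 0.653, -8.526))
```

Slicing by column, rather than splitting on whitespace, is the safe way to read this format: fields in crowded PDB files can run together without separating spaces.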
6.2. Internal coordinates
Internal coordinates express the relation between atoms in molecules in terms of atom
connectivity, distances, angles and torsional (dihedral) angles. In contrast, Cartesian coordinates
define the molecules in terms of the atomic positions. A complete set of internal coordinates is called
a Z-matrix. Note that only 3N-6 internal coordinates are used in the Z-matrix construction, where N denotes the number of atoms: there are six zeros in the upper right corner of the matrix. The orientation of the structure in space is not specified. The six "missing" variables correspond to the three translations and three rotations of the whole structure (with respect to the three axes), which do not change the (internal) energy of the system and can therefore be omitted.
the first three atoms can be defined arbitrarily. Usually, in a Z-matrix, the first atom is the origin. The
second atom is defined by the distance to atom number 1, the third atom by a distance (to atom 1 or
atom 2) and a valence angle between atoms 3-2-1. Starting with the fourth atom the dihedral angle
(4-3-2-1) is introduced. From here on, every atom is described by a distance, a valence angle and a dihedral (torsional) angle with respect to already defined atoms.
Table 2. The molecular structure described by internal coordinates
Atom name bond length valence angle torsion angle reference atoms (bond, angle, torsion)
C 0.000000 000.000000 000.000000 0 0 0
C 1.540000 000.000000 000.000000 1 0 0
H 1.089000 109.471000 000.000000 1 2 0
H 1.089000 109.471000 180.000000 2 1 3
H 1.089000 109.471000 60.000000 1 2 4
H 1.089000 109.471000 -60.000000 2 1 5
H 1.089000 109.471000 180.000000 1 2 6
H 1.089000 109.471000 60.000000 2 1 7
The internal coordinates seem more intuitive; an experienced user usually has no problem imagining a very simple molecule just by looking at its set of internal coordinates organized in the form of a Z-matrix.
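The placement rules described above (first atom at the origin, second along an axis, third in a plane, every later atom fixed by a distance, an angle and a dihedral) can be turned into code. The sketch below is a minimal, illustrative implementation, not taken from any modeling package, and the sign convention of the dihedral should always be checked against the conventions of the target program. It rebuilds Cartesian coordinates for the ethane Z-matrix of Table 2:

```python
from math import cos, sin, radians, sqrt

def sub(a, b):   return (a[0]-b[0], a[1]-b[1], a[2]-b[2])
def cross(a, b): return (a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0])
def norm(a):     return sqrt(a[0]*a[0] + a[1]*a[1] + a[2]*a[2])
def unit(a):
    n = norm(a)
    return (a[0]/n, a[1]/n, a[2]/n)

def zmatrix_to_cartesian(zmat):
    """zmat rows: (name, bond, angle_deg, torsion_deg, ref_bond, ref_angle,
    ref_torsion), with 1-based reference atom numbers and 0 = unused, as in Table 2."""
    coords = []
    for i, (name, r, theta, phi, a, b, c) in enumerate(zmat):
        if i == 0:                         # first atom: the origin
            coords.append((0.0, 0.0, 0.0))
        elif i == 1:                       # second atom: on the x axis
            A = coords[a-1]
            coords.append((A[0] + r, A[1], A[2]))
        elif i == 2:                       # third atom: in the xy plane
            A, B = coords[a-1], coords[b-1]
            u, t = unit(sub(B, A)), radians(theta)
            coords.append((A[0] + r*(cos(t)*u[0] - sin(t)*u[1]),
                           A[1] + r*(cos(t)*u[1] + sin(t)*u[0]),
                           A[2]))
        else:                              # bond to C, angle via B, dihedral via A
            C, B, A = coords[a-1], coords[b-1], coords[c-1]
            t, p = radians(theta), radians(phi)
            bc = unit(sub(C, B))
            n = unit(cross(sub(B, A), bc))   # normal of the A-B-C plane
            m = cross(n, bc)                 # completes the local orthonormal frame
            d = (-r*cos(t), r*sin(t)*cos(p), r*sin(t)*sin(p))
            coords.append(tuple(C[k] + d[0]*bc[k] + d[1]*m[k] + d[2]*n[k]
                                for k in range(3)))
    return coords

# The ethane Z-matrix of Table 2:
ethane = [("C", 0.000, 0.0,     0.0,    0, 0, 0),
          ("C", 1.540, 0.0,     0.0,    1, 0, 0),
          ("H", 1.089, 109.471, 0.0,    1, 2, 0),
          ("H", 1.089, 109.471, 180.0,  2, 1, 3),
          ("H", 1.089, 109.471, 60.0,   1, 2, 4),
          ("H", 1.089, 109.471, -60.0,  2, 1, 5),
          ("H", 1.089, 109.471, 180.0,  1, 2, 6),
          ("H", 1.089, 109.471, 60.0,   2, 1, 7)]
xyz = zmatrix_to_cartesian(ethane)
print(round(norm(sub(xyz[1], xyz[0])), 3))   # 1.54 - the C-C bond is reproduced
```

Because each new atom is expressed in an orthonormal local frame, the input bond lengths and angles are reproduced exactly (up to floating-point precision).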
Internal coordinates are local: they are determined by the positions of already defined atoms. The molecular mechanics energy is expressed in terms of a combination of internal coordinates of the system (bonds, valence angles, torsional angles) and interatomic distances (for the non-bonded interactions). The atomic positions themselves are expressed in terms of Cartesian coordinates.
Internal coordinates can be calculated by the computer from Cartesian coordinates by exploiting vector operations. The bond length, r_ij, is defined as the distance between two bonded atoms i and j, and it is the length of the vector between atoms i and j:

r_ij = |r_j - r_i| = sqrt( (x_j - x_i)^2 + (y_j - y_i)^2 + (z_j - z_i)^2 )
The valence angle, also called the bond angle, θ_ijk, between two consecutive bonds originating at atom j is calculated from the scalar product of the two bond vectors:

cos(θ_ijk) = (r_ji · r_jk) / ( |r_ji| |r_jk| )

The valence angle is always positive and not larger than 180°; it is always the smaller of the two possible angles.
Figure 4. Internal coordinates: a) bond length, b) valence angle, c) torsional angle
The torsional angle is a dihedral angle, φ_ijkl, between two planes passing through atoms i, j, k and j, k, l, respectively. It is the angle between the vectors normal (i.e., perpendicular) to these planes. The torsional angle spans the range -180° to 180°. Its absolute value can be calculated as:

cos(φ_ijkl) = ( (e_ij × e_jk) · (e_jk × e_kl) ) / ( |e_ij × e_jk| |e_jk × e_kl| )

where e_ij is a unit vector pointing from atom i to atom j. It is defined as

e_ij = r_ij / |r_ij|

and |r_ij| is equal to the bond length. Only the absolute value of the torsional angle can be calculated that way. Additional checking has to be done to obtain the sign of the angle. In molecular mechanics we use
the right-hand screw rule. Some modeling systems may use other conventions.
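These definitions translate directly into code. The sketch below computes bond lengths, valence angles and signed torsional angles from Cartesian coordinates; for the torsion it uses the atan2 form of the formula, which yields the magnitude and the right-hand-screw sign in one step and so avoids the separate sign check mentioned above:

```python
from math import acos, atan2, degrees, sqrt

def sub(a, b):   return (a[0]-b[0], a[1]-b[1], a[2]-b[2])
def dot(a, b):   return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
def cross(a, b): return (a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0])
def norm(a):     return sqrt(dot(a, a))

def bond_length(ri, rj):
    return norm(sub(rj, ri))

def valence_angle(ri, rj, rk):
    """Angle i-j-k in degrees, always in [0, 180]."""
    u, v = sub(ri, rj), sub(rk, rj)
    return degrees(acos(dot(u, v) / (norm(u) * norm(v))))

def torsion_angle(ri, rj, rk, rl):
    """Signed dihedral i-j-k-l in (-180, 180], right-hand screw convention."""
    b1, b2, b3 = sub(rj, ri), sub(rk, rj), sub(rl, rk)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals to the two planes
    # atan2 returns both magnitude and sign in one step
    return degrees(atan2(dot(cross(n1, n2), b2) / norm(b2), dot(n1, n2)))

# four atoms in a zig-zag in the xy plane: the anti (trans) arrangement
ri, rj, rk, rl = (0, 0, 0), (1, 0, 0), (1.5, 1, 0), (2.5, 1, 0)
print(round(torsion_angle(ri, rj, rk, rl), 1))   # 180.0
```

The atan2 trick is a common alternative to the arccos expression above precisely because arccos loses the sign of the angle.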
6.3. Alternative description of the molecular structure
There are also some other “styles” of describing the 3D structures of molecules. Usually they
lead to more complicated mathematical expressions of potential energy functions.
The set of distances between all atoms is an equivalent description of the internal geometry of any
molecule.
Figure 5. An example representation of the molecular structure using distances. Only a subset of all distances is shown
Resolving the structure from distances alone may lead to two solutions, of which one represents a model of the real molecule and the other a mirror-image model of the real molecule. It is obvious to (bio)chemists that for most molecules only one solution is correct. Although it may not be immediately apparent, one can reproduce the complete molecular geometry from a sufficiently detailed contact map representation of the molecular structure.
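A contact map of the kind discussed here can be obtained from Cartesian coordinates with a simple double loop; the cutoff below is an arbitrary illustrative choice, not a recommended value:

```python
from math import dist  # Python >= 3.8

def contact_map(coords, cutoff=8.0):
    """Boolean contact map: True where two distinct atoms lie within
    `cutoff` (same units as the coordinates)."""
    n = len(coords)
    return [[dist(coords[i], coords[j]) <= cutoff and i != j for j in range(n)]
            for i in range(n)]

# three points on a line, 5 units apart: neighbours are in contact, the ends are not
pts = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
cm = contact_map(pts, cutoff=6.0)
print(cm[0][1], cm[0][2])   # True False
```

For proteins the same loop is usually run over one representative atom per residue (e.g. C-alpha), which is what a map like Figure 6 shows.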
Figure 6. The complete contact map of ubiquitin (PDB code: 1UBQ)
Another style of description of a molecular structure is the so-called "coarse-grained model". It is usually a simplified computer model and does not include all atoms. The purpose of introducing such a model is to speed up the most time-consuming tasks in molecular mechanics. The simplified representation of the molecular 3D structure enables, for example, computer simulations of protein folding pathways and of the self-assembly of complex cell structures.
Figure 7. An example coarse grained representation of molecular structure using virtual internal variables
7. Energy expressions in molecular mechanics
7.1. Potential classification
A classical potential V can be written in the form

V = Σ_i V1(r_i) + Σ_{i<j} V2(r_i, r_j) + Σ_{i<j<k} V3(r_i, r_j, r_k) + Σ_{i<j<k<l} V4(r_i, r_j, r_k, r_l)

where
V1 is a single-particle term (external fields)
V2 is a pair potential that depends on the interatomic separation (distance, bond length)
V3 is a three-body term (angular dependence, bond bending)
V4 is a four-body potential (torsional term)
7.2. The Empirical Energy Function (or Force Field)
The fundamental interacting unit in molecular mechanics is the atom, not individual electrons. Thousands of atoms can be considered in a calculation. The potential energy of the collection of atoms can be calculated as a fairly simple function of the atomic coordinates. This function is called the Potential Energy Function, and it is derived empirically by fitting to experimental spectroscopic data. The Potential Energy Function can be broken down into a sum of important interaction terms describing the contributions of bond stretching, bond angle bending, torsional angle rotation, non-bonded interactions (van der Waals and electrostatic interactions) and hydrogen bonds.
7.3. Bond stretching
Bond stretching energy term:

E_stretch = Σ_bonds k_ri (r_i - r_0i)^2

where the sum is over all covalent bonds,
k_ri = Hooke's-law spring constant for bond number i
r_0i = equilibrium bond length for bond number i
r_i = actual current value of bond length i
In this equation bond stretching is treated as a classical harmonic-oscillator term.
7.4. Angle bending
Bond angle bending term:

E_bend = Σ_angles k_θi (θ_i - θ_0i)^2

where the sum is over all covalent bond angles,
k_θi = spring constant for deformation of angle i
θ_0i = equilibrium bond angle for angle i
θ_i = current value of bond angle i
7.5. Bond rotation (torsion)
Dihedral (or torsional) angle rotation term:

E_torsion = Σ_dihedrals (V_i / 2) (1 + s cos(n φ_i))

where the sum is over dihedral angles,
V_i = barrier height
s = 1 for staggered minima, s = -1 for eclipsed minima
n = periodicity (n = 3 for ethane, n = 2 for ethene)
φ_i = current value of dihedral angle i
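The three bonded terms above can be evaluated with a few lines of code. The parameter values in this sketch are invented for illustration (in kcal/mol, Å and degrees) and are not taken from any published force field:

```python
from math import cos, radians

def bond_energy(k_r, r0, r):
    """Harmonic bond-stretching term k_r * (r - r0)**2."""
    return k_r * (r - r0) ** 2

def angle_energy(k_theta, theta0_deg, theta_deg):
    """Harmonic angle-bending term (angles given in degrees for readability)."""
    return k_theta * radians(theta_deg - theta0_deg) ** 2

def torsion_energy(V, n, phi_deg, s=1):
    """Torsional term (V/2)*(1 + s*cos(n*phi)); s=+1 staggered, s=-1 eclipsed minima."""
    return 0.5 * V * (1 + s * cos(n * radians(phi_deg)))

# a C-C bond stretched from 1.54 to 1.60 with an illustrative constant of 300
print(round(bond_energy(300.0, 1.54, 1.60), 2))   # 1.08
```

Note that with s = 1 and n = 3 the torsional energy vanishes at the staggered conformations of ethane (phi = 60°, 180°, -60°), exactly as the definitions above require.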
7.6. Non-bonded interactions (van der Waals)
Non-bonded interaction terms:

E_vdW = Σ_{i<j} 4 ε_ij [ (σ_ij / r_ij)^12 - (σ_ij / r_ij)^6 ]

where the double sum extends over all possible pairs of atoms separated by more than 2 bonds. The "combination rules" define ε_ij = (ε_i ε_j)^(1/2) and σ_ij = (σ_i + σ_j)/2, which are obtained from the single-atom parameters ε and σ.
This is the Lennard-Jones form of the van der Waals energy term; other mathematical expressions for this energy contribution also exist. It is often the most time-consuming
term in simulating large systems.
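A short sketch of the Lennard-Jones term together with the combination rules quoted above; the parameter values are illustrative only:

```python
def lorentz_berthelot(eps_i, sig_i, eps_j, sig_j):
    """Combination rules from the text: geometric mean for epsilon,
    arithmetic mean for sigma."""
    return (eps_i * eps_j) ** 0.5, 0.5 * (sig_i + sig_j)

def lj_energy(eps, sig, r):
    """Lennard-Jones 12-6 term: 4*eps*[(sig/r)**12 - (sig/r)**6]."""
    sr6 = (sig / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

# the minimum of the 12-6 potential lies at r = 2**(1/6)*sigma with depth -eps
eps, sig = lorentz_berthelot(0.1, 3.4, 0.1, 3.4)
print(round(lj_energy(eps, sig, 2 ** (1/6) * sig), 6))   # -0.1
```

Writing sr6 once and squaring it avoids computing the 12th power separately, a small but standard optimization, since this pair loop dominates the cost of large simulations.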
7.7. Gay-Berne potential
A variant of the Lennard-Jones potential used to describe interactions between elongated particles: both the well depth and the range parameter become functions of the orientations of the two particles and of the unit vector along the interparticle axis.
This potential is used, for example, in simulations of liquid crystals.
Figure 8. Coarse grained representation of liquid crystals
7.8. Non-bonded interactions (electrostatic)
Electrostatic interaction terms:

E_elec = Σ_{i<j} (q_i q_j) / (ε r_ij)

q_i = partial atomic charges. The "partial atomic charges" are one of the few ways of introducing quantum effects into otherwise classical functions. Since electrons are not considered explicitly in molecular mechanics, the only way of taking their distribution within the molecule into account is by fitting effective "point charges", centered near the atomic nuclei, to the electrostatic potential calculated using one of the quantum-mechanical ab initio methods. This is usually done by employing the so-called restrained electrostatic potential (RESP) algorithm. Such procedures lead to discrete charge distributions, centered at the given point(s), with partial (non-integer, fractional) values.
ε = dielectric constant of the medium. The physical origin of this constant is the value of the scalar
dielectric permittivity of the solvent, most often water.
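A sketch of the Coulomb term. The conversion constant of about 332.06 kcal·Å/(mol·e²) is the usual choice when charges are given in units of the elementary charge and distances in Ångströms (the convention of AMBER-style codes), but unit conventions differ between packages:

```python
def coulomb_energy(qi, qj, r, epsilon=1.0, k=332.0637):
    """Coulomb term k*qi*qj/(epsilon*r); with charges in units of e, r in
    Angstroms and k ~ 332.06, the energy comes out in kcal/mol."""
    return k * qi * qj / (epsilon * r)

# two opposite partial charges of 0.4 e at 3 Angstroms in vacuum (epsilon = 1)
print(round(coulomb_energy(0.4, -0.4, 3.0), 2))   # -17.71
```

Raising epsilon (e.g. to mimic water screening) scales the whole interaction down by the same factor, which is exactly how the dielectric constant enters the formula above.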
There are also other terms added by some simulation packages to improve correlation with the
experiment:
"Improper" torsional terms
Out-of-Plane Bending
The potential for moving an atom out of a plane is sometimes treated separately from bending (although it also involves bending). An out-of-plane coordinate (either χ or d) is displayed below. The potential is usually taken to be quadratic in this out-of-plane bend:

E_oop = (k_χ / 2) χ^2
Figure 9. Out of plane variable definitions
Hydrogen bonding
Various expressions of stretch/bend cross terms
Cross terms are required to account for some interactions affecting others. For example, a strongly bent water molecule tends to stretch its O–H bonds. This can be modeled by cross terms such as the stretch-bend coupling V = krθ (r − r0)(θ − θ0).
Other cross terms might include stretch-stretch, bend-bend, stretch-torsion and bend-torsion. Force
field models vary in what types of cross terms they use.
Figure 10. Schematic representation of a cholesterol molecule, and definition of the bond distances, bond angles, dihedral angles and Coulomb interactions
7.9. The total potential energy
The total potential energy of any molecule is the sum of simple terms allowing for bond stretching,
bond angle bending, bond twisting, van der Waals interactions and electrostatics.
Numerous properties of biomolecules can be simulated with such an empirical energy function.
There are several forms of mathematical expressions for the classical total potential energy of a molecule. It seems that each author's ambition is to modify parts of the mathematical function to give the impression of introducing something new. The most common expression is:

V = Σbonds kb(b − b0)² + Σangles kθ(θ − θ0)² + Σdihedrals (Vn/2)[1 + cos(nφ − γ)] + Σi<j [Aij/rij¹² − Bij/rij⁶ + qi qj/(ε rij)]
It is used together with a database of standard residues (fragments of more complex molecules) and is accompanied by a set of parameters, usually optimized for evaluation of the properties of a given class of chemical compounds. All three components (the mathematical expression, the database of standard residues and the set of parameters) together constitute the Empirical Force Field.
8. Empirical Force Field
Popular Force Fields for Macromolecules, optimized for calculation of properties of proteins and
nucleic acids:
AMBER (Cornell et al. J. Am. Chem. Soc. 1995. 117: 5179)
CHARMM (MacKerell et al. J. Phys. Chem. B. 1998. 102: 3586.)
GROMOS (Schuler et al. J. Comput. Chem. 2001. 22: 1205.)
Parameters of the empirical force fields depend on hybridization and the immediate surroundings of
the given atom. The consequence of this dependence is the high number of “force field atom types”
associated with one atom of the particular chemical element. One may ask: How Many Parameters
are There?
AMBER has 40 atom types.
There are 13 types of carbon, including:
sp3 carbon
carbonyl sp2 carbon
aromatic sp2 carbon
sp2 carbon, double bonded
There are also the bond stretch and angle parameters for each valid combination of atom types.
There are 1 to 3 torsional parameters for many combinations of atoms and there are 30
improper torsions.
There is one set of van der Waals parameters for each atom type, which are combined for each
pairwise interaction.
Atomic charges are set for the atoms in each amino acid/nucleotide residue.
8.1. Fitting Parameters
Bond stretch and bond angle parameters are fit to IR and Raman spectroscopic data from simple model molecules. Dihedral term parameters are fit to energies derived from ab-initio quantum mechanical calculations, usually at the MP2/6-31G* level. Van der Waals parameters are fit to reproduce properties of liquids such as densities and enthalpies of vaporization.
Table 3. The sample force constants and reference bond lengths for selected bonds
Bond          r0 (Å)   k (kcal mol-1 Å-2)
Csp3-Csp3     1.523    317
Csp3-Csp2     1.497    317
Csp2=Csp2     1.337    690
Csp2=O        1.208    777
Csp3-Nsp3     1.438    367
C-N (amide)   1.345    719
Table 4. The sample force constants and reference angles for selected angles
Angle             θ0 (deg)   k (kcal mol-1 deg-2)
Csp3-Csp3-Csp3    109.47     0.0099
Csp3-Csp3-H       109.47     0.0079
H-Csp3-H          109.47     0.0070
Csp3-Csp2-Csp3    117.2      0.0099
Csp3-Csp2=Csp2    121.4      0.0121
Csp3-Csp2=O       122.5      0.0101
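The tabulated constants can be plugged directly into the harmonic terms. A minimal Python sketch (function names are illustrative; it assumes the k(r − r0)² convention, without the factor 1/2, which matches the magnitudes in the tables):

```python
def bond_energy(r, r0, k):
    """Harmonic bond-stretch energy E = k*(r - r0)**2, in kcal/mol."""
    return k * (r - r0) ** 2

def angle_energy(theta, theta0, k):
    """Harmonic angle-bend energy E = k*(theta - theta0)**2, angles in degrees."""
    return k * (theta - theta0) ** 2

# Csp3-Csp3 bond stretched from 1.523 to 1.55 Angstrom (k = 317 from Table 3):
print(round(bond_energy(1.55, 1.523, 317.0), 4))      # 0.2311 kcal/mol
# H-Csp3-H angle bent from 109.47 to 112 degrees (k = 0.0070 from Table 4):
print(round(angle_energy(112.0, 109.47, 0.0070), 4))  # 0.0448 kcal/mol
```

Note how cheap a few hundredths of an Å or a couple of degrees of distortion are; large deviations from the reference geometry, by contrast, are punished quadratically.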
8.2. Fitting Charges
The “atomic charges” are a useful approximation. Quantum mechanics tells us that electrons
are delocalized to probable regions in space, and their charge is shared among nearby atoms. There
is no unique way to assign electrons to particular atoms. For molecular mechanics, we want to
associate charges with atoms. Charges are fit at atomic locations using the RESP (Restrained Electrostatic Potential) method. First, the model molecules are placed in a 3D grid of points, then the electrostatic potentials are calculated at each point in the grid. Using the potentials, charges are fit at the atomic locations to reproduce, as closely as possible, the potential at all points of the grid.
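The idea of fitting charges to grid potentials can be sketched for the simplest possible case: two charge sites and a least-squares solve via the 2x2 normal equations. This is a toy stand-in for the full RESP procedure (no restraints, no charge constraint); all names and geometries are invented for illustration:

```python
import math

K = 332.06  # kcal*Angstrom/(mol*e^2)

def fit_two_charges(sites, grid, potentials):
    """Least-squares fit of two point charges to potentials sampled on a grid.

    Builds the 2x2 normal equations A^T A q = A^T V, where A[p][i] = K/|p - site_i|,
    and solves them with Cramer's rule."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for p, v in zip(grid, potentials):
        c1 = K / math.dist(p, sites[0])
        c2 = K / math.dist(p, sites[1])
        a11 += c1 * c1; a12 += c1 * c2; a22 += c2 * c2
        b1 += c1 * v; b2 += c2 * v
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

sites = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]
grid = [(x, y, 2.0) for x in (-2.0, 0.0, 2.0, 4.0) for y in (-2.0, 0.0, 2.0)]
true_q = (0.4, -0.4)  # synthetic "QM" reference charges
pots = [sum(K * q / math.dist(p, s) for q, s in zip(true_q, sites)) for p in grid]
print(fit_two_charges(sites, grid, pots))  # recovers approximately (0.4, -0.4)
```

Because the synthetic potentials here come from an exact point-charge model, the fit recovers the charges essentially exactly; with real ab-initio potentials the fit is only approximate, which is why RESP adds restraints.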
Figure 11. Sample non-standard residue with the experimental values of bond lengths and bond angles and the fitted charge values
8.3. Problems with the infinite range of non-bonded interactions
Like van der Waals terms, electrostatic terms are typically computed for non-bonded atoms in a so-called 1-4 relationship or beyond, i.e. for atoms separated by three or more bonds. They are also long-range interactions and dominate the computation time.
The number of non-bonded interactions grows quadratically with molecule size. The
computation time can be reduced by “cutting off” (excluding) the interactions after a certain
distance. The van der Waals terms decrease relatively quickly (~R−6) and can be “cut off” around 10 Å. The electrostatic terms decrease more slowly (~R−1) and are much harder to treat correctly with cutoffs.
Figure 12. Comparison of the „typical” contributions to potential non-bonded energies of interactions (Van der Waals and the electrostatic energies)
The point-charge model has serious deficiencies: (a) electrostatic potentials are not accurately reproduced; (b) simple models do not allow the charges to change as the molecular geometry changes, but they should (this problem is partially overcome by careful parameterization of the torsional potential); (c) only pairwise interactions are considered, but an electrostatic interaction can actually change by about 10-20% in the presence of a third body due to induction or “polarization” effects.
Until recently, the most frequently used method to handle electrostatic and van der Waals
interactions was to ignore all interactions between atoms whose internuclear distance is longer than
a certain cutoff value. Such an approach is usually called the Cut-off Method. In practical
applications, it is convenient to establish a cutoff radius Rc and disregard the interactions between
atoms separated by more than Rc. The same cutoff radius is defined for each atom. This approach
defines a sphere around each atom where all interactions are calculated, beyond this sphere all non-
bonded interactions are ignored. This results in simpler programs and enormous savings of computer
resources, because the number of atomic pairs separated by distance r grows as r² and quickly becomes huge. A simple truncation of the potential creates a new problem though: whenever a
particle pair “crosses” the cutoff distance, the energy makes a little jump. The so-called group-based cutoffs alleviate this problem somewhat, because all contributions of an entire residue are included (or omitted) together. In this case all groups should be neutral, or almost so, and they should be much smaller than the cut-off radius. Despite these countermeasures, a large number of “small energy jumps” is likely to spoil energy conservation in a simulation. To avoid this problem, the potential is often shifted so that it vanishes at the cutoff radius. Physical quantities are of course affected by this potential truncation.
Figure 13. The non-bonded cutoffs. A. the interacting atoms without applying cutoff,
B. interacting atoms after applying cutoff
There are several possible choices concerning how the cutoff can be used:
Truncation: the interactions are simply set to zero for interatomic distances greater than the cutoff
distance. This method can lead to large fluctuations in the energy. This method is not often used.
The SHIFT cutoff method: this method modifies the entire potential energy surface such that at the
cutoff distance the interaction potential is zero. The drawback of this method is that equilibrium
distances are slightly decreased.
The SWITCH cutoff method: This method tapers the interaction potential over a predefined range of
distances. The potential takes its usual value up to the first cutoff and is then switched to zero
between the first and last cutoff. This model suffers from strong forces in the switching region which
can slightly perturb the equilibrium structure. The SWITCH function is not recommended when using
short cutoff regions.
After applying a correct switching function, both the energy and its gradients are continuous, the total energy is conserved and the thermodynamic properties are not affected.
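A smooth cutoff can be sketched in a few lines of Python. The function names and the Lennard-Jones parameters are illustrative assumptions; the switching polynomial is a commonly used CHARMM-style form that goes smoothly from 1 at r_on to 0 at r_off with a continuous first derivative:

```python
def switch(r, r_on, r_off):
    """Switching function: 1 below r_on, 0 above r_off, smooth in between."""
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    a = r_off ** 2 - r ** 2
    b = r_off ** 2 - r_on ** 2
    return a * a * (r_off ** 2 + 2.0 * r ** 2 - 3.0 * r_on ** 2) / b ** 3

def lj(r, eps=0.2, sigma=3.4):
    """12-6 Lennard-Jones energy in kcal/mol, illustrative parameters."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)

r_on, r_off = 8.0, 10.0
for r in (7.0, 8.5, 9.5, 10.5):
    # the switched energy tapers to exactly zero beyond r_off
    print(r, round(lj(r) * switch(r, r_on, r_off), 6))
```

Multiplying the pair energy by such a factor is exactly the SWITCH method described above; setting the taper region to zero width would recover simple truncation, jumps and all.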
Figure 14. An illustration of various cutoff application methods
Cut-offs also apply to neighbor list updating: only atoms within the neighbor list need to be considered in calculations of the potential energy. Including atoms slightly beyond the cutoff (“close” atoms) avoids recalculation of the neighbor list on each iteration; the list updating step is carried out using displacement-based criteria for recalculation of the neighbor list.
Figure 15. The neighbor list updating. Each atom is in the center of its own interaction sphere and there is a list of atoms included within each sphere
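A Verlet-style neighbor list with a displacement-based update criterion can be sketched as follows (a minimal Python illustration; the names and the skin width are hypothetical):

```python
import math

def build_neighbor_list(coords, r_list):
    """All pairs within r_list = r_cutoff + skin (brute force, O(N^2))."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < r_list:
                pairs.append((i, j))
    return pairs

def needs_update(coords, coords_at_build, skin):
    """Rebuild once any atom has moved more than skin/2 since the last build,
    because only then could an unlisted pair have entered the true cutoff."""
    return any(math.dist(a, b) > 0.5 * skin
               for a, b in zip(coords, coords_at_build))

coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
r_cut, skin = 10.0, 2.0
pairs = build_neighbor_list(coords, r_cut + skin)
print(pairs)                              # [(0, 1), (0, 2), (1, 2)]
moved = [(0.0, 0.0, 0.0), (3.0, 1.5, 0.0), (9.0, 0.0, 0.0)]
print(needs_update(moved, coords, skin))  # True: atom 1 moved 1.5 > skin/2
```

Production codes replace the O(N²) build with cell lists, but the skin/2 update criterion is the same idea as the displacement-based criterion described above.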
8.4. Problems with high values of electrostatic potential
The high values of the electrostatic potential around some molecules can be either an advantage or a disadvantage, depending on what values of the electrostatic potential are desired in the given environment. Even the smallest molecules generate noticeable electrostatic fields.
Figure 16. Molecular Dipole Moments are the vector sum of the individual bond dipole moments. They depend on the magnitude and direction of the bond dipoles
The consequence of the existence of naturally occurring dipoles is the characteristic behavior of such molecules, which tend to reorient spontaneously (mainly rotate) to accommodate both self-generated and external electric fields. This tendency also affects fragments of molecules if they possess measurable dipole moments.
The dipole-dipole (or multipole-multipole; a multipole is a higher-order spatial arrangement of charges that takes into account the separation of more than two charged interacting sites) interaction can also be applied in some cases to molecules observed from “large” distances. At larger separations, the details of the charge distribution are less important. Keep in mind that for molecules “large distance” may mean just a couple of nanometers.
Figure 17. An illustration of the dipole-dipole interactions: A means attraction, R means repulsion
(The molecules compared in Figure 16 are NH3, H2O, CO2 and CH3Cl, with dipole moments of 1.9 D (H2O), 0.0 D (CO2) and 1.87 D (CH3Cl).)
Unfortunately, mathematical expressions describing the interactions of dipoles and higher multipoles tend to be more complicated than the simple Coulomb interaction potential.
Figure 18. Sample mathematical expressions relating to the point multipole models based on long-range behavior
Among the molecules of interest to (bio)chemists, phospholipids and nucleic acids, particularly DNA, display high electrostatic field values. Such molecules may display high affinity (i.e. strong attractive forces) to other charged molecules, especially proteins.
8.5. Dielectric permittivity
Treatment of the dielectric permittivity of a molecule's environment in molecular mechanics is interrelated with the “solvent model” used for the medium surrounding the molecule in computer models. A treatment of the dielectric permittivity that is satisfactory from the physical point of view is probably not possible: it is a macroscopic, classical quantity whose value results from the cooperative interactions of many molecules, yet it is applied to the description of interactions of a single molecule, which belongs to the microscopic world. Most classical force fields do not take into account polarization effects (creation of induced dipoles); therefore a single, scalar, effective value of the dielectric permittivity of the solvent is used.
9. Optimization of a structure
The empirical force field can be represented by a 3N-dimensional potential energy hypersurface. The whole hypersurface is not very interesting: in molecular mechanics only a few points, called “the stationary points”, are important. More precisely, we need information only about points where the gradient of the potential energy function equals 0. Is there a reason why we care about the stationary points (especially minima) on the potential energy hypersurface?
Figure 19. Schematic representation of the potential energy hypersurface, with a minimum, a maximum and a saddle point indicated
The physical meaning of special points on the potential energy hypersurface:
Reactants (substrates of the (bio)chemical reactions, starting material), products and
intermediates (regardless of their lifetime) correspond to energy minima.
Energy minima correspond also to conformations of any compound in its standard, stable state
The most stable conformation (the native conformation) of the molecule corresponds to the
global minimum on the potential energy hypersurface
Energy maxima are (bio)chemically irrelevant.
Saddle points correspond to transition states.
If qi corresponds to one of the normal coordinates of the system, then ∂²V/∂qi² corresponds to the force constant of this normal vibration.
9.1. Successive Coordinate Direction Method
Starting from a point p we can define a line in the direction specified by a vector n, parameterized by λ. Along this line, any point is given by

x = p + λn

Now f(x) = f(p + λn) is a function of the single variable λ, which may be minimized using any one-dimensional method. This process is called line minimization. The result is a line minimum of f. After one iteration of this process, the line minimum is used as the starting point p in the next iteration, for a different choice of the direction vector n.
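The one-dimensional step can be sketched in Python with a golden-section search (an illustrative choice; any one-dimensional minimizer would do, and the bracket [a, b] is assumed to contain the line minimum):

```python
def line_minimize(f, p, n, a=0.0, b=1.0, tol=1e-8):
    """Golden-section search for the lambda that minimizes f(p + lambda*n)."""
    phi = (5 ** 0.5 - 1) / 2  # inverse golden ratio, ~0.618
    g = lambda lam: f([pi + lam * ni for pi, ni in zip(p, n)])
    c, d = b - phi * (b - a), a + phi * (b - a)
    while abs(b - a) > tol:
        if g(c) < g(d):      # minimum lies in [a, d]
            b, d = d, c
            c = b - phi * (b - a)
        else:                # minimum lies in [c, b]
            a, c = c, d
            d = a + phi * (b - a)
    lam = 0.5 * (a + b)
    return lam, [pi + lam * ni for pi, ni in zip(p, n)]

# Minimize f(x, y) = (x-1)^2 + (y-2)^2 along the x direction from the origin:
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
lam, xmin = line_minimize(f, [0.0, 0.0], [1.0, 0.0], 0.0, 3.0)
print(round(lam, 4), [round(v, 4) for v in xmin])  # 1.0 [1.0, 0.0]
```

Note that the line minimum [1.0, 0.0] is not the minimum of f itself; further iterations with new directions n are needed, which is exactly what the methods below organize.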
9.2. Newton’s Method for Finding a Minimum
Now we turn to the minimization of a function f(x) of n variables using the Newton method, where x = (x1, …, xn) and the partial derivatives of f are accessible.

Assume that the first and second partial derivatives of f exist and are continuous in a region containing the point x0, and that there is a minimum at the point xmin. The quadratic polynomial approximation to f(x) is:

f(x) ≈ f(x0) + ∇f(x0) · (x − x0) + (1/2)(x − x0)ᵀ H(x0)(x − x0)

where H denotes the Hessian matrix of second partial derivatives. A minimum of this quadratic function occurs where its gradient vanishes. The expression for the gradient can be written as

∇f(x) ≈ ∇f(x0) + H(x0)(x − x0)

If the point x0 is close to the point xmin (where a minimum of f occurs), then H(x0) is invertible and the above expression, set equal to zero, can be solved for x, and we have

x = x0 − H(x0)⁻¹ ∇f(x0)

This value of x can be used as the next approximation to xmin and is the first step in Newton's method for finding a minimum.
The Newton-Raphson method uses not only the gradient of a function but also its second derivatives to determine the search direction. This direction is kept for each new step until a minimum has been found; then a new search direction is determined and the process continues. The method converges only when the Hessian is positive definite, i.e. near the minimum.
Figure 20. Successive minimizations of f(x) along coordinate directions
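The Newton step above can be sketched for two variables, with the 2x2 Hessian inverse written out explicitly (a toy Python illustration; the test function is an assumption chosen so that its minimum is known):

```python
def newton_step(grad, hess, x):
    """One Newton step x_new = x - H(x)^-1 * grad(x) for two variables."""
    g1, g2 = grad(x)
    (h11, h12), (h21, h22) = hess(x)
    det = h11 * h22 - h12 * h21        # assumed nonzero (H invertible)
    dx1 = (h22 * g1 - h12 * g2) / det  # components of H^-1 * grad
    dx2 = (h11 * g2 - h21 * g1) / det
    return (x[0] - dx1, x[1] - dx2)

# f(x, y) = x^2 + x*y + y^2 - 3x is quadratic with its minimum at (2, -1),
# so a single Newton step from any point lands exactly on the minimum:
grad = lambda x: (2 * x[0] + x[1] - 3.0, x[0] + 2 * x[1])
hess = lambda x: ((2.0, 1.0), (1.0, 2.0))
print(newton_step(grad, hess, (0.0, 0.0)))  # (2.0, -1.0)
```

For a non-quadratic f the step is only approximate and must be iterated, and far from the minimum (where H is not positive definite) it can even point uphill.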
9.3. Steepest Descents
This minimization method can be summarized in three points:
Step downhill in a direction of local steepest gradient using trial step length
Perform a line minimization to find the optimal step length
Repeat to convergence
9.3.1. Steepest Descent Method
Here for each iteration of line minimization the direction is chosen to be the local downhill gradient −∇f(p). Although n points along the downhill gradient at the starting point p, at the line minimum the vector n is perpendicular to the local gradient of f(x). Consequently, n makes a 90° turn at every iteration, which results in a zigzag path along a "long valley" toward the minimum of f(x).
Figure 21. Successive minimizations of f(x) using the steepest descent method
Among the advantages of Steepest Descents is that it can be easily implemented. It is also very robust and reliable: it will always reach a minimum. Unfortunately, it is often very slow to converge.
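The method can be sketched in a few lines of Python (illustrative only; a crude backtracking step search stands in for a proper line minimization, and the narrow-valley test function is an assumption):

```python
def steepest_descent(f, grad, x, steps=100, h=1e-3, tol=1e-10):
    """Steepest descent: step along -grad f, halving the step until f decreases."""
    for _ in range(steps):
        g = grad(x)
        if sum(gi * gi for gi in g) < tol:  # gradient ~ 0: stationary point
            break
        step, fx = 1.0, f(x)
        while True:  # backtrack until the trial point lowers the energy
            trial = [xi - step * gi for xi, gi in zip(x, g)]
            if f(trial) < fx or step < h:
                break
            step *= 0.5
        x = trial
    return x

# Narrow quadratic valley f = x^2 + 10*y^2: the path zigzags but converges.
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
grad = lambda x: [2.0 * x[0], 20.0 * x[1]]
xmin = steepest_descent(f, grad, [5.0, 1.0], steps=500)
print([round(v, 4) for v in xmin])  # close to the minimum at (0, 0)
```

The many iterations needed even for this simple two-dimensional valley illustrate why the zigzagging described above makes the method slow.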
9.4. Conjugate Gradient Method
Recall that for a scalar quadratic function f, the gradient is given by

∇f = Hx − b

Along any direction, the variation of this gradient is given by

δ(∇f) = H δx

Suppose that f has been line minimized along the direction u, say at p:

u · ∇f = 0

Then a successive line minimization along another direction v, without spoiling the previous line minimization, should satisfy

u · δ(∇f) = 0

where the variation of the gradient is induced by moving along v, hence δ(∇f) = Hv.

It follows that we must have

uᵀHv = 0

Any two vectors u and v satisfying the above are said to be conjugate.
For a scalar quadratic function, a sequence of N line minimizations using independent conjugate
directions will lead to the exact minimum.
An effective way to find these conjugate directions is via the Fletcher-Reeves algorithm as follows:
Start with an arbitrary initial vector g0 and another vector h0 = g0. The algorithm generates two
sequences of vectors:
g0, g1, g2, …
and
h0, h1, h2, …
using the following recurrence. First, calculate

gi+1 = gi − λi H hi

where

λi = (gi · hi) / (hi · H hi)

Then, calculate

hi+1 = gi+1 + γi hi

where

γi = (gi+1 · gi+1) / (gi · gi)
9.4.1. Conjugate gradient method without explicit knowledge of the
Hessian matrix
The above algorithm assumes the availability of the Hessian matrix H. If for some reason, e.g. due to
data storage limitation, H is not available, but the gradient of f(x) can still be evaluated, then the
following algorithm for the conjugate gradient method due to Fletcher-Reeves can be employed:
1. Start from some point pi and define gi = −∇f(pi), with hi = gi for the first step.
2. Perform a line minimization along hi, i.e. minimize f(pi + λhi) with respect to λ.
3. Use the resulting λ to assign λi = λ and pi+1 = pi + λi hi.
4. This yields gi+1 = −∇f(pi+1), from which we have hi+1 = gi+1 + γi hi, where γi = (gi+1 · gi+1) / (gi · gi) as before.
Figure 22. Successive minimizations of f(x) using the conjugate gradient method
One of the advantages of the Conjugate Gradients method is its rapid rate of convergence: in a quadratic energy landscape, each iteration should converge one degree of freedom. It also has relatively low storage requirements.
The method has some disadvantages too: it is more complex to code than the steepest descent algorithm, and no explicit knowledge of the Hessian is generated.
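The Hessian-free Fletcher-Reeves scheme above can be sketched in Python (illustrative; a golden-section search stands in for whatever one-dimensional minimizer a real package uses, and the bracket [0, 1] is an assumption that holds for this test problem):

```python
def golden(g, a=0.0, b=1.0, tol=1e-12):
    """Golden-section minimization of a one-variable function on [a, b]."""
    phi = (5 ** 0.5 - 1) / 2
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if g(c) < g(d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    return 0.5 * (a + b)

def fletcher_reeves(f, grad, x, iters=10):
    """Fletcher-Reeves CG: line-minimize along h_i, then mix the new gradient
    with the old direction using gamma = (g_{i+1}.g_{i+1}) / (g_i.g_i)."""
    g = [-gi for gi in grad(x)]
    h = g[:]
    for _ in range(iters):
        lam = golden(lambda t: f([xi + t * hi for xi, hi in zip(x, h)]))
        x = [xi + lam * hi for xi, hi in zip(x, h)]
        g_new = [-gi for gi in grad(x)]
        gamma = sum(a * a for a in g_new) / max(sum(a * a for a in g), 1e-30)
        h = [gn + gamma * hi for gn, hi in zip(g_new, h)]
        g = g_new
    return x

f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
grad = lambda x: [2.0 * x[0], 20.0 * x[1]]
print([round(v, 6) for v in fletcher_reeves(f, grad, [5.0, 1.0], iters=4)])
```

On this two-dimensional quadratic, two exact line minimizations already reach the minimum, in line with the N-steps-for-N-dimensions property stated above; compare the hundreds of steepest-descent iterations the same valley required.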
9.5. The BFGS algorithm for unconstrained optimization
In 1970, an alternative inverse Hessian update formula was suggested independently by Broyden, Fletcher, Goldfarb and Shanno. Their formula originated a new quasi-Newton method.
This algorithm can be summarized as follows:
1. Set k := 0, select x(0) and a real positive definite matrix B0.
2. If g(k) = 0, stop. Else dk = −Bk g(k).
3. Compute a step length αk by line minimization along dk and set x(k+1) := x(k) + αk dk.
4. Update the inverse Hessian approximation Bk+1, set k := k + 1 and go to step 2.
The inverse Hessian approximation is updated as follows:

Bk+1 = Bk + [1 + (yᵀ Bk y)/(sᵀ y)] (s sᵀ)/(sᵀ y) − (s yᵀ Bk + Bk y sᵀ)/(sᵀ y)

where s = x(k+1) − x(k) and y = g(k+1) − g(k).
There exists an (older) Davidon-Fletcher-Powell (DFP) variant, which is mathematically equivalent to BFGS but less tolerant of round-off error and inexact line minimization, and which calculates an approximation A to the Hessian H rather than H itself.
The convergence rate of the BFGS method is similar to (or better than) that of the Conjugate Gradient method, and extra physical information is generated from the approximate Hessian, but it is still a local minimization method.
9.6. Testing Minima
Compute the full Hessian (the partial Hessian from an optimization is not accurate enough).
Check the number of negative eigenvalues:
0 required for a minimum.
1 (and only 1) for a transition state
For a minimum, if there are any negative eigenvalues, follow the associated eigenvector to a
lower energy structure.
For a transition state, if there are no negative eigenvalues, follow the lowest eigenvector uphill.
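For two degrees of freedom the eigenvalue test can be sketched directly (an illustrative Python toy; real packages diagonalize the full 3N x 3N mass-weighted Hessian):

```python
import math

def hessian_eigenvalues_2x2(h11, h12, h22):
    """Eigenvalues of a symmetric 2x2 Hessian [[h11, h12], [h12, h22]]."""
    mean = 0.5 * (h11 + h22)
    disc = math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 ** 2)
    return mean - disc, mean + disc

def classify(eigenvalues):
    """0 negative eigenvalues -> minimum; exactly 1 -> transition state."""
    neg = sum(1 for e in eigenvalues if e < 0)
    return {0: "minimum", 1: "transition state"}.get(neg, "higher-order saddle")

print(classify(hessian_eigenvalues_2x2(2.0, 0.0, 5.0)))   # minimum
print(classify(hessian_eigenvalues_2x2(2.0, 0.0, -1.0)))  # transition state
```

For a molecule, six of the 3N eigenvalues (five for linear molecules) are zero, corresponding to overall translation and rotation, and must be projected out before counting negative eigenvalues.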
9.7. Minimization and Molecular Mechanics
The use of a force field to define structure is often called molecular mechanics.
Use the force field that has been assigned to the atoms in the system.
Find a stable point or a minimum on the potential energy surface in order to begin dynamics.
There will be more than one minimum for a polymer, biopolymer, or a liquid.
There may be a global minimum, but this will not likely be found without a conformational
search.
Molecular dynamics provides information that is complementary to minimization.
Three typical stages: Minimization, Equilibration, Dynamics (The production run)
10. Molecular dynamics simulations
The molecular dynamics technique enables calculation of thermodynamic properties of
molecules (energy, heat capacity) and it provides dynamic information (diffusion coefficient,
dielectric functions, correlated motion). MD makes it possible to study the dynamics of large macromolecules, including biological systems such as proteins, nucleic acids (DNA, RNA) and membranes. Dynamical
events may play a key role in controlling processes which affect functional properties of the
biomolecule. Beyond this “traditional” use, MD is nowadays also used for other purposes, such as
studies of non-equilibrium processes, and as an efficient tool for optimization of structures
overcoming local energy minima (simulated annealing).
In molecular mechanics, the set of many molecules put together is called a system. It typically includes a macromolecule, sometimes accompanied by a small ligand, water and ions; it may also include phospholipids or sugars.
The idea of MD is a simple one: calculate the forces acting on the atoms in a molecular
system and analyze their motion. When enough information on the motion of the individual atoms
has been gathered, it is possible to condense it all using the methods of statistical mechanics to
deduce the bulk properties of the material. These properties include the structure (e.g. crystal
structure, predicted x-ray and neutron diffraction patterns), thermodynamics (e.g. enthalpy,
temperature, pressure) and transport properties (e.g. thermal conductivity, viscosity, diffusion). In
addition molecular dynamics can be used to investigate the detailed atomistic mechanisms
underlying these properties and compare them with theory. It is a valuable bridge between
experiment and theory.
10.1. Equilibration
Equilibration is a protocol for bringing the system to equilibrium at the desired temperature
for the simulation. The protocol consists of assigning velocities and then performing molecular
dynamics until the equilibrium has been reached. Every time the state of the system changes, the
system will be “out of equilibrium” for a while, and it is certainly so at the beginning of the computer
simulation. We are referring here to thermodynamic equilibrium.
Once the system is in equilibrium the current velocities are used for production dynamics.
The production run is the phase of the simulation where properties of the system can be
determined.
10.2. Velocities in MD
The trajectory in a MD simulation consists of both positions and velocities. The velocities are
assigned (created) based on a coordinate file with the atoms in the optimal (minimized) positions.
The initial velocities are assigned by drawing them from a Maxwell distribution at a given temperature T.
Initial randomization of velocities is usually the only place where chance enters a molecular
dynamics simulation. The subsequent time evolution is completely deterministic.
The mean-square velocity is related to the temperature through the equipartition theorem: ⟨(1/2) mi vi²⟩ = (3/2) kB T.
Figure 23. The Maxwell-Boltzmann velocity distribution in the kinetic theory
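Velocity assignment can be sketched in Python: each Cartesian component is drawn from a Gaussian with variance kB T/m, and the instantaneous temperature is recovered from the equipartition theorem. The function names are illustrative, and AMBER-like internal units (masses in amu, energies in kcal/mol, kB ≈ 0.0019872 kcal mol⁻¹ K⁻¹) are assumed:

```python
import math, random

K_B = 0.0019872  # Boltzmann constant, kcal/(mol*K)

def assign_velocities(masses, temperature, seed=1):
    """Draw each velocity component from a Gaussian with variance kB*T/m,
    i.e. from the Maxwell-Boltzmann distribution (internal velocity units)."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(0.0, math.sqrt(K_B * temperature / m))
                  for _ in range(3)) for m in masses]

def instantaneous_temperature(masses, velocities):
    """T = sum(m*v^2) / (3*N*kB), from the equipartition theorem."""
    mv2 = sum(m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    return mv2 / (3.0 * len(masses) * K_B)

masses = [12.011] * 1000  # 1000 carbon-like atoms, amu
v = assign_velocities(masses, 300.0)
print(round(instantaneous_temperature(masses, v), 1))  # close to 300 K
```

The recovered temperature fluctuates around the target by roughly sqrt(2/3N) in relative terms, which is why equilibration (and, in many protocols, velocity rescaling) follows the initial assignment.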
10.3. Dynamics: Equations of Motion
Molecular dynamics requires a technique for the solution of the equations of motion for
atomic systems. If we consider a system of atoms with Cartesian coordinates ri, then Newton's equation of motion becomes:

mi ai = Fi

where mi is the mass of atom i, ai = d²ri/dt² is its acceleration and Fi is the force on that atom.
10.3.1. Numerical Solution of the Equations of Motion
To simulate molecular motion we need a means of solving the equations of motion for a
system of many particles. The coupled differential equations of motion for the various masses in a force field can be solved using finite difference methods. The equations are solved step-by-step at discrete time intervals Δt. Finite difference methods use the calculated velocities (vi = dri/dt) to produce a new set of positions; the new positions are then used to reevaluate the velocities via the equations of motion. This procedure is repeated for each step of the simulation.
There are several different techniques for propagating the motion of the particles in a
simulation: I. Verlet algorithm (A. basic, B. leapfrog, C. velocity Verlet), II. Gear predictor-corrector.
10.3.2. The symplectic integration of equations of motion
In molecular mechanics a great number of phenomena are modeled by ordinary differential
equations (equations of motion). When solved, analytically or numerically, they describe the time
evolution of the quantities used to model the phenomena. Among these systems there are those
called conservative or Hamiltonian. We have to use numerical procedures to solve the equations.
Numerical procedures reduce the differential equations to finite difference equations through
algorithms which are now standard. Stability of these algorithms is a research area on its own. There
are two classes of integration algorithms. The first are symplectic algorithms: they are time-reversible and conserve phase-space volume, and both properties are highly desirable. The second class, non-symplectic algorithms, does not preserve the time-reversibility of Newton's equations of motion and is unstable due to strong energy drift. Non-symplectic algorithms also require a very small time step to „force“ stability, although nothing can guarantee stability in long simulations.
Symplecticity is of fundamental importance and it was discussed in many papers (Mitsutake A, Sugita
Y, Okamoto Y., “Generalized-ensemble algorithms for molecular simulations of biopolymers.”,
Biopolymers. 2001;60(2):96-123.; Feig M, Brooks CL 3rd., “Recent advances in the development and
application of implicit solvent models in biomolecule simulations.”, Curr Opin Struct Biol. 2004
Apr;14(2):217-24.; Kamberaj H, Low RJ, Neal MP, “Time reversible and symplectic integrators for
molecular dynamics simulations of rigid molecules.”, J Chem Phys. 2005 Jun 8;122(22):224114.;
Okumura H, Itoh SG, Okamoto Y “Explicit symplectic integrators of molecular dynamics
algorithms for rigid-body molecules in the canonical, isobaric-isothermal, and related ensembles.“, J
Chem Phys. 2007 Feb 28;126(8):084103.; Sugita Y. “Free-energy landscapes of proteins in solution by
generalized-ensemble simulations.“, Front Biosci. 2009 Jan 1;14:1292-303.). Among the algorithms discussed in the next paragraphs, the Verlet algorithms are symplectic, while the Gear predictor-corrector algorithm is not.
10.3.3. The Verlet Algorithm
The Verlet method is a direct solution of the second order differential equations. In the
Verlet method the velocities are eliminated by comparing two Taylor expansions about the position
at time t.
The Taylor series expansions about +Δt and −Δt are summed to give the expression:

r(t + Δt) = r(t) + Δt v(t) + (1/2)Δt² a(t) + …
r(t − Δt) = r(t) − Δt v(t) + (1/2)Δt² a(t) − …
r(t + Δt) = 2r(t) − r(t − Δt) + Δt² a(t) + …

This equation is correct except for errors of the order of Δt⁴. The computed velocity (used to estimate the kinetic energy) is subject to errors of the order of Δt². In this method the velocity is computed on the fly as v(t) = [r(t + Δt) − r(t − Δt)]/(2Δt).
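The recursion can be sketched in Python for a one-dimensional harmonic oscillator (an illustrative toy; the Taylor start-up step for r(t − Δt) and the parameter values are assumptions):

```python
def verlet(accel, r0, v0, dt, steps):
    """Basic Verlet: r(t+dt) = 2 r(t) - r(t-dt) + dt^2 a(t)."""
    # start-up: backward Taylor step gives the needed r(-dt)
    r_prev = r0 - dt * v0 + 0.5 * dt * dt * accel(r0)
    r = r0
    traj = [r0]
    for _ in range(steps):
        r_next = 2.0 * r - r_prev + dt * dt * accel(r)
        r_prev, r = r, r_next
        traj.append(r)
    return traj

# Harmonic oscillator a = -omega^2 r with omega = 1: the period is 2*pi,
# so after 628 steps of dt = 0.01 (~one period) the particle is back near r = 1.
traj = verlet(lambda r: -r, 1.0, 0.0, 0.01, 628)
print(round(traj[-1], 3))
```

Note that the velocities never appear in the propagation itself; they are only reconstructed afterwards from neighboring positions, exactly as stated above.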
10.3.4. Leapfrog Verlet
The Verlet algorithm may introduce numerical imprecision, since numbers of the order of Δt² are added to numbers of the order of Δt⁰ (≈ 1). For this reason the leapfrog Verlet method is used:

r(t + Δt) = r(t) + Δt v(t + (1/2)Δt)
v(t + (1/2)Δt) = v(t − (1/2)Δt) + Δt a(t)

The velocity equation is executed first and generates a new mid-step velocity. This velocity is then used to calculate the new position. The on-step velocity is calculated from

v(t) = (1/2)[v(t + (1/2)Δt) + v(t − (1/2)Δt)]

This leapfrog method also has the advantage that temperature scaling by velocity scaling is feasible.
10.3.5. Velocity Verlet
The handling of the velocity (and therefore the calculation of the kinetic energy) is NOT
„ideal” in either of the above forms of the Verlet algorithm. The velocity Verlet algorithm stores
positions, velocities, and accelerations:

r(t + Δt) = r(t) + Δt v(t) + (1/2)Δt² a(t)
v(t + Δt) = v(t) + (1/2)Δt [a(t) + a(t + Δt)]

The above velocity Verlet approach can be shown to be equivalent to the basic Verlet algorithm by eliminating the velocities.
The equations are implemented in two stages. First, the new positions at time t + Δt are calculated. Then the velocities at mid-step are calculated using

v(t + (1/2)Δt) = v(t) + (1/2)Δt a(t)

The forces and accelerations at time t + Δt are calculated, and then the new velocity is obtained:

v(t + Δt) = v(t + (1/2)Δt) + (1/2)Δt a(t + Δt)
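The two-stage scheme can be sketched in Python; the harmonic-oscillator test and the energy check are illustrative assumptions, but the half-kick/drift/half-kick structure is exactly the one described above:

```python
def velocity_verlet(accel, r, v, dt, steps):
    """Velocity Verlet: half-kick, drift, recompute forces, half-kick."""
    a = accel(r)
    for _ in range(steps):
        v_half = v + 0.5 * dt * a   # v(t + dt/2) = v(t) + dt/2 a(t)
        r = r + dt * v_half         # r(t + dt) = r(t) + dt v(t + dt/2)
        a = accel(r)                # a(t + dt) from the new position
        v = v_half + 0.5 * dt * a   # v(t + dt)
    return r, v

# Harmonic oscillator with omega = 1 and initial energy 0.5:
accel = lambda r: -r
r, v = velocity_verlet(accel, 1.0, 0.0, 0.01, 10000)
print(round(0.5 * v * v + 0.5 * r * r, 6))  # total energy stays ~0.5
```

Even after 10000 steps the total energy stays within a tiny, bounded band around its initial value rather than drifting, which is the practical payoff of a symplectic integrator.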
10.3.6. Gear Predictor-Corrector Method
The predictor
If the classical trajectory is continuous, then an estimate of the positions, velocities, accelerations
etc. may be obtained by a Taylor series expansion about time t:

rp(t + Δt) = r(t) + Δt v(t) + (1/2)Δt² a(t) + (1/6)Δt³ b(t) + …
vp(t + Δt) = v(t) + Δt a(t) + (1/2)Δt² b(t) + …
ap(t + Δt) = a(t) + Δt b(t) + …
bp(t + Δt) = b(t) + …

The p superscript refers to predicted values. The variables are:
r = position, v = velocity (dr/dt), a = acceleration (d²r/dt²), b = third derivative of position with respect to time.
The corrector
The equations of motion are introduced by calculating the acceleration, a due to the force, F.
The force is calculated from the potential function V(rp) at the new positions, rp so that the correct
acceleration is:
ac = F/m = (-grad V(rp))/m.
The predicted positions and velocities must be corrected. The correction term is proportional to the difference between the predicted and correct accelerations,

Δa(t + Δt) = ac(t + Δt) − ap(t + Δt)

The corrector step is:

rc(t + Δt) = rp(t + Δt) + c0 Δa(t + Δt)
vc(t + Δt) = vp(t + Δt) + c1 Δa(t + Δt)
ac(t + Δt) = ap(t + Δt) + c2 Δa(t + Δt)
bc(t + Δt) = bp(t + Δt) + c3 Δa(t + Δt)
10.4. The Time Step
The choice of the time step Δt is of critical importance to the success of the method. The time
step must be short in relation to the time it takes a particle to travel its own length; as a rule of
thumb, it should be about 10 times shorter than the period of the highest-frequency vibration in the
simulation. On the other hand, for a given number of steps the configuration space sampled during
the simulation is greater if the time step is longer, so in the interest of computational efficiency it is
desirable to make the time step as long as possible.
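As a rough numerical illustration of this rule of thumb (the 3000 cm⁻¹ C-H stretching frequency below is a typical assumed value, not one given in the text):

```python
# Rule-of-thumb time step from the fastest vibration in a biomolecule:
# a C-H stretch at roughly 3000 cm^-1 (an assumed, typical value).
c_cm_per_s = 2.998e10                        # speed of light in cm/s
wavenumber = 3000.0                          # cm^-1
period_s = 1.0 / (c_cm_per_s * wavenumber)   # vibrational period, seconds
dt_s = period_s / 10.0                       # about 10 steps per period
period_fs = period_s * 1e15                  # ~11 fs
dt_fs = dt_s * 1e15                          # ~1.1 fs
```

This is why the canonical MD time step for all-atom biomolecular simulations is about 1 fs.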
Figure 24. Time Scales of Protein Motions and MD
Examples of possible applications and the time scales of protein motions accessible to MD are
depicted in Brooks, Karplus & Pettitt, "Proteins", Wiley, 1988. The time scales needed for all-atom
simulations of protein folding are out of reach of contemporary computers; it is still difficult to
simulate the whole folding process of a protein using the conventional MD method.
10.5. The need for faster computers
Compared with other applications in today's computational (bio)chemistry, MD simulations
using classical potentials are less demanding than electronic structure calculations. On a parallel
computer a single job is divided into several smaller ones that are calculated on multiple CPUs
simultaneously. Today, almost all MD programs for biomolecular simulations (such as AMBER,
CHARMm, GROMOS, and NAMD) can run on parallel computers.
10.6. Phase Space
Phase Space is a concept common for theory (molecular mechanics, statistical mechanics)
and experiment (thermodynamics). Computer simulations generate information at the microscopic
level (atomic and molecular positions and velocities) and statistical mechanics converts this
information into macroscopic terms (for example: pressure and internal energy). The positions and
momenta of the particles can be thought of as coordinates in a multidimensional space: phase space.
For a system of N atoms this space has 6N dimensions (3N positions and 3N momenta). A single point
in this space, conventionally denoted Γ, represents one particular microscopic state of the system.
(Axis and labels from Figure 24: time scales from 10^-15 s (fs) to 10^-3 s (ms), spanning bond stretching, α-helix folding, β-hairpin folding, and full protein folding.)
10.7. Calculation of Average Properties
We can represent the instantaneous value of some property as A(t). The experimentally
observable macroscopic property Aobs is given by a time average:
Aobs = ⟨A⟩ = lim(τ→∞) (1/τ) ∫[0,τ] A(t) dt
The equations governing the time evolution are none other than Newton's equations of motion. In a
molecular dynamics simulation the solution is not obtained continuously but in discrete time steps
Δt, so the average becomes a sum over M steps:
Aobs ≈ (1/M) Σi A(ti)
The ensemble is a central concept in statistical mechanics. Imagine that a given molecular
system is replicated many times over, so that we have an enormous number of copies, each
possessing the same physical characteristics of temperature, density, number of atoms and so on.
Since we are interested in the macroscopic properties of the system, it is not necessary for these
replicas to have exactly the same atomic positions and velocities. In other words the replicas are
allowed to differ microscopically, while retaining the same general properties. Such a collection of
replicated systems is called an ensemble.
Because of the way the ensemble is constructed, if a snapshot of all the replicas is taken at
the same instant, we will find that they differ in the instantaneous values of their bulk properties.
This phenomenon is called fluctuation. Thus the true value of any particular bulk property must be
calculated as an average over all the replicas. This is what is meant by an ensemble average, and the
instantaneous values are said to fluctuate about the mean value.
Molecular dynamics proceeds by a numerical integration of the equations of motion. Each
time step generates a new arrangement of the atoms (called a configuration) and new instantaneous
values for bulk properties such as temperature, pressure, configuration energy etc. To determine the
true or thermodynamic values of these variables requires an ensemble average. In molecular
dynamics this is achieved by performing the average over successive configurations generated by the
simulation. In doing this we are making an implicit assumption that an ensemble average (which
relates to many replicas of the system) is the same as an average over time of one replica (the
system we are simulating). This assumption is known as the Ergodic Hypothesis. Fortunately it seems
to be generally true, provided a long enough time is taken in the average. However it has not yet
been rigorously proved mathematically.
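A minimal numerical illustration of replacing an ensemble average by a time average over successive configurations (the oscillator trajectory below is an assumed toy example, not a real simulation):

```python
import numpy as np

# Time average of an instantaneous property A(t) over discrete steps:
#     A_obs ~ (1/M) * sum_i A(t_i)
# Toy trajectory: x(t) = cos(t); the long-time average of A = x**2 is 1/2.
dt = 0.01
t = np.arange(0.0, 1000.0, dt)     # discrete "time steps" of the run
A_inst = np.cos(t) ** 2            # instantaneous values A(t_i)
A_avg = A_inst.mean()              # time average, converges to 0.5
```

The longer the run, the closer the time average approaches the ensemble value, which is exactly the content of the ergodic hypothesis.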
Examples of thermodynamic properties that can be calculated from computer simulations as
ensemble averages include:
Temperature;
Pressure;
Density;
Configuration energy;
Enthalpy;
Structural correlations;
Time correlations;
Elastic properties.
10.8. Fluctuations
Most of the properties that we calculate for a molecular system are averages. Well known
properties like temperature, pressure and density are calculated as ensemble averages, and in the
real world they are treated as fixed, measurable quantities, which they generally appear to be.
However all averages are obtained by summing over many numbers, and it would be very unusual
(even pointless) if all the individual numbers summed had exactly the same value. Thus in practice
we expect the average to show some dispersion - individual contributions are scattered about the
mean value. In statistical thermodynamics this dispersion about the average value is known as
fluctuation and it is both a subtle and important property of all physical systems.
When calculating an ensemble average (of say, pressure at a fixed temperature and density),
we take an instantaneous snapshot of a very large set of replicas of the system concerned and
compute the average from the sum of the individual values taken from each replica. Even though
each replica represents the same system at the same pressure, their individual, instantaneous values
differ slightly, because the molecules that bombard the vessel surfaces to create the pressure are not
in synchronization between each replica and cannot possibly give rise to precisely the same surface
forces at the same instant. Thus, with pressure, we expect some fluctuation about the mean value
and indeed, similar arguments can be made for all the bulk properties of the system.
Fluctuations are of fundamental importance in statistical mechanics because they provide
the means by which many physical processes in a molecular system can take place. For instance, the
density of a liquid at equilibrium is a fixed, uniform quantity and we feel justified in considering the
system to be isotropic - the same at all points within its bulk. Yet we know that the molecules in the
system are undergoing diffusion and can easily travel throughout the bulk of the liquid. It is difficult
to imagine how this diffusion can take place if the environment each molecule is in is completely
isotropic. If however we consider the density to be fluctuating minutely from the mean value at
different points in the bulk, we can readily see that such fluctuations would provide a means by
which the diffusion may take place. It is a surprising fact, but most of the physical properties of a bulk
system are driven by fluctuations, and indeed can be calculated directly from them. For this reason it
is possible to view fluctuations as even more fundamental than the average value.
A good example of the importance of fluctuation is provided by the Fluctuation-Dissipation
theorem, which is a theorem of great power in statistical mechanics. This theorem proposes that the
mechanism underpinning the response of a system to an external perturbation, is precisely the same
mechanism by which equilibrium fluctuations are held close to the average bulk value. Thus for
example, a molecule vibrationally excited by an infrared photon, will lose (i.e. dissipate) that energy
to the rest of the system by the same mechanism by which normal vibrational energies are
exchanged (i.e. fluctuate) between molecules at equilibrium. This insight is the basis of a theoretical
description of solution spectroscopy.
Although the fluctuations are extraordinarily small for large systems we must confront the
fact that any real simulation has a limited number of atoms and is carried out for a relatively small
number of steps compared to the systems considered in statistical mechanics. The mean-square
fluctuation of the energy is:
⟨δE²⟩ = ⟨E²⟩ − ⟨E⟩²
This result can be expressed in terms of familiar thermodynamic quantities: ⟨δE²⟩ = kT²Cv. For an
ideal gas there is no potential energy contribution, so the energy is ⟨E⟩ = (3/2)NkT, yielding the
ideal-gas specific heat Cv = (3/2)Nk. In general the relative fluctuation decreases with the size of the
system, since
⟨δE²⟩^(1/2) / ⟨E⟩ ∝ 1/√N
In practice, we can increase the number of sampled energy configurations by averaging over a longer
time. Using fluctuation theory we obtain a mathematical expression for the heat capacity:
Cv = (⟨E²⟩ − ⟨E⟩²) / (kT²)
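The fluctuation formula for the heat capacity can be checked numerically. The sketch below samples ideal-gas velocities from the Maxwell-Boltzmann distribution in reduced units; it is a toy consistency check under assumed parameters, not a real MD run:

```python
import numpy as np

# Heat capacity from energy fluctuations, Cv = (<E^2> - <E>^2) / (k T^2),
# checked against the ideal-gas value Cv = (3/2) N k.
# Reduced units: m = k = T = 1.
rng = np.random.default_rng(0)
N, T, k, m = 100, 1.0, 1.0, 1.0
n_samples = 20000
# each Cartesian velocity component is Gaussian with variance kT/m
v = rng.normal(0.0, np.sqrt(k * T / m), size=(n_samples, N, 3))
E = 0.5 * m * (v ** 2).sum(axis=(1, 2))   # kinetic energy of each sample
Cv = E.var() / (k * T ** 2)               # fluctuation formula
Cv_exact = 1.5 * N * k                    # ideal-gas result, (3/2) N k
```

With 20,000 samples the fluctuation estimate agrees with the exact value to about one percent.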
10.9. Other ways of experimental verification of results of molecular
mechanics
The potential energy of the molecule calculated from a well-designed empirical force field
represents a strain in the molecule. Augmented with bond/group equivalents and statistical
mechanical corrections, it can be used to estimate the heat of the formation of a compound (which
can be directly compared with the experimental value). This quantity can be used also to compare
the relative stability of different compounds. Unfortunately, in most cases, the calculated potential
energy incorporates some arbitrary component which depends upon the types of atoms and
covalent bonds in the molecule, therefore comparison of the energies calculated for different
molecules cannot be rigorous. For this reason, potential energy will, in most cases, reliably evaluate
the difference in energy between conformers of the same molecule, but will fail if one attempts
to calculate the change in energy after adding a new fragment to the molecule. Molecular
mechanics can also provide the interaction energy, Eint, of two molecules A and B as:
Eint = EAB − EA − EB
where EAB, EA, and EB are the potential energies of the optimized complex, the optimized molecule A,
and the optimized molecule B, respectively. Note that the type and number of atoms and covalent
bonds in the complex AB equal their sum in the isolated molecules A and B, so the arbitrary
"energy zero" cancels out in this case. For this reason, comparing the interaction energies calculated
for different complexes (differences of the form Eint(1) − Eint(2)) is preferred over direct comparison
of the total energies of the complexes (EAB(1) − EAB(2)).
Potential energy functions can also be used to estimate contributions from intramolecular
vibrations to the so-called vibrational free energy and vibrational entropy. These quantities, and
contributions from translation and rotation of the molecule as a whole, vary with temperature and
are the main contributors to the thermodynamic functions such as enthalpy, free energy, specific
heat. One approach is to use the frequencies, νi, corresponding to normal modes within the harmonic
approximation, that is, to calculate them from the mass-weighted Hessian matrix at an energy
minimum. The expressions relating the classical vibrational contributions to the Helmholtz free
energy Fvib, internal energy Evib, heat capacity at constant volume Cv,vib, and entropy Svib of a
nonlinear molecule are derived in many standard statistical mechanics textbooks; for the 3N − 6
normal modes they read (in the classical limit):
Fvib = RT Σi ln(hνi / kT)
Evib = (3N − 6)RT
Cv,vib = (3N − 6)R
Svib = (Evib − Fvib) / T
where R, T, h, and k are the gas constant, the absolute temperature, Planck's constant, and
Boltzmann's constant, respectively, and N denotes the number of atoms in the molecule. Frequently
these values are augmented with a correction for vibrations at T = 0 K, which is of quantum origin,
by adding the zero-point energy E0 = (1/2) Σi hνi (per mole, (NA/2) Σi hνi) to the free energy Fvib
and the internal energy Evib.
The harmonic approximation is quite accurate for isolated molecules. For complexes of two or more
molecules or systems containing water, the harmonic approximation breaks down. In this case,
molecular dynamics or Monte Carlo approaches are more reliable for estimation of thermodynamic
functions.
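The classical harmonic expressions can be evaluated directly once the normal-mode frequencies are known. In the sketch below the three wavenumbers are assumed, purely for illustration:

```python
import math

# Classical harmonic vibrational thermodynamics for a set of normal-mode
# frequencies (for a nonlinear molecule there would be 3N - 6 of them;
# the three wavenumbers below are assumed, purely illustrative).
R = 8.314462            # gas constant, J/(mol K)
kB = 1.380649e-23       # Boltzmann constant, J/K
h = 6.62607e-34         # Planck constant, J s
NA = 6.02214e23         # Avogadro constant, 1/mol
c = 2.998e10            # speed of light, cm/s
T = 298.15              # absolute temperature, K

freqs_cm = [500.0, 1200.0, 3000.0]       # assumed normal-mode wavenumbers
nu = [w * c for w in freqs_cm]           # frequencies nu_i in Hz

F_vib = R * T * sum(math.log(h * n / (kB * T)) for n in nu)  # RT sum ln(h nu / kT)
E_vib = len(nu) * R * T                  # one RT per classical mode
Cv_vib = len(nu) * R                     # one R per classical mode
S_vib = (E_vib - F_vib) / T              # S = (E - F)/T
E0 = 0.5 * NA * sum(h * n for n in nu)   # zero-point correction, J/mol
```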
10.10. The Pressure
Fluctuations in the pressure are related to the isothermal compressibility, which is very small
for a dense fluid. For this reason calculation of the isothermal compressibility by the method of
fluctuations is a challenging task. In general, pressure is difficult to calculate accurately. The
agreement between the MD and Monte-Carlo (MC) methods is poor compared to the energy and the
statistics are significantly worse than for the energy.
10.11. The radial distribution function
The radial distribution function is an example of a pair correlation function, which describes
how, on average, the atoms in a system are radially packed around each other. This proves to be a
particularly effective way of describing the average structure of disordered molecular systems such
as liquids. Also in systems like liquids, where there is continual movement of the atoms and a single
snapshot of the system shows only the instantaneous disorder, it is extremely useful to be able to
deal with the average structure.
The radial distribution function is useful in other ways. For example, it is something that can
be deduced experimentally from x-ray or neutron diffraction studies, thus providing a direct
comparison between experiment and simulation. It can also be used in conjunction with the
interatomic pair potential function to calculate the internal energy of the system, usually quite
accurately.
Figure 25. Construction of a radial distribution function
To construct a radial distribution function is simple. Choose an atom in the system and draw
around it a series of concentric spheres, set a small fixed distance Δr apart (see figure above). At
regular intervals a snapshot of the system is taken and the number of atoms found in each shell is
counted and stored. At the end of the simulation, the average number of atoms in each shell is
calculated. This is then divided by the volume of each shell and the average density of atoms in the
system. The result is the radial distribution function. Mathematically, the formula is:
g(r) = n(r) / (ρ 4π r² Δr)
in which g(r) is the radial distribution function, n(r) is the mean number of atoms in a shell of width
Δr at distance r, and ρ is the mean atom density. The method need not be restricted to one atom. All the
atoms in the system can be treated this way, leading to an improved determination of the radial
distribution function as an average over many atoms.
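The shell-counting procedure above translates into a short histogram calculation. The sketch below (with assumed box size, bin width, and ideal-gas test coordinates) uses the minimum-image convention for a cubic periodic box; for uncorrelated positions g(r) should come out close to 1 everywhere:

```python
import numpy as np

def radial_distribution(coords, box, dr, r_max):
    """Histogram-based g(r) for one configuration in a cubic periodic box."""
    n = len(coords)
    rho = n / box ** 3                    # mean number density
    nbins = int(r_max / dr)
    hist = np.zeros(nbins)
    for i in range(n - 1):
        d = coords[i + 1:] - coords[i]
        d -= box * np.round(d / box)      # minimum-image convention
        rij = np.sqrt((d ** 2).sum(axis=1))
        rij = rij[rij < r_max]
        np.add.at(hist, (rij / dr).astype(int), 2.0)  # each pair counted twice
    r_mid = (np.arange(nbins) + 0.5) * dr
    shell_vol = 4.0 * np.pi * r_mid ** 2 * dr
    return r_mid, hist / (n * rho * shell_vol)    # normalize: n(r)/(rho 4 pi r^2 dr)

# Ideal-gas test: uncorrelated random positions should give g(r) close to 1.
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 10.0, size=(2000, 3))
r, g = radial_distribution(coords, box=10.0, dr=0.1, r_max=4.0)
```

In a real simulation this histogram would be accumulated over many snapshots and the final g(r) averaged, exactly as the text describes.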
The radial distribution function is usually plotted as a function of the interatomic separation
r. A typical radial distribution function plot (below) shows a number of important features. Firstly, at
short separations (small r) the radial distribution function is zero. This indicates the effective width of
the atoms, since they cannot approach any more closely. Secondly, a number of obvious peaks
appear, which indicate that the atoms pack around each other in “shells” of neighbors. The
occurrence of peaks at long range indicates a high degree of ordering. Usually, at high temperature
the peaks are broad, indicating thermal motion, while at low temperature they are sharp. They are
particularly sharp in crystalline materials, where atoms are strongly confined in their positions. At
very long range every radial distribution function tends to a value of 1, which happens because the
radial distribution function describes the average density at this range.
Figure 26. Both MD and MC give similar forms for the radial distribution function g(r)
10.12. Calculation of Dynamic Properties from Molecular Dynamics
Simulations
In the paper by Jianshu Cao and Gregory A. Voth (J. Chem. Phys. 103(10), 8 September 1995)
a theory for time correlation functions in liquids is developed. It is based on the optimized quadratic
approximation for liquid state potential energy functions.
10.13. Correlations and the Correlation Time
Correlations between two different quantities X and Y are determined via the correlation
coefficient:
cXY = cov(X, Y) / (σ(X) σ(Y))
where cov(X, Y) denotes the covariance, and σ(X), σ(Y) are the standard deviations. The value of cXY
lies between −1 and 1, with absolute values close to 1 indicating high correlation. If the coefficient cXY
is evaluated at different times, it becomes a time correlation function cXY(t). For identical variables,
X = Y, cXX(t) is called an autocorrelation function and its integral from 0 to ∞ is a correlation time.
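A small numerical example of the correlation coefficient (the two data series are synthetic, constructed so that their true correlation is 0.8):

```python
import numpy as np

# Correlation coefficient c_XY = cov(X, Y) / (sigma(X) * sigma(Y)) for two
# synthetic data series with a known true correlation of 0.8.
rng = np.random.default_rng(2)
X = rng.normal(size=10000)
Y = 0.8 * X + 0.6 * rng.normal(size=10000)   # var(Y) = 0.64 + 0.36 = 1
c_XY = np.cov(X, Y)[0, 1] / (X.std(ddof=1) * Y.std(ddof=1))
```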
10.14. Correlation Functions and Properties
In a simulation the correlation coefficient is evaluated as an average over the trajectory,
cAB(t) = ⟨A(Γ(t)) B(Γ(0))⟩
where Γ represents a point in phase space, that is, a set of positions and momenta at a given time
step of the MD simulation. Time correlation functions are useful in molecular dynamics
simulations because their time integrals can be related to transport coefficients and other properties:
diffusion;
viscosity;
dielectric constant;
thermal conductivity.
The Fourier transform of time correlation functions can be related to experimental spectra.
10.15. The Time Correlation Function
In MD, the system is advanced in discrete time intervals following Newton's equations of
motion. At any time t we can calculate a property A(t). The time correlation function is the average
of the product of the property at time t and at a time t + τ:
C(τ) = ⟨A(t) A(t + τ)⟩
The angle brackets represent statistical averaging. It is defined in statistical mechanics as averaging
over many similar systems (the ensemble). We can use many separate time frames of molecular
dynamics instead of many systems in the ensemble to obtain useful time decays that can be
analyzed.
10.16. Velocity Autocorrelation Function
The velocity autocorrelation function is a prime example of a time dependent correlation
function, and is important because it reveals the underlying nature of the dynamical processes
operating in a molecular system. It is constructed as follows. At a chosen origin in time (i.e. some
moment when we chose to start the calculation) we store all three components of the velocity vi,
where
vi=[vx(t0),vy(t0),vz(t0)]i
for every atom i in the system. We can already calculate the first contribution to the velocity
autocorrelation function, corresponding to time zero (i.e. t = 0). This is simply the average of the
scalar products vi · vi over all N atoms:
Cv(0) = (1/N) Σi vi(t0) · vi(t0)
At the next time step in the simulation, t = t0 + Δt, the corresponding velocity for each atom is
vi = [vx(t0 + Δt), vy(t0 + Δt), vz(t0 + Δt)]i
and we can calculate the next point of the velocity autocorrelation function as
Cv(Δt) = (1/N) Σi vi(t0) · vi(t0 + Δt)
We can repeat this procedure at each subsequent time step and so obtain a sequence of points in
the velocity autocorrelation function:
Cv(nΔt) = (1/N) Σi vi(t0) · vi(t0 + nΔt)
or (for short)
Cv(t) = ⟨v(0) · v(t)⟩.
Though this can be continued forever, we generally stop after a fixed value of n, and start
again to calculate another velocity autocorrelation function, beginning at a new time origin. The final
velocity autocorrelation function can then be taken as the average of all the velocity autocorrelation
functions we have calculated in the course of the simulation. What could such a function tell us about
the molecular system?
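The construction above amounts to averaging scalar products over atoms and time origins. A minimal sketch, using an assumed toy trajectory of non-interacting atoms whose velocities never change, so the result should be the horizontal line discussed next:

```python
import numpy as np

def velocity_autocorrelation(v, n_max):
    """VACF from a stored velocity trajectory.

    v     : array of shape (n_steps, n_atoms, 3)
    n_max : number of time lags to evaluate
    Each lag n is averaged over all atoms and all available time origins.
    """
    n_steps = v.shape[0]
    cv = np.empty(n_max)
    for n in range(n_max):
        # scalar products v_i(t0) . v_i(t0 + n*dt), averaged over i and t0
        dots = (v[: n_steps - n] * v[n:]).sum(axis=2)
        cv[n] = dots.mean()
    return cv

# Toy trajectory: non-interacting atoms keep their velocities forever.
rng = np.random.default_rng(3)
v0 = rng.normal(size=(1, 50, 3))
v_traj = np.repeat(v0, 200, axis=0)   # 200 steps of constant velocities
cv = velocity_autocorrelation(v_traj, n_max=100)
```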
Consider a single atom at time zero. At that instant the atom (i) will have a specific velocity vi.
If the atoms in the system did not interact with each other, Newton's laws of motion tell us that
the atom would retain this velocity for all time. This of course means that all our points Cv(t) would
have the same value, and if all the atoms behaved like this, the plot would be a horizontal line. It
follows that a velocity autocorrelation function plot that is almost horizontal, implies very weak
forces are acting in the system.
On the other hand, what happens to the velocity if the forces are small but not negligible?
Then we would expect both its magnitude and direction to change gradually under the influence of
these weak forces. In this case we expect the scalar product of vi(t0) with vi(t0 + nΔt) to decrease
on average, as the velocity changes. (In statistical mechanics we say that the velocity decorrelates
with time, which is the same as saying the atom 'forgets' what its initial velocity was.) In such a
system, the velocity autocorrelation function plot is a simple exponential decay, revealing the
presence of weak forces slowly destroying the velocity correlation. Such a result is typical of the
molecules in a gas.
What happens when the interatomic forces are strong? Strong forces are most evident in
high density systems, such as solids and liquids, where atoms are packed closely together. In these
circumstances the atoms tend to seek out locations where there is a near balance between repulsive
forces and attractive forces, since this is where the atoms are most energetically stable. In solids
these locations are extremely stable, and the atoms cannot escape easily from their positions. Their
motion is therefore an oscillation; the atoms vibrate backwards and forwards, reversing their velocity
at the end of each oscillation. If we now calculate the velocity autocorrelation function, we will
obtain a function that oscillates strongly from positive to negative values and back again. The
oscillations will not be of equal magnitude however, but will decay in time, because there are still
disrupting forces acting on the atoms to change their oscillatory motion. So what we see is a function
resembling a damped harmonic motion.
Liquids behave similarly to solids, but now the atoms do not have fixed regular positions. A
diffusive motion is present to destroy rapidly any oscillatory motion. The velocity autocorrelation
function therefore may perhaps show one very damped oscillation (a function with only one
minimum) before decaying to zero. In simple terms this may be considered a collision between two
atoms before they rebound from one another and diffuse away.
As well as revealing the dynamical processes in a system, the velocity autocorrelation
function has other interesting properties. Firstly, it may be Fourier transformed to project out the
underlying frequencies of the molecular processes. This is closely related to the infra-red spectrum
of the system, which is also concerned with vibration on the molecular scale. Secondly, provided the
velocity autocorrelation function decays to zero at long time, the function may be integrated to
calculate the diffusion coefficient D0:
D0 = (1/3) ∫[0,∞] ⟨v(0)·v(t)⟩ dt
This is a special case of a more general relationship between the velocity autocorrelation
function and the mean square displacement. Such relationships, which connect correlation functions
to so-called transport coefficients, are known as the Green-Kubo relations.
10.17. Calculating the diffusion coefficient from the mean-square
displacement
10.17.1. The Mean Square Displacement
Molecules in liquids and gases do not stay in the same place, but move about constantly. It is
in fact essential that they do so, otherwise they would not possess the property of fluidity. The
phenomenon is apparent if you place a drop of ink into water - after a while the color is evenly
distributed through the liquid. It is obvious that the molecules of the ink have moved through the
bulk of the water. This process is called diffusion and it happens quite naturally in fluids at
equilibrium. (The water molecules themselves are also undergoing diffusion, though this is not so
obvious.)
The motion of an individual molecule in a dense fluid does not follow a simple path. As it
travels, the molecule is jostled by collisions with other molecules which prevent it from following a
straight line. If the path is examined in close detail, it will be seen to be a good approximation to a
random walk. Mathematically, a random walk is a series of steps, one after another, where each step
is taken in a completely random direction from the one before. This kind of path was famously
analyzed by Albert Einstein in a study of Brownian motion and he showed that the mean square of
the distance travelled by a particle following a random walk is proportional to the time elapsed. This
relationship can be written as
⟨r²⟩ = 6Dt + C
where ⟨r²⟩ is the mean square distance and t is time; D and C are constants. The constant D is the
most important of these and defines the diffusion rate. It is called the diffusion coefficient.
10.17.2. What is the mean square distance and why is it significant?
Imagine a single particle undertaking a random walk. For simplicity assume this is a walk in
one dimension (along a straight line). Each consecutive step may be either forward or back, we
cannot predict which, though we can say we are equally likely to step forward as to step back. (A
drunk man comes to mind!) From a given starting position, what distance are we likely to travel after
many steps? This can be determined simply by adding together the steps, taking into account the
fact that steps backwards subtract from the total, while steps forward add to the total. Since both
forward and backward steps are equally probable, we come to the surprising conclusion that the
probable distance travelled sums up to zero!
If however, instead of adding the distance of each step we added the square of the distance,
we realize that we will always be adding positive quantities to the total. In this case the sum will be
some positive number, which grows larger with every step. This obviously gives a better idea about
the distance (squared in this case) that a particle moves. If we assume each step happens at regular
time intervals, we can easily see how the square distance grows with time, and Einstein showed that
it grows linearly with time.
In a molecular system a molecule moves in three dimensions, but the same principle applies.
Also, since we have many molecules to consider we can calculate a square displacement for all of
them. The average square distance, taken over all molecules, gives us the mean square
displacement. This is what makes the mean square displacement significant in science: through its
relation to diffusion it is a measurable quantity, one which relates directly to the underlying motion
of the molecules.
In molecular dynamics the mean square displacement is easily calculated by adding the
squares of the distance. Typical results (for a liquid) resemble the following plot.
Figure 27. The linear dependence of the mean square displacement is apparent; if the slope of this plot is taken, the diffusion coefficient D may be readily obtained
At very short times however, the plot is not linear. This is because the path a molecule takes
will be an approximate straight line until it collides with its neighbor. Only when it starts the collision
process will its path start to resemble a random walk. Until it makes that first collision, we may say it
moves with approximately constant velocity, which means the distance it travels is proportional to
time, and its mean square displacement is therefore proportional to the time squared. Thus at very
short time, the mean square displacement resembles a parabola. This is of course a simplification -
the collision between molecules is not like the collision between two pebbles, it is not instantaneous
in space or time, but is `spread out' a little in both. This means that the behavior of the mean square
displacement at short time is sometimes more complicated than this mean square displacement plot
shows.
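A short numerical sketch of extracting D from the slope of the mean square displacement, using an assumed random-walk model for which the exact answer is known (⟨r²⟩ = 3t for unit-variance steps, so D = 0.5):

```python
import numpy as np

# Diffusion coefficient from the slope of the mean square displacement,
# <r^2(t)> = 6 D t, for a toy 3D random walk: unit-variance Gaussian
# steps at unit time intervals, so the exact slope is 3 and D = 0.5.
rng = np.random.default_rng(4)
n_walkers, n_steps = 2000, 500
steps = rng.normal(0.0, 1.0, size=(n_steps, n_walkers, 3))
pos = np.cumsum(steps, axis=0)                 # all walker trajectories
msd = (pos ** 2).sum(axis=2).mean(axis=1)      # average over walkers
t = np.arange(1, n_steps + 1)
slope = np.polyfit(t, msd, 1)[0]               # linear fit: <r^2> = slope * t
D = slope / 6.0                                # Einstein relation in 3D
```

In a real simulation the fit would be made only to the linear (long-time) part of the plot, excluding the short-time ballistic regime discussed above.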
10.17.3. The Mean Squared Displacement and the Velocity
Autocorrelation Function
The mean square displacement and the velocity autocorrelation function seem to be two
very different functions. The mean square displacement is (for the most part) a linear function of
time, while the velocity autocorrelation function displays a complicated dependence on time. But a
little thought will suggest that they must have something in common. Both, in an average sense,
describe the motion of a molecule with time and must therefore be related somehow. The
mathematical relationship is revealing, as the following shows.
We can describe the displacement r(t) of a molecule in time t as an integral of its velocity v(t):
r(t) = ∫[0,t] v(u) du
The square of this distance is thus
r²(t) = ∫[0,t] ∫[0,t] v(u)·v(u′) du du′
Defining u′ = u + s, integrating over u, and taking the ensemble average results in the following form:
⟨r²(t)⟩ = 2 ∫[0,t] (t − s) ⟨v(0)·v(s)⟩ ds
In this equation ⟨v(0)·v(s)⟩ is the velocity autocorrelation function, so the relationship between the
mean square displacement and the velocity autocorrelation function is now apparent. This can also
be written as
⟨r²(t)⟩ = 2t ∫[0,t] ⟨v(0)·v(s)⟩ ds − 2 ∫[0,t] s ⟨v(0)·v(s)⟩ ds
What this integral shows is that the mean square displacement is comprised of two parts. The first
term on the right includes the time t explicitly and if we assume that when t is large, the velocity
autocorrelation function decays to zero (as it usually does) then the integral here will have a fixed
value. Since the second term also integrates to a fixed value for large t, we can see that this equation
is equivalent to Einstein's, provided we assume that
6D = 2 ∫[0,∞] ⟨v(0)·v(s)⟩ ds
and
C = −2 ∫[0,∞] s ⟨v(0)·v(s)⟩ ds
when t is large. This is a very important result, as it shows how the diffusion coefficient can be
obtained from both the velocity autocorrelation function and the mean square displacement.
Another thing we can see is that when t is small, the time dependence of the velocity
autocorrelation function cannot be ignored (it is no longer constant). So from the above integral, it
follows that the mean square displacement must depend on the behavior of the velocity
autocorrelation function at short time. This means the short time behavior of the mean square
displacement cannot be linear. Molecular motion only becomes random after the velocity
autocorrelation function becomes zero and the molecules have “forgotten” what speed and direction
they began with at t=0.
10.18. Molecular dynamics simulations of liquid water
Water is probably the most frequently modeled compound in molecular
mechanics. The early attempts to model water molecules date from the 1970s:
Rahman and Stillinger - original four-site model, 1971
Stillinger and Rahman - revised model, 1973
Jorgensen - TIPS3, a three-site model, 1981
Berendsen - optimization of parameters, 1981
Jorgensen - comparison of models, 1983
A review of the properties of selected so-called three-point water models: P. Mark, L. Nilsson, J. Phys.
Chem. A 2001, 105, 9954-9960.
In Rahman and Stillinger's model water molecules were treated as asymmetric rigid rotors. They
defined effective pair potentials to replace higher-order potential terms and used neon parameters
for oxygen (which is rather unusual by current standards). They also defined a switching function
that allows the potential to vary smoothly to zero, and assigned charges to the lone pairs and
hydrogens; this is why it is called a four-site model for H2O.
10.18.1. Dielectric Relaxation
Our lack of knowledge of the true dipole moment in liquid water limits our ability to predict
the static dielectric constant ε. The four-site model guarantees a tetrahedral hydrogen-bond
arrangement, but the hydrogen bonds are too short and too directional. To improve the original
model, the lone pairs were shortened (d = 0.8 Å), giving the ST2 model.
Figure 28. The Stillinger and Rahman (JCP 1974, 60, 1545) model of water molecule
10.18.2. Three-site models for water
There are a few three-site water models:
The original TIPS3 model has positive charges on the hydrogen atoms and a negative charge on
oxygen atom (qO = -2qH). (Jorgensen JACS 1981, 103, 335)
Berendsen parameterized a three-site water model (SPC) and got better agreement with the
experiment. (Berendsen et al. in Intermolecular Forces 1981 p.331)
Comparison of those models is made in Jorgensen et al. (JCP 1983, 79, 926)
The TIP3P model is now frequently used. The second peak of the radial O-O pair distribution
function gO-O tends to disappear for this model. It is a good overall model and it is much less
expensive than TIP4P or other four-site models.
10.18.3. Implicit Treatment of Solvation
Figure 29. Mean influence of water captured by the solvation free energy
(Labels from Figure 28: O, H, H atoms; lone-pair charges q = -0.23 e, hydrogen charges q = +0.19 e, lone-pair distance d.)
The implicit treatment of solvation can be very efficient, especially with the generalized Born
(GB) approach. For example, the generalized Born model with a smooth switching function (Michael S. Lee, Freddie R.
Salsbury, and Charles L. Brooks III, J. Chem. Phys. 116, 10606 (2002); Im W, Feig M, Brooks CL 3rd,
Biophys J. 2003 Nov;85(5):2900-18) is about 30 times faster than comparable explicit-solvent
simulations (W. Im, M.S. Lee, and C.L. Brooks III, J. Comput. Chem. 24:1691-1702 (2003)). It offers a
good balance (compromise) between accuracy and efficiency.
10.19. Conformational searching, Quench Dynamics
Quench (or quenched) molecular dynamics was historically one of the first conformational
“search” methods. Better tools now exist, such as Monte Carlo methods or Replica Exchange
Molecular Dynamics.
10.19.1. Protocol for conformational search: quenched molecular
dynamics
1. Energy minimization.
2. Equilibration at high temperature for production dynamics. Run production dynamics and save
structures at periodic intervals (for example after every 1 ps).
3. Slowly cool the structures (annealing) and minimize energy of resulting conformations.
4. Save minimized structures for structural studies.
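The four-step protocol above can be sketched on a toy system. The following is a minimal illustration, not a production protocol: overdamped Langevin dynamics on a hypothetical one-dimensional double-well potential, with periodic snapshots quenched by steepest-descent minimization. The potential, time step and temperature are all invented for the example.

```python
import math
import random

def grad_U(x):
    # Toy double-well potential U(x) = (x^2 - 1)^2 with minima at x = -1 and x = +1
    return 4.0 * x * (x * x - 1.0)

def minimize(x, step=1e-3, n_steps=5000):
    # Steps 1 and 4: simple steepest-descent energy minimization
    for _ in range(n_steps):
        x -= step * grad_U(x)
    return x

def langevin_step(x, dt, temperature, rng):
    # Overdamped Langevin dynamics: dx = -grad(U) dt + sqrt(2 T dt) * noise
    return x - grad_U(x) * dt + math.sqrt(2.0 * temperature * dt) * rng.gauss(0.0, 1.0)

def quench_dynamics(n_saved=20, save_every=1000, dt=1e-3, t_high=2.0, seed=1):
    rng = random.Random(seed)
    x = minimize(0.5)                         # 1. energy minimization
    for _ in range(save_every):               # 2. equilibration at high temperature
        x = langevin_step(x, dt, t_high, rng)
    minima = []
    for _ in range(n_saved):                  # 2. production, saving snapshots periodically
        for _ in range(save_every):
            x = langevin_step(x, dt, t_high, rng)
        minima.append(round(minimize(x), 3))  # 3-4. quench the snapshot and store the minimum
    return minima

print(sorted(set(quench_dynamics())))         # the distinct low-energy conformations found
```

At a high temperature relative to the barrier, the snapshots visit both wells, so the quench step collects both distinct minima of the toy landscape.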
10.20. Constraints
Constraints (restrictions on the conformational “freedom” of the molecule) may be imposed
during minimization as well as during dynamics. These constraints may be based on experimental
data, such as NOEs from an NMR experiment, or they may be imposed by a template, such that one
forces a ligand to find the minimum closest in structure to a target molecule. Template forcing is also
important for homology modeling. Since it is not possible, at present, to fold a protein by a single
energy minimization, one can approach the question of determining the fold of a protein by
comparing it with a structure that has significant amino acid sequence homology.
10.20.1. Restrained dynamics as a tool in NMR structure determination
Distance restraints force two atoms toward a given separation:

E = k(rij − rtarget)²

where k is the force constant and rtarget is the target distance. An “energy penalty” is paid for
deviation from the target distance. In a typical NOE experiment, usually only the upper-bound
distance is known (for example r < 5 Å); for that reason the experimental data can be
Figure 30. An illustration of a so-called „flat-bottomed” potential
incorporated into a simulation using a so-called “flat-bottomed” potential. A flat-bottomed restraint
function allows the flexibility to accommodate typical data, where the minimum distance between nuclei
is determined from van der Waals radii and the data impose an upper bound.
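As a sketch, a flat-bottomed distance restraint of the kind described above can be written as a simple energy function; the force constant and the bounds below are hypothetical:

```python
def flat_bottom_energy(r, r_lower, r_upper, k=1.0):
    # Zero penalty anywhere inside [r_lower, r_upper]; harmonic penalty outside.
    # The bounds and force constant here are hypothetical example values.
    if r < r_lower:                    # below the van der Waals contact distance
        return k * (r - r_lower) ** 2
    if r > r_upper:                    # above the NOE-derived upper bound
        return k * (r - r_upper) ** 2
    return 0.0                         # the "flat bottom": no restraint energy

# Any distance between 2 and 5 Angstroms costs nothing; outside, the penalty grows
print(flat_bottom_energy(3.5, 2.0, 5.0))   # 0.0
print(flat_bottom_energy(6.0, 2.0, 5.0))   # 1.0
```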
10.20.2. Use of constraints to increase the integration step
The SHAKE algorithm constrains bond lengths so that they stay at their preset values during the
simulation. It uses iterative adjustments of atom positions (one constraint at a time). The SHAKE
algorithm typically shortens computational time by about a factor of three.
Figure 31. An illustration of the effect of the SHAKE algorithm on molecular structures.
Application of the SHAKE algorithm enables the increase of the integration step from Δt =
1 fs (fs = femtosecond) to Δt = 2 fs.
Figure 32. "Shaking" water
The special “three-point” algorithm (SETTLE) is used for constraining the geometry of water models.
Instead of restraining the bond angle, an artificial H-H bond is introduced (and constrained).
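The iterative, one-constraint-at-a-time idea behind SHAKE can be sketched as follows. This is a simplified illustration rather than the exact SHAKE formulation: it assumes equal masses and applies corrections along the current bond vector instead of the reference one. The water geometry uses the usual rigid three-site values, including the artificial H-H "bond".

```python
import math

def shake(positions, bonds, tol=1e-8, max_iter=500):
    # Sweep over the distance constraints (i, j, target length), correcting each
    # in turn, until one full pass needs no correction.
    for _ in range(max_iter):
        converged = True
        for i, j, d0 in bonds:
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            d = math.sqrt(sum(c * c for c in dx))
            if abs(d - d0) > tol:
                converged = False
                corr = 0.5 * (d - d0) / d   # split the correction between both atoms
                for k in range(3):
                    positions[i][k] += corr * dx[k]
                    positions[j][k] -= corr * dx[k]
        if converged:
            return positions
    raise RuntimeError("constraint iteration did not converge")

# Rigid three-site water: two O-H bonds plus the artificial H-H "bond"
pos = [[0.0, 0.0, 0.0], [1.1, 0.1, 0.0], [-0.2, 1.0, 0.0]]   # distorted O, H1, H2
bonds = [(0, 1, 0.9572), (0, 2, 0.9572), (1, 2, 1.5139)]
shake(pos, bonds)
```

After the call, all three constrained distances match their targets, so the distorted molecule has been projected back onto the rigid water geometry.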
10.20.3. SHAKE and minimization
Since SHAKE is an algorithm based on dynamics, the minimization algorithm is not aware of
what SHAKE is doing; for this reason, minimizations should generally be carried out without SHAKE.
One exception is a short minimization whose purpose is to remove close contacts between atoms
before a molecular dynamics simulation begins. Even in this case SHAKE can be avoided by
artificially and substantially increasing the bond and bond-angle force constants during the short
initial minimization.
10.21. Boundary conditions
Many current simulations are performed using periodic boundary conditions, so that surface
effects can be avoided and configurations typically encountered at the macroscopic level of the
system can be obtained. In this case, a particle interacts not only with all the particles in the
system, but also with their periodic images. The boundary conditions can be divided into two
classes:
Spatial boundary conditions:
MD simulations of biomolecules can be performed in a:
Periodic box (there is no boundary at all): periodicity artifacts
Thermodynamic boundary conditions:
MD simulations can be performed at different ensembles, according to statistical mechanics they can
be divided into four groups:
Constant NVE: micro-canonical ensemble
Constant NVT: canonical ensemble
Constant μVT: grand-canonical ensemble
Constant NPT: isothermal-isobaric ensemble
Figure 33. Boundary conditions: box or droplet?
We cannot simulate infinite systems, but finite systems lead to boundary effects. The
solution is to use periodic boundary conditions (PBC). How do we make sure a particle does not interact
with itself? Use the minimum image convention and cut off interactions beyond a specified
distance. After applying periodic boundary conditions, the electrostatic interactions need “special
treatment” as they are long-range.
Figure 34. An illustration of periodic boundary conditions
After applying boundary conditions the finite system is converted into an infinite one
without increasing the computational cost. As a side effect, new “features” are introduced:
unwanted surface effects are eliminated, but an artificial periodicity is imposed.
Figure 35. An illustration of an artificial periodicity
Motion of atoms in the box replicas mirrors the motion of atoms in the central box. If an
atom leaves the central box, its replica enters the central box from the other side; this implies that
the number of atoms in the central box is conserved. It is worth noting that not only a rectangular box
can be replicated using periodic boundary conditions.
A B C
Figure 36. A. Example: truncated octahedron; B. Peptide in aqueous solution in a periodic truncated octahedron. C. An illustration of how the cut-off value Rc can be applied to the extended system using periodic
boundary conditions
Figure 37. Nothing can stop particles from interacting with other particles from the neighboring boxes
Figure 38. An illustration of the minimum image convention
The “minimum image convention”: a particle doesn’t interact with all of the other particles,
only with the nearest non-equivalent neighbors; each atom interacts with at most one image of
every other atom (each individual particle in the simulation interacts with the closest image of the
remaining particles in the system).
Periodic boundary conditions (usually) work in three dimensions; in other words, each
system “sees an infinite number of its images” along the X, Y and Z axes. In order to simulate a surface, one
needs to apply two-dimensional periodic boundary conditions.
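The minimum image convention is easy to state in code. A minimal sketch for an orthorhombic box, where each coordinate difference is wrapped to the nearest periodic image:

```python
def minimum_image_distance(r1, r2, box):
    # Distance between two particles under the minimum image convention:
    # each coordinate difference is wrapped into [-L/2, +L/2).
    d2 = 0.0
    for a, b, L in zip(r1, r2, box):
        dx = b - a
        dx -= L * round(dx / L)   # shift by a whole number of box lengths
        d2 += dx * dx
    return d2 ** 0.5

# Two particles near opposite faces of a 10 x 10 x 10 box are actually close
print(minimum_image_distance((0.5, 5.0, 5.0), (9.5, 5.0, 5.0), (10.0, 10.0, 10.0)))
```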
Figure 67. An illustration of reduction of the number of variables by vector quantization
Introduction to molecular modeling. 14. Normal Modes and Principal Component Analysis 91 by Rajmund Kaźmierkiewicz
14.8. Using Normal Mode Analysis to Model Protein Dynamics
Figure 68. Using Normal Mode Analysis to Model Protein Dynamics (Tirion M., Large Amplitude Elastic Motions in Proteins from a Single Parameter, Atomic Analysis. Physical Review Letters 1996, 77:9)
14.9. The equilibrium correlation between fluctuations
The equilibrium correlation (Tirion, M. Large Amplitude Elastic Motions in Proteins from a
Single Parameter, Atomic Analysis. Physical Review Letters. 1996. 77:9) between fluctuations ΔRi and
ΔRj of two Cα carbons i and j is given, in the elastic network formulation, by:

⟨ΔRi · ΔRj⟩ = (3kBT/γ) [Γ⁻¹]ij

where γ is the uniform spring constant and Γ is a symmetric Kirchhoff matrix (connectivity matrix):
Γij = −1 if Cα atoms i and j are within the cutoff distance (i ≠ j), Γij = 0 otherwise, and the
diagonal elements are Γii = −Σj≠i Γij.

RMS deviation of backbone Cα atoms per mode (Tirion, M. Large Amplitude Elastic Motions in
Proteins from a Single Parameter, Atomic Analysis. Physical Review Letters. 1996. 77:9).
Figure 69. An illustration of RMS deviation of backbone C atoms per mode
Only a small number of modes contribute to overall motion
Tirion’s “geometric” modes match “energy-based” modes
14.10. Calculation of protein B-factors
Table 7. The B-factor value is located in the last column of the PDB formatted protein structure description
B-factor
ATOM 4 N GLY O 1 26.266 -12.458 5.676 1.00 40.85
ATOM 5 CA GLY O 1 26.236 -11.169 6.450 1.00 33.10
ATOM 6 C GLY O 1 27.338 -10.107 6.224 1.00 28.33
ATOM 7 O GLY O 1 28.478 -10.258 6.644 1.00 33.77
ATOM 8 N ASP O 2 27.085 -9.047 5.480 1.00 24.61
ATOM 9 CA ASP O 2 28.167 -8.101 5.107 1.00 22.56
ATOM 10 C ASP O 2 28.316 -6.857 5.988 1.00 21.47
ATOM 11 O ASP O 2 27.527 -5.948 5.802 1.00 14.42
The numbers in the last column of a protein PDB file give the temperature factor, or B-factor,
of each atom in the structure.
The B-factor describes the displacement of the atomic position from its average (mean) value:
the more flexible an atom is, the larger its mean-square displacement from the mean position
will be.
In graphics programs we can often color a protein according to the B-factor value.
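Reading B-factors out of ATOM records can be sketched as follows. The helper name is invented, and the records here are simplified whitespace-separated rows like those in Table 7; real PDB files are fixed-column, with the temperature factor in columns 61-66, so production code should slice `line[60:66]` instead of splitting.

```python
def bfactors(atom_lines):
    # B-factor is the last column of each (simplified) ATOM record; in real,
    # fixed-column PDB files the tempFactor field occupies columns 61-66.
    return [float(line.split()[-1]) for line in atom_lines if line.startswith("ATOM")]

records = [
    "ATOM 4 N  GLY O 1 26.266 -12.458 5.676 1.00 40.85",
    "ATOM 5 CA GLY O 1 26.236 -11.169 6.450 1.00 33.10",
]
print(bfactors(records))   # [40.85, 33.1]
```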
Figure 70. Sample protein structure colored according to B-factor value
14.11. Examples of Applications Using Normal Mode Analysis to
Model Protein Dynamics
14.11.1. Collective dynamics of protofilaments in microtubules:
Figure 71. Mechanisms of muscle action
Figure 72. More details of mechanisms of muscle action
14.11.2. Applications of NMA: ribosome (application to EM data)
Figure 73. Rotation of the 30S relative to the 50S: Ratchet-like motion. It is a key mechanical step in the translocation (Frank J., Agrawal R.K. Nature 2000, 318)
14.11.3. Applications of Normal Mode Analysis to experimental EM maps
The application of the Normal Mode Analysis method for the flexible fitting of high-
resolution structures into low-resolution maps of macromolecular complexes from electron
microscopy has recently been described in applications to simulated electron density maps. This
method uses a linear combination of low-frequency normal modes in an iterative manner to deform
the structure optimally so that it conforms to the lower-resolution electron density map. Gradient-following
techniques in the coordinate space of collective normal modes are used to optimize the overall
correlation coefficient between computed and measured electron densities. With this approach,
multi-scale flexible fitting can be performed using all atoms or only Cα atoms. (Seth A Darst, Bacterial RNA
polymerase, Current Opinion in Structural Biology, Volume 11, Issue 2, 1 April 2001, Pages 155-162)
Figure 74. An application of NMA to experimental EM maps
14.12. What are the Limitations of NMA?
We do not know a priori which mode is the relevant one, but the first 12 low-frequency modes are
probable candidates.
The amplitude of the motion is unknown.
NMA requires additional standards for parameterization, i.e. a screening against complementary
experimental data to select the relevant modes and amplitude.
Expert (user) input / evaluation is required
This method is not based on first principles of physics (as MD is).
Normal mode analysis is less (computationally) expensive than Molecular Dynamics (MD)
simulation, but because the computer must invert large matrices, it requires much more
memory when dealing with large molecules.
This problem can be overcome somewhat by clumping regions, such as amino acid residues, and
treating each of them as if it were a single atom, effectively reducing the number of atoms, and
hence the size of the matrices the computer must invert.
Normal modes may break the symmetry of structures due to forced orthogonalization.
14.13. The Principal Component Analysis (PCA) method
Normal mode analysis and principal component analysis are powerful theoretical tools for
studying collective motions in proteins. The former is based on the assumption of harmonicity of the
dynamics, while the latter is valid even when the dynamics is highly anharmonic. The results of the
latter analysis indicate that most important conformational events are taking place in the
conformational subspace spanned by a rather small number of principal modes, and this important
subspace is also spanned by a number of normal modes.
14.13.1. Collective coordinates
Collective variables are projections onto eigenvectors obtained either by diagonalization of a
covariance matrix, as in PCA, or by diagonalization of the matrix of second derivatives (the
Hessian), as in NMA: PCA diagonalizes a covariance matrix built from an MD trajectory, while
NMA diagonalizes the Hessian of the potential energy.
Functional motions of a protein may be represented by only a few low-frequency modes.
Principal Component Analysis is a mathematical technique used to find patterns in high-
dimensional datasets, such as protein structures. It allows one to find relationships/patterns which
would be invisible from a purely visual examination. PCA can be applied to MD simulation trajectories
to detect the global, correlated motions of the system (the principal components). One can separate
the configurational space into two sub-spaces:
1. The Essential subspace: correlated motions comprising only a few of the degrees of freedom
available to the protein, they are FUNCTIONALLY IMPORTANT
2. The “irrelevant” subspace: independent, Gaussian fluctuations, which are constrained and of
little or no functional relevance; they act locally.
Example: a 500 frame trajectory of a 300 residue protein.
14.13.2. Building the covariance matrix from your trajectory
Populate the 900 × 900 covariance matrix (x, y and z Cartesian coordinates of each of the 300 Cα atoms):

Cij = ⟨(xi − ⟨xi⟩)(xj − ⟨xj⟩)⟩

where ⟨xi⟩ is the time-averaged position of coordinate i.
The covariance matrix is then diagonalized; after that procedure the columns of the
transformation matrix are the eigenvectors, each associated with an eigenvalue. Eigenvectors
are then sorted by eigenvalue; the highest eigenvalues represent the most significant relationships
between the dimensions: these are the principal components. Eigenvectors represent a correlated
displacement of groups of atoms through space, and eigenvalues give the magnitude of this
displacement (nm²).
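The whole procedure above — build the covariance matrix from centered coordinates, diagonalize it, sort by eigenvalue, and project the trajectory onto a principal component — can be sketched with NumPy on invented toy data (one dominant collective motion added to noise):

```python
import numpy as np

def pca_from_trajectory(traj):
    # traj: (n_frames, 3 * n_atoms) array of Cartesian coordinates
    mean = traj.mean(axis=0)              # time-averaged positions <x_i>
    dx = traj - mean                      # fluctuations x_i - <x_i>
    cov = dx.T @ dx / len(traj)           # covariance matrix C_ij
    evals, evecs = np.linalg.eigh(cov)    # symmetric matrix -> eigh
    order = np.argsort(evals)[::-1]       # sort by eigenvalue, largest first
    return evals[order], evecs[:, order]

# Invented toy "trajectory": 500 frames, 30 coordinates, one dominant motion
rng = np.random.default_rng(0)
traj = rng.normal(size=(500, 30))
traj[:, 0] += 5.0 * np.sin(np.linspace(0.0, 6.0, 500))
evals, evecs = pca_from_trajectory(traj)
# project the centered trajectory onto the first principal component
proj = (traj - traj.mean(axis=0)) @ evecs[:, 0]
print(evals[0] > evals[1])
```

Because only one coordinate carries a large collective motion, the first eigenvalue dominates and the first eigenvector points almost entirely along that coordinate.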
Figure 75. First 2 eigenvectors account for about 60% of total positional fluctuations
14.13.3. Visualizing principal components (PC’s)
The motion described by an eigenvector can be visualized by projecting the trajectory onto the
eigenvector, taking the two extreme projections, and interpolating between them to create an
animation.
Figure 76. Projection of atom from a trajectory onto eigenvector
Figure 77. The sample porcupine plot
98 Introduction to molecular modeling. 15. Uses of Free Energy by Rajmund Kaźmierkiewicz
Porcupine plots can be used to display the motion described by an eigenvector in a static image. A
cone extending from each Cα position shows the direction of motion of that atom along the
eigenvector.
Figure 78. The sample covariance plot
Covariance plots are a tool to visualize atoms which have a high correlation coefficient in the
covariance matrix. The correlation coefficient measures the “degree of synchronization” of the motion of
two atoms.
14.13.4. Validation of PCA
One may ask: How relevant are the PCs we have calculated and visualized?
1. Divide simulations into two or more parts and compare the eigenvectors for each part, to
measure subspace overlap: higher overlap indicates sampling of only a single energy minimum;
lower overlap indicates more complete sampling.
2. One can also measure the cosine content of the PCs. Hess (Hess, ”Similarities between
principal components of protein dynamics and random diffusion”, Phys. Rev. E 62(6):8438-8448
(2000)) showed that the first few PCs of high-dimensional random diffusion are cosines, and that
several protein-simulation PCs resemble these cosines. High cosine content may therefore mean that the
fluctuations in your simulation are due to random diffusion; this is typically seen when simulation
timescales are too short to cross energy barriers.
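The cosine content test from point 2 can be sketched directly from Hess's formula, c = (2/T)(∫ cos(iπt/T) p(t) dt)² / ∫ p(t)² dt, evaluated here with simple discrete sums; a projection that is itself a half-period cosine should score close to 1.

```python
import math

def cosine_content(p, mode_index=1):
    # Discrete-sum version of Hess's cosine content for a PC projection p(t);
    # values near 1 suggest random-diffusion-like (unconverged) sampling.
    T = len(p)
    num = sum(math.cos(mode_index * math.pi * (t + 0.5) / T) * p[t] for t in range(T))
    den = sum(x * x for x in p)
    return 2.0 / T * num * num / den

# A half-period cosine time series has cosine content ~ 1
series = [math.cos(math.pi * (t + 0.5) / 1000) for t in range(1000)]
print(round(cosine_content(series), 3))   # 1.0
```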
15. Uses of Free Energy
Free energy is one of the most important thermodynamic quantities (reaction equilibrium, solvation,
stability, and kinetics). It is used in:
Evaluating protein-protein and protein-ligand interactions (binding constants,
association and dissociation)
Mutation analysis
Rational drug design
Protein folding/unfolding
15.1. Methods and Applications
Most free-energy methods are based on the calculation of free-energy differences, which
may be the quantity of interest anyway. If the reference is simple (such as an ideal gas or a harmonic crystal),
its absolute free energy can be evaluated analytically. The free-energy evaluation methods can be
divided into three classes:
Free energy perturbation and thermodynamic integration
Potential of mean force calculations
“Rapid” (and not very precise) free energy methods (Beveridge, D.L. and DiCapua, F.M. (1989)
Free Energy Via Molecular Simulation: Applications to Chemical and Biomolecular Systems, Annu.
Rev. Biophys. Biophys. Chem. 18: 431-492; Brooks, C.L. and Case, D.A. (1993) Simulations of
Peptide Conformational Dynamics and Thermodynamics, Chem. Rev. 93:2487-2502; Kollman, P.
(1993) Free Energy Calculations: Applications to Chemical and Biochemical Phenomena, Chem.
Rev. 93: 2395-2417; Lybrand, T.P. (1990) Computer Simulation of Biomolecular Systems Using
Molecular Dynamics and Free Energy Perturbation Methods, in, Reviews in Computational
Chemistry, Vol.1, Lipkowitz, K.B. and Boyd, D.B., eds. VCH Publishers, New York, pp. 295-320;
Reynolds, C.A., King, P.M., and Richards, W.G. (1992) Free Energy Calculations in Molecular
Biophysics, Mol. Phys. 76, 251-275)
Calculation of thermodynamic quantities from molecular simulation is based on the
principles of statistical mechanics. We need to extend our previous discussions of that topic to
describe application of free energy simulations to biomolecular systems.
15.2. Thermodynamic Integration
For the free energy function F(λ), on the interval λ = 0 to λ = 1, the free energy difference is
defined by:

ΔF = F(λ=1) − F(λ=0) = ∫₀¹ (∂F/∂λ) dλ

Since

F(λ) = −kT ln Z(λ)

then

∂F/∂λ = −(kT/Z(λ)) ∂Z(λ)/∂λ

From statistical mechanics

Z(λ) = ∫ exp[−U(x, λ)/kT] dx

So (after quite a few substitutions of mathematical expressions) we can write

∂F/∂λ = ⟨∂U/∂λ⟩λ

where the brackets denote an ensemble average over the probability function of λ. Thus, one can
write

ΔF = ∫₀¹ ⟨∂U/∂λ⟩λ dλ
In practice the integral is approximated by a summation over discrete intervals in λ. That is,
simulations are run at different values of over the interval 0 to 1, with ensemble averages being
determined at each . In many cases, simulations will be run in the forward direction (0 1) and the
reverse direction (1 0), with the amount of hysteresis between the forward and reverse simulations
being a measure of the statistical uncertainty in the integration. Another approach to obtaining
statistical information is to begin the simulation from a different equilibrated starting structure.
Estimates from all of the starting structures are independent estimates of the true mean and they
should be normally distributed.
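The discrete-λ approximation of the TI integral can be illustrated on a toy system where the ensemble average ⟨∂U/∂λ⟩ is known analytically, so the trapezoid-rule result can be checked against the closed-form ΔF; all values below are invented for the example.

```python
import math

# Toy alchemical path: U(x; lam) = k(lam)/2 * x^2 with k(lam) = (1 - lam) k0 + lam k1.
# Then dU/dlam = (k1 - k0) x^2 / 2 and, for a harmonic well, <x^2>_lam = kT / k(lam),
# so the TI integrand is known exactly and the discrete-lambda integral can be
# checked against the closed form dF = (kT / 2) ln(k1 / k0).

k0, k1, kT = 1.0, 4.0, 1.0

def du_dlam_avg(lam):
    k = (1.0 - lam) * k0 + lam * k1
    return 0.5 * (k1 - k0) * kT / k        # <dU/dlam>_lam

def thermodynamic_integration(n_windows=101):
    lams = [i / (n_windows - 1) for i in range(n_windows)]
    vals = [du_dlam_avg(lam) for lam in lams]
    h = lams[1] - lams[0]
    # trapezoid rule over the discrete lambda windows
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])

dF_ti = thermodynamic_integration()
dF_exact = 0.5 * kT * math.log(k1 / k0)
print(dF_ti, dF_exact)   # the two agree to about 1e-4
```

In a real calculation each ⟨∂U/∂λ⟩ value would come from a separate equilibrium simulation at that λ, rather than from a closed-form average.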
15.3. Perturbation Method
The perturbation method (free energy perturbation method, FEP) is an alternative approach to
calculating the free energy. We begin again with the relationship

F = −kT ln Z

and we employ the coupling parameter, λ:

U(λ) = (1 − λ) U₀ + λ U₁

We now write

ΔF = F₁ − F₀ = −kT ln (Z₁/Z₀)

We then multiply the numerator by the unity factor exp(+U₀/kT) exp(−U₀/kT). So (again, after quite
a few substitutions of mathematical expressions) we can write

ΔF = −kT ln ⟨exp[−(U₁ − U₀)/kT]⟩₀

where the subscript 0 indicates configurational averaging over the ensemble of configurations
representative of the initial state of the system. We also can show

ΔF = kT ln ⟨exp[+(U₁ − U₀)/kT]⟩₁

where configurational averaging is over the ensemble of configurations representative of the final
configuration.
The thermodynamic perturbation method is implemented by first performing Monte Carlo or
molecular dynamics simulations for state 0 and generating the ensemble average for the energy
difference described above (the forward calculation). Then simulations for state 1 are performed to
obtain the corresponding ensemble average (the reverse calculation). The difference in ΔF between
the forward and backward calculations is a measure of the statistical uncertainty of the calculations.
The perturbation approach will be accurate only when states 0 and 1 differ by only a small amount,
that is, when they are only perturbations of one another. However, additional methods can be
applied to extend the applicability and accuracy of these perturbation methods. If states 0 and 1
are not sufficiently similar, the calculation can be divided into a series of steps along the λ
coordinate. It is recommended that the free energy change for each step be no more than 2kT (ca.
1.5 kcal/mol). The overall free energy change is then obtained by summing the change from each of
the steps. That is,

ΔF = Σᵢ₌₁ⁿ ΔFᵢ = −kT Σᵢ₌₁ⁿ ln ⟨exp[−(U(λᵢ₊₁) − U(λᵢ))/kT]⟩λᵢ

where the interval 0 to 1 has been divided into n subintervals.
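The staged (multi-window) FEP estimator can be sketched on a toy harmonic system where the exact answer is known: ΔF = 0 for a pure shift of the well, since the free energy of a harmonic well does not depend on its position. All parameters are invented for the illustration.

```python
import math
import random

# Toy FEP: shift a harmonic well, U0(x) = x^2/2 -> U1(x) = (x - 1)^2/2 (kT = 1).
# The staged Zwanzig estimator should recover the exact answer dF = 0.

def fep_stage(samples, dU):
    # forward Zwanzig formula for one stage: dF = -kT ln < exp(-dU/kT) >_0
    avg = sum(math.exp(-dU(x)) for x in samples) / len(samples)
    return -math.log(avg)

def staged_fep(n_stages=10, n_samples=20000, seed=2):
    rng = random.Random(seed)
    total = 0.0
    for i in range(n_stages):
        a, b = i / n_stages, (i + 1) / n_stages
        # state "a" can be sampled exactly here: Gaussian, mean a, variance kT = 1
        samples = [rng.gauss(a, 1.0) for _ in range(n_samples)]
        dU = lambda x, a=a, b=b: 0.5 * (x - b) ** 2 - 0.5 * (x - a) ** 2
        total += fep_stage(samples, dU)
    return total

print(staged_fep())   # close to the exact value 0
```

Keeping each stage small (here, shifting the well by only 0.1) is exactly the "no more than 2kT per step" recommendation in practice.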
15.4. Thermodynamic Integration and Slow Growth
Figure 79. An Alternative Approach to Potential of Mean Force Calculations (C. Chipot, P. A. Kollman, D. A. Pearlman, Alternative Approaches to Potential of Mean Force Calculations: Free Energy Perturbation versus
Thermodynamic Integration. Case Study of Some Representative Nonpolar Interactions, Journal of Computational Chemistry 1996, 17(9): 1112-1131)
15.5. Thermodynamic Cycles
This approach is an extension of the free energy methods described above. It is often applied in
studying the relative strength of ligand-receptor interactions and the relative stability of proteins
differing in one or a few amino acids.
Thermodynamic cycle methods were developed because relatively large, complicated changes need
to be taken into account when considering the physical phenomena that occur in ligand-receptor
binding or the effect of a mutation on protein stability. That is, binding of a drug to a receptor will
produce relatively large conformational changes (that is, the protein will favor a particular set of
conformational substates). Binding of a very similar drug to the same site should produce most of the
same changes. The thermodynamic cycle is designed to cancel out the large changes that are
common to binding of either drug to the receptor.
Consider ligands A and A’ and a receptor B. We can write the equilibria:

A + B → AB (ΔG₁)
A’ + B → A’B (ΔG₂)

ΔG₁ and ΔG₂ represent the binding processes in which the large conformational changes
occur. We desire to calculate the quantity

ΔΔG = ΔG₂ − ΔG₁

We also define the nonphysical processes

A → A’ (in solution) (ΔG₃)
AB → A’B (in the complex) (ΔG₄)

These processes are part of the overall thermodynamic cycle.
Because ΔG, a thermodynamic function, is a state property, it is dependent only on the initial and
final states and not on the path between them. Thus,

ΔΔG = ΔG₂ − ΔG₁ = ΔG₄ − ΔG₃

ΔG₃ and ΔG₄ are calculated by one of the methods described above. The changes in these processes
are usually relatively small and localized, though it is still necessary to apply the coupling parameter
approach.
15.6. Application of free energy simulations, Partitioning the free
energy
Which interactions contribute most to the overall free energy? In thermodynamic integration the
free energy can be partitioned by splitting the λ-derivative of the potential into its components:

ΔF = ∫₀¹ ⟨∂Uelec/∂λ⟩λ dλ + ∫₀¹ ⟨∂UvdW/∂λ⟩λ dλ

In FEP, this can be achieved by first perturbing the electrostatic and then the van der Waals
parameters. Note: only the sum of the contributions is truly meaningful; the individual contributions
are not state functions (Boresch S, Karplus M: The meaning of component analysis: decomposition of
the free energy in terms of specific interactions. J Mol Biol 1995, 254:801-807).
15.7. Potential of Mean Force Calculations
Figure 80. The Potential of Mean Force (PMF)
We can identify or hypothesize a biological process that takes place along some inter- or intramolecular
coordinates, called reaction coordinates (RC).
The PMF is basically the free energy profile along the reaction coordinate, with all the other degrees
of freedom averaged out.
A simple example: we select the distance between two atoms as the RC; the PMF is then the free
energy change as the separation r between the atoms is changed. The distribution of r can be
described by the radial distribution function g(r), so:

W(r) = −kT ln g(r)

For a general RC q:

W(q) = −kT ln P(q) + const

For multi-dimensional cases (q, s):

W(q, s) = −kT ln P(q, s) + const
It is often difficult to find a suitable RC for detailed biological processes (Jensen M, Park S,
Tajkhorshid E, Schulten K: Energetics of glycerol conduction through aquaglyceroporin GlpF. Proc
Natl Acad Sci USA 2002, 99:6731-6736).
The logarithmic relationship between the PMF and g(q) means that a small change in the free
energy may correspond to g(q) changing by an order of magnitude or more from its most likely value.
Standard MC or MD methods do not adequately sample regions where g(q) differs drastically from
the most likely value, leading to inaccurate values for the PMF (Johannes Kästner, Hans Martin Senn,
Stephan Thiel, Nikolaj Otte, and Walter Thiel, QM/MM Free-Energy Perturbation Compared to
Thermodynamic Integration and Umbrella Sampling: Application to an Enzymatic Reaction, J. Chem.
Theory Comput., 2006, 2 (2), pp 452–461).
One can calculate the PMF using the FEP method, but FEP is commonly used to study “mutations”,
which often proceed along non-physical pathways. One usually wants to calculate the PMF for a physically
achievable process, so that one can locate the transition states and derive kinetic quantities such as rate
constants. The traditional way to avoid the sampling problem is umbrella sampling.
15.7.1. Potential of Mean Force calculation
The goal is to extract a degree of freedom from the partition function and the free energy. The free energy is
related to the probability:

F(x) = −kT ln P(x) + const

For this reason one can use a relatively simple approach for the PMF calculation:
Run canonical MD or Monte Carlo
Compute the probability distribution P(x)
P(x) determines F(x) up to a constant
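The three-step recipe above can be sketched end-to-end on a toy potential; the Metropolis sampler, the potential and the bin counts are all invented for the illustration.

```python
import math
import random

# Toy PMF recovery: sample x from exp(-U(x)) with U(x) = 2 x^2 (kT = 1) using
# Metropolis Monte Carlo, histogram the samples, and set F(x) = -ln P(x) + const.

def metropolis_samples(U, n=200000, step=0.5, seed=3):
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        y = x + rng.uniform(-step, step)
        if rng.random() < math.exp(min(0.0, U(x) - U(y))):
            x = y
        out.append(x)
    return out

U = lambda x: 2.0 * x * x
xs = metropolis_samples(U)

nbins, lo, hi = 20, -1.0, 1.0
counts = [0] * nbins
for x in xs:
    if lo <= x < hi:
        counts[int((x - lo) / (hi - lo) * nbins)] += 1
F = [-math.log(c / len(xs)) for c in counts]   # free energy profile, up to a constant
F0 = min(F)                                    # reference at the well bottom
print(F[0] - F0)   # roughly U at the outermost bin center, ~1.8
```

The additive constant (histogram normalization) drops out as soon as differences along the profile are taken, which is why the PMF is only defined up to a constant.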
15.8. Simple Umbrella Sampling
Problem: P(x) converges slowly due to barriers along x.
Solution: add an additional (“umbrella”) potential term to the energy to encourage barrier crossing:
Sample with the umbrella potential U’(x)
Compute the biased probability P’(x)
Estimate the unbiased free energy:

F(x) = −kT ln P’(x) − U’(x) + F₀

in this equation F₀ is undetermined, but it is irrelevant (it is the same additive constant for the whole profile).
15.9. Weighted Histogram Analysis Method (WHAM)
The Weighted Histogram Analysis Method determines the optimal Fᵢ values for combining simulations
(Kumar, et al., J Comput Chem, 13, 1011-1021, 1992). Some generalizations are possible. The WHAM
equations are

P(x) = Σᵢ nᵢ(x) / Σᵢ Nᵢ exp[(Fᵢ − Ubias,i(x))/kT]

exp(−Fᵢ/kT) = Σₓ P(x) exp(−Ubias,i(x)/kT)

where
nᵢ(x) = number of counts in the histogram bin associated with x in simulation i
Nᵢ = number of samples in simulation i
Ubias,i, Fᵢ = biasing potential and free energy shift for simulation i
P(x) = best estimate of the unbiased probability distribution
Fᵢ and P(x) are the unknowns; solve by iteration to self-consistency.
15.9.1. Running a Simulation
Choose the reaction coordinate
Choose the number of simulations and the biasing potential
Run the simulations
Compute time series for the value of the reaction coordinate
Apply the WHAM equations
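The full workflow — biased sampling in several windows, histograms, then self-consistent iteration of the WHAM equations — can be sketched on a toy problem whose true PMF is flat, so the recovered distribution should be (approximately) uniform; all parameters are invented, and kT = 1 throughout.

```python
import math
import random

def run_wham(n_windows=5, n_samples=20000, k_umb=100.0, nbins=50,
             n_iter=2000, seed=4):
    rng = random.Random(seed)
    centers = [(i + 0.5) / n_windows for i in range(n_windows)]
    width = 1.0 / nbins
    xmid = [(b + 0.5) * width for b in range(nbins)]

    # Biased sampling: a flat unbiased potential plus a harmonic umbrella gives
    # a (truncated) Gaussian in each window, which can be sampled exactly here.
    hist = [[0] * nbins for _ in range(n_windows)]
    for i, c in enumerate(centers):
        n = 0
        while n < n_samples:
            x = rng.gauss(c, math.sqrt(1.0 / k_umb))
            if 0.0 <= x < 1.0:
                hist[i][int(x / width)] += 1
                n += 1
    bias = [[0.5 * k_umb * (x - c) ** 2 for x in xmid] for c in centers]

    # Self-consistent iteration of the WHAM equations for P(x) and F_i
    F = [0.0] * n_windows
    for _ in range(n_iter):
        P = []
        for b in range(nbins):
            num = sum(hist[i][b] for i in range(n_windows))
            den = sum(n_samples * math.exp(F[i] - bias[i][b])
                      for i in range(n_windows))
            P.append(num / den)
        for i in range(n_windows):
            F[i] = -math.log(sum(P[b] * math.exp(-bias[i][b])
                                 for b in range(nbins)))
    total = sum(P)
    return [p / total for p in P]

P = run_wham()
print(max(P) / min(P))   # close to 1 for a flat true distribution
```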
15.9.2. Reaction Coordinate
The choice of the reaction coordinate is sometimes obvious: it may be a dihedral angle, as
for butane, or the backbone dihedrals, as for the alanine dipeptide. Some care is required, because
the PMF depends on the choice of coordinate, and the volume element may not be constant along
the reaction coordinate.
15.9.3. Example: n-Butane
Compute PMF for rotating dihedral of united atom n-butane
PMF integrates out the effects of flexible bonds and angles
Protocol:
18 independent simulations:
500 ps each
Restraint spring constant = 0.02 kcal/mol-deg2
T=300K (stochastic dynamics)
WHAM:
90 bins (4°/bin)
Enforced periodicity
Figure 81. Histograms from Individual Trajectories
Figure 82. Histogram of Combined Trajectories
Figure 83.The n-butane PMF
15.10. Steered Molecular Dynamics
In Steered Molecular Dynamics (SMD), time-dependent external forces are applied to a
system, which induce unbinding of ligands and conformational changes in biomolecules on time
scales accessible to MD simulations. Assuming a reaction coordinate x, we add an external force
along the path; a simple way is via a harmonic spring whose center moves with constant velocity v:

U(x, t) = (k/2) [x − (x₀ + vt)]²
Figure 84. The schematic illustration of Steered Molecular Dynamics
The Steered Molecular Dynamics method is similar to experiments by Atomic Force Microscopy: a
“spring” of stiffness k is attached to the ligand, a constant pulling rate is applied, and the
adhesion forces are measured while the ligand detaches from the protein.
15.11. “Rapid” Free Energy Methods
Free energy calculations are very important in computer-aided drug design. However, if the
calculations take longer to perform than a candidate drug molecule can be synthesized and tested,
then there is little practical benefit from attempting the calculation.
Free energy calculations are time-consuming. It is therefore necessary to develop alternative
methods which, while still being based upon 'exact' statistical mechanics, are intended to provide the free
energy with less computational effort than a full free energy calculation.
15.11.1. Linear Interaction Energy (LIE)
The Linear Interaction Energy is a semi-empirical method for estimating absolute binding free
energies of ligands binding to proteins. The interaction between the ligand and protein or solvent is
broken down into the electrostatic and van der Waals contributions.
In the LIE approximation the binding free energy is estimated as

ΔG_bind ≈ α(⟨V_el⟩_bound − ⟨V_el⟩_free) + β(⟨V_vdW⟩_bound − ⟨V_vdW⟩_free)

To determine ΔG_bind one thus needs to perform just two simulations: one of the ligand in the solvent and
the other of the ligand bound to the protein.
What remains is to determine the values of the parameters α and β. According to some analytical
(linear response) theories, the parameter α, related to the electrostatic contribution, is around 1/2.
For the van der Waals component no such analytical theory exists. β depends on the
force field and on the nature of the binding site, i.e. on the distribution of polar and non-polar groups
within it. In other words, β needs to be evaluated for each protein separately (Wang
W, Wang J, Kollman PA: What determines the van der Waals coefficient β in the LIE (Linear
Interaction Energy) method to estimate binding free energies using molecular dynamics simulations?
Proteins Struct Funct Genet 1999, 34:395-402).
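As a sketch of the LIE arithmetic (the α and β defaults below are purely illustrative, not fitted constants, and the trajectory averages are toy numbers):

```python
def lie_binding_free_energy(el_bound, el_free, vdw_bound, vdw_free,
                            alpha=0.5, beta=0.16):
    """LIE estimate: dG = alpha * d<V_el> + beta * d<V_vdW>.
    alpha ~ 1/2 is suggested by linear response theory; beta has no
    analytical value and must be fitted for each protein."""
    mean = lambda xs: sum(xs) / len(xs)
    d_el = mean(el_bound) - mean(el_free)       # electrostatic shift
    d_vdw = mean(vdw_bound) - mean(vdw_free)    # van der Waals shift
    return alpha * d_el + beta * d_vdw

# toy ligand-environment interaction energies (kcal/mol) sampled from
# hypothetical bound and free simulations
dg = lie_binding_free_energy(el_bound=[-60.0, -62.0], el_free=[-50.0, -52.0],
                             vdw_bound=[-40.0, -42.0], vdw_free=[-30.0, -32.0])
```

The two input pairs correspond exactly to the two simulations mentioned above: ligand bound to the protein and ligand free in solvent.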
15.11.2. Molecular Mechanics Poisson-Boltzmann Surface Area Method
(MM/PBSA)
The MM/PBSA approach is a post-processing method for evaluating free energies of
binding, or for calculating absolute free energies of molecules in solution, which combines the
molecular mechanical energies with continuum solvent approaches. In this method, one usually
carries out an MD simulation with explicit water and counterions. Then one post-processes these
structures, removing the solvent and counterions, and calculates the Gibbs free energy (Kollman PA,
Massova I, Reyes C, Kuhn B, Huo S, Chong L, Lee M, Lee T, Duan Y, Wang W, Donini O, Cieplak P,
Srinivasan J, Case DA, Cheatham TE III: Calculating structures and free energies of complex molecules:
combining molecular mechanics and continuum models. Acc Chem Res 2000, 33:889-897):
The calculated average Gibbs free energy is

G = ⟨E_MM⟩ + ⟨G_solv⟩ − T⟨S⟩,   with   G_solv = G_PB/GB + G_SA

The components of the MM/PBSA equation are as follows:

⟨E_MM⟩: the average molecular mechanical energy (bond, angle, torsion, van der Waals and
electrostatic terms);

G_PB/GB: the polar part of the solvation free energy, obtained from a numerical solution of the
Poisson-Boltzmann equation or from the Generalized Born model;

G_SA: the non-polar part of the solvation free energy, estimated from the solvent-accessible
surface area;

T⟨S⟩: the solute entropy term, which is likely to be much smaller than the other terms. It can be
estimated by harmonic analysis or normal mode analysis.
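Putting the terms together, a toy Python sketch of the MM/PBSA bookkeeping (all numbers below are purely illustrative, not real MM/PBSA output):

```python
def mmpbsa_free_energy(e_mm, g_pb, g_sa, ts):
    """MM/PBSA average Gibbs free energy of one species:
    G = <E_MM> + G_PB(or GB) + G_SA - T<S>."""
    return e_mm + g_pb + g_sa - ts

def mmpbsa_binding(g_complex, g_receptor, g_ligand):
    """Binding free energy: dG_bind = G(complex) - G(receptor) - G(ligand)."""
    return g_complex - g_receptor - g_ligand

# illustrative per-species averages (kcal/mol) over snapshots
g_c = mmpbsa_free_energy(-2500.0, -800.0, 25.0, 30.0)   # complex
g_r = mmpbsa_free_energy(-2000.0, -750.0, 22.0, 25.0)   # protein
g_l = mmpbsa_free_energy(-480.0, -60.0, 5.0, 8.0)       # ligand
dg_bind = mmpbsa_binding(g_c, g_r, g_l)
```

In practice each of the four terms is averaged over the post-processed snapshots before this subtraction is made.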
15.11.3. Example: MM/PBSA
15.11.4. Binding free energy of protein-ligand
There are two methods of evaluating ΔG_bind = G_complex − G_protein − G_ligand:
1. separate simulations of complex, protein, and ligand or
2. evaluation of all three terms using just the snapshots from complex simulations.
Figure 85. Sample correlation between calculated and experimental protein-ligand binding free energies
The second method is a good approximation in cases where there are no large conformational changes
of the protein and ligand before and after their association (Kuhn B, Kollman PA: Binding of a diverse set
of ligands to avidin and streptavidin: an accurate quantitative prediction of their relative affinities by
a combination of molecular mechanics and continuum solvent models. J Med Chem 2000, 43:3786-
3791).
15.11.5. Binding free energy of protein-RNA
Figure 86. The MM-PBSA free energy differences between free and bound protein and RNA
Conformational change upon binding of the U1A protein and internal loop (IL) RNA: ΔG_prot and ΔG_RNA are the
MM-PBSA free energy differences between free and bound protein and RNA, respectively, and ΔG_bind is the
free energy of association of protein and RNA in their bound structures. (Reyes C, Kollman PA:
Structure and thermodynamics of RNA-protein binding: using molecular dynamics and free energy
analysis to calculate both the free energies of binding and conformational change. J Mol Biol 2000,
297:1145-1158.)
16. Molecular Distance Geometry Problem
Given n atoms a1, …, an and a set of distances di,j between ai and aj for the pairs (i, j) in S,
find coordinates x1, …, xn in R³ for a1, …, an such that

‖xi − xj‖ = di,j for all (i, j) ∈ S,

where S is a set of integer pairs with indices running from 1 to n.
16.1. Current Approaches
Embed Algorithm by Crippen and Havel
Geometric Build-Up by Blumenthal 1953
CNS Partial Metrization by Brünger et al.
Graph Reduction by Hendrickson
Alternating Projection by Glunt and Hayden
Global Optimization by Moré and Wu
Multidimensional Scaling by Trosset et al.
Currently, the first two approaches are most commonly used.
16.1.1. Embed Algorithm
1. bound smoothing: make the lower and upper distance bounds consistent (triangle inequalities)
2. distance metrization: estimate the missing distances
3. repeat (say, 1000 times):
a. randomly generate a distance matrix D between the bounds L and U
b. find the coordinates X from D using SVD
c. if X is found, stop
4. select the best approximation X
5. refine X with simulated annealing
6. final optimization
(Crippen and Havel 1988 (DGII, DGEOM); Brünger et al 1992, 1998 (XPLOR, CNS))
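Step 3b, recovering coordinates from a trial distance matrix, can be illustrated with classical multidimensional scaling: build the Gram matrix from the squared distances and take its top three eigencomponents. This is a minimal NumPy sketch of the idea, not code from DGII, DGEOM, XPLOR or CNS:

```python
import numpy as np

def embed_from_distances(D, dim=3):
    """Recover coordinates (up to rotation/translation/reflection) from a
    full Euclidean distance matrix D via classical MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # Gram matrix of centered coords
    w, V = np.linalg.eigh(B)                  # eigenvalues in ascending order
    w, V = w[::-1][:dim], V[:, ::-1][:, :dim] # keep the top `dim` components
    return V * np.sqrt(np.clip(w, 0.0, None))

# round-trip check on random 3D points
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
X2 = embed_from_distances(D)
D2 = np.linalg.norm(X2[:, None, :] - X2[None, :, :], axis=-1)
# all pairwise distances are reproduced even though X2 is a rotated/
# reflected copy of X
assert np.allclose(D, D2)
```

When D is only approximately Euclidean (as for a random matrix drawn between L and U), the smaller eigenvalues are nonzero and the best rank-3 fit is taken instead, which is why the Embed algorithm follows this step with refinement and optimization.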
16.1.2. Geometric Build-Up
Geometric Build-Up is a (rather advanced) mathematical procedure used to speed up the
reconstruction of atom locations from the matrix of distances. It uses distinctive mathematical tools and
concepts, for example:
Independent Points: a set of k+1 points in k-dimensional space R^k is called independent if it is not
a set of points in R^(k-1).
Metric Basis: A set of points B in a space S is a metric basis of S provided each point of S is
uniquely determined by its distances from the points in B.
Fundamental Theorem: any k+1 independent points in k-dimensional space R^k form a metric basis
for R^k. (Blumenthal 1953: Theory and Applications of Distance Geometry)
Besides being rather cryptic for non-mathematicians, this machinery considerably speeds up the calculations: the
geometric build-up algorithm solves the molecular distance geometry problem in O(n) time when the distances
between all pairs of atoms are given, while the singular value decomposition algorithm requires
O(n²) to O(n³) computing time!
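The O(n) behaviour comes from placing each new atom in constant time from its distances to four already-placed, independent atoms: subtracting the sphere equations |x − b_i|² = d_i² pairwise cancels the quadratic term and leaves a 3×3 linear system. A minimal NumPy sketch of that single placement step (a hypothetical helper, not the published implementation):

```python
import numpy as np

def place_atom(base, dists):
    """Locate a new atom from its distances to four independent
    (non-coplanar) already-placed atoms.
    base:  (4, 3) array of known coordinates
    dists: (4,) array of distances from the new atom to each base atom"""
    b0, d0 = base[0], dists[0]
    # subtracting sphere equation 0 from equations 1..3 gives
    # 2 (b_i - b_0) . x = |b_i|^2 - |b_0|^2 - d_i^2 + d_0^2
    A = 2.0 * (base[1:] - b0)
    rhs = (np.sum(base[1:] ** 2, axis=1) - np.sum(b0 ** 2)
           - dists[1:] ** 2 + d0 ** 2)
    return np.linalg.solve(A, rhs)

# check: recover a known point from its distances to a tetrahedral base
base = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
p_true = np.array([0.3, 0.4, 0.5])
p = place_atom(base, np.linalg.norm(base - p_true, axis=1))
```

Sweeping this step over the atoms, each time using four previously placed atoms as the base, rebuilds the whole structure with a constant amount of work per atom.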
Build up procedures example application:
Figure 87. The X-ray crystallography structure (left) of the HIV-1 RT p66 protein (4200 atoms) and the structure (right) determined by the geometric build-up algorithm using the distances for all pairs of atoms in the protein.
The algorithm took only 188,859 floating-point operations to obtain the structure, while a conventional
singular-value decomposition algorithm required 1,268,200,000 floating-point operations. The RMSD of the
two structures is ~10⁻⁴ Å.
17. Protein Folding
Proteins are synthesized as linear chains and then assume their tertiary structure by "folding". The exact
mechanism is still unknown; however, molecular mechanics simulations can be informative. Proteins
assume the lowest-energy structure, or sometimes an ensemble of low-energy structures. Most
likely the hydrophobic collapse is an important "driving" force of the folding process, and the local
(secondary) structure tendencies also play a significant role. The folded structure is stabilized
internally by a network of hydrogen bonds, disulphide bonds, electrostatic interactions and salt
bridges. There are three major classes of methods for tertiary (folded) protein structure
prediction:

Homology Modeling/Comparative Modeling: the probe and template sequences are
evolutionarily related.

Fold Recognition/Threading: for the query sequence, determine the closest matching structure
from a library of known folds by a scoring function.

First Principles: either with database information (secondary and/or tertiary information from
databases/statistical methods), or ab initio, without database information
(physicochemical models with the most general application).