AutoDock: An Automated Docking Software for Predicting Optimal Protein-Ligand Interaction

By Susan McClatchy, Milind Misra, Chandreyee Mukherjee, Indu Shrivastava

Dec 22, 2015

Page 1:

AutoDock

An Automated Docking Software for Predicting Optimal Protein-Ligand Interaction

By Susan McClatchy, Milind Misra, Chandreyee Mukherjee, Indu Shrivastava

Page 2: Introduction

Chandreyee Mukherjee

Page 3: Automated Docking: Importance

Interactions between biomolecules lie at the core of all metabolic processes and life activities.

The number of solved protein structures available in the databases is expanding exponentially.

To understand their functions, it is essential to elucidate the interaction mechanisms between the different molecules.

Primary importance lies in rational drug design: depending upon the success of the docked molecules, the docking ligand may be redesigned or its structure further refined.

Docking is also important in immunology, for studying antigen-antibody interactions.

Page 4:

[Figure: Inhibitor bound to active site of HIVPR]

[Figure: Surface structure of HIVPR with bound inhibitor]

Page 5: What is docking?

Docking is the prediction of the optimal physical configuration and interaction energy between two molecules.

The docking problem optimizes:

The binding between the two molecules, such that their orientation maximizes the interaction

The total energy of interaction, such that the best binding configuration has the minimum binding energy

The resultant structural changes brought about by the interaction

Page 6: Categories of docking

1. Protein-Protein Docking:
   Both molecules are rigid
   Interaction produces no change in conformation
   Similar to the lock-and-key model

2. Protein-Ligand Docking:
   Ligand is flexible but the receptor protein is rigid
   Interaction produces conformational changes in the ligand

Page 7:

[Figure: illustrations of 1. Protein-Protein Docking and 2. Protein-Ligand Docking; figure label: "optimized"]

Page 8: Docking uses a "search and score" method

It involves:

Finding useful ways of representing the molecules and molecular properties

Exploring the configuration space available for interaction between ligand and receptor

Evaluating and ranking configurations using a scoring system, in this case the binding energy

However, because the binding energy is difficult to evaluate directly (the binding sites may not be easily accessible), it is modeled as a sum of terms:

$\Delta G_{bind} = \Delta G_{vdw} + \Delta G_{hbond} + \Delta G_{elect} + \Delta G_{conform} + \Delta G_{tor} + \Delta G_{sol}$

(van der Waals, hydrogen-bonding, electrostatic, conformational, torsional and solvation contributions, respectively)
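
The "search and score" loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pose is a single number, `sample_pose` and `score_terms` are hypothetical stand-ins for a real pose generator and energy model, and the toy term values are invented. Only the six-term sum mirrors the binding-energy model on this slide.

```python
import random

# The six contributions named in the binding-energy model.
ENERGY_TERMS = ("vdw", "hbond", "elect", "conform", "tor", "sol")


def binding_energy(terms):
    """Sum the six contributions to get the modeled binding energy."""
    return sum(terms[name] for name in ENERGY_TERMS)


def search_and_score(sample_pose, score_terms, n_poses, rng):
    """Generate poses, score each one, and return (energy, pose) pairs
    ranked from lowest (best) to highest energy."""
    scored = []
    for _ in range(n_poses):
        pose = sample_pose(rng)
        scored.append((binding_energy(score_terms(pose)), pose))
    scored.sort(key=lambda pair: pair[0])
    return scored


# Hypothetical toy problem: a "pose" is one coordinate x.
def sample_pose(rng):
    return rng.uniform(-5.0, 5.0)


def score_terms(x):
    # Invented values, chosen only so the total has a clear minimum.
    return {"vdw": (x - 1.0) ** 2, "hbond": -1.0, "elect": 0.5 * x,
            "conform": 0.0, "tor": 0.3, "sol": 0.0}


ranked = search_and_score(sample_pose, score_terms, 200, random.Random(1))
best_energy, best_pose = ranked[0]
print(best_energy, best_pose)
```

Random sampling is the crudest possible "search"; the algorithms on the following slides replace it with guided strategies while keeping the same scoring step.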

Page 9: The AutoDock Software

Developed by A. J. Olson's group in 1990.

AutoDock evaluates the free energy of the docking molecules using 3D potential grids.

It uses heuristic search to minimize the energy.

Search algorithms used:
   Simulated Annealing
   Genetic Algorithm
   Lamarckian GA (GA+LS hybrid)

Page 10: Algorithms Overview

Simulated Annealing
   Based on temperature effects
   Start with high temperature and global search
   Lower temperature: local search

Genetic Algorithm
   Charles Darwin's Theory of Evolution
   Genotype to phenotype

Lamarckian Algorithm (Jean-Baptiste de Lamarck)
   Phenotype to genotype

Page 11: Project Goal

Study the algorithms used to perform the searches and to calculate the minimum energy

Discuss why the GA+LS hybrid is better than SA

Look at an example, i.e., dock a ligand to a protein molecule using the latest AutoDock version

Page 12: The Algorithms

Sue McClatchy

Page 13: Simulated Annealing

Algorithm modeled after the cooling of a solution to form glass, though it is better explained by crystal formation

Given a long enough cooling time, molecules will relax into their lowest energy state to form the largest crystals
   Quick cooling: highly disordered system
   Slow cooling: highly ordered crystal, with each molecule in its lowest energy state

Algorithm simulates either linear or proportional slow cooling
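
The two cooling styles named on this slide can be illustrated as follows. This is a sketch, not AutoDock's implementation: the function names are hypothetical, and the linear schedule's end temperature of 0 is an assumption. The starting temperature (1,000), reduction factor (0.95) and number of cycles (50) are borrowed from the SA docking run later in the deck.

```python
def proportional_schedule(t0, factor, cycles):
    """Proportional cooling: each temperature is `factor` times the previous one."""
    return [t0 * factor ** i for i in range(cycles)]


def linear_schedule(t0, t_end, cycles):
    """Linear cooling: temperature falls in equal steps from t0 to t_end."""
    if cycles == 1:
        return [t0]
    step = (t0 - t_end) / (cycles - 1)
    return [t0 - i * step for i in range(cycles)]


prop = proportional_schedule(1000.0, 0.95, 50)
lin = linear_schedule(1000.0, 0.0, 50)  # end temperature 0 is an assumption
print(prop[:3], lin[:3])
```

Proportional cooling spends more of its cycles at low temperature, while linear cooling spreads the cycles evenly across the whole temperature range.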

Page 14: The SA Algorithm

Uses a neighborhood operator N(s) to generate a set of solutions according to a fixed distribution

The new solution is compared to the preceding solution, and is accepted if its energy is lower than that of the previous solution

If the new solution has higher energy, it is accepted probabilistically according to the Boltzmann distribution (see figure above)

At high temperatures, many higher-energy solutions will be accepted; at low temperatures, the majority of probabilistic moves are rejected

Boltzmann probability = $e^{-\Delta E / T}$, where $\Delta E$ = energy difference between the two solutions (new minus previous) and $T$ = temperature

The Boltzmann distribution gives the probability of finding a system with energy E at temperature T
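
A small sketch of the acceptance rule shows how temperature controls uphill moves. The function name and the example energies are illustrative assumptions; the rule itself is the one stated on this slide.

```python
import math


def acceptance_probability(e_old, e_new, temperature):
    """Probability of accepting a move from energy e_old to energy e_new."""
    delta_e = e_new - e_old
    if delta_e <= 0:
        return 1.0  # lower (or equal) energy is always accepted
    return math.exp(-delta_e / temperature)  # Boltzmann factor for uphill moves


# The same uphill move (delta E = 5) at a high, medium and low temperature.
for t in (1000.0, 10.0, 0.1):
    print(t, acceptance_probability(0.0, 5.0, t))
```

The printed probabilities shrink as the temperature falls, which is why the search behaves globally early on and locally near the end.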

Page 15: Pseudocode for SA

Compute a random initial state s
n = 0, x*_n = s    // initialize best solution to s and first state to 0
Repeat i = 1, 2, …    // specify number of temperatures to try
    Repeat j = 1, 2, …, m_i    // no. of steps to perform for each temp. T_i
        Compute a neighbor s' = N(s)    // s' = new solution from N(s)
        if (f(s') <= f(s)) then    // if energy of s' <= energy of s
            s = s'    // accept new solution s'
            if (f(s) < f(x*_n)) then    // if energy of new solution < energy of best solution of state n,
                x*_{n+1} = s    // replace best with new
                n = n + 1
            endif
        else    // otherwise replace s with s' using
            s = s' with probability e^((f(s) - f(s'))/T_i)    // Boltzmann dist.
        endif
    EndRepeat
EndRepeat
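
The pseudocode translates fairly directly into Python. The version below is a minimal sketch on a toy one-dimensional problem; the energy function, neighbor step, temperature list and seed are all illustrative assumptions rather than anything AutoDock uses.

```python
import math
import random


def simulated_annealing(f, neighbor, s0, temperatures, steps_per_temp, rng):
    """Follow the slide's pseudocode: try steps_per_temp moves at each temperature."""
    s = s0
    best = s0
    for t in temperatures:
        for _ in range(steps_per_temp):
            s_new = neighbor(s, rng)
            if f(s_new) <= f(s):
                s = s_new  # downhill (or equal) move: always accept
                if f(s) < f(best):
                    best = s  # new best solution found
            elif rng.random() < math.exp((f(s) - f(s_new)) / t):
                s = s_new  # uphill move accepted with Boltzmann probability
    return best


# Toy problem (assumption): minimize (x - 2)^2 starting far from the minimum.
def energy(x):
    return (x - 2.0) ** 2


def step(x, rng):
    return x + rng.uniform(-1.0, 1.0)


temps = [10.0 * 0.9 ** i for i in range(60)]  # proportional cooling
best = simulated_annealing(energy, step, 8.0, temps, 50, random.Random(0))
print(best, energy(best))
```

Tracking `best` separately from the current state `s` matters because uphill moves can carry `s` away from the lowest point visited so far.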

Page 16: How Genetic Algorithms Work - A Simple Example

Initial population of binary creatures having 6 "genes":

1 1 1 1 0 0
0 0 0 0 0 1
1 0 0 0 0 1
0 0 0 0 0 0

Each gene has two different alleles, either a 0 or a 1

Three operators: crossover, mutation and selection

Page 17: Selection

Individual     Score
1 1 1 1 0 0    20
0 0 0 0 0 1    13
1 0 0 0 0 1    48
0 0 0 0 0 0    52

Selection based on a fitness function f(x)

This operator chooses those individuals with the lowest values

Those with higher values are chosen with a very low probability

Page 18: Crossover

Before crossover    After crossover
1 1 1 1 0 0         0 0 0 1 0 0
0 0 0 0 0 1         1 1 1 0 0 1
1 1 1 1 0 0         1 1 1 1 0 1
0 0 0 0 0 1         0 0 0 0 0 0

Page 19: Mutation

Before mutation    After mutation
0 0 0 1 0 0        0 0 1 1 0 0
1 1 1 0 0 1        1 1 1 0 1 1
1 1 1 1 0 1        1 1 1 1 0 1
0 0 0 0 0 0        0 0 1 0 1 0
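
The bit patterns on the Crossover and Mutation slides are consistent with one-point crossover followed by bit flips. The sketch below reproduces them under that assumption; the crossover points and flipped positions are inferred from the slide values, and a real GA would choose them at random.

```python
def crossover(parent_a, parent_b, point):
    """One-point crossover: the two parents swap everything after `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def mutate(individual, positions):
    """Flip the bit at each 0-based position listed in `positions`."""
    return [1 - gene if i in positions else gene
            for i, gene in enumerate(individual)]


a = [1, 1, 1, 1, 0, 0]
b = [0, 0, 0, 0, 0, 1]

# First selected pair, cut after the third gene.
c1, c2 = crossover(a, b, 3)
# Second selected pair (same parents), cut after the fifth gene.
c3, c4 = crossover(a, b, 5)

# Mutations matching the Mutation slide.
m2 = mutate(c2, {2})      # 0 0 0 1 0 0 becomes 0 0 1 1 0 0
m1 = mutate(c1, {4})      # 1 1 1 0 0 1 becomes 1 1 1 0 1 1
m4 = mutate(c4, {2, 4})   # 0 0 0 0 0 0 becomes 0 0 1 0 1 0
print(c1, c2, c3, c4, m1, m2, m4)
```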

Page 20: Replacement

Lower scoring individuals create more offspring, higher scoring ones create fewer or none at all

Offspring replace parental generation

"Elitism" function allows the best individual from the parent generation to persist, if it is a better solution than the new individuals created

Cycle of selection, mutation, crossover and replacement repeated

Individual     Score   # offspring
0 0 1 1 0 0    15      1
1 1 1 0 1 1    9       1
1 1 1 1 0 1    22      0
0 0 1 0 1 0    1       2

Page 21: Pseudocode for GA

Select an initial population set x_i^0 = {x_1^0, x_2^0, …, x_M^0}
Determine fitness values f(x_i^0) for each individual
Repeat for g = 1, 2, … # of generations
    Perform selection
    Perform crossover with a specified probability
    Perform mutation with a specified probability
    Determine fitness f(x_i^g) for new individuals
    x_g* = argmin_{i=1,…,M} f(x_i^g) and y_g* = f(x_g*)
    Perform replacement
Until stopping criterion (# of generations) is reached
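
A compact Python version of this loop is sketched below for bit-string individuals. Several choices are assumptions made for brevity and are not taken from the slides: binary tournament selection, one-point crossover, a toy fitness (the number of 1 bits, to be minimized), and the specific rates and seed.

```python
import random


def run_ga(fitness, n_genes, pop_size, generations, p_cross, p_mut, rng):
    """Minimize `fitness` over bit strings; pop_size is assumed to be even."""
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        # Selection (assumption: binary tournament, lower fitness wins).
        parents = [min(rng.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < p_cross:  # one-point crossover
                point = rng.randint(1, n_genes - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            children += [a, b]
        for child in children:  # mutation: flip each bit with probability p_mut
            for i in range(n_genes):
                if rng.random() < p_mut:
                    child[i] = 1 - child[i]
        # Replacement with elitism: the previous best survives if it beats
        # the worst child.
        worst = max(range(len(children)), key=lambda i: fitness(children[i]))
        if fitness(best) < fitness(children[worst]):
            children[worst] = best[:]
        pop = children
        best = min(pop, key=fitness)
        history.append(fitness(best))
    return best, history


best_individual, history = run_ga(sum, 12, 20, 40, 0.8, 0.02, random.Random(3))
print(best_individual, history[0], history[-1])
```

Because of elitism, the best score recorded in `history` can never get worse from one generation to the next.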

Page 22: How GA works in AutoDock

Ligand's "genes" include its x, y and z coordinates (translation)

Orientation is encoded as a quaternion: a unit vector that is given a random rotation angle between 0° and 360°

Additional genes may represent torsion angles between bonds of the ligand
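
As an illustration of the unit-vector-plus-angle encoding, the sketch below builds a random orientation quaternion. The (w, x, y, z) component order and the function names are assumptions made for this example; AutoDock's internal representation may differ.

```python
import math
import random


def axis_angle_to_quaternion(axis, angle):
    """Convert a unit rotation axis and an angle in radians to (w, x, y, z)."""
    half = angle / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


def random_orientation(rng):
    """Pick a random unit vector and a random angle in [0, 360) degrees."""
    while True:
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if 1e-9 < norm <= 1.0:  # reject points outside the unit ball
            break
    axis = [c / norm for c in v]
    angle = math.radians(rng.uniform(0.0, 360.0))
    return axis_angle_to_quaternion(axis, angle)


q = random_orientation(random.Random(7))
print(q, sum(c * c for c in q))
```

The result always has unit length, which is what makes a quaternion usable as a rotation.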

Page 23: Mapping

In standard GA, the genotype (x, y, z coordinates plus rotation and any torsion angles) is mapped to the fitness function f(x)

The fitness function value corresponds to each individual's phenotype

According to the right-hand side of the figure, genotypes of parents with high f(x) values are mutated to form genotypes of children with lower f(x) values

Page 24: Selection, Crossover & Mutation

Selection chooses ligands with the lowest fitness (energy) values

Crossover exchanges x, y, z coordinates, or rotations or torsions between these ligands

Example: Two ligands with xyz coordinates Abc and aBc. Crossover results in new individuals with coordinates abc and ABc

Mutation operator mutates coordinate or other angle values by adding a random real number drawn from a Cauchy distribution, which is similar to a Gaussian but has thicker tails
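
Cauchy-distributed mutation can be sketched with the inverse-CDF method. The function names, the gene values and the mutation rate used in the demo are illustrative assumptions; `alpha` (location) and `beta` (scale) follow the "Cauchy alpha" and "Cauchy beta" parameter names that appear on the docking slides.

```python
import math
import random


def cauchy_deviate(alpha, beta, rng):
    """Draw one Cauchy(alpha, beta) sample by inverting the CDF."""
    u = rng.random()
    return alpha + beta * math.tan(math.pi * (u - 0.5))


def mutate_genes(genes, rate, alpha, beta, rng):
    """Add a Cauchy deviate to each real-valued gene with probability `rate`."""
    return [g + cauchy_deviate(alpha, beta, rng) if rng.random() < rate else g
            for g in genes]


rng = random.Random(11)
genes = [1.5, -0.3, 4.0, 90.0]  # hypothetical x, y, z and one torsion angle
print(mutate_genes(genes, 0.5, 0.0, 1.0, rng))
```

The thick tails mean most mutations are small nudges, but occasional large jumps let the search escape a local minimum.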

Page 25: Replacement

Individuals with better-than-average fitness receive proportionally more offspring

$n_o = (f_w - f_i) / (f_w - \langle f \rangle)$, with $f_w \neq \langle f \rangle$

where

$n_o$ = number of offspring

$f_i$ = fitness of individual (energy of ligand)

$f_w$ = fitness of worst individual in last g generations (typically 10)

$\langle f \rangle$ = mean fitness of population

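The offspring formula above can be checked with a short worked example. The energies below are invented, and taking the worst individual from the current population (rather than from the last g generations) is a simplification made for this sketch.

```python
def offspring_counts(fitnesses, f_worst):
    """Apply n_o = (f_w - f_i) / (f_w - <f>) to every individual."""
    mean_f = sum(fitnesses) / len(fitnesses)
    if f_worst == mean_f:
        raise ValueError("f_w must differ from the mean fitness")
    return [(f_worst - f_i) / (f_worst - mean_f) for f_i in fitnesses]


energies = [-5.0, -3.0, -1.0, 3.0]  # invented ligand energies, lower is better
counts = offspring_counts(energies, max(energies))
print(counts, sum(counts))
```

With these numbers the mean energy is -1.5, so the best individual is allotted 8 / 4.5 offspring and the worst is allotted none. The allotments are real-valued; how they are turned into whole numbers of offspring is not covered by the slide.
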
Page 26: Lamarckian Genetic Algorithm

According to the left-hand side of the figure, LGA finds the lowest fitness function (energy) values first, then maps these values back to their respective genotypes

Genetic algorithm plus Solis and Wets local search

Better performance than either simulated annealing or genetic algorithm alone

Page 27: The Application

Milind Misra

Page 28: HIV-1 Protease and AHA006

HIV-1 Protease in complex with the cyclic sulfamide inhibitor, AHA006

Source: Protein Data Bank
Authors: K. Backbro, T. Unge
Exp. Method: X-ray Diffraction (2 Å res.)
Primary Citation: Backbro et al., J Med Chem 40, pp. 898 (1997)
Polymer Chains: A, B; Residues: 198; Atoms: 1632

Page 29:

[Figure: Protein (HIV-1 Protease) and Ligand (AHA006). Source: PDB]

Page 30:

[Figure (Rasmol): HIV-1 Protease dimer]

Page 31:

[Figure (SYBYL): Initial X-ray crystallographic positions of protein and ligand]

Page 32: Docking Preparation – Ligand

Assign charges
Define rotatable bonds
Rename aromatic carbons
Merge non-polar hydrogens
Write .pdbq ligand file

Page 33: Docking Preparation – Protein

Add essential hydrogens
Load charges
Merge lone-pairs
Add solvation parameters
Write .pdbqs protein file

Page 34: Docking Preparation – Grid

AutoDock uses grid-based docking

Ligand-protein interaction energies are pre-calculated and then used as a look-up table during simulation

Grid maps are constructed based on atoms of interest in ligand (here C, A, N, O, S, H)
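
The look-up-table idea can be sketched as a precomputed 3D grid plus interpolation between grid points. This is an illustration only: the function names, grid size and toy energy function are assumptions, and trilinear interpolation is used here as a common way to read values between grid points.

```python
import math


def build_grid(energy_fn, origin, spacing, shape):
    """Pre-calculate energy_fn at every grid point (done once, before docking)."""
    nx, ny, nz = shape
    return [[[energy_fn(origin[0] + i * spacing,
                        origin[1] + j * spacing,
                        origin[2] + k * spacing)
              for k in range(nz)] for j in range(ny)] for i in range(nx)]


def lookup(grid, origin, spacing, point):
    """Trilinear interpolation of the eight grid points surrounding `point`.
    Points outside the grid are extrapolated from the nearest cell."""
    sizes = (len(grid), len(grid[0]), len(grid[0][0]))
    idx, frac = [], []
    for p, o, n in zip(point, origin, sizes):
        t = (p - o) / spacing
        i = min(max(int(math.floor(t)), 0), n - 2)
        idx.append(i)
        frac.append(t - i)
    (i, j, k), (fx, fy, fz) = idx, frac
    total = 0.0
    for di, wx in ((0, 1 - fx), (1, fx)):
        for dj, wy in ((0, 1 - fy), (1, fy)):
            for dk, wz in ((0, 1 - fz), (1, fz)):
                total += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return total


# Toy "energy" that is linear in x, y, z, so interpolation should be exact.
def toy_energy(x, y, z):
    return 2.0 * x - y + 0.5 * z


grid = build_grid(toy_energy, (0.0, 0.0, 0.0), 0.5, (5, 5, 5))
value = lookup(grid, (0.0, 0.0, 0.0), 0.5, (0.3, 1.7, 0.25))
print(value, toy_energy(0.3, 1.7, 0.25))
```

In this scheme one such grid would be built per ligand atom type, so scoring a pose costs a handful of look-ups per atom instead of a sum over all protein atoms.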

Page 35:

[Figure (AutoDockTools)]

Page 36: Docking – Simulated Annealing

Parameters:
Runs = 100
Cycles = 50
Initial Temp (RT) = 1,000
Temp reduction factor = 0.95
Linear temperature reduction
Translation reduction factor = 1
Quaternion reduction factor = 1
Torsional reduction factor = 1
# rotatable bonds = 12
Initial coordinates = Random
Initial quaternion = Random
Initial dihedrals = Random
Translation step = 2.0 Å
Quaternion step = 50 deg
Torsion step = 50 deg

Results:
100 different clusters
Energy range: -0.63 to +64,000
Conformation #81: -0.63
Conformation #67: +20.02
Conformation #68: +10.74

Lowest energy conformation (#81) is not close to the position of the original ligand, but its conformation is similar

Conf #67 is closest to the position and conformation of the original ligand, but has higher energy

Conf #68 is close to the position but not the conformation of the original ligand; its energy is not as high

Page 37:

[Figure (SYBYL): Original ligand conf and SA conformation #67]

Page 38:

[Figure (SYBYL): Original ligand conf and SA conformation #67; close-up of previous]

Page 39:

[Figure (SYBYL): Original ligand conf and SA conformation #67]

Page 40: 100 Clustered SA Conformations

[Figure (gOpenMol)]

Page 41: Docking – Genetic Algorithm

Parameters:
Runs = 50
# Evaluations = 250,000
Population size = 50
Elitism count = 1
Mutation rate = 0.02
Crossover rate = 0.8
Window size = 10
Cauchy alpha = 0
Cauchy beta = 1
# rotatable bonds = 12
Initial coordinates = Random
Initial quaternion = Random
Initial dihedrals = Random
Translation step = 2.0 Å
Quaternion step = 50 deg
Torsion step = 50 deg

Results:
50 different clusters
Energy range: -18.66 to +86.28
Conformation #39: -18.66
Conformation #9: -10.60

Lowest energy conformation overall closest to original ligand conformation

If only 10 runs had been used instead of 50, then conf #9 would have been the lowest energy conformation.

Page 42: Docking – Local Search

Results:
18 different clusters
Energy range: +35.92 to +215,200
Confs #20, 21, 22, 23: +35.92

Lowest energy conformation was most dissimilar to original ligand conformation

Better results could have been obtained by reducing the step sizes

Parameters:
Runs = 50
Solis-Wets iterations = 300
Consecutive successes = 4
Consecutive failures = 4
Rho = 1
Lower bound on rho = 0.01
LS frequency = 0.06
# rotatable bonds = 12
Initial coordinates = Random
Initial quaternion = Random
Initial dihedrals = Random
Translation step = 2.0 Å
Quaternion step = 50 deg
Torsion step = 50 deg
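
To show what the iteration, success, failure and rho settings control, here is a simplified Solis-Wets-style local search on a toy function. It is a sketch under assumptions: the bias-update coefficients follow a commonly quoted form of the method, the toy function and starting point are invented, and AutoDock's own implementation may differ in detail. The default argument values are taken from the parameter list on this slide.

```python
import random


def solis_wets(f, x0, rng, max_iters=300, rho=1.0, rho_min=0.01,
               max_succ=4, max_fail=4):
    """Adaptive random local search: widen the step size rho after repeated
    successes, shrink it after repeated failures, stop when rho < rho_min."""
    x, fx = list(x0), f(x0)
    bias = [0.0] * len(x)
    succ = fail = 0
    for _ in range(max_iters):
        if rho < rho_min:
            break
        d = [b + rng.gauss(0.0, rho) for b in bias]
        forward = [xi + di for xi, di in zip(x, d)]
        backward = [xi - di for xi, di in zip(x, d)]
        if f(forward) < fx:
            x, fx = forward, f(forward)
            bias = [0.2 * b + 0.4 * di for b, di in zip(bias, d)]
            succ, fail = succ + 1, 0
        elif f(backward) < fx:
            x, fx = backward, f(backward)
            bias = [b - 0.4 * di for b, di in zip(bias, d)]
            succ, fail = succ + 1, 0
        else:
            bias = [0.5 * b for b in bias]
            succ, fail = 0, fail + 1
        if succ >= max_succ:
            rho, succ = rho * 2.0, 0   # expand after consecutive successes
        elif fail >= max_fail:
            rho, fail = rho * 0.5, 0   # contract after consecutive failures
    return x, fx


# Toy function (assumption): squared distance from the origin.
def sphere(v):
    return sum(c * c for c in v)


x_best, f_best = solis_wets(sphere, [3.0, -4.0], random.Random(2))
print(x_best, f_best)
```

Only improving moves are accepted, so the search can only descend from its starting point, which fits the slide's observation that pure local search from random starts gave the poorest docking results.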

Page 43: Docking – Lamarckian GA

Results:
10 different clusters
Energy range: -18.10 to -8.38
Conformation #7: -18.10

Lowest energy conformation fairly similar to original ligand conformation

If the number of runs was restricted to 10 for both GA and LGA, LGA would have generated the best structure

Parameters:
Runs = 10
Max # Evaluations = 250,000
Max # Generations = 27,000
Population size = 50
Elitism count = 1
Mutation rate = 0.02
Crossover rate = 0.8
Window size = 10
Cauchy alpha = 0
Cauchy beta = 1
Solis-Wets iterations = 300
Consecutive successes = 4
Consecutive failures = 4
Rho = 1
Lower bound on rho = 0.01
LS frequency = 0.06
* Gray options *

Page 44:

[Figure (SYBYL): Original ligand conf, best GA conf, best LGA conf, best SA conf, best LS conf]

Page 45:

[Figure (SYBYL): Original ligand conf, best GA conf, best LGA conf, best SA conf]

Page 46: References

http://cmgm.stanford.edu/biochem218/Projects%201998/Apaydin.pdf
http://www.biz.uiowa.edu/class/6K299_menczer/PPT/Hart/sld018.html
http://cs.felk.cvut.cz/~xobitko/ga/
http://www.bch.msu.edu/labs/kuhn/web/projects/screening/solvation.html
http://wwwcmc.pharm.uu.nl/gillies/thesis/
http://www.chem.uidaho.edu/~honors/boltz.html

S. Kumar et al. "Protein Flexibility and Electrostatic Interactions." IBM Journal of Research and Development, Vol. 45, No. 3/4 (2001)

G. Morris et al. "Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function." Journal of Computational Chemistry, Vol. 19, No. 14, 1639-1662 (1998)

C. Rosin et al. "A Comparison of Global and Local Search Methods in Drug Docking." UCSD CSE Technical Report #CS97-522 (1997)

C. A. Sotriffer et al. "Automated Docking of Ligands to Antibodies: Methods and Applications." Methods 20, 280-291 (2000)

M. Vieth et al. "Assessing Search Strategies for Flexible Docking."

Practical Handbook of Genetic Algorithms. Edited by Lance Chambers

An Introduction to Genetic Algorithms. Melanie Mitchell

Goodsell and Olson. Prot. Struct. Func. Genet. 8, 195 (1990)

Principles of Biochemistry. Lehninger

R. Durbin, S. Eddy, A. Krogh, G. Mitchison. Biological Sequence Analysis

Wm. E. Hart. "A Theoretical Comparison of Genetic Algorithms and Simulated Annealing." Sandia National Laboratories, www.cs.sandia.gov/~wehart