MOLECULAR DOCKING
V. Subramanian, Chemical Laboratory, Central Leather Research Institute, Adyar, Chennai
Dec 28, 2015
Introduction
• Discovering a new drug takes years to decades and is very costly
• Computer modeling is used to cut the research timeline and cost by reducing wet-lab experiments
Chemical + biological system → desired response?
Drug discovery
TRADITIONAL DRUG DESIGN
Drug Design Cycle (flowchart labels):
Lead generation: natural ligand / screening
Biological testing
Synthesis of new compounds
If promising: pre-clinical studies
Finding lead compound
• A lead compound is a small molecule that serves as the starting point for an optimization involving many small molecules that are closely related in structure to the lead compound
• Many organizations maintain databases of chemical compounds
• Some of these are publicly accessible; others are proprietary
• Databases contain an extremely large number of compounds (the ACS databases contain 10 million compounds)
• 3D databases have information about chemical and geometrical features
» Hydrogen bond donors
» Hydrogen bond acceptors
» Positive charge centers
» Aromatic ring centers
» Hydrophobic centers
Finding lead compound
• There are two approaches to this problem
– A computer program such as AutoDock (or a similar program such as Affinity from Accelrys) can be used to search a database by generating a “fit” between each molecule and the receptor
– Alternatively, one can search with a 3D pharmacophore
Structure based drug design
• Drug design and development
• Structure based drug design exploits the 3D structure of the target or a pharmacophore
– Find a molecule which would be expected to interact with the receptor (searching a database)
– Design an entirely new molecule from “SCRATCH” (de novo drug/ligand design)
• In this context bioinformatics and chemoinformatics play a crucial role
Structure-based Drug Design (SBDD)
Drug Design Cycle (flowchart labels):
Molecular Biology & Protein Chemistry
3D Structure Determination of Target and Target-Ligand Complex
Modelling
Structure Analysis and Compound Design
Biological Testing
Synthesis of New Compounds
If promising: Pre-Clinical Studies
Natural ligand / Screening
Structure based drug design
• SBDD:
• drug targets (usually proteins)
• binding of ligands to the target (docking)
↓
“rational” drug design (benefits = saved time and $$)
Schematics for structure based drug design
Flowchart labels:
Select and purify the target protein
Model inhibitor with computational tools
Synthesis; evaluate preclinical and clinical, in vitro and in vivo, in cells, animals, and humans
Drug
Obtain known inhibitor
X-ray structural determination of native protein
X-ray structural determination of inhibitor complex
Determine IC50
Structure based drug design has the potential to shave off years and millions of dollars
Working at the intersection
• Structural Biology
• Biochemistry
• Medicinal Chemistry
• Toxicology
• Pharmacology
• Biophysical Chemistry
• Natural Products Chemistry
• Chemical Ecology
• Information Technology
Molecular docking: definition
• It is a process by which two molecules are put together in three dimensions
• It seeks the best ways to put two molecules together
• It uses molecular modeling and computational chemistry tools
Molecular docking
• Docking is used for finding the binding modes of proteins with ligands/inhibitors
• In molecular docking, we attempt to predict the structure of the intermolecular complex formed between two or more molecules
• Docking algorithms are able to generate a large number of possible structures
• We use a force-field-based strategy to carry out docking
Figure: Oxygen transport molecule (101M) with surface and myoglobin ligand
Figure: Influenza virus B/Beijing/1/87 neuraminidase complexed with zanamivir
Figure: Plasma alpha antithrombin-III and pentasaccharide protein with heparin ligand
Steps of molecular docking
• Three steps
(1) Definition of the structure of the target molecule
(2) Location of the binding site
(3) Determination of the binding mode
Best ways to put two molecules together
– Need to quantify or rank solutions
– Scoring function or force field
– Experimental structure may be amongst one of several predicted solutions
– Need a search method
Questions
• Search
– What is it?
– When/why and which search?
• Scoring
– What is it?
• Dimensionality
– Why is this important?
Spectrum of search
• Local
– Molecular Mechanics
• Short - Medium
– Monte Carlo Simulated Annealing
– Brownian Dynamics
– Molecular Dynamics
• Global
– Docking
Details of search
Level-of-Detail
• Atom types
• Terms of force field
– Bond stretching
– Bond-angle bending
– Torsional potentials
– Polarizability terms
– Implicit solvation
Kinds of search
Systematic
• Exhaustive
• Deterministic
• Dependent on granularity of sampling
• Feasible only for low-dimensional problems
• DOF, 6D search
Kinds of search
Stochastic
• Random
• Outcome varies
• Repeat to improve chances of success
• Feasible for higher-dimensional problems
• AutoDock, < ~40D search
Stochastic search methods
• Simulated Annealing (SA)
• Evolutionary Algorithms (EA)
– Genetic Algorithm (GA)
• Others
– Tabu Search (TS)
• Hybrid Global-Local Search
– Lamarckian GA (LGA)
Simulated annealing
• One copy of the ligand (Population = 1)
• Starts from a random or specific position/orientation/conformation (= state)
• Constant temperature annealing cycle (Accepted & Rejected Moves)
• Temperature reduced before next cycle
• Stops at maximum cycles
Search parameters
Simulated Annealing
• Initial temperature (K)
• Temperature reduction factor (K-1 cycle)
• Termination criteria:
– accepted moves
– rejected moves
– cycles
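The annealing schedule described above can be sketched in a few lines of Python. This is a minimal illustration, not AutoDock's implementation: `toy_energy` is a hypothetical stand-in for a docking score, the temperature is in the same arbitrary units as the energy, and all parameter names and default values are assumptions chosen for the example.

```python
import math
import random


def toy_energy(state):
    """Hypothetical stand-in for a docking score: a quadratic bowl with minimum at 0."""
    return sum(x * x for x in state)


def simulated_annealing(energy, start, t_initial=100.0, reduction=0.9,
                        max_cycles=50, max_accepted=100, max_rejected=100,
                        step=0.5, seed=0):
    """Anneal one copy of the state (population = 1) and return the best state found."""
    rng = random.Random(seed)
    state, e = list(start), energy(start)
    best_state, best_e = list(state), e
    temperature = t_initial
    for _ in range(max_cycles):
        accepted = rejected = 0
        # One constant-temperature cycle: ends when either move limit is reached.
        while accepted < max_accepted and rejected < max_rejected:
            trial = [x + rng.uniform(-step, step) for x in state]
            e_trial = energy(trial)
            delta = e_trial - e
            # Metropolis criterion: always accept downhill moves,
            # accept uphill moves with probability exp(-delta / T).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                state, e = trial, e_trial
                accepted += 1
                if e < best_e:
                    best_state, best_e = list(state), e
            else:
                rejected += 1
        # Temperature is reduced before the next cycle.
        temperature *= reduction
    return best_state, best_e


best_state, best_e = simulated_annealing(toy_energy, [3.0, -2.0])
print(best_state, best_e)
```

In a real docking run the state would hold the ligand position, orientation and torsions, and the energy would come from the scoring function.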
Genetic function algorithm
• Start with a random population (50-200)
• Perform Crossover (Sex, two parents -> 2 children) and Mutation (Cosmic rays, one individual gives 1 mutant child)
• Compute fitness of each individual
• Proportional Selection & Elitism
• A new generation begins; the search stops when the total energy evals or maximum generations are reached
Search parameters
• Population size
• Crossover rate
• Mutation rate
• Local search
– energy evals
• Termination criteria
– energy evals
– generations
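A genetic algorithm with these parameters might look roughly like the sketch below. It is an illustration only: `toy_energy` is a hypothetical fitness function, the gene encoding and every default value are assumptions, and the local search step of a Lamarckian GA is left out.

```python
import random


def toy_energy(genes):
    """Hypothetical stand-in for a docking score (lower is better)."""
    return sum(g * g for g in genes)


def genetic_search(energy, n_genes, pop_size=50, crossover_rate=0.8,
                   mutation_rate=0.02, elitism=1, max_generations=100,
                   max_evals=25000, seed=0):
    """Return (best individual, its energy, number of energy evaluations used)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    evals = 0
    best, best_e = None, float("inf")
    for _ in range(max_generations):
        # Compute the fitness (here: energy) of each individual.
        energies = [energy(ind) for ind in pop]
        evals += len(pop)
        ranked = sorted(zip(energies, pop), key=lambda pair: pair[0])
        if ranked[0][0] < best_e:
            best_e, best = ranked[0][0], list(ranked[0][1])
        # Termination on the energy-evaluation budget.
        if evals >= max_evals:
            break
        # Proportional selection: lower energy gives a larger weight.
        worst = ranked[-1][0]
        weights = [worst - e + 1e-9 for e in energies]
        # Elitism: the best individuals survive unchanged.
        next_pop = [list(ind) for _, ind in ranked[:elitism]]
        while len(next_pop) < pop_size:
            mum, dad = rng.choices(pop, weights=weights, k=2)
            child_a, child_b = list(mum), list(dad)
            # Crossover: two parents give two children.
            if n_genes > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)
                child_a = mum[:cut] + dad[cut:]
                child_b = dad[:cut] + mum[cut:]
            # Mutation: each gene is perturbed with a small probability.
            for child in (child_a, child_b):
                for i in range(n_genes):
                    if rng.random() < mutation_rate:
                        child[i] += rng.gauss(0.0, 1.0)
                next_pop.append(child)
        pop = next_pop[:pop_size]
    return best, best_e, evals


best, best_e, evals = genetic_search(toy_energy, n_genes=4)
print(best_e, evals)
```

Both termination criteria from the slide appear here: the loop ends after `max_generations`, or earlier once `max_evals` energy evaluations have been spent.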
Dimensionality of molecular docking
• Degrees of Freedom (DOF)
• Position or Translation
– (x, y, z) = 3
• Orientation or Quaternion
– (qx, qy, qz, qw) = 4
• Rotatable Bonds or Torsions
– (tor1, tor2, … torn) = n
• Total DOF, or Dimensionality, D = 3 + 4 + n
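As a small worked example of this count, the sketch below stores one ligand pose and flattens it into a search vector. The `DockingState` class and its field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DockingState:
    """Hypothetical container for one ligand pose; names are illustrative."""
    position: Tuple[float, float, float]           # (x, y, z)
    quaternion: Tuple[float, float, float, float]  # (qx, qy, qz, qw)
    torsions: List[float]                          # one angle per rotatable bond

    def to_vector(self) -> List[float]:
        """Flatten the pose into the vector a search method would optimize."""
        return [*self.position, *self.quaternion, *self.torsions]


def dimensionality(n_torsions: int) -> int:
    """D = 3 (translation) + 4 (quaternion) + n (torsions)."""
    if n_torsions < 0:
        raise ValueError("number of torsions cannot be negative")
    return 3 + 4 + n_torsions


state = DockingState((0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0), [0.0] * 6)
print(len(state.to_vector()), dimensionality(6))
```

A ligand with 6 rotatable bonds therefore gives a 13-dimensional search, which is why flexible ligands call for stochastic rather than systematic search.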
Docking score
$\Delta G_{binding} = \Delta G_{vdW} + \Delta G_{elec} + \Delta G_{hbond} + \Delta G_{desolv} + \Delta G_{tors}$
• $\Delta G_{vdW}$: 12-6 Lennard-Jones potential
• $\Delta G_{elec}$: Coulombic with Solmajer dielectric
• $\Delta G_{hbond}$: 12-10 potential with Goodford directionality
• $\Delta G_{desolv}$: Stouten pairwise atomic solvation parameters
• $\Delta G_{tors}$: Number of rotatable bonds
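The 12-6 and 12-10 terms share one functional form, which the sketch below makes explicit. It is a generic n-m pair potential written for illustration, not the scoring code of any docking program; the coefficient formulas are chosen so that the minimum has depth `well_depth` at separation `r_eq`, and the numerical values in the usage lines are made up.

```python
def pair_potential(r, well_depth, r_eq, n=12, m=6):
    """Generic n-m potential: C_n / r**n - C_m / r**m.

    The coefficients are set so that the minimum sits at r = r_eq with
    energy -well_depth. n=12, m=6 gives a Lennard-Jones form;
    n=12, m=10 gives a hydrogen-bond-like form.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    c_n = m / (n - m) * well_depth * r_eq ** n
    c_m = n / (n - m) * well_depth * r_eq ** m
    return c_n / r ** n - c_m / r ** m


def total_score(terms):
    """Sum the individual contributions (vdW, elec, hbond, desolv, tors)."""
    return sum(terms.values())


# Illustrative values only (arbitrary energy and distance units).
vdw = pair_potential(4.0, well_depth=0.15, r_eq=4.0)
hbond = pair_potential(1.9, well_depth=5.0, r_eq=1.9, n=12, m=10)
print(vdw, hbond, total_score({"vdW": vdw, "hbond": hbond}))
```

The 12-10 form has a narrower well than the 12-6 form, which is one reason it suits the short, specific geometry of hydrogen bonds.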
Molecular mechanics: theory
• Considering the simple harmonic approximation, the potential energy of molecules is given by
$V = V_{Bond} + V_{Angle} + V_{Torsion} + V_{vdw} + V_{elec} + V_{op}$
• $V_{Bond} = \frac{1}{2} K_r (r_{ij} - r_0)^2$
• Where $K_r$ is the stretching force constant
• $V_{Angle} = \frac{1}{2} K_\theta (\theta_{ijk} - \theta_0)^2$
• Where $K_\theta$ is the bending force constant
• $V_{Torsion} = \frac{V}{2} \left(1 + \cos n(\phi + \phi_0)\right)$
• Where $V$ is the barrier to rotation and $\phi$ is the torsional angle
Molecular mechanics: theory
• A Lennard-Jones type 6-12 potential is used to describe non-bonded and weak interactions
• $V_{vdw} = \sum \left( A_{ij}/r_{ij}^{12} - B_{ij}/r_{ij}^{6} \right)$
• A simple Coulombic potential is used to describe electrostatic interactions
• $V_{elec} = \sum \left( q_i q_j / r_{ij} \right)$
• Out-of-plane bending/deformation is described by the following expression
• $V_{op} = 0.5\, K_{op}\, \chi^2$
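The individual terms of the molecular mechanics energy can be written as short Python functions. This is a sketch under stated assumptions: units are arbitrary, angles are in radians, the out-of-plane coordinate is called `chi` here for illustration, and no particular force field's parameters are implied.

```python
import math


def bond_energy(r, k_r, r0):
    """Harmonic bond stretching: 1/2 * K_r * (r - r0)^2."""
    return 0.5 * k_r * (r - r0) ** 2


def angle_energy(theta, k_theta, theta0):
    """Harmonic angle bending: 1/2 * K_theta * (theta - theta0)^2."""
    return 0.5 * k_theta * (theta - theta0) ** 2


def torsion_energy(phi, barrier, n, phi0=0.0):
    """Periodic torsion: V/2 * (1 + cos(n * (phi + phi0)))."""
    return 0.5 * barrier * (1.0 + math.cos(n * (phi + phi0)))


def vdw_energy(r, a_ij, b_ij):
    """12-6 Lennard-Jones pair term: A/r^12 - B/r^6."""
    return a_ij / r ** 12 - b_ij / r ** 6


def coulomb_energy(r, q_i, q_j):
    """Coulomb pair term q_i * q_j / r (unit conversion factor omitted)."""
    return q_i * q_j / r


def out_of_plane_energy(chi, k_op):
    """Harmonic out-of-plane term: 0.5 * K_op * chi^2."""
    return 0.5 * k_op * chi ** 2


# Total energy of a made-up fragment: one term of each kind.
total = (bond_energy(1.10, 340.0, 1.09)
         + angle_energy(math.radians(110.0), 50.0, math.radians(109.5))
         + torsion_energy(math.radians(60.0), 2.0, 3)
         + vdw_energy(3.8, 1.0e6, 600.0)
         + coulomb_energy(3.8, 0.1, -0.1)
         + out_of_plane_energy(0.05, 40.0))
print(total)
```

In a real calculation each function is summed over all bonds, angles, torsions and non-bonded pairs of the molecule.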
The forcefield
• The purpose of a forcefield is to describe the potential energy surface of entire classes of molecules with reasonable accuracy
• In a sense, the forcefield extrapolates from the empirical data of the small set of models used to parameterize it to a larger set of related models
• Some forcefields aim for high accuracy for a limited set of elements, thus enabling good predictions of many molecular properties
• Others aim for the broadest possible coverage of the periodic table, with necessarily lower accuracy
Components of a forcefield
• The forcefield contains all the necessary elements for calculations of energy and force:
– A list of forcefield types
– A list of partial charges
– Forcefield-typing rules
– Functional forms for the components of the energy expression
– Parameters for the function terms
– For some forcefields, rules for generating parameters that have not been explicitly defined
– For some forcefields, a way of assigning functional forms and parameters
The energy expression
Valence interactions
• The energy of valence interactions is generally accounted for by diagonal terms:
– bond stretching (bond)
– valence angle bending (angle)
– dihedral angle torsion (torsion)
– inversion, also called out-of-plane interactions (oop) terms, which are part of nearly all forcefields for covalent systems
– A Urey-Bradley (UB) term may be used to account for interactions between atom pairs involved in 1-3 configurations (i.e., atoms bound to a common atom)
• $E_{valence} = E_{bond} + E_{angle} + E_{torsion} + E_{oop} + E_{UB}$
Non-bond interactions
• The energy of interactions between non-bonded atoms is accounted for by
– van der Waals (vdW)
– electrostatic (Coulomb)
– hydrogen bond (hbond) terms in some older forcefields
• $E_{non-bond} = E_{vdW} + E_{Coulomb} + E_{hbond}$
Molecular dynamics (MD) simulations
• A deterministic method based on the solution of Newton’s equation of motion
$F_i = m_i a_i$
for the ith particle; the acceleration at each step is calculated from the negative gradient of the overall potential, using
$F_i = -\mathrm{grad}\, V_i = -\nabla V_i$
$V_i = \sum_k$ (energies of interactions between i and all other residues k located within a cutoff distance of $R_c$ from i)
Classical molecular dynamics
• Constituent molecules obey classical laws of motion
• In MD simulation, we have to solve Newton's equation of motion
• Force calculation is the time-consuming part of the simulation
• MD simulation can be performed in various ensembles
• NVT, NPT and NVE are the ensembles widely used in MD simulations
• Both quantum and classical potentials can be used to perform MD simulation
Calculation of interaction energy
• MM total energy can be used to get the interaction energy of the ligands with biomolecules
• In order to compute the interaction energy, calculations have to be performed for the biomolecule, the ligand and the biomolecule-ligand adduct using the same force field
• $E_{int} = E_{complex} - \{E_{biomolecule} + E_{ligand}\}$
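The interaction energy is a simple difference of three total energies, as the short sketch below shows. The function name and the energy values are made up for illustration; in practice all three numbers must come from the same force field.

```python
def interaction_energy(e_complex, e_biomolecule, e_ligand):
    """E_int = E_complex - (E_biomolecule + E_ligand), all in the same units."""
    return e_complex - (e_biomolecule + e_ligand)


# Illustrative numbers only (e.g. kcal/mol); a negative result means
# the complex is lower in energy than the separated partners.
e_int = interaction_energy(e_complex=-150.0, e_biomolecule=-100.0, e_ligand=-20.0)
print(e_int)
```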
Integration of equation of motion and time step
• A key parameter in the integration algorithm is the integration time step
• The time step is related to molecular vibration
• The main limitation is imposed by the highest-frequency motion
• The vibrational period must be split into at least 8-10 segments for models to satisfy the Verlet algorithm's assumption that the velocities and accelerations are constant over the time step used
• In most organic models, the highest vibrational frequency is that of C-H stretching, whose period is of the order of $10^{-14}$ s (10 fs). Therefore the integration step should be 0.5-1 fs
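The link between vibrational period and time step can be tried out on a one-dimensional harmonic oscillator. The sketch below uses the velocity Verlet variant of the Verlet algorithm with reduced units (mass = 1, force constant = 1); the helper names and the choice of 100 segments per period are assumptions made for the example.

```python
import math


def time_step_from_period(period, segments=10):
    """Split the fastest vibrational period into equal segments."""
    return period / segments


def velocity_verlet(force, x, v, mass, dt, n_steps):
    """Integrate Newton's equation F = m a; returns a list of (x, v) pairs."""
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance position
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance velocity
        a = a_new
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Force is the negative gradient of V = 1/2 k x^2."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    return 0.5 * mass * v * v + 0.5 * k * x * x


period = 2.0 * math.pi                     # period for mass = 1, k = 1
dt = time_step_from_period(period, segments=100)
traj = velocity_verlet(harmonic_force, x=1.0, v=0.0, mass=1.0, dt=dt, n_steps=1000)
drift = max(abs(total_energy(x, v) - total_energy(1.0, 0.0)) for x, v in traj)
print(time_step_from_period(10.0), drift)
```

With a 10 fs C-H stretching period, `time_step_from_period(10.0)` gives 1 fs, the upper end of the range quoted above. Lowering `segments` in the oscillator example makes the energy fluctuation grow, which is the practical reason for keeping the step small.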
Stages and duration in MD simulation
• Dynamics simulations are usually carried out in two stages, equilibration and data collection
• The purpose of the equilibration is to prepare the system so that it comes to the most probable configuration consistent with the target temperature and pressure
• For a large system, the equilibration takes a long time because of the vast conformational space it has to search
• The best way to judge whether a model has equilibrated is to plot various thermodynamic quantities such as energy, temperature and pressure versus time
• When the system is equilibrated, these quantities fluctuate around their averages
Durations of some real molecular events
Event                        Approximate duration
Bond stretching              1-20 fs
Elastic domain modes         100 fs to several ps
Water reorientation          4 ps
Inter-domain bending         10 ps-100 ns
Globular protein tumbling    1-10 ns
Aromatic ring flipping       100 µs to several seconds
Allosteric shifts            2 µs to several seconds
Local denaturation           1 ms to several seconds
Free energy simulations
• Ability to predict binding energy
• Free energy perturbation and thermodynamic integration
• Computational demand and issues related to sampling limit the use of this technique in structure based drug design
• Free Energy equation
• An impressive example of the application of SBDD was the design of the HIV-1 protease inhibitor
De novo design of inhibitor for HIV-1 protease
De novo design
• It is a member of the aspartyl protease family with the two active sites
• The structure has a tetracoordinated water molecule that accepts two hydrogen bonds from the backbone amide hydrogens of isoleucine in the flaps
• The water donates two hydrogen bonds to the carbonyl oxygens of the inhibitor
Application of structure based drug design: HIV protease inhibitors
• The starting point is the series of X-ray structures of the enzyme and enzyme-inhibitor complex
• The enzyme is made up of two equal halves
• HIV protease is a symmetrical molecule with two equal halves and an active site near its center, like a butterfly
• For most such symmetrical molecules, both halves have a "business area," or active site, that carries out the enzyme's job
• But HIV protease has only one such active site in the center of the molecule where the two halves meet
Structure based drug design: HIV protease inhibitors
• Plugging the single active site with a small molecule makes it possible to shut down the whole enzyme and theoretically stop the virus' spread in the body
• Several inhibitors have been designed based on
– Peptidic inhibitors
– Peptidomimetic compounds
– Non-peptide inhibitors
• Further work has demonstrated the success of this approach
Some examples
• Ritonavir (trade name Norvir) is one of a class of anti-HIV drugs called protease inhibitors
• Saquinavir
• Indinavir is another example of a very potent peptidomimetic compound discovered using the elements of 3D structure and Structure Activity Relationship (SAR)
De novo design…
• The first step was a 3D database search of a subset of the Cambridge Structural Database
• The pharmacophore for this search comprised two hydrophobic groups and a hydrogen bond donor or acceptor
• The hydrophobic groups were intended to bind to the catalytic Asp residues
De novo design…
• The search yielded a hit which contained the desired elements of the pharmacophore and also had an oxygen that could replace the bound water molecule
• The benzene ring in the original compound was changed to a cyclohexanone, which was able to position substituents in a more fitting manner
• The DuPont Merck group had explored a series of peptide based diols that were potent inhibitors but had poor oral bioavailability
De novo design
• They retained the diol functionality and expanded the six-membered ring to a seven-membered diol
• The ketone was changed to a cyclic urea to enhance the hydrogen bonding to the flaps and to help synthesis
• The compound chosen for further studies, including clinical trials, was the p-hydroxymethylbenzyl derivative
Figure: 3D pharmacophore (P1 and P1' groups with an H-bond donor or acceptor; distance ranges 3.5-6.5 Å, 3.5-6.5 Å and 8.5-12 Å) and 3D hit
Figure panels: Symmetric diol docked into HIV active site; Initial idea for inhibitor; Expand ring to give diol and incorporate urea; Stereochemistry required for optimal binding; Final molecule selected for clinical trials
Host-Guest Interactions with Collagen: As molecules
Dominated by Geometrical factors and Solvent Accessible Volumes
Figure: Energy minimized structure of 24-mer collagen triple helix
Complex Formation of polyphenols at various collagen sites
Figure panels: Asparagine of T.Helix and gallic acid; Aspartic acid of T.Helix and catechin; Lysine of T.Helix and epigallocatechingallate
Binding energies of different complexes between polyphenols and triple helix
Binding site in triple helix          Gal     Cat     EGCG    PGG
9th residue Ser of C-chain (α2)       16.5    22.5    35.2    56.6
6th residue Hyp of A-chain (α1)       14.5    20.8    34.5    48.4
12th residue Lys of B-chain (α1)      19.2    23.8    37.9    41.1
21st residue Asp of A-chain (α1)      18.4    20.0    38.2    59.8
17th residue Asn of C-chain (α2)      14.1    23.7    34.3    52.8
Binding energy in kcal/mol. Gal: gallic acid; Cat: catechin; EGCG: epigallocatechingallate; PGG: pentagalloyl glucose.
Figure: Interfacial interacting volume vs binding energy of the collagen-polyphenol complex (axis: interacting interfacial volume, Å³)
Figure: Effective solvent inaccessible contact volume vs binding energy of the collagen-polyphenol complex. Inset: effective solvent inaccessible contact surface area vs binding energy of the complex
Figure: Plot of inverse of interacting interfacial volume (1/Int.Vol.) vs inverse of binding energy (1/B.E.) of the complexes
Acknowledgement
• Mr. R. Parthasarathi
• Mr. B. Madhan
• Mr. J. Padmanabhan
• Mr. M. Elango
• Mr. S. Sundar Raman
• Mr. R. Vijayraj
• CSIR & DST, GOI
Big Thank You
Others have done the work. Some have used the work. I have spoken only on their behalf.