Mats Kihlén
Head of Research Informatics
Biovitrum AB
Lecture for Molekylär bioinformatik X3, Feb 24 2004
Computational Chemistry in Drug Discovery
Overview
» The role of computational chemistry
» Basic concepts
– The pharmacophore concept - conformational analysis
– Database searching
– Virtual combinatorial chemistry
– Molecular Dynamics
– Structure based drug design
– Predicting ADME properties
– Protein modeling
» Trends and future directions
The Drug Development Process
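[Figure: timeline of the drug development process, years 0 to 10. Pre-clinical work (Mol biology, Drug Design, Medicinal chemistry, Pharmacology, ADME, Tox) leads to the Investigational New Drug Application, followed by Clinical Phase I - III, the New Drug Application and Market.]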
The role of computational chemistry
» Aid chemists in the design of compounds
– Improve affinity
– Find SAR (Structure Activity Relationship)
– Select building blocks for combinatorial libraries
– Predict permeability & solubility
» Provide protein models to biologists
– Target identification
– Genetic constructs
– Specificity guidance
» Analyse biological data
– Make sure data is captured and stored
– Coordinate data flow in projects
The pharmacophore concept
» Superimpose features from several compounds
» Use the features as search pattern
[Figure: pharmacophore built from the features Amine, Hydroxyl, Aromatic ring and Lipophilic, related by the distances D1, D2 and D3]
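The idea of using features and their distances as a search pattern can be sketched in a few lines of Python. This is a minimal illustration only: the feature names follow the slide, but the query distances, the tolerance and the coordinates are hypothetical values, and real pharmacophore tools also handle multiple conformers and feature directionality.

```python
import math

# Hypothetical query: target distances (in Å) between pairs of features.
QUERY = {
    ("amine", "hydroxyl"): 3.0,
    ("amine", "aromatic"): 4.0,
    ("hydroxyl", "aromatic"): 5.0,
}
TOLERANCE = 0.5  # Å, an assumed matching tolerance


def matches(features, query=QUERY, tol=TOLERANCE):
    """Return True if one conformer satisfies every distance in the query.

    features maps a feature name to its (x, y, z) position in this conformer.
    """
    for (a, b), target in query.items():
        if a not in features or b not in features:
            return False
        if abs(math.dist(features[a], features[b]) - target) > tol:
            return False
    return True


# Hypothetical conformers: the first fits the pattern, the second does not.
hit = {"amine": (0, 0, 0), "hydroxyl": (3, 0, 0), "aromatic": (0, 4, 0)}
miss = {"amine": (0, 0, 0), "hydroxyl": (3, 0, 0), "aromatic": (0, 8, 0)}
```

A database search would apply `matches` to every stored conformer of every compound and keep the compounds with at least one hit.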
Conformational analysis
» Identify the bioactive conformation
» Vary all torsion angles and calculate lowest internal energy
» Typically done with MacroModel
... but the binding free energy depends on:
» Solvent
» Protein ligand interactions
[Figure: the free ligand in water compared with the ligand in the protein, with the surroundings treated as “water” of dielectric constant εr = 80 and εr = 4]
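A systematic torsion scan of the kind described here can be sketched as a grid search. The energy function below is a toy threefold torsion term with an assumed barrier height, not a real force field and not what MacroModel does internally; it only illustrates "vary all torsion angles and keep the lowest internal energy", and it ignores the solvent and protein contributions mentioned on the slide.

```python
import math
from itertools import product


def torsion_energy(angles_deg, barrier=3.0):
    """Toy internal energy: a sum of threefold torsion terms.

    barrier is a hypothetical barrier height in kcal/mol.
    """
    return sum(
        0.5 * barrier * (1 + math.cos(math.radians(3 * a))) for a in angles_deg
    )


def grid_search(n_torsions, step=30, energy=torsion_energy):
    """Try every combination of torsion angles on a regular grid."""
    grid = range(0, 360, step)
    best = min(product(grid, repeat=n_torsions), key=energy)
    return best, energy(best)


best_angles, best_energy = grid_search(n_torsions=2)
```

The number of grid points grows as (360/step) to the power of the number of torsions, which is why flexible molecules quickly need smarter sampling than a full grid.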
Structure Searching
» Pharmacophores
– Finding novel scaffolds
– Refining substituents
– Typically ACD (Available Chemicals Directory), in-house databases or virtual libraries (100k - 1M compounds)
– Rigid or flexible search - fast
» High throughput docking
– Protein structure required!
– Time consuming, despite crude model
“The structure based drug design cycle”
[Figure: cycle of Design, Synthesis, Activity measurement and Structure determination of complex]
A project screening funnel
[Figure: screening funnel. Stages shown: Design, Synthesis, Protein X assay, Caco-2, Cell based assay, Co-crystallisation with Protein X, ADME, Selectivity, Mouse model. Cut-offs shown: Ki < 5 µM, Papp > 1*10^-6 cm/s]
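The two cut-offs on the funnel translate directly into a filter. In the sketch below the thresholds are the ones on the slide, while the compound names and measured values are made up for illustration.

```python
from dataclasses import dataclass

KI_MAX_UM = 5.0        # Ki < 5 µM (threshold from the slide)
PAPP_MIN_CM_S = 1e-6   # Papp > 1*10^-6 cm/s (threshold from the slide)


@dataclass
class Compound:
    name: str
    ki_um: float       # affinity in the protein assay, µM
    papp_cm_s: float   # Caco-2 apparent permeability, cm/s


def passes_funnel(c):
    """True if the compound meets both the affinity and permeability cut-offs."""
    return c.ki_um < KI_MAX_UM and c.papp_cm_s > PAPP_MIN_CM_S


# Hypothetical measurements, for illustration only.
compounds = [
    Compound("cpd-1", ki_um=0.8, papp_cm_s=5e-6),
    Compound("cpd-2", ki_um=12.0, papp_cm_s=8e-6),   # too weak
    Compound("cpd-3", ki_um=2.0, papp_cm_s=2e-7),    # poorly permeable
]
survivors = [c.name for c in compounds if passes_funnel(c)]
```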
Virtual PPARγ Libraries
Full expansion
– R1: 13 acid chlorides, R2: 128 alcohols: 1 664
– R1: 95 acid chlorides, R2: 647 alcohols: 61 465
Iterative design
– R1: 1 acid chloride, R2: 647 alcohols
– R1: 95 acid chlorides, R2: 1 alcohol
– 712
All reagents selected from ACD:
– Purity: > 95%
– Price: < $50
– MW: < 250
– Quantity: > 1 g
[Reaction scheme: a scaffold carrying OH and NH2 groups combined with alcohols R2OH and acid chlorides R1COCl]
Building libraries in Afferent
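The full-expansion numbers are simply the product of the two reagent list sizes, and the ACD selection criteria are a four-way filter. A small Python sketch, assuming reagents are described by purity, price, molecular weight and available quantity (the function and argument names are illustrative, not part of Afferent or ACD):

```python
from itertools import product


def library_size(n_r1, n_r2):
    """Size of a fully expanded two-component library."""
    return n_r1 * n_r2


def enumerate_library(acid_chlorides, alcohols):
    """Every (R1, R2) combination, i.e. one virtual product per pair."""
    return list(product(acid_chlorides, alcohols))


def reagent_ok(purity_pct, price_usd, mw, quantity_g):
    """Selection criteria from the slide: purity, price, MW and quantity."""
    return purity_pct > 95 and price_usd < 50 and mw < 250 and quantity_g > 1


small = library_size(13, 128)    # 1 664, as on the slide
large = library_size(95, 647)    # 61 465, as on the slide
```

Full expansion grows multiplicatively with the reagent lists, whereas varying one position at a time grows only additively, which is the motivation for the iterative design.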
Binding energy prediction
Molecular Dynamics Simulations
- 10-15 CPU hours per compound
- Full flexibility and solvent within simulation sphere
“As close to reality as we can get today”
The Åqvist & Medina equation:
∆Gbinding = β∆Vel + α∆Vvdw
[Figure: the ligand simulated free in water and bound to the protein in water]
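The equation can be evaluated from two simulations, one of the ligand free in water and one of the ligand bound to the solvated protein. In the sketch below each ∆V is taken as the difference between the average ligand-surroundings interaction energy in the bound and free states; α and β are empirical coefficients that must be supplied by the user, and the energies in the example are hypothetical.

```python
from statistics import mean


def lie_binding_energy(v_el_bound, v_el_free, v_vdw_bound, v_vdw_free,
                       alpha, beta):
    """Binding free energy estimate: beta * dV_el + alpha * dV_vdw.

    Each argument v_* is a sequence of ligand-surroundings interaction
    energies sampled along a trajectory (same energy unit throughout).
    """
    d_el = mean(v_el_bound) - mean(v_el_free)
    d_vdw = mean(v_vdw_bound) - mean(v_vdw_free)
    return beta * d_el + alpha * d_vdw


# Hypothetical sampled energies (kcal/mol) and placeholder coefficients.
dg = lie_binding_energy(
    v_el_bound=[-10.0, -12.0], v_el_free=[-8.0, -8.0],
    v_vdw_bound=[-30.0, -30.0], v_vdw_free=[-20.0, -20.0],
    alpha=0.2, beta=0.5,
)
```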
High Throughput Docking
Applications:
» Selection of compounds for screening
– Smaller number of compounds to test
– Possible to cover compounds not in the compound collection
» Selection of reagents for focussed libraries
– Make large virtual libraries, but synthesise only the most promising compounds
» Virtual ”SAR by NMR”
– Identification of small binding fragments which could be joined to create potent compounds
Several weak binders can be turned into one strong
» Linking two weak binders may result in a ligand whose binding energy is the sum of theirs, i.e. whose binding constant is the product of their binding constants.
» Case study: Combinatorial linking of two weak c-Src tyrosine kinase ligands gave a 64 nM binder.
Each fragment showed approx. 70% inhibition at 500 µM.
Maly D, Choong I, Ellman J, “Combinatorial target-guided ligand assembly: Identification of potent subtype-selective c-Src inhibitors”, PNAS 97 (2000)
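A back-of-the-envelope check of the case study: if binding free energies add on linking, dissociation constants (in molar units) multiply. The sketch below makes two simplifying assumptions that are not from the lecture: inhibition follows a simple one-site model, so an IC50 can be estimated from a single inhibition value, and that IC50 is used as a stand-in for Kd. Linker strain and entropy effects are ignored.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal/(mol*K)
T = 298.0           # temperature in K (assumed)


def ic50_from_inhibition(conc, fraction_inhibited):
    """One-site model: inhibition = conc / (conc + IC50)."""
    return conc * (1 - fraction_inhibited) / fraction_inhibited


def delta_g(kd_molar):
    """Binding free energy for a dissociation constant given in mol/l."""
    return R_KCAL * T * math.log(kd_molar)


# Each fragment: about 70% inhibition at 500 µM (numbers from the slide).
frag_kd = ic50_from_inhibition(500e-6, 0.70)   # mol/l, roughly 2e-4
linked_kd = frag_kd * frag_kd                  # ideal additivity of energies
```

Under these assumptions the estimate comes out at roughly 46 nM, the same order of magnitude as the 64 nM binder reported in the case study.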
Selection for screening
» Dock public compound databases as starting point for compound acquisition or screening:
– Example: ACDscreen - 1.2M compounds
» Pre-filtering necessary
– Reduce computational needs
– Remove junk, e.g. SLN-based filters
– Require known features
» Good for small compound collections
Pre-filtering
Implemented as Sybyl substructure filters
» Versatile syntax, high capacity filters
» Definition of unwanted groups as SLNs
[Figure: example structures of unwanted groups]
# Sulphonyl halides
S(=O)(=O)Hal
# Acyl halides
C(=O)Hal
# Perhalo ketones
CC(=O)C(Hal)(Hal)Hal
# Sulphonate esters
O=S(=O)OC
# Phosphonate esters
O=P(=O)OC
# Alpha halo carbonyl compounds
O=CCAny[is=Cl,Br,I]
# Het-het single bond but not N-5 ring-heterocycles or sulphonamides
Any-S[not=S=O]-Het-Any
Any[is=N,O,P;not=N*[1]~Any~Any~Any~Any~@1]-Any[is=S,N,O,P;not=S*=O]-Any
Docking method used at Biovitrum
» Fixed protein, fully flexible ligands
– Fails if induced fit
– MC generation of conformers and positions
– No initial bias (positions, restraints etc)
» Docks e.g. PTP1B and PPARγ binders close to crystal structures, but...
» Docking and scoring are different things!
» Fully automated procedure using ICM or GLIDE
» Capacity ~40k compounds per day using 30 CPUs
ICM docking of troglitazone, rosiglitazone, pioglitazone and PNU91325 into PPARγ ligand binding domain.
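The quoted capacity can be turned into a per-compound cost and a turnaround time. The sketch uses only numbers from the slides (about 40k compounds per day on 30 CPUs, and the 1.2M-compound ACDscreen database) and assumes the CPUs are fully used around the clock.

```python
SECONDS_PER_DAY = 86_400


def cpu_seconds_per_compound(compounds_per_day, n_cpus):
    """Average CPU time spent on one compound."""
    return n_cpus * SECONDS_PER_DAY / compounds_per_day


def days_to_dock(n_compounds, compounds_per_day):
    """Wall-clock days needed to dock a whole database."""
    return n_compounds / compounds_per_day


per_compound = cpu_seconds_per_compound(40_000, 30)   # about 65 CPU-seconds
acd_days = days_to_dock(1_200_000, 40_000)            # about 30 days
```

A month for an unfiltered public database is one practical reason why pre-filtering comes before docking.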
Docking test - PTP1B inhibitors
Prediction of actives vs inactives
[Figure: bar chart, 0% to 100%, of true vs. false predictions for active and inactive compounds]
» Conservative score threshold (-40), n = 107
» Activity threshold: < 50 µM
» Correct prediction of actives: 82%
» Correct prediction of inactives: 26%
» 55 random drugs: 100% predicted inactive
» Relatively close analogues. Hard to explain from structure why some are inactive.
Green compound from crystal complex vs white docked analogue
Crucial interaction with Asp48
“Phosphate mimetic”
“Greasy C-term patch”
NovoNordisk Xtal vs PNU Xtal
“Structure-Based Design of a Low Molecular Weight, Nonphosphorus, Nonpeptide, and Highly Selective Inhibitor of Protein-tyrosine Phosphatase 1B”, Iversen et al, J Biol Chem. 2000
PDB code: 1ECV
Docked “Novo #5” vs PNU
Asp48
Novo #5 vs 1ECV
Created virtual library from 250 aldehydes to explore nearby pocket
Virtual library hits
Reaching 2nd pTyr site
Structure Based Focussing
Combine the pharmacophore concept with high throughput docking:
– Align with a docked pharmacophore
– Score against the surface
[Figure: ligand and protein]
Virtual screening summary
» ICM & GLIDE are fast and robust enough to be used as standard docking tools
» Can clearly enrich screening sets of diverse compounds
» Not reliable enough to predict small differences in binding affinity
» More work should be done to improve scoring
Predicting ADME properties
Typical aspects:
» Cell permeability
» Aqueous solubility
» Liver enzymes: inhibitors or substrates?
» Protein binding
» Physical properties vs. biological interactions
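A model of passive diffusion
[Figure: in the first “water” compartment the compound is divided into [Cneutral] and [Ccharged] according to pKa and pH; passage through the “lipids” depends on ∆G0 and size; in the second “water” compartment [C] = 0]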
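The split between neutral and charged species follows from pKa and pH through the Henderson-Hasselbalch relation. A minimal sketch for a compound with a single ionisable group (the function name and the example pKa values are illustrative, not from the lecture):

```python
def neutral_fraction(pka, ph, kind):
    """Fraction of the uncharged species for a monoprotic acid or base."""
    if kind == "acid":
        # An acid is neutral when protonated: favoured at pH below pKa.
        return 1.0 / (1.0 + 10 ** (ph - pka))
    if kind == "base":
        # A base is neutral when deprotonated: favoured at pH above pKa.
        return 1.0 / (1.0 + 10 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical amine with pKa 9.4 at pH 7.4: about 1% is neutral.
amine_neutral = neutral_fraction(9.4, 7.4, "base")
```

In a passive diffusion model of this kind it is usually the neutral fraction that is assumed to enter the lipid phase, so a two-unit gap between pKa and pH already reduces the available concentration about a hundredfold.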
Predicting absorption
[Figure: predicted vs. observed fraction absorbed (FApred% vs. FA%, both axes 0 to 100) for 20 diverse compounds with known absorption in humans (from Palm et al 1997)]
Model: a logarithmic transform of FA% expressed as a linear function of PSA and ASA with coefficients α, β and γ.
Predicting aqueous solubility
[Figure: experimental vs. predicted logS for 833 mixed compounds, axes from about -12 to 5; PLS model fitted in Simca-P 8.0]
Npc = 4, Ntraining = 91, Ntest = 833, RMSEP = 0.87
PLS model based on 3D molecular descriptors calculated by Cerius2
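RMSEP is the root mean square error of prediction on the test set, i.e. on compounds that were not used to fit the model. A minimal sketch of the calculation (the function name is illustrative; Simca-P computes this internally):

```python
import math


def rmsep(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

With logS as the response, an RMSEP of 0.87 means that a typical prediction is off by a little under one log unit, i.e. somewhat less than a factor of ten in solubility.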
Blood Brain Barrier model
» In-house data from 75 compounds
» High level descriptors from ab initio calculations
» PLS statistics
[Figure: predicted vs. observed log(B/P), both axes -2 to 2, points labelled by compound number; PLS model fitted in Simca-P 8.0, work set train_50]
Npc = 2, n = 50, R2 = 0.73, Q2 = 0.61, RMSEE = 0.510595
Pharmacophore model for 2D6 inhibitors
» 3D QSAR model built in Catalyst
» 36 compounds from Lilly paper
» Correlation fitted vs. observed Km: 0.93
» Correctly predicted 82% of P&U compounds to within 1 log unit
» Activities 0.0046–1000 µM
Protein Modeling
» Models usually too poor for SBDD
» Sufficient for selectivity guidance
» Increasing demand due to Bioinformatics revolution
– Auto-building and classification of structural domains
– If family identified: select initial compound set for testing
» Major tool: ICM
– Multiple sequence alignment
– Structure optimisation with Monte Carlo
ZACRP7   QGDPGLPGVCRCGSIVLKSAFSVGITTSYPEER--LPI
ZACRP2   KGEPGLPGPCSCGSGHTKSAFSVAVTKSYPRER--LPI
1c28a_a  ---------------MYRSAFSVGLETRVTVPN--VPI
huzsig39 RSESRVP----------------------PPSD--APL
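One quick way to read such an alignment is the pairwise sequence identity over the aligned columns. The sketch below uses the four aligned fragments from the slide; skipping every column that contains a gap in either sequence is one common convention among several, chosen here for simplicity.

```python
ALIGNMENT = {
    "ZACRP7":   "QGDPGLPGVCRCGSIVLKSAFSVGITTSYPEER--LPI",
    "ZACRP2":   "KGEPGLPGPCSCGSGHTKSAFSVAVTKSYPRER--LPI",
    "1c28a_a":  "---------------MYRSAFSVGLETRVTVPN--VPI",
    "huzsig39": "RSESRVP----------------------PPSD--APL",
}


def identity(a, b):
    """Fraction of identical residues over columns without gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


zacrp_identity = identity(ALIGNMENT["ZACRP7"], ALIGNMENT["ZACRP2"])
template_identity = identity(ALIGNMENT["ZACRP7"], ALIGNMENT["1c28a_a"])
```

The identity to a sequence of known structure (here the entry named 1c28a_a) is a rough indicator of how far a homology model built on it can be trusted.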
Trends & Guesses
» More parallel synthesis and combi.chem.
– Larger data volumes for theoretical evaluation
» Earlier ADME studies
– Prediction of physical properties
– Metabolism models
» Faster project turnover
» Need for efficient data management
» Novel targets
» Specialisation on target classes vs. therapeutic areas
» Virtual screening as primary source for hits
» Small companies without large compound collections
New tasks for computational chemistry!