Using “Consensus clustering” to help understand how multiple ligands might bind to multiple target sites. How this helps to choose the Right Query in Shape-Based Virtual Screening
Violeta Pérez-Nueno, David Ritchie
Orpailleur Team, INRIA Nancy - Grand Est
LORIA (Laboratoire Lorrain de Recherche en Informatique et ses Applications), INRIA Nancy – Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France
Presentation Overview
1. Calculating SH Shapes
2. Calculating SH Consensus Shapes and their performance in VS
4. All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
5. Choosing the Right Query in DUD Shape-Based VS
6. Examples of how Consensus Clustering can identify Multi-Site Pockets
1. Calculating SH Shapes
Spherical Harmonic Surfaces
Surface shapes are represented as radial distance expansions of the molecular surface with respect to the center of the molecule.
• Real SHs:
• Coefficients:
• Encode radial distances from origin as SH series…
• Solve coefficients by numerical integration…
Ritchie, D.W. and Kemp, G.J.L. J. Comp. Chem. 1999, 20, 383–395.
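The radial expansion described above can be illustrated with a minimal sketch. It assumes the standard orthonormal real spherical harmonics y_lm, a radial function r(θ,φ) supplied by the caller, and a Gauss-Legendre quadrature grid; the function names (`assoc_legendre`, `real_sh`, `sh_coefficients`, `sh_surface`) and the grid size are illustrative choices, not taken from the slides or from ParaFit.

```python
import math
import numpy as np

def assoc_legendre(l, m, x):
    """Associated Legendre function P_l^m(x), 0 <= m <= l, by upward recursion."""
    pmm = np.ones_like(x)
    if m > 0:
        somx2 = np.sqrt((1.0 - x) * (1.0 + x))
        fact = 1.0
        for _ in range(m):
            pmm = -pmm * fact * somx2
            fact += 2.0
    if l == m:
        return pmm
    pmmp1 = x * (2 * m + 1) * pmm
    for ll in range(m + 2, l + 1):
        pll = ((2 * ll - 1) * x * pmmp1 - (ll + m - 1) * pmm) / (ll - m)
        pmm, pmmp1 = pmmp1, pll
    return pmmp1

def real_sh(l, m, theta, phi):
    """Orthonormal real spherical harmonic y_lm(theta, phi)."""
    am = abs(m)
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - am) / math.factorial(l + am))
    p = assoc_legendre(l, am, np.cos(theta))
    if m == 0:
        return norm * p
    if m > 0:
        return math.sqrt(2.0) * norm * p * np.cos(m * phi)
    return math.sqrt(2.0) * norm * p * np.sin(am * phi)

def sh_coefficients(radius_fn, order, n=32):
    """a_lm = integral of r(theta, phi) * y_lm(theta, phi) over the sphere."""
    x, wx = np.polynomial.legendre.leggauss(n)   # nodes in cos(theta)
    theta = np.arccos(x)
    phi = np.arange(2 * n) * (math.pi / n)       # uniform grid in phi
    T, P = np.meshgrid(theta, phi, indexing="ij")
    weights = wx[:, None] * (math.pi / n)
    r = radius_fn(T, P)
    return {(l, m): float(np.sum(r * real_sh(l, m, T, P) * weights))
            for l in range(order + 1) for m in range(-l, l + 1)}

def sh_surface(coeffs, theta, phi):
    """Rebuild r(theta, phi) from the SH series."""
    return sum(a * real_sh(l, m, theta, phi) for (l, m), a in coeffs.items())

# Hypothetical test surface: a slightly pear-shaped sphere.
coeffs = sh_coefficients(lambda t, p: 1.0 + 0.3 * np.cos(t), order=3)
```

For a real molecule, `radius_fn` would return the distance from the molecular center to the surface along each direction; a higher `order` captures finer surface detail.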
ParaFit
ParaFit calculates superpositions between pairs of molecules by exploiting the special rotational properties of the SH functions.
• Distance:
• Orthogonality:
• Rotation:
• Carbo:
• Hodgkin:
• Tanimoto:
• Multi-property:
Ritchie, D.W. and Kemp, G.J.L. J. Comp. Chem. 1999, 20, 383–395.
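As a rough illustration of the scores named on this slide: because the SH basis is orthonormal, the overlap integral of two surfaces reduces to a dot product of their coefficient vectors. The sketch below uses the standard textbook definitions of the Carbo, Hodgkin and Tanimoto indices and assumes the two coefficient vectors are already superposed (the rotational search itself is not shown). The function names are illustrative and are not ParaFit's API.

```python
import math

def dot(a, b):
    """Overlap of two SH expansions given as flat coefficient lists."""
    return sum(x * y for x, y in zip(a, b))

def sh_distance(a, b):
    """Squared L2 distance between the two expansions."""
    return dot(a, a) + dot(b, b) - 2.0 * dot(a, b)

def carbo(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def hodgkin(a, b):
    return 2.0 * dot(a, b) / (dot(a, a) + dot(b, b))

def tanimoto(a, b):
    return dot(a, b) / (dot(a, a) + dot(b, b) - dot(a, b))

# Hypothetical coefficient vectors: same shape, different size.
a = [2.0, 0.0]
b = [1.0, 0.0]
scores = (carbo(a, b), hodgkin(a, b), tanimoto(a, b))
```

Carbo is insensitive to overall magnitude, whereas Hodgkin and Tanimoto penalize a size mismatch, which is why the three scores differ for the example vectors.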
2. Calculating SH Consensus Shapes and their performance in VS
1. Do all-v-all SH comparison
2. Find best pair-wise match
3. Calculate SH average of pair
4. Treat average as new seed
5. Superpose all onto seed
6. Compute new average seed
7. Rotate all onto new seed
8. Iterate until convergence...
9. Result = SH pseudo-molecule
Pérez-Nueno et al. J. Chem. Inf. Model. 2008, 48, 2146–2165.
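The iteration listed above can be sketched as follows. This is a simplified illustration, not the published implementation. Shapes are flat SH coefficient lists, similarity is assumed to be a Tanimoto score, and the rotational superposition is represented by a hypothetical `superpose` callback that defaults to the identity (i.e. shapes are assumed to be pre-aligned).

```python
import math

def tanimoto(a, b):
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    return ab / (aa + bb - ab)

def average(vectors):
    """Coefficient-wise mean of several SH expansions."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def consensus_shape(shapes, superpose=lambda shape, seed: shape,
                    tol=1e-9, max_iter=100):
    # Steps 1-2: all-v-all comparison, keep the best-matching pair.
    pairs = ((i, j) for i in range(len(shapes))
             for j in range(i + 1, len(shapes)))
    i, j = max(pairs, key=lambda ij: tanimoto(shapes[ij[0]], shapes[ij[1]]))
    # Steps 3-4: the average of that pair becomes the first seed.
    seed = average([shapes[i], shapes[j]])
    # Steps 5-8: superpose everything onto the seed, re-average, repeat.
    for _ in range(max_iter):
        aligned = [superpose(s, seed) for s in shapes]
        new_seed = average(aligned)
        shift = math.sqrt(sum((x - y) ** 2 for x, y in zip(new_seed, seed)))
        seed = new_seed
        if shift < tol:
            break
    # Step 9: the converged seed is the SH pseudo-molecule.
    return seed

# Hypothetical two-coefficient shapes.
pseudo = consensus_shape([[1.0, 0.0], [1.0, 0.2], [0.0, 1.0]])
```

With the identity `superpose` the loop converges after two passes to the plain mean; a real superposition step would change the aligned coefficients on each pass, which is what makes the iteration necessary.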
Virtual Screening Datasets
The consensus approach was first validated in a retrospective VS of two databases containing CCR5 and CXCR4 active compounds from the literature, and presumed inactive compounds from Maybridge.
4) KRH derivatives
5) Dipicolyl amine zinc(II) complexes
6) Other
PLUS… 4696 inactive compounds from the Maybridge Screening Collection with similar 1D properties to the actives
SH Consensus Shapes of the Three Most Active Inhibitors
Pseudo-molecules were obtained from the consensus shapes of the most active molecules for both CXCR4 and CCR5 targets, and used as VS queries against the database of known actives and decoys.
Exploring CCR5 Multiple Binding Sites: Clustering the 424 CCR5 Ligands
• Ward’s hierarchical clustering of chemical fingerprints
• We then used Kelley’s method to find the optimal number of clusters (16)
• These were manually merged to 10 groups based on known CCR5 families
• SH consensus shapes were calculated for the 10 groups
• These were then compared in ParaFit (all-vs-all)
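A minimal sketch of the Ward's clustering step, under stated assumptions: fingerprints are treated as numeric vectors in Euclidean space, the number of clusters `k` is supplied by the caller (Kelley's method for choosing it is not implemented here), and the naive pairwise search is only suitable for small sets. The function name `ward_clusters` is illustrative.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters.

    Returns a list of clusters, each a list of indices into `points`.
    """
    dim = len(points[0])
    clusters = [[i] for i in range(len(points))]

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]

    def ward_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs,
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical 4-bit fingerprints for four ligands.
fingerprints = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
groups = ward_clusters(fingerprints, k=2)
```

The same routine could be applied a second time to the consensus shapes of each group, which is the idea behind a further round of clustering.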
From Consensus Shapes to Super-Consensus Clusters
• Another round of Ward’s clustering proposed four super-consensus clusters
Using Super-Consensus Shapes as VS Queries
• Each SC pseudo-molecule was used as a VS query
• NB. merging SC shapes significantly worsens the AUCs…
• SC queries => CCR5 ligands form no fewer than FOUR groups
ROC plot panels: VS Super Consensus A, VS Super Consensus B, VS Super Consensus C, VS Super Consensus D
AUC=0.785, AUC=0.413, AUC=0.905, AUC=0.630
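For readers unfamiliar with the AUC figures quoted here, the area under a ROC curve can be computed directly from a ranked screening list. The sketch below uses the pairwise (Mann-Whitney) formulation: the AUC is the fraction of active/decoy pairs in which the active scores higher, with ties counted as half. The scores and labels are made-up illustrative data, not values from the study.

```python
def roc_auc(scores, is_active):
    """Area under the ROC curve for a scored list of actives and decoys."""
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0      # active ranked above decoy
            elif a == d:
                wins += 0.5      # tie counts as half
    return wins / (len(actives) * len(decoys))

# Hypothetical similarity scores of four database compounds to one query.
example_auc = roc_auc([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
```

An AUC of 0.5 corresponds to random ranking, so a query scoring below 0.5 ranks decoys ahead of actives on average.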
Hex Blind Docking of SC Pseudo-Molecules to CCR5
• 3D pseudo-molecules were created as the union of all superposed ligands in each SC family for docking in Hex
• SC-A docks to Site-1 (TMs 1, 2, 3, 7)
• SC-C docks to Site-2 (TMs 3, 5, 6)
• B and D dock to Site-3 (TMs 3, 6, 7)
Pérez-Nueno et al. J. Chem. Inf. Model. 2008, 48, 2146–2165.
Hex Docking VS w.r.t. Three CCR5 Sub-Sites
• To confirm that the SC shapes were properly matched to their predicted target sites, the three proposed binding sites were treated as if they were separate targets for docking-based VS:
  • SC-As treated as actives for Site 1 (SCs B, C, D treated as inactives)
  • SC-Cs treated as actives for Site 2 (SCs A, B, D treated as inactives)
  • SC-B/Ds assumed active for Site 3 (SCs A and C treated as inactives)
• As before, merging SCs worsens the AUCs…
• SC docking => no fewer than THREE CCR5 pocket sub-sites
A -> Site-1, C -> Site-2, B,D -> Site-3
ROC plot panels: Docking VS onto Site 1, Docking VS onto Site 2, Docking VS onto Site 3
AUC=0.834, AUC=0.963, AUC=0.846
Pérez-Nueno et al. J. Chem. Inf. Model. 2008, 48, 2146–2165.
4. All-against-all DUD Dataset Shape Clustering vs DUD Cross Docking
Cross-docking on DUD: the work of Huang et al.
Huang et al. J. Med. Chem. 2006, 49, 6789-6801.
Cross-shape matching on DUD
• All-against-all shape comparison for the ligands of the 40 DUD targets
• Clustering of ligands by shape into 40 groups
• Select a threshold that emphasizes the overall similarity to the cross-docking experiment
• Count the membership over the threshold for ligands belonging to each of the targets for all the 40 clusters and sum memberships.
• Normalize this sum by the total number of ligands for each target. This gives the cross-shape matching score.
[Figure: membership of each target's ligands in clusters 1 to 40]
Example for ace: over all the clusters in which ace is over the threshold, sum the normalized number of ligands of each of the other targets that are over the threshold.
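One possible reading of the cross-shape matching score is sketched below. The slide text is terse, so the following are assumptions rather than the authors' definitions. The threshold is applied to the fraction of a target's ligands that fall in a cluster. The score of target A against target B is the sum, over clusters where both are over the threshold, of B's normalized membership. The function name, argument layout and sample data are hypothetical.

```python
from collections import Counter, defaultdict

def cross_shape_scores(cluster_of, target_of, threshold):
    """cluster_of / target_of: dicts mapping ligand id -> cluster id / target.

    Returns score[a][b]: summed normalized membership of target b in the
    clusters where both a and b are over the threshold (assumed reading).
    """
    totals = Counter(target_of.values())          # ligands per target
    counts = defaultdict(Counter)                 # cluster -> target -> count
    for ligand, cluster in cluster_of.items():
        counts[cluster][target_of[ligand]] += 1

    targets = sorted(totals)
    score = {a: {b: 0.0 for b in targets} for a in targets}
    for per_target in counts.values():
        # Normalize each membership by the target's total number of ligands.
        frac = {t: n / totals[t] for t, n in per_target.items()}
        over = [t for t, f in frac.items() if f >= threshold]
        for a in over:
            for b in over:
                score[a][b] += frac[b]
    return score

# Hypothetical toy data: four ace ligands and two thrombin ligands.
target_of = {"a1": "ace", "a2": "ace", "a3": "ace", "a4": "ace",
             "t1": "thrombin", "t2": "thrombin"}
cluster_of = {"a1": 1, "a2": 1, "a3": 1, "a4": 2, "t1": 1, "t2": 2}
scores = cross_shape_scores(cluster_of, target_of, threshold=0.5)
```

Off-diagonal entries play the role of the cross-docking matrix: a high `score[a][b]` means ligands of target b frequently share shape clusters with ligands of target a.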
Ligands for a given target are split into 2 or 3 sub-groups, which could suggest they might bind to different sub-sites of the same target
Example: p38 ligands split into C36, C30 and C29
Ligands from different targets are grouped together. This suggests that they could bind to the same targets.
Evidence of cox2 cross-docked with hsp90
Evidence of sahh cross-docked with tk and ada
Computation time: 8.45 h for shape clustering vs several months for cross docking (from 20 h to 12 d per target)
Huang et al. J. Med. Chem. 2006, 49, 6789-6801.
5. Choosing the Right Query in DUD Shape-Based VS
Queries used:
• PARAFIT_B : PARAFIT bound conformation.
• PARAFIT_CM : PARAFIT consensus molecule.
• PARAFIT_C1_CM - PARAFIT_C3_CM: consensus molecules for SH clusters 1, 2, 3.
• PARAFIT_RCM: PARAFIT real center molecule closest to consensus.
• PARAFIT_C1_RCM - PARAFIT_C3_RCM: real center molecules for clusters 1, 2, 3.
• PARAFIT_SHEF_C: best SHEF molecule that fits pocket as PARAFIT query.
• ROCS_B: ROCS bound conformation.
• ROCS_RCM: real center molecule from PARAFIT as ROCS query.
• ROCS_C1_RCM - ROCS_C3_RCM: real PARAFIT center molecules as ROCS queries.
• ROCS_SHEF_C: best SHEF molecule that fits pocket as ROCS query.
• SHEF shape-based docking.
• GOLD conventional docking.
[Figures: ROC plots and bar graphs]
6. Examples of how Consensus Clustering can identify Multi-Site Pockets
Multi-Site Pockets: p38
• 353 p38 DUD ligands clustered using Ward’s hierarchical clustering of chemical fingerprints
• Kelley’s method to find the optimal number of clusters (15)
• SH consensus shapes were calculated for the 15 groups
• These were then compared in ParaFit (all-vs-all)
• Another round of Ward’s clustering proposed three super-consensus clusters
• Agence Nationale de la Recherche (ANR-08-CEXC-017-01)
• Generalitat de Catalunya-DURSI (FI2008 and BE-DGR2009)
Thank you!
• From MOPAC or VAMP, calculate:
  – Density contours of 2×10⁻⁴ e/Å³ (i.e. approximately the SAS)
  – MEP – electrostatic potential
  – IEL – ionization energy
  – EAL – electron affinity
  – αL – polarizability
• Encode as Spherical Harmonic expansions to order L=15…
Lin, J.-H. and Clark, T. J. Chem. Inf. Model. 2005, 45, 1010–1016.