Virtual Screening with Topomer CoMFA
Dick Cramer
“Brave New World of QSAR”, ACS
August 19, 2002
Outline
• Topomers (similarity searching)
  • Method and strengths
  • Prospective “lead-hopping” results
• Topomer CoMFA
  • Methodology
  • Retrospective computational validation
  • Prospective results
Seeking and Developing “Islands of Activity” in “Chemistry Space”
• “Lead Discovery” = find land
• “Lead Explosion” = define and claim island
• “Lead Hopping” = find another island
• “Lead Optimization” = find high enough peak
• But, instead of the two latitude / longitude dimensions of geographical exploration, there are an exponentially enormous number of ways to describe “chemistry space”.
  • Poor descriptions destroy islands (MW)
  • Topomers provide an excellent “compass” for drug discovery
(Figure: islands labeled β-lactams, H2 blockers, ACE inhibitors)
Topomers are novel 3D models
• Only fragments have topomers!
  • Whole similarity = sum of fragment similarities
• How topomers handle 3D
  • Structures oriented
    • Overlay of open valences
  • Single conformer
    • CONCORD 3D structures
    • Side-chain & chiral via rules
• Topomer similarity is in:
  • Steric fields (as in CoMFA)
    • Binned values
    • Rot.bond-attenuated atomic fields
  • Feature matching (as in conventional 3D searching)
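The “whole similarity = sum of fragment similarities” idea can be sketched in a few lines. This is a minimal illustration, not the Tripos implementation: the root-sum-of-squares field difference, the function names, and the four-point field values are all assumptions made for the example.

```python
import math

def fragment_distance(field_a, field_b):
    """Root-sum-of-squares difference between two steric field vectors
    sampled on the same grid (assumed metric, for illustration)."""
    if len(field_a) != len(field_b):
        raise ValueError("fields must be sampled on the same grid")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(field_a, field_b)))

def molecule_distance(frags_a, frags_b):
    """Sum the per-fragment distances, pairing fragments by position."""
    if len(frags_a) != len(frags_b):
        raise ValueError("molecules must have the same number of fragments")
    return sum(fragment_distance(a, b) for a, b in zip(frags_a, frags_b))

# Hypothetical molecules, each split into two fragments with 4-point fields.
mol_a = [[0.0, 1.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0]]
mol_b = [[0.0, 1.0, 2.0, 0.0], [0.0, 4.0, 0.0, 0.0]]

total = molecule_distance(mol_a, mol_b)
print(total)  # first fragments are identical; the second pair differs
```

Because the total is a plain sum, each fragment position can be scored independently of the others.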
© 1999 Tripos, Inc.
Generating a Topomer
A => B: Attach “anchor group”; generate 3D model; overlap attachment bond
B => C: Starting at attachment bond: adjust chirality; select torsion end-points and adjust dihedral angles
(Figure: panels A, B, C, D; labels: attachment bond, chiral atom, free valence)
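The “overlap attachment bond” step amounts to a rigid-body move that puts every fragment's attachment bond in the same place. A minimal sketch follows, assuming (purely for illustration) that the bond's first atom goes to the origin and the bond is laid along the +x axis. The function names and coordinates are hypothetical, and the chirality and torsion rules are not covered.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def rotate(p, axis, cos_t, sin_t):
    """Rodrigues rotation of point p about a unit axis."""
    k_x_p = cross(axis, p)
    k_dot_p = dot(axis, p)
    return [p[i] * cos_t + k_x_p[i] * sin_t + axis[i] * k_dot_p * (1.0 - cos_t)
            for i in range(3)]

def overlay_attachment_bond(coords, start, end):
    """Translate atom `start` to the origin and rotate atom `end` onto +x."""
    shifted = [sub(p, coords[start]) for p in coords]
    bond = shifted[end]
    length = math.sqrt(dot(bond, bond))
    if length == 0.0:
        raise ValueError("attachment bond has zero length")
    unit = [c / length for c in bond]
    target = [1.0, 0.0, 0.0]
    axis = cross(unit, target)
    sin_t = math.sqrt(dot(axis, axis))
    cos_t = dot(unit, target)
    if sin_t < 1e-12:
        if cos_t > 0.0:
            return shifted                      # already along +x
        # Antiparallel: a 180-degree turn about the y axis.
        return [[-p[0], p[1], -p[2]] for p in shifted]
    axis = [c / sin_t for c in axis]
    return [rotate(p, axis, cos_t, sin_t) for p in shifted]

# Hypothetical three-atom fragment; atoms 0 and 1 form the attachment bond.
fragment = [[1.0, 1.0, 1.0], [1.0, 3.0, 1.0], [2.0, 3.0, 1.0]]
aligned = overlay_attachment_bond(fragment, start=0, end=1)
```

Once all fragments share this frame, their fields can be compared point by point without any further alignment.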
Topomer Searching in Drug Discovery: Summary
• Many distinct advantages
  • Speed, to address the vastness of chemistry space* (1000’s of ‘CAS units’ per second!)
  • Has yet to fail in identifying promising and patentable biological activity
  • Novelty of hits
  • Accessibility of hits
  • Physical interpretability of model
• Exists in either of two flavors
  • ChemSpace (virtual libraries, ~10^13)
  • dbtop (conventional collections, ~10^6)
• No plans exist for distributing either flavor
*when searching virtual libraries (ChemSpace)
Why topomer searching is so fast
R1: ordered by sorted topomer distance
R2: ordered by sorted topomer distance
R3: ordered by sorted topomer distance
Vast Virtual Library
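One way to see why sorted per-position lists make the search fast: if a product's distance is the sum of its fragment distances, a loop over a sorted list can stop as soon as the running total passes the cutoff. The sketch below assumes that additive model. The reagent names, distances, and cutoff are invented, and this is not the ChemSpace code.

```python
def search_within_cutoff(sorted_lists, cutoff):
    """Return all products whose summed fragment distance is <= cutoff.

    sorted_lists: one list per R position of (distance, reagent) pairs,
    sorted by ascending distance to the query fragment at that position.
    """
    hits = []

    def extend(position, chosen, total):
        if position == len(sorted_lists):
            hits.append((total, tuple(chosen)))
            return
        for dist, name in sorted_lists[position]:
            if total + dist > cutoff:
                break  # list is sorted, so every later reagent is farther
            extend(position + 1, chosen + [name], total + dist)

    extend(0, [], 0.0)
    return sorted(hits)

# Hypothetical 4 x 3 x 3 library (36 products).
r1 = [(0.0, "a1"), (40.0, "a2"), (90.0, "a3"), (200.0, "a4")]
r2 = [(10.0, "b1"), (60.0, "b2"), (150.0, "b3")]
r3 = [(5.0, "c1"), (80.0, "c2"), (300.0, "c3")]

hits = search_within_cutoff([r1, r2, r3], cutoff=100.0)
for total, product in hits:
    print(total, product)
```

Only a handful of the 36 products are ever visited; with thousands of reagents per position the pruned fraction grows accordingly.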
Discovery projects using topomer-similarity-driven “Lead-Hopping”
• Arena (structure originally found is still the lead)
• BMS (published validation, see references)
• Lipha (seven lead hop trials, five successes)
• LeadQuest screening (partially disclosable)
Recent prospective topomer similarity results
• 7 query structures having different activities chosen from recent patents (WD Alert)
• 257 topomerically most (but not very) similar structures among 80K LeadQuest cpds ((257/7)/80K = 0.05% per query) were selected (by dbtop) and tested at 10 or 100 µM
• Screening: >50%: 37 (14%); >30%: 56 (22%)
• IC50’s: 25 cpds < 30 µM, for 5 of 7 query structures
• Active structures are clear “lead hops” (only 1 homologue)
(Figure: queries and actives found; active structures are being followed up and so currently may be viewed only upon execution of a CDA)
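The selection fraction and hit rates quoted on this slide can be checked with a few lines of arithmetic. Reading the 0.05% as the average number of structures selected per query divided by the collection size is an assumption; the variable names are illustrative.

```python
queries = 7
selected = 257
collection = 80_000

per_query = selected / queries               # about 36.7 structures per query
fraction_selected = per_query / collection   # about 0.00046, i.e. ~0.05%

hit_rate_50 = 37 / selected                  # structures scoring >50%
hit_rate_30 = 56 / selected                  # structures scoring >30%

print(f"selected per query: {per_query:.1f}")
print(f"fraction of collection per query: {fraction_selected:.2%}")
print(f">50%: {hit_rate_50:.0%}, >30%: {hit_rate_30:.0%}")
```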
The Paradoxical Limitation of Similarity Selection
As similarity to an active compound decreases:
• activity usually decreases
• but sometimes increases
Similarity selection => change is bad
BUT ... successful lead optimization (µM => nM potency) requires changes that help!
Such changes are discovered by (Q)SAR
(Figure label: Receptor)
CoMFA is a (3D-Q)SAR method. Quickly, how does it work?
(Figure labels: QSAR Table = SYBYL MSS; Bio; PLS; QSAR equation; Contour Maps; Predictions)
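The PLS step that turns the QSAR table into a QSAR equation can be illustrated with a small single-response PLS (PLS1, NIPALS-style) written in plain Python. This is a sketch under stated assumptions, not the SYBYL implementation: the two “field” columns and the activities are toy numbers, and real CoMFA tables have thousands of field columns.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns means and components."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in y that X can explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict activity for one descriptor row using the fitted components."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat

# Toy table: two field columns, activity = 2*x1 - x2 + 1 (hypothetical).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y = [2 * a - b + 1 for a, b in X]

model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [6.0, 4.0])
```

With as many components as independent columns, PLS1 reproduces the ordinary least-squares fit, which is why the toy model recovers the linear rule used to generate the data.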
Pros and Cons of CoMFA (a leading (3D-Q)SAR method)
• Advantages of CoMFA
  • very generally applicable
  • robust, widely used and accepted
  • models easy to understand, interpret
  • excellent record for predicting potency
• Disadvantages of CoMFA
  • Input: “alignment” of 3D models is ill-defined
  • Output: does not select, only predicts
Topomeric CoMFA: a neat complementarity
• Can we perform successful CoMFAs based on (automatic / ignorant) topomer alignment rules?
  • Yes! (surprisingly)
  • the CoMFA input bottleneck is thereby broken
• Can we use the resulting CoMFA SARs to search for more active structures?
  • the CoMFA (QSAR) output bottleneck disappears
  • topomer searching becomes very useful in lead optimization
Implementing Topomeric CoMFA
• Input molecules must be fragmented
• Each fragment set gets its own CoMFA column
• Data sets fall into four different classes:

Class  Description                           Example                Topomer alignment by
1      Common core                           Combinatorial library  Side chains => fragments
2      Similar topology, central bond(s)     Most JMC pubs.         Split into two at a central bond
3      Similar topology, no central bond(s)  Steroids               Use other methods
4      No similar topology                   Screening decks        Always challenging
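A minimal sketch of “each fragment set gets its own CoMFA column”: the field vector of each fragment is kept as its own block of descriptors, and the blocks are laid side by side in one table row per molecule. The function names and field values are hypothetical, and real topomer fields are far longer.

```python
def build_qsar_row(fragment_fields):
    """Concatenate per-fragment field vectors into one descriptor row."""
    row = []
    for field in fragment_fields:
        row.extend(field)
    return row

def column_blocks(fragment_fields):
    """Return the (start, stop) index range each fragment occupies in the row."""
    blocks, start = [], 0
    for field in fragment_fields:
        blocks.append((start, start + len(field)))
        start += len(field)
    return blocks

# Hypothetical Class 2 molecule split at a central bond into two fragments.
r1_field = [0.2, 0.0, 1.5]
r2_field = [0.7, 0.1]

row = build_qsar_row([r1_field, r2_field])
blocks = column_blocks([r1_field, r2_field])
print(row)     # descriptors for R1 followed by descriptors for R2
print(blocks)  # which columns belong to which fragment
```

Keeping track of the blocks is what later allows a model term to be attributed to one fragment position rather than to the whole molecule.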
Validating Topomeric CoMFA: Methodology
• How do topomeric alignments perform, compared to successful CoMFA alignments from literature?
• 10 recent CoMFA pubs => 14 end points (+1 alternative topomer fragmentation) = 15 trials
  • Literature alignments: 8/15 used X-ray
  • Data sets: 6 Class 1 (3-piece), 9 Class 2 (2-piece)
Example Topomer Alignment: 2-piece (5ht3)
X1 (.320 of model)
X2 (.680 of model)
(61 structures: orthogonal views)
Validating Topomeric CoMFA: Remarkably Good Results
Satisfactory results obtained in each of the 15 trials
Average performance of automatic topomer alignments almost identical to literature alignments:

                                      Lit.   Top1 (a)  Top2 (b)
Avg. # PLS components                 4.2    5.5       3.6
Avg. q2 (n=15)                        .636   .520      .502
Avg. SDEV of prediction (c) (n=133)   .574   .623      .565

(a) Using standard CoMFA fields and methods
(b) Using “topomeric CoMFA fields”. # components from cross-validation SDEV minimum, not q2 maximum
(c) Omission of one data set having suspect predictions
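For readers unfamiliar with q2, the cross-validated statistic reported in the table, here is a minimal leave-one-out sketch using q2 = 1 - PRESS/SS. A one-variable least-squares line stands in for the PLS model, and the data are invented; both are assumptions made to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

def loo_q2(xs, ys, fit, predict):
    """Leave-one-out q2: each point is predicted by a model that never saw it."""
    press = 0.0
    for i in range(len(ys)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict(model, xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss = sum((v - mean_y) ** 2 for v in ys)
    return 1.0 - press / ss

# Hypothetical descriptor and potency values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]

q2_noisy = loo_q2(xs, ys, fit_line, predict_line)
q2_exact = loo_q2([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], fit_line, predict_line)
```

A q2 of 1 means every left-out point was predicted exactly; values near or below 0 mean the model predicts no better than the mean activity.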
Why do context-ignorant topomer alignments perform so well?
• 15 successes in 15 trials is not just good luck
• Topomer alignments do align “like with like”
• Context-knowledgeable (literature) alignment must be introducing as much noise as signal
• Example: docking of combi (common core) libraries:
  Docking moves the core around, producing field variation that is noise, because an invariant core cannot cause changes in biological activity
What about Topomer CoMFA Searching?
• Topomer rules are structurally universal
  • Directly search VL’s (ChemSpace)
  • Directly search conventional DB’s (dbtop) for fragments
• Search objectives (to be “and’d” together):
  • Similarity to average of CoMFA input fields
  • Predicted high potency
  • Exploration of new regions (happens automatically)
• Required development of
  • Binned electrostatic fields for all stored topomers
  • Extracting “features” from CoMFA input structures
Examples of Topomer CoMFA Searching results
• For each of the 15 validation data sets
  • Searched “2-piece” CS libraries in use (~10^8 structures)
    • derived from commercially offered (readily accessible) reagents
  • “best CoMFA Inputs” == R’s in most active CoMFA input
  • “best Searching Hits” == R’s with highest predicted potency contribution (+ < 150 similarity + synthetically tractable)
• Shown for both “best R’s” are
  • 2D structures with potency contributions
  • 3D topomer structures overlaid on CoMFA grid (orthogonal views)
• In 13 of the 15 cases, best “Searching Hits” together exceed best experimental potency by >1.0 log units
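The “best Searching Hits” criteria (highest predicted potency contribution, similarity value under 150, synthetically tractable) amount to a filter followed by a ranking. A minimal sketch, assuming the three criteria are simply and'd together; the `Candidate` type, the fragment names, and all numbers are hypothetical.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    predicted_contribution: float  # predicted potency contribution, log units
    topomer_distance: float        # similarity value; smaller is more similar
    tractable: bool                # judged synthetically tractable

def best_searching_hits(candidates, max_distance=150.0, top_n=3):
    """Keep tractable candidates within the distance limit, best potency first."""
    kept = [c for c in candidates
            if c.topomer_distance < max_distance and c.tractable]
    kept.sort(key=lambda c: c.predicted_contribution, reverse=True)
    return kept[:top_n]

# Hypothetical R-group candidates for one site.
pool = [
    Candidate("frag_A", 1.2, 106.0, True),
    Candidate("frag_B", 2.0, 180.0, True),   # too dissimilar
    Candidate("frag_C", 1.6, 120.0, False),  # not tractable
    Candidate("frag_D", 0.4, 60.0, True),
]

hits = best_searching_hits(pool)
print([c.name for c in hits])
```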
Topomer CoMFA Searching Hits (5ht3)
          Best CoMFA Input              Best Searching Hits
          Potency effect (similarity)   Potency effect (similarity)
Site 1:   +0.8                          +1.2 (106)
Site 2:   +1.8                          +2.5 (112)
CoMFA Input vs. Best Hit in 3D (5ht3_1)
(Figure panels: Input example, Best Hit)
CoMFA Input vs. Best Hit in 3D (5ht3_2)
(Figure panels: Input example, Best Hit)
Three Prospective Applications of Topomer CoMFA
• Good topomer CoMFA models automatically obtained in 3/3 trials (two projects)
• Prediction of potencies satisfactory in 2/3 trials (predicted/active r2 of .42 and .24)
• Difficulty with third unsatisfactory trial was: little variation among potency predictions, because of
  • Little structural variety in training set, and/or
  • Test set variation irrelevant to training set variation
• Errors of prediction are “false positives” much more often than “false negatives”
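The two statistics on this slide, a predicted-versus-measured r2 and the balance of false positives against false negatives, can be computed as sketched here. The potencies and the activity threshold are invented for illustration and are not the project data; taking r2 as the squared Pearson correlation is also an assumption.

```python
import math

def r_squared(predicted, measured):
    """Squared Pearson correlation between predicted and measured values."""
    n = len(predicted)
    mp, mm = sum(predicted) / n, sum(measured) / n
    spm = sum((p - mp) * (m - mm) for p, m in zip(predicted, measured))
    spp = sum((p - mp) ** 2 for p in predicted)
    smm = sum((m - mm) ** 2 for m in measured)
    return spm * spm / (spp * smm)

def count_errors(predicted, measured, threshold):
    """Count false positives and false negatives at an activity threshold."""
    false_pos = sum(1 for p, m in zip(predicted, measured)
                    if p >= threshold and m < threshold)
    false_neg = sum(1 for p, m in zip(predicted, measured)
                    if p < threshold and m >= threshold)
    return false_pos, false_neg

# Hypothetical predicted and measured potencies (e.g. pIC50).
predicted = [6.0, 7.0, 8.0, 6.5]
measured = [5.5, 7.2, 6.9, 6.8]

r2 = r_squared(predicted, measured)
fp, fn = count_errors(predicted, measured, threshold=7.0)
print(f"r2 = {r2:.2f}, false positives = {fp}, false negatives = {fn}")
```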
Topomer CoMFA (Searching): Conclusions
• Automatic CoMFA alignments are a reality
  • lit. alignments => topomer alignments … 15 / 15 times
  • 2D structures to finished CoMFA takes a few minutes
• Topomeric alignments enable topomer searching
  • For improved potency as well as similarity within the vast search space accessible to topomers
  • Novelty of hits seems self-evident
  • New “receptor space” is being “targeted”
• Promises a uniquely powerful engine for lead optimization ...
  • Initial applications confirm promise
Acknowledgments
• Design / Implementation
  • Katherine Andrews-Cramer
  • Rob Jilek
• Use and Feedback: dbtop (WDA queries)
  • Stefan Guessregen
  • Mark Warne
  • Katherine Andrews-Cramer
• Use and Feedback: Topomer CoMFA
  • Bernd Wendt
  • Mike Lawless
References
• Cramer, R. D.; Clark, R. D.; Patterson, D. E.; Ferguson, A. M. Bioisosterism as a molecular diversity descriptor: steric fields of single topomeric conformers. J. Med. Chem. 1996, 39, 3060-3069.
• Patterson, D. E.; Cramer, R. D.; Ferguson, A. M.; Clark, R. D.; Weinberger, L. E. Neighborhood behavior: a useful concept for validation of molecular diversity descriptors. J. Med. Chem. 1996, 39, 3049-30
• Cramer, R. D.; Patterson, D. E.; Clark, R. D.; Soltanshahi, F.; Lawless, M. S. Virtual libraries: a new approach to decision making in molecular discovery research. J. Chem. Inf. Comp. Sci. 1998, 6, 1010-1023.
• Cramer, R. D.; Poss, M. A.; Hermsmeier, M. A.; Caulfield, T. J.; Kowala, M. C.; Valentine, M. T. Prospective identification of biologically active structures by topomer shape similarity searching. J. Med. Chem. 1999, 42, 3919-3933.
• Andrews, K. M.; Cramer, R. D. Toward general methods of targeted library design: topomer shape similarity searching with diverse structures as queries. J. Med. Chem. 2000, 43, 1723-1740.
• Cramer, R. D.; Jilek, R. J.; Andrews, K. M. dbtop: topomer similarity searching of conventional databases. J. Mol. Graph. Modeling 2002, 20, 447-462.
• Cramer, R. D. Topomer CoMFA: a design methodology for rapid lead optimization. J. Med. Chem., manuscript accepted.