2004 Sheffield Chemoinformatics Conference, April 21-23
Automated Decision Support for the Screening Process
C.A. Nicolaou (1), D.A. Kleier (2), T.K. Brunck (1), P.A. Bacha (1)
(1) Bioreason, Inc., 121 Sandoval St., Suite 220, Santa Fe, NM, USA
(2) DuPont Agricultural Products, Stine-Haskell Research Center, Newark, DE, USA
– Rgroup_number of bonds_pharmacophore point
– H-bond donor, H-bond acceptor, anion, cation, polar, hydrophobe, aromatic ring, and aliphatic ring
– Example: R4_3_HBA: Yes
• All descriptors learned for each class modeled
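The naming convention above can be read mechanically. The following is a minimal sketch, assuming the three-part form shown in the example (R-group, number of bonds, pharmacophore point). The codes HBA, HYD and AlR appear in these slides; the other abbreviations, and the names `POINT_TYPES`, `PositionDescriptor` and `parse_descriptor`, are hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple

# Codes for pharmacophore point types. HBA, HYD and AlR appear in the slides;
# the remaining codes are hypothetical placeholders for illustration.
POINT_TYPES = {
    "HBD": "H-bond donor",
    "HBA": "H-bond acceptor",
    "ANI": "anion",
    "CAT": "cation",
    "POL": "polar",
    "HYD": "hydrophobe",
    "ArR": "aromatic ring",
    "AlR": "aliphatic ring",
}


class PositionDescriptor(NamedTuple):
    r_group: int     # substituent position on the scaffold, e.g. 4 for R4
    bonds: int       # number of bonds between the scaffold and the point
    point_type: str  # pharmacophore point type, spelled out


_PATTERN = re.compile(r"^R(\d+)_(\d+)_([A-Za-z]+)$")


def parse_descriptor(name: str) -> PositionDescriptor:
    """Split a name such as 'R4_3_HBA' into its three parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a position-specific descriptor: {name!r}")
    r_group, bonds, code = match.groups()
    if code not in POINT_TYPES:
        raise ValueError(f"unknown pharmacophore point code: {code!r}")
    return PositionDescriptor(int(r_group), int(bonds), POINT_TYPES[code])


print(parse_descriptor("R4_3_HBA"))
```

Property descriptors such as R3_logP follow a two-part form and are deliberately rejected by this parser.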
SAR Extraction Algorithm
• Multiple Domain
– compound can appear in more than one child node
– compound can contribute to more than one rule
• Multiple Splitting
– parent node can have more than two children
– extract as much SAR at each level as is statistically meaningful
– number of splits controlled by statistical means, e.g. parent-child chi-squared cutoff (0.7)
[Figure: example SAR tree. Root node: Dataset, Avg. Prop. 4.2. Child nodes: R4_3_HBA: Yes; R10_MW > 31; R3_logP in range (0.45-0.75); R11_2_HYD: Yes; …]
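The multiple-domain, multiple-splitting idea can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the talk: the slides do not say which statistic is used, so the sketch compares each child's active/inactive counts with the parent's active rate by a one-degree-of-freedom chi-squared test and reads the 0.7 cutoff as a minimum value of 1 - p. The names `Compound`, `chi2_certainty` and `split_node`, and the toy data, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Compound:
    cpd_id: str
    active: bool
    features: Dict[str, float]


def chi2_certainty(child_active: int, child_total: int,
                   parent_active: int, parent_total: int) -> float:
    """1 - p for a chi-squared test (1 dof) of child counts vs. parent rate."""
    if child_total == 0 or parent_total == 0:
        return 0.0
    rate = parent_active / parent_total
    if rate in (0.0, 1.0):
        return 0.0  # parent is pure; no child can differ from it
    exp_active = child_total * rate
    exp_inactive = child_total * (1.0 - rate)
    obs_inactive = child_total - child_active
    chi2 = ((child_active - exp_active) ** 2 / exp_active
            + (obs_inactive - exp_inactive) ** 2 / exp_inactive)
    # Survival function of chi-squared with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return 1.0 - p_value


Premise = Callable[[Dict[str, float]], bool]


def split_node(parent: List[Compound], candidates: Dict[str, Premise],
               cutoff: float = 0.7) -> Dict[str, List[Compound]]:
    """Keep every candidate child that passes the cutoff.

    Children are not made disjoint, so a compound may land in several of
    them, and the number of children is not limited to two.
    """
    parent_active = sum(c.active for c in parent)
    children = {}
    for name, premise in candidates.items():
        members = [c for c in parent if premise(c.features)]
        certainty = chi2_certainty(sum(c.active for c in members),
                                   len(members), parent_active, len(parent))
        if certainty >= cutoff:
            children[name] = members
    return children


def _cpd(cpd_id, active, hba, mw, hyd):
    feats = {"R4_3_HBA": hba, "R10_MW": mw, "R11_2_HYD": hyd}
    return Compound(cpd_id, active, feats)


# Toy data, invented for illustration only.
parent = [
    _cpd("c1", True, 1, 40, 1), _cpd("c2", True, 1, 45, 0),
    _cpd("c3", True, 1, 20, 0), _cpd("c4", True, 1, 50, 0),
    _cpd("c5", True, 0, 35, 0), _cpd("c6", False, 0, 10, 1),
    _cpd("c7", False, 0, 15, 0), _cpd("c8", False, 0, 20, 0),
    _cpd("c9", False, 0, 33, 0), _cpd("c10", False, 0, 25, 0),
]
candidates = {
    "R4_3_HBA: Yes": lambda f: f["R4_3_HBA"] == 1,
    "R10_MW>31": lambda f: f["R10_MW"] > 31,
    "R11_2_HYD: Yes": lambda f: f["R11_2_HYD"] == 1,
}
children = split_node(parent, candidates)
for name, members in children.items():
    print(name, [c.cpd_id for c in members])
```

In this toy set, compound c1 satisfies two surviving premises and therefore appears in two child nodes, while the third candidate is dropped because its members mirror the parent's active rate.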
SAR Tree Interpretation
• Each node in this type of decision tree is a rule or hypothesis that is easy to express in English
• Each hypothesis has
– Indication of certainty (statistical)
– Feature name/range (e.g. logP between x and y)
– Support (number of examples)
– List of examples
• Rules with multiple elements are possible
– Aggregate certainty terms
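The parts of a hypothesis listed above map naturally onto a small record type. The sketch below is illustrative only: the class `Hypothesis`, its fields and the example values are hypothetical, and the slides do not say how certainty terms are aggregated, so the product used in `aggregate_certainty` is an assumption made for the sake of a concrete example.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hypothesis:
    feature: str                   # feature name, e.g. "R3_logP"
    low: Optional[float] = None    # lower bound of the range, if any
    high: Optional[float] = None   # upper bound of the range, if any
    certainty: float = 0.0         # statistical indication of certainty
    examples: List[str] = field(default_factory=list)  # supporting compounds

    @property
    def support(self) -> int:
        """Support is the number of examples behind the hypothesis."""
        return len(self.examples)

    def to_english(self) -> str:
        if self.low is not None and self.high is not None:
            premise = f"{self.feature} is between {self.low} and {self.high}"
        else:
            premise = f"{self.feature} is present"
        return (f"If {premise} "
                f"(certainty {self.certainty:.2f}, support {self.support})")


def aggregate_certainty(parts: List[Hypothesis]) -> float:
    """Combine certainties of a multi-element rule.

    Assumption: a plain product; the talk does not specify the formula.
    """
    return math.prod(p.certainty for p in parts)


# Hypothetical values, for illustration only.
h1 = Hypothesis("R3_logP", 0.45, 0.75, certainty=0.9,
                examples=["cpd-1", "cpd-2"])
h2 = Hypothesis("R4_3_HBA", certainty=0.8, examples=["cpd-1"])
print(h1.to_english())
print(h2.to_english())
print(aggregate_certainty([h1, h2]))
```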
SAR Extraction Example
Analysis of Commercial Pesticides
• Source of compounds
– The Pesticide Manual: A World Compendium
– Published by the British Crop Protection Council
• Types of activity considered
– Herbicide, insecticide, fungicide, plant growth regulation
– Binary indicator variables used for type of activity
• Task:
– Identify scaffolds associated with herbicidal activity & features that distinguish herbicides from non-herbicides within the same class
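The binary indicator encoding mentioned above could look like the following minimal sketch. The function name `encode_activity` and the dictionary layout are hypothetical; only the four activity types come from the slide.

```python
from typing import Dict, Iterable

# The four activity types named on the slide.
ACTIVITY_TYPES = ("herbicide", "insecticide", "fungicide",
                  "plant growth regulation")


def encode_activity(labels: Iterable[str]) -> Dict[str, int]:
    """Turn a compound's activity labels into one 0/1 variable per type."""
    labels = set(labels)
    unknown = labels - set(ACTIVITY_TYPES)
    if unknown:
        raise ValueError(f"unknown activity type(s): {sorted(unknown)}")
    return {activity: int(activity in labels) for activity in ACTIVITY_TYPES}


# A compound may carry more than one label.
print(encode_activity(["herbicide"]))
print(encode_activity(["insecticide", "fungicide"]))
```

With one indicator per type, the herbicide task reduces to modeling the single `herbicide` column within each structural class.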
The diphenyl ether subclass is evenly distributed between herbicides and non-herbicides. What substituent features distinguish the herbicides?
[Figure: diphenyl ether scaffold with substituent positions R1 to R7; drawing not recoverable from the transcript]
R-Table sorted by herbicide activity and displayed at the cutoff between herbicides and non-herbicides
If R5 has an HBA located 2 bonds from the scaffold, then the probability of activity is 95% (cf. 47% for the class as a whole), with a certainty of 1.00.
If R5 PSA is within the range of 33.97 to 65.16, then the probability of herbicidal activity is 100% (cf. 47% for the class as a whole), with a certainty of 1.00.
If R3 has an AlR center located 5 bonds from the scaffold, then the probability of herbicidal activity is 0%, with a certainty of 1.00.
This rule cleanly differentiates pyrethroid insecticides from diphenyl ether herbicides.
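Figures of the form "probability of activity is X% (cf. Y% for class as a whole)" are a conditional active rate set against the class baseline. A minimal sketch of that bookkeeping follows, assuming each compound is a pair of a feature dictionary and an activity flag; the function name `rule_statistics` and the toy class are hypothetical and do not reproduce the pesticide data.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Tuple[Dict[str, bool], bool]  # (features, is_active)


def rule_statistics(rows: List[Row],
                    premise: Callable[[Dict[str, bool]], bool]
                    ) -> Optional[Dict[str, float]]:
    """Active rate among compounds satisfying the premise vs. the whole class."""
    if not rows:
        return None
    baseline = sum(active for _, active in rows) / len(rows)
    matched = [active for feats, active in rows if premise(feats)]
    if not matched:
        return None  # no support, so no rule
    return {
        "support": len(matched),
        "probability": sum(matched) / len(matched),
        "baseline": baseline,
    }


def describe(name: str, stats: Dict[str, float]) -> str:
    return (f"If {name}, then probability of activity is "
            f"{stats['probability']:.0%} "
            f"(cf. {stats['baseline']:.0%} for class as a whole)")


# Toy class of eight compounds, invented for illustration only.
toy_class: List[Row] = [
    ({"R5_2_HBA": True}, True), ({"R5_2_HBA": True}, True),
    ({"R5_2_HBA": True}, True), ({"R5_2_HBA": False}, True),
    ({"R5_2_HBA": False}, False), ({"R5_2_HBA": False}, False),
    ({"R5_2_HBA": False}, False), ({"R5_2_HBA": False}, False),
]
stats = rule_statistics(toy_class, lambda f: f["R5_2_HBA"])
print(describe("R5 has an HBA 2 bonds from the scaffold", stats))
```

The certainty reported alongside each rule is a separate statistical quantity and is not computed here.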
Peptide Deformylase Inhibitors
ClassPharmer™ SAR extraction & pharmacophore perception
Learning SAR Rules for Inhibitors of Peptide Deformylase (PDF)
• Training set of 22 mostly beta-sulfinylhydroxamates
– Reference: Apfel, et al., J. Med. Chem., 43, 2324 (2000)
• Compounds classified & characterized by MCS using ClassPharmer™ technology
• R-Tables generated for each class
• QSARs learned for each class
Training Set of Hydroxamic Acids
[Figure: structures of training-set compounds; drawings not recoverable from the transcript. Labels as shown: 1 2.22 (0); 16 1.96 (0); 14 1.51 (0); 5 1.46 (0); 9 1.03 (0); 7 0.80 (0); 12 0.72 (0); 8 -0.04 (0); 2 -1.34 (0)]
Classification & R-grouping by ClassPharmer™
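[Figure: class scaffold with variable positions R1, R2, R3 and X1; drawing not recoverable from the transcript]
R-Table (columns: Cpd ID, pIC50, R1, R2, R3, X1; substituent drawings not recoverable from the transcript)
Cpd ID 16: pIC50 1.96
Cpd ID 13: pIC50 1.64
Cpd ID 8: pIC50 -0.04
Cpd ID 2: pIC50 -1.34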
[Figure: compound structures; drawings not recoverable from the transcript. Labels as shown: 8o 342 1.96 (1); 8h 295 1.03 (1); 9a 269 1.00 (1); 8e 299 0.85 (1); 9e 348 0.68 (1)]
ClassPharmer™ Rule for Desirable R3 Groups
If R3(MW) is in the range of 50 to 74, the probability of activity is significantly enhanced
– 92% of compounds satisfying the premise are active
– 67% of compounds in the class are active
[Figure: compound structures; drawings not recoverable from the transcript. Labels as shown: 4 0.80 (1); 7 0.80 (1); 8 -0.04 (1); 3 -0.48 (1); 2 -1.34 (1)]
Obverse Rule for Undesirable R3 Groups
If R3(MW) is outside the range of 50 to 74, the probability of activity is significantly decreased
– All compounds satisfying the premise are inactive
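A range rule and its obverse partition the class: every compound satisfies exactly one of the two premises. The sketch below shows that partition for a substituent molecular-weight range. It is illustrative only: `range_rule_and_obverse` and the row layout are hypothetical, the activity flags are invented, and the substituent weights are approximate values for common alkyl and benzyl groups rather than the R-table from the talk.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]


def active_fraction(rows: List[Row]) -> Optional[float]:
    """Fraction of rows flagged active, or None when there are no rows."""
    if not rows:
        return None
    return sum(bool(r["active"]) for r in rows) / len(rows)


def range_rule_and_obverse(r_table: List[Row], column: str,
                           low: float, high: float
                           ) -> Tuple[Optional[float], Optional[float]]:
    """Active fraction inside [low, high] and in its complement."""
    inside = [r for r in r_table if low <= r[column] <= high]
    outside = [r for r in r_table if not (low <= r[column] <= high)]
    return active_fraction(inside), active_fraction(outside)


# Toy R-table. Weights are approximate substituent masses (methyl, propyl,
# butyl, pentyl, benzyl); activity flags are invented for illustration.
r_table = [
    {"cpd": "a", "R3_MW": 57.1, "active": True},
    {"cpd": "b", "R3_MW": 71.1, "active": True},
    {"cpd": "c", "R3_MW": 57.1, "active": False},
    {"cpd": "d", "R3_MW": 15.0, "active": False},
    {"cpd": "e", "R3_MW": 43.1, "active": False},
    {"cpd": "f", "R3_MW": 91.1, "active": False},
]
inside, outside = range_rule_and_obverse(r_table, "R3_MW", 50, 74)
print(f"inside range: {inside:.0%} active; outside range: {outside:.0%} active")
```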
R3 pocket in active site of (PDF)Ni(II)
CGG49 in active site of E. coli Ni-PDF (Roche). Apfel, et al., J. Med. Chem. 2000, 43, 2324-2331
[Figure: ligand structure with R3 = nBu; drawing not recoverable from the transcript]
Future Directions
• Expand position-specific descriptor types
– ADME/Tox analysis
– Electronic
• Rule Synopsis
• Mine info across screens, libraries, time
Acknowledgements
• Bioreason
– Terence K. Brunck
– Pat Bacha
– Suzanne Sloan
• DuPont
– Dan A. Kleier
– A number of forward-thinking and very patient scientists