Prediction of Animal Clearance Using Naïve Bayesian Classification and Extended Connectivity Fingerprints Timothy A McIntyre
Page 1: Tam June 2009

Prediction of Animal Clearance Using Naïve Bayesian Classification and Extended Connectivity Fingerprints

Timothy A McIntyre

Page 2: Tam June 2009

Introduction

• The pharmaceutical industry faces unprecedented pressure from payers, regulators, ethicists and the general public.

• The cost of drug development is staggering: US $1 billion.

• There is increasing demand for new medicines with well-established safety and efficacy.

• Attrition during drug discovery and development remains high.

• Absorption, distribution, metabolism and excretion (ADME) screening has significantly reduced attrition due to poor pharmacokinetics in humans.

Page 3: Tam June 2009

Pharmacokinetics and Clearance

• What is Pharmacokinetics (PK)?

– The study of what the body does to a drug

– Characterization of the drug concentration-time profile

• Drug Clearance (CL) is a primary determinant of drug PK.

– Rate of elimination = CL × drug concentration.

– Measures the efficiency of irreversible elimination from the body.

– Often a result of metabolism by the liver.

• High CL limits systemic exposure and oral bioavailability.
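The relationship between clearance and elimination rate on this slide can be illustrated with a short numerical sketch. The function name, the units and the example values below are hypothetical and are not taken from the study.

```python
def elimination_rate(clearance_ml_min_kg: float, concentration_ug_ml: float) -> float:
    """Rate of elimination = CL x drug concentration.

    With CL in mL/min/kg and concentration in ug/mL, the result is in ug/min/kg.
    The units are an illustrative assumption.
    """
    if clearance_ml_min_kg < 0 or concentration_ug_ml < 0:
        raise ValueError("clearance and concentration must be non-negative")
    return clearance_ml_min_kg * concentration_ug_ml


# Hypothetical example: CL = 20 mL/min/kg at a concentration of 2 ug/mL.
rate = elimination_rate(20.0, 2.0)
print(rate)  # 40.0 ug/min/kg
```

At a fixed concentration, doubling CL doubles the elimination rate, which is why high CL limits systemic exposure.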

Page 4: Tam June 2009

Experimental ADME and lead optimization

• In vitro metabolic stability (intrinsic CL in liver microsomes) is commonly used to prioritize compounds for in vivo studies.

• Animal studies are more resource-intensive, more expensive and lower throughput.

• Rodent PK is evaluated before PK in higher species, using smaller amounts of compound.

• Appropriate PK is essential for subsequent evaluation in resource-intensive pharmacology or toxicology models.

• Animal PK is used to predict PK in humans and set appropriate starting doses for initial clinical trials.

Page 5: Tam June 2009

In Silico ADME – Background of our approach

• Potential to provide unlimited information at low cost.

– Often based on descriptive physico-chemical properties.

• Limited precedent for modeling animal CL directly using detailed structural information.

• Our approach: Bayesian classification and extended connectivity fingerprints.

• Compared model performance to common experimental approaches.

– in vitro liver microsome intrinsic CL (CLi)

– animal CL in a lower species

Page 6: Tam June 2009

Experimental Data

• Mouse, rat, dog and monkey in vivo CL and in vitro CLi.

• GSK corporate database (20,000 unique compounds).

• Animal CL normalized to liver blood flow.

– 30 to 90 mL/min/kg depending on the species.

• CL > 70% of liver blood flow is considered high (< 70%, low).

• CLi > 5 mL/min/g is considered high (< 5 mL/min/g, low).

– Mid-point of the assay dynamic range (0.5 to 50 mL/min/g tissue).
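The normalization and the high/low cutoffs described on this slide can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the liver blood flow used in the example (90 mL/min/kg) is simply one value inside the 30 to 90 mL/min/kg range quoted above and is not assigned to any species by the source, and treating a value exactly at the cutoff as "low" is an assumption.

```python
HIGH_CL_FRACTION = 0.70  # from the slide: CL > 70% of liver blood flow is "high"
HIGH_CLI_CUTOFF = 5.0    # from the slide: CLi > 5 mL/min/g is "high"


def normalize_cl(cl_ml_min_kg: float, liver_blood_flow_ml_min_kg: float) -> float:
    """Express in vivo CL as a fraction of liver blood flow."""
    if liver_blood_flow_ml_min_kg <= 0:
        raise ValueError("liver blood flow must be positive")
    return cl_ml_min_kg / liver_blood_flow_ml_min_kg


def classify_cl(cl_ml_min_kg: float, liver_blood_flow_ml_min_kg: float) -> str:
    """Label in vivo CL as 'high' or 'low' (a value at the cutoff counts as 'low': assumption)."""
    fraction = normalize_cl(cl_ml_min_kg, liver_blood_flow_ml_min_kg)
    return "high" if fraction > HIGH_CL_FRACTION else "low"


def classify_cli(cli_ml_min_g: float) -> str:
    """Label in vitro intrinsic CL as 'high' or 'low' (a value at the cutoff counts as 'low': assumption)."""
    return "high" if cli_ml_min_g > HIGH_CLI_CUTOFF else "low"


# Hypothetical examples with an assumed liver blood flow of 90 mL/min/kg.
print(classify_cl(70.0, 90.0))  # high (70 / 90 is about 0.78)
print(classify_cl(30.0, 90.0))  # low  (30 / 90 is about 0.33)
print(classify_cli(7.2))        # high
```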

Page 7: Tam June 2009

Bayesian Modeling

• Pipeline Pilot™ (Accelrys, Inc., San Diego, CA, USA).

• Extended connectivity fingerprints (six-bond diameter) generated from simplified molecular input line entry specification (SMILES) strings.

• Compounds randomly assigned to training or test sets in a 5:1 ratio.

• Model predictions (high/low CL) for each test set were compared with the corresponding experimental data.

– Accuracy (ACC), Positive Predictive Value (PPV), Negative Predictive Value (NPV), True Positive Rate (TPR), False Positive Rate (FPR), Receiver Operating Characteristic (ROC) AUC

– 90% confidence intervals, p values
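The workflow on this slide (random 5:1 split, then a Bayesian classifier trained on fingerprint features) can be sketched in plain Python. This is a generic illustration and not the Pipeline Pilot implementation: the classifier is a textbook Bernoulli naive Bayes with add-one smoothing, which may differ from the estimator used in the study, and the short string features are hypothetical stand-ins for extended connectivity fingerprint features. All names are illustrative.

```python
import math
import random
from collections import Counter


def split_train_test(items, train_parts=5, test_parts=1, seed=0):
    """Randomly assign items to training and test sets in a train_parts:test_parts ratio."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_parts / (train_parts + test_parts))
    return shuffled[n_test:], shuffled[:n_test]


class NaiveBayesFingerprint:
    """Bernoulli naive Bayes over sets of fingerprint features (add-one smoothing)."""

    def fit(self, fingerprints, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.feature_counts = {c: Counter() for c in self.classes}
        for fp, label in zip(fingerprints, labels):
            self.feature_counts[label].update(set(fp))
        self.vocabulary = set().union(*fingerprints)
        return self

    def log_scores(self, fp):
        """Unnormalized log posterior for each class."""
        fp = set(fp)
        total = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            n_c = self.class_counts[c]
            score = math.log(n_c / total)  # class prior
            for feature in self.vocabulary:
                # Smoothed probability that the feature is present in class c.
                p = (self.feature_counts[c][feature] + 1) / (n_c + 2)
                score += math.log(p if feature in fp else 1.0 - p)
            scores[c] = score
        return scores

    def predict(self, fp):
        scores = self.log_scores(fp)
        return max(scores, key=scores.get)


# Toy data: hypothetical feature sets labeled with high/low CL.
fingerprints = [{"a", "b"}, {"a", "c"}, {"a"}, {"d", "e"}, {"d"}, {"d", "c"}]
labels = ["high", "high", "high", "low", "low", "low"]
model = NaiveBayesFingerprint().fit(fingerprints, labels)
print(model.predict({"a"}))  # high
print(model.predict({"d"}))  # low
```

In practice the model would be fitted on the training portion returned by `split_train_test` and evaluated only on the held-out test portion.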

Page 8: Tam June 2009

Descriptive Statistics: Experimental CL and CLi

Summary of animal in vivo clearance and in vitro intrinsic clearance data (a)

Statistic   Mouse CL   Rat CL   Monkey CL   Dog CL   Mouse CLi   Rat CLi   Monkey CLi   Dog CLi
N           1369       17529    1129        2690     3021        42470     3600         3700
Q1          0.28       0.33     0.20        0.31     1.25        1.10      2.02         0.86
Q2          0.60       0.68     0.38        0.65     4.78        3.50      7.20         2.00
Q3          0.92       1.21     0.63        1.17     21.74       11.96     22.00        7.20

(a) CL values are expressed as a fraction of liver blood flow (1.0 = 100%); CLi units are mL/min/g liver. Q1, Q2 (median value) and Q3 are the first, second and third quartiles.

Page 9: Tam June 2009

Summary of animal in vivo clearance

[Figure: percentile (axis 0 to 0.8) versus normalized CL (axis 0 to 1.5) for Mouse CL, N = 1369; Rat CL, N = 17529; Dog CL, N = 2690; Monkey CL, N = 1129.]

Page 10: Tam June 2009

Chemical Diversity

• 20,000 unique compounds representing hundreds of lead optimization programs.

• Self-similarity tests, ring analysis and Murcko assemblies demonstrate substantial structural diversity.

• The top 20 rings accounted for 45-51% of the compounds, depending on the species.

• Median frequency of any particular Murcko assembly was less than 0.1%.

– Corresponds to ~18 compounds with Rat CL sharing the same assembly

[Figure: example ring systems: benzimidazole, quinoline, pyrido-pyrimidine.]

Page 11: Tam June 2009

Near Neighbors

[Figure: distribution of compounds across near-neighbor bins (0, 1-2, 3-4, 5-9, >= 10) for Rat CL, Mouse CL, Dog CL and Monkey CL.]

Page 12: Tam June 2009

Dog Model

Summary of Performance Diagnostics for Methods Used (a)

Predictor     N     ACC                  PPV                  NPV                  TPR                  FPR                  ROC AUC
Dog Model     490   0.76 (0.72 – 0.79)   0.72 (0.68 – 0.77)   0.8 (0.76 – 0.85)    0.83 (0.79 – 0.87)   0.32 (0.27 – 0.37)   0.81 (0.78 – 0.85)
Rat CL        417   0.71 (0.68 – 0.72)   0.77 (0.75 – 0.80)   0.58 (0.54 – 0.61)   0.77 (0.75 – 0.79)   0.42 (0.38 – 0.45)   0.74 (0.71 – 0.76)
Mouse CL      202   0.65 (0.60 – 0.71)   0.72 (0.65 – 0.78)   0.56 (0.47 – 0.65)   0.72 (0.65 – 0.78)   0.44 (0.35 – 0.53)   0.68 (0.62 – 0.75)
Dog CLi       610   0.71 (0.68 – 0.74)   0.71 (0.67 – 0.74)   0.74 (0.67 – 0.80)   0.91 (0.89 – 0.94)   0.6 (0.55 – 0.66)    0.75 (0.70 – 0.79)

(a) The upper and lower limits of the 90% confidence intervals for each diagnostic are included in parentheses.
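The diagnostics reported above (ACC, PPV, NPV, TPR, FPR) all follow from a 2 × 2 confusion matrix of predicted versus experimental class. The sketch below shows the standard definitions. The slides do not state which class (high or low CL) was treated as "positive", so that choice is left as a parameter. The slides also do not state how the 90% confidence intervals were computed; the normal-approximation (Wald) interval shown here is one common choice and is an assumption, as are all function names.

```python
import math


def diagnostics(predicted, observed, positive):
    """Compute ACC, PPV, NPV, TPR and FPR from paired class labels."""
    tp = fp = tn = fn = 0
    for pred, obs in zip(predicted, observed):
        if pred == positive and obs == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif obs == positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "ACC": ratio(tp + tn, tp + fp + tn + fn),  # fraction of correct predictions
        "PPV": ratio(tp, tp + fp),                 # predicted positives that are truly positive
        "NPV": ratio(tn, tn + fn),                 # predicted negatives that are truly negative
        "TPR": ratio(tp, tp + fn),                 # true positives that are detected
        "FPR": ratio(fp, fp + tn),                 # true negatives wrongly called positive
    }


def wald_interval(proportion, n, z=1.645):
    """Normal-approximation interval for a proportion (z = 1.645 gives about 90%; assumed method)."""
    half_width = z * math.sqrt(proportion * (1.0 - proportion) / n)
    return max(0.0, proportion - half_width), min(1.0, proportion + half_width)


# Hypothetical toy labels, not study data.
predicted = ["high", "high", "high", "low", "low", "low", "low", "high"]
observed = ["high", "high", "low", "low", "low", "high", "low", "high"]
result = diagnostics(predicted, observed, positive="high")
print(result)
print(wald_interval(result["ACC"], len(observed)))
```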

Page 13: Tam June 2009

FPR Comparisons

[Figure: FPR (axis 0.1 to 0.7) by predictor: Dog Model, Rat CL, Dog CLi, Rat Model, Mouse CL, Rat CLi, Mouse Model, Mouse CLi. Annotated p values: p = 0.012, p = 0.008, p < 0.001, p < 0.001, p = 0.002.]

Page 14: Tam June 2009

ROC AUC Comparisons

[Figure: ROC AUC (axis 0.50 to 0.90) by predictor: Dog Model, Rat CL, Mouse CL, Rat Model, Rat CLi, Mouse Model, Mouse CLi, Monkey Model, Monkey CLi. Annotated p values: p < 0.001, p = 0.008, p = 0.007, p = 0.002, p = 0.003.]
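ROC AUC measures how well a continuous score ranks one class above the other: it equals the probability that a randomly chosen positive compound receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it by direct pairwise comparison. The slides do not say how AUC was calculated, so this is a generic illustration with hypothetical names and toy values.

```python
def roc_auc(scores, is_positive):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, flag in zip(scores, is_positive) if flag]
    negatives = [s for s, flag in zip(scores, is_positive) if not flag]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: three of the four positive/negative pairs are ordered correctly.
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [True, False, True, False])
print(auc)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic, so a rank-based formulation would be preferable for large test sets.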

Page 15: Tam June 2009

NPV Comparisons

[Figure: NPV (axis 0.4 to 0.9) by predictor: Dog Model, Rat CL, Mouse CL, Rat Model, Rat CLi. Annotated p values: p < 0.001, p < 0.001, p < 0.001.]

Page 16: Tam June 2009

Effect of Optimization on Rat CL

[Figure: percentile (axis 0 to 1) versus normalized CL (CL/Q, axis 0 to 2) for three series: Rat, Rat with Dog, Rat with Monkey.]

Page 17: Tam June 2009

Key Messages

• Bayesian model performance was exceptional.

• ROC AUC, ACC, PPV, NPV and TPR ranging from 0.72 to 0.82.

• In predicting dog CL, the Bayesian model was better than experimental rat or mouse CL.

• In predicting rat CL, the Bayesian model performed just as well as mouse CL.

• Models outperformed mouse, rat and monkey CLi for predicting mouse, rat and monkey CL, respectively.

• Models have higher negative predictive value (compounds with high experimental CL have high Bayesian CL).

• Lead optimization bias can affect modeling success (monkey).

Page 18: Tam June 2009

Conclusions

• Study demonstrates the potential of naïve Bayesian classification to predict animal CL based on structural fingerprints.

• Models can be used to

– optimize chemical libraries

– direct new chemical synthesis

– increase the efficiency of screening cascades

• Significant potential to reduce cost, time and animal usage associated with the discovery of new medicines.

Page 19: Tam June 2009

Acknowledgements

• GSK Drug discovery scientists

• Charles B Davis

• Robert Gagnon

• Amber Anderson

• Chao Han

• John Conway (Accelrys)

• Keith Ward (Bausch & Lomb)

Page 20: Tam June 2009

Backup Slides

Page 21: Tam June 2009

Near Neighbors

Distribution of Near Neighbors: Animal CL and CLi (a)

NN     Mouse        Rat           Monkey       Dog

CL     N = 1,369    N = 17,529    N = 1,129    N = 2,690
0      32.4         22.4          34.4         25.8
1-2    26.2         23.4          27.7         24.0
3-9    26.8         29.0          27.0         26.7
≥10    14.6         25.2          11.0         23.5

CLi    N = 3,691    N = 43,118    N = 3,634    N = 3,732
0      23.7         15.4          21.8         23.7
1-2    26.4         19.2          23.6         26.2
3-9    26.2         29.5          25.6         26.5
≥10    23.8         35.9          29.0         23.6

(a) Percentage of compounds in various NN (near neighbor) bins. Tanimoto distance < 0.15.
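The near-neighbor bins in this table can be reproduced in principle by counting, for each compound, how many other compounds lie within a Tanimoto distance of 0.15. The sketch below assumes fingerprints are represented as sets of feature identifiers and that Tanimoto distance is one minus the Tanimoto similarity; the function names and the toy fingerprints are hypothetical.

```python
def tanimoto_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two feature sets (two empty sets are treated as identical)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def count_near_neighbors(fingerprints, cutoff=0.15):
    """For each fingerprint, count the other fingerprints closer than the cutoff."""
    counts = []
    for i, fp in enumerate(fingerprints):
        counts.append(
            sum(
                1
                for j, other in enumerate(fingerprints)
                if j != i and tanimoto_distance(fp, other) < cutoff
            )
        )
    return counts


def bin_label(n_neighbors):
    """Map a neighbor count to the bins used in the table: 0, 1-2, 3-9, >=10."""
    if n_neighbors == 0:
        return "0"
    if n_neighbors <= 2:
        return "1-2"
    if n_neighbors <= 9:
        return "3-9"
    return ">=10"


# Toy example: the first two sets share 10 of 11 features (distance about 0.09).
toy = [set(range(10)), set(range(11)), {100, 101}]
neighbor_counts = count_near_neighbors(toy)
print(neighbor_counts)                          # [1, 1, 0]
print([bin_label(n) for n in neighbor_counts])  # ['1-2', '1-2', '0']
```

The all-pairs comparison is quadratic in the number of compounds, so a data set the size of the rat CLi set would need an indexed or vectorized similarity search.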

Page 22: Tam June 2009

Rat Model

Summary of Performance Diagnostics for Methods Used (a)

Predictor     N      ACC                  PPV                  NPV                  TPR                  FPR                  ROC AUC
Rat Model     3077   0.74 (0.73 – 0.75)   0.73 (0.71 – 0.75)   0.76 (0.74 – 0.78)   0.79 (0.77 – 0.81)   0.31 (0.29 – 0.33)   0.82 (0.80 – 0.83)
Mouse CL      409    0.74 (0.70 – 0.77)   0.74 (0.70 – 0.79)   0.72 (0.66 – 0.78)   0.84 (0.80 – 0.88)   0.42 (0.35 – 0.48)   0.8 (0.76 – 0.84)
Rat CLi       5947   0.61 (0.60 – 0.62)   0.58 (0.57 – 0.59)   0.66 (0.64 – 0.68)   0.79 (0.78 – 0.80)   0.58 (0.57 – 0.60)   0.65 (0.64 – 0.67)

(a) The upper and lower limits of the 90% confidence intervals for each diagnostic are included in parentheses.

Page 23: Tam June 2009

Monkey Model

Summary of Performance Diagnostics for Methods Used (a)

Predictor      N     ACC                  PPV                  NPV                  TPR                  FPR                  ROC AUC
Monkey Model   206   0.76 (0.71 – 0.81)   0.92 (0.88 – 0.96)   0.42 (0.31 – 0.52)   0.77 (0.72 – 0.83)   0.29 (0.17 – 0.41)   0.81 (0.75 – 0.87)
Rat CL         835   0.7 (0.67 – 0.73)    0.90 (0.87 – 0.92)   0.35 (0.31 – 0.40)   0.71 (0.68 – 0.74)   0.35 (0.29 – 0.41)   0.74 (0.71 – 0.77)
Dog CL         569   0.67 (0.64 – 0.71)   0.84 (0.81 – 0.87)   0.36 (0.30 – 0.42)   0.71 (0.67 – 0.74)   0.44 (0.37 – 0.52)   0.73 (0.69 – 0.77)
Monkey CLi     486   0.60 (0.57 – 0.64)   0.85 (0.81 – 0.89)   0.35 (0.30 – 0.40)   0.58 (0.53 – 0.62)   0.31 (0.24 – 0.37)   0.69 (0.66 – 0.73)

(a) The upper and lower limits of the 90% confidence intervals for each diagnostic are included in parentheses.