Tam June 2009

Prediction of Animal Clearance Using Naïve Bayesian Classification and Extended Connectivity Fingerprints

Timothy A McIntyre

Introduction

• The pharmaceutical industry faces unprecedented pressure from payers, regulators, ethicists and the general public.

• The cost of drug development is staggering: US $1 Billion.

• There is increasing demand for new medicines with well established safety and efficacy.

• Attrition during drug discovery and development remains high.

• Absorption, distribution, metabolism and excretion (ADME) screening has significantly reduced attrition due to poor pharmacokinetics in humans.

Pharmacokinetics and Clearance

• What is Pharmacokinetics (PK)?

– The study of what the body does to a drug

– Characterization of the drug concentration-time profile

• Drug Clearance (CL) is a primary determinant of drug PK.

– Rate of Elimination = CL x Drug Concentration.

– Measures the efficiency of irreversible elimination from the body.

– Often a result of metabolism by the liver.

• High CL limits systemic exposure and oral bioavailability.
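A minimal worked sketch of the relationships above. The elimination rate follows the slide's equation directly. The exposure and bioavailability formulas are standard PK relationships rather than content from these slides: AUC = Dose / CL assumes linear PK after an intravenous dose, and the bioavailability ceiling 1 - CL/Q assumes elimination is entirely hepatic and absorption is complete. Function names and numeric values are illustrative only.

```python
def elimination_rate(cl_ml_min_kg: float, conc_ug_ml: float) -> float:
    """Rate of elimination (ug/min/kg) = CL (mL/min/kg) x concentration (ug/mL)."""
    return cl_ml_min_kg * conc_ug_ml


def iv_auc(dose_ug_kg: float, cl_ml_min_kg: float) -> float:
    """Exposure after an IV dose: AUC (ug*min/mL) = Dose / CL (assumes linear PK)."""
    return dose_ug_kg / cl_ml_min_kg


def max_oral_bioavailability(cl_ml_min_kg: float, q_ml_min_kg: float) -> float:
    """Upper bound on oral bioavailability, 1 - CL/Q.

    Assumption (not from the slides): clearance is purely hepatic and the
    dose is completely absorbed. Clipped at 0 when CL exceeds blood flow.
    """
    return max(0.0, 1.0 - cl_ml_min_kg / q_ml_min_kg)


# Illustrative values: the same 1000 ug/kg IV dose at low and high clearance.
low_cl_auc = iv_auc(1000.0, 10.0)
high_cl_auc = iv_auc(1000.0, 50.0)
print(low_cl_auc, high_cl_auc)  # a five-fold higher CL gives five-fold lower AUC
```

With a hypothetical liver blood flow of 90 mL/min/kg, a CL of 72 mL/min/kg leaves a bioavailability ceiling of about 20%, which is why high-CL compounds are deprioritized.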

Experimental ADME and lead optimization

• In vitro metabolic stability (intrinsic CL in liver microsomes) is commonly used to prioritize compounds for in vivo studies.

• Animal studies are more resource intensive, more expensive and lower throughput.

• Rodent PK is evaluated before PK in higher species, using smaller amounts of compound.

• Appropriate PK is essential for subsequent evaluation in resource-intensive pharmacology or toxicology models.

• Animal PK is used to predict PK in humans and set appropriate starting doses for initial clinical trials.

In Silico ADME – Background of our approach

• Potential to provide unlimited information at low cost.

– Often based on descriptive physico-chemical properties.

• Limited precedent for modeling animal CL directly using detailed structural information.

• Our approach: Bayesian classification and extended connectivity fingerprints.

• Compared model performance to common experimental approaches.

– in vitro liver microsome intrinsic CL (CLi)

– animal CL in a lower species
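To make the approach concrete, here is a minimal sketch of naïve Bayesian classification over binary fingerprint features. It is a textbook Bernoulli naïve Bayes with add-one smoothing, not the Pipeline Pilot estimator used in the study, whose details the slides do not give. Fingerprints are represented as sets of integer feature identifiers; generating them from structures is outside this sketch, and the toy data are invented for illustration.

```python
import math
from collections import Counter


def train(fingerprints, labels):
    """Fit per-class feature probabilities.

    fingerprints: list of sets of int feature ids (hypothetical ECFP bits).
    labels: list of bool, True = high CL. Both classes must be present.
    """
    vocab = set().union(*fingerprints)
    model = {}
    for cls in (True, False):
        members = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        counts = Counter(f for fp in members for f in fp)
        n = len(members)
        model[cls] = {
            "log_prior": math.log(n / len(labels)),
            # Add-one smoothing keeps every probability strictly inside (0, 1).
            "p": {f: (counts[f] + 1) / (n + 2) for f in vocab},
        }
    return model, vocab


def log_odds_high(model, vocab, fp):
    """Log odds of the high-CL class; features unseen in training are ignored."""
    score = {}
    for cls in (True, False):
        s = model[cls]["log_prior"]
        for f in vocab:
            p = model[cls]["p"][f]
            s += math.log(p) if f in fp else math.log(1.0 - p)
        score[cls] = s
    return score[True] - score[False]


# Toy training set: features 1-4 appear in high-CL compounds, 5-7 in low-CL ones.
fps = [{1, 2, 3}, {1, 2, 4}, {5, 6}, {5, 7}]
labels = [True, True, False, False]
model, vocab = train(fps, labels)
print(log_odds_high(model, vocab, {1, 2}) > 0)  # classified as high CL
print(log_odds_high(model, vocab, {5}) > 0)     # classified as low CL
```

The "naïve" part is the independence assumption: each feature contributes its own log-probability term, so training reduces to counting feature occurrences per class.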

Experimental Data

• Mouse, rat, dog and monkey in vivo CL and in vitro CLi.

• GSK corporate database (20,000 unique compounds).

• Animal CL normalized to liver blood flow.

– 30 to 90 mL/min/kg depending on the species.

• CL >70% liver blood flow considered high (<70%, low).

• CLi >5 mL/min/g considered high (<5 mL/min/g, low).

– Geometric (log-scale) mid-point of the assay dynamic range (0.5 to 50 mL/min/g tissue).
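The labeling rules above can be sketched as follows. The thresholds come from the slide; the function names are illustrative, and the liver blood flow passed in the example (90 mL/min/kg, the top of the stated range) is a placeholder, because the slides do not list the per-species values. The slide leaves a value exactly at the threshold undefined; this sketch treats it as low.

```python
import math

HIGH_CL_FRACTION = 0.70    # slide: CL > 70% of liver blood flow is "high"
HIGH_CLI_ML_MIN_G = 5.0    # slide: CLi > 5 mL/min/g is "high"


def normalize_cl(cl_ml_min_kg: float, liver_blood_flow_ml_min_kg: float) -> float:
    """Express in vivo CL as a fraction of liver blood flow (CL/Q)."""
    return cl_ml_min_kg / liver_blood_flow_ml_min_kg


def classify_cl(cl_ml_min_kg: float, liver_blood_flow_ml_min_kg: float) -> str:
    """Label in vivo CL as 'high' or 'low' using the 70% threshold."""
    fraction = normalize_cl(cl_ml_min_kg, liver_blood_flow_ml_min_kg)
    return "high" if fraction > HIGH_CL_FRACTION else "low"


def classify_cli(cli_ml_min_g: float) -> str:
    """Label in vitro intrinsic CL as 'high' or 'low' using the 5 mL/min/g threshold."""
    return "high" if cli_ml_min_g > HIGH_CLI_ML_MIN_G else "low"


# The CLi threshold is the geometric mean of the assay range limits.
print(math.sqrt(0.5 * 50.0))        # 5.0

# Placeholder blood flow of 90 mL/min/kg; real values depend on the species.
print(classify_cl(72.0, 90.0))      # 0.8 of blood flow -> high
print(classify_cl(45.0, 90.0))      # 0.5 of blood flow -> low
print(classify_cli(7.2))            # high
```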

Bayesian Modeling

• Pipeline Pilot™ (Accelrys, Inc., San Diego, CA, USA).

• Extended connectivity fingerprints (six bond diameter) using simplified molecular input line entry specification (SMILES) strings as input.

• Compounds randomly assigned to training or test sets in a 5:1 ratio.

• Model predictions (high/low CL) for each test set were compared to the corresponding experimental data.

– Accuracy (ACC), Positive Predictive Value (PPV), Negative Predictive Value (NPV), True Positive Rate (TPR), False Positive Rate (FPR), Receiver Operating Characteristic (ROC) AUC

– 90% confidence intervals, p values
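The evaluation steps above can be sketched in plain Python. This is an illustration of the standard definitions, not the study's code: the split helper, the choice of which class counts as "positive" and the toy labels are all assumptions, and the functions assume both classes occur among the true labels and the predictions so that no denominator is zero.

```python
import random


def split_5_to_1(items, seed=0):
    """Randomly assign items to training and test sets in a 5:1 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = len(shuffled) // 6
    return shuffled[n_test:], shuffled[:n_test]


def diagnostics(y_true, y_pred):
    """Confusion-matrix diagnostics for boolean labels (True = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return {
        "ACC": (tp + tn) / len(pairs),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
    }


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with three positives and three negatives.
y_true = [True, True, True, False, False, False]
y_pred = [True, True, False, True, False, False]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(diagnostics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

ROC AUC is computed from the continuous model score rather than the high/low call, so it does not depend on the classification cutoff the way the other five diagnostics do.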

Descriptive Statistics: Experimental CL and CLi

Summary of animal in vivo clearance and in vitro intrinsic clearance data (a)

Statistic   Mouse CL   Rat CL   Monkey CL   Dog CL   Mouse CLi   Rat CLi   Monkey CLi   Dog CLi
N           1369       17529    1129        2690     3021        42470     3600         3700
Q1          0.28       0.33     0.20        0.31     1.25        1.10      2.02         0.86
Q2          0.60       0.68     0.38        0.65     4.78        3.50      7.20         2.00
Q3          0.92       1.21     0.63        1.17     21.74       11.96     22.00        7.20

(a) CL values are expressed as a fraction of liver blood flow; CLi units are mL/min/g liver. Q1, Q2 (median value) and Q3 are the first, second and third quartiles.

Summary of animal in vivo clearance

[Figure: percentile versus normalized CL for Mouse CL (N = 1369), Rat CL (N = 17529), Dog CL (N = 2690) and Monkey CL (N = 1129).]

Chemical Diversity

• 20,000 unique compounds representing hundreds of lead optimization programs.

• Self-similarity tests, ring analysis and Murcko assemblies demonstrate substantial structural diversity.

• The top 20 rings accounted for 45-51% of the compounds, depending on the species.

• Median frequency of any particular Murcko assembly was less than 0.1%.

– Corresponds to ~18 compounds with Rat CL sharing the same assembly

[Figure: example ring systems: benzimidazole, quinoline, pyrido-pyrimidine.]

Near Neighbors

[Figure: distribution of compounds across near-neighbor bins (0, 1-2, 3-4, 5-9, ≥10) for Rat CL, Mouse CL, Dog CL and Monkey CL.]

Dog Model

Summary of Performance Diagnostics for Methods Used (a)

Predictor   N      ACC                  PPV                  NPV                  TPR                  FPR                  ROC AUC
Dog Model   490    0.76 (0.72 – 0.79)   0.72 (0.68 – 0.77)   0.8 (0.76 – 0.85)    0.83 (0.79 – 0.87)   0.32 (0.27 – 0.37)   0.81 (0.78 – 0.85)
Rat CL      1417   0.7 (0.68 – 0.72)    0.77 (0.75 – 0.80)   0.58 (0.54 – 0.61)   0.77 (0.75 – 0.79)   0.42 (0.38 – 0.45)   0.74 (0.71 – 0.76)
Mouse CL    202    0.65 (0.60 – 0.71)   0.72 (0.65 – 0.78)   0.56 (0.47 – 0.65)   0.72 (0.65 – 0.78)   0.44 (0.35 – 0.53)   0.68 (0.62 – 0.75)
Dog CLi     610    0.71 (0.68 – 0.74)   0.71 (0.67 – 0.74)   0.74 (0.67 – 0.80)   0.91 (0.89 – 0.94)   0.6 (0.55 – 0.66)    0.75 (0.70 – 0.79)

(a) The upper and lower limits of the 90% confidence intervals for each diagnostic are included in parentheses.

FPR Comparisons

[Figure: FPR (axis 0.1 to 0.7) by predictor: Dog Model, Rat CL, Dog CLi, Rat Model, Mouse CL, Rat CLi, Mouse Model, Mouse CLi. Annotated p values: p = 0.012, p = 0.008, p < 0.001, p < 0.001, p = 0.002.]

ROC AUC Comparisons

[Figure: ROC AUC (axis 0.50 to 0.90) by predictor: Dog Model, Rat CL, Mouse CL, Rat Model, Rat CLi, Mouse Model, Mouse CLi, Monkey Model, Monkey CLi. Annotated p values: p < 0.001, p = 0.008, p = 0.007, p = 0.002, p = 0.003.]

NPV Comparisons

[Figure: NPV (axis 0.4 to 0.9) by predictor: Dog Model, Rat CL, Mouse CL, Rat Model, Rat CLi. Annotated p values: p < 0.001, p < 0.001, p < 0.001.]

Effect of Optimization on Rat CL

[Figure: percentile versus normalized CL (CL/Q) for three compound sets: Rat, Rat with Dog, Rat with Monkey.]

Key Messages

• Bayesian model performance exceptional.

• ROC AUC, ACC, PPV, NPV and TPR ranging from 0.72 to 0.82.

• In predicting dog CL, the Bayesian model was better than experimental rat or mouse CL.

• In predicting rat CL, the Bayesian model performed just as well as mouse CL.

• Models outperformed mouse, rat and monkey CLi for predicting mouse, rat and monkey CL, respectively.

• Models have higher negative predictive value than the experimental predictors (compounds predicted by the Bayesian model to have high CL tend to have high experimental CL).

• Lead optimization bias can affect modeling success (monkey).

Conclusions

• Study demonstrates the potential of naïve Bayesian classification to predict animal CL based on structural fingerprints.

• Models can be used to:

– optimize chemical libraries

– direct new chemical synthesis

– increase the efficiency of screening cascades

• Significant potential to reduce cost, time and animal usage associated with the discovery of new medicines.

Acknowledgements

• GSK Drug discovery scientists

• Charles B Davis

• Robert Gagnon

• Amber Anderson

• Chao Han

• John Conway (Accelrys)

• Keith Ward (Bausch & Lomb)

Backup Slides

Near Neighbors

Distribution of Near Neighbors: Animal CL and CLia

NN     Mouse       Rat          Monkey      Dog

CL     N = 1,369   N = 17,529   N = 1,129   N = 2,690
0      32.4        22.4         34.4        25.8
1-2    26.2        23.4         27.7        24.0
3-9    26.8        29.0         27.0        26.7
≥10    14.6        25.2         11.0        23.5

CLi    N = 3,691   N = 43,118   N = 3,634   N = 3,732
0      23.7        15.4         21.8        23.7
1-2    26.4        19.2         23.6        26.2
3-9    26.2        29.5         25.6        26.5
≥10    23.8        35.9         29.0        23.6

(a) Percentage of compounds in various NN (near neighbor) bins. Tanimoto distance <0.15.
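A minimal sketch of how near-neighbor counts like those above could be computed, assuming each compound's fingerprint is a set of integer feature identifiers and that Tanimoto distance is 1 minus the Tanimoto similarity. The slides do not say which fingerprint or software produced the table, so the function names, the toy fingerprints and the all-pairs loop are illustrative only.

```python
def tanimoto_distance(a: set, b: set) -> float:
    """1 - |A intersect B| / |A union B|; two empty fingerprints count as identical."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def near_neighbor_counts(fingerprints, cutoff=0.15):
    """For each compound, count the other compounds closer than the cutoff."""
    counts = [0] * len(fingerprints)
    for i in range(len(fingerprints)):
        for j in range(i + 1, len(fingerprints)):
            if tanimoto_distance(fingerprints[i], fingerprints[j]) < cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


def nn_bin(count: int) -> str:
    """Map a neighbor count onto the bins used in the table."""
    if count == 0:
        return "0"
    if count <= 2:
        return "1-2"
    if count <= 9:
        return "3-9"
    return ">=10"


# Toy fingerprints: the first two share 19 of 21 features (distance about 0.095).
fps = [set(range(20)), set(range(1, 21)), {100, 101}]
counts = near_neighbor_counts(fps)
print(counts)                       # [1, 1, 0]
print([nn_bin(c) for c in counts])  # ['1-2', '1-2', '0']
```

The all-pairs loop is quadratic in the number of compounds, so a set the size of the rat CLi data would call for a vectorized or indexed search in practice.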

Rat Model

Summary of Performance Diagnostics for Methods Used (a)

Predictor   N      ACC                  PPV                  NPV                  TPR                  FPR                  ROC AUC
Rat Model   3077   0.74 (0.73 – 0.75)   0.73 (0.71 – 0.75)   0.76 (0.74 – 0.78)   0.79 (0.77 – 0.81)   0.31 (0.29 – 0.33)   0.82 (0.80 – 0.83)
Mouse CL    409    0.74 (0.70 – 0.77)   0.74 (0.70 – 0.79)   0.72 (0.66 – 0.78)   0.84 (0.80 – 0.88)   0.42 (0.35 – 0.48)   0.8 (0.76 – 0.84)
Rat CLi     5947   0.61 (0.60 – 0.62)   0.58 (0.57 – 0.59)   0.66 (0.64 – 0.68)   0.79 (0.78 – 0.80)   0.58 (0.57 – 0.60)   0.65 (0.64 – 0.67)

(a) The upper and lower limits of the 90% confidence intervals for each diagnostic are included in parentheses.
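As a rough consistency check on the intervals, the sketch below computes a two-sided 90% confidence interval for a proportion using the normal approximation. The slides do not state how the intervals were derived, so this method is an assumption. Applied to the Rat Model accuracy (ACC 0.74, N = 3077), it gives limits that round to the reported 0.73 and 0.75.

```python
import math

Z_90 = 1.645  # two-sided 90% quantile of the standard normal distribution


def proportion_ci(p_hat: float, n: int, z: float = Z_90):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Rat Model accuracy from the table: ACC = 0.74 on N = 3077 test compounds.
lower, upper = proportion_ci(0.74, 3077)
print(round(lower, 2), round(upper, 2))  # 0.73 0.75
```

The same formula shows why the smaller test sets carry wider intervals: the half-width shrinks with the square root of N.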

Monkey Model

Summary of Performance Diagnostics for Methods Used (a)

Predictor      N     ACC                  PPV                  NPV                  TPR                  FPR                  ROC AUC
Monkey Model   206   0.76 (0.71 – 0.81)   0.92 (0.88 – 0.96)   0.42 (0.31 – 0.52)   0.77 (0.72 – 0.83)   0.29 (0.17 – 0.41)   0.81 (0.75 – 0.87)
Rat CL         835   0.7 (0.67 – 0.73)    0.90 (0.87 – 0.92)   0.35 (0.31 – 0.40)   0.71 (0.68 – 0.74)   0.35 (0.29 – 0.41)   0.74 (0.71 – 0.77)
Dog CL         569   0.67 (0.64 – 0.71)   0.84 (0.81 – 0.87)   0.36 (0.30 – 0.42)   0.71 (0.67 – 0.74)   0.44 (0.37 – 0.52)   0.73 (0.69 – 0.77)
Monkey CLi     486   0.60 (0.57 – 0.64)   0.85 (0.81 – 0.89)   0.35 (0.30 – 0.40)   0.58 (0.53 – 0.62)   0.31 (0.24 – 0.37)   0.69 (0.66 – 0.73)

(a) The upper and lower limits of the 90% confidence intervals for each diagnostic are included in parentheses.

