Compound Set Enrichment: A novel approach to analysis of primary HTS data
Thibault Varin, Ansgar Schuffenhauer, Gubler, H., Parker, C., Zhang, JH., Raman, P., Ertl, P.

Jan 30, 2016

Transcript
Page 1: Compound Set Enrichment

A novel approach to analysis of primary HTS data

Compound Set Enrichment

Thibault Varin Ansgar Schuffenhauer

Gubler, H., Parker, C., Zhang, JH., Raman, P., Ertl, P.

Page 2: Compound Set Enrichment

INTRODUCTION

| Compound Set Enrichment | Thibault Varin | 10/07/14 | 2

Compound Set Enrichment

Page 3: Compound Set Enrichment

Introduction

Active series identification: Can relevant SAR be extracted from primary HTS data?

Are activity data binary or continuous?


Page 4: Compound Set Enrichment

Introduction

Active series identification

Hypothesis 1: Within primary HTS data, structure-activity relationships (SAR) are apparent and can be used to help select active compound classes.

Page 5: Compound Set Enrichment

Introduction

Are the activity data binary or continuous?

[Figure: activity of the compounds of Scaffold 1 and Scaffold 2; legend: active compound (binary), inactive compound (binary)]

Binary activity:
- 1 active / 5 inactives
- Scaffold 1 = Scaffold 2

Continuous activity: Scaffold 1 > Scaffold 2

Page 6: Compound Set Enrichment

Introduction

Are the activity data binary or continuous?

[Figure: activity of the same compounds classified with two different cut-offs, Threshold 1 and Threshold 2; legend: active compound (binary), inactive compound (binary)]

Binary scaffold activity differs depending on the threshold.

Hypothesis 2:

Methods based on an activity cut-off distort the activity information leading to the incorrect assignment of active series of compounds.
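
As a rough illustration of Hypothesis 2, the sketch below applies two cut-offs to the same continuous data. The activity values, the two cut-offs (50 and 30) and the helper names are invented for illustration; none of them come from the slides.

```python
# Hypothetical percent-inhibition values for two scaffolds (illustrative only).
scaffold_1 = [85.0, 45.0, 42.0, 40.0, 38.0, 35.0]
scaffold_2 = [85.0, 5.0, 4.0, 3.0, 2.0, 1.0]


def active_fraction(values, threshold):
    """Fraction of compounds called active at a given activity cut-off."""
    return sum(v >= threshold for v in values) / len(values)


def mean(values):
    """Arithmetic mean, used here as a simple continuous summary."""
    return sum(values) / len(values)


for threshold in (50.0, 30.0):
    f1 = active_fraction(scaffold_1, threshold)
    f2 = active_fraction(scaffold_2, threshold)
    print(f"cut-off {threshold}: scaffold 1 = {f1:.2f}, scaffold 2 = {f2:.2f}")

# The continuous view does not depend on any cut-off.
print(f"mean activity: scaffold 1 = {mean(scaffold_1):.1f}, "
      f"scaffold 2 = {mean(scaffold_2):.1f}")
```

With the stricter cut-off both scaffolds show 1 active and 5 inactives and look identical; with the looser one they differ. The continuous summary separates them in both cases.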


Page 7: Compound Set Enrichment

METHODS


Compound Set Enrichment

Page 8: Compound Set Enrichment

Methods

The Scaffold Tree classification

The Scaffold Tree – Visualization of the Scaffold Universe by Hierarchical Scaffold Classification. A. Schuffenhauer, P. Ertl et al., J. Chem. Inf. Model., 47, 47, 2007

Page 9: Compound Set Enrichment

Methods

Datasets

PubChem Annotation from CRC

Simulation of the primary screening data

Hypothesis 1

Page 10: Compound Set Enrichment

Methods

Single hypothesis test: summary procedure

1. State the null and the alternative hypotheses

- H0: "the scaffold is inactive"

- H1: "the scaffold is active"

2. Specify a significance level: α = 0.01

3. Compute the test statistic and the p-value → p-value = probability, if the scaffold is inactive (H0), of observing data at least as extreme as those measured

4. Decision step:

- p-value > α: H0 is not rejected

- p-value < α: H0 is rejected in favour of H1: "the scaffold is active"

Page 11: Compound Set Enrichment

Methods

The KS and the Binomial hypothesis tests

[Figure labels: Bioassay, Scaffold; Actives, Inactives]

Continuous data: KS test

H0: there is no difference between the activity distribution of the compounds having the scaffold S3-2 and the background distribution.

Binary data: Binomial test

H0: there is no difference between the proportion of active compounds among the compounds having the scaffold S3-2 and the proportion of active compounds in the full dataset.
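
The two tests can be sketched with the Python standard library only. This is a minimal sketch under stated assumptions, not the authors' implementation: the slides do not say how the p-values were computed or whether the tests were one- or two-sided. Here the KS p-value uses the usual asymptotic Kolmogorov series (two-sided), the binomial test is the one-sided upper tail, and all data and function names are hypothetical.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample, background):
    """Two-sample KS statistic D and an asymptotic two-sided p-value."""
    a, b = sorted(sample), sorted(background)
    n, m = len(a), len(b)
    # Largest gap between the two empirical CDFs, checked at every observed value.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.1:
        # The series below converges slowly here; the p-value is 1 in practice.
        return d, 1.0
    series = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                 for j in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k actives among n."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Hypothetical continuous data: 100 background compounds, 10 scaffold members.
background = [float(v) for v in range(100)]
scaffold = [float(v) for v in range(90, 100)]
d, p_ks = ks_two_sample(scaffold, background)
print(f"KS: D = {d:.2f}, significant at 0.01: {p_ks < 0.01}")

# Hypothetical binary data: 5 actives among 10 scaffold members,
# against a 2% active rate in the full dataset.
p_bin = binomial_upper_tail(5, 10, 0.02)
print(f"Binomial: significant at 0.01: {p_bin < 0.01}")
```

The KS test consumes the raw activity values, so it needs no cut-off; the binomial test only sees counts of actives, which depend on the threshold used to define "active".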

Page 12: Compound Set Enrichment

Methods

Multiple hypothesis tests: Bonferroni correction

Problem of false positives

• α = probability of identifying an inactive scaffold as active (for each test performed)

• With 100 inactive scaffolds, the probability of identifying at least one as "active" by chance is 63% (1 − 0.99^100)

Suggests testing each scaffold at a critical significance level of α = 0.01 / number of scaffolds

Makes the assumption that the individual tests are independent

Each level in the Scaffold Tree was treated separately
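
The arithmetic on this slide can be checked with a short sketch. The scaffold names and p-values below are hypothetical, and applying the correction once per Scaffold Tree level is one reading of the last bullet, not a confirmed detail of the authors' procedure.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_significant(p_values, alpha=0.01):
    """Names whose p-value is below alpha divided by the number of tests."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [name for name, p in p_values.items() if p < threshold]


# 100 inactive scaffolds, each tested at alpha = 0.01.
print(round(familywise_error(0.01, 100), 3))  # 0.634

# Hypothetical p-values grouped by Scaffold Tree level; each level is
# corrected with its own number of scaffolds.
levels = {
    1: {"S1-1": 2e-6, "S1-2": 0.2},
    2: {"S2-1": 0.004, "S2-2": 1e-5, "S2-3": 0.5, "S2-4": 0.03},
}
for level, p_values in levels.items():
    print(level, bonferroni_significant(p_values))
```

In this made-up example, S2-1 (p = 0.004) would pass an uncorrected test at α = 0.01 but fails the corrected threshold of 0.01 / 4 = 0.0025.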


Page 13: Compound Set Enrichment

Methods

Determining the activity of classes

[Workflow diagram labels: Hypothesis 1; Hypothesis 2; scaffold activity evaluation; comparison of results; multiple hypothesis test correction (Bonferroni)]

Page 14: Compound Set Enrichment

RESULTS


Compound Set Enrichment

Page 15: Compound Set Enrichment

Results

Comparison of KSP and BTP predictions

                              Total               BPCA significantly actives    BPCA non significantly actives
Bioassay                      KSP   BTP   Δ       BPCA   KSP   BTP   Δ          KSP   BTP   Δ
Hydroxysteroid dehydrogenase  330   231   +99     199    183   168   +15        147   63    +84
Caspase-1                     331   114   +217    5      2     2     0          329   112   +217
PK                            12    4     +8      12     3     3     0          9     1     +8
Luciferase                    67    12    +55     15     13    11    +2         54    1     +53
Luciferase                    178   48    +130    41     32    35    -3         146   13    +133
CYP450 2C9                    58    33    +25     34     34    31    +3         24    2     +22
CYP450 3A4                    121   64    +57     60     60    53    +7         61    11    +50

With:
- KSP: KS Prediction
- BTP: Binomial Threshold Prediction
- Δ: KSP − BTP
- BPCA: Binomial PubChem Annotation

Both KSP and BTP retrieve the BPCA significantly active classes

Number of active classes: KSP > BTP

Most of the new KSP active classes are not BPCA significantly active

Page 16: Compound Set Enrichment

Results

KSP significantly active scaffolds that are inactive in PubChem

[Figure: scaffold structures; legend "Compound activity (PubChem Annotation)": Active, Inconclusive, Inactive; labels "WA"; annotations "Inconclusives?"]

Page 17: Compound Set Enrichment

Results

Prioritize nodes instead of individual scaffolds

[Figure legend: Scaffold activity (KS Prediction / Bonferroni): significantly active, non significantly active]

Page 18: Compound Set Enrichment

Results

Visualization tool (Peter Ertl)

Page 19: Compound Set Enrichment

CONCLUSION


Compound Set Enrichment

Page 20: Compound Set Enrichment

Conclusion

Compound Set Enrichment

Validation of initial hypotheses

A method to mine HTS data and identify active series of compounds

• Chemical classification: Scaffold Tree

• Statistical analysis: Kolmogorov-Smirnov hypothesis test

• Multiple hypothesis test correction: Bonferroni correction

Use all primary data

No activity cut-off

Identification of new active scaffolds not necessarily represented by very active compounds (latent hits) during the primary screen

Page 21: Compound Set Enrichment

Acknowledgments

With many thanks to

Primary mentor:
- Ansgar Schuffenhauer

Scientific advisers:
- Christian Parker
- Hanspeter Gubler
- Ji-Hu Zhang
- Peter Ertl
- Edgar Jacoby

Help: MLI group

Fellowship: Education office

Discussions:
- Martin Beibel
- Sebastian Bergling
- Meir Glick
- Alain Dietrich
- Marie-Cecile Didiot

Page 22: Compound Set Enrichment

Questions?
