A knowledge-based approach for reaction generation Development, validation and applications Dimitar Hristozov, 04.06.2009
Page 1

A knowledge-based approach for reaction generation

Development, validation and applications

Dimitar Hristozov, 04.06.2009

Page 2

Motivation

public data: public reaction databases ∪ commercial reaction databases
>1,500,000 reactions covering general organic chemistry

proprietary reaction databases: medicinal chemists' lab notebooks (eLN)
large number of reactions per year, strong medicinal chemistry bias

A wealth of reaction data:
extract some of the knowledge hidden in these data
use this knowledge to assist the medicinal chemist
suggest new, synthetically feasible molecules with desired bio profile

Page 3

Reaction vectors

From reaction database to knowledge base

[Figure: reaction scheme R1 + R2 → P]

No.    1     2     3      4
Bond   C-C   C=O   C-OH   C-OR
R      4     1     2      0      reactant vector, R = (R1 + R2)
P      4     1     0      2      product vector, P
D      0     0     -2     2      reaction vector, D = P - R

Patel, H., Bodkin, M.J., Chen, B., Gillet, V.J. A Knowledge-Based Approach to De Novo Design Using Reaction Vectors, J. Chem. Inf. Model., 2009, ASAP article
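
The vector arithmetic on this slide can be sketched in a few lines of Python. The bond counts are the ones given on the slide; the function names are illustrative only and are not taken from the talk or the cited paper.

```python
# Bond types in the order used on the slide.
BOND_TYPES = ["C-C", "C=O", "C-OH", "C-OR"]

def reaction_vector(reactant, product):
    """Return D = P - R, element by element."""
    return [p - r for p, r in zip(product, reactant)]

def apply_reaction_vector(reactant, delta):
    """Return P = D + R, element by element."""
    return [r + d for r, d in zip(reactant, delta)]

R = [4, 1, 2, 0]   # reactant vector, R = R1 + R2
P = [4, 1, 0, 2]   # product vector

D = reaction_vector(R, P)
print(dict(zip(BOND_TYPES, D)))
# {'C-C': 0, 'C=O': 0, 'C-OH': -2, 'C-OR': 2}

# Adding D back to R recovers the product vector.
restored = apply_reaction_vector(R, D)
```

Two C-OH bonds disappear and two C-OR bonds appear, while the C-C and C=O counts are unchanged.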

Page 4

From reaction vector to products (I)

The reaction vector, D, equals the difference between the product vector, P, and the reactant vector, R:

D = P - R

Given a reaction vector, D, and a reactant vector, R, the product vector, P, can be obtained:

P = D + R

Given a product vector, P, can we reconstruct the product molecule(s)?

No.    1     2     3      4
Bond   C-C   C=O   C-OH   C-OR
#      4     1     0      2

[Figure: candidate product structures]

A better descriptor is required

Page 5

Extended atom pairs

[Figure: example molecule with heavy atoms numbered 1 to 8; atoms 5 and 6 are oxygens]

Atom types

No.   Symbol   n   p   r   Type
4     C        3   1   0   C(3,1,0)
5     O        2   0   0   O(2,0,0)
7     C        2   0   0   C(2,0,0)

n: number of bonds to heavy atoms
p: number of π bonds
r: number of ring memberships

Atom pairs

Atom Pair                 Atoms
C(3,1,0)-2(1)-O(2,0,0)    4-5
C(2,0,0)-2(1)-O(2,0,0)    7-5
C(2,0,0)-3-C(3,1,0)       2-4; 7-4

AP2: atoms 1 bond away
AP3: atoms 2 bonds away
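
A minimal sketch of how such atom types and atom pairs could be counted. It rests on several assumptions that are not stated on the slide: the molecule is taken to be the chain 1-2-3-4 with a carbonyl oxygen 6 on atom 4, oxygen 5 bridging atoms 4 and 7, and atom 8 on atom 7 (this reading is consistent with the table rows shown); the number in parentheses in an AP2 is taken to be the bond order; the molecule is acyclic, so r is fixed at 0; and the two atom types in a pair are written in alphabetical order, which is this sketch's own convention.

```python
from collections import Counter
from itertools import combinations

# Assumed example molecule: element per atom number, bond order per bond.
ATOMS = {1: "C", 2: "C", 3: "C", 4: "C", 5: "O", 6: "O", 7: "C", 8: "C"}
BONDS = {(1, 2): 1, (2, 3): 1, (3, 4): 1, (4, 5): 1,
         (4, 6): 2, (5, 7): 1, (7, 8): 1}

def adjacency(bonds):
    """Map each atom to {neighbour: bond order}."""
    adj = {}
    for (a, b), order in bonds.items():
        adj.setdefault(a, {})[b] = order
        adj.setdefault(b, {})[a] = order
    return adj

def atom_type(atom, atoms, adj, rings=0):
    """Build 'Symbol(n,p,r)'; r is passed in because no ring perception is done."""
    n = len(adj[atom])                                  # bonds to heavy atoms
    p = sum(order - 1 for order in adj[atom].values())  # pi bonds
    return f"{atoms[atom]}({n},{p},{rings})"

def atom_pairs(atoms, bonds):
    """Count AP2 (bonded atoms) and AP3 (atoms two bonds apart), acyclic case."""
    adj = adjacency(bonds)
    types = {a: atom_type(a, atoms, adj) for a in atoms}
    pairs = Counter()
    for (a, b), order in bonds.items():
        t1, t2 = sorted((types[a], types[b]))
        pairs[f"{t1}-2({order})-{t2}"] += 1
    for centre in atoms:
        # Any two neighbours of the same atom are two bonds apart.
        for a, b in combinations(sorted(adj[centre]), 2):
            t1, t2 = sorted((types[a], types[b]))
            pairs[f"{t1}-3-{t2}"] += 1
    return pairs

pairs = atom_pairs(ATOMS, BONDS)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

For ring systems the AP3 step would need a real shortest-path calculation and r would need ring perception; both are left out here to keep the sketch short.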

Page 6

From reaction vector to products (II)

[Figure: reaction scheme R1 + R2 → P and candidate product structures]

product vector (P = D + R)

Atom Pair                 Count
C(1,0,0)-2(1)-C(2,0,0)    2
C(2,0,0)-2(1)-C(2,0,0)    1
C(2,0,0)-2(1)-C(3,1,0)    1
C(3,1,0)-2(1)-O(2,0,0)    1
C(3,1,0)-2(2)-O(1,1,0)    1
C(2,0,0)-2(1)-O(2,0,0)    1
C(1,0,0)-3-C(2,0,0)       1
C(2,0,0)-3-C(3,1,0)       2
C(2,0,0)-3-O(2,0,0)       1
C(2,0,0)-3-O(1,1,0)       1
O(2,0,0)-3-O(1,1,0)       1
C(1,0,0)-3-O(2,0,0)       1

“wrong” or “missing” atom pairs

C(2,1,0)-2(2)-O(1,1,0)
C(3,1,0)-2(1)-O(2,0,0)
C(3,0,0)-2(1)-O(2,0,0)
C(3,1,0)-2(1)-O(2,0,0)

Page 7

Reaction vectors in action

Reaction Vector

APs “Gained”                  APs “Lost”
+1 C(2,1,0)-2(2)-C(1,1,0)     -2 C(2,0,0)-2(1)-C(2,0,0)
+1 C(2,1,0)-2(1)-C(2,0,0)     -1 C(2,0,0)-2(1)-O(1,0,0)

Starting Molecule
Atoms/bonds selected for removal using APs lost
New atoms/bonds added using APs gained
Product

[Figure: the reaction drawn on a five-carbon chain with atoms numbered 1 to 5 and an OH group]
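
As a hedged illustration of where the "gained" and "lost" atom pairs come from, the sketch below subtracts reactant AP2 counts from product AP2 counts. The two count tables were worked out by hand for pentan-1-ol and pent-1-ene, which is an assumption about what the slide depicts; it does reproduce the four entries listed. Only AP2s are used, as on the slide, and the function names are illustrative.

```python
from collections import Counter

def reaction_vector(reactant_aps, product_aps):
    """D = P - R over atom-pair counts; entries that cancel out are dropped."""
    delta = Counter(product_aps)
    delta.subtract(reactant_aps)   # subtract() keeps negative counts
    return {ap: n for ap, n in delta.items() if n != 0}

def split_vector(delta):
    """Separate a reaction vector into gained (+) and lost (-) atom pairs."""
    gained = {ap: n for ap, n in delta.items() if n > 0}
    lost = {ap: n for ap, n in delta.items() if n < 0}
    return gained, lost

# Assumed example: AP2 counts for pentan-1-ol (reactant) and pent-1-ene (product).
reactant = Counter({
    "C(1,0,0)-2(1)-C(2,0,0)": 1,
    "C(2,0,0)-2(1)-C(2,0,0)": 3,
    "C(2,0,0)-2(1)-O(1,0,0)": 1,
})
product = Counter({
    "C(1,0,0)-2(1)-C(2,0,0)": 1,
    "C(2,0,0)-2(1)-C(2,0,0)": 1,
    "C(2,1,0)-2(1)-C(2,0,0)": 1,
    "C(2,1,0)-2(2)-C(1,1,0)": 1,
})

gained, lost = split_vector(reaction_vector(reactant, product))
print("gained:", gained)
print("lost:", lost)
```

`Counter.subtract` is used instead of the `-` operator because the operator silently discards negative counts, which are exactly the "lost" pairs.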

Page 8

Advantages

Does not require manual atom-atom mapping of the reaction centre

Makes use of the synthetic chemistry data collected through the years

Accounts for the synthetic accessibility of the proposed molecules – all transformations are derived from successful reactions

Is fast to apply – no substructure searching is required

Page 9

Good approach… so how is it implemented?

Page 10

Optimisation made easy

built as an Eclipse plug-in => 100% Java

Page 11

KNIME meets Chemaxon

Page 12

Sketcher

Page 13

File reader

Page 14

Reaction generator

Page 15

Convertor

Page 16

Multi-objective ranking

Page 17

File writer

Page 18

Marvin Views

Page 19

Looks great… but does it work?

Page 20

Reproducing reactions

5,695 diverse reactions

1. create knowledge base (2,902 reaction vectors)
2. for each reaction
3. retrieve its reaction vector
4. apply the reaction vector to the starting materials
5. is the product obtained in less than 30 seconds?

[Figure: example reaction (-H2O) with its reaction vector, listing APs Gained and APs Lost]
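
The validation procedure on this slide amounts to a small test harness. The sketch below is one way it could be organised, not the implementation used in the talk: `reproduce`, `toy_apply` and the toy data are hypothetical, and the time limit is checked after the call returns rather than by interrupting a long-running search.

```python
import time

def reproduce(reactions, apply_vector, time_limit=30.0):
    """Fraction of reactions whose known product is regenerated within the limit.

    reactions: iterable of (reactants, vector, expected_product)
    apply_vector: callable returning a list of candidate products
    """
    successes = 0
    total = 0
    for reactants, vector, expected in reactions:
        total += 1
        start = time.perf_counter()
        candidates = apply_vector(reactants, vector)
        elapsed = time.perf_counter() - start
        if elapsed < time_limit and expected in candidates:
            successes += 1
    return successes / total if total else 0.0

def toy_apply(reactants, vector):
    """Stand-in for the real generator: simply returns R + D as the only candidate."""
    return [tuple(r + d for r, d in zip(reactants, vector))]

toy_reactions = [
    ((4, 1, 2, 0), (0, 0, -2, 2), (4, 1, 0, 2)),   # expected product is regenerated
    ((4, 1, 2, 0), (0, 0, -2, 2), (4, 1, 1, 1)),   # expected product is not regenerated
]

rate = reproduce(toy_reactions, toy_apply)
print(f"{rate:.0%} reproduced")
```

In a real run, `apply_vector` would be the structure generator and the candidates would be molecules rather than count vectors.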

Page 21

How well did it work?

Products generated for ~90% of the 5,695 reactions

[Figure: bar chart "Reproducibility"; bars: product(s) generated, no product generated; y-axis: per cent (0 to 100)]

Page 22

How fast did it work?

Median run time: 0.015 seconds per reaction

[Figure: histogram "Execution Times"; x-axis: time / s (bins 0.05, 0.1, 0.5, 1, 5, 10, 15, 20, 25, 30, > 30); y-axis: per cent (0 to 80)]

Page 23

Epoxide reduction

[Figure: example epoxide reductions]

reproduced in a large variety of environments (350 reactions)

only one reaction was not reproduced

Page 24

Works like a charm…

More than 95% reproduced successfully

epoxide reduction
epoxide formation
ester to amide
alcohol dehydration
Friedel-Crafts acylation
nitro reduction
acid to aldehyde
nitrile to aldehyde
nitrile hydrolysis
alcohol amination
aldol condensation
alkene oxidation

[Figure: example reaction schemes for these reaction classes]

Page 25

Still works like a charm…

More than 90% reproduced successfully

olefin metathesis
amide reduction
ether halogenation
ozonolysis
alkene halogenation
Wittig-Horner
Beckmann rearrangement
Claisen rearrangement
Dieckmann condensation
olefination
Robinson annulation

[Figure: example reaction schemes for these reaction classes]

Page 26

Claisen condensation

[Figure: example Claisen condensations]

a variety of environments were tested

79 out of 100 reactions were successfully reproduced

21% of the reactions were not reproduced, mainly condensations (intra- and intermolecular) which result in ring closures

Page 27

Still works

More than 50% reproduced successfully

Cope rearrangement (67% success)
hetero Diels-Alder (73% success)
Claisen condensation (79% success)
Diels-Alder cycloaddition (49% success)
Fischer indole synthesis (57% success)

A large variety of reactions successfully reproduced

Small difficulties with complex cycle formations: improvements are on their way

[Figure: example reaction schemes for these reaction classes]

Page 28

Wow! Cool! It works! But what is its use?

Page 29

Generating new molecules

[Flowchart]

Starting molecule
Select reaction transform (from the knowledge base)
Can the transform be applied?
    no: discard reaction vector
    yes: is a second reagent required?
        yes: select suitable reagent (from the reagents database)
        no: continue
Apply reaction transform
New molecule
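
The control flow of this flowchart can be sketched as follows. Everything chemical is passed in as a callable, because the slide does not say how applicability or reagent suitability is decided; the function name, the parameter names and the toy data are all hypothetical, and the sketch tries every transform and every suitable reagent rather than selecting one.

```python
def generate(start, knowledge_base, reagents,
             can_apply, needs_reagent, is_suitable, apply_transform):
    """Return the new molecules reachable from `start` in one step."""
    new_molecules = []
    for transform in knowledge_base:                 # select reaction transform
        if not can_apply(start, transform):          # can the transform be applied?
            continue                                 # no: discard reaction vector
        if not needs_reagent(transform):             # is a second reagent required?
            new_molecules.append(apply_transform(start, transform, None))
            continue
        for reagent in reagents:                     # select suitable reagent
            if is_suitable(reagent, transform):
                new_molecules.append(apply_transform(start, transform, reagent))
    return new_molecules

# Toy data: molecules are reduced to a name and a set of functional-group labels.
knowledge_base = [
    {"name": "esterification", "requires": "COOH", "reagent_group": "OH"},
    {"name": "nitro reduction", "requires": "NO2", "reagent_group": None},
    {"name": "acid to aldehyde", "requires": "COOH", "reagent_group": None},
]
reagents = [
    {"name": "ethanol", "groups": {"OH"}},
    {"name": "benzene", "groups": set()},
]
start = {"name": "butanoic acid", "groups": {"COOH"}}

products = generate(
    start, knowledge_base, reagents,
    can_apply=lambda mol, t: t["requires"] in mol["groups"],
    needs_reagent=lambda t: t["reagent_group"] is not None,
    is_suitable=lambda r, t: t["reagent_group"] in r["groups"],
    apply_transform=lambda mol, t, r: (t["name"], mol["name"], r["name"] if r else None),
)
print(products)
```

In the real system the predicates would operate on atom-pair vectors and structures; the functional-group labels here only stand in for that logic.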

Page 30

Multi-objective de novo design

rank the proposed new molecules

direct the generation towards desired new molecules

[Figure: example proposed molecules]
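
The slides do not say how the multi-objective ranking is computed. Pareto dominance is one common choice for ranking candidates on several objectives at once, and the sketch below uses it purely as an illustration; the molecule names, the scores and the convention that higher scores are better are all assumptions.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_ranks(scores):
    """Rank 0 for non-dominated items, rank 1 for the next front, and so on."""
    remaining = dict(scores)
    ranks = {}
    rank = 0
    while remaining:
        front = [name for name, s in remaining.items()
                 if not any(dominates(t, s)
                            for other, t in remaining.items() if other != name)]
        for name in front:
            ranks[name] = rank
            del remaining[name]
        rank += 1
    return ranks

# Hypothetical scores: (predicted activity, synthetic accessibility).
scores = {
    "mol_a": (0.9, 0.2),
    "mol_b": (0.5, 0.8),
    "mol_c": (0.4, 0.1),
}
ranks = pareto_ranks(scores)
print(ranks)
```

Here `mol_a` and `mol_b` each win on one objective and share the first front, while `mol_c` is worse than `mol_a` on both and falls into the second.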

Page 31

Use case one: Lead optimisation

Here is my starting material. What kind of (feasible) one-step transformations may I make?

starting molecule: Penicillin G

[Figure: structure of Penicillin G]

An example from Patel, H., Bodkin, M.J., Chen, B., Gillet, V.J. A Knowledge-Based Approach to De Novo Design Using Reaction Vectors, J. Chem. Inf. Model., 2009, ASAP article

Page 32

Lead optimisation (contd.)

[Figure: Penicillin G and the molecules proposed from it]

An example from Patel, H., Bodkin, M.J., Chen, B., Gillet, V.J. A Knowledge-Based Approach to De Novo Design Using Reaction Vectors, J. Chem. Inf. Model., 2009, ASAP article

Page 33

Use case two: Synthetic route

I have this (active) fragment. Is there a route from it to the molecule I have in mind?

reproducing a known synthetic route: Plavix

[Figure: four-step synthetic route, steps numbered 1 to 4]

Step   No. applicable reaction vectors   Total no. products generated
1      17                                158
2      11                                123
3      12                                124
4      41                                386

Synthetic route from Wang, L. et al., Synthetic Improvements in the Preparation of Clopidogrel, Org. Process Res. Dev., 2007, 11 (3), 487-489

An example from Patel, H., Bodkin, M.J., Chen, B., Gillet, V.J. A Knowledge-Based Approach to De Novo Design Using Reaction Vectors, J. Chem. Inf. Model., 2009, ASAP article

Page 34

Use case three: Library design

With which of these reagents will my starting material undergo reaction X?

enumerate a library using a single reaction and a number of different reagents

starting material
reaction X (X = Suzuki coupling)
628 boronic acids as reagents

[Figure: starting material, reaction X and boronic acid reagents R-B(OH)2]

An example from Patel, H., Bodkin, M.J., Chen, B., Gillet, V.J. A Knowledge-Based Approach to De Novo Design Using Reaction Vectors, J. Chem. Inf. Model., 2009, ASAP article

Page 35

Library design (contd.)

292 products generated

[Figure: examples of the generated products]

Page 36

Summary

Reaction vectors offer a good way to explore the knowledge hidden inside reaction databases

A variety of chemical reactions can be reproduced with this approach

The method works fast

The method is applicable in different medicinal chemistry related scenarios

The use of the method is made easy by a variety of KNIME nodes which have been implemented

Page 37

Acknowledgements

Michael Bodkin for his continuous support both in and outside my daily work

Hina Patel for creating the first prototype which brought the reaction vectors to life (http://pubs.acs.org/doi/abs/10.1021/ci800413m)

Dave Evans, Fred Ludlow, Swanand Gore, Dave Thorner, Maria Whatton, Juliette Pradon for many stimulating discussions and for their continuous support

Page 38

Thank You!

Do you have any questions, comments, recommendations?