Julio E. Peironcely @peyron Juliopeironcely.com PhD student at Leiden University and TNO Structure Generation, Metabolite Space, and Metabolite-Likeness
Page 1: Structure generation, metabolite space, and metabolite likeness

Julio E. Peironcely @peyron

Juliopeironcely.com

PhD student at Leiden University and TNO

Structure Generation, Metabolite Space, and Metabolite-Likeness

Page 2: Structure generation, metabolite space, and metabolite likeness

Metabolomics

the quantitative and qualitative analysis of all metabolites in

samples of cells, body fluids, tissues, etc.


Page 3: Structure generation, metabolite space, and metabolite likeness

Metabolomics

Workflow: Biological question → Experimental design → Sampling → Sample preparation → Data acquisition → Data pre-processing → Data analysis → Biological interpretation

Products along the way: Protocol, Samples, Raw data, List of peaks/biomolecules, Relevant biomolecules/connectivities & Models, Metabolites

Page 4: Structure generation, metabolite space, and metabolite likeness

Metabolomics

[Workflow diagram, as on the previous slide]

Page 5: Structure generation, metabolite space, and metabolite likeness

De-novo identification

Page 6: Structure generation, metabolite space, and metabolite likeness

We have


Elemental Composition

Fragments (sometimes)

Experimental Information
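
As a small illustration of what an elemental composition already tells us, the sketch below parses a formula and computes the number of rings plus double bonds it implies. The valence table and the function names are assumptions made for this example (sulfur and phosphorus can take higher valences), not part of the generator described in the talk.

```python
import re

# Typical valences, assumed for illustration only (S and P can exceed these).
VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2, "S": 2, "P": 3}

def parse_formula(formula):
    """Turn a formula string such as 'C9H11NO2' into element counts."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def rings_plus_double_bonds(counts):
    """Degree of unsaturation: 1 + sum(n_i * (valence_i - 2)) / 2."""
    total = sum(n * (VALENCE[element] - 2) for element, n in counts.items())
    return 1 + total / 2

phenylalanine = parse_formula("C9H11NO2")
print(phenylalanine, rings_plus_double_bonds(phenylalanine))
```

Every candidate structure for a given formula must account for exactly this number of rings and double bonds, which is one reason the formula alone still leaves so many candidates.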

Page 7: Structure generation, metabolite space, and metabolite likeness

We want


List Of Candidate Structures

As Short As Possible

Good Structure Is In The List

Page 8: Structure generation, metabolite space, and metabolite likeness

We need


Structure Generator

Keep only metabolites

Use experimental information to filter molecules

Page 9: Structure generation, metabolite space, and metabolite likeness

Elemental Composition


Page 10: Structure generation, metabolite space, and metabolite likeness

Elemental Composition

Structure Generation


Page 11: Structure generation, metabolite space, and metabolite likeness

Elemental Composition

Structure Generation

Molecules


Page 12: Structure generation, metabolite space, and metabolite likeness

Structure Generator

In collaboration with Jean-Loup Faulon, Evry University

Elemental Formula + Fragments → Generate → Candidate Structures

Page 13: Structure generation, metabolite space, and metabolite likeness

Structure Generator

Elemental Formula + Fragments → Generate (keep molecules if canonical augmentation) → Candidate Structures

Page 14: Structure generation, metabolite space, and metabolite likeness

Structure Generator


Adding bonds
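
A minimal sketch of what one "adding bonds" step could look like, assuming a molecule is a list of element symbols plus a dict of bond orders. The valence table, the data layout, and the function names are illustrative assumptions; the slides do not show the generator's actual code.

```python
from itertools import combinations

# Assumed typical valences, for illustration only.
VALENCE = {"C": 4, "N": 3, "O": 2, "H": 1}

def free_valence(atoms, bonds, i):
    """Valence of atom i that is not yet used by existing bonds."""
    used = sum(order for pair, order in bonds.items() if i in pair)
    return VALENCE[atoms[i]] - used

def extensions(atoms, bonds):
    """Yield every bond dict obtained by raising one bond order by 1."""
    for i, j in combinations(range(len(atoms)), 2):
        order = bonds.get((i, j), 0)
        if (order < 3
                and free_valence(atoms, bonds, i) > 0
                and free_valence(atoms, bonds, j) > 0):
            child = dict(bonds)
            child[(i, j)] = order + 1
            yield child

atoms = ["C", "O", "H"]
print(list(extensions(atoms, {})))           # three possible first bonds
print(list(extensions(atoms, {(0, 1): 2})))  # the oxygen is saturated here
```

Many of the children produced this way are the same molecule drawn with different atom numbering, which is the problem the isomorphism slides address.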

Page 15: Structure generation, metabolite space, and metabolite likeness

Structure Generator


Isomorphism

Isomorphic class “triangle + 1 edge”

Isomorphic class “3-edge chain”

[Figure: several drawings of 4-vertex graphs with vertices labeled 1 to 4, grouped into the two isomorphic classes]
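
To make the notion of an isomorphic class concrete, here is a brute-force check that tries every relabeling of the vertices. It is only a sketch for tiny graphs (the cost grows factorially with the number of vertices); the function names are assumptions for this example.

```python
from itertools import permutations

def normalize(edges):
    """Represent an undirected graph as a set of unordered vertex pairs."""
    return frozenset(frozenset(edge) for edge in edges)

def are_isomorphic(edges_a, edges_b, n):
    """True if some relabeling of vertices 1..n maps graph A onto graph B."""
    target = normalize(edges_b)
    if len(normalize(edges_a)) != len(target):
        return False
    for perm in permutations(range(1, n + 1)):
        relabel = dict(zip(range(1, n + 1), perm))
        mapped = frozenset(frozenset((relabel[a], relabel[b]))
                           for a, b in edges_a)
        if mapped == target:
            return True
    return False

triangle_a = [(1, 2), (1, 3), (1, 4), (2, 3)]
triangle_b = [(1, 4), (2, 4), (3, 4), (1, 2)]
chain = [(1, 2), (2, 3), (3, 4)]
print(are_isomorphic(triangle_a, triangle_b, 4))  # same class
print(are_isomorphic(triangle_a, chain, 4))       # different classes
```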

Page 16: Structure generation, metabolite space, and metabolite likeness

Structure Generator

Isomorphism

Isomorphic class “triangle + 1 edge”

Isomorphic class “3-edge chain”

[Figure: the same labeled 4-vertex graphs, some highlighted in orange]

Output ONLY orange graphs

Page 17: Structure generation, metabolite space, and metabolite likeness

Structure Generator


Canonical Labeling

[Figure: the labeled 4-vertex graphs of both isomorphic classes are passed to the canonizer]

Canonizer (Nauty)

Triangle + 1 edge: (1,2) (1,3) (1,4) (2,3)

3-edge chain: (1,2) (1,3) (2,4)
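
The idea of a canonical labeling can be sketched with a brute-force stand-in for Nauty: relabel the graph in every possible way and keep the lexicographically smallest sorted edge list. This is an assumption made for illustration, not how Nauty works internally, and Nauty's canonical labels need not follow this rule in general. On the two small classes above it happens to give the same edge lists as the slide.

```python
from itertools import permutations

def canonical_form(edges, n):
    """Smallest sorted edge list over all relabelings of vertices 1..n."""
    best = None
    for perm in permutations(range(1, n + 1)):
        relabel = dict(zip(range(1, n + 1), perm))
        candidate = tuple(sorted(
            tuple(sorted((relabel[a], relabel[b]))) for a, b in edges))
        if best is None or candidate < best:
            best = candidate
    return best

# Two different labelings of "triangle + 1 edge" give one canonical form.
print(canonical_form([(1, 4), (2, 4), (3, 4), (1, 2)], 4))
print(canonical_form([(2, 3), (2, 4), (3, 4), (1, 3)], 4))
# A "3-edge chain" labeled 1-2-3-4.
print(canonical_form([(1, 2), (2, 3), (3, 4)], 4))
```

Two graphs are isomorphic exactly when their canonical forms are equal, so comparing canonical forms replaces pairwise isomorphism tests.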

Page 18: Structure generation, metabolite space, and metabolite likeness

Only 1 canonical labeling in each

isomorphic class

Page 19: Structure generation, metabolite space, and metabolite likeness

[Figure: generation tree on vertices 1 to 5; each step adds one edge, and an X marks a rejected duplicate]

1 edge: (1,2)
2 edges: (1,2)(1,3)
3 edges: (1,2)(1,3)(1,4), (1,2)(1,3)(2,3)
4 edges: (1,2)(1,3)(1,4)(2,3), (1,2)(1,3)(2,3)(2,4), (1,2)(1,3)(1,4)(3,4), (1,2)(1,3)(1,4)(4,5)

Use canonizer to remove duplicates after each extension
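
A minimal sketch of "canonize, then remove duplicates after each extension", assuming plain unlabeled graphs instead of molecules and a brute-force canonizer in place of Nauty. All names are hypothetical. The set `next_level` plays the role of the duplicate check: every extended graph is canonized and stored only once.

```python
from itertools import combinations, permutations

def canonical_form(edges, n):
    """Brute-force canonizer: smallest sorted edge list over all relabelings."""
    best = None
    for perm in permutations(range(1, n + 1)):
        relabel = dict(zip(range(1, n + 1), perm))
        candidate = tuple(sorted(
            tuple(sorted((relabel[a], relabel[b]))) for a, b in edges))
        if best is None or candidate < best:
            best = candidate
    return best

def generate_levels(n):
    """Level k holds one canonical graph per isomorphic class with k edges."""
    all_edges = list(combinations(range(1, n + 1), 2))
    level = {()}  # start from the graph with no edges
    levels = [level]
    while True:
        next_level = set()
        for graph in level:
            for edge in all_edges:
                if edge not in graph:
                    # Canonize the extension; the set drops duplicates.
                    next_level.add(canonical_form(graph + (edge,), n))
        if not next_level:
            break
        levels.append(next_level)
        level = next_level
    return levels

print([len(level) for level in generate_levels(4)])
```

The drawback of this scheme is that every accepted graph of a level has to be kept in memory for comparison, which motivates the canonical augmentation check on the following slides.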

Page 20: Structure generation, metabolite space, and metabolite likeness

Canonical Augmentation


A canonical object

augmented in a canonical way

produces a canonical object

Page 21: Structure generation, metabolite space, and metabolite likeness

Check For Canonical Augmentation


Keep object if

a canonical deletion

takes you to the canonical father
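
The check can be sketched as follows, under loud assumptions: graphs are plain edge lists rather than molecules, the canonizer is a brute-force stand-in for Nauty, and the "canonical deletion" is defined here as removing whichever edge leaves the smallest canonical graph. This is a simplified variant of canonical augmentation, not necessarily the rule used in the generator; it still needs a small duplicate check among the children of a single parent, but no global list of accepted graphs.

```python
from itertools import combinations, permutations

def canonical_form(edges, n):
    """Brute-force canonizer: smallest sorted edge list over all relabelings."""
    best = None
    for perm in permutations(range(1, n + 1)):
        relabel = dict(zip(range(1, n + 1), perm))
        candidate = tuple(sorted(
            tuple(sorted((relabel[a], relabel[b]))) for a, b in edges))
        if best is None or candidate < best:
            best = candidate
    return best

def canonical_father(graph, n):
    """Assumed rule: delete the edge that leaves the smallest canonical graph."""
    return min(canonical_form(tuple(e for e in graph if e != removed), n)
               for removed in graph)

def generate_by_canonical_augmentation(n):
    all_edges = list(combinations(range(1, n + 1), 2))
    level = [()]
    levels = [level]
    while True:
        next_level = []
        for parent in level:
            accepted = set()  # duplicates are tracked per parent only
            for edge in all_edges:
                if edge in parent:
                    continue
                child = canonical_form(parent + (edge,), n)
                # Keep the child only if its canonical deletion leads back
                # to this parent.
                if child not in accepted and canonical_father(child, n) == parent:
                    accepted.add(child)
                    next_level.append(child)
        if not next_level:
            break
        levels.append(next_level)
        level = next_level
    return levels

print([len(level) for level in generate_by_canonical_augmentation(4)])
```

Because each child has exactly one canonical father, it is accepted from one parent only, so branches of the generation tree never need to be compared with each other.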

Page 22: Structure generation, metabolite space, and metabolite likeness

[Figure: the same generation tree on vertices 1 to 5; two graphs are crossed out with an X]

1 edge: (1,2)
2 edges: (1,2)(1,3)
3 edges: (1,2)(1,3)(1,4), (1,2)(1,3)(2,3)
4 edges: (1,2)(1,3)(1,4)(2,3), (1,2)(1,3)(2,3)(2,4), (1,2)(1,3)(1,4)(3,4), (1,2)(1,3)(1,4)(4,5)

Accept only canonically augmented graphs

Page 23: Structure generation, metabolite space, and metabolite likeness

Structure Generator Results

                        Glycine   Phenylalanine   Malic acid   D-Cysteine   p-Cresol sulfate
Elemental Composition   C2H5NO2   C9H11NO2        C4H6O5       C3H7NO2S     C7H8O3S
# Output Molecules      84        277,810,163     8,070        3,838        10,203,389
1 Fragment              6         4,037,499       1,601        100          19,940

2 Fragments / 3 Fragments: 93,137; 948; 584; 278 (column assignment lost in the transcript)

MOLGEN: same # of molecules

Page 24: Structure generation, metabolite space, and metabolite likeness

Lots of candidate structures

Page 25: Structure generation, metabolite space, and metabolite likeness

We are looking for metabolites

Page 26: Structure generation, metabolite space, and metabolite likeness

Elemental Composition

Structure Generation

Molecules

Metabolite Likeness


Page 27: Structure generation, metabolite space, and metabolite likeness

Elemental Composition

Structure Generation

Molecules

Metabolite Likeness

Metabolites


Page 28: Structure generation, metabolite space, and metabolite likeness

What do metabolites look like?

Understanding and Classifying Metabolite Space and Metabolite-Likeness Julio E. Peironcely et al. PLoS One (in press)

Page 29: Structure generation, metabolite space, and metabolite likeness

HMDB 8K

ZINC 21M


Page 30: Structure generation, metabolite space, and metabolite likeness

[Figure: metabolites vs. non metabolites compared on Water Solubility, MW, C Atoms, Struc. Complexity, and PSA]

Page 31: Structure generation, metabolite space, and metabolite likeness

PCA


Page 32: Structure generation, metabolite space, and metabolite likeness

PCA

Page 33: Structure generation, metabolite space, and metabolite likeness

Not so different

Page 34: Structure generation, metabolite space, and metabolite likeness

Decision Tree


Page 35: Structure generation, metabolite space, and metabolite likeness

Elemental Composition

Structure Generation

Molecules

Metabolite Likeness

Metabolites


Page 36: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness

HMDB 8K

ZINC 21M

Representation: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4

Classification: Support Vector Machines (SVM), Random Forest (RF), Naïve Bayes (NB)

Page 37: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness


HMDB 8K

ZINC 21M

Standardization

Diversity Selection Atom Counts Physicochemical desc.

MDL Public Keys FCFP_4 ECFP_4
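
The slides do not say which diversity selection method was used. As one common possibility, the sketch below shows MaxMin picking with Tanimoto distance on fingerprints stored as sets of "on" bits. The algorithm choice, the starting compound, and all names are assumptions for illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def maxmin_pick(fingerprints, k):
    """Greedily pick k indices, each as far as possible from those picked."""
    if not fingerprints or k <= 0:
        return []
    picked = [0]  # arbitrary starting compound (an assumption)
    while len(picked) < min(k, len(fingerprints)):
        best_index, best_distance = None, -1.0
        for i, fp in enumerate(fingerprints):
            if i in picked:
                continue
            # Distance to the closest compound that is already selected.
            distance = min(1 - tanimoto(fp, fingerprints[j]) for j in picked)
            if distance > best_distance:
                best_index, best_distance = i, distance
        picked.append(best_index)
    return picked

fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
print(maxmin_pick(fingerprints, 3))
```

A selection step like this reduces a large library to a small subset that still spreads over the chemical space it came from.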

Page 38: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness


Training Set 532 + 532

HMDB 8K

ZINC 21M

Standardization

Diversity Selection

Test Set 6.4K + 6.4K

Atom Counts Physicochemical desc.

MDL Public Keys FCFP_4 ECFP_4

Page 39: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness


Training Set 532 + 532

HMDB 8K

ZINC 21M

Standardization

Diversity Selection

Test Set 6.4K + 6.4K

5-fold CV

SVM RF BC

Atom Counts Physicochemical desc.

MDL Public Keys FCFP_4 ECFP_4
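
For readers unfamiliar with 5-fold cross-validation, the sketch below splits a balanced training set of 532 + 532 compounds into five folds that keep the class balance, and yields each train/test split in turn. Stratification, the fixed random seed, and the function names are assumptions; the slides only state "5-fold CV".

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Distribute the indices of each class evenly over k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

def cross_validation_splits(labels, k=5, seed=0):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out
                 for i in fold]
        yield train, test

# 1 = metabolite, 0 = non-metabolite, sized like the training set on the slide.
labels = [1] * 532 + [0] * 532
for train, test in cross_validation_splits(labels):
    print(len(train), len(test))
```

Each classifier and descriptor combination would be trained on `train` and scored on `test` five times, and the scores averaged.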

Page 40: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness


Training Set 532 + 532

HMDB 8K

ZINC 21M

Standardization

Diversity Selection

Test Set 6.4K + 6.4K

5-fold CV

SVM RF BC

Metabolite likeness

3 classifiers × 5 descriptions

Page 41: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness


Training Set 532 + 532

HMDB 8K

ZINC 21M

Standardization

Diversity Selection

Test Set 6.4K + 6.4K

5-fold CV

SVM RF BC

Metabolite likeness

                            Sensitivity   Specificity   AUC
Best = RF – MDLPublicKeys   99.84%        87.52%        99.20%
Bad = BC – P_desc           42.51%        86.56%        61.57%
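
For reference, the three reported measures can be computed as in the sketch below, taking metabolites as the positive class (label 1). The tiny label and score lists are made-up illustration data, not results from the study, and the function names are assumptions.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Made-up example data.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
print(sensitivity_specificity(y_true, y_pred), auc(y_true, scores))
```

Read this way, the "Bad" model misses more than half of the true metabolites (low sensitivity) while still rejecting most non-metabolites.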

Page 42: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness, external validation


Sources: HMDB, ChEMBL, DrugBank

Standardization

Random Selection

External validation set

Metabolite likeness

Page 43: Structure generation, metabolite space, and metabolite likeness

Metabolite-likeness, external validation


Page 44: Structure generation, metabolite space, and metabolite likeness
Page 45: Structure generation, metabolite space, and metabolite likeness

Met-likeness + structure generation (malic acid) 8K


100%

57% 77%

Page 46: Structure generation, metabolite space, and metabolite likeness

Met-likeness + structure generation (methylhistamine) 260K


46% 71%

Page 47: Structure generation, metabolite space, and metabolite likeness

What else do we know about our molecules?

Page 48: Structure generation, metabolite space, and metabolite likeness

Molecule        Minimized_Energy   ALogP    Index
Phenylalanine   0.1100             -1.605   5142

Page 49: Structure generation, metabolite space, and metabolite likeness


Molecule Minimized_Energy ALogP Index

0.1100 -1.605 5142

C9H11NO2

Structure Generation

277 M

Page 50: Structure generation, metabolite space, and metabolite likeness


Molecule Minimized_Energy ALogP Index

0.1100 -1.605 5142

C9H11NO2

Structure Generation

41 K

44%

99%

Page 51: Structure generation, metabolite space, and metabolite likeness


Molecule Minimized_Energy ALogP Index

0.1100 -1.605 5142

C9H11NO2

Structure Generation

8 K

E < 10

40%

Page 52: Structure generation, metabolite space, and metabolite likeness


Molecule Minimized_Energy ALogP Index

0.1100 -1.605 5142

C9H11NO2

Structure Generation

31

E < 10

ALogP < -1

76%
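
The two property filters on this slide can be sketched as a simple threshold check over candidate records. Only the phenylalanine row (index 5142, energy 0.1100, ALogP -1.605) comes from the slides; the other records, the field names, and the function name are hypothetical.

```python
def filter_candidates(candidates, max_energy=10.0, max_alogp=-1.0):
    """Keep candidates with E < max_energy and ALogP < max_alogp."""
    return [c for c in candidates
            if c["energy"] < max_energy and c["alogp"] < max_alogp]

candidates = [
    {"index": 5142, "energy": 0.1100, "alogp": -1.605},  # phenylalanine row
    {"index": 1, "energy": 25.0, "alogp": -1.9},         # hypothetical
    {"index": 2, "energy": 3.2, "alogp": 0.4},           # hypothetical
]
kept = filter_candidates(candidates)
print([c["index"] for c in kept])
```

The thresholds are chosen so that the known structure passes both filters while most of the generated candidates do not.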

Page 53: Structure generation, metabolite space, and metabolite likeness

Conclusions

Met-likeness prediction is good; interpretation is not

Local models needed

Structure Generator + Met-Likeness + other constraints = Met ID improvement

Page 54: Structure generation, metabolite space, and metabolite likeness

Acknowledgements

TNO Quality of Life: Leon Coulier, Albert Tas

Evry University: Jean-Loup Faulon, Davide Fichera

HMP, University of Alberta: David Wishart, Ying (Edison) Dong

Leiden University: Miguel Rojas-Cherto, Piotr Kasper, Michael van Vliet, Theo Reijmers, Rob Vreeken, Ronnie van Doorn, Thomas Hankemeier

University of Cambridge: Andreas Bender