Molecular and Data Visualization in Drug Discovery Deepak Bandyopadhyay GlaxoSmithKline
Molecular and data visualization in drug discovery

Apr 12, 2017

Science

Transcript
Page 1: Molecular and data visualization in drug discovery

Molecular and Data Visualization in Drug Discovery

Deepak Bandyopadhyay

GlaxoSmithKline

Page 2: Molecular and data visualization in drug discovery

Intro: Human Body & Disease Biology

• From Wikipedia:

– Abnormal condition that affects part or all of an organism.

– Associated with specific symptoms and signs.

• Causes:

– Single cause, e.g. pathogen, poison, nutrient deficiency, genetics

– Multiple factors including environment, lifestyle, genetics

http://www.biologyguide.net/biol1/1_disease.htm

Mycobacterium tuberculosis

Chest X-ray showing lung cancer

Page 3: Molecular and data visualization in drug discovery

Drug Discovery Parts/Timeline

Focus of Drug Discovery

• Narrow down on one or a few substances to test in humans and develop into a drug that treats a disease

Components:

• Target Selection and Validation (diagram labels: genome, protein, link to disease, disease, genetics, pathology, biological target)

• In Vitro Biology

• Medicinal Chemistry (Lead Optimization)

• Lead Discovery (a.k.a. Screening)

• In Vivo Biology

Page 4: Molecular and data visualization in drug discovery

Molecular and Data Visualization

• The two parts of my job at GSK!

• Molecules:

– small (drugs/peptides) and large (proteins/DNA/RNA/lipids)

– visualized in 1D (SMILES), 2D (structure), 3D (coords / conformations), 4D (Mol. Dynamics)

• Data:

– Format: numeric / text, continuous / categorical, delimited / database / XML / proprietary

– Source: instruments, manual entry, calculation

– About drug discovery projects (key: molecule ID), genomics/proteomics (key: gene/protein ID), clinical studies (key: anon. patient ID), …

Ibuprofen

DRUG

PROTEIN

EGFR

Ball and stick

EGFR ribbons

Page 5: Molecular and data visualization in drug discovery

Movie: Introduction to Drug Design

By Schrödinger (molecular modeling software company): https://www.youtube.com/watch?v=u49k72rUdyc

Page 6: Molecular and data visualization in drug discovery

Bioactivity 101

• Concentration-Response curve and IC50

• Structure Activity Relationship (SAR)

pIC50 = −log10(IC50), with IC50 expressed in molar (M)

Example: IC50 = 12.8 µM (micromolar), so pIC50 = 6 − log10(12.8) = 4.89

Think Avogadro, pH…
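The conversion above can be sketched in a few lines of Python. The function names are illustrative, and the simple Hill-type curve used for the concentration-response relationship is an assumption; the slide does not say which curve model was fitted.

```python
import math


def pic50_from_micromolar(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50 (-log10 of IC50 in molar)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    # 1 uM = 1e-6 M, so -log10(ic50_um * 1e-6) = 6 - log10(ic50_um)
    return 6.0 - math.log10(ic50_um)


def percent_inhibition(conc_um: float, ic50_um: float, hill: float = 1.0) -> float:
    """Assumed Hill-type concentration-response curve (0 to 100 % inhibition)."""
    # By construction the response is exactly 50 % when conc_um == ic50_um.
    return 100.0 / (1.0 + (ic50_um / conc_um) ** hill)


print(round(pic50_from_micromolar(12.8), 2))  # 4.89, the value on the slide
print(percent_inhibition(12.8, 12.8))         # 50.0 at the IC50 itself
```

Working on the log scale is what makes the pH comparison apt: a tenfold gain in potency always adds exactly 1 to pIC50.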

Page 7: Molecular and data visualization in drug discovery

Molecular Visualization Deconstructed

• Representations

• Navigation

• Interaction

• What would you add?

Aspirin (ligand)

Cox-1 (protein)

Binding pocket surface (legend: polar, +ve charge, hydrophobic, -ve charge)

XY translate, Z zoom; rotate about X/Y or Z (e.g. in the program MOE)

F1, F2, F3: save/restore scenes

Select, Hide/Show, Center, Prev/Next Scene, Expand Sel., Import/Export, Align, Compute…

Page 8: Molecular and data visualization in drug discovery

Purposes of Molecule Visualization

• Understand and rationalize “SAR” in 3D

• (Protein) Structure-Based Drug Design. E.g.:

– Aspirin binds COX1/2, Celebrex binds COX2 only

• Clearly illustrate biological systems / processes

• What other tasks can you think of?

Page 9: Molecular and data visualization in drug discovery

Case study 1: Protein-Protein Interactions

HIV-1 coat protein gp120 bound to antibody 17b (Light, Heavy) and CD4

Figure panels: gp120/CD4 interface; gp120/antibody L/H interface (colored by rank)

Ban, Y. E. A., Edelsbrunner, H., & Rudolph, J. (2006). Interface surfaces for protein-protein complexes. J. ACM, 53(3), 361-378.

Page 10: Molecular and data visualization in drug discovery

Case study 2: Molecular Dynamics simulation of a drug entering the binding site of a target protein

Decherchi et al., Nature Comms. 6(6155), 2015. https://www.youtube.com/watch?v=ckTqh50r_2w

Page 11: Molecular and data visualization in drug discovery

From Molecules to Data

Mol spreadsheets, visualizations

StarDrop Glowing Molecules™ image from http://www.asteris-app.com/technical-info.htm

Hybrid molecule/data visualization

Page 12: Molecular and data visualization in drug discovery

Software Systems: Spotfire

• Feature set / distinguishing factors:

– Handling large datasets via filtering and memory management

– Tabular file (CSV, Excel) or database input

– Multiple, configurable visualization types

– Easy enough for domain experts to use / share

– Life science add-ons

• Molecule depiction

• Specialized –omics packages

Figures: binned pIC50 trellised by HBA and HBD; pIC50 vs. % inhibition

Page 13: Molecular and data visualization in drug discovery

Software Systems: LiveDesign

• Consolidate multiple disconnected tools for molecule design

– Integrated Single Platform

– Intuitive UI

– 2D, 3D, Data & Visuals

– Social aspect

Page 14: Molecular and data visualization in drug discovery

Dimensions, dimensions…

• Molecules: 1D (SMILES e.g. c1ccccc1), 2D (depiction), 3D (coords), 4D (motion)

• Data:

– 100s of activities, measured and predicted properties per row (compound)

– ~100K for gene expression, clinical trial data

– Millions for –omics, next-gen sequencing

– Then there’s systems biology…

• Dimensionality reduction is a key capability

– PCA, SOM, Stochastic Proximity Embedding,…
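As a rough illustration of dimensionality reduction, the sketch below finds the first principal component of a small property table using power iteration on the covariance matrix. It uses only the standard library; a real workflow would more likely call a numerical library. The compound property values are hypothetical, and SOM and Stochastic Proximity Embedding are not shown.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit_vector, projections) for the leading principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: multiply by the covariance matrix and renormalize.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance at all; keep the start vector
        v = [x / norm for x in w]
    # One coordinate per compound along the component.
    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections


# Hypothetical compounds described by three properties each.
properties = [
    [320.0, 2.1, 5.9],
    [410.0, 3.4, 6.8],
    [365.0, 2.8, 6.1],
    [450.0, 4.0, 7.2],
]
direction, coords = first_principal_component(properties)
print(direction)
print(coords)
```

Plotting each compound at its projected coordinate (and at a second component, found the same way after removing the first) gives the familiar 2D view of an nD property table.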

Page 15: Molecular and data visualization in drug discovery

Challenges / Types of Visualization

• Key capabilities for data visualization

– Making large data comprehensible to humans

– High-level summary + drill-down

– Quickly (auto?) isolate interesting data points

Visualization types pictured (2D, 3D, nD, hierarchical): map, SOM, parallel coords, heat map, protein, volume rendering, radar plot, box plot, sunburst, dendrogram, network/graph layout

Image sources: http://guides.library.duke.edu/datavis/vis_types, http://flagshipbio.com/amino-acid-structure-properties-using-self-organizing-maps/, Wikipedia

Page 16: Molecular and data visualization in drug discovery

All the Data at Once: Vlaaivis

T. J. Howe, G. Mahieu, P. Marichal, T. Tabruyn and P. Vugts. Data reduction and representation in drug discovery. Drug Discovery Today 12(1/2):45-53, Jan 2007.

Page 17: Molecular and data visualization in drug discovery

All the Data at Once (cont’d): Radar Plots

• Circular histogram for viewing multi-parameter results

Paul D. Leeson & Stephen A. St-Gallay. The influence of the 'organizational factor' on compound quality in drug discovery. Nature Reviews Drug Discovery 10, 749-765 (October 2011).

Property differences are scaled to either +1, whereby the company with a positive ('best') property value had the highest magnitude, or −1, whereby the company with the lowest ('worst') value had the highest magnitude.

Page 18: Molecular and data visualization in drug discovery

Visualizing Large Datasets

P. Ertl & B. Rohde, J. Cheminformatics 4(12), 2012

Gaspar et al. J. Chem. Inf. Model., 2015, 55 (1), pp 84–94

Network-like similarity graph

Bajorath et al.

• Dimensionality reduction

• Graph layout

• Activity landscape

• Probabilistic property plots

• Scaffold abstraction

Steven Muchmore, Abbott Labs (now Abbvie)

Molecule cloud

Plot axes: Molecular Property 1; Molecular Property 2; Probability of success (crossing cell membrane)

Page 19: Molecular and data visualization in drug discovery

SAR Tables

• SAR: Structure-Activity Relationship

– Split molecule: core/scaffold, pendant R-groups

– SAR Table: molecule spreadsheet with R-groups and Activity Data

(-OH)

(-COOH)

Page 20: Molecular and data visualization in drug discovery

SAR Maps - R1 vs. R2 on a Core

Figure labels: selectivity scale pIC50(2) − pIC50(1), running from selective for protein 1 to selective for protein 2; axes R1 and R2

Core “scaffold”:

D. K. Agrafiotis et al. SAR Maps:  A New SAR Visualization Technique for Medicinal Chemists. J. Med. Chem., 2007, 50 (24), 5926–5937.
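A SAR map of this kind can be sketched as a grid keyed by the R1 and R2 substituents, with each cell holding the selectivity pIC50(2) − pIC50(1). This is only a text-mode illustration and not the method of the cited paper. The compound IDs, R-groups, and activity values below are hypothetical, and R-group decomposition is assumed to have been done already.

```python
def sar_map(records):
    """Arrange compounds on an R1 x R2 grid of selectivity values."""
    r1_groups = sorted({r["r1"] for r in records})
    r2_groups = sorted({r["r2"] for r in records})
    # Positive value: more potent on protein 2; negative: more potent on protein 1.
    grid = {(r["r1"], r["r2"]): round(r["pic50_2"] - r["pic50_1"], 2)
            for r in records}
    return r1_groups, r2_groups, grid


def format_map(r1_groups, r2_groups, grid):
    """Render the grid as fixed-width text; '-' marks an unmade combination."""
    lines = ["R1\\R2".ljust(8) + "".join(g.ljust(8) for g in r2_groups)]
    for r1 in r1_groups:
        cells = []
        for r2 in r2_groups:
            value = grid.get((r1, r2))
            cells.append(("%+.1f" % value if value is not None else "-").ljust(8))
        lines.append(r1.ljust(8) + "".join(cells))
    return "\n".join(lines)


# Hypothetical SAR-table rows: one compound per (R1, R2) pair on a shared core.
records = [
    {"id": "cpd-1", "r1": "-OH", "r2": "-CH3", "pic50_1": 6.4, "pic50_2": 7.9},
    {"id": "cpd-2", "r1": "-OH", "r2": "-Cl", "pic50_1": 7.2, "pic50_2": 6.1},
    {"id": "cpd-3", "r1": "-COOH", "r2": "-CH3", "pic50_1": 5.8, "pic50_2": 5.9},
]
r1_groups, r2_groups, grid = sar_map(records)
print(format_map(r1_groups, r2_groups, grid))
```

Empty cells are useful in their own right: they show which R1/R2 combinations have not been made yet.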

Page 21: Molecular and data visualization in drug discovery

Clustering

• Based on chemical descriptors, biological activity, etc…

• Agglomerative or hierarchical
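A minimal sketch of agglomerative clustering on chemical descriptors follows, assuming each molecule is described by a set of "on" fingerprint bits and compared by Tanimoto distance. The molecule names, bit sets, distance threshold, and the choice of complete linkage are illustrative assumptions, not taken from the datasets in the slides.

```python
def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def complete_link_clusters(fingerprints, threshold):
    """Merge clusters until the closest pair is farther apart than threshold."""
    clusters = [[name] for name in fingerprints]

    def linkage(c1, c2):
        # Complete link: distance between the two least similar members.
        return max(tanimoto_distance(fingerprints[x], fingerprints[y])
                   for x in c1 for y in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical fingerprints: each molecule is a set of 'on' bit positions.
fingerprints = {
    "mol_A": {1, 2, 3, 4},
    "mol_B": {1, 2, 3, 5},
    "mol_C": {1, 2, 3, 4, 5},
    "mol_D": {10, 11, 12},
    "mol_E": {10, 11, 13},
    "mol_F": {20, 21},
}
for cluster in complete_link_clusters(fingerprints, threshold=0.45):
    print(cluster)
```

With this threshold, mol_D and mol_E share half of their bits yet still end up as separate singletons. The cut-off decides which molecules count as the same cluster.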

Hoek, Keith S. et al.: Metastatic potential of melanomas defined by specific gene expression profiles with no BRAF signature. Pigment Cell Research 19 (4), 290-302

http://chemmine.ucr.edu/help/

Molecules Genes

Page 22: Molecular and data visualization in drug discovery

Limitations of Clustering

Each molecule is assigned to a single cluster, which can be limiting

Figure labels: seals (fur)? ducks (bill)? penguins (flipper)? singleton?

Cluster 3, Cluster 10

similar molecules ≠ same cluster

Many singletons

Plot axes: Complete Link Cluster ID; Cluster Size

Page 23: Molecular and data visualization in drug discovery

Automatic Decomposition into (All) Overlapping Scaffolds

Malarial parasite assay pIC50 8.1

… 49 total

… 226 total

2 total

Molecule

Scaffold(s)

Related Molecules

Page 24: Molecular and data visualization in drug discovery

Next Step: Combine with Activities and Properties

Columns: Molecule, Scaffold(s), Annotation, Related Molecules

Avg pIC50: 8.15, 7.8, 7.8

Counts: … 49 total, … 226 total, 2 total

pIC50 values shown: 8.2, 8.5, 8.2, 8.0, 7.5, 7.7, 8.5, 7.4, 7.9, 7.7, 8.2
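The annotation step can be sketched as a group-by over scaffold membership. The scaffold decomposition itself is assumed to have been done by a cheminformatics toolkit. The molecule IDs, scaffold labels, and pIC50 values are hypothetical, and a molecule may belong to several overlapping scaffolds.

```python
from collections import defaultdict


def aggregate_by_scaffold(molecules):
    """Average pIC50 and member list for every scaffold a molecule contains."""
    members = defaultdict(list)
    for mol in molecules:
        for scaffold in mol["scaffolds"]:  # overlapping: one molecule, many scaffolds
            members[scaffold].append(mol)
    summary = {}
    for scaffold, mols in members.items():
        values = [m["pic50"] for m in mols]
        summary[scaffold] = {
            "count": len(mols),
            "avg_pic50": round(sum(values) / len(values), 2),
            "molecules": sorted(m["id"] for m in mols),
        }
    return summary


def better_related(molecules, mol_id):
    """Molecules sharing at least one scaffold with mol_id and a higher pIC50."""
    query = {m["id"]: m for m in molecules}[mol_id]
    shared = set(query["scaffolds"])
    return sorted(m["id"] for m in molecules
                  if m["id"] != mol_id
                  and shared & set(m["scaffolds"])
                  and m["pic50"] > query["pic50"])


# Hypothetical molecules with precomputed scaffold membership.
molecules = [
    {"id": "mol-1", "pic50": 8.1, "scaffolds": ["S1", "S2"]},
    {"id": "mol-2", "pic50": 8.5, "scaffolds": ["S1"]},
    {"id": "mol-3", "pic50": 7.5, "scaffolds": ["S2"]},
    {"id": "mol-4", "pic50": 7.8, "scaffolds": ["S2", "S3"]},
]
for scaffold, info in sorted(aggregate_by_scaffold(molecules).items()):
    print(scaffold, info)
print(better_related(molecules, "mol-4"))
```

Because membership overlaps, a molecule contributes to the average of every scaffold it contains, unlike single-cluster assignment.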

Page 25: Molecular and data visualization in drug discovery

Case Study: Linking Molecules By Scaffolds

• Use aggregate properties for decision making

• Find related molecules with improved properties

Improving property 1

Improving activity 2

Aggregate (scaffold)

↓ Drill down

(8 molecules)

Improving activity 3

Improving property 4

> Keep top half of molecule, substitute bottom half

Example 1 Example 2

Page 26: Molecular and data visualization in drug discovery

Summary and Lessons Learned

• Drug discovery has specialized types of data that are best understood by visualization

• Good visualizations can support the making of good decisions (and the converse: GIGO…)

• The human element is important – visuals and analytics should be creatable/usable by scientists

• As new visual analytics experts, consider careers in an industry where you can add value and be creative

– Subtle plug for drug discovery

Page 27: Molecular and data visualization in drug discovery

Future Directions and Challenges in Data Visualization for Drug Discovery

• Human vs. Machine or Human + Machine ?

• Automate the tedious parts of data prep/integration

• Intuitiveness by design

• Interconnection by design

• Integration of latest visualization techniques developed for other domains

• Using emerging media, e.g. VR, Kinect

• What can you think of?