Bibliological data science and drug discovery Knowing the knowns* Effectively Harnessing the World’s Literature To Inform Rational Compound Design - ACS National Meeting, Philadelphia, Aug 21-24, 2016 Jeremy J Yang Translational Informatics Division School of Medicine University of New Mexico Integrative Data Science Lab School of Informatics & Computing Indiana University *phrase borrowed from Edgar Jacoby, Janssen.
Bibliological data science and drug discovery

Apr 12, 2017

Jeremy Yang
Transcript
Page 1: Bibliological data science and drug discovery

Bibliological data science and drug discovery

Knowing the knowns*

Effectively Harnessing the World’s Literature To Inform Rational Compound Design - ACS National Meeting, Philadelphia, Aug 21-24, 2016

Jeremy J Yang

Translational Informatics Division School of Medicine

University of New Mexico

Integrative Data Science Lab School of Informatics & Computing

Indiana University

*phrase borrowed from Edgar Jacoby, Janssen.

Page 2: Bibliological data science and drug discovery

In science, luck favors the prepared. - Louis Pasteur

The main thing was not to . . . "foul up." - The Right Stuff, by Tom Wolfe, about John Glenn.

Page 3: Bibliological data science and drug discovery

Overview of talk

● Formulation of problem
● Resources and examples:
  ○ TIN-X, Target Importance and Novelty Explorer (& IDG)
  ○ Chem2Bio2RDF
  ○ OPDDR, Open Phenotypic Drug Discovery Resource
  ○ DrugCentral

Page 4: Bibliological data science and drug discovery

Formulation of problem

● "World's Literature" redefined by online revolution
● Rational Compound Design = improving our odds
● For given research question, what are the known knowns?
● Connect the dots and weigh the evidence from global knowledge graph.

Page 5: Bibliological data science and drug discovery

TIN-X

Page 6: Bibliological data science and drug discovery

TIN-X Target Importance & Novelty Explorer

● Bibliometric application developed for Illuminating the Druggable Genome (IDG) project

● Text mining from Novo Nordisk Center for Protein Research (U. Copenhagen) lab of Lars Juhl Jensen.

● Algorithm and client developed at UNM (Cristian Bologa, Daniel Cannon)

● Disease Ontology (DO) classification

● Drug Target Ontology (DTO) protein classification

Page 7: Bibliological data science and drug discovery

Illuminating the Druggable Genome (IDG)

Knowledge Management Center PI: Tudor Oprea, MD, PhD

pharos.nih.gov

Page 8: Bibliological data science and drug discovery

TIN-X

http://newdrugtargets.org

Page 9: Bibliological data science and drug discovery

TIN-X

Page 10: Bibliological data science and drug discovery

TIN-X

http://newdrugtargets.org

Page 11: Bibliological data science and drug discovery

TIN-X

Target Novelty:

$F_k = 1 / T_k$

● $T_k$ = # targets in paper $k$
● $F_k$ = fractional score of paper $k$
● for papers where $T_k > 0$

$N_i = 1 / \sum_k F_k$

● $N_i$ = novelty of target $i$
● sum over papers $k$ where target $i$ is mentioned

Target-Disease Importance:

$F_k = 1 / (T_k \cdot D_k)$

● $T_k$ = # targets in paper $k$
● $D_k$ = # diseases in paper $k$
● $F_k$ = fractional score of paper $k$

$I_{ij} = \sum_k F_k$

● $I_{ij}$ = importance of target $i$ for disease $j$
● sum over papers $k$ where both are mentioned

Target Importance and Novelty Explorer (TIN-X), Daniel Cannon, Jeremy Yang, Stephen Mathias, Oleg Ursu, Subramani Mani, Anna Waller, Stephan Schürer, Lars Juhl Jensen, Larry Sklar, Cristian Bologa, and Tudor Oprea (manuscript in preparation).
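The novelty and importance scores on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not the TIN-X implementation: the function name `tinx_scores`, the input layout (one pair of target and disease sets per paper), and the toy identifiers are all assumptions made for the example.

```python
from collections import defaultdict


def tinx_scores(papers):
    """Compute novelty per target and importance per (target, disease) pair.

    `papers` is an iterable of (targets, diseases) pairs, one per paper.
    This input layout is hypothetical, chosen only for illustration.
    """
    fraction_sum = defaultdict(float)  # sum of F_k = 1/T_k per target
    importance = defaultdict(float)    # sum of F_k = 1/(T_k*D_k) per pair

    for targets, diseases in papers:
        targets, diseases = set(targets), set(diseases)
        if not targets:  # the scores are defined only where T_k > 0
            continue
        for target in targets:
            fraction_sum[target] += 1.0 / len(targets)
        if diseases:  # importance needs at least one disease mention
            share = 1.0 / (len(targets) * len(diseases))
            for target in targets:
                for disease in diseases:
                    importance[(target, disease)] += share

    # N_i = 1 / sum(F_k): heavily published targets get low novelty.
    novelty = {target: 1.0 / total for target, total in fraction_sum.items()}
    return novelty, dict(importance)


# Toy corpus with made-up identifiers: three papers.
toy_papers = [
    ({"A"}, {"d1"}),
    ({"A", "B"}, {"d1", "d2"}),
    ({"A"}, set()),
]
toy_novelty, toy_importance = tinx_scores(toy_papers)
```

In the toy corpus, target A appears in three papers and B in only one shared paper, so B comes out as the more novel of the two, while the (A, d1) pair accumulates the most importance.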

Page 12: Bibliological data science and drug discovery

TIN-X Target Importance & Novelty Explorer

● Text mining is a valuable tool for monitoring literature, filtering and ranking, and detecting trends.

● Automation can infer patterns regarding community trends and consensus.

● Interactive visualization tools help navigate big data.

● Good big data text miners care about small data too!

Page 13: Bibliological data science and drug discovery

TIN-X Key contributors

Cristian Bologa Daniel Cannon Lars Juhl Jensen

Page 14: Bibliological data science and drug discovery

Chem2Bio2RDF

Page 15: Bibliological data science and drug discovery

● 24 sources, 52 datasets, 78M triples
● Semantically linked
● Chen, B, et al, BMC Bioinformatics (2010).
● Chen, B et al, PLoS Comp Bio (2012).
● Fu, G et al, BMC Bioinfo (2016).
● Related projects: Bio2RDF, LOD

http://chem2bio2rdf.org

Page 16: Bibliological data science and drug discovery

Classes: biological, chemical, chemogenomics, literature, phenotype, systems, disease, pathway, polypharmacology, PPI, side effect

Sources: BindingDB, BindingMOAD, IU, ChEBI, ChEMBL, CTD, DCDB, DIP, DrugBank, HGNC, HPRD, KEGG, MATADOR, OMIM, PDBe, PDSP, PharmGKB, PubChem, PubMed, Reactome, SIDER, TTD, UniProt

Page 17: Bibliological data science and drug discovery
Page 18: Bibliological data science and drug discovery

Linked Open Data (LOD)

http://linkeddata.org/

Page 19: Bibliological data science and drug discovery

Chem2Bio2RDF apps: (1) SLAP (2012), (2) Metapaths (2016)

Page 20: Bibliological data science and drug discovery

● Data semantics essential for integration of heterogeneous sources
● Strong evidence requires strong semantics
● Semantic Web Technologies common framework enabling -- but not assuring -- community progress
● Chem2Bio2RDF v2.0 to leverage major community advances (esp. Open PHACTS)
● Data ecosystems, coop-tition & prisoner's dilemma

Page 21: Bibliological data science and drug discovery

Key contributors

Bin Chen Ying Ding David Wild

Page 22: Bibliological data science and drug discovery

OPDDR

Page 23: Bibliological data science and drug discovery

OPDDR

Open Phenotypic Drug Discovery Resource

Page 24: Bibliological data science and drug discovery

https://ncats.nih.gov/expertise/preclinical/pd2

Page 25: Bibliological data science and drug discovery

OPDDR collaboration

Page 26: Bibliological data science and drug discovery

Example: OIDD HeLa cell based assay

Integrated RDF

bioassay:AID1117350 skos:exactMatch oidd_assay:17 .

bioassay:AID1117350 dcterms:source source:ID846 ;
    dcterms:title "Increased chromatin condensation in HeLa cells-IC50"@en .

bioassay:AID1117350 rdf:type bao:BAO_0002786 .
bioassay:AID1117350 rdf:type bao:BAO_0000010 .
bioassay:AID1117350 rdf:type bao:BAO_0000219 .

endpoint:SID170464897_AID1117349 vocabulary:PubChemAssayOutcome vocabulary:active ;
    sio:has-value "0.0656"^^xsd:float ;
    a bao:BAO_0000190 ;
    rdfs:label "IC50"@en .

substance:SID170464897 skos:exactMatch chembl_molecule:CHEMBL1483 .

chembl_assay:OIDD00017 cco:hasCellLine chembl_cell_line:CHEMBL3308376 .
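To show how statements like these support cross-resource lookups, here is a minimal sketch that holds a few of the slide's triples as plain Python tuples and matches them against wildcard patterns. It stands in for a real triple store and SPARQL query; the helper names `match` and `exact_matches` are hypothetical and are not part of OPDDR, Open PHACTS, or any RDF library.

```python
# A few triples copied from the slide, as (subject, predicate, object) strings.
TRIPLES = [
    ("bioassay:AID1117350", "skos:exactMatch", "oidd_assay:17"),
    ("bioassay:AID1117350", "rdf:type", "bao:BAO_0002786"),
    ("endpoint:SID170464897_AID1117349", "sio:has-value", "0.0656"),
    ("substance:SID170464897", "skos:exactMatch", "chembl_molecule:CHEMBL1483"),
    ("chembl_assay:OIDD00017", "cco:hasCellLine", "chembl_cell_line:CHEMBL3308376"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def exact_matches(triples, node):
    """Identifiers linked to `node` by skos:exactMatch, in either direction."""
    linked = {o for _, _, o in match(triples, s=node, p="skos:exactMatch")}
    linked |= {s for s, _, _ in match(triples, p="skos:exactMatch", o=node)}
    return linked


# Which ChEMBL molecule does the PubChem substance correspond to?
chembl_ids = exact_matches(TRIPLES, "substance:SID170464897")
```

The same pattern, applied with `p="cco:hasCellLine"`, retrieves the cell line attached to the ChEMBL assay record, which is the kind of phenotypic link the integrated RDF is meant to expose.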

Page 27: Bibliological data science and drug discovery

D2D builds apps, tools and solutions for knowledge discovery powered by fast, scalable network analytics and rigorous semantics.

d2discovery.com

Predictive Phenotypic Profiler (P3) prototype

Page 28: Bibliological data science and drug discovery

openphacts.org

Page 29: Bibliological data science and drug discovery

OPDDR

● OPDDR phenotypic assays have been linked and integrated via community semantics to both phenotypic (cell lines) and molecular (genomic/protein targets) entities

● New phenotypic knowledge domain offers additional value in drug discovery and pharmacological informatics

● Open PHACTS excellent, well suited platform

Page 30: Bibliological data science and drug discovery

DrugCentral

Page 31: Bibliological data science and drug discovery

DrugCentral

● DrugCentral is a free, open, curated resource about approved drugs, designed for research

● Compounds, products, labels, targets, IDs, names

● DrugCentral developed over several years at UNM

● DrugCentral recently released with new interface

● License: CC-BY-SA

http://drugcentral.org

Page 32: Bibliological data science and drug discovery

http://drugcentral.org

Page 33: Bibliological data science and drug discovery

http://drugcentral.org

Page 34: Bibliological data science and drug discovery

DrugCentral

● Free, open, accurate, comprehensive drug reference for biomolecular and biomedical informatics research

Compounds       4444
Products       84787
Synonyms       20522
Structures      4231
Targets         3651
Bioactivities  15620
MoA             3484
SNOMED         45349

Page 35: Bibliological data science and drug discovery

"DrugCentral: online drug compendium", Oleg Ursu, Jayme Holmes, Jeffrey Knockel, Cristian Bologa, Jeremy Yang, Stephen Mathias, Stuart Nelson, Tudor Oprea (manuscript submitted).

Page 36: Bibliological data science and drug discovery

In Conclusion

● New resources continue to emerge and evolve, providing opportunities for knowledge driven drug discovery
● Community standards → more intelligent web
● Adapt to new data environment for success
● Private + public data must be integrated to
  ○ Be prepared (like Pasteur)
  ○ Not "foul up" (like Glenn)