2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

May 10, 2015

open_phacts

Keynote presentation given by Lee Harland at EKAW 2012

http://rd.springer.com/chapter/10.1007/978-3-642-33876-2_1
Transcript
Page 1: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

http://openphacts.org [email protected]

@Open_PHACTS

Page 2: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 3: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 4: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 5: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Source: Nature Reviews Drug Discovery 11, 191-200 (March 2012) | doi:10.1038/nrd3681 Jack W. Scannell, Alex Blanckley, Helen Boldon & Brian Warrington

Page 6: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Source: Nature Reviews Drug Discovery 3, 711-716 (August 2004) | doi:10.1038/nrd1470 Ismail Kola & John Landis

harmful

harmful useless

Page 7: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

http://www.medicalprogresstoday.com/spotlight/spotlight_indarchive.php?id=1039

Derek Lowe

Page 8: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

http://www.ebi.ac.uk/Information/Brochures/pdf/EMBL-EBI%20Annual%20Report%202011.pdf

Page 9: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

297,650

http://www.forbes.com/sites/matthewherper/2011/04/13/a-decade-in-drug-industry-layoffs/

Page 10: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Information Tombs…

¤ Built to primary use-case
¤ Tailored indexes
¤ Tailored GUIs
¤ Unique language & metadata
¤ Poor interoperability/integration

Literature, HR, Synthesis, Portfolio, SAR, Docs, Safety, In vivo, Etc

Page 11: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

The Outside World

Page 12: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 13: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Precompetitive Informatics

Public Domain Drug Discovery Data: Pharma are accessing, processing, storing & re-processing

Literature, PubChem, Genbank, Patents, Databases, Downloads

Data Integration, Data Analysis, Firewalled Databases

Repeat @ each company x

Lowering industry firewalls: pre-competitive informatics in drug discovery Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944

Page 14: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

The Innovative Medicines Initiative

•  EC funded public-private partnership for pharmaceutical research
•  Focus on key problems
   –  Efficacy, Safety, Education & Training, Knowledge Management

The Open PHACTS Project

•  Create a semantic integration hub (“Open Pharmacological Space”)…
•  Runs 2011-2014
•  Deliver services to support on-going drug discovery programs in pharma and public domain
•  Leading academics in semantics, pharmacology and informatics, driven by solid industry business requirements
•  23 academic partners, 8 pharmaceutical companies, 3 software SMEs
•  Work split into clusters:
   •  Technical Build
   •  Scientific Drive
   •  Community & Sustainability

Page 15: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Pathways

Pharmacological Activities

Biological Processes

Transcripts

Pathological Processes

Diseases

Genes

Proteins

Interactions

Clinical Drug Applications

Indications

Drugs

Compounds

Chemicals

Page 16: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Optimised To Business Questions

Number  sum  Nr of 1  Question

15  12  9  All oxido-reductase inhibitors active <100nM in both human and mouse

18  14  8  Given compound X, what is its predicted secondary pharmacology? What are the on and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?

24  13  8  Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives.

32  13  8  For a given interaction profile, give me compounds similar to it.

37  13  8  The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.

38  13  8  Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not).

41  13  8  A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all cmpds active in assays where the resolution is at least at the level of the target family (i.e. PKC) both from structured assay databases and the literature.

44  13  8  Give me all active compounds on a given target with the relevant assay data

46  13  8  Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)

59  14  8  Identify all known protein-protein interaction inhibitors

Page 17: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Goals

Platform GUI

Standards

Apps

API

Page 18: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

A Precompetitive Knowledge Framework

Integration

Pharma Needs

Inputs

Sustainability Stability Security

Management / Governance

Data Mining Services/Algorithms

Mapping & Populating Architecture Interfaces

& Services

Content Structured & Unstructured

Vocabularies & Identifiers

(URIs)

Community KD Innovation

Page 19: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 20: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Data Cache (Virtuoso Triple Store)

Linked Data API (RDF/XML, TTL, JSON) Domain Specific Services

Open PHACTS Explorer 1st Gen Apps

Identity Resolution Service (ConceptWiki)

Chemistry Normalisation & Q/C ChemSpider

Identifier Management Service (BridgeDb+)

Partner Apps

Data Import

P12374 EC2.43.4

CS4532

“Adenosine receptor 2a”

Oct. 2012

Public Content Commercial

Public Ontologies

User Annotations

Page 21: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

P12047 X31045

GB:29384

Page 22: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 23: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 24: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 25: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Issues

¤ Provenance

¤ Conflicting Authorities

¤ Management

¤ Transitivity

Page 26: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

What's “equal” anyway?

Gleevec® = Imatinib Mesylate

Imatinib Mesylate YLMAHDNUQAMNNX-UHFFFAOYSA-N

Page 27: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Search “Gleevec”

PubChem Drugbank ChemSpider

Imatinib

Mesylate

Page 28: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Consequences…..

Page 29: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 30: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Ignore Salts?

NCX-911 Viagra ®

Page 31: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

The 18th International Conference on Knowledge Engineering and Knowledge Management is concerned with all aspects of eliciting, acquiring, modeling and managing knowledge, and its role in the construction of knowledge-intensive systems and services for the semantic web, knowledge management, e-business, natural language processing, intelligent information integration, etc. The focus of the 18th edition of EKAW will be on "Knowledge Engineering and Knowledge Management that matters".

Page 32: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Dynamic Equality

§  Tuneable (same data, different questions)
§  Domain specific
§  User driven
§  Traceable

Strict Relaxed

Analysing Browsing

Page 33: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

LinkSet#1 {
  chemspider:gleevec hasParent imatinib ...
  drugbank:gleevec exactMatch imatinib ...
}

Page 34: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

linkSet1 { chemspider:aspirin exactMatch chembl:aspirin … }
linkSet2 { imatinib_mesylate hasParent imatinib … }
linkSet3 { (+)Staurosporine enantiomer (-)Staurosporine … }
linkSet4 { vanillaEssence hasPart Vanillin … }

Profile P1 “Broad”

Profile P2 “Parents”

Profile P2 “Strict”
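
The idea of profiles selecting link sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the link set contents are copied from the slide, but which link sets belong to which profile is a guess made for this sketch, and following links transitively is a simplification (the talk itself lists transitivity as an open issue). The names `LINKSETS`, `PROFILES` and `equivalents` are hypothetical.

```python
from collections import defaultdict

# Link sets modelled on the slide: (subject, predicate, object) triples.
LINKSETS = {
    "linkSet1": [("chemspider:aspirin", "exactMatch", "chembl:aspirin")],
    "linkSet2": [("imatinib_mesylate", "hasParent", "imatinib")],
    "linkSet3": [("(+)Staurosporine", "enantiomer", "(-)Staurosporine")],
    "linkSet4": [("vanillaEssence", "hasPart", "Vanillin")],
}

# ASSUMPTION: profile membership is invented for illustration only.
PROFILES = {
    "Strict": ["linkSet1"],
    "Parents": ["linkSet1", "linkSet2"],
    "Broad": ["linkSet1", "linkSet2", "linkSet3", "linkSet4"],
}


def equivalents(identifier, profile):
    """Return every identifier treated as 'equal' to `identifier` under `profile`."""
    # Build an undirected graph from the link sets the profile activates.
    neighbours = defaultdict(set)
    for name in PROFILES[profile]:
        for subject, _predicate, obj in LINKSETS[name]:
            neighbours[subject].add(obj)
            neighbours[obj].add(subject)

    # Walk the graph; links are followed transitively (a simplification).
    seen = {identifier}
    stack = [identifier]
    while stack:
        current = stack.pop()
        for other in neighbours[current]:
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return seen


if __name__ == "__main__":
    for profile in ("Strict", "Parents", "Broad"):
        print(profile, sorted(equivalents("imatinib_mesylate", profile)))
```

The same identifier yields a different equivalence set depending on the profile, which is the "same data, different questions" behaviour the Dynamic Equality slide asks for.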

Page 35: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

The Identifier Mapping Service

Identity Mapping Service (BridgeDB)

Query Expander Service

cw:979b545d-f9a9 cheminf:logd ?logd

cw:979b545d-f9a9

?iri cheminf:logd ?logd .
FILTER (?iri = cw:979b545d-f9a9 || ?iri = cs:2157 || ?iri = chembl:1280 || ?iri = db:db00945 || …) … }

For each line of SPARQL:

[cs:2157, chembl:1280,db:db00945]

parse

recognise

expand

transform

Profiles

Mappings

Q, P1 context GRAPH <http://rdf.chemspider.com> {

Q’
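
The expansion step on this slide (replace a fixed identifier with a variable and a FILTER over its mapped equivalents) can be sketched as plain string rewriting. This is a minimal illustration, not the project's Query Expander: the function name `expand_pattern` and the in-memory `MAPPINGS` table are hypothetical, the identifiers are the ones shown on the slide, and a real service would parse the SPARQL rather than handle one triple pattern as three strings.

```python
# Mapping table as a lookup service might return it for one profile.
# The identifiers come from the slide; the dict itself is an assumption.
MAPPINGS = {
    "cw:979b545d-f9a9": ["cs:2157", "chembl:1280", "db:db00945"],
}


def expand_pattern(subject, predicate, obj, mappings, var="?iri"):
    """Rewrite one triple pattern so that it matches every mapped identifier."""
    mapped = mappings.get(subject)
    if not mapped:
        # Nothing known about this subject: leave the pattern untouched.
        return f"{subject} {predicate} {obj} ."
    alternatives = [subject] + list(mapped)
    condition = " || ".join(f"{var} = {iri}" for iri in alternatives)
    return f"{var} {predicate} {obj} . FILTER ({condition})"


if __name__ == "__main__":
    print(expand_pattern("cw:979b545d-f9a9", "cheminf:logd", "?logd", MAPPINGS))
```

Swapping the mapping table for one built from a different profile changes the FILTER without touching the original query, which is why the expander takes the profile as context.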

Page 36: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Based on ve2 editor http://lab.linkeddata.deri.ie/ve2/

Shouldn’t an integration system be able to tell you exactly what it’s integrating?

Page 37: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

## your dataset description
:myDS rdf:type void:Dataset ;
    foaf:homepage <http://example.org/> ;
    dcterms:title "Example Dataset"^^xsd:string ;
    dcterms:description """A simple dataset in RDF."""^^xsd:string ;
    pav:license <http://creativecommons.org/licenses/by-sa/3.0/> ;
    void:uriSpace "http://example.org/"^^xsd:string ;
    pav:retrievedFrom <http://exampledownload.com> ;
    pav:retrievedOn "2012-09-19"^^xsd:date ;
    pav:retrievedBy <http://some_web_id> ;
    pav:version "15.5"^^xsd:string .
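
One way an import pipeline could use such a description is to refuse datasets whose metadata is incomplete. The sketch below is an assumption-laden illustration: the property names are taken from the example description above, but treating all of them as mandatory, and representing the description as a plain Python dict instead of parsed RDF, are choices made only for this example. `EXPECTED_PROPERTIES` and `missing_properties` are hypothetical names.

```python
# Properties seen in the example description; requiring all of them
# is an assumption made for this sketch.
EXPECTED_PROPERTIES = (
    "dcterms:title",
    "dcterms:description",
    "pav:license",
    "void:uriSpace",
    "pav:retrievedFrom",
    "pav:retrievedOn",
    "pav:retrievedBy",
    "pav:version",
)


def missing_properties(description):
    """Return the expected properties that are absent or empty."""
    return [p for p in EXPECTED_PROPERTIES if not description.get(p)]


# The slide's example, flattened into a dict for illustration.
example = {
    "dcterms:title": "Example Dataset",
    "dcterms:description": "A simple dataset in RDF.",
    "pav:license": "http://creativecommons.org/licenses/by-sa/3.0/",
    "void:uriSpace": "http://example.org/",
    "pav:retrievedFrom": "http://exampledownload.com",
    "pav:retrievedOn": "2012-09-19",
    "pav:retrievedBy": "http://some_web_id",
    "pav:version": "15.5",
}

if __name__ == "__main__":
    print(missing_properties(example))
    print(missing_properties({"dcterms:title": "Untitled"}))
```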

Page 38: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Provenance Everywhere

<inDataset href="http://rdf.chemspider.com/void.rdf#chemSpiderDataset" />

Page 39: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Nanopublications

Page 40: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Credit For Curation

Page 41: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Quality Assertions

ChemSpider Validation & Standardization Platform http://bit.ly/NZF5VB

Page 42: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

QUDT (http://www.qudt.org/)

STANDARD_TYPE     UNIT_COUNT
----------------  ----------
AC50              7
Activity          421
EC50              39
IC50              46
ID50              42
Ki                23
Log IC50          4
Log Ki            7
Potency           11
log IC50          0

STANDARD_TYPE       STANDARD_UNITS      COUNT(*)
------------------  ------------------  --------
IC50                nM                  829448
IC50                ug.mL-1             41000
IC50                                    38521
IC50                ug/ml               2038
IC50                ug ml-1             509
IC50                mg kg-1             295
IC50                molar ratio         178
IC50                ug                  117
IC50                %                   113
IC50                uM well-1           52
IC50                p.p.m.              51
IC50                ppm                 36
IC50                uM-1                25
IC50                nM kg-1             25
IC50                milliequivalent     22
IC50                kJ m-2              20

~ 100 units

>5000 types
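
The unit table shows why normalisation is needed before values can be compared: "ug.mL-1", "ug/ml" and "ug ml-1" are the same unit spelled three ways. The sketch below illustrates one possible approach in plain Python; it does not use QUDT, and the synonym table, the function names and the decision to return `None` for units that cannot be converted to a concentration are all assumptions made for this example. "uM" is included for illustration even though it does not appear on its own in the table. Converting a mass concentration to nM needs the molecular weight, which the caller must supply.

```python
# Spelling variants mapped to one canonical label (illustrative subset).
UNIT_SYNONYMS = {
    "nM": "nM",
    "uM": "uM",
    "ug.mL-1": "ug/mL",
    "ug/ml": "ug/mL",
    "ug ml-1": "ug/mL",
    "p.p.m.": "ppm",
    "ppm": "ppm",
}


def canonical_unit(raw_unit):
    """Return the canonical label for a raw unit string, or None if unknown."""
    return UNIT_SYNONYMS.get(raw_unit.strip())


def to_nanomolar(value, raw_unit, mol_weight=None):
    """Convert a concentration to nM, or return None if no conversion is defined."""
    unit = canonical_unit(raw_unit)
    if unit == "nM":
        return value
    if unit == "uM":
        return value * 1000.0
    if unit == "ug/mL":
        if mol_weight is None:
            raise ValueError("molecular weight (g/mol) is required for ug/mL")
        # 1 ug/mL = 1e-3 g/L; dividing by g/mol gives mol/L; 1 mol/L = 1e9 nM.
        return value * 1e6 / mol_weight
    # Units such as 'kJ m-2' or '%' are not molar concentrations.
    return None


if __name__ == "__main__":
    print(to_nanomolar(1.0, "ug.mL-1", mol_weight=500.0))
    print(to_nanomolar(5.0, "kJ m-2"))
```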

Page 43: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Licensing

Page 44: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Linked Closed Data

Page 45: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Kick-Starting Sustainability

Apps

API

•  Chem-Bio Navigator
•  Target Dossier
•  Polypharmacology Browser
•  Utopia Documents
•  Disease Maps
•  … more

Page 46: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 47: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Conclusions

¤ Project designed for the new drug discovery environment

¤ Timing with RDF/SW is good

¤ Companies eager to see whether it can really make a difference

¤ Challenge: Got to be better than state of the art (in 3 years!)

¤ Funding challenges are formidable

Page 48: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Acknowledgements

¤  Many members of the consortium who have contributed to data, use cases, funding, support, documentation, management

¤  EBI: John Overington, Anna Gaulton, Mark Davies

¤  Lundbeck: Sune Askjær

¤  Maastricht: Chris Evelo, Andra Waagmeester, Egon Willighagen

¤  Manchester:
    ¤  Carole Goble, Alasdair Gray, Christian Brenninkmeijer
    ¤  Steve Pettifer, Ian Dunlop, Rishi Ramgolam, James Eales

¤  NBIC: Barend Mons, Kees Burger

¤  RSC: Antony Williams, Valery Tkachenko

¤  SIB: Christine Chichester

¤  VU: Frank van Harmelen, Paul Groth, Antonis Loizou

¤  OpenLink: Orri Erling, Yrjana Rankka, Hugh Williams

¤  Chem2Bio2RDF: David Wild, Bin Chen

Page 49: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

More Info

[email protected]

http://openphacts.org

@Open_PHACTS

[email protected]

@Scibitely

Page 50: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

backup

Page 51: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Find me the off-target activities of known cancer drugs whose primary target is a cell cycle regulatory kinase

ChEMBL DrugBank Gene Ontology Wikipathways

Uniprot

ChemSpider

UMLS

ConceptWiki

ChEBI

Connected Using Semantic Technology

Page 52: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Are these Interleukin 1A?

http://bio2rdf.org/uniprot:P01583

http://identifiers.org/uniprot/P01583

Human Interleukin 1A Protein

Human Interleukin 1A Protein

Entrez Gene: 3552, Ensembl:ENSG00000115008

1ITA (3D) 2ILA (3D) 2KKI (3D) 2L5X (3D) IL1A PDB Structures

Uniprot:P01582 Mouse Interleukin 1A

Human Interleukin 1A Gene

1076_at, 210118_s_at, 208200_at, 208200_at Affymetrix probes hIL1A

….etc

Page 53: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

“There is lots of data we all use every day, and it’s not part of the web. I can see my bank statements on the web, and my photographs, and I can see my appointments in a calendar. But can I see my photos in a calendar to see what I was doing when I took them? Can I see bank statement lines in a calendar?

No. Why not? Because we don’t have a web of data. Because data is controlled by applications and each application keeps it to itself.”

Sir Tim Berners-Lee

Page 54: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project
Page 55: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Are These Vanilla?

Page 56: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Multiple Namespaces

Uniprot database ID: P26838

http://identifiers.org/uniprot/P26838
http://bio2rdf.org/uniprot:P26838
http://uniprot.bio2rdf.org/uniprot:P26838
http://chem2bio2rdf.org/uniprot/resource/P26838
http://purl.uniprot.org/uniprot/P26838
……
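
Recognising that these URIs name the same record amounts to stripping the namespace and comparing accessions. The following sketch shows that idea with regular expressions; the patterns cover only the five URI shapes listed on the slide, the accession character class is a deliberate simplification, and `uniprot_accession` is a hypothetical helper name, not part of any Open PHACTS service.

```python
import re

# One pattern per URI shape on the slide; the accession is captured as 'acc'.
UNIPROT_URI_PATTERNS = [
    re.compile(r"^http://identifiers\.org/uniprot/(?P<acc>[A-Z0-9]+)$"),
    re.compile(r"^http://bio2rdf\.org/uniprot:(?P<acc>[A-Z0-9]+)$"),
    re.compile(r"^http://uniprot\.bio2rdf\.org/uniprot:(?P<acc>[A-Z0-9]+)$"),
    re.compile(r"^http://chem2bio2rdf\.org/uniprot/resource/(?P<acc>[A-Z0-9]+)$"),
    re.compile(r"^http://purl\.uniprot\.org/uniprot/(?P<acc>[A-Z0-9]+)$"),
]


def uniprot_accession(uri):
    """Return the UniProt accession embedded in `uri`, or None if unrecognised."""
    for pattern in UNIPROT_URI_PATTERNS:
        match = pattern.match(uri)
        if match:
            return match.group("acc")
    return None


if __name__ == "__main__":
    uris = [
        "http://identifiers.org/uniprot/P26838",
        "http://bio2rdf.org/uniprot:P26838",
        "http://purl.uniprot.org/uniprot/P26838",
    ]
    # All three collapse to a single accession.
    print({uniprot_accession(u) for u in uris})
```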

Page 57: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

What’s this?

Page 58: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

http://www.drugbank.ca/drugs/DB00203

/Viagra

Page 59: 2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHACTS Project

Data sets