Introduction to Bio Ontologies and The Semantic Web M. Devisscher Biological Databases

Bio ontologies and semantic technologies

Apr 16, 2017

Page 1: Bio ontologies and semantic technologies

Introduction to Bio Ontologies and The Semantic Web

M. Devisscher
Biological Databases

Page 2: Bio ontologies and semantic technologies

Overview

• Bio ontologies
• Semantic technologies

• Practical sessions:
  – Protégé and a bio database
  – DIY SPARQL endpoint

Page 3: Bio ontologies and semantic technologies

Introduction

• Ontologies: what are ontologies?

• Ontologies in the bio domain: OBO Foundry
• Ontologies in the semantic web

• OBO
• RDF, IRI, TTL, SPARQL, OWL

Page 4: Bio ontologies and semantic technologies

What is an ontology?

• Ontology = a specification of a conceptualization (Gruber 1993)

• In practice: controlled vocabularies
  – Disambiguation (e.g. Bank, Running)
  – Language/species independence

• Very useful in biology: complex hierarchies of terms

Page 5: Bio ontologies and semantic technologies

Ontologies in the bio domain

• OBO Foundry: Open Biological and Biomedical Ontologies

• Common principles
• List of ontologies at http://www.obofoundry.org

• OBO is also a data format (.obo)

Page 6: Bio ontologies and semantic technologies

SideTrack – The Gene Ontology

• The mother of bio-ontologies: the GO
  – Oldest bio-ontology
  – Many practical applications:
    • Cross-species studies
    • Term abundance studies

• GO is an OBO ontology

Page 7: Bio ontologies and semantic technologies

SideTrack – The Gene Ontology

• Collection of terms

Page 8: Bio ontologies and semantic technologies

SideTrack – The Gene Ontology

• Relationships between terms:
  – Subsumption: is_a
  – Partonomic: part_of

• These relationships are transitive
• Terms form a DAG (directed acyclic graph)
• Some information can be inferred
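
Because is_a and part_of are transitive, one kind of inferred information is the set of all ancestors of a term. The following is a minimal Python sketch of that idea, not how any GO tool actually does it. The term names and edges are invented for illustration and are not real GO identifiers.

```python
from collections import defaultdict

# Hypothetical is_a edges (child -> parent); names are illustrative only.
IS_A = [
    ("mitochondrial_membrane", "organelle_membrane"),
    ("nuclear_membrane", "organelle_membrane"),
    ("organelle_membrane", "membrane"),
    ("membrane", "cellular_component"),
]


def ancestors(term, edges):
    """Return every term reachable from `term` by following edges upward."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in found:  # the DAG may offer several paths to one node
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(ancestors("mitochondrial_membrane", IS_A)))
```

An entity annotated with the most specific term can then be treated as annotated with every ancestor as well, which is what term abundance studies typically rely on.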

Page 9: Bio ontologies and semantic technologies

SideTrack – The Gene Ontology

Page 10: Bio ontologies and semantic technologies

SideTrack – The Gene Ontology

Page 11: Bio ontologies and semantic technologies

SideTrack – The Gene Ontology

• Know more: www.geneontology.org
• AmiGO: the GO browser

Page 12: Bio ontologies and semantic technologies

Gene Ontology Annotation

• Gene Ontology annotations (GOA) = entities labeled with GO terms
  – E.g. UniProt-GOA

Page 13: Bio ontologies and semantic technologies

Semantic Technologies

• The semantic web: Tim Berners-Lee et al., Scientific American, 2001

Page 14: Bio ontologies and semantic technologies

Semantic Technologies

• W3C: a set of specifications
  http://www.w3.org/standards/semanticweb/

• A mature toolset
  – Dedicated data formats
  – Storage
  – Query language

Page 15: Bio ontologies and semantic technologies

Semantic Technologies

• Basic data element = a Triple
  – A mini sentence
  – Contains three Terms:
    • Subject Predicate Object

Page 16: Bio ontologies and semantic technologies

Semantic Technologies

• Representation of triples
  – Basic data format: RDF/XML
  – All data expressed in RDF (Resource Description Framework)
  – Several compatible syntaxes: TTL (Turtle, the Terse RDF Triple Language) is the most human-readable

Page 17: Bio ontologies and semantic technologies

Example

Page 18: Bio ontologies and semantic technologies

The Turtle Syntax

• Basic Triple

<http://bioinformatics.be/entities#martijn>
    <http://bioinformatics.be/relations#has_favorite_beer>
    <http://bioinformatics.be/entities#karmeliet> .

Page 19: Bio ontologies and semantic technologies

The Turtle Syntax

• Prefix

@prefix b4x: <http://bioinformatics.be/terms#> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet .

Page 20: Bio ontologies and semantic technologies

The Turtle Syntax

• Predicate lists

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ;
    foaf:name "Martijn Devisscher" .

Page 21: Bio ontologies and semantic technologies

The Turtle Syntax

• Object lists

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ,
        b4x:chimay_blauw ;
    foaf:name "Martijn Devisscher" .

Page 22: Bio ontologies and semantic technologies

IRIs and Literals

• Terms can be either IRIs, literals or blank nodes
• IRI = Internationalized Resource Identifier
• Unique id: a virtual URI
  – Example: http://bioinformatics.be/terms#martijn
  – There is no requirement for resolving
  – Now: Open Data initiatives: please do use resolvable URIs (http://linkeddata.org)
  – Unique identifiers can be registered on http://identifiers.org

Page 23: Bio ontologies and semantic technologies

Introduction

• Literals: can be typed; allowed types come from the XSD namespace:
  – E.g. "This is a string example"^^xsd:string
  – E.g. "5"^^xsd:integer

• IRIs are used for entities and attributes
• Literals are used for attribute values that aren't entities

Page 24: Bio ontologies and semantic technologies

The Turtle Syntax

• Typed literals

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ,
        b4x:chimay_blauw ;
    b4x:length "184"^^xsd:integer ;
    foaf:name "Martijn Devisscher"^^xsd:string .

Page 25: Bio ontologies and semantic technologies

The Turtle Syntax

• Blank nodes

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ,
        b4x:chimay_blauw ;
    b4x:length "184"^^xsd:integer ;
    foaf:name "Martijn Devisscher"^^xsd:string ;
    b4x:owns_cat [ b4x:color "Gray" ] .

Page 26: Bio ontologies and semantic technologies

Classes and Individuals

• rdf:type

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn rdf:type foaf:Person .

Page 27: Bio ontologies and semantic technologies

Classes and Individuals

• Shorthand: a

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn a foaf:Person ;
    foaf:knows b4x:geert .
b4x:geert a foaf:Person .

Page 28: Bio ontologies and semantic technologies

Example

<http://xmpl/entities#martijn>
    <http://xmpl/relations#has_favorite_beer>
    <http://xmpl/entities#karmeliet> .

Page 29: Bio ontologies and semantic technologies

Semantic Technologies

• Sets of triples form a Graph

Page 30: Bio ontologies and semantic technologies

Graphs

• Triples are building blocks of Graphs

• Combining sets of triples allows the construction of arbitrarily complex graphs

b4x:martijn -- has_favorite_beer --> b4x:karmeliet
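
As a rough illustration of how sets of triples combine into one graph, the Python sketch below represents each triple as a tuple and merges two sets with a plain set union. The triples are taken from the earlier Turtle slides; the function name `to_adjacency` is made up for this example.

```python
from collections import defaultdict


def to_adjacency(triples):
    """Group triples by subject: subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in sorted(triples):
        graph[subject].append((predicate, obj))
    return dict(graph)


# Two independent sets of triples, e.g. from two different files.
set_a = {("b4x:martijn", "b4x:has_favorite_beer", "b4x:karmeliet")}
set_b = {
    ("b4x:martijn", "foaf:knows", "b4x:geert"),
    ("b4x:geert", "rdf:type", "foaf:Person"),
}

# Merging graphs is just a set union: shared IRIs become shared nodes.
merged = set_a | set_b

for node, edges in to_adjacency(merged).items():
    print(node, edges)
```

The two sets connect without any extra work because both use the same IRI for `b4x:martijn`.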

Page 31: Bio ontologies and semantic technologies

Add meaning!

• Reuse terms from existing, well-defined vocabularies/ontologies (foaf, dc, go, so)

• Describe new terms = Ontologies

• Contain
  – A crisp human definition
  – Some machine-readable facts

Page 32: Bio ontologies and semantic technologies

Metadata

• Ontologies are also described in RDF
  – RDFS: RDF Schema
  – OWL: Web Ontology Language
  – Also expressed in RDF

• For clarity, file extension can be .rdfs or .owl

Page 33: Bio ontologies and semantic technologies

RDFS Essentials

• Descriptions
  – rdfs:label
  – rdfs:comment

Page 34: Bio ontologies and semantic technologies

RDFS

• Relationships between properties, classes
  – rdfs:Class
  – rdfs:subClassOf
  – rdf:Property
  – rdfs:subPropertyOf
  – rdfs:range
  – rdfs:domain

Page 35: Bio ontologies and semantic technologies

RDFS: Example

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix b4x: <http://bioinformatics.be/terms#> .

b4x:karmeliet a b4x:Trappist .
b4x:Beer a rdfs:Class .
b4x:Trappist a rdfs:Class .
b4x:Trappist rdfs:subClassOf b4x:Beer .
b4x:has_favorite_beer a rdf:Property ;
    rdfs:domain foaf:Person ;
    rdfs:range b4x:Beer .

b4x:Beer rdfs:subClassOf b4x:Drink .

Page 36: Bio ontologies and semantic technologies

Analogy

• RDF = database = data
• RDFS/OWL = schema = metadata

• Both are described in RDF, but have a different scope

Page 37: Bio ontologies and semantic technologies

Semantic Technologies

• Inference
  – Enhance dataset using knowledge from metadata (e.g. RDFS, OWL)

• Types of inference engines
  – RDFS inference
    • RDFS entailment regime
  – OWL inference
    • Under active research
    • Engines exist for specific subsets of OWL (OWL-DL)

Page 38: Bio ontologies and semantic technologies

RDFS Entailment

Page 39: Bio ontologies and semantic technologies

RDFS: Inference

b4x:kevin b4x:has_favorite_beer b4x:stella .

Q: What can we infer from this using RDFS entailment?

Page 40: Bio ontologies and semantic technologies

RDFS: Inference

b4x:kevin b4x:has_favorite_beer b4x:stella .

Inferred triples:
b4x:kevin a foaf:Person [from domain]
b4x:stella a b4x:Beer [from range]
b4x:stella a b4x:Drink [from subClassOf]
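
The three inferred triples can be reproduced with a small forward-chaining loop. The Python sketch below is a toy that implements only three rules (rdfs:domain, rdfs:range, and type propagation along rdfs:subClassOf). It is not the full RDFS entailment regime and not how a real inference engine is built. Prefixed names are kept as plain strings, and `rdfs_closure` is a name chosen for this example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

# Schema triples from the RDFS example slide.
SCHEMA = {
    ("b4x:has_favorite_beer", DOMAIN, "foaf:Person"),
    ("b4x:has_favorite_beer", RANGE, "b4x:Beer"),
    ("b4x:Trappist", SUBCLASS, "b4x:Beer"),
    ("b4x:Beer", SUBCLASS, "b4x:Drink"),
}

DATA = {("b4x:kevin", "b4x:has_favorite_beer", "b4x:stella")}


def rdfs_closure(triples):
    """Apply the three rules repeatedly until no new triple appears."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                if s2 == p and p2 == DOMAIN:
                    new.add((s, RDF_TYPE, o2))   # subject gets the domain class
                elif s2 == p and p2 == RANGE:
                    new.add((o, RDF_TYPE, o2))   # object gets the range class
                elif p == RDF_TYPE and s2 == o and p2 == SUBCLASS:
                    new.add((s, RDF_TYPE, o2))   # instance of a class is an instance of its superclass
        if new <= known:
            return known
        known |= new


inferred = rdfs_closure(SCHEMA | DATA) - (SCHEMA | DATA)
for triple in sorted(inferred):
    print(triple)
```

The loop needs two passes here: the first derives that `b4x:stella` is a `b4x:Beer`, and only then can the subClassOf rule derive `b4x:Drink`.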

Page 41: Bio ontologies and semantic technologies

DuckTyping

• Watch out with inference!

Example: You want to express that people can have lengths

b4x:length a rdf:Property ;
    rdfs:domain foaf:Person ;
    rdfs:range xsd:integer .

Page 42: Bio ontologies and semantic technologies

DuckTyping

• Problem:

ex:VW_Transporter b4x:length "600"^^xsd:integer .

• Would infer that VW_Transporter is a Person!
• This is called DuckTyping

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck

Page 43: Bio ontologies and semantic technologies

Task

• Find a solution: express in RDFS that people can have lengths

Page 44: Bio ontologies and semantic technologies

Task

• Find a solution: express in RDFS that people can have lengths

b4x:havingLength a rdfs:Class .
b4x:length a rdf:Property ;
    rdfs:domain b4x:havingLength ;
    rdfs:range xsd:integer .

foaf:Person rdfs:subClassOf b4x:havingLength .
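
To see why this solution avoids the duck-typing trap, the Python sketch below compares what gets inferred about the VW Transporter under the naive schema and under the fixed one. It is a simplified model: `infer_types` is a hypothetical helper that handles only rdfs:domain and a single-parent rdfs:subClassOf map, and the length value is stored as a plain integer.

```python
def infer_types(data, domains, subclasses):
    """Infer classes per subject from rdfs:domain, then walk up rdfs:subClassOf.

    domains:    property -> class declared as its rdfs:domain
    subclasses: class -> its single superclass (a simplification)
    """
    types = {}
    for subject, prop, _value in data:
        if prop in domains:
            types.setdefault(subject, set()).add(domains[prop])

    changed = True
    while changed:
        changed = False
        for classes in types.values():
            for cls in list(classes):
                parent = subclasses.get(cls)
                if parent is not None and parent not in classes:
                    classes.add(parent)
                    changed = True
    return types


DATA = [("ex:VW_Transporter", "b4x:length", 600)]

# Naive schema: the domain of b4x:length is foaf:Person.
naive = infer_types(DATA, {"b4x:length": "foaf:Person"}, {})

# Fixed schema: the domain is b4x:havingLength, and foaf:Person is a subclass of it.
fixed = infer_types(
    DATA,
    {"b4x:length": "b4x:havingLength"},
    {"foaf:Person": "b4x:havingLength"},
)

print("naive:", naive)
print("fixed:", fixed)
```

Inference only travels from subclass to superclass, so having a length no longer implies being a person, while every person is still something that can have a length.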

Page 45: Bio ontologies and semantic technologies

Storing RDF

• As an RDF file for download
• In a Triplestore
  – Database optimised for storing triples
  – Examples: Blazegraph, Fuseki, Sesame

Page 46: Bio ontologies and semantic technologies

Semantic Technologies

• Querying over RDF data: SPARQL
• Cool features:
  – Distributed querying = actual distribution of data and computing resources
  – SPARQL/Update: modify data

• SPARQL endpoints: SPARQL over HTTP

Page 47: Bio ontologies and semantic technologies

SPARQL Query Syntax

• First example:

SELECT ?subject ?predicate ?object WHERE {
    ?subject ?predicate ?object .
}

(Generally not a good idea, as it will pull down the whole dataset)

Binding variables

Graph matching

Page 48: Bio ontologies and semantic technologies

?

PREFIX b4x: <http://bioinformatics.be/terms#>
SELECT ?person WHERE {
    ?person b4x:has_favorite_beer b4x:karmeliet .
}
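
Graph matching and variable binding can be sketched in a few lines of Python. This toy matcher handles a single triple pattern only, whereas real SPARQL engines join many patterns and use indexes. Terms starting with `?` are treated as variables. The triple about `b4x:geert` liking karmeliet is invented for the example.

```python
def is_var(term):
    """A term is a variable when it starts with '?', as in SPARQL."""
    return term.startswith("?")


def match(pattern, triples):
    """Yield one binding dict per triple that fits the pattern."""
    for triple in triples:
        binding = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            yield binding


TRIPLES = [
    ("b4x:martijn", "b4x:has_favorite_beer", "b4x:karmeliet"),
    ("b4x:geert", "b4x:has_favorite_beer", "b4x:karmeliet"),  # hypothetical
    ("b4x:kevin", "b4x:has_favorite_beer", "b4x:stella"),
]

pattern = ("?person", "b4x:has_favorite_beer", "b4x:karmeliet")
for binding in match(pattern, TRIPLES):
    print(binding["?person"])
```

The pattern `("?subject", "?predicate", "?object")` matches every triple, which is why the first example query returns the whole dataset.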

Page 49: Bio ontologies and semantic technologies

?

Page 50: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Limit result size:

SELECT ?subject ?predicate ?object WHERE {
    ?subject ?predicate ?object .
} LIMIT 10

Page 51: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Find all classes:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
}

(This will only retrieve classes that have a label)

Page 52: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Find all classes:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    OPTIONAL {
        ?class rdfs:label ?label .
    }
}
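
The difference between the two "find all classes" queries is the difference between an inner join and a left join. The Python sketch below mimics that with plain lists and dicts, and is only an analogy for how OPTIONAL behaves. The class names and the label are invented for illustration.

```python
# Hypothetical data: two classes, only one of which has an rdfs:label.
CLASSES = ["ex:Duck", "ex:Goose"]
LABELS = {"ex:Duck": "duck"}


def classes_with_required_label(classes, labels):
    """Like the query without OPTIONAL: unlabeled classes drop out."""
    return [(c, labels[c]) for c in classes if c in labels]


def classes_with_optional_label(classes, labels):
    """Like the query with OPTIONAL: every class is kept, the label may be unbound."""
    return [(c, labels.get(c)) for c in classes]


print(classes_with_required_label(CLASSES, LABELS))
print(classes_with_optional_label(CLASSES, LABELS))
```

In the second result the unlabeled class appears with `None`, which corresponds to an unbound `?label` in the SPARQL result table.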

Page 53: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Find all classes that contain "duck" in the label:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
    FILTER( CONTAINS( str(?label), "duck" ) )
}

Page 54: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Make it case insensitive:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
    FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
}

Page 55: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Search in specific graph:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label FROM <http://example.org/animals> WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
    FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
}

Page 56: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Search in specific graph:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    GRAPH <http://example.org/animals> {
        ?class a rdfs:Class .
        ?class rdfs:label ?label .
        FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
    }
}

Page 57: Bio ontologies and semantic technologies

SPARQL Query Syntax

• Can also search for graphs:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?g WHERE {
    GRAPH ?g {
        ?class a rdfs:Class .
        ?class rdfs:label ?label .
        FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
    }
}

Page 58: Bio ontologies and semantic technologies

Summary: Querying RDF data

[Diagram labels: RDF Data; RDFS/OWL; Inference Engine; RDF Data + Inferred; SPARQL Endpoint]

Page 59: Bio ontologies and semantic technologies

Take home Summary

• Basic data element = a Triple
  – A mini sentence
  – Contains three Terms:
  – Subject Predicate Object

• Example:

<http://xmpl/entities#martijn>
    <http://xmpl/relations#has_favorite_beer>
    <http://xmpl/entities#karmeliet> .

Page 60: Bio ontologies and semantic technologies

• Combine triples to represent knowledge

Page 61: Bio ontologies and semantic technologies

• Use terms from ONTOLOGIES
  – COMMON VOCABULARIES
  – POSSIBLE TO INFER MEANING

• OMIABIS
• OBIB
• SNOMED/ICD
• MESH

Page 62: Bio ontologies and semantic technologies

?

• SPARQL searches for patterns

Page 63: Bio ontologies and semantic technologies

?

Page 64: Bio ontologies and semantic technologies

Interoperability between OBO and Semantic Technologies

• Originated from two separate academic worlds
• Computing applications of OBO: mainly consistency checking and overrepresentation analysis

• Semantic Technologies: much broader toolset

• Interoperability?
  – Direct offering in both formats
  – Automated mapping

Page 65: Bio ontologies and semantic technologies

Where to find ontologies

• OBO Foundry
• BioPortal (NCBO)
• BioGateway
• Bio2RDF

Page 66: Bio ontologies and semantic technologies

Where to find RDF data

• Google for SPARQL endpoint
  – e.g. EBI databases

• Non-biological: DBpedia

Page 67: Bio ontologies and semantic technologies

How about Tim Berners-Lee's vision?

• We're not there yet, but for bio data we're getting quite close
  – The explicitome
  – Crowdsourcing
  – Nanopublications

Page 68: Bio ontologies and semantic technologies

SPARQL in PRACTICE

Page 75: Bio ontologies and semantic technologies

SPARQL: Recap

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label FROM <http://graphName> WHERE {
    ?x rdfs:label ?label .
    FILTER ( CONTAINS(?label, "dimethylalinine") )
} ORDER BY ?label LIMIT 10

• FIND the pattern ?x rdfs:label ?label .
• BIND variables ?label, ?x
• RETRIEVE variable ?label
• PREFIX: expand rdfs:label to <http://www.w3.org/2000/01/rdf-schema#label>
• FILTER results to labels containing "dimethylalinine"
• LIMIT results to first 10 matches ordered by label (ORDER BY must come before LIMIT)

Page 76: Bio ontologies and semantic technologies

SPARQL: Recap

DESCRIBE <http://rdf.wikipathways.org/Pathway/WP1425_r74390/WP/Interaction/e077e>

• Useful short query to get direct links from/to a given node

Page 77: Bio ontologies and semantic technologies

SPARQL REFERENCE

http://www.w3.org/TR/sparql11-overview/

Page 78: Bio ontologies and semantic technologies

Running SPARQL

• From a web interface

Page 79: Bio ontologies and semantic technologies

Running SPARQL

• From a web interface
• Using HTTP
  – HTTP GET
  – HTTP POST: for larger query strings
  – Headers determine response type (JSON, XML, HTML)

http://…/sparql?default-graph-uri=<http://graphName>&query=URLENCODEDQUERYSTRING
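
Building the GET URL by hand is error-prone, so the URL-encoding step is worth seeing once. The Python sketch below uses only the standard library and makes no network call. The endpoint `http://example.org/sparql` and the function name `build_sparql_get_url` are placeholders, so substitute the endpoint you actually want to query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sparql_get_url(endpoint, query, default_graph=None):
    """Return an HTTP GET URL carrying the URL-encoded SPARQL query."""
    params = {"query": query}
    if default_graph is not None:
        params["default-graph-uri"] = default_graph
    return endpoint + "?" + urlencode(params)


QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

# Placeholder endpoint and graph name, for illustration only.
url = build_sparql_get_url("http://example.org/sparql", QUERY, "http://graphName")
print(url)

# Decoding the URL again should give back the original query text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

To pick the response type, a client would normally also send an `Accept` header such as `application/sparql-results+json`. Long queries are better sent with HTTP POST.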

Page 80: Bio ontologies and semantic technologies

BIO-ONTOLOGIES

Page 81: Bio ontologies and semantic technologies

BioPortal

Page 82: Bio ontologies and semantic technologies

Access

• From the web interface!
• SPARQL endpoint: using API key; on request
• Running a local copy: download VM image; on request

Page 83: Bio ontologies and semantic technologies

Exercises

• Find a term
• Find ontologies containing a term
• Browse some ontologies
• Check the NCBO annotator!

Page 84: Bio ontologies and semantic technologies

BIO-DATA

Page 85: Bio ontologies and semantic technologies

EBI RDF Resources

Page 86: Bio ontologies and semantic technologies

EBI RDF Resources

Page 87: Bio ontologies and semantic technologies

Ensembl

Page 88: Bio ontologies and semantic technologies

Exercise

• From UniProt, find proteins that are annotated with a given Gene Ontology term

Page 89: Bio ontologies and semantic technologies

PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
SELECT * WHERE {
    ?protein up:classifiedWith obo:GO_0004499 .
    ?protein up:organism taxon:9606 .
}

http://sparql.uniprot.org

Page 90: Bio ontologies and semantic technologies

Exercise

• From Expression Atlas, find proteins that are differentially expressed (P < 1e-12) in Crohn's disease

Page 91: Bio ontologies and semantic technologies

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX biopax3: <http://www.biopax.org/release/biopax-level3.owl#>
SELECT DISTINCT ?protein ?expressionValue ?pvalue WHERE {
    ?factor rdf:type efo:EFO_0000384 .
    ?value atlasterms:hasFactorValue ?factor .
    ?value atlasterms:isMeasurementOf ?probe .
    ?value atlasterms:pValue ?pvalue .
    ?value rdfs:label ?expressionValue .
    ?probe atlasterms:dbXref ?protein .
    FILTER ( ?pvalue < 1e-12 )
    FILTER ( strstarts(str(?protein), "http://purl.uniprot.org/uniprot/") )
}
ORDER BY ASC(?pvalue)

https://www.ebi.ac.uk/rdf/services/atlas/sparql

Page 92: Bio ontologies and semantic technologies

WikiPathways

• Links pathways with genes, terms from Pathway, Cell line and Disease ontology, PubMed references
• Models individual Interactions
• Can be downloaded as RDF
• Has an experimental SPARQL endpoint

Page 93: Bio ontologies and semantic technologies

Exercise

• Define a query to find pathways linked to the TNFalpha gene

Page 94: Bio ontologies and semantic technologies

PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?PathwayName WHERE {
    ?geneProduct a wp:GeneProduct .
    ?geneProduct dc:identifier ?GeneID .
    ?geneProduct dcterms:isPartOf ?pathway .
    ?geneProduct rdfs:label ?geneName .
    ?pathway dc:identifier ?pathwayid .
    ?pathway dc:title ?PathwayName .
    FILTER( str(?geneName) = "TNFalpha" )
}

http://sparql.wikipathways.org

Page 95: Bio ontologies and semantic technologies
Page 96: Bio ontologies and semantic technologies
Page 97: Bio ontologies and semantic technologies

Exercise

• Try this, or another query
  – Using web interface
  – Using HTTP GET
    • Define a simple DESCRIBE
    • Use a web tool to URL-encode the query
    • Submit query as a URL parameter

Page 98: Bio ontologies and semantic technologies

DisGeNet

Page 99: Bio ontologies and semantic technologies

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT DISTINCT ?gene WHERE {
    ?gda sio:SIO_000628 ?gene, ?disease .
    ?gene a ncit:C16612 .
    ?gene skos:exactMatch ?GeneID .
    ?disease a ncit:C7057 .
    ?disease dcterms:title ?DiseaseName .
    ?gda sio:SIO_000216 ?scoreIRI .
    ?scoreIRI sio:SIO_000300 ?score .
    FILTER ( ?score > "0.35"^^xsd:decimal )
    FILTER ( contains(str(?DiseaseName), "Crohn") )
}

http://rdf.disgenet.org/lodestar

Page 100: Bio ontologies and semantic technologies
Page 101: Bio ontologies and semantic technologies

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

http://rdf.disgenet.org/lodestar

Page 102: Bio ontologies and semantic technologies

SELECT DISTINCT ?PathwayName WHERE {
    ?gda sio:SIO_000628 ?gene, ?disease .
    ?gene a ncit:C16612 .
    ?disease a ncit:C7057 .
    ?disease dcterms:title ?DiseaseName .
    ?gda sio:SIO_000216 ?scoreIRI .
    ?scoreIRI sio:SIO_000300 ?score .
    FILTER ( ?score > "0.35"^^xsd:decimal )
    FILTER ( contains(str(?DiseaseName), "Crohn") )
    SERVICE <http://sparql.wikipathways.org/> {
        ?geneProduct a wp:GeneProduct .
        ?geneProduct dc:identifier ?gene .
        ?geneProduct dcterms:isPartOf ?pathway .
        ?pathway dc:identifier ?pathwayid .
        ?pathway dc:title ?PathwayName .
    }
}

http://rdf.disgenet.org/lodestar/sparql

Page 103: Bio ontologies and semantic technologies