Semantics 101 for Pharma - PHUSE Wiki · 2016-10-17
Barcelona
Annual Conference
Monday, 10th October 2016
Semantics 101 for Pharma
Tim Williams,
UCB Biosciences Inc., USA
tim.williams@ucb.com
Marc Andersen
StatGroup ApS, Denmark
mja@statgroup.dk
Related PhUSE 2016 Presentations
• Interactive Visualization of Linked Data
  Monday, 14:30, Data Visualization
• Generating Analysis Results and Metadata
  Monday, 16:00, Trends and Technology
• Constructing Interoperable Study Documents From A Semantic Technology-based Repository
  Poster
• CS Discussion Club
  Tuesday, 11:00 – 12:30
Thank you, and enjoy the conference!
Learning Resources
• PhUSE Wiki “Semantic Technology Working Groups”
  http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology
• PhUSE Wiki “Semantic Technology Curriculum”
  http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology_Curriculum
• White papers, publications, presentations.
• “Learning SPARQL” by Bob DuCharme (examples for download)
  http://www.learningsparql.com/index.html
• Semantic University by Cambridge Semantics
  http://www.cambridgesemantics.com/semantic-university
• RDF Primer
  http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/
• CDISC Standards in RDF User Guide v1 Final
  http://www.cdisc.org/system/files/members/standard/RDF/CDISC%20Standards%20RDF%20User%20Guide%201.0%20Final%202015-07-21.pdf
• Knowledge Engineering with Semantic Web Technologies 2015
  https://open.hpi.de/courses/semanticweb2015
Exercises
Due to time constraints and the large number of attendees, we were unable to provide hands-on experience during the session. This section provides exercises and a link to materials so you may try creating and querying Linked Data on your own.
To obtain files for the exercises, go to: http://www.phusewiki.org/wiki/index.php?title=Semantic_Technology_Curriculum
Download the file: PhUSECSS-Semantics101-AttendeeFiles.zip
Introduction to Jena Fuseki
• Apache-Jena – contains the APIs, SPARQL engine, the TDB native RDF database and command line tools ARQ, RIOT …
• Apache-Jena-Fuseki – the Jena SPARQL server
Load a File into Fuseki
• File: ex001.ttl
@prefix css: <http://www.example.org/CSS/> .
@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ct:NCT00799760 css:title "Evaluation of Efficacity…"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:int .
Instructions sent to attendees/available on wiki
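The upload step can also be pictured as a plain HTTP request. The following is a minimal sketch, assuming a local Fuseki server with a dataset named test (the name that appears in the endpoint URL used later in this deck) and Fuseki's default /data graph store service. The function name build_upload_request is hypothetical, and the request is only built here, not sent.

```python
import urllib.request

# Contents of ex001.ttl as shown on the slide (title left elided as in the source).
TURTLE = '''@prefix css: <http://www.example.org/CSS/> .
@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ct:NCT00799760 css:title "Evaluation of Efficacity…"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:int .
'''


def build_upload_request(turtle_text, dataset_url="http://localhost:3030/test"):
    """Build (but do not send) a POST that adds triples to the default graph.

    The dataset URL and the "/data" service name are assumptions about the
    local Fuseki configuration.
    """
    return urllib.request.Request(
        url=dataset_url + "/data?default",
        data=turtle_text.encode("utf-8"),
        headers={"Content-Type": "text/turtle; charset=utf-8"},
        method="POST",
    )


request = build_upload_request(TURTLE)
print(request.get_method(), request.full_url)

# To send it, a Fuseki server must be running:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Using POST adds the triples to whatever is already in the default graph; PUT would replace the graph contents instead.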
Query #1: Getting Started
File: ex002.rq
PREFIX css: <http://www.example.org/CSS/>
SELECT *
WHERE {
  ?s ?p ?o .
} LIMIT 10
See Exercises
Query #2: Graph Pattern for Title
PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?nctid ?title
WHERE {
  ?nctid css:title ?title .
}
        S                 P            O
Data    ct:NCT00799760    css:title    "Evaluation of Efficacity and Safety…"@en
Query   ?nctid            css:title    ?title
Query for Study Title
File: ex003.rq
PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?nctid ?title
WHERE {
  ?nctid css:title ?title .
}
See Exercises
Upload another file
File: ex004.TTL
@prefix css: <http://www.example.org/CSS/> .
@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ct:NCT00799760 css:title "Evaluation of Efficacity …"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:integer ;
    css:primOutcome css:outcome1 .
css:outcome1 rdf:type ct:primary-outcome ;
    ct:measure "RT-PCR for influenza A virus…"@en ;
    ct:time-frame "2 days" .
See Exercises
Query for Primary Outcome
Data
ct:NCT00799760 css:title "Evaluation of Efficacity …"@en ;
    css:phase "Phase 3"@en ;
    css:enrollment "541"^^xsd:integer ;
    css:primOutcome css:outcome1 .
css:outcome1 rdf:type ct:primary-outcome ;
    ct:measure "RT-PCR for influenza A virus…"@en ;
    ct:time-frame "2 days" .
Graph Query
ct:NCT00799760 css:primOutcome ?outURI .
?outURI ct:measure ?outcome .
SPARQL Query
PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?outcome
WHERE
{
  ct:NCT00799760 css:primOutcome ?outURI .
  ?outURI ct:measure ?outcome .
}
Retrieve data that matches the Graph Pattern
NCTID --primOutcome--> ?outURI --measure--> ?outcome
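To make "retrieve data that matches the graph pattern" concrete, here is a toy matcher written under simplifying assumptions: triples are plain string tuples, language tags and datatypes are dropped, and the function names match_pattern and run_query are hypothetical. It is a sketch of the idea, not how Jena evaluates SPARQL.

```python
# A subset of the triples in ex004.ttl, written as (subject, predicate, object).
DATA = [
    ("ct:NCT00799760", "css:phase", "Phase 3"),
    ("ct:NCT00799760", "css:enrollment", "541"),
    ("ct:NCT00799760", "css:primOutcome", "css:outcome1"),
    ("css:outcome1", "rdf:type", "ct:primary-outcome"),
    ("css:outcome1", "ct:measure", "RT-PCR for influenza A virus…"),
    ("css:outcome1", "ct:time-frame", "2 days"),
]


def match_pattern(pattern, triple, bindings):
    """Match one triple pattern against one triple.

    Returns the bindings extended with any newly bound variables,
    or None if the triple does not fit the pattern.
    """
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant: must be identical to the value in the data.
            return None
    return result


def run_query(patterns, data):
    """Evaluate the patterns in order, carrying variable bindings forward."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in data:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# The two triple patterns from the WHERE clause of the query.
PATTERNS = [
    ("ct:NCT00799760", "css:primOutcome", "?outURI"),
    ("?outURI", "ct:measure", "?outcome"),
]

results = run_query(PATTERNS, DATA)
outcomes = [solution["?outcome"] for solution in results]
print(outcomes)
```

The shared variable ?outURI is what joins the two patterns: the second pattern can only match triples whose subject is the value bound by the first.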
Query for Study Outcome
File: ex005.rq
PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?outcome
WHERE {
  ct:NCT00799760 css:primOutcome ?outURI .
  ?outURI ct:measure ?outcome .
}
See Exercises
Trial Triples with SPARQL
http://lod.openlinksw.com/sparql
DESCRIBE <http://bio2rdf.org/clinicaltrials:NCT00799760>
ns1:NCT00799760 rdf:type ns2:Resource ,
    ns2:Clinical-Study .
ns1:NCT00799760 ns3:title "Evaluation of Efficacity and Safety of Oseltamivir and Zanamivir"@en .
ns2:actual-enrollment 541 ;
…AND MUCH MORE….
Query with R
R Packages:
• rrdf
• rrdflibs
http://github.com/egonw/rrdf
Requires Java 7 or higher
Willighagen E. (2014) Accessing biological data in R with semantic web technologies. PeerJ PrePrints 2:e185v3
See https://dx.doi.org/10.7287/peerj.preprints.185v3
File: queryLocalTTL.R
library(rrdf)
# Load the local Turtle file into an in-memory RDF store
dataSource = load.rdf("<path to the TTL file>/ex004.ttl",
                      format="N3")
query = 'PREFIX css: <http://www.example.org/CSS/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials/>
SELECT ?primaryOutcome
WHERE
{
  ct:NCT00799760 css:primOutcome ?outURI .
  ?outURI ct:measure ?primaryOutcome .
}'
# Run the query against the loaded data and convert the result to a data frame
queryResult = as.data.frame(sparql.rdf(dataSource, query))
queryResult
See Exercises
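Query an Endpoint with R
File: queryLocalFuseki.R
library(rrdf)
# Query endpoint of the local Fuseki server (dataset name: test)
endpoint = "http://localhost:3030/test/query"
query = "SELECT * WHERE {?s ?p ?o . } LIMIT 10 "
# Send the query to the endpoint and collect the result
queryResult = sparql.remote(endpoint, query)
queryResult
See Exercises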
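For readers who want to see what querying an endpoint looks like at the HTTP level, the sketch below builds a SPARQL 1.1 Protocol GET request for the same endpoint and query. It assumes the local Fuseki endpoint from the slide; the function name build_query_request and the choice of JSON results in the Accept header are illustrative assumptions, and this is not a description of how rrdf works internally. The request is built but not sent.

```python
import urllib.parse
import urllib.request

QUERY = "SELECT * WHERE {?s ?p ?o . } LIMIT 10"


def build_query_request(query, endpoint="http://localhost:3030/test/query"):
    """Build (but do not send) a SPARQL Protocol query using HTTP GET.

    The query text travels URL-encoded in the "query" parameter.
    """
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )


request = build_query_request(QUERY)
print(request.full_url)

# To send it, the endpoint must be running:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```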
Query with SAS
SAS Macros:
%sparqlquery - SPARQL query
%sparqlupdate - SPARQL update
https://github.com/MarcJAndersen/SAS-SPARQLwrapper
Implementation:
• SAS PROC HTTP to access the service
• Send query/update as text file
• Input result using SAS LIBNAME for XML
Other approaches:
• PROC groovy to execute Java Code from Apache Jena
• SAS Java objects to interface to Apache Jena
Requires a running SPARQL service, for example Apache Jena
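The "input result as XML" step relies on the SPARQL Query Results XML Format, in which each result row is a set of named bindings. As a rough illustration of that structure (shown in Python rather than SAS), the sketch below turns a result document into a list of rows. The SAMPLE document is hand-written for illustration, not captured from a real service, and parse_results is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Hand-written example in the SPARQL Query Results XML Format.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="outcome"/></head>
  <results>
    <result>
      <binding name="outcome">
        <literal xml:lang="en">RT-PCR for influenza A virus…</literal>
      </binding>
    </result>
  </results>
</sparql>"""

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def parse_results(xml_text):
    """Return one dict per result row, mapping variable name to value text."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            # Each binding has a single child: <uri>, <literal> or <bnode>.
            value = binding[0]
            row[binding.get("name")] = value.text
        rows.append(row)
    return rows


rows = parse_results(SAMPLE)
print(rows)
```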
File: queryLocalFuseki.sas
Assumptions:
• Service active at endpoint
• TTL file uploaded to store
Query a Remote Source
At: http://lod.openlinksw.com/sparql
Create RDF using R
• R with rrdf, rrdflibs
  https://github.com/egonw/rrdf
• R data frame to RDF
  – Excel -> data frame -> RDF
  – SAS dataset -> data frame -> RDF
Create RDF using R
Packages: rrdf, rrdflibs
• add.triple()
  – Add a triple: object is a URI
• add.data.triple()
  – Add a triple: object is a literal
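The distinction between the two functions is the kind of object the triple ends in. The sketch below shows how that difference appears in serialized N-Triples. The helper names uri_triple and literal_triple are hypothetical and are not the rrdf API; escaping is limited to backslashes and double quotes, so this is an illustration rather than a complete serializer.

```python
CSS = "http://www.example.org/CSS/"
CT = "http://bio2rdf.org/clinicaltrials/"


def uri_triple(subject, predicate, obj):
    """Object is a resource: written in angle brackets like subject and predicate."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def literal_triple(subject, predicate, value, lang=None):
    """Object is a literal: written as a quoted string, optionally language-tagged."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"@{lang}" if lang else ""
    return f'<{subject}> <{predicate}> "{escaped}"{suffix} .'


lines = [
    # The study points to another resource, its primary outcome.
    uri_triple(CT + "NCT00799760", CSS + "primOutcome", CSS + "outcome1"),
    # The study phase is a plain value, so it is stored as a literal.
    literal_triple(CT + "NCT00799760", CSS + "phase", "Phase 3", lang="en"),
]
print("\n".join(lines))
```

A URI object can itself be the subject of further triples (as css:outcome1 is in ex004.ttl); a literal cannot.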
Create RDF using R
Try or follow along
File: createTTLFromR.R
Output File: createTTLFromR.TTL
Create RDF using SAS
• SAS accessing SPARQL service using PROC HTTP
  – All functions provided by the service, see SPARQL 1.1 Protocol (https://www.w3.org/TR/sparql11-protocol/)
  – Implemented as SAS macros
    https://github.com/MarcJAndersen/SAS-SPARQLwrapper
• SAS generating text files with
  – RDF in Turtle
  – SPARQL INSERT statements
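Generating a SPARQL INSERT statement as text is mostly string assembly over the rows of a dataset. The sketch below shows the idea in Python rather than SAS, under stated assumptions: ROWS is a hypothetical stand-in for a dataset (its values are taken from the example study in this deck), rows_to_insert is a hypothetical name, and values are assumed not to need escaping.

```python
# Hypothetical stand-in for a dataset with one row per study.
ROWS = [
    {"nctid": "NCT00799760", "phase": "Phase 3", "enrollment": 541},
]


def rows_to_insert(rows):
    """Assemble a SPARQL 1.1 Update INSERT DATA statement from the rows."""
    lines = [
        "PREFIX css: <http://www.example.org/CSS/>",
        "PREFIX ct: <http://bio2rdf.org/clinicaltrials/>",
        "INSERT DATA {",
    ]
    for row in rows:
        # One subject per row; a bare number is read as an xsd:integer.
        lines.append(f'  ct:{row["nctid"]} css:phase "{row["phase"]}"@en ;')
        lines.append(f'    css:enrollment {row["enrollment"]} .')
    lines.append("}")
    return "\n".join(lines)


statement = rows_to_insert(ROWS)
print(statement)
```

The resulting text can be saved to a file and sent to the update service of a SPARQL endpoint.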
Create RDF using SAS
File: createTTLFromSAS.SAS
Output File: createTTLFromSAS.TTL
Try or follow along
Validate
• Apache Jena RIOT (RDF I/O Technology)
riot --validate CreateTTLFromEditor.TTL
Example errors
1. Forgot PAV prefix
08:45:44 ERROR riot :: [line: 9, col: 16] Undefined prefix: pav
2. Incorrect triples termination
08:45:44 ERROR riot :: [line: 9, col: 32] Unexpected IRI for predicate…
* note: requires Apache Jena in the system path
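The first example error (a prefix used but never declared) is simple enough to illustrate with a rough check. The sketch below is a regex-based approximation, not a substitute for riot: it ignores comments and long (triple-quoted) strings, the function name undefined_prefixes is hypothetical, and the SAMPLE text, including its pav:version triple, is made up for illustration.

```python
import re

# Made-up Turtle that uses the pav prefix without declaring it.
SAMPLE = '''@prefix css: <http://www.example.org/CSS/> .
@prefix ct: <http://bio2rdf.org/clinicaltrials/> .
ct:NCT00799760 css:phase "Phase 3"@en ;
    pav:version "1.0" .
'''


def undefined_prefixes(turtle_text):
    """Return prefixes that are used in prefixed names but never declared."""
    declared = set(re.findall(r"@prefix\s+(\w*):", turtle_text))
    # Remove IRIs and string literals so colons inside them are not counted.
    stripped = re.sub(r"<[^>]*>", " ", turtle_text)
    stripped = re.sub(r'"(?:[^"\\]|\\.)*"', " ", stripped)
    # Remove the declarations themselves, leaving only the triples.
    stripped = re.sub(r"@prefix\s+\w*:", " ", stripped)
    used = set(re.findall(r"(?<![\w:])(\w+):\w", stripped))
    return sorted(used - declared)


missing = undefined_prefixes(SAMPLE)
print(missing)
```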