A PRACTICAL INTRODUCTION TO SADI SEMANTIC WEB SERVICES AND HYDRA QUERY TOOL Alexandre Riazanov, CTO IPSNP Computing Inc Oslo University, Sep 23, 2015
Apr 12, 2017

Page 1: A practical introduction to SADI semantic Web services and HYDRA query tool

A PRACTICAL INTRODUCTION TO

SADI SEMANTIC WEB SERVICES

AND HYDRA QUERY TOOL

Alexandre Riazanov, CTO, IPSNP Computing Inc

Oslo University, Sep 23, 2015

Page 2: A practical introduction to SADI semantic Web services and HYDRA query tool

PLAN OF THE TALK

• A brief reminder of the previous episode: data federation with SADI and HYDRA.

• RDF and OWL as syntactic foundations of service I/O and functionality descriptions.

• Query execution with automatic service discovery and reasoning.

• Resource publishing process with SADI, with a detailed practical example (time permitting).

Page 3: A practical introduction to SADI semantic Web services and HYDRA query tool

DATA FEDERATION: QUERYING MULTIPLE HETEROGENEOUS SOURCES AS A SINGLE DB

Page 4: A practical introduction to SADI semantic Web services and HYDRA query tool

QUERY EXAMPLES

• Find the names of drugs that contain chemical category Y as active ingredients.

• Find documents mentioning enzyme activity X, extract info on protein mutations and visualize mutations on 3D structure.

• Annotate a DNA sequence X with molecular functions of proteins produced by the corresponding gene.

• Find patients with precondition X diagnosed with infections Y resulting from procedure Z.

• Find patients diagnosed with X while taking drug C.

Page 5: A practical introduction to SADI semantic Web services and HYDRA query tool

HOW WE DO IT WITH HYDRA AND SADI SEMANTIC WEB SERVICES

Page 6: A practical introduction to SADI semantic Web services and HYDRA query tool

A HIGH LEVEL VIEW OF THE HYDRA APPROACH

● Given a SPARQL query, HYDRA analyses it by using an intelligent logic-based algorithm (proprietary, unlike SADI itself).

● HYDRA requests descriptions of potentially useful services from available SADI service registries.

● HYDRA processes the descriptions and figures out which services have to be invoked, on what data and in what order.

SPARQL is a W3C standard semantic query language -- much more intuitive than SQL.

Page 7: A practical introduction to SADI semantic Web services and HYDRA query tool

HOW IS THIS ALL POSSIBLE?

• Key ingredient: the SADI framework for Semantic Web services (Semantic Automated Discovery and Integration).

• SADI services are:
  – RESTful services
  – consuming and producing one format -- RDF,
  – with semantic descriptions (in OWL) fully defining their functionality.

Page 8: A practical introduction to SADI semantic Web services and HYDRA query tool

DIGRESSION: RDF

• W3C RDF = Resource Description Framework

• Standardised graph-based data model and a few standard rendering formats.

• Nodes = objects (URIs) and data values like “abc”^^xsd:string or “123”^^xsd:integer.

• Edges: binary relations.

Page 9: A practical introduction to SADI semantic Web services and HYDRA query tool

RDF EXAMPLES

@prefix mt: <http://localhost:8080/medical_terminology.owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.com/patient#1234> rdf:type mt:Patient .
<http://example.com/patient#1234> mt:has_mass _:hm .
_:hm rdf:type mt:Measurement .
_:hm mt:has_value "92.0"^^xsd:float .
_:hm mt:has_units mt:kg .

@prefix mt: <http://localhost:8080/medical_terminology.owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.com/patient#1234> a mt:Person ;
  mt:has_mass [a mt:Measurement;
               mt:has_value "92.0"^^xsd:float;
               mt:has_units mt:kg] .

The original XML-based rendering format is also popular.
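
A minimal Python sketch of the data model above, using only the standard library: the graph is held as a plain set of (subject, predicate, object) tuples, and the blank node _:hm is just a local label. The helper name objects and the tuple encoding of typed literals are assumptions made for illustration, not part of any RDF library.

```python
# Sketch only: an RDF graph modelled as a set of (subject, predicate, object) tuples.
MT = "http://localhost:8080/medical_terminology.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"
PATIENT = "http://example.com/patient#1234"

graph = {
    (PATIENT, RDF_TYPE, MT + "Patient"),
    (PATIENT, MT + "has_mass", "_:hm"),
    ("_:hm", RDF_TYPE, MT + "Measurement"),
    # A typed literal is encoded here as (lexical form, datatype URI).
    ("_:hm", MT + "has_value", ("92.0", XSD_FLOAT)),
    ("_:hm", MT + "has_units", MT + "kg"),
}


def objects(g, subject, predicate):
    """Return all o such that (subject, predicate, o) is an edge of g."""
    return [o for (s, p, o) in g if s == subject and p == predicate]


# Follow two edges: patient -> measurement node -> value literal.
mass_node = objects(graph, PATIENT, MT + "has_mass")[0]
lexical, datatype = objects(graph, mass_node, MT + "has_value")[0]
mass = float(lexical)
```

Both renderings on this slide describe the same kind of edge set; the bracketed form simply avoids naming the blank node.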

Page 10: A practical introduction to SADI semantic Web services and HYDRA query tool

DIGRESSION: OWL

• W3C OWL = Web Ontology Language

• Essentially, extends RDF with definitions and other axioms for classes (types of objects) and properties (binary relations).

• Most useful axiom types -- class and property hierarchies:
  Patient subClassOf Person
  loves subPropertyOf knows

• SADI reuses property restriction syntax:
  has_MRN exactly 1 string

Page 11: A practical introduction to SADI semantic Web services and HYDRA query tool

SADI SERVICE I/O

• Input: RDF description of an input object.

• Output: another RDF graph providing more (computed or retrieved) info about the input object or linking it to other objects.

• Since all SADI services “talk the same language” (RDF), they are 100% syntactically interoperable:
  – the output of one SADI service can be directly consumed by any other SADI service.

Page 12: A practical introduction to SADI semantic Web services and HYDRA query tool

COMPLETE SEMANTIC DESCRIPTIONS OF SERVICE FUNCTIONALITY

SADI services publish semantic descriptions of their I/O that completely define what the service expects and can accept as input, and what RDF assertions the service can output.

• Unique and extremely powerful property: it facilitates completely automatic discovery and orchestration of services.

Page 13: A practical introduction to SADI semantic Web services and HYDRA query tool

Example: computeBMI service I/O

Page 14: A practical introduction to SADI semantic Web services and HYDRA query tool

SEMANTIC FUNCTIONALITY DESCRIPTION

• OWL syntax is repurposed to define what RDF graphs are acceptable as input, and what RDF graphs may be produced in the output.

• Input(computeBMI) = Person and (has_height exactly 1 (Measurement and (has_value exactly 1 float)))

• Output(computeBMI) = has_BMI exactly 1 float

Page 15: A practical introduction to SADI semantic Web services and HYDRA query tool

SERVICE INPUT CLASS

• Specifies what kind of objects (RDF descriptions) the service expects in the input. OWL syntax is convenient for such definitions.

• Almost always just an enumeration of attributes of the input objects the SADI service expects.

● If the input class is defined as

Person and (has_height exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {m}))) and (has_mass exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {kg})))

… the service expects something like this in the input:

patient1234 a Person;
  has_height [a Measurement; has_value "1.7"^^xsd:float; has_units m];
  has_mass [a Measurement; has_value "92.0"^^xsd:float; has_units kg]
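
As a rough illustration of what "satisfying the input class" means, the sketch below checks the restrictions of the computeBMI input class against a small graph of tuples with shortened names. It treats "exactly 1" as a closed-world count over the supplied graph, which is a simplification of OWL semantics, and the function names are hypothetical.

```python
# Sketch only: closed-world check of the computeBMI input class.
def objects(g, s, p):
    """All objects attached to subject s by predicate p."""
    return [o for (s2, p2, o) in g if s2 == s and p2 == p]


def is_measurement(g, node, unit):
    """Measurement and (has_value exactly 1 float) and (has_units exactly 1 {unit})."""
    values = objects(g, node, "has_value")
    units = objects(g, node, "has_units")
    return (
        "Measurement" in objects(g, node, "a")
        and len(values) == 1
        and isinstance(values[0], float)
        and units == [unit]
    )


def satisfies_compute_bmi_input(g, node):
    """Person with exactly one height in m and exactly one mass in kg."""
    heights = objects(g, node, "has_height")
    masses = objects(g, node, "has_mass")
    return (
        "Person" in objects(g, node, "a")
        and len(heights) == 1
        and is_measurement(g, heights[0], "m")
        and len(masses) == 1
        and is_measurement(g, masses[0], "kg")
    )


g = {
    ("patient1234", "a", "Person"),
    ("patient1234", "has_height", "_:h"),
    ("_:h", "a", "Measurement"),
    ("_:h", "has_value", 1.7),
    ("_:h", "has_units", "m"),
    ("patient1234", "has_mass", "_:m"),
    ("_:m", "a", "Measurement"),
    ("_:m", "has_value", 92.0),
    ("_:m", "has_units", "kg"),
}
ok = satisfies_compute_bmi_input(g, "patient1234")
```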

Page 16: A practical introduction to SADI semantic Web services and HYDRA query tool

SERVICE OUTPUT CLASS

• A SADI service advertises itself by publishing its output class specifying what the service promises to produce as the output.

• The class must enumerate attributes that the service will add to the input object. This fully semantically defines what the service does!

● If the output class is defined as

has_BMI exactly 1 float

… service clients can expect something like this in the output: patient1234 has_BMI “31.83”^^xsd:float

Page 17: A practical introduction to SADI semantic Web services and HYDRA query tool

DIGRESSION: SPARQL

• W3C SPARQL - standard query language for the RDF data model.

• SPARQL clients are programs that execute SPARQL queries, typically on RDF triplestores.

PREFIX mt: <http://localhost:8080/medical_terminology.owl#>
SELECT ?mass {
  <http://example.com/patient#1234> a mt:Person ;
    mt:has_mass [a mt:Measurement; mt:has_value ?mass; mt:has_units mt:kg] .
}

• HYDRA is also a SPARQL client, but for virtual RDF DBs.
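
To make the idea of a SPARQL client concrete, here is a toy matcher for the graph pattern in the query above. It is a sketch under simplifying assumptions (shortened names, the bracketed blank node replaced by a variable ?m, no FILTER or OPTIONAL) and says nothing about how real triplestores or HYDRA evaluate queries; match and unify are illustrative names.

```python
# Sketch only: naive evaluation of a list of triple patterns over a set of triples.
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def unify(pattern_triple, graph_triple, binding):
    """Extend binding so pattern_triple equals graph_triple, or return None."""
    new = dict(binding)
    for p_term, g_term in zip(pattern_triple, graph_triple):
        if is_var(p_term):
            if new.setdefault(p_term, g_term) != g_term:
                return None  # variable already bound to something else
        elif p_term != g_term:
            return None  # constant mismatch
    return new


def match(patterns, graph, binding=None):
    """Yield every binding under which all patterns occur in graph."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            yield from match(patterns[1:], graph, extended)


graph = {
    ("patient1234", "a", "Person"),
    ("patient1234", "has_mass", "_:hm"),
    ("_:hm", "a", "Measurement"),
    ("_:hm", "has_value", 92.0),
    ("_:hm", "has_units", "kg"),
}
query = [
    ("patient1234", "a", "Person"),
    ("patient1234", "has_mass", "?m"),
    ("?m", "a", "Measurement"),
    ("?m", "has_value", "?mass"),
    ("?m", "has_units", "kg"),
]
answers = [b["?mass"] for b in match(query, graph)]
```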

Page 18: A practical introduction to SADI semantic Web services and HYDRA query tool

AUTOMATIC SERVICE DISCOVERY

• With the I/O descriptions, a sufficiently intelligent client can figure out that it can call the service if the client has to satisfy a query condition like this:

patient1234 has_BMI ?bmi_value

• The query condition suggests that a service with has_BMI in the output may be useful if called on the object patient1234

• To make the call, the client must have enough information about patient1234 : according to the input class, has_height and has_mass must be attached to it and sent to the service.
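
The two checks described above (does some service promise the wanted property, and do we know enough about the object to call it) can be sketched as follows. The registry layout with "requires" and "provides" sets is a hypothetical simplification of the OWL input and output classes, not the actual SADI registry format.

```python
# Sketch only: service descriptions reduced to sets of property names.
REGISTRY = [
    {
        "name": "computeBMI",
        "requires": {"has_height", "has_mass"},  # from the input class
        "provides": {"has_BMI"},                 # from the output class
    },
]


def discover(registry, wanted_property):
    """Services whose output class mentions the wanted property."""
    return [s for s in registry if wanted_property in s["provides"]]


def can_invoke(service, known_properties):
    """True if every property required by the input class is already known."""
    return service["requires"] <= set(known_properties)


candidates = discover(REGISTRY, "has_BMI")
ready = can_invoke(candidates[0], {"has_height", "has_mass"})
```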

Page 19: A practical introduction to SADI semantic Web services and HYDRA query tool

QUERY, EXECUTION, ANSWERS

Query:
SELECT ?bmi_value
FROM <.......rdf> # seed data
{ patient1234 a Person; has_BMI ?bmi_value }

Execution (HYDRA):

● the seed data in the FROM clause describes the heights and weights of some people, including patient1234, using has_height and has_mass;

● since has_BMI is in the query, HYDRA looks for all services in the available registries that can attach has_BMI and finds computeBMI;

● patient1234 satisfies the input condition of computeBMI, so HYDRA calls it;

● computeBMI returns patient1234 has_BMI "32.3", so HYDRA can return an answer: ?bmi_value = "32.3"

Page 20: A practical introduction to SADI semantic Web services and HYDRA query tool

MULTIPLE SERVICES

• Suppose we don’t know the patient’s height and mass, but can retrieve them from a DB by the patient’s medical record number (MRN).

• We write another SADI service, patientInfo:

Output(patientInfo) = (has_height exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {m}))) and (has_mass exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {kg})))

Input(patientInfo) = Person and (has_MRN exactly 1 string)

Page 21: A practical introduction to SADI semantic Web services and HYDRA query tool

AUTOMATIC SERVICE COMPOSITION

• HYDRA can figure out automatically that the output of patientInfo can be submitted to computeBMI, and the composition of the services can solve the query

SELECT ?bmi_value {
  ?patient a Person ;
    has_MRN "1234" ;
    has_BMI ?bmi_value
}

(no has_height or has_mass anywhere!)
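
One simple way to picture the composition step is forward chaining over service descriptions: keep invoking any service whose inputs are already known until the wanted property appears. This is only an illustrative sketch with hypothetical "requires"/"provides" sets; HYDRA's actual algorithm is proprietary and is not shown in these slides.

```python
# Sketch only: naive forward chaining over simplified service descriptions.
SERVICES = {
    "patientInfo": {"requires": {"has_MRN"}, "provides": {"has_height", "has_mass"}},
    "computeBMI": {"requires": {"has_height", "has_mass"}, "provides": {"has_BMI"}},
}


def plan(goal, known, services):
    """Return an ordered list of services deriving goal from known, or None."""
    known = set(known)
    order = []
    progress = True
    while goal not in known and progress:
        progress = False
        for name, svc in services.items():
            usable = svc["requires"] <= known
            adds_something = not svc["provides"] <= known
            if name not in order and usable and adds_something:
                order.append(name)
                known |= svc["provides"]
                progress = True
    return order if goal in known else None


# The query only supplies has_MRN, yet has_BMI is reachable in two steps.
steps = plan("has_BMI", {"has_MRN"}, SERVICES)
```

A naive planner like this may schedule services that are not strictly needed for the goal; it is meant to show the chaining idea, not an efficient strategy.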

Page 22: A practical introduction to SADI semantic Web services and HYDRA query tool

INTELLIGENT (REASONING-ENABLED) QUERY EXECUTION

● Some queries are too complex unless generality can be exploited:
  ➢ For example, a query concerning all antibiotics requires generalisation, otherwise all types of antibiotics would have to be enumerated in the query.

● A much better way to do this is to import a classification of drugs and use it in query execution.

● HYDRA facilitates such reasoning and even more complex reasoning with rules.

Page 23: A practical introduction to SADI semantic Web services and HYDRA query tool

(TINY) REASONING EXAMPLE

The query defines ?patient as a Patient instead of Person:

?patient a Patient ; has_MRN "1234" ; ...

● HYDRA is still able to call patientInfo on the Patient instance, say patient1234, if there is an axiom Patient subClassOf Person. It infers patient1234 a Person, which can be used as input to patientInfo.

● The axiom can be included in the definition of Output(patientInfo), or specified separately.
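
The inference step in this example amounts to following subClassOf axioms upward. A minimal sketch, assuming the axioms are stored as a dictionary from a class to its direct superclasses (a representation chosen here for illustration only):

```python
# Sketch only: type inference from subClassOf axioms.
SUBCLASS_OF = {"Patient": {"Person"}}  # Patient subClassOf Person


def superclasses(cls, axioms):
    """All direct and indirect superclasses of cls."""
    seen, stack = set(), [cls]
    while stack:
        for parent in axioms.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted, axioms):
    """Asserted types plus everything implied by the class hierarchy."""
    types = set(asserted)
    for cls in asserted:
        types |= superclasses(cls, axioms)
    return types


# patient1234 is asserted to be a Patient; Person is inferred.
types = inferred_types({"Patient"}, SUBCLASS_OF)
```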

Page 24: A practical introduction to SADI semantic Web services and HYDRA query tool

RESOURCE PUBLISHING WITH SADI (1)

• Specify the source of data / software you want to publish with SADI.

• Model data semantically: find ontologies describing your domains and decide how your data will be expressed in the terms of these ontologies.

For example, a patient database and a BMI computation algorithm.

Page 25: A practical introduction to SADI semantic Web services and HYDRA query tool

RESOURCE PUBLISHING WITH SADI (2)

• Define your services' I/O semantically: decide how to describe the operation of your services in the terms of the domain ontologies, i.e., what will be written in the input and output classes.

• Code the business logic of your services in Java, Perl or Python. If a service wraps a DB, convert the input RDF into a query and the query results back to RDF. The coding effort is usually tiny compared to the modelling.

• Overall development costs may be considerable, but this cost is well amortized because SADI services are highly reusable, due to their unprecedented degree of interoperability and discoverability.
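
The DB-wrapping step in the list above (input RDF to query, query results back to RDF) can be sketched in Python with the standard csv module. The column names and the sample row are made up for illustration, since the layout of the real source database is not given here, and the function is not tied to any SADI library API.

```python
import csv
import io

# Made-up sample data; column names are assumptions.
SAMPLE_CSV = "mrn,name,height_m,mass_kg\n1234,Jane Doe,1.7,92.0\n"


def patient_info(mrn, csv_text):
    """Look up a patient by MRN and return output triples, or [] if not found."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["mrn"] == mrn:
            subject = "patient" + mrn
            return [
                (subject, "has_height", "_:h"),
                ("_:h", "a", "Measurement"),
                ("_:h", "has_value", float(row["height_m"])),
                ("_:h", "has_units", "m"),
                (subject, "has_mass", "_:m"),
                ("_:m", "a", "Measurement"),
                ("_:m", "has_value", float(row["mass_kg"])),
                ("_:m", "has_units", "kg"),
            ]
    return []


# The MRN would be extracted from the input RDF (has_MRN) before this call.
triples = patient_info("1234", SAMPLE_CSV)
```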

Page 26: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (1)

● Specify the source of data / software you want to publish with SADI.
  ➢ Database (CSV file) containing patient MRN, name, height, weight, etc. We will use it to implement patientInfo.
  ➢ BMI computation algorithm: BMI = mass (kg) / (height (m))^2.
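
For reference, the BMI formula as a small Python function, applied to the sample values used elsewhere in this talk (mass 92.0 kg, height 1.7 m). The function name and the input check are illustrative additions.

```python
def compute_bmi(mass_kg, height_m):
    """Body mass index: mass in kilograms divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return mass_kg / height_m ** 2


bmi = compute_bmi(92.0, 1.7)
bmi_rounded = round(bmi, 2)  # 31.83
```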

Page 27: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (2)

● Model data semantically: find ontologies describing your domains and decide how your data will be expressed in the terms of these ontologies.
  ➢ Create ontology clinical_terms.owl in Protégé:
    ➢ Classes: Person, Patient, Measurement, Units
    ➢ Properties: has_BMI, has_MRN, has_height, has_mass, has_value, has_units.
    ➢ Individuals: m, kg.
  ➢ RDF data sample:

patient1234 a Patient;
  has_MRN "1234"^^xsd:string;
  has_height [a Measurement; has_value "1.7"^^xsd:float; has_units m];
  . . .

Page 28: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (3)

Background ontology medical_terminology.owl

Deploy:
cp medical_terminology.owl /var/lib/tomcat7/webapps/ROOT/

URL: http://localhost:8080/medical_terminology.owl

Page 29: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (4)

● Define your services' I/O semantically: decide how to describe the operation of your services in the terms of the domain ontologies, i.e., what will be written in the input and output classes.
  ➢ I/O ontologies: patientInfo.owl and computeBMI.owl, importing medical_terminology.owl

Page 30: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (5)

● Code the business logic of your services in Java, Perl or Python.

➢There is a good open-source Java library for creating SADI services as Java Servlets.

➢A skeleton code for a service is generated automatically; we just have to fill the body of one method.

➢The library takes care of all the HTTP connectivity issues, parses the input RDF to a simple abstract representation (Jena), and renders the output RDF.

➢The compiled WAR file can be immediately deployed on a servlet container (Tomcat, Jetty, etc).

➢SADI services take only 10-15 min to code (if the business logic is simple or already programmed).

Page 31: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (6)

Edit pom.xml and run service skeleton creation plug-in:

Page 32: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (7)

Just add your business logic code in processInput():

Page 33: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (8)

Source database patientsDB.csv:

Page 34: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (9)

Finished processInput() for service patientInfo :

Page 35: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (10)

Finished processInput() for service computeBMI :

Page 36: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (11)

• Deploy the services:

cp target/my-sadi-services.war /var/lib/tomcat7/webapps/

• Test service description availability (HTTP GET):

Page 37: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (12)

Test RDF for the services:

Page 38: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (13)

Service test runs with HTTP POST:

Page 39: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (14)

Running HYDRA command line application:

Page 40: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA PACKAGING

• Java API - can be embedded in something else.

• Command line application - convenient for small experiments.

• Web service (Java servlet) with
  – JSON-based protocol
  – Java client-side API.

Page 41: A practical introduction to SADI semantic Web services and HYDRA query tool

REMEMBER OUR BIG VISION?

Page 42: A practical introduction to SADI semantic Web services and HYDRA query tool

BIGGER VISION: SELF-SERVICE AD HOC QUERYING OF FEDERATED DATA

Page 43: A practical introduction to SADI semantic Web services and HYDRA query tool

THERE ARE NO OBSTACLES IN PRINCIPLE TO SELF-SERVICE QUERYING BECAUSE ...

● HYDRA implements semantic querying:
  ○ users need not know how the source data is organised or accessed.

● HYDRA can apply concept hierarchies and rules:
  ○ syntactically simple queries for complex questions.

We just need an adequate user interface for building queries.

Page 44: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA QUERY COMPOSITION GUI PRINCIPLES

● Queries are rendered as highly readable graphs.

● A lot of query composition is done by entering keyphrases in English;
  ○ HYDRA GUI suggests (sub)graphs implementing a given keyphrase.

● Nodes can be deleted/added manually;
  ○ the system suggests possibilities (navigation).

Page 45: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI SCREENSHOTS

Page 46: A practical introduction to SADI semantic Web services and HYDRA query tool

READABLE QUERY DESCRIPTION

Page 47: A practical introduction to SADI semantic Web services and HYDRA query tool

EMPTY CANVAS

Page 48: A practical introduction to SADI semantic Web services and HYDRA query tool

SERVICE REGISTRY

Note that we added allPatients that enumerates all patients with their MRN.

Page 49: A practical introduction to SADI semantic Web services and HYDRA query tool

KEYPHRASE INPUT

Page 50: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI PROPOSES QUERY GRAPHS

Page 51: A practical introduction to SADI semantic Web services and HYDRA query tool

THE USER CAN CONFIRM THE WHOLE GRAPH OR SOME PARTS OF IT

Page 52: A practical introduction to SADI semantic Web services and HYDRA query tool

ADDING MNEMONIC VARIABLE NAME

Page 53: A practical introduction to SADI semantic Web services and HYDRA query tool

MNEMONIC VARIABLE NAME ADDED

Page 54: A practical introduction to SADI semantic Web services and HYDRA query tool

MORE KEYPHRASE INPUT

Page 55: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI PROPOSES GRAPH AUGMENTATIONS

Page 56: A practical introduction to SADI semantic Web services and HYDRA query tool

VARIABLE NAME

Page 57: A practical introduction to SADI semantic Web services and HYDRA query tool

VARIABLE NAME ADDED

Page 58: A practical introduction to SADI semantic Web services and HYDRA query tool

MANUALLY ADDING RELATIONS

The relation added here is the numeric comparison <, but it could be any kind of relation.

Page 59: A practical introduction to SADI semantic Web services and HYDRA query tool

EXTENDED GRAPH

Page 60: A practical introduction to SADI semantic Web services and HYDRA query tool

SPECIFYING A DATA VALUE

Page 61: A practical introduction to SADI semantic Web services and HYDRA query tool

EXTENDED GRAPH

The query is ready. It finds all patients with 20 < BMI < 30 and outputs their BMI values and MRNs.

Page 62: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI GENERATES SPARQL FROM QUERY GRAPHS

Page 63: A practical introduction to SADI semantic Web services and HYDRA query tool

EXECUTING THE QUERY

Page 64: A practical introduction to SADI semantic Web services and HYDRA query tool

ANSWERS

Page 65: A practical introduction to SADI semantic Web services and HYDRA query tool

SAVING THE ANSWERS AS AN EXCEL SPREADSHEET

Page 66: A practical introduction to SADI semantic Web services and HYDRA query tool

THANK YOU!

Further materials/services are available on request:

• Live and recorded demos.

• Publications on previous (academic) case studies.

• Training/consulting.

• http://ipsnp.com/ (Canada) and http://ipsnp.co/ (UK)