
The Future is Federated

Jan 23, 2018


Ruben Verborgh
Transcript
Page 1: The Future is Federated

The future is federated

Ruben Verborgh

Page 2: The Future is Federated

I think Big Data is boring.

Page 3: The Future is Federated

Big Data thrives on centralization.

Page 4: The Future is Federated

Knowledge is inherently distributed.

Page 5: The Future is Federated

Knowledge is inherently heterogeneous.

Page 6: The Future is Federated

Knowledge on the Web is inherently linked.

Page 7: The Future is Federated

Centralization skips the most interesting problems.

Page 8: The Future is Federated

Where to find data you need?

How to access them?

How to integrate them?

Page 9: The Future is Federated

Let’s create smart apps over VIVO and Web data.

Page 10: The Future is Federated

You’ll get to see 3 things:

1. a light interface to VIVO data
2. queries over that interface
3. an app built on such queries

Page 11: The Future is Federated

We can integrate multiple data sources on the live Web, but we need to set our expectations right.

Page 12: The Future is Federated

The future is federated

Big Data fails at Web scale
Light interfaces rule
Engineer for serendipity

Page 13: The Future is Federated

The future is federated

Big Data fails at Web scale
Light interfaces rule
Engineer for serendipity

Page 14: The Future is Federated

RDF: THE DATA LANGUAGE

Page 15: The Future is Federated

<subject> <predicate> <object>.

triple
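As a minimal sketch of the triple shape on this slide, the following models a triple whose three terms are all IRIs and serializes it in the angle-bracket notation shown above. The `Triple` class and the `example.org` IRIs are made up for illustration; real RDF also allows literals and blank nodes, which this sketch ignores.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; every term is assumed to be an IRI here."""
    subject: str
    predicate: str
    obj: str

    def to_ntriples(self) -> str:
        # Wrap each IRI in angle brackets and end the statement with a dot.
        return f"<{self.subject}> <{self.predicate}> <{self.obj}> ."


# Hypothetical IRIs, used only to show the shape of a triple.
t = Triple(
    "http://example.org/alice",
    "http://example.org/knows",
    "http://example.org/bob",
)
line = t.to_ntriples()
print(line)
```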

Page 16: The Future is Federated

SPARQL: THE QUERY LANGUAGE

Page 17: The Future is Federated

SPARQL: THE PROTOCOL

Page 18: The Future is Federated

Diagram: a client sends a SPARQL query to a SPARQL endpoint over the SPARQL protocol.
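The SPARQL protocol lets a client send a query to an endpoint over HTTP, for instance as a GET request with the query text in a `query` parameter. The sketch below only builds such a URL and sends nothing; the endpoint address is a hypothetical placeholder, and `build_query_url` is a helper invented for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint address; replace with a real SPARQL endpoint.
ENDPOINT = "http://example.org/sparql"


def build_query_url(endpoint: str, query: str) -> str:
    """Return the URL for a SPARQL query sent via HTTP GET."""
    # The query string is percent-encoded into the `query` parameter.
    return endpoint + "?" + urlencode({"query": query})


query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"
url = build_query_url(ENDPOINT, query)
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

The whole query travels to the server, which must evaluate it in full; that is the cost the following slides argue about.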

Page 19: The Future is Federated

Hey, SPARQL endpoint…

SELECT ?person ?name WHERE { ?person a dbo:Scientist. ?person rdfs:label ?name. ?person dbo:birthPlace dbp:Denver. }

Sure!

Page 20: The Future is Federated

SELECT DISTINCT ?drug ?drug1 ?drug2 ?drug3 ?drug4 ?d1 WHERE { ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antibiotics> . ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antiviralAgents> . ?drug3 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antihypertensiveAgents> . ?drug4 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/anti-bacterialAgents> . ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g1 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l1 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw1 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/hprdId> ?hp1 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/swissprotName> ?sn1 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/proteinSequence> ?ps1 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/generalReference> ?gr1 . ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target>?o1 . ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o2 . ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g2 . ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l2 . ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw2 . ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/hprdId> ?hp2 . ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/swissprotName> ?sn2 . 
?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/proteinSequence> ?ps2 . ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/generalReference> ?gr2 . ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target>?o2 . ?drug3 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o3 .

Hey, SPARQL endpoint…

Sure!

Page 21: The Future is Federated

SPARQL endpoints try to be the Web’s Big Data processors, for free.

Page 22: The Future is Federated

Can I SPARQL your endpoint?

few endpoints exist

the average endpoint is down for 1.5 days/month

Page 23: The Future is Federated

Big Data fails at Web scale because Web scale is much bigger.

Page 24: The Future is Federated

SEMANTIC WEB SHOULDN’T TRY TO COMPETE WITH BIG DATA

Page 25: The Future is Federated

I WANT TO PUT THE WEB BACK INTO SEMANTIC WEB

IT’S OUR MAIN DIFFERENTIATOR FROM BIG DATA

Page 26: The Future is Federated

IF IT’S NOT WEB, I’M NOT INTERESTED

That’s why I think Big Data is boring.

Page 27: The Future is Federated

The future is federated

Big Data fails at Web scale
Light interfaces rule
Engineer for serendipity

Page 28: The Future is Federated

What would the AVERAGE HUMAN do?

Page 29: The Future is Federated

SELECT ?person ?name WHERE { ?person a dbo:Scientist. ?person rdfs:label ?name. ?person dbo:birthPlace dbp:Denver. }

AVERAGE HUMAN

You can use only Wikipedia.

Page 30: The Future is Federated

AVERAGE HUMAN

Which scientists were born in Denver?

You can use only Wikipedia.

Page 31: The Future is Federated

AVERAGE HUMAN

1. visit the page about Denver
2. make a list of people born there
3. read their pages to see if they’re a scientist

You can use only Wikipedia.

Page 32: The Future is Federated

WEB LINKING IS UNIDIRECTIONAL

a Denver person’s page links to Denver
Denver doesn’t necessarily link to that person

Page 33: The Future is Federated

AVERAGE HUMAN

1. visit the page about Denver
2. make a list of people born there
3. read their pages to see if they’re a scientist

You can use only Wikipedia.

Page 34: The Future is Federated

We need to empower the AVERAGE HUMAN

but please not with a SPARQL endpoint, because they’re so expensive to keep up.

Page 35: The Future is Federated

WHAT IS THE SIMPLEST COMPLEXITY?

Page 36: The Future is Federated

THE ESSENCE OF RDF

<subject> <predicate> <object>.

Page 37: The Future is Federated

THE ESSENCE OF LINKED DATA

?subject <predicate> <object>.

Page 38: The Future is Federated

THE ESSENCE OF LINKED DATA

Denver <predicate> <object>.

Page 39: The Future is Federated

THE ESSENCE OF TPF

?subject ?predicate ?object.

Page 40: The Future is Federated

THE ESSENCE OF TPF

?subject ?predicate Denver.

Page 41: The Future is Federated

TRIPLE PATTERN FRAGMENTS

Page 42: The Future is Federated

Clients can ask the server only for triple patterns.
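To make the idea of a triple pattern concrete, here is a minimal sketch of what a server answering only triple patterns has to do: compare each stored triple against a pattern in which any position may be a variable. Everything here is an assumption for illustration: the data is invented, terms are plain strings, a term starting with `?` counts as a variable, and a real TPF server additionally pages its results and adds metadata.

```python
def is_variable(term: str) -> bool:
    """Treat any term that starts with '?' as a variable."""
    return term.startswith("?")


def matches(pattern, triple) -> bool:
    """A triple matches if every non-variable position is identical."""
    return all(is_variable(p) or p == t for p, t in zip(pattern, triple))


def fragment(triples, pattern):
    """Return all triples that match the pattern, in stored order."""
    return [t for t in triples if matches(pattern, t)]


# Invented data; shortened names stand in for full IRIs.
DATA = [
    ("Alice", "birthPlace", "Denver"),
    ("Alice", "type", "Scientist"),
    ("Bob", "birthPlace", "Denver"),
    ("Carol", "birthPlace", "Boston"),
]

born_in_denver = fragment(DATA, ("?person", "birthPlace", "Denver"))
print(born_in_denver)
```

Note that this sketch does not handle a variable repeated within one pattern; it is only meant to show how little work a single pattern lookup is compared with a full query.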

Page 43: The Future is Federated

AVERAGE HUMAN

Which scientists were born in Denver?

You can only use a TPF interface of DBpedia.

Page 44: The Future is Federated

AVERAGE HUMAN

1. “?person birthPlace Denver.”
2. “?person type Scientist.”
3. “?person fullName ?name.”

You can only use a TPF interface of DBpedia.

Page 45: The Future is Federated

AVERAGE MACHINE

1. “?person birthPlace Denver.”
2. “?person type Scientist.”
3. “?person fullName ?name.”

You can only use a TPF interface of DBpedia.

Page 46: The Future is Federated

SELECT ?person ?name WHERE { ?person a dbo:Scientist. ?person rdfs:label ?name. ?person dbo:birthPlace dbp:Denver. }

AVERAGE MACHINE

You can only use a TPF interface of DBpedia.
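The three patterns on the preceding slides can be combined on the client side. The sketch below is one simple way to do so, not the algorithm of any particular TPF client: it answers the first pattern, substitutes each result into the remaining patterns, and repeats. The in-memory `DATA` list stands in for a TPF interface, and all names and data are invented.

```python
def fragment(triples, pattern):
    """Stand-in for a TPF request: return triples matching one pattern."""
    return [
        t for t in triples
        if all(p.startswith("?") or p == v for p, v in zip(pattern, t))
    ]


def bind(pattern, bindings):
    """Replace variables that already have a value."""
    return tuple(bindings.get(term, term) for term in pattern)


def solve(triples, patterns, bindings=None):
    """Yield one dict of variable bindings per solution (nested-loop join)."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    bound = bind(patterns[0], bindings)
    for triple in fragment(triples, bound):
        extended = dict(bindings)
        # Record the values found for the still-unbound variables.
        extended.update(
            {term: value for term, value in zip(bound, triple)
             if term.startswith("?")}
        )
        yield from solve(triples, patterns[1:], extended)


# Invented data; shortened names stand in for full IRIs.
DATA = [
    ("Alice", "birthPlace", "Denver"),
    ("Alice", "type", "Scientist"),
    ("Alice", "fullName", "Alice Example"),
    ("Bob", "birthPlace", "Denver"),
    ("Bob", "type", "Artist"),
    ("Carol", "birthPlace", "Boston"),
    ("Carol", "type", "Scientist"),
]

PATTERNS = [
    ("?person", "birthPlace", "Denver"),
    ("?person", "type", "Scientist"),
    ("?person", "fullName", "?name"),
]

solutions = list(solve(DATA, PATTERNS))
print(solutions)
```

The server only ever sees single-pattern lookups; the join happens in the client, which is the trade-off the later slides describe.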

Page 47: The Future is Federated

The future is federated

Big Data fails at Web scale
Light interfaces rule
Engineer for serendipity

Page 48: The Future is Federated

“Engineer for serendipity.”

Roy T. Fielding

Page 49: The Future is Federated

Federated queries with SPARQL endpoints pose a problem.

If 1 endpoint is down for 1.5 days each month, then a query that needs 2 endpoints might be unavailable for up to 3 days each month.
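The arithmetic behind this slide can be sketched as follows. The 1.5 days of downtime per month comes from the slides; the 30-day month and the assumption that endpoints fail independently are ours. The worst case (outages never overlap) gives the 3 days mentioned above, while independent failures give slightly less.

```python
DAYS_PER_MONTH = 30          # assumption: a 30-day month
DOWN_DAYS_PER_ENDPOINT = 1.5  # figure taken from the slides


def worst_case_downtime(n_endpoints: int) -> float:
    """Outages never overlap, so the downtimes simply add up."""
    return n_endpoints * DOWN_DAYS_PER_ENDPOINT


def independent_downtime(n_endpoints: int) -> float:
    """Expected days per month on which at least one endpoint is down,
    assuming endpoints fail independently of each other."""
    p_up = 1 - DOWN_DAYS_PER_ENDPOINT / DAYS_PER_MONTH
    return (1 - p_up ** n_endpoints) * DAYS_PER_MONTH


for n in (1, 2, 5):
    print(n, worst_case_downtime(n), round(independent_downtime(n), 3))
```

Either way, the downtime a federated query sees grows with every endpoint it depends on.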

Page 50: The Future is Federated

Federated queries are native to TPF clients.

Just ask each of the questions to different TPF servers.
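As a rough sketch of why federation comes naturally here: if every server answers the same kind of single-pattern question, a client can put each question to all servers and merge the answers. The two in-memory "servers" and their data below are invented, and a real client would issue HTTP requests instead of scanning lists.

```python
def fragment(triples, pattern):
    """Stand-in for one TPF request against one server."""
    return [
        t for t in triples
        if all(p.startswith("?") or p == v for p, v in zip(pattern, t))
    ]


def federated_fragment(sources, pattern):
    """Ask every source the same pattern and concatenate the results."""
    results = []
    for triples in sources.values():
        results.extend(fragment(triples, pattern))
    return results


# Two invented sources; neither can answer the full question alone.
SOURCES = {
    "places": [
        ("Alice", "birthPlace", "Denver"),
        ("Bob", "birthPlace", "Denver"),
    ],
    "people": [
        ("Alice", "type", "Scientist"),
        ("Bob", "type", "Artist"),
    ],
}

# Step 1: who was born in Denver (answered by the "places" source)?
born = [s for s, _, _ in
        federated_fragment(SOURCES, ("?person", "birthPlace", "Denver"))]

# Step 2: which of them are scientists (answered by the "people" source)?
scientists = [p for p in born
              if federated_fragment(SOURCES, (p, "type", "Scientist"))]
print(scientists)
```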

Page 51: The Future is Federated

TPF trades server cost for query performance.

But in federated scenarios, performance can be on par with SPARQL endpoints!

Page 52: The Future is Federated

Lightweight interfaces are easy to extend and combine with others.

TPF is not the final solution (no API will ever be), but it is an excellent starting point.

Page 53: The Future is Federated

The Memento protocol brings time to the Web.

Ask for representations at a certain point in the past.
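In Memento (RFC 7089), a client asks for a past representation by sending an `Accept-Datetime` request header with an HTTP date. The sketch below only constructs such a request and does not send it; the TimeGate URL is a hypothetical placeholder and `memento_request` is a helper invented for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def memento_request(timegate_url: str, when: datetime) -> Request:
    """Build a request asking for the resource as it was at `when`.

    `when` should be timezone-aware; it is converted to UTC first.
    """
    # HTTP dates look like "Thu, 01 Jan 2015 00:00:00 GMT".
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(timegate_url, headers={"Accept-Datetime": http_date})


# Hypothetical TimeGate address, used only for illustration.
req = memento_request(
    "http://example.org/timegate/resource",
    datetime(2015, 1, 1, tzinfo=timezone.utc),
)

# urllib stores header names with only the first letter capitalized.
print(req.get_header("Accept-datetime"))
```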

Page 54: The Future is Federated

TPF and Memento are a great match.

We combined them in collaboration with Herbert Van de Sompel & team at the Los Alamos National Laboratory.

Page 55: The Future is Federated

The future is federated

Big Data fails at Web scale
Light interfaces rule
Engineer for serendipity

Page 56: The Future is Federated

VIVO today

Diagram labels: client, TPF server, SPARQL, VIVO

Page 57: The Future is Federated

VIVO tomorrow?

Diagram labels: client, TPF, VIVO

Page 58: The Future is Federated

Federation is a game changer.

Page 59: The Future is Federated

Federation with the TPF interface is a game changer.

Page 60: The Future is Federated

With great power comes great responsibility

Page 61: The Future is Federated

We need to be realistic about our expectations

Page 62: The Future is Federated

Some queries will always be hard on an open Web.

You might need centralization if you want answers fast.

*

*Terms and conditions apply.

Page 63: The Future is Federated

Many more queries than you’d think are pretty fast…

…and streaming!

Page 64: The Future is Federated

OPEN SOURCE

linkeddatafragments.org

Page 65: The Future is Federated

The future is federated

and it starts today

@RubenVerborgh