Relational Databases to RDF (a.k.a. RDB2RDF)
Juan F. Sequeda
Dept of Computer Science
University of Texas at Austin
Jan 13, 2016
I want RDF… but my data is in RDB!
Why RDB2RDF?
• Semantic Web
– Deep Web is 500 times bigger than Static Web (2008)
– Where do you think that the majority of the data is stored?
– If we want a Semantic Web, we need data to be on the web as RDF and interlinked!
• Where do you think this data is going to come from?
[Figure: many relational databases (RDB), each published to the web through RDB2RDF]
Why RDB2RDF?
• Data Integration
– Do you know why RDF is cool?
• because it’s a graph!
– How do you link/integrate two different graphs?
• add edges between nodes or merge nodes!
Real world scenario
• Boss: Find me the clients based in cities with a population of less than 1 million.
• You: ???

Clients
id  Name      c_id
10  ACME Inc  20
11  Foo Bars  21

Locations
c_id  city    state
20    Austin  TX
21    Dallas  TX
Real world scenario
• You: I found the population information… but it’s in a different database. Can you add a column to the Locations table so I can insert the new data?
• DBA: NO!

Location (the other database)
id  city    state  pop
1   Austin  TX     790390
2   Dallas  TX     1197816
[Figure: the Clients and Locations tables (db1) and the Location table (db2) drawn as two separate RDF graphs. Nodes: http://db1/client10, http://db1/client11, http://db1/loc20, http://db1/loc21, http://db2/loc1, http://db2/loc2, the class ex:Client, and the literals ACME Inc, Foo Bars, Austin, Dallas, TX, 790390 and 1197816. Edge labels: rdf:type, ex:name, ex:basedIn, ex:city, ex:state, ex:pop.]
[Figure: the same data as one integrated graph. The http://db1/loc20 and http://db1/loc21 nodes are gone, leaving http://db1/client10, http://db1/client11, http://db2/loc1 and http://db2/loc2 with the edge labels rdf:type, ex:name, ex:basedIn, ex:city, ex:state and ex:pop.]
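The integration in this scenario can be sketched in a few lines of plain Python. This is only an illustration, not a real RDF toolkit: a graph is a set of (subject, predicate, object) tuples, the exact edges are an assumption read off the figure labels, and the helper names (same_place, merge, clients_in_small_cities) are hypothetical. Nodes that agree on city and state are merged, after which the boss's question becomes a simple walk over the graph.

```python
# A graph is a set of (subject, predicate, object) tuples.
# The edges below are inferred from the slide's figure labels (an assumption).
DB1 = {
    ("http://db1/client10", "rdf:type", "ex:Client"),
    ("http://db1/client10", "ex:name", "ACME Inc"),
    ("http://db1/client10", "ex:basedIn", "http://db1/loc20"),
    ("http://db1/client11", "rdf:type", "ex:Client"),
    ("http://db1/client11", "ex:name", "Foo Bars"),
    ("http://db1/client11", "ex:basedIn", "http://db1/loc21"),
    ("http://db1/loc20", "ex:city", "Austin"),
    ("http://db1/loc20", "ex:state", "TX"),
    ("http://db1/loc21", "ex:city", "Dallas"),
    ("http://db1/loc21", "ex:state", "TX"),
}
DB2 = {
    ("http://db2/loc1", "ex:city", "Austin"),
    ("http://db2/loc1", "ex:state", "TX"),
    ("http://db2/loc1", "ex:pop", 790390),
    ("http://db2/loc2", "ex:city", "Dallas"),
    ("http://db2/loc2", "ex:state", "TX"),
    ("http://db2/loc2", "ex:pop", 1197816),
}


def same_place(graph_a, graph_b):
    """Map each node of graph_a to the node of graph_b with the same (city, state)."""
    def by_city_state(graph):
        cities = {s: o for s, p, o in graph if p == "ex:city"}
        states = {s: o for s, p, o in graph if p == "ex:state"}
        return {(cities[s], states[s]): s for s in cities if s in states}

    a, b = by_city_state(graph_a), by_city_state(graph_b)
    return {a[key]: b[key] for key in a if key in b}


def merge(graph_a, graph_b):
    """Union both graphs, renaming matched graph_a nodes to their graph_b twin."""
    alias = same_place(graph_a, graph_b)
    renamed = {(alias.get(s, s), p, alias.get(o, o)) for s, p, o in graph_a}
    return renamed | graph_b


def clients_in_small_cities(graph, limit):
    """Names of clients based in a location whose population is below limit."""
    pop = {s: o for s, p, o in graph if p == "ex:pop"}
    names = {s: o for s, p, o in graph if p == "ex:name"}
    return sorted(
        names[s]
        for s, p, o in graph
        if p == "ex:basedIn" and o in pop and pop[o] < limit and s in names
    )


merged = merge(DB1, DB2)
print(clients_in_small_cities(merged, 1_000_000))  # ['ACME Inc']
```

Nobody had to alter a table: the population data joined the client data once the two location nodes were treated as one.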
A bit of history
• Relational Databases on the Web. TimBL, 1998
• W3C Workshop on RDF Access to Relational Databases, October 2007
– Report: http://www.w3.org/2007/03/RdfRDB/report
• W3C RDB2RDF Incubator Group, 2008-2009
– Survey: http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
• W3C RDB2RDF Working Group, 2009 – today
– R2RML: RDB to RDF Mapping Language
– A Direct Mapping of Relational Data to RDF
RDB and the Semantic Web
[Figure: the Semantic Web stack (RDF, RDFS, OWL, RIF) shown next to the relational stack (relational model, table definition, constraints, triggers)]
Overview
R2RML: RDB to RDF Mapping Language
• Language for expressing customized mappings from relational databases to RDF datasets
• Give precise control to the developer
– You create the structure you want
– You choose the target vocabulary
• No RDFS/OWL is created from the schema
[Figure: R2RML Mapping. RDB to RDF through a manually written R2RML mapping]
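To make the idea of a customized mapping concrete, here is a minimal Python sketch. It is not R2RML (real R2RML mappings are themselves RDF documents); the CLIENT_MAPPING structure and the apply_mapping helper are hypothetical and only mimic the principle that the developer, not the schema, picks the subject IRIs, the class and the predicates.

```python
# Rows of the Clients table from the scenario above.
CLIENTS = [
    {"id": 10, "Name": "ACME Inc", "c_id": 20},
    {"id": 11, "Name": "Foo Bars", "c_id": 21},
]

# Hypothetical mapping description (not R2RML syntax): every name on the
# right-hand side is chosen by the developer, independent of the schema.
CLIENT_MAPPING = {
    "subject": "http://db1/client{id}",                     # IRI template per row
    "class": "ex:Client",                                   # target class
    "literals": {"Name": "ex:name"},                        # column -> predicate
    "references": {"ex:basedIn": "http://db1/loc{c_id}"},   # predicate -> IRI template
}


def apply_mapping(rows, mapping):
    """Produce (subject, predicate, object) triples for each row."""
    triples = []
    for row in rows:
        subject = mapping["subject"].format(**row)
        triples.append((subject, "rdf:type", mapping["class"]))
        for column, predicate in mapping["literals"].items():
            triples.append((subject, predicate, row[column]))
        for predicate, template in mapping["references"].items():
            triples.append((subject, predicate, template.format(**row)))
    return triples


for triple in apply_mapping(CLIENTS, CLIENT_MAPPING):
    print(triple)
```

Changing the target vocabulary means editing the mapping only; the table and the code that walks it stay the same.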
Direct Mapping
• Automatic transformation from Relational Database to RDF
– Click a button… Voilà!
• Generate RDFS/OWL of the database schema
• If this doesn’t get you where you want… use existing languages for mapping
– RDF to RDF with RIF or SPARQL Construct (Semantic Web community)
– Create SQL Views and directly map those (Database community)
[Figure: Direct Mapping. The RDB is mapped automatically to RDF; the result can be reshaped RDF to RDF with RIF/SPARQL Construct, or SQL Views can be defined over the RDB and mapped directly]
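The "click a button" idea can be sketched with Python's standard sqlite3 module. The IRI shapes (table/key=value for rows, table#column for predicates) loosely follow the W3C Direct Mapping draft; the base IRI, the direct_mapping function name and the assumption that every table has a primary key are mine. The sketch only produces instance triples: it does not generate RDFS/OWL for the schema, and foreign keys stay plain values, whereas the W3C Direct Mapping also turns them into links between row nodes.

```python
import sqlite3

BASE = "http://db1/"  # assumed base IRI


def direct_mapping(conn, base=BASE):
    """Derive triples for every table from the schema alone, with no hand-written mapping."""
    triples = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        columns = [c[1] for c in info]
        pk = [c[1] for c in sorted(info, key=lambda c: c[5]) if c[5] > 0]
        for row in conn.execute(f'SELECT * FROM "{table}"'):
            values = dict(zip(columns, row))
            key = ";".join(f"{c}={values[c]}" for c in pk)
            subject = f"{base}{table}/{key}"
            triples.append((subject, "rdf:type", f"{base}{table}"))
            for column in columns:
                if values[column] is not None:  # NULL values yield no triple
                    triples.append((subject, f"{base}{table}#{column}", values[column]))
    return triples


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Locations (c_id INTEGER PRIMARY KEY, city TEXT, state TEXT);
    CREATE TABLE Clients (id INTEGER PRIMARY KEY, Name TEXT,
                          c_id INTEGER REFERENCES Locations(c_id));
    INSERT INTO Locations VALUES (20, 'Austin', 'TX'), (21, 'Dallas', 'TX');
    INSERT INTO Clients VALUES (10, 'ACME Inc', 20), (11, 'Foo Bars', 21);
""")
triples = direct_mapping(conn)
for triple in triples:
    print(triple)
```

The vocabulary here mirrors the table and column names, which is exactly why a second RDF-to-RDF step or a SQL view is needed when a different target vocabulary is wanted.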
Hybrid
• Instead of starting from a blank R2RML file…
• 1) Direct Mapping
• 2) Manual Editing
[Figure: Hybrid Mapping. Labels: RDB, Direct Mapping, Direct Mapping in R2RML, Modify, R2RML, RDF]
Materialize Triples
• Data is not dynamic
• Dump RDB into RDF and then insert into triplestore
• RDF dump may not be consistent with RDB
[Figure: Materialized Triples. The RDB is dumped to RDF, and SPARQL queries run against the RDF]
Virtual Triples
• Data is dynamic
• Need to query RDB with SPARQL
• Translate SPARQL to SQL
– Comparing the overall performance […] of the fastest rewriter with the fastest relational database shows an overhead for query rewriting of 106%. This is an indicator that there is still room for improving the rewriting algorithms [Bizer and Schultz 2009]
– Current rdb2rdf systems are not capable of providing the query execution performance required [...] it is likely that with more work on query translation, suitable mechanisms for translating queries could be developed. These mechanisms should focus on exploiting the underlying database system’s capabilities to optimize queries and process large quantities of structure data [Gray et al. 2009]
– Ultrawrap solves this
• RDF data is consistent with RDB data
[Figure: Virtual Triples. SPARQL queries are answered from the RDB through a Mapping]
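A toy version of the virtual approach, again using sqlite3: a single triple pattern of the form "?s <predicate> ?o", optionally with a "less than" filter on ?o, is rewritten to SQL and executed against the live table. This is a deliberately tiny sketch and not how Ultrawrap or any other system mentioned here works; the MAPPING table, the SUBJECT_PREFIX and the function names are assumptions made for illustration.

```python
import sqlite3

# Assumed mapping: predicate -> (table, key column, value column).
MAPPING = {
    "ex:city": ("Location", "id", "city"),
    "ex:pop": ("Location", "id", "pop"),
}
SUBJECT_PREFIX = "http://db2/loc"  # assumed IRI prefix for Location rows


def pattern_to_sql(predicate, max_value=None):
    """Rewrite `?s <predicate> ?o` (optionally FILTER(?o < max_value)) as SQL."""
    table, key, column = MAPPING[predicate]
    sql = (f"SELECT '{SUBJECT_PREFIX}' || {key}, {column} "
           f"FROM {table} WHERE {column} IS NOT NULL")
    params = []
    if max_value is not None:
        sql += f" AND {column} < ?"
        params.append(max_value)
    return sql + " ORDER BY 1", params


def query(conn, predicate, max_value=None):
    """Answer the pattern straight from the relational data; no triples are stored."""
    sql, params = pattern_to_sql(predicate, max_value)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Location (id INTEGER PRIMARY KEY, city TEXT, state TEXT, pop INTEGER);
    INSERT INTO Location VALUES (1, 'Austin', 'TX', 790390), (2, 'Dallas', 'TX', 1197816);
""")
before = query(conn, "ex:pop", max_value=1_000_000)
print(before)  # [('http://db2/loc1', 790390)]

# A made-up update to the RDB is visible in the very next query,
# because nothing was dumped or copied.
conn.execute("UPDATE Location SET pop = 999999 WHERE id = 2")
after = query(conn, "ex:pop", max_value=1_000_000)
print(after)   # [('http://db2/loc1', 790390), ('http://db2/loc2', 999999)]
```

Because the filter is pushed into the WHERE clause, the database does the selection; the overhead discussed above comes from how well such rewrites let the database optimizer do its job on larger queries.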
RDB2RDF Space
[Figure: the RDB2RDF design space. One dimension: Materialized Triples vs. Virtual Triples. Other dimension: Direct Mapping, Hybrid, Custom Mapping]
Tuples to Triples

SID  NAME   AGE
1    Alice  25
2    Bob    26

SUBJECT                PREDICATE          OBJECT
http://ex.com/person1  http://ex.com/age  25
Current Status of W3C RDB2RDF WG
• R2RML: RDB to RDF Mapping Language
– Working Draft: http://www.w3.org/TR/r2rml/
• A Direct Mapping of Relational Data to RDF
– Working Draft: http://www.w3.org/TR/rdb-direct-mapping/
• Last Call: Sept 1 (hopefully)
Implementations
• Ultrawrap
– SPARQL and semantically equivalent SQL have equal execution time
– Commercial databases
– http://ribs.csres.utexas.edu/ultrawrap
• Spyder
– Oracle and HSQLDB
– http://www.revelytix.com/content/spyder
• Other non-standard RDB2RDF
– D2R Server, Virtuoso, Triplify, …
Publicity
• International Semantic Web Conference
– Oct 23 – 27 in Bonn, Germany
• Posters and Demos
– August 15
• Consuming Linked Data Workshop
– August 15
• Outrageous Ideas Track
– Sept 5
• Semantic Web Challenge
– Sept 30
• 2nd Linked Data-a-thon
– Oct 1
http://iswc2011.semanticweb.org/
Join the Facebook group SSSW2011
Thank You
@juansequeda
Acknowledgments:
- RiBS @ UT Austin
- W3C RDB2RDF WG members
- David McNeil - Revelytix