1 SPARQLing Constraints for RDF Michael Schmidt EDBT, 2008 March 28 joint work with Prof. Georg Lausen, Michael Meier

Jan 18, 2016


Page 1: SPARQLing Constraints for RDF

Michael Schmidt

EDBT 2008, March 28
Joint work with Prof. Georg Lausen and Michael Meier

Page 2: SPARQLing Constraints for RDF

RDF Data Format

• Machine-readable information
• Established in the Semantic Web

SPARQL Query Language (based on RDF)

• Declarative language
• W3C Recommendation since Jan.

Constraints

• Primary and foreign keys
• Cardinality constraints, …

Extension of RDF by constraints

• With fixed semantics
• Integration into the framework

The role of SPARQL in this context

• Extracting constraints
• Checking constraints
• Optimization of SPARQL queries under constraints

Page 3: Why Constraints?

• Restricting the state space of the database
• Maintenance of data consistency (e.g. when data is updated)
• Semantic Query Optimization
• Better understanding of the data
• Here: translation of relational schemata to RDF without loss of information

Page 4: The RDF Data Format

[Figure: RDF graph with the nodes t1, t2 and Teachers and the literals "Joe", "Fred", "CS" and "43"; edges are labeled name, faculty, age, knows and rdf:type]

"Triples of Knowledge"

(t1, name, "Joe"), (t1, faculty, "CS"), (t1, knows, t2)

Page 5: The RDF Data Format

[Figure: the same RDF graph as on Page 4]

Three elementary types

• URIs (describe physical/logical entities & properties)
• Literals (string values)
• Blank nodes (not considered)
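As a rough illustration of the triple view, the three triples listed on Page 4 can be held in a set of Python tuples. The `Lit` wrapper and the helpers `objects` and `is_literal` are hypothetical names introduced only for this sketch; plain strings stand in for URIs, and blank nodes are left out, as on the slide.

```python
from typing import NamedTuple


class Lit(NamedTuple):
    """A literal (string value); plain Python strings stand in for URIs."""
    value: str


# The three triples listed on Page 4, written as (subject, predicate, object).
GRAPH = {
    ("t1", "name", Lit("Joe")),
    ("t1", "faculty", Lit("CS")),
    ("t1", "knows", "t2"),
}


def objects(graph, subject, predicate):
    """Return every object o with (subject, predicate, o) in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def is_literal(term):
    """True for literals, False for URI stand-ins."""
    return isinstance(term, Lit)
```

A set is used because an RDF graph is a set of triples: adding the same triple twice does not change the graph.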

Page 6: A Relational Data Schema

Teachers
name | faculty
Joe  | CS
Fred | CS

Students
matric | name
11111  | John
22222  | Ed

Courses
taught_by | name
Joe       | DB
Fred      | Web

Participants
c_id | s_id
Fred | 11111
Fred | 22222

+ NOT NULL constraints on each column

Page 7: A Translation into RDF

[Figure: RDF graph of the relational example. Class nodes Teachers, Students, Courses and Participants; instances t1, t2, s1, s2, c1, c2, p1, p2 linked to their classes via rdf:type; property edges name, faculty, matric, taught_by, c_id and s_id lead to the values Joe, Fred, "CS", 11111, 22222, "John", "Ed", "DB" and "Web"]

Problem: Constraints only implicitly given!
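A minimal sketch of such a translation, assuming one node per row (named by a prefix plus the row number, as in t1, t2) and one triple per column. The function name `table_to_triples` and the node naming scheme are assumptions made for this sketch, not the mapping defined by the authors.

```python
def table_to_triples(class_name, node_prefix, columns, rows):
    """Turn each row into a node typed with class_name plus one triple per column."""
    triples = set()
    for number, row in enumerate(rows, start=1):
        node = f"{node_prefix}{number}"          # e.g. t1, t2
        triples.add((node, "rdf:type", class_name))
        for column, value in zip(columns, row):
            triples.add((node, column, value))
    return triples


# Two of the tables from Page 6.
teachers = table_to_triples("Teachers", "t", ["name", "faculty"],
                            [("Joe", "CS"), ("Fred", "CS")])
participants = table_to_triples("Participants", "p", ["c_id", "s_id"],
                                [("Fred", "11111"), ("Fred", "22222")])
graph = teachers | participants
```

Nothing in the resulting triples records that name is a key of Teachers or that s_id references Students, which is the problem the slide points out.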

Page 8: Constraints for RDF

• Encoding in the schema layer
• New namespace "rdfc" provides a constraint vocabulary with fixed semantics
  • rdfc:Key for primary keys
  • rdfc:FKey for foreign keys
  • rdfc:ref links foreign keys to primary keys
• Use the built-in RDF container class rdf:Seq

Page 9: Encoding Constraints

[Figure: schema-layer encoding of a key and a foreign key for the example graph. Labels shown: classes Teachers and Courses; instances t1, t2, c1, c2 with their name, faculty and taught_by values; constraint nodes T_Key (rdfc:Key, rdf:Seq, rdf:_1 name) and C_FKey (rdfc:FKey, rdf:Seq, rdf:_1 taught_by), connected by rdfc:ref]

Page 10: Types of Constraints

Let C, C1, C2 be classes and Qi, Ri properties.

Primary keys, foreign keys

Key(C,[Q1,…,Qn]), FKey(C1,[Q1,…,Qn],C2,[R1,…,Rn])

Cardinality constraints

Min(C,n,R), Max(C,n,R) for n ∈ ℕ

Functionality constraints, totality constraints

Func(C,Q), Total(C,Q)

And many more in the full paper: singleton, subclass, subproperty, property domain, property range
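The cardinality-style constraints can be sketched as simple predicates over a set of triples. The reading used here is an assumption: Min and Max bound the number of distinct values an instance of C has for a property, Func means at most one value and Total means at least one. The full paper has the precise definitions. All function names and the value "Freddy" are hypothetical.

```python
def instances(graph, cls):
    """All nodes typed with cls."""
    return {s for s, p, o in graph if p == "rdf:type" and o == cls}


def value_count(graph, node, prop):
    """Number of distinct values of prop at node."""
    return len({o for s, p, o in graph if s == node and p == prop})


def holds_min(graph, cls, n, prop):
    """Min(C,n,R): every instance of C has at least n values for R (assumed reading)."""
    return all(value_count(graph, x, prop) >= n for x in instances(graph, cls))


def holds_max(graph, cls, n, prop):
    """Max(C,n,R): every instance of C has at most n values for R (assumed reading)."""
    return all(value_count(graph, x, prop) <= n for x in instances(graph, cls))


def holds_func(graph, cls, prop):
    """Func(C,Q), read here as Max(C,1,Q)."""
    return holds_max(graph, cls, 1, prop)


def holds_total(graph, cls, prop):
    """Total(C,Q), read here as Min(C,1,Q)."""
    return holds_min(graph, cls, 1, prop)


# Hypothetical test graph: t2 carries two names, so name is total but not functional.
GRAPH = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"), ("t2", "name", "Freddy"),
}
```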

Page 11: Satisfiability

Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?

• In general undecidable
• Shown by reduction from the key implication problem in relational databases
• In the paper, we indicate
  • satisfiable constraint subclasses
  • decidable constraint subclasses

Page 12: The SPARQL Query Language

SELECT ?name ?faculty ?title
WHERE {
  ?teacher rdf:type Teachers.
  ?teacher name ?name.
  ?teacher faculty ?faculty.
  OPTIONAL { ?teacher title ?title. }
}

• Declarative language
• Based on graph patterns that are matched against the input graph
• Different operators to combine these patterns: AND ("."), OPTIONAL, UNION, FILTER

Page 13: SPARQL Query Evaluation

SELECT ?name ?faculty ?title
WHERE {
  ?teacher rdf:type Teachers.
  ?teacher name ?name.
  ?teacher faculty ?faculty.
  OPTIONAL { ?teacher title ?title. }
}

[Figure: input graph with the Teachers instances t1 (name Joe, faculty "CS") and t2 (name Fred, faculty "CS", title "Professor"); the variables ?teacher, ?name and ?faculty are matched against the graph, and ?title stays unbound where no title exists]

Result:
?name | ?faculty | ?title
Joe   | "CS"     |
Fred  | "CS"     | "Professor"

Variables are matched against the input graph
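The matching of variables against the input graph can be sketched with a very small pattern matcher. This is only an illustration of AND and OPTIONAL on the slide's example, not a SPARQL engine; the function names `extend`, `match_all` and `optional` are hypothetical, and terms are plain strings with a leading "?" marking variables.

```python
GRAPH = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"), ("t1", "faculty", "CS"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"), ("t2", "faculty", "CS"),
    ("t2", "title", "Professor"),
}


def extend(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                 # variable
            if result.setdefault(term, value) != value:
                return None
        elif term != value:                      # constant must match exactly
            return None
    return result


def match_all(graph, patterns, start):
    """AND: keep the bindings that match every triple pattern."""
    bindings = [start]
    for pattern in patterns:
        bindings = [b for old in bindings for t in graph
                    if (b := extend(pattern, t, old)) is not None]
    return bindings


def optional(graph, bindings, patterns):
    """OPTIONAL: extend a binding when possible, otherwise keep it unchanged."""
    result = []
    for binding in bindings:
        result.extend(match_all(graph, patterns, binding) or [binding])
    return result


required = [("?teacher", "rdf:type", "Teachers"),
            ("?teacher", "name", "?name"),
            ("?teacher", "faculty", "?faculty")]
solutions = optional(GRAPH, match_all(GRAPH, required, {}),
                     [("?teacher", "title", "?title")])
# Project to (?name, ?faculty, ?title); None marks an unbound ?title.
rows = sorted((s["?name"], s["?faculty"], s.get("?title")) for s in solutions)
```

The row for Joe keeps `None` in the title position, which corresponds to the unbound ?title in the result table.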

Page 14: Extracting Key Constraints

SELECT ?keyname ?class ?keyatt
WHERE {
  ?class rdfc:Key ?keyname.
  ?keyname rdf:type rdfc:Key.
  ?keyname ?seq ?keyatt.
  FILTER (?seq!=rdf:type)
}

[Figure: Teachers is linked via rdfc:Key to T_Key; T_Key is typed rdfc:Key and rdf:Seq and has rdf:_1 name]

Result:
?keyname | ?class   | ?keyatt
T_Key    | Teachers | name

… …

Extraction of foreign keys is very similar
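The extraction query can be mirrored in a few lines over a set of triples. The schema triples below are a reading of the figure (in particular the rdf:Seq typing of T_Key is an assumption), and `extract_keys` is a hypothetical helper name.

```python
# Assumed schema-layer triples for the key of Teachers.
SCHEMA = {
    ("Teachers", "rdfc:Key", "T_Key"),
    ("T_Key", "rdf:type", "rdfc:Key"),
    ("T_Key", "rdf:type", "rdf:Seq"),
    ("T_Key", "rdf:_1", "name"),
}


def extract_keys(graph):
    """Return (keyname, class, keyatt) rows, following the three patterns and the FILTER."""
    rows = []
    for cls, predicate, keyname in graph:
        if predicate != "rdfc:Key":                       # ?class rdfc:Key ?keyname
            continue
        if (keyname, "rdf:type", "rdfc:Key") not in graph:  # ?keyname rdf:type rdfc:Key
            continue
        for subject, seq, keyatt in graph:                # ?keyname ?seq ?keyatt
            if subject == keyname and seq != "rdf:type":  # FILTER (?seq != rdf:type)
                rows.append((keyname, cls, keyatt))
    return sorted(rows)
```

The FILTER is what removes the two rdf:type triples of T_Key, so only the container members (rdf:_1, rdf:_2, …) remain as key attributes.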

Page 15: Checking Constraints with SPARQL

• Constraint checks are possible for many types of constraints
• A SPARQL query checks a constraint C if it returns "yes" for each graph that violates C, and "no" otherwise
• Use the SPARQL "ASK" query form (returns "yes" exactly if the query has at least one result, "no" otherwise)

Page 16: Checking Constraints with SPARQL

Checking primary key constraints

Key(C,[p1,…,pn])

ASK {
  ?x rdf:type C.
  ?y rdf:type C.
  ?x p1 ?p1; [...]; pn ?pn.
  ?y p1 ?p1; [...]; pn ?pn.
  FILTER (?x!=?y)
}

Returns "yes" exactly if the constraint is violated.

Checking of foreign keys is a little more complicated, but also possible
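The logic of the ASK query for Key(C,[p1,…,pn]) can be sketched over a set of triples: the check reports a violation exactly if two distinct instances of C share a value for every key property. The helper names and the extra node t3 are hypothetical and only serve the example.

```python
from itertools import combinations


def instances(graph, cls):
    """All nodes typed with cls."""
    return {s for s, p, o in graph if p == "rdf:type" and o == cls}


def values(graph, node, prop):
    """All values of prop at node."""
    return {o for s, p, o in graph if s == node and p == prop}


def violates_key(graph, cls, props):
    """True exactly if two distinct instances of cls agree on every property in props."""
    for x, y in combinations(sorted(instances(graph, cls)), 2):
        # A shared value per property corresponds to binding ?p1 ... ?pn for both ?x and ?y.
        if all(values(graph, x, p) & values(graph, y, p) for p in props):
            return True
    return False


OK_GRAPH = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"),
}
# Hypothetical third teacher that reuses the name "Joe".
BAD_GRAPH = OK_GRAPH | {("t3", "rdf:type", "Teachers"), ("t3", "name", "Joe")}
```

Iterating over unordered pairs is enough because the condition is symmetric in ?x and ?y.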

Page 17: Semantic Query Optimization

Idea: use constraint knowledge to find a more efficient query execution plan

Has been studied in the context of relational and datalog databases…

… and now is applicable in the context of RDF and SPARQL

Page 18: Semantic Query Optimization

SELECT ?teachername ?coursename ?studentname
WHERE {
  ?course rdf:type Courses;
          taught_by ?teachername;
          name ?coursename.
  ?participant rdf:type Participants;
               c_id ?teachername;
               s_id ?studentmatric.
  ?teacher rdf:type Teachers;
           name ?teachername.
  OPTIONAL {
    ?student rdf:type Students;
             matric ?studentmatric;
             name ?studentname.
  }
}

Page 19: A Solution Candidate Subgraph

[Figure: the example RDF graph with the classes Teachers, Students, Courses and Participants and the instances t1, t2, s1, s2, c1, c2, p1, p2, illustrating a solution candidate subgraph for the query]

Page 20: Semantic Query Optimization

SELECT ?teachername ?coursename ?studentname
WHERE {
  ?course rdf:type Courses;
          taught_by ?teachername;
          name ?coursename.
  ?participant rdf:type Participants;
               c_id ?teachername;
               s_id ?studentmatric.
  ?teacher rdf:type Teachers;
           name ?teachername.
  OPTIONAL {
    ?student rdf:type Students;
             matric ?studentmatric;
             name ?studentname.
  }
}

Key(Students,[matric])

FKey(Participants,[s_id],Students,[matric])

Total(Students,name)

Under these constraints every participant refers to an existing student that has a name, so the OPTIONAL block always finds a match and can be replaced by a plain AND.

Page 21: Semantic Query Optimization

SELECT ?teachername ?coursename ?studentname
WHERE {
  ?course rdf:type Courses;
          taught_by ?teachername;
          name ?coursename.
  ?participant rdf:type Participants;
               c_id ?teachername;
               s_id ?studentmatric.
  ?teacher rdf:type Teachers;
           name ?teachername.
  ?student rdf:type Students;
           matric ?studentmatric;
           name ?studentname.
}

Key(Teachers,[name])

FKey(Courses,[taught_by],Teachers,[name])

Under these constraints every taught_by value refers to exactly one teacher, so the ?teacher pattern is redundant and can be dropped.

Page 22: Semantic Query Optimization

SELECT ?teachername ?coursename ?studentname
WHERE {
  ?course rdf:type Courses;
          taught_by ?teachername;
          name ?coursename.
  ?participant rdf:type Participants;
               c_id ?teachername;
               s_id ?studentmatric.
  ?student rdf:type Students;
           matric ?studentmatric;
           name ?studentname.
}

Many more optimizations possible

• Rewriting of filter expressions
• Elimination of redundant rdf:type specifications
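A small check of the last rewriting step on the example data: with the key and foreign key on teacher names in place, the query with and without the ?teacher pattern yields the same rows. The matcher `match_all` is a hypothetical, deliberately naive helper for basic graph patterns, and the triples are transcribed from the tables on Page 6.

```python
GRAPH = {
    ("t1", "rdf:type", "Teachers"), ("t1", "name", "Joe"), ("t1", "faculty", "CS"),
    ("t2", "rdf:type", "Teachers"), ("t2", "name", "Fred"), ("t2", "faculty", "CS"),
    ("s1", "rdf:type", "Students"), ("s1", "matric", "11111"), ("s1", "name", "John"),
    ("s2", "rdf:type", "Students"), ("s2", "matric", "22222"), ("s2", "name", "Ed"),
    ("c1", "rdf:type", "Courses"), ("c1", "taught_by", "Joe"), ("c1", "name", "DB"),
    ("c2", "rdf:type", "Courses"), ("c2", "taught_by", "Fred"), ("c2", "name", "Web"),
    ("p1", "rdf:type", "Participants"), ("p1", "c_id", "Fred"), ("p1", "s_id", "11111"),
    ("p2", "rdf:type", "Participants"), ("p2", "c_id", "Fred"), ("p2", "s_id", "22222"),
}


def match_all(graph, patterns):
    """Return all variable bindings that match every triple pattern (AND)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = dict(binding)
                ok = True
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):      # variable: bind or compare
                        ok = candidate.setdefault(term, value) == value
                    else:                         # constant: must be equal
                        ok = term == value
                    if not ok:
                        break
                if ok:
                    extended.append(candidate)
        bindings = extended
    return bindings


COMMON = [
    ("?course", "rdf:type", "Courses"), ("?course", "taught_by", "?teachername"),
    ("?course", "name", "?coursename"),
    ("?participant", "rdf:type", "Participants"), ("?participant", "c_id", "?teachername"),
    ("?participant", "s_id", "?studentmatric"),
    ("?student", "rdf:type", "Students"), ("?student", "matric", "?studentmatric"),
    ("?student", "name", "?studentname"),
]
TEACHER = [("?teacher", "rdf:type", "Teachers"), ("?teacher", "name", "?teachername")]


def project(bindings):
    """SELECT ?teachername ?coursename ?studentname, collected as a list of rows."""
    return sorted((b["?teachername"], b["?coursename"], b["?studentname"]) for b in bindings)


with_teacher = project(match_all(GRAPH, COMMON + TEACHER))      # query of Page 21
without_teacher = project(match_all(GRAPH, COMMON))             # query of Page 22
```

Lists rather than sets are compared so that a difference in duplicates would also show up; the key on teacher names is what rules such duplicates out.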

Page 23: Future Work

• Study of other types of constraints and the interaction between constraints
• Development of a schematic approach to Semantic Query Optimization
  • Mapping to SQL/Datalog?
  • SPARQL-specific semantic optimizations?
• Efficient constraint checking algorithms

Page 24: Thank you for your attention!


Page 25: Additional Resources

Page 26: Checking Constraints with SPARQL

Checking foreign key constraints

FKey(C,[p1,…,pn],D,[q1,…,qn])

ASK {
  ?x rdf:type C; p1 ?p1; [...]; pn ?pn.
  OPTIONAL { ?y rdf:type D; q1 ?p1; [...]; qn ?pn. }
  FILTER (!bound(?y))
}

• First pattern: bind objects of type C, with properties bound to ?p1, …, ?pn
• OPTIONAL: bind the (referenced) object to variable ?y, if any
• FILTER: only keep results for which no referenced object exists
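The three steps of the foreign key check can be sketched over a set of triples: take each instance of C with its values for p1, …, pn, look for an instance of D carrying the same values under q1, …, qn, and report a violation if none exists. Helper names and the extra participant p3 are hypothetical.

```python
from itertools import product


def instances(graph, cls):
    """All nodes typed with cls."""
    return {s for s, p, o in graph if p == "rdf:type" and o == cls}


def values(graph, node, prop):
    """All values of prop at node."""
    return {o for s, p, o in graph if s == node and p == prop}


def violates_fkey(graph, c, ps, d, qs):
    """True exactly if some instance of c has values for ps with no matching instance of d."""
    targets = instances(graph, d)
    for x in instances(graph, c):
        # Every combination of values corresponds to one binding of ?p1 ... ?pn.
        for combo in product(*(values(graph, x, p) for p in ps)):
            referenced = any(
                all((y, q, v) in graph for q, v in zip(qs, combo)) for y in targets
            )
            if not referenced:          # ?y stays unbound, so the FILTER keeps this result
                return True
    return False


OK_GRAPH = {
    ("s1", "rdf:type", "Students"), ("s1", "matric", "11111"),
    ("s2", "rdf:type", "Students"), ("s2", "matric", "22222"),
    ("p1", "rdf:type", "Participants"), ("p1", "s_id", "11111"),
    ("p2", "rdf:type", "Participants"), ("p2", "s_id", "22222"),
}
# Hypothetical participant pointing to a matriculation number that no student has.
BAD_GRAPH = OK_GRAPH | {("p3", "rdf:type", "Participants"), ("p3", "s_id", "33333")}
```

An instance of C that lacks one of the properties p1, …, pn produces no combination and is therefore not reported, which matches the behaviour of the first pattern in the ASK query.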

Page 27: RDFS Constraints

Let Ci denote classes, Qi denote properties

Subclass Constraint

SubC(C1,C2)

Subproperty Constraint

SubP(Q1,Q2)

Property Domain/Range

PropD(Q,C), PropR(Q,C)

Restrict the state space of the database

No „axioms“ that are used for inferencing

Page 28: Satisfiability

Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?

In general undecidable.

Always satisfiable:

• Primary keys + foreign keys
• Singleton
• Max-cardinality
• Subclass + subproperty
• Property domain + property range

Page 29: Satisfiability

Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?

In general undecidable.

Undecidable:

• Primary keys + foreign keys
• Singleton
• Max-cardinality
• Subclass + subproperty
• Property domain + property range
• Min-cardinality

Page 30: Satisfiability

Given an RDF vocabulary and a set of constraints: is there a non-empty RDF graph that satisfies the constraints?

In general undecidable.

Decidable in ExpTime:

• Unary primary keys
• Unary foreign keys
• Min-cardinality + max-cardinality
• Subclass + subproperty
• Property domain + property range

Page 31: The SPARQL Query Language

Operator AND (".")

SELECT ?name ?faculty
WHERE {
  ?teacher rdf:type Teachers.
  ?teacher name ?name.
  ?teacher faculty ?faculty.
}

[Figure: input graph with the Teachers instances t1 (name Joe, faculty "CS") and t2 (name Fred, faculty "CS")]

Result:
?name | ?faculty
Joe   | "CS"
Fred  | "CS"

Page 32: The SPARQL Query Language

Operator UNION

SELECT ?name ?faculty
WHERE {
  {
    ?teacher rdf:type Teachers.
    ?teacher name ?name.
    ?teacher faculty ?faculty.
    FILTER (?name="Joe").
  }
  UNION
  {
    ?teacher rdf:type Teachers.
    ?teacher name ?name.
    ?teacher faculty ?faculty.
    FILTER (?name="Fred").
  }
}

[Figure: input graph with the Teachers instances t1 (name Joe, faculty "CS") and t2 (name Fred, faculty "CS")]

Result:
?name | ?faculty
Joe   | "CS"
Fred  | "CS"

Page 33: The SPARQL Query Language

Operator FILTER

SELECT ?name ?faculty
WHERE {
  ?teacher rdf:type Teachers.
  ?teacher name ?name.
  ?teacher faculty ?faculty.
  FILTER (?name="Joe")
}

[Figure: input graph with the Teachers instances t1 (name Joe, faculty "CS") and t2 (name Fred, faculty "CS")]

Result:
?name | ?faculty
Joe   | "CS"

Page 34: The SPARQL Query Language

Operator OPTIONAL

SELECT ?name ?faculty ?title
WHERE {
  ?teacher rdf:type Teachers.
  ?teacher name ?name.
  ?teacher faculty ?faculty.
  OPTIONAL { ?teacher title ?title. }
}

[Figure: input graph with the Teachers instances t1 (name Joe, faculty "CS") and t2 (name Fred, faculty "CS", title "Professor")]

Result:
?name | ?faculty | ?title
Joe   | "CS"     |
Fred  | "CS"     | "Professor"