WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011
Page 1: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

WP3: Provenance and Access Control

Irini Fundulaki, Giorgos Flouris

Institute of Computer Science-FORTH

1st year review

Luxembourg, December 2011

Page 2: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

WP3: Work Plan View

[Figure: timeline in project months showing the tasks and deliverables listed below, labelled with the partners FORTH and EPFL.]

Task 3.1 Provenance Management

Task 3.2 Privacy, DRM and Access Control

Task 3.3 Trust management

D 3.1 Access control specification language, reasoning and enforcement mechanisms

D 3.2 Provenance management and propagation through SPARQL query and update languages

D 3.3 Access control system and privacy-aware language

D 3.4 Trust management and inference system

Page 3: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Research Topics, Tasks and Partners

Objective: manage annotations of different forms and semantics over data, related to data access

Research Topics: Provenance, Access Control, Privacy, Digital Rights Management (DRM), Trust Management

Partners: FORTH, EPFL, KIT

Page 4: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Provenance

• Wikipedia: “… the origin or source of something or the history of the ownership or location of an object”

• W3C Incubator Group: “… is a record that describes entities and processes involved in producing and delivering or otherwise influencing that resource. […] Provenance assertions are a form of contextual metadata and can themselves become important records with their own provenance.”

Page 5: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Provenance

• W3C Incubator Group: “With the arrival of massive amounts of Semantic Web Data […], provenance becomes an important factor in developing new Semantic Web applications.”

• Applications
– Data Trustworthiness, Reputation and Reliability
– Information Quality
– Data Integration and Exchange
– Reproducibility
– Argumentation (Decision Justification)
– Access Control
– Accountability
– Reasoning

Page 6: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Types of Provenance

• Coarse grained provenance used to reproduce a digital object or repeat an experiment (complex programs)

I → P → O: coarse grained (workflow or dataflow provenance)

• Fine grained provenance refers to the transport of annotations between input and output data (query languages)

I’ → P’ → O’: fine grained (data provenance)

Page 7: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Workflow Provenance: Sensor Scenario

[Figure: sensors S1 and S2 each provide readings of sea temperature and wind to a complex computation that predicts the height of waves.]

Provenance: complex program executed on input data

Page 8: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Data Provenance: Sensor Scenario

Provenance: annotations of the input tuples that contributed to the query results

R1 (sensor readings)

Sensor  Readgs  Time   Annot.
S1      8B      00:19  t1
S2      2B      01:50  t2

R2 (sensor database)

Sensor  Latitude      Annot.
S1      23° 26’ 21”N  t3
S2      23° 26’ 21”N  t4

R1 ⋈ R2 (DB Server)

Sensor  Readgs  Time   Latitude      Annot.
S1      8B      00:19  23° 26’ 21”N  {t1,t3}
S2      2B      01:50  23° 26’ 21”N  {t2,t4}
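The join on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Annotated` type and the `natural_join` helper are hypothetical names, and each annotation is modelled as the set of input tuple identifiers that contributed to a result tuple.

```python
from typing import NamedTuple

class Annotated(NamedTuple):
    row: dict          # attribute name -> value
    annot: frozenset   # identifiers of the contributing input tuples

# The two relations of the sensor scenario, one annotation token per tuple.
R1 = [
    Annotated({"Sensor": "S1", "Readgs": "8B", "Time": "00:19"}, frozenset({"t1"})),
    Annotated({"Sensor": "S2", "Readgs": "2B", "Time": "01:50"}, frozenset({"t2"})),
]
R2 = [
    Annotated({"Sensor": "S1", "Latitude": "23° 26' 21\"N"}, frozenset({"t3"})),
    Annotated({"Sensor": "S2", "Latitude": "23° 26' 21\"N"}, frozenset({"t4"})),
]

def natural_join(left, right):
    """Join tuples that agree on their shared attributes and
    annotate each result with the union of the input annotations."""
    out = []
    for l in left:
        for r in right:
            shared = l.row.keys() & r.row.keys()
            if all(l.row[k] == r.row[k] for k in shared):
                out.append(Annotated({**l.row, **r.row}, l.annot | r.annot))
    return out

result = natural_join(R1, R2)
for t in result:
    print(t.row["Sensor"], sorted(t.annot))
```

Each result tuple carries the identifiers of the R1 and R2 tuples it was built from, which is the information shown in the Annot. column of the joined table.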

Page 9: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Data Provenance Models

Annotation Models: provenance computation is coupled with a particular application and a particular assignment of the provenance of source data

R1

Sensor  Readgs  Time   Annot.
S1      8B      00:19  1
S2      2B      01:50  0

R2

Sensor  Latitude      Annot.
S1      23° 26’ 21”N  1
S2      23° 26’ 21”N  1

The annotation of a join tuple is computed using operator x: 0 x 0 = 0, 1 x 0 = 0, 1 x 1 = 1

R1 ⋈ R2

Sensor  Readgs  Time   Latitude      Annot.
S1      8B      00:19  23° 26’ 21”N  1
S2      2B      01:50  23° 26’ 21”N  0

When the annotation of the input tuple changes, we must re-execute the query to obtain the annotation of the result tuples

Page 10: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Data Provenance Models

Abstract Models: provenance annotations (referred to as tokens) and operators are abstract.

R1

Sensor  Readgs  Time   Annot.
S1      8B      00:19  T1
S2      2B      01:50  T2

R2

Sensor  Latitude      Annot.
S1      23° 26’ 21”N  T3
S2      23° 26’ 21”N  T4

The annotation of a join tuple is modeled by the “x” operator

R1 ⋈ R2

Sensor  Readgs  Time   Latitude      Annot.
S1      8B      00:19  23° 26’ 21”N  T1 x T3
S2      2B      01:50  23° 26’ 21”N  T2 x T4

When the annotation of the input tuple changes, the annotation of the result tuple is re-computed by evaluating the annotation expression only

Page 11: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Data Provenance Models

Abstract Models: abstract tokens and operators are assigned concrete values only when the concrete value of an annotation must be computed

Data Quality Application:

• abstract tokens T1, T2, T3, T4 take values 1 and 0

• abstract operator “x” is replaced by logical AND

R1

Sensor  Readgs  Time   Annot.
S1      8B      00:19  T1 = 1
S2      2B      01:50  T2 = 0

R2

Sensor  Latitude      Annot.
S1      23° 26’ 21”N  T3 = 1
S2      23° 26’ 21”N  T4 = 1

R1 ⋈ R2

Sensor  Readgs  Time   Latitude      Annot.
S1      8B      00:19  23° 26’ 21”N  T1 x T3 = 1 AND 1
S2      2B      01:50  23° 26’ 21”N  T2 x T4 = 0 AND 1
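The idea of keeping the annotation expression abstract and plugging in concrete values later can be sketched as follows. The class and function names (`Token`, `Times`, `evaluate`) are hypothetical, and the second "trust" application with `min` as the operator is an invented illustration, not something stated on the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    name: str            # abstract provenance token, e.g. "T1"

@dataclass(frozen=True)
class Times:
    left: object         # abstract "x" operator applied to two sub-expressions
    right: object

def evaluate(expr, values, times):
    """Evaluate an abstract expression given concrete token values
    and a concrete binary function standing in for "x"."""
    if isinstance(expr, Token):
        return values[expr.name]
    return times(evaluate(expr.left, values, times),
                 evaluate(expr.right, values, times))

# Abstract annotations of the two join tuples from the slide.
annotations = {
    "S1": Times(Token("T1"), Token("T3")),
    "S2": Times(Token("T2"), Token("T4")),
}

# Data quality application: tokens are 0/1 and "x" is logical AND.
quality_values = {"T1": 1, "T2": 0, "T3": 1, "T4": 1}
quality = {s: evaluate(e, quality_values, lambda a, b: a and b)
           for s, e in annotations.items()}

# A token changes: only the stored expression is re-evaluated, not the query.
quality_values["T2"] = 1
quality_after_update = evaluate(annotations["S2"], quality_values, lambda a, b: a and b)

# Hypothetical second application over the same data: numeric trust, "x" is min.
trust_values = {"T1": 3, "T2": 1, "T3": 2, "T4": 5}
trust = {s: evaluate(e, trust_values, min) for s, e in annotations.items()}
```

The same stored expressions serve both applications; only the value assignment and the function passed for the operator differ.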

Page 12: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Abstract Data Provenance Models

• Benefits:
– in the presence of provenance updates in the input, we need to evaluate the value of the provenance of the affected tuples only
– different applications can assign different concrete values to abstract tokens and operators, for the same data

• Challenges: trade-off between provenance storage and computation efficiency
– storage of large provenance expressions
– efficient computation of provenance for dynamic data

Page 13: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Data Provenance

• RDFS reasoning
– Given a set of RDF triples whose explicit provenance is known, and RDFS reasoning rules, what is the provenance of the implicit RDF triples?

• SPARQL
– Given a set of RDF triples whose explicit provenance is known, and a SPARQL query, what is the provenance of the query result?

Page 14: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

RDFS Reasoning

Given a set of RDF triples (RDF Graph) whose explicit provenance is known, and RDFS entailment rules, what is the provenance of the implicit RDF triples?

RDFS entailment rules (premises ⇒ conclusion):

(A1, sc, A2), (A2, sc, A3) ⇒ (A1, sc, A3)

(&r, type, A1), (A1, sc, A2) ⇒ (&r, type, A2)

[Figure: fragment of the SSN Ontology with the classes System, Device, Sensing Device and Sensor and the instance &s1, connected by type and sc (subclassOf) edges. The explicit triples are colored C1, C2, C3 and C4; the colors of the implicit triples are marked “?”.]

Page 15: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

RDFS Reasoning

• colors to capture the provenance of explicit and implicit data and schema RDF triples

• quadruples to represent provenance information

• Provenance model: commutative semi-group structure (C, +)

– C: set of colors

– binary operation “+” to compose colors of the input triples
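A minimal sketch of how colors could be composed during RDFS inference is given below. It is not the authors' implementation: it assumes that a color is a set of color names and that “+” is set union (one possible commutative operation), and the three example quadruples are a hypothetical reading of the SSN figure.

```python
def plus(c1, c2):
    """Assumed concrete "+": union of color sets (commutative and associative)."""
    return c1 | c2

def closure(quads):
    """Close a set of (subject, predicate, object, color) quadruples under
    the two rules: sc transitivity and type inheritance along sc."""
    known = set(quads)
    changed = True
    while changed:
        changed = False
        for s1, p1, o1, c1 in list(known):
            for s2, p2, o2, c2 in list(known):
                # (x, sc|type, A), (A, sc, B)  =>  (x, sc|type, B)
                if p1 in ("sc", "type") and p2 == "sc" and o1 == s2:
                    q = (s1, p1, o2, plus(c1, c2))
                    if q not in known:
                        known.add(q)
                        changed = True
    return known

# Hypothetical explicit quadruples (edge choice and colors are assumptions).
explicit = {
    ("SensingDevice", "sc", "Device", frozenset({"C1"})),
    ("Device", "sc", "System", frozenset({"C2"})),
    ("&s1", "type", "SensingDevice", frozenset({"C4"})),
}
closed = closure(explicit)
```

Because union does not depend on operand order, the implicit triple `(&s1, type, System)` receives the same color whichever derivation path is followed.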

Page 16: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

RDFS Reasoning

• Pediaditis P., Flouris G., Fundulaki I., Christophides V. On Explicit Provenance Management in RDF/S Graphs. In Theory and Practice of Provenance (TaPP-2009)

• Flouris G., Fundulaki I., Pediaditis P., Theoharis Y., Christophides V. Coloring RDF Triples to capture Provenance. In ISWC 2009.

Page 17: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Provenance for SPARQL

• We showed that existing provenance models for positive relational algebra can capture the provenance of SPARQL (without OPTIONAL)

• We follow the approach by Karvounarakis et al. in Provenance Semirings, PODS 2007 to develop a model for full SPARQL
– records the input tuples and the operators used to compute the query results

Given a set of RDF triples (RDF Graph) whose explicit provenance is known, and a SPARQL query, what is the provenance of the result?

Page 18: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Provenance Model for SPARQL+

• K: set of provenance tokens
• a binary operator for SPARQL join
• a binary operator for SPARQL union

subject  predicate  object        prov
S1       type       Sensor        t1
S1       Readgs     &r1           t5
S1       Latitude   23° 26’ 21”N  t2
S2       type       Sensor        t3
S2       Readgs     &r2           t6
S2       Latitude   23° 26’ 21”N  t4
&r1      value      8B            t7
&r1      time       00:19         t8
&r2      value      2B            t9
&r2      time       01:50         t10

SPARQL Query: return the sensor and its latitude

select ?s, ?l where { ?s type Sensor . ?s Latitude ?l }

Page 19: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Provenance Model for SPARQL+

Q = ?s type Sensor . ?s Latitude ?l

The evaluation of a triple pattern over T is a set of mappings (?variable, ?value)

?s type Sensor

Mapping  ?s  prov
1        S1  t1
2        S2  t3

?s Latitude ?l

Mapping  ?s  ?l            prov
3        S1  23° 26’ 21”N  t2
4        S2  23° 26’ 21”N  t4

(The RDF triples and their provenance tokens t1 to t10 are those listed on the previous slide.)

Page 20: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Provenance Model for SPARQL+

Q = ?s type Sensor . ?s Latitude ?l

?s type Sensor

Mapping  ?s  prov
1        S1  t1
2        S2  t3

?s Latitude ?l

Mapping  ?s  ?l            prov
3        S1  23° 26’ 21”N  t2
4        S2  23° 26’ 21”N  t4

The result of a join between two triple patterns contains all mappings that have the same value for their common variable(s)

Result of Q

?s  ?l            prov
S1  23° 26’ 21”N  t1 and t2, combined by the join operator
S2  23° 26’ 21”N  t3 and t4, combined by the join operator

(The RDF triples and their provenance tokens t1 to t10 are those listed on slide 18.)
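The evaluation walked through on the last three slides can be sketched in Python as follows. The helper names `match` and `join` are hypothetical, and the provenance of a joined mapping is represented simply as a nested tuple `("join", p1, p2)`; this is an illustrative encoding, not the model's own notation.

```python
LAT = "23° 26' 21\"N"

# Quadruples (subject, predicate, object, provenance token) from the slides.
TRIPLES = [
    ("S1", "type", "Sensor", "t1"), ("S1", "Readgs", "&r1", "t5"),
    ("S1", "Latitude", LAT, "t2"), ("S2", "type", "Sensor", "t3"),
    ("S2", "Readgs", "&r2", "t6"), ("S2", "Latitude", LAT, "t4"),
    ("&r1", "value", "8B", "t7"), ("&r1", "time", "00:19", "t8"),
    ("&r2", "value", "2B", "t9"), ("&r2", "time", "01:50", "t10"),
]

def match(pattern, triples):
    """Evaluate one triple pattern: return (mapping, provenance) pairs."""
    results = []
    for s, p, o, prov in triples:
        mapping = {}
        ok = True
        for term, value in zip(pattern, (s, p, o)):
            if term.startswith("?"):
                # A variable binds to the value, consistently if repeated.
                if mapping.get(term, value) != value:
                    ok = False
                    break
                mapping[term] = value
            elif term != value:
                ok = False
                break
        if ok:
            results.append((mapping, prov))
    return results

def join(left, right):
    """Keep pairs of mappings that agree on common variables and
    record both input provenances in the result."""
    out = []
    for m1, p1 in left:
        for m2, p2 in right:
            if all(m1[v] == m2[v] for v in m1.keys() & m2.keys()):
                out.append(({**m1, **m2}, ("join", p1, p2)))
    return out

sensors = match(("?s", "type", "Sensor"), TRIPLES)
latitudes = match(("?s", "Latitude", "?l"), TRIPLES)
answers = join(sensors, latitudes)
```

The provenance expression of each answer names exactly the two triples that produced it, so it can be re-evaluated later under any concrete interpretation of the join operator.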

Page 21: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Provenance for SPARQL

• Theoharis Y., Fundulaki I., Karvounarakis G., Christophides V. On Provenance of Queries on Linked Web Data. In IEEE Internet Computing: Provenance in Web Applications, 2011.

Page 22: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Access Control

• Refers to the ability to permit or deny the use of a particular resource by a particular entity

• Crucial for sensitive content since it ensures the selective exposure of information to different classes of users

Page 23: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

RDF Access Control

• In general, an access control model specifies
– the access annotations
– conflict resolution policy to resolve ambiguous access annotations
– default semantics used to annotate data that are not in the scope of any authorization

• Access Authorizations specify (by a query) the access annotations for data

Page 24: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Access Control

• Access Annotations can be
– boolean values
  • true/false (grant/deny access permission)
– confidentiality levels
  • low, medium, high

• Conflict Resolution Policy depends on the type of access annotations
– boolean values:
  • deny overrides grant access annotation
– confidentiality levels:
  • high confidentiality overrides medium, medium overrides low

• Default Semantics depend on the type of access annotations
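The two conflict resolution policies listed above could look like this in code. This is a sketch only: the function names are hypothetical, and the default values used when no authorization applies (deny, and the highest confidentiality level) are assumptions, since the slide does not fix the default semantics.

```python
LEVELS = ["low", "medium", "high"]   # ordered from least to most confidential

def resolve_boolean(annotations, default=False):
    """Deny overrides grant: access is granted only if every annotation grants it.
    `default` (an assumption) applies when no authorization covers the data."""
    if not annotations:
        return default
    return all(annotations)

def resolve_level(annotations, default="high"):
    """The highest confidentiality level among the annotations wins.
    `default` (an assumption) applies when no authorization covers the data."""
    if not annotations:
        return default
    return max(annotations, key=LEVELS.index)

# One triple that received several, possibly conflicting, annotations.
print(resolve_boolean([True, False]))          # conflicting grant and deny
print(resolve_level(["low", "high", "medium"]))
print(resolve_boolean([]))                     # outside the scope of any authorization
```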

Page 25: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Fine-grained Access Control Framework for RDF Data

• We encode access annotations of RDF triples using quadruples

• We propose an abstract access control model defined by a set of abstract tokens and abstract operators to model
– the computation of access annotations of RDF triples considering RDFS inference
– the propagation of access annotations
– conflicting and missing access annotations

Page 26: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Abstract Tokens

• L: set of abstract access control tokens

• a default access token in L
– assigned to triples that have no explicitly assigned access token

Page 27: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Abstract Operators

• Entailment Operator ⊙ to compute the access annotations of implied quadruples

• Propagation Operator to model the propagation of access annotations

• Conflict Resolution Operator to resolve ambiguous access annotations

Page 28: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Entailment Operator ⊙

• binary operator to model the computation of the annotation of an implicit RDF quadruple for the subclass, subproperty and type hierarchies in an RDF graph

(A1, sc, A2, l1), (A2, sc, A3, l2) ⇒ (A1, sc, A3, l1 ⊙ l2)

– Properties:

• Associativity: (l4 ⊙ l1) ⊙ l2 = l4 ⊙ (l1 ⊙ l2)

• Commutativity: l4 ⊙ l1 = l1 ⊙ l4

The order of the application of inference rules is not important

Page 29: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Entailment Operator ⊙

(A1, sc, A2, l1), (A2, sc, A3, l2) ⇒ (A1, sc, A3, l1 ⊙ l2)

(&r, type, A2, l1), (A2, sc, A3, l2) ⇒ (&r, type, A3, l1 ⊙ l2)

[Figure: class hierarchy over System, Device, Sensing Device, Sensor and rdfs:Class with the instance &s1, connected by type and sc (subclassOf) edges. The explicit triples are annotated l0, l1, l2, l3 and l4; the implied triples are annotated l4 ⊙ l1, l1 ⊙ l2 and (l4 ⊙ l1) ⊙ l2.]

Page 30: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Propagation Operator

• unary operator (written ⊗ here) to model propagation of access annotations along the subclass/subproperty and type hierarchies in an RDF Graph
– a class inherits the annotation of its superclass, an instance of a class inherits the annotation of its class, etc.

(A1, type, class, l1), (&r1, type, A1, l2) ⇒ (&r1, type, A1, ⊗(l1))

– Properties:

• Idempotence: ⊗(⊗(l0)) = ⊗(l0)

We do not care how many times an annotation is propagated

Page 31: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Propagation Operator

[Figure: class hierarchy over System, Device, Sensing Device, Sensor and rdfs:Class with the instance &s1, connected by type and sc (subclassOf) edges and annotated l0, l1, l2, l3 and l4, shown before and after propagation. The implied annotation (l4 ⊙ l1) ⊙ l2 is also shown.]

Propagation rule (propagation written ⊗ here):

(&r, type, A1, l1), (A1, type, rdfs:Class, l2) ⇒ (&r, type, A1, ⊗(l2))

Page 32: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Conflict Resolution Operator

• binary operator (written ⊕ here) to resolve ambiguous access labels

(A1, sc, A2, L1), (A1, sc, A2, L2) ⇒ (A1, sc, A2, L1 ⊕ L2)

– Properties:

• Associativity: (l0 ⊕ l1) ⊕ l2 = l0 ⊕ (l1 ⊕ l2)

• Commutativity: l0 ⊕ l1 = l1 ⊕ l0

• Idempotence: l1 ⊕ l1 = l1

Page 33: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Computing Abstract Access Control Annotations

• assign access annotations to triples of the RDF graph to obtain quadruples

• apply RDFS inference rules on quadruples to obtain the implicit annotated quadruples

• apply propagation rules on quadruples to compute their propagated annotations

• apply the conflict resolution operator to resolve ambiguities
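The four steps above can be sketched as a small symbolic pipeline. Everything here is an assumption made for illustration: the function names, the encoding of an entailment annotation as a sorted tuple of tokens (valid because the entailment operator is associative and commutative), the single propagation pass, and the example graph, whose edges are a hypothetical reading of the figures. The graph is assumed to be acyclic.

```python
from collections import defaultdict

def ent(a, b):
    """Entailment of two annotations, each a tuple of tokens; sorting gives
    one canonical form thanks to associativity and commutativity."""
    return tuple(sorted(a + b))

def infer(quads):
    """Step 2: close the quadruples under sc transitivity and type inheritance."""
    known = set(quads)
    changed = True
    while changed:
        changed = False
        for s1, p1, o1, a1 in list(known):
            for s2, p2, o2, a2 in list(known):
                if p1 in ("sc", "type") and p2 == "sc" and o1 == s2:
                    q = (s1, p1, o2, ent(a1, a2))
                    if q not in known:
                        known.add(q)
                        changed = True
    return known

def propagate(quads):
    """Step 3 (one pass): an instance also gets the propagated annotation of
    its class, marked symbolically as ("prop", annotation)."""
    class_annots = [(s, a) for s, p, o, a in quads
                    if p == "type" and o == "rdfs:Class"]
    extra = {(s, p, o, ("prop", ca))
             for s, p, o, _ in quads if p == "type"
             for c, ca in class_annots if o == c}
    return set(quads) | extra

def resolve(quads):
    """Step 4: collect all annotations of each triple; the conflict resolution
    operator would then combine every collected set into one annotation."""
    grouped = defaultdict(set)
    for s, p, o, a in quads:
        grouped[(s, p, o)].add(a)
    return dict(grouped)

# Step 1: explicit triples with their tokens (hypothetical example graph).
explicit = {
    ("&s1", "type", "SensingDevice", ("l4",)),
    ("SensingDevice", "sc", "Device", ("l1",)),
    ("Device", "sc", "System", ("l2",)),
    ("SensingDevice", "type", "rdfs:Class", ("l0",)),
}
annotated = resolve(propagate(infer(explicit)))
```

Keeping the annotations symbolic until the end is what allows a concrete policy to be chosen, or changed, after the expressions have been computed.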

Page 34: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Computing Abstract Access Control Annotations (example)

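[Figure: class hierarchy over System, Device, Sensing Device, Sensor and rdfs:Class, connected by type and sc (subclassOf) edges, with explicit triples annotated l0, l1, l2, l3 and l5, shown before and after the annotations are computed.]

Annotations obtained by entailment, combined by the conflict resolution operator (written ⊕ here):

((l1 ⊙ l2) ⊙ l0) ⊕ ((l5 ⊙ l3) ⊙ l0)

Resulting quadruple, which also includes the propagated annotation of l0 (propagation written ⊗ here):

(SensingDevice, type, rdfs:Class, ((l1 ⊙ l2) ⊙ l0) ⊕ ((l5 ⊙ l3) ⊙ l0) ⊕ ⊗(l0))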

Page 35: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Concrete Policies

• A concrete policy assigns concrete values to the abstract tokens and operators

• Example

– Boolean values assigned to abstract tokens

• false: deny access

• true: grant access

– Conjunction assigned to entailment operator

• an implied triple is accessible iff all its implying triples have been granted access

– Disjunction assigned to Conflict Resolution operator

• grant overrides deny annotation

– Identity assigned to propagation operator
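The example policy above can be sketched as an evaluator over abstract expressions. The expression encoding (nested tuples tagged `"ent"`, `"cr"` and `"prop"`), the function name `evaluate`, the sample expression and the token values are all hypothetical and chosen only to illustrate the mechanism.

```python
def evaluate(expr, tokens, policy):
    """Evaluate an abstract annotation expression under a concrete policy.
    A string is an abstract token; a tuple is (operator, operand, ...)."""
    if isinstance(expr, str):
        return tokens[expr]
    op, *args = expr
    values = [evaluate(a, tokens, policy) for a in args]
    return policy[op](*values)

# Concrete policy from the slide: conjunction for entailment,
# disjunction for conflict resolution, identity for propagation.
BOOLEAN_POLICY = {
    "ent": lambda a, b: a and b,
    "cr": lambda a, b: a or b,
    "prop": lambda a: a,
}

# Hypothetical abstract annotation: (l1 entails-with l2), in conflict with propagated l0.
expression = ("cr", ("ent", "l1", "l2"), ("prop", "l0"))

# Hypothetical token assignment: True grants access, False denies it.
granted = evaluate(expression, {"l0": False, "l1": True, "l2": True}, BOOLEAN_POLICY)
denied = evaluate(expression, {"l0": False, "l1": False, "l2": True}, BOOLEAN_POLICY)
```

Swapping `BOOLEAN_POLICY` for another dictionary of functions changes the policy without recomputing the abstract expression.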

Page 36: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Concrete Policy (example)

Abstract quadruple (conflict resolution written ⊕ and propagation written ⊗ here):

(SensingDevice, type, rdfs:Class, ((l1 ⊙ l2) ⊙ l0) ⊕ ((l5 ⊙ l3) ⊙ l0) ⊕ ⊗(l0))

Assignment of abstract tokens to values

• each of l0, l1, l2, l3, l5 is assigned false (F) or true (T)

Assignment of abstract operators to concrete ones

• propagation: negation (¬)

• entailment ⊙: conjunction

• conflict resolution: disjunction

Under these assignments the annotation of (SensingDevice, type, rdfs:Class) evaluates to T

Page 37: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

References

Flouris G., Fundulaki I., Michou M., Papakonstantinou V., Antoniou G. Access Control for RDFS Graphs Using Abstract Models. Ongoing work.

Page 38: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.
Page 39: WP3: Provenance and Access Control Irini Fundulaki Giorgos Flouris Institute of Computer Science-FORTH 1st year review Luxembourg, December 2011.

Computing Abstract Access Control Expressions (example)

[Figure: class hierarchy over System, Device, Sensing Device, Sensor and rdfs:Class, connected by type and sc (subclassOf) edges, with explicit triples annotated l0, l1, l2, l3 and l5, shown before and after the annotations are computed.]

Entailment rules:

(A1, sc, A2, l1), (A2, sc, A3, l2) ⇒ (A1, sc, A3, l1 ⊙ l2)

(&r, type, A2, l1), (A2, sc, A3, l2) ⇒ (&r, type, A3, l1 ⊙ l2)

Implied annotations shown in the figure: l1 ⊙ l2 and l5 ⊙ l3

Propagation rule (propagation written ⊗ here):

(&r, type, A1, l1), (A1, type, rdfs:Class, l2) ⇒ (&r, type, A1, ⊗(l2))

Resulting quadruple (conflict resolution written ⊕ here):

(SensingDevice, type, rdfs:Class, ((l1 ⊙ l2) ⊙ l0) ⊕ ((l5 ⊙ l3) ⊙ l0) ⊕ ⊗(l0))