WP3: Provenance and Access Control
Irini Fundulaki, Giorgos Flouris
Institute of Computer Science-FORTH
1st year review, Luxembourg, December 2011



WP3: Work Plan View

[Gantt chart; timeline axis in project months, marked every 6 months up to month 48]

Tasks:
• Task 3.1 Provenance Management
• Task 3.2 Privacy, DRM and Access Control
• Task 3.3 Trust management

Deliverables:
• D 3.1 Access control specification language, reasoning and enforcement mechanisms
• D 3.2 Provenance management and propagation through SPARQL query and update languages
• D 3.3 Access control system and privacy-aware language
• D 3.4 Trust management and inference system

Partners shown on the chart: FORTH, EPFL

Research Topics, Tasks and Partners

Objective: manage annotations of different forms and semantics over data, related to data access

Research Topics: Provenance, Access Control, Privacy, Digital Rights Management (DRM), Trust Management

Partners: FORTH, EPFL, KIT

Provenance

• Wikipedia: “… the origin or source of something or the history of the ownership or location of an object”

• W3C Incubator Group: “… is a record that describes entities and processes involved in producing and delivering or otherwise influencing that resource. […] Provenance assertions are a form of contextual metadata and can themselves become important records with their own provenance.”

Provenance

• W3C Incubator Group: “With the arrival of massive amounts of Semantic Web Data […], provenance becomes an important factor in developing new Semantic Web applications.”

• Applications
  – Data Trustworthiness, Reputation and Reliability
  – Information Quality
  – Data Integration and Exchange
  – Reproducibility
  – Argumentation (Decision Justification)
  – Access Control
  – Accountability
  – Reasoning

Types of Provenance

• Coarse grained provenance is used to reproduce a digital object or repeat an experiment (complex programs)
  – I → P → O: coarse grained (workflow or dataflow provenance)

• Fine grained provenance refers to the transport of annotations between input and output data (query languages)
  – I’ → P’ → O’: fine grained (data provenance)

Workflow Provenance: Sensor Scenario

[Figure: sensors S1 and S2 each deliver readings of sea temperature and wind; a complex computation over these readings predicts the height of waves.]

Provenance: the complex program executed on the input data

Data Provenance: Sensor Scenario

Provenance: annotations of the input tuples that contributed to the query results

R1 (sensor readings)
Sensor   Readgs   Time    Annot.
S1       8B       00:19   t1
S2       2B       01:50   t2

R2 (sensor database)
Sensor   Latitude        Annot.
S1       23° 26’ 21”N    t3
S2       23° 26’ 21”N    t4

R1 ⋈ R2 (computed at the DB Server)
Sensor   Readgs   Time    Latitude        Annot.
S1       8B       00:19   23° 26’ 21”N    {t1,t3}
S2       2B       01:50   23° 26’ 21”N    {t2,t4}
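The sensor join can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relations are held as in-memory lists of dictionaries, and the function name `join_with_provenance` is hypothetical. Each result tuple collects the annotations of the input tuples that produced it.

```python
# Hypothetical in-memory versions of R1 (sensor readings) and R2 (sensor database).
R1 = [
    {"Sensor": "S1", "Readgs": "8B", "Time": "00:19", "Annot": {"t1"}},
    {"Sensor": "S2", "Readgs": "2B", "Time": "01:50", "Annot": {"t2"}},
]
R2 = [
    {"Sensor": "S1", "Latitude": "23° 26' 21\"N", "Annot": {"t3"}},
    {"Sensor": "S2", "Latitude": "23° 26' 21\"N", "Annot": {"t4"}},
]


def join_with_provenance(left, right, key):
    """Join on `key`; every result tuple carries the annotations of both inputs."""
    result = []
    for left_row in left:
        for right_row in right:
            if left_row[key] == right_row[key]:
                # Merge the data columns, then combine the annotation sets.
                merged = {**left_row, **right_row}
                row = {k: v for k, v in merged.items() if k != "Annot"}
                row["Annot"] = left_row["Annot"] | right_row["Annot"]
                result.append(row)
    return result


joined = join_with_provenance(R1, R2, "Sensor")
for row in joined:
    print(row["Sensor"], row["Readgs"], row["Time"], sorted(row["Annot"]))
```

The set union used here is one possible way to record which inputs contributed; it does not record how they were combined.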

Data Provenance Models

Annotation Models: provenance computation is coupled with a particular application and a particular assignment of the provenance of source data

When the annotation of an input tuple changes, we must re-execute the query to obtain the annotation of the result tuples.

R1
Sensor   Readgs   Time    Annot.
S1       8B       00:19   1
S2       2B       01:50   0

R2
Sensor   Latitude        Annot.
S1       23° 26’ 21”N    1
S2       23° 26’ 21”N    1

The annotation of a join tuple is computed using the operator x: 0 x 0 = 0, 1 x 0 = 0, 1 x 1 = 1

R1 ⋈ R2
Sensor   Readgs   Time    Latitude        Annot.
S1       8B       00:19   23° 26’ 21”N    1
S2       2B       01:50   23° 26’ 21”N    0

Data Provenance Models

Abstract Models: provenance annotations (referred to as tokens) and operators are abstract.

When the annotation of an input tuple changes, the annotation of the result tuple is re-computed by evaluating the annotation expression only.

The annotation of a join tuple is modeled by the “x” operator.

R1
Sensor   Readgs   Time    Annot.
S1       8B       00:19   T1
S2       2B       01:50   T2

R2
Sensor   Latitude        Annot.
S1       23° 26’ 21”N    T3
S2       23° 26’ 21”N    T4

R1 ⋈ R2
Sensor   Readgs   Time    Latitude        Annot.
S1       8B       00:19   23° 26’ 21”N    T1 x T3
S2       2B       01:50   23° 26’ 21”N    T2 x T4

Data Provenance Models

Abstract Models: abstract tokens and operators are assigned concrete values only when the concrete value of an annotation must be computed.

R1
Sensor   Readgs   Time    Annot.
S1       8B       00:19   T1
S2       2B       01:50   T2

R2
Sensor   Latitude        Annot.
S1       23° 26’ 21”N    T3
S2       23° 26’ 21”N    T4

R1 ⋈ R2
Sensor   Readgs   Time    Latitude        Annot.
S1       8B       00:19   23° 26’ 21”N    T1 x T3
S2       2B       01:50   23° 26’ 21”N    T2 x T4

Data Quality Application:

• abstract tokens T1, T2, T3, T4 take values 1 and 0

• abstract operator “x” is replaced by logical AND

With T1 = 1, T2 = 0, T3 = 1, T4 = 1:
T1 x T3 = 1 AND 1 = 1
T2 x T4 = 0 AND 1 = 0
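A minimal Python sketch of the abstract approach follows. The class names `Token` and `Times`, the function `evaluate`, and the particular 0/1 assignment are assumptions made for illustration. The point it shows is that the join annotations are stored once as expressions and re-evaluated when a token value changes, without re-running the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """An abstract provenance token such as T1."""
    name: str


@dataclass(frozen=True)
class Times:
    """The abstract join operator 'x' applied to two annotation expressions."""
    left: object
    right: object


def evaluate(expr, values, times):
    """Evaluate an abstract expression under concrete token values and a concrete 'x'."""
    if isinstance(expr, Token):
        return values[expr.name]
    return times(evaluate(expr.left, values, times),
                 evaluate(expr.right, values, times))


T1, T2, T3, T4 = (Token(n) for n in ("T1", "T2", "T3", "T4"))

# Annotations of the join result, stored once as abstract expressions.
join_annotations = {"S1": Times(T1, T3), "S2": Times(T2, T4)}


def logical_and(a, b):
    """Concrete operator chosen by the data quality application."""
    return a & b


# Tokens take values 1 and 0 (this particular assignment is assumed).
quality = {"T1": 1, "T2": 0, "T3": 1, "T4": 1}
before = {s: evaluate(e, quality, logical_and) for s, e in join_annotations.items()}

# An input annotation changes: only the expressions are re-evaluated.
quality["T2"] = 1
after = {s: evaluate(e, quality, logical_and) for s, e in join_annotations.items()}

print(before, after)
```

A different application could pass another token assignment and another concrete operator to `evaluate` over the same stored expressions.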

Abstract Data Provenance Models

• Benefits:
  – in the presence of provenance updates in the input, we need to evaluate the value of the provenance of the affected tuples only
  – different applications can assign different concrete values to abstract tokens and operators, for the same data

• Challenges: trade-off between provenance storage and computation efficiency
  – storage of large provenance expressions
  – efficient computation of provenance for dynamic data

Data Provenance

• RDFS reasoning
  – Given a set of RDF triples whose explicit provenance is known, and RDFS reasoning rules, what is the provenance of the implicit RDF triples?

• SPARQL
  – Given a set of RDF triples whose explicit provenance is known, and a SPARQL query, what is the provenance of the query result?

RDFS Reasoning

Inference rules:
(A1, sc, A2), (A2, sc, A3) ⟹ (A1, sc, A3)
(&r, type, A1), (A1, sc, A2) ⟹ (&r, type, A2)

[Figure: fragment of the SSN Ontology with classes Sensor, Sensing Device, Device and System and instance &s1. Explicit type and sc (subclassOf) edges carry the colors C1 to C4; the implicit edges are marked “?”.]

Given a set of RDF triples (RDF Graph) whose explicit provenance is known, and RDFS entailment rules, what is the provenance of the implicit RDF triples?

RDFS Reasoning

• colors to capture the provenance of explicit and implicit data and schema RDF triples

• quadruples to represent provenance information

• Provenance model: commutative semi-group structure (C, +)

–C: set of colors,

–binary operation “+” to compose colors of the input triples
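As an illustration of the (C, +) structure, the sketch below computes the colors of implicit quadruples for the two rules (subclass transitivity and type inheritance). Several things are assumptions rather than the authors' definitions: colors are modeled as frozensets and "+" as set union, which is one commutative and associative choice; the edges and color assignment of the example graph are hypothetical; and the names `compose` and `rdfs_closure` are invented here.

```python
def compose(c1, c2):
    """Assumed '+' on colors: union of color sets (commutative and associative)."""
    return c1 | c2


def rdfs_closure(quads):
    """Apply sc transitivity and type inheritance until no new quadruple appears."""
    closure = set(quads)
    while True:
        new = set()
        for (s1, p1, o1, c1) in closure:
            for (s2, p2, o2, c2) in closure:
                # (s1, sc|type, o1) and (o1, sc, o2) imply (s1, sc|type, o2).
                if o1 == s2 and p2 == "sc" and p1 in ("sc", "type"):
                    new.add((s1, p1, o2, compose(c1, c2)))
        if new <= closure:
            return closure
        closure |= new


def color(name):
    return frozenset({name})


# Hypothetical explicit quadruples over the sensor classes.
explicit = {
    ("&s1", "type", "Sensor", color("C4")),
    ("Sensor", "sc", "SensingDevice", color("C1")),
    ("SensingDevice", "sc", "Device", color("C2")),
    ("Device", "sc", "System", color("C3")),
}

closure = rdfs_closure(explicit)
for quad in sorted(closure - explicit, key=lambda q: (q[0], q[1], q[2])):
    print(quad[0], quad[1], quad[2], sorted(quad[3]))
```

Because the assumed "+" is commutative and associative, every derivation path of an implicit triple yields the same color here, whichever rule is applied first.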

RDFS Reasoning

• Pediaditis P., Flouris G., Fundulaki I., Christophides V. On Explicit Provenance Management in RDF/S Graphs. In Theory and Practice of Provenance (TaPP-2009)

• Flouris G., Fundulaki I., Pediaditis P., Theoharis Y., Christophides V. Coloring RDF Triples to capture Provenance. In ISWC 2009.

Provenance for SPARQL

• We showed that existing provenance models for positive relational algebra can capture the provenance of SPARQL (without OPTIONAL)

• We follow the approach by Karvounarakis et al. in Provenance Semirings, PODS 2007 to develop a model for full SPARQL
  – records the input tuples and the operators used to compute the query results

Given a set of RDF triples (RDF Graph) whose explicit provenance is known, and a SPARQL query, what is the provenance of the result?

Provenance Model for SPARQL+

• K: set of provenance tokens
• a binary operator for SPARQL join
• a binary operator for SPARQL union

Triples T and their provenance tokens:

subject   predicate   object          prov
S1        type        Sensor          t1
S1        Readgs      &r1             t5
S1        Latitude    23° 26’ 21”N    t2
S2        type        Sensor          t3
S2        Readgs      &r2             t6
S2        Latitude    23° 26’ 21”N    t4
&r1       value       8B              t7
&r1       time        00:19           t8
&r2       value       2B              t9
&r2       time        01:50           t10

SPARQL Query: return the sensor and its latitude

select ?s ?l where { ?s type Sensor . ?s Latitude ?l }

Provenance Model for SPARQL+

Q = ?s type Sensor . ?s Latitude ?l

The evaluation of a triple pattern over T is a set of mappings (?variable, ?value), where T is the set of annotated triples of the previous slide.

?s type Sensor
mapping   ?s    prov
1         S1    t1
2         S2    t3

?s Latitude ?l
mapping   ?s    ?l              prov
3         S1    23° 26’ 21”N    t2
4         S2    23° 26’ 21”N    t4

Provenance Model for SPARQL+

Q = ?s type Sensor . ?s Latitude ?l

The result of a join between two triple patterns contains all mappings that have the same value for their common variable(s).

Mappings 1 and 3 agree on ?s = S1, and mappings 2 and 4 agree on ?s = S2:

?s    ?l              prov
S1    23° 26’ 21”N    t1 and t2, combined by the join operator
S2    23° 26’ 21”N    t3 and t4, combined by the join operator
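The two evaluation steps (matching a triple pattern, then joining mappings on their common variables) can be sketched as follows. This is a simplified illustration, not a SPARQL engine: the functions `match_pattern` and `join` are hypothetical, only the four triples relevant to the query are included, and the abstract join operator is recorded as a tuple `("join", p1, p2)` because its concrete meaning is left open.

```python
# The four triples of T that matter for the query, with their provenance tokens.
TRIPLES = [
    ("S1", "type", "Sensor", "t1"),
    ("S1", "Latitude", "23° 26' 21\"N", "t2"),
    ("S2", "type", "Sensor", "t3"),
    ("S2", "Latitude", "23° 26' 21\"N", "t4"),
]


def match_pattern(pattern, triples):
    """Evaluate one triple pattern; terms starting with '?' are variables."""
    results = []
    for s, p, o, prov in triples:
        mapping = {}
        matched = True
        for term, value in zip(pattern, (s, p, o)):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if mapping.get(term, value) != value:
                    matched = False
                    break
                mapping[term] = value
            elif term != value:
                matched = False
                break
        if matched:
            results.append((mapping, prov))
    return results


def join(left, right):
    """Keep pairs of mappings that agree on their common variables."""
    out = []
    for m1, p1 in left:
        for m2, p2 in right:
            if all(m1[v] == m2[v] for v in m1.keys() & m2.keys()):
                out.append(({**m1, **m2}, ("join", p1, p2)))
    return out


sensors = match_pattern(("?s", "type", "Sensor"), TRIPLES)
latitudes = match_pattern(("?s", "Latitude", "?l"), TRIPLES)
result = join(sensors, latitudes)
for mapping, prov in result:
    print(mapping["?s"], mapping["?l"], prov)
```

Keeping the provenance as an unevaluated expression is what lets an application choose the concrete meaning of the join operator later.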

Provenance for SPARQL

• Theoharis Y., Fundulaki I., Karvounarakis G., Christophides V. On Provenance of Queries on Linked Web Data. In IEEE Internet Computing:Provenance in Web Applications, 2011.

Access Control

• Refers to the ability to permit or deny the use of a particular resource by a particular entity

• Crucial for sensitive content since it ensures the selective exposure of information to different classes of users

RDF Access Control

• In general, an access control model specifies
  – the access annotations
  – conflict resolution policy to resolve ambiguous access annotations
  – default semantics used to annotate data that are not in the scope of any authorization

• Access Authorizations specify (by a query) the access annotations for data

Access Control

• Access Annotations can be
  – boolean values
    • true/false (grant/deny access permission)
  – confidentiality levels
    • low, medium, high

• Conflict Resolution Policy depends on the type of access annotations
  – boolean values:
    • deny overrides grant access annotation
  – confidentiality levels
    • high confidentiality overrides medium, medium overrides low

• Default Semantics depend on the type of access annotations
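The two conflict resolution policies named above are small enough to write down directly. The sketch below is illustrative: the function names are hypothetical, and the default value passed to `effective_annotation` (deny, or any chosen level) is an assumption, since the slide only states that default semantics depend on the annotation type.

```python
LEVELS = ["low", "medium", "high"]


def resolve_boolean(annotations):
    """Deny overrides grant: access is granted only if no annotation denies it."""
    return all(annotations)


def resolve_confidentiality(annotations):
    """The highest confidentiality level overrides the lower ones."""
    return max(annotations, key=LEVELS.index)


def effective_annotation(annotations, resolve, default):
    """Use the default semantics when no authorization covers the data."""
    return resolve(annotations) if annotations else default


# A triple covered by two conflicting boolean authorizations.
print(effective_annotation([True, False], resolve_boolean, default=False))
# A triple covered by three confidentiality authorizations.
print(effective_annotation(["low", "high", "medium"], resolve_confidentiality, default="high"))
# A triple outside the scope of every authorization (assumed default: deny).
print(effective_annotation([], resolve_boolean, default=False))
```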

Fine-grained Access Control Framework for RDF Data

• We encode access annotations of RDF triples using quadruples

• We propose an abstract access control model defined by a set of abstract tokens and abstract operators to model
  – the computation of access annotations of RDF triples considering RDFS inference
  – the propagation of access annotations
  – conflicting and missing access annotations

Abstract Tokens

• L: set of abstract access control tokens

• a default access token in L
  – assigned to triples that have no explicitly assigned access token

Abstract Operators

• Entailment Operator ⊙ to compute the access annotations of implied quadruples

• Propagation Operator ⊗ to model the propagation of access annotations

• Conflict Resolution Operator ⊕ to resolve ambiguous access annotations

Entailment Operator ⊙

• binary operator to model the computation of the annotation of an implicit RDF quadruple for the subclass, subproperty and type hierarchies in an RDF graph

– Example: (A1, sc, A2, l1), (A2, sc, A3, l2) ⟹ (A1, sc, A3, l1 ⊙ l2)

– Properties:
  • Associativity: (l4 ⊙ l1) ⊙ l2 = l4 ⊙ (l1 ⊙ l2)
  • Commutativity: l4 ⊙ l1 = l1 ⊙ l4

The order of the application of inference rules is not important

Entailment Operator ⊙

Inference rules:
(A1, sc, A2, l1), (A2, sc, A3, l2) ⟹ (A1, sc, A3, l1 ⊙ l2)
(&r, type, A2, l1), (A2, sc, A3, l2) ⟹ (&r, type, A3, l1 ⊙ l2)

[Figure: class hierarchy with Sensor, Sensing Device, Device and System, instance &s1 and rdfs:Class. Explicit type and sc (subclassOf) edges are annotated l0 to l4; the implied edges are annotated l4 ⊙ l1, l1 ⊙ l2 and (l4 ⊙ l1) ⊙ l2.]

Propagation Operator ⊗

• unary operator to model propagation of access annotations along the subclass/subproperty and type hierarchies in an RDF Graph
  – a class inherits the annotation of its superclass, an instance of a class inherits the annotation of its class, etc.

– Example: (A1, type, class, l1), (&r1, type, A1, l2) ⟹ (&r1, type, A1, ⊗(l1))

– Properties:
  • Idempotence: ⊗(⊗(l0)) = ⊗(l0)

We do not care how many times an annotation is propagated

Propagation Operator ⊗

[Figure: the class hierarchy with Sensor, Sensing Device, Device and System, instance &s1 and rdfs:Class. Edges are annotated l0 to l4, and an implied edge carries (l4 ⊙ l1) ⊙ l2.]

Propagation rule:
(&r, type, A1, l1), (A1, type, rdfs:Class, l2) ⟹ (&r, type, A1, ⊗(l2))

Conflict Resolution Operator ⊕

• binary operator to resolve ambiguous access labels

– Example: (A1, sc, A2, l1), (A1, sc, A2, l2) ⟹ (A1, sc, A2, l1 ⊕ l2)

– Properties:
  • Associativity: (l0 ⊕ l1) ⊕ l2 = l0 ⊕ (l1 ⊕ l2)
  • Commutativity: l0 ⊕ l1 = l1 ⊕ l0
  • Idempotence: l1 ⊕ l1 = l1
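Whether a candidate concrete operator satisfies these properties can be checked by brute force over a small domain. The helpers below are a hypothetical sketch, and the three candidate operators are examples chosen here, not operators prescribed by the model.

```python
from itertools import product


def is_associative(op, domain):
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(domain, repeat=3))


def is_commutative(op, domain):
    return all(op(a, b) == op(b, a) for a, b in product(domain, repeat=2))


def is_idempotent(op, domain):
    return all(op(a, a) == a for a in domain)


BOOLEANS = [False, True]


def deny_overrides(a, b):
    """Conjunction: a single deny (False) wins."""
    return a and b


def grant_overrides(a, b):
    """Disjunction: a single grant (True) wins."""
    return a or b


def first_wins(a, b):
    """Order-dependent candidate, included as a counterexample."""
    return a


for op in (deny_overrides, grant_overrides, first_wins):
    print(op.__name__,
          is_associative(op, BOOLEANS),
          is_commutative(op, BOOLEANS),
          is_idempotent(op, BOOLEANS))
```

The order-dependent candidate fails commutativity, so the resolved annotation would depend on the order in which conflicting quadruples are visited.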

Computing Abstract Access Control Annotations

• assign access annotations to triples of the RDF graph to obtain quadruples

• apply RDFS inference rules on quadruples to obtain the implicit annotated quadruples

• apply propagation rules on quadruples to compute their propagated annotations

• apply the conflict resolution operator to resolve ambiguities
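The four steps can be sketched end to end in Python. Everything specific here is an assumption made for illustration: abstract expressions are encoded as tuples tagged "entail", "prop" and "resolve"; the default token is given the hypothetical name `l_default`; the class hierarchy is assumed acyclic; only the sc transitivity, type inheritance and instance-of-class propagation rules are covered; and the example graph is invented.

```python
from collections import defaultdict

DEFAULT = "l_default"  # hypothetical name for the default token


def entail(a, b):
    """Abstract entailment; operands are flattened and sorted because the
    operator is associative and commutative."""
    def operands(x):
        return x[1:] if isinstance(x, tuple) and x[0] == "entail" else (x,)
    return ("entail", *sorted(operands(a) + operands(b)))


def annotate(triples, authorizations):
    """Step 1: attach a token to every triple; uncovered triples get DEFAULT."""
    return {(s, p, o, authorizations.get((s, p, o), DEFAULT)) for s, p, o in triples}


def infer(quads):
    """Step 2: sc transitivity and type inheritance, applied to a fixpoint."""
    closure = set(quads)
    while True:
        new = {
            (s1, p1, o2, entail(l1, l2))
            for (s1, p1, o1, l1) in closure
            for (s2, p2, o2, l2) in closure
            if o1 == s2 and p2 == "sc" and p1 in ("sc", "type")
        }
        if new <= closure:
            return closure
        closure |= new


def propagate(quads):
    """Step 3: an instance quadruple also receives the propagated annotation
    of the quadruple that types its class as rdfs:Class."""
    extra = {
        (r, "type", a, ("prop", l2))
        for (r, p1, a, _l1) in quads
        for (a2, p2, c, l2) in quads
        if p1 == "type" and p2 == "type" and a == a2 and c == "rdfs:Class"
    }
    return quads | extra


def resolve(quads):
    """Step 4: combine all annotations of the same triple with the abstract
    conflict resolution operator."""
    grouped = defaultdict(list)
    for s, p, o, label in quads:
        grouped[(s, p, o)].append(label)
    return {
        triple: ("resolve", *sorted(labels, key=repr)) if len(labels) > 1 else labels[0]
        for triple, labels in grouped.items()
    }


# Hypothetical graph and authorizations.
triples = [
    ("&s1", "type", "Sensor"),
    ("Sensor", "sc", "SensingDevice"),
    ("Sensor", "type", "rdfs:Class"),
]
authorizations = {
    ("&s1", "type", "Sensor"): "l4",
    ("Sensor", "sc", "SensingDevice"): "l1",
    ("Sensor", "type", "rdfs:Class"): "l0",
}

result = resolve(propagate(infer(annotate(triples, authorizations))))
for triple, expression in sorted(result.items()):
    print(triple, expression)
```

The output stays abstract: each triple maps to a token or to an expression over tokens, which a concrete policy can evaluate later.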

Computing Abstract Access Control Annotations (example)

[Figure: class hierarchy with Sensor, Sensing Device, Device, System and rdfs:Class. Edges are annotated l0, l1, l2, l3 and l5; the implied edges carry (l1 ⊙ l2) ⊙ l0 and (l5 ⊙ l3) ⊙ l0.]

Resulting quadruple (⊕: conflict resolution, ⊗: propagation):
(SensingDevice, type, rdfs:Class, ((l1 ⊙ l2) ⊙ l0) ⊕ ((l5 ⊙ l3) ⊙ l0) ⊕ ⊗(l0))

Concrete Policies

• A concrete policy assigns concrete values to the abstract tokens and operators

• Example

– Boolean values assigned to abstract tokens

• false: deny access

• true: grant access

– Conjunction assigned to entailment operator

• an implied triple is accessible iff all its implying triples have been granted access

– Disjunction assigned to Conflict Resolution operator

• grant overrides deny annotation

– Identity assigned to propagation operator
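A concrete policy of this kind can be applied to an abstract expression with a small recursive evaluator. The tuple encoding of expressions (tags "entail", "resolve", "prop"), the function name `evaluate`, and the sample expression and token values are assumptions for illustration only.

```python
from functools import reduce


def evaluate(expr, tokens, policy):
    """Replace abstract tokens and operators by their concrete counterparts."""
    if isinstance(expr, str):
        return tokens[expr]
    op, *args = expr
    values = [evaluate(arg, tokens, policy) for arg in args]
    if op == "prop":
        # Propagation is unary.
        return policy["prop"](values[0])
    # Entailment and conflict resolution are binary and associative,
    # so any number of operands can be folded left to right.
    return reduce(policy[op], values)


# The Boolean policy of the example above.
BOOLEAN_POLICY = {
    "entail": lambda a, b: a and b,   # conjunction
    "resolve": lambda a, b: a or b,   # disjunction: grant overrides deny
    "prop": lambda a: a,              # identity
}

# Hypothetical expression: an implied annotation in conflict with a propagated one.
expression = ("resolve", ("entail", "l1", "l2"), ("prop", "l3"))

granted = evaluate(expression, {"l1": True, "l2": False, "l3": True}, BOOLEAN_POLICY)
denied = evaluate(expression, {"l1": True, "l2": False, "l3": False}, BOOLEAN_POLICY)
print(granted, denied)
```

Swapping in a different dictionary of operators changes the policy without touching the stored abstract expression.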

Concrete Policy (example)

Abstract annotation:
(SensingDevice, type, rdfs:Class, ((l1 ⊙ l2) ⊙ l0) ⊕ ((l5 ⊙ l3) ⊙ l0) ⊕ ⊗(l0))

Assignment of abstract tokens to values: each of l0, l1, l2, l3 and l5 is mapped to false (F) or true (T)

Assignment of abstract operators to concrete ones:
⊗   propagation           negation (¬)
⊙   entailment            conjunction
⊕   conflict resolution   disjunction

Under this assignment the annotation evaluates to T.

References

Flouris G., Fundulaki I., Michou M., Papakonstantinou V., Antoniou G. Access Control for RDFS Graphs Using Abstract Models. Ongoing work.

Computing Abstract Access Control Expressions (example)

Inference rules:
(A1, sc, A2, l1), (A2, sc, A3, l2) ⟹ (A1, sc, A3, l1 ⊙ l2)
(&r, type, A2, l1), (A2, sc, A3, l2) ⟹ (&r, type, A3, l1 ⊙ l2)

Propagation rule (⊗: propagation):
(&r, type, A1, l1), (A1, type, rdfs:Class, l2) ⟹ (&r, type, A1, ⊗(l2))

[Figure: class hierarchy with Sensor, Sensing Device, Device, System and rdfs:Class. Edges are annotated l0, l1, l2, l3 and l5; the implied edges carry l1 ⊙ l2, l5 ⊙ l3, (l1 ⊙ l2) ⊙ l0 and (l5 ⊙ l3) ⊙ l0.]

Resulting quadruple (⊕: conflict resolution):
(SensingDevice, type, rdfs:Class, ((l1 ⊙ l2) ⊙ l0) ⊕ ((l5 ⊙ l3) ⊙ l0) ⊕ ⊗(l0))
