Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint work with Alan Nash, UCSD

Jan 25, 2020

Page 1: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint

Data Exchange: Computing Cores in Polynomial Time

Georg Gottlob, Oxford University

Joint work with Alan Nash, UCSD

Page 2:

This talk is based on two recent papers:

G....: Computing Cores for Data Exchange: New Algorithms and Practical Solutions. PODS 2005.

G.... & Nash: Data Exchange: Computing Cores in Polynomial Time. Submitted to PODS 2006.

Detailed joint extended version of both papers:

G.... & Nash: Efficient Core Computation in Data Exchange. Available from the authors (Draft).

Page 3:

Talk Structure

Introduction & basics

Computing Cores
• for weakly acyclic TGDs as target dependencies
• for EGDs and weakly acyclic TGDs as target dependencies

Further results (time permitting)

Page 4:

Cores

Instance:

{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

Logical meaning:

∃ X, Y, U, V: p(X,Y) & p(X,b) & p(a,b) & p(U,c) & p(U,V) & q(a,c,d)

Page 5:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism h: {Y → b}

Page 6:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism h: {Y → b}

REDUNDANT! ∃X,Y p(X,Y) & p(X,b) ⇔ ∃X p(X,b)

Page 7:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism h: {Y → b}

REDUNDANT! ∃X p(X,b) ⇒ ∃X,Y p(X,Y)

Page 8:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism h: {Y → b}

h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

Page 9:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism h: {Y → b}

h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

h(I) can be further reduced by endomorphism g: {X → a, V → c}

Page 10:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism h: {Y → b}

h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

h(I) can be further reduced by endomorphism g: {X → a, V → c}

g(h(I)) = { p(a,b), p(U,c), q(a,c,d) }

Page 11:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism f: {X → a, Y → b, V → c}

f(I) = g(h(I)) = g○h(I) = { p(a,b), p(U,c), q(a,c,d) }

I ⇔ f(I)

Page 12:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism f: {X → a, Y → b, V → c}

f(I) = g(h(I)) = g○h(I) = { p(a,b), p(U,c), q(a,c,d) }

I ⇔ f(I)

No further refinement by endomorphisms is possible!

Page 13:

Cores

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

endomorphism f: {X → a, Y → b, V → c}

f(I) = g(h(I)) = g○h(I) = { p(a,b), p(U,c), q(a,c,d) }

I ⇔ f(I)

Core(I) = f(I), unique up to variable-renaming!
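
The reduction by endomorphisms shown on the preceding slides can be reproduced with a small brute-force search. The following is only a sketch under assumptions that are not part of the talk: atoms are encoded as tuples, variables are recognised by an uppercase first letter, and the helper names (`is_var`, `image_of`, `core`) are hypothetical. The search is exponential in the number of variables, so it illustrates the definition and not the polynomial-time algorithm of the talk.

```python
from itertools import product

def is_var(t):
    # Assumed convention: variables start with an uppercase letter, constants do not.
    return t[0].isupper()

def image_of(h, instance):
    # Image of the instance under h; constants and unmapped terms stay fixed.
    return frozenset((a[0],) + tuple(h.get(t, t) for t in a[1:]) for a in instance)

def core(instance):
    # Repeatedly look for an endomorphism whose image is a proper subset of the instance.
    instance = frozenset(instance)
    while True:
        terms = sorted({t for a in instance for t in a[1:]})
        variables = [t for t in terms if is_var(t)]
        shrunk = None
        for image in product(terms, repeat=len(variables)):
            candidate = image_of(dict(zip(variables, image)), instance)
            if candidate < instance:  # proper subset: a genuine reduction
                shrunk = candidate
                break
        if shrunk is None:
            return instance  # no endomorphism shrinks the instance any further
        instance = shrunk

# The instance I from the slides.
I = {("p", "X", "Y"), ("p", "X", "b"), ("p", "a", "b"),
     ("p", "U", "c"), ("p", "U", "V"), ("q", "a", "c", "d")}
print(sorted(core(I)))
```

On the instance I this returns the three atoms p(a,b), p(U,c), q(a,c,d), which agrees with Core(I) on the slide.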

Page 14:

Blocks

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

Blocks: Connected components in the variable-graph

Atom-Blocks: corresponding sets of atoms

Page 15:

Blocks

I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

Blocks: Connected components in the variable-graph

variable-graph: edges X-Y and U-V

{X,Y} {U,V} blocksize(I)=2

blocksize(I) = size of largest block of I
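
Blocks and the blocksize can be computed with a standard union-find pass over the atoms. This is a minimal sketch; the tuple encoding of atoms, the uppercase-variable convention and the function names `blocks`, `blocksize` and `atom_block` are assumptions made for illustration.

```python
def is_var(t):
    # Assumed convention: variables start with an uppercase letter.
    return t[0].isupper()

def blocks(instance):
    # Union-find over variables; two variables are linked when they share an atom.
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for atom in instance:
        vs = [t for t in atom[1:] if is_var(t)]
        for v in vs:
            parent[find(v)] = find(vs[0])
    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return sorted(sorted(g) for g in groups.values())

def blocksize(instance):
    # Size of the largest block; 0 for an instance without variables.
    return max((len(b) for b in blocks(instance)), default=0)

def atom_block(instance, block):
    # Atom-block: the atoms that mention at least one variable of the block.
    return {a for a in instance if any(t in block for t in a[1:])}

I = {("p", "X", "Y"), ("p", "X", "b"), ("p", "a", "b"),
     ("p", "U", "c"), ("p", "U", "V"), ("q", "a", "c", "d")}
print(blocks(I), blocksize(I))
```

For the instance I this yields the blocks {U,V} and {X,Y} and blocksize 2, as stated on the slide.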

Page 16:

[Fagin, Kolaitis, Popa PODS’03]:

- Computing core(I) is NP-hard in general.

- It is tractable for bounded blocksize b:
  core(I) can be computed in time n * O(|dom(I)|^(b+2)) = O(n^(b+3))

[G. PODS’05]:

- Computing core(I) is tractable for bounded treewidth or hypertree-width of the variable-graph
  => new bound: O(n^(b/2+3)), based on hypertree decompositions (end of talk, time permitting)

Page 17:

Dependencies

Tuple generating dependencies (TGDs):

∀X ∀Y ∀Z p(X,Y) & q(Y,Z) → ∃U ∃V r(X,U) & p(Z,V)

We usually omit universal quantifiers…

TGDs can be cyclic, in which case the Chase may not terminate.

Cyclic TGD:

p(X,Y) & q(Y,Z) → ∃ U,V r(X,U) & p(Z,V)

Equality generating dependencies (EGDs):

∀X ∀Y ∀Z p(X,Y) & p(X,Z) → Y=Z
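
To make the two kinds of dependencies concrete, here is a hedged sketch of a single chase round for a TGD and a single chase step for an EGD. The encoding (atoms as tuples, uppercase terms as variables or labelled nulls, fresh nulls named N1, N2, ...) and the function names are assumptions for illustration; this is a naive chase step, not the algorithm of the papers.

```python
from itertools import count

def is_var(t):
    # Assumed convention: variables and labelled nulls start with an uppercase letter.
    return t[0].isupper()

def substitute(atom, h):
    return (atom[0],) + tuple(h.get(t, t) for t in atom[1:])

def matches(atoms, instance, h=None):
    # Yield each extension of h that maps every atom in `atoms` onto a fact of `instance`.
    h = dict(h or {})
    if not atoms:
        yield h
        return
    first, rest = atoms[0], atoms[1:]
    for fact in instance:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        g = dict(h)
        if all((g.setdefault(t, v) == v) if is_var(t) else (t == v)
               for t, v in zip(first[1:], fact[1:])):
            yield from matches(rest, instance, g)

def tgd_step(body, head, exvars, instance, fresh):
    # One chase round: where the body matches but the head has no witness yet,
    # add the head atoms with fresh nulls for the existential variables.
    result = set(instance)
    for h in list(matches(body, instance)):
        if next(matches(head, result, h), None) is None:
            g = dict(h, **{v: f"N{next(fresh)}" for v in exvars})
            result |= {substitute(a, g) for a in head}
    return result

def egd_step(body, left, right, instance):
    # One chase step: equate the two terms of the first violating match.
    for h in matches(body, instance):
        s, t = h[left], h[right]
        if s == t:
            continue
        if not is_var(s) and not is_var(t):
            raise ValueError(f"chase fails: distinct constants {s} and {t}")
        old, new = (s, t) if is_var(s) else (t, s)
        return {substitute(f, {old: new}) for f in instance}
    return set(instance)

# TGD from the slide: p(X,Y) & q(Y,Z) -> exists U,V: r(X,U) & p(Z,V)
J = tgd_step([("p", "X", "Y"), ("q", "Y", "Z")],
             [("r", "X", "U"), ("p", "Z", "V")], ["U", "V"],
             {("p", "a", "b"), ("q", "b", "c")}, count(1))
# EGD from the slide: p(X,Y) & p(X,Z) -> Y = Z
K = egd_step([("p", "X", "Y"), ("p", "X", "Z")], "Y", "Z",
             {("p", "a", "W"), ("p", "a", "b")})
print(sorted(J), sorted(K))
```

In this toy run the TGD adds r(a,N1) and p(c,N2) with fresh nulls N1 and N2, and the EGD replaces the null W by the constant b.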

Page 18:

We restrict ourselves to the setting of weakly acyclic sets of TGDs + arbitrary EGDs

( [Fagin, Kolaitis, Popa 03], [Deutsch, Tannen 03] )

This covers the overwhelming part of relevant constraints:

• Functional dependencies
• w.a. inclusion dependencies
• referential integrity
• foreign key constraints…
• …

Page 19:

Data Exchange

[Diagram: source schema S and target schema T. Σst: source-target TGDs. Σt: target TGDs and EGDs.]

I → apply Σst → J → apply Σt → J´ → compute core → Core(J´)

Page 20:

Data Exchange

[Diagram: source schema S and target schema T. Σst: source-target TGDs. Σt: target TGDs and EGDs.]

I → apply Σst → J → apply Σt → J´ → compute core → Core(J´)

Open Problem [F.K.P.03]: Can Core(J´) be computed in polytime if Σt consists of w.a. TGDs and EGDs?

Page 21:

Data Exchange

[Diagram: source schema S and target schema T. Σst: source-target TGDs. Σt: target TGDs and EGDs.]

I → apply Σst → J → apply Σt → J´ → compute core → Core(J´)

Nice (after applying Σst): bounded blockwidth, treewidth, hypertree-width

Page 22:

Data Exchange

[Diagram: source schema S and target schema T. Σst: source-target TGDs. Σt: target TGDs and EGDs.]

I → apply Σst → J → apply Σt → J´ → compute core → Core(J´)

Nice (after applying Σst): bounded blockwidth, treewidth, hypertree-width

The problem (after applying Σt): unbounded blockwidth, treewidth, hypertree-width

Applying Σt lumps blocks together!

Page 23:

TGDs (even full TGDs) destroy blockwidth

{X,Y} {U,V} blocksize(I)=2

{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

Page 24:

TGDs (even full TGDs) destroy blockwidth

{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }

{X,Y} {U,V} blocksize=2

TGD: p(R,S) & p(R’,S’) → p(R,R’)

new atom: p(X,U)

{X,Y,U,V} blocksize=4
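
The effect of this full TGD on the blocksize can be checked mechanically. The sketch below hard-codes the slide's TGD p(R,S) & p(R’,S’) → p(R,R’); the tuple encoding, the uppercase-variable convention and the names `apply_full_tgd` and `blocksize` are illustrative assumptions.

```python
def is_var(t):
    # Assumed convention: variables start with an uppercase letter.
    return t[0].isupper()

def blocksize(instance):
    # Merge the variable sets of atoms that share a variable; return the largest block size.
    blocks = []
    for atom in instance:
        vs = {t for t in atom[1:] if is_var(t)}
        if not vs:
            continue
        touching = [b for b in blocks if b & vs]
        blocks = [b for b in blocks if not (b & vs)]
        blocks.append(vs.union(*touching))
    return max(map(len, blocks), default=0)

def apply_full_tgd(instance):
    # Hard-coded for p(R,S) & p(R',S') -> p(R,R'): pair up all first arguments of p.
    # One round reaches the fixpoint, because new atoms introduce no new first arguments.
    firsts = {a[1] for a in instance if a[0] == "p"}
    return set(instance) | {("p", r, r2) for r in firsts for r2 in firsts}

I = {("p", "X", "Y"), ("p", "X", "b"), ("p", "a", "b"),
     ("p", "U", "c"), ("p", "U", "V"), ("q", "a", "c", "d")}
J = apply_full_tgd(I)
print(blocksize(I), blocksize(J))
```

The derived atom p(X,U) connects the two blocks {X,Y} and {U,V}, so the blocksize grows from 2 to 4 even though the TGD introduces no new variables.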

Page 25:

Efficient Core Computation

• Fagin, Kolaitis, and Popa [PODS 2003]
- Target dependencies are empty or contain only EGDs (blocks method and rigidity)

• G….. [PODS 2005]
- Target dependencies without existential quantification (= full)
- Target dependencies with a single atom in the premise
  (they preserve hypertree-width)

• G…., Nash [PODS 2006]
- General target dependencies for which the chase is known to terminate (weakly acyclic or new conditions)

• In summary: Whenever we know we can compute universal solutions in PTIME, we know we can compute their cores in PTIME

Page 29:

Data Exchange

[Diagram: source schema S and target schema T. Σst: source-target TGDs. Σt: target TGDs and EGDs.]

I → apply Σst → J → apply Σt → J´ → compute core → Core(J´)

Theorem: Core(J´) can be computed in polynomial time.