A DICHOTOMY ON THE COMPLEXITY OF CONSISTENT QUERY ANSWERING FOR ATOMS WITH SIMPLE KEYS Paris Koutris Dan Suciu University of Washington

Mar 31, 2015


A DICHOTOMY ON THE COMPLEXITY OF CONSISTENT QUERY ANSWERING FOR ATOMS WITH SIMPLE KEYS

Paris Koutris, Dan Suciu

University of Washington


REPAIRS

• An uncertain instance I for a schema with key constraints
• A repair r of I is a subinstance of I that satisfies the key constraints and is maximal

Instance I, relation R(x, y) with key x:
(a1, b1), (a1, b2), (a2, b2), (a3, b3), (a3, b4), (a4, b4)

The 4 possible repairs:
• { (a1, b1), (a2, b2), (a3, b4), (a4, b4) }
• { (a1, b1), (a2, b2), (a3, b3), (a4, b4) }
• { (a1, b2), (a2, b2), (a3, b4), (a4, b4) }
• { (a1, b2), (a2, b2), (a3, b3), (a4, b4) }
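The repairs of the example can be enumerated mechanically. The sketch below assumes the key is the first attribute of each tuple; the function name `repairs` is illustrative and does not come from the slides.

```python
from collections import defaultdict
from itertools import product


def repairs(tuples, key=0):
    """Return every repair of one relation: exactly one tuple per key value."""
    blocks = defaultdict(list)
    for t in tuples:
        blocks[t[key]].append(t)  # group the tuples that share a key value
    # A maximal consistent subinstance picks exactly one tuple from each block.
    return [set(choice) for choice in product(*blocks.values())]


R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]

all_repairs = repairs(R)
```

The number of repairs is the product of the block sizes (here 2 · 1 · 2 · 1), so it can grow exponentially with the size of the instance.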


CONSISTENT QUERY ANSWERING

• If Q is boolean, we say that I is certain for Q, I |= Q, if for every repair r of I, Q(r) is true

R(x, y), key x:
(a1, b1), (a1, b2), (a2, b2), (a3, b3), (a3, b4), (a4, b4)

S(y, z), key y:
(b1, c1), (b2, c1), (b2, c2), (b3, c3)

• Q() = R(x, y), S(y, z)
• I |= Q
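For a small instance, I |= Q can be checked directly from the definition by evaluating Q on every repair. This is a brute-force sketch that takes exponential time in general and is not the algorithm of the talk; it assumes the key of each relation is its first attribute, and all function names are illustrative.

```python
from collections import defaultdict
from itertools import product


def repairs(tuples):
    """All repairs of one relation whose key is the first attribute."""
    blocks = defaultdict(list)
    for t in tuples:
        blocks[t[0]].append(t)
    return [set(choice) for choice in product(*blocks.values())]


def q_holds(r_rep, s_rep):
    """Q() = R(x, y), S(y, z): some R-tuple joins some S-tuple on y."""
    return any(y == y2 for (_, y) in r_rep for (y2, _) in s_rep)


def is_certain(R, S):
    """I |= Q iff Q is true on every repair (a repair of R paired with a repair of S)."""
    return all(q_holds(r, s) for r in repairs(R) for s in repairs(S))


R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "c1"), ("b2", "c1"), ("b2", "c2"), ("b3", "c3")]

certain = is_certain(R, S)
```

On the example, every repair of R keeps a tuple with key a1 (joining on b1 or b2), and every repair of S keeps one tuple for b1 and one for b2, so the join never becomes empty.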


PROBLEM STATEMENT

CERTAINTY(Q): For a fixed boolean CQ Q, given as input an instance I, does I |= Q?

• In general, CERTAINTY(Q) is in coNP
  – Q1 = R(x, y), S(y, z) : expressible as a first-order query
  – Q2 = R(x, y), S(z, y) : coNP-complete
  – Q3 = R(x, y), S(y, x) : PTIME but not first-order expressible

Conjecture: For every boolean conjunctive query Q, CERTAINTY(Q) is either in PTIME or coNP-complete.


PROGRESS SO FAR

• [Wijsen, 2010]
  – Syntactic characterization of FO-expressible acyclic CQs w/o self-joins
• [Kolaitis and Pema, 2012]
  – A trichotomy for CQs with 2 atoms and no self-joins
• [Wijsen, 2010 & 2013]
  – PTIME algorithm for cyclic queries: Ck = R1(x1, x2), …, Rk(xk, x1)
  – Further classification of acyclic CQs w/o self-joins


OUR CONTRIBUTION

A dichotomy for CQs w/o self-joins where atoms have either
• Simple keys: R(x, y, z), where a single attribute is the key
• Keys that consist of all attributes: S(x, y, z)

Theorem: For every boolean CQ Q w/o self-joins where, for each atom, the key consists of either one attribute or all attributes, CERTAINTY(Q) is either in PTIME or coNP-complete.


OUTLINE

1. The Dichotomy Condition

2. Frugal Repairs & Representable Answers

3. Strongly Connected Graphs



THE QUERY GRAPH

• We equivalently study boolean CQs consisting only of binary relations where one attribute is the key: R(x, y)

• Relations can be consistent (Rc) or inconsistent (Ri)

Query Graph: a directed edge (u, v) for each atom R(u,v)

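Q = Ri(x, y), Si(z, w), Tc(y, w)

[Figure: the query graph G[Q] with edges R: x -> y, S: z -> w, T: y -> w. For an edge R(u, v), uR denotes the source node u and vR the end node v.]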


DEFINITIONS

• x^{+,R} : set of nodes reachable from node x (through a directed path) once we remove the edge R

• R ~ S [source-equivalent]: source nodes uR, uS are in the same SCC

• [R]: the equivalence class of R w.r.t. ~

[Figure: example query graph on nodes x, y, z, u, v, w with edges R, S, T, U, V]

Example:
• x^{+,R} = {x, v, w}
• R ~ T and [R] = {R, T}


COUPLED EDGES

coupled+(R) = edges in [R] + any inconsistent edge S s.t. the source node uS is connected to the end node vR through an (undirected) path that does not intersect the set uR^{+,R}

[Figure: the example query graph with x = uR, y = vR, u = uV, and the set uR^{+,R} highlighted]

coupled+(R):
• contains R, T: [R] = {R, T}
• contains V: path from y (= vR) to u (= uV)
• does not contain U


SPLITTABLE GRAPHS

• Two inconsistent edges R, S are coupled if
  – S in coupled+(R) & R in coupled+(S)

• A graph G[Q] is:
  – unsplittable if it contains a pair of coupled edges that are not source-equivalent
  – splittable otherwise

[Figure: the example query graph on nodes x, y, z, u, v, w with edges R, S, T, U, V]

• coupled+(R) = {R, T, V}
• coupled+(T) = {R, T, V}
• coupled+(V) = {V}
• coupled+(U) = {U, V, R, T}

Only R, T are coupled, and they are source-equivalent: SPLITTABLE!


THE DICHOTOMY CONDITION

[Figure: the example query graph on nodes x, y, z, u, v, w with edges R, S, T, U, V]

Dichotomy Theorem
• If G[Q] is splittable, CERTAINTY(Q) is in PTIME
• If G[Q] is unsplittable, CERTAINTY(Q) is coNP-complete

The example graph is splittable, so CERTAINTY(Q) is in PTIME


EXAMPLES

• R(x, y), S(y, z) : PTIME
• R(x, y), S(y, z), Tc(x, z) : coNP-complete
• R(x, y), S(y, z), Uc(z, y) : PTIME
• R(x, y), S(z, y), Uc(y, z) : coNP-complete

[Figure: the query graph of each example on nodes x, y, z]
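The splittability condition can be tested on these four examples. The sketch below follows one reading of the definitions above: "does not intersect" is taken to mean that no node on the undirected path, endpoints included, lies in uR^{+,R}. The encoding of a query graph as a dict from edge name to (source, end, is_consistent) and all function names are assumptions made for illustration.

```python
def reachable(edges, start, skip=None):
    """Nodes reachable from start by directed paths, ignoring the edge named skip."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for name, (u, v, _) in edges.items():
            if name != skip and u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def same_scc(edges, a, b):
    """True if nodes a and b lie in the same strongly connected component."""
    return b in reachable(edges, a) and a in reachable(edges, b)


def undirected_component(edges, start, blocked):
    """Nodes connected to start by undirected paths that avoid the blocked nodes."""
    if start in blocked:
        return set()
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for u, v, _ in edges.values():
            for p, q in ((u, v), (v, u)):
                if p == node and q not in seen and q not in blocked:
                    seen.add(q)
                    stack.append(q)
    return seen


def coupled_plus(edges, r):
    u_r, v_r, _ = edges[r]
    blocked = reachable(edges, u_r, skip=r)             # the set uR^{+,R}
    comp = undirected_component(edges, v_r, blocked)
    result = {s for s, (u_s, _, _) in edges.items() if same_scc(edges, u_r, u_s)}  # [R]
    result |= {s for s, (u_s, _, cons) in edges.items() if not cons and u_s in comp}
    return result


def is_splittable(edges):
    inconsistent = sorted(n for n, (_, _, cons) in edges.items() if not cons)
    cp = {r: coupled_plus(edges, r) for r in inconsistent}
    for i, r in enumerate(inconsistent):
        for s in inconsistent[i + 1:]:
            coupled = s in cp[r] and r in cp[s]
            if coupled and not same_scc(edges, edges[r][0], edges[s][0]):
                return False  # coupled but not source-equivalent: unsplittable
    return True


ex1 = {"R": ("x", "y", False), "S": ("y", "z", False)}
ex2 = {"R": ("x", "y", False), "S": ("y", "z", False), "T": ("x", "z", True)}
ex3 = {"R": ("x", "y", False), "S": ("y", "z", False), "U": ("z", "y", True)}
ex4 = {"R": ("x", "y", False), "S": ("z", "y", False), "U": ("y", "z", True)}
results = [is_splittable(g) for g in (ex1, ex2, ex3, ex4)]
```

Under this reading, the four results agree with the PTIME / coNP-complete labels listed above.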


OUTLINE

1. The Dichotomy Condition

2. Frugal Repairs & Representable Answers

3. Strongly Connected Graphs



FRUGAL REPAIRS (1)

Definition: A repair r of an instance I is frugal for a boolean query Q if for any other repair r’ of I, Qf(r’) is not strictly contained in Qf(r)

Qf = the full query, obtained from Q by moving all body variables to the head

R(x, y), key x:
(a1, b1), (a1, b2), (a2, b3), (a3, b4), (a4, b4)

S(y, x), key y:
(b1, a1), (b3, a2), (b4, a3), (b4, a4)

• repair r1 = { R(a1, b1), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
  Qf(r1) = { (a1, b1), (a2, b3), (a3, b4) } : not frugal, since Qf(r2) is strictly contained in Qf(r1)
• repair r2 = { R(a1, b2), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
  Qf(r2) = { (a2, b3), (a3, b4) } : frugal


FRUGAL REPAIRS (2)

R(x, y), key x:
(a1, b1), (a1, b2), (a2, b3), (a3, b4), (a4, b4)

S(y, x), key y:
(b1, a1), (b3, a2), (b4, a3), (b4, a4)

• I |= Q if and only if every frugal repair satisfies Q
• We lose no generality if we study only frugal repairs!

Only two frugal repairs:
• Qf(r2) = {(a2, b3), (a3, b4)}
• Qf(r3) = {(a2, b3), (a4, b4)}
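The answer sets of the frugal repairs can be recomputed by brute force. The sketch assumes Q = R(x, y), S(y, x), as the relation headers suggest, with the first attribute of each relation as key; it enumerates all repairs, so it only illustrates the definition. Function names are illustrative.

```python
from collections import defaultdict
from itertools import product


def repairs(tuples):
    """All repairs of one relation whose key is the first attribute."""
    blocks = defaultdict(list)
    for t in tuples:
        blocks[t[0]].append(t)
    return [set(choice) for choice in product(*blocks.values())]


def full_answers(r_rep, s_rep):
    """Qf(x, y) = R(x, y), S(y, x) evaluated on one repair."""
    return frozenset((x, y) for (x, y) in r_rep if (y, x) in s_rep)


def frugal_answer_sets(R, S):
    answers = {full_answers(r, s) for r in repairs(R) for s in repairs(S)}
    # A repair is frugal iff no other repair has a strictly smaller answer set.
    return {a for a in answers if not any(b < a for b in answers)}


R = [("a1", "b1"), ("a1", "b2"), ("a2", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "a1"), ("b3", "a2"), ("b4", "a3"), ("b4", "a4")]

frugal = frugal_answer_sets(R, S)
```

The instance has four repairs; the two that keep R(a1, b1) produce the extra answer (a1, b1), so their answer sets strictly contain a smaller one and they are not frugal.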


OR-SETS

• Efficiently represent all answer sets of frugal repairs
• We use or-sets: <1, 2, 3> means 1 or 2 or 3
  – A = < {1, 3}, {1, 4}, {2, 3}, {2, 4} >
  – We can “compress” A as B = {<1, 2>, <3, 4>}
  – [Libkin and Wong, ‘93] “decompression” α operator: α(B) = A

• The or-set of answer sets for frugal repairs of I for Q:
  – MQ(I) = < {(a2, b3), (a3, b4)}, {(a2, b3), (a4, b4)} >

• Compressed form (set of or-sets):
  – AQ(I) = { < (a2, b3) >, < (a3, b4), (a4, b4) > }
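The decompression step amounts to picking one element from each or-set in every possible way. A minimal sketch, assuming or-sets are modelled as Python tuples and the result as a set of frozensets; the name `alpha` mirrors the α operator but the encoding is an illustration only.

```python
from itertools import product


def alpha(compressed):
    """Decompress a set of or-sets into an or-set of sets (one pick per or-set)."""
    return {frozenset(choice) for choice in product(*compressed)}


B = [(1, 2), (3, 4)]
A = alpha(B)

AQ = [(("a2", "b3"),), (("a3", "b4"), ("a4", "b4"))]
MQ = alpha(AQ)
```

The compressed form has size linear in the number of elements, while the decompressed or-set can have one member per combination of choices.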


REPRESENTABILITY (1)

• An or-set-of-sets S is representable if there exists a set-of-or-sets S0 (compression) such that:
  – α(S0) = S
  – For any distinct or-sets A, B in S0, the tuples in A and B use distinct constants in all coordinates

• The compression of a representable set with active domain of size n has size polynomial in n

• < {(a2, b3), (a3, b4)}, {(a2, b3), (a4, b4)} > is representable; its compression is {< (a2, b3) >, < (a3, b4), (a4, b4) >}
• < {(a2, b3), (a3, b4)}, {(a2, b2), (a4, b4)} > is not representable


REPRESENTABILITY (2)

• I |= Q iff the compression AQ(I) is not empty

• If we can compute AQ(I) in polynomial time, deciding whether I |= Q is in PTIME

Theorem: If G[Q] is a strongly connected graph, MQ(I) is representable and its compression can be computed in time polynomial in the size of I


OUTLINE

1. The Dichotomy Condition

2. Frugal Repairs & Representable Answers

3. Strongly Connected Graphs



CYCLES

• Ck = R1(x1, x2), R2(x2, x3), …, Rk(xk, x1)

• The purified instance contains a collection of disjoint SCCs

• ALGORITHM FrugalC
  – Find the SCCs that contain no directed cycle of length > k
  – For each such SCC i, create an or-set Ai that contains all cycles of length k
  – Output ACk(I) = {A1, A2, …}

Example (k = 3, Q = R(x, y), S(y, z), T(z, x)):

R(x, y): (a1, b1), (a2, b2), (a2, b3)
S(y, z): (b1, c1), (b2, c2), (b3, c2)
T(z, x): (c1, a1), (c2, a2)

[Figure: the instance graph with two SCCs, {a1, b1, c1} and {a2, b2, b3, c2}]

AC3(I) = {<(a1, b1, c1)>, <(a2, b2, c2), (a2, b3, c2)>}
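The steps of FrugalC can be sketched on the example. This is a brute-force illustration, not the polynomial-time procedure: it assumes the input is already purified, that the attributes draw values from disjoint domains, and it searches for long cycles by exhaustive depth-first search. All function names are illustrative.

```python
from collections import defaultdict


def reach(graph, start):
    """Nodes reachable from start along directed edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def scc_of(graph, node):
    """The strongly connected component that contains node."""
    return frozenset(n for n in reach(graph, node) if node in reach(graph, n))


def has_cycle_longer_than(graph, component, k):
    """Exhaustive search for a simple directed cycle with more than k edges."""
    def dfs(start, node, path):
        for nxt in graph[node]:
            if nxt == start:
                if len(path) > k:
                    return True
            elif nxt in component and nxt not in path:
                if dfs(start, nxt, path | {nxt}):
                    return True
        return False
    return any(dfs(s, s, frozenset([s])) for s in component)


def frugal_c(relations):
    """relations = [R1, ..., Rk], each a set of (key, value) pairs."""
    k = len(relations)
    graph = defaultdict(set)
    for rel in relations:
        for a, b in rel:
            graph[a].add(b)
    # All cycles of length k: join R1, ..., Rk and require the path to close.
    paths = [(a,) for a in {a for a, _ in relations[0]}]
    for rel in relations:
        paths = [p + (b,) for p in paths for a, b in rel if a == p[-1]]
    cycles = [p[:-1] for p in paths if p[0] == p[-1]]
    groups = defaultdict(set)
    for c in cycles:
        groups[scc_of(graph, c[0])].add(c)
    # Keep one or-set per SCC that has no cycle longer than k.
    return {frozenset(cs) for comp, cs in groups.items()
            if not has_cycle_longer_than(graph, comp, k)}


R = {("a1", "b1"), ("a2", "b2"), ("a2", "b3")}
S = {("b1", "c1"), ("b2", "c2"), ("b3", "c2")}
T = {("c1", "a1"), ("c2", "a2")}
compressed = frugal_c([R, S, T])
```

On the example, each of the two SCCs contributes one or-set, matching AC3(I) above.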


GENERAL CASE: SCCS (1)

• Recursively split a SCC G into a SCC G’ and a directed path P that intersects G’ only at its start and end node

• The set AG’(I) can be recursively computed

[Figure: the SCC G on nodes x, y, z, t with edges R, S, T, U, V, showing the graph G’ and the path P = y -> t -> z]

AG’(I) = {A1, A2}, with A1 = <(a1, b1, c1)> and A2 = <(a2, b2, c2), (a2, b3, c2)>


GENERAL CASE: SCCS (2)

AG’(I) = {A1, A2}, with A1 = <(a1, b1, c1)> and A2 = <(a2, b2, c2), (a2, b3, c2)>

Any value belongs in a unique or-set

Replacement of G’ (new relations; the superscript c marks a consistent relation):

B(a, b): (A1, [a1b1c1]), (A2, [a2b2c2]), (A2, [a2b3c2])
B1c(b, y): ([a1b1c1], b1), ([a2b2c2], b2), ([a2b3c2], b3)
B2c(b, z): ([a1b1c1], c1), ([a2b2c2], c2), ([a2b3c2], c2)
B0c(z, a): (c1, A1), (c2, A2)

[Figure: the graph after replacing G’, on nodes a, b, y, t, z with edges B, B1c, B2c, B0c, U, V]

A cycle C = a -> b -> y -> t -> z -> a + a chord B2 that is a consistent relation
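The replacement relations can be generated from AG’(I) directly. The sketch assumes each answer tuple is ordered as (x, y, z), that or-sets are named A1, A2, … in input order, and that a tuple identifier is the bracketed concatenation of its values, as in the tables above; the function name `encode` is illustrative.

```python
def encode(or_sets):
    """Build B, B1, B2, B0 from an ordered list of or-sets of (x, y, z) tuples."""
    B, B1, B2, B0 = set(), set(), set(), set()
    for i, or_set in enumerate(or_sets, start=1):
        name = f"A{i}"                 # identifier of the or-set (value of a)
        for (x, y, z) in or_set:
            tid = f"[{x}{y}{z}]"       # identifier of the tuple (value of b)
            B.add((name, tid))         # B(a, b): which or-set the tuple belongs to
            B1.add((tid, y))           # B1(b, y): projection on y
            B2.add((tid, z))           # B2(b, z): projection on z
            B0.add((z, name))          # B0(z, a): each z value has a unique or-set
    return B, B1, B2, B0


AG = [
    [("a1", "b1", "c1")],
    [("a2", "b2", "c2"), ("a2", "b3", "c2")],
]
B, B1, B2, B0 = encode(AG)
```

B is the only relation that can violate its key (A2 has two tuples); B1, B2, and B0 are functions of their first attribute, which is why they can be treated as consistent.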


REST OF THE PROOF

• PTIME algorithm for splittable graphs
  – Find a separator in G[Q] (always exists if a graph is splittable)
  – The separator splits G[Q] into cases with fewer inconsistent edges, which are solved recursively
  – Base case: all edges are consistent (check whether Q(I) is true)

• coNP-hardness
  – Reduction from the Monotone-3SAT problem


CONCLUSIONS

• Significant progress towards proving the dichotomy for the complexity of Certain Query Answering for Conjunctive Queries

• Settle the dichotomy (or trichotomy) even for queries with self-joins!


Thank you!


GENERAL CASE: SCCS (3)

Replacement of G’ (new relations; the superscript c marks a consistent relation):

B(a, b): (A1, [a1b1c1]), (A2, [a2b2c2]), (A2, [a2b3c2])
B1c(b, y): ([a1b1c1], b1), ([a2b2c2], b2), ([a2b3c2], b3)
B2c(b, z): ([a1b1c1], c1), ([a2b2c2], c2), ([a2b3c2], c2)
B0c(z, a): (c1, A1), (c2, A2)

[Figure: the graph after replacing G’, on nodes a, b, y, t, z with edges B, B1c, B2c, B0c, U, V]

• A cycle C = a -> b -> y -> t -> z -> a + a chord B2 that is a consistent relation
• Compute AC for the modified input
• Throw away any or-sets that have a tuple that does not agree with B2


OVERVIEW

• A query graph G[Q] is associated with query Q
• The condition for PTIME (splittability) is defined on G[Q]

• PTIME case:
  – We introduce the notion of frugal repairs & representable answers
  – Algorithm for Strongly Connected Graphs
  – Use the notion of separators to recursively split the query graph (self-reducibility)

• coNP-complete case:
  – Reduction from the Monotone-3SAT problem