CSE 636 Data Integration Answering Queries Using Views MiniCon Algorithm

Dec 21, 2015


CSE 636 Data Integration

Answering Queries Using Views

MiniCon Algorithm


The MiniCon Algorithm

• Concentrate on variables rather than subgoals to create MiniCon Descriptions (MCDs)

• Combine MCDs that only overlap on distinguished view variables

• No containment check!


MiniCon Example

Query:
q(X) :- cites(X,Y), cites(Y,X), sameTopic(X,Y)

Views:
V4(A) :- cites(A,B), cites(B,A)
V5(C,D) :- sameTopic(C,D)
V6(F,H) :- cites(F,G), cites(G,H), sameTopic(F,G)

• Distinguished variable: appears in the head (e.g., X in q; F and H in V6)
• Existential variable: appears only in the body (e.g., Y in q; G in V6)
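The distinction between distinguished and existential variables drives the rest of the algorithm, so a small sketch may help. The representation below (the names `Atom`, `Rule`, `distinguished`, `existential`, and `join_variables` are illustrative assumptions, not taken from the slides) encodes the example query and views and classifies their variables.

```python
from typing import NamedTuple, Tuple


class Atom(NamedTuple):
    pred: str
    args: Tuple[str, ...]


class Rule(NamedTuple):
    head: Tuple[str, ...]    # head (distinguished) variables
    body: Tuple[Atom, ...]   # subgoals


def distinguished(rule: Rule) -> set:
    """Variables that appear in the head."""
    return set(rule.head)


def existential(rule: Rule) -> set:
    """Variables that appear only in the body."""
    body_vars = {v for atom in rule.body for v in atom.args}
    return body_vars - set(rule.head)


def join_variables(rule: Rule) -> set:
    """Variables shared by at least two different subgoals."""
    seen, joins = set(), set()
    for atom in rule.body:
        for v in set(atom.args):
            if v in seen:
                joins.add(v)
            else:
                seen.add(v)
    return joins


q = Rule(("X",), (Atom("cites", ("X", "Y")),
                  Atom("cites", ("Y", "X")),
                  Atom("sameTopic", ("X", "Y"))))
V4 = Rule(("A",), (Atom("cites", ("A", "B")), Atom("cites", ("B", "A"))))
V5 = Rule(("C", "D"), (Atom("sameTopic", ("C", "D")),))
V6 = Rule(("F", "H"), (Atom("cites", ("F", "G")),
                       Atom("cites", ("G", "H")),
                       Atom("sameTopic", ("F", "G"))))

for name, rule in [("q", q), ("V4", V4), ("V5", V5), ("V6", V6)]:
    print(name, sorted(distinguished(rule)), sorted(existential(rule)))
```

In this encoding, Y is the only existential variable of q and it is also a join variable, which is why the views that map Y to one of their own existential variables must cover every subgoal containing Y.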


MiniCon Example

Form buckets (MCDs) more intelligently:
• Ask what is the minimal set of query subgoals that must be covered (via mappings) by each view
• First, look at join conditions in q
• Then, follow joins on existential variables in views


MiniCon Example

Consider V4(A) :- cites(A,B), cites(B,A)
• Can cover query subgoal cites(X,Y)
• To do this, we map: X→A, Y→B
• Y is an existential join variable (and B is existential in V4)
• So V4 needs to cover the query subgoals that contain Y, which are: cites(Y,X) and sameTopic(X,Y)
• V4 can cover the cites subgoal (X→A, Y→B), but not the sameTopic subgoal
• Hence, no MCD is created for V4
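The reasoning above can be sketched as a small search. The function below is a simplified illustration, not the published algorithm in full: `form_mcds` and its tuple-based encoding are assumptions made for this example, subgoal indices are 0-based (the slides count from 1), and constants are not handled. It starts from one query subgoal, maps it onto a view subgoal, and keeps adding every query subgoal that contains a variable mapped to an existential view variable.

```python
def form_mcds(query_head, query_body, view_head, view_body, start):
    """Return (mapping, covered subgoal indices) pairs that cover `start`.

    Atoms are (predicate, args) tuples; mapping sends each query variable
    to the set of view variables it is mapped to.
    """
    dist_v = set(view_head)
    results = []

    def extend(mapping, q_atom, v_atom):
        """Map q_atom onto v_atom; return the extended mapping or None."""
        if q_atom[0] != v_atom[0] or len(q_atom[1]) != len(v_atom[1]):
            return None
        new = {k: set(v) for k, v in mapping.items()}
        for qv, vv in zip(q_atom[1], v_atom[1]):
            new.setdefault(qv, set()).add(vv)
        for qv, targets in new.items():
            # Several images are only acceptable if all are distinguished,
            # because only distinguished variables can be equated later.
            if len(targets) > 1 and not targets <= dist_v:
                return None
            # Head variables of the query must map to distinguished variables.
            if qv in query_head and not targets <= dist_v:
                return None
        return new

    def search(mapping, covered):
        # Obligations: uncovered subgoals containing a variable that is
        # mapped to an existential variable of the view.
        pending = [i for i, atom in enumerate(query_body)
                   if i not in covered
                   and any(qv in mapping and not mapping[qv] <= dist_v
                           for qv in atom[1])]
        if not pending:
            results.append((mapping, frozenset(covered)))
            return
        i = pending[0]
        for v_atom in view_body:
            new = extend(mapping, query_body[i], v_atom)
            if new is not None:
                search(new, covered | {i})

    for v_atom in view_body:
        first = extend({}, query_body[start], v_atom)
        if first is not None:
            search(first, {start})
    return results


QUERY_HEAD = ("X",)
QUERY_BODY = [("cites", ("X", "Y")), ("cites", ("Y", "X")),
              ("sameTopic", ("X", "Y"))]
V4_HEAD, V4_BODY = ("A",), [("cites", ("A", "B")), ("cites", ("B", "A"))]

# Start from cites(X,Y); the sameTopic obligation cannot be met by V4.
print(form_mcds(QUERY_HEAD, QUERY_BODY, V4_HEAD, V4_BODY, start=0))
```

For V4 the search finds no candidate: mapping cites(X,Y) onto cites(B,A) would send the head variable X to the existential variable B, and mapping it onto cites(A,B) leaves the sameTopic(X,Y) obligation unfulfilled.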


MiniCon Example

Consider V5(C,D) :- sameTopic(C,D)
• Can cover sameTopic(X,Y) (and nothing else)
• To do this, we map: X→C, Y→D
• The MCD says how the query subgoals may be covered by V5

MiniCon Descriptions (MCDs)
View   Mappings     Query Subgoals Covered
V5     X→C, Y→D     3


MiniCon Example

Consider V6(F,H) :- cites(F,G), cites(G,H), sameTopic(F,G)
• Can cover query subgoal sameTopic(X,Y)
• To do this, we map: X→F, Y→G
• Y is an existential join variable
• So V6 needs to cover the query subgoals that contain Y, which are: cites(X,Y) and cites(Y,X)
• To do this, we map: X→F, Y→G and X→H


MiniCon Example

We get the following MCD for V6(F,H) :- cites(F,G), cites(G,H), sameTopic(F,G):

MiniCon Descriptions (MCDs)
View   Mappings          Query Subgoals Covered
V5     X→C, Y→D          3
V6     X→F, Y→G, X→H     1,2,3

Problem: Mappings do not define a function
• X is mapped to both F and H
• Can we fix this?
• Yes, if both F and H are distinguished in V6
• No, otherwise
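The repair condition can be checked mechanically. The sketch below is an illustration under assumptions (the function name `equate_head_variables` and the pair-list encoding of the mappings are hypothetical): it groups the mapping pairs by query variable and, when one query variable has several images, equates them only if all of them are head variables of the view.

```python
def equate_head_variables(pairs, view_head):
    """Return the view head after equating variables, or None if unfixable.

    pairs: list of (query variable, view variable) mappings.
    view_head: tuple of the view's distinguished variables.
    """
    parent = {v: v for v in view_head}  # union-find over head variables

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    images = {}
    for q_var, v_var in pairs:
        images.setdefault(q_var, []).append(v_var)

    for v_vars in images.values():
        distinct = list(dict.fromkeys(v_vars))
        if len(distinct) > 1:
            # Existential view variables cannot be equated from outside.
            if any(v not in parent for v in distinct):
                return None
            root = find(distinct[0])
            for v in distinct[1:]:
                parent[find(v)] = root
    return tuple(find(v) for v in view_head)


pairs = [("X", "F"), ("Y", "G"), ("X", "H")]
print(equate_head_variables(pairs, ("F", "H")))  # F and H are both distinguished
print(equate_head_variables(pairs, ("F",)))      # hypothetical view where H is existential
```

With the real V6 head (F,H) the result is ('F', 'F'), meaning the view must be used with both head positions set equal; with the hypothetical head (F,) the function returns None, matching the "No, otherwise" case.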


MiniCon Example

We fix the problem by making F and H equal when producing the rewriting:

Query rewriting: q’(F) :- V6(F,F)

Since V6 covers all query subgoals, no other view is needed
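MiniCon itself does not need a containment check, but as a sanity check one can unfold the rewriting and look for a containment mapping from the query into the unfolded body. The sketch below does this by brute force; `unfold` and `containment_mapping` are hypothetical helper names, and the unfolding assumes the call arguments do not clash with the view's existential variable names (true here, since G is not used as an argument).

```python
from itertools import product


def unfold(view_head, view_body, call_args):
    """Replace head variables of the view by the arguments of the call."""
    subst = dict(zip(view_head, call_args))
    return [(pred, tuple(subst.get(a, a) for a in args))
            for pred, args in view_body]


def containment_mapping(q_head, q_body, e_head, e_body):
    """Find h with h(q_head) == e_head and h(q_body) inside e_body, if any."""
    q_vars = sorted({a for _, args in q_body for a in args} | set(q_head))
    e_vars = sorted({a for _, args in e_body for a in args} | set(e_head))
    e_atoms = set(e_body)
    for image in product(e_vars, repeat=len(q_vars)):
        h = dict(zip(q_vars, image))
        if tuple(h[v] for v in q_head) != tuple(e_head):
            continue
        if all((pred, tuple(h[a] for a in args)) in e_atoms
               for pred, args in q_body):
            return h
    return None


q_head = ("X",)
q_body = [("cites", ("X", "Y")), ("cites", ("Y", "X")),
          ("sameTopic", ("X", "Y"))]
v6_head = ("F", "H")
v6_body = [("cites", ("F", "G")), ("cites", ("G", "H")),
           ("sameTopic", ("F", "G"))]

expansion = unfold(v6_head, v6_body, ("F", "F"))   # body of q'(F) :- V6(F,F)
print(expansion)
print(containment_mapping(q_head, q_body, ("F",), expansion))
```

The mapping found (X to F, Y to G) shows that every answer produced by the rewriting is also an answer to the original query.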


MiniCon Algorithm

When forming MCDs:
• Join obligations in query via existential variables cannot be fulfilled by another view
• Unless those variables are mapped to the view’s distinguished variables!
• Current view should fulfill them all

When combining MCDs:
• Combined MCDs must cover disjoint sets of query subgoals
• All query subgoals must be covered
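The two combination rules translate into a short recursive enumeration. This is a minimal sketch with assumed names (`combine`, and MCDs encoded as `(view, covered subgoals)` pairs using the 1-based subgoal numbers of the MCD tables); the mappings are left out because only coverage matters at this step.

```python
def combine(mcds, n_subgoals):
    """Enumerate sets of MCDs that cover subgoals 1..n exactly once each."""
    all_goals = frozenset(range(1, n_subgoals + 1))
    results = []

    def extend(chosen, covered):
        if covered == all_goals:
            results.append(list(chosen))
            return
        # Always cover the smallest uncovered subgoal next, so each
        # combination is produced only once.
        target = min(all_goals - covered)
        for mcd in mcds:
            view, goals = mcd
            if target in goals and not (goals & covered):
                extend(chosen + [mcd], covered | goals)

    extend([], frozenset())
    return results


# MCDs from the first example: V5 covers subgoal 3, V6 covers 1, 2 and 3.
example_mcds = [("V5", frozenset({3})), ("V6", frozenset({1, 2, 3}))]
print(combine(example_mcds, 3))
```

For the first example only the V6 MCD survives: V5 covers subgoal 3 alone, and no other MCD covers subgoals 1 and 2 without overlapping it.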


MiniCon Algorithm

• For every possible way of covering all query subgoals, “chain” the corresponding MCDs of views in the covering, and form a rewriting

• The union of all such rewritings is the rewriting of the query w.r.t. the given views

• Best known algorithm for answering queries using views

• Empirically shown to scale to 1000s of views, i.e., sources


MiniCon Example 2

Query:
q(X) :- cites(X,Y), cites(Z,X), inSIGMOD(X)

Views:
V7(A) :- cites(A,B), inSIGMOD(A)
V8(C) :- cites(D,C), inSIGMOD(C)

Step 1:

MiniCon Descriptions (MCDs)
View   Mappings     Query Subgoals Covered
V7     X→A, Y→B     1
V8     Z→D, X→C     2
V7     X→A          3
V8     X→C          3


MiniCon Example 2

Step 2:
Query rewriting 1: q1(X) :- V7(X), V8(X), V7(X)
Query rewriting 2: q2(X) :- V7(X), V8(X), V8(X)

Final rewriting: q’(X) :- V7(X), V8(X)
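Step 2 can be reproduced with a few lines of Python. The sketch below is illustrative only: the tuple layout of `MCDS` and the helpers `view_atom`, `rewritings`, and `minimize` are assumptions, and it relies on the fact that in this example every MCD covers exactly one subgoal, so the coverings are simply one choice per subgoal.

```python
from itertools import product

# (view name, view head, mapping query var -> view var, covered subgoal)
MCDS = [
    ("V7", ("A",), {"X": "A", "Y": "B"}, 1),
    ("V8", ("C",), {"Z": "D", "X": "C"}, 2),
    ("V7", ("A",), {"X": "A"}, 3),
    ("V8", ("C",), {"X": "C"}, 3),
]


def view_atom(name, head, mapping):
    """Write the view call in terms of query variables."""
    inverse = {v: q for q, v in mapping.items()}
    # An unmapped head variable would need a fresh name; "_" + h is a placeholder.
    return (name, tuple(inverse.get(h, "_" + h) for h in head))


def rewritings(mcds, n_subgoals):
    """One rewriting body per way of picking an MCD for each subgoal."""
    by_goal = [[m for m in mcds if m[3] == g] for g in range(1, n_subgoals + 1)]
    for combo in product(*by_goal):
        yield [view_atom(name, head, mapping) for name, head, mapping, _ in combo]


def minimize(body):
    """Drop repeated atoms while keeping their first-seen order."""
    return tuple(dict.fromkeys(body))


raw = list(rewritings(MCDS, 3))
final = list(dict.fromkeys(minimize(body) for body in raw))

for body in final:
    print("q'(X) :- " + ", ".join(f"{n}({','.join(args)})" for n, args in body))
```

Both raw rewritings contain a repeated atom, and after removing the repeats they collapse to the same body, V7(X), V8(X).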



References

• Rachel Pottinger and Alon Halevy. MiniCon: A Scalable Algorithm for Answering Queries Using Views. VLDB Journal, 2001.

• Laks V.S. Lakshmanan. Lecture Slides.

• Alon Halevy. Answering Queries Using Views: A Survey. VLDB Journal, 2001. http://citeseer.ist.psu.edu/halevy00answering.html