Information Integration Using Logical Views Jeffrey D. Ullman

Dec 14, 2015

Bella Prentis
Information Integration Using Logical Views

Jeffrey D. Ullman

Overview

Information Integration Systems

Global-as-view (Gav.) vs. Local-as-view (Lav.)

Query Reformulation

Specification of Source Description

Adding new sources

Query Reformulation

Problem: rewrite a user query expressed in the mediated schema into a query expressed in the source schema

Given a query Q in terms of the mediator schema relations, and descriptions of information sources

Find a query Q’ that uses only the source relations, such that

– Q’ ⊆ Q, and
– Q’ provides all possible answers to Q given the sources

Solving Queries by Views

[Figure labels: Mediator Relations, Source Relations]

Query Rewriting Using Views

Query Containment: q’ ⊆ q ⇔ ∀D: q’(D) ⊆ q(D)

Query Equivalence: q’ = q ⇔ q’ ⊆ q ∧ q ⊆ q’

Given query q and view definitions V = {v1, …, vn}

q’ is an Equivalent Rewriting of q using V if
– q’ refers only to views in V, and
– q’ = q

q’ is a Maximally-Contained Rewriting of q using V if
– q’ refers only to views in V, and
– q’ ⊆ q, and
– There is no contained rewriting q1 (q1 ⊆ q) such that q’ ⊆ q1 and q1 ≠ q’

Computation Complexity

[Figure: complexity classes with superscript p and subscripts k and k+1; class symbols lost in extraction]

Complexity of Query Containment

Conjunctive Queries (CQ) (NP-complete)
– Q1: p(X,Z) :- a(X,Y) & a(Y,Z)
– Q2: p(X,Z) :- a(X,Y) & a(V,Z)

CQ’s With Negation (Π_2^p-complete)
– Q1: p(X,Z) :- a(X,Y) & a(Y,Z) & NOT a(X,Z)

CQ’s With Arithmetic Comparison (Π_2^p-complete)
– Q1: p(X,Z) :- a(X,Y) & a(Y,Z) & X<Y

Datalog Programs
– p(A,C) :- a(A,B) & b(B,C)
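For the plain conjunctive queries Q1 and Q2 on this slide, containment can be tested by searching for a containment mapping: Q1 ⊆ Q2 holds exactly when the variables of Q2 can be mapped to terms of Q1 so that the heads coincide and every subgoal of Q2 lands on a subgoal of Q1. The following is a minimal brute-force sketch of that test, not an optimized procedure; the tuple encoding of atoms and the name `contained_in` are assumptions made for illustration, and all arguments are assumed to be variables.

```python
from itertools import product

# Assumed encoding: an atom is (predicate, (arg, ...)) and a conjunctive
# query is (head_atom, [body_atoms]). All arguments are variable names.

def contained_in(q1, q2):
    """Return True if q1 is contained in q2 (q1 ⊆ q2).

    Searches exhaustively for a containment mapping from the variables
    of q2 to the terms of q1. The search is exponential in the size of
    q2, in line with the NP-completeness of the problem.
    """
    (h1_pred, h1_args), body1 = q1
    (h2_pred, h2_args), body2 = q2
    if h1_pred != h2_pred or len(h1_args) != len(h2_args):
        return False
    vars2 = sorted({v for _, args in body2 for v in args} | set(h2_args))
    terms1 = sorted({v for _, args in body1 for v in args} | set(h1_args))
    body1_atoms = set(body1)
    for image in product(terms1, repeat=len(vars2)):
        m = dict(zip(vars2, image))
        # The head of q2 must be mapped exactly onto the head of q1.
        if tuple(m[v] for v in h2_args) != tuple(h1_args):
            continue
        # Every subgoal of q2 must be mapped onto some subgoal of q1.
        if all((p, tuple(m[v] for v in args)) in body1_atoms
               for p, args in body2):
            return True
    return False

Q1 = (("p", ("X", "Z")), [("a", ("X", "Y")), ("a", ("Y", "Z"))])
Q2 = (("p", ("X", "Z")), [("a", ("X", "Y")), ("a", ("V", "Z"))])

q1_in_q2 = contained_in(Q1, Q2)  # mapping V -> Y, all other variables fixed
q2_in_q1 = contained_in(Q2, Q1)  # no subgoal of Q2 can play the role of a(Y,Z)
```

Here Q1 ⊆ Q2 but not the reverse: Q2 does not force the two `a` facts to share a middle value, so it can return pairs that Q1 does not.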

Specification of Source Description

Views: resources used by the integrator to help answer queries

Gav: mediator relations are defined as views over source relations

Lav: source relations are defined as views over mediator relations

Information Integration Systems

Information Manifold (IM)
– AT&T
– Local-as-View (Lav)
– Description logic
– Source relations defined as views of mediator relations (a collection of global predicates)

Tsimmis
– Stanford and IBM
– Global-as-View (Gav)
– Mediator relations defined as views of source relations

IM Example

Global Predicates: Mediator relations

IM Example (Cont.)

Views: Source Relations

Query: “What are Sally’s phone and office?”

[Figure labels: Mediator Relations]

IM Example (Cont.)

Answer: Source Relations

Query reformulation: Bucket Algorithm (requires checking query containment, which is NP-complete in the length of the query)
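The first phase of a bucket-style reformulation can be sketched as follows: for each subgoal of the query, collect the source views that could supply it, then combine one view per bucket into candidate rewritings. This is a simplified illustration, not the Information Manifold implementation. The mediator relations (`emp`, `phone`, `office`, `mgr`, `dept`), the views `v1` to `v3`, and all function names are hypothetical, since the slide's own example relations did not survive extraction. Repeated variables are not unified, and each candidate would still need the containment check mentioned above.

```python
from itertools import product

def is_var(term):
    """Convention of this sketch: variables start with an uppercase letter."""
    return term[:1].isupper()

def compatible(q_term, v_term, q_head, v_head):
    """Can one argument of a query subgoal be served by a view argument?"""
    if not is_var(v_term):
        # A view constant matches the same constant or any query variable.
        return is_var(q_term) or q_term == v_term
    if not is_var(q_term) or q_term in q_head:
        # Query constants and distinguished variables must be exported
        # by the view, i.e. appear in the view's head.
        return v_term in v_head
    return True

def build_buckets(query, views):
    """Return one bucket (list of view names) per query subgoal."""
    (_, q_head), q_body = query
    buckets = []
    for pred, args in q_body:
        bucket = [
            name
            for name, ((_, v_head), v_body) in views.items()
            if any(
                v_pred == pred and len(v_args) == len(args)
                and all(compatible(q, v, q_head, v_head)
                        for q, v in zip(args, v_args))
                for v_pred, v_args in v_body
            )
        ]
        buckets.append(bucket)
    return buckets

# Hypothetical Lav source descriptions: each view is defined over mediator relations.
VIEWS = {
    "v1": (("v1", ("E", "P", "M")),
           [("emp", ("E",)), ("phone", ("E", "P")), ("mgr", ("E", "M"))]),
    "v2": (("v2", ("E", "O", "D")),
           [("emp", ("E",)), ("office", ("E", "O")), ("dept", ("E", "D"))]),
    "v3": (("v3", ("E", "P")),
           [("emp", ("E",)), ("phone", ("E", "P")), ("dept", ("E", "toy"))]),
}

# "What are Sally's phone and office?" over the hypothetical mediator schema.
QUERY = (("ans", ("P", "O")),
         [("phone", ("sally", "P")), ("office", ("sally", "O"))])

buckets = build_buckets(QUERY, VIEWS)
candidates = list(product(*buckets))  # one view per subgoal
```

With these assumed views, the `phone` subgoal can be served by `v1` or `v3` and the `office` subgoal only by `v2`, giving two candidate combinations to test for containment in the query.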

Advantages and Disadvantages (IM)

Advantage: adding new sources
– Mediator (global predicates, source descriptions)
– Query processing

Disadvantage: query reformulation (Bucket algorithm)

Tsimmis

OEM and MSL

Mediator Relations

Tsimmis Example

Exported OEM Objects

Query: “What are Sally’s phone and office?”

[Figure labels: Mediator Relations, Source Relations]

Advantages and Disadvantages (Tsimmis)

Advantage
– Query reformulation: rule unfolding

Disadvantage
– Mediation description
– Adding, removing, and modifying source description
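Rule unfolding under Gav can be illustrated with a short sketch: each mediator subgoal in the query is replaced by the body of a rule that defines it over source relations, and alternative rules yield a union of conjunctive queries. This is only an illustration of the idea, written over flat relations rather than Tsimmis's OEM objects and MSL rules. The relation names (`src1`, `src2`, `src3`), the rules, and the function names are hypothetical, and rule heads are assumed to contain distinct variables only.

```python
from itertools import count, product

def is_var(term):
    """Convention of this sketch: variables start with an uppercase letter."""
    return term[:1].isupper()

def rename(term, subst, suffix):
    """Apply the head substitution; give other rule variables a fresh name."""
    if term in subst:
        return subst[term]
    return f"{term}_{suffix}" if is_var(term) else term

def unfold(query, rules):
    """Unfold mediator subgoals into source subgoals.

    Returns a list of conjunctive queries over source relations,
    one per combination of applicable rules.
    """
    head, body = query
    fresh = count(1)
    options = []
    for pred, args in body:
        alternatives = []
        for (r_pred, r_args), r_body in rules:
            if r_pred != pred or len(r_args) != len(args):
                continue
            subst = dict(zip(r_args, args))
            suffix = next(fresh)
            alternatives.append(
                [(p, tuple(rename(t, subst, suffix) for t in a))
                 for p, a in r_body]
            )
        options.append(alternatives)
    return [(head, [atom for part in combo for atom in part])
            for combo in product(*options)]

# Hypothetical Gav rules: mediator relations defined over source relations.
RULES = [
    (("phone", ("E", "P")), [("src1", ("E", "P", "M"))]),
    (("phone", ("E", "P")), [("src3", ("E", "P"))]),
    (("office", ("E", "O")), [("src2", ("E", "O", "D"))]),
]

# "What are Sally's phone and office?" over the hypothetical mediator schema.
QUERY = (("ans", ("P", "O")),
         [("phone", ("sally", "P")), ("office", ("sally", "O"))])

rewritings = unfold(QUERY, RULES)
```

Because `phone` has two defining rules in this assumed setup, unfolding produces two source-level queries, and no containment test is needed. The cost appears elsewhere: adding or changing a source means revisiting the mediator rules themselves.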

IM vs. Tsimmis

Query Reformulation

Adding Sources

Levels of Mediation

Semistructured Data

Constraints

Automatic Generation of Components (Wrappers and Mediators)