Information Integration Using Logical Views Jeffrey D. Ullman.

Post on 14-Dec-2015

Overview

Information Integration Systems

Global-as-view (Gav) vs. Local-as-view (Lav)

Query Reformulation

Specification of Source Description

Adding new sources

Query Reformulation

Problem: rewrite a user query expressed in the mediated schema into a query expressed in the source schema

Given a query Q in terms of the mediator schema relations, and descriptions of information sources

Find a query Q’ that uses only the source relations, such that

– Q’ ⊆ Q, and
– Q’ provides all possible answers to Q given the sources

Solving Queries by Views

Mediator Relations

Source Relations

Query Rewriting Using Views

Query Containment: q’ ⊆ q ⇔ ∀D, q’(D) ⊆ q(D)

Query Equivalence: q’ = q ⇔ q’ ⊆ q ∧ q ⊆ q’

Given query q and view definitions V = {v1, …, vn}

q’ is an Equivalent Rewriting of q using V if
– q’ refers only to views in V, and
– q’ = q

q’ is a Maximally-Contained Rewriting of q using V if
– q’ refers only to views in V, and
– q’ ⊆ q, and
– There is no rewriting q1 such that q’ ⊆ q1 ⊆ q and q1 ≠ q’

Computation Complexity

[Figure: polynomial hierarchy (classes with superscript p and subscript k)]

Complexity of Query Containment

Conjunctive Queries (CQ) (NP-Complete)
– Q1: p(X,Z) :- a(X,Y) & a(Y,Z)
– Q2: p(X,Z) :- a(X,Y) & a(V,Z)

CQ’s With Negation (Π₂ᵖ-Complete)
– Q1: p(X,Z) :- a(X,Y) & a(Y,Z) & NOT a(X,Z)

CQ’s With Arithmetic Comparison (Π₂ᵖ-Complete)
– Q1: p(X,Z) :- a(X,Y) & a(Y,Z) & X<Y

Datalog Programs
– p(A,C) :- a(A,B) & b(B,C)
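For plain conjunctive queries, containment can be decided by searching for a containment mapping. The sketch below checks the Q1 and Q2 from the conjunctive-query case above. The tuple encoding of atoms, the helper names, and the convention that variables start with an uppercase letter are assumptions of this example, and the brute-force search is exponential in the number of subgoals, in line with the NP-completeness noted above.

```python
from itertools import product

def is_var(term):
    # Convention assumed for this sketch: variables start with an uppercase letter.
    return term[:1].isupper()

def extend(mapping, src, dst):
    """Try to extend `mapping` so that atom `src` is sent onto atom `dst`."""
    (s_pred, s_args), (d_pred, d_args) = src, dst
    if s_pred != d_pred or len(s_args) != len(d_args):
        return False
    for a, b in zip(s_args, d_args):
        if is_var(a):
            if mapping.setdefault(a, b) != b:  # variable already mapped elsewhere
                return False
        elif a != b:  # constants must map to themselves
            return False
    return True

def contained_in(q1, q2):
    """True if q1 ⊆ q2, i.e. some containment mapping sends q2 onto q1."""
    (head1, body1), (head2, body2) = q1, q2
    # Try every assignment of q2's subgoals to subgoals of q1 (exponential search).
    for targets in product(body1, repeat=len(body2)):
        mapping = {}
        if extend(mapping, head2, head1) and all(
            extend(mapping, s, t) for s, t in zip(body2, targets)
        ):
            return True
    return False

# A query is (head atom, list of body atoms); an atom is (predicate, argument tuple).
Q1 = (("p", ("X", "Z")), [("a", ("X", "Y")), ("a", ("Y", "Z"))])
Q2 = (("p", ("X", "Z")), [("a", ("X", "Y")), ("a", ("V", "Z"))])

print(contained_in(Q1, Q2))  # True: X->X, Y->Y, V->Y, Z->Z
print(contained_in(Q2, Q1))  # False
```

Q1 ⊆ Q2 holds because Q2 only drops the join on Y; the reverse fails because no mapping can send Y to both ends of the join.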

Specification of Source Description

Views: resources used by the integrator to help answer queries

Gav: mediator relations defined as views over source relations

Lav: source relations defined as views over mediator relations

Information Integration Systems

Information Manifold (IM)
– AT&T
– Local-as-View (Lav)
– Description logic
– Source relations defined as views of mediator relations (a collection of global predicates)

Tsimmis
– Stanford and IBM
– Global-as-View (Gav)
– Mediator relations defined as views of source relations

IM Example

Global Predicates: Mediator relations

IM Example (Cont.)

Views: Source Relations

Query: “What are Sally’s phone and office?”

Mediator Relations

IM Example (Cont.)

Answer: Source Relations

Query reformulation: Bucket Algorithm (requires checking query containment, which is NP-complete in the query length)
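The first phase of a bucket-style reformulation can be sketched as follows: for each query subgoal, collect the views that could supply it, then enumerate one choice per bucket. The view definitions v1 and v2 and the global predicates emp, phone, office, mgr, and dept are hypothetical stand-ins, since the slide's own definitions did not survive in this transcript. The sketch is simplified (no equalities are added, view constants must match exactly), and each candidate would still need the containment test before being accepted.

```python
from itertools import count, product

_fresh = count()  # source of unique suffixes for renamed variables

def is_var(term):
    # Convention assumed for this sketch: variables start with an uppercase letter.
    return term[:1].isupper()

def bucket(goal, head_vars, views):
    """Collect view atoms that could supply the query subgoal `goal`."""
    g_pred, g_args = goal
    entries = []
    for name, (v_head, v_body) in views.items():
        for s_pred, s_args in v_body:
            if s_pred != g_pred or len(s_args) != len(g_args):
                continue
            theta = {}  # view variable -> query term
            ok = True
            for a, b in zip(s_args, g_args):
                if not is_var(a):
                    ok = a == b   # simplification: view constants must match exactly
                elif theta.get(a, b) != b:
                    ok = False    # simplification: no equalities are added
                elif (not is_var(b) or b in head_vars) and a not in v_head:
                    ok = False    # the value must be visible in the view head
                else:
                    theta[a] = b
                if not ok:
                    break
            if ok:
                k = next(_fresh)
                # Head variables not bound by theta get a fresh name.
                entries.append((name, tuple(theta.get(h, f"{h}_{k}") for h in v_head)))
    return entries

def candidates(query, views):
    """One bucket per query subgoal, then every combination of one entry per bucket."""
    head, body = query
    head_vars = {a for a in head[1] if is_var(a)}
    buckets = [bucket(g, head_vars, views) for g in body]
    return [list(combo) for combo in product(*buckets)]

# Hypothetical Lav source descriptions: view name -> (head variables, body over global predicates).
views = {
    "v1": (("E", "P", "M"), [("emp", ("E",)), ("phone", ("E", "P")), ("mgr", ("E", "M"))]),
    "v2": (("E", "O", "D"), [("emp", ("E",)), ("office", ("E", "O")), ("dept", ("E", "D"))]),
}
# "What are Sally's phone and office?"
query = (("q", ("P", "O")), [("phone", ("sally", "P")), ("office", ("sally", "O"))])

for cand in candidates(query, views):
    print(cand)
```

With these definitions there is a single candidate, joining v1 (for the phone) with v2 (for the office) on the constant sally.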

Advantages and Disadvantages (IM)

Advantage: adding new sources
– Mediator (global predicates, source descriptions)
– Query processing

Disadvantage: query reformulation (Bucket algorithm)

Tsimmis

OEM and MSL

Mediator Relations

Tsimmis Example

Exported OEM Objects

Query: “What are Sally’s phone and office?”

Mediator Relations

Source Relations

Advantages and Disadvantages (Tsimmis)

Advantage
– Query reformulation: rule unfolding

Disadvantage
– Mediation description
– Adding, removing, and modifying source descriptions
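Rule unfolding under Gav can be illustrated with a small relational sketch: each mediator atom in the query is replaced by the body of its view definition. The mediator relation emp_info and the source relations s1 and s2 are hypothetical (Tsimmis itself works on OEM objects with MSL rules, which this example does not model), and the sketch assumes one definition per mediator relation with distinct head variables.

```python
from itertools import count

_fresh = count()  # source of unique suffixes for renamed variables

def is_var(term):
    # Convention assumed for this sketch: variables start with an uppercase letter.
    return term[:1].isupper()

def unfold(query_body, views):
    """Replace each mediator atom by the body of its (single) view definition."""
    result = []
    for pred, args in query_body:
        if pred not in views:
            result.append((pred, args))  # already a source relation
            continue
        head_args, body = views[pred]
        subst = dict(zip(head_args, args))  # head variable -> query term
        suffix = next(_fresh)
        for b_pred, b_args in body:
            # Head variables take the query's terms; other variables are renamed apart.
            new_args = tuple(
                subst.get(a, f"{a}_{suffix}") if is_var(a) else a for a in b_args
            )
            result.append((b_pred, new_args))
    return result

# Hypothetical Gav description: emp_info(E,P,O) :- s1(E,P) & s2(E,O)
views = {"emp_info": (("E", "P", "O"), [("s1", ("E", "P")), ("s2", ("E", "O"))])}

# "What are Sally's phone and office?" posed against the mediator relation.
query_body = [("emp_info", ("sally", "P", "O"))]

print(unfold(query_body, views))
# [('s1', ('sally', 'P')), ('s2', ('sally', 'O'))]
```

No search is involved, which is why reformulation is listed as the advantage here; the cost moves to maintaining the view definitions whenever a source changes.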

IM vs. Tsimmis

Query Reformulation

Adding Sources

Levels of Mediation

Semistructured Data

Constraints

Automatic Generation of Components (Wrappers and Mediators)
