September 6, 2001
Università di Ancona, CoopIS’01
COOPERATION STRATEGIES FOR
INFORMATION INTEGRATION
Maurizio Panti, Luca Spalazzi, Loris Penserini
{panti,spalazzi,pense}@inform.unian.it
Istituto di Informatica - Università degli studi di Ancona
Talk Overview
  Motivations and Goals
  Local strategies
  Cooperation strategies
    the choice of partners
    the choice of queries
    the choice of answers
  Discussion
Motivation
Information systems are collections of information sources and information consumers:
  Distributed
  Heterogeneous
    at physical level
    at logical level
    at conceptual level (names and schemas)
  Dynamic
    changes of information sources or their schemas
    changes of information consumers or their needs
Goal
Rewriting a consumer’s query into queries to specific information sources
when we have
a distributed, heterogeneous, and strongly dynamic information system.
Related Work
Usually, query rewriting and information integration systems adopt the Mediator Architecture
[TSIMMIS, Squirrel, WHIPS, Carnot, SIMS, Information Manifold, Infomaster]:
  dynamic sources: systems are overloaded with expensive updating operations;
  dynamic consumers: systems do not perform user profiling.
[Figure: Mediator Architecture, with layers of consumers, mediators, and sources]
Information Source
wrapper: a description logic as data modelling and query language [e.g. C-Classic]
source query processing is based on rewriting queries using views over the local source [adapted from Beeri, Levy, Rousset]
source: w1
  concept hierarchy: publication > {journal, conference}; journal > {acm_trans, ieee_trans, ...}
  people = pub.publication ∏ name.String ∏ affiliation.String
  publication = title.String ∏ booktitle.String ∏ publisher.String ∏ keyword.String
  Wrapper over a DBMS with tables:
    publication (title, booktitle, publisher, keyword)
    people (pub, name, affiliation)
Mediator
mediated schema: a description logic as data modelling and query language [e.g. C-Classic]
query processing is based on rewriting queries using views over the distributed sources [adapted from Beeri, Levy, Rousset]
[Figure: a mediator, composed of a Thesaurus and a Mediated Schema, placed between consumers and sources]
Rewriting query using views
Retrieval: conjunction of concepts that are maximally contained in the query
Rewriting: composition of the rewritings of the retrieved concepts
query: Q = pub.(ai ∏ db) ∏ pub.acm
[Figure: M's thesaurus (T, journal, type.{“Trans”}, acm_trans, acm_journal, acm, ai, agents, db) and mediated schema]
mediated schema concepts:
  H = pub.(journal ∏ db)
  I = pub.acm_journal
  J = pub.ai
  K = pub.(agents ∏ db)
  H ∏ J = pub.(ai ∏ db ∏ journal)
retrieved concepts: {K, (H ∏ J)} for pub.(ai ∏ db); {I} for pub.acm
rewrite: (view(K)∏view(I))U(view(H)∏view(J)∏view(I))
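The composition step can be sketched in a few lines of Python. This is only an illustration, not the authors' algorithm: it assumes retrieval has already produced, for each conjunct of the query, a list of alternatives, where each alternative is a tuple of mediated-schema concept names. The function name `compose_rewriting` and this encoding are hypothetical.

```python
from itertools import product

def compose_rewriting(retrieved):
    """Combine per-conjunct alternatives into a union of conjunctions of views.

    retrieved: one list per query conjunct; each list holds alternatives,
    and each alternative is a tuple of concept names (hypothetical encoding).
    """
    disjuncts = []
    # Pick one alternative for every conjunct, in every possible way.
    for choice in product(*retrieved):
        names = [name for alternative in choice for name in alternative]
        disjuncts.append("(" + "∏".join(f"view({n})" for n in names) + ")")
    return "U".join(disjuncts)

# Q = pub.(ai ∏ db) ∏ pub.acm with the retrieved concepts {K, (H ∏ J)} and {I}.
retrieved = [[("K",), ("H", "J")], [("I",)]]
rewriting = compose_rewriting(retrieved)
print(rewriting)
```

With the retrieved concepts of the example, the sketch yields the same expression as the rewrite line of the slide.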
Query Execution
Mediator M
  rewrite: (view(K)∏view(I))U(view(H)∏view(J)∏view(I))
  data (returned by sources w1, w2, w3): #id, pub, ...
source: w1
  concept hierarchy: publication > {conference, journal}; journal > {acm_trans, ieee_trans, ...}
  acm_trans = journal ∏ type.{“Trans”} ∏ publisher.{“ACM”}
  journal = title.String ∏ booktitle.String ∏ type.{“Trans”, “Magazine”} ∏ publisher.String ∏ keyword.String
source: w2
  concept hierarchy: db > {object-oriented, active, federated, ...}
  db = title.String ∏ booktitle.String ∏ type.String ∏ year.Integer
  people = pub.db ∏ name.String ∏ degree.String
source: w3
  concept hierarchy: ai > {planning, cbr, agents, ...}
  ai = title.String ∏ booktitle.String ∏ type.{“Trans”, “Magazine”} ∏ publisher.String ∏ year.Integer
  people = pub.ai ∏ name.String ∏ position.String
Local Failures
In query rewriting: the mediator is not able to rewrite some or all of the components of the input query.
In query execution: the mediator is not able to execute some or all of the components of the rewritten query.
Cooperation Strategies
partner
  mediator: all the mediators, new mediators, succeeding mediators, failing mediators
  source: all the sources, new sources, succeeding sources, failing sources
query
  original: whole, single component
  selected concepts: whole, single component
  rewriting: whole, single component
answer
  rewriting
  data
Cooperation with Mediators: Asking for Rewriting
participants: Mediator M, Mediator N, sources W1, W2, W3
  1: request (M to N)
  2: rewrite (N to M)
  3: query (M to the sources)
  4: data (the sources to M)
Cooperation with Mediators: Asking for Data
participants: Mediator M, Mediator N, sources W1, W2, W3
  1: request (M to N)
  2: query (N to the sources)
  3: data (the sources to N)
  4: data (N to M)
Cooperation with Sources
participants: Mediator M, Mediator N, sources W1, W2, W3
  1: request (M to a source)
  2: rewrite (the source to M)
  3: data (the source to M)
Strategy Comparison
                              Efficiency                                   Effectiveness
Partner        Type of        No. of Messages    No. of Messages
               Answer         (1st time)         (next time)     Redund.   Recall   Prec.
Source (Sj)    Rewriting      4s                 2s              No        1        1
Source (Sj)    Data           2s                 2s              No        1        1
Mediator (Ni)  Rewriting      2m+2s’             2s’             Yes       1        No
Mediator (Ni)  Data           2m+2ms’            2m+2ms’         No        1        No
m: number of mediators
s: number of sources
s’: number of sources that cooperate with Ni
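To compare the strategies for concrete system sizes, the message counts of the table can be evaluated directly. The sketch below only encodes the table's formulas; the function name `messages`, the parameter name `s_coop` (standing for s’), and the example sizes are hypothetical.

```python
def messages(partner, answer, m, s, s_coop, first_time=True):
    """Number of messages exchanged, following the comparison table.

    m: number of mediators, s: number of sources,
    s_coop: number of sources that cooperate with mediator Ni (s' in the table).
    """
    if answer not in ("rewriting", "data"):
        raise ValueError(f"unknown answer type: {answer}")
    if partner == "source":
        if answer == "rewriting":
            return 4 * s if first_time else 2 * s
        return 2 * s  # asking for data costs the same every time
    if partner == "mediator":
        if answer == "rewriting":
            return 2 * m + 2 * s_coop if first_time else 2 * s_coop
        return 2 * m + 2 * m * s_coop  # asking for data costs the same every time
    raise ValueError(f"unknown partner type: {partner}")

# Hypothetical sizes with many more sources than mediators (s >> m + s').
m, s, s_coop = 3, 100, 5
for partner in ("source", "mediator"):
    for answer in ("rewriting", "data"):
        first = messages(partner, answer, m, s, s_coop, first_time=True)
        later = messages(partner, answer, m, s, s_coop, first_time=False)
        print(f"{partner:8} {answer:9} first: {first:4} next: {later:4}")
```

With these sizes, asking a mediator for a rewriting needs 16 messages the first time and 10 afterwards, against 400 and 200 for asking the sources.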
Strategy Comparison
Query                                       Rewriting
Original Query Q                            SolN(Q)
Selected Concepts Q’                        SolN(Q’)
Rewriting of selected concepts view(Q’)     SolN(view(Q’))
SolN(view(Q’)) ⊆ SolN(Q’) ⊆ SolN(Q)
Conclusion
1st scenario (s ≈ m+s’)
  mediators can be used for user profiling;
  mediators can be used to solve name heterogeneity and integrate data;
  to solve schema heterogeneity, the most efficient and effective strategy for a mediator is to cooperate directly with sources;
  to update its schemas, a “lazy” approach may not be appropriate for a mediator.
Conclusion
2nd scenario (s >> m+s’)
  mediators can be used for user profiling;
  the most efficient strategy is cooperation with other mediators;
  cooperation with wrappers is useful only when mediators are not able to rewrite a given query;
  to update its schemas, a “lazy” approach is appropriate for a mediator.
M’s Mediated Schema
Problem (Query): H = pub.(journal Π db)
  Solution (Rewriting): (pub.journal Π pub.db), (pub.ai Π pub.db)
  Information Sources: (pub.journal, w1), (pub.db, w2), (pub.ai, w3)
Problem (Query): I = pub.acm_journal
  Solution (Rewriting): (pub.acm_trans), (pub.type.{“Trans”} Π pub.publisher.{“ACM”})
  Information Sources: (pub.acm_trans, w1), (pub.type.{“Trans”}, w3), (pub.publisher.{“ACM”}, w3)
Problem (Query): J = pub.ai
  Solution (Rewriting): (pub.keyword.{“AI”}), (pub.ai)
  Information Sources: (pub.keyword.{“AI”}, w1), (pub.ai, w3)
Problem (Query): K = pub.(agents Π db)
  Solution (Rewriting): (pub.keyword.{“Agents”} Π pub.db), (pub.agents Π pub.db)
  Information Sources: (pub.keyword.{“Agents”}, w1), (pub.db, w2), (pub.agents, w3)
Rewriting query using views
Composition of rewriting of retrieved concepts
view( pub.(ai ∏ db) ∏ pub.acm ) =
= view( pub.(ai ∏ db) ) ∏ view( pub.acm )
= (view(K)∏view(I))U(view(H)∏view(J)∏view(I))
= … ...
Rewriting query using views
(view(K) ∏ view(I)) U (view(H) ∏ view(J) ∏ view(I))
= { (pub.keyword.{“Agents”} ∏ pub.db ∏ pub.acm_trans),
    (pub.keyword.{“Agents”} ∏ pub.db ∏ pub.type.{“Trans”} ∏ pub.publisher.{“ACM”}),
    (pub.agents ∏ pub.db ∏ pub.acm_trans),
    (pub.agents ∏ pub.db ∏ pub.type.{“Trans”} ∏ pub.publisher.{“ACM”}),
    (pub.journal ∏ pub.db ∏ pub.keyword.{“AI”} ∏ pub.acm_trans),
    … }
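The expansion of the views into source-level conjunctions can be reproduced mechanically. The following is a minimal sketch under stated assumptions: the alternatives for H, I, J, and K are transcribed from M's mediated schema, each alternative is encoded as a list of conjuncts, and the names `VIEWS`, `expand`, and `expand_union` are hypothetical. No simplification is performed, so a conjunct may appear twice in one result.

```python
from itertools import product

# Alternative rewritings per concept, transcribed from M's mediated schema.
# Each alternative is a list of source-level conjuncts.
VIEWS = {
    "H": [["pub.journal", "pub.db"], ["pub.ai", "pub.db"]],
    "I": [["pub.acm_trans"], ['pub.type.{"Trans"}', 'pub.publisher.{"ACM"}']],
    "J": [['pub.keyword.{"AI"}'], ["pub.ai"]],
    "K": [['pub.keyword.{"Agents"}', "pub.db"], ["pub.agents", "pub.db"]],
}

def expand(conjunction):
    """Expand view(A) ∏ view(B) ∏ ... into every combination of alternatives."""
    results = []
    for choice in product(*(VIEWS[name] for name in conjunction)):
        conjuncts = [c for alternative in choice for c in alternative]
        results.append(" ∏ ".join(conjuncts))
    return results

def expand_union(disjuncts):
    """Expand a union of conjunctions of views, keeping the order of the disjuncts."""
    return [r for conjunction in disjuncts for r in expand(conjunction)]

# (view(K) ∏ view(I)) U (view(H) ∏ view(J) ∏ view(I))
rewritings = expand_union([("K", "I"), ("H", "J", "I")])
for r in rewritings:
    print(r)
```

The first disjunct contributes 2 × 2 = 4 conjunctions and the second 2 × 2 × 2 = 8, so the sketch prints 12 lines; the first five correspond to the conjunctions listed on the slide.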
Local Failure in Query Rewriting
query: Q = pub.(ai ∏ db) ∏ affiliation.{“Stanford”}
[Figure: M's thesaurus (T, journal, type.{“Trans”}, acm_trans, acm_journal, acm, ai, agents, db) and mediated schema]
mediated schema concepts:
  H = pub.(journal ∏ db)
  I = pub.acm_journal
  J = pub.ai
  K = pub.(agents ∏ db)
  H ∏ J = pub.(ai ∏ db ∏ journal)
retrieved concepts: {K, (H ∏ J)} for pub.(ai ∏ db); Ø for affiliation.{“Stanford”}
rewrite: failure
Local Failure in Query Execution
Mediator M
  rewrite: (view(K)∏view(I))U(view(H)∏view(J)∏view(I))
sources w1, w2, w3 (schemas as in Query Execution): no answer
Mediator M
  data: failure
Cooperation with Mediators: Asking for Rewriting
Mediator M
  query: pub.acm
Mediator N
  thesaurus: T, acm, acm_tods, acm_tocl
  mediated schema: pub.acm_tods, pub.acm_tocl
  retrieved concepts: {pub.acm_tocl, pub.acm_tods}
  rewrite: {view(pub.acm_tocl), view(pub.acm_tods)}
Mediator M
  rewrite: {view(pub.acm_tocl), view(pub.acm_tods)}
  data (from sources w1, w2, w3, schemas as in Query Execution): #id, pub, ...
Cooperation with Mediators: Asking for Data
Mediator M
  query: pub.acm
Mediator N
  thesaurus: T, acm, acm_tods, acm_tocl
  mediated schema: pub.acm_tods, pub.acm_tocl
  retrieved concepts: {pub.acm_tocl, pub.acm_tods}
  rewrite: {view(pub.acm_tocl), view(pub.acm_tods)}
  data (from sources w1, w2, w3, schemas as in Query Execution): #id, pub, ...
Mediator M
  rewrite: {view(pub.acm_tocl), view(pub.acm_tods)}
  data: #id, pub, ...
Cooperation with Sources
Mediator M
  query: pub.acm
sources w1, w2, w3: schemas as in Query Execution
source: w4
  concepts: conference, journal, acm, acm_trans, acm_jacm, acm_ccs, acm_tods, acm_tocl, ...
  acm_tocl = acm_trans ∏ booktitle.{“TOCL”}
  acm_tods = acm ∏ acm_trans ∏ booktitle.{“TODS”}
  acm_jacm = acm ∏ journal ∏ booktitle.{“JACM”}
  acm_ccs = acm ∏ booktitle.{“CCS”} ∏ type.{“Proc”}
  retrieved: {pub.acm_ccs, pub.acm_jacm, pub.acm_tods}
  rewrite: {view(pub.acm_ccs), view(pub.acm_jacm), view(pub.acm_tods)}
Mediator M
  data: #id, pub, ...
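The retrieval performed by source w4 for the query pub.acm can be imitated with a purely structural check. The sketch below is a strong simplification and not a description-logic reasoner: it assumes each w4 concept is encoded as the set of its conjuncts, treats a concept as contained in the query concept when that concept appears among its (transitively unfolded) conjuncts, and does not filter for maximality. The names `W4`, `conjuncts`, and `retrieve` are hypothetical.

```python
# Definitions of source w4, each encoded as the set of its conjuncts.
W4 = {
    "acm_tocl": {"acm_trans", 'booktitle.{"TOCL"}'},
    "acm_tods": {"acm", "acm_trans", 'booktitle.{"TODS"}'},
    "acm_jacm": {"acm", "journal", 'booktitle.{"JACM"}'},
    "acm_ccs": {"acm", 'booktitle.{"CCS"}', 'type.{"Proc"}'},
}

def conjuncts(concept, definitions):
    """Collect every conjunct reachable from a concept by unfolding definitions."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        # Undefined (primitive) names have nothing further to unfold.
        stack.extend(definitions.get(current, ()))
    return seen

def retrieve(query_concept, definitions):
    """Return the defined concepts whose unfolded definition mentions the query concept."""
    return sorted(
        name
        for name in definitions
        if name != query_concept and query_concept in conjuncts(name, definitions)
    )

retrieved = retrieve("acm", W4)
print(retrieved)
```

Under this encoding acm_tocl is not retrieved, because its definition in w4 does not mention acm, which agrees with the retrieved set shown for w4.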
Redundancy
Cn(M): mediated schema of M after n interactions with N
C(N): mediated schema of N
\[ \text{redundancy} = \frac{\mathrm{card}(C_n(M) \cap C(N))}{\mathrm{card}(C(N))} \]
Recall
Cn(M): mediated schema of M after n interactions with S1, … Sn
Ω: information need of a consumer (a view of S1, … Sn)
\[ \text{recall} = \frac{\mathrm{card}(C_n(M) \cap \Omega)}{\mathrm{card}(\Omega)} \]
Precision
Cn(M): mediated schema of M after n interactions with S1, … Sn
Ω: information need of a consumer (a view of S1, … Sn)
\[ \text{precision} = \frac{\mathrm{card}(C_n(M) \cap \Omega)}{\mathrm{card}(C_n(M))} \]
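Reading the three measures as ratios of set cardinalities with an intersection in the numerator (an assumption about the intended formulas), they can be computed over sets of concept names. The sketch below assumes non-empty denominators; the function names and the example sets, in particular the consumer's information need `need`, are hypothetical and chosen only for illustration.

```python
def redundancy(schema_m, schema_n):
    """Share of N's mediated schema that is also present in M's mediated schema."""
    return len(schema_m & schema_n) / len(schema_n)

def recall(schema_m, need):
    """Share of the consumer's information need covered by M's mediated schema."""
    return len(schema_m & need) / len(need)

def precision(schema_m, need):
    """Share of M's mediated schema that is relevant to the information need."""
    return len(schema_m & need) / len(schema_m)

# Hypothetical example sets of concept names.
schema_m = {"H", "I", "J", "K", "pub.acm_tods"}
schema_n = {"pub.acm_tods", "pub.acm_tocl"}
need = {"J", "pub.acm_tods", "pub.acm_tocl", "pub.acm_jacm"}

print(redundancy(schema_m, schema_n))
print(recall(schema_m, need))
print(precision(schema_m, need))
```

Here M shares one of N's two concepts, covers two of the four needed concepts, and two of its five concepts are relevant, giving 0.5, 0.5, and 0.4 respectively.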
N’s Mediated Schema
Problem (Query): pub.acm_tods
  Solution (Rewriting): (pub.booktitle.{“TODS”} Π pub.acm)
  Information Sources: (pub.booktitle.{“TODS”}, w4), (pub.acm, w4)
Problem (Query): pub.acm_tocl
  Solution (Rewriting): (pub.booktitle.{“TOCL”} Π pub.acm_trans)
  Information Sources: (pub.booktitle.{“TOCL”}, w4), (pub.acm_trans, w1)
M’s Mediated Schema (updated)
Problem (Query): H
  Solution (Rewriting): (pub.journal Π pub.db), (pub.ai Π pub.db)
  Information Sources: (pub.journal, w1), (pub.db, w2), (pub.ai, w3)
Problem (Query): I
  Solution (Rewriting): (pub.acm_trans), (pub.type.{“Trans”} Π pub.publisher.{“ACM”})
  Information Sources: (pub.acm_trans, w1), (pub.type.{“Trans”}, w3), (pub.publisher.{“ACM”}, w3)
Problem (Query): J
  Solution (Rewriting): (pub.keyword.{“AI”}), (pub.ai)
  Information Sources: (pub.keyword.{“AI”}, w1), (pub.ai, w3)
Problem (Query): K
  Solution (Rewriting): (pub.keyword.{“Agents”} Π pub.db), (pub.agents Π pub.db)
  Information Sources: (pub.keyword.{“Agents”}, w1), (pub.db, w2), (pub.agents, w3)
The entries H, I, J, and K are the same in all three updated versions of the schema; only the new entry for pub.acm differs.
New entry, first version:
  Problem (Query): pub.acm
  Solution (Rewriting): (pub.booktitle.{“TODS”} Π pub.acm)
  Information Sources: (pub.booktitle.{“TODS”}, w4), (pub.acm, w4)
New entry, second version:
  Problem (Query): pub.acm
  Solution (Rewriting): (pub.acm_tods)
  Information Sources: (pub.acm_tods, N)
New entry, third version:
  Problem (Query): pub.acm
  Solution (Rewriting): (pub.acm_ccs), (pub.acm_jacm), (pub.acm_tods)
  Information Sources: (pub.acm_ccs, w4), (pub.acm_jacm, w4), (pub.acm_tods, w4)
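The effect of updating the mediated schema after a cooperation can be illustrated with a small cache-like sketch. It is not the authors' implementation: it assumes the mediated schema is a dictionary from problems to rewritings, models the partner as a plain function that returns a rewriting or `None`, and uses exact key lookup in place of concept retrieval. The names `ask_for_rewriting` and `mediator_n` are hypothetical.

```python
def ask_for_rewriting(problem, own_schema, partner):
    """Return (rewriting, origin); the partner is asked only on a local miss."""
    if problem in own_schema:
        return own_schema[problem], "local"
    answer = partner(problem)
    if answer is None:
        return None, "failure"
    own_schema[problem] = answer  # remember the partner's answer for next time
    return answer, "partner"

def mediator_n(problem):
    """Stand-in for mediator N, which can rewrite pub.acm."""
    known = {"pub.acm": ["view(pub.acm_tocl)", "view(pub.acm_tods)"]}
    return known.get(problem)

# M starts without an entry for pub.acm.
m_schema = {"J": ['view(pub.keyword.{"AI"})', "view(pub.ai)"]}

first = ask_for_rewriting("pub.acm", m_schema, mediator_n)
second = ask_for_rewriting("pub.acm", m_schema, mediator_n)
missing = ask_for_rewriting('affiliation.{"Stanford"}', m_schema, mediator_n)
print(first[1], second[1], missing[1])
```

The first request is answered by the partner and the second from M's own updated schema, which is the intuition behind the lower message count for later requests when asking for a rewriting.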
Cooperation with Mediators
Theorem.
Cn(M): mediated schema of M after n interactions with N
C(N): mediated schema of N
redundancy:
\[ \lim_{n \to \infty} \frac{\mathrm{card}(C_n(M) \cap C(N))}{\mathrm{card}(C(N))} = 1 \]
Cooperation with Mediators
Theorem.
Cn(M): mediated schema of M after n interactions with N
Ω: information need of a consumer (a view of S1, … Sn)
recall:
\[ \lim_{n \to \infty} \frac{\mathrm{card}(C_n(M) \cap \Omega)}{\mathrm{card}(\Omega)} = 1 \]
precision:
\[ \lim_{n \to \infty} \frac{\mathrm{card}(C_n(M) \cap \Omega)}{\mathrm{card}(C_n(M))} \text{ is indeterminate} \]
Cooperation with Sources
Theorem.
Cn(M): mediated schema of M after n interactions with S1, … Sn
Ω: information need of a consumer (a view of S1, … Sn)
recall:
\[ \lim_{n \to \infty} \frac{\mathrm{card}(C_n(M) \cap \Omega)}{\mathrm{card}(\Omega)} = 1 \]
precision:
\[ \lim_{n \to \infty} \frac{\mathrm{card}(C_n(M) \cap \Omega)}{\mathrm{card}(C_n(M))} = 1 \]