Semantic Matching Pavel Shvaiko Stanford University, October 31, 2003 Paper with Fausto Giunchiglia Research group (alphabetically ordered): Fausto Giunchiglia, Pavel Shvaiko, Mikalai Yatskevich, Ilya Zaihrayeu

Dec 25, 2015

Semantic Matching

Pavel Shvaiko

Stanford University, October 31, 2003

Paper with Fausto Giunchiglia

Research group (alphabetically ordered): Fausto Giunchiglia, Pavel Shvaiko, Mikalai Yatskevich, Ilya Zaihrayeu

Outline

Matching

Syntactic Matching

Semantic Matching

On Implementing Semantic Matching

Conclusions

MATCHING

Application Domains

Generic Model Management

Schema integration

Data warehouses

E-commerce

Data Coordination in P2P systems, Semantic Web

Example of Matching

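[Figure: fragments of two web directories with mappings drawn between their nodes. www.google.com: Arts, Organizations, Art History, Music, Baroque, History. www.yahoo.com: Arts&Humanities, Organizations, Art History, Design Art, Baroque, Architecture, History. The mappings are annotated with coefficients Sc = 0.9 and Sc = 1.0 and with relations Sr, whose symbols are not recoverable here.]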

Matching

Match is an operator that takes two graph-like structures (e.g., database schemas or ontologies) and produces a mapping between elements of the two graphs that correspond semantically to each other

Matching

The problem of matching can be decomposed into two steps:

Extract graphs from the data and conceptual models

Match the resulting graphs (generic matching)

Matching

Mapping element is a 4-tuple $\langle m_{ID}, N_1^i, N_2^j, R \rangle$, where

$m_{ID}$ is a unique identifier of the given mapping element;

$N_1^i$ is the i-th node of the first graph;

$N_2^j$ is the j-th node of the second graph;

R specifies a similarity relation of the given nodes

Mapping is a set of mapping elements

Matching is the process of discovering mappings between two graphs through the application of a matching algorithm
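
The definitions above map naturally onto a small data structure. The following is a minimal Python sketch, not taken from the paper; the names `MappingElement` and `Mapping` and the node identifiers such as `g1:5` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union

# R is either a coefficient in [0, 1] (syntactic matching)
# or a symbolic relation such as "=" (semantic matching).
Relation = Union[float, str]


@dataclass(frozen=True)
class MappingElement:
    """One 4-tuple <mID, node of graph 1, node of graph 2, R>."""
    m_id: str          # unique identifier of the mapping element
    node1: str         # identifier of a node of the first graph (hypothetical format)
    node2: str         # identifier of a node of the second graph (hypothetical format)
    relation: Relation


# A mapping is a set of mapping elements.
Mapping = FrozenSet[MappingElement]

mapping: Mapping = frozenset({
    MappingElement("m1", "g1:5", "g2:1", "="),
    MappingElement("m2", "g1:2", "g2:3", 0.7),
})
```

Making the element immutable and hashable lets a mapping be an ordinary set, which matches the definition of a mapping as a set of mapping elements.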

Matching

Syntactic Matching

• R is computed between labels at nodes

• $R = \{x \in [0,1]\}$

Semantic Matching

• R is computed between concepts at nodes

• $R = \{=, \cap, \perp, \supseteq, \subseteq\}$

Matching: Syntactic AND Semantic

SYNTACTIC MATCHING

Syntactic Matching

Mapping element is a 4-tuple $\langle m_{ID}, L_1^i, L_2^j, R \rangle$, where

$L_1^i$ is the label at the i-th node of the first graph;

$L_2^j$ is the label at the j-th node of the second graph;

R specifies a similarity relation in the form of a coefficient, which measures the similarity between the labels of the given nodes

Example: R is a similarity coefficient in [0,1]

<m21, telephone, phone, 0.7>, so R = 0.7
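
As a rough illustration of a label-based coefficient, the sketch below scores two labels with `difflib.SequenceMatcher` from the Python standard library. This is an assumption made for illustration only; it is not the measure used by Cupid or by any of the systems discussed here, and the function name `label_similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def label_similarity(label1: str, label2: str) -> float:
    """Return a similarity coefficient in [0, 1] for two node labels."""
    a, b = label1.lower(), label2.lower()
    # ratio() is 2 * (matched characters) / (total characters in both labels).
    return SequenceMatcher(None, a, b).ratio()


coefficient = label_similarity("telephone", "phone")
# A syntactic mapping element carries the coefficient as its R component.
element = ("m21", "telephone", "phone", round(coefficient, 2))
```

The coefficient depends only on the characters of the labels, which is exactly the limitation that semantic matching is meant to address.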

The State of the Art

Cupid… is a hybrid matching prototype. It exploits linguistic and structural schema matching heuristics, and computes similarity coefficients between nodes of the trees.

Similarity Flooding… is a hybrid matching prototype. It uses fix-point computation to determine correspondences between nodes of the graphs.

COMA… is a composite matching prototype. It provides an extensible library of different matchers which manipulate DAGs and supports various ways of combining final results.

As far as we know, so far only syntactic matching…

SEMANTIC MATCHING

Semantic Matching  

Mapping element is a 4-tuple $\langle m_{ID}, C_1^i, C_2^j, R \rangle$, where

$C_1^i$ is the concept of the i-th node of the first graph;

$C_2^j$ is the concept of the j-th node of the second graph;

R specifies a similarity relation in the form of a semantic relation between the extensions of concepts at the given nodes

Possible R’s: equality $\{=\}$, overlapping $\{\cap\}$, mismatch $\{\perp\}$, more general/specific $\{\supseteq, \subseteq\}$

Example: <m21, telephone, phone, {=}>, so R = {=}
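
Since R relates the extensions of two concepts, the five relations can be illustrated with plain sets. The sketch below is only an illustration under the assumption that an extension is available as a finite set of instance identifiers; the function name and the sample instances are made up.

```python
from typing import Set


def semantic_relation(ext1: Set[str], ext2: Set[str]) -> str:
    """Classify two concept extensions by exactly one semantic relation."""
    if ext1 == ext2:
        return "="   # equality
    if ext1 < ext2:
        return "⊆"   # first concept is more specific
    if ext1 > ext2:
        return "⊇"   # first concept is more general
    if ext1 & ext2:
        return "∩"   # overlapping: some instances in common
    return "⊥"       # mismatch: no instance in common


# Hypothetical extensions: sets of instance identifiers.
telephone = {"dev1", "dev2"}
phone = {"dev1", "dev2"}
relation = semantic_relation(telephone, phone)
```

Unlike a coefficient in [0,1], the result is a discrete relation with a set-theoretic meaning.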

Examples: Analysis of Ancestors. Case 1

Suppose that we want to match nodes $N_1^5$ and $N_2^1$

[Figure: two trees with is-a edges, one per graph, each with nodes labelled A, B, C, D, E and numbered 1 to 5.]

Cupid does not find a similarity coefficient between the nodes under consideration, due to the significant differences in structure of the given graphs

Semantic matching: The concept denoted by the label at node $N_1^5$ is $C_{C_1}$, while the concept at node $N_1^5$ is $C_1^5 = C_{A_1} \wedge C_{C_1}$. The concept at node $N_2^1$ is $C_2^1 = C_{C_2}$. Thus, $C_1^5 \subseteq C_2^1$

Examples: Analysis of Ancestors. Case 2

Suppose that we want to match nodes $N_1^5$ and $N_2^5$

[Figure: two trees with is-a edges. The first has nodes labelled A, B, C, D, E; the second has nodes labelled A, C, D, E.]

Cupid: R = 0.86. This is because of the identity of labels $A_1 = A_2$, $C_1 = C_2$

Semantic matching: The concept at node $N_1^5$ is $C_1^5 = C_{A_1} \wedge C_{C_1}$, while the concept at node $N_2^5$ is $C_2^5 = C_{A_2} \wedge C_{C_2}$.

Since we have that $C_{A_1} = C_{A_2}$ and $C_{C_1} = C_{C_2}$, then $C_2^5 = C_1^5$

ON IMPLEMENTING SEMANTIC MATCHING

On Implementation

Semantic Matching

Structure-level

Element-level

Weak Semantics Techniques

Strong Semantics Techniques

Element-level Semantic Matching

Weak Semantics Techniques

• Analysis of strings {=}: <phone, telephone, {=}>

• Analysis of data types $\{=, \cap, \perp, \supseteq, \subseteq\}$: <string, integer, {…}>, <integer, real, {…}>

• Analysis of soundex {=}: <Fausto, Phausto, {=}>

Strong Semantics Techniques

• Precompiled thesaurus, syn key: <Discount, Rebate, {=}>

• WordNet: <Art_#1, Humanities_#1, {…}>, where #1 … sense number 1 of the word Art according to WordNet
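
To make the two families of techniques concrete, here is a small Python sketch that tries a weak technique (string analysis) first and a strong one (thesaurus lookup) second. The suffix rule, the thesaurus contents and the function name are assumptions made for illustration; the slides do not say how string analysis decides that phone and telephone are equal.

```python
from typing import Optional

# Hypothetical precompiled thesaurus: unordered pairs of synonymous labels.
THESAURUS = {frozenset({"discount", "rebate"})}


def element_level_match(label1: str, label2: str) -> Optional[str]:
    """Return "=" if one of the techniques finds equality, otherwise None."""
    a, b = label1.strip().lower(), label2.strip().lower()
    if not a or not b:
        return None
    # Weak semantics (assumed rule): identical labels, or one label ends with the other.
    if a == b or a.endswith(b) or b.endswith(a):
        return "="
    # Strong semantics: lookup in the precompiled thesaurus.
    if frozenset({a, b}) in THESAURUS:
        return "="
    return None
```

The suffix rule is deliberately crude and would produce false positives on real labels; it only shows where a string-based technique sits relative to a thesaurus lookup.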

Element-level Semantic Matching (cont.)

Semantic Relations via WordNet

Equality: one concept is equal to another if there is at least one sense of the first concept which is a synonym of the second

Overlapping: one concept overlaps with the other if there are some senses in common

Mismatch: two concepts are mismatched if they have no sense in common

More general: one concept is more general than the other iff there exists at least one sense of the first concept that has a sense of the other as a hyponym or meronym

Less general: one concept is less general than the other iff there exists at least one sense of the first concept that has a sense of the other concept as a hypernym or as a holonym
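
The equality and more/less general rules can be sketched over a toy sense inventory. The dictionaries below stand in for WordNet and are entirely made up; no real WordNet API is used, only direct links are followed, and the overlapping and mismatch rules are left out of this sketch.

```python
from typing import Dict, Optional, Set

# Toy sense inventory standing in for WordNet (all entries are made up).
SENSES: Dict[str, Set[str]] = {
    "car": {"motorcar_sense"},
    "automobile": {"motorcar_sense"},
    "vehicle": {"vehicle_sense"},
}
# Direct links: sense -> senses that are its hyponyms or meronyms.
NARROWER: Dict[str, Set[str]] = {"vehicle_sense": {"motorcar_sense"}}


def wordnet_style_relation(word1: str, word2: str) -> Optional[str]:
    """Apply the equality and more/less general rules to two words."""
    s1, s2 = SENSES.get(word1, set()), SENSES.get(word2, set())
    if s1 & s2:
        return "="   # a sense of word1 is a synonym of word2
    if any(NARROWER.get(a, set()) & s2 for a in s1):
        return "⊇"   # a sense of word1 has a sense of word2 as hyponym or meronym
    if any(NARROWER.get(b, set()) & s1 for b in s2):
        return "⊆"   # a sense of word1 has a sense of word2 as hypernym or holonym
    return None      # overlapping and mismatch are not modelled here
```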

Structure-level Semantic Matching

We translate the matching problem, namely the two graphs (in particular, the pair of nodes submitted to matching) into a propositional formula and then check for its validity

We check for validity using SAT

Semantic Matching Algorithm

1. Extract the two graphs

2. Compute element-level semantic matching

3. Compute concepts at nodes

4. Construct the propositional formula

5. Run SAT

6. Perform iterations

Semantic Matching Algorithm: Example – (1)

Extract the two graphs

[Figure: the two example trees with is-a edges, each with nodes labelled A, B, C, D, E and numbered 1 to 5.]

• In the case of RDB, XML and OODB schemas, it is necessary to extract useful semantic information, for instance in the form of ontologies

Semantic Matching Algorithm: Example – (2)

Element-level semantic matching. For each node, compute semantic relations holding among all the concepts denoted by labels at nodes under consideration

[Figure: the two example trees with is-a edges, each with nodes labelled A, B, C, D, E and numbered 1 to 5.]

$C_{A_1} = C_{A_2}$

$C_{B_1} = C_{B_2}$

$C_{C_1} = C_{C_2}$

$C_{D_1} = C_{D_2}$

$C_{E_1} = C_{E_2}$

Semantic Matching Algorithm: Example – (3)

Compute concepts at nodes. Suppose we want to find a semantic relation between nodes $N_1^5$ and $N_2^1$

[Figure: the two example trees with is-a edges; the relation between the two nodes under consideration is marked "?".]

$C_1^1 = C_{A_1}$

$C_1^5 = C_{A_1} \wedge C_{C_1}$

$C_2^1 = C_{C_2}$
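
The concept at a node is the conjunction of the concepts of the labels on the path from the root to that node. A minimal Python sketch of that computation follows. The two tree fragments are assumptions chosen only to be consistent with the formulas above (node 5 labelled C below node 1 labelled A in the first graph, node 1 labelled C in the second); they are not a reconstruction of the full figure.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical fragments: node id -> (label, parent id or None for the root).
Graph = Dict[int, Tuple[str, Optional[int]]]
GRAPH1: Graph = {1: ("A", None), 5: ("C", 1)}
GRAPH2: Graph = {1: ("C", None)}


def concept_at_node(graph: Graph, node: int, graph_index: int) -> FrozenSet[str]:
    """Return the conjuncts of the concept at a node, one per label on the path to the root."""
    conjuncts = set()
    current: Optional[int] = node
    while current is not None:
        label, parent = graph[current]
        conjuncts.add(f"C_{label}{graph_index}")  # e.g. "C_A1" for label A in graph 1
        current = parent
    return frozenset(conjuncts)


c_1_5 = concept_at_node(GRAPH1, 5, 1)  # conjuncts C_A1 and C_C1
c_2_1 = concept_at_node(GRAPH2, 1, 2)  # single conjunct C_C2
```

Representing a conjunction as a set of atoms keeps the concept independent of the order in which the path is walked.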

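Semantic Matching Algorithm: Example – (4)

Construct the propositional formula. We translate all the semantic relations computed in step 2 into propositional formulas under the following rules:

$C_{A_1} \supseteq C_{A_2} \;\mapsto\; C_{A_2} \rightarrow C_{A_1}$

$C_{A_1} \subseteq C_{A_2} \;\mapsto\; C_{A_1} \rightarrow C_{A_2}$

$C_{A_1} = C_{A_2} \;\mapsto\; C_{A_1} \leftrightarrow C_{A_2}$

$C_{A_1} \perp C_{A_2} \;\mapsto\; \neg(C_{A_1} \wedge C_{A_2})$

[Figure: the two example trees with is-a edges; the relation between the two nodes under consideration is marked "?".]

From step 2 we have: $C_{C_1} \leftrightarrow C_{C_2}$.

We want to prove that $C_1^5 \subseteq C_2^1$ (we guess the relation between nodes at this stage):

$(C_{A_1} \wedge C_{C_1}) \rightarrow C_{C_2}$

$(C_{C_1} \leftrightarrow C_{C_2}) \rightarrow ((C_{A_1} \wedge C_{C_1}) \rightarrow C_{C_2})$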
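
The translation rules can be sketched as a small function that turns a relation between two concepts into a formula string. The ASCII connectives (`->`, `<->`, `&`, `~`) and the function name `translate` are notational assumptions of this sketch, not something prescribed by the slides.

```python
def translate(left: str, relation: str, right: str) -> str:
    """Translate a semantic relation between two concepts into a propositional formula."""
    rules = {
        "⊇": f"({right} -> {left})",   # more general
        "⊆": f"({left} -> {right})",   # more specific
        "=": f"({left} <-> {right})",  # equality
        "⊥": f"~({left} & {right})",   # mismatch
    }
    return rules[relation]


# Relation known from the element-level step.
axiom = translate("C_C1", "=", "C_C2")
# Guessed relation between the concepts at the two nodes.
goal = translate("(C_A1 & C_C1)", "⊆", "C_C2")
# The formula whose validity has to be checked.
formula = f"({axiom} -> {goal})"
```

Here `formula` is `((C_C1 <-> C_C2) -> ((C_A1 & C_C1) -> C_C2))`, the ASCII counterpart of the last formula on the slide.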

Semantic Matching Algorithm: Example – (5)

Run SAT

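In order to prove that $(C_{C_1} \leftrightarrow C_{C_2}) \rightarrow ((C_{A_1} \wedge C_{C_1}) \rightarrow C_{C_2})$ is valid, we prove that its negation is unsatisfiable

$(C_{C_1} \leftrightarrow C_{C_2}) \wedge \neg((C_{A_1} \wedge C_{C_1}) \rightarrow C_{C_2})$

SAT returns FALSE

Thus, $C_1^5 \subseteq C_2^1$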
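
With only three propositional variables, the unsatisfiability check can be reproduced by enumerating all truth assignments. The sketch below uses that brute-force enumeration in place of a SAT solver; which solver the authors used is not stated on the slides, and the function names are illustrative.

```python
from itertools import product
from typing import Callable


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def negated_formula(ca1: bool, cc1: bool, cc2: bool) -> bool:
    """(C_C1 <-> C_C2) and not ((C_A1 and C_C1) -> C_C2)."""
    return (cc1 == cc2) and not implies(ca1 and cc1, cc2)


def satisfiable(formula: Callable[..., bool], n_vars: int) -> bool:
    """Brute-force SAT check: try every truth assignment of n_vars variables."""
    return any(formula(*values) for values in product([False, True], repeat=n_vars))


negation_is_sat = satisfiable(negated_formula, 3)
# The guessed relation holds exactly when the negation has no satisfying assignment.
relation_holds = not negation_is_sat
```

Enumeration is exponential in the number of variables, so it only serves to illustrate the check on this small example.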

Example: Cupid vs. Semantic Matching

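[Figure: fragments of two web directories with mappings drawn between their nodes. www.google.com: Arts, Organizations, Art History, Music, Baroque, History. www.yahoo.com: Arts&Humanities, Organizations, Art History, Design Art, Baroque, Architecture, History. Each mapping is annotated with a semantic relation whose symbol is not recoverable here.]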

Conclusions

We have made a rational reconstruction of the major matching problems and articulated them in terms of the more generic problem of matching graphs

We have identified semantic matching as a new approach for performing generic matching

We have proposed an implementation of semantic matching using SAT

Future Work

Extend to a full graph matcher

How to extract semantics from schemas

Study how to take into account attributes and instances

Develop an efficient implementation of the system

Test the system thoroughly

References

Project website: http://www.dit.unitn.it/~p2p/

F. Giunchiglia, P. Shvaiko. “Semantic Matching”. Technical Report #DIT-03-013. Also to appear in The Knowledge Engineering Review journal. Short version in Proceedings of the Semantic Integration workshop at ISWC’03.

F. Giunchiglia, I. Zaihrayeu. “Making peer databases interact – a vision for an architecture supporting data coordination”. In Proc. of the Conference of Information Agents (CIA 2002), Madrid, 2002.