OWL-based Semantic Conflicts Detection and Resolution for Data Interoperability Changqing Li, Tok Wang Ling Department of Computer Science School of Computing National University of Singapore

Dec 13, 2015


OWL-based Semantic Conflicts Detection and Resolution for Data Interoperability

Changqing Li, Tok Wang Ling

Department of Computer Science, School of Computing

National University of Singapore

Outline

Introduction

Preliminary and motivation

OWL-based Semantic Conflicts Detection and Resolution

Conclusion

Q & A

Introduction

Data interoperability and integration is a long-standing challenge to the database research community.

Ontologies provide shared knowledge among different data sources.

They clarify the semantics of information.

They provide a way to solve the interoperability problem in database integration.

Introduction (Cont.)

OWL is being promoted as a standard web ontology language.

In the future a considerable number of ontologies will be created based on OWL.

Therefore, automatically detecting semantic conflicts based on OWL will greatly expedite the achievement of semantic interoperability and greatly reduce the manual work of detecting such conflicts.

Ontology Definition

An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary [1].

1. Robert Neches, Richard Fikes, Timothy W. Finin, Thomas R. Gruber, Ramesh Patil, Ted E. Senator, William R. Swartout: Enabling Technology for Knowledge Sharing. AI Magazine 12(3): pp36-56 (1991)

Ontology Languages

SHOE

RDF

RDFS

DAML+OIL

OWL

SHOE

The Simple HTML Ontology Extensions (SHOE) [2] extend HTML with annotations that carry machine-readable knowledge.

2. Sean Luke and Jeff Heflin: SHOE Specification 1.01. http://www.cs.umd.edu/projects/plus/SHOE/spec.html

RDF

The Resource Description Framework (RDF) [3] is a W3C recommendation for the Semantic Web [4].

It defines a simple model to describe relationships among resources in terms of properties and values.

SVO form (Subject-Verb-Object): Resource-Property-Value

3. Ora Lassila and Ralph R. Swick: Resource description framework (RDF). http://www.w3c.org/TR/WD-rdf-syntax

4. The SemanticWeb Homepage. http://www.semanticweb.org

RDF (Cont.)

<ResourceA>
  <propertyA>
    <ResourceB>
      <propertyB>
        <ResourceC>
          <propertyC>ValueC</propertyC>
        </ResourceC>
      </propertyB>
    </ResourceB>
  </propertyA>
</ResourceA>

ResourceC (with its nested content) is the value of propertyB; ResourceB (with its nested content) is the value of propertyA.
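The nested resource-property-value structure above can be read as a set of subject-verb-object triples. The following is a minimal sketch of that reading, assuming a hypothetical in-memory representation in which a resource is a (name, properties) tuple; the function name `to_triples` is illustrative and not part of RDF or of the authors' work.

```python
def to_triples(resource):
    """Flatten a nested (name, {property: value}) resource into SVO triples.

    A value is either a literal (any non-tuple) or another nested resource.
    This representation is an assumption made for illustration only.
    """
    name, props = resource
    triples = []
    for prop, value in props.items():
        if isinstance(value, tuple):
            # The nested resource itself is the value of the property.
            triples.append((name, prop, value[0]))
            triples.extend(to_triples(value))
        else:
            triples.append((name, prop, value))
    return triples


nested = (
    "ResourceA",
    {"propertyA": ("ResourceB", {"propertyB": ("ResourceC", {"propertyC": "ValueC"})})},
)
triples = to_triples(nested)
for triple in triples:
    print(triple)
```

Each printed tuple corresponds to one Resource-Property-Value statement of the figure.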

RDFS

RDF Schema (RDFS) [5] is the primitive description language of RDF.

It provides some basic primitives:

subClassOf

subPropertyOf

…

5. Dan Brickley and R.V. Guha. Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. http://www.w3.org/TR/rdf-schema/

DAML+OIL

DARPA Agent Markup Language (DAML) [6]: intended to help machines understand semantic concepts and relationships.

Ontology Inference Layer (OIL) [7]: extends RDFS with additional language primitives not yet present in RDFS.

DAML+OIL [8] is the successor of RDFS.

It is a combination of DAML and OIL.

More semantically rich primitives are defined.

6. The DARPA Agent Markup Language Homepage. http://daml.semanticweb.org/

7. The Ontology Inference Layer OIL Homepage. http://www.ontoknowledge.org/oil/TR/oil.long.html

8. DAML+OIL Definition. http://www.daml.org/2001/03/daml+oil

OWL

DAML+OIL is evolving into OWL (Web Ontology Language) [9].

OWL is almost the same as DAML+OIL

Some primitives of DAML+OIL are renamed in OWL for easier understanding.

e.g., “sameClassAs” is changed to “equivalentClass” …

9. Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider and Lynn Andrea Stein. OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/

Primitives of OWL

“owl” before “:” is the namespace.

owl:equivalentClass

owl:equivalentProperty

owl:sameIndividualAs

owl:disjointWith

owl:differentFrom

…

Our Extension of OWL (EOWL)

We extend OWL with the following primitives:

eowl:orderingProperty

eowl:overlap

eowl:properSubClassOf

eowl:properSubPropertyOf

…

OWL-based Semantic Conflicts Cases

A. Name conflicts

B. Order sensitive conflicts

C. Scaling conflicts

D. Whole and part conflicts

E. Partial similarity conflicts

F. Swap conflicts

A. Name conflicts

Example A. Consider two distributed data warehouses.

One is used to analyze the United States market (country, state, city and district).

The other is used to analyze the China market (country, province, city and county).

Based on the context, “Province” is defined as equivalent to “State” using the OWL primitive “owl:equivalentClass”.

To resolve this conflict, one name needs to be changed to the referenced name.

A. Name conflicts (Cont.)

<owl:Class rdf:ID="Province">
  <rdfs:label>Province</rdfs:label>
  <owl:equivalentClass rdf:resource="#State"/>
</owl:Class>

Fig. A. Detection of synonym conflicts

“owl:equivalentClass” is the indicator to detect synonym conflicts

Change “Province” to “State”, the name referenced in the ontology definition.

A. Name conflicts (Cont.)

Case A. Synonyms. The OWL primitives “owl:equivalentClass”, “owl:equivalentProperty” and “owl:sameIndividualAs” are indicators to detect this case.

Conflict Resolution Rule A. If synonym conflicts are detected, different attribute names with the same semantics need to be translated to the same name (referenced name) for smooth data interoperability.
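Case A and Rule A can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ontology fragment of Fig. A is wrapped in an rdf:RDF element with the standard RDF, RDFS and OWL namespace URIs, and the helper names `find_synonyms` and `rename_attributes` are hypothetical, not taken from the authors' system.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Fig. A wrapped in a root element so that it parses as a complete document.
ONTOLOGY = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Province">
    <rdfs:label>Province</rdfs:label>
    <owl:equivalentClass rdf:resource="#State"/>
  </owl:Class>
</rdf:RDF>
"""


def find_synonyms(rdf_xml):
    """Detection: map each class to the name it is declared equivalent to."""
    root = ET.fromstring(rdf_xml)
    synonyms = {}
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        for eq in cls.findall(f"{{{OWL}}}equivalentClass"):
            target = eq.get(f"{{{RDF}}}resource", "")
            if name and target:
                synonyms[name] = target.lstrip("#")
    return synonyms


def rename_attributes(attributes, synonyms):
    """Resolution: translate each attribute to its referenced name."""
    lowered = {k.lower(): v.lower() for k, v in synonyms.items()}
    return [lowered.get(a.lower(), a) for a in attributes]


synonyms = find_synonyms(ONTOLOGY)
china_schema = ["country", "province", "city", "county"]
resolved = rename_attributes(china_schema, synonyms)
print(synonyms, resolved)
```

The case-insensitive matching between schema attribute names and ontology class names is a simplification chosen for this sketch.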

B. Order sensitive conflicts

Example B. Consider the highest three scores of a course.

The highest three scores of course A are listed as “90, 95, 100” in ascending order.

The highest three scores of course B are listed as “98, 95, 93” in descending order.

“highestThreeScores” is defined as an “eowl:orderingProperty” in the ontology.

The sequences of the highest three scores for courses A and B should both be adjusted to ascending order or both to descending order.

By default, adjust to the sequence of the first one, e.g. the sequence of course A.

B. Order sensitive conflicts (Cont.)

Fig. B. Detection of order sensitive conflicts

<eowl:orderingProperty rdf:ID="highestThreeScores">
  <rdfs:label>highest three scores of a course</rdfs:label>
  <rdfs:domain rdf:resource="#Course"/>
  <rdfs:range rdf:resource="xsd#integer"/>
</eowl:orderingProperty>

We can further define ascending or descending order for more precise semantics.

B. Order sensitive conflicts (Cont.)

Case B. Order sensitive. The EOWL primitive “eowl:orderingProperty” and the RDF primitive “rdf:Seq” are indicators to detect this case.

Conflict Resolution Rule B. If order sensitive conflicts are detected, the member sequences need to be adjusted according to the same criterion for smooth data interoperability, using the sequence of the first one by default.
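Rule B can be illustrated with the score lists of Example B. The sketch below assumes the values of an ordering property are available as plain Python lists; the helper names `direction` and `align_order` are hypothetical.

```python
def direction(seq):
    """Return 'asc' or 'desc' for a monotonic sequence, otherwise None."""
    pairs = list(zip(seq, seq[1:]))
    if all(a <= b for a, b in pairs):
        return "asc"
    if all(a >= b for a, b in pairs):
        return "desc"
    return None


def align_order(reference, other):
    """Re-order `other` so that it follows the ordering of `reference`."""
    ref_dir = direction(reference)
    if ref_dir is None:
        raise ValueError("reference sequence has no clear ordering")
    return sorted(other, reverse=(ref_dir == "desc"))


course_a = [90, 95, 100]   # ascending, taken as the default (first) sequence
course_b = [98, 95, 93]    # descending
aligned_b = align_order(course_a, course_b)
print(aligned_b)
```

Taking course A as the reference mirrors the default stated in the rule; passing the arguments in the other order would align both lists to descending order instead.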

C. Scaling conflicts

Example C. Consider two database schemas:

Product(ID, Price)

Product(ID, Price)

One price may refer to the US dollars, while the other may refer to the Singapore dollars. Figure 4 shows some concepts about a currency ontology; “price” is defined

Translate the prices so that they refer to the same currency unit, by default the unit of the first schema.

C. Scaling conflicts (Cont.)

Fig. C. Detection of scaling conflicts

<owl:DatatypeProperty rdf:ID="price">
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:parseType="Resource">
    <rdf:value/>
    <currency:CurrencyUnit/>
  </rdfs:range>
</owl:DatatypeProperty>

C. Scaling conflicts (Cont.)

Case C. Semantic conflicts may exist if the value of a data type property comprises both a value and a unit (Scaling). The RDF primitive “rdf:parseType="Resource"” and the OWL primitive “owl:DatatypeProperty” are indicators for this case.

Conflict Resolution Rule C. If scaling conflicts are detected, the values should be translated to refer to the same unit for smooth data interoperability, using the first unit by default.
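A minimal sketch of Rule C follows, assuming each price is held as a (value, unit) pair. The exchange rate in `RATES_TO_USD` is a made-up figure for illustration only, and the function names are hypothetical.

```python
from decimal import Decimal

# Hypothetical exchange rates (units of USD per one unit of the currency).
RATES_TO_USD = {"USD": Decimal("1"), "SGD": Decimal("0.75")}


def convert(amount, unit, target_unit, rates=RATES_TO_USD):
    """Convert `amount` from `unit` to `target_unit` via USD."""
    in_usd = amount * rates[unit]
    return in_usd / rates[target_unit]


def resolve_scaling(first, second):
    """Translate the second (value, unit) pair into the unit of the first."""
    value, unit = second
    target_unit = first[1]
    return (convert(value, unit, target_unit), target_unit)


price_a = (Decimal("30"), "USD")   # from the first Product schema
price_b = (Decimal("40"), "SGD")   # from the second Product schema
resolved_b = resolve_scaling(price_a, price_b)
print(resolved_b)
```

Decimal is used instead of float so that currency arithmetic does not pick up binary rounding errors.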

D. Whole and part conflicts

Example D. Consider the schemas:

Person(ID, name)

Person(ID, surname, givenName)

“surname” and “givenName” are both defined as proper sub properties of “name” using “eowl:properSubPropertyOf”.

“eowl:properSubClassOf” has clearer semantics than “rdfs:subClassOf” because “rdfs:subClassOf” is ambiguous: it covers both “eowl:properSubClassOf” and “owl:equivalentClass”.

Divide the whole attribute “name” into the part attributes “surname” and “givenName”.

Or combine the part attributes “surname” and “givenName” in the correct sequence to form the whole attribute “name”.

D. Whole and part conflicts (Cont.)

Fig. D1. Detection of whole and part conflicts

<rdf:Property rdf:ID="surname">
  <eowl:properSubPropertyOf rdf:resource="#name"/>
</rdf:Property>

Fig. D2. Detection of whole and part conflicts

<rdf:Property rdf:ID="givenName">
  <eowl:properSubPropertyOf rdf:resource="#name"/>
</rdf:Property>

D. Whole and part conflicts (Cont.)

Case D. Semantic conflicts may exist if one concept is completely contained in another concept (Whole and part). The EOWL primitives “eowl:properSubClassOf” and “eowl:properSubPropertyOf” are indicators to detect this case.

Conflict Resolution Rule D. If whole and part conflicts are detected, the whole attributes should be divided into part attributes or the part attributes should be combined together to whole attributes for smooth data interoperability.
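Both directions of Rule D can be sketched for the Person schemas of Example D. The sketch assumes, purely for illustration, that the whole attribute is written as the surname followed by a space and the given name; real data would need a convention agreed between the sources. The function names and the sample name are hypothetical.

```python
def split_name(name):
    """Divide the whole attribute into its part attributes.

    Assumes the form 'surname givenName', split at the first space.
    """
    surname, _, given_name = name.partition(" ")
    return {"surname": surname, "givenName": given_name}


def combine_name(surname, given_name):
    """Combine the part attributes, in the same sequence, into the whole."""
    return f"{surname} {given_name}".strip()


parts = split_name("Tan Mei Ling")
whole = combine_name(parts["surname"], parts["givenName"])
print(parts, whole)
```

Combining is the safer direction, since splitting depends on knowing where one part ends and the next begins.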

E. Partial similarity conflicts

Example E. Integrating ResearchAssistant and GraduateStudent.

The relationship between research assistant and graduate student is overlap because some research assistants are also graduate students,

but not all research assistants are graduate students,

and not all graduate students are research assistants.

After integration, there should be three schemas:

RNotG: Research Assistant but not Graduate Student

GNotR: Graduate Student but not Research Assistant

RAndG: both Research Assistant and Graduate Student

E. Partial similarity conflicts (Cont.)

Fig. E. Detection of partial similarity conflicts

<owl:Class rdf:ID="ResearchAssistant">
  <eowl:overlap rdf:resource="#GraduateStudent"/>
</owl:Class>

E. Partial similarity conflicts (Cont.)

Case E. Semantic conflicts may exist if two concepts overlap (Partial similarity). The EOWL primitive “eowl:overlap” is the indicator to detect this case.

Conflict Resolution Rule E. If partial similarity conflicts are detected, the overlap part should be separated before integration.
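Separating the overlap as required by Rule E amounts to three set operations. The sketch below assumes each concept's instances are identified by comparable keys (here made-up person IDs); the function name `separate_overlap` is hypothetical, while RNotG, GNotR and RAndG follow the names used in Example E.

```python
def separate_overlap(research_assistants, graduate_students):
    """Partition two overlapping instance sets into three disjoint schemas."""
    ra, gs = set(research_assistants), set(graduate_students)
    return {
        "RNotG": ra - gs,   # research assistants who are not graduate students
        "GNotR": gs - ra,   # graduate students who are not research assistants
        "RAndG": ra & gs,   # members of both
    }


# Hypothetical person identifiers, for illustration only.
result = separate_overlap({"p1", "p2", "p3"}, {"p2", "p3", "p4"})
print(result)
```

The three resulting sets are pairwise disjoint and together cover every instance of both original concepts.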

F. Swap conflicts

Example F. Continued from Example A.

In China, a county is contained in a city (the city has the larger area).

In the US, a city is contained in a county (the county has the larger area).

The domain (“County”) of property “region:containedIn” in the China ontology is just the range of the same property “region:containedIn” in the US ontology

The range (“City”) of property “region:containedIn” in the China ontology is just the domain of the same property “region:containedIn” in the US ontology.

We can add “China.” or “US.” before “City” and “County” for smooth data interoperability.

F. Swap conflicts (Cont.)

Fig. F1. Detection of swap conflicts (the relationship between city and county in the China ontology)

<owl:Class rdf:ID="County">
  <region:containedIn rdf:resource="#City"/>
</owl:Class>

Fig. F2. Detection of swap conflicts (the relationship between city and county in the US ontology)

<owl:Class rdf:ID="City">
  <region:containedIn rdf:resource="#County"/>
</owl:Class>

F. Swap conflicts (Cont.)

Case F. Semantic conflicts may exist if the domain of a property in the first ontology is the range of the same property in the second ontology, and the range of the property in the first ontology is the domain of the same property in the second ontology (Swap).

Conflict Resolution Rule F. If swap conflicts are detected, context restrictions (see Example F) should be added to the schema for smooth data interoperability.
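The swap test of Case F and the context prefixing of Rule F can be sketched as follows. The sketch assumes each ontology has already been reduced to a mapping from property name to a (domain, range) pair; that representation and the function names `find_swaps` and `add_context` are assumptions made for illustration.

```python
def find_swaps(first, second):
    """Return properties whose domain and range are exchanged between ontologies."""
    swapped = []
    for prop, (domain, rng) in first.items():
        if domain != rng and second.get(prop) == (rng, domain):
            swapped.append(prop)
    return swapped


def add_context(ontology, context, props):
    """Prefix the domain and range of the given properties with a context name."""
    return {
        p: (f"{context}.{d}", f"{context}.{r}") if p in props else (d, r)
        for p, (d, r) in ontology.items()
    }


china = {"region:containedIn": ("County", "City")}
us = {"region:containedIn": ("City", "County")}

swaps = find_swaps(china, us)
china_ctx = add_context(china, "China", swaps)
us_ctx = add_context(us, "US", swaps)
print(swaps, china_ctx, us_ctx)
```

After prefixing, “China.City” and “US.City” are distinct names, so the two containment relationships no longer contradict each other.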

Conclusion

We extend OWL with several primitives that have clearer semantics.

We summarize several OWL-based cases in which semantic conflicts are easily encountered.

The conflict resolution rules for each case are presented.

In the future, OWL will be frequently used to build ontologies, and this paper provides a computer-aided approach to detect and resolve semantic conflicts for smooth data interoperability.

References

1. Robert Neches, Richard Fikes, Timothy W. Finin, Thomas R. Gruber, Ramesh Patil, Ted E. Senator, William R. Swartout: Enabling Technology for Knowledge Sharing. AI Magazine 12(3): pp. 36-56 (1991)

2. Sean Luke and Jeff Heflin: SHOE Specification 1.01. http://www.cs.umd.edu/projects/plus/SHOE/spec.html

3. Ora Lassila and Ralph R. Swick: Resource description framework (RDF). http://www.w3c.org/TR/WD-rdf-syntax

4. The SemanticWeb Homepage. http://www.semanticweb.org

5. Dan Brickley and R.V. Guha. Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. http://www.w3.org/TR/rdf-schema/

6. The DARPA Agent Markup Language Homepage. http://daml.semanticweb.org/

7. The Ontology Inference Layer OIL Homepage. http://www.ontoknowledge.org/oil/TR/oil.long.html

8. DAML+OIL Definition. http://www.daml.org/2001/03/daml+oil

9. Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider and Lynn Andrea Stein. OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/
