Page 1
Tarcisio MENDES DE FARIAS – t.mendesdefarias@active3D.net – Ph.D. Candidate, Research Group Checksem – Laboratory LE2I (UMR CNRS 6306) – University of Burgundy
FOWLA, A Federated Architecture for Ontologies
Tarcisio Mendes de Farias, Ana Roxin and Christophe Nicolle
The 9th International Web Rule Symposium August 2-5, 2015 Freie Universität Berlin, Berlin, Germany
Page 2
CONTEXT
o The volume of data to process and share has grown exponentially since the advent of the Internet
o The Web of Data is put forward as a solution for publishing structured data on the Web
o New ontologies and related vocabularies keep emerging
Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
Page 3
PROBLEM
o Data integration in the context of enterprise information systems and Semantic Web
o 3 layers of data interoperability
– Physical (e.g. network protocols)
– Syntactic (e.g. XML)
– Semantic (e.g. RDF, OWL)
o Mechanisms for semantic interoperability are needed
Page 4
PROBLEM
o Semantic heterogeneity
– Schema vs Data heterogeneity
o Full data integration is only possible when both are addressed:
– Schema
– Data
Source: cloudtweaks.com
Page 5
GOALS AND PROPOSED SOLUTIONS
o Mitigating semantic heterogeneity
– Solution: interoperability at the schema (data model) level
o Tackling semantic data interoperability
– Solution:
• A loosely coupled federated architecture for OWL ontologies
• A rule-based integration of several autonomous ontologies
Page 6
BACKGROUND
o Ontology Matching
– Tackling complex alignments (user involvement)
Source: www.webology.org/2006/v3n3/a28.html
o Ontology Alignment
– Alignment format (e.g. SWRL)
onto2:C21(?x1) ∧ onto2:C22(?x6) ∧ onto2:C23(?x3) ∧ … ∧ onto2:p28(?x7, ?x8) ∧ onto2:p26(?x5, ?x7) ∧ onto2:p27(?x6, "Category") ∧ onto2:p28(?x3, "ProductResource") → onto1:p11(?x1, ?x8)
Page 7
BACKGROUND
o Target and source ontologies
[Figure labels: Target, Source, rdf:type, onto:email, “[email protected] ”^^xsd:string]
Page 8
RELATED WORK
o Interoperability for different database schemas
– Non-federated (e.g. centralized database)
– Federated database architecture
“Collection of components to unite loosely coupled federation in order to share and exchange information” using “an organization model based on equal, autonomous databases, with sharing controlled by explicit interfaces.” [1]
[1] Heimbigner, D., and McLeod, D. A Federated Architecture for Information Management. ACM Trans. Off. Inf. Syst. 3(3), 253–278 (1985).
Page 9
RELATED WORK
o Makris et al. [2] and Correndo et al. [3]
– SPARQL query rewriting approaches for data interoperability
– Graph pattern rewriting based on ontology alignments
– Semantic interoperability over various ontologies
o Main drawbacks
– Scenarios with several source and target ontologies are not handled
– Impossible to write queries using terms from different ontologies
– No inference capabilities
[2] Makris et al. Ontology mapping and SPARQL rewriting for querying federated RDF data sources. In Proceedings of the 2010 International Conference on On the Move to Meaningful Internet Systems: Part II, OTM’10, pages 1108–1117, Berlin (2010).
[3] Correndo et al. SPARQL query rewriting for implementing data integration over linked data. In Proceedings of the 2010 EDBT/ICDT Workshops, pages 4:1–4:11, New York, NY, USA. ACM (2010).
Page 10
FOWLA
o Federated architecture for OWL ontologies
“We define FOWLA as an architecture based on autonomous ontologies with sharing described through a rule-based format controlled by inference mechanisms.”
Page 11
FOWLA – General architecture
[Figure labels: Autonomous ontologies; Ontology alignments (rule-based); Inference mechanisms]
Page 12
FOWLA – FD Component
o Separating alignments from the ontology definition
o Federal Logical Schema (FLS)
– A set of logical DL-safe rules
– OWL + SWRL
– Rules alone cannot create new concept instances
o Federal Concept Instantiation (FCI)
– Creating instances for encapsulated data
Page 13
FOWLA – FD Component
o Interoperability over two OWL ontologies
[Figure labels: Onto1 TBox, Onto1 ABox, Onto2 TBox, Onto2 ABox]
Page 14
FOWLA – FD Component
swrl1: onto1:Car(?x) → onto2:Motor_Car(?x)
swrl2: onto2:Motor_Car(?x) → onto1:Car(?x)
swrl3: onto1:Car(?x) ∧ onto1:hasColour(?x, ?y) ∧ onto1:Colour(?y) ∧ onto1:hasName(?y, ?z) → onto2:hasBodyColour(?x, ?z)
[Figure labels: Onto1 TBox, Onto1 and Onto2 ABox, Onto2 TBox, FLS]
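A minimal Python sketch of how rules swrl1 to swrl3 could be applied to a toy ABox. The individuals (`car1`, `car2`, `colour1`), the tuple encoding of atoms and the naive fixpoint loop are assumptions made for illustration only; they are not the FOWLA implementation, which delegates this work to an OWL reasoner with a SWRL engine.

```python
def is_var(term):
    """SWRL variables are written ?x, ?y, ... as on the slide."""
    return term.startswith("?")


def match(atom, fact, binding):
    """Unify one rule atom with one fact; return the extended binding or None."""
    (pred, args), (fact_pred, fact_args) = atom, fact
    if pred != fact_pred or len(args) != len(fact_args):
        return None
    extended = dict(binding)
    for term, value in zip(args, fact_args):
        if is_var(term):
            if extended.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None
    return extended


def solutions(body, facts, binding=None):
    """Yield every variable binding that satisfies all atoms of a rule body."""
    binding = {} if binding is None else binding
    if not body:
        yield binding
        return
    for fact in facts:
        extended = match(body[0], fact, binding)
        if extended is not None:
            yield from solutions(body[1:], facts, extended)


def apply_rules(rules, facts):
    """Naive forward application: repeat until no rule adds a new fact."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, (head_pred, head_args) in rules:
            # list(...) finishes matching before the fact set is modified
            for binding in list(solutions(body, facts)):
                derived = (head_pred, tuple(binding[a] for a in head_args))
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


FLS = [
    ([("onto1:Car", ("?x",))], ("onto2:Motor_Car", ("?x",))),        # swrl1
    ([("onto2:Motor_Car", ("?x",))], ("onto1:Car", ("?x",))),        # swrl2
    ([("onto1:Car", ("?x",)), ("onto1:hasColour", ("?x", "?y")),
      ("onto1:Colour", ("?y",)), ("onto1:hasName", ("?y", "?z"))],
     ("onto2:hasBodyColour", ("?x", "?z"))),                          # swrl3
]

# Hypothetical ABox: car1 is described with Onto1 terms, car2 with Onto2 terms.
ABOX = {
    ("onto1:Car", ("car1",)),
    ("onto1:hasColour", ("car1", "colour1")),
    ("onto1:Colour", ("colour1",)),
    ("onto1:hasName", ("colour1", "red")),
    ("onto2:Motor_Car", ("car2",)),
}

inferred = apply_rules(FLS, ABOX) - ABOX
```

In this toy case `inferred` holds the three facts that make `car1` visible through Onto2 terms and `car2` visible as an `onto1:Car`.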
Page 15
FOWLA – FD Component
swrl1: onto1:Car(?x) → onto2:Motor_Car(?x)
swrl2: onto2:Motor_Car(?x) → onto1:Car(?x)
swrl3: onto1:Car(?x) ∧ onto1:hasColour(?x, ?y) ∧ onto1:Colour(?y) ∧ onto1:hasName(?y, ?z) → onto2:hasBodyColour(?x, ?z)
swrl4: onto2:Motor_Car(?x) ∧ onto2:hasBodyColour(?x, ?z) ∧ onto1:Colour(?y) ∧ onto1:hasColour(?x, ?y) → onto1:hasName(?y, ?z)
[Figure labels: FLS, Onto1 TBox, Onto1 and Onto2 ABox, Onto2 TBox, FCI]
Page 16
FOWLA – FC Component
o Performs the bulk of necessary inferences
o Contains the following sub-modules:
– Rule Selector (RS)
– Rule Engine associated with a DL reasoner
o Controls the interoperation among the considered ontologies based on a set of rules and DL formalisms (e.g. OWL)
Page 17
FOWLA – FC Component
o The RS is responsible for improving backward-chaining reasoning
– The number of rules strongly affects query execution time
– Integrates access policies
o Why a backward-chaining (or hybrid) reasoner?
– Avoids materializing considerable amounts of inferred data
– With materialization, any modification requires re-computing all inferred data
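The trade-off between materialization and backward chaining can be sketched in Python on the two class rules swrl1 and swrl2 from the FD Component slides. The facts and function names below are hypothetical, and the sketch only handles single-class rules; a real reasoner is considerably more involved.

```python
# (body class, head class) pairs for swrl1 and swrl2
RULES = [
    ("onto1:Car", "onto2:Motor_Car"),
    ("onto2:Motor_Car", "onto1:Car"),
]

# Hypothetical asserted facts: (individual, class)
FACTS = {("car1", "onto1:Car"), ("car2", "onto2:Motor_Car")}


def materialize(facts, rules):
    """Forward chaining: compute and store every inferable fact up front."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for individual, cls in list(inferred):
                if cls == body and (individual, head) not in inferred:
                    inferred.add((individual, head))
                    changed = True
    return inferred


def holds(individual, cls, facts, rules, visiting=frozenset()):
    """Backward chaining: prove one goal on demand, storing nothing."""
    if (individual, cls) in facts:
        return True
    if cls in visiting:  # swrl1 and swrl2 refer to each other: stop the cycle
        return False
    visiting = visiting | {cls}
    return any(
        holds(individual, body, facts, rules, visiting)
        for body, head in rules
        if head == cls
    )
```

With materialization the store doubles from two to four facts and must be recomputed whenever `FACTS` or `RULES` change; `holds("car1", "onto2:Motor_Car", FACTS, RULES)` answers the same question at query time without storing anything.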
Page 18
FOWLA – Implementation
Page 19
FOWLA – Pre-processing Phase
o Alignments converted to a rule format (e.g. SWRL)
o Query Module
– Identifies every alignment that exhibits schema heterogeneity
– For each one, the missing properties are materialized along with new instances
swrl4: onto2:Motor_Car(?x) ∧ onto2:hasBodyColour(?x,?z) ∧ onto1:Colour(?y) ∧ onto1:hasColour( ?x, ?y) → onto1:hasName(?y,?z)
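Rule swrl4 can only fire if an `onto1:Colour` individual is already linked to the car through `onto1:hasColour`, and the rules themselves cannot create that individual. The sketch below shows one way the pre-processing step could materialize it. The `fci:` identifiers, the function names and the one-colour-per-car policy are assumptions for illustration, not the authors' Query Module.

```python
from itertools import count


def instantiate_missing_colours(facts):
    """Mint an onto1:Colour individual for each onto2:Motor_Car that has a
    body colour but no onto1:hasColour link yet (one new instance per car)."""
    facts = set(facts)
    fresh_ids = count(1)
    motor_cars = {args[0] for pred, args in facts if pred == "onto2:Motor_Car"}
    coloured = {args[0] for pred, args in facts if pred == "onto1:hasColour"}
    for pred, args in sorted(facts):  # sorted() copies, so adding below is safe
        car = args[0]
        if (pred == "onto2:hasBodyColour" and car in motor_cars
                and car not in coloured):
            colour = f"fci:colour{next(fresh_ids)}"  # hypothetical identifier
            facts.add(("onto1:Colour", (colour,)))
            facts.add(("onto1:hasColour", (car, colour)))
            coloured.add(car)
    return facts


def apply_swrl4(facts):
    """Evaluate swrl4 once and return the onto1:hasName facts it derives."""
    derived = set()
    body_colours = [args for pred, args in facts if pred == "onto2:hasBodyColour"]
    links = [args for pred, args in facts if pred == "onto1:hasColour"]
    for car, name in body_colours:
        for linked_car, colour in links:
            if (car == linked_car
                    and ("onto2:Motor_Car", (car,)) in facts
                    and ("onto1:Colour", (colour,)) in facts):
                derived.add(("onto1:hasName", (colour, name)))
    return derived


# Hypothetical Onto2-only description of a car
abox = {
    ("onto2:Motor_Car", ("car2",)),
    ("onto2:hasBodyColour", ("car2", "blue")),
}
prepared = instantiate_missing_colours(abox)
```

On the raw `abox`, `apply_swrl4` derives nothing; on `prepared`, it derives the name of the newly created colour instance.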
Page 20
FOWLA – Query Execution Phase
o Selection of the specific rules needed to answer a given query over the federated ontologies
swrl1: onto1:Car(?x) → onto2:Motor_Car(?x)
swrl2: onto2:Motor_Car(?x) → onto1:Car(?x)
swrl3: onto1:Car(?x) ∧ onto1:hasColour(?x, ?y) ∧ onto1:Colour(?y) ∧ onto1:hasName(?y, ?z) → onto2:hasBodyColour(?x, ?z)
swrl4: onto2:Motor_Car(?x) ∧ onto2:hasBodyColour(?x, ?z) ∧ onto1:Colour(?y) ∧ onto1:hasColour(?x, ?y) → onto1:hasName(?y, ?z)
SELECT ?x ?y WHERE { ?x rdf:type onto2:Motor_Car . ?x onto2:hasBodyColour ?y }
[Figure labels: FLS, ARS]
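One plausible way to select the rules relevant to a query is a dependency closure: start from the terms used in the query, keep every rule whose head produces one of them, then add the terms of that rule's body and repeat. The Python sketch below follows this idea. It is an assumption about how such a selector could work, not the authors' RS algorithm, and the rule named `unrelated` (with its `onto1:Owner` and `onto2:Proprietor` terms) is invented to show that irrelevant rules are left out.

```python
# rule name -> (terms used in the body, term produced by the head)
RULES = {
    "swrl1": ({"onto1:Car"}, "onto2:Motor_Car"),
    "swrl2": ({"onto2:Motor_Car"}, "onto1:Car"),
    "swrl3": ({"onto1:Car", "onto1:hasColour", "onto1:Colour", "onto1:hasName"},
              "onto2:hasBodyColour"),
    "swrl4": ({"onto2:Motor_Car", "onto2:hasBodyColour", "onto1:Colour",
               "onto1:hasColour"}, "onto1:hasName"),
    "unrelated": ({"onto1:Owner"}, "onto2:Proprietor"),  # hypothetical rule
}


def select_rules(query_terms, rules):
    """Return the names of the rules that can contribute to the query terms."""
    needed = set(query_terms)
    selected = set()
    changed = True
    while changed:
        changed = False
        for name, (body, head) in rules.items():
            if name not in selected and head in needed:
                selected.add(name)
                needed |= body  # the body terms may need rules of their own
                changed = True
    return selected


# Terms of the SPARQL query on the slide
ars = select_rules({"onto2:Motor_Car", "onto2:hasBodyColour"}, RULES)
```

Under this closure all four slide rules are selected for the slide's query, because swrl3 depends on `onto1:hasName`, which swrl4 produces; only the invented rule is dropped. A selector that also looks at the data or at access policies could prune further.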
Page 21
FOWLA BENEFITS
o Avoiding data redundancy
o Inferring new ontology alignments
o Modularizing the maintainability
o Querying with vocabulary terms issued from different ontologies
o Improving query execution time
Page 22
FOWLA BENEFITS
o Inferring new ontology alignments
Page 23
FOWLA BENEFITS
o Modularizing the maintainability
– Modification in IS(A,D) – { IS(A,B) ∩ IS(A,D) }
Page 24
EVALUATION
o We consider two aligned ontologies
– FLS composed of 474 SWRL rules
o Triple store: Stardog
– OWL reasoner associated with a SWRL engine
– Based on backward-chaining reasoning
OWL entities          Onto1       Onto2
Classes               30          802
Object properties     32          1292
Data properties       125         247
Inverse properties    7           115
Triples in the TBox   2212        9978
DL expressivity       ALCHIF(D)   ALUIF(D)
Page 25
EVALUATION
KB    Number of rules   Characteristics
KB1   474               All the rules contained in the FLS (all the rules forming the alignment between Onto1 and Onto2)
KB2   266               All subsumption rules along with all the rules that have elements from Onto1 in their head
KB3   178               All rules from KB2 minus some of the rules that have elements from Onto1 in their head (we aimed at reducing the data inferred)
KB4   variable          All the rules contained in the Activated Rule Set (ARS) conceived by the RS
o Experiment environment
– Each repository’s ABox contains 1,146,294 triples
– Server: Intel Xeon CPU E5-2430 at 2.2 GHz using 2 of its 6 cores, 8 GB of DDR3 RAM (Java heap = 6 GB)
– Client: Intel Core i7-4790 CPU at 3.6 GHz with 4 cores, 8 GB of DDR3 RAM at 1600 MHz (Java heap = 1 GB)
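The KB2 definition (subsumption rules plus rules whose head uses Onto1 elements) can be read as a simple filter over the rule set. The Python sketch below applies that reading to the four example rules swrl1 to swrl4 from the earlier slides. Treating a "subsumption rule" as a rule with a single class atom in the body and a class atom in the head is an assumption, as are the data layout and the function names.

```python
# Class names used in the example rules (everything else is a property)
CLASSES = {"onto1:Car", "onto1:Colour", "onto2:Motor_Car"}

# rule name -> (predicates in the body, predicate in the head)
RULES = {
    "swrl1": (["onto1:Car"], "onto2:Motor_Car"),
    "swrl2": (["onto2:Motor_Car"], "onto1:Car"),
    "swrl3": (["onto1:Car", "onto1:hasColour", "onto1:Colour", "onto1:hasName"],
              "onto2:hasBodyColour"),
    "swrl4": (["onto2:Motor_Car", "onto2:hasBodyColour", "onto1:Colour",
               "onto1:hasColour"], "onto1:hasName"),
}


def is_subsumption(body, head):
    """Assumed reading: one class atom implies another class atom."""
    return len(body) == 1 and body[0] in CLASSES and head in CLASSES


def kb2_like(rules):
    """Keep subsumption rules and rules whose head is an Onto1 element."""
    return {
        name
        for name, (body, head) in rules.items()
        if is_subsumption(body, head) or head.startswith("onto1:")
    }


subset = kb2_like(RULES)
```

On the four example rules this keeps swrl1, swrl2 and swrl4 and drops swrl3, whose head is an Onto2 property.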
Page 26
EVALUATION
Query name   SPARQL query
Q1           SELECT ?x ?y WHERE { ?x onto1:p11 ?y . }
Q2           SELECT ?x ?y WHERE { ?x a onto2:C21 . ?x onto1:p11 ?y . }
Q3           SELECT ?x ?u WHERE { ?x a onto1:C11 . ?y a onto2:C22 . ?x onto1:p12 ?y . ?y onto1:p11 ?x . }

Query   KB    Mean execution time (s)   Standard deviation (s)   #RuleSet   #Results
Q1      KB1   -                         -                        474        0
Q1      KB2   -                         -                        266        0
Q1      KB3   9.25                      12.21                    178        1683
Q1      KB4   2.23                      1.78                     16         38318
Q2      KB1   -                         -                        474        0
Q2      KB2   -                         -                        266        0
Q2      KB3   32.99                     0.75                     178        74
Q2      KB4   0.16                      0.04                     2          74
Q3      KB1   -                         -                        474        0
Q3      KB2   -                         -                        266        0
Q3      KB3   71.62                     0.95                     178        0
Q3      KB4   0.88                      0.43                     5          9
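As a quick reading aid, the ratio of mean execution times between KB3 and KB4 can be computed directly from the table. The Python sketch below does only that arithmetic; the dictionary layout and function name are illustrative. KB1 and KB2 are left out because the table reports no times for them.

```python
# Mean execution times in seconds, copied from the results table
MEAN_TIME_S = {
    "Q1": {"KB3": 9.25, "KB4": 2.23},
    "Q2": {"KB3": 32.99, "KB4": 0.16},
    "Q3": {"KB3": 71.62, "KB4": 0.88},
}


def speedup(query):
    """How many times faster the mean KB4 run is compared with KB3."""
    times = MEAN_TIME_S[query]
    return times["KB3"] / times["KB4"]


for query in MEAN_TIME_S:
    print(f"{query}: KB4 is about {speedup(query):.1f}x faster than KB3")
```

The ratios are roughly 4x for Q1, 206x for Q2 and 81x for Q3. Only Q2 returns the same number of results under both rule sets (74), so the Q1 and Q3 ratios compare runs that do not produce the same answers.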
Page 27
CONCLUSION
o An approach for federating ontologies in order to address the problem of semantic interoperability
o Advantages:
– Allows composing queries using terms from different ontologies (be it source or target)
– Takes advantage of existing inference mechanisms for deducing new knowledge
– Reduces execution time for queries addressed over rule-based alignments
Page 28
FUTURE WORK
o Defining the strategies for ordering ontologies to be aligned
o Integration of SWRL built-ins (e.g. swrlb) at the level of the FLS
o Investigating the use of query languages other than SPARQL for implementing our approach