Page 1
This is a student talk at the Knowledge Engineering course, in which each student is
asked to present his/her synthesis paper.
Course Page: http://jarrar-courses.blogspot.com/2011/09/knowledgeengineering-fall2011.html
Ontology-Based
Data Integration
Knowledge Engineering Course (SCOM7348)
University of Birzeit, Palestine
December, 2012
Synthesis Paper Talk
Ali Ahmad Al Jadaa
Master of Computing, Birzeit University
Page 2
Abstract
• We often need to obtain information from
several local or external sources. Each
source may be built in a different way, so we
face conflicts in meaning, in structure,
and of other kinds.
We will also see examples that show why
data integration is needed.
• Ontologies provide effective solutions to data
integration in several ways, which this talk
highlights.
Page 3
What is Data
• Data is a collection of facts, such as values
or measurements. It can be numbers,
words, measurements, observations or
even just descriptions of things.
Page 4
What is data integration
• Combining data coming from different sources
and providing users with a unified view
of these data.
Page 5
What is Semantics
• Understanding the meaning.
• The relation between signifiers.
• Difference with other
≠
Page 6
What is Ontology
• Set of concepts within a domain, and the
relationships between pairs of concepts. It
can be used to model a domain and support
reasoning about entities.
• A dictionary that computers can understand
• List of things that exist
• Description of the kinds of entities there are
and how they are related
Page 7
Why we need data integration
• The world has become a small village, and
the speed of completing transactions has
become one of the most important features
of any state.
Page 8
Example
• If a student wants to travel to complete his/her studies, he/she
must visit many ministries to prepare:
• 1 - General Certificate of Secondary Education from the school.
• 2 - Ratification certificate from the Ministry of Education.
• 3 - Identification from the Ministry of the Interior.
• 4 - Valid passport from the Ministry of the Interior.
• 5 - Disease-free certificate from the Ministry of Health.
• 6 - Guarantee from a bank.
Page 9
The problem of data integration
• Since the data come from different sources,
and each data source is built in a different way
and with different meanings, we will face many
problems.
Page 10
Name Heterogeneities
• Different names are used for the same concept
(synonyms).
• For example, one schema uses (Code)
• and another one uses (Number) or (No.).
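One common way to handle synonyms is a mapping from each source-specific name to a single canonical name. The sketch below is only an illustration: the table `SYNONYMS`, the helper `to_canonical`, and the sample records are hypothetical and are not part of the talk.

```python
# Hypothetical synonym table: every source-specific column name maps to the
# one canonical name chosen for the integrated view.
SYNONYMS = {
    "Code": "Code",
    "Number": "Code",
    "No.": "Code",
}


def to_canonical(record):
    """Rename the keys of one source record to their canonical names.

    Keys that are not in the synonym table are kept unchanged.
    """
    return {SYNONYMS.get(key, key): value for key, value in record.items()}


# Two made-up records describing the same person in two schemas.
source_a = {"Code": 17, "Name": "Sara"}
source_b = {"No.": 17, "Name": "Sara"}

unified_a = to_canonical(source_a)
unified_b = to_canonical(source_b)
```

After renaming, both records use the key `Code`, so they can be compared or merged directly.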
Page 11
Meaning Heterogeneities
• The same name is used for different
concepts (homonyms). For example, one schema
uses City to mean the birth city, but another
schema uses it to mean the work city.
Page 12
Structure Heterogeneities
• Different information
systems store their data in different
structures.
Page 13
Type Heterogeneities
• The same attribute has different data types.
For example, the attribute "Gender" in one schema uses the String data
type ("Male", "Female"), but in another schema it uses the
Boolean data type (0, 1).
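A type conflict like this is usually resolved by converting every source value to one agreed representation. The sketch below is a minimal illustration. It assumes that 0 encodes "Male" and 1 encodes "Female", which the slide does not state, and the helper `normalize_gender` is hypothetical.

```python
# Assumption (not stated on the slide): 0 encodes "Male" and 1 encodes "Female".
CODE_TO_GENDER = {0: "Male", 1: "Female"}


def normalize_gender(value):
    """Return the gender as a string, whichever encoding the source uses."""
    if isinstance(value, str):
        # String sources: trim stray spaces and fix the letter case.
        return value.strip().capitalize()
    # Boolean or integer sources: look the code up in the assumed table.
    return CODE_TO_GENDER[int(value)]


from_string_schema = normalize_gender(" Female")
from_boolean_schema = normalize_gender(1)
```

Both calls produce the same string, so the two schemas can be compared on this attribute after normalization.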
Page 14
Rules and Constraints Heterogeneities
• Different cardinalities or value constraints apply to the same
relationships. For example, in one schema the age of a student is
between 18 and 25 years, but in another one it is between
18 and 30.
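The age example can be made concrete with a small check of which schema accepts a given value. This is only a sketch: the names `AGE_RANGE_A`, `AGE_RANGE_B`, and `accepted_by` are hypothetical, and the ranges are the two from the example.

```python
# Age constraints of the two schemas in the example (inclusive bounds).
AGE_RANGE_A = (18, 25)
AGE_RANGE_B = (18, 30)


def in_range(age, bounds):
    """True if age lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= age <= high


def accepted_by(age):
    """List the schemas whose constraint accepts a student of this age."""
    accepted = []
    if in_range(age, AGE_RANGE_A):
        accepted.append("A")
    if in_range(age, AGE_RANGE_B):
        accepted.append("B")
    return accepted


conflict_case = accepted_by(27)
```

A 27-year-old student is valid in the second schema only, so an integrated view has to decide which rule wins or keep track of the source of each record.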
Page 15
Model Heterogeneities
• Occurs when different databases adhere to
different data models.
Page 16
Service Oriented Architecture
• Is a set of principles and methodologies for
designing and developing software in the
form of interoperable services[1]
Page 17
Publish-Subscribe Architecture
• networking technologies and products
enable a high degree of connectivity across
a large number of computers, applications,
and users[5]
Page 18
Consolidation
• Involves capturing data from multiple
source systems and integrating it into a single
persistent data store. The latency of the
information in the consolidated data store
depends upon whether batch or real time
data consolidation is being used and how
often the updates are being applied to the
data store.[2]
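A batch consolidation run can be sketched with an in-memory SQLite database standing in for the persistent store. Everything here is an assumption made for illustration: the table `person`, the function `consolidate`, and the two sample sources are hypothetical.

```python
import sqlite3


def consolidate(sources, store):
    """Copy every (code, name) row from each source into the single store.

    One call is one batch run. How often it is called determines how stale
    the consolidated data can become.
    """
    store.execute(
        "CREATE TABLE IF NOT EXISTS person (code INTEGER PRIMARY KEY, name TEXT)"
    )
    for rows in sources:
        # INSERT OR REPLACE keeps one row per code when sources overlap.
        store.executemany(
            "INSERT OR REPLACE INTO person (code, name) VALUES (?, ?)", rows
        )
    store.commit()


# Two made-up source systems with one overlapping record.
ministry_of_education = [(1, "Sara"), (2, "Omar")]
ministry_of_health = [(2, "Omar"), (3, "Lina")]

store = sqlite3.connect(":memory:")
consolidate([ministry_of_education, ministry_of_health], store)
people = store.execute("SELECT code, name FROM person ORDER BY code").fetchall()
```

Running the function on a schedule gives batch consolidation. Calling it on every source change would move it toward the real-time case.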
Page 19
Multibase system
• A multibase (multiple database) system
allows users to view the databases
through a single global schema, giving
users the impression that a federated database
exists.[3]
Page 20
Data Warehouse
• Is a database used for reporting and data
analysis[4]
Page 21
Federated systems (Virtual Data Integration)
• It is characterized by the existence of a
federated schema which establishes the
interface to this integrated system.[4]
Page 23
Solution
• Can be solved with relational views (A to B):
CREATE VIEW Men AS
SELECT Code, Name FROM Person
WHERE Gender = 'Male';
CREATE VIEW Women AS
SELECT Code, Name FROM Person
WHERE Gender = 'Female';
• Or can be solved with a relational view (B to A):
-- The view name was missing on the slide; Person is assumed from schema A above.
CREATE VIEW Person (Code, Name, Gender) AS
SELECT Code, Name, 'Male'
FROM Men
UNION
SELECT Code, Name, 'Female'
FROM Women;
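The two mappings can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the sample rows are made up, and the B-to-A view is given the hypothetical name `PersonFromB` so that it does not clash with the `Person` table in the same database.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    -- Schema A: one table with a Gender column (sample rows are made up).
    CREATE TABLE Person (Code INTEGER, Name TEXT, Gender TEXT);
    INSERT INTO Person VALUES (1, 'Omar', 'Male');
    INSERT INTO Person VALUES (2, 'Sara', 'Female');

    -- A to B: split Person into one view per gender.
    CREATE VIEW Men AS
        SELECT Code, Name FROM Person WHERE Gender = 'Male';
    CREATE VIEW Women AS
        SELECT Code, Name FROM Person WHERE Gender = 'Female';

    -- B to A: rebuild the Gender column from the two views.
    -- PersonFromB is a hypothetical name chosen for this sketch.
    CREATE VIEW PersonFromB AS
        SELECT Code, Name, 'Male' AS Gender FROM Men
        UNION
        SELECT Code, Name, 'Female' AS Gender FROM Women;
    """
)

men = db.execute("SELECT Code, Name FROM Men").fetchall()
original = db.execute(
    "SELECT Code, Name, Gender FROM Person ORDER BY Code"
).fetchall()
round_trip = db.execute(
    "SELECT Code, Name, Gender FROM PersonFromB ORDER BY Code"
).fetchall()
```

Comparing `round_trip` with `original` checks that going from A to B and back loses no rows for this sample data.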
Page 24
References
1. Bell, Michael (2010). SOA Modeling Patterns for Service-Oriented Discovery and
Analysis. Wiley & Sons. p. 390. ISBN 978-0-470-48197-4.
2. Amit P. Sheth and J. A. Larson, Federated Database Systems for Managing
Distributed, Heterogeneous, and Autonomous Databases, ACM Computing
Surveys, Vol. 22, No. 3, pp. 183-236, September 1990.
3. Manuel Garcia-Solaco, Felix Saltor, and Malu Castellanos, Semantic heterogeneity in
multidatabase systems, In Object-oriented Multidatabase Systems: A Solution for
Advanced Applications, Omran A. Bukhres and Ahmed K. Elmagarmid, editors,
Prentice-Hall, 1996, Chapter 5, pp. 129-202.
4. Oracle® Database Application Developer's Guide – Fundamentals 10g Release 1
(10.1), Part Number B10795-01.