Logics for Data and Knowledge Representation The DERA methodology for the development of domain ontologies Feroz Farazi Originally by Fausto Giunchiglia and Biswanath Dutta Modified by Feroz Farazi
Dec 26, 2015
Knowledge Representation (KR)
• Abstraction of the world via models of a particular domain or problem, which allow automatic reasoning and interpretation
• Fundamental goal: to represent knowledge in a manner that facilitates inferring new knowledge (i.e., drawing conclusions) from already known facts, possibly encoded in a knowledge base
Knowledge Representation Properties
According to (Crawford & Kuipers, 1990), a knowledge representation system must have:
• a reasonably compact syntax
• a well-defined semantics, so that one can say precisely what is being represented
• sufficient expressive power to represent human knowledge
• an efficient, powerful and understandable reasoning mechanism
• support for building large knowledge bases
Knowledge Representation Issues
KR issues:
• How do people represent knowledge?
• What is the nature of knowledge?
• Do we have a domain-specific schema or a generic, domain-independent schema?
• How expressive does it need to be?
Ontology
• “formal, explicit specification of a shared conceptualisation” [T. R. Gruber, 1993]
• Models a domain by means of a shared vocabulary, with the definition of objects and/or concepts and their properties and relations
• A structural framework for organizing information, used as a form of KR in fields such as AI, SW, Lib. Sc., Inf. Architecture, etc.
• Can also be used as a language resource
Ontology Properties
Some of the ontological properties are:
• Extendable
• Reusable
• Flexible
• Robust
• …
Domain
• An area of knowledge or field of study that we are interested in or that we are communicating about
• Examples:
  Computer science, Artificial Intelligence, Soft computing, Social networks, …
  Library science, Mathematics, Physics, Chemistry, Agriculture, Geography, …
  Music, Movie, Sculpture, Painting, …
  Food, Wine, Cheese, …
  Space, …
Domain
• A domain can be decomposed into several constituents, each of which denotes a different aspect of entities
• An example from the Space domain: by region, by body of water, by landform, by populated places, by administrative division, by land, by agricultural land, by facility, by altitude, by climate, …
• Each of these aspects is called a facet
Facet
• A hierarchy of homogeneous terms describing an aspect of the domain, where each term in the hierarchy denotes a different concept
• E.g., body of water (e.g., river, lake, pond, canal), landform (e.g., mountain, hill, ridge), facility (e.g., house, hut, farmhouse, hotel, resort), etc.
• Other examples: language facet (e.g., English, Hindi, Italian), property facet, author facet, religion facet (e.g., Christian, Hindu, Muslim), commodity facet, etc.
DERA
• a facet-based knowledge organization framework
• independent of any specific domain
• allows building domain-specific ontologies
• maps to Description Logic (DL):
  logically sound
  decidable
• Developed by the UniTn KnowDive group
DERA Surface Structure
At the surface level, it has the following components:
• D – Domain
• E – Entity
• R – Relation
• A – Attribute
Domain (D)
A DERA domain is a tuple
D = <E, R, A>
Entity (E)
An elementary component that consists of entity classes and their instances, having either perceptual correlates or only conceptual existence in a domain in context. It can be represented as a pair
E = <C, E'>
where
C = a set of entity classes or concepts representing the entities
E' = a set of entities (also called objects, instances or individuals), possibly real-world named entities, which are the instantiations of C
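The two tuples D = <E, R, A> and E = <C, E'> can be sketched as plain data structures. This is only an illustrative reading of the definitions, not the authors' implementation: the class names `Entity` and `Domain`, the method `add_instance`, and the choice to store R and A as sets of names are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """E = <C, E'>: entity classes plus their instances (hypothetical layout)."""
    classes: set[str] = field(default_factory=set)            # C
    instances: dict[str, str] = field(default_factory=dict)   # E': instance name -> class in C

    def add_instance(self, name: str, cls: str) -> None:
        # Every element of E' must instantiate some class in C.
        if cls not in self.classes:
            raise ValueError(f"unknown entity class: {cls}")
        self.instances[name] = cls


@dataclass
class Domain:
    """D = <E, R, A>; R and A are kept as plain sets of names in this sketch."""
    entity: Entity
    relations: set[str] = field(default_factory=set)    # R
    attributes: set[str] = field(default_factory=set)   # A


# A tiny slice of the Space domain, using terms that appear in the slides.
space = Domain(
    entity=Entity(classes={"Mountain", "Lake", "City"}),
    relations={"near", "inside"},
    attributes={"altitude", "depth"},
)
space.entity.add_instance("Monte Bondone", "Mountain")
space.entity.add_instance("Lake Garda", "Lake")
```

Rejecting an instance whose class is not in C is one way to enforce that E' contains only instantiations of C.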
Entity (E)
Entity classes (C):
• Represent the essence of the domain under consideration
• Consist of the core classes representing a domain in context
• E.g., consider the following classes in the context of the Space domain: Mountain, Hill, Lake, River, Canal, Province, City, Hotel, ...
Entity (E)
Entities (E'):
• the real-world named entities
• representations of the real-world entities
• E.g., the Himalaya, Monte Bondone, Lake Garda, Trento, Povo, Hotel America, ...
Entity (E)
[Figure: an example from the Space domain]
Relation (R)
• An elementary component that consists of classes representing relations between entities
R = <{r}>
• {r} is a set of relations
• A relation r is a link between two entities (E')
• It builds a semantic relation between the entities
• E.g., some spatial relations from the Space domain: near, adjacent, inside, before, center, sideways, etc.
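Since a relation r links two entities from E', one simple way to store relations is as (subject, relation, object) triples. The sketch below assumes that representation; the triples themselves and the helper `related` are illustrative and do not come from the DERA knowledge base.

```python
from collections import defaultdict

# Illustrative facts: (subject entity, relation r, object entity).
triples = [
    ("Povo", "inside", "Trento"),
    ("Monte Bondone", "near", "Trento"),
    ("Lake Garda", "near", "Monte Bondone"),
]

# Index the links by relation name so each r can be queried on its own.
by_relation = defaultdict(list)
for subject, relation, obj in triples:
    by_relation[relation].append((subject, obj))


def related(relation, obj):
    """Return, sorted, the entities linked to `obj` by `relation`."""
    return sorted(s for s, o in by_relation.get(relation, []) if o == obj)
```

For instance, `related("near", "Trento")` looks up every entity recorded as near Trento, and a relation with no recorded links simply yields an empty list.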
Attribute (A)
• An elementary component that consists of classes expressing the characteristics of entities
A = <A', C>
where A' is a set of datatype attributes and C is a set of descriptive attributes
• An attribute is any property, or any qualitative, quantitative or descriptive measure, of an entity
Attribute (A) (contd…)
Datatype Attributes (A'):
• The datatype attributes include the attribute classes that account for the quality or quantity of an entity within a domain
• E.g.,
  latitude, longitude (of a place): 45° N, 18° S
  altitude (of a mountain): 8000 ft, 2400 m; high, low
  depth (of a lake): 100 ft, 20 m; deep, shallow
Attribute (A) (contd…)
Descriptive Attributes (C):
• include the attribute classes that describe the entities under a domain in consideration
• the value can consist of a single string (single-valued) or a set of strings (multivalued)
• E.g.,
  natural resource (of a place): coal, natural gas, oil, …
  architectural style (of a castle): {Classical architecture, Greek architecture, Roman architecture, Bauhaus, etc.}
  history (of a place): …
  climbing route (to a mountain): …
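The split A = <A', C> can be illustrated by keeping measured quantities and descriptive strings in separate maps. This is a minimal sketch under stated assumptions: the class `AttributeSet`, its methods, and the key `altitude_m` are hypothetical, and only numeric datatype values are modelled (qualitative values such as high or low are left out).

```python
from dataclasses import dataclass, field


@dataclass
class AttributeSet:
    """Attribute values of one entity, split as in A = <A', C> (hypothetical layout)."""
    datatype: dict[str, float] = field(default_factory=dict)        # A': quantities
    descriptive: dict[str, set[str]] = field(default_factory=dict)  # C: sets of strings

    def describe(self, attribute: str, *values: str) -> None:
        # Descriptive values accumulate, so an attribute may become multivalued.
        self.descriptive.setdefault(attribute, set()).update(values)

    def is_multivalued(self, attribute: str) -> bool:
        return len(self.descriptive.get(attribute, set())) > 1


# A hypothetical castle and mountain, using attribute names from the examples above.
castle = AttributeSet()
castle.describe("architectural style", "Classical architecture", "Roman architecture")

mountain = AttributeSet(datatype={"altitude_m": 2400.0})
mountain.describe("natural resource", "coal")
```

Storing every descriptive value as a set treats the single-valued case as a set of size one, so both cases share one code path.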
Mapping From DERA to DL
Entity classes (C) -> Concepts
Relations (R) -> Roles
Datatype attributes (A') -> Roles
Descriptive attributes (C) -> Roles
Entities (E') -> Individuals
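The mapping can be read as a lookup table from DERA components to DL constructs. The sketch below applies that table to a small vocabulary; the function `to_dl_signature`, the component labels, and the sample vocabulary are assumptions made for illustration, and no actual DL reasoner is involved.

```python
# DERA component -> DL construct, following the mapping listed above.
DERA_TO_DL = {
    "entity class": "concept",
    "relation": "role",
    "datatype attribute": "role",
    "descriptive attribute": "role",
    "entity": "individual",
}


def to_dl_signature(vocabulary):
    """Group DERA terms, given as (term, component) pairs, by DL construct."""
    signature = {"concept": set(), "role": set(), "individual": set()}
    for term, component in vocabulary:
        signature[DERA_TO_DL[component]].add(term)
    return signature


# Terms taken from the Space domain examples in the slides.
vocabulary = [
    ("Lake", "entity class"),
    ("near", "relation"),
    ("depth", "datatype attribute"),
    ("natural resource", "descriptive attribute"),
    ("Lake Garda", "entity"),
]
signature = to_dl_signature(vocabulary)
```

Relations and both kinds of attributes end up in the same set of roles, which mirrors the three "-> Roles" rows of the mapping.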
Methodology
Step 1: Identification of the atomic concepts
Step 2: Analysis (per genus et differentiam)
Step 3: Synthesis
Step 4: Standardization
Step 5: Ordering
Following the above steps leads to the creation of a set of facets, which constitute a faceted representation scheme for a domain.
Ontological Principles
• Relevance (e.g., breed is a more realistic characteristic than grade for classifying the universe of cows)
• Ascertainability (e.g., flowing body of water)
• Permanence (e.g., spring: a natural flow of ground water)
• Exhaustiveness (e.g., to classify the universe of people, we need both male and female)
• Exclusiveness (e.g., age and date of birth both produce the same divisions)
• Context (e.g., bank: a bank of a river, or a building of a financial institution)
  Important: helps in reducing homographs
• Currency (e.g., metro station vs. subway station)
• Reticence (e.g., minority author)
• Ordering
  Important: ordering carries semantics, as it provides implicit relations between the coordinate terms
Identification of the atomic concepts
Sources of the concepts:
• WordNet
• GeoNames
• TGN
• Literature
Identification of the atomic concepts
Some of the relevant sub-trees in WordNet are:
• location
• artifact, artefact
• body of water, water
• geological formation, formation
• land, ground, soil
• land, dry land, earth, ground, solid ground, terra firma
Note: not all the nodes in these sub-trees need to be part of the Space domain. For example, descendants of artifact such as article, anachronism, block, etc. are not.
Analysis
Mountain
• a well-defined elevated land
• formed by the geological formation (where geological formation is a natural phenomenon)
• altitude in general >500m
Hill
• a well-defined elevated land
• formed by the geological formation (where geological formation is a natural phenomenon)
• altitude in general <500m
Stream
• a body of water
• a flowing body of water
• no fixed boundary
• confined within a bed and stream banks
River
• a body of water
• a flowing body of water
• no fixed boundary
• confined within a bed and stream banks
• larger than a brook
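Analysis per genus et differentiam can be illustrated by treating each concept as a set of characteristics: the shared characteristics point to the genus, and the remaining ones are the differentia. The set-based reading and the helper names `differentia` and `is_species_of` are assumptions of this sketch, not the procedure used by the DERA authors.

```python
# Characteristics copied from the analysis of Stream and River above.
stream = {
    "a body of water",
    "a flowing body of water",
    "no fixed boundary",
    "confined within a bed and stream banks",
}
river = stream | {"larger than a brook"}

# Mountain and Hill share their genus and differ only in altitude.
elevated_land = {"a well-defined elevated land", "formed by the geological formation"}
mountain = elevated_land | {"altitude in general >500m"}
hill = elevated_land | {"altitude in general <500m"}


def differentia(concept, genus):
    """Characteristics that distinguish `concept` from its candidate genus."""
    return concept - genus


def is_species_of(concept, genus):
    """True if `concept` has every characteristic of `genus` plus at least one more."""
    return genus < concept
```

Under this reading River is a species of Stream, while Mountain and Hill are coordinate terms: neither contains the other, and their intersection recovers the common genus.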
Synthesis
Body of water
  Flowing body of water
    Stream
      Brook
      River
  Stagnant body of water
    Pond
Landform
  Natural depression
    Oceanic depression
      Oceanic valley
      Oceanic trough
    Continental depression
      Trough
      Valley
  Natural elevation
    Oceanic elevation
      Seamount
      Submarine hill
    Continental elevation
      Hill
      Mountain
* each term above has a gloss and is linked to synonymous terms in the knowledge base
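A synthesized facet is a tree, so the chain of broader terms for any concept can be read off by walking it. The sketch below encodes part of the Body of water facet as a nested dict, assuming that Brook and River sit under Stream; the function `path_to` is a hypothetical helper.

```python
# Part of the Body of water facet as a nested dict (term -> narrower terms).
# The nesting of Brook and River under Stream is an assumption of this sketch.
facet = {
    "Body of water": {
        "Flowing body of water": {"Stream": {"Brook": {}, "River": {}}},
        "Stagnant body of water": {"Pond": {}},
    }
}


def path_to(term, tree, prefix=()):
    """Return the chain of terms from the facet root down to `term`, or None."""
    for node, children in tree.items():
        chain = prefix + (node,)
        if node == term:
            return chain
        found = path_to(term, children, chain)
        if found is not None:
            return found
    return None
```

Calling `path_to("River", facet)` walks the tree depth-first and returns the root-to-term chain; a term from another facet, such as Hill, is not found and yields `None`.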
Facets and sub-facets
Space [Domain]
by geographical feature [Entity class]
  by water formation
  by land formation
  by land
  by administrative division
  …
by relations [Relation]
  spatial relation
    direction, internal, external, longitudinal, sideways, etc.
  functional relation (e.g., primary inflow, primary outflow)
  …
by attribute
  [Datatype attribute]
    latitude
    longitude
    dimension
    …
  [Descriptive attribute]
    natural resource
    architectural style
    time zone
    pH
    history
    …
Log-in: http://uk.disi.unitn.it/resources/html/UKDomain.html
References
F. Giunchiglia and B. Dutta. DERA: A Faceted Knowledge Organization Framework. Technical report, KnowDive, DISI, University of Trento, 2010.
B. Dutta, F. Giunchiglia, and V. Maltese. A facet-based methodology for geo-spatial modelling. GEOS, 2011.
J. M. Crawford and B. Kuipers. ALL: Formalizing Access Limited Reasoning. In Principles of Semantic Networks: Explorations in the Representation of Knowledge, Morgan Kaufmann, pages 299-330, 1990.
S. R. Ranganathan. Prolegomena to Library Classification. Asia Publishing House, 1967.
T. R. Gruber. A translation approach to portable ontologies. Knowledge Acquisition, 5(2):199-220, 1993.