D1 - 09/01/2009
Adding Semantic to Web Data and Services
Part 2 – DL Knowledge Base Reasoning
Doctoral School, St Etienne January 2009
Alain Léger
FT R&D Orange Labs Research
DR Knowledge Processing (KRR)
Manager Industry Area IST NoEs OntoWeb and Knowledgeweb (2000-2007)
Associated DR CNRS Lyon I - LIRIS
D2 - 09/01/2009
Course 1 Outline (5 Jan 09 13:30 – 17:15 / 6 Jan 09 8:00 – 11:45)
• Why add semantics to the Web? (1h30)
  – Introduction
  – Take Away and References
• Foundations of Semantic Web (2h15)
  – Introduction to Description Logics
  – Standard Inferences and Tableau
• From XML, RDF to OWL (2h45)
  – XML, RDF, RDF-S
  – OWL
• Applications and Roadmap (1h)
  – Application Scenarios
  – Future visions and technological bottlenecks
p-3 - 09/01/2009
Ontologies
p-4 - 09/01/2009
Ontology, the key ingredient: Origins and History
Ontology in Philosophy
A philosophical discipline: a branch of philosophy that deals with the nature and the organisation of reality
• Science of Being (Aristotle, Metaphysics, IV, 1)
• Tries to answer the questions: What characterizes being? Eventually, what is being?
• How should things be classified?
p-5 - 09/01/2009
Classification: An Old Problem
The representations of the "Système figuré": November 1750 and June 1751, published in the Encyclopédie ou Dictionnaire raisonné des sciences, des arts et métiers, by a Société de gens de Lettres … Volume I, 1751
p-6 - 09/01/2009
Machine Intelligence and Turing Test
Human-machine dialogue
Knowledge acquisition
Knowledge representation
Automated reasoning
Automated language processing
Emotions
The Phaistos Disc (1700 BC), still undeciphered, can perhaps be thought of as the earliest typewritten work. It was discovered on the 3rd of July 1908 by L. Pernier, during an excavation he supervised at the Minoan palace of Phaistos in southern Crete.
p-7 - 09/01/2009
Ontology in Computer Science
• An ontology is an engineering artefact consisting of:
  – A vocabulary used to describe (a particular view of) some domain
  – An explicit specification of the intended meaning of the vocabulary
    • almost always includes how concepts should be classified
  – Constraints capturing additional knowledge about the domain
• Ideally, an ontology should:
  – Capture a shared understanding of a domain of interest
  – Provide a formal and machine manipulable model of the domain
D8 - 09/01/2009
What is a concept?
• Concepts or “classes”:
  – Are in general language independent (the words ‘university’ and ‘ollscoil’ denote the same concept)
  – Are mental or logical representations of reality
  – Are related to other concepts
  – Do not need symbols but hold them for means of communication
• A concept has:
  – Intension, i.e. meaning
  – Extension, i.e. a set of objects that the concept refers to
• Ontology is mainly concerned with intension
D9 - 09/01/2009
Components of an ontology
• Concepts
  – Cat
  – Dog
• Properties
  – Length
  – Age
• Constraints
  – Cardinality is at least 1
  – Maximum value is 300
• Axioms
  – Cows are larger than dogs
  – Cats cannot eat only vegetation
• Relationships
  – Is a
  – Part of
Illustration
p-10 - 09/01/2009
Example Ontology (1)
• Vocabulary and meaning (“definitions”)
  – Elephant is a concept whose members are a kind of animal
  – Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants
  – Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years
• Background knowledge/constraints on the domain (“general axioms”)
  – Adult_Elephants weigh at least 2,000 kg
  – All Elephants are either African_Elephants or Indian_Elephants
  – No individual can be both a Herbivore and a Carnivore
p-11 - 09/01/2009
Example Ontology (2)
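One possible Description Logic reading of the statements in Example Ontology (1) is sketched below. The role names eats, partOf, age and weight, the concept name Plant, and the datatype predicates on age and weight are assumptions introduced for illustration; they are not given in the slides.

```latex
\begin{align*}
\mathit{Elephant} &\sqsubseteq \mathit{Animal} \\
\mathit{Herbivore} &\equiv \mathit{Animal} \sqcap \forall \mathit{eats}.(\mathit{Plant} \sqcup \exists \mathit{partOf}.\mathit{Plant}) \\
\mathit{Adult\_Elephant} &\equiv \mathit{Elephant} \sqcap \exists \mathit{age}.{>}_{20} \\
\mathit{Adult\_Elephant} &\sqsubseteq \exists \mathit{weight}.{\geq}_{2000} \\
\mathit{Elephant} &\sqsubseteq \mathit{African\_Elephant} \sqcup \mathit{Indian\_Elephant} \\
\mathit{Herbivore} &\sqsubseteq \neg \mathit{Carnivore}
\end{align*}
```

The first three lines play the role of definitions; the last three are general axioms that constrain the domain.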
D12 - 09/01/2009
Implementing or creating ontologies
• Implementation consists in defining all the ontology components through an ontology definition language
• Generally in two stages:
  – Informal stage:
    • Ontology is sketched out using either natural language descriptions or some diagram technique
  – Formal stage:
    • Ontology is encoded in a formal knowledge representation language that is machine computable
• Different tools (e.g., Protégé) may help in the implementation
  • http://protege.stanford.edu/overview/protege-owl.html
[Figure: ontology development steps: Determine Scope, Consider Re-use, Enumerate Terms, Define Classes, Define Properties, Define Constraints, Create Instances]
p-13 - 09/01/2009
Example Ontology (Editor Protégé)
p-14 - 09/01/2009
Example Ontology (Editor OilEd)
p-15 - 09/01/2009
Where are ontologies used?
• e-Science, e.g., Bioinformatics
  – The Gene Ontology
  – The Protein Ontology (MGED)
  – “in silico” investigations relating theory and data
• Medicine
  – Terminologies
• Databases
  – Integration
  – Query answering
• User interfaces
• Linguistics
• The Semantic Web
D16 - 09/01/2009
Ontology in a nutshell
• “Ontology is an explicit conceptualisation, formal and shared” [Gruber 95] [Borst 97]
• Thus, an ontology describes a formal specification of a certain domain:
  – Shared understanding of a domain of interest
  – Formal and machine manipulable model of a domain of interest
Aristotle's ten categories include:
• Substance. E.g., individual man.
• Quantity. E.g., two cubits.
• Quality. E.g., white.
• Relation. E.g., double.
• Location. E.g., in the market.
Aristotle was the ontologist of common sense reality
METAPHYSICS: ONTOLOGY and EPISTEMOLOGY
Ontology
“Discourse on being”. Fundamental questions: “What exists?”; “What is the content of reality?”; “How does reality work?”; “What are the origins of reality?”; “What is the future of reality?”
Epistemology
“Discourse on knowledge”. The central question is: “How do we know?”
• Axioms (axiomatic relations between concepts or roles)
  – e.g., Female ⊑ Person
  – e.g., HappyFather ⊑ Father ⊓ ≥1 hasChild.Woman ⊓ ≥1 hasChild.Man
• Operators (for forming concepts and roles)
  – And (⊓), Or (⊔), Not (¬)
  – Universal quantifier (∀), existential quantifier (∃)
  – Number restrictions: ≤, ≥, =
  – Inverse role (⁻): hasParent = hasChild⁻
  – Transitive role (⁺): hasBrother(Bob, David), hasBrother(David, Mack) → hasBrother(Bob, Mack)
  – Role hierarchy: hasMother ⊑ hasParent
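The role operators above can be illustrated on finite relations. The following is a minimal sketch, assuming a role extension is represented as a Python set of pairs; the individual Mary and the helper names `inverse` and `transitive_closure` are hypothetical and only serve the illustration.

```python
def inverse(role):
    """Inverse role R^-: swap the two components of every pair."""
    return {(y, x) for (x, y) in role}


def transitive_closure(role):
    """Smallest transitive relation that contains `role` (written R^+)."""
    closure = set(role)
    while True:
        # Chain every pair (x, y) with every pair (y, z) already known.
        new_pairs = {(x, z)
                     for (x, y) in closure
                     for (y2, z) in closure
                     if y == y2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


# Inverse role: hasParent is the inverse of hasChild (Mary is an assumed individual).
has_child = {("Mary", "Bob")}
has_parent = inverse(has_child)

# Transitive role: the slide's hasBrother example.
has_brother = transitive_closure({("Bob", "David"), ("David", "Mack")})

# Role hierarchy hasMother ⊑ hasParent: the sub-role's pairs must all be in the super-role.
has_mother = {("Bob", "Mary")}
hierarchy_holds = has_mother <= has_parent

print(has_parent, has_brother, hierarchy_holds)
```

A reasoner does not enumerate pairs like this in general; the sketch only shows what each operator means on a given finite interpretation.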
p-21 - 09/01/2009
The DL Family (Notation)
• A given DL is defined by its set of concept- and role-forming operators
• Smallest propositionally closed DL is ALC (equivalent to modal K(m))
  – Concepts constructed using ⊓, ⊔, ¬, ∃ and ∀
• S often used for ALC with transitive roles (R+)
• Additional letters indicate other extensions, e.g.:
  – H for role inclusion axioms (role hierarchy)
  – O for nominals (singleton classes, written {x})
  – I for inverse roles
  – N for number restrictions (of the form ≤nR, ≥nR)
  – Q for qualified number restrictions (of the form ≤nR.C, ≥nR.C)
• E.g., ALC + R+ + role hierarchy + inverse roles + QNR = SHIQ
• Have been extended in many directions
  – Concrete domains, epistemic, n-ary, fuzzy, …
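The naming convention can be written down as a small function. This is only a sketch of the letters listed on this slide; the function name `dl_name` and its parameters are hypothetical, and other letters used in the literature are not covered.

```python
def dl_name(transitive_roles=False, role_hierarchy=False, nominals=False,
            inverse_roles=False, number_restrictions=None):
    """Compose a DL name from the features listed on the slide.

    number_restrictions is None, "unqualified" (letter N) or "qualified" (letter Q).
    """
    # ALC with transitive roles is abbreviated to S.
    name = "S" if transitive_roles else "ALC"
    if role_hierarchy:
        name += "H"
    if nominals:
        name += "O"
    if inverse_roles:
        name += "I"
    if number_restrictions == "unqualified":
        name += "N"
    elif number_restrictions == "qualified":
        name += "Q"
    return name


# The slide's example: ALC + R+ + role hierarchy + inverse roles + QNR.
print(dl_name(transitive_roles=True, role_hierarchy=True,
              inverse_roles=True, number_restrictions="qualified"))
```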
p-22 - 09/01/2009
The DL Family (a small part)
p-23 - 09/01/2009
Concept Description
• Representing concepts
  – Atomic concepts and roles
  – Concept constructors
  – Example: the class of mothers, i.e. female persons having at least one child who is also a person:
    Mother ≡ Person ⊓ Female ⊓ ∃hasChild.Person
• A terminology (or Tbox) = {concept definitions}
• A description logic = {constructors}
• Semantics:
  – Notion of interpretation taken from model theory
  – Concept = set of {individuals} of the interpretation domain
Complex concept descriptions (annotated example): in the definition above, Person and Female are atomic concepts, hasChild is a role, ⊓ and ∃ are constructors, the right-hand side is a concept description, Mother is the defined concept, and the whole equivalence is a concept definition.
p-24 - 09/01/2009
DL Semantics
• Semantics defined by interpretations
• An interpretation I = (Δ^I, ·^I), where
  – Δ^I is the domain (a non-empty set)
  – ·^I is an interpretation function that maps:
    • Concept (class) name A → subset A^I of Δ^I
    • Role (property) name R → binary relation R^I over Δ^I
    • Individual name i → element i^I of Δ^I
p-25 - 09/01/2009
DL Semantics (2)
• Interpretation function ·^I extends to concept (and role) expressions in the obvious way, e.g.:
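A sketch of this extension in LaTeX, assuming the usual textbook semantics for the ALC constructors and for qualified number restrictions; C, D, R and n are generic placeholders, and # denotes set cardinality.

```latex
\begin{align*}
(C \sqcap D)^{\mathcal{I}} &= C^{\mathcal{I}} \cap D^{\mathcal{I}} \\
(C \sqcup D)^{\mathcal{I}} &= C^{\mathcal{I}} \cup D^{\mathcal{I}} \\
(\neg C)^{\mathcal{I}} &= \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}} \\
(\exists R.C)^{\mathcal{I}} &= \{x \mid \exists y.\ (x,y) \in R^{\mathcal{I}} \wedge y \in C^{\mathcal{I}}\} \\
(\forall R.C)^{\mathcal{I}} &= \{x \mid \forall y.\ (x,y) \in R^{\mathcal{I}} \rightarrow y \in C^{\mathcal{I}}\} \\
(\leq n\, R.C)^{\mathcal{I}} &= \{x \mid \#\{y \mid (x,y) \in R^{\mathcal{I}} \wedge y \in C^{\mathcal{I}}\} \leq n\} \\
(\geq n\, R.C)^{\mathcal{I}} &= \{x \mid \#\{y \mid (x,y) \in R^{\mathcal{I}} \wedge y \in C^{\mathcal{I}}\} \geq n\}
\end{align*}
```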
p-26 - 09/01/2009
Interpretation Example (homework)
Δ = {v, w, x, y, z}
A^I = {v, w, x}
B^I = {x, y}
R^I = {(v, w), (v, x), (y, x), (x, z)}
• ¬B =
• A ⊓ B =
• ¬A ⊔ B =
• ∃R.B =
• ∀R.B =
• ∃R.(∃R.A) =
• ∃R.¬(A ⊔ B) =
• ≤1 R.A =
• ≥1 R.A =
[Figure: diagram of the elements v, w, x, y, z showing the sets A^I and B^I]
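The exercise can be checked mechanically. Below is a minimal sketch of an evaluator that computes the extension of a concept expression in a finite interpretation, assuming the standard semantics of each constructor. The tuple encoding of concepts (`"atom"`, `"not"`, `"and"`, `"or"`, `"some"`, `"all"`, `"atmost"`, `"atleast"`) and the helper names are assumptions made for this illustration.

```python
def successors(role, d):
    """All y such that (d, y) is in the role extension."""
    return {y for (x, y) in role if x == d}


def ext(c, domain, concepts, roles):
    """Extension of concept expression `c` as a set of domain elements."""
    op = c[0]
    if op == "atom":
        return set(concepts[c[1]])
    if op == "not":
        return domain - ext(c[1], domain, concepts, roles)
    if op in ("and", "or"):
        left = ext(c[1], domain, concepts, roles)
        right = ext(c[2], domain, concepts, roles)
        return left & right if op == "and" else left | right
    if op in ("some", "all"):
        role, filler = roles[c[1]], ext(c[2], domain, concepts, roles)
        if op == "some":   # at least one R-successor in the filler
            return {d for d in domain if successors(role, d) & filler}
        return {d for d in domain if successors(role, d) <= filler}
    if op in ("atmost", "atleast"):
        n, role, filler = c[1], roles[c[2]], ext(c[3], domain, concepts, roles)
        counts = {d: len(successors(role, d) & filler) for d in domain}
        if op == "atmost":
            return {d for d in domain if counts[d] <= n}
        return {d for d in domain if counts[d] >= n}
    raise ValueError(f"unknown constructor: {op}")


# The interpretation from the slide.
DOMAIN = {"v", "w", "x", "y", "z"}
CONCEPTS = {"A": {"v", "w", "x"}, "B": {"x", "y"}}
ROLES = {"R": {("v", "w"), ("v", "x"), ("y", "x"), ("x", "z")}}
A, B = ("atom", "A"), ("atom", "B")


def show(c):
    """Sorted extension of `c` in the slide's interpretation."""
    return sorted(ext(c, DOMAIN, CONCEPTS, ROLES))


print(show(("some", "R", B)))   # ∃R.B
print(show(("all", "R", B)))    # ∀R.B
```

Note that ∀R.B also contains every element without any R-successor, which is easy to overlook when solving the exercise by hand.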
p-27 - 09/01/2009
Knowledge Base (KB): Architecture, Syntax, Semantics
• Given a knowledge representation language L, a knowledge base K in L is defined as K = ⟨T, A⟩
  – T (Tbox) is a set of definitions and axioms (in L):
    • C ⊑ D (concept inclusion)
    • C ≡ D (concept equivalence)
    • R ⊑ S (role inclusion)
    • … + other constructors depending on L
  – A (Abox) is a set of assertions (in L):
    • x ∈ D (concept instantiation)
    • ⟨x, y⟩ ∈ R (role instantiation)
• The semantics is given by an interpretation I = (Δ^I, ·^I)
  – Δ^I is the domain (non-empty)
  – ·^I is an interpretation function that maps:
    • Concept (class) name A → subset A^I of Δ^I
    • Role (property) name R → binary relation R^I over Δ^I
    • Individual (instance) name i → element i^I of Δ^I
[Figure: a Knowledge Base connected to an Inference System]
  Tbox (schema):
    Man ≡ Human ⊓ Male
    Happy-Father ≡ Man ⊓ ∃has-child.Female ⊓ …
  Abox (data):
    John : Happy-Father
    ⟨John, Mary⟩ : has-child
p-28 - 09/01/2009
Knowledge Base Semantics
• An interpretation I satisfies (models) an axiom A (I ⊨ A):
  – I ⊨ C ⊑ D iff C^I ⊆ D^I
  – I ⊨ C ≡ D iff C^I = D^I
  – I ⊨ R ⊑ S iff R^I ⊆ S^I
  – I ⊨ R ≡ S iff R^I = S^I
  – I ⊨ R+ ⊑ R iff (R^I)+ ⊆ R^I
  – I ⊨ x ∈ D iff x^I ∈ D^I
  – I ⊨ ⟨x, y⟩ ∈ R iff (x^I, y^I) ∈ R^I
• I satisfies a Tbox T (I ⊨ T) iff I satisfies every axiom in T
• I satisfies an Abox A (I ⊨ A) iff I satisfies every axiom in A
• I satisfies a KB K (I ⊨ K) iff I satisfies both T and A
p-29 - 09/01/2009
Multiple Models -v- Single Model
• A DL KB doesn't define a single model; it is a set of constraints that define a set of possible models
  – No constraints (empty KB) means any model is possible
  – More constraints means fewer models
  – Too many constraints may mean no possible model (inconsistent KB)
• In contrast, DBs (and frame/rule KR systems) make assumptions such that the DB/KB defines a single model
  – Unique name assumption
    • Different names always interpreted as different individuals
  – Closed world assumption
    • Domain consists only of individuals named in the DB/KB
  – Minimal models
    • Extensions are as small as possible
p-30 - 09/01/2009
Example of Multiple Models
KB = {}
KB = {a:C, b:D, c:C, d:E}
KB = {a:C, b:D, c:C, d:E, b:C}
KB = {a:C, b:D, c:C, d:E, b:C, D ⊑ C}
KB = {a:C, b:D, c:C, d:E, b:C, D ⊑ C, E ⊑ C}
KB = {a:C, b:D, c:C, d:E, b:C, D ⊑ C, E ⊑ C, d:¬C}
I1:
Δ = {v, w, x, y, z}
C^I = {v, w, y}
D^I = {x, y}   E^I = {z}
a^I = v   b^I = x
c^I = w   d^I = y
I2:
Δ = {v, w, x, y, z}
C^I = {v, w, y}
D^I = {x, y}   E^I = {z}
a^I = v   b^I = x
c^I = w   d^I = z
I3:
Δ = {v, w, x, y, z}
C^I = {v, w, y}
D^I = {x, y}   E^I = {z}
a^I = v   b^I = y
c^I = w   d^I = z
I4:
Δ = {v, w, x, y, z}
C^I = {v, w, x, y}
D^I = {x, y}   E^I = {z}
a^I = v   b^I = x
c^I = y   d^I = y
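Which of the four interpretations is a model of which KB can be checked with the satisfaction conditions for instance assertions and concept inclusions. The sketch below assumes that the six KBs are cumulative (each adds one axiom to the previous one) and uses a hypothetical encoding of axioms as triples; the interpretation values are taken as transcribed above.

```python
def satisfies(interp, axiom):
    """Check one axiom against an interpretation (atomic concepts only)."""
    kind, left, right = axiom
    if kind == "instance":        # left : right
        return interp["ind"][left] in interp["con"][right]
    if kind == "not_instance":    # left : ¬right
        return interp["ind"][left] not in interp["con"][right]
    if kind == "subclass":        # left ⊑ right
        return interp["con"][left] <= interp["con"][right]
    raise ValueError(f"unknown axiom kind: {kind}")


def is_model(interp, kb):
    """An interpretation models a KB iff it satisfies every axiom in it."""
    return all(satisfies(interp, axiom) for axiom in kb)


I1 = {"con": {"C": {"v", "w", "y"}, "D": {"x", "y"}, "E": {"z"}},
      "ind": {"a": "v", "b": "x", "c": "w", "d": "y"}}
I2 = {"con": {"C": {"v", "w", "y"}, "D": {"x", "y"}, "E": {"z"}},
      "ind": {"a": "v", "b": "x", "c": "w", "d": "z"}}
I3 = {"con": {"C": {"v", "w", "y"}, "D": {"x", "y"}, "E": {"z"}},
      "ind": {"a": "v", "b": "y", "c": "w", "d": "z"}}
I4 = {"con": {"C": {"v", "w", "x", "y"}, "D": {"x", "y"}, "E": {"z"}},
      "ind": {"a": "v", "b": "x", "c": "y", "d": "y"}}

BASE = [("instance", "a", "C"), ("instance", "b", "D"),
        ("instance", "c", "C"), ("instance", "d", "E")]
EXTRA = [("instance", "b", "C"), ("subclass", "D", "C"),
         ("subclass", "E", "C"), ("not_instance", "d", "C")]
# The empty KB, then BASE extended by one more axiom at a time.
KBS = [[]] + [BASE + EXTRA[:k] for k in range(len(EXTRA) + 1)]

for name, interp in [("I1", I1), ("I2", I2), ("I3", I3), ("I4", I4)]:
    print(name, [is_model(interp, kb) for kb in KBS])
```

With the values as transcribed, I1 and I4 satisfy only the empty KB, because d^I = y is not in E^I = {z}; I2 also satisfies the second KB and I3 the first three. The last KB has no model at all: d:E and E ⊑ C force d into C, which contradicts d:¬C.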
p-31 - 09/01/2009
Example of Single Model (homework)
KB = {}
KB = {a:C, b:D, c:C, d:E}
KB = {a:C, b:D, c:C, d:E, b:C}
KB = {a:C, b:D, c:C, d:E, b:C, E ⊑ C}
I:
Δ = {}
I:
Δ = {a, b, c, d}
C^I = {a, b, c}
D^I = {b}   E^I = {d}
a^I = a   b^I = b
c^I = c   d^I = d
I:
Δ = {a, b, c, d}
C^I = {a, c}
D^I = {b}   E^I = {d}
a^I = a   b^I = b
c^I = c   d^I = d
I:
Δ = {a, b, c, d}
C^I = {a, b, c, d}
D^I = {b}   E^I = {d}
a^I = a   b^I = b
c^I = c   d^I = d
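Under the unique name, closed world and minimal model assumptions, the single model can be computed directly from the KB. The following is a minimal sketch for KBs made of instance assertions and inclusions between atomic concepts only; the function name `minimal_model` and the pair-based encoding are assumptions for illustration.

```python
def minimal_model(assertions, inclusions):
    """Compute the single model of a database-style KB.

    assertions: set of (individual, concept) pairs, read as individual : concept
    inclusions: set of (sub, sup) pairs, read as sub ⊑ sup (atomic concepts only)
    """
    # Closed world: the domain contains only named individuals.
    # Unique names: every individual name denotes itself.
    domain = {individual for individual, _ in assertions}
    extensions = {}
    for individual, concept in assertions:
        extensions.setdefault(concept, set()).add(individual)
    # Minimal model: add only what the inclusions force, until nothing changes.
    changed = True
    while changed:
        changed = False
        for sub, sup in inclusions:
            missing = extensions.get(sub, set()) - extensions.get(sup, set())
            if missing:
                extensions.setdefault(sup, set()).update(missing)
                changed = True
    return domain, extensions


base = {("a", "C"), ("b", "D"), ("c", "C"), ("d", "E")}
domain, extensions = minimal_model(base | {("b", "C")}, {("E", "C")})
print(sorted(domain), {name: sorted(ext) for name, ext in extensions.items()})
```

Running the function on each of the four KBs in turn yields the four interpretations listed on the slide, which gives a way to check the matching asked for in the exercise.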
p-32 - 09/01/2009
Short History of Description Logics
Phase 1: (early 80's) mostly system development
  – Incomplete systems (KL-ONE, Back, Classic, Loom, ...)
  – Based on structural algorithms
Phase 2: (mid-80's) first formal investigation
  – Development of tableau algorithms and complexity results
  – Tableau-based systems for Pspace logics (e.g., Kris, Crack)
  – Investigation of optimisation techniques
Phase 3: (90's) tableau algorithms and thorough complexity analysis
  – Tableau algorithms for very expressive DLs
  – Highly optimised tableau systems for ExpTime logics (e.g., FaCT, DLP, Racer)
  – Relationship to modal logic and decidable fragments of FOL
p-33 - 09/01/2009
Recent Developments
Phase 4: (2010's)
  – Mainstream applications and tools
    • Databases
      – Consistency of conceptual schemata (EER, UML etc.)
      – Reasoning with ontology-based annotations (data)
  – Mature implementations
    • Research implementations
      – FaCT, FaCT++, Racer, Pellet, …
    • Commercial implementations
      – Cerebra system from Network Inference (and now Racer)
p-34 - 09/01/2009
Description Logic Reasoning
p-35 - 09/01/2009
Practical Reasons
• Given the key role of ontologies in e-Science and the Semantic Web, it is essential to provide tools and services to help users:
  – Design and maintain high quality ontologies, e.g.:
    • Meaningful: all named classes can have instances
• Optimised classification (compute partial ordering)
  – Enhanced traversal (exploits information from previous tests)
  – Use structural information to select classification order
• Optimised subsumption testing (search for models)
  – Normalisation and simplification of concepts
  – Absorption (simplification) of axioms
  – Dependency directed backtracking
  – Caching of satisfiability results and (partial) models
  – Heuristic ordering of propositional and modal expansion
p-39 - 09/01/2009
KB Inferences
• Subsumption:
  – C ⊑_T D iff C^I ⊆ D^I for every I ⊨ T
  – Structures the knowledge, computes the terminological graph
• Consistency (satisfiability):
  – Concept C: iff there is an I ⊨ T with C^I ≠ ∅
  – ABox A: iff there is an I that satisfies all assertions of the ABox
  – KB K = ⟨T, A⟩: iff there is an I that satisfies all axioms and assertions of the TBox and ABox
• Also: equivalence C ≡_T D, instance checking a : C
• The inference problems are related:
  – C ⊑_T D iff C ⊓ ¬D is unsatisfiable w.r.t. T
  – C is consistent w.r.t. T iff C is not subsumed by A ⊓ ¬A w.r.t. T
  – Standard inferences reduce to a satisfiability test
• Non-standard inferences (studied systematically since ~2000)
• Knowledge is correct (captures intuitions)
  – Is C subsumed by D w.r.t. ontology O? (C^I ⊆ D^I in every model I of O)
• Knowledge is minimally redundant (no unintended synonyms)
  – Is C equivalent to D w.r.t. O? (C^I = D^I in every model I of O)
• Knowledge is meaningful (classes can have instances)
  – Is C satisfiable w.r.t. O? (C^I ≠ ∅ in some model I of O)
• Querying knowledge
  – Is x an instance of C w.r.t. O? (x^I ∈ C^I in every model I of O)
  – Is ⟨x, y⟩ an instance of R w.r.t. O? ((x^I, y^I) ∈ R^I in every model I of O)
• The above problems can be solved using highly optimised DL reasoners
p-41 - 09/01/2009
DL Reasoning: Basics
p-42 - 09/01/2009
Tableau Algorithm (1)
• The tableau algorithm is the de facto standard reasoning algorithm used in DL
• Basic intuitions
  – Reduces a reasoning problem to a concept satisfiability problem
  – Finds an interpretation that satisfies the concepts in question
  – The interpretation is incrementally constructed as a “tableau”
• Tableau algorithms are decision procedures for concept satisfiability (and subsumption, also w.r.t. an ontology), i.e., the algorithm returns “SAT” iff the input concept is satisfiable
p-43 - 09/01/2009
Tableau Algorithm (2) (basic case)
• Given: Wife ⊑ Woman, Woman ⊑ Person
  Question: does Wife ⊑ Person hold?
• Reasoning process
  – Test whether there can be an individual that is a Wife but not a Person, i.e. test the satisfiability of the concept C0 = (Wife ⊓ ¬Person)
  – C0(x) → Wife(x), (¬Person)(x)
  – Wife(x) → Woman(x)
  – Woman(x) → Person(x)
  – Conflict!
  – C0 is unsatisfiable, therefore Wife ⊑ Person holds w.r.t. the given ontology.
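The reasoning steps of this basic case can be sketched in a few lines. This is not a general tableau procedure: it assumes a TBox made only of inclusions between atomic concept names, and the function name `subsumed_by` is hypothetical.

```python
def subsumed_by(sub, sup, inclusions):
    """Decide sub ⊑ sup by testing sub ⊓ ¬sup for a clash.

    inclusions: list of (lhs, rhs) pairs, read as lhs ⊑ rhs, atomic names only.
    """
    # Start with a test individual x asserted to be an instance of `sub`
    # (x is also asserted to be an instance of ¬sup).
    labels = {sub}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in inclusions:
            # lhs(x) together with lhs ⊑ rhs gives rhs(x).
            if lhs in labels and rhs not in labels:
                labels.add(rhs)
                changed = True
    # Clash iff sup(x) was derived while ¬sup(x) is asserted:
    # then sub ⊓ ¬sup is unsatisfiable and the subsumption holds.
    return sup in labels


TBOX = [("Wife", "Woman"), ("Woman", "Person")]
print(subsumed_by("Wife", "Person", TBOX))
print(subsumed_by("Person", "Wife", TBOX))
```

In the second call no clash is derived, so Person ⊓ ¬Wife is satisfiable and Person is not subsumed by Wife.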
p-44 - 09/01/2009
Tableau Algorithm (3) (General Process)
• Transform C into negation normal form (NNF), i.e. negation occurs only in front of concept names.
• Denote the transformed expression as C0. The algorithm starts with an ABox A0 = {C0(x0)} and applies consistency-preserving transformation rules (tableau expansion) to the ABox as far as possible.
• If a complete, clash-free ABox is found, C0 is satisfiable.
• If no such ABox is found on any search path, C0 is unsatisfiable.
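The general process can be sketched for plain ALC concepts without a TBox. The code below is an unoptimised illustration, not the algorithm of any particular reasoner: concepts are encoded as nested tuples (an assumed encoding), and each individual is represented only by the set of concepts it must satisfy.

```python
def nnf(c):
    """Push negation inwards until it only occurs in front of concept names."""
    op = c[0]
    if op == "atom":
        return c
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    if op in ("some", "all"):
        return (op, c[1], nnf(c[2]))
    inner = c[1]                      # here op == "not"
    if inner[0] == "atom":
        return c
    if inner[0] == "not":             # double negation
        return nnf(inner[1])
    if inner[0] in ("and", "or"):     # De Morgan
        dual = "or" if inner[0] == "and" else "and"
        return (dual, nnf(("not", inner[1])), nnf(("not", inner[2])))
    dual = "all" if inner[0] == "some" else "some"
    return (dual, inner[1], nnf(("not", inner[2])))


def satisfiable(label):
    """label: set of NNF concepts that a single individual must satisfy."""
    label = set(label)
    todo = list(label)
    while todo:                       # ⊓-rule: add both conjuncts
        c = todo.pop()
        if c[0] == "and":
            for part in (c[1], c[2]):
                if part not in label:
                    label.add(part)
                    todo.append(part)
    for c in label:                   # clash: A and ¬A for the same individual
        if c[0] == "not" and c[1] in label:
            return False
    for c in label:                   # ⊔-rule: try each disjunct in turn
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            return satisfiable(label | {c[1]}) or satisfiable(label | {c[2]})
    for c in label:                   # ∃-rule, with the ∀-rule applied to the new successor
        if c[0] == "some":
            successor = {c[2]} | {d[2] for d in label
                                  if d[0] == "all" and d[1] == c[1]}
            if not satisfiable(successor):
                return False
    return True


def is_subsumed_by(c, d):
    """c ⊑ d iff c ⊓ ¬d is unsatisfiable."""
    return not satisfiable({nnf(("and", c, ("not", d)))})


A, B = ("atom", "A"), ("atom", "B")
print(is_subsumed_by(("some", "R", ("and", A, B)), ("some", "R", A)))
print(is_subsumed_by(("some", "R", A), ("some", "R", ("and", A, B))))
```

Handling a TBox, number restrictions or inverse roles requires further rules and a blocking condition to guarantee termination, which this sketch leaves out.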
p-45 - 09/01/2009
Tableau Algorithm (4) (Example 2)
From Naouel Karam, ISIMA, Tutorial 2005, Potsdam
p-46 - 09/01/2009
Current Research (1)
• Extending Description Logics
  – Existing DL systems implement (at most) SHIQ
  – OWL extends SHIQ with datatypes and nominals (SHOIN(Dn))