CSE-291: Ontologies in Data Integration
Department of Computer Science & Engineering
University of California, San Diego
Spring 2003
Ontologies in Action
Amarnath Gupta
[email protected]
Overview
• Information Integration
  – Querying with Ontologies
  – Registering Into Ontologies
• Ontologies of Processes
  – An Application Scenario
  – A Disease Map
• A look at a theory
Ontologies in Information Integration
• Why is information integration with ontologies different from “regular” information integration?
• Regular Information Integration
  – Assume relational sources S1, S2
  – A query against the view
    • Ans(Brain_region, Protein) if V1(Brain_region, V, Protein, D) ∧ D > 0 ∧ V < 0.25
Information Integration under GAV
• Ans(Brain_region, Protein) if V1(Brain_region, V, Protein, D) ∧ D > 0 ∧ V < 0.25
• Ans(Brain_region, Protein) if R1(_, Brain_region, V) ∧ R2(“human”, Brain_region, Protein, D) ∧ D > 0 ∧ V < 0.25
• Ans(Brain_region1, Protein) if R1(_, Brain_region1, V) ∧ R2(“human”, Brain_region2, Protein, D) ∧ Brain_region1 = Brain_region2 ∧ D > 0 ∧ V < 0.25
• Ans(Brain_region1, Protein) if R1(_, Brain_region1, V) ∧ V < 0.25 ∧ R2(“human”, Brain_region2, Protein, D) ∧ D > 0 ∧ Brain_region1 = Brain_region2
• Ans(Brain_region1, Protein) if [R1(_, Brain_region1, V) ∧ V < 0.25] @S1 ∧ [R2(“human”, Brain_region2, Protein, D) ∧ D > 0] @S2 ∧ [Brain_region1 = Brain_region2] @mediator
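The final rewriting above amounts to a small query plan: one selection pushed to each source and an equi-join at the mediator. The following is a minimal Python sketch of that plan, not the mediator's actual implementation. The tuples in `R1` and `R2` and all function names are hypothetical and serve only to make the plan executable.

```python
# Hypothetical sample data; the tuples are illustrative only.
# R1(patientID, brain_region, brain_vol) is exported by source S1.
R1 = [
    ("p1", "cerebellum", 0.20),
    ("p2", "hippocampus", 0.40),
    ("p3", "thalamus", 0.10),
]
# R2(species, brain_region, protein, density) is exported by source S2.
R2 = [
    ("human", "cerebellum", "calbindin", 3.5),
    ("human", "hippocampus", "neuN", 2.0),
    ("mouse", "thalamus", "neuN", 1.0),
    ("human", "thalamus", "GFAP", 0.0),
]


def select_at_s1(r1, max_vol=0.25):
    """Subquery pushed to S1: R1(_, Brain_region1, V), V < 0.25."""
    return {region for (_pid, region, vol) in r1 if vol < max_vol}


def select_at_s2(r2, species="human"):
    """Subquery pushed to S2: R2("human", Brain_region2, Protein, D), D > 0."""
    return {(region, protein) for (sp, region, protein, d) in r2
            if sp == species and d > 0}


def join_at_mediator(regions, region_proteins):
    """Join on Brain_region1 = Brain_region2, evaluated at the mediator."""
    return sorted((r, p) for (r, p) in region_proteins if r in regions)


answers = join_at_mediator(select_at_s1(R1), select_at_s2(R2))
print(answers)
```

Pushing the selections to the sources keeps the intermediate results small; only the join condition needs data from both sources, so it stays at the mediator.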
[Figure: ontology graph of the cerebellum. Node labels: brain, brain stem, cerebellum, cerebellar peduncle (Sup. CP, Mid. CP, Inf. CP), fiber bundle, axon, dendrite, cell body, neuron, compartment, vermis, cortex, folia, Purkinje cell, granule cell, medullary center, flocculonodular lobe, corpus cerebelli, flocculus, posterolateral fissure, primary fissure, l. cerebellar hemisphere, r. cerebellar hemisphere, paravermal zone, anterior lobe, posterior lobe, deep cerebellar nuclei, molecular layer, Purkinje cell layer, granular layer, dentate nucleus, inf. olive nucleus, globose nucleus, interposed nucleus, fastigial nucleus. Edge labels shown: receives_afferent_from, attaches(cp, cerebellum, bstem).]
Effect of an Ontology in GAV Integration
• Ontologies provide
  – relations (subclass, part-of, …) over terms, and axioms about those relations
    • Part-of can be of different kinds
      – member-collection (axons are part of a fiber bundle)
      – component-object (compartments like the axon are components of a neuron)
      – portion-mass (the myelin sheaths around axons constitute the white matter of the brain)
      – stuff-object (cytosol is the constituent part of cytoplasm)
      – phase-activity (metastasis is a phase of cancer)
      – place-area (Manhattan is a place in New York)
      – feature-event
    • For each flavor of part-of there is a transitive relation part-of-tr within that flavor, but not necessarily across flavors
      – An arm is part of a musician, and a musician is part of an orchestra, BUT an arm is *not* part of an orchestra!
  – constraints in the form of logic statements
    • Intensional (derived) relations:
      – inside(a,b) if part_of(mc)(a,b) ∨ part_of(co)(a,b) ∨ part_of(pm)(a,b) ∨ spatially_in(a,b)
    • Integrity constraints
      – The protein “neuN” is not expressed in Purkinje cells
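The point about flavor-wise transitivity can be made concrete with a short Python sketch. It assumes that arm/musician is a component-object fact and musician/orchestra a member-collection fact (the slides do not state the flavors), and it adds one hypothetical fact, finger/arm, to show transitivity within a flavor. All function names are illustrative.

```python
# Flavor-tagged part-of facts. The flavor assignments are assumptions,
# and ("finger", "arm") is a hypothetical extra fact.
PART_OF = {
    "component-object": {("finger", "arm"), ("arm", "musician")},
    "member-collection": {("musician", "orchestra"), ("axon", "fiber bundle")},
}


def transitive_closure(pairs):
    """Naive fixpoint computation of the transitive closure of a relation."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def part_of_tr(flavor):
    """part-of-tr restricted to a single flavor of part-of."""
    return transitive_closure(PART_OF.get(flavor, set()))


# Closing each flavor separately never derives arm part-of orchestra.
per_flavor = set().union(*(part_of_tr(f) for f in PART_OF))

# Mixing all flavors before closing derives the unwanted fact.
naive = transitive_closure(set().union(*PART_OF.values()))

print(("finger", "musician") in part_of_tr("component-object"))
print(("arm", "orchestra") in per_flavor)
print(("arm", "orchestra") in naive)
```

Keeping one closure per flavor is what lets a derived relation such as inside be defined as a union of selected flavors without inheriting inferences that cross flavors.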
Effect of an Ontology in GAV Integration (continued)
• Consider the same case
  – S1 exports relation R1(patientID, brain_region, brain_vol)
  – S2 exports relation R2(species, brain_region, protein, density)
  – Ontology source Ont exports all relations and constraints shown before
  – Define an “integrated view”
• How would you compute a residue? How complex/feasible is this computation?
  – But more importantly
    • How do you control evaluation of a recursive predicate in Ont by supplying integrity constraints from the mediator or a data source?
      » By invoking general recursion control mechanisms?
  – OPEN RESEARCH PROBLEM
The Registration Problem
• Suppose a semantic mediator system already exists with n sources
• A new source Sn+1 wants to join the mediator such that
  – The mediator can simply “read in” the source’s model without any disruptions
  – All existing integrated views can make “best effort” use of the new source seamlessly
• Problems:
  – What does the source need to declare about itself to the mediator?
  – How does the mediator use this information to assimilate the new source?
Source Description
• Conceptual Model
  – Local Ontology (ONT) – the terminological vocabulary used by the schema
    • Properties of relationships in the ontology
  – Object Model (OM) – the export schema
    • Ontological Grounding (ONTG) – relationship between export schema and local ontology
  – Contextualization (CON) – relationship of OM and ONT with the mediator’s knowledge base ONT(M)
• CSL: a language to express CON formulae
Examples shown on the slide:
  dom(Image.Struct) in tc_has(cytoplasm)
  Structure.Name stores Protein
  tc_has(X) = trans_closure(has(X))
[Figure: example source description. Legend: Objects, Associations, Functions, Local Ontology, Property of Local Ontology, Ontological Grounding. Labels shown: has (on several edges), stores, substance.]
Roles of Ontological Grounding
• Semantic Constraints on Attribute Domains
  – Image.Struct has to be below Cytoplasm
• Refinement of local Ontology
  – Cytoplasm stores substances, but instances of the exported object called Structure store only proteins
• Intensional Definitions
  – DENATURED_PROTEIN(ProtName) IF DEPOSIT(ID, ProtName, protein, dark, _), deposit_in_structure(ID) NULL;
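The domain constraint "Image.Struct has to be below Cytoplasm" can be read as membership in the transitive closure of has starting at cytoplasm. Below is a minimal Python sketch of that check. Only the concept name cytoplasm comes from the slides; the has edges in `HAS` and the function names are hypothetical.

```python
# Hypothetical "has" edges of a local ontology (parent -> children).
# Only "cytoplasm" is taken from the slides.
HAS = {
    "cell": ["cytoplasm", "nucleus"],
    "cytoplasm": ["endoplasmic reticulum", "mitochondrion"],
    "mitochondrion": ["crista"],
}


def tc_has(concept):
    """All concepts reachable from `concept` over one or more has edges."""
    reached, stack = set(), list(HAS.get(concept, []))
    while stack:
        node = stack.pop()
        if node not in reached:
            reached.add(node)
            stack.extend(HAS.get(node, []))
    return reached


def domain_violations(struct_values, root="cytoplasm"):
    """Values of Image.Struct that are not in tc_has(root)."""
    allowed = tc_has(root)
    return [v for v in struct_values if v not in allowed]


print(sorted(tc_has("cytoplasm")))
print(domain_violations(["crista", "nucleus"]))
```

A mediator could run such a check when a source registers, or use the allowed set to prune queries that can never match the source.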
Contextualization
• WHAT: Local schema elements are expressed as views over the mediator’s ontology
  – Recall: integrated views are still defined in a global-as-view (GAV) fashion
• WHY: The LAV (local-as-view) technique allows sources to join, while queries against GAV views do not require an inverse rule mapping
Context Specification Language
• Types of local schema elements
  – From Object Model: classes(S), attributes(S), associations(S), instances(S)
  – From Local Ontology: concepts(S), relationships(S)
  – From Both: constraints(S)
• Types of mediator’s schema elements
  – concepts(M), relationships(M), constraints(M)
• Context specification
  map (correspondence relation)(X1, …, Xn) IF type declarations, body
  – Correspondence relation: the name of the mapping
  – X1, …, Xn: the S elements and the M elements
  – Type declarations: types of the S and M elements
  – Body: the actual mapping definition
Context Specification Language (continued)
• map (subconcept)(cytoplasm, cell_compartment) IF cytoplasm:concepts(CCDB), cell_compartment:concepts(mediator)
  – Relates a concept of the local ontology (cytoplasm) to that of the mediator’s ontology (cell_compartment): cytoplasm is a cell compartment
• Consider a query at the mediator
  – “Which cell_compartments have associated images?”
  – The mapping will enable the mediator to ask the CCDB source “Which ‘isa descendants’ of ‘cytoplasm’ have associated images?”
  – Using ontological grounding, the source can translate this to a query against the Image class
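The query translation described above can be sketched in Python as two steps: the mediator follows the subconcept mapping to the source concept, and the source expands that concept to its isa descendants before querying its Image class. The isa edges, the image tuples and all function names below are hypothetical, and the sketch assumes a concept counts as its own descendant.

```python
# Registered mapping: map (subconcept)(cytoplasm, cell_compartment)
SUBCONCEPT_MAPS = {"cell_compartment": [("CCDB", "cytoplasm")]}

# Hypothetical isa edges in CCDB's local ontology (parent -> children).
CCDB_ISA_CHILDREN = {"cytoplasm": ["axoplasm", "sarcoplasm"]}

# Hypothetical instances of CCDB's Image class: (image id, Struct value).
CCDB_IMAGES = [("img1", "axoplasm"), ("img2", "nucleus"), ("img3", "cytoplasm")]


def isa_descendants(concept):
    """The concept itself plus everything below it in the isa hierarchy."""
    result, stack = {concept}, [concept]
    while stack:
        for child in CCDB_ISA_CHILDREN.get(stack.pop(), []):
            if child not in result:
                result.add(child)
                stack.append(child)
    return result


def ccdb_concepts_with_images(concept):
    """Source-side query against the Image class (ontological grounding)."""
    wanted = isa_descendants(concept)
    return sorted({struct for (_img, struct) in CCDB_IMAGES if struct in wanted})


def mediator_query(mediator_concept):
    """Mediator-side rewriting: follow each subconcept mapping to its source."""
    answers = {}
    for source, local_concept in SUBCONCEPT_MAPS.get(mediator_concept, []):
        answers[source] = ccdb_concepts_with_images(local_concept)
    return answers


print(mediator_query("cell_compartment"))
```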
Some Example Cases
• map (concept-concept)(regulates(nejire, CREB)) IF nejire:concepts(mediator), CREB:concepts(CCDB)
  – The mapping instantiates a relation (regulates) between the mediator’s concept nejire and CCDB’s concept CREB
  – Query enabled: “Find images with deposits of nejire-regulated proteins”
• map (concept-concept)(tc_regulates(nejire, CREB)) IF nejire:concepts(mediator), CREB:concepts(CCDB)
  – Query enabled: “Find images with deposits of proteins that are indirectly regulated by nejire”
  – The query will traverse the “regulates” edges in the mediator and the source to find all paths between nejire in the mediator and CREB in CCDB. The concepts in the path will then be used to answer the query.
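The path traversal behind tc_regulates can be illustrated with a small Python sketch that merges the regulates edges of the mediator and of the source and enumerates all simple paths from nejire to CREB. Only nejire and CREB come from the slides; the intermediate concepts (`factor_a`, `factor_b`, `factor_c`, `target_d`), the edge sets and the function names are hypothetical.

```python
# Hypothetical "regulates" edges; only nejire and CREB come from the slides.
MEDIATOR_EDGES = {"nejire": ["factor_a", "factor_b"]}
CCDB_EDGES = {
    "factor_a": ["CREB"],
    "factor_b": ["factor_c"],
    "factor_c": ["CREB"],
    "CREB": ["target_d"],
}


def merge(*graphs):
    """Union of adjacency lists from the mediator and the source."""
    combined = {}
    for graph in graphs:
        for node, successors in graph.items():
            combined.setdefault(node, []).extend(successors)
    return combined


def all_paths(graph, start, goal, path=None):
    """All simple paths from start to goal, found by depth-first search."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # skip nodes already on the path (avoids cycles)
            paths.extend(all_paths(graph, nxt, goal, path))
    return paths


graph = merge(MEDIATOR_EDGES, CCDB_EDGES)
paths = all_paths(graph, "nejire", "CREB")
concepts_on_paths = sorted({node for p in paths for node in p})
print(paths)
print(concepts_on_paths)
```

The concepts collected from the paths are the ones a mediator would then use when looking up images with matching protein deposits.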
Some Example Cases (continued)
• Relating edges
  map (assoc-rel)(surrounds(s1, s2), inverse(inside(s2, s1))) IF surrounds(s1, s2):assoc(CCDB), inside(s2, s1):relationships(mediator), not has_part(s1, s2)
  – The mediator’s ontology has a relationship “inside” and the source’s object model has an association called “surrounds”
  – They are almost inverses of each other
    • A surrounds B ⇔ B inside A, unless B part_of A
  – This brings out the conceptual difference between the source’s semantics of a relationship and the mediator’s semantics of the same relationship
  – The mapping will force the mediator to test the has_part condition before pushing a (rewritten) query to the CCDB source
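A minimal Python sketch of the has_part test follows. It derives the mediator's inside facts from the source's surrounds associations and skips every pair the mediator already knows as has_part. The facts in `CCDB_SURROUNDS` and `MEDIATOR_HAS_PART` are hypothetical, as are the function names.

```python
# Hypothetical surrounds(s1, s2) associations exported by CCDB.
CCDB_SURROUNDS = [("myelin sheath", "axon"), ("cell", "nucleus")]

# Hypothetical has_part(s1, s2) facts in the mediator's ontology.
MEDIATOR_HAS_PART = {("cell", "nucleus")}


def may_rewrite(s1, s2):
    """The has_part test the mediator performs before rewriting a query."""
    return (s1, s2) not in MEDIATOR_HAS_PART


def inside_from_surrounds():
    """inside(s2, s1) for each surrounds(s1, s2), unless has_part(s1, s2)."""
    return [(s2, s1) for (s1, s2) in CCDB_SURROUNDS if may_rewrite(s1, s2)]


print(inside_from_surrounds())
```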
Registration at Mediator
• The source sends the mediator its conceptual model, including the CSL mappings
• The mediator
  – Stores the description in a global registry
  – Updates ONT(M) with new relationships or rules about the relationships, duly tagged by the source name
  – Translates ontological groundings to executable rules
    • domain(STRUCTURE.volume) in [0,300] becomes
      false :– X:structure[volume → V], not (0 < V < 300)
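A denial of this kind behaves like a check that reports every object for which the rule body holds. The Python sketch below illustrates this on hypothetical structure objects. It treats the interval as closed, as the domain statement [0,300] reads, which is an assumption; the rule on the slide uses strict inequalities.

```python
# Hypothetical instances of the exported class "structure".
STRUCTURES = [
    {"name": "s1", "volume": 120.0},
    {"name": "s2", "volume": 450.0},
    {"name": "s3", "volume": -3.0},
]


def violates_volume_domain(structure, low=0, high=300):
    """Body of the denial: the volume V is not within [low, high].

    The closed interval is an assumption taken from the domain statement.
    """
    v = structure["volume"]
    return not (low <= v <= high)


# Every object listed here would make the rule head "false" derivable.
violations = [s["name"] for s in STRUCTURES if violates_volume_domain(s)]
print(violations)
```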
Registration at Mediator (continued)
  – Translates each CSL statement to two rules
    map (subrelation)(has(co); has_part) IF
    false :– CCDB.has(co)(X,Y), not has_part(X,Y) (denial)
    • The first rule is an IDB for has_part
    • The second rule is an integrity constraint
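The interplay of the two rules can be sketched in Python. The text of the first rule is not present in this transcript, so the sketch assumes the natural reading has_part(X,Y) :– CCDB.has(co)(X,Y); the second function implements the denial as written. All facts and function names are hypothetical.

```python
# Hypothetical has(co) facts exported by CCDB and has_part facts
# already stored in the mediator's ontology.
CCDB_HAS_CO = {("neuron", "axon"), ("neuron", "dendrite")}
MEDIATOR_HAS_PART = {("brain", "cerebellum")}


def derive_has_part(stored_has_part, ccdb_has_co):
    """Assumed first rule (IDB): has_part(X, Y) :- CCDB.has(co)(X, Y)."""
    return set(stored_has_part) | set(ccdb_has_co)


def denial_violations(has_part, ccdb_has_co):
    """Second rule: false :- CCDB.has(co)(X, Y), not has_part(X, Y)."""
    return sorted(pair for pair in ccdb_has_co if pair not in has_part)


before = denial_violations(MEDIATOR_HAS_PART, CCDB_HAS_CO)
after = denial_violations(derive_has_part(MEDIATOR_HAS_PART, CCDB_HAS_CO), CCDB_HAS_CO)
print(before)
print(after)
```

Under this reading the derivation rule makes the source's facts visible as has_part, and the denial flags any state in which a has(co) fact is not reflected in has_part.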
Ontologies of Processes
• What is a Process?
  – From Merriam-Webster: 2 a (1) : a natural phenomenon marked by gradual changes that lead toward a particular result <the process of growth> (2) : a natural continuing activity or function <such life processes as breathing> b : a series of actions or operations conducing to an end; especially : a continuous operation or treatment
• Revisiting the Central Theme of Formal Ontology
  – Given a logical language L ...
    • ... a conceptualization is a set of models of L which describes the admissible (intended) interpretations of its non-logical symbols (the vocabulary)
    • ... an ontology is a (possibly incomplete) axiomatization of a conceptualization.
  – Theory of formal distinctions among things and relations
  – Basic tools
    • Theory of parthood
    • Theory of integrity
    • Theory of identity
    • Theory of dependence
Disease Maps: “Designing” an Ontology
• On-going work (Gupta, Ludäscher, Martone, Grethe)
  – Goal: to characterize the processes, manifestations and outcomes of a specific disease (or family of diseases)
  – A node- and edge-labeled multigraph, where logical formulae can be constructed over subsets of edge labels to describe
    • Transitive relations
    • Temporal relations
    • Causal relations
    • …
  – Views
    • A subgraph that reflects the viewpoint of a specific discipline
  – Elaborations and Abstractions
    • A “zoom in” ability where a subgraph may be the detail of another smaller
[Figure: example process graph. Node labels: Triggering Event, shrink, mitochondria break down, release of cytochrome c, bleb development on surface, degradation of chromatin in nucleus.]
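The multigraph idea can be sketched in Python as a list of labeled edges, with a view obtained by filtering on a discipline tag and a transitive relation computed over a chosen subset of edge labels. The node names are taken from the figure; the edges, relation labels and discipline tags are hypothetical and make no claim about the actual disease map.

```python
# Each edge: (source, target, relation label, discipline). Node names come
# from the figure; the edges and tags are hypothetical.
EDGES = [
    ("triggering event", "mitochondria break down", "causes", "cell biology"),
    ("mitochondria break down", "release of cytochrome c", "causes", "biochemistry"),
    ("mitochondria break down", "release of cytochrome c", "precedes", "cell biology"),
    ("release of cytochrome c", "degradation of chromatin in nucleus", "precedes", "cell biology"),
]


def view(edges, discipline):
    """A view: the subgraph carrying the viewpoint of one discipline."""
    return [e for e in edges if e[3] == discipline]


def reachable(edges, start, relations):
    """Nodes reachable from start using only edges labeled in `relations`."""
    found, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for (src, dst, rel, _discipline) in edges:
            if src == node and rel in relations and dst not in found:
                found.add(dst)
                frontier.append(dst)
    return found


print(len(view(EDGES, "biochemistry")))
print(sorted(reachable(EDGES, "triggering event", {"causes"})))
```

Two parallel edges between the same pair of nodes, as in the second and third entries, are what make this a multigraph instead of a simple graph.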
An Intuitive Attempt to Formalize
• Let S0 be an initial situation
• Let occurs be a distinguished binary function symbol
  – occurs(e, s) denotes a successor situation to situation s resulting from event e
  – events may be parameterized
    • degrades(chromatin, nucleus) may mean that chromatin degrades in the nucleus
    • occurs(degrades(chromatin, nucleus), s) denotes the resultant situation occurring due to degradation of chromatin when the current situation is s
Neurotoxin ‘MPTP’ is converted to ‘MPP+’ by ‘MAOB’ in the synaptic cleft. The active form ‘MPP+’ is picked up by the dopamine transporter, and released inside the neuron, where it accumulates in mitochondria. This leads to complex I (an antioxidant) inhibition, which leads to free radical generation.
If there are E events and S situations, 2 × E × S frame axioms may be needed!
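The occurs function can be sketched in Python by treating a situation as the initial situation plus the chain of events that led to it. The event names below are hypothetical encodings of the MPTP narrative, and `frame_axiom_bound` simply evaluates the 2 × E × S expression from the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    name: str
    args: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Situation:
    event: Optional[Event] = None          # None marks the initial situation
    previous: Optional["Situation"] = None


S0 = Situation()


def occurs(event, s):
    """The successor situation that results when `event` happens in `s`."""
    return Situation(event, s)


def history(s):
    """Event names leading from S0 to situation s, oldest first."""
    names = []
    while s.event is not None:
        names.append(s.event.name)
        s = s.previous
    return list(reversed(names))


def frame_axiom_bound(num_events, num_situations):
    """The 2 x E x S count quoted on the slide."""
    return 2 * num_events * num_situations


# Hypothetical event names encoding the MPTP narrative.
narrative = [
    Event("converts", ("MAOB", "MPTP", "MPP+")),
    Event("picks_up", ("dopamine transporter", "MPP+")),
    Event("accumulates_in", ("MPP+", "mitochondria")),
    Event("inhibits", ("complex I",)),
    Event("generates", ("free radicals",)),
]
s = S0
for e in narrative:
    s = occurs(e, s)
print(history(s))
```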
Toward a Conclusion
• Solutions for the Frame Problem
  – Causal Completeness Assumption
    • We know all preconditions under which an event causes a fluent to change values to a successor state
  – Explanation Closure Assumption
    • We know all events that may cause a fluent to change its value
  – Unique Name Assumption
    • Identical events have identical attributes
  – Then, the number of axioms can be reduced to the order of E+F (events plus fluents), provided
    • conditional, iterative, recursive and nondeterministic events do not occur
• For a multi-theory Ontology like a disease map
  – We need much more than a description logic and a situation calculus
References
1. D. Leivant, “Higher Order Logic”, in D.M. Gabbay, C.J. Hogger and J.A. Robinson (eds.), Handbook of Logic in Artif. Intell. and Logic Programming, pp. 229-321, Clarendon Press, Oxford, 1994.
2. A. Gupta, B. Ludäscher, M. E. Martone, “Registering Scientific Information Sources for Semantic Mediation”, 21st International Conference on Conceptual Modeling (ER), Tampere, Finland, pp. 182-198, October 2002.
3. J. McCarthy, “Situations, actions and causal laws”, Tech. Report, Stanford Univ., 1968.
4. R. Reiter, Knowledge in Action, The MIT Press, Cambridge, MA, 2001.
5. P. Godfrey, J. Grant, J. Gryz, and J. Minker, “Integrity constraints: Semantics and applications”, in Jan Chomicki and Gunter Saake (eds.), Logics for Databases and Information Systems, Kluwer, 1998.
6. U. Chakravarthy, J. Grant, and J. Minker, “Logic-based approach to semantic query optimization”, ACM Transactions on Database Systems, 15(2), pp. 162-207, 1990.
7. R. Kowalski and M. Sergot, “A logic-based calculus of events”, New Generation Computing, 4, pp. 67-95, 1986.