Ontology and semantic web (2011)

Post on 17-Dec-2014

DESCRIPTION

An Ontology is a description of things that exist and how they relate to each other. Ontologies and Natural Language Processing (NLP) can often be seen as two sides of the same coin.

Transcript

© 2012 IBM Corporation

Ontologies and the Semantic Web

July 2011

cmtrim@us.ibm.com

Outline

Triples
– Reification
– Confidence Levels

Ontology
– Design
– Architecture (big picture)
– SPARQL
– Inferencing

Methodology
– Creating a Semantic Network

Triples

Subject Predicate Object

“The author of Hamlet is Shakespeare”

Shakespeare authorOf Hamlet

Hamlet hasAuthor Shakespeare

Triples

“Shakespeare wrote Hamlet in 1876”

Shakespeare authorOf Hamlet

Hamlet writtenIn 1876
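A minimal sketch of the same idea in Python, assuming a triple is simply a (subject, predicate, object) tuple held in a set. The `match` helper and its `None`-as-wildcard convention are illustrative assumptions, not part of any RDF library:

```python
from typing import Optional

Triple = tuple[str, str, str]

# The two triples from the slide (the year is the slide's example value).
triples: set[Triple] = {
    ("Shakespeare", "authorOf", "Hamlet"),
    ("Hamlet", "writtenIn", "1876"),
}


def match(store: set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> list[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything stated about Hamlet as a subject:
print(match(triples, s="Hamlet"))
```

Any position can be left open, so the same store answers "who wrote Hamlet?" and "what did Shakespeare write?" without a schema change.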

Triples (Reification)

Wikipedia states “Shakespeare wrote Hamlet in 1876”

Wikipedia states Shakespeare

Shakespeare authorOf Hamlet

Hamlet writtenIn 1876

Triples (Reification)

Wikipedia states “Shakespeare wrote Hamlet in 1876”

Wikipedia states (Hamlet writtenIn 1876)

Shakespeare authorOf Hamlet

Triples (Confidence Levels)

ShakespeareOnline states (Hamlet writtenIn 1599)

Wikipedia states (Hamlet writtenIn 1876)

When was Hamlet written?
– 1599
– 1876

Triples (Confidence Levels)

Go from this:
– ShakespeareOnline states (Hamlet writtenIn 1599)

To this:
– (ShakespeareOnline states (Hamlet writtenIn 1599)) hasConfidenceLevel 90
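One way to picture a reified statement with a confidence level is as a small record wrapping the inner triple. The sketch below is an assumption about representation, not how any particular triple store does it. The `Statement` type and `best_answer` helper are hypothetical, and the confidence of 40 for Wikipedia is invented for illustration, since the slides only give the value 90.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    source: str                    # who states the triple
    triple: tuple[str, str, str]   # the reified (inner) triple
    confidence: int                # hasConfidenceLevel value


statements = [
    Statement("ShakespeareOnline", ("Hamlet", "writtenIn", "1599"), 90),
    # Hypothetical value: the slides give no confidence level for Wikipedia.
    Statement("Wikipedia", ("Hamlet", "writtenIn", "1876"), 40),
]


def best_answer(subject: str, predicate: str) -> Optional[str]:
    """Return the object asserted with the highest confidence, if any."""
    candidates = [
        st for st in statements
        if st.triple[0] == subject and st.triple[1] == predicate
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda st: st.confidence).triple[2]


print(best_answer("Hamlet", "writtenIn"))
```

Both competing claims stay in the store; the confidence level only decides which one is preferred when a single answer is needed.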

What is an Ontology?

Description of the kinds of entities there are and how they are related (Chris Welty)

Ontology

“Shakespeare wrote Hamlet in 1876”

How many “types” of things are there in this statement?
– Authors
– Books
– Plays
– Years
– Sources
– Characters

What relationships could exist between these types?

Ontology

Author
– Playwright {Shakespeare, Marlowe}

Book
– Play {Hamlet, Macbeth, Faustus}

RDF:
– Shakespeare a Playwright
– Shakespeare a Author
– Hamlet a Play
– Hamlet a Book

William Shakespeare [en2:Playwright] was an English poet and playwright, widely regarded as the greatest writer in the English language and the world's pre-eminent dramatist.

Semantic Chains

AIX hasCommand topas monitors (process uses (CPU hasComponent resources))

SPARQL

SELECT ?command
WHERE {
    AIX hasCommand ?command .
    ?command monitors/uses CPU
}
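The path `monitors/uses` in the query means "follow monitors, then follow uses". A rough pure-Python sketch of what that query asks, assuming the semantic chain is flattened into plain triples (the nesting shown in the chain is simplified here, and both helper names are hypothetical):

```python
# The semantic chain, flattened into individual triples (an assumption).
triples = {
    ("AIX", "hasCommand", "topas"),
    ("topas", "monitors", "process"),
    ("process", "uses", "CPU"),
    ("CPU", "hasComponent", "resources"),
}


def objects(subject: str, predicate: str) -> set[str]:
    """All objects o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def commands_monitoring_users_of(system: str, resource: str) -> set[str]:
    """Commands of `system` that monitor something which uses `resource`."""
    found = set()
    for command in objects(system, "hasCommand"):
        for monitored in objects(command, "monitors"):
            if resource in objects(monitored, "uses"):
                found.add(command)
    return found


print(commands_monitoring_users_of("AIX", "CPU"))
```

A real SPARQL engine evaluates the same pattern declaratively; the nested loops only make the two-hop traversal explicit.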

Inference

Ontology Model (Classes):

Product
– SupportedProduct (x hasMaker IBM)

Company
– IBM
– NonIBM (disjoint with IBM)
  • {Microsoft, Oracle, Teradata}

Ontology Model (Predicates):

<Product> hasMaker <Company>

Triple Store data:

Rational Software Architect hasMaker IBM

Rational Software Architect a SupportedProduct
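The second triple does not have to be entered by hand: it follows from the class definition of SupportedProduct (anything whose maker is IBM). A minimal sketch of that single rule, assuming a hand-written function rather than a real OWL reasoner; the "Oracle Database" triple is added purely for illustration and is not in the slides:

```python
asserted = {
    ("Rational Software Architect", "hasMaker", "IBM"),
    # Illustrative extra triple, not from the slides.
    ("Oracle Database", "hasMaker", "Oracle"),
}


def infer_supported_products(store: set[tuple[str, str, str]]) -> set[tuple[str, str, str]]:
    """Apply the rule: x hasMaker IBM  =>  x a SupportedProduct."""
    return {
        (s, "a", "SupportedProduct")
        for s, p, o in store
        if p == "hasMaker" and o == "IBM"
    }


print(infer_supported_products(asserted))
```

An OWL reasoner would derive the same membership from a class restriction; the function simply hard-codes that one restriction.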

Tivoli Monitoring hasSynonym ITM

ITM hasComponent ITM Agent

Tivoli Monitoring hasComponent Tivoli Monitoring Agent

Tivoli Monitoring Agent hasSynonym ITM Agent

“Agent” analysis

itm agent 54

db2 agent 32

os agent 32

ul agent 31

monitoring agent 29

oracle agent 22

agent needs 21

itm ul agent 16

windows os agent 15

agent left 14

agent system 14

citrix agent 14

mysap agent 14

unix os agent 13

linux agent 13

Proximal Verbs (normalized)

monitor

support

configure

run

start

show

build

appear

Events

Situation Event

Omnibus Event

ITM Event

Minor Event

Triggering Event

Console Event

System Event

TBSM Event

JMX Event

TEC Event

Blank Nodes

Explicit Characterization vs Implicit (Predicate-driven) Identification

Blank Nodes

What are blank nodes?
– A way of profiling entities
– A way of identifying entities without explicit identification
– Implicit identification
– Predicate-driven identification of data (rather than explicit characterization)

Examples:
– “That person has a child”
– “That person has a child and a husband”
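The two examples can be read as profiles: an entity is picked out by the predicates it carries, not by a name. A small sketch under that reading, where the `_:b…` labels are arbitrary local placeholders and the predicate names `hasChild` and `hasHusband` are assumptions made for illustration:

```python
# Blank-node style data: the labels carry no meaning outside this list.
triples = [
    ("_:b1", "hasChild", "_:b2"),
    ("_:b1", "hasHusband", "_:b3"),
    ("_:b4", "hasChild", "_:b5"),
]


def nodes_with(store: list[tuple[str, str, str]], *predicates: str) -> list[str]:
    """Nodes that appear as the subject of every listed predicate."""
    by_subject: dict[str, set[str]] = {}
    for s, p, _ in store:
        by_subject.setdefault(s, set()).add(p)
    wanted = set(predicates)
    return sorted(s for s, ps in by_subject.items() if wanted <= ps)


# "That person has a child and a husband"
print(nodes_with(triples, "hasChild", "hasHusband"))
```

Adding a predicate to the profile narrows the set of matching nodes, which is the sense in which identification is predicate driven.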

Anonymous (Anon) Nodes

What is the difference between an Anon Node and a Blank Node?

An “anonymous node” is an existentially quantified variable

A typical RDF node has an identifier to which it is useful to refer

Appendix A - Resources

Glossary

Books

Common OWL Editors

Triple Stores

Glossary

OWL – Web Ontology Language

RDF – Resource Description Framework

SPARQL – SPARQL Protocol and RDF Query Language (a recursive acronym)

Books

Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL
– Author(s): Dean Allemang and Jim Hendler
– Second Edition

Common OWL Editors

TopBraid Composer (TBC)
– Free Edition (also Standard + Maestro Editions)
– http://www.topquadrant.com/products/TB_Composer.html

Protege
– Free, open source ontology editor and knowledge-base framework
– http://protege.stanford.edu/

Triple Stores

Comparison and links here:

http://www.w3.org/wiki/LargeTripleStores

Sesame - scalable and transactional
– May be more suited to web environments
– Setup slightly more complex than Jena TDB

Jena TDB - scalable and very simple set up
– Code samples and API introduction here: http://cattail.boulder.ibm.com/cattail/#view=cmtrim@us.ibm.com/files/53A1E4007F0F3DDB8C12752E093F23B6
– The latest version of Jena TDB (0.90) is transactional. Past versions of TDB were not transactional, and may not be suited for web environments.

DB2-RDF – builds on top of the Jena Graph SPI.

https://www.ibm.com/developerworks/mydeveloperworks/blogs/nlp/entry/db2_rdf_nosql_graph_support13

Appendix B - OWL

OWL (Web Ontology Language)
– Built on top of RDF (same syntax as RDF)

Open World vs Closed World assumption

Parts of an Ontology:
– Header
– Classes and Individuals
– Properties
– Annotations
– Datatypes

Instance vs Subclass

OWL – Subclasses and Types

alpha rdfs:subClassOf Thing
– a rdf:type alpha
– b rdf:type alpha

beta rdfs:subClassOf alpha
– c rdf:type beta
– d rdf:type beta
– c rdf:type alpha
– d rdf:type alpha

OWL – Subclasses and Types

President rdfs:subClassOf Dignitary

Dignitary rdfs:subClassOf Person

This model states:
– All dignitaries are people
– All presidents are dignitaries (and thus, people)

John Smith rdf:type Person

Queen Elizabeth rdf:type Dignitary
– Queen Elizabeth rdf:type Person

GW Bush rdf:type President
– GW Bush rdf:type Dignitary
– GW Bush rdf:type Person

Barack Obama rdf:type President
– Barack Obama rdf:type Dignitary
– Barack Obama rdf:type Person
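The indented types are the ones a reasoner adds by walking up the rdfs:subClassOf chain from the asserted type. A sketch of that walk, assuming each class has at most one direct superclass (true for this model, but a simplification in general); the dictionary names are illustrative:

```python
# rdfs:subClassOf, child -> parent (single inheritance assumed).
subclass_of = {"President": "Dignitary", "Dignitary": "Person"}

# The directly asserted rdf:type of each individual.
asserted_type = {
    "John Smith": "Person",
    "Queen Elizabeth": "Dignitary",
    "GW Bush": "President",
    "Barack Obama": "President",
}


def types_of(individual: str) -> list[str]:
    """Asserted type followed by every inherited superclass, most specific first."""
    types = []
    cls = asserted_type.get(individual)
    while cls is not None:
        types.append(cls)
        cls = subclass_of.get(cls)
    return types


print(types_of("GW Bush"))
```

Only the most specific type needs to be stored; the rest can be recomputed whenever the class hierarchy changes.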

How do we expand this model to classify actively-serving American presidents?

Appendix C – OWL Properties

Transitive Property

Functional Property

Inverse Functional Property

Symmetric Property

Asymmetric Property

Reflexive Property

Irreflexive Property

Property Chains

Putting it all together

Others

Transitive Property

hasVersion rdf:type owl:TransitiveProperty

Windows hasVersion Windows XP

Windows XP hasVersion Windows XP SP2

Windows hasVersion Windows XP SP2
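The third triple is the one a reasoner derives from the first two. A naive fixed-point sketch of transitive closure over (subject, object) pairs for a single predicate, written for clarity rather than efficiency:

```python
def transitive_closure(pairs: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Keep adding (a, d) whenever (a, b) and (b, d) are present, until stable."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


has_version = {
    ("Windows", "Windows XP"),
    ("Windows XP", "Windows XP SP2"),
}

print(transitive_closure(has_version))
```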

Functional Property

ssn-name rdf:type owl:FunctionalProperty

123-45-6789 ssn-name Bob Smith

123-45-6789 ssn-name Robert Smythe

Bob Smith owl:sameAs Robert Smythe

Inverse Functional Property

hasSpeKey rdf:type owl:InverseFunctionalProperty

File Net Web Services hasSpeKey 5724S03

FN WS hasSpeKey 5724S03

File Net Web Services owl:sameAs FN WS
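Functional and inverse functional properties lead to owl:sameAs conclusions in mirror-image ways: a functional property allows one object per subject, an inverse functional property allows one subject per object. A hand-rolled sketch of both rules, using the data from the two slides (the function names are hypothetical, and no OWL library is involved):

```python
from collections import defaultdict
from itertools import combinations

triples = [
    ("123-45-6789", "ssn-name", "Bob Smith"),
    ("123-45-6789", "ssn-name", "Robert Smythe"),
    ("File Net Web Services", "hasSpeKey", "5724S03"),
    ("FN WS", "hasSpeKey", "5724S03"),
]


def same_as_functional(store, predicate):
    """Functional: objects that share a subject must be the same thing."""
    by_subject = defaultdict(set)
    for s, p, o in store:
        if p == predicate:
            by_subject[s].add(o)
    return {pair for objs in by_subject.values()
            for pair in combinations(sorted(objs), 2)}


def same_as_inverse_functional(store, predicate):
    """Inverse functional: subjects that share an object must be the same thing."""
    by_object = defaultdict(set)
    for s, p, o in store:
        if p == predicate:
            by_object[o].add(s)
    return {pair for subs in by_object.values()
            for pair in combinations(sorted(subs), 2)}


print(same_as_functional(triples, "ssn-name"))
print(same_as_inverse_functional(triples, "hasSpeKey"))
```

Note that neither rule reports a conflict; under the open world assumption two different names for one key are taken as evidence of identity, not as an error.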

Symmetric Property

siblingOf rdf:type owl:SymmetricProperty

Tim siblingOf Jim

Jim siblingOf Tim

Asymmetric Property

hasParent rdf:type owl:AsymmetricProperty

Stewie hasParent Peter

Peter does not have parent Stewie

Reflexive Property

Irreflexive Property

Property Chain

hasGrandfather owl:propertyChainAxiom (
    hasFather
    hasFather
) .

John III hasFather John JR

John JR hasFather John SR

John III hasGrandfather John SR
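The last triple is the derived one. A sketch of applying a two-step property chain by joining the first step's object to the second step's subject; `apply_chain` is a hypothetical helper limited to chains of length two:

```python
triples = {
    ("John III", "hasFather", "John JR"),
    ("John JR", "hasFather", "John SR"),
}


def apply_chain(store, chain, derived):
    """Derive (x, derived, z) wherever x -chain[0]-> y and y -chain[1]-> z."""
    first, second = chain
    return {
        (x, derived, z)
        for x, p1, y in store if p1 == first
        for y2, p2, z in store if p2 == second and y2 == y
    }


print(apply_chain(triples, ("hasFather", "hasFather"), "hasGrandfather"))
```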

Putting it all together …

hasSynonym
– Transitive, Symmetric
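A predicate that is both transitive and symmetric partitions its terms into groups in which every member is a synonym of every other. A sketch that computes those groups as connected components; the "IBM Tivoli Monitoring" pair is a hypothetical addition to make a three-member group, the other pairs come from the earlier slides:

```python
def synonym_sets(pairs):
    """Group terms linked, directly or indirectly, by a symmetric relation."""
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)   # symmetry
    seen, groups = set(), []
    for term in sorted(neighbours):
        if term in seen:
            continue
        group, stack = set(), [term]
        while stack:                             # transitivity: follow all links
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups


has_synonym = [
    ("Tivoli Monitoring", "ITM"),
    ("ITM", "IBM Tivoli Monitoring"),            # hypothetical extra synonym
    ("Tivoli Monitoring Agent", "ITM Agent"),
]

print(synonym_sets(has_synonym))
```

Storing one triple per new synonym is then enough; the full pairwise relation can be derived rather than maintained by hand.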

Appendix D - Classic Mereology

Transitive Axiom

Reflexive Axiom

Antisymmetric Axiom

Transitive Axiom

parts of parts are parts of the whole

If A is part of B and B is part of C, then A is part of C

Reflexive Axiom

everything is part of itself
– A is part of A

Antisymmetric Axiom

nothing is a part of its parts
– if A is part of B and A != B then B is not part of A
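The three axioms can be checked mechanically against any concrete part-of relation. A sketch with one checker per axiom; the Car, Engine, and Crankcase relation is an illustrative example constructed to satisfy all three:

```python
from itertools import product


def is_reflexive(rel, domain):
    """Everything is part of itself."""
    return all((x, x) in rel for x in domain)


def is_transitive(rel):
    """Parts of parts are parts of the whole."""
    return all((a, d) in rel
               for (a, b), (c, d) in product(rel, rel) if b == c)


def is_antisymmetric(rel):
    """If A is part of B and A != B, then B is not part of A."""
    return all(a == b or (b, a) not in rel for a, b in rel)


things = {"Car", "Engine", "Crankcase"}
part_of = {
    ("Crankcase", "Engine"),
    ("Engine", "Car"),
    ("Crankcase", "Car"),
} | {(x, x) for x in things}

print(is_reflexive(part_of, things), is_transitive(part_of), is_antisymmetric(part_of))
```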

Appendix E - Partonomy

Can you distinguish parts from kinds?

Why is this important?

This is often the difference between a taxonomy and an ontology
– A taxonomy doesn’t need to distinguish between parts and kinds
– An ontology must make this distinction

Vehicle
- Car
-- Engine
--- Crankcase
---- Aluminum Crankcase

Partonomy

Appendix F – Common Predicates

hasPart
– hasPart owl:inverseOf partOf
– hasPart rdf:type owl:TransitiveProperty
– partOf rdf:type owl:TransitiveProperty

hasLocus
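For the hasPart / partOf pair declared above, owl:inverseOf means each triple stated in one direction implies the mirrored triple in the other. A small sketch of that expansion; `add_inverses` is a hypothetical helper and the Car and Engine triples are illustrative data:

```python
def add_inverses(store, predicate, inverse):
    """Return the store plus the mirrored triple for each use of either predicate."""
    mirrored = {(o, inverse, s) for s, p, o in store if p == predicate}
    mirrored |= {(o, predicate, s) for s, p, o in store if p == inverse}
    return set(store) | mirrored


triples = {
    ("Car", "hasPart", "Engine"),
    ("Crankcase", "partOf", "Engine"),
}

print(sorted(add_inverses(triples, "hasPart", "partOf")))
```

Every inverse declaration doubles the triples a reasoner may materialise for that predicate, which is one practical reason to declare inverses sparingly.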

Appendix G

Blank nodes

Anonymous (Anon) nodes

Quads

Quads

(Reference Jena Tutorial with TDB.ppt)

Maintenance*

The relational model has relations between entities established through explicit keys (primary, foreign) and associative entities.
– Changing relationships in this case is cumbersome, as it requires changes to the base model structure itself.
– Changes in an RDBMS can be difficult for a populated database.

Hierarchical models have similar limitations.

The graph model (RDF) makes it much easier to maintain the model once it is deployed.
– A critical point is that relations are part of the data, not part of the database structure.
– If a new relationship needs to be added that was not anticipated, a new triple is simply added to the datastore.
– A graph model can be traversed from any perspective. In contrast, other types of database designs might require structural changes to answer new questions that arise after initial implementation.

Design Styles

Avoid proliferating owl:inverseOf [1]
