Page 1
Berendt: Advanced databases, 1st semester 2011/2012, http://www.cs.kuleuven.be/~berendt/teaching/
Advanced databases –
The Semantic Web
Bettina Berendt
Katholieke Universiteit Leuven, Department of Computer Science
http://www.cs.kuleuven.be/~berendt/teaching/
Last update: 11 October 2011
Page 2
Agenda
The Semantic Web: Motivation and overview
Very brief recap of XML (& why it’s not semantic)
RDF and RDFS
OWL and ontologies
Linked (Open) Data (LOD)
Storing, accessing and combining SW data
Page 3
The original vision
The entertainment system was belting out the Beatles' "We Can Work It Out" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His sister, Lucy, was on the line from the doctor's office: "Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I'm going to have my agent set up the appointments." Pete immediately agreed to share the chauffeuring.
At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules. (The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
Tim Berners-Lee, James Hendler and Ora Lassila (2001). The Semantic Web. A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American. http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
Page 4
Questions
1. [concrete] What (meta-)data & procedures would be needed to solve this problem?
2. [general] What do you find works poorly on the Web today when you look for information?
Page 5
The Semantic Web: overview
The semantic web is an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a format that can be read and used by software agents, thus permitting them to find, share and integrate information more easily.
It derives from W3C director Sir Tim Berners-Lee's vision of the Web as a universal medium for data, information, and knowledge exchange.
At its core, the semantic web comprises a philosophy, a set of design principles, collaborative working groups, and a variety of enabling technologies.
Some elements of the semantic web are expressed as prospective future possibilities that have yet to be implemented or realized.
Other elements of the semantic web are expressed in formal specifications.
Some of these include Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain.
Page 6
The Semantic Web layer cake (T. Berners-Lee talk at XML 2000)
RDF: W3C Rec. 2004
OWL: W3C Rec. 2004
OWL2: W3C Rec. 2009
Page 7
BTW: Semantic non-interoperability has real consequences ...
Page 8
Working example: People and their relations
Page 9
Approach 1: Centralised
Page 10
Approach 2: Decentralised / open
Page 11
FOAF (Friend of a Friend)
A machine-readable ontology describing persons, their activities and their relations to other people and objects.
Anyone can use FOAF to describe himself or herself. FOAF is an extension to RDF and is defined using OWL. Computers may use these FOAF profiles to find, for example, all people living in Europe, or to list all people both you and a friend of yours know.
This is accomplished by defining relationships between people. Each profile has a unique identifier (such as the person's e-mail address, a Jabber ID, or a URI of the person's homepage or weblog), which is used when defining these relationships.
The FOAF project, which defines and extends the vocabulary of a FOAF profile, was started in 2000 by Libby Miller and Dan Brickley.
http://www.foaf-project.org
„possibly the single most prevalent use of Semantic Web technologies so far“ – blog software exporting FOAF + RSS (Paolillo et al., 2005)
Page 12
FOAF example (1)
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<foaf:Person rdf:about="#JW">
<foaf:name>Jimmy Wales</foaf:name>
<foaf:mbox rdf:resource="mailto:[email protected] " />
<foaf:homepage rdf:resource="http://www.jimmywales.com/" />
<foaf:nick>Jimbo</foaf:nick>
<foaf:depiction rdf:resource="http://www.jimmywales.com/aus_img_small.jpg" />
<foaf:interest>
<rdf:Description rdf:about="http://www.wikimedia.org" rdfs:label="Wikipedia" />
</foaf:interest>
Page 13
FOAF example (2)
<foaf:knows>
<foaf:Person>
<foaf:name>Angela Beesley</foaf:name> <!-- Wikimedia Board of Trustees -->
</foaf:Person>
</foaf:knows>
</foaf:Person>
</rdf:RDF>
Social-web inferences
Page 14
FOAF extensions (1)
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:rel="http://www.perceive.net/schemas/relationship/">
<foaf:Person rdf:ID="spiderman">
<foaf:name>Spiderman</foaf:name>
<rel:enemyOf rdf:resource="#green-goblin"/>
</foaf:Person>
<foaf:Person rdf:ID="green-goblin">
<foaf:name>Green Goblin</foaf:name>
<rel:enemyOf rdf:resource="#spiderman"/>
</foaf:Person>
Page 15
FOAF extensions (2)
<foaf:Person rdf:ID="peter">
<foaf:name>Peter Parker</foaf:name>
<rel:friendOf rdf:resource="#harry"/>
</foaf:Person>
<foaf:Person rdf:ID="harry">
<foaf:name>Harry Osborn</foaf:name>
<rel:friendOf rdf:resource="#peter"/>
<rel:childOf rdf:resource="#norman"/>
</foaf:Person>
<foaf:Person rdf:ID="norman">
<foaf:name>Norman Osborn</foaf:name>
<rel:parentOf rdf:resource="#harry"/>
</foaf:Person>
</rdf:RDF>
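How a program might read such relationship statements can be sketched as follows. This is a minimal illustration using only Python's standard `xml.etree.ElementTree`; the document string is an abbreviated copy of the example above, and the function name `extract_relations` is a hypothetical helper, not part of FOAF or any RDF library. A real application would more likely use a dedicated RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
REL = "http://www.perceive.net/schemas/relationship/"

# Abbreviated copy of the FOAF extension example from the slides.
FOAF_DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}" xmlns:rel="{REL}">
  <foaf:Person rdf:ID="peter">
    <foaf:name>Peter Parker</foaf:name>
    <rel:friendOf rdf:resource="#harry"/>
  </foaf:Person>
  <foaf:Person rdf:ID="harry">
    <foaf:name>Harry Osborn</foaf:name>
    <rel:friendOf rdf:resource="#peter"/>
    <rel:childOf rdf:resource="#norman"/>
  </foaf:Person>
</rdf:RDF>"""


def extract_relations(xml_text):
    """Return (subject, relation, object) tuples for all rel:* properties."""
    root = ET.fromstring(xml_text)
    triples = []
    for person in root.findall(f"{{{FOAF}}}Person"):
        subject = "#" + person.get(f"{{{RDF}}}ID")
        for child in person:
            # ElementTree expands prefixes to "{namespace-uri}localname".
            if child.tag.startswith(f"{{{REL}}}"):
                relation = child.tag[len(REL) + 2:]  # strip "{namespace-uri}"
                target = child.get(f"{{{RDF}}}resource")
                triples.append((subject, relation, target))
    return triples


relations = extract_relations(FOAF_DOC)
for triple in relations:
    print(triple)
```

Each printed tuple is one subject-predicate-object statement, which is the form in which the later RDF slides discuss this data.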
Page 16
FOAF multimedia (1)
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<foaf:Person rdf:ID="peter">
<foaf:name>Peter Parker</foaf:name>
<foaf:depiction rdf:resource="http://www.peterparker.com/peter.jpg"/>
</foaf:Person>
<foaf:Person rdf:ID="spiderman">
<foaf:name>Spiderman</foaf:name>
</foaf:Person>
Page 17
FOAF multimedia (2)
<foaf:Person rdf:ID="green-goblin">
<foaf:name>Green Goblin</foaf:name>
</foaf:Person>
<!-- codepiction -->
<foaf:Image rdf:about="http://www.peterparker.com/photos/spiderman/statue.jpg">
<dc:title>Battle on the Statue Of Liberty</dc:title>
<foaf:depicts rdf:resource="#spiderman"/>
<foaf:depicts rdf:resource="#green-goblin"/>
<foaf:maker rdf:resource="#peter"/>
</foaf:Image>
</rdf:RDF>
Page 18
What inferences? Ex.: A social-network analysis of LiveJournal FOAF entries (Paolillo et al., 2005)
Interests over time remain similar
Friends over time remain similar
But: the manner in which people elect friends and interests in their LiveJournal profiles is sharply different. ... [These differences] represent fundamentally different social behaviors.
What does this mean for recommender systems?
Page 19
Agenda
The Semantic Web: Motivation and overview
Very brief recap of XML (& why it’s not semantic)
RDF and RDFS
OWL and ontologies
Linked (Open) Data (LOD)
Storing, accessing and combining SW data
Page 20
You have data … How should you structure it?
Here's some data about an aircraft:
medium-altitude, long-endurance unmanned aerial vehicle
14.8 meters
512 kilograms
70 knots
400 nautical miles
Page 21
The XML approach is to "wrap" each data item in start/end tags
RQ-1.xml:
<Aircraft>
  <wingspan>14.8 meters</wingspan>
  <weight>512 kilograms</weight>
  <cruise-speed>70 knots</cruise-speed>
  <range>400 nautical miles</range>
  <description>medium-altitude, long-endurance unmanned aerial vehicle</description>
</Aircraft>
and define this data schema, e.g. in a DTD
Page 22
XML Terminology
<wingspan>14.8 meters</wingspan>
Start tag: <wingspan>
Data: 14.8 meters
End tag: </wingspan>
Element: the start tag, the data and the end tag together
Page 23
Why use XML?
It is a universally accepted standard way of structuring data (syntax).
It is a W3C recommendation (W3C = World Wide Web Consortium)
The marketplace supports it with a lot of free/inexpensive tools.
The alternative to using XML is to define your own proprietary data syntax, and then build your own proprietary tools to support the proprietary syntax (Not a very appealing idea).
Page 24
But: What is this XML snippet talking about, i.e., what are the semantics?
<Predator> …</Predator>
What is a Predator?
Page 25
Predator - which one?
Predator: a medium-altitude, long-endurance unmanned aerial vehicle system.
Predator : one that victimizes, plunders, or destroys, especially for one's own gain.
Predator : an organism that lives by preying on other organisms.
Predator: a company which specializes in camouflage attire.
Predator: a video game.
Predator: software for machine networking.
Predator: a chain of paintball stores.
Page 26
A little more flexibility through namespaces
<?xml version="1.0" encoding="UTF-8"?>
<myThings
xmlns:h="http://www.mySchemas.org/TR/aircraft/" xmlns:f="http://www.yourSchemas.com/animals">
<h:Predator>
<h:name>OL231-b</h:name>
<h:wingspan>14.8 metres</h:wingspan>
</h:Predator>
<f:Predator>
<f:name>Panthera</f:name>
<f:eats>antelopes</f:eats>
</f:Predator>
</myThings>
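What namespaces buy us can be checked with a small sketch using Python's standard `xml.etree.ElementTree`. The document string is a shortened copy of the example above; the variable names are illustrative only. The parser expands each prefix to its namespace URI, so the two `Predator` elements get different qualified names.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the namespace example from the slide.
DOC = """<myThings
    xmlns:h="http://www.mySchemas.org/TR/aircraft/"
    xmlns:f="http://www.yourSchemas.com/animals">
  <h:Predator><h:name>OL231-b</h:name></h:Predator>
  <f:Predator><f:name>Panthera</f:name></f:Predator>
</myThings>"""

root = ET.fromstring(DOC)

# Tags come back as "{namespace-uri}localname", so the two Predators differ.
qualified_names = [child.tag for child in root]
for name in qualified_names:
    print(name)

# Look up the aircraft's name by namespace rather than by prefix spelling.
ns = {"h": "http://www.mySchemas.org/TR/aircraft/"}
aircraft_name = root.find("h:Predator/h:name", ns).text
print(aircraft_name)
```

The names are now distinguishable, but the program still knows nothing about what either kind of Predator is, which is the point of the following slides.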
Page 27
... But this doesn‘t solve the fundamental problems
1. What does nesting mean?
2. What do syntactical variations mean?
3. What do linguistic variations mean?
4. How can we extend our knowledge?
Page 28
1. What does nesting mean?
Schema 1 allows for expressions like:
<Person>
<name>Peter Parker</name> ...
</Person>
name being an XML-element of Person means: the person HAS-A ...
Schema 2 allows for expressions like:
<Person>
<type>Comic-book hero</type> ...
</Person>
type being an XML-element of Person means: the person IS-A ...
Problems: a) we don't know what nesting means; b) even if we do know, we can't express this in a machine-readable way (at most, we can build it into an application that uses these XML statements, but that would bury meaning in procedures!)
Page 29
2. What do syntactical variations mean?
Schema 1 allows for expressions like:
<Person>
<name>Peter Parker</name>
<birthday>1932-04-12</birthday> ...
</Person>
Schema 2 allows for expressions like:
<Person name="Peter Parker">
<type>Comic-book hero</type> ...
</Person>
Problems: a) what does it mean for some information to be an XML-element vs. an XML-attribute? b) even if we do know that they are the same, we can't express this in a machine-readable way, for example to combine the information from the two sources (same remark about applications as in 1.)
Page 30
3. What do linguistic variations mean?
Schema 1 allows for expressions like:
<Person>
<name>Peter Parker</name> ...
</Person>
Schema 2 allows for expressions like:
<Person>
<naam>Peter Parker</naam> ...
</Person>
Problems: a) we do not know whether elements from different data sources that differ by, e.g., natural language are the same or not; b) even if we do know that they are the same, we can't express this in a machine-readable way, for example to combine the information from the two sources (same remark about applications as in 1.)
Page 31
4. How can we extend our knowledge?
Schema 1 allows for expressions like:
<WebResource>
<type>Picture</type>
<hasURL>http://www.example.org/Pictures/myPic.png</hasURL>
<isAbout>Peter Parker</isAbout> ...
</WebResource>
Schema 2 allows for expressions like:
<WebResource>
<hasURL>http://www.example.org/Pictures/myPic.png</hasURL>
<hasLicence>CreativeCommons</hasLicence> ...
</WebResource>
Problems: a) we cannot refine our schema information with that provided by another source; b) even if we can be sure that the data are linkable in principle (here: via the URL), we can't express this in a machine-readable way, for example to combine the information from the two sources (same remark about applications as in 1.)
Page 32
Summary: XML not well-suited for conceptual modelling and therefore not suited for truly semantic markup
XML makes no commitment on:
1. Domain-specific ontological vocabulary
2. Ontological modeling primitives
Requires pre-arranged agreement on 1 & 2
Only feasible for closed collaboration
agents in a small & stable community
pages on a small & stable intranet
Not suited for sharing Web-resources
Page 33
Solution approach of the „higher levels“ of the Semantic Web
1. Break down information into atomic statements: subject-predicate-object
2. Define (in a formal-semantics way) what each component of each statement means
a. Give it a URI (uniform resource identifier) to enable uniform meaning specification
b. Define languages to say more about (specify) the meaning (by relating it to other units of meaning – cf. a dictionary in which each word is explained by other words)
c. (exception: some components may be literals / strings – these are not defined further)
3. The languages mentioned in 2.b. each add more expressivity:
1. RDF: subject-predicate-object statements (in RDF terminology: a resource has a property with a certain value).
2. RDFS: simple ontology building blocks: class, subclass-of relation, use RDF's type to denote that (e.g.) an individual is an instance of a class (= make it possible to define a schema and its instances), ...
3. OWL: more advanced ontology building blocks: a class (= concept) is disjoint with another one, is the same as another one; a property is functional, symmetric, the inverse of another one; ...
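Step 1 and step 2a above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the namespace `http://www.example.org/aircraft#` and the property names are made up for the example, and plain tuples stand in for a real RDF library. The aircraft record from the XML recap is broken into subject-predicate-object statements, and because every statement names its subject by URI, statements from a second source can be combined by a simple set union.

```python
# Hypothetical namespace, used only for this illustration.
EX = "http://www.example.org/aircraft#"

# The aircraft record as atomic subject-predicate-object statements.
# Subjects and predicates are URIs; the objects here are plain literals.
triples = {
    (EX + "RQ-1", EX + "wingspan", "14.8 meters"),
    (EX + "RQ-1", EX + "weight", "512 kilograms"),
    (EX + "RQ-1", EX + "cruiseSpeed", "70 knots"),
    (EX + "RQ-1", EX + "range", "400 nautical miles"),
}

# A second source says something else about the same URI.
other_source = {
    (EX + "RQ-1", EX + "description",
     "medium-altitude, long-endurance unmanned aerial vehicle"),
}


def objects(graph, subject, predicate):
    """Return all values of `predicate` for `subject` in `graph`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Combining the sources needs no schema negotiation: it is a set union.
merged = triples | other_source
print(len(merged))
print(objects(merged, EX + "RQ-1", EX + "weight"))
```

Contrast this with the four XML problems: here neither nesting, nor element-versus-attribute choices, nor a fixed schema stand in the way of adding a statement.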
Page 34
Semantic Web vs. Database
Advantages of using RDF/RDFS/OWL to define an Ontology:
Extensible: much easier to add new properties. Contrast with a database - adding a new column may break a lot of applications
Portable: much easier to move an OWL document than to move a database.
Advantages of using a Database to define an Ontology:
Mature: the database technology has been around a long time and is very mature.
Page 35
Agenda
The Semantic Web: Motivation and overview
Very brief recap of XML (& why it’s not semantic)
RDF and RDFS
OWL and ontologies
Linked (Open) Data (LOD)
Storing, accessing and combining SW data
Page 36
What is RDF ?
RDF is a data model
the model is domain-neutral, application-neutral
the model can be viewed as directed, labeled graphs or as an object-oriented model (object/attribute/value)
RDF data model is an abstract, conceptual layer independent of XML
consequently, XML is a transfer syntax for RDF, not a component of RDF
RDF data might never occur in XML form
Page 37
RDF model
RDF “statements” consist of
resources (= nodes)
which have properties
which have values (= nodes, strings)
Example:
{ http://www.w3.org/TR/REC-rdf-syntax/, author, “Ora Lassila” }
resource (= subject), property (= predicate), value (= object)
“http://www.w3.org/TR/REC-rdf-syntax/ has the author Ora Lassila”
Page 38
RDF Model Example
{ http://www.w3.org/TR/REC-rdf-syntax/, dc:Creator, “Ora Lassila” }
{ http://www.w3.org/TR/REC-rdf-syntax/, dc:Date, “1999-02-22” }
{ http://www.w3.org/TR/REC-rdf-syntax/, dc:Publisher, “W3C” }
Page 39
Complex values
So far, values of properties have been strings
A graph node (corresponding to a resource) also can be the value of a property
arbitrarily complex tree and graph structures are possible
syntactically, values can be embedded (i.e. lexically in-line) or referenced (linked)
Example:
[Figure: http://www.w3.org/TR/REC-rdf-syntax/ has a dc:Creator arc to an unlabelled node, which in turn has p:Name “Ora Lassila” and p:EMail “[email protected]”]
Page 40
Complex values (continued)
Corresponding triples
{ “http://www.w3.org/TR/REC-rdf-syntax/”, dc:Creator, x }
{ x, p:Name, “Ora Lassila” }
{ x, p:EMail, “[email protected]” }
[Figure: same graph as on the previous slide]
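The step from a nested value to flat triples can be sketched as below. This is an illustrative helper, not part of any RDF toolkit: the function name `to_triples`, the `_:x1` style labels for the intermediate node, and the e-mail value `ora@example.org` are all assumptions made for the example (the address on the slide is not reproduced).

```python
from itertools import count

_ids = count(1)  # source of fresh labels for intermediate nodes


def to_triples(subject, properties):
    """Flatten a nested dict into triples.

    Every nested dict becomes an intermediate node (the "x" on the slide)
    that is the object of one triple and the subject of further triples.
    """
    triples = []
    for predicate, value in properties.items():
        if isinstance(value, dict):
            node = f"_:x{next(_ids)}"
            triples.append((subject, predicate, node))
            triples.extend(to_triples(node, value))
        else:
            triples.append((subject, predicate, value))
    return triples


doc = "http://www.w3.org/TR/REC-rdf-syntax/"
result = to_triples(doc, {
    "dc:Creator": {
        "p:Name": "Ora Lassila",
        "p:EMail": "ora@example.org",  # placeholder, not the address on the slide
    },
})
for triple in result:
    print(triple)
```

The output has the same shape as the three triples listed on the slide: one statement pointing at the intermediate node, and two statements describing it.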
Page 41
Containers
Containers are collections
they allow grouping of resources (or literal values)
It is possible to make statements about the container (as a whole) or about its members individually
Different types of containers exist
bag - unordered collection
seq - ordered collection (= “sequence”)
alt - represents alternatives
It is also possible to create collections based on URI patterns
for example, all files in a particular web site
Duplicate values are permitted
there is no mechanism to enforce unique value constraints
Page 42
Containers (continued)
{ http://www.w3.org/TR/REC-rdf-syntax, dc:Creator, x }
{ x, rdf:type, rdf:Seq }
{ x, rdf:_1, “Ora Lassila” }
{ x, rdf:_2, “Ralph Swick” }
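A sequence container of this kind can be built and read back with a short sketch. The helper names `seq_triples` and `seq_members` and the node label `_:seq` are hypothetical; only the vocabulary terms `rdf:type`, `rdf:Seq` and the numbered membership properties `rdf:_1`, `rdf:_2`, ... come from RDF itself.

```python
def seq_triples(subject, predicate, members, node="_:seq"):
    """Describe an ordered container as triples.

    The container is a node of type rdf:Seq; membership is expressed with
    the numbered properties rdf:_1, rdf:_2, ...
    """
    triples = [(subject, predicate, node), (node, "rdf:type", "rdf:Seq")]
    for position, member in enumerate(members, start=1):
        triples.append((node, f"rdf:_{position}", member))
    return triples


def seq_members(triples, node):
    """Recover the members of a container in their numbered order."""
    numbered = [
        (int(p[len("rdf:_"):]), o)
        for s, p, o in triples
        if s == node and p.startswith("rdf:_")
    ]
    return [member for _, member in sorted(numbered)]


graph = seq_triples(
    "http://www.w3.org/TR/REC-rdf-syntax",
    "dc:Creator",
    ["Ora Lassila", "Ralph Swick"],
)
authors = seq_members(graph, "_:seq")
print(authors)
```

Note that the order lives entirely in the property names; the set of triples itself is unordered.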
Page 43
Higher-order statements
One can make RDF statements about other RDF statements
example: “Ralph believes that the web contains one billion documents”
Higher-order statements
allow us to express beliefs (and other modalities)
are important for trust models, digital signatures, etc.
also: metadata about metadata
are represented by modeling RDF in RDF itself
Page 44
Reification
RDF is not really second-order
But it does provide a built-in predicate vocabulary for reification
[Figure: the statement { http://www.w3.org/TR/REC-rdf-syntax, dc:Creator, “Ora Lassila” } drawn inside a dotted box; a second dc:Creator arc is labelled “Library of Congress”]
• The dotted box corresponds to the following statements
• { x, rdf:predicate, dc:Creator }
• { x, rdf:subject, “http://www.w3.org/TR/REC-rdf-syntax” }
• { x, rdf:object, “Ora Lassila” }
• { x, rdf:type, rdf:Statement }
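The four reification statements can be generated mechanically, as this sketch shows. The function name `reify` is hypothetical, and the property `ex:believedBy` is invented for the example (echoing the "Ralph believes that ..." slide); only `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` are RDF vocabulary.

```python
def reify(statement, node):
    """Describe `statement` as a resource `node` using the RDF
    reification vocabulary. The statement itself is not asserted."""
    subject, predicate, obj = statement
    return [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    ]


claim = ("http://www.w3.org/TR/REC-rdf-syntax", "dc:Creator", "Ora Lassila")
graph = reify(claim, "_:x")

# Now a statement about the statement becomes possible.
# "ex:believedBy" is a made-up property for this illustration.
graph.append(("_:x", "ex:believedBy", "Ralph"))

for triple in graph:
    print(triple)
```

The original triple is absent from `graph`: describing a statement is different from asserting it, which is what makes reification usable for beliefs and trust.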
Page 45
Reification
[Figure: pers05 is linked to ISBN... by Author-of; NYT is linked to that statement by claims]
<rdf:Description rdf:about="#NYT">
  <claims>
    <rdf:Description rdf:about="#pers05">
      <authorOf>ISBN...</authorOf>
    </rdf:Description>
  </claims>
</rdf:Description>
Any statement can be an object
graphs can be nested - reification
Page 46
RDF Schema
• Defines small vocabulary for RDF:
  • Class, subClassOf, type
  • Property, subPropertyOf
  • domain, range
• Vocabulary can be used to define other vocabularies for your application domain
[Figure: class Person with subclasses Student and Researcher (subClassOf arcs); property hasSuperVisor with domain and range arcs; instances Jeen and Frank attached to classes by type arcs and connected to each other by hasSuperVisor]
Page 47
RDF Schema syntax in XML
<rdf:Description rdf:ID="MotorVehicle">
  <rdf:type rdf:resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="http://www.w3.org/...#Resource"/>
</rdf:Description>
<rdf:Description rdf:ID="Truck">
  <rdf:type rdf:resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="#MotorVehicle"/>
</rdf:Description>
<rdf:Description rdf:ID="registeredTo">
  <rdf:type rdf:resource="http://www.w3.org/...#Property"/>
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="#Person"/>
</rdf:Description>
<rdf:Description rdf:ID="ownedBy">
  <rdf:type rdf:resource="http://www.w3.org/...#Property"/>
  <rdfs:subPropertyOf rdf:resource="#registeredTo"/>
</rdf:Description>
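What such a schema lets software conclude can be sketched with a tiny forward-chaining loop. This is a simplified illustration, not a complete RDFS reasoner: it covers only four rules (subPropertyOf, domain, range, and type propagation along subClassOf), the function name `rdfs_closure` is hypothetical, and the instance data (`truck1`, `truck2`, `alice`) is invented for the example.

```python
# Schema statements taken from the MotorVehicle example, written as tuples.
SCHEMA = {
    ("Truck", "rdfs:subClassOf", "MotorVehicle"),
    ("registeredTo", "rdfs:domain", "MotorVehicle"),
    ("registeredTo", "rdfs:range", "Person"),
    ("ownedBy", "rdfs:subPropertyOf", "registeredTo"),
}

# Hypothetical instance data, not from the slides.
DATA = {
    ("truck1", "ownedBy", "alice"),
    ("truck2", "rdf:type", "Truck"),
}


def rdfs_closure(triples):
    """Apply four RDFS-style rules until no new triple is produced."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                if p2 == "rdfs:subPropertyOf" and s2 == p:
                    new.add((s, o2, o))            # inherit the super-property
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))   # subject gets the domain class
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))   # object gets the range class
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))   # instance of the superclass
        if new <= graph:
            return graph
        graph |= new


inferred = rdfs_closure(SCHEMA | DATA)
for triple in sorted(inferred - SCHEMA - DATA):
    print(triple)
```

Nothing in the data says that `truck1` is a motor vehicle or that `alice` is a person; both follow from the schema alone, which is the sense in which RDFS adds meaning beyond plain XML tags.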
Page 48
Agenda
The Semantic Web: Motivation and overview
Very brief recap of XML (& why it’s not semantic)
RDF and RDFS
OWL and ontologies
Linked (Open) Data (LOD)
Storing, accessing and combining SW data
Page 49
Ontologies and concepts
An ontology is a conceptual model.
An Ontology is the collection of semantic definitions for a domain.
Example: an Aircraft Ontology is the set of semantic definitions for the Aircraft domain, e.g.,
Predator is a subClassOf Aircraft.
sensorID is a FunctionalProperty.
Platform is an equivalentClass to Aircraft.
Predator, Aircraft etc. are concepts.
Page 50
Basic idea of conceptual modelling (not only in SW): The semiotic triangle
Page 51
What is an ontology? (A commonly accepted informal definition and one formal definition)
An ontology is „an explicit specification of a shared conceptualisation.“ (Gruber, 1993)
(Stumme, Hotho & Berendt, Semantic Web Journal 2006)
Page 52
In which semantic web languages can ontologies be formulated?
RDF Schema is sufficient to specify an ontology with the first 4 components
For the fifth component (logical axioms), need a more expressive language like OWL.
Page 53
Ontologies, decentralization, and bottom-up engineering
Communities of users (application builders, ...) can
Re-use existing ontologies
Established domain-specific ontologies (e.g., real-estate, medicine, bioinformatics)
„The big one“: Cyc, see www.cyc.com
Search for ontologies
– See overview at http://en.wikipedia.org/wiki/Ontology_%28information_science%29#Ontology_libraries
– Use Sindice with some tricks: http://groups.google.com/group/sindice-dev/browse_thread/thread/831c084c3b5a0214 (or try the Advanced Search directly: http://sindice.com/search )
Link to existing ontologies
Extend existing ontologies
Page 54
Ontologies as conceptual models / schemas; or:
Database (knowledge base) = Ontology + Instances
BookCatalogue
title                    author            date
My Life and Times        Paul McCartney    June, 1998
Illusions                Richard Bach      1972
First and Last Freedom   J. Krishnamurti   1974
Ontology:
<owl:Class rdf:ID="BookCatalogue"/>
<owl:DatatypeProperty rdf:ID="title">
  <rdfs:domain rdf:resource="#BookCatalogue"/>
  <rdfs:range rdf:resource="&xsd;#string"/>
</owl:DatatypeProperty>
<owl:DatatypeProperty rdf:ID="author">
  <rdfs:domain rdf:resource="#BookCatalogue"/>
  <rdfs:range rdf:resource="&xsd;#string"/>
</owl:DatatypeProperty>
<owl:DatatypeProperty rdf:ID="date">
  <rdfs:domain rdf:resource="#BookCatalogue"/>
  <rdfs:range rdf:resource="&xsd;#date"/>
</owl:DatatypeProperty>
Instance:
<?xml version="1.0"?>
<BookCatalogue>
  <title>My Life and Times</title>
  <author>Paul McCartney</author>
  <date>June, 1998</date>
</BookCatalogue>
Page 55
OWL: more details
You‘ve already worked with it
Here is a nice tutorial that takes you through OWL‘s possibilities for formulating restrictions, constructing classes, etc., starting from the Protégé interface:
http://protege.stanford.edu/plugins/owl/publications/2004-07-06-OWL-Tutorial.ppt
And here is a version of the Tourism ontology created there in OWL (XML notation):
http://gaia.fdi.ucm.es/ontologies/travel.owl
Page 56
Agenda
The Semantic Web: Motivation and overview
Very brief recap of XML (& why it’s not semantic)
RDF and RDFS
OWL and ontologies
Linked (Open) Data (LOD)
Storing, accessing and combining SW data
Page 57
What is LOD?
“A way of making the Semantic Web happen“ (it is hoped)
Key concept: leverage the existence of structured data and combine it with the languages and infrastructures of the Web and the Semantic Web
Tim Berners-Lee: four principles of Linked Data (http://www.w3.org/DesignIssues/LinkedData)
Use URIs to identify things.
Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.
Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
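Principles 2 and 3 can be made concrete with a small sketch of how a client might ask for machine-readable data when looking up a URI. It uses Python's standard `urllib.request` and only builds the request; nothing is sent. That a given server answers an `Accept: application/rdf+xml` header with RDF is an assumption about that server, and the helper name `rdf_request` is hypothetical.

```python
import urllib.request


def rdf_request(uri):
    """Build (but do not send) an HTTP GET request that asks for RDF/XML
    instead of an HTML page."""
    return urllib.request.Request(
        uri,
        headers={"Accept": "application/rdf+xml"},
    )


request = rdf_request("http://dbpedia.org/resource/Berlin")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))

# Performing the lookup would be a single extra call (not executed here):
#     with urllib.request.urlopen(request) as response:
#         rdf_xml = response.read()
```

The same URI thus serves both as a name for the thing and as an address where a description of it can be fetched.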
Page 58
Data items are identified with HTTP URIs
{ pd:cygri, rdf:type, foaf:Person }
{ pd:cygri, foaf:name, “Richard Cyganiak” }
{ pd:cygri, foaf:based_near, dbpedia:Berlin }
pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri
dbpedia:Berlin = http://dbpedia.org/resource/Berlin
From http://www.ai.sri.com/~nysmith/slides/aic-seminars/090724-bizer.ppt
Page 59
Resolving URIs over the Web
{ pd:cygri, rdf:type, foaf:Person }
{ pd:cygri, foaf:name, “Richard Cyganiak” }
{ pd:cygri, foaf:based_near, dbpedia:Berlin }
{ dbpedia:Berlin, dp:population, 3.405.259 }
{ dbpedia:Berlin, skos:subject, dp:Cities_in_Germany }
From http://www.ai.sri.com/~nysmith/slides/aic-seminars/090724-bizer.ppt
Page 60
Dereferencing URIs over the Web
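{ pd:cygri, rdf:type, foaf:Person }
{ pd:cygri, foaf:name, “Richard Cyganiak” }
{ pd:cygri, foaf:based_near, dbpedia:Berlin }
{ dbpedia:Berlin, dp:population, 3.405.259 }
{ dbpedia:Berlin, skos:subject, dp:Cities_in_Germany }
{ dbpedia:Hamburg, skos:subject, dp:Cities_in_Germany }
{ dbpedia:Muenchen, skos:subject, dp:Cities_in_Germany }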
From http://www.ai.sri.com/~nysmith/slides/aic-seminars/090724-bizer.ppt
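The way the graph grows across the last three slides can be imitated offline. In this sketch a dictionary named `WEB` stands in for the Web: looking up a key plays the role of dereferencing a URI. The dictionary contents are a guess at which source would return which triples, and the function name `crawl` is hypothetical.

```python
# Stand-in for the Web: "dereferencing" a URI returns some triples.
# Which source returns which triple is an assumption for this sketch.
WEB = {
    "pd:cygri": [
        ("pd:cygri", "rdf:type", "foaf:Person"),
        ("pd:cygri", "foaf:name", "Richard Cyganiak"),
        ("pd:cygri", "foaf:based_near", "dbpedia:Berlin"),
    ],
    "dbpedia:Berlin": [
        ("dbpedia:Berlin", "dp:population", "3.405.259"),
        ("dbpedia:Berlin", "skos:subject", "dp:Cities_in_Germany"),
    ],
    "dp:Cities_in_Germany": [
        ("dbpedia:Hamburg", "skos:subject", "dp:Cities_in_Germany"),
        ("dbpedia:Muenchen", "skos:subject", "dp:Cities_in_Germany"),
    ],
}


def crawl(start, max_hops):
    """Collect triples by following links for up to `max_hops` hops."""
    graph, seen, frontier = set(), set(), [start]
    for _ in range(max_hops + 1):
        next_frontier = []
        for uri in frontier:
            if uri in seen:
                continue
            seen.add(uri)
            # Literals are looked up too in this sketch; they simply
            # return nothing.
            for triple in WEB.get(uri, []):
                graph.add(triple)
                for term in (triple[0], triple[2]):
                    if term not in seen:
                        next_frontier.append(term)
        frontier = next_frontier
    return graph


for hops in range(3):
    print(hops, len(crawl("pd:cygri", hops)))
```

Each additional hop corresponds to one of the slides above: first the person, then the city, then the other members of the city's category.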
Page 61
The Linked Open Data Cloud
Page 62
Interactive visualization of the Linked Data Cloud
http://www.webknox.com/blog/2010/05/linked-open-data-on-the-web-visualization/
Page 63
More on LOD
W3C Linking Open Data community project: http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
A nice slideset is available at http://www.ai.sri.com/~nysmith/slides/aic-seminars/090724-bizer.ppt
and a tutorial (together with a link to a recent book) at
http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
Page 64
Agenda
The Semantic Web: Motivation and overview
Very brief recap of XML (& why it’s not semantic)
RDF and RDFS
OWL and ontologies
Linked (Open) Data (LOD)
Storing, accessing and combining SW data
Page 65
How is this data stored? (1)
In „Semantic Web / LOD databases“: triplestores
A triplestore is a purpose-built database for the storage and retrieval of Resource Description Framework (RDF) metadata.
A triplestore can store many (up to billions of) RDF triples
For a list of implementations, see http://en.wikipedia.org/wiki/Triplestore
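The basic interface of a triplestore, storing triples and retrieving them by pattern, can be sketched in memory. This toy class is an assumption-laden illustration: the name `TripleStore` and its methods are invented here, it scans all triples on every query, and real triplestores use indexes and a query language such as SPARQL instead.

```python
class TripleStore:
    """Toy in-memory store: add triples, query them by pattern."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("pd:cygri", "foaf:name", "Richard Cyganiak")
store.add("pd:cygri", "foaf:based_near", "dbpedia:Berlin")
store.add("dbpedia:Berlin", "skos:subject", "dp:Cities_in_Germany")

# Everything known about pd:cygri:
print(store.match(subject="pd:cygri"))
# Who or what is based near Berlin?
print(store.match(predicate="foaf:based_near", obj="dbpedia:Berlin"))
```

A pattern with some positions fixed and others left open is also the building block of SPARQL queries.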
Page 66
How is this data stored? (2)
Embedded in Web pages
RDFa
a W3C Recommendation: http://www.w3.org/TR/rdfa-syntax/
e.g. for people: embed FOAF
Microformats
e.g. for people: hcard + XHTML Friends Network
Microdata
e.g. http://schema.org
What are the differences? Which one(s) are “truly Semantic Web“?
Just one out of many sample blogs comparing the three: http://blog.foolip.org/2009/08/23/microformats-vs-rdfa-vs-microdata/
Page 67
How is this data accessed?
By search engines that can extract the markup from Web pages
e.g., Google
By search engines that directly access triplestores
e.g. Sindice
By your own applications that directly access triplestores
e.g. your homework 2
Obviously, data can then also be transformed into RDF (e.g. RDFa) or into human-readable web pages, see the following for an example
Page 68
LOD on people (1)
Page 69
LOD on people (2)
Page 70
What do the LOD on people actually say? (1)
Page 71
What do the LOD on people actually say? (2)
Page 72
What does the combination/integration of this information require?
“Linkability“ at the technical level: see Linked Data principles
“Linkability“ at the semantic level of identity: sameAs
“Linkability“ at the semantic level of more complex relationships: schema / ontology matching
e.g. your homework 2
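Linkability at the level of identity can be sketched as follows, assuming that "sameAs" refers to the OWL property `owl:sameAs`. The identifiers `a:peter` and `b:pparker`, the sample triples, and the function names are invented for the example. Identifiers declared to be the same are grouped with a small union-find structure and then rewritten to one representative, so that statements from both sources end up attached to a single node.

```python
def canonical_ids(triples):
    """Group identifiers connected by owl:sameAs (union-find) and
    return a function mapping each identifier to its representative."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "owl:sameAs":
            parent[find(s)] = find(o)
    return find


def merge_same_as(triples):
    """Rewrite all triples to use one identifier per sameAs group."""
    find = canonical_ids(triples)
    return {(find(s), p, find(o)) for s, p, o in triples if p != "owl:sameAs"}


# Two hypothetical sources describing the same person under different URIs.
source_a = {("a:peter", "foaf:name", "Peter Parker")}
source_b = {("b:pparker", "foaf:homepage", "http://www.peterparker.com/")}
links = {("a:peter", "owl:sameAs", "b:pparker")}

merged = merge_same_as(source_a | source_b | links)
for triple in sorted(merged):
    print(triple)
```

This covers identity only; relating differently modelled classes and properties to each other is the harder schema / ontology matching problem named in the last bullet.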
Page 73
PS: What are „semantic technologies“?
encode meanings separately from data and content files, and separately from application code
Often use elements (e.g. the OWL language) of the Semantic Web
But not necessarily open data
Thus, increasingly popular for example for within-company solutions
Page 74
Outlook
The Semantic Web: Motivation and overview
Very brief recap of XML (& why it’s not semantic)
RDF and RDFS
OWL and ontologies
Linked (Open) Data (LOD)
Storing, accessing and combining SW data
Schema/ontology matching
Page 75
Used sources
(From or based on):
p. 6: http://en.wikipedia.org/wiki/Semantic_Web
pp. 21-26, 49: Costello, R.L. & Jacobs, D.B. (2003). A Two Minute Intro to XML. www.daml.org/meetings/2003/05/SWMU/briefings/07_1045_Essential_Building_Blocks.ppt
p. 33, pp. 37-48: Unnamed (no date). RDF and XML tutorial. http://lsdis.cs.uga.edu/SemWebCourse/RDF.ppt
p. 35, 53: Costello, R.L. & Jacobs, D.B. (2003). OWL Web Ontology Language.
http://www.racai.ro/EUROLAN-2003/html/presentations/JamesHendler/owl/OWL.ppt
pp. 12-14: http://en.wikipedia.org/wiki/FOAF_(software)
pp. 15-18: Dodds, L. (2004). An Introduction to FOAF. http://www.xml.com/pub/a/2004/02/04/foaf.html
Picture credits: see PPT „comments“ field
Page 76
Further references, background reading; acknowledgements
J. C. Paolillo, S. Mercure, and E. Wright. (2005). The social semantics of Livejournal FOAF: Structure and change from 2004 to 2005. In G. Stumme, B. Hoser, C. Schmitz, and H. Alani, editors, Proceedings of the 1st Workshop on Semantic Network Analysis at the ISWC 2005 Conference, pages 69 – 80. http://www.blogninja.com/paolillo-mercure-wright.final.pdf
Specifications:
RDF: http://www.w3.org/RDF/ , http://www.w3.org/TR/rdf-primer
OWL: http://www.w3.org/TR/owl-features
OWL2: http://www.w3.org/TR/owl2-overview/
FOAF: http://xmlns.com/foaf/spec