Introduction to RDF By: Sukhpal Singh Gill PhD Research Scholar Thapar University, Patiala 1

Introduction to RDF

Apr 14, 2017


Education

Transcript
Page 1: Introduction to RDF

Introduction to RDF

By: Sukhpal Singh Gill

PhD Research Scholar Thapar University, Patiala

1

Page 2: Introduction to RDF

Resource Description Framework

• A framework (not a language) for describing resources

• Model for data

• Syntax to allow exchange and use of information stored in various locations

• The point is to facilitate reading and correct use of information by computers, not necessarily by people

2

Page 3: Introduction to RDF

W3C recommendation

• Find the official recommendation at http://www.w3.org/RDF/

• Note the subtle difference between a standard and a recommendation

– W3C has no power to enforce compliance.

– Obeying the rules in the recommendation allows a site to participate in the world wide web cooperative enterprise.

3

Page 4: Introduction to RDF

Identification and description

• RDF identifies resources with URIs

– Often, though not always, the same as a URL

– Anything that can have a URI is a RESOURCE

• RDF describes resources with properties and property values

– A property is a resource that has a name

• Ex. Author, Book, Address, Client, Product

– A property value is the value of the property

• Ex. “Joanna Santillo,” http://www.someplace.com/, etc.

• A property value can be another resource, allowing nested descriptions.

4

Page 5: Introduction to RDF

Statements

• Resource, Property, Property Value

• Aka subject, predicate, object of a statement

• Predicates are not the same as English language verbs.

– Specify a relationship between the subject and the object

5

Page 6: Introduction to RDF

Examples

• Statement: "The author of http://www.w3schools.com/RDF is Jan Egil Refsnes".

• Subject: http://www.w3schools.com/RDF

• Predicate: author

• Object: Jan Egil Refsnes

• Statement: "The homepage of http://www.w3schools.com/RDF is http://www.w3schools.com".

• Subject: http://www.w3schools.com/RDF

• Predicate: homepage

• Object: http://www.w3schools.com

6

Page 7: Introduction to RDF

Binary predicates

• RDF offers only binary predicates.

• Think of them as P(x,y) where P is the relationship between the objects x and y.

• From the example,

• X = http://www.w3schools.com/RDF

• Y = Jan Egil Refsnes

• P = author

http://www.w3schools.com/RDF --author--> Jan Egil Refsnes
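The P(x, y) reading above can be sketched in a few lines of Python. This is only an illustration: the `Triple` type and the `holds` helper are hypothetical names invented here, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# The statement from the example slide, written as a triple.
statement = Triple(
    subject="http://www.w3schools.com/RDF",
    predicate="author",
    obj="Jan Egil Refsnes",
)

facts = {statement}


def holds(p: str, x: str, y: str, known: set) -> bool:
    """Binary predicate P(x, y): true when the triple (x, P, y) is a known fact."""
    return Triple(x, p, y) in known


print(holds("author", "http://www.w3schools.com/RDF", "Jan Egil Refsnes", facts))
```

Because every predicate takes exactly two arguments, a statement about three things has to be broken into several triples.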

7

Page 8: Introduction to RDF

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Hide your heart">
    <cd:artist>Bonnie Tyler</cd:artist>
    <cd:country>UK</cd:country>
    <cd:company>CBS Records</cd:company>
    <cd:price>9.90</cd:price>
    <cd:year>1988</cd:year>
  </rdf:Description>
  …
</rdf:RDF>

Root element of RDF documents

Source of namespace for elements with rdf prefix

Source of namespace for elements with cd prefix

Description element describes the resource identified by the rdf:about attribute.

cd:country etc. are properties of the resource.

8

Page 9: Introduction to RDF

RDF validator

• Check the correctness of an RDF document:

• http://www.w3.org/RDF/Validator/

• Result shows the subject, predicate and object of each element of the document and a graph of the model.

9

Page 10: Introduction to RDF

What is the Purpose of RDF?

• The purpose of RDF (Resource Description Framework) is to give a standard way of specifying data "about" something.

• Here's an example of an XML document that specifies data about China's Yangtze river:

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

"Here is data about the Yangtze River. It has a length of 6300 kilometers. Its startingLocation is western China's Qinghai-Tibet Plateau. Its endingLocation is the East China Sea."

10

Page 11: Introduction to RDF

XML --> RDF

Modify the following XML document so that it is also a valid RDF document:

XML (Yangtze.xml):

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

"convert to"

RDF (Yangtze.rdf):

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

11

Page 12: Introduction to RDF

The RDF Format

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

1. RDF provides an ID attribute for identifying the resource being described.

2. The ID attribute is in the RDF namespace.

3. Add the "fragment identifier symbol" (#) to the namespace.

12

Page 13: Introduction to RDF

The RDF Format (cont.)

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

1. Identifies the type (class) of the resource being described.

2. Identifies the resource being described. This resource is an instance of River.

3. These are properties, or attributes, of the type (class).

4. Values of the properties

13

Page 14: Introduction to RDF

Namespace Convention

Question: Why was "#" placed onto the end of the namespace? E.g.,

xmlns="http://www.geodesy.org/river#"

Answer: RDF is very concerned about uniquely identifying things - uniquely identifying the type (class) and uniquely identifying the properties. If we concatenate the namespace with the type then we get a unique identifier for the type, e.g.,

http://www.geodesy.org/river#River

If we concatenate the namespace with a property then we get a unique identifier for the property, e.g.,

http://www.geodesy.org/river#length

http://www.geodesy.org/river#startingLocation

http://www.geodesy.org/river#endingLocation

Thus, the "#" symbol is simply a mechanism for separating the namespace from the type name and the property name.
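As a rough illustration of the concatenation described above, the following Python sketch builds the identifiers from the slide. The `qualified` helper is a hypothetical name used only for this example.

```python
RIVER_NS = "http://www.geodesy.org/river#"


def qualified(namespace: str, local_name: str) -> str:
    """Concatenate a namespace and a local name into one identifier (illustrative helper)."""
    return namespace + local_name


print(qualified(RIVER_NS, "River"))             # identifier for the type (class)
print(qualified(RIVER_NS, "length"))            # identifier for a property
print(qualified(RIVER_NS, "startingLocation"))
print(qualified(RIVER_NS, "endingLocation"))

# Without the trailing "#", the namespace and the name run together.
print(qualified("http://www.geodesy.org/river", "length"))
```

The last line shows why the separator matters: without it, the boundary between namespace and name can no longer be recovered from the identifier.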

Best Practice

14

Page 15: Introduction to RDF

The RDF Format

<?xml version="1.0"?>
<Class rdf:ID="Resource"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="uri">
  <property>value</property>
  <property>value</property>
  ...
</Class>

15

Page 16: Introduction to RDF

Advantage of using the RDF Format

• You may ask: "Why should I bother designing my XML to be in the RDF format?"

• Answer: there are numerous benefits:

– The RDF format, if widely used, will help to make XML more interoperable:

• Tools can instantly characterize the structure, "this element is a type (class), and here are its properties".

• RDF promotes the use of standardized vocabularies ... standardized types (classes) and standardized properties.

– The RDF format gives you a structured approach to designing your XML documents. The RDF format is a regular, recurring pattern.

– It enables you to quickly identify weaknesses and inconsistencies of non-RDF-compliant XML designs. It helps you to better understand your data!

– You reap the benefits of both worlds:

• You can use standard XML editors and validators to create, edit, and validate your XML.

• You can use the RDF tools to apply inferencing to the data.

– It positions your data for the Semantic Web!

Network effect

Interoperability

16

Page 17: Introduction to RDF

Disadvantage of using the RDF Format

• Constrained: the RDF format constrains you on how you design your XML (i.e., you can't design your XML in any arbitrary fashion).

• RDF uses namespaces to uniquely identify types (classes), properties, and resources. Thus, you must have a solid understanding of namespaces.

• Another XML vocabulary to learn: to use the RDF format you must learn the RDF vocabulary.

17

Page 18: Introduction to RDF

18

The Semantic Web “Layer Cake”

Page 19: Introduction to RDF

19

RDF Basics

• RDF is based on the idea of identifying resources using Web identifiers and describing resources in terms of simple properties and property values.

• To identify resources, RDF uses Uniform Resource Identifiers (URIs) and URI references (URIrefs).

• Definition: A resource is anything that is identifiable by a URIref.

Page 20: Introduction to RDF

20

Example

• Consider the following information:

“there is a Person identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is [email protected], and whose title is Dr.”

Page 21: Introduction to RDF

21

Example (cont’d)

Page 22: Introduction to RDF

22

Basics (cont’d)

• Forget the long URIs for the moment!

• RDF is based on the idea that the resources being described have properties which have values, and that resources can be described by making statements, similar to the ones above, that specify those properties and values.

• Terminology:

– The part that identifies the thing the statement is about is called the subject.

– The part that identifies the property or characteristic of the subject that the statement specifies is called the predicate.

– The part that identifies the value of that property is called the object.

Page 23: Introduction to RDF

23

Uniform Resource Identifiers

• The Web provides a general form of identifier, called the Uniform Resource Identifier (URI), for identifying (naming) resources on the Web.

• Unlike URLs, URIs are not limited to identifying things that have network locations, or use other computer access mechanisms. A number of different URI schemes (URI forms) have already been developed, and are being used, for various purposes.

• Examples:

– http: (Hypertext Transfer Protocol, for Web pages)

– mailto: (email addresses), e.g., mailto:[email protected]

– ftp: (File Transfer Protocol)

– urn: (Uniform Resource Names, intended to be persistent location-independent resource identifiers), e.g., urn:isbn:0-520-02356-0 (for a book)

• No one person or organization controls who makes URIs or how they can be used.

While some URI schemes, such as the http: scheme used by URLs, depend on centralized systems such as DNS, other schemes, such as freenet:, are completely decentralized.

Page 24: Introduction to RDF

24

URIs (cont’d)

• A URI reference (or URIref) is a URI, together with an optional fragment identifier at the end.

Example: the URIref http://www.example.org/index.html#section2 consists of the URI http://www.example.org/index.html and (separated by the "#" character) the fragment identifier section2.

• URIrefs may be either absolute or relative.

• An absolute URIref refers to a resource independently of the context in which the URIref appears, e.g., the URIref http://www.example.org/index.html.

• A relative URIref is a shorthand form of an absolute URIref, where some prefix of the URIref is missing, and information from the context in which the URIref appears is required to fill in the missing information.

Example: the relative URIref otherpage.html, when appearing in a resource http://www.example.org/index.html, would be filled out to the absolute URIref http://www.example.org/otherpage.html.
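Both operations described above (splitting off a fragment identifier and filling out a relative URIref) can be tried with Python's standard `urllib.parse` module. This is a small sketch using the URIs from the slide; the variable names are illustrative.

```python
from urllib.parse import urldefrag, urljoin

# Split a URIref into its URI part and its fragment identifier.
uri, fragment = urldefrag("http://www.example.org/index.html#section2")
print(uri)       # the URI without the fragment
print(fragment)  # the fragment identifier

# Resolve a relative URIref against the resource in which it appears.
context = "http://www.example.org/index.html"
absolute = urljoin(context, "otherpage.html")
print(absolute)
```

`urljoin` replaces the last path segment of the context (index.html) with the relative reference, which is the "filling in" step the slide describes.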

Page 25: Introduction to RDF

25

URIrefs in RDF (cont’d)

• Another difference is in the way URIrefs with fragment identifiers are handled. Consider the following URIrefs:

http://www.example.org/index.html

http://www.example.org/index.html#Section2

• In normal HTML usage, these URIrefs are related (they both refer to the same document, the second one identifying a location within the first one).

• RDF assumes no particular relationship between these two URIrefs. As far as RDF is concerned, they are syntactically different URI references, and hence may refer to unrelated things.

Page 26: Introduction to RDF

Uniquely Identify the Resource

• Earlier we said that RDF is very concerned about uniquely identifying the type (class) and the properties. RDF is also very concerned about uniquely identifying the resource, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

This is the resource being described. We want to uniquely identify this resource.

26

Page 27: Introduction to RDF

rdf:ID

• The value of rdf:ID is a "relative URI".

• The "complete URI" is obtained by concatenating the URL of the XML document with "#" and then the value of rdf:ID, e.g.,

Yangtze.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Suppose that this RDF/XML document is located at this URL: http://www.china.org/geography/rivers. Thus, the complete URI for this resource is:

http://www.china.org/geography/rivers#Yangtze

27

Page 28: Introduction to RDF

xml:base

• On the previous slide we showed how the URL of the document provided the base URI.

• Relying on the location of the document is brittle: it will break if the document is moved, or is copied to another location.

• A more robust solution is to specify the base URI in the document, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Resource URI = concatenation(xml:base, '#', rdf:ID)
             = concatenation(http://www.china.org/geography/rivers, '#', "Yangtze")
             = http://www.china.org/geography/rivers#Yangtze
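The concatenation rule can be checked with a short Python sketch built on the standard `xml.etree.ElementTree` parser. The `resource_uri` function and its `document_url` fallback parameter are hypothetical names for illustration; a real RDF parser handles more cases than this.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace bound to the xml: prefix

DOC = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
</River>"""


def resource_uri(xml_text, document_url=None):
    """Return base + '#' + rdf:ID, preferring xml:base over the document URL."""
    root = ET.fromstring(xml_text)
    base = root.get(f"{{{XML_NS}}}base", document_url)
    rdf_id = root.get(f"{{{RDF_NS}}}ID")
    if base is None or rdf_id is None:
        raise ValueError("need a base URI and an rdf:ID to build the resource URI")
    return base + "#" + rdf_id


print(resource_uri(DOC))
```

When the document carries no xml:base, the caller has to supply the location it was fetched from, which is exactly the brittleness described above.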

28

Page 29: Introduction to RDF

rdf:about

• Instead of identifying a resource with a relative URI (which then requires a base URI to be prepended), we can give the complete identity of a resource. However, we use rdf:about, rather than rdf:ID, e.g.,

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

29

Page 30: Introduction to RDF

Triple -> resource/property/value

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#length of 6300 kilometers

resource property value

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#startingLocation of western China's ...

resource property value

http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#endingLocation of East China Sea

resource property value

30

Page 31: Introduction to RDF

The RDF Format = triples!

• The fundamental design pattern of RDF is to structure your XML data as resource/property/value triples!

<?xml version="1.0"?>
<Resource-A>
  <property-A>
    <Resource-B>
      <property-B>
        <Resource-C>
          <property-C>
            Value-C
          </property-C>
        </Resource-C>
      </property-B>
    </Resource-B>
  </property-A>
</Resource-A>

Resource-B is the value of property-A; Resource-C is the value of property-B.

The value of a property can be a literal (e.g., length has a value of 6300 kilometers). Also, the value of a property can be a resource, as shown above (e.g., property-A has a value of Resource-B, property-B has a value of Resource-C). We will see examples of properties having a resource value in a little bit.

Notice that the RDF design pattern is an alternating sequence of resource-property. This pattern is known as "striping".
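The striping pattern lends itself to a simple recursive walk. The sketch below is an illustration only: `walk` is a hypothetical helper, and it uses element names as stand-ins for resource identifiers, which a real RDF parser would not do.

```python
import xml.etree.ElementTree as ET

STRIPED = """<Resource-A>
  <property-A>
    <Resource-B>
      <property-B>
        <Resource-C>
          <property-C>Value-C</property-C>
        </Resource-C>
      </property-B>
    </Resource-B>
  </property-A>
</Resource-A>"""


def walk(resource, triples):
    """Alternate resource -> property -> value, collecting (resource, property, value)."""
    for prop in resource:
        nested = list(prop)
        if nested:
            # The property value is another resource: record it, then descend.
            value = nested[0]
            triples.append((resource.tag, prop.tag, value.tag))
            walk(value, triples)
        else:
            # The property value is a literal.
            triples.append((resource.tag, prop.tag, (prop.text or "").strip()))
    return triples


for triple in walk(ET.fromstring(STRIPED), []):
    print(triple)
```

Each level of nesting contributes one triple, so the three stripes in the example yield three statements.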

31

Page 32: Introduction to RDF

Naming Convention

• The convention is to use a capital letter to start a type (class) name, and use a lowercase letter to start a property name.

– This helps the eye quickly discern the striping pattern.

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

uppercase: River (type)

lowercase: length, startingLocation, endingLocation (properties)

32

Page 33: Introduction to RDF

RDF Model (graph)

Legend: Ellipse indicates "Resource" Rectangle indicates "literal string value"

33

Page 34: Introduction to RDF

rdf:Description + rdf:type

• There is still another way of representing the XML. This way makes it very clear that you are describing something, and it makes it very clear what the type (class) is of the thing you are describing:

<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#">
  <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</rdf:Description>

This is read as: "This is a Description about the resource http://www.china.org/geography/rivers#Yangtze. This resource is an instance of the River type (class). The http://www.china.org/geography/rivers#Yangtze resource has a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea."

Note: this form of describing a resource is called the "long form". The form we have seen previously is an abbreviation of this long form. An RDF Parser interprets the abbreviated form as if it were this long form.
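To make the equivalence of the two forms concrete, here is a minimal Python sketch that reads both and compares the resulting triples. The `triples` and `uri` helpers are hypothetical and cover only the constructs used on these slides (rdf:about, literal values, and rdf:resource); they are not a full RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SHORT = """<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""

LONG = """<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#">
  <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</rdf:Description>"""


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into namespace + local."""
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


def triples(xml_text):
    """Collect (subject, predicate, object) triples from one described resource."""
    root = ET.fromstring(xml_text)
    subject = root.get(f"{{{RDF_NS}}}about")
    found = set()
    if uri(root.tag) != RDF_NS + "Description":
        # Abbreviated form: the element name itself states the type.
        found.add((subject, RDF_NS + "type", uri(root.tag)))
    for prop in root:
        resource = prop.get(f"{{{RDF_NS}}}resource")
        value = resource if resource is not None else (prop.text or "").strip()
        found.add((subject, uri(prop.tag), value))
    return found


print(triples(SHORT) == triples(LONG))
```

The comparison works because the element name River in the abbreviated form and the rdf:type property in the long form expand to the same type statement.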

34

Page 35: Introduction to RDF

Alternative

• Alternatively we can use rdf:ID rather than rdf:about, as shown here:

<?xml version="1.0"?>
<rdf:Description rdf:ID="Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#"
                 xml:base="http://www.china.org/geography/rivers">
  <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</rdf:Description>

35

Page 36: Introduction to RDF

Equivalent Representations!

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#">
  <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</rdf:Description>

Note: In the RDF literature the examples are typically shown in this form.

36

Page 37: Introduction to RDF

RDF Namespace

http://www.w3.org/1999/02/22-rdf-syntax-ns#

ID

about

type

resource

Description

37

Page 38: Introduction to RDF

Terminology

• As you read the RDF literature you may see the following terminology:

– Subject: this term refers to the item that is playing the role of the resource.

– Predicate: this term refers to the item that is playing the role of the property.

– Object: this term refers to the item that is playing the role of the value.

Equivalent!

Subject = Resource

Predicate = property

Object = Value

38

Page 39: Introduction to RDF

RDF Parser

• There is a nice RDF parser at the W3 Web site:

http://www.w3.org/RDF/Validator/

This RDF parser will tell you if your XML is in the proper RDF format.

Do Lab1

39

Page 40: Introduction to RDF

Example #2

Modify the following XML document so that it is RDF-compliant:

Yangtze2.xml

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam">
    <name>The Three Gorges Dam</name>
    <width>1.5 miles</width>
    <height>610 feet</height>
    <cost>$30 billion</cost>
  </Dam>
</River>

40

Page 41: Introduction to RDF

Note the two types (classes)

River

Instance: Yangtze

Properties: length, startingLocation, endingLocation

Dam

Instance: ThreeGorges

Properties: name, width, height, cost

41

Page 42: Introduction to RDF

Dam - out of place

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam">
    <name>The Three Gorges Dam</name>
    <width>1.5 miles</width>
    <height>610 feet</height>
    <cost>$30 billion</cost>
  </Dam>
</River>

Types (classes) contain properties. Here we see the River type containing the properties length, startingLocation, and endingLocation. It also shows River directly containing a type, Dam. Thus, there is a Resource that contains another Resource, with no property in between. This is inconsistent with the RDF design pattern. (We are seeing one of the benefits of using the RDF format: it helps identify inconsistencies in an XML design.)

42

Page 43: Introduction to RDF

Property value must be a Literal or a Resource

property: length. Value is a Literal:

<length>6300 kilometers</length>

property: obstacle. Value is a Resource:

<obstacle>
  <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam">
    <name>The Three Gorges Dam</name>
    <width>1.5 miles</width>
    <height>610 feet</height>
    <cost>$30 billion</cost>
  </Dam>
</obstacle>

43

Page 44: Introduction to RDF

Modified XML (to make it consistent)

Yangtze2,v2.xml

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <obstacle>
    <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam">
      <name>The Three Gorges Dam</name>
      <width>1.5 miles</width>
      <height>610 feet</height>
      <cost>$30 billion</cost>
    </Dam>
  </obstacle>
</River>

"The Yangtze River has an obstacle that is the ThreeGorges Dam. The Dam has a name - The Three Gorges Dam. It has a width of 1.5 miles, a height of 610 feet, and a cost of $30 billion."

44

Page 45: Introduction to RDF

RDF Format

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <obstacle>
    <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#">
      <name>The Three Gorges Dam</name>
      <width>1.5 miles</width>
      <height>610 feet</height>
      <cost>$30 billion</cost>
    </Dam>
  </obstacle>
</River>

Changed id to rdf:ID

Added the '#' symbol

As always, the other representations using rdf:about and rdf:Description are available.

45

Page 46: Introduction to RDF

RDF Model (graph)

46

Page 47: Introduction to RDF

Alternatively, suppose that someone has already created a document containing information about the Three Gorges Dam:

Three-Gorges-Dam.rdf

<?xml version="1.0"?>
<Dam rdf:ID="ThreeGorges"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns="http://www.geodesy.org/dam#"
     xml:base="http://www.china.org/geography/rivers">
  <name>The Three Gorges Dam</name>
  <width>1.5 miles</width>
  <height>610 feet</height>
  <cost>$30 billion</cost>
</Dam>

Then we can simply reference the Three Gorges Dam resource using rdf:resource, as shown here:

Yangtze.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/>
</River>

47

Page 48: Introduction to RDF

Note: reference is to a resource, not to a file

Why was this the reference:

<obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/>

and not this:

<obstacle rdf:resource="http://www.china.org/geography/rivers/Three-Gorges-Dam.rdf"/>

That is, why wasn't the reference to a "file"?

Answer:

1. What if the file moved? Then the reference would break.

2. By using an identifier of the Three Gorges Dam, and leaving the particular file unspecified, an "aggregator tool" will be able to collect information from all the files that talk about the Three Gorges Dam resource (see next slide).

Do Lab2

48

Page 49: Introduction to RDF

Anyone, Anywhere, Anytime Can Talk About a Resource

• In all of our examples we have provided a unique identifier to resources, e.g.,

http://www.china.org/geography/rivers#Yangtze

• Consequently, if another RDF document identifies the same resource then the data that it specifies gives additional data about that resource.

• An aggregator tool will be able to collect all data about a resource and present a consolidated set of data for the resource. That's powerful!

49

Page 50: Introduction to RDF

rdf:ID versus rdf:about

• When should rdf:ID be used? When should rdf:about be used?

– When you want to introduce a resource, and provide an initial set of information about a resource, use rdf:ID

– When you want to extend the information about a resource, use rdf:about

• The RDF philosophy is akin to the Web philosophy. That is, anyone, anywhere, anytime can provide information about a resource.

50

Page 51: Introduction to RDF

http://www.china.org/geography/rivers/yangtze.rdf

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

http://www.encyclopedia.org/yangtze-alternate-names.rdf

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <name>Dri Chu - Female Yak River</name>
  <name>Tongtian He, Travelling-Through-the-Heavens River</name>
  <name>Jinsha Jiang, River of Golden Sand</name>
</River>

Aggregator tool collects data about the Yangtze

Aggregated Data!

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <name>Dri Chu - Female Yak River</name>
  <name>Tongtian He, Travelling-Through-the-Heavens River</name>
  <name>Jinsha Jiang, River of Golden Sand</name>
</River>

A distributed network of data!
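What such an aggregator tool does can be approximated by merging triples that share a subject URI. The sketch below is a toy model under stated assumptions: the two source lists stand in for the two documents after parsing, and `aggregate` is a hypothetical function, not a real tool's API.

```python
from collections import defaultdict

YANGTZE = "http://www.china.org/geography/rivers#Yangtze"
RIVER = "http://www.geodesy.org/river#"

# Triples as they might be read from yangtze.rdf.
source_a = [
    (YANGTZE, RIVER + "length", "6300 kilometers"),
    (YANGTZE, RIVER + "startingLocation", "western China's Qinghai-Tibet Plateau"),
    (YANGTZE, RIVER + "endingLocation", "East China Sea"),
]

# Triples as they might be read from yangtze-alternate-names.rdf.
source_b = [
    (YANGTZE, RIVER + "name", "Dri Chu - Female Yak River"),
    (YANGTZE, RIVER + "name", "Tongtian He, Travelling-Through-the-Heavens River"),
    (YANGTZE, RIVER + "name", "Jinsha Jiang, River of Golden Sand"),
]


def aggregate(*sources):
    """Group (property, value) pairs by subject URI, skipping exact duplicates."""
    merged = defaultdict(list)
    for source in sources:
        for subject, prop, value in source:
            if (prop, value) not in merged[subject]:
                merged[subject].append((prop, value))
    return dict(merged)


for prop, value in aggregate(source_a, source_b)[YANGTZE]:
    print(prop, "=", value)
```

The merge only works because both documents use the identical subject URI; the matching is plain string equality on the identifier.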

51

Page 52: Introduction to RDF

Another Example of Aggregation

http://www.encyclopedia.org/three-gorges-dam.rdf

<?xml version="1.0"?>
<Dam rdf:ID="ThreeGorges"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns="http://www.geodesy.org/dam#"
     xml:base="http://www.china.org/geography/rivers">
  <name>The Three Gorges Dam</name>
  <width>1.5 miles</width>
  <height>610 feet</height>
  <cost>$30 billion</cost>
</Dam>

http://www.china.org/geography/rivers/yangtze.rdf

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/>
</River>

Aggregate!

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <obstacle>
    <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#">
      <name>The Three Gorges Dam</name>
      <width>1.5 miles</width>
      <height>610 feet</height>
      <cost>$30 billion</cost>
    </Dam>
  </obstacle>
</River>

Note that the reference to the ThreeGorges Dam resource has been replaced by whatever information the aggregator could find on this resource!

52

Page 53: Introduction to RDF

Example #3

Notice that in this XML document there is no unique identifier:

XML (Yangtze3.xml):

<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
  <name>Yangtze</name>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

RDF (Yangtze3.rdf):

<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
  <name>Yangtze</name>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

The RDF is identical to the XML!

53

Page 54: Introduction to RDF

Interpreting the RDF

<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
  <name>Yangtze</name>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Yangtze3.rdf

This is read as: "This is an instance of the River type (class). The River has a name of Yangtze, a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea."

In this document the resource is anonymous - it has no identifier.
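The reading above can be sketched in a few lines of Python. This is only an illustration using the standard library's xml.etree.ElementTree, not a real RDF parser: the helper names split_qname and describe are made up for this example, and the code handles only a single resource whose properties all have literal values.

```python
import xml.etree.ElementTree as ET

YANGTZE3 = """<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
  <name>Yangtze</name>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""


def split_qname(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    namespace, _, local = tag[1:].partition("}")
    return namespace, local


def describe(xml_text):
    """Return (type URI, [(property URI, literal value), ...]) for one resource."""
    root = ET.fromstring(xml_text)
    namespace, local = split_qname(root.tag)
    properties = []
    for child in root:
        child_ns, child_local = split_qname(child.tag)
        # Namespace + local name gives the full URI of the property.
        properties.append((child_ns + child_local, child.text))
    return namespace + local, properties


resource_type, properties = describe(YANGTZE3)
print("type:", resource_type)
for prop, value in properties:
    print(prop, "=", value)
```

The outer element supplies the type (class) and each child element supplies one property with its value. Nothing in the output names the resource itself, which is what "anonymous" means here.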

Page 55: Introduction to RDF

Disadvantage of anonymous resources

http://www.china.org/geography/rivers/yangtze.rdf:

<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
  <name>Yangtze</name>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

http://www.encyclopedia.org/yangtze-alternate-names.rdf:

<?xml version="1.0"?>
<River xmlns="http://www.geodesy.org/river#">
  <name>Yangtze</name>
  <name>Dri Chu - Female Yak River</name>
  <name>Tongtian He, Travelling-Through-the-Heavens River</name>
  <name>Jinsha Jiang, River of Golden Sand</name>
</River>

Aggregate:

An aggregator tool will not be able to determine if these documents are talking about the same resource.
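A rough Python sketch of the problem an aggregator faces. Everything here is illustrative: aggregate and river_doc are hypothetical helpers, the shared base URI is an assumption, and the second scenario (both documents carrying rdf:ID="Yangtze") is invented purely for contrast with the anonymous case.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BASE = "http://www.china.org/geography/rivers"  # assumed shared base URI


def river_doc(body, rdf_id=None):
    """Build a small River document, optionally identified with rdf:ID."""
    id_attr = ' rdf:ID="%s"' % rdf_id if rdf_id else ""
    return ('<River%s xmlns:rdf="%s" xmlns="http://www.geodesy.org/river#">%s</River>'
            % (id_attr, RDF_NS, body))


def aggregate(documents):
    """Merge properties per resource URI; anonymous resources cannot be merged."""
    merged = {}     # resource URI -> list of (property tag, value)
    anonymous = []  # one property list per unidentified resource
    for base_uri, xml_text in documents:
        root = ET.fromstring(xml_text)
        props = [(child.tag, child.text) for child in root]
        local_id = root.get("{%s}ID" % RDF_NS)
        if local_id is None:
            anonymous.append(props)
        else:
            merged.setdefault(base_uri + "#" + local_id, []).extend(props)
    return merged, anonymous


facts = "<name>Yangtze</name><length>6300 kilometers</length>"
names = "<name>Dri Chu - Female Yak River</name>"

# Without identifiers the two descriptions stay apart.
merged_anon, unmatched = aggregate([(BASE, river_doc(facts)),
                                    (BASE, river_doc(names))])
print(len(merged_anon), len(unmatched))

# With rdf:ID="Yangtze" in both, the properties land on one resource.
merged_named, _ = aggregate([(BASE, river_doc(facts, "Yangtze")),
                             (BASE, river_doc(names, "Yangtze"))])
print(list(merged_named))
```

A shared name value such as "Yangtze" is not enough to merge on, since two different resources could have the same name. Only an identifier gives the aggregator a key it can rely on.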

Page 56: Introduction to RDF

Example #4

Yangtze4.xml (XML):

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure">
  <length uom:units="kilometers">6300</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Yangtze4.rdf (RDF):

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
  <length>
    <rdf:Description>
      <rdf:value>6300</rdf:value>
      <uom:units>kilometers</uom:units>
    </rdf:Description>
  </length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Page 57: Introduction to RDF

Yangtze4.xml:

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
  <length uom:units="kilometers">6300</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

RDF does not allow ordinary attributes on the properties (except for special RDF attributes such as rdf:resource), so the uom:units attribute must become a child element. Your first instinct might be to modify length to have two child elements:

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
  <length>
    <value>6300</value>
    <uom:units>kilometers</uom:units>
  </length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

However, the length property now has two values. RDF supports only binary relations, i.e., each statement gives a property a single value.

Page 58: Introduction to RDF

rdf:value

[Figure: length with two values, 6300 and kilometers]

length has two values - 6300 and kilometers. RDF provides a special property, rdf:value, to be used for specifying the "primary" value. In this example, 6300 is the primary value, and kilometers is a value which provides additional information about the primary value.

Page 59: Introduction to RDF

RDF Format

Yangtze4.rdf:

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
  <length>
    <rdf:Description>
      <rdf:value>6300</rdf:value>
      <uom:units>kilometers</uom:units>
    </rdf:Description>
  </length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

The rdf:Description element inside length is an anonymous resource.

Read this as: "The Yangtze River has a length whose value is a resource which has a value of 6300 and whose units are kilometers."

Page 60: Introduction to RDF

Advantage of anonymous resources

<rdf:Description>
  <rdf:value>6300</rdf:value>
  <uom:units>kilometers</uom:units>
</rdf:Description>

This is an anonymous resource. Its sole purpose is to provide a context for the two properties. Other RDF documents will have no need to amplify this resource, so there is no reason to give it an identifier. In this case it makes good sense to use an anonymous resource.

Page 61: Introduction to RDF

RDF Model (graph)

The value of length is an anonymous resource (also called a "blank node"), that is, a resource with no identifier. (Note: RDF parsers will typically generate a unique identifier for anonymous resources, to distinguish one anonymous resource from another.)
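As a rough illustration of the graph behind the length property, the Python sketch below lists its statements as (subject, predicate, object) triples and makes up a label for the blank node. The "_:b1" label style, the helper names, and the subject URI (the base URI used elsewhere in these slides plus #Yangtze) are assumptions for this example; a real RDF parser chooses its own labels.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RIVER_NS = "http://www.geodesy.org/river#"
UOM_NS = "http://www.measurements.org/units-of-measure#"
SUBJECT = "http://www.china.org/geography/rivers#Yangtze"  # assumed URI

LENGTH_FRAGMENT = """<length xmlns="%s" xmlns:rdf="%s" xmlns:uom="%s">
  <rdf:Description>
    <rdf:value>6300</rdf:value>
    <uom:units>kilometers</uom:units>
  </rdf:Description>
</length>""" % (RIVER_NS, RDF_NS, UOM_NS)


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into a plain URI string."""
    return tag[1:].replace("}", "", 1)


def length_triples(subject, fragment, labels):
    """Yield triples for a property whose value is an anonymous resource."""
    prop = ET.fromstring(fragment)
    node = prop.find("{%s}Description" % RDF_NS)
    blank = "_:b%d" % next(labels)  # local label only, not a URI
    yield (subject, uri(prop.tag), blank)
    for child in node:
        yield (blank, uri(child.tag), child.text)


labels = itertools.count(1)
triples = list(length_triples(SUBJECT, LENGTH_FRAGMENT, labels))
for triple in triples:
    print(triple)
```

The generated label only keeps one blank node apart from another within a single run. It is not an identifier that other documents could refer to.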

Page 62: Introduction to RDF

rdf:parseType="Resource"

If the value of a property consists of several values, then one option is to create an anonymous resource, as we saw. RDF provides a shorthand, rdf:parseType="Resource", so that you don't need to create an rdf:Description element, as shown here:

Yangtze4,v2.rdf:

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
  <length rdf:parseType="Resource">
    <rdf:value>6300</rdf:value>
    <uom:units>kilometers</uom:units>
  </length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

The meaning of this is identical to that of the rdf:Description version shown earlier.

Page 63: Introduction to RDF

Equivalent!

<length>
  <rdf:Description>
    <rdf:value>6300</rdf:value>
    <uom:units>kilometers</uom:units>
  </rdf:Description>
</length>

<length rdf:parseType="Resource">
  <rdf:value>6300</rdf:value>
  <uom:units>kilometers</uom:units>
</length>
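One way to convince yourself of the equivalence is to reduce both forms to the properties of the anonymous resource and compare them. The Python sketch below does that with the standard library; blank_node_properties is a hypothetical helper written for this example and handles only these two shapes, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NAMESPACES = ('xmlns="http://www.geodesy.org/river#" '
              'xmlns:rdf="%s" '
              'xmlns:uom="http://www.measurements.org/units-of-measure#"' % RDF_NS)

LONG_FORM = """<length %s>
  <rdf:Description>
    <rdf:value>6300</rdf:value>
    <uom:units>kilometers</uom:units>
  </rdf:Description>
</length>""" % NAMESPACES

SHORT_FORM = """<length %s rdf:parseType="Resource">
  <rdf:value>6300</rdf:value>
  <uom:units>kilometers</uom:units>
</length>""" % NAMESPACES


def blank_node_properties(fragment):
    """Return {property tag: value} for the anonymous resource inside a property."""
    prop = ET.fromstring(fragment)
    if prop.get("{%s}parseType" % RDF_NS) == "Resource":
        node = prop  # shorthand: the children are the blank node's properties
    else:
        node = prop.find("{%s}Description" % RDF_NS)
    return {child.tag: child.text for child in node}


long_props = blank_node_properties(LONG_FORM)
short_props = blank_node_properties(SHORT_FORM)
print(long_props == short_props)
```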

Do Lab3

Page 64: Introduction to RDF

Summary

Modify the following XML document (Yangtze.xml) so that it is also a valid RDF document:

<?xml version="1.0"?>
<River id="Yangtze"
       xmlns="http://www.geodesy.org/river"
       xmlns:uom="http://www.measurements.org/units-of-measure#">
  <length uom:units="kilometers">6300</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam">
    <name>The Three Gorges Dam</name>
    <width>1.5 miles</width>
    <height>610 feet</height>
    <cost>$30 billion</cost>
  </Dam>
</River>

See next slide -->

Page 65: Introduction to RDF

Containers

• Groups of things: <rdf:Bag>, <rdf:Seq>, <rdf:Alt>

• <rdf:Bag>: unordered list; duplicates allowed

• <rdf:Seq>: ordered list; duplicates allowed

• <rdf:Alt>: list of alternatives; one will be selected

Page 66: Introduction to RDF

Example <rdf:Alt>

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Beatles">
    <cd:format>
      <rdf:Alt>
        <rdf:li>CD</rdf:li>
        <rdf:li>Record</rdf:li>
        <rdf:li>Tape</rdf:li>
      </rdf:Alt>
    </cd:format>
  </rdf:Description>
</rdf:RDF>

One of these formats will be selected.
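In the RDF graph, each rdf:li member of a container becomes a numbered membership property: rdf:_1, rdf:_2, and so on. The Python sketch below performs that expansion for the rdf:Alt above. The container_triples helper and the "_:formats" node label are made up for this illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

ALT_FRAGMENT = """<rdf:Alt xmlns:rdf="%s">
  <rdf:li>CD</rdf:li>
  <rdf:li>Record</rdf:li>
  <rdf:li>Tape</rdf:li>
</rdf:Alt>""" % RDF_NS


def container_triples(fragment, node="_:formats"):
    """Expand an RDF container into a type triple plus rdf:_n membership triples."""
    container = ET.fromstring(fragment)
    container_type = container.tag[1:].replace("}", "", 1)  # e.g. ...#Alt
    triples = [(node, RDF_NS + "type", container_type)]
    items = container.findall("{%s}li" % RDF_NS)
    for position, item in enumerate(items, start=1):
        triples.append((node, "%s_%d" % (RDF_NS, position), item.text))
    return triples


alt_triples = container_triples(ALT_FRAGMENT)
for triple in alt_triples:
    print(triple)
```

The same expansion applies to rdf:Bag and rdf:Seq. Only the type triple differs, and with it the intended meaning of the numbering.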

Page 67: Introduction to RDF

Limiting the scope

• Collection - describes a group that contains only the specified members, no others.

<rdf:Description rdf:about="http://recshop.fake/cd/Beatles">
  <cd:artist rdf:parseType="Collection">
    <rdf:Description rdf:about="http://recshop.fake/cd/Beatles/George"/>
    <rdf:Description rdf:about="http://recshop.fake/cd/Beatles/John"/>
    <rdf:Description rdf:about="http://recshop.fake/cd/Beatles/Paul"/>
    <rdf:Description rdf:about="http://recshop.fake/cd/Beatles/Ringo"/>
  </cd:artist>
</rdf:Description>
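A collection is closed because rdf:parseType="Collection" stands for a linked list built from rdf:first and rdf:rest and terminated by rdf:nil, so there is an explicit end marker. The Python sketch below builds those triples for the four members. The collection_triples helper and the "_:cellN" labels are invented for this example, and the cd:artist property URI assumes the cd namespace from the rdf:Alt example.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CD_ARTIST = "http://www.recshop.fake/cd#artist"  # assumed cd: namespace
BEATLES = "http://recshop.fake/cd/Beatles"


def collection_triples(subject, predicate, members):
    """Expand a closed collection into an rdf:first / rdf:rest linked list."""
    if not members:
        # An empty collection is simply rdf:nil.
        return [(subject, predicate, RDF_NS + "nil")]
    cells = ["_:cell%d" % i for i in range(1, len(members) + 1)]
    triples = [(subject, predicate, cells[0])]
    for index, (cell, member) in enumerate(zip(cells, members)):
        is_last = index + 1 == len(cells)
        next_cell = RDF_NS + "nil" if is_last else cells[index + 1]
        triples.append((cell, RDF_NS + "first", member))
        triples.append((cell, RDF_NS + "rest", next_cell))
    return triples


members = [BEATLES + "/" + name for name in ("George", "John", "Paul", "Ringo")]
artist_triples = collection_triples(BEATLES, CD_ARTIST, members)
for triple in artist_triples:
    print(triple)
```

Containers such as rdf:Bag have no such terminator, so another document could add further members. The final rdf:nil is what lets a collection state "these members and no others".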
