RDF Semantic Web
Dec 29, 2015
Sources
W3C Sources
The 2 primary authoritative sources for RDF
Resource Description Framework (RDF): Concepts and Abstract Syntax, Graham Klyne and Jeremy J. Carroll, Editors, W3C Recommendation, 10 February 2004
http://www.w3.org/TR/rdf-concepts/
RDF Semantics, Patrick Hayes, Editor, W3C Recommendation, 10 February 2004
http://www.w3.org/TR/rdf-mt/
The authoritative source for the XML serialization of RDF
RDF/XML Syntax Specification (Revised), Dave Beckett, Editor, W3C Recommendation, 10 February 2004
http://www.w3.org/TR/rdf-syntax-grammar/
The Primer
RDF Primer, Frank Manola and Eric Miller, Editors, W3C Recommendation, 10 February 2004
http://www.w3.org/TR/2004/REC-rdf-primer-20040210/
We take considerable material (including examples) from this
Secondary Sources
Shelley Powers, Practical RDF, O'Reilly, 2003
One of our main sources (including examples)
It follows the XML serialization
Dean Allemang and James Hendler, Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL, 2nd ed., Morgan Kaufmann, 2011
Probably the best general book on the Semantic Web from the point of view of methodology and standards
But it uses the N3 serialization of RDF
Introduction to RDF
Consider the article “Architeuthis Dux” (on the giant squid) by Shelley Powers at http://burningbird.net/articles/monster3.htm
(The link is now out of date.)
A Google search for “giant squid” returns a huge number of hits
Uses keyword search instead of searching in the context of an interest
What’s missing in keyword-based classification of web resources is the use of statements about a resource, e.g.,
The article’s title is “Architeuthis Dux”
The article’s author is Shelley Powers
The article is part of a series
A related article is …
The article is about the giant squid and its place in legends
RDF provides a mechanism for recording statements about resources
The RDF Triple The RDF triple allows both human and machine consumption of a fact
Expressed in English as a simple subject-predicate sentence
The 3 pieces:
Subject: what the fact is about
Property (or property type): the aspect of the subject focused on
Object: the value associated with the property for the given subject
The property & object together correspond to the predicate in English
E.g., in the sentence
The title of the article is “Architeuthis Dux”
Subject = the article
Property = title
Object = “Architeuthis Dux”
In RDF, the subject is a URIref or a blank node (bnode or anonymous node—see below)
A URIref is a URI optionally followed by a ‘#’ and a fragment identifier
Covered in more detail below
URI: Uniform Resource Identifier
URL: Uniform Resource Locator
All URLs are URIs that locate resources on the Web
Some URIs are just unique identifiers and don’t locate something
The syntax and uniqueness are the important aspects of a URI
As a first cut, express the fact in question as the triple
http://burningbird.net/articles/monster3.htm, title, “Architeuthis Dux”
In fact,
the predicate must be a URIref
the object is a URIref, bnode, or literal (string in the simplest, most common case)
Regardless of how an RDF triple is serialized (expressed in text), it satisfies the following properties
It’s a triple, made up of subject, predicate, and object
It’s a complete and unique fact
It can be joined with other triples but retains its unique meaning
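The position rules above can be sketched as a small data structure. This is only an illustration, not an RDF library: the class names are hypothetical, and the use of the Dublin Core title URIref for the "title" property is an assumption.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URIRef:
    uri: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class Triple:
    subject: Union[URIRef, BNode]
    predicate: URIRef
    obj: Union[URIRef, BNode, Literal]


def make_triple(subject, predicate, obj) -> Triple:
    """Build a triple, enforcing which kinds of node may fill each position."""
    if not isinstance(subject, (URIRef, BNode)):
        raise TypeError("subject must be a URIref or a bnode")
    if not isinstance(predicate, URIRef):
        raise TypeError("predicate must be a URIref")
    if not isinstance(obj, (URIRef, BNode, Literal)):
        raise TypeError("object must be a URIref, bnode, or literal")
    return Triple(subject, predicate, obj)


# The running example; the dc:title URIref is an assumed choice of predicate.
title = make_triple(
    URIRef("http://burningbird.net/articles/monster3.htm"),
    URIRef("http://purl.org/dc/elements/1.1/title"),
    Literal("Architeuthis Dux"),
)
```

A literal in subject or predicate position is rejected with a TypeError, which mirrors the rule that literals represent objects only.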
A URIref node consists of a URI reference giving an identifier unique to the node
The node is drawn enclosed in an ellipse enclosing the URI
A bnode is shown with an empty ellipse
Specific implementations of the graph (e.g., generated by the RDF Validator) draw an ellipse enclosing a generated identifier
A literal has a character string and optionally a language tag and data type
The node is drawn as a rectangle containing the literal
Literal values represent RDF objects only
An arc is labeled with an RDF predicate
It runs from the subject (the resource) to the object
The RDF graph for the running example (1 triple)
Most RDF tools generate a unique identifier for each bnode
E.g., the following is generated by the W3C RDF Validator
States that resource http://en.wikipedia.org/wiki/Tony_Benn
has title “Tony Benn”
has publisher “Wikipedia”
has a topic that has a type identified by http://xmlns.com/foaf/0.1/Person and has a name “Tony Benn”
Identifier genid:A19581 has been generated for the bnode
The following are the corresponding triples
The bnode occurs in both subject (twice) and object (once) position
A bnode is similar to a relative pronoun and lets us link triples
Equivalently, a bnode corresponds to an existentially quantified variable
The identifier generated for a bnode changes from run to run
Not a problem since bnodes are placeholders
URIs
URIs provide a common syntax for naming a resource regardless of the protocol used to access the resource
The syntax can be extended to meet new needs and to include new protocols
A URIref can optionally include a fragment identifier
Separated from the URI proper by a ‘#’
E.g., in
http://burningbird.net/articles/monster3.htm#introduction
‘http://burningbird.net/articles/monster3.htm’ is the URI
‘introduction’ is the fragment identifier
‘http://’ is the protocol
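As a quick check of this decomposition, Python's standard urllib.parse module can split the example URIref into its parts. Note that the library reports the scheme as http, without the "://" separator.

```python
from urllib.parse import urldefrag, urlsplit

uriref = "http://burningbird.net/articles/monster3.htm#introduction"

# urldefrag separates the URI proper from the fragment identifier.
uri, fragment = urldefrag(uriref)

# urlsplit exposes the scheme (the protocol name).
scheme = urlsplit(uriref).scheme

print(uri)       # http://burningbird.net/articles/monster3.htm
print(fragment)  # introduction
print(scheme)    # http
```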
A URI is only an identifier
The object identified needn’t exist on the Web
You needn’t specify a resolvable protocol
Could use something like a UUID
Universally Unique Identifier: standardized by OSF so distributed systems can uniquely identify info without significant central coordination
A URN (Uniform Resource Name) is also an instance of a URI
Conforms to a special scheme
URIrefs are absolute or partial URIs
A partial URI lacks the protocol and perhaps some of the initial part of the rest of the URI—e.g.,
monster3.htm
To derive a URI from a partial URI reference, it is merged with an absolute base URI—e.g.,
http://burningbird.net/articles/
If the base is not specified, it’s assumed to be the base of the containing document
The absolute URI of that document without the file name at the end
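The merge of a partial URI with a base can be tried out with urljoin from Python's standard library. In this sketch the containing document monster2.htm is a hypothetical name chosen for illustration.

```python
from urllib.parse import urljoin

# Explicit base URI, as in the example above.
base = "http://burningbird.net/articles/"
explicit = urljoin(base, "monster3.htm")

# No explicit base: the containing document's own URI is used, and its
# trailing file name is dropped before merging.
document = "http://burningbird.net/articles/monster2.htm"  # hypothetical document
implicit = urljoin(document, "monster3.htm")

print(explicit)  # http://burningbird.net/articles/monster3.htm
print(implicit)  # http://burningbird.net/articles/monster3.htm
```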
The N-Triples Serialization
N-Triples is the basis of the notation used in the Primer
Write each triple in the form
subject predicate object .
Note the period at the end
May span more than one line
URIrefs enclosed in <…>s
Consider the following English statements
http://www.example.org/index.html has a creator whose value is John Smith
http://www.example.org/index.html has a creation-date whose value is August 16, 1999
http://www.example.org/index.html has a language whose value is English
Assume we can identify John Smith with the URI
http://www.example.org/staffid/85740
We have the following graph
In N-Triples notation, express this as
<http://www.example.org/index.html>
<http://purl.org/dc/elements/1.1/creator>
<http://www.example.org/staffid/85740> .
<http://www.example.org/index.html>
<http://www.example.org/terms/creation-date>
"August 16, 1999" .
<http://www.example.org/index.html>
<http://purl.org/dc/elements/1.1/language> "en" .
The triples notation requires a node to be separately identified for each statement it appears in
Unlike the drawn graph but like the original statements
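A minimal sketch of how such triples might be written out programmatically. The tagged-tuple representation of terms is an assumption made for this example, and the escaping covers only backslashes and double quotes, not the full set the N-Triples grammar defines.

```python
def nt_term(term):
    """Render one term given as ('uri', value) or ('literal', value)."""
    kind, value = term
    if kind == "uri":
        return f"<{value}>"
    if kind == "literal":
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise ValueError(f"unknown term kind: {kind}")


def to_ntriples(triples):
    """Emit one 'subject predicate object .' statement per triple."""
    return "\n".join(
        " ".join(nt_term(term) for term in triple) + " ." for triple in triples
    )


page = ("uri", "http://www.example.org/index.html")
triples = [
    (page, ("uri", "http://purl.org/dc/elements/1.1/creator"),
     ("uri", "http://www.example.org/staffid/85740")),
    (page, ("uri", "http://www.example.org/terms/creation-date"),
     ("literal", "August 16, 1999")),
    (page, ("uri", "http://purl.org/dc/elements/1.1/language"),
     ("literal", "en")),
]

output = to_ntriples(triples)
print(output)
```

The page's URIref is written out again in every statement, which is the repetition the triples notation requires.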
The full triples notation requires that URIrefs be written out completely
For convenience, use a shorthand
Substitute an XML qualified name (or QName) without <…>s as an abbreviation for a full URIref
A QName contains a prefix that has been assigned to a namespace URI, followed by a colon, and then a local name
Form the full URIref from the QName by appending the local name to the namespace URI assigned to the prefix
E.g., if the QName prefix foo is assigned to the namespace URI
http://example.org/somewhere/
then QName foo:bar is shorthand for the URIref
http://example.org/somewhere/bar
We’ll use the following prefixes
prefix dc:, namespace URI: http://purl.org/dc/elements/1.1/
prefix ex:, namespace URI: http://www.example.org/
prefix exstaff:, namespace URI: http://www.example.org/staffid/
prefix exterms:, namespace URI: http://www.example.org/terms/
Using the shorthand, the previous set of triples becomes
ex:index.html dc:creator exstaff:85740 .
ex:index.html exterms:creation-date "August 16, 1999" .
ex:index.html dc:language "en" .
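The QName shorthand is plain string concatenation, as this sketch shows. The function name expand_qname is hypothetical; the prefix table is the one listed above.

```python
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://www.example.org/",
    "exstaff": "http://www.example.org/staffid/",
    "exterms": "http://www.example.org/terms/",
}


def expand_qname(qname, prefixes=PREFIXES):
    """Append the local name to the namespace URI assigned to the prefix."""
    prefix, _, local = qname.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unbound prefix: {prefix}")
    return prefixes[prefix] + local


print(expand_qname("ex:index.html"))   # http://www.example.org/index.html
print(expand_qname("dc:creator"))      # http://purl.org/dc/elements/1.1/creator
print(expand_qname("exstaff:85740"))   # http://www.example.org/staffid/85740
```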
RDF uses URIrefs, not words, to name things in statements
So RDF refers to a set of URIrefs (especially if intended for a specific purpose) as a vocabulary
A common namespace URIref is often chosen for all terms in a vocabulary
Typically a URIref under the control of whoever defines the vocabulary
URIrefs contained in the vocabulary are formed by appending individual local names to the common URIref
Gives a set of URIrefs with a common prefix
E.g., in the previous example, organization example.org might define
a vocabulary of URIrefs starting with prefix
http://www.example.org/terms/
for terms it used in its business
E.g., "creation-date" or "product"
a vocabulary of URIrefs starting with
http://www.example.org/staffid/
to identify its employees
RDF uses the same approach to define its own vocabulary of terms with special meanings
The URIrefs in this vocabulary begin with
http://www.w3.org/1999/02/22-rdf-syntax-ns#
Conventionally associated with the QName prefix rdf:
But RDF does not assume
there’s a relationship between URIrefs with a common leading prefix
URIrefs with different leading prefixes aren’t part of the same vocabulary
URIrefs from different vocabularies can be freely mixed in RDF graphs
An organization may (but needn’t) use a vocabulary's namespace URIref as the URL of a Web resource giving info about that vocabulary
E.g., we use prefix dc: associated with the namespace URIref
http://purl.org/dc/elements/1.1/
Accessing this URIref in a browser gives info about the Dublin Core vocabulary (actually, an RDF document)
No restriction on how many statements using a given URIref as predicate can be used in a graph to describe the same resource
E.g., if resource ex:index.html had been created by several staff members besides John Smith, we might have all the statements
ex:index.html dc:creator exstaff:85740 .
ex:index.html dc:creator exstaff:27354 .
ex:index.html dc:creator exstaff:00816 .
Advantages of using URIrefs (not, e.g., character strings) to identify properties
Avoids ambiguity
Lets us treat a property itself as a resource
Can add new RDF statements with the property's URIref as the subject
Record additional info about it—e.g., the English description of what example.org means by “name”
Structured Property Values and Blank Nodes
In describing John’s address, might write out the entire address as a triple, e.g.,
exstaff:85740
exterms:address
"1501 Grant Avenue, Bedford, Massachusetts 01730" .
But often want the address as a structure with separate street, city, state, and postal code values
Structured info is represented in RDF by considering the aggregate thing as a resource
Then make statements about that new resource
E.g., to break up John’s address into its component parts,
create a new node to represent the concept of John’s address
This node is identified by a new URIref, say
http://www.example.org/addressid/85740
Abbreviated as exaddressid:85740
Write RDF statements with that node as subject
See the graph and triples on the next slide
exstaff:85740 exterms:address exaddressid:85740 .
exaddressid:85740 exterms:street "1501 Grant Avenue" .
exaddressid:85740 exterms:city "Bedford" .
exaddressid:85740 exterms:state "Massachusetts" .
exaddressid:85740 exterms:postalCode "01730" .
But this way of representing structured info can involve generating numerous "intermediate" URIrefs (e.g., exaddressid:85740 )
Perhaps no need to refer to such concepts outside the given graph
So no need for a URIref
And, in the drawing of the graph, the URIref assigned to identify John’s address isn’t needed
The graph can just as easily be drawn as follows
In the drawing, the bnode itself provides the connectivity
Eliminates the need for a URIref
But, for triples, need an explicit identifier for the bnode
Also, a complex graph might contain more than 1 bnode
Need a way to differentiate between them in triples
So triples use blank node (or bnode) identifiers, of the form _:name
E.g., in our example, use a bnode identifier _:johnaddress to refer to the bnode
The triples are now
exstaff:85740 exterms:address _:johnaddress .
_:johnaddress exterms:street "1501 Grant Avenue" .
_:johnaddress exterms:city "Bedford" .
_:johnaddress exterms:state "Massachusetts" .
_:johnaddress exterms:postalCode "01730" .
Unlike URIrefs and literals, bnode identifiers aren’t parts of the RDF graph
They have significance only within the triples representing a single graph
If a node will be referenced from outside the graph, assign a URIref to identify it
Bnode identifiers represent (blank) nodes, not arcs, in triples
So bnode identifiers may not be used as predicates in triples
RDF directly represents only binary relationships,
e.g. between John Smith and the literal representing his address
Representing the relationship between John and the group of address components involves an n-ary relationship (here n=5) between
John and
the street, city, state, and postal code components
Break this n-way relationship into a group of binary relationships
For each n-ary relationship, choose 1 participant as the subject of the relationship (here John)
Create a bnode to represent the rest of the relationship (here John's address)
Represent the remaining participants (here, e.g., the city) as separate properties of the new resource represented by the bnode
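The three steps above can be sketched as a small helper. The function names and the generated labels (_:b0, _:b1, ...) are illustrative; any labels unique within the graph would serve equally well.

```python
from itertools import count

_bnode_ids = count()  # source of fresh bnode labels


def fresh_bnode():
    return f"_:b{next(_bnode_ids)}"


def decompose(subject, link_property, components):
    """Split an n-ary relationship into binary triples via one bnode.

    subject       -- the participant chosen as subject (here John)
    link_property -- the property linking the subject to the bnode
    components    -- remaining participants, as property -> value
    """
    node = fresh_bnode()
    triples = [(subject, link_property, node)]
    triples += [(node, prop, value) for prop, value in components.items()]
    return triples


address = {
    "exterms:street": '"1501 Grant Avenue"',
    "exterms:city": '"Bedford"',
    "exterms:state": '"Massachusetts"',
    "exterms:postalCode": '"01730"',
}

address_triples = decompose("exstaff:85740", "exterms:address", address)
for s, p, o in address_triples:
    print(s, p, o, ".")
```

The 5-way relationship becomes five binary triples: one linking John to the bnode and one per address component.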
Bnodes also let us make more accurate statements about resources that
lack URIs but
are described in terms of relationships with resources that have URIs
E.g., making statements about Jane, we might use a URI based on her email address (mailto:jane@example.org) as her URI
But this has problems if we also want to make statements about her mailbox
There’s a similar problem when a company’s webpage URI is used as the URI for the company itself
Fundamental problem: using Jane's mailbox as a stand-in for Jane is not really accurate
When Jane lacks a URI, a bnode provides a more accurate way of modeling this situation
Represent Jane by a bnode used as subject of a statement with exterms:mailbox as the property and the URIref mailto:jane@example.org as its value
The blank node could also be described with an rdf:type property (see below) having a value of exterms:Person, an exterms:name property having a value of "Jane Smith", etc.
_:jane exterms:mailbox <mailto:jane@example.org> .
_:jane rdf:type exterms:Person .
_:jane exterms:name "Jane Smith" .
_:jane exterms:empID "23748" .
_:jane exterms:age "26" .
Assume we know that an email address uniquely identifies someone at example.org
That fact can still be used to associate info about that person from multiple sources
E.g., suppose some RDF is found on the Web describing a book whose author's contact info is mailto:jane@example.org
Combine this info with the previous triples to conclude that the author's name is Jane Smith
Saying
The author of the book is mailto:jane@example.org
is shorthand for
The author of the book is someone whose mailbox is mailto:jane@example.org
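A sketch of how the two sources might be combined by joining on the mailbox. The book graph, its bnode label _:a1, and the mailbox address are illustrative, and the join is only justified under the stated assumption that a mailbox identifies exactly one person.

```python
MBOX = "exterms:mailbox"

staff_graph = [
    ("_:jane", MBOX, "<mailto:jane@example.org>"),
    ("_:jane", "exterms:name", '"Jane Smith"'),
]

# Hypothetical RDF found elsewhere on the Web; it uses its own bnode label.
book_graph = [
    ("ex2terms:book78354", "ex2terms:author", "_:a1"),
    ("_:a1", MBOX, "<mailto:jane@example.org>"),
]


def author_names(book, book_graph, staff_graph):
    """Follow author -> mailbox in one graph, then mailbox -> name in the other."""
    authors = {o for s, p, o in book_graph if s == book and p == "ex2terms:author"}
    mailboxes = {o for s, p, o in book_graph if s in authors and p == MBOX}
    owners = {s for s, p, o in staff_graph if p == MBOX and o in mailboxes}
    return {o for s, p, o in staff_graph if s in owners and p == "exterms:name"}


names = author_names("ex2terms:book78354", book_graph, staff_graph)
print(names)  # {'"Jane Smith"'}
```

The bnode labels _:jane and _:a1 differ, yet the shared mailbox lets the two descriptions be linked.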
Such a use of bnodes also helps avoid inappropriate use of literals
E.g., the publisher, describing Jane’s book but lacking a URIref identifying her, might write (using its own ex2terms: vocabulary):
ex2terms:book78354 rdf:type ex2terms:Book .
ex2terms:book78354 ex2terms:author "Jane Smith" .
But the book’s author isn’t the string "Jane Smith" but a person whose name is Jane Smith
Give the same info more accurately as
ex2terms:book78354 rdf:type ex2terms:Book .
ex2terms:book78354 ex2terms:author _:author78354 .
_:author78354 rdf:type ex2terms:Person .
_:author78354 ex2terms:name "Jane Smith" .
In English:
Resource ex2terms:book78354 is of type ex2terms:Book, and its author is a resource of type ex2terms:Person, whose name is Jane Smith.
Typed Literals
In an earlier example, Jane’s age is given as the plain literal "26"
It’s actually 26 years
But the units info (years) isn’t explicitly given
It’s assumed that anyone accessing the property value knows the units
But, in the wider context of the Web, this assumption isn’t safe
Programming languages and database systems provide this additional info by associating a datatype with the literal
RDF uses typed literals
A typed literal is formed by pairing a string with a URIref that identifies a particular datatype
This gives a single literal node in the graph with the pair as the literal
The value represented by the typed literal is the value that the specified datatype associates with the specified string
Consider the triple
<http://www.example.org/staffid/85740>
<http://www.example.org/terms/age>
"27"^^<http://www.w3.org/2001/XMLSchema#integer> .
Using the QName simplification for URIs, we have
exstaff:85740 exterms:age "27"^^xsd:integer .
The following is the RDF graph
In an earlier example, the value of a page's exterms:creation-date property was written as the plain literal "August 16, 1999"
Using a typed literal, we have the triple:
ex:index.html
exterms:creation-date
"1999-08-16"^^xsd:date .
See the following graph
RDF doesn’t define its own set of datatypes
RDF typed literals just provide a way to indicate, for a given literal, what datatype should be used to interpret it
The datatypes are defined externally to RDF and identified by their datatype URIs
(The 1 exception is a built-in datatype with the URIref rdf:XMLLiteral to represent XML content as a literal value)
This gives RDF the flexibility to directly represent info from different sources without the need to perform type conversions between these sources and a native set of datatypes
The most commonly used datatypes are from XML Schema (prefix xsd:, namespace URI: http://www.w3.org/2001/XMLSchema#)
Some of the XML Schema primitive datatypes are
string
boolean
decimal
float
double
duration
dateTime
time
date
See
Paul V. Biron and Ashok Malhotra (Eds.), XML Schema Part 2: Datatypes, 2nd Ed., W3C Recommendation, 2004, http://www.w3.org/TR/xmlschema-2/
RDF’s conceptual framework defines a datatype as consisting of:
A set of values (the value space) that literals of the datatype are intended to represent
E.g., for xsd:date, this set is a set of dates
A set of character strings (the lexical space) that the datatype uses to represent its values
This set determines which character strings can legally be used
E.g., xsd:date defines 1999-08-16 (not, e.g., August 16, 1999) as legal
A lexical-to-value mapping from the lexical space to the value space
E.g., the string 1999-08-16 is mapped for datatype xsd:date to the date August 16, 1999
Note: The same character string may represent different values for different datatypes
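The three parts of a datatype can be sketched as a registry of lexical-to-value functions, where a string outside the lexical space is rejected. This is a simplification that assumes only three datatypes; the xsd:date pattern here ignores timezone suffixes and other forms the real lexical space allows.

```python
import re
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"


def parse_integer(lexical):
    if not re.fullmatch(r"[+-]?[0-9]+", lexical):
        raise ValueError(f"{lexical!r} is not in the lexical space of xsd:integer")
    return int(lexical)


def parse_date(lexical):
    # Simplified lexical space: plain YYYY-MM-DD only.
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", lexical):
        raise ValueError(f"{lexical!r} is not in the lexical space of xsd:date")
    return date.fromisoformat(lexical)


# Datatype URI -> lexical-to-value mapping.
DATATYPES = {
    XSD + "integer": parse_integer,
    XSD + "date": parse_date,
    XSD + "string": str,
}


def value_of(lexical, datatype_uri):
    """Interpret a typed literal: apply the datatype's mapping to the string."""
    return DATATYPES[datatype_uri](lexical)


print(value_of("27", XSD + "integer"))       # 27 (an integer)
print(value_of("27", XSD + "string"))        # 27 (a string)
print(value_of("1999-08-16", XSD + "date"))  # 1999-08-16 (a date value)
```

Calling value_of("August 16, 1999", XSD + "date") raises ValueError, since that string lies outside the lexical space of xsd:date.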
Some built-in XML Schema datatypes aren’t suitable for RDF
E.g., xsd:duration lacks a well-defined value space
The interpretation of a typed literal in an RDF graph must be done by software written to correctly process
not only RDF
but also the typed literal's datatype
XML Schema datatypes are treated just like any other datatypes
The Notation 3 (N3) Serialization
The Notation 3 RDF (or just N3) serialization was developed by Tim Berners-Lee
An N3 document begins with a preamble that defines the bindings between (local) QName prefixes and (global) URIs
E.g.,
@prefix mfg: <http://www.exs.com/Chap3/Manufacturing.rdf#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
Note the #s and the periods
To express a triple, list the resources, using QNames, in subject-predicate-object order
Terminate it with a period
E.g.,
mfg:Product1 rdf:type mfg:Product .
The predicate rdf:type gives the type of the subject
N3 has a compact way to present triples with a common subject
The 1st triple is in subject-predicate-object order but ends with a “;”, not a period
Subsequent triples (with the same subject) omit the subject and also end with a “;”
Except for the last, which ends with a period
E.g.,
mfg:Product1 rdf:type mfg:Product;
mfg:Product_SKU "FB3524";
mfg:Product_Available "23" .
Validate N3 code with the online RDF Validator and Converter at
http://www.rdfabout.com/demo/validator/
In the Input Format menu, select Notation 3 (or N-Triples/Turtle)
Then paste your N3 code into the text box
Click the Validate! button
When we enter the above 3 lines preceded by the 2 @prefix lines given earlier, the underlying triples are identified as
<http://www.exs.com/Chap3/Manufacturing.rdf#Product1>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://www.exs.com/Chap3/Manufacturing.rdf#Product> .
<http://www.exs.com/Chap3/Manufacturing.rdf#Product1>
<http://www.exs.com/Chap3/Manufacturing.rdf#Product_SKU>
"FB3524" .
<http://www.exs.com/Chap3/Manufacturing.rdf#Product1>
<http://www.exs.com/Chap3/Manufacturing.rdf#Product_Available>
"23" .
Note: the RDF Validator and Converter at rdfabout.com is no longer available
To validate N3 code, do the following
Go to Apache Any23 at http://any23.org/
Copy your N3 into the window below the heading “Convert copy & pasted document”
Use the following settings:
Input format: auto-detect
Output format: rdfxml
Validation: validate
Report: checked
Annotate: un-checked
Click Convert
It replaces the page with a page of unformatted RDF/XML
Go to the XML Formatter at http://www.freeformatter.com/xml-formatter.html
Copy the unformatted RDF/XML, starting at <?xml, into XML Formatter in the window below "Option 1"
Click "Format XML"
Unformatted RDF/XML replaced by formatted RDF/XML in window
Go to the W3C RDF Validator at http://www.w3.org/RDF/Validator/
Copy formatted RDF/XML into window below "Check by Direct Input"
Below "Display Result Options," specify the following settings
Triples and/or Graph: Triples and Graph
Graph format: PNG – embed
Click "Parse RDF"
In a new page, a table of triples and a drawn graph
If several triples have the same subject and predicate,
write the common subject
then write the common predicate
then list the various objects separated by commas
E.g.,
lit:Shakespeare b:hasChild b:Susanna, b:Judith, b:Hamnet .
The 3 triples represented here are
lit:Shakespeare b:hasChild b:Susanna .
lit:Shakespeare b:hasChild b:Judith .
lit:Shakespeare b:hasChild b:Hamnet .
This assumes we have a preamble including, e.g.,
@prefix lit: <http://www.exs.com/Chap3/Shakespeare.rdf#> .
@prefix b: <http://www.exs.com/Chap3/Biography.rdf#> .
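Both abbreviations (the “;” for a shared subject and the “,” for a shared subject and predicate) amount to grouping triples. The sketch below goes in the reverse direction, from a flat list of triples to the compact form. The function name to_n3 is hypothetical, and the output omits the @prefix preamble.

```python
def to_n3(triples):
    """Group triples by subject, then predicate, and emit the ';' and ',' forms."""
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)

    statements = []
    for s, by_predicate in grouped.items():
        parts = [f"{p} {', '.join(objs)}" for p, objs in by_predicate.items()]
        statements.append(f"{s} " + " ;\n    ".join(parts) + " .")
    return "\n".join(statements)


triples = [
    ("lit:Shakespeare", "b:hasChild", "b:Susanna"),
    ("lit:Shakespeare", "b:hasChild", "b:Judith"),
    ("lit:Shakespeare", "b:hasChild", "b:Hamnet"),
    ("mfg:Product1", "rdf:type", "mfg:Product"),
    ("mfg:Product1", "mfg:Product_SKU", '"FB3524"'),
    ("mfg:Product1", "mfg:Product_Available", '"23"'),
]

compact = to_n3(triples)
print(compact)
```

The first statement printed is the single line lit:Shakespeare b:hasChild b:Susanna, b:Judith, b:Hamnet . and the second spreads the three mfg:Product1 properties over three lines joined by “;”.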
N3 provides several intuitive abbreviations
One is to use the word “a” for “rdf:type”, e.g.,
lit:Shakespeare rdf:type lit:Playwright .
abbreviates to
lit:Shakespeare a lit:Playwright .
corresponding to the English
Shakespeare (is) a Playwright.
Indicate a bnode in N3 by putting all triples with it as subject in […]
E.g., modifying the N-Triples example of John Smith and his address
@prefix exstaff: <http://www.example.org/staffid/> .
@prefix exterms: <http://www.example.org/terms/> .
exstaff:85740 exterms:address
[ exterms:street "1501 Grant Avenue";
exterms:city "Bedford";
exterms:state "Massachusetts";
exterms:postalCode "01730"] .
As before, all triples with the common subject (now a bnode) end with “;” except the last
But now the last doesn’t end with any punctuation
Note the period after the “]”
It’s customary to leave a space after the “[” to indicate a missing subject
N3 examples with bnodes can get considerably more complex
Let [ pol ], where pol is a list of predicate-object pairs separated by “;”s, mean that
there exists an x such that x has each of the attributions in the list
This notation may appear as either the subject or object
For the following, assume we have the prefix definitions
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
E.g., consider
[ foaf:firstName "Herman"] dc:creator [ dc:title "Moby Dick"] .
See the RDF graph
In semi-logical notation, this says
exists x, y: foaf:firstName(x, "Herman") & dc:creator(x, y) & dc:title(y, "Moby Dick")
[Graph: two bnodes joined by a dc:creator arc; the first has foaf:firstName "Herman", the second has dc:title "Moby Dick"]
or, in English, reading dc:creator loosely as “wrote”: Some person who has first name “Herman” wrote a book entitled “Moby Dick”
The previous expression is equivalent to (has the same RDF as)
[ foaf:firstName "Herman" ; dc:creator [ dc:title "Moby Dick"]] .
At the top level, this has 2 predicate-object pairs, separated by a “;”
Both forms involve the following triples, which the RDF Validator and Converter gives as
_:bnode0 <http://xmlns.com/foaf/0.1/firstName> "Herman" .
_:bnode1 <http://purl.org/dc/elements/1.1/title> "Moby Dick" .
_:bnode0 <http://purl.org/dc/elements/1.1/creator> _:bnode1 .
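The existential reading can be checked mechanically: the statement holds of a graph if some assignment of nodes to x and y makes every conjunct a triple of the graph. The matcher below is a minimal sketch with hypothetical names; it writes terms as QNames for brevity and marks variables with a leading “?”.

```python
def match(patterns, graph, binding=None):
    """Yield each assignment of ?variables that turns every pattern into a triple of graph."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for want, have in zip(first, triple):
            if want.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(want, have) != have:
                    ok = False
                    break
            elif want != have:
                ok = False
                break
        if ok:
            yield from match(rest, graph, candidate)


graph = [
    ("_:bnode0", "foaf:firstName", '"Herman"'),
    ("_:bnode1", "dc:title", '"Moby Dick"'),
    ("_:bnode0", "dc:creator", "_:bnode1"),
]

patterns = [
    ("?x", "foaf:firstName", '"Herman"'),
    ("?x", "dc:creator", "?y"),
    ("?y", "dc:title", '"Moby Dick"'),
]

solutions = list(match(patterns, graph))
print(solutions)  # [{'?x': '_:bnode0', '?y': '_:bnode1'}]
```

Because the bnode labels only appear as bindings, renaming _:bnode0 and _:bnode1 consistently would change the printed labels but not whether a solution exists.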
This shows only part of N3
We’ll cover more later
Turtle (Terse RDF Triple Language) is another serialization format for RDF
A subset of N3
See
David Beckett and Tim Berners-Lee, Turtle - Terse RDF Triple Language, W3C Team Submission 14 January 2008
http://www.w3.org/TeamSubmission/2008/SUBM-turtle-20080114/
Dublin Core
The Dublin Core metadata element set is a standard for cross-domain info resource description (i.e., it isn’t specific to any particular topic)
Metadata: data about data
Widely used to describe digital materials such as video, sound, image, text, and composite media like web pages
The most widely used RDF vocabulary
“Core”: its elements are broad and generic, usable for describing a wide range of resources
Originated from the OCLC/NCSA Metadata Workshop hosted in 1995 by the OCLC, a library consortium based in Dublin, Ohio
NCSA is the National Center for Supercomputing Applications (Univ. of Illinois)
Note: Documents are resources, and librarians track documents
OCLC: Online Computer Library Center, Inc.
A nonprofit, membership, computer library service and research organization dedicated to the public purpose of
furthering access to the world's information and
reducing info costs
The Dublin Core Metadata Initiative (DCMI)
http://dublincore.org/
An organization providing an open forum for the development of interoperable online metadata standards that support a broad range of purposes and business models
Activities include
consensus-driven working groups,
global conferences and workshops,
standards liaison, and
educational efforts to promote widespread acceptance of metadata standards and practices
DCMI Usage Board, DCMI Metadata Terms (Recommendation), 2008, http://dublincore.org/documents/dcmi-terms/
Specification of all metadata terms maintained by the DCMI, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes
15 elements (properties, for simple bibliographic description) defined in Dublin Core Metadata Element Set (DCMES) Version 1.1, “simple Dublin Core”
contributor
coverage
creator
date
description
format
identifier
language
publisher
relation
rights
source
subject
title
type
The URIs all extend the base URI http://purl.org/dc/elements/1.1/
The QName prefix conventionally associated with this URI is dc:
E.g., the URI for contributor is
http://purl.org/dc/elements/1.1/contributor
Or (using the QName prefix)
dc:contributor
Recently, a larger set of “terms” was defined in the more comprehensive document "DCMI Metadata Terms"
The URIs of terms all extend base URI http://purl.org/dc/terms/
The QName prefix conventionally associated with this URI is dcterms:
Terms refine elements
E.g., abstract refines description, accessRights refines rights
So as not to affect the conformance of existing implementations of simple Dublin Core in RDF,
15 new properties with names identical to those of DCMES Version 1.1 were created
These are subproperties of the corresponding properties of DCMES Version 1.1
Terms:
abstract, accessRights, accrualMethod, accrualPeriodicity, accrualPolicy, alternative, audience, available, bibliographicCitation, conformsTo, contributor, coverage, created, creator, date, dateAccepted, dateCopyrighted, dateSubmitted, description, educationLevel, extent, format, hasFormat, hasPart, hasVersion, identifier, instructionalMethod, isFormatOf, isPartOf, isReferencedBy, isReplacedBy, isRequiredBy, issued, isVersionOf, language, license, mediator, medium, modified, provenance, publisher, references, relation, replaces, requires, rights, rightsHolder, source, spatial, subject, tableOfContents, temporal, title, type, valid
E.g., the URI for abstract is http://purl.org/dc/terms/abstract
Or (using the QName prefix) dcterms:abstract
See also
DCMI Usage Board, Dublin Core Metadata Element Set, Version 1.1 (Recommendation), 2008, http://dublincore.org/documents/dces/
The full set of vocabularies, DCMI Metadata Terms, also includes
sets of resource classes (including the DCMI Type Vocabulary),
vocabulary encoding schemes, and
syntax encoding schemes
This relates to RDF Schema (RDFS)
We’ll return to the Dublin Core when we cover RDFS-Plus
For how you’re supposed to use it, see
Diane Hillmann, Using Dublin Core, Dublin Core Metadata Initiative, 2005
http://dublincore.org/documents/usageguide/
FOAF (Friend of a Friend)
A machine-readable ontology describing persons, their activities and their relations to other people and objects
Uses RDF
Anyone can use FOAF to describe him/herself (generally in his/her homepage)
Groups can describe social networks without a centralized database
FOAF Project homepage: http://www.foaf-project.org/
FOAF Wiki Main Page: http://wiki.foaf-project.org/w/Main_Page
Dan Brickley and Libby Miller, Introducing FOAF, 2000 (updated 2008), http://www.foaf-project.org/original-intro
The following is largely derived from
Dan Brickley and Libby Miller, FOAF Vocabulary Specification 0.91, Namespace Document, 2 November 2007, http://xmlns.com/foaf/spec/
The FOAF vocabulary is (and always will be) identified by the namespace URI http://xmlns.com/foaf/0.1/
The QName prefix conventionally associated with this URI is foaf:
The initial focus of FOAF has been on the description of people
People are what link together most of the other kinds of things we describe in the Web: they make documents, attend meetings, are depicted in photos, etc.
Because natural language supports such nuanced concepts relating to people,
defining an adequate vocabulary is a major challenge
Why FOAF Uses RDF
The FOAF vocabulary can’t incorporate everything we might say about people without becoming unmanageably large
RDF lets FOAF mix together different descriptive vocabularies (e.g., Dublin Core) consistently
Vocabularies can be created by different communities and mixed together as required
No need for a centralized agreement on how terms from different vocabularies can be written down
A Bit of RDF-S
We can define classes (to which individuals belong)
By convention, classes (unlike properties) are capitalized, e.g., foaf:Person
The most general class is taken from OWL: http://www.w3.org/2002/07/owl#Thing
The by now familiar literal is from the RDF-S namespace: http://www.w3.org/2000/01/rdf-schema#Literal
Classes have subclasses, properties sub-properties
A property has
a domain (the class of the subjects) and
a range (the class of the objects)
E.g., foaf:knows has domain foaf:Person and range foaf:Person
It addresses people knowing people
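In RDF-S, domain and range declarations license type inferences: the subject of a statement belongs to the property's domain class and the object to its range class. A minimal sketch, in which the individuals _:alice and _:bob are hypothetical:

```python
def infer_types(triples, domains, ranges):
    """Derive rdf:type triples from domain and range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, "rdf:type", domains[p]))
        if p in ranges:
            inferred.add((o, "rdf:type", ranges[p]))
    return inferred


domains = {"foaf:knows": "foaf:Person"}
ranges = {"foaf:knows": "foaf:Person"}

triples = [("_:alice", "foaf:knows", "_:bob")]  # hypothetical individuals

types = infer_types(triples, domains, ranges)
for triple in sorted(types):
    print(triple)
```

From the single foaf:knows statement, both _:alice and _:bob are inferred to be of type foaf:Person.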
Some properties are functional: given a domain element, the property identifies a unique element of the range
E.g., given a foaf:Person, foaf:name identifies a unique literal
Other properties are inverse-functional: given a range element, the property identifies a unique element of the domain
E.g., given a login ID (a literal) of a person on Jabber, foaf:jabberID identifies a unique foaf:Person
But foaf:jabberID isn’t functional: a person might have several Jabber IDs
Some pairs of properties are inverses of each other
E.g., foaf:page and foaf:topic
If a foaf:Document is about an owl:Thing (the foaf:topic property),
then that owl:Thing is covered in (the foaf:page property) that foaf:Document
But foaf:page is neither functional nor inverse-functional
Likewise for foaf:topic
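These three notions can be illustrated with checks over a set of triples. This is a data-level sketch only: it tests whether the triples at hand are consistent with a property being functional or inverse-functional, which is weaker than what an ontology reasoner does. The individuals and Jabber IDs are hypothetical.

```python
from collections import defaultdict


def is_functional(prop, triples):
    """True if no subject has two different values for prop."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return all(len(objs) <= 1 for objs in values.values())


def is_inverse_functional(prop, triples):
    """True if no value of prop is shared by two different subjects."""
    subjects = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            subjects[o].add(s)
    return all(len(subs) <= 1 for subs in subjects.values())


def add_inverse(triples, prop, inverse):
    """For every (s, prop, o), also assert (o, inverse, s)."""
    return set(triples) | {(o, inverse, s) for s, p, o in triples if p == prop}


triples = [
    ("_:p1", "foaf:jabberID", '"alice@jabber.example"'),
    ("_:p1", "foaf:jabberID", '"alice-work@jabber.example"'),
    ("ex:doc", "foaf:topic", "ex:squid"),
]

print(is_functional("foaf:jabberID", triples))          # False
print(is_inverse_functional("foaf:jabberID", triples))  # True
closed = add_inverse(triples, "foaf:topic", "foaf:page")
```

After add_inverse, the set also contains the triple ex:squid foaf:page ex:doc.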
FOAF Terms by Category
Capitalized terms are classes (cf. RDF-S), the rest are properties
When a property is followed by another in parentheses, the 2nd is the inverse of the 1st
Below, we discuss the most common, some of the most interesting, and some related classes and properties
FOAF Basics
Agent
Person
name
nick
title
homepage
mbox
mbox_sha1sum
img
depiction (depicts)
surname
family_name
givenname
firstName
Personal Info
weblog
knows
interest
currentProject
pastProject
plan
based_near
workplaceHomepage
workInfoHomepage
schoolHomepage
topic_interest
publications
geekcode
myersBriggs
dnaChecksum
Online Accounts / Instant Messaging
OnlineAccount
OnlineChatAccount
OnlineEcommerceAccount
OnlineGamingAccount
holdsAccount
accountServiceHomepage
accountName
icqChatID
msnChatID
aimChatID
jabberID
yahooChatID
Projects and Groups
Project
Organization
Group
member
membershipClass
fundedBy
theme
Documents and Images
Document
Image
PersonalProfileDocument
topic (page)
primaryTopic
tipjar
sha1
made (maker)
thumbnail
logo
Some FOAF stats from Sindice Libby Miller’s blog http://planb.nicecupoftea.org/
Sindice (http://sindice.com/) monitors, harvests and brings Semantic Web data together under a coherent umbrella of functionalities and services
Number of times the classes and properties occurred in statements
Only those with 1M or more occurrences are listed, sorted by count
Classes
Document 6.15 million
Agent 3.84 million
Person 2.64 million
Properties
page 5.84 million
topic 3.13 million
made 1.97 million
maker 1.97 million
name 1.77 million
knows 1.08 million
Next we document the most common, important, or interesting terms, along with terms related to them
The core of FOAF now is considered stable
As terms stabilize in usage and documentation, they progress through the categories 'unstable', 'testing' and 'stable'
foaf:Person (stable) Represents people (possibly dead or even imaginary)
A sub-class of the foaf:Agent class
foaf:Agent (stable) The class of things that do stuff
Besides foaf:Person, includes foaf:Organization and foaf:Group
Useful where foaf:Person is overly specific
E.g., the instant-messaging chat ID properties (e.g., foaf:jabberID) sometimes belong to software bots
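The sub-class relationships above can be sketched as a lookup that walks up the hierarchy. This assumes, for simplicity, that each class has at most one direct superclass, which RDF-S does not require.

```python
# Direct sub-class links mentioned in the text above.
SUBCLASS_OF = {
    "foaf:Person": "foaf:Agent",
    "foaf:Organization": "foaf:Agent",
    "foaf:Group": "foaf:Agent",
}

def classes_of(declared_class):
    """Return the declared class followed by all of its superclasses."""
    chain = [declared_class]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain

person_classes = classes_of("foaf:Person")
```

Anything declared a foaf:Person is therefore also a foaf:Agent, so a property with domain foaf:Agent (such as foaf:jabberID) applies to it.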
foaf:Document (testing) The foaf:Image class is a sub-class of foaf:Document
Currently no precise distinction
between physical and electronic documents or
between copies of a work and the abstraction those copies embody
The relationship between documents and their byte-stream representation needs clarification (see foaf:sha1 for related issues)
foaf:sha1 (unstable) Relates a foaf:Document to the textual form of a SHA1 hash of (some representation of) its contents
The SHA (Secure Hash Algorithm) hash functions are a set of cryptographic hash functions designed by the NSA, published by NIST
foaf:Document is currently used in a way that lets multiple instances at different URIs have the 'same' contents (hence hash)
If foaf:sha1 were an inverse-functional property, we could deduce that several such documents were the self-same thing (in this sense)
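A sketch of the idea using Python's standard hashlib module. The document URIs and byte contents are invented for illustration, and hashing the raw bytes is only one possible choice of "representation".

```python
import hashlib
from collections import defaultdict

def sha1_hex(content: bytes) -> str:
    """Textual (hexadecimal) form of the SHA1 hash of some bytes."""
    return hashlib.sha1(content).hexdigest()

# Hypothetical documents: two URIs serve identical bytes.
documents = {
    "http://example.org/a.html": b"Hello, FOAF",
    "http://mirror.example.net/a.html": b"Hello, FOAF",
    "http://example.org/b.html": b"Something else",
}

# Group document URIs by the hash of their contents.
by_hash = defaultdict(set)
for uri, content in documents.items():
    by_hash[sha1_hex(content)].add(uri)

same_content = [uris for uris in by_hash.values() if len(uris) > 1]
```

Treating foaf:sha1 as inverse-functional would amount to declaring each group in same_content to be a single document.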
foaf:topic (testing) Domain foaf:Document, range owl:Thing
Inverse: foaf:page
Relates a document to a thing the document is about
foaf:maker (stable) Domain owl:Thing, range foaf:Agent
Inverse: foaf:made
Relates something to a foaf:Agent that foaf:made it
The foaf:name of the foaf:maker of something can be described as the dc:creator of that thing
Use dc:creator only for simple textual names
Use foaf:maker to indicate the creating foaf:Agent itself so as not to risk confusing the agents with their names
foaf:name (testing) Domain owl:Thing, range rdfs:Literal
The foaf:name of something is a simple textual string involving no substructure
Contrast with the following, all with domain foaf:Person and range rdfs:Literal
foaf:surname
foaf:family_name
foaf:givenname
foaf:firstName
foaf:jabberID (testing) Domain foaf:Agent, range rdfs:Literal
Relates a foaf:Agent to a textual identifier assigned to it in the Jabber messaging system
Since the ID uniquely identifies the agent, this is an inverse-functional property
Jabber IDs can be assigned to a variety of things: software 'bots', chat rooms, etc.
All are kinds of foaf:Agent
Similar properties, all with status ‘testing’, domain foaf:Agent, and range rdfs:Literal
foaf:aimChatID
foaf:msnChatID
foaf:icqChatID
foaf:yahooChatID
foaf:nick (testing) Domain foaf:Person, range rdfs:Literal
A short informal nickname characterizing an agent: login identifiers, IRC (Internet Relay Chat) nicknames, etc.
Necessarily vague: doesn’t indicate a particular naming control authority (unlike foaf:jabberID etc. above)
Can’t distinguish a person's login from their (possibly various) IRC nicknames or other similar identifiers
Yet it’s useful
Many use the same string across a variety of such environments
Can’t have a property for every naming database: this serves as a catchall
[Some more classes, so explicitly distinguish classes and properties]
Class: foaf:OnlineAccount (unstable) Represents the provision of online service by some party (indicated indirectly via a foaf:accountServiceHomepage) to some foaf:Agent
The foaf:holdsAccount property of the agent indicates accounts associated with the agent
Sub-classes include
foaf:OnlineChatAccount
foaf:OnlineEcommerceAccount
foaf:OnlineGamingAccount
Property: foaf:holdsAccount (unstable) Domain foaf:Agent, range foaf:OnlineAccount
Relates a foaf:Agent to a foaf:OnlineAccount for which it’s the sole account holder
Property: foaf:accountServiceHomepage (unstable) Domain foaf:OnlineAccount, range foaf:Document
Indicates a relationship between a foaf:OnlineAccount and the homepage of the supporting service provider
Property: foaf:accountName (unstable) Domain foaf:OnlineAccount, range rdfs:Literal
This property of a foaf:OnlineAccount is a textual representation of the account name (unique ID) associated with that account
Class: foaf:OnlineChatAccount (unstable) A foaf:OnlineAccount devoted to chat / instant messaging.
This and associated FOAF terms let us describe a great variety of online accounts without anticipating them in the FOAF vocabulary
As with email, there are privacy and anti-SPAM considerations
FOAF does not currently provide a way to represent an obfuscated chat ID
I.e., no parallel to the foaf:mbox / foaf:mbox_sha1sum mapping
Property: foaf:mbox (stable) Domain foaf:Agent, range owl:Thing
A relationship between the owner of a mailbox and a mailbox
Typically identified using the mailto: URI scheme
There are many mailboxes (e.g., shared ones) that aren’t the foaf:mbox of anyone
And a person can have multiple foaf:mbox properties
Often use foaf:mbox as an indirect way of identifying its owner
It’s an inverse-functional property
Works even if the mailbox is itself out of service
It’s a static inverse-functional property
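Because foaf:mbox is inverse-functional, two descriptions that share a mailbox must be about the same agent. A rough sketch of that merging step, using a small union-find over tuple triples; the blank-node names and mailboxes are invented.

```python
def merge_by_mbox(triples):
    """Map each subject with a foaf:mbox to a canonical subject for its group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_owner = {}  # mailbox -> first subject seen holding it
    for s, p, o in triples:
        if p != "foaf:mbox":
            continue
        if o in first_owner:
            # Same mailbox, so s and the earlier subject are the same agent.
            parent[find(s)] = find(first_owner[o])
        else:
            first_owner[o] = s
            find(s)  # register s even if it is never merged
    return {node: find(node) for node in list(parent)}

# Hypothetical data: _:a and _:b share one mailbox, _:b and _:c another.
data = [
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b", "foaf:mbox", "mailto:al@example.org"),
    ("_:c", "foaf:mbox", "mailto:al@example.org"),
    ("_:d", "foaf:mbox", "mailto:dave@example.org"),
]
same_as = merge_by_mbox(data)
```

The three nodes _:a, _:b and _:c end up with one canonical identity even though _:a and _:c share no mailbox directly; _:d stays separate.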
Property: foaf:mbox_sha1sum (testing) Domain foaf:Agent, range rdfs:Literal
If you have a mailbox (foaf:mbox) but don't want to reveal its address (a ‘mailto’ identifier, a URL),
apply the SHA1 function to it to generate a foaf:mbox_sha1sum representation of it
Just as a foaf:mbox can be used as an indirect identifier for its owner, so can a foaf:mbox_sha1sum
There’s only 1 foaf:Agent with any particular value for that property
It’s an inverse-functional property
Many FOAF tools use foaf:mbox_sha1sum in preference to exposing mailbox info
For privacy and SPAM-avoidance reasons
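A minimal sketch of generating a foaf:mbox_sha1sum with Python's standard hashlib. It assumes the hash is taken over the full mailbox URI, including the 'mailto:' scheme, encoded as UTF-8; the address is made up.

```python
import hashlib

def mbox_sha1sum(mbox_uri: str) -> str:
    """Hex SHA1 digest of the full mailbox URI, including the 'mailto:' scheme."""
    return hashlib.sha1(mbox_uri.encode("utf-8")).hexdigest()

# Hypothetical mailbox; publish the digest instead of the address.
digest = mbox_sha1sum("mailto:alice@example.org")
```

The same mailbox always yields the same 40-character digest, so it can serve as an indirect identifier without exposing the address. The scheme does not hide an address from someone who already knows it and hashes it themselves.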
foaf:knows (testing) Domain foaf:Person, range foaf:Person
Relates a foaf:Person to another foaf:Person he or she knows
Since conventions on this topic vary greatly across cultures, not appropriate to be overly-specific
But do require some reciprocated interaction
Yet no obligation for either party to publish FOAF describing the relationship
A foaf:knows relationship doesn’t imply friendship, endorsement, or a face-to-face meeting
You might list only a few of the many people you know
Cf. the Semantic Web principle of partial description:
RDF documents rarely describe the entire picture
Though vague by design, foaf:knows has uses
Typically involve combining other RDF properties; e.g., an application might
look at properties of each foaf:weblog that was foaf:made by someone you foaf:knows or
check the newsfeed of the online photo archive for each of these people
For levels of representation beyond mere 'knows', FOAF applications can do several things
Use more precise relationships than foaf:knows to relate people to people
Early relationships of this kind were removed as they inappropriately suggested precision
But see Eric Vitiello's Relationship module for FOAF (see below)
Use RDF descriptions of the states of affairs which imply particular kinds of relationship. E.g.,
2 people with the same value for their foaf:workplaceHomepage property are typically colleagues
If there’s a foaf:Document listing 2 people as its foaf:makers, they are probably collaborators
If 2 people appear in 100s of digital photos together, they're probably friends or colleagues
Don’t clutter FOAF up with these extra relationships
FOAF is built on top of a general purpose machine language for representing relationships (i.e., RDF)
So it can represent any kind of relationship we care to add
Problems are generally social, not technical
Perhaps the most important use of foaf:knows:
along with the rdfs:seeAlso property, to connect FOAF files together
By mentioning other people (via foaf:knows or other relationships) and providing an rdfs:seeAlso link to their FOAF file,
you make it easy for FOAF indexing tools ('scutters', see below) to find
your FOAF,
the FOAF of the people you've mentioned,
the FOAF of the people they mention,
and so on
Can build FOAF aggregators without a centrally managed directory of FOAF files
Eric Vitiello, Jr., “Relationship: A module for defining relationships in FOAF,” 19 July 2002
http://www.perceive.net/schemas/20021119/relationship/
relationship is a module for extending the usefulness of the foaf:knows element
Alias the foaf:knows element into elements that describe the relationship between people in more detail
The URI is http://www.perceive.net/schemas/20021119/relationship/
The conventional prefix is rel:
Properties
rel:friendOf
rel:acquaintanceOf
rel:parentOf
rel:siblingOf
rel:childOf
rel:grandchildOf
rel:spouseOf
rel:enemyOf
rel:antagonistOf
rel:ambivalentOf
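One way to read "aliasing" foaf:knows is that each rel: property is a more specific form of it, so a rel: statement implies the plain foaf:knows statement. The sketch below assumes that reading (each rel: property treated as a sub-property of foaf:knows); the ex: names are invented.

```python
# The rel: properties listed above, all assumed to specialize foaf:knows.
REL_PROPERTIES = {
    "rel:friendOf", "rel:acquaintanceOf", "rel:parentOf", "rel:siblingOf",
    "rel:childOf", "rel:grandchildOf", "rel:spouseOf", "rel:enemyOf",
    "rel:antagonistOf", "rel:ambivalentOf",
}

def add_knows(triples):
    """Return the triples plus a foaf:knows statement for each rel: statement."""
    result = set(triples)
    for s, p, o in triples:
        if p in REL_PROPERTIES:
            result.add((s, "foaf:knows", o))
    return result

# Hypothetical data: Alice is Bob's sibling.
data = {("ex:alice", "rel:siblingOf", "ex:bob")}
expanded = add_knows(data)
```

A tool that only understands foaf:knows can then still make use of data written with the more detailed vocabulary.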
A scutter is a program that loads, parses, interprets and acts upon the contents of a Web of interconnected RDF/XML documents
A Semantic Web variant on the old theme of distributed Web indexing—a 'harvester', 'spider', or 'robot'
The links between RDF documents are usually, but not necessarily, expressed using RDF's rdfs:seeAlso property.
See http://wiki.foaf-project.org/w/Scutter
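The crawling loop of a scutter can be sketched as a breadth-first traversal over rdfs:seeAlso links. To keep the sketch self-contained, the Web is replaced by a hypothetical in-memory dictionary (WEB) from document URI to already-parsed triples; a real scutter would fetch and parse RDF/XML instead.

```python
from collections import deque

# Stand-in for the Web: document URI -> triples found in that document.
WEB = {
    "http://example.org/alice.rdf": [
        ("_:alice", "foaf:knows", "_:bob"),
        ("_:bob", "rdfs:seeAlso", "http://example.org/bob.rdf"),
    ],
    "http://example.org/bob.rdf": [
        ("_:bob", "foaf:knows", "_:carol"),
        ("_:carol", "rdfs:seeAlso", "http://example.org/carol.rdf"),
        ("_:bob", "rdfs:seeAlso", "http://example.org/alice.rdf"),
    ],
    "http://example.org/carol.rdf": [("_:carol", "foaf:name", "Carol")],
}

def scutter(start_uri, fetch=WEB.get):
    """Follow rdfs:seeAlso links breadth-first, collecting every triple seen."""
    seen = {start_uri}
    queue = deque([start_uri])
    harvested = []
    while queue:
        uri = queue.popleft()
        triples = fetch(uri) or []  # a missing document yields nothing
        harvested.extend(triples)
        for _, p, o in triples:
            if p == "rdfs:seeAlso" and o not in seen:
                seen.add(o)  # the seen set keeps cycles from looping forever
                queue.append(o)
    return seen, harvested

visited, harvest = scutter("http://example.org/alice.rdf")
```

Starting from one FOAF file, the crawl reaches all three documents without any central directory, which is the point made above about rdfs:seeAlso.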
As of 2009, the most up-to-date and LinkedData-friendly scutter is Slug
http://code.google.com/p/slug-semweb-crawler/
Implemented in Java using Jena
Provides an RDF vocabulary for describing crawler configurations
Collects metadata concerning crawling activity
We'll return to FOAF once we cover RDFS-Plus
vCards and RDF A simple standard for expressing info about people is the vCard electronic business card profile defined by RFC 2426
See
Renato Iannella, Representing vCard Objects in RDF/XML, W3C Note 22 February 2001
http://www.w3.org/TR/vcard-rdf
Specifies an RDF expression that corresponds to the vCard standard
The vCard URI is http://www.w3.org/2001/vcard-rdf/3.0#
The conventional prefix is vCard:
The explicit use of this URI in RDF eliminates the need to support the VCARD Profile and VERSION type
Look more at vCards when we cover the XML serialization of RDF
Simple Knowledge Organization System (SKOS) See
Alistair Miles and Sean Bechhofer (Eds.), SKOS Simple Knowledge Organization System Reference, W3C Recommendation 18 August 2009
http://www.w3.org/TR/2009/REC-skos-reference-20090818/
SKOS provides a way of systematizing “knowledge”
It’s a common data model for sharing and linking knowledge organization systems via the Web
Knowledge organization systems include thesauri, taxonomies, classification schemes, and subject heading systems
Many share a similar structure and are used in similar applications
SKOS makes much of this similarity explicit
Enables data & technology sharing across diverse applications
The SKOS data model provides a standard, low-cost path for porting existing knowledge organization systems to the Semantic Web
SKOS also provides a lightweight, intuitive language for developing and sharing new knowledge organization systems
May be used on its own, or in combination with formal knowledge representation languages such as OWL
The elements of the SKOS data model are classes and properties
The structure and integrity of the data model are defined by the logical characteristics of those classes and properties, and the interdependencies between them
But, unlike RDF and OWL, SKOS is not a formal knowledge representation language
Consider it later (once we cover RDFS-Plus)