
Culture Geeks Feb talk: Adventures in Linked Data Land

Nov 01, 2014


val.cartei

Culture Geeks talk: "Adventures in Linked Data Land", by Richard Light.

25 February 2009 - Regency Town House

Culture Geeks is a Brighton-based community open to everyone who is
interested in using digital technologies in the cultural sector.
Page 1: Culture Geeks Feb talk: Adventures in Linked Data Land

Adventures in Linked Data Land

Richard Light Consultancy

Culture Geeks, 25 February 2009

Page 2

Discovering Linked Data

Four principles of Linked Data (Tim Berners-Lee):

● Use URIs to identify resources

● Use HTTP URIs so that people can look them up

● Provide useful information about the resource

● Include links to other URIs in your data

Page 3

Discovering dbPedia

● Extraction of Linked Data from Wikipedia

● Statements in info boxes (mainly) become RDF triples:

<rdf:Description rdf:about="http://dbpedia.org/resource/Berlin_Marathon">

<dbpprop:location rdf:resource="http://dbpedia.org/resource/Berlin"/>

</rdf:Description>

Note the URLs
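The fragment above can be read as a single subject-predicate-object triple. The following is a minimal sketch using only the Python standard library. It assumes the usual namespace bindings, which the slide does not show: `rdf` for the RDF syntax namespace and `dbpprop` for `http://dbpedia.org/property/`. The `triples` helper is hypothetical and only handles properties that point at another resource.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DBPPROP = "http://dbpedia.org/property/"  # assumed binding for the dbpprop prefix

# The slide's fragment, wrapped in an rdf:RDF root that declares both prefixes.
doc = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dbpprop="{DBPPROP}">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Berlin_Marathon">
    <dbpprop:location rdf:resource="http://dbpedia.org/resource/Berlin"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (subject, predicate, object) for resource-valued properties only."""
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # ElementTree spells tags as "{namespace}local"; dropping the
            # braces joins the two parts into the predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            obj = prop.get(f"{{{RDF}}}resource")
            if obj is not None:
                yield (subject, predicate, obj)


for triple in triples(doc):
    print(triple)
```

All three parts of the triple come out as HTTP URIs, which is the point of "Note the URLs".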

Page 4

Browsing Linked Data

● View RDF as a web page: http://dbpedia.org/page/Berlin

● Navigate from one data source to another

● Specialist Linked Data browsers/plugins:

– DISCO
– Marbles
– OpenLink Data Explorer
– Tabulator

Page 5

dbPedia page for Berlin

Page 6

OpenLink Data Explorer: What

Page 7

OpenLink Data Explorer: Where

Page 8

Querying Linked Data

● SPARQL query language: http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/

● And SPARQL XML results format: http://www.w3.org/TR/rdf-sparql-XMLres/

● “SPARQL end-points”:

– http://dbpedia.org/sparql
– http://dbtune.org/bbc/peel/sparql
– http://data.linkedmdb.org/sparql
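To give a feel for the SPARQL XML results format, here is a minimal sketch that turns a results document into one dictionary per row. The sample response is hand-written for illustration (it is not real endpoint output, and the `example.org` URI is a placeholder); the `rows` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

SR = "http://www.w3.org/2005/sparql-results#"  # namespace of the results format

# Hand-written sample response, for illustration only.
sample = f"""<sparql xmlns="{SR}">
  <head><variable name="person"/><variable name="name"/></head>
  <results>
    <result>
      <binding name="person"><uri>http://example.org/person/1</uri></binding>
      <binding name="name"><literal>Example Person</literal></binding>
    </result>
  </results>
</sparql>"""


def rows(results_xml):
    """Return one dict per <result>, mapping variable name to its value."""
    root = ET.fromstring(results_xml)
    out = []
    for result in root.iter(f"{{{SR}}}result"):
        row = {}
        for binding in result.findall(f"{{{SR}}}binding"):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        out.append(row)
    return out


print(rows(sample))
```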

Page 9

dbPedia SPARQL endpoint page

Page 10

Asking interesting questions

● German musicians born in Berlin
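The query text from this slide is not in the transcript. As a rough illustration of what such a question can look like, the sketch below builds a request URL for the dbPedia endpoint. The property and category URIs (`dbpprop:birthPlace`, `skos:subject`, `Category:German_musicians`) are assumptions and may not match the vocabulary the endpoint actually uses; `query_url` is a hypothetical helper. No network request is made.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint listed in the talk

# Illustrative query only: the property and category URIs are assumptions.
QUERY = """
PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?person WHERE {
  ?person dbpprop:birthPlace <http://dbpedia.org/resource/Berlin> .
  ?person skos:subject <http://dbpedia.org/resource/Category:German_musicians> .
}
"""


def query_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = query_url(ENDPOINT, QUERY)
print(url)

# Round-trip check: decoding the URL gives the query text back.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Each line of the `WHERE` block is one triple pattern, so the query asks for every `?person` that satisfies both statements at once.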

Page 11

So what do we have here?

● An initiative to generate lots of Linked Data

● A Linked Data Cloud, containing a growing number of RDF datasets

● A hard-to-use query language capable of very precise and powerful querying

Where do museums come into this picture?

Page 12

The Wordsworth Trust

● Typical museum collection: about 60,000 objects

● Major collection of manuscripts (notebooks, letters, etc.)

● Objects published to the Web from a ModesXML database

● Unwise enough to allow me Remote Desktop access ...

Page 13

Typical collections object

GRMDC.C104.2

Page 14

Same object represented as RDF

Page 15

Same object represented as XTM

Page 16

One identifier; three “views”

● This object has a single persistent identifier: http://collections.wordsworth.org.uk/object/GRMDC.C104.2

● This maps to different views depending on the “Accept” header in the HTTP request:

– application/rdf+xml >> RDF
– application/xtm+xml >> XTM Topic Map
– Otherwise >> HTML (human-readable)

● Achieved through a custom 404 “page not found” handler
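The Accept-header mapping above can be sketched as a small function. This is a simplified illustration, not the handler's actual logic: it takes the first supported media type in the header and ignores q-values. `pick_view` is a hypothetical name.

```python
def pick_view(accept_header):
    """Return 'rdf', 'xtm' or 'html' for a raw Accept header value.

    Simplification: the first listed media type that is supported wins;
    q-values are ignored.
    """
    supported = {
        "application/rdf+xml": "rdf",
        "application/xtm+xml": "xtm",
    }
    for item in (accept_header or "").split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in supported:
            return supported[media_type]
    return "html"  # the "Otherwise" case


print(pick_view("application/rdf+xml"))
print(pick_view("text/html,application/xhtml+xml;q=0.9"))
```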

Page 17

“Page not found” handler (1)

● All URLs are fictitious, so they generate a 404

● Modified a generic smart 404 handler from: http://evolvedcode.net/content/code_smart404/

● Added support for “303 See other” redirects

● Added wildcard matching to re-format URLs

Page 18

“Page not found” handler (2)

● Generic URL, plus requested Accept format, determine initial “303 See other” mapping, e.g.:

http://collections.wordsworth.org.uk/object/GRMDC.C104.2
+ Accept: application/rdf+xml
= http://collections.wordsworth.org.uk/object/rdf/GRMDC.C104.2

● When this is passed back in, the 404 handler has to generate the required RDF directly

● Can't just keep redirecting requests!
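The mapping on this slide can be sketched with a wildcard (regular expression) rule per Accept type. The `/object/rdf/` target comes from the slide's example; the `/object/xtm/` target is a guess by analogy, and `see_other` is a hypothetical name. This is not the code of the smart 404 handler itself.

```python
import re

BASE = "http://collections.wordsworth.org.uk"

# One rewrite template per supported Accept type. The /object/rdf/ form is
# taken from the slide; the /object/xtm/ form is an assumption by analogy.
RULES = {
    "application/rdf+xml": r"\1/object/rdf/\2",
    "application/xtm+xml": r"\1/object/xtm/\2",
}
# Matches only the generic form: <host>/object/<identifier>
GENERIC = re.compile(r"^(https?://[^/]+)/object/([^/]+)$")


def see_other(url, accept):
    """Return the 303 target for a generic object URL, or None if no rule applies."""
    template = RULES.get(accept)
    match = GENERIC.match(url)
    if template is None or match is None:
        return None
    return match.expand(template)


print(see_other(BASE + "/object/GRMDC.C104.2", "application/rdf+xml"))
print(see_other(BASE + "/object/rdf/GRMDC.C104.2", "application/rdf+xml"))
```

The second call returns `None` because the format-specific URL no longer matches the generic pattern. That is the point at which the handler has to produce the RDF itself rather than redirect again.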

Page 19

“Page not found” handler (3)

● Redirect rules declare mappings:

Page 20

“Page not found” handler (4)

● Generic URL plus a supported Accept type generates a “303 See other” redirect

● If it comes back as a page request, it is further redirected with a “301 Moved permanently” to the object's web page

● If it comes back as an RDF or XTM request, the record is fetched as XML and subjected to an XSLT transform by the handler
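Taken together, the three cases above amount to a small decision table. The sketch below only decides what to do and performs no real redirect or XSLT transform. The `/object/page/` and `/object/xtm/` path forms and the `/records/...html` page location are hypothetical, as is the `handle` function.

```python
from urllib.parse import urlsplit


def handle(path, accept):
    """Decide what the 404 handler does; returns (status, detail)."""
    parts = urlsplit(path).path.strip("/").split("/")
    if parts[0] != "object":
        return (404, "not an object URL")
    if len(parts) == 2:
        # Generic URL /object/<id>: pick a view from the Accept type.
        view = {
            "application/rdf+xml": "rdf",
            "application/xtm+xml": "xtm",
        }.get(accept, "page")
        return (303, f"/object/{view}/{parts[1]}")
    if len(parts) == 3:
        view, object_id = parts[1], parts[2]
        if view == "page":
            # Hypothetical location of the object's existing web page.
            return (301, f"/records/{object_id}.html")
        if view in ("rdf", "xtm"):
            # At this point the real handler fetches the record as XML and
            # applies an XSLT transform; here we only name the step.
            return (200, f"transform {object_id} with {view}.xsl")
    return (404, "no rule matched")


print(handle("/object/GRMDC.C104.2", "application/rdf+xml"))
print(handle("/object/rdf/GRMDC.C104.2", "application/rdf+xml"))
print(handle("/object/page/GRMDC.C104.2", "text/html"))
```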

Page 21

What has been learnt?

● The Linked Data paradigm encourages simple RDF triples: no “blank nodes”

● For an object, this becomes a simple metadata set, very analogous to the PNDS DCAP format

● The properties involved need to encapsulate the whole relation between object and data, e.g.

<p:title>Ulswater from Pooley Bridge</p:title>
<p:technique>drawn</p:technique>
<p:maker>Farington, Joseph (1747-1821)</p:maker>
<p:technique>engraved</p:technique>
<p:maker>Middiman, Samuel (1750-1831)</p:maker>
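A flat metadata set like the one above is easy to generate. The sketch below builds an `rdf:Description` whose children are literal-valued properties, with no blank nodes. It assumes the `p:` prefix is bound to `http://dbpedia.org/property/`; the subject URI is a placeholder, and `describe` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
P = "http://dbpedia.org/property/"  # assumed binding for the p: prefix
ET.register_namespace("rdf", RDF)
ET.register_namespace("p", P)


def describe(subject_uri, pairs):
    """Build an rdf:Description whose children are flat literal-valued properties."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": subject_uri}
    )
    for name, value in pairs:
        ET.SubElement(desc, f"{{{P}}}{name}").text = value
    return root


record = [
    ("title", "Ulswater from Pooley Bridge"),
    ("technique", "drawn"),
    ("maker", "Farington, Joseph (1747-1821)"),
    ("technique", "engraved"),
    ("maker", "Middiman, Samuel (1750-1831)"),
]
# Placeholder subject URI, for illustration only.
root = describe("http://example.org/object/1", record)
print(ET.tostring(root, encoding="unicode"))
```

One thing the sketch makes visible: the pairing of "drawn" with Farington and "engraved" with Middiman rests only on element order, and an RDF graph does not keep that order once the triples are loaded.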

Page 22

Properties: which framework?

● I have used dbPedia properties (for Linked Data compatibility):

http://dbpedia.org/property/title
http://dbpedia.org/property/maker

● A viable alternative would be PNDS DCAP:

http://purl.org/dc/elements/1.1/title
http://purl.org/dc/elements/1.1/creator

● One framework which doesn't fit is the CIDOC CRM: E18 Physical Thing – E12 Production – E39 Actor = “creator”

Page 23

The problem of URIs

● Good Linked Data requires URIs everywhere

● Most of my museum RDF resolves to strings

● One exception is Geonames lookup: Ullswater becomes http://www.geonames.org/2635191/

● In the absence of a central “people” registry, I should be minting URIs for people myself
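Minting a URI for a person could be as simple as deriving a stable slug from the catalogue name string. The sketch below shows one possible approach. The `/person/` URI pattern is hypothetical (the talk does not define one), and `mint_person_uri` is an illustrative name.

```python
import re
import unicodedata

# Hypothetical URI pattern; the talk does not define one for people.
PERSON_BASE = "http://collections.wordsworth.org.uk/person/"


def mint_person_uri(name):
    """Turn a catalogue name string into a stable, URL-safe identifier."""
    # Strip accents so the slug stays plain ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of other characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")
    return PERSON_BASE + slug


print(mint_person_uri("Farington, Joseph (1747-1821)"))
```

Keeping the life dates in the slug helps tell apart people who share a name, but two different spellings of the same person would still get two URIs, so a scheme like this does not replace a shared registry.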

Page 24

Does it work? - yes, sort of

Page 25

Data Explorer place view

Page 26

Implementation details

● HTML needed a “back link” to RDF to keep OpenLink Explorer happy:

<link rel="alternate" type="application/rdf+xml" href="http://collections.wordsworth.org.uk/object/data/GRMDC.C104.2" title="RDF" />

● Result is totally unfindable: need a search or harvesting mechanism:

– OAI support (possible)
– SPARQL end-point (harder)
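The “back link” is what lets a client that starts from the HTML page find the RDF. The sketch below shows how a client might pick it up, using the standard library HTML parser on a cut-down page. `AlternateFinder` is a hypothetical class, and this is not a description of how OpenLink Data Explorer itself works.

```python
from html.parser import HTMLParser


class AlternateFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> elements of one media type."""

    def __init__(self, media_type):
        super().__init__()
        self.media_type = media_type
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel_values = (a.get("rel") or "").lower().split()
        if (
            tag == "link"
            and "alternate" in rel_values
            and a.get("type") == self.media_type
            and a.get("href")
        ):
            self.hrefs.append(a["href"])


# Cut-down page containing the link element from the slide.
page = """<html><head>
<link rel="alternate" type="application/rdf+xml"
 href="http://collections.wordsworth.org.uk/object/data/GRMDC.C104.2" title="RDF" />
</head><body></body></html>"""

finder = AlternateFinder("application/rdf+xml")
finder.feed(page)
print(finder.hrefs)
```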

Page 27

Conclusions

● Implementing an RDF Linked Data front-end to a museum database is feasible if:

– You can generate multiple outputs from your database (XML is sufficient)
– You can implement a suitable URL rewriter or 404 handler

● It's easy (and a good idea) to mint and publish URIs for your collection objects

● It's less clear where all the other URIs we'll need will come from

Page 28

LD: foothills of the Semantic Web

● Linked Data is a very modest start

● It's not obvious how this will scale

● Full Semantic Web will involve machine-driven processes

● Judging by where we are today, that will be a while coming ...

Page 29

Ask Multimap where Lancaster is

Page 30

Get a Netbook delivered ...