Janifer Gatenby OCLC EMEA With acknowledgements to Richard Wallis and Anila Angjeli LINKED DATA A PERSONAL PERSPECTIVE

It19 20140721 linked data personal perspective

Jul 08, 2015


Janifer Gatenby

A presentation made for Standards Australia's seminar. Outlines the basic aspects of linked data from a personal perspective and where it fits with direct and subject searching.
Transcript
Page 1: It19 20140721 linked data personal perspective

Janifer Gatenby

OCLC EMEA

With acknowledgements to Richard Wallis and Anila Angjeli

LINKED DATA: A PERSONAL PERSPECTIVE

Page 2: It19 20140721 linked data personal perspective

LINKED DATA

• What is it?

• What does it promise?

• How do we get there?

• What happens when we get there?

Page 3: It19 20140721 linked data personal perspective

WHAT IS IT?

Page 4: It19 20140721 linked data personal perspective

A WAY OF EXPRESSING A LINK

What is it?

• Not really a new way of linking but a new way of expressing a link

It is about using canonical trusted globally referenceable identifiers for concepts, people, organisations, locations etc. instead of copying text strings and losing the connection with the authoritative sources they came from.

Richard Wallis

Page 5: It19 20140721 linked data personal perspective

MARC21 LINKS

What is it?

• 700 10 $a name $e role $0 authority control number

– (added entry in a MARC record for a name related to a work, not the main author)

These familiar links reference an authority record in the same database as the bibliographic record, and hence have no address portion. Linked data extends the linking range.

Page 6: It19 20140721 linked data personal perspective

EXTENDING THE LINKING RANGE: URI

What is it?

• URI – immutable address as well as an identifier

– http://id.loc.gov/authorities/names/nr89009099

– http://viaf.org/viaf/116774723

– http://isni.org/isni/000000114556841

9 NACO libraries: LC, National Agricultural Library, National Library of Medicine, British Library, NL Mexico, NLNZ, NL Scotland, NL South Africa, NL Wales

Page 7: It19 20140721 linked data personal perspective

EXTENDING THE LINKING RANGE: RDF

What is it?

• RDF – metadata is expressed in triples

– Data

– Data label (properties)

– Vocabulary from which the label comes (gives context to the label)

Page 8: It19 20140721 linked data personal perspective

SPARQL

What is it?

• A database can offer a SPARQL endpoint, i.e. it can receive queries over its RDF data

– Author [schema] Name [data label] De Groot, Gerard J., 1955 [data]

• “SPARQL allows users to write queries against data that can loosely be called "key-value" data, more specifically it is data that follows the RDF specification of the W3C. The entire database is thus a set of "subject-predicate-object" triples.”

• 1.1 Stable release 2013-03-21

– W3C recommendation

http://en.wikipedia.org/wiki/SPARQL

http://www.w3.org/blog/SW/2008/01/15/sparql_is_a_recommendation/
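As a rough illustration of the "subject-predicate-object" idea, the sketch below stores triples in a plain Python list and matches them against a pattern, with None playing the role of a SPARQL variable. It is not SPARQL and does not talk to any endpoint; the subject URI http://example.org/person/1 and the helper name match are hypothetical, chosen only for the example.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical subject URI; the property is the FOAF "name" label.
PERSON = "http://example.org/person/1"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

TRIPLES = [
    Triple(PERSON, FOAF_NAME, "De Groot, Gerard J., 1955-"),
    Triple(PERSON, FOAF_NAME, "DeGroot, Gerard J., 1955-"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield the triples that fit the pattern; None matches anything."""
    for t in triples:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.obj != obj:
            continue
        yield t


# "Give me every name of this person": subject and predicate fixed, object free.
names = [t.obj for t in match(TRIPLES, subject=PERSON, predicate=FOAF_NAME)]
print(names)
```

A real endpoint would accept the equivalent query as SPARQL text; the pattern-with-wildcards idea is the same.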

Page 9: It19 20140721 linked data personal perspective

LINKED DATA PRINCIPLES

What is it?

1. Use URIs as names for things

2. Use HTTP URIs so people can look up those names

3. When someone looks up a URI, provide useful information, using the standards (RDF)

4. Include links to other URIs, so that they can discover more

Tim Berners-Lee - 2006
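Principles 2 and 3 can be sketched as an ordinary HTTP request that asks for RDF rather than a web page. The example below only builds the request and sends nothing; the helper name build_lookup is hypothetical, and whether a given server honours the Accept header is an assumption that would need checking against that service.

```python
from urllib.request import Request


def build_lookup(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Prepare a GET request that asks the server for an RDF representation.

    The request is not sent here; pass it to urllib.request.urlopen to fetch.
    """
    return Request(uri, headers={"Accept": media_type})


# URI taken from the earlier slide; it is both a name and an address.
request = build_lookup("http://id.loc.gov/authorities/names/nr89009099")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```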

Page 10: It19 20140721 linked data personal perspective

VOCABULARIES

What is it?

• Vocabularies are not schemas, they are lists of defined data labels (concepts)

– Schema.org (Search engines)

– BibFrame (Library community)

– FOAF Friend of a friend

– OWL (owl:sameAs)

• Vocabularies can be mixed

foaf:name "Jimmy Wales" ;

foaf:mbox <mailto:[email protected]> ;

foaf:homepage <http://www.jimmywales.com/> ;

foaf:nick "Jimbo" ;
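To make "vocabularies can be mixed" concrete, the sketch below expands prefixed labels into full URIs, so each label carries the vocabulary it comes from. The foaf: lines follow the slide; the schema:url line is an added, hypothetical illustration of a second vocabulary in the same description, and the helper name expand is likewise invented for the example.

```python
# Namespace URIs of the three vocabularies used in this sketch.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "schema": "http://schema.org/",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand(prefixed: str) -> str:
    """Turn 'foaf:name' into its full URI using the PREFIXES table."""
    prefix, _, local = prefixed.partition(":")
    return PREFIXES[prefix] + local


# One description, labels drawn from two vocabularies.
description = [
    ("foaf:name", "Jimmy Wales"),
    ("foaf:nick", "Jimbo"),
    ("schema:url", "http://www.jimmywales.com/"),  # hypothetical addition
]

expanded = [(expand(label), value) for label, value in description]
for label, value in expanded:
    print(label, "->", value)
```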

Page 11: It19 20140721 linked data personal perspective

WHAT DOES IT PROMISE?

Page 12: It19 20140721 linked data personal perspective

What does it promise?

• Enriched displays without data maintenance

• Better harvesting and ranking

– because of markup

– and because of links

• Navigation to pages with additional information

– Example: from VIAF via ISNI to encyclopaedias, rights management societies (digitisation rights), Bowker – biographies from fly leaves

Page 13: It19 20140721 linked data personal perspective
Page 14: It19 20140721 linked data personal perspective
Page 15: It19 20140721 linked data personal perspective

INTERCONNECTING FRENCH CULTURAL HERITAGE TREASURES ON THE WEB

What does it promise?

[Diagram] Sources: BnF Main catalogue (MARC); Digital documents (DC); BnF Archives and Manuscripts catalogue (EAD); other BnF resources; external resources

Semantic Web techniques: modeling, matching, clustering, alignments

Outputs: Web pages for Internet users; raw data for machines

Page 16: It19 20140721 linked data personal perspective

What does it promise?

BnF persistent ID

Imported from Wikipedia and integrated in the page

Page 17: It19 20140721 linked data personal perspective

What does it promise?

Information about the data model (or ontology) at : http://data.bnf.fr/about-en

Data can be downloaded

Existing ones + others defined for the specific needs of the project

Page 18: It19 20140721 linked data personal perspective

BIG DATA AS RDF

What does it promise?

• Data is re-usable without a full blown conversion

• Permits 3rd party analysis of big data sets

• Data mining for new information

Page 19: It19 20140721 linked data personal perspective

HOW DO WE GET THERE?

Page 20: It19 20140721 linked data personal perspective

MAKING THE LINKS

How do we get there?

DNB CultureGraph

– “It’s all about creating connections”

– DDC to RVK (German classification) by comparing search results

– GND (names) to German Wikipedia

Page 21: It19 20140721 linked data personal perspective

EXAMPLE: VIAF

How do we get there?

• Ingesting data to compare and create links

• Makes clusters; cluster identifier

• Ingesting preferred to external linking

– Wikipedia, ISNI, WorldCat identities

– More data used for clustering, so more reliable

• VIAFBot for making reciprocal links in Wikipedia / Wikidata

<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>

<rdf:type rdf:resource="http://rdvocab.info/uri/schema/FRBRentitiesRDA/Person"/>

<foaf:name>De Groot, Gerard J., 1955-</foaf:name>

<foaf:name>DeGroot, Gerard J., 1955-</foaf:name>

<rdaGr2:dateOfBirth>1955-06-22</rdaGr2:dateOfBirth>

<owl:sameAs rdf:resource="http://data.bnf.fr/ark:/12148/cb12299846b#foaf:Person"/>

<owl:sameAs rdf:resource="http://www.idref.fr/034977651/id"/>

<owl:sameAs rdf:resource="http://d-nb.info/gnd/12422900X"/>
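A consumer of such a record might pull out the owl:sameAs targets as follows. This is a minimal sketch using only the standard library XML parser, not a full RDF parser; the rdf:RDF wrapper, the rdf:about value http://example.org/cluster/1 and the helper name same_as_links are assumptions added so the fragment is well-formed and self-contained.

```python
import xml.etree.ElementTree as ET
from typing import List

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Fragment based on the slide; the wrapper and rdf:about are hypothetical.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}" xmlns:foaf="{FOAF}">
  <rdf:Description rdf:about="http://example.org/cluster/1">
    <foaf:name>De Groot, Gerard J., 1955-</foaf:name>
    <owl:sameAs rdf:resource="http://www.idref.fr/034977651/id"/>
    <owl:sameAs rdf:resource="http://d-nb.info/gnd/12422900X"/>
  </rdf:Description>
</rdf:RDF>"""


def same_as_links(rdf_xml: str) -> List[str]:
    """Return the rdf:resource target of every owl:sameAs element."""
    root = ET.fromstring(rdf_xml)
    resource_attr = f"{{{RDF}}}resource"
    return [el.attrib[resource_attr] for el in root.iter(f"{{{OWL}}}sameAs")]


links = same_as_links(SAMPLE)
for link in links:
    print(link)
```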

Page 22: It19 20140721 linked data personal perspective

[Diagram] Cross-domain, bridging domains: Libraries; Text Rights; Music Rights; Trade Sources; Encyclopaedias; Researchers & Professional; Granting organisations; Professional Societies; Article databases; Theses databases; Archives and Museums

Page 23: It19 20140721 linked data personal perspective

EXAMPLE: ISNI: 15 MILLION LINKS

How do we get there?

Linked Data: isni.org/isni/

Page 24: It19 20140721 linked data personal perspective

LA TROBE UNIVERSITY LINKS: 3,427

How do we get there?

Page 25: It19 20140721 linked data personal perspective

LA TROBE UNIVERSITY: 1,864 VIAF LINKS

How do we get there?

Page 26: It19 20140721 linked data personal perspective

ISNI – A LINKING IDENTIFIER

How do we get there?

• Identifiers seal uniqueness: “n” number of other elements are necessary for uniqueness

• Stable identifier; stable metadata:

– assigned where there is confidence in the quality and completeness of the metadata to establish uniqueness

• ISNI system + Quality Team (BL & BnF)

Linking erroneous data propagates errors.

Page 27: It19 20140721 linked data personal perspective

LINKS ARE MADE ONCE – THEN INHERITED

How do we get there?

• URI – immutable address as well as an identifier

– http://id.loc.gov/authorities/names/nr89009099

– http://viaf.org/viaf/116774723

– http://isni-url.oclc.nl/isni/000000114556841

9 NACO libraries: Library of Congress, National Agricultural Library, National Library of Medicine, British Library, NL Mexico, NLNZ, NL Scotland, NL South Africa, NL Wales
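One way to picture "made once, then inherited" is to treat each link as an equivalence and group identifiers into clusters: a catalogue record that links to a single cluster member then reaches every other member. The sketch below uses a small union-find for this. All identifiers in LINKS are hypothetical placeholders, and this is not a description of how VIAF or ISNI actually build their clusters.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def cluster(links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group identifiers connected by 'same entity' links (union-find)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: Dict[str, Set[str]] = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


# Hypothetical identifiers: the catalogue makes ONE link (to the VIAF-style id).
LINKS = [
    ("catalogue:record-1", "viaf:A"),
    ("viaf:A", "lc:B"),
    ("viaf:A", "isni:C"),
    ("catalogue:record-2", "viaf:Z"),
]

clusters = cluster(LINKS)
inherited = next(group for group in clusters if "catalogue:record-1" in group)
print(sorted(inherited))
```

The record made one link but ends up in the same group as the other two identifiers, which is the inheritance the slide title describes.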

Page 28: It19 20140721 linked data personal perspective

WHAT HAPPENS WHEN WE GET THERE?

Page 29: It19 20140721 linked data personal perspective

HOW DOES SEARCHING WORK?

What happens when we get there?

• Search happens mostly in the search engines

• Library catalogue concentrates on:

– Being linked to

– Linking out (navigation)

– Delivery, particularly of the digitised and immediate

Page 30: It19 20140721 linked data personal perspective

HOW DO SEARCH AND LINKED DATA INTERACT?

What happens when we get there?

• Is search really fully delegated to search engines & larger union catalogues?

Page 31: It19 20140721 linked data personal perspective

SEARCH TYPES

What happens when we get there?

Search type | Happening in

Known item | Search engines, also in more specific sources where noise is a problem

Subject search | Search engines, also in more specific sources, to reduce noise and benefit from more precise searching capabilities

Index browse | In catalogues

Follow a link | Everywhere. In library catalogues from a full record display.

The more your catalogue is linked in, the more likely it is to attract all types of searches

Page 32: It19 20140721 linked data personal perspective

STORE ONLY THE LINKS?

What happens when we get there?

• Data needed

– For making indexes

– For comparisons, e.g. for de-duplication

– Data mining

It is about using canonical trusted globally referenceable identifiers for concepts, people, organisations, locations etc. instead of copying text strings and losing the connection with the authoritative sources they came from.

This doesn’t mean that you only need the links; you often also need to ingest the data.

Besides, data storage is no longer the constraint it once was.
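Comparison for de-duplication is a case where the data itself has to be in hand, not just a link to it. As a deliberately crude sketch, the code below reduces name strings to a match key and groups strings that share one. The key function and the third name in NAMES are hypothetical; a real matching service would use far more evidence than a normalised string.

```python
import re
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List


def match_key(name: str) -> str:
    """Lower-case, drop diacritics, and keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]", "", stripped.lower())


def find_duplicates(names: Iterable[str]) -> List[List[str]]:
    """Return groups of names that share a match key (possible duplicates)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


NAMES = [
    "De Groot, Gerard J., 1955-",
    "DeGroot, Gerard J., 1955-",
    "Example, Another, 1960-",  # hypothetical record with no duplicate
]

duplicates = find_duplicates(NAMES)
print(duplicates)
```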

Page 33: It19 20140721 linked data personal perspective

READ FURTHER

• http://www.slideshare.net/tulipbiru64/the-single-power-of-link-richard-wallis

• http://www.slideshare.net/rjw/linked-data-and-oclc