Classification schemes, thesauri and other Knowledge Organization Systems - a Linked Data perspective Antoine Isaac Pelagios: Linked Pasts London, July 20-21, 2015
Page 1: linked pasts

Classification schemes, thesauri and other Knowledge Organization Systems

- a Linked Data perspective

Antoine Isaac

Pelagios: Linked Pasts

London, July 20-21, 2015

Page 2: linked pasts

Classification schemes?

Scope: knowledge organization systems (KOS) such as classification systems, thesauri, gazetteers, subject heading lists…

(last-minute addition: also time periods, cf. PeriodO)

Page 3: linked pasts
Page 4: linked pasts
Page 5: linked pasts
Page 6: linked pasts

Simple Knowledge Organization System

SKOS is for exchanging KOSs as Linked Data (in RDF)

• Better than semi-structured data (CSV)

• Still relatively simple

Page 7: linked pasts

A SKOS graph

animals

cats
UF domestic cats
RT wildcats
BT animals
SN used only for domestic cats

domestic cats
USE cats

wildcats
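The thesaurus notation on this slide maps onto SKOS fairly directly. A minimal sketch in Python, assuming the usual correspondences (BT to skos:broader, RT to skos:related, UF/USE to skos:altLabel on the preferred concept, SN to skos:scopeNote). The `http://example.org/kos/` namespace and the helper names `uri` and `to_skos` are hypothetical, chosen only for illustration:

```python
# Sketch only: the example.org namespace and the function names are illustrative.
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/kos/"  # hypothetical namespace for the slide's terms

# Thesaurus entry for "cats" as shown on the slide.
entry = {
    "term": "cats",
    "UF": ["domestic cats"],
    "RT": ["wildcats"],
    "BT": ["animals"],
    "SN": ["used only for domestic cats"],
}

def uri(term):
    """Mint a concept URI from a preferred term."""
    return BASE + term.replace(" ", "_")

def to_skos(entry):
    """Translate one thesaurus entry into (subject, predicate, object) triples."""
    s = uri(entry["term"])
    triples = [(s, SKOS + "prefLabel", entry["term"])]
    for label in entry.get("UF", []):   # non-preferred terms become alternative labels
        triples.append((s, SKOS + "altLabel", label))
    for term in entry.get("RT", []):    # associative links between concepts
        triples.append((s, SKOS + "related", uri(term)))
    for term in entry.get("BT", []):    # hierarchical links to broader concepts
        triples.append((s, SKOS + "broader", uri(term)))
    for note in entry.get("SN", []):    # scope notes stay as plain text
        triples.append((s, SKOS + "scopeNote", note))
    return triples

triples = to_skos(entry)
for t in triples:
    print(t)
```

Note that "domestic cats" does not get a concept of its own here: the USE/UF pair collapses into one concept with a preferred and an alternative label.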

Page 8: linked pasts

Representing semantics

The formal way: OWL, the Web Ontology Language of the Semantic Web

Used for ontologies that enable machine reasoning

Mother is a class

Parent is the class of entities of type Person that are related to at least one other resource of type Person using the child property
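To make the Parent definition concrete, here is a rough sketch of what a reasoner would infer from it, written in plain Python rather than OWL. The individuals (`alice`, `bob`, `rex`) and the triple layout are invented for illustration; a real system would state the definition in OWL and hand it to a reasoner.

```python
# Illustrative data: (subject, property, object) statements about made-up individuals.
facts = {
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("rex", "type", "Dog"),
    ("alice", "child", "bob"),
    ("bob", "child", "rex"),   # bob's "child" is not a Person
}

def is_a(resource, cls):
    """True if the resource is explicitly typed with the given class."""
    return (resource, "type", cls) in facts

def infer_parents():
    """Parent = a Person with at least one `child` link to another Person."""
    return {
        s for (s, p, o) in facts
        if p == "child" and s != o and is_a(s, "Person") and is_a(o, "Person")
    }

parents = infer_parents()
print(parents)
```

Only `alice` qualifies: `bob` has a `child` link, but its target is not of type Person, so the class condition is not met.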

Page 9: linked pasts

Do we want to represent every vocabulary as a formal ontology?

It is possible, but not easy

KOS are large

KOS have softer “semantics”

Parent RelatedTerm Child

KOS have a focus on terminological information

Child UsedFor Offspring

Softer semantics can be useful for many applications!

Page 10: linked pasts

Europeana and knowledge organisation systems

Create a “semantic layer” on top of cultural heritage objects

From: Stefan Gradmann

Page 11: linked pasts

Using KOS in the Europeana Data Model

Page 12: linked pasts

Enhanced descriptive metadata

Page 13: linked pasts

Using KOS Linked Data

<skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2251">
  <skos:prefLabel xml:lang="">Harpsichord</skos:prefLabel>
  <skos:prefLabel xml:lang="de">Cembalo</skos:prefLabel>
  <skos:prefLabel xml:lang="sv">Cembalo</skos:prefLabel>
  <skos:prefLabel xml:lang="fr">Clavecin</skos:prefLabel>
  <skos:prefLabel xml:lang="it">Clavicembalo</skos:prefLabel>
  <skos:prefLabel xml:lang="en">Harpsichord</skos:prefLabel>
  <skos:prefLabel xml:lang="nl">Klavecimbel</skos:prefLabel>
  <skos:broader>
    <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2239">
      <skos:prefLabel>Harpsichords</skos:prefLabel>
    </skos:Concept>
  </skos:broader>
</skos:Concept>
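A consuming application could read such a description with any XML or RDF library. The sketch below uses only Python's standard `xml.etree.ElementTree` on a shortened copy of the concept; the enclosing `rdf:RDF` element and the namespace declarations are added here as an assumption, since the slide shows the fragment without them.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Shortened copy of the slide's concept, wrapped in rdf:RDF (wrapper is an assumption).
doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2251">
    <skos:prefLabel xml:lang="de">Cembalo</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">Clavecin</skos:prefLabel>
    <skos:prefLabel xml:lang="en">Harpsichord</skos:prefLabel>
    <skos:broader>
      <skos:Concept rdf:about="http://www.mimo-db.eu/InstrumentsKeywords/2239">
        <skos:prefLabel>Harpsichords</skos:prefLabel>
      </skos:Concept>
    </skos:broader>
  </skos:Concept>
</rdf:RDF>"""

root = ET.fromstring(doc)
concept = root.find("skos:Concept", NS)  # the top-level concept

# Direct prefLabel children only, so the nested broader concept's label is not mixed in.
labels = {
    el.get(XML_LANG): el.text
    for el in concept.findall("skos:prefLabel", NS)
}

broader = concept.find("skos:broader/skos:Concept", NS)
broader_uri = broader.get("{%s}about" % NS["rdf"])

print(labels)
print(broader_uri)
```

Parsing RDF/XML as plain XML only works when the serialization has a predictable shape, as here; a dedicated RDF parser is the safer choice for data from arbitrary sources.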

Page 14: linked pasts

Other types of contextual resources

<gn:Feature rdf:about="http://sws.geonames.org/3176959/">
  <gn:name>Florence</gn:name>
  <gn:alternateName xml:lang="ko">피렌체</gn:alternateName>
  <gn:alternateName xml:lang="ja">フィレンツェ</gn:alternateName>
  <gn:alternateName xml:lang="th">ฟลอเรนซ์</gn:alternateName>
  <gn:alternateName xml:lang="bo">ཧྥུ་ལོ་རོན་ཟིའུ་ཡ།</gn:alternateName>
  <gn:alternateName xml:lang="cy">Fflorens</gn:alternateName>
  <gn:alternateName xml:lang="bs">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="hbs">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="hr">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="sq">Firenca</gn:alternateName>
  <gn:alternateName xml:lang="pl">Firence</gn:alternateName>
  <gn:alternateName xml:lang="sl">Firence</gn:alternateName>
  <gn:alternateName xml:lang="lij">Firense</gn:alternateName>
  <gn:population>371517</gn:population>
  <wgs84_pos:lat>43.76667</wgs84_pos:lat>
  <wgs84_pos:long>11.25</wgs84_pos:long>
</gn:Feature>

Page 15: linked pasts

http://blogs.getty.edu/iris/art-architecture-thesaurus-now-available-as-linked-open-data/

Page 16: linked pasts

Multilingual search

'uurglazen' in Italy

http://europeana.eu/portal/search.html?query=uurglazen&rows=96&qf=COUNTRY%3Aitaly
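The idea behind this kind of multilingual search can be sketched as follows: objects point to a concept URI, and a query in any language is resolved to concepts through their labels. This is a toy illustration, not Europeana's implementation. The labels come from the MIMO harpsichord concept shown earlier in the talk; the object records and the `search` function are invented.

```python
# Labels from the MIMO harpsichord concept; object records below are invented.
CONCEPT = "http://www.mimo-db.eu/InstrumentsKeywords/2251"
pref_labels = {
    CONCEPT: {"de": "Cembalo", "fr": "Clavecin", "it": "Clavicembalo",
              "en": "Harpsichord", "nl": "Klavecimbel"},
}
objects = [  # hypothetical object metadata
    {"id": "obj-1", "country": "italy", "subject": CONCEPT},
    {"id": "obj-2", "country": "france", "subject": CONCEPT},
    {"id": "obj-3", "country": "italy", "subject": None},
]

# Inverted index: lower-cased label in any language -> set of concept URIs.
label_index = {}
for concept_uri, by_lang in pref_labels.items():
    for label in by_lang.values():
        label_index.setdefault(label.lower(), set()).add(concept_uri)

def search(query, country=None):
    """Return ids of objects whose subject concept carries the query as a label."""
    concepts = label_index.get(query.lower(), set())
    return [
        o["id"] for o in objects
        if o["subject"] in concepts and (country is None or o["country"] == country)
    ]

hits = search("klavecimbel", country="italy")
print(hits)
```

A Dutch query term thus retrieves an object held in Italy even though the object's own metadata contains no Dutch text.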

Page 17: linked pasts
Page 18: linked pasts
Page 19: linked pasts

Vocabularies currently provided to Europeana

Page 20: linked pasts

Europeana metadata enrichment

Page 21: linked pasts

Enrichment types and vocabularies

Enrichment type   Target vocabulary   Source metadata fields                              Number of enriched objects
Places            GeoNames            dcterms:spatial, dc:coverage                        7M
Concepts          GEMET, DBpedia      dc:subject, dc:type                                 9.2M
Agents            DBpedia             dc:creator, dc:contributor                          144K
Time              Semium Time         dc:date, dc:coverage, dcterms:temporal, edm:year    10.2M
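In its simplest form, enrichment of this kind amounts to looking up the values of the source metadata fields in the labels of the target vocabulary. The sketch below shows that for places only, under the assumption of exact, case-insensitive label matching; the actual Europeana enrichment rules are not described in these slides. The label table reuses names from the GeoNames example for Florence, and the sample record is hypothetical.

```python
# Tiny label lookup built from the GeoNames example for Florence; a real target
# vocabulary would hold far more entries.
GEONAMES_LABELS = {
    "florence": "http://sws.geonames.org/3176959/",
    "fflorens": "http://sws.geonames.org/3176959/",
    "firenca": "http://sws.geonames.org/3176959/",
}

# Source fields for the "Places" enrichment type, as listed in the table.
PLACE_FIELDS = ["dcterms:spatial", "dc:coverage"]

def enrich_places(record):
    """Return GeoNames URIs whose labels exactly match a value in the place fields."""
    links = []
    for field in PLACE_FIELDS:
        for value in record.get(field, []):
            uri = GEONAMES_LABELS.get(value.strip().lower())
            if uri and uri not in links:
                links.append(uri)
    return links

record = {  # hypothetical object metadata
    "dc:title": ["View of the Arno"],
    "dc:coverage": ["Florence", "16th century"],
}
links = enrich_places(record)
print(links)
```

Because dc:coverage mixes places and dates, the same field feeds both the Places and the Time enrichments; values that match nothing, such as "16th century" here, are simply left alone.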

Page 22: linked pasts

Work in progress

Entity-based search and browsing

Annotation

Pundit @ DM2E project http://dm2e.eu

Europeana Channels

Semantic auto-completion

Page 23: linked pasts

Not only end-user facing functions

Data must be accessible

(Unified) APIs, Linked Data

Data re-users should be able to provide enhanced services to their audience easily, especially in digital humanities

Specific collection and application needs cannot rely on a handful of generic vocabularies

Page 24: linked pasts

Work needed

Page 25: linked pasts

Vocabulary management and publication

Europeana developed its own WWI vocabulary based on a subset of LCSH

Terms translated into 10 languages and linked to id.loc.gov

Page 26: linked pasts

Vocabulary services

http://data.europeana.eu/concept/loc/sh85148236

Page 27: linked pasts

Representing finer-grained semantics

More precise relationships and formal semantics

For query expansion or data validation

E.g. ISO 25964 and Getty SKOS extensions

Page 28: linked pasts

Representing finer-grained semantics

Depth level, concept associations

XKOS

Pre-coordinated strings

MADS/RDF

Page 29: linked pasts

Representing finer-grained semantics?

Finer-grained semantics can be useful, but core models are key

They are what most people will start using

Page 30: linked pasts

The need for alignment / co-reference / reconciliation

KOS 1: animals, cats, wildcats

KOS 2: animal, human, object
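A first, deliberately naive pass at aligning the two vocabularies is to compare normalized labels and treat the hits as candidates for human review. The sketch below assumes a crude plural-stripping rule and hypothetical helper names; it is not one of the alignment tools discussed in this talk.

```python
def normalize(label):
    """Very naive normalization: lower-case and drop a trailing plural 's'."""
    label = label.lower().strip()
    return label[:-1] if label.endswith("s") else label

kos1 = ["animals", "cats", "wildcats"]
kos2 = ["animal", "human", "object"]

def candidate_matches(a, b):
    """Pair labels from a and b that are identical after normalization."""
    index = {normalize(label): label for label in b}
    return [
        (label, index[normalize(label)])
        for label in a
        if normalize(label) in index
    ]

candidates = candidate_matches(kos1, kos2)
print(candidates)
```

Label equality alone does not justify asserting skos:exactMatch: the two concepts may be scoped differently in their vocabularies, so each candidate still needs checking.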

Page 31: linked pasts

A lot of work (being) done

A long line of work in the KOS community: DESIRE, CARMEN, Renardus, LIMBER, HILT, MSAC, MACS, Crisscross, KoMoHe, FAO…

Continued in a Linked Data context: Pleiades, Wikidata…

MACS: 120K links between Library of Congress Subject Headings (LCSH), RAMEAU, Schlagwortnormdatei (SWD)

Page 32: linked pasts

Semantic mismatches

Irish vocabulary

From: Runar Bergheim

Norwegian vocabulary

skos:exactMatch

Page 33: linked pasts

Requires flexible approaches

AMALGAME/CultuurLink:

http://semanticweb.cs.vu.nl/amalgame/

http://cultuurlink.beeldengeluid.nl/

Page 34: linked pasts

Finding and re-using vocabularies

Well-known or new vocabularies

Wikidata, VIAF, Geonames, Pleiades, DBpedia, LCSH…

Data repositories and inventories

The Data Hub

Page 35: linked pasts

Vocabulary selection criteria

Available in a technically appropriate way

Well-maintained

Documented (including metadata)

Well-connected, e.g. equivalent elements in other vocabularies are indicated

Multilingual

Open

• license stacking hampers re-use

Quality assessment?

Cf. Data on the Web Best Practices

http://www.w3.org/TR/2015/WD-dwbp-20150625/#dataVocabularies

Page 36: linked pasts

Take-home messages

Efforts across the whole ecosystem

Publishers of vocabularies, Providers of object data, Application developers, Researchers…

Requires getting very different steps right

Implementing standards for data exchange

Designing consuming applications

Not only technical: encouraging open data!

Page 37: linked pasts

Thank you!

Antoine Isaac
