Metadata Vocabularies and Cultural Heritage: Reconciling static and dynamic views. Gerard Kuys, January 30, 2014

Jun 01, 2015


Gerard Kuys

Advocating the use of the Event class in order to express the dynamics of things happening. This may be particularly useful when connecting concepts across domain boundaries
Transcript
Page 1: 20140130 metadata vocabularies_and_cultural_heritage_final

Metadata Vocabularies and

Cultural Heritage

Reconciling static and dynamic views

Gerard Kuys, January 30, 2014


Page 2: 20140130 metadata vocabularies_and_cultural_heritage_final

The Tower of Progress (*)

(*) Mundaneum illustrations from: Françoise Levie, L’homme qui voulait classer le monde (2006)

Page 3: 20140130 metadata vocabularies_and_cultural_heritage_final

The Tower of Progress

Page 4: 20140130 metadata vocabularies_and_cultural_heritage_final

Why linking data, and not a single frame for them all?

• Throughout the web, there is no single point of truth, although there may well be one for an individual person or organisation

• Accordingly, people and organisations have each organised the description of their universe after their own pattern

• If we are going to link scattered data within scattered vocabularies, do we need to compel them all to fit within a single framework?

• AAA(AAF): anybody can say anything about anything (in Almost Any Fashion)

• But DBpedia can be used as a ‘hub of meanings’, providing a common reference for them all

Page 5: 20140130 metadata vocabularies_and_cultural_heritage_final

Connecting content, or rather: connecting meaningful concepts

• Until now, every type of data collection has had its own way of describing its domain

• How could we construct a common way of describing related domains?

• Like Paul Otlet, we should move from Documents to Bits-of-Knowledge, from the Web of Documents to the Web of Data

• But what current models lack is dynamics, and the DBpedia ontology is no exception

• The ‘collection model’ versus the ‘event model’

• CIDOC and FRBRoo (*) try to bridge this gap by introducing a dual view:

* Static view (what are the entities and artefacts)

* Dynamic view (how did these entities and artefacts come about)

• The Europeana Data Model has quite recently elaborated on events enough to offer a choice between ‘object models’ and ‘event models’

• Could we ever reconcile these two views within the DBpedia ontology and, if so, how?

(*) http://www.cidoc-crm.org/frbr_inro.html

Page 6: 20140130 metadata vocabularies_and_cultural_heritage_final

Why are dynamics so important?

• Emerging trends in cultural productions:

• Interactive additions to content offered

• Semantic storytelling (like the BBC is doing)

• Museums presenting their stuff by way of the ‘journey’ metaphor

• Various interpretations and annotations (‘provenance’ will prove to be crucial)

• Might be a valuable addition to collection modelling as well (e.g., reprints)

• Enriching texts, like publishers do, will bring about ever more versions of a text

• ‘Semantic publishing’


Page 7: 20140130 metadata vocabularies_and_cultural_heritage_final

How to get dynamics into the DBpedia ontology?

Option 1: Incorporating CIDOC-CRM

Page 8: 20140130 metadata vocabularies_and_cultural_heritage_final

How to get dynamics into the DBpedia ontology?

Option 1: Replicating CIDOC-CRM

• DBpedia ontology might borrow some notions from CIDOC CRM

• But replicating it all is not a good idea

• This is Linked Open Data, after all

Page 9: 20140130 metadata vocabularies_and_cultural_heritage_final

CIDOC CRM on Life Cycle Events

The E2 Temporal Entity Hierarchy

Page 10: 20140130 metadata vocabularies_and_cultural_heritage_final

How to get dynamics into the DBpedia ontology?

Option 2: Incorporating a model of Events

Page 11: 20140130 metadata vocabularies_and_cultural_heritage_final

How to get dynamics into the DBpedia ontology?

Option 2: Incorporating a model of Events

Other viable time / calendar models:

• SNaP Event Ontology (http://data.press.net/ontology/event/)

• Schema.org (http://www.schema.org/Event)

• QUDT (http://www.qudt.org/)

• RDF Calendar Workspace (http://www.w3.org/2002/12/cal/)

• LODE (Linked Open Description of Events) (http://linkedevents.org/ontology/)

• Events in the Europeana Data Model (http://pro.europeana.eu/tech-details )

Page 12: 20140130 metadata vocabularies_and_cultural_heritage_final

How to get dynamics into the DBpedia ontology?

Option 2: Incorporating a model of Events

Page 13: 20140130 metadata vocabularies_and_cultural_heritage_final

How to get dynamics into the DBpedia ontology?

Option 2: Incorporating a model of Events

Page 14: 20140130 metadata vocabularies_and_cultural_heritage_final

Now, let’s get practical

• When linking datasets, at what points would we want DBpedia to provide an Event model?

• Definitely not for:

* Vocabulary matching & reconciliation (with SKOS)

* Establishing common identities (owl:sameAs)

• But certainly for:

* Connecting persons to persons

* Connecting persons to objects

* Focusing on the life cycle (‘Werdegang’, the course of someone’s or something’s development)

Page 15: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 1: A.J. van der Aa’s Aardrijkskundig Woordenboek

Page 16: 20140130 metadata vocabularies_and_cultural_heritage_final

A.J. van der Aa’s Aardrijkskundig Woordenboek

• Comprises 14 volumes, published from 1837 to 1851

• Is a historical description of places, from big cities to tiny hamlets

• Connects these places to historical persons and to what they did there

• The Person Index contains references to 22,360 historical persons

* To be corrected for duplicate entries

* To be validated against other sources / datasets

* To be related to persons who have an article (lemma) of their own in Wikipedia and are, therefore, a resource in DBpedia

• And, of course, A.J. van der Aa’s book has its own Wikipedia article as well

Page 17: 20140130 metadata vocabularies_and_cultural_heritage_final

A.J. van der Aa’s Aardrijkskundig Woordenboek

Page 18: 20140130 metadata vocabularies_and_cultural_heritage_final

A.J. van der Aa’s Aardrijkskundig Woordenboek

Page 19: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 1: A.J. van der Aa’s Aardrijkskundig Woordenboek

Page 20: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 1: Do we need Events here?

• No, this is about establishing identities

• Since a lot of these people have no Wikipedia article of their own (nor do they occur in a list), the question is whether to add these data to the DBpedia ontology even though they were not extracted from Wikipedia

• The Reference class proves to be a solid mechanism to mediate between texts and resources

• There is, however, no development and no narrative, so there is no need here to introduce events into the model

Page 21: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 2: Connecting Wikipedia monument links

(*) With thanks to Roland Cornelissen

Page 22: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 2: Connecting Wikipedia monument links

[Diagram. Labels: Wikipedia page; XML version of a book on regional monuments; Concept representing a Monument, e.g. an information resource on an Amsterdam canal mansion; DBpedia Ontology: Work, Annotation, Reference]

Page 23: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 2: Do we need Events here?

• No, this is about concept recognition

• The Reference class again proves to be a solid mechanism to mediate between texts and resources

• There is still no development and no narrative, so there is no need to introduce events into the model

Page 24: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 3: Connecting Van der Aa people to ‘citizens’

• ‘Wie Was Wie’ database: contains data about 18 million people since 1811

• Pivotal dataset for genealogical research

• Based on municipal registers of birth, death, marriage etc. from their very beginning

• Needs to reflect changes in municipal organisation (splits and mergers):

* For that, we mapped Wikipedia lists of former municipalities to a DBpedia class FormerMunicipality

* We still have to implement a periodisation that follows Dutch civil administration and the changes it went through

• We urgently need a time ontology other than the W3C Time Ontology (*): one that is less physical and more cultural, and that includes approximate time spans

(*) http://www.w3.org/TR/owl-time/
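
A minimal sketch of what an approximate time span could look like, to make the last requirement concrete. The class name `ApproxSpan`, its fields, the helper `_ends_before` and the two former-municipality periods are assumptions made for illustration; only the 1811 starting year of the registers comes from the slides. This is not an existing ontology or library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApproxSpan:
    """A period known only by its outer bounds, in years.

    None means the bound is unknown or open-ended.
    """
    earliest_start: Optional[int]
    latest_end: Optional[int]

    def may_overlap(self, other: "ApproxSpan") -> bool:
        """Return False only when the two spans are certainly disjoint."""
        if _ends_before(self, other) or _ends_before(other, self):
            return False
        return True


def _ends_before(a: ApproxSpan, b: ApproxSpan) -> bool:
    """True when a certainly ends before b can have started."""
    if a.latest_end is None or b.earliest_start is None:
        return False
    return a.latest_end < b.earliest_start


# 'Wie Was Wie' covers registers since 1811 (from the slides); no end bound.
register_period = ApproxSpan(earliest_start=1811, latest_end=None)

# Hypothetical former municipalities, not taken from any dataset.
merged_before_registers = ApproxSpan(earliest_start=None, latest_end=1800)
split_during_registers = ApproxSpan(earliest_start=1790, latest_end=1820)
```

With these values, `register_period.may_overlap(split_during_registers)` holds, while the municipality that disappeared by 1800 cannot overlap with the registers. An unknown bound never rules out an overlap, which is the point of treating time spans as approximate.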

Page 25: 20140130 metadata vocabularies_and_cultural_heritage_final

Life-cycle aware ontologies: the A2A Archive model

• being born

• dying

• being wed

• baptism

• divorce

• etc.

Page 26: 20140130 metadata vocabularies_and_cultural_heritage_final

Connecting an Event-driven dataset with a ‘static’ one

540 infants in the town of Goes, 1811-1813

542 mothers, of whom 1 unknown

542 fathers, of whom 63 unknown

76 Van der Aa celebrities related to the town of Goes, … - 1843

Page 27: 20140130 metadata vocabularies_and_cultural_heritage_final

Connecting an Event-driven dataset with a ‘static’ one

540 infants in the town of Goes, 1811-1813

542 mothers, of whom 1 unknown

542 fathers, of whom 63 unknown

76 Van der Aa celebrities related to the town of Goes, … - 1841

1 match, not to the infant Servaas (* April 4, 1811) but to his father, vicar Jacobus de Kanter (called to Goes in 1811)

Page 28: 20140130 metadata vocabularies_and_cultural_heritage_final

Connecting an Event-driven dataset with a ‘static’ one

540 infants in the town of Goes, 1811-1813

542 mothers, of whom 1 unknown

542 fathers, of whom 63 unknown

76 Van der Aa celebrities related to the town of Goes, … - 1841

1 match, not to the infant Servaas (* April 4, 1811) but to his father, vicar Jacobus de Kanter (dismissed from the Goes diaconate in 1811)

Page 29: 20140130 metadata vocabularies_and_cultural_heritage_final

Connecting an Event-driven dataset with a ‘static’ one

540 infants in the town of Goes, 1811-1813

542 mothers, of whom 1 unknown

542 fathers, of whom 63 unknown

76 Van der Aa celebrities related to the town of Goes, … - 1841

1 match, not to the infant Servaas (* April 4, 1811) but to his father, vicar Jacobus de Kanter (dismissed from the Goes diaconate in 1811)

Connection to https://nl.dbpedia.org/resource/Johan_de_Kanter ??

Page 30: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 3: Do we need Events here?

• Yes, official registers tend to focus very much on the act of registration, which is almost an event in itself

* (as is the case when depositing or revoking a will, and with similar formal declarations of a person’s intent)

• Events can be anything, from a person’s birth, to his being called somewhere as a vicar, to a work being published or republished

• The match to vicar De Kanter would have been much more difficult if the Birth Register had been oriented towards single persons (the infant, especially) rather than towards Events with several persons related to them
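
The last point can be illustrated with a small sketch that contrasts a person-centred lookup with an event-centred one. The record layout, the function names and the role labels are assumptions for illustration, not the actual ‘Wie Was Wie’ or A2A schema. The infant’s surname is assumed from the father, and the mother is left unknown because the slides do not name her.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class BirthRegistration:
    """One registered birth, with every person involved in it."""
    birth_date: str
    place: str
    participants: Dict[str, Optional[str]]  # role -> name, None if unknown


Match = Tuple[str, str, str]  # (role, name, birth_date)


def match_infants_only(events: List[BirthRegistration],
                       known_names: Set[str]) -> List[Match]:
    """Person-centred view: only the registered infant can be matched."""
    return [
        ("infant", e.participants["infant"], e.birth_date)
        for e in events
        if e.participants.get("infant") in known_names
    ]


def match_all_participants(events: List[BirthRegistration],
                           known_names: Set[str]) -> List[Match]:
    """Event-centred view: every role in the event can be matched."""
    return [
        (role, name, e.birth_date)
        for e in events
        for role, name in e.participants.items()
        if name in known_names
    ]


register = [
    BirthRegistration(
        birth_date="1811-04-04",
        place="Goes",
        participants={
            "infant": "Servaas de Kanter",  # surname assumed from the father
            "father": "Jacobus de Kanter",
            "mother": None,                 # not given on the slides
        },
    )
]

# Names as they might appear in the Van der Aa person index (illustrative).
van_der_aa_names = {"Jacobus de Kanter"}
```

Here `match_infants_only(register, van_der_aa_names)` finds nothing, whereas `match_all_participants(register, van_der_aa_names)` returns the father. Exact string comparison stands in for real name reconciliation, which would need to be far more tolerant.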

Page 31: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 4: Connecting Across Collections

Page 32: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 4: Do we need Events here?

• Yes, this is the stuff from which narratives and interactions with existing materials are made

• Events can be anything, so we have to think about an ontology of sentiments, not unlike the Sentiment Wortschatz (*), but one that we can apply ourselves when enriching descriptions

• The Europeana Data Model hasMet property could be the container notion, but it would be very much in need of specific subproperties

* Maybe this would go as far as a gotInfatuatedWith subproperty in an interactive history play

(*) http://datahub.io/dataset/sentiws
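
A rough sketch of how a specific subproperty could roll up to the hasMet container notion. Triples are plain Python tuples rather than a real RDF store, and the `ex:` prefix, the character URIs and the function names are hypothetical. Only `hasMet` and `gotInfatuatedWith` come from the slides.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

# Child property -> parent property (an assumed, acyclic hierarchy).
SUBPROPERTY_OF: Dict[str, str] = {
    "ex:gotInfatuatedWith": "edm:hasMet",
}


def superproperties(prop: str) -> Iterator[str]:
    """Yield every ancestor of prop, nearest first."""
    while prop in SUBPROPERTY_OF:
        prop = SUBPROPERTY_OF[prop]
        yield prop


def with_inferred(triples: Iterable[Triple]) -> Set[Triple]:
    """Return the triples plus those implied by the subproperty hierarchy."""
    result = set(triples)
    for subject, prop, obj in list(result):
        for parent in superproperties(prop):
            result.add((subject, parent, obj))
    return result


# One statement from a hypothetical interactive history play.
story = [("ex:characterA", "ex:gotInfatuatedWith", "ex:characterB")]
```

After `with_inferred(story)`, a consumer that only understands `edm:hasMet` still sees that the two characters are connected, while the more specific statement remains available to applications that can use it.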

Page 33: 20140130 metadata vocabularies_and_cultural_heritage_final

Case # 5: Making Linked Open Data Fit for Enrichment

Page 34: 20140130 metadata vocabularies_and_cultural_heritage_final

Semantic Storytelling

Page 35: 20140130 metadata vocabularies_and_cultural_heritage_final

The Potential of Event Modeling - 1

It is time to think about a shift in modeling:

• We are still very much caught up in the thought model of the State Machine: an Event causes a change of state within one or more resources

• This works fine in an environment in which there is but a single process and a single thread of action

• However, we are entering a stage in which various parallel courses of action will coexist, both in future scenarios and in scripts of history

• In all of these, authorship / provenance is of the utmost importance for assessing reliability
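
One way to picture the shift away from a single state machine is to keep a separate timeline per source. In the sketch below, `StateChange`, `replay_per_source`, the `ex:` identifiers and the example accounts are all hypothetical; it is an illustration of the idea, not a proposal for a specific vocabulary.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StateChange:
    """An event that moves one resource into a new state, as claimed by a source."""
    resource: str
    year: int
    new_state: str
    attributed_to: str  # provenance: who asserts that this happened


def replay_per_source(changes: List[StateChange]) -> Dict[str, Dict[str, str]]:
    """Replay the events of each source separately, in chronological order.

    Returns {source: {resource: final_state}}, so that conflicting accounts
    stay side by side instead of overwriting one another.
    """
    timelines: Dict[str, Dict[str, str]] = defaultdict(dict)
    for change in sorted(changes, key=lambda c: c.year):
        timelines[change.attributed_to][change.resource] = change.new_state
    return {source: dict(states) for source, states in timelines.items()}


# Entirely hypothetical accounts of the same resource.
changes = [
    StateChange("ex:manuscript", 1800, "privately owned", "ex:sourceA"),
    StateChange("ex:manuscript", 1850, "held by an archive", "ex:sourceA"),
    StateChange("ex:manuscript", 1850, "lost", "ex:sourceB"),
]
```

A single-threaded state machine would have to pick one outcome for 1850. Keeping the attribution on every event lets both courses of action coexist, and leaves the judgement about reliability to whoever reads the data.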


Page 36: 20140130 metadata vocabularies_and_cultural_heritage_final

The Potential of Event Modeling - 2

To what extent would the DBpedia ontology have to reflect Event-related requirements?

• I would suggest that DBpedia must remain above all a data hub, offering a common point of convergence for vocabularies that are much more refined

• But in order to remain a data hub, the DBpedia ontology must at least accommodate a basic Event model

• ‘Events’ in DBpedia are to be divided into:

* something that happens either in Nature (‘NatureEvent’) or in society (‘SocietalEvent’)

* something that causes a change of state within a resource (‘LifeCycleEvent’)
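
As a sketch only, the basic Event model proposed here could be written down as a small class hierarchy. The three class names are taken from the slide. The fields, and the choice to model `LifeCycleEvent` as a sibling of the other two rather than as an orthogonal facet, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Anything that happens."""
    label: str


@dataclass
class NatureEvent(Event):
    """Something that happens in nature."""


@dataclass
class SocietalEvent(Event):
    """Something that happens in society."""


@dataclass
class LifeCycleEvent(Event):
    """Something that causes a change of state within a resource."""
    resource: str
    state_before: str
    state_after: str


# Hypothetical example: a work being republished.
republication = LifeCycleEvent(
    label="republication",
    resource="ex:someWork",
    state_before="first edition",
    state_after="reprint",
)
```

Anything more refined, such as the life-cycle steps of an archive model, would live in a specialised vocabulary and point back to these few classes, in keeping with the idea of DBpedia as a hub.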
