Intro to PROV Nicholas Car Data Architect [email protected]
Provenance and social science data Nicholas Car - Intro to PROV

Mar 21, 2017

Transcript
Page 1: Provenance and social science data   Nicholas Car - Intro to PROV

Intro to PROV

Nicholas Car, Data Architect

[email protected]

Page 2:

Outline

• What is PROV?

• How do I use PROV: modelling

• How do I use PROV: data management

• How do I use PROV: with other systems

Page 3:

What is PROV?

• W3C Recommendation (standard)

• Completed 2013

• Large number of authors

• The only international provenance standard

• Successor to precursors: PML, OPM

  • Many precursor authors involved

  • Simpler than precursors

• No v2 any time soon

• Authors recommend extending the current standard

• Seeing good adoption

Page 4:

What is PROV?

• A “Family of documents”

• PROV-OVERVIEW – documentation

• PROV-PRIMER – tutorial

• PROV-DM – Data Model

• PROV-O – OWL Ontology version of DM

• PROV-N – special Notation for DM

• PROV-XML – XML encoding of DM

• PROV-CONSTRAINTS – DM constraints

• http://www.w3.org/TR/prov-overview/

Page 5:

How do I use PROV: modelling

Not like this:

Do not describe the lineage of something in the metadata document of that thing

[Figure: an ISO19115 or other standardised Document, with provenance information contained in the document in some provenance field]

Ref: https://geo-ide.noaa.gov/wiki/index.php?title=ISO_Lineage

Page 6:

How do I use PROV: modelling

Not like this:

Do not link a class of something to a provenance object

[Figure: a Data Catalogue Vocabulary (DCAT) class with fields "field 1", "field 2" and "provenance", the last pointing to a Provenance object]

Ref: https://www.w3.org/TR/vocab-dcat/

Page 7:

How do I use PROV: modelling

Not like this:

Do not link a class of something to a provenance object

[Figure: a Data Catalogue Vocabulary (DCAT) class with fields "field 1", "field 2" and "provenance", the last pointing to a Provenance object]

Ref: https://www.w3.org/TR/vocab-dcat/

Not even by using the Dublin Core ‘provenance’ property!

Page 8:

How do I use PROV: modelling

Like this:

Model things you are interested in as either Entities, Agents or Activities and relate them to one another

PROV-DM’s basic classes expressed in a PROV-O style. After https://www.w3.org/TR/prov-o/
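The modelling advice above can be sketched in a few lines of Python. This is only an illustration, not part of the talk: the `Node` class, the `relate` helper and the `ex:` identifiers are hypothetical, while the relation names and their domains and ranges follow the starting-point properties in PROV-O.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A thing of interest, classified as one of PROV's three basic classes."""
    identifier: str
    prov_class: str  # "Entity", "Activity" or "Agent"


# Domain and range of PROV-O's starting-point relations between the classes.
RELATIONS = {
    "used": ("Activity", "Entity"),
    "wasGeneratedBy": ("Entity", "Activity"),
    "wasDerivedFrom": ("Entity", "Entity"),
    "wasAttributedTo": ("Entity", "Agent"),
    "wasAssociatedWith": ("Activity", "Agent"),
    "wasInformedBy": ("Activity", "Activity"),
    "actedOnBehalfOf": ("Agent", "Agent"),
}


def relate(subject: Node, relation: str, obj: Node) -> tuple:
    """Return a (subject, predicate, object) triple, rejecting class mismatches."""
    domain, range_ = RELATIONS[relation]
    if (subject.prov_class, obj.prov_class) != (domain, range_):
        raise ValueError(
            f"{relation} links {domain} to {range_}, "
            f"not {subject.prov_class} to {obj.prov_class}"
        )
    return (subject.identifier, f"prov:{relation}", obj.identifier)


# Hypothetical example: a survey file cleaned by an analyst.
raw = Node("ex:survey_raw", "Entity")
cleaned = Node("ex:survey_cleaned", "Entity")
cleaning = Node("ex:cleaning_run", "Activity")
analyst = Node("ex:analyst", "Agent")

triples = [
    relate(cleaning, "used", raw),
    relate(cleaned, "wasGeneratedBy", cleaning),
    relate(cleaned, "wasDerivedFrom", raw),
    relate(cleaning, "wasAssociatedWith", analyst),
]
```

Classifying each thing first and only then relating it keeps mistakes visible: trying to associate an Entity directly with an Agent via `wasAssociatedWith` raises an error instead of silently producing invalid provenance.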

Page 9:

How do I use PROV: modelling

Like this:

GA’s “process provenance model”

Page 10:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

  • with all the perks of a graph DB!

Page 11:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

  • with all the perks of a graph DB!

A provenance Report generation form for human use in PROMS
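One way to read "a document DB with all the perks of a graph DB" is sketched below. It is an assumption-laden illustration, not the PROMS Report format: the field names, the in-memory `store` dictionary and the `ex:` identifiers are all hypothetical. Each Report is stored whole, as a document, yet its statements can be flattened into triples and queried like a graph.

```python
import json

# Hypothetical Report; the field names are illustrative, not the PROMS format.
report = {
    "id": "ex:report_001",
    "reportingSystem": "ex:survey_pipeline",
    "statements": [
        {"s": "ex:cleaning_run", "p": "prov:used", "o": "ex:survey_raw"},
        {"s": "ex:survey_cleaned", "p": "prov:wasGeneratedBy", "o": "ex:cleaning_run"},
    ],
}

# Document-style storage: each Report is kept whole, keyed by its identifier.
store = {report["id"]: json.dumps(report)}


def triples(store):
    """Flatten every stored Report into (subject, predicate, object) triples."""
    for raw in store.values():
        for statement in json.loads(raw)["statements"]:
            yield (statement["s"], statement["p"], statement["o"])


def objects(store, subject, predicate):
    """Graph-style lookup across all stored Reports."""
    return [o for s, p, o in triples(store) if s == subject and p == predicate]


inputs = objects(store, "ex:cleaning_run", "prov:used")
```

Here `inputs` lists what the hypothetical cleaning run used, gathered across every stored Report rather than from a single one.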

Page 12:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

• For catalogue-like things:

• Add the ability to link Entities, Agents, Activities

[Figure: catalogue entries Dataset X and Dataset Y]

Page 13:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

• For catalogue-like things:

• Add the ability to link Entities, Agents, Activities

[Figure: catalogue entries Dataset X and Dataset Y, classed as Entity X and Entity Y and linked by wasDerivedFrom]
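For a catalogue, linking Entities can be as small as one extra field per record. The sketch below assumes Dataset Y was derived from Dataset X (the direction is not recoverable from the slide text) and adds a hypothetical Dataset Z to show a longer chain. The record layout and the `upstream` helper are illustrative only.

```python
# Hypothetical catalogue: each record lists the datasets it was derived from.
catalogue = {
    "dataset_x": {"title": "Dataset X", "wasDerivedFrom": []},
    "dataset_y": {"title": "Dataset Y", "wasDerivedFrom": ["dataset_x"]},
    "dataset_z": {"title": "Dataset Z", "wasDerivedFrom": ["dataset_y"]},
}


def upstream(catalogue, dataset_id):
    """Follow wasDerivedFrom links and return every dataset reached upstream."""
    found = []
    stack = list(catalogue[dataset_id]["wasDerivedFrom"])
    while stack:
        current = stack.pop()
        if current in found:
            continue  # already visited; also guards against cycles
        found.append(current)
        # Sources missing from the catalogue are kept but not followed further.
        stack.extend(catalogue.get(current, {}).get("wasDerivedFrom", []))
    return found


sources_of_z = upstream(catalogue, "dataset_z")
```

Because the link is stored on the record itself, the catalogue can answer lineage questions without a separate provenance document per dataset.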

Page 14:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

• For catalogue-like things:

• Add the ability to link Entities, Agents, Activities

• Ensure relevant properties align with PROV

[Figure: Dataset X linked to its Creator by a creator property]

Page 15:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

• For catalogue-like things:

• Add the ability to link Entities, Agents, Activities

• Ensure relevant properties align with PROV

[Figure: the creator link between Dataset X and its Creator, restated in PROV terms using Agent, wasAssociatedWith and hadRole]
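Aligning a catalogue's creator property with PROV could look roughly like the sketch below, which emits PROV-N statements as strings. It rests on assumptions: the record layout, the `ex:` prefix and the `create_…` activity are hypothetical. PROV-O defines wasAssociatedWith between an Activity and an Agent, so the sketch introduces a creation activity to carry the role; the slide's diagram may intend a more compact form.

```python
def creator_to_prov_n(record):
    """Restate a catalogue record's creator field as PROV-N statements."""
    dataset = f"ex:{record['id']}"
    agent = f"ex:{record['creator']}"
    activity = f"ex:create_{record['id']}"  # hypothetical creation activity
    return [
        f"entity({dataset})",
        f"agent({agent})",
        f"activity({activity})",
        f"wasGeneratedBy({dataset}, {activity}, -)",
        # The role the agent played is carried as an attribute of the association.
        f'wasAssociatedWith({activity}, {agent}, -, [prov:role="creator"])',
        # Shortcut relation from the dataset straight to the responsible agent.
        f"wasAttributedTo({dataset}, {agent})",
    ]


statements = creator_to_prov_n({"id": "dataset_x", "creator": "jane"})
```

The existing creator field stays in the catalogue; the PROV statements are a second view of the same fact that other PROV-aware systems can consume.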

Page 16:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

• For catalogue-like things:

• Add the ability to link Entities, Agents, Activities

• Ensure relevant properties align with PROV

• For databases:

• Ensure you represent the PROV-DM

Page 17:

How do I use PROV: data management

• For humans, or systems that log things:

• create Reports

• store them in a document DB

• For catalogue-like things:

• Add the ability to link Entities, Agents, Activities

• Ensure relevant properties align with PROV

• For databases:

• Ensure you represent the PROV-DM

• prove it via exporting
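"Prove it via exporting" might be sketched as follows: if a database really represents the PROV-DM, its rows should serialise to a PROV document without extra interpretation. The table names, columns and `ex:` prefix here are hypothetical, and the export covers only entities and derivations, written in PROV-N.

```python
import sqlite3

# Hypothetical schema: one table of datasets, one of derivation links.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset (id TEXT PRIMARY KEY);
    CREATE TABLE derivation (child TEXT, parent TEXT);
    INSERT INTO dataset VALUES ('dataset_x'), ('dataset_y');
    INSERT INTO derivation VALUES ('dataset_y', 'dataset_x');
""")


def export_prov_n(conn):
    """Serialise the two tables as a PROV-N document."""
    lines = ["document", "  prefix ex <http://example.org/>"]
    for (ident,) in conn.execute("SELECT id FROM dataset ORDER BY id"):
        lines.append(f"  entity(ex:{ident})")
    query = "SELECT child, parent FROM derivation ORDER BY child, parent"
    for child, parent in conn.execute(query):
        lines.append(f"  wasDerivedFrom(ex:{child}, ex:{parent})")
    lines.append("endDocument")
    return "\n".join(lines)


prov_n = export_prov_n(conn)
```

The exported text can then be checked with any PROV validator or converter, which is where a mismatch between the schema and the PROV-DM would show up.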

Page 18:

How do I use PROV: with other systems

• PROV & Metadata System X (MSX):

1. Full Alignment: classify all things in MSX in PROV
   o Requires a data model for MSX
   o May have to reconsider some MSX objects
   o Can profile PROV, don’t allow everything

2. Partial Alignment: classify some of MSX in PROV
   o Link classified things only
   o Even link to things outside MSX
   o Need to demo valid PROV-DM

3. Just PROV: interpret/create PROV-only data
   o Deprecate MSX for PROV
   o Or create new data
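Partial alignment (option 2) can be illustrated with a small mapping. Everything in this sketch is hypothetical: the MSX record types, the identifiers and the helper functions are made up to show the idea of classifying only some MSX things in PROV and linking only the classified ones.

```python
# Hypothetical MSX record types mapped to PROV classes.
# Types without a mapping stay outside PROV in a partial alignment.
MSX_TO_PROV = {
    "Dataset": "Entity",
    "Person": "Agent",
    "ProcessingStep": "Activity",
}

records = [
    {"id": "d1", "type": "Dataset"},
    {"id": "p1", "type": "Person"},
    {"id": "k1", "type": "Keyword"},  # not classified in this partial alignment
]
links = [("d1", "p1"), ("d1", "k1")]


def classify(records):
    """Return the PROV class of every record whose MSX type has a mapping."""
    return {
        r["id"]: MSX_TO_PROV[r["type"]]
        for r in records
        if r["type"] in MSX_TO_PROV
    }


def linkable(links, classified):
    """Keep only links whose two ends are both classified in PROV."""
    return [(a, b) for a, b in links if a in classified and b in classified]


classified = classify(records)
prov_links = linkable(links, classified)
```

Full alignment would correspond to a mapping that covers every MSX type, so that no record or link is left out.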

Page 19:

How do I use PROV: data management

Like this:

GA’s “process provenance model”, full version