Mar 21, 2017
Intro to PROV
Nicholas Car
Data [email protected]

Outline
• What is PROV?
• How do I use PROV: modelling
• How do I use PROV: data management
• How do I use PROV: with other systems

What is PROV?
• W3C Recommendation (standard)
• Completed 2013
• Large number of authors
• The only international provenance standard
• Successor to precursors: PML, OPM
  • Many precursor authors involved
  • Simpler than precursors
• No v2 any time soon
• Authors recommend extending the current standard
• Seeing good adoption

What is PROV?
• A “Family of documents”
• PROV-OVERVIEW – documentation
• PROV-PRIMER – tutorial
• PROV-DM – Data Model
• PROV-O – OWL Ontology version of DM
• PROV-N – special Notation for DM
• PROV-XML – XML encoding of DM
• PROV-CONSTRAINTS – DM constraints
• http://www.w3.org/TR/prov-overview/

How do I use PROV: modelling
Not like this:
Do not describe the lineage of something in the metadata document of that thing
[Figure: an ISO 19115 or other standardised document, with the provenance information contained in the document itself, in some provenance field]
Ref: https://geo-ide.noaa.gov/wiki/index.php?title=ISO_Lineage

How do I use PROV: modelling
Not like this:
Do not link a class of something to a provenance object
[Figure: a Data Catalogue Vocabulary (DCAT) record with “field 1”, “field 2” and a “provenance” field linked to a Provenance object. DCAT: https://www.w3.org/TR/vocab-dcat/]
Not even by using the Dublin Core ‘provenance’ property!

How do I use PROV: modelling
Like this:
Model things you are interested in as either Entities, Agents or Activities and relate them to one another
PROV-DM’s basic classes expressed in a PROV-O style. After https://www.w3.org/TR/prov-o/
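The modelling advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Thing` class, the `relate` helper and every `ex:` identifier are hypothetical, and only the class and relation names are taken from PROV-O's starting-point terms.

```python
from dataclasses import dataclass

# The three core PROV-DM classes.
PROV_CLASSES = {"Entity", "Agent", "Activity"}

# PROV-O starting-point relations and the (subject class, object class) each links.
PROV_RELATIONS = {
    "wasGeneratedBy": ("Entity", "Activity"),
    "used": ("Activity", "Entity"),
    "wasDerivedFrom": ("Entity", "Entity"),
    "wasAttributedTo": ("Entity", "Agent"),
    "wasAssociatedWith": ("Activity", "Agent"),
    "actedOnBehalfOf": ("Agent", "Agent"),
    "wasInformedBy": ("Activity", "Activity"),
}


@dataclass(frozen=True)
class Thing:
    """Something of interest, classified as one of the three PROV classes."""
    id: str
    prov_class: str

    def __post_init__(self):
        if self.prov_class not in PROV_CLASSES:
            raise ValueError(f"unknown PROV class: {self.prov_class}")


def relate(subject, relation, obj):
    """Return a (subject, relation, object) triple after checking the classes fit."""
    expected = PROV_RELATIONS[relation]
    if (subject.prov_class, obj.prov_class) != expected:
        raise TypeError(f"prov:{relation} links {expected[0]} to {expected[1]}")
    return (subject.id, f"prov:{relation}", obj.id)


# Hypothetical example: an analyst runs an analysis that turns raw data into a report.
raw = Thing("ex:raw-data", "Entity")
report = Thing("ex:report", "Entity")
analysis = Thing("ex:analysis", "Activity")
analyst = Thing("ex:analyst", "Agent")

graph = [
    relate(analysis, "used", raw),
    relate(report, "wasGeneratedBy", analysis),
    relate(report, "wasDerivedFrom", raw),
    relate(analysis, "wasAssociatedWith", analyst),
    relate(report, "wasAttributedTo", analyst),
]

for s, p, o in graph:
    print(s, p, o)
```

Checking the classes at link time keeps the model honest: trying to say that an Entity `used` an Activity raises an error instead of silently producing invalid provenance.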

How do I use PROV: modelling
Like this:
GA’s “process provenance model”

How do I use PROV: data management
• For humans, or systems that log things:
  • create Reports
  • store them in a document DB
    • with all the perks of a graph DB!
A provenance Report generation form for human use in PROMS
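One way to picture "a document DB with the perks of a graph DB" is a Report stored whole as a document and flattened into triples when a graph-style question is asked. The sketch below assumes a made-up Report layout: the field names and `ex:` identifiers are hypothetical and are not the PROMS Report schema, and a plain dict stands in for the document store.

```python
import json

# A hypothetical provenance Report as a JSON-style document.
# Field names are illustrative only, not the PROMS Report schema.
report = {
    "id": "ex:report-1",
    "title": "Nightly gridding run",
    "activity": {
        "id": "ex:gridding-run-42",
        "startedAtTime": "2017-03-20T22:00:00Z",
        "endedAtTime": "2017-03-20T23:10:00Z",
        "wasAssociatedWith": "ex:gridding-system",
        "used": ["ex:survey-a", "ex:survey-b"],
        "generated": ["ex:grid-2017-03-20"],
    },
}

# Stand-in for a document DB: whole Reports stored and fetched by id.
document_store = {}


def store_report(doc):
    """Store the Report as a serialised document, keyed by its id."""
    document_store[doc["id"]] = json.dumps(doc)


def report_to_triples(doc):
    """Flatten a Report into (subject, predicate, object) triples."""
    act = doc["activity"]
    triples = [(act["id"], "prov:wasAssociatedWith", act["wasAssociatedWith"])]
    triples += [(act["id"], "prov:used", e) for e in act["used"]]
    triples += [(e, "prov:wasGeneratedBy", act["id"]) for e in act["generated"]]
    return triples


def inputs_of(entity_id, triples):
    """Graph-style query: which Entities were used to generate this Entity?"""
    activities = {o for s, p, o in triples
                  if s == entity_id and p == "prov:wasGeneratedBy"}
    return sorted(o for s, p, o in triples
                  if s in activities and p == "prov:used")


store_report(report)
triples = report_to_triples(json.loads(document_store["ex:report-1"]))
print(inputs_of("ex:grid-2017-03-20", triples))
```

The Report stays a self-contained document that a human or a logging system can submit in one go, while the triples derived from it can be merged with those of other Reports to follow lineage across them.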

How do I use PROV: data management
• For humans, or systems that log things:
  • create Reports
  • store them in a document DB
• For catalogue-like things:
  • Add the ability to link Entities, Agents, Activities
[Figure: catalogue items Dataset X and Dataset Y, treated as Entity X and Entity Y and linked by wasDerivedFrom]

How do I use PROV: data management
• For humans, or systems that log things:
  • create Reports
  • store them in a document DB
• For catalogue-like things:
  • Add the ability to link Entities, Agents, Activities
  • Ensure relevant properties align with PROV
[Figure: Dataset X linked to Creator by a “creator” property, shown with its PROV alignment; labels: wasAssociatedWith, Agent Creator, hadRole]
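For a catalogue, the two bullets above amount to reading existing record fields as PROV relations. The sketch below is one possible reading, not the slides' own mapping: the record layout, the `derived_from` field and the `ex:` identifiers are hypothetical, the direction of the derivation is chosen arbitrarily, and `creator` is mapped to PROV-O's direct Entity-to-Agent property `prov:wasAttributedTo` instead of the `wasAssociatedWith` and `hadRole` labels shown in the figure.

```python
# Hypothetical catalogue records; the field names are assumptions of this sketch.
catalogue = [
    {"id": "ex:dataset-x", "creator": "ex:agent-creator", "derived_from": None},
    {"id": "ex:dataset-y", "creator": "ex:agent-creator", "derived_from": "ex:dataset-x"},
]

# Assumed alignment: catalogue field -> (PROV-O property, class of the target).
FIELD_TO_PROV = {
    "creator": ("prov:wasAttributedTo", "prov:Agent"),
    "derived_from": ("prov:wasDerivedFrom", "prov:Entity"),
}


def catalogue_to_prov(records):
    """Express catalogue records as PROV triples, one Entity per record."""
    triples = []
    for record in records:
        triples.append((record["id"], "rdf:type", "prov:Entity"))
        for field, (prop, target_class) in FIELD_TO_PROV.items():
            target = record.get(field)
            if target is None:
                continue  # field not filled in, nothing to link
            triples.append((record["id"], prop, target))
            triples.append((target, "rdf:type", target_class))
    # Drop repeated statements while keeping first-seen order.
    return list(dict.fromkeys(triples))


prov_triples = catalogue_to_prov(catalogue)
for triple in prov_triples:
    print(*triple)
```

Keeping the alignment in one table makes it easy to review which catalogue properties have a PROV reading and which do not.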

How do I use PROV: data management
• For humans, or systems that log things:
  • create Reports
  • store them in a document DB
• For catalogue-like things:
  • Add the ability to link Entities, Agents, Activities
  • Ensure relevant properties align with PROV
• For databases:
  • Ensure you represent the PROV-DM
  • prove it via exporting
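For the database case, "represent the PROV-DM and prove it via exporting" could look like the sketch below: a few relational tables shaped after PROV-DM and a function that writes their content out in PROV-N. The table layout, the `export_prov_n` helper, the `ex:` identifiers and the `http://example.org/` namespace are all assumptions of this example; a real check would pass the exported text to a PROV validator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity   (id TEXT PRIMARY KEY);
    CREATE TABLE activity (id TEXT PRIMARY KEY);
    CREATE TABLE agent    (id TEXT PRIMARY KEY);
    CREATE TABLE used (
        activity_id TEXT NOT NULL REFERENCES activity(id),
        entity_id   TEXT NOT NULL REFERENCES entity(id));
    CREATE TABLE was_generated_by (
        entity_id   TEXT NOT NULL REFERENCES entity(id),
        activity_id TEXT NOT NULL REFERENCES activity(id));
    CREATE TABLE was_associated_with (
        activity_id TEXT NOT NULL REFERENCES activity(id),
        agent_id    TEXT NOT NULL REFERENCES agent(id));

    -- Hypothetical sample data.
    INSERT INTO entity   VALUES ('ex:input'), ('ex:output');
    INSERT INTO activity VALUES ('ex:run1');
    INSERT INTO agent    VALUES ('ex:operator');
    INSERT INTO used                VALUES ('ex:run1', 'ex:input');
    INSERT INTO was_generated_by    VALUES ('ex:output', 'ex:run1');
    INSERT INTO was_associated_with VALUES ('ex:run1', 'ex:operator');
""")


def export_prov_n(db):
    """Serialise the tables as a PROV-N document ('-' marks an omitted argument)."""
    lines = ["document", "  prefix ex <http://example.org/>"]
    # Table names double as the PROV-N keywords for the three classes.
    for table in ("entity", "activity", "agent"):
        for (id_,) in db.execute(f"SELECT id FROM {table} ORDER BY id"):
            lines.append(f"  {table}({id_})")
    for a, e in db.execute(
            "SELECT activity_id, entity_id FROM used ORDER BY 1, 2"):
        lines.append(f"  used({a}, {e}, -)")
    for e, a in db.execute(
            "SELECT entity_id, activity_id FROM was_generated_by ORDER BY 1, 2"):
        lines.append(f"  wasGeneratedBy({e}, {a}, -)")
    for a, ag in db.execute(
            "SELECT activity_id, agent_id FROM was_associated_with ORDER BY 1, 2"):
        lines.append(f"  wasAssociatedWith({a}, {ag}, -)")
    lines.append("endDocument")
    return "\n".join(lines)


prov_n = export_prov_n(conn)
print(prov_n)
```

If every row in the database ends up as a statement in the export, and the export validates, the schema can reasonably be said to represent PROV-DM.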

How do I use PROV: with other systems
• PROV & Metadata System X (MSX):
1. Full Alignment – Classify all things in MSX in PROV
   o Requires a data model for MSX
   o May have to reconsider some MSX objects
   o Can profile PROV, don’t allow everything
2. Partial Alignment – Classify some of MSX in PROV
   o Link classified things only
   o Even link to things outside MSX
   o Need to demo valid PROV-DM
3. Just PROV – Interpret/create PROV-only data
   o Deprecate MSX for PROV
   o Or create new data
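Option 2, partial alignment, can be sketched as a lookup that gives PROV classes to some MSX record types and refuses to link anything left unclassified. Everything named here is hypothetical: the MSX record types, the `msx:` identifiers, the external URL and the `classify` and `link` helpers are made up for illustration.

```python
# Hypothetical MSX record types that have been given a PROV class.
# Types missing from this table stay outside PROV (partial alignment).
MSX_TO_PROV = {
    "dataset": "prov:Entity",
    "organisation": "prov:Agent",
    "processing_job": "prov:Activity",
}

msx_records = [
    {"id": "msx:ds1", "type": "dataset"},
    {"id": "msx:job7", "type": "processing_job"},
    {"id": "msx:kw3", "type": "keyword"},  # not classified in PROV
]


def classify(records):
    """Map record ids to PROV classes, skipping unclassified record types."""
    return {r["id"]: MSX_TO_PROV[r["type"]]
            for r in records if r["type"] in MSX_TO_PROV}


def link(classified, subject, prop, obj):
    """Link two classified things; refuse anything without a PROV class."""
    for thing in (subject, obj):
        if thing not in classified:
            raise ValueError(f"{thing} has no PROV class, so it cannot be linked")
    return (subject, prop, obj)


classified = classify(msx_records)

# Things outside MSX can be linked too, once they are given a PROV class.
external = "http://example.org/external/source-data"
classified[external] = "prov:Entity"

links = [
    link(classified, "msx:job7", "prov:used", external),
    link(classified, "msx:ds1", "prov:wasGeneratedBy", "msx:job7"),
]
for triple in links:
    print(*triple)
```

Demonstrating valid PROV-DM would additionally mean checking each link against the classes its property allows, which this sketch leaves out.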

How do I use PROV: data management
Like this:
GA’s “process provenance model”, full version