UK e-Science All Hands Meeting 2005 Paul Groth, Simon Miles, Luc Moreau


Outline

Process Documentation for Provenance
Power of the P-Structure
P-assertion Recording Protocol
PReServ’s Functionality
Performance
Pitch


Provenance

The Provenance Question
– Lots of definitions…
– Boil it down to a question.
– What is the process that led to a particular result?

How do we answer this question?
– Search through documentation.


Documentation

Process Documentation
– encompasses all other documentation

SOA-based model of process
Actors communicate via message passing
Actors make ASSERTIONS to document process. Termed p-assertions.

How to organise these p-assertions?
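
A minimal sketch of the model on this slide, written in Python for brevity. The `PAssertion` and `Actor` classes and all of their fields are hypothetical names chosen for illustration; they are not the PASOA schema. Two actors exchange a message, and each one asserts its own side of the exchange.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PAssertion:
    """One assertion by an actor about a step of the process (hypothetical fields)."""
    asserter: str        # actor making the assertion
    interaction_id: str  # identifies the message exchange being documented
    content: str         # what the actor claims was sent or received


@dataclass
class Actor:
    name: str
    assertions: List[PAssertion] = field(default_factory=list)

    def send(self, receiver: "Actor", interaction_id: str, message: str) -> None:
        # The sender documents its side of the exchange...
        self.assertions.append(
            PAssertion(self.name, interaction_id, f"sent: {message}"))
        # ...and the receiver documents its own side independently.
        receiver.receive(interaction_id, message)

    def receive(self, interaction_id: str, message: str) -> None:
        self.assertions.append(
            PAssertion(self.name, interaction_id, f"received: {message}"))


client = Actor("client")
service = Actor("service")
client.send(service, "i-1", "compress(sequence)")
```

After the call, `client.assertions` and `service.assertions` each hold one p-assertion about interaction `i-1`, which is the raw material that then needs organising.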


P-Structure


P-Structure View


Benefits

Domain independent queries
– That are provenance specific
P-structure is a shared logical organisation of p-assertions
Does not prescribe how p-assertions are exactly stored in an implementation.
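
The last two points can be illustrated with a small sketch. It assumes, purely for illustration, that p-assertions are grouped by an interaction identifier; the tuple layout and the function names are hypothetical and are not taken from the p-structure definition. The query sees only the logical view, so the same code works whether the assertions come from a list, a file or a database cursor.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (interaction_id, asserter, content): a hypothetical, simplified p-assertion.
PAssertion = Tuple[str, str, str]


def logical_view(assertions: Iterable[PAssertion]) -> Dict[str, List[PAssertion]]:
    """Group p-assertions by interaction, regardless of where they are stored."""
    view: Dict[str, List[PAssertion]] = defaultdict(list)
    for assertion in assertions:
        view[assertion[0]].append(assertion)
    return dict(view)


def asserters_of(view: Dict[str, List[PAssertion]], interaction_id: str) -> List[str]:
    """A domain-independent query: who documented a given interaction?"""
    return sorted({a[1] for a in view.get(interaction_id, [])})


stored = [
    ("i-1", "client", "sent request"),
    ("i-1", "service", "received request"),
    ("i-2", "service", "sent response"),
]
view = logical_view(stored)
print(asserters_of(view, "i-1"))  # ['client', 'service']
```

Nothing in `asserters_of` mentions proteins, workflows or any other application concept, which is the sense in which such a query is domain independent.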


PReP

Introduces the Provenance Store
– A separate entity for maintaining process documentation
PReP specifies how an actor can communicate with the Provenance Store.
PReP has a number of nice properties.
– Statelessness
– Idempotence
– Termination
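
To make the idempotence property concrete, here is a minimal in-memory sketch. It is not PReP itself: the `ProvenanceStore` class, the `record` method and the identifier scheme are assumptions made for illustration. Each record message carries everything the store needs, so no per-client session is kept, and re-sending the same message leaves the store unchanged.

```python
from typing import Dict, Tuple


class ProvenanceStore:
    """Hypothetical in-memory store keyed by (interaction id, asserter, local id)."""

    def __init__(self) -> None:
        self._assertions: Dict[Tuple[str, str, str], str] = {}

    def record(self, interaction_id: str, asserter: str,
               local_id: str, content: str) -> str:
        key = (interaction_id, asserter, local_id)
        # Recording the same p-assertion twice is a no-op, so a client that
        # did not receive an acknowledgement can simply resend.
        self._assertions.setdefault(key, content)
        return "ack"

    def __len__(self) -> int:
        return len(self._assertions)


store = ProvenanceStore()
store.record("i-1", "client", "1", "sent request")
store.record("i-1", "client", "1", "sent request")  # retry after a lost ack
print(len(store))  # 1
```

Because a retry is harmless, a client can resend until it sees an acknowledgement without risking duplicate documentation.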


An Implementation

What is PReServ?
– A Web Services implementation of a Provenance Store
– Implements
  • PReP for recording
  • XQuery for querying
– Provides libraries and wrappers for making applications provenance aware.


PReServ Implementation Diagram

[Figure not reproduced. Labelled components: Web Service, WS Client, Query Actor WS, PS Client Side Library (×3), AxisHandler (×2), Provenance Store, Backend Store Interface, Backend Stores (DatabaseStore, In-MemoryStore, …). Legend: WS Calls, Java Calls.]


Implementation cont.

[Figure not reproduced. Labels as extracted: Backend Store Interface; Java Object Database Memory …; Store Plug In, Query Plug In, …; Dispatcher; SOAP Msg (×2).]

Caching mechanism to improve performance
Berkeley Java Database 2.0
  • No setup required
  • Completely Transactional
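
The plug-in arrangement named on this slide could look roughly like the following. This is a sketch in Python rather than PReServ's Java, and every class and function name in it (`BackendStore`, `InMemoryStore`, `Dispatcher`, `handle`, `store_plugin`, `query_plugin`) is hypothetical. A dispatcher routes each incoming message to a plug-in by message type, and the plug-ins talk only to the backend store interface.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class BackendStore(ABC):
    """Interface the plug-ins depend on; concrete stores can be swapped freely."""

    @abstractmethod
    def put(self, item: str) -> None: ...

    @abstractmethod
    def all(self) -> List[str]: ...


class InMemoryStore(BackendStore):
    def __init__(self) -> None:
        self._items: List[str] = []

    def put(self, item: str) -> None:
        self._items.append(item)

    def all(self) -> List[str]:
        return list(self._items)


Plugin = Callable[[BackendStore, str], object]


class Dispatcher:
    """Routes a message to the plug-in registered for its type."""

    def __init__(self, backend: BackendStore) -> None:
        self._backend = backend
        self._plugins: Dict[str, Plugin] = {}

    def register(self, message_type: str, plugin: Plugin) -> None:
        self._plugins[message_type] = plugin

    def handle(self, message_type: str, body: str) -> object:
        if message_type not in self._plugins:
            raise ValueError(f"no plug-in for {message_type!r}")
        return self._plugins[message_type](self._backend, body)


def store_plugin(backend: BackendStore, body: str) -> str:
    backend.put(body)
    return "ack"


def query_plugin(backend: BackendStore, body: str) -> List[str]:
    # Stand-in for a real query engine: a plain substring match.
    return [item for item in backend.all() if body in item]


dispatcher = Dispatcher(InMemoryStore())
dispatcher.register("record", store_plugin)
dispatcher.register("query", query_plugin)
dispatcher.handle("record", "client sent request")
print(dispatcher.handle("query", "request"))  # ['client sent request']
```

Adding new functionality then means registering another plug-in, and changing storage means passing a different `BackendStore` to the dispatcher.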


Requirements

Apache Tomcat 5.0
Apache Ant 1.6.2
Java 1.5 (1.4 supported with some help)
Pure Java, tested on
– Windows
– Mac OS X
– Debian Linux


Evaluation Deployment

Protein Compressibility Experiment
– HPDC’05
Workflow runs under VMWare
– deployment consistency
– ease of development
Workflow is executed on one machine
PReServ runs on another machine
– Version 0.1.5 of PReServ


Record Performance


Query Performance


Applications


Conclusion

The p-structure allows for domain independent, provenance specific queries using XQuery.

Both recording and query times are linear

PReServ has an extensible architecture allowing for further functionality to be easily added.


Download! Try it out!

Download PReServ 0.2:
– The AHM release
– Released under Open Source MIT License
www.pasoa.org
– Click software
Contact us, we will try to help you make your application provenance-aware.


Configuration

Redhat Linux 9.1 on VMWare on Windows XP
Pentium P4 2.8 GHz
1.5 GB RAM
PReServ on another machine
– Database backend Berkeley JDB
100 Mb local ethernet