[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, Eindhoven]

Jun 19, 2015

3TU.Datacentrum

3TU.Datacentrum Symposium Research Data Management:
Funder requirements, Questions and Solutions

At this symposium, the funding organisation NWO and the European Commission explained their vision, plans and requirements. Researchers from the three universities of technology shared their experiences of data management in different stages of research, and the Research Data Services team informed the audience about the research data management services offered by 3TU.Datacentrum.

The 3TU.Datacentrum symposium took place at the TU Delft (26 May), University of Twente (2 June) and TU Eindhoven (11 June) for and with local researchers.

More information on: datacentrum.3tu.nl/over-3tudatacentrum/symposium-2014
Transcript
Page 1: [3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, Eindhoven]

SHARE

"beyond data sharing: sharing research software in a dedicated cloud"

Pieter Van Gorp, Information Systems, [email protected]

@pvgorp

Page 2:
Page 3:
Page 4:

Related Initiatives: Experimental Software & Toolkits (EST)

• “In physics or chemistry papers about experiments contain a lot of technical details in order to facilitate other researchers to replay the experiments in order to validate the results described in these papers”

• “More and more computer scientists use the Open Source community to distribute their tools. In this way it is not necessary to reimplement tools, only to download and install them.”

M.G.J. van den Brand. Guest editor's introduction: Experimental software and toolkits (EST). Science of Computer Programming, 69(1-3):1-2, 2007.

Page 5:

Structure of Talk

Background (TTC aims and process)

A world without SHARE

SHARE functionality

Adoption of SHARE

Ongoing Work

Conclusions

Page 6:

Reviewing Software Artifacts: Hot Topic

• ECOOP
• ESEC/FSE 2011 (Zeller, Krishnamurthi and Ghezzi):
  • out of 34 submissions, 16 indicated they would submit an artifact and 14 actually did so
  • Of the 14 submissions, 7 met or exceeded expectations, while the rest, sadly, did not.
• Special Issues in Science of Computer Programming

Presentation notes: Jan Vitek, Erik Ernst, and Shriram Krishnamurthi run AEC 2013
Page 7:

Success Story: TTC’s Collaborative Transformation Engineering Research

Page 8:

Step 13: Statistical Analysis

Page 9:

Study Execution

1. Calling for cases
2. Selecting cases
3. Calling for solutions
4. Discussing during solution building
5. Submission of solutions (as executable paper in )
6. Committee Review of solutions
7. Non-Blind Peer review
8. Presentation of solution
9. (Presentation of opponents)
10. Evaluation of solution (using online forms)
11. Resubmission of solutions (as executable paper in )
12. Committee Review of solutions
13. One journal paper per case: one author per solution, plus “core team”
    • Statistical analysis of (10)
    • Reflection

Page 10:

Part 2: SHARE

Page 11:

• Effort to install tools
• Multiple download locations
• Conflicts with OS
  − Personal preferences
  − Linux kernel (gcc, ...)
  − XP service packs (.net framework, ...)
  − ...
• Conflicts with specific version (Eclipse plugin hell...)
• Legacy
• Effort to retrieve input data
• What about closed source, licensed contributions?

A world without … Limitations of Open Source

Page 12:

Effort Reader: Download & Install...

Volume Editor: Workflow? Future proof result?

Reviewer: Download & Install...

Author: Make installer, documentation

SHARE saves time and worries

Presentation notes: Concerning AUTHOR, mention JETI etc. here.
Page 13:

Demonstrating Software: Levels of Accessibility

1) Not Accessible
2) Accessible After Request
3) Available Online, Manual Installation
4) Available Online, Manual Configuration
5) Available Online, Fully Configured

Cloud Computing

Virtualization

Ad-hoc resource reservation

Page 14:

SHARE in a Nutshell

Page 15:

Core feature

Page 16:

SHARE: Typical User Walkthrough

[Figure: walkthrough in five numbered steps (1 to 5)]

Page 17:

SHARE

Page 18:

Usage Data

[Figure: #VM sessions/month, February 2009 to April 2013 (y-axis from 0 to 250), with TTC13 marked]

Page 19:

1. First Prize [Kraków, Poland]: The Collage Authoring Environment
2. Second Prize [TUE]: SHARE
3. Third Prize [Stanford]: A Universal Identifier for Computational Results

Evaluation Criteria

• Project quality
• Usefulness to the user
• Innovation/vision
• Scope
• Feasibility (publishing workflow integration)

Page 20:

SHARE

Page 21:

Ongoing Work: ScienceDirect Integration

Paper source analysis => provide tabular overview in app

App can reuse existing RSS-based querying of SHARE database
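
As a rough illustration of turning an RSS query result into a tabular overview, the sketch below parses a generic RSS 2.0 document into rows. The sample feed, its titles and links, and the function names `rss_to_rows` and `format_table` are all hypothetical; the actual layout of the SHARE feed is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 sample; the real SHARE feed layout is an unknown here.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example VM listing</title>
    <item>
      <title>Example VM A</title>
      <link>http://example.org/vm/a</link>
    </item>
    <item>
      <title>Example VM B</title>
      <link>http://example.org/vm/b</link>
    </item>
  </channel>
</rss>"""


def rss_to_rows(feed_xml):
    """Return one (title, link) tuple per <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        rows.append((title.strip(), link.strip()))
    return rows


def format_table(rows):
    """Render rows as a plain-text table with a header line."""
    width = max([len("Title")] + [len(title) for title, _ in rows])
    lines = ["{}  {}".format("Title".ljust(width), "Link")]
    for title, link in rows:
        lines.append("{}  {}".format(title.ljust(width), link))
    return "\n".join(lines)
```

Calling `print(format_table(rss_to_rows(SAMPLE_FEED)))` prints a header line followed by one line per feed item; an app could render the same rows as an HTML table instead.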

Page 22:

• Example: http://dx.doi.org/10.1016/j.scico.2013.12.007

ScienceDirect<>SHARE today

Currently links to share20.eu/… and can be easily adapted

Page 23:

Ongoing Work: Opening the Sandboxes

• Current Limitation:
  • Published SHARE VMs have no internet access
  • Content can only be uploaded and updated in SHARE
  • For VMs with open source content, simple download should be possible too

• Promising extension:
  • Provide one specific folder for storing VM files that are allowed to be downloaded
  • VMs would still not have internet access, but files could be downloaded via portal
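
One way a portal might enforce the "one specific folder" rule is to resolve every requested path and refuse anything that falls outside that folder. The following is a minimal sketch under that assumption, not the SHARE implementation; the function name `resolve_download` and the folder `/srv/export` are hypothetical.

```python
from pathlib import Path


def resolve_download(export_root, requested):
    """Return the resolved path if it lies inside export_root, else None.

    Only containment is checked; whether the file exists is left to the caller.
    """
    root = Path(export_root).resolve()
    # Joining with an absolute `requested` discards `root`, so absolute
    # paths outside the folder are rejected by the check below as well.
    candidate = (root / requested).resolve()
    # Reject the folder itself and anything that escapes it, e.g. via "..".
    if root not in candidate.parents:
        return None
    return candidate
```

With this check, `resolve_download("/srv/export", "results/data.csv")` yields a path inside the folder, while `"../secret.txt"` and `"/etc/passwd"` both yield `None`.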

Page 24:

Page 25:

Conclusions & Invitations

• easily reproducing/extending each other’s results
• holistic preservation

• Invitations:
  • Evaluate and apply SHARE
  • Archive SHARE .VDIs in a data repository