The Reality of Reproducibility in Computational Science
reproduce? repeat? rerun? does it matter?
Prof Carole Goble FREng FBCS, The University of Manchester, UK
[email protected]
Based on:
e-Science 2012, Chicago, October 2012: https://dl.dropbox.com/u/617206/eScience-2012-GOBLE-release-nonotes.ppt
JCDL 2012, Washington DC, June 2012: https://dl.dropbox.com/u/617206/JCDL%20Goble%20Final%20Clean-nobigbird.ppt
Scholarly Communication Workshop, 14-15 January 2013, Pittsburgh, USA
Computational Methods Scientific workflows. In the wild. Distributed web/grid/cloud services Cyber-Infrastructure
Social Methods: Sharing and Exchange e-Laboratories for scientific artefacts. Libraries, Repositories and Catalogues for data, models, web services, workflows, scripts, SOPs…
Knowledge Management Semantic technology, semantic applications, Linked Open Data, research objects, executable papers, publishing
Software Engineering Software Sustainability Institute Open Middleware Infrastructure Institute, S/W and Data Policy Institutional Repository
Systems Biology
Chemistry
Astro-Physics
Astronomy
Biology
Social Science
Digital Libraries
Preservation
Biodiversity
Public Health
Products Methods Applications
Research Objects
Nanopublications
Systems Biology data, models and SOPs
Service and Workflows
Data, Service and Workflows
Reproducibility a principle of the scientific method
Evidence to test and justify claims
Comparison of results and methods
Peer review
http://xkcd.com/242/
“An experiment is reproducible until another laboratory tries to repeat it.” – Alexander Kohn

The Reproducibility Initiative
Reproducibility as a Service PLoS, FigShare
http://reproducibilityinitiative.org
Data Journals / Repositories
In silico (Computational) Science
Datasets Data collections Algorithms Configurations Tools and Apps Codes Workflows Scripts Code Libraries Services, Infrastructure, Compilers Hardware
Simulations, data exploration, data processing, analytics, database based, text mining, auto recommendation, visual analytics…(Digital Science = Science)
Science 13 April 2012: 336(6078) 159-160 DOI: 10.1126/science.1218263
DASTY Ensembl Browser JWS Online Simulator
Specialist Codes Libraries, Platforms, Tools
Service based Science
(Cloud) Hosted Services
Cytoscape
Commodity Platforms
Data Collections Catalogues Software
Repositories
My Data My Process My Codes My Libraries My Special Tweaks
Compound Assemblies: Workflows (see Tom Moritz talk)

Execution
Multi-step coordinated execution of (distributed) computational components Repeatable and comparative Explicated computation
Virtual Witnessing / Minute-Taking Transparent, precise, citable documentation Accurate logs Reusable protocols, know-how, best practice
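The "explicated computation" idea above can be sketched in a few lines: each step of a workflow is a named, logged function, so a run is both repeatable and auditable. The step names and data here are illustrative only, not part of any real workflow system.

```python
# A minimal sketch of an explicated workflow: named steps, run in
# order, with a log entry per step ("virtual minute-taking").
import json
import time


def run_workflow(steps, data):
    """Run (name, fn) steps in sequence, logging each execution."""
    log = []
    for name, fn in steps:
        started = time.time()
        data = fn(data)
        log.append({"step": name,
                    "seconds": round(time.time() - started, 6),
                    "output_preview": repr(data)[:60]})
    return data, log


# Two toy components standing in for distributed services.
steps = [
    ("normalise", lambda xs: [x / max(xs) for x in xs]),
    ("threshold", lambda xs: [x for x in xs if x > 0.5]),
]
result, log = run_workflow(steps, [2, 4, 8])
print(result)                      # [1.0]
print(json.dumps(log, indent=2))   # the citable record of the run
```

Because the log is data, not prose, it can be archived alongside the result as a reusable protocol.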
Replicate / Repeat Exactly replicate the original experiment and experimental conditions. Eliminate change. Observe.
Reproduce Run the experiment with differences in experimental conditions. Compare to test for the same result. Observe.
Capture Curate Discover Use Reuse Preserve
Reproduce Between Labs
Repeat Within Lab
Reproduce
Replicate Repeat
Verify
Recreate results without existing code or data, independently.
Re-run to determine the sensitivity of results when underlying measurements are retaken
Regenerate results from existing code, data.
(Re)examine accuracy, wrt underlying model (Verify), or data (model error, measurement error) (Validate)
Adapted from V. Stodden, “Trust Your Science? Open Your Data and Code!” Amstat News, 1 July 2011. http://magazine.amstat.org/blog/2011/07/01/trust-your-science/
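The repeat/reproduce distinction can be made concrete with a toy experiment: a repeat fixes every condition (here, the random seed) and expects an identical result; a reproduction varies the conditions and asks only for agreement within tolerance. The seeds and tolerance are illustrative choices.

```python
# Sketch: repeat = identical conditions, identical result;
# reproduce = varied conditions, comparable result.
import random


def experiment(seed, n=1000):
    """Estimate the mean of n uniform draws from a seeded RNG."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(n)) / n


original = experiment(seed=42)

# Repeat: eliminate change -> bitwise-identical output.
assert experiment(seed=42) == original

# Reproduce: different conditions -> not identical, but the same
# result to within a (generous, illustrative) tolerance.
replication = experiment(seed=7)
assert replication != original
assert abs(replication - original) < 0.1
```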
Re*<verb> Bingo
Fix and Compare
Vary and compare
Review the Record
[adapted from Watson and Missier]
Decay: Reproduce, Repeat, Replicate
[Quadrant figure: version control (data, services, workflows) vs no version control; open service sets (external dependencies, mixed environments, "workflows in the wild") vs closed service sets (complete control over services, single environments, enclaves). Mitigations: detect and repair; dependency snapshots; community workflows; virtual machines and deployed codes (prevent).]
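The "dependency snapshots" mitigation can be sketched by pinning a data dependency to its content hash, so a later re-run can detect silent decay. The file name is hypothetical.

```python
# Illustrative dependency snapshot: record a SHA-256 fingerprint of
# an input file at publication time, check it at re-run time.
import hashlib
from pathlib import Path


def fingerprint(path):
    """SHA-256 of a file's bytes: identical content, identical hash."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


# At publication time: record the hash alongside the workflow...
Path("inputs.csv").write_text("a,b\n1,2\n")
recorded = fingerprint("inputs.csv")

# ...at re-run time: check the dependency has not drifted.
assert fingerprint("inputs.csv") == recorded, "input data has changed"
```

The same idea extends to service descriptions and code: anything hashed at release time can be checked, rather than trusted, on replay.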
Reproducible Research Systems: there are many emerging (time for “standards”?)
• ID it to Cite It: ORCID (people), DOI (data, models, tools ...)
• Tracking: local helper systems to instrument and track provenance
• Science as a Service: Virtual Machines, Cloud Appliances, hosted platforms that deploy on your behalf, no installations, common platforms (e.g. Galaxy)
• Libraries and Repositories: with rich documentation
• Publish: executable papers, companion web sites, embedded notebooks/publishing, active publications
• Explication of experimental mechanics: pipelines, workflows, script systems with documentation, common tools/languages (e.g. MATLAB)
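The "local helper systems" bullet can be sketched as a small wrapper that instruments a function call and records minimal provenance: inputs, output, and a code identity. The hash of the compiled bytecode is an illustrative stand-in for a real version-control revision.

```python
# A hedged sketch of a local tracking helper: wrap a function so
# every call appends a provenance record to a shared log.
import hashlib


def tracked(fn, log):
    """Return fn wrapped with provenance recording."""
    # Illustrative code identity; a real system would record a VCS
    # commit hash or release tag instead.
    code_id = hashlib.sha256(fn.__code__.co_code).hexdigest()[:12]

    def wrapper(*args):
        result = fn(*args)
        log.append({"function": fn.__name__, "code_id": code_id,
                    "inputs": args, "output": result})
        return result
    return wrapper


provenance = []


def mean(xs):
    return sum(xs) / len(xs)


mean = tracked(mean, provenance)
print(mean([1, 2, 3]))              # 2.0
print(provenance[0]["function"])    # mean
```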
Reuse Use the explained and trusted results (data, method) for new / my science on demand. Compare. Extend.
Is it “true”? Can I repeat it?
Can I use it? Can I reproduce it?
Method Provenance (link data and code)
Data Method Documentation
Method Execution
Snapshot State Available
Replay
Recreate
Altered State Available
Rerun Repeat
Reproduce with new Data
Reproduce with new Method
Repair Documented Provenance Of State
Recover Repurpose
Reuse Review
Good enough To Verify
Drummond C, Replicability is not Reproducibility: Nor is it Good Science (online); Peng RD, Reproducible Research in Computational Science, Science, 2 Dec 2011: 1226-1227; De Roure D, http://www.scilogs.com/eresearch/replacing-the-paper-the-twelve-rs-of-the-e-research-record/
Method
Reproduce Method
Extend
2. Reproducibility is a Spectrum
Replicate
Reproducibility is a Spectrum. Partial reproducibility – over proprietary steps or difficult-to-reproduce subparts, or just through examining the log.
“perfect is the enemy of the good” Voltaire
3. Reproducibility through Inspection: Archived Record to Manage
[Figure: two provenance traces of the same workflow. (i) Trace A: inputs d1, d2 through steps S0, S1 (w), S2 (y) to S4, producing df. (ii) Trace B: inputs d1', d2 through S0, S1 (z, w), S'2 (y') to S4, producing df'. Comparing the traces step by step localises where the runs diverge.]
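Inspecting archived records can be sketched as a step-by-step diff of two traces of the same workflow, in the spirit of the Trace A / Trace B comparison; the step names and values here are hypothetical.

```python
# Sketch: locate the first step of two archived run traces whose
# recorded outputs differ. A trace is a list of (step, output) pairs.
def first_divergence(trace_a, trace_b):
    """Return the first step whose recorded output differs, else None."""
    for (step_a, out_a), (_step_b, out_b) in zip(trace_a, trace_b):
        if out_a != out_b:
            return step_a
    return None


trace_a = [("S0", "d1"),  ("S1", "w"), ("S2", "y"),  ("S4", "df")]
trace_b = [("S0", "d1'"), ("S1", "w"), ("S2", "y'"), ("S4", "df'")]
print(first_divergence(trace_a, trace_b))  # S0
```

Even without re-executing anything, this kind of log inspection supports the "partial reproducibility" described above.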
http://www.wf4ever-project.org/research-object-model
Log, Fix, Replay, Analyse -> Instrument Systems and Apps
[Woodman, et al, 2011]
W3C PROV
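W3C PROV's core relations (prov:used, prov:wasGeneratedBy between entities and activities) can be illustrated with plain triples; this toy structure is a stand-in for a real PROV serialisation, and the file and run names are hypothetical.

```python
# A hand-rolled sketch of PROV-style provenance as subject-predicate-
# object triples, queried to answer "what was this derived from?"
prov = [
    ("run_1",   "rdf:type",            "prov:Activity"),
    ("raw.csv", "rdf:type",            "prov:Entity"),
    ("out.csv", "rdf:type",            "prov:Entity"),
    ("run_1",   "prov:used",           "raw.csv"),
    ("out.csv", "prov:wasGeneratedBy", "run_1"),
]


def derived_from(entity, triples):
    """Inputs an entity was derived from, via its generating activity."""
    activities = [o for s, p, o in triples
                  if s == entity and p == "prov:wasGeneratedBy"]
    return [o for s, p, o in triples
            if s in activities and p == "prov:used"]


print(derived_from("out.csv", prov))  # ['raw.csv']
```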
4. Reproducibility by Invocation: Active Instrument to Maintain
• Active Preservation: preservation vs just-in-time, just-enough restoration/reconstruction. The natural state is broken.
• Stop Publishing, Start Releasing: software release practices for workflows and scripts, services, data and articles [Schopf, JCDL 2012]
• Librarianship, Stewardship and Best Practices of Everything: “Better Science through Superior Software” – C. Titus Brown; Zeeya Merali, Nature 467, 775-777 (2010) doi:10.1038/467775a
Data Stewardship
Software sustainability Software practices Software deposition Long term access to software Credit for software Software Journals Licensing Open Source Software
Best Practices for Scientific Computing http://arxiv.org/abs/1210.0530 Stodden, Reproducible Research Standard, Intl J Comm Law & Policy, 13 2009 Prlić A, Procter JB (2012) Ten Simple Rules for the Open Development of Scientific Software. PLoS Comput Biol 8(12): e1002802. doi:10.1371/journal.pcbi.1002802
Software “Better Science through Superior Software” – C Titus Brown
Open does not mean understandable.
José Enrique Ruiz (IAA-CSIC)
Galaxy Luminosity Profiling

Why? Make it Matter. Trade, Asset and Curation economics
What? Numerous standards: formats, terminologies and checklists
When? Incremental, Eager and Lazy, UpStream, Downstream
• Reproducibility for the 95%
• Bottom up, not just top down
• “Weak” reproducibility is better than none at all and could be enough.
Acknowledgements and Inspirations • David De Roure • Tim Clark • Sean Bechhofer • Robert Stevens • Christine Borgman • Victoria Stodden • Marco Roos • Jose Enrique Ruiz del Mazo • Oscar Corcho • Anton Güntsch • Cherian Mathew • Ian Cottam • Steve Pettifer
• Wf4ever, SysMO, BioVel, UTOPIA and myGrid teams
• Robin Williams • Pinar Alper • C. Titus Brown • Greg Wilson • Juliana Freire • Jill Mesirov • Simon Cockell • Paolo Missier • Paul Watson • Gerhard Klimeck • Matthias Obst • Jun Zhao • Pinar Alper • Daniel Garijo • Yolanda Gil
Further Information • myGrid
– http://www.mygrid.org.uk • Taverna
– http://www.taverna.org.uk • myExperiment
– http://www.myexperiment.org • BioCatalogue
– http://www.biocatalogue.org • SysMO-SEEK
– http://www.sysmo-db.org • MethodBox
– http://www.methodbox.org.uk • Rightfield
– http://www.rightfield.org.uk • UTOPIA Documents
– http://www.getutopia.com • Wf4ever
– http://www.wf4ever-project.org • Software Sustainability Institute