What is Reproducibility? The R* Brouhaha
Professor Carole Goble
The University of Manchester, UK
Software Sustainability Institute UK
Alan Turing Institute Symposium: Reproducibility, Sustainability and Preservation
6-7 April 2016, Oxford, UK
1. For Every Result, Keep Track of How It Was Produced
2. Avoid Manual Data Manipulation Steps
3. Archive the Exact Versions of All External Programs Used
4. Version Control All Custom Scripts
5. Record All Intermediate Results, When Possible in Standardized Formats
6. For Analyses That Include Randomness, Note Underlying Random Seeds
7. Always Store Raw Data behind Plots
8. Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
9. Connect Textual Statements to Underlying Results
10. Provide Public Access to Scripts, Runs, and Results
Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
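Two of the rules above, keeping track of how a result was produced and noting random seeds, can be illustrated with a small sketch. This is not taken from the talk or from Sandve et al.; the function names `run_analysis` and `provenance_record`, the toy half-sample analysis, and the record fields are all assumptions chosen for illustration.

```python
import hashlib
import json
import random


def run_analysis(values, seed):
    """Hypothetical toy analysis: the mean of a random half-sample of `values`."""
    rng = random.Random(seed)  # seeded generator, so the same sample can be redrawn
    sample = rng.sample(values, k=len(values) // 2)
    return sum(sample) / len(sample)


def provenance_record(values, seed, result):
    """Bundle a fingerprint of the input, the seed, and the result (illustrative fields)."""
    payload = json.dumps(values).encode("utf-8")
    return {
        "input_sha256": hashlib.sha256(payload).hexdigest(),
        "seed": seed,
        "result": result,
    }


data = list(range(100))
first = run_analysis(data, seed=42)
second = run_analysis(data, seed=42)  # same seed, same input: same result
record = provenance_record(data, 42, first)
print(json.dumps(record, indent=2))
```

Storing a record like this next to each output makes it possible to check later that a rerun used the same input and the same seed.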
Record Everything
Automate Everything
Scientific publications have two goals: (i) to announce a result and (ii) to convince readers that the result is correct.
Papers in experimental science should describe the results and provide a clear enough protocol to allow successful repetition and extension.
Papers in computational science should describe the results and provide the complete software development environment, data and set of instructions which generated the figures.
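As a rough sketch of what recording part of the software environment might look like, the snippet below writes out the interpreter version, the platform, and the installed package versions. It uses only the Python standard library; the function name `environment_manifest` and the manifest layout are assumptions for illustration, and a manifest like this covers only a fraction of a complete environment.

```python
import json
import platform
import sys
from importlib import metadata


def environment_manifest():
    """Collect interpreter, platform, and installed package versions (illustrative layout)."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


manifest = environment_manifest()
print(json.dumps(manifest, indent=2))
```

The manifest could be archived alongside the data and the scripts that generated the figures, so a reader can see which versions were in use.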
Virtual Witnessing*
*Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985) Shapin and Schaffer.
Why? Jill Mesirov
David Donoho
Datasets, Data collections
Standard operating procedures
Software, algorithms
Configurations, Tools and apps, services
“an experiment is reproducible until another laboratory tries to repeat it.”
Alexander Kohn
reviewers want additional work
statistician wants more runs
analysis needs to be repeated
post-doc leaves, student arrives
new data, revised data
updated versions of algorithms/codes
sample was contaminated
What is Reproducibility?
Why, When, Where, Who for, Who by, How
Special thanks to
• C Titus Brown
• Juliana Freire
• David De Roure
• Stian Soiland-Reyes
• Barend Mons
• Tim Clark
• Daniel Garijo
• Wf4Ever and Research Object teams
• Dagstuhl Seminar 16041
• Force11 http://www.force11.org