Varsha Khodiyar, PhD
Data Curation Editor, Scientific Data
Nature Publishing Group
@varsha_khodiyar
@scientificdata
Tweet with #SDJPN16
Gaining credit for sharing research data
Data publishing with Scientific Data
RIKEN Center for Life Science Technologies
4th March 2016
My background
• Joined Scientific Data in October 2014
• Professional data curator since 2003
• PhD in Molecular Biology from the University of Leicester
• Contributed to the Human Genome Project as a member of the HUGO Gene Nomenclature Committee (HGNC)
• Gene Ontology curator for 8 years, at University College London, UK
• 3 years of open data publishing experience
Why share research data?
Generating research data is expensive
Just 18.1% of NIH grant applications were funded in 2014*
• Hours spent writing grants?
• Hours spent reviewing grants?
Resources are finite/expensive
• Modified animals
• Specialized reagents
Time and effort taken in the laboratory to generate good, valid data
* report.nih.gov/success_rates/Success_ByIC.cfm
Irreproducibility of published science
Figure 1 - Ioannidis JPA. et al. Repeatability of published microarray gene
Rathi V, Dzara K, Gross CP, Hrynaszkiewicz I, Joffe S, Krumholz HM, Strait KM, Ross JS: Sharing of clinical trial data among trialists: a cross sectional survey. BMJ 2012;345:e7570
• Sharing de-identified data via repositories should be required (236 respondents, 74%)
• Investigators should share de-identified data on request (229 respondents, 72%)
…clinical data producers have specific concerns
Example initiatives for sharing clinical data
Yale Open Data Access (YODA) & Clinical Study Data Request (CSDR) projects:
• Data Use Agreements (DUAs)
• Controlled access environment
• Scientific validity of reanalysis checked
• Independent governance
• Data anonymisation checks
• Identify repositories able to archive clinical data
• Work with identified repositories to establish workflows for peer review and publication, whilst maintaining patient privacy
• Facilitate a specialist peer review process for clinical data, for example by ensuring that peer reviewers have agreed to the terms of the data use agreement
Hrynaszkiewicz, I., Khodiyar, V., Hufton, A. & Sansone, S. A. Publishing descriptions of non-public clinical datasets: guidance for researchers, repositories, editors and funding organisations. bioRxiv http://dx.doi.org/10.1101/021667 (2015).
A robust data-on-request workflow?
[Workflow diagram; labels:]
• Published Data Descriptor with clinical data
• Data Records section details how to access the data
• Links to restricted access data
• Data Citations link to repository
• Data files requiring permission to access
• Freely accessible data files
Data Reuse stories
Data reuse by (some of) the same researchers
Data reuse by other researchers in the same field
“The Data Descriptor made it easier to use the data, for me it was critical that everything was there…all the technical details like voxel size.”
Professor Daniele Marinazzo
According to Google Scholar, cited 43 times! (February 2016)