| 1
Anita de Waard
VP Research Data Collaborations
Elsevier RDMS
NFAIS Annual Conference, Philadelphia, PA
February 21, 2016
The Rocky Road To Reuse: Encouraging infrastructures to promote data integration and reuse
| 2
Source: JISC: How and why you should manage your research data: a guide for researchers
Caroline Ingram, Published: 7 January 2016
Research Data Life Cycle
| 4
Save and store: Data Rescue Award
https://www.elsevier.com/physical-sciences/earth-and-planetary-sciences/the-2015-international-data-rescue-award-in-the-geosciences
| 6
The first Reproducibility Paper was published recently:
http://www.sciencedirect.com/science/article/pii/S0306437915301113
and is linked to this paper:
http://www.sciencedirect.com/science/article/pii/S0306437915000472
The data is hosted here: https://data.mendeley.com/datasets/xz6gv65m6d/6
To reproduce the experiment, the journal requires source code for the software components,
together with installation scripts, and we suggest that authors host their code on GitHub (see
software publication project). In addition to the source code, we recommend that authors
submit a virtual machine in which all appropriate software components are readily installed, so
that the experiment can be reproduced on a wide variety of platforms. Authors are to submit
their experiments using either ReproZip or Docker.
Publish Reproducible Formats
| 7
https://data.mendeley.com/datasets/xz6gv65m6d/6
Linked to published papers – or not
Linked to GitHub – or not
Versioning and provenance
Manage, Store: Mendeley Data
| 8
Share and Publish, Today:
• Supplementary data at PANGAEA
• Bidirectional links between PANGAEA & ScienceDirect
• Data visualized next to the article
http://www.elsevier.com/databaselinking
| 9
Share and Publish, Tomorrow:
• ICSU/WDS/RDA Publishing Data Service Working Group
• Currently creating a linked-data model for exposing DOI-to-DOI links outside the publisher’s firewall
• Merged with a National Data Service pilot with the same goal
• Collaboration between CrossRef, DataCite, Europe PubMed Central, ANDS, Thomson Reuters, Elsevier
• About to deliver: http://dliservice.research-infrastructures.eu/#/api
Objective: move from a plethora of (mostly) bilateral arrangements between the different players…
to… a one-for-all cross-referencing service for articles and data
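A minimal sketch of what a shared cross-referencing index could look like, as opposed to bilateral arrangements: every party contributes DOI-to-DOI links to one index, and anyone can ask what is linked to a given DOI. This is not the working group's actual model or API; the class names, the `provider` field, and the DOIs are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source_doi: str  # e.g. an article DOI
    target_doi: str  # e.g. a dataset DOI
    provider: str    # who asserted the link (hypothetical field)


class LinkIndex:
    """One shared index of links, queryable from either end."""

    def __init__(self):
        self._by_doi = defaultdict(set)

    def add(self, link):
        # DOIs are case-insensitive, so index them in lowercase.
        self._by_doi[link.source_doi.lower()].add(link)
        self._by_doi[link.target_doi.lower()].add(link)

    def linked_to(self, doi):
        """Return the DOIs linked to `doi`, in either direction."""
        doi = doi.lower()
        found = set()
        for link in self._by_doi.get(doi, ()):
            if link.source_doi.lower() == doi:
                found.add(link.target_doi.lower())
            else:
                found.add(link.source_doi.lower())
        return sorted(found)


# Made-up DOIs: a publisher and a repository each contribute one link.
index = LinkIndex()
index.add(Link("10.1234/article.1", "10.5678/data.a", "publisher"))
index.add(Link("10.5678/data.a", "10.1234/article.2", "repository"))
print(index.linked_to("10.5678/data.a"))
```

Because links are indexed under both DOIs, a link asserted by a repository is visible from the article side and vice versa, which is the point of a single service.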
| 10
[Diagram: Researchers, Funding Agency, Institution, Data Repository, Dataset, Journal, Paper; numbered arrows match the steps below]
1. Researcher creates datasets
2. Researcher writes paper & publishes in journal
3. (Sometimes) dataset gets posted to repository
4. Researcher reports (post-hoc) to Institution and Funder
Share and Publish, Current Status:
| 11
[Diagram: Researchers, Funding Agency, Institution, Data Repository, Dataset, Journal, Paper; arrows numbered 1-4 as in the current status, annotated with the issues below]
i. Too much work for researchers
ii. Data posting not mandatory
iii. No link between data and paper
iv. Funders/Institutions informed as an afterthought
Share and Publish, Issues:
| 12
[Diagram: Researchers, Funding Agency, Institution, Data Repository, Dataset, Journal, Paper; numbered arrows match the steps below]
1. Researcher creates datasets and posts them to a repository (under embargo)
2. Funder is automatically notified of dataset publication
3. Researcher writes paper & publishes in journal; embargo is lifted and data linked
- NB: this also allows release of unused data for negative results and reproducibility
4. Funder and institution get a report on publication and embargo lifting
i. Less Work!
ii. More Data Stored!
iii. Better Linking!
iv. Better Tracking!
Share and Publish, Proposal:
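The four proposed steps can be sketched as a small workflow, assuming a single system that holds the dataset record and sends the notifications. This is an illustration only, not an existing service; `Dataset`, `Workflow`, and the message texts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    dataset_id: str
    embargoed: bool = True          # datasets start under embargo
    paper_id: Optional[str] = None  # set once the paper is published


@dataclass
class Workflow:
    notifications: List[str] = field(default_factory=list)

    def post_dataset(self, dataset_id):
        # Steps 1-2: dataset is posted under embargo and the funder
        # is notified automatically, with no extra work for the researcher.
        dataset = Dataset(dataset_id)
        self.notifications.append(
            f"funder: dataset {dataset_id} posted (embargoed)"
        )
        return dataset

    def publish_paper(self, dataset, paper_id):
        # Step 3: publication lifts the embargo and links data to paper.
        dataset.embargoed = False
        dataset.paper_id = paper_id
        # Step 4: funder and institution both get a report.
        for recipient in ("funder", "institution"):
            self.notifications.append(
                f"{recipient}: paper {paper_id} published; "
                f"embargo lifted on {dataset.dataset_id}"
            )


workflow = Workflow()
dataset = workflow.post_dataset("dataset-1")
workflow.publish_paper(dataset, "paper-1")
for message in workflow.notifications:
    print(message)
```

The design choice worth noting is that reporting is a side effect of posting and publishing rather than a separate post-hoc task.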
| 13
Cite:
https://www.elsevier.com/connect/data-citation-is-becoming-real-with-force11-and-elsevier
| 15
[Diagram: data search architecture; labels recovered from the figure]
Repository access types (shown for three repositories): Federated; Poor API; Rich API; FTP & Index
Data Enrichment: Manual; Automated
(User) Intent; Ranking; Filtering (how to mix federated & indexed, rich & poor); Search
Rendering: Search all data; Faceted query/Results refinement; Store & Use results
General UI; Domain UI
Filtering; Feeding user signals back into Search ranking; Evaluation
Birds of a Feather on Data Search: https://rd-alliance.org/bof-data-search.html
DESIRE: Networks of Discovery
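The diagram raises the question of how to mix federated and indexed results. One simple option, shown here only as a sketch and not as the approach proposed in the talk, is round-robin interleaving: take the top result from each source in turn and skip duplicates. The function name and the result identifiers are hypothetical.

```python
from itertools import zip_longest


def interleave(*ranked_lists):
    """Round-robin merge of ranked result lists, dropping duplicates."""
    seen = set()
    merged = []
    # zip_longest pads shorter lists with None, which is skipped below.
    for tier in zip_longest(*ranked_lists):
        for item in tier:
            if item is not None and item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


# Made-up dataset identifiers from two sources, each already ranked.
federated_results = ["ds-a", "ds-b", "ds-c"]
indexed_results = ["ds-b", "ds-d"]
print(interleave(federated_results, indexed_results))
```

Interleaving avoids comparing relevance scores across sources, which matters when a source with a poor API returns no scores at all.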
| 16
Source: JISC: How and why you should manage your research data: a guide for researchers
Caroline Ingram, Published: 7 January 2016
Research Data Life Cycle
Electronic Lab Notebooks
Software Publication
Data repositories
DataSearch
Data Linking and Publishing
Data Citation
| 17
https://www.elsevier.com/connect/10-aspects-of-highly-effective-research-data
A Maslow Hierarchy for Research Data:
| 18
Networks of Collaboration:
Force11:
- Multi-stakeholder, member-driven organisation
- Unites scholars, tool developers, librarians, publishers, funding agencies etc.
- E.g. Software citation group, akin to Data Citation Group
- Will present at Force16 in Portland, OR, April 17-19, 2016
National Data Service:
- Multi-stakeholder group, based around supercomputing centres
- Aims to be a ‘connective tissue’ between data creation, curation, storage etc. projects
- Inviting Pilots: two or more partners who have not worked together, interested in
collaborating on a data-centric project to solve a real-world need; can include software
sharing
- E.g. Datasearch, Data Linking systems
RDA:
- Co-leading Data publishing, linking group
- Co-leading Cost Recovery group
- Active in Chemistry, Earth Science groups
- Starting BoF Data Search
| 19
• https://www.hivebench.com
• https://www.elsevier.com/physical-sciences/earth-and-planetary-sciences/the-2015-international-data-rescue-award-in-the-geosciences
• http://www.journals.elsevier.com/softwarex/
• https://www.elsevier.com/books-and-journals/content-innovation/data-base-linking
• https://rd-alliance.org/groups/rdawds-publishing-data-services-wg.html
• https://rd-alliance.org/bof-data-search.html
• https://data.mendeley.com/
• https://www.elsevier.com/connect/10-aspects-of-highly-effective-research-data
• https://www.force11.org/
• http://www.nationaldataservice.org/
• https://rd-alliance.org/
• https://www.elsevier.com/about/open-science/research-data
Anita de Waard, [email protected]
Thank you! Questions?