Literature/data integration and Ryan Scherle Data Repository Architect Dryad Digital Repository HighWire Fall Publishers’ Meeting November 20, 2013 You may reuse any of the original content in these slides as you wish, provided you attribute the source

Dec 27, 2015


Eileen Bryant
Transcript
Page 1:

Literature/data integration and

Ryan Scherle
Data Repository Architect
Dryad Digital Repository

HighWire Fall Publishers’ Meeting
November 20, 2013

You may reuse any of the original content in these slides as you wish, provided you attribute the source

Page 2:

CC-BY-NC-SA nic221 http://www.flickr.com/photos/nic221/391536867/

Page 3:

Bumpus HC (1898) The Elimination of the Unfit as Illustrated by the Introduced Sparrow, Passer domesticus. Biological Lectures from the Marine Biological Laboratory: 209-226.

CC-BY Adamo http://www.piqs.de/fotos/121272.html

Page 4:
Page 5:

Who cares if the data is lost?

By Agrant141 (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

James Cook, portrait by Nathaniel Dance-Holland, c. 1775, National Maritime Museum, Greenwich

Page 6:

Source: Publishing Research Consortium, http://publishingresearch.net (n=3824)

Who cares if the data is lost?

Page 7:

Data “available upon request”

Wicherts and colleagues requested data from 141 articles in American Psychological Association journals.

“6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied

Wicherts JM, Borsboom D, Kats J, Molenaar D (2006) doi:10.1037/0003-066X.61.7.726

Page 8:

Fighting data entropy

[Figure: Information Content (vertical axis) plotted against Time (horizontal axis). Labels on the plot: Time of publication, Specific details, General details, Accident, Retirement or career change, Death.]

(Michener et al. 1997)

Page 9:

Funder policies

US funding agencies that require or strongly recommend data sharing:

o CDC
o DOD
o DOE
o EPA
o NASA
o NIH
o NIST
o NOAA
o NSF
o USDA

Page 10:

Joint data archiving policy

Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future.

As a condition for publication, data supporting the results in the article should be deposited in an appropriate public archive.

Authors may elect to embargo access to the data for a period up to a year after publication.

Exceptions may be granted at the discretion of the editor, especially for sensitive information.

http://datadryad.org/pages/jdap
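The embargo clause of the policy can be read as a simple date rule. The following is a minimal sketch, assuming the embargo is counted in calendar years from the publication date; the function names are hypothetical and are not part of any Dryad software.

```python
from datetime import date


def latest_release_date(publication_date: date) -> date:
    """Return the last day data may stay embargoed: one year after publication.

    Assumption: "up to a year" is counted in calendar years. A publication
    date of Feb 29 is mapped to Feb 28 of the following year.
    """
    try:
        return publication_date.replace(year=publication_date.year + 1)
    except ValueError:
        # Only raised for Feb 29 when the following year is not a leap year.
        return publication_date.replace(year=publication_date.year + 1, day=28)


def is_under_embargo(today: date, publication_date: date, embargo_chosen: bool) -> bool:
    """True while an author-elected embargo is still in force."""
    if not embargo_chosen:
        return False
    return today < latest_release_date(publication_date)


if __name__ == "__main__":
    published = date(2013, 11, 20)
    print(latest_release_date(published))
    print(is_under_embargo(date(2014, 6, 1), published, embargo_chosen=True))
```

Exceptions granted by an editor for sensitive information fall outside this rule and would need separate handling.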

Page 11:

Impact factor and archiving policies

[Figure: impact factor and archiving policies, n=70. Values shown: IF=3.6, IF=4.5, IF=6.0.]

Piwowar HA, Chapman WW (2008) hdl:10101/npre.2008.1700.1

Page 12:

Data archiving landscape

There are so many data repositories that we need directories of them:

o http://re3data.org
o http://DataBib.org

These repositories vary along many dimensions:
o Datatype focus
o Community focus
o Allowed file sizes
o Curation policies
o Data access policies
o Funding model

Page 13:

Data archiving landscape

[Figure: repositories plotted on two axes, Datatype Focus (General to Focused) and Community Focus (General to Focused). Repositories shown: Figshare, Institutional Repository, Supplemental Materials, Genbank, Pangaea, Zenodo, Lab Database, Dryad.]

Page 14:

Dryad vs supplementary materials

Feature | Dryad | SOM
Discoverable: indexed and exposed to both web and bibliographic search engines | ✔ | ✗
Identifiable: DataCite DOIs within articles serve as permanent, resolvable identifiers | ✔ | ✗*
Permanent: processes in place to promote preservation (incl. format migration) | ✔ | ✔/✗**
Curated: quality control by both automated processes and human inspection | ✔ | ✗*
Ease of deposit: streamlined deposit, allowance for large and complex datasets | ✔ | ✔/✗**
Formatted for reuse: do not convert reusable formats to PDF | ✔ | ✔/✗**
Updatable: new versions of data files can be added, metadata can be enhanced | ✔ | ✗
Support for embargoes: can delay release of data in accordance with journal policy | ✔ | ✗
Free reuse: no paywall, clear terms of reuse (all data released under CC Zero) | ✔ | ✔/✗**
Support for large files: allow data files up to 10GB | ✔ | ✗
Economy of scale: cost efficiency from shared infrastructure | ✔ | ✔/✗**
Alignment to organizational mission: focus on archiving and reuse of scientific data | ✔ | ✗

* A few publisher SOM sites are exceptions to the general rule
** Practices differ among publishers, see Smit (2011), doi:10.1045/january2011-smit
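The "Identifiable" row refers to DOIs as resolvable identifiers. As a minimal sketch of what "resolvable" means in practice, the snippet below turns a DOI string, such as the ones cited in these slides, into a URL on the public doi.org resolver. The helper name `doi_to_url` is hypothetical.

```python
from urllib.parse import quote

# Public DOI resolver; any DOI appended to this prefix redirects to its target.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Convert a DOI, with or without a leading 'doi:' label, to a resolver URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[len("doi:"):]
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    # DOIs taken from the citations in this deck.
    print(doi_to_url("doi:10.1045/january2011-smit"))
    print(doi_to_url("doi:10.1037/0003-066X.61.7.726"))
```

Because the identifier rather than the hosting URL is what an article cites, the link keeps working if the data moves to a new location.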

Page 15:

What makes Dryad unique

1. Tight focus on data associated with published literature

2. Data packages are curated

3. Open development process allows broad participation

4. Nonprofit organization managed by stakeholders

Page 16:

Dryad features

Quick and easy submission process…

Page 17:

Dryad features

…referencing authoritative sources…

Page 18:

Dryad features

…and leveraging integration with journals…

Page 19:

Dryad features

…to maximize the submitter’s valuable time.

Page 20:

Page 21:

Data citations

Best practice is to cite both the article and the data – they are both useful research products

But limit data citations to one data package per article – this eliminates most concerns about the size/granularity of data files

Page 22:

Page 23:

Materials and Methods

References

Page 24:
Page 25:
Page 26:
Page 27:
Page 28:

Dryad uptake

>4,000 data packages containing >12,000 files associated with articles in 275 journals

200 submissions each month and growing

Some data packages have been downloaded more than 10,000 times

Fewer than 10% of authors choose to embargo their data when this option is allowed by the journal

Page 29:

Price schedule

Plan | Member | Non-member | Minimum Purchase
Voucher | $65 per data package | $70 per data package | 25 vouchers
Deferred Payment | $70 per data package | $75 per data package | 1 year contract
Subscription | annual fee based on $25 per published research article | annual fee based on $30 per published research article | 2 year contract
Pay on submission | N/A | $80 per data package, payable by the submitter | 1 data package
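To compare the plans in the price schedule above, the arithmetic can be sketched as follows. The per-unit prices come from the slide. Two points are assumptions made only for illustration: the voucher minimum of 25 is applied to each year's purchase, and the subscription fee is computed on every research article the journal publishes in the year. The function name `annual_cost` is hypothetical.

```python
# Per-unit prices in US dollars, copied from the price schedule slide.
PRICES = {
    "voucher": {"member": 65, "non_member": 70},       # per data package
    "deferred": {"member": 70, "non_member": 75},      # per data package
    "subscription": {"member": 25, "non_member": 30},  # per published research article
}
VOUCHER_MINIMUM = 25  # minimum voucher purchase (assumed here to apply per year)


def annual_cost(plan: str, member: bool, data_packages: int, articles: int) -> int:
    """Estimate what a journal pays in one year under a given plan."""
    if plan not in PRICES:
        raise ValueError(f"unknown plan: {plan}")
    unit_price = PRICES[plan]["member" if member else "non_member"]
    if plan == "subscription":
        # Fee scales with articles published, not with data packages deposited.
        return unit_price * articles
    if plan == "voucher":
        # Vouchers are bought in advance, at least 25 at a time.
        return unit_price * max(data_packages, VOUCHER_MINIMUM)
    return unit_price * data_packages  # deferred payment


if __name__ == "__main__":
    # Hypothetical member journal: 100 articles per year, 30 with data packages.
    for plan in PRICES:
        print(plan, annual_cost(plan, member=True, data_packages=30, articles=100))
```

Under these assumptions, per-package plans cost less when only a minority of articles have associated data, while a subscription becomes attractive as the share of articles with data packages grows.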

Page 30:

Sponsoring open data

Publishers, societies, and other organizations are now sponsoring deposits in 44 journals:

Functional Ecology
Heredity
Journal of Heredity
Systematic Biology
The American Naturalist
Ecological Monographs
Proceedings A
Proceedings B
Journal of Ecology
Interface Focus
Plant Physiology
The Plant Cell
Open Biology
Ecology and Evolution
Evolutionary Applications
eLife
Evolution
Elementa
Palaeontology
MycoKeys
Comparative Cytogenetics
Subterranean Biology
Nature Conservation
NeoBiota
PhytoKeys
ZooKeys
Paleobiology
Biodiversity Data Journal
BioRisk
Molecular Ecology
Molecular Ecology Resources
GMS German Medical Science
GMS Medizinische Informatik, Biometrie und Epidemiologie
Special Papers in Palaeontology
Journal of Evolutionary Biology
Journal of the Royal Society Interface
Journal of Applied Ecology
Journal of Animal Ecology
Methods in Ecology and Evolution
The Journal of Paleontology
Journal of Hymenoptera Research
Philosophical Transactions A
Philosophical Transactions B

Page 31:

In development…

Added value for journals, including a data display widget and a dashboard for editors

Page 32:

Integrated article & data submission

Key functionality
o Makes data deposition simple for authors (once files are prepared)
o Ensures permanent link to data within each article (and vice versa)

Options are customized to meet journal policies
o Data can be submitted prior to manuscript review or upon acceptance
o Journals may allow authors the option of embargoing data for 1 year after publication

Page 33:

To learn more

Repository home: http://datadryad.org
News: http://blog.datadryad.org
Twitter: @datadryad

Ryan Scherle, [email protected]
