DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th

Page 1: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

DataCite – Persistent links to scientific data

Jan Brase, DataCite – TIB

1st PRELIDA workshop, Pisa, June 26th

Page 2: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

High visibility of the content

Easy re-use and verification.

Scientific reputation for the collection and documentation of content (Citation Index)

Encouraging the Brussels declaration on STM publishing

Avoiding duplications

Motivation for new research

What if any kind of scientific content would be citable?

Page 3: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Digital Object Identifiers (DOI names) offer a solution

Most widely used identifier for scientific articles

Researchers, authors, publishers know how to use them

Put datasets on the same playing field as articles

Dataset: Yancheva et al. (2007). Analyses on sediment of Lake Maar. PANGAEA. doi:10.1594/PANGAEA.587840

URLs are not persistent

(e.g. Wren JD: URL decay in MEDLINE- a 4-year follow-up study. Bioinformatics. 2008, Jun 1;24(11):1381-5).

DOI names for citations
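A DOI name becomes a clickable, persistent link once it is placed behind a resolver. The following is a minimal sketch of that step, assuming the DOI proxy http://dx.doi.org mentioned elsewhere in these slides; the function name `doi_to_url` is illustrative and not part of any DataCite tool.

```python
from urllib.parse import quote

# Assumption: the DOI proxy named in the slides.
RESOLVER = "http://dx.doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.1594/PANGAEA.587840' (or a bare DOI name) into a resolver URL."""
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:]
    # Every DOI name starts with the directory code '10.' and has a prefix/suffix split.
    if not name.startswith("10.") or "/" not in name:
        raise ValueError(f"not a DOI name: {doi!r}")
    # Keep '/' unescaped; percent-encode characters that are unsafe in a URL path.
    return RESOLVER + quote(name, safe="/")


print(doi_to_url("doi:10.1594/PANGAEA.587840"))
```

The citation text stays stable while the resolver redirects to wherever the dataset currently lives, which is the property plain URLs lack.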

Page 4: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

How to achieve this?

Science is global
• It needs global standards
• Global workflows
• Cooperation of global players

Science is carried out locally
• By local scientists
• Being part of local infrastructures
• Having local funders

Page 5: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Global consortium carried by local institutions

focused on improving the scholarly infrastructure around datasets and other non-textual information

focused on working with data centres and organisations that hold content

Providing standards, workflows and best-practice

Initially, but not exclusively, based on the DOI system

Founded December 1st 2009 in London

DataCite

Page 6: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

DataCite members

1. Technische Informationsbibliothek (TIB)
2. Canada Institute for Scientific and Technical Information (CISTI)
3. California Digital Library, USA
4. Purdue University, USA
5. Office of Scientific and Technical Information (OSTI), USA
6. Library of TU Delft, The Netherlands
7. Technical Information Center of Denmark
8. The British Library
9. ZB Med, Germany
10. ZBW, Germany
11. Gesis, Germany
12. Library of ETH Zürich
13. L’Institut de l’Information Scientifique et Technique (INIST), France
14. Swedish National Data Service (SND)
15. Australian National Data Service (ANDS)
16. Conferenza dei Rettori delle Università Italiane (CRUI)
17. National Research Council of Thailand (NRCT)

Affiliated members:
1. Digital Curation Center (UK)
2. Microsoft Research
3. Interuniversity Consortium for Political and Social Research (ICPSR)
4. Korea Institute of Science and Technology Information (KISTI)
5. Beijing Genomics Institute (BGI)
6. IEEE
7. Harvard University Library

Page 7: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

[Figure: down-core plots for sediment cores PS1389-3, PS1390-3, PS1431-1, PS1640-1 and PS1648-1, each showing IRD (grav/10 cm³), Sand (%), CaCO3 (%), TOC (%), Radio (%/sand) and Smect (%/clay) against age (kyr); max. 233.55 kyr, PS1389-3ff]

[Figure: map with axis ticks from 54°0' to 55°30' and from 11° to 15°. Legend: world vector shoreline; grain size classes KOLP A, KOLP B, KOLP DIN, KOEHN, KOEHN2; geochemistry; 20 m. Scale: 1:2695194 at latitude 0°. Source: Baltic Sea Research Institute, Warnemünde.]

Earthquake events => doi:10.1594/GFZ.GEOFON.gfz2009kciu

Climate models => doi:10.1594/WDCC/dphase_mpeps

Sea bed photos => doi:10.1594/PANGAEA.757741

Distributed samples => doi:10.1594/PANGAEA.51749

Medical case studies => doi:10.1594/eaacinet2007/CR/5-270407

Computational model => doi:10.4225/02/4E9F69C011BC8

Audio record => doi:10.1594/PANGAEA.339110

Grey Literature => doi:10.2314/GBV:489185967

Videos => doi:10.3207/2959859860

What type of data are we talking about?

Anything that is the foundation of further research is research data

Data is evidence

Page 8: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Over 1,700,000 DOI names registered so far

DataCite Metadata schema published (in cooperation with all members) http://schema.datacite.org

DataCite MetadataStore

http://search.datacite.org

DataCite in 2013

Page 13: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

OAI and Statistics

OAI Harvester

http://oai.datacite.org

DataCite statistics (resolution and registration)

http://stats.datacite.org
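Harvesting over OAI-PMH comes down to issuing HTTP GET requests with a `verb` and a few standard arguments. The sketch below only builds such a request URL. The arguments (`verb`, `metadataPrefix`, `from`, `set`) are standard OAI-PMH; the endpoint path `/oai` under the host given above is an assumption, as is the helper name `list_records_url`.

```python
from urllib.parse import urlencode

# Assumption: the OAI-PMH endpoint lives at /oai on the host named on the slide.
OAI_ENDPOINT = "http://oai.datacite.org/oai"


def list_records_url(metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH harvester."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec is not None:
        params["set"] = set_spec    # selective harvesting by set
    return OAI_ENDPOINT + "?" + urlencode(params)


print(list_records_url(from_date="2013-06-01"))
```

A real harvester would fetch this URL, read the records, and repeat the request with the returned `resumptionToken` until the list is complete.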

Page 16: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

DataCite Content Service

Service for displaying DataCite metadata

Different formats (BibTeX, RIS, RDF, etc.)

Content negotiation (through MIME type)

• Access through DOI proxy (http://dx.doi.org)

• First implemented by CNRI and CrossRef:

Documentation:

http://www.crosscite.org/cn/

Page 17: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Content negotiation

Optimized for machine-to-machine (m2m) communication using the Accept header of the HTTP protocol

curl -L -H "Accept: MIME_TYPE" http://dx.doi.org/DOI

Try out a shortcut in any web browser:

http://data.datacite.org/MIME_TYPE/DOI

http://data.crossref.org/DOI
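The curl call above can be approximated with Python's standard library. This is a sketch under assumptions: the hosts are the ones shown on the slides and may have changed since the talk, and the function names are illustrative only.

```python
import urllib.request


def negotiation_request(doi, mime_type):
    """Request for the DOI proxy that asks for a given representation via Accept."""
    return urllib.request.Request(
        "http://dx.doi.org/" + doi,
        headers={"Accept": mime_type},
    )


def shortcut_url(doi, mime_type):
    """Browser-friendly form: the MIME type is part of the path instead of a header."""
    return f"http://data.datacite.org/{mime_type}/{doi}"


def fetch(doi, mime_type, timeout=10):
    """Send the request; urlopen follows redirects, much like curl -L."""
    request = negotiation_request(doi, mime_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


req = negotiation_request("10.5524/100005", "application/rdf+xml")
print(req.full_url, req.get_header("Accept"))
print(shortcut_url("10.5524/100005", "application/rdf+xml"))
```

`fetch` is defined but not called here, so the example runs without network access; calling it performs the same exchange as the curl command.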

Page 18: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Resolving to the citation

http://data.datacite.org/application/x-datacite+text/10.5524/100005

Li, J; Zhang, G; Lambert, D; Wang, J (2011): Genomic data from Emperor penguin. GigaScience. http://dx.doi.org/10.5524/100005

Page 19: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Resolving to the RDF metadata

http://data.datacite.org/application/rdf+xml/10.5524/100005

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:j.0="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://dx.doi.org/10.5524/100005">
    <j.0:identifier>10.5524/100005</j.0:identifier>
    <j.0:creator>Li, J</j.0:creator>
    <j.0:creator>Zhang, G</j.0:creator>
    <j.0:creator>Wang, J</j.0:creator>
    <owl:sameAs>doi:10.5524/100005</owl:sameAs>
    <owl:sameAs>info:doi/10.5524/100005</owl:sameAs>
    <j.0:publisher>GigaScience</j.0:publisher>
    <j.0:creator>Lambert, D</j.0:creator>
    <j.0:date>2011</j.0:date>
    <j.0:title>Genomic data from the Emperor penguin (Aptenodytes forsteri)</j.0:title>
  </rdf:Description>
</rdf:RDF>
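A client that receives this RDF/XML can pull out the descriptive fields with a plain XML parser. The sketch below embeds the response shown on the slide as a string and reads it with Python's `xml.etree.ElementTree`; the helper name `summarize` and the shape of the returned dictionary are illustrative assumptions, and a full RDF library would be the more robust choice for arbitrary RDF.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DCT = "{http://purl.org/dc/terms/}"

# The response from the slide, copied as-is.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:j.0="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://dx.doi.org/10.5524/100005">
    <j.0:identifier>10.5524/100005</j.0:identifier>
    <j.0:creator>Li, J</j.0:creator>
    <j.0:creator>Zhang, G</j.0:creator>
    <j.0:creator>Wang, J</j.0:creator>
    <owl:sameAs>doi:10.5524/100005</owl:sameAs>
    <owl:sameAs>info:doi/10.5524/100005</owl:sameAs>
    <j.0:publisher>GigaScience</j.0:publisher>
    <j.0:creator>Lambert, D</j.0:creator>
    <j.0:date>2011</j.0:date>
    <j.0:title>Genomic data from the Emperor penguin (Aptenodytes forsteri)</j.0:title>
  </rdf:Description>
</rdf:RDF>"""


def summarize(rdf_xml):
    """Extract the Dublin Core terms of the first rdf:Description element."""
    root = ET.fromstring(rdf_xml)
    desc = root.find(RDF + "Description")
    return {
        "about": desc.get(RDF + "about"),
        "creators": [c.text for c in desc.findall(DCT + "creator")],
        "title": desc.findtext(DCT + "title"),
        "publisher": desc.findtext(DCT + "publisher"),
        "year": desc.findtext(DCT + "date"),
    }


info = summarize(SAMPLE)
print(info["about"], info["year"], info["creators"])
```

Because `rdf:about` carries the DOI-based URL, every extracted statement stays attached to a persistent identifier.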

Page 20: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Example of use

This allows persistent identification of RDF statements!

Implemented for all of the more than 45 million CrossRef and DataCite DOI names

Example of use:

DOI Citation Formatter

http://www.crosscite.org/citeproc/

Page 21: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.
Page 22: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.
Page 23: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

2012: STM, CrossRef and DataCite Joint Statement

1. To improve the availability and findability of research data, the signers encourage authors of research papers to deposit researcher validated data in trustworthy and reliable Data Archives.

2. The Signers encourage Data Archives to enable bi-directional linking between datasets and publications by using established and community endorsed unique persistent identifiers such as database accession codes and DOI's.

3. The Signers encourage publishers and data archives to make visible or increase visibility of these links from publications to datasets and vice versa


Page 24: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Example

The dataset:
Storz, D et al. (2009): Planktic foraminiferal flux and faunal composition of sediment trap L1_K276 in the northeastern Atlantic. http://dx.doi.org/10.1594/PANGAEA.724325

Is supplement to the article:
Storz, David; Schulz, Hartmut; Waniek, Joanna J; Schulz-Bull, Detlef; Kucera, Michal (2009): Seasonal and interannual variability of the planktic foraminiferal flux in the vicinity of the Azores Current. Deep-Sea Research Part I-Oceanographic Research Papers, 56(1), 107-124, http://dx.doi.org/10.1016/j.dsr.2008.08.009
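The bi-directional link in this example can be modelled as a typed relation between two DOI names. This is a minimal sketch, not a description of how PANGAEA or the publisher store the link. The relation names `IsSupplementTo` and `IsSupplementedBy` follow the relation types of the DataCite Metadata Schema; the `Link` class and `INVERSE` table are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Each relation type paired with the type that expresses the reverse direction.
INVERSE = {
    "IsSupplementTo": "IsSupplementedBy",
    "IsSupplementedBy": "IsSupplementTo",
}


@dataclass(frozen=True)
class Link:
    source_doi: str
    relation: str
    target_doi: str

    def inverse(self) -> "Link":
        """Return the same link as seen from the other side."""
        return Link(self.target_doi, INVERSE[self.relation], self.source_doi)


# The dataset points at the article it supplements ...
dataset_to_article = Link(
    "10.1594/PANGAEA.724325", "IsSupplementTo", "10.1016/j.dsr.2008.08.009"
)
# ... and the article side can derive the link back to the dataset.
article_to_dataset = dataset_to_article.inverse()
print(article_to_dataset)
```

Storing one direction and deriving the other keeps the two records from drifting apart, which is the practical point of agreeing on persistent identifiers at both ends.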

Page 27: DataCite – Persistent links to scientific data Jan Brase, DataCite – TIB 1st PRELIDA workshop PISA, June 26th.

Thank you!

See you September 2013 in Washington DC