Data Citation Principles
CODATA TG on Data Citation
Presentation to Dryad membership meeting, Oxford, 23rd May 2013
Co-Chairs: Jan Brase, Germany; Christine Borgman, US; Sarah Callaghan, UK (@sorcha_ni, Presenter)
Support: Paul Uhlir, US
Data Citation Principles CODATA TG on Data Citation
• Co-sponsored by ICSTI and supported by the US National Committee for CODATA
• Objectives:
– Examine key issues related to data identification, attribution, citation and linking
– Help coordinate activities internationally
– Promote common practices and standards
CODATA Data Citation Task Group Co-Chairs:
Jan Brase (Director, DataCite, and ICSTI representative), Technische Informationsbibliothek (TIB)/German National Library of Science and Technology, GERMANY
Sarah Callaghan (U.K. CODATA), The NCAS British Atmospheric Data Centre, STFC Rutherford Appleton Laboratory, UNITED KINGDOM
Christine Borgman, University of California, Los Angeles, USA
Members:
Micah Altman MIT Libraries USA
Elizabeth Arnaud Bioversity International ITALY
Todd Carpenter National Information Standards Organization USA
Vishwas Chavan Global Biodiversity Information Facility DENMARK
Paul Groth VU University of Amsterdam THE NETHERLANDS
Mark Hahnel FigShare UNITED KINGDOM
John Helly Scripps Institution of Oceanography, Climate, Atmospheric Science, and Physical Oceanography USA
Puneet Kishor Creative Commons USA
Jianhui LI Chinese Academy of Sciences CHINA
Franciel Azpurua Linares Information International Associates, USA
Karen Morgenroth National Research Council Canada CANADA
Yasuhiro Murayama National Institute of Information and Communications Technology JAPAN
Fiona Murphy Wiley Europe Ltd UNITED KINGDOM
Giri Palanisami Oak Ridge National Laboratory USA
Mark Parsons Research Data Alliance/U.S. Center for a Digital Society USA
Soren Roug European Environmental Agency BELGIUM
Helge Sagen Institute of Marine Research NORWAY
Eefke Smit International Association of STM Publishers, THE NETHERLANDS
Martie J. van Deventer CSIR South Africa SOUTH AFRICA
Michael Witt Purdue University Libraries USA
Koji Zettsu National Institute of Information and Communications Technology, JAPAN
Consultants:
William L. Anderson Associate Editor CODATA Data Science Journal USA
Daniel Cohen NRC Board on Research Data and Information, and U.S. Committee for CODATA [on detail from the Library of Congress] USA
Yvonne Socha University of Tennessee USA
Project Director:
Paul Uhlir U.S. National Committee for CODATA, USA
CODATA EC Liaison:
Bonnie Carroll (U.S. CODATA and CENDI) Information International Associates USA
Other Organizations Working on Data Citation
• International Council for Scientific and Technical Information (ICSTI)
• DataCite
• The Dataverse Network
• National Information Standards Organization (NISO)
• Creative Commons and Science Commons
• CENDI – U.S. interagency group focused on scientific and technical information issues and coordination of activities.
• Global Biodiversity Information Facility (GBIF)
• World Data System (WDS)
• STM-Association
• Digital Curation Centre (DCC), UK
• Research Data Alliance (RDA)
• … and many more
TG Work Products
• Inventory and analysis of existing literature on data citation and attribution
• Interviews with a sample of identified stakeholders concerning data citation and attribution practices
– Data Repositories
– Publishers
– Researchers
– Funding Organizations
• Public web presence on the CODATA site
• Symposium and Workshop, Berkeley, CA, August 2011: For Attribution: Developing Data Attribution and Citation Practices and Standards
• 3 Track session at CODATA 2012 on Data Publishing and Data Citation in cooperation with the WDS
• Draft report on Current Activities and Best Practices in Data Citation (in external review, expected release Summer 2013)
First Principles for Data Citation
1. Status of Data: Data citations should be accorded the same importance in the scholarly record as the citation of other objects.
2. Attribution: A citation to data should facilitate giving scholarly credit and legal attribution to all parties responsible for those data.
3. Persistence: Citations should refer to objects that persist.
4. Access: Citations should facilitate access to data by humans and by machines.
5. Discovery: Citations should support the discovery of data.
6. Provenance: Citations should facilitate the establishment of provenance of data.
7. Granularity: Citations should support the finest-grained description necessary to identify the data.
8. Verifiability: Citations should contain information sufficient to identify the data unambiguously.
9. Metadata Standards: A citation should employ existing metadata standards.
10. Flexibility: Citation methods should be sufficiently flexible to accommodate the variant practices among communities.
VO Sandpit, November 2009
How do Dryad citations match with the principles?
1. Status of Data
2. Attribution
3. Persistence
4. Access
5. Discovery
6. Provenance
7. Granularity
8. Verifiability
9. Metadata Standards
10. Flexibility
What do I want Dryad to do?
• Keep up the good work!!
• Work with journals to ensure that data is cited (where possible)
– Guidance for authors
– Train editors and reviewers to check whether data is cited and, if not, to ask why not
• Keep providing clear information on how the datasets in your archive should be cited
– In the context of the data citation first principles
• Keep talking to researchers about citing data
We are trying to change the culture of research so that citing data is the norm – it’s not an easy job!
We can extend citation to other things like:
• data
• code
• multimedia
And the best bit is, researchers don’t need to learn a new method of linking – they cite like they normally would!
We already have a working method for linking between publications which is:
• commonly used
• understood by the research community
• used to create metrics to show how much of an impact something has (citation counts)
• applied to digital objects (digital versions of journal articles)
But data are different from articles!
• Too big!
• Not human readable!
• Needs metadata to make sense of it!
• Keeps getting updated/changed!
• Needs to be machine readable!
• Too complicated!
• Etc. etc. etc.
"Piled Higher and Deeper" by Jorge Cham www.phdcomics.com
Citation does work, but we need to be clear what it does for data
(provides a link to the data version of record)
It’s not a magic bullet that will solve all our problems
Recommended formats for dataset citation
Need to tell researchers:
• Don’t get hung up on it!
• There is no citation police
• Most of the time the repository should tell you how to cite the dataset on the catalogue page
• If they don’t – ask them!
• DataCite is “discipline‐agnostic concerning matters pertaining to academic style sheet requirements.” http://schema.datacite.org/meta/kernel-2.2/doc/DataCite-MetadataKernel_v2.2.pdf
• Applicable for subjects from astronomy to zoology (and more)
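As an illustration of what a repository's "how to cite" guidance might boil down to, here is a minimal sketch. It assumes the pattern Creator (PublicationYear): Title. Publisher. Identifier, with an optional Version, as suggested in the DataCite metadata documentation. The DatasetRecord class, its field names, and the example DOI are hypothetical and not taken from the presentation; a repository's own citation guidance takes precedence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical minimal record; field names are illustrative only."""
    creators: List[str]
    publication_year: int
    title: str
    publisher: str
    identifier: str                 # a persistent identifier, e.g. a DOI
    version: Optional[str] = None   # optional; helps pin down the exact data cited


def format_citation(record: DatasetRecord) -> str:
    """Build 'Creator (PublicationYear): Title. [Version.] Publisher. Identifier'."""
    creators = "; ".join(record.creators)
    parts = [f"{creators} ({record.publication_year}): {record.title}."]
    if record.version:
        parts.append(f"Version {record.version}.")
    parts.append(f"{record.publisher}.")
    parts.append(record.identifier)
    return " ".join(parts)


# Made-up example values (placeholder DOI), for illustration only.
example = DatasetRecord(
    creators=["Doe, J.", "Roe, R."],
    publication_year=2013,
    title="Example dataset",
    publisher="Example Repository",
    identifier="doi:10.1234/example",
)
print(format_citation(example))
# Doe, J.; Roe, R. (2013): Example dataset. Example Repository. doi:10.1234/example
```

The identifier and optional version are the parts that bear most directly on the Persistence and Verifiability principles; the rest can be restyled to suit a journal's reference format.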