Publishing Data
Catherine Jones, Library Systems Development Manager, STFC Rutherford Appleton Laboratory
CLADDIER workshop, Chilworth, Southampton, UK, 15th May 2007
Contents

• Set the scene
• Definition of publication
• Complexities
• Making data permanently available
• Quality control
• User requirements
• Issues

Microsoft’s Science 2020 Report

Modern scientific communication relies on both journals and databases. At present these are not integrated.

By 2020 mutual linking will be commonplace and publications just containing peer-reviewed data will become available.

http://research.microsoft.com/towards2020science/downloads.htm

Publication concept

In this context “publication” is defined as the process through which data is fixed and made retrievable over the long term, and may imply that there has been some quality control process.

Complexities of Data

The images on this slide (not reproduced in this transcript) all show the same data at different levels of processing.

Making data permanently available

Three areas:
1. Defining what is to be kept: encapsulation
2. Ensuring that it is described effectively: metadata
3. Identifying who is responsible for the data management: trusted repository

Encapsulation

A method of identifying a fixed collection of meaningful data so that it can be preserved as a clearly defined unchanging entity.

• Datasets which are still growing
• Versions of datasets
• Format translations
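One way to make "a clearly defined unchanging entity" concrete is to derive an identifier from the content of the collection itself. The sketch below is only an illustration under assumptions: the talk does not prescribe content hashing, and the function name `snapshot_id` and the sample file names are hypothetical.

```python
import hashlib
from typing import Mapping


def snapshot_id(files: Mapping[str, bytes]) -> str:
    """Return a fingerprint for a fixed collection of named files.

    Hypothetical helper: file names are processed in sorted order so the
    result does not depend on the order in which files were supplied.
    """
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator between the name and its content hash
        digest.update(hashlib.sha256(files[name]).digest())
    return digest.hexdigest()


# Illustrative data only: a dataset that is still growing.
version_1 = {"profile_001.dat": b"1.0 2.0 3.0"}
version_2 = {**version_1, "profile_002.dat": b"4.0 5.0 6.0"}
```

Because `version_2` contains an extra file, it yields a different fingerprint from `version_1`, so each snapshot of a growing dataset can be treated as its own fixed entity.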

Metadata

Needs to be created to ensure that the data is usable now and over the long term. Semantic encapsulation is important as this is likely to be used in citation.

Trusted repository

To ensure that the data is available over the long term, the Data Centre needs to be on a secure footing and well managed.

Quality Control

• Usability of the dataset. This is one of the roles of the Data Centres.
• Usefulness of the dataset. This is the role of domain experts.

User requirements for citation

1. Need for an unambiguous reference to a well-defined permanent entity
2. This reference/citation needs to be understandable for humans
3. Author and publication year, or equivalents, are important
4. An unambiguous data reference, in this area, includes the activity or tool which produced the data
5. Source of the data (i.e. the repository) may be as important as the producer and needs to be unambiguous

Requirements from data producers

1. Traceable to the data provider/producer
2. Usable for usage metrics
3. To be recognised as intellectually equivalent to academic papers
4. Able to be used to search for papers citing data

Citation format

Author, title, [medium], publisher, publication date, identifier, feature, [access date, available at]

Natural Environment Research Council, Mesosphere-Stratosphere-Troposphere Radar Facility at Aberystwyth, [Internet]. British Atmospheric Data Centre (BADC), 1990- urn badc.nerc.ac.uk/data/mst/v3/upd15032006, feature 200409031205 [http://featuretype.registry/VerticalProfile] [cited 2006 Apr 25. Available from http://badc.nerc.ac.uk/data/mst.]
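The field order above can be sketched as a small record type that assembles a citation string. This is a minimal illustration, not the CLADDIER implementation: the class name `DataCitation`, the method `render`, and the simplified comma-separated punctuation are assumptions, and the sample values are copied from the example citation on this slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical record holding the citation fields listed on the slide."""
    author: str
    title: str
    publisher: str
    publication_date: str
    identifier: str
    feature: str
    medium: Optional[str] = None        # optional, shown in brackets
    access_date: Optional[str] = None   # optional, paired with available_at
    available_at: Optional[str] = None

    def render(self) -> str:
        # Fields are joined in the slide's order; punctuation is simplified.
        fields = [self.author, self.title]
        if self.medium:
            fields.append(f"[{self.medium}]")
        fields += [
            self.publisher,
            self.publication_date,
            self.identifier,
            f"feature {self.feature}",
        ]
        if self.access_date and self.available_at:
            fields.append(
                f"[cited {self.access_date}, available at {self.available_at}]"
            )
        return ", ".join(fields)


example = DataCitation(
    author="Natural Environment Research Council",
    title="Mesosphere-Stratosphere-Troposphere Radar Facility at Aberystwyth",
    medium="Internet",
    publisher="British Atmospheric Data Centre (BADC)",
    publication_date="1990-",
    identifier="urn badc.nerc.ac.uk/data/mst/v3/upd15032006",
    feature="200409031205 [http://featuretype.registry/VerticalProfile]",
    access_date="2006 Apr 25",
    available_at="http://badc.nerc.ac.uk/data/mst",
)
print(example.render())
```

Keeping the identifier and feature as separate fields reflects the requirement that a citation point unambiguously at a fixed version of the data and, within it, at a specific feature.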

Issues for consideration

• The ability to cite data is strongly linked to the definition of the data.
• Dynamic datasets pose additional issues for long-term accessibility.
• Versioning of the data and the processing/analysing software are big issues to resolve.
• Peer review of the data is important.
• Identification of datasets where a facility may provide data from a set of instruments is a complex decision.