UCAR Workshop Review – “Bridging Data Lifecycles: Tracking Data Use via Data Citations” Matt Mayernik Research Data Service Specialist NCAR Library/Integrated Information Services (IIS) National Center for Atmospheric Research (NCAR) University Corporation for Atmospheric Research (UCAR) BESSIG, April 18, 2012
UCAR Workshop Review – “Bridging Data Lifecycles: Tracking Data Use via Data Citations”
Matt Mayernik
Research Data Service Specialist
NCAR Library/Integrated Information Services (IIS)
National Center for Atmospheric Research (NCAR)
University Corporation for Atmospheric Research (UCAR)
BESSIG, April 18, 2012
Workshop
• April 5-6, at UCAR Center Green Campus
• Funded by NOAA through the UCAR JOSS program
• ~80 attendees
– Academic librarians
– Data management professionals
– Software engineers
– Scientists
• Agenda and presentations posted at http://library.ucar.edu/data_workshop/
What is a Data Citation?
From: Patil, S. and M. Stieglitz. 2011. Hydrologic similarity among catchments under variable flow conditions. Hydrology and Earth System Sciences, 15, 989–997. doi: 10.5194/hess-15-989-2011
Citation to journal article
Citation to data set
Interest in Data Citations
NSF GEO issued a “Dear Colleague Letter” on March 29
NCAR/UCAR Data
Climate model output data
Longitudinal time-series data
Observational data from field studies
All images: copyright University Corporation for Atmospheric Research
Motivation for Data Citations
• Understand use and impact of data
– Measurements of data use
– Give scientists and data centers credit for producing, managing, and curating data
– Metrics requirements as an FFRDC
• Connecting data and scholarship
• Increase transparency of data and science
Mark Parsons
Data Citation Practices
• Most data users don’t cite data
• Ex. “MODIS snow cover data” from NSIDC
From: Parsons, M. A., Duerr, R., and Minster, J.-B. 2010. Data Citation and Peer Review. Eos Transactions, AGU, 91(34): 297-298. http://dx.doi.org/10.1029/2010EO340001
Hypothesis: ~80% of citation scenarios for 80% of ESS data
Mark Parsons
EZID: long-term identifiers made easy
take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation
Primary Functions
1. Create persistent identifiers
2. Manage identifiers over time
3. Manage associated metadata over time
Joan Starr
DOIs vs ARKs
DOIs
• Established brand in publishing
• Indexed by major A&I citation databases
• Cannot be deleted
• More costly
• Ex. http://dx.doi.org/10.5065/D6WD3XH5
ARKs
• Case sensitive
• Special feature supports granularity
• Informative
• Less costly
• Ex. http://n2t.net/ark:/b5065/d6wd3xh5
Both resolve to: http://www.ncl.ucar.edu
Joan Starr
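As a rough illustration of the DOI and ARK examples on this slide, the sketch below builds a resolver URL from a bare identifier. The resolver hosts are the ones shown in the slide examples. The function name `resolver_url` and the prefix checks are assumptions made for illustration, not part of EZID or any presented tool.

```python
# Resolver hosts taken from the slide examples.
DOI_RESOLVER = "http://dx.doi.org/"
ARK_RESOLVER = "http://n2t.net/"


def resolver_url(identifier: str) -> str:
    """Return a resolver URL for a bare DOI or ARK string (hypothetical helper)."""
    identifier = identifier.strip()
    # Accept the common "doi:" label in front of a DOI.
    if identifier.lower().startswith("doi:"):
        identifier = identifier[4:].strip()
    if identifier.startswith("ark:/"):
        # The slide lists ARKs as case sensitive, so the string is left untouched.
        return ARK_RESOLVER + identifier
    if identifier.startswith("10."):
        # DOI names begin with the "10." directory indicator.
        return DOI_RESOLVER + identifier
    raise ValueError(f"Not recognized as a DOI or ARK: {identifier!r}")


print(resolver_url("10.5065/D6WD3XH5"))
print(resolver_url("ark:/b5065/d6wd3xh5"))
```

Keeping the identifier separate from the resolver host means stored citations do not have to change if the preferred resolver address changes later.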
Excerpts from existing AGU policy – Citing Data
...data cited in AGU publications must be permanently archived in a data center or centers that meet the following conditions:
• are open to scientists throughout the world.
• are committed to archiving data sets indefinitely.
• provide services at reasonable costs.
Data sets that are available only from the author, through miscellaneous public network services, or academic, government or commercial institutions not chartered specifically for archiving data, may not be cited in AGU publications.
Bill Cook
Excerpts from existing AGU policy – Preserving/Archiving Data
AGU does not expect to archive data sets subject to this policy, except on a for-fee basis and for sets of a small size.
It is not AGU's intention to serve as an archive for large data sets that should be housed in data centers.
AGU maintains a deposit service for supplementary material of different types in order to provide long-term access to small supporting data sets and graphics files that are published concurrently with, and are an electronic component of, some AGU journal articles.
Bill Cook
NCAR Data Citation Initiatives
1. Technical
2. Policy/procedural
Image copyright University Corporation for Atmospheric Research
Citation Challenges
1. Diversity
2. Granularity
3. Version Control
4. Maintenance Over Time
What granularity for EOL DOIs and when are they issued?
• Given a large project with aircraft, soundings, radars, model output and satellite data, do we:
– Assign a DOI for each data file?
– Assign one DOI for all datasets for the project?
– Assign separate DOIs for datasets from each major platform?
– What about ancillary data? Do we assign DOIs or does the providing institution?
• We are considering assigning DOIs to the data from each major platform associated with the project (e.g. C-130, S-Pol), to outside datasets to which we have added value, and to data for which no DOI exists
• It may be beneficial to issue DOIs only when processed data are released, so that publications do not reference preliminary data
Mike Daniels
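The per-platform option discussed on this slide could be sketched as below: files are grouped by platform, and only processed (non-preliminary) data count toward a unit that would receive a DOI. The `DataFile` record, its fields, and the file names are hypothetical, invented for illustration. This is not EOL's actual workflow, and no identifiers are minted here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DataFile:
    """Hypothetical record for one data file in a field project."""
    name: str
    platform: str    # e.g. "C-130" or "S-Pol"
    processed: bool  # False means the data are still preliminary


def doi_units(files):
    """Group processed files by platform: one candidate DOI unit per platform."""
    groups = defaultdict(list)
    for f in files:
        if f.processed:  # preliminary data are held back until released
            groups[f.platform].append(f.name)
    return dict(groups)


# Made-up example files.
files = [
    DataFile("c130_flight1.nc", "C-130", True),
    DataFile("c130_flight2.nc", "C-130", True),
    DataFile("spol_scan1.nc", "S-Pol", False),
]
print(doi_units(files))
```

Under these assumptions, a platform whose data are all still preliminary yields no unit, which matches the suggestion to wait for processed data before issuing a DOI.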
Data QC
Gary Strand
The LTER NIS 2000
Nicole Kaplan, CSU - Long-Term Management of Ecological Data - April 2012, UCAR
K.S. Baker, B.J. Benson, D.L. Henshaw, D. Blodgett, J.H. Porter, S.G. Stafford. (2000) Evolution of a Multisite Network Information System: The LTER Information Management Paradigm. BioScience. 50(11) 963-978.
Nicole Kaplan
The LTER NIS 2011
Nicole Kaplan
Results of CU Faculty Survey About Data Curation
• Many researchers had curation plans for their data
• Many had orphan data without curation plans
• Few departments had procedures for data preservation; some participated in discipline-based repositories supporting long-term storage
• Receptivity to a library role in data curation fell more in line with the researchers' disciplinary culture or philosophy regarding data sharing and collaborative projects.
Barb Losoff
Ruth Duerr
Lynn Yarmey
Citations in the Bigger Picture
Ted Habermann, NOAA/NESDIS/NGDC, NASA/ESDIS
Data preservation is communicating with the future
Ted Habermann
Metadata Types and Sharing
[Diagram labels: Discovery; Use / Mashup; Understanding; Discovery Portal; Community Metadata Collections; User]
More documentation is required for understanding data than discovering or using it.
Ted Habermann
Tim Killeen
Bridging Data Lifecycles, April 5-6, 2012
Current Practices @ NCAR’s Research Data Archive
Metrics Usage - Sample
International Comprehensive Ocean Atmosphere Data Set (ICOADS)
Global marine surface observations (1662-2011)
HadISST (1871-2011)
NOAA OI SST (1981-2011)
NOAA ERSST (1854-2011)
HadSLP (1871-2011)
JMA SST (1871-2011)
Ocean Clouds (1900-2010)
NOC Surf. Flux (1973-2009)
WASwind (1950-2009)
Global and Regional Atmospheric and Ocean Re-analyses
NCEP/NCAR, NARR, ERA-40, ERA-Interim, 20CR, OARCA
Etc.
Steve Worley
How to Get Started
• Know what you want to achieve
• Know your identifier options
• Engage stakeholders
• Start with well-bounded cases
• Plan for the long-term implications
– How to maintain
– How to count
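One possible reading of "how to count" is tallying how often each dataset DOI appears in harvested reference strings. The sketch below assumes citations arrive as raw DOI strings in mixed forms and relies on DOI names being case-insensitive. The helper names and the list of prefixes are illustrative assumptions, not a tool presented at the workshop.

```python
from collections import Counter

# Common ways a DOI is written in a reference list (assumed, not exhaustive).
PREFIXES = ("http://dx.doi.org/", "https://doi.org/", "doi:")


def normalize_doi(raw: str) -> str:
    """Strip a resolver or "doi:" prefix and lower-case the DOI name."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # DOI names are case-insensitive, so lower-casing merges variants.
    return doi.strip().lower()


def count_citations(cited):
    """Count citations per normalized DOI."""
    return Counter(normalize_doi(d) for d in cited)


# The same dataset DOI from the slides, written three ways.
cited = [
    "http://dx.doi.org/10.5065/D6WD3XH5",
    "doi:10.5065/d6wd3xh5",
    "10.5065/D6WD3XH5",
]
print(count_citations(cited))
```

The same lower-casing step would not be safe for ARKs, which the DOIs vs ARKs slide describes as case sensitive.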
Thank You
Workshop agenda and presentations: http://library.ucar.edu/data_workshop/