NSF Data Management Plans - Missouri EPSCoR

Kate Anderson, Chris Elsik
3/22/17

https://www.flickr.com/photos/intersectionconsulting/7537238368/in/set-72157614274686504/

Aug 01, 2020




Why we’re here…

“The goal of data management is to produce self-describing data sets” (DataONE Primer)

u Data are important!

Note: today, we’re focusing on final data (rather than raw or intermediate data)

u Benefits you & your collaborators

u Benefits science & inquiry

u Funding Agencies and Journals Require Data Management Plans and Sharing of Data


Data Sharing Policy

u Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.

Data Management Plans

u Proposals submitted or due on or after January 18, 2011, must include a supplementary document of no more than two pages labeled “Data Management Plan”.


Data Management Plans

u Subject to peer review

u Read the DMP Guidelines for your Directorate!

u Standard FAQ Answer regarding the sharing of data: “Data resulting from the award should be managed according to the data management plan that accompanied the proposal.”


Data Management Plans

u The types of data, samples, physical collections, software, curricular materials, and other materials to be produced in the course of the project;

u The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies);

u Policies for access and sharing, including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements;

u Policies and provisions for re-use, re-distribution, and the production of derivatives; and

u Plans for archiving data, samples, and other research products, and for preservation of access to them.
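
One way to keep these five elements in view while drafting is to treat the plan as a small record and check that no section is left empty. The sketch below is only illustrative: the class name `DMPDraft`, its field names, and the `missing_sections` helper are assumptions made for this example and are not part of any NSF template.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DMPDraft:
    """Hypothetical container for the five elements listed above."""

    data_types: str = ""                # types of data, samples, software, etc.
    standards: str = ""                 # data and metadata format and content
    access_and_sharing: str = ""        # access policies, privacy, IP protections
    reuse_and_redistribution: str = ""  # re-use, re-distribution, derivatives
    archiving: str = ""                 # archiving and preservation of access

    def missing_sections(self) -> List[str]:
        """Return the names of the sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Placeholder text; a real plan would describe the actual project.
draft = DMPDraft(
    data_types="Hourly weather-station records and soil probe readings.",
    standards="CF-compliant NetCDF for climate data.",
)
print(draft.missing_sections())
# prints ['access_and_sharing', 'reuse_and_redistribution', 'archiving']
```

A check like this only confirms that each section has some text. Whether the text is specific enough is still a matter for your Directorate's guidelines and for peer review.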


The Circle of Life…

(DataONE Primer)


DMP Best Practices

http://libraryguides.missouri.edu/datamanagement


DMPs: A little vague

u All files will be stored on PI’s secure computer. All laboratory notebooks will be stored in PI’s office.

u All sample data will be collected and organized using [Specialty Software Name]. The files will contain information about sample characteristics and the conditions under which these characteristics were measured. Approximately 1-2 GB of data will be generated.

u Data will be available to anyone who desires access to our data. When possible, data will be made available online.


DMPs: Better!

u NSF example (excerpt of data and metadata standards section): The project will leverage existing metadata standards currently stored in Ecological Metadata Language (EML) format for the NutNet project. We will add additional metadata entries for the arthropod community composition and arthropod stoichiometry; field notes taken during the time of collection will be recorded. Morpho software will be used to generate the metadata file in EML. We chose EML format for our metadata since it allows integration with existing NutNet data housed in the Knowledge Network for Biocomplexity (KNB) data repository.

https://www.dataone.org/sites/all/documents/DMP_NutNet_Formatted.pdf


The Circle of Life…

(DataONE Primer)


Metadata Basics

u Data about data

u Metadata lets others discover, understand, and use your data

u Metadata/annotation must be added throughout the lifecycle


Metadata to Consider: Who, What, Where, When, Why, How

u Name of the data set and data files

u Date of creation and last modification

u Software used to create file (including version)

u Data processing performed

u Who collected the data

u Contact information of responsible party

u Sponsor or funding agencies

u Why the data were collected (abstract; keywords; controlled vocabulary); when and where

u Instrumentation; experimental conditions; calibrations

u Units of measure

u Taxonomic details

u Known problems that limit data use

u How to cite the data set
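
As a rough illustration of the checklist above, the metadata can be kept in a small machine-readable record stored next to the data files. This is a minimal sketch and not a formal standard such as EML: every field name and value is a placeholder invented for the example, and the `REQUIRED` list is an assumption about which fields a project might insist on.

```python
import json
from datetime import date

# All values below are placeholders invented for illustration.
metadata = {
    "dataset_name": "example_soil_moisture_2016",
    "data_files": ["soil_moisture_2016.csv"],
    "created": date(2016, 6, 1).isoformat(),
    "last_modified": date(2017, 3, 22).isoformat(),
    "software": {"name": "ExampleLogger", "version": "1.0"},
    "processing": "Removed readings flagged by the sensor as invalid.",
    "collected_by": "A. Researcher",
    "contact": "researcher@example.edu",
    "funding": ["NSF"],
    "purpose": "Track seasonal soil moisture at one field site.",
    "keywords": ["soil moisture", "in situ probe"],
    "units": {"soil_moisture": "percent volumetric water content"},
    "known_problems": "Probe offline for part of July.",
    "citation": "Researcher, A. (2017). Example soil moisture data set.",
}

# Assumed minimum set of fields; adjust to your own project's needs.
REQUIRED = ["dataset_name", "created", "collected_by", "contact", "units", "citation"]


def missing_fields(record: dict) -> list:
    """List the required fields that are absent or empty in the record."""
    return [key for key in REQUIRED if not record.get(key)]


def to_sidecar(record: dict) -> str:
    """Serialize the record as readable JSON, ready to save beside the data."""
    return json.dumps(record, indent=2, sort_keys=True)


print(missing_fields(metadata))  # prints []
print(to_sidecar(metadata))
```

Filling in a record like this as the project proceeds fits the point that metadata must be added throughout the lifecycle, not only at deposit time.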


The Circle of Life…

(DataONE Primer)


Data Repositories

Domain Repositories

u Data stored with similar items

u Researchers in your area are familiar with the repository

u Subject-specific / data-type specific needs addressed

u More computational tools available

MOspace

u Subject repository may not exist

u Preserves link to institution with guarantee of support from the university

u Domain repositories can shut down once the grant ends


Domain Repositories: So Many Choices!


Depositing Data to MOspace

u [email protected]

u Simply email the MOspace team to get things going!

u Let them know:

u Author name(s)

u Project title and description

u Types of file(s) you want to submit

u Estimated file size

u Special software needed to read the file(s)

u If your data have been deposited in another repository (e.g. Dryad, DataONE, ICPSR)

u The MOspace team will contact you about best ways to submit your data.
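
Two of the items above, the types of files and the estimated file size, can be gathered from a data folder before writing that email. The following is a minimal sketch under stated assumptions: the function name `summarize_for_deposit` and the shape of its result are hypothetical, and it is not a MOspace tool or requirement.

```python
from collections import Counter
from pathlib import Path


def summarize_for_deposit(folder: str) -> dict:
    """Collect file types and total size for all files under `folder`."""
    files = [p for p in Path(folder).rglob("*") if p.is_file()]
    # Count files per extension, ignoring case (".CSV" and ".csv" together).
    types = Counter(p.suffix.lower() or "(no extension)" for p in files)
    total_bytes = sum(p.stat().st_size for p in files)
    return {
        "file_count": len(files),
        "file_types": dict(types),
        "total_megabytes": round(total_bytes / 1_000_000, 2),
    }


if __name__ == "__main__":
    # "." is a placeholder; point this at your own data folder.
    print(summarize_for_deposit("."))
```

The summary covers only part of the list. Author names, the project description, any special software needed, and prior deposits elsewhere still have to be written by hand.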


So, what do I say about MOspace in my NSF DMP?

Remember that DMPs are subject to peer review, so the nature of the plan will be specific to your project.

"[X type of data] will be deposited in MOspace, the University of Missouri's digital institutional repository. MOspace is based on MIT's DSpace technology and is a joint venture of the University of Missouri's Division of Information Technology and the University Libraries. MOspace items will include appropriate metadata and a permanent URL. Items will be freely available via the MOspace web site at https://mospace.umsystem.edu and will be searchable via Google and other search engines."


Think about the Licensing…


More Resources

u MU Libraries Guide on NSF Data Management Plans:

http://libraryguides.missouri.edu/datamanagement

u MU Libraries Guide on Data Sets: http://libraryguides.missouri.edu/datasets

u MU Libraries Guide on Open Access: http://libraryguides.missouri.edu/oajournals

u MU Libraries Guide on Public Access: http://libraryguides.missouri.edu/publicaccess

u DataONE: Primer on Data Management: What you always wanted to know* (*but were afraid to ask): https://www.dataone.org/best-practices

u MIT Libraries. Data Management and Publishing: http://libraries.mit.edu/guides/subjects/datamanagement/index.html

u UW-Madison Research Data Services: http://researchdata.wisc.edu/

u University of Arizona Libraries Data Management Resources: http://data.library.arizona.edu/


DMP development exercise for Missouri Transect trainees

u Missouri Transect students and postdocs are tasked with developing a Data Management Plan (DMP) for their research projects.

u The exercise will provide valuable experience to trainees.

u CI Team will provide advice and feedback.

u The individual DMPs will be used to update the Missouri Transect DMP.


Current Missouri Transect Data Management Plan

u The current Missouri Transect DMP is available here:

https://missouriepscor.org/cyberinfrastructure/data-management

u It was developed prior to proposal submission by someone without expertise in the subject domains (C. Elsik), and some subjects are not included.

u We plan to update the Missouri Transect DMP after receiving your individual DMPs.

u A more current and detailed Data Sharing Policy (separate from the DMP) is available here, but it does not include subject-specific information:

https://missouriepscor.org/cyberinfrastructure/data-policy


Excerpt from current Missouri Transect DMP

u Types of Data Produced

This project will produce many diverse datasets. The Climate team will work with current and archived climate data from weather stations throughout the state, including 5-minute, hourly and daily conditions for air temperature, relative humidity, wind direction and speed, soil temperature at 2-inch depth, solar radiation, and rainfall. Microclimate data will be collected by Doppler radar. Climate models will also use data from the North American Regional Climate Change Assessment Program, Missouri Mesonet, the PRISM grid (parameter-elevation regressions on independent slopes model), the National Elevation Dataset 30m Digital Elevation Models grid, the Pennsylvania State University Soil Information for Environmental Modeling Ecosystem Management database, and the National Land Cover Dataset 2001. The climate team will also collect soil redox potential, soil moisture, and pH using in situ probes.


An example from the current Missouri Transect DMP

u Data and metadata standards

Climate data will be stored in CF-compliant NetCDF format. Doppler radar data will be available as Nexrad level III and IRIS, which can be converted to Universal Format. Genomic data formats include Fastq, Fasta, GFF3, BAM/SAM, VCF. File formats include text/ASCII, standard imaging (e.g. jpg, pgm, ppm, tiff for 2D, ply, blend, mesh, pcd for 3D), imaging for GIS (GeoTiff), video (e.g. mp4, MPEG, avi), binary MatLab (mat). For some data types, metadata content is embedded in data files. For example, webcam image data will be stored in JPEG format, because it includes the ability to store metadata as EXIF tags within the jpg format itself, including time-stamps, GPS location, exposure, focal length, focus distance, and what color-correction algorithm has been applied. Similarly, GeoTiff is a public domain metadata standard that allows georeferencing information to be embedded within a TIFF file.


An example from the current Missouri Transect DMP

u Plans for archiving & preservation

The CI Team will work with investigators to identify appropriate repositories. Sequencing data will be submitted to the NCBI Sequence Read Archive. iPlant will serve as a repository for plant image and phenotyping data. Other repositories will be identified through resources such as the Open Access Directory (http://oad.simmons.edu/oadwiki/Data_repositories), the Ohio State University science repository (http://library.osu.edu/find/subjects/science-data/) and the Registry of Research Data Repositories (http://www.re3data.org). Metadata, code, small datasets and links to large datasets used in publications will be archived as supplements or in repositories such as Dryad (http://datadryad.org/) or MOspace (http://libraryguides.missouri.edu/MOspace), the UM System institutional repository.


Clarification: Missouri Transect Data Portal vs Archival Repository

u The Missouri Transect data portal provides a means to store and share data throughout the duration of the Missouri Transect Project.

https://data.missouriepscor.org

u According to the Missouri Transect Data Policy, all data must be submitted to this portal or another approved server for data sharing during the project period.

u However, the Missouri Transect Data Portal is not a long-term archival repository.

u Each individual DMP should list a repository where data will be submitted at the end of the Missouri Transect project.