Research Data Management: Part 1, Principles & Responsibilities

Aug 08, 2015

AmyLN
Transcript
Page 1: Research Data Management: Part 1, Principles & Responsibilities

Managing Research Data Part 1

Planning Working

Finalizing Sharing Data

This work is licensed under a Creative Commons Attribution 4.0 International License.

WHY – WHAT– WHO – WHEN & HOW

Page 2: Research Data Management: Part 1, Principles & Responsibilities

WHY manage data

WHAT research data are

WHO manages research data

WHEN & HOW data management is done

Page 3: Research Data Management: Part 1, Principles & Responsibilities

This two-part course is a collaboration between CU Libraries/Information Services and the Office of Research Compliance & Training. The purpose of this course is to familiarize you with the various aspects of research data management (RDM) by taking you through the following areas:

Part 1:
•  Why RDM is both recommended and required
•  What research data are
•  Who is responsible for RDM

Part 2:
•  When RDM activities occur
•  How you can carry out RDM activities

This course will guide you through these areas, offering in-depth details on each of them. Please refer to the top navigation to keep track of which area you are currently exploring.

Page 4: Research Data Management: Part 1, Principles & Responsibilities

Learning objectives: At the end of this training you will be able to:
•  Define & identify research data
•  Understand the demands of responsible conduct of research with regard to research data management
•  Understand the reasons behind the federal mandates of research data management

Page 5: Research Data Management: Part 1, Principles & Responsibilities

Links to many of the references and policies referred to in this course can be found on the final slides. Have Fun!

Page 6: Research Data Management: Part 1, Principles & Responsibilities

Why should you care about Research Data Management?

Page 7: Research Data Management: Part 1, Principles & Responsibilities

Managing research data:

SAVES TIME
Taking time to plan for your expected data, back them up, and document them in detail saves time otherwise lost in searching for, recovering, and deciphering data in the future.

SIMPLIFIES YOUR LIFE
Managing your data by adopting an organization scheme, developing a description standard, and creating a preservation plan avoids future confusion and turmoil.

INCREASES RESEARCH EFFICIENCY
By saving time and avoiding confusion you will be more efficient! Manage your data for the future and you will be able to find, access, understand, and use your data more easily.

ENSURES RESEARCH INTEGRITY
Good research data management makes it more feasible to fulfill the commitments of responsible research.

Page 8: Research Data Management: Part 1, Principles & Responsibilities

Up to 80% of data lost within 20 years of publication:

•  516 ecology papers published between 1991 and 2011
•  The chance of data being accessible fell by 17% per year

Vines, T. H. et al. Curr. Biol. http://dx.doi.org/10.1016/j.cub.2013.11.014 (2013)

http://www.nature.com/news/scientists-losing-data-at-a-rapid-rate-1.14416

Page 9: Research Data Management: Part 1, Principles & Responsibilities

When you engage in research at Columbia University you must:

•  Be ethical in the conduct of the research
•  Abide by regulations and policies
•  Be responsible stewards of the research dollars and other resources
•  Share the results of your research for the good of society

Managing data is a critical responsibility for all researchers

Page 10: Research Data Management: Part 1, Principles & Responsibilities

Managing research data enables sharing, which:

•  Increases visibility
•  Facilitates discovery
•  Satisfies funder & journal requirements
•  Reinforces open scientific inquiry
•  Establishes priority & enables citation
•  Speeds research

Adapted from: https://libraries.mit.edu/guides/subjects/data-management/why.html & http://researchdata.wisc.edu/share-your-data/data-access-2/

Page 11: Research Data Management: Part 1, Principles & Responsibilities

Sharing enables breakthroughs that lead to economic development:

“Scientific research supported by the Federal Government catalyzes innovative breakthroughs that drive our economy. The results of that research become the grist for new insights and are assets for progress in areas such as health, energy, the environment, agriculture, and national security.”

http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf

Page 12: Research Data Management: Part 1, Principles & Responsibilities

“…a research project's success is measured … also by the data it makes available to the wider community.”

“It is obvious that making data widely available is an essential element of scientific research.”

Page 13: Research Data Management: Part 1, Principles & Responsibilities

The directive to make federally funded research data openly accessible “is integrally tied to and supports the mission of higher education to produce, preserve, and share scholarship. It therefore provides the community with an opportunity to marshal its resources to improve the interoperability of research support systems and maximize the value of research funding.”

Association of Research Libraries (ARL) on the Office of Science & Technology Policy memorandum “Increasing Access to the Results of Federally Funded Scientific Research”

Page 14: Research Data Management: Part 1, Principles & Responsibilities

Data sharing is required by:

•  Funders
   –  Federal agencies
   –  Foundations
•  Journals

Page 15: Research Data Management: Part 1, Principles & Responsibilities

National Science Foundation (NSF):

“Beginning January 18, 2011, proposals submitted to NSF must include a supplementary document of no more than two pages labeled "Data Management Plan" (DMP). This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results. Proposals that do not include a DMP will not be able to be submitted.”

https://www.nsf.gov/eng/general/dmp.jsp

Page 16: Research Data Management: Part 1, Principles & Responsibilities

National Institutes of Health (NIH):

“Data sharing is essential for expedited translation of research results into knowledge, products and procedures to improve human health.” 1

“…all investigator-initiated applications with direct costs greater than $500,000 in any single year will be expected to address data sharing in their application” 2

1 http://grants.nih.gov/grants/policy/data_sharing/
2 http://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html

Page 17: Research Data Management: Part 1, Principles & Responsibilities

…and not just them! More federal agencies will be requiring public access to data:

“The Office of Science and Technology Policy (OSTP) hereby directs each Federal agency with over $100 million in annual conduct of research and development expenditures to develop a plan to support increased public access to the results of research funded by the Federal Government.”

“…digitally formatted scientific data resulting from unclassified research supported wholly or in part by Federal funding should be stored and publicly accessible to search, retrieve, and analyze.” (2013)

http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf

Page 18: Research Data Management: Part 1, Principles & Responsibilities

Journal Sharing Policies:

“It is the policy of the American Economic Review to publish papers only if the data used in the analysis are clearly and precisely documented and are readily available to any researcher for purposes of replication. Authors of accepted papers that contain empirical work, simulations, or experimental work must provide to the Review, prior to publication, the data, programs, and other details of the computations sufficient to permit replication.”
http://www.aeaweb.org/aer/data.php

“…authors are required to make materials, data and associated protocols promptly available to readers.”
http://www.nature.com/srep/policies/index.html

Page 19: Research Data Management: Part 1, Principles & Responsibilities

Journal Sharing Policies:

“…trials of drugs and medical devices will be considered for publication only if the authors commit to making the relevant anonymised patient level data available on reasonable request”
http://www.bmj.com/about-bmj/resources-authors/article-types/research

“All data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science. All computer codes involved in the creation or analysis of data must also be available...”
http://www.sciencemag.org/site/feature/contribinfo/prep/gen_info.xhtml#dataavail

Page 20: Research Data Management: Part 1, Principles & Responsibilities

Benefits of good data management & sharing practices:

•  Increase citations
•  Avoid retractions (& potential misconduct questions)
•  Advance knowledge
•  Enable reproducibility

Page 21: Research Data Management: Part 1, Principles & Responsibilities

Increase citations:

Publicly available data was significantly associated with a 69% increase in citations, independent of journal impact factor, date of publication, and author country of origin using linear regression.

Piwowar HA, Day RS, Fridsma DB (2007). Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308

Page 22: Research Data Management: Part 1, Principles & Responsibilities

Avoid retractions:

http://retractionwatch.wordpress.com/2013/10/30/nejm-paper-on-sleep-apnea-retracted-when-original-data-cant-be-found/

Page 23: Research Data Management: Part 1, Principles & Responsibilities

Advance knowledge:

“70 percent of published genetic sequence comparisons are not publicly accessible, leaving researchers worldwide unable to get to critical data they may need to tackle a host of problems ranging from climate change to disease control.”

http://www.sciencedaily.com/releases/2013/09/130903194155.htm

Page 24: Research Data Management: Part 1, Principles & Responsibilities

Enable reproducibility:

Looking at 238 recently published papers, pulled from five fields of biomedicine, a team of scientists found that just under 50 percent of the research materials, from lab mice to antibodies, used in the work could not be identified. This phenomenon impedes the ability of scientists to reproduce & extend published studies.

Vasilevsky NA, Brush MH, Paddock H, Ponting L, Tripathy SJ et al. (2013) On the reproducibility of science: unique identification of research resources in the biomedical literature. PeerJ 1:e148 http://dx.doi.org/10.7717/peerj.148

Page 25: Research Data Management: Part 1, Principles & Responsibilities

TAKE-AWAYS

•  Researchers share the products of their research (e.g., publications, data) for the good of:
   –  Society
   –  Advancement of science
   –  Themselves

•  Data management is required by:
   –  Funding bodies
   –  Institutions

•  Data sharing is a requirement of:
   –  Funding bodies
   –  Publishers

Page 26: Research Data Management: Part 1, Principles & Responsibilities

What are the research data that need to be managed?

Page 27: Research Data Management: Part 1, Principles & Responsibilities

Defining Research Data:

“…information created [or discovered] in the course of research” 1

Material or information “on which an argument, theory, test or hypothesis, or another research output is based.” 2

“(i) Research data is defined as the recorded factual material commonly accepted in the [research] community as necessary to validate research findings…” 3

1 Marieke Guy. http://www.slideshare.net/MariekeGuy/bridging-the-gap-between-researchers-and-research-data-management , #2

2 Queensland University of Technology. Manual of Procedures and Policies. Section 2.8.3. http://www.mopp.qut.edu.au/D/D_02_08.jsp

3 http://www.whitehouse.gov/omb/circulars_a110#36

Page 28: Research Data Management: Part 1, Principles & Responsibilities

Data may be collected in many ways:

Through real-time & unique observations, from repeatable experiments or simulations, or through derivations from unique collections of data, as a few examples.

Data collection method costs and risks: Some data may be impossible to replace; some data may merely be very expensive to replace. Alternatively, some data are so cheap and quick to acquire that it is less expensive to repeat the collection process than to store the data, e.g., some gene sequences.

Page 29: Research Data Management: Part 1, Principles & Responsibilities

Data may be classified by collection method. Methods include:

OBSERVATIONS – collected in real time / irreplaceable, e.g. survey results, images, telemetry, sensor readings, some literary/historical sources, recordings

EXPERIMENTS – reproducible / variable expense, e.g. chromatograms, antenna mappings, word frequency

SIMULATIONS – models & inputs used to create datasets, e.g. economic models, climate models

DERIVATIONS/COMPILATIONS – reproducible / expensive, e.g. text or data mining, 3D models, compiled database

RESEARCH PROCESS DATA – real time / irreplaceable, e.g. survey instruments, data description/documentation, developed software, algorithms, code/script, instrument settings

Page 30: Research Data Management: Part 1, Principles & Responsibilities

TRUE OR FALSE:

In scientific research, only the information and observations that are collected as part of your research are considered data.

Page 31: Research Data Management: Part 1, Principles & Responsibilities

FALSE

Data are not only the information and observations made as part of scientific research but also the materials, the means, and the products of that research. Examples:
•  Survey instruments
•  Associated software
•  Cell lines
•  Specimens

Page 32: Research Data Management: Part 1, Principles & Responsibilities

Information exists in different forms during the research process:

RAW OR PRIMARY – Lab notebooks, observational notes, instrument readings, images, footage, individual survey responses, historical sources, textual analysis, etc.

PROCESSED – Statistical analyses, sources organized as evidence, rich descriptions, aggregated survey responses, etc.

PUBLISHED – Distribution in some finalized format to those outside of the project. Distribution may occur in both static and dynamic (e.g. longitudinal data sets with annual reporting) instances, etc.

Page 33: Research Data Management: Part 1, Principles & Responsibilities

The federal Office of Management and Budget offers the following explanation (summarized):

“Research data means the recorded factual material commonly accepted in the scientific community as necessary to validate research findings”

Exclusions:

•  Preliminary analyses
•  Drafts of scientific papers
•  Plans for future research
•  Peer reviews
•  Communications with colleagues
•  Trade secrets
•  Commercial information
•  Materials necessary to be held confidential by a researcher until they are published
•  Information which is protected under law
•  Personnel and medical information
•  Information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study

2 CFR § 200.315, Intangible property, e (3)
http://federalregister.gov/a/2013-30465

Page 34: Research Data Management: Part 1, Principles & Responsibilities

Sensitive data:

Some data on research subjects may require special protections because they are highly sensitive and highly regulated. These sensitive data may require encryption and other security measures:

•  Protected Health Information (PHI) e.g. insurance information, health conditions, etc.

•  Personally Identifying Information (PII) e.g. financial information, social security numbers, etc.

There are a number of university policies that govern handling information of these types. Special training is required for researchers and others handling PHI.

Page 35: Research Data Management: Part 1, Principles & Responsibilities

Release of sensitive data can damage:

-  Individuals whose data were released: identity theft, financial loss, privacy violations, etc.
-  Research team members: loss of reputation, loss of position
-  Research institution: financial liability

UNIVERSITY RESOURCES:
-  Office of HIPAA Compliance website
-  HIPAA training
-  IRB website and training
-  Data Classification Policy
-  Policy on Electronic Data Security Breach Reporting and Response
-  Other IT Security Policies

Page 36: Research Data Management: Part 1, Principles & Responsibilities

Take-aways:

•  Definitions of data are varied
   –  Use the one(s) appropriate to your research community

•  Some data are sensitive
   –  Know which data they are
   –  Know and take the proper precautions to protect these data

Page 37: Research Data Management: Part 1, Principles & Responsibilities

Who is responsible for Research Data Management?

Page 38: Research Data Management: Part 1, Principles & Responsibilities

Who is responsible for research data?

The PI is ultimately responsible for the data, and is the steward of the data (more on this later).

It is incumbent upon every member of the research team to safeguard research products (more on this later, too).

Page 39: Research Data Management: Part 1, Principles & Responsibilities

PI responsibilities:

“The full administrative, fiscal and scientific responsibility for the management of a sponsored project resides with the principal investigator named in the award” (Faculty Handbook 2008)

As with all aspects of a proposal submission, the PI must be involved with establishing and describing an appropriate data management plan, as required.

Page 40: Research Data Management: Part 1, Principles & Responsibilities

PI responsibilities:

The PI is responsible for the collection, management, maintenance and retention of research data accumulated during a research project. It is the PI’s responsibility to:

•  Determine what records need to be retained to comply with sponsor requirements

•  Adopt an orderly system of data organization

•  Communicate the chosen system to all members of a research group & to the appropriate administrative personnel

•  Establish & maintain procedures for protection of essential records in the event of a natural disaster or other emergency

Sponsored Projects Handbook

Page 41: Research Data Management: Part 1, Principles & Responsibilities

Research team member responsibilities:

Everyone involved in the research project is responsible for adhering to the statements and requirements presented in the data management plan, and all other data management practices related to the research project.

These may include practices of handling:

•  Physical data e.g., lab notebooks, samples, data documentation (aka metadata), etc.

•  Electronic data e.g., file naming conventions, generating metadata, keeping an e-lab notebook, data storage, data back-ups, annotating findings
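As one illustration of a file naming convention for electronic data, the sketch below builds names from a project label, a short description, a collection date, and a version number. The function names (`slug`, `build_filename`), the field order, and the separators are assumptions made for this example; they are not prescribed by the course, and a research group would agree on its own scheme in its data management plan.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and collapse every run of other characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, collected: date,
                   version: int, ext: str) -> str:
    """Hypothetical convention: project_description_YYYYMMDD_vNN.ext"""
    clean_ext = ext.lstrip(".").lower()
    return (f"{slug(project)}_{slug(description)}_"
            f"{collected:%Y%m%d}_v{version:02d}.{clean_ext}")


# Example: version 2 of a pH data file collected on 2015-05-12.
example = build_filename("Soil Study", "Site A pH", date(2015, 5, 12), 2, ".CSV")
print(example)
```

Putting the date in year-month-day order and zero-padding the version number means that an alphabetical file listing also sorts chronologically and by version.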

Page 42: Research Data Management: Part 1, Principles & Responsibilities

Take-aways:

•  PI is responsible for all aspects of the grant, including data management

•  All members of the research team are responsible for adhering to the data management plan

See Part 2 of this course for details on WHEN & HOW to practice Research Data Management

Research data management can be complex, but there are resources available → SEE NEXT PAGE!

Page 43: Research Data Management: Part 1, Principles & Responsibilities

Resources for Research Data Management:

Title   URL

Scholarly Communications Program, Data Management   http://scholcomm.columbia.edu/data-management/

Research and Data Integrity Program (ReaDI)   http://www.columbia.edu/cu/compliance/docs/ReaDI_Program/index.html

Data Management Plan Templates   http://scholcomm.columbia.edu/data-management/data-management-plan-templates/

CUIT Research Computing Services   http://rcs.columbia.edu

Academic Commons Archival Storage   http://academiccommons.columbia.edu/about

Citation Management   http://library.columbia.edu/research/citation-management.html

Managing Secure Information - Training   http://columbia.sighttraining.com

Data Security Policies   http://policylibrary.columbia.edu/category/computingtechnology

Page 44: Research Data Management: Part 1, Principles & Responsibilities

REFERENCES

•  Scott, Mark, Boardman, Richard P., Reed, Philippa A.S. and Cox, Simon J. (2012) Introducing research data. Southampton, GB, University of Southampton, 29pp. http://eprints.soton.ac.uk/338816/

•  Responsible research data management and the prevention of scientific misconduct. www.knaw.nl/Content/Internet_KNAW/publicaties/pdf/2013449.pdf

•  http://dmconsult.library.virginia.edu/

Created by: Amy Nurnberger, 2015-05-12