Research Data Management: What’s it all about?
Jane Fry, MacOdrum Library
November 21, 2017
Acronyms
Research data management| aka RDM
| aka data management
• aka DM
Research data management plan| aka RDMP
| aka data management plan
• aka DMP
Why the Library?
Research partner
Support the research endeavor
RDM expert
Partner with CU Research Office
The scholarly life-cycle
Discipline-agnostic
Why the Library? (cont’d)
Our role| Information
| Consultation
Challenge| Determine how we can help researchers advance their
research
References: Rambo, Neil; Shorish, Yasmeen
What are Research Data?
Exercise:
| What do you think – what are research data?
*Take 2 minutes to discuss in your group.
* Someone record answers, please.
What are Research Data?
“Research data may be defined as the factual records (e.g.
microarray, numerical and textual records, images and
sounds, etc.) used as primary sources for research, and
that are commonly accepted in the research community
as necessary to validate research findings.
For the most part these data are born digital, and stored
and managed electronically, making them easy to share,
replicate, and combine with other data. …”
Source:
http://www.carl-abrc.ca/advancing-research/research-data-management/
Research Data
Why are research data important?
Sharing research data
Check out the following examples …
Example: Reproducibility
Political Persuasion and Attitude
Change Study: The Los Angeles
Longitudinal Field Experiments, 2013-
2014
Principal Investigator:| Michael J. LaCour
Reference: https://www.openicpsr.org/openicpsr/project/100037/version/V8/view
Another Example
“New Study Links Vaccines To Autism.
There's Just One Tiny Problem With It”
“… one of its own co-authors claimed
that figures in the paper were
deliberately altered before
publication. The data had been tampered
with. …”
Source: http://bit.ly/2zSwAxo
…
“Researchers from the University of British Columbia
are retracting their scientific paper linking aluminum in
vaccines to autism in mice, because one of the co-
authors claims figures published in the study were
deliberately altered before publication — an issue he
says he realized after allegations of data manipulation
surfaced online.”
“…original data cited in the study is inaccessible, which
would be a contravention of the university's policy
around scientific research. ”
“…the original data is in China, with an analyst who
worked on the paper.”
(October 16, 2017)
Source: http://www.cbc.ca/news/canada/british-columbia/ubc-autsism-vaccine-paper-retraction-chris-shaw-1.4351855
What is Research Data Management?
Exercise:
| Research data management (RDM) - What do
you think it is?
(*2 minutes in your group)
(*Have a recorder!)
What is RDM?
What is it?| “ …describes the activities researchers perform as they
create and save their research data.”
• Source: http://researchdata.library.ubc.ca/learn/
Includes:| Sound practices
| Data curation
| Data stewardship
Benefits of RDM
Confirmation of original findings
Further research
Planning follow-up studies
Bonus …
Why RDM Now?
Requirement by funders
| Tri-Council (SSHRC, CIHR and NSERC)
| CFI
| Genome Canada
Tri-Agency Statement of Principles on Digital
Data Management
We should be ahead of the curve in this
You are at the beginning of a research career
RDM and Data Stewardship
Managing research data entails the many
activities dealing with the operational support of
data across the stages of the research lifecycle.
This involves the “what” and “how” of research
data.
Data Stewardship involves assigning
responsibility for ensuring data management
activities are performed to best practice levels
and standards across the complete lifecycle.
This addresses “who” is responsible for specific
data activities.
Source: Moon, J. & Fry, J., September 2017
Metadata (cont’d)
What is it| Information about the data
| Usually in a standardized and structured format
Explains …
Why is it important
Who enters it
Metadata (cont’d)
Why keep metadata| Researchers re-use data
| Good research practice
When to record it
What to keep| What do you think?
Metadata (cont’d)
What to keep - everything!| Research design
| Data collection
| Data preparation
| Questionnaires
| Interviewer instructions
| Details of decisions made
– Why certain decisions were made
» e.g. if data collection was not to be done on a certain date (Easter)
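A minimal sketch, in Python, of how the items on this slide might be kept together in one structured record. The field names and example values are hypothetical and only mirror the list above; they do not follow any particular metadata standard.

```python
import json
from datetime import date

def make_study_record(title, design, collection, preparation):
    """Return a dictionary of study-level metadata with an empty decision log."""
    return {
        "title": title,
        "research_design": design,
        "data_collection": collection,
        "data_preparation": preparation,
        "decisions": [],
    }

def log_decision(record, when, what, why):
    """Append a decision, and the reason behind it, to the record."""
    record["decisions"].append(
        {"date": when.isoformat(), "decision": what, "reason": why}
    )
    return record

# Hypothetical example values, for illustration only.
record = make_study_record(
    title="Example survey (hypothetical)",
    design="Cross-sectional survey",
    collection="Telephone interviews",
    preparation="Double entry, then range checks",
)
log_decision(record, date(2017, 4, 10), "No interviews on April 16", "Easter Sunday")

# Plain JSON keeps the record readable by people and by software.
metadata_json = json.dumps(record, indent=2)
```

Recording the reason next to each decision is the point of the sketch: the "why" is usually the first thing to be forgotten.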
Metadata (cont’d)
Keep all processes| What worked
| What didn’t work
| Changes made after pilots were conducted
| Why they were made
| Whether another pilot was conducted after the changes were made
| Any and all changes that were made or not made
End goal
Documentation
What documentation will be needed| For the data to be read
| For the data to be interpreted correctly
| For the data to be replicated in the future, if necessary
What do you think?
Documentation (cont’d)
Study background| Purpose
| Time frame
| Geographic location
| Creator
| Sampling design
• Description
• Size
| Any changes that were made
| …
Documentation (cont’d)
Study description| Describes all aspects of the data collection and processing
• Data collection methodology
• Data preparation procedure
• Instruments used
• Geographic coverage
• Temporal coverage
• Date of file creation
• Description of codes used
Data description & collection
Data description| A brief summary of the overall project
Data collection| Collection method
| File formats
| File names
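One way to keep file names consistent is to build them from the same parts every time. The sketch below assumes a hypothetical convention of project, content, date and version; the convention and the function name are illustrative, not a prescribed standard.

```python
import re
from datetime import date

def build_file_name(project, content, when, version, extension):
    """Build a name such as 'project_content_YYYYMMDD_v02.ext'."""
    # Lowercase each part and replace runs of other characters with a hyphen.
    cleaned = [
        re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        for part in (project, content)
    ]
    stamp = when.strftime("%Y%m%d")  # year-month-day sorts chronologically
    ext = extension.lstrip(".").lower()
    return f"{cleaned[0]}_{cleaned[1]}_{stamp}_v{version:02d}.{ext}"

# Hypothetical example values.
name = build_file_name("Pilot Study", "Interview Notes", date(2017, 11, 21), 2, ".CSV")
```

Avoiding spaces and putting the date in year-month-day order means the files sort sensibly in any folder listing.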
Exercise:
Choose one of the data types identified in the
previous exercise and draw a lifecycle model
representing the steps through which the data
would flow in a research project.
Focus on high-level, generalized steps in the
research process – aim for six to eight steps.
*Take 10 minutes, and then we’ll report back…
RDM Lifecycle
University of Central Florida Libraries
Source: http://stars.library.ucf.edu/cgi/viewcontent.cgi?article=1058&context=lib-docs
UKDA RDM Lifecycle
Source: http://www.data-archive.ac.uk/create-manage/life-cycle
Data processing (cont’d)
Variables
| Names
| Labels
• Comprehensible
• Unique
| Description
| Value labels
• Comprehensible
• Complete
| Associated question
Missing values
| Codes used
• Should be consistent
| Reasons for missing values
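A minimal sketch of a codebook that records the items listed above, with a check that every value in the data has a label. The variable names, labels and missing-value codes are hypothetical examples, not taken from a real study.

```python
# Hypothetical codebook: variable name -> label, associated question, value labels.
CODEBOOK = {
    "age_group": {
        "label": "Age group of respondent",
        "question": "Q1. What is your age?",
        "value_labels": {1: "18 to 34", 2: "35 to 54", 3: "55 and over"},
    },
    "employed": {
        "label": "Currently employed",
        "question": "Q2. Are you currently employed?",
        "value_labels": {0: "No", 1: "Yes"},
    },
}

# One set of missing-value codes, used consistently for every variable.
MISSING_CODES = {-9: "Refused", -8: "Don't know"}

def check_variable(name, values, codebook=CODEBOOK, missing=MISSING_CODES):
    """Return a list of problems: no codebook entry, or values lacking a label."""
    entry = codebook.get(name)
    if entry is None:
        return [f"{name}: no codebook entry"]
    allowed = set(entry["value_labels"]) | set(missing)
    unlabeled = sorted(set(values) - allowed)
    if unlabeled:
        return [f"{name}: values without labels {unlabeled}"]
    return []

# The value 4 has no label, so it is reported.
problems = check_variable("age_group", [1, 2, 3, -9, 4])
```

Keeping the missing-value codes in one place is what makes them consistent across variables.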
Data preservation
Preservation| Where
| Where will all information be backed up
• If at your institution
| How often will you back up
| Long-term preservation and access
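A backup is only useful if the copy matches the original. The sketch below copies one file and compares SHA-256 checksums; the file and folder names are hypothetical, and a real backup would go to separate storage rather than a temporary folder.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def back_up(source, backup_dir):
    """Copy one file into backup_dir and confirm the copy matches the original."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target

# Demonstration in a temporary folder with a made-up data file.
with tempfile.TemporaryDirectory() as work:
    original = Path(work) / "survey.csv"
    original.write_text("id,answer\n1,yes\n")
    copy = back_up(original, Path(work) / "backup")
    backup_ok = sha256_of(original) == sha256_of(copy)
```

Storing the checksums alongside the files also allows the same comparison to be repeated years later, which supports long-term preservation.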
Data sharing
Why share?
What data will you be sharing?
How will you be sharing your data?
Describe any restrictions placed on your data
for access when they are made available
Confidentiality
What procedures will be taken to ensure confidentiality| Not possible to identify any individual
Record all decisions made| Why each decision was made
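One common first step is to replace direct identifiers with random study IDs. The sketch below is an illustration only, with hypothetical field names; removing direct identifiers does not by itself guarantee that no individual can be identified, since combinations of other fields may still point to a person.

```python
import secrets

def pseudonymize(rows, id_field, drop_fields):
    """Replace a direct identifier with a random study ID and drop other fields.

    Returns the cleaned rows and the key linking originals to study IDs.
    The key would need to be stored separately and securely.
    """
    key = {}
    used = set()
    cleaned = []
    for row in rows:
        original = row[id_field]
        if original not in key:
            new_id = "P" + secrets.token_hex(4)
            while new_id in used:  # make sure no two people share an ID
                new_id = "P" + secrets.token_hex(4)
            used.add(new_id)
            key[original] = new_id
        new_row = {
            k: v for k, v in row.items()
            if k != id_field and k not in drop_fields
        }
        new_row["study_id"] = key[original]
        cleaned.append(new_row)
    return cleaned, key

# Made-up records; the same person appears twice and keeps one study ID.
rows = [
    {"name": "Person A", "email": "a@example.org", "answer": 1},
    {"name": "Person B", "email": "b@example.org", "answer": 2},
    {"name": "Person A", "email": "a@example.org", "answer": 3},
]
cleaned, key = pseudonymize(rows, "name", ["email"])
```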
Summing up …
You have learned| What research data are
| What RDM is
| What a research lifecycle is
http://taitegallery.net/wp-content/uploads/2012/02/unanswered-questions.jpg
What’s next?
Need an RDMP
Why an RDMP?| Safety
| Efficiency
| Quality
If no RDMP?| Potential problems
DMPs can help researchers
a. identify institutional services that can
support their data during a project and
after it ends,
and
b. determine the process for transferring
their data.
Platforms & Templates
Web-based data management platforms| Tools
| Software
Data stewardship templates| Frameworks
| Used for planning
| Within a platform
| Could also be called a form
Data Management
Exercise:
• Choose one step in the research
lifecycle.
• List three data management activities
that would be conducted for this step.
* 5 minutes!
SSHRC DMP Workshop feedback
§ “As we are at the beginning of our project, the DMP
process really helped us to plan what we want to do with
our data and how we want to proceed. We found it very
useful. However lots of points will need to be clarified.”
§ “The process of writing the DMP really pointed out the
great diversity of the data we are dealing with within our
project, and showcased the importance of having a distinct
data management approach for each type of data, and the
challenges that come with it.”
§ “I came to see the DMP process as about planning for the
preservation and archivization of data, not about its ideal
presentation or optimal accessibility.”
Goals of the Portage Network
Foster a community of practice for research data
management (RDM)
Facilitate and provide leadership in the
development of RDM infrastructure
Engage and advocate for research data
management with stakeholder communities
RDMP (Cont’d)
Portage DMP Assistant
| Data Collection
| Documentation and Metadata
| Storage and Backup
| Preservation
| Sharing and Re-use
| Responsibilities and Resources
| Ethics and Legal Compliance
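The seven sections above can also serve as a simple checklist while drafting a plan. The sketch below is a hypothetical structure for tracking which sections still need an answer; it is not the DMP Assistant's data model or API, and the example answer is made up.

```python
# Section names as listed on this slide.
DMP_SECTIONS = [
    "Data Collection",
    "Documentation and Metadata",
    "Storage and Backup",
    "Preservation",
    "Sharing and Re-use",
    "Responsibilities and Resources",
    "Ethics and Legal Compliance",
]

def new_plan(title):
    """Return an empty plan with one blank answer per section."""
    return {"title": title, "sections": {name: "" for name in DMP_SECTIONS}}

def unanswered(plan):
    """List the sections that have no text yet."""
    return [name for name, text in plan["sections"].items() if not text.strip()]

# Hypothetical plan with a single section filled in.
plan = new_plan("Example project (hypothetical)")
plan["sections"]["Storage and Backup"] = "Weekly backup to institutional storage."
remaining = unanswered(plan)
```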
Ethics and legal compliance
Sensitive data| Primary use
| Secondary use
Legal, ethical and IP issues
Exporting
The export tab displays a plan in full, or selectively for specific themes and their questions
Export formats include:| pdf
| csv
| html
| json
| text
| xml
| docx
DMP Template
Exercise:
Choose one of the sections from the DMP
template.
Read over the questions for your section.
Answer one of the questions listed.
Bonus
Are there any other questions that you think were
omitted?
* Hint: use your research experience to help
answer these questions
* You have 5 minutes!
Still don’t believe me?
What could happen if you don’t practice
good RDM?
https://www.youtube.com/watch?v=N2zK3sAtr-4#t=17
RDM help
Help with RDM| https://library.carleton.ca/services/research-data-management
| Consultations
Help with RDMPs| Portage: https://assistant.portagenetwork.ca/
| Word template: https://library.carleton.ca/services/research-data-management#how
In sum …
You are now able to:
| Define the key components of RDM
| Define an RDMP
| Create an RDMP
Resources
RDM at Carleton
https://library.carleton.ca/services/research-data-management
Portage DMP Assistant
https://portagenetwork.ca/
Research Data Lifecycle (UK Data Archive)
http://www.data-archive.ac.uk/create-manage/life-cycle
Tri-Agency Statement of Principles on Digital Data Management
http://www.science.gc.ca/default.asp?lang=En&n=547652FB-1
References
Moon, Jeff & Fry, Jane. “Portage and Data Management Plans
in Canada: Policies, Templates and Platforms.” Presentation at
Portage & RDM in Canada, September 18, 2017.
https://portagenetwork.ca/about/documents-and-presentations/
Rambo, Neil (October 22, 2015). “Research data management
roles for Libraries”.
http://www.sr.ithaka.org/publications/research-data-management/
Shorish, Yasmeen (November 23, 2015). “The Library as
Research Partner”. ACRL TechConnect Blog.
http://acrl.ala.org/techconnect/post/the-library-as-research-partner
Stephenson, L. “Data management for advanced research”.
Presentation given 28 March 2008. UCLA Social Science Data
Archive, Unpublished.
UK Data Archive. “Create & manage data: Research Data
lifecycle”. Retrieved 13 October 2013 from
http://data-archive.ac.uk/create-manage/life-cycle
http://www.quotationof.com/images/question-quotes-7.jpg
Thank you!
Jane Fry
Data Services Librarian
Rm 122
MacOdrum Library
613.520.2600 x1121
http://www.library.carleton.ca/find/data