Martin Hamilton, Martin Donnelly. Implementing Open Access conference, June 2014. Effective Management of your Research Data
Jan 16, 2015
Outline
»1. Background
»2. Jisc Co-Design challenge: Research at Risk
»3. RDM support from the DCC
– Capability studies
– Data management planning
– Training and development
»4. UK HEI survey
»5. Feedback and futures
Background: About Jisc
» Registered charity championing the use of digital technologies in research and education
» Wide range of shared services for UK Universities and Colleges, e.g.
‑ JANET, world leading NREN
‑ Groundbreaking content deals with publishers
‑ Cloud brokerage, e.g. Amazon portal
» R&D achievements such as:
‑ IETF standards track Moonshot project
‑ Pioneering work in Open Educational Resources, Open Access and Open Data
Background: About the DCC
The DCC (est. 2004) is…
» UK centre of expertise in digital preservation, with a particular focus on research data management (RDM)
» Based across three sites: Universities of Edinburgh, Glasgow and Bath
» Working with a number of UK universities to identify gaps in RDM provision and raise capabilities across the sector
» Also involved in a variety of international collaborations
Background: Why RDM?
Research Data Management (RDM) is:
» An integral part of doing quality research in the 21st century
» Increasingly expected / mandated by funders, publishers and others
» An opportunity for new discoveries and different approaches to research
» A safeguard against inappropriate data disclosure
» An activity that requires careful planning and consideration, and – ideally – coordination and support across many stakeholder types
Background: Policy drivers
» Seven “Common Principles on Data Policy” – Data as a public good; Preservation; Discovery; Confidentiality; Right of first use; Recognition; Public funding for RDM
» Six of the seven RCUK funders require data management plans, or equivalent, at the application stage
» The other (EPSRC) requires nothing less than an institutional data infrastructure (by May 2015). We expect that DMP will be a key component in many cases…
Background: Horizon 2020
» From 2014, Data Management Plans are required for ‘key areas’ of the Horizon 2020 programme, covered by the Open Data Pilot. These include several technology-oriented strands and others addressing ‘societal challenges’ – we are expecting compliance requirements to be further detailed via specific calls.
» Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020 (pp. 8-11): http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
» These guidelines echo the G8 Science Ministers’ statement (2013), which offered similar good practice principles: https://www.gov.uk/government/news/g8-science-ministers-statement
Background: Collaborations
» Liaison: UK HEIs and research institutes – e.g. DBIS, HEFCE, libraries, IT directors, RCUK, publishers etc
» Research Sector Transparency Board, RCUK National E-Infrastructure Group & E-Infrastructure Leadership Council
» Work with international initiatives:
› Research Data Alliance, CODATA, EuroCRIS, ANDS
› Knowledge Exchange – at the moment through the KE we are exploring incentives to sharing & funding models for research data infrastructure
» European projects: SIM4RDM & 4Cs
» Jisc CASRAI - UK pilot: development of related vocabularies and standards, for example data management plan vocabularies
Research Data Management and Discovery Services for the Research Data Lifecycle
[Diagram]
» Researchers have a cohesive and interlocking suite of research data management, publication and discovery services
» Librarians, research managers & IT have three interlocking suites of services, to support researcher needs and institutional policies:
› Research Data Management and Planning Services
› Research Data Storage and Archival Services
› Research Data Discovery Services
» There is a set of infrastructure components that underpin all three suites
» Components shown: DMPonline; DMP Registry; Journal Policies Registry; Research Data Registry / Cross Repository Discovery Service; SWORD+; Research Data Management Applications; Institutional Data Catalogues; Disciplinary Data Repositories (National and International); UKDA, BADC; ICSU / WDS; EBI / GenBank; Disciplinary Research Data Discovery Services; Metadata Exchange Between Journals, Archives, Repositories; Researcher identifiers; Organisation identifiers; Registries; Data Identifiers and Metadata Schema; Support for Research Data Lifecycle; Cloud/Storage
» Key: Established service / Project; JISC supported / Other supported
Background: The co-design process
Support from DCC: Helping institutions
http://blog.soton.ac.uk/keepit/2010/01/28/aida-and-institutional-wobbliness/
» Three principal areas for HEIs to focus on:
› Developing and integrating their technical infrastructure (storage space, repositories/ CRIS systems, data catalogues, etc)
› Developing human infrastructure (creating policies, assessing current data management capabilities, identifying areas of good practice, data management plan templates, tailoring training and guidance materials…)
› Developing business plans for sustainable services / roles
Support from DCC: Institutional engagement
Support from DCC: Prioritising effort
» RDM is a complex and hybrid issue, involving a heterogeneous mix of stakeholder groups
» It can’t all be tackled at once, but rather should be planned out carefully and broken down into achievable goals
» Working with universities, we’ve carried out funder analyses, which in turn inform strategy and policy development
» Capability / maturity studies using CARDIO tool (about which more in a moment)
» Fact-finding exercises can also help to identify – and subsequently leverage – existing pockets of good practice and/or enthusiasm
Support from DCC: CARDIO
»CARDIO: Collaborative Assessment of Research Data Infrastructures and Objectives
› A methodology and tool to assess research data infrastructure and support
› It uses the concept of maturity, asking different stakeholders to rate provision on a 1-5 scale
› CARDIO is collaborative – the aim is to get multiple viewpoints to identify discrepancies and reach consensus
› http://cardio.dcc.ac.uk/
Support from DCC: CARDIO
»RDM maturity
› Assess a ‘data context’ – a place where data is created and managed (e.g. department, school, project, funding stream, institution...)
› How well can it/does it manage its data?
› That’s dependent on:
– Finances
– Technology
– Policy and procedures
– Organisational will
– Skills…
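The idea of collecting 1-5 maturity ratings from several stakeholders and looking for discrepancies can be sketched in a few lines of Python. This is only an illustration of the concept, not how the CARDIO tool itself works: the stakeholder names, the scores, the `summarise` function and the `spread_threshold` parameter are all assumptions made up for this example.

```python
from statistics import mean

# Hypothetical 1-5 maturity ratings per stakeholder group (invented for illustration).
ratings = {
    "library": {"finances": 2, "technology": 3, "policy": 4, "skills": 3},
    "it_services": {"finances": 2, "technology": 5, "policy": 2, "skills": 3},
    "research_office": {"finances": 3, "technology": 2, "policy": 4, "skills": 2},
}

def summarise(ratings, spread_threshold=2):
    """Return mean rating and spread per area, flagging areas where views diverge."""
    areas = sorted({area for scores in ratings.values() for area in scores})
    summary = {}
    for area in areas:
        scores = [s[area] for s in ratings.values() if area in s]
        spread = max(scores) - min(scores)
        summary[area] = {
            "mean": round(mean(scores), 2),
            "spread": spread,
            # A wide spread suggests the stakeholders should discuss this area.
            "discuss": spread >= spread_threshold,
        }
    return summary

for area, result in summarise(ratings).items():
    print(area, result)
```

Areas flagged with `discuss` are the ones where viewpoints differ most, which is where a consensus-building conversation would probably start.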
Support from DCC: Data Mgt Planning
» Analysed requirements; developed a Checklist; provided tools & guidance
» Analysis of funder policies (2009)
» DMPonline tool (2010)
» How-To guide (2011)
» https://dmponline.dcc.ac.uk/
Support from DCC: DMP Checklist
» Checklist for a Data Management Plan v4.0 (2013) www.dcc.ac.uk/resources/data-management-plans
DMP SECTIONS
1. Administrative Data, e.g. project name, description, PI, funder, etc
2. Data Collection, e.g. description, capture methods, etc
3. Documentation and Metadata, e.g. what information is needed for the data to be accessed and understood in the future?
4. Ethics and Legal Compliance, e.g. consent, sensitivity, copyright/IPR
5. Storage and Backup, e.g. where will data be held and backed up? Security and access issues
6. Selection and Preservation, e.g. keep it all or just some? How long should it be kept?
7. Data Sharing, e.g. how will data be found and accessed, any restrictions?
8. Responsibilities and Resources, e.g. who will do it and who will pay?
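As a rough illustration, the eight Checklist sections above could be held as a simple template and used to spot gaps in a draft plan. This is a minimal sketch and not DMPonline's implementation: the `missing_sections` helper and the draft plan contents are hypothetical, and only the section names come from the Checklist.

```python
# Section names taken from the DMP Checklist v4.0 list above.
DMP_SECTIONS = [
    "Administrative Data",
    "Data Collection",
    "Documentation and Metadata",
    "Ethics and Legal Compliance",
    "Storage and Backup",
    "Selection and Preservation",
    "Data Sharing",
    "Responsibilities and Resources",
]

def missing_sections(plan):
    """Return the checklist sections that are absent or blank in a draft plan."""
    return [name for name in DMP_SECTIONS if not plan.get(name, "").strip()]

# Hypothetical draft plan, invented for illustration.
draft_plan = {
    "Administrative Data": "Example project, PI and funder details",
    "Data Collection": "Survey responses captured via online forms",
    "Storage and Backup": "   ",  # whitespace only, so treated as unanswered
}

print(missing_sections(draft_plan))
```

A check like this only shows which sections have been left empty; whether the answers are adequate still needs a human reviewer.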
Support from DCC: DMPonline
» A free Web-based, Open Source data management planning tool incorporating templates and guidance: https://dmponline.dcc.ac.uk/
› v1 (April 2010)
› v2 (March 2011) added scope for multiple versions of plans and templates
› v3 (May 2012) added functionality for sharing plans
› v4 (November 2013) changed relationship with Checklist, improved usability
» Technologies involved: Ruby on Rails, JavaScript, MySQL database
Support from DCC: Training / community dev.
» We’ve run many awareness-raising and advocacy events and workshops for staff and students
» These can be general, or focused on particular elements of the data management ecosystem (for example, data management planning)
» We also facilitate internal working groups, which often bring together groups of colleagues not used to collaborating with each other
» We’ve recently been asked to provide training for EPSRC’s doctoral training centres – details TBC at this stage
» Lastly, we organise community events, such as the biannual Research Data Management Forum (most recent event was last week, on Workflows and Lifecycle Models)
UK HEI Survey
»2014 Survey of UK Higher Education Institutions
› Driven by T-1 year for EPSRC expectations
› National picture of institutional progress
› Understand barriers, gaps in support needs
› 20 questions – online survey link emailed
› To Pro-VCs for Research & Service Heads
› Library, IT, Research Support & Commercialisation
› Institutions with at least 10% income from research
UK HEI Survey: Who participated?
Respondents
Russell Group (39)
Others 10%+ (35)
Others (13)
From 61 institutions
UK HEI Survey: Demographics
[Pie chart of respondent roles]
» Segment values: 31%, 38%, 14%, 17%
» Legend: Research Support & Commercialisation; Library or Information Service; IT / Research computing; Others
UK HEI Survey: Institutional Drivers
[Bar chart: % agreeing]
» UK Research Council data policies: 92%
» Government policy on open data: 57%
» Governance of research integrity / academic conduct: 54%
» Strategy to expand support for research: 54%
» EU Horizon2020 policy on data management: 53%
UK HEI Survey: Areas with most progress
Policy development
Data Management & Sharing Plans
RDM skills training & consultancy
[Bar chart: % indicating piloting or live]
UK HEI Survey: Getting there?
Access & storage systems
Data cataloguing & publishing
Managing implementation as a whole
[Bar chart: % indicating piloting or live]
UK HEI Survey: Areas of least progress
Business planning & sustainability
Digital preservation & continuity planning
Governance of data access & reuse
[Bar chart: % indicating piloting or live]
UK HEI Survey: Obstacles
[Bar chart: % citing]
» Lack of appropriate staff resources and infrastructure: 71%
» Availability of funding: 64%
» Low priority for researchers: 59%
Issues needing External Support?
Defining what to retain and for how long
Specifying tools/ infrastructure
Supporting metadata creation for research data discovery
Identifying which costs may be recovered from grants
Advocacy to senior management
Developing data catalogues and registers
Feedback and futures
‑ Active storage is not compliance; there is evidence of HEIs repeating the same patterns & interpreting compliance differently.
‑ Creating compelling services that will appeal to researchers.
‑ Getting researchers to engage; researchers / PIs think of it as their data.
‑ Building trust across the parts of the HEI that are involved.
‑ Political issues and disciplinary differences.
‑ Misconceptions – not all data is open – need to be clear and ensure this is understood.
‑ Tools that take you through the whole journey.
‑ New shared services and brokered agreements, to avoid 150 HEIs coming up with their own solutions!
Feedback and futures
‑ Active storage is not compliance, evidence of HEIs repeating the same patterns & interpreting compliance differently.
‑ Creating compelling services that will appeal to researchers.
‑ Getting researchers to engage ; researchers /PI think of it as their data.
‑ Building trust in across the parts of the HEI that are involved.
‑ Political issues and disciplinary differences.
‑ Misconceptions – not all data is open – needto be clear and ensure this is understood.
‑ Tools that take you through the whole journey.
‑ New shared services and brokered agreements, to avoid 150 HEIs coming up with own solutions!
Feedback and futures
(With thanks to DCC colleagues for some slides)