RDM Data Storage Workshop, February 25th 2013, Brian Clifford, University of Leeds

Research Data Management Storage Requirements: University of Leeds

Nov 27, 2014


Research Data Management Storage Requirements Workshop, Mon 25 February, organised by Jisc, Janet and DCC. Presentation covers a research data survey, the RoaDMaP project, research data characteristics and potential storage requirements at the University of Leeds.
Transcript
Page 1: Research Data Management Storage Requirements: University of Leeds

RDM Data Storage Workshop

February 25th 2013

Brian Clifford

University of Leeds

Page 2: Research Data Management Storage Requirements: University of Leeds

The University of Leeds: Institutional Context

• 1,500 researchers (plus postgrads)
• £130m research income
• 80% RCUK funded
• 9 Faculties
– Devolved budgets
– Faculty based support for researchers

• Development of a central RDM service involving the Library, the Research and Innovation Office, IT Service and Staff Development, supporting staff based in Faculties

• Investigations being undertaken by the JISC-funded RoaDMaP Project

Page 3: Research Data Management Storage Requirements: University of Leeds

How much research data do you typically generate in a year?

Page 4: Research Data Management Storage Requirements: University of Leeds

What % research data would you need to keep for others to validate your research findings?

Page 5: Research Data Management Storage Requirements: University of Leeds
Page 6: Research Data Management Storage Requirements: University of Leeds
Page 7: Research Data Management Storage Requirements: University of Leeds

RoaDMaP considering aspects of Long term storage

• Tested use of F5 systems for virtual storage
• Archiving as a service – e.g. Arkivum
– Currently working on proof of concept depositing / retrieving large files

• Plan to investigate feasibility of integration with ePrints for retrieval of archived datasets.

• Pros and cons of outsourcing vs consortial options vs institutional options

• Does outsourcing help direct cost recovery from grants?
• Consortial options:
– White Rose (DCC Institutional Engagement Project)
– N8 (parallels with HPC model)?

Page 8: Research Data Management Storage Requirements: University of Leeds

Funding options

• Considering three different models for the funding of the institutional research data management service

– Top slice through RAM from Faculty income to pay for central service
– Strategy Development Funding (one off!)
– Recharge model

• Investigating all three to ensure that the model chosen does not lead to negative behaviours

• What can we afford, what do we need to store?

Page 9: Research Data Management Storage Requirements: University of Leeds

RDM Storage Requirements

Graham Blyth

JISC RoaDMaP Project

Engineering IT

Page 10: Research Data Management Storage Requirements: University of Leeds

Current estimate of required storage volume?

• MAPS 1 PByte
• Environment 1 PByte
• M+H 0.3 PByte
• FBS 0.25 PByte
• Engineering 0.1 PByte*
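As a quick check on scale, the faculty estimates above can be added up. The following is a minimal sketch that only uses the figures quoted on the slide; the names `ESTIMATES_PB`, `total_pb` and `shares` are made up for illustration and are not part of the presentation.

```python
# Faculty estimates in PByte, copied from the slide above.
ESTIMATES_PB = {
    "MAPS": 1.0,
    "Environment": 1.0,
    "M+H": 0.3,
    "FBS": 0.25,
    "Engineering": 0.1,
}


def total_pb(estimates):
    """Return the summed storage estimate in PByte."""
    return sum(estimates.values())


def shares(estimates):
    """Return each faculty's fraction of the summed estimate."""
    total = total_pb(estimates)
    return {name: value / total for name, value in estimates.items()}


if __name__ == "__main__":
    print(f"Total: {total_pb(ESTIMATES_PB):.2f} PByte")
    for name, share in shares(ESTIMATES_PB).items():
        print(f"{name}: {share:.0%}")
```

The sum comes to roughly 2.65 PByte, with MAPS and Environment together accounting for about three quarters of it.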

Page 11: Research Data Management Storage Requirements: University of Leeds

Research Scenarios

• Large volume – expensive – changing
• Large volume – expensive – static
• Large volume – cheap – static or changing
• Small volume – expensive

• Shared access
• Rate of creation
• Performance in use

Page 12: Research Data Management Storage Requirements: University of Leeds

Research Scenarios – Flame fronts

Raw data – high speed camera – large data, expensive experiment

Processed camera data – large data, moderately expensive process

Particle detection – moderate data, moderately expensive computation

Software development – small data, very expensive

Page 13: Research Data Management Storage Requirements: University of Leeds

Research Scenarios

Raw Camera data

Characteristic | Implication for Storage
Cost to reproduce very high | Permanent long term storage
Shared access | Access control
Very large volume of data | Dedicated network storage
High speed access needed | Local copy may be required

Page 14: Research Data Management Storage Requirements: University of Leeds

Types of Data

[Diagram labels: Static, Changing; Cheap, Expensive; Live/Archive; Published/Repository]

Page 15: Research Data Management Storage Requirements: University of Leeds

Storage – focus on value axis

Scratch – cheap static or changing data

Backed-up – traditional fully managed storage

Repository – discipline repositories and growing institutional or regional repositories

Archive – ?
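One way to read the four storage types against the cheap/expensive and static/changing distinctions is as a simple decision rule. The sketch below is an illustration only: the slide leaves the role of the archive open, so mapping expensive static data to the archive and expensive changing data to backed-up storage is an assumption, and `Tier` and `suggest_tier` are hypothetical names.

```python
from enum import Enum


class Tier(Enum):
    SCRATCH = "scratch"
    BACKED_UP = "backed-up"
    REPOSITORY = "repository"
    ARCHIVE = "archive"


def suggest_tier(expensive: bool, changing: bool, published: bool = False) -> Tier:
    """Suggest a storage tier for a dataset.

    Assumed rules (not stated on the slides):
    - published data goes to a repository;
    - cheap data, static or changing, goes to scratch;
    - expensive data that is still changing stays on backed-up storage;
    - expensive static data goes to the archive.
    """
    if published:
        return Tier.REPOSITORY
    if not expensive:
        return Tier.SCRATCH
    if changing:
        return Tier.BACKED_UP
    return Tier.ARCHIVE


if __name__ == "__main__":
    # Raw high speed camera data: expensive to reproduce and static once captured.
    print(suggest_tier(expensive=True, changing=False).value)
```

Volume, shared access and performance are left out of this rule on purpose; the slides treat them as separate considerations.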

Page 16: Research Data Management Storage Requirements: University of Leeds

With an Archive for this Scenario

Store raw camera data in archive

May keep local copy on scratch disk for performance

Simplified backup

Capture metadata at time of data creation

Common scenario – estimate 80% of expensive Engineering data
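The archive scenario above can be sketched as a small deposit routine: copy the raw file to the archive, record metadata at that moment, and optionally keep a working copy on scratch. This is a hedged illustration using only the Python standard library. The directory layout, the `.metadata.json` sidecar file and the function names are assumptions, and it does not represent the interface of Arkivum or any other archiving service.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Checksum a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def deposit(raw_file: Path, archive_dir: Path,
            scratch_dir: Optional[Path] = None, description: str = "") -> dict:
    """Copy raw_file into archive_dir with a metadata sidecar (hypothetical layout)."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    metadata = {
        "filename": raw_file.name,
        "size_bytes": raw_file.stat().st_size,
        "sha256": sha256_of(raw_file),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "description": description,
    }
    archived = archive_dir / raw_file.name
    shutil.copy2(raw_file, archived)
    # Verify the archived copy before relying on it.
    if sha256_of(archived) != metadata["sha256"]:
        raise IOError(f"Checksum mismatch after copying {raw_file.name}")
    sidecar = archive_dir / (raw_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2))
    # Optional local copy on scratch disk for performance.
    if scratch_dir is not None:
        scratch_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(raw_file, scratch_dir / raw_file.name)
    return metadata
```

Because the archived copy is checksummed on deposit, the scratch copy can be treated as disposable, which is one reading of the "simplified backup" point.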

Page 17: Research Data Management Storage Requirements: University of Leeds

Components of research data management support services

• RDM Policy and Roadmap
• Business Plan and Sustainability
• Data Management Planning
• Managing Active Data
• Processes for selection and retention
• Deposit and Handover
• Data Repositories/Catalogues
• Guidance, Training and Support

Page 18: Research Data Management Storage Requirements: University of Leeds