Page 1
Institutional Data Management Blueprint
Kenji Takeda (Engineering Sciences), Mark Brown (University Librarian), Simon Coles (Chemistry), Les Carr (ECS, EPrints), Jeremy Frey (Chemistry), Graeme Earl (Archaeology), Peter Hancock (iSolutions), Wendy White (Library)
Page 2
Introduction
• Why data management?
• IDMB project
• Key findings
• Recommendations
• Business plan
• Conclusions
www.southamptondata.org
Page 3
Data Management @ Southampton
• What do we mean?
– Everything
• Why do we care?
– Foundation for all of our research
• How should it be managed?
– We want to find out from users
• How can the University help?
– What do researchers need?
• Key outcomes
– Impact & profile
Copyright © 2010 Sean Dreilinger. Reproduced under Creative Commons license.
Page 4
IDMB Project Overview
• Produce a framework for managing research data for an HEI
• Scope and evaluate a pilot implementation plan for an institution-wide data model
Policy Best Practice
Pilot Projects
Training & Workshops
Page 5
Review of Data Management
Page 6
Where do you store your data?
Page 7
How much electronic data do you currently retain?
Page 8
How long do you keep your data for?
Page 9
How frequently do you back up your data?
Page 10
Where do you back up your data?
Page 11
Key Findings
• Schools' research practice is embedded and unified
• Schools' data management capabilities vary widely
• Data management is carried out on an ad-hoc basis in many cases
• Researchers' demand for storage is significant
• Researchers resort to their own best efforts in many cases where central support does not meet their needs
• Users want more support for backup, particularly for large quantities of data
Page 12
Key Findings
• Researchers want to keep their data for a long time
• There is a need from researchers to share data, both locally and globally
• Data curation and preservation support needs to be improved
Page 13
Gap Analysis
• Policy and governance are robust, but are not communicated to researchers in the most accessible way
• Services and infrastructure are in place, but lack capacity and coherence
• There is a lack of training and guidance on data management
• Lack of a coherent and sustainable business model
Page 15
Recommendations
• Short-term (1 year)
– Develop an institutional data repository
– Develop a scalable business model
– One-stop shop for data management advice and guidance
• Medium-term (1-3 years)
– Comprehensive and affordable backup service for all
– Open research data mandate, and supporting infrastructure
– Research data lifecycle management
– Embedding data management training and support
Page 16
Long-term recommendations
• Provide coherent data management support across all disciplines
• Embed exemplary data management practice across the institution
• Agile business plan for continual improvement
Page 18
Archaeology Data Management
• Archaeology is all about data and metadata
• Spectrum of data is huge
– Laser scans
– Photography
– Geophysics
– CAD
– CGI
• Context is everything
http://www.portusproject.org/
Page 19
SharePoint 2010 Data Management
Page 20
Data Browsing in Context
Page 21
Business planning
Page 22
Business Plan
• Strategy
• Principles
• Policy
• Infrastructure and services
• Business model
• Partnership approach between all stakeholders
– Senior management, Researchers, IT, Library, Research & Innovation Services, Finance, Legal
Page 23
Institutional Data Management Policy
• Help researchers
– Provide guidance on what is expected
– Provide guidance on how to manage their data
• Help the institution
– Define what is required
• Comply with funders
• Provide governance and decision-making process
Page 24
Cost Modelling
• Data management is expensive
• Who is responsible?
• Who pays for it?
• How does it scale?
• What if somebody cannot afford it?
• Not just about hardware
• Sustainability
Researcher
Page 25
Cloud Possibilities
Page 26
Cloud Solutions
• Data storage demand growing at a frightening rate
– Users can accommodate this locally very cheaply
– Difficult to satisfy with current server-based storage
• Potential to provide:
– Zero capital cost
– Burst capability
– Scalability
Page 27
Cloud Benefits
Diagram courtesy of Dr Steven Johnston
Page 28
Cloud Issues
• Cost models
– Dropbox £8k per TB
– Data transfer charges
• Security
• Reliability
– Amazon!
• Legal implications
• Vendor lock-in
Page 29
Conclusions
• Good data management is vital for better research
• Two-pronged approach
– Bottom-up to augment researcher’s world
– Top-down to provide support and guidance
• Providing a roadmap for the future – including Cloud
Researcher
Faculty
University
• www.southamptondata.org
• [email protected]