Dec 19, 2015
Research Data Management
• Rather neglected until recently
– Little recognition of importance
– Little training
“It feels as though it should be common sense, but I wonder how far common sense
gets you” – postdoctoral researcher in linguistics
“It’s easy to waste a lot of time, so it’s good to get things organized right from the
start” – postgraduate researcher in linguistics
• Now attracting investment and interest
– JISC Managing Research Data programme
– Increasing institutional recognition of importance of data assets
– Drivers include: funding councils; reputation management; FOI requests;
REF impact; publishers; researchers themselves
Research data infrastructure at Oxford
• Programme begun in 2008 with an internal scoping study
• Eidcsr (JISC funded, 2009-2010)
– Scoping and piloting institutional data repository (software, metadata, etc.)
– Development of data management policy
• Sudamih (JISC funded, 2010-2011)
– Researcher training (organising materials, software and tools to help, etc.)
– Pilot ‘Database as a Service’ (DaaS)
• VIDaaS (JISC & HEFCE funded, 2011-2012)
– Full production-level DaaS, ready to use, hosted on cloud infrastructure
A research data lifecycle
Training
• No real concept of ‘best practice’
“I’ve just sort of picked it up, and probably not learnt lots of little tricks that might save time. I’ve
picked things up by talking to other people … but on the whole the people I tend to talk to about
research often aren’t particularly technically minded.” – early-career researcher in Classics
• Common problems that can compromise re-usability include:
– Lack of documentation
– Inappropriate choice of technology
– Poor organisation / structuring
– Idiosyncratic data selection or manipulation
“People need to think about their data not just in terms of what they want to do now (which is often
just to make a list), but in terms of what they might want to do with it in the future.” – senior
lecturer in Music Faculty
How to share data
• Database practices vary greatly between researchers
• But the process often looks like this...
Looks like there is some interesting data behind this
But it doesn’t make much sense to me
How to share data
Nice data – I can use this!
Oh.
• Database practices vary greatly between researchers
• or this...
Problems identified
• Lack of technological awareness
• Poor backing-up practices
• Collaboration difficult
• Difficult to re-discover and re-use data
• Risk of technical obsolescence
• Servers and websites cost money
• Funding only lasts as long as the project, but servers and websites
require maintenance
• Technical expertise required
What is the DaaS?
• A web-based system that will enable researchers to quickly and intuitively
– build a relational database from scratch, or
– import an existing database in common formats (such as Access)
• Generic data addition, editing, and querying interfaces
– Research groups may, if desired, develop their own Web front-end interfaces
to databases hosted by DaaS
• Databases centrally hosted, maintained, and routinely backed up
• Access controls to determine who can view or edit each database
• Metadata capture to improve data rediscovery
• Economies of scale
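The per-database access controls listed above can be pictured with a small sketch. This is an illustration only, not the DaaS implementation: the `Role` levels, the `DatabaseACL` class, and the `public` flag are hypothetical names introduced for the example.

```python
from enum import IntEnum


class Role(IntEnum):
    """Hypothetical permission levels, ordered from least to most access."""
    NONE = 0
    VIEWER = 1
    EDITOR = 2


class DatabaseACL:
    """Access-control list for a single hosted database (illustrative only)."""

    def __init__(self, public=False):
        self.public = public   # assumption: a public database is viewable by anyone
        self._roles = {}       # user name -> Role

    def grant(self, user, role):
        self._roles[user] = role

    def role_of(self, user):
        return self._roles.get(user, Role.NONE)

    def can_view(self, user):
        return self.public or self.role_of(user) >= Role.VIEWER

    def can_edit(self, user):
        # Making a database public never grants edit rights.
        return self.role_of(user) >= Role.EDITOR


acl = DatabaseACL()
acl.grant("alice", Role.EDITOR)
acl.grant("bob", Role.VIEWER)
```

Here "alice" can view and edit, "bob" can only view, and an unknown user can do neither unless the database is marked public.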
Using the DaaS
I can access and cite good research data
I can find what data other people have
been gathering
We can quickly & easily add & edit data and open it to the public
DaaS Components
1. ‘Core’ DaaS database management system – database admin and
user interface for browsing and editing databases
2. Conversion utility - converts existing databases in Access or CSV
formats into PostgreSQL
3. Graphical SQL-designer utility - create or modify database structures
via a simple Web interface
4. Graphical form-builder utility - drag and drop buttons, text fields,
multiple choice menus, and other standard form components
5. Advanced SQL query-builder - construct sophisticated search queries
without needing to be an SQL expert [In development]
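One way a query-builder of the kind described in item 5 could work is to turn structured choices (table, columns, filters) into a parameterised SQL statement. The sketch below is an assumption about the general technique, not the DaaS code: `build_query`, `OPERATORS`, and the `work` table are hypothetical, and Python's built-in `sqlite3` stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

# Hypothetical operator names mapped to SQL comparison operators.
OPERATORS = {"eq": "=", "lt": "<", "gt": ">", "contains": "LIKE"}


def build_query(table, columns, filters, schema):
    """Return (sql, params) for a SELECT built from structured choices.

    schema maps each table name to the set of its column names. Every
    identifier is checked against it, and values are passed as parameters
    rather than pasted into the SQL text.
    """
    if table not in schema:
        raise ValueError(f"unknown table: {table}")
    for name in list(columns) + [f[0] for f in filters]:
        if name not in schema[table]:
            raise ValueError(f"unknown column: {name}")
    column_list = ", ".join('"' + c + '"' for c in columns)
    sql = "SELECT " + column_list + ' FROM "' + table + '"'
    clauses, params = [], []
    for column, op, value in filters:
        clauses.append('"' + column + '" ' + OPERATORS[op] + " ?")
        params.append("%" + str(value) + "%" if op == "contains" else value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


schema = {"work": {"title", "year"}}
sql, params = build_query(
    "work", ["title"], [("year", "gt", 1800), ("title", "contains", "Son")], schema
)

# Run the generated query against a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "work" (title TEXT, year INTEGER)')
conn.executemany(
    'INSERT INTO "work" VALUES (?, ?)',
    [("Sonata", 1801), ("Sonatina", 1790), ("Nocturne", 1830)],
)
titles = [row[0] for row in conn.execute(sql, params)]
```

Checking identifiers against a known schema and binding values as parameters keeps user input out of the SQL text itself, which matters when the person building the query is not an SQL expert.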
Core DaaS System
• Simple interfaces for
– Registering projects, users, and databases
– Editing and browsing data
PostgreSQL Converter
• Import and export databases in various common formats, including Access
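As a rough sketch of what importing a CSV file into a relational table involves, the example below infers a type for each column and loads the rows. It is not the DaaS converter: `infer_type` and `import_csv` are hypothetical helpers, the sample data is invented, and `sqlite3` is used as a stand-in for PostgreSQL so the code is self-contained.

```python
import csv
import io
import sqlite3


def infer_type(values):
    """Pick INTEGER, REAL, or TEXT for a column from its string values."""
    non_empty = [v for v in values if v != ""]
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for v in non_empty:
                cast(v)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def import_csv(conn, table, csv_text):
    """Create `table` from CSV text (first row = headers) and load the rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    types = [infer_type([r[i] for r in data]) for i in range(len(header))]
    columns = ", ".join('"' + name + '" ' + t for name, t in zip(header, types))
    conn.execute('CREATE TABLE "' + table + '" (' + columns + ")")
    placeholders = ", ".join("?" for _ in header)
    # Empty cells become NULL rather than empty strings.
    cleaned = [[v if v != "" else None for v in r] for r in data]
    conn.executemany(
        'INSERT INTO "' + table + '" VALUES (' + placeholders + ")", cleaned
    )
    return types


conn = sqlite3.connect(":memory:")
sample = "title,year,length_min\nSonata,1801,15.5\nNocturne,1830,\n"
types = import_csv(conn, "works", sample)
rows = conn.execute(
    "SELECT title, year, length_min FROM works ORDER BY year"
).fetchall()
```

A real converter would also need to handle header names that are not valid identifiers, dates, encodings, and Access-specific features such as relationships and forms, none of which this sketch attempts.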
SQL Designer
• All possible types of relationship between tables are available
• Not limited to use in the DaaS
• Will be able to import and re-structure existing databases
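For readers unfamiliar with table relationships, the sketch below shows the two most common kinds a graphical designer would generate: one-to-many (a foreign key column) and many-to-many (a junction table). The table and column names are invented for illustration and `sqlite3` again stands in for PostgreSQL; the DDL the SQL Designer actually produces may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces keys only when asked
conn.executescript("""
CREATE TABLE composer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One-to-many: each work belongs to exactly one composer.
CREATE TABLE work (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    composer_id INTEGER NOT NULL REFERENCES composer(id)
);
CREATE TABLE manuscript (
    id        INTEGER PRIMARY KEY,
    shelfmark TEXT NOT NULL
);
-- Many-to-many: a work may appear in several manuscripts and vice versa,
-- so the links live in a separate junction table.
CREATE TABLE work_manuscript (
    work_id       INTEGER NOT NULL REFERENCES work(id),
    manuscript_id INTEGER NOT NULL REFERENCES manuscript(id),
    PRIMARY KEY (work_id, manuscript_id)
);
""")

conn.execute("INSERT INTO composer (name) VALUES ('Anon')")
conn.execute("INSERT INTO work (title, composer_id) VALUES ('Mass', 1)")
conn.execute("INSERT INTO manuscript (shelfmark) VALUES ('MS 1')")
conn.execute("INSERT INTO work_manuscript VALUES (1, 1)")

# A work pointing at a composer that does not exist is rejected.
try:
    conn.execute("INSERT INTO work (title, composer_id) VALUES ('Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

work_count = conn.execute("SELECT COUNT(*) FROM work").fetchone()[0]
```

Declaring the relationships in the schema lets the database itself refuse inconsistent rows, which is one of the structuring problems the earlier slides identify in ad hoc spreadsheets and lists.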
Form Builder
• Enables the user to drag and drop buttons, text fields, multiple choice menus, and other standard form components via a straightforward Web interface
• Potentially has uses beyond the DaaS
Virtual Infrastructure with Database as a Service (VIDaaS)
• Follow-on project to Sudamih – just started!
• Extended functionality beyond the humanities
• Incorporation of data storage models other than relational
databases (e.g. document-based and native XML)
• Deployment as a cloud-based software service
– Identity and access management issues
– Involves advanced monitoring and management tools
– Capable of running on other institutions’ virtual infrastructures
• Improved user interface, documentation, and support
Join us!
• Now in process of testing and improving DaaS
• Test users get service for free (for one project & within reason!)
• System to go live to early adopters in November
• Full launch in January 2012
http://vidaas.oucs.ox.ac.uk/
[email protected]
http://sudamih.oucs.ox.ac.uk/