I2S2 - Infrastructure for Integration in Structural Sciences. Research Data Management Infrastructure Programme Launch Meeting, 27th November 2009, London. Project Presenter: Simon Coles. http://www.ukoln.ac.uk/projects/I2S2/

Mar 28, 2015


Lily Glass
Page 1

I2S2 - Infrastructure for Integration in Structural Sciences

Research Data Management Infrastructure Programme Launch Meeting, 27th November 2009, London

Project Presenter: Simon Coles
http://www.ukoln.ac.uk/projects/I2S2/

Page 2

I2S2: Project Overview

• Identify requirements for a data-driven research infrastructure in the Structural Sciences

• Understand localised data management practices

• Understand large centralised facilities data management infrastructure

[Diagram: project scope along three dimensions.
Data Lifecycle: Conceive, Research, Propose, Analyse, Publish, Curate, Reuse.
Discipline: Chemistry, Physics, Materials, Earth Sciences, Life Sciences, Medical, Engineering, Technology.
Organisational Scale: Lone Researcher, Research Group, Department, Mid-range Service, Central Facility.]

Increased efficiency through effective cross-institutional data management

Page 3

I2S2: Institutional Context

• University of Cambridge
– “lone” researcher scenario
– data sharing with colleagues via email
– little or no infrastructure
– data received from ISIS is currently stored on laptop or WebDAV server
– management of intermediate and derived data (intra- and inter-institution) a major issue

• EPSRC National Crystallography Service
– service provision function
– operates across institutions
– moderate infrastructure
– raw data generated in-house is stored at ATLAS
– local / institutional repository for intermediate and derived data

• Diamond & ISIS Central Facilities
– operate on behalf of multiple institutions (community)
– established, ‘formulaic’ & bespoke processes for experiments
– large infrastructure engineered to manage raw data
– derived data taken off site on laptops / removable drives
– results data independently worked up and published

Bridging the gap between raw and derived data

Page 4

Methodology: Mapping across organisational infrastructures

Proposals

Once awarded beamtime at ISIS, an entry will be created in ICAT that describes your proposed experiment.

Experiment

Data collected from your experiment will be indexed by ICAT (with additional experimental conditions) and made available to your experimental team

Analysed Data

You will have the capability to upload any desired analysed data and associate it with your experiments.

Publication

Using ICAT you will also be able to associate publications with your experiment and even reference data from your publications.

[Diagram: example ISIS proposal, mapped between Home Institution and Central Facility. Labels: B-lactoglobulin protein interfacial structure; GEM, a high intensity, high resolution neutron diffractometer; H2-(zeolite) vibrational frequencies vs polarising potential of cations.]

Page 5

Methodology: An established starting platform

http://code.google.com/p/icatproject/

[Diagram: entities of the core metadata model. Investigation, Publication, Keyword, Topic, Sample, Sample Parameter, Dataset, Dataset Parameter, Datafile, Datafile Parameter, Investigator, Related Datafile, Parameter, Authorisation.]

Core Metadata model forms the information model for ICAT.

Designed to describe facility-based experiments in Structural Science.

Forms the basis for extension:

- To laboratory-based science
- To secondary analysis data
- To preservation information
- To publication data

Covering the scientist’s research lifecycle as well as the facilities.

Basis of I2S2 integrated information model
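To make the shape of such an information model concrete, here is a minimal Python sketch of a few of the entities named on this slide (Investigation, Sample, Dataset, Datafile, Parameter, Publication). It is an illustration only, not the ICAT schema or API: every field name, the `stage` label and all example values are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    """A named condition or measurement attached to a sample, dataset or datafile."""
    name: str
    value: str
    units: str = ""


@dataclass
class Datafile:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    stage: str  # hypothetical label: "raw", "intermediate" or "derived"
    datafiles: List[Datafile] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Sample:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Publication:
    reference: str


@dataclass
class Investigation:
    """Top-level record that everything else hangs off."""
    title: str
    investigators: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)
    publications: List[Publication] = field(default_factory=list)

    def datasets_at_stage(self, stage: str) -> List[Dataset]:
        """Return the datasets recorded at one lifecycle stage."""
        return [d for d in self.datasets if d.stage == stage]


def build_example() -> Investigation:
    """Build one investigation with placeholder (invented) values."""
    inv = Investigation(
        title="Example proposal",
        investigators=["Researcher A"],
        keywords=["placeholder keyword"],
    )
    inv.samples.append(Sample("sample-1"))
    # Raw data collected at the facility, with one experimental condition.
    raw_file = Datafile("run-0001.raw", [Parameter("temperature", "300", "K")])
    inv.datasets.append(Dataset("raw-runs", "raw", [raw_file]))
    # Derived data worked up at the home institution and uploaded later.
    inv.datasets.append(
        Dataset("refined-structure", "derived", [Datafile("structure.cif")])
    )
    # A publication linked back to the investigation that produced the data.
    inv.publications.append(Publication("placeholder citation"))
    return inv
```

Keeping raw and derived datasets under the same investigation is the point of the sketch: a publication can then be traced back through derived data to the raw runs, wherever each was produced.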

Page 6

I2S2: Research Challenges

• Research teams capture, manage, discuss and disseminate their data in relative isolation with highly fragmented data infrastructures and poorly integrated software applications

• Conventional systems of publication lead to insufficient information on provenance of results and irreproducible experiments

• The processes for recognition lead to a lack of inclination and incentive to share or make all the supporting information for a study publicly available

• A low awareness of data curation and preservation issues leads to data loss and reduced productivity

• Scale and complexity: from small lab equipment through institutional installations to large scale facilities such as the DLS and ISIS at STFC

• Inter-disciplinary: research across domain boundaries

• Data lifecycle: time-factored data flows and data transformations

Page 7

I2S2: Objectives & Deliverables

Objectives:

Broadly: development of pilot data management infrastructure solutions which bridge discipline, laboratory and institutional boundaries

Specifically, development of data practices across the research data lifecycle:
– A framework for data management, deployable across Structural Sciences
– Explore a range of data acquisition techniques at different scales (complexity, volume, definition)
– Advocate recognition in the community for sharing data to encourage reuse
– Facilitate access to data underpinning publications with higher levels of verification, resulting in higher quality research
– Support long-term preservation assuring future discovery of results

Deliverables (in addition to project reporting):

D1.1 Requirements Report
D1.2 Two Use Cases
D2.1 Extended Cost Model based on KRDS2
D2.2 Cost Analysis Phase 1
D3.1 Integrated Information Model
D3.2 Pilot Implementation Plan
D3.3a Pilot One: Scale and Complexity (Chemistry)
D3.3b Pilot Two: Inter-disciplinary (Earth Science)
D4.1 Cost Analysis Phase 2
D4.2 Benefits report and business model
D5.1 Advocacy and Training materials
D5.2 Two Workshops (disciplinary; RDMF)
D6.3 Evaluation report
D6.4 Sustainability recommendations

Page 8

I2S2: Measuring success & Future

• Implementation of key aspects of the project to bridge the chasm between large scale facilities and the lab bench
– Integrated Information Model based on the Core Scientific Metadata Schema and the DCC Digital Curation Lifecycle Model
– Development of use cases and inter-disciplinary Pilots
– Before and after cost-benefit analysis
– Advocacy and training materials

In 18 months' time …

• Realised the potential to streamline the working processes of structural scientists across domains and organisations

• Matched funding to implement operational infrastructure for the National Service?