UKOLN is supported by:
Developing e-Infrastructure to support new research and learning paradigms.
Dr Liz Lyon, Director, UKOLN, University of Bath, UK
Building the Info Grid, Copenhagen, September 2005.
www.bath.ac.uk
a centre of expertise in digital information management
www.ukoln.ac.uk
DEFF Seminar, Copenhagen, September 2005
Overview
1. e-Research: a changing landscape
2. Developing infrastructure: repository services & adding value
• Aggregation and linking: eBank UK
• Integration and workflows
3. Looking to the longer term: digital curation and preservation
1. e-Research: a changing landscape
Data Overload!
How do we disseminate?
EPSRC National Crystallography Service
eScience - the data deluge
Diversity of data collections
• Very large, relatively homogeneous: Large Hadron Collider (LHC) outputs from CERN
• Smaller, heterogeneous and richer collections: World Data Centre for [...] at the University of Bath
• Population survey data: UK Biobank
• Highly sensitive, personal data: patient care records
Taxonomy of data collections
• Research collections: jumping robots
• Community collections: FlyBase at Indiana (with UC Berkeley)
• Reference collections: Protein Data Bank
Evolution...
Source: NSF Long-Lived Digital Data Collections, draft report revised May 2005
Experience of data-sharing
• Large scale data sharing in the life sciences
– Draft Report, June 2005
– Sponsored by UK research funding bodies MRC, BBSRC, NERC, JISC, Wellcome
• Outcomes & recommendations
– Importance of standards and good quality metadata
– Require a data management plan
– Work needed on vocabularies & ontologies
– Awareness of archiving & long term preservation
2. Developing infrastructure: repository services & adding value
Developing models
• The e-Framework for Education & Research
• JISC, UK and Department of Education, Science & Training, Australia
• www.e-framework.org
“The primary goal of the initiative is to produce an evolving and sustainable, open standards based service oriented technical framework to support the education and research communities.”
• UKOLN: Michael Day, Monica Duke, Rachel Heery, Traugott Koch, Liz Lyon, + Andy Powell
• Southampton: Les Carr, Simon Coles, Jeremy Frey, Chris Gutteridge, Mike Hursthouse, Andrew Milstead
• Manchester: John Blunden-Ellis
Data Flow in eBank UK
[Diagram. Institutional repository (eCrystals): Create, Deposit and Submit via the deposition interface; Store/link data files and metadata; Present (HTML); local archive search interface. OAI-PMH: Harvest (XML). eBank aggregator service: Index and Search; Present (HTML). Service provider interfaces, e.g. subject portal.]
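The Harvest (XML) step over OAI-PMH in the data flow above can be sketched roughly as follows. This is a minimal illustration, not the eBank implementation: the base URL, the sample response and the function names are all assumptions made for the example, and only the standard `ListRecords` verb and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the real repository base URL is not given in the slides.
BASE_URL = "http://repository.example.org/oai"

# Standard XML namespaces used by OAI-PMH and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Made-up response fragment standing in for what a repository might return.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure</dc:title>
          <dc:subject>crystallography</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Extract identifier, title and subject keywords from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind("oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "subjects": [s.text for s in rec.iterfind(".//dc:subject", NS)],
        })
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(parse_records(SAMPLE_RESPONSE))
```

An aggregator would fetch the URL, parse each response in this way and index the extracted fields for search. Resumption tokens and error handling are left out of the sketch.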
CombeChem: An EPSRC pilot project
[Diagram: X-Ray e-Lab (diffractometer), Properties e-Lab, analysis, properties, simulation, video and a structures database, connected by Grid middleware]
Crystallography workflow
RAW DATA → DERIVED DATA → RESULTS DATA
• Initialisation: mount new sample, set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File)
• Validation: chemical & crystallographic checks
• Report: generate Crystal Structure Report
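The workflow stages listed above could be modelled as an ordered sequence, for instance when a repository needs to record how far a sample has progressed. The sketch below is illustrative only: the `Stage` and `next_stage` names are hypothetical, and it assumes a strictly linear order, whereas in practice steps such as refinement may be repeated.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Workflow stages in the order given on the slide."""
    INITIALISATION = 1
    COLLECTION = 2
    PROCESSING = 3
    SOLUTION = 4
    REFINEMENT = 5
    CIF = 6
    VALIDATION = 7
    REPORT = 8


def next_stage(completed):
    """Return the next stage to run, or None when all stages are complete.

    `completed` is a set of Stage members. A ValueError is raised if the
    completed stages skip a step (assumption: the workflow is strictly linear).
    """
    done = sorted(completed)
    expected = list(Stage)[:len(done)]
    if done != expected:
        raise ValueError("stages must be completed in order, without gaps")
    if len(done) == len(Stage):
        return None
    return Stage(len(done) + 1)


if __name__ == "__main__":
    print(next_stage({Stage.INITIALISATION, Stage.COLLECTION}).name)
```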
A data repository entry
Access to the underlying data: complex objects
ecrystals.chem.soton.ac.uk
Harvesting: OAIster
Aggregating: search & discover
Linking data to publications
eBank embedded in a science portal
Ontologies for discovery in an interdisciplinary world
• Transform the ‘list’ into an ‘ontology’
• Embed ontology into the deposition process
• Publish keywords in OAI
• Aggregators use keywords for linking with the broader literature
• Researchers use keyword ontology in search and discovery services
Persistent identifiers for data citation
• eBank use cases: depositor, author, service provider, reader, publisher, ?
• Schemes: DOI, Handle, ARK, PURL
• Global identification: express as http URIs
• Added value services: CrossRef, resolution service, integration (Globus), look-up service, ?
• Degree of trust or persistence
• Costs
• Future potential: political, ?
• Domain identifiers: International Chemical Identifier (InChI) codes
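As one way to picture "express as http URIs", the sketch below maps a prefixed identifier onto a resolver URL. It covers only DOI and Handle, using the public doi.org and hdl.handle.net proxies. The `to_http_uri` function and the `hdl:` example value are hypothetical. The DOI is the dataset DOI from the STD-DOI exemplar citation in this talk.

```python
from urllib.parse import quote

# Public resolver proxies; other schemes (ARK, PURL) are deliberately left out.
RESOLVERS = {
    "doi": "https://doi.org/",
    "hdl": "https://hdl.handle.net/",
}


def to_http_uri(identifier):
    """Turn 'doi:10.x/y' or 'hdl:prefix/suffix' into a resolvable http(s) URI."""
    scheme, sep, value = identifier.partition(":")
    scheme = scheme.lower()
    if not sep or scheme not in RESOLVERS:
        raise ValueError(f"unsupported identifier: {identifier!r}")
    # Percent-encode unsafe characters but keep '/' as the prefix/suffix separator.
    return RESOLVERS[scheme] + quote(value, safe="/")


if __name__ == "__main__":
    print(to_http_uri("doi:10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p"))
    print(to_http_uri("hdl:1234/example"))  # made-up handle, for illustration only
```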
Publication & citation of scientific primary data project
• National Library for Science & Technology (TIB), University of Hanover, Germany
• STD-DOI Project http://www.std-doi.de
• DOI registry for datasets
• Data requirements: quality control, long-term curation, use DOI resolver
• Data publication agents: World Data Center Climate, GeoForschungsZentrum Potsdam
• Exemplar data citation:
– Kamm, H; Machon, L; Donner, S (2004): Gas chromatography (KTB Field Lab), GFZ Potsdam. doi:10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p
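The exemplar citation above suggests a simple pattern: authors, year, title, publishing agent, DOI. The sketch below rebuilds that string from structured fields. The `DatasetCitation` class and its field names are assumptions for illustration, and the layout is inferred from this single exemplar rather than from a published STD-DOI citation rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetCitation:
    """Hypothetical record holding the fields visible in the exemplar citation."""
    authors: List[str]
    year: int
    title: str
    publisher: str
    doi: str

    def format(self):
        """Render the citation in the layout of the exemplar on the slide."""
        author_list = "; ".join(self.authors)
        return f"{author_list} ({self.year}): {self.title}, {self.publisher}. doi:{self.doi}"


if __name__ == "__main__":
    citation = DatasetCitation(
        authors=["Kamm, H", "Machon, L", "Donner, S"],
        year=2004,
        title="Gas chromatography (KTB Field Lab)",
        publisher="GFZ Potsdam",
        doi="10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p",
    )
    print(citation.format())
```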
Integration into crystallographic publishing practices
Publishers' seal of approval
Integration into chemistry research workflows
• R4L (Repository for the Laboratory) Project, JISC-funded: automated data capture from instrumentation, registration of results