Federation eCrystals Federation: Open Repositories for Open Science Dr Liz Lyon, UKOLN, University of Bath, UK Dr Simon Coles, University of Southampton, UK Dr Manjula Patel, UKOLN, University of Bath, UK CNI Taskforce Meeting, Washington DC, December 2007 This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0 http://creativecommons.org/licenses/by-sa/3.0/
eCrystals Federation: Open Repositories for Open Science
Dr Liz Lyon, UKOLN, University of Bath, UK
Dr Simon Coles, University of Southampton, UK
Dr Manjula Patel, UKOLN, University of Bath, UK
CNI Taskforce Meeting, Washington DC, December 2007
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0
http://creativecommons.org/licenses/by-sa/3.0/
Overview
1. Chemistry and Open Science: context and practice
2. Lessons learnt from eBank Phase 3
3. Data curation and preservation issues
4. Setting up the Federation: Challenges ahead?
Chemistry and Open Science: context and practice
Social networks for chemists…
New postgraduate cohorts: millennials / Google generation: new behaviours
Community content for chemists : rich media
video + paper = Pubcast
>8000 views
At the coalface: tagging & sharing workflows Astronomy, Bioinformatics, Chemistry, Social Science pilots.
Universities of Manchester & Southampton
“Small science” : sharing in the lab
Open Wetware Laboratory wikis
Transforming practice?
Open Notebook Science (ONS)
2006
26 September: 1st use of term blogged by Jean-Claude Bradley, Drexel University
2007
27 March: ONS at American Chemical Society Symposium
7 August: ONS Poster in Second Life on Nature island
24 September: ONS Case Studies in Second Life
4 October: > 43,000 hits in Google for term ONS
10 & 15 October: Policy lists, DabbleDB membership database created, US
11 October: ONS experiment starts in Cambridge, UK
10 November: Open Data for common molecules - Wikichemicals? Peter Murray-Rust’s blog at Univ. Cambridge, UK
27 November: Research Network proposal submitted to UK research council
Yesterday: about 2,400,000 Google hits for Open Notebook Science
New ideas are surfacing very fast with instant development, testing and take-up…
eBank Project – building the eCrystals Data Repository
Institutional Repository exemplar
http://ecrystals.chem.soton.ac.uk
Metadata Publication
• Using simple Dublin Core
– Crystal structure
– Title (Systematic IUPAC Name)
– Authors
– Affiliation
– Creation Date
• Additional chemical information through Qualified Dublin Core
– Empirical formula
– International Chemical Identifier (InChI)
– Compound Class & Keywords
• Specifies which ‘datasets’ are present in an entry
• DOI http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
• Rights & Citation http://ecrystals.chem.soton.ac.uk/rights.html
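To make the metadata list above concrete, here is a minimal sketch of how a simple Dublin Core record for one entry might be serialised in the oai_dc format. The mapping of slide fields to Dublin Core elements, the helper name `build_dc_record`, and the title, author and date values are assumptions for illustration only; they are not taken from the eCrystals application profile. Affiliation and the Qualified Dublin Core chemical fields are left out because the slides do not give their element names.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def build_dc_record(fields):
    """Serialise (element, value) pairs as an oai_dc XML string.

    Hypothetical helper, not part of any eCrystals software.
    """
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values; only the DOI and rights URLs come from the slides.
record_fields = [
    ("type", "Crystal structure"),
    ("title", "PLACEHOLDER systematic IUPAC name"),
    ("creator", "Placeholder, A."),
    ("creator", "Placeholder, B."),
    ("date", "2007-01-01"),
    ("identifier", "http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145"),
    ("rights", "http://ecrystals.chem.soton.ac.uk/rights.html"),
]

record_xml = build_dc_record(record_fields)
print(record_xml)
```

Dublin Core elements are repeatable, which is why the record is built from a list of pairs rather than a dictionary: two authors simply become two `dc:creator` elements.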
• Maintenance and open access of critical file formats and software
– Work-up software e.g. XPREP
– Export raw data from instrumentation as imgCIF
• Consider Representation Information (RI) in context of whole crystallography landscape (CCDC, IUCr etc.)
• Develop a preservation and curation strategy and formal policies to indicate levels of service
– Deposit, ingest, validation, dissemination
• Consider services to be developed over the DCC Registry/Repository of Representation Information (RRoRI)
Observations & Recommendations 3
• Develop preservation strategy & plan for the specific content
• Capture preservation metadata, including versioning and provenance information
• PREMIS Data Dictionary
– Semantic Units (e.g. file format, significant properties, provenance, fixity info)
– Extend eBank metadata application profile (AP)?
• Obtain consensus on AP
• Seek to automate metadata generation, extraction, maintenance
• ePrints.org support for information packages
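As an illustration of the recommendation to automate metadata generation, the following sketch derives fixity, size and format information for a single file. The dictionary keys are loosely modelled on PREMIS semantic unit names; the function name `preservation_metadata`, the choice of SHA-256, and the demo file are assumptions, and this is not the eBank application profile or a complete PREMIS record.

```python
import hashlib
import os
import tempfile


def preservation_metadata(path, format_name):
    """Return basic preservation metadata for one file.

    Hypothetical helper; keys loosely follow PREMIS semantic units.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large raw data files are not loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "objectIdentifier": os.path.basename(path),
        "fixity": {
            "messageDigestAlgorithm": "SHA-256",
            "messageDigest": digest.hexdigest(),
        },
        "size": os.path.getsize(path),
        "format": {"formatName": format_name},
    }


# Demo on a throwaway file standing in for a deposited dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "structure.cif")
    with open(demo_path, "wb") as handle:
        handle.write(b"data_example\n")
    demo_record = preservation_metadata(demo_path, "CIF")

print(demo_record)
```

Recomputing the digest later and comparing it with the stored value is the usual way to check that a file has not changed since ingest.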
Setting up the Federation: Challenges ahead?
[Diagram. Roles: Scientist, Funder, User. Activities: Create, Deposit, Link, Curate, Preserve, Standards, Collaborate, Share, Discover, Re-use]
eCrystals Federation Data Deposit Model
[Diagram labels: Scientist; Link; Policy, Advocacy, Training; Harvest; IR Federation; Publishers; Data centres / aggregator services; Advisory]
Repository deployment & support
• Roll-out in 2 phases
– Universities Sydney, Glasgow, Newcastle with
workflows: avoiding fragmentation of data, results and interpretations
• Account for differing laboratory practice
RAW DATA DERIVED DATA RESULTS DATA
Repository interoperability & linking services
• Establish core Federation application profile and mappings
• Bi-directional links with derived articles in “publisher repositories”, IUCr, Royal Society of Chemistry (RSC), Chemistry Central
• Test linking options: StORe middleware and CLADDIER (JISC-funded projects)
• OAI-ORE Pathways Project developments
Interoperability testbed
• Experimental data sets + metadata as compound objects
• Dublin Core and METS not sufficient
• OAI-ORE (base: Atom Publishing Protocol) testbed
• Enable 3rd party services e.g. data / text mining
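The idea of a compound object can be sketched as a resource map that describes an aggregation of data files plus metadata. The example below uses OAI-ORE vocabulary terms (`ResourceMap`, `Aggregation`, `describes`, `aggregates`) as keys in a plain Python dictionary; it is not an official ORE serialisation such as Atom or RDF/XML, and the function name `build_resource_map`, the example.org URIs and the file names are all hypothetical.

```python
import json

ORE = "http://www.openarchives.org/ore/terms/"


def build_resource_map(rem_uri, aggregation_uri, resource_uris):
    """Describe one aggregation and the resources it groups together.

    Hypothetical helper; the nested-dict layout is an assumption.
    """
    return {
        "@id": rem_uri,
        "@type": ORE + "ResourceMap",
        ORE + "describes": {
            "@id": aggregation_uri,
            "@type": ORE + "Aggregation",
            ORE + "aggregates": [{"@id": uri} for uri in resource_uris],
        },
    }


# Hypothetical entry: raw data, derived structure and metadata record.
base = "http://example.org/entry/1"
resource_map = build_resource_map(
    rem_uri=base + "/rem",
    aggregation_uri=base + "/aggregation",
    resource_uris=[
        base + "/raw.imgcif",
        base + "/structure.cif",
        base + "/metadata.xml",
    ],
)

print(json.dumps(resource_map, indent=2))
```

Giving the aggregation its own URI, separate from the resource map that describes it, lets a third-party service refer to the whole experiment as one object while still being able to fetch each file individually.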
eChemistry project
Enabling data discovery
• Royal Society of Chemistry Project Prospect tagging & semantic linking