a centre of expertise in data curation and preservation IMechE Workshop, London, 26 th September 2006 Looking to the longer term: some perspectives on data curation and preservation This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0 Funded by: Dr Liz Lyon, DCC Associate Director Outreach Director, UKOLN, University of Bath, UK
a centre of expertise in data curation and preservation
IMechE Workshop, London, 26th September 2006
Looking to the longer term: some perspectives on data curation
and preservation
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
Funded by:
Dr Liz Lyon,
DCC Associate Director (Outreach) and Director, UKOLN, University of Bath, UK
About UKOLN
• “a centre of expertise in digital information management”
• Funding: Joint Information Systems Committee (JISC) + Museums, Libraries & Archives Council (MLA)
• Portfolio of R&D projects: Delos, DRIVER, Grand Challenge
• 29+ staff based at the University of Bath
• Inform the library, information, education and cultural heritage communities
• Policy, advocacy at national level, build innovative Web-based systems & services, R&D, e-journal Ariadne, workshops and conferences.
• http://www.ukoln.ac.uk/
Acknowledgement: Alex Ball, Grand Challenge Project
UK Digital Curation Centre
• Digital Curation Centre
• Funded by JISC & EPSRC
• Development activities
• Research agenda
• Delivering services
• Outreach Programme
• http://www.dcc.ac.uk/
Overview
• Data curation and digital preservation issues
• Draw on research and scholarship perspectives
• Data / information flows and the “business process”
• UK Digital Curation Centre activities
“maintaining and adding value to a trusted body of digital information for current and future use”
Data-centric 2020 vision
Reference datasets as infrastructure?
(Very simple) Product Research Cycle & Data Curation
[Cycle diagram; stage labels:]
• Formulate ideas / hypothesis, test, experiment, observe, design: data creation, collection & capture
• Adding value: data linking, annotation, visualisation, simulation
• (New) knowledge extraction: data mining, modelling, analysis, synthesis
• Scholarly communications & business transactions: data disclosure, publication, citation, discovery, re-use
• Data management, storage & validation: description, deposit, self-archiving, preservation, certification
• Data processing
Also shown: e-Infrastructure; Open ?? access; Collaboration
DAME signal processing workflows using Grid Services
[Workflow diagram; labels listed by role in the order they appear. Locations shown: Rolls Royce, DS&S, Airport]
• Maintenance Engineer: Aircraft Lands; Visual Inspection; Provide Information; Quote Diagnosis; Brief Diagnosis / Prognosis; Check Diagnoses; Maintenance Procedure; Diagnosis Result; Release Engine; complete; Maintenance Result
• Maintenance Analyst (Fleet Manager): Detailed Diagnosis / Prognosis; Provide Further Details; Request Information; Sign-off Diagnosis; Analyst Decision [information required] [diagnosis]
• Domain Expert: Detailed Analysis; Request Further Details; Expert Decision [known] [unknown] [Clear] [information required] [diagnosis]
• Outcomes: [fault unresolved] [fault resolved]
• RepoMMan: Repository Metadata and Management (Hull) using WS-BPEL
• Are your engineering workflows identified and described?
Workflow: e-Scientist desktop?
Slide: Carole Goble
Research outputs in institutional repositories: engineering
“JISC Vision”: a global landscape of federated repositories
• eBank Application Profile: crystallography data http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
• What data models and metadata schema are in place?
Persistent identifiers for data citation
• How will they be used? We need use cases: depositor, author, service provider, researcher, publisher?
• Schemes: DOI, Handle, ARK, PURL
• Global identification: express as http URIs
• Data citation (human and machine-actionable)
• Publication & citation of scientific primary data project, National Library for Science & Technology (TIB), University of Hanover, Germany. STD-DOI Project: DOI registry for datasets http://www.std-doi.de
• Is there a data citation policy?
• What persistent identifiers have been assigned to your data?
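A minimal sketch of "express as http URIs" and of a citation that is both human-readable and machine-actionable. The resolver base URLs, the function names and the sample identifier are assumptions for illustration; the slides name the schemes only, not how they should be resolved.

```python
from urllib.parse import quote

# Resolver base URLs are assumptions chosen for illustration; the slide names
# the identifier schemes but does not say which resolvers to use.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/ark:/",
}


def to_http_uri(scheme: str, identifier: str) -> str:
    """Express a persistent identifier as an HTTP URI."""
    key = scheme.strip().lower()
    if key == "purl":
        # A PURL is already an HTTP URI, so it is returned unchanged.
        return identifier
    if key not in RESOLVERS:
        raise ValueError(f"unsupported identifier scheme: {scheme!r}")
    # Percent-encode unusual characters but keep the '/' separators.
    return RESOLVERS[key] + quote(identifier, safe="/")


def cite(creator: str, year: int, title: str, scheme: str, identifier: str) -> str:
    """Build a human-readable citation that ends in a resolvable URI."""
    return f"{creator} ({year}). {title}. {to_http_uri(scheme, identifier)}"


if __name__ == "__main__":
    # "10.1234/example-dataset" is a made-up identifier, not a real dataset.
    print(cite("Example, A.", 2006, "Example dataset", "doi", "10.1234/example-dataset"))
```

Keeping the scheme-to-resolver mapping in one table means a change of resolver touches a single line, while the stored identifier itself stays scheme-neutral.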
• Domain identifier: International Chemical Identifier (INChI) code
• Google molecule using INChI
Slide from Simon Coles
Domain identifiers for engineering?
Format migration challenges? CAD Program Compatibility Chart http://www.okino.com/conv/filefrmt_cad.htm
Registry development
Development: Representation Information Registry Repository
• “DCC Approach to Digital Curation” based on OAIS
• Representation Information Registry Repository
• Prototype demonstrator: based on 2 key concepts to facilitate sharing of the curation effort
– Curation Persistent Identifier (CPID)
– Descriptive “label” (structural, semantic, other metadata)
• Development of (M2M) tools and interfaces for creating, using and re-using representation information
• http://dev.dcc.ac.uk Wiki and email list
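The two key concepts of the prototype (a CPID and a descriptive label) can be pictured as a keyed record. The sketch below is illustrative only: the class names, field layout and the CPID string format are hypothetical and are not taken from the DCC registry itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Label:
    """Descriptive label: structural, semantic and other metadata."""
    structural: str
    semantic: str
    other: Dict[str, str] = field(default_factory=dict)


class RepInfoRegistry:
    """Hypothetical in-memory registry keyed by CPID."""

    def __init__(self) -> None:
        self._records: Dict[str, Label] = {}

    def register(self, cpid: str, label: Label) -> None:
        # A persistent identifier must not be silently re-pointed.
        if cpid in self._records:
            raise KeyError(f"CPID already registered: {cpid}")
        self._records[cpid] = label

    def lookup(self, cpid: str) -> Optional[Label]:
        # Returns None when the CPID is unknown.
        return self._records.get(cpid)


# The CPID value below is a made-up placeholder, not a real identifier.
registry = RepInfoRegistry()
registry.register(
    "cpid:example-0001",
    Label(
        structural="comma-separated text, one record per line",
        semantic="columns: time (s), temperature (K)",
        other={"curator": "example"},
    ),
)
```

Sharing the curation effort then amounts to several archives referring to the same CPID rather than each describing the format again.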
• EU CASPAR Integrated Project http://www.casparpreserves.info/pages/1/index.htm
• Task Force on the Permanent Access to the Records of Science http://tfpa.kb.nl/
Registry API
• Allows applications to talk to many different registry implementations e.g. GDFR, PRONOM, UDDI
• GUI access and access via Web browser http://registry.dcc.ac.uk
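One way to read "talk to many different registry implementations" is a common interface with one adapter per registry. The sketch below assumes that reading; the interface, the stub adapter and the sample entries are hypothetical and do not call or reproduce the real GDFR, PRONOM or UDDI services.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class FormatRegistry(ABC):
    """Common interface an application codes against (hypothetical)."""

    name: str

    @abstractmethod
    def describe(self, format_key: str) -> Optional[str]:
        """Return a description of the format, or None if unknown."""


class StubRegistry(FormatRegistry):
    """Stand-in adapter backed by a dict instead of a remote service."""

    def __init__(self, name: str, entries: Dict[str, str]) -> None:
        self.name = name
        self._entries = dict(entries)

    def describe(self, format_key: str) -> Optional[str]:
        return self._entries.get(format_key)


def describe_format(format_key: str, registries: List[FormatRegistry]) -> Optional[str]:
    """Ask each registry in turn and return the first answer found."""
    for registry in registries:
        description = registry.describe(format_key)
        if description is not None:
            return f"{registry.name}: {description}"
    return None


# Placeholder data only; keys and descriptions are invented for the example.
registries = [
    StubRegistry("registry-a", {"example/1": "Example CAD exchange format"}),
    StubRegistry("registry-b", {"example/2": "Example image format"}),
]
```

The calling application only sees `describe_format`, so adding a further registry means writing one more adapter rather than changing the callers.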
Adding value through annotation: research at the University of Edinburgh
• Scientific databases: Annotation scoping report
• New annotation model + prototype MONDRIAN
• Intuitive visual interface iMONDRIAN
• Annotate sets of values
• Support for querying annotations
Nature, 23 March 2006: OTMI (Open Text Mining Interface)
• Briefing Papers
– Curating emails
– Digital repositories
– Geospatial data
– Data protection
– eScience data
• Case studies
DCC Case Study published: Wide Field Astronomy Unit
Supporting the community: Outreach & Services
• Workshops:
– Geospatial data, NeSC, 27 October
– OAIS 5 year Review, October
– Audit & Certification Forum, October
– Records Management, Liverpool, 30 Nov
– Curation & Preservation Training, Dec
– 2007: Preservation of journals (tbc)
– 2007: Legal environment (tbc)
– 2007: Preparing for audit (tbc)
• Information Days: British Library, Liverpool, UCL
• 2nd International DCC Conference 21-22 November, Glasgow
• Keynotes: Hans F. Hoffmann, CERN, Clifford Lynch, CNI
DCC Phase 2: 2007-2010
• Working more closely with data centres, e-Science Programmes and Research Councils
• SCARP Project: disciplinary approach
• JISC Digital Repository Programme collaboration
• RepInfo Registry service migration
• Define self-assessment procedures and tools
• Collaborate with CASPAR, DPE and PLANETS (EU-funded Digital Preservation Projects)
• Workshop Programme, International Conference 2007
University of Bath, 13 September 2006