A centre of expertise in digital information management
www.ukoln.ac.uk
Acting as Advocate? Seven steps for libraries in the data decade
Dr Liz Lyon, Director, UKOLN, University of Bath, UK; Associate Director, UK Digital Curation Centre
IATUL Conference, Purdue University, June 2010
This work is licensed under a Creative Commons Licence: Attribution-ShareAlike 2.0
• CNI, Baltimore, April 2010
• http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html
Data scale
Human Genome printed: http://www.flickr.com/photos/johnjobby/2252981353/sizes/l/
“Data sets are becoming the new instruments of science”
$1000 genome in <15 minutes ....by 2013?
...data logistic challenges....
• Large-scale data storage that is:
  – Cost-effective (rent on-demand)
  – Secure (privacy and IPR)
  – Robust and resilient
  – Low entry barrier / ease-of-use
  – Has data-handling / transfer / analysis capability
• Move sequencing out of genome centres
• “....analyse an entire human genome in a single day sitting with a laptop at your local Starbucks.”
...cloud services
Clients in the cloud
Library Actions
1. Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
Workflows, Models, Tools
Sage Bionetworks genomics Workflow
[Figure: An Idealised Scientific Research Data Lifecycle Model]
• Activities shown: Research Concept and/or Experiment Design; Acquire Sample; Peer-review Proposal; Conduct Experiment; Generate, Create & Collect Raw Data; Process Raw Data into Derived Data; Analyse Derived Data; Interpret & Analyse Results Data; Publish Research; Archive, Preservation & Curation; IPR, Embargo & Access Control; Validate, Reuse & Repurpose Data
• Data shown: User registration data; Instrument allocation data etc.; Risk assessment data; other sample data; Raw, Correction & Calibration Data; Processed Data; Derived Data; Results Data
• Research Outputs: Papers, articles, presentations, reports
• Also shown: Reference Linking; Comments, annotations, ratings etc.
State-of-the-Art Report : Models & Tools (Alex Ball, June 2010)
• Data Lifecycles
• Data Policies (UK) incl DMP
• Standards & tools
• Data Asset Framework (DAF)
• DANS Seal of Approval
• Preservation metadata
• Archive management tools
• Cost / benefit tools
Library Actions
1. Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
2. Build usable Data Management Tools working in partnership with researchers
Data Sustainability….
Benefits Taxonomy: Summary
• Dimension 1: Direct / Indirect (costs avoided)
• Dimension 2: Near-term / Long-term
• Dimension 3: Private / Public
Keeping Research Data Safe2 Report: April 2010
Library Actions
1. Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
2. Build usable Data Management Tools working in partnership with researchers
3. Develop Data Sustainability Strategies and articulate the cost-benefits
Ethics, Privacy, Culture
“You have zero privacy anyway. Get over it”
Scott McNealy, CEO Sun Microsystems, 1999
Post-genome decade
Human genomes: >24 published & almost 200 unpublished
“P4 medicine : Predictive, Personalised, Preventive, Participatory.”
Leroy Hood – Institute for Systems Biology
Image from Scientific American
...“medicine is going to become an information science”...
P4 medicine
• Each patient’s genome sequenced
• Your genome is basis of your medical record
• New method to anonymise medical records for genomics research at Vanderbilt Univ (April ‘10)
• New predictive models of health and disease
• Personalised treatments focus on preventative therapies
Genome scale network biology
Genomic data as a commodity
They have shared their data….
Share my data?
“While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. ..... using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines.” INCREMENTAL Project
• Sage Bionetworks: Integrative genomics
• Open data in the Sage Commons repository
• Human and mouse: clinical and genetics data
• Develop predictive models of disease: liver / breast / colon cancer, diabetes, obesity
• Crowd-sourced effort: global scope
Stephen Friend
Participatory medicine: share data & empower the patient...
Sage Congress San Francisco April 2010
Library Actions
1. Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
2. Build usable Data Management Tools working in partnership with researchers
3. Develop Data Sustainability Strategies and articulate the cost-benefits
4. Publish Case Studies on Open Science to show benefits of universal data sharing
5. Present at University Ethics Committee to highlight open data issues for faculty
Professional Scientists | Enthusiastic amateurs
Training | Citizen scientist
Standards and ethics | Local: natural history, environ.
Peer-review | Global: astronomy
Organisational support | Self-supporting
Citizen Science : validated in the professional press
Working with science professionals
Library Actions
6. Raise awareness of Citizen Science