BD2K @ NIH - A Vision Through 2020

Apr 16, 2017

Philip Bourne
Transcript
Page 1: BD2K @ NIH - A Vision Through 2020

BD2K @ NIH – A Vision Through 2020

Philip E. Bourne, PhD, FACMI
Associate Director for Data Science

Page 2: BD2K @ NIH - A Vision Through 2020

First and foremost, you should see this meeting as a celebration of the hard work of the past two years

Yes, these are uncertain times, but …

There is a commitment to the BD2K program through 2020

Page 3: BD2K @ NIH - A Vision Through 2020

BD2K should be viewed not in isolation, but as part of a broader view of data science @ NIH …

Particularly as funding increasingly comes from the ICs

Page 4: BD2K @ NIH - A Vision Through 2020

A View Which Includes:

• A vibrant research program of:
  – Fundamental developments in data science
  – Application of those fundamental developments
  – Flagship projects to which developments are applied:
    • PMI, Brain, Moonshot, ECHO
• A sustainable data ecosystem
  – Commons and the FAIR Principles adoption
  – Cross-cutting activities
• Increased workforce training
• A changing governance model

Page 5: BD2K @ NIH - A Vision Through 2020

A Strategic Response can be Modeled on Three Axes:

Research

Resources

Outcomes

Page 6: BD2K @ NIH - A Vision Through 2020

A Strategic Response

Research

Resources

Outcomes

• Fundamental
  – Machine learning
  – Data mining
  – Indexing
  – Predictive modeling …
• Applied
  – Sustainability, governance, economics of data
  – Privacy and security
  – Effective use of clouds …

Page 7: BD2K @ NIH - A Vision Through 2020

A Strategic Response

Research

Resources

Outcomes

• Standards
• Commons
  – APIs
  – Reference data sets
  – Workflows
  – Access & Authentication
• Workforce

• Fundamental
  – Machine learning
  – Data mining
  – Indexing
  – Predictive modeling …
• Applied
  – Sustainability, governance, economics of data
  – Privacy and security
  – Effective use of clouds …

Page 8: BD2K @ NIH - A Vision Through 2020

A Strategic Response

Research

Resources

Outcomes

• Standards
• Commons
  – APIs
  – Reference data sets
  – Workflows
  – Access & Authentication
• Workforce

• Fundamental
  – Machine learning
  – Data mining
  – Indexing
  – Predictive modeling …
• Applied
  – Sustainability, governance, economics of data
  – Privacy and security
  – Effective use of clouds …

• Evaluated pilots
• FAIR data
• Trained workforce
• Best practices
• Policies
• Effective use of clouds
• On-ramps for all ICs

Page 9: BD2K @ NIH - A Vision Through 2020

A View Which Includes:

• A vibrant research program of:
  – Fundamental developments in data science
  – Application of those fundamental developments
  – Flagship projects to which developments are applied:
    • PMI, Brain, Moonshot, ECHO
• A sustainable data ecosystem
  – Commons and the FAIR Principles adoption
  – Cross-cutting activities
• Increased workforce training
• A changing governance model

Page 10: BD2K @ NIH - A Vision Through 2020

The Current Situation

• NIH Funded Data
  – Total data from NIH-funded research currently estimated at 650 PB*
  – 20 PB of that is in NCBI/NLM (3%) and it is expected to grow by 10 PB this year
• Dark Data
  – Only 12% of data described in published papers is in recognized archives; the remaining 88% is dark data^
• Cost
  – 2007-2014: NIH spent ~$1.2Bn extramurally on maintaining data archives

* In 2012 Library of Congress was 3 PB
^ http://www.ncbi.nlm.nih.gov/pubmed/26207759
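
The percentages on this slide follow directly from the quoted volumes. A minimal sketch of that arithmetic is below; the variable and function names are illustrative only and are not part of the original slides, and the input figures are simply the ones quoted above.

```python
# Figures as quoted on the slide (petabytes and fractions).
# All names here are illustrative assumptions, not from the original deck.
TOTAL_NIH_PB = 650      # estimated total data from NIH-funded research
NCBI_NLM_PB = 20        # portion held in NCBI/NLM
ARCHIVED_SHARE = 0.12   # share of published data in recognized archives


def share(part: float, whole: float) -> float:
    """Return part as a fraction of whole."""
    return part / whole


# Fraction of NIH-funded data that sits in NCBI/NLM (the slide rounds to 3%).
ncbi_share = share(NCBI_NLM_PB, TOTAL_NIH_PB)

# Whatever is not in a recognized archive is counted as "dark data".
dark_share = 1 - ARCHIVED_SHARE

print(f"NCBI/NLM share of NIH-funded data: {ncbi_share:.1%}")
print(f"Dark data share: {dark_share:.0%}")
```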

Page 11: BD2K @ NIH - A Vision Through 2020

The Commons - Status

• Commons and FAIR principles* adopted across NIH
• Development and public release of a prototype Data Discovery Index
  – DataMed
    • Feb. v 1.0
    • Nov. v 1.5
• Cloud credits being issued for work in the Commons
• FOAs for Commons Framework being issued
• Commons pilots under way

* https://www.ncbi.nlm.nih.gov/pubmed/26978244

Page 12: BD2K @ NIH - A Vision Through 2020

Sustainability – Sample Other Activities

• Request for Information: Metrics to Assess Value of Biomedical Digital Repositories (NOT-OD-16-133)
  – To be discussed at Sustainability Session, Wed 1pm
• RFA to support community-based standards work was released in the fall for May 2017 award, session today 1pm
• Funding opportunity announcement: (BD2K) Enhancing the Efficiency and Effectiveness of Digital Curation for Biomedical Big Data (RFA-LM-17-001)
  – Applications due Dec 15

Page 13: BD2K @ NIH - A Vision Through 2020

Sustainability – Looking Forward

• International collaboration on business models for sustainable data repositories
  – Sustainable Business Models for Data Repositories (OECD Global Science Forum)
  – Future of Life Sciences and Biomedical Databases (International Human Science Frontiers Program)
• NIH long-term data repository support
  – Federal interagency Workshop on Measuring the Impact of Data Repositories, 2017
  – Recommend mechanism(s), review criteria, implementation plan

Page 14: BD2K @ NIH - A Vision Through 2020

Example Cross-cutting Activities

• International partnerships
• Count everything – Secure count query framework
• California centers regional meetings
• GA4GH – Beacon project

Page 15: BD2K @ NIH - A Vision Through 2020

A View Which Includes:

• A vibrant research program of:
  – Fundamental developments in data science
  – Application of those fundamental developments
  – Flagship projects to which developments are applied:
    • PMI, Brain, Moonshot, ECHO
• A sustainable data ecosystem
  – Commons and the FAIR Principles adoption
  – Cross-cutting activities
• Increased workforce training
• A changing governance model

Page 16: BD2K @ NIH - A Vision Through 2020

NLM

• Working Group Report – http://acd.od.nih.gov/reports/Report-NLM-06112015-ACD.pdf
  – Recommendation – NLM should become the programmatic epicenter for data science at NIH …
• Patti Brennan – New NLM director

Page 17: BD2K @ NIH - A Vision Through 2020

What We Hope to See in 2020

• New innovations brought about by large and complex data
• Evidence of translation, i.e., real application at the point of care
• Broad Commons adoption leading to
  – Improved sharing, reuse and hence cost effectiveness and reproducibility
• A balance between what is spent on data vs. what is gained from that data
• Policies that are supportive of the above

Page 18: BD2K @ NIH - A Vision Through 2020

… for your hard work and to the NIH staff from the ADDS office and from across the ICs who have toiled to make BD2K a success