Big Data Analytics in Science and Research: New Drivers for Growth and Global Challenges Richard A. Johnson CEO, Global Helix LLC and BLS, National Academy of Sciences ICCP Foresight Forum Big Data Analytics and Policies 22 October 2012 [email protected]

May 22, 2020

Page 1: Big Data Analytics in Science and Research: New Drivers ... Johnson.pdf · The Fourth Paradigm, the Internet of Things, Automated Data Extraction Methods, and Big Data Analytics –

Big Data Analytics in Science and Research: New Drivers for Growth and Global Challenges

Richard A. Johnson CEO, Global Helix LLC and BLS, National Academy of Sciences

ICCP Foresight Forum – Big Data Analytics and Policies

22 October 2012

[email protected]

Page 2

Session 3: 4 Questions for Discussion

Q1 – Importance of data openness and interoperability for science and research, especially in biomedicine and health?

Q2 – Are current IPR regimes well matched to data-intensive scientific discovery?

Q3 – Do we still need scientific methods (and traditional domain scientists) in an era of big data analytics?

Q4 – How, and why, does this matter for policy?

Page 3

Convergence of Biology with Physical Sciences & Engineering through Data and Data Analytics = the “New Biology” or Third Revolution in the Life Sciences

Foundational trend in STI for next 20 years – NAS (2010); MIT (2011)

Page 4

Genomic Data is Increasing Faster than Computing Power –

Convergence of 3 key DATA DRIVERS with RESEARCH and ECONOMIC VALUE: (1) Sequencing + (2) Synthesis + (3) Reading AND Writing DNA

Data Tools in the Life Sciences: Moore’s Law on Steroids

Gene Expression Data Sets (Nature 2012)
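The claim that genomic data is growing faster than computing power is a statement about two exponential curves with different doubling times. The sketch below shows the arithmetic only; the function names and both doubling times are illustrative assumptions and are not figures taken from the slide or from the cited Nature data.

```python
def growth_factor(years: float, doubling_time_years: float) -> float:
    """Multiplicative growth after `years` for a quantity that doubles
    every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)


def gap_after(years: float, data_doubling: float, compute_doubling: float) -> float:
    """How many times faster the data volume has grown than compute capacity."""
    return growth_factor(years, data_doubling) / growth_factor(years, compute_doubling)


if __name__ == "__main__":
    # Both doubling times are illustrative placeholders, not measurements.
    DATA_DOUBLING_YEARS = 1.0
    COMPUTE_DOUBLING_YEARS = 2.0
    for years in (2, 4, 6):
        ratio = gap_after(years, DATA_DOUBLING_YEARS, COMPUTE_DOUBLING_YEARS)
        print(f"after {years} years, data has outgrown compute by a factor of {ratio:.1f}")
```

Whatever the real doubling times are, the gap itself grows exponentially as long as the data curve doubles faster, which is the point of the "Moore's Law on Steroids" framing.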

Page 5

Life Sciences and Biomedical Research as an Information Science: Quantitative, Data-driven, Simulation-oriented, Predictive Science

Page 6

Data and Convergence Driving the Future: Data Analytic Tools, Platforms, and Measurement for New Sources of Growth

• Technology Convergence, Data Analytics and Metrology as Interdependent Drivers (Agilent 2012)

Figure labels (application areas): Synthetic Biology; Energy and the Environment; Advancing High Growth Economies; Portable, Mobile and Out-of-Lab; Nanotechnology; Food Safety; Personalized Medicine; Single Cells and Microbiome

Page 7

Beyond Interoperability, The Power of Interconvertibility: FROM PHYSICAL LIVING MATERIAL/DNA to DIGITAL DATA, and back

1’s and 0’s ↔ A, C, T, G’s

“IT from Bits” (Poste 2012)

• Programming: increasing ability to both Read and Write DNA

• DNA Construction (analog to Read/Write; 1’s and 0’s manipulation) - Genetic Expression Operating Systems; Scale DNA construction engineering

• Data enables Decoupling: biological processes from evolution-based descent and replication + design from fabrication

Tools to Edit and Write Genomes: MAGE + CAGE (Church/Isaacs 2011, 2012)
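The "1’s and 0’s ↔ A, C, T, G’s" interconversion on this slide can be illustrated with a toy codec that maps every two bits to one base. This is a minimal sketch under an assumed mapping (00→A, 01→C, 10→G, 11→T); the mapping and the function names are hypothetical and are not the scheme used in the cited work. Practical DNA data-storage encodings generally add redundancy and avoid sequences that are hard to synthesize or read.

```python
BASES = "ACGT"  # assumed mapping: index 0..3 <-> bit pairs 00, 01, 10, 11


def encode(data: bytes) -> str:
    """Turn bytes into a base string, two bits per base, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(seq: str) -> bytes:
    """Invert encode(): every four bases become one byte."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # str.index raises ValueError for any character outside A, C, G, T
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    message = b"IT from bits"
    strand = encode(message)
    print(strand)
    print(decode(strand))
```

The round trip is lossless by construction, which is what "interconvertibility" requires: the digital and the physical representation carry the same information.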

Page 8

Big Data and Data Analytics Drive new 21st Century Infrastructures and KNMs, and Create Opportunities for New Research, Better Health Outcomes, and Value Creation

(Toward Precision Medicine: Building a Knowledge Network for Biomedical Research and New Taxonomy of Disease: NAS 2011)

Page 9

The Creative Destruction of Medicine (Topol 2012)

Page 10

Data Sharing, Disease Modeling and Biomarkers to Accelerate the Development

Page 11

Big Data and Engineering Biology as the Transformative “New Normal” in the Life Sciences Driving New Sources of Growth

Synthetic Biology - Standardization, Abstraction and Modularity

Predictive Platforms for Engineering Biology and Predictable Integration of new Genetic Designs built on Massive Data

• “an Engineering METHODOLOGY to construct complex systems and novel properties based on biological components” (EU-US Task Force, June 2010)

Page 12

Data-driven and Engineering Biology “Value Proposition” Increasingly Drives Science, New Sources of Growth, and our ability to meet societal Grand Challenges – NAS 2011

Page 13

Neuroscience – a 21st Century Frontier for Human Understanding and Grand Challenges

Traversing the scales at all levels in understanding the brain, from molecular and cellular to systems – neurons (100 billion), synapses (150 trillion), and neural signaling

Human Connectome Project = mapping neural networks with more than 1 million times as many connections as the genome has letters of DNA, and linking all this to other life-experience data sets

Page 14

ENCODE: the Encyclopedia of DNA Elements – Big Data, Data Analytics, and Big Science increasingly change how we do science (Sept. 2012)

Page 15

The Plasticity of IPR/Open Science Meanings – and lots of rethinking in different domains about IPR, Openness and Scientific Research

• IPR and Competing Visions of Openness

Open Science (Public domain; BioBricks library/BBF) v. Open Source (IPR-driven; GPL, BSD, CC) v. Open Standards v. Open Development v. Open Access (including reuse and sharing public-funded data) v. Open Innovation (depends on strong, well-functioning IPR system)

• Innovative New Thinking – e.g., Semi-commons as a new lens to view Data – interacting common and private uses that are dynamic/scalable over the same resources and that can adjust through contracting and other mechanisms

• Knowledge Networks and Markets (KNMs) and Knowledge-based Capital (KBC) – major OECD initiatives ongoing

• Growing Counter-intuitive View that the Role of IPR is Increasingly Important as a Tool to Promote Openness, Transparency, and Diffusion, e.g., Algorithms, Data Exchanges, Tools and Re-use

Page 16

Growing Linkage of Data-intensive Science, IPR, and New Models of Innovation: Big Data Analytics Intersect with Open Innovation, Multi-directional S&T, University-Industry Partnering, New Business Models, Forward-looking IPR, and New Public-Private Collaborative Mechanisms to Enable Cutting-edge Research and Innovation

Page 17

The Fourth Paradigm, the Internet of Things, Automated Data Extraction Methods, and Big Data Analytics – the Need for a New Generation of Scientific computing tools and platforms to manage, visualize and analyze Big Data for Research (Gray 2009)

Page 18

Wide Range of New Data Analytic Convergence Challenges with Policy Implications (Gray 2009)

Risks to Scientific Research from (Bad) Data Analytics?

- Jeopardize reproducibility

- Retard pace of research

- Produce poorly written code/bad algorithms on which science relies

- Create serious errors in scientific outcomes, and the interpretations of them
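One common mitigation for the reproducibility risk listed above is to record, next to every result, what is needed to repeat the run. The sketch below is illustrative only: the helper names and the bootstrap analysis are hypothetical stand-ins, and a real pipeline would also pin library versions and code revisions.

```python
import hashlib
import json
import random
import statistics
import sys


def fingerprint(values):
    """SHA-256 hex digest of the input values (hypothetical helper)."""
    payload = json.dumps(values, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def bootstrap_mean(values, seed, n_resamples=200):
    """Bootstrap estimate of the mean using a private, seeded generator,
    so the result does not depend on global random state."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        means.append(statistics.fmean(sample))
    return statistics.fmean(means)


def run_with_provenance(values, seed):
    """Return the result together with the seed, an input hash and the Python version."""
    return {
        "result": bootstrap_mean(values, seed),
        "seed": seed,
        "input_sha256": fingerprint(values),
        "python": sys.version.split()[0],
    }


if __name__ == "__main__":
    record = run_with_provenance([1.0, 2.0, 3.0, 4.0], seed=42)
    print(json.dumps(record, indent=2))
```

Running the same inputs with the same seed yields an identical record, while any change to the input data changes the hash, which makes silent data drift visible.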

Page 19

New Day-to-day Science Research Implications of Big Data: Data Analytics Challenges

• Which data to keep – in what format? for how long?

• What about “emergent properties”? – resulting from elaborate networks of interactions and data patterns

• How to deal with data distributed across many locations, formats, scales, etc., and merge them?

• How to model large complex data, and derive valuable knowledge from analytics/models?

• How to infuse data into complex computations to enable simulations of predictive value?

• How to deal with different kinds of big data (temporal, spatial, dimensional, heterogeneous)

– Massive data

– High-dimensional data

– Multi-modal data

– Real-time and Streaming data
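For the real-time and streaming case in the list above, one standard building block is a summary that is updated one observation at a time, without storing the stream. The sketch below uses Welford's online algorithm for the running mean and sample variance; the class name is hypothetical and the example is only meant to show the single-pass idea, not a complete streaming system.

```python
class RunningStats:
    """Single-pass mean and sample variance (Welford's online algorithm)."""

    def __init__(self):
        self.n = 0          # number of observations seen so far
        self.mean = 0.0     # running mean
        self._m2 = 0.0      # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new observation into the summary in O(1) time and memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance (n - 1 denominator); NaN until two values have arrived."""
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")


if __name__ == "__main__":
    stats = RunningStats()
    for value in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):
        stats.update(value)
    print(stats.n, stats.mean, stats.variance)
```

Because each update touches only three numbers, the same summary works whether the stream has eight values or billions, which is the property that massive and streaming data require.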

Page 20

In a data-driven science era, should we still fund, “incentivize” and value Empirical, Theoretical, Model-based Approaches to Scientific Discovery? Is Popper’s scientific method paradigm outdated?

• “I believe that math is trumping science. What I mean by that is you don't really have to know why, you just have to know that if a and b happen, c will happen.” Vivek Ranadivé, entrepreneur and CEO, financial-data software company TIBCO (2011)

• “With enough numbers, the data speak for themselves” Chris Anderson, Editor-in-Chief, Wired, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete” (2008)

• “All models are wrong, and increasingly you can succeed without them.” Peter Norvig, Director of Research, Google

• “The numbers have no way of speaking for themselves. … Data-driven predictions can succeed – and they can fail. It is when we deny our role in the process that the odds of failure rise. Before we demand more of our data, we need to demand more of ourselves.” Nate Silver, The Signal and the Noise: Why So Many Predictions Fail – but Some Don’t (2012)

• “The invalid assumption that correlation implies cause is probably among the two or three most serious and common errors of human reasoning.” Stephen Jay Gould, American evolutionary biologist (1981)
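The tension between "the data speak for themselves" and the warning that correlation does not imply cause can be made concrete with a small simulation. In the sketch below a hidden variable z drives both x and y; x and y end up strongly correlated although neither influences the other. The setup, noise levels and function names are assumptions chosen for illustration, and subtracting z directly works only because the simulation fixes its coefficient at 1 (with real data one would have to estimate it, for example by regression).

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=2000, seed=0):
    """Return (raw correlation of x and y, correlation after removing the confounder z)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]          # hidden common cause
    x = [zi + rng.gauss(0.0, 0.3) for zi in z]           # x depends only on z
    y = [zi + rng.gauss(0.0, 0.3) for zi in z]           # y depends only on z
    raw = pearson(x, y)
    adjusted = pearson(
        [xi - zi for xi, zi in zip(x, z)],               # what is left of x without z
        [yi - zi for yi, zi in zip(y, z)],               # what is left of y without z
    )
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = simulate()
    print(f"corr(x, y) = {raw:.2f}, after accounting for z = {adjusted:.2f}")
```

A purely correlational analysis of x and y would report a strong relationship; only a model that includes z shows there is nothing left to explain, which is the case for keeping domain knowledge and theory in the loop.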

Page 21

Thank you!

Contact Information -- Richard A. Johnson

CEO, Global Helix LLC

[email protected]

MIT

[email protected]