Data Science: The End of Statistics? Larry Wasserman Carnegie Mellon University Interface 2015

Aug 23, 2020


Data Science: The End of Statistics?

Larry Wasserman

Carnegie Mellon University

Interface 2015

Conclusion

Let’s turn the Interface meeting into the statistics version of NIPS

This Talk

Will be short

Will be annoying (provocative)

Main Points

• Statisticians are being left out

• This should worry everyone (not just statisticians)

• It’s (partly) our fault

• We need a culture shift:

1. modernize training (no more UMVUEs)

2. embrace the CS conference culture

3. watch and learn from CS: active learning, deep learning, SVM, online learning, RKHS, differential privacy ...

Where are the Statisticians?

• President’s Council of Advisors on Science and Technology (PCAST) includes ... 0 statisticians!

• Chief Data Scientist of the United States Office of Science and Technology Policy. Not a statistician.

• Forbes: World’s 7 Most Powerful Data Scientists. 0 statisticians.

• Startups?

• Google, Microsoft, Facebook all have Chief Economists. Chief Statisticians?

Everyone Should Care (Not Just Statisticians)

• Big Data + Bad Analysis = Bad Decisions

• Gary King: Big data is not about the data, it’s about the analytics.

• Google search: big data bad analytics = 10,700,000 hits

• Statisticians have been doing data science for at least 100 years.

• You would not get brain surgery done by a cardiologist.

Why Are Statisticians Left Out?

Statisticians are:

conservative

stubborn

inflexible

bad at selling themselves

afraid

experts at saying what you can’t do

A (mostly) True Story

• Astronomer asks us for help.

• We spend months learning the science, cleaning the data and carefully analyzing the data.

• Some careful, modest results after one year.

• In the meantime ... my astronomer friend went to see my friends in ML.

• Two days later the ML people produced fancy plots, analyses etc.

• We complain that their analysis was not rigorous.

• Who will the astronomer go to in the future?

Anecdote: My One Week as Editor of JASA

I was hired as editor of JASA.

I insisted that the journal be made freely available, online.

I was fired.

ASA sold the rights to the journal to Taylor and Francis.

JASA is still behind a paywall.

Compare this to JMLR (Journal of Machine Learning Research, jmlr.org) or NIPS (nips.cc) or ICML (icml.cc) etc.

What to Do?

• Change “Department of Statistics” to “Department of Statistics and Data Science”

• Mostly, we need a cultural shift: training, conferences, topics.

Training

• Get rid of: MVUE, ancillarity, completeness, ...

• Get rid of assumptions (more on this in a minute)

• Add:

VC dimension

support vector machines

online learning, bandits

deep learning

optimization

coding (not just R)

cloud computing

basic software engineering (github etc)

Assumptions are For Suckers

• model-based, assumption-laden methods are useless in the world of big, complex datasets

• We need assumption-light methods with good visualization

• I propose we ban these things:

Y = Xβ + ε

Normality

sparsity (sparse methods not sparse models)

design assumptions (incoherence)

radical suggestion: let’s get rid of probability!

Jim Ramsay: “what good has probability ever done for statistics?”

Why should X1, ..., Xn be thought of as draws from some distribution?

- online learning, individual sequence prediction, ...

Page 55: Data Science: The End of Statistics? · President’s Council of Advisors on Science and Technology (PCAST) includes ... 0 statisticians! Chief Data Scientist of the United States

Assumptions are For Suckers

• model-based, assumption-laden methods are useless in the worldof big, complex, datasets

• We need assumption-light methods with good visualization

• I propose we ban these things:

Y = Xβ + ε

Normality

sparsity (sparse methods not sparse models)

design assumptions (incoherence)

radical suggestion: let’s get rid of probability!

Jim Ramsay: “what good has probability ever done for statistics?”

Why should X1, . . . ,Xn be thought of as draws from somedistribution?

-online learning, individual sequence prediction, ...

Page 56: Data Science: The End of Statistics? · President’s Council of Advisors on Science and Technology (PCAST) includes ... 0 statisticians! Chief Data Scientist of the United States

Assumptions are For Suckers

• model-based, assumption-laden methods are useless in the worldof big, complex, datasets

• We need assumption-light methods with good visualization

• I propose we ban these things:

Y = Xβ + ε

Normality

sparsity (sparse methods not sparse models)

design assumptions (incoherence)

radical suggestion: let’s get rid of probability!

Jim Ramsay: “what good has probability ever done for statistics?”

Why should X1, . . . ,Xn be thought of as draws from somedistribution?

-online learning, individual sequence prediction, ...

Page 57: Data Science: The End of Statistics? · President’s Council of Advisors on Science and Technology (PCAST) includes ... 0 statisticians! Chief Data Scientist of the United States

Assumptions are For Suckers

• model-based, assumption-laden methods are useless in the worldof big, complex, datasets

• We need assumption-light methods with good visualization

• I propose we ban these things:

Y = Xβ + ε

Normality

sparsity (sparse methods not sparse models)

design assumptions (incoherence)

radical suggestion: let’s get rid of probability!

Jim Ramsay: “what good has probability ever done for statistics?”

Why should X1, . . . ,Xn be thought of as draws from somedistribution?

-online learning, individual sequence prediction, ...

Page 58: Data Science: The End of Statistics? · President’s Council of Advisors on Science and Technology (PCAST) includes ... 0 statisticians! Chief Data Scientist of the United States

Assumptions are For Suckers

• model-based, assumption-laden methods are useless in the worldof big, complex, datasets

• We need assumption-light methods with good visualization

• I propose we ban these things:

Y = Xβ + ε

Normality

sparsity (sparse methods not sparse models)

design assumptions (incoherence)

radical suggestion: let’s get rid of probability!

Jim Ramsay: “what good has probability ever done for statistics?”

Why should X1, . . . ,Xn be thought of as draws from somedistribution?

-online learning, individual sequence prediction, ...

Page 59: Data Science: The End of Statistics? · President’s Council of Advisors on Science and Technology (PCAST) includes ... 0 statisticians! Chief Data Scientist of the United States

Assumptions are For Suckers

• model-based, assumption-laden methods are useless in the worldof big, complex, datasets

• We need assumption-light methods with good visualization

• I propose we ban these things:

Y = Xβ + ε

Normality

sparsity (sparse methods not sparse models)

design assumptions (incoherence)

radical suggestion: let’s get rid of probability!

Jim Ramsay: “what good has probability ever done for statistics?”

Why should X1, . . . ,Xn be thought of as draws from somedistribution?

-online learning, individual sequence prediction, ...


Conference Culture

• conference model: refereed conferences: NIPS, ICML, AISTATS, etc.

• leads to energetic, fast, continuous progress

• Every student should be regularly submitting papers to NIPS, AISTATS, ICML, ...

• The Interface:

- Let’s make the Interface the epicenter of statistics. Make it like NIPS.


Conclusion

• Statisticians are the original Data Scientists.

• Let’s embrace some of the CS culture. (If you can’t beat them, join them).

THE END
