Dealing With Uncertainty: What the reverend Bayes can teach us

Jan 15, 2015

OReillyStrata

By Jurgen Van Gael - http://jvangael.github.io/ - @jvangael

As data scientists and decision makers, uncertainty is all around us: data is noisy, missing, wrong or inherently uncertain. Statistics offers a wide set of theories and tools to deal with this uncertainty, yet most people are unaware of a unifying theory of uncertainty. In this talk I want to introduce the audience to a branch of statistics called Bayesian reasoning, which is a unifying, consistent, logical and, most importantly, successful way of dealing with uncertainty.

Over the past two centuries there have been many proposals for dealing with uncertainty (e.g. frequentist probabilities, fuzzy logic, ...). Under the influence of early 20th century statisticians, the Bayesian formalism was somewhat pushed into the background of the statistical scene. More recently though, partly thanks to computer science, Bayesian thinking has seen a revival. So what, and how much, should a data scientist or decision maker know about Bayesian thinking?

My talk will consist of four parts. In the first part, I will explain the central dogma of Bayesian thinking: Bayes' rule. This simple equation (4 variables, one multiplication and one division!) describes how we should update our beliefs about the world in light of new data. I will discuss evidence from neuroscience and psychology that the brain uses Bayesian mechanisms to reason about the world. Unfortunately, the brain sometimes fails miserably at taking all the variables of Bayes' rule into account.

This leads to the second part of the talk, where I will illustrate Bayes' rule as a tool for decision makers to reason about uncertainty.

In the third part of the talk I will give an example of how we can build machine learning systems around Bayes' rule. The key idea here is that Bayes' rule allows us to keep track of uncertainty about the world. In this part I will show a Bayesian machine learning system in action.

In the final part of the talk I will introduce the concept of “Probabilistic Programming”. Probabilistic programming is a new, still embryonic programming paradigm that introduces “uncertain variables” as first-class citizens of a programming language and then uses Bayes' rule to execute the programs.

At machine learning conferences in the last few years, the Bayesian framework has been prominent. In this talk I want to help the audience understand how the Bayesian framework can help them in their data mining and decision making processes. If people leave the talk thinking Bayes' rule is the E=mc^2 of data science, I will consider the presentation a success.
Transcript
Page 1: Dealing with Uncertainty: What the reverend Bayes can teach us.

Dealing With Uncertainty: What the reverend Bayes can teach us

Page 2: Dealing with Uncertainty: What the reverend Bayes can teach us.

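Probability – Bernoulli, de Moivre

§  Fair coin

-  50% heads

-  50% tails

What is the probability of two consecutive heads?

[Diagram: the four possible outcomes of two flips (heads-heads, heads-tails, tails-heads, tails-tails), each with probability 25%.]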
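The 25% on this slide can be checked by enumerating every outcome of two independent flips. The sketch below is only an illustration; the names `FACE_PROB` and `outcome_probabilities` are made up for this example and do not come from the talk.

```python
from fractions import Fraction
from itertools import product

# A fair coin: each face has probability 1/2.
FACE_PROB = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def outcome_probabilities(n_flips):
    """Return the probability of every sequence of n_flips independent flips."""
    probs = {}
    for seq in product(FACE_PROB, repeat=n_flips):
        p = Fraction(1)
        for face in seq:
            p *= FACE_PROB[face]  # independence: multiply per-flip probabilities
        probs["".join(seq)] = p
    return probs


two_flips = outcome_probabilities(2)
print(two_flips["HH"])  # 1/4
```

Exact fractions are used so that the four outcomes sum to exactly 1, with no floating-point rounding.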

Page 3: Dealing with Uncertainty: What the reverend Bayes can teach us.

1701

Page 4: Dealing with Uncertainty: What the reverend Bayes can teach us.

Inverse Probability (Bayes)

§  Given a coin, we are not sure whether it is biased or not.

§  If two flips turn up heads, is the coin biased or not?

Original Belief

Observation

New Belief
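The slide does not say how biased the coin might be or how strongly we suspect it, so the sketch below has to assume both: a 50% original belief that the coin is biased, and a biased coin that lands heads 75% of the time. Both numbers, and the function name `posterior_biased`, are hypothetical and serve only to show the mechanics of going from original belief to new belief.

```python
def posterior_biased(prior_biased, p_heads_biased, n_heads):
    """Posterior probability that the coin is biased after n_heads heads in a row."""
    p_heads_fair = 0.5
    like_biased = p_heads_biased ** n_heads  # p(observation | biased)
    like_fair = p_heads_fair ** n_heads      # p(observation | fair)
    evidence = like_biased * prior_biased + like_fair * (1 - prior_biased)
    return like_biased * prior_biased / evidence


# Assumed inputs (not from the talk): 50% prior, biased coin shows heads 75% of the time.
new_belief = posterior_biased(prior_biased=0.5, p_heads_biased=0.75, n_heads=2)
print(round(new_belief, 3))  # 0.692
```

Under these assumptions two heads shift the belief from 50% to about 69%: suggestive, but far from conclusive.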

Page 5: Dealing with Uncertainty: What the reverend Bayes can teach us.
Page 6: Dealing with Uncertainty: What the reverend Bayes can teach us.

BAYESIAN PROBABILITY

Page 7: Dealing with Uncertainty: What the reverend Bayes can teach us.

Cox Axioms

§  The plausibility of a statement is a real number and is dependent on information we have related to the statement.

§  Plausibilities should vary sensibly with the assessment of plausibilities in the model.

§  If the plausibility of a statement can be derived in many ways, all the results must be equal.

Outcome:

§  If A is true then p(A) = 1

§  p(A) + p(not A) = 1

§  p(A and B) = p(A|B) x p(B)
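The product rule above leads to Bayes' rule in two steps. The derivation below is a short sketch that assumes only that “A and B” and “B and A” are the same statement, and that p(B) is not zero.

```latex
\begin{align}
p(A \text{ and } B) &= p(A \mid B)\, p(B) \\
p(B \text{ and } A) &= p(B \mid A)\, p(A) \\
\Rightarrow\quad p(A \mid B) &= \frac{p(B \mid A)\, p(A)}{p(B)}, \qquad p(B) > 0
\end{align}
```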

Page 8: Dealing with Uncertainty: What the reverend Bayes can teach us.

p(“cause” | “effect”) = p(“effect” | “cause”) × p(“cause”) / p(“effect”)

Original Belief

Observation

New Belief
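The cycle from original belief through observation to new belief can be written as one small function over a set of mutually exclusive causes. This is a minimal sketch: the function name `update`, the cause labels and all the numbers are hypothetical, chosen only to show that a new belief can serve as the original belief for the next observation.

```python
def update(belief, likelihood):
    """Apply Bayes' rule over a set of mutually exclusive causes.

    belief:     dict cause -> p(cause)            (original belief)
    likelihood: dict cause -> p(effect | cause)   (observation)
    returns:    dict cause -> p(cause | effect)   (new belief)
    """
    unnormalised = {c: likelihood[c] * belief[c] for c in belief}
    evidence = sum(unnormalised.values())  # p(effect)
    return {c: w / evidence for c, w in unnormalised.items()}


# Hypothetical numbers, chosen only for illustration.
original_belief = {"cause_a": 0.5, "cause_b": 0.5}
observation = {"cause_a": 0.8, "cause_b": 0.2}

new_belief = update(original_belief, observation)
# The new belief becomes the original belief for the next observation.
newer_belief = update(new_belief, observation)
```

With these inputs, one observation moves `cause_a` from 0.5 to 0.8, and a second identical observation moves it to 16/17 (about 0.94).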

Page 9: Dealing with Uncertainty: What the reverend Bayes can teach us.

What is the probability that the person behind the screen is a girl?

50%

What is the probability that the person called Charlie behind the screen is a girl?

Page 10: Dealing with Uncertainty: What the reverend Bayes can teach us.

Something about probability of Charlie

§ Girls: 32 / 22989 ≈ 0.14%

§ Boys: 89 / 22070 ≈ 0.40%

Page 11: Dealing with Uncertainty: What the reverend Bayes can teach us.

What is the probability that the person called Charlie behind the screen is a girl?

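p(Girl | “Charlie”) = p(“Charlie” | Girl) × p(Girl) / p(“Charlie”)

p(“Charlie”) = p(“Charlie” | Girl) × p(Girl) + p(“Charlie” | Boy) × p(Boy)

§ p(“Charlie” | Girl) = 32 / 22989 ≈ 0.14%

§ p(“Charlie” | Boy) = 89 / 22070 ≈ 0.40%

§ p(Girl) = p(Boy) = 50%

p(Girl | “Charlie”) ≈ 25.7%, i.e. roughly 25%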
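The arithmetic on this slide can be reproduced in a few lines. The sketch below uses only the counts quoted on the slides and the 50% prior; the variable names are chosen for this example.

```python
# Counts quoted on the slides.
p_charlie_given_girl = 32 / 22989
p_charlie_given_boy = 89 / 22070
p_girl = p_boy = 0.5  # original belief before hearing the name

# p("Charlie"): sum over both possible causes.
p_charlie = p_charlie_given_girl * p_girl + p_charlie_given_boy * p_boy

# Bayes' rule.
p_girl_given_charlie = p_charlie_given_girl * p_girl / p_charlie
print(f"{p_girl_given_charlie:.1%}")  # 25.7%
```

Because the prior is 50/50, it cancels out and the result depends only on the ratio of the two name frequencies.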

Page 12: Dealing with Uncertainty: What the reverend Bayes can teach us.

BAYESIAN MACHINE LEARNING

Page 13: Dealing with Uncertainty: What the reverend Bayes can teach us.

p(Spam | Content) = p(Content | Spam) × p(Spam) / p(Content)
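One common way to make p(Content | Spam) computable is the naive Bayes assumption: words are treated as independent given the class. The slide does not say which model the talk used, so the sketch below is an assumption, and every probability in it is hypothetical.

```python
import math

# Hypothetical per-word probabilities p(word | class); not taken from the talk.
P_WORD_GIVEN_SPAM = {"free": 0.30, "meeting": 0.02, "offer": 0.20}
P_WORD_GIVEN_HAM = {"free": 0.03, "meeting": 0.15, "offer": 0.02}
P_SPAM = 0.4  # hypothetical prior fraction of spam


def p_spam_given_content(words):
    """Posterior p(Spam | Content), treating words as independent given the class."""
    log_spam = math.log(P_SPAM) + sum(math.log(P_WORD_GIVEN_SPAM[w]) for w in words)
    log_ham = math.log(1 - P_SPAM) + sum(math.log(P_WORD_GIVEN_HAM[w]) for w in words)
    # p(Content) is the sum of both joint terms; subtract the max for numerical stability.
    top = max(log_spam, log_ham)
    spam, ham = math.exp(log_spam - top), math.exp(log_ham - top)
    return spam / (spam + ham)


print(p_spam_given_content(["free", "offer"]))  # close to 1
print(p_spam_given_content(["meeting"]))        # close to 0
```

The sketch only handles words present in both tables; a real filter would also need a rule for unseen words, which is left out here.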

Page 14: Dealing with Uncertainty: What the reverend Bayes can teach us.

TrueSkill

p(Skill | Match Outcomes) = p(Match Outcomes | Skill) × p(Skill) / p(Match Outcomes)

Page 15: Dealing with Uncertainty: What the reverend Bayes can teach us.

p(Road_{t+1} | Image_t) = p(Image_t | Road_t) × p(Road_t) / p(Image_t)

Page 16: Dealing with Uncertainty: What the reverend Bayes can teach us.

Bayesian Sick People Experiment

§  1 in 100 has a health issue.

§  Test is 90% accurate.

§  You test positive, what are the odds that you need a treatment?

Page 17: Dealing with Uncertainty: What the reverend Bayes can teach us.

What is the probability of being sick?

A.  ≈ 95%

B.  ≈ 90%

C.  ≈ 50%

D.  ≈ 10%

Page 20: Dealing with Uncertainty: What the reverend Bayes can teach us.

§  1000 people in our sample.

§  We expect 10 people to be sick (give or take).

§  Imagine testing all individuals?

→  9 out of 10 sick people test positive.

→  99 out of 990 healthy people test positive!

§  In other words, if you test positive, it is actually not very likely that you are sick: only 9 of the 108 positive tests (about 8%) belong to sick people.
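The counting argument above translates directly into code. The sketch below assumes, as the slides implicitly do, that “90% accurate” applies equally to sick and healthy people; the variable names are chosen for this example.

```python
population = 1000
prevalence = 0.01  # 1 in 100 has the health issue
accuracy = 0.90    # assumed to hold for sick and healthy people alike

sick = population * prevalence              # 10 people
healthy = population - sick                 # 990 people
true_positives = sick * accuracy            # 9 sick people test positive
false_positives = healthy * (1 - accuracy)  # 99 healthy people test positive

p_sick_given_positive = true_positives / (true_positives + false_positives)
print(f"{p_sick_given_positive:.1%}")  # 8.3%
```

The result does not depend on the sample size of 1000: the population cancels out, leaving the same ratio that Bayes' rule gives.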

Page 21: Dealing with Uncertainty: What the reverend Bayes can teach us.

PROBABILISTIC PROGRAMMING

Page 22: Dealing with Uncertainty: What the reverend Bayes can teach us.

Cause → Effect        Effect → Cause

Input → Output        Output → Input

Page 23: Dealing with Uncertainty: What the reverend Bayes can teach us.

§  Imagine a timeline of sales per day for a particular product.

§  Did the sales rate for this product change over time?

Page 24: Dealing with Uncertainty: What the reverend Bayes can teach us.

Thinking From Cause to Effect

§  Input:

-  Sales rate for period 1.

-  Sales rate for period 2.

-  Switchover point between period 1 and 2.

§  Output:

-  Unit sales over period 1 and 2.

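import numpy as np
import pymc

# Assumes a recent PyMC release, where every variable takes a name as its first argument.
# `data` is assumed to hold 70 daily unit-sales counts, loaded elsewhere.

model = pymc.Model()

with model:
    # Day on which the sales rate switches from rate_1 to rate_2.
    switch = pymc.DiscreteUniform("switch", lower=0, upper=70)
    # Prior belief about the sales rate in each period.
    rate_1 = pymc.Exponential("rate_1", lam=1.0)
    rate_2 = pymc.Exponential("rate_2", lam=1.0)
    # rate_1 applies up to the switchover day, rate_2 afterwards.
    rates = pymc.math.switch(switch >= np.arange(70), rate_1, rate_2)
    unit_sales = pymc.Poisson("unit_sales", mu=rates, observed=data)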
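Thinking from cause to effect means that, given the three inputs, the daily sales can simply be simulated. The sketch below does this with NumPy alone as a forward simulation of the same story; the function name `simulate_sales` and the example rates and switchover day are hypothetical.

```python
import numpy as np


def simulate_sales(rate_1, rate_2, switch, n_days=70, seed=0):
    """Draw daily unit sales: Poisson(rate_1) up to day `switch`, Poisson(rate_2) after."""
    rng = np.random.default_rng(seed)
    days = np.arange(n_days)
    # Same rule as in the model: rate_1 while switch >= day, rate_2 otherwise.
    rates = np.where(switch >= days, rate_1, rate_2)
    return rng.poisson(rates)


# Hypothetical inputs, chosen only for illustration.
sales = simulate_sales(rate_1=2.0, rate_2=6.0, switch=40)
print(sales[:41].mean(), sales[41:].mean())
```

Probabilistic programming runs this story in the other direction: given observed sales, it infers plausible values for the two rates and the switchover day.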

Page 25: Dealing with Uncertainty: What the reverend Bayes can teach us.
Page 26: Dealing with Uncertainty: What the reverend Bayes can teach us.
Page 27: Dealing with Uncertainty: What the reverend Bayes can teach us.

References

§  Bayesian vs. Frequentist Statistics

-  http://www.stat.ufl.edu/~casella/Talks/BayesRefresher.pdf

§  Probabilistic Programming & Bayesian Methods for Hackers

-  https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

§  Bayesian Methods

-  http://www.gatsby.ucl.ac.uk/~zoubin/tmp/tutorial.pdf

§  “The Theory That Would Not Die”, Sharon Bertsch McGrayne

-  http://www.amazon.co.uk/dp/0300188226

Page 28: Dealing with Uncertainty: What the reverend Bayes can teach us.

Medical Example using PyMC

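import pymc

# Assumes a recent PyMC release, where variables are named and sample() returns an InferenceData object.

model = pymc.Model()

with model:
    # Original belief: 1 in 100 people is sick.
    sick = pymc.Bernoulli("sick", p=0.01)
    # The test is right 90% of the time; we observed one positive result.
    test_result = pymc.Bernoulli("test_result", p=sick * 0.9 + (1 - sick) * (1.0 - 0.9), observed=[1])
    algorithm = pymc.Metropolis()
    trace = pymc.sample(1000, step=algorithm)

print("Pr(Sick | Test) = %f" % float(trace.posterior["sick"].mean()))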
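To make the sampling step less of a black box, the sketch below runs a hand-written Metropolis chain on the same medical example using only the standard library. It illustrates the general idea and is not PyMC's implementation; the function names, the flip proposal and the number of samples are choices made for this example.

```python
import random

PRIOR_SICK = 0.01
ACCURACY = 0.9


def unnormalised_posterior(sick):
    """p(positive test | sick) * p(sick), for sick in {0, 1}."""
    prior = PRIOR_SICK if sick else 1 - PRIOR_SICK
    likelihood = ACCURACY if sick else 1 - ACCURACY
    return likelihood * prior


def metropolis_estimate(n_samples, seed=0):
    """Estimate Pr(sick | positive test) with a flip-proposal Metropolis chain."""
    rng = random.Random(seed)
    state, total = 0, 0
    for _ in range(n_samples):
        proposal = 1 - state  # symmetric proposal: flip the binary state
        ratio = unnormalised_posterior(proposal) / unnormalised_posterior(state)
        if rng.random() < ratio:  # accept with probability min(1, ratio)
            state = proposal
        total += state
    return total / n_samples


estimate = metropolis_estimate(100_000)
print("Pr(Sick | Test) = %f" % estimate)
```

The estimate should land close to the exact value 9 / 108 ≈ 0.083. Note that the chain never needs p(Test): only ratios of unnormalised posteriors are used.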