Probability •Formal study of uncertainty •The engine that drives statistics

Dec 22, 2015


Lee Harris
Transcript
Page 1: Probability Formal study of uncertainty The engine that drives statistics.

Probability

• Formal study of uncertainty
• The engine that drives statistics

Introduction

• Nothing in life is certain
• We gauge the chances of successful outcomes in business, medicine, weather, and other everyday situations such as the lottery (recall the birthday problem)


History

• For most of human history, probability, the formal study of the laws of chance, has been used for only one thing: gambling

History (cont.)

• Nobody knows exactly when gambling began; it goes back at least as far as ancient Egypt, where 4-sided “astragali” (made from animal heelbones) were used

History (cont.)

• The Roman emperor Claudius (10 BC-54 AD) wrote the first known treatise on gambling.
• The book “How to Win at Gambling” was lost.

Rule 1: Let Caesar win IV out of V times

Approaches to Probability

• Relative frequency: event probability = x/n, where x = # of occurrences of the event of interest, n = total # of observations
• Coin, die tossing; nuclear power plants?
• Limitations: repeated observations not practical
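As a rough illustration of the relative frequency approach, the following Python sketch simulates tosses of a fair coin and reports x/n for increasing n. The function name `relative_frequency`, the fixed seed, and the choice of a simulated coin are assumptions made for this example; they are not from the slides.

```python
import random


def relative_frequency(n, seed=0):
    """Estimate P(heads) as x/n from n simulated tosses of a fair coin."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    x = sum(rng.random() < 0.5 for _ in range(n))  # x = number of heads observed
    return x / n


# The estimate x/n tends to settle near 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

The same idea breaks down for events such as a nuclear power plant failure, where the experiment cannot be repeated many times.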

Approaches to Probability (cont.)

• Subjective probability: individual assigns prob. based on personal experience, anecdotal evidence, etc.
• Classical approach: every possible outcome has equal probability (more later)

Basic Definitions

• Experiment: act or process that leads to a single outcome that cannot be predicted with certainty
• Examples:
1. Toss a coin
2. Draw 1 card from a standard deck of cards
3. Arrival time of flight from Atlanta to RDU

Basic Definitions (cont.)

• Sample space: all possible outcomes of an experiment. Denoted by S
• Event: any subset of the sample space S; typically denoted A, B, C, etc.
Simple event: event with only 1 outcome
Null event: the empty set ∅
Certain event: S

Examples

1. Toss a coin once
S = {H, T}; A = {H}, B = {T} are simple events
2. Toss a die once; count dots on upper face
S = {1, 2, 3, 4, 5, 6}
A = even # of dots on upper face = {2, 4, 6}
B = 3 or fewer dots on upper face = {1, 2, 3}

Laws of Probability

1. 0 ≤ P(A) ≤ 1 for any event A
2. P(∅) = 0, P(S) = 1

Laws of Probability (cont.)

3. P(A’) = 1 - P(A)
For an event A, A’ is the complement of A; A’ is everything in S that is not in A.

[Venn diagram: sample space S divided into A and its complement A’]

Birthday Problem

• What is the smallest number of people you need in a group so that the probability of 2 or more people having the same birthday is greater than 1/2?
• Answer: 23

No. of people    23     30     40     60
Probability      .507   .706   .891   .994

Example: Birthday Problem

• A = {at least 2 people in the group have a common birthday}
• A’ = {no one has a common birthday}

3 people: $P(A') = \frac{364}{365} \cdot \frac{363}{365}$

23 people: $P(A') = \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{343}{365} \approx .493$

so $P(A) = 1 - P(A') \approx 1 - .493 = .507$
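The complement-rule calculation for the birthday problem can be sketched in Python as follows. The function name `p_shared_birthday` is hypothetical, and the sketch assumes 365 equally likely birthdays with no leap years. Under those assumptions it reproduces the probabilities .507, .706, .891, and .994 listed for groups of 23, 30, 40, and 60 people.

```python
def p_shared_birthday(k, days=365):
    """P(at least 2 of k people share a birthday), computed as 1 - P(A')."""
    p_no_match = 1.0
    for i in range(k):
        # Person i must avoid the i birthdays already taken.
        p_no_match *= (days - i) / days
    return 1.0 - p_no_match


for k in (23, 30, 40, 60):
    print(k, round(p_shared_birthday(k), 3))
```

With 22 people the probability is still below 1/2, which is why 23 is the smallest group size that works.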

Unions and Intersections

[Venn diagram: events A and B inside sample space S, illustrating the union A ∪ B and the intersection A ∩ B]

Mutually Exclusive Events

• Mutually exclusive events: no outcomes from S in common

[Venn diagram: non-overlapping events A and B inside sample space S]

A ∩ B = ∅

Laws of Probability (cont.)

Addition Rule for Disjoint Events:

4. If A and B are disjoint events, then

P(A ∪ B) = P(A) + P(B)

5. For two independent events A and B

P(A ∩ B) = P(A) × P(B)

Laws of Probability (cont.)

General Addition Rule

6. For any two events A and B

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

[Venn diagram: overlapping events A and B inside sample space S, with the overlap A ∩ B marked]

Example: toss a fair die once

• S = {1, 2, 3, 4, 5, 6}
• A = even # appears = {2, 4, 6}
• B = 3 or fewer = {1, 2, 3}
• P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
= P({2, 4, 6}) + P({1, 2, 3}) - P({2})
= 3/6 + 3/6 - 1/6 = 5/6
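The die example can be checked with Python sets, which mirror the union and intersection notation directly. This is only a sketch: the helper name `prob` is hypothetical, and it assumes every outcome in S is equally likely.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one toss of a fair die
A = {2, 4, 6}            # even number appears
B = {1, 2, 3}            # 3 or fewer


def prob(event, space=S):
    """Probability of an event when all outcomes in the space are equally likely."""
    return Fraction(len(event & space), len(space))


lhs = prob(A | B)                        # P(A ∪ B) computed directly
rhs = prob(A) + prob(B) - prob(A & B)    # general addition rule
print(lhs, rhs)
```

Both sides come out to 5/6, matching the hand calculation.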

Laws of Probability: Summary

• 1. 0 ≤ P(A) ≤ 1 for any event A
• 2. P(∅) = 0, P(S) = 1
• 3. P(A’) = 1 - P(A)
• 4. If A and B are disjoint events, then P(A ∪ B) = P(A) + P(B)
• 5. If A and B are independent events, then P(A ∩ B) = P(A) × P(B)
• 6. For any two events A and B, P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Probability Models

The Equally Likely Approach (also called the Classical Approach)

Assigning Probabilities

• If an experiment has N equally likely outcomes, then each outcome has probability 1/N of occurring
• If an event A1 consists of n1 of those outcomes, then

P(A1) = n1/N


We Need Efficient Methods for Counting Outcomes


Product Rule for Ordered Pairs

• A student wishes to commute to a junior college for 2 years and then commute to a state college for 2 years. Within commuting distance there are 4 junior colleges and 3 state colleges. How many junior college-state college pairs are available to her?

Product Rule for Ordered Pairs

• junior colleges: 1, 2, 3, 4
• state colleges: a, b, c
• possible pairs:
(1, a) (1, b) (1, c)
(2, a) (2, b) (2, c)
(3, a) (3, b) (3, c)
(4, a) (4, b) (4, c)

4 junior colleges, 3 state colleges: total number of possible pairs = 4 × 3 = 12

In general, if there are n1 ways to choose the first element of the pair, and n2 ways to choose the second element, then the number of possible pairs is n1 × n2. Here n1 = 4, n2 = 3.
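The product rule can be illustrated by listing the pairs with `itertools.product` from the Python standard library. The variable names `junior` and `state` are chosen for this sketch only.

```python
from itertools import product

junior = [1, 2, 3, 4]      # n1 = 4 junior colleges
state = ["a", "b", "c"]    # n2 = 3 state colleges

# Every ordered pair (junior college, state college).
pairs = list(product(junior, state))

for pair in pairs:
    print(pair)
print(len(pairs), "pairs; n1 * n2 =", len(junior) * len(state))
```

Enumerating is fine for 12 pairs, but the point of the rule is that n1 × n2 gives the count without listing anything.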

Counting in “Either-Or” Situations

• NCAA Basketball Tournament: how many ways can the “bracket” be filled out?
1. How many games? 63
2. 2 choices for each game
3. Number of ways to fill out the bracket: 2^63 = 9.2 × 10^18
• Suppose the Earth's population is about 6 billion and everyone fills out 1 million different brackets
• Even then, the chance that any one of those brackets gets all games correct is only about 1 in 1,500
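The bracket arithmetic can be checked with a few lines of Python. The sketch assumes 63 games (the exponent in the count above), that all submitted brackets are different, and that every bracket is equally likely to be the correct one; the variable names are illustrative.

```python
GAMES = 63                      # assumed number of games in the bracket
brackets = 2 ** GAMES           # 2 choices for each game

population = 6_000_000_000      # about 6 billion people
per_person = 1_000_000          # each fills out 1 million different brackets
entries = population * per_person

# Number of possible brackets per submitted bracket.
odds_against = brackets / entries

print(f"{brackets:.1e} possible brackets")
print(f"about 1 chance in {odds_against:,.0f} that some entry is perfect")
```

Python integers have arbitrary precision, so 2 ** 63 is computed exactly.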


Counting Example

• Pollsters minimize lead-in effect by rearranging the order of the questions on a survey

• If Gallup has a 5-question survey, how many different versions of the survey are required if all possible arrangements of the questions are included?

Solution

• There are 5 possible choices for the first question, 4 remaining choices for the second question, 3 choices for the third question, 2 choices for the fourth question, and 1 choice for the fifth question.
• The number of possible arrangements is therefore 5 × 4 × 3 × 2 × 1 = 120

Efficient Methods for Counting Outcomes

• Factorial notation: n! = 1 × 2 × … × n
• Examples: 1! = 1; 2! = 1 × 2 = 2; 3! = 1 × 2 × 3 = 6; 4! = 24; 5! = 120
• Special definition: 0! = 1
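For readers working in Python rather than on a calculator, `math.factorial` gives n! directly, and `itertools.permutations` can list the arrangements themselves. The question labels below are placeholders for the 5-question survey example.

```python
from itertools import permutations
from math import factorial

questions = ["Q1", "Q2", "Q3", "Q4", "Q5"]   # placeholder labels

# Every ordering of the 5 questions is one version of the survey.
versions = list(permutations(questions))

print(len(versions))    # number of versions found by listing them
print(factorial(5))     # the same count from 5!
print(factorial(0))     # special definition: 0! = 1
```

Listing is only practical for small n; the factorial grows far too quickly for larger surveys.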

Factorials with calculators and Excel

• Calculator:
non-graphing: x! (second function)
graphing: bottom of p. 9, TI Calculator Commands (math button)
• Excel: Paste: math, fact

Factorial Examples

• 20! = 2.43 × 10^18
• 1,000,000 seconds? About 11.5 days
• 1,000,000,000 seconds? About 31 years
• 31 years ≈ 10^9 seconds
• 10^18 = 10^9 × 10^9
• 31 × 10^9 years ≈ 10^9 × 10^9 = 10^18 seconds
• 20! seconds is therefore about 75 billion years, several times the age of the universe

Permutations

A B C D E

• How many ways can we choose 2 letters from the above 5, without replacement, when the order in which we choose the letters is important?
• 5 × 4 = 20

Permutations (cont.)

$5 \cdot 4 = 20 = \frac{5!}{3!} = \frac{5!}{(5-2)!}$

Notation: ${}_5P_2 = \frac{5!}{(5-2)!} = 20$

Permutations with calculator and Excel

• Calculator:
non-graphing: nPr
graphing: p. 9 of TI Calculator Commands (math button)
• Excel: Paste: Statistical, Permut

Combinations

A B C D E

• How many ways can we choose 2 letters from the above 5, without replacement, when the order in which we choose the letters is not important?
• 5 × 4 = 20 when order is important
• Divide by 2: (5 × 4)/2 = 10 ways

Combinations (cont.)

${}_5C_2 = \binom{5}{2} = \frac{5!}{(5-2)!\,2!} = \frac{5!}{3!\,2!} = \frac{5 \cdot 4}{1 \cdot 2} = \frac{20}{2} = 10$

In general: ${}_nC_r = \binom{n}{r} = \frac{n!}{(n-r)!\,r!}$
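The permutation and combination formulas translate directly into Python. The helper names `n_p_r` and `n_c_r` are hypothetical; the built-in `math.perm` and `math.comb` (available in Python 3.8 and later) are shown alongside as a cross-check.

```python
from math import comb, factorial, perm


def n_p_r(n, r):
    """Ordered selections of r items from n, without replacement: n!/(n-r)!"""
    return factorial(n) // factorial(n - r)


def n_c_r(n, r):
    """Unordered selections of r items from n, without replacement: n!/((n-r)! r!)"""
    return factorial(n) // (factorial(n - r) * factorial(r))


# Choosing 2 letters from A B C D E.
print(n_p_r(5, 2), perm(5, 2))   # order important
print(n_c_r(5, 2), comb(5, 2))   # order not important
```

Each unordered pair corresponds to 2! = 2 ordered pairs, which is why the combination count is half the permutation count here.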

ST 101 Powerball Lottery

From the numbers 1 through 20, choose 6 different numbers.

Write them on a piece of paper.

Chances of Winning?

Choose 6 numbers from 20, without replacement, order not important.

Number of possibilities?

${}_{20}C_6 = \binom{20}{6} = \frac{20!}{(20-6)!\,6!} = 38{,}760$

North Carolina Powerball Lottery

Prior to Jan. 1, 2009:
5 from 1-55: ${}_{55}C_5 = \frac{55!}{5!\,50!} = 3{,}478{,}761$
1 from 1-42 (p'ball #): ${}_{42}C_1 = \frac{42!}{1!\,41!} = 42$
Total: 3,478,761 × 42 = 146,107,962

After Jan. 1, 2009:
5 from 1-59: ${}_{59}C_5 = \frac{59!}{5!\,54!} = 5{,}006{,}386$
1 from 1-39 (p'ball #): ${}_{39}C_1 = \frac{39!}{1!\,38!} = 39$
Total: 5,006,386 × 39 = 195,249,054
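The two Powerball counts combine a combination (the 5 main numbers) with the product rule (the separate Powerball number). A short Python sketch, with the hypothetical function name `powerball_tickets` and parameter names chosen for this example, reproduces both totals:

```python
from math import comb


def powerball_tickets(main_max, powerball_max, main_picks=5):
    """Number of distinct tickets: choose main_picks of main_max, then 1 Powerball number."""
    return comb(main_max, main_picks) * powerball_max


before_2009 = powerball_tickets(55, 42)   # 5 from 1-55, 1 from 1-42
after_2009 = powerball_tickets(59, 39)    # 5 from 1-59, 1 from 1-39

print(f"{before_2009:,}")
print(f"{after_2009:,}")
```

Under the equally likely approach, a single ticket wins with probability 1 divided by the corresponding total.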

Visualize Your Lottery Chances

• How large is 195,249,054?
• $1 bill and $100 bill both 6” in length
• 10,560 bills = 1 mile
• Let’s start with 195,249,053 $1 bills and one $100 bill …
• … and take a long walk, putting down bills end-to-end as we go
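To get a feel for the length of that walk, the bill count can be converted to miles. This is a back-of-the-envelope sketch that takes the 6-inch bill length at face value and lays every bill in a single line; the variable names are illustrative.

```python
INCHES_PER_MILE = 5280 * 12
BILL_LENGTH_INCHES = 6

bills_per_mile = INCHES_PER_MILE // BILL_LENGTH_INCHES   # 10,560 bills per mile
total_bills = 195_249_054                                # one $100 bill plus the $1 bills

# Whole miles covered, and bills left over after the last full mile.
miles, leftover_bills = divmod(total_bills, bills_per_mile)

print(bills_per_mile)
print(f"{miles:,} miles plus {leftover_bills:,} bills")
```

Laid in one line, the bills stretch for roughly 18,500 miles.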


Raleigh to Ft. Lauderdale…

… still plenty of bills remaining, so continue from …


… Ft. Lauderdale to San Diego

… still plenty of bills remaining, so continue from…


… still plenty of bills remaining, so continue from …

… San Diego to Seattle


… still plenty of bills remaining, so continue from …

… Seattle to New York


… still plenty of bills remaining, so …

… New York back to Raleigh


Go around again! Lay a second path of bills

Still have ~ 5,000 bills left!!


Chances of Winning NC Powerball Lottery?

• Remember: one of the bills you put down is a $100 bill; all others are $1 bills

• Your chance of winning the lottery is the same as bending over and picking up the $100 bill while walking the route blindfolded.

Example: Illinois State Lottery

Choose 6 numbers from 54 numbers without replacement; order not important

${}_{54}C_6 = \frac{54!}{48!\,6!} = 25{,}827{,}165$

(about 1 second in 10 months)

(1200 ft² house, 16.5 million ping pong balls)

Virginia State Lottery

Pick 5: ${}_{50}C_5 = \frac{50!}{45!\,5!} = 2{,}118{,}760$

${}_{50}C_5 \times {}_{25}C_1 = 2{,}118{,}760 \times \frac{25!}{24!\,1!} = 52{,}969{,}000$

Probability Trees

A Graphical Method for Complicated Probability Problems

Example: AIDS Testing

• V = {person has HIV}; CDC: P(V) = .006
• +: test outcome is positive (test indicates HIV present)
• -: test outcome is negative
• clinical reliabilities for a new HIV test:
1. If a person has the virus, the test result will be positive with probability .999
2. If a person does not have the virus, the test result will be negative with probability .990


Question 1

• What is the probability that a randomly selected person will test positive?


Probability Tree Approach

• A probability tree is a useful way to visualize this problem and to find the desired probability.

Probability Tree

[Probability tree: the first branches split on V (probability .006) versus not V (probability .994); the second branches split on the test outcome using the clinical reliabilities (.999 positive given V, .990 negative given not V). Multiply the branch probabilities along each path.]

Question 1 Answer

• What is the probability that a randomly selected person will test positive?
• Two sequences of branches lead to a positive test: P(V and +) = .006 × .999 = .00599, and P(not V and +) = .994 × .010 = .00994
• P(+) = .00599 + .00994 = .01593

Question 2

• If your test comes back positive, what is the probability that you have HIV? (Remember: we know that if a person has the virus, the test result will be positive with probability .999; if a person does not have the virus, the test result will be negative with probability .990.)
• Looks very reliable

Question 2 Answer

Answer: two sequences of branches lead to a positive test; only 1 sequence represents people who have HIV.

P(person has HIV given that test is positive) = .00599/(.00599 + .00994) = .376
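The tree calculation for both questions can be sketched in Python. The function name `positive_test_tree` and the parameter names `prevalence`, `sensitivity`, and `specificity` are labels chosen for this example; the slides call the last two "clinical reliabilities".

```python
def positive_test_tree(prevalence, sensitivity, specificity):
    """Return (P(+), P(V | +)) by multiplying along the two branches that end in '+'."""
    p_v_and_pos = prevalence * sensitivity                   # has virus, tests positive
    p_not_v_and_pos = (1 - prevalence) * (1 - specificity)   # no virus, tests positive
    p_pos = p_v_and_pos + p_not_v_and_pos                    # Question 1
    p_v_given_pos = p_v_and_pos / p_pos                      # Question 2
    return p_pos, p_v_given_pos


p_pos, p_v_given_pos = positive_test_tree(0.006, 0.999, 0.990)
print(round(p_pos, 5))
print(round(p_v_given_pos, 3))
```

Trying a larger prevalence in place of .006 shows how strongly the answer to Question 2 depends on how rare the condition is.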

Summary

• Question 1: P(+) = .00599 + .00994 = .01593
• Question 2: two sequences of branches lead to a positive test; only 1 sequence represents people who have HIV.

P(person has HIV given that test is positive) = .00599/(.00599 + .00994) = .376

Recap

• We have a test with very high clinical reliabilities:
1. If a person has the virus, the test result will be positive with probability .999
2. If a person does not have the virus, the test result will be negative with probability .990
• But we have extremely poor performance when the test is positive:
P(person has HIV given that test is positive) = .376
• In other words, 62.4% of the positives are false positives! Why?
• When the characteristic the test is looking for is rare, most positives will be false.

Examples

1. P(A) = .3, P(B) = .4; if A and B are mutually exclusive events, then P(A ∩ B) = ?
A ∩ B = ∅, so P(A ∩ B) = 0
2. 15 entries in pie baking contest at state fair. Judge must determine 1st, 2nd, 3rd place winners. How many ways can the judge make the awards?
15P3 = 15 × 14 × 13 = 2730