Statistics and Data Analysis Part 3 – Probability

Mar 31, 2015

Calista Borden
Transcript
Page 1: Statistics and Data Analysis Part 3 – Probability.

Statistics and Data Analysis

Part 3 – Probability

Page 2: Statistics and Data Analysis Part 3 – Probability.

[Slide background graphics: Pie Chart of Percent vs Type (Pepperoni 21.8%, Plain 32.5%, Mushroom 16.2%, Sausage 5.8%, Pepper and Onion 7.3%, Mushroom and Onion 9.2%, Garlic 2.3%, Meatball 5.0%); Boxplot of Listing; Scatterplot of Listing vs IncomePC; Probability Plot of Listing, Normal - 95% CI (Mean 369687, StDev 156865, N 51, AD 0.994, P-Value 0.012); Histogram of Listing; Empirical CDF of Listing, Normal (Mean 369687, StDev 156865, N 51); Marginal Plot of Listing vs IncomePC]

Counting Rule for Probabilities

Probabilities for compounds of atomistic equally likely events are obtained by counting.

$$P(\text{Compound Event}) = \frac{\text{Number of Elementary Events in Compound Event}}{\text{Number of Elements in the Sample Space}}$$
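
As a minimal sketch of the counting rule, the Python below enumerates a standard 52-card deck and counts the elementary events that make up a compound event. The deck encoding and the helper name `prob` are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space: 52 equally likely cards, encoded as (rank, suit) pairs.
# This encoding is an assumption made for illustration.
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["spades", "clubs", "hearts", "diamonds"]
DECK = list(product(RANKS, SUITS))

def prob(event):
    """P(event) = (# elementary events in event) / (# elements in sample space)."""
    favorable = [card for card in DECK if event(card)]
    return Fraction(len(favorable), len(DECK))

p_ace = prob(lambda card: card[0] == "A")         # 4 of the 52 cards
p_heart = prob(lambda card: card[1] == "hearts")  # 13 of the 52 cards
print(p_ace, p_heart)
```

`Fraction` keeps the ratios exact, so 4/52 is reported in lowest terms rather than as a rounded decimal.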

9/52

Page 3: Statistics and Data Analysis Part 3 – Probability.

Compound Events

Products: 1 2 3 4 5 6 7 8

E = A random consumer’s random choice of exactly one product

Event(Fruity) = Event(berry #3) + Event(fruity #6) + Event(apple #8)

P(Fruity) = P(#3) + P(#6) + P(#8) = 1/8 + 1/8 + 1/8 = 3/8

P(Sweetened) = P(HoneyNut #2) + P(Frosted #7) = 1/8 + 1/8 = 1/4
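
The sums above can be sketched in Python by treating each compound event as a set of product numbers. The product numbers come from the slide; the variable names and the helper `prob` are assumptions for illustration.

```python
from fractions import Fraction

# Eight products, each equally likely to be the consumer's single choice.
p = {k: Fraction(1, 8) for k in range(1, 9)}

# Compound events as sets of product numbers taken from the slide.
fruity = {3, 6, 8}     # berry #3, fruity #6, apple #8
sweetened = {2, 7}     # HoneyNut #2, Frosted #7

def prob(event):
    """Add up the probabilities of the elementary events in a compound event."""
    return sum(p[k] for k in event)

p_fruity = prob(fruity)
p_sweetened = prob(sweetened)
print(p_fruity, p_sweetened)
```

Adding works here because the elementary events are mutually exclusive: the consumer picks exactly one product.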

10/52

Page 4: Statistics and Data Analysis Part 3 – Probability.

Applications: Games of Chance; Poker

In a 5 card hand from a deck of 52, there are (52*51*50*49*48)/(5*4*3*2*1) = 2,598,960 different possible hands. (Order doesn’t matter.)

How many of these hands have 4 aces? 48: the 4 aces plus any one of the remaining 48 cards.
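
A short Python check of these two counts, using only the standard library. The variable names are illustrative.

```python
import math

# Ordered deals of 5 cards, divided by the 5! orderings of each hand.
ordered_deals = 52 * 51 * 50 * 49 * 48
hands = ordered_deals // math.factorial(5)

# The same count written as the binomial coefficient "52 choose 5".
assert hands == math.comb(52, 5)

# Four aces: the aces are fixed, so only the fifth card can vary.
four_ace_hands = math.comb(4, 4) * math.comb(48, 1)
print(hands, four_ace_hands)
```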

20/52

Page 5: Statistics and Data Analysis Part 3 – Probability.

Probability of 4 Aces

$$P(\text{4 Aces}) = \frac{\text{Number of hands with 4 aces}}{\text{Number of hands with 5 cards}} = \frac{\binom{4}{4}\binom{48}{1}}{\binom{52}{5}} = \frac{1 \times 48}{2{,}598{,}960} = 0.000018469$$
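
The same ratio can be computed exactly in Python. The function `prob_aces` is a hypothetical helper that generalizes the slide's formula to any number of aces; only the four-ace case appears in the slides.

```python
from fractions import Fraction
from math import comb

def prob_aces(k, hand_size=5):
    """P(exactly k aces): choose k of the 4 aces and the rest from the 48 non-aces."""
    return Fraction(comb(4, k) * comb(48, hand_size - k), comb(52, hand_size))

p_four_aces = prob_aces(4)
print(p_four_aces, float(p_four_aces))

# Every hand holds 0, 1, 2, 3 or 4 aces, so these probabilities sum to 1.
assert sum(prob_aces(k) for k in range(5)) == 1
```

The closing assertion is a sanity check on the counting: if the numerators did not partition all 2,598,960 hands, the sum would not be exactly 1.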

21/52

Page 6: Statistics and Data Analysis Part 3 – Probability.

The Dead Man’s Hand

The dead man’s hand is 5 cards: 2 aces, 2 8’s and some other 5th card. (Wild Bill Hickok was holding this hand when he was shot in the back and killed in 1876.) The number of hands with two aces and two 8’s is

$$\binom{4}{2}\binom{4}{2} \times 44 = 1{,}584$$

The rest of the story claims that Hickok held all black cards (the bullets). The probability for this hand falls to only 44/2598960. (The four cards in the picture and one of the remaining 44.)

Some claims have been made about the 5th card, but no one is sure – there is no record.

http://en.wikipedia.org/wiki/Dead_man's_hand

22/52

Page 7: Statistics and Data Analysis Part 3 – Probability.

Counting the Dead Man’s Cards

$$\binom{4}{2} = \frac{4!}{2!\,(4-2)!} = 6$$

The Aces: There are 6 possible pairs out of [A♠ A♣ A♥ A♦]: (♠♣) (♠♥) (♠♦) (♣♥) (♣♦) (♥♦)

The 8’s: There are also 6 possible pairs out of [8♠ 8♣ 8♥ 8♦]: (♠♣) (♠♥) (♠♦) (♣♥) (♣♦) (♥♦)

There are 44 remaining cards in the deck that are not aces and not 8’s.

The total number of possible different hands is therefore 6(6)(44) = 1,584. If he held the bullets (black cards), then there are only (1)(1)(44) = 44 combinations. There is a claim that the 5th card was a diamond. This reduces the number of possible combinations to (1)(1)(11) = 11.
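
The three counts on this slide can be reproduced with a few lines of Python. This is a sketch; the variable names are assumptions, and the probabilities simply divide each count by the 2,598,960 possible hands.

```python
from math import comb

TOTAL_HANDS = comb(52, 5)

ace_pairs = comb(4, 2)        # ways to pick 2 of the 4 aces
eight_pairs = comb(4, 2)      # ways to pick 2 of the 4 eights
other_cards = 52 - 4 - 4      # cards that are neither an ace nor an 8
other_diamonds = 13 - 2       # diamonds other than the ace and the 8

any_suits = ace_pairs * eight_pairs * other_cards   # any two aces, any two 8's
black_only = 1 * 1 * other_cards                    # black aces and black 8's
black_plus_diamond = 1 * 1 * other_diamonds         # ... and a diamond 5th card

for label, count in [("any suits", any_suits),
                     ("black aces and 8's", black_only),
                     ("black plus diamond", black_plus_diamond)]:
    print(f"{label:<20}{count:>6}  {count / TOTAL_HANDS:.9f}")
```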

23/52

Page 8: Statistics and Data Analysis Part 3 – Probability.

Poker Hands

Royal Flush – Top 5 cards in a suit

Straight Flush – 5 sequential cards in the same suit

4 of a kind – plus any other card

Full House – 3 of one kind, 2 of another. (Also called a “boat.”)

24/52

Page 9: Statistics and Data Analysis Part 3 – Probability.

More Poker Hands

Flush – 5 cards in a suit, not sequential

Straight – 5 cards in a numerical row, not the same suit

3 of a kind plus two other cards

Two pairs plus one other card

25/52

Page 10: Statistics and Data Analysis Part 3 – Probability.

Still More Poker Hands

1 pair plus 3 other cards

High card: 5 cards, no pairs, mixed suits

26/52

Page 11: Statistics and Data Analysis Part 3 – Probability.

Probabilities of 5 Card Poker Hands

http://www.durangobill.com/Poker.html

Poker Hand                        Different Combinations    Probability    Odds Against
----------------------------------------------------------------------------------------
Royal Straight Flush                                   4    .0000015391    649,739:1
Other Straight Flush                                  36    .0000138517    72,192:1
Straight Flush (Royal or other)                       40    .0000153908    64,973:1
Four of a kind                                       624    .0002400960    4,164:1
Full House                                         3,744    .0014405762    693:1
Flush                                              5,108    .0019654015    508:1
Straight                                          10,200    .0039246468    254:1
Three of a kind                                   54,912    .0211284514    46:1
Two Pairs                                        123,552    .0475390156    20:1
One Pair                                       1,098,240    .4225690276    1.4:1
High card only (None of above)                 1,302,540    .5011773940    1:1
Total                                          2,598,960   1.0000000000
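
The combination counts in the table can be rederived by counting. The sketch below uses the standard counting argument for each hand; the formulas are textbook combinatorics rather than something shown on the slides, and the dictionary keys are only labels.

```python
from math import comb

TOTAL = comb(52, 5)
straight_flushes = 10 * 4   # 10 possible high cards x 4 suits, royal included

counts = {
    "Royal Straight Flush": 4,
    "Other Straight Flush": straight_flushes - 4,
    "Four of a kind": 13 * 48,                                  # rank, then 5th card
    "Full House": 13 * comb(4, 3) * 12 * comb(4, 2),            # triple, then pair
    "Flush": 4 * comb(13, 5) - straight_flushes,
    "Straight": 10 * 4**5 - straight_flushes,
    "Three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4**2,
    "Two Pairs": comb(13, 2) * comb(4, 2) ** 2 * 44,
    "One Pair": 13 * comb(4, 2) * comb(12, 3) * 4**3,
}
# Whatever is left matches none of the patterns above.
counts["High card only"] = TOTAL - sum(counts.values())

for hand, n in counts.items():
    print(f"{hand:<22}{n:>10,}  {n / TOTAL:.10f}")
```

The "Straight Flush (Royal or other)" row of the table is a subtotal of the first two rows, so it is left out of the dictionary to keep the total at 2,598,960.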

27/52

Page 12: Statistics and Data Analysis Part 3 – Probability.

Odds (Ratios)

$$\text{Odds in Favor} = \frac{\text{Prob(Event)}}{1-\text{Prob(Event)}}$$

$$\text{Odds Against} = \frac{1-\text{Prob(Event)}}{\text{Prob(Event)}}$$
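
A minimal Python sketch of the two ratios, applied to four of a kind (624 of the 2,598,960 hands). The function names are illustrative assumptions.

```python
from fractions import Fraction

def odds_in_favor(p):
    """Odds in favor = P(event) / (1 - P(event))."""
    return p / (1 - p)

def odds_against(p):
    """Odds against = (1 - P(event)) / P(event)."""
    return (1 - p) / p

p_four_of_a_kind = Fraction(624, 2598960)
against = odds_against(p_four_of_a_kind)
print(f"{against}:1 against")

# The two ratios are reciprocals of each other.
assert odds_in_favor(p_four_of_a_kind) * against == 1
```

An event with probability 1/2 has odds of exactly 1 in both directions, which is why a probability near .5 is often described as "1:1".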

28/52

Page 13: Statistics and Data Analysis Part 3 – Probability.

Odds vs. 5 Card Poker Hands

http://www.durangobill.com/Poker.html

Poker Hand                        Combinations    Probability    Odds Against
------------------------------------------------------------------------------
Royal Straight Flush                         4    .0000015391    649,739:1
Other Straight Flush                        36    .0000138517    72,192:1
Straight Flush (Royal or other)             40    .0000153908    64,973:1
Four of a kind                             624    .0002400960    4,164:1
Full House                               3,744    .0014405762    693:1
Flush                                    5,108    .0019654015    508:1
Straight                                10,200    .0039246468    254:1
Three of a kind                         54,912    .0211284514    46:1
Two Pairs                              123,552    .0475390156    20:1
One Pair                             1,098,240    .4225690276    1.4:1
High card only (None of above)       1,302,540    .5011773940    1:1
Total                                2,598,960   1.0000000000

29/52

Page 14: Statistics and Data Analysis Part 3 – Probability.

Joint Events

Pairs (or groups) of events: A and B

One or the other occurs: A or B ≡ A ∪ B

Both events occur: A and B ≡ A ∩ B

Independent events: Occurrence of A does not affect the probability of B

An addition rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

The product rule for independent events: P(A ∩ B) = P(A)P(B)

30/52

Page 15: Statistics and Data Analysis Part 3 – Probability.

Joint Events: Pick a Card, Any Card

Event A = Diamond: P(Diamond) = 13/52

2♦ 3♦ 4♦ 5♦ 6♦ 7♦ 8♦ 9♦ 10♦ J♦ Q♦ K♦ A♦

Event B = Ace: P(Ace) = 4/52

A♦ A♥ A♣ A♠

Event A or B = Diamond or Ace

P(Diamond or Ace) = P(Diamond) + P(Ace) – P(Diamond ∩ Ace) = 13/52 + 4/52 – 1/52 = 16/52
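
The addition rule can be checked by brute force with Python sets, where `|` is union and `&` is intersection. The deck encoding and the helper `prob` are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["♠", "♣", "♥", "♦"]
DECK = set(product(RANKS, SUITS))

diamonds = {card for card in DECK if card[1] == "♦"}
aces = {card for card in DECK if card[0] == "A"}

def prob(event):
    """Counting rule: size of the event over size of the sample space."""
    return Fraction(len(event), len(DECK))

direct = prob(diamonds | aces)                                      # count the union
by_rule = prob(diamonds) + prob(aces) - prob(diamonds & aces)       # addition rule
print(direct, by_rule)

# Suit and rank are independent here, so the product rule also holds.
assert prob(diamonds & aces) == prob(diamonds) * prob(aces)
```

Subtracting the intersection removes the ace of diamonds, which would otherwise be counted once as a diamond and once as an ace. Both results print as 4/13, the reduced form of 16/52.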

31/52

Page 16: Statistics and Data Analysis Part 3 – Probability.

Application

             Female            Male              Total
Uninsured    1144  (.04186)    1979  (.07242)    3123  (.11429)
Insured      11939 (.43691)    12264 (.44880)    24203 (.88571)
Total        13083 (.47877)    14243 (.52123)    27326 (1.00000)

Survey of 27326 German Individuals over 5 years. Each cell shows the frequency followed by the sample proportion in parentheses (black and red, respectively, on the slide). E.g., .04186 = 1144/27326, .52123 = 14243/27326

32/52

Page 17: Statistics and Data Analysis Part 3 – Probability.

The Addition Rule - Application

An individual is drawn randomly from the sample of 27,326 observations.

P(Female or Insured) = P(Female) + P(Insured) – P(Female and Insured) = .47877 + .88571 - .43691 = .92757

             Female            Male              Total
Uninsured    1144  (.04186)    1979  (.07242)    3123  (.11429)
Insured      11939 (.43691)    12264 (.44880)    24203 (.88571)
Total        13083 (.47877)    14243 (.52123)    27326 (1.00000)

Survey of 27326 German Individuals over 5 years (sample proportions in parentheses)
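
As a sketch, the addition rule can be applied to the raw counts instead of the rounded proportions. The dictionary layout and the helper `prob` are assumptions; the four cell counts are the ones in the table.

```python
from fractions import Fraction

# Cell counts from the survey table, keyed by (insurance status, sex).
counts = {
    ("Uninsured", "Female"): 1144, ("Uninsured", "Male"): 1979,
    ("Insured", "Female"): 11939, ("Insured", "Male"): 12264,
}
n = sum(counts.values())

def prob(pred):
    """Share of the sample whose (status, sex) cell satisfies pred."""
    return Fraction(sum(c for key, c in counts.items() if pred(*key)), n)

p_female = prob(lambda status, sex: sex == "Female")
p_insured = prob(lambda status, sex: status == "Insured")
p_both = prob(lambda status, sex: sex == "Female" and status == "Insured")

p_either = p_female + p_insured - p_both          # addition rule
direct = prob(lambda status, sex: sex == "Female" or status == "Insured")
print(float(p_either), float(direct))
```

Computed from the unrounded counts the result is about .92758; the slide's .92757 comes from adding proportions that were already rounded to five places. The same value is 1 minus the Male and Uninsured cell, 1979/27326.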

33/52

Page 18: Statistics and Data Analysis Part 3 – Probability.

Product Rule for Independent Events

If two events A and B are independent, the probability that both occur is P(A ∩ B) = P(A)P(B)

Example: I will fly to Washington (and back) for a meeting on Monday. I will use the train on Tuesday. P(Late if I fly) = .6. P(Late if I take the train) = .2. Late or on time for the two days are independent.

What is the probability that I will miss at least one meeting?
P(Late Monday, Not late on Tuesday) = .6(.8) = .48
P(Not late Monday, Late Tuesday) = .4(.2) = .08
P(Late Monday and Late Tuesday) = .6(.2) = .12
P(Late at least once) = .48 + .08 + .12 = .68
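
The travel example can be sketched in Python with exact fractions. The variable names are illustrative, and the complement check at the end is an added cross-check that is not on the slide.

```python
from fractions import Fraction

p_late_mon = Fraction(6, 10)   # P(Late if I fly)
p_late_tue = Fraction(2, 10)   # P(Late if I take the train)

# Product rule for independent events, one term per way of being late.
p_mon_only = p_late_mon * (1 - p_late_tue)
p_tue_only = (1 - p_late_mon) * p_late_tue
p_both = p_late_mon * p_late_tue

p_at_least_once = p_mon_only + p_tue_only + p_both
print(float(p_at_least_once))

# Cross-check: "late at least once" is the complement of "on time both days".
assert p_at_least_once == 1 - (1 - p_late_mon) * (1 - p_late_tue)
```

The complement route needs only one product, .4(.8) = .32, which is often the quicker way to get an "at least one" probability.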

34/52

Page 19: Statistics and Data Analysis Part 3 – Probability.

Joint Events and Joint Probabilities

Marginal probability = Probability for each event, without considering the other.

Joint probability = Probability that two (or several) events happen at the same time.

35/52

Page 20: Statistics and Data Analysis Part 3 – Probability.

Marginal and Joint Probabilities

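             Female            Male              Total
Uninsured    1144  (.04186)    1979  (.07242)    3123  (.11429)
Insured      11939 (.43691)    12264 (.44880)    24203 (.88571)
Total        13083 (.47877)    14243 (.52123)    27326 (1.00000)

Each cell shows the count and, in parentheses, its proportion of the 27,326 individuals.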

Survey of 27,326 German individuals over 5 years. Consider drawing an individual at random from the sample.

Marginal probabilities: P(Male) = .52123, P(Insured) = .88571

Joint probability: P(Male and Insured) = .44880
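As a rough sketch of how these numbers come out of the counts, the following divides each cell by the grand total to get joint probabilities and then sums over the other variable to get marginals. The nested-dictionary layout and variable names are assumptions made for this illustration.

```python
# Counts from the survey table; layout assumed here as counts[insurance_status][sex].
counts = {
    "Uninsured": {"Female": 1144, "Male": 1979},
    "Insured": {"Female": 11939, "Male": 12264},
}

n = sum(sum(row.values()) for row in counts.values())  # grand total of individuals

# Joint probability: one cell divided by the grand total.
joint = {(s, g): c / n for s, row in counts.items() for g, c in row.items()}

# Marginal probability: sum the joint probabilities over the other variable.
p_sex = {}
p_status = {}
for (s, g), p in joint.items():
    p_sex[g] = p_sex.get(g, 0.0) + p
    p_status[s] = p_status.get(s, 0.0) + p

print(n)
print(round(joint[("Insured", "Male")], 5))
print(round(p_sex["Male"], 5), round(p_status["Insured"], 5))
```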

36/52

Page 21: Statistics and Data Analysis Part 3 – Probability.

Conditional Probability

“Conditional event” = occurrence of an event given that some other event has occurred.

Conditional probability = Probability of an event given that some other event is certain to occur. Denoted P(A|B) = probability that A will occur given that B occurred.

P(A|B) = P(A and B) / P(B), defined when P(B) > 0.

37/52

Page 22: Statistics and Data Analysis Part 3 – Probability.

Conditional Probabilities

Company ESI sells two types of software, Basic and Advanced, to two markets, Government and Academic. Sales occur with the following probabilities:

            Academic   Government   Total
Basic          .4          .2         .6
Advanced       .3          .1         .4
Total          .7          .3        1.0

P(Basic | Academic) = .4 / .7 = .571

P(Government | Advanced) = .1 / .4 = .25
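A minimal sketch of the same calculation in code: a conditional probability is a joint probability divided by the marginal probability of the conditioning event. The helper names `marginal` and `conditional` are hypothetical, chosen only for this illustration.

```python
# Joint probabilities from the ESI table, keyed as (product, market).
joint = {
    ("Basic", "Academic"): 0.4,
    ("Basic", "Government"): 0.2,
    ("Advanced", "Academic"): 0.3,
    ("Advanced", "Government"): 0.1,
}

def marginal(joint, index, value):
    """Sum the joint probabilities whose key has `value` at position `index`."""
    return sum(p for key, p in joint.items() if key[index] == value)

def conditional(joint, key, given_index):
    """P(cell | event at `given_index`) = joint probability / marginal of the given event."""
    return joint[key] / marginal(joint, given_index, key[given_index])

# P(Basic | Academic): condition on the market (position 1 of the key).
p_basic_given_academic = conditional(joint, ("Basic", "Academic"), given_index=1)
# P(Government | Advanced): condition on the product (position 0 of the key).
p_govt_given_advanced = conditional(joint, ("Advanced", "Government"), given_index=0)
print(round(p_basic_given_academic, 3), round(p_govt_given_advanced, 2))
```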

38/52

Page 23: Statistics and Data Analysis Part 3 – Probability.

Conditional Probabilities

P(Uninsured|Female) = P(Uninsured and Female) / P(Female) = .04186 / .47877 = .08743

P(Male|Insured) = P(Male and Insured) / P(Insured) = .44880 / .88571 = .50671

An individual is drawn randomly from the sample of 27,326 individuals in the German socioeconomic panel.
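Another way to read these conditional probabilities, sketched here under the assumption that the counts from the survey table are available, is to restrict attention to the conditioning group and count within it. The dictionary layout is an assumption for this illustration.

```python
# Counts from the survey table; layout assumed here as counts[insurance_status][sex].
counts = {
    "Uninsured": {"Female": 1144, "Male": 1979},
    "Insured": {"Female": 11939, "Male": 12264},
}

# P(Uninsured | Female): look only at the females, then count the uninsured among them.
n_female = sum(row["Female"] for row in counts.values())
p_uninsured_given_female = counts["Uninsured"]["Female"] / n_female

# P(Male | Insured): look only at the insured, then count the males among them.
n_insured = sum(counts["Insured"].values())
p_male_given_insured = counts["Insured"]["Male"] / n_insured

print(round(p_uninsured_given_female, 4), round(p_male_given_insured, 4))
```

Working from the counts avoids the rounding already present in the five-decimal proportions, so the last digit may differ slightly from a ratio of the rounded values.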

39/52

Page 24: Statistics and Data Analysis Part 3 – Probability.

The Product Rule for Conditional Probabilities

For events A and B, P(A and B) = P(A|B)P(B).

Example: You draw a card from a well-shuffled deck of cards, then a second one. What is the probability that the two cards will be a pair?

There are 13 ranks. Let A1 be the card on the first draw and A2 be the second one. Then, P(A1 and A2) = P(A1)P(A2|A1).

For a pair of kings, P(K1) = 4/52 = 1/13. P(K2|K1) = 3/51. P(K1 and K2) = (1/13)(3/51). There are 13 possible pairs, so P(Pair) = 13(1/13)(3/51) = 1/17.
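The product-rule answer can be cross-checked by brute-force counting. This is a sketch only: it builds a standard 52-card deck (the rank and suit labels are arbitrary), lists every ordered two-card draw without replacement, and counts the draws where both cards share a rank.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# Every ordered (first card, second card) draw without replacement is equally likely.
draws = list(permutations(deck, 2))
pairs = [d for d in draws if d[0][0] == d[1][0]]  # same rank on both draws

p_pair_counted = Fraction(len(pairs), len(draws))
# Product rule from the text: 13 ranks, each with probability (1/13)(3/51).
p_pair_product_rule = 13 * Fraction(1, 13) * Fraction(3, 51)
print(p_pair_counted, p_pair_product_rule)
```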

40/52

Page 25: Statistics and Data Analysis Part 3 – Probability.

Independent Events

Events are independent if the occurrence of one does not affect probabilities related to the other.

Events A and B are independent if P(A|B) = P(A). I.e., conditioning on B does not affect the probability of A.

41/52

Page 26: Statistics and Data Analysis Part 3 – Probability.

Independent Events? Pick a Card, Any Card

P(Red card drawn) = 26/52 = 1/2
P(Ace drawn) = 4/52 = 1/13
P(Ace|Red) = (2/52) / (26/52) = 1/13
P(Ace) = P(Ace|Red), so “Red Card” and “Ace” are independent.
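The card example can be verified by counting over a deck. The sketch below assumes a standard 52-card deck with diamonds and hearts as the red suits; the helper `prob` and the predicate names are invented for this illustration.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

def prob(event):
    """Probability of an event when each card is equally likely: favorable / total."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_red(card):
    return card[1] in ("diamonds", "hearts")

def is_ace(card):
    return card[0] == "A"

p_red = prob(is_red)
p_ace = prob(is_ace)
p_ace_and_red = prob(lambda card: is_ace(card) and is_red(card))

# Conditional probability, then the independence check P(Ace|Red) == P(Ace).
p_ace_given_red = p_ace_and_red / p_red
independent = p_ace_given_red == p_ace
print(p_red, p_ace, p_ace_given_red, independent)
```

Exact fractions are used so the equality test is not affected by floating-point rounding.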

42/52

Page 27: Statistics and Data Analysis Part 3 – Probability.

Independent Events?

Company ESI sells two types of software, Basic and Advanced, to two markets, Government and Academic. Sales occur randomly with the following probabilities:

            Academic   Government   Total
Basic          .4          .2         .6
Advanced       .3          .1         .4
Total          .7          .3        1.0

P(Basic | Academic) = .4 / .7 = .571, not equal to P(Basic) = .6

P(Government | Advanced) = .1 / .4 = .25, not equal to P(Govt) = .3

So product type and market are not independent.
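An equivalent independence check, sketched below, compares each joint probability with the product of its two marginals; under independence every cell would satisfy P(A and B) = P(A)P(B). Variable names are illustrative, and a small tolerance is used because the values are floating-point numbers.

```python
import math

# Joint probabilities from the ESI table, keyed as (product, market).
joint = {
    ("Basic", "Academic"): 0.4,
    ("Basic", "Government"): 0.2,
    ("Advanced", "Academic"): 0.3,
    ("Advanced", "Government"): 0.1,
}

products = sorted({key[0] for key in joint})
markets = sorted({key[1] for key in joint})

# Marginals: sum the joint probabilities across the other variable.
p_product = {a: sum(joint[(a, m)] for m in markets) for a in products}
p_market = {m: sum(joint[(a, m)] for a in products) for m in markets}

# Independence would require joint == product of marginals in every cell.
independent = all(
    math.isclose(joint[(a, m)], p_product[a] * p_market[m], abs_tol=1e-9)
    for a in products
    for m in markets
)

for a in products:
    for m in markets:
        print(a, m, joint[(a, m)], round(p_product[a] * p_market[m], 2))
print(independent)
```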

43/52

Page 28: Statistics and Data Analysis Part 3 – Probability.

Litigation Risk Analysis

http://www.jenkens.com/Image/Jenkens/Content/The Decision Tree.pdf#search=%22%22litigation risk%22%2Bgilchrist%22

[Figure: decision tree. Branch labels: Decision; P(Outcome | Decision); P(Result | Outcome, Decision=L)]

44/52

Page 29: Statistics and Data Analysis Part 3 – Probability.

Litigation Risk

If we decide to LITIGATE, the probability we will PREVAIL and FIND ASSET is

P(Prevail,Find Asset) = P(Find Asset|Prevail) P(Prevail) = .5 * .5 = .25.

45/52

Page 30: Statistics and Data Analysis Part 3 – Probability.

Litigation Risk Analysis: Using Probabilities to Determine a Strategy

Two paths to a favorable outcome. Probability = (upper) .7(.6)(.4) + (lower) .5(.3)(.6) = .168 + .09 = .258.

How can I use this to decide whether to litigate or not?
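The two-path calculation follows a general pattern: multiply the branch probabilities along each path (product rule), then add across mutually exclusive paths (addition rule). A small sketch, using only the branch values quoted on the slide and illustrative path names:

```python
import math

# Branch probabilities along each path to a favorable outcome, as quoted on the slide.
paths = {
    "upper": [0.7, 0.6, 0.4],
    "lower": [0.5, 0.3, 0.6],
}

# Product rule along a path; addition rule across the mutually exclusive paths.
path_probs = {name: math.prod(branches) for name, branches in paths.items()}
p_favorable = sum(path_probs.values())

print({name: round(p, 3) for name, p in path_probs.items()}, round(p_favorable, 3))
```

Turning this probability into a litigate-or-settle decision would also need the payoffs and costs attached to each outcome, which are not given in the text here.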

46/52

Page 31: Statistics and Data Analysis Part 3 – Probability.

Using Conditional Probabilities: Bayes Theorem

Target: P(A|B) = P(A,B) / P(B)

Theorem: P(A|B) = P(B|A)P(A) / P(B)

Definition: P(A|B) = P(B|A)P(A) / [P(A,B) + P(notA,B)]

Computation: P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|notA)P(notA)]

47/52

Page 32: Statistics and Data Analysis Part 3 – Probability.

Using Bayes Theorem

If I choose a cookie from Bowl #1, the probability it is chocolate chip is P(CC|#1) = P(CC and #1)/P(#1) = .125 / .5 = .250 = 1/4

If you give me a chocolate chip cookie, what is the probability it came from Bowl #1? P(#1|CC) = P(CC|#1)P(#1)/P(CC) = (1/4)(1/2)/(3/8) = 1/3

Example from http://en.wikipedia.org/wiki/Bayes'_theorem
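The cookie calculation can be reproduced with exact fractions. This sketch uses only the three probabilities stated in the example; the variable names are illustrative, and the last step assumes there are exactly two bowls.

```python
from fractions import Fraction

# Values stated in the example.
p_cc_given_bowl1 = Fraction(1, 4)  # P(CC | #1)
p_bowl1 = Fraction(1, 2)           # P(#1)
p_cc = Fraction(3, 8)              # P(CC)

# Bayes theorem: P(#1 | CC) = P(CC | #1) P(#1) / P(CC)
p_bowl1_given_cc = p_cc_given_bowl1 * p_bowl1 / p_cc

# Complement rule, assuming the cookie can only have come from Bowl #1 or Bowl #2.
p_bowl2_given_cc = 1 - p_bowl1_given_cc
print(p_bowl1_given_cc, p_bowl2_given_cc)
```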

48/52

Page 33: Statistics and Data Analysis Part 3 – Probability.

Drug Testing

Data
P(Test correctly indicates disease) = .98 (Sensitivity)
P(Test correctly indicates absence) = .95 (Specificity)
P(Disease) = .005 (Fairly rare)

Notation
+ = test indicates disease, – = test indicates no disease
D = presence of disease, N = absence of disease

Data in this notation:
P(D) = .005 (Incidence of the disease)
P(+|D) = .98 (Correct detection of the disease)
P(–|N) = .95 (Correct failure to detect the disease)

What are P(D|+) and P(N|–)? Note, P(D|+) = the probability that a patient actually has the disease when the test says they do.

49/52

Page 34: Statistics and Data Analysis Part 3 – Probability.

More Information

Deduce: Since P(+|D) = .98, we know P(–|D) = .02 because P(–|D) + P(+|D) = 1. [P(–|D) is P(False negative).]

Deduce: Since P(–|N) = .95, we know P(+|N) = .05 because P(–|N) + P(+|N) = 1. [P(+|N) is P(False positive).]

Deduce: Since P(D) = .005, P(N) = .995 because P(D) + P(N) = 1.

50/52

Page 35: Statistics and Data Analysis Part 3 – Probability.

Now, Use Bayes Theorem

We have P(+|D)=.98. What is P(D|+)?

P(D|+) = P(D and +) / P(+) = P(+|D)P(D) / P(+) (by Bayes Theorem)

P(+) = P(D and +) + P(N and +) = P(+|D)P(D) + P(+|N)P(N), so

P(D|+) = P(+|D)P(D) / P(+)
       = P(+|D)P(D) / [P(+|D)P(D) + P(+|N)P(N)]
       = .98(.005) / [.98(.005) + .05(.995)]
       = 0.08966 (!!)

Using the same approach, P(N|–) = P(–|N)P(N) / [P(–|N)P(N) + P(–|D)P(D)] = .95(.995) / [.95(.995) + .02(.005)] = 0.999894
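Both posterior probabilities can be computed with one small helper. This is a sketch under the slide's stated sensitivity, specificity, and incidence; the function name `posterior` and its parameter names are hypothetical, introduced only for this illustration.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Bayes theorem for an event A and evidence B.

    prior = P(A), likelihood = P(B|A), likelihood_if_not = P(B|not A).
    Returns P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|not A)P(not A)].
    """
    numerator = likelihood * prior
    return numerator / (numerator + likelihood_if_not * (1.0 - prior))

sensitivity = 0.98  # P(+|D)
specificity = 0.95  # P(-|N)
p_disease = 0.005   # P(D)

# P(D|+): A = disease, B = positive test; false positive rate is 1 - specificity.
p_d_given_pos = posterior(p_disease, sensitivity, 1.0 - specificity)

# P(N|-): A = no disease, B = negative test; false negative rate is 1 - sensitivity.
p_n_given_neg = posterior(1.0 - p_disease, specificity, 1.0 - sensitivity)

print(round(p_d_given_pos, 5), round(p_n_given_neg, 6))
```

Raising `p_disease` in this sketch shows how strongly P(D|+) depends on how rare the disease is, which is the point of the example.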

51/52

Page 36: Statistics and Data Analysis Part 3 – Probability.

Summary

Randomness and decision making

Probability
  Sources
  Basic mathematics (the axioms)
  Simple and compound events and constructing probabilities
  Joint events
  Independence
  Addition and product rules for probabilities
  Conditional probabilities and Bayes theorem

52/52