M&Ms Two-way Tables Ellen Gundlach STAT 301 Course Coordinator Purdue University

Dec 14, 2015


Adriel Goulding

M&Ms Two-way Tables

Ellen Gundlach

STAT 301 Course Coordinator

Purdue University


M&Ms Color Distribution % according to their website

                      Brown  Yellow  Red  Blue  Orange  Green
Plain                    13      14   13    24      20     16
Peanut                   12      15   12    23      23     15
Peanut Butter/Almond     10      20   10    20      20     20


Skittles Color Distribution % according to their hotline

          Red  Orange  Yellow  Green  Purple
Skittles   20      20      20     20      20


My M&Ms data in counts

        Brown  Yellow  Red  Blue  Orange  Green  Total
Plain      14      10   10     8       4      8     54
Peanut      2       3    5     0       8      4     22
Total      16      13   15     8      12     12     76


My M&Ms data: joint % (divide counts by total = 76)

        Brown  Yellow   Red  Blue  Orange  Green
Plain    18.4    13.2  13.2  10.5     5.3   10.5
Peanut    2.6     3.9   6.6     0    10.5    5.3
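
As a minimal sketch of the "divide counts by total" step, the joint percentages can be recomputed from the counts on the previous slide. The variable names (`colors`, `counts`, `joint_pct`) are illustrative only and do not come from the slides.

```python
# Counts from the "My M&Ms data in counts" slide (rows: flavor, columns: color).
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
counts = {
    "Plain": [14, 10, 10, 8, 4, 8],
    "Peanut": [2, 3, 5, 0, 8, 4],
}

# Grand total of all candies in the table (76 on the slide).
grand_total = sum(sum(row) for row in counts.values())

# Joint % = cell count / grand total * 100, rounded to one decimal place.
joint_pct = {
    flavor: [round(100 * c / grand_total, 1) for c in row]
    for flavor, row in counts.items()
}

for flavor, row in joint_pct.items():
    print(flavor, dict(zip(colors, row)))
```

Each printed row should match the corresponding row of the joint % table.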


My M&Ms data: marginal %s for color

(add down the columns)

                 Brown  Yellow   Red  Blue  Orange  Green  Total
Plain             18.4    13.2  13.2  10.5     5.3   10.5
Peanut             2.6     3.9   6.6     0    10.5    5.3
Marg. for color   21.0    17.1  19.8  10.5    15.8   15.8    100


My M&Ms data: marginal %s for flavor

(add across the rows)

        Brown  Yellow   Red  Blue  Orange  Green  Marg. for flavor
Plain    18.4    13.2  13.2  10.5     5.3   10.5              71.1
Peanut    2.6     3.9   6.6     0    10.5    5.3              28.9
Total                                                          100


My M&Ms data: joint and marginal %s

                 Brown  Yellow   Red  Blue  Orange  Green  Marg. for flavor
Plain             18.4    13.2  13.2  10.5     5.3   10.5              71.1
Peanut             2.6     3.9   6.6     0    10.5    5.3              28.9
Marg. for color   21.0    17.1  19.8  10.5    15.8   15.8               100
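
The marginal percentages can also be computed directly from the counts rather than by adding the rounded joint percentages. The sketch below assumes the counts from the "My M&Ms data in counts" slide; the names are illustrative. Working from the counts gives 21.1 for brown and 19.7 for red, whereas adding the already-rounded joint percentages, as the slides do, gives 21.0 and 19.8. The difference is only rounding.

```python
# Counts from the "My M&Ms data in counts" slide (rows: flavor, columns: color).
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
counts = {
    "Plain": [14, 10, 10, 8, 4, 8],
    "Peanut": [2, 3, 5, 0, 8, 4],
}
grand_total = sum(sum(row) for row in counts.values())

# Marginal % for flavor: add across each row, then divide by the grand total.
flavor_marginal = {
    flavor: round(100 * sum(row) / grand_total, 1)
    for flavor, row in counts.items()
}

# Marginal % for color: add down each column, then divide by the grand total.
column_totals = [sum(col) for col in zip(*counts.values())]
color_marginal = {
    color: round(100 * total / grand_total, 1)
    for color, total in zip(colors, column_totals)
}

print(flavor_marginal)
print(color_marginal)
```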


Conditional distribution of flavor for color

• We know the color of our M&M already, but now how is flavor distributed for this color?

$$\frac{\text{joint \% of color and flavor}}{\text{marginal \% of color}}$$


Conditional distribution example

• We know we have a red M&M, so what is the probability it is a plain M&M?

$$\frac{\text{joint \% of red and plain}}{\text{marginal \% of red}} = \frac{13.2}{19.8} = 66.7\%$$


Conditional distribution of color for flavor

• We know the flavor of our M&M already, but now how is color distributed for this flavor?

$$\frac{\text{joint \% of color and flavor}}{\text{marginal \% of flavor}}$$


Conditional distribution example

• We know we have a peanut M&M, so what is the probability it is green?

$$\frac{\text{joint \% of peanut and green}}{\text{marginal \% of peanut}} = \frac{5.3}{28.9} = 18.3\%$$


Conditional distributions in general

Conditional distribution of X for Y (we know Y for sure already, but we want to know the probability or % of having X be true as well):

$$\frac{\text{joint \% of X and Y}}{\text{marginal \% of Y (what we know for sure)}}$$
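
The general rule can be sketched in code. Because the grand total cancels in joint % / marginal %, the same ratio can be taken directly from the counts, which avoids rounding twice. The function names below are hypothetical, and the data are the counts from the "My M&Ms data in counts" slide.

```python
# Counts from the "My M&Ms data in counts" slide (rows: flavor, columns: color).
colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
counts = {
    "Plain": [14, 10, 10, 8, 4, 8],
    "Peanut": [2, 3, 5, 0, 8, 4],
}


def conditional_given_color(color):
    """Distribution of flavor (in %) among candies of one known color."""
    j = colors.index(color)
    column_total = sum(row[j] for row in counts.values())
    return {
        flavor: round(100 * row[j] / column_total, 1)
        for flavor, row in counts.items()
    }


def conditional_given_flavor(flavor):
    """Distribution of color (in %) among candies of one known flavor."""
    row = counts[flavor]
    row_total = sum(row)
    return {color: round(100 * c / row_total, 1) for color, c in zip(colors, row)}


print(conditional_given_color("Red"))      # flavor split among red candies
print(conditional_given_flavor("Peanut"))  # color split among peanut candies
```

From the counts, plain given red is 10/15 = 66.7%, matching the slide. Green given peanut is 4/22 = 18.2%; the 18.3% on the slide comes from dividing the rounded percentages 5.3 and 28.9.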


Bar graphs for conditional distribution of color for both flavors

[Figure: two bar graphs, each with cases weighted by percentages and a vertical axis labeled "Percent".
First panel: "Conditional distribution of color for Milk Chocolate M&Ms"; horizontal axis "color for milk chocolate M&Ms" with categories blue, brown, green, orange, red, yellow; vertical axis from 0 to 30.
Second panel: "Conditional distribution of color for Peanut M&Ms"; horizontal axis "color for peanut M&Ms" with categories brown, green, orange, red, yellow; vertical axis from 0 to 40.]


Chi-squared hypothesis test

H0: There is no association between color distribution and flavor for M&Ms.

Ha: There is an association between color distribution and flavor for M&Ms.

Use α = 0.01 for this story.


Full-class M&Ms data in counts (large sample size necessary for test)

        Brown  Yellow  Red  Blue  Orange  Green
Plain     147     302  264   407     330    373
Peanut     69     110   70   162     148    123


Chi-squared test SPSS results

Chi-Square Tests

                     Value       df  Asymp. Sig. (2-sided)
Pearson Chi-Square   14.396(a)    5  .013
Likelihood Ratio     14.623       5  .012
N of Valid Cases     2505

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 58.81.


Chi-squared test conclusions

• Test statistic = 14.396 and P-value = 0.013

• Since the P-value is > our α of 0.01, we do not reject H0.

• We do not have enough evidence to say there is an association between color distribution and flavor for M&Ms.
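
The Pearson chi-squared statistic reported by SPSS can be reproduced by hand from the full-class counts. The sketch below is not the SPSS procedure itself, only the textbook formula (sum of (observed - expected)² / expected, with expected = row total × column total / n); the function name is hypothetical. The critical value 15.086 is the standard chi-squared table value for df = 5 and upper-tail area 0.01.

```python
def chi_square_statistic(table):
    """Return (statistic, df, expected counts) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under no association: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Full-class counts; columns are Brown, Yellow, Red, Blue, Orange, Green.
full_class = [
    [147, 302, 264, 407, 330, 373],  # Plain
    [69, 110, 70, 162, 148, 123],    # Peanut
]

statistic, df, expected = chi_square_statistic(full_class)
min_expected = min(min(row) for row in expected)

critical_value = 15.086  # chi-squared table value for df = 5, alpha = 0.01
reject_h0 = statistic > critical_value

print(round(statistic, 3), df, round(min_expected, 2), reject_h0)
```

Comparing the statistic with the critical value leads to the same decision as comparing the P-value with α: the statistic falls below 15.086, so H0 is not rejected.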


Skittles vs. M&Ms

• Now we will compare the proportion of yellow candies for Skittles and for M&Ms.

• The previous two-way table with plain and peanut M&Ms was of size 2 x 6.

• This table will be of size 2x2 because we only care about whether a candy is yellow or non-yellow.


Full-class M&Ms and Skittles data in counts (large sample size necessary for test)

            Yellow  Non-Yellow  Total
Plain M&Ms     302        1521   1823
Skittles       361        1351   1712
Total          663        2872   3535


Chi-squared hypothesis test

H0: There is no association between color distribution and type of candy for these candies.

Ha: There is an association between color distribution and type of candy for these candies.

Use α = 0.01 for this story.


Chi-squared test SPSS results

Chi-Square Tests

                           Value       df  Asymp. Sig.  Exact Sig.  Exact Sig.
                                           (2-sided)    (2-sided)   (1-sided)
Pearson Chi-Square         11.839(b)    1  .001
Continuity Correction(a)   11.544       1  .001
Likelihood Ratio           11.840       1  .001
Fisher's Exact Test                                     .001        .000
N of Valid Cases           3535

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 321.09.


Chi-squared test conclusions

• Test statistic = 11.839 and P-value = 0.001

• Since the P-value is < our α of 0.01, we reject H0.

• We have evidence that there is an association between color distribution and type of candy for these candies.
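
For a 2x2 table the Pearson statistic has 1 degree of freedom, and in that case the upper-tail chi-squared probability has a closed form, erfc(sqrt(x / 2)), so the P-value can be sketched with the Python standard library alone. This is an illustrative calculation on the yellow/non-yellow counts, not a reproduction of the SPSS internals; variable names are assumptions.

```python
import math

# Rows: Plain M&Ms, Skittles; columns: Yellow, Non-Yellow.
observed = [
    [302, 1521],
    [361, 1351],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-squared: sum of (observed - expected)^2 / expected over all cells.
statistic = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        statistic += (obs - exp) ** 2 / exp

# With df = 1, P(chi-squared > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(round(statistic, 3), round(p_value, 4))
```

The unrounded P-value is a little below 0.001, which SPSS displays as .001 and which is well under α = 0.01.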


Another way to do this test

Since this is a 2x2 table, and if we are only interested in a 2-sided (≠) hypothesis test, we can use the 2-sample proportions test here.


2-sample proportion test hypotheses

H0: pM&Ms = pSkittles

Ha: pM&Ms ≠ pSkittles


Defining the proportions

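$$p_{\text{M\&Ms}} = \frac{\#\ \text{yellow M\&Ms}}{\text{total}\ \#\ \text{M\&Ms}}$$

$$p_{\text{Skittles}} = \frac{\#\ \text{yellow Skittles}}{\text{total}\ \#\ \text{Skittles}}$$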


Test statistic

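$$Z = \frac{\hat{p}_{\text{M\&Ms}} - \hat{p}_{\text{Skittles}}}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_{\text{M\&Ms}}} + \dfrac{1}{n_{\text{Skittles}}}\right)}}$$

where $\hat{p}$ (no subscript) is the pooled sample proportion of yellow candies in both samples combined.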


Results from the proportion test

• Sample proportions: $\hat{p}_{\text{M\&Ms}} = 0.166$ and $\hat{p}_{\text{Skittles}} = 0.211$

• Test statistic Z = -3.44

• P-value = 2(0.0003) = 0.0006

• Since the P-value < our α of 0.01, we reject H0.
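
These results can be checked with a short sketch of the pooled two-sample proportion Z test, using the yellow counts and totals from the 2x2 table (302 of 1823 plain M&Ms, 361 of 1712 Skittles). The function name and return layout are hypothetical; the two-sided P-value uses the standard normal identity P(|Z| > |z|) = erfc(|z| / sqrt(2)).

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample Z test for H0: p1 = p2 against Ha: p1 != p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: all successes over all observations.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1_hat, p2_hat, z, p_value


# Yellow counts and sample sizes: plain M&Ms first, Skittles second.
p_mms, p_skittles, z, p_value = two_proportion_z_test(302, 1823, 361, 1712)

print(round(p_mms, 3), round(p_skittles, 3), round(z, 2), round(p_value, 4))
```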


Conclusion to the proportion test

• We have evidence the proportion of yellow M&Ms is not the same as the proportion of yellow Skittles.

• In other words, the type of candy makes a difference to the color distribution.


How do our results from the 2 tests compare?

• The X² test statistic = 11.839, which is actually the square of the Z test statistic: (-3.44)² ≈ 11.83 (the small difference comes from rounding Z).

• If you take into account the rounding, the P-values for both tests are 0.001.

• We rejected H0 in both tests.
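
The agreement between the two tests is not a coincidence: for a 2x2 table, the square of the pooled Z statistic equals the Pearson chi-squared statistic (without continuity correction). A minimal numerical check on the same counts, with illustrative variable names:

```python
import math

yellow = [302, 361]    # Plain M&Ms, Skittles
totals = [1823, 1712]  # sample sizes in the same order

# Pooled two-sample Z statistic.
pooled = sum(yellow) / sum(totals)
standard_error = math.sqrt(pooled * (1 - pooled) * (1 / totals[0] + 1 / totals[1]))
z = (yellow[0] / totals[0] - yellow[1] / totals[1]) / standard_error

# Pearson chi-squared statistic on the matching 2x2 table.
observed = [[y, n - y] for y, n in zip(yellow, totals)]
col_totals = [sum(col) for col in zip(*observed)]
n_all = sum(totals)
chi_squared = sum(
    (obs - r * c / n_all) ** 2 / (r * c / n_all)
    for row, r in zip(observed, totals)
    for obs, c in zip(row, col_totals)
)

print(round(z ** 2, 3), round(chi_squared, 3))
```

Both printed values should agree, which is why the two tests give the same two-sided P-value.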


When do you use which test?

• Chi-squared tests are best for:
  - two-sided hypothesis tests only
  - 2x2 or bigger tables

• Proportion (Z) tests are best for:
  - one- or two-sided hypothesis tests
  - only 2x2 tables