Chi-Square Test

Chi-Square Test. Most of the previous techniques presented so far have been for NUMERICAL data. So, what do we do if the data is CATEGORICAL? Ex: Information.

Dec 16, 2015


Isabel Surgent
Page 1

Chi-Square Test

Page 2

Most of the previous techniques presented so far have been for NUMERICAL data.

So, what do we do if the data is CATEGORICAL?

Ex: Information gathered on gender, political party, college major, etc.

Page 3

Categorical Variables

Based on observations

Univariate – single categorical variable

Example: Sample 100 people & ask if they agree or disagree with a question.

Bivariate – uses two categorical variables

Example: Sample 100 people & ask if they are male/female and what political party they support.

Page 4

One-Way Frequency Table - Univariate

Data

Democrat     Democrat     Democrat     Independent
Republican   Democrat     Republican   Independent
Republican   Republican   Republican   Republican

Horizontal One-Way Table

         Democrat   Republican   Independent
Freq.    4          6            2

Vertical One-Way Table

               Freq.
Democrat       4
Republican     6
Independent    2
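The tally behind a one-way frequency table can be sketched in a few lines of Python. This is only an illustration: the variable names `responses` and `freq` are assumptions made for the example, and the list simply retypes the 12 responses from the Data block.

```python
from collections import Counter

# The 12 responses from the Data block, in reading order.
responses = [
    "Democrat", "Democrat", "Democrat", "Independent",
    "Republican", "Democrat", "Republican", "Independent",
    "Republican", "Republican", "Republican", "Republican",
]

# Counter tallies how many times each category appears.
freq = Counter(responses)

# Print the vertical one-way table, one category per row.
print(f"{'':<14}Freq.")
for party in ("Democrat", "Republican", "Independent"):
    print(f"{party:<14}{freq[party]}")
```

The counts come out as 4, 6, and 2, matching both versions of the table.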

Page 5

Goodness of Fit Test

Used to measure the extent to which the observed counts differ from the expected counts.

k = # categories of a categorical variable

df = k – 1

Test Statistic:

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
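As a rough sketch, the test statistic and degrees of freedom translate directly into code. The function names `chi_square_statistic` and `degrees_of_freedom` are made up for this illustration, and the counts at the bottom are hypothetical numbers, not data from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(observed):
    """df = k - 1, where k is the number of categories."""
    return len(observed) - 1


# Hypothetical counts (NOT from the slides): 60 observations, 3 categories,
# with equal expected counts of 20 each.
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi_square_statistic(observed, expected)
df = degrees_of_freedom(observed)
print(f"chi-square = {stat:.2f}, df = {df}")
```

Each category contributes its own squared difference scaled by the expected count, so a category with a small expected count can dominate the total.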

Page 6

How Does a Hypothesis Test for Chi-Square Work?

The idea of the chi-square goodness-of-fit test is this: we compare the observed counts from our sample with the counts that would be expected if the null hypothesis were true.

The more the observed counts differ from the expected counts, the more evidence we have AGAINST the null hypothesis.

Page 7

Assumptions

1. Observed values are based on random samples

2. Sample size is large – each cell count is at least 5. (All cells

Page 8

Hypotheses

H0: State each proportion’s hypothesized value.

HA: At least 1 of the proportions differs from its hypothesized value.

Page 9

It uses the Chi-Square Chart

Positively Skewed

Uses d.f.

On calculator!

Page 10

Is there a preference in type of car?

          Freq.   Expected
SUV       27      25
Truck     25      25
Sedan     29      25
Sports    19      25

$p_1$ = proportion who prefer an SUV

$p_2$ = proportion who prefer a truck

$p_3$ = proportion who prefer a sedan

$p_4$ = proportion who prefer a sports car

$H_0: p_1 = p_2 = p_3 = p_4$

$H_A$: at least 1 prop. is different

Assumptions: Random Samples & all cell counts are at least 5.

Use a Chi-Square goodness of fit Test

df = 3

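$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} = \frac{(27-25)^2}{25} + \frac{(25-25)^2}{25} + \frac{(29-25)^2}{25} + \frac{(19-25)^2}{25} = 2.24$$

$$P\text{-val} = \chi^2\text{cdf}(2.24, \infty, 3) = 0.524$$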
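The car-preference calculation can be checked with a short script. This is a sketch, not the calculator procedure from the slide: it uses only Python's standard library, and the p-value line relies on a closed-form expression for the chi-square upper-tail area that is valid only when df = 3 (an assumption that happens to hold here).

```python
import math

# Observed counts from the car-preference table.
observed = {"SUV": 27, "Truck": 25, "Sedan": 29, "Sports": 19}

n = sum(observed.values())   # 100 people in total
k = len(observed)            # 4 categories
expected = n / k             # equal preference -> 25 per category

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())
df = k - 1

# Upper-tail area for a chi-square distribution with df = 3 ONLY:
# P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
p_value = (math.erfc(math.sqrt(chi_sq / 2))
           + math.sqrt(2 * chi_sq / math.pi) * math.exp(-chi_sq / 2))

print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.3f}")
```

The script reproduces the statistic 2.24 and a p-value of about 0.524. For other degrees of freedom the p-value line would need a different formula or a statistics library.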

Page 11

A researcher believes that the number of homicides in CA by season is uniformly distributed. To test this claim, you randomly select 1200 homicides from a recent year and record the season when each happened.

Season Freq

Spring 312

Summer 298

Fall 297

Winter 293
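One way to set up this problem is sketched below, using the chi-square chart approach instead of a p-value. The significance level of 0.05 is an assumption (the slide does not state one); 7.815 is the standard chart value for df = 3 at that level.

```python
# Observed homicide counts by season, from the table.
observed = {"Spring": 312, "Summer": 298, "Fall": 297, "Winter": 293}

n = sum(observed.values())   # 1200 homicides
k = len(observed)            # 4 seasons
expected = n / k             # uniform distribution -> 300 per season

# Assumption check: every expected cell count should be at least 5.
assert expected >= 5

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())
df = k - 1

# Chart value for df = 3 at alpha = 0.05 (alpha is ASSUMED, not from the slide).
CRITICAL_VALUE = 7.815
reject_null = chi_sq > CRITICAL_VALUE

print(f"chi-square = {chi_sq:.3f}, df = {df}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

With these counts the statistic is well below the chart value, so at the assumed 0.05 level the data give no evidence against a uniform distribution across seasons.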

Page 12

Results from a previous survey asking people who go to movies at least once a month are shown in the table below. To determine whether this distribution is still the same, you randomly select 1000 people who go to movies at least once a month and record the age of each. Are the distributions the same?

Age        Survey    Freq
2 - 17     26.70%    240
18 - 24    19.80%    214
25 - 39    19.70%    183
40 - 49    14.00%    156
50+        19.80%    207
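When the hypothesized proportions are not all equal, each expected count is the sample size times that category's proportion. The sketch below sets this up for the movie-goer data. It is an illustration only: the p-value line uses a closed-form upper-tail formula that is valid only for df = 4, and no significance level is assumed.

```python
import math

# (age group, proportion from the previous survey, observed count in the new sample)
rows = [
    ("2 - 17",  0.267, 240),
    ("18 - 24", 0.198, 214),
    ("25 - 39", 0.197, 183),
    ("40 - 49", 0.140, 156),
    ("50+",     0.198, 207),
]

n = sum(obs for _, _, obs in rows)                # 1000 people sampled
expected = [prop * n for _, prop, _ in rows]      # 267, 198, 197, 140, 198

# Assumption check: every expected cell count should be at least 5.
assert min(expected) >= 5

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((obs - exp) ** 2 / exp
             for (_, _, obs), exp in zip(rows, expected))
df = len(rows) - 1

# Upper-tail area for a chi-square distribution with df = 4 ONLY:
# P(X > x) = exp(-x/2) * (1 + x/2)
p_value = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.3f}")
```

The statistic comes out near 7.26 with a p-value of roughly 0.12; whether that counts as evidence of a changed distribution depends on the significance level chosen.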

Page 13

What’s your favorite flavor of ice-cream?

Page 14

    Observed

A 40% 45

B 30% 52

C 20% 39

D 5% 8

F 5% 6

Page 15

Homework

Worksheet