
Sociology 601 Class 12: October 8, 2009

• The Chi-Squared Test (8.2)
– expected frequencies
– calculating Chi-square
– finding p

• When (not) to use Chi-squared.

Dec 21, 2015


8.2 Chi-squared statistical significance test for contingency tables.

                       support tax reform?
                       Yes     No     Tot
  support        Yes   150    100     250
  environment?   No    200     50     250
                 Tot   350    150     500

• “Is the level of support for the environment independent of the level of support for tax reform?”
– If not, these two measures may have some causal link worth investigating.
– Q: which causes which?


2x2 table: a z-test for proportions

• With a 2x2 table, we can use a z-test for independent-sample proportions (review 7.2).

. prtesti 250 .6 250 .8

Two-sample test of proportion                      x: Number of obs =      250
                                                   y: Number of obs =      250
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |         .6   .0309839                      .5392727    .6607273
           y |         .8   .0252982                      .7504164    .8495836
-------------+----------------------------------------------------------------
        diff |        -.2        .04                     -.2783986   -.1216014
             |  under Ho:   .0409878    -4.88   0.000
------------------------------------------------------------------------------
        diff = prop(x) - prop(y)                                  z =  -4.8795
    Ho: diff = 0

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.0000         Pr(|Z| < |z|) = 0.0000          Pr(Z > z) = 1.0000
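The z statistic in the output above can be reproduced by hand. The following is a minimal Python sketch, not part of the original lecture; the function names are made up for illustration, and it assumes the pooled standard error shown on the "under Ho" line. It also shows that z squared for this 2x2 table is about 23.81.

```python
import math

def two_proportion_z(n1, p1, n2, p2):
    """z statistic for Ho: prop1 = prop2, using the pooled standard error."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se0 = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se0

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Same inputs as ". prtesti 250 .6 250 .8"
z = two_proportion_z(250, 0.6, 250, 0.8)
p = two_sided_p(z)
print(round(z, 4), round(z ** 2, 2), p < 0.001)
```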


Moving beyond 2x2 tables:

• Comparing conditional probabilities is fine when there are only two comparisons and two possible outcomes for each comparison.

• The Chi-Square (χ²) test is a new, more flexible technique for making comparisons.

• χ² tests a null hypothesis that every cell should have the frequency you would expect if the variables were independently distributed.

• fe is the expected count for each cell.

– fe = row total * column total / table total (A&F 254)

– fe = total N * unconditional row probability * unconditional column probability

– fe = column N * unconditional row probability

• A test for the whole table will combine tests for fe for every cell.


Calculating expected cell counts:

• The expected cell count is the count we would expect in a cell if
– environmental support among tax reform advocates and among tax reform opponents were identical, or if
– environmental support among tax reform advocates were the same as environmental support among the whole sample, or if
– tax reform support among environmentalists were the same as among non-environmentalists

• 50% of sample supports environmental spending, so
– fe(1,1) = .5 * 350 = 175
– fe(1,1) = 250 * 350 / 500 = 175 (A&F)
– fe(1,1) = 500 * (350/500)taxes * (250/500)environment = 175

• fe(1,2) = 75
• fe(2,1) = 175
• fe(2,2) = 75
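The expected counts above can be checked with a few lines of code. This is an illustrative Python sketch (the helper name expected_counts is an assumption, not something from the course materials) that applies the row total * column total / table total rule to every cell of the tax reform and environment table.

```python
import math

def expected_counts(observed):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: support environment (Yes, No); columns: support tax reform (Yes, No)
observed = [[150, 100], [200, 50]]
fe = expected_counts(observed)
print(fe)

# Equivalent form: total N * row probability * column probability
alt_fe_11 = 500 * (250 / 500) * (350 / 500)
print(math.isclose(alt_fe_11, fe[0][0]))
```

Both rows get the same expected counts here because each row holds exactly half of the sample.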


Testing independence of support for tax reform and environmental spending:

• New Approach: Chi Squared test for independence of attitudes toward taxes and the environment.

• Test statistic:
– χ² = Σ((fo – fe)² / fe)

– where fo is the observed count in each cell

– and where fe is the expected count for each cell, assuming that attitudes toward taxes will be the same for people who support environmental issues as for people who do not support environmental issues.


Assumptions and hypothesis for a chi-squared test:

• Assumptions:
– two categorical variables (for this course)
– random sample or stratified random sample
– fe ≥ 5 for all cells

• Hypothesis: Ho: the two variables are statistically independent.
– this means that the distribution of each variable is independent of the score of the other variable


Using expected cell counts to calculate a Chi-squared test statistic

• The test statistic is analogous to a t-statistic…
– but the form of the equation makes it difficult to see that the χ² statistic is a difference between the observed and expected values, divided by an estimate of the typical variation we would expect from random sampling error.

• Test statistic:
– χ² = Σ((fo – fe)² / fe)
     = (150 – 175)²/175 + (100 – 75)²/75 + (200 – 175)²/175 + (50 – 75)²/75
     = 3.5714 + 8.3333 + 3.5714 + 8.3333
     = 23.81
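The same arithmetic can be written as a short function. The Python sketch below is only an illustration (chi_squared is a hypothetical helper name); it returns the total statistic along with each cell's contribution, so the four terms in the sum can be inspected one by one.

```python
def chi_squared(observed):
    """Return (chi-squared statistic, per-cell contributions) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(observed):
        out = []
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected count under independence
            out.append((fo - fe) ** 2 / fe)
        contributions.append(out)
    total = sum(sum(r) for r in contributions)
    return total, contributions

stat, cells = chi_squared([[150, 100], [200, 50]])
print(round(stat, 2))
print([[round(c, 4) for c in row] for row in cells])
```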


Degrees of freedom for a Chi-squared statistic:

• We now have a test statistic: χ² = 23.81
• How do we assign a p-value to this?
• Step 1: calculate the degrees of freedom.
– Given the row and column marginal totals, how many cells need we fill in before we can do the rest automatically?
– Answer: 1 in this case, so df = 1.
– General answer: df = (r-1)*(c-1), where r is the number of rows and c is the number of columns.


p-value for a Chi-squared statistic:

• Assign a p-value to the statistic: χ² = 23.81, df = 1

• Given the degrees of freedom, look up the p-value.
– Go to Table C on page 670.
– Go down to the row for df = 1
– Move across χ² values to the largest tabled value that is smaller than the measured χ²
– Look up the corresponding p-value at the top of the column: p < .001
– The chi-squared test is always a 1-tailed test: we always use the right tail of the distribution.
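As an alternative to the table lookup, the right-tail p-value can be computed directly for small tables. The Python sketch below is an illustration with hypothetical function names; it relies on closed-form tail areas that hold only for df = 1 and df = 2, so it is not a general chi-squared routine.

```python
import math

def degrees_of_freedom(rows, cols):
    """df = (r - 1) * (c - 1) for an r x c contingency table."""
    return (rows - 1) * (cols - 1)

def right_tail_p(chi2, df):
    """Right-tail p-value; closed forms exist only for these two cases."""
    if df == 1:
        # A chi-squared variable with df = 1 is a squared standard normal.
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("this sketch only covers df = 1 and df = 2")

df = degrees_of_freedom(2, 2)
p = right_tail_p(23.81, df)
print(df, p < 0.001)
```

For larger tables a statistics package or a printed table is still needed.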


Do your own chi-squared test:

• You watch 50 beachcombers to see if they are wearing sandals and if they are wearing shorts

                    wearing shorts?
                    Yes    No    Tot
  sandals?   Yes     20    10     30
             No      10    10     20
             Tot     30    20     50

• Q: Does a beachcomber’s chance of wearing sandals depend on whether they are wearing shorts?
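After working the exercise by hand, one way to check the result is the Python sketch below. It is an illustration only; the function name is hypothetical, and the p-value formula is specific to df = 1, which is what a 2x2 table has.

```python
import math

def chi2_test_2x2(table):
    """Chi-squared statistic and right-tail p-value (df = 1) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            fe = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - fe) ** 2 / fe
    p = math.erfc(math.sqrt(stat / 2))  # valid for df = 1 only
    return stat, p

# Rows: sandals (Yes, No); columns: shorts (Yes, No)
stat, p = chi2_test_2x2([[20, 10], [10, 10]])
print(round(stat, 2), p > 0.05)
```

The statistic comes out near 1.39 with a p-value well above .05, so this sample gives little evidence against independence.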


Chi-Squared Tests for tables larger than 2X2

• Here is a command to run a chi-squared test on the gender and partyid data from the 1996 GSS (cf. 8.1)

. tab partyid3 sex, chi2

            |    respondents sex
   partyid3 |      male     female |     Total
------------+----------------------+----------
   Democrat |       350        627 |       977
Independent |       514        557 |     1,071
 Republican |       400        407 |       807
------------+----------------------+----------
      Total |     1,264      1,591 |     2,855

          Pearson chi2(2) =  43.4391   Pr = 0.000


• Add expected cell counts

. tab partyid3 sex, chi2 exp

+--------------------+
| Key                |
|--------------------|
|     frequency      |
| expected frequency |
+--------------------+

            |    respondents sex
   partyid3 |      male     female |     Total
------------+----------------------+----------
   Democrat |       350        627 |       977
            |     432.5      544.5 |     977.0
------------+----------------------+----------
Independent |       514        557 |     1,071
            |     474.2      596.8 |   1,071.0
------------+----------------------+----------
 Republican |       400        407 |       807
            |     357.3      449.7 |     807.0
------------+----------------------+----------
      Total |     1,264      1,591 |     2,855
            |   1,264.0    1,591.0 |   2,855.0

          Pearson chi2(2) =  43.4391   Pr = 0.000
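The Stata results above can be reproduced from the observed counts alone. The following Python sketch is an illustration rather than the course's method; the variable names are made up, and the p-value uses a closed form that applies only because this 3x2 table has df = 2.

```python
import math

# Rows: Democrat, Independent, Republican; columns: male, female
observed = [[350, 627], [514, 557], [400, 407]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence, as in the "expected frequency" rows
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p = math.exp(-chi2 / 2)  # right-tail area, valid for df = 2 only

print([[round(fe, 1) for fe in row] for row in expected])
print(round(chi2, 4), df, p < 0.001)
```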


8.3 When not to do a chi-squared test

1.) Do not do a Chi-squared test when the expected value of a cell is less than 5.

The Problem: The total χ² is 6.28, so p<.05, but 4.5 of the total comes from one cell with fe = 2.

(It is okay to do a Chi-squared test if a cell has an expected value above 5 and an observed value below 5!)

Observed counts, with expected counts in parentheses:

                      Party identification
  age       Democrat    Indep.    Republican    Total
  <65       42 (40)     5 (8)     33 (32)         80
  ≥65        8 (10)     5 (2)      7 (8)          20
  total     50          10        40             100
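The problem can be seen by listing each cell's contribution to the total. The Python sketch below is illustrative only (the variable names are assumptions); it flags any cell whose expected count falls below 5 and shows how much of the statistic that cell accounts for.

```python
# Rows: age <65, age 65 and over; columns: Democrat, Independent, Republican
observed = [[42, 5, 33], [8, 5, 7]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

total = 0.0
contributions = []
flagged = []  # (row index, column index, expected count) for cells with fe < 5
for i, row in enumerate(observed):
    out = []
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        out.append((fo - fe) ** 2 / fe)
        if fe < 5:
            flagged.append((i, j, fe))
    contributions.append(out)
    total += sum(out)

print(round(total, 2))   # total chi-squared
print(flagged)           # cells that violate the fe >= 5 assumption
print(contributions[1][1])  # contribution of the older Independent cell
```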


A small sample alternative to a chi-squared test

When the sample size is too small for a chi-squared test, you may treat the contingency table as a small sample comparison of two population proportions.

This means you should do a Fisher’s exact test for population proportions.

A Fisher’s exact test also works on large samples, but it can sometimes bog down the computer with lengthy computations. (This is especially likely to happen when the tables are 5X4 or larger.)
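To make the idea concrete, here is a minimal Python sketch of a two-sided Fisher's exact test for a 2x2 table. It is an illustration under stated assumptions: the function name and the small example table are hypothetical, the two-sided p-value is taken as the sum of the probabilities of all tables no more likely than the observed one (one common convention), and tables larger than 2x2 are not handled.

```python
from math import comb

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table with fixed margins."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables at least as unlikely as the observed one
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical small sample, too small for a chi-squared test
small_table = [[3, 1], [1, 3]]
p = fisher_exact_2x2(small_table)
print(round(p, 4))
```

The enumeration grows quickly with the table margins, which is one reason the exact test can be slow on large samples.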


Fisher’s exact test in STATA

(not necessary in this case because of large N)

. tab partyid3 sex, chi exact

Enumerating sample-space combinations:
stage 3:  enumerations = 1
stage 2:  enumerations = 158
stage 1:  enumerations = 0

            |    respondents sex
   partyid3 |      male     female |     Total
------------+----------------------+----------
   Democrat |       350        627 |       977
Independent |       514        557 |     1,071
 Republican |       400        407 |       807
------------+----------------------+----------
      Total |     1,264      1,591 |     2,855

          Pearson chi2(2) =  43.4391   Pr = 0.000
           Fisher's exact =                 0.000


When not to do a chi-squared test (#2)

2.) Do not do a Chi-squared test for cell values that are not observed frequencies.

The Problem: If you use percentages, you misstate the sample size as 100.

              Voted in last election?
  sex         Yes     No     Total
  women       35%     15%     50%
  men         20%     30%     50%
  total       55%     45%    100%
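A short calculation shows why the sample size matters. In the Python sketch below (an illustration only; the function name and the sample size of 1,000 are hypothetical), the same percentages are converted to counts for two different values of N, and the chi-squared statistic scales in direct proportion to N.

```python
def chi_squared(observed):
    """Chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (fo - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, fo in enumerate(row)
    )

# Cell percentages from the table: rows women, men; columns voted Yes, No
percents = [[35, 15], [20, 30]]

def counts_for(n):
    """Convert the cell percentages to counts for a sample of size n."""
    return [[pct * n / 100 for pct in row] for row in percents]

chi_100 = chi_squared(counts_for(100))     # what you get if percentages are treated as counts
chi_1000 = chi_squared(counts_for(1000))   # hypothetical true sample size of 1,000
print(round(chi_100, 2), round(chi_1000, 2))
```

Entering percentages is the same as claiming N = 100, so the statistic is right only if the sample really has 100 cases.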


When not to do a chi-squared test (#3)

3.) Do not do a Chi-squared test to find a difference in population proportions for dependent samples.

The Problem: You want to know if the speech changed people’s opinions. A χ² test would tell you if opinions after the speech depend on opinions before the speech.

Number supporting death penalty:

                        After hearing speech:
                        Yes     No     Total
  Before speech:  Yes    80     20      100
                  No     40     60      100
                  total 120     80      200
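For dependent samples like these, one commonly used alternative is McNemar's test, which this lecture does not name; it is offered here only as a hedged illustration. The Python sketch below assumes the rows are opinions before the speech and the columns are opinions after it, and it compares only the people who changed their minds.

```python
import math

def mcnemar_z(table):
    """z statistic for change in a paired yes/no response.

    table[i][j]: i = opinion before (0 = yes, 1 = no), j = opinion after.
    Only the discordant cells (people who switched) enter the statistic.
    """
    yes_to_no = table[0][1]
    no_to_yes = table[1][0]
    return (no_to_yes - yes_to_no) / math.sqrt(no_to_yes + yes_to_no)

# Assumed layout: rows = before speech (Yes, No), columns = after speech (Yes, No)
before_after = [[80, 20], [40, 60]]
z = mcnemar_z(before_after)
p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the standard normal
print(round(z, 3), p < 0.05)
```

Under these assumptions 40 people moved toward support and 20 moved away, which is the change the research question is actually about.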
