Lecture 12 - The χ²-test
C2 Foundation Mathematics (Standard Track)
Dr Linda Stringer, Dr Simon Craik
[email protected] [email protected]
INTO City/UEA London

C2 st lecture 12 the chi squared-test handout

May 11, 2015


Pearson’s χ²-test

- Pearson’s χ²-test is another kind of hypothesis test. We use it to investigate whether two variables are independent or related to each other.
- For example, we can use it to investigate whether people’s voting intention and social class are related.
- In a χ²-test the variables are categorical. For example, the categories of voting intention are Labour, Conservative and Liberal Democrat, and the categories of social class are A, B and C.
- To investigate whether men get paid more than women, the variables are gender and salary. The categories of gender are male and female, and the categories of salary could be low, medium and high.


The structure of a χ²-test

The χ²-test has the same structure as a Z-test and a T-test:
- Hypotheses (state H0 and H1)
- Critical value (look it up in a table)
- Test statistic (this is the long part - there are 5 steps)
- Decision (reject/accept H0, with justification)
- Conclusion (in words)


Are students’ grades dependent on attendance?

Some students think that turning up to lectures and seminars doesn’t make a difference to what grade they get at the end of the module. We surveyed a sample of 100 students. Based on the data below, do you think that attendance and grades are independent variables?

              Often attends   Sometimes   Rarely attends
Distinction        36             7              4
Merit              12            10              7
Pass                2             6             16

You see that about half of the students (47 out of 100) got a Distinction. However, is this true for each individual category of attendance?
We will perform a χ²-test on this data later (see Example 1 below).


Hypotheses

- The null hypothesis (H0) is always that the variables are independent.
- The alternative hypothesis (H1) is always that the variables are related (dependent on each other).
- Write the hypotheses as sentences, for example:
  H0: Voting intention and social class are independent.
  H1: Voting intention depends on social class.


χ²-test statistic step 1: calculate the totals

- Data is presented in a table, called a contingency table.
- The values in the contingency table are called the observed values (O).
- (The contingency table is often referred to as the ’observed’ values table.)
- We first calculate the row totals, the column totals and the grand total.
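
Step 1 can be sketched in Python as follows. This is only an illustration and not part of the original handout: the helper name `totals` is hypothetical, and the data are the attendance and grade counts from the survey earlier in the lecture.

```python
def totals(observed):
    """Return (row_totals, column_totals, grand_total) for a contingency table.

    `observed` is a list of rows, each row being a list of counts.
    """
    row_totals = [sum(row) for row in observed]
    # zip(*observed) regroups the table by column.
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return row_totals, column_totals, grand_total


# Rows: Distinction, Merit, Pass. Columns: Often, Sometimes, Rarely.
observed = [
    [36, 7, 4],
    [12, 10, 7],
    [2, 6, 16],
]

row_totals, column_totals, grand_total = totals(observed)
print(row_totals)     # [47, 29, 24]
print(column_totals)  # [50, 23, 27]
print(grand_total)    # 100
```

A useful check is that the row totals and the column totals both add up to the grand total.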


χ²-test statistic step 2: the expected table

- We then construct the expected table.
- This is the table of expected values (E). For each cell,

  $$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

- NOTE: For a χ²-test to be viable, the expected values must all be greater than 5 in a 2 × 2 table, and greater than 5 in at least 80% of the cells in larger tables.
- Handy hint: The totals are the same for the expected table as for the observed table.
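
A minimal Python sketch of step 2, assuming the table is stored as a list of rows. The function names `expected_table` and `is_viable` are hypothetical, and `is_viable` simply encodes the rule of thumb in the note above. The data are the attendance and grade counts from the survey earlier in the lecture.

```python
def expected_table(observed):
    """Expected value for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [row_total * column_total / grand_total for column_total in column_totals]
        for row_total in row_totals
    ]


def is_viable(expected):
    """Rule of thumb: all cells > 5 in a 2 x 2 table, at least 80% otherwise."""
    cells = [value for row in expected for value in row]
    share_above_5 = sum(value > 5 for value in cells) / len(cells)
    if len(expected) == 2 and len(expected[0]) == 2:
        return share_above_5 == 1.0
    return share_above_5 >= 0.8


observed = [
    [36, 7, 4],
    [12, 10, 7],
    [2, 6, 16],
]

expected = expected_table(observed)
for row in expected:
    print([round(value, 2) for value in row])
print(is_viable(expected))  # True: the smallest expected value is 5.52
```

The rounded rows agree with the expected table worked out by hand in Example 1.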


χ²-test statistic step 3: the residual table

- We then construct the residual table.
- This is the table of residual values. For each cell, the residual value (R) is the observed value (O) minus the expected value (E):

  $$R = O - E$$

- Handy hint: The row and column totals in the residual table are always 0.
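
The handy hint can be checked numerically. The sketch below assumes the same list-of-rows layout and uses a hypothetical helper `residual_table`; because of floating-point rounding, the totals are compared with 0 using a small tolerance rather than exact equality.

```python
def residual_table(observed):
    """Return the table of residuals R = O - E for a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [o - row_total * column_total / grand_total
         for o, column_total in zip(row, column_totals)]
        for row, row_total in zip(observed, row_totals)
    ]


# Attendance and grade counts from the survey earlier in the lecture.
observed = [
    [36, 7, 4],
    [12, 10, 7],
    [2, 6, 16],
]

residuals = residual_table(observed)
for row in residuals:
    print([round(value, 2) for value in row])

# Every row total and every column total should be 0 (up to rounding error).
rows_sum_to_zero = all(abs(sum(row)) < 1e-9 for row in residuals)
columns_sum_to_zero = all(abs(sum(column)) < 1e-9 for column in zip(*residuals))
print(rows_sum_to_zero, columns_sum_to_zero)
```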


χ²-test statistic step 4: the χ²-table

- We then construct the χ²-table. For each cell,

  $$\text{value} = \frac{R^2}{E} = \frac{(O - E)^2}{E}$$

- Handy hint: The values in the χ² table are never negative (a value is 0 only when O = E).
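
As a worked instance, take the top-left cell of the attendance survey from earlier in the lecture (Distinction, often attends). There O = 36, the row total is 47, the column total is 50 and the grand total is 100:

```latex
E = \frac{47 \times 50}{100} = 23.5, \qquad
R = O - E = 36 - 23.5 = 12.5, \qquad
\frac{R^2}{E} = \frac{12.5^2}{23.5} = \frac{156.25}{23.5} \approx 6.649
```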


χ²-test statistic step 5: the test statistic

- The test statistic is the sum of all the values in the χ² table:

  $$\chi^2\text{-test statistic} = \sum \frac{R^2}{E} = \sum \frac{(O - E)^2}{E}$$

- Handy hint: The test statistic is never negative.
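
Steps 1 to 5 can be combined into one short function. This is a sketch under the same assumptions as before (a list-of-rows table, hypothetical function name `chi_squared_statistic`), applied to the attendance and grade counts from earlier in the lecture. It does not round intermediate values, so individual cells may differ slightly from a hand calculation done to 3 d.p.

```python
def chi_squared_statistic(observed):
    """Sum of (O - E)^2 / E over every cell of a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for row, row_total in zip(observed, row_totals):
        for o, column_total in zip(row, column_totals):
            e = row_total * column_total / grand_total  # expected value
            statistic += (o - e) ** 2 / e               # this cell's chi-squared value
    return statistic


observed = [
    [36, 7, 4],
    [12, 10, 7],
    [2, 6, 16],
]

statistic = chi_squared_statistic(observed)
print(round(statistic, 2))  # 38.49
```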


The critical value

- We also need to find our critical value.
- To do this we calculate the degrees of freedom of our table. This is the degrees of freedom of the rows multiplied by the degrees of freedom of the columns.
- d.o.f. = (n − 1) × (m − 1), where n is the number of rows and m is the number of columns.
- Get the critical value by reading off the degrees of freedom and the required significance level from the table. If the test statistic is greater than the critical value, we reject the null hypothesis.


The χ²-test critical value table

d.o.f.    5% significance       1% significance
          (probability 0.05)    (probability 0.01)
   1            3.84                  6.63
   2            5.99                  9.21
   3            7.81                 11.34
   4            9.49                 13.28
   5           11.07                 15.09
   6           12.59                 16.81
   7           14.07                 18.48
   8           15.51                 20.09
   9           16.92                 21.67
  10           18.31                 23.21
  11           19.68                 24.72
  12           21.03                 26.22
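
The table lookup and the decision rule can be sketched in Python. The dictionary below just copies the table above, and the names `CRITICAL_VALUES`, `degrees_of_freedom`, `critical_value` and `reject_null` are hypothetical, chosen for this illustration.

```python
# d.o.f. -> (critical value at 5%, critical value at 1%), copied from the table.
CRITICAL_VALUES = {
    1: (3.84, 6.63),
    2: (5.99, 9.21),
    3: (7.81, 11.34),
    4: (9.49, 13.28),
    5: (11.07, 15.09),
    6: (12.59, 16.81),
    7: (14.07, 18.48),
    8: (15.51, 20.09),
    9: (16.92, 21.67),
    10: (18.31, 23.21),
    11: (19.68, 24.72),
    12: (21.03, 26.22),
}


def degrees_of_freedom(n_rows, n_columns):
    """d.o.f. = (n - 1) * (m - 1) for a table with n rows and m columns."""
    return (n_rows - 1) * (n_columns - 1)


def critical_value(dof, significance):
    """Look up the critical value; `significance` must be 0.05 or 0.01."""
    column = {0.05: 0, 0.01: 1}[significance]
    return CRITICAL_VALUES[dof][column]


def reject_null(test_statistic, n_rows, n_columns, significance):
    """Reject H0 when the test statistic is greater than the critical value."""
    dof = degrees_of_freedom(n_rows, n_columns)
    return test_statistic > critical_value(dof, significance)


# A 3 x 3 table tested at the 5% level has 4 degrees of freedom.
print(critical_value(degrees_of_freedom(3, 3), 0.05))  # 9.49
print(reject_null(38.49, 3, 3, 0.05))                  # True
```

The table only covers up to 12 degrees of freedom and two significance levels, so the lookup raises a `KeyError` for anything else.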


Example 1

- Some students think that turning up to lectures doesn’t make a difference to what grade they get. We consider a sample of 100 students, and test at the 5% level of significance.

              Often attends   Sometimes   Rarely
Distinction        36             7          4
Merit              12            10          7
Pass                2             6         16

- First state the hypotheses.
  H0: the students’ attendance and grades are independent.
  H1: the grade depends on attendance.


Example 1

Step 1: Calculate the column totals, the row totals and the grand total.

               Often   Sometimes   Rarely   Row total
Distinction      36        7          4         47
Merit            12       10          7         29
Pass              2        6         16         24
Column total     50       23         27        100

Step 2: Calculate the expected table.

$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

23.5    10.81   12.69
14.5     6.67    7.83
12       5.52    6.48


Example 1

Step 3: Calculate the residual table.

$$R = O - E$$

 12.5   -3.81   -8.69
 -2.5    3.33   -0.83
-10      0.48    9.52

Step 4: Calculate the χ²-table. As we want a final value to 2 d.p., calculate to 3 d.p.

$$\text{value} = \frac{R^2}{E} = \frac{(O - E)^2}{E}$$

6.649   1.343    5.951
0.431   1.663    0.088
8.333   0.042   13.986

Step 5: The test statistic is the sum of all the values in the table.

$$\chi^2\text{-test statistic} = \sum \frac{R^2}{E} = \sum \frac{(O - E)^2}{E} = 38.49 \text{ to 2 d.p.}$$


Example 1

- We now find our critical value.
- The degrees of freedom will be (3 − 1) × (3 − 1) = 4.
- We consult our table and get a critical value of 9.49.
- As our test statistic is greater than the critical value, 38.49 > 9.49, we decide to reject the null hypothesis.
- We conclude that your grade depends on your attendance.


Example 2

- In his sketch the Vitruvian Man, Leonardo da Vinci displayed the “perfect” proportions of man.
- We aren’t so sure there is a relationship between height and nose size, so we gather some data and test at a 1% level of significance.

Height \ Nose size   ≤ 1.5 cm   1.5-2.5 cm   ≥ 2.5 cm
≤ 165 cm                 9           3            6
166-175 cm              15          18           18
176-185 cm              12          21           24
≥ 186 cm                 9           6            9

- First we state our hypotheses.
  H0: height and nose size are independent variables.
  H1: height and nose size are dependent variables.


Example 2

Calculate the column and row totals.

Height \ Nose size   ≤ 1.5 cm   1.5-2.5 cm   ≥ 2.5 cm   Row total
≤ 165 cm                 9           3            6         18
166-175 cm              15          18           18         51
176-185 cm              12          21           24         57
≥ 186 cm                 9           6            9         24
Column total            45          48           57        150

Calculate the expected table. For each cell, multiply the row total by the column total and divide by the grand total.

 5.4     5.76    6.84
15.3    16.32   19.38
17.1    18.24   21.66
 7.2     7.68    9.12


Example 2

Calculate the residual table. Subtract the expected value from the observed value.

 3.6   -2.76   -0.84
-0.3    1.68   -1.38
-5.1    2.76    2.34
 1.8   -1.68   -0.12

Calculate the χ²-table. Square the residual value and divide by the expected value. As we want a final value to 2 d.p., calculate to 3 d.p.

2.400   1.323   0.103
0.006   0.173   0.098
1.521   0.418   0.253
0.450   0.368   0.002

$$\chi^2\text{-test statistic} = \sum \frac{R^2}{E} = \sum \frac{(O - E)^2}{E} = 7.11 \text{ to 2 d.p.}$$


Example 2

- We now find our critical value.
- The table has 4 rows and 3 columns, so the degrees of freedom will be (4 − 1) × (3 − 1) = 6.
- We consult our table and get a critical value of 16.81.
- Our test statistic is less than our critical value, 7.11 < 16.81, so we decide to accept our null hypothesis.
- We conclude that height and nose size are independent.


Example 3

- A cynical Englishman says that it rains all year round in Britain. To see if this is the case, we keep a log of the weather in each season. We test at a 1% level of significance.

          Overcast   Sunny   Rainy   Snowy
Spring        5        53      28       4
Summer       31        37      22       0
Autumn       32        25      30       3
Winter       22        15      29      24

- First we state our hypotheses.
  H0: the weather and season are independent.
  H1: the weather depends on the season.


Example 3

Calculate the column and row totals.

               Overcast   Sunny   Rainy   Snowy   Row total
Spring             5        53      28       4        90
Summer            31        37      22       0        90
Autumn            32        25      30       3        90
Winter            22        15      29      24        90
Column total      90       130     109      31       360

Calculate the expected table. For each cell, multiply the row total by the column total and divide by the grand total.

22.5   32.5   27.25   7.75
22.5   32.5   27.25   7.75
22.5   32.5   27.25   7.75
22.5   32.5   27.25   7.75


Example 3

Calculate the residual table. Subtract the expected value from the observed value.

-17.5    20.5    0.75   -3.75
  8.5     4.5   -5.25   -7.75
  9.5    -7.5    2.75   -4.75
 -0.5   -17.5    1.75   16.25

Calculate the χ²-table. Square the residual value and divide by the expected value. As we want a final value to 2 d.p., calculate to 3 d.p.

13.611   12.931   0.021    1.815
 3.211    0.623   1.011    7.750
 4.011    1.731   0.278    2.911
 0.011    9.423   0.112   34.073

The test statistic is the sum of all the values in the table: 93.52 to 2 d.p.


Example 3

- We now find our critical value.
- The degrees of freedom will be (4 − 1) × (4 − 1) = 9.
- We consult our table and get a critical value of 21.67.
- Our test statistic is larger than our critical value, 93.52 > 21.67, so we decide to reject our null hypothesis.
- We conclude that the weather depends on the season.
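
As a cross-check of the three examples, the whole procedure can be run in a few lines of Python. This is a sketch only: `chi_squared_test` is a hypothetical helper, the critical values are typed in by hand from the critical value table, and no intermediate rounding is done, so only the final statistic to 2 d.p. is compared with the hand calculations.

```python
def chi_squared_test(observed, critical_value):
    """Return (test statistic, degrees of freedom, reject H0?)."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for row, row_total in zip(observed, row_totals):
        for o, column_total in zip(row, column_totals):
            e = row_total * column_total / grand_total
            statistic += (o - e) ** 2 / e

    dof = (len(row_totals) - 1) * (len(column_totals) - 1)
    return statistic, dof, statistic > critical_value


# Each entry: (observed table, critical value read from the table by hand).
examples = {
    "Example 1 (5% level)": ([[36, 7, 4], [12, 10, 7], [2, 6, 16]], 9.49),
    "Example 2 (1% level)": (
        [[9, 3, 6], [15, 18, 18], [12, 21, 24], [9, 6, 9]], 16.81),
    "Example 3 (1% level)": (
        [[5, 53, 28, 4], [31, 37, 22, 0], [32, 25, 30, 3], [22, 15, 29, 24]],
        21.67),
}

results = {}
for name, (table, critical) in examples.items():
    statistic, dof, reject = chi_squared_test(table, critical)
    results[name] = (round(statistic, 2), dof, reject)
    decision = "reject H0" if reject else "accept H0"
    print(f"{name}: statistic {statistic:.2f}, d.o.f. {dof}, {decision}")
```

The printed statistics are 38.49, 7.11 and 93.52, with 4, 6 and 9 degrees of freedom, matching the decisions reached in the three examples.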