CHAPTER 11: CHI-SQUARE TESTS

THE CHI-SQUARE DISTRIBUTION

Definition The chi-square distribution has only one parameter, called the degrees of freedom (df). The shape of a chi-square distribution curve is skewed to the right for small df and becomes approximately symmetric for large df. The entire chi-square distribution curve lies to the right of the vertical axis: the chi-square distribution assumes nonnegative values only, and these values are denoted by the symbol χ² (read as “chi-square”).

Figure 11.1 Three chi-square distribution curves.
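The change in shape described above can be put in numbers. The short sketch below relies on two standard properties of the chi-square distribution that the slides do not state and that are added here as context: the mean equals df and the skewness equals sqrt(8/df). The function name chi2_skewness is only illustrative.

```python
import math

def chi2_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom.

    Uses the standard result sqrt(8 / df); this formula is added context,
    not something stated in the slides.
    """
    return math.sqrt(8 / df)

# The skewness shrinks toward 0 (a symmetric curve) as df grows.
for df in (2, 7, 12, 100):
    print(f"df = {df:3d}  mean = {df:3d}  skewness = {chi2_skewness(df):.3f}")
```

A skewness near zero corresponds to the nearly symmetric curve seen for large df, while the large positive values for small df correspond to the long right tail.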

Example 11-1

Find the value of χ² for 7 degrees of freedom and an area of .10 in the right tail of the chi-square distribution curve.

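Table 11.1 χ² for df = 7 and .10 Area in the Right Tail

Area in the Right Tail Under the Chi-Square Distribution Curve

df      .995      …    .100       …    .005
1       0.000     …    2.706      …    7.879
2       .010      …    4.605      …    10.597
…       …         …    …          …    …
7       .989      …    12.017     …    20.278
…       …         …    …          …    …
100     67.328    …    118.498    …    140.169

Required value of χ² = 12.017, found where the row for df = 7 meets the column for a right-tail area of .100.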

Figure 11.2 Chi-square distribution curve for df = 7; the right-tail area of .10 begins at χ² = 12.017.
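The table lookup in Example 11-1 can also be reproduced numerically. The following is a minimal sketch using only the Python standard library: it computes the right-tail area from a series for the incomplete gamma function and then finds the critical value by bisection. The function names chi2_right_tail and chi2_critical are hypothetical helpers written for this illustration, not part of any library.

```python
import math

def chi2_right_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2, x / 2
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P = z^a e^(-z) / Gamma(a + 1) * sum_n z^n / ((a + 1)...(a + n))
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    left = total * math.exp(a * math.log(z) - z - math.lgamma(a + 1))
    return 1.0 - left

def chi2_critical(right_area, df):
    """Find the x whose right-tail area equals right_area, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_right_tail(hi, df) > right_area:   # widen until the target is bracketed
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_right_tail(mid, df) > right_area:
            lo = mid                              # too much area remains: move right
        else:
            hi = mid
    return (lo + hi) / 2

print(round(chi2_critical(0.10, 7), 3))   # 12.017, matching Table 11.1
```

Bisection is used here only because it is easy to follow; a statistics library would normally provide this quantile directly.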

Example 11-2

Find the value of χ² for 12 degrees of freedom and an area of .05 in the left tail of the chi-square distribution curve.

Solution 11-2

Area in the right tail = 1 – Area in the left tail = 1 – .05 = .95

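Table 11.2 χ² for df = 12 and .95 Area in the Right Tail

Area in the Right Tail Under the Chi-Square Distribution Curve

df      .995      …    .950      …    .005
1       0.000     …    .004      …    7.879
2       .010      …    .103      …    10.597
…       …         …    …         …    …
12      3.074     …    5.226     …    28.300
…       …         …    …         …    …
100     67.328    …    77.929    …    140.169

Required value of χ² = 5.226, found where the row for df = 12 meets the column for a right-tail area of .950.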

Figure 11.3 Chi-square distribution curve for df = 12; the shaded right-tail area of .95 begins at χ² = 5.226.
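The left-tail to right-tail conversion in Example 11-2 can be checked directly. Because df = 12 is even, the right-tail area has a simple closed form, exp(-x/2) times a finite sum; this formula is a standard result added here for illustration, and the function name right_tail_even_df is hypothetical.

```python
import math

def right_tail_even_df(x, df):
    """Right-tail area under the chi-square curve, valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even df")
    half = x / 2
    # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

left_area = 0.05
right_area = 1 - left_area              # .95, the area used to enter the table
x = 5.226                               # table value for df = 12, right-tail area .95

area_right_of_x = right_tail_even_df(x, 12)
print(f"area to the right of {x}: {area_right_of_x:.4f}")
print(f"area to the left of {x}:  {1 - area_right_of_x:.4f}")
```

The two printed areas should agree with .95 and .05 up to the rounding of the table value.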

A GOODNESS-OF-FIT TEST

Definition An experiment with the following characteristics is called a multinomial experiment.

1. It consists of n identical trials (repetitions).
2. Each trial results in one of k possible outcomes (or categories), where k > 2.
3. The trials are independent.
4. The probabilities of the various outcomes remain constant for each trial.

Definition The frequencies obtained from the performance of an experiment are called the observed frequencies and are denoted by O. The expected frequencies, denoted by E, are the frequencies that we expect to obtain if the null hypothesis is true. The expected frequency for a category is obtained as

E = np

where n is the sample size and p is the probability that an element belongs to that category if the null hypothesis is true.
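As a small illustration of E = np, the sketch below computes expected frequencies for a made-up experiment. The sample size of 200 and the four null-hypothesis probabilities are hypothetical values chosen only for this example, and expected_frequencies is an illustrative helper name.

```python
def expected_frequencies(n, probabilities):
    """Return E = n * p for each category under the null hypothesis."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("category probabilities must sum to 1")
    return [n * p for p in probabilities]

# Hypothetical data: n = 200 trials, four categories with
# null-hypothesis probabilities .4, .3, .2, and .1.
expected = expected_frequencies(200, [0.4, 0.3, 0.2, 0.1])
print([round(e, 1) for e in expected])   # [80.0, 60.0, 40.0, 20.0]
```

The expected frequencies always add up to n, because the category probabilities add up to 1.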

Degrees of Freedom for a Goodness-of-Fit Test

In a goodness-of-fit test, the degrees of freedom are

df = k – 1

where k denotes the number of possible outcomes (or categories) for the experiment.

Test Statistic for a Goodness-of-Fit Test

The test statistic for a goodness-of-fit test is χ² and its value is calculated as

χ² = Σ (O – E)² / E

where
O = observed frequency for a category
E = expected frequency for a category = np

Remember that a chi-square goodness-of-fit test is always right-tailed.

Example 11-3 A bank has an ATM installed inside the bank, and it is available to its customers only from 7 AM to 6 PM Monday through Friday. The manager of the bank wanted to investigate if the percentage of transactions made on this ATM is the same for each of the five days (Monday through Friday) of the week. She randomly selected one week and counted the number of transactions made on this ATM on each of the five days during this week. The information she obtained is given in the following table, where the number of users represents the number of transactions on this ATM on these days. For convenience, we will refer to these transactions as “people” or “users.”

Day                Monday    Tuesday    Wednesday    Thursday    Friday
Number of users    253       197        204          279         267

At the 1% level of significance, can we reject the null hypothesis that the proportion of people who use this ATM each of the five days of the week is the same? Assume that this week is typical of all weeks in regard to the use of this ATM.

Solution 11-3

H0: p1 = p2 = p3 = p4 = p5 = .20
H1: At least two of the five proportions are not equal to .20

There are five categories (the five days on which the ATM is used), so this is a multinomial experiment. We use the chi-square distribution to make this test.

Area in the right tail = α = .01
k = number of categories = 5
df = k – 1 = 5 – 1 = 4
The critical value of χ² = 13.277

Figure 11.4 Critical value of χ² = 13.277 for α = .01. The rejection region (reject H0) lies to the right of 13.277; the nonrejection region (do not reject H0) lies to its left.

Table 11.3

Category (Day)    Observed Frequency O    Expected Frequency E = np    (O – E)    (O – E)²    (O – E)²/E
Monday            253                     1200(.20) = 240              13         169         .704
Tuesday           197                     1200(.20) = 240              –43        1849        7.704
Wednesday         204                     1200(.20) = 240              –36        1296        5.400
Thursday          279                     1200(.20) = 240              39         1521        6.338
Friday            267                     1200(.20) = 240              27         729         3.038

                  n = 1200                                                                    Sum = 23.184

All the required calculations to find the value of the test statistic χ² are shown in Table 11.3.

χ² = Σ (O – E)² / E = 23.184

The value of the test statistic χ² = 23.184 is larger than the critical value of χ² = 13.277, so it falls in the rejection region. Hence, we reject the null hypothesis and conclude that the proportion of people who use this ATM is not the same for each of the five days of the week.
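The calculation for Example 11-3 can be sketched in a few lines of Python. The observed counts and the critical value 13.277 come from the example; the p-value line is an addition not found in the slides, and it uses a closed form for the right-tail area that holds only for df = 4.

```python
import math

observed = {"Monday": 253, "Tuesday": 197, "Wednesday": 204,
            "Thursday": 279, "Friday": 267}

n = sum(observed.values())            # 1200 transactions in total
p_null = 0.20                         # H0: each day has the same proportion
expected = n * p_null                 # E = np = 240 for every day

# Test statistic: sum of (O - E)^2 / E over the five categories
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())

df = len(observed) - 1                # k - 1 = 4
critical_value = 13.277               # table value for df = 4, right-tail area .01

# Added for illustration: for df = 4 the right-tail area is exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {df}")
print(f"p-value    = {p_value:.6f}")
print("Reject H0" if chi_square > critical_value else "Do not reject H0")
```

The unrounded statistic is 23.183; Table 11.3 reports 23.184 because it adds contributions that were each rounded to three decimals. The decision is the same either way.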

Example 11-4 In a National Public Transportation survey conducted in 1995 on the modes of transportation used to commute to work, 79.6% of the respondents said that they drive alone, 11.1% carpool, 5.1% use public transit, and 4.2% depend on other modes of transportation (USA TODAY, April 14, 1999). Assume that these percentages hold true for the 1995 population of all commuting workers. Recently 1000 randomly selected workers were asked what mode of transportation they use to commute to work. The following table lists the results of this survey.

Mode of transportation    Drive alone    Carpool    Public transit    Other
Number of workers         812            102        57                29

Test at the 2.5% significance level whether the current pattern of use of transportation modes is different from that for 1995.

Solution 11-4

H0: The current percentage distribution of the use of transportation modes is the same as that for 1995.

H1: The current percentage distribution of the use of transportation modes is different from that for 1995.

There are four categories (drive alone, carpool, public transit, and other), so this is a multinomial experiment. We use the chi-square distribution to make the test.

Area in the right tail = α = .025
k = number of categories = 4
df = k – 1 = 4 – 1 = 3
The critical value of χ² = 9.348

Figure 11.5 Critical value of χ² = 9.348 for α = .025.

Table 11.4
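The slides break off at Table 11.4, so its contents are not available here. The sketch below shows how the calculation could be carried out from the data given in Example 11-4; the printed values are computed from those inputs under the E = np and χ² formulas above and are not quoted from the original table.

```python
# Null-hypothesis proportions (1995 survey) and observed counts (recent sample)
probabilities_1995 = {"Drive alone": 0.796, "Carpool": 0.111,
                      "Public transit": 0.051, "Other": 0.042}
observed = {"Drive alone": 812, "Carpool": 102, "Public transit": 57, "Other": 29}

n = sum(observed.values())            # 1000 workers
chi_square = 0.0
for mode, o in observed.items():
    e = n * probabilities_1995[mode]  # expected frequency E = np
    contribution = (o - e) ** 2 / e
    chi_square += contribution
    print(f"{mode:15s} O = {o:4d}  E = {e:6.1f}  (O - E)^2 / E = {contribution:.3f}")

critical_value = 9.348                # table value for df = 3, right-tail area .025
print(f"chi-square = {chi_square:.3f}")
print("Reject H0" if chi_square > critical_value else "Do not reject H0")
```

With these inputs the statistic comes to about 5.78, which is below the critical value of 9.348, so this sketch would not reject H0 at the 2.5% significance level.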
