Chapter 15: Chi-Square Tests

Topics: Chi-Square Test for Independence; Chi-Square Tests for Goodness-of-Fit; Uniform Goodness-of-Fit Test; Poisson Goodness-of-Fit Test; Normal Chi-Square Goodness-of-Fit Test

McGraw-Hill/Irwin. Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.

Jan 20, 2016


Patricia Bruce
Transcript
Chapter 15: Chi-Square Tests

• Chi-Square Test for Independence

• Chi-Square Tests for Goodness-of-Fit

• Uniform Goodness-of-Fit Test

• Poisson Goodness-of-Fit Test

• Normal Chi-Square Goodness-of-Fit Test

• ECDF Tests (Optional)

McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.

Chi-Square Test for Independence

Contingency Tables

• A contingency table is a cross-tabulation of $n$ paired observations into categories.

• Each cell shows the count of observations that fall into the category defined by its row ($r$) and column ($c$) heading.

[Figure: layout of a contingency table for variables $A$ and $B$]

Chi-Square Test for Independence

Contingency Tables

• For example:

[Table 15.1]

Chi-Square Test for Independence

Chi-Square Test

• In a test of independence for an $r \times c$ contingency table, the hypotheses are

$H_0$: Variable $A$ is independent of variable $B$

$H_1$: Variable $A$ is not independent of variable $B$

• Use the chi-square test for independence to test these hypotheses.

• This non-parametric test is based on frequencies.

• The $n$ data pairs are classified into $c$ columns and $r$ rows, and then the observed frequency $f_{jk}$ is compared with the expected frequency $e_{jk}$.

Chi-Square Test for Independence

Chi-Square Distribution

• The critical value comes from the chi-square probability distribution with $\nu$ degrees of freedom.

$\nu$ = degrees of freedom = $(r - 1)(c - 1)$

where $r$ = number of rows in the table

$c$ = number of columns in the table

• Appendix E contains critical values for right-tail areas of the chi-square distribution.

• The mean of a chi-square distribution is $\nu$ with variance $2\nu$.

Chi-Square Test for Independence

Chi-Square Distribution

• Consider the shape of the chi-square distribution:

[Figure 15.1]

Chi-Square Test for Independence

Expected Frequencies

• Assuming that $H_0$ is true, the expected frequency of row $j$ and column $k$ is:

$e_{jk} = R_j C_k / n$

where $R_j$ = total for row $j$ ($j = 1, 2, \ldots, r$)

$C_k$ = total for column $k$ ($k = 1, 2, \ldots, c$)

$n$ = sample size
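The expected-frequency formula can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` and the 2 × 3 table of counts are made up for the example and do not come from the slides.

```python
def expected_frequencies(observed):
    """Return the table of expected frequencies e_jk = R_j * C_k / n."""
    row_totals = [sum(row) for row in observed]         # R_j
    col_totals = [sum(col) for col in zip(*observed)]   # C_k
    n = sum(row_totals)                                 # sample size
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 3 table of observed counts (not from the slides).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Because both row totals are 100 here, each row of the expected table is simply half of the column totals.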

Chi-Square Test for Independence

Steps in Testing the Hypotheses

• Step 1: State the Hypotheses

$H_0$: Variable $A$ is independent of variable $B$

$H_1$: Variable $A$ is not independent of variable $B$

• Step 2: Specify the Decision Rule

Calculate $\nu = (r - 1)(c - 1)$.

For a given $\alpha$, look up the right-tail critical value ($\chi^2_R$) from Appendix E or by using Excel.

Reject $H_0$ if the test statistic exceeds $\chi^2_R$.

Chi-Square Test for Independence

Steps in Testing the Hypotheses

• For example, for $\nu = 6$ and $\alpha = .05$, $\chi^2_{.05} = 12.59$.

[Figure 15.2]
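The critical value quoted above can be checked without a table. The sketch below relies on a closed form for the right-tail area that holds only when the degrees of freedom are even; the function name `right_tail_area` is an assumption made for this example, and for odd degrees of freedom a table or a statistics package is still needed.

```python
import math


def right_tail_area(x, df):
    """Right-tail area of the chi-square distribution (even df only).

    For df = 2k the area is exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Right-tail area beyond 12.59 with 6 degrees of freedom.
alpha = right_tail_area(12.59, 6)
print(f"{alpha:.4f}")
```

The printed area should be very close to .05, which is what makes 12.59 the critical value for these settings.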

Chi-Square Test for Independence

Steps in Testing the Hypotheses

• Here is the rejection region.

[Figure 15.3]

Chi-Square Test for Independence

Steps in Testing the Hypotheses

• Step 3: Calculate the Expected Frequencies

$e_{jk} = R_j C_k / n$

• For example:

[Worked example figure]

Chi-Square Test for Independence

Steps in Testing the Hypotheses

• Step 4: Calculate the Test Statistic

The chi-square test statistic is

$\chi^2_{calc} = \sum_{j=1}^{r} \sum_{k=1}^{c} \frac{(f_{jk} - e_{jk})^2}{e_{jk}}$

• Step 5: Make the Decision

Reject $H_0$ if $\chi^2_{calc} > \chi^2_R$ or if the $p$-value $\le \alpha$.
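Steps 3 through 5 can be sketched together as follows. The table of counts is hypothetical, the function name `chi_square_independence` is an assumption for this example, and the critical value 5.991 is the standard table entry for 2 degrees of freedom at $\alpha = .05$ rather than a number taken from the slides.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]         # R_j
    col_totals = [sum(col) for col in zip(*observed)]   # C_k
    n = sum(row_totals)
    chi2 = 0.0
    for j, row in enumerate(observed):
        for k, f in enumerate(row):
            e = row_totals[j] * col_totals[k] / n        # expected frequency e_jk
            chi2 += (f - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)   # (r - 1)(c - 1)
    return chi2, df


# Hypothetical 2 x 3 table of observed counts (not from the slides).
observed = [[20, 30, 50], [30, 30, 40]]
chi2_calc, df = chi_square_independence(observed)

critical = 5.991  # table value for df = 2, alpha = .05
reject_h0 = chi2_calc > critical
print(f"chi2_calc = {chi2_calc:.3f}, df = {df}, reject H0: {reject_h0}")
```

For this made-up table the statistic is 28/9, about 3.11, which is below the critical value, so independence would not be rejected.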

Chi-Square Test for Independence

Small Expected Frequencies

• The chi-square test is unreliable if the expected frequencies are too small.

• Rules of thumb:

• Cochran’s Rule requires that $e_{jk} > 5$ for all cells.

• Up to 20% of the cells may have $e_{jk} < 5$.

• Most agree that a chi-square test is infeasible if $e_{jk} < 1$ in any cell.

• If this happens, try combining adjacent rows or columns to enlarge the expected frequencies.
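The rules of thumb above are easy to automate. The sketch below is one possible reading of them: the function name `check_expected`, the dictionary keys, and the table of expected frequencies are all assumptions made for illustration.

```python
def check_expected(expected):
    """Apply the small-expected-frequency rules of thumb to a table of e_jk."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return {
        "cochran": all(e > 5 for e in cells),      # every e_jk > 5
        "lenient": share_below_5 <= 0.20,          # at most 20% of cells below 5
        "feasible": all(e >= 1 for e in cells),    # no cell with e_jk < 1
    }


# Hypothetical expected frequencies (not from the slides).
table = [
    [12.0, 8.5, 4.5],
    [20.0, 14.0, 7.5],
]
result = check_expected(table)
print(result)
```

Here one cell out of six falls below 5, so Cochran's Rule fails while the 20% rule is still satisfied; if `feasible` came back `False`, adjacent rows or columns would need to be combined before testing.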

Chi-Square Test for Independence

Cross-Tabulating Raw Data

• Chi-square tests for independence can also be used to analyze quantitative variables by coding them into categories.

• For example, the variables Infant Deaths per 1,000 and Doctors per 100,000 can each be coded into various categories:

[Figure 15.6]
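Coding quantitative variables into categories and cross-tabulating them might look like the sketch below. The data pairs, the cutpoints, and the labels are invented for illustration (the categories used in Figure 15.6 are not reproduced in this transcript), and `code_value` is a hypothetical helper name.

```python
from collections import Counter


def code_value(x, cutpoints, labels):
    """Return the label of the first category whose upper cutpoint exceeds x."""
    for cut, label in zip(cutpoints, labels):
        if x < cut:
            return label
    return labels[-1]   # at or above the last cutpoint


labels = ["Low", "Medium", "High"]
death_cuts = [6.0, 9.0]       # hypothetical cutpoints, infant deaths per 1,000
doctor_cuts = [200.0, 300.0]  # hypothetical cutpoints, doctors per 100,000

# Hypothetical (infant deaths, doctors) pairs, one per observation.
pairs = [(4.5, 310), (9.8, 180), (6.1, 250), (11.2, 150), (5.0, 330), (7.9, 210)]

# Each key of the Counter is a (row category, column category) cell.
table = Counter(
    (code_value(d, death_cuts, labels), code_value(m, doctor_cuts, labels))
    for d, m in pairs
)
for cell, count in sorted(table.items()):
    print(cell, count)
```

The resulting counts are the observed frequencies $f_{jk}$ of a contingency table, ready for the chi-square test of independence.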

Chi-Square Test for Goodness-of-Fit

Why Do a Chi-Square Test on Numerical Data?

• The researcher may believe there’s a relationship between $X$ and $Y$, but doesn’t want to use regression.

• There are outliers or anomalies that prevent us from assuming that the data came from a normal population.

• The researcher has numerical data for one variable but not the other.

Chi-Square Test for Independence

3-Way Tables and Higher

• More than two variables can be compared using contingency tables.

• However, it is difficult to visualize a higher-order table.

• For example, you could visualize a cube as a stack of tiled 2-way contingency tables.

• Major computer packages permit 3-way tables.

Chi-Square Test for Goodness-of-Fit

Purpose of the Test

• The goodness-of-fit (GOF) test helps you decide whether your sample resembles a particular kind of population.

• The chi-square test will be used because it is versatile and easy to understand.

Chi-Square Test for Goodness-of-Fit

Multinomial GOF Test

• A multinomial distribution is defined by any $k$ probabilities $\pi_1, \pi_2, \ldots, \pi_k$ that sum to unity.

• For example, consider the following “official” proportions of M&M colors.

[Table: official proportions of M&M colors]

Chi-Square Test for Goodness-of-Fit

Multinomial GOF Test

• The hypotheses are

$H_0$: $\pi_1 = .30$, $\pi_2 = .20$, $\pi_3 = .10$, $\pi_4 = .10$, $\pi_5 = .10$, $\pi_6 = .20$

$H_1$: At least one of the $\pi_j$ differs from the hypothesized value

• No parameters are estimated ($m = 0$) and there are $c = 6$ classes, so the degrees of freedom are

$\nu = c - m - 1 = 6 - 0 - 1 = 5$
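A multinomial GOF calculation with these hypothesized proportions can be sketched as below. The observed counts are hypothetical (no sample is given in this transcript), the function name `multinomial_gof` is an assumption, and 11.07 is the standard table value for 5 degrees of freedom at $\alpha = .05$.

```python
def multinomial_gof(observed, proportions):
    """Return (expected frequencies, chi-square statistic, degrees of freedom)."""
    n = sum(observed)
    expected = [n * p for p in proportions]              # e_j = n * pi_j
    chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = len(observed) - 0 - 1                           # c - m - 1 with m = 0
    return expected, chi2, df


# Hypothesized proportions from H0, classes 1 through 6.
proportions = [0.30, 0.20, 0.10, 0.10, 0.10, 0.20]

# Hypothetical observed counts for a sample of 200 candies.
observed = [50, 45, 25, 15, 25, 40]

expected, chi2_calc, df = multinomial_gof(observed, proportions)
critical = 11.07  # table value for df = 5, alpha = .05
print(f"chi2_calc = {chi2_calc:.3f}, df = {df}, reject H0: {chi2_calc > critical}")
```

With these made-up counts the statistic is about 6.04, so the hypothesized proportions would not be rejected at the .05 level.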

Chi-Square Test for Goodness-of-Fit

Hypotheses for GOF

• The hypotheses are:

$H_0$: The population follows a _____ distribution

$H_1$: The population does not follow a _____ distribution

• The blank may contain the name of any theoretical distribution (e.g., uniform, Poisson, normal).

Chi-Square Test for Goodness-of-Fit

Test Statistic and Degrees of Freedom for GOF

• Assuming $n$ observations, the observations are grouped into $c$ classes and then the chi-square test statistic is found using:

$\chi^2_{calc} = \sum_{j=1}^{c} \frac{(f_j - e_j)^2}{e_j}$

where $f_j$ = the observed frequency of observations in class $j$

$e_j$ = the expected frequency in class $j$ if $H_0$ were true

Chi-Square Test for Goodness-of-Fit

Test Statistic and Degrees of Freedom for GOF

• If the proposed distribution gives a good fit to the sample, the test statistic will be near zero.

• The test statistic follows the chi-square distribution with degrees of freedom

$\nu = c - m - 1$

where $c$ is the number of classes used in the test

$m$ is the number of parameters estimated

Chi-Square Test for Goodness-of-Fit

Test Statistic and Degrees of Freedom for GOF

With $m = 0$ parameters estimated: $\nu = c - m - 1 = c - 0 - 1 = c - 1$

With $m = 1$ parameter estimated: $\nu = c - m - 1 = c - 1 - 1 = c - 2$

With $m = 2$ parameters estimated: $\nu = c - m - 1 = c - 2 - 1 = c - 3$

Chi-Square Test for Goodness-of-Fit

Data-Generating Situations

• Instead of “fishing” for a good-fitting model, visualize a priori the characteristics of the underlying data-generating process.

Mixtures: A Problem

• Mixtures occur when more than one data-generating process is superimposed on top of one another.

Chi-Square Test for Goodness-of-Fit

Eyeball Tests

• A simple “eyeball” inspection of the histogram or dot plot may suffice to rule out a hypothesized population.

Small Expected Frequencies

• Goodness-of-fit tests may lack power in small samples. As a guideline, a chi-square goodness-of-fit test should be avoided if $n < 25$.

Uniform Goodness-of-Fit Test

Uniform Distribution

• The uniform goodness-of-fit test is a special case of the multinomial in which every value has the same chance of occurrence.

• The chi-square test for a uniform distribution compares all $c$ groups simultaneously.

• The hypotheses are:

$H_0$: $\pi_1 = \pi_2 = \cdots = \pi_c = 1/c$

$H_1$: Not all $\pi_j$ are equal

Uniform Goodness-of-Fit Test

Uniform GOF Test: Grouped Data

• The test can be performed on data that are already tabulated into groups.

• Calculate the expected frequency $e_j$ for each cell.

• The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution.

• Obtain the critical value $\chi^2_\alpha$ from Appendix E for the desired level of significance $\alpha$.

• The $p$-value can be obtained from Excel.

• Reject $H_0$ if $p$-value $\le \alpha$.

Uniform Goodness-of-Fit Test

Uniform GOF Test: Raw Data

• First form $c$ bins of equal width and create a frequency distribution.

• Calculate the observed frequency $f_j$ for each bin.

• Define $e_j = n/c$.

• Perform the chi-square calculations.

• The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution.

• Obtain the critical value from Appendix E for a given significance level $\alpha$ and make the decision.

Uniform Goodness-of-Fit Test

Uniform GOF Test: Raw Data

• Maximize the test’s power by defining bin width as

bin width $= (x_{\max} - x_{\min}) / c$

• As a result, the expected frequencies will be as large as possible.
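The raw-data procedure for the uniform GOF test can be sketched as follows. The sketch assumes the bins span the exact data range (bin width equal to the range divided by $c$); the data set and the function name `uniform_gof` are hypothetical.

```python
def uniform_gof(data, c):
    """Return (observed counts, expected count, chi-square statistic, df)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / c               # equal-width bins over the data range
    observed = [0] * c
    for x in data:
        j = min(int((x - lo) / width), c - 1)   # the maximum goes in the last bin
        observed[j] += 1
    expected = len(data) / c            # e_j = n / c for every bin
    chi2 = sum((f - expected) ** 2 / expected for f in observed)
    return observed, expected, chi2, c - 1      # df = c - 1 (no parameters)


# Hypothetical sample of n = 28 integer values between 1 and 10.
data = ([1] * 3 + [2] * 3 + [3] * 2 + [4] * 3 + [5] * 3
        + [6] * 3 + [7] * 2 + [8] * 3 + [9] * 3 + [10] * 3)

observed, expected, chi2_calc, df = uniform_gof(data, c=4)
print(observed, expected, round(chi2_calc, 3), df)
```

The statistic would then be compared with the Appendix E critical value for $c - 1$ degrees of freedom. The sketch does not guard against a sample whose values are all identical, where the bin width would be zero.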

Uniform Goodness-of-Fit Test

Uniform GOF Test: Raw Data

• Calculate the mean and standard deviation of the uniform distribution as:

$\mu = (a + b)/2$

$\sigma = \sqrt{[(b - a + 1)^2 - 1]/12}$

• If the data are not skewed and the sample size is large ($n > 30$), then the mean is approximately normally distributed.

• So, test the hypothesized uniform mean using

$z_{calc} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
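The test of the hypothesized uniform mean can be sketched as below, assuming the usual large-sample $z$ statistic for a sample mean, $(\bar{x} - \mu)/(\sigma/\sqrt{n})$, and a discrete uniform distribution on the integers from $a$ to $b$. The sample of 36 die rolls and the function name `uniform_mean_test` are hypothetical.

```python
import math


def uniform_mean_test(data, a, b):
    """Return (mu, sigma, z) for a test of the discrete uniform mean on a..b."""
    n = len(data)
    mu = (a + b) / 2                                   # uniform mean
    sigma = math.sqrt(((b - a + 1) ** 2 - 1) / 12)     # uniform standard deviation
    xbar = sum(data) / n                               # sample mean
    z = (xbar - mu) / (sigma / math.sqrt(n))
    return mu, sigma, z


# Hypothetical sample: 36 rolls of a die (a = 1, b = 6).
rolls = [1] * 4 + [2] * 5 + [3] * 6 + [4] * 6 + [5] * 7 + [6] * 8

mu, sigma, z_calc = uniform_mean_test(rolls, a=1, b=6)
print(f"mu = {mu}, sigma = {sigma:.4f}, z = {z_calc:.3f}")
```

The resulting $z$ value would be compared with standard normal critical values, for example ±1.96 for a two-tailed test at $\alpha = .05$.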

Poisson Goodness-of-Fit Test

Poisson Data-Generating Situations

• In a Poisson distribution model, $X$ represents the number of events per unit of time or space.

• $X$ is a discrete nonnegative integer ($X = 0, 1, 2, \ldots$).

• Event arrivals must be independent of each other.

• Sometimes called a model of rare events because $X$ typically has a small mean.

Poisson Goodness-of-Fit Test

• The mean $\lambda$ is the only parameter.

• Assuming that $\lambda$ is unknown and must be estimated from the sample, the steps are:

Step 1: Tally the observed frequency $f_j$ of each $X$-value.

Step 2: Estimate the mean $\lambda$ from the sample.

Step 3: Use the estimated $\lambda$ to find the Poisson probability $P(X)$ for each value of $X$.

Poisson Goodness-of-Fit Test

Step 4: Multiply $P(X)$ by the sample size $n$ to get expected Poisson frequencies $e_j$.

Step 5: Perform the chi-square calculations.

Step 6: Make the decision.

• You may need to combine classes until expected frequencies become large enough for the test (at least until $e_j \ge 2$).

Poisson Goodness-of-Fit Test

Poisson GOF Test: Tabulated Data

• Calculate the sample mean as:

$\hat{\lambda} = \frac{\sum_{j=1}^{c} x_j f_j}{n}$

• Using this estimated mean, calculate the Poisson probabilities either by using the Poisson formula $P(x) = \lambda^x e^{-\lambda} / x!$ or Excel.

Poisson Goodness-of-Fit Test

Poisson GOF Test: Tabulated Data

• For $c$ classes with $m = 1$ parameter estimated, the degrees of freedom are $\nu = c - m - 1$.

• Obtain the critical value for a given $\alpha$ from Appendix E.

• Make the decision.
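The six steps of the Poisson GOF test on tabulated data can be sketched as below. Several choices here are assumptions rather than part of the slides: the last class is treated as open-ended ("this value or more") so that the probabilities sum to one, only the upper tail is collapsed when expected frequencies are small, and the tally and the function name `poisson_gof` are hypothetical.

```python
import math


def poisson_gof(freq, min_expected=2.0):
    """freq[x] is the observed number of sample units with X = x (x = 0, 1, ...).

    Returns (estimated mean, expected frequencies, chi-square statistic, df).
    """
    n = sum(freq)
    lam = sum(x * f for x, f in enumerate(freq)) / n            # Step 2
    probs = [lam ** x * math.exp(-lam) / math.factorial(x)      # Step 3
             for x in range(len(freq))]
    probs[-1] = 1.0 - sum(probs[:-1])      # assumption: last class is open-ended
    observed = list(freq)
    expected = [n * p for p in probs]                           # Step 4

    # Combine classes from the top until the last expected frequency is large enough.
    while len(expected) > 2 and expected[-1] < min_expected:
        last_e, last_f = expected.pop(), observed.pop()
        expected[-1] += last_e
        observed[-1] += last_f

    chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))  # Step 5
    df = len(expected) - 1 - 1             # c - m - 1 with m = 1
    return lam, expected, chi2, df


# Hypothetical tally (Step 1) for X = 0, 1, 2, 3, 4, 5 with n = 100.
freq = [18, 28, 25, 16, 8, 5]
lam_hat, expected, chi2_calc, df = poisson_gof(freq)
print(f"lambda-hat = {lam_hat:.2f}, df = {df}, chi2_calc = {chi2_calc:.3f}")
```

Step 6 then compares the statistic with the critical value for `df` degrees of freedom. Note that `df` is computed after any classes have been combined, since $c$ counts the classes actually used in the test.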

Normal Chi-Square Goodness-of-Fit Test

Normal Data-Generating Situations

• Two parameters, $\mu$ and $\sigma$, fully describe the normal distribution.

• Unless $\mu$ and $\sigma$ are known a priori, they must be estimated from a sample by using $\bar{x}$ and $s$.

• Using these statistics, the chi-square goodness-of-fit test can be used.

Normal Chi-Square Goodness-of-Fit Test

Method 1: Standardizing the Data

• Transform the sample observations $x_1, x_2, \ldots, x_n$ into standardized values.

• Count the sample observations $f_j$ within intervals of the form $\bar{x} \pm ks$ and compare them with the known frequencies $e_j$ based on the normal distribution.

Normal Chi-Square Goodness-of-Fit Test

Method 1: Standardizing the Data

• Advantage is a standardized scale.

• Disadvantage is that data are no longer in the original units.

[Figure 15.14]

Normal Chi-Square Goodness-of-Fit Test

Method 2: Equal Bin Widths

• To obtain equal-width bins, divide the exact data range into $c$ groups of equal width.

Step 1: Count the sample observations in each bin to get observed frequencies $f_j$.

Step 2: Convert the bin limits into standardized $z$-values by using the formula $z = (x - \bar{x})/s$.

Normal Chi-Square Goodness-of-Fit Test

Method 2: Equal Bin Widths

Step 3: Find the normal area within each bin assuming a normal distribution.

Step 4: Find expected frequencies $e_j$ by multiplying each normal area by the sample size $n$.

• Classes may need to be collapsed from the ends inward to enlarge expected frequencies.
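Steps 1 through 4 of the equal-bin-width method can be sketched with Python's standard library. The sample, the choice of $c = 4$, and the function name `normal_gof_equal_width` are hypothetical; standardizing each bin limit as $(x - \bar{x})/s$ is assumed here as the usual way to obtain $z$-values.

```python
from statistics import NormalDist, mean, stdev


def normal_gof_equal_width(data, c):
    """Return (observed, expected) frequencies for c equal-width bins."""
    n = len(data)
    xbar, s = mean(data), stdev(data)          # estimates of mu and sigma
    lo, hi = min(data), max(data)
    width = (hi - lo) / c                      # exact data range split into c bins
    limits = [lo + j * width for j in range(c + 1)]

    # Step 1: observed frequencies (the maximum goes in the last bin).
    observed = [0] * c
    for x in data:
        observed[min(int((x - lo) / width), c - 1)] += 1

    # Step 2: standardize the bin limits.
    z = [(limit - xbar) / s for limit in limits]

    # Step 3: normal area within each bin.
    std_normal = NormalDist()
    areas = [std_normal.cdf(z[j + 1]) - std_normal.cdf(z[j]) for j in range(c)]

    # Step 4: expected frequencies.
    expected = [n * area for area in areas]
    return observed, expected


# Hypothetical sample of n = 30 measurements.
data = [52, 55, 57, 58, 60, 61, 61, 62, 63, 63, 64, 64, 65, 65, 65,
        66, 66, 67, 67, 68, 68, 69, 70, 70, 71, 72, 73, 75, 77, 80]

observed, expected = normal_gof_equal_width(data, c=4)
print(observed)
print([round(e, 2) for e in expected])
```

Because the bins cover only the observed data range, the expected frequencies sum to slightly less than $n$; the normal area in the two tails beyond the smallest and largest observations is left out in this sketch.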

Normal Chi-Square Goodness-of-Fit Test

Method 3: Equal Expected Frequencies

• Define histogram bins in such a way that an equal number of observations would be expected within each bin under the null hypothesis.

• Define bin limits so that $e_j = n/c$.

• A normal area of $1/c$ in each of the $c$ bins is desired.

• The first and last classes must be open-ended for a normal distribution, so to define $c$ bins, we need $c - 1$ cutpoints.

Normal Chi-Square Goodness-of-Fit Test

Method 3: Equal Expected Frequencies

• The upper limit of bin $j$ can be found directly by using Excel.

• Alternatively, find $z_j$ for bin $j$ using Excel and then calculate the upper limit for bin $j$ as $\bar{x} + z_j s$.

• Once the bins are defined, count the observations $f_j$ within each bin and compare them with the expected frequencies $e_j = n/c$.

Normal Chi-Square Goodness-of-Fit Test

Method 3: Equal Expected Frequencies

• Standard normal cutpoints for equal area bins.

[Table 15.16]


Normal Chi-Square Goodness-of-Fit Test

Histograms

• The fitted normal histogram gives visual clues as to the likely outcome of the GOF test.

• Histograms reveal any outliers or other non-normality issues.

• Further tests are needed since histograms vary.

Figure 15.15


Normal Chi-Square Goodness-of-Fit Test

Critical Values for Normal GOF Test

• Since two parameters, μ and σ, are estimated from the sample, the degrees of freedom are ν = c – m – 1 with m = 2, that is, ν = c – 3.

• At least 4 bins are needed to ensure 1 df.

Table 15.19
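The degrees-of-freedom rule c – m – 1 and an upper-tail chi-square probability can be sketched without any statistics package. The helper names normal_gof_df and chi2_sf are hypothetical; the tail probability uses the standard recurrence for the regularized upper incomplete gamma function and is meant only for integer degrees of freedom.

```python
import math

def normal_gof_df(c, m=2):
    """Degrees of freedom c - m - 1 for a GOF test with c bins and
    m estimated parameters (m = 2 for the normal: mean and std. deviation)."""
    df = c - m - 1
    if df < 1:
        raise ValueError("need at least m + 2 bins to have 1 degree of freedom")
    return df

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # Q(1/2, h) = erfc(sqrt(h))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

df = normal_gof_df(c=8)        # 8 bins, 2 estimated parameters
p_value = chi2_sf(7.2, df)     # 7.2 is a hypothetical test statistic
print(df, p_value)
```

A p-value below the chosen level of significance would lead to rejecting the hypothesis of normality, which is the same decision as comparing the test statistic with the tabled critical value.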


ECDF Tests

Kolmogorov-Smirnov and Lilliefors Tests

• There are many alternatives to the chi-square test based on the Empirical Cumulative Distribution Function (ECDF).

• The Kolmogorov-Smirnov (K-S) test statistic D is the largest absolute difference between the actual and expected cumulative relative frequency of the n data values:

D = Max |F_a – F_e|

• The K-S test is not recommended for grouped data.


ECDF Tests

Kolmogorov-Smirnov and Lilliefors Tests

• F_a is the actual cumulative frequency at observation i.

• F_e is the expected cumulative frequency at observation i under the assumption that the data came from the hypothesized distribution.

• The K-S test assumes that no parameters are estimated.

• If parameters are estimated, use a Lilliefors test.

• Both of these tests are done by computer.
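A minimal sketch of how the statistic D = Max |F_a – F_e| might be computed for a fully specified hypothesized distribution (no estimated parameters) follows. The function name ks_statistic and the three data values are hypothetical. Because the ECDF jumps at each observation, the sketch checks the gap on both sides of every jump, which is the usual computational form.

```python
def ks_statistic(data, cdf):
    """Largest absolute gap between the ECDF of data and a hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f_e = cdf(x)  # expected cumulative relative frequency at x
        # The ECDF steps from (i - 1)/n to i/n at x, so check both sides
        d = max(d, i / n - f_e, f_e - (i - 1) / n)
    return d

# Hypothetical data tested against a uniform distribution on [0, 1]
d_uniform = ks_statistic([0.1, 0.4, 0.7], cdf=lambda x: x)
print(d_uniform)
```

The sketch returns only D. Judging significance still requires K-S critical values, or Lilliefors critical values when parameters are estimated from the sample.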


ECDF Tests

Kolmogorov-Smirnov and Lilliefors Tests

K-S test for uniformity.

Figure 15.20


ECDF Tests

Kolmogorov-Smirnov and Lilliefors Tests

K-S test for normality.

Figure 15.21


ECDF Tests

Anderson-Darling Tests

• The Anderson-Darling (A-D) test is widely used for non-normality because of its power.

• The A-D test is based on a probability plot.

• When the data fit the hypothesized distribution closely, the probability plot will be close to a straight line.

• The A-D test statistic measures the overall distance between the actual and the hypothesized distributions, using a weighted squared distance.
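For readers who want to see the weighted squared distance in concrete terms, the sketch below computes the A-D statistic using its standard computational formula, with the normal distribution fitted from the sample mean and standard deviation. The function name anderson_darling_normal and both samples are hypothetical. The sketch returns only the statistic, without the critical values or p-value that statistical software reports.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(data):
    """A-D statistic for normality, with mean and std. deviation estimated."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    cdf = [fitted.cdf(x) for x in xs]
    total = 0.0
    for i in range(1, n + 1):
        # Pair the i-th smallest value with the i-th largest value
        total += (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
    return -n - total / n

# Hypothetical samples: one built from normal quantiles, one right-skewed
symmetric_sample = [NormalDist().inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
skewed_sample = [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 10, 15, 20, 30, 50, 80, 120]
a_symmetric = anderson_darling_normal(symmetric_sample)
a_skewed = anderson_darling_normal(skewed_sample)
print(a_symmetric, a_skewed)
```

Larger values of the statistic indicate a poorer fit, so the right-skewed sample should score higher than the sample built from normal quantiles.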


ECDF Tests

Anderson-Darling Tests with MINITAB

Figure 15.22


Applied Statistics in Business & Economics

End of Chapter 15
