Questions of goodness of fit

Post on 25-May-2015

Decision-Based Learning - Questions of goodness of fit

Is the problem you are working on a question of Goodness of Fit?

Questions of Goodness of Fit have become increasingly important in modern statistics.

Goodness of fit is a method used to determine how closely a hypothesized pattern fits an observed pattern.

Hypothesized Pattern: the way you think things are.

Observed Pattern: the way things actually are.

For example, let’s say we hypothesize that there are as many females as males in the town of Solvang, California.

So, in a sample of 200 Solvangans we would hypothesize that 100 would be female. That is because we assume that half of the sample will be male and half will be female.

We then take a sample of 200 and find that there are actually 84 females.

Once again, our hypothesized number of females from a sample of 200 is 100.

But our actual number of females from the sample of 200 is 84.

Is the difference between 100 and 84 statistically significant?

Note: Even though we are using the word difference here, in this case we are referring to how well the data FITS the hypothesis.

The HYPOTHESIZED number of females in a sample of 200 is 100.

The ACTUAL number of females in a sample of 200 is 84.

The gap between the two is 16.

This value can be tested statistically to determine its Goodness of Fit.

If it is significantly different, then we may need to collect a new sample that is more representative of the hypothesized population.
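The slides do not name a specific test for this comparison. A common choice for count data is Pearson's chi-square goodness-of-fit test, and the sketch below applies it to the Solvang sample under that assumption. The function name `chi_square_statistic` is made up for illustration, and the 116 males are implied by the sample of 200 rather than stated in the slides.

```python
import math

def chi_square_statistic(observed, expected):
    """Pearson's chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Solvang sample of 200: 84 females were counted, so 116 males are implied.
observed = [84, 116]    # females, males actually counted
expected = [100, 100]   # females, males under the 50/50 hypothesis

stat = chi_square_statistic(observed, expected)

# Two categories leave 1 degree of freedom. For a chi-square variable with
# 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

Under these assumptions the statistic is 5.12 and the p-value is roughly 0.024, which falls below the conventional 0.05 cutoff, so a gap of 16 would usually be called statistically significant.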

Here is an equation that we will use as a guide to identify goodness of fit questions:

Does the Hypothesized Number fit the Actual Number?

Examples of Goodness of Fit Tests

Example #1

Consider a standard package of milk chocolate M&Ms. There are six different colors: red, orange, yellow, green, blue and brown. Suppose that we are curious about the distribution of these colors and ask, do all six colors occur equally? You collect 24 M&Ms with 4 reds, 4 oranges, 3 yellows, 5 greens, 2 blues, and 6 browns. Are these differences statistically significant?

Does the Hypothesized Number fit the Actual Number?

Hypothesized Number = 4 red, 4 orange, 4 yellow, 4 green, 4 blue, 4 brown

Actual Number = 4 red, 4 orange, 3 yellow, 5 green, 2 blue, 6 brown
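As a rough illustration of how the M&M counts could be checked, the sketch below computes a Pearson chi-square statistic and compares it with the standard 5% critical value for 5 degrees of freedom (11.070). The choice of test, the 5% level, and the helper name `chi_square_statistic` are assumptions, not something the slides specify. Note also that an expected count of 4 per color is below the common rule of thumb of at least 5 per category, so the result should be read cautiously.

```python
def chi_square_statistic(observed, expected):
    """Pearson's chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

colors = ["red", "orange", "yellow", "green", "blue", "brown"]
observed = [4, 4, 3, 5, 2, 6]                  # the 24 M&Ms collected

# Hypothesis: all six colors occur equally, so 24 / 6 = 4 of each.
expected = [sum(observed) / len(colors)] * len(colors)

stat = chi_square_statistic(observed, expected)
df = len(colors) - 1                           # 6 categories -> 5 degrees of freedom

# Upper 5% critical value of the chi-square distribution with 5 df.
CRITICAL_05_DF5 = 11.070
consistent_with_equal_colors = stat <= CRITICAL_05_DF5

print(f"chi-square = {stat:.2f} on {df} df")
print("consistent with equal colors at the 5% level:", consistent_with_equal_colors)
```

The statistic works out to 2.5, well under the critical value, so these 24 candies give no evidence against equal color frequencies.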

Example #2

Absenteeism of college students from math classes is a major concern to math instructors because missing class appears to increase the drop rate. Suppose that a study was done to determine if the actual student absenteeism follows faculty perception. The faculty expected that a group of 100 students would miss class according to the following chart. Here were actual results of a random sample.

# of absences    EXPECTED # of Students    ACTUAL # of Students

0-2              50                        35

3-5              30                        40

6-8              12                        20

9-11             6                         1

12+              2                         4

Did the faculty perception fit the reality?

Does the Faculty Perception of Student Absenteeism fit the Actual Student Absenteeism?
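One way to put a number on this comparison is sketched below. It assumes a Pearson chi-square goodness-of-fit test at the 5% level (critical value 9.488 for 4 degrees of freedom), which the slides do not prescribe, and it prints each category's contribution so the largest mismatches are visible. The expected count of 2 in the 12+ category is small, so the approximation is rough.

```python
categories = ["0-2", "3-5", "6-8", "9-11", "12+"]
expected = [50, 30, 12, 6, 2]   # faculty perception for 100 students
observed = [35, 40, 20, 1, 4]   # actual results of the random sample

# Each category contributes (O - E)^2 / E to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
stat = sum(contributions)
df = len(categories) - 1        # 5 categories -> 4 degrees of freedom

# Upper 5% critical value of the chi-square distribution with 4 df.
CRITICAL_05_DF4 = 9.488
perception_fits = stat <= CRITICAL_05_DF4

for label, part in zip(categories, contributions):
    print(f"{label:>5} absences: contribution {part:.2f}")
print(f"chi-square = {stat:.2f} on {df} df, fits at 5% level: {perception_fits}")
```

The total comes to about 19.33, above the critical value, so under these assumptions the faculty perception does not fit the observed absenteeism.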

An exception to the rule

As was just shown, if you are comparing an observed count with a hypothesized count, then you will use goodness of fit statistical methods.

Hypothesized Count = 100

Actual Count = 84

However, if you are comparing a hypothesized proportion (5 out of 10) or percentage (50%) with an actual proportion or percentage, then you will use a “Difference” method.

Hypothesized Percentage = 50%

Actual Percentage = 42%

Here are the two classifications with their equations:

Question of Goodness of Fit: Does the Hypothesized Number fit the Actual Number?

Question of Difference: Does the Hypothesized Percentage or Proportion differ from the Actual Percentage or Proportion?

Let’s see an example:

You have been asked to determine if a sample is representative of the general population in terms of gender. Since there are roughly equal numbers of men and women in the population, your sample of 500 should have 250 females. However, in your sample there are 325 females. How well does your sample count of 325 females fit this hypothesized expectation statistically?

Since this question is dealing with number counts, it will be classified as a Goodness of Fit Question.

Now let’s see the same question but as a “difference” question.

You have been asked to determine if a sample is representative of the general population in terms of gender. Since there are roughly equal numbers of men and women in the population, your sample should have 50% females. However, in your sample there are 65% females. How much does your sample of 65% differ from the hypothesized expectation of 50% statistically?

Since this question is dealing with percentages or proportions, it will be classified as a Difference Question.
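The slides do not say which “Difference” method to use. One standard candidate is the one-sample z-test for a proportion, sketched below. The percentage version of the question states no sample size, so n = 500 is borrowed from the count version as an assumption, and the function name `one_proportion_z` is hypothetical.

```python
import math

def one_proportion_z(p_hat, p0, n):
    """z statistic for an observed proportion p_hat against a hypothesized p0."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

n = 500        # assumption: taken from the count version of the question
p0 = 0.50      # hypothesized proportion of females
p_hat = 0.65   # observed proportion of females

z = one_proportion_z(p_hat, p0, n)

# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, two-sided p = {p_value:.2e}")
```

With n = 500, 65% corresponds to the 325 females of the count version, and z squared (about 45) matches the chi-square statistic computed from the counts 325 and 175 against 250 and 250, so the two framings describe the same evidence.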

Examine the question or problem you are working on.

Is it a question of goodness of fit?

If so, select GOODNESS OF FIT
