Chapter 19 Analysis of Variance (ANOVA)

Jan 03, 2016

jerry-henry

Transcript
Page 1: Chapter 19 Analysis of Variance (ANOVA)

Chapter 19

Analysis of Variance (ANOVA)

Page 2: Chapter 19 Analysis of Variance (ANOVA)

ANOVA

• How to test a null hypothesis that the means of more than two populations are equal.

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: Not all three population means are equal

• Test hypothesis with ANOVA procedure (Analysis of variance)

• ANOVA tests use the F distribution

Page 3: Chapter 19 Analysis of Variance (ANOVA)

F Distribution

• The F distribution has two degrees-of-freedom (df) parameters: numerator and denominator.
  – EXAMPLE: df = (8, 14)

• A change in the numerator df has a greater effect on the shape of the distribution.

• Properties:
  – Continuous and skewed to the right
  – Has two df numbers
  – Takes only nonnegative values

Page 4: Chapter 19 Analysis of Variance (ANOVA)

Finding the F value: Example 19.1

SITUATION: Find the F value for 8 degrees of freedom for the numerator, 14 degrees of freedom for the denominator and .05 area in the right tail of the F curve.

• Consult Table V of Appendix A
  – Use the part of the table corresponding to .05 area.
  – Locate the numerator df on the top row and the denominator df along the left.
  – Find where they intersect.

• This gives the critical value of F.

• Excel: FDIST(x, df1, df2), FINV(prob., df1, df2)
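The table lookup can be cross-checked numerically. The sketch below uses only the Python standard library: it integrates the F density with Simpson's rule and bisects for the value that leaves .05 in the right tail. The helper names (`f_pdf`, `f_cdf`, `f_critical`), the step count and the search bracket are illustrative assumptions, not part of the slides, and the integration as written is only suitable when the numerator df is greater than 2. In practice a statistics library (for example `scipy.stats.f.ppf`) would normally be used instead.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                      - (d1 + d2) * math.log(d1 * x + d2))
               - math.log(x) - log_beta)
    return math.exp(log_pdf)

def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by Simpson's rule on [0, x]; assumes d1 > 2 and an even step count."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3

def f_critical(alpha, d1, d2):
    """Value of F with right-tail area alpha, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0          # assumed bracket, wide enough for this example
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_cdf(mid, d1, d2) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example 19.1: 8 numerator df, 14 denominator df, .05 in the right tail
print(round(f_critical(0.05, 8, 14), 2))
```

The printed value should agree with the table entry of about 2.70 for df = (8, 14).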

Page 5: Chapter 19 Analysis of Variance (ANOVA)

Assumptions in ANOVA

To test $H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: Not all three population means are equal

– The following must be true:
  • The populations from which the samples are drawn are normally distributed.
  • The populations from which the samples are drawn have the same variance (or standard deviation).
  • The samples drawn from the different populations are random and independent.

Page 6: Chapter 19 Analysis of Variance (ANOVA)

How does ANOVA work?

• The purpose of ANOVA is to test differences in means (for groups or variables) for statistical significance.

• It does so by partitioning the total variation into the component due to random error (within groups, SSE) and the component due to differences between groups (SSG).

• The between-group component is then tested for statistical significance by comparing MSG with MSE through the F statistic; if it is significant, the null hypothesis of no differences between means is rejected.

• The test is always right-tailed: the rejection region lies in the right tail of the F distribution.

Page 7: Chapter 19 Analysis of Variance (ANOVA)

Types of ANOVA

• One-way ANOVA: only one factor is considered

• Two-way ANOVA
  – Asks whether the two categorical variables (factors) act together to affect the averages for the various groups.
  – If the two factors do not act together, does at least one of them affect the averages for the various groups?

• N-way ANOVA
  – Looks for interaction among multiple factors.
  – Requires more data.
  – Always right-tailed, with the rejection region in the right tail.

Page 8: Chapter 19 Analysis of Variance (ANOVA)

ANOVA Notation and Formulas

• $\bar{x}_i$ = sample mean for group (or treatment) i

• k = the number of groups (or treatments)

• $n_i$ = sample size of group i

• $\bar{x}$ = the average (the grand mean) of all of the observations in all groups

• n = sum of the k sample sizes = $n_1 + n_2 + n_3 + \dots + n_k$

• $s_i^2$ = the sample variance for group (or treatment) i

Page 9: Chapter 19 Analysis of Variance (ANOVA)

MSG and MSE

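• Sum of squares for groups (SSG)

$$SSG = n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \dots + n_k(\bar{x}_k - \bar{x})^2$$

• Mean squares for groups (MSG)

$$MSG = \frac{SSG}{k - 1}$$

• Sum of squared error (SSE)

$$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \dots + (n_k - 1)s_k^2$$

• Mean squared error (MSE)

$$MSE = \frac{SSE}{n - k}$$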
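These four formulas need only the group sizes, sample means and sample variances. A minimal sketch, assuming made-up summary statistics for three small groups (the function name `anova_from_summary` and the numbers are illustrative, not from the slides):

```python
def anova_from_summary(sizes, means, variances):
    """Return (SSG, SSE, MSG, MSE) from per-group n_i, sample means and sample variances."""
    k = len(sizes)
    n = sum(sizes)
    # The grand mean is the size-weighted average of the group means.
    grand_mean = sum(n_i * m_i for n_i, m_i in zip(sizes, means)) / n
    ssg = sum(n_i * (m_i - grand_mean) ** 2 for n_i, m_i in zip(sizes, means))
    sse = sum((n_i - 1) * v_i for n_i, v_i in zip(sizes, variances))
    return ssg, sse, ssg / (k - 1), sse / (n - k)

# Made-up summary statistics: three groups of three observations each.
ssg, sse, msg, mse = anova_from_summary([3, 3, 3], [2.0, 3.0, 7.0], [1.0, 1.0, 1.0])
print(ssg, sse, msg, mse)
```

With these inputs the grand mean is 4, so SSG = 3(4 + 1 + 9) = 42 and SSE = 2(1 + 1 + 1) = 6, giving MSG = 21 and MSE = 1.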

Page 10: Chapter 19 Analysis of Variance (ANOVA)

SST and relationship among the SSs

• Total sum of squares (SST)

$$SST = \sum (x - \bar{x})^2$$

  – SST is the numerator when calculating the sample variance
  – Does not include a group distinction
  – Dividing SST by its df gives the sample variance

• Relationships

SSG + SSE = SST

|    | Groups | Error | Total |
|----|--------|-------|-------|
| df | k-1    | n-k   | n-1   |
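The partition can be checked directly on raw observations. The following sketch uses made-up data (three groups of three values, chosen only for easy arithmetic) and computes each sum of squares from its definition:

```python
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]   # made-up observations

all_values = [x for g in groups for x in g]
n, k = len(all_values), len(groups)
grand_mean = sum(all_values) / n

# Total, between-group and within-group sums of squares, each from its definition
sst = sum((x - grand_mean) ** 2 for x in all_values)
ssg = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(ssg, sse, sst)          # the two components add up to the total
print(k - 1, n - k, n - 1)    # the degrees of freedom add up the same way
```

Here SSG = 42 and SSE = 6 sum to SST = 48, and the degrees of freedom 2 and 6 sum to 8.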

Page 11: Chapter 19 Analysis of Variance (ANOVA)

ANOVA Tables

• It is common practice to report results using an ANOVA table:

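| Source | Sum of Squares | df  | Mean Square       | F              | P       |
|--------|----------------|-----|-------------------|----------------|---------|
| Groups | SSG            | k-1 | MSG = SSG / (k-1) | F0 = MSG / MSE | P-value |
| Error  | SSE            | n-k | MSE = SSE / (n-k) |                |         |
| TOTAL  | SST            | n-1 |                   |                |         |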

Page 12: Chapter 19 Analysis of Variance (ANOVA)

ANOVA process by hand: Example 19.2

SITUATION: A soap manufacturer wants to test 3 new machines that fill jugs. Each machine was run for 5 hours, and the number of jugs it filled in each hour was recorded:

| Machine 1 | Machine 2 | Machine 3 |
|-----------|-----------|-----------|
| 54        | 53        | 49        |
| 49        | 56        | 53        |
| 52        | 57        | 47        |
| 55        | 51        | 50        |
| 48        | 59        | 54        |

– At the 10% significance level, can we reject the null hypothesis that the mean number of jugs filled per hour is the same for each machine?

• k = 3

• $n_1 = n_2 = n_3 = 5$

continued….

Page 13: Chapter 19 Analysis of Variance (ANOVA)

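ANOVA process by hand: Example 19.2 continued

• We now need to calculate the ANOVA table

• For machine 1:

$$\bar{x}_1 = \frac{\sum x}{n_1} = \frac{54 + 49 + 52 + 55 + 48}{5} = 51.6$$

$$s_1^2 = \frac{\sum (x - \bar{x}_1)^2}{n_1 - 1} = 9.3$$

• Now do the same for machines 2 and 3:

$$n_2 = 5, \quad \bar{x}_2 = 55.2, \quad s_2^2 = 10.2$$

$$n_3 = 5, \quad \bar{x}_3 = 50.6, \quad s_3^2 = 8.3$$

• Then for machines 1-3 combined:

$$n = 15, \quad k = 3, \quad \bar{x} = 52.4667, \quad \sum (x - \bar{x})^2 = 169.7333$$

The value 169.7333 is the total sum of squared deviations (SST); dividing it by n - 1 = 14 gives the combined sample variance, $s^2 \approx 12.1238$.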
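The summary statistics for Example 19.2 can be reproduced with Python's standard `statistics` module. This is a sketch for checking the hand calculation; the variable names are illustrative, and the data are the jug counts recorded for the three machines.

```python
from statistics import mean, variance

machines = {
    "Machine 1": [54, 49, 52, 55, 48],
    "Machine 2": [53, 56, 57, 51, 59],
    "Machine 3": [49, 53, 47, 50, 54],
}

for name, jugs in machines.items():
    # statistics.variance divides by n - 1, i.e. it is the sample variance
    print(name, len(jugs), round(mean(jugs), 4), round(variance(jugs), 4))

# All 15 observations combined, ignoring the group distinction
combined = [x for jugs in machines.values() for x in jugs]
grand_mean = mean(combined)
total_ss = sum((x - grand_mean) ** 2 for x in combined)
print(len(combined), round(grand_mean, 4), round(total_ss, 4), round(variance(combined), 4))
```

The group means and variances should come out as 51.6 / 9.3, 55.2 / 10.2 and 50.6 / 8.3, with a grand mean of about 52.4667 and a total sum of squares of about 169.7333.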

Page 14: Chapter 19 Analysis of Variance (ANOVA)

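ANOVA process by hand: Example 19.2 continued

• Then we can calculate SSG, SSE and SST:

$$SSG = n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + n_3(\bar{x}_3 - \bar{x})^2$$

$$SSG = 5(51.6 - 52.4667)^2 + 5(55.2 - 52.4667)^2 + 5(50.6 - 52.4667)^2 = 58.5335$$

$$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2$$

$$SSE = (5 - 1)9.3 + (5 - 1)10.2 + (5 - 1)8.3 = 111.2$$

SST = SSG + SSE = 58.5335 + 111.2 = 169.7335

(Without rounding the grand mean, SSG = 58.5333 and SST = 169.7333, the values reported by the software output.)

• Degrees of freedom
  – Group df = k-1 = 3-1 = 2
  – Error df = n-k = 15-3 = 12
  – Total df = n-1 = 15-1 = 14

continued….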

Page 15: Chapter 19 Analysis of Variance (ANOVA)

ANOVA process by hand: Example 19.2 continued

• Now calculate MSG, MSE, and F:

$$MSG = \frac{SSG}{k - 1} = \frac{58.5335}{2} = 29.26675$$

$$MSE = \frac{SSE}{n - k} = \frac{111.2}{12} = 9.26667$$

$$F_0 = \frac{MSG}{MSE} = \frac{29.26675}{9.26667} = 3.1583$$

• Determine whether the assumption that the three populations have the same variance is valid. The assumption is reasonable if:

$$\frac{\max(s_i)}{\min(s_i)} < 2$$

• Now look in Table V of Appendix A. Use numerator df = 2, denominator df = 12 ….

continued….
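The whole hand calculation for Example 19.2 can be sketched in a few lines. Because the numerator df is 2 here, the right-tail probability of the F distribution has a simple closed form, $P(F > x) = (1 + 2x/d_2)^{-d_2/2}$, which the sketch uses for the P-value and the critical value; that shortcut is an assumption of this example and does not hold for other numerator df, where a table or a statistics library is needed.

```python
from statistics import mean, variance

machines = [
    [54, 49, 52, 55, 48],   # Machine 1
    [53, 56, 57, 51, 59],   # Machine 2
    [49, 53, 47, 50, 54],   # Machine 3
]
alpha = 0.10

k = len(machines)
n = sum(len(m) for m in machines)
grand_mean = mean([x for m in machines for x in m])

ssg = sum(len(m) * (mean(m) - grand_mean) ** 2 for m in machines)
sse = sum((len(m) - 1) * variance(m) for m in machines)
msg, mse = ssg / (k - 1), sse / (n - k)
f0 = msg / mse

# Rule-of-thumb check on equal variances: ratio of largest to smallest standard deviation
sds = [variance(m) ** 0.5 for m in machines]
sd_ratio = max(sds) / min(sds)

# Closed forms valid ONLY for numerator df = 2 (k = 3 groups)
d2 = n - k
p_value = (1 + 2 * f0 / d2) ** (-d2 / 2)
f_crit = (d2 / 2) * (alpha ** (-2 / d2) - 1)

print(round(f0, 4), round(p_value, 4), round(f_crit, 4), round(sd_ratio, 3))
print("reject H0" if f0 > f_crit else "do not reject H0")
```

The statistic is about 3.158 against a 10% critical value of about 2.807 (P-value about 0.079), so the null hypothesis of equal means is rejected at the 10% level; the standard-deviation ratio of about 1.11 is well below 2.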

Page 16: Chapter 19 Analysis of Variance (ANOVA)

Example 19.2 ANOVA tables

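• Enter the calculation results in the table below:

| Source | Sum of Squares | df  | Mean Square       | F              | P       |
|--------|----------------|-----|-------------------|----------------|---------|
| Groups | SSG            | k-1 | MSG = SSG / (k-1) | F0 = MSG / MSE | P-value |
| Error  | SSE            | n-k | MSE = SSE / (n-k) |                |         |
| TOTAL  | SST            | n-1 |                   |                |         |

• Do we reject the null hypothesis?

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: Not all three population means are equal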

Page 17: Chapter 19 Analysis of Variance (ANOVA)

Example 19.2 by Excel

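Anova: Single Factor

SUMMARY

| Groups | Count | Sum | Average | Variance |
|--------|-------|-----|---------|----------|
| M1     | 5     | 258 | 51.6    | 9.3      |
| M2     | 5     | 276 | 55.2    | 10.2     |
| M3     | 5     | 253 | 50.6    | 8.3      |

ANOVA

| Source of Variation | SS       | df | MS       | F        | P-value  | F crit   |
|---------------------|----------|----|----------|----------|----------|----------|
| Between Groups      | 58.53333 | 2  | 29.26667 | 3.158273 | 0.079073 | 2.806796 |
| Within Groups       | 111.2    | 12 | 9.266667 |          |          |          |
| Total               | 169.7333 | 14 |          |          |          |          |

Input data:

| M1 | M2 | M3 |
|----|----|----|
| 54 | 53 | 49 |
| 49 | 56 | 53 |
| 52 | 57 | 47 |
| 55 | 51 | 50 |
| 48 | 59 | 54 |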

Page 18: Chapter 19 Analysis of Variance (ANOVA)

Example 19.2 by Minitab

One-way ANOVA: P versus M

| Source | DF | SS     | MS    | F    | P     |
|--------|----|--------|-------|------|-------|
| M      | 2  | 58.53  | 29.27 | 3.16 | 0.079 |
| Error  | 12 | 111.20 | 9.27  |      |       |
| Total  | 14 | 169.73 |       |      |       |

S = 3.044   R-Sq = 34.49%   R-Sq(adj) = 23.57%

Individual 95% CIs For Mean Based on Pooled StDev

| Level | N | Mean   | StDev |
|-------|---|--------|-------|
| M1    | 5 | 51.600 | 3.050 |
| M2    | 5 | 55.200 | 3.194 |
| M3    | 5 | 50.600 | 2.881 |

(Interval plot: one 95% CI per level, axis marked 48.0, 51.0, 54.0, 57.0.)

Pooled StDev = 3.044
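The three summary figures in the Minitab output follow from the sums of squares already computed. A short sketch, assuming the usual definitions (S is the square root of MSE, R-Sq is SSG/SST, and adjusted R-Sq replaces the sums of squares with mean squares):

```python
ssg, sse, sst = 58.5333, 111.2, 169.7333   # sums of squares from the ANOVA table
n, k = 15, 3

mse = sse / (n - k)
s = mse ** 0.5                              # "S", also reported as the pooled StDev
r_sq = ssg / sst                            # share of total variation explained by the groups
r_sq_adj = 1 - mse / (sst / (n - 1))        # same idea, using mean squares instead of sums

print(round(s, 3), round(100 * r_sq, 2), round(100 * r_sq_adj, 2))
```

These should reproduce S = 3.044, R-Sq = 34.49% and R-Sq(adj) = 23.57%.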

Page 19: Chapter 19 Analysis of Variance (ANOVA)

Example 19.3 by Excel

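Anova: Single Factor

SUMMARY

| Groups | Count | Sum | Average | Variance |
|--------|-------|-----|---------|----------|
| A      | 5     | 108 | 21.6    | 11.3     |
| B      | 6     | 87  | 14.5    | 7.5      |
| C      | 6     | 93  | 15.5    | 13.1     |
| D      | 5     | 110 | 22      | 8.5      |

ANOVA

| Source of Variation | SS       | df | MS       | F        | P-value  | F crit   |
|---------------------|----------|----|----------|----------|----------|----------|
| Between Groups      | 255.6182 | 3  | 85.20606 | 8.417723 | 0.001043 | 2.416005 |
| Within Groups       | 182.2    | 18 | 10.12222 |          |          |          |
| Total               | 437.8182 | 21 |          |          |          |          |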

Page 20: Chapter 19 Analysis of Variance (ANOVA)

Example 19.3 by Minitab

One-way ANOVA: Cus. versus Teller

| Source | DF | SS    | MS   | F    | P     |
|--------|----|-------|------|------|-------|
| Teller | 3  | 255.6 | 85.2 | 8.42 | 0.001 |
| Error  | 18 | 182.2 | 10.1 |      |       |
| Total  | 21 | 437.8 |      |      |       |

S = 3.182   R-Sq = 58.38%   R-Sq(adj) = 51.45%

Individual 95% CIs For Mean Based on Pooled StDev

| Level | N | Mean   | StDev |
|-------|---|--------|-------|
| A     | 5 | 21.600 | 3.362 |
| B     | 6 | 14.500 | 2.739 |
| C     | 6 | 15.500 | 3.619 |
| D     | 5 | 22.000 | 2.915 |

(Interval plot: one 95% CI per level, axis marked 14.0, 17.5, 21.0, 24.5.)

Pooled StDev = 3.182

Page 21: Chapter 19 Analysis of Variance (ANOVA)

Pairwise Comparisons

• When ANOVA rejects the null hypothesis, it does not identify which group means are significantly different.

• Most software packages include this comparison.
  – Calculate a confidence interval for the difference of each unique pair of means.
  – Check whether zero falls in the interval; if it does not, the two means are significantly different.

Page 22: Chapter 19 Analysis of Variance (ANOVA)

Example 19.3 by Minitab

Fisher 95% Individual Confidence Intervals
All Pairwise Comparisons
Simultaneous confidence level = 80.96%

A subtracted from:

| Level | Lower   | Center | Upper  |
|-------|---------|--------|--------|
| B     | -11.147 | -7.100 | -3.053 |
| C     | -10.147 | -6.100 | -2.053 |
| D     | -3.827  | 0.400  | 4.627  |

B subtracted from:

| Level | Lower  | Center | Upper  |
|-------|--------|--------|--------|
| C     | -2.859 | 1.000  | 4.859  |
| D     | 3.453  | 7.500  | 11.547 |

C subtracted from:

| Level | Lower | Center | Upper  |
|-------|-------|--------|--------|
| D     | 2.453 | 6.500  | 10.547 |

(Interval plots: one interval per comparison, axis marked -6.0, 0.0, 6.0, 12.0.)

Page 23: Chapter 19 Analysis of Variance (ANOVA)

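Pairwise Comparisons: Fisher’s Least Significant Difference (LSD) Method

• Null Hypothesis: $H_0: \mu_i = \mu_j$

• Least Significant Difference (LSD):

$$LSD = t_{\alpha/2,\,N-a}\sqrt{\frac{2\,MS_E}{n}} \quad \text{or} \quad LSD = t_{\alpha/2,\,N-a}\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \quad (\text{if } n_i \neq n_j)$$

Here N is the total number of observations and a is the number of groups (n and k in the earlier notation), so N - a is the error df.

• The pair of means $\mu_i$ and $\mu_j$ is declared significantly different if

$$|\bar{X}_i - \bar{X}_j| > LSD$$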

Page 24: Chapter 19 Analysis of Variance (ANOVA)

Example 19.3 with LSD

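$$LSD = t_{\alpha/2,\,N-a}\sqrt{\frac{2\,MS_E}{n}} \quad \text{or} \quad LSD = t_{\alpha/2,\,N-a}\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \quad (\text{if } n_i \neq n_j)$$

$$|\bar{X}_i - \bar{X}_j| > LSD$$

$$MS_E = 10.1$$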
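A sketch of the LSD comparison for Example 19.3, using the group sizes, means and mean squared error reported in the software output for this example. The critical value $t_{0.025,18} = 2.101$ is taken from a standard t table rather than computed, and the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Summary statistics for Example 19.3 (from the software output for this example)
sizes = {"A": 5, "B": 6, "C": 6, "D": 5}
means = {"A": 21.6, "B": 14.5, "C": 15.5, "D": 22.0}
mse = 10.12222      # MS_E, with N - a = 22 - 4 = 18 error df
t_crit = 2.101      # t_{0.025, 18} from a standard t table (assumed, not computed here)

results = {}
for i, j in combinations(sizes, 2):
    # Unequal-sample-size form of the LSD; it reduces to sqrt(2 MS_E / n) when n_i = n_j
    lsd = t_crit * sqrt(mse * (1 / sizes[i] + 1 / sizes[j]))
    diff = abs(means[i] - means[j])
    results[(i, j)] = (diff, lsd, diff > lsd)
    print(f"{i} vs {j}: |diff| = {diff:.1f}, LSD = {lsd:.3f}, significant = {diff > lsd}")
```

Under these inputs the pairs A-B, A-C, B-D and C-D exceed their LSD, while A-D and B-C do not, which matches the Fisher intervals that exclude or include zero.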

Page 25: Chapter 19 Analysis of Variance (ANOVA)

Example 19.3 with LSD

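$$LSD = t_{\alpha/2,\,N-a}\sqrt{\frac{2\,MS_E}{n}} \quad \text{or} \quad LSD = t_{\alpha/2,\,N-a}\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \quad (\text{if } n_i \neq n_j)$$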

Page 26: Chapter 19 Analysis of Variance (ANOVA)

Welch’s Approach to Heterogeneity of Variance

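• If $\max(s_j^2)/\min(s_j^2) > 2$, the assumption of equal variance cannot be used.

• Welch’s approach modifies the F-test with the following steps:
  – For each sample j, calculate $w_j$:

$$w_j = \frac{n_j}{s_j^2}$$

  – Calculate the summation of w from the k samples:

$$\sum_{j=1}^{k} w_j$$

  – Calculate the weighted average of the sample means:

$$\tilde{X} = \frac{\sum_{j=1}^{k} w_j \bar{X}_j}{\sum_{j=1}^{k} w_j}$$

  – Calculate the test statistic $F_0$ and df:

$$F_0 = \frac{\dfrac{\sum_{j=1}^{k} w_j (\bar{X}_j - \tilde{X})^2}{k - 1}}{1 + \dfrac{2(k - 2)}{k^2 - 1} \sum_{j=1}^{k} \dfrac{1}{n_j - 1}\left(1 - \dfrac{w_j}{\sum_{j=1}^{k} w_j}\right)^2}$$

$$df = \frac{k^2 - 1}{3 \sum_{j=1}^{k} \dfrac{1}{n_j - 1}\left(1 - \dfrac{w_j}{\sum_{j=1}^{k} w_j}\right)^2}$$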
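The steps above can be sketched as a small function working from summary statistics. The function name `welch_anova` is illustrative; the numerator df of k - 1 is the standard choice for Welch's test and is stated here as an assumption, since the slide gives only the denominator df. The call uses the Example 19.2 summary statistics purely as an illustration (their variances are similar enough that Welch's correction is not actually required).

```python
def welch_anova(sizes, means, variances):
    """Return (F0, numerator df, denominator df) for Welch's one-way test."""
    k = len(sizes)
    w = [n_j / v_j for n_j, v_j in zip(sizes, variances)]        # weights w_j = n_j / s_j^2
    w_sum = sum(w)
    weighted_mean = sum(w_j * m_j for w_j, m_j in zip(w, means)) / w_sum
    # Correction term shared by the F0 denominator and the df formula
    lam = sum((1 - w_j / w_sum) ** 2 / (n_j - 1) for w_j, n_j in zip(w, sizes))
    numerator = sum(w_j * (m_j - weighted_mean) ** 2 for w_j, m_j in zip(w, means)) / (k - 1)
    f0 = numerator / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    return f0, k - 1, (k ** 2 - 1) / (3 * lam)

# Example 19.2 summary statistics: n_j, sample means, sample variances
f0, df1, df2 = welch_anova([5, 5, 5], [51.6, 55.2, 50.6], [9.3, 10.2, 8.3])
print(round(f0, 3), df1, round(df2, 2))
```

For these inputs the statistic comes out near 2.83 with df of roughly (2, 8), a little more conservative than the ordinary F test on the same data.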