Analysis of Variance Jigyasu Gaur
Page 1: Anova

Analysis of Variance

Jigyasu Gaur

Page 2: Anova

General ANOVA Setting

- Investigator controls one or more independent variables
  - Called factors (or treatment variables)
  - Each factor contains two or more levels (or groups or categories/classifications)
- Observe effects on the dependent variable
  - Response to levels of independent variable
- Experimental design: the plan used to collect the data

Page 3: Anova

One-Way Analysis of Variance

Evaluate the difference among the means of three or more groups

Example: Performance rates for 1st, 2nd, and 3rd shift of employees in a factory

Assumptions:
- Populations are normally distributed (or the CLT applies)
- Populations have equal variances
- Samples are randomly and independently drawn

Page 4: Anova

Hypotheses of One-Way ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$

All population means are equal, i.e., no treatment effect (no variation in means among groups)

$H_1$: Not all of the population means are the same

At least one population mean is different, i.e., there is a treatment effect. This does not mean that all population means are different (some pairs may be the same)

Page 5: Anova

Why ANOVA?

We could compare the means one pair at a time, using t-tests for the difference of means.

Problem: each test carries its own type I error. The total type I error is $1 - (1 - \alpha)^k$, where k is the number of comparisons (treating the tests as independent).

For example, if there are 5 means and you use $\alpha = .05$, you must make 10 two-by-two comparisons. Thus, the type I error is $1 - (.95)^{10}$, which is about .40. That is, about 40% of the time you will reject the null hypothesis of equal means in favor of the alternative even when the null hypothesis is true!
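The arithmetic behind the total type I error can be checked with a few lines of Python. This is a minimal sketch, assuming the k pairwise tests are independent (an approximation, since pairwise t-tests share data); the function name `familywise_error` is illustrative and not from the slides.

```python
from math import comb

def familywise_error(alpha: float, k: int) -> float:
    """Probability of at least one type I error across k independent tests."""
    return 1.0 - (1.0 - alpha) ** k

num_means = 5
k = comb(num_means, 2)                 # number of two-by-two comparisons
total_error = familywise_error(0.05, k)

print(k, round(total_error, 3))        # 10 0.401
```

With 5 means there are 10 comparisons, and the chance of at least one false rejection is roughly 40%, far above the nominal 5% of a single test.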

Page 6: Anova

One-Way ANOVA

All Means are the same: The Null Hypothesis is True (No Treatment Effect)

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$

$H_1$: Not all $\mu_i$ are the same

[Figure: population distributions with $\mu_1 = \mu_2 = \mu_3$]

Page 7: Anova

One-Way ANOVA

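At least one mean is different: The Null Hypothesis is NOT true (Treatment Effect is present)

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$

$H_1$: Not all $\mu_i$ are the same

[Figure: population distributions with $\mu_1 = \mu_2 \neq \mu_3$ or $\mu_1 \neq \mu_2 \neq \mu_3$]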

Page 8: Anova

Partitioning the Variation

Total variation can be split into two parts:

SST = Total Sum of Squares (Total variation)

SSA = Sum of Squares Among Groups (Between-group variation)

SSW = Sum of Squares Within Groups (Within-group variation)

SST = SSA + SSW

Page 9: Anova

Partitioning the Variation

Total Variation = the aggregate dispersion of the individual data values across the various factor levels (SST)

Within-Group Variation = dispersion that exists among the data values within a particular factor level (SSW)

Between-Group Variation = dispersion between the factor sample means (SSA)

SST = SSA + SSW


Page 10: Anova

Partition of Total Variation

Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Sampling (SSW)

SSA is commonly referred to as:
- Sum of Squares Between
- Sum of Squares Among
- Sum of Squares Explained
- Among Groups Variation

SSW is commonly referred to as:
- Sum of Squares Within
- Sum of Squares Error
- Sum of Squares Unexplained
- Within Groups Variation

Page 11: Anova

Total Sum of Squares

$$\text{SST} = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$

Where:

SST = Total sum of squares

c = number of groups (levels or treatments)

$n_j$ = number of observations in group j

$X_{ij}$ = ith observation from group j

$\bar{\bar{X}}$ = grand mean (mean of all data values)

SST = SSA + SSW

Page 12: Anova

Total Variation

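$$\text{SST} = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$$

[Figure: responses X for Group 1, Group 2, and Group 3, with the grand mean $\bar{\bar{X}}$ marked]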

Page 13: Anova

Among-Group Variation

$$\text{SSA} = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Where:

SSA = Sum of squares among groups

c = number of groups or populations

$n_j$ = sample size from group j

$\bar{X}_j$ = sample mean from group j

$\bar{\bar{X}}$ = grand mean (mean of all data values)

SST = SSA + SSW

Page 14: Anova

Among-Group Variation

Variation Due to Differences Among Groups

$$\text{SSA} = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Mean Square Among = SSA/degrees of freedom:

$$\text{MSA} = \frac{\text{SSA}}{c - 1}$$

Page 15: Anova

Among-Group Variation

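$$\text{SSA} = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c(\bar{X}_c - \bar{\bar{X}})^2$$

[Figure: responses X for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]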

Page 16: Anova

Within-Group Variation

$$\text{SSW} = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:

SSW = Sum of squares within groups

c = number of groups

$n_j$ = sample size from group j

$\bar{X}_j$ = sample mean from group j

$X_{ij}$ = ith observation in group j

SST = SSA + SSW
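The slides state the identity SST = SSA + SSW without proof. A short derivation, added here as a sketch using the same symbols defined above, shows why it holds: each deviation from the grand mean splits into a within-group part and an among-group part, and the cross term vanishes because deviations from a group's own mean sum to zero.

```latex
\begin{aligned}
X_{ij} - \bar{\bar{X}} &= (X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{\bar{X}}) \\[4pt]
\sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2
  &= \sum_{j=1}^{c}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
   + \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2
   + 2\sum_{j=1}^{c} (\bar{X}_j - \bar{\bar{X}})
      \underbrace{\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)}_{=\,0} \\[4pt]
\text{SST} &= \text{SSW} + \text{SSA}
\end{aligned}
```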

Page 17: Anova

Within-Group Variation

Summing the variation within each group and then adding over all groups

$$\text{SSW} = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Mean Square Within = SSW/degrees of freedom:

$$\text{MSW} = \frac{\text{SSW}}{n - c}$$

Page 18: Anova

Within-Group Variation

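$$\text{SSW} = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$

[Figure: responses X for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]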

Page 19: Anova

Obtaining the Mean Squares

$$\text{MSA} = \frac{\text{SSA}}{c - 1}$$

$$\text{MSW} = \frac{\text{SSW}}{n - c}$$

$$\text{MST} = \frac{\text{SST}}{n - 1}$$

Page 20: Anova

One-Way ANOVA Table

| Source of Variation | SS | df | MS (Variance) | F ratio |
|---|---|---|---|---|
| Among Groups | SSA | c - 1 | MSA = SSA / (c - 1) | F = MSA / MSW |
| Within Groups | SSW | n - c | MSW = SSW / (n - c) | |
| Total | SST = SSA + SSW | n - 1 | | |

c = number of groups

n = sum of the sample sizes from all groups

df = degrees of freedom

Page 21: Anova

One-Factor ANOVA F Test Statistic

$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$

$H_1$: At least two population means are different

Test statistic:

$$F = \frac{\text{MSA}}{\text{MSW}}$$

MSA is the mean square among groups

MSW is the mean square within groups

Degrees of freedom:

$df_1 = c - 1$ (c = number of groups)

$df_2 = n - c$ (n = sum of sample sizes from all populations)

Page 22: Anova

Interpreting One-Factor ANOVA F Statistic

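The F statistic is the ratio of the among-group estimate of variance to the within-group estimate of variance
- The ratio must always be positive
- $df_1 = c - 1$ will typically be small
- $df_2 = n - c$ will typically be large

Decision Rule: Reject $H_0$ if $F > F_U$, otherwise do not reject $H_0$

[Figure: F distribution starting at 0; the region below $F_U$ is "Do not reject $H_0$" and the upper tail above $F_U$, with area $\alpha = .05$, is "Reject $H_0$"]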
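The decision rule can be sketched in Python. The upper critical value $F_U$ normally comes from an F table or a statistics library (for example `scipy.stats.f.ppf`); to stay within the standard library, the sketch below uses a closed form that is valid only for the special case $df_1 = 2$, where $P(F > x) = (1 + 2x/df_2)^{-df_2/2}$. The function names are illustrative, and the numbers ($\alpha = .05$, $df_2 = 12$, F = 25.275) are borrowed from the golf club example later in this transcript.

```python
def f_upper_critical_df1_2(alpha: float, df2: int) -> float:
    """Upper-tail critical value F_U for an F distribution with df1 = 2.

    Special case only: with df1 = 2 the upper-tail probability is
    P(F > x) = (1 + 2*x/df2) ** (-df2/2), which inverts in closed form.
    """
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)

def decide(f_stat: float, f_crit: float) -> str:
    """Apply the decision rule: reject H0 when F exceeds F_U."""
    return "Reject H0" if f_stat > f_crit else "Do not reject H0"

f_u = f_upper_critical_df1_2(alpha=0.05, df2=12)
print(round(f_u, 2))         # 3.89
print(decide(25.275, f_u))   # Reject H0
```

For other values of $df_1$ there is no such simple inverse, so a table lookup or a library routine would be needed.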

Page 23: Anova

One-Factor ANOVA F Test Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?

| Club 1 | Club 2 | Club 3 |
|---|---|---|
| 254 | 234 | 200 |
| 263 | 218 | 222 |
| 241 | 235 | 197 |
| 237 | 227 | 206 |
| 251 | 216 | 204 |

Page 24: Anova

One-Factor ANOVA Example: Scatter Diagram

[Figure: scatter diagram of Distance (axis from 190 to 270) against Club (1, 2, 3), with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]

$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$

$\bar{\bar{X}} = 227.0$

Page 25: Anova

One-Factor ANOVA Example Computations

$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$, $\bar{\bar{X}} = 227.0$

$n_1 = 5$, $n_2 = 5$, $n_3 = 5$, $n = 15$, $c = 3$

$\text{SSA} = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$

$\text{SSW} = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$

$\text{MSA} = 4716.4 / (3 - 1) = 2358.2$

$\text{MSW} = 1119.6 / (15 - 3) = 93.3$

$$F = \frac{2358.2}{93.3} = 25.275$$
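These computations can be reproduced from the raw distances. The following is a minimal sketch in plain Python using the golf club data from this example; the variable names and the `mean` helper are illustrative, not part of the slides. It also computes SST directly so the partition SST = SSA + SSW can be checked numerically.

```python
# Distances for each club, taken from the example data.
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

def mean(values):
    return sum(values) / len(values)

all_values = [x for group in clubs.values() for x in group]
grand_mean = mean(all_values)
n, c = len(all_values), len(clubs)

# Among-group, within-group, and total sums of squares.
ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in clubs.values())
ssw = sum((x - mean(g)) ** 2 for g in clubs.values() for x in g)
sst = sum((x - grand_mean) ** 2 for x in all_values)

# Mean squares and the F statistic.
msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw

print(round(ssa, 1), round(ssw, 1), round(sst, 1))     # 4716.4 1119.6 5836.0
print(round(msa, 1), round(msw, 1), round(f_stat, 3))  # 2358.2 93.3 25.275
```

The directly computed SST matches SSA + SSW, which is a useful sanity check when doing the sums by hand.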

Page 26: Anova

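One-Factor ANOVA Example Solution

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: $\mu_i$ not all equal

$\alpha = .05$

$df_1 = 2$, $df_2 = 12$

Critical Value: $F_U = 3.89$

Test Statistic:

$$F = \frac{\text{MSA}}{\text{MSW}} = \frac{2358.2}{93.3} = 25.275$$

Decision: Reject $H_0$ at $\alpha = 0.05$

Conclusion: There is evidence that at least one $\mu_i$ differs from the rest

[Figure: F distribution starting at 0 with the rejection region of area $\alpha = .05$ above $F_U = 3.89$; the test statistic F = 25.275 falls in the "Reject $H_0$" region]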

Page 27: Anova

ANOVA -- Single Factor: Excel Output

EXCEL: tools | data analysis | ANOVA: single factor

SUMMARY

| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Club 1 | 5 | 1246 | 249.2 | 108.2 |
| Club 2 | 5 | 1130 | 226 | 77.5 |
| Club 3 | 5 | 1029 | 205.8 | 94.2 |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 4716.4 | 2 | 2358.2 | 25.275 | 4.99E-05 | 3.89 |
| Within Groups | 1119.6 | 12 | 93.3 | | | |
| Total | 5836.0 | 14 | | | | |
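The SUMMARY block and the P-value of the Excel output can be cross-checked with Python's standard `statistics` module. This is a sketch rather than a replacement for Excel: the variable names are illustrative, and the P-value line relies on the closed form $P(F > x) = (1 + 2x/df_2)^{-df_2/2}$, which holds only because $df_1 = 2$ in this example. It also shows that SSW equals the sum of $(n_j - 1)$ times each group's sample variance.

```python
from statistics import mean, variance

clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

# SUMMARY rows: count, sum, average, sample variance per group.
for name, values in clubs.items():
    print(name, len(values), sum(values), mean(values), variance(values))

# Within-group sum of squares rebuilt from the sample variances.
ssw = sum((len(v) - 1) * variance(v) for v in clubs.values())
df2 = sum(len(v) for v in clubs.values()) - len(clubs)
msw = ssw / df2

# Upper-tail P-value for the reported F statistic (df1 = 2 special case).
f_stat = 25.275
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

print(round(ssw, 1), df2, round(msw, 1))   # 1119.6 12 93.3
print(f"{p_value:.2e}")                    # 4.99e-05
```

Because the P-value is far below $\alpha = .05$, it leads to the same decision as comparing F = 25.275 with F crit = 3.89.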

Page 28: Anova

What happens if there is more than 1 explanation for changes in the dependent variable?

If two or more independent variables each affect the dependent variable independently of the others, then separate one-way ANOVA analyses give a good result.

This is likely only when the independent variables are not related to each other (not correlated) and when there is no interaction between them in influencing the dependent variable.

Page 29: Anova

Thank You