Transcript

Slide 1

Analysis of Variance

Chapter 13

BA 303

Slide 2

An Introduction to Analysis of Variance

A factor is a variable that the experimenter has selected for investigation.

A treatment is a level of a factor.

Example: A company president is interested in the number of hours worked by managers at three different plants: Gulfport, Long Beach, and Biloxi.

Factor: Plant

Treatments: Gulfport, Long Beach, and Biloxi

Slide 3

Analysis of Variance: A Conceptual Overview

Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means.

Data obtained from observational or experimental studies can be used for the analysis.

We want to use the sample results to test the following hypotheses:

H0: μ1 = μ2 = μ3 = . . . = μk

Ha: Not all population means are equal

Slide 4

Analysis of Variance: A Conceptual Overview

H0: μ1 = μ2 = μ3 = . . . = μk

Ha: Not all population means are equal

If H0 is rejected, we cannot conclude that all population means are different.

Rejecting H0 means that at least two population means have different values.

Slide 5

Analysis of Variance: A Conceptual Overview

Assumptions for Analysis of Variance

For each population, the response (dependent) variable is normally distributed.

The variance of the response variable, denoted σ², is the same for all of the populations.

The observations must be independent.

Slide 6

Analysis of Variance: A Conceptual Overview

Sampling Distribution of x̄ Given H0 is True

[Figure: one sampling distribution with the sample means x̄1, x̄2, and x̄3 marked on it.]

Sample means are close together because there is only one sampling distribution when H0 is true.

Slide 7

Analysis of Variance: A Conceptual Overview

Sampling Distribution of x̄ Given H0 is False

[Figure: three sampling distributions centered at μ1, μ2, and μ3; the sample means x̄1, x̄2, and x̄3 each come from a different one.]

Sample means come from different sampling distributions and are not as close together when H0 is false.

Slide 8

Analysis of Variance

Between-Treatments Estimate of Population Variance

Within-Treatments Estimate of Population Variance

Comparing the Variance Estimates: The F Test

ANOVA Table

Slide 9

Between-Treatments Estimate of Population Variance σ²

The estimate of σ² based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR.

\[ \text{MSTR} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1} \]

The numerator is called the sum of squares due to treatments (SSTR).

The denominator is the degrees of freedom associated with SSTR.
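The MSTR formula can be sketched in a few lines of Python. The function names `sstr` and `mstr` are illustrative, not from the slides. The overall mean is taken as the size-weighted average of the sample means, which reduces to the plain average when all samples are the same size. The demonstration uses the Reed Manufacturing summary statistics from the worked example in these slides.

```python
def sstr(sizes, means):
    """Sum of squares due to treatments from per-sample sizes and means."""
    n_total = sum(sizes)
    # Overall mean: sample means weighted by their sample sizes.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    return sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))


def mstr(sizes, means):
    """Mean square due to treatments: SSTR divided by k - 1."""
    k = len(means)
    return sstr(sizes, means) / (k - 1)


# Reed Manufacturing: five managers sampled at each of three plants.
sizes = [5, 5, 5]
means = [55, 68, 57]
print(sstr(sizes, means))  # 490.0
print(mstr(sizes, means))  # 245.0
```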

Slide 10

Within-Treatments Estimate of Population Variance σ²

The estimate of σ² based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.

\[ \text{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k} \]

The numerator is called the sum of squares due to error (SSE).

The denominator is the degrees of freedom associated with SSE.
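As a minimal sketch, the MSE formula can be computed from the sample sizes and sample variances alone. The function names `sse` and `mse` are illustrative; the numbers are the Reed Manufacturing sample variances from the worked example in these slides.

```python
def sse(sizes, variances):
    """Sum of squares due to error: sum of (n_j - 1) * s_j^2."""
    return sum((n - 1) * s2 for n, s2 in zip(sizes, variances))


def mse(sizes, variances):
    """Mean square error: SSE divided by n_T - k."""
    n_total = sum(sizes)
    k = len(sizes)
    return sse(sizes, variances) / (n_total - k)


# Reed Manufacturing: five managers per plant, with these sample variances.
sizes = [5, 5, 5]
variances = [26.0, 26.5, 24.5]
print(sse(sizes, variances))            # 308.0
print(round(mse(sizes, variances), 3))  # 25.667
```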

Slide 11

F Test Statistic

\[ F = \frac{\text{MSTR}}{\text{MSE}} \]

MSTR d.f. = k - 1

MSE d.f. = nT - k

Slide 12

Comparing the Variance Estimates: The F Test

If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to k - 1 and MSE d.f. equal to nT - k.

If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates σ². Hence, we will reject H0 if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.
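The inflation of MSTR/MSE when the population means differ can be seen in a small simulation. This is only a sketch: the population means, the standard deviation of 5, the sample size of 5, and the seed are illustrative assumptions, not values from the slides.

```python
import random
import statistics


def f_ratio(samples):
    """MSTR/MSE for a list of samples (one list per treatment)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    sstr = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples)
    sse = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    return (sstr / (k - 1)) / (sse / (n_total - k))


def average_f(pop_means, sigma=5.0, n=5, reps=1000, seed=0):
    """Average MSTR/MSE over repeated normal samples (assumed populations)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        samples = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in pop_means]
        ratios.append(f_ratio(samples))
    return statistics.mean(ratios)


f_when_h0_true = average_f([60, 60, 60])   # equal population means
f_when_h0_false = average_f([55, 68, 57])  # unequal population means
print(f_when_h0_true, f_when_h0_false)
```

With equal population means the average ratio stays near 1; with unequal means it is much larger, which is why large values of MSTR/MSE lead to rejecting H0.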

Slide 13

F Distribution

Slide 14

Comparing the Variance Estimates: The F Test

Sampling Distribution of MSTR/MSE

[Figure: sampling distribution of MSTR/MSE. The upper-tail area α lies to the right of the critical value Fα. Values of MSTR/MSE above the critical value fall in the "Reject H0" region; values below it fall in the "Do Not Reject H0" region.]

Slide 15

ANOVA Table for a Completely Randomized Design

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square            F          p-Value
Treatments            SSTR             k - 1                MSTR = SSTR/(k - 1)    MSTR/MSE
Error                 SSE              nT - k               MSE = SSE/(nT - k)
Total                 SST              nT - 1

SST is partitioned into SSTR and SSE.

SST's degrees of freedom (d.f.) are partitioned into SSTR's d.f. and SSE's d.f.

Slide 16

ANOVA Table for a Completely Randomized Design

SST divided by its degrees of freedom nT - 1 is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.

With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:

\[ \text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE} \]
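The partition SST = SSTR + SSE can be checked numerically. The sketch below uses the Reed Manufacturing observations that appear in the data table later in these slides; the variable names are illustrative.

```python
# Hours worked, from the Reed Manufacturing data table in these slides.
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

all_values = [x for sample in hours.values() for x in sample]
grand_mean = sum(all_values) / len(all_values)

# Total variation around the overall mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)
# Variation of the sample means around the overall mean.
sstr = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in hours.values())
# Variation of the observations around their own sample mean.
sse = sum((x - sum(s) / len(s)) ** 2 for s in hours.values() for x in s)

print(sst, sstr, sse)               # 798.0 490.0 308.0
print(sst / (len(all_values) - 1))  # overall sample variance: 57.0
```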

Slide 17

ANOVA Table for a Completely Randomized Design

ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.

Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.

Slide 18

Test for the Equality of k Population Means

Hypotheses

H0: μ1 = μ2 = μ3 = . . . = μk

Ha: Not all population means are equal

Test Statistic

F = MSTR/MSE

Slide 19

Test for the Equality of k Population Means

Rejection Rule

p-value Approach: Reject H0 if p-value < α

Critical Value Approach: Reject H0 if F > Fα

where the value of Fα is based on an F distribution with k - 1 numerator d.f. and nT - k denominator d.f.

Slide 20

Testing for the Equality of k Population Means

Reed Manufacturing

Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit). An F test will be conducted using α = .05.

Slide 21

Testing for the Equality of k Population Means

Reed Manufacturing

A simple random sample of five managers from each of the three plants was taken, and the number of hours worked by each manager in the previous week is shown on the next slide.

Factor . . . Manufacturing plant

Treatments . . . Buffalo, Pittsburgh, Detroit

Experimental units . . . Managers

Response variable . . . Number of hours worked

Slide 22

Testing for the Equality of k Population Means

Observation       Plant 1 Buffalo   Plant 2 Pittsburgh   Plant 3 Detroit
1                 48                73                   51
2                 54                63                   63
3                 57                66                   61
4                 54                64                   54
5                 62                74                   56

Sample Mean       55                68                   57
Sample Variance   26.0              26.5                 24.5
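The sample means and sample variances in the table can be reproduced with Python's standard `statistics` module, as a quick check on the data. `statistics.variance` divides by n - 1, which matches the sample variance used in these slides.

```python
import statistics

# Hours worked by the five sampled managers at each plant.
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

for plant, sample in hours.items():
    mean = statistics.mean(sample)
    variance = statistics.variance(sample)  # sample variance, divisor n - 1
    print(f"{plant}: mean = {mean:.1f}, variance = {variance:.1f}")
```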

Slide 23

Testing for the Equality of k Population Means

p-Value and Critical Value Approaches

1. Develop the hypotheses.

H0: μ1 = μ2 = μ3

Ha: Not all the means are equal

where:

μ1 = mean number of hours worked per week by the managers at Plant 1

μ2 = mean number of hours worked per week by the managers at Plant 2

μ3 = mean number of hours worked per week by the managers at Plant 3

Slide 24

Testing for the Equality of k Population Means

p-Value and Critical Value Approaches

2. Specify the level of significance. α = .05

3. Compute the value of the test statistic.

Mean Square Due to Treatments (Sample sizes are all equal.)

\[ \bar{\bar{x}} = (55 + 68 + 57)/3 = 60 \]

\[ \text{SSTR} = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490 \]

\[ \text{MSTR} = 490/(3 - 1) = 245 \]

Slide 25

Testing for the Equality of k Population Means

p-Value and Critical Value Approaches

3. Compute the value of the test statistic. (cont'd.)

Mean Square Due to Error

SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308

MSE = 308/(15 - 3) = 25.667

F = MSTR/MSE = 245/25.667 = 9.55

26 26 Slide Slide

TreatmentError

Total

490308

798

212

14

24525.667

Source ofVariation

Sum ofSquares

Degrees ofFreedom

MeanSquare

9.55

F

ANOVA Table

Testing for the Equality of k Population Means

p-Value

.0033
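The table entries follow mechanically from SSTR, SSE, k, and nT. A minimal sketch of that bookkeeping is below; `anova_table` and `fmt` are hypothetical helper names, and the p-value column is left out because it needs an F distribution function.

```python
def anova_table(sstr, sse, k, n_total):
    """Return ANOVA rows as (source, SS, d.f., mean square, F) tuples."""
    df_treat, df_error = k - 1, n_total - k
    mstr, mse = sstr / df_treat, sse / df_error
    return [
        ("Treatment", sstr, df_treat, mstr, mstr / mse),
        ("Error", sse, df_error, mse, None),
        ("Total", sstr + sse, df_treat + df_error, None, None),
    ]


def fmt(value, spec):
    """Format a number, or return an empty cell when there is no value."""
    return "" if value is None else format(value, spec)


# Reed Manufacturing: SSTR = 490, SSE = 308, k = 3 plants, nT = 15 managers.
rows = anova_table(490, 308, k=3, n_total=15)
for source, ss, df, ms, f in rows:
    print(f"{source:<10}{ss:>6}{df:>4}{fmt(ms, '.3f'):>10}{fmt(f, '.2f'):>7}")
```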

Slide 27

Testing for the Equality of k Population Means

p-Value Approach

4. Compute the p-value.

With 2 numerator d.f. and 12 denominator d.f., the p-value is .01 for F = 6.93. Therefore, the p-value is less than .01 for F = 9.55.

5. Determine whether to reject H0.

The p-value < .05, so we reject H0.

We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plants.
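When there are exactly 2 numerator d.f., as here, the upper-tail area of the F distribution has a simple closed form: P(F > f) = (1 + 2f/d2)^(-d2/2), where d2 is the denominator d.f. The sketch below uses that special case to compute the p-value without a table. The function name is illustrative, and the formula does not apply to other numerator d.f.

```python
def f_pvalue_two_numerator_df(f, df_error):
    """Upper-tail area P(F > f) for an F distribution with 2 numerator d.f."""
    return (1 + 2 * f / df_error) ** (-df_error / 2)


f_stat = 245 / (308 / 12)  # MSTR / MSE for Reed Manufacturing
p_value = f_pvalue_two_numerator_df(f_stat, 12)

print(round(p_value, 4))                              # 0.0033
print(round(f_pvalue_two_numerator_df(6.93, 12), 3))  # 0.01
print(p_value < 0.05)                                 # True, so reject H0
```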

Slide 28

Testing for the Equality of k Population Means

Critical Value Approach

4. Determine the critical value and rejection rule.

Based on an F distribution with 2 numerator d.f. and 12 denominator d.f., F.05 = 3.89.

Reject H0 if F > 3.89

5. Determine whether to reject H0.

Because F = 9.55 > 3.89, we reject H0.

We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plants.
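The critical value can also be computed directly in this special case. With 2 numerator d.f., solving (1 + 2F/d2)^(-d2/2) = α for F gives Fα = (d2/2)(α^(-2/d2) - 1). The sketch below applies it; the function name is illustrative and the shortcut holds only for 2 numerator d.f.

```python
def f_critical_two_numerator_df(alpha, df_error):
    """Critical value F_alpha for an F distribution with 2 numerator d.f."""
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)


f_crit = f_critical_two_numerator_df(0.05, 12)
f_stat = 245 / (308 / 12)  # MSTR / MSE for Reed Manufacturing

print(round(f_crit, 2))  # 3.89
print(f_stat > f_crit)   # True, so reject H0
```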

Slide 29

ANOVA PRACTICE

Slide 30

Perform an ANOVA

Using the data in the table on the next slide, compute the mean and standard deviation for the Direct treatment.

Compute the sum of the squares due to treatments (SSTR) and the mean square due to treatments (MSTR).

Compute the sum of the squares due to the error (SSE) and the mean square due to the error (MSE).

Compute the F test statistic.

Using α = 0.05, what do you conclude about the null hypothesis that the means of the treatments are equal?

Determine the approximate p-value.

Present the results in an ANOVA table.

Slide 31

              Direct                Indirect              Combination
Observation   xi      (xi - x̄)²     xi      (xi - x̄)²     xi      (xi - x̄)²
1             17.0                  16.6    14.44         25.2    0.04
2             18.5                  22.2    3.24          24.0    1.00
3             15.8                  20.5    0.01          21.5    12.25
4             18.2                  18.3    4.41          26.8    3.24
5             20.2                  24.2    14.44         27.5    6.25
6             16.0                  19.8    0.36          25.8    0.64
7             13.3                  21.2    0.64          24.2    0.64

x̄                                   20.4                  25.0
s²                                  6.26                  4.01
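One way to check hand calculations for this exercise is a short script that follows the same steps as the Reed Manufacturing example. This is a sketch, not an official solution: the variable names are illustrative, and the p-value line relies on the closed form for 2 numerator d.f., which applies here because k = 3.

```python
import statistics

# Practice data: seven observations per treatment.
data = {
    "Direct": [17.0, 18.5, 15.8, 18.2, 20.2, 16.0, 13.3],
    "Indirect": [16.6, 22.2, 20.5, 18.3, 24.2, 19.8, 21.2],
    "Combination": [25.2, 24.0, 21.5, 26.8, 27.5, 25.8, 24.2],
}

k = len(data)
n_total = sum(len(s) for s in data.values())
grand_mean = sum(sum(s) for s in data.values()) / n_total

direct_mean = statistics.mean(data["Direct"])
direct_sd = statistics.stdev(data["Direct"])

sstr = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in data.values())
sse = sum((len(s) - 1) * statistics.variance(s) for s in data.values())
mstr = sstr / (k - 1)
mse = sse / (n_total - k)
f_stat = mstr / mse

# Upper-tail area for 2 numerator d.f. and n_total - k denominator d.f.
p_value = (1 + 2 * f_stat / (n_total - k)) ** (-(n_total - k) / 2)

print(f"Direct: mean = {direct_mean:.2f}, s = {direct_sd:.2f}")
print(f"SSTR = {sstr:.2f}, MSTR = {mstr:.2f}")
print(f"SSE = {sse:.2f}, MSE = {mse:.2f}")
print(f"F = {f_stat:.2f}, p-value = {p_value:.6f}")
```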
