Slide 1
Analysis of Variance
Chapter 13, BA 303
Dec 27, 2015
Slide 2
An Introduction to Analysis of Variance
A factor is a variable that the experimenter has selected for investigation.
A treatment is a level of a factor.
Example: A company president is interested in the number of hours worked by managers at three different plants: Gulfport, Long Beach, and Biloxi.
Factor: Plant
Treatments: Gulfport, Long Beach, and Biloxi
Slide 3
Analysis of Variance: A Conceptual Overview
Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means.
Data obtained from observational or experimental studies can be used for the analysis.
We want to use the sample results to test the following hypotheses:
H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: Not all population means are equal
Slide 4
Analysis of Variance: A Conceptual Overview
H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: Not all population means are equal
If H0 is rejected, we cannot conclude that all population means are different.
Rejecting H0 means that at least two population means have different values.
Slide 5
Analysis of Variance: A Conceptual Overview
Assumptions for Analysis of Variance
For each population, the response (dependent) variable is normally distributed.
The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
The observations must be independent.
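Slide 6
Analysis of Variance: A Conceptual Overview
Sampling Distribution of $\bar{x}$ Given H0 Is True
[Figure: one sampling distribution of $\bar{x}$ with the three sample means $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_3$ marked on it.]
Sample means are close together because there is only one sampling distribution when H0 is true.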
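Slide 7
Analysis of Variance: A Conceptual Overview
Sampling Distribution of $\bar{x}$ Given H0 Is False
[Figure: three sampling distributions of $\bar{x}$, centered at $\mu_1$, $\mu_2$, and $\mu_3$, with one sample mean ($\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$) drawn from each.]
Sample means come from different sampling distributions and are not as close together when H0 is false.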
Slide 8
Analysis of Variance
Between-Treatments Estimate of Population Variance
Within-Treatments Estimate of Population Variance
Comparing the Variance Estimates: The F Test
ANOVA Table
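Slide 9
Between-Treatments Estimate of Population Variance $\sigma^2$
The estimate of $\sigma^2$ based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR.
$$\text{MSTR} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$
where $n_j$ is the size of sample $j$, $\bar{x}_j$ is its sample mean, and $\bar{\bar{x}}$ is the overall sample mean.
The numerator is called the sum of squares due to treatments (SSTR).
The denominator is the degrees of freedom associated with SSTR.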
Slide 10
Within-Treatments Estimate of Population Variance $\sigma^2$
The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.
$$\text{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$
The numerator is called the sum of squares due to error (SSE).
The denominator is the degrees of freedom associated with SSE.
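The two variance estimates can be computed directly from raw samples. The following is a minimal Python sketch and is not part of the original slides: the function names `mstr` and `mse` and the three tiny samples are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
from statistics import mean, variance

def mstr(samples):
    """Between-treatments estimate of sigma^2: SSTR / (k - 1)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    # Overall mean of every observation across all samples.
    grand_mean = sum(sum(s) for s in samples) / n_total
    sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    return sstr / (k - 1)

def mse(samples):
    """Within-treatments estimate of sigma^2: SSE / (n_T - k)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    # statistics.variance is the sample variance (divides by n_j - 1).
    sse = sum((len(s) - 1) * variance(s) for s in samples)
    return sse / (n_total - k)

# Hypothetical illustration data, not taken from the slides.
toy = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
print(mstr(toy), mse(toy))
```

In the toy data every sample has variance 1, so MSE is 1, while the sample means (2, 3, 7) are spread far apart, which makes MSTR much larger.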
Slide 11
F Test Statistic
$$F = \frac{\text{MSTR}}{\text{MSE}}$$
MSTR d.f. = $k - 1$
MSE d.f. = $n_T - k$
Slide 12
Comparing the Variance Estimates: The F Test
If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to $k - 1$ and MSE d.f. equal to $n_T - k$.
If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates $\sigma^2$. Hence, we will reject H0 if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.
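The inflation of MSTR/MSE can be seen in a small simulation. This is a rough sketch rather than anything from the slides: the population means, the common standard deviation of 5, the sample size of 5, and the helper names `f_statistic` and `average_f` are all assumptions made for illustration.

```python
import random
from statistics import mean, variance

def f_statistic(samples):
    """MSTR / MSE for a list of samples (one list per treatment)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    sse = sum((len(s) - 1) * variance(s) for s in samples)
    return (sstr / (k - 1)) / (sse / (n_total - k))

def average_f(pop_means, sigma=5.0, n=5, reps=2000, seed=0):
    """Average F over many simulated experiments with normal populations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        samples = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in pop_means]
        total += f_statistic(samples)
    return total / reps

avg_f_equal = average_f([60, 60, 60])    # H0 true: all population means equal
avg_f_unequal = average_f([55, 68, 57])  # H0 false: population means differ
print(avg_f_equal, avg_f_unequal)
```

When the population means are equal the average ratio stays near 1; when they differ, the numerator picks up the spread between the means and the average ratio is far larger.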
Slide 13
F Distribution
Slide 14
Comparing the Variance Estimates: The F Test
Sampling Distribution of MSTR/MSE
[Figure: F-distribution curve for MSTR/MSE. Values to the left of the critical value $F_\alpha$ fall in the "Do Not Reject H0" region; the upper-tail area $\alpha$ to the right of $F_\alpha$ is the "Reject H0" region.]
Slide 15
ANOVA Table for a Completely Randomized Design

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square            F          p-Value
Treatments            SSTR             k - 1                MSTR = SSTR/(k - 1)    MSTR/MSE
Error                 SSE              nT - k               MSE = SSE/(nT - k)
Total                 SST              nT - 1

SST is partitioned into SSTR and SSE.
SST's degrees of freedom (d.f.) are partitioned into SSTR's d.f. and SSE's d.f.
Slide 16
ANOVA Table for a Completely Randomized Design
SST divided by its degrees of freedom $n_T - 1$ is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.
With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:
$$\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE}$$
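A short derivation shows why the total sum of squares splits into SSTR and SSE. The algebra is standard and is not on the original slide. Here $x_{ij}$ is observation $i$ in treatment $j$, $\bar{x}_j$ and $s_j^2$ are the mean and variance of sample $j$, $n_j$ is its size, and $\bar{\bar{x}}$ is the overall sample mean.

```latex
\begin{align*}
x_{ij} - \bar{\bar{x}} &= (x_{ij} - \bar{x}_j) + (\bar{x}_j - \bar{\bar{x}}) \\
\sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2
  &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2
   + \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2
   + 2\sum_{j=1}^{k} (\bar{x}_j - \bar{\bar{x}}) \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j) \\
  &= \underbrace{\sum_{j=1}^{k} (n_j - 1) s_j^2}_{\text{SSE}}
   + \underbrace{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}_{\text{SSTR}}
\end{align*}
```

The cross term vanishes because the deviations from a sample's own mean sum to zero, $\sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j) = 0$ for every $j$.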
Slide 17
ANOVA Table for a Completely Randomized Design
ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.
Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.
Slide 18
Test for the Equality of k Population Means
Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: Not all population means are equal
Test Statistic
F = MSTR/MSE
Slide 19
Test for the Equality of k Population Means
Rejection Rule
p-value Approach: Reject H0 if p-value < $\alpha$
Critical Value Approach: Reject H0 if $F > F_\alpha$
where the value of $F_\alpha$ is based on an F distribution with $k - 1$ numerator d.f. and $n_T - k$ denominator d.f.
Slide 20
Testing for the Equality of k Population Means
Reed Manufacturing
Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit). An F test will be conducted using $\alpha = .05$.
Slide 21
Testing for the Equality of k Population Means
Reed Manufacturing
A simple random sample of five managers from each of the three plants was taken, and the number of hours worked by each manager in the previous week is shown on the next slide.
Factor . . . Manufacturing plant
Treatments . . . Buffalo, Pittsburgh, Detroit
Experimental units . . . Managers
Response variable . . . Number of hours worked
Slide 22
Testing for the Equality of k Population Means

Observation       Plant 1 Buffalo   Plant 2 Pittsburgh   Plant 3 Detroit
1                 48                73                   51
2                 54                63                   63
3                 57                66                   61
4                 54                64                   54
5                 62                74                   56
Sample Mean       55                68                   57
Sample Variance   26.0              26.5                 24.5
Slide 23
Testing for the Equality of k Population Means
p-Value and Critical Value Approaches
1. Develop the hypotheses.
H0: $\mu_1 = \mu_2 = \mu_3$
Ha: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per week by the managers at Plant 3
Slide 24
Testing for the Equality of k Population Means
p-Value and Critical Value Approaches
2. Specify the level of significance. $\alpha = .05$
3. Compute the value of the test statistic.
Mean Square Due to Treatments (sample sizes are all equal):
$\bar{\bar{x}} = (55 + 68 + 57)/3 = 60$
$\text{SSTR} = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
$\text{MSTR} = 490/(3 - 1) = 245$
Slide 25
Testing for the Equality of k Population Means
p-Value and Critical Value Approaches
3. Compute the value of the test statistic. (cont.)
Mean Square Due to Error:
SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308
MSE = 308/(15 - 3) = 25.667
F = MSTR/MSE = 245/25.667 = 9.55
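The hand calculation of step 3 can be reproduced from the summary statistics alone. The sketch below uses only the sample sizes, means, and variances given for the Reed Manufacturing data; the variable names are illustrative and not part of the slides.

```python
# Summary statistics from the Reed Manufacturing data slide.
n = [5, 5, 5]                    # sample sizes
means = [55, 68, 57]             # sample means
variances = [26.0, 26.5, 24.5]   # sample variances

k = len(n)
n_total = sum(n)
# Overall sample mean, weighting each sample mean by its sample size.
grand_mean = sum(nj * xj for nj, xj in zip(n, means)) / n_total

sstr = sum(nj * (xj - grand_mean) ** 2 for nj, xj in zip(n, means))
mstr = sstr / (k - 1)
sse = sum((nj - 1) * s2 for nj, s2 in zip(n, variances))
mse = sse / (n_total - k)
f_stat = mstr / mse

print(sstr, mstr, sse, round(mse, 3), round(f_stat, 2))
```

With equal sample sizes the weighted overall mean coincides with the simple average of the three sample means used on the slide.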
Slide 26
Testing for the Equality of k Population Means
ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F      p-Value
Treatment             490              2                    245           9.55   .0033
Error                 308              12                   25.667
Total                 798              14
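As a cross-check on the sums of squares in this table, SST can be computed directly from the fifteen raw observations and compared with SSTR + SSE. This is a sketch using the Reed Manufacturing data from the earlier data slide; the dictionary layout and variable names are assumptions for illustration.

```python
from statistics import mean

# Hours worked, from the Reed Manufacturing data slide.
plants = {
    "Buffalo":    [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit":    [51, 63, 61, 54, 56],
}

all_obs = [x for sample in plants.values() for x in sample]
grand_mean = mean(all_obs)
k, n_total = len(plants), len(all_obs)

# Between-treatments and within-treatments sums of squares.
sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in plants.values())
sse = sum((x - mean(s)) ** 2 for s in plants.values() for x in s)
# Total sum of squares, computed directly rather than as SSTR + SSE.
sst = sum((x - grand_mean) ** 2 for x in all_obs)

rows = [
    ("Treatments", sstr, k - 1),
    ("Error", sse, n_total - k),
    ("Total", sst, n_total - 1),
]
for source, ss, df in rows:
    print(f"{source:<12}{ss:>8.1f}{df:>4}")
```

The directly computed total agrees with the sum of the treatment and error rows, and the degrees of freedom add up the same way (2 + 12 = 14).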
Slide 27
Testing for the Equality of k Population Means
p-Value Approach
4. Compute the p-value.
With 2 numerator d.f. and 12 denominator d.f., the p-value is .01 for F = 6.93. Therefore, the p-value is less than .01 for F = 9.55.
5. Determine whether to reject H0.
The p-value < .05, so we reject H0.
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all three plants.
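The p-value can also be computed exactly instead of bracketed from a table. The sketch below relies on a special case that is an addition to the slides: when there are exactly 2 numerator d.f. (three treatments), the upper-tail area of the F distribution has the closed form $(1 + 2F/d_2)^{-d_2/2}$, where $d_2$ is the denominator d.f. The function name is hypothetical, and the formula does not apply for other numerator d.f.

```python
def f_pvalue_two_numerator_df(f_stat, denom_df):
    """Upper-tail area of an F distribution with 2 numerator d.f.

    Valid ONLY for 2 numerator d.f. (k = 3 treatments), where the
    tail area has the closed form (1 + 2F/d2) ** (-d2/2).
    """
    return (1 + 2 * f_stat / denom_df) ** (-denom_df / 2)

# F = MSTR/MSE = 245 / (308/12) for the Reed Manufacturing example.
p_value = f_pvalue_two_numerator_df(245 / (308 / 12), 12)
p_at_table_value = f_pvalue_two_numerator_df(6.93, 12)
print(round(p_value, 4), round(p_at_table_value, 3))
```

For other numerator d.f., a library routine such as SciPy's `scipy.stats.f.sf(F, dfn, dfd)` would be the usual choice.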
Slide 28
Testing for the Equality of k Population Means
Critical Value Approach
4. Determine the critical value and rejection rule.
Based on an F distribution with 2 numerator d.f. and 12 denominator d.f., $F_{.05} = 3.89$.
Reject H0 if F > 3.89
5. Determine whether to reject H0.
Because F = 9.55 > 3.89, we reject H0.
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all three plants.
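The critical value normally comes from an F table, but with exactly 2 numerator d.f. it can be obtained by solving the closed-form tail area $(1 + 2F/d_2)^{-d_2/2} = \alpha$ for F. This is a sketch under that assumption only; the function name is hypothetical and the shortcut does not hold for other numerator d.f.

```python
def f_critical_two_numerator_df(alpha, denom_df):
    """Solve (1 + 2F/d2) ** (-d2/2) = alpha for F (2 numerator d.f. only)."""
    return (denom_df / 2) * (alpha ** (-2 / denom_df) - 1)

f_crit = f_critical_two_numerator_df(0.05, 12)
f_stat = 245 / (308 / 12)   # MSTR/MSE for the Reed Manufacturing example
reject_h0 = f_stat > f_crit
print(round(f_crit, 2), reject_h0)
```

In the general case the same value would come from an inverse-CDF routine such as SciPy's `scipy.stats.f.ppf(1 - alpha, dfn, dfd)`.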
Slide 29
ANOVA PRACTICE
Slide 30
Perform an ANOVA
Using the data in the table on the next slide, compute the mean and standard deviation for the Direct treatment.
Compute the sum of squares due to treatments (SSTR) and the mean square due to treatments (MSTR).
Compute the sum of squares due to error (SSE) and the mean square error (MSE).
Compute the F test statistic.
Using $\alpha = 0.05$, what do you conclude about the null hypothesis that the means of the treatments are equal?
Determine the approximate p-value.
Present the results in an ANOVA table.
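Slide 31
Practice data (blank Direct entries are to be computed)

Observation   Direct $x_i$   Direct $(x_i - \bar{x})^2$   Indirect $x_i$   Indirect $(x_i - \bar{x})^2$   Combination $x_i$   Combination $(x_i - \bar{x})^2$
1             17.0           ___                          16.6             14.44                          25.2                0.04
2             18.5           ___                          22.2             3.24                           24.0                1.00
3             15.8           ___                          20.5             0.01                           21.5                12.25
4             18.2           ___                          18.3             4.41                           26.8                3.24
5             20.2           ___                          24.2             14.44                          27.5                6.25
6             16.0           ___                          19.8             0.36                           25.8                0.64
7             13.3           ___                          21.2             0.64                           24.2                0.64
$\bar{x}$     ___                                         20.4                                            25.0
$s^2$         ___                                         6.26                                            4.01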
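One way to start the practice problem is a small helper for the sample mean, variance, and standard deviation. The sketch below is not part of the slides and the helper name `summarize` is hypothetical. It is checked against the Indirect and Combination columns, whose means and variances are printed in the table, and then applied to the Direct column.

```python
from statistics import mean, stdev, variance

direct = [17.0, 18.5, 15.8, 18.2, 20.2, 16.0, 13.3]
indirect = [16.6, 22.2, 20.5, 18.3, 24.2, 19.8, 21.2]
combination = [25.2, 24.0, 21.5, 26.8, 27.5, 25.8, 24.2]

def summarize(sample):
    """Return (sample mean, sample variance, sample standard deviation)."""
    return mean(sample), variance(sample), stdev(sample)

# Compare these with the mean and variance rows printed in the table.
print([round(v, 2) for v in summarize(indirect)[:2]])
print([round(v, 2) for v in summarize(combination)[:2]])

# The same helper gives the values requested for the Direct treatment.
direct_mean, direct_var, direct_sd = summarize(direct)
```

The remaining steps (SSTR, SSE, F, and the ANOVA table) follow the same pattern as the Reed Manufacturing example, with k = 3 treatments and seven observations per treatment.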