Statistics and Data Analysis, Part Eight: Analysis of Variance
Page 1: Anova

Statistics and Data Analysis

Part Eight: Analysis of Variance

Page 2: Anova

Analysis of Variance

• Analysis of variance helps compare two or more populations of quantitative data.

• Specifically, we are interested in the relationships among the population means (are they equal or not).

• The procedure works by analyzing the sample variance.

Page 3: Anova

• The analysis of variance is a procedure that tests whether differences exist among two or more population means.

• To do this, the technique analyzes the variance of the data.

Single-Factor (One-Way) Analysis of Variance: Independent Samples

Page 4: Anova

• Example

– An apple juice manufacturer is planning to develop a new product: a liquid concentrate.

– The marketing manager has to decide how to market the new product.

– Three strategies are considered:

• Emphasize convenience of using the product.

• Emphasize the quality of the product.

• Emphasize the product’s low price.

Page 5: Anova

• Example - continued

– An experiment was conducted as follows:

• In three cities an advertising campaign was launched.

• In each city only one of the three characteristics (convenience, quality, and price) was emphasized.

• The weekly sales were recorded for twenty weeks following the beginning of the campaigns.

• We assume the samples are independent of each other

CLUSTER SAMPLING

Page 6: Anova

• Example - continued

– Data: weekly sales (one row per week, one column per city)

Convenience  Quality  Price
529          804      672
658          630      531
793          774      443
514          717      596
663          679      602
719          604      502
711          620      659
606          697      689
461          706      675
529          615      512
498          492      691
663          719      733
604          787      698
495          699      776
485          572      561
557          523      572
353          584      469
557          634      581
542          580      679
614          624      532

• Solution– The data are quantitative.

– Our problem objective is to compare sales in three cities.

– We hypothesize on the relationships among the three mean weekly sales:

Page 7: Anova

Exploratory Analysis

[Figure: boxplots of SALES by ADTYPE (1.00, 2.00, 3.00), N = 20 per group; SALES axis from 300 to 900]

Descriptives: SALES

ADTYPE  N   Mean      Std. Deviation  Std. Error  95% CI Lower Bound  95% CI Upper Bound  Minimum  Maximum
1.00    20  577.5500  103.8027        23.2110     528.9688            626.1312            353.00   793.00
2.00    20  653.0000  85.0771         19.0238     613.1827            692.8173            492.00   804.00
3.00    20  608.6500  93.1141         20.8210     565.0713            652.2287            443.00   776.00
Total   60  613.0667  97.8147         12.6278     587.7984            638.3349            353.00   804.00

Page 8: Anova

• The test stems from the following rationale:

– If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, to the overall mean).

– If the alternative hypothesis is true, at least 2 of the sample means would be different from one another.

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: At least two means differ

Page 9: Anova

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: At least two means differ

Test statistic F = 3.23

p-value = 0.047 < 0.05

There is sufficient evidence to reject $H_0$ in favor of $H_1$, and argue that at least one of the mean sales differs from the others.

[Figure: boxplots of SALES by ADTYPE (1.00, 2.00, 3.00), N = 20 per group]

Page 10: Anova

ANOVA: SALES

Source          Sum of Squares  df  Mean Square  F      Sig.
Between Groups  57512.233       2   28756.117    3.233  .047
Within Groups   506983.5        57  8894.447
Total           564495.7        59


SPSS runs the ANOVA test for us. The p-value is the value reported under Sig.

Page 11: Anova

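• The Test Statistic: where does it come from?

• The test stems from the following rationale:

– If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, to the grand mean).

– If all the means are equal we would estimate the variance by the sample variance of all the data taken together:

$$s^2 = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{\bar{x}})^2}{n-1} = \frac{SST}{n-1}$$

Here $x_{ij}$ is observation $i$ in group $j$, $n_j$ is the size of group $j$, $k$ is the number of groups, $n$ is the total number of observations, and $\bar{\bar{x}}$ is the grand mean.

SST measures the total variation in the data.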
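As an illustration, the total sum of squares for the apple juice data can be computed directly. The sketch below is not part of the original slides; the function name `total_sum_of_squares` is hypothetical, and the three lists are the weekly sales columns from the data slide.

```python
# Weekly sales from the apple juice example (20 weeks per city).
convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
           492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
         691, 733, 698, 776, 561, 572, 469, 581, 679, 532]


def total_sum_of_squares(groups):
    """Return (SST, s^2), treating all groups as one combined sample."""
    values = [x for group in groups for x in group]
    n = len(values)
    grand_mean = sum(values) / n
    sst = sum((x - grand_mean) ** 2 for x in values)
    return sst, sst / (n - 1)


sst, s2 = total_sum_of_squares([convenience, quality, price])
print(round(sst, 1), round(s2 ** 0.5, 4))
```

The two printed values can be compared with the Total row of the SPSS ANOVA table (sum of squares) and the Total row of the Descriptives table (standard deviation).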

Page 12: Anova

• The variability among the sample means is measured as the sum of squared distances between each group mean and the overall mean, weighted by the group sizes.

This sum is called the Sum of Squares Between Groups (SSB):

$$SSB = \sum_{j=1}^{k} n_j(\bar{x}_j - \bar{\bar{x}})^2$$

where $\bar{x}_j$ is the mean of group $j$, $n_j$ is its size, and $\bar{\bar{x}}$ is the overall (grand) mean.

In our example groups are represented by the different advertising strategies.

If all the means are equal then this number will be small.
If differences exist among the means then this number will be large.
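A short sketch of this calculation, using only the group sizes and group means reported in the Descriptives table. The function name `ss_between` is an illustrative choice, not something from the slides.

```python
def ss_between(sizes, means):
    """Sum of squares between groups from group sizes and group means."""
    n = sum(sizes)
    # The grand mean is the size-weighted average of the group means.
    grand_mean = sum(nj * m for nj, m in zip(sizes, means)) / n
    return sum(nj * (m - grand_mean) ** 2 for nj, m in zip(sizes, means))


# Convenience, quality, price: n = 20 each, means from the Descriptives table.
ssb = ss_between([20, 20, 20], [577.55, 653.00, 608.65])
print(round(ssb, 3))
```

The result can be checked against the Between Groups sum of squares in the SPSS ANOVA table.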

[Figure: boxplots of SALES by ADTYPE (1.00, 2.00, 3.00), N = 20 per group]

Page 13: Anova

• The variation in the data that remains when the group means are allowed to differ (each observation is compared with its own group mean) is called the Sum of Squares Within Groups (SSW):

$$SSW = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2$$

It is possible to show that:

SST = SSB + SSW

ANOVA compares SSB to SSW. If SSB is comparatively large, then there exist differences between the group means.
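The decomposition can be checked numerically from the summary statistics alone, since the within-group sum of squares for group j equals (n_j - 1) times its sample variance. This is a rough check, assuming the rounded standard deviations from the Descriptives table and the SSB and SST values from the SPSS ANOVA table, so agreement is only up to rounding.

```python
sizes = [20, 20, 20]
std_devs = [103.8027, 85.0771, 93.1141]  # from the Descriptives table
ssb = 57512.233                          # Between Groups, ANOVA table
sst = 564495.7                           # Total, ANOVA table

# Each group contributes (n_j - 1) * s_j^2 to the within-group sum of squares.
ssw = sum((n - 1) * s ** 2 for n, s in zip(sizes, std_devs))

print(round(ssw, 1), round(ssb + ssw, 1), sst)
```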

Page 14: Anova

SSB is the sum of squares between the constant mean and each group mean.
SST is the total variation, between the individual points and the constant mean.
SSW is the variation when different means are allowed.
If SSW is about the same as SST then the means are close to equal and SSB is small.

SST = SSB + SSW

The test statistic here compares SSB and SSW.

If SSB is comparatively large, then differences exist in the group means.

Page 15: Anova

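Actually SSB and SSW are NOT on the same scale; they are scaled by their degrees of freedom first, then compared:

$$MSB = \frac{SSB}{k-1}, \qquad MSW = \frac{SSW}{n-k}$$

The test statistic for ANOVA compares MSB to MSW using

$$F = \frac{MSB}{MSW}$$

When the null hypothesis is true, this statistic has an F distribution with (k-1) and (n-k) degrees of freedom.

If MSB is large relative to MSW then the F statistic is large and differences exist in the group means.

The degrees of freedom decompose in the same way as the sums of squares:

SST = SSB + SSW
(n-1) = (k-1) + (n-k)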
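For the advertising example the scaling and the ratio can be reproduced from the sums of squares in the SPSS ANOVA table. A minimal sketch, with a hypothetical helper name `f_statistic`:

```python
def f_statistic(ssb, ssw, k, n):
    """Return (MSB, MSW, F) for k groups and n observations in total."""
    msb = ssb / (k - 1)   # mean square between groups
    msw = ssw / (n - k)   # mean square within groups
    return msb, msw, msb / msw


msb, msw, f = f_statistic(57512.233, 506983.5, k=3, n=60)
print(round(msb, 3), round(msw, 3), round(f, 3))
```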

Page 16: Anova

We can summarise this information in an ANOVA table

Source          SS   df   MS   F        p-val
Between groups  SSB  k-1  MSB  MSB/MSW
Within groups   SSW  n-k  MSW
Total           SST  n-1

If the SSB is ‘large’ then the model with differing group means is a significant improvement over the constant mean model, as SSW must be ‘small’.

The p-value in this case is $\Pr(F_{k-1,\,n-k} > F)$.

We can use SPSS to calculate this for us
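In general the p-value needs the F distribution function from a statistics package. For the special case of three groups, however, the numerator degrees of freedom equal 2 and the upper tail has a simple closed form, $\Pr(F_{2,d} > f) = (1 + 2f/d)^{-d/2}$. The sketch below relies on that special case only; the function name is hypothetical and it should not be used for other numerator degrees of freedom.

```python
def f_pvalue_numerator_df_2(f, df2):
    """Upper-tail probability Pr(F_{2, df2} > f).

    Valid ONLY when the numerator degrees of freedom equal 2
    (that is, k = 3 groups in a one-way ANOVA).
    """
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


# Advertising example: F = 3.233 with (2, 57) degrees of freedom.
p_value = f_pvalue_numerator_df_2(3.233, 57)
print(round(p_value, 3))
```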

Page 17: Anova

Assumptions for ANOVA

The statistic will have an F-distribution only if the data in each group are normally distributed AND the variation in each group is roughly the same. When using this test, if the data are highly non-normal OR have vastly different variation in each group, then the test is not valid.

Usually it is enough if the histograms are roughly mound shaped.

This means the histogram is roughly symmetric with highest density in the middle and lowest density in the tails.

Page 18: Anova

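And finally the hypothesis test:

$H_0: \mu_1 = \mu_2 = \dots = \mu_k$

$H_1$: At least two means differ

Test statistic: $F = \frac{MST}{MSE}$, where MST (mean square for treatments) and MSE (mean square for error) are alternative names for MSB and MSW.

Rejection region: $F > F_{\alpha,\,k-1,\,n-k}$

Specifically in our advertisement problem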

Page 19: Anova

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: At least two means differ

Test statistic:

$$F = \frac{MST}{MSE} = \frac{28{,}756.12}{8{,}894.17} = 3.23$$

As p = 0.047 < 0.05 there is sufficient evidence to reject $H_0$ in favor of $H_1$, and argue that at least one of the mean sales differs from the others.

Page 20: Anova

Checking the required conditions

• The F test requires that the populations are normally distributed with equal variance.

• From SPSS we compare the sample variances: 10774, 7238, 8670. It seems the variances are roughly equal (we can test for this too if we like)

• To check the normality observe the histogram of each sample.

Page 21: Anova

[Figure: three histograms of weekly sales, one each for Convenience, Quality, and Price; bins at 450, 550, 650, 750, 850, More; frequency axis from 0 to 10]

All the distributions seem to be close to normal OR at least possibly normal.

Page 22: Anova

A test for equal variances

The hypotheses to be tested here are

$$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$$

$$H_A: \text{At least 1 group variance is different}$$

Levene came up with a test for this that is less sensitive to departures from normality than the classical alternatives; SPSS does this test for us.

Test of Homogeneity of Variances: SALES

Levene Statistic  df1  df2  Sig.
.344              2    57   .710

The large p-value (0.71) means that the sample variances are sufficiently ‘close’ to each other. We cannot conclude that they are different.
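To show what the Levene statistic measures, here is a minimal sketch that runs a one-way ANOVA on the absolute deviations of each observation from its own group mean. This is the mean-centred form of the test; it is assumed, not confirmed by the slides, that this is the variant behind the SPSS output above. The function name and the toy data are illustrative.

```python
def levene_statistic(groups):
    """Levene's W: one-way ANOVA F computed on |x - group mean|."""
    # Replace every observation by its absolute deviation from its group mean.
    z = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    k = len(z)
    n = sum(len(g) for g in z)
    grand_mean = sum(sum(g) for g in z) / n
    means = [sum(g) / len(g) for g in z]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(z, means))
    within = sum((v - m) ** 2 for g, m in zip(z, means) for v in g)
    return (between / (k - 1)) / (within / (n - k))


# Toy data: the second group is twice as spread out as the first.
w = levene_statistic([[1, 2, 3], [0, 2, 4]])
print(round(w, 3))
```

Applying the same function to the three sales columns gives a statistic on (2, 57) degrees of freedom that can be compared with the SPSS table.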

Page 23: Anova

So we have found differences in the groups, but where do they lie?

[Figure: boxplots of SALES by ADTYPE (1.00, 2.00, 3.00), N = 20 per group]

It is hard to see where the differences lie here, but we want to know which is the ‘best’ or ‘worst’ advertising strategy.

Solution: We can calculate all three 95% CIs for the differences in mean sales.

BUT: We need to account for the number of CIs we calculate. Why??

A 95% CI contains the true value only 95% of the time; it is wrong 5% of the time. So for every 20 intervals we calculate we would expect to make the WRONG decision 1 time on average.
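A quick calculation makes the problem concrete. The sketch below assumes the intervals are independent, which is a simplification: intervals for pairwise differences share data, so the figure is only indicative. The function name is hypothetical.

```python
def chance_of_at_least_one_miss(m, alpha=0.05):
    """Chance that at least one of m INDEPENDENT (1 - alpha) intervals
    fails to contain its true value."""
    return 1.0 - (1.0 - alpha) ** m


print(round(chance_of_at_least_one_miss(3), 4))   # three pairwise intervals
print(round(chance_of_at_least_one_miss(20), 4))  # twenty intervals
```

Even with only three intervals the chance of at least one wrong conclusion is well above the nominal 5%.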

Page 24: Anova

Comparison of multiple means

Question : Before looking at the data, the manager decides to test whether emphasizing convenience is different to emphasizing quality in terms of sales.

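The point estimate of the difference in mean sales, $\mu_2 - \mu_1$, is $\bar{x}_2 - \bar{x}_1$.

As this is just a difference between two means we can put a 95% confidence interval around it as usual:

$$\bar{x}_2 - \bar{x}_1 \pm 2\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$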
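A rough version of this planned interval for quality versus convenience is sketched below. It rests on two assumptions that are not stated on the slide: the common variance $s^2$ is taken to be the within-groups mean square (8894.447) from the ANOVA table, and the multiplier 2 stands in for the exact t critical value. The function name is illustrative.

```python
import math


def planned_interval(mean_1, mean_2, s2, n_1, n_2, multiplier=2.0):
    """Approximate 95% CI for mean_2 - mean_1 with a common variance s2."""
    diff = mean_2 - mean_1
    std_error = math.sqrt(s2 * (1.0 / n_1 + 1.0 / n_2))
    return diff, std_error, (diff - multiplier * std_error,
                             diff + multiplier * std_error)


# Group 1 = convenience, group 2 = quality; 20 weeks of sales in each.
diff, std_error, (lower, upper) = planned_interval(
    577.55, 653.00, 8894.447, 20, 20)
print(round(diff, 2), round(std_error, 4), round(lower, 1), round(upper, 1))
```

The standard error computed here matches the Std. Error column that SPSS reports for pairwise differences in this example.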

Page 25: Anova

This confidence interval relies on two things:
1. The 2 groups of sales are independent of each other.
2. The decision to make this comparison was taken before we looked at the data. WHY??

Consider the case where we have 50 groups and wish to compare among the means. We might compare among 50 school classes in terms of HSC marks. We can do 3 things:

1. A planned comparison
2. An unplanned comparison
3. Data snoop

Page 26: Anova

Planned comparisons

A planned comparison is when we wish to compare two means because they answer a research question considered before the data were inspected.

For example, in the 50 group study, Fred Smith may wish to compare class 36 (his class) to his friend Bob Jones's class (class 48).

Here we do the usual confidence interval. For $\alpha = 0.05$ we have

$$\bar{Y}_{48} - \bar{Y}_{36} \pm 2\sqrt{s^2\left(\frac{1}{n_{36}} + \frac{1}{n_{48}}\right)}$$

This interval has a 95% chance of containing the true mean difference $\mu_{48} - \mu_{36}$.

Page 27: Anova

Unplanned comparisons (data mining)

The researcher for the 50 class study finds 95% confidence intervals for every possible difference between two means (that's 50 × 49 = 2450 intervals if both orderings of each pair are counted, or 1225 distinct pairs) and reports only the ones found to be significantly different to each other. The claim is made that significant differences have been found between the classes and the intervals are used to illustrate where the differences are.

Any problem with this?

At the 5% level, we expect he will find 0.05*2450 = 122.5 intervals NOT containing 0, even if all the means really are equal!

We must take account of this in some way.
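The counts behind this argument are easy to reproduce. The sketch below (helper name hypothetical) reports both the number of distinct pairs and the number of ordered pairs, since each pair appears twice when both I-J and J-I are listed, together with the expected number of intervals that miss 0 when all means are equal.

```python
import math


def pairwise_counts(k, alpha=0.05):
    """Return (distinct pairs, ordered pairs, expected misses for each count)."""
    distinct = math.comb(k, 2)   # unordered pairs of groups
    ordered = k * (k - 1)        # every pair counted in both directions
    return distinct, ordered, alpha * distinct, alpha * ordered


distinct, ordered, miss_distinct, miss_ordered = pairwise_counts(50)
print(distinct, ordered, miss_distinct, miss_ordered)
```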

Page 28: Anova

Data Snooping

The researcher for the 50 class study picks the highest and lowest group means, finds a 95% confidence interval for the difference between these two means, and reports only this interval.

Any problem with this?

We must account for the fact that there are 2450 possible comparisons and he has picked the maximum and minimum group means. Note that again we would expect 5% (that’s 122.5) of 95% intervals not to contain 0, even if ALL the 50 class means were exactly equal to each other.

Again we must take account of this.

Page 29: Anova

Comparing multiple means

Unless we do a planned comparison, we must change the level of each confidence interval we calculate for differences between 2 means, to allow for data mining or data snooping.

We want to calculate multiple confidence intervals so that there is an overall 5% chance of finding a difference (when that difference is really 0) for each sample of DATA we analyse, NOT each CI we calculate.

There are many methods to do this. All involve widening confidence intervals in some way. The best and most accurate method for pairwise comparisons was developed by John Tukey.

Page 30: Anova

Tukey’s simultaneous mean confidence intervals give an overall 95% chance of containing all 3 true mean difference values.

Multiple Comparisons
Dependent Variable: SALES
Tukey HSD

(I) ADTYPE  (J) ADTYPE  Mean Difference (I-J)  Std. Error  Sig.  95% CI Lower Bound  95% CI Upper Bound
1.00        2.00        -75.4500*              29.8236     .037  -147.2181           -3.6819
1.00        3.00        -31.1000               29.8236     .553  -102.8681           40.6681
2.00        1.00        75.4500*               29.8236     .037  3.6819              147.2181
2.00        3.00        44.3500                29.8236     .305  -27.4181            116.1181
3.00        1.00        31.1000                29.8236     .553  -40.6681            102.8681
3.00        2.00        -44.3500               29.8236     .305  -116.1181           27.4181

* The mean difference is significant at the .05 level.

We are 95% confident that strategy 2 (quality) is better than strategy 1 (convenience) in terms of mean sales of apple juice.

Page 31: Anova

What if we have more than one factor OR groups that are not independent?

[Figure: one-way ANOVA (response measured under Treatment 1, Treatment 2, Treatment 3) contrasted with two-way ANOVA (response classified by the levels of Factor A and the levels of Factor B)]

Page 32: Anova

[Figure: randomized block layout with Treatments 1 to 4 observed within Block 1, Block 2, and Block 3]

Block all the observations with some commonality across treatments.

This is similar to matched pairs for 2 samples.

Page 33: Anova

• Example

– A radio station manager wants to know if the amount of time his listeners spend listening to the radio is about the same for every day of the week.

– 200 teenagers were asked to record how long they spend listening to a radio each day of the week.

• Solution

– The problem objective is to compare seven populations (1 for each day of the week).

– The data are quantitative.

Page 34: Anova

– Each day of the week can be considered a group.

– The 7 data points recorded by each person are related (together they are called a block), because they belong to the same person.

– This procedure eliminates the variability in the “radio-times” among teenagers, and helps detect differences of the mean times teenagers listen to the radio among the days of the week only.

$H_0: \mu_1 = \mu_2 = \dots = \mu_7$

$H_1$: At least two means differ

Page 35: Anova

ANOVA

Source of Variation  SS        df    MS        F         P-value   F crit
Rows (Blocks)        209834.6  199   1054.445  2.627722  1.04E-23  1.187531
Columns (Groups)     28673.73  6     4778.955  11.90936  5.14E-13  2.106162
Error                479125.1  1194  401.2773
Total                717633.5  1399

The Rows (Blocks) line has b-1 degrees of freedom and the Columns (Groups) line has k-1.

The test statistic for the groups is

$$F_{Groups} = \frac{MSG}{MSE}$$

Conclusion: At the 5% significance level there is sufficient evidence to reject the null hypothesis, and infer that mean “radio time” is different in at least one of the week days.
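The arithmetic of this table can be reproduced from the three sums of squares and the design sizes (b = 200 teenagers as blocks, k = 7 days as groups). A minimal sketch with a hypothetical function name:

```python
def randomized_block_anova(ss_blocks, ss_groups, ss_error, b, k):
    """Fill in df, mean squares and F ratios for a randomized block design."""
    df_blocks, df_groups = b - 1, k - 1
    df_error = df_blocks * df_groups
    mse = ss_error / df_error
    ms_blocks = ss_blocks / df_blocks
    ms_groups = ss_groups / df_groups
    return {
        "df": (df_blocks, df_groups, df_error),
        "MS": (ms_blocks, ms_groups, mse),
        "F_blocks": ms_blocks / mse,
        "F_groups": ms_groups / mse,
    }


table = randomized_block_anova(209834.6, 28673.73, 479125.1, b=200, k=7)
print(table["df"], round(table["F_blocks"], 3), round(table["F_groups"], 3))
```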

Page 36: Anova

Two Factor Analysis of Variance

Suppose that two factors are to be examined:

• The effects of the marketing approach on sales.

– Emphasis on convenience
– Emphasis on quality
– Emphasis on price

• The effects of the selected media on sales.

– Advertise on TV
– Advertise in newspapers

Page 37: Anova

• We can design the experiment as follows:

City    Marketing approach  Medium
City 1  Convenience         TV
City 2  Quality             TV
City 3  Price               TV
City 4  Convenience         Newspaper
City 5  Quality             Newspaper
City 6  Price               Newspaper

• This is a one-way ANOVA experimental design.

$H_0: \mu_1 = \mu_2 = \dots = \mu_6$

$H_1$: At least two means differ

The p-value = .045. We conclude that there is evidence at the 5% level that differences exist in the mean weekly sales.

Page 38: Anova

• Are these differences caused by differences in the marketing approach?

• Are these differences caused by differences in the medium used for advertising?

• Are there combinations of these two factors that interact to affect the weekly sales?

• A new experimental design is needed to answer these questions.

Page 39: Anova

[Figure: 2 × 3 grid of sales for Cities 1 to 6. Rows are Factor B: Advertising media (TV, Newspapers); columns are Factor A: Marketing strategy (Convenience, Quality, Price).]

Are there differences in the mean sales caused by different marketing strategies?

Test whether mean sales of “Convenience”, “Quality”, and “Price” significantly differ from one another.

Page 40: Anova

[Figure: the same 2 × 3 grid of sales for Cities 1 to 6, by Factor A: Marketing strategy and Factor B: Advertising media]

Are there differences in the mean sales caused by different advertising media?

Test whether mean sales of “TV” and “Newspapers” significantly differ from one another.

Page 41: Anova

[Figure: the same 2 × 3 grid of sales for Cities 1 to 6, by Factor A: Marketing strategy and Factor B: Advertising media]

Are there differences in the mean sales caused by interaction between marketing strategy and advertising medium?

Test whether mean sales of certain cells are different from the level expected.

Page 42: Anova

[Figure: clustered boxplots of SALES1 by ADTYPE (1.00, 2.00, 3.00), split by MEDIUM (Newspaper, TV), N = 10 per cell; SALES1 axis from 400 to 900]

Graphically, quality again seems the best strategy and newspaper ads tend to get slightly more sales than TV ads.

Always plot the data first.

Page 43: Anova

The interaction measures whether the effect of each advertising strategy is the same for each advertising medium (papers and TV).

[Figure: Estimated Marginal Means of SALES1 plotted against ADTYPE (1.00, 2.00, 3.00), one line per MEDIUM (Newspaper, TV); vertical axis from 540 to 700]

The (close to) parallel lines indicate that no interaction is occurring between medium and advertising strategy.

If the differences between the lines were changing or the lines crossed over, this would be some evidence of interaction.

Page 44: Anova

Medium     Convenience  Quality  Price
TV         491          677      575
TV         712          627      614
TV         558          590      706
TV         447          632      484
TV         479          683      478
TV         624          760      650
TV         546          690      583
TV         444          548      536
TV         582          579      579
TV         672          644      795
Newspaper  464          689      803
Newspaper  559          650      584
Newspaper  759          704      525
Newspaper  557          652      498
Newspaper  528          576      812
Newspaper  670          836      565
Newspaper  534          628      708
Newspaper  657          798      546
Newspaper  557          497      616
Newspaper  474          841      587

The two-way ANOVA in SPSS

Tests of Between-Subjects Effects
Dependent Variable: SALES1

Source           Type III Sum of Squares  df  Mean Square  F         Sig.
Corrected Model  113620.283 (a)           5   22724.057    2.449     .045
Intercept        22643098.0               1   22643098.02  2439.908  .000
MEDIUM           13172.017                1   13172.017    1.419     .239
ADTYPE           98838.633                2   49419.317    5.325     .008
MEDIUM * ADTYPE  1609.633                 2   804.817      .087      .917
Error            501136.700               54  9280.309
Total            23257855.0               60
Corrected Total  614756.983               59

a. R Squared = .185 (Adjusted R Squared = .109)

Clearly, at the 5% level, the advertising strategy is affecting mean sales (p = 0.008), but the advertising medium is NOT significantly affecting mean sales (p = 0.239). Also there is no significant interaction effect on mean sales (p = 0.917).
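The sums of squares in this table can be reproduced by hand for a balanced design such as this one (10 observations in every cell). The following is a minimal sketch, not the SPSS procedure itself: the function and dictionary names are illustrative, and the shortcut formulas used here are valid only because every cell has the same number of observations.

```python
sales = {
    "TV": {
        "Convenience": [491, 712, 558, 447, 479, 624, 546, 444, 582, 672],
        "Quality": [677, 627, 590, 632, 683, 760, 690, 548, 579, 644],
        "Price": [575, 614, 706, 484, 478, 650, 583, 536, 579, 795],
    },
    "Newspaper": {
        "Convenience": [464, 559, 759, 557, 528, 670, 534, 657, 557, 474],
        "Quality": [689, 650, 704, 652, 576, 836, 628, 798, 497, 841],
        "Price": [803, 584, 525, 498, 812, 565, 708, 546, 616, 587],
    },
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(data):
    """Balanced two-way ANOVA: returns {source: (SS, df, F)}."""
    media = list(data)
    strategies = list(data[media[0]])
    r = len(data[media[0]][strategies[0]])  # replicates per cell
    everything = [x for m in media for s in strategies for x in data[m][s]]
    grand = mean(everything)

    ss_total = sum((x - grand) ** 2 for x in everything)
    ss_medium = sum(
        r * len(strategies)
        * (mean([x for s in strategies for x in data[m][s]]) - grand) ** 2
        for m in media)
    ss_strategy = sum(
        r * len(media)
        * (mean([x for m in media for x in data[m][s]]) - grand) ** 2
        for s in strategies)
    ss_cells = sum(r * (mean(data[m][s]) - grand) ** 2
                   for m in media for s in strategies)
    ss_interaction = ss_cells - ss_medium - ss_strategy
    ss_error = ss_total - ss_cells

    df_medium, df_strategy = len(media) - 1, len(strategies) - 1
    df_interaction = df_medium * df_strategy
    df_error = len(everything) - len(media) * len(strategies)
    mse = ss_error / df_error
    return {
        "MEDIUM": (ss_medium, df_medium, ss_medium / df_medium / mse),
        "ADTYPE": (ss_strategy, df_strategy, ss_strategy / df_strategy / mse),
        "MEDIUM * ADTYPE": (ss_interaction, df_interaction,
                            ss_interaction / df_interaction / mse),
        "Error": (ss_error, df_error, None),
    }


result = two_way_anova(sales)
for source, (ss, df, f) in result.items():
    print(source, round(ss, 3), df, None if f is None else round(f, 3))
```

The MEDIUM, ADTYPE and interaction sums of squares printed here can be compared line by line with the SPSS table.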

Page 45: Anova

Where do the differences lie?

Multiple Comparisons
Dependent Variable: SALES1
Tukey HSD

(I) ADTYPE  (J) ADTYPE  Mean Difference (I-J)  Std. Error  Sig.  95% CI Lower Bound  95% CI Upper Bound
1.00        2.00        -99.3500*              30.4636     .005  -172.7671           -25.9329
1.00        3.00        -46.5000               30.4636     .287  -119.9171           26.9171
2.00        1.00        99.3500*               30.4636     .005  25.9329             172.7671
2.00        3.00        52.8500                30.4636     .202  -20.5671            126.2671
3.00        1.00        46.5000                30.4636     .287  -26.9171            119.9171
3.00        2.00        -52.8500               30.4636     .202  -126.2671           20.5671

Based on observed means.
* The mean difference is significant at the .05 level.

Again, we are 95% confident that strategy 2 (quality) is better than strategy 1 (convenience) in terms of mean sales of apple juice. There appears to be no significant difference in mean sales between emphasizing quality or price, nor between price and convenience.

Page 46: Anova

Are the assumptions satisfied?

Levene's Test of Equality of Error Variances (a)
Dependent Variable: SALES1

F     df1  df2  Sig.
.731  5    54   .603

Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept+MEDIUM+ADTYPE+MEDIUM * ADTYPE

The sample variances are NOT significantly different to each other, as we have a high p-value (0.603 > 0.05) here.

[Figure: boxplots of SALES1 by ADTYPE (1.00, 2.00, 3.00), N = 20 per group, and boxplots of SALES1 by MEDIUM (TV, Newspaper), N = 30 per group; SALES1 axis from 400 to 900]

The observations in each group appear to be fairly symmetric and close to normal