Analysis and Interpretation: Analysis of Variance (ANOVA) Chris Fowler

Jan 17, 2016

Contents & Outcomes

Four Basic Questions:

1. Why use ANOVA?
– Multiple comparisons of means
– More complex designs
– Main effects and interactions

2. What is ANOVA?
– Why analyse variance?

3. How do we interpret the results?
– Summary and mean tables
– Statistical vs theoretical significance

4. When to use it?
– Assumptions (Parameters)

Scope

• The presentation will not focus on:
– statistical theory (beyond what is necessary)
– computations and formulae (use a computer!)

• But it will focus on:
– making sense of the results
– helping you to choose the right design

• However, in ANOVA the design, data collection, and analysis become inseparable

1. Why use ANOVA?

• Multiple (more than 2) simultaneous comparisons of means.

• Comparing 3 means using t-tests would mean undertaking 3 analyses:
A vs B
A vs C
B vs C

• 4 means = 6 tests; in general N(N-1)/2, where N = number of means being compared

• ANOVA allows the simultaneous comparison of the means with only one test

• So what’s the problem with multiple t-tests?
– Type 1 errors
– Loss of information (interactions)
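The N(N-1)/2 count above can be checked by listing the pairs directly. This is only a small sketch; the function names `pairwise_tests` and `n_pairwise` are illustrative and not part of the presentation.

```python
from itertools import combinations

def pairwise_tests(labels):
    """Return every unordered pair of group labels (one t-test per pair)."""
    return list(combinations(labels, 2))

def n_pairwise(n_means):
    """Number of pairwise comparisons among n_means means: N(N-1)/2."""
    return n_means * (n_means - 1) // 2

three = pairwise_tests(["A", "B", "C"])
four = pairwise_tests(["A", "B", "C", "D"])

print(three)                       # the A vs B, A vs C, B vs C pairs
print(len(four), n_pairwise(4))    # both give 6
```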

Making a Type 1 error

• A significance level tells you the probability of rejecting the Null Hypothesis when it is in fact true.

• P<0.05 means that there is less than a 5 in 100 chance of incorrectly rejecting the Null Hypothesis; that is, a less than 5% chance of making an error called a Type 1 error.

• So your significance level states the probability of making a Type 1 error.

• Every additional comparison you make increases the chance of a Type 1 error (so if you do 100 comparisons, about 5 are likely to be falsely significant, but which five?).
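How quickly the risk grows can be sketched with the usual formula 1 - (1 - alpha)^k for the chance of at least one Type 1 error in k tests. This assumes the k tests are independent, which pairwise t-tests on the same groups are not exactly, so treat the figures as a rough guide. The function name is illustrative.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one Type 1 error in k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

# 1 test, 3 tests (3 means), 6 tests (4 means) and 100 tests at P<0.05
for k in (1, 3, 6, 100):
    print(k, round(familywise_error_rate(0.05, k), 3))

# Expected number of false positives when all 100 null hypotheses are true
expected_false_positives = 100 * 0.05
print(expected_false_positives)
```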

Type I and Type II Errors

Findings (H1)

            Significant                                  Non-Significant
H0 True     Reject incorrectly: Type I error (alpha)     Accept correctly
H0 False    Reject correctly (1 – beta)                  Accept incorrectly: Type II error (beta)

Note that 1 – beta equals the power of a test

But…..

• A significant main effect means that overall there is a significant difference between the means.

• But one mean may not be significantly different from one of the others.

• To make specific comparisons you can do a ‘Planned or Unplanned Comparison’.

• Equally you can test for linear or nonlinear trends (Trend tests).

• Both use weighted coefficients that must sum to zero, and the total number of comparisons/trends cannot exceed the number of DF (L-1, where L is the number of levels) for the effect you are examining. (You are partitioning the variance.)

Example Coefficients for Planned Comparisons

• Four Levels (L1, L2, L3 and L4)

Comparison                  L1   L2   L3   L4
L1 vs L2, L3 and L4         +3   -1   -1   -1
L1 vs L2                    +1   -1    0    0
L1 and L2 vs L3 and L4      +1   +1   -1   -1

Remember they are planned: you were expecting to find a difference. There are unplanned comparisons for more explorative analysis, but be wary of post hoc analysis.
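As a minimal sketch of how such coefficients are used: each set must sum to zero, and the comparison itself is the weighted sum of the level means. The means below are made up purely for illustration (they are not data from the presentation), and `contrast_value` is an illustrative name.

```python
# Coefficient sets from the slide, one weight per level (L1..L4).
contrasts = {
    "L1 vs L2, L3 and L4": [3, -1, -1, -1],
    "L1 vs L2": [1, -1, 0, 0],
    "L1 and L2 vs L3 and L4": [1, 1, -1, -1],
}

def contrast_value(coeffs, means):
    """Weighted sum of the level means; zero when the compared sides are equal."""
    return sum(c * m for c, m in zip(coeffs, means))

# Hypothetical level means, invented for this example only.
hypothetical_means = [8.0, 5.0, 4.0, 3.0]

sums = {name: sum(coeffs) for name, coeffs in contrasts.items()}
values = {name: contrast_value(coeffs, hypothetical_means)
          for name, coeffs in contrasts.items()}

for name in contrasts:
    print(name, "| sum of coefficients:", sums[name], "| contrast:", values[name])
```

Turning a contrast value into a significance test also needs the error term from the ANOVA, which is not shown here.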

Coefficients for Trend Test

Group Size   Trend       Coefficients
3            Linear      -1    0   +1
             Quadratic   +1   -2   +1
4            Linear      -3   -1   +1   +3
             Quadratic   +1   -1   -1   +1
             Cubic       -1   +3   -3   +1
5            Linear      -2   -1    0   +1   +2
             Quadratic   -2   +1   +2   +1   -2
             Cubic       -1   +2    0   -2   +1
             Quartic     +1   -4   +6   -4   +1
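The properties of these trend coefficients can be checked mechanically. The sketch below confirms, for the values in the table, that each set sums to zero, that the sets for a given group size are mutually orthogonal (pairwise products sum to zero), and that there are no more trends than DF (L-1). The helper names are illustrative.

```python
from itertools import combinations

# Trend coefficients as given in the table, keyed by number of levels.
trend_coefficients = {
    3: {"linear": [-1, 0, 1], "quadratic": [1, -2, 1]},
    4: {"linear": [-3, -1, 1, 3], "quadratic": [1, -1, -1, 1],
        "cubic": [-1, 3, -3, 1]},
    5: {"linear": [-2, -1, 0, 1, 2], "quadratic": [-2, 1, 2, 1, -2],
        "cubic": [-1, 2, 0, -2, 1], "quartic": [1, -4, 6, -4, 1]},
}

def dot(a, b):
    """Sum of pairwise products of two coefficient sets."""
    return sum(x * y for x, y in zip(a, b))

def check_trends(trends):
    """Return (all sets sum to zero, all pairs of sets are orthogonal)."""
    sums_to_zero = all(sum(c) == 0 for c in trends.values())
    orthogonal = all(dot(a, b) == 0
                     for a, b in combinations(trends.values(), 2))
    return sums_to_zero, orthogonal

results = {levels: check_trends(trends)
           for levels, trends in trend_coefficients.items()}

for levels, trends in trend_coefficients.items():
    within_df = len(trends) <= levels - 1
    print(levels, results[levels], "trends within DF:", within_df)
```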

Exercise

• Use the coefficients and draw the trends on a graph.

More Complex Designs

• A simple design (or one-way ANOVA) has only a single independent variable (Factor) with three or more levels.

For example: the effects of Noise on memory retention.

Three levels of Noise (High, Medium and Low), and each subject’s score is the number of words remembered (out of 20).

This would be a one-way between-subject (B-S) design. The within-subject (within-S) equivalent would have each subject undertaking all the Noise conditions.

Two Way ANOVA

• You have two independent variables (Factors).

• For example, as well as Noise you have Task Difficulty (Easy & Hard) as a variable.

         Easy             Hard
      H    M    L      H    M    L
      X    X    X      X    X    X
      X    X    X      X    X    X

A within-S example: as above, but each subject provides a score in every condition.

         Easy             Hard
      H    M    L      H    M    L
S1    X    X    X      X    X    X
S2    X    X    X      X    X    X

Within AND Between Subject Designs (Mixed or Split plot)

Where one Factor is B-S and the other is W-S

E.g.

              H    M    L
Easy   S1     X    X    X
       S2     X    X    X
Hard   S6     X    X    X
       S7     X    X    X

Main Effects and Interactions

• More Complex designs (more than One way) allow you not only to explore the main effects of the individual variables but also the interaction between the variables.

• These can be two-way (AxB), three-way (AxBxC), four-way and so on.

• A two-way ANOVA (A, B) has only one interaction (AxB); a three-way ANOVA (A, B, C) has four interactions (AxB, AxC, BxC and AxBxC), and so on.
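The interaction terms for any number of factors can be listed by taking every combination of two or more factors. A short sketch (the function name is illustrative): for three factors it lists AxB, AxC, BxC and AxBxC.

```python
from itertools import combinations

def interaction_terms(factors):
    """List every interaction among the given factors (all subsets of size >= 2)."""
    terms = []
    for size in range(2, len(factors) + 1):
        for combo in combinations(factors, size):
            terms.append("x".join(combo))
    return terms

two_way = interaction_terms(["A", "B"])
three_way = interaction_terms(["A", "B", "C"])
four_way = interaction_terms(["A", "B", "C", "D"])

print(two_way)
print(three_way)
print(len(four_way))   # number of interaction terms in a four-way design
```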

Examples of Interactions

2. What is ANOVA?

• How can analysing variance tell us about differences between means?

Analysing the Variance

                Sample 1   Sample 2   Sample 3
                6          7          1
                8          9          3
                10         11         5
                12         13         7
                14         15         9
Total           50         55         25
Mean            10         11         5
Variance (S²)   10         10         10

Sample 1 and 2 are very similar and combining them makes little difference to the overall mean (10.5) or Variance (9.17)

But Sample 3 has a much lower mean, and although it starts with same variance as the other two, if you combine it with sample 1 and 2 the variance will increase (15.95)

They all started with same variance so the increase in variance can only be attributed to difference between the means.
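The figures quoted above can be reproduced with Python's standard `statistics` module, whose `variance` function is the sample variance (dividing by n-1). This is just a check on the arithmetic; the variable names are illustrative.

```python
from statistics import mean, variance

sample_1 = [6, 8, 10, 12, 14]
sample_2 = [7, 9, 11, 13, 15]
sample_3 = [1, 3, 5, 7, 9]

# Each sample on its own: means 10, 11 and 5, variance 10 in every case.
for name, data in (("Sample 1", sample_1), ("Sample 2", sample_2),
                   ("Sample 3", sample_3)):
    print(name, float(mean(data)), float(variance(data)))

# Combining the two similar samples barely changes the variance...
combined_12 = sample_1 + sample_2
print(float(mean(combined_12)), round(float(variance(combined_12)), 2))

# ...but adding the sample with the much lower mean inflates it.
combined_all = sample_1 + sample_2 + sample_3
print(round(float(variance(combined_all)), 2))
```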

But…..

• This only works if you assume homogeneity of variance.

• ANOVA is based on statistical theory relating to populations rather than samples, but under certain conditions we can assume that the sample is an unbiased estimate of our population, and hence infer from samples about populations.

• The conditions are stated in the central limit theorem (mean, variance and shape).

And….

• Any treatment effect also contains sampling error, so we need to calculate the error separately. The greater the treatment effect, the greater the disparity between the two.

• If there is no treatment effect (all error), then dividing the treatment effect by the error (a residual) will result in a ratio of about 1 (the F ratio)*

• The greater the treatment effect, the greater the value of F.

• To decide whether the F-ratio is significant (i.e. you can reject the null hypothesis) you need to look up in a table the probability of getting that particular F value for that particular F distribution.

• The particular distribution is determined by the number of degrees of freedom associated with your treatment and error effects.

* In a perfect world you would never get an F value less than one, but because we use estimates an F<1 can occur.
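As a sketch of the F ratio, here is a plain one-way between-subject calculation applied to the three samples from the "Analysing the Variance" slide. The function `one_way_f` is illustrative; it returns F with its treatment and error degrees of freedom, and the probability would still need to be looked up for that F distribution.

```python
def one_way_f(groups):
    """Return (F, df_treatment, df_error) for a one-way between-subject design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Treatment (between-groups) variation: group means around the grand mean.
    ss_treatment = sum(len(g) * (m - grand_mean) ** 2
                       for g, m in zip(groups, group_means))
    # Error (within-groups) variation: scores around their own group mean.
    ss_error = sum((x - m) ** 2
                   for g, m in zip(groups, group_means) for x in g)

    df_treatment = len(groups) - 1
    df_error = len(all_scores) - len(groups)
    f_ratio = (ss_treatment / df_treatment) / (ss_error / df_error)
    return f_ratio, df_treatment, df_error

sample_1 = [6, 8, 10, 12, 14]
sample_2 = [7, 9, 11, 13, 15]
sample_3 = [1, 3, 5, 7, 9]

f_all, df_t_all, df_e_all = one_way_f([sample_1, sample_2, sample_3])
f_12, df_t_12, df_e_12 = one_way_f([sample_1, sample_2])

print(round(f_all, 2), df_t_all, df_e_all)   # all three samples
print(round(f_12, 2), df_t_12, df_e_12)      # the two similar samples only
```

With only the two similar samples the ratio comes out below 1, which is the F<1 case mentioned in the footnote.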

3. Interpreting the results

1. Have the tables of means and ANOVA summary table at hand

2. Select and interpret those means for which you have predicted effects on the basis of your hypotheses.

3. Interpret any significant but unpredicted effects with caution: use a ‘two-tail’ test (halves the probability) or a stricter significance level (P<0.01 rather than P<0.05).

ANOVA Summary Table

Simple Two-way ANOVA:

Source               SS   DF   MS   F-R   Probability
Main Effect A
Main Effect B
Interaction (AxB)
Error
Total

Simple Two Way Within Subject Design

Source                  SS   DF   MS   F-R   Probability
(Between-S effect)
(Within-S effect)
  Main Effect A
  Error a
  Main Effect B
  Error b
  Interaction (AxB)
  Error c
Total

Split Plot design (one within & one between)

Source SS DF MS F-R Probability

(Between – S Effect)

Main Effect A

Error a

(Within –S effect)

Main Effect B

Error b

Interaction (AxB)

Error c

Total

An Example

Hypothesis – Background noise has a masking effect that helps students concentrate better, particularly on difficult tasks

Independent Variables:
• Three levels of background noise (65db, 75db & 85db)
• Two levels of task difficulty (easy and hard)

Dependent Variable:
• Number of key points recalled from a piece of text.

Raw Data

A 2 x 3 Factorial BS Design

              Easy (A1)                Hard (A2)
          65db   75db   85db       65db   75db   85db
          1      4      9          1      2      5
          3      4      8          0      4      4
          2      5      8          1      3      4
          1      5      7          2      4      5
          1      4      7          2      3      5
Totals    8      22     39         6      16     23

Equal Cell sizes (n=5)

Table of Means

          Easy   Hard   Mean
65 db     1.6    1.2    1.4
75 db     4.4    3.2    3.8
85 db     7.8    4.6    6.2
Mean      4.6    3.0
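The cell totals and the table of means follow directly from the raw data. A minimal sketch, with the scores typed in from the Raw Data slide (the dictionary layout and names are illustrative):

```python
# Raw scores keyed by (task difficulty, noise level in db); n = 5 per cell.
scores = {
    ("Easy", 65): [1, 3, 2, 1, 1],
    ("Easy", 75): [4, 4, 5, 5, 4],
    ("Easy", 85): [9, 8, 8, 7, 7],
    ("Hard", 65): [1, 0, 1, 2, 2],
    ("Hard", 75): [2, 4, 3, 4, 3],
    ("Hard", 85): [5, 4, 4, 5, 5],
}

def mean_of(values):
    return sum(values) / len(values)

cell_totals = {cell: sum(v) for cell, v in scores.items()}
cell_means = {cell: mean_of(v) for cell, v in scores.items()}

# Marginal means: pool every score at a given noise level or difficulty.
noise_means = {
    noise: mean_of([x for (_, n), v in scores.items() if n == noise for x in v])
    for noise in (65, 75, 85)
}
difficulty_means = {
    task: mean_of([x for (t, _), v in scores.items() if t == task for x in v])
    for task in ("Easy", "Hard")
}

for noise in (65, 75, 85):
    print(noise, cell_means[("Easy", noise)], cell_means[("Hard", noise)],
          round(noise_means[noise], 1))
print({task: round(m, 1) for task, m in difficulty_means.items()})
```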

Interaction

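[Figure: mean recall (vertical axis, 2 to 8) plotted against noise level (65, 75 and 85 db), with separate lines for the Easy (x) and Hard (o) tasks.]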

ANOVA Summary Table

Source                 SS      DF   MS     F-R     Probability
Task Difficulty (A)    19.2    1    19.2   32.93   P<0.001
Noise Level (B)        115.2   2    57.6   99.31   P<0.001
Interaction (AxB)      10.4    2    5.2    8.96    P<0.01
Error                  14.0    24   0.58
Total                  158.8   29
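The summary table can be rebuilt from the raw data. The sketch below uses the standard computational formulas for a two-way between-subject design with equal cell sizes; the function and variable names are illustrative. Note that with the unrounded error mean square (14.0/24) the F ratios come out at roughly 32.91, 98.74 and 8.91, slightly different from the values on the slide, which appear to have been calculated with the error mean square rounded to 0.58.

```python
# Raw scores keyed by (task difficulty A, noise level B in db); n = 5 per cell.
scores = {
    ("Easy", 65): [1, 3, 2, 1, 1],
    ("Easy", 75): [4, 4, 5, 5, 4],
    ("Easy", 85): [9, 8, 8, 7, 7],
    ("Hard", 65): [1, 0, 1, 2, 2],
    ("Hard", 75): [2, 4, 3, 4, 3],
    ("Hard", 85): [5, 4, 4, 5, 5],
}

def two_way_anova(cells):
    """Two-way between-subject ANOVA for equal cell sizes (a sketch)."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))            # scores per cell
    total_n = n * len(a_levels) * len(b_levels)
    all_scores = [x for cell in cells.values() for x in cell]
    correction = sum(all_scores) ** 2 / total_n    # (grand total)^2 / N

    def ss_from_totals(totals, scores_per_total):
        return sum(t ** 2 for t in totals) / scores_per_total - correction

    a_totals = [sum(sum(cells[(a, b)]) for b in b_levels) for a in a_levels]
    b_totals = [sum(sum(cells[(a, b)]) for a in a_levels) for b in b_levels]
    cell_totals = [sum(cell) for cell in cells.values()]

    ss_total = sum(x ** 2 for x in all_scores) - correction
    ss_cells = ss_from_totals(cell_totals, n)
    ss = {
        "A": ss_from_totals(a_totals, n * len(b_levels)),
        "B": ss_from_totals(b_totals, n * len(a_levels)),
    }
    ss["AxB"] = ss_cells - ss["A"] - ss["B"]
    ss["Error"] = ss_total - ss_cells

    df = {"A": len(a_levels) - 1, "B": len(b_levels) - 1}
    df["AxB"] = df["A"] * df["B"]
    df["Error"] = total_n - len(a_levels) * len(b_levels)

    ms = {source: ss[source] / df[source] for source in ss}
    f = {source: ms[source] / ms["Error"] for source in ("A", "B", "AxB")}
    return ss, df, ms, f, ss_total

ss, df, ms, f, ss_total = two_way_anova(scores)
for source in ("A", "B", "AxB", "Error"):
    f_text = round(f[source], 2) if source in f else ""
    print(source, round(ss[source], 2), df[source], round(ms[source], 2), f_text)
print("Total", round(ss_total, 2), sum(df.values()))
```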

Results

• That more items were recalled from the easy task (4.6) than from the hard task (3.0) (F=32.93, df 1, 24, P<0.001). This was expected.

• That as noise increases, recall improves (F=99.31, df 2, 24, P<0.001).

• That the effect of the noise diminishes as the task becomes harder (F=8.96, df 2, 24, P<0.01); in other words, the more difficult the task, the less background noise should be used.

Theoretical vs Statistical Significance

• Be wary of:
– Post hoc explanation (changing your hypothesis after analysing your data)
– Data trawling (capturing as much data as you can rather than the data you need)
– Post mortem data analysis (keeping on analysing in unintended ways until you find something significant)
– Data checking (only checking your results when you have no significant findings)
– Data exclusion (getting rid of those awkward scores!)

And….

• Something that is statistically significant may have no or limited theoretical significance

• Equally something that is statistically non significant may have theoretical significance (pressure to publish only significant results).

4. When to use ANOVA

• ANOVA is a very powerful test, but its use is based on certain assumptions:

1. The population from which the sample was drawn should be normally distributed.

2. The observations should be independent (usually assured through random sampling and assignment)

3. Measurements should be made on an interval or ratio scale (but ordinal data can be transformed into normal scores).

4. There should be homogeneity of variance (usually OK if equal sample sizes are used).
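For assumption 4, one informal first look is to compare the cell variances directly. The sketch below does this for the six cells of the noise and task difficulty example, reporting the ratio of the largest to the smallest variance (the quantity behind Hartley's F-max test). No cut-off is asserted here; judging the ratio formally needs the appropriate table or a dedicated test, and the names used are illustrative.

```python
from statistics import variance

# Raw scores from the worked example, keyed by (task difficulty, noise in db).
scores = {
    ("Easy", 65): [1, 3, 2, 1, 1],
    ("Easy", 75): [4, 4, 5, 5, 4],
    ("Easy", 85): [9, 8, 8, 7, 7],
    ("Hard", 65): [1, 0, 1, 2, 2],
    ("Hard", 75): [2, 4, 3, 4, 3],
    ("Hard", 85): [5, 4, 4, 5, 5],
}

# Sample variance (n-1 denominator) within each cell.
cell_variances = {cell: float(variance(v)) for cell, v in scores.items()}

largest = max(cell_variances.values())
smallest = min(cell_variances.values())
variance_ratio = largest / smallest

for cell, var in cell_variances.items():
    print(cell, round(var, 2))
print("largest / smallest variance:", round(variance_ratio, 2))
```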

But….

• ANOVA is a very robust test and can sustain breaches in its assumptions.

• However, if you think some of the assumptions are breached and an equivalent non-parametric test is available, then you should use the non-parametric version.

THANK YOU!