Statistics Used in One-way Analysis of Variance BUSI 6480 Lecture 2
Dec 30, 2015

Page 1: Statistics Used in One-way Analysis of Variance

Statistics Used in One-way Analysis of Variance

BUSI 6480 Lecture 2

Page 2: Statistics Used in One-way Analysis of Variance

Design of Experiments: A historical note

Two spoonfuls of vinegar three times a day (and 4 other treatments for scurvy) lost out to oranges and lemons in what Wikipedia credits as an "early" designed experiment.

James Lind

Page 3: Statistics Used in One-way Analysis of Variance

Design of Experiments: A historical note

In 1747, while serving as ship's surgeon on HMS Salisbury, James Lind carried out a controlled experiment to develop a cure for scurvy. Lind selected 12 men from the ship, all suffering from scurvy, and divided them into six pairs, giving each pair a different addition to their basic diet for a period of two weeks. The treatments were all remedies that had been proposed at one time or another. They were:

1. A quart of cider every day
2. Twenty-five gutts (drops) of elixir vitriol three times a day upon an empty stomach
3. One half-pint of seawater every day
4. A mixture of garlic, mustard, and horseradish in a lump the size of a nutmeg
5. Two spoonfuls of vinegar three times a day
6. Two oranges and one lemon every day

Page 4: Statistics Used in One-way Analysis of Variance

Design of Experiments: A historical note

The men who had been given citrus fruits recovered dramatically within a week. One of them returned to duty after 6 days and the other became nurse to the rest. The others experienced some improvement, but nothing comparable to the citrus fruits, which proved to be substantially superior to the other treatments. In this study his subjects' cases "were as similar as I could have them"; that is, he applied strict entry requirements to reduce extraneous variation. The men were paired, which provided replication. From a modern perspective, the main thing that is missing is randomized allocation of subjects to treatments.

Page 5: Statistics Used in One-way Analysis of Variance

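ANOVA: Testing hypothesized values of $\sigma^2$

A standard assumption is that the data come from a normally distributed population. The groups are assumed to have equal population variances.

Two-sided test: $H_0: \sigma^2 = \sigma_0^2$ versus $H_A: \sigma^2 \ne \sigma_0^2$

One-sided test: $H_0: \sigma^2 \le \sigma_0^2$ versus $H_A: \sigma^2 > \sigma_0^2$

Test statistic: $\chi^2 = (n-1)\,s^2 / \sigma_0^2$, where $s^2$ is the sample variance and $\sigma_0^2$ is the hypothesized variance. Under $H_0$ the statistic has $n-1$ degrees of freedom.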
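The chi-square statistic for a hypothesized variance can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_variance_stat` and the small sample are made up for the example and are not part of the lecture.

```python
from statistics import variance


def chi_square_variance_stat(sample, sigma0_sq):
    """Return (chi-square statistic, degrees of freedom) for H0: sigma^2 = sigma0_sq."""
    n = len(sample)
    s_sq = variance(sample)            # sample variance, divisor n - 1
    chi_sq = (n - 1) * s_sq / sigma0_sq
    return chi_sq, n - 1


# Hypothetical sample: the sum of squared deviations from the mean (5) is 32.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
chi_sq, df = chi_square_variance_stat(sample, sigma0_sq=4.0)
print(chi_sq, df)                      # statistic is 32 / 4 = 8 on 7 degrees of freedom
```

The statistic would then be compared with a chi-square critical value on `df` degrees of freedom.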

Page 6: Statistics Used in One-way Analysis of Variance

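F statistic for ANOVA

The F statistic is the ratio of two independent chi-square statistics, each divided by its degrees of freedom:

$F = (\chi^2_{v_1} / v_1) / (\chi^2_{v_2} / v_2)$, where $v_1$ and $v_2$ are the numerator and denominator degrees of freedom.

$E(F) = v_2 / (v_2 - 2)$ for $v_2 > 2$

Thus, the expected value of F is approximately one when $v_2$ is large.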
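A quick simulation can make the expected value of F concrete. The sketch below draws chi-square variates as gamma variates with shape v/2 and scale 2, using only Python's standard library. The function names, the seed, and the degrees of freedom (3 and 20) are arbitrary choices for illustration.

```python
import random


def expected_f(v2):
    """Theoretical mean of an F variate; defined only for v2 > 2."""
    if v2 <= 2:
        raise ValueError("E(F) requires more than 2 denominator degrees of freedom")
    return v2 / (v2 - 2)


def simulate_mean_f(v1, v2, draws=100_000, seed=1):
    """Average of simulated F ratios built from two independent chi-square draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        chi1 = rng.gammavariate(v1 / 2, 2.0)   # chi-square with v1 df
        chi2 = rng.gammavariate(v2 / 2, 2.0)   # chi-square with v2 df
        total += (chi1 / v1) / (chi2 / v2)
    return total / draws


print(expected_f(20))          # 20 / 18
print(simulate_mean_f(3, 20))  # should land close to the theoretical value
```

As v2 grows, `expected_f(v2)` approaches one, which is the sense in which the expected value of F is "approximately one".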

Page 7: Statistics Used in One-way Analysis of Variance

Model for One-way Anova

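$Y_{ij} = \mu + \alpha_j + \varepsilon_{i(j)}$, where $\alpha_j$ is a fixed effect, $i = 1, \ldots, n$; $j = 1, \ldots, p$

$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_p = 0$

$H_1: \alpha_j \ne 0$ for some $j$

$Y_{ij} = \bar{Y}_{..} + (\bar{Y}_{.j} - \bar{Y}_{..}) + (Y_{ij} - \bar{Y}_{.j})$

Score = Grand mean + Treatment effect + Error effect

$\sum_{j=1}^{p} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_{..})^2 = n \sum_{j=1}^{p} (\bar{Y}_{.j} - \bar{Y}_{..})^2 + \sum_{j=1}^{p} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_{.j})^2$

SST = SSB + SSW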
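The partition SST = SSB + SSW can be checked numerically. The sketch below uses the three treatment columns from the SAS data step later in the lecture, assuming the data lines are read three values per row (Treat1, Treat2, Treat3). The function name is made up for the example.

```python
# Lecture data, assuming rows of (Treat1, Treat2, Treat3) in the SAS data step.
treat1 = [17, 13, 18, 10, 11, 16, 19]
treat2 = [18, 12, 26, 18, 9, 30, 12]
treat3 = [15, 16, 19, 17, 18, 17, 19]


def partition_sums_of_squares(groups):
    """Return (SST, SSB, SSW) from the deviation formulas for a one-way layout."""
    all_obs = [y for g in groups for y in g]
    grand_mean = sum(all_obs) / len(all_obs)
    sst = sum((y - grand_mean) ** 2 for y in all_obs)
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return sst, ssb, ssw


sst, ssb, ssw = partition_sums_of_squares([treat1, treat2, treat3])
print(round(sst, 3), round(ssb, 3), round(ssw, 3))
```

For these data SSB is about 35.52 and SSW about 449.14, which add to SST of about 484.67.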

Page 8: Statistics Used in One-way Analysis of Variance

Notation for Sums of Squares

$\sum_{j=1}^{p} \sum_{i=1}^{n} Y_{ij}^2 = [AS]$

$\left(\sum_{j=1}^{p} \sum_{i=1}^{n} Y_{ij}\right)^2 / np = [Y]$

$\sum_{j=1}^{p} \left(\sum_{i=1}^{n} Y_{ij}\right)^2 / n = [A]$

The letter A represents all observations belonging to a level of treatment A. The notation [A] means to square the sum of the Y's within each treatment level, divide by the number of observations within each level of treatment A, and add the results across levels.

The letters AS represent the treatment and the subject within the treatment level. The notation [AS] means to square the sum of observations within each subject-treatment combination and add the results. Note that there is only one observation within each combination, so [AS] is the sum of the squared observations.

The letter Y represents all observations of the dependent variable. The notation [Y] means to square the sum of all response observations and divide by the number of responses.

Page 9: Statistics Used in One-way Analysis of Variance

Sums of Squares using Symbols

SST = [AS] – [Y]

SSB = [A] – [Y]

SSW = [AS] – [A]

Three terms are used in computing the sums of squares for a one-way ANOVA
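The three bracket terms are easy to compute directly. The sketch below assumes equal group sizes (n observations in each of p levels) and uses the lecture's three treatment columns, read three values per row from the SAS data step. The function name `bracket_terms` is made up for the example.

```python
# Lecture data, assuming rows of (Treat1, Treat2, Treat3) in the SAS data step.
groups = [
    [17, 13, 18, 10, 11, 16, 19],   # Treat1
    [18, 12, 26, 18, 9, 30, 12],    # Treat2
    [15, 16, 19, 17, 18, 17, 19],   # Treat3
]


def bracket_terms(groups):
    """Return ([AS], [A], [Y]) for a balanced one-way layout."""
    p = len(groups)
    n = len(groups[0])
    term_as = sum(y * y for g in groups for y in g)      # sum of squared observations
    term_a = sum(sum(g) ** 2 for g in groups) / n        # squared level totals over n
    term_y = sum(sum(g) for g in groups) ** 2 / (n * p)  # squared grand total over np
    return term_as, term_a, term_y


term_as, term_a, term_y = bracket_terms(groups)
sst = term_as - term_y
ssb = term_a - term_y
ssw = term_as - term_a
print(term_as, round(term_a, 3), round(term_y, 3))
print(round(sst, 3), round(ssb, 3), round(ssw, 3))
```

Here [AS] = 6318, [A] is about 5868.857 and [Y] about 5833.333, so the sums of squares follow by subtraction.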

Page 10: Statistics Used in One-way Analysis of Variance

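Expected Values of Error Terms

$E(\varepsilon_{i(j)}) = \mu_\varepsilon = 0$

$E\left(\sum_{i=1}^{n} \varepsilon_{i(j)}\right) = n\mu_\varepsilon = 0$

$E\left(\sum_{i=1}^{n} \varepsilon_{i(j)}^2\right) = n\sigma_\varepsilon^2$

$E\left[\left(\sum_{i=1}^{n} \varepsilon_{i(j)}\right)^2\right] = n\sigma_\varepsilon^2$

$E\left(\sum_{j=1}^{p}\sum_{i=1}^{n} \varepsilon_{i(j)}^2\right) = np\sigma_\varepsilon^2$

$E\left[\left(\sum_{j=1}^{p}\sum_{i=1}^{n} \varepsilon_{i(j)}\right)^2\right] = np\sigma_\varepsilon^2$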
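The result for the squared sum of errors may look surprising, since squaring a sum produces cross-product terms. A short derivation, assuming the errors are independent with mean zero and common variance $\sigma_\varepsilon^2$ (as the model for this design states), shows why those terms drop out:

```latex
\begin{aligned}
E\left[\left(\sum_{i=1}^{n} \varepsilon_{i(j)}\right)^2\right]
  &= \sum_{i=1}^{n} E\left(\varepsilon_{i(j)}^2\right)
   + \sum_{i \ne i'} E\left(\varepsilon_{i(j)}\,\varepsilon_{i'(j)}\right) \\
  &= n\sigma_\varepsilon^2
   + \sum_{i \ne i'} E\left(\varepsilon_{i(j)}\right) E\left(\varepsilon_{i'(j)}\right) \\
  &= n\sigma_\varepsilon^2 + 0 = n\sigma_\varepsilon^2
\end{aligned}
```

The same argument applied to all $np$ errors gives $np\sigma_\varepsilon^2$.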

Page 11: Statistics Used in One-way Analysis of Variance

Expected Value of Mean Sum of Squares for the Fixed Effects CR-p Design

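$E(MSB) = \sigma_\varepsilon^2 + \dfrac{n \sum_{j=1}^{p} \alpha_j^2}{p - 1}$

$E(MSW) = \sigma_\varepsilon^2$

$E(F) \approx E(MSB)/E(MSW) = \left(\sigma_\varepsilon^2 + \dfrac{n \sum_{j=1}^{p} \alpha_j^2}{p - 1}\right) / \sigma_\varepsilon^2$

If $H_0$ is true and all $\alpha_j = 0$, then $E(F) \approx 1$.

What is the E(SSB)?

What is the E(SSW)?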
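One way to approach the two questions on this slide: a mean square is a sum of squares divided by its degrees of freedom, so each expected sum of squares is the expected mean square multiplied by those degrees of freedom. For the balanced fixed effects CR-p design with $p$ levels and $n$ observations per level, the degrees of freedom are $p - 1$ between groups and $p(n - 1)$ within groups, which gives:

```latex
\begin{aligned}
E(SSB) &= (p-1)\,E(MSB) = (p-1)\,\sigma_\varepsilon^2 + n\sum_{j=1}^{p}\alpha_j^2 \\
E(SSW) &= p(n-1)\,E(MSW) = p(n-1)\,\sigma_\varepsilon^2
\end{aligned}
```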

Page 12: Statistics Used in One-way Analysis of Variance

Expected Value of Mean Sum of Squares for the Random Effects CR-p Design

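$E(MSB) = \sigma_\varepsilon^2 + n\sigma_\alpha^2$

$E(MSW) = \sigma_\varepsilon^2$

$E(F) \approx E(MSB)/E(MSW) = (\sigma_\varepsilon^2 + n\sigma_\alpha^2) / \sigma_\varepsilon^2$

If $H_0$ is true and $\sigma_\alpha^2 = 0$, then $E(F) \approx 1$.

Remember that for a random effects model $\alpha_j$ is a random variable with mean 0 and variance $\sigma_\alpha^2$.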

What is the E(SSB)?

What is the E(SSW)?
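The expected mean squares for the random effects model can be checked by simulation. The sketch below is an illustration under assumed settings that are not from the lecture: p = 4 levels, n = 5 observations per level, error variance 1, treatment-effect variance 1, a grand mean of 10, and a fixed seed. With these values E(MSB) should be near 1 + 5(1) = 6 and E(MSW) near 1.

```python
import random


def mean_squares(groups):
    """Return (MSB, MSW) for a balanced one-way layout."""
    p = len(groups)
    n = len(groups[0])
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / p
    ssb = n * sum((m - grand_mean) ** 2 for m in group_means)
    ssw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    return ssb / (p - 1), ssw / (p * (n - 1))


def simulate_random_effects(p=4, n=5, sigma_eps=1.0, sigma_alpha=1.0,
                            reps=4000, seed=7):
    """Average MSB and MSW over repeated experiments with random treatment effects."""
    rng = random.Random(seed)
    total_msb = total_msw = 0.0
    for _ in range(reps):
        groups = []
        for _ in range(p):
            alpha_j = rng.gauss(0.0, sigma_alpha)   # random effect for this level
            groups.append([10.0 + alpha_j + rng.gauss(0.0, sigma_eps)
                           for _ in range(n)])
        msb, msw = mean_squares(groups)
        total_msb += msb
        total_msw += msw
    return total_msb / reps, total_msw / reps


avg_msb, avg_msw = simulate_random_effects()
print(avg_msb, avg_msw)   # compare with sigma_eps^2 + n*sigma_alpha^2 and sigma_eps^2
```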

Page 13: Statistics Used in One-way Analysis of Variance

How do you know what ratio of sums of squares to form for the F test?

By finding the expected Mean Sum of Squares, the F statistic can be correctly computed. This will become handy as the designs become more complicated.

The expected values of the mean squares for the fixed and random effects model lead to the same ratios of mean sums of squares for the CR-p design. This will not always be true for more complex designs.

Page 14: Statistics Used in One-way Analysis of Variance

Assumptions for CR-p

F Assumptions

1. Data come from normally distributed populations.
2. Observations within cells are random, or at least observations are randomly assigned to cells. (Cells are determined by treatment levels.)
3. Numerator and denominator of the F statistic are independent.
4. Numerator and denominator are estimates of the same population variance, $\sigma_\varepsilon^2$, when $H_0$ is true.

Page 15: Statistics Used in One-way Analysis of Variance

Model Assumptions for CR-p

1. The model $Y_{ij} = \mu + \alpha_j + \varepsilon_{i(j)}$ contains all the sources of variation that affect $Y_{ij}$.
2. The experiment contains all the treatment levels of interest.
3. The error effect, $\varepsilon_{i(j)}$, (a) is independent of other error terms, (b) is normally distributed within each treatment level, (c) has mean equal to zero, and (d) has constant variance ($\sigma_\varepsilon^2$) across treatment levels.

Page 16: Statistics Used in One-way Analysis of Variance

Testing for Homogeneity of Variance

$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_p^2$

Page 17: Statistics Used in One-way Analysis of Variance

Levene Test / Brown-Forsythe Test for homogeneity of variance

Levene's test is an alternative to the Bartlett test. It is less sensitive than the Bartlett test to departures from normality. If there is strong evidence that the data do in fact come from a normal, or nearly normal, distribution, then Bartlett's test has better performance.

Levene’s test: replace each observation by the absolute value of the deviation of the observation from the group mean and run a one-way ANOVA.

Brown-Forsythe test: modify Levene’s test by using the deviation of each observation from the group median instead of the group mean.
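Both tests reduce to a one-way ANOVA on absolute deviations, which the following sketch spells out with the standard library only. The function names are made up for the example, and the data are the lecture's three treatment columns, assuming the SAS data lines are read three values per row. Only the F statistic is computed; a p-value would need an F distribution routine from a statistics package.

```python
from statistics import mean, median

# Lecture data, assuming rows of (Treat1, Treat2, Treat3) in the SAS data step.
groups = [
    [17, 13, 18, 10, 11, 16, 19],   # Treat1
    [18, 12, 26, 18, 9, 30, 12],    # Treat2
    [15, 16, 19, 17, 18, 17, 19],   # Treat3
]


def one_way_f(groups):
    """F statistic (MSB / MSW) for a one-way ANOVA."""
    p = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ssb / (p - 1)) / (ssw / (n_total - p))


def hov_f(groups, center=mean):
    """Levene (center=mean) or Brown-Forsythe (center=median) F statistic."""
    deviations = [[abs(y - center(g)) for y in g] for g in groups]
    return one_way_f(deviations)


print(hov_f(groups, center=mean))     # Levene version
print(hov_f(groups, center=median))   # Brown-Forsythe version
```

A large F on the deviations indicates that the spread differs across treatment levels.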

Page 18: Statistics Used in One-way Analysis of Variance

HOV: Homogeneity of Variance

If the null hypothesis is rejected, use the Welch or Brown-Forsythe ANOVA test on the means.

Assume 5% significance level as the default value.

Page 19: Statistics Used in One-way Analysis of Variance

Transformation of the Dependent Variable to Achieve HOV or Normality – more effective with unequal group sizes.

Page 20: Statistics Used in One-way Analysis of Variance

Rules of Thumb for Transformations of Y

Page 21: Statistics Used in One-way Analysis of Variance

Plotting Group Variances to Determine Transformation.

Plot the variance of the group (treatment level) by the group mean (x-axis).

Draw a straight line (least squares line) through the points and find the slope, $\beta$.

Use $p = 1 - \beta$ to determine the transformation of the form: $y = x^p$

Round off p and use the closest transformation listed.

Page 22: Statistics Used in One-way Analysis of Variance

Another method for selecting the transformation

Use the Smallest Range Criteria

Page 23: Statistics Used in One-way Analysis of Variance

Kruskal-Wallis: Nonparametric Counterpart to the One-way ANOVA

Page 24: Statistics Used in One-way Analysis of Variance

Guidelines for Alternative Tests

Page 25: Statistics Used in One-way Analysis of Variance

Run the following data using SAS and SPSS (select the HOV and Welch options) for a one-way ANOVA.

/*** SAS Commands ***/

DM "Log;Clear;OUT;Clear;" ;

Data mydata;
  Input Treat1 Treat2 Treat3;
  datalines;
17 18 15
13 12 16
18 26 19
10 18 17
11 9 18
16 30 17
19 12 19
;

Page 26: Statistics Used in One-way Analysis of Variance

Create one column of responses and another column with the grouping variable

/* One data set per treatment column, each copying its column into resp */
Data Treat1; set mydata; resp = Treat1; run;
Data Treat2; set mydata; resp = Treat2; run;
Data Treat3; set mydata; resp = Treat3; run;

/* Stack the three data sets (7 rows each) and label each row with its level */
Data myAnovaData;
  Set Treat1 Treat2 Treat3;
  If _N_ <= 21 then Level = 3;
  If _N_ <= 14 then Level = 2;
  If _N_ <= 7 then Level = 1;
  Keep resp Level;
run;

proc print data = myAnovaData;
run;

proc export data=myAnovaData outfile='d:MyAnovaDatainColformat.dat' dbms=dlm replace;
run;
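For readers who want to see the reshaping step outside SAS, here is a rough Python sketch of the same idea: turn one row per subject triple into one (response, level) pair per observation, with all of Treat1 first, then Treat2, then Treat3. The function name `wide_to_long` is made up for the example, and the rows assume the data lines hold three values each.

```python
# Wide layout: each row is (Treat1, Treat2, Treat3), as in the SAS data step.
wide_rows = [
    (17, 18, 15),
    (13, 12, 16),
    (18, 26, 19),
    (10, 18, 17),
    (11, 9, 18),
    (16, 30, 17),
    (19, 12, 19),
]


def wide_to_long(rows):
    """Return a list of (resp, level) pairs, stacked one treatment column at a time."""
    n_levels = len(rows[0])
    long_rows = []
    for level in range(n_levels):
        for row in rows:
            long_rows.append((row[level], level + 1))   # levels numbered from 1
    return long_rows


long_rows = wide_to_long(wide_rows)
for resp, level in long_rows[:3]:
    print(resp, level)
```

The stacked order matches the SAS `Set Treat1 Treat2 Treat3` statement, so rows 1 to 7 belong to level 1, rows 8 to 14 to level 2, and rows 15 to 21 to level 3.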

Page 27: Statistics Used in One-way Analysis of Variance

SAS proc to get the Welch and Levene HOV tests

proc glm data = myAnovaData;
  class Level;
  model resp = Level;
  means Level / hovtest welch;
run;
quit;
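To show what the Welch option computes, the sketch below implements the usual textbook form of Welch's variance-weighted F statistic in plain Python. It is an illustration, not SAS's own code: the function name is made up, the data assume the lecture's data lines are read three values per row, and only the statistic and its degrees of freedom are returned (no p-value).

```python
from statistics import mean, variance

# Lecture data, assuming rows of (Treat1, Treat2, Treat3) in the SAS data step.
groups = [
    [17, 13, 18, 10, 11, 16, 19],   # Treat1
    [18, 12, 26, 18, 9, 30, 12],    # Treat2
    [15, 16, 19, 17, 18, 17, 19],   # Treat3
]


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's test of equal means with unequal variances."""
    p = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    weights = [n / variance(g) for n, g in zip(sizes, groups)]   # w_j = n_j / s_j^2
    total_w = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_w
    numerator = sum(w * (m - weighted_mean) ** 2
                    for w, m in zip(weights, means)) / (p - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    denominator = 1 + 2 * (p - 2) / (p ** 2 - 1) * lam
    df2 = (p ** 2 - 1) / (3 * lam)
    return numerator / denominator, p - 1, df2


f_stat, df1, df2 = welch_anova(groups)
print(f_stat, df1, df2)
```

The denominator degrees of freedom are generally not an integer, which is expected for this test.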

Page 28: Statistics Used in One-way Analysis of Variance

SPSS Analyze > Compare Means > One-Way ANOVA

Page 29: Statistics Used in One-way Analysis of Variance

SPSS Options for the One-Way ANOVA

Test for Equal Group Variances

Test for Means assuming unequal variances