Stat 470-4 Today: Multiple comparisons, diagnostic checking, an example After these notes, we will have looked at 1.1-1.3 (skip figures 1.2 and 1.3, last two paragraphs of section 1.3), 1.6 (skip matrix notation and constraints), 1.7 (Tukey method only) and 1.9 (ignore H matrix notation on page 35), 2.1, 2.2 We will not do 1.5 nor 1.8 Assignment 1:
Dec 21, 2015

Stat 470-4

• Today: Multiple comparisons, diagnostic checking, an example

• After these notes, we will have looked at 1.1-1.3 (skip figures 1.2 and 1.3, last two paragraphs of section 1.3), 1.6 (skip matrix notation and constraints), 1.7 (Tukey method only) and 1.9 (ignore H matrix notation on page 35), 2.1, 2.2

• We will not do 1.5 nor 1.8

• Assignment 1:

Multiple Comparisons

• In the previous example, we saw that there was a significant treatment effect…so what?

• If an ANOVA is conducted and the analysis suggests that there is a significant treatment effect, then a reasonable question to ask is: which treatments differ from one another?

Multiple Comparisons

• Would like to see if there is a difference between treatments i and j

• Can use two-sample t-test statistic to do this

• For testing $H_0: \tau_i = \tau_j$ versus $H_A: \tau_i \neq \tau_j$, reject if

• Perform many of these tests
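The pairwise statistic mentioned on this slide can be sketched as follows. This is a minimal illustration, assuming the usual one-way layout form, t_ij = (ybar_j - ybar_i) / (s * sqrt(1/n_j + 1/n_i)), where s^2 is the pooled within-treatment mean square. The function names and the data are hypothetical and are not taken from the course example.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pooled_sd(groups):
    """Square root of the within-treatment mean square, SSE / (N - k)."""
    sse = 0.0
    n_total = 0
    for ys in groups.values():
        ybar = mean(ys)
        sse += sum((y - ybar) ** 2 for y in ys)
        n_total += len(ys)
    return sqrt(sse / (n_total - len(groups)))


def pairwise_t(groups):
    """t statistic for every pair of treatments (i, j), using the pooled SD."""
    s = pooled_sd(groups)
    stats = {}
    for (i, yi), (j, yj) in combinations(groups.items(), 2):
        se = s * sqrt(1 / len(yi) + 1 / len(yj))
        stats[(i, j)] = (mean(yj) - mean(yi)) / se
    return stats


# Hypothetical data: three treatments, three observations each.
data = {"T1": [1.0, 2.0, 3.0], "T2": [2.0, 3.0, 4.0], "T3": [5.0, 6.0, 7.0]}
t_stats = pairwise_t(data)
```

Each |t_ij| would then be compared with a t critical value on N - k degrees of freedom taken from a table. Repeating that comparison for every pair is what raises the error-rate concern discussed next.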

Multiple Comparisons

• Perform many of these tests

• Error rate must be controlled
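To see why the error rate needs controlling, the sketch below counts the pairwise tests among k treatments and approximates the chance of at least one false rejection. It assumes the tests are independent, which pairwise comparisons sharing a pooled variance are not exactly, so the figure is only a rough guide. The function names are hypothetical.

```python
from math import comb


def number_of_pairs(k):
    """Number of distinct treatment pairs (i, j) with i < j."""
    return comb(k, 2)


def familywise_error_rate(alpha, m):
    """P(at least one false rejection) for m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


k = 4                                   # four treatments, e.g. T1 to T4
m = number_of_pairs(k)                  # 6 pairwise tests
fwer = familywise_error_rate(0.05, m)   # roughly 0.26
```

With four treatments and each test run at the 5% level, the chance of at least one spurious "difference" is already about one in four under this approximation.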

Tukey Method

• Tests:

• Confidence Interval:
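The formulas on this slide did not survive extraction. As a hedged reminder, the standard form of the Tukey method for a one-way layout is given below. The notation is an assumption: k treatments, N observations in total, n_i observations in treatment i, \hat{\sigma}^2 the within-treatment mean square, and q_{k,N-k,\alpha} the upper \alpha point of the studentized range distribution.

```latex
% Test: declare treatments i and j different at overall level \alpha if
|t_{ij}| > \frac{1}{\sqrt{2}}\, q_{k,\,N-k,\,\alpha},
\qquad
t_{ij} = \frac{\bar{y}_{j.} - \bar{y}_{i.}}
              {\hat{\sigma}\sqrt{1/n_j + 1/n_i}}

% Simultaneous 100(1-\alpha)\% confidence interval for \tau_j - \tau_i:
\bar{y}_{j.} - \bar{y}_{i.}
\;\pm\;
\frac{1}{\sqrt{2}}\, q_{k,\,N-k,\,\alpha}\,
\hat{\sigma}\sqrt{1/n_j + 1/n_i}
```

The only change from the unadjusted pairwise t-test is the critical value: the studentized range quantile replaces the t quantile, so the stated level holds across all pairs at once.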

Back to Example

Diagnostic Checking – Residual Analysis

• To support the assumptions on which the analysis is based, we need to check for

– effects that have not been captured by the model

– unequal variances

– non-Normality

– sequence effects

• Should do this before hypothesis testing and multiple comparisons

[Figure: Dotplots of y by Treatment (group means are indicated by lines). x-axis: Treatment (T1 to T4); y-axis: y (3 to 8)]

The data plot (limited data) shows no strong evidence of non-Normality or unequal variances

Diagnostic Checking

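• ANOVA model: $y_{ij} = \eta + \tau_i + e_{ij}$

• Predicted response: $\hat{y}_i = \hat{\eta} + \hat{\tau}_i$, where

– $\hat{\eta} = \bar{y}_{..}$ and $\hat{\tau}_i = (\bar{y}_{i.} - \bar{y}_{..})$

• Residual: $r_{ij} = (y_{ij} - \hat{y}_i)$

• The residual estimates the error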
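The fitted values and residuals of the one-way model can be computed directly from the group means. The sketch below is illustrative only: the function name and the data are hypothetical, and the symbols follow the usual notation of overall mean, treatment effect, and residual.

```python
from statistics import mean


def one_way_fit(groups):
    """Return the grand mean, treatment effects, and residuals for a one-way layout."""
    all_y = [y for ys in groups.values() for y in ys]
    eta_hat = mean(all_y)                                  # overall mean, ybar..
    tau_hat = {t: mean(ys) - eta_hat for t, ys in groups.items()}
    residuals = {
        t: [y - (eta_hat + tau_hat[t]) for y in ys]        # r_ij = y_ij - yhat_i
        for t, ys in groups.items()
    }
    return eta_hat, tau_hat, residuals


# Hypothetical data: two treatments, three observations each.
data = {"T1": [4.0, 5.0, 6.0], "T2": [7.0, 8.0, 6.0]}
eta_hat, tau_hat, residuals = one_way_fit(data)
```

Because the predicted response for treatment i is just its sample mean, the residuals within each treatment always sum to zero.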

Diagnostic Plots

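• Errors are assumed to be normally distributed

– Useful plot: normal probability plot of the residuals

• Errors are assumed to be independent

– Useful plot: residuals against the time sequence of data collection

• Equal variances in each group

– Useful plot: residuals against treatment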

Normality Check

• Dot plot or histogram of residuals

• Normal probability plot of residuals (via software or by hand - see class handout)

[Figure: Normal Q-Q Plot of Residual for RESPONSE. x-axis: Observed Value (-.6 to .6); y-axis: Expected Normal Value (-.6 to .6)]
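A normal probability plot can be built by hand by pairing the ordered residuals with standard normal quantiles. The sketch below assumes the plotting positions (i - 0.5)/n, which is one common convention. The class handout may use a different one, and software such as SPSS may rescale the expected values to the data scale. The function name and residual values are hypothetical.

```python
from statistics import NormalDist


def normal_plot_points(residuals):
    """Pair each ordered residual with its expected standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()           # mean 0, standard deviation 1
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(ordered, quantiles))


# Hypothetical residuals.
points = normal_plot_points([0.3, -0.1, 0.0, -0.2])
```

If the Normality assumption is reasonable, the plotted points should fall close to a straight line.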

Independence Check

• Plot residuals in the time sequence in which the data were collected

• X-axis denotes the sequence, Y-axis denotes the residual values

• Should observe

Independence Check

• Suppose the sequence of the observations (going across rows from top to bottom in the tabled data) is 1, 2, 11, 9, 5, 7, 6, 3, 4, 12, 10, 8

[Figure: Time Plot of residuals. x-axis: Sequence (0 to 14); y-axis: Residual for RESPONSE (-.6 to .4)]
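To draw the time plot, the residuals have to be rearranged from table order into the order in which the runs were made. The sketch below uses the run sequence quoted on this slide. The residual values and the function name are hypothetical, since the data table is not reproduced in these notes.

```python
# Run sequence from the slide, listed in table order (across rows, top to bottom).
run_order = [1, 2, 11, 9, 5, 7, 6, 3, 4, 12, 10, 8]

# Hypothetical residuals, in the same table order.
residuals = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.1, 0.0, -0.2, 0.2, -0.1]


def in_run_order(values, order):
    """Sort values by the run number at which each one was collected."""
    pairs = sorted(zip(order, values), key=lambda pair: pair[0])
    return [value for _, value in pairs]


# Residuals as they would appear from left to right on the time plot.
time_series = in_run_order(residuals, run_order)
```

Plotting `time_series` against 1, 2, ..., 12 gives the sequence plot. Any trend or drift in it would cast doubt on the independence assumption.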

Equal Variances

• A useful plot is: residuals versus treatment

• Should observe:

Equal Variances

[Figure: Plot of Residual Versus Treatment. x-axis: Packaging (0.0 to 5.0); y-axis: Residual for RESPONSE (-.6 to .4)]
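A numerical companion to the residual-versus-treatment plot is the sample standard deviation within each treatment. Since each residual is the observation minus its group mean, the spread of the residuals in a group equals the spread of the raw observations in that group. The data and function name below are hypothetical.

```python
from statistics import stdev


def spread_by_treatment(groups):
    """Sample standard deviation of the observations within each treatment."""
    return {t: stdev(ys) for t, ys in groups.items()}


# Hypothetical data in which T3 is visibly more variable than T1 and T2.
data = {
    "T1": [4.0, 5.0, 6.0],
    "T2": [7.0, 8.0, 6.0],
    "T3": [2.0, 6.0, 10.0],
}
sds = spread_by_treatment(data)
```

Here T3 has four times the spread of the other two treatments, which on the plot would show up as a much taller column of points.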

Comments

• The F-test is fairly robust – it is not very sensitive to departures from the assumption of Normal distributions.

• Often, simple transformations, such as the logarithm or square root, can make the Normal distribution assumption and the equal variance assumption more appropriate (Chapter 2)
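The effect of a transformation on the equal variance assumption can be illustrated with a small sketch. The data are hypothetical and deliberately constructed so that the spread grows in proportion to the mean, which is the situation where a logarithm tends to help. The helper name is also hypothetical.

```python
from math import log
from statistics import stdev


def sds(groups):
    """Sample standard deviation within each treatment."""
    return {t: stdev(ys) for t, ys in groups.items()}


# Hypothetical data: T2 is T1 multiplied by 10, so its spread is 10 times larger.
raw = {"T1": [1.0, 2.0, 4.0], "T2": [10.0, 20.0, 40.0]}
logged = {t: [log(y) for y in ys] for t, ys in raw.items()}

raw_sds = sds(raw)      # very different spreads on the original scale
log_sds = sds(logged)   # equal spreads on the log scale
```

On the log scale a multiplicative difference between groups becomes an additive shift, so the two within-group spreads coincide in this constructed example.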

Summary: Completely Randomized Design, One-Way ANOVA

• Method: Random assignment of treatments to experimental units

• ANOVA: Compare variation among treatments to variation within treatments to assess evidence of a difference among treatments

• Investigate and identify differences among Treatments, if any. Act on the findings
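The comparison of variation among treatments with variation within treatments can be sketched as an F ratio. This is a minimal illustration with hypothetical data and a hypothetical function name, and it is not the example analyzed in class.

```python
from statistics import mean


def one_way_anova(groups):
    """Sums of squares, mean squares, and F ratio for a one-way layout."""
    all_y = [y for ys in groups.values() for y in ys]
    grand = mean(all_y)
    k, n_total = len(groups), len(all_y)

    # Variation among treatment means, weighted by group size.
    ss_treat = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    # Variation of observations around their own treatment mean.
    ss_error = 0.0
    for ys in groups.values():
        ybar = mean(ys)
        ss_error += sum((y - ybar) ** 2 for y in ys)

    ms_treat = ss_treat / (k - 1)
    ms_error = ss_error / (n_total - k)
    return {"ss_treat": ss_treat, "ss_error": ss_error, "F": ms_treat / ms_error}


data = {"T1": [1.0, 2.0, 3.0], "T2": [2.0, 3.0, 4.0], "T3": [5.0, 6.0, 7.0]}
table = one_way_anova(data)
```

A large F ratio, judged against the F distribution with k - 1 and N - k degrees of freedom, is evidence of a difference among treatments.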

Comment: One-Way Model

• The one-way model, $y_{ij} = \eta + \tau_i + e_{ij}$, $e_{ij} \sim NID(0, \sigma^2)$, can be and is applied to data obtained in ways other than a completely randomized design

• Example: starting salaries for MBAs at different companies. Company is not a treatment that is applied to experimental units

• Analyzing the data according to the above model can answer whether apparent differences between companies are real or could be just due to chance.

• The randomness involved comes from the randomness of the hiring and salary-determination processes, not the random assignment of treatments to experimental units

General Linear Model

• ANOVA model can be viewed as a special case of the general linear model or regression model

• Suppose we have a response, y, which is thought to be related to p predictors (sometimes called explanatory variables or regressors)

• Predictors: x1, x2,…,xp

• Model:
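The model formula on this slide did not survive extraction. The standard form of the general linear model is sketched below, together with one way of writing the one-way ANOVA model in that form. The indicator coding shown (treatment 1 as baseline) is an assumption made for illustration, and other codings are equally valid.

```latex
% General linear model with p predictors:
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + e

% One-way layout with k treatments as a special case, using indicator
% predictors x_l = 1 if the observation received treatment l, else 0:
y = \beta_0 + \beta_2 x_2 + \cdots + \beta_k x_k + e
% Under this coding, \beta_0 is the mean of treatment 1 and
% \beta_l is the difference between the means of treatment l and treatment 1.
```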

Example: Rainfall (Exercise 2.16)

• In winter, a plastic rain gauge cannot be used to collect precipitation because it will freeze and crack. Instead, metal cans are used to collect snowfall and the snow is allowed to melt indoors. The water is then poured into a plastic rain gauge and a measurement recorded. An estimate of snowfall is obtained by multiplying this measurement by 0.44.

• One observer questions this and decides to collect data to test the validity of this approach

• For each rainfall in a summer, she measures: (i) rainfall using a plastic rain gauge, (ii) rainfall using a metal can

• What is the current model being used?

Example: Rainfall (Exercise 2.16)

[Figure: Scatter Plot of Rainfall Data. x-axis: Rain Collected in Metal Can (x), 0 to 7; y-axis: Rain Collected in Plastic Gauge, 0.0 to 4.0]

Example: Rainfall (Exercise 2.16)

• Seems to be a linear relationship

• Will use regression to establish linear relationship between x and y

• What should the slope be?
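If the 0.44 rule is right, a straight line fitted to the plastic-gauge reading (y) against the metal-can reading (x) should have a slope near 0.44 and an intercept near 0. The sketch below fits a least squares line by hand. The data are hypothetical and generated to follow the rule exactly, and the function name is illustrative.

```python
from statistics import mean


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Hypothetical readings that obey y = 0.44 * x exactly.
x = [1.0, 2.0, 4.0, 6.0]
y = [0.44 * xi for xi in x]
intercept, slope = least_squares_line(x, y)
```

With real measurements the estimates will not match 0.44 and 0 exactly, so the question becomes whether the differences are larger than sampling error would explain.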

Example: Rainfall (Exercise 2.16)

Coefficients (Dependent Variable: Y)

Model 1       B            Std. Error    Beta     t         Sig.
(Constant)    3.579E-02    .012                   2.931     .005
X             .444         .006          .995     76.264    .000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

ANOVA (Predictors: (Constant), X; Dependent Variable: Y)

Model 1       Sum of Squares    df    Mean Square    F           Sig.
Regression    25.860            1     25.860         5816.213    .000
Residual      .245              55    .004
Total         26.105            56

Model Summary (Predictors: (Constant), X; Dependent Variable: Y)

Model    R       R Square    Adjusted R Square    Std. Error of the Estimate
1        .995    .991        .990                 .06668
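Several entries in the output can be checked by hand, and the same numbers give a rough test of whether the slope is consistent with 0.44. The sketch below uses only the rounded values printed in the tables, so small discrepancies from the reported F and standard error are expected. The test of slope = 0.44 is an illustration and is not part of the printed output.

```python
from math import sqrt

# Values copied from the regression output for Exercise 2.16.
slope, se_slope = 0.444, 0.006
ss_regression, ss_residual, ss_total = 25.860, 0.245, 26.105
df_residual = 55

# t statistic for H0: slope = 0.44 (the multiplier currently in use).
t_vs_044 = (slope - 0.44) / se_slope

# Quantities that should reproduce the Model Summary and ANOVA tables.
r_squared = ss_regression / ss_total        # compare with R Square = .991
ms_residual = ss_residual / df_residual     # compare with Mean Square = .004
residual_sd = sqrt(ms_residual)             # compare with Std. Error = .06668
f_stat = ss_regression / ms_residual        # compare with F = 5816.213
```

The t statistic for the slope comes out at about 0.67, well below the usual critical value of roughly 2 on 55 degrees of freedom, so these rounded figures give no evidence against a slope of 0.44. The recomputed F (about 5805) differs slightly from the printed 5816.213 because the residual sum of squares is shown to only three decimals.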