Lack of Fit (LOF) Test A formal F test for checking whether a specific type of regression function adequately fits the data

Jan 14, 2016


Martin Fowler

Lack of Fit (LOF) Test

A formal F test for checking whether a specific type of regression function adequately fits the data

Example 1

Do the data suggest that a linear function is adequate in describing the relationship between skin cancer mortality and latitude?

[Figure: Regression Plot of Mortality versus Latitude. Fitted line: Mortality = 389.189 - 5.97764 Latitude; S = 19.1150, R-Sq = 68.0%, R-Sq(adj) = 67.3%]

Example 2

Do the data suggest that a linear function is adequate in describing the relationship between the length and weight of an alligator?

[Figure: Regression Plot of Weight versus Length. Fitted line: Weight = -393.264 + 5.90235 Length; S = 54.0115, R-Sq = 83.6%, R-Sq(adj) = 82.9%]

Example 3

Do the data suggest that a linear function is adequate in describing the relationship between iron content and weight loss due to corrosion?

[Figure: Regression Plot of wgtloss versus iron. Fitted line: wgtloss = 129.787 - 24.0199 iron; S = 3.05778, R-Sq = 97.0%, R-Sq(adj) = 96.7%]

Lack of fit test for a linear function … the basic idea

• Use the general linear test approach.

• The full model is the most general model, with no restrictions on the means μj at each Xj level.

• The reduced model assumes that the μj are a linear function of the Xj, i.e., μj = β0 + β1Xj.

• Determine SSE(F), SSE(R), and the F statistic.

• If the P-value is small, reject the reduced model (H0: No lack of fit (linear)) in favor of the full model (HA: Lack of fit (not linear)).
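The general linear test reduces to a single ratio once the two error sums of squares and their degrees of freedom are known. A minimal Python sketch follows; the function name `general_linear_test` is an assumption made for illustration, and the inputs are the rounded Example 3 values reported in the Minitab ANOVA table later in these slides.

```python
def general_linear_test(sse_reduced, df_reduced, sse_full, df_full):
    """Return F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


# Rounded values from the Example 3 ANOVA table (iron and corrosion):
# reduced (linear) model: SSE = 102.9 on 11 df; full model: SSPE = 11.8 on 6 df.
f_star = general_linear_test(102.9, 11, 11.8, 6)
print(f"F* = {f_star:.2f}")
```

Because the inputs are rounded, the result is close to, but not exactly, the 9.28 that Minitab reports from the unrounded sums of squares.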


Assumptions and requirements

• The Y observations for a given X level are independent.

• The Y observations for a given X level are normally distributed.

• The distribution of Y for each level of X has the same variance.

• LOF test requires repeat observations, called replications (or replicates), for at least one of the X values.

Notation

iron   wgtloss
0.01   127.6
0.01   130.1
0.01   128.0
0.48   124.0
0.48   122.0
0.71   110.8
0.71   113.1
0.95   103.9
1.19   101.5
1.44   92.3
1.44   91.4
1.96   83.7
1.96   86.2

• c different levels of X (c=7 with X1=0.01, X2=0.48, …, X7=1.96)

• nj = number of replicates for jth level of X (Xj) (n1=3, n2=2, …, n7=2) for a total of n = n1 + … + nc observations.

• Yij = observed value of the response variable for the ith replicate of Xj (Y11=127.6, Y21=130.1, …, Y27=86.2)
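The notation can be checked by grouping the iron data by X level. The sketch below is illustrative only; the variable names (`iron`, `wgtloss`, `groups`, `levels`, `n_j`) are assumptions, and the data are the 13 observations listed on this slide.

```python
from collections import defaultdict

# Iron content (X) and weight loss (Y) from the Notation slide.
iron = [0.01, 0.01, 0.01, 0.48, 0.48, 0.71, 0.71,
        0.95, 1.19, 1.44, 1.44, 1.96, 1.96]
wgtloss = [127.6, 130.1, 128.0, 124.0, 122.0, 110.8, 113.1,
           103.9, 101.5, 92.3, 91.4, 83.7, 86.2]

# Collect the responses observed at each distinct X level.
groups = defaultdict(list)
for x, y in zip(iron, wgtloss):
    groups[x].append(y)

levels = sorted(groups)                      # X_1, ..., X_c
c = len(levels)                              # number of distinct levels
n_j = [len(groups[x]) for x in levels]       # replicates per level
n = sum(n_j)                                 # total observations

print(c, n_j, n)  # 7 [3, 2, 2, 1, 1, 2, 2] 13
```

Here `groups[levels[0]][0]` is Y11 and `groups[levels[6]][1]` is Y27 in the slide's notation.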

The Full Model

Assume nothing about (or “put no structure on”) the means of the responses, μj, at the jth level of X:

$$Y_{ij} = \mu_j + \varepsilon_{ij}$$

Make usual assumptions about error terms (εij): normal, mean 0, constant variance σ2.

Least squares estimates of μj are sample means of responses at Xj level:

$$\hat{\mu}_j = \bar{Y}_j$$

“Pure error sum of squares”:

$$SSE(F) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2 = SSPE$$
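As a rough check of the full-model fit, the pure error sum of squares can be computed by summing squared deviations from each level's sample mean. This is a sketch under the assumption that replicates share exactly the same X value; the helper name `pure_error_ss` is hypothetical, and the data are the iron observations from the Notation slide.

```python
from collections import defaultdict
from statistics import mean


def pure_error_ss(x, y):
    """Return (SSPE, df) where df = n - c, the full-model error df."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    sspe = 0.0
    for ys in groups.values():
        level_mean = mean(ys)  # least squares estimate of mu_j
        sspe += sum((yi - level_mean) ** 2 for yi in ys)
    return sspe, len(y) - len(groups)


iron = [0.01, 0.01, 0.01, 0.48, 0.48, 0.71, 0.71,
        0.95, 1.19, 1.44, 1.44, 1.96, 1.96]
wgtloss = [127.6, 130.1, 128.0, 124.0, 122.0, 110.8, 113.1,
           103.9, 101.5, 92.3, 91.4, 83.7, 86.2]

sspe, df_full = pure_error_ss(iron, wgtloss)
print(f"SSPE = {sspe:.1f} on {df_full} df")
```

Levels with a single observation contribute nothing to SSPE, which is why the test needs replicates at one or more X values.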

The Reduced Model

Assume the means of the responses, μj, are linearly related to the jth level of X (same model as before, just modified subscripts):

$$Y_{ij} = \beta_0 + \beta_1 X_j + \varepsilon_{ij}$$

Make usual assumptions about error terms (εij): normal, mean 0, constant variance σ2.

Least squares estimates of μj are as usual:

$$\hat{Y}_{ij} = b_0 + b_1 X_j$$

“Error sum of squares”:

$$SSE(R) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \hat{Y}_{ij})^2 = SSE$$
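The reduced-model fit is ordinary simple linear regression. A minimal sketch using only the standard library is shown below; the function name `fit_line` is an assumption, and the data are the iron observations from the Notation slide.

```python
def fit_line(x, y):
    """Least squares intercept b0 and slope b1 for Y = b0 + b1 * X."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


iron = [0.01, 0.01, 0.01, 0.48, 0.48, 0.71, 0.71,
        0.95, 1.19, 1.44, 1.44, 1.96, 1.96]
wgtloss = [127.6, 130.1, 128.0, 124.0, 122.0, 110.8, 113.1,
           103.9, 101.5, 92.3, 91.4, 83.7, 86.2]

b0, b1 = fit_line(iron, wgtloss)

# SSE(R): squared residuals around the fitted line, on n - 2 df.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(iron, wgtloss))
df_reduced = len(iron) - 2

print(f"wgtloss = {b0:.3f} + ({b1:.4f}) iron, SSE = {sse:.1f} on {df_reduced} df")
```

The coefficients should agree with the fitted line shown for Example 3, wgtloss = 129.787 - 24.0199 iron.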

Error sum of squares decomposition

$$Y_{ij} - \hat{Y}_{ij} = (Y_{ij} - \bar{Y}_j) + (\bar{Y}_j - \hat{Y}_{ij})$$

error deviation = pure error deviation + lack of fit deviation

$$\sum_j \sum_i (Y_{ij} - \hat{Y}_{ij})^2 = \sum_j \sum_i (Y_{ij} - \bar{Y}_j)^2 + \sum_j \sum_i (\bar{Y}_j - \hat{Y}_{ij})^2$$

$$SSE = SSPE + SSLF$$
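The decomposition can be verified numerically by computing all three sums of squares directly, rather than obtaining SSLF by subtraction. The sketch below assumes the iron data from the Notation slide; all variable names are illustrative.

```python
from collections import defaultdict

iron = [0.01, 0.01, 0.01, 0.48, 0.48, 0.71, 0.71,
        0.95, 1.19, 1.44, 1.44, 1.96, 1.96]
wgtloss = [127.6, 130.1, 128.0, 124.0, 122.0, 110.8, 113.1,
           103.9, 101.5, 92.3, 91.4, 83.7, 86.2]

# Least squares line for the reduced model.
n = len(iron)
xbar = sum(iron) / n
ybar = sum(wgtloss) / n
sxx = sum((x - xbar) ** 2 for x in iron)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(iron, wgtloss))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

groups = defaultdict(list)
for x, y in zip(iron, wgtloss):
    groups[x].append(y)

sse = sspe = sslf = 0.0
for x, ys in groups.items():
    fitted = b0 + b1 * x              # Y-hat at this level
    level_mean = sum(ys) / len(ys)    # Y-bar at this level
    for y in ys:
        sse += (y - fitted) ** 2              # error deviation
        sspe += (y - level_mean) ** 2         # pure error deviation
        sslf += (level_mean - fitted) ** 2    # lack of fit deviation

print(f"SSE = {sse:.1f}, SSPE = {sspe:.1f}, SSLF = {sslf:.1f}")
```

The cross-product term vanishes because deviations from a level mean sum to zero within each level, so `sse` and `sspe + sslf` agree up to floating-point rounding.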

The F test

$$F^* = \frac{SSE(R) - SSE(F)}{df_R - df_F} \div \frac{SSE(F)}{df_F}$$

$$df_R = n - 2, \qquad df_F = n - c$$

$$SSE(R) = SSE, \qquad SSE(F) = SSPE$$

$$F^* = \frac{SSE - SSPE}{(n-2) - (n-c)} \div \frac{SSPE}{n-c} = \frac{SSLF}{c-2} \div \frac{SSPE}{n-c} = \frac{MSLF}{MSPE}$$
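Putting the pieces together, the lack of fit statistic F* = MSLF / MSPE can be computed from raw data in one function. This is a sketch, not the routine Minitab uses; the name `lack_of_fit_test` is hypothetical, and it assumes more than two distinct X levels and one or more levels with replicates. The data are the iron observations from the Notation slide.

```python
from collections import defaultdict


def lack_of_fit_test(x, y):
    """Return (F*, df_lack_of_fit, df_pure_error) for a straight-line fit."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar

    # SSE(R): error sum of squares for the reduced (linear) model.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # SSE(F) = SSPE: deviations from the sample mean at each X level.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    c = len(groups)
    sspe = sum(
        sum((yi - sum(ys) / len(ys)) ** 2 for yi in ys)
        for ys in groups.values()
    )

    df_lf, df_pe = c - 2, n - c
    mslf = (sse - sspe) / df_lf
    mspe = sspe / df_pe
    return mslf / mspe, df_lf, df_pe


iron = [0.01, 0.01, 0.01, 0.48, 0.48, 0.71, 0.71,
        0.95, 1.19, 1.44, 1.44, 1.96, 1.96]
wgtloss = [127.6, 130.1, 128.0, 124.0, 122.0, 110.8, 113.1,
           103.9, 101.5, 92.3, 91.4, 83.7, 86.2]

f_star, df_lf, df_pe = lack_of_fit_test(iron, wgtloss)
print(f"F* = {f_star:.2f} on ({df_lf}, {df_pe}) df")
```

For the iron data this should reproduce the Lack of Fit row of the Example 3 ANOVA table (F = 9.28 with 5 and 6 degrees of freedom).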


The Decision (Intuitively)

• If the largest portion of the error sum of squares is due to lack of fit, the F* statistic should be large.

• A large F* statistic leads to a small P-value (determined by the F(c-2, n-c) distribution).

• If the P-value is small, reject the null and conclude significant lack of (linear) fit.
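The P-value is the upper-tail area of an F distribution with c-2 numerator and n-c denominator degrees of freedom. In practice a statistics package supplies it; the sketch below is a standard-library approximation offered only for illustration. It uses the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1·f), where I_x is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name `f_upper_tail` and the step count are assumptions, and the sketch assumes both degrees of freedom are at least 2 so the integrand stays bounded.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for F ~ F(df1, df2) by Simpson's rule.

    Assumes df1 >= 2 and df2 >= 2, and that steps is even.
    """
    a, b = df2 / 2.0, df1 / 2.0
    upper = df2 / (df2 + df1 * f_value)
    # Beta function B(a, b), computed through log-gamma for stability.
    beta = math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

    def density(t):
        # Beta(a, b) density; bounded on [0, 1] because a, b >= 1.
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0) / beta

    h = upper / steps
    total = density(0.0) + density(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * density(k * h)
    return total * h / 3.0


# Example 3 (iron and corrosion): F* = 9.28 with c - 2 = 5 and n - c = 6 df.
p_value = f_upper_tail(9.28, 5, 6)
print(f"P-value = {p_value:.3f}")
```

A value this small leads to rejecting the null hypothesis of no lack of fit for the iron data, in line with the P-value Minitab reports for Example 3.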

LOF Test summarized in an ANOVA Table

Source            DF     SS      MS                   F
Regression        1      SSR     MSR = SSR/1          MSR/MSE
Residual Error    n-2    SSE     MSE = SSE/(n-2)
  Lack of Fit     c-2    SSLF    MSLF = SSLF/(c-2)    MSLF/MSPE
  Pure Error      n-c    SSPE    MSPE = SSPE/(n-c)
Total             n-1    SSTO


LOF Test in Minitab

• Stat >> Regression >> Regression …

• Specify predictor and response.

• Under Options…, under Lack of Fit Tests, select box labeled “Pure error.”

• Select OK. Select OK. ANOVA table appears in session window.

Example 1

Do the data suggest that a linear function is adequate in describing the relationship between skin cancer mortality and latitude?

[Figure: Regression Plot of Mortality versus Latitude. Fitted line: Mortality = 389.189 - 5.97764 Latitude; S = 19.1150, R-Sq = 68.0%, R-Sq(adj) = 67.3%]

Example 1: Mortality and Latitude

Analysis of Variance

Source            DF     SS       MS       F        P
Regression        1      36464    36464    99.80    0.000
Residual Error    47     17173    365
  Lack of Fit     30     12863    429      1.69     0.128
  Pure Error      17     4310     254
Total             48     53637

19 rows with no replicates

Example 2

Do the data suggest that a linear function is adequate in describing the relationship between the length and weight of an alligator?

[Figure: Regression Plot of Weight versus Length. Fitted line: Weight = -393.264 + 5.90235 Length; S = 54.0115, R-Sq = 83.6%, R-Sq(adj) = 82.9%]

Example 2: Alligator length and weight

Analysis of Variance

Source            DF     SS        MS        F         P
Regression        1      342350    342350    117.35    0.000
Residual Error    23     67096     2917
  Lack of Fit     17     66567     3916      44.36     0.000
  Pure Error      6      530       88
Total             24     409446

14 rows with no replicates

Example 3

Do the data suggest that a linear function is adequate in describing the relationship between iron content and weight loss due to corrosion?

[Figure: Regression Plot of wgtloss versus iron. Fitted line: wgtloss = 129.787 - 24.0199 iron; S = 3.05778, R-Sq = 97.0%, R-Sq(adj) = 96.7%]

Example 3: Iron and corrosion

Analysis of Variance

Source            DF     SS        MS        F         P
Regression        1      3293.8    3293.8    352.27    0.000
Residual Error    11     102.9     9.4
  Lack of Fit     5      91.1      18.2      9.28      0.009
  Pure Error      6      11.8      2.0
Total             12     3396.6

2 rows with no replicates


Closing comment #1

• The t-test or F=MSR/MSE test only tests whether there is a linear relation between the predictor and response (β1≠0) or not (β1=0).

• Failing to reject the null does not imply that there is no relation between the predictor and response.

Example: Closing comment #1

[Figure: plot of Y* versus X, with X ranging from -5 to 5 and Y* from 0 to 40]

Example: Closing comment #1

The regression equation is
Y* = 14.1 - 0.100 X

Predictor    Coef       SE Coef    T        P
Constant     14.118     2.598      5.44     0.000
X            -0.0998    0.6942     -0.14    0.887

S = 13.25   R-Sq = 0.1%   R-Sq(adj) = 0.0%

Analysis of Variance

Source            DF     SS        MS       F         P
Regression        1      3.6       3.6      0.02      0.887
Residual Error    24     4210.4    175.4
  Lack of Fit     11     4188.3    380.8    223.87    0.000
  Pure Error      13     22.1      1.7
Total             25     4214.0
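The same pattern, a slope test that finds nothing alongside a lack of fit test that rejects strongly, can be reproduced with a small made-up data set. The data below are hypothetical (they are not the data behind the output above): a symmetric curve y = x² with two replicates per level, each offset by ±0.5. All names are illustrative.

```python
from collections import defaultdict

# Hypothetical data: 11 levels of x from -5 to 5, two replicates each.
x = [v for v in range(-5, 6) for _ in range(2)]
y = [v * v + offset for v in range(-5, 6) for offset in (0.5, -0.5)]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# F = MSR / MSE: tests only whether the slope is zero.
f_slope = (b1 * sxy) / (sse / (n - 2))

# F* = MSLF / MSPE: tests whether a straight line is adequate.
groups = defaultdict(list)
for xi, yi in zip(x, y):
    groups[xi].append(yi)
c = len(groups)
sspe = sum(
    sum((yi - sum(ys) / len(ys)) ** 2 for yi in ys)
    for ys in groups.values()
)
f_lof = ((sse - sspe) / (c - 2)) / (sspe / (n - c))

print(f"slope b1 = {b1:.3f}, F(slope) = {f_slope:.2f}, F*(lack of fit) = {f_lof:.2f}")
```

By symmetry the fitted slope is zero, so the slope test cannot detect the strong curved relation, while the lack of fit statistic is very large.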


Closing comments #2, #3

• We used general linear test approach to test appropriateness of a linear function. It can just as easily be used to test for appropriateness of other functions (quadratic, cubic).

• The alternative HA: Lack of fit (not linear) includes all possible regression functions other than a linear one. Use residuals to help identify what type of function is appropriate.