Chapter 7 Polynomial Regression Models. Ray-Bing Chen, Institute of Statistics, National University of Kaohsiung.

Mar 28, 2015


Roger Jan

Chapter 7 Polynomial Regression Models

Ray-Bing Chen

Institute of Statistics

National University of Kaohsiung


7.1 Introduction

• The linear regression model $y = X\beta + \varepsilon$ is a general model for fitting any relationship that is linear in the unknown parameters $\beta$.

• Polynomial regression model:


7.2 Polynomial Models in One Variable

7.2.1 Basic Principles

• A second-order model (quadratic model): $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$


• Polynomial models are useful in situations where the analyst knows that curvilinear effects are present in the true response function.

• Polynomial models are also useful as approximating functions for unknown and possibly very complex nonlinear relationships.

• In this sense, a polynomial model is a truncated Taylor series expansion of the unknown function.


• Several important conditions:

– Order of the model: The order (k) should be as low as possible. High-order polynomials (k > 2) should be avoided unless they can be justified for reasons outside the data. In the extreme case, a polynomial of order n-1 can always be passed through n points, so a polynomial of sufficiently high degree can always be found that provides a “good” fit to the data.

– Model-building strategy: Various strategies for choosing the order of an approximating polynomial have been suggested. Two procedures are forward selection and backward elimination.


• Extrapolation: Extrapolation with polynomial models can be extremely hazardous (see Figure 7.2).

• Ill-Conditioning I: The X’X matrix becomes ill-conditioned as the order increases. It means that the matrix inversion calculations will be inaccurate, and considerable error may be introduced into the parameter estimates.

• Ill-Conditioning II: If the values of x are limited to a narrow range, there can be significant ill-conditioning or multicollinearity in the columns of X.


• Hierarchy: The regression model $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \varepsilon$ is said to be hierarchical because it contains all terms of order three and lower. Only hierarchical models are invariant under linear transformation.
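Two of the conditions above are easy to check numerically. The following is a minimal sketch on synthetic data (the grids, seed, and helper names are illustrative assumptions, not part of the course material): it shows that an order n-1 polynomial reproduces n arbitrary points exactly, and that the condition number of X’X grows with the order and with a narrow, uncentered range of x.

```python
import numpy as np

def design_matrix(x, order):
    """Columns 1, x, x^2, ..., x^order of the polynomial model matrix X."""
    return np.vander(np.asarray(x, dtype=float), order + 1, increasing=True)

def xtx_condition_number(x, order):
    """Condition number of X'X for a polynomial model of the given order."""
    X = design_matrix(x, order)
    return np.linalg.cond(X.T @ X)

# An order n-1 polynomial passes through n points, whatever the responses are.
rng = np.random.default_rng(0)
n = 6
x = np.linspace(0.0, 1.0, n)
y = rng.normal(size=n)                      # pure noise, no structure at all
coef = np.linalg.solve(design_matrix(x, n - 1), y)
max_resid = np.max(np.abs(design_matrix(x, n - 1) @ coef - y))

# Ill-conditioning I: X'X degrades as the order increases.
x_wide = np.linspace(-1.0, 1.0, 20)
conds_by_order = [xtx_condition_number(x_wide, k) for k in (1, 2, 3, 4, 5)]

# Ill-conditioning II: a narrow range of x makes the columns nearly collinear.
x_narrow = np.linspace(10.0, 10.5, 20)
cond_wide = xtx_condition_number(x_wide, 2)
cond_narrow = xtx_condition_number(x_narrow, 2)

print("max residual of the order n-1 fit:", max_resid)
print("cond(X'X) for orders 1..5:", conds_by_order)
print("cond(X'X), order 2, wide vs narrow x:", cond_wide, cond_narrow)
```

The zero residual in the first part is exactly why a “good” fit from a high-order polynomial says little about the model.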

Example 7.1 The Hardwood Data:

• The strength of kraft paper (y) versus the percentage of hardwood.

• Data in Table 7.1

• A scatter plot in Figure 7.3


7.2.2 Piecewise Polynomial Fitting (Splines)

• Sometimes a low-order polynomial provides a poor fit to the data, and increasing the order of the polynomial modestly does not substantially improve the situation.

• This problem may occur when the function behaves differently in different parts of the range of x.

• A usual approach is to divide the range of x into segments and fit an appropriate curve in each segment.

• Spline functions offer a useful way to perform this type of piecewise polynomial fitting.


• Splines are piecewise polynomials of order k.

• The joint points of the pieces are usually called knots.

• Generally the function values and the first k-1 derivatives agree at the knots. That is, a spline is a continuous function with k-1 continuous derivatives.

• Cubic Spline:
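For reference, a cubic spline with h knots $t_1 < t_2 < \dots < t_h$ and continuous first and second derivatives is commonly written in the truncated power form below. The coefficient names $\beta_{0j}$ and $\beta_i$ are notation assumed here.

```latex
S(x) = \sum_{j=0}^{3} \beta_{0j} x^{j} + \sum_{i=1}^{h} \beta_{i} (x - t_i)_+^{3},
\qquad
(x - t_i)_+ =
\begin{cases}
x - t_i, & x - t_i > 0 \\
0, & x - t_i \le 0
\end{cases}
```

Because every term is linear in the coefficients, a spline with fixed knots can be fitted by ordinary least squares.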


• It is not simple to decide the number and position of the knots and the order of the polynomial in each segment.

• Wold (1974) suggests:

– There should be as few knots as possible, with at least four or five data points per segment.

– There should be no more than one extreme point and one point of inflexion per segment.

• The great flexibility of spline functions makes it very easy to overfit the data.


• Cubic spline model with h knots and no continuity restrictions:

• The fewer continuity restrictions required, the better is the fit.

• The more continuity restrictions required, the worse is the fit, but the smoother the final curve will be.


• X’X becomes ill-conditioned if there is a large number of knots.

• Use a different representation of the spline: cubic B-spline.


Example 7.2 Voltage Drop Data

• The battery voltage drop in a guided missile motor observed over the time of missile flight is shown in Table 7.3.

• The scatter plot is in Figure 7.6.

• Model the data with a cubic spline using two knots at 6.5 and 13.
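A fit of this kind can be sketched as an ordinary least-squares problem on a truncated power basis. The code below is an illustration only: the knots 6.5 and 13 come from the example, but the time grid, the coefficients, and the noiseless response are synthetic assumptions and are not the Table 7.3 data.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis: 1, x, x^2, x^3, then (x - t)_+^3 for each knot t."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - t, 0.0, None) ** 3 for t in knots]
    return np.column_stack(cols)

knots = (6.5, 13.0)                       # knot positions from the example
x = np.linspace(0.0, 20.0, 41)            # hypothetical time grid
true_coef = np.array([8.0, 1.5, -0.1, 0.002, 0.01, -0.03])  # hypothetical

X = cubic_spline_basis(x, knots)
y = X @ true_coef                         # noiseless synthetic response

# Least-squares fit of the spline coefficients.
coef_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef_hat

print("basis shape:", X.shape)            # 4 polynomial + 2 knot columns
print("max abs fitting error:", np.max(np.abs(fitted - y)))
```

With many knots the columns of this basis become strongly correlated, which is the motivation for switching to a B-spline representation.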


• The ANOVA

• A plot of the residuals versus the fitted values and a normal probability plot of the residuals are in Figure 7.7 and Figure 7.8.


Example 7.3 Piecewise Linear Regression

• An important special case of practical interest is fitting piecewise linear regression models.

• This can be treated easily using linear splines.
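As a minimal sketch of the linear-spline idea, assume a single known knot t and the model y = b0 + b1 x + b2 (x - t)_+, so the slope is b1 to the left of the knot and b1 + b2 to the right. The knot, the grid, and the response below are synthetic assumptions chosen so that the answer is known in advance.

```python
import numpy as np

def linear_spline_basis(x, knot):
    """Columns 1, x, (x - knot)_+ for a continuous two-segment linear model."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, np.clip(x - knot, 0.0, None)])

knot = 5.0                                 # hypothetical knot position
x = np.linspace(0.0, 10.0, 21)
# Synthetic broken line: slope 2 up to the knot, slope -1 after it.
y = np.where(x <= knot, 1.0 + 2.0 * x, 11.0 - 1.0 * (x - knot))

b0, b1, b2 = np.linalg.lstsq(linear_spline_basis(x, knot), y, rcond=None)[0]
slope_left = b1
slope_right = b1 + b2                      # b2 is the change in slope at the knot

print("intercept:", b0)
print("slope left of knot:", slope_left)
print("slope right of knot:", slope_right)
```

A test of b2 = 0 is then a test of whether the slope changes at the knot.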


7.2.3 Polynomial and Trigonometric Terms

• Sometimes it is useful to consider models that combine polynomial and trigonometric terms.

• The scatter plot may show some periodicity or cyclic behavior in the data.

• A model with fewer terms may result than if only polynomial terms are employed.

• The model
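A form consistent with the settings d and r used later in this section is sketched below; the coefficient symbols $\delta_j$ and $\gamma_j$ are assumed notation.

```latex
y = \beta_0 + \sum_{i=1}^{d} \beta_i x^{i}
  + \sum_{j=1}^{r} \left[ \delta_j \sin(jx) + \gamma_j \cos(jx) \right]
  + \varepsilon
```

Here d is the order of the polynomial part and r is the number of sine and cosine pairs.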


• If the regressor x is equally spaced, then the pairs of terms sin(jx) and cos(jx) are orthogonal.

• Even without exactly equal spacing, the correlation between these terms will usually be quite small.

• In Example 7.2

– Rescale the regressor x so that all of the observations are in the interval (0, 2π).

– Fit the model with d = 2 and r = 1

– R² = 0.9895 and MSRes = 0.0767
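The orthogonality claim and a d = 2, r = 1 fit can be sketched as follows. This is an illustration on synthetic data, assuming x is equally spaced over one full period of length 2π (the grid starts at 0 for convenience); the coefficients and helper name are hypothetical and the voltage drop data are not used.

```python
import numpy as np

def poly_trig_design(x, d, r):
    """Columns 1, x, ..., x^d, then sin(jx), cos(jx) for j = 1, ..., r."""
    x = np.asarray(x, dtype=float)
    cols = [x**i for i in range(d + 1)]
    for j in range(1, r + 1):
        cols += [np.sin(j * x), np.cos(j * x)]
    return np.column_stack(cols)

n = 40
x = 2.0 * np.pi * np.arange(n) / n         # equally spaced over one period

# sin(jx) and cos(jx) are orthogonal on this grid (shown here for j = 1).
inner = float(np.sin(x) @ np.cos(x))

# Fit the d = 2, r = 1 model to a noiseless synthetic response.
X = poly_trig_design(x, d=2, r=1)
true_coef = np.array([1.0, 0.5, -0.05, 2.0, -1.0])   # hypothetical
y = X @ true_coef
coef_hat = np.linalg.lstsq(X, y, rcond=None)[0]

print("sum of sin(x) * cos(x) over the grid:", inner)
print("estimated coefficients:", coef_hat)
```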


7.3 Nonparametric Regression

• Nonparametric regression is closely related to piecewise polynomial regression.

• The goal is to develop a model-free basis for predicting the response over the range of the data.


7.3.1 Kernel Regression

• The kernel smoother: use a weighted average of the data, $\tilde{y}_i = \sum_{j=1}^{n} w_{ij} y_j$, or in matrix form $\tilde{y} = Sy$,

• where S = [wij] is the smoothing matrix.

• Typically, the weights are chosen such that wij ≈ 0 for all yi’s outside of a defined “neighborhood” of the specific location of interest.


• These kernel smoothers use a bandwidth, b, to define this neighborhood of interest.

• A large value for b results in more of the data being used to predict the response at the specific location.

• The resulting plot of predicted values becomes much smoother as b increases.

• As b decreases, less of the data are used to generate the prediction, and the resulting plot looks more wiggly or bumpy.


• This approach is called a kernel smoother.

• A kernel function:

• See Table 7.5
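The effect of the bandwidth can be sketched with a small kernel smoother. The code assumes the usual normalized weights w_ij = K((x_i - x_j)/b) / Σ_k K((x_i - x_k)/b) and a Gaussian kernel as one common choice of K; the data, the seed, and the roughness measure are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(t):
    """Gaussian kernel, up to a constant that cancels in the normalization."""
    return np.exp(-0.5 * t**2)

def smoothing_matrix(x, bandwidth, kernel=gaussian_kernel):
    """Smoothing matrix S = [w_ij]; each row is a set of weights summing to one."""
    x = np.asarray(x, dtype=float)
    K = kernel((x[:, None] - x[None, :]) / bandwidth)
    return K / K.sum(axis=1, keepdims=True)

def roughness(v):
    """Sum of squared second differences: larger means a bumpier curve."""
    return float(np.sum(np.diff(v, 2) ** 2))

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)   # synthetic noisy data

S_narrow = smoothing_matrix(x, bandwidth=0.3)
S_wide = smoothing_matrix(x, bandwidth=3.0)
fit_narrow = S_narrow @ y                  # small b: follows the data closely
fit_wide = S_wide @ y                      # large b: much smoother curve

print("roughness, b = 0.3:", roughness(fit_narrow))
print("roughness, b = 3.0:", roughness(fit_wide))
```

The trace of S shrinks as b grows, which is one way to quantify how much smoothing is being applied.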


7.3.2 Locally Weighted Regression (Loess)

• Another nonparametric method

• Loess also uses the data from a neighborhood around the specific location.

• The neighborhood is defined by the span, which is the fraction of the total points used to form neighborhoods.

• A span of 0.5 indicates that the closest half of the total data points is used as the neighborhood.

• The loess procedure then uses the points in the neighborhood to generate a weighted least-squares estimate of the specific response.


• The weights are based on the distance of the points used in the estimation from the specific location of interest.

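• Let x0 be the specific location of interest, and let Δ(x0) be the distance the farthest point in the neighborhood lies from the specific location of interest.

• The tri-cube weight function is $W(t) = (1 - t^3)^3$ for $0 \le t < 1$ and $W(t) = 0$ otherwise; the weight given to the point $x_j$ is $W\left(|x_0 - x_j| / \Delta(x_0)\right)$.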


• The model

• Since


• A common estimate of variance is

• R² = (SST – SSRes) / SST
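The loess quantities in this section can be sketched in a few lines. The code below makes several assumptions that should be read as such: a local linear fit in each neighborhood, tri-cube weights, fitted values written as ŷ = Sy, and the usual linear-smoother variance estimate SSRes / (n - 2 trace(S) + trace(S’S)). The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def tricube(t):
    """Tri-cube weight: (1 - |t|^3)^3 for |t| < 1, zero otherwise."""
    t = np.abs(t)
    return np.where(t < 1.0, (1.0 - t**3) ** 3, 0.0)

def loess_smoothing_matrix(x, span=0.5):
    """Row i holds the weights that map y to the local linear fit at x[i].

    The span must be large enough that each neighborhood contains several
    points with positive weight, otherwise the local fit is singular.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    q = int(np.ceil(span * n))               # points per neighborhood
    S = np.zeros((n, n))
    for i, x0 in enumerate(x):
        dist = np.abs(x - x0)
        idx = np.argsort(dist)[:q]           # the q nearest points
        delta = dist[idx].max()              # distance to the farthest of them
        w = tricube(dist[idx] / delta)
        X = np.column_stack([np.ones(q), x[idx] - x0])
        XtW = X.T * w
        # First row of (X'WX)^{-1} X'W gives the fitted value at x0.
        S[i, idx] = np.linalg.solve(XtW @ X, XtW)[0]
    return S

rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 30)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)   # synthetic data

S = loess_smoothing_matrix(x, span=0.5)
y_hat = S @ y
ss_res = float(np.sum((y - y_hat) ** 2))
ss_t = float(np.sum((y - y.mean()) ** 2))
df_res = x.size - 2.0 * np.trace(S) + np.trace(S.T @ S)
sigma2_hat = ss_res / df_res
r2 = (ss_t - ss_res) / ss_t

print("residual degrees of freedom:", df_res)
print("variance estimate:", sigma2_hat)
print("R^2:", r2)
```

A useful sanity check on any such implementation is that it reproduces an exactly linear response without error.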


Example 7.4 Applying Loess Regression to the Windmill Data


7.3.3 Final Cautions

• Parametric models are guided by appropriate subject area theory.

• Nonparametric models almost always reflect pure empiricism.

• One should always prefer a simple parametric model when it provides a reasonable and satisfactory fit to the data.

• The model terms often have important interpretations.

• One should prefer the parametric model, especially when subject area theory supports the transformation used.


• On the other hand, there are many situations where no simple parametric model yields an adequate or satisfactory fit to the data, where there is little or no subject area theory to guide the analyst, and where no simple transformation appears appropriate.

• In such cases, nonparametric regression makes a great deal of sense.

• One is willing to accept the relative complexity and the “black box” nature of the estimation in order to give an adequate fit to the data.


7.4 Polynomial Models in Two or More Variables


• Response surface methodology (RSM) is widely applied in industry for modeling the output response(s) of a process in terms of the important controllable variables and then finding the operating conditions that optimize the response.

• Illustrate fitting a second-order response surface in two variables.

– y : the percent conversion of a chemical process

– T : reaction temperature

– C : reaction concentration

• Figure 7.14 shows a central composite design.


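• Second-order model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon$

• See p. 246

• The fitted model is $\hat{y} = 79.75 + 9.83 x_1 + 4.22 x_2 - 8.88 x_1^2 - 5.13 x_2^2 - 7.75 x_1 x_2$

• The ANOVA table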


• R² and adjusted R² values for this model are satisfactory.


• From the response surface plots, the maximum percent conversion occurs at about 245°C and 20% concentration.

• The experimenter is interested in predicting the response y or estimating the mean response at a particular point in the process variable space.
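The maximum read off the response surface plots can be cross-checked analytically. The sketch below takes the fitted second-order model in coded variables as ŷ = 79.75 + 9.83 x1 + 4.22 x2 - 8.88 x1² - 5.13 x2² - 7.75 x1 x2 (the signs are an assumption about the fitted equation), writes it as ŷ = b0 + x’b + x’Bx, and solves for the stationary point x_s = -(1/2) B⁻¹ b. Converting x_s back to temperature and concentration requires the coding of the design, which is not reproduced here.

```python
import numpy as np

# Coefficients of the fitted model in coded variables (signs assumed as above).
b0 = 79.75
b = np.array([9.83, 4.22])                   # first-order coefficients
B = np.array([[-8.88, -7.75 / 2.0],          # pure quadratic terms on the diagonal,
              [-7.75 / 2.0, -5.13]])         # half the interaction off the diagonal

# Stationary point of y_hat = b0 + x'b + x'Bx, and the response there.
x_s = -0.5 * np.linalg.solve(B, b)
y_s = b0 + 0.5 * float(x_s @ b)

# Both eigenvalues of B negative means the stationary point is a maximum.
eigvals = np.linalg.eigvalsh(B)

print("stationary point (coded units):", x_s)
print("predicted response at the stationary point:", y_s)
print("eigenvalues of B:", eigvals)
```

Under these assumptions the stationary point lies a little above the design center in x1 and almost exactly at the center in x2.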


7.5 Orthogonal Polynomials

• In fitting a polynomial model in one variable, even if nonessential ill-conditioning is removed by centering, we may still have high levels of multicollinearity.


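• Suppose the model is

$y_i = \alpha_0 P_0(x_i) + \alpha_1 P_1(x_i) + \cdots + \alpha_k P_k(x_i) + \varepsilon_i, \quad i = 1, \dots, n$

• Then X’X is

$X'X = \mathrm{diag}\left( \sum_{i=1}^{n} P_0^2(x_i), \; \sum_{i=1}^{n} P_1^2(x_i), \; \dots, \; \sum_{i=1}^{n} P_k^2(x_i) \right)$

• The estimators are

$\hat{\alpha}_j = \dfrac{\sum_{i=1}^{n} P_j(x_i) y_i}{\sum_{i=1}^{n} P_j^2(x_i)}, \quad j = 0, 1, \dots, k$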
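The diagonal structure of X’X and the closed-form estimators can be verified numerically. The sketch below builds orthogonal polynomial columns by a QR decomposition of the centered power basis, which is one convenient construction and an assumption here rather than the tabulated polynomials of the text; the data and names are synthetic and hypothetical.

```python
import numpy as np

def orthogonal_poly_columns(x, k):
    """Columns P_0(x), ..., P_k(x), mutually orthogonal over the observed x."""
    x = np.asarray(x, dtype=float)
    V = np.vander(x - x.mean(), k + 1, increasing=True)
    Q, _ = np.linalg.qr(V)     # column j of Q is a polynomial of degree j in x
    return Q

rng = np.random.default_rng(3)
x = np.linspace(0.0, 9.0, 10)
y = 2.0 + 0.5 * x - 0.1 * x**2 + rng.normal(scale=0.1, size=x.size)

P = orthogonal_poly_columns(x, k=2)
gram = P.T @ P                                   # X'X, diagonal by construction

# Closed-form estimators: sum(P_j * y) / sum(P_j^2), one column at a time.
alpha_hat = (P.T @ y) / np.sum(P**2, axis=0)

# Cross-checks against general least squares and an ordinary quadratic fit.
alpha_lstsq = np.linalg.lstsq(P, y, rcond=None)[0]
raw_coef = np.polynomial.polynomial.polyfit(x, y, 2)
raw_fit = np.polynomial.polynomial.polyval(x, raw_coef)

print("X'X:\n", np.round(gram, 12))
print("alpha_hat:", alpha_hat)
```

Because each estimator involves only its own column, adding or dropping a higher-order term leaves the lower-order estimates unchanged.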


Example 7.5 Orthogonal Polynomial

• The effect of various reorder quantities on the average annual cost of the inventory.


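• The fitted equation is

$\hat{y} = 324.30 + 0.7424\,(2)\left(\dfrac{x - 162.5}{25}\right) + 2.7955\left(\dfrac{1}{2}\right)\left[\left(\dfrac{x - 162.5}{25}\right)^2 - \dfrac{10^2 - 1}{12}\right]$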
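The structure of this fitted equation can be unpacked in code. The sketch assumes the equation has the form ŷ = 324.30 + 0.7424 P1(x) + 2.7955 P2(x) with P1(x) = 2 (x - 162.5)/25 and P2(x) = (1/2)[((x - 162.5)/25)² - (10² - 1)/12], and that the n = 10 equally spaced reorder quantities implied by the center 162.5 and spacing 25 run from 50 to 275. The plus signs and the grid are inferences, not quoted values.

```python
import numpy as np

n, d, x_bar = 10, 25.0, 162.5                      # read off the fitted equation
x = x_bar + d * (np.arange(n) - (n - 1) / 2.0)     # inferred grid: 50, 75, ..., 275

def p1(x):
    """First-order orthogonal polynomial with scaling constant 2."""
    return 2.0 * (x - x_bar) / d

def p2(x):
    """Second-order orthogonal polynomial with scaling constant 1/2."""
    return 0.5 * (((x - x_bar) / d) ** 2 - (n**2 - 1) / 12.0)

def y_hat(x):
    """Fitted equation, with the signs assumed as described above."""
    return 324.30 + 0.7424 * p1(x) + 2.7955 * p2(x)

# The columns 1, P1, P2 are mutually orthogonal on the grid, so X'X is diagonal.
P = np.column_stack([np.ones(n), p1(x), p2(x)])
gram = P.T @ P

print("grid:", x)
print("X'X:\n", gram)
print("fitted value at the center of the grid:", y_hat(x_bar))
```

The scaling constants 2 and 1/2 make P1 and P2 take integer values on the grid, which is the convention used in tables of orthogonal polynomials.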