Topic 14: Inference in Multiple Regression

Dec 30, 2015

Outline

• Review multiple linear regression

• Inference of regression coefficients
  – Application to book example

• Inference of mean
  – Application to book example

• Inference of future observation

• Diagnostics and remedies

Data for Multiple Regression

• Yi is the response variable

• Xi1, Xi2, … , Xi,p-1 are the p-1 explanatory variables

• Yi, Xi1, Xi2, … , Xi,p-1 are the data for case i, where i = 1 to n

Multiple Regression Model

• Yi = β0 + β1Xi1 + β2Xi2 +…+ βp-1Xi,p-1 + ei

• Yi is the value of the response variable for the ith case

• β0 is the intercept

• β1, β2, … , βp-1 are the regression coefficients for the explanatory variables

• ei are independent Normally distributed random errors with mean 0 and variance σ²

Least Squares Solutions

$$b = (X'X)^{-1} X'Y$$

$$s^2 = MSE = \frac{Y'(I - H)Y}{n - p}$$

s = Root MSE

ANOVA F-test

• H0: β1 = β2 = … = βp-1 = 0

• Ha: βk ≠ 0, for at least one k=1,2,…,p-1

• Under H0, F ~ F(p-1,n-p)

• Reject H0 if F is large; using the P-value, we reject if the P-value ≤ 0.05

Inference for individual regression coefficients

• We can show $b \sim N(\beta, \sigma^2 (X'X)^{-1})$

• Define

$$s^2\{b\} = MSE \cdot (X'X)^{-1} \quad (p \times p)$$

$$s^2\{b_k\} = \left[ s^2\{b\} \right]_{k,k}$$
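
The matrix formulas for b, MSE, and s²{b} can be checked directly in PROC IML. The following is only a minimal sketch: it assumes a dataset named a1 with variables young, income, and sales (as in the Dwaine Studios example in these notes), and the matrix names (X0, XtXinv, covb, se) are illustrative choices rather than anything from the course program.

```sas
proc iml;
  use a1;
  read all var {young income} into X0;   /* explanatory variables        */
  read all var {sales} into Y;           /* response                     */
  close a1;

  n = nrow(X0);
  X = j(n, 1, 1) || X0;                  /* prepend the intercept column */
  p = ncol(X);

  XtXinv = inv(t(X) * X);                /* (X'X)^{-1}                   */
  b      = XtXinv * t(X) * Y;            /* least squares estimates      */
  resid  = Y - X * b;
  mse    = ssq(resid) / (n - p);         /* s^2 = SSE / (n - p)          */
  covb   = mse * XtXinv;                 /* s^2{b}, a p x p matrix       */
  se     = sqrt(vecdiag(covb));          /* s{b_k} from the diagonal     */

  print b se mse;
quit;
```

If the sketch is set up as assumed, b and se should agree with the Parameter Estimate and Standard Error columns that PROC REG reports for the same model.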

Significance Test for βk

• H0: βk = 0

• Same test statistic t* = bk/s(bk)

• Still use dfE which now equals n-p

• P-value computed from the t(n-p) distribution

• This tests the significance of a variable given the other variables are already in the model (i.e., fitted last)

Confidence interval for βk

• CI: bk ± tc·s(bk), where tc = t(.975, n-p)

• Same form as before but dfE now equals n-p

• This interval gives a range of plausible values for βk, given that the other variables are in the model

Example II (KNNL p 236)

• Dwaine Studios, Inc. operates portrait studios in 21 cities of medium size

• Yi is sales in city i

• X1 : population aged 16 and under

• X2 : per capita disposable income

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + e_i$$

Read in the data

data a1;
  infile '../data/ch06fi05.txt';
  input young income sales;
run;

proc print data=a1;
run;

Partial Proc Print Results

Obs   young   income   sales
  1    68.5     16.7   174.4
  2    45.2     16.8   164.4
  3    91.3     18.2   244.2
  4    47.8     16.3   154.6
  5    46.9     17.3   181.6

Proc Reg

proc reg data=a1;
  model sales=young income;
run;

Output

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2            24015         12008     99.10   <.0001
Error             18        2180.9274      121.1626
Corrected Total   20            26196

Root MSE   11.00739   R-Square   0.917

At least one variable is helpful in predicting sales
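
The F value and R-Square in this table follow from its mean squares and sums of squares. As a rough check, the DATA step below recomputes them from the rounded table entries; the variable names (msm, mse, ssm, sst) are illustrative, and because the inputs are rounded the results should match the printed values only approximately.

```sas
data _null_;
  msm = 12008;   mse = 121.1626;     /* mean squares from the ANOVA table  */
  ssm = 24015;   sst = 26196;        /* model and corrected total SS       */
  dfm = 2;       dfe = 18;           /* p - 1 and n - p                    */

  f  = msm / mse;                    /* F = MS(Model) / MS(Error)          */
  p  = 1 - probf(f, dfm, dfe);       /* upper-tail area of F(p-1, n-p)     */
  r2 = ssm / sst;                    /* R-Square = SS(Model) / SS(Total)   */

  put f= 8.2 r2= 6.4 p= pvalue6.4;   /* results are written to the log     */
run;
```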

Output

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1            -68.85707         60.01695     -1.15     0.2663
young        1              1.45456          0.21178      6.87     <.0001
income       1              9.36550          4.06396      2.30     0.0333

Each variable is helpful in explaining sales given that the other is already in the model
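
To connect this output to the test statistic t* = bk/s(bk), the sketch below recomputes the t value and two-sided P-value for income from its reported estimate and standard error, using dfE = n - p = 18. The variable names are illustrative.

```sas
data _null_;
  b   = 9.36550;                       /* estimate for income           */
  se  = 4.06396;                       /* its standard error            */
  dfe = 18;                            /* n - p = 21 - 3                */

  t = b / se;                          /* t* = bk / s(bk)               */
  p = 2 * (1 - probt(abs(t), dfe));    /* two-sided P-value from t(18)  */

  put t= 6.2 p= 6.4;                   /* compare with 2.30 and 0.0333  */
run;
```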

CLB option

• Used to get confidence intervals for each coefficient

proc reg data=a1;
  model sales=young income/clb;
run;

Output

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   95% Confidence Limits
Intercept    1            -68.85707         60.01695   -194.94801    57.23387
young        1              1.45456          0.21178      1.00962     1.89950
income       1              9.36550          4.06396      0.82744    17.90356
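
These limits are bk ± tc·s(bk) with tc = t(.975, 18). The sketch below reproduces the interval for young from its estimate and standard error; the variable names are illustrative, and agreement with the table should hold up to rounding.

```sas
data _null_;
  b   = 1.45456;              /* estimate for young              */
  se  = 0.21178;              /* its standard error              */
  dfe = 18;                   /* n - p                           */

  tc    = tinv(0.975, dfe);   /* critical value t(.975, n-p)     */
  lower = b - tc * se;
  upper = b + tc * se;

  put tc= 8.5 lower= 8.5 upper= 8.5;
run;
```

For a different confidence level, the MODEL statement in PROC REG accepts an ALPHA= option alongside CLB (for example, alpha=0.10 for 90% limits).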

What if just young is fit?

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   95% Confidence Limits
Intercept    1             68.04536          9.46224     48.24066    87.85006
young        1              1.83588          0.14641      1.52943     2.14233

The CIs for both the intercept and young change dramatically when young is the only explanatory variable
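
The slide does not show the call that produced this output. A plausible sketch, assuming the same dataset a1, simply drops income from the MODEL statement:

```sas
proc reg data=a1;
  model sales = young / clb;   /* young as the only explanatory variable */
run;
```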

Estimation of E(Yh)

• Xh is now a vector that looks like (1, Xh1, Xh2, … , Xh,p-1)΄

• We want a point estimate and a confidence interval for the subpopulation mean corresponding to the set of explanatory variables Xh

Theory for E(Yh)

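$$E(Y_h) = \mu_h = X_h'\beta$$

$$\hat{\mu}_h = X_h'b$$

$$s^2(\hat{\mu}_h) = X_h'\, s^2\{b\}\, X_h = s^2\, X_h'(X'X)^{-1}X_h$$

$$\text{CI: } \hat{\mu}_h \pm s(\hat{\mu}_h)\, t(0.975,\, n-p)$$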

Using CLM option

proc reg data=a1;
  model sales=young income/clm;
  id young income;
run;

The CLM option adds confidence limits for the mean to the output table; the ID statement adds young and income
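
CLM reports limits only for rows in the input dataset. One common way to get an interval at a new Xh is to append a row with the desired explanatory values and a missing response: the row is left out of the fit but still receives a predicted value and limits. The sketch below assumes the dataset a1; the values 65.4 and 17.6 and the dataset names xh and a1_xh are illustrative assumptions, not taken from these slides.

```sas
data xh;                       /* hypothetical new case Xh            */
  young = 65.4;
  income = 17.6;
  sales = .;                   /* missing response: not used in fit   */
run;

data a1_xh;                    /* original data plus the new case     */
  set a1 xh;
run;

proc reg data=a1_xh;
  model sales = young income / clm;
  id young income;
run;
```

The last row of the Output Statistics table then holds the estimate and 95% limits for E(Yh) at the appended values.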

CLM Output

Output Statistics

Obs   young   income   Dependent Variable   Predicted Value   Std Error Mean Predict   95% CL Mean
  1    68.5     16.7             174.4000          187.1841                   3.8409   179.1146   195.2536
  2    45.2     16.8             164.4000          154.2294                   3.5558   146.7591   161.6998
  3    91.3     18.2             244.2000          234.3963                   4.5882   224.7569   244.0358
  4    47.8     16.3             154.6000          153.3285                   3.2331   146.5361   160.1210
  5    46.9     17.3             181.6000          161.3849                   4.4300   152.0778   170.6921
  …
 21    52.3     16.0             166.5000          157.0644                   4.0792   148.4944   165.6344

Prediction of Yh

• Xh is still a vector of form (1, Xh1, Xh2, … , Xh,p-1)΄

• We want a prediction of Yh based on a set of predictor values with an interval that expresses the uncertainty in our prediction

Theory for Yh

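$$Y_h = X_h'\beta + e_h, \qquad \hat{Y}_h = \hat{\mu}_h = X_h'b$$

$$s^2(\text{pred}) = \mathrm{Var}(Y_h - \hat{Y}_h) = \mathrm{Var}(Y_h) + \mathrm{Var}(\hat{\mu}_h) = s^2\left(1 + X_h'(X'X)^{-1}X_h\right)$$

$$\text{Prediction interval: } \hat{Y}_h \pm s(\text{pred})\, t(.975,\, n-p)$$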

Using the CLI option

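proc reg data=a1;
  model sales=young income/cli;
  id young income;
run;

The CLI option adds prediction limits for an individual observation to the output table; the ID statement adds young and income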

CLI Output

Output Statistics

Obs   young   income   Dependent Variable   Predicted Value   Std Error Mean Predict   95% CL Predict
  1    68.5     16.7             174.4000          187.1841                   3.8409   162.6910   211.6772
  2    45.2     16.8             164.4000          154.2294                   3.5558   129.9271   178.5317
  3    91.3     18.2             244.2000          234.3963                   4.5882   209.3421   259.4506
  …
 21    52.3     16.0             166.5000          157.0644                   4.0792   132.4018   181.7270
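
The prediction limits are wider than the limits for the mean because s²(pred) adds MSE to the variance of the estimated mean. As a rough check for observation 1, the sketch below rebuilds both intervals from the reported predicted value, Std Error Mean Predict, and MSE = 121.1626; the variable names are illustrative and agreement is expected only up to rounding.

```sas
data _null_;
  yhat    = 187.1841;                    /* predicted value, obs 1          */
  se_mean = 3.8409;                      /* Std Error Mean Predict          */
  mse     = 121.1626;                    /* from the ANOVA table            */
  dfe     = 18;                          /* n - p                           */

  tc      = tinv(0.975, dfe);
  se_pred = sqrt(mse + se_mean**2);      /* s(pred) = sqrt(MSE + s2(mean))  */

  clm_lo = yhat - tc * se_mean;   clm_hi = yhat + tc * se_mean;
  cli_lo = yhat - tc * se_pred;   cli_hi = yhat + tc * se_pred;

  put clm_lo= 9.4 clm_hi= 9.4;           /* compare with 95% CL Mean        */
  put cli_lo= 9.4 cli_hi= 9.4;           /* compare with 95% CL Predict     */
run;
```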

Diagnostics

• Look at the distribution of each variable

• Look at the relationship between pairs of variables

• Plot the residuals versus
  – the predicted/fitted values
  – each explanatory variable
  – time (if available)

Diagnostics

• Are the residuals approximately Normal?

  – Look at a histogram

  – Look at a Normal quantile plot

• Is the variance constant?

  – Plot the residuals vs anything that might be related to the variance (e.g. residuals vs predicted values and residuals vs each X)
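
The checks listed on the two Diagnostics slides could be carried out along the following lines. This is a sketch under the assumption that the dataset a1 is available; the output dataset name diag and the variable names pred and resid are illustrative choices.

```sas
/* Relationships between pairs of variables */
proc sgscatter data=a1;
  matrix sales young income;
run;

/* Fit the model and save predicted values and residuals */
proc reg data=a1;
  model sales = young income;
  output out=diag p=pred r=resid;
run;
quit;

/* Residuals vs predicted values; repeat with x=young and x=income */
proc sgplot data=diag;
  scatter x=pred y=resid;
  refline 0 / axis=y;
run;

/* Histogram and Normal quantile plot of the residuals */
proc univariate data=diag;
  var resid;
  histogram resid / normal;
  qqplot resid / normal(mu=est sigma=est);
run;
```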

Remedies

• Remedies are similar to those for simple regression

• Transformations such as Box-Cox

• Analyze with/without outliers

• More detail in KNNL Ch 9 and 10
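
As one way to explore a Box-Cox transformation of the response in SAS, PROC TRANSREG can search over a grid of λ values. The sketch below assumes the dataset a1; the grid from -2 to 2 in steps of 0.25 is an arbitrary illustrative choice.

```sas
proc transreg data=a1;
  /* Box-Cox search on the response; predictors enter untransformed */
  model boxcox(sales / lambda=-2 to 2 by 0.25) = identity(young income);
run;
```

The output reports the log likelihood for each λ in the grid, which can guide the choice of a convenient transformation before refitting in PROC REG.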

Background Reading

• We finished Chapter 6.

• Program used to generate output for confidence intervals for means and prediction intervals is topic14.sas