Multiple Regression

Dec 31, 2015


dean-hood

Page 1: Multiple Regression

Multiple Regression

Page 2: Multiple Regression

What Techniques Can Tell Us

• Chi Square: Do groups differ (nominal data)?
• T Test: Do groups/variables differ?
• Gamma/Lambda/Kendall's Tau etc.: Are variables related to each other? (nominal/ordinal data)
• Correlation: Are variables related to each other? (ratio/interval data)

Page 3: Multiple Regression

Interpreting Correlations

• 3 questions we can answer

1. Is there a relationship between 2 variables?

2. What is the direction of the relationship?

3. What is the strength of the relationship?

Correlations

                              IDEO      PID
IDEO  Pearson Correlation     1         .506**
      Sig. (2-tailed)         .         .000
      N                       1623      1608
PID   Pearson Correlation     .506**    1
      Sig. (2-tailed)         .000      .
      N                       1608      1776

**. Correlation is significant at the 0.01 level (2-tailed).
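Direction and strength can both be read off a Pearson coefficient. The sketch below computes one in plain Python on made-up scores; the `ideo` and `pid` values are hypothetical, not the survey data behind the table. Whether a relationship exists at all (question 1) additionally needs a significance test, which is what the Sig. (2-tailed) row reports and which this sketch does not compute.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 7-point ideology and party-ID scores for eight respondents.
ideo = [1, 2, 2, 3, 4, 5, 6, 7]
pid = [1, 1, 3, 2, 4, 6, 5, 7]

r = pearson_r(ideo, pid)
direction = "positive" if r > 0 else "negative"   # question 2: sign of r
strength = abs(r)                                  # question 3: size of |r|
print(f"r = {r:.3f}, direction = {direction}, strength = {strength:.3f}")
```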

Page 4: Multiple Regression

Interpreting Correlations

• Are there limitations here? And if so, what?

• Don't know the amount of effect of one variable on the other

• Don't know the impact of other variables

[Correlations table repeated from the previous slide: IDEO and PID, Pearson correlation .506, Sig. (2-tailed) .000, N = 1608]

Page 5: Multiple Regression

Strength

[Figure: two scatterplots illustrating relationship strength. One plots VAR8 (y-axis, -60 to 80) against VAR00002 (x-axis, 0 to 30); the other plots RND1 (y-axis, -40 to 40) against RND2 (x-axis, -40 to 40).]

Page 6: Multiple Regression

Strong Relationships

[Figure: two scatterplots against VAR00002 (x-axis, 0 to 30), one of VAR4 and one of VAR6 (y-axes, -10 to 30).]

Page 7: Multiple Regression

Perfect Relationship

[Figure: scatterplot of VAR00001 (y-axis, 0 to 30) against VAR00002 (x-axis, 0 to 30).]

Page 8: Multiple Regression

Basic Equations

• Let your DV (Y) = total cost of bananas
• Suppose you buy X lbs of bananas at $.49 a lb
• How would you express this as an equation to figure out how much your bananas are worth?
• Y = .49X
• Can use for prediction
• 10 lbs = $4.90
• 2 lbs = $.98
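As a minimal sketch, the banana equation is a one-line function; the names `PRICE_PER_LB` and `banana_cost` are made up for illustration.

```python
PRICE_PER_LB = 0.49  # dollars per pound, from the slide

def banana_cost(pounds):
    """Total cost Y for X pounds of bananas: Y = .49 * X."""
    return PRICE_PER_LB * pounds

# Prediction for the two purchases on the slide.
for pounds in (10, 2):
    print(f"{pounds} lbs = ${banana_cost(pounds):.2f}")
```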

Page 9: Multiple Regression

Multivariate Equations

• Suppose you have a phone plan that charges:
  – $5.95 a month
  – $.10 a minute in-state long distance
  – $.08 a minute interstate long distance
  – $.01 a minute local calls

• How would you represent it?

• Total = .10x1 + .08x2 + .01x3 + 5.95 (x1 = in-state minutes, x2 = interstate minutes, x3 = local minutes)
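The phone-plan equation has the same shape as a regression equation with known coefficients: three per-minute rates plus a constant. A small sketch follows; the function name and the example month of usage are hypothetical.

```python
MONTHLY_FEE = 5.95      # constant term
RATE_INSTATE = 0.10     # coefficient on x1, in-state long distance minutes
RATE_INTERSTATE = 0.08  # coefficient on x2, interstate long distance minutes
RATE_LOCAL = 0.01       # coefficient on x3, local minutes

def phone_bill(instate_min, interstate_min, local_min):
    """Total = .10*x1 + .08*x2 + .01*x3 + 5.95"""
    return (RATE_INSTATE * instate_min
            + RATE_INTERSTATE * interstate_min
            + RATE_LOCAL * local_min
            + MONTHLY_FEE)

# Hypothetical month: 30 in-state, 50 interstate, 200 local minutes.
bill = phone_bill(30, 50, 200)
print(f"Total = ${bill:.2f}")
```

With no calls at all, the bill equals the constant, which matches the later reading of the constant as the value of Y when all Xs are 0.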

Page 10: Multiple Regression

Regression Analysis

• Lets you work the problem Backwards

• How much do different IVs contribute to a DV?

• How do different IVs relate to the DV?

• Lets you build a model of more complicated relationships

• In addition to existence, direction, strength, gives you the amount of change

Page 11: Multiple Regression

Expressing a Regression Equation

• Y = b1x1 + b2x2 + … + bixi + constant + error

• Error is part of the probabilistic nature of social science

• Constant: what Y would equal if all Xs = 0

• Estimation process: fit a line to the data that minimizes the (squared) distance to all observed data points

Page 12: Multiple Regression

Scatter Plots and Regression Lines

• PID and Ideology
• R-Square here is .37, not bad, but you can see there are deviations in some cases

Linear Regression

[Figure: scatterplot of pid (y-axis, 0 to 6) against ideo (x-axis, 2 to 6) with fitted regression line. pid = -1.05 + 0.81 * ideo; R-Square = 0.37]

Page 13: Multiple Regression

Fitting the Regression Line

• Goal: Minimize the squared distances (error) between predicted values of Y and observed values.

• Goal: Explain the variance in Y in terms of X

• Error in prediction is unexplained variance
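To make the fitting goal concrete, here is a minimal sketch of a one-predictor least-squares fit in plain Python. The data points are invented, and `fit_line` and `sum_squared_error` are illustrative names; this is the textbook closed-form solution, not necessarily the procedure the statistics package used for the slides.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_error(xs, ys, intercept, slope):
    """Sum of squared distances between observed and predicted Y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, roughly y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
best = sum_squared_error(xs, ys, a, b)

# Nudging the fitted line in any direction increases the squared error.
worse_slope = sum_squared_error(xs, ys, a, b + 0.1)
worse_intercept = sum_squared_error(xs, ys, a + 0.1, b)

# Explained share of variance: 1 - unexplained / total.
mean_y = sum(ys) / len(ys)
total = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - best / total

print(f"y = {a:.2f} + {b:.2f} * x, SSE = {best:.3f}, R-Square = {r_squared:.3f}")
```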

Page 14: Multiple Regression

Party and Ideology

• Set up PID as DV, Ideology as IV, run analysis
• Can also do Ideology as DV

Coefficients (Dependent Variable: PID)

Model 1       B          Std. Error   Beta    t        Sig.
(Constant)    -8.34E-03  .127                 -.066    .948
IDEO          .645       .027         .506    23.511   .000

Coefficients (Dependent Variable: IDEO)

Model 1       B          Std. Error   Beta    t        Sig.
(Constant)    3.236      .059                 54.924   .000
PID           .397       .017         .506    23.511   .000

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.

Page 15: Multiple Regression

Goodness of Fit

• Measure of how much variance is explained by the model you build

• R2 = correlation coefficient squared
• R2 = proportion of variance explained
• R2 is symmetrical
• In the previous example, R2 = .256
• R2 ranges from 0 to 1
• Adjusted R2 takes into account the degrees of freedom; it is a more appropriate measure
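The figures from the PID/Ideology example can be checked by hand. The sketch below squares the reported correlation, multiplies the two regression slopes (for a bivariate regression their product equals R2, which is one way to see the symmetry), and applies the standard adjusted R2 formula. Using n = 1608 from the correlation table as the regression's sample size is an assumption.

```python
r = 0.506              # Pearson correlation between IDEO and PID
b_pid_on_ideo = 0.645  # slope with PID as DV
b_ideo_on_pid = 0.397  # slope with IDEO as DV
n = 1608               # assumed sample size (pairwise N from the correlation table)

r_squared = r ** 2
slope_product = b_pid_on_ideo * b_ideo_on_pid

def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

adj = adjusted_r_squared(r_squared, n, 1)

print(f"R2 from r:           {r_squared:.3f}")
print(f"Product of slopes:   {slope_product:.3f}")
print(f"Adjusted R2 (k = 1): {adj:.3f}")
```

With one predictor and a sample this large, the adjustment barely moves the value; it matters more when the sample is small relative to the number of predictors.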

Page 16: Multiple Regression

Run for the Border Using Multiple Regression

• Suppose that you and some friends ate at Taco Bell every week for a year.

• For each meal, you know the total amount spent, and the number of each item, but not what each item cost.

• You could use multiple regression to get parameter estimates of the true values.

• Data set was constructed by choosing a random number (between 0 and 4) of Bean Burritos, Tacos, Chalupas, Chicken Tacos, Beef Burritos, 7 Layer Burritos, and Soft Drinks

• Data matrix includes a variable for the number of each item
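The construction described above can be imitated in a few lines. This is a sketch, not the original data set: it assumes uniform integer counts from 0 to 4, 52 meals, a fixed random seed, and the item prices listed on the "Model 4 Revisited" slide. The solver is a plain normal-equations least-squares routine written for illustration.

```python
import random

# Item prices as listed later in the deck; variable names follow the output tables.
PRICES = {
    "BEANBUR": 0.69, "TACO": 0.79, "CHALUPA": 1.19, "CHICKTAC": 1.39,
    "BEEFBUR": 1.59, "SEVLAYR": 1.89, "DRINK": 1.29,
}
ITEMS = list(PRICES)

def simulate_meals(n_meals, seed=0):
    """Random item counts (0 to 4 each) and the exact total for each meal."""
    rng = random.Random(seed)
    rows, totals = [], []
    for _ in range(n_meals):
        counts = [rng.randint(0, 4) for _ in ITEMS]
        rows.append(counts)
        totals.append(sum(c * PRICES[item] for c, item in zip(counts, ITEMS)))
    return rows, totals

def least_squares(rows, ys):
    """Solve the normal equations (X'X) b = X'y, with an intercept column."""
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * y for r, y in zip(X, ys))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(p):
            if k != col:
                factor = A[k][col] / A[col][col]
                A[k] = [a - factor * b for a, b in zip(A[k], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

rows, totals = simulate_meals(52)   # assumed: one meal a week for a year
coefs = least_squares(rows, totals)

print(f"constant: {coefs[0]:.3f}")
for item, b in zip(ITEMS, coefs[1:]):
    print(f"{item}: estimated {b:.2f}, actual {PRICES[item]:.2f}")
```

Because the totals contain no error and every item is in the model, the estimated coefficients reproduce the prices; dropping items from `rows` before fitting would mimic the smaller Border models.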

Page 17: Multiple Regression

Border Model 1

• We’ll look at impact of bean burritos on total

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .039    .002       -.018               3.74743

a. Predictors: (Constant), BEANBUR
b. Dependent Variable: TOTAL2

Coefficients (Dependent Variable: TOTAL2)

Model 1       B        Std. Error   Beta    t        Sig.    Tolerance   VIF
(Constant)    21.561   1.165                18.507   .000
BEANBUR       -.131    .476         -.039   -.276    .784    1.000       1.000

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are collinearity statistics.
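One way to read the coefficient table: the t column is the unstandardized coefficient divided by its standard error. A quick check on the Border Model 1 figures (the dictionary layout and function name are illustrative):

```python
# (B, Std. Error, reported t) copied from the Border Model 1 coefficient table.
MODEL_1 = {
    "(Constant)": (21.561, 1.165, 18.507),
    "BEANBUR": (-0.131, 0.476, -0.276),
}

def t_statistic(b, std_error):
    """t = B / Std. Error"""
    return b / std_error

for name, (b, se, reported_t) in MODEL_1.items():
    t = t_statistic(b, se)
    print(f"{name}: B / SE = {t:.3f} (table t: {reported_t})")
```

The bean burrito coefficient is small relative to its standard error, which is why its Sig. value (.784) is far from conventional significance levels in this one-predictor model.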

Page 18: Multiple Regression

Border Model 2

• Bean Burritos and Tacos

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .257    .066       .028                3.66072

a. Predictors: (Constant), TACO, BEANBUR
b. Dependent Variable: TOTAL2

Coefficients (Dependent Variable: TOTAL2)

Model 1       B        Std. Error   Beta    t        Sig.    Tolerance   VIF
(Constant)    19.655   1.538                12.781   .000
BEANBUR       -.185    .466         -.055   -.397    .693    .996        1.004
TACO          .842     .457         .255    1.843    .071    .996        1.004

Page 19: Multiple Regression

Border Model 3

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .298    .089       .032                3.65375

a. Predictors: (Constant), CHICKTAC, BEANBUR, TACO
b. Dependent Variable: TOTAL2

Coefficients (Dependent Variable: TOTAL2)

Model 1       B        Std. Error   Beta    t        Sig.    Tolerance   VIF
(Constant)    18.032   2.139                8.432    .000
BEANBUR       -.160    .465         -.047   -.343    .733    .994        1.006
TACO          .891     .458         .270    1.945    .058    .986        1.014
CHICKTAC      .554     .508         .151    1.090    .281    .987        1.013

Page 20: Multiple Regression

Model 4

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .744    .553       .505                2.61316

a. Predictors: (Constant), CHALUPA, CHICKTAC, BEANBUR, TACO, BEEFBUR
b. Dependent Variable: TOTAL2

Coefficients (Dependent Variable: TOTAL2)

Model 1       B           Std. Error   Beta    t        Sig.    Tolerance   VIF
(Constant)    9.080       2.027                4.479    .000
BEANBUR       5.312E-02   .334         .016    .159     .874    .984        1.016
TACO          .739        .332         .224    2.224    .031    .959        1.043
CHICKTAC      .955        .374         .260    2.550    .014    .931        1.074
BEEFBUR       1.617       .322         .514    5.029    .000    .929        1.076
CHALUPA       1.707       .331         .516    5.153    .000    .967        1.034

Page 21: Multiple Regression

Linear Regression

[Figure: scatterplot of Unstandardized Predicted Value (y-axis, 16 to 28) against total2 (x-axis, 16 to 28) with fitted line. Unstandardized Predicted Value = 9.50 + 0.55 * total2; R-Square = 0.55]

Page 22: Multiple Regression

Model 5

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .923    .852       .832                1.52228

a. Predictors: (Constant), SEVLAYR, BEEFBUR, TACO, CHALUPA, BEANBUR, CHICKTAC
b. Dependent Variable: TOTAL2

Coefficients (Dependent Variable: TOTAL2)

Model 1       B        Std. Error   Beta    t        Sig.    Tolerance   VIF
(Constant)    3.426    1.322                2.592    .013
BEANBUR       .568     .202         .169    2.810    .007    .914        1.095
TACO          .610     .194         .185    3.140    .003    .954        1.048
CHICKTAC      1.285    .221         .350    5.816    .000    .908        1.101
BEEFBUR       1.634    .187         .519    8.720    .000    .929        1.076
CHALUPA       1.546    .194         .468    7.982    .000    .960        1.042
SEVLAYR       1.797    .189         .577    9.516    .000    .896        1.116

Page 23: Multiple Regression

Linear Regression

[Figure: scatterplot of Unstandardized Predicted Value (y-axis, 16 to 28) against total2 (x-axis, 16 to 28) with fitted line. Unstandardized Predicted Value = 3.15 + 0.85 * total2; R-Square = 0.85]

Page 24: Multiple Regression

Full Model

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       1.000   1.000      1.000               .00000

a. Predictors: (Constant), DRINK, SEVLAYR, BEEFBUR, TACO, BEANBUR, CHICKTAC, CHALUPA
b. Dependent Variable: TOTAL2

Coefficients (Dependent Variable: TOTAL2)

Model 1       B           Std. Error   Beta    t    Sig.   Tolerance   VIF
(Constant)    2.269E-15   .000                 .    .
BEANBUR       .690        .000         .205    .    .      .906        1.104
TACO          .790        .000         .239    .    .      .936        1.069
CHICKTAC      1.390       .000         .379    .    .      .904        1.107
BEEFBUR       1.590       .000         .505    .    .      .928        1.078
CHALUPA       1.190       .000         .360    .    .      .893        1.120
SEVLAYR       1.890       .000         .607    .    .      .891        1.122
DRINK         1.290       .000         .404    .    .      .909        1.100

Page 25: Multiple Regression

Linear Regression

[Figure: scatterplot of Unstandardized Predicted Value (y-axis, 16 to 28) against total2 (x-axis, 16 to 28) with fitted line. Unstandardized Predicted Value = 0.00 + 1.00 * total2; R-Square = 1.00]

Page 26: Multiple Regression

Model 4 Revisited

• Actual prices: Bean Burrito .69, Taco .79, Chalupa 1.19, Chicken Taco 1.39, Beef Burrito 1.59, 7 Layer Burrito 1.89, Drink 1.29

Coefficients (Dependent Variable: TOTAL2)

Model 1       B           Std. Error   Beta    t        Sig.    Tolerance   VIF
(Constant)    9.080       2.027                4.479    .000
BEANBUR       5.312E-02   .334         .016    .159     .874    .984        1.016
TACO          .739        .332         .224    2.224    .031    .959        1.043
CHICKTAC      .955        .374         .260    2.550    .014    .931        1.074
BEEFBUR       1.617       .322         .514    5.029    .000    .929        1.076
CHALUPA       1.707       .331         .516    5.153    .000    .967        1.034
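A rough way to compare the Model 4 estimates with the listed prices is to ask whether each price falls within about two standard errors of its estimate. The two-standard-error band is used here as a rule of thumb, not an exact confidence interval, and the helper name is made up.

```python
# Listed prices for the five items that appear in Model 4.
ACTUAL_PRICE = {
    "BEANBUR": 0.69, "TACO": 0.79, "CHICKTAC": 1.39,
    "BEEFBUR": 1.59, "CHALUPA": 1.19,
}

# (B, Std. Error) from the Model 4 coefficient table.
MODEL_4 = {
    "BEANBUR": (0.05312, 0.334),
    "TACO": (0.739, 0.332),
    "CHICKTAC": (0.955, 0.374),
    "BEEFBUR": (1.617, 0.322),
    "CHALUPA": (1.707, 0.331),
}

def within_two_se(estimate, std_error, actual):
    """True if the actual value lies within 2 standard errors of the estimate."""
    return abs(estimate - actual) <= 2 * std_error

results = {}
for name, (b, se) in MODEL_4.items():
    actual = ACTUAL_PRICE[name]
    results[name] = within_two_se(b, se, actual)
    print(f"{name}: estimate {b:.3f}, actual {actual:.2f}, "
          f"error {b - actual:+.3f}, within 2 SE: {results[name]}")
```

Several estimates miss the listed price by a wide margin (bean burrito and chalupa especially), yet each price still sits inside the band because the standard errors are large while two items remain out of the model.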

Page 27: Multiple Regression

Some Data Requirements for Regression

• DV must be interval or ratio, and continuous

• IVs should not be highly correlated with each other

• Error variance should be constant at high and low predicted values (homoscedasticity)

• Relationship must be linear

• Errors of subsequent observations should not be correlated (no serial correlation)
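The Tolerance and VIF columns in the earlier output tables speak to the requirement about correlated IVs. Tolerance is 1 minus the R2 from regressing one IV on the others, and VIF is its reciprocal. The sketch below rechecks the Full Model figures and then shows a hypothetical, badly collinear case; the function names are illustrative.

```python
def tolerance_from_r_squared(r2_with_other_ivs):
    """Tolerance = 1 - R2 from regressing one IV on all the other IVs."""
    return 1.0 - r2_with_other_ivs

def vif(tolerance):
    """Variance inflation factor: the reciprocal of tolerance."""
    return 1.0 / tolerance

# (Tolerance, reported VIF) from the Full Model coefficient table.
FULL_MODEL = {
    "BEANBUR": (0.906, 1.104), "TACO": (0.936, 1.069),
    "CHICKTAC": (0.904, 1.107), "BEEFBUR": (0.928, 1.078),
    "CHALUPA": (0.893, 1.120), "SEVLAYR": (0.891, 1.122),
    "DRINK": (0.909, 1.100),
}

for name, (tol, reported_vif) in FULL_MODEL.items():
    print(f"{name}: 1 / tolerance = {vif(tol):.3f} (table VIF: {reported_vif})")

# Hypothetical IV that shares 90% of its variance with the other IVs.
collinear_vif = vif(tolerance_from_r_squared(0.90))
print(f"VIF at R2 = .90 with the other IVs: {collinear_vif:.1f}")
```

All VIFs in the Taco Bell models are close to 1, as expected when item counts are drawn independently; small differences from the table come from rounding in the printed tolerances.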

Page 28: Multiple Regression

For Next Time

• Multicollinearity

• Heteroskedasticity

• Interaction terms

• Pass out Stat Assignment II