Slide 2 Items to consider - 3: Multicollinearity
Multicollinearity concerns the relationship between IVs: it arises when IVs are highly correlated with one another.
What to do:
  Examine the correlation matrix of all IVs and the DV to detect any multicollinearity.
  Look for rs between IVs in excess of 0.70.
  If detected, it is generally best (or at least simplest) to eliminate one of the offending IVs from the model and re-run the MLR (see model reduction, later).

Slide 3 Multicollinearity: what is it?
It's to do with the unique and shared variance of the IVs, both with the criterion (DV) and among themselves.
We must establish what unique variance on each predictor (IV) is related to variance on the criterion (DV).
Example 1 (graphical):
  y: freshman college GPA
  predictor 1: high school GPA
  predictor 2: SAT total score
  predictor 3: attitude toward education

Slide 4 Multicollinearity: what is it?
[Figure: Venn diagram of overlapping circles for x1, x2 and y]
Labelled regions:
  Common variance in y that both predictors 1 and 2 account for.
  Variance in y accounted for by predictor 2 after the effect of predictor 1 has been partialled out.
Circle = variance for a variable; overlap = shared variance (only 2 predictors shown here).

Slide 5 Multicollinearity: what is it?
[Figure: Venn diagram of overlapping circles for x1, x2 and y]
Total R² = .66, or 66%

Slide 6 Multicollinearity: what is it?
[Figure: Venn diagram of overlapping circles for x1, x2 and y]
Total R² = .33, or 33%

Slide 7 Multicollinearity: what is it?
Example 2 (words):
  y: freshman college GPA
  predictor 1: high school GPA
  predictor 2: SAT total score
  predictor 3: attitude toward education

Slide 8 Multicollinearity: what is it?
  Predictor 1 = variance in college GPA predictable from variance in high school GPA
  Predictor 2 = residual variance in SAT related to variance in college GPA
  Predictor 3 = residual variance in attitude related to variance in college GPA
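
The partition described in words above can be sketched numerically. This is a minimal illustration on simulated data, not the course data: the variable names, effect sizes and entry order (high school GPA, then SAT, then attitude) are assumptions made for the example. Each predictor is credited only with the increase in R² it produces over the predictors entered before it.

```python
import numpy as np


def r_squared(X, y):
    """R² from an ordinary least squares fit of y on X (intercept added)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


# Simulated, hypothetical data: the predictors overlap with one another.
rng = np.random.default_rng(0)
n = 500
hs_gpa = rng.normal(size=n)
sat = 0.6 * hs_gpa + 0.8 * rng.normal(size=n)
attitude = 0.3 * hs_gpa + rng.normal(size=n)
college_gpa = 0.5 * hs_gpa + 0.3 * sat + 0.2 * attitude + rng.normal(size=n)

# Enter the predictors one at a time and record each increment in R².
predictors = {"hs_gpa": hs_gpa, "sat": sat, "attitude": attitude}
entered = []
increments = {}
previous_r2 = 0.0
for name, values in predictors.items():
    entered.append(values)
    r2 = r_squared(np.column_stack(entered), college_gpa)
    increments[name] = r2 - previous_r2
    previous_r2 = r2
total_r2 = previous_r2

for name, gain in increments.items():
    print(f"{name}: R² increment = {gain:.3f}")
print(f"total R² = {total_r2:.3f}")
```

The increments sum to the total R². Changing the entry order changes how the shared variance is credited, but not the total.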

Slide 9 Multicollinearity: what is it?
Consider these three sets of correlations:
  A: Y with X1, X2, X3 = .2, .1, .3; X1 with X2, X3 = .5, .4; X2 with X3 = .6
  B: Y with X1, X2, X3 = .6, .5, .7; X1 with X2, X3 = .2, .3; X2 with X3 = .2
  C: Y with the Xs = .6, .7 (only two values given); X1 with X2, X3 = .7, .6; X2 with X3 = .8
Which would we expect to have the largest overall R², and which would we expect to have the smallest?

Slide 10 Multicollinearity: what is it?
R will be at least .7 (so R² at least .49) for B and C, but only at least .3 (R² at least .09) for A, because multiple R can never be smaller than the largest single correlation with Y.
There is little chance of R² for A getting much larger, because its Xs correlate only weakly with Y and are substantially intercorrelated (.4 to .6).

Slide 11 Multicollinearity: what is it?
R will probably be largest for B:
  Its predictors are correlated with Y.
  There is not much redundancy among its predictors.
R is probably greater in B than in C, as C has considerable redundancy among its predictors.
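
As a check on this reasoning, R² can be computed directly from a correlation matrix using the standard identity R² = r_xy' R_xx⁻¹ r_xy for standardized variables. The sketch below applies it to matrices A and B as given on Slide 9. C is left out because its correlations with Y are incomplete here. The function name is illustrative.

```python
import numpy as np


def multiple_r_squared(r_xy, r_xx):
    """R² from predictor-criterion correlations and predictor intercorrelations."""
    r_xy = np.asarray(r_xy, dtype=float)
    r_xx = np.asarray(r_xx, dtype=float)
    betas = np.linalg.solve(r_xx, r_xy)  # standardized regression weights
    return float(r_xy @ betas)


# Matrix A: weak correlations with Y, moderate intercorrelations among the Xs.
r2_a = multiple_r_squared(
    [0.2, 0.1, 0.3],
    [[1.0, 0.5, 0.4],
     [0.5, 1.0, 0.6],
     [0.4, 0.6, 1.0]],
)

# Matrix B: strong correlations with Y, little redundancy among the Xs.
r2_b = multiple_r_squared(
    [0.6, 0.5, 0.7],
    [[1.0, 0.2, 0.3],
     [0.2, 1.0, 0.2],
     [0.3, 0.2, 1.0]],
)

print(f"A: R² = {r2_a:.3f}, R = {np.sqrt(r2_a):.3f}")
print(f"B: R² = {r2_b:.3f}, R = {np.sqrt(r2_b):.3f}")
```

With these values R² comes out at roughly .12 for A and .75 for B, consistent with the expectation that B gives the largest R and that A cannot rise much above its lower bound.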

Slide 12 What effect does the big M have?
It can increase the standard errors (SE) of the regression coefficients (those involved in the multicollinearity).
  This can lead to non-significant findings for those coefficients.
  So predictors that may be significant when used in isolation may not be significant when used together.
It can also lead to imprecision in the regression coefficients (mistakes in estimating the change in Y for a unit change in the IV).
So a model with multicollinearity can be misleading, and can have redundancy among the predictors.
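
A small simulation can illustrate the first point. This is a sketch on simulated data; the sample size, coefficients and function names are assumptions made for illustration. The same model is fitted once with uncorrelated predictors and once with predictors correlated at about .9, and the coefficient standard errors are compared.

```python
import numpy as np


def coefficient_ses(X, y):
    """Standard errors of the OLS slope coefficients (intercept excluded)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = (resid @ resid) / (len(y) - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return np.sqrt(np.diag(cov))[1:]


def simulate(rho, n=500, seed=0):
    """Fit y on two predictors whose population correlation is rho."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)
    return coefficient_ses(np.column_stack([x1, x2]), y)


se_uncorrelated = simulate(rho=0.0)
se_collinear = simulate(rho=0.9)

print("SEs with uncorrelated predictors:", se_uncorrelated)
print("SEs with predictors correlated at .9:", se_collinear)
```

With two predictors the variance of each coefficient is inflated by 1 / (1 - r²), so the standard errors grow by the square root of that factor. At r = .70 this is about 1.4 times the SE under uncorrelated predictors.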

Slide 13 What do we do about the big M?
Many opinions, e.g. O'Brien (2007), A Caution Regarding Rules of Thumb for Variance Inflation Factors. Quality & Quantity, 41(5), 673-690.
Can use VIF (variance inflation factor) and tolerance values in SPSS (by one common rule of thumb, problem variables are those with VIF > 4).
Can painstakingly examine all possible versions of the model (putting each predictor in 1st).
We'll just:
  signal multicollinearity with an r > .70, and enforce removal of at least one of the variables;
  signal possible multicollinearity with an r between .5 and .7, and suggest examination of the model with and without one of the variables.
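
Both screening approaches can be sketched from a predictor correlation matrix. The example below uses the predictor intercorrelations of matrix C from Slide 9 (.7, .6, .8). It takes the VIFs from the diagonal of the inverse correlation matrix, with tolerance = 1 / VIF, and applies the r > .70 and .5 to .7 cut-offs stated above. The function names and flag wording are illustrative, not SPSS output.

```python
from itertools import combinations

import numpy as np


def vif_and_tolerance(r_xx):
    """VIF and tolerance for each predictor from the predictor correlation matrix."""
    r_xx = np.asarray(r_xx, dtype=float)
    vif = np.diag(np.linalg.inv(r_xx))
    return vif, 1.0 / vif


def flag_pairs(r_xx, names, remove_above=0.70, possible_from=0.50):
    """Apply the slide's cut-offs to every pair of predictors."""
    r_xx = np.asarray(r_xx, dtype=float)
    flags = {}
    for i, j in combinations(range(len(names)), 2):
        r = abs(r_xx[i, j])
        if r > remove_above:
            flags[(names[i], names[j])] = "multicollinearity: remove one"
        elif r >= possible_from:
            flags[(names[i], names[j])] = "possible: compare models with and without one"
    return flags


names = ["X1", "X2", "X3"]
r_xx_c = [[1.0, 0.7, 0.6],
          [0.7, 1.0, 0.8],
          [0.6, 0.8, 1.0]]

vif, tolerance = vif_and_tolerance(r_xx_c)
flags = flag_pairs(r_xx_c, names)

for name, v, t in zip(names, vif, tolerance):
    print(f"{name}: VIF = {v:.2f}, tolerance = {t:.2f}")
for pair, message in flags.items():
    print(pair, message)
```

For these values the VIFs are roughly 2.0, 3.5 and 2.8, so none exceeds 4 even though X2 and X3 correlate at .8. The two screening rules need not agree, which is one reason opinions differ.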

Slide 14 The Goal of MLR
The big picture
What we're trying to do is create a model predicting a DV that explains as much of the variance in that DV as possible, while at the same time:
  Meeting the assumptions of MLR
  Best managing the other issues: sample size, n of predictors, outliers, multicollinearity, r with dependent variable, significance in model
  Being parsimonious (can be very important)