Page 1: Multiple Regression Analysis

Interpreting Multiple Regression: A Short Overview

Abdel-Salam G. Abdel-Salam

Laboratory for Interdisciplinary Statistical Analysis (LISA)

Department of Statistics

Virginia Polytechnic Institute and State University

http://www.stat.vt.edu/consult/

Short Course, November 12, 2008


Page 2: Multiple Regression Analysis

Learning Objectives

Today, we will cover how to do Linear Regression Analysis (LRA) in SPSS and SAS.

We will learn concepts and vocabulary in regression analysis, such as:

1 How to use the F-test to determine whether your predictor variables have a statistically significant relationship with your outcome/response variable.
2 What the assumptions for LRA are, and what you should do to meet them.
3 Why adjusted R2 is smaller than R2, and what these numbers mean when comparing several models.
4 What the difference is between regression and ANOVA, and when they are equivalent.
5 How you can select the best model.
6 Other LR problems (multicollinearity and outlying observations), and what you should do about them.
7 A general strategy for doing LRA.

PLEASE think about your research problem!

Page 6: Multiple Regression Analysis

Outline

1 Introduction: Definitions; Example

2 Regression Assumptions: Example Problem (1); Satisfying Assumptions; Example Problem (2)

3 Linear Regression vs. ANOVA

4 Model Selection: General Example

5 Strategy for solving problems


Page 7: Multiple Regression Analysis

Introduction

When should we use regression?

Data can be continuous or discrete. The appropriate method depends on the types of the response and the predictors:

                              Response (Y) discrete            Response (Y) continuous
                              (e.g., Gender)                   (e.g., Time)
Predictors (X) discrete       Contingency table analysis       ANOVA
                              (using chi-square test)
Predictors (X) continuous     Logistic regression              Regression (our focus)
                              (binary or multinomial)

In regression, we need our response to be continuous and at least one predictor to be continuous.


Page 8: Multiple Regression Analysis

Introduction Definitions

Linear Regression Analysis

Linear regression is a general method for estimating/describing the association between a continuous outcome variable (dependent) and one or more predictors in one equation.

Simple linear regression model

y_i = β_0 + β_1 x_i + ε_i,   i = 1, 2, ..., n

where y_i is the i-th response value; β_0 is the intercept (the mean value of y at x = 0); β_1 is the slope (regression coefficient), which says that, on average, as x increases by 1, y increases by β_1; and ε_i is the error term.

The estimated model is

ŷ_i = β̂_0 + β̂_1 x_i,   i = 1, 2, ..., n

where ε̂_i = y_i − ŷ_i is the residual for the i-th observation.
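A minimal SAS sketch of fitting this model (assuming a data set mydata with response y and predictor x; the names are placeholders):

proc reg data=mydata;
   model y = x;   /* least-squares estimates of the intercept and slope */
run;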

Page 11: Multiple Regression Analysis

Introduction Definitions

A regression Line

[Figure: scatterplot with the estimated regression line, with the intercept and slope annotated.]


Page 12: Multiple Regression Analysis

Introduction Definitions

Multiple Linear Regression

What are the reasons for using multiple regression?

Two reasons for using multiple regression:

1 To make stronger causal inferences from observed associations between two or more variables.

2 To predict a dependent variable based on the values of a number of other independent variables.

Example: there may be many factors associated with crime, such as poverty, urbanisation, low social cohesion and informal social control, and education.

Therefore, we want to understand the unique contribution of each variable to variation in crime levels.

Page 15: Multiple Regression Analysis

Introduction Definitions

Multiple Linear Regression

The multiple regression model

y_i = β_0 + β_1 x_1i + β_2 x_2i + ... + β_K x_Ki + ε_i,   i = 1, 2, ..., n

What does it mean?

1 The interpretation of β_0 is the expected value of Y when X_1, X_2, ..., X_K are all equal to zero.

2 The interpretation of the partial regression coefficient β_1 is that for every one-unit increase (decrease) in X_1 we predict a β_1 change in Y, controlling for the effect of the other predictors.

3 What do we mean by that?!

Page 18: Multiple Regression Analysis

Introduction Example

Example

We are interested in the effect of education and occupational status on general happiness. All variables are measured on scales from 1 to 10.

The prediction model is

ŷ_i = β̂_0 + β̂_1 x_1i + β̂_2 x_2i,   i = 1, 2, ..., n

You estimate the model and get these results:

predicted HAPPINESS = 3 + 1 × EDUC + 0.5 × STATUS

β̂_0 = predicted HAPPINESS when EDUC and STATUS are both zero = 3.

β̂_1 = for each unit increase in EDUC, we predict a 1-unit rise in happiness, controlling for STATUS.

β̂_2 = for each unit increase in STATUS, we predict a 0.5-unit rise in happiness, controlling for EDUC.

Page 20: Multiple Regression Analysis

Introduction Example

Example

Imagine two people, Gordon and Tony. Both have the same level of education, but Gordon has a status score of 5 and Tony has 6.

The prediction model is

predicted HAPPINESS = 3 + 1 × EDUC + 0.5 × STATUS

Who is happier?

The model predicts Tony's happiness score to be 0.5 greater than Gordon's.
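As a quick check, a minimal SAS data step reproduces this comparison (the shared education score of 7 is a hypothetical value; only the status scores 5 and 6 come from the slide):

data happiness_check;
   input name $ educ status;
   /* plug into the estimated equation from the previous slide */
   pred_happiness = 3 + 1*educ + 0.5*status;
   datalines;
Gordon 7 5
Tony 7 6
;
run;

proc print data=happiness_check; run;

Whatever shared EDUC value is used, the predicted difference is 0.5 × (6 − 5) = 0.5 in Tony's favor.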

Page 23: Multiple Regression Analysis

Regression Assumptions

Regression Assumptions

1 Independence means that the Y values are statistically independent of one another.

2 Linearity means that the mean value of Y is a straight-line function of the independent variable (X).

3 Normality means that, for a fixed value of X, Y has a normal distribution.

[Figure: a normal distribution compared with positively and negatively skewed distributions.]

4 Homoscedasticity (homogeneity) means that the variance of Y is the same for any X (constant variance).

Page 25: Multiple Regression Analysis

Regression Assumptions

Checking Assumptions

Checking is most often done graphically; it can also be done with statistical tests. Each assumption has a standard diagnostic plot:

Independence: response (Y) or residuals (ε) vs. time.
Linearity: response (Y) vs. predictor (X).
Normality: probability plot of the residuals (PP plot or QQ plot).
Homogeneity: residuals (ε) vs. predicted values (ŷ).

Statistical tests for these assumptions will not be covered here. A SAS sketch of these plots follows below, and worked examples come next.
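A sketch of producing these diagnostic plots in SAS (assuming a data set mydata with response y and predictor x; PROC SGPLOT requires SAS 9.2 or later):

proc reg data=mydata;
   model y = x;
   output out=diag
          predicted=yhat    /* fitted values */
          residual=resid;   /* raw residuals */
run;

/* residuals vs. fitted values: check linearity and constant variance */
proc sgplot data=diag;
   scatter x=yhat y=resid;
   refline 0 / axis=y;
run;

/* QQ plot of the residuals: check normality */
proc univariate data=diag;
   var resid;
   qqplot resid / normal(mu=est sigma=est);
run;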

Page 31: Multiple Regression Analysis

Regression Assumptions

Independence

[Figure: two residual-vs-time plots, one labeled "Independent of Time" (random scatter) and one labeled "Dependent" (a systematic pattern over time).]


Page 32: Multiple Regression Analysis

Regression Assumptions

Linearity

Linear: [Scatterplot of Response vs. Temperature showing a straight-line trend.]

Non-Linear: [Scatterplot of Response vs. Temperature showing a strongly curved trend.]

Question: in MLR, plot the response against each predictor.

Page 35: Multiple Regression Analysis

Regression Assumptions

Normality

Data are normal: the points in the probability plot follow a straight line.

Data are non-normal: the points in the probability plot are curved, even slightly.


Page 36: Multiple Regression Analysis

Regression Assumptions

Homogeneity

Constant variance: residuals are randomly distributed, with no pattern.

Non-constant variance (heterogeneity or heteroscedasticity): a curved shape or a megaphone shape (monotone spread).


Page 37: Multiple Regression Analysis

Regression Assumptions

Now...

Any Questions???


Page 38: Multiple Regression Analysis

Regression Assumptions Example Problem1

Example Problem (1)

A group of 13 children participated in a psychological study to analyze the relationship between age and average total sleep time (ATST). The results are displayed below. Determine the SLR model for the data.

AGE (X, years):     4.4  14   10.1  6.7  1.5  9.6  12.4  8.9  11.1  7.75  5.5  8.6  7.2
ATST (Y, minutes):  586  462  491   565  462  532  478   515  493   528   576  533  531

Using PROC GLM or PROC REG:

proc glm data=sleep;
   model atst = age / solution clparm;
run;
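For completeness, a data step that creates the sleep data set used above (values taken from the table on this slide):

data sleep;
   input age atst @@;   /* @@ reads several (age, atst) pairs per line */
   datalines;
4.4 586  14 462  10.1 491  6.7 565  1.5 462  9.6 532  12.4 478
8.9 515  11.1 493  7.75 528  5.5 576  8.6 533  7.2 531
;
run;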


Page 39: Multiple Regression Analysis

Regression Assumptions Example Problem1

SPSS Analysis

Before running the analysis, check that the data are listed as Scale in the Variable View screen.


Page 40: Multiple Regression Analysis

Regression Assumptions Example Problem1

SPSS Analysis

Analyze ⇒ Regression ⇒ Linear


Page 41: Multiple Regression Analysis

Regression Assumptions Example Problem1

SPSS Analysis

Enter ATST in the Dependent box and AGE in the Independent(s) box. Add confidence intervals: Statistics ⇒ check the Confidence Intervals box.


Page 42: Multiple Regression Analysis

Regression Assumptions Example Problem1

SPSS Analysis - 2nd Method

Analyze ⇒ General Linear Model ⇒ Univariate


Page 43: Multiple Regression Analysis

Regression Assumptions Example Problem1

SPSS Analysis - 2nd Method

ATST in Dependent Variable & Age in Covariate(s)

Note that SPSS and JMP are similar for regression analysis.

Data and SAS code are posted for future analysis: http://filebox.vt.edu/users/abdo/statwww/Example%201.pdf

Output: http://filebox.vt.edu/users/abdo/statwww/SAS%20Output1.mht

Page 45: Multiple Regression Analysis

Regression Assumptions Example Problem1

Now...

Any Questions???


Page 46: Multiple Regression Analysis

Regression Assumptions Satisfying Assumptions

What if the assumptions are not met?

Independence
- If the dependence is due to time, add time to the model.
- Other methods for dealing with dependence: repeated measures; paired data.

Linearity
- Transformations: try log, square root, square, and inverse transformations (a SAS sketch follows below).
- Add variables.
- Add interactions.
- Add higher powers of variables.

Normality & homoscedasticity
- Transformations: try log, square root, and inverse transformations. Use the first transformed variable that satisfies the normality criteria.
- If no transformation satisfies the normality criteria, use robust regression, for which normality is not required.
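A minimal SAS sketch of creating the candidate transformations (assuming a data set mydata with a response y; the new variable names are placeholders):

data mydata_t;
   set mydata;
   log_y  = log(y);    /* log transformation (requires y > 0) */
   sqrt_y = sqrt(y);   /* square-root transformation (requires y >= 0) */
   sq_y   = y*y;       /* square transformation */
   inv_y  = 1/y;       /* inverse transformation (requires y ne 0) */
run;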

Page 48: Multiple Regression Analysis

Regression Assumptions Example Problem (2)

SAS Analysis (2)

Example Problem 2: In order to study the growth rate of a particular type of bacteria, biologists were interested in the relationship between time and the proportion of total area taken up by a colony of bacteria. The biologists placed samples in four Petri dishes and observed the percentage of total area taken up by the bacteria colony after fixed time intervals.

Data and SAS code are posted for future analysis: http://filebox.vt.edu/users/abdo/statwww/Example2.pdf

Output: http://filebox.vt.edu/users/abdo/statwww/SAS%20Output2.mht


Page 49: Multiple Regression Analysis

Regression Assumptions Example Problem (2)

Now...

Any Questions???


Page 50: Multiple Regression Analysis

Linear Regression vs. ANOVA

Linear Regression vs. ANOVA

Linear regression: Dependent: continuous. Independent: continuous or categorical.

ANOVA: Dependent: continuous. Independent: categorical.

Both are linear models.

ANOVA is a special case of regression analysis!

Page 53: Multiple Regression Analysis

Linear Regression vs. ANOVA

Linear Regression vs. ANOVA

Example :

Scientific question: Is there any difference in loneliness between females and males?

H0: μ_Female = μ_Male   vs.   H1: μ_Female ≠ μ_Male

Student's t-test or ANOVA? Or even regression analysis?


Page 54: Multiple Regression Analysis

Linear Regression vs. ANOVA

Linear Regression vs. ANOVA

ANOVA with solution:

proc glm data=mylib.loneliness;
   class gender;
   model loneliness = gender / solution;
run;

Linear regression using GLM (gender treated as numeric):

proc glm data=mylib.loneliness;
   model loneliness = gender;
run;

Both runs produce the identical ANOVA table:

Dependent Variable: loneliness

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1          4.902852       4.902852       2.24    0.1347
Error             498       1087.709443       2.184156
Corrected Total   499       1092.612294

R-Square    Coeff Var    Root MSE    lonely3 Mean
0.004487    29.18049     1.477889    5.064648

Source      DF    Type I SS      Mean Square    F Value    Pr > F
gender       1    4.90285179     4.90285179       2.24     0.1347

Source      DF    Type III SS    Mean Square    F Value    Pr > F
gender       1    4.90285179     4.90285179       2.24     0.1347

Parameter estimates from the CLASS run (ANOVA parameterization):

Parameter         Estimate           Standard Error    t Value    Pr > |t|
Intercept      4.931462122 B          0.11077245        44.52     <.0001
gender    1    0.206809959 B          0.13803488         1.50     0.1347
gender    2    0.000000000 B           .                  .        .

Parameter estimates from the regression run (gender as a numeric predictor):

Parameter         Estimate       Standard Error    t Value    Pr > |t|
Intercept      5.345082039        0.19850165         26.93     <.0001
gender        -0.206809959        0.13803488         -1.50     0.1347


Page 55: Multiple Regression Analysis

Linear Regression vs. ANOVA

Linear Regression vs. ANOVA

In the example, we showed that ANOVA is a special case of linear regression.

What if there are more than 2 groups in the ANOVA?

Dummy variables for categorical data:

1 Outcome: Y (continuous)

2 Predictor: X (categorical), where X = (Group1, Group2, Group3, Group4)

ANOVA: Y ~ X.   Regression: Y ~ Z1, Z2, Z3, where Z1, Z2, Z3 are 0/1 dummy indicators for three of the four groups (see the sketch below).
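A minimal SAS sketch of coding the three dummy variables by hand (assuming a data set mydata with a character predictor x taking the values Group1-Group4; Group4 is taken as the reference level):

data mydata_d;
   set mydata;
   z1 = (x = 'Group1');   /* 1 if the case is in Group1, else 0 */
   z2 = (x = 'Group2');
   z3 = (x = 'Group3');   /* Group4 is the baseline: z1 = z2 = z3 = 0 */
run;

proc reg data=mydata_d;
   model y = z1 z2 z3;   /* reproduces the one-way ANOVA of y on x */
run;

PROC GLM with a CLASS statement builds an equivalent coding automatically, as in the loneliness example above.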

Page 58: Multiple Regression Analysis

Linear Regression vs. ANOVA

Now...

Any Questions???


Page 59: Multiple Regression Analysis

Model Selection

Goodness of Fit

R2: the larger, the better.
- Measures the proportion of variability in the response explained by the model, 0 ≤ R2 ≤ 1.
- Used to compare models and to evaluate how well the model explains your data.

Adjusted R2: the larger, the better (takes degrees of freedom into consideration).

Root MSE: the smaller, the better.

Probably the most frequently used selection statistic is Mallows' C_P, with P = 1 + the number of independent variables (k).

The model with all variables always has C_P = P exactly.

For other models, a good fit is indicated by C_P ≈ P, with C_P < P even better. A SAS sketch of C_P-based selection follows below.
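A sketch of ranking candidate models by C_P in SAS (assuming response y and predictors x1-x5; the names are placeholders):

proc reg data=mydata;
   /* evaluate all subsets; report the best 5 of each size by Mallows' Cp */
   model y = x1 x2 x3 x4 x5 / selection=cp best=5 adjrsq;
run;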

Page 62: Multiple Regression Analysis

Model Selection

Model Selection

Forward selection method: starting with the null model, add variables sequentially.
- Drawback: variables added to the model cannot be taken out.

Backward elimination method: starting with the full model, delete variables with large p-values sequentially.
- Drawback: variables taken out of the model cannot be added back in.

Stepwise method: a combination of the backward/forward methods (see the sketch below).
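A sketch of stepwise selection in SAS (hypothetical data set and variables; the 0.15 entry/stay significance levels are illustrative):

proc reg data=mydata;
   /* variables enter at SLENTRY and may later be removed at SLSTAY */
   model y = x1 x2 x3 x4 x5 / selection=stepwise slentry=0.15 slstay=0.15;
run;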

Page 65: Multiple Regression Analysis

Model Selection

Model Comparison

Full model (1): compute its residual sum of squares (RSS_1) and residual degrees of freedom (df_1).

Reduced model (2): compute its residual sum of squares (RSS_2) and residual degrees of freedom (df_2).

F-Test

F = [(RSS_2 − RSS_1) / (df_2 − df_1)] / (RSS_1 / df_1)   ~   F(df_2 − df_1, df_1)
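In SAS, the TEST statement in PROC REG carries out this partial F-test without fitting the reduced model separately (a sketch, assuming the full model has predictors x1-x3 and the reduced model drops x2 and x3):

proc reg data=mydata;
   model y = x1 x2 x3;             /* full model */
   Reduced: test x2 = 0, x3 = 0;   /* F-test of the full vs. reduced model */
run;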


Page 66: Multiple Regression Analysis

Model Selection General Example

General Example

General example: from Motor Trend magazine, data were obtained for n = 32 cars on a set of variables (see the posted example).

Data and SAS code are posted for future analysis.

Output: http://filebox.vt.edu/users/abdo/statwww/General%20Example.pdf


Page 67: Multiple Regression Analysis

Model Selection General Example

Now...

Any Questions???


Page 68: Multiple Regression Analysis

Model Selection General Example

Linear Regression and Outliers

Outliers can distort the regression results. When an outlier is included in the analysis, it pulls the regression line towards itself. This can result in a solution that is more accurate for the outlier, but less accurate for all of the other cases in the data set.

The problems of satisfying assumptions and detecting outliers are intertwined. For example, if a case has a value on the dependent variable that is an outlier, it will affect the skew, and hence the normality, of the distribution.

Removing an outlier may improve the distribution of a variable.

Transforming a variable may reduce the likelihood that the value for a case will be characterized as an outlier.

Page 72: Multiple Regression Analysis

Strategy for solving problems

Strategy for solving problems

Our strategy for solving problems involving violations of assumptions and outliers includes the following steps (a SAS sketch of steps 4-5 follows after this list):

1 Run the type of regression specified in the problem statement on the variables, using the full data set.

2 Test the dependent variable for normality. If it does not satisfy the criteria for normality unless transformed, substitute the transformed variable in the remaining tests that call for the dependent variable.

3 Test for normality, linearity, and homoscedasticity using scripts. Decide which transformations should be used.

4 Substitute the transformations and run the regression entering all independent variables, saving the studentized residuals.

5 Remove the outliers (studentized residual greater than 3 or smaller than −3), and run the regression with the method and variables specified in the problem.

6 Compare R2 for the analysis using transformed variables and omitting outliers (step 5) to the R2 obtained for the model using all data and the original variables (step 1).
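A sketch of steps 4-5 in SAS (hypothetical data set and variables): save the studentized residuals, then refit after dropping cases with |RSTUDENT| > 3:

proc reg data=mydata;
   model y = x1 x2 x3;
   output out=resids rstudent=rstud;   /* studentized (deleted) residuals */
run;

data trimmed;
   set resids;
   if abs(rstud) <= 3;   /* keep only cases not flagged as outliers */
run;

proc reg data=trimmed;
   model y = x1 x2 x3;
run;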

Page 76: Multiple Regression Analysis

Strategy for solving problems

The End...

Thank You For Your Attention!

Acknowledgments to Dr. Schabenberger, Dr. J.P. Morgan, Jonathan Duggins, Dingcai Cao, and all our consultants.

Selected references: [Laboratory for Interdisciplinary Statistical Analysis (LISA); Schabenberger and Morgan; Montgomery et al., 2006; Smaxone et al., 2005; Chatterjee and Hadi, 2006; Kutner et al., 2005; Kleinbaum et al., 2007; Myers, 1990; Zar, 1999; Grafarend, 2006; Hastie and Tibshirani, 1990; Rencher, 2000; Vonesh and Chinchilli, 1997; Lee et al., 2006]


Page 77: Multiple Regression Analysis

References

Chatterjee, S. and Hadi, A. S. (2006). Regression Analysis by Example, 4th edition. ISBN 978-0-471-74696-6.

Laboratory for Interdisciplinary Statistical Analysis (LISA). http://www.stat.vt.edu/consult/index.html.

Grafarend, E. W. (2006). Linear and Nonlinear Models: Fixed Effects, Random Effects, and Mixed Models. Walter de Gruyter.

Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. New York: Chapman and Hall.

Kleinbaum, D., Kupper, L., Nizam, A., and Muller, K. (2007). Applied Regression Analysis and Multivariate Methods, 4th edition. ISBN 978-0-495-38496-0.

Kutner, M., Nachtsheim, C., Neter, J., and Li, W. (2005). Applied Linear Statistical Models, 5th edition. ISBN 978-0-073-10874-2.

Lee, Y., Nelder, J. A., and Pawitan, Y. (2006). Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood. Chapman & Hall/CRC.

Montgomery, D. C., Peck, E. A., and Vining, G. G. (2006). Introduction to Linear Regression Analysis, 4th edition. John Wiley & Sons, New Jersey.

Myers, R. H. (1990). Classical and Modern Regression with Applications, 2nd edition. Boston, MA: PWS-KENT.

Rencher, A. C. (2000). Linear Models in Statistics. John Wiley and Sons, New York, NY.

Schabenberger, O. and Morgan, J. P. Regression and ANOVA course pack. STAT 5044.

Smaxone, Bøgballe, M., Rasmussen, B., and Skafte, C. (2005). Regression. BL Music Scarlet, S. Donato Mil.se.

Vonesh, E. F. and Chinchilli, V. M. (1997). Linear and Nonlinear Models for the Analysis of Repeated Measurements. Marcel Dekker, Inc., New York.

Zar, J. (1999). Biostatistical Analysis, 4th edition. ISBN 978-0-130-81542-2.
