Ordinal Multinomial Logistic Regression Thom M. Suhy Southern Methodist University May 14th, 2013

Ordinal Multinomial Logistic Regression · faculty.smu.edu/kyler/courses/7312/presentations/... · Ordinal multinomial logistic regression is an extension of logistic regression using

May 12, 2019


Ordinal Multinomial Logistic Regression

Thom M. Suhy Southern Methodist University May 14th, 2013


GLM
• Generalized Linear Model (GLM) – “Framework for statistical analysis” (Gelman and Hill, 2007, p. 135)
• Linear Regression – Continuous data
• Logistic Regression – Binary data
  - Ordered Multinomial Logistic Regression
  - Unordered Multinomial Logistic Regression


Logistic Regression
• Dependent variable is dichotomous
  - Yes or No
  - Apply or Not Apply
  - Pass or Fail
  - Heisman or no Heisman
• Probability of trait (yes, apply, pass, Heisman) based on independent variables
• Independent variable does not need to be dichotomous
  - Categorical
  - Integral
  - Dichotomous
  - Nominal
  - Ordinal


Logistic Regression – Refresher

Call:
glm(formula = comply ~ physrec, family = binomial(link = "logit"))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3735  -1.3735  -0.5434   0.9933   1.9929

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.8383     0.4069  -4.518 6.26e-06 ***
physrec       2.2882     0.4503   5.081 3.75e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 226.47  on 163  degrees of freedom
Residual deviance: 191.87  on 162  degrees of freedom
AIC: 195.87

Number of Fisher Scoring iterations: 4


Logistic Regression – Refresher
Formula for logit
• logit = -1.8383 + (2.2882 * physrec)
• logit = -1.8383 for no physrec
• logit = .4499 for yes physrec
Probability to comply
• exp(-1.8383)/(1+(exp(-1.8383)))
  - Probability of comply with no physrec = .137 or 13.7%
• exp(.4499)/(1+(exp(.4499)))
  - Probability of comply with physrec = .6106 or 61%
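The arithmetic on this slide can be checked in a few lines of R. This is a sketch added for illustration, not part of the original slides: it types in the two coefficients from the glm() output and uses base R's plogis(), which is the inverse logit. The names b0, b1 and p_comply are made up here.

```r
# Coefficients copied from the glm() summary on the previous slide
b0 <- -1.8383   # intercept
b1 <-  2.2882   # physrec

# Hypothetical helper: probability of comply for physrec = 0 or 1.
# plogis(z) is exp(z) / (1 + exp(z)), the same formula used on the slide.
p_comply <- function(physrec) plogis(b0 + b1 * physrec)

p_no  <- p_comply(0)   # no physrec
p_yes <- p_comply(1)   # physrec
round(c(no_physrec = p_no, physrec = p_yes), 3)
```

The two values should agree with the 13.7% and 61% worked out by hand above.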


Logistic Regression – Refresher
[Figure: plot of factor(comply) (0/1) against physrec]


Logistic vs. Ordered Multinomial
How are they different?
• An extension of logistic regression to multiple categories (Gelman & Hill, 2007)
• Not binary, categorical (but ordered)
  - Decision (Yes, Maybe, No)
  - Order of Finish (1st, 2nd, 3rd)
  - Likert Scale (Strongly Disagree – Strongly Agree)
  - Income ranges (0 – 25K, 25K-50K, 50K+)
  - Degree (None, Bachelors, Masters, PhD)
• There is unordered multinomial logistic regression, but that is not for today!


A Little More Information
Ordinal multinomial logistic regression is an extension of logistic regression using multiple categories that have a logical order (Gelman & Hill, 2007).
“Ordinal data are the most frequently encountered type of data in the social sciences” (Johnson & Albert, 1999, p. 126).


Running a Model in R
1. We are going to use a file from UCLA, but first load your libraries:
> library(psych)
> library(arm)
> library(foreign)  # read.dta() lives in the foreign package
2. Now we will read in our data:
> suhy <- read.dta(url("http://www.ats.ucla.edu/stat/r/dae/ologit.dta"))


Running a Model in R
3. Let’s examine our data:
> head(suhy)
            apply pared public  gpa
1     very likely     0      0 3.26
2 somewhat likely     1      0 3.21
3        unlikely     1      1 3.94
4 somewhat likely     0      0 2.81
5 somewhat likely     0      0 2.53
6        unlikely     0      1 2.59


Running a Model in R
Defining our variables:
apply = Likelihood of college juniors applying to grad school, self-reported (very likely, somewhat likely, unlikely)
pared = Does at least one parent have a graduate degree? (no = 0, yes = 1)
public = Undergrad was a private or public institution (private = 0, public = 1)
gpa = Undergrad grade point average


Running a Model in R
What are our assumptions?
Data are case specific – each IV has a single value for each case
No perfect predictors – no single predictor variable (IV) can determine the outcome of the DV
No zero or very small quantities in a crosstab cell
Sample size – larger than for normal OLS regression


Running a Model in R
4. Let us check our assumptions:
> xtabs(~suhy$pared+suhy$apply)
suhy$pared unlikely somewhat likely very likely
         0      200             110          27
         1       20              30          13
> xtabs(~suhy$public+suhy$apply)
suhy$public unlikely somewhat likely very likely
          0      189             124          30
          1       31              16          10
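The "no zero or very small cells" assumption can also be tested directly instead of by eye. The sketch below is an illustration, not part of the original slides: it retypes the counts from the two tables above so it runs without the data file, and the cutoff of 5 (min_count) is an assumed rule of thumb, not a value given in the presentation.

```r
levels_apply <- c("unlikely", "somewhat likely", "very likely")

# Counts copied from the xtabs() output above
pared_tab <- matrix(c(200, 110, 27,
                       20,  30, 13),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(pared = c("0", "1"), apply = levels_apply))

public_tab <- matrix(c(189, 124, 30,
                        31,  16, 10),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(public = c("0", "1"), apply = levels_apply))

min_count <- 5  # assumed threshold, not from the slides

# Hypothetical helper: TRUE if any cell count falls below the threshold
has_small_cell <- function(tab, threshold) any(tab < threshold)

has_small_cell(pared_tab, min_count)
has_small_cell(public_tab, min_count)
```

The smallest cell in either table is 10, so neither call flags a problem under this cutoff.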


Running a Model in R
5. We are good to go, let’s run the model:
> summary(m1<-bayespolr(as.ordered(suhy$apply)~suhy$gpa))
Call:
bayespolr(formula = as.ordered(suhy$apply) ~ suhy$gpa)

Coefficients:
          Value Std. Error t value
suhy$gpa 0.7109     0.2471   2.877

Intercepts:
                             Value Std. Error t value
unlikely|somewhat likely    2.3306     0.7502  3.1065
somewhat likely|very likely 4.3505     0.7744  5.6179

Residual Deviance: 737.6921
AIC: 743.6921


Running a Model in R
A visual: [figure not preserved in this transcript]
Thank you Pooja Shivraj (2012)

Running a Model in R
6. Let’s calculate the probabilities for the average gpa
> x<-mean(suhy$gpa)
x = 2.998925
> coef<-m1$coef
> coef
suhy$gpa
0.710892
> intercept<-m1$zeta
> intercept
   unlikely|somewhat likely somewhat likely|very likely
                   2.330599                    4.350527


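Running a Model in R
6. Let’s calculate the probabilities for the average gpa (cont.)
Remember:
> prob<-function(input){exp(input)/(1+exp(input))}
> (p0<-prob(intercept[1]-coef*x))
unlikely|somewhat likely
                0.549509
P(unlikely) = 0.5495, or 55%
> (p1<-prob(intercept[2]-coef*x)-p0)
somewhat likely|very likely
                  0.3523997
P(somewhat likely) = 0.3524, or 35%
> (p2<-1-(p0+p1))
very likely
 0.09809127
P(very likely) = 0.0981, or about 10%
The labels printed with p0 and p1 are the cutpoint names carried over from intercept; the values themselves are the probabilities of unlikely, somewhat likely, and very likely.
p0+p1+p2 always sum to 1 when using 3 categories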
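The three steps above follow one pattern: each intercept is a cutpoint, the inverse logit of (cutpoint minus coef*x) gives the cumulative probability of being at or below that category, and differencing the cumulative probabilities gives the per-category probabilities. A minimal base-R sketch of that pattern follows, with the estimates typed in from the output above so it runs without the data or the arm package. ordinal_probs is a hypothetical helper name used only for illustration; it is not a function from arm or MASS.

```r
# Estimates copied from the slides
x    <- 2.998925                 # mean gpa
beta <- 0.710892                 # gpa coefficient
zeta <- c(2.330599, 4.350527)    # cutpoints (m1$zeta)

# Hypothetical helper: category probabilities for a single predictor value
ordinal_probs <- function(x, beta, zeta) {
  cum <- c(plogis(zeta - beta * x), 1)   # P(y <= k); the last category is 1
  probs <- diff(c(0, cum))               # P(y = k) = P(y <= k) - P(y <= k - 1)
  names(probs) <- c("unlikely", "somewhat likely", "very likely")
  probs
}

p <- ordinal_probs(x, beta, zeta)
round(p, 4)
```

Passing a different x, such as 2.5 or 3.7, gives the probabilities for that GPA.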


Running a Model in R
2.5 GPA
> (p0<-prob(intercept[1]-coef*2.5))
unlikely|somewhat likely
               0.6349169
> (p1<-prob(intercept[2]-coef*2.5)-p0)
somewhat likely|very likely
                  0.2942062
> (p2<-1-(p0+p1))
very likely
 0.07087689

3.7 GPA
> (p0<-prob(intercept[1]-coef*3.7))
unlikely|somewhat likely
               0.4256305
> (p1<-prob(intercept[2]-coef*3.7)-p0)
somewhat likely|very likely
                  0.4225275
> (p2<-1-(p0+p1))
very likely
   0.151842


Running a Model in R
Now you tell me the probability for each category if you had a 4.0 GPA.
> (p0<-prob(intercept[1]-coef*4.0))
unlikely|somewhat likely = 37%
> (p1<-prob(intercept[2]-coef*4.0)-p0)
somewhat likely|very likely = 44%
> (p2<-1-(p0+p1))
very likely = 18%


Multiple Predictors
1. Let’s look at a model with multiple predictors:
> summary(m2<-bayespolr(as.ordered(suhy$apply)~suhy$gpa+suhy$pared+suhy$public))
Call:
bayespolr(formula = as.ordered(suhy$apply) ~ suhy$gpa + suhy$pared + suhy$public)

Coefficients:
               Value Std. Error t value
suhy$gpa     0.60441     0.2577  2.3453
suhy$pared   1.02746     0.2636  3.8973
suhy$public -0.05297     0.2932 -0.1807

Intercepts:
                             Value Std. Error t value
unlikely|somewhat likely    2.1646     0.7710  2.8074
somewhat likely|very likely 4.2526     0.7955  5.3458

Residual Deviance: 727.0019
AIC: 737.0019


Multiple Predictors
2. Let’s calculate the probabilities:
> (coef<- m2$coef)
   suhy$gpa  suhy$pared suhy$public
 0.60440882  1.02746355 -0.05297486
> (intercept<-m2$zeta)
   unlikely|somewhat likely somewhat likely|very likely
                   2.164642                    4.252572
> mean(suhy$public)
[1] 0.1425


Multiple Predictors
2. Let’s calculate the probabilities: (cont.)
> (x1<-cbind(0:4, 0 , .1425))
     [,1] [,2]   [,3]
[1,]    0    0 0.1425
[2,]    1    0 0.1425
[3,]    2    0 0.1425
[4,]    3    0 0.1425
[5,]    4    0 0.1425
> (x2<-cbind(0:4, 1 , .1425))
     [,1] [,2]   [,3]
[1,]    0    1 0.1425
[2,]    1    1 0.1425
[3,]    2    1 0.1425
[4,]    3    1 0.1425
[5,]    4    1 0.1425


Multiple Predictors
For pared = no (x1)
> prob<-function(VAR){exp(VAR)/(1+exp(VAR))}
> (p1<-prob(intercept[1]-x1 %*% coef))
          [,1]
[1,] 0.8977243
[2,] 0.8274671
[3,] 0.7237966
[4,] 0.5887896
[5,] 0.4389450
> (p2<-prob(intercept[2]-x1 %*% coef)-p1)
           [,1]
[1,] 0.08835176
[2,] 0.14734081
[3,] 0.23104216
[4,] 0.33154442
[5,] 0.42429742
> p3<-1-(p1+p2)
> p3
           [,1]
[1,] 0.01392398
[2,] 0.02519204
[3,] 0.04516123
[4,] 0.07966593
[5,] 0.13675756

For pared = yes (x2)
> prob<-function(VAR){exp(VAR)/(1+exp(VAR))}
> (p4<-prob(intercept[1]-x2 %*% coef))
          [,1]
[1,] 0.7585465
[2,] 0.6318864
[3,] 0.4839828
[4,] 0.3388329
[5,] 0.2187598
> (p5<-prob(intercept[2]-x2 %*% coef)-p4)
          [,1]
[1,] 0.2034985
[2,] 0.3007712
[3,] 0.3992947
[4,] 0.4664163
[5,] 0.4744189
> p6<-1-(p4+p5)
> p6
           [,1]
[1,] 0.03795509
[2,] 0.06734236
[3,] 0.11672252
[4,] 0.19475078
[5,] 0.30682131
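The six vectors p1 to p6 can also be produced as two small tables, one row per GPA value and one column per category. The sketch below is an illustration rather than the author's code: it types in the m2 estimates from the output above so it runs without the data or the arm package, and category_probs and inv_logit are hypothetical helper names.

```r
# Estimates copied from the m2 output on the earlier slides
beta <- c(gpa = 0.60440882, pared = 1.02746355, public = -0.05297486)
zeta <- c(2.164642, 4.252572)   # cutpoints (m2$zeta)

inv_logit <- function(v) 1 / (1 + exp(-v))

# Hypothetical helper: one row of category probabilities per row of X
category_probs <- function(X, beta, zeta) {
  eta <- as.vector(X %*% beta)                  # linear predictor per row
  cum <- inv_logit(outer(-eta, zeta, "+"))      # P(y <= k) for each cutpoint
  cum <- cbind(cum, 1)                          # last category: P(y <= K) = 1
  probs <- cbind(cum[, 1, drop = FALSE],
                 cum[, -1, drop = FALSE] - cum[, -ncol(cum), drop = FALSE])
  colnames(probs) <- c("unlikely", "somewhat likely", "very likely")
  probs
}

# Same design as x1 and x2: gpa 0 to 4, public held at its mean of 0.1425
X_pared0 <- cbind(gpa = 0:4, pared = 0, public = 0.1425)
X_pared1 <- cbind(gpa = 0:4, pared = 1, public = 0.1425)

probs_pared0 <- category_probs(X_pared0, beta, zeta)   # columns match p1, p2, p3
probs_pared1 <- category_probs(X_pared1, beta, zeta)   # columns match p4, p5, p6
round(probs_pared0, 4)
round(probs_pared1, 4)
```

Keeping the probabilities in one matrix per group makes it easy to confirm that every row sums to 1.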


Graphing the Results
> library(lattice)
> Undergrad.GPA <- 0:4
> plot(Undergrad.GPA, p1, type="l", col=1, ylim=c(0,1))
> lines(0:4, p2, col=2)
> lines(0:4, p3, col=3)
> lines(0:4, p4, col=1, lty = 2)
> lines(0:4, p5, col=2, lty = 2)
> lines(0:4, p6, col=3, lty = 2)
> legend(1.5, 1,
+        legend=c("P(unlikely)", "P(somewhat likely)", "P(very likely)",
+                 "Line Type when Pared = 0", "Line Type when Pared = 1"),
+        col=c(1:3,1,1), lty=c(1,1,1,1,2))

[Figure: P(unlikely), P(somewhat likely), and P(very likely) plotted against Undergrad.GPA (0 to 4), y-axis from 0.0 to 1.0; solid lines when Pared = 0, dashed lines when Pared = 1]

Why Not Linear Regression
• The decision is not always black and white
• Large categories that are equally spaced could call for a simple linear model
• However, you must ALWAYS check your assumptions
(Gelman & Hill, 2007)


Why Not Linear Regression
Here is why… If we run our model using a simple linear model:
> apply2<-as.numeric(suhy$apply)
> m3<-lm(apply2~gpa, suhy)
> summary(m3)

Call:
lm(formula = apply2 ~ gpa, data = suhy)

Residuals:
    Min      1Q  Median      3Q     Max
-0.7917 -0.5554 -0.3962  0.4786  1.6012

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.77984    0.25224   3.092  0.00213 **
gpa          0.25681    0.08338   3.080  0.00221 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6628 on 398 degrees of freedom
Multiple R-squared:  0.02328, Adjusted R-squared:  0.02083
F-statistic: 9.486 on 1 and 398 DF,  p-value: 0.002214
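To see what this fit implies, it helps to plug a few GPA values into the fitted line. The sketch below is an added illustration, not part of the original slides; it assumes as.numeric() coded the levels as unlikely = 1, somewhat likely = 2, very likely = 3 (the level order shown in the xtabs output), and the names a, b and fitted_score are made up here.

```r
# Coefficients copied from summary(m3) above
a <- 0.77984   # intercept
b <- 0.25681   # gpa slope

gpa <- c(2.5, 3.0, 3.7, 4.0)
fitted_score <- a + b * gpa   # predicted value on the assumed 1-to-3 coding

data.frame(gpa, fitted_score = round(fitted_score, 3))
```

Every fitted value here lands between 1 and 2, so the linear model returns a single score on an assumed equally spaced scale rather than a probability for each category.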


This is what we see when we check our assumptions…


Why Not Linear Regression
[Figure: four diagnostic plots: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance)]
You tell me…


References
Gelman, A. & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. New York: Cambridge University Press.
Hoelze, B. (2009). Regression analysis with the ordinal multinomial logistic model [PowerPoint slides]. Retrieved from http://faculty.smu.edu/kyler/courses/7314//student/ordered_multinomial.pptx
Johnson, V. E. & Albert, J. H. (1999). Statistics for the social sciences and public policy: Ordinal data modeling. New York: Springer.
Shivraj, P. (2011). Ordered multinomial logistic regression analysis [PowerPoint slides]. Retrieved from http://faculty.smu.edu/kyler/courses/7312/presentations/shivraj/Ordered_ML_Shivraj.pdf
UCLA: Academic Technology Services. (n.d.). Retrieved from http://www.ats.ucla.edu/stat/r/ologit.dta