
Logistic Regression

Feb 25, 2016

Transcript
Page 1: Logistic Regression

Logistic Regression

Saed Sayad

www.ismartsoft.com

Page 2: Logistic Regression

Definition

Logistic Regression is a type of regression model where the dependent variable (target) has just two values, such as:

0, 1
Y, N
F, T


Page 3: Logistic Regression

Sample Dataset

Months in Business   Balance    Default
189                  $429,916   0
170                  $240,319   1
166                  $231,327   0
423                  $196,105   0
145                  $193,907   1
60                   $190,944   0
97                   $184,333   0
354                  $152,126   0
99                   $151,061   1
80                   $135,885   0
25                   $119,751   1
118                  $116,578   1
74                   $123,864   0
...                  ...        ...

Page 4: Logistic Regression

Linear Regression (Continuous Dependent Variable)

[Figure: plot with axes Months in Business and Balance]

Page 5: Logistic Regression

Linear Regression (Binary Dependent Variable)

[Figure: plot with axes Default and Months in Business]

Page 6: Logistic Regression

Linear Regression Model – Binary Target

• If the actual Y is a binary variable, then the predicted Y can be less than 0 or greater than 1.

• If the actual Y is a binary variable, then the error is not normally distributed.

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
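The first point can be checked numerically. The following is a minimal sketch, assuming a small made-up dataset (not the sample dataset from the slides) and a hypothetical helper named `fit_line`. It fits the linear model by ordinary least squares to a binary target and shows that some fitted values fall outside the range 0 to 1.

```python
# Ordinary least squares for Y = b0 + b1 * X on a binary target.
# The data below are invented for illustration only.

def fit_line(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1

xs = list(range(10))                 # hypothetical predictor values
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # hypothetical binary target

b0, b1 = fit_line(xs, ys)
predictions = [b0 + b1 * x for x in xs]

# The smallest fitted value is below 0 and the largest is above 1.
print(min(predictions), max(predictions))
```

With this data the fitted line crosses both bounds, so its output cannot be read directly as a probability.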


Page 7: Logistic Regression

Linear Regression Model

[Figure: linear regression line, Y (axis marked at 0 and 1) against X]

Page 8: Logistic Regression

Frequency Table

Months in Business   Count   Default Count   Default Frequency
<50                  4       0               0
50-100               12      1               0.083
100-150              4       1               0.25
150-200              4       2               0.5
200-250              4       3               0.75
250-300              1       1               1
>300                 4       4               1
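The Default Frequency column is the default count divided by the count in each bin. A short sketch that recomputes it from the table values (the variable names are chosen here for illustration):

```python
# (bin label, number of businesses, number of defaults), copied from the table.
bins = [
    ("<50", 4, 0),
    ("50-100", 12, 1),
    ("100-150", 4, 1),
    ("150-200", 4, 2),
    ("200-250", 4, 3),
    ("250-300", 1, 1),
    (">300", 4, 4),
]

# Default frequency = defaults / count for each bin.
frequencies = {label: defaults / count for label, count, defaults in bins}

for label, freq in frequencies.items():
    print(f"{label:>8}  {freq:.3f}")
```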

Page 9: Logistic Regression

Frequency Plot

[Figure: Default Probability by Months in Business - Bins]

Page 10: Logistic Regression

Logistic Function

$$f(z) = \frac{1}{1 + e^{-z}}$$
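The logistic function, f(z) = 1 / (1 + e^(-z)), maps any real number to a value between 0 and 1. A minimal sketch follows; the two-branch form is an implementation choice made here to avoid floating-point overflow for large negative z, not something taken from the slides.

```python
import math

def logistic(z):
    """Logistic function f(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Evaluate at a few points, including extreme inputs.
values = [logistic(z) for z in (-1000, -2, 0, 2, 1000)]
print(values)
```

The function is symmetric around z = 0, where it equals 0.5, and it increases monotonically with z.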

Page 11: Logistic Regression

Logistic Regression

The logistic function constrains the estimated probabilities to lie between 0 and 1.

Maximum Likelihood Estimation is a statistical method for estimating the coefficients of a model.

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$$
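One way to read the model p = 1 / (1 + e^(-(β0 + β1 X))) is to solve it for the linear part. A short derivation, using only the symbols p, β0, β1 and X, shows that the log of the odds is linear in X:

$$
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}
\;\Longrightarrow\;
\frac{p}{1 - p} = e^{\beta_0 + \beta_1 X}
\;\Longrightarrow\;
\ln\!\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 X
$$

So a one-unit increase in X changes the log odds by β1.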

Page 12: Logistic Regression

Logistic Regression Model

[Figure: Linear Model and Logistic Model curves, Y (axis marked at 0 and 1) against X]

Page 13: Logistic Regression

Maximum Likelihood Estimation (MLE)

• MLE maximizes the log likelihood (LL), which reflects how likely it is that the observed values of the dependent variable would be predicted from the independent variables.

• MLE is an iterative algorithm that starts with arbitrary initial values for the coefficients.

• After the model is estimated with these initial values, the coefficients are updated and the process is repeated until LL no longer changes significantly.

Copyright iSmartsoft Inc. 2008
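The iterative procedure can be sketched in a few lines. The slides do not say which update rule is used, so this example assumes Newton-Raphson steps, a single predictor, starting coefficients of zero, and a small made-up dataset; `fit_logistic` and `log_likelihood` are hypothetical helper names.

```python
import math

def log_likelihood(b0, b1, xs, ys):
    """Log likelihood of binary outcomes ys under p = 1 / (1 + e^-(b0 + b1*x))."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return ll

def fit_logistic(xs, ys, tol=1e-8, max_iter=50):
    """Estimate (b0, b1) by Newton-Raphson; stop when LL barely changes."""
    b0, b1 = 0.0, 0.0  # arbitrary starting values
    history = [log_likelihood(b0, b1, xs, ys)]
    for _ in range(max_iter):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Gradient of LL with respect to b0 and b1.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the negative Hessian, weighted by p * (1 - p).
        ws = [p * (1 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        history.append(log_likelihood(b0, b1, xs, ys))
        if abs(history[-1] - history[-2]) < tol:
            break
    return b0, b1, history

# Invented data: the outcome becomes more likely as x grows, with overlap.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1, history = fit_logistic(xs, ys)
print(b0, b1, history[-1])
```

The `history` list records LL after each iteration, which makes the stopping rule visible: the loop ends once two consecutive values differ by less than `tol`. The made-up data overlap on purpose, because perfectly separable data have no finite maximum likelihood estimate.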

Page 14: Logistic Regression

Log Likelihood (LL)


• Likelihood is the probability of observing the actual values of the dependent variable, given the independent variables and the model coefficients.

• LL is calculated through iteration, using maximum likelihood estimation (MLE).

• Log likelihood is the basis for tests of a logistic model.
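For a binary target, the log likelihood has a standard closed form. Writing p_i for the model's predicted probability that Y_i = 1 and n for the number of observations (notation assumed here, not taken from the slides):

$$
LL = \sum_{i=1}^{n} \left[ Y_i \ln p_i + (1 - Y_i) \ln (1 - p_i) \right]
$$

Each term is the log of a probability, so LL is never positive, and values closer to 0 indicate a better fit.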

Page 15: Logistic Regression

Log Likelihood Test (-2LL)

• The log likelihood test is a test of the significance of the difference between -2LL for a simpler (reduced) model and -2LL for the full model.

• This difference is called the "model chi-square".

• It is also called the Likelihood Ratio test.
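As a rough illustration of the test, the sketch below computes the model chi-square from two log likelihoods and converts it to a p-value. It assumes the two models differ by exactly one coefficient (1 degree of freedom); the log likelihood values are hypothetical and `likelihood_ratio_test` is a helper name chosen here.

```python
import math

def likelihood_ratio_test(ll_reduced, ll_full):
    """Return (model chi-square, p-value), assuming 1 degree of freedom."""
    # Difference in -2LL between the reduced model and the full model.
    chi_square = (-2.0 * ll_reduced) - (-2.0 * ll_full)
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(chi_square, 0.0) / 2.0))
    return chi_square, p_value

# Hypothetical log likelihoods, for illustration only.
chi_square, p_value = likelihood_ratio_test(ll_reduced=-8.5, ll_full=-5.0)
print(chi_square, p_value)
```

With more than one added coefficient, the degrees of freedom equal the number of added coefficients and the p-value needs the general chi-square distribution rather than this shortcut.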

Page 16: Logistic Regression

Wald Test

• A Wald test is used to test the statistical significance of each coefficient (β) in the model.

• A Wald test calculates a Z statistic, which is:

$$Z = \frac{\hat{\beta}}{SE}$$

• This Z value is then squared, yielding a Wald statistic with a chi-square distribution.
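The Wald calculation is simple enough to show directly. This sketch divides an estimated coefficient by its standard error, squares the result, and reports a two-sided p-value; the coefficient and standard error are hypothetical values, and `wald_test` is a helper name chosen here.

```python
import math

def wald_test(coefficient, standard_error):
    """Return (Z, Wald statistic, two-sided p-value) for one coefficient."""
    z = coefficient / standard_error
    wald = z * z  # chi-square distributed with 1 degree of freedom
    # P(|Z| > |z|) under a standard normal, equal to P(chi-square > z^2).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, wald, p_value

# Hypothetical estimate and standard error, for illustration only.
z, wald, p_value = wald_test(coefficient=0.8, standard_error=0.4)
print(z, wald, p_value)
```

Because the Wald statistic is Z squared, testing it against a chi-square distribution with 1 degree of freedom gives the same p-value as the two-sided normal test on Z.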

Page 17: Logistic Regression

Summary

• Logistic Regression is a classification method.

• It returns the probability of each value of the binary dependent variable, given the independent variables.

• Maximum Likelihood Estimation is a statistical method for estimating the coefficients of the model.

• The Likelihood Ratio test is used to test the statistical significance of the difference between the full model and a simpler model.

• The Wald test is used to test the statistical significance of each coefficient in the model.

Page 18: Logistic Regression


Questions?