The Probit Model, Alexander Spermann, University of Freiburg, SoSe 2009

Feb 02, 2017

Page 1: The Probit Model

The Probit Model

Alexander Spermann, University of Freiburg

SoSe 2009

Page 2: The Probit Model

Course outline

1. Notation and statistical foundations
2. Introduction to the Probit model
3. Application
4. Coefficients and marginal effects
5. Goodness-of-fit
6. Hypothesis tests

Page 3: The Probit Model

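Notation and statistical foundations

1. Scalar notation

Gujarati: $y_i = \beta_1 + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \varepsilon_i$

Wooldridge: $y_t = \beta_0 + \beta_1 x_{t1} + \dots + \beta_k x_{tk} + u_t$

2. Matrix notation

$Y = X\beta + \varepsilon$ or $Y = X\beta + u$, with fitted values $\hat{Y} = X\hat{\beta}$

For a single observation: $y_i = x_i'\beta + \varepsilon_i$, where $x_i'\beta = \beta_1 + \beta_2 x_{i2} + \dots + \beta_k x_{ik}$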

Page 4: The Probit Model

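Notation and statistical foundations – Vectors

• Column vector ($n \times 1$):

$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}$$

• Transposed (row vector, $1 \times n$):

$$a' = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix}$$

• Inner product:

$$a'b = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = \sum_{i=1}^{n} a_i b_i$$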
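The inner product can be checked numerically. The following is a minimal Python sketch; the function name `inner_product` and the vectors `a` and `b` are illustrative choices, not part of the slides.

```python
def inner_product(a, b):
    """Return a'b = sum_i a_i * b_i for two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


# Illustrative vectors (hypothetical values).
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
result = inner_product(a, b)  # 1*4 + 2*5 + 3*6
print(result)
```

The same operation appears later as the index $x_i'\beta$ of the Probit model.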

Page 5: The Probit Model

Notation and statistical foundations – density function

• PDF: probability density function f(x)

• Example: Normal distribution:

$$\phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

• Example: Standard normal distribution: N(0,1), µ = 0, σ = 1

$$\phi(x) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}}$$
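As a quick check of the density formula, the sketch below evaluates it directly and compares the result with `statistics.NormalDist` from the Python standard library. The function name `normal_pdf` and the evaluation points are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Standard normal density at its mode, equal to 1 / sqrt(2 * pi).
peak = normal_pdf(0.0)
print(peak, NormalDist(0.0, 1.0).pdf(0.0))

# A non-standard case with hypothetical parameters mu = 1, sigma = 2.
print(normal_pdf(1.5, mu=1.0, sigma=2.0), NormalDist(1.0, 2.0).pdf(1.5))
```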

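Page 6: The Probit Model

Notation and statistical foundations – distributions

• Standard logistic distribution:

$$f(x) = \frac{e^{x}}{(1 + e^{x})^2}, \quad \mu = 0, \quad \sigma^2 = \frac{\pi^2}{3}$$

• Exponential distribution:

$$f(x, \theta) = \frac{1}{\theta} \, e^{-x/\theta}, \quad x \ge 0, \quad \theta > 0, \quad \mu = \theta, \quad \sigma^2 = \theta^2$$

• Poisson distribution:

$$f(x, \theta) = \frac{e^{-\theta} \theta^{x}}{x!}, \quad \mu = \theta, \quad \sigma^2 = \theta$$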

Page 7: The Probit Model

Notation and statistical foundations – CDF

• CDF: cumulative distribution function F(x)

• Example: Standard normal distribution:

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}} \, dx$$

• The cdf is the integral of the pdf.
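The relation between cdf and pdf can be illustrated numerically. The sketch below approximates the integral with the trapezoidal rule and compares it with `NormalDist().cdf`; the lower integration bound of -10 and the number of steps are assumptions chosen for this illustration.

```python
import math
from statistics import NormalDist


def std_normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf_numeric(z, lower=-10.0, steps=20000):
    """Approximate Phi(z) by the trapezoidal rule on [lower, z]."""
    if z <= lower:
        return 0.0
    h = (z - lower) / steps
    total = 0.5 * (std_normal_pdf(lower) + std_normal_pdf(z))
    for k in range(1, steps):
        total += std_normal_pdf(lower + k * h)
    return total * h


# Numerical integral next to the library value of the cdf.
for z in (-1.0, 0.0, 1.96):
    print(z, std_normal_cdf_numeric(z), NormalDist().cdf(z))
```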

Page 8: The Probit Model

Notation and statistical foundations – logarithms

• Rule I: $y = x \cdot z \;\Rightarrow\; \log y = \log x + \log z$

• Rule II: $y = x^{n} \;\Rightarrow\; \log y = n \log x$

• Rule III: $y = a \cdot x^{b} \;\Rightarrow\; \log y = \log a + b \log x$

Page 9: The Probit Model

Introduction to the Probit model – binary variables

• Binary dependent variable: $y \in \{0, 1\}$

• Why not use OLS instead?

[Figure: scatter of observations at y = 0 and y = 1 against x, with the fitted linear OLS line]

• Nonlinear estimation, for example by maximum likelihood.

Page 10: The Probit Model

Introduction to the Probit model – latent variables

• Latent variable: Unobservable variable y* which can take all values in (-∞, +∞).

• Example: y* = Utility(Labour income) - Utility(Non labour income)

• Underlying latent model:

$$y_i^* = x_i'\beta + \varepsilon_i, \qquad y_i = \begin{cases} 1, & y_i^* > 0 \\ 0, & y_i^* \le 0 \end{cases}$$

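Page 11: The Probit Model

Introduction to the Probit model – latent variables

• Probit is based on a latent model:

$$P(y_i = 1 \mid x_i) = P(y_i^* > 0 \mid x_i) = P(x_i'\beta + \varepsilon_i > 0 \mid x_i) = P(\varepsilon_i > -x_i'\beta \mid x_i) = 1 - F(-x_i'\beta)$$

[Figure: density $\phi(\varepsilon)$ with the points $-x_i'\beta$ and $x_i'\beta$ marked on the horizontal axis]

• Assumption: Error terms are independent and normally distributed:

$$P(y_i = 1 \mid x_i) = 1 - \Phi\left(\frac{-x_i'\beta}{\sigma}\right) = \Phi(x_i'\beta), \quad \sigma \equiv 1,$$

because of symmetry.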
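The link between the latent model and $\Phi(x_i'\beta)$ can be illustrated with a small simulation. In the sketch below, the index value 0.8416, the sample size and the seed are hypothetical choices; the share of simulated draws with $y^* > 0$ should be close to $\Phi(x_i'\beta)$, which is roughly 0.8 for this index.

```python
import random
from statistics import NormalDist


def simulate_share(index, n=200_000, seed=0):
    """Share of draws with y* = index + eps > 0, where eps ~ N(0, 1)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if index + rng.gauss(0.0, 1.0) > 0.0)
    return hits / n


index = 0.8416  # hypothetical value of x_i'beta
simulated = simulate_share(index)
analytic = NormalDist().cdf(index)             # Phi(x_i'beta)
by_symmetry = 1.0 - NormalDist().cdf(-index)   # 1 - Phi(-x_i'beta)
print(simulated, analytic, by_symmetry)
```

The last two numbers coincide because the standard normal density is symmetric around zero.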

Page 12: The Probit Model

Introduction to the Probit model – CDF

• Example:

[Figure: standard normal CDF $\Phi(z)$ plotted against $z = x_i'\beta$, with $\Phi(-z) = 0.2$, $\Phi(0) = 0.5$ and $\Phi(z) = 1 - 0.2 = 0.8$ marked]

Page 13: The Probit Model

Introduction to the Probit model – CDF Probit vs. Logit

• F(z) lies between zero and one

• CDF of Probit: CDF of Logit:

[Figures: CDF of the Probit model and CDF of the Logit model, each plotted against $z = x_i'\beta$]

Page 14: The Probit Model

Introduction to the Probit model – PDF Probit vs. Logit

• PDF of Probit: PDF of Logit:

Page 15: The Probit Model

Introduction to the Probit model – The ML principle

• Joint density:

$$f(y \mid x, \beta) = \prod_i F(x_i'\beta)^{y_i} \left[1 - F(x_i'\beta)\right]^{1 - y_i} = \prod_i F_i^{y_i} (1 - F_i)^{1 - y_i}$$

• Log likelihood function:

$$\ln L = \sum_i \left[ y_i \ln F_i + (1 - y_i) \ln(1 - F_i) \right]$$
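The log likelihood function can be written down almost literally in code. The sketch below assumes the Probit case, $F_i = \Phi(x_i'\beta)$; the function name `probit_loglik` and the toy data are hypothetical and serve only to show how the function is evaluated.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, the F of the Probit model


def probit_loglik(beta, X, y):
    """ln L = sum_i [ y_i ln F_i + (1 - y_i) ln(1 - F_i) ], F_i = Phi(x_i'beta)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        index = sum(b * x for b, x in zip(beta, x_i))
        F_i = PHI(index)
        total += y_i * math.log(F_i) + (1 - y_i) * math.log(1.0 - F_i)
    return total


# Hypothetical toy data: a constant and one regressor.
X = [(1.0, -1.0), (1.0, 0.0), (1.0, 0.5), (1.0, 2.0)]
y = [0, 0, 1, 1]
ll_zero = probit_loglik((0.0, 0.0), X, y)   # every F_i = 0.5, so ln L = 4 * ln(0.5)
ll_slope = probit_loglik((0.0, 1.0), X, y)  # a positive slope fits these data better
print(ll_zero, ll_slope)
```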

Page 16: The Probit Model

Introduction to the Probit model – The ML principle

• The principle of ML: Which value of β maximizes the probability of observing the given sample?

$$\frac{\partial \ln L}{\partial \beta} = \sum_i \left[ \frac{y_i f_i}{F_i} + (1 - y_i) \frac{-f_i}{1 - F_i} \right] x_i = \sum_i \frac{y_i - F_i}{F_i (1 - F_i)} \, f_i \, x_i = 0$$

Page 17: The Probit Model

Introduction to the Probit model – Example

• Example taken from Greene, Econometric Analysis, 5th ed. 2003, ch. 17.3.

• 10 observations of a discrete distribution

• Random sample: 5, 0, 1, 1, 0, 3, 2, 3, 4, 1

• PDF:

$$f(x_i, \theta) = \frac{e^{-\theta} \theta^{x_i}}{x_i!}$$

• Joint density:

$$f(x_1, x_2, \dots, x_{10} \mid \theta) = \prod_{i=1}^{10} f(x_i, \theta) = \frac{e^{-10\theta} \, \theta^{\sum_{i=1}^{10} x_i}}{\prod_{i=1}^{10} x_i!} = \frac{e^{-10\theta} \, \theta^{20}}{207{,}360}$$

• Which value of θ makes occurrence of the observed sample most probable?

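Page 18: The Probit Model

Introduction to the Probit model – Example

$$\ln L(\theta) = -10\theta + 20 \ln\theta - 12.242$$

$$\frac{d \ln L(\theta)}{d\theta} = -10 + \frac{20}{\theta} = 0 \;\Rightarrow\; \hat{\theta} = 2$$

$$\frac{d^2 \ln L(\theta)}{d\theta^2} = -\frac{20}{\theta^2} < 0 \;\Rightarrow\; \text{Maximum}$$

[Figure: $L(\theta \mid x)$ and $\ln L(\theta \mid x)$ plotted against $\theta$]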
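The numbers in this example can be reproduced directly from the random sample. The sketch below is a plain Python check; the function name `poisson_loglik` is an illustrative choice.

```python
import math

sample = [5, 0, 1, 1, 0, 3, 2, 3, 4, 1]  # random sample from the example


def poisson_loglik(theta, xs):
    """ln L(theta | x) = -n * theta + ln(theta) * sum(x) - sum(ln(x_i!))."""
    n = len(xs)
    log_factorials = sum(math.log(math.factorial(x)) for x in xs)
    return -n * theta + math.log(theta) * sum(xs) - log_factorials


factorial_product = math.prod(math.factorial(x) for x in sample)
theta_hat = sum(sample) / len(sample)  # solves -10 + 20 / theta = 0

print(factorial_product)            # denominator of the joint density
print(math.log(factorial_product))  # constant term of ln L
print(theta_hat, poisson_loglik(theta_hat, sample))
```

For the Poisson distribution the ML estimate is the sample mean, which is why no numerical optimizer is needed here.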

Page 19: The Probit Model

Application

• Analysis of the effect of a new teaching method in economic sciences

• Data:

Observation   GPA    TUCE   PSI   Grade
 1            2.66   20     0     0
 2            2.89   22     0     0
 3            3.28   24     0     0
 4            2.92   12     0     0
 5            4.00   21     0     1
 6            2.86   17     0     0
 7            2.76   17     0     0
 8            2.87   21     0     0
 9            3.03   25     0     0
10            3.92   29     0     1
11            2.63   20     0     0
12            3.32   23     0     0
13            3.57   23     0     0
14            3.26   25     0     1
15            3.53   26     0     0
16            2.74   19     0     0
17            2.75   25     0     0
18            2.83   19     0     0
19            3.12   23     1     0
20            3.16   25     1     1
21            2.06   22     1     0
22            3.62   28     1     1
23            2.89   14     1     0
24            3.51   26     1     0
25            3.54   24     1     1
26            2.83   27     1     1
27            3.39   17     1     1
28            2.67   24     1     0
29            3.65   21     1     1
30            4.00   23     1     1
31            3.10   21     1     0
32            2.39   19     1     1

Source: Spector, L. and M. Mazzeo, Probit Analysis and Economic Education. In: Journal of Economic Education, 11, 1980, pp. 37-44

Page 20: The Probit Model

Application – Variables

• Grade: Dependent variable. Indicates whether a student improved his grades after the new teaching method PSI had been introduced (0 = no, 1 = yes).

• PSI: Indicates if a student attended courses that used the new method (0 = no, 1 = yes).

• GPA: Average grade of the student.

• TUCE: Score of an intermediate test which shows previous knowledge of a topic.

Page 21: The Probit Model

Application – Estimation

• Estimation results of the model (output from Stata):

Page 22: The Probit Model

Application – Discussion

• ML estimator: Parameters were obtained by maximization of the log likelihood function. Here: 5 iterations were necessary to find the maximum of the log likelihood function (-12.818803).

• Interpretation of the estimated coefficients:

• Estimated coefficients do not quantify the influence of the rhs variables on the probability that the lhs variable takes on the value one.

• Estimated coefficients are parameters of the latent model.
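The maximization can be sketched in plain Python on the data from the data table. This is not Stata's optimizer: the sketch uses Fisher scoring (Newton-type steps with the expected information matrix) as a stand-in, starts from β = 0, and all function names are illustrative. Its iteration count therefore need not match the 5 iterations reported by Stata, but the maximized log likelihood should agree with the value stated above.

```python
import math
from statistics import NormalDist

N01 = NormalDist()

# (GPA, TUCE, PSI, Grade) for the 32 students in the data table.
DATA = [
    (2.66, 20, 0, 0), (2.89, 22, 0, 0), (3.28, 24, 0, 0), (2.92, 12, 0, 0),
    (4.00, 21, 0, 1), (2.86, 17, 0, 0), (2.76, 17, 0, 0), (2.87, 21, 0, 0),
    (3.03, 25, 0, 0), (3.92, 29, 0, 1), (2.63, 20, 0, 0), (3.32, 23, 0, 0),
    (3.57, 23, 0, 0), (3.26, 25, 0, 1), (3.53, 26, 0, 0), (2.74, 19, 0, 0),
    (2.75, 25, 0, 0), (2.83, 19, 0, 0), (3.12, 23, 1, 0), (3.16, 25, 1, 1),
    (2.06, 22, 1, 0), (3.62, 28, 1, 1), (2.89, 14, 1, 0), (3.51, 26, 1, 0),
    (3.54, 24, 1, 1), (2.83, 27, 1, 1), (3.39, 17, 1, 1), (2.67, 24, 1, 0),
    (3.65, 21, 1, 1), (4.00, 23, 1, 1), (3.10, 21, 1, 0), (2.39, 19, 1, 1),
]
X = [(1.0, row[0], row[1], row[2]) for row in DATA]  # constant first
y = [row[3] for row in DATA]


def loglik(beta):
    """Probit log likelihood for the data above."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        F = N01.cdf(sum(b * x for b, x in zip(beta, x_i)))
        total += math.log(F) if y_i else math.log(1.0 - F)
    return total


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z


def fit_probit(max_iter=100, tol=1e-10):
    """Fisher scoring: beta <- beta + (expected information)^(-1) * score."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x_i, y_i in zip(X, y):
            index = sum(b * x for b, x in zip(beta, x_i))
            F, f = N01.cdf(index), N01.pdf(index)
            weight = f / (F * (1.0 - F))
            for a in range(k):
                score[a] += (y_i - F) * weight * x_i[a]
                for c in range(k):
                    info[a][c] += f * weight * x_i[a] * x_i[c]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


beta_hat = fit_probit()
print([round(b, 4) for b in beta_hat])  # order: constant, GPA, TUCE, PSI
print(round(loglik(beta_hat), 6))
```

The score used in `fit_probit` is the first-order condition of the ML principle, $\sum_i \frac{y_i - F_i}{F_i(1 - F_i)} f_i x_i$, which is zero at the maximum.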

Page 23: The Probit Model

Coefficients and marginal effects

• The marginal effect of a rhs variable is the effect of a unit change of this variable on the probability P(Y = 1|X = x), given that all other rhs variables are constant:

$$\frac{\partial P(y_i = 1 \mid x_i)}{\partial x_i} = \frac{\partial E(y_i \mid x_i)}{\partial x_i} = \varphi(x_i'\beta) \, \beta$$

• Recap: The slope parameter of the linear regression model measures directly the marginal effect of the rhs variable on the lhs variable.

Page 24: The Probit Model

Coefficients and marginal effects

• The marginal effect depends on the value of the rhs variable.

• Therefore, there exists an individual marginal effect for each person of the sample:

Page 25: The Probit Model

Coefficients and marginal effects – Computation

• Two different types of marginal effects can be calculated:

• Average marginal effect: Stata command: margin

• Marginal effect at the mean: Stata command: mfx compute

Page 26: The Probit Model

Coefficients and marginal effects – Computation

• Principle of the computation of the average marginal effects:

• Average of individual marginal effects

Page 27: The Probit Model

Coefficients and marginal effects – Computation

• Computation of average marginal effects depends on type of rhs variable:

• Continuous variables like TUCE and GPA:

$$AME = \frac{1}{n} \sum_{i=1}^{n} \varphi(x_i'\beta) \, \beta$$

• Dummy variable like PSI:

$$AME = \frac{1}{n} \sum_{i=1}^{n} \left[ \Phi(x_i'\beta \mid x_i^k = 1) - \Phi(x_i'\beta \mid x_i^k = 0) \right]$$
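Both formulas translate into a few lines of code. The sketch below uses hypothetical coefficients and a hypothetical four-observation data set, so the printed numbers are not the marginal effects of the application; the function names are illustrative.

```python
from statistics import NormalDist

N01 = NormalDist()


def index(beta, x):
    """Linear index x'beta."""
    return sum(b * v for b, v in zip(beta, x))


def ame_continuous(beta, X, j):
    """(1/n) * sum_i phi(x_i'beta) * beta_j for a continuous regressor j."""
    return sum(N01.pdf(index(beta, x)) for x in X) / len(X) * beta[j]


def ame_dummy(beta, X, j):
    """(1/n) * sum_i [Phi(x_i'beta | x_ij = 1) - Phi(x_i'beta | x_ij = 0)]."""
    total = 0.0
    for x in X:
        x_one = list(x)
        x_one[j] = 1.0
        x_zero = list(x)
        x_zero[j] = 0.0
        total += N01.cdf(index(beta, x_one)) - N01.cdf(index(beta, x_zero))
    return total / len(X)


# Hypothetical coefficients and data: (constant, continuous regressor, dummy).
beta = (-1.0, 0.5, 0.8)
X = [(1.0, 0.5, 0.0), (1.0, 1.5, 1.0), (1.0, 2.5, 0.0), (1.0, 3.0, 1.0)]
print(ame_continuous(beta, X, 1))
print(ame_dummy(beta, X, 2))
```

For the dummy variable, each observation is evaluated twice, once with the dummy set to one and once with it set to zero, regardless of its observed value.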

Page 28: The Probit Model

Coefficients and marginal effects – Interpretation

• Interpretation of average marginal effects:

• Continuous variables like TUCE and GPA: An infinitesimal change of TUCE or GPA changes the probability that the lhs variable takes the value one at a rate of X percentage points per unit of the variable.

• Dummy variable like PSI: A change of PSI from zero to one changes the probability that the lhs variable takes the value one by X percentage points.

Page 29: The Probit Model

Coefficients and marginal effects – Interpretation

Variable | Estimated marginal effect | Interpretation
GPA | 0.364 | If the average grade of a student goes up by an infinitesimal amount, the probability for the variable grade taking the value one rises at a rate of 36.4 percentage points per unit of GPA.
TUCE | 0.011 | Analogous to GPA, with a rate of 1.1 percentage points per unit of TUCE.
PSI | 0.374 | If the dummy variable changes from zero to one, the probability for the variable grade taking the value one rises by 37.4 percentage points.

Page 30: The Probit Model

Coefficients and marginal effects – Significance

• Significance of a coefficient: test of the hypothesis that a parameter is equal to zero.

• The decision problem is similar to the t-test, except that the probit test statistic follows a standard normal distribution. The z-value is equal to the estimated parameter divided by its standard error.

• Stata computes a p-value which shows directly the significance of a parameter:

Variable   z-value   p-value   Interpretation
GPA        3.22      0.001     significant
TUCE       0.62      0.533     insignificant
PSI        2.67      0.008     significant
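The p-values follow from the z-values through the standard normal distribution. The sketch below assumes a two-sided test, $p = 2\,(1 - \Phi(|z|))$; because the z-values are taken in rounded form from the table, the recomputed p-values can differ slightly in the third decimal from those reported by Stata.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """p = 2 * (1 - Phi(|z|)) for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# z-values as reported (rounded) in the table.
for name, z in (("GPA", 3.22), ("TUCE", 0.62), ("PSI", 2.67)):
    print(name, round(two_sided_p_value(z), 3))
```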

Page 31: The Probit Model

Coefficients and marginal effects

• Only the average of the marginal effects is displayed.

• The individual marginal effects show large variation:

Stata command: margin, table

Page 32: The Probit Model

Coefficients and marginal effects

• Variation of marginal effects may be quantified by the confidence intervals of the marginal effects.

• In which range can one expect the population value?

• In our example:

Variable   Estimated marginal effect   Confidence interval (95%)
GPA        0.364                       -0.055 to 0.782
TUCE       0.011                       -0.002 to 0.025
PSI        0.374                        0.121 to 0.626

Page 33: The Probit Model

Coefficients and marginal effects

• What is calculated by mfx?

• Estimation of the marginal effect at the sample mean.

[Figure: marginal effect evaluated at the sample mean]

Page 34: The Probit Model

Goodness of fit

• Goodness of fit may be judged by McFadden's Pseudo R².

• Measure for proximity of the model to the observed data.

• Comparison of the estimated model with a model which only contains a constant as rhs variable.

• $\ln \hat{L}(M_{Full})$: Log likelihood of the model of interest.

• $\ln \hat{L}(M_{Intercept})$: Log likelihood with all coefficients except that of the intercept restricted to zero.

• It always holds that $\ln \hat{L}(M_{Full}) \ge \ln \hat{L}(M_{Intercept})$.

Page 35: The Probit Model

Goodness of fit

• The Pseudo R² is defined as:

$$\text{Pseudo } R^2 = R^2_{McF} = 1 - \frac{\ln \hat{L}(M_{Full})}{\ln \hat{L}(M_{Intercept})}$$

• Similar to the R² of the linear regression model, it holds that $0 \le R^2_{McF} \le 1$.

• An increasing Pseudo R² may indicate a better fit of the model, whereas no simple interpretation like for the R² of the linear regression model is possible.
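For the application, the Pseudo R² can be computed from quantities already given in the slides. The sketch below uses the reported log likelihood of the full model and the fact that the intercept-only model fits the sample share of ones (Grade = 1 for 11 of the 32 students, counted from the data table); the variable names are illustrative.

```python
import math

n_obs, n_ones = 32, 11   # Grade = 1 for 11 of the 32 students
lnL_full = -12.818803    # log likelihood reported for the full model

# Intercept-only model: the fitted probability is the sample share of ones.
p_bar = n_ones / n_obs
lnL_intercept = n_ones * math.log(p_bar) + (n_obs - n_ones) * math.log(1.0 - p_bar)

pseudo_r2 = 1.0 - lnL_full / lnL_intercept
print(round(lnL_intercept, 4), round(pseudo_r2, 4))
```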

Page 36: The Probit Model

Goodness of fit

• A high value of $R^2_{McF}$ does not necessarily indicate a good fit, however, as $R^2_{McF} = 1$ if $\ln \hat{L}(M_{Full}) = 0$.

• $R^2_{McF}$ increases with additional rhs variables. Therefore, an adjusted measure may be appropriate:

$$\text{Pseudo } R^2 = R^2_{McF,\,adjusted} = 1 - \frac{\ln \hat{L}(M_{Full}) - K}{\ln \hat{L}(M_{Intercept})}$$

• Further goodness of fit measures: R² of McKelvey and Zavoina, Akaike Information Criterion (AIC), etc. See also the Stata command fitstat.

Page 37: The Probit Model

Hypothesis tests

• Likelihood ratio test: possibility for hypothesis testing, for example for variable relevance.

• Basic principle: Comparison of the log likelihood functions of the unrestricted model (ln LU) and that of the restricted model (ln LR)

• Test statistic:

$$LR = -2 \ln \lambda = -2 (\ln L_R - \ln L_U) \sim \chi^2(K), \qquad \lambda = \frac{L_R}{L_U}, \quad 0 \le \lambda \le 1$$

• The test statistic follows a χ² distribution with degrees of freedom equal to the number of restrictions.

Page 38: The Probit Model

Hypothesis tests

• Null hypothesis: All coefficients except that of the intercept are equal to zero.

• In the example: $LR\ \chi^2(3) = 15.55$, Prob > chi2 = 0.0014

• Interpretation: The hypothesis that all coefficients except that of the intercept are equal to zero can be rejected at the 1 percent significance level.
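The test statistic and its p-value can be recomputed from the reported log likelihood. In the sketch below, the restricted log likelihood is that of the intercept-only model (11 ones among 32 observations, counted from the data table), and the p-value uses a closed-form expression for the χ² distribution that holds for 3 degrees of freedom only; both the helper name `chi2_sf_3df` and this shortcut are choices made for the illustration.

```python
import math


def chi2_sf_3df(x):
    """P(chi2(3) > x), closed form valid for 3 degrees of freedom only."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


lnL_U = -12.818803  # unrestricted model, as reported in the application
n_obs, n_ones = 32, 11
p_bar = n_ones / n_obs
# Restricted model: only an intercept, so every fitted probability equals p_bar.
lnL_R = n_ones * math.log(p_bar) + (n_obs - n_ones) * math.log(1.0 - p_bar)

lr_stat = -2.0 * (lnL_R - lnL_U)
p_value = chi2_sf_3df(lr_stat)
print(round(lr_stat, 2), round(p_value, 4))
```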