
Research Method Lecture 11-2 (Ch15): Instrumental Variables Estimation and Two Stage Least Squares

Dec 17, 2015


Alexia Matthews
Transcript
Page 1:

Research Method

Lecture 11-2 (Ch15)

Instrumental Variables Estimation and Two Stage Least Squares

Page 2:

What would happen when you use the IV method when the suspected endogenous variable is in fact exogenous?

Consider the following model: y=β0+β1x+u

If x is exogenous, you do not need the IV method. The OLS estimators are consistent.

Suppose that you have an instrument for x, called z, which satisfies the instrument conditions (instrument exogeneity and instrument relevance described in handout 11-1). Then, the IV estimators are also consistent.

Then, which one is better, OLS or IV?

Page 3:

The answer is OLS. If x is exogenous, the IV estimator has a larger variance, so it is less precise (you tend to get a smaller t-statistic in absolute value).

To see this, notice the following.

$$\mathrm{Var}(\hat\beta_{1,IV}) = \frac{\sigma^2}{SST_x \cdot R^2_{x,z}}$$

$$\mathrm{Var}(\hat\beta_{1,OLS}) = \frac{\sigma^2}{SST_x}$$

Here σ² is the variance of u, SST_x is the total sum of squares of x, and R²x,z is the R-squared from regressing x on z.

Since R²x,z is always between 0 and 1 (it equals 1 only when z is an exact linear function of x, for example x=z), the asymptotic variance of the IV estimator is never smaller than that of OLS, and it is strictly larger whenever R²x,z is below 1.

Page 4:

Thus, controlling for endogeneity (i.e., using the IV method) when x is actually exogenous is costly in terms of precision.
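The precision cost can be illustrated with a small simulation. The sketch below is not part of the lecture: it assumes a data generating process chosen purely for illustration (x = 0.5z + noise, so that R²x,z is about 0.2, with u independent of both x and z), and the function names are hypothetical. It compares the sampling spread of the OLS and simple IV slope estimates across repeated samples.

```python
import random
import statistics


def slope_ols(x, y):
    """OLS slope: sample Cov(x, y) / sample Var(x)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_iv(z, x, y):
    """Simple IV slope: sample Cov(z, y) / sample Cov(z, x)."""
    mz, mx, my = statistics.fmean(z), statistics.fmean(x), statistics.fmean(y)
    szy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    szx = sum((c - mz) * (a - mx) for c, a in zip(z, x))
    return szy / szx


def simulate(reps=500, n=200, beta1=1.0, seed=0):
    """Return the standard deviations of the OLS and IV slopes over `reps` samples."""
    rng = random.Random(seed)
    ols, iv = [], []
    for _ in range(reps):
        z = [rng.gauss(0, 1) for _ in range(n)]
        # Assumed first stage: Var(x) = 1.25 and Cov(x, z) = 0.5, so R^2_{x,z} = 0.2.
        x = [0.5 * zi + rng.gauss(0, 1) for zi in z]
        # u is drawn independently of x and z, so x is exogenous here.
        y = [beta1 * xi + rng.gauss(0, 1) for xi in x]
        ols.append(slope_ols(x, y))
        iv.append(slope_iv(z, x, y))
    return statistics.stdev(ols), statistics.stdev(iv)


sd_ols, sd_iv = simulate()
print(f"sd(OLS) = {sd_ols:.3f}, sd(IV) = {sd_iv:.3f}, ratio = {sd_iv / sd_ols:.2f}")
```

With R²x,z around 0.2, the variance formulas suggest a standard deviation ratio of roughly 1/sqrt(0.2), a little above 2; the simulated ratio should land in that neighbourhood.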

Page 5:

Poor instruments: What would happen if the instrumental variable does not satisfy the instrument conditions?

Consider the following model: y=β0+β1x+u

This time, suppose that x is endogenous. But further suppose that your instrumental variable z does not satisfy the instrument conditions (i.e., you have a poor instrument).

Then what would happen?

Page 6:

The answer to this question is the following.

1. The IV estimators are inconsistent.
2. The directions of the biases in the IV estimators and the OLS estimators can be opposite.
3. The bias in IV can be worse than in OLS.

Page 7:

To understand 1, notice that

$$\operatorname{plim}\hat\beta_{1,IV} = \beta_1 + \frac{\mathrm{Corr}(z,u)}{\mathrm{Corr}(z,x)}\cdot\frac{\sigma_u}{\sigma_x}$$

If instrument exogeneity is not satisfied, Corr(z,u) is not zero, so the IV estimator is inconsistent.

$$\operatorname{plim}\hat\beta_{1,OLS} = \beta_1 + \mathrm{Corr}(x,u)\cdot\frac{\sigma_u}{\sigma_x}$$

If x is endogenous, Corr(x,u) is not zero, so the OLS estimator is inconsistent.

(Proof: See the front board.) So, both IV and OLS are inconsistent.
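The proof is left to the board in the lecture. A brief sketch of the standard argument for the IV case, using only the simple model y=β0+β1x+u and the law of large numbers, might run as follows. The symbol σ_z (the standard deviation of z) is introduced here for illustration only and cancels in the last step.

```latex
\begin{aligned}
\hat\beta_{1,IV}
  &= \frac{\sum_{i=1}^{n}(z_i-\bar z)(y_i-\bar y)}{\sum_{i=1}^{n}(z_i-\bar z)(x_i-\bar x)}
   = \beta_1 + \frac{\tfrac{1}{n}\sum_{i=1}^{n}(z_i-\bar z)\,u_i}{\tfrac{1}{n}\sum_{i=1}^{n}(z_i-\bar z)(x_i-\bar x)} \\[4pt]
\operatorname{plim}\hat\beta_{1,IV}
  &= \beta_1 + \frac{\mathrm{Cov}(z,u)}{\mathrm{Cov}(z,x)}
   = \beta_1 + \frac{\mathrm{Corr}(z,u)\,\sigma_z\sigma_u}{\mathrm{Corr}(z,x)\,\sigma_z\sigma_x}
   = \beta_1 + \frac{\mathrm{Corr}(z,u)}{\mathrm{Corr}(z,x)}\cdot\frac{\sigma_u}{\sigma_x}
\end{aligned}
```

Setting z = x in the same steps gives the OLS expression, since Corr(x,x) = 1.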

Page 8:

To understand 2, first suppose that Corr(x,u) is positive. Then OLS has a positive bias.

But it can happen that Corr(z,u)/Corr(z,x) is negative. In such a case, the IV estimator has a negative bias.

This means that, when you have an invalid instrument, you may get very unexpected results.

Page 9:

To understand 3, consider the following scenario.

(i) Instrument exogeneity is almost, but not perfectly, satisfied; that is, Corr(z,u) is close to 0 but not exactly 0.

(ii) The instrument is not very relevant; i.e., Corr(z,x) is very close to 0.

Then, even if instrument exogeneity is almost satisfied, the bias will be magnified by the small Corr(z,x).

$$\operatorname{plim}\hat\beta_{1,IV} = \beta_1 + \frac{\mathrm{Corr}(z,u)}{\mathrm{Corr}(z,x)}\cdot\frac{\sigma_u}{\sigma_x}$$

If Corr(z,x) in the denominator is small, the bias will be magnified.

Page 10:

It is possible that the bias is so magnified that the bias of the IV estimator is worse than that of OLS.
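Points 2 and 3 can be checked by plugging numbers into the two probability-limit formulas. The correlations below are made-up values for illustration, not estimates from any data set, and the function names are hypothetical; σ_u/σ_x is set to 1 for simplicity.

```python
def asymptotic_bias_ols(corr_xu, sd_u=1.0, sd_x=1.0):
    """plim(beta1_OLS) - beta1 = Corr(x, u) * sd_u / sd_x."""
    return corr_xu * sd_u / sd_x


def asymptotic_bias_iv(corr_zu, corr_zx, sd_u=1.0, sd_x=1.0):
    """plim(beta1_IV) - beta1 = Corr(z, u) / Corr(z, x) * sd_u / sd_x."""
    return (corr_zu / corr_zx) * sd_u / sd_x


# Assumed values: x is moderately endogenous, z is nearly exogenous but weak.
bias_ols = asymptotic_bias_ols(corr_xu=0.30)
bias_iv = asymptotic_bias_iv(corr_zu=0.05, corr_zx=0.10)

# Same instrument quality, but z is negatively correlated with x:
# the IV bias now has the opposite sign to the OLS bias.
bias_iv_opposite = asymptotic_bias_iv(corr_zu=0.05, corr_zx=-0.10)

print("OLS bias:", bias_ols)
print("IV bias (weak instrument):", bias_iv)
print("IV bias (opposite sign):", bias_iv_opposite)
```

In this made-up case a correlation of only 0.05 between z and u is enough to make the IV bias larger in magnitude than the OLS bias, because it is divided by Corr(z,x) = 0.10.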

Page 11:

IV estimation of the multiple regression model

I will extend the discussion to the multiple regression model. I will explain the following 3 cases, step by step.

Case 1: One endogenous variable, one instrument.

Case 2: One endogenous variable, more than one instrument. (Two stage least squares)

Case 3: More than one endogenous variable, more than one instrument. (Two stage least squares)

Page 12:

Case 1: One endogenous variable, one instrument.

Consider the following regression.

$$\log(wage) = \beta_0 + \beta_1\,educ + \beta_2\,exp + u$$

Suppose that educ is endogenous but exp is exogenous.

Page 13:

To explain IV regression for multiple regression, it is often useful to use different notation for endogenous and exogenous variables.

Let us use y for endogenous variables (i.e., correlated with u) and z for exogenous variables (i.e., uncorrelated with u).

Then, we can write the model as:

y1=β0+β1y2+β2z1+u …………………(1)

y1 is log(wage), y2 is educ, and z1 is exp.

Page 14:

This model is called the structural equation to emphasize that this equation shows the causal relationship. Of course, OLS cannot be used to consistently estimate the parameters since y2 is endogenous.

If you have an instrument for y2, you can consistently estimate the model. Let us call this instrument z2.

Page 15:

As before, z2 should satisfy (i) instrument exogeneity, and (ii) instrument relevance.

For a multiple regression model, these conditions are written as:

1. Instrument exogeneity: Cov(z2, u)=0 …………………….(2)

2. Instrument relevance: y2=π0+π1z1+π2z2+error …………….(3)

and π2≠0

Equation (3) includes all the exogenous variables. This equation is often called the reduced form equation.

In addition, z2 should not be a part of the structural equation (1). This is called the exclusion restriction.

Page 16:

Now, we have the following three conditions that can be used to obtain the IV estimators.

E(u)=0

Cov(z1,u)=0

Cov(z2,u)=0 (this is from instrument exogeneity)

The sample counterparts of these conditions are given in the next slide.

Page 17:

$$\sum_{i=1}^{n}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}\right)=0$$

If you divide it by n, this is the sample average of û.

$$\sum_{i=1}^{n} z_{i1}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}\right)=0$$

If you divide it by n-1, this is the sample covariance between z1 and û.

$$\sum_{i=1}^{n} z_{i2}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}\right)=0$$

If you divide it by n-1, this is the sample covariance between z2 and û.

This is a set of three equations with three unknowns: β̂0, β̂1, β̂2. The solutions to these equations are the IV estimators. There is a simple matrix expression for the IV estimators. However, we will not cover this during the class.
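Because the three sample moment conditions are linear in the three unknowns, they can be solved directly as a 3x3 linear system. The following sketch uses simulated data with made-up coefficients (true β1 = 0.5, β2 = -0.3) and hypothetical helper names; it is one possible way to compute the estimator, not the matrix expression referred to in the lecture.

```python
import random


def solve_linear(a, b):
    """Solve a small square system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def iv_moments(y1, y2, z1, z2):
    """Solve sum_i w_i * (y1_i - b0 - b1*y2_i - b2*z1_i) = 0 for w in (1, z1, z2)."""
    ones = [1.0] * len(y1)
    weights = [ones, z1, z2]      # one moment condition per exogenous variable
    regressors = [ones, y2, z1]   # variables multiplying b0, b1, b2
    a = [[sum(w * x for w, x in zip(wv, xv)) for xv in regressors] for wv in weights]
    b = [sum(w * y for w, y in zip(wv, y1)) for wv in weights]
    return solve_linear(a, b)


# Simulated data (assumed design): y2 is endogenous because it depends on u.
rng = random.Random(42)
n = 2000
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
y2 = [1.0 + 0.5 * a + 1.0 * b + 0.8 * e + rng.gauss(0, 1) for a, b, e in zip(z1, z2, u)]
y1 = [1.0 + 0.5 * d - 0.3 * a + e for d, a, e in zip(y2, z1, u)]

b0, b1, b2 = iv_moments(y1, y2, z1, z2)

# Check that the three sample moment conditions hold at the solution.
u_hat = [yi - b0 - b1 * di - b2 * ai for yi, di, ai in zip(y1, y2, z1)]
moments = [
    sum(u_hat),
    sum(a * e for a, e in zip(z1, u_hat)),
    sum(b * e for b, e in zip(z2, u_hat)),
]
print("IV estimates:", b0, b1, b2)
print("moment conditions:", moments)
```

The three sums printed at the end should be zero up to floating-point rounding, which is exactly what the sample moment conditions require.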

Page 18:

The above method can be easily extended to the case where there are more explanatory variables (but only one endogenous variable).

Consider the following model.

y1=β0+β1y2+β2z1+β3z2+β4z3+...+βkzk-1+u

Suppose that zk is the instrument for y2. Then the IV estimators are the solution to the following equations.

Page 19:

$$\sum_{i=1}^{n}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}-\dots-\hat\beta_k z_{i,k-1}\right)=0$$

$$\sum_{i=1}^{n} z_{i1}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}-\dots-\hat\beta_k z_{i,k-1}\right)=0$$

$$\vdots$$

$$\sum_{i=1}^{n} z_{ik}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}-\dots-\hat\beta_k z_{i,k-1}\right)=0$$

The solutions to the above equations are the IV estimators when there are many explanatory variables, but only one endogenous variable and one instrument.

Page 20:

Example

Consider the following model.

Log(wage)=β0+β1(educ)+β2Exper+β3Exper²+β4(SMSA)+β5(South)+u

Using college proximity (nearc4) as an IV for education, estimate the model. Use CARD.dta. nearc4 is a dummy variable for someone who grew up near a four-year college.

Page 21:

OLS

. reg lwage educ exper expersq smsa south

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F(  5,  3004) =  214.57
       Model |  155.959797     5  31.1919593           Prob > F      =  0.0000
    Residual |  436.681848  3004  .145366794           R-squared     =  0.2632
-------------+------------------------------           Adj R-squared =  0.2619
       Total |  592.641645  3009  .196956346           Root MSE      =  .38127

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0815797    .003499    23.31   0.000     .0747189    .0884405
       exper |   .0838357   .0067735    12.38   0.000     .0705545    .0971169
     expersq |  -.0022021   .0003238    -6.80   0.000    -.0028371   -.0015672
        smsa |   .1508006    .015836     9.52   0.000     .1197501    .1818511
       south |  -.1751761   .0146486   -11.96   0.000    -.2038985   -.1464537
       _cons |   4.611015    .067895    67.91   0.000     4.477889     4.74414
------------------------------------------------------------------------------

IV

. ivregress 2sls lwage exper expersq smsa south (educ=nearc4)

Instrumental variables (2SLS) regression               Number of obs =    3010
                                                       Wald chi2(5)  =  499.36
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.2051
                                                       Root MSE      =  .39562

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |     .13542   .0486085     2.79   0.005     .0401491     .230691
       exper |   .1067727   .0218136     4.89   0.000     .0640188    .1495266
     expersq |  -.0022553   .0003394    -6.64   0.000    -.0029205     -.00159
        smsa |   .1249987   .0284538     4.39   0.000     .0692302    .1807671
       south |  -.1409356   .0343705    -4.10   0.000    -.2083005   -.0735707
       _cons |   3.703427   .8201379     4.52   0.000     2.095986    5.310867
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq smsa south nearc4

Page 22:

. reg educ exper expersq smsa south nearc4, robust

Linear regression                                      Number of obs =    3010
                                                       F(  5,  3004) =  675.83
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4524
                                                       Root MSE      =  1.9825

------------------------------------------------------------------------------
             |               Robust
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.4258437   .0320651   -13.28   0.000    -.4887155    -.362972
     expersq |   .0009774   .0017044     0.57   0.566    -.0023646    .0043194
        smsa |   .3639914   .0863314     4.22   0.000     .1947167    .5332661
       south |   -.582683   .0743531    -7.84   0.000    -.7284712   -.4368948
      nearc4 |   .3456458   .0824092     4.19   0.000     .1840616      .50723
       _cons |   16.68131   .1489113   112.02   0.000     16.38933    16.97329
------------------------------------------------------------------------------

Check whether nearc4 satisfies instrument relevance. Using the t-test on nearc4 (t = 4.19), we can reject the null hypothesis that nearc4 is not correlated with educ after controlling for all other exogenous variables.

Page 23:

Case 2: One endogenous variable, more than one instrument. Two stage least squares

Consider the following model with one endogenous variable.

y1=β0+β1y2+β2z1+u

Now, suppose that you have two instruments for y2 that satisfy the instrument conditions. Call them z2 and z3.

Page 24:

You can apply the IV method using either z2 or z3. But this produces two different estimators. Moreover, neither of them is efficient.

Now, I will show you a more efficient estimator.

First, it is important to lay out the instrument conditions.

Page 25:

For z2 and z3 to be valid instruments, they have to satisfy the following two conditions.

1. Instrument exogeneity: Cov(z2, u)=0 and Cov(z3, u)=0

2. Instrument relevance: y2=π0+π1z1+π2z2+π3z3+error

and π2≠0 or π3≠0

The reduced form equation in condition 2 should include all the exogenous variables.

In addition, z2 and z3 should not be a part of the structural equation. These are called the exclusion restrictions.

Page 26:

Now, I will explain the estimation method.

Instead of using only one instrument, we use a linear combination of z2 and z3 as the instrument.

Since a linear combination of z2 and z3 also satisfies the instrument conditions, this is a valid method.

The question is how to find the best linear combination of z2 and z3.

Page 27:

It turns out that OLS regression of the following model provides the best linear combination.

y2=π0+π1z1+π2z2+π3z3+error

After you estimate this model, you get the predicted value of y2:

$$\hat y_2 = \hat\pi_0 + \hat\pi_1 z_1 + \hat\pi_2 z_2 + \hat\pi_3 z_3$$

Since ŷ2 is a combination of variables which are not correlated with u, ŷ2 is not correlated with u either. At the same time, ŷ2 is correlated with y2. Thus ŷ2 is a valid instrument.

Page 28:

Thus, we have the following three conditions that can be used to derive an IV estimator.

E(u)=0

Cov(z1,u)=0

Cov(ŷ2,u)=0

The sample counterparts of the above equations are given by:

Page 29:

$$\sum_{i=1}^{n}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}\right)=0$$

$$\sum_{i=1}^{n} z_{i1}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}\right)=0$$

$$\sum_{i=1}^{n} \hat y_{i2}\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}\right)=0$$

This is a set of three equations with three unknowns: β̂0, β̂1, β̂2.

The solutions to these equations are a special type of IV estimator called the two stage least squares estimators.

Page 30:

You can estimate these parameters by following the above procedure.

There is an alternative and equivalent procedure to estimate these parameters. This procedure will give you an idea of why it is called two stage least squares.

Page 31:

The estimation procedure of two stage least squares (2SLS)

Stage 1. Estimate the following model using OLS and get the predicted value of y2, denoted ŷ2:

y2=π0+π1z1+π2z2+π3z3+error

Make sure to include all the exogenous variables.

Stage 2. Replace y2 with ŷ2, then estimate the following model using OLS:

y1=β0+β1ŷ2+β2z1+error

The OLS estimators of the coefficients are the two stage least squares (2SLS) estimators.
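The two stages, and their equivalence to using ŷ2 as the instrument in the moment conditions, can be sketched in a few lines. The data are simulated under an assumed design (true β1 = 0.5, with y2 made endogenous through u), and all helper names are hypothetical; this is an illustration of the coefficients only and computes no standard errors.

```python
import random


def solve_linear(a, b):
    """Solve a small square system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def cross(ws, xs):
    """Return the matrix whose (j, k) entry is sum_i ws[j][i] * xs[k][i]."""
    return [[sum(p * q for p, q in zip(w, x)) for x in xs] for w in ws]


def iv(y, xs, ws):
    """Solve the moment conditions sum_i w_i * (y_i - x_i'b) = 0 for b."""
    return solve_linear(cross(ws, xs), [row[0] for row in cross(ws, [y])])


def ols(y, xs):
    """OLS is the special case where the regressors serve as their own instruments."""
    return iv(y, xs, xs)


# Simulated data (assumed design): z1, z2, z3 exogenous, y2 endogenous via u.
rng = random.Random(1)
n = 1000
ones = [1.0] * n
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
z3 = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
y2 = [1.0 + 0.5 * a + 0.8 * b + 0.6 * c + 0.7 * e + rng.gauss(0, 1)
      for a, b, c, e in zip(z1, z2, z3, u)]
y1 = [1.0 + 0.5 * d - 0.3 * a + e for d, a, e in zip(y2, z1, u)]

# Stage 1: regress y2 on ALL exogenous variables and form the fitted values.
pi_hat = ols(y2, [ones, z1, z2, z3])
y2_hat = [pi_hat[0] + pi_hat[1] * a + pi_hat[2] * b + pi_hat[3] * c
          for a, b, c in zip(z1, z2, z3)]

# Stage 2: regress y1 on the fitted values and the included exogenous variable.
b_2sls = ols(y1, [ones, y2_hat, z1])

# Equivalent route: keep y2 as the regressor and use y2_hat as its instrument.
b_iv = iv(y1, [ones, y2, z1], [ones, z1, y2_hat])

# Plain OLS on the structural equation, for comparison.
b_ols = ols(y1, [ones, y2, z1])

print("2SLS (two stages):   ", b_2sls)
print("IV with y2_hat:      ", b_iv)
print("OLS (inconsistent):  ", b_ols)
```

The first two printed coefficient vectors should agree up to rounding, while the OLS coefficient on y2 should sit noticeably above the assumed true value of 0.5.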

Page 32:

Estimating the standard errors for two stage least squares.

When you exactly follow the two stage procedure explained in the previous slide, you get the correct 2SLS coefficients. But you do not get the correct standard errors.

So, after applying the procedure, you have to do some extra work to estimate the standard errors.

Under the homoskedasticity assumption, the valid standard errors are computed as follows.

Page 33:

1. Estimate the 2SLS coefficients, then estimate the variance of u as

$$\hat\sigma^2 = \frac{1}{n-k-1}\sum_{i=1}^{n}\hat u_i^2$$

where k is the number of explanatory variables in the structural equation and

$$\hat u_i = y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}$$

Note that you use y2, not ŷ2. The coefficients are the 2SLS estimates.

2. Then the estimated variance of β̂1, the coefficient on the endogenous variable y2, is given by

$$\widehat{\mathrm{Var}}(\hat\beta_1) = \frac{\hat\sigma^2}{\widehat{SST}_2\,(1-\hat R_2^2)}$$

where $\widehat{SST}_2$ is the total variation of ŷ2, and $\hat R_2^2$ is the R-squared from regressing ŷ2 on all other exogenous variables appearing in the structural equation.
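One way to see why the manual second stage reports the wrong standard errors: the OLS software builds its variance estimate from the second-stage residuals, which are not the û defined above. Writing ê_i for the second-stage residual (a symbol introduced here only for illustration) in the model y1=β0+β1y2+β2z1+u, the two are related by

```latex
\begin{aligned}
\hat e_i &= y_{i1}-\hat\beta_0-\hat\beta_1 \hat y_{i2}-\hat\beta_2 z_{i1} \\
         &= \underbrace{\left(y_{i1}-\hat\beta_0-\hat\beta_1 y_{i2}-\hat\beta_2 z_{i1}\right)}_{\hat u_i}
            \;+\; \hat\beta_1\left(y_{i2}-\hat y_{i2}\right)
\end{aligned}
```

So the second-stage residual mixes û with the first-stage residual y2 − ŷ2, and a variance estimate based on ê does not estimate the variance of u.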

Page 34:

The square root of the variance in the previous slide is the corresponding standard error.

Page 35:

Note

STATA automatically estimates the 2SLS model and calculates the correct standard errors.

In most cases, you should avoid estimating 2SLS “manually” (although it is a good exercise), since this does not provide you with the correct standard errors.

Page 36:

Exercise

Consider the following model.

Log(wage)=β0+β1(educ)+β2Exper+β3Exper²+u

1. Suppose educ is endogenous but exper and its square are exogenous. Using mother's and father's education as instruments, estimate the 2SLS model. Use Mroz.dta.

2. Manually estimate the model to check if you get the same coefficients. (Note that you will not get the correct standard errors.)

Page 37:

OLS

. reg lwage educ exper expersq, robust

Linear regression                                      Number of obs =     428
                                                       F(  3,   424) =   27.30
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1568
                                                       Root MSE      =  .66642

------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1074896    .013219     8.13   0.000     .0815068    .1334725
       exper |   .0415665    .015273     2.72   0.007     .0115462    .0715868
     expersq |  -.0008112   .0004201    -1.93   0.054    -.0016369    .0000145
       _cons |  -.5220406   .2016505    -2.59   0.010    -.9183996   -.1256815
------------------------------------------------------------------------------

Page 38:

The “first” option shows both the first stage and the second stage.

. ivregress 2sls lwage exper expersq (educ = motheduc fatheduc), first

First stage regression

First-stage regressions

                                                       Number of obs =     428
                                                       F(  4,   423) =   28.36
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2115
                                                       Adj R-squared =  0.2040
                                                       Root MSE      =  2.0390

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |   .0452254   .0402507     1.12   0.262    -.0338909    .1243417
     expersq |  -.0010091   .0012033    -0.84   0.402    -.0033744    .0013562
    motheduc |    .157597   .0358941     4.39   0.000      .087044    .2281501
    fatheduc |   .1895484   .0337565     5.62   0.000     .1231971    .2558997
       _cons |    9.10264   .4265614    21.34   0.000     8.264196    9.941084
------------------------------------------------------------------------------

2SLS results

Instrumental variables (2SLS) regression               Number of obs =     428
                                                       Wald chi2(3)  =   24.65
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.1357
                                                       Root MSE      =  .67155

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0613966   .0312895     1.96   0.050     .0000704    .1227228
       exper |   .0441704   .0133696     3.30   0.001     .0179665    .0703742
     expersq |   -.000899   .0003998    -2.25   0.025    -.0016826   -.0001154
       _cons |   .0481003    .398453     0.12   0.904    -.7328532    .8290538
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq motheduc fatheduc

Page 39:

Estimating 2SLS manually: If you run the first stage regression manually on this data, it uses more observations than the 2SLS regression above. To use exactly the same observations, first run the 2SLS regression and flag the observations used in it.

. ivregress 2sls lwage exper expersq (educ = motheduc fatheduc)

Instrumental variables (2SLS) regression               Number of obs =     428
                                                       Wald chi2(3)  =   24.65
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.1357
                                                       Root MSE      =  .67155

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0613966   .0312895     1.96   0.050     .0000704    .1227228
       exper |   .0441704   .0133696     3.30   0.001     .0179665    .0703742
     expersq |   -.000899   .0003998    -2.25   0.025    -.0016826   -.0001154
       _cons |   .0481003    .398453     0.12   0.904    -.7328532    .8290538
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq motheduc fatheduc

. gen fullsample= e(sample)

e(sample) lets you create a dummy variable that equals 1 if the observation was used in the regression.

Page 40:

Then, estimate the first stage regression. Note that “if fullsample==1” tells STATA to use an observation only if fullsample is 1.

. reg educ exper expersq motheduc fatheduc if fullsample==1

      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  4,   423) =   28.36
       Model |  471.620998     4   117.90525           Prob > F      =  0.0000
    Residual |  1758.57526   423  4.15738833           R-squared     =  0.2115
-------------+------------------------------           Adj R-squared =  0.2040
       Total |  2230.19626   427  5.22294206           Root MSE      =   2.039

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |   .0452254   .0402507     1.12   0.262    -.0338909    .1243417
     expersq |  -.0010091   .0012033    -0.84   0.402    -.0033744    .0013562
    motheduc |    .157597   .0358941     4.39   0.000      .087044    .2281501
    fatheduc |   .1895484   .0337565     5.62   0.000     .1231971    .2558997
       _cons |    9.10264   .4265614    21.34   0.000     8.264196    9.941084
------------------------------------------------------------------------------

After estimation, type the following command. This will automatically create the predicted value of educ.

. predict educ_hat, xb

Page 41:

Finally, estimate the second stage regression. You can see that the coefficients are the same as before, but the standard errors and t-statistics are different.

. reg lwage educ_hat exper expersq if fullsample==1

      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  3,   424) =    7.40
       Model |   11.117828     3  3.70594266           Prob > F      =  0.0001
    Residual |  212.209613   424   .50049437           R-squared     =  0.0498
-------------+------------------------------           Adj R-squared =  0.0431
       Total |  223.327441   427  .523015084           Root MSE      =  .70746

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    educ_hat |   .0613966   .0329624     1.86   0.063    -.0033933    .1261866
       exper |   .0441704   .0140844     3.14   0.002     .0164865    .0718543
     expersq |   -.000899   .0004212    -2.13   0.033    -.0017268   -.0000711
       _cons |   .0481003   .4197565     0.11   0.909    -.7769624     .873163
------------------------------------------------------------------------------

Page 42:

Case 3: More than one endogenous variable, more than one instrument

Consider the following structural equation.

y1=β0+β1y2+β2y3+β3z1+β4z2+β5z3+u1

There are two endogenous variables, y2 and y3. Thus, OLS will be biased. In order to estimate this model with the IV method, you need at least 2 instruments.

When you have multiple endogenous variables, you need at least as many instruments as endogenous variables.

Page 43:

Suppose you have 3 instruments: z4, z5, z6. As usual, these instruments should satisfy 2 conditions. The first is that they should not be correlated with u1 (instrument exogeneity). The second is that they should be correlated with the endogenous variables (instrument relevance). When you have multiple endogenous variables, the second condition has a more complex expression, and it is called the rank condition.

Page 44:

The estimation procedure

The 2SLS procedure when there is more than one endogenous variable is shown here.

y1=β0+β1y2+β2y3+β3z1+β4z2+β5z3+u1

Suppose you have three instruments: z4, z5, z6.

Page 45:

First stage: Estimate the following two reduced form regressions.

y2=π10+π11z1+π12z2+π13z3+π14z4+π15z5+π16z6+error

y3=π20+π21z1+π22z2+π23z3+π24z4+π25z5+π26z6+error

Then obtain the predicted values ŷ2 and ŷ3.

Second stage: Estimate the following ‘second stage regression’.

y1=β0+β1ŷ2+β2ŷ3+β3z1+β4z2+β5z3+error

The estimated coefficients are the 2SLS coefficients.

Page 46:

Note that the second stage regression does not produce the correct standard errors. The derivation of the exact formula for the standard errors is not the focus of this course. Stata's ivregress command automatically computes the correct standard errors.

Page 47:

Testing multiple hypotheses

In the 2SLS method, the F statistic formula we used for OLS is no longer valid. STATA automatically computes a valid F-type statistic for 2SLS.