03.Econ427_2016_SW6

Jul 07, 2018

Edith Kua
  • 8/19/2019 03.Econ427_2016_SW6

    Copyright © 2011 Pearson Addison-Wesley. All rights reserved.

    UBC Okanagan

    Dr. Mohsen Javdani
    Winter 1, 2015/2016

    Econ 427
    Econometrics
    (From Stock and Watson's Introduction to Econometrics)
    Chapter 6

    Outline

    1. Omitted variable bias
    2. Causality and regression analysis
    3. Multiple regression and OLS
    4. Measures of fit
    5. Sampling distribution of the OLS estimator

    Omitted Variable Bias (SW Section 6.1)

    The error u arises because of factors, or variables, that influence Y but are not included in the regression function. There are always omitted variables.

    Sometimes, the omission of those variables can lead to bias in the OLS estimator.

    Omitted variable bias, ctd.

    The bias in the OLS estimator that occurs as a result of an omitted factor, or variable, is called omitted variable bias. For omitted variable bias to occur, the omitted variable "Z" must satisfy two conditions.

    The two conditions for omitted variable bias:
    1. Z is a determinant of Y (i.e. Z is part of u); and
    2. Z is correlated with the regressor X (i.e. corr(Z, X) ≠ 0)

    Both conditions must hold for the omission of Z to result in omitted variable bias.

    Omitted variable bias, ctd.

    In the test score example:
    1. English language ability (whether the student has English as a second language) plausibly affects standardized test scores: Z is a determinant of Y.
    2. Immigrant communities tend to be less affluent and thus have smaller school budgets and higher STR: Z is correlated with X.

    Accordingly, $\hat{\beta}_1$ is biased. What is the direction of this bias?
     - What does common sense suggest?
     - If common sense fails you, there is a formula...

    Omitted variable bias, ctd.

    A formula for omitted variable bias: recall the equation,

    $\hat{\beta}_1 - \beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X}) u_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\left(\frac{n-1}{n}\right) s_X^2}$

    where $v_i = (X_i - \bar{X}) u_i \approx (X_i - \mu_X) u_i$.
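A brief sketch of the large-sample step that links this equation to the omitted variable bias formula. It uses only the definition $v_i \approx (X_i - \mu_X) u_i$ and the standard definition of the correlation, $\rho_{Xu} = \sigma_{Xu} / (\sigma_X \sigma_u)$; the covariance symbol $\sigma_{Xu}$ is notation introduced here for illustration and does not appear on the slides.

```latex
\frac{1}{n}\sum_{i=1}^{n} v_i \xrightarrow{p} E\left[(X_i - \mu_X)\,u_i\right] = \sigma_{Xu},
\qquad
\left(\frac{n-1}{n}\right) s_X^2 \xrightarrow{p} \sigma_X^2
\\[6pt]
\hat{\beta}_1 - \beta_1 \xrightarrow{p} \frac{\sigma_{Xu}}{\sigma_X^2}
= \frac{\rho_{Xu}\,\sigma_X\,\sigma_u}{\sigma_X^2}
= \left(\frac{\sigma_u}{\sigma_X}\right)\rho_{Xu}
```

When the conditional mean of u given X is zero, $\sigma_{Xu} = 0$ and the bias term vanishes; when an omitted variable makes X and u correlated, it does not.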

    The omitted variable bias formula:

    $\hat{\beta}_1 \xrightarrow{p} \beta_1 + \left(\frac{\sigma_u}{\sigma_X}\right) \rho_{Xu}$

    • If an omitted variable Z is both:
      1. a determinant of Y (that is, it is contained in u); and
      2. correlated with X,
      then $\rho_{Xu} \neq 0$ and the OLS estimator $\hat{\beta}_1$ is biased and is not consistent.
    • For example, districts with few ESL students (1) do better on standardized tests and (2) have smaller classes (bigger budgets), so ignoring the effect of having many ESL students would result in overstating the class size effect. Is this actually going on in the CA data?
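The bias formula can be checked numerically. The following is a minimal simulation sketch, not part of the original slides: the data-generating process, the coefficient values (`BETA1`, `beta2`), the 0.5 loading of X on Z, and all function names are assumptions chosen for illustration. Z lowers Y and is positively correlated with X, mimicking the ESL story, and the regression of Y on X alone omits Z.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divisor n); cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Slope from regressing y on x with an intercept."""
    return cov(x, y) / cov(x, x)


def simulate(n=20000, beta1=-1.0, beta2=-0.65, seed=0):
    """Hypothetical data: Z is omitted, so it ends up inside the error u."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [0.5 * zi + rng.gauss(0.0, 1.0) for zi in z]    # X correlated with Z
    u = [beta2 * zi + rng.gauss(0.0, 1.0) for zi in z]  # u contains Z
    y = [beta1 * xi + ui for xi, ui in zip(x, u)]
    return x, y, u


BETA1 = -1.0
x, y, u = simulate(beta1=BETA1)

slope = ols_slope(x, y)
rho_xu = cov(x, u) / math.sqrt(cov(x, x) * cov(u, u))
bias_formula = math.sqrt(cov(u, u) / cov(x, x)) * rho_xu  # (sigma_u / sigma_X) * rho_Xu

print("true beta1:      ", BETA1)
print("estimated slope: ", slope)
print("slope - beta1:   ", slope - BETA1)
print("formula bias:    ", bias_formula)
```

In this setup the population bias is beta2 * cov(X, Z) / var(X) = -0.65 * 0.5 / 1.25 = -0.26, so the estimated slope is more negative than the true effect, which is the "overstating the class size effect" direction described above.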

    • Districts with fewer English Learners have higher test scores
    • Districts with lower percent EL (PctEL) have smaller classes
    • Among districts with comparable PctEL, the effect of class size is small (recall overall "test score gap" = 7.4)

    Causality and regression analysis

    • The test score/STR/fraction English Learners example shows that, if an omitted variable satisfies the two conditions for omitted variable bias, then the OLS estimator in the regression omitting that variable is biased and inconsistent. So, even if n is large, $\hat{\beta}_1$ will not be close to $\beta_1$.
    • This raises a deeper question: how do we define $\beta_1$? That is, what precisely do we want to estimate when we run a regression?

    What precisely do we want to estimate when we run a regression?

    There are (at least) three possible answers to this question:

    1. We want to estimate the slope of a line through a scatterplot as a simple summary of the data to which we attach no substantive meaning.

    This can be useful at times, but isn't very interesting intellectually and isn't what this course is about.

    2. We want to make forecasts, or predictions, of the value of Y for an entity not in the data set, for which we know the value of X.

    Forecasting is an important job for economists, and excellent forecasts are possible using regression methods without needing to know causal effects. We will return to forecasting later in the course.

    3. We want to estimate the causal effect on Y of a change in X.

    This is why we are interested in the class size effect. Suppose the school board decided to cut class size by 2 students per class. What would be the effect on test scores? This is a causal question (what is the causal effect on test scores of STR?), so we need to estimate this causal effect. Except when we discuss forecasting, the aim of this course is the estimation of causal effects using regression methods.

    What, precisely, is a causal effect?

    • "Causality" is a complex concept!
    • In this course, we take a practical approach to defining causality:

    A causal effect is defined to be the effect measured in an ideal randomized controlled experiment.

    Ideal

    Back to class size:

    Imagine an ideal randomized controlled experiment for measuring the effect on Test Score of reducing STR...

    • In that experiment, students would be randomly assigned to classes, which would have different sizes.
    • Because they are randomly assigned, all student characteristics (and thus $u_i$) would be distributed independently of $STR_i$.
    • Thus, $E(u_i \mid STR_i) = 0$; that is, LSA #1 holds in a randomized controlled experiment.

    How does our observational data differ from this ideal?

    • The treatment is not randomly assigned
    • Consider PctEL, the percent English learners in the district. It plausibly satisfies the two criteria for omitted variable bias: Z = PctEL is:
      1. a determinant of Y; and
      2. correlated with the regressor X.
    • Thus, the "control" and "treatment" groups differ in a systematic way, so corr(STR, PctEL) ≠ 0

    • Randomization + control group means that any differences between the treatment and control groups are random, not systematically related to the treatment
    • We can eliminate the difference in PctEL between the large (control) and small (treatment) groups by examining the effect of class size among districts with the same PctEL.
       - If the only systematic difference between the large and small class size groups is in PctEL, then we are back to the randomized controlled experiment, within each PctEL group.
       - This is one way to "control" for the effect of PctEL when estimating the effect of STR.

    Return to omitted variable bias

    Three ways to overcome omitted variable bias

    1. Run a randomized controlled experiment in which treatment (STR) is randomly assigned: then PctEL is still a determinant of TestScore, but PctEL is uncorrelated with STR. (This solution to OV bias is rarely feasible.)
    2. Adopt the "cross tabulation" approach, with finer gradations of STR and PctEL: within each group, all classes have the same PctEL, so we control for PctEL. (But soon you will run out of data, and what about other determinants like family income and parental education?)
    3. Include the omitted variable (PctEL) as an additional regressor in a multiple regression.

    The Population Multiple Regression Model

    Interpretation of coefficients in multiple regression

    $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, \ldots, n$

    Consider changing $X_1$ by $\Delta X_1$ while holding $X_2$ constant:

    Population regression line before the change:

    $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

    Population regression line, after the change:

    $Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$

    Before: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

    After: $Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$

    Difference: $\Delta Y = \beta_1 \Delta X_1$

    So:

    $\beta_1 = \frac{\Delta Y}{\Delta X_1}$, holding $X_2$ constant

    $\beta_2 = \frac{\Delta Y}{\Delta X_2}$, holding $X_1$ constant

    $\beta_0$ = predicted value of Y when $X_1 = X_2 = 0$.

    The OLS Estimator in Multiple Regression

    Example: the California test score data

    Regression of TestScore against STR:

    $\widehat{TestScore} = 698.9 - 2.28 \times STR$

    Now include percent English Learners in the district (PctEL):

    $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$

    • What happens to the coefficient on STR?
    • Why? (Note: corr(STR, PctEL) = 0.19)

    Multiple regression in STATA

    reg testscr str pctel, robust;

    Regression with robust standard errors                 Number of obs =     420
                                                           F(  2,   417) =  223.82
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.4264
                                                           Root MSE      =  14.464
    ------------------------------------------------------------------------------
                 |               Robust
         testscr |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             str |  -1.101296   .4328472    -2.54   0.011     -1.95213   -.2504616
           pctel |  -.6497768   .0310318   -20.94   0.000     -.710775   -.5887786
           _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
    ------------------------------------------------------------------------------

    $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$

    More on this printout later...

    Measures of Fit for Multiple Regression

    SER and RMSE

    As in regression with a single regressor, the SER and the RMSE are measures of the spread of the Ys around the regression line:

    $SER = \sqrt{\frac{1}{n - k - 1} \sum_{i=1}^{n} \hat{u}_i^2}$

    $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \hat{u}_i^2}$
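The two formulas differ only in the divisor: the SER applies a degrees-of-freedom correction for the k regressors plus the intercept, while the RMSE divides by n. A minimal sketch, assuming a small made-up residual vector (the values in `residuals` are hypothetical, not from the California data):

```python
import math


def ser(residuals, k):
    """Standard error of the regression: divisor n - k - 1."""
    n = len(residuals)
    ssr = sum(e * e for e in residuals)
    return math.sqrt(ssr / (n - k - 1))


def rmse(residuals):
    """Root mean squared error: divisor n."""
    n = len(residuals)
    ssr = sum(e * e for e in residuals)
    return math.sqrt(ssr / n)


# Hypothetical OLS residuals from a regression with k = 2 regressors.
# They sum to zero, as residuals from a regression with an intercept do.
residuals = [1.0, -2.0, 0.5, 1.5, -1.0, 0.0]

print("SER: ", ser(residuals, k=2))
print("RMSE:", rmse(residuals))
```

Because n - k - 1 < n, the SER is always at least as large as the RMSE; with n = 420 and k = 2, as in the test score regressions, the two are nearly identical.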

    $R^2$ and $\bar{R}^2$ (adjusted $R^2$)

    The $R^2$ is the fraction of the variance explained, with the same definition as in regression with a single regressor:

    $R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$,

    where $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{\hat{Y}})^2$, $SSR = \sum_{i=1}^{n} \hat{u}_i^2$, $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$.

    The $R^2$ never decreases, and almost always increases, when you add another regressor (why?), which is a bit of a problem for a measure of "fit".

    $R^2$ and $\bar{R}^2$, ctd.

    The $\bar{R}^2$ (the "adjusted $R^2$") corrects this problem by "penalizing" you for including another regressor: the $\bar{R}^2$ does not necessarily increase when you add another regressor.

    Adjusted $R^2$: $\bar{R}^2 = 1 - \left(\frac{n - 1}{n - k - 1}\right) \frac{SSR}{TSS}$

    Note that $\bar{R}^2 < R^2$; however, if n is large the two will be very close.
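As a quick arithmetic check, the adjustment can be applied to the figures in the Stata printout for the regression of testscr on str and pctel (R-squared = 0.4264, 420 observations, k = 2 regressors). The function name `adjusted_r2` is an illustrative choice; the sketch relies only on the identity SSR/TSS = 1 - R².

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (n - 1)/(n - k - 1) * SSR/TSS, with SSR/TSS = 1 - R^2."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


# Values read off the Stata printout for the regression on str and pctel.
r2_bar = adjusted_r2(r2=0.4264, n=420, k=2)
print(round(r2_bar, 3))  # 0.424
```

With n = 420 the penalty factor (n - 1)/(n - k - 1) = 419/417 is barely above 1, which is why the two measures are so close here.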

    Measures of fit, ctd.

    Test score example:

    (1) $\widehat{TestScore} = 698.9 - 2.28 \times STR$,
        $R^2 = .05$, $SER = 18.6$

    (2) $\widehat{TestScore} = 686.0 - 1.10 \times STR - 0.65 \times PctEL$,
        $R^2 = .426$, $\bar{R}^2 = .424$, $SER = 14.5$

    • What, precisely, does this tell you about the fit of regression (2) compared with regression (1)?
    • Why are the $R^2$ and the $\bar{R}^2$ so close in (2)?

    The Least Squares Assumptions for Multiple Regression

    Assumption #1: the conditional mean of u given the included Xs is zero.

    $E(u \mid X_1 = x_1, \ldots, X_k = x_k) = 0$

    • This has the same interpretation as in regression with a single regressor.
    • Failure of this condition leads to omitted variable bias; specifically, if an omitted variable
      1. belongs in the equation (so is in u) and
      2. is correlated with an included X
      then this condition fails and there is OV bias.
    • The best solution, if possible, is to include the omitted variable in the regression.
    • A second, related solution is to include a variable that controls for the omitted variable (discussed in Ch. 7)

    Assumption #2: $(X_{1i}, \ldots, X_{ki}, Y_i)$, $i = 1, \ldots, n$, are i.i.d.

    This is satisfied automatically if the data are collected by simple random sampling.

    Assumption #3: large outliers are rare (finite fourth moments)

    This is the same assumption as we had before for a single regressor. As in the case of a single regressor, OLS can be sensitive to large outliers, so you need to check your data (scatterplots!) to make sure there are no crazy values (typos or coding errors).

    Assumption #4: There is no perfect multicollinearity

    Perfect multicollinearity is when one of the regressors is an exact linear function of the other regressors.

    Example: Suppose you accidentally include STR twice:

    regress testscr str str, robust

    Regression with robust standard errors                 Number of obs =     420
                                                           F(  1,   418) =   19.26
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.0512
                                                           Root MSE      =  18.581
    -------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
    ---------+----------------------------------------------------------------
         str |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
         str |  (dropped)
       _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057
    -------------------------------------------------------------------------

    Perfect multicollinearity is when one of the regressors is an exact linear function of the other regressors.

    • In the previous regression, $\beta_1$ is the effect on TestScore of a unit change in STR, holding STR constant (???)
    • We will return to perfect (and imperfect) multicollinearity shortly, with more examples...
    • With these least squares assumptions in hand, we now can derive the sampling distribution of $\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_k$.

    The Sampling Distribution of the OLS Estimator (SW Section 6.6)

    Multicollinearity, Perfect and Imperfect (SW Section 6.7)

    Perfect multicollinearity is when one of the regressors is an exact linear function of the other regressors.

    Some more examples of perfect multicollinearity:

    1. The example from before: you include STR twice.
    2. Regress TestScore on a constant, D, and B, where $D_i = 1$ if STR ≤ 20, = 0 otherwise; $B_i = 1$ if STR > 20, = 0 otherwise. Then $B_i = 1 - D_i$ and there is perfect multicollinearity.
    3. Would there be perfect multicollinearity if the intercept (constant) were excluded from this regression?

    This example is a special case of...

    The dummy variable trap

    Suppose you have a set of multiple binary (dummy) variables, which are mutually exclusive and exhaustive; that is, there are multiple categories and every observation falls in one and only one category (Freshmen, Sophomores, Juniors, Seniors, Other). If you include all these dummy variables and a constant, you will have perfect multicollinearity. This is sometimes called the dummy variable trap.

    • Why is there perfect multicollinearity here?
    • Solutions to the dummy variable trap:
      1. Omit one of the groups (e.g. Senior), or
      2. Omit the intercept
    • What are the implications of (1) or (2) for the interpretation of the coefficients?
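The trap can be made concrete by comparing the rank of the regressor matrix with its number of columns: when the rank is smaller, one column is an exact linear function of the others. The sketch below is illustrative only; the seven-observation sample, the helper names `matrix_rank` and `design`, and the use of exact fractions are all assumptions, not part of the slides.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot: this column depends on earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][col] != 0:
                factor = m[i][col] / m[rank][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def design(sample, categories, intercept):
    """Regressor matrix: optional constant column plus one dummy per category."""
    rows = []
    for obs in sample:
        dummies = [1 if obs == c else 0 for c in categories]
        rows.append(([1] if intercept else []) + dummies)
    return rows


CATEGORIES = ["Freshman", "Sophomore", "Junior", "Senior", "Other"]
# Hypothetical sample: each observation falls in exactly one category.
sample = ["Freshman", "Senior", "Junior", "Other", "Sophomore", "Senior", "Freshman"]

trap = design(sample, CATEGORIES, intercept=True)
drop_one = design(sample, [c for c in CATEGORIES if c != "Senior"], intercept=True)
no_const = design(sample, CATEGORIES, intercept=False)

for name, X in [("all dummies + constant", trap),
                ("omit Senior", drop_one),
                ("omit intercept", no_const)]:
    print(name, "| columns:", len(X[0]), "| rank:", matrix_rank(X))
```

With all five dummies and a constant there are six columns but only rank five, because the dummies sum to the constant in every row. Either remedy listed above restores full column rank.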

    Perfect multicollinearity, ctd.

    • Perfect multicollinearity usually reflects a mistake in the definitions of the regressors, or an oddity in the data
    • If you have perfect multicollinearity, your statistical software will let you know, either by crashing or giving an error message or by "dropping" one of the variables arbitrarily
    • The solution to perfect multicollinearity is to modify your list of regressors so that you no longer have perfect multicollinearity.

    Imperfect multicollinearity

    Imperfect and perfect multicollinearity are quite different despite the similarity of the names.

    Imperfect multicollinearity occurs when two or more regressors are very highly correlated.

    • Why the term "multicollinearity"? If two regressors are very highly correlated, then their scatterplot will pretty much look like a straight line (they are "co-linear"), but unless the correlation is exactly ±1, that collinearity is imperfect.

    Imperfect multicollinearity, ctd.

    Imperfect multicollinearity implies that one or more of the regression coefficients will be imprecisely estimated.

    • The idea: the coefficient on $X_1$ is the effect of $X_1$ holding $X_2$ constant; but if $X_1$ and $X_2$ are highly correlated, there is very little variation in $X_1$ once $X_2$ is held constant, so the data don't contain much information about what happens when $X_1$ changes but $X_2$ doesn't. If so, the variance of the OLS estimator of the coefficient on $X_1$ will be large.
    • Imperfect multicollinearity (correctly) results in large standard errors for one or more of the OLS coefficients.
    • The math? See SW, App. 6.2
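For orientation, the standard large-sample result for the special case of two regressors and homoskedastic errors is sketched below. It is offered as an illustration of the idea in the bullets above; whether it matches the appendix's notation exactly is an assumption.

```latex
\sigma^2_{\hat{\beta}_1}
= \frac{1}{n}
  \left( \frac{1}{1 - \rho^2_{X_1, X_2}} \right)
  \frac{\sigma_u^2}{\sigma^2_{X_1}}
```

As the correlation $\rho_{X_1, X_2}$ between the two regressors approaches ±1, the middle factor grows without bound, so the variance of $\hat{\beta}_1$ becomes large even when n is large.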

    Next topic: hypothesis tests and confidence intervals...