Few notes on panel data (materials by Alan Manning) Development Workshop

Jan 13, 2016

Page 1: Few notes on panel data (materials by Alan Manning)

Few notes on panel data (materials by Alan Manning)

Development Workshop

Page 2: Few notes on panel data (materials by Alan Manning)

A Brief Introduction to Panel Data

Panel Data has both time-series and cross-section dimension – N individuals over T periods

Will restrict attention to balanced panels – same number of observations on each individual

Whole books have been written about panel data, but the basics can be understood very simply and are not very different from what we have seen before

Asymptotics typically done on large N, small T

Use yit to denote variable for individual i at time t

Page 3: Few notes on panel data (materials by Alan Manning)

The Pooled Model

Can simply ignore panel nature of data and estimate:

yit=β’xit+εit

This will be consistent if E(εit|xit)=0 or plim(X’ε/N)=0

But computed standard errors will only be consistent if errors are uncorrelated across observations

This is unlikely:

– Correlation between residuals of same individual in different time periods

– Correlation between residuals of different individuals in same time period (aggregate shocks)
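A minimal simulation sketch of the first point, assuming a data-generating process that is not in the original slides (an individual effect `theta` that is independent of x, with T=2). Pooled OLS recovers the slope, but the residuals of the same individual are clearly correlated across the two periods, which is what invalidates the usual standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 2000, 2                              # hypothetical panel dimensions
ids = np.repeat(np.arange(N), T)            # rows ordered by individual, then time

theta = rng.normal(size=N)                  # individual effect, independent of x here
x = rng.normal(size=N * T)
y = 1.0 + 2.0 * x + theta[ids] + rng.normal(size=N * T)

# Pooled OLS: ignore the panel structure entirely
X = np.column_stack([np.ones(N * T), x])
beta_pooled, *_ = np.linalg.lstsq(X, y, rcond=None)

# Correlation between period-1 and period-2 residuals of the same individual
resid = (y - X @ beta_pooled).reshape(N, T)
within_corr = np.corrcoef(resid[:, 0], resid[:, 1])[0, 1]

print("pooled slope:", beta_pooled[1])
print("residual correlation within individual:", within_corr)
```

In this design the correlation should be near Var(θ)/(Var(θ)+Var(ε)) = 0.5, since the individual effect sits in the pooled residual.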

Page 4: Few notes on panel data (materials by Alan Manning)

A More Plausible Model

Should recognise this as model with ‘group-level’ dummies or residuals

Here, individual is a ‘group’

$y_{it} = \beta' x_{it} + \theta' D_i + \varepsilon_{it}$

$y_{it} = \beta' x_{it} + \theta_i + \varepsilon_{it}$

Page 5: Few notes on panel data (materials by Alan Manning)

Three Models

Fixed Effects Model

– Treats θi as parameter to be estimated (like β)

– Consistency does not require any assumption about the correlation between θi and xit

Random Effects Model

– Treats θi as part of residual (like ε)

– Consistency does require no correlation between θi and xit

Between-Groups Model

– Runs regression on averages for each individual

Page 6: Few notes on panel data (materials by Alan Manning)

The fixed effect estimator of β will be consistent if:

a. E(εit|xit)=0

b. Rank(X,D)=N+K

Proof: Simple application of what you should know about linear regression model

Page 7: Few notes on panel data (materials by Alan Manning)

Intuition

First condition should be obvious – regressors uncorrelated with residuals

Second condition requires regressors to be of full rank

Main way in which this is likely to fail in the fixed effects model is if some regressors vary only across individuals and not over time

Such a variable is perfectly collinear with the individual fixed effects
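The rank condition can be checked numerically. The sketch below uses made-up data (the names `x_varying` and `x_constant` are illustrative, not from the slides) and compares the rank of (X, D) for a regressor that varies over time with one that is constant within each individual.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 3                                  # hypothetical small panel
ids = np.repeat(np.arange(N), T)

# D: one dummy column per individual (NT x N)
D = (ids[:, None] == np.arange(N)).astype(float)

x_varying = rng.normal(size=N * T)                 # varies over time
x_constant = np.repeat(rng.normal(size=N), T)      # constant within individual

rank_ok = np.linalg.matrix_rank(np.column_stack([x_varying, D]))
rank_bad = np.linalg.matrix_rank(np.column_stack([x_constant, D]))

print(rank_ok, "= N + K with K = 1")         # full column rank
print(rank_bad, "< N + K")                   # x_constant lies in the span of D
```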

Page 8: Few notes on panel data (materials by Alan Manning)

Estimating the Fixed Effects Model

Can estimate by ‘brute force’ - include separate dummy variable for every individual – but may be a lot of them

Can also estimate in mean-deviation form:

$\bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}$

$\tilde{y}_{it} = y_{it} - \bar{y}_i$

Page 9: Few notes on panel data (materials by Alan Manning)

How does de-meaning work?

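$y_{it} = \beta' x_{it} + \theta_i + \varepsilon_{it}$

$\bar{y}_i = \beta' \bar{x}_i + \theta_i + \bar{\varepsilon}_i$

$\tilde{y}_{it} = \beta' \tilde{x}_{it} + \tilde{\varepsilon}_{it}$

Can do simple OLS on de-meaned variables

STATA command is like: xtreg y x, fe i(id)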
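A sketch of the equivalence between the 'brute force' dummy-variable regression and OLS on de-meaned data, using NumPy rather than Stata. The simulated design is an assumption made for illustration: x is deliberately correlated with the individual effect, so pooled OLS is biased while both fixed effects computations agree.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 200, 5, 1.5                     # hypothetical design
ids = np.repeat(np.arange(N), T)

theta = rng.normal(size=N)
x = theta[ids] + rng.normal(size=N * T)      # regressor correlated with theta_i
y = beta * x + theta[ids] + rng.normal(size=N * T)

def demean(v):
    """Subtract each individual's time average from its observations."""
    means = np.bincount(ids, weights=v) / T
    return v - means[ids]

# Within estimator: OLS on de-meaned variables
x_dm, y_dm = demean(x), demean(y)
beta_within = (x_dm @ y_dm) / (x_dm @ x_dm)

# Brute force: regress y on x and a full set of individual dummies
D = (ids[:, None] == np.arange(N)).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
beta_lsdv = coef[0]

# Pooled OLS with an intercept, for comparison
Xp = np.column_stack([np.ones(N * T), x])
beta_pooled = np.linalg.lstsq(Xp, y, rcond=None)[0][1]

print(beta_within, beta_lsdv, beta_pooled)
```

The two fixed effects numbers coincide up to floating-point error, while the pooled slope is pushed upward by the correlation between x and θi.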

Page 10: Few notes on panel data (materials by Alan Manning)

Problems with fixed effect estimator

Only uses variation within individuals – sometimes called ‘within-group’ estimator

This variation may be small part of total (so low precision) and more prone to measurement error (so more attenuation bias)

Cannot use it to estimate effect of regressor that is constant for an individual

Page 11: Few notes on panel data (materials by Alan Manning)

Random Effects Estimator

• Treats θi as part of residual (like ε)

• Consistency does require no correlation between θi and xit

• Should recognise this as similar to a model with clustered standard errors

• But random effects estimator is feasible GLS estimator

$\hat{\beta}_{RE} = (X' \hat{\Omega}^{-1} X)^{-1} X' \hat{\Omega}^{-1} y$

Page 12: Few notes on panel data (materials by Alan Manning)

More on RE Estimator

Will not describe how we compute Ω-hat – see Wooldridge

STATA command: xtreg y x, re i(id)

Page 13: Few notes on panel data (materials by Alan Manning)

The random effects estimator of β will be consistent if:

a. E(εit|xi1,..xit,.. xiT)=0

b. E(θi|xi1,..xit,.. xiT)=0

c. Rank(X’Ω^{-1}X)=K

Proof: RE estimator a special case of the feasible GLS estimator so conditions for consistency are the same.

Error has two components so need a. and b.

Page 14: Few notes on panel data (materials by Alan Manning)

Comments

Assumption about exogeneity of errors is stronger than for FE model – need to assume εit uncorrelated with whole history of x – this is called strict exogeneity

Assumption about rank condition weaker than for FE model e.g. can estimate the effect of variables that are constant for a given individual

Page 15: Few notes on panel data (materials by Alan Manning)

Another reason why may prefer RE to FE model

If exogeneity assumptions are satisfied RE estimate will be more efficient than FE estimator

Application of general principle that imposing true restriction on data leads to efficiency gain.

Page 16: Few notes on panel data (materials by Alan Manning)

Another Useful Result

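Can show that RE estimator can be thought of as an OLS regression of:

$\check{y}_{it} = y_{it} - \lambda \bar{y}_i$

On:

$\check{x}_{it} = x_{it} - \lambda \bar{x}_i$

Where:

$\lambda = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T \sigma_\theta^2}}$

This is sometimes called quasi-time demeaning

See Wooldridge (ch10, pp286-7) if want to know more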
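A small numerical check of this result, under assumptions that go beyond the slide: the variance components are treated as known rather than estimated, the weight is written `lam` and set to 1 − sqrt(σ²ε/(σ²ε + Tσ²θ)) as in Wooldridge, and the data are simulated. With those choices, GLS with the random-effects covariance matrix and OLS on quasi-demeaned data (constant included) give the same coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 50, 4                                 # hypothetical small panel
sigma2_eps, sigma2_theta = 1.0, 2.0          # assumed known variance components

ids = np.repeat(np.arange(N), T)             # rows ordered by individual, then time
x = rng.normal(size=N * T)
theta = rng.normal(scale=np.sqrt(sigma2_theta), size=N)
eps = rng.normal(scale=np.sqrt(sigma2_eps), size=N * T)
y = 0.5 + 2.0 * x + theta[ids] + eps
X = np.column_stack([np.ones(N * T), x])

# GLS: Omega is block diagonal with one T x T block per individual
Sigma = sigma2_eps * np.eye(T) + sigma2_theta * np.ones((T, T))
Omega_inv = np.kron(np.eye(N), np.linalg.inv(Sigma))
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

# Quasi-time demeaning: subtract lam times the individual mean
lam = 1.0 - np.sqrt(sigma2_eps / (sigma2_eps + T * sigma2_theta))

def quasi_demean(v):
    means = np.bincount(ids, weights=v) / T
    return v - lam * means[ids]

Xq = np.column_stack([quasi_demean(X[:, 0]), quasi_demean(X[:, 1])])
beta_qd, *_ = np.linalg.lstsq(Xq, quasi_demean(y), rcond=None)

print(beta_gls, beta_qd)
```

Note that the constant is transformed too (it becomes 1 − lam); a feasible version would replace the two variances with estimates.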

Page 17: Few notes on panel data (materials by Alan Manning)

Between-Groups Estimator

This takes individual means and estimates the regression by OLS:

$\bar{y}_i = \beta' \bar{x}_i + \theta_i + \bar{\varepsilon}_i$

Stata command is xtreg y x, be i(id)

Condition for consistency the same as for RE estimator

But BE estimator less efficient as does not exploit variation in regressors for a given individual

And cannot estimate effect of variables like time trends whose average values do not vary across individuals

So why would anyone ever use it – let’s think about measurement error

Page 18: Few notes on panel data (materials by Alan Manning)

Measurement Error in Panel Data Models

Assume true model is:

$y_{it} = \beta_0 + \beta_1 x^*_{it} + \theta_i + \varepsilon_{it}$

Where x is one-dimensional

Assume E(εit|xi1,..xit,.. xiT)=0 and E(θi|xi1,..xit,.. xiT)=0 so that RE and BE estimators are consistent

Page 19: Few notes on panel data (materials by Alan Manning)

Measurement Error Model

Assume:

$x_{it} = x^*_{it} + u_{it} = x^*_i + \eta_{it} + u_{it}$

where uit is classical measurement error, x*i is the average value of x* for individual i, and ηit is variation around that average, which is assumed to be iid and uncorrelated with x*i and uit.

We know this measurement error is likely to cause attenuation bias but this will vary between FE, RE and BE estimators.

Page 20: Few notes on panel data (materials by Alan Manning)

Proposition 5.4

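For FE model we have:

$\mathrm{plim}\, \hat{\beta}_1^{FE} = \beta_1 \left[ 1 - \frac{Var(u)}{Var(\eta) + Var(u)} \right]$

For BE model we have:

$\mathrm{plim}\, \hat{\beta}_1^{BE} = \beta_1 \left[ 1 - \frac{Var(u)}{T\, Var(x^*) + Var(\eta) + Var(u)} \right]$

For RE model we have:

$\mathrm{plim}\, \hat{\beta}_1^{RE} = \beta_1 \left[ 1 - \frac{Var(u)}{\kappa\, Var(x^*) + Var(\eta) + Var(u)} \right]$

Where:

$\kappa = \frac{T (1 - \lambda)^2}{T - 1 + (1 - \lambda)^2}$, with λ the quasi-time-demeaning weight used by the RE estimator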
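A simulation sketch of the FE and BE results, under illustrative assumptions that are not in the slides: all variances are set to 1, T=4, β1=1 and N is large. With x observed as x*i + ηit + uit, the formulas imply limits of β1[1 − Var(u)/(Var(η)+Var(u))] for FE and β1[1 − Var(u)/(T Var(x*)+Var(η)+Var(u))] for BE; the code compares the estimates with those values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta1 = 20000, 4, 1.0                  # hypothetical design
var_xstar, var_eta, var_u = 1.0, 1.0, 1.0    # assumed variances

x_mean_true = rng.normal(scale=np.sqrt(var_xstar), size=(N, 1))   # x*_i
eta = rng.normal(scale=np.sqrt(var_eta), size=(N, T))             # variation around x*_i
u = rng.normal(scale=np.sqrt(var_u), size=(N, T))                 # classical measurement error

x_true = x_mean_true + eta
x_obs = x_true + u
theta = rng.normal(size=(N, 1))
y = beta1 * x_true + theta + rng.normal(size=(N, T))

def slope(x, y):
    """OLS slope of y on x with an intercept."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / (xc * xc).sum()

# FE: de-mean within individual, then OLS
b_fe = slope(x_obs - x_obs.mean(axis=1, keepdims=True),
             y - y.mean(axis=1, keepdims=True))

# BE: OLS on individual means
b_be = slope(x_obs.mean(axis=1), y.mean(axis=1))

plim_fe = beta1 * (1 - var_u / (var_eta + var_u))
plim_be = beta1 * (1 - var_u / (T * var_xstar + var_eta + var_u))

print("FE:", b_fe, "predicted", plim_fe)
print("BE:", b_be, "predicted", plim_be)
```

In this design FE is attenuated by half while BE loses only about a sixth, because averaging over T shrinks the measurement error relative to Var(x*).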

Page 21: Few notes on panel data (materials by Alan Manning)

What should we learn from this?

All rather complicated – don’t worry too much about details

But intuition is simple

Attenuation bias largest for FE estimator – Var(x*) does not appear in denominator – FE estimator does not use this variation in data

Page 22: Few notes on panel data (materials by Alan Manning)

Conclusions

Attenuation bias larger for RE than BE estimator as T>1>κ

The averaging in the BE estimator reduces the importance of measurement error.

Important to note that these results are dependent on the particular assumption about the measurement error process and the nature of the variation in xit – things would be very different if measurement error for a given individual did not vary over time

But general point is that measurement error considerations could affect choice of model to estimate with panel data

Page 23: Few notes on panel data (materials by Alan Manning)

Estimating Fixed Effects Model in Differences

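Can also get rid of fixed effect by differencing:

$y_{it} = \beta' x_{it} + \theta_i + \varepsilon_{it}$

$y_{it-1} = \beta' x_{it-1} + \theta_i + \varepsilon_{it-1}$

$\Delta y_{it} = \beta' \Delta x_{it} + \Delta \varepsilon_{it}$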

Page 24: Few notes on panel data (materials by Alan Manning)

Comparison of two methods

Estimate parameters by OLS on differenced data

If only 2 observations then get same estimates as ‘de-meaning’ method

But standard errors different

Why?: assumption about autocorrelation in residuals
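The T=2 equivalence of the point estimates can be verified directly. This is a sketch on simulated data (the design and variable names are assumptions for illustration); the differenced regression is run without an intercept so that it matches the de-meaned regression exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500                                      # hypothetical number of individuals, T = 2
theta = rng.normal(size=N)
x = theta[:, None] + rng.normal(size=(N, 2))           # x correlated with theta_i
y = 1.5 * x + theta[:, None] + rng.normal(size=(N, 2))

# First differences: OLS of dy on dx (no intercept)
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
beta_fd = (dx @ dy) / (dx @ dx)

# De-meaning: OLS on deviations from individual means
x_dm = x - x.mean(axis=1, keepdims=True)
y_dm = y - y.mean(axis=1, keepdims=True)
beta_fe = (x_dm * y_dm).sum() / (x_dm ** 2).sum()

print(beta_fd, beta_fe)
```

With two periods each de-meaned observation is plus or minus half the difference, so the two slopes are algebraically identical.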

Page 25: Few notes on panel data (materials by Alan Manning)

What are these assumptions?

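• For de-meaned model:

$\mathrm{Cov}(\varepsilon_{it}, \varepsilon_{is}) = 0, \quad t \neq s$

• For differenced model:

$\mathrm{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{is}) = 0, \quad t \neq s$

• These are not consistent:

$\mathrm{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{it-1}) = \mathrm{Cov}(\varepsilon_{it}, \varepsilon_{it-1}) - \mathrm{Cov}(\varepsilon_{it}, \varepsilon_{it-2}) + \mathrm{Cov}(\varepsilon_{it-1}, \varepsilon_{it-2}) - \mathrm{Var}(\varepsilon_{it-1})$

If the εit are serially uncorrelated, the covariance terms vanish and this equals −Var(εit−1), which is not zero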

Page 26: Few notes on panel data (materials by Alan Manning)

This leads to time series…

Which is ‘better’ depends on which assumption is right – how can we decide this?

Much of this you have covered in Macroeconometrics course…