Lecture 31
Dec 24, 2015
Summary of previous lecture: PANEL DATA
Topics for today
PANEL DATA
FEM & REM
FEM
3- Slope coefficients constant but the intercept varies over individuals as well as over time.
To allow for this possibility, consider the following model.
Suppose when we run this regression we find that the company dummies, as well as the
coefficients of the X's, are individually statistically significant, but that none of the time
dummies is significant.
The overall conclusion that emerges is that there is perhaps a pronounced individual
company effect but no time effect.
In other words, the investment functions for the four companies are the
same except for their intercepts.
FEM…
4- All Coefficients vary across Individuals
Here we assume that both the intercepts and the slope coefficients differ across individual, or
cross-sectional, units.
This is to say that the investment functions of GE, GM, US Steel, and Westinghouse are all different.
We can easily extend our LSDV model to take care of this situation.
In the model we have introduced interactive dummies; the γ's are the differential slope coefficients.
If one or more of the γ coefficients is statistically significant, it tells us that one or more slope
coefficients differ from those of the base group.
If all the differential intercepts and all the differential slope coefficients are statistically significant,
the conclusion is that the investment functions of General Motors, United States Steel, and Westinghouse
are different from that of General Electric.
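The LSDV setup with differential intercept and slope dummies can be sketched in code. The data below are simulated, not the lecture's actual investment data, and the firm labels are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated panel: 4 firms, 20 years each.
n_firms, n_years = 4, 20
firm = np.repeat(np.arange(n_firms), n_years)
x = rng.normal(size=n_firms * n_years)

# True model: each firm has its own intercept and its own slope.
intercepts = np.array([1.0, 2.0, 3.0, 4.0])
slopes = np.array([0.5, 0.8, 1.1, 1.4])
y = intercepts[firm] + slopes[firm] * x + rng.normal(scale=0.1, size=firm.size)

# LSDV design matrix with firm 0 (the base group) absorbed into the
# common intercept and slope; firms 1..3 get differential intercept
# dummies and differential slope (interaction) dummies, the gammas.
dummies = (firm[:, None] == np.arange(1, n_firms)).astype(float)
X = np.column_stack([np.ones(firm.size), x, dummies, dummies * x[:, None]])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta = [base intercept, base slope,
#         3 differential intercepts, 3 differential slopes]
```

With this layout, `beta[2]` estimates how much firm 1's intercept differs from the base firm's, and `beta[5]` its differential slope, directly mirroring the γ interpretation in the text.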
A Caution on the Use of the Fixed Effects, or LSDV, Model.
Although easy to use, the LSDV model has some problems.
1- If we introduce too many dummy variables, we will run up against the degrees-of-freedom problem.
2- With so many variables in the model, there is always the possibility of multicollinearity, which might
make precise estimation of one or more parameters difficult.
3- The LSDV model cannot accommodate time-invariant variables such as sex, color, and ethnicity,
because an individual's sex, color, or ethnicity does not change over time.
Hence, the LSDV approach may not be able to identify the impact of such time-invariant variables.
4- We have to think carefully about the error term uit, as our discussion so far has been based on the
assumption that the error term follows the classical assumptions, namely, uit ~ N(0, σ²).
Since the i index refers to cross-sectional observations and t to time-series observations, the
classical assumption for uit may have to be modified.
There are several possibilities
1. We can assume that the error variance is the same for all cross-sectional
units, or we can assume that the error variance is heteroscedastic.
2. For each individual we can assume that there is no autocorrelation over
time. Thus, for example, we can assume that the error term of the
investment function for General Motors is non-autocorrelated.
3. For a given time, it is possible that the error term for General Motors
is correlated with the error term for, say, U.S. Steel, or for both U.S. Steel and
Westinghouse. Or, we could assume that there is no such correlation.
4. We can think of other permutations and combinations of the error term.
Allowing for one or more of these possibilities will make the analysis that
much more complicated. These problems may be alleviated by the random
effects model (REM).
Estimation of panel data regression models: The Random Effects Model (REM)
Although straightforward to apply, fixed effects, or LSDV,
modeling can be expensive in terms of degrees of freedom if
we have several cross-sectional units.
If the dummy variables do in fact represent a lack of knowledge
about the (true) model, why not express this ignorance through
the disturbance term uit?
This is precisely the approach suggested by the proponents of
the so called error components model (ECM) or Random
effects model (REM).
REM
Instead of treating β1i as fixed, we assume that it is a random
variable with a mean value of β1.
The intercept value for an individual company can then be expressed
as
β1i = β1 + εi,  i = 1, 2, . . . , N
where εi is a random error term with a mean value of zero and a
variance of σ²ε.
This means that the four firms included in our sample are a drawing
from a much larger universe of such companies and that they have a
common mean value for the intercept (= β1).
The individual differences in the intercept values of each company
are reflected in the error term εi.
REM
The composite error term wit consists of two components: εi, which is the
cross-section, or individual-specific, error component, and uit, which is the
combined time-series and cross-section error component.
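Written out in standard notation (a reconstruction of the usual two-regressor REM specification; the slide's own equation is not reproduced here), the decomposition of the composite error is

```latex
Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + w_{it},
\qquad w_{it} = \varepsilon_i + u_{it},
\qquad E(w_{it}) = 0,
\qquad \operatorname{var}(w_{it}) = \sigma^2_{\varepsilon} + \sigma^2_{u}.
```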
Notice carefully the difference between FEM and ECM. In FEM each
cross-sectional unit has its own (fixed) intercept value; in all there are N such
values for N cross-sectional units.
In REM, on the other hand, the intercept β1 represents the mean value of
all the (cross-sectional) intercepts and the error component εi represents
the (random) deviation of the individual intercept from this mean value.
However, keep in mind that εi is not directly observable; it is what is known
as an unobservable, or latent, variable.
FIXED EFFECTS (LSDV) VERSUS RANDOM EFFECTS MODEL
Which model is better, FEM or ECM?
The answer to this question hinges on the assumption one makes about
the likely correlation between the individual, or cross-section-specific, error
component εi and the X regressors.
If it is assumed that εi and the X’s are uncorrelated, ECM may be appropriate,
whereas if εi and the X’s are correlated, FEM may be appropriate.
Keeping this fundamental difference in the two approaches in mind, what more
can we say about the choice between FEM and ECM?
FEM Vs. REM
If T (the number of time-series observations) is large and N (the number of cross-
sectional units) is small, there is likely to be little difference in the values of the
parameters estimated by FEM and ECM.
Hence the choice here is based on computational convenience. On this score,
FEM may be preferable.
When N is large and T is small, the estimates obtained by the two methods
can differ significantly.
1. If the individual error component εi is correlated with the regressors, then the
ECM estimators are biased, whereas those obtained from FEM are unbiased.
2. If N is large and T is small, and if the assumptions underlying ECM hold,
ECM estimators are more efficient than FEM estimators.
FEM Vs. REM
Is there a formal test that will help us to choose between FEM and
ECM?
Yes, the Hausman test can serve this purpose.
The null hypothesis underlying the Hausman test is that the FEM and
ECM estimators do not differ substantially.
The test statistic developed by Hausman has an asymptotic χ²
distribution.
If the null hypothesis is rejected, the conclusion is that ECM is not
appropriate and that we may be better off using FEM, in which case
statistical inferences will be conditional on the εi in the sample.
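As a sketch, the Hausman statistic can be computed by hand from the two sets of estimates. The numbers below are made up for illustration and are not taken from the lecture's data:

```python
import math
import numpy as np

def hausman(b_fe, b_re, v_fe, v_re):
    """Hausman statistic H = d' (V_FE - V_RE)^(-1) d, where d is the
    difference between the FEM and ECM/REM coefficient vectors.
    Asymptotically chi-square with k degrees of freedom under H0
    (no systematic difference between the two estimators)."""
    d = b_fe - b_re
    return float(d @ np.linalg.solve(v_fe - v_re, d))

# Made-up estimates for two slope coefficients (k = 2).
b_fe = np.array([1.20, 0.50])      # fixed-effects estimates
b_re = np.array([1.00, 0.45])      # random-effects estimates
v_fe = np.diag([0.04, 0.010])      # FEM covariance matrix
v_re = np.diag([0.03, 0.008])      # REM covariance matrix

H = hausman(b_fe, b_re, v_fe, v_re)   # 5.25 for these numbers
# For k = 2 the chi-square survival function is exp(-H/2).
p_value = math.exp(-H / 2)            # about 0.072: do not reject H0 at 5%
```

Here H = 5.25 falls short of the 5% critical value of χ² with 2 df (5.99), so with these invented numbers we would not reject the null and ECM would be acceptable.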
Although an improvement over cross-section data, panel data do not
provide a cure-all for all estimation problems.
Simultaneous Regression Models
Until now we have been concerned exclusively with single-equation models.
The cause-and-effect relationship, if any, in such models therefore ran from
the X's to the Y.
But in many situations such a one-way, or unidirectional, cause-and-effect
relationship is not meaningful.
This occurs if Y is determined by the X's, and some of the X's are, in turn,
determined by Y.
In short, there is a two-way, or simultaneous, relationship between Y and
(some of) the X's, which makes the distinction between dependent and
explanatory variables of dubious value.
In simultaneous models there is more than one equation, one for each of
the mutually, or jointly, dependent or endogenous variables.
Simultaneous Regression Models
What happens if the parameters of each equation are estimated
by applying, say, the method of OLS, disregarding the other
equations in the system?
One of the crucial assumptions of the method of OLS is that the
explanatory X variables are nonstochastic.
If this condition is not met, the least-squares estimators
are not only biased but also inconsistent;
that is, as the sample size increases indefinitely, the estimators
do not converge to their true values.
Simultaneous regression model
Now it is not too difficult to see that P and Q are jointly
dependent variables.
A shift in the demand curve changes both P and Q.
Therefore, a regression of Q on P would violate an important
assumption of the classical linear regression model, namely,
the assumption of no correlation between the explanatory
variable(s) and the disturbance term.
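The simultaneity bias described above can be seen in a small simulation. The demand and supply coefficients below are invented for illustration; the point is only that, because the equilibrium price depends on the demand disturbance, OLS on the demand equation does not recover the true slope:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical demand-and-supply system (illustrative parameter values):
#   demand: Q = 10 - 1.0*P + u_d
#   supply: Q =  2 + 0.5*P + u_s
u_d = rng.normal(size=n)
u_s = rng.normal(size=n)

# Solving the two equations jointly for the equilibrium price shows that
# P depends on BOTH error terms, so P is correlated with u_d, violating
# the no-correlation assumption of the classical model.
P = (8 + u_d - u_s) / 1.5
Q = 2 + 0.5 * P + u_s

# OLS slope of Q on P: cov(Q, P) / var(P). With these parameters it
# converges to -0.25, far from the true demand slope of -1.0.
slope = np.cov(Q, P)[0, 1] / np.var(P, ddof=1)
```

Increasing `n` does not help: the estimator is inconsistent, converging to the biased value rather than to the true demand slope.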
Jarque-Bera (JB) test: A test of normality
The JB test of normality is an asymptotic, or large-sample, test.
It is based on the OLS residuals.
This test first computes the skewness and kurtosis measures of the OLS residuals and uses
the following test statistic:
JB = n [ S²/6 + (K − 3)²/24 ]
where n = sample size, S = skewness coefficient, and K = kurtosis coefficient.
For a normally distributed variable, S = 0 and K = 3.
Therefore, the JB test of normality is a test of the joint hypothesis that S and K are 0 and
3, respectively.
Under the null hypothesis that the residuals are normally distributed, Jarque and Bera
showed that asymptotically the JB statistic follows the chi-square distribution with 2 df.
If the computed p value of the JB statistic is sufficiently low, we reject the hypothesis that
the residuals are normally distributed and vice versa.
Example JB- test
Suppose the value of the JB is 0.7769.
The p value of obtaining such a value from the chi-square
distribution with 2 df is about 0.68, which is quite high.
In other words, we may not reject the normality assumption.
Of course, bear in mind the warning about the sample size.
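The JB statistic and its p value are easy to compute directly. The sketch below uses the moment definitions of S and K, and for 2 degrees of freedom the chi-square survival function reduces to exp(−x/2), which reproduces the p value quoted in the example:

```python
import math
import numpy as np

def jarque_bera(residuals):
    """JB = n * (S^2/6 + (K - 3)^2/24), where S is the skewness and K
    the kurtosis of the residuals; asymptotically chi-square with 2 df."""
    e = np.asarray(residuals, dtype=float)
    n = e.size
    e = e - e.mean()
    s2 = (e**2).mean()
    S = (e**3).mean() / s2**1.5   # skewness (0 for a normal)
    K = (e**4).mean() / s2**2     # kurtosis (3 for a normal)
    return n * (S**2 / 6 + (K - 3)**2 / 24)

# p value of the example's statistic, JB = 0.7769, with 2 df:
p = math.exp(-0.7769 / 2)   # about 0.68, matching the example

# Sanity check on simulated normal residuals: JB should be small.
rng = np.random.default_rng(1)
jb = jarque_bera(rng.normal(size=5000))
```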
F-test: Testing the Overall Significance of a Multiple Regression
Because the t test assesses the significance of each coefficient
individually, it cannot be used to test the joint hypothesis that all
the slope coefficients are simultaneously zero.
However, this joint hypothesis can be tested by the F test,
which can be demonstrated as follows.
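In standard notation (a reconstruction; the slide's own formula is not shown here), the statistic for testing H0: β2 = β3 = · · · = βk = 0 in a model with k parameters (including the intercept) and n observations is

```latex
F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)} \sim F_{k-1,\; n-k} \quad \text{under } H_0 .
```

A large F (equivalently, a high R² relative to the degrees of freedom used) leads to rejection of the joint null.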
Partial Correlation
We know the coefficient of correlation as a measure of the degree of linear
association between two variables.
For the three-variable regression model we can compute three correlation
coefficients: r12 (correlation between Y and X2), r13 (correlation
between Y and X3), and r23 (correlation between X2 and X3).
The subscript 1 represents Y for notational convenience. These correlation
coefficients are called gross or simple correlation coefficients, or
correlation coefficients of zero order.
In general, r12 is not likely to reflect the true degree of association
between Y and X2 in the presence of X3.
Therefore we use the partial correlation coefficients.
Partial coefficients…
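The first-order partial correlation r12.3 (between Y and X2, holding X3 constant) can be computed from the zero-order coefficients with the standard formula; the data below are simulated for illustration, and the residual-based cross-check confirms the formula numerically:

```python
import math
import numpy as np

def partial_corr(r12, r13, r23):
    """First-order partial correlation r12.3: the correlation between
    variables 1 and 2 after removing the linear influence of variable 3."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))

# Illustrative data: Y depends on both X2 and X3, and X2 overlaps with X3,
# so the simple r12 overstates the Y-X2 association.
rng = np.random.default_rng(7)
x3 = rng.normal(size=2000)
x2 = 0.6 * x3 + rng.normal(size=2000)
y = 1.0 * x2 + 1.0 * x3 + rng.normal(size=2000)

R = np.corrcoef([y, x2, x3])   # 3x3 matrix of zero-order correlations
r12_3 = partial_corr(R[0, 1], R[0, 2], R[1, 2])

# Cross-check: r12.3 equals the correlation between the residuals from
# regressing Y on X3 and X2 on X3 (the slopes below are the OLS slopes
# with intercept; constants drop out of the correlation).
b_y = np.cov(y, x3)[0, 1] / np.var(x3, ddof=1)
b_2 = np.cov(x2, x3)[0, 1] / np.var(x3, ddof=1)
r_check = np.corrcoef(y - b_y * x3, x2 - b_2 * x3)[0, 1]
```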
A journey through the course
Quantitative Techniques: Sampling (Population Sampling; Probability; Non-probability); Derivatives
Regression: Assumptions; BLUE Estimators; Coefficient of determination; Problems
A journey through the course …
Dummy Variables: ANOVA Models; ANCOVA Models; Dummy Variable Trap
Qualitative Response Models: LPM; LOGIT; PROBIT; Ordinal Logit and Probit; Multinomial Logit and Probit
Panel data; Simultaneous equation models