Dr. Etazaz Econometrics Notes.pdf

ECONOMETRICS LECTURES OF DR. EATZAZ AHMED

Quaid-e-Azam University Islamabad



Econometrics

Econometrics is a subject in which we formulate a mathematical relationship among economic variables on the basis of economic theory and then estimate the numerical values of the parameters in this relationship using actual data.

Classical Linear Regression Model:

Suppose we want to analyze a variable Y using the data Yi (i = 1, 2, 3, ..., n). In the simplest analysis we would like to represent the whole data Y1, Y2, Y3, ..., Yn by a single number. We can formulate a model for this purpose:

Y = α + (Y – α)
Or
Y = α + U [U = Y – α]

Thus Y is set equal to a constant (α) plus the discrepancy (difference) between Y and the presumed constant (α). The question is: what is the most appropriate interpretation of α? If we set the average of the errors E (U) = 0, then we have

=> E (Y – α) = 0
=> E (Y) – α = 0
=> E (Y) = α. [Population mean of Y]

With this interpretation, we can write the model as

Y = E (Y) + U
U = Y – E (Y).

Example: Population mean age = 20 years.
If a person's deviation from the mean is U = 3 years: Y = E (Y) + U = 20 + 3 = 23 years.
If the deviation is U = –1 year: Y = 20 + (–1) = 19 years.

In statistics, we learnt how to estimate the population mean using a random sample. In this course, we will repeat the same exercise using a different approach.

We start with the model.

Y = α + U

Suppose this model is imposed on the data Y1, Y2, Y3, ..., Yn. This amounts to

Y1 = α + U1
Y2 = α + U2
Y3 = α + U3
:
Yn = α + Un

or, in general, Yi = α + Ui (i = 1, 2, ..., n).

The estimation of α depends on the assumptions of the model. The classical assumptions are as follows:

1). Ui is a random variable for each i.

This means that U1, U2, U3, ..., Un are all random variables.

[Random variable: a variable that can take at least two values with non-zero probability.]

Each Ui is one draw out of infinitely many possible values.

Time is a fixed variable. Age is not a random variable. Weight is a random variable.

2). E (Ui) = 0 for each i.

On average the errors are equal to zero. Since Ui = Yi – α and α = E (Yi), we have E (Ui) = E (Yi) – α = 0. This assumption holds by construction.

3). Var (Ui) = σ² for all i.

All error terms have the same variance. This is known as the homoscedasticity assumption; if it is violated we have heteroscedasticity.

4). Co-Var (Ui, Uj) = 0 for all i ≠ j.

If Co-Var (Ui, Uj) ≠ 0 for some i ≠ j, we say that Ui is autocorrelated with Uj. Autocorrelation typically arises in time series data, where one variable (for example food expenditure) is observed at different times; it is less common in cross section data.

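5). Ui is distributed normally.

Sometimes we also make the assumption that

Ui ~ N [Ui is distributed normally]

This assumption is also open to challenge. For example, let Y be income with mean income E (Y) = 800. Then U = Y – E (Y) can be no lower than 0 – 800, but it can be as high as 1000000 – 800, so the distribution of U need not be symmetric.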

Estimation of α

Let α̂ be an estimator of α.

Y = α̂ + e [where e = Y – α̂]

→ e is the regression error, estimation error or residual.

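One way to approach estimation is to focus on e and choose the estimator that makes the errors as small as possible.

Criterion (1): minimize ∑ ei. This is a wrong criterion, because positive and negative errors cancel out.

Example: two sets of errors

  Set 1 (ei)    Set 2 (ei)
      10            0
     –10            0
      20            1
     –20            0
  ∑ ei = 0      ∑ ei = 1

Criterion (1) prefers set 1, although its errors are far larger.

Criterion (2): minimize ∑ |ei| (ignoring signs). This is also a wrong criterion, because it gives a large error no more weight than several small ones.

Example: two sets of errors

  Set 1 (ei)    Set 2 (ei)
       5            0
      –5            0
       5           19
      –5            0
  ∑ |ei| = 20   ∑ |ei| = 19

Criterion (2) prefers set 2, although it contains one very large error.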
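The two examples above can be checked numerically. The following is a small sketch, not part of the original lecture; the function names are illustrative, and the sum of squared errors is included as a third criterion for comparison.

```python
# Error sets copied from the two examples above.
example_1 = ([10, -10, 20, -20], [0, 0, 1, 0])
example_2 = ([5, -5, 5, -5], [0, 0, 19, 0])


def sum_errors(e):
    """Criterion (1): plain sum, so opposite signs cancel."""
    return sum(e)


def sum_abs(e):
    """Criterion (2): sum of absolute errors."""
    return sum(abs(ei) for ei in e)


def sum_squares(e):
    """Sum of squared errors: larger errors get larger weight."""
    return sum(ei ** 2 for ei in e)


for name, (set_1, set_2) in (("Example 1", example_1), ("Example 2", example_2)):
    for label, errors in (("set 1", set_1), ("set 2", set_2)):
        print(name, label,
              "sum =", sum_errors(errors),
              "sum|e| =", sum_abs(errors),
              "sum e^2 =", sum_squares(errors))
```

In Example 2 the squared criterion ranks set 1 (100) ahead of set 2 (361), because the single error of 19 is penalized heavily.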



We should minimize a weighted sum of errors such that larger errors are assigned greater weights. Suppose we set the weights proportional to the absolute size of the error:

ωi = θ |ei|

Now minimize

∑ ωi |ei| = ∑ θ |ei| |ei| = θ ∑ |ei|²

Since θ is a positive constant, this is the same as minimizing ∑ ei². The estimator α̂ that minimizes ∑ ei² is known as the Ordinary Least Squares (OLS) estimator.

Y = α + U [Basic equation]

Estimation:

Ŷ = α̂, where [E (U) = 0]

Regression error or residual:

e = Y – Ŷ
e = Y – α̂

Min ∑ ei² ⇒ [Ordinary Least Squares estimator]

⇒ OLS estimator of α:

Min ∑ ei² = ∑ (Yi – α̂)²

First-order condition:

∂/∂α̂ [(Y1 – α̂)² + (Y2 – α̂)² + (Y3 – α̂)² + ... + (Yn – α̂)²] = 0

[2 (Y1 – α̂)(–1) + 2 (Y2 – α̂)(–1) + 2 (Y3 – α̂)(–1) + ... + 2 (Yn – α̂)(–1)] = 0

–2 [∑Yi – n α̂] = 0

Divide both sides by –2 and by n:

α̂ = ∑Yi / n = Ȳ

→ The OLS estimator of α is the mean of Y.

Some properties of α̂:

1). α̂ has the minimum sum of squared errors ∑ ei².

∑ ei² = ∑ (Yi – α̂)² = ∑ (Yi – Ȳ)² = ∑ yi² [where yi = Yi – Ȳ]

2). α̂ is a linear function of the random variables Ui.

α̂ = (1/n) [Y1 + Y2 + Y3 + ... + Yn]
= (1/n) [(α + U1) + (α + U2) + (α + U3) + ... + (α + Un)]
= (1/n) [n α + (U1 + U2 + U3 + ... + Un)]
= α + (1/n) U1 + (1/n) U2 + (1/n) U3 + ... + (1/n) Un → equation (i)
= a0 + a1 U1 + a2 U2 + a3 U3 + ... + an Un → equation (ii)

where a0 = α and ai = 1/n for all i.

Equation (ii) shows that α̂ is a linear function of the random variables U1, U2, U3, ..., Un. ∴ α̂ must be random.

3). α̂ is unbiased.

Proof:

E (α̂) = E [α + (1/n) U1 + (1/n) U2 + ... + (1/n) Un]
= α + (1/n) E (U1) + (1/n) E (U2) + ... + (1/n) E (Un)
= α + (1/n) (0) + (1/n) (0) + ... + (1/n) (0)

E (α̂) = α. [As we know that E (Ui) = 0]
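As a quick numerical check of the result α̂ = Ȳ, the sketch below compares the sum of squared errors at the sample mean with the sum at a few nearby values. The data are the five Y values used in Example 1 later in these notes; the helper name `sse` and the candidate offsets are illustrative assumptions.

```python
def sse(y, a):
    """Sum of squared errors when every Yi is represented by the constant a."""
    return sum((yi - a) ** 2 for yi in y)


y = [4, 8, 10, 12, 11]
alpha_hat = sum(y) / len(y)  # OLS estimator of alpha: the sample mean

# Any other constant gives a strictly larger sum of squared errors.
candidates = [alpha_hat + d for d in (-1.0, -0.1, 0.1, 1.0)]
print("alpha_hat =", alpha_hat, "SSE =", sse(y, alpha_hat))
for a in candidates:
    print("a =", a, "SSE =", sse(y, a))
```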



4). α̂ has minimum variance in the class of linear unbiased estimators.

Proof:

(a). Var (α̂) = E [α̂ – E (α̂)]²
= E [α + (1/n) U1 + (1/n) U2 + ... + (1/n) Un – α]²
= E [(1/n) (U1 + U2 + ... + Un)]²
= (1/n²) E [(U1 + U2 + ... + Un)²]
= (1/n²) E [U1² + U2² + ... + Un² + ∑∑ i≠j Ui Uj]
= (1/n²) [E (U1²) + E (U2²) + ... + E (Un²) + ∑∑ i≠j E (Ui Uj)]
= (1/n²) [σ² + σ² + ... + σ² + ∑∑ i≠j (0)] [since Co-Var (Ui, Uj) = 0]
= (1/n²) n σ²
= σ² / n → equation (a)

(b). Now consider any linear unbiased estimator.

(i). α* = b1Y1 + b2Y2 + b3Y3 + ... + bnYn

where b1, b2, ..., bn are constants. Obviously α* is linear. To make α* unbiased we set E (α*) = α, which implies the following:

E (b1Y1 + b2Y2 + ... + bnYn) = α
E [b1 (α + U1) + b2 (α + U2) + ... + bn (α + Un)] = α
E [α (b1 + b2 + ... + bn) + b1U1 + b2U2 + ... + bnUn] = α
α (b1 + b2 + ... + bn) + b1 E (U1) + b2 E (U2) + ... + bn E (Un) = α
α ∑bi + b1 (0) + b2 (0) + ... + bn (0) = α
α ∑bi = α
∑bi = 1.

(ii). Now compute the variance of α*.

Var (α*) = Var (b1Y1 + b2Y2 + ... + bnYn)
= Var (b1Y1) + Var (b2Y2) + ... + Var (bnYn) + ∑∑ i≠j Co-Var (biYi, bjYj)
= b1² Var (Y1) + b2² Var (Y2) + ... + bn² Var (Yn) + ∑∑ i≠j bi bj Co-Var (Yi, Yj)
= b1² Var (U1) + b2² Var (U2) + ... + bn² Var (Un) + ∑∑ i≠j bi bj Co-Var (Ui, Uj)
= b1²σ² + b2²σ² + ... + bn²σ² + ∑∑ i≠j bi bj (0)
= σ² ∑bi²

(iii). Comparison: Var (α̂) < Var (α*) unless bi = 1/n for all i.

Consider Var (α*) and minimize it by choosing the bi:

Min (b1, ..., bn) Var (α*) = σ² ∑bi² subject to ∑bi = 1.

Form the Lagrangian:

L = σ² (b1² + b2² + ... + bn²) + λ [1 – (b1 + b2 + ... + bn)]

First-order conditions:

∂L/∂bi = 2σ² bi – λ = 0 (i = 1, 2, 3, ..., n) → (A)
∂L/∂λ = 1 – (b1 + b2 + ... + bn) = 0 → (B)

From equation (A), bi = λ / (2σ²). Substitute in equation (B):

=> 1 – n λ / (2σ²) = 0
=> n λ / (2σ²) = 1
=> λ = 2σ² / n → (C)

Substitute equation (C) into (A):

=> 2σ² bi = 2σ² / n
=> bi = 1/n

=> α* = (1/n) Y1 + (1/n) Y2 + ... + (1/n) Yn = (1/n) ∑Yi = Ȳ

Recap: the OLS estimator α̂ is linear, unbiased and has minimum variance in the class of linear unbiased estimators. That is, α̂ is the best linear unbiased estimator, i.e. α̂ is BLUE.
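The comparison in step (iii) can be illustrated with numbers. Since Var (α*) = σ² ∑bi², it is enough to compare ∑bi² across weight vectors that sum to one. This is a minimal sketch; the function name and the unequal weights are hypothetical choices, not taken from the lecture.

```python
def variance_factor(b):
    """Return sum(bi^2), so that Var(alpha*) = sigma^2 * variance_factor(b).

    The weights must sum to one, which is the unbiasedness condition.
    """
    if abs(sum(b) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1 for an unbiased estimator")
    return sum(bi ** 2 for bi in b)


n = 4
equal_weights = [1 / n] * n              # OLS: bi = 1/n
unequal_weights = [0.4, 0.3, 0.2, 0.1]   # another linear unbiased estimator

print("equal weights  :", variance_factor(equal_weights))    # equals 1/n
print("unequal weights:", variance_factor(unequal_weights))  # larger than 1/n
```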



⇒ Comments on the BLUE property:

1). Linear: α̂ is a linear function of Y.

α̂ = (1/n) Y1 + (1/n) Y2 + (1/n) Y3 + ... + (1/n) Yn

Theorem: If X1 ~ N, X2 ~ N, X3 ~ N, ..., Xn ~ N, then any linear combination of X1, X2, X3, ..., Xn is also normally distributed:

Z = a1X1 + a2X2 + a3X3 + ... + anXn ~ N

By this theorem, since Ui ~ N and Yi = α + Ui is a linear function of Ui, we infer that Yi ~ N. Further, α̂ = b1Y1 + b2Y2 + b3Y3 + ... + bnYn, being a linear function of Y1 to Yn, is also distributed normally:

Ui ~ N → (Lf) → Yi ~ N → (Lf) → α̂ ~ N [Lf = linear function]

Importance of the BLUE property:

Linearity is important for applying the tools of statistical inference because of the following argument:

Ui ~ N ⇒ Yi ~ N ⇒ α̂ ~ N.

α̂ is a linear function of Y, so we can use the standard tools of statistical inference and say a great deal with a limited amount of data. There is a counter-argument that the above chain of reasoning is too long and unnecessary: we can just assume that α̂ ~ N. On this view linearity is not indispensable, and dropping it gives us more options for estimation.

Unbiasedness: this means E (α̂) = α. If we draw all possible random samples of Y and estimate α̂ from each sample one by one, then the mean value of α̂ will be equal to α.

This property is desirable because we do not want any systematic error in estimation, but it is not indispensable. Consider the following example, where ε is a small positive number:

Prob (α – ε < α̂ < α + ε) = 0.6, E (α̂) = α [α̂ is unbiased]
Prob (α – ε < α* < α + ε) = 0.9, E (α*) ≠ α [α* is biased]

Here the biased estimator α* falls close to α more often than the unbiased estimator α̂, so a biased estimator can be better than an unbiased one.



Best / Minimum Variance: this means that Var (α̂) < Var (α*), where α̂ is the OLS estimator and α* is any other linear unbiased estimator. If we compare α̂ with a nonlinear or biased estimator, the property does not help. The BLUE property is desirable, but unbiasedness limits our choices and linearity also limits them.

The above model determines E (Y) as a constant:

Y = α + U, α = E (Y).

Now suppose we want to determine E (Y) given some information set (I). This information is usually in the form of data on variables called explanatory variables, e.g. gender, height etc. Suppose such variables are X1, X2, X3, ..., Xm. If the set of information is complete, then we can write

Y = f (X1, X2, X3, ..., Xm)

→ Complete information means:

• The list of all variables X1, X2, X3, ..., Xm is complete.
• All data are measured accurately.
• The functional form [f (.)] is exactly known.

→ The three sources of error:

• Incomplete list of X variables.
• Measurement error in data.
• Misspecification of the functional form.

This will produce the following type of equation:

Y = α2 X2 + α3 X3 + α4 X4 + ... + αk Xk + Z. [k < m]

Z = error committed due to the above three reasons.

→ Econometrics is all about extracting information from the composition of the error term Z and using it beneficially, weighing the cost of extraction against its benefit. One piece of information can be extracted immediately:

Z = E (Z) + [Z – E (Z)]

E (Z) is a parameter that can be estimated; denote it by α1.
Z – E (Z) is the fluctuation of the error around its mean value; denote it by U.

Now we can write the model as

Y = α2 X2 + α3 X3 + α4 X4 + ... + αk Xk + Z

Or

Y = α2 X2 + α3 X3 + α4 X4 + ... + αk Xk + E (Z) + [Z – E (Z)]

Or

Y = α1 + α2 X2 + α3 X3 + ... + αk Xk + U. [X1 = 1]

→ This model is known as the General Linear Regression Model (the focus is on the parameters).

Assumptions:

1). Ui is a random variable for each i.
1b). Ui is normally distributed (Ui ~ N) for each i.
2). E (Ui) = 0 for each i. This assumption holds by construction:
Ui = Zi – E (Zi)
E (Ui) = E (Zi) – E [E (Zi)] = E (Zi) – E (Zi) = 0
3). Var (Ui) = σ² for all i.
4). Co-Var (Ui, Uj) = 0 for all i ≠ j.
5). X variables are fixed or exogenous or non-random.

Example:

(i). Height depends on age. Age is an exogenous variable.

(ii). Marks depend on hours of study (reading). Hours of study is an exogenous variable.

(iii). C = α + βY + U, with Y = C + I + G + NX [NX = X – M]. Here income Y is not exogenous, because Y itself depends on C through the identity.



Two Variable Regression Model

Y = α + βX + U

Suppose we have data from a random sample of size n. Then we can write

Yi = α + βXi + Ui (i = 1, 2, 3, ..., n)

Example: Y is height and X is age; age is an exogenous variable.

Estimation:

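Suppose α̂ and β̂ are estimators of α and β respectively. Then the estimated values of Y are given as

Ŷi = α̂ + β̂ Xi

The regression residual is

ei = Yi – Ŷi = Yi – (α̂ + β̂ Xi)

∑ ei² = ∑ [Yi – α̂ – β̂ Xi]²

For OLS we minimize ∑ ei² with respect to α̂ and β̂. The first-order conditions are

∂∑ei² / ∂α̂ = ∑ 2 [Yi – α̂ – β̂ Xi] (–1) = 0

∂∑ei² / ∂β̂ = ∑ 2 [Yi – α̂ – β̂ Xi] (–Xi) = 0

Divide both sides by –2 and rearrange:

∑ Yi – n α̂ – β̂ ∑ Xi = 0 → (i)

∑ Xi Yi – α̂ ∑ Xi – β̂ ∑ Xi² = 0 → (ii)

From equation (i):

n α̂ = ∑ Yi – β̂ ∑ Xi

Divide both sides by n:

α̂ = Ȳ – β̂ X̄ → (iii)

Substitute (iii) into (ii):

∑ Xi Yi – (Ȳ – β̂ X̄) ∑ Xi – β̂ ∑ Xi² = 0

∑ Xi Yi – n X̄ Ȳ = β̂ [∑ Xi² – n X̄²] [since ∑ Xi = n X̄]

β̂ = [∑ Xi Yi – n X̄ Ȳ] / [∑ Xi² – n X̄²] → (iv)

Consider, with xi = Xi – X̄ and yi = Yi – Ȳ,

∑ xi yi = ∑ (Xi – X̄) (Yi – Ȳ) = ∑ Xi Yi – n X̄ Ȳ → (v)

Likewise, we can show that

∑ xi² = ∑ (Xi – X̄)² = ∑ Xi² – n X̄² → (vi)

Now substitute (v) and (vi) into (iv):

β̂ = ∑ xi yi / ∑ xi²

Thus we have

α̂ = Ȳ – β̂ X̄
β̂ = ∑ xi yi / ∑ xi²

→ These are the OLS estimators of α and β.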
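The two-variable OLS estimators can be computed directly from these formulas. The sketch below is illustrative only: the function name `ols_two_variable` and the small data set are hypothetical, and the last lines check that the residuals satisfy the two first-order conditions.

```python
def ols_two_variable(X, Y):
    """Return (alpha_hat, beta_hat) using the deviation-from-mean formulas."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sum_xx = sum((xi - x_bar) ** 2 for xi in X)
    beta_hat = sum_xy / sum_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data, chosen only so the arithmetic is easy to follow.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

alpha_hat, beta_hat = ols_two_variable(X, Y)
residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(X, Y)]

print("alpha_hat =", alpha_hat, "beta_hat =", beta_hat)
print("sum of residuals      =", sum(residuals))
print("sum of X * residuals  =", sum(xi * ei for xi, ei in zip(X, residuals)))
```

Both printed sums are zero up to floating-point rounding, which is what the first-order conditions (i) and (ii) require.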

⇒ Properties of β̂:

1). β̂ is a linear function of Y.

Proof:

β̂ = ∑ xi yi / ∑ xi² = ∑ xi (Yi – Ȳ) / ∑ xi² = ∑ xi Yi / ∑ xi² [since ∑ xi = 0]

= a1 Y1 + a2 Y2 + a3 Y3 + ... + an Yn [where ai = xi / ∑ xi²]

= ∑ ai Yi → (vii)

→ By assumption the X values are fixed; therefore a1 = x1 / ∑ xi² is a fixed value and can be treated as a constant. The same is true for a2, a3, ..., an.

This means that β̂ = a1 Y1 + a2 Y2 + a3 Y3 + ... + an Yn is a linear function of Y1, ..., Yn.

1b). β̂ is a linear function of U.

Proof:

β̂ = ∑ ai Yi = ∑ ai (α + β Xi + Ui)

= α ∑ ai + β ∑ ai Xi + ∑ ai Ui → (viii)

Now consider

⇒ ∑ ai = ∑ (xi / ∑ xi²) = (1 / ∑ xi²) ∑ xi = (1 / ∑ xi²) (0)

=> ∑ ai = 0. → (ix)

⇒ ∑ ai Xi = ∑ xi Xi / ∑ xi² = ∑ xi (xi + X̄) / ∑ xi² = ∑ xi² / ∑ xi²

=> ∑ ai Xi = 1. → (x)

Substitute (ix) and (x) into (viii):

β̂ = α ∑ ai + β ∑ ai Xi + ∑ ai Ui = α (0) + β (1) + ∑ ai Ui

= β + ∑ ai Ui → (xi)

= β + a1 U1 + a2 U2 + ... + an Un

so β̂ is a linear function of U.

→ It follows that β̂ is a random variable. It also follows that if Ui ~ N for each i, then β̂ ~ N.

2). β̂ is unbiased.

Proof:

β̂ = β + ∑ ai Ui

E (β̂) = E [β + ∑ ai Ui]
= β + ∑ ai E (Ui) [since ai is fixed]
= β + ∑ ai (0) [since E (Ui) = 0]

E (β̂) = β → (xii)

3). β̂ has minimum variance in the class of linear unbiased estimators.

First compute the variance of β̂:

Var (β̂) = E [β̂ – E (β̂)]²
= E [β + ∑ ai Ui – β]²
= E [∑ ai Ui]²
= E [a1² U1² + a2² U2² + ... + an² Un² + ∑∑ i≠j ai aj Ui Uj]
= a1² E (U1²) + a2² E (U2²) + ... + an² E (Un²) + ∑∑ i≠j ai aj E (Ui Uj) → (A)

since the x values are fixed. Consider

E (Ui²) = E [Ui – E (Ui)]² [where E (Ui) = 0]
= Var (Ui) = σ² → (xiii)

Now consider

E (Ui Uj) = E {[Ui – E (Ui)] [Uj – E (Uj)]} [where E (Ui) = 0]
= Co-Var (Ui, Uj) = 0 → (xiv)

Substitute (xiii) and (xiv) into (A):

Var (β̂) = a1² σ² + a2² σ² + ... + an² σ² + ∑∑ i≠j ai aj (0)
= σ² ∑ ai²
= σ² ∑ (xi / ∑ xi²)²
= σ² [∑ xi² / (∑ xi²)²]

Var (β̂) = σ² / ∑ xi² → (xv)

4). Consider another linear estimator:

Let β* be another linear estimator of β.

β* = ∑ bi Yi [where bi is fixed]
= b1 Y1 + b2 Y2 + b3 Y3 + ... + bn Yn.

→ If β* is to be unbiased, we need E (β*) = β. That is,

=> E (∑ bi Yi) = β
=> E [∑ bi (α + β Xi + Ui)] = β
=> E [α ∑ bi + β ∑ bi Xi + ∑ bi Ui] = β
=> α ∑ bi + β ∑ bi Xi + ∑ bi E (Ui) = β [where E (Ui) = 0]
=> α ∑ bi + β ∑ bi Xi = β → (xvi)

→ This requires ∑ bi = 0 and ∑ bi Xi = 1.

Now consider the variance of β*.

Var (β*) = E [β* – E (β*)]² → (xvii)

Consider

β* = ∑ bi Yi = ∑ bi (α + β Xi + Ui)
= α ∑ bi + β ∑ bi Xi + ∑ bi Ui
= α (0) + β (1) + ∑ bi Ui [using equation (xvi)]

β* = β + ∑ bi Ui.

Substitute in (xvii):

Var (β*) = E [β + ∑ bi Ui – β]²
= E [∑ bi Ui]²
= E [∑ bi² Ui² + ∑∑ i≠j bi bj Ui Uj]
= ∑ bi² E (Ui²) + ∑∑ i≠j bi bj E (Ui Uj) [E (Ui Uj) = 0]

Var (β*) = σ² ∑ bi² → (xviii)

→ We need to prove that Var (β*) > Var (β̂).

Var (β*) = σ² ∑ bi²
= σ² ∑ (bi – ai + ai)²
= σ² ∑ [(bi – ai)² + ai² + 2 (bi – ai) ai]
= σ² [∑ (bi – ai)² + ∑ ai² + 2 ∑ bi ai – 2 ∑ ai²]
= σ² [∑ (bi – ai)² – ∑ ai² + 2 ∑ bi ai] → (xix)

Recall [∑ ai = 0, ∑ ai Xi = 1, ∑ bi = 0, ∑ bi Xi = 1].

→ Consider

∑ bi ai = ∑ bi xi / ∑ xi² = [∑ bi Xi – X̄ ∑ bi] / ∑ xi² = 1 / ∑ xi²

and ∑ ai² = ∑ xi² / (∑ xi²)² = 1 / ∑ xi², so

∑ bi ai = ∑ ai²

Substitute in equation (xix):

Var (β*) = σ² [∑ (bi – ai)² – ∑ ai² + 2 ∑ ai²]
= σ² [∑ (bi – ai)² + ∑ ai²]
= σ² ∑ ai² + σ² ∑ (bi – ai)²
= σ² / ∑ xi² + σ² ∑ (bi – ai)²

Var (β*) = Var (β̂) + σ² ∑ (bi – ai)²

→ Var (β*) > Var (β̂) unless bi = ai for all i.

→ If β* is different from β̂, then we must have bi ≠ ai for at least some i. This yields

Var (β*) > Var (β̂)
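The decomposition Var (β*) = Var (β̂) + σ² ∑ (bi – ai)² can be verified on a small case. In the sketch below the X values and the perturbation `c` are hypothetical; `c` is chosen so that ∑ ci = 0 and ∑ ci Xi = 0, which keeps β* = ∑ bi Yi unbiased with bi = ai + ci. Variances are reported as multiples of σ².

```python
X = [1, 2, 3, 4, 5]
n = len(X)
x_bar = sum(X) / n
x = [xi - x_bar for xi in X]                 # deviations from the mean
sum_xx = sum(xi ** 2 for xi in x)

a = [xi / sum_xx for xi in x]                # OLS weights: ai = xi / sum(xi^2)
c = [0.1, -0.2, 0.1, 0.0, 0.0]               # hypothetical: sum(c) = 0, sum(c * X) = 0
b = [ai + ci for ai, ci in zip(a, c)]        # weights of another unbiased estimator

sum_a2 = sum(ai ** 2 for ai in a)            # Var(beta_hat) / sigma^2
sum_b2 = sum(bi ** 2 for bi in b)            # Var(beta*)    / sigma^2
sum_diff2 = sum((bi - ai) ** 2 for ai, bi in zip(a, b))

print("sum(b) =", sum(b), " sum(b * X) =", sum(bi * xi for bi, xi in zip(b, X)))
print("Var(beta_hat)/sigma^2 =", sum_a2)
print("Var(beta*)/sigma^2    =", sum_b2)
print("difference            =", sum_diff2)
```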

Practice Equation:

Suppose we want to estimate an equation without an intercept.

Example: Per minute income hypothesis.

Qi = βAi + Ui [Production = β × Area]

Or

Ti = βM + Ui

In general,

Yi = βXi + Ui [Ui = error]

Derive the OLS estimator of β; call it β̂(1).

Now convert the equation by dividing through by Xi:

Yi / Xi = β + εi [εi = Ui / Xi]

Again derive the OLS estimator of β; call it β̂(2).

Compare β̂(1) and β̂(2), and think over it.

Heteroscedasticity: [Yi / Xi = β + εi]

Properties of OLS residuals:

1). OLS residuals are orthogonal to the regressors.

Two vectors [a1 a2] and [b1 b2] are orthogonal when a1.b1 + a2.b2 = 0. The regressors in the equation

Yi = α + β Xi + Ui

are 1 and Xi.

a). [1, 1, ..., 1]: e1 + e2 + e3 + ... + en = 0 => ∑ ei = 0
b). [X1, X2, ..., Xn]: X1 e1 + X2 e2 + X3 e3 + ... + Xn en = 0 => ∑ Xi ei = 0

Proof: The OLS estimators are derived from the first-order conditions.

∂∑ei² / ∂α̂ = 0 => ∑ (Yi – α̂ – β̂ Xi) = 0

=> ∑ (Yi – Ŷi) = 0 => ∑ ei = 0

∂∑ei² / ∂β̂ = 0 => ∑ (Yi – α̂ – β̂ Xi) Xi = 0

=> ∑ (Yi – Ŷi) Xi = 0 => ∑ Xi ei = 0

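PERFORMANCE OF AN ESTIMATED EQUATION:

Consider the following relation:

Yi = Ŷi + ei

Since ∑ ei = 0, the mean of Ŷ equals Ȳ, so in deviation form

yi = ŷi + ei

Square both sides and sum over i:

∑ yi² = ∑ ŷi² + ∑ ei² + 2 ∑ ŷi ei → (i)

Now consider

∑ ŷi ei = β̂ ∑ xi ei, where ∑ xi ei = ∑ Xi ei – X̄ ∑ ei = 0

so ∑ ŷi ei = 0. Using this result we can write equation (i) as

∑ yi² = ∑ ŷi² + ∑ ei²

[Total variation = explained variation + unexplained variation]

→ The performance of the estimated model can be measured by R², which is the square of the multiple correlation coefficient between one variable (Y) on the one hand and a set of variables (X) on the other hand.

R² is also called the coefficient of determination. It is given by

R² = ∑ ŷi² / ∑ yi²

Also note that

R² = 1 – ∑ ei² / ∑ yi²

Extreme cases: R² is at its minimum, 0, when ∑ ŷi² = 0, and at its maximum, 1, when ∑ ei² = 0, so

0 ≤ R² ≤ 1

→ There is no benchmark for R²; its interpretation depends on the context in which it is taken.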

Example:

Age of Ali = α + β (age of Ali's dad) + U, R² = 1
Weight of Ali = α + β (weight of Ali's dad) + U, R² = 1
Pakistan, consumption: R² = 0.95

→ Spurious cause between Y and G

→ In cross section data R² = 0.4 is good, but in time series data R² = 0.9 is not remarkable. Because of its boundedness, R² is often thought to be the best measure. The problem with R² is that it increases if we add more variables to the regression.

Example: The dependent variable is household consumption. Data: cross section.

C = α + β Y + U, R² = 0.25
C = α + β Y + γ N + U, R² = 0.46
C = α + β Y + γ Nc + δ Nm + π Nf + λ R + U, R² = 0.56
C = linear function of Y, Nc, Nm, Nf, residence, female education, male education, wealth, ... etc., R² = 0.9899

→ If we make R² the criterion for choosing the number of variables in the equation, we will end up with as many variables as there are sample points, with R² = 1.

→ Also note that as the sample size decreases, R² will in general increase.

→ R² puts no limit on the number of variables included in the model, whereas a model has to be small.

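⇒ Adjusted R²: Consider the formula for R²:

R² = 1 – ∑ ei² / ∑ yi²

Now divide each sum of squares by its degrees of freedom:

R̄² = 1 – [∑ ei² / (n – k)] / [∑ yi² / (n – 1)]

→ Where k = number of parameters in the equation [α, β, γ, δ, λ, π etc.].

→ We can write adjusted R² as

R̄² = 1 – (1 – R²) (n – 1) / (n – k)

→ Adding a variable raises R² but also raises k. If the net effect is positive, then R̄² increases and we include the variable under consideration.

Degrees of freedom = (n – k); we need at least two variables.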
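The adjustment can be tried on the household consumption example. The R² values and parameter counts below come from that example; the sample size n = 30 is an assumption made only for illustration, since the notes do not report it, and the function name is hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


n = 30  # assumed sample size (not given in the notes)

# (R^2, number of parameters k) for the first three consumption equations.
models = [(0.25, 2), (0.46, 3), (0.56, 6)]

for r2, k in models:
    print("R2 =", r2, "k =", k, "adjusted R2 =", round(adjusted_r2(r2, n, k), 4))
```

Adjusted R² is always below R² when k > 1, and the gap widens as more parameters are added for a given n.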

Statistical Inference in Econometrics:

Theorems:

1). If X is distributed normally with mean µ and standard deviation σ,

X ~ N (µ, σ²) [µ and σ determine the shape of the distribution exactly]

then

Z = (X – µ) / σ ~ N (0, 1) [standardized normal distribution]

2). If

X1 ~ N (µ1, σ1²)
X2 ~ N (µ2, σ2²)
X3 ~ N (µ3, σ3²)
:
Xm ~ N (µm, σm²)

and X1, X2, X3, ..., Xm are mutually independent, then

X1 + X2 + X3 + ... + Xm ~ N (µ1 + µ2 + ... + µm, σ1² + σ2² + σ3² + ... + σm²)

If X1, X2, X3, ..., Xm are not mutually independent, then

X1 + X2 + X3 + ... + Xm ~ N (µ1 + µ2 + ... + µm, σ1² + σ2² + σ3² + ... + σm² + ∑∑ i≠j σij) [σij = Co-Var (Xi, Xj)]

3). Suppose Xi ~ N (µi, σi²) [i = 1, 2, 3, ..., m] and the Xi are mutually independent. Then

V = ∑ [(Xi – µi) / σi]² ~ χ²m [chi-square distribution with m degrees of freedom]

Number of observer of minimum required.

Continuous variable (height): → there is no point probability, only interval probability.

4). Suppose V1 ~ χ²m1 and V2 ~ χ²m2, and V1 and V2 are independent. Then

F = (V1 / m1) / (V2 / m2) ~ F m1, m2

[Fisher distribution with numerator degrees of freedom m1 and denominator degrees of freedom m2]

5). Suppose X ~ N (µ, σ²) and V ~ χ²m, and X and V are mutually independent. Then

t = [(X – µ) / σ] / (V / m)½ = (standardized normal) / (V / m)½

t = [(X – µ) / σ] / (V / m)½ ~ t m

The t distribution has a larger variance and fatter tails than the standardized normal.

Back to Econometrics:

Consider Y = α + β X + U. Suppose we want to test the null hypothesis

H0: β = β0 [where β0 is a given value]

against the alternative

H1: β ≠ β0

As we know,

β̂ ~ N (β, σ² / ∑ xi²)

Therefore

Z = (β̂ – β) / (σ² / ∑ xi²)½ ~ N (0, 1)



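→ In testing the null hypothesis (H0), we use the estimated value β̂ from the data, the hypothetical value β0 in place of β, and ∑ xi² computed from the actual data. That is,

β̂ from the data
β = β0 from H0
∑ xi² from the data

Note that σ² remains unknown. One option is to replace σ² by its unbiased estimator [the proof is in the book]:

σ̂² = ∑ ei² / (n – 2)

Then Z is no longer normally distributed; if the sample size is large, Z is approximately normal.

Another option is to convert Z into a t distribution. The steps are as follows:

1). β̂ ~ N (β, σ² / ∑ xi²), so Z = (β̂ – β) / (σ² / ∑ xi²)½ ~ N (0, 1).

2). It can be shown that

V = ∑ ei² / σ² ~ χ² n–2

Or

V = (n – 2) σ̂² / σ² ~ χ² n–2

3). It can also be shown that Z and V are mutually independent.

4). It follows from 1, 2 and 3 that

t = Z / [V / (n – 2)]½ ~ t n–2

Now consider

t = [(β̂ – β) / (σ² / ∑ xi²)½] / (σ̂² / σ²)½ = (β̂ – β) / (σ̂² / ∑ xi²)½ = (β̂ – β) / SE (β̂)

Recall that under H0 we set β = β0, so t = (β̂ – β0) / SE (β̂) ~ t n–2.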

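Example 1: Y = α + β X + U, where Y = weight and X = age; x = (X – mean X), y = (Y – mean Y).

    Y      X      x      y      x²     y²     x y
    4      1     –2     –5      4     25     10
    8      2     –1     –1      1      1      1
   10      3      0      1      0      1      0
   12      4      1      3      1      9      3
   11      5      2      2      4      4      4

 ∑Y=45  ∑X=15  ∑x=0   ∑y=0  ∑x²=10 ∑y²=40 ∑xy=18

β̂ = ∑ x y / ∑ x² = 18 / 10 = 1.8

α̂ = Ȳ – β̂ X̄ = 9 – 1.8 × 3 = 9 – 5.4 = 3.6

The estimated equation is

Ŷ = 3.6 + 1.8 X

3.6 = weight at the time of birth.
1.8 = rate of increase in weight per year of increase in age.

Suppose we want to test

H0: β = 0
H1: β ≠ 0

Fitted values:

Ŷ = 5.4 when X = 1
Ŷ = 7.2 when X = 2
Ŷ = 9 when X = 3
Ŷ = 10.8 when X = 4
Ŷ = 12.6 when X = 5

We can compute

∑ e² = ∑ (Y – Ŷ)² = 7.6
σ̂² = ∑ e² / (n – 2) = 7.6 / 3 = 2.533
Var (β̂) = σ̂² / ∑ x² = 2.533 / 10 = 0.2533
SE (β̂) = (0.2533)½ = 0.503

⇒ The standard deviation of β̂ is its standard error.

Now

t = (β̂ – 0) / SE (β̂) = 1.8 / 0.503 = 3.58

→ Set the level of significance (the probability of type 1 error, equal to one minus the level of confidence) = 0.05. With n – 2 = 3 degrees of freedom the critical t-values are ± 3.182.

→ The calculated t-value falls in the rejection region, so we reject H0. This means the effect of age on weight is significantly different from zero.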
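The arithmetic of Example 1 can be reproduced in a few lines. This is a sketch using only the standard library; the variable names are illustrative, and the critical value 3.182 is the tabulated two-tailed 5% value for 3 degrees of freedom, entered by hand rather than computed.

```python
import math

Y = [4, 8, 10, 12, 11]   # weight
X = [1, 2, 3, 4, 5]      # age
n = len(Y)

x_bar = sum(X) / n
y_bar = sum(Y) / n
sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
sum_xx = sum((xi - x_bar) ** 2 for xi in X)

beta_hat = sum_xy / sum_xx
alpha_hat = y_bar - beta_hat * x_bar

residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(X, Y)]
rss = sum(e ** 2 for e in residuals)      # sum of squared residuals
sigma2_hat = rss / (n - 2)                # unbiased estimator of sigma^2
se_beta = math.sqrt(sigma2_hat / sum_xx)  # standard error of beta_hat
t_value = (beta_hat - 0) / se_beta        # test of H0: beta = 0

T_CRITICAL = 3.182  # t table, 3 degrees of freedom, 5% two-tailed
print("alpha_hat =", round(alpha_hat, 3), "beta_hat =", round(beta_hat, 3))
print("RSS =", round(rss, 3), "SE(beta_hat) =", round(se_beta, 4))
print("t =", round(t_value, 3), "reject H0:", abs(t_value) > T_CRITICAL)
```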



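Testing of Hypothesis:

Suppose we have estimated the following equation:

C = 10.0 + 0.8 Y, R² = 0.97, n = 25
    (2.5)  (0.1)
[The values in brackets are standard errors]

→ Interpret these results as an economist.

Before we interpret, we test a few hypotheses.

1). H0: α = 0, H1: α ≠ 0

Degrees of freedom = 25 – 2 = 23
Level of significance = 0.05
Critical t-values = ± 2.069

t = (10.0 – 0) / 2.5 = 4.0

→ | t | > 2.069, so we reject H0.

2). H0: β = 0, H1: β ≠ 0

t = (0.8 – 0) / 0.1 = 8.0

→ | t | > 2.069, so we reject H0.

3). H0: β = 1, H1: β < 1

t = (0.8 – 1) / 0.1 = –2.0

This is a one-tailed test; the critical t-value at the 0.05 level with 23 degrees of freedom is –1.714.

→ t < –1.714, so we reject H0.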
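The three tests on the consumption function follow one pattern, t = (estimate – hypothesized value) / standard error. The sketch below applies it to the reported coefficients; the function name is illustrative, and the critical values 2.069 (two-tailed) and 1.714 (one-tailed) are 5% table values for 23 degrees of freedom entered by hand.

```python
def t_statistic(estimate, hypothesized, std_error):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


T_TWO_TAILED = 2.069  # t table, 23 df, 5% two-tailed
T_ONE_TAILED = 1.714  # t table, 23 df, 5% one-tailed

t_alpha = t_statistic(10.0, 0.0, 2.5)  # H0: alpha = 0
t_beta = t_statistic(0.8, 0.0, 0.1)    # H0: beta = 0
t_mpc = t_statistic(0.8, 1.0, 0.1)     # H0: beta = 1 against H1: beta < 1

print("H0: alpha = 0 ->", round(t_alpha, 3), "reject:", abs(t_alpha) > T_TWO_TAILED)
print("H0: beta = 0  ->", round(t_beta, 3), "reject:", abs(t_beta) > T_TWO_TAILED)
print("H0: beta = 1  ->", round(t_mpc, 3), "reject:", t_mpc < -T_ONE_TAILED)
```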

Interpretation: The results show that 97% of the variation in consumption expenditure is explained by our model, which indicates that the overall performance of the equation is satisfactory. The intercept is positive and significantly different from zero; its magnitude shows that subsistence or autonomous consumption expenditure is 10 thousand rupees per capita per year. Further, the marginal propensity to consume (MPC) is significantly different from zero and less than one. The estimated MPC of 0.8 shows that 80% of each incremental rupee of income is consumed, while the remaining 20% is saved.

⇒ Testing a linear restriction on two or more parameters:

Y = α + β X + U

H0: α + β = 1
H1: α + β ≠ 1

t = (α̂ + β̂ – 1) / SE (α̂ + β̂) ~ t n–2

where

Var (α̂ + β̂) = Var (α̂) + Var (β̂) + 2 Co-Var (α̂, β̂)

Actual application:

Variances and covariances are obtained from the coefficient variance-covariance matrix.

Suppose α̂ = 1.7, β̂ = –0.2.

H0: α + β = 0, H1: α + β ≠ 0

t = (α̂ + β̂ – 0) / [Var (α̂) + Var (β̂) + 2 Co-Var (α̂, β̂)]½

H0: α – β = 0, H1: α – β ≠ 0

t = (α̂ – β̂ – 0) / [Var (α̂) + Var (β̂) – 2 Co-Var (α̂, β̂)]½

Three Variables Model:

Yi = β1 + β2X2i + β3X3i + Ui

Assumptions:

1). Ui is a random variable for each i.
1b). Ui is normally distributed (Ui ~ N) for each i.
2). E (Ui) = 0 for each i. This assumption holds by construction:
Ui = Zi – E (Zi)
E (Ui) = E (Zi) – E [E (Zi)] = E (Zi) – E (Zi) = 0
3). Var (Ui) = σ² for all i.
4). Co-Var (Ui, Uj) = 0 for all i ≠ j.
5). X variables are fixed or exogenous or non-random.
6). The correlation between X2 and X3 is not equal to ± 1.

[That is, X2 and X3 are not perfectly correlated.]

E (Yi) = β1 + β2X2i + β3X3i

Consider

β2 = ∂E (Y) / ∂X2, β3 = ∂E (Y) / ∂X3

→ If X2 and X3 are perfectly correlated, all variation in X2 and X3 is common, so it is impossible to separate the effects of X2 and X3 on Y [conceptually, where variables are perfectly correlated the model construction is wrong].

→ We can further show that all possible methods of estimation will fail. We can also interpret this situation by the argument that the information content in the data is zero.

[Figure: three diagrams of the overlap between X2 and X3. Information content is zero; information content is full; information is rich.]



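Estimation:

Yi = β1 + β2X2i + β3X3i + Ui

→ Replace the unknown parameters by their estimators and set U = 0:

Ŷi = β̂1 + β̂2X2i + β̂3X3i

In OLS we solve the following problem:

Min ∑ ei² = ∑ (Yi – β̂1 – β̂2X2i – β̂3X3i)²

First-order conditions:

∑ (Yi – β̂1 – β̂2X2i – β̂3X3i) = 0
∑ (Yi – β̂1 – β̂2X2i – β̂3X3i) X2i = 0
∑ (Yi – β̂1 – β̂2X2i – β̂3X3i) X3i = 0

The end result, written in deviations from means, is as follows:

β̂1 = Ȳ – β̂2 X̄2 – β̂3 X̄3
β̂2 = [∑x2y ∑x3² – ∑x3y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]
β̂3 = [∑x3y ∑x2² – ∑x2y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]

Also note:

Var (β̂2) = σ² ∑x3² / [∑x2² ∑x3² – (∑x2x3)²]
Var (β̂3) = σ² ∑x2² / [∑x2² ∑x3² – (∑x2x3)²]

→ It can be shown that the following properties hold:

1). β̂1, β̂2 and β̂3 are linear functions of Y.
2). β̂1, β̂2 and β̂3 are unbiased.
3). β̂1, β̂2 and β̂3 have minimum variance in the class of linear unbiased estimators.

It can also be shown that an unbiased estimator of the variance of U, σ², is

σ̂² = ∑ ei² / (n – 3)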

Multiple Regression Model:

→ Testing of more than one restriction jointly:

Suppose we have to test

H0: β2 = 1, β3 = 0
H1: β2 ≠ 1 and/or β3 ≠ 0

and the model is

Y = β1 + β2X2 + β3X3 + β4X4 + U

→ [We count the number of restrictions, not the number of parameters that appear in them.]

The test statistic is

F = [(∑eR² – ∑eU²) / R] / [∑eU² / (n – k)] ~ F R, n–k

where ∑eU² is the residual sum of squares of the unrestricted equation, ∑eR² is the residual sum of squares of the restricted equation, R is the number of restrictions, n is the sample size and k is the number of parameters in the unrestricted equation.

The steps are as follows:

1). Estimate the unrestricted equation to compute β̂1, β̂2, β̂3, β̂4. Then compute

Ŷi = β̂1 + β̂2X2i + β̂3X3i + β̂4X4i

Then ei = Yi – Ŷi.

Finally compute ∑eU².

Suppose ∑eU² = 50.

2). Impose the restrictions given in H0 on the model. In our example this yields

Y = β1 + (1) X2 + (0) X3 + β4X4 + U

Y – X2 = β1 + β4X4 + U

Estimate β1 and β4 from this restricted equation, and compute

β̂1 = ......
β̂4 = ......

Now compute

ei = Yi – Ŷi.

Finally compute ∑eR².

Suppose we have ∑eR² = 60.

3). Compute the F-statistic.

Note these values: ∑eU² = 50, ∑eR² = 60, R = 2, n = 34 and k = 4.

Now plug in the values:

F = [(60 – 50) / 2] / [50 / (34 – 4)] = (10 / 2) / (50 / 30) = (5 × 30) / 50 = 3.

4). Conclusion.

We conclude by comparing the calculated F-value with the critical F-value. In our case the critical F-value, at R = 2 and n – k = 30 degrees of freedom, is supposed to be 2.87.

→ In our example the calculated F-value > critical F-value, so we reject H0.
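Step 3 can be written as a small function. This sketch simply evaluates the restricted versus unrestricted formula with the numbers assumed in the example; the function and argument names are illustrative.

```python
def f_statistic(rss_restricted, rss_unrestricted, num_restrictions, n, k):
    """F = [(RSS_R - RSS_U) / R] / [RSS_U / (n - k)].

    n is the sample size and k the number of parameters in the
    unrestricted equation.
    """
    numerator = (rss_restricted - rss_unrestricted) / num_restrictions
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator


# Values supposed in the example: RSS_U = 50, RSS_R = 60, R = 2, n = 34, k = 4.
F = f_statistic(rss_restricted=60, rss_unrestricted=50,
                num_restrictions=2, n=34, k=4)
print("F =", round(F, 4), "with", 2, "and", 34 - 4, "degrees of freedom")
```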



è How to check F- value in the Table:

⇒ Consider now the general model:

Y = β1 + β2X2 + β3X3 + ... + βK XK + U

Special Case 1: Just one restriction.

Examples:

H0: β1 = 0.
H0: β1 = 1.
H0: β2 + β3 = –1.
H0: β2 + β3 + β4 + β5 = 0.
H0: β2 + β3 + β4 + ... + βK = 0.

→ In this case we can show that

F = t²
Critical F = (critical t)²

| t | > | t critical | ⇔ F > F critical
| t | = | t critical | ⇔ F = F critical
| t | < | t critical | ⇔ F < F critical

→ So the "t" and "F" tests lead to identical conclusions.

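Special Case 2: Each parameter except the intercept is set equal to zero.

H0: β2 = 0, β3 = 0, β4 = 0, ..., βK = 0.
H1: At least one βj ≠ 0 [j = 2, 3, 4, ..., K]

→ In this case the restricted model becomes

Y = β1 + U
Or
Y = b1 + V

The OLS estimator of the constant in this model is Ȳ, so ∑eR² = ∑ (Yi – Ȳ)² = ∑ y². The number of restrictions is R = K – 1.

Now

F = [(∑ y² – ∑eU²) / (K – 1)] / [∑eU² / (n – K)]

→ Here all values come from the unrestricted model, so we can ignore the subscript U and write

F = [(∑ y² – ∑ e²) / (K – 1)] / [∑ e² / (n – K)] = [∑ ŷ² / (K – 1)] / [∑ e² / (n – K)]

→ It is a test of the overall performance of the model.

Note 1: In the general case,

F = [(∑eR² – ∑eU²) / R] / [∑eU² / (n – K)]

Or

divide all terms by ∑ y²:

F = [(R²U – R²R) / R] / [(1 – R²U) / (n – K)]

[R²U and R²R are the R² of the unrestricted and restricted equations; the bare R is the number of restrictions]

→ Imposing restrictions lowers R². The F-statistic measures the increase in R² due to the removal of the restrictions.

Note 2: In Special Case 2,

F = [∑ ŷ² / (K – 1)] / [∑ e² / (n – K)]

Divide all terms by ∑ y²:

F = [R² / (K – 1)] / [(1 – R²) / (n – K)]

When F = 0 => R² = 0. → The F-statistic tests whether R² is significantly different from zero.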
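Both special cases can be checked against the sums from Example 1 (∑x² = 10, ∑y² = 40, ∑xy = 18, n = 5), where the model has K = 2 parameters. In a two-variable model the test of H0: β = 0 is a single restriction and also the overall test, so F computed from R² should equal t². This is a sketch; the function name is illustrative.

```python
import math


def f_from_r2(r2, n, k):
    """Overall F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)]."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))


# Sums taken from Example 1 of these notes.
sum_x2, sum_y2, sum_xy, n, k = 10, 40, 18, 5, 2

beta_hat = sum_xy / sum_x2
r2 = beta_hat ** 2 * sum_x2 / sum_y2     # explained / total variation
rss = sum_y2 * (1 - r2)                  # unexplained variation
se_beta = math.sqrt((rss / (n - k)) / sum_x2)
t_value = beta_hat / se_beta

F = f_from_r2(r2, n, k)
print("R2 =", round(r2, 4), "F =", round(F, 4), "t^2 =", round(t_value ** 2, 4))
```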

MULTICOLLINEARITY:

It is an econometric problem. → There are four steps in studying such a problem:

• What is the problem?
• What are the consequences of the problem?
• How do we test for the problem?
• What is the solution?

→ Recall the three variables regression model.

Yi = β1 + β2X2i + β3X3i + Ui

Also recall

β̂2 = [∑x2y ∑x3² – ∑x3y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]
β̂3 = [∑x3y ∑x2² – ∑x2y ∑x2x3] / [∑x2² ∑x3² – (∑x2x3)²]

Also note

Var (β̂2) = σ² ∑x3² / [∑x2² ∑x3² – (∑x2x3)²]
Var (β̂3) = σ² ∑x2² / [∑x2² ∑x3² – (∑x2x3)²]

→ In each equation the denominator is the same:

Ω = ∑x2² ∑x3² – (∑x2x3)² = ∑x2² ∑x3² (1 – γ²23)

where γ23 is the correlation coefficient between X2 and X3.

→ Now suppose X2 and X3 are perfectly correlated:

γ23 = ± 1, so γ²23 = 1 and Ω = 0. In this case

β̂2 = (......) / 0, β̂3 = (......) / 0

→ So we cannot estimate β2 and β3 by OLS. It can be shown that β2 and β3 cannot be estimated at all. In fact the true values of β2 and β3 are not even properly defined:

β2 = ∂E (Y) / ∂X2 and β3 = ∂E (Y) / ∂X3 do not exist.

→ On the other hand, if X2 and X3 are not correlated at all, then

γ23 = 0, so ∑x2x3 = 0. In this case

β̂2 = ∑x2y / ∑x2², β̂3 = ∑x3y / ∑x3²

→ These are the OLS estimators when we regress

Y = α + β2 X2 + U → one model

Y = α + β3 X3 + U → other model

Also note that in this case

β2 = ∂E (Y) / ∂X2 = dE (Y) / dX2

β3 = ∂E (Y) / ∂X3 = dE (Y) / dX3

→ Finally note that in this case the multiple regression equation and the partial (simple) regression equations produce identical results.

Recap: In the case γ23 = ± 1, the multiple regression equation fails both theoretically and in application. In the other extreme case, γ23 = 0, the multiple regression equation is not needed. So the only practical use of the multiple regression equation is when

γ23 ≠ 0 and γ23 ≠ ± 1.
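The two extreme cases can be seen by computing the common denominator Ω = ∑x2² ∑x3² – (∑x2x3)² directly. In this sketch the data are hypothetical: one X3 series is an exact linear function of X2 (perfect correlation) and the other is constructed to be uncorrelated with X2.

```python
def deviations(values):
    """Deviations from the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def omega(X2, X3):
    """Common denominator of the three-variable OLS formulas."""
    x2, x3 = deviations(X2), deviations(X3)
    sum_x2x2 = sum(a * a for a in x2)
    sum_x3x3 = sum(b * b for b in x3)
    sum_x2x3 = sum(a * b for a, b in zip(x2, x3))
    return sum_x2x2 * sum_x3x3 - sum_x2x3 ** 2


X2 = [1, 2, 3, 4, 5]
X3_perfect = [2 * v + 1 for v in X2]   # exact linear function of X2
X3_uncorrelated = [2, -1, -2, -1, 2]   # sum of cross products with x2 is zero

print("Omega, perfect correlation:", omega(X2, X3_perfect))
print("Omega, zero correlation   :", omega(X2, X3_uncorrelated))
```

With perfect correlation Ω is zero, so the formulas for β̂2 and β̂3 involve division by zero; with zero correlation Ω reduces to ∑x2² ∑x3².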



⇒ Now multicollinearity can be defined as "strong but imperfect correlation among X variables". It is a correlation problem in which the independent (X) variables are highly correlated, so that the quality of the estimators is poor.

→ Another way of understanding the problem is to look at the information content in the data.

[Figure: diagrams of the overlap between X2 and X3 in four cases. γ23 = ± 1: information content is zero. γ23 = 0: information content is full. | γ23 | is low: information content is rich. | γ23 | is high: information content is poor.]

→ Multicollinearity is a problem of poor information content in the data. Estimation is possible, but it performs poorly.

Concept of Multicollinearity: Types of situation in which multicollinearity can arise.

Multicollinearity is most common in time series data, because in such data variables are highly correlated due to a common trend.

Example: The relationship between my father's age and my age is one-to-one, which is a spurious cause.

Multicollinearity can also arise if the X variables are poorly specified.

Examples:

• Exchange Rate.

ER = α + β CAD + γ Export + θ Import + U.

In this equation we have repeated the same variables on the right hand side, as exports and imports are included in the Current Account Deficit (CAD). Therefore it is better to include either CAD only or exports and imports only.

• Consumer Price Index.

CPI = α + β CC + γ DD + θ TD + λ OD + U

In this equation for CPI, DD and TD both come from commercial banks and CC is a major part of money. A better specification is

CPI = α + β [CC + DD + TD + OD] + U
Or
CPI = α + β M2 + U [M2 = CC + DD + TD + OD]
Or
CPI = α + β (CC + DD) + γ (TD + OD) + U

→ There could be a model specification problem. This is a matter of judgment, not a matter of science.

⇒ Consequences of Multicollinearity:

Note that OLS estimators remain BLUE.

• (1) The variances of the OLS estimators become large. Recall the formulas for the variances:

Var (β̂2) = σ² / [∑x2² (1 – γ²23)]
Var (β̂3) = σ² / [∑x3² (1 – γ²23)]

→ If | γ23 | is high, (1 – γ²23) will be low; therefore Var (β̂2) and Var (β̂3) will be large.

It follows that

SE (β̂j) = [Var (β̂j)]½

→ will also be large. Thus the t-value for H0: βj = 0,

t = (β̂j – 0) / SE (β̂j),

→ will become small.

Therefore we may accept H0 when we should not accept it. In other words, we may wrongly conclude that the X variables do not affect Y.

Example: CPIt = α + β M2t + γ ERt + θ Yt + λ CPIt-1 + Ut

With more multicollinearity in the data, the standard errors are larger and the t-values are misleading, so we may accept H0: βj = 0. Another implication of this consequence is that β̂j varies quite a bit from sample to sample. In particular, even small changes in the sample may produce large changes in β̂j.

Take the above example of the CPI equation. We may have β̂ = 2.1 when we use a sample of annual data from 1970 to 2002, and β̂ = –3.7 when the data are from 1970 to 2005. So the conclusion is that the β̂'s change erratically with small changes in the data, specification etc. When the t-values are small the β̂'s are volatile; there is no robustness (stability) in the model, and the model cannot be trusted.

(2) Recall the formula for Cov (β̂2, β̂3):

Cov (β̂2, β̂3) = – r23 σ² / [(1 – r23²) √(∑x2²) √(∑x3²)]

If r23 is high (due to multicollinearity), this will make Cov (β̂2, β̂3) large in absolute value. Further note that

Corr (β̂2, β̂3) = – r23.

If X2 and X3 are positively and highly correlated with each other, then β̂2 and β̂3 have a negative and very large correlation: underestimation of β2 will accompany overestimation of β3 and vice versa.

Example: β2 = 100 and β3 = 250; β3 = 300 and β2 = 100.

Likewise, if X2 and X3 are negatively and highly correlated, then β̂2 and β̂3 are positively correlated: over (under) estimation of β2 will accompany over (under) estimation of β3.

Consequences (1) and (2) imply that the estimated parameters (β̂'s) become volatile (unreliable, unstable) and too sensitive; their magnitudes are quite likely to be unrealistic in terms of sign and size (even some significant parameters may have the wrong sign).

Example: With variables such as CPI, Y, ER and IR, we may obtain MPC = 1.3 or MPC = -1.3. Similarly, an own-price elasticity that should be negative may come out positive; it means the estimation is not realistic or reliable.

Testing and Diagnosis of Multicollinearity:

Formal tests of multicollinearity are complex and not very fruitful. In practice we may rely on certain clues and symptoms (indicators).

(1) Multicollinearity is likely to be present if the data are time series data at low frequency (for example annual rather than monthly data), unless the data are de-trended (that is, the common trend is removed).

(2) A very popular symptom of multicollinearity is that the overall performance of the estimated equation is good in terms of a high value of R², but the t-statistics for individual regression coefficients for H0: βj = 0 are mostly insignificant.

Example 1:

Log CPIt = 1.2 + 0.3 log M2t + 0.7 log Yt – 0.25 log ERt + 0.97 log CPIt-1

T-values: log M2t: 0.85 (insignificant); log Yt: 1.37 (insignificant); log ERt: -0.09 (insignificant); log CPIt-1: 44.73 (highly significant).

Expected signs: negative for log Yt and positive for log ERt, whereas the estimates have the opposite signs.

R² = 0.9938 is very impressive.

→ In this case the overall performance is good, but the individual effects are misleading because of wrong signs, high standard errors and small, insignificant t-values. The high R² arises largely because lagged CPI is present on the right-hand side.

Example 2:

Log (Qw) = 1.2 + 0.3 log Y – 0.1 log (Pw) + 0.3 log (Pr) + 1.1 log (Npop)

T-values: log Y: 1.57 (insignificant); log Pw: -0.73 (insignificant); log Pr: 1.21 (insignificant); log Npop: 17.43 (highly significant).

R² = 0.99.



Where

Coefficient of log Y = income elasticity.
Coefficient of log Pw = own-price elasticity of wheat.
Coefficient of log Pr = cross-price elasticity with respect to the price of rice.

Here the signs are as expected, so the results are fine in that respect.

(3) Parameter estimates are too sensitive to changes in the sample, the definition of variables and the specification of the model.

If we change the sample a little by adding new data and re-run the regression, the results change drastically (Var (β̂) is too high, which is not good). Variables can be defined in more than one way, for example income as GNP, GDP or GNI, and likewise for saving data; the results change drastically depending on which definition we use. The results also depend on how we specify the model:

C = α + β1 Y + γ R + … → Linear function.

Log C = α + β2 log Y + γ log R + … → Logarithmic function.

β1 = ∂C / ∂Y

β2 = ∂ Log C / ∂ Log Y = (∂C / ∂Y) (Y / C)

Therefore β2 = β1 (Y / C).

Estimates that change drastically across such alternatives are too sensitive.

(4) Detection through the Correlation Matrix.

Let us take an example:

Log P = α + β log M + γ log GDP + θ log ER + U

where P = CPI, M = money supply (Ms), GDP = real GDP and ER = real exchange rate.

Construct the correlation matrix for all variables.

Rule of Thumb: If the correlation among X variables is stronger than the correlation between X and Y, then multicollinearity is present.

Even when X and Y are strongly correlated, the correlation among the X variables can undermine the relationship.

Case 1: Between X and Y: r (P, M) = 0.95, r (P, ER) = 0.70. Among X variables: r (M, ER) = 0.80 (not OK). Multicollinearity exists.

Case 2: Between X and Y: r (P, M) = 0.95, r (P, Y) = 0.65. Among X variables: r (M, Y) = 0.60 (OK). No multicollinearity exists.

Case 3: Between X and Y: r (P, GDP) = 0.95, r (P, ER) = 0.70. Among X variables: r (GDP, ER) = 0.55 (OK). No multicollinearity exists.
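The rule of thumb can be applied mechanically to the three cases. The sketch below is one reading of the rule (the correlation among the X variables is compared with the weaker of the two X-Y correlations); the function name is hypothetical.

```python
def multicollinearity_flag(r_y_x1: float, r_y_x2: float, r_x1_x2: float) -> bool:
    """Flag multicollinearity when the X's are more strongly correlated with
    each other than the weaker regressor is with Y (rule of thumb only)."""
    return abs(r_x1_x2) > min(abs(r_y_x1), abs(r_y_x2))


# (r between Y and first X, r between Y and second X, r among the two X's)
cases = {
    "Case 1": (0.95, 0.70, 0.80),
    "Case 2": (0.95, 0.65, 0.60),
    "Case 3": (0.95, 0.70, 0.55),
}
for name, (r1, r2, rxx) in cases.items():
    print(name, "multicollinearity" if multicollinearity_flag(r1, r2, rxx) else "ok")
```

This is a screening device, not a formal test: it only reproduces the judgment made in the three cases.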

Solutions of Multicollinearity:

(1) Exclude the variable(s) causing multicollinearity.

This solution makes sense only when the variable being dropped is not important in the overall framework of our analysis.

Example: Pt = α + β Mt + γ GDPt + θ ERt + λ Pt-1 + π Pt-2 + U

If Pt-2 is causing multicollinearity, we should exclude it, because it is not very important. If Mt is causing multicollinearity, we should not exclude it, because without Mt (money supply) we cannot explain inflation.

Note: Unfortunately, it is often an important variable that causes multicollinearity.

(2) Increase the sample size. This solution is in general not very appealing, because a long sample is desirable in any case (why wait for multicollinearity to arise?). In most cases the sample is small in the first place simply because a larger sample is not available.

Example: Income reduces poverty; why, then, does poverty exist?

However, a meaningful interpretation of this solution can be as follows.

(a) Split the data into quarterly or monthly data.

=> 30 x 1 = 30 yearly observations.
=> 30 x 4 = 120 quarterly observations.
=> 30 x 12 = 360 monthly observations.

Two issues arise. First, can the data be split by month and quarter? Second, monthly and quarterly data may not help because of the common trend. For example:

→ ER is available as weekly, monthly and daily data.
→ Real activity cannot be split into quarters and months in the same way.
→ GDP and saving are not measured exactly in quarterly data; only quarterly approximations exist.
→ In the model of Pt on Mt, GDPt, ERt, Pt-1 and Pt-2, GDPt cannot be split accurately. If the advantage of splitting is greater, then split the data.

(b) Split the data on a spatial basis. For example, data on Pakistan can be split province-wise into data on Punjab, Sindh, Balochistan and NWFP, or area (space) wise, time wise, etc.

(c) Merge two different (not identical) but similar data sets. For example, we can merge 30 observations on Pakistan with 30 observations each on India, Sri Lanka and Bangladesh. It is a workable solution.

(3) Filter the data.

(1) In time series we can apply first differencing.

Yt = α + β Xt + γ Zt + Ut
Yt-1 = α + β Xt-1 + γ Zt-1 + Ut-1
Yt – Yt-1 = β (Xt – Xt-1) + γ (Zt – Zt-1) + Ut – Ut-1
Or ∆Yt = β ∆Xt + γ ∆Zt + ∆Ut

This drastically reduces the chances of multicollinearity, but it also filters out valuable variation in the variables. It is not a good solution; the intercept is also gone.

Suppose multicollinearity is caused by a common trend. Why not control for the trend? To control for the trend, we include a time variable in the equation, t = 0, 1, 2, 3, …

Yt = α + θ t + β Xt + γ Zt + Ut
Yt-1 = α + θ (t – 1) + β Xt-1 + γ Zt-1 + Ut-1
∆Yt = θ + β ∆Xt + γ ∆Zt + ∆Ut

where θ t – θ (t – 1) = θ [t – (t – 1)] = θ.

The relationship between the independent and dependent variables will become weak.

(2) Both in cross-section and time series data we can take ratios.

C = α + β Y + γ W + U [Y increases ⇔ W increases]

In time series we use per capita income (income divided by population); in cross-section data we use per capita household income.

We are redefining the variables; on dividing by N (population), the trend becomes weak.

Suppose we have a Cobb-Douglas production function with

K = Capital, L = Labor, M = Material, E = Energy.

Log Q = log A + α log K + β log L + γ log E + θ log M + U
Or Log Q = a0 + α log K + β log L + γ log E + θ log M + U

Firms with a higher capital stock will also employ more labor, so the inputs are correlated.

Test: H0: α + β + γ + θ = 1 against H1: α + β + γ + θ ≠ 1. Suppose H0 is accepted; then we can write

β = 1 – α – γ – θ.

Now the production function becomes

Log Q = a0 + α log K + (1 – α – γ – θ) log L + γ log E + θ log M + U

Or

Log (Q/L) = a0 + α log (K/L) + γ log (E/L) + θ log (M/L) + U

that is, output per worker = f (K per worker, E per worker, M per worker).

Note: A problem that does not have to be tackled should be left alone.

FRISCH-WAUGH THEOREM:

Suppose we have

Y = β1 + β2 X + β3 Z + U

Then β2 = ∂E (Y) / ∂X, holding Z constant. The effect of Z can also be eliminated as follows.

Regress Y on Z: Y = a0 + a1 Z + V (by OLS) and obtain the residuals

Y* = Y – â0 – â1 Z

Regress X on Z: X = b0 + b1 Z + W (by OLS) and obtain the residuals

X* = X – b̂0 – b̂1 Z

Now estimate the following equation by OLS:

Y* = C0 + C1 X* + Є

It can be proved that Ĉ1 = β̂2.

→ It makes no difference whether we hold Z constant or eliminate its effect.
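The theorem can be checked numerically. The following sketch uses simulated data (the coefficients 2.0 and -1.5, the sample size and the seed are arbitrary assumptions) and NumPy's least-squares routine; it compares the coefficient of X in the full regression with the coefficient obtained from the two residual series.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
Z = rng.normal(size=n)
X = 0.8 * Z + rng.normal(size=n)            # X is correlated with Z
Y = 1.0 + 2.0 * X - 1.5 * Z + rng.normal(size=n)


def ols(y, *regressors):
    """OLS with an intercept; returns [intercept, slope_1, slope_2, ...]."""
    A = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


beta_full = ols(Y, X, Z)                    # full regression of Y on X and Z

a = ols(Y, Z)
y_star = Y - (a[0] + a[1] * Z)              # Y purged of Z
b = ols(X, Z)
x_star = X - (b[0] + b[1] * Z)              # X purged of Z
c = ols(y_star, x_star)                     # regression of Y* on X*

print("beta2 from full regression :", beta_full[1])
print("C1 from residual regression:", c[1])
```

The two printed numbers agree up to rounding error, which is the content of the theorem.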

∆ Log Xt = log Xt – log Xt-1
= (log Xt – log Xt-1) / [t – (t – 1)]
= ∆ Log Xt / ∆t
≈ ∂ log Xt / ∂t
= (∂ log Xt / ∂Xt) (dXt / dt)
= (1 / X) (∆X / ∆t)
= growth rate of X.

Note: We have to weaken the collinearity to reduce multicollinearity.

AUTOCORRELATION:

Definition:

Correlation between Xi and Yj, i ≠ j, where X and Y may be the same or different variables, is called serial correlation.

Example: correlation between Mt-1 and Pt, or between Yt and Rt-2.

If X and Y are the same variable, serial correlation becomes autocorrelation, that is, correlation between Xi and Xj. It arises in time series data and in cross-section data as well.

Example: Correlation between Ct and Ct-1 [consumption]; correlation between Qt and Qt-4 [output]; correlation between Tt and Tt-12 [temperature].

A special case of serial correlation is the correlation between Xi and Yi [contemporaneous].

Autocorrelation Problem:

This problem means the presence of autocorrelation in the error term [Ui] of the regression equation

Yi = α + β Xi + Ui

Autocorrelation means Cov (Ui, Uj) ≠ 0 for at least some i ≠ j. Recall the assumption Cov (Ui, Uj) = 0 for all i ≠ j. Autocorrelation is a violation of this assumption and is mainly a problem of time series data. It is usually present because the dependent variable has inertia (sluggishness or stickiness) and this inertia is not captured by any variable on the right-hand side.

Ct = α + β Yt + Ut [4 years of monthly data]

When we exclude the variables that capture the inertia, the error term captures it and becomes autocorrelated.

Consequences of Autocorrelation:

Note: OLS estimators remain linear and unbiased.

1). OLS estimators no longer have minimum variance in the class of linear unbiased estimators. They do not remain best.

2). The ordinary formula for calculating variances is no longer valid:

Var (β̂) ≠ σ² / ∑x², because ∑∑ Cov (Ui, Uj) ≠ 0 for i ≠ j.

OLS estimators are not efficient; they have larger variances.

→ The invalid formula is not a big problem, because we can make the correction by using the correct formula. However, OLS is still not best: even with the correct formula the variances, and hence the standard errors, are high, so the t-values become small and may mislead us about the parameters.

[Testing and solution of autocorrelation are postponed until we understand the various forms of autocorrelation.]

Forms of Autocorrelation:

Consider the model Yt = β0 + β1 Xt + Ut

1). Autoregressive system [AR (p) model]

Ut = α0 + α1 Ut-1 + … + αp Ut-p + Єt

Here α0 + α1 Ut-1 + … + αp Ut-p is the autocorrelated portion, and Єt is the non-autocorrelated portion (innovation, news, shock, white noise error).

2). Moving Average system [MA (q) model]

Ut = γ + γ1 Єt-1 + … + γq Єt-q + Єt
= γ + γ0 Єt-0 + γ1 Єt-1 + … + γq Єt-q [γ0 = 1]

(Ut is regressed on errors that are themselves unobserved: a chicken-and-egg problem.)

Why do we call it a moving average? Because we can write the equation as

Ut = γ + Ф [moving average of Єt]

3). ARMA (p, q) model:

Ut = α0 + α1 Ut-1 + … + αp Ut-p + γ + γ1 Єt-1 + … + γq Єt-q + Єt

The terms in Ut-1, …, Ut-p form the AR (p) part and the terms in Єt-1, …, Єt-q form the MA (q) part. [Some call it the ARIMA model.]

AR (1) Model:

Ut = α0 + α1 Ut-1 + Єt

This is the most popular and simple way to model autocorrelation.

Assumptions:

1). Єt is a random variable for all t.
2). E (Єt) = 0.
3). Var (Єt) = σ² for all t.
4). Cov (Єt, Єt') = 0 for all t ≠ t' [at two different points in time they are not correlated].
5). | α1 | < 1.

Properties of Ut:

Solve Ut as follows:

Ut = α0 + α1 Ut-1 + Єt
= α0 + α1 [α0 + α1 Ut-2 + Єt-1] + Єt
= α0 + α0 α1 + α1² Ut-2 + α1 Єt-1 + Єt
= α0 + α0 α1 + α1² [α0 + α1 Ut-3 + Єt-2] + α1 Єt-1 + Єt
= α0 + α0 α1 + α0 α1² + α1³ Ut-3 + α1² Єt-2 + α1 Єt-1 + Єt

Continuing the substitution, we end up with

Ut = α0 + α0 α1 + α0 α1² + α0 α1³ + … + Єt + α1 Єt-1 + α1² Єt-2 + α1³ Єt-3 + … + α1^∞ Ut-∞ [α1^∞ = 0 because | α1 | < 1]

Or Ut = α0 [1 + α1 + α1² + …] + Єt + α1 Єt-1 + α1² Єt-2 + …
= α0 (1 – α1^∞) / (1 – α1) + Єt + α1 Єt-1 + α1² Єt-2 + …

Ut = α0 / (1 – α1) + Єt + α1 Єt-1 + α1² Єt-2 + … → MA (∞)

So the AR (1) model is equivalent to an MA (∞) model:

Ut = γ + γ0 Єt-0 + γ1 Єt-1 + γ2 Єt-2 + …

→ Ut is a weighted average of past innovations (shocks), which carries useful information.

Parametric Properties of Ut:

1). E (Ut) = α0 / (1 – α1) + E (Єt) + α1 E (Єt-1) + α1² E (Єt-2) + …
= α0 / (1 – α1) + (0) + α1 (0) + α1² (0) + …
= α0 / (1 – α1) + 0 [1 + α1 + α1² + …]
= α0 / (1 – α1) + 0 [1 / (1 – α1)]
= α0 / (1 – α1)

[The sum is written as 0 times the finite number 1 / (1 – α1) because a product of the form 0 × ∞ is indeterminate: 10 / 0 = ∞ would suggest 10 = 0 × ∞, and 1 / 0 = ∞ would suggest 1 = 0 × ∞.]

→ In the AR process for Ut, if E (Єt) = 0, we should set α0 = 0, so that E (Ut) = 0:

Ut = α1 Ut-1 + … + Єt

2). Var (Ut) = Var (Єt + α1 Єt-1 + α1² Єt-2 + …)
= Var (Єt) + Var (α1 Єt-1) + Var (α1² Єt-2) + … + (covariances)
= σ² + σ² α1² + σ² α1⁴ + … + (0)
= σ² [1 + α1² + α1⁴ + …]
= σ² [1 – (α1²)^∞] / (1 – α1²) [for example (0.8)^∞ = 0]

Var (Ut) = σ² / (1 – α1²) → (1a)

This variance is constant for all t; there is no heteroscedasticity in Ut.



3). Cov (Ut, Ut-1) = E [Ut – E (Ut)] [Ut-1 – E (Ut-1)]
= E [Ut Ut-1]
= E [Єt + α1 Єt-1 + α1² Єt-2 + …] x [Єt-1 + α1 Єt-2 + α1² Єt-3 + …]
= σ² α1 + σ² α1³ + σ² α1⁵ + …
= σ² α1 [1 + α1² + α1⁴ + …]
= σ² α1 / (1 – α1²)

Cov (Ut, Ut-2) = E [Ut – E (Ut)] [Ut-2 – E (Ut-2)]
= E [Ut Ut-2]
= E [Єt + α1 Єt-1 + α1² Єt-2 + …] x [Єt-2 + α1 Єt-3 + α1² Єt-4 + …]
= σ² α1² + σ² α1⁴ + σ² α1⁶ + …
= σ² α1² [1 + α1² + α1⁴ + …]
= σ² α1² / (1 – α1²)

Cov (Ut, Ut-3) = E [Ut – E (Ut)] [Ut-3 – E (Ut-3)]
= E [Ut Ut-3]
= E [Єt + α1 Єt-1 + α1² Єt-2 + …] x [Єt-3 + α1 Єt-4 + α1² Єt-5 + …]
= σ² α1³ + σ² α1⁵ + σ² α1⁷ + …
= σ² α1³ [1 + α1² + α1⁴ + …]
= σ² α1³ / (1 – α1²)

In general, we obtain

Cov (Ut, Ut-i) = σ² α1^i / (1 – α1²)

Note that Cov (Ut, Ut-0) = σ² α1⁰ / (1 – α1²) = σ² / (1 – α1²) = Var (Ut), as given in equation (1a).

4). Corr (Ut, Ut-i) = Cov (Ut, Ut-i) / [SD (Ut) SD (Ut-i)]
= Cov (Ut, Ut-i) / [SD (Ut) SD (Ut)]
= Cov (Ut, Ut-i) / Var (Ut)
= α1^i
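The AR (1) results above, Var (Ut) = σ² / (1 – α1²) and Corr (Ut, Ut-i) = α1^i, can be tabulated directly. This is a small sketch with hypothetical function names; it only evaluates the formulas and does not estimate anything from data.

```python
def ar1_variance(sigma2: float, alpha1: float) -> float:
    """Variance of U_t in a stationary AR(1) process."""
    if abs(alpha1) >= 1:
        raise ValueError("stationarity requires |alpha1| < 1")
    return sigma2 / (1.0 - alpha1 ** 2)


def ar1_autocorrelation(alpha1: float, max_lag: int) -> list:
    """Theoretical autocorrelations at lags 0, 1, ..., max_lag."""
    if abs(alpha1) >= 1:
        raise ValueError("stationarity requires |alpha1| < 1")
    return [alpha1 ** i for i in range(max_lag + 1)]


print(ar1_autocorrelation(0.8, 5))    # geometric decline towards zero
print(ar1_autocorrelation(-0.5, 5))   # oscillates in sign while shrinking
```

A positive α1 gives a smoothly declining pattern, a negative α1 an alternating one.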



This is the autocorrelation coefficient at lag length i, which is a function of i (the autocorrelation function).

α1 > 0 (for example α1 = 0.8): the autocorrelation function declines geometrically and approaches zero as the lag length increases.

α1 < 0 (for example α1 = -0.5): the autocorrelation function is oscillatory, starting with a negative value at lag length one and approaching zero.

Example: Price level = α + β (money supply) + Ut

Another case of AR (p):

Suppose we have quarterly data to estimate the equation

Yt = α + β Xt + Ut

We expect that Ut = α0 + α1 Ut-1 + α2 Ut-2 + α3 Ut-3 + α4 Ut-4 + Єt. To simplify the matter we assume α0 = α1 = α2 = α3 = 0. We are left with

Ut = α4 Ut-4 + Єt → (1)

Assumptions about Єt:

1). Єt is a random variable for all t.
2). E (Єt) = 0.
3). Var (Єt) = σ² for all t.
4). Cov (Єt, Єt') = 0 for all t ≠ t'.

Properties of Ut:

1). Equation (1) can be expressed as an MA (∞) process: Ut = Єt + α4 Єt-4 + α4² Єt-8 + …
2). E (Ut) = 0.
3). Var (Ut) = σ² / (1 – α4²).
4). Cov (Ut, Ut-i):
i = 0 => σ² / (1 – α4²)
i = 1, 2, 3 => 0
i = 4 => α4 σ² / (1 – α4²)
i = 5, 6, 7 => 0

It follows that Cov (Ut, Ut-i) = α4^(i/4) σ² / (1 – α4²) if i is a multiple of 4, and = 0 otherwise.

[Figure: autocorrelation function for α4 > 0 (α4 = 0.5).]



[Figure: autocorrelation function for α4 < 0 (α4 = -0.5).]

→ In computer software the autocorrelation function is shown as part of the "correlogram" (one part is the autocorrelation, others are the partial autocorrelation, etc.).

[Figure: correlograms for AR (1) with α1 < 0, AR (1) with α1 > 0, AR (4) with α4 < 0 and AR (4) with α4 > 0.]

Identifying a process such as AR (1) from the correlogram relies on symptoms; it is a kind of art rather than a perfect science, but a very useful idea.

MA (1) Model:

Ut = Єt + β1 Єt-1

→ Єt satisfies all the standard properties. We can show that:

1). E (Ut) = 0.
2). Var (Ut) = (1 + β1²) σ².
3). Cov (Ut, Ut-i) = β1 σ² for i = 1, and = 0 for i ≥ 2. To see this, write

Ut = Єt + β1 Єt-1
Ut-1 = Єt-1 + β1 Єt-2
Ut-2 = Єt-2 + β1 Єt-3

Ut and Ut-1 share the innovation Єt-1, whereas Ut and Ut-2 share no innovation, so there is no correlation between them.

4). Corr (Ut, Ut-i) = β1 / (1 + β1²) for i = 1, and = 0 for i ≥ 2.

Correlogram:

[Figure: correlograms for MA (1) with β1 < 0 and MA (1) with β1 > 0.]

MA (4): Ut = Єt + β Єt-4

[Figure: correlograms for β > 0 and β < 0.]

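Testing of Autocorrelation:

1). Durbin-Watson test:

The DW test is based on the DW statistic

ď = ∑ (℮t – ℮t-1)² / ∑ ℮t²

where ℮t are the OLS residuals, the sum in the numerator runs over t = 2, …, n and the sum in the denominator over t = 1, …, n.

→ This formula can be further expressed as follows:

ď ≈ 2 (1 – ρ̂), where ρ̂ = ∑ ℮t ℮t-1 / ∑ ℮t² is the estimated first-order autocorrelation coefficient of the residuals.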
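The Durbin-Watson statistic, ∑ (℮t – ℮t-1)² / ∑ ℮t², is easy to compute from a residual series. The sketch below uses NumPy and a made-up residual vector; the function names are illustrative only.

```python
import numpy as np


def durbin_watson(e) -> float:
    """Sum of squared successive differences divided by the sum of squares."""
    e = np.asarray(e, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


def first_order_rho(e) -> float:
    """Estimated first-order autocorrelation of the residuals."""
    e = np.asarray(e, dtype=float)
    return float(np.sum(e[1:] * e[:-1]) / np.sum(e ** 2))


# Hypothetical residuals that alternate in sign (negative autocorrelation).
e = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
d = durbin_watson(e)
rho = first_order_rho(e)
print("d =", d, " 2(1 - rho) =", 2 * (1 - rho))
```

In short samples the statistic and the approximation 2 (1 – ρ̂) differ somewhat because of the end terms; the gap shrinks as n grows.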



H0: ρ = 0 => ď = 2
H1: ρ ≠ 0 => ď ≠ 2

Unfortunately the distribution of ď is not unique; it depends on the actual data, and computing the exact distribution for every data set is impractical. Durbin and Watson have provided two extreme (bounding) distributions, which give a lower critical value dl and an upper critical value du.

[Figure: the two extreme distributions of ď.]

The table of critical ď values provides dl and du for various values of

n (number of observations)
k` (number of parameters minus one)

Example: CPI = α + β M2 + γ GDP + θ ER + U

Sample 1970-71 to 2004-05

n = 35, k` = 3

From the table we have

dl = 1.42, du = 1.71

Suppose the calculated ď = 2.74. We can determine the right-tail critical values:

4 – dl = 4 – 1.42 = 2.58
4 – du = 4 – 1.71 = 2.29

Since the calculated ď = 2.74 is greater than both 4 – du and 4 – dl, we reject H0 and conclude that (negative) autocorrelation is present. This test has some problems.

Notes on the test:

1). The test statistic has an inconclusive range, so it may not produce a concrete conclusion.
2). The test is especially designed for the AR (1) process, not for higher-order autoregressive processes, MA processes or others.
3). Despite the above two limitations, the test is powerful in detecting autocorrelation, especially in its most common form, the AR (1) process.

Decision      H0 is true       H0 is false
Reject H0     Type I error     Power
Accept H0     Confidence       Type II error

4). The DW test is the most popular test.
5). DW gives biased results when the lagged dependent variable appears on the right-hand side.

Example: Pt = α + β Mt + γ Yt + θ Et + λ Pt-1 + Ut

Note: (1) is a reality rather than a weakness; (2) and (5) are serious problems.

2). Durbin-h Test:

Durbin-h is useful for testing autocorrelation of first order when the lagged dependent variable is on the right-hand side.

h = ρ̂ √( n / [1 – n Var (λ̂)] ), where ρ̂ ≈ 1 – ď / 2 and λ̂ is the estimated coefficient of the lagged dependent variable.

h ~ N (0, 1)

Critical values are
± 1.645 for 10% level of significance.
± 1.96 for 5% level of significance.
± 2.576 for 1% level of significance.

If h turns out to be an imaginary number, that is, if n Var (λ̂) > 1, then we use another method.

3). Durbin's Alternative Test:

Estimate the regression equation

Yt = α + β Xt + Ut

by OLS and compute the regression residuals ℮t. Then estimate the following regression equation:

℮t = θ0 + θ1 ℮t-1 + θ2 ℮t-2 + θ3 ℮t-3 + … + θp ℮t-p + γ Yt-1 + λ Xt + error

Now test the null hypothesis

H0: θ1 = 0, θ2 = 0, θ3 = 0, …, θp = 0.
H1: At least one θj ≠ 0 for j = 1, 2, 3, …, p.

Apply the F test. In the simplest form, p = 1:

℮t = θ0 + θ1 ℮t-1 + γ Yt-1 + λ Xt + error

H0: θ1 = 0.
H1: θ1 ≠ 0.

4). Q-Test:

The Q-test is useful for testing cumulative autocorrelation up to any order p permitted by the data.

H0: ρ1 = 0, ρ2 = 0, ρ3 = 0, …, ρp = 0.
H1: At least one ρj ≠ 0 for j = 1, 2, 3, …, p.

a). p = 1: first-order autocorrelation.
b). p = 2: first- and second-order autocorrelation.
c). p = 3: first-, second- and third-order autocorrelation, and so on.

For the Q-test the formula (Box-Pierce) is

Q = n ∑ ρ̂j², summed over j = 1, …, p, where Q follows χ² with p degrees of freedom under H0.

The improved form of the formula (Ljung-Box) is

Q* = n (n + 2) ∑ ρ̂j² / (n – j), summed over j = 1, …, p.
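Both versions of the Q statistic can be computed from the sample autocorrelations of the residuals. The sketch below assumes the standard Box-Pierce form n ∑ ρ̂j² and the Ljung-Box form n (n + 2) ∑ ρ̂j² / (n – j); the residual vector is made up and the function names are illustrative.

```python
import numpy as np


def sample_autocorr(e, j: int) -> float:
    """Sample autocorrelation of the (demeaned) series at lag j >= 1."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    return float(np.sum(e[j:] * e[:-j]) / np.sum(e ** 2))


def q_statistics(e, p: int):
    """Return (Box-Pierce Q, Ljung-Box Q*) for lags 1..p."""
    n = len(e)
    lags = np.arange(1, p + 1)
    r = np.array([sample_autocorr(e, j) for j in lags])
    box_pierce = n * np.sum(r ** 2)
    ljung_box = n * (n + 2) * np.sum(r ** 2 / (n - lags))
    return float(box_pierce), float(ljung_box)


e = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])  # hypothetical residuals
bp, lb = q_statistics(e, p=1)
print("Box-Pierce:", bp, " Ljung-Box:", lb)
```

Either statistic is compared with the χ² critical value with p degrees of freedom; the Ljung-Box form is always at least as large, which matters in small samples.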

Solutions for Autocorrelation:

Consider the following model:

Yt = α + β Xt + Ut → (1)
Ut = ρ Ut-1 + Єt → (2)

Єt is white noise [a random variable with zero mean, constant variance and zero autocorrelation]. The estimation procedure attempts to replace the autocorrelated variable Ut by the non-autocorrelated variable Єt.

[Lag equation (1) by one period, multiply all terms by ρ, and subtract the new equation from (1).]

Yt = α + β Xt + Ut (1)
ρ Yt-1 = ρ α + ρ β Xt-1 + ρ Ut-1 (1`)

Subtract: Yt – ρ Yt-1 = α (1 – ρ) + β (Xt – ρ Xt-1) + Ut – ρ Ut-1
Or Yt – ρ Yt-1 = α (1 – ρ) + β (Xt – ρ Xt-1) + Єt → (A)

Alternatively, using equation (1) we can write

Ut = Yt – α – β Xt
ρ Ut-1 = ρ Yt-1 – ρ α – ρ β Xt-1

Subtract: Ut – ρ Ut-1 = (Yt – α – β Xt) – ρ (Yt-1 – α – β Xt-1) = Єt
Or (Yt – α – β Xt) = ρ (Yt-1 – α – β Xt-1) + Єt → (B)

→ Equation (A) or (B) has the error term Єt, which satisfies all the classical assumptions.

However, the trouble is that both these equations are non-linear in parameters, because unknown coefficients multiply each other (for example ρ α and ρ β); we cannot derive a formula for the OLS estimators of α, β and ρ.

Since we cannot use any closed-form formula to compute the OLS estimators of α, β and ρ, we will have to apply some numerical algorithm.

We will consider two methods: (1) the Cochrane-Orcutt two-step iterative method; (2) a version of direct search.

Cochrane-Orcutt two-step iterative method:

Step 1a: Start with some initial value of ρ; suppose we set ρ◦ = 0. Then equation (A) becomes

Yt = α + β Xt + Єt → (A`)

Now apply OLS to compute α̂◦ and β̂◦. These estimators are poor because they ignore autocorrelation.

Step 1b: Substitute α̂◦ and β̂◦ in equation (B):

(Yt – α̂◦ – β̂◦ Xt) = ρ (Yt-1 – α̂◦ – β̂◦ Xt-1) + Єt
Or ℮t = ρ ℮t-1 + Єt → (B`)

Apply OLS to compute ρ̂1. Now use ρ̂1 in equation (A):

Yt – ρ̂1 Yt-1 = α (1 – ρ̂1) + β (Xt – ρ̂1 Xt-1) + Єt
Or Yt* = α X1* + β Xt* + Єt, where Yt* = Yt – ρ̂1 Yt-1, X1* = 1 – ρ̂1 and Xt* = Xt – ρ̂1 Xt-1.

Apply OLS to yield α̂1 and β̂1.

→ These are the two-step estimators of α and β (still not good, because ρ̂1 was computed from the poorly estimated α̂◦ and β̂◦).

Since ρ◦ = 0 is not true, α̂◦ and β̂◦ are poor
→ ρ̂1 is poor
→ α̂1 and β̂1 are poor.

However, we expect that ρ̂1 is more likely to be closer to the true value of ρ than ρ◦ = 0. It follows, therefore, that α̂1 and β̂1 are preferable to α̂◦ and β̂◦, but there is a possibility of improvement.

Step 2a: Use α̂1 and β̂1 in equation (B) to compute ρ̂2:

℮t = ρ ℮t-1 + Єt → (B``)

Step 2b: Use ρ̂2 in equation (A) to compute α̂2 and β̂2.

This process continues until convergence is achieved.
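The iteration between equations (A) and (B) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a definitive implementation: the data are simulated (true values α = 1, β = 2, ρ = 0.6 and the seed are arbitrary), the stopping rule on successive values of ρ is one possible choice, and the function names are hypothetical.

```python
import numpy as np


def ols(y, x):
    """OLS of y on an intercept and x; returns [intercept, slope]."""
    A = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def cochrane_orcutt(y, x, tol=1e-8, max_iter=100):
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    rho = 0.0                                   # Step 1a: initial value of rho
    for _ in range(max_iter):
        # Equation (A): regress quasi-differenced Y on quasi-differenced X.
        y_star = y[1:] - rho * y[:-1]
        x_star = x[1:] - rho * x[:-1]
        c0, beta = ols(y_star, x_star)
        alpha = c0 / (1.0 - rho)                # intercept of (A) is alpha(1 - rho)
        # Equation (B): regress e_t on e_{t-1} without an intercept.
        e = y - alpha - beta * x
        rho_new = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return alpha, beta, rho


# Simulated data with AR(1) errors.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u

alpha_hat, beta_hat, rho_hat = cochrane_orcutt(y, x)
print(alpha_hat, beta_hat, rho_hat)
```

The sketch breaks down if the estimate of ρ reaches 1, since the intercept of equation (A) is then not identified.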

At that point the estimators of α and β become stable.

A Version of Direct Search:

Consider equation (A), which can be written as

Yt = α (1 – ρ) + β (Xt – ρ Xt-1) + ρ Yt-1 + Єt

In this equation Yt is regressed on Xt, Xt-1 and Yt-1:

Yt = θ0 + θ1 Xt – θ2 Xt-1 + θ3 Yt-1 + Єt

such that θ1 θ3 = θ2.

We start with initial values of all the parameters: α˚, β˚ and ρ˚.

For example, we can set α˚ = 2, β˚ = 0.5 and ρ˚ = 0.7. This will yield

Yt = α˚ (1 – ρ˚) + β˚ (Xt – ρ˚ Xt-1) + ρ˚ Yt-1 + ℮t
= 2 (1 – 0.7) + 0.5 Xt – 0.35 Xt-1 + 0.7 Yt-1 + ℮t
= 0.6 + 0.5 Xt – 0.35 Xt-1 + 0.7 Yt-1 + ℮t

Now compute ℮t = (Yt – Ŷt):

℮t = Yt – α˚ (1 – ρ˚) – β˚ (Xt – ρ˚ Xt-1) – ρ˚ Yt-1
= Yt – 0.6 – 0.5 Xt + 0.35 Xt-1 – 0.7 Yt-1

Finally compute ∑℮².

Let ∑℮² = 52. Now change one of the three parameters at a time and recompute ∑℮².

For example, we change β˚ from 0.5 to 0.6 and keep α˚ = 2 and ρ˚ = 0.7.

Now compute the following expression:

[∑℮² (β˚ = 0.6) – ∑℮² (β˚ = 0.5)] / (0.6 – 0.5) → (a)

→ This is the so-called numerical derivative. If the expression in (a) is positive, it means that at β˚ = 0.5 the sum of squared errors is increasing in β, so we should set β˚ less than 0.5.

→ Repeat the same procedure for α and ρ.

Once we know the directions in which α, β and ρ should be searched, we can change the initial values and repeat the entire process.
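The numerical derivative used to choose the search direction can be sketched as follows. The data are a tiny made-up series in which Y = 2X exactly, so the true values are α = 0 and β = 2; the function names and the step size are assumptions for illustration.

```python
import numpy as np


def sse(alpha, beta, rho, y, x):
    """Sum of squared errors of Y_t = alpha(1-rho) + beta(X_t - rho X_{t-1}) + rho Y_{t-1} + e_t."""
    e = y[1:] - alpha * (1 - rho) - beta * (x[1:] - rho * x[:-1]) - rho * y[:-1]
    return float(np.sum(e ** 2))


def numerical_derivative(param, step, alpha, beta, rho, y, x):
    """Change in the SSE per unit change in one parameter, others held fixed."""
    base = sse(alpha, beta, rho, y, x)
    bumped = {"alpha": alpha, "beta": beta, "rho": rho}
    bumped[param] += step
    return (sse(y=y, x=x, **bumped) - base) / step


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x                                     # hypothetical data: alpha = 0, beta = 2

slope = numerical_derivative("beta", 0.1, alpha=0.0, beta=0.5, rho=0.0, y=y, x=x)
print("derivative w.r.t. beta at beta = 0.5:", slope)
```

Here the derivative is negative, so the sum of squared errors falls as β rises and the search should move β upward from 0.5, towards the true value of 2.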

Example: α˚ = 2, β˚ = 0.5, ρ˚ = 0.7.
Derivative with respect to α > 0
Derivative with respect to β < 0
Derivative with respect to ρ < 0

Now we can set α˚ = 0.2, β˚ = 0.8, ρ˚ = 0.9. Suppose the signs of the derivatives are now
Positive for α
Positive for β
Negative for ρ

Now set α˚ = -0.3, β˚ = 0.7, ρ˚ = 0.95.

HETEROSCEDASTICITY:

Introduction:

If the assumption that Var (Ui) = σ² for all i is violated, we will have Var (Ui) = σi², which can vary from observation to observation. This situation is referred to as heteroscedasticity.

Examples:

(i) Qi = α + β Ki + γ Li + θ Ai + Ui

Q = wheat output, K = capital, L = labor, A = acreage.

In our sample we have farms of all sizes. Var (Ui) measures the size of the variation in output due to random factors. We expect Var (Ui) to increase with the size of the farm.

(ii) Yi = α + β Xi + Ui

Y = expenditure on snacks, X = income.

At low income there is little room for random fluctuation, so the variance is low; this pattern appears mostly in cross-section data.

Yi = α + β Xi + Ui

Var (Ui) = σi², which varies across observation points. One reason can be that when the value of Xi is larger, there are more chances of larger unexpected variations in Yi, that is,

Var (Ui) = σi² = f (Xi)

Example 1): Yi = α + β Xi + Ui

Yi is food consumption, Xi is income, and the data are at household level. Households with higher income are expected to experience larger fluctuations in food consumption.

Example 2): Yi = α + β Xi + Ui

Yi is wheat output, Xi is the area under wheat crop, and Ui is the random fluctuation in wheat output. Larger farms are expected to experience larger fluctuations in output, because weather conditions can have favorable and unfavorable effects on wheat output.

Obviously, the heteroscedasticity problem is more likely to arise where there are larger variations in Xi. This is more likely to happen in cross-section data than in time series data. Heteroscedasticity is mainly a problem of cross-section data; it may arise in time series data if the data are observed at high frequency, like daily or weekly.

Consequences of heteroscedasticity:

→ OLS estimators remain linear and unbiased.

1). OLS estimators no longer have minimum variance in the class of linear unbiased estimators.

→ They do not remain best.

2). The ordinary formula for calculating variances is no longer valid:

Var (β̂) ≠ σ² / ∑x²

OLS estimators are not efficient; they have larger variances.

Testing of Heteroscedasticity:

1). Goldfeld-Quandt test.
2). Glejser test.
3). Rank correlation test.
4). White's General test.

These are all well-known tests. The Glejser test is a weaker test and White's General test is a modified form of it; the Goldfeld-Quandt test is a very powerful test.

Goldfeld-Quandt:

This test is powerful, but it cannot help in detecting the form of heteroscedasticity and does not give direction for its solution.

Consider the regression equation

Yi = α + β Xi + Ui

Steps:

1) Arrange the data in ascending order of Xi.
2) Omit the central 20% of observations (adjusted to get a whole number). This will yield two sub-samples: 40% of observations with small Xi and 40% of observations with large Xi.
3) Estimate a regression equation for each sub-sample and compute ∑℮1², ∑℮2² and hence

σ̂1² = ∑℮1² / (n1 – k), σ̂2² = ∑℮2² / (n2 – k), where n1 = n2 = 0.4n.

4) Compute the F-statistic:

F = σ̂1² / σ̂2² if σ̂1² > σ̂2²
F = σ̂2² / σ̂1² if σ̂2² > σ̂1²

→ The F test is applied at the 5% level of significance with degrees of freedom (df) equal to n1 – k and n2 – k. Our null and alternative hypotheses are as given below:

H0: σ1² = σ2² [no heteroscedasticity]
H1: σ1² ≠ σ2² [heteroscedasticity]
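The four steps can be sketched for the one-regressor case. The data below are made up (the error is deliberately proportional to X, with alternating sign), the function name is hypothetical, and the sketch only returns the F statistic and the degrees of freedom; the critical value still has to be read from an F table.

```python
import numpy as np


def goldfeld_quandt(y, x, drop_frac=0.2):
    """Return (F statistic, degrees of freedom of each sub-sample)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)                        # step 1: sort by X
    y, x = y[order], x[order]
    n = len(y)
    n_sub = int(round((1 - drop_frac) / 2 * n))  # step 2: keep 40% at each end
    k = 2                                        # intercept and slope

    def resid_variance(ys, xs):                  # step 3: sub-sample regression
        A = np.column_stack([np.ones(len(ys)), xs])
        coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
        e = ys - A @ coef
        return np.sum(e ** 2) / (len(ys) - k)

    s1 = resid_variance(y[:n_sub], x[:n_sub])
    s2 = resid_variance(y[-n_sub:], x[-n_sub:])
    return max(s1, s2) / min(s1, s2), n_sub - k  # step 4: larger over smaller


x = np.arange(1, 21, dtype=float)
signs = np.where(np.arange(20) % 2 == 0, 1.0, -1.0)
y = 1.0 + 2.0 * x + 0.1 * x * signs              # error spread grows with X
F, df = goldfeld_quandt(y, x)
print("F =", F, "with", df, "and", df, "degrees of freedom")
```

Because the larger variance is always placed in the numerator, the statistic is at least one by construction.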

Notes:
1). The test is very powerful.
2). If there is more than one X variable, then the test becomes quite complicated.

Foodi = α + β Incomei + γ Familyi + Ui
Or Yi = α + β Xi + γ Zi + Ui

3). The test does not indicate the form of heteroscedasticity (whether it is due to linear, quadratic or cross-product terms).



White's General Test:

White's General test, on the other hand, is instrumental in detecting the form of heteroscedasticity and thereby points towards possible solutions.

Consider the regression equation

Yi = α + β Xi + γ Zi + Ui

Steps:

1). Estimate the equation by OLS and compute ℮i = Yi – Ŷi.
2). Estimate by OLS the following equation:

℮i² = a0 + a1 Xi + a2 Zi + b1 Xi² + b2 Zi² + c Xi Zi + Vi → (2)

3). Compute the F-statistic or χ²:

F = [Explained variation / (m – 1)] / [Unexplained variation / (n – m)]

χ² = n R²

where R² is obtained from equation (2) and m is the number of parameters in equation (2). If R² is not negligible, the statistic is significant.

F ~ F (m – 1, n – m)

χ² ~ χ² with m – 1 degrees of freedom

m = 1 + (k – 1) + (k – 1) + (k – 1) (k – 2) / 2

where the four terms count the intercept, the linear terms, the squared terms and the cross-product (simultaneous) terms respectively. Simplifying,

m = [2 + 2 (k – 1) + 2 (k – 1) + k² – 3k + 2] / 2
= (k² + k) / 2
m = k (k + 1) / 2

Hypothesis:

H0: ai = 0, bi = 0, ci = 0 for all i (all parameters except the intercept).
H1: At least one parameter in H0 is ≠ 0.

Rejection of H0 indicates the presence of heteroscedasticity.
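The count of parameters m = k (k + 1) / 2 and the statistic n R² from the auxiliary regression can be sketched as follows. The residuals and regressors are simulated (seed and functional form are arbitrary assumptions), and the function names are hypothetical.

```python
import numpy as np


def white_m(k: int) -> int:
    """Number of parameters in the auxiliary regression, k(k + 1)/2."""
    return k * (k + 1) // 2


def white_nr2(e, x, z) -> float:
    """n times R-squared from regressing e^2 on X, Z, X^2, Z^2 and XZ."""
    e2 = np.asarray(e, dtype=float) ** 2
    W = np.column_stack([np.ones(len(e2)), x, z, x ** 2, z ** 2, x * z])
    coef, *_ = np.linalg.lstsq(W, e2, rcond=None)
    resid = e2 - W @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return float(len(e2) * r2)


rng = np.random.default_rng(2)
n = 50
x = rng.normal(size=n)
z = rng.normal(size=n)
e = rng.normal(size=n) * (1 + np.abs(z))   # hypothetical residuals, spread rises with |Z|

stat = white_nr2(e, x, z)
print("m for k = 3:", white_m(3), " m for k = 6:", white_m(6), " nR^2 =", stat)
```

The statistic is then compared with a χ² critical value; with k = 3 the auxiliary regression has six parameters, five of them slopes.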

Notes:

1) If k is large then m will also be large, and this will reduce the power of the test.

Suppose k = 6 => m = (6 × 7) / 2 = 21.

If n = 50 then n – m = 29, which is too small; the test is very poor.

→ A partial solution to this problem is to omit the cross-product (simultaneous) terms. In this case we will have

m = 1 + (k – 1) + (k – 1) + 0 = 2k – 1. If k = 6 then m = 11.

2) The test also helps in determining the form of heteroscedasticity, which can be guessed by looking at the t-statistics of the estimated equation (2):

℮i² = a0 + a1 Xi + a2 Zi + b1 Xi² + b2 Zi² + c Xi Zi + Vi
t-statistics: a0: 1.1; a1: 0.95; a2: 1.32; b1: 1.15; b2: 4.5; c: 0.99

Here only b2 is significant, which suggests that the variance of Ui is related to Zi².

3) The test is very general in application; it allows for more than one form of heteroscedasticity.

Solutions of Heteroscedasticity:

Informal Solution:

In some contexts, we can re-specify our model to reduce the chances of heteroscedasticity.

Example 1:

Suppose we suspect that heteroscedasticity relates to K (capital), and we also expect that α + β + γ ≈ 1; then we can write,

It is a more stable model as compared to the previous one; this equation is less likely to have heteroscedasticity.

Example 2:

Consider a quadratic expenditure system (QES):

Yi = α + β Xi + γ Xi² + Ui

Yi = food, Xi = income. Suppose we expect

Var (Ui) = θ Xi², so that SD (Ui) = √θ Xi

Now divide the equation by Xi:

Yi / Xi = α (1 / Xi) + β + γ Xi + Ui / Xi
Or Si = β + γ Xi + α (1 / Xi) + Vi, where Si = Yi / Xi and Vi = Ui / Xi.

Var (Vi) = Var (Ui / Xi)
= (1 / Xi²) Var (Ui)
= (1 / Xi²) θ Xi²
= θ → no heteroscedasticity.

[Micro economic theories are based on survey data.]

Formal Solution:

In the formal solution, we use the basic principles. Suppose

Yi = α + β Xi + γ Zi + Ui → (i)
Var (Ui) = σi² → (ii)

Transform equation (i) in the light of equation (ii), so that the transformed error term is homoscedastic. Thus divide equation (i) by σi:

Yi / σi = α (1 / σi) + β (Xi / σi) + γ (Zi / σi) + Ui / σi
Or Yi* = α X1i* + β Xi* + γ Zi* + Ui* → (iii)

where X1i* = 1 / σi. Now

Var (Ui*) = Var (Ui / σi)
= (1 / σi²) Var (Ui)
= (1 / σi²) σi²
= 1 → no heteroscedasticity.

Equation (iii) can be estimated only when σi is known; thus we have to apply a two-step procedure.

Step 1: Apply OLS to equation (i) and compute the series of ℮i. Then, if we follow White's test, we estimate the equation

℮i² = a0 + a1 Xi + a2 Zi + b1 Xi² + b2 Zi² + c Xi Zi + error → (iv)

Apply White's test. If H0 is accepted, then heteroscedasticity is not present and step one completes the estimation. If H0 is rejected, then heteroscedasticity is present and we move to step two.

Step 2: From the estimated equation (iv), compute the estimated (fitted) value of the dependent variable ℮i².

Set σ̂i² = fitted value of ℮i².

Hence σ̂i = √(σ̂i²). Now replace σi by σ̂i in equation (iii) and apply OLS.
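The two-step procedure can be sketched as below. One practical point is an assumption of this sketch rather than part of the notes: fitted values from equation (iv) can be zero or negative, so they are clipped at a small positive floor before taking square roots. The data and function name are hypothetical.

```python
import numpy as np


def feasible_wls(y, x, z, floor=1e-6):
    """Two-step estimator: OLS residuals -> fitted variances -> weighted OLS."""
    y, x, z = (np.asarray(v, dtype=float) for v in (y, x, z))
    A = np.column_stack([np.ones(len(y)), x, z])

    # Step 1: OLS on equation (i) and auxiliary regression (iv) on e^2.
    coef_ols, *_ = np.linalg.lstsq(A, y, rcond=None)
    e2 = (y - A @ coef_ols) ** 2
    W = np.column_stack([np.ones(len(y)), x, z, x ** 2, z ** 2, x * z])
    g, *_ = np.linalg.lstsq(W, e2, rcond=None)

    # Step 2: fitted variances (clipped, by assumption) and transformed OLS.
    sigma_hat = np.sqrt(np.maximum(W @ g, floor))
    coef_wls, *_ = np.linalg.lstsq(A / sigma_hat[:, None], y / sigma_hat, rcond=None)
    return coef_wls                              # [alpha, beta, gamma]


# Hypothetical noise-free data, so the estimator should recover the coefficients.
x = np.arange(10, dtype=float)
z = np.array([0.0, 1.0, 4.0, 2.0, 2.0, 4.0, 1.0, 0.0, 1.0, 4.0])
y = 1.0 + 2.0 * x - 1.0 * z
coef = feasible_wls(y, x, z)
print(coef)
```

Dividing every column, including the column of ones, by σ̂i is exactly the transformation in equation (iii).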

REGRESSION ANALYSIS WITH QUALITATIVE DATA:

We confine ourselves to the cases where qualitative variables appear on the right-hand side of the equation only, e.g. gender, ethnicity, residence, etc.

Suppose we consider the effect of gender on income.

Y = α + β D + U → equation (i)

→ where D is a "dummy" or binary variable in equation (i) indicating gender:

D = 0 for male
D = 1 for female (0 and 1 are the most convenient values)

From equation (i), if we assume as usual that E (U) = 0 and D is fixed, we can infer the following:

E (Y) = α + β D

[Mean income depends on whether the person is male or female]

E (YM) = α [if D = 0, as male]
E (YF) = α + β [if D = 1, as female]
E (YF) – E (YM) = (α + β) – α = β [difference between female and male income]
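The interpretation of α and β can be verified on a small made-up sample: the OLS intercept equals the mean income of males and the coefficient of D equals the female mean minus the male mean. The income figures below are hypothetical.

```python
import numpy as np

D = np.array([0, 0, 0, 1, 1, 1], dtype=float)        # 0 = male, 1 = female
Y = np.array([30.0, 34.0, 32.0, 25.0, 27.0, 29.0])   # hypothetical incomes

A = np.column_stack([np.ones(len(Y)), D])
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(A, Y, rcond=None)

mean_male = Y[D == 0].mean()
mean_female = Y[D == 1].mean()
print("alpha_hat =", alpha_hat, " male mean =", mean_male)
print("beta_hat  =", beta_hat, " female minus male mean =", mean_female - mean_male)
```

With a single dummy, the regression is simply a comparison of two group means.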

Let us redo this by defining two dummies:

D1 = 0 for male, D1 = 1 for female
D2 = 0 for female, D2 = 1 for male

We can write the model in three different forms (ways).

Y = α0 + α1 D1 + U → (i)
Y = β0 + β1 D2 + V → (ii)
Y = γ1 D1 + γ2 D2 + W → (iii)

If we include dummies for all categories of a qualitative variable and also include the intercept, it will create the "dummy variable trap": this will create perfect collinearity and estimation will break down.
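The dummy variable trap can be seen from the rank of the regressor matrix: since D1 + D2 = 1 for every observation, the intercept column is an exact linear combination of the two dummies. The small sample below is made up.

```python
import numpy as np

D1 = np.array([1, 0, 1, 0, 0], dtype=float)   # 1 = female
D2 = 1.0 - D1                                 # 1 = male
ones = np.ones_like(D1)

with_trap = np.column_stack([ones, D1, D2])   # intercept plus both dummies
form_iii = np.column_stack([D1, D2])          # form (iii): both dummies, no intercept

print("columns:", with_trap.shape[1], "rank:", np.linalg.matrix_rank(with_trap))
print("columns:", form_iii.shape[1], "rank:", np.linalg.matrix_rank(form_iii))
```

Three columns with rank two means the normal equations have no unique solution; dropping either the intercept or one dummy restores full column rank.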

E (Y) = α0 + α1 D1 = β0 + β1 D2 = γ1 D1 + γ2 D2
E (YM) = α0 = β0 + β1 = γ2 [if D1 = 0, D2 = 1, as male]
E (YF) = α0 + α1 = β0 = γ1 [if D1 = 1, D2 = 0, as female]
E (YF) – E (YM) = α1 = – β1 = γ1 – γ2 [difference]

The specification in equation (iii) is not very convenient, but essentially there is no difference in the results (in equation (i) the base category is male). The same approach applies, e.g., to the relationship of education and literacy with income.



⇒ Dummies for more than two categories:

Categories of educational qualification (banking-sector workers):

Illiterate: 0-4 years of education
Primary: 5-9 years of education
Secondary: 10-11 years of education
Senior secondary: 12-13 years of education
Higher: 14 or more years of education

Define the following dummy variables [base category is Illiterate]:

D2 = 1 if primary, = 0 otherwise.
D3 = 1 if secondary, = 0 otherwise.
D4 = 1 if senior secondary, = 0 otherwise.
D5 = 1 if higher, = 0 otherwise.

The regression model is:
Y = β1 + β2D2 + β3D3 + β4D4 + β5D5 + U

so that
E (Y) = β1 + β2D2 + β3D3 + β4D4 + β5D5

E (YI) = β1 [if D2 = D3 = D4 = D5 = 0, Illiterate]
E (YP) = β1 + β2 [if D2 = 1, Primary]
E (YS) = β1 + β3 [if D3 = 1, Secondary]
E (YSS) = β1 + β4 [if D4 = 1, Senior Secondary]
E (YH) = β1 + β5 [if D5 = 1, Higher]

[Theoretically we expect] β5 > β4 > β3 > β2 > 0
[General expectation] β1 > 0
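As an illustration of how the four dummies are built from raw data, the sketch below maps years of education to (D2, D3, D4, D5) using the category boundaries listed above. The function name `education_dummies` and the sample values are assumptions made for this example.

```python
def education_dummies(years):
    """Map years of education to (D2, D3, D4, D5); illiterate is the base category."""
    d2 = 1 if 5 <= years <= 9 else 0      # primary
    d3 = 1 if 10 <= years <= 11 else 0    # secondary
    d4 = 1 if 12 <= years <= 13 else 0    # senior secondary
    d5 = 1 if years >= 14 else 0          # higher
    return d2, d3, d4, d5

sample_years = [3, 7, 10, 12, 16]         # hypothetical workers
rows = [education_dummies(y) for y in sample_years]

for years, row in zip(sample_years, rows):
    print(years, row)
```

Each worker has at most one dummy equal to 1, and the base category (illiterate) is the row with all four dummies equal to 0, so the intercept β1 is its mean income.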

Example: Now consider the effects of gender and education on income.

Gender: G = 1 if female, = 0 otherwise

Education [base category is primary]:
E2 = 1 if secondary, = 0 otherwise
E3 = 1 if higher, = 0 otherwise

The model can be constructed as follows:

Y = α + β G + U -------------------- → (1)

Now we propose that α, the expected income of a male, and β, the differential for a female, both depend upon the level of education:

α = α1 + α2E2 + α3E3 ---------------------- → (2a)

β = β1 + β2E2 + β3E3 ---------------------- → (2b)

Now substitute (2a) and (2b) into (1):

Y = α1 + α2E2 + α3E3 + [β1 + β2E2 + β3E3] G + U
Or
Y = α1 + α2E2 + α3E3 + β1G + β2 (E2G) + β3 (E3G) + U



Categories                  Expected income E (Y)
Male, primary (G = 0)       E (YP) = α1
Male, secondary             E (YS) = α1 + α2
Male, higher                E (YH) = α1 + α3
Female, primary (G = 1)     E (YP) = α1 + β1
Female, secondary           E (YS) = α1 + α2 + β1 + β2
Female, higher              E (YH) = α1 + α3 + β1 + β3
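The six expected incomes follow mechanically from the interaction equation. The sketch below evaluates E (Y) = α1 + α2E2 + α3E3 + β1G + β2 (E2G) + β3 (E3G) for each cell; the parameter values are invented purely for illustration.

```python
def expected_income(params, g, e2, e3):
    """E(Y) for the gender/education model with interaction terms."""
    a1, a2, a3, b1, b2, b3 = params
    return a1 + a2 * e2 + a3 * e3 + b1 * g + b2 * e2 * g + b3 * e3 * g

# Hypothetical parameter values (alpha1, alpha2, alpha3, beta1, beta2, beta3)
params = (20, 5, 12, -3, -1, -2)

cells = {
    "male, primary": expected_income(params, g=0, e2=0, e3=0),
    "male, secondary": expected_income(params, g=0, e2=1, e3=0),
    "male, higher": expected_income(params, g=0, e2=0, e3=1),
    "female, primary": expected_income(params, g=1, e2=0, e3=0),
    "female, secondary": expected_income(params, g=1, e2=1, e3=0),
    "female, higher": expected_income(params, g=1, e2=0, e3=1),
}

for name, value in cells.items():
    print(name, value)
```

With these numbers the female, higher cell equals α1 + α3 + β1 + β3, matching the last row of the table.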

⇒ Combining Qualitative and Quantitative Variables:

Suppose income depends upon experience and education. Experience is measured as a quantitative variable (years of experience, E); education has three categories:

1). M.Sc. or equivalent
2). M.Phil or equivalent
3). PhD or equivalent

Defining dummies [base category is M.Sc.]:

D2 = 1 if M.Phil, = 0 otherwise
D3 = 1 if PhD, = 0 otherwise

The model can be constructed as follows:

Y = α + β E + U -------------------- → (1)

α = α1 + α2D2 + α3D3 ---------------------- → (2a)

β = β1 + β2D2 + β3D3 ---------------------- → (2b)

Substitute (2a) and (2b) into (1):

Y = α1 + α2D2 + α3D3 + [β1 + β2D2 + β3D3] E + U

[Mean Income]

E (Y) = α1 + α2D2 + α3D3 + [β1 + β2D2 + β3D3] E

E (Y M.Sc) = α1 + β1E [Mean income at M.Sc. level]

E (Y M.Phil) = (α1 + α2) + (β1 + β2) E [Mean income at M.Phil level]

E (Y PhD) = (α1 + α3) + (β1 + β3) E [Mean income at PhD level]

We expect that α3 > α2 > 0, α1 > 0 and β3 > β2 > 0, β1 > 0.



STOCHASTIC / RANDOM REGRESSORS:

Suppose the assumption that the X variables are exogenous is not true. This situation is called the case of stochastic (random) regressors.

In a typical equation we have,

Y = α + β X + U

X is given, U is random, and the model is complete.

→ If X is not given, then in the model

Y = α + β X + U

U is random and X is random as well, so the model is not complete (the information is not complete).

Example 1: Macro-level consumption function.

Ct = α + β Yt + Ut

Yt = Ct + Zt [Y itself depends upon C; this is a case of simultaneous equations, as can be seen from the graph below]

Example 2: [Nt = A Ut] Population is growing exponentially.

Log Nt = α + β Log Nt-1 + Ut

Log Nt-1 is not exactly given; it follows the path

Log Nt-1 = α + β Log Nt-2 + Ut-1 [Nt-1 itself evolves from the previous population Nt-2, and so on.]

Example 3: Market demand function

Q = α + β P + U

P is not given; in fact both P and Q are determined by the intersection of supply and demand.

Example 4: We have the following relationship for a sample of children aged 0 to 18.

Weight = α + β food + U

With cross-section data on 100 children: on average, the more a child eats, the greater the weight. But food intake is not independent, since it also depends upon weight.

Example 5: Investment and Saving Model (IS model).

IS: Y = α + β R + γ G + U

R = interest rate, G = government expenditure

G is given, but R and Y are not given; the government can change G according to its needs.

Example 6: Weight = α + β Age + U

Age is given and the information is complete; in practice there are other factors that, like age, are given.

⇒ Consequences of the stochastic/random regressors problem:

Consider

Y = α + β X + U

U satisfies all standard assumptions. X is not fixed; it is random.

1). The OLS estimator of β is:

β̂ = ∑xy / ∑x² = ∑xY / ∑x²
   = (x1/∑x²) Y1 + (x2/∑x²) Y2 + (x3/∑x²) Y3 + …… + (xn/∑x²) Yn
   = a1 Y1 + a2 Y2 + a3 Y3 + …… + an Yn, where ai = xi / ∑x²

Now the X variables are not fixed, so we cannot treat a1, a2, a3, ……, an as constants; they are themselves random variables. Hence β̂ is not a linear function of Y, and the linearity property of the OLS estimator is violated.

2). Consider E (β̂):

β̂ = β + ∑xU / ∑x²
   = β + (x1/∑x²) U1 + (x2/∑x²) U2 + (x3/∑x²) U3 + …… + (xn/∑x²) Un

Apply expectation:

E (β̂) = β + E [(x1/∑x²) U1] + E [(x2/∑x²) U2] + E [(x3/∑x²) U3] + …… + E [(xn/∑x²) Un]

E (β̂) ≠ β + (x1/∑x²) E (U1) + (x2/∑x²) E (U2) + (x3/∑x²) E (U3) + …… + (xn/∑x²) E (Un)

because x cannot be factored out of the expectation (X and U are correlated with each other), so β̂ is biased.

3). It can be shown that if X is random but X and U are uncorrelated in the same period (as in the lagged dependent variable example below), then the OLS estimator is biased in small samples, but the bias approaches zero as the sample size increases.

Examples:

1). Weight = α + β F + U (F and U are correlated) U → W → F

2). Q = α + β P + U (P and U are correlated) U → Q → P

3). Y = α + β R + γ G + U (R and U are correlated) U → Y → R

A good example is:

Yt = α + β Yt-1 + Ut

Since Yt-1 = α + β Yt-2 + Ut-1 depends on the random variable Ut-1, Yt-1 is random. Ut is uncorrelated with Yt-1: today's shock does not change yesterday's outcome. Hence, if the sample size tends to infinity or is reasonably large, the bias will be negligible.



4). It can be shown that if x is random and x is not independent of U then OLS estimator is biased and the amount of bias does not diminish with increase in sample size.
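A small simulation can make point 4 concrete. The sketch below is an illustration under assumed values (true α = 1, β = 2, standard normal Z and U, a fixed random seed); it compares OLS when X is unrelated to U with OLS when X = Z + U, so that Cov (X, U) ≠ 0. The helper name `ols_slope` is hypothetical.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """OLS slope: sum of cross-deviations over sum of squared deviations of x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

random.seed(0)
n = 5000                                   # large sample: the bias below does not vanish
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]

x_exog = z                                 # X unrelated to U
x_endog = [zi + ui for zi, ui in zip(z, u)]  # X contains U, so Cov(X, U) != 0

y_exog = [1 + 2 * xi + ui for xi, ui in zip(x_exog, u)]
y_endog = [1 + 2 * xi + ui for xi, ui in zip(x_endog, u)]

slope_exog = ols_slope(x_exog, y_exog)     # close to the true beta = 2
slope_endog = ols_slope(x_endog, y_endog)  # close to 2 + Cov(X,U)/Var(X) = 2.5

print(slope_exog, slope_endog)
```

Under these assumptions the second slope settles near 2.5 rather than 2 even with 5000 observations, which is the sense in which the bias does not diminish with sample size.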

5). Consider a special case of case 4:

Yt = α + β Yt-1 + Ut

[Current CPI depends upon previous CPI.]

Ut = ρUt-1 + Єt

Now Ut-1 affects both Ut and Yt-1:

Ut-1 → Ut
Ut-1 → Yt-1

⇒ Ut and Yt-1 are correlated.

Now the OLS estimator becomes biased. [Recall that the DW statistic also becomes biased in this case; now we know the reason.] Autocorrelation together with a lagged dependent variable creates a more serious problem.

Solution / Estimation Procedure:

Consider the model

Y = α + β X + U, Cov (X, U) ≠ 0 [X is not given; it is random and correlated with U, as in case 4]

Now we define an instrumental variable, say Z, as a variable that satisfies two conditions:

1). Z and X are closely correlated within the given sample.
2). Cov (Z, U) = 0 in the population. [This may seem impossible at first sight.]

X is correlated with U, and Z is correlated with X, but Z is not correlated with U.

Example: Food = α + β Weight + U

X = weight, Z = age

Age does not depend upon U (e.g. sickness): weight is related to both U and age, but age is unrelated to U.

Example:

C = α + β Y + U
Y = C + Z, Z = exogenous [U → C → Y]

Y is correlated with U, but Z and Yt-1 are not. With time series data, Yt-1 is also a good instrument to use in this example.



⇒ One important method of estimation is the Two Stage Least Squares (2SLS) method.

Model: Y = α + β X + U ------------ → (i)
Cov (X, U) ≠ 0; Z is a valid instrument.

Stage 1: Regress X on Z

X = a + b Z + V

Apply OLS to obtain the estimates â and b̂, and hence the fitted values X̂ = â + b̂ Z.

X̂ is such that it contains only those variations in X which are determined by Z (basically we are filtering out the problem).

In other words we have

X = X̂ + (X – X̂)

X – X̂: not explained by Z, “troublesome”.
X̂: explained by Z, “roughly trouble free”.

We can say that X is endogenous and therefore troublesome; stage 1 de-endogenizes X.

Stage 2: Rewrite the main equation (i) as follows:

Y = α + β (X̂ + X – X̂) + U

Y = α + β X̂ + U + β (X – X̂) [β (X – X̂) is the error in the X variable]
Or
Y = α + β X̂ + W, where W = U + β (X – X̂)

Now apply OLS.

It can be shown that the estimators so obtained are:

1). Not linear
2). Biased
3). Asymptotically unbiased [as the sample size increases, the bias diminishes towards zero].

These estimators are called 2SLS estimators as well as Instrumental Variables Least Squares [IVLS] estimators.
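The two stages can be sketched in a few lines. This is a minimal illustration, not a production routine: it assumes one endogenous regressor X = Z + U, one instrument Z, true values α = 1 and β = 2, simulated data with a fixed seed, and a hypothetical helper `ols_simple`.

```python
import random
from statistics import mean

def ols_simple(x, y):
    """Return (intercept, slope) of the OLS line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]       # instrument: unrelated to U
u = [random.gauss(0, 1) for _ in range(n)]
x = [zi + ui for zi, ui in zip(z, u)]            # endogenous: Cov(X, U) != 0
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]    # true alpha = 1, beta = 2

# Plain OLS of Y on X (biased because X is correlated with U)
_, beta_ols = ols_simple(x, y)

# Stage 1: regress X on Z and form the fitted values X_hat = a_hat + b_hat * Z
a_hat, b_hat = ols_simple(z, x)
x_hat = [a_hat + b_hat * zi for zi in z]

# Stage 2: regress Y on X_hat
alpha_2sls, beta_2sls = ols_simple(x_hat, y)

# With a single instrument the 2SLS slope equals Cov(Z, Y) / Cov(Z, X)
_, slope_zy = ols_simple(z, y)
beta_iv = slope_zy / b_hat

print(beta_ols, beta_2sls, beta_iv)
```

In this simulated setting OLS lands near 2.5 while 2SLS is close to the true value 2, and the stage-2 slope coincides with the simple instrumental-variable ratio.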

Simultaneous Equations Estimation:

Consider any two equations, for example:

Q = α + βP + γY + U [Demand]
Q = a + bP + cW + dR + V [Supply]

P and Q are both endogenous variables. Their values are determined jointly, at the same time, by the two equations: we cannot solve for one variable first and then use its value to calculate the other. This is referred to as the simultaneous equations case.

! ~~~~~~~~~~~~~~~~~~~~~~~~~ !






Econometrics Practice Questions
Sir Eatzaz Ahmed

Q.1: Are the following statements true, false or uncertain? Explain your answer.

a) Instrumental variables are used when data on some variables are not available.
Answer: False. Instrumental variables are used in place of endogenous regressors when there is an endogeneity problem, not when data are missing.

b) OLS estimators for the parameters of simultaneous equations are inconsistent when the equations are under-identified. However, the estimators become consistent if the equations are identified.
Answer: We have not covered this simultaneous equations topic.

c) Cochrane-Orcutt iterative procedure is a test for autocorrelation.
Answer: False. The Cochrane-Orcutt iterative procedure is not a test for autocorrelation; it is a remedy for autocorrelation.

d) Multicollinearity problem arises only when there are many equations in the model.
Answer: False. Multicollinearity arises from correlation among the explanatory variables within a single equation, not from the number of equations in the model.

e) A major limitation of DW test is that it is a powerful test.
Answer: False. Being a powerful test is not a limitation of the DW test; it is a strength.

f) Following is a set of simultaneous equations:
Z = α + βY + U
Y = C + I + G + X – M
Answer: False. Z and Y are not simultaneously determined: Y is determined by the second equation alone and then enters the first equation as given. The equations would form a simultaneous system only if Z and Y were determined jointly.

g) The inconclusive range in the DW test is the result of type-2 error.
Answer: False. A type-2 error is the acceptance of H0 when it is false, whereas the inconclusive range means that we are unable to reach a definite conclusion.

h) Goldfeld-Quandt test is a powerful method of estimating an equation in the presence of heteroscedasticity.
Answer: False. Goldfeld-Quandt is a test for heteroscedasticity, not an estimation method or a remedy for it.

i) In the regression equation Yt = α + βXt + δ2 εt-2 + δ1 εt-1 + εt, where εt is a random error term, multicollinearity can occur if there is strong correlation between εt-1 and εt-2.
Answer: True. The lagged error terms enter the equation as regressors, so strong correlation between them produces multicollinearity.

Q.2: Critically evaluate the following statements. Give details to justify your answer.

a) Hybrid equations are used in order to remove both autocorrelation and multicollinearity from an equation.
Answer: We have not covered the hybrid equations topic.

b) In the presence of multicollinearity, the OLS estimator is linear and unbiased and its variance is smaller than the variance of any other linear and unbiased estimator.
Answer: True. In the presence of (imperfect) multicollinearity the OLS estimator is still linear and unbiased, and its variance is smaller than the variance of any other linear and unbiased estimator.

c) In the presence of autocorrelation OLS estimators of regression parameters are likely to have large sampling error and, therefore, they are unbiased.
Answer: In the presence of autocorrelation the OLS estimators remain unbiased and are likely to have large sampling error. However, the unbiasedness does not follow from the large sampling error; the two properties are separate, so the “therefore” in the statement is not justified.

d) The estimators based on Cochrane-Orcutt iterative method are linear and unbiased with minimum variance.

Q.3: Using cross-section data on 500 households, you are to study the effects of income, rural-urban residence and education level of the household on household savings. The information on education is classified as no education, school level education and higher education. Formulate an appropriate regression equation.

Answer: Saving = f [Income (Y), Residence (R), Education (E)]

S = α + β Y + U. -------------------- → (1)

Education dummies [base category is no education]:
E2 = 1 if school level, = 0 otherwise.
E3 = 1 if higher level, = 0 otherwise.

Residential dummy:
R = 1 if rural, = 0 otherwise.

α = α0 + α1 R + α2 E2 + α3 E3 ------------------- → (2a)
β = β0 + β1 R + β2 E2 + β3 E3 ------------------- → (2b)

Substitute (2a) and (2b) in (1):

S = α0 + α1 R + α2 E2 + α3 E3 + (β0 + β1 R + β2 E2 + β3 E3) Y + U.
Or
S = α0 + α1 R + α2 E2 + α3 E3 + β0 Y + β1 (RY) + β2 (E2 Y) + β3 (E3 Y) + U.

Q.4: Carefully explain the steps to apply White's heteroscedasticity test to the Q.3 equation.
Answer:

S = α0 + α1 R + α2 E2 + α3 E3 + β0 Y + β1 (RY) + β2 (E2 Y) + β3 (E3 Y) + U.

Apply White's general test. Steps:

(1) Estimate the equation by OLS and compute the residuals

℮i = S – Ŝ

(2) Estimate by OLS the following auxiliary equation for the squared residuals:

℮i² = a0 + a1 R + a2 E2 + a3 E3 + a4 Y + b1 R² + b2 E2² + b3 E3² + b4 Y² + c1 RY + c2 E2Y + c3 E3Y + V.

This equation for ℮i² allows for all the forms of heteroscedasticity. [Since R, E2 and E3 are 0/1 dummies, R² = R, E2² = E2 and E3² = E3, so in practice these squared terms are dropped to avoid perfect collinearity.]

(3) Compute the F-statistic or χ²:

F = [Explained variation / (m - 1)] / [Unexplained variation / (n - m)]

χ² = n R² [R² is obtained from the auxiliary equation in step (2)]

where m is the number of parameters in the auxiliary equation:
m = 2k - 1 without cross-product terms = 2*9 – 1 = 17 [k = 9], n = 500, n – m = 500 – 17 = 483;
m = k (k + 1) / 2 with cross-product terms.

(4) Test the hypothesis:

H0: ai = 0, bi = 0, ci = 0 for all i except the intercept.
H1: At least one parameter in H0 is ≠ 0.

Rejection of H0 indicates the presence of heteroscedasticity; acceptance of H0 indicates no heteroscedasticity.
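The steps of White's test can be sketched for the simplest possible case. The example below is an illustration only: it assumes a single regressor x (not the Q.3 equation), simulated data whose error spread grows with x, a fixed seed, and hypothetical helper names (`ols`, `fitted`, `r_squared`). The 5% critical value 5.991 is the χ² value for 2 degrees of freedom (the auxiliary regressors x and x²).

```python
import random

def ols(columns, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(columns)
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for i in range(k):                       # Gauss-Jordan with partial pivoting
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [v - f * w for v, w in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

def fitted(columns, coefs):
    return [sum(c * col[t] for c, col in zip(coefs, columns))
            for t in range(len(columns[0]))]

def r_squared(y, y_hat):
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1 - ss_res / ss_tot

random.seed(1)
n = 500
x = [random.uniform(1, 5) for _ in range(n)]
y = [2 + 0.5 * xi + xi * random.gauss(0, 1) for xi in x]   # error spread grows with x
const = [1.0] * n

# Step 1: estimate the equation by OLS and keep the residuals
b = ols([const, x], y)
e = [yi - yh for yi, yh in zip(y, fitted([const, x], b))]

# Step 2: auxiliary regression of squared residuals on x and x^2
e2 = [v * v for v in e]
aux_cols = [const, x, [xi * xi for xi in x]]
g = ols(aux_cols, e2)

# Step 3: chi-square statistic n * R^2 from the auxiliary regression
white_stat = n * r_squared(e2, fitted(aux_cols, g))
CHI2_5PCT_DF2 = 5.991
reject_homoscedasticity = white_stat > CHI2_5PCT_DF2

print(white_stat, reject_homoscedasticity)
```

For the Q.3 equation the same three steps apply, with the auxiliary regression containing the regressors, their non-redundant squares and their cross products.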

Q.5: Consider the following estimated regression equation based on a sample of 26 firms in a manufacturing industry of Pakistan, where MPL and L denote the marginal product of labor and the number of labor units respectively. The values in parentheses are the computed t-values.

MPL = 100 + 0.012 (1/L), R² = 0.4
      (40.0) (0.03)

a) Explain and interpret all the results.
b) Test the proposition that the marginal product of labor is diminishing.
c) Suppose your sample includes 16 firms in the private sector and 10 in the public sector. How would you modify the regression equation in order to allow for the possibility that the marginal product of labor diminishes faster in the private sector than in the public sector? How would you carry out the test?

Q.6: Consider the following demand equation, where M is the quantity of money, Y is real output, P is the general price level, W is financial wealth and t denotes the time period.

Log (Mt) = α + β log (Yt) + δ log (Pt) + Ф Rt + λ log (Wt) + Ө log (Mt-1) + Ut

a) Explain the meaning of each parameter.
Answer: Meaning of parameters
α = intercept (autonomous component of log money demand)
β = ∂ Log (Mt) / ∂ Log (Yt) = % change in money demand / % change in output = output elasticity of money demand
δ = ∂ Log (Mt) / ∂ Log (Pt) = % change in money demand / % change in price level = price elasticity of money demand
Ф = ∂ Log (Mt) / ∂ Rt = proportional change in money demand per unit change in the interest rate
λ = ∂ Log (Mt) / ∂ Log (Wt) = % change in money demand / % change in financial wealth = wealth elasticity of money demand
Ө = ∂ Log (Mt) / ∂ Log (Mt-1) = % change in money demand / % change in previous money demand = elasticity with respect to previous money demand

b) How would you test the following propositions, one at a time?
Answer: Testing of propositions

i. Money demand does not depend on financial wealth.
Answer: H0: λ = 0, H1: λ ≠ 0

ii. The output elasticity of money demand is greater than the price elasticity.
Answer: H0: β – δ = 0, H1: β – δ > 0

iii. Money demand depends on nominal income PY only.
Answer: H0: β = δ, λ = 0, Ө = 0, Ф = 0
H1: at least one of these restrictions does not hold.

Under H0 the equation reduces to:

Log (Mt) = α + β log (Yt) + δ log (Pt)

Log (Mt) = α + β [log (Yt) + log (Pt)]

Log (Mt) = α + β log (Yt Pt)

Q.7: In the regression equation Yi = a + bXi + c Xi-1 + Ui, the Durbin-Watson test is not appropriate to detect first order autocorrelation. Do you agree? If yes, which test is suitable in this case?

Answer: Do not agree. The equation contains a lagged independent variable, not a lagged dependent variable, on the right-hand side, so the DW test remains appropriate. Only when a lagged dependent variable appears on the right-hand side is DW inappropriate for detecting first order autocorrelation; in that case Durbin's h-test is suitable.

Q.8: Can White's general test detect all types of autocorrelation in a random variable?
Answer: No. White's general test is a test for heteroscedasticity, not for the detection of autocorrelation.

Q.9: In the presence of heteroscedasticity a regression equation can be estimated by the Goldfeld-Quandt test. Do you agree?

Answer: No. We cannot estimate a regression equation with the Goldfeld-Quandt test, because it is a test and not an estimation technique; estimation techniques are OLS and many others.

Q.10: While estimating the regression equation Yi = a + bXi + Ui, the multicollinearity problem is more likely to occur in cross-section data than in time series data. Do you agree?

Answer: No. Multicollinearity is a problem of correlation among the explanatory (X) variables; it is more likely to occur in time series data, and in any case there is only one explanatory variable in the given regression equation.

Q.11: Interpret the following regression equation as an economist. C, Y and W are per capita consumption, income and wealth respectively, all in thousand rupees. Numbers in parentheses are the t-values.

Ct = 1.17 + 0.45Yt + 0.55Wt, R² = 0.9643, DW = 0.09.
    (2.13) (6.17) (1.43)

Answer: Interpretation: The results show that 96.43% of the variation in consumption expenditure is explained by our model, which indicates that the overall performance of the equation is satisfactory. The intercept is positive and significantly different from zero; its magnitude shows that subsistence or autonomous consumption expenditure is Rs. 1.17 thousand per capita per year. The marginal propensity to consume (MPC) out of income is significantly different from zero and less than one; its estimated value shows that 0.45, or 45%, of each incremental rupee of income is consumed. The coefficient of wealth shows that 0.55 of each incremental rupee of wealth is consumed, although this coefficient is not significantly different from zero (t = 1.43). Note also that DW = 0.09 is very low, which points to strong positive autocorrelation, so the reported t-values should be treated with caution.

Q.12: Suppose in the equation Yt = a + bXt + Ut, the stochastic variables Xt and Ut are correlated with each other.

a) Does this imply that we have problems of autocorrelation and/or multicollinearity and/or heteroscedasticity?
Answer: No. This is not a problem of autocorrelation, multicollinearity or heteroscedasticity; it is an endogeneity problem.

b) Can the equation in this case be estimated by White's general test or Durbin-Watson test or Durbin's h-test?
Answer: No, because White's general test, the Durbin-Watson test and Durbin's h-test are tests, not estimation methods for the given equation.

Q.13: Suppose you have estimated two alternative cost functions for wheat using data on 500 farms. The cost (C) is measured in thousands of rupees while output (Q) is measured in tons. The regression results are given below. The values in parentheses are standard errors.

Log (C) = - 10.48 + 1.12 log (Q)
          (2.56) (0.16)

C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
     (1141) (0.5521) (4.40)

Can you test the null hypothesis that the marginal cost is an increasing function of output for each equation? If yes, apply the test and draw your conclusion. If not, explain why the test cannot be applied and what additional information, if any, is required to perform the test.

Answer:

First equation: Log (C) = - 10.48 + 1.12 log (Q), with standard errors (2.56) and (0.16).
Degrees of freedom = (n - k) = (500 – 2) = 498. Level of significance = 5%.

(1) H0: β = 0, H1: β ≠ 0. Two-tail test, critical value = 1.96.
t = 1.12 / 0.16 = 7.00 ⇒ We reject H0.

(2) H0: β = 1, H1: β < 1. Left-tail test, critical value = -1.645.
t = (1.12 – 1) / 0.16 = 0.75 ⇒ We do not reject H0.

Second equation: C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹, with standard errors (1141), (0.5521) and (4.40).
Degrees of freedom = (n - k) = (500 – 3) = 497. Level of significance = 5%.

(3) H0: α = 0, H1: α ≠ 0. Two-tail test, critical value = 1.96.
t = 4568 / 1141 ≈ 4.00 ⇒ We reject H0.

(4) H0: β = 0, H1: β ≠ 0. Two-tail test, critical value = 1.96.
t = 0.2284 / 0.5521 ≈ 0.41 ⇒ We do not reject H0.

(5) H0: β = 1, H1: β < 1. Left-tail test, critical value = -1.645.
t = (0.2284 – 1) / 0.5521 ≈ -1.40 ⇒ We do not reject H0.

(6) H0: γ = 1, H1: γ > 1. Right-tail test, critical value = 1.645.
t = (4.84 – 1) / 4.40 ≈ 0.87 ⇒ We do not reject H0.

Conclusion: The individual tests for each equation are given above, but to judge the overall performance of each equation we would also need R², which is not reported for either equation. The results indicate that marginal cost is not an increasing function of output in either equation.

Q.14: Suppose you want to study the propositions:

i. The loan recovery rate varies considerably across private commercial banks, publicly owned commercial banks and development finance institutions.

ii. The loan recovery rate declines with the size of the loan.

Formulate an appropriate econometric equation, giving special attention to the construction of variables and the type of data to be used for estimation.

Answer: Loan Recovery = f [size of loan (Z), nature of bank]

LR = α + β Z + U. ---------------- → (1)

Dummies for banks [base category is private commercial banks]:
D2 = 1 if publicly owned commercial bank, = 0 otherwise.
D3 = 1 if development finance bank, = 0 otherwise.

α = α1 + α2 D2 + α3 D3 ------------------- → (2a)
β = β1 + β2 D2 + β3 D3 ------------------- → (2b)

LR = α1 + α2 D2 + α3 D3 + (β1 + β2 D2 + β3 D3) Z + U.

E (LR) = α1 + α2 D2 + α3 D3 + β1Z + β2 (D2Z) + β3 (D3Z)

E (LR, private commercial banks) = α1 + β1Z.

E (LR, publicly owned commercial banks) = (α1 + α2) + (β1 + β2) Z.

E (LR, Development finance banks) = (α1 + α3) + (β1 + β3) Z.

Q.15: Consider the following regression equation estimated by OLS using time series data for 24 years, where E is the official exchange rate (rupees per US dollar), P is the domestic price level (CPI), П is the world price level, and T is the trade deficit as a percentage of GDP. The numbers in parentheses are the computed t-values.

Log (Et) = 0.12 + 0.58 Log (Pt) – 0.23 Log (Пt) + 0.0021 Tt + 0.92 Log (Et-1)
          (2.23) (3.21) (-1.85) (2.43) (23.00)

R² = 0.9958, DW = 1.82

a) Interpret all the results other than R² and the DW statistic.
b) What could be the reason for including the lagged exchange rate in the equation?
c) Are there any serious econometric problems apparent from the results?
d) How would you re-estimate the equation in the light of these problems? If there are two or more problems, consider each one at a time.

Q.16: Consider the following model of IS-LM equilibrium:

IS: Y = α + β R + δ Z + U
LM: M = Ф + λ R + Ө Y + π W + V

Where Y is aggregate expenditure, R is the interest rate, Z is the exogenous component of aggregate expenditure, M is the quantity of money and W is financial wealth. Suppose the State Bank of Pakistan (SBP) pegs money supply at predetermined levels (that is, the quantity of money is exogenous) and lets the interest rate be determined in the market (the interest rate is endogenous).

Q.17: a) Why do we include lagged variables in a regression equation? Illustrate using an example from economics.
Answer: We use the lagged dependent variable to capture the inertia (sluggishness) of the equation; for example, current consumption depends upon previous consumption.

b) Explain the use of the Instrumental Variable Least Squares method using your example.

Answer: Model: Y = α + β X + U ------------ → (i)
Cov (X, U) ≠ 0; Z is a valid instrument.

Stage 1: Regress X on Z

X = a + b Z + V

Apply OLS to obtain the estimates â and b̂, and hence the fitted values X̂ = â + b̂ Z.

X̂ is such that it contains only those variations in X which are determined by Z (basically we are filtering out the problem).

In other words we have

X = X̂ + (X – X̂)

X – X̂: not explained by Z, “troublesome”.
X̂: explained by Z, “roughly trouble free”.

We can say that X is endogenous and therefore troublesome; stage 1 de-endogenizes X.

Stage 2: Rewrite the main equation (i) as follows:

Y = α + β (X̂ + X – X̂) + U

Y = α + β X̂ + U + β (X – X̂) [β (X – X̂) is the error in the X variable]
Or
Y = α + β X̂ + W, where W = U + β (X – X̂)

Now apply OLS.

It can be shown that the estimators so obtained are:

1). Not linear
2). Biased
3). Asymptotically unbiased [as the sample size increases, the bias diminishes towards zero].

These estimators are called 2SLS estimators as well as Instrumental Variables Least Squares [IVLS] estimators.

c) Provide an interpretation for each parameter in the light of the economic model you have chosen.
Answer:



Q.18: a) Explain the use of dummy variables in determining the effects of gender (male or female) and education (matriculation, intermediate, bachelor or higher) on wage rates among clerical personnel.

Answer: Wage = f [Gender (G), Education (E)]

Gender dummy: G = 1 if female, = 0 otherwise.
Education dummies [base category is matriculation]:
E2 = 1 if intermediate, = 0 otherwise.
E3 = 1 if higher, = 0 otherwise.

We can construct the model in this way:

W = α + β G + U. ------------------- → (1)

α = α1 + α2 E2 + α3 E3 ------------------- → (2a)
β = β1 + β2 E2 + β3 E3 ------------------- → (2b)

W = α1 + α2 E2 + α3 E3 + (β1 + β2 E2 + β3 E3) G + U.
Or
W = α1 + α2 E2 + α3 E3 + β1G + β2 (E2G) + β3 (E3G) + U.

Categories                       Expected wage E (W)
Male, Matriculation (G = 0)      α1
Male, Intermediate               α1 + α2
Male, Higher                     α1 + α3
Female, Matriculation (G = 1)    α1 + β1
Female, Intermediate             (α1 + α2) + (β1 + β2)
Female, Higher                   (α1 + α3) + (β1 + β3)

b) Provide an interpretation for each parameter.

Answer: Interpretation of parameters.
α1 = Mean wage of a male with matriculation level education.
α2 = Differential in the mean wage of a male with intermediate level education over matriculation.
α3 = Differential in the mean wage of a male with higher level education over matriculation.
β1 = Differential effect of being female at matriculation level education.
β2 = Additional differential effect of being female at intermediate level education (the total female differential at this level is β1 + β2).
β3 = Additional differential effect of being female at higher level education (the total female differential at this level is β1 + β3).



Q.19: Using the regression equation Yi = β + Ui, provide a precise answer to the following questions, with or without mathematical proofs.

a) Under what assumption is the OLS estimator of β linear?
Answer: β̂ should be a linear function of Y.

b) Under what assumption is the OLS estimator of β unbiased?
Answer: When E (β̂) = β, the OLS estimator of β is unbiased.

Q.20: Consider the following demand function for rice, where Q is per capita consumption of rice in kilograms, P is the price of rice per kilogram and M is per capita income in rupees. The regression equation has been estimated on the basis of time series data for 9 years. The values in parentheses are standard errors.

ln Q = 2.46 – 0.45 ln P + 0.65 ln M, R² = 0.90, F = 12.00
      (0.82) (0.20) (0.50)

a) Explain the meanings of the estimated regression coefficients.
b) Test the following null hypotheses one by one and interpret the results of your tests.
i. Income elasticity of rice is greater than one.
ii. Income elasticity of rice is negative.

Q.21: In the regression equation Yi = β Xi +Ui the parameter β can be estimated using one of the

following methods.

a) Are the two estimators linear; prove? b) Are the two estimators unbiased; prove?

c) Which of the two estimators do you prefer over the other? Justify your choice.

Q.22: What are the limitations of DW test for autocorrelation? Answer:

1. The test statistics has inconclusive range, so it may not produce a concrete conclusion.

2. The test is specifically designed for the AR (1) process; it is not suitable for higher order autoregressive processes, MA processes or other processes.

3. DW gives biased results when lagged dependent variable appears on the right hand side.
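For reference, the DW statistic itself is d = ∑(℮t – ℮t-1)² / ∑℮t², which is approximately 2 (1 – ρ̂). The sketch below computes it for two made-up residual series; the function name `durbin_watson` and the residuals are assumptions for illustration.

```python
def durbin_watson(residuals):
    """DW statistic: sum of squared successive differences over sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Residuals that alternate in sign (negative autocorrelation): d above 2
dw_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])

# Residuals that stay on one side for long runs (positive autocorrelation): d below 2
dw_positive = durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

print(dw_negative, dw_positive)
```

Values near 2 suggest no first order autocorrelation; whether a given value is significant still depends on the tabulated bounds, which is where the inconclusive range in limitation 1 comes from.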

Q.23: Explain the rank correlation test for heteroscedasticity.
Answer: We have not covered that test.

Q.24: Derive the autocorrelation function for the following processes.
a) Ut = ρ3 Ut-3 + εt
b) Ut = εt + δ1 εt-1 + δ3 εt-3

Q.25: Derive OLS estimators for the parameters of the following equations, where NX, X, and P are net export (exports minus imports), export and consumer price index respectively.

a) NX = Xt - βYt +Ut



Answer: Apply OLS and estimate the NXt to compute regression residual ℮i².

First order condition.

b) Log (Pt) = α + log (Pt-1) + Ut

Answer: Apply OLS and estimate the equation to compute regression residual ℮i².

First order condition:

Q.26: Interpret multicollinearity problem as poor information content in data. Consider any

estimation strategy and explain how it can improve the information content.



Q.27: What econometric problems arise in the estimation of an equation with lagged dependent variable on the right hand side? Suggest solution(s) to these problems.

Q.28: Specify an econometric equation to determine monthly earning in a cross section of 300 economists in Pakistan. Define all the variables in your model and explain how they can be measured in practice.

Q.29: Determine identification of the following two equations by hybrid equations method and explain the steps for estimation of each equation by 2SLS method. Consider the following set of equations.

a) Y = α1 + α2 R + α3 Z + U

Answer: Here Cov (R, U) ≠ 0. Z is the valid instrumental variable.

Stage 1: Regress R on Z

R = a + b Z + V

Apply OLS and obtain â, b̂ and R̂, where R̂ = â + b̂ Z is such that it contains only that variation in R which is explained or determined by Z.

In other words we have R = R̂ + (R – R̂). We can say that R is endogenous and therefore troublesome; stage 1 de-endogenizes R.

Stage 2: Rewrite the main equation as follows.

Y = α1 + α2 (R̂ + R – R̂) + α3 Z + U
Y = α1 + α2 R̂ + α3 Z + U + α2 (R – R̂) [α2 (R – R̂) is the error in the R variable]
Or
Y = α1 + α2 R̂ + α3 Z + W, where W = U + α2 (R – R̂).

Now apply OLS

Where

It can be shown that the estimators so obtained are (1) not linear, (2) biased, and (3) asymptotically unbiased [as the sample size increases, the bias diminishes towards zero].

b) M = β1 + β2 Y + β3 R + V

Answer: Here Cov (Y, V) ≠ 0. R is the valid instrumental variable.

Stage 1: Regress Y on R

Y = α1 + α2 R + U

Apply OLS and obtain the estimates of α1 and α2 and the fitted values Ŷ, where Ŷ is such that it contains only that variation in Y which is explained or determined by R.

In other words we have Y = Ŷ + (Y – Ŷ). We can say that Y is endogenous and therefore troublesome; stage 1 de-endogenizes Y.

Stage 2: Rewrite the main equation as follows.

M = β1 + β2 (Ŷ + Y – Ŷ) + β3 R + V
M = β1 + β2 Ŷ + β3 R + V + β2 (Y – Ŷ) [β2 (Y – Ŷ) is the error in the Y variable]
Or
M = β1 + β2 Ŷ + β3 R + W [W = V + β2 (Y – Ŷ)]

Now apply OLS

Where

It can be shown that the estimators so obtained are (1) not linear, (2) biased, and (3) asymptotically unbiased [as the sample size increases, the bias diminishes towards zero].

Q.30: Consider an econometric equation involving four or more variables. Suppose you have access to only annual data for 25 years for Pakistan and no other data are available in or outside Pakistan. Further suppose that there is severe multicollinearity in data that can not be eliminated by dropping any variable from the equation. How would you handle this situation? Provide an elaborate answer.

Q.31: The daily demand for strawberries in Islamabad depends on price of strawberries only. On each

day a fixed quantity of strawberries (that can change from day to day) is brought to the market and the price is determined at a level that clears the market. If it were known that the elasticity of demand is constant. Would you be able to obtain unbiased estimator of the elasticity?

Answer:
Qd = α + β P + U     [Demand function]
Qs = Q (fixed)       [Supply function]

The quantity supplied is fixed, so it is an exogenous variable, and price is endogenous.

If the elasticity of demand is constant, then the demand function becomes:
Log Qd = α + β log P + U     [Demand function]
Log Qs = log Q (fixed)       [Supply function]
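The endogeneity of price in this market can be checked with a small simulation. The sketch below is illustrative only: the parameter values (α = 5, β = -2), the distributions of log supply and of the demand shock, and the helper `ols_slope` are assumptions, not part of the question. The quantity is drawn exogenously each day, the price is solved from market clearing, and log Q is then regressed on log P.

```python
import random

def ols_slope(x, y):
    """Slope of the simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

random.seed(1)
n = 20000
alpha, beta = 5.0, -2.0      # assumed demand parameters; beta is the elasticity

# Exogenous daily supply (in logs) and demand shock, drawn independently
log_q = [random.gauss(3.0, 0.5) for _ in range(n)]
u = [random.gauss(0.0, 0.5) for _ in range(n)]

# Market clearing: log Q = alpha + beta * log P + U, solved for log P.
# The price therefore depends on U, i.e. it is endogenous.
log_p = [(q - alpha - e) / beta for q, e in zip(log_q, u)]

b_ols = ols_slope(log_p, log_q)
print(f"True elasticity: {beta}")
print(f"OLS estimate:    {b_ols:.3f}")
```

Under these assumed variances the OLS slope settles near -1 rather than -2, and the gap does not shrink as the sample grows, because log P is correlated with U.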

In this model we cannot take P out of the expectation, because P is not a “fixed” variable: it is determined by market clearing and is therefore correlated with U. The OLS estimator of β is consequently biased.

Q.32: You want to estimate a Cobb-Douglas production function for the manufacturing sector of Pakistan with capital, labor and energy as the factor inputs, with only 16 time series observations available. A multicollinearity problem is likely to arise. In order to tackle this problem one can use 16 observations on the private sector and another 16 on the public sector to make a pooled sample of 32 observations. What complications are likely to arise due to pooling and how would you respond to these complications?

!~~~~~~~~~~~~~~~~~~~~~~~~~!



E523 Econometrics, Spring Semester 2007
Sir Eatzaz Ahmed
Terminal Paper     Total marks: 75     Time: 3 hours
NOTE: Attempt any three questions. Each question is worth 25 marks.

1. Explain and differentiate between:
a) Error term of a regression equation and regression residual
b) Random and fixed variables
c) R-square and Adjusted R-square
d) Goldfeld-Quandt test and White’s general test
e) Dummy, proxy and instrumental variables

2. Are the following statements true, false or uncertain? Explain your answer.
a) The sample mean of the random error term, Ū = (1/n) ∑ Ui, is equal to zero.
b) In the regression equation Y/X = b + U the OLS estimator of b is equal to Y/X.
c) If the variable Y is regressed on X and log (X), it may create multicollinearity due to a strong linear relationship between the variables X and log (X).
d) In the equation Xt = α + β Yt + δ Yt-1 + Ut a major limitation of the DW test is that it produces biased results due to the presence of Yt-1 on the right-hand side of the equation.
e) Since a dummy variable can take only two values, it must be fixed (exogenous).
f) Instrumental variables are used to test the presence of endogenous variables in the equation.

3. Consider the regression model:
Yt = α + β Yt + Ut
Ut = Ut-2 + εt
εt is white noise.
a) Derive autocorrelation coefficients for the lag lengths 0, 1, 2, 3, 4.
b) Explain the Two-Step Iterative method of estimation.

4. Consider the regression model:
Yt = α + β Xt + Ut
Xt = a + b Yt + Vt
U and V satisfy all standard assumptions.
a) Show that the OLS estimator of β is biased.
b) Does the bias decrease with an increase in sample size?
c) Consider any context in economics in which the above model can be applied. Mention what the X and Y variables are in the context that you have chosen.
d) In the context you have chosen, what instrumental variable can be used for estimating α and β?

! ~~~~~~~~~~~~~~~~~~~~~~~~ !



E523 Econometrics, Spring Semester 2007
Sir Eatzaz Ahmed
1st Mid Term     Total marks: 35     Time: 1 hour

1. Suppose you have estimated two alternative cost functions for wheat using data on 500 farms. The cost (C) is measured in thousands of rupees while output (Q) is measured in tons. The regression results are given below. The values in parentheses are standard errors.

Log (C) = - 10.48 + 1.12 log (Q)
            (2.56)  (0.16)

C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
     (1141)  (0.5521)   (4.40)

Can you test the null hypothesis that the marginal cost is an increasing function of output for each equation? If yes, apply the test and draw your conclusion. If not, explain why the test cannot be applied and what additional information, if any, is required to perform the test.

Answer:

First equation: Log (C) = - 10.48 + 1.12 log (Q)
                            (2.56)  (0.16)
Here β is the coefficient of log (Q). Degrees of freedom = (n - k) = (500 - 2) = 498. Level of significance = 5%.

H0: β = 0, H1: β ≠ 0. Two tail test, critical value = 1.96.
Apply test: t = (1.12 - 0) / 0.16 = 7.00 > 1.96
=> We reject H0.

H0: β = 1, H1: β < 1. Left tail test, critical value = -1.645.
Apply test: t = (1.12 - 1) / 0.16 = 0.75 > -1.645
=> We do not reject H0.

Second equation: C/Q = 4568 + 0.2284 Q + 4.84 Q⁻¹
                      (1141)  (0.5521)   (4.40)
Write the equation as C/Q = α + β Q + γ Q⁻¹. Degrees of freedom = (n - k) = (500 - 3) = 497. Level of significance = 5%.

H0: α = 0, H1: α ≠ 0. Two tail test, critical value = 1.96.
Apply test: t = 4568 / 1141 = 4.00 > 1.96
=> We reject H0.

H0: β = 0, H1: β ≠ 0. Two tail test, critical value = 1.96.
Apply test: t = 0.2284 / 0.5521 = 0.41 < 1.96
=> We do not reject H0.

H0: β = 1, H1: β < 1. Left tail test, critical value = -1.645.
Apply test: t = (0.2284 - 1) / 0.5521 = -1.40 > -1.645
=> We do not reject H0.

H0: γ = 1, H1: γ > 1. Right tail test, critical value = 1.645.
Apply test: t = (4.84 - 1) / 4.40 = 0.87 < 1.645
=> We do not reject H0.

Conclusion: The test can be applied to both equations. In the first equation C is proportional to Q^β, so marginal cost is proportional to Q^(β - 1) and increases with output only if β > 1; since H0: β = 1 is not rejected, the data do not establish this. In the second equation C = α Q + β Q² + γ, so marginal cost is α + 2 β Q and increases with output only if β > 0; β is not significantly different from zero. The results therefore do not show that marginal cost is an increasing function of output in either equation. To judge the overall performance of each equation we would also need R², which is not given for either equation.
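The t-ratios for tests of this kind follow directly from the reported coefficients and standard errors, as t = (estimate - hypothesized value) / standard error. The short sketch below only reproduces that arithmetic; the labels and the helper `t_stat` are illustrative names, and the hypothesized values are the ones tested in the answer.

```python
def t_stat(estimate, std_error, h0_value):
    """t-ratio for H0: parameter = h0_value."""
    return (estimate - h0_value) / std_error

# (label, estimate, standard error, value under H0), taken from the two equations
tests = [
    ("eq1 beta = 0",  1.12,   0.16,   0.0),
    ("eq1 beta = 1",  1.12,   0.16,   1.0),
    ("eq2 alpha = 0", 4568.0, 1141.0, 0.0),
    ("eq2 beta = 0",  0.2284, 0.5521, 0.0),
    ("eq2 beta = 1",  0.2284, 0.5521, 1.0),
    ("eq2 gamma = 1", 4.84,   4.40,   1.0),
]

t_values = {label: t_stat(est, se, h0) for label, est, se, h0 in tests}

for label, t in t_values.items():
    print(f"H0: {label:<14} t = {t:6.2f}")
```

Each t-ratio is then compared with the critical value for the chosen tail and level of significance (1.96 for a two tail test and 1.645 for a one tail test at 5% in a sample this large).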

2. Using the regression equation Yi = β + Xi + Ui, provide a precise answer to the following questions, with or without mathematical proofs.

a) OLS estimator of β is ______.

Answer: The OLS estimator of β is β̂ = Ȳ - X̄.

Yi = β + Xi + Ui
Estimation:

As we know, the residual is ei = Yi - Ŷi, where Ŷi = β̂ + Xi, so
∑ei² = ∑ (Yi - Ŷi)²
∑ei² = ∑ (Yi - β̂ - Xi)²

First order condition (differentiate with respect to β̂):

=> -2 ∑ (Yi - β̂ - Xi) = 0

=> ∑ Yi - n β̂ - ∑ Xi = 0

=> ∑ Yi - ∑ Xi = n β̂

Dividing both sides by n:

=> β̂ = Ȳ - X̄

b) Under what assumption is the OLS estimator of β linear?

Answer: The OLS estimator of β is linear:

β̂ = (1/n) (∑ Yi - ∑ Xi)
  = (1/n) ∑ (Yi - Xi)
  = (1/n) ∑ (β + Xi + Ui - Xi)
  = (1/n) ∑ (β + Ui)
  = (1/n) n β + (1/n) ∑ Ui
  = β + (1/n) ∑ Ui
  = β + ∑ ai Ui, where ai = 1/n.

The weights ai = 1/n are fixed (non-random) constants, so β̂ is a linear function of the Ui, and of the Yi provided the Xi are fixed.

c) Under what assumption is the OLS estimator of β unbiased?

Answer: The OLS estimator is unbiased if

E (β̂) = β.

Proof:

E (β̂) = E [β + ∑ (ai Ui)]
       = β + ∑ ai E (Ui)     [since ai is fixed]
       = β + ∑ ai (0)        [under the assumption E (Ui) = 0]

E (β̂) = β
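The unbiasedness result can also be checked numerically with a small Monte Carlo experiment. This is only an illustrative sketch: the true value β = 2, the fixed X values, the standard normal errors and the number of replications are all assumptions made for the example. In each replication the estimator is computed as the mean of Y minus the mean of X, and the average of the estimates is compared with β.

```python
import random

random.seed(42)
beta = 2.0                           # assumed true value of the intercept
n, reps = 50, 2000                   # sample size and number of replications
x = [0.1 * i for i in range(n)]      # fixed (non-random) X values, assumed
x_bar = sum(x) / n

estimates = []
for _ in range(reps):
    # New errors with E(U) = 0 in every replication; X stays fixed
    y = [beta + xi + random.gauss(0, 1) for xi in x]
    y_bar = sum(y) / n
    estimates.append(y_bar - x_bar)  # OLS estimator: mean(Y) - mean(X)

mean_estimate = sum(estimates) / reps
print(f"True beta:             {beta}")
print(f"Average of estimates:  {mean_estimate:.4f}")
```

Individual estimates differ from β from sample to sample, but their average over many samples stays close to β, which is what E (β̂) = β asserts.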

! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !



E523 Econometrics, Spring Semester 2007
Sir Eatzaz Ahmed
2nd Mid Term     Total marks: 35     Time: 1 hour

1. a) How would you simply define multicollinearity?
b) What type of procedure do you suggest to diagnose or test multicollinearity?
c) Multicollinearity?

2. d) Derive the autocorrelation coefficient function at lag lengths 0, 1, 2, 3, 4.
e) Consider the following model and solve it through the Iterative Two-Step procedure.

Yt = α + β Xt + Ut     ----------> (1)
Ut = ρ Ut-1 + εt       ----------> (2)

εt is white noise.
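One common reading of an iterative two-step procedure for this model is the Cochrane-Orcutt method. That this is the method intended here is an assumption. The sketch below simulates data from equations (1) and (2) with assumed values α = 1, β = 2, ρ = 0.6 and then alternates between (step 1) estimating ρ from the current residuals and (step 2) re-estimating α and β by OLS on the quasi-differenced data Yt - ρ Yt-1 and Xt - ρ Xt-1, until ρ stops changing. The helper `ols` and all numerical choices are illustrative.

```python
import random

def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Simulate the model with assumed parameter values
random.seed(7)
n = 5000
alpha, beta, rho = 1.0, 2.0, 0.6
x = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1)]
for t in range(1, n):
    u.append(rho * u[t - 1] + random.gauss(0, 1))    # equation (2)
y = [alpha + beta * xt + ut for xt, ut in zip(x, u)]  # equation (1)

# Starting values: plain OLS on equation (1)
a_hat, b_hat = ols(x, y)
rho_hat = 0.0

for iteration in range(50):
    # Step 1: estimate rho by regressing residuals on their own lag (no intercept)
    e = [yt - a_hat - b_hat * xt for xt, yt in zip(x, y)]
    new_rho = (sum(e[t] * e[t - 1] for t in range(1, n))
               / sum(e[t - 1] ** 2 for t in range(1, n)))

    # Step 2: OLS on the quasi-differenced equation
    y_star = [y[t] - new_rho * y[t - 1] for t in range(1, n)]
    x_star = [x[t] - new_rho * x[t - 1] for t in range(1, n)]
    a_star, b_hat = ols(x_star, y_star)
    a_hat = a_star / (1 - new_rho)    # transformed intercept is alpha * (1 - rho)

    converged = abs(new_rho - rho_hat) < 1e-6
    rho_hat = new_rho
    if converged:
        break

print(f"rho_hat = {rho_hat:.3f}, alpha_hat = {a_hat:.3f}, beta_hat = {b_hat:.3f}")
```

The quasi-differenced equation has the white noise term εt as its error, which is why OLS is applied to the transformed data rather than to equation (1) directly.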

! ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !
