7/30/2019 01-Basic Regression Analysis With Time Series Data part 1
Review of OLS & Introduction to Pooled Cross Sections
EMET 8002
Lecture 1
July 23, 2008
Outline
Review the key assumptions of OLS regression
Chapter 3 in the text
Motivation for course topics
Introduce regression analysis with time series data
Chapter 10 in the text
OLS assumptions
MLR.1: The model is linear in parameters:

y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u

MLR.2: Random sampling
MLR.3: No perfect multicollinearity
None of the independent variables is constant and there are no exact linear relationships among the independent variables
MLR.4: Zero conditional mean:

E(u \mid x_1, \ldots, x_k) = 0
Unbiasedness of OLS
Theorem 3.1: Under assumptions MLR.1 to MLR.4
E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \ldots, k

for any value of the population parameter \beta_j
In other words, the OLS estimators are unbiased estimators of the population parameters
This is a property of the estimator, not of any specific estimate, i.e., we have no reason to believe that our estimate is either too big or too small
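Unbiasedness can be illustrated by simulation. The following is a minimal numpy sketch (all parameter values and sample sizes are illustrative, not from the lecture): draw many samples from a model satisfying MLR.1 through MLR.4, estimate by OLS each time, and check that the estimates average out to the true parameters.

```python
import numpy as np

# Monte Carlo sketch of Theorem 3.1: under MLR.1-MLR.4 the OLS
# estimator is unbiased, so its average over many samples should be
# close to the true beta. All numbers here are illustrative.
rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0, -0.5])   # true (beta_0, beta_1, beta_2)
n, reps = 200, 2000

estimates = np.empty((reps, 3))
for r in range(reps):
    x = rng.normal(size=(n, 2))
    X = np.column_stack([np.ones(n), x])   # add intercept column
    u = rng.normal(size=n)                 # E(u | x) = 0 holds here
    y = X @ beta + u
    estimates[r], *_ = np.linalg.lstsq(X, y, rcond=None)

print(estimates.mean(axis=0))  # each entry should be near beta
```

Any single estimate misses the truth, but the distribution of estimates is centered on it, which is exactly what unbiasedness claims.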
Review: The Variance of OLS
MLR.5: Homoskedasticity

var(u \mid x_1, \ldots, x_k) = \sigma^2

Theorem: Given MLR.1 through MLR.5, we can show that:

var(\hat{\beta}_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}

where SST_j is the total sample variation of x_j:

SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2

and R_j^2 is the R-squared from regressing x_j on all the other regressors in the model.
Review: Standard Errors for OLS
Unless we know \sigma^2, we need an unbiased estimator in order to estimate standard errors.
We obtain:

\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - k - 1} = \frac{SSR}{n - k - 1}, \quad E(\hat{\sigma}^2 \mid X) = \sigma^2

Thus, the formula for the standard error is given by:

se(\hat{\beta}_j) = \frac{\hat{\sigma}}{[SST_j (1 - R_j^2)]^{1/2}}
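As a sanity check on this formula, here is a small numpy sketch (data and coefficients are illustrative) that computes se(\hat{\beta}_1) both via SST_j(1 - R_j^2) and via the diagonal of \hat{\sigma}^2 (X'X)^{-1}; the two routes give the same number.

```python
import numpy as np

# Sketch: compute sigma_hat^2 = SSR/(n-k-1) and
# se(beta_1) = sigma_hat / sqrt(SST_1 * (1 - R_1^2)) by hand,
# then compare with sigma_hat^2 * (X'X)^{-1} on the diagonal.
rng = np.random.default_rng(1)
n, k = 100, 2
x = rng.normal(size=(n, k))
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k - 1)          # SSR / df

# se of beta_1 via the SST_j(1 - R_j^2) route
xj = x[:, 0]
others = np.column_stack([np.ones(n), x[:, 1]])
gamma, *_ = np.linalg.lstsq(others, xj, rcond=None)
rj = xj - others @ gamma                          # residuals of x_1 on the rest
SSTj = np.sum((xj - xj.mean()) ** 2)
Rj2 = 1 - (rj @ rj) / SSTj
se_formula = np.sqrt(sigma2_hat / (SSTj * (1 - Rj2)))

# same quantity from the variance matrix
se_matrix = np.sqrt(sigma2_hat * np.linalg.inv(X.T @ X)[1, 1])
print(se_formula, se_matrix)   # the two should agree
```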
Review: Gauss-Markov Theorem
Theorem: Under assumptions MLR.1 through MLR.5, the OLS estimators \hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k are the best linear unbiased estimators (BLUE) of \beta_0, \beta_1, \ldots, \beta_k
i.e., for any linear estimator

\tilde{\beta}_j = \sum_{i=1}^{n} w_{ij} y_i

OLS has a lower variance, i.e.,

var(\hat{\beta}_j) \le var(\tilde{\beta}_j)
Review: Variance-Covariance Matrix
For testing hypotheses involving more than one coefficient, we may need to know covariances between coefficient estimators:

var(\hat{\beta}) =
\begin{pmatrix}
var(\hat{\beta}_0) & cov(\hat{\beta}_0, \hat{\beta}_1) & \cdots & cov(\hat{\beta}_0, \hat{\beta}_k) \\
cov(\hat{\beta}_1, \hat{\beta}_0) & var(\hat{\beta}_1) & & \vdots \\
\vdots & & \ddots & \\
cov(\hat{\beta}_k, \hat{\beta}_0) & \cdots & & var(\hat{\beta}_k)
\end{pmatrix}
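A minimal sketch of why the covariances matter (simulated, illustrative data): to test a hypothesis such as H0: \beta_1 + \beta_2 = 0 we need var(\hat{\beta}_1 + \hat{\beta}_2) = var(\hat{\beta}_1) + var(\hat{\beta}_2) + 2\,cov(\hat{\beta}_1, \hat{\beta}_2), which uses an off-diagonal element of the matrix above.

```python
import numpy as np

# Sketch: build the estimated variance-covariance matrix of the OLS
# coefficients and use it to get the standard error of b1 + b2.
rng = np.random.default_rng(2)
n = 150
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)      # correlated regressors
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 0.8 * x1 + 0.3 * x2 + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
sigma2 = resid @ resid / (n - X.shape[1])
V = sigma2 * np.linalg.inv(X.T @ X)     # estimated var-cov matrix of b

# var(b1 + b2) needs the covariance term, not just the diagonal
var_sum = V[1, 1] + V[2, 2] + 2 * V[1, 2]
se_sum = np.sqrt(var_sum)
t_stat = (b[1] + b[2] - 0) / se_sum     # t statistic for H0: beta1 + beta2 = 0
print(se_sum, t_stat)
```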
What is the sampling distribution?
Knowing the mean and variance of the OLS estimator is insufficient to permit hypothesis testing.
Under what conditions will the OLS estimator be normally distributed?
Exact, small (finite) sample assumptions
Large sample (asymptotic) assumptions
Asymptotic Normality
Theorem: Under the Gauss-Markov assumptions (MLR.1 through MLR.5; MLR.4 can be weakened to zero mean and zero correlation):

\sqrt{n}(\hat{\beta}_j - \beta_j) \overset{a}{\sim} N(0, \sigma^2 / a_j^2)

where

a_j^2 = plim\, n^{-1} \sum_{i=1}^{n} \hat{r}_{ij}^2 > 0

and

plim\, \hat{\sigma}^2 = \sigma^2

which allows us to use:

\frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \overset{a}{\sim} N(0, 1)
Motivation for course
During the lecture component of the course we are going to study circumstances in which some of the OLS assumptions fail
Caution: a little econometric technique in the wrong hands can be a dangerous thing
See the critique by Angus Deaton (2009), "Instruments of Development"
The nature of time series data
Two key differences between time series and cross-sectional data:
Temporal ordering
Interpretation of randomness
When we collect a time series data set, we obtain one possible outcome, or realization; had certain conditions in the past been different, we would generally obtain a different realization of the stochastic process
The set of all possible realizations of a time series process plays the role of the population in a cross-sectional analysis
Simple example: a finite distributed lag model
A finite distributed lag (FDL) model:

y_t = \alpha_0 + \delta_0 z_t + \delta_1 z_{t-1} + \delta_2 z_{t-2} + u_t

This model is an FDL of order 2
How do we interpret the coefficients in the context of a temporary, one-period increase in the value of z?
\delta_0 shows the immediate impact of a one-unit increase in z at time t: the impact propensity or impact multiplier
We can graph \delta_j as a function of j to obtain the lag distribution
Finite distributed lag model
We are also interested in the change in y due to a permanent one-unit increase in z
At the time of the increase, y increases by \delta_0
After one period, y has increased by \delta_0 + \delta_1
After two periods, y has increased by \delta_0 + \delta_1 + \delta_2
There are no further changes in y after two periods
This shows that the long-run change in y given a permanent increase in z is the sum of the coefficients on the current and lagged values of z: \delta_0 + \delta_1 + \delta_2
This is known as the long-run propensity or long-run multiplier
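The impact and long-run propensities can be recovered directly from an estimated FDL regression. Below is a minimal numpy sketch (the true lag coefficients and noise level are illustrative): simulate an FDL(2), regress y_t on z_t, z_{t-1}, z_{t-2}, and read off the two propensities.

```python
import numpy as np

# Sketch: simulate y_t = alpha_0 + d0*z_t + d1*z_{t-1} + d2*z_{t-2} + u_t
# and estimate the lag coefficients by OLS.
rng = np.random.default_rng(3)
T = 500
alpha0, deltas = 1.0, np.array([0.6, 0.3, 0.1])   # illustrative (d0, d1, d2)

z = rng.normal(size=T)
u = 0.1 * rng.normal(size=T)
y = np.full(T, np.nan)
for t in range(2, T):
    # z[t-2:t+1][::-1] is (z_t, z_{t-1}, z_{t-2})
    y[t] = alpha0 + deltas @ z[t-2:t+1][::-1] + u[t]

# regressor matrix for t = 2, ..., T-1: [1, z_t, z_{t-1}, z_{t-2}]
Z = np.column_stack([np.ones(T - 2), z[2:], z[1:-1], z[:-2]])
coef, *_ = np.linalg.lstsq(Z, y[2:], rcond=None)

impact = coef[1]             # delta_0: impact propensity
long_run = coef[1:].sum()    # delta_0 + delta_1 + delta_2: long-run propensity
print(impact, long_run)
```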
Unbiasedness of the OLS estimator
As in the cross-sectional case, we require a set of assumptions in order for the OLS estimator to be unbiased in time series models
Assumption TS.1: The stochastic process {(x_{t1}, x_{t2}, \ldots, x_{tk}, y_t): t = 1, 2, \ldots, n} follows the linear model

y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + u_t

where {u_t: t = 1, 2, \ldots, n} is the sequence of errors or disturbances and n is the number of observations (time periods)
Unbiasedness of the OLS estimator
Assumption TS.1 is essentially the same as assumption MLR.1 for cross-sectional models
Assumption TS.2: No perfect collinearity: no independent variable is constant nor a perfect linear combination of the others
Assumption TS.2 is the same as the corresponding assumption for cross-sectional data
Unbiasedness of the OLS estimator
A little further notation:
Let x_t = (x_{t1}, x_{t2}, \ldots, x_{tk}) denote the set of all independent variables in the equation at time t
Let X denote the collection of all independent variables for all time periods, where the rows correspond to the observations for each time period
Assumption TS.3: (Zero conditional mean) For each t, the expected value of the error u_t, given the explanatory variables for all time periods, is 0:

E(u_t \mid X) = 0, \quad t = 1, 2, \ldots, n
Unbiasedness of the OLS estimator
Assumption TS.3 implies that the error term, u_t, is uncorrelated with each explanatory variable in every time period
If u_t is independent of X and E(u_t) = 0, then TS.3 is automatically satisfied
Given our assumption for cross-sectional analysis (MLR.4), it is not surprising that we require u_t to be uncorrelated with the explanatory variables in period t. Stated in terms of conditional expectations: E(u_t \mid x_t) = 0
Unbiasedness of the OLS estimator
When E(u_t \mid x_t) = 0 holds, we say that the x_{tj} are contemporaneously exogenous
However, TS.3 is stronger than simply requiring contemporaneous exogeneity. The error term u_t must be uncorrelated with x_{sj} even when s \ne t
When TS.3 holds, we say that the explanatory variables are strictly exogenous
Unbiasedness of the OLS estimator
For cross-sectional data we did not need to specify how the error term for, say, person i, u_i, is related to the explanatory variables for other persons in the dataset. Due to random sampling, u_i is automatically uncorrelated with the explanatory variables for other persons
Importantly, TS.3 does not put any restrictions on the correlation between explanatory variables or on the correlation of u_t across time
Unbiasedness of the OLS estimator
Assumption TS.3 can fail for many reasons, such as measurement error or omitted variable bias, but it can also fail for less obvious reasons
For example, consider the simple static model

y_t = \beta_0 + \beta_1 z_t + u_t

Assumption TS.3 requires that u_t is uncorrelated with the present value as well as past and future values of z_t. This has two implications:
z can have no lagged effect on y. If z does have a lagged effect on y, then we should estimate a distributed lag model.
Strict exogeneity excludes the possibility that changes in the error term today can cause future changes in z. This effectively rules out feedback from y to future values of z
Unbiasedness of the OLS estimator
As a concrete example, consider

mrdrte_t = \beta_0 + \beta_1 polpc_t + u_t

Let's assume, for argument's sake, that u_t is uncorrelated with polpc_t and with past values of polpc
Suppose, though, that the city adjusts the size of its police force based on past values of the murder rate (i.e., a high murder rate in period t leads to an increase in the size of the police force in period t+1). Then u_t is correlated with polpc_{t+1}, violating TS.3
Unbiasedness of the OLS estimator
Explanatory variables that are strictly exogenous cannot react to what has happened to y in the past
For example, the amount of rainfall in agricultural production is not influenced by production in the previous year
However, something like labour input for agricultural production may not be strictly exogenous, since the farmer may adjust the amount based on last year's yields
There are lots of reasons to believe that TS.3 will be violated in many social science contexts, as current policies, such as interest rates, are influenced by previous realizations. However, it is the simplest way to demonstrate unbiasedness of the OLS estimator.
Unbiasedness of the OLS estimator
Theorem 10.1: (Unbiasedness of OLS) Under assumptions TS.1, TS.2 and TS.3, the OLS estimators are unbiased conditional on X, and therefore unconditionally as well:

E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \ldots, k
Variance of the OLS estimator
We have seen the conditions necessary for an unbiased estimator. The next step is to learn what we can say about how precise the estimator is.
Assumption TS.4: (Homoskedasticity) Conditional on X, the variance of u_t is the same for all t:

var(u_t \mid X) = var(u_t) = \sigma^2, \quad t = 1, 2, \ldots, n
Variance of the OLS estimator
TS.4 requires that the variance cannot depend on X. It is sufficient that:
u_t and X are independent, and
var(u_t) is constant over time
When TS.4 does not hold, we say that the errors are heteroskedastic, just as in the cross-sectional case
Variance of the OLS estimator
Example of heteroskedasticity:
Consider the following regression of the 3-month T-bill rate (i3_t) on the inflation rate (inf_t) and the federal deficit as a percentage of GDP (def_t):

i3_t = \beta_0 + \beta_1 inf_t + \beta_2 def_t + u_t

Since policy regime changes are known to affect the variability of interest rates, it is unlikely that the error terms are homoskedastic
Furthermore, the variability in interest rates may depend on the level of inflation or the deficit, also violating the homoskedasticity assumption
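A common way to probe for this kind of violation is a Breusch-Pagan-style check: regress the squared OLS residuals on the regressors; under homoskedasticity they should have no explanatory power. The sketch below uses simulated, illustrative data (not the T-bill series) in which the error variance depends on the regressor.

```python
import numpy as np

# Breusch-Pagan-style sketch: under TS.4, squared residuals should be
# unrelated to x; here the error variance depends on x by construction,
# so the LM statistic n * R^2 from the auxiliary regression is large.
rng = np.random.default_rng(9)
n = 500
x = rng.normal(size=n)
u = rng.normal(size=n) * np.exp(0.5 * x)   # variance grows with x
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid2 = (y - X @ b) ** 2

# auxiliary regression of squared residuals on the regressors
g, *_ = np.linalg.lstsq(X, resid2, rcond=None)
fitted = X @ g
r2_aux = 1 - np.sum((resid2 - fitted) ** 2) / np.sum((resid2 - resid2.mean()) ** 2)
lm_stat = n * r2_aux        # approx. chi^2(1) under homoskedasticity
print(lm_stat)              # a large value suggests heteroskedasticity
```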
Variance of the OLS estimator
Assumption TS.5: (NEW!! No serial correlation) Conditional on X, the errors in two different time periods are uncorrelated:

corr(u_t, u_s \mid X) = 0, \quad t \ne s

To interpret this condition, it is easier to ignore the conditioning on X, in which case the condition is:

corr(u_t, u_s) = 0, \quad t \ne s
Variance of the OLS estimator
If this condition does not hold, then we say the errors suffer from serial correlation or autocorrelation
Example of serial correlation: if u_{t-1} > 0, then u_t is on average above 0, so the correlation is positive. This turns out to be a reasonable characterization of error terms in many time series applications, which we will deal with later
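This kind of positive serial correlation is easy to generate and measure. The sketch below (rho and the sample length are illustrative) simulates AR(1) errors u_t = \rho u_{t-1} + e_t with \rho > 0, which violate TS.5, and shows the sample first-order autocorrelation is close to \rho.

```python
import numpy as np

# Sketch: AR(1) errors with rho > 0 violate TS.5; adjacent errors are
# positively correlated and the sample autocorrelation recovers rho.
rng = np.random.default_rng(4)
T, rho = 10_000, 0.7
e = rng.normal(size=T)
u = np.empty(T)
u[0] = e[0]
for t in range(1, T):
    u[t] = rho * u[t-1] + e[t]

# sample first-order autocorrelation: should be near rho
autocorr = np.corrcoef(u[:-1], u[1:])[0, 1]
print(autocorr)
```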
Variance of the OLS estimator
However, TS.5 says nothing about the temporal correlation of the explanatory variables
For example, inf_t is almost certainly positively correlated over time, but this does not violate TS.5
We did not need an assumption similar to TS.5 for cross-sectional analysis because under random sampling the error terms are automatically uncorrelated for two different observations
However, serial correlation will come up again in the context of panel data
Variance of the OLS estimator
Theorem 10.2: (OLS sampling variances) Under the time series assumptions TS.1 through TS.5, the variance of the OLS estimator conditional on X is:

var(\hat{\beta}_j \mid X) = \frac{\sigma^2}{SST_j (1 - R_j^2)}, \quad j = 1, \ldots, k

where SST_j is the total sum of squares of x_{tj} and R_j^2 is the R-squared from the regression of x_j on the other independent variables
Variance of the OLS estimator
This is the same variance we derived for cross-sectional analysis
Theorem 10.3: (Unbiased estimation of \sigma^2) Under assumptions TS.1 through TS.5, the estimator

\hat{\sigma}^2 = SSR / df

is an unbiased estimator of \sigma^2, where df = n - k - 1
Variance of the OLS estimator
Theorem 10.4: (Gauss-Markov Theorem) Under assumptions TS.1 through TS.5, the OLS estimators are the best linear unbiased estimators conditional on X.
Bottom line: OLS has the same desirable finite sample properties under TS.1 through TS.5 that it has under MLR.1 through MLR.5
Inference
In order to use the OLS standard errors, t statistics, and F statistics, we need to add one more assumption
Assumption TS.6: (Normality) The errors u_t are independent of X and are independently and identically distributed as N(0, \sigma^2)
TS.6 implies assumptions TS.3 through TS.5, but it is stronger since it assumes independence and normality
Inference
Theorem 10.5: (Normal sampling distributions) Under assumptions TS.1 through TS.6, the OLS estimators are normally distributed, conditional on X. Further, under the null hypothesis, each t statistic has a t distribution and each F statistic has an F distribution. Confidence intervals are constructed in the usual way.
Bottom line: Under these assumptions, we can proceed as usual.
Functional form, dummy variables and index numbers
Not surprisingly, we can use logarithmic functionalforms and dummy variables just as before (seeSection 10.4 in Wooldridge)
Trends
Many economic time series have a common tendency to grow over time. If we ignore this underlying trend, we may improperly attribute the correlation between two trending series to a genuine relationship
One popular way to capture trending behaviour is a linear time trend:

y_t = \alpha_0 + \alpha_1 t + e_t, \quad t = 1, 2, \ldots

where e_t is i.i.d. with E(e_t) = 0 and var(e_t) = \sigma_e^2
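Fitting a linear time trend is just an OLS regression of y_t on a constant and t. A minimal sketch (trend slope and noise level are illustrative): estimate the trend and form the detrended series as the residuals.

```python
import numpy as np

# Sketch: fit y_t = alpha_0 + alpha_1 * t + e_t by OLS, then detrend.
rng = np.random.default_rng(5)
T = 300
t = np.arange(1, T + 1, dtype=float)
y = 2.0 + 0.05 * t + rng.normal(scale=0.5, size=T)   # illustrative trend

A = np.column_stack([np.ones(T), t])
(a0, a1), *_ = np.linalg.lstsq(A, y, rcond=None)
detrended = y - (a0 + a1 * t)    # residuals: y with the trend removed
print(a1)                        # estimated trend slope, near 0.05
```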
Trends
A second popular method for capturing trends is an exponential trend, which holds when a series has the same average growth rate from period to period:

\log(y_t) = \beta_0 + \beta_1 t + e_t, \quad t = 1, 2, \ldots

Other forms of trends are used in empirical analysis, but linear and exponential trends are the most common
Trends
Nothing about trending variables immediately contradicts our assumptions TS.1 through TS.6. However, it may be that unobserved trending factors that affect y_t also affect some of the explanatory variables. If we ignore this, we have run a spurious regression!
Consider the model:

y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \beta_3 t + u_t
Trends
If the above model satisfies assumptions TS.1 through TS.3 (those required for the OLS estimator to be unbiased), then omitting t from the regression will generally lead to biased estimators of \beta_1 and \beta_2
Example 10.7 provides a simple example of the impacts of ignoring trends
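The omitted-trend bias is easy to reproduce in a small simulation (separate from Example 10.7; all numbers are illustrative): y does not depend on x at all, but both series trend upward, so leaving t out of the regression makes x look important.

```python
import numpy as np

# Sketch: y's true coefficient on x is 0, but both y and x trend.
# Omitting t produces a large spurious coefficient; including t fixes it.
rng = np.random.default_rng(6)
T = 400
t = np.arange(T, dtype=float)
x = 0.03 * t + rng.normal(size=T)
y = 0.05 * t + rng.normal(size=T)      # y does not depend on x

# regression without the trend term
X_no = np.column_stack([np.ones(T), x])
b_no, *_ = np.linalg.lstsq(X_no, y, rcond=None)

# regression including the trend term
X_tr = np.column_stack([np.ones(T), x, t])
b_tr, *_ = np.linalg.lstsq(X_tr, y, rcond=None)

print(b_no[1], b_tr[1])   # b_no[1] far from 0, b_tr[1] near 0
```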
Trends
R^2 is often very high in time series regressions. Does this mean these regressions are more informative than cross-sectional regressions?
Not necessarily
Time series data are often aggregate data, such as average wage levels, meaning individual heterogeneity has been averaged over
Moreover, the R^2 can be artificially high when the dependent variable is trending
Trends
Recall the formula for the adjusted R^2:

\bar{R}^2 = 1 - \hat{\sigma}_u^2 / \hat{\sigma}_y^2

where

\hat{\sigma}_y^2 = \frac{1}{n-1} \sum_{t=1}^{n} (y_t - \bar{y})^2

However, when E(y_t) follows, say, a linear trend, \hat{\sigma}_y^2 is no longer an unbiased or consistent estimator of var(y_t). The simplest way around this is to detrend the variables first and calculate the R^2 from a regression using the detrended variables
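The detrending fix can be sketched in a few lines (the data-generating numbers are illustrative): when y trends, the raw R^2 overstates how much x explains, while the R^2 from the detrended regression is a more honest measure.

```python
import numpy as np

# Sketch: compare raw R^2 with the R^2 from detrended variables when
# both y and x share a common trend.
rng = np.random.default_rng(7)
T = 400
t = np.arange(T, dtype=float)
x = 0.02 * t + rng.normal(size=T)
y = 0.5 * x + 0.05 * t + rng.normal(size=T)

def r_squared(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

r2_raw = r_squared(np.column_stack([np.ones(T), x]), y)

# detrend each series, then regress detrended y on detrended x
A = np.column_stack([np.ones(T), t])
def detrend(s):
    c, *_ = np.linalg.lstsq(A, s, rcond=None)
    return s - A @ c

r2_detrended = r_squared(
    np.column_stack([np.ones(T), detrend(x)]), detrend(y))
print(r2_raw, r2_detrended)   # raw R^2 is inflated by the common trend
```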
Seasonality
If a time series is observed at monthly or quarterlyintervals (or even weekly or daily) it may exhibitseasonality
e.g., housing starts can be strongly influenced by
weather. If weather patterns are generally worse in,say, January than June, then housing starts willgenerally be lower in January
e.g., retail sales are generally higher in December thanin other months because of the Christmas holiday
Seasonality
One way to address this is to allow the expected value to vary by month:

y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + \delta_1 feb_t + \delta_2 mar_t + \cdots + \delta_{11} dec_t + u_t

Many series that display seasonal patterns are seasonally adjusted before being reported for public use. Essentially, they have had the seasonal component removed.