01-Basic Regression Analysis With Time Series Data part 1

Apr 14, 2018

  • 7/30/2019 01-Basic Regression Analysis With Time Series Data part 1

    1/44

Review of OLS & Introduction to Pooled Cross Sections

    EMET 8002

    Lecture 1

    July 23, 2008


    Outline

    Review the key assumptions of OLS regression

    Chapter 3 in the text

    Motivation for course topics

    Introduce regression analysis with time series data

    Chapter 10 in the text


    OLS assumptions

    MLR.1: The model is linear in parameters

    MLR.2: Random sampling

    MLR.3: No perfect multicollinearity

None of the independent variables is constant, and there are no exact linear relationships among the independent variables

    MLR.4: Zero conditional mean

$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u$

$E(u \mid x_1, \dots, x_k) = 0$


    Unbiasedness of OLS

    Theorem 3.1: Under assumptions MLR.1 to MLR.4

For any values of the population parameters $\beta_j$,

In other words, the OLS estimators are unbiased estimators of the population parameters

This is a property of the estimator, not of any specific estimate; i.e., we have no reason to believe that our estimate is either too big or too small

$E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \dots, k$
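Unbiasedness can be illustrated with a small Monte Carlo sketch: under a data-generating process satisfying MLR.1 through MLR.4, the average of the OLS estimates across many samples should sit close to the true parameters. All concrete numbers below (sample size, replications, true coefficients) are illustrative choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0, -0.5])    # true (beta_0, beta_1, beta_2), chosen arbitrarily
n, reps = 200, 2000

estimates = np.empty((reps, 3))
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    u = rng.normal(size=n)           # E(u | x1, x2) = 0 by construction (MLR.4)
    y = beta[0] + beta[1] * x1 + beta[2] * x2 + u
    X = np.column_stack([np.ones(n), x1, x2])
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

mean_est = estimates.mean(axis=0)    # close to beta: the estimator is unbiased
```

Note that individual draws in `estimates` scatter around the truth; it is only their mean that pins down the population parameters, which is exactly the distinction between the estimator and any one estimate.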


    Review: The Variance of OLS

MLR.5: Homoskedasticity

$\operatorname{var}(u \mid x_1, x_2, \dots, x_k) = \sigma^2$

Theorem: Given MLR.1 through MLR.5, we can show that:

$\operatorname{var}(\hat{\beta}_j) = \dfrac{\sigma^2}{SST_j\,(1 - R_j^2)}$

where $SST_j$ is the total sample variation of $x_j$:

$SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$

and $R_j^2$ is the R-squared from regressing $x_j$ on all the other regressors in the model.
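The variance formula can be checked numerically: for a fixed design, $\sigma^2 / (SST_j(1 - R_j^2))$ coincides exactly with the corresponding diagonal entry of $\sigma^2 (X'X)^{-1}$. This is a sketch using an arbitrary simulated design with $\sigma^2 = 1$; the data are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)         # deliberately correlated regressors
X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 1.0                               # treat sigma^2 as known here

# Textbook route: SST_1 and R_1^2 from regressing x1 on the other regressor
sst1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])
x1_fit = Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
r1sq = 1 - np.sum((x1 - x1_fit) ** 2) / sst1
var_formula = sigma2 / (sst1 * (1 - r1sq))

# Matrix route: the corresponding diagonal entry of sigma^2 (X'X)^{-1}
var_matrix = sigma2 * np.linalg.inv(X.T @ X)[1, 1]
```

The two routes agree to machine precision, and the formula makes the cost of multicollinearity visible: as $R_j^2 \to 1$, the variance blows up.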


    Review: Standard Errors for OLS

Unless we know $\sigma^2$, we need an unbiased estimator of it in order to estimate standard errors.

We obtain:

$\hat{\sigma}^2 = \dfrac{\sum_{i=1}^{n} \hat{u}_i^2}{n - k - 1} = \dfrac{SSR}{n - k - 1}, \qquad E(\hat{\sigma}^2 \mid X) = \sigma^2$

Thus, the formula for the standard error is given by:

$\operatorname{se}(\hat{\beta}_j) = \dfrac{\hat{\sigma}}{\left[ SST_j\,(1 - R_j^2) \right]^{1/2}}$
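The two-step recipe on this slide, estimate $\sigma^2$ from the residuals and then plug it into the standard-error formula, can be sketched as follows. The simulated data and coefficients are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 2
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k - 1)   # SSR / (n - k - 1)

# Standard error of beta_hat_1 via the formula on the slide
sst1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])
x1_fit = Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
r1sq = 1 - np.sum((x1 - x1_fit) ** 2) / sst1
se1 = np.sqrt(sigma2_hat) / np.sqrt(sst1 * (1 - r1sq))
```

The degrees-of-freedom correction $n - k - 1$ is what makes $\hat{\sigma}^2$ unbiased; dividing by $n$ instead would bias it downward.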


    Review: Gauss-Markov Theorem

Theorem: Under assumptions MLR.1 through MLR.5, the OLS estimators $\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k$ are the best linear unbiased estimators (BLUE) of $\beta_0, \beta_1, \dots, \beta_k$.

i.e., for any linear unbiased estimator

$\tilde{\beta}_j = \sum_{i=1}^{n} w_{ij} y_i,$

OLS has a lower variance, i.e.,

$\operatorname{var}(\hat{\beta}_j) \le \operatorname{var}(\tilde{\beta}_j)$


    Review: Variance-Covariance Matrix

For testing hypotheses involving more than one coefficient, we may need to know the covariances between coefficient estimators:

$\operatorname{var}(\hat{\boldsymbol{\beta}}) = \begin{pmatrix} \operatorname{var}(\hat{\beta}_0) & \operatorname{cov}(\hat{\beta}_0, \hat{\beta}_1) & \cdots & \operatorname{cov}(\hat{\beta}_0, \hat{\beta}_k) \\ \operatorname{cov}(\hat{\beta}_1, \hat{\beta}_0) & \operatorname{var}(\hat{\beta}_1) & & \vdots \\ \vdots & & \ddots & \\ \operatorname{cov}(\hat{\beta}_k, \hat{\beta}_0) & \cdots & & \operatorname{var}(\hat{\beta}_k) \end{pmatrix}$
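A minimal sketch of why the off-diagonal terms matter: to test a hypothesis like $\beta_1 = \beta_2$ we need $\operatorname{var}(\hat{\beta}_1 - \hat{\beta}_2)$, which involves the covariance between the two estimators. The estimated matrix below is $\hat{\sigma}^2 (X'X)^{-1}$ under homoskedasticity; the data are an illustrative simulation.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 120
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])
vcov = sigma2_hat * np.linalg.inv(X.T @ X)     # estimated variance-covariance matrix

# For H0: beta_1 = beta_2 we need var(b1 - b2), which uses an off-diagonal entry
var_diff = vcov[1, 1] + vcov[2, 2] - 2 * vcov[1, 2]
```

Ignoring the covariance term and simply adding the two variances would give the wrong standard error for the difference whenever the regressors are correlated.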


    What is the sampling distribution?

Knowing the mean and variance of the OLS estimator is insufficient information to permit hypothesis testing.

Under what conditions will the OLS estimator be normally distributed?

Exact, small (finite) sample assumptions

Large sample (asymptotic) assumptions


    Asymptotic Normality

Theorem: Under the Gauss-Markov assumptions (MLR.1 through MLR.5; normality of the errors is not required):

$\sqrt{n}\,(\hat{\beta}_j - \beta_j) \overset{a}{\sim} N\!\left(0,\; \sigma^2 / a_j^2\right)$

where

$a_j^2 = \operatorname{plim}\left( n^{-1} \sum_{i=1}^{n} \hat{r}_{ij}^2 \right) > 0$

and

$\operatorname{plim}\, \hat{\sigma}^2 = \sigma^2$

which allows us to use:

$\dfrac{\hat{\beta}_j - \beta_j}{\operatorname{se}(\hat{\beta}_j)} \overset{a}{\sim} N(0, 1)$
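A quick simulation illustrates the asymptotic result: even with deliberately non-normal (here uniform) errors, the studentized OLS coefficient behaves approximately like a standard normal in a moderately large sample. Sample size, replication count, and the true slope are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps, beta1 = 200, 2000, 0.7            # all values illustrative
tstats = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.uniform(-1.0, 1.0, size=n)     # non-normal errors with E(u) = 0
    y = 1.0 + beta1 * x + u
    X = np.column_stack([np.ones(n), x])
    bh = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ bh
    s2 = resid @ resid / (n - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    tstats[r] = (bh[1] - beta1) / se       # approximately standard normal
```

The simulated statistics have mean near 0 and standard deviation near 1, which is what licenses the usual t-based inference in large samples without assuming normal errors.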


    Motivation for course

During the lecture component of the course we are going to study circumstances in which some of the OLS assumptions fail

Caution: a little econometric technique in the wrong hands can be a dangerous thing

See the critique by Angus Deaton (2009), "Instruments of Development"


    The nature of time series data

Two key differences between time series and cross-sectional data:

Temporal ordering

Interpretation of randomness

When we collect a time series data set, we obtain one possible outcome, or realization; if certain conditions in the past had been different, we would generally obtain a different realization of the stochastic process.

The set of all possible realizations of a time series process plays the role of the population in a cross-sectional analysis.


Simple examples: a finite distributed lag model

A finite distributed lag (FDL) model:

$y_t = \alpha_0 + \delta_0 z_t + \delta_1 z_{t-1} + \delta_2 z_{t-2} + u_t$

This model is an FDL of order 2.

How do we interpret the coefficients in the context of a temporary, one-period increase in the value of z?

$\delta_0$ shows the immediate impact of a one-unit increase in z at time t: the impact propensity or impact multiplier

We can graph $\delta_j$ as a function of j to obtain the lag distribution


    Finite distributed lag model

We are also interested in the change in y due to a permanent one-unit increase in z

At the time of the increase, y increases by $\delta_0$

After one period, y has increased by $\delta_0 + \delta_1$

After two periods, y has increased by $\delta_0 + \delta_1 + \delta_2$

There are no further changes in y after two periods

This shows that the long-run change in y given a permanent increase in z is the sum of the coefficients on the current and lagged values of z: $\delta_0 + \delta_1 + \delta_2$

This is known as the long-run propensity or long-run multiplier
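The FDL interpretation above can be sketched numerically: simulate an FDL(2) process, estimate it by OLS on the current value and two lags of z, and read off the impact propensity and the long-run propensity. The true coefficient values and sample length below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 500
alpha0, d0, d1, d2 = 1.0, 2.0, 1.0, 0.5   # illustrative true coefficients
z = rng.normal(size=T)
u = rng.normal(scale=0.5, size=T)

y = alpha0 + d0 * z + u
y[1:] += d1 * z[:-1]                       # add the one-period lag effect
y[2:] += d2 * z[:-2]                       # add the two-period lag effect

# Drop the first two periods (lost to lagging), then regress on z_t, z_{t-1}, z_{t-2}
Y = y[2:]
X = np.column_stack([np.ones(T - 2), z[2:], z[1:-1], z[:-2]])
coef = np.linalg.lstsq(X, Y, rcond=None)[0]

impact = coef[1]                           # estimated delta_0: the impact propensity
lrp = coef[1] + coef[2] + coef[3]          # long-run propensity: sum of the deltas
```

The estimated impact propensity should be near $\delta_0 = 2$ and the long-run propensity near $\delta_0 + \delta_1 + \delta_2 = 3.5$, matching the temporary versus permanent interpretation on this slide.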


    Unbiasedness of the OLS estimator

As in the cross-sectional case, we require a set of assumptions in order for the OLS estimator to be unbiased in time series models

Assumption TS.1: The stochastic process $\{(x_{t1}, x_{t2}, \dots, x_{tk}, y_t) : t = 1, 2, \dots, n\}$ follows the linear model

$y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + u_t$

where $\{u_t : t = 1, 2, \dots, n\}$ is the sequence of errors or disturbances and n is the number of observations (time periods)


    Unbiasedness of the OLS estimator

Assumption TS.1 is essentially the same as assumption MLR.1 for cross-sectional models

Assumption TS.2: No perfect collinearity. No independent variable is constant nor a perfect linear combination of the others

Assumption TS.2 is the same as the corresponding assumption for cross-sectional data


    Unbiasedness of the OLS estimator

A little further notation:

Let $\mathbf{x}_t = (x_{t1}, x_{t2}, \dots, x_{tk})$ denote the set of all independent variables in the equation at time t

Let $\mathbf{X}$ denote the collection of all independent variables for all time periods, where the rows correspond to the observations for each time period

Assumption TS.3: (Zero conditional mean) For each t, the expected value of the error $u_t$, given the explanatory variables for all time periods, is 0:

$E(u_t \mid \mathbf{X}) = 0, \quad t = 1, 2, \dots, n$


    Unbiasedness of the OLS estimator

Assumption TS.3 implies that the error term, $u_t$, is uncorrelated with each explanatory variable in every time period

If $u_t$ is independent of $\mathbf{X}$ and $E(u_t) = 0$, then TS.3 is automatically satisfied

Given our assumption for cross-sectional analysis (MLR.4), it is not surprising that we require $u_t$ to be uncorrelated with the explanatory variables in period t. Stated in terms of conditional expectations: $E(u_t \mid \mathbf{x}_t) = 0$


    Unbiasedness of the OLS estimator

When $E(u_t \mid \mathbf{x}_t) = 0$ holds, we say that the $x_{tj}$ are contemporaneously exogenous

However, TS.3 is stronger than simply requiring contemporaneous exogeneity. The error term $u_t$ must be uncorrelated with $x_{sj}$ even when $s \neq t$.

When TS.3 holds, we say that the explanatory variables are strictly exogenous


    Unbiasedness of the OLS estimator

For cross-sectional data we did not need to specify how the error term for, say, person i, $u_i$, is related to the explanatory variables for other persons in the dataset. Due to random sampling, $u_i$ is automatically uncorrelated with the explanatory variables for other persons

Importantly, TS.3 does not put any restrictions on the correlation between explanatory variables or on the correlation of $u_t$ across time


    Unbiasedness of the OLS estimator

Assumption TS.3 can fail for many reasons, such as measurement error or omitted variable bias, but it can also fail for less obvious reasons

For example, consider the simple static model

$y_t = \beta_0 + \beta_1 z_t + u_t$

Assumption TS.3 requires that $u_t$ is uncorrelated with the present value as well as past and future values of $z_t$. This has two implications:

z can have no lagged effect on y. If z does have a lagged effect on y, then we should estimate a distributed lag model.

Strict exogeneity excludes the possibility that changes in the error term today can cause future changes in z. This effectively rules out feedback from y to future values of z


    Unbiasedness of the OLS estimator

As a concrete example, consider

$mrdrte_t = \beta_0 + \beta_1 polpc_t + u_t$

Let's assume, for argument's sake, that $u_t$ is uncorrelated with $polpc_t$ and with past values of $polpc$

Suppose, though, that the city adjusts the size of its police force based on past values of the murder rate (i.e., a high murder rate in period t leads to an increase in the size of the police force in period t+1). Then $u_t$ is correlated with $polpc_{t+1}$, violating TS.3


    Unbiasedness of the OLS estimator

Explanatory variables that are strictly exogenous cannot react to what has happened to y in the past

For example, the amount of rainfall in agricultural production is not influenced by production in the previous year

However, something like labour input in agricultural production may not be strictly exogenous, since the farmer may adjust the amount based on last year's yields

There are lots of reasons to believe that TS.3 will be violated in many social science contexts, as current policies, such as interest rates, are influenced by previous realizations. However, it is the simplest way to demonstrate unbiasedness of the OLS estimator.


    Unbiasedness of the OLS estimator

Theorem 10.1: (Unbiasedness of OLS) Under assumptions TS.1, TS.2 and TS.3, the OLS estimators are unbiased conditional on $\mathbf{X}$, and therefore unconditionally as well:

$E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \dots, k$


    Variance of the OLS estimator

We have seen the conditions necessary for an unbiased estimator. The next step is to learn how we can say anything about how precise the estimator is.

Assumption TS.4: (Homoskedasticity) Conditional on $\mathbf{X}$, the variance of $u_t$ is the same for all t:

$\operatorname{var}(u_t \mid \mathbf{X}) = \operatorname{var}(u_t) = \sigma^2, \quad t = 1, 2, \dots, n$


    Variance of the OLS estimator

TS.4 requires that the variance cannot depend on $\mathbf{X}$. It is sufficient that:

$u_t$ and $\mathbf{X}$ are independent, and

$\operatorname{var}(u_t)$ is constant over time

When TS.4 does not hold, we say that the errors are heteroskedastic, just as in the cross-sectional case


    Variance of the OLS estimator

Example of heteroskedasticity:

Consider the following regression of the 3-month T-bill rate ($i3_t$) on the inflation rate ($inf_t$) and the federal deficit as a percentage of GDP ($def_t$):

$i3_t = \beta_0 + \beta_1 inf_t + \beta_2 def_t + u_t$

Since policy regime changes are known to affect the variability of interest rates, it is unlikely that the error terms are homoskedastic

Furthermore, the variability in interest rates may depend on the level of inflation or the deficit, also violating the homoskedasticity assumption


    Variance of the OLS estimator

Assumption TS.5: (NEW! No serial correlation) Conditional on $\mathbf{X}$, the errors in two different time periods are uncorrelated:

$\operatorname{corr}(u_t, u_s \mid \mathbf{X}) = 0, \quad t \neq s$

To interpret this condition, it is easier to ignore the conditioning on $\mathbf{X}$, in which case the condition is:

$\operatorname{corr}(u_t, u_s) = 0, \quad t \neq s$
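A minimal sketch of what TS.5 rules out: the first-order sample autocorrelation of an uncorrelated error series is near zero, while for AR(1)-style errors it is near the autoregressive parameter. The series lengths and the 0.8 parameter are illustrative choices.

```python
import numpy as np

def autocorr1(e):
    """First-order sample autocorrelation of a series."""
    e = e - e.mean()
    return (e[1:] @ e[:-1]) / (e @ e)

rng = np.random.default_rng(6)
T = 2000
iid = rng.normal(size=T)       # uncorrelated errors: TS.5 holds

ar = np.empty(T)               # AR(1) errors u_t = 0.8 u_{t-1} + e_t: TS.5 fails
ar[0] = rng.normal()
for t in range(1, T):
    ar[t] = 0.8 * ar[t - 1] + rng.normal()

rho_iid = autocorr1(iid)       # close to 0
rho_ar = autocorr1(ar)         # close to 0.8
```

This is essentially the diagnostic idea behind later tests for serial correlation: look at how residuals in adjacent periods co-move.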


    Variance of the OLS estimator

If this condition does not hold, then we say the errors suffer from serial correlation or autocorrelation

Example of serial correlation: if, whenever $u_{t-1} > 0$, $u_t$ is on average above 0, then the correlation is positive. This turns out to be a reasonable characterization of error terms in many time series applications, which we will deal with later


    Variance of the OLS estimator

However, TS.5 says nothing about the temporal correlation of the explanatory variables

For example, $inf_t$ is almost certainly positively correlated over time, but this does not violate TS.5

We did not need an assumption similar to TS.5 for cross-sectional analysis, because under random sampling the error terms are automatically uncorrelated for two different observations

However, serial correlation will come up again in the context of panel data


    Variance of the OLS estimator

Theorem 10.2: (OLS sampling variances) Under the time series assumptions TS.1 through TS.5, the variance of the OLS estimator conditional on $\mathbf{X}$ is:

$\operatorname{var}(\hat{\beta}_j \mid \mathbf{X}) = \dfrac{\sigma^2}{SST_j\,(1 - R_j^2)}, \quad j = 1, \dots, k$

where $SST_j$ is the total sum of squares of $x_{tj}$ and $R_j^2$ is the R-squared from the regression of $x_j$ on the other independent variables


    Variance of the OLS estimator

This is the same variance we derived for cross-sectional analysis

Theorem 10.3: (Unbiased estimation of $\sigma^2$) Under assumptions TS.1 through TS.5, the estimator

$\hat{\sigma}^2 = SSR / df$

is an unbiased estimator of $\sigma^2$, where $df = n - k - 1$


    Variance of the OLS estimator

Theorem 10.4: (Gauss-Markov Theorem) Under assumptions TS.1 through TS.5, the OLS estimators are the best linear unbiased estimators conditional on $\mathbf{X}$.

Bottom line: OLS has the same desirable finite sample properties under TS.1 through TS.5 that it has under MLR.1 through MLR.5


    Inference

In order to use the OLS standard errors, t statistics, and F statistics, we need to add one more assumption

Assumption TS.6: (Normality) The errors $u_t$ are independent of $\mathbf{X}$ and are independently and identically distributed as $N(0, \sigma^2)$

TS.6 implies assumptions TS.3 through TS.5, but it is stronger since it assumes independence and normality


    Inference

Theorem 10.5: (Normal sampling distributions) Under assumptions TS.1 through TS.6, the OLS estimators are normally distributed conditional on $\mathbf{X}$. Further, under the null hypothesis, each t statistic has a t distribution and each F statistic has an F distribution. Confidence intervals are constructed in the usual way.

Bottom line: Under these assumptions, we can proceed as usual.


Functional form, dummy variables and index numbers

Not surprisingly, we can use logarithmic functional forms and dummy variables just as before (see Section 10.4 in Wooldridge)


    Trends

Many economic time series have a common tendency to grow over time. If we ignore this underlying trend, we may improperly attribute the correlation between two trending series to a genuine relationship between them.

One popular way to capture trending behaviour is a linear time trend:

$y_t = \alpha_0 + \alpha_1 t + e_t, \quad t = 1, 2, \dots$

where $e_t$ is i.i.d. with $E(e_t) = 0$ and $\operatorname{var}(e_t) = \sigma_e^2$


    Trends

A second popular method for capturing trends is an exponential trend, which holds when a series has the same average growth rate from period to period:

$\log(y_t) = \beta_0 + \beta_1 t + e_t, \quad t = 1, 2, \dots$

Other forms of trends are used in empirical analysis, but linear and exponential trends are the most common
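Both trend specifications are ordinary OLS regressions on a time index. A minimal sketch with simulated series (the true slopes 0.1 and 0.02 are illustrative; the 0.02 plays the role of an average per-period growth rate):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 300
t = np.arange(1.0, T + 1)
X = np.column_stack([np.ones(T), t])

# Linear trend: y_t = alpha_0 + alpha_1 t + e_t, with true slope 0.1
y = 5.0 + 0.1 * t + rng.normal(size=T)
a0, a1 = np.linalg.lstsq(X, y, rcond=None)[0]

# Exponential trend: regress log(y_t) on t; the slope is the average
# per-period growth rate (the series must be strictly positive)
g = np.exp(0.02 * t + rng.normal(scale=0.05, size=T))
b0, b1 = np.linalg.lstsq(X, np.log(g), rcond=None)[0]
```

The same design matrix serves both cases; only the dependent variable changes, which is why the exponential trend is described as "the same regression in logs".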


    Trends

Nothing about trending variables immediately contradicts our assumptions, TS.1 through TS.6. However, it may be that unobserved trending factors that affect $y_t$ also affect some of the explanatory variables. If we ignore this, we may run a spurious regression!

Consider the model:

$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \beta_3 t + u_t$


    Trends

If the above model satisfies assumptions TS.1 through TS.3 (those required for the OLS estimator to be unbiased), then omitting t from the regression will generally lead to biased estimators of $\beta_1$ and $\beta_2$

    Example 10.7 provides a simple example of theimpacts of ignoring trends


    Trends

$R^2$ values are often very high in time series regressions. Does this mean they are more informative than cross-sectional regressions?

Not necessarily

Time series data are often aggregate data, such as average wage levels, meaning individual heterogeneity has been averaged over

Moreover, the $R^2$ can be artificially high when the dependent variable is trending


    Trends

Recall the formula for the adjusted $R^2$:

$\bar{R}^2 = 1 - \dfrac{\hat{\sigma}_u^2}{\hat{\sigma}_y^2}$

where

$\hat{\sigma}_y^2 = \dfrac{1}{n - 1} \sum_{t=1}^{n} (y_t - \bar{y})^2$

However, when $E(y_t)$ follows, say, a linear trend, then this is no longer an unbiased or consistent estimator of $\operatorname{var}(y_t)$. The simplest way around this is to detrend the variables first and calculate the $R^2$ from a regression using the detrended variables
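The detrending fix can be sketched as follows: two series that share a trend but are otherwise unrelated produce a very high raw $R^2$, while the $R^2$ from the detrended regression collapses toward zero. The simulated series and trend slope are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
T = 200
t = np.arange(T, dtype=float)
# x and y share a linear trend but are otherwise unrelated
x = 0.5 * t + rng.normal(scale=3.0, size=T)
y = 0.5 * t + rng.normal(scale=3.0, size=T)

def r_squared(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

ones = np.ones(T)
r2_raw = r_squared(np.column_stack([ones, x]), y)          # inflated by the trend

# Detrend both series first, then compute R^2 from the detrended regression
Xt = np.column_stack([ones, t])
x_dt = x - Xt @ np.linalg.lstsq(Xt, x, rcond=None)[0]
y_dt = y - Xt @ np.linalg.lstsq(Xt, y, rcond=None)[0]
r2_detrended = r_squared(np.column_stack([ones, x_dt]), y_dt)
```

The gap between `r2_raw` and `r2_detrended` is exactly the artificial goodness-of-fit the slide warns about.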


    Seasonality

If a time series is observed at monthly or quarterly intervals (or even weekly or daily), it may exhibit seasonality

e.g., housing starts can be strongly influenced by weather. If weather patterns are generally worse in, say, January than in June, then housing starts will generally be lower in January

e.g., retail sales are generally higher in December than in other months because of the Christmas holiday


    Seasonality

One way to address this is to allow the expected value to vary by month:

$y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + \delta_1 feb_t + \delta_2 mar_t + \cdots + \delta_{11} dec_t + u_t$

Many series that display seasonal patterns are often seasonally adjusted before being reported for public use. Essentially, they have had the seasonal component removed.
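The monthly-dummy regression above can be sketched with a simulated series that has a December bump (in the spirit of the retail-sales example). The size of the bump, the sample length, and the omission of other regressors are all illustrative simplifications; January serves as the base month, matching the feb-through-dec dummies on the slide.

```python
import numpy as np

rng = np.random.default_rng(9)
years = 50
T = 12 * years
month = np.tile(np.arange(12), years)      # 0 = Jan, ..., 11 = Dec
season = np.where(month == 11, 4.0, 0.0)   # an illustrative December bump
y = 10.0 + season + rng.normal(size=T)

# Month dummies feb..dec, with January as the omitted base month
D = np.column_stack([np.ones(T)] + [(month == m).astype(float) for m in range(1, 12)])
coef = np.linalg.lstsq(D, y, rcond=None)[0]

dec_effect = coef[11]                      # estimated December shift relative to January
```

Each $\delta_m$ estimate is the average difference between month m and January; subtracting the fitted monthly means from the series is one simple way to remove the seasonal component, which is what "seasonally adjusted" series have had done to them.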