Multiple Regression Analysis - Inference
Võ Đức Hoàng Vũ
School of Economics, University of Economics HCMC
5/1/2007
Assumption of the Classical Linear Model (CLM)
So far, we know that given the Gauss-Markov
assumptions, OLS is BLUE.
In order to do classical hypothesis testing, we need to add
another assumption (beyond the Gauss-Markov
assumptions)
Assume that u is independent of x1, . . . , xk and u is
normally distributed with zero mean and variance
$\sigma^2$: $u \sim \mathrm{Normal}(0, \sigma^2)$
CLM Assumptions (cont)
Under CLM, OLS is not only BLUE, but is the minimum
variance unbiased estimator.
We can summarize the population assumptions of CLM
as follows
$y \mid x \sim \mathrm{Normal}(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \sigma^2)$
For now we simply assume normality, although it is clear that
this is sometimes not the case
Large samples will let us drop the normality assumption
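The conditional distribution of y given x can be illustrated by simulation. The following is a minimal sketch, assuming hypothetical parameter values (betas = [1.0, 0.5, -2.0], sigma = 1.5) that do not come from the slides; draw_y is an illustrative helper name.

```python
import random

def draw_y(x, betas, sigma, rng):
    """Draw one y from Normal(b0 + b1*x1 + ... + bk*xk, sigma^2).

    x     : list of k regressor values (x1, ..., xk)
    betas : list of k + 1 coefficients (b0, b1, ..., bk)
    sigma : standard deviation of the error u
    """
    mean = betas[0] + sum(b * xj for b, xj in zip(betas[1:], x))
    # random.gauss takes the standard deviation, not the variance.
    return rng.gauss(mean, sigma)

# Hypothetical values, chosen only for illustration.
rng = random.Random(42)
betas = [1.0, 0.5, -2.0]
sigma = 1.5
x = [2.0, 1.0]  # conditional mean = 1.0 + 0.5*2.0 - 2.0*1.0 = 0.0

draws = [draw_y(x, betas, sigma, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

# The sample mean should be near 0.0 and the sample variance near sigma**2 = 2.25.
print(round(sample_mean, 2), round(sample_var, 2))
```

Holding x fixed and drawing y repeatedly mirrors the statement that the distribution of y is normal conditional on x, with a variance that does not depend on x.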
MLR Assumptions
Assumption MLR.1 Linear in Parameters
The model in the population can be written as
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$, (1)
where β0, β1, β2, . . . , βk are the unknown parameters (constants) of interest and u is an unobserved random error or disturbance term.
MLR Assumptions
Assumption MLR.2 Random Sampling
We have a random sample of n observations, {(xi1, xi2, . . . , xik, yi) : i = 1, 2, . . . , n}, following the population model in Assumption MLR.1
MLR Assumptions
Assumption MLR.3 No Perfect Collinearity
In the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.
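One way to check this assumption in a sample is to confirm that the design matrix (an intercept column plus the regressors) has full column rank. The sketch below is an illustration only: matrix_rank and has_perfect_collinearity are hypothetical helper names, the data are made up, and the rank is computed with a simple Gaussian elimination rather than a numerically robust library routine.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a matrix (list of rows) via Gaussian elimination with pivoting."""
    m = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot: this column depends on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

def has_perfect_collinearity(*regressors):
    """True if the intercept and the regressors are exactly linearly dependent."""
    n = len(regressors[0])
    design = [[1.0] + [x[i] for x in regressors] for i in range(n)]
    return matrix_rank(design) < len(regressors) + 1

x1 = [1, 2, 3, 4]
print(has_perfect_collinearity(x1, [2, 4, 6, 8]))   # x2 = 2 * x1: exact linear relationship
print(has_perfect_collinearity(x1, [1, 4, 9, 16]))  # x2 = x1 squared: not a linear relationship
print(has_perfect_collinearity([5, 5, 5, 5]))       # a constant regressor duplicates the intercept
```

Including the intercept column in the check is what makes a constant regressor count as a violation, matching the wording of MLR.3.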
MLR Assumptions
Assumption MLR.4 Zero Conditional Mean
The error u has an expected value of zero given any values of the independent variables. In other words,
$E(u \mid x_1, x_2, \dots, x_k) = 0$ (2)
MLR Assumptions
Assumption MLR.5 Homoskedasticity
The error u has the same variance given any values of the explanatory variables. In other words,
$\mathrm{Var}(u \mid x_1, \dots, x_k) = \sigma^2$.
MLR Assumptions
Assumption MLR.6 Normality
The population error u is independent of the explanatory variables x1, x2, . . . , xk and is normally distributed with zero mean and variance $\sigma^2$: $u \sim \mathrm{Normal}(0, \sigma^2)$.
[Figure: The homoskedastic normal distribution with a single explanatory variable]
Normal Sampling Distributions
Under the CLM assumptions, conditional on the sample
values of the independent variables
$\hat\beta_j \sim N(\beta_j, \mathrm{Var}(\hat\beta_j))$, so that
$(\hat\beta_j - \beta_j) / \mathrm{sd}(\hat\beta_j) \sim N(0, 1)$
β̂j is distributed normally because it is a linear
combination of the errors
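The normal sampling distribution can be checked with a small Monte Carlo experiment. The sketch below assumes the simple regression case (k = 1), where sd(β̂1) = σ / sqrt(SST_x), and uses hypothetical parameter values (beta0 = 1.0, beta1 = 0.5, sigma = 2.0) and a made-up fixed regressor; none of these numbers come from the slides.

```python
import math
import random

def simulate_standardized_slope(n_reps=5000, beta0=1.0, beta1=0.5, sigma=2.0, seed=0):
    """Return standardized OLS slopes (b1 - beta1) / sd(b1) over repeated samples."""
    rng = random.Random(seed)
    # The regressor values stay fixed across samples: we condition on them.
    x = [float(i) for i in range(1, 21)]
    n = len(x)
    x_bar = sum(x) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    sd_beta1 = sigma / math.sqrt(sst_x)  # true standard deviation of the slope estimator
    z_values = []
    for _ in range(n_reps):
        # Normal errors, as required by the normality assumption (MLR.6).
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = sum(y) / n
        b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
        z_values.append((b1 - beta1) / sd_beta1)
    return z_values

z = simulate_standardized_slope()
z_mean = sum(z) / len(z)
z_var = sum((zi - z_mean) ** 2 for zi in z) / (len(z) - 1)

# If the standardized estimator is N(0, 1), these should be near 0 and 1.
print(round(z_mean, 2), round(z_var, 2))
```

The slope estimate in each replication is a weighted sum of the y values, and therefore of the normal errors, which is why its distribution is normal.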
The t-test
Under the CLM assumptions
$(\hat\beta_j - \beta_j) / \mathrm{se}(\hat\beta_j) \sim t_{n-k-1}$
Note this is a t distribution (vs normal) because we have
to estimate $\sigma^2$ by $\hat\sigma^2$
Note the degrees of freedom: n − k − 1
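To make the role of the estimated variance concrete, the following sketch computes the estimate of σ², the standard error, and the degrees of freedom for the simple regression case (k = 1). It assumes the standard formulas σ̂² = SSR / (n − k − 1) and se(β̂1) = σ̂ / sqrt(SST_x); simple_ols_inference is a hypothetical helper name and the five data points are invented for illustration.

```python
import math

def simple_ols_inference(x, y):
    """OLS for y = b0 + b1*x + u, returning estimates, se(b1), and df."""
    n = len(x)
    k = 1  # one explanatory variable
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    # Sum of squared residuals.
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    df = n - k - 1                      # degrees of freedom
    sigma2_hat = ssr / df               # unbiased estimator of sigma^2
    se_b1 = math.sqrt(sigma2_hat / sst_x)
    return {"b0": b0, "b1": b1, "sigma2_hat": sigma2_hat, "se_b1": se_b1, "df": df}

# Invented data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = simple_ols_inference(x, y)
print(result)
```

Replacing the unknown σ with its estimate in the denominator is what changes the reference distribution from the standard normal to the t distribution with n − k − 1 degrees of freedom.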
The t-test (cont)
Knowing the sampling distribution of the standardized
estimator allows us to carry out hypothesis tests
Start with a null hypothesis
For example, H0 : βj = 0
If we accept the null, then we accept that xj has no effect on y,
controlling for the other x's
To perform our test we first need to form "the" t statistic
for β̂j: $t_{\hat\beta_j} \equiv \hat\beta_j / \mathrm{se}(\hat\beta_j)$
We will then use our t statistic along with a rejection rule
to determine whether to reject the null hypothesis, H0.
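The t statistic for H0 : βj = 0 is just the ratio of the estimate to its standard error. A minimal sketch follows; t_statistic is a hypothetical helper name, and the coefficient and standard error used below are made-up numbers, not results from the slides.

```python
def t_statistic(beta_hat, se_beta_hat):
    """t statistic for H0: beta_j = 0, given an estimate and its standard error."""
    if se_beta_hat <= 0:
        raise ValueError("the standard error must be positive")
    return beta_hat / se_beta_hat

# Made-up estimate and standard error, for illustration only.
t_example = t_statistic(0.0041, 0.0017)
print(round(t_example, 2))
```

The sign of the t statistic always matches the sign of the estimate, since the standard error is positive.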
t-test: One-Sided Alternatives
Besides our null, H0, we need an alternative
hypothesis, H1, and a significance level.
H1 may be one-sided, or two-sided
H1 : βj > 0 and H1 : βj < 0 are one-sided
H1 : βj ≠ 0 is a two-sided alternative.
If we want to have only a 5% probability of rejecting H0 if
it is really true, then we say our significance level is 5%.
t-test: One-Sided Alternatives (cont)
Having picked a significance level, α, we look up the
100(1 − α)th percentile in a t distribution with n − k − 1 df
and call this c, the critical value
We can reject the null hypothesis if the t statistic is
greater than the critical value
If the t statistic is less than the critical value then we fail
to reject the null.
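The critical value c is normally read from a t table or obtained from a statistics library (for example, SciPy's scipy.stats.t.ppf). To stay self-contained, the sketch below instead approximates it numerically: it integrates the t density with Simpson's rule and inverts the CDF by bisection. The function names are hypothetical, the search range assumes the critical value lies below 100, and the t statistic in the usage line is a made-up number.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using the symmetry of the density."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df):
    """Approximate 100(1 - alpha)th percentile of the t distribution (0 < alpha < 0.5)."""
    if not 0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    lo, hi = 0.0, 100.0  # assumed search range for c
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def reject_one_sided(t_stat, alpha, df):
    """Rejection rule for H0: beta_j = 0 against H1: beta_j > 0."""
    return t_stat > t_critical(alpha, df)

# Example: alpha = 0.05 with n - k - 1 = 28 degrees of freedom.
c = t_critical(0.05, 28)
print(round(c, 3), reject_one_sided(2.41, 0.05, 28))
```

In applied work a library routine or a printed table is the more reliable source for c; the numerical version here is only meant to show what "looking up the percentile" computes.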
One-Sided Alternatives (cont)
$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + u_i$
$H_0: \beta_j = 0$ vs $H_1: \beta_j > 0$
One-sided vs Two-sided
Because the t distribution is symmetric, testing
H1 : βj < 0 is straightforward. The critical value is just
the negative of before.
We can reject the null if the t statistic < −c, and if the t
statistic > −c then we fail to reject the null.
For a two-sided test, we set the critical value based on
α/2 and reject H0 : βj = 0 in favor of H1 : βj ≠ 0 if the absolute value of the
t statistic > c
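The three rejection rules can be collected in one small function. This is a sketch under stated assumptions: reject_null and its alternative labels ("greater", "less", "two-sided") are hypothetical names, and the caller is assumed to supply a positive critical value c already chosen for the test in question (based on α for a one-sided test, on α/2 for a two-sided test). The values 1.701 and 2.000 used below are the tabulated 5% critical values for a one-sided test with 28 df and a two-sided test with 60 df.

```python
def reject_null(t_stat, c, alternative="two-sided"):
    """Decide whether to reject H0: beta_j = 0 given a positive critical value c.

    alternative: "greater"   for H1: beta_j > 0  (reject if t > c)
                 "less"      for H1: beta_j < 0  (reject if t < -c)
                 "two-sided" for H1: beta_j != 0 (reject if |t| > c)
    """
    if c <= 0:
        raise ValueError("c must be a positive critical value")
    if alternative == "greater":
        return t_stat > c
    if alternative == "less":
        return t_stat < -c
    if alternative == "two-sided":
        return abs(t_stat) > c
    raise ValueError("unknown alternative: " + repr(alternative))

# Made-up t statistics, for illustration only.
print(reject_null(-1.8, 1.701, "less"))       # one-sided test, 28 df, 5% level
print(reject_null(-2.5, 2.000, "two-sided"))  # two-sided test, 60 df, 5% level
print(reject_null(1.5, 2.000, "two-sided"))
```

Keeping c as an input separates the decision rule from the lookup of the critical value, so the same function works with a table value or a computed one.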
Two-Sided Alternatives
$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + u_i$
$H_0: \beta_j = 0$ vs $H_1: \beta_j \neq 0$
Summary for H0 : βj = 0
Unless otherwise stated, the alternative is assumed to be
two-sided
If we reject the null, we typically say "xj is statistically
significant at the α% level"
If we fail to reject the null, we typically say "xj is
statistically insignificant at the α% level"