Page 1: Simple Linear Regression Estimation and Properties

Simple Linear Regression: Estimation and Properties

Page 2: Simple Linear Regression Estimation and Properties

Outline
• Review of the Reading
• Estimate parameters using OLS
• Other features of OLS
  – Numerical Properties of OLS
  – Assumptions of OLS
  – Goodness of Fit

Page 3: Simple Linear Regression Estimation and Properties

Checking Understanding
• What is the best estimate of E(Y)?
• How would we find E(Y|Xi)?
• Y = B1 + B2X + u
  – What is B1?
  – What is B2?
  – What is u?

Page 4: Simple Linear Regression Estimation and Properties

Checking Understanding
• What is a z-score?
• What is the mean of z(x)?
• What is the standard deviation of z(x)?

$z(x) = \frac{x - \bar{x}}{\sigma_x}$

Page 5: Simple Linear Regression Estimation and Properties

Checking Understanding
• What is a z-score?
• Correlation:

$z(x) = \frac{x - \bar{x}}{\sigma_x}$

$r = \frac{\sum z_x z_y}{n - 1}$

Page 6: Simple Linear Regression Estimation and Properties

Checking Understanding
• Correlation:
• The regression line in z-scores:

$r = \frac{\sum z_x z_y}{n - 1}$

$z_y = m z_x$

Page 7: Simple Linear Regression Estimation and Properties

Checking Understanding
• Correlation:
• The regression line in z-scores:
• Can also be written as:

$r = \frac{\sum z_x z_y}{n - 1}$

$z_y = m z_x$

$z_y = r z_x$

Page 8: Simple Linear Regression Estimation and Properties

Checking Understanding
• Correlation:
• The regression line in z-scores:
• Can also be written as:
• Remember:

$r = \frac{\sum z_x z_y}{n - 1}$

$z_y = m z_x$

$z_y = r z_x$

$m = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}$
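To make the link between these formulas concrete, here is a small numpy sketch (the data are made up purely for illustration, not taken from the slides). It checks that r computed as the average product of z-scores matches the usual correlation, and that the slope m = cov(X, Y)/var(X) is just r rescaled by the ratio of standard deviations; in z-score units, where both standard deviations are 1, the slope reduces to r itself.

import numpy as np

# Made-up example data, used only to check the identities on these slides.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

# z-scores: subtract the mean and divide by the (n-1) sample standard deviation.
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

# Correlation as the average product of z-scores: r = sum(zx*zy) / (n - 1).
r = np.sum(zx * zy) / (len(x) - 1)

# Slope in the original units: m = cov(X, Y) / var(X).
m = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print(r)                                   # matches np.corrcoef(x, y)[0, 1]
print(m)                                   # slope in original units
print(r * y.std(ddof=1) / x.std(ddof=1))   # equals m: r rescaled out of z-units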

Page 9: Simple Linear Regression Estimation and Properties

And What is Covariance?

• Cov(X,Y) = E[(X − E[X])(Y − E[Y])]
• Cov(X,Y) = E[XY] − E[X]E[Y]
• Covariance is positive if X and Y are both below their means or both above their means. It is negative if X is above its mean while Y is below its mean, or vice versa.

$\sigma_{xy} = \mathrm{cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$

Page 10: Simple Linear Regression Estimation and Properties

And What is Covariance?

• Cov(X,Y) = E[(X − E[X])(Y − E[Y])]
• Cov(X,Y) = E[XY] − E[X]E[Y]
• Covariance is positive if X and Y are both below their means or both above their means. It is negative if X is above its mean while Y is below its mean, or vice versa.
• But it has units. It is easy to interpret the sign, but hard to interpret the number.

$\sigma_{xy} = \mathrm{cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)]$
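Because covariance carries the units of both variables, rescaling either one changes its value, while the correlation does not move. A small sketch with made-up numbers (loosely echoing the spending-and-votes example on the next slide) illustrates this:

import numpy as np

# Made-up spending (in millions) and vote counts, only to illustrate the units problem.
spend_millions = np.array([0.0, 3.0, 5.0, 8.0, 10.0])
votes = np.array([5000.0, 15000.0, 24000.0, 38000.0, 50000.0])

cov_millions = np.cov(spend_millions, votes, ddof=1)[0, 1]
corr = np.corrcoef(spend_millions, votes)[0, 1]

# Re-express spending in dollars: the covariance is multiplied by 1,000,000,
# but the correlation (a unit-free number between -1 and 1) is unchanged.
spend_dollars = spend_millions * 1_000_000
cov_dollars = np.cov(spend_dollars, votes, ddof=1)[0, 1]

print(cov_millions, cov_dollars)                        # differ by a factor of 1,000,000
print(corr, np.corrcoef(spend_dollars, votes)[0, 1])    # identical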

Page 11: Simple Linear Regression Estimation and Properties

Total Population of Money Spent and the Number of Votes

[Figure: "Effect of Money on Votes", a scatterplot of Number of Votes (0 to 50,000) against Amount Spent in millions (0 to 10).]

Page 12: Simple Linear Regression Estimation and Properties

What we can see from the graph

• We can see the average value of Y for each value of X
  – These are the conditional expected values E(Y|X)
• If we join the conditional values of Y given each value of X, we get the Population Regression Line

Page 13: Simple Linear Regression Estimation and Properties

Population Regression Function and the Linear Model

• E(Y|Xi) = f(Xi)
  – The expected value of the distribution of Y, given Xi, is functionally related to Xi
• E(Y|Xi) = B1 + B2Xi

Page 14: Simple Linear Regression Estimation and Properties

Two interpretations of linearity
• Linear in Variables
  – Which of the following is linear in variables, and why?
    • E(Y|Xi) = B1 + B2Xi²
    • E(Y|Xi) = B1 + B2Xi
• Linear in Parameters
  – Which of the following is linear in parameters, and why?
    • E(Y|Xi) = B1 + B2Xi²
    • E(Y|Xi) = B1 + B2²Xi
• Why Should We Care?
  – Linear regression requires linearity in parameters only

Page 15: Simple Linear Regression Estimation and Properties

Straight Line

Y = B1 + B2Xi

Page 16: Simple Linear Regression Estimation and Properties

Quadratic

Y = B1 + B2X + B3X²

Page 17: Simple Linear Regression Estimation and Properties

Adding in the Stochastic Term

• Yi=E(Y|Xi) + ui

• Systematic Component: E(Y|Xi)
• Stochastic Disturbance: ui

Page 18: Simple Linear Regression Estimation and Properties

The Sample Regression Function (SRF)

• Because of sampling fluctuation, any sample will only approximate our true Population Regression Function

• Stochastic form of the SRF: Yi = b1 + b2Xi + ei, where b1 and b2 are the sample estimates of B1 and B2 and ei is the residual

Page 19: Simple Linear Regression Estimation and Properties

Primary Goal in Regression Analysis

• We want to estimate the PRF
  – Yi = B1 + B2Xi + ui

• On the basis of the SRF

Page 20: Simple Linear Regression Estimation and Properties

One method
• Choose the Sample Regression Function such that the sum of the residuals is as small as possible

Page 21: Simple Linear Regression Estimation and Properties

Illustration and Problem

[Figure: scatterplot of Y against X with a fitted line and four residuals labeled u1 = 10, u2 = −2, u3 = 2, u4 = −10. The large positive and negative residuals cancel, so their sum is zero even though the individual errors are large.]

Page 22: Simple Linear Regression Estimation and Properties

Alternative Method
• Ordinary Least Squares (OLS) is a method of finding the linear model which minimizes the sum of the squared errors.
  – Example: (10)² + (−2)² + (2)² + (−10)² = 208
• Under the classical assumptions, this method gives the best linear unbiased estimator (BLUE)

Page 23: Simple Linear Regression Estimation and Properties

Good Spot for a break

Page 24: Simple Linear Regression Estimation and Properties

Minimizing the Sum of Squares
• Our goal is to minimize the sum of the squared errors.
• Since we have two unknowns, B1 and B2, we need to take the partial derivatives of the following equation (written out below):
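In the notation of these slides, writing b1 and b2 for the sample estimates of B1 and B2 (as in the SRF above), the quantity being minimized is the sum of squared residuals:

$$S(b_1, b_2) = \sum_{i=1}^{n} \hat{u}_i^{\,2} = \sum_{i=1}^{n} \left(Y_i - b_1 - b_2 X_i\right)^2$$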

Page 25: Simple Linear Regression Estimation and Properties

Partial Derivatives for B’s

• We start with our original equation:

• Now we take the partial derivatives
  – First equation is the partial derivative with respect to B1
  – Second equation is with respect to B2
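As a sketch of that step, differentiating the least-squares objective written above with respect to each coefficient gives:

$$\frac{\partial S}{\partial b_1} = -2 \sum_{i=1}^{n} \left(Y_i - b_1 - b_2 X_i\right), \qquad \frac{\partial S}{\partial b_2} = -2 \sum_{i=1}^{n} X_i\left(Y_i - b_1 - b_2 X_i\right)$$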

Page 26: Simple Linear Regression Estimation and Properties

Set Equal to Zero
• Last set of equations:
• Next: set each partial derivative equal to zero (these are the first-order conditions)

Page 27: Simple Linear Regression Estimation and Properties

The Normal Equations
• Last:
• Divide both equations by −2
• Multiply through
• Separate summation terms and rearrange (the result is shown below):
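Carrying out those steps gives the two normal equations (a standard result, written here in the slides' notation):

$$\sum_{i=1}^{n} Y_i = n\,b_1 + b_2 \sum_{i=1}^{n} X_i$$

$$\sum_{i=1}^{n} X_i Y_i = b_1 \sum_{i=1}^{n} X_i + b_2 \sum_{i=1}^{n} X_i^2$$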

Page 28: Simple Linear Regression Estimation and Properties

Rewriting the Equation
• Last Equation:

• We can rewrite

Page 29: Simple Linear Regression Estimation and Properties

Solving Equation
• We have two equations with two unknowns, for which we can use algebra
• Multiply first equation by the sum of Xi and second by n
• End up with…

Page 30: Simple Linear Regression Estimation and Properties

Subtracting the first equation from the second and rearranging gives (see below):
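In raw-sum form, that elimination yields the familiar slope formula:

$$b_2 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2}$$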

Page 31: Simple Linear Regression Estimation and Properties

Last step
• Last equation
• Multiply the numerator and denominator by 1/n… recall that (1/n)ΣXi = X̄ and (1/n)ΣYi = Ȳ
• End up with the deviation form shown below
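That simplification gives the slope in deviation form, which is just the sample covariance over the sample variance:

$$b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}$$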

Page 32: Simple Linear Regression Estimation and Properties

We can now solve for B1

• If we go back to the first normal equation:
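Dividing the first normal equation by n and substituting the slope gives the intercept:

$$b_1 = \bar{Y} - b_2 \bar{X}$$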

Page 33: Simple Linear Regression Estimation and Properties

What Does B2 Mean?

• Equation for B2 may not seem to make intuitive sense at first

• But if we break it down into pieces we can begin to see the logic

Page 34: Simple Linear Regression Estimation and Properties

In sum…

• If the changes in X are EQUAL to the changes in Y, then B2 = 1

• If the changes in Y are LARGER than the changes in X, then B2 > 1

• If the changes in Y are SMALLER than the changes in X, then B2 < 1

Page 35: Simple Linear Regression Estimation and Properties

Let’s Do An Example!

Page 36: Simple Linear Regression Estimation and Properties

Calculating a and b
• Mean of X is 4
• Mean of Y is 12.71429

Page 37: Simple Linear Regression Estimation and Properties

Calculating B1 and B2

Page 38: Simple Linear Regression Estimation and Properties

Which Looks Like… This!

[Figure: "Regression of Y on X", showing the fitted line with Y running from 0 to 30 and X from 0 to 8.]

Page 39: Simple Linear Regression Estimation and Properties

Practice Problem
• We have a sample of the amount of money each candidate spent in a state (in millions) and the percentage of the vote they received.
• Calculate the regression line and interpret it.

Page 40: Simple Linear Regression Estimation and Properties

Data

State   % vote   Money spent (millions)
CA      40       10
FL      35       12
GA      15       4
MO      20       6
OH      40       11
VT      25       8
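A minimal sketch of how the practice problem could be worked, assuming the percentage of the vote is the dependent variable (Y) and money spent is the regressor (X), mirroring the earlier money-and-votes example; the printed results are approximate:

import numpy as np

# Practice-problem data from the table above (order: CA, FL, GA, MO, OH, VT).
money = np.array([10.0, 12.0, 4.0, 6.0, 11.0, 8.0])    # X: money spent, in millions
vote = np.array([40.0, 35.0, 15.0, 20.0, 40.0, 25.0])  # Y: % of the vote

x_bar, y_bar = money.mean(), vote.mean()

# OLS in deviation form: b2 = cov(X, Y) / var(X), b1 = Ybar - b2 * Xbar.
b2 = np.sum((money - x_bar) * (vote - y_bar)) / np.sum((money - x_bar) ** 2)
b1 = y_bar - b2 * x_bar

print(b1, b2)   # roughly 1.88 and 3.21: each additional million spent is associated
                # with about 3.2 more percentage points of the vote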

Page 41: Simple Linear Regression Estimation and Properties

Numerical Properties of OLS
• Those properties that result from the method of OLS
  – Expressed from observable quantities of X and Y
  – Point estimator for the B’s
  – Sample regression line passes through the sample means of Y and X
  – Sum of residuals is zero
  – Residuals are uncorrelated with the predicted Yi
  – Residuals are uncorrelated with Xi
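As a quick numerical check of the last few properties, a short self-contained sketch that refits the practice-problem line from above and verifies them:

import numpy as np

# Practice-problem data again (X = money spent in millions, Y = % of the vote).
x = np.array([10.0, 12.0, 4.0, 6.0, 11.0, 8.0])
y = np.array([40.0, 35.0, 15.0, 20.0, 40.0, 25.0])

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()

y_hat = b1 + b2 * x   # predicted values
e = y - y_hat         # residuals

print(np.isclose(e.sum(), 0.0))                    # sum of residuals is zero
print(np.isclose(np.sum(e * x), 0.0))              # residuals uncorrelated with Xi
print(np.isclose(np.sum(e * y_hat), 0.0))          # residuals uncorrelated with predicted Yi
print(np.isclose(y.mean(), b1 + b2 * x.mean()))    # line passes through the sample means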

Page 42: Simple Linear Regression Estimation and Properties

Assumptions of Classical Linear Regression

• A1: Linear Regression Model - linear in parameters

• A2: X values are fixed in repeated sampling.

• A3: Zero mean value of the disturbance term ui

• A4: Homoskedasticity or Equal Variance of ui.

Page 43: Simple Linear Regression Estimation and Properties

More Assumptions
• A5: No autocorrelation between disturbances

• A6: Zero covariance between ui and Xi

• A7: Number of observations n is greater than the number of parameters to be estimated

• A8: Variability in X values

Page 44: Simple Linear Regression Estimation and Properties

More Assumptions
• A9: Regression model is correctly specified.
  – The correct variables are included
  – We have the correct functional form
  – Correct assumptions about the probability distributions of Yi, Xi and ui
• A10: With multiple regression, we add the assumption of no perfect multicollinearity

Page 45: Simple Linear Regression Estimation and Properties

How “good” does it fit?

• To measure “reduction in errors” we need a benchmark for comparison.

• The mean of the dependent variable is a relevant and tractable benchmark for comparing predictions.

• The mean of Y represents our “best guess” at the value of Yi absent other information.

Page 46: Simple Linear Regression Estimation and Properties

Sums of Squares

• This gives us the following 'sum-of-squares' measures:

• Total Variation = Explained Variation + Unexplained Variation
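In symbols, with Ŷi for the fitted values and ûi = Yi − Ŷi for the residuals, the decomposition referred to above can be written as:

$$\underbrace{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}_{\text{TSS (total)}} \;=\; \underbrace{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}_{\text{ESS (explained)}} \;+\; \underbrace{\sum_{i=1}^{n} \hat{u}_i^{\,2}}_{\text{USS (unexplained)}}$$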

Page 47: Simple Linear Regression Estimation and Properties

How well does our model perform?

• R squared statistic
  – = (TSS − USS) / TSS
  – = ESS / TSS
• Bounded between 0 and 1
• Higher values indicate a better fit
• Lower values mean more of the variance is unexplained than explained
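To tie goodness of fit back to the practice problem, a short sketch (same assumed data and fit as above) that computes the R squared statistic:

import numpy as np

# Practice-problem data (X = money spent in millions, Y = % of the vote).
x = np.array([10.0, 12.0, 4.0, 6.0, 11.0, 8.0])
y = np.array([40.0, 35.0, 15.0, 20.0, 40.0, 25.0])

b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()
y_hat = b1 + b2 * x

tss = np.sum((y - y.mean()) ** 2)   # total variation
uss = np.sum((y - y_hat) ** 2)      # unexplained variation (squared residuals)
ess = tss - uss                     # explained variation

r_squared = ess / tss               # equivalently (tss - uss) / tss
print(r_squared)                    # about 0.86 for this data: most of the variation
                                    # in vote share is explained by spending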