1
Chapter 12 Simple Linear Regression
• Simple Linear Regression Model
• Least Squares Method
• Coefficient of Determination
• Model Assumptions
• Testing for Significance
• Using the Estimated Regression Equation for Estimation and Prediction
• Computer Solution
• Residual Analysis: Validating Model Assumptions
2
Simple Linear Regression Model
The equation that describes how y is related to x and an error term is called the regression model.
The simple linear regression model is:
$$y = \beta_0 + \beta_1 x + \epsilon$$
– $\beta_0$ and $\beta_1$ are called parameters of the model.
– $\epsilon$ is a random variable called the error term.
3
Simple Linear Regression Equation
The simple linear regression equation is:
$$E(y) = \beta_0 + \beta_1 x$$
• The graph of the regression equation is a straight line.
• $\beta_0$ is the y intercept of the regression line.
• $\beta_1$ is the slope of the regression line.
• $E(y)$ is the expected value of y for a given x value.
Least Squares Method
$$\min \sum (y_i - \hat{y}_i)^2$$
where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation
10
The Least Squares Method
Slope for the Estimated Regression Equation
$$b_1 = \frac{\sum x_i y_i - \left(\sum x_i \sum y_i\right)/n}{\sum x_i^2 - \left(\sum x_i\right)^2/n}$$
11
The Least Squares Method
y-Intercept for the Estimated Regression Equation
$$b_0 = \bar{y} - b_1 \bar{x}$$
where:
$x_i$ = value of independent variable for ith observation
$y_i$ = value of dependent variable for ith observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
$n$ = total number of observations
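The slope and intercept formulas above translate directly into a few lines of code. The following is a minimal sketch in plain Python; the function name `fit_line` and the small data set are hypothetical and chosen only for illustration (the points lie exactly on y = 1 + 2x, so the fit should recover that line).

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    # Slope: (sum(x*y) - sum(x)*sum(y)/n) / (sum(x^2) - (sum(x))^2/n)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y_bar - b1 * x_bar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)
```

The denominator is zero when every x value is identical, in which case no slope can be estimated; the sketch leaves that case to raise `ZeroDivisionError`.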
12
Example: Reed Auto Sales
Simple Linear Regression
Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.
13
Example: Reed Auto Sales
Simple Linear Regression

Number of TV Ads    Number of Cars Sold
       1                    14
       3                    24
       2                    18
       1                    17
       3                    27
14
Example: Reed Auto Sales
Slope for the Estimated Regression Equation
$$b_1 = \frac{220 - (10)(100)/5}{24 - (10)^2/5} = 5$$
y-Intercept for the Estimated Regression Equation
$$b_0 = 20 - 5(2) = 10$$
Estimated Regression Equation
$$\hat{y} = 10 + 5x$$
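The sums used in the slope calculation (220, 10, 100, 24) can be reproduced from the five Reed Auto observations. The sketch below is illustrative only; the variable names are not from the slides.

```python
# Reed Auto sample: number of TV ads (x) and number of cars sold (y)
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
n = len(tv_ads)

sum_x = sum(tv_ads)                                      # 10
sum_y = sum(cars_sold)                                   # 100
sum_xy = sum(x * y for x, y in zip(tv_ads, cars_sold))   # 220
sum_x2 = sum(x * x for x in tv_ads)                      # 24

# Slope: (220 - (10)(100)/5) / (24 - (10)^2/5) = 20 / 4
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
# Intercept: y_bar - b1 * x_bar = 20 - 5(2)
b0 = sum_y / n - b1 * sum_x / n

print(f"y_hat = {b0:g} + {b1:g}x")
```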
15
Example: Reed Auto Sales
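Scatter Diagram
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4), with the estimated regression line $\hat{y} = 10 + 5x$.]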
16
The Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
17
The Coefficient of Determination
The coefficient of determination is:
$$r^2 = \text{SSR}/\text{SST}$$
where:
SST = total sum of squares
SSR = sum of squares due to regression
18
Example: Reed Auto Sales
Coefficient of Determination
$$r^2 = \text{SSR}/\text{SST} = 100/114 = .8772$$
The regression relationship is very strong because 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
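The values SSR = 100 and SST = 114 follow from the Reed Auto data and the estimated regression equation. A short sketch, assuming the fitted coefficients b0 = 10 and b1 = 5 from the example (variable names are illustrative), computes all three sums of squares and checks the identity SST = SSR + SSE:

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated regression equation: y_hat = 10 + 5x

y_bar = sum(cars_sold) / len(cars_sold)
y_hat = [b0 + b1 * x for x in tv_ads]

sst = sum((y - y_bar) ** 2 for y in cars_sold)                # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                  # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))   # due to error

r_squared = ssr / sst

print(f"SST = {sst:g}, SSR = {ssr:g}, SSE = {sse:g}")
print(f"r^2 = {r_squared:.4f}")
```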
19
The Correlation Coefficient
Sample Correlation Coefficient
$$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$$
$$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$$
where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$
20
Example: Reed Auto Sales
Sample Correlation Coefficient
$$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$$
The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
$$r_{xy} = +\sqrt{.8772}$$
$$r_{xy} = +.9366$$
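As a cross-check, the sketch below computes the sample correlation coefficient two ways: from the sign of b1 and the coefficient of determination, as on the slide, and directly from the deviations of x and y about their means. The direct formula is the usual Pearson definition and is not shown in the slides; it is included here only as an independent check.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b1 = 5.0               # slope of the estimated regression equation
r_squared = 100 / 114  # SSR / SST from the example

# Slide formula: r_xy = (sign of b1) * sqrt(r^2)
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

# Direct check (Pearson definition, an addition not taken from the slides)
n = len(tv_ads)
x_bar = sum(tv_ads) / n
y_bar = sum(cars_sold) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_ads, cars_sold))
sxx = sum((x - x_bar) ** 2 for x in tv_ads)
syy = sum((y - y_bar) ** 2 for y in cars_sold)
r_direct = sxy / math.sqrt(sxx * syy)

print(f"{r_from_r2:.4f} {r_direct:.4f}")
```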
21
Model Assumptions
Assumptions About the Error Term $\epsilon$
1. The error $\epsilon$ is a random variable with mean of zero.
2. The variance of $\epsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
3. The values of $\epsilon$ are independent.
4. The error $\epsilon$ is a normally distributed random variable.
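One way to see what these assumptions imply is to simulate data that satisfies them. The sketch below is purely illustrative: the parameter values (β0 = 10, β1 = 5, σ = 2), the seed, and the function name are hypothetical and are not taken from the slides. For each x it draws independent normal errors with mean zero and the same standard deviation, so the simulated y values should average close to β0 + β1x with a spread close to σ.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same draws

# Hypothetical model parameters, chosen only for this illustration
BETA0, BETA1, SIGMA = 10.0, 5.0, 2.0


def simulate_y(x, size):
    """Draw `size` values of y = beta0 + beta1*x + epsilon, epsilon ~ N(0, sigma^2)."""
    return [BETA0 + BETA1 * x + random.gauss(0.0, SIGMA) for _ in range(size)]


summary = {}
for x in (1, 2, 3):
    ys = simulate_y(x, 10_000)
    # Mean should be near E(y) = beta0 + beta1*x; spread near sigma for every x
    summary[x] = (statistics.mean(ys), statistics.stdev(ys))
    print(x, round(summary[x][0], 2), round(summary[x][1], 2))
```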
22
Testing for Significance
To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
Two tests are commonly used:
– t Test
– F Test
Both tests require an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model.
23
Testing for Significance
An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.
$$s^2 = \text{MSE} = \text{SSE}/(n-2)$$
where:
$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$$
24
Testing for Significance
An Estimate of $\sigma$
– To estimate $\sigma$ we take the square root of $s^2$, the estimate of $\sigma^2$.
– The resulting $s$ is called the standard error of the estimate.
$$s = \sqrt{\text{MSE}} = \sqrt{\frac{\text{SSE}}{n-2}}$$
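For the Reed Auto example, SSE = SST - SSR = 114 - 100 = 14 with n = 5 observations. A minimal sketch of the two estimates (variable names are illustrative):

```python
import math

sse = 14.0  # SST - SSR = 114 - 100 for the Reed Auto data
n = 5       # number of observations

mse = sse / (n - 2)  # s^2, the estimate of sigma^2
s = math.sqrt(mse)   # standard error of the estimate

print(f"s^2 = MSE = {mse:.3f}")
print(f"s = {s:.3f}")
```

The divisor is n - 2 rather than n because two parameters, b0 and b1, were estimated from the data before the residuals were computed.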
25
Testing for Significance: t Test
Hypotheses
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \neq 0$
Test Statistic
$$t = \frac{b_1}{s_{b_1}}$$
where
$$s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$
26
Testing for Significance: t Test
Rejection Rule
Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where: $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom
27
Example: Reed Auto Sales
t Test
– Hypotheses
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \neq 0$
– Rejection Rule
For $\alpha = .05$ and d.f. = 3, $t_{.025} = 3.182$
Reject $H_0$ if $t > t_{.025} = 3.182$
28
Example: Reed Auto Sales
t Test
• Test Statistic
$t = 5/1.08 = 4.63$
• Conclusion
$t = 4.63 > 3.182$, so reject $H_0$
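The test statistic 4.63 can be traced back to the data. The sketch below assumes SSE = 14 and b1 = 5 from the earlier slides and takes the critical value 3.182 from the slide itself rather than computing it; variable names are illustrative.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
n = len(tv_ads)
b1 = 5.0         # estimated slope
sse = 14.0       # sum of squares due to error
t_crit = 3.182   # t_.025 with n - 2 = 3 d.f., as quoted on the slide

s = math.sqrt(sse / (n - 2))                 # standard error of the estimate
x_bar = sum(tv_ads) / n
sxx = sum((x - x_bar) ** 2 for x in tv_ads)  # sum of (x_i - x_bar)^2
s_b1 = s / math.sqrt(sxx)                    # estimated standard deviation of b1

t_stat = b1 / s_b1
reject_h0 = t_stat < -t_crit or t_stat > t_crit  # two-tailed rejection rule

print(f"s_b1 = {s_b1:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```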
29
Confidence Interval for $\beta_1$
We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.
$H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.
30
Confidence Interval for $\beta_1$
The form of a confidence interval for $\beta_1$ is:
$$b_1 \pm t_{\alpha/2} s_{b_1}$$
where:
$b_1$ is the point estimate
$t_{\alpha/2} s_{b_1}$ is the margin of error
$t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom
31
Example: Reed Auto Sales
Rejection Rule
Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
95% Confidence Interval for $\beta_1$
$b_1 \pm t_{\alpha/2} s_{b_1}$ = 5 +/- 3.182(1.08) = 5 +/- 3.44
or 1.56 to 8.44
Conclusion
0 is not included in the confidence interval. Reject $H_0$.
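The interval arithmetic is short enough to verify in a few lines. The sketch below uses the rounded values shown on the slide (b1 = 5, s_b1 = 1.08, t = 3.182); variable names are illustrative.

```python
b1 = 5.0        # point estimate of beta_1
s_b1 = 1.08     # estimated standard deviation of b1 (rounded, from the slide)
t_crit = 3.182  # t_.025 with 3 degrees of freedom (from the slide)

margin = t_crit * s_b1  # margin of error
lower, upper = b1 - margin, b1 + margin

# H0: beta_1 = 0 is rejected when 0 falls outside the interval
reject_h0 = not (lower <= 0.0 <= upper)

print(f"{b1:g} +/- {margin:.2f} -> {lower:.2f} to {upper:.2f}; reject H0: {reject_h0}")
```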
32
Testing for Significance: F Test
Hypotheses
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \neq 0$
Test Statistic
$F = \text{MSR}/\text{MSE}$
33
Testing for Significance: F Test
Rejection Rule
Reject $H_0$ if $F > F_\alpha$
where: $F_\alpha$ is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator
34
Example: Reed Auto Sales
F Test
• Hypotheses
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \neq 0$
• Rejection Rule
For $\alpha = .05$ and d.f. = 1, 3: $F_{.05} = 10.13$
Reject $H_0$ if $F > F_{.05} = 10.13$.
$F = 21.43 > 10.13$, so we reject $H_0$.
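The value F = 21.43 follows from SSR = 100 and SSE = 14. The sketch below assumes those sums of squares and takes the critical value 10.13 from the slide. It also checks a standard property of simple linear regression that the slides do not state: with one independent variable, F equals the square of the t statistic.

```python
ssr, sse = 100.0, 14.0  # sums of squares from the Reed Auto example
n = 5                   # number of observations
f_crit = 10.13          # F_.05 with 1 and 3 d.f., as quoted on the slide

msr = ssr / 1           # mean square regression (1 numerator d.f.)
mse = sse / (n - 2)     # mean square error (n - 2 denominator d.f.)
f_stat = msr / mse

# Cross-check (not from the slides): F = t^2, where t = b1 / s_b1
# and s_b1^2 = MSE / sum((x_i - x_bar)^2)
b1, sxx = 5.0, 4.0
t_squared = b1 ** 2 * sxx / mse

print(f"F = {f_stat:.2f}, t^2 = {t_squared:.2f}, reject H0: {f_stat > f_crit}")
```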
36
Some Cautions about the Interpretation of Significance Tests
Rejecting $H_0$: $\beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
Just because we are able to reject $H_0$: $\beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.
37
Using the Estimated Regression Equation for Estimation and Prediction
Confidence Interval Estimate of $E(y_p)$
$$\hat{y}_p \pm t_{\alpha/2} s_{\hat{y}_p}$$
Prediction Interval Estimate of $y_p$
$$\hat{y}_p \pm t_{\alpha/2} s_{\text{ind}}$$
where: confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom
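The two standard errors in these intervals are not defined in this part of the transcript. For reference, the usual textbook expressions are given below; they are supplied here as an assumption rather than taken from the slides, with $x_p$ denoting the given value of x and $s$ the standard error of the estimate.

```latex
s_{\hat{y}_p} = s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}
\qquad
s_{\text{ind}} = s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}
```

The extra 1 under the second square root accounts for the variability of an individual observation about its mean, which is why a prediction interval is always wider than the corresponding confidence interval.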
38
Example: Reed Auto Sales
Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:
$\hat{y} = 10 + 5(3) = 25$ cars
39
Example: Reed Auto Sales
Confidence Interval for $E(y_p)$
The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
$25 \pm 4.61$ = 20.39 to 29.61 cars
40
Example: Reed Auto Sales
Prediction Interval for $y_p$
The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
$25 \pm 8.28$ = 16.72 to 33.28 cars
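Both margins of error (4.61 and 8.28) can be reproduced from the data. The sketch below assumes the standard textbook standard-error formulas, which this transcript does not show: s·sqrt(1/n + (x_p - x̄)²/Σ(x_i - x̄)²) for the mean and s·sqrt(1 + 1/n + (x_p - x̄)²/Σ(x_i - x̄)²) for an individual value. The critical value 3.182 is taken from the earlier slide, and variable names are illustrative.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
n = len(tv_ads)
b0, b1 = 10.0, 5.0  # estimated regression equation
sse = 14.0          # sum of squares due to error
t_crit = 3.182      # t_.025 with 3 degrees of freedom (from the slides)
x_p = 3             # number of TV ads for the estimate

s = math.sqrt(sse / (n - 2))
x_bar = sum(tv_ads) / n
sxx = sum((x - x_bar) ** 2 for x in tv_ads)

y_p = b0 + b1 * x_p  # point estimate
# Assumed textbook formulas (not shown in the slides)
h = 1 / n + (x_p - x_bar) ** 2 / sxx
s_mean = s * math.sqrt(h)      # std. error for the mean of y at x_p
s_ind = s * math.sqrt(1 + h)   # std. error for an individual y at x_p

ci_margin = t_crit * s_mean
pi_margin = t_crit * s_ind

print(f"point estimate: {y_p:g}")
print(f"confidence interval: {y_p - ci_margin:.2f} to {y_p + ci_margin:.2f}")
print(f"prediction interval: {y_p - pi_margin:.2f} to {y_p + pi_margin:.2f}")
```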