5-1bivar.
Unit 5 Correlation and Regression:
Examining and Modeling Relationships Between Variables
Chapters 8 - 12
Outline:
Two variables
Scatter Diagrams to display bivariate data
Correlation: Concept, Interpretation, Computation, Cautions
Regression Model: Using a LINE to describe the relation between two variables & for prediction
• Finding "the" line
• Interpreting its coefficients
Residuals, Prediction Errors
Extensions of Simple Linear Regression
r = .883 (all data)
r = .984 (without flower bonds)
(Siegel)
5-20bivar.
Interpretation of Empirical Association
1. Descriptive. Example: Height versus Weight
2. Causal. Example: Total Cost vs. Volume of Production
3. Nonsense. Example: Polio Incidence vs. Soft Drink Sales
5-21bivar.
Prediction Using Correlation
1. What is the best prediction of the dependent variable? What if the value of the independent variable is available?
2. What is the likely size of the prediction error?
Fundamental Principle of Prediction
1. Use the mean of the relevant group.
2. SD of the group gives the "likely size of error."
5-22bivar.
5-23bivar.
Diamond State Telephone Company
Demand for LINES versus Proposed MONTHLY charge per line ($)
[Scatter diagram: LINES (vertical axis, 100 to 250) versus MONTHLY (horizontal axis, 10 to 35)]
5-24bivar.
Look At The Vertical Strip Corresponding to the Given X Value
[Scatter diagram with axes X (horizontal) and Y (vertical)]
5-25bivar.
Graph of Averages
[Scatter diagram: LINES (vertical axis, 100 to 250) versus MONTHLY (horizontal axis, 10 to 35), with points marked x]
estimated LINES = 237.495 - 3.867 MONTHLY
5-26bivar.
5-27bivar.
Linearly Related Variables
The REGRESSION LINE is to a scatter diagram as the AVERAGE is to a list of numbers.
The regression line estimates the average values for the dependent variable, Y, corresponding to each value, x, of the independent variable.
5-28bivar.
Linearly Related Variables
If we have 2 variables, linearly related to one another, then knowing the value of one variable (for a particular individual) can help to estimate / predict the value of the other variable.
• If we know nothing about the value of the independent variable (X), then we estimate the value of the dependent variable to be the OVERALL AVERAGE of the dependent variable (Y).
• If we know that the independent variable (X) has a particular value for a given individual, then we can take a "more educated guess" at the value of the dependent variable (Y).
5-29bivar.
Regression and SD Lines
The REGRESSION LINE for modeling the relation between X (independent variable) and Y (dependent variable) passes through the POINT OF AVERAGES and has slope
r SD(Y) / SD(X)
That is, associated with each increase of one SD in X, there is an increase of r SDs in Y, on the average.
The SD LINE for modeling the relation between X (independent variable) and Y (dependent variable) passes through the POINT OF AVERAGES and has slope
SD(Y) / SD(X)
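As a rough illustration of how the two slopes differ, the sketch below plugs in the Diamond State summary statistics reported in this unit (LINES as Y, MONTHLY as X). The variable names are illustrative only. The slide gives the SD line slope as SD(Y) / SD(X); giving it the sign of r when r is negative is an assumption added here, not something the slide states.

```python
# Summary statistics from the Diamond State printout (LINES = Y, MONTHLY = X).
sd_x, sd_y, r = 8.344, 33.506, -0.963

# Regression line slope: r * SD(Y) / SD(X).
regression_slope = r * sd_y / sd_x

# SD line slope: SD(Y) / SD(X) as on the slide.
# Assumption (not on the slide): the slope takes the sign of r.
sign = 1 if r >= 0 else -1
sd_line_slope = sign * sd_y / sd_x

print(f"regression slope: {regression_slope:.3f}")
print(f"SD line slope:    {sd_line_slope:.3f}")
```

Because |r| is at most 1, the regression line is never steeper than the SD line; the two coincide only when r is exactly 1 or -1.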
5-30bivar.
Estimating the Intercept and Slope of the Regression Line
The REGRESSION LINE for modeling the relation between X (independent variable) and Y (dependent variable) is also known as the REGRESSION LINE for predicting Y from X, and has the form
Y = a + b x
= intercept + slope x.
Here,
b = slope
= r SD(Y) / SD(X)
a = intercept
= avg(Y) - b avg(X)
= avg(Y) - r [SD(Y) / SD(X)] avg(X)
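The two formulas can be checked against the fitted line quoted for the Diamond State example. The helper name `regression_line` is hypothetical, and the inputs are the rounded summary statistics from the printout in this unit, so small rounding differences from the slide's coefficients are to be expected.

```python
def regression_line(avg_x, sd_x, avg_y, sd_y, r):
    """Return (intercept a, slope b) of the regression line for predicting Y from X."""
    b = r * sd_y / sd_x        # slope = r SD(Y) / SD(X)
    a = avg_y - b * avg_x      # intercept = avg(Y) - b avg(X)
    return a, b

# Diamond State: X = MONTHLY, Y = LINES.
a, b = regression_line(avg_x=21.581, sd_x=8.344, avg_y=154.048, sd_y=33.506, r=-0.963)
print(f"estimated LINES = {a:.3f} + ({b:.3f}) MONTHLY")
```

This gives a slope of about -3.867 and an intercept of about 237.50, in line with the slide's "estimated LINES = 237.495 - 3.867 MONTHLY" up to rounding of r and the SDs.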
5-31bivar.
Prediction from a Regression Model
Predicted value of Y corresponding to a given value of X is
Y = a + b X
= ( avg(Y) - r [SD(Y) / SD(X)] avg(X) ) + ( r SD(Y) / SD(X) ) X
= avg(Y) - ( avg(X) - X ) ( r SD(Y) / SD(X) )
= avg(Y) - ( avg(X) - X ) ( slope )
5-32bivar.
5-33bivar.
TOTAL OBSERVATIONS: 21

                  LINES    MONTHLY
N OF CASES           21         21
MINIMUM         105.000     10.320
MAXIMUM         201.000     34.000
MEAN            154.048     21.581
VARIANCE       1122.648     69.623
STANDARD DEV     33.506      8.344

PEARSON CORRELATION MATRIX

              LINES    MONTHLY
LINES         1.000
MONTHLY      -0.963      1.000

NUMBER OF OBSERVATIONS: 21
5-34bivar.
Diamond State Questions
In the Diamond State Telephone Company example,
avg (LINES) = 154.048    SD (LINES) = 33.506
avg (MONTHLY) = 21.581   SD (MONTHLY) = 8.344
r = -0.963
What are the coordinates for the point of averages?
What is the slope of the regression line?
Suppose the MONTHLY charge was set at $25.00. What would you estimate to be the demand for # LINES from the 62 new businesses?
Suppose the MONTHLY charge was set at $15.00. What would you estimate to be the demand for # LINES from the 62 new businesses?
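One way to work these questions is to start at the point of averages and move along the regression line. The sketch below does that with the summary statistics given above; the function name `estimate_lines` is hypothetical, and the results are estimates from rounded inputs rather than official answers.

```python
avg_lines, sd_lines = 154.048, 33.506
avg_monthly, sd_monthly = 21.581, 8.344
r = -0.963

# The regression line passes through the point of averages (avg X, avg Y).
point_of_averages = (avg_monthly, avg_lines)

# Slope = r SD(Y) / SD(X).
slope = r * sd_lines / sd_monthly

def estimate_lines(monthly):
    """Estimated LINES: avg(Y) plus slope times the distance of X from avg(X)."""
    return avg_lines + slope * (monthly - avg_monthly)

print("point of averages:", point_of_averages)
print(f"slope: {slope:.3f} lines per dollar")
for charge in (25.00, 15.00):
    print(f"MONTHLY = ${charge:.2f} -> about {estimate_lines(charge):.1f} LINES")
```

A charge above the average MONTHLY value gives an estimate below avg(LINES), and a charge below average gives an estimate above it, as the negative r implies.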
5-35bivar.
Another Diamond State Question
Suppose the MONTHLY charge was set at $50.00. What would you estimate to be the demand for # LINES from the 62 new businesses?
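The line will happily produce a number for $50.00, but the printout shows that the observed MONTHLY charges ran only from 10.320 to 34.000. The sketch below returns the estimate together with a flag saying whether the charge lies inside that observed range; the function name and the flag are illustrative additions, not part of the original slides.

```python
avg_lines, sd_lines = 154.048, 33.506
avg_monthly, sd_monthly = 21.581, 8.344
r = -0.963
monthly_min, monthly_max = 10.320, 34.000   # observed range of MONTHLY

slope = r * sd_lines / sd_monthly

def estimate_with_range_check(monthly):
    """Return (estimated LINES, True if monthly is inside the observed range)."""
    estimate = avg_lines + slope * (monthly - avg_monthly)
    in_range = monthly_min <= monthly <= monthly_max
    return estimate, in_range

estimate, in_range = estimate_with_range_check(50.00)
print(f"estimate: {estimate:.1f} LINES, inside observed range: {in_range}")
```

The estimate comes out near 44 LINES, far below the smallest observed value of 105, and it rests entirely on the assumption that the linear pattern continues well beyond the data.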
For cases with income less than or equal to $15,000,
avg (Voluntary) = 6.376        SD (Voluntary) = 3.959
avg (Income) = $10,332.756     SD (Income) = $2,109.819
r = 0.896
Derive the equation for the regression line.
According to this linear model, what is the estimated value for "Voluntary" in a ZIP code area with Income $12,000?... with Income $9,500?
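The same slope and intercept formulas apply here, with Income as X and Voluntary as Y. The sketch below is one way to carry out the arithmetic from the summary statistics given; the variable and function names are illustrative, and the units of "Voluntary" are left as in the slide.

```python
avg_vol, sd_vol = 6.376, 3.959            # Y = Voluntary
avg_inc, sd_inc = 10332.756, 2109.819     # X = Income (dollars)
r = 0.896

slope = r * sd_vol / sd_inc               # change in Voluntary per dollar of Income
intercept = avg_vol - slope * avg_inc

def estimate_voluntary(income):
    """Estimated Voluntary for a ZIP code area with the given Income."""
    return intercept + slope * income

print(f"estimated Voluntary = {intercept:.3f} + {slope:.6f} Income")
for income in (12000, 9500):
    print(f"Income ${income:,} -> about {estimate_voluntary(income):.2f}")
```

Both incomes fall within the range these summary statistics describe (at most $15,000). The negative intercept is only the line's value at Income = 0, well outside the data, so it should not be read as a real-world quantity.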
5-48bivar.
5-49bivar.
Regression Effect
In virtually all test-retest situations, the bottom group on the first test will, on average, show some improvement on the second test, and the top group will, on average, fall back.
This is called the REGRESSION EFFECT.
The REGRESSION FALLACY is thinking that the regression effect must be due to something important, not just the spread of points around the line.
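A small simulation can make the regression effect visible. The model below is a hypothetical one chosen for illustration: each person's score is a stable ability plus independent luck on each test, with no real change between tests. All numbers (means, SDs, group size) are assumptions, not data from the slides.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical model: score = stable ability + independent luck on each test.
n = 10_000
ability = [random.gauss(100, 10) for _ in range(n)]
test1 = [a + random.gauss(0, 10) for a in ability]
test2 = [a + random.gauss(0, 10) for a in ability]

# Bottom and top 10% of people, ranked by their first-test score.
order = sorted(range(n), key=lambda i: test1[i])
bottom, top = order[: n // 10], order[-(n // 10):]

bottom_change = mean(test2[i] - test1[i] for i in bottom)
top_change = mean(test2[i] - test1[i] for i in top)

print(f"bottom group, average change: {bottom_change:+.1f}")
print(f"top group, average change:    {top_change:+.1f}")
```

Nothing in the model changes anyone's ability between tests, yet the bottom group improves on average and the top group falls back, purely because of the spread of points around the line.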
5-50bivar.
5-51bivar.
Residuals
Regression methods allow us to estimate the average value of the dependent variable for each value of the independent variable.
Individuals will differ somewhat from the regression estimates.
How much?
5-52bivar.
Country          Economic   Birth Rate
Algeria              2          48
Argentina           19          21
Denmark             34          14
Germany             40          11
Guatemala            8          41
India               12          37
Ireland             20          22
Jamaica             20          31
Japan               37          19
Philippines         19          42
United States       30          15
Russia              46          18
5-53bivar.
Residuals
Prediction error = actual - predicted
= vertical distance from the point to the regression line
5-54bivar.
Residuals for Economically Active Women and Crude Birth Rates
Country     Economic   Birth Rate   Regr. Estim.   Residual
Algeria         2          48           44.1          3.9
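The regression estimates and residuals can be reproduced from the twelve-country table, assuming those twelve rows are the complete data set used for the fit (the Algeria row is consistent with that). The sketch computes the least-squares slope directly from the data, which equals r SD(Y) / SD(X); the names are illustrative.

```python
from statistics import mean

# (Economic, Birth Rate) for each country, as listed in the table.
data = {
    "Algeria": (2, 48), "Argentina": (19, 21), "Denmark": (34, 14),
    "Germany": (40, 11), "Guatemala": (8, 41), "India": (12, 37),
    "Ireland": (20, 22), "Jamaica": (20, 31), "Japan": (37, 19),
    "Philippines": (19, 42), "United States": (30, 15), "Russia": (46, 18),
}

xs = [x for x, _ in data.values()]
ys = [y for _, y in data.values()]
avg_x, avg_y = mean(xs), mean(ys)

# Least-squares slope and intercept (the slope equals r SD(Y) / SD(X)).
slope = (sum((x - avg_x) * (y - avg_y) for x, y in zip(xs, ys))
         / sum((x - avg_x) ** 2 for x in xs))
intercept = avg_y - slope * avg_x

estimates = {c: intercept + slope * x for c, (x, _) in data.items()}
# Residual = actual - predicted.
residuals = {c: y - estimates[c] for c, (_, y) in data.items()}

for country in data:
    print(f"{country:<14}{estimates[country]:7.1f}{residuals[country]:7.1f}")
```

For Algeria this gives an estimate of about 44.1 and a residual of about 3.9, matching the row shown above. The residuals from a least-squares line always sum to zero, apart from floating-point rounding.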