Dec 03, 2014
IBS Statistics Year 1
Dr. Ning DING [email protected]
What are we going to learn?
• Review
• Chapter 12: Simple Regression and Correlation
– dependent / independent variables
– scatter diagrams
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation
• Exercises
Review
Find the interquartile range:
1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038, 2047, 2054, 2097, 2205, 2287, 2311, 2406
Interquartile Range = Q3 - Q1 = 2205 - 1721 = 484
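The location rule that appears in the Excel review, L = (n + 1) × p, can be sketched in Python as follows. This is only a minimal sketch: the function name `quartile` is an assumption made for illustration, and it assumes sorted data and linear interpolation whenever L is not a whole number.

```python
def quartile(sorted_values, fraction):
    """Value at location L = (n + 1) * fraction in already-sorted data.

    Assumes 1 <= L; interpolates linearly between neighbouring values
    when L is not a whole number (an assumption, not taken from the slides).
    """
    n = len(sorted_values)
    location = (n + 1) * fraction      # 1-based position in the sorted data
    lower = int(location)              # whole part of the position
    weight = location - lower          # fractional part of the position
    upper = min(lower + 1, n)          # do not run past the last value
    low_value = sorted_values[lower - 1]
    high_value = sorted_values[upper - 1]
    return low_value + weight * (high_value - low_value)


data = [1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038,
        2047, 2054, 2097, 2205, 2287, 2311, 2406]

q1 = quartile(data, 0.25)   # location (15 + 1) * 0.25 = 4
q3 = quartile(data, 0.75)   # location (15 + 1) * 0.75 = 12
iqr = q3 - q1
print(q1, q3, iqr)
```

With 15 values the locations are whole numbers (4 and 12), so no interpolation is needed and the result agrees with the 484 found by hand.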
Review EXCEL Lesson
L=(8+1)*25%=2.25
Q1=133.5
L=(8+1)*75%=6.75
Q3=274.5
Interquartile Range=274.5-133.5=141
Review
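Boxplot
Data: 1, 2, 2, 4, 5, 7, 8, 9, 12
Median = 5 (lower half: 1, 2, 2, 4; upper half: 7, 8, 9, 12)
Quartiles: Q1 = 2, Q3 = 8.5
Interquartile Range = Q3 - Q1 = 8.5 - 2 = 6.5
Other measures of position: deciles (1st D, 9th D) and percentiles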
http://cnx.org/content/m11192/latest/
How to interpret?
The distribution is skewed to __________ because the mean is __________ the median.
Answer: the right; larger than
Boxplot values: minimum = € 20, Q1 = € 250, median = € 350, Q3 = € 850, maximum = € 2000; mean = € 450
Review
Data set 1: 0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0
Mean > Median: positively skewed
Data set 2: 2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0
Mean < Median: negatively skewed
http://qudata.com/online/statcalc/
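The mean-versus-median comparison can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module; the helper name `skew_direction` is an assumption made for illustration, and the rule it applies is only the rough mean/median rule from the slides, not a formal skewness coefficient.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify skew with the rule of thumb: compare the mean with the median."""
    m = mean(values)
    med = median(values)
    if m > med:
        return "positively skewed"
    if m < med:
        return "negatively skewed"
    return "zero skewness"


set_1 = [0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0]
set_2 = [2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0]

for name, values in (("set 1", set_1), ("set 2", set_2)):
    print(name, round(mean(values), 2), median(values), skew_direction(values))
```

For the first set the mean (about 1.69) is pulled above the median (1.5) by the single large value 4.0; for the second set the mean (about 4.1) is pulled below the median (4.2) by the small value 2.0.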
Review
Zero skewness: mode = median = mean
This means that the data are symmetrically distributed.
Chapter 12: Simple Regression and Correlation
Regression and Correlation Analyses
– How to determine both the nature and the strength of a relationship between variables.
Scatter Diagram:
Describing Relationship between Two Variables – Scatter Diagram Examples
Positive correlation
Negative correlation
No correlation
Scatter Diagrams:
• Patterns indicating that the variables are related
• If related, we can describe the relationship
Strong & positive correlation
Strong & negative correlation
Weak & positive correlation
Weak & negative correlation
No correlation
Variables:
– Independent variables: known
– Dependent variables: to predict
(Scatter diagram axes: independent variable and dependent variable)
Correlation & Cause and Effect?
• The relationships found by regression are relationships of association,
• not necessarily of cause and effect.
Least-squares estimating equation:
• The dependent variable Y is determined by the independent variable X
Ŷ = a + bX
where a is the Y-intercept and b is the slope of the line.
(Axes: X = independent variable, Y = dependent variable)
Least-squares estimating equation:
$$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$$
$$\hat{Y} = a + bX \qquad a = \bar{Y} - b\bar{X}$$
Example: What is the relationship between the age of a truck (X, in years) and the annual repair expense (Y, in hundreds of dollars)?
n = 4, mean of X = 3, mean of Y = 6, ΣXY = 78, ΣX2 = 44
b = (78 - 4*3*6) / (44 - 4*9) = 6 / 8 = 0.75
a = 6 - 0.75*3 = 3.75
Ŷ = 3.75 + 0.75 X
If the city has a truck that is 4 years old, the director could use the equation to predict $675 annually in repairs:
Ŷ = 3.75 + 0.75 * 4 = 6.75
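The truck calculation can be reproduced from the summary figures alone. The sketch below is illustrative rather than the textbook's own procedure: the function name `least_squares_from_sums` is an assumption, and the inputs (n = 4, means 3 and 6, ΣXY = 78, ΣX2 = 44) are the values read off the slide's substitution.

```python
def least_squares_from_sums(n, mean_x, mean_y, sum_xy, sum_x2):
    """Return (a, b) for the line Y-hat = a + b*X from summary statistics."""
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)  # slope
    a = mean_y - b * mean_x                                          # intercept
    return a, b


# Truck example: X = age in years, Y = repair expense in hundreds of dollars
a, b = least_squares_from_sums(n=4, mean_x=3, mean_y=6, sum_xy=78, sum_x2=44)
predicted = a + b * 4        # predicted expense for a 4-year-old truck

print(a, b, predicted)       # prints: 3.75 0.75 6.75
```

Because Y is measured in hundreds of dollars, the prediction 6.75 corresponds to $675 per year.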
Example:
• To find the simple linear regression of Personal Income (X) and Auto Sales (Y). If X = 64, what about Y?
Step 1: Count the number of values. N = 5
Step 2: Find XY and X2 for each pair of values (see the table).
Step 3: Find ΣX, ΣY, ΣXY, ΣX2.
ΣX = 311, Mean = 62.2
ΣY = 18.6, Mean = 3.72
ΣXY = 1159.7
ΣX2 = 19359
Step 4: Substitute in the slope formula.
Slope(b) = (1159.7 - 5*62.2*3.72) / (19359 - 5*62.2*62.2) = 0.19
Step 5: Now substitute in the intercept formula, using Slope(b) = 0.19.
Intercept(a) = mean of Y - b * mean of X = 3.72 - 0.19 * 62.2 = -8.098
Then substitute these values in the regression equation formula: Regression Equation(Ŷ) = a + bX
Ŷ = -8.098 + 0.19X
Step 6: Suppose we want to know the approximate Y value for X = 64. Then we can substitute the value in the above equation.
Regression Equation: Ŷ = a + bX = -8.098 + 0.19(64) = -8.098 + 12.16 = 4.06
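The Personal Income / Auto Sales steps can be retraced in Python from the sums given in Step 3. This is a minimal sketch, with variable names chosen only for illustration; it also shows, as an aside, what happens if the slope is not rounded to 0.19 before the intercept is computed.

```python
# Summary figures from the example (Personal Income = X, Auto Sales = Y)
n = 5
mean_x, mean_y = 62.2, 3.72
sum_xy, sum_x2 = 1159.7, 19359

# Slope before rounding, then rounded to two decimals as in the worked steps
b_exact = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
b = round(b_exact, 2)

# Intercept and prediction using the rounded slope (the slide's route)
a = mean_y - b * mean_x
y_hat = a + b * 64

# Same calculation carried through without rounding the slope
a_exact = mean_y - b_exact * mean_x
y_hat_exact = a_exact + b_exact * 64

print(round(b_exact, 4), b, round(a, 3), round(y_hat, 2), round(y_hat_exact, 2))
```

Rounding the slope early moves the intercept noticeably (about -8.10 with b = 0.19 versus about -7.96 with the unrounded slope), but near the mean of X the prediction is barely affected: both routes give roughly 4.06 at X = 64.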
The least-squares line is chosen to minimize the sum of the squares of the errors, which measures the goodness of fit of a line.
e_i = residual_i = Y_i - Ŷ_i
(Figure: two scatter diagrams, one with strong correlation and one with weak correlation, each marked with its standard error of estimate, SE)
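To make the residuals concrete, the sketch below computes them for a fitted line, together with the sum of squared errors and the standard error of estimate. Several things here are assumptions: the four data points are illustrative values chosen only because they reproduce the truck example's summary figures (they do not appear on the slides), and the standard error uses the usual formula sqrt(SSE / (n - 2)), which the slides mention by name but do not spell out.

```python
import math

# Illustrative points only (assumed, not from the slides): X = age, Y = expense
xs = [5, 3, 3, 1]
ys = [7, 7, 6, 4]

# Fitted line from the truck example: Y-hat = 3.75 + 0.75 X
a, b = 3.75, 0.75

predictions = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predictions)]  # e_i = Y_i - Y-hat_i

sse = sum(e ** 2 for e in residuals)                  # sum of squared errors
standard_error = math.sqrt(sse / (len(xs) - 2))       # standard error of estimate

print(residuals, sse, round(standard_error, 3))
```

A line that passed through every point would have all residuals equal to zero, so both the sum of squared errors and the standard error of estimate would be zero.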
Correlation Analysis:
Correlation analysis describes the degree to which one variable is linearly related to another.
Coefficient of Determination (r2): measures the extent, or strength, of the association that exists between two variables.
Coefficient of Correlation (r): the square root of the coefficient of determination, taking the same sign as the slope b.
Coefficient of Determination (r2):
• 0 ≤ r2 ≤ 1.
• The larger r2, the stronger the linear relationship.
• The closer r2 is to 1, the more confident we are in our prediction.
$$r^2 = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2}$$
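The shortcut formula for the coefficient of determination can be tried out in Python. This is a sketch under stated assumptions: the four data points are illustrative values (not from the slides) chosen to be consistent with the truck example's line Ŷ = 3.75 + 0.75X, and giving r the sign of the slope b is the usual convention rather than something the formula itself produces.

```python
import math

# Illustrative points only (assumed, not from the slides)
xs = [5, 3, 3, 1]
ys = [7, 7, 6, 4]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Least-squares slope and intercept
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
a = mean_y - b * mean_x

# Shortcut formula for the coefficient of determination
r_squared = (a * sum(ys) + b * sum_xy - n * mean_y ** 2) / (sum_y2 - n * mean_y ** 2)

# Coefficient of correlation: square root of r squared, with the sign of b
r = math.copysign(math.sqrt(r_squared), b)

print(a, b, r_squared, round(r, 3))
```

For these points r2 comes out at 0.75, so about three quarters of the variation in Y is accounted for by the line, and r is about +0.87 because the slope is positive.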
Review
Which value of r indicates a stronger correlation than 0.40?
A. -0.30
B. -0.50
C. +0.38
D. 0
If all the points on a scatter diagram lie on a straight line, what is the standard error of estimate?
A. -1
B. +1
C. 0
D. Infinity
In the least squares equation, Ŷ = 10 + 20X, the value of 20 indicates
A. the Y intercept.
B. for each unit increase in X, Y increases by 20.
C. for each unit increase in Y, X increases by 20.
D. none of these.
Review
A sales manager for an advertising agency believes there is a relationship between the number of contacts and the amount of the sales. To verify this belief, the following data was collected:
What is the Y-intercept of the linear equation?
A. -12.201
B. 2.1946
C. -2.1946
D. 12.201
What have we learnt?
– scatter diagrams
– dependent / independent variables
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation