IBS Statistics Year 1 Dr. Ning DING [email protected] I.007

Lesson04

Dec 03, 2014


Ning Ding

Statistics for International Business School, Hanze University of Applied Science, Groningen, The Netherlands
Transcript
Page 1: Lesson04

IBS Statistics Year 1

Dr. Ning DING [email protected]

Page 2: Lesson04

What are we going to learn?

• Review

• Chapter 12: Simple Regression and Correlation
– dependent / independent variables
– scatter diagrams
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation

Page 3: Lesson04

Review

Find the interquartile range:
1460 1471 1637 1721 1758 1787 1940 2038 2047 2054 2097 2205 2287 2311 2406

Interquartile Range = Q3 - Q1 = 2205 - 1721 = 484
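The quartile locations in this review follow the rule L = (n + 1) × p. A minimal Python sketch of that rule is given here; the helper name `value_at_location` is hypothetical, and the linear interpolation for fractional locations is an assumption (the slides show a fractional L = 2.25 in the Excel review but do not spell out the interpolation step).

```python
def value_at_location(sorted_values, p):
    """Return the value at location L = (n + 1) * p, counting from 1.

    Assumption: fractional locations are resolved by linear interpolation
    between the two neighbouring values.
    """
    n = len(sorted_values)
    location = (n + 1) * p
    lower = int(location)          # whole part of L
    fraction = location - lower    # decimal part of L
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


# The 15 values from the review question.
data = sorted([1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038,
               2047, 2054, 2097, 2205, 2287, 2311, 2406])

q1 = value_at_location(data, 0.25)   # L = 16 * 0.25 = 4
q3 = value_at_location(data, 0.75)   # L = 16 * 0.75 = 12
iqr = q3 - q1
print(q1, q3, iqr)  # 1721.0 2205.0 484.0
```

With 15 values both locations are whole numbers, so Q1 and Q3 are simply the 4th and 12th ordered values.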

Page 4: Lesson04


Review EXCEL Lesson

L=(8+1)*25%=2.25

Q1=133.5

L=(8+1)*75%=6.75

Q3=274.5

Interquartile Range=274.5-133.5=141

Page 5: Lesson04

Review

Boxplot

Data: 1 2 2 4 5 7 8 9 12
Median = 5
Lower half: 1 2 2 4, so Q1 = 2
Upper half: 7 8 9 12, so Q3 = 8.5
Interquartile Range = Q3 - Q1

Quartile, Decile (1st D, 9th D), Percentile

http://cnx.org/content/m11192/latest/

How to interpret?

Page 6: Lesson04

Boxplot values: from € 20 to € 2000, Q1 = € 250, Median = € 350, Mean = € 450, Q3 = € 850

The distribution is skewed to __________ because the mean is __________ the median.

Answer: the right; larger than

http://cnx.org/content/m11192/latest/


Review

Page 7: Lesson04

Data set 1: 0.8 1.0 1.0 1.2 1.2 1.3 1.5 1.7 2.0 2.0 2.1 2.2 4.0
Mean > Median: positively skewed

Data set 2: 2.0 3.2 3.6 3.7 4.0 4.2 4.2 4.5 4.5 4.6 4.8 5.0 5.0
Mean < Median: negatively skewed

http://qudata.com/online/statcalc/
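The mean-versus-median comparison on this slide can be checked with a few lines of Python. This is only a sketch of the slide's rule of thumb; the function name `skew_direction` is hypothetical, and equality of mean and median is treated as "zero skewness" in the same simplified sense the slides use.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify skewness by comparing mean and median (rule of thumb)."""
    m, md = mean(values), median(values)
    if m > md:
        return "positively skewed"
    if m < md:
        return "negatively skewed"
    return "zero skewness"


# The two data sets shown on the slide.
set_1 = [0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0]
set_2 = [2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0]

result_1 = skew_direction(set_1)   # mean is about 1.69, median is 1.5
result_2 = skew_direction(set_2)   # mean is about 4.1, median is 4.2
print(result_1, "|", result_2)
```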

Review

Page 8: Lesson04

This means that the data is symmetrically distributed.

Zero skewness

mode=median=mean


Review

Page 9: Lesson04

Chapter 12
– scatter diagrams
– dependent / independent variables
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation

Page 10: Lesson04

Regression and Correlation Analyses

– How to determine both the nature and the strength of a relationship between variables.


Page 11: Lesson04

Regression and Correlation Analyses

Scatter Diagram:

[Figure: Describing Relationship between Two Variables – Scatter Diagram Examples. Panel shown: positive correlation]

Page 12: Lesson04

Regression and Correlation Analyses

Scatter Diagram:

[Figure: Describing Relationship between Two Variables – Scatter Diagram Examples. Panel shown: negative correlation]

Page 13: Lesson04

Regression and Correlation Analyses

Scatter Diagram:

[Figure: Describing Relationship between Two Variables – Scatter Diagram Examples. Panel shown: no correlation]

Page 14: Lesson04

Regression and Correlation Analyses

Scatter Diagrams:
• Patterns indicating that the variables are related
• If related, we can describe the relationship

[Figure: five scatter diagram panels: Strong & Positive correlation, Strong & Negative correlation, Weak & Positive correlation, Weak & Negative correlation, No correlation]

Page 15: Lesson04

Regression and Correlation Analyses

Variables:
– Independent variables: known
– Dependent variables: to predict

[Figure: Describing Relationship between Two Variables – Scatter Diagram Examples, with axes labelled Independent Variable and Dependent Variable]

Page 16: Lesson04

Regression and Correlation Analyses

Correlation & Cause and Effect?

• The relationships found by regression are relationships of association,
• not necessarily of cause and effect.

Page 17: Lesson04


Page 18: Lesson04

Least-squares estimating equation:
• The dependent variable Y is determined by the independent variable X

Ŷ = a + bX

[Figure: estimating line on a scatter diagram with axes labelled X (Independent Variable) and Y (Dependent Variable)]

Page 19: Lesson04

Least-squares estimating equation:

Ŷ = a + bX

where a is the Y-intercept and b is the slope of the line.

Page 20: Lesson04

Least-squares estimating equation:

Ŷ = a + bX

\[ b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} \]

\[ a = \bar{Y} - b\bar{X} \]

Page 21: Lesson04

Least-squares estimating equation:

What is the relationship between the age of a truck (X) and the annual repair expense (Y)?

Ŷ = a + bX

\[ b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} \qquad a = \bar{Y} - b\bar{X} \]

X̄ = 3, Ȳ = 6

\[ b = \frac{78 - 4 \cdot 3 \cdot 6}{44 - 4 \cdot 9} = \frac{6}{8} = 0.75 \]

a = 6 - 0.75 * 3 = 3.75

Ŷ = 3.75 + 0.75 X

If the city has a truck that is 4 years old, the director could use the equation to predict $675 annually in repairs:

Ŷ = 3.75 + 0.75 * 4 = 6.75
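The truck calculation can be reproduced from the summary figures alone. The sketch below is a minimal illustration, assuming the values shown on the slide (n = 4, ΣXY = 78, ΣX² = 44, X̄ = 3, Ȳ = 6); the function name `least_squares_from_sums` is hypothetical.

```python
def least_squares_from_sums(n, sum_xy, sum_x2, mean_x, mean_y):
    """Slope b and intercept a from the shortcut formulas:
    b = (sum XY - n * mean_x * mean_y) / (sum X^2 - n * mean_x^2)
    a = mean_y - b * mean_x
    """
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b


# Summary figures taken from the truck slide.
a, b = least_squares_from_sums(n=4, sum_xy=78, sum_x2=44, mean_x=3, mean_y=6)

# Prediction for a truck that is 4 years old.
predicted = a + b * 4
print(a, b, predicted)  # 3.75 0.75 6.75
```

The prediction of 6.75 corresponds to the $675 quoted on the slide, which suggests the repair expense is measured in hundreds of dollars.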

Page 22: Lesson04

Least-squares estimating equation:

Example:
• To find the simple linear regression of Personal Income (X) and Auto Sales (Y)
• If X = 64, what about Y?

Step 1: Count the number of values.
N = 5

Step 2: Find XY and X² for each pair of values (see the table on the slide).

Page 23: Lesson04

Least-squares estimating equation:

Step 3: Find ΣX, ΣY, ΣXY, ΣX².
ΣX = 311, Mean = 62.2
ΣY = 18.6, Mean = 3.72
ΣXY = 1159.7
ΣX² = 19359

Step 4: Substitute in the slope formula.

\[ b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{1159.7 - 5 \cdot 62.2 \cdot 3.72}{19359 - 5 \cdot 62.2 \cdot 62.2} = \frac{2.78}{14.8} \approx 0.19 \]

Page 24: Lesson04

Least-squares estimating equation:

Slope(b) = 0.19

Step 5: Now substitute in the intercept formula.
Intercept(a) = Ȳ - bX̄ = 3.72 - 0.19 * 62.2 = -8.098

Step 6: Then substitute these values in the regression equation formula.
Regression Equation: Ŷ = a + bX
Ŷ = -8.098 + 0.19X

Suppose we want to know the approximate Y value for X = 64. Then we can substitute the value in the above equation.

Regression Equation: Ŷ = a + bX
= -8.098 + 0.19(64)
= -8.098 + 12.16
= 4.06
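The Personal Income / Auto Sales steps can be traced in Python from the sums given on the slides (N = 5, ΣX = 311, ΣY = 18.6, ΣXY = 1159.7, ΣX² = 19359). This is a sketch rather than the author's own procedure; the variable names are illustrative, and the slope is rounded to two decimals before computing the intercept because that is what the slide's arithmetic does.

```python
# Sums taken from the slides.
n = 5
sum_x, sum_y = 311, 18.6
sum_xy, sum_x2 = 1159.7, 19359

mean_x = sum_x / n   # 62.2
mean_y = sum_y / n   # 3.72

# Step 4: slope from the shortcut formula.
b_exact = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
b = round(b_exact, 2)          # the slides continue with 0.19

# Step 5: intercept, using the rounded slope as on the slide.
a = mean_y - b * mean_x        # about -8.098

# Step 6: prediction for X = 64.
y_hat = a + b * 64             # about 4.06

# For comparison: the same prediction without rounding the slope first.
a_exact = mean_y - b_exact * mean_x
y_hat_exact = a_exact + b_exact * 64

print(round(b_exact, 4), b, round(a, 3), round(y_hat, 2), round(y_hat_exact, 2))
```

Rounding the slope early shifts the intercept noticeably (about -8.10 versus about -7.96), but the prediction at X = 64 agrees to two decimals either way.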

Page 25: Lesson04

Least-squares estimating equation:

The least-squares line is the one that minimizes the sum of the squares of the errors; that sum measures the goodness of fit of a line.

e_i = residual_i = Y_i - Ŷ_i

[Figure: two scatter diagrams, one showing strong correlation and one showing weak correlation, each labelled SE]
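To make the "sum of the squares of the errors" concrete, the sketch below computes residuals for two candidate lines. The four data points are hypothetical: the slides do not list the raw truck data, so these values were chosen only because they reproduce the summary figures of the truck example (n = 4, X̄ = 3, Ȳ = 6, ΣXY = 78, ΣX² = 44). The function names are illustrative as well.

```python
# Hypothetical data, consistent with the truck example's summary figures.
ages = [5, 3, 3, 1]        # X
expenses = [7, 7, 6, 4]    # Y


def residuals(a, b, xs, ys):
    """e_i = Y_i - (a + b * X_i) for every observation."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def sum_squared_errors(a, b, xs, ys):
    """Sum of the squared residuals for the line Y-hat = a + b * X."""
    return sum(e ** 2 for e in residuals(a, b, xs, ys))


# Least-squares line from the truck example: Y-hat = 3.75 + 0.75 X.
fitted_errors = residuals(3.75, 0.75, ages, expenses)
fitted_sse = sum_squared_errors(3.75, 0.75, ages, expenses)

# An arbitrary alternative line, Y-hat = 3 + 1 X, fits worse.
other_sse = sum_squared_errors(3.0, 1.0, ages, expenses)

print(fitted_errors, fitted_sse, other_sse)
```

For the least-squares line the residuals sum to zero, and any other line through these points gives a larger sum of squared errors.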

Page 26: Lesson04


Page 27: Lesson04

Correlation Analysis:
describes the degree to which one variable is linearly related to another.

Coefficient of Determination (r²):
Measures the extent, or strength, of the association that exists between two variables.

Coefficient of Correlation (r):
Square root of the coefficient of determination

Page 28: Lesson04

Coefficient of Determination (r²):
Measures the extent, or strength, of the association that exists between two variables.

• 0 ≤ r² ≤ 1.
• The larger r², the stronger the linear relationship.
• The closer r² is to 1, the more confident we are in our prediction.

\[ r^2 = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2} \]
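The shortcut formula for r² can be checked against the definition 1 - SSE/SST. The sketch below does both on a small data set; the four points are hypothetical (the slides give no raw data here) and were chosen to match the summary figures of the truck example, and all variable names are illustrative.

```python
# Hypothetical data, consistent with the truck example's summary figures.
xs = [5, 3, 3, 1]
ys = [7, 7, 6, 4]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x ** 2 for x in xs)
sum_y2 = sum(y ** 2 for y in ys)

# Least-squares slope and intercept.
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
a = mean_y - b * mean_x

# Shortcut formula: r^2 = (a*sum(Y) + b*sum(XY) - n*mean_y^2) / (sum(Y^2) - n*mean_y^2)
r_squared = (a * sum(ys) + b * sum_xy - n * mean_y ** 2) / (sum_y2 - n * mean_y ** 2)

# Cross-check: 1 - (unexplained variation / total variation).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - mean_y) ** 2 for y in ys)
r_squared_check = 1 - sse / sst

print(r_squared, r_squared_check)  # 0.75 0.75
```

Both routes give the same value, which lies between 0 and 1 as the slide states.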

Page 29: Lesson04

Coefficient of Determination (r²):

\[ r^2 = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2} \]

Page 30: Lesson04

Coefficient of Correlation (r):
Square root of the coefficient of determination; r takes the same sign as the slope b.

\[ r = \pm\sqrt{r^2} \]
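Because a square root is never negative, the sign of r has to come from the slope of the regression line. A short sketch of that step follows; the function name `correlation_from_r2` and the input values are purely illustrative and do not come from the slides.

```python
import math


def correlation_from_r2(r_squared, slope):
    """Coefficient of correlation: square root of r^2, signed like the slope b."""
    r = math.sqrt(r_squared)
    return r if slope >= 0 else -r


# Illustrative values only.
r_up = correlation_from_r2(0.25, slope=1.5)     # upward-sloping line
r_down = correlation_from_r2(0.25, slope=-1.5)  # downward-sloping line
print(r_up, r_down)  # 0.5 -0.5
```

Both cases have the same r², so the linear relationship is equally strong; only its direction differs.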

Page 31: Lesson04

Review

Which value of r indicates a stronger correlation than 0.40?
A. -0.30
B. -0.50
C. +0.38
D. 0

If all the points on a scatter diagram lie on a straight line, what is the standard error of estimate?
A. -1
B. +1
C. 0
D. Infinity

Page 32: Lesson04

Review

In the least squares equation Ŷ = 10 + 20X, the value of 20 indicates
A. the Y intercept.
B. for each unit increase in X, Y increases by 20.
C. for each unit increase in Y, X increases by 20.
D. none of these.

Page 33: Lesson04

Review

A sales manager for an advertising agency believes there is a relationship between the number of contacts and the amount of the sales. To verify this belief, the following data was collected:

What is the Y-intercept of the linear equation?
A. -12.201
B. 2.1946
C. -2.1946
D. 12.201

Page 34: Lesson04

What have we learnt?
– scatter diagrams
– dependent / independent variables
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation

Page 35: Lesson04