Simple Linear Regression and Correlation by Asst. Prof. Dr. Min Aung.

Simple Linear Regression and Correlation

by

Asst. Prof. Dr. Min Aung

When to use simple linear regression (SLR)?

• Study a relationship between two variables

• Paired-Samples or matched data

• Interval or ratio level measurement

Independent and dependent variables

• You want to guess, estimate, or compute the values of the dependent variable.

• In estimating, you will use the values of the independent variable.

Predictor and Predicted variables

• Predictor = independent variable.

• Predicted variable = dependent variable.

Scatter Diagram

• X-axis = independent variable.

• Y-axis = dependent variable.

• Each pair of data → a point (x, y)

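[Figure: scatter diagram with X on the horizontal axis and Y on the vertical axis, showing the example point (2, 3) plotted at x = 2, y = 3.]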

Purpose of Drawing Scatter Diagram

• Is there a linear relationship between the two variables X and Y?

• Linear relationship = Scatter points (roughly at least) form the shape of a straight line.

[Figure: two scatter diagrams of Y against X. Left: linear relationship. Right: no linear relationship.]

Measuring Strength of Linear Relationship

• Pearson’s coefficient of correlation r

• Formula (2) (Not used in exam. Just for knowledge)

• Calculator Work For Casio 350MS

Switch the calculator on.

1. Set calculator in LR (Linear Regression) mode:

Press Mode.

Press 3 for Reg (Regression).

Press 1 for Linear.

2. Check n (to see whether old data are still stored):

Press Shift 1, next 3, and then =.

Calculator Work for r

3. Enter data in pairs:

x-value , y-value M+

4. Check n again: see step 2 above.

5. Press Shift 2, then move to the right with the arrow key, press 3 for r, and then press =.

Now you see the value of r.
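As a cross-check on the calculator result, r can also be computed directly. The sketch below uses the standard textbook sums formula, r = (nΣxy − ΣxΣy) / √([nΣx² − (Σx)²][nΣy² − (Σy)²]); assuming this matches the slide's Formula (2) is our assumption, since that formula is not reproduced in the text. The function name `pearson_r` and the data values are hypothetical, chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for paired data, using the standard sums-based formula."""
    if len(xs) != len(ys):
        raise ValueError("x and y must contain the same number of values")
    n = len(xs)  # n = number of pairs of data
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_r(xs, ys), 4))  # 0.7746
```

Entering the same five pairs on the calculator should give the same value of r, up to rounding.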

Interpretation of r (Direct linear relationship)

1. If r is 1 or – 1, then all scatter points are on a straight line.

2. If r is 1, all points are on a straight line with a positive slope.

3. If r is -1, all points are on a straight line with a negative slope.

4. If a straight line has a positive slope, it rises up to the right.

5. If a straight line has a positive slope, then as x increases, y increases for the points (x, y) on it: small x goes with small y, and large x goes with large y.

6. In this situation, we say that the two variables X and Y are directly or positively correlated.

Interpretation of r (Inverse linear relationship)

1. If r is -1, all points are on a straight line with a negative slope.

2. If a straight line has a negative slope, then as x increases, y decreases for the points (x, y) on it: small x goes with large y, and large x goes with small y.

3. In this situation, we say that the two variables X and Y are inversely or negatively correlated.
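The direct and inverse cases can be checked numerically. This is a minimal sketch, assuming hypothetical points taken from two made-up lines (y = 2x + 1 and y = -2x + 10); the helper names `pearson_r` and `direction` are ours, not the slides'.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def direction(r):
    """Name the direction of the linear relationship from the sign of r."""
    if r > 0:
        return "direct (positive)"
    if r < 0:
        return "inverse (negative)"
    return "no linear relationship"

xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]     # points on a line with positive slope
falling = [-2 * x + 10 for x in xs]  # points on a line with negative slope

print(pearson_r(xs, rising), direction(pearson_r(xs, rising)))
print(pearson_r(xs, falling), direction(pearson_r(xs, falling)))
```

Points that lie exactly on a rising line give r = 1, and points that lie exactly on a falling line give r = -1.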

Interpretation of r (strength)

1. If r is not exactly 1 or -1, but it is 0.9 or -0.9, then the points lie around a straight line. They are close to a straight-line shape.

2. If r is 0.8 or -0.8, then the points are close to a straight-line shape, but not as close as in the case of 0.9 or -0.9.

3. Thus, the closer r is to 1 or -1, the closer the points are to a straight-line shape.

4. Likewise, the closer r is to 0, the farther the points are from a straight-line shape.

5. Among r-values, 0.9 indicates a stronger linear relationship than 0.8, and 0.8 indicates a weaker one than 0.9.

Interpretation of r (strength)

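Values of r (number line from -1 to 1)

• r = -1: perfect linear relationship

• r close to -1: strong linear relationship

• r = -0.5: weak linear relationship

• r = 0: no linear relationship

• r = 0.5: weak linear relationship

• r close to 1: strong linear relationship

• r = 1: perfect linear relationship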

Testing Linear Relationship

1. Pearson invented a formula to measure the strength and direction of a linear relationship between two variables.

2. The number given by his formula is called the correlation coefficient. We call it Pearson’s coefficient of correlation.

3. We write r for this value in a sample, and we write ρ (rho) for this value in a population.

4. Testing whether the correlation is significant is a scientific way of judging, from the sample, whether there is a correlation in the population between the two variables under consideration.

Null and Alternate Hypothesis

1. Test correlation: H0: ρ = 0 and Ha: ρ ≠ 0

2. Test direct correlation: H0: ρ ≤ 0 and Ha: ρ > 0

3. Test inverse correlation: H0: ρ ≥ 0 and Ha: ρ < 0

4. Test positive correlation: H0: ρ ≤ 0 and Ha: ρ > 0

5. Test negative correlation: H0: ρ ≥ 0 and Ha: ρ < 0

Three types of test

1. H0: ρ = 0 and Ha: ρ ≠ 0 → Two-tailed test

2. H0: ρ ≥ 0 and Ha: ρ < 0 → Left-tailed test

3. H0: ρ ≤ 0 and Ha: ρ > 0 → Right-tailed test

Critical value

1. Read t table.

2. Degrees of freedom (Df) = n - 2

3. n = number of pairs of data

4. Right-tailed test → positive sign

5. Left-tailed test → negative sign

6. Two-tailed test → both positive and negative signs
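The sign rule above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the value read from the t table is supplied by the user rather than computed. The example uses n = 10 pairs, so Df = 8, where the standard t table gives 2.306 for a two-tailed test at α = 0.05.

```python
def degrees_of_freedom(n_pairs):
    """Df = n - 2, where n is the number of pairs of data."""
    if n_pairs < 3:
        raise ValueError("need at least 3 pairs so that Df is positive")
    return n_pairs - 2

def signed_critical_values(table_value, tail):
    """Attach the sign(s) to the positive value read from the t table."""
    if tail == "right":
        return (table_value,)               # positive sign
    if tail == "left":
        return (-table_value,)              # negative sign
    if tail == "two":
        return (-table_value, table_value)  # both signs
    raise ValueError("tail must be 'right', 'left', or 'two'")

print(degrees_of_freedom(10))                # 8
print(signed_critical_values(2.306, "two"))  # (-2.306, 2.306)
```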

Test Statistic

1. Test statistic = Strength of evidence supporting alternate hypothesis Ha

2. The original test statistic for testing ρ is r.

3. Convert r to t by Formula (10).

4. Learn to compute t by your calculator correctly.
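Formula (10) is not reproduced in this text. The sketch below assumes it is the standard conversion t = r√(n − 2) / √(1 − r²), which is consistent with Df = n − 2. The function name `t_from_r` and the example values (r = 0.8, n = 11) are hypothetical.

```python
import math

def t_from_r(r, n_pairs):
    """Convert a sample correlation r to a t statistic with n_pairs - 2 degrees of freedom."""
    if n_pairs < 3:
        raise ValueError("need at least 3 pairs so that Df = n - 2 is positive")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1 for t to be finite")
    return r * math.sqrt(n_pairs - 2) / math.sqrt(1 - r ** 2)

# Hypothetical example: r = 0.8 from n = 11 pairs.
t = t_from_r(0.8, 11)
print(round(t, 3))  # 4.0
```

A negative r gives a negative t of the same size, so the sign of the test statistic follows the direction of the correlation.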

Rejection region 1

• For a two-tailed test, the rejection region is on the right of the positive critical value and on the left of the negative critical value.

[Figure: t curve over the real number line for t values, with 0 at the centre and the negative and positive critical values marked; one rejection region in each tail. Total area of the rejection regions = level of significance = probability = α.]

Rejection region 2

• For a left-tailed test, the rejection region is on the left of the (negative) critical value.

[Figure: t curve with 0 at the centre and the (negative) critical value marked; the rejection region is the left tail. Area of the rejection region = level of significance = probability = α.]

Rejection region 3

• For a right-tailed test, the rejection region is on the right of the (positive) critical value.

[Figure: t curve with 0 at the centre and the (positive) critical value marked; the rejection region is the right tail. Area of the rejection region = level of significance = probability = α.]

Decision Rule

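• If the test statistic (TS) is in the rejection region, then reject H0; otherwise, fail to reject H0.

• Reject H0 = the sample gives enough evidence, at level of significance α, to support Ha.

• Fail to reject H0 = the sample does not give enough evidence to support Ha. (This does not prove that H0 is true.)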
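The decision rule can be written as a short check of whether the test statistic falls in the rejection region. This is a minimal sketch with a hypothetical function name; `table_value` is the positive value read from the t table. The example assumes Df = 9 and α = 0.05 two-tailed, for which the standard t table gives 2.262.

```python
def decide(t_stat, table_value, tail):
    """Return the decision for a t test, given the positive t-table value."""
    if tail == "two":
        in_rejection_region = t_stat > table_value or t_stat < -table_value
    elif tail == "right":
        in_rejection_region = t_stat > table_value
    elif tail == "left":
        in_rejection_region = t_stat < -table_value
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return "reject H0" if in_rejection_region else "fail to reject H0"

print(decide(4.0, 2.262, "two"))  # reject H0
print(decide(1.5, 2.262, "two"))  # fail to reject H0
```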

Conclusion

• Conclusion = Decision

• The decision is the last step of the statistical procedure.

• Conclusion is the report to the one who asked the original question.
