Page 1: Correlation & Regression_

1. Scatterplot

2. Simple (Linear) Regression

3. Simple Non-Linear Regression

Data: Country database website

– information on national characteristics of 160 countries

– measures of quality of life of the population (e.g. life expectancy and infant mortality)

– measures of wealth (e.g. GNP)

Simple Regression/Correlation

Page 2: Correlation & Regression_

Research Question:

Is there a relationship between population growth rate and level of urbanization (measured as the percent of population living in urban areas)?

A question of

“whether or not there is,”

as well as

“how much.”

Scatterplot & Linear Simple Regression

T-tests and chi-square tests

Correlation and regression analysis

Page 3: Correlation & Regression_

1. Scatterplot: Graphs/Scatter

Simple

Page 4: Correlation & Regression_

1. Scatterplot: Graphs/Scatter

Y Axis: popgrow

X Axis: urb

Page 5: Correlation & Regression_

1. Scatterplot: Graphs/Scatter

[Scatterplot: POPGROW (y-axis) versus URB (x-axis)]

Each dot represents a case/country.

Page 6: Correlation & Regression_

Chart/Options

To make changes to the chart, double-click it in the output window. The Chart Editor will appear.

Page 7: Correlation & Regression_

Chart/Options: the least-squares line

Page 8: Correlation & Regression_

Chart/Axis

Page 9: Correlation & Regression_

1. Scatterplot: Graphs/Scatter

[Scatterplot: GROWTH (y-axis) versus URB (x-axis), with the least-squares line drawn through the points]

File/Print,

Export Chart to export as a graphics figure (e.g. .jpg) to Word,

Copy/Paste as picture to Word
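The least-squares line that the Chart Editor draws can be reproduced by hand. The following is a minimal sketch of the closed-form calculation; the function name and the four data points are made up for illustration and are not taken from the country database.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line that minimizes the sum of
    squared vertical distances between the points and the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy values for illustration only; NOT from the country database.
toy_urb = [20.0, 40.0, 60.0, 80.0]
toy_growth = [3.0, 2.5, 2.0, 1.5]

intercept, slope = least_squares_line(toy_urb, toy_growth)
print(f"growth = {intercept:.3f} + ({slope:.4f}) * urb")
```

With real data, the two lists would hold the urb and popgrow values for each country, with cases that have missing values dropped first.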

Page 10: Correlation & Regression_

1. Scatterplot: Graphs/Scatter

Identify cases using the Point ID tool in the Chart Editor window. Select the button in the menu, and use the pointer to select the point in the upper right corner with high urbanization and high growth rate. (Case number 46: United Arab Emirates.)

Page 11: Correlation & Regression_

Format/Color or Format/Marker

While in the Chart Editor, click on a feature (e.g. the least-squares line or the dots). You can then change its color (on the button menu above), or right-click and select “properties window” to change the line or marker/dot type.

Page 12: Correlation & Regression_

2. Simple Linear Regression

Analyze/Regression/Linear...

Page 13: Correlation & Regression_

2. Simple Linear Regression

Dependent: popgrow

Independent(s): urb

Page 14: Correlation & Regression_

Regression Results

Variables Entered/Removed(b)

Model   Variables Entered   Variables Removed   Method
1       URB(a)              .                   Enter

a. All requested variables entered.
b. Dependent Variable: POPGROW

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .249(a)   .062       .057                1.1537

a. Predictors: (Constant), URB

r2 = 0.06

About 6% of the variation in popgrow can be explained by variation in urb.

Page 15: Correlation & Regression_

Regression Results

Coefficients(a)

              Unstandardized Coefficients   Standardized Coefficients
Model 1       B           Std. Error        Beta                        t        Sig.
(Constant)    2.511       .220                                          11.389   .000
URB           -1.25E-02   .004              -.249                       -3.357   .001

a. Dependent Variable: POPGROW

ANOVA(b)

Model 1       Sum of Squares   df    Mean Square   F        Sig.
Regression    15.002           1     15.002        11.271   .001(a)
Residual      226.272          170   1.331
Total         241.274          171

a. Predictors: (Constant), URB
b. Dependent Variable: POPGROW

For URB, Sig. = .001 < 0.05 and |t| = 3.357 > 1.96.

Reject H0: No linear relationship between growth and urbanization

Confidence interval on the slope for URB, b: -0.0125 ± 1.96 * 0.004 = [-0.0203 ≤ b ≤ -0.0047], which does not contain 0. There is a significant linear relationship (reject H0).
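The Model Summary figures follow directly from the ANOVA table, so they can be checked by hand. The sketch below recomputes them from the printed sums of squares and degrees of freedom; the variable names are illustrative, and the results can differ from SPSS in the last digit because the inputs are already rounded.

```python
import math

# Values copied from the ANOVA table for POPGROW regressed on URB.
ss_regression, df_regression = 15.002, 1
ss_residual, df_residual = 226.272, 170
ss_total, df_total = 241.274, 171

# R Square: share of the total variation explained by the regression.
r_square = ss_regression / ss_total

# Adjusted R Square: compares mean squares rather than sums of squares.
adj_r_square = 1 - (ss_residual / df_residual) / (ss_total / df_total)

# Std. Error of the Estimate: square root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / df_residual)

# F: regression mean square divided by residual mean square.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R Square          = {r_square:.3f}")
print(f"Adjusted R Square = {adj_r_square:.3f}")
print(f"Std. Error        = {std_error_estimate:.4f}")
print(f"F                 = {f_stat:.3f}")
```

With a single predictor, F is the square of the slope's t value (3.357 squared is about 11.27), which is why both tests report the same significance level.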

Page 16: Correlation & Regression_

Analysis of Regression Results

There is a weak negative relationship between POPGROW and URB.

Only 6% of the variation in growth rate is explained by variation in the level of urbanization (r2=0.062).

The Linear Model (refer to B in coefficients):

POPGROW = 2.511 - 0.0125 * URB

The predicted growth rate for the US (URB=77.2%) is 1.55% (2.511 - 0.0125*77.2), in comparison to the actual rate of 1%.
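The prediction step can be written as a small function. This is a sketch that uses the B values as printed in the Coefficients table (2.511 and -0.0125); the function name is hypothetical, and the US figures (URB = 77.2, actual growth of 1%) are the ones quoted on this slide.

```python
def predict_popgrow(urb_percent):
    """Predicted population growth rate (%) from percent urban,
    using the rounded coefficients printed in the Coefficients table."""
    return 2.511 - 0.0125 * urb_percent


us_predicted = predict_popgrow(77.2)
us_actual = 1.0
# Residual = actual minus predicted; negative means the model over-predicts.
us_residual = us_actual - us_predicted

print(f"predicted = {us_predicted:.3f}, residual = {us_residual:.3f}")
```

Evaluated with these rounded coefficients, the prediction comes to roughly 1.55%, so the model over-predicts US growth by about half a percentage point. That is consistent with the low r2.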

Page 17: Correlation & Regression_

Research Question:

Perhaps the average number of live births per female (FERTIL) will also explain variation in growth rate?

1. Scatterplot & 2. Linear Simple Regression

Page 18: Correlation & Regression_

1. Scatterplot: Graphs/Scatter

Y Axis: popgrow

X Axis: fertil

Page 19: Correlation & Regression_

1. Scatterplot: Graphs/Scatter

[Scatterplot: POPGROW (y-axis) versus FERTIL (x-axis)]

Page 20: Correlation & Regression_

2. Simple Linear Regression

Dependent: popgrow

Independent(s): fertil

Page 21: Correlation & Regression_

Regression Results

Variables Entered/Removed(b)

Model   Variables Entered   Variables Removed   Method
1       FERTIL(a)           .                   Enter

a. All requested variables entered.
b. Dependent Variable: POPGROW

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .740(a)   .547       .544                .7892

a. Predictors: (Constant), FERTIL

r2 = 0.547

About 55% of the variation in popgrow can be explained by variation in fertil.

Page 22: Correlation & Regression_

Regression Results

Coefficients(a)

              Unstandardized Coefficients   Standardized Coefficients
Model 1       B       Std. Error            Beta                        t        Sig.
(Constant)    .245    .132                                              1.851    .066
FERTIL        .469    .033                  .740                        14.027   .000

a. Dependent Variable: POPGROW

ANOVA(b)

Model 1       Sum of Squares   df    Mean Square   F         Sig.
Regression    122.537          1     122.537       196.759   .000(a)
Residual      101.513          163   .623
Total         224.051          164

a. Predictors: (Constant), FERTIL
b. Dependent Variable: POPGROW

For FERTIL, Sig. = .000 < 0.05 and t = 14.027 > 1.96.

Reject H0: No linear relationship between growth and fertility

Confidence interval on the slope for FERTIL, b: 0.469 ± 1.96 * 0.033 = [0.404 < b < 0.534], which does not contain 0. There is a significant linear relationship (reject H0).

Page 23: Correlation & Regression_

Analysis of Regression Results

There is a moderate positive relationship between POPGROW and FERTIL.

About 55% of the variation in growth rate is explained by variation in fertility rate (r2=0.547).

The Linear Model (refer to B in coefficients):

POPGROW = 0.245 + 0.469 * FERTIL

The t-score and significance level indicate that the coefficient on FERTIL is significantly different from 0. The constant (t = 1.851, Sig. = .066) is not significantly different from 0 at the 0.05 level.

However, the relationship does not look linear on the scatterplot.
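The same interval check used for the slope can be applied to every row of the Coefficients table. The sketch below uses the printed B and Std. Error values with the 1.96 normal approximation used on these slides; the helper name is illustrative.

```python
def confidence_interval(b, std_error, z=1.96):
    """Approximate 95% interval: estimate plus or minus z standard errors."""
    half_width = z * std_error
    return b - half_width, b + half_width


# (B, Std. Error) pairs copied from the Coefficients table for FERTIL.
coefficients = {"(Constant)": (0.245, 0.132), "FERTIL": (0.469, 0.033)}

results = {}
for name, (b, se) in coefficients.items():
    low, high = confidence_interval(b, se)
    contains_zero = low <= 0.0 <= high
    results[name] = (low, high, contains_zero)
    print(f"{name:<11} [{low:.3f}, {high:.3f}]  contains 0: {contains_zero}")
```

An interval that contains 0 corresponds to a coefficient with |t| below 1.96, so this check and the t and Sig. columns should agree for each row.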

Page 24: Correlation & Regression_

Test alternative regression models for the relationship between popgrow and fertil using the Analyze/Regression/Curve Estimation package.

3. Simple Non-Linear Regression

Page 25: Correlation & Regression_

3. Simple Non-linear Regression

Dependent: popgrow

Independent(s): fertil

Linear, Quadratic, Logarithmic

Page 26: Correlation & Regression_

Analysis Results

LINEAR: popgrow = b0 + b1 * fertil

popgrow = .245 + .469 * fertil    r2 = .547

LOGARITHMIC: popgrow = b0 + b1 * log(fertil)

popgrow = .045 + 1.67 * log(fertil)    r2 = .635

QUADRATIC: popgrow = b0 + b1 * fertil + b2 * fertil^2

popgrow = -1.337 + 1.508 * fertil - .132 * fertil^2    r2 = .655

About 65% of the variation in growth rates can be explained by variation in fertility using the logarithmic or quadratic model, as compared to only 55% using the linear model.

Independent: FERTIL

Dependent   Mth   Rsq    d.f.   F        Sigf   b0        b1       b2
POPGROW     LIN   .547   163    196.76   .000   .2447     .4686
POPGROW     LOG   .635   163    284.08   .000   .0445     1.6661
POPGROW     QUA   .655   162    154.03   .000   -1.3371   1.5078   -.1315
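To see how the three fitted models differ in practice, their predictions can be evaluated side by side. This sketch uses the b0, b1 and b2 values from the Curve Estimation output; it assumes that log means the natural logarithm, which is how SPSS's logarithmic model is usually defined. The function names and the fertility values chosen are illustrative.

```python
import math


def linear(fertil):
    return 0.2447 + 0.4686 * fertil


def logarithmic(fertil):
    # Assumption: natural log, not log base 10.
    return 0.0445 + 1.6661 * math.log(fertil)


def quadratic(fertil):
    return -1.3371 + 1.5078 * fertil - 0.1315 * fertil ** 2


print(f"{'fertil':>6} {'LIN':>7} {'LOG':>7} {'QUA':>7}")
for fertil in (2, 4, 6, 8):
    print(f"{fertil:>6} {linear(fertil):>7.3f} "
          f"{logarithmic(fertil):>7.3f} {quadratic(fertil):>7.3f}")
```

The three models give similar predictions in the middle of the fertility range and diverge most at the high end, where there are fewer countries to pin down the curve.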

Page 27: Correlation & Regression_

Analysis Results

[Figure: POPGROW (y-axis) versus FERTIL (x-axis), showing the observed values together with the fitted Linear, Logarithmic and Quadratic curves]

Both the logarithmic and quadratic curves seem to better represent the relationship between growth rate and fertility: growth rates increase at a decreasing rate with fertility.
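"Increasing at a decreasing rate" can be made concrete by looking at the slope of each fitted curve, i.e. the change in predicted growth per additional birth. The sketch below differentiates the logarithmic and quadratic equations by hand, using the coefficients from the Curve Estimation output and again assuming a natural log; the function names are illustrative.

```python
B1_LOG = 1.6661
B1_QUA, B2_QUA = 1.5078, -0.1315


def log_slope(fertil):
    """d/dx of b0 + b1*ln(x) is b1/x: always positive, shrinking with x."""
    return B1_LOG / fertil


def quad_slope(fertil):
    """d/dx of b0 + b1*x + b2*x^2 is b1 + 2*b2*x."""
    return B1_QUA + 2 * B2_QUA * fertil


# Fertility level where the quadratic curve stops rising (slope = 0).
turning_point = -B1_QUA / (2 * B2_QUA)

for fertil in (2, 4, 6, 8):
    print(f"fertil={fertil}: LOG slope={log_slope(fertil):.3f}, "
          f"QUA slope={quad_slope(fertil):.3f}")
print(f"quadratic turning point at fertil = {turning_point:.2f}")
```

The two curves differ at high fertility. The logarithmic slope stays positive throughout, while the quadratic slope reaches zero at a fertility of about 5.7 and is negative beyond that. The quadratic model therefore implies falling growth rates for the highest-fertility countries, which should be checked against the scatterplot before it is interpreted.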