PSY 1950 Correlation November 5, 2008

PSY 1950 Correlation November 5, 2008. Definition Correlation quantifies the strength and direction of a linear relationship between two variables.

Jan 15, 2016


PSY 1950 Correlation

November 5, 2008


Definition

• Correlation quantifies the strength and direction of a linear relationship between two variables


History


The First Scatterplot (Galton, 1885)


Importance

• Prior to correlation, “there was no way to discuss -- let alone measure -- the association between variables that lacked a cause-effect relationship”
• Correlation underlies many advanced statistical techniques
  – Factor analysis
  – Structural equation modeling
• Correlation informs
  – Prediction of an unknown variable
  – Validity of a measure
  – Reliability of a measure
  – Validity of a theory


Covariance

• Covariance measures how much two variables change together
  – The more they change together, the higher the covariance
  – Variance is a special case of covariance
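The slide gives no formula, so the following is a sketch of the usual sample definition. The n − 1 denominator is an assumption (a population version divides by n instead). Setting Y = X shows why variance is a special case:

```latex
\operatorname{cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}),
\qquad
\operatorname{cov}(X, X) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2 = s_X^2
```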


[Two plots, images not preserved: vertical axes run from 3 to 9, horizontal axes from 0 to 6]

   Score       Deviation
  X    Y      X     Y     Product
  1    4     -2    -2        4
  2    5     -1    -1        1
  3    6      0     0        0
  4    7      1     1        1
  5    8      2     2        4

   Score       Deviation
  X    Y      X     Y     Product
  1    8     -2     2       -4
  2    5     -1    -1        1
  3    4      0    -2        0
  4    5      1    -1       -1
  5    8      2     2        4
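The deviation products tabulated on this slide can be turned into covariances with a few lines of Python. This is a minimal sketch: the function names are illustrative, and dividing by n − 1 is an assumption, since the slide shows only the products.

```python
# Covariance for the two five-point data sets tabulated on this slide.
# Assumption: sample covariance (divide by n - 1); the slide lists only the products.

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sum of the deviation products divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    products = [(x - mx) * (y - my) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

x = [1, 2, 3, 4, 5]
y_first = [4, 5, 6, 7, 8]    # first table: Y rises with X
y_second = [8, 5, 4, 5, 8]   # second table: Y falls, then rises

cov_first = covariance(x, y_first)    # products 4 + 1 + 0 + 1 + 4 = 10
cov_second = covariance(x, y_second)  # products -4 + 1 + 0 - 1 + 4 = 0
var_x = covariance(x, x)              # variance is the covariance of X with itself

print(cov_first, cov_second, var_x)
```

The second data set has zero covariance even though Y clearly depends on X, because the negative and positive products cancel.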


The Problem with Covariation

• It reflects not only the degree of a bivariate relationship, but also the variation of each variable

• In other words, its units depend on the variables

[Two plots, images not preserved: the first has a vertical axis from 3 to 21, the second from 3 to 9; both horizontal axes run from 0 to 6]
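A short sketch of the unit problem: changing the units of Y changes the covariance but leaves the correlation alone. The data and the factor of 3 are hypothetical, chosen only for illustration, and the n − 1 denominator is an assumption.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator, an assumption)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical scores, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_rescaled = [3 * value for value in y]  # same variable in different units

cov_original = covariance(x, y)
cov_rescaled = covariance(x, y_rescaled)
r_original = pearson_r(x, y)
r_rescaled = pearson_r(x, y_rescaled)

print(cov_original, cov_rescaled)  # covariance triples with the units
print(r_original, r_rescaled)      # correlation is unchanged
```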


Pearson Product-Moment Correlation (r)

• Special case of covariance
  – Standardized covariance
  – Covariance of standardized variables
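The two descriptions of r above give the same number. The sketch below checks this on hypothetical data; the helper names are illustrative, and sample (n − 1) formulas are assumed throughout.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator, an assumption)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def sample_sd(values):
    return math.sqrt(covariance(values, values))

def z_scores(values):
    """Standardize: subtract the mean, divide by the standard deviation."""
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]

# Hypothetical scores, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Route 1: standardized covariance.
r_standardized = covariance(x, y) / (sample_sd(x) * sample_sd(y))

# Route 2: covariance of the standardized variables.
r_from_z = covariance(z_scores(x), z_scores(y))

print(r_standardized, r_from_z)
```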


Example


Interpreting r

• Things to consider carefully
  – Correlation versus causation
  – Restricted range
  – Group sampling
  – Outliers
  – Linearity
  – Size
  – Homoscedasticity
  – Significance


Correlation versus Causation


Correlation versus Causation


Restriction of Range

• When the bivariate range is artificially limited
  – For a linear relationship, the correlation is almost always spuriously attenuated
  – For a curvilinear relationship, restriction can produce a spuriously large correlation
• Possibly a grouping/selection effect
  – e.g., the correlation between height and basketball ability among NBA players

• http://www.ruf.rice.edu/~lane/stat_sim/restricted_range/index.html
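Attenuation under range restriction can be seen in a small simulation. This is a sketch under stated assumptions: the data are simulated, and the seed, sample size, and cutoff of 1.0 are arbitrary choices, not values from the lecture.

```python
import math
import random

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)

rng = random.Random(1950)  # fixed seed so the run is repeatable

# Simulated linear relationship: Y is X plus independent noise.
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys = [x + rng.gauss(0, 1) for x in xs]

r_full = pearson_r(xs, ys)

# Keep only cases with high X, as when a sample is selected on X.
restricted = [(x, y) for x, y in zip(xs, ys) if x > 1.0]
r_restricted = pearson_r([p[0] for p in restricted], [p[1] for p in restricted])

print(round(r_full, 2), round(r_restricted, 2))
```

The underlying relationship between X and Y is identical in both samples; only the spread of X differs.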


Grouping

• Pooling heterogeneous groups (either a priori via sampling or a posteriori via data segregation) can inflate correlation
  – e.g., the correlation between height and basketball ability among short people and tall people
  – e.g., the correlation between height and weight in men and women
    • For men, r = .60; for women, r = .49
    • Together, r = .78

• http://www.ruf.rice.edu/~lane/stat_sim/restricted_range/index.html
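The inflation from pooling can also be simulated. In this sketch the two groups share the same within-group relationship but differ in their means on both variables. All numbers are arbitrary standardized units and are not meant to reproduce the height and weight figures above.

```python
import math
import random

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)

rng = random.Random(1950)  # fixed seed so the run is repeatable

def simulate_group(n, shift):
    """Hypothetical group whose means on both variables equal `shift`."""
    xs = [shift + rng.gauss(0, 1) for _ in range(n)]
    ys = [shift + 0.5 * (x - shift) + rng.gauss(0, 1) for x in xs]
    return xs, ys

xa, ya = simulate_group(2000, 0.0)
xb, yb = simulate_group(2000, 3.0)

r_group_a = pearson_r(xa, ya)
r_group_b = pearson_r(xb, yb)
r_pooled = pearson_r(xa + xb, ya + yb)  # between-group differences add covariance

print(round(r_group_a, 2), round(r_group_b, 2), round(r_pooled, 2))
```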


Outliers

• Correlation is very sensitive to outliers
  – For all three plots, r, the means, and the SDs are equal
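A small sketch of that sensitivity: five points with r = 0 (the second covariance data set from earlier in the lecture), plus one added point. The added point (15, 20) is hypothetical and chosen only to be extreme on both variables.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)

x = [1, 2, 3, 4, 5]
y = [8, 5, 4, 5, 8]

r_without_outlier = pearson_r(x, y)

# One hypothetical extreme case added to the same five points.
r_with_outlier = pearson_r(x + [15], y + [20])

print(round(r_without_outlier, 2), round(r_with_outlier, 2))
```

A single case moves r from zero to above .9, which is why a scatterplot should accompany any reported correlation.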


Linearity


Size

• The magnitude of r
• The magnitude of r²
  – The coefficient of determination
  – The proportion of variability in one variable accounted for by variability in the other variable
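The "proportion of variability accounted for" reading of r² can be checked numerically. The sketch below fits a least-squares line to hypothetical data and compares r² with 1 minus the ratio of residual to total sum of squares. The regression line is an added illustration, not something shown on this slide.

```python
import math

# Hypothetical scores, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sp = sum((a - mx) * (b - my) for a, b in zip(x, y))   # sum of products
ssx = sum((a - mx) ** 2 for a in x)                   # sum of squares for X
ssy = sum((b - my) ** 2 for b in y)                   # total sum of squares for Y

r = sp / math.sqrt(ssx * ssy)
r_squared = r ** 2

# Least-squares line predicting Y from X.
slope = sp / ssx
intercept = my - slope * mx
ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

proportion_explained = 1 - ss_residual / ssy

print(round(r_squared, 3), round(proportion_explained, 3))
```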


Homoscedasticity

• Same as the homogeneity of variance assumption
• The variance of Y does not depend on the value of X, and vice versa


Significance

• To test the null hypothesis that the population correlation, ρ (“rho”), is 0, use:
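The formula image on this slide did not survive extraction. The sketch below assumes it was the standard t test for a correlation, t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The function name and the example values are illustrative.

```python
import math

def t_for_correlation(r, n):
    """Return (t, df) for testing rho = 0, assuming the standard t test."""
    if n < 3:
        raise ValueError("need at least 3 pairs of scores")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t, n - 2

# Hypothetical example: r = .50 observed in a sample of 27 pairs.
t_value, df = t_for_correlation(0.5, 27)
print(round(t_value, 3), df)
```

The resulting t is compared with the critical value of the t distribution for the given degrees of freedom.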


Other measures of correlation

• Computationally identical to r
  – Point-biserial
    • One dichotomous variable
  – Phi
    • Two dichotomous variables
  – Spearman
    • Both variables on ordinal scale
    • Tests monotonicity of relationship
    • As X increases, so does Y
    • No accurate significance test
• Computationally novel techniques
  – e.g., Kendall’s tau
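"Computationally identical to r" can be made concrete for Spearman: rank each variable, then apply the Pearson formula to the ranks. This is a minimal sketch with hypothetical data. The `ranks` helper is illustrative and assigns average ranks to ties.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)

def ranks(values):
    """Rank from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation as Pearson r computed on ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical monotonic but nonlinear relationship: Y is X cubed.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

r_pearson = pearson_r(x, y)
r_spearman = spearman(x, y)

print(round(r_pearson, 3), round(r_spearman, 3))
```

Spearman reaches 1 because the ordering of X and Y agrees perfectly, while Pearson r stays below 1 because the relationship is not linear. Point-biserial and phi follow the same pattern: code each dichotomous variable as 0 and 1 and pass the scores to `pearson_r`.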