Jun 20, 2015
Pearson Product Moment Correlation
Welcome to the Pearson Product Moment Correlation Learning Module
• The Pearson Product Moment Correlation is the most widely used statistic when determining the relationship between two variables that are continuous.
• By continuous we mean a variable that can take any value between two points.
• Here is an example:
Suppose the fire department mandates that all firefighters must weigh between 150 and 250 pounds. The weight of a firefighter would be an example of a continuous variable, since a firefighter's weight could take on any value between 150 and 250 pounds.
• The Pearson Product Moment Correlation will indicate either a strong relationship or a weak, even nonexistent, relationship.
• Strong relationships can be either positive or negative.
• The Pearson Product Moment Correlation, or simply the Pearson Correlation, takes values that range from -1.0 to +1.0
-1.0          0          +1.0
• A Pearson Correlation of +1.0 indicates a perfect positive relationship. Note two qualities here:
(1) direction
(2) strength
• A +1.0 Pearson Correlation's direction is positive and its strength is perfect.
• A -1.0 Pearson Correlation's direction is negative and its strength is perfect.
• A 0.0 Pearson Correlation has no direction and no strength.
• A +0.3 Pearson Correlation's direction is positive and its strength is moderately weak.
• A -0.1 Pearson Correlation's direction is negative and its strength is very weak.
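The two qualities above can be read straight off the number: the sign gives the direction, and the distance from zero gives the strength. Here is a minimal Python sketch of that reading. The helper name `describe_correlation` is hypothetical and not part of any library.

```python
def describe_correlation(r):
    """Split a Pearson correlation into its direction and its strength."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a Pearson correlation must lie between -1.0 and +1.0")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    strength = abs(r)  # 0.0 = no relationship, 1.0 = perfect relationship
    return direction, strength


# The five values discussed above.
for r in (1.0, -1.0, 0.0, 0.3, -0.1):
    print(r, describe_correlation(r))
```

The sketch deliberately returns the strength as a number rather than a label such as "weak", because the cutoffs for such labels are a matter of convention.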
• There is another quality as well. With a Pearson correlation you are considering the relationship between only two variables.
• Three’s a crowd:
• Bottom line: The Pearson Correlation is used only when exploring the relationship between two variables.
• Let’s look at a fictitious problem to illustrate how the Pearson Correlation is calculated.
• Imagine you are conducting a study to determine the relationship between the average daily temperature and the average daily ice cream sales in a particular city.
• Imagine the data set looks like this:
Ave Daily Temp     Ave Daily Ice Cream Sales
90°                560
80°                480
70°                350
60°                320
50°                230
• Notice how as one variable goes up (temperature) the other variable increases (ice cream sales)
• One way to look at this relationship is to rank order both variable values like so:
Ave Daily Temp     Rank     Ave Daily Ice Cream Sales     Rank
90°                1st      560                           1st
80°                2nd      480                           2nd
70°                3rd      350                           3rd
60°                4th      320                           4th
50°                5th      230                           5th
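The rank ordering can be sketched in a few lines of Python. This is only an illustration: it assumes the temperatures are 90 through 50 degrees, and the helper name `rank_descending` is hypothetical.

```python
def rank_descending(values):
    """Return each value's rank, where the largest value gets rank 1.

    Assumes no tied values, as in the example data.
    """
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


temps = [90, 80, 70, 60, 50]        # average daily temperature (degrees)
sales = [560, 480, 350, 320, 230]   # average daily ice cream sales

temp_ranks = rank_descending(temps)
sales_ranks = rank_descending(sales)
print(temp_ranks)                 # [1, 2, 3, 4, 5]
print(sales_ranks)                # [1, 2, 3, 4, 5]
print(temp_ranks == sales_ranks)  # True
```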
• Notice how their rank orders are identical. And because each pair of values also sits at a similar distance from its own mean, measured in standard deviations, these variables have a Pearson Correlation of approximately +1.0.
Meaning that higher values for one variable are associated with higher values for the other variable.
Or, equivalently, that lower values for one variable are associated with lower values for the other variable.
• What would a perfectly negative correlation (-1.0) look like?
Ave Daily Temp     Rank     Ave Daily Ice Cream Sales     Rank
90°                1st      230                           5th
80°                2nd      320                           4th
70°                3rd      350                           3rd
60°                4th      480                           2nd
50°                5th      560                           1st
Meaning that higher values for one variable are associated with lower values for the other variable.
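As a quick check, the sketch below computes the correlation for the sales column in both orders, assuming temperatures of 90 through 50 degrees. It uses the deviation-score form of the Pearson formula, which is algebraically equivalent to averaging z-score cross products. The function name `pearson_r` is a hypothetical helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation from deviation scores (values minus their mean)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


temps = [90, 80, 70, 60, 50]
sales_rising = [560, 480, 350, 320, 230]    # sales fall as temperature falls
sales_falling = [230, 320, 350, 480, 560]   # same values in reverse order

r_positive = pearson_r(temps, sales_rising)
r_negative = pearson_r(temps, sales_falling)
print(round(r_positive, 2))  # 0.99
print(round(r_negative, 2))  # -0.99
```

Reversing the order of one column flips the sign and leaves the strength unchanged. The results fall just short of ±1.0 because the points do not lie exactly on a straight line.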
• What would a zero correlation (0.0) look like?
[Figure: Ave Daily Temp and Ave Daily Ice Cream Sales shown with their rank orders]
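For a concrete picture of a zero correlation, here is a small Python sketch. The sales figures in it are hypothetical, made up for illustration and not taken from the slides. They were chosen so that hot days and cool days are paired with the same sales values, which means temperature tells you nothing about sales.

```python
import math

temps = [90, 80, 70, 60, 50]
# Hypothetical sales figures (not from the slides), chosen so that high and
# low temperatures are paired with the same sales values.
sales = [320, 480, 230, 480, 320]

n = len(temps)
mean_t = sum(temps) / n
mean_s = sum(sales) / n

# Sum of the products of each pair's deviations from its mean.
cross = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps, sales))
spread = math.sqrt(
    sum((t - mean_t) ** 2 for t in temps) * sum((s - mean_s) ** 2 for s in sales)
)
r = cross / spread
print(r)  # 0.0
```

The positive and negative products cancel exactly, so the correlation is zero even though the sales values themselves vary a good deal.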
• Note: the Pearson Correlation is not just a comparison of rank-ordered data (that is what Spearman's rank correlation does), but rank order is one factor that it reflects. Another factor is how similar each pair of values is in its distance from its own mean, measured in standard deviations (z-scores).
• The Pearson Product Moment Correlation (PPMC) is calculated as the average cross product of the z-scores of two variables for a single group of people. Here is the equation for the PPMC:
$$r = \frac{\sum (Z_X \cdot Z_Y)}{n}$$
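The formula is written out in full below. The symbols $\bar{X}$, $\bar{Y}$, $\sigma_X$ and $\sigma_Y$ are notation assumed here for the means and standard deviations. The sketch also assumes the population standard deviation (dividing by $n$), which matches the 14.1 used for temperature later in this module.

```latex
Z_X = \frac{X - \bar{X}}{\sigma_X}, \qquad
Z_Y = \frac{Y - \bar{Y}}{\sigma_Y}, \qquad
r = \frac{\sum (Z_X \cdot Z_Y)}{n}
  = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n \, \sigma_X \, \sigma_Y}
```

The last form shows that $r$ is the average product of the paired deviations, rescaled by both standard deviations so that the result has no units.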
• Let’s calculate the Pearson Correlation for the following data set:
Ave Daily Temp     Ave Daily Ice Cream Sales
90°                560
80°                480
70°                350
60°                320
50°                230
• It is very important to note that the Pearson Correlation can be computed in a matter of seconds using statistical software. The next set of slides is designed to help you see what is happening conceptually as well as computationally with the Pearson Correlation.
• When computing a Pearson Correlation you will normally have two variables that DO NOT USE THE SAME METRIC:
Ave Daily Temp     Ave Daily Ice Cream Sales
90°                560
80°                480
70°                350
60°                320
50°                230
The metric for Ave Daily Temp is degrees.
The metric for Ave Daily Ice Cream Sales is the number of ice cream sales.
• So we have to get these two variables on the same metric. This is done by calculating the z scores or standardized scores for the values from each variable.
• So these raw score values in separate metrics are transformed into standardized values, which puts them on the same metric:
Different Metric (raw scores)                    Same Metric (z or standard scores)
Ave Daily Temp     Ave Daily Ice Cream Sales     Ave Daily Temp     Ave Daily Ice Cream Sales
90°                560                           +1.4               +1.5
80°                480                           +0.7               +0.8
70°                350                            0.0               -0.3
60°                320                           -0.7               -0.6
50°                230                           -1.4               -1.3
• Note: this is done by subtracting the mean from each value (e.g., 90 minus 70 = 20) and dividing the result by the standard deviation (e.g., 20 / 14.1 = 1.4). The 14.1 is the population standard deviation, which divides by the number of observations.
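The standardization step can be sketched in Python as follows, assuming temperatures of 90 through 50 degrees and the population standard deviation (`statistics.pstdev`). The helper name `z_scores` is hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values using the population standard deviation."""
    m = mean(values)
    sd = pstdev(values)  # divides by n, matching the 14.1 used for temperature
    return [(v - m) / sd for v in values]


temps = [90, 80, 70, 60, 50]
sales = [560, 480, 350, 320, 230]

print([round(z, 1) for z in z_scores(temps)])  # [1.4, 0.7, 0.0, -0.7, -1.4]
print([round(z, 1) for z in z_scores(sales)])  # [1.5, 0.8, -0.3, -0.6, -1.3]
```

After this step both columns are expressed in standard deviation units, so degrees and sales counts can be compared directly.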
• Once the values are standardized we multiply them:
$$r = \frac{\sum (Z_X \cdot Z_Y)}{n}$$
Ave Daily Temp (z)     Ave Daily Ice Cream Sales (z)     Cross Products
+1.4                ×  +1.5                           =  2.1
+0.7                ×  +0.8                           =  0.6
 0.0                ×  -0.3                           =  0.0
-0.7                ×  -0.6                           =  0.4
-1.4                ×  -1.3                           =  1.9
These are called cross products because we are multiplying across two values. (They are computed from the unrounded z-scores, which is why the last row shows 1.9 rather than 1.8.)
Then we sum the cross products: 2.1 + 0.6 + 0.0 + 0.4 + 1.9 = 5.0
• Finally, divide that number (5.0) by the number of observations (in this case 5):
$$r = \frac{\sum (Z_X \cdot Z_Y)}{n} = \frac{5.0}{5} = +1.0$$
This is the Pearson Correlation, which in this case is a perfect positive relationship once the cross products are rounded to one decimal place. Without rounding, r is approximately +0.99.
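The whole calculation (standardize, multiply across, sum, divide by n) can be sketched end to end in Python. As before, the temperatures are assumed to be 90 through 50 degrees, the population standard deviation is used, and `z_scores` is a hypothetical helper name.

```python
from statistics import mean, pstdev

temps = [90, 80, 70, 60, 50]
sales = [560, 480, 350, 320, 230]


def z_scores(values):
    """Standardize values using the population standard deviation."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


z_temp = z_scores(temps)
z_sales = z_scores(sales)

# Multiply across each pair of z-scores to get the cross products.
cross_products = [zx * zy for zx, zy in zip(z_temp, z_sales)]
print([round(cp, 1) for cp in cross_products])

# Average the cross products: sum them and divide by the number of observations.
r = sum(cross_products) / len(cross_products)
print(round(r, 2))  # 0.99
```

Rounded to one decimal place, the cross products sum to 5.0 and give r = +1.0. Carrying full precision gives about 0.99.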
• In summary:
• The Pearson Product Moment Correlation can range from -1 through 0 to +1.
-1          0          +1
• A correlation of 0 indicates no linear association between the variables of interest.
• The direction (positive or negative) simply indicates a positive or negative (inverse) relationship between the variables.
• If POSITIVE, when values increase on one variable, they tend to increase on the other variable.
Variable 1     Variable 2
10             5
9              4
8              3
7              2
Pearson Correlation = +1.0
• If NEGATIVE, when values increase on one variable, they tend to decrease on the other variable.
Variable 1     Variable 2
10             2
9              3
8              4
7              5
Pearson Correlation = -1.0
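Both summary tables lie exactly on a straight line, so they should give exactly +1.0 and -1.0. Here is a short Python check using the average cross product of z-scores. The function name `pearson_r` is a hypothetical helper.

```python
import math


def pearson_r(xs, ys):
    """Average cross product of z-scores (population standard deviations)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ) / n


variable_1 = [10, 9, 8, 7]
r_up = pearson_r(variable_1, [5, 4, 3, 2])    # Variable 2 rises with Variable 1
r_down = pearson_r(variable_1, [2, 3, 4, 5])  # Variable 2 falls as Variable 1 rises
print(round(r_up, 6), round(r_down, 6))  # 1.0 -1.0
```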
• The strength of the relationship depends on the size of the decimal value, that is, how far it lies from 0 in either direction.
-1          0          +1
For example, a value of 0.2 is weak and a value of 0.8 is strong.
• There is a tendency to interpret the Pearson Product Moment Correlation with causal language, as though changes in one variable cause changes in the other.
• Whether to interpret the Pearson Product Moment Correlation as prediction or causation depends on the nature of the research design rather than the nature of the statistic.
• First, analyze the nature of the research design before interpreting the Pearson Product Moment Correlation with causal or prediction language.
• So, if your research question is focused on the relationship between two continuous variables, the Pearson Product Moment Correlation would be the appropriate statistical method to use.