10. Causality and Correlation

ECON 251

Research Methods


Example 1

A strong correlation has been found in a certain city in the northeastern United States between weekly sales of hot chocolate and weekly sales of facial tissues.

Would you interpret that to mean that hot chocolate causes people to need facial tissues? Explain.


Example 2

Researchers found a correlation of 0.86 between the number of churchgoers and the number of burglaries committed in different towns.

Explanation?
• More churchgoers means more empty houses
• Attending church makes people want to rob

Common Third Cause:
• ________


Example 3

Researchers have shown that there is a positive correlation between the average fat intake and the breast cancer rate across countries. In other words, countries with higher fat intake tend to have higher breast cancer rates.

Does this correlation prove that dietary fat is a contributing cause of breast cancer? Explain.


Example 4

If you were to draw a scatterplot of number of women in the work force versus number of Christmas trees sold in the United States for each year between 1930 and the present, you would find a very strong correlation.

Why do you think this would be true? Does one cause the other?


Example 5

Explain this cartoon in terms of correlation and causation.


Causation vs. Association

Some studies want to find the existence of causation.

Example of causation:
• Increased drinking of alcohol causes a decrease in coordination.
• Smoking and lung cancer.

Example of association:
• High SAT scores are associated with a high freshman-year GPA.
• Smoking and lung cancer.


Explaining Associations

Some possible explanations for an observed association. The dashed lines show an association. The solid arrows show a cause-and-effect link. x is explanatory, y is response, and z is a lurking variable.


Reasons Two Variables Could Be Related:

1. Explanatory variable is the direct cause of the response variable.
• Example: Amount of food consumed in past hour and level of hunger.

2. Response variable is causing a change in the explanatory variable.
• Example: In a study in Resource Manual, it was noted that divorced men were twice as likely to abuse alcohol as married men. The authors concluded that getting divorced caused alcohol abuse. But it is just as reasonable to assume that alcohol abuse causes divorce.


Reasons Two Variables Could Be Related:

3. Explanatory variable is a contributing but not sole cause of the response variable.

• Example: Carcinogen in diet is not sole cause of cancer, but rather a necessary contributor to it.

4. Confounding variables may exist.
• A confounding variable is related to the explanatory variable and affects the response variable. So we can’t determine how much of the change is due to the explanatory variable and how much is due to the confounding variable(s).

• Example: Consider the relationship between hours studied per day and grade point average. Studying increases grade point average, but it is also reasonable that a desire to do well in school means that a person studies more and that their grade point average is high.


Confounding

Two variables are confounded when their effects on a response variable cannot be distinguished from each other. The confounded variables may be either explanatory variables or lurking variables.
• Example: Studies have found that religious people live longer than nonreligious people. Religious people also take better care of themselves and are less likely to smoke or be overweight.


Lurking Variables

Lurking variables can create nonsense correlations.

For the world’s nations, let x be the number of TVs/person and y be the average life expectancy:
• A high positive correlation
• Nations with more TV sets have higher life expectancies.
• Could we lengthen the lives of people in Rwanda by shipping them more TVs?

Lurking variable: wealth of the nation
• Rich nations: more TV sets.
• Rich nations: longer life expectancies because of better nutrition, clean water, and better health care.
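
The TV example can be imitated with made-up numbers. The sketch below is only an illustration under stated assumptions: the 500 synthetic "nations", every coefficient, and the helper names `pearson` and `residuals` are invented here and are not real data. Wealth z drives both x (TVs per person) and y (life expectancy), and x never appears in the formula for y.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, zs):
    """What is left of ys after subtracting a least-squares line fitted on zs."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [y - (my + slope * (z - mz)) for z, y in zip(zs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: wealth is the lurking variable z.
wealth = [rng.uniform(0, 10) for _ in range(500)]
tvs = [0.1 * z + rng.gauss(0, 0.1) for z in wealth]      # x depends on z only
life = [50 + 3 * z + rng.gauss(0, 3) for z in wealth]    # y depends on z only

r_raw = pearson(tvs, life)
# Remove the part of x and of y that wealth explains, then correlate the rest.
r_given_wealth = pearson(residuals(tvs, wealth), residuals(life, wealth))

print(f"r(TVs, life expectancy)           = {r_raw:.2f}")
print(f"r after removing wealth from both = {r_given_wealth:.2f}")
```

In this setup the raw correlation is strong while the correlation between the two sets of residuals is close to zero, which is the pattern expected when z alone produces the association.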


Lurking Variables

Examples:

Students who use tutors have lower test scores than students who don’t.
• Lurking variable: ________

Negative association between moderate amounts of wine drinking and death rates from heart disease in developed nations.
• Lurking variable: ________

Number of churches and number of bars
• Lurking variable: ________

Lurking variables can create nonsense (false) correlations!


Lurking Variables

How to spot the presence of lurking variables?
• In general difficult.
• Many lurking variables change systematically over time.

Plot both the response variable and the residuals against the time order of the observations whenever possible.
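
A rough numerical stand-in for the residuals-against-time plot is sketched below. Everything in it is an assumption made for illustration: the data are synthetic, the lurking variable is taken to grow by 0.05 per observation, and `fit_line` is a helper written here. Instead of drawing the plot, the sketch fits a line to the residuals in time order and reports its slope.

```python
import random

def fit_line(xs, ys):
    """Least-squares (intercept, slope) for ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 200
time_order = list(range(n))

# Hypothetical data: x is unrelated to time, but an unrecorded
# variable drifts upward over time and also feeds into y.
x = [rng.uniform(0, 10) for _ in range(n)]
lurking = [0.05 * t for t in time_order]
y = [2 + 1.5 * xi + zi + rng.gauss(0, 1) for xi, zi in zip(x, lurking)]

# Regress y on x alone, as an analyst unaware of the lurking variable would.
intercept, slope = fit_line(x, y)
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Residuals are uncorrelated with x by construction of least squares,
# so only the time ordering can reveal the missing variable.
_, resid_vs_x = fit_line(x, resid)
_, resid_trend = fit_line(time_order, resid)

print(f"slope of residuals against x:          {resid_vs_x:.4f}")
print(f"slope of residuals against time order: {resid_trend:.4f}")
```

A residual plot against x would look patternless here, while the residuals drift steadily upward in time order, which is why the time plot is worth making.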


Reasons Two Variables Could Be Related:

5. Both variables may result from a common cause.
• Example: Students who have high SAT scores in high school have high GPAs in their first year of college.
• This positive correlation can be explained as a common response to students’ ability and knowledge.

The observed association between two variables x and y could be explained by a third lurking variable z. Both x and y change in response to changes in z. This creates an association even though there is no direct causal link.


Common Response

“There is a strong positive correlation between the number of firefighters at a fire and the amount of damage the fire does. So sending lots of firefighters just causes more damage.”

What is the lurking variable?
a) Number of firefighters
b) Amount of damage
c) How large the fire is.
d) If the fire is close to the fire station.


Reasons Two Variables Could Be Related:

6. Both variables are changing over time.
• Nonsensical associations result from correlating two variables that have both changed over time.
• Example: The number of divorces and the number of suicides have both increased dramatically since 1900. This does not mean that divorces are causing suicides. All such statistics increase as the population increases.

7. Association may be nothing more than coincidence.
• Association is a coincidence, even though the odds of it happening appear to be very small.
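
The point in reason 6 that counts rise simply because the population rises can be sketched with invented numbers. The population series, the two per-capita rates, and the names `count_a` and `count_b` below are all hypothetical; they stand in for any two yearly totals and are not actual divorce or suicide figures.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2)  # fixed seed so the run is repeatable
n_years = 101

# Hypothetical population that grows steadily every year.
population = [100 + 2 * t for t in range(n_years)]

# Two unrelated per-capita rates that only wobble around a constant.
rate_a = [4.0 + rng.gauss(0, 0.3) for _ in range(n_years)]
rate_b = [0.12 + rng.gauss(0, 0.01) for _ in range(n_years)]

# Yearly totals are rate times population, so both totals grow over time.
count_a = [r * p for r, p in zip(rate_a, population)]
count_b = [r * p for r, p in zip(rate_b, population)]

r_counts = pearson(count_a, count_b)
r_rates = pearson(rate_a, rate_b)

print(f"correlation of yearly totals:    {r_counts:.2f}")
print(f"correlation of per-capita rates: {r_rates:.2f}")
```

The totals are strongly correlated only because both follow the population upward. Dividing by population, one simple adjustment among several possible ones, removes the shared trend and most of the correlation with it.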


Simpson’s Paradox

Simpson’s paradox is a severe form of confounding in which there is a reversal in the direction of an association caused by a lurking variable.

Overall direction of association: _________

But when we color different habitats in different colors, the data is separated by a lurking variable (different habitats) into a series of ______ linear associations.


Simpson’s Paradox

Is acceptance into a college (response variable) predicted by gender (explanatory variable)?

Consider these data:

Proportions accepted by gender:
• Male success rate = 198 / 360 = 0.55
• Female success rate = 88 / 200 = 0.44

Conclude: males were accepted at a _______ rate than females.


Simpson’s Paradox

Broken down according to the lurking variable "major…"

Male proportion = 18 / 120 = 0.15

Female proportion = 24 / 120 = 0.20

Therefore: males were accepted at a _____ rate than females.

Male proportion = 180 / 240 = 0.75

Female proportion = 64 / 80 = 0.80

Therefore: males were accepted at a _______ rate than females.
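
The arithmetic on these two slides can be checked in a few lines. The counts below are the ones quoted in the proportions above; the labels "major A" and "major B" are placeholders, since the transcript does not name the majors, and `rate` is a helper defined only for this sketch.

```python
# (accepted, applicants) for each group, as quoted on the slides.
# "major A" and "major B" are placeholder labels, not the real major names.
admissions = {
    "major A": {"male": (18, 120), "female": (24, 120)},
    "major B": {"male": (180, 240), "female": (64, 80)},
}

def rate(accepted, applicants):
    """Proportion of applicants who were accepted."""
    return accepted / applicants

# Comparison within each major.
for major, groups in admissions.items():
    m = rate(*groups["male"])
    f = rate(*groups["female"])
    print(f"{major}: male {m:.2f}, female {f:.2f}")

# Comparison after pooling the majors together.
pooled = {}
for sex in ("male", "female"):
    accepted = sum(groups[sex][0] for groups in admissions.values())
    applicants = sum(groups[sex][1] for groups in admissions.values())
    pooled[sex] = rate(accepted, applicants)
print(f"pooled:  male {pooled['male']:.2f}, female {pooled['female']:.2f}")
```

The pooled rate for each gender is a weighted average of that gender's per-major rates, weighted by where its applicants applied: 240 of the 360 men applied to the major with the higher acceptance rate, compared with 80 of the 200 women. That difference in weights is what lets the direction of the comparison change between the pooled and the per-major tables.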


Evidence for Causation

Evidence of a possible causal connection
• The association is strong (high r value)
• The association is consistent (the association can be found in several studies of different subjects)
• Higher doses are associated with stronger responses
• The alleged cause precedes the effect in time
• The alleged cause is plausible (storks do not bring babies)

Other things to keep in mind: Data from an observational study in the absence of any other evidence cannot be used to establish causation.


Summary

Association does not imply causation!

Correlation and regression can be misleading if you ignore important lurking variables.

A correlation based on averages is usually higher than if we had data for individuals (Simpson’s paradox).

Do not use a regression on inappropriate data.
• Pattern in the residuals
• Presence of large outliers
• Clumped data falsely appearing linear

A relationship, however strong, does not itself imply causation.

Use residual plots for help.