1
Quantitative Methods II
Dummy Variables & Interaction Effects
Edmund Malesky, Ph.D., UCSD
2
The Homogeneity Assumption
OLS assumes all cases in your data are comparable
x’s are a sample drawn from a single population
But we may analyze distinct groups of cases together in one analysis
Mean value of y may differ by group
3
Qualitative Variables
These group effects remain as part of the error term
If groups differ in their distribution of x’s, then we get a correlation between the X variables and the error term
Violates assumption: $\mathrm{cov}(x_i, u_i) = E(u) = 0$
Result: Omitted Variable Bias!
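A short derivation can make the source of the bias explicit. This is a sketch under stated assumptions: the group indicator $G_i$, its coefficient $\delta$, the remaining error $e_i$ (assumed uncorrelated with $x_i$), and the estimator $\tilde{\beta}_1$ from the regression that omits $G_i$ are all symbols introduced here for illustration.

```latex
\begin{align*}
y_i &= \beta_0 + \beta_1 x_i + \delta G_i + e_i && \text{(true model, with group effect)} \\
y_i &= \beta_0 + \beta_1 x_i + u_i, \quad u_i = \delta G_i + e_i && \text{(estimated model, group effect left in the error)} \\
\mathrm{cov}(x_i, u_i) &= \delta \, \mathrm{cov}(x_i, G_i) && \text{(nonzero if the groups differ in } x\text{)} \\
\mathrm{plim}\, \tilde{\beta}_1 &= \beta_1 + \delta \, \frac{\mathrm{cov}(x_i, G_i)}{\mathrm{var}(x_i)} && \text{(omitted variable bias)}
\end{align*}
```

The bias vanishes only if the groups have the same mean of y ($\delta = 0$) or the same distribution of x ($\mathrm{cov}(x_i, G_i) = 0$).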
4
Testing for Differences Across Groups (p. 249-252)
The Chow Test
E.g., testing for a difference between males and females in academic performance.
$SSR_1$ = males only; $SSR_2$ = females only; $SSR_{ur} = SSR_1 + SSR_2$; $SSR_P = SSR_r$ = pooling across both groups
$$F = \frac{[SSR_P - (SSR_1 + SSR_2)]}{SSR_1 + SSR_2} \cdot \frac{[n - 2(k+1)]}{k+1}$$
The Chow Test:
1. Is only valid under homoskedasticity (the error variance for the two groups must be equal).
2. The null hypothesis is that there is no difference at all, in either the intercept or the slopes, between the two groups.
3. This may be too restrictive. In these cases, we should use dummy variables and dummy interactions, which allow us to estimate different slopes and intercepts for the two groups.
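The Chow statistic can be computed directly from three sums of squared residuals. The following is a minimal sketch, not the course's own code: it assumes NumPy is available, and the data (study hours, scores, a female indicator) are simulated and entirely hypothetical.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X (X already contains a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def chow_f(X, y, in_group1):
    """Chow F statistic; in_group1 is a boolean mask, X holds k regressors plus a constant."""
    n, n_params = X.shape                       # n_params = k + 1
    ssr_pooled = ssr(X, y)                      # restricted: one equation for everyone
    ssr_unrestricted = ssr(X[in_group1], y[in_group1]) + ssr(X[~in_group1], y[~in_group1])
    return ((ssr_pooled - ssr_unrestricted) / ssr_unrestricted) * ((n - 2 * n_params) / n_params)

# Hypothetical simulated data: academic score as a function of study hours.
rng = np.random.default_rng(0)
n = 200
female = np.arange(n) % 2 == 0
study_hours = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), study_hours])
noise = rng.normal(0, 1, n)

score_same = 2.0 + 0.1 * study_hours + noise                   # same equation for both groups
score_diff = score_same + female * (0.5 + 0.2 * study_hours)   # different intercept and slope

f_same = chow_f(X, score_same, female)
f_diff = chow_f(X, score_diff, female)
print(f"F when groups share one equation: {f_same:.2f}")
print(f"F when intercept and slope differ: {f_diff:.2f}")
```

Under the null, the statistic is compared with an F distribution with k+1 and n-2(k+1) degrees of freedom; here k = 1.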
5
Example: Democracy & Tariffs
Here we see that democracies have lower tariffs
Here we see that states in Regional Trading Arrangements (RTA’s) have lower tariffs
[Figure: Percent Tariffs by regime type (Dictator, Oligarch, Anocracy, Democracy), pooled data]
[Figure: Percent Tariffs by regime type (Dictator, Oligarch, Anocracy, Democracy), with separate lines for RTA, No RTA, and pooled data]
But if Democracies are more likely to be in RTA’s, then pooling RTA and non-RTA
states biases the coefficient
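A small numerical sketch of this pooling bias, assuming NumPy is available. The sixteen cases below are hypothetical and noiseless, built so that RTA membership rises with democracy; the generating coefficients (50, -5, -10) are illustrative assumptions.

```python
import numpy as np

# Hypothetical data: DEM coded 1-4, and more democratic states are more often in an RTA.
dem = np.repeat([1.0, 2.0, 3.0, 4.0], 4)
rta = np.array([0, 0, 0, 1,
                0, 0, 1, 1,
                0, 1, 1, 1,
                1, 1, 1, 1], dtype=float)
tariff = 50 - 5 * dem - 10 * rta          # assumed "true" relationship, no noise

def ols(columns, y):
    """OLS coefficients with a constant prepended; columns is a list of 1-D arrays."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

pooled = ols([dem], tariff)               # RTA omitted: its effect sits in the error term
controlled = ols([dem, rta], tariff)      # RTA specified as an x

print("DEM slope, pooled:        ", round(pooled[1], 3))
print("DEM slope, RTA controlled:", round(controlled[1], 3))
```

In this construction the pooled slope overstates the effect of democracy, because part of the RTA effect is attributed to DEM.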
6
Solution: The Qualitative Variable
Measure this group difference (RTA vs. Non-RTA) and specify it as an x
This eliminates bias
But we have no numerical scale to measure RTA’s
Create a categorical variable that captures this group difference
7
The Qualitative “Dummy”
Create a variable that equals 1 when a case is part of a group, 0 otherwise
This variable creates a new intercept for the cases in the group marked by the dummy
Specifically, how would we interpret:
$$\text{TARIFF} = \beta_0 + \beta_1 \text{DEM} + \beta_2 \text{RTA} + u$$
8
Democracy and Tariff Barriers
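[Figure: Percent Tariffs by regime type (Dictator, Oligarch, Anocracy, Democracy), with separate lines for RTA and No RTA]
$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$$\hat{\beta}_0 = 50 \text{ and } \hat{\beta}_1 = -5 \text{ and } \hat{\beta}_2 = -10$$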
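To see what these estimates imply, the fitted values can be tabulated for each regime type. This is a sketch under two assumptions: that DEM is coded 1 (Dictator) through 4 (Democracy), and that the DEM and RTA coefficients are negative (-5 and -10), in line with the claim that democracies and RTA members have lower tariffs.

```python
# Assumed estimates: intercept 50, DEM coefficient -5, RTA coefficient -10.
B0, B_DEM, B_RTA = 50.0, -5.0, -10.0

def predicted_tariff(dem, rta):
    """Fitted tariff for a democracy score (assumed 1-4) and an RTA dummy (0 or 1)."""
    return B0 + B_DEM * dem + B_RTA * rta

regimes = {"Dictator": 1, "Oligarch": 2, "Anocracy": 3, "Democracy": 4}  # assumed coding
for name, dem in regimes.items():
    no_rta = predicted_tariff(dem, 0)
    in_rta = predicted_tariff(dem, 1)
    print(f"{name:10s} No RTA: {no_rta:5.1f}   RTA: {in_rta:5.1f}   gap: {in_rta - no_rta:6.1f}")
```

The gap between the two lines is the same at every level of DEM: the dummy shifts the intercept and leaves the slope unchanged.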
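9
Graphical Depiction of a Dummy
[Figure: y plotted against $x_1$ (could be continuous, categorical, or dichotomous); two lines with the same slope $\hat{\beta}_1$, one with intercept $\hat{\beta}_0$ (when $x_2 = 0$) and one with intercept $\hat{\beta}_0 + \hat{\beta}_2$ (when $x_2 = 1$)]
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 \quad \text{if } x_2 = 1$$
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 = \hat{\beta}_0 + \hat{\beta}_1 x_1 \quad \text{if } x_2 = 0$$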
10
Multiple Category Dummies
Dummy variables are a very flexible way to assess categorical differences in the mean of y
We can use dummies even for concepts with multiple categories
Imagine we want to capture the impact of global region on tariffs
Regions: Americas, Europe, Asia, Africa
11
Warning!
Do not fall into the dummy variable trap!
The trap: entering both values of a dummy variable (e.g., RTA and Non-RTA) in the same regression along with the constant. The two variables sum to the constant, so they are linearly dependent. One will drop out.
12
Multiple Category Dummies
Create 4 separate dummy variables - 1 for each region
Include all except one of these dummies in the equation
If you include all 4 dummies you get perfect collinearity with the constant. The fourth dummy will drop out.
Americas+Europe+Asia+Africa=1
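The collinearity can be checked numerically by looking at the rank of the design matrix. This is a minimal sketch assuming NumPy is available; the eight cases and their region assignments are hypothetical.

```python
import numpy as np

regions = ["Americas", "Europe", "Asia", "Africa"]
region_of_case = np.array([0, 1, 2, 3, 0, 1, 2, 3])    # hypothetical region index per case

# One 0/1 column per region.
dummies = (region_of_case[:, None] == np.arange(len(regions))).astype(float)
constant = np.ones((len(region_of_case), 1))

X_trap = np.hstack([constant, dummies])          # constant + all 4 dummies
X_ok = np.hstack([constant, dummies[:, 1:]])     # constant + 3 dummies (Americas excluded)

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)
print("All four dummies:", X_trap.shape[1], "columns, rank", rank_trap)
print("One excluded:    ", X_ok.shape[1], "columns, rank", rank_ok)
```

With all four dummies the matrix has five columns but only rank four, so the coefficients are not separately identified; excluding one category restores full column rank.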
13
Interpreting Multi-Category Dummies
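Each coefficient compares the mean for that group to the mean in the excluded category
Thus if:
$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{EUR} + \hat{\beta}_3 \text{ASIA} + \hat{\beta}_4 \text{AFR} + \hat{u}$$
$\hat{\beta}_2$ through $\hat{\beta}_4$ compare the mean tariff in each region to the mean in the Americas, holding DEM constant
Mean in the Americas (when DEM = 0) is $\hat{\beta}_0$
An alternative strategy is to drop the constant and run all dummies, as discussed last week.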
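Both strategies can be compared on a toy example. The sketch below assumes NumPy is available, uses eight hypothetical tariff observations, and leaves DEM out of the equation for simplicity, so that the coefficients reduce to plain group means and differences in means.

```python
import numpy as np

# Hypothetical data: two countries per region.
tariff = np.array([10.0, 14.0, 20.0, 24.0, 30.0, 34.0, 40.0, 44.0])
region = np.array(["Americas", "Americas", "Europe", "Europe",
                   "Asia", "Asia", "Africa", "Africa"])

# Strategy 1: constant plus dummies for every region except the Americas.
included = ["Europe", "Asia", "Africa"]
X = np.column_stack([np.ones(len(tariff))] + [(region == r).astype(float) for r in included])
coef, *_ = np.linalg.lstsq(X, tariff, rcond=None)

# Strategy 2: no constant, all four dummies.
all_regions = ["Americas", "Europe", "Asia", "Africa"]
X_all = np.column_stack([(region == r).astype(float) for r in all_regions])
means, *_ = np.linalg.lstsq(X_all, tariff, rcond=None)

print("Constant (Americas mean):", round(coef[0], 2))
print("Differences from Americas:", dict(zip(included, np.round(coef[1:], 2))))
print("Region means, no constant:", dict(zip(all_regions, np.round(means, 2))))
```

The two fits describe the same group means; only the reference point of the coefficients changes.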
14
Dumb Dummies
Dummy variables are easy, flexible ways to measure categorical concepts
They CAN be just labels for ignorance
Try to use dummies to capture theoretical constructs, not empirical observations
If possible, measure the theoretical construct more directly
15
Interaction Effects
Dummy variables specify new intercepts
Other slope coefficients in the equation do not change
OLS assumes that the slopes of continuous variables are constant across all cases
What if slopes are different for different groups in our sample?
16
Interaction Effects: An Example
What if the effect of democracy on tariffs depends on whether the state is in an RTA?
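$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$$\hat{\beta}_1 = \hat{\alpha}_0 + \hat{\alpha}_1 \text{RTA}$$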
17
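Interaction Effects: An Illustration
(Notice that democracy has been converted to a dummy as well for illustration purposes)
[Figure: Percent Tariffs for Non-Dem vs. Democracy, with separate lines for RTA and No RTA]
$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$$\hat{\beta}_1 = -5 \quad \text{if RTA} = 0$$
$$\hat{\beta}_1 = -6 \quad \text{if RTA} = 1$$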
18
How Do We Estimate This Set of Relationships?
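We begin with:
$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$$\hat{\beta}_1 = \hat{\alpha}_0 + \hat{\alpha}_1 \text{RTA}$$
Substituting for $\hat{\beta}_1$, we get:
$$\text{TARIFF} = \hat{\beta}_0 + (\hat{\alpha}_0 + \hat{\alpha}_1 \text{RTA}) \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$$\text{TARIFF} = \hat{\beta}_0 + \hat{\alpha}_0 \text{DEM} + \hat{\alpha}_1 (\text{RTA} \times \text{DEM}) + \hat{\beta}_2 \text{RTA} + \hat{u}$$
In STATA, the coefficients on DEM, RTA × DEM, and RTA will appear as regular coefficients ($\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\beta}_3$)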
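In practice, estimating this model just means adding the product DEM × RTA as one more column. The sketch below assumes NumPy is available and uses hypothetical, noiseless data generated from assumed coefficients (30, -5, -1, -10), so that OLS recovers them exactly; the Python variable names are illustrative.

```python
import numpy as np

# Hypothetical cases: DEM and RTA are both 0/1 dummies here.
dem = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
rta = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)

# Assumed data-generating equation (no noise), chosen only for illustration.
tariff = 30 - 5 * dem - 1 * (dem * rta) - 10 * rta

# The interaction is simply the product of the two variables, entered as its own column.
dem_x_rta = dem * rta
X = np.column_stack([np.ones_like(dem), dem, dem_x_rta, rta])
b, *_ = np.linalg.lstsq(X, tariff, rcond=None)
b0, b1, b2, b3 = b

print("constant:", round(b0, 3))
print("DEM:     ", round(b1, 3), "(slope of DEM when RTA = 0)")
print("DEM*RTA: ", round(b2, 3), "(change in the DEM slope when RTA = 1)")
print("RTA:     ", round(b3, 3), "(shift in intercept when RTA = 1)")
```

The regression output lists the product term like any other regressor; its interpretation as a change in slope comes from the substitution, not from the software.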
19
What Do These Coefficients Mean?
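$$\text{TARIFF} = \hat{\beta}_0 + \hat{\alpha}_0 \text{DEM} + \hat{\alpha}_1 (\text{RTA} \times \text{DEM}) + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$\hat{\beta}_0$ is the intercept for DEM when RTA=0
$\hat{\beta}_0 + \hat{\beta}_2$ is the new intercept for DEM when RTA=1
$\hat{\alpha}_0$ is the slope of DEM when RTA=0
$\hat{\alpha}_1$ is the impact of RTA on the coefficient for DEM
So if RTA=1, the slope of DEM is $\hat{\alpha}_0 + \hat{\alpha}_1$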
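Writing the fitted equation separately for each value of RTA makes these readings concrete. In this sketch, $\hat{\alpha}_0$ and $\hat{\alpha}_1$ denote the two components of the DEM slope (slope when RTA = 0, and its change when RTA = 1); the choice of the letter $\alpha$ is a notational assumption.

```latex
\begin{align*}
\text{RTA} = 0: \quad \text{TARIFF} &= \hat{\beta}_0 + \hat{\alpha}_0 \,\text{DEM} + \hat{u} \\
\text{RTA} = 1: \quad \text{TARIFF} &= (\hat{\beta}_0 + \hat{\beta}_2) + (\hat{\alpha}_0 + \hat{\alpha}_1)\,\text{DEM} + \hat{u}
\end{align*}
```

Each group gets its own intercept and its own slope from a single pooled regression.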
20
Interpreting the Interaction
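Recall that:
$$\text{TARIFF} = \hat{\beta}_0 + \hat{\beta}_1 \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$$\hat{\beta}_1 = \hat{\alpha}_0 + \hat{\alpha}_1 \text{RTA}$$
$$\text{TARIFF} = \hat{\beta}_0 + (\hat{\alpha}_0 + \hat{\alpha}_1 \text{RTA}) \text{DEM} + \hat{\beta}_2 \text{RTA} + \hat{u}$$
$$\text{TARIFF} = \hat{\beta}_0 + \hat{\alpha}_0 \text{DEM} + \hat{\alpha}_1 (\text{RTA} \times \text{DEM}) + \hat{\beta}_2 \text{RTA} + \hat{u}$$
RTA is a dummy variable taking on the values 0 or 1
Thus if RTA=0, then $\hat{\beta}_1 = \hat{\alpha}_0$
But if RTA=1, then $\hat{\beta}_1 = \hat{\alpha}_0 + \hat{\alpha}_1$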
21
An Illustration of the Coefficients
Imagine we estimate:
$$\text{TARIFF} = 30 - 5(\text{DEM}) - 1(\text{DEM} \times \text{RTA}) - 10(\text{RTA})$$
[Figure: Percent Tariffs for Non-Dem vs. Democracy, with separate lines for RTA and No RTA]
22
Substantive Effects of Dummy Interactions
                No RTA                                  RTA
Non-Democracy   $\hat{\beta}_0 = 30$                    $\hat{\beta}_0 + \hat{\beta}_3 = 20$
Democracy       $\hat{\beta}_0 + \hat{\beta}_1 = 25$    $\hat{\beta}_0 + \hat{\beta}_1 + \hat{\beta}_2 + \hat{\beta}_3 = 14$
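The four cells can be reproduced by plugging each combination of dummies into the estimated equation. This sketch takes the coefficient values implied by the cells (30, -5, -1, -10, in the order constant, DEM, DEM × RTA, RTA); the function and variable names are illustrative.

```python
# Coefficients implied by the four cells: constant, DEM, DEM*RTA, RTA.
b0, b1, b2, b3 = 30, -5, -1, -10

def predicted(dem, rta):
    """Predicted tariff for dummy values dem and rta (each 0 or 1)."""
    return b0 + b1 * dem + b2 * dem * rta + b3 * rta

cells = {(dem, rta): predicted(dem, rta) for dem in (0, 1) for rta in (0, 1)}
for (dem, rta), value in cells.items():
    label = ("Democracy" if dem else "Non-Democracy", "RTA" if rta else "No RTA")
    print(f"{label[0]:14s} {label[1]:7s} {value}")

# Effect of democracy within each column of the table.
effect_no_rta = cells[(1, 0)] - cells[(0, 0)]
effect_rta = cells[(1, 1)] - cells[(0, 1)]
print("Effect of democracy without RTA:", effect_no_rta)
print("Effect of democracy with RTA:   ", effect_rta)
```

The effect of democracy differs by one point between the two columns, which is exactly the interaction coefficient.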
23
Interactions with Continuous Variables
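The exact same logic about interactions applies if $\hat{\beta}_1$ depends on a continuous variable
$$y = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{u}$$
$$\hat{\beta}_1 = \hat{\alpha}_0 + \hat{\alpha}_1 x_2$$
$\hat{\alpha}_0$ is the impact of $x_1$ when $x_2 = 0$
$\hat{\alpha}_1$ is the change in $\hat{\beta}_1$ for each one unit increase in $x_2$
$\hat{\beta}_2$ is the impact of $x_2$ when $x_1 = 0$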
24
Example: Democracy, Tariffs & Unemployment
$$\text{TARIFF} = 28 - 2(\text{DEM}) - 1(\text{DEM} \times \text{UNEMP}) + 5(\text{UNEMP})$$
[Figure: predicted Tariff Rate by Democracy (1-4: Dictator, Oligarch, Anocrat, Democracy), with separate lines for Unemployment = 0, 2, 4, and 6]
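With a continuous moderator, the slope of DEM takes a different value at every level of unemployment. The sketch below tabulates it at the four unemployment levels shown in the figure. The signs of the coefficients are an assumption (-2 on DEM, -1 on DEM × UNEMP, +5 on UNEMP), chosen because they keep predicted tariffs within the plotted range; DEM is assumed to be coded 1 to 4.

```python
# Assumed coefficients: constant 28, DEM -2, DEM*UNEMP -1, UNEMP +5 (signs are an assumption).
B0, B_DEM, B_INT, B_UNEMP = 28, -2, -1, 5

def predicted_tariff(dem, unemp):
    """Predicted tariff for a democracy score (assumed 1-4) and an unemployment level."""
    return B0 + B_DEM * dem + B_INT * dem * unemp + B_UNEMP * unemp

def dem_slope(unemp):
    """Change in predicted tariff per one-unit increase in DEM, at a given unemployment level."""
    return B_DEM + B_INT * unemp

slopes = {unemp: dem_slope(unemp) for unemp in (0, 2, 4, 6)}
for unemp, slope in slopes.items():
    low, high = predicted_tariff(1, unemp), predicted_tariff(4, unemp)
    print(f"UNEMP = {unemp}: DEM slope = {slope:3d}, tariff from {low} (DEM=1) to {high} (DEM=4)")
```

Under these assumed signs, democracy reduces tariffs more steeply as unemployment rises.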
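25
Graphical Depiction of a Dummy/Continuous Interaction
[Figure: y plotted against $x_1$ (could be continuous, categorical, or dichotomous); the line for $x_2 = 0$ has intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1 = \hat{\alpha}_0$; the line for $x_2 = 1$ has intercept $\hat{\beta}_0 + \hat{\beta}_3$ and slope $\hat{\beta}_1 = \hat{\alpha}_0 + \hat{\alpha}_1$. Line labels in the figure:]
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 \quad \text{if } x_2 = 1$$
$$\hat{y} = \hat{\beta}_0 + \hat{\alpha}_0 x_1 + \hat{\alpha}_1 (x_1 \times x_2) + \hat{\beta}_3 x_2 \quad \text{if } x_2 = 1$$
$$\hat{y} = \hat{\beta}_0 + \hat{\alpha}_0 x_1 + \hat{\alpha}_1 (x_1 \times x_2) + \hat{\beta}_3 x_2 \quad \text{if } x_2 = 0$$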
26
What if a Variable Interacts with Itself?
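What if $\hat{\beta}_1$ depends on the value of $x_1$?
$$y = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{u}$$
$$\hat{\beta}_1 = \hat{\alpha}_0 + \hat{\alpha}_1 x_1$$
Then we substitute in as before:
$$y = \hat{\beta}_0 + (\hat{\alpha}_0 + \hat{\alpha}_1 x_1) x_1 + \hat{\beta}_2 x_2 + \hat{u}$$
$$y = \hat{\beta}_0 + \hat{\alpha}_0 x_1 + \hat{\alpha}_1 x_1^2 + \hat{\beta}_2 x_2 + \hat{u}$$
Curvilinear (Quadratic) effect is a type of interaction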
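One consequence worth spelling out: once $x_1$ appears squared, the effect of a small change in $x_1$ on the fitted value is the derivative of the whole expression, which is not the same as the coefficient function $\hat{\alpha}_0 + \hat{\alpha}_1 x_1$. The sketch below uses $\hat{\alpha}_0$ and $\hat{\alpha}_1$ for the linear and squared-term coefficients (the letter $\alpha$ is a notational assumption).

```latex
\begin{align*}
\hat{y} &= \hat{\beta}_0 + \hat{\alpha}_0 x_1 + \hat{\alpha}_1 x_1^2 + \hat{\beta}_2 x_2 \\
\frac{\partial \hat{y}}{\partial x_1} &= \hat{\alpha}_0 + 2\hat{\alpha}_1 x_1 \\
x_1^{*} &= -\frac{\hat{\alpha}_0}{2\hat{\alpha}_1} \quad \text{(turning point, where the marginal effect is zero, for } \hat{\alpha}_1 \neq 0\text{)}
\end{align*}
```

If $\hat{\alpha}_0$ and $\hat{\alpha}_1$ have opposite signs, the turning point lies at a positive value of $x_1$ and the curve changes direction there.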
27
More Complex Interactions
We can use this method to specify the functional form of $\hat{\beta}_1$ in any way we choose
Simply substitute the function in for $\hat{\beta}_1$, multiply out the terms and estimate
Only limitations are theories of interaction and levels of collinearity
28
Examples of interaction effects
from my own research
29
Governance and Economic Welfare
Figure 4: PCI Performance and Economic Welfare
[Figure: GDP per capita (in Millions of Constant 1994 VND) plotted against Structural Endowments (Infrastructure, Human Capital, Proximity to Markets), 0 to 100, with separate lines for Low PCI and High PCI]
“The Governance Premium”: Better governed (high PCI) provinces are able to generate higher living standards from the same level of development
30
Predicted Number of Loans by Legal Status among Vietnamese Private Firms
                      Land Use Rights Certificate
Registered at DPI     None    Partial    Full
No                    0.83    0.99       1.2
Yes                   2.73    3.27       3.98
31
Predicted Probability of Provincial Division in Vietnam
(By State Sector Output with Number of Cabinet Officials)
[Figure: Predicted Probability of Provincial Division (y-axis, .4 to .8) plotted against State Contribution to Provincial Output (x-axis, 0 to 1), with separate lines for No Cabinet Members, 1 Cabinet Member, and 2+ Cabinet Members]
Contribution of covariates at 75th percentile