Regression Discontinuity Designs in Economics Dr. Kamiljon T. Akramov IFPRI, Washington, DC, USA Training Course on Applied Econometric Analysis September 13-23, 2016, WIUT, Tashkent, Uzbekistan

Regression Discontinuity Designs in Economics

Dr. Kamiljon T. Akramov

IFPRI, Washington, DC, USA

Training Course on Applied Econometric Analysis

September 13-23, 2016, WIUT, Tashkent, Uzbekistan


Outline

• Overview of RDD

• Meaning and validity of RDD

• Several examples from the literature

• Estimation (where most decisions are made)

• Discussion of a paper

• Stata code and data will be provided

• Conclusions


But when you start exercising those rules, all sorts of processes start to happen and you start to find out all sorts of stuff about people…. It’s just a way of thinking about a problem, which lets the shape of the problem begin to emerge. The more rules, the more arbitrary they are, the better.

Douglas Adams, Mostly Harmless

(Cited in Angrist and Pischke 2009)


Introduction

• RDD was developed to estimate causal treatment effects in non-experimental settings

• It exploits precise knowledge of the rules determining treatment

• Identification is based on the idea that some rules are arbitrary and provide good quasi experiments

• Treatment effects are local (LATE)

• RD research designs provide very good internal validity

• Most assumptions can be empirically verified

• External validity is limited

• Like an RCT, an RDD is relatively easy to estimate


Sharp and Fuzzy Discontinuity

• Sharp discontinuity

• The discontinuity precisely determines treatment

• Equivalent to random assignment in a neighborhood

• E.g. Social security payment depends directly and immediately on a person’s age

• Fuzzy discontinuity

• Discontinuity is highly correlated with treatment

• E.g. Rules determine eligibility but there is a margin of administrative error

• Use the assignment as an IV for program participation


Sharp RD

• Sharp RD is used when treatment status is a deterministic and discontinuous function of a covariate, 𝑥𝑖

• Suppose

$$D_i = \begin{cases} 1 & \text{if } x_i \ge x_0 \\ 0 & \text{if } x_i < x_0 \end{cases}$$

where 𝑥0 is a known threshold or cutoff. The assignment mechanism is a deterministic function of 𝑥𝑖 because once we know 𝑥𝑖 we know 𝐷𝑖. Treatment is a discontinuous function of 𝑥𝑖: no matter how close 𝑥𝑖 gets to 𝑥0 from below, treatment is unchanged until 𝑥𝑖 = 𝑥0.
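The assignment rule above can be written as a one-line function. This is only an illustrative sketch: the function name and the example scores are hypothetical, and the cutoff of 50 is borrowed from the later slides.

```python
def sharp_assignment(x, cutoff):
    """Return D_i = 1 if x_i >= x_0, else 0 (sharp RD assignment rule)."""
    return 1 if x >= cutoff else 0

# Hypothetical running-variable values on either side of a cutoff of 50.
scores = [49.0, 49.999, 50.0, 50.001, 75.0]
treatment = [sharp_assignment(s, cutoff=50.0) for s in scores]
print(treatment)
```

Treatment stays at 0 however close the score gets to the cutoff from below and switches to 1 exactly at the cutoff.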


Discontinuity example

• National Merit Scholarship awards in USA

• National Merit Scholarship Corporation (NMSC) uses PSAT/NMSQT scores as the initial screen of over 1.5 million program entrants

• NMSC determines a national Selection Index qualifying score (critical reading + math + writing skills scores) for "Commended" recognition

• Qualifying score is calculated each year to yield students at about the 96th percentile (top 50,000 highest scorers)

• Basically the top test-takers get a scholarship

• A small difference in test score means a discontinuous jump in scholarship amount


Identification for sharp discontinuity

yi = β0 + β1 Di + β2 xi + εi

Di = 1 if 𝑥𝑖 ≥ 𝑥0, and Di = 0 if 𝑥𝑖 < 𝑥0

𝑥𝑖 is continuous around the cut-off point and is called the forcing or running variable

Assignment rule under sharp discontinuity, with a cutoff of 50: Di = 1 if 𝑥𝑖 ≥ 50, and Di = 0 if 𝑥𝑖 < 50


Identification for sharp discontinuity (cont.)

• Treatment effect is given by β1 (causal effect of interest)

E[Y | D = 1, X = 𝑥0] = β0 + β1 + β2 𝑥0 and E[Y | D = 0, X = 𝑥0] = β0 + β2 𝑥0

E[Y | D = 1, X = 𝑥0] − E[Y | D = 0, X = 𝑥0] = β1

• Note that the estimation of treatment effect in RDD depends on extrapolation

• To the left of cutoff point only non-treated observations

• To the right of cutoff point only treated observations
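As a rough illustration of how β1 could be recovered from the sharp RD regression, the sketch below simulates data and fits the model by ordinary least squares using only the Python standard library. The simulated data, the true effect of 5, and the helper name `ols` are all assumptions made for illustration. They are not from the course materials, which use Stata.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(0)
cutoff, true_effect = 50.0, 5.0                      # hypothetical values
x = [random.uniform(0, 100) for _ in range(2000)]    # running variable
d = [1.0 if xi >= cutoff else 0.0 for xi in x]       # sharp assignment
y = [10.0 + true_effect * di + 0.2 * xi + random.gauss(0, 1)
     for xi, di in zip(x, d)]

# Regress y on a constant, the treatment dummy, and the running variable.
b0, b1, b2 = ols([[1.0, di, xi] for di, xi in zip(d, x)], y)
print(round(b1, 2))  # estimate of the treatment effect
```

Because the simulated relationship between y and x really is linear, the coefficient on the dummy should land close to the true effect. With real data the functional form is unknown, which is why the extrapolation matters.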


Extrapolation (dashed lines)


Counterfactuals

• The extrapolation is a counterfactual or potential outcome

• Each household has two potential outcomes

• 𝑌𝑖(1) denotes the outcome of household i if in the treated group

• 𝑌𝑖(0) denotes the outcome of household i if in the non-treated group

• Causal effect of treatment for household i is 𝑌𝑖(1) - 𝑌𝑖(0)

• Average treatment effect is E[𝑌𝑖(1) - 𝑌𝑖(0)]

• Only one potential outcome is observed. In randomized experiments, one group provides the counterfactual for the other because they are comparable (exchangeable)


Counterfactuals (cont.)

• In RDD the counterfactuals are conditional on 𝑥𝑖 as in RCT

• We are interested in the treatment effect at 𝑥𝑖 = 𝑥0: E[𝑌𝑖(1) − 𝑌𝑖(0) | 𝑥𝑖 = 𝑥0]

• Treatment effect is

$$\lim_{x \downarrow x_0} E[Y_i \mid x_i = x] - \lim_{x \uparrow x_0} E[Y_i \mid x_i = x]$$

• Estimation is possible because of the continuity of E[𝑌𝑖(1) | 𝑥𝑖] and E[𝑌𝑖(0) | 𝑥𝑖]

• The estimation of the treatment effect is based on extrapolation because of lack of overlap

• Therefore, the functional relationship between Y and x must be correctly specified
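To see why the functional form matters, consider a hedged toy example: the outcome below is a smooth cubic in the running variable with no jump at all. Fitting separate straight lines on each side over the full range produces a large spurious "effect", while restricting to a narrow window around the cutoff nearly removes it. The data, the function names, and the bandwidths are invented for illustration.

```python
def line_fit(xs, ys):
    """Closed-form simple regression; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def rd_jump(x, y, cutoff, bandwidth):
    """Right-minus-left difference of linear fits evaluated at the cutoff."""
    left = [(xi, yi) for xi, yi in zip(x, y) if cutoff - bandwidth <= xi < cutoff]
    right = [(xi, yi) for xi, yi in zip(x, y) if cutoff <= xi <= cutoff + bandwidth]
    a_l, b_l = line_fit(*zip(*left))
    a_r, b_r = line_fit(*zip(*right))
    return (a_r + b_r * cutoff) - (a_l + b_l * cutoff)

# Hypothetical data: smooth cubic relationship, true discontinuity is zero.
x = [0.5 * i for i in range(201)]            # 0, 0.5, ..., 100
y = [0.001 * (xi - 50.0) ** 3 for xi in x]

global_jump = rd_jump(x, y, cutoff=50.0, bandwidth=50.0)  # misspecified linear fit
local_jump = rd_jump(x, y, cutoff=50.0, bandwidth=5.0)    # narrow window
print(global_jump, local_jump)
```

The global linear fit reports a sizeable discontinuity that does not exist, while the local fit is close to zero. A narrower window trades this bias for fewer observations.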


When to use RD design

• The beneficiaries/non-beneficiaries can be ordered along a quantifiable dimension

• This dimension can be used to compute a well-defined index or parameter

• The index/parameter has a cut-off point for eligibility

• The index value is what drives the assignment of a potential beneficiary to the treatment or to non-treatment groups


Indexes are common in targeting of welfare programs

• Anti-poverty programs: targeted to households below a given poverty index

• Pension programs: targeted to population above a certain age

• Scholarships: targeted to students with high scores on standardized tests

• CDD programs: awarded to NGOs that achieve the highest scores


Example: Effect of cash transfers on consumption

• Objective: Target transfers to poorest households

• Method

• Construct poverty index from 1 to 100 with pre-intervention characteristics

• Households with a score ≤ 50 are poor

• Households with a score >50 are non-poor

• Evaluation

• Measure outcomes (e.g., consumption, school attendance rates, nutrition outcomes) before and after transfer, comparing households just above and below the cut-off point


Regression Discontinuity Design: Baseline

[Figure: regions labeled "Poor" and "Not Poor"]


Regression Discontinuity Design: Post Intervention

[Figure: annotated with "Treatment Effect"]


Identification for fuzzy discontinuity

yi = β0 + β1 Di + δ(scorei) + εi

Di = 1 if household receives transfer, and Di = 0 if household does not receive transfer

But treatment depends on:

• Whether scorei is above or below 50

• Endogenous factors


Identification for fuzzy discontinuity (cont.)

yi = β0 + β1 Di + f(scorei) + εi

IV estimation:

First stage: Di = γ0 + γ1 I(scorei > 50) + ηi, where I(scorei > 50) is a dummy variable

Second stage: yi = β0 + β1 Di + f(scorei) + εi, where f(scorei) is a continuous function of the score
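One simple way to illustrate the IV logic is a Wald-type ratio: the jump in the outcome at the cutoff divided by the jump in the probability of treatment. The sketch below is an assumption-laden toy, not the course's Stata code. The data are simulated, take-up depends on an unobserved factor `u` that also affects the outcome, and the helper `boundary_jump`, the bandwidth of 25, and the true effect of 3 are all hypothetical.

```python
import random

def boundary_jump(x, v, cutoff, h):
    """Right-minus-left difference of local linear fits of v on x at the cutoff."""
    def fit_at_cutoff(pairs):
        n = len(pairs)
        mx = sum(p[0] for p in pairs) / n
        mv = sum(p[1] for p in pairs) / n
        sxx = sum((p[0] - mx) ** 2 for p in pairs)
        sxv = sum((p[0] - mx) * (p[1] - mv) for p in pairs)
        return mv + (sxv / sxx) * (cutoff - mx)
    left = [(a, b) for a, b in zip(x, v) if cutoff - h <= a <= cutoff]
    right = [(a, b) for a, b in zip(x, v) if cutoff < a <= cutoff + h]
    return fit_at_cutoff(right) - fit_at_cutoff(left)

random.seed(1)
cutoff, true_effect, n = 50.0, 3.0, 20000    # hypothetical values
score, d, y = [], [], []
for _ in range(n):
    s = random.uniform(0, 100)
    u = random.gauss(0, 1)                   # unobserved factor (endogeneity)
    z = 1.0 if s > cutoff else -1.0          # crossing the cutoff shifts take-up
    di = 1.0 if z + u + random.gauss(0, 1) > 0 else 0.0
    score.append(s)
    d.append(di)
    y.append(2.0 + true_effect * di + 0.05 * s + u + random.gauss(0, 0.5))

jump_y = boundary_jump(score, y, cutoff, 25.0)   # reduced form
jump_d = boundary_jump(score, d, cutoff, 25.0)   # first stage
wald = jump_y / jump_d                           # fuzzy RD estimate
print(round(jump_d, 2), round(wald, 2))
```

The first-stage jump is well below 1 because assignment only shifts the probability of treatment, and the ratio rescales the outcome jump accordingly.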


RDD examples from literature: Thistlethwaite and Campbell (1960)

• This was the first application of the RD design

• They studied the impact of merit awards on future academic outcomes

• Awards are allocated based on test scores

• If a person had a score greater than c, the cutoff point, then she or he received the award

• Simple approach to the analysis: compare those who received the award to those who didn't

• Why is this the wrong approach?

• Factors that influence the test score are also related to future academic outcomes (income, parents' education, motivation, etc.)

• Thistlethwaite and Campbell realized they could compare individuals just above and below the cutoff point


Thistlethwaite and Campbell (1960): Validity

• Simple idea: assignment mechanism is known

• We know that the probability of treatment jumps to 1 if test score > c

• Assumption is that individuals cannot manipulate with precision their assignment variable (think about standardized tests: SAT, GRE, GMAT)

• Key word: precision

• Consequence: comparable individuals near cutoff point

• If treated and untreated individuals are similar near the cutoff point then data can be analyzed as if it were a (conditionally) randomized experiment


Thistlethwaite and Campbell (1960): Validity (cont.)

• If this is true, then background characteristics should be similar near c, the cutoff point (can be checked empirically)

• The estimated treatment effect applies to those near the cutoff point, which limits the external validity

• Validity hinges on the assignment mechanism being known and free of precise manipulation, and on the cutoff point not being related in some way to the outcome of interest

• Manipulation and validity

• Some manipulation is fine (you can always study harder, for example)

• Lack of precise manipulation and lack of relation of the cutoff point to the outcome are the keys to identifying causal effects
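The claim that background characteristics should be similar near the cutoff can be checked with a simple comparison of means in a narrow window. The sketch below is one possible version under stated assumptions: the data are simulated, the covariate name `parent_educ` and the window of 2 points are hypothetical, and a Welch t-statistic stands in for whatever test an applied paper would actually report.

```python
import random
from statistics import mean, variance

def balance_check(score, covariate, cutoff, h):
    """Difference in covariate means just above vs. just below the cutoff,
    together with a Welch t-statistic."""
    below = [c for s, c in zip(score, covariate) if cutoff - h <= s < cutoff]
    above = [c for s, c in zip(score, covariate) if cutoff <= s <= cutoff + h]
    diff = mean(above) - mean(below)
    se = (variance(above) / len(above) + variance(below) / len(below)) ** 0.5
    return diff, diff / se

random.seed(2)
# Hypothetical pre-treatment covariate that varies smoothly with the score.
score = [random.uniform(0, 100) for _ in range(20000)]
parent_educ = [12 + 0.02 * s + random.gauss(0, 2) for s in score]

diff, t_stat = balance_check(score, parent_educ, cutoff=50.0, h=2.0)
print(round(diff, 3), round(t_stat, 2))
```

A large difference or t-statistic for a pre-treatment covariate would be a warning sign that individuals on either side of the cutoff are not comparable.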


RDD examples from literature (Almond et al. QJE, 2010)

• Policy question: whether the benefits of additional medical expenditures exceed their costs

• RDD allows a comparison of health outcomes and medical treatment provision for newborns on either side of the very low birth weight threshold at 1,500 grams

• Study finds that newborns with birth weights just below 1,500 grams have lower one-year mortality rates than do newborns with birth weights just above this cutoff, even though mortality risk tends to decrease with birth weight

• One-year mortality falls by approximately one percentage point as birth weight crosses 1,500 grams from above

• Infants with birth weight < 1,500 grams receive more medical treatment, and their hospital costs are higher by $4,000 relative to mean hospital costs of $40,000 for infants with birth weight just above 1,500 grams

• Assuming observed medical spending fully captures the impact of the “very low birth weight” designation on mortality, the study estimates suggest that the cost of saving a statistical life of a newborn with birth weight near 1,500 grams is on the order of $550,000 in 2006 dollars


RD examples from literature (DiNardo & Lee, QJE 2004)

• Economic impacts of unionization on employers are difficult to estimate because of selection bias

• Unions could organize at highly profitable enterprises that are more likely to grow and pay higher wages

• Union elections

• If employees want to unionize, the board holds an election

• 50% means the employer doesn’t have to recognize the union, and

• 50% + 1 means the employer is required to “bargain in good faith” with the union

• Multiple establishment-level datasets that represent establishments that faced organizing drives in the United States during 1984-1999


DiNardo & Lee, QJE 2004 (cont.)

• The paper applies RD design to estimate the impact of unionization on business survival, employment, output, productivity, and wages

• Paper essentially compares outcomes for employers where unions barely won the election with those where the unions barely lost

• The analysis finds small impacts on all outcomes

• The results suggest that, at least in the study period, the legal mandate that requires the employer to bargain with a certified union has had little economic impact on employers


RDD examples from literature (Angrist & Lavy, QJE 1999)

• Fuzzy RD design to estimate the effects of class size on children’s test scores

• School class size: Maimonides’ rule

• No more than 40 kids in a class in Israel

• 40 kids in school means 40 kids per class

• 41 kids means two classes with 20 and 21 kids

• Multiple discontinuities: causal variable of interest, class size, takes on many values

• First stage exploits jumps in average class size

• Finding: smaller class size increases test scores
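The class-size rule described above implies a sawtooth relationship between enrollment and predicted average class size. A minimal sketch, assuming classes are split as evenly as possible whenever enrollment exceeds a multiple of the cap (the function name is hypothetical):

```python
def maimonides_class_size(enrollment, cap=40):
    """Predicted average class size when a new class is opened
    each time enrollment exceeds a multiple of the cap."""
    n_classes = (enrollment - 1) // cap + 1
    return enrollment / n_classes

for e in (40, 41, 80, 81):
    print(e, maimonides_class_size(e))
```

Predicted class size drops sharply at enrollments of 41, 81, and so on, which is the source of the multiple discontinuities exploited in the first stage.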


RD examples from literature (cont.)

• Anderson and Magruder (2012) and Lucas (2012)

• Yelp.com ratings have an underlying continuous score

• Distribution determines cutoff points for 1 to 5 stars

• Effect of an extra star on future reservations and revenue

• Anderson et al. (2012)

• Young adults lose their health insurance as they age (older than 18 and in college but different after ACA)

• Age changes the probability of having health insurance (fuzzy design)


Paper by Raffaello Bronzini and Eleonora Iachini (AEJ: Economic Policy, 2014)

• The paper uses sharp RDD to evaluate a unique R&D subsidy program implemented in northern Italy

• Firms were invited to submit proposals for new projects and only those which scored above a certain threshold received the subsidy.

• It compares the investment spending of subsidized firms with that of unsubsidized firms


Main questions in empirical research

• What is the policy question?

• What is the causal relationship of interest?

• What is the dependent variable and how is it measured?

• What is (are) the key independent variable(s)?

• What is the data source?

• What is the identification strategy?

• What is the mode of statistical inference?

• What are the main findings?


Policy question

• Governments spend substantial financial resources to support private R&D activities

• Direct government funding of private R&D in OECD countries amounts to about 0.1% of GDP, excluding tax incentives

• Economic rationale

• Market failure

• Liquidity constraints

• Inframarginal versus marginal projects

• Do R&D investment subsidies actually work, i.e., increase private firms’ R&D activity (expenditures)?

• Do benefits of additional government expenditures on investment subsidies exceed their costs?


Program and causal relationship of interest

• Relationship between government R&D subsidies and private R&D activity (expenditures)

• “Regional Program for Industrial Research, Innovation and Technological Transfer” implemented in Emilia-Romagna (Italy)

• The regional government subsidizes the R&D expenditure of eligible firms through grants. The grant may cover up to

• 50% of the costs of industrial research projects

• 25% for precompetitive development projects; the 25% limit is extended by an additional 10% if applicants are SMEs

• The maximum grant per project is €250,000

• Duration of the investment is from 12 to 24 months
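The grant rules above amount to a small piece of arithmetic. The sketch below is an illustration only: it assumes that the "additional 10%" for SMEs means 10 percentage points (so 35% in total), and the function name, the project-type labels, and the example costs are hypothetical.

```python
MAX_GRANT_EUR = 250_000

# Maximum share of project costs covered, in percent.
COVERAGE_PCT = {"industrial_research": 50, "precompetitive_development": 25}

def max_grant(eligible_cost, project_type, is_sme=False):
    """Upper bound on the grant for one project under the stated rules."""
    pct = COVERAGE_PCT[project_type]
    if project_type == "precompetitive_development" and is_sme:
        pct += 10  # assumption: the extension is 10 percentage points
    return min(eligible_cost * pct / 100, MAX_GRANT_EUR)

print(max_grant(400_000, "industrial_research"))                        # 50% of costs
print(max_grant(1_000_000, "industrial_research"))                      # capped
print(max_grant(200_000, "precompetitive_development", is_sme=True))    # 35% of costs
```

The cap binds for large projects, so the effective subsidy rate falls as project size grows.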


Dependent and key independent variables

• Dependent variable

• Natural candidate would be R&D investment, but not available

• Net investment calculated from the balance-sheet data as annual differences in tangible or intangible assets net of amortization

• Independent variable

• Binary treatment variable for an R&D subsidy

• Score

• technological and scientific (max. 45 points)

• financial and economic (max. 20 points)

• managerial (max. 20 points)

• regional impact (max. 15 points)

• Only projects deemed sufficient in each category and which obtain a total score of at least 75 points receive the grants
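The scoring rule can be sketched as a small eligibility check. The slides do not say what makes a category "sufficient", so in this illustration sufficiency is passed in as a flag per category. The dictionary keys and the function name are hypothetical.

```python
MAX_POINTS = {
    "technological_scientific": 45,
    "financial_economic": 20,
    "managerial": 20,
    "regional_impact": 15,
}
PASS_MARK = 75  # minimum total score out of 100

def receives_grant(points, sufficient):
    """True if every category is deemed sufficient and the total is >= 75.

    `sufficient` maps each category to a bool; the underlying criterion
    is not given in the slides, so it is supplied by the caller.
    """
    for category, value in points.items():
        if not 0 <= value <= MAX_POINTS[category]:
            raise ValueError(f"{category} score out of range: {value}")
    all_sufficient = all(sufficient[c] for c in MAX_POINTS)
    return all_sufficient and sum(points.values()) >= PASS_MARK

all_ok = {c: True for c in MAX_POINTS}
project = {"technological_scientific": 40, "financial_economic": 15,
           "managerial": 12, "regional_impact": 8}   # total = 75
print(receives_grant(project, all_ok))
```

The total score plays the role of the running variable, and the 75-point threshold is the cutoff that makes this a sharp design.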


Identification strategy

• Goal is to evaluate whether subsidized firms would have made the same amount of R&D outlays without the grants

• Subsidized and nonsubsidized firms can differ in terms of unobserved characteristics correlated with the outcome

• Therefore, the variable identifying recipient firms in the econometric models can be endogenous

• To deal with the endogeneity issue, the paper exploits the funds’ assignment mechanism

• Only those receiving a score equal to or above a given threshold (75 out of 100) were awarded grants


Identification strategy (cont.)

• The paper applies a sharp RDD comparing the performance of subsidized and nonsubsidized firms with scores close to the threshold

• By letting the outcome variable be a function of the score, the average treatment effect of the program is assessed through the estimated value of the discontinuity at the threshold


Empirical specification

$$Y_i = \alpha + \beta T_i + (1 - T_i) \sum_{p=1}^{3} \gamma_p (S_i)^p + T_i \sum_{p=1}^{3} \gamma'_p (S_i)^p + \varepsilon_i$$

where 𝑌𝑖 is the outcome variable; 𝑇𝑖 = 1 if firm i is subsidized (all firms with score ≥ 75) and 𝑇𝑖 = 0 otherwise; 𝑆𝑖 = 𝑆𝑐𝑜𝑟𝑒𝑖 − 75; 𝛾𝑝 and 𝛾′𝑝 are the parameters of the score function, which are allowed to differ on opposite sides of the cutoff to allow for heterogeneity of the function across the threshold; 𝜀𝑖 is the random error.
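To make the specification concrete, the sketch below builds the row of regressors for a single firm and checks that, for any set of coefficients, the predicted jump at the cutoff equals β. The coefficient values are made up for illustration and are not estimates from the paper.

```python
CUTOFF = 75  # threshold stated in the slides

def design_row(score):
    """Regressors [1, T, (1-T)S, (1-T)S^2, (1-T)S^3, TS, TS^2, TS^3] for one firm."""
    s = score - CUTOFF
    t = 1 if score >= CUTOFF else 0
    powers = [s ** p for p in (1, 2, 3)]
    return [1, t] + [(1 - t) * v for v in powers] + [t * v for v in powers]

def predict(coefs, row):
    """Fitted value: inner product of coefficients and regressors."""
    return sum(c * r for c, r in zip(coefs, row))

# Hypothetical coefficients: alpha, beta, gamma_1..3 (below), gamma'_1..3 (above).
coefs = [2.0, 0.5, 0.10, 0.010, 0.001, 0.20, -0.010, 0.0005]

at_cutoff = predict(coefs, design_row(75))              # T = 1, S = 0
left_limit = predict(coefs, [1, 0, 0, 0, 0, 0, 0, 0])   # T = 0 as S -> 0 from below
jump = at_cutoff - left_limit
print(design_row(70), design_row(80), jump)
```

Because the score is centered at the cutoff, all polynomial terms vanish at S = 0 and the discontinuity is captured by β alone.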


Estimation

• First, a third order polynomial model was estimated on the full sample

• Second, the equation was estimated through local regressions around the cutoff point using two different sample windows

• Firms with scores between 52 and 80 (50% of the baseline sample)

• Firms with scores between 66 and 78 (35% of the baseline sample)

• Third, the paper estimated the discontinuity using other nonparametric techniques, namely kernel regressions with two bandwidths, 30 and 15 points of the score
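A kernel-based estimate of the discontinuity can be sketched as a weighted local linear regression on each side of the cutoff. The slides do not say which kernel the paper used, so the triangular kernel here is an assumption, as are the noise-free toy data and the function names. Only the bandwidths of 30 and 15 and the cutoff of 75 come from the slides.

```python
def weighted_fit_at_zero(pairs, h):
    """Triangular-kernel weighted linear fit of y on centered score s;
    returns the fitted value at s = 0 (the cutoff)."""
    w = [max(0.0, 1 - abs(s) / h) for s, _ in pairs]
    sw = sum(w)
    ms = sum(wi * s for wi, (s, _) in zip(w, pairs)) / sw
    my = sum(wi * y for wi, (_, y) in zip(w, pairs)) / sw
    sss = sum(wi * (s - ms) ** 2 for wi, (s, _) in zip(w, pairs))
    ssy = sum(wi * (s - ms) * (y - my) for wi, (s, y) in zip(w, pairs))
    return my - (ssy / sss) * ms

def kernel_rd(scores, ys, cutoff, h):
    """Estimated discontinuity at the cutoff for bandwidth h."""
    centered = [(sc - cutoff, y) for sc, y in zip(scores, ys)]
    left = [p for p in centered if p[0] < 0]
    right = [p for p in centered if p[0] >= 0]
    return weighted_fit_at_zero(right, h) - weighted_fit_at_zero(left, h)

# Hypothetical noise-free data: linear in the score with a jump of 0.5 at 75.
scores = [40 + 0.5 * i for i in range(121)]          # 40, 40.5, ..., 100
ys = [1 + 0.02 * (s - 75) + (0.5 if s >= 75 else 0.0) for s in scores]

est_30 = kernel_rd(scores, ys, cutoff=75, h=30)
est_15 = kernel_rd(scores, ys, cutoff=75, h=15)
print(est_30, est_15)
```

With real data the two bandwidths would generally give different estimates, and comparing them is a simple robustness check.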


Estimation (cont.)

• The OLS estimate of the parameter β measures the value of the discontinuity of the function Y(𝑆𝑖) at the cutoff point, corresponding to the unbiased estimate of the causal effect of the program

• A coefficient β equal to zero would signal complete crowding-out of private investment by public grants

• This would mean that firms reduced private expenditure by the amount of the subsidies received and the investment turned out to be unaffected by the program

• A positive coefficient would show that overall treated firms invested more than untreated firms, plausibly thanks to the program, and that total crowding-out did not occur


Main findings

• Overall, no significant increase in investment

• Substantial heterogeneity in the program’s impact

• Small enterprises increased their investments—by approximately the amount of the subsidy they received—whereas larger firms did not


Data and Stata codes

• Data and Stata codes are in the folder