Aid optimists
“I have identified the specific investments that are needed [to end poverty]; found ways to plan and implement them; [and] shown that they can be affordable.”
Jeffrey Sachs, The End of Poverty
Image by Angela Radulescu on Flickr.
Aid pessimists
“After $2.3 trillion over 5 decades, why are the desperate needs of the world's poor still so tragically unmet? Isn't it finally time for an end to the impunity of foreign aid?”
Bill Easterly, The White Man’s Burden
© Unknown. All rights reserved. This content is excluded from our Creative Commons license. For more information, see http://ocw.mit.edu/fairuse.
Books
Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty, by Abhijit V. Banerjee and Esther Duflo
Publication date: April 2011
Website: http://www.pooreconomics.com/
Le Développement Humain (Lutter contre la pauvreté, volume 1), by Esther Duflo
2010, Paris: Le Seuil
2011, Italian translation: Feltrinelli
La politique de l'autonomie (Lutter contre la pauvreté, volume 2), by Esther Duflo
2010, Paris: Le Seuil
2011, Italian translation: Feltrinelli
Expérience, science et lutte contre la pauvreté, by Esther Duflo
2009, Paris: Fayard
© 2011 MIT. All rights reserved.
MIT Department of Economics : Esther Duflo : Books http://econ-www.mit.edu/faculty/eduflo/publications
More Than 1 Billion People Are Hungry in the World, by Abhijit Banerjee and Esther Duflo | Foreign Policy
http://www.foreignpolicy.com/articles/2011/04/25/more_than_1_billion_people_are_hungry_in_the_world?print=yes&hidecomments=yes&page=full
books, The Elusive Quest for Growth and The White Man's Burden.
Dambisa Moyo, an economist who worked at Goldman Sachs and the World
Bank, has joined her voice to Easterly's with her recent book, Dead Aid. Both
argue that aid does more bad than good. It prevents people from searching for
their own solutions, while corrupting and undermining local institutions and
creating a self-perpetuating lobby of aid agencies. The best bet for poor
countries, they argue, is to rely on one simple idea: When markets are free and
the incentives are right, people can find ways to solve their problems. They do
not need handouts from foreigners or their own governments. In this sense, the
aid pessimists are actually quite optimistic about the way the world works.
According to Easterly, there is no such thing as a poverty trap.
This debate cannot be solved in the abstract. To find out whether there are in
fact poverty traps, and, if so, where they are and how to help the poor get out of
them, we need to better understand the concrete problems they face. Some aid
programs help more than others, but which ones? Finding out required us to
step out of the office and look more carefully at the world. In 2003, we founded
what became the Abdul Latif Jameel Poverty Action Lab, or J-PAL. A key part of
our mission is to research by using randomized control trials -- similar to
experiments used in medicine to test the effectiveness of a drug -- to understand
what works and what doesn't in the real-world fight against poverty. In practical
terms, that meant we'd have to start understanding how the poor really live
their lives.
Take, for example, Pak Solhin, who lives in a small village in West Java,
Indonesia. He once explained to us exactly how a poverty trap worked. His
parents used to have a bit of land, but they also had 13 children and had to build
so many houses for each of them and their families that there was no land left
for cultivation. Pak Solhin had been working as a casual agricultural worker,
which paid up to 10,000 rupiah per day (about $2) for work in the fields. A
recent hike in fertilizer and fuel prices, however, had forced farmers to
economize. The local farmers decided not to cut wages, Pak Solhin told us, but to
[Screenshot: The Abdul Latif Jameel Poverty Action Lab, map of evaluations by theme (Education, Finance & Microfinance, Environment & Energy, Health, Political Economy & Governance, Labor Markets, Agriculture). http://www.povertyactionlab.org/search/apachesolr_search?view=map&filters=type:evaluation]
J-PAL
Evaluation: What, Why, When
povertyactionlab.org
Why focus on impact evaluation?
• Surprisingly little hard evidence on what works
• Can do more with given budget with better evidence
• If people knew money was going to programs that worked, could help increase pot for anti-poverty programs
• Instead of asking “do aid/development programs work?” should be asking:
– Which work best, why and when?
– How can we scale up what works?
Impact: What is it?
[Figure: Primary Outcome plotted against Time, with the Intervention and the resulting Impact marked]
Counterfactual
The counterfactual represents the state of the world that program participants would have experienced in the absence of the program
Problem: Counterfactual cannot be observed
Solution: We need to “mimic” or construct the counterfactual
J-PAL | WHY RANDOMIZE 19
Constructing the counterfactual
• Usually done by selecting a group of individuals that did not participate in the program
• This group is usually referred to as the control group or comparison group
• How this group is selected is a key decision in the design of any impact evaluation
Selecting the comparison group
• Idea: Comparability
• Goal: Attribution
II – WHAT IS A RANDOMIZED EXPERIMENT?
The basics
Start with simple case:
• Take a sample of program applicants
• Randomly assign them to either:
– Treatment Group – is offered treatment
– Control Group – not allowed to receive treatment (during the evaluation period)
Key advantage of experiments
Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment,
any difference that subsequently arises between them can be attributed to the program rather than to other factors.
Evaluation of “Women as Policymakers”: Treatment vs. Control villages at baseline
Variables                             Treatment Group   Control Group   Difference
Female Literacy Rate                  0.35              0.34            0.01 (0.01)
Number of Public Health Facilities    0.06              0.08            -0.02 (0.02)
Tap Water                             0.05              0.03            0.02 (0.02)
Number of Primary Schools             0.95              0.91            0.04 (0.08)
Number of High Schools                0.09              0.10            -0.01 (0.02)
Standard errors in parentheses. Statistics displayed for West Bengal.
*/**/***: Statistically significant at the 10% / 5% / 1% level.
Source: Chattopadhyay and Duflo (2004)
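One way to read the balance table above is to divide each baseline difference by its standard error. The sketch below does that with the figures from the table; the 1.645 threshold is an assumed normal-approximation cutoff for the 10% level, and the function name is hypothetical.

```python
# Baseline differences and standard errors copied from the table above.
baseline = {
    "Female Literacy Rate": (0.01, 0.01),
    "Number of Public Health Facilities": (-0.02, 0.02),
    "Tap Water": (0.02, 0.02),
    "Number of Primary Schools": (0.04, 0.08),
    "Number of High Schools": (-0.01, 0.02),
}

# Assumption: two-sided 10% critical value under a normal approximation.
CRITICAL_10_PERCENT = 1.645


def t_ratio(difference, standard_error):
    """Ratio of a treatment-control difference to its standard error."""
    return difference / standard_error


for variable, (diff, se) in baseline.items():
    ratio = t_ratio(diff, se)
    verdict = "differs at 10%" if abs(ratio) > CRITICAL_10_PERCENT else "no significant difference"
    print(f"{variable}: ratio = {ratio:.2f} ({verdict})")
```

No ratio comes close to the threshold, which is what one would expect if assignment to treatment and control was random.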
Some variations on the basics
• Assigning to multiple treatment groups
• Assigning of units other than individuals or households
– Health Centers
– Schools
– Local Governments
– Villages
Key Steps in conducting an experiment
1. Design the study carefully
2. Randomly assign people to treatment or control
3. Collect baseline data
4. Verify that assignment looks random
5. Monitor process so that integrity of experiments is not
compromised
Key Steps in conducting an experiment (contd.)
6. Collect follow-up data for both the treatment and
control groups
7. Estimate program impacts by comparing mean
outcomes of treatment group vs mean outcomes of the
control group
8. Assess whether program impacts are statistically
significant and practically significant
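Steps 7 and 8 above can be sketched as a difference in means with a standard error. This is a minimal illustration, not the procedure of any particular study: the outcome values are invented, and the unequal-variance standard error is one common choice among several.

```python
import math
import statistics


def difference_in_means(treatment, control):
    """Return (estimate, standard_error, t_ratio) for mean(treatment) - mean(control)."""
    estimate = statistics.mean(treatment) - statistics.mean(control)
    # Unequal-variance standard error of the difference in means.
    se = math.sqrt(
        statistics.variance(treatment) / len(treatment)
        + statistics.variance(control) / len(control)
    )
    return estimate, se, estimate / se


# Hypothetical follow-up outcomes, invented purely for illustration.
treatment_outcomes = [12.0, 15.0, 14.0, 16.0, 13.0]
control_outcomes = [10.0, 11.0, 12.0, 9.0, 13.0]

estimate, se, t = difference_in_means(treatment_outcomes, control_outcomes)
print(f"impact estimate = {estimate:.2f}, SE = {se:.2f}, t = {t:.2f}")
```

Statistical significance is judged from the ratio of the estimate to its standard error; practical significance is a separate question of whether the estimate is large enough to matter for policy.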
III – WHY RANDOMIZE?
Why Randomize? - Conceptual Argument
If properly designed and conducted, randomized experiments provide the most credible method to estimate the impact of a program
Why “most credible”?
Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment,
any difference that subsequently arises between them can be attributed to the program rather than to other factors.
Constructing the counterfactual
• Counterfactual is often constructed by selecting a group not affected by the program
• Randomized:
– Use random assignment of the program to create a control group which mimics the counterfactual.
• Non-randomized:
– Argue that a certain excluded group mimics the counterfactual.
Example #3 Balsakhi Program
Balsakhi Program: Background
• Implemented by Pratham, an NGO from India
• Program provided tutors (Balsakhi) to help at-risk children with school work
• In Vadodara, the balsakhi program was run in government primary schools in 2002-2003
• Teachers decided which children would get the balsakhi
Balsakhi: Outcomes
• Children were tested at the beginning of the school year (Pretest) and at the end of the year (Post-test)
• QUESTION: How can we estimate the impact of the balsakhi program on test scores?
Methods to estimate impacts
• Let’s look at different ways of estimating the impacts using the data from the schools that got a balsakhi
1. Pre-Post (Before vs. After)
2. Simple difference
3. Difference-in-difference
4. Other non-experimental methods
5. Randomized Experiment
1 - Pre-post (Before vs. After)
• Look at average change in test scores over the school year for the balsakhi children
Average post-test score for children with a balsakhi: 51.22
Average pretest score for children with a balsakhi: 24.80
Difference: 26.42
QUESTION: Under what conditions can this difference (26.42) be interpreted as the impact of the balsakhi program?
2 - Simple difference
Compare test scores of children who got balsakhi with test scores of children who did not get balsakhi.
Average score for children with a balsakhi: 51.22
Average score for children without a balsakhi: 56.27
Difference: -5.05
QUESTION: Under what conditions can this difference (-5.05) be interpreted as the impact of the balsakhi program?
3 – Difference-in-Differences
Compare gains in test scores of children who got balsakhi with gains in test scores of children who did not get balsakhi.
                                                 Pretest   Post-test   Difference
Average score for children with a balsakhi       24.80     51.22       26.42
Average score for children without a balsakhi    36.67     56.27       19.60
Difference                                                             6.82
• QUESTION: Under what conditions can this difference (6.82) be interpreted as the impact of the balsakhi program?
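The three non-experimental estimates above come from the same four averages. A short sketch that reproduces the arithmetic, using the scores from the slides (the function and variable names are illustrative only):

```python
# Average test scores copied from the slides above.
scores = {
    "balsakhi": {"pre": 24.80, "post": 51.22},
    "no_balsakhi": {"pre": 36.67, "post": 56.27},
}


def pre_post(group):
    """Change in a group's own average score over the school year."""
    return group["post"] - group["pre"]


def simple_difference(treated, comparison):
    """Gap in post-test averages between the two groups."""
    return treated["post"] - comparison["post"]


def difference_in_differences(treated, comparison):
    """Gain of the treated group minus gain of the comparison group."""
    return pre_post(treated) - pre_post(comparison)


treated, comparison = scores["balsakhi"], scores["no_balsakhi"]
print(f"Pre-post:                  {pre_post(treated):.2f}")
print(f"Simple difference:         {simple_difference(treated, comparison):.2f}")
print(f"Difference-in-differences: {difference_in_differences(treated, comparison):.2f}")
```

The same data yield 26.42, -5.05, or 6.82 depending on the method, which is why the assumption behind each comparison group matters.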
5 – Randomized Experiment
• Suppose we evaluated the balsakhi program using a randomized experiment
• QUESTION #1: What would this entail? How would we do it?
• QUESTION #2: What would be the advantage of using this method to evaluate the impact of the balsakhi program?
How to Randomize
Random Selection
7J-PAL | WHAT IS EVALUATION
Randomly sample from area of interest
[Figure: bar chart of monthly income, per capita. Population: 1250; Sample: 1252]
Random Assignment
Randomly assign to treatment and control
[Figure: bar chart of monthly income, per capita. Population: 1250; Treatment: 1257; Control: 1244]
Alternate methods of Randomization?
NOT Random Assignment
[Figure: bar chart of monthly income, per capita, for Population, Treatment and Control; bar labels 1453, 1250 and 942]
Simple randomization: Fixed probability
• For each member, set probability (e.g. 50%).
– Spot randomization
– Point-of-service randomization
• May end up with slightly more in one group and fewer in the other
J-PAL | HOW TO RANDOMIZE 20
ID Coin Treatment/Control
1 Heads T
2 Heads T
3 Tails C
4 Heads T
5 Tails C
6 Heads T
7 Tails C
8 Tails C
9 Heads T
10 Heads T
Count: T: 6, C: 4
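The coin-flip table above can be mimicked with a pseudo-random number generator. This is a minimal sketch of fixed-probability assignment; the function name, the seed, and the ten IDs are assumptions made for illustration.

```python
import random


def simple_randomization(ids, probability=0.5, seed=None):
    """Assign each unit to T with a fixed probability, independently of the others."""
    rng = random.Random(seed)
    return {unit: ("T" if rng.random() < probability else "C") for unit in ids}


assignment = simple_randomization(range(1, 11), probability=0.5, seed=2024)
counts = {arm: list(assignment.values()).count(arm) for arm in ("T", "C")}
print(assignment)
print(counts)
```

Because each unit is drawn independently, the two groups need not be the same size, as in the 6 versus 4 split shown in the table.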
Complete randomization: Fixed proportion
• Need sample frame
• Determine number in treatment (and in control)
• Pull out of a hat/bucket
-or-
• Use random number generator to order observations randomly
Source: Chris Blattman
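Fixed-proportion assignment can be sketched by shuffling the sample frame and treating the first units in the shuffled order. The function name and the example frame below are hypothetical.

```python
import random


def complete_randomization(sample_frame, n_treatment, seed=None):
    """Order the sample frame randomly and treat exactly n_treatment units."""
    sample_frame = list(sample_frame)
    rng = random.Random(seed)
    shuffled = sample_frame[:]
    rng.shuffle(shuffled)
    treated = set(shuffled[:n_treatment])
    return {unit: ("T" if unit in treated else "C") for unit in sample_frame}


result = complete_randomization(range(1, 11), n_treatment=5, seed=7)
print(result)
```

Unlike the fixed-probability approach, the group sizes here are set in advance, which is why a complete sample frame is needed before assignment.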
Unit of Randomization: Individual?
Unit of Randomization: Clusters?
Unit of Randomization: Class?
Unit of Randomization: School?
An education department wants to see if increasing the duration of recess can help reduce rates of obesity. What is the appropriate unit of randomization?
A. Child level
B. Household level
C. Classroom level
D. School level
E. Village level
F. Don’t know
[Poll results chart; extracted values: 22%, 0%, 0%, 0%, 56%, 22%]
The department of agriculture believes that if farmers used more fertilizer yields would improve. One advisor believes organic fertilizer will be more effective; a second believes inorganic fertilizer is better; a third believes neither will be effective. Can we test all three beliefs within one single experiment?
A. Yes, and we should
B. No, they can only be answered with two separate experiments
C. No, they can only be answered with three separate experiments
D. Yes, but best practice is to run separate experiments
E. Don’t know
[Poll results chart; extracted values: 71%, 0%, 14%, 0%, 14%]
Multiple treatments
[Figure: units assigned to Treatment 1, Treatment 2, or Control]
Cross-cutting treatments: Factorial Design
                     Performance-based pay: Y        Performance-based pay: N
Cash Grants: Y       Group 1: Cash + Performance     Group 2: Cash
Cash Grants: N       Group 3: Performance            Group 4: Control
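A cross-cutting design assigns each unit to one cell of the 2×2 grid above. The sketch below deals shuffled units into the four cells in turn so that cell sizes stay balanced; the function name and the use of eight units are assumptions for illustration.

```python
import random
from collections import Counter

# Cells keyed by (receives cash grant, receives performance-based pay).
GROUPS = {
    (True, True): "Group 1: Cash + Performance",
    (True, False): "Group 2: Cash",
    (False, True): "Group 3: Performance",
    (False, False): "Group 4: Control",
}


def factorial_assignment(units, seed=None):
    """Randomly assign units to the four cells, with sizes differing by at most one."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    cells = list(GROUPS)
    # Deal the shuffled units round-robin into the cells.
    return {unit: cells[i % len(cells)] for i, unit in enumerate(shuffled)}


cells_by_unit = factorial_assignment(range(8), seed=1)
for unit, cell in sorted(cells_by_unit.items()):
    print(unit, GROUPS[cell])
print(Counter(GROUPS[cell] for cell in cells_by_unit.values()))
```

Half of the units receive cash and half receive performance pay, so each treatment can be evaluated on the full sample as well as in combination.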
Varying intensity of treatment
• To Measure:
– Dosage
– Sensitivity
– Elasticity
– Spillovers
Varying intensity of treatment (individual)
• Dosage
• Sensitivity
• Elasticity
Challenge 1: Difficult (logistically or politically) for Service Providers
• Service providers have trouble distinguishing between treatment and comparison (or customizing service)
• Services provided to both
• Crossovers: Control receives intervention (No longer represents pure counterfactual)
Solution 1a: Assign to Different Service Providers
• Have different teams provide the different treatments
• Randomly assign to those teams
Solution 1b: Randomize at a different unit
• Change the unit of random assignment
• Have providers treat entire clusters the same
Challenge 2a: Control group finds out about treatment
• If treatment and control individuals know each other, the control may get upset.
• Service providers may lose support of community
• Attrition: Control withdraws participation from research
[Figure: a treated individual talks with friends (treatment and control); friends in control group get upset with researchers or service providers]
Challenge 2b: Control group benefits from treatment
• If treatment and control individuals know each other, the treatment may share benefits with control.
Challenge 2e: Control group harmed by treatment
• If treatment and control individuals compete with each other, the control may be harmed.
[Figure: without experiment vs. with experiment (treatment group, control group)]
Solution 2a: Varying the unit to contain spillovers
[Figure: treatment, comparison, friends]
Solution 2b: Creating a Buffer
[Figure: units in the buffer are not sampled]
Challenge 3: Have resources to treat everyone, but perhaps not all at once. (Where’s the control group?)
Solution 3: Phase In
Phase 0: No one treated yet; all control
Phase 1: 1/4th treated, 3/4ths control
Phase 2: 2/4ths treated, 2/4ths control
Phase 3: 3/4ths treated, 1/4th control
Phase 4: All treated; no control (experiment over)
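The phase-in schedule above amounts to randomizing the order in which units start treatment. A minimal sketch, assuming four equal-sized phases and hypothetical function names:

```python
import random


def phase_in_schedule(units, n_phases=4, seed=None):
    """Randomly assign each unit the phase (1..n_phases) in which it starts treatment."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: (i % n_phases) + 1 for i, unit in enumerate(shuffled)}


def treated_share(schedule, phase):
    """Share of units already treated once the given phase has begun."""
    return sum(start <= phase for start in schedule.values()) / len(schedule)


schedule = phase_in_schedule(range(8), n_phases=4, seed=3)
for phase in range(5):
    print(f"Phase {phase}: {treated_share(schedule, phase):.2f} treated")
```

Units whose start phase has not yet arrived serve as the control group, so the comparison is only available until the final phase.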
Challenge 4: There’s an eligibility criterion
[Figure: distribution of people by income, with a cut-off separating Eligible from Ineligible]
Solution 4: Relax the eligibility criteria
[Figure: the same distribution with a new cut-off added alongside the original cut-off]
Solution 4: Randomize “on the bubble”
[Figure: the study sample lies between the original cut-off and the new cut-off; those who remain eligible and those who remain ineligible are not in the study]
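Randomizing “on the bubble” restricts the lottery to people between the two cut-offs. The sketch below is one way to express that; the income values, the cut-offs, and the even split are invented for illustration.

```python
import random


def randomize_on_the_bubble(incomes, cutoff, new_cutoff, seed=None):
    """Randomize only units whose income lies between the two cut-offs.

    incomes maps unit -> income. Units outside the band are not in the study
    and are left out of the returned assignment.
    """
    low, high = sorted((cutoff, new_cutoff))
    rng = random.Random(seed)
    bubble = [unit for unit, income in incomes.items() if low <= income < high]
    rng.shuffle(bubble)
    half = len(bubble) // 2
    assignment = {unit: "T" for unit in bubble[:half]}
    assignment.update({unit: "C" for unit in bubble[half:]})
    return assignment


# Hypothetical incomes and cut-offs.
incomes = {"a": 50, "b": 95, "c": 105, "d": 110, "e": 118, "f": 150}
study = randomize_on_the_bubble(incomes, cutoff=100, new_cutoff=120, seed=5)
print(study)
```

Any estimate from this design applies to people near the cut-off, not to those who were clearly eligible or clearly ineligible.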
Challenge 5: Program is an entitlement
Cannot force nor deny intervention
Solution 5: Encouragement
Treatment Group: 3/4ths take-up
Control Group: 1/4th take-up
To evaluate the effect of this program, you would first:
A. Compare those who enrolled to those who didn’t
B. Drop those who didn’t enroll from the treatment group
C. Drop those who did enroll from the control group
D. Both B&C
E. Compare treatment group to entire control group
[Poll results chart; extracted values: 0%, 0%, 67%, 33%, 0%]
Solution 5: Encouragement
Treatment Group: 3/4ths take-up
Control Group: 1/4th take-up
Compare Entire Treatment Group to Entire Control Group
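Comparing the entire treatment group with the entire control group means nobody is dropped based on whether they actually enrolled. A minimal sketch with invented records that match the 3/4ths and 1/4th take-up rates above:

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical records, invented for illustration: (took_up_program, outcome).
treatment_group = [(True, 10.0), (True, 10.0), (True, 10.0), (False, 6.0)]  # 3/4ths take-up
control_group = [(True, 10.0), (False, 6.0), (False, 6.0), (False, 6.0)]    # 1/4th take-up


def intention_to_treat(treatment, control):
    """Compare everyone assigned to treatment with everyone assigned to control."""
    return mean(y for _, y in treatment) - mean(y for _, y in control)


def take_up_rate(group):
    """Share of a group that actually took up the program."""
    return mean(1.0 if took_up else 0.0 for took_up, _ in group)


itt = intention_to_treat(treatment_group, control_group)
take_up_gap = take_up_rate(treatment_group) - take_up_rate(control_group)
print(f"difference between entire groups = {itt:.2f}")
print(f"difference in take-up = {take_up_gap:.2f}")
```

This comparison measures the effect of being encouraged, which is smaller than the effect of the program itself whenever take-up differs by less than 100 percentage points between the groups.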
Problem 6: Sample size is small
Solution 6a: Change the unit of randomization
How do we increase school participation (enrollment and attendance)?
A government wants to improve school attendance at primary schools. What interventions would you recommend?
What is the most effective intervention to increase school participation (enrollment and attendance)?
A. Text Books
B. Lunch for free
C. Free school uniforms
D. Treat intestinal worms
E. Merit scholarships
F. Improve curriculum & teaching
G. Provide better materials
H. Increase awareness of returns to education
[Poll results chart; extracted values: 0%, 100%, 0%, 0%, 0%, 0%, 0%, 0%]
Impact evaluations can help answer these questions
Which one of these would make a good question for an impact evaluation?
A. What share of kids in Tanzania drop out of school before completing primary?
B. Will providing kids with deworming pills or school uniforms do a better job of keeping kids in school?
C. What role does ethnicity play in student results?
[Poll results chart; extracted values: 0%, 6%, 94%]
Which one of these would make a good question for an impact evaluation?
A. Are agricultural extension agents giving farmers the same information they were trained on?
B. What share of farmers in Kenya currently live on less than $2 a day?
C. Which kind of fertilizer works best for a plot of maize?
Which one of these would make a good question for an impact evaluation?
A. Does a sexual education program or free school uniforms have a bigger effect on teenage pregnancy rates?
B. Do teenage girls have a right to have full information regarding sexual education?
C. Are teachers spreading misinformation when delivering sexual education?
5 components of program evaluation
• Needs Assessment
• Theory of Change
• Process Evaluation
• Impact Evaluation
• Cost-Effectiveness Analysis
WATER, SANITATION & HEALTH
An Example
What do you think is the most cost-effective way to reduce diarrhea?
A. Develop piped water infrastructure
B. Improve existing water sources
C. Increase supply of and demand for chlorine
D. Education on sanitation and health
E. Improved cooking stoves for boiling water
F. Improve sanitation infrastructure
[Poll results chart; extracted values: 0%, 6%, 6%, 6%, 35%, 47%]
NEEDS ASSESSMENT
Identifying the problem
Needs Assessment
Questions answered by a needs assessment
• Does the problem we are proposing to solve actually exist?
– What is the likely source of the problem?
– Of the solutions proposed and tried, why are they failing?
– Who is in most need?
Needs Assessment
• Does the problem exist?
– Diarrheal disease killed approximately 2.6 million people a year between 1990 and 2000.
– 20% of all child deaths (under 5 years old) are from diarrhea
…what is the likely source?
The source of the problem?
Theory of Change
Blueprint for Change
Theory of Change
Questions answered by a theory of change
• How will the program address the needs put forth in your needs assessment?
– What are the prerequisites to meet the needs?
– How and why are those requirements currently lacking or failing?
– How does the program intend to target or circumvent shortcomings?
– What services will be offered?
What is a potential solution to this problem?
Alternative Solution(s)?
Log Frame
Columns: Objectives Hierarchy; Indicators; Sources of Verification; Assumptions / Threats

Impact (Goal / Overall objective): Lower rates of diarrhea
– Indicators: Rates of diarrhea
– Sources of Verification: Household survey
– Assumptions / Threats: Waterborne disease is primary cause of diarrhea

Outcome (Project Objective): Households drink cleaner water
– Indicators: (Δ in) drinking water source; E. coli CFU/100ml
– Sources of Verification: Household survey, water quality test at home storage
– Assumptions / Threats: Shift away from dirty sources. No recontamination

Outputs: Source water is cleaner; Families collect cleaner water
– Indicators: E. coli CFU/100ml
– Sources of Verification: Water quality test at source
– Assumptions / Threats: Continued maintenance, knowledge of maintenance practices

Inputs (Activities): Source protection is built
– Indicators: Protection is present, functional
– Sources of Verification: Source visits / surveys
– Assumptions / Threats: Sufficient materials, funding, manpower

Source: Roduner, Schlappi (2008) Logical Framework Approach and Outcome Mapping, A Constructive Attempt of Synthesis
[Slide annotations on the log frame: Needs assessment; Process evaluation; Impact evaluation]
PROCESS EVALUATION
Making the program work
Process Evaluation
Questions answered by a process evaluation
• Was the program carried out as planned?
– Are basic tasks being completed?
– Is the intervention reaching the target population?
– Is the intervention being completed well or efficiently and to the beneficiaries’ satisfaction?
IMPACT EVALUATION
Measuring how well it worked
Impact Evaluation
Questions answered by impact evaluations
• Process evaluations determine if a program is running in the way it is supposed to run
• Impact evaluations determine if a program creates a change in an outcome(s)
– Did concrete encased springs decrease diarrhea rates?
What was the impact?
• 66% reduction in source water E. coli concentration
• 24% reduction in household E. coli concentration
• 25% reduction in incidence of diarrhea
Making Policy from Evidence
Intervention: Impact on Diarrhea
• Spring protection (Kenya): 25% reduction in diarrhea incidence for ages 0-3
• Source chlorine dispensers (Kenya): 20-40% reduction in diarrhea
• Home chlorine distribution (Kenya): 20-40% reduction in diarrhea
• Hand-washing (Pakistan): 53% drop in diarrhea incidence for children under 15 years old
• Piped water in (Urban Morocco): 0.27 fewer days of diarrhea per child per week
COST-EFFECTIVENESS ANALYSIS
Evidence-Based Policymaking
Cost-Effectiveness Diagram
Randomized Evaluation Process
[Diagram labels: Evaluation Design; Evaluation Implementation; Why Evaluate; Evaluation Question (Causal Hypothesis); Theory of Change; Intervention; Target Group; Outcomes; Measurement; Why Randomize; How to Randomize; Power & Sample Size; Sample Selection; Random Assignment; Survey Design; Data Collection; Monitoring; Post-Design Challenges; Data Analysis; Results]
How to Measure
Sources of Measurement
First-order questions in measurement
• What data do you collect?
• Where do you get it?
• When do you get it?
J-PAL | MEASUREMENT & INDICATORS 14
Where can we get data?
• Obtained from other sources
– Publicly available
– Administrative data
– Other secondary data
• Collected by researchers
– Primary data
Image sources:
https://commons.wikimedia.org/wiki/File:Cuyahoga_County_US_Census_Form-Herbert_Birch_Kingston_1920.jpg
https://commons.wikimedia.org/wiki/File:US_Navy_090123-N-9760Z-004_Hospital_Corpsman_2nd_Class_Jennifer_Ross_files_medical_records_aboard_the_aircraft_carrier_USS_Nimitz_(CVN_68).jpg
Types and Sources of Data
[2×2 grid. One axis: information about a person / household / possessions vs. NOT about a person / household / possessions. Other axis: information provided by a person vs. automatically generated]
Data collection on people
• Surveys
• Exams, tests, etc.
• Games
• Vignettes
• Direct Observation
• Diaries/Logs
• Focus groups
• Interviews
Survey: Modes of Data Collection
• Interviewer administered
– Paper-based
– Computer-assisted/Digital
– Telephone-based
• Self-administered
– Paper
– Computer/Digital
When to collect data
• Baseline
• During the intervention
– Process, Monitoring of intervention
• Endline
• Follow-up
• Scale-up
• Intervention: M&E
Ethics
• “Experimenting on people”
• Belmont Principles
– Respect for persons
– Beneficence
– Justice
• Institutional Review Boards (IRBs)
How to Measure
Concept
Concept of measurement
Construct (Intelligence) → Indicator (IQ Test) → Data (Test Result)
Image source: https://commons.wikimedia.org/wiki/File:Red_Silhouette_-_Brain.svg
Construct (Stress) → Indicator (Cortisol level) → Data (Test Result)
Image source: https://pixabay.com/en/despair-stress-alone-being-alone-862349/
The goals of measurement
Accuracy
Unbiasedness
Validity
• Precision
• Reliability
Validity
• In theory:
– How well does the indicator map to the outcome? (e.g. IQ tests → intelligence)
[Diagram: Validity links the Construct and the Indicators]
Reliability
• In theory:
– The measure is consistent and precise vs. “noisy”
[Diagram: Construct; Indicators; Data Collection (“Response”); Reliability]
The Response Process
[Diagram labels: Indicators; Data Collection (“Response”); Measurement Error; Data]
4-step Response Process
1. Comprehension of the question
2. Retrieval of Information
3. Judgement and Estimation
4. Reporting an Answer
Measurement Error: Vagueness
Vague concepts are those that respondents may interpret in different ways.
Example:
Q. Do you live with a teenager?
• Yes
• No
What age range counts as a teenager?
Make sure to define vague concepts
Measurement Error: Completeness
The response categories do not include all categories that can be expected as a response
Example:
Q. What is the highest level of education completed?
• Basic Education (1st-5th)
• Middle School (6th-8th)
• High School (9th-12th)
• College Degree
• Post Graduate
• Other Professional Degree (e.g. Medical, Law, Teacher)
“No education” or “vocational degree” is not a response
Pilot question to make sure that categories are exhaustive
Measurement Error: Negatives
Questions that include negatives can be confusing to the respondent and lead to misinterpretations.
Example:
Q. Do you think that you should not let your children play contact sports?
• Yes
• No
Having a negative might throw some people off
Avoid unnecessary negatives
Measurement Error: Overlapping Categories
The categories overlap each other.
Example:
Q. How many hours a day do you work?
• Less than an hour
• Between one and four hours
• Between three and eight hours
• Between eight and ten hours
• More than ten hours
What would a person who works eight hours a day reply?
Make sure that all categories are mutually exclusive
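A quick way to test a draft question for overlap is to check how many categories a given answer falls into. The sketch below encodes the example's categories, reading “between X and Y” as inclusive of both ends; that reading, and the function name, are assumptions.

```python
# Categories from the example above; "between" is read as inclusive at both ends.
CATEGORIES = [
    ("Less than an hour", lambda h: h < 1),
    ("Between one and four hours", lambda h: 1 <= h <= 4),
    ("Between three and eight hours", lambda h: 3 <= h <= 8),
    ("Between eight and ten hours", lambda h: 8 <= h <= 10),
    ("More than ten hours", lambda h: h > 10),
]


def matching_categories(hours):
    """Return every category label that a given number of hours satisfies."""
    return [label for label, matches in CATEGORIES if matches(hours)]


for hours in (0.5, 3.5, 8, 12):
    print(hours, matching_categories(hours))
```

Mutually exclusive and exhaustive categories would return exactly one label for every possible answer; here an eight-hour day matches two.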
Measurement Error: Presumptions
The question assumes certain things about the respondent
Example:
Q. How would you rate the quality of coffee this morning?
• Very good
• Somewhat good
• Not good
We are assuming that the respondent drank the coffee
Use filters and skip patterns
Measurement Error: Framing effect
People react to a particular choice in different ways depending on how it is presented, i.e. prefer gains over losses
Example:
Q. Two new treatments have been developed to treat 600 terminally ill patients. Treatment A will save 200 people, while Treatment B will allow 400 people to die. Which treatment would you prefer?
• Treatment A
• Treatment B
Treatment A is preferable because it has been framed as a gain
Try to be neutral when framing questions
Measurement Error: Recall Bias
People may retrieve recollections regarding events or experiences differently
Example:
Q. How long did you have to wait last time you voted?
• No time (there was no line, or I voted by mail)
• Less than 10 minutes
• Between 10 minutes and 30
• More than 30 minutes but less than an hour
• An hour or more
This experience may be more vivid for some respondents than others.
You can ask respondents to keep a diary or save their receipts
Measurement Error: Anchoring Bias
People tend to rely too heavily on the first piece of information seen
Example:
Q. In Arizona, some voters reported having to wait more than 5 hours to vote. How long did you have to wait last time you voted?
• No time (there was no line, or I voted by mail)
• Less than 10 minutes
• Between 10 minutes and 30
• More than 30 minutes but less than an hour
• An hour or more
Respondents will be more likely to give a number on the higher end of the spectrum
Avoid adding anchors to your questions
Measurement Error: Telescoping Bias
People perceive recent events as being more remote than they are (backward telescoping) and distant events as being more recent than they are (forward telescoping)
Example:
Q. Did you purchase a TV or other electronics (worth over $500) in the past 12 months?
This will lead to over-reporting due to forward telescoping of events that happened more than 12 months ago
Visit once at the beginning of the reference period. Then ask, “since the last time I visited you, have you…?”
Measurement Error: Social Desirability Bias
Tendency of respondents to answer questions in a manner that will be viewed favorably by others, i.e. emphasize strengths, hide flaws, or avoid stigma
Example:
Q. Do you beat your wife?
• Yes
• No
Respondents would be reluctant to admit to such behavior
Ask indirectly, ensure privacy
Key Steps in conducting an experiment
1. Design the study carefully
2. Randomly assign people to treatment or control
3. Collect baseline data
4. Verify that assignment looks random
5. Monitor the process so that the integrity of the experiment is not compromised
J-PAL | WHY RANDOMIZE 30
Key Steps in conducting an experiment (contd.)
6. Collect follow-up data for both the treatment and control groups
7. Estimate program impacts by comparing mean outcomes of the treatment group vs. mean outcomes of the control group
8. Assess whether program impacts are statistically significant and practically significant
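A minimal sketch of steps 2, 7 and 8 on simulated data. Everything here is illustrative: the function names `assign` and `diff_in_means`, the 200 units, the outcome distribution and the built-in effect of +2 are assumptions, not taken from any study in these notes.

```python
import random
import statistics
from math import sqrt

def assign(units, seed=0):
    """Step 2: randomly split units into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])

def diff_in_means(treated, control):
    """Step 7: difference in mean outcomes and its standard error."""
    diff = statistics.mean(treated) - statistics.mean(control)
    se = sqrt(statistics.variance(treated) / len(treated)
              + statistics.variance(control) / len(control))
    return diff, se

# Hypothetical data: noisy outcome with a true treatment effect of +2.
rng = random.Random(1)
units = range(200)
treatment, control = assign(units, seed=42)
outcome = {u: rng.gauss(10, 3) + (2 if u in treatment else 0) for u in units}

impact, se = diff_in_means([outcome[u] for u in treatment],
                           [outcome[u] for u in control])
# Step 8: |t| above about 1.96 is the conventional 5% significance threshold.
t_stat = impact / se
print(f"impact={impact:.2f}, se={se:.2f}, t={t_stat:.2f}")
```

Statistical significance is only half of step 8: whether an effect of this size matters in practice is a separate judgment that the t-statistic does not answer.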
J-PAL | THREATS AND ANALYSIS
Core assumptions
• Random assignment of subjects to treatments
– Receiving treatment is statistically independent of subjects’ potential outcomes
• Non-interference: subject’s potential outcomes reflect only whether they receive the treatment themselves
– Subject’s potential outcomes are unaffected by how treatments happened to be allocated
• Excludability: subject’s potential outcomes respond only to the defined treatment, not to other extraneous factors that may be correlated with treatment
– Importance of defining treatment precisely and maintaining symmetry between treatment and control groups (e.g., through blinding)
Noncompliance
• Sometimes there is a disjunction between the treatment that is assigned and the treatment that is received
– Miscommunication and administrative mishaps
– Subjects may be unreachable
– Encouragements sometimes don’t work
• Addressing noncompliance requires careful attention to “excludability” assumptions
– Are outcomes affected only by the treatment? Or by both the assignment and the treatment?
Handling noncompliance
[Diagram: random assignment splits the sample into a treatment group (participants and no-shows) and a control group (non-participants and crossovers)]
What can you do?
• Can you switch them? Bad idea: biased
• Can you drop them? Bad idea: biased
Inferences should be based solely on comparisons of randomly assigned groups
Noncompliance: avoiding common errors
• Subjects you fail to treat are NOT part of the control group!
• Do not throw out subjects who fail to comply with their assigned treatment
• Base your estimation strategy on the ORIGINAL treatment and control groups, which were randomly assigned and therefore have comparable potential outcomes
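The three options can be compared on a small made-up dataset. This is a sketch under stated assumptions: the eight records, the baseline outcomes (10 and 4) and the treatment effect of 2 are hypothetical, and the function names are illustrative. Only subjects with the high baseline take up treatment, which is what makes switching or dropping no-shows misleading.

```python
from statistics import mean

# Hypothetical records: (assigned_to_treatment, received_treatment, outcome).
# Baseline outcome is 10 or 4; receiving treatment adds 2.
records = [
    (True,  True,  12), (True,  True,  12),   # assigned, participated
    (True,  False,  4), (True,  False,  4),   # assigned, no-shows
    (False, False, 10), (False, False, 10),   # control
    (False, False,  4), (False, False,  4),   # control
]

def itt(rows):
    """Compare groups as ORIGINALLY assigned (intention to treat)."""
    t = [y for assigned, _, y in rows if assigned]
    c = [y for assigned, _, y in rows if not assigned]
    return mean(t) - mean(c)

def switch_no_shows(rows):
    """Biased: counts no-shows as if they were part of the control group."""
    t = [y for _, received, y in rows if received]
    c = [y for _, received, y in rows if not received]
    return mean(t) - mean(c)

def drop_no_shows(rows):
    """Biased: throws out subjects who did not comply with assignment."""
    kept = [r for r in rows if r[0] == r[1]]
    return itt(kept)

print(itt(records), switch_no_shows(records), drop_no_shows(records))
```

In this example the comparison of assigned groups gives 1 (half the treatment group received an effect of 2), while switching and dropping give 6 and 5, both far above the built-in effect, because the no-shows differ systematically from the participants.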
Promise of experiments:
Surprisingly positive results
o Miguel and Kremer (2004) showed that deworming treatment (which costs 49 cents per child per year) can reduce absenteeism from school by one-quarter
o In terms of increasing attendance, deworming is 20 times as effective as hiring an extra teacher, even though both work in the sense of generating statistically significant improvements
o Economic intuition would not have helped us come to this conclusion
o NGOs were equally uninformed about this comparison
Ask children around the world why they are not in school and you will get many answers: cost, distance, lack of facilities. Very few of them will mention worms: soil-transmitted helminths (STHs) and schistosomes. Until recently few experts would have mentioned worms as a key barrier to schooling either.
Four hundred million children of school-age are chronically infected with intestinal worms. Infected children suffer listlessness, diarrhea, abdominal pain and anemia. These parasites are so widespread that some societies do not recognize infection as a medical problem. Symptoms of worms, such as blood in the stool, are considered a natural part of growing up. So even though safe, cheap, and effective oral medication that can kill 99 percent of worms in the body is available and the World Health Organization (WHO) recommends mass deworming of school-aged children, only 10 percent of at-risk children get treated.
Mass Deworming: A Best-Buy for Education and Health
Policy Briefcase No. 4, October 2007
Abdul Latif Jameel Poverty Action Lab, MIT Department of Economics
For more details on this study see Miguel and Kremer (2004) and Kremer and Miguel (2007), available at www.povertyactionlab.org
This Briefcase (based on Miguel and Kremer, 2004; and Kremer and Miguel, 2007) reports the results of a randomized impact evaluation of a deworming program in western Kenya. The results show that school-based mass deworming, where every child in a school is treated, is the most cost-effective way to increase school participation (of all the alternatives that have been rigorously evaluated). It is also one of the most cost-effective ways to improve health that we know of.
Similar educational benefits were found when intestinal worms were eradicated from the southern states of the U.S. in 1915 (Bleakley, 2007). Follow-up work found that attempts to make the program self-sustaining (through health education and user fees) led to its collapse. Only long-term funding of a school-based program sustained the benefits.
Summary
What was done: About 30,000 children in 75 primary schools in rural Kenya were treated en masse in schools with drugs for hookworm, whipworm, roundworm, and schistosomiasis (bilharzia).
Key impacts:
• Reduced the incidence of moderate-to-heavy infections by 25 percentage points.
• Reduced school absenteeism by 25 percent, with the largest gains among the youngest pupils.
• School participation in the area increased by at least 0.14 years of schooling per treated child.
• There was no evidence that deworming increased test scores.
Cost effectiveness:
• Cost: 50 cents per child per year
• Health: US$5 for every Disability Adjusted Life Year (DALY) saved
• Education: US$3.50 for each additional year of school participation
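A rough back-of-the-envelope check on the education figure, assuming (this is an assumption, not the brief's stated method) that the cost per additional year of participation is simply the cost per child divided by the years of schooling gained per treated child. The variable names are illustrative.

```python
cost_per_child_year = 0.50      # US$, from the summary above
years_gained_per_child = 0.14   # "at least 0.14 years" per treated child

# Simple ratio: dollars spent per additional year of school participation.
cost_per_extra_year = cost_per_child_year / years_gained_per_child
print(round(cost_per_extra_year, 2))
```

The ratio comes out near the reported US$3.50. The brief's own calculation may use different inputs (for example, a different cost figure or gains that include untreated children), so this only shows that the reported numbers are mutually consistent in order of magnitude.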
Take Action Now: www.dewormtheworld.org
Our Story
Source: dewormtheworld, http://www.dewormtheworld.org/?q=node/68
Over 400 million children are infected with parasitic worms. Although the harm they cause to children’s health and education has been recognized since the 1980s, deworming was not widespread due to more urgent health sector priorities. However, over two decades later, new groundbreaking research changed how the education sector viewed school-based deworming.
There were three key findings. First, researchers showed that the health impacts of deworming were significantly greater than previously estimated, due to the spillover effects of treatment. Second, they illustrated that mass deworming drastically improved school participation. In fact, it is one of the best returns on investment of any intervention evaluated to increase school attendance. Finally, they conclusively demonstrated that deworming through schools is an efficient and effective way to treat large numbers of children.
Investigators have also since followed up on this research to show the long run impacts of deworming, which result in increased earnings and workforce participation of adults who received two to three additional years of treatment during school.
This evidence was a breakthrough. School-based deworming was globally recognized as a ‘best buy’ for development, and the benefits and cost-effectiveness of school-based deworming were now clear to both the health and education sectors. However, additional barriers remained, and millions of children continued to go without treatment. Some countries needed access to drugs, while others needed technical assistance and capacity building. In addition, policies needed to be developed or strengthened in order to support school-based deworming programs.
Recognizing the huge opportunity to impact the lives of millions of children, economists Michael Kremer and Esther Duflo shared the evidence with fellow members of the Young Global Leaders Education Task Force, who promptly launched the Deworm the World Initiative in January 2007 at the World Economic Forum Annual Meeting in Davos, Switzerland.
The Deworm the World Initiative is operated as a partnership between Innovations for Poverty Action and Partnership for Child Development. Working together, the Initiative has reached 20 million children in 27 countries by supporting the launch of new country programs and enabling the continued activity of existing ones.
Multiple treatment experiments can be informative
o Duflo, Kremer, Robinson (2010) reflects an iterative process
o a succession of experiments on fertilizer use was run over a period of several years
o each set of results prompted a series of new variations in order to better understand the results of the previous one
Theoretical Motivation
o Experiments designed to assess whether there is a demand for commitment products (Ashraf, Karlan, and Yin 2006) came from theoretical motivation
o Karlan and others: experiments are emerging as a powerful tool for testing theories
Biggest Advantage:
The biggest advantage of experiments may be that they take us into terrain where observational approaches are not available
Objections raised by critics best viewed as warnings against over-interpreting experimental results
Also concerns about what experiments are doing to development economics as a field
Generalizability
Environmental Dependence - Core element of generalizability – would the same result occur in a different setting?
Effect is not constant across individuals – is it likely to vary systematically with covariates?
Concern of implementer effects and compliance – smaller organization (NGO) – estimated treatment effect reflects unique characteristics of implementer
e.g. some NGOs refuse to randomize
Randomization Issues
Fact that there is an experiment going on might generate selection effects that would not arise in non-experimental setting (being part of an experiment and being monitored influences participants)
Villagers not used to private organization going around offering them things
Necessary that individuals are not aware that they are excluded from program (difficult when randomization is at individual level, easier if randomization is at village level)
Equilibrium Effects
Program effects from small study may not generalize when program is scaled up
e.g. :
Vouchers to go to private school
Students end up with better education and higher incomes
Scale up program to national level
Crowding in private schools (collapse of public schools)
Returns to education fall because of increased supply
Experimental evidence overstates returns to vouchers program
Notes from: “Instruments, Randomization, and Learning about Development” (Deaton 2010)
Effectiveness of development assistance is topic of great public interest
Much public debate among non-economists takes it for granted that, if the funds were made available, poverty would be eliminated. Among economists, opinion is mixed.
Macro perspective: can foreign assistance raise growth and eliminate poverty?
Micro perspective: what sorts of projects are likely to be effective? Should aid focus on roads, electricity, schools, health clinics?
Answer – we don’t know – how should we go about finding out?
Frustration with Aid organizations
Particularly the World Bank
Allegedly failing to learn from its projects and to build up a systematic catalogue of what works and what does not
Movement toward randomized controlled experiments:
Esther Duflo:
“randomized trials can revolutionize social policy during the 21st century just as they revolutionized medicine during the 20th”
Lancet editorial headed “The World Bank is finally embracing science”
Deaton argues:
under ideal circumstances randomized evaluations of projects are useful for obtaining convincing estimates of the average treatment effect of a program or project
This focus is too narrow and too local to tell us “what works” in development and to design policy or to advance scientific knowledge about development processes
Argues that work needs to be refocused – not answer which projects work but why
Bigger question:
RCTs allow investigator to induce variation that might not arise nonexperimentally – but are these the relevant ones?
RCTs of “what works”
even when done without error or contamination
unlikely to be helpful for policy or move beyond the local
unless they tell us something about why
RCTs are not targeted or suited to these questions
Actual policy will always be different than experiments:
General equilibrium effects that operate on large scale
Outcomes are different when everyone is covered by treatment rather than a few
Experimental subjects are not representative of population
Small development projects at village level do not attract attention of corrupt politicians
Scientists or experimentalists more careful than government implementers
Transporting successful experiments?
Mexico’s PROGRESA program
Conditional cash transfer program paid to parents if children attend schools and clinics
Now in 30 other countries
Is this a good thing?
Cannot simply be exported if countries have
Pre-existing anti-poverty programs with conditional transfers
No capacity to meet increased demands of education and health care
No political support
Combination of mechanism and context that makes for scientific progress
Much interest in RCTs, and instrumental variables, and other econometric techniques that mimic random allocation
comes from skepticism of economic theory
impatience with its ability to deliver structures that seem helpful in interpreting reality
Internal versus external validity:
Contrast between the rigor applied to establish internal validity and the looser analysis to render it policy relevant
To do this typically use some theory or some other information from observables – both go against simplicity of RCTs
Applied and theoretical economists have never been so far apart
Failure to reintegrate is not an option
Otherwise no chance of long term scientific progress extending from the RCTs.
RCTs that are not theoretically guided are unlikely to have more than local validity
Pre-analysis plans at Berkeley's BITSS conference
December 16, 2013
Running Randomized Evaluations: A Practical Guide (blog), http://runningres.com/blog/2013/12/16/pre-analysis-plans-at-berkeleys-bitss-conference
On December 12th I attended the annual meeting of the Berkeley Initiative for Transparency in the Social Sciences (BITSS). BITSS brings together economists, political scientists, biostatisticians, and psychologists to think through how to improve the norms and incentives to promote transparency in the social sciences. I was on a panel talking about pre-analysis plans in which researchers specify in advance how they will analyze their data.
I have now been involved in writing four of these plans and my thinking about them has evolved, as has the sophistication of the plans. Kate Casey, Ted Miguel and I first wrote one of these plans for our evaluation of a Community Driven Development program in Sierra Leone (see the previous blog). It was exactly the type of evaluation where pre-analysis plans are most useful. We had a large number of outcome variables with no obvious hierarchy of which ones were most important so we specified how all the outcomes would be grouped into families and tested as a group. While the outcomes were complex the randomization design was simple (one treatment, one comparison group).
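One common way to test a family of outcomes as a group is to standardize each outcome against the comparison group and average the resulting z-scores into a single index. The sketch below assumes that approach; the post does not say which method was used, and the function name `add_family_index`, the field names and the four rows of data are all hypothetical.

```python
from statistics import mean, stdev

def add_family_index(rows, outcomes):
    """Add an equally weighted index of z-scores for one outcome family.

    Each outcome is standardized with the comparison group's mean and
    standard deviation, then the z-scores are averaged within each row.
    """
    control = [r for r in rows if not r["treated"]]
    stats = {k: (mean(r[k] for r in control), stdev(r[k] for r in control))
             for k in outcomes}
    for r in rows:
        r["index"] = mean((r[k] - stats[k][0]) / stats[k][1] for k in outcomes)
    return rows

# Hypothetical data: two outcomes in an "education" family.
rows = [
    {"treated": False, "years_school": 4, "reads_magazine": 1},
    {"treated": False, "years_school": 6, "reads_magazine": 3},
    {"treated": True,  "years_school": 7, "reads_magazine": 2},
    {"treated": True,  "years_school": 9, "reads_magazine": 2},
]
add_family_index(rows, ["years_school", "reads_magazine"])
effect = (mean(r["index"] for r in rows if r["treated"])
          - mean(r["index"] for r in rows if not r["treated"]))
print(round(effect, 2))
```

Because every outcome gets equal weight, an outcome with no effect pulls the index toward zero: here the effect on the index is half the standardized effect on years of schooling alone.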
The next case also included multidimensional outcomes: empowerment of adolescent girls in Bangladesh. However, now we had five treatments and a comparison group with different treatments targeted at different ages. The task of prespecifying was overwhelming and we made mistakes. It was extremely difficult to think through in advance what subsequent analysis would make sense for every combination of results we might get from the different arms. We also failed to take into account that some of our outcomes in a given group were clearly more important than others: we ended up with strong effects on years of schooling and math and literacy scores but the overall “education” effect was weakened by no or negative effects on indicators like how often a girl read a magazine. We hope, when we write the paper people will agree it makes sense to deviate
The Millennium Development Goals call for universal primary education by 2015
Little consensus on how to achieve this goal or how much it would cost
One view: attracting additional children to school will be difficult, since most children not in school in developing countries are earning income their families need
Another view: the potential contribution of children of primary school age to family income is very small, hence modest incentives could significantly increase enrollment
Reducing the Cost of Education
Some argue school fees prevent many students from attending school
They cite dramatic estimates from sub-Saharan Africa: when free schooling was introduced, primary school enrollment reportedly doubled
Often the data used for these estimates are unclear:
• free schooling is sometimes announced simultaneously with other policy initiatives
• it is often accompanied by programs that replace school fees with per-pupil grants from the central government, which create incentives for schools to over-report enrollment
Randomized experiments can isolate the impact of reducing costs on the quantity of schooling
Several programs have gone beyond simply reducing school fees by actually paying students to attend school, in the form of either cash grants or school meals
School health programs can also increase the quantity of schooling, but this raises the question of how best to implement such programs
One view is that reliance on external financing of medicine is not sustainable; it instead advocates health education, water and sanitation improvements, and so forth
Quality of Education
Notes from “Teacher Absence in India” (Kremer et al.)
The study entails a nationally representative survey of 3,700 schools in India
Three unannounced visits were made to each school
Absence data come from direct physical verification of the teacher’s presence, not from logbooks, interviews, etc.
A teacher is recorded as absent if the investigator could not find the teacher in the school during regular working hours
Journal of the European Economic Association (Resubmitted version, 11/27/04)
which absence calculations based on a similar methodology are available (Table 1).[3] Only 45 percent of teachers were actively engaged in teaching at the time of the visit.[4]

Within India, the absence rate ranged from 15 percent in Maharashtra to 42 percent in Jharkhand (Table 2).[5] Absence rates are generally higher in low-income states: doubling per capita income is associated with a 4.7 percentage point lower predicted absence. The rates of teaching activity among the teachers who are present are lower in higher-absence states and schools. In some states, only 20 to 25 percent of teachers were engaged in teaching at the time of the visit.

Absence rates are considerably higher than could be accounted for by official non-teaching duties, such as staffing polling stations during elections or conducting immunization campaigns, which are sometimes cited as important causes of absence. Based on the responses of each school’s head teacher or primary respondent, official non-teaching duties account for only about 4 percent of total absences. In other words, on any given day, only about 1 percent of primary teachers are absent because they are carrying out official non-teaching-related duties.[6] Preliminary calculations by the authors suggest

[3] Most of these estimates come from other countries covered by the same research project on provider absence in education and health, carried out by the authors of this study and using standardized methodology (Chaudhury and others 2004).
[4] Even with a generous allowance for the possibility that enumerators’ visits diverted some teachers from teaching, it is unlikely that more than half of the teachers would have been teaching at the time of the visit. See Kremer and others (2004).
[5] Table 2 includes 19 of the 20 states surveyed. Fieldwork in the twentieth state, Delhi, was delayed for bureaucratic reasons, and the data were received too late to be analyzed here.
[6] While stated reasons for absence should be taken with a grain of salt, there does not appear to be any reason for head teachers to understate this cause of absence.

TABLE 1: Teacher absence rates by country
Country              Teacher absence (%)
Peru                 11
Ecuador              14
Papua New Guinea     15
Bangladesh           16
Zambia               17
Indonesia            19
India                25
Uganda               27
Source: Chaudhury, Hammer, Kremer, Muralidharan, and Rogers (2004) for most countries; Habyarimana and others (2004) for Zambia; World Bank (2004) for Papua New Guinea.

TABLE 2: Teacher absence in public schools by state
State                Absence (%)
Maharashtra          14.6
Gujarat              17.0
Madhya Pradesh       17.6
Kerala               21.2
Himachal Pradesh     21.2
Tamil Nadu           21.3
Haryana              21.7
Karnataka            21.7
Orissa               23.4
Rajasthan            23.7
West Bengal          24.7
Andhra Pradesh       25.3
Uttar Pradesh        26.3
Chhatisgarh          30.6
Uttaranchal          32.8
Assam                33.8
Punjab               34.4
Bihar                37.8
Jharkhand            41.9
Weighted Average     24.8
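The step from "4 percent of total absences" to "about 1 percent of primary teachers" is a single multiplication. A small check using the weighted average absence rate from Table 2 (the variable names are illustrative):

```python
absence_rate = 0.248          # weighted average absence rate, Table 2
share_official_duties = 0.04  # share of absences attributed to official duties

# Share of all teachers absent on a given day because of official duties.
absent_for_official_duties = absence_rate * share_official_duties
print(f"{absent_for_official_duties:.1%}")
```

So roughly one teacher in a hundred is away on official non-teaching duties on a given day, against about one in four absent overall.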
One in four teachers is absent in a typical primary school in India
Absence rates are generally higher in low-income states
Higher teachers’ salaries do not seem to be associated with lower teacher absence
Since nominal teachers’ salaries are very similar across states, relative teachers’ salaries are higher in poorer states, yet poorer states have higher absence rates
Notes from “Addressing Absence” (Banerjee and Duflo)
An obvious method to fight teacher absence is to monitor more intensively
External control need not always be about monetary incentives
Most common type of control: someone in the institutional hierarchy (e.g., the headmaster of a school) is given the task of keeping an eye on the teacher and penalizing absences
Alternative method: use some impersonal method, such as a camera, for recording absence
An NGO in rural India experimented with a camera
In this area the absence rate was 44%
Most schools are one-teacher schools: when the teacher is absent, children just go back home and lose an entire day of schooling
120 schools were selected to participate in this study
In 60 randomly selected schools (treatment schools), the NGO gave the teacher a camera with instructions to take a picture of himself/herself every day at opening time and at closing time
[Figure 2: Impact of the Cameras. Number of times schools were found open in treatment and comparison schools (out of 13 visits). Horizontal axis: attendance frequency (x), 0 to 13; vertical axis: number of teachers present exactly x times; series: Treatment, Control]
Experimental Design
Teachers received a bonus as a function of the number of days they actually attended
Teachers received a salary of 1,000 Rs. monthly if they were present at least 21 days in a month
Each additional day carried a bonus of 50 Rs., up to a maximum of 1,300 Rs. per month
Each day missed carried a penalty of 50 Rs.
The bonus was set up so that the average teacher’s salary remained 1,000 Rs. per month, which was what teachers were paid in the remaining 60 schools (the comparison schools)
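The payment rule as stated in these notes can be written as a small function. This is a sketch: the name `monthly_pay` and its parameters are illustrative, and the `floor` argument is an assumption, since the notes do not say how low pay can fall for a teacher who misses many days.

```python
def monthly_pay(days_present, base=1000, threshold=21, per_day=50,
                cap=1300, floor=0):
    """Monthly pay in Rs. under the rule described in the notes.

    Pay is `base` at `threshold` days, plus or minus `per_day` for each
    day above or below it, capped at `cap`. `floor` is an assumed lower
    bound that the notes do not specify.
    """
    pay = base + per_day * (days_present - threshold)
    return max(floor, min(cap, pay))

for days in (19, 21, 23, 27, 30):
    print(days, monthly_pay(days))
```

Under this rule the cap binds from 27 days onward, so attendance beyond 27 days earns no further bonus.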
The program resulted in an immediate improvement in teacher attendance
The absence rate of teachers was cut by one half
Given the structure of the payment, the average salary in the treatment schools ended up matching almost exactly the average salary in the comparison schools
The incentives were therefore effective without an increase in teachers’ net pay
Table 1: Is School Quality Similar in Treatment and Control Groups Prior to Program?

                                                   Treatment   Control   Difference
                                                   (1)         (2)       (3)
A. Teacher Attendance
School Open                                        0.66        0.64      0.02 (0.11)
  N                                                41          39        80
B. Student Participation (Random Check)
Number of Students Present                         17.71       15.92     1.78 (2.31)
  N                                                27          25        52
C. Teacher Qualifications
Teacher Test Scores                                34.99       33.62     1.37 (2.01)
  N                                                53          56        109
Teacher Highest Grade Completed                    10.21       9.80      0.41 (0.46)
  N                                                57          54        111
D. Teacher Performance Measures (Random Check)
Percentage of Children Sitting Within Classroom    0.83        0.84      0.00 (0.09)
  N                                                27          25        52
Percent of Teachers Interacting with Students      0.78        0.72      0.06 (0.12)
  N                                                27          25        52
Blackboards Utilized                               0.85        0.89      -0.04 (0.11)
  N                                                20          19        39
E. School Infrastructure
Infrastructure Index                               3.39        3.20      0.19 (0.30)
  N                                                57          55        112
Fstat(1,110) 1.21, p-value (0.27)

Notes: (1) Teacher Performance Measures from Random Checks only includes schools that were open during the random check. (2) Infrastructure Index: 1-5 points, with one point given if the following school attribute is sufficient: Space for Children to Play, Physical Space for Children in Room, Lighting, Library, Floor Mats
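One way to read a balance table like Table 1 is to divide each baseline difference by its standard error. The sketch below does this for the rows whose labels are unambiguous in the source; the 1.96 cutoff is the usual normal-approximation threshold for a 5% test and is an assumption of this example, not something the table reports.

```python
# (difference, standard error) pairs copied from Table 1
balance = {
    "School Open": (0.02, 0.11),
    "Number of Students Present": (1.78, 2.31),
    "Teacher Test Scores": (1.37, 2.01),
    "Teacher Highest Grade Completed": (0.41, 0.46),
    "Blackboards Utilized": (-0.04, 0.11),
    "Infrastructure Index": (0.19, 0.30),
}

def t_ratio(diff, se):
    """Ratio of a baseline difference to its standard error."""
    return diff / se

# Rows where treatment and control differ significantly before the program.
flagged = [name for name, (d, se) in balance.items()
           if abs(t_ratio(d, se)) > 1.96]

for name, (d, se) in balance.items():
    print(f"{name}: t = {t_ratio(d, se):.2f}")
print("Imbalanced at 5%:", flagged)
```

None of the ratios comes close to the cutoff, which is what one expects when assignment was random.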
Table 2: Are Students Similar Prior To Program?

                                   Levels                              Normalized by Control
                                   Treatment  Control  Difference      Treatment  Control  Difference
                                   (1)        (2)      (3)             (4)        (5)      (6)
A. Can the Child Write?
Took Written Exam                  0.17       0.19     -0.02 (0.04)
  N                                1136       1094     2230
B. Oral Exam
Math Score on Oral Exam            7.82       8.12     -0.30 (0.27)    -0.10      0.00     -0.10 (0.09)
  N                                940        888      1828            940        888      1828
Language Score on Oral Exam        3.63       3.74     -0.10 (0.30)    -0.03      0.00     -0.03 (0.08)
  N                                940        888      1828            940        888      1828
Total Score on Oral Exam           11.44      11.95    -0.51 (0.48)    -0.08      0.00     -0.08 (0.07)
  N                                940        888      1828            940        888      1828
C. Written Exam
Math Score on Written Exam         8.62       7.98     0.64 (0.51)     0.23       0.00     0.23 (0.18)
  N                                196        206      402             196        206      402
Language Score on Written Exam     3.62       3.44     0.18 (0.46)     0.08       0.00     0.08 (0.20)
  N                                196        206      402             196        206      402
Total Score on Written Exam        12.17      11.41    0.76 (0.90)     0.16       0.00     0.16 (0.19)
  N                                196        206      402             196        206      402

Notes: (1) Children who could write were given a written exam. Children who could not write were given an oral exam. (2) Standard errors are clustered by school.
Table 3: Teacher Attendance

                                             Sept 2003-Feb 2006                Difference Between Treatment and Control Schools
                                             Treatment  Control  Diff          Until Mid-Test  Mid to Post Test  After Post Test
                                             (1)        (2)      (3)           (4)             (5)               (6)
A. All Teachers                              0.79       0.58     0.21 (0.03)   0.20 (0.04)     0.20 (0.04)       0.23 (0.04)
  N                                          1575       1496     3071          882             660               1529
B. Teachers with Above Median Test Scores    0.78       0.63     0.15 (0.04)   0.15 (0.05)     0.15 (0.05)       0.14 (0.06)
  N                                          843        702      1545          423             327               795
C. Teachers with Below Median Test Scores    0.78       0.53     0.24 (0.04)   0.21 (0.05)     0.14 (0.06)       0.32 (0.06)
  N                                          625        757      1382          412             300               670

Notes: (1) Child learning levels were assessed in a mid-test (April 2004) and a post-test (November 2004). After the post-test, the "official" evaluation period was ended. Random checks continued in both the treatment and control schools. (2) Standard errors are clustered by school. (3) Panels B and C only include the 109 schools where teacher tests were available.

[Figure 3: Impact of the Cameras (out of at least 25 visits). Horizontal axis: attendance frequency (x), 1 to 25; vertical axis: number of teachers present exactly x times; series: Treatment, Control]
In another experiment, preschool teachers in treatment schools received a prize (a bicycle) if the headmaster marked them present a sufficient number of times
This experiment had no effect
Absence rates were not reduced
This outcome suggests that when human judgment is involved in a system where rules are often bent, incentives may easily be perverted
How to stop Malaria?
881,000 die each year
91% in Africa
85% under 5
The Case for Bednets
• Malaria is transmitted by mosquitoes, mainly at dusk.
• Long-lasting insecticide-treated bednets prevent mosquitoes from biting
Heated policy debate
• Jeff Sachs, WHO: Give bed nets for free.
– We know the science, no need to do an experiment
• Easterly, Dambisa Moyo, Population Services International: don’t give them for free.
– We know the economics, no need to do an experiment!
• The true question of course is the extent to which they should be subsidized…
What we need to know
• We need to know:
– The price elasticity of the demand for bednets: if people are willing to purchase one at the full cost, then subsidies are not needed; if they are not willing to purchase one at ANY price, then price subsidies may be needed
– The immediate effect on use: are people who pay for a bednet more likely to use one? How much do they need to pay?
– The longer-term effects: will it wreck markets?
– On people who get it for free: will they buy nets in the future?
– On their friends and neighbors: will they hold out for a free bednet?
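The price elasticity of demand can be computed from take-up at two prices. The sketch below uses the midpoint (arc) formula, which is one common convention; the function name and the take-up rates are hypothetical and not results from any study.

```python
def arc_elasticity(p0, q0, p1, q1):
    """Midpoint (arc) price elasticity of demand between two price points."""
    pct_change_q = (q1 - q0) / ((q0 + q1) / 2)
    pct_change_p = (p1 - p0) / ((p0 + p1) / 2)
    return pct_change_q / pct_change_p

# Hypothetical take-up: 60% buy at a price of 10, 30% buy at a price of 20.
elasticity = arc_elasticity(10, 0.60, 20, 0.30)
print(round(elasticity, 2))
```

An elasticity near zero would mean price barely affects take-up (so subsidies buy little extra coverage); a large negative value would mean even small fees sharply reduce it.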
How can we find out?
• Anecdotes…
– There are certainly plenty. But usually they cut both ways.
[Photo: Minakawa et al. 2008, “Unforeseen misuses of bed nets in fishing villages along Lake Victoria,” Malaria Journal]
• Compare purchase/use at various prices
– Some clinics may give them for free; other villages may not have that system, so any bednets are more likely to be obtained in the market
– Do we see fewer in those villages?
– Do we see that the few we see are used differently?
But the problem is…
• What is the right counterfactual: what would have happened in the other situation?
• For example:
  • Bednets may be distributed for free in areas where malaria is a huge problem.
  • So even if people had to pay for them, they would have been more likely to get them
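The counterfactual problem can be written compactly in potential-outcomes notation. The notation is an assumption added here for illustration, not taken from the slides: let $Y_i(1)$ be household $i$'s purchase decision if bednets are free, $Y_i(0)$ its decision if they are sold at full price, and $D_i = 1$ if the household lives in an area where nets are free. A simple comparison of the two kinds of area then decomposes as:

```latex
\begin{aligned}
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference}}
  &= E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] \\
  &= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{effect of free nets where they are free}}
   + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
\end{aligned}
```

The second term is nonzero whenever the two kinds of area would buy at different rates at the same price, for instance because malaria is worse where nets are given away. The term $E[Y_i(0) \mid D_i = 1]$ is exactly the counterfactual that is never observed.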
[Figures: bar charts of bednet purchases (vertical axis) in high-malaria and low-malaria regions. Panels: “Purchase when bednets are expensive”; “Purchase when bednets are free”; “True effect of price on purchase” (expensive vs. free within each region); “Our estimate of effect if we compare low and high malaria regions” (estimated effect); “The bias” (estimated effect split into effect and bias).]
[Figures: purchase plotted against a horizontal axis running from 0 to 30. Panels: “Observed demand at various prices”; “Demand we would observe in region with free bed net, if bednets were not free”; “Bias in elasticity”.]
Problem and solution
• Problem:
  • What we observe in the world reflects:
    • Selection bias: the behavior of people would be different in different places, EVEN IF THE PRICES WERE THE SAME
    • The actual treatment effect.
  • And we don't know how to separate those two effects: we do not observe how people would have behaved with a low price in the high-price region (and vice versa)
• Solution:
  • Randomly assign different prices in the same region
  • Now, there is no systematic difference between people who face a high price and people who face a low price.
  • Of course there is still the usual random noise: the sample must be large enough, and there will be some uncertainty around our estimates of the mean effects.
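The problem and the solution can be illustrated with a small simulation. Everything numeric below is hypothetical and chosen only to make the point visible: the baseline purchase probabilities by region, the size of the true price effect, and the rule that free nets go only to high-malaria areas are all assumptions, not figures from the lecture.

```python
import random

TRUE_PRICE_EFFECT = -0.30  # hypothetical: paying lowers purchase probability by 30 points


def purchase_probability(high_malaria, pays):
    base = 0.90 if high_malaria else 0.50  # hypothetical baseline demand by region
    return base + (TRUE_PRICE_EFFECT if pays else 0.0)


def simulate(n, randomized, rng):
    """Return a list of (pays, bought) pairs for n simulated households."""
    rows = []
    for _ in range(n):
        high_malaria = rng.random() < 0.5
        if randomized:
            pays = rng.random() < 0.5   # price assigned by coin flip
        else:
            pays = not high_malaria     # free nets only where malaria is high
        bought = rng.random() < purchase_probability(high_malaria, pays)
        rows.append((pays, bought))
    return rows


def estimated_effect(rows):
    """Purchase rate among payers minus purchase rate among non-payers."""
    paid = [bought for pays, bought in rows if pays]
    free = [bought for pays, bought in rows if not pays]
    return sum(paid) / len(paid) - sum(free) / len(free)


rng = random.Random(0)
naive = estimated_effect(simulate(50_000, randomized=False, rng=rng))
experimental = estimated_effect(simulate(50_000, randomized=True, rng=rng))
```

With these made-up parameters the naive comparison should land near -0.70 and the randomized one near -0.30. The gap of about -0.40 is the selection bias: the difference in baseline demand between the two kinds of region, which the naive comparison wrongly attributes to price. The remaining small deviations are the random noise mentioned in the last bullet.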
Dupas’ experiments
• First experiment (with Jessica Cohen)
  • Randomly chose clinics, and offered bednets at different prices.
  • Tracked purchase and usage in those clinics
  • Findings: compare purchase and usage at each price
Policy Implications
• What is the best price to charge for bednets?
• One possible way to ask the question: what price will minimize the cost per malaria death averted?
• Trade-off:
  • Free bednets: more coverage
  • But it costs you money…
• It turns out that in this case, the CHEAPEST way to avert malaria from the policy perspective is a free bednet. Why?
The controversy
• When Dani Rodrik posted these findings on his website, some people objected. Their main objections were:
  • Pregnant women: all of them really need the bednets
  • Product was well known in Kenya
  • Long-term effect may differ from short-term effect
• These questions are all about external validity: is the experiment valid outside of a specific context?
Next step
• What is the next step needed to check these objections?
  • A different country: Uganda, Madagascar
  • Kenya, but not pregnant women
  • A new kind of bednet
  • An experiment for the long-term effects:
    • Entitlement effect
    • Social effects
A New Experiment
• New experimental design by Pascaline Dupas to try to address most of these questions
• Randomization done in the general population (men and women)
• Phase 1: Different discount vouchers are randomly distributed to individuals, for buying a new kind of bednet available in shops, at various prices
  • Check purchase, use, and purchase by neighbors
• Phase 2: After a few months, the new bednet is available at the same price for everyone
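The Phase 1 assignment and the later comparison of take-up by price can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the price points are read off the horizontal axis of the charts in this section, and treating them as the voucher values is an assumption, as are the function names and the balanced round-robin allocation.

```python
import random
from collections import Counter

# Price points shown on the charts' cost axis, in USD (0.00 = free).
# Using them as the voucher values is an assumption made for illustration.
PRICES = [0.00, 0.65, 1.00, 1.60, 2.00, 3.00]


def assign_prices(individual_ids, prices, seed):
    """Randomly assign each individual one price, with balanced group sizes."""
    rng = random.Random(seed)
    ids = list(individual_ids)
    rng.shuffle(ids)  # random order, so group membership is unrelated to the ids
    return {ind: prices[i % len(prices)] for i, ind in enumerate(ids)}


def take_up_by_price(assignment, purchasers):
    """Share of individuals offered each price who went on to purchase."""
    offered = Counter(assignment.values())
    bought = Counter(assignment[ind] for ind in purchasers)
    return {price: bought[price] / offered[price] for price in sorted(offered)}


# Hypothetical usage: 600 individuals, 100 per price point.
assignment = assign_prices(range(600), PRICES, seed=1)
```

Because the price each person faces is set by the shuffle rather than by where they live or how much they want a net, differences in take-up across price groups can be read as the effect of price, up to sampling noise.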
[Map (Google Earth) with legend: full price, partial subsidy, full subsidy]
If people must pay for bednets, will they purchase them?
[Figure: purchase rate (0 to 100%) by cost: Free, $0.65, $1, $1.60, $2, $3]
When people get bednets for free, will they use them?
[Figure: purchase rate and use rate (0 to 100%) by cost: Free, $0.65, $1, $1.60, $2, $3]
Do free nets discourage future purchases?
[Figure: rate of future purchase of a net at $2 (0 to 30%) by prior cost: Free, $0.65, $1, $1.60, $2, $3]
Do neighbors buy nets if others got them for free?
[Figure: purchase of net by cost ($0.65, $1, $1.60, $2, $3), comparing “Average (33% receive free)” with “If all receive free”; the values 66% and 50% appear on the chart]
Conclusion
• When we have a policy question, e.g. “what is the optimal price to charge for a bednet”, we need to start by unpacking the question:
  • What do we need to know to answer the question properly? Let's not assume any answer, or replace real answers with anecdotes or observations that may be very misleading
• We can then design an experiment that will get us the answers to these questions.
  • This is what J-PAL (Poverty Action Lab) does…
• Examine critically whether this first experiment is enough: perhaps we need more data to conclude…
• Other than the answer to the policy question, what are the lessons from the experiments? In particular, what is the key puzzle here that we will need to answer in our section on health?