Counterfactual impact evaluation: what it can (and cannot) do for cohesion policy
Alberto Martini
Progetto Valutazione
Torino, Italy
[email protected]
Mar 27, 2015
ALL I REALLY NEED TO KNOW I LEARNED IN KINDERGARTEN
by Robert Fulghum
Share. Play fair. Don't hit people. Clean up your own mess. Wash your hands before you eat. Flush.
ALL THAT REALLY MATTERS IN IMPACT EVALUATION COMES FROM COMMON SENSE
It’s nice to have an impact.
Not all we obtain is due to our actions.
Some things happen without our help.
To improve things we must understand ’em.
We must separate what we caused from what would happen anyway.
Flush.
Do we need counterfactuals?
The answer is simple: it depends on what we need (can, want) to know and for which purpose
I’ll follow the COSCE approach (Common Sensical Counterfactual Evaluation)
[COSCE = Conference On Security and Cooperation in Europe]
What would have happened anyway = counterfactual
COSCE rule n. 1
If your purpose is to be accountable, don’t worry too much about counterfactuals
◦Your main worry is to show that the money was spent
◦Maybe you want to show how well it was spent
◦Maybe you want to show for whom it was spent
◦You might go further by showing your contribution to objectives, e.g. to the Lisbon strategy
◦To impress DG-Regio, use a macro-model
COSCE rule n. 2
If your purpose is to improve policy, macro models will not do
If your purpose is to improve policy, indicators probably will not do
If your purpose is to improve policy, you need to learn:
◦What works and, if it does, why it works
◦What does not work and, if it doesn’t, why it doesn’t work
COSCE rule n. 3
Learning “what works” logically precedes learning “why it works”
Otherwise we do not know what to explain
Learning why it works (or doesn’t) is:
◦More important
◦More interesting
◦More difficult
than learning what works
This is why it should be done later
COSCE rule n. 4
Counterfactual Impact Evaluation tries to learn something about “what works” on average (not very interesting) and for whom it works (data permitting)
It produces numbers
It requires good data and large samples
It imposes non-testable assumptions
Its results are NOT the truth, are NOT universal laws, are NOT scientific
It is (should be) a fallible, improvable, intellectually honest human enterprise
COSCE rule n. 5
Theory-based Impact Evaluation tries to learn something about “why it works” by identifying the mechanisms that make a policy produce its effects (or fail to do so)
It produces narratives and insights
It collects its data through qualitative methods and doesn’t need large samples
It develops a theory of change and then observes policies as they are implemented, to learn which elements of the theory are verified
COSCE rule n. 6
To learn something about “what works” one needs to clarify:
◦Effects (impacts) on what? Which outcomes Y
◦Effects (impacts) of what? Which treatment T
COSCE curse n. 1
Effects and impacts are the same thing: the best example of a distinction without a difference
COSCE rule n. 7
The heart of CIE is to answer the question: what is the direction, size and significance of the effect of treatment T on outcome Y?
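As a minimal sketch of what "direction, size and significance" can mean in practice, the snippet below compares mean outcomes Y between a treated and an untreated group and computes a Welch-style t statistic. The data and the helper name `difference_in_means` are hypothetical, and the comparison only speaks about an effect under the strong assumption that the two groups are comparable (for instance, because treatment was assigned at random).

```python
import math
import statistics


def difference_in_means(treated, control):
    """Return (difference, standard error, t statistic) for two outcome samples."""
    diff = statistics.mean(treated) - statistics.mean(control)
    # Welch-style standard error: sample variances, no equal-variance assumption.
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return diff, se, diff / se


# Hypothetical outcome Y for units with T = 1 and T = 0 (illustrative numbers only).
treated = [12.0, 15.0, 11.0, 14.0, 13.0]
control = [10.0, 11.0, 9.0, 12.0, 8.0]

size, se, t_stat = difference_in_means(treated, control)
direction = "positive" if size > 0 else "negative" if size < 0 else "zero"
print(f"direction={direction}, size={size:.2f}, t={t_stat:.2f}")
```

With samples this small the t statistic is only indicative, which is one reason rule n. 4 asks for good data and large samples.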
AN EXAMPLE
A program providing subsidies to increase R&D expenditures among small and medium enterprises (SMEs)
“subsidizing SMEs to do more R&D”
A MULTIPLE CHOICE TEST
What is the effect of the subsidies?
◦the number of R&D projects funded and completed
◦the take-up rate of the subsidy among eligible SMEs
◦the increase in R&D expenditures among subsidized SMEs
◦the difference in R&D expenditure between subsidized and non-subsidized SMEs
◦none of the above
the number of R&D projects funded and completed
the take-up rate of the subsidy among eligible SMEs
The number can be very high and the take-up rate can be 100%, yet the effect can be zero
COSCE curse n. 2
The number of R&D projects is not a “gross impact”. It is not an impact. It’s a measure of activity. There is no such thing as a gross impact
the increase in R&D expenditures among subsidized SMEs
COSCE curse n. 3
The deadweight (DW) is nothing other than the counterfactual. The only special thing about it is that it is used when money is clearly wasted. Demonstrable Waste (DW) is a better name for it
The increase is not an effect: the subsidies might have gone to firms whose R&D expenditures were already growing
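A small simulated illustration of this point, under loudly stated assumptions: the firms, their expenditure figures and their trends are invented, and the true effect of the subsidy is set to zero by construction. The before-after increase among subsidized firms is still positive, because it only reflects what would have happened anyway.

```python
# Assumption: the subsidy has no effect at all (set by construction).
TRUE_EFFECT = 0.0

# Hypothetical subsidized SMEs: R&D expenditure before the program
# and the growth each firm would have had anyway (its own trend).
firms = [
    {"before": 100.0, "trend": 10.0},
    {"before": 200.0, "trend": 30.0},
    {"before": 150.0, "trend": 20.0},
]


def outcome_after(firm, subsidized):
    """Expenditure after the program: baseline + own trend (+ effect if subsidized)."""
    return firm["before"] + firm["trend"] + (TRUE_EFFECT if subsidized else 0.0)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# What we observe: the before-after increase among subsidized firms.
observed_increase = mean(outcome_after(f, True) - f["before"] for f in firms)
# The counterfactual: the increase the same firms would have had without the subsidy.
counterfactual_increase = mean(outcome_after(f, False) - f["before"] for f in firms)
effect = observed_increase - counterfactual_increase

print(f"before-after increase: {observed_increase:.1f}")
print(f"increase without subsidy: {counterfactual_increase:.1f}")
print(f"effect: {effect:.1f}")
```

In real data the counterfactual increase is never observed for subsidized firms; the simulation can show it only because the data-generating process is made up.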
the difference in R&D expenditure between subsidized and non-subsidized SMEs
The post-treatment difference in outcomes does not identify any effect: the difference might be entirely due to initial differences (selection bias)
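The following sketch illustrates selection bias with invented numbers. It assumes that firms with higher baseline R&D expenditure are the ones that end up subsidized, and that the true effect of the subsidy is zero. The post-treatment difference between the two groups is then large, and all of it is selection bias.

```python
# Assumption: the subsidy has no effect at all (set by construction).
TRUE_EFFECT = 0.0

# Hypothetical SMEs: firms that already spend more on R&D are the ones subsidized.
firms = [
    {"baseline": 300.0, "subsidized": True},
    {"baseline": 250.0, "subsidized": True},
    {"baseline": 100.0, "subsidized": False},
    {"baseline": 150.0, "subsidized": False},
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def outcome(firm, treated):
    """Post-treatment R&D expenditure with or without the subsidy."""
    return firm["baseline"] + (TRUE_EFFECT if treated else 0.0)


subsidized = [f for f in firms if f["subsidized"]]
others = [f for f in firms if not f["subsidized"]]

# What the data show: post-treatment difference between the two groups.
observed_difference = (
    mean(outcome(f, True) for f in subsidized)
    - mean(outcome(f, False) for f in others)
)
# Selection bias: the difference that would exist even without any subsidy.
selection_bias = (
    mean(outcome(f, False) for f in subsidized)
    - mean(outcome(f, False) for f in others)
)
effect = observed_difference - selection_bias

print(f"observed difference: {observed_difference:.1f}")
print(f"selection bias: {selection_bias:.1f}")
print(f"effect: {effect:.1f}")
```

The selection-bias term can be computed here only because the no-subsidy outcome of subsidized firms is known by construction; with real data it has to be estimated under assumptions.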
COSCE curse n. 4
The Commission is stuck on the decomposition “gross impact = net effect + deadweight”
The world literature focuses on the decomposition “observed difference = effect + selection bias”
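The second decomposition can be written out as follows. This is a sketch in potential-outcomes notation, which the slides do not use: $Y^1$ and $Y^0$ (the outcome with and without treatment) are assumed symbols, while $Y$ and $T$ are those of rule n. 6, and "effect" is read here as the average effect on the treated.

```latex
\[
\underbrace{E[Y \mid T=1] - E[Y \mid T=0]}_{\text{observed difference}}
=
\underbrace{E[Y^1 - Y^0 \mid T=1]}_{\text{effect (on the treated)}}
+
\underbrace{E[Y^0 \mid T=1] - E[Y^0 \mid T=0]}_{\text{selection bias}}
\]
```

The identity follows by adding and subtracting $E[Y^0 \mid T=1]$, the counterfactual term that is never observed. The selection-bias term vanishes when treated and untreated units would have had the same average outcome without treatment, as under random assignment.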
What does COSCE have to say about the limitations of counterfactual impact evaluation?
In some quarters, CIE is seen as a universal approach, able to solve all inferential problems through the use of ever more sophisticated methods.
COSCE disagrees and views CIE as an important contribution, but one with important limitations in its applicability to Structural Funds, both in terms of relevance and compatibility.
Different types of cohesion policies: Support for R&D projects; Transport infrastructure; Human capital investment; Urban renewal; Renewable energy; Investment support
Relevance and compatibility:
Behavioral (vs. redistributive) motive: ++ ++ ++ + + +
Replicable nature (vs. idiosyncratic): ++ ++ + - - ++
Homogeneous treatment (vs. composite): ++ + + - - - +
Large numbers of eligible units: + ++ + - - - ++
What timing for counterfactual impact evaluation?
When it is prospective, i.e. it is designed together with the intervention, impact evaluation can have a strong disciplinary effect.
First, it can help focus the attention of both policy-makers and beneficiaries on objectives.
Secondly, it creates an incentive to assemble the information necessary to assess results.
Thirdly, it brings to light the criteria by which beneficiaries are selected.
BARCA DIXIT
Above timing, above relevance, above compatibility, the most important determinant of the diffusion of counterfactual impact evaluation is the interest and willingness, on the part of some influential stakeholder, of truly learning about what works and why.