Development Impact Evaluation Field Coordinator Training Washington, DC April 22-25, 2013 Randomization “how to” in Stata (plus other random stuff) Aidan Coville

SAMPLE SIZE IN PERSPECTIVE - World Banksiteresources.worldbank.org/INTDEVIMPEVAINI/.../1_Stata...Sampling.pdf · Overview Putting sample size in perspective (what ICC and MDE really

Feb 11, 2018

Overview

Putting sample size in perspective (what ICC and MDE really imply for sample size)

How to randomize in Stata (SRS, Multiple treatment arms, Stratification, Clustering)

Practical notes for the field

SAMPLE SIZE IN PERSPECTIVE

Sample size (implications of MDE and ICC)

Great to get a precise estimate, but this comes at a cost

Need to reflect on the full set of options to make a reasonable decision

MDE (no clustering)

[Figure: sample size n1 (vertical axis, 0 to 4000) plotted against MDE (horizontal axis, 0 to 15)]

MDE (no clustering)

[Figure: sample size n1 (vertical axis, 0 to 2500) plotted against MDE (horizontal axis, 4 to 16)]

ICC sample implications

[Figure: sample size (vertical axis, 0 to 4000) plotted against MDE (horizontal axis, 4 to 16); series: n1, icc_005, icc_01, icc_015]
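As a rough illustration of how MDE and ICC drive sample size, the sketch below uses the standard two-arm formula for comparing means and inflates it by the design effect 1 + (m - 1) * ICC. The standard deviation (50), cluster size (5), 5% significance level and 80% power are assumptions chosen for illustration, not values taken from the slides; only the variable names mirror the figure legend.

```stata
* Illustrative sketch only: sd, m, the significance level and power are assumptions
clear
set obs 13
gen mde = _n + 3                                  // MDE grid: 4, 5, ..., 16

local sd = 50                                     // assumed outcome standard deviation
local m  = 5                                      // assumed observations per cluster
local z  = invnormal(0.975) + invnormal(0.80)     // 5% two-sided test, 80% power

* Per-arm sample size without clustering: n = 2 * z^2 * sd^2 / MDE^2
gen n1 = ceil(2 * (`z')^2 * (`sd')^2 / mde^2)

* Inflate by the design effect 1 + (m - 1) * ICC for clustered designs
gen icc_005 = ceil(n1 * (1 + (`m' - 1) * 0.05))
gen icc_01  = ceil(n1 * (1 + (`m' - 1) * 0.10))
gen icc_015 = ceil(n1 * (1 + (`m' - 1) * 0.15))

list mde n1 icc_005 icc_01 icc_015, noobs
```

Halving the MDE roughly quadruples the required sample, and a higher ICC raises it further at every MDE.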

Improving accuracy (within the realms of reality)

Multiple rounds of data collection on key variables (McKenzie, 2012)

Stratify ex ante on variables most associated with outcomes (Bruhn & McKenzie, 2008)

At the extreme this includes pairwise matching

Focus on more homogeneous groups…? (McKenzie, 2010)

RANDOMIZATION (ACTUALLY DOING IT)

Randomization in Stata

Choose a “seed” to start from, e.g. from your token ID (this ensures replicability)

set seed 838448

Generate random numbers for each obs

gen rand_num = uniform()

Rank numbers from smallest to largest

egen ordering = rank(rand_num)

Randomization in Stata

sort ordering

Unique ID   Outcome of interest   rand_num   ordering
13          0                     0.0034     1
5           1                     0.0053     2
2           1                     0.0091     3
17          0                     0.0132     4
9           0                     0.0182     5
4           0                     0.0199     6
56          1                     0.0213     7

Randomization in Stata

For one treatment and one control group, assign treatment status by giving the first half to treatment and the second half to control

gen group = ""

replace group = "T" if ordering <= _N/2
replace group = "C" if ordering > _N/2

Randomization in Stata

Unique ID   Outcome of interest   rand_num   ordering   Group
13          0                     0.0034     1          T
5           1                     0.0053     2          T
2           1                     0.0091     3          T
17          0                     0.0132     4          C
9           0                     0.0182     5          C
4           0                     0.0199     6          C
56          1                     0.0213     7          C
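The steps above can be combined into one short do-file. This is a minimal sketch on made-up data: the 100-unit sample and the variable name unique_id are hypothetical, and runiform() is used as the documented name in recent Stata versions for the uniform() call on the slides. Sorting on a unique ID before drawing random numbers is added here so that the same seed reproduces the same assignment regardless of how the data happened to be ordered.

```stata
* Hypothetical sample of 100 units; unique_id is an illustrative name
clear
set obs 100
gen unique_id = _n

isid unique_id              // stop if the ID does not uniquely identify observations
sort unique_id              // fix the sort order before drawing random numbers
set seed 838448
gen rand_num = runiform()
isid rand_num               // stop if two observations drew the same number (ties)
egen ordering = rank(rand_num)

* First half of the ranking to treatment, second half to control
gen group = cond(ordering <= _N/2, "T", "C")
tab group
```

Re-running the file from the top with the same seed and the same input data should return the same assignment.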

Taking it further

What about:

Multiple treatment arms

Stratification

Clustering

Multiple treatment arms

Basically, just create random numbers and, instead of splitting into 2 groups, split into 6 (one control and five treatment arms)

gen group2 = ""
replace group2 = "C" if ordering <= _N/6
forvalues i = 1/5 {
    replace group2 = "T`i'" if ordering <= (`i'+1)*_N/6 & ordering > `i'*_N/6
}

Multiple treatment arms

Unique ID   Outcome of interest   rand_num   ordering   Group
13          0                     0.0034     1          C
5           1                     0.0053     2          T1
2           1                     0.0091     3          T2
17          0                     0.0132     4          T3
9           0                     0.0182     5          T4
4           0                     0.0199     6          T5
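The same split can be written without a loop by converting the rank directly into an arm number. The sketch below is one possible variant, not the method from the slides: the 120-unit sample, the local k and the variable arm are hypothetical names introduced for illustration.

```stata
* Hypothetical sample of 120 units split into 1 control and 5 treatment arms
clear
set obs 120
gen unique_id = _n
sort unique_id
set seed 838448
gen rand_num = runiform()
egen ordering = rank(rand_num)

local k = 6                                  // total number of arms, control included
gen arm = ceil(`k' * ordering / _N)          // 1, 2, ..., k in equal-sized blocks
gen group2 = cond(arm == 1, "C", "T" + string(arm - 1))
tab group2
```

Changing k changes the number of arms; when _N is not a multiple of k the arms will differ in size by at most one observation.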

Stratification

Define all strata, e.g. if stratifying by gender and urban/rural you have 4 groups

Randomize within each group

         Urban   Rural
Male     20      60
Female   40      30

Stratification

gen group3 = ""
egen strata = group(gender urban)   // creates 4 groups
bysort strata: egen ordering2 = rank(rand_num)
bysort strata: replace group3 = "T" if ordering2 <= _N/2
bysort strata: replace group3 = "C" if ordering2 > _N/2
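To see stratified assignment run end to end, the sketch below builds a made-up dataset with the stratum sizes from the gender by urban/rural example (20, 60, 40, 30) and then checks the split within each stratum. The way the data are constructed and the name unique_id are assumptions for illustration only.

```stata
* Hypothetical data: 150 units with stratum sizes 20, 60, 40 and 30
clear
set obs 150
gen unique_id = _n
gen gender = cond(_n <= 80, "Male", "Female")
gen urban  = cond(_n <= 20 | (_n > 80 & _n <= 120), 1, 0)

egen strata = group(gender urban)            // creates 4 groups
sort unique_id
set seed 838448
gen rand_num = runiform()

* Rank and split within each stratum; under by, _N is the stratum size
bysort strata: egen ordering2 = rank(rand_num)
bysort strata: gen group3 = cond(ordering2 <= _N/2, "T", "C")

tab strata group3                            // check the split within each stratum
```

Every stratum here has an even number of units, so each splits exactly in half; strata with odd counts need the extra handling discussed for uneven strata.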

Clustering

See accompanying handout. Randomize at the cluster level, then assign all (or a sample of) observations in each cluster to that cluster's treatment or control status.
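The handout is not reproduced here, so the following is only a sketch of one common way to do this in Stata: collapse to one row per cluster, randomize those rows, and merge the assignment back onto the individual observations. The data (20 clusters of 10 units) and the names cluster_id, unit_id and group4 are hypothetical.

```stata
* Hypothetical data: 20 clusters of 10 observations each
clear
set obs 200
gen unit_id = _n
gen cluster_id = ceil(_n/10)

* Randomize on a one-row-per-cluster copy of the data
preserve
    keep cluster_id
    duplicates drop
    sort cluster_id                          // fixed order so the seed reproduces the draw
    set seed 838448
    gen rand_num = runiform()
    egen ordering = rank(rand_num)
    gen group4 = cond(ordering <= _N/2, "T", "C")
    keep cluster_id group4
    tempfile assignment
    save `assignment'
restore

* Every observation inherits its cluster's assignment
merge m:1 cluster_id using `assignment', nogenerate
tab group4
```

Because the random numbers are drawn once per cluster, no cluster can end up split between treatment and control.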

What if observations don’t fit neatly into strata? What if we have multiple strata and multiple treatment arms?

Review Miriam Bruhn and David McKenzie’s blog post on this:

http://blogs.worldbank.org/impactevaluations/tools-of-the-trade-doing-stratified-randomization-with-uneven-numbers-in-some-strata

IN THE FIELD

Confirming treatment assignment

For a valid RCT, we need compliance, or at least a good understanding of the level of compliance in the field (how much deviation there was from the treatment assignment).

Assess availability of project monitoring data

Independent audit

Self-reported exposure in follow-up questionnaire (include falsification questions to assess accuracy)
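Once monitoring, audit or self-reported data are available, a simple cross-tabulation of assigned against received treatment gives a first picture of compliance. The eight records and the variable names assigned, received and complied below are invented purely for illustration.

```stata
* Hypothetical records: assigned = randomized status, received = status observed in the field
clear
input unique_id assigned received
1 1 1
2 1 1
3 1 0
4 1 1
5 0 0
6 0 0
7 0 1
8 0 0
end

tab assigned received, row                   // off-diagonal cells are deviations from assignment

gen complied = (assigned == received)
summarize complied                           // the mean is the share following their assignment
```

Off-diagonal cells in the table separate the two kinds of deviation: assigned units that were not treated, and control units that were.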

Dealing with survey companies

How do we deal with replacement?

How do we keep track of non-response?

Missing at random vs. systematic non-response

How do we make sure that the survey company sticks to the sampling frame?

All causal inference relies on the assumption that the sampling method was as stated in the methodology

Use ToR templates to make sure these issues are unambiguous
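One rough way to look for systematic non-response is to compare response rates across treatment arms. The sketch below does this on invented data; the variable names treat and responded are hypothetical, and a t-test on a binary indicator is only a quick first check, not a full attrition analysis.

```stata
* Hypothetical data: 10 control and 10 treatment units, with a response indicator
clear
set obs 20
gen treat = _n > 10
gen responded = 1
replace responded = 0 in 1/2                 // two control non-respondents
replace responded = 0 in 11                  // one treatment non-respondent

tabulate treat responded, row                // response rate by arm
ttest responded, by(treat)                   // does the response rate differ by arm?
```

A large or significant gap in response rates between arms would suggest non-response is related to treatment status rather than random.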

Important for the field

1. Always keep a record of the do-file for replication purposes

2. Never provide implementers with the control list unless you have to

3. Best not to provide the survey company with treatment status if possible (blind study)

References

Bruhn, M., & McKenzie, D. (2008). In pursuit of balance: Randomization in practice in development field experiments. World Bank Policy Research Working Paper Series, Vol.

McKenzie, D. (2010). Impact Assessments in Finance and Private Sector Development: What have we learned and what should we learn? The World Bank Research Observer, 25(2), 209-233.

McKenzie, D. (2012). Beyond baseline and follow-up: The case for more T in experiments. Journal of Development Economics, 99(2), 210-221.