Transcript
Page 1:

IES Workshop on Evaluating State and District Level Interventions

Mark W. Lipsey, Director, Center for Evaluation Research and Methodology, Vanderbilt University

David Holdzkom, Assistant Superintendent for Evaluation and Research, Wake County Public School System, North Carolina

April 24, 2008, Washington, DC

Page 2:

Purpose

To help schools, districts, and states design and implement rigorous evaluations of the effects of promising practices, programs, and policies on educational outcomes.

Page 3:

Why encourage locally initiated impact evaluation?

• Many interventions are not effective; users and other interested parties need to know.

• The interventions most relevant to improving outcomes are those that schools and districts believe are promising and feasible.

• IES has funding to support research initiated by schools, districts, and states.

Page 4:

What kinds of interventions might be evaluated?

• Practices, e.g., one-on-one tutoring, educational software, acceleration of high ability students, cooperative learning.

• Programs, e.g., Reading Recovery, Ladders to Literacy, Cognitive Tutor algebra, Saxon Math, Caring School Community (character education).

• Policies, e.g., reduced class size, pre-K, alternative high schools, year-round calendar.

Page 5:

Key Issues in Designing Impact Evaluations for Education Interventions

Page 6:

Logic Models, Variables, and Evaluation Questions

Page 7:

Logic model: 1. Specifying the problem the intervention addresses

Nature of the need:

• What and for whom (e.g., kindergarten students who aren’t ready for school).

• Why (e.g., poor pre-literacy skills, inappropriate school behavior).

• Rationale/evidence supporting the intervention target (e.g., at kindergarten entry, students need to be ready to learn or they will begin to fall behind; research shows school readiness can be enhanced for at-risk 4-year-olds).

Page 8:

Logic model: 2. Specifying the planned intervention

What the intervention does that addresses the need:

• Content: What the students should know or be able to do; why this meets the need.

• Pedagogy: Instructional techniques and methods to be used; why appropriate.

• Delivery system: How the intervention will arrange to deliver the instruction.

• The key factors or core ingredients most essential and distinctive to the intervention.

Page 9:

Logic model: 3. Specifying the theory of change

[Diagram: theory of change. Target population: 4-year-old at-risk children → Intervention: pre-K with literacy curriculum → Proximal outcomes: positive attitudes to school, improved pre-literacy skills, learning appropriate school behavior → Distal outcomes: increased school readiness, greater learning gains in kindergarten.]

Page 10:

Mapping variables onto the intervention theory: Sample characteristics

[Diagram: the theory-of-change model annotated with sample characteristics measured on the target population.]

Sample descriptors: basic demographics; diagnostic, need/eligibility identification; baseline performance.

Potential moderators: setting, context; personal and family characteristics; prior experience.

Page 11:

Mapping variables onto the intervention theory: Intervention characteristics

[Diagram: the theory-of-change model annotated with intervention characteristics.]

Independent variable: T vs. C comparison conditions.

Generic fidelity: T and C exposure to the generic aspects of the intervention (type, amount, quality).

Specific fidelity: T and C (?) exposure to distinctive aspects of the intervention (type, amount, quality).

Potential moderators: characteristics of personnel; intervention setting, context (e.g., class size).

Page 12:

Mapping variables onto the intervention theory: Intervention outcomes

[Diagram: the theory-of-change model annotated with intervention outcomes measured on children exposed to the intervention.]

Focal dependent variables: pretests (pre-intervention); posttests (at end of intervention); follow-ups (lagged after end of intervention).

Other dependent variables: side effects (possible unplanned positive or negative outcomes); mediators (DVs on causal pathways from the intervention to other DVs).

Page 13:

Research questions: Relationships of (possible) interest

• Intervention effects: Causal relationship between intervention and outcomes.

• Duration of effects post-intervention.

• Moderator relationships: Differential intervention effects for different subgroups.

• Mediator relationships: Stepwise causal relationships in which effects on a proximal outcome produce effects on a distal outcome.
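These questions map onto familiar regression models. The moderator question, for instance, amounts to testing a treatment-by-subgroup interaction. A minimal sketch in Python (not part of the original slides; data and variable names are hypothetical):

```python
# Sketch: testing a moderator relationship with a treatment-by-subgroup
# interaction term (hypothetical variable names, simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treat": rng.integers(0, 2, n),    # 1 = intervention, 0 = control
    "at_risk": rng.integers(0, 2, n),  # hypothetical subgroup indicator
})
# Simulated outcome: the treatment effect is larger for at-risk students.
df["outcome"] = (0.2 * df["treat"] + 0.3 * df["treat"] * df["at_risk"]
                 + rng.normal(0, 1, n))

# The coefficient on treat:at_risk estimates the differential (moderated) effect.
model = smf.ols("outcome ~ treat * at_risk", data=df).fit()
print(model.params)
```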

Page 14:

Research Designs for Estimating Intervention Effects

Page 15:

What is an intervention effect and why is it so difficult to determine?

Page 16:

Research designs to discuss

• Two strong ones:
  1. Randomized experiment
  2. Regression-discontinuity

• Two weak ones:
  3. Nonrandomized comparison groups with statistical controls
  4. Comparative interrupted time series

Page 17:

1. Randomized experiment

[Diagram: a research sample of students, teachers, classrooms, schools, etc. is blocked on a pretest (high, med-high, med-low, low) and randomly assigned to conditions: one group receives the experimental intervention, the other does not. Posttest outcomes for the two groups are compared; the difference is the intervention effect.]
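A minimal Python sketch (not part of the original slides) of the allocation step the diagram describes: block the sample on the pretest, then randomly assign to conditions within blocks. All data and names are hypothetical.

```python
# Sketch: random assignment to T/C within pretest blocks (quartiles here),
# mirroring the pretest blocking shown in the diagram.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
students = pd.DataFrame({"id": range(40), "pretest": rng.normal(50, 10, 40)})

# Form four pretest blocks (low, med-low, med-high, high).
students["block"] = pd.qcut(students["pretest"], 4,
                            labels=["low", "medlow", "medhigh", "high"])

# Randomly assign half of each block to treatment, half to control.
def assign(block):
    cond = np.array(["T"] * (len(block) // 2) + ["C"] * (len(block) - len(block) // 2))
    rng.shuffle(cond)
    return pd.Series(cond, index=block.index)

students["condition"] = students.groupby("block", group_keys=False)["pretest"].apply(assign)
print(students.groupby(["block", "condition"]).size())
```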

Page 18:

Circumstances conducive to randomized experiments

• More demand than supply for the program – allocate the scarce resource by lottery.

• New program that can be phased in – wait-list control has a delayed start.

• Pull-out or add-on program for selected students – randomly select from among those eligible.

• Volunteers willing to opt in for a chance to receive the program.

Page 19:

Example: Junior high algebra curriculum

• In 2000-01 the Moore Oklahoma Independent School District conducted a study of the effectiveness of the Cognitive Tutor Algebra I program on students in their junior high school system.

• Students in 5 junior high schools were randomly assigned to either the Cognitive Tutor Algebra I course or the practice-as-usual algebra courses. Cognitive Tutor teachers received the curriculum materials and 4 days of training.

• Outcome measures included the ETS Algebra I end-of-course exam, course grades, and a survey of student attitudes towards mathematics.

Page 20:

Example: Alternative high school for students at risk of dropping out

• Horizon High School in Las Vegas identified 9th and 10th grade students behind grade level and at risk of dropping out.

• A random sample of these students was assigned to attend an alternative high school that featured a focus on cooperative learning, small group instruction, and support services.

• Outcomes were compared for the alternative and regular high schools on dropout rates, self-esteem, drug use, and arrest rates.

Page 21:

Example: Remedial reading programs for elementary students

• The Allegheny Intermediate Unit (AIU), which serves 42 suburban school districts in Allegheny County, Pennsylvania, randomly assigned 50 schools to one of four commercially available remedial reading interventions.

• Within each school, struggling readers in grades 3 and 5 were identified and randomly assigned to instruction as usual or the remedial reading program designated for that school.

• In each program, groups of 3 students met with a trained teacher for one hour a day for 20 weeks.

• Measures of reading skill were administered at the beginning and end of the school year for program and control students.

Page 22:

2. Regression-discontinuity (aka the cutting-point design)

• When well executed, its ability to provide an unbiased estimate of the intervention effect is strong – comparable to that of a randomized experiment.

• It is adaptable to many circumstances where it may be difficult to apply a randomized design.

Page 23:

Consider first a posttest on pretest regression for a randomized experiment with no effect

[Plot: posttest (Y) regressed on pretest (S); the T and C regression lines coincide, crossing the mean of S at the mean of Y.]

Corresponding regression equation (T: 1 = treatment, 0 = control):

Y_i = B_0 + B_S S_i + B_T T_i + e_i
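A minimal Python sketch (not from the slides) of fitting this equation to simulated data; the coefficient on T is the intervention effect estimate.

```python
# Sketch: estimating Y_i = B_0 + B_S*S_i + B_T*T_i + e_i on simulated data.
# The coefficient on T is the intervention effect estimate.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({"S": rng.normal(0, 1, n),      # pretest
                   "T": rng.integers(0, 2, n)})   # 1 = treatment, 0 = control
df["Y"] = 2.0 + 0.8 * df["S"] + 0.5 * df["T"] + rng.normal(0, 1, n)  # true effect 0.5

fit = smf.ols("Y ~ S + T", data=df).fit()
print(fit.params["T"], fit.bse["T"])   # effect estimate and its standard error
```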

Page 24:

Pretest-posttest randomized experiment, now with an intervention effect

[Plot: posttest (Y) on pretest (S) regression lines for T and C; at the common T & C mean of S, the T line sits above the C line by Δ, the intervention effect, giving separate T and C means of Y.]

Y_i = B_0 + B_S S_i + B_T T_i + e_i

Page 25:

Consider now the same regression with no effect but with a cutting point applied

[Plot: posttest (Y) on the selection variable (S) with a cutting point; cases on either side of the cutting point form the T and C groups. With no effect, the T and C segments fall on a single continuous regression line; the T and C means of Y differ only because the groups cover different ranges of S.]

Y_i = B_0 + B_S S_i + B_T T_i + e_i

Page 26:

Regression discontinuity scatterplot (null case)

[Scatterplot: posttest (Y) against the selection variable (S) with the cutting point marked; in the null case the T and C points fall along one continuous regression line with no break at the cutting point.]

Y_i = B_0 + B_S S_i + B_T T_i + e_i

Page 27:

Now add an intervention effect

[Plot: with an intervention effect, the T regression line is displaced from the C line by Δ at the cutting point on the selection variable (S).]

Y_i = B_0 + B_S S_i + B_T T_i + e_i

Page 28:

Regression discontinuity scatterplot with effect

[Scatterplot: posttest (Y) against the selection variable (S); the T and C point clouds show a visible discontinuity in the regression line at the cutting point.]

Y_i = B_0 + B_S S_i + B_T T_i + e_i

Page 29:

The effect estimated by R-D is the same as that from the randomized experiment

[Plot: the displacement Δ between the T and C regression lines at the cutting point on the selection variable (S) is the same intervention effect that the randomized experiment estimates.]

Y_i = B_0 + B_S S_i + B_T T_i + e_i
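A minimal Python sketch (not from the slides) of the R-D analysis: treatment is assigned strictly by the cutting point, and the coefficient on T estimates the effect at the cutoff. Data are simulated and a linear functional form is assumed.

```python
# Sketch: regression-discontinuity analysis. Assignment to T is determined
# strictly by a cutting point on the selection variable S; the coefficient
# on T estimates the intervention effect at the cutoff.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 500
S = rng.uniform(0, 100, n)
cutoff = 50
T = (S < cutoff).astype(int)                        # e.g., scores below the cutoff get the program
Y = 10 + 0.1 * S + 3.0 * T + rng.normal(0, 2, n)    # true effect 3.0

df = pd.DataFrame({"S_centered": S - cutoff, "T": T, "Y": Y})
# Linear functional form; checking curvilinearity and an interaction with the
# cutting point (as the slides note) would add S_centered**2 and S_centered:T terms.
fit = smf.ols("Y ~ S_centered + T", data=df).fit()
print(fit.params["T"])
```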

Page 30:

The selection variable for R-D

• A continuous quantitative variable measured on every candidate for assignment to T or C who will participate in the study.

• Assignment to T or C strictly on the basis of the score obtained and the predetermined cutting point.

• Does not have to correlate highly with the outcome variable.

• Can be tailored to represent an appropriate basis for the assignment decision in the setting.

Page 31:

Special issues with the R-D design

• Correctly fitting the functional form – possibility that it is not linear; curvilinear functions; interaction with the cutting point.

• Statistical power – requires about 3 times the sample size of a comparable randomized experiment; covariates correlated with the outcome but not the selection variable are helpful.

Page 32:

Circumstances conducive to the regression-discontinuity design

• The situation involves a selection from some larger group of those who will, or should, receive the intervention and those who will not.

• The basis for selection is or can be made explicit and systematic enough to be captured in a quantitative rating or ranking.

• The allocation of the intervention can be made strictly on the basis of the selection score and cutting point in a high proportion of cases. Exceptions can be identified in advance and exempted from the scheme.

Page 33:

Example: Effects of universal pre-k in Tulsa, Oklahoma

• Eligibility for pre-k determined strictly on the basis of age– cutoff by birthday.

• Overall sample of 1,567 children just beginning pre-k plus 1,461 children just beginning kindergarten who had been in pre-k the previous year.

• Woodcock-Johnson (WJ) Letter-Word, Spelling, and Applied Problems tests as outcome variables.

Page 34:

Entry into pre-k selected by birthday

[Diagram: WJ test score plotted against age, with a cutoff at the September 1 birthday. T: children born before September 1 completed pre-K and were tested at the beginning of kindergarten. C: children born after September 1 had no pre-K yet and were tested at the beginning of the pre-K year. The discontinuity at the cutoff (?) is the pre-K effect.]

Page 35:

Samples and testing

[Diagram: the first cohort attends pre-K in Year 1 and kindergarten in Year 2; the second cohort attends pre-K in Year 2. The WJ tests are administered to both cohorts at the beginning of Year 2.]

Page 36:

Excerpts from Regression Analysis (B coefficients; * p < .05)

Variable                    Letter-Word   Spelling   Applied Probs
Treatment (T)                   3.00*        1.86*        1.94*
Age: Days ± from Sept 1          .01          .01*         .02*
Days²                            .00          .00          .00
Days × T                         .00         -.01         -.01
Days² × T                        .00          .00          .00
Free lunch                     -1.28*        -.89*       -1.38*
Black                            .04         -.44*       -2.34*
Hispanic                       -1.70*        -.48*       -3.66*
Female                           .92*        1.05*         .76*
Mother's educ: HS                .59*         .57*        1.25*

Page 37:

3. Nonrandomized comparison groups with statistical controls

• Statistical controls: Analysis of covariance and multiple regression

• Matching on the control variables

• Propensity scores derived from the control variables.

Page 38:

Nonequivalent comparison analog to the randomized experiment

[Diagram: from a population of students, teachers, classrooms, schools, etc., units are selected through some nonrandom, more-or-less natural process into a group that receives the experimental intervention and a group that does not; their outcomes are compared to yield an intervention effect estimate (??).]

Page 39:

Issues for obtaining good intervention effect estimates from nonrandomized comparison groups

• The fundamental problem: selection bias.

• Knowing/measuring the variables necessary and sufficient to statistically control for the selection bias – characteristics related to the outcome on which the groups differ.

• Using an analysis model that properly adjusts for the selection bias, given appropriate control variables.

Page 40:

Nonequivalent comparison groups: Pretest/covariate and posttest means

[Plot: posttest (Y) on pretest/covariate (X) regression lines for the T and C groups, showing both the difference in pretest/covariate means and the difference in posttest means.]

Y_i = B_0 + B_X X_i + B_T T_i + e_i

Page 41:

Nonequivalent comparison groups: Covariate-adjusted treatment effect estimate

[Plot: the covariate-adjusted treatment effect Δ is the vertical displacement between the T and C regression lines of posttest (Y) on the pretest/covariate (X).]

Y_i = B_0 + B_X X_i + B_T T_i + e_i

Page 42:

Covariate-adjusted treatment effect estimate with a relevant covariate left out

[Plot: when a relevant covariate is left out, the adjusted displacement Δ between the T and C regression lines misestimates the intervention effect.]

Y_i = B_0 + B_X1 X_1i + B_X2 X_2i + B_T T_i + e_i
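A small simulation in Python (not from the slides) illustrating the point: when a covariate that drives both selection and the outcome is omitted, the adjusted treatment coefficient is biased.

```python
# Sketch: omitted-variable bias in a nonequivalent comparison design.
# X1 and X2 both drive selection into treatment and the outcome; adjusting
# for X1 only leaves part of the selection bias in the treatment coefficient.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 2000
X1 = rng.normal(0, 1, n)
X2 = rng.normal(0, 1, n)
# Nonrandom selection: higher X1 and X2 make treatment more likely.
p = 1 / (1 + np.exp(-(X1 + X2)))
T = rng.binomial(1, p)
Y = 1.0 * X1 + 1.0 * X2 + 0.5 * T + rng.normal(0, 1, n)   # true effect 0.5

df = pd.DataFrame({"X1": X1, "X2": X2, "T": T, "Y": Y})
full = smf.ols("Y ~ X1 + X2 + T", data=df).fit()    # both covariates: near 0.5
omitted = smf.ols("Y ~ X1 + T", data=df).fit()      # X2 left out: biased upward
print(full.params["T"], omitted.params["T"])
```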

Page 43:

Using control variables via matching

• Groupwise matching: select the control comparison to be groupwise similar to the intervention group, e.g., schools with similar demographics, geography, etc. Generally a good idea.

• Individual matching: select individuals from the potential control pool that match intervention individuals on one or more observed characteristics. May not be a good idea.

Page 44:

Potential problems with individual level matching

• Basic problem with nonequivalent designs– need to match on all relevant variables to obtain a good estimate of the intervention effect.

• If match on too few variables, may omit some that are important to control.

• If try to match on too many variables, the sample will be restricted to the cases that can be matched; may be overly narrow.

• If one must select disproportionately from one tail of the treatment distribution and the other tail of the control distribution, a regression-to-the-mean artifact may result.

Page 45:

Regression to the mean: Matching on the pretest

[Diagram: overlapping T and C pretest distributions; matches can be found only in the region where the two distributions overlap, i.e., the upper tail of one group and the lower tail of the other.]

Page 46:

Propensity scores as control variables

• The propensity score is the probability of being in the intervention group instead of the comparison group.

• It is estimated (“predicted”) from data on the characteristics of the individuals already in each group, typically using logistic regression.

• It thus combines all the control variables into a single variable optimized to differentiate the intervention sample from the control sample.
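A minimal Python sketch (not from the slides) of this estimation step, with hypothetical control variables and simulated group membership.

```python
# Sketch: estimating the propensity score (probability of being in the
# intervention group) from control variables via logistic regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 1000
df = pd.DataFrame({
    "pretest": rng.normal(50, 10, n),      # hypothetical control variables
    "free_lunch": rng.integers(0, 2, n),
})
# Simulated nonrandom group membership related to the control variables.
logit_p = -3 + 0.05 * df["pretest"] + 0.5 * df["free_lunch"]
df["T"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

ps_model = smf.logit("T ~ pretest + free_lunch", data=df).fit(disp=False)
df["pscore"] = ps_model.predict(df)        # one score combining the control variables
```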

Page 47:

One option: Use the propensity score to create matched groups

[Diagram: the treatment group and control group distributions are divided into propensity score quintiles, and matches are formed within each quintile.]
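A minimal Python sketch (not from the slides) of this quintile-stratification option, assuming a data set that already carries an estimated propensity score, a treatment indicator, and an outcome (all simulated here for illustration).

```python
# Sketch: stratification on propensity score quintiles, comparing T and C
# within each quintile and pooling the within-quintile differences.
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 1000
pscore = rng.uniform(0.1, 0.9, n)                 # already-estimated propensity scores
T = rng.binomial(1, pscore)                       # selection depends on the propensity score
Y = 2.0 * pscore + 0.5 * T + rng.normal(0, 1, n)  # true effect 0.5
df = pd.DataFrame({"pscore": pscore, "T": T, "Y": Y})

df["quintile"] = pd.qcut(df["pscore"], 5, labels=False)
diffs = df.groupby("quintile").apply(
    lambda g: g.loc[g["T"] == 1, "Y"].mean() - g.loc[g["T"] == 0, "Y"].mean())
print(diffs.mean())   # simple equal-weight pooling across quintiles
```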

Page 48:

Another option: Use the propensity score as a covariate in ANCOVA or MR

[Plot: posttest (Y) regressed on the propensity score (P) for the T and C groups; Δ is the adjusted treatment effect.]

Y_i = B_0 + B_P P_i + B_T T_i + e_i

Page 49:

Circumstances appropriate for the nonequivalent comparison design

• A stronger design is truly not feasible.

• A sample of relatively comparable units not receiving the intervention is available.

• A full account can be given of the differences between the groups potentially related to the outcomes of interest.

• Data on those differences can be obtained and used for statistical control.

Page 50:

Example: Effects of a professional development program for teachers

• In the Montgomery County Public Schools, MD, some 3rd grade teachers had received the Studying Skillful Teaching training and some had not.

• The reading and math achievement test scores for students of teachers with and without training were compared.

• Analysis of covariance was used to test for differences in student outcomes with a propensity score control variable and covariates representing teacher credentials, student pretest, reduced/free lunch status, ethnicity, and special ed or ELL service.

Page 51:

4. Comparative interrupted time series

[Plot: mean achievement by school year for 9th grade program schools and 9th grade other schools, with the program onset marked; the two series are compared before and after onset.]

Page 52:

Requirements for a good intervention effect estimate from comparative interrupted time series

• The fundamental problem: changes stemming from other sources.

• Sufficient pre-intervention time series data showing relative stability.

• No other potentially influential event coincides with the program onset (or staggered onsets, if available).

• Comparison time series for very similar units in the same environment but without the program.

• An analysis model that properly estimates changes and differences with autocorrelated data.
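A minimal Python sketch (not from the slides) of one common analysis framing: program vs. comparison series with a post-onset indicator and their interaction, plus a common linear trend. A fuller analysis would also model pre-intervention trends separately and the autocorrelation noted above.

```python
# Sketch: a simplified comparative interrupted time series model. The
# treated:post coefficient estimates the change in program schools beyond
# the change in comparison schools after program onset.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
years = np.arange(2000, 2008)
rows = []
for school_type in ("program", "comparison"):
    for year in years:
        post = int(year >= 2004)                # program onset in 2003-04 (simulated)
        treated = int(school_type == "program")
        score = (600 + 2 * (year - 2000)        # common trend
                 + 5 * treated                  # baseline difference between series
                 + 8 * treated * post           # program effect
                 + rng.normal(0, 2))
        rows.append({"year": year, "treated": treated, "post": post, "score": score})
df = pd.DataFrame(rows)

fit = smf.ols("score ~ treated * post + year", data=df).fit()
print(fit.params["treated:post"])
```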

Page 53:

Circumstances appropriate for comparative interrupted time series

• A stronger design is truly not feasible.

• Time series data on a relevant outcome for those exposed to the program are available for periods before and after the onset of the program.

• Sufficient data points are available, with no change in the nature of the measure, to establish stable statistical trends.

• Data on the same measure over the same time period are available for comparable cases without the program.

Page 54:

Example: The ninth grade Success Academy in Philadelphia

• The Success Academy grouped 9th graders together in small learning communities with a specialized curriculum and a small group of dedicated teachers.

• Implemented by 7 of the 22 nonselective high schools in 2003-04.

• The outcomes were attendance, academic credit earned, promotion to 10th grade, achievement test scores, and graduation rates.

• Outcomes are compared for 9th graders during the 3 years before and the 5 years after program onset, and for the program schools vs. a matched group of schools without the program.

Page 55:

Other Important Aspects of the Research Plan

Page 56:

Statistical power

• Statistical power = probability of statistical significance when there is an effect.

• Power is mainly a function of:
  – alpha level for significance testing
  – minimum effect size to detect, in standard deviation units
  – the sample size: number of students, classrooms, schools, etc.
  – the covariates included in the analysis
  – the research design and corresponding analysis model.

Page 57:

Power: Critical considerations

• A realistic identification of the minimal effect size with practical significance that the research should be powered to detect.

• The unit that is assigned to conditions (students, classrooms, schools, etc.).

• The intracluster correlations (ICC) expected for student outcomes when students are nested within the units assigned.

• The expected correlations with outcomes of any covariates measured on the units assigned to conditions.

• The number of schools, classrooms, students, etc. available for the study.

• Specific design issues such as the need for 3-4 times as many units for regression-discontinuity as for a comparable randomized experiment.
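A minimal Python sketch (not from the slides) of the kind of calculation these considerations feed into: the minimum detectable effect size (MDES) for a two-level cluster-randomized design, using the standard approximation with a multiplier of about 2.8 for 80% power at alpha = .05, two-tailed. All example numbers are hypothetical.

```python
# Sketch: minimum detectable effect size (MDES) for a two-level
# cluster-randomized design, using the standard approximation
#   MDES ~ M * sqrt( icc*(1-R2_L2)/(P*(1-P)*J) + (1-icc)*(1-R2_L1)/(P*(1-P)*J*n) )
# with M ~ 2.8 for 80% power at alpha = .05 (two-tailed, normal approximation).
import math

def mdes_cluster_randomized(J, n, icc, r2_level1=0.0, r2_level2=0.0, P=0.5, M=2.8):
    """J = clusters assigned, n = students per cluster, icc = intracluster correlation,
    r2_level1/r2_level2 = outcome variance explained by covariates at each level,
    P = proportion of clusters assigned to treatment."""
    between = icc * (1 - r2_level2) / (P * (1 - P) * J)
    within = (1 - icc) * (1 - r2_level1) / (P * (1 - P) * J * n)
    return M * math.sqrt(between + within)

# Example: 40 schools, 60 students each, ICC = .20, a covariate explaining
# 50% of the between-school variance (hypothetical numbers).
print(mdes_cluster_randomized(J=40, n=60, icc=0.20, r2_level2=0.50))
```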

Page 58:

Computer program for power estimation in multilevel designs

• Raudenbush, S. W., Spybrook, J., Liu, X., & Congdon, R. (2006). Optimal design for longitudinal and multilevel research: Documentation for the “Optimal Design” software. Optimal Design Version 1.76

http://sitemaker.umich.edu/group-based/optimal_design_software

Page 59:

Multilevel Data Analysis

• Applicable when sampling and assignment to conditions occurs with one unit (e.g., classrooms, schools) and outcomes are measured on units nested within (e.g., students).

• Requires specialized computer programs, e.g., HLM, MLwiN, SAS Proc Mixed, SPSS Mixed Models.
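A minimal Python sketch (not from the slides) of such a two-level analysis using statsmodels' MixedLM, an open-source alternative to the packages named above, with simulated students nested in schools and treatment assigned at the school level.

```python
# Sketch: a two-level model (students nested in schools, treatment assigned
# at the school level) fit with a random intercept for school.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_schools, n_students = 30, 40
rows = []
for s in range(n_schools):
    treated = int(s < n_schools // 2)      # half the schools get the program
    school_effect = rng.normal(0, 0.5)     # random school intercept (drives the ICC)
    for _ in range(n_students):
        y = 0.3 * treated + school_effect + rng.normal(0, 1)
        rows.append({"school": s, "treated": treated, "score": y})
df = pd.DataFrame(rows)

# Random intercept for school; the treated coefficient is the program effect.
fit = smf.mixedlm("score ~ treated", data=df, groups=df["school"]).fit()
print(fit.params["treated"])
```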

Page 60:

References and readings

Experimental and quasi-experimental design

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2001). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.

Bickman, L., & Rog, D. J. (Eds.) (2008). The SAGE handbook of applied social research methods (2nd ed.). Sage Publications.

Regression-discontinuity

Hahn, J., Todd, P., & Van der Klaauw, W. (2002). Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica, 69(1), 201-209.

Cappelleri, J. C., & Trochim, W. (2000). Cutoff designs. In S.-C. Chow (Ed.), Encyclopedia of biopharmaceutical statistics (pp. 149-156). New York: Marcel Dekker.

Cappelleri, J., Darlington, R. B., & Trochim, W. (1994). Power analysis of cutoff-based randomized clinical trials. Evaluation Review, 18, 141-152.

Page 61:

Nonequivalent comparison designs

Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55.

Luellen, J. K., Shadish, W. R., & Clark, M. H. (2005). Propensity scores: An introduction and experimental test. Evaluation Review, 29(6), 530-558.

Schochet, P. Z., & Burghardt, J. (2007). Using propensity scoring to estimate program-related subgroup impacts in experimental program evaluations. Evaluation Review, 31(2), 95-120.

Time series

Bloom, H. S. (2003). Using "short" interrupted time-series analysis to measure the impact of whole school reforms. Evaluation Review, 27(1), 3-49.

Chatfield, C. (2003). The analysis of time series: An introduction (6th ed.). Chapman & Hall/CRC.

Page 62:

Multilevel analysis

Hox, J. (2002). Multilevel analysis: Techniques and applications. Lawrence Erlbaum.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Sage Publications.

Examples used in this presentation

Morgan, P., & Ritter, S. (2002). An experimental study of the effects of Cognitive Tutor® Algebra I on student knowledge and attitude. Carnegie Learning, Inc.

Dynarski, M., Gleason, P., Rangarajan, A., & Wood, R. (1998). Impacts of dropout prevention programs: Final report. Princeton, NJ: Mathematica Policy Research.

Torgesen, J., Myers, D., Schirm, A., et al. (2006). National assessment of Title I interim report to Congress: Volume II: Closing the reading gap, first year findings from a randomized trial of four reading interventions for striving readers. Washington, DC: U.S. Department of Education, Institute of Education Sciences.

Page 63:

Gormley, W. T., Gayer, T., Phillips, D., & Dawson, B. (2005). The effects of universal pre-K on cognitive development. Developmental Psychology, 41(6), 872-884.

Modarresi, S., & Wolanin, N. (2007). The effects of the Studying Skillful Teaching training program on students' reading and mathematics achievement. Evaluation Brief, February. Montgomery County Public Schools, MD.

Kemple, J. J., Herlihy, C. M., & Smith, T. J. (2005). Making progress toward graduation: Evidence from the Talent Development High School model. New York: MDRC. [Includes the 9th grade Success Academy.]

Raudenbush, S. W., Spybrook, J., Liu, X., & Congdon, R. (2006). Optimal design for longitudinal and multilevel research: Documentation for the "Optimal Design" software (Version 1.76).