Using Student Learning Growth as a Measure of Effectiveness Laura Goe, Ph.D. Research Scientist, ETS, and Principal Investigator for the National Comprehensive.

Using Student Learning Growth as a Measure of Effectiveness

Laura Goe, Ph.D.

Research Scientist, ETS, and Principal Investigator for the National Comprehensive Center for Teacher Quality

Oregon Principals Conference

Monday, October 24, 2011, Bend, Oregon

2

The goal of teacher evaluation

The ultimate goal of all teacher evaluation should be…

TO IMPROVE TEACHING AND LEARNING

3

Measures: The right choice depends on what you want to measure

4

Validity is a process

• Starts with defining the criteria and standards you want to measure

• Requires judgment about whether the instruments and processes are giving accurate, helpful information about performance

• Verify validity by
  - Comparing results on multiple measures
  - Multiple time points, multiple raters
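One simple way to compare results across multiple raters is to check how often two observers assign the same rubric level to the same lesson. The sketch below is illustrative only: the function name, the 1-4 rubric levels, and the scores are hypothetical, and exact agreement is just one of several possible checks.

```python
def exact_agreement(rater_a, rater_b):
    """Share of lessons on which two raters gave the same rubric level."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty score lists")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical rubric levels assigned to the same five lessons.
principal = [3, 2, 4, 3, 1]
peer_observer = [3, 2, 3, 3, 1]
agreement = exact_agreement(principal, peer_observer)
print(f"Exact agreement: {agreement:.0%}")
```

Low agreement would suggest that observers need further training and calibration before their ratings are compared with other measures.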

5

Teaching standards

• A set of practices teachers should aspire to
• A teaching tool in teacher preparation programs
• A guiding document with which to align:
  - Measurement tools and processes for teacher evaluation, such as classroom observations, surveys, portfolios/evidence binders, student outcomes, etc.
  - Teacher professional growth opportunities, based on evaluation of performance on standards
• A tool for coaching and mentoring teachers:
  - Teachers analyze and reflect on their strengths and challenges and discuss with consulting teachers

6

Questions to ask about student learning growth aspects of teacher evaluation models

1. Inclusive (all teachers, subjects, grades). Do evaluation models allow teachers from all subjects and grades (not just 4-8 math & reading) to be evaluated with evidence of student learning growth according to standards for that subject/grade?

2. Professional growth. Can results from the measures be aligned with professional growth opportunities?

7

Measures that help teachers grow

• Measures that motivate teachers to examine their own practice against specific standards

• Measures that allow teachers to participate in or co-construct the evaluation (such as “evidence binders”)

• Measures that give teachers opportunities to discuss the results with evaluators, administrators, colleagues, teacher learning communities, mentors, coaches, etc.

• Measures that are directly and explicitly aligned with teaching standards

• Measures that are aligned with professional development offerings

• Measures which include protocols and processes that teachers can examine and comprehend

8

Comparability of measures

• It is not appropriate to use the same measure for every grade and subject
  - A measure that may be valid for one subject/grade may not be valid for another
• Measures should be chosen because they are appropriate for a specific subject and grade, not because they fit a certain format
  - A paper-and-pencil test may be appropriate for some subjects, while performance tests to measure applied knowledge and skills may be appropriate for others

9

Teacher observations: strengths and weaknesses

• Strengths
  - Great for teacher formative evaluation (if observation is followed by opportunity to discuss)
  - Helps evaluators (principals or others) understand teachers’ needs across school or across district
• Weaknesses
  - Only as good as the instruments and the observers
  - Considered “less objective”
  - Expensive to conduct (personnel time, training, calibrating)
  - Validity of observation results may vary with who is doing them, depending on how well trained and calibrated they are

10

Observations: Valued by teachers (usually)

• Teachers generally report that they value observations, but when they don’t, it’s because…

• They do not receive feedback at all
• The feedback they receive is not specific and actionable
• The observer suggests actions but is unable to offer the means and resources to carry out those actions
  - Mentors/coaches, other support personnel
  - Time for individual growth planning/activities
  - Protected time for collaboration with others

11

Most popular growth models: Value-added and Colorado Growth Model

• EVAAS uses prior test scores to predict the next score for a student
• Teachers’ value-added is the difference between actual and predicted scores for a set of students
• http://www.sas.com/govedu/edu/k12/evaas/index.html
• Colorado Growth Model
  - Betebenner 2008: Focus on “growth to proficiency”
  - Measures students against “academic peers”
  - www.nciea.org
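The "actual minus predicted" idea can be sketched in a few lines. This is a toy illustration, not the EVAAS model, which is considerably more complex: it assumes a single prior score per student and a straight-line prediction fitted across all students, and every name and score below is hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def value_added(students, intercept, slope):
    """Mean of (actual - predicted) over one teacher's students."""
    return mean(cur - (intercept + slope * prior) for prior, cur in students)

# Hypothetical (prior score, current score) pairs, keyed by teacher.
rosters = {
    "teacher_a": [(40, 52), (55, 66), (70, 80)],
    "teacher_b": [(42, 44), (58, 59), (73, 75)],
}

# Fit the prediction line on all students, then score each teacher.
all_pairs = [pair for roster in rosters.values() for pair in roster]
intercept, slope = fit_line(
    [prior for prior, _ in all_pairs],
    [cur for _, cur in all_pairs],
)
scores = {t: value_added(r, intercept, slope) for t, r in rosters.items()}
for teacher, score in scores.items():
    print(teacher, round(score, 2))
```

A positive score means the teacher's students scored higher, on average, than their prior scores predicted; a negative score means they scored lower.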


12

Slide courtesy of Damian Betebenner at www.nciea.org

Linking student learning results to professional growth opportunities


13

Value-Added: Error rates and stability

• “Type I and II error rates for comparing a teacher’s performance to the average are likely to be about 25 percent with three years of data and 35 percent with one year of data.”

• “Any practical application of value-added measures should make use of confidence intervals in order to avoid false precision, and should include multiple years of value-added data in combination with other sources of information to increase reliability and validity.”

(Schochet & Chiang, 2010, abstract)
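To illustrate why confidence intervals guard against false precision, the sketch below puts an approximate 95% interval around a teacher's mean residual (actual minus predicted score). It is a simplified illustration, not the method used by Schochet and Chiang: it assumes a normal approximation with z = 1.96, treats students as independent, and uses hypothetical residuals.

```python
from math import sqrt
from statistics import mean, stdev

def value_added_interval(residuals, z=1.96):
    """Return (estimate, lower, upper) for the mean residual.

    Assumes a normal approximation; z = 1.96 gives roughly a 95% interval.
    """
    estimate = mean(residuals)
    std_error = stdev(residuals) / sqrt(len(residuals))
    return estimate, estimate - z * std_error, estimate + z * std_error

def differs_from_average(residuals, z=1.96):
    """True only if the interval excludes 0 (the average teacher)."""
    _, lower, upper = value_added_interval(residuals, z)
    return lower > 0 or upper < 0

# Hypothetical residuals for one teacher's students in a single year.
one_year = [6.0, -3.0, 9.0, -1.0, 4.0]
estimate, lower, upper = value_added_interval(one_year)
print(f"estimate={estimate:.2f}, interval=({lower:.2f}, {upper:.2f})")
print("Distinguishable from average:", differs_from_average(one_year))
```

Here the point estimate is positive, but the interval still includes zero, so this teacher cannot be confidently separated from the average on one year of data.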

14

Value-Added: Responses to technical challenges

• Use multiple years of data to mitigate sorting bias and gain stability in estimates (Koedel & Betts, 2009; McCaffrey et al., 2009; Glazerman et al., 2010 )

• Use confidence intervals and other sources of information to improve reliability and validity of teacher effectiveness ratings (Glazerman et al., 2010)

• Have teachers and administrators verify rosters to ensure scores are calculated with students the teachers actually taught

• Consider the importance of subscores in teacher rankings

15

Teacher evaluation models

16

Measuring teachers’ contributions to student learning growth: A summary of current models

Model | Description

Student learning objectives | Teachers assess students at the beginning of the year and set objectives, then assess again at the end of the year; the principal or designee works with the teacher and determines success

Subject & grade alike team models (“Ask a Teacher”) | Teachers meet in grade-specific and/or subject-specific teams to consider and agree on appropriate measures that they will all use to determine their individual contributions to student learning growth

Pre- and post-tests model | Identify or create pre- and post-tests for every grade and subject

School-wide value-added | Teachers in tested subjects & grades receive their own value-added score; all other teachers get the school-wide average
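The school-wide value-added rule in the last row can be sketched as follows. This assumes, for illustration only, that the school-wide average is the unweighted mean of the tested teachers' scores; actual systems may compute it differently. The teacher labels and scores are hypothetical.

```python
from statistics import mean

def assign_scores(own_scores, all_teachers):
    """Tested teachers keep their own score; others get the school mean."""
    school_average = mean(own_scores.values())
    return {t: own_scores.get(t, school_average) for t in all_teachers}

# Hypothetical value-added scores for teachers in tested subjects/grades.
own_scores = {"math_grade5": 2.0, "reading_grade4": -1.0}
all_teachers = ["math_grade5", "reading_grade4", "art", "physical_ed"]
assigned = assign_scores(own_scores, all_teachers)
print(assigned)
```

Under this rule the art and physical education teachers receive a score based entirely on students and subjects they may not teach.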

17

Recommendation from NBPTS Task Force on teacher evaluation

“Recommendation 2: Employ measures of student learning explicitly aligned with the elements of curriculum for which the teachers are responsible. This recommendation emphasizes the importance of ensuring that teachers are evaluated for what they are teaching.”

(Linn et al., 2011)

18

Ask A Teacher Model: Using the measures in comparable ways

• Even if all teachers are using the same measures in a grade/subject, they may be using them in different ways

  - Giving the assessment at different times of the year
  - Allowing students more time to complete the assessment
  - Engaging in test prep or coaching students in completing assessments

• To ensure that differences in student scores are based on teacher performance, not on how/when the assessment was given, “standardize” assessment processes as much as possible

19

Washington DC IMPACT: Instructions for teachers in non-tested subjects/grades

“In the fall, you will meet with your administrator to decide which assessment(s) you will use to evaluate your students’ achievement. If you are using multiple assessments, you will decide how to weight them. Finally, you will also decide on your specific student learning targets for the year. Please note that your administrator must approve your choice of assessments, the weights you assign to them, and your achievement targets. Please also note that your administrator may choose to meet with groups of teachers from similar content areas rather than with each teacher individually.”
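Weighting multiple assessments and checking the result against a target could look like the sketch below. It is not taken from the IMPACT guidebook: the assessment names, weights, 0-100 scale, and target of 80 are all hypothetical, chosen only to show the arithmetic.

```python
def weighted_result(results, weights):
    """Combine per-assessment results using the approved weights."""
    if set(results) != set(weights):
        raise ValueError("every assessment needs exactly one weight")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(results[name] * weights[name] for name in results) / total

# Hypothetical class results (percent of students meeting the goal).
results = {"unit_tests": 80.0, "portfolio": 90.0}
# Hypothetical weights agreed with the administrator.
weights = {"unit_tests": 0.75, "portfolio": 0.25}

composite = weighted_result(results, weights)
target = 80.0  # hypothetical student learning target
print(f"composite={composite}, target met: {composite >= target}")
```

Dividing by the total weight means the weights do not have to sum to exactly 1.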


20

Assessments for student learning growth

21

Measuring teachers’ contributions to student learning growth (classroom)

22

What assessments are teachers and schools going to use?

• Existing measures
  - Curriculum-based assessments (come with packaged curriculum)
  - Classroom-based individual testing (DRA, DIBELS)
  - Formative assessments such as NWEA
  - Progress monitoring tools (for Response to Intervention)
  - National tests, certification tests
• Rigorous new measures (may be teacher created)
• The 4 Ps: Portfolios/products/performance/projects
• School-wide or team-based growth
• Pro-rated scores in co-teaching situations
• Student learning objectives
• Any measure that demonstrates students’ growth towards proficiency in appropriate standards

23

New Haven assessment examples

• Examples of Assessments/Measures
  - Basic literacy assessments, DRA
  - District benchmark assessments
  - District Connecticut Mastery Test
  - LAS Links (English language proficiency for ELLs)
  - Unit tests from NHPS approved textbooks
  - Off-the-shelf standardized assessments (aligned to standards)
  - Teacher-created assessments (aligned to standards)
  - Portfolios of student work (aligned to standards)
  - AP and International Baccalaureate exams

Assessing Musical Behaviors: The type of assessment must match the knowledge or skill

4 types of musical behaviors:
1. Responding
2. Creating
3. Performing
4. Listening

Types of assessment:
1. Rubrics
2. Playing tests
3. Written tests
4. Practice sheets
5. Teacher Observation
6. Portfolios
7. Peer and Self-Assessment

Slide used with permission of authors Carla Maltas, Ph.D. and Steve Williams, M.Ed. See reference list for details.

25

How to use evidence of student learning growth

• Teacher preparation for measuring student learning growth is limited or non-existent

• Most principals, support providers, instructional managers, and coaches are poorly prepared to make judgments about teachers’ contribution to student learning growth

• They need to know how to
  - Evaluate the appropriateness of various measures of student learning for use in teacher evaluation
  - Work closely with teachers to select appropriate student growth measures and ensure that they are using them correctly and consistently

26

Collect evidence in a standardized way (to the extent possible)

• Evidence of student learning growth
  - Locate or develop rubrics with explicit instructions and clear indicators of proficiency for each level of the rubric
  - Establish time for teachers to collectively examine student work and come to a consensus on performance at each level
    - Identify “anchor” papers or examples
  - Provide training for teachers to determine how and when assessments should be given, and how to record results in specific formats

27

New Haven “matrix”

Asterisks indicate a mismatch—teacher is very high on one area (practice or growth) and very low on the other area.
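The mismatch flag can be expressed as a simple rule. The sketch below is not New Haven's actual scoring logic: it assumes both practice and growth are rated on a 1-5 scale and treats a gap of three or more levels as "very high on one, very low on the other". Both the scale and the threshold are hypothetical.

```python
def is_mismatch(practice, growth, min_gap=3):
    """Flag a rating pair whose two areas sit far apart.

    Assumes both ratings use the same 1-5 scale; min_gap is a
    hypothetical threshold, not a published rule.
    """
    for rating in (practice, growth):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    return abs(practice - growth) >= min_gap

# Hypothetical (practice, growth) ratings for three teachers.
ratings = {"teacher_a": (5, 1), "teacher_b": (3, 4), "teacher_c": (1, 4)}
flagged = {name for name, pair in ratings.items() if is_mismatch(*pair)}
print(sorted(flagged))
```

A flagged pair signals that the two sources of evidence disagree and the rating deserves a closer look before any decision is made.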

28

Results inform professional growth opportunities

• Are evaluation results discussed with individual teachers?

• Do teachers collaborate with instructional managers to develop a plan for improvement and/or professional growth?
  - All teachers (even high-scoring ones) have areas where they can grow and learn

• Are effective teachers provided with opportunities to develop their leadership potential?

• Are struggling teachers provided with coaches and given opportunities to observe/be observed?

29

High-quality professional growth opportunities

• The ultimate goal of teacher evaluation should be to improve teaching & learning
  - Individual coaching/feedback on instruction
    - Trained coaches, not just “good teachers”
  - Observing “master teachers”
    - Provide opportunities to discuss specific practices
    - May be especially helpful at beginning of year when master teachers are creating a “learning environment”
  - Group PD and learning communities
    - Opportunity to grow together as a cohort

30

Moving forward

• Create (or revisit) timeline for final decisions and implementation
  - What do you still need to know to make appropriate recommendations?
• Who needs to be involved in decision-making?
  - Department of Ed? Districts? Teachers? Union?
  - State responsibilities vs. district responsibilities

• How/when will decisions be communicated to stakeholders?

• What resources will be required for training and implementation and where will they come from?

31

References

Herman, J. L., Heritage, M., & Goldschmidt, P. (2011). Developing and selecting measures of student growth for use in teacher evaluation. Los Angeles, CA: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
http://www.aacompcenter.org/cs/aacc/view/rs/26719

Linn, R., Bond, L., Darling-Hammond, L., Harris, D., Hess, F., & Shulman, L. (2011). Student learning, student achievement: How do teachers measure up? Arlington, VA: National Board for Professional Teaching Standards.
http://www.nbpts.org/index.cfm?t=downloader.cfm&id=1305

Maltas, C., & Williams, S. (2010, January 27). Meaningful assessment in the music classroom. Presented at Missouri Music Educators Association Conference, Jefferson City, MO.
http://dese.mo.gov/divimprove/curriculum/fa/AssessmentintheMusicClassroom.pptx

Race to the Top Application
http://www2.ed.gov/programs/racetothetop/resources.html

Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417-458.
http://www.econ.ucsb.edu/~jon/Econ230C/HanushekRivkin.pdf

Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. Brooklyn, NY: The New Teacher Project.
http://widgeteffect.org/downloads/TheWidgetEffect.pdf


32

Questions?

33

Laura Goe, Ph.D.

P: 609-734-1076

E-Mail: [email protected]

Website: www.tqsource.org
