The Many Challenges of Using Test Scores in Evaluation David E. W. Mott Tests for Higher Standards

Mar 26, 2015

Page 1: The Many Challenges of Using Test Scores in Evaluation David E. W. Mott Tests for Higher Standards.

The Many Challenges of Using Test Scores in Evaluation

David E. W. Mott

Tests for Higher Standards

Page 2

Abstract — The Many Challenges of Using Test Scores in Evaluation

Teacher and school evaluation using test scores is upon us. Probably none of us asked for it, and few of us want it, but it is here! It will not go away, at least not soon. What can division staffs do to encourage reasonable outcomes? The first thing is to start planning. Plan in conjunction with the affected groups. The stakeholders that must be involved are teachers, school staff, and central staff. It is only by planning your course of action with these primary players that viable approaches will emerge. At the same time, it would be valuable to call in some outside experts. This presentation suggests that one party that should be involved is a local assessment provider. Whatever else is part of your plan, your division will probably need their services. This presentation will briefly describe what Tests for Higher Standards is doing in this area, as a prelude to a multi-sided conversation about the current status of work and possible courses of action.

Page 3

Ways to Evaluate Test Scores

For Individuals / For Groups (averaged)

Pass/Fail: # Passing, % Passing

Scores: # Correct, % Correct, Scaled Score, Percentile (%ile) (Mean, Median, etc.)

Score Change: Amount of Change, % Change, Scale Score Change, Percentile Change

Possible Change: Amount of Possible Change, % of Possible Change

Any of the above can be corrected for any number of combinations of input variables.

Page 4

Corrections

This is the typical form:

Value ± Correction = Adjusted Value.

The Correction is an amount determined by a characteristic, or set of characteristics, of the student population. Typical characteristics include gender/ethnic mix, poverty indices, the mix of non-English speakers, and school characteristics.
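
As a worked illustration of the form above, the short Python sketch below applies a signed correction to a raw class average. The function name and all of the numbers are hypothetical, chosen only to show the arithmetic; they do not come from any actual adjustment scheme.

```python
def adjusted_value(value, correction):
    """Value ± Correction = Adjusted Value (the correction carries its own sign)."""
    return value + correction


# Hypothetical class average of 72.0 percent correct, with a hypothetical
# correction of +3.5 points for a high-poverty student population.
raw_average = 72.0
poverty_correction = 3.5
adjusted = adjusted_value(raw_average, poverty_correction)
print(f"Raw {raw_average}, adjusted {adjusted}")
```

A negative correction works the same way: it lowers the value before the comparison is made.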

Page 5

Adjusted Value Meaning

Adjusted Values are used to permit fair comparisons:

• Student to Student

• Class to Class

• Teacher to Teacher

• School to School

• District to District

Page 6

Regression Equations

The adjustments are commonly derived and applied through complex regression equations. They are usually quite specific to the populations for which they were originally derived: Virginia cannot use Tennessee’s adjustments. The equations also tend to be unstable over time.
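
To show what a regression-derived adjustment can look like in the simplest case, here is a minimal Python sketch with a single input variable. It is an assumption-laden illustration, not any state's actual model: the data are invented, the only predictor is a hypothetical poverty index, and the helper names `fit_line` and `adjust_scores` are made up for this example. Real models typically use many predictors at once.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def adjust_scores(x, y):
    """Subtract the part of each score that the fitted line attributes to x,
    measured relative to the average value of x."""
    _, slope = fit_line(x, y)
    mx = mean(x)
    return [yi - slope * (xi - mx) for xi, yi in zip(x, y)]


# Hypothetical data: poverty index (percent) and mean score for five schools.
poverty = [10.0, 20.0, 30.0, 40.0, 50.0]
scores = [82.0, 78.0, 77.0, 70.0, 68.0]

intercept, slope = fit_line(poverty, scores)
adjusted = adjust_scores(poverty, scores)
for p, s, a in zip(poverty, scores, adjusted):
    print(f"poverty {p:5.1f}  raw {s:5.1f}  adjusted {a:5.1f}")
```

Because the slope is estimated from one particular population, refitting on another state's data, or on a later year's data, would generally give a different correction, which is the specificity and instability the slide warns about.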

Page 7

Two Types of Value-added Models

The approach just described is one type of Value-added Model. It could be called:

Controlling for Named Input Characteristics.

Virginia uses a different approach, which the VDOE calls:

Student Growth Percentiles (SGP)

Page 8

Student Growth Percentiles (SGPs)

The VDOE’s Student Growth Percentile scheme can be seen as a score adjustment, using one or more previous test scores as the adjusting variable.

The presumption is that the student’s previous score summarizes all the other input variable adjustments.
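
The idea of using a previous score as the adjusting variable can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the VDOE's estimation procedure: it compares each student only with peers who had exactly the same prior score, uses a mid-rank convention for ties, and all scores and the function name `growth_percentile` are hypothetical.

```python
def growth_percentile(student, peers):
    """Percentile rank of a student's current score among peers who had the
    same prior score. `student` is (prior, current); `peers` is a list of
    such pairs and is assumed to include the student."""
    prior, current = student
    same_start = [c for p, c in peers if p == prior]
    below = sum(1 for c in same_start if c < current)
    equal = sum(1 for c in same_start if c == current)
    # Mid-rank convention: tied scores count as half below.
    return 100.0 * (below + 0.5 * equal) / len(same_start)


# Hypothetical (prior score, current score) pairs.
peers = [(400, 410), (400, 420), (400, 430), (400, 440),
         (450, 440), (450, 470)]

print(growth_percentile((400, 440), peers))  # strong growth from 400
print(growth_percentile((450, 440), peers))  # same current score, weak growth from 450
```

The two calls use the same current score of 440 but different starting points, so they receive different growth percentiles. That is the sense in which the previous score acts as the adjustment.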

Page 9

Ways to Evaluate Test Scores

For Individuals / For Groups (averaged)

Pass/Fail: # Passing, % Passing

Scores: # Correct, % Correct, Scaled Score, Percentile (%ile) (Mean, Median, etc.)

Score Change: Amount of Change, % Change, Scale Score Change, Percentile Change

Possible Change: Amount of Possible Change, % of Possible Change

Any of the above can be corrected for any number of combinations of input variables.

Here we are

and here

Page 10

Why Use SGPs?

• They are conceptually simpler than regression-based growth models.

• SGPs require no vertical scaling and no test equating. (%iles are scaleless.)

• They are relatively simple to compute.

• SGP methods are not sensitive to the distributions of scores.

• Scores from year-to-year are likely to be quite stable.

Page 11

Problems with the VDOE’s SGPs for School Divisions

Because the SOLs are administered in certain grades and subjects only, SGPs will not be available for all teachers.

The State’s method of estimating SGP is complex and can lead to scores for some students not being used. (These are students who had very high SOL Scaled Scores.)

SOLs are given only once per year.

Page 12

A Problem (?) Associated with ALL Quantile Methods

For every School, Teacher, Class, or Student, one half will always be below average.

This is the very essence of the technique. There is no Lake Wobegon* effect.

For this reason we need to look at the test score in conjunction with the SGP or any other quantile method.

* Lake Wobegon, where all the women are strong, all the men are good looking, and all the children are above average.

Page 13

One More Observation

What should we do with students near the top of the distribution of scores?

They can’t be expected to grow very much; they have no place to go. They can only decline, and some of them will.

We can always simply state that students who start at the top will not be averaged.

Or we can look at one other possibility: the Amount or % of Possible Change.

Page 14

The Amount or Percentage of Possible Change

What to do with students near the top of the distribution of scores?

They can’t be expected to grow very much; they have no place to go. They can only decline, and some of them will.

We can always simply state that students who start at the top will not be averaged.

Or we can look at one other possibility . . .
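
The slides name the Amount and Percentage of Possible Change without spelling out a formula. One plausible reading, offered here as an assumption rather than as the author's definition, is that the amount of possible change is the room left between a student's starting score and the top of the scale, and the percentage is the actual gain divided by that room. The Python sketch below uses a hypothetical scale maximum of 600 and invented scores.

```python
def amount_of_possible_change(pre, max_score):
    """Room left to grow: the distance from the starting score to the scale maximum."""
    return max_score - pre


def percent_of_possible_change(pre, post, max_score):
    """Actual gain as a percentage of the room that was available.
    Returns None when the student started at the ceiling."""
    room = amount_of_possible_change(pre, max_score)
    if room <= 0:
        return None
    return 100.0 * (post - pre) / room


MAX_SCORE = 600  # hypothetical top of the score scale

# A mid-range student gaining 50 points and a high scorer gaining 10 points.
mid_range = percent_of_possible_change(400, 450, MAX_SCORE)
high_scorer = percent_of_possible_change(560, 570, MAX_SCORE)
print(mid_range, high_scorer)
```

Under this reading, the high scorer's small gain counts the same as the mid-range student's larger one, because each used a quarter of the room available.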

Page 15

Ways to Evaluate Test Scores

For Individuals / For Groups (averaged)

Pass/Fail: # Passing, % Passing

Scores: # Correct, % Correct, Scaled Score, Percentile (%ile) (Mean, Median, etc.)

Score Change: Amount of Change, % Change, Scale Score Change, Percentile Change

Possible Change: Amount of Possible Change, % of Possible Change

Any of the above can be corrected for any number of combinations of input variables.

Now here

Page 16

Back out of the clouds . . .

Page 17

What Can a School Division Do?

TfHS / ROSworks suggests the following:

• Use the state’s method, but simplify it.

• We have proposed something we call Student Growth Deciles (SGD).

• Instead of percentile groups we use ten decile groups.

• We only track two previous score years.

• We use actual groups rather than statistical estimations of those groups.

Page 18

Some Details about our SGD Method

First, administer a pre-test. (We call it the Base Test.)

Score the pre-test and break up the students into ten score groups.

Then, collapse the two top groups into one and the two bottom groups into one, as test scores are least reliable at the two extremes.

Instruction occurs here.

Next, administer a post-test. (The Growth Test.)

Within each pre-test group, sort all the students into ten (or eight) new SGD groups on the basis of their post-test scores. These are their SGD scores. They represent their growth.

continued . . .

Page 19

Continued: Some Details about our SGD Method

For each analysis unit (School, Teacher, or Class), compute the average of the SGD scores of the students assigned.

This number (rounded perhaps to one decimal) is the score for the school, teacher, or class.

For demographic analysis units (gender, ethnic, ELL groups, AMOs, etc.), the averages can be computed in the same way.

All of the preceding steps should be calculated separately for each grade and subject area.
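
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the TfHS / ROSworks implementation: the function names, the rank-based way of cutting groups, the tie handling (ties broken by input order), and the data are all hypothetical, and the sketch covers a single grade and subject.

```python
from collections import defaultdict
from statistics import mean


def decile_labels(values, n_groups=10):
    """Label each value with a group 1..n_groups by rank (1 = lowest).
    Ties are broken by input order, an arbitrary choice for this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_groups // len(values) + 1
    return labels


def collapse_extremes(label):
    """Merge pre-test groups 1-2 and 9-10, leaving eight base groups."""
    return min(max(label, 2), 9)


def sgd_scores(pre, post):
    """One SGD score (1-10) per student: the post-test decile within the
    student's collapsed pre-test group."""
    base = [collapse_extremes(g) for g in decile_labels(pre)]
    members = defaultdict(list)
    for i, g in enumerate(base):
        members[g].append(i)
    sgd = [0] * len(pre)
    for idx in members.values():
        within = decile_labels([post[i] for i in idx])
        for i, s in zip(idx, within):
            sgd[i] = s
    return sgd


def unit_averages(sgd, units):
    """Average SGD score per analysis unit, rounded to one decimal."""
    by_unit = defaultdict(list)
    for s, u in zip(sgd, units):
        by_unit[u].append(s)
    return {u: round(mean(v), 1) for u, v in by_unit.items()}


# Hypothetical data: 100 students, two teachers, invented score patterns.
pre = [300 + 2 * i for i in range(100)]
post = [pre[i] + (i * 37) % 50 for i in range(100)]
teachers = ["A" if i % 2 == 0 else "B" for i in range(100)]

sgd = sgd_scores(pre, post)
averages = unit_averages(sgd, teachers)
print(averages)
```

The same `unit_averages` call works for schools, classes, or demographic groups by passing a different list of unit labels. With very small pre-test groups the within-group deciles become coarse, so a real system would need a rule for minimum group size.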

Page 20

TfHS / ROSworks SGD Coding Scheme

Page 21

What Might a Report Look Like?

For each analysis unit (School, Teacher, or Class), compute an average of the SGD scores of the students assigned.

This number (rounded perhaps to one decimal) is the score for the school, teacher, or class.

For demographic analysis units (gender, ethnic, ELL groups, AMOs, etc.), the averages can be computed in the same way.

All of the preceding steps should be calculated separately for each grade and subject area.

Page 22

TfHS / ROSworks SGD Example Report

Page 23

One Last Vital Part!

Involve the entire affected staff in your evaluation project. If you do, they will probably support it, work within it, and make it continually better. If those affected are not involved, a good outcome is in jeopardy.

When TfHS / ROSworks proposed the process reported here as a part of our company’s response to the VDOE’s recent RFP, we indicated that we wished to be included in both the planning and implementation processes with any contracted School Division, so as to be a part of the solution and not a part of the problem.

Page 24

Contact Information

David E. W. Mott, PhD
ROSworks LLC -- Reports Online System (ROS)
Tests for Higher Standards (TfHS)
5310 Markel Road, Suite 104
Richmond, VA 23230-3030 USA
804.282.3111
866.724.9722 toll free
804.282.4126 fax
www.tfhs.net
www.rosworks.com
www.ThoughtsOnAssessment.com -- Discussion Forum & Blog