MAKING SENSE OF THE NEW ACCOUNTABILITY INDEX AND STUDENT GROWTH PERCENTILES

Dr. Pete Bylsma, Director, Assessment/Student Information Services, Renton School District (Past President, Washington Educational Research Association - WERA)

Dr. Glenn Malone, Executive Director of Assessment, Accountability & Student Success, Puyallup School District (WERA President-Elect)

NCCE Conference, March 12, 2014
SESSION OBJECTIVES

• Describe the changes in federal accountability that prompted changes to the old Index and required student growth measures
• Describe the old and new Achievement Index that rates schools (assigns labels, identifies high and low performers, and serves as the basis for State Board of Education/OSPI recognition)
• Describe and critique the new student growth percentile (SGP) measure used in the new index (and potentially used in staff evaluations)
Why Change the Accountability System?

When AYP under NCLB started in 2002, the state discarded its existing accountability system
• AYP used 9 student groups, reading/math proficiency and participation, and the graduation rate
• 37 “cells” possible for schools, 111 for districts
• Gradually increasing goal; all groups must meet standard by 2014
• “Conjunctive” model – not making it in one area means not making AYP
• Escalating negative sanctions when not making AYP, but only for Title I schools
Problems with the AYP System

• The system is too complicated, invalid, and unrealistic
  – Different “rules” than those used by the state (larger minimum N, margin of error, excludes some students)
  – A negative label is applied when missing just one goal; ELLs must take the test despite not knowing English
  – Under the conjunctive model, all schools will eventually “fail”
• Resulted in unintended side effects – focus on “bubble kids,” narrowing of the curriculum; some states lowered standards so all could pass by 2014
The AYP waiver approved in 2012 means some rules no longer apply
• Do not need to have all students meet standard by 2014
• Do not need to set aside Title I funds
• School choice or supplemental services not required
• Still looks at reading & math percent meeting standard, the 95% participation rate, and graduation rates

Annual Measurable Objectives (AMOs) are the new measure
• Each subgroup in each school has its own annual targets
• Targets use a 2011 baseline and must cut the “proficiency gap” (the difference between the baseline and 100% meeting standard) in half by 2017
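The AMO target arithmetic can be sketched in code. This is an illustrative sketch, not OSPI's exact formula: `amo_target` is a hypothetical helper, and equal annual steps from the 2011 baseline to the 2017 goal are an assumption.

```python
def amo_target(baseline_pct, year):
    """Annual Measurable Objective for a subgroup (illustrative sketch).

    The 2017 goal cuts the "proficiency gap" (100 - baseline) in half;
    equal annual steps from the 2011 baseline are assumed here.
    """
    goal_2017 = baseline_pct + (100 - baseline_pct) / 2
    step = (goal_2017 - baseline_pct) / 6        # 6 annual steps: 2012..2017
    return baseline_pct + step * (year - 2011)

# A subgroup starting at 40% proficient must reach 70% by 2017:
print(amo_target(40, 2017))  # 70.0
print(amo_target(40, 2014))  # 55.0 (halfway there)
```

With a lower baseline the gap is larger, so the required annual gains are larger too – one reason different subgroups in the same school have different targets.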
Revised Federal Accountability “Sanctions”

Instead of “not making AYP,” the lowest performing schools are now identified for more support. There are 3 types of “Persistently Low Achieving” schools:
• Priority: Bottom 5% in the “all students” category
• Focus: Bottom 10% of all subgroups (Asian, Black, Hispanic, White, low income, ELL, special education)
• Emerging: Schools close to becoming Priority or Focus (next lowest 5%/10%)

No grade-band distinctions (elementary, middle, high, comprehensive, and alternative schools are all in the same rankings)
The system used to identify low performing schools is badly flawed
• Applies only to Title I schools; must have N > 30 for three years
• To identify Focus and Emerging schools, all subgroups are combined and ranked together
• In 2012, every Focus and Emerging school (186 total) was identified based on the ELL or SpEd subgroups (or both)*
• If a school has a large ELL and/or SpEd population and is Title I, the odds of identification are very high

* A few alternative schools were also identified for low graduation rates
Accountability Systems

Educational accountability systems require:
(1) measures of effectiveness
(2) goals to guide improvement efforts
(3) reports that provide useful information to policymakers, educators, and parents
(4) a set of consequences that recognize exemplary performance and support those needing more help

In response to the flawed AYP system, the State Board of Education created an Accountability Index in 2009 to provide a better measure of school effectiveness
Original Accountability Index*

Five Outcomes: Results from 4 assessments (reading, writing, math, science) aggregated across all grades and all students, plus the extended graduation rate for all students; minimum N = 10

Four Indicators:
1. Achievement by non-low income students (% meeting standard / extended graduation rate)
2. Achievement by low income students (eligible for FRL)
3. Achievement vs. Peers (makes “apples to apples” comparisons by controlling for percent ELL, low-income, special ed, gifted, and mobility)
4. Improvement (change in Learning Index from the previous year)

This creates a 5x4 matrix with 20 outcomes, each rated on a scale of 1-7

* Required by the Legislature in 2009 (ESHB 2261)
Original Accountability Index Matrix (multiple measures using available state data)

Outcomes (columns): Reading, Writing, Math, Science, Ext. G.R., Avg.
Indicators (rows): Non-low income achievement; Low income achievement; Achievement vs. peers; Improvement
Bottom-right cell: Average Index*

* Simple average of all rated cells (compensatory model)
Index Benchmarks and Ratings

Achievement of non-low income and low income students (% met standard) – applied to Reading, Writing, Math, Science, and the extended graduation rate:

  DIFFERENCE       RATING        DIFFERENCE IN RATE    RATING
  > .20               7          > 12                     7
  .151 to .20         6          6.1 to 12                6
  .051 to .15         5          3.1 to 6                 5
  -.05 to .05         4          -3 to 3                  4
  -.051 to -.15       3          -3.1 to -6               3
  -.151 to -.20       2          -6.1 to -12              2
  < -.20              1          < -12                    1

Improvement (Learning Index) – applied to Reading, Writing, Math, Science, and the extended graduation rate:

  CHANGE IN LEARNING INDEX   RATING      CHANGE IN RATE    RATING
  > .15                         7        > 6                  7
  .101 to .15                   6        4.1 to 6             6
  .051 to .10                   5        2.1 to 4             5
  -.05 to .05                   4        -2 to 2              4
  -.051 to -.10                 3        -2.1 to -4           3
  -.101 to -.15                 2        -4.1 to -6           2
  < -.15                        1        < -6                 1
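The threshold-to-rating lookup can be expressed as a small function. This is a sketch using the change-in-Learning-Index benchmarks above; `improvement_rating` is a hypothetical name, not code from the Index.

```python
def improvement_rating(change):
    """Map a change in Learning Index to a 1-7 rating using the
    benchmark thresholds from the slide above (illustrative sketch)."""
    thresholds = [        # (lower bound, rating), checked top-down
        (0.151, 7),       # > .15
        (0.101, 6),       # .101 to .15
        (0.051, 5),       # .051 to .10
        (-0.050, 4),      # -.05 to .05
        (-0.100, 3),      # -.051 to -.10
        (-0.150, 2),      # -.101 to -.15
    ]
    for lower, rating in thresholds:
        if change >= lower:
            return rating
    return 1              # < -.15

print(improvement_rating(0.20))   # 7
print(improvement_rating(0.0))    # 4
print(improvement_rating(-0.30))  # 1
```

The same pattern works for the rate-based benchmarks by swapping in the other threshold list.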
• No Improvement rating is given when a school is already performing at a very high level (sensitive to the “ceiling” effect)
• The Index excluded ELL results in the first 3 years of enrollment (ELLs must still take the tests; most exit in 3 years)

Achievement vs. Peers
• Recognizes that context affects outcomes
• Makes “apples to apples” comparisons (“statistical neighbors”) to control for 5 student variables (percent ELL, low-income, special education, mobile, gifted)
• Separate analysis for each type of school (e.g., elementary, middle, high, multiple grades)
• Non-regular schools do not receive a “peer” rating
Illustration of Achievement vs. Peers (1 of 5 variables)
* Percent meeting standard for content areas, extended graduation rate
** All students; content areas measured using the Learning Index
2012 Index Results
[Histogram of 2012 Accountability Index scores for N = 2,081 schools, binned in half-point increments from 1.00 to 7.00 (counts: 41, 38, 75, 119, 212, 268, 400, 377, 320, 162, 51, 18). Rating distribution: Struggling 7.4%, Fair 28.8%, Good 37.3%, Very Good 15.4%, Exemplary 11.1%.]
Washington Achievement Awards

OSPI/SBE used 2-year averages from the Accountability Index (language arts, math, science, graduation rate, Improvement)
• The Overall Excellence Award uses the Index score (top 5% by grade band)
• Special Recognition is given “on the edges” when the 2-year average is > 6.00

Outcomes (columns): Reading, Writing, Math, Science, G.R., Average
Indicators (rows): Non-low income achievement (Average column: Compare¹); Low income achievement; Achievement vs. peers; Improvement (Average column: 6.00)
Average row: 6.00 across the subject/graduation columns; Average column: Top 5%¹

¹ Overall Excellence is granted only if the average difference in the income gap and the race/ethnicity gap (using a separate matrix) is < 2.5
New Accountability Index

• The federal NCLB waiver required a change to the current Index – it must include subgroups and a growth measure
• Merges two different accountability systems (state and federal) into one system
• Has no relationship with AMOs!
• The new index is much more complicated and has different rules compared to the previous index

• Included in the waiver proposal to the U.S. Dept. of Education (waiver still not approved)
• Includes all subgroups (race/ethnicity, programs); N > 20 across the grade band (not the grade)
• New rating scales (1-10) and more “labels”
• No Peer rating
• Growth based on SGPs, not grade-band improvement in Levels
• Includes all ELL results (including results of students who exited the program)
• Basis for identifying low-performing schools (federal accountability); sanctions also apply to non-Title I schools
• Preliminary analyses show high correlation with school % FRL: -.53 (elementary), -.45 (middle), -.60 (high)
6 Labels, Norm-referenced
• Exemplary: Top 5% of schools using the overall index; must have 60% of students proficient in all tested subjects (given recognition)
• Very Good: Next 15% of schools
• Good: Next 30% of schools
• Fair: Next 30% of schools
• Underperforming: Next 5% of schools + 10% with large achievement gaps
• Priority: Lowest 5% of the index
Proposed Priority, Focus, Emerging
• Includes all schools, not just Title I
• Uses the Index to identify schools rather than stacked rankings
• The Priority system uses the overall index value
  – Bottom 5% are Priority (“Struggling”)
  – Next 5% from the bottom are Emerging Priority
• The Focus system uses the index value for each subgroup in each school
  – Bottom 10% are Focus
  – Next 10% from the bottom are Emerging Focus
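The Priority ranking logic can be sketched as a simple percentile cut on index values. This is a simplified illustration with a hypothetical `classify_priority` helper; it ignores the subgroup-level Focus calculation and any minimum-N rules.

```python
def classify_priority(index_by_school):
    """Rank schools by overall index value (low to high); bottom 5% are
    Priority, next 5% are Emerging Priority. Simplified sketch only."""
    ranked = sorted(index_by_school, key=index_by_school.get)
    n = len(ranked)
    cut5, cut10 = round(n * 0.05), round(n * 0.10)
    return {school: ("Priority" if i < cut5
                     else "Emerging Priority" if i < cut10
                     else "OK")
            for i, school in enumerate(ranked)}

# 20 hypothetical schools with index values 1.0 .. 20.0:
labels = classify_priority({f"S{i}": float(i) for i in range(1, 21)})
print(labels["S1"], labels["S2"], labels["S3"])
# Priority Emerging Priority OK
```

The Focus version described above would run the same kind of cut, but on the index value of each subgroup in each school rather than the overall value.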
Getting Off the Priority / Focus List*

• For 3 consecutive years in math and reading, a school must:
  – Meet or exceed AMOs for all subgroups
  – Have at least 95% participation for all subgroups
  – Not be in the bottom 5% (or 10% for Focus)
  – Decrease the % of students in all groups scoring Level 1 or 2 in reading and math; the improvement % must be comparable to the top 30% of Title I schools
• OSPI determines when sufficient progress has been made

* Unclear how Emerging schools get off the list
New Emphasis on Student Growth

• The federal waiver submitted in 2011 requires a student growth measure for the Index and for teacher and principal evaluations
• The Index has a growth measure, but “weak legislation” regarding the use of state test results in the growth measure puts the waiver in jeopardy
• OSPI amended the waiver in July 2013; it requires student growth to be a “substantial factor” in 3 of 8 teacher and principal criteria – brinkmanship occurring right now
• There are many ways to measure growth; the State Board only considered the Student Growth Percentile (SGP)
Achievement vs. Growth: What’s the Difference?

Achievement

Growth

Measuring Student Growth
• Growth, in its simplest form, is a comparison of the assessment results of a student or group of students between two points in time, where a positive difference implies growth.
Student Growth Percentiles
• Problem: The current state assessment system was not designed to measure student growth
  – Only selected grades and subjects are tested
  – The difficulty of passing the test varies from one year to the next (the high school reading and writing HSPE is easy to pass because the bar was lowered due to the graduation requirement)
• The state’s solution: Use a norm-referenced system that ranks the rate of student growth
Student Growth Percentiles
• SGPs compare the growth rates of students who were at the same scale score level the previous year (their “academic peers”)
  – Example: A student earning an SGP of 80 performed as well as or better than 80 percent of the students who had the same score the previous year
• SGPs do not compare the growth rate of all students to each other, or compare achievement across all students (the usual way percentiles are computed)
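The peer-group ranking can be illustrated with a toy version: group students by their prior-year scale score, then percentile-rank each student's current score within that group. This is a deliberate simplification – actual SGPs are estimated with a statistical model over prior-score histories, not an exact within-group rank – and `sgp` is a hypothetical function name.

```python
from collections import defaultdict

def sgp(prior, current):
    """Toy student growth percentile: for each student, the percent of
    students with the SAME prior-year score whose current score is at or
    below theirs ("performed as well as or better than"). Sketch only."""
    peers = defaultdict(list)
    for p, c in zip(prior, current):
        peers[p].append(c)
    return [100.0 * sum(x <= c for x in peers[p]) / len(peers[p])
            for p, c in zip(prior, current)]

# Four students who all scored 400 last year:
print(sgp([400, 400, 400, 400], [380, 395, 410, 425]))
# [25.0, 50.0, 75.0, 100.0]
```

Note that the ranking happens entirely within the peer group: the student who fell to 380 would get a different SGP if the other peers had fallen further.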
Student Growth Percentiles
• An SGP trajectory predicts where students will perform in the future, based on their previous growth rate and on students who were at the same scale score level the previous year
• OSPI groups students into three categories:
  – High Growth: Top 1/3 (67th to 99th percentile)
  – Typical Growth: Middle 1/3 (34th to 66th percentile)
  – Low Growth: Bottom 1/3 (1st to 33rd percentile)
• The median SGP for a class, grade, school, or district is the “score” (the school median SGP is used in the new Index)
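The three OSPI bands and the median "score" above can be sketched directly (function names are illustrative):

```python
from statistics import median

def growth_band(sgp):
    """OSPI's three growth categories, from the slide above."""
    if sgp >= 67:
        return "High"      # top 1/3: 67th-99th percentile
    if sgp >= 34:
        return "Typical"   # middle 1/3: 34th-66th percentile
    return "Low"           # bottom 1/3: 1st-33rd percentile

def school_growth_score(sgps):
    """A school's growth "score" in the new Index is its median SGP."""
    return median(sgps)

sgps = [12, 35, 50, 68, 90]
print([growth_band(s) for s in sgps])  # ['Low', 'Typical', 'Typical', 'High', 'High']
print(school_growth_score(sgps))       # 50
```

Using the median (rather than the mean) means a few extreme SGPs at either end do not move a school's score much.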
SGP Student Data

Student Growth Percentile (SGP) results are available to the public on the OSPI State Longitudinal Data System (SLDS) website¹
• From the OSPI homepage, select “K-12 Data & Reports” on the right side
• Select “Static Data Files”
• Select the “Assessment” menu item and scroll down to find the SGP files and reports

¹ http://data.k12.wa.us/PublicDWP/Web/WashingtonWeb/Home.aspx

Three types of SGP files are available to the public:
• A bubble chart with all schools, with the district’s schools identified (hover over a bubble for results)
• Individual school results by subgroup (compared to the district and state for three years)
• An Excel file with all results for all schools and the district (Renton’s file has > 5,000 rows and 20 columns)
Problems with SGP
1. Results can be misleading
   The percentile rank is not based on all students, so the 50th percentile is not the middle of the entire distribution – only of those who had the same scale score the previous year
2. SGPs do not provide a measure of adequate (enough) growth or of a year’s worth of growth
   A student can be at the 50th percentile and not make a year’s worth of growth (or enough growth to meet expectations upon graduation); another student can be at the 50th percentile and make more than a year’s worth of growth
Student Report: No growth is “typical”
Student Report: Decline is “high growth”
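The screenshots above show real student reports with this pattern; a toy calculation makes the mechanism concrete. The numbers are hypothetical: because an SGP ranks a student only against academic peers, a student with zero growth ranks as "high growth" whenever most peers lost ground.

```python
# Hypothetical score changes for five "academic peers" -- students who
# all had the same scale score the previous year:
peer_changes = [-30, -25, -20, -10, 0]
student_change = 0   # this student's score did not move at all

# Percent of peers the student performed as well as or better than:
pct = 100 * sum(c <= student_change for c in peer_changes) / len(peer_changes)
print(pct)  # 100.0 -- "high growth" despite zero growth
```

The same arithmetic explains the "decline is high growth" report: a small decline still outranks peers who declined more.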
Problems with SGP
3. Results may not reflect an accurate measure of student growth or educator effectiveness
   • SGPs are “highly unstable” and “problematic” for students with very high and very low scores because there are relatively few students with those scores to obtain stable rankings¹
   • No standard errors are reported
   • Does not control for differences in the student population
4. Results are not available in a timely manner
5. SGPs are new and harder to understand than current metrics

¹ Castellano, K. & Ho, A. (2013). A Practitioner’s Guide to Growth Models. Washington, DC: Council of Chief State School Officers.
Alternative Measure of Student Growth
• Criterion-referenced approach
• Students are compared to their own growth, not the growth rate of others
• Encourages cooperation because a score doesn’t depend on how other students perform
• Can be computed quickly and easily – doesn’t require a minimum number of students
• Uses familiar data and concepts, making it easy to understand
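The criterion-referenced alternative is just each student's own change in scale score. A minimal sketch (hypothetical function name; treating any positive change as at least a year's gain follows the district convention described later in these slides):

```python
def growth_report(prior, current):
    """Criterion-referenced growth: compare each student to their own
    prior score. Returns (per-student changes, percent with a gain).
    A positive change is counted as at least a year's gain (sketch)."""
    changes = [c - p for p, c in zip(prior, current)]
    pct_gaining = 100.0 * sum(ch > 0 for ch in changes) / len(changes)
    return changes, pct_gaining

prior   = [380, 400, 415, 430]   # hypothetical 2012 scale scores
current = [395, 398, 440, 455]   # hypothetical 2013 scale scores
changes, pct = growth_report(prior, current)
print(changes)  # [15, -2, 25, 25]
print(pct)      # 75.0
```

Unlike an SGP, each student's result here does not change if other students' scores change, and it can be computed for a class of any size.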
Measuring Achievement and Growth
[Quadrant diagram: 2013 Grade 5 Math scale score (y-axis, with Level cut scores: above 439 = Level 4, exceeds standard; 400-439 = Level 3, meets standard; 375-399 = Level 2, below standard; below 375 = Level 1, far below standard) vs. change in scale score from Grade 4 in 2012 (x-axis, -100 to +100). Quadrants: Leading (meets standard, positive change), Slipping (meets standard, negative change), Gaining (below standard, positive change), Lagging (below standard, negative change).]

[Scatter plot: 2013 Achievement and Growth from 2012 (Math, Grade 4 and change from Grade 3). Each dot represents a student enrolled in the district in both 2012 and 2013 (scores below 300 were marked as 300; scores above 500 were marked as 500). Average change in scale score: +6.5 (413.1 to 419.6), N = 913, R² = .58; 56.3% of the students made at least a one-year gain (change in scale score > 0). Quadrant shares: Leading 50.4% (N = 460), Slipping 15.6% (N = 142), Gaining 28.1% (N = 257), Lagging 5.9% (N = 54).]
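The four quadrants in the figures above combine proficiency status with the sign of the score change. A sketch, using the Level 3 cut score of 400 shown in the figure (function name hypothetical; the exact treatment of a zero change is an assumption):

```python
def quadrant(score, change, cut=400):
    """Classify a student by current achievement vs. the proficiency cut
    and by growth (change in scale score from the prior year):
      Leading  = meets standard, positive change
      Slipping = meets standard, zero or negative change
      Gaining  = below standard, positive change
      Lagging  = below standard, zero or negative change
    Cut score 400 (Level 3) is taken from the figure above."""
    if score >= cut:
        return "Leading" if change > 0 else "Slipping"
    return "Gaining" if change > 0 else "Lagging"

print(quadrant(420, 12))   # Leading
print(quadrant(420, -5))   # Slipping
print(quadrant(385, 8))    # Gaining
print(quadrant(385, -8))   # Lagging
```

Counting students per quadrant reproduces summary shares like those reported for the Grade 4 math cohort.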
[Side-by-side scatter plots: Change in Math Scale Scores, 2011 to 2012, for Non-Low Income and Low Income (FRL) students; 43% made a 1+ year gain in one panel and 60% in the other.]
Limitations of the Alternative Measure
• Proficiency cut scores vary slightly from grade to grade – it’s harder to meet standard in some grades than in others (like having an easy teacher one year and a hard teacher the next)
• There is no “vertical scale” to measure absolute growth – Smarter Balanced assessments will have a vertical scale and cut scores that align with college/career readiness
For more details, see WERA Educational Journal, Winter 2014 article, “Using SGPs to Measure Student Growth: Context, Characteristics, and Cautions” www.wera-web.org