Jamal Abedi
CRESST/University of California, Los Angeles
Paper presented at 34th Annual Conference on Large-Scale Assessment
Boston, June 20-23, 2004
Reporting AYP Measures for Accommodated ELL Students
NCLB Accountability System
States should implement a set of high-quality, yearly student academic assessments in reading/language arts, mathematics, and later in science:
(i) applies the same high standards of academic achievement to all public schools
(ii) is statistically valid and reliable
(iii) results in substantial academic improvement for all students
What is AYP
Adequate Yearly Progress (AYP): the percentage of students scoring at the “proficient” level or higher on Annual Measurable Objectives (AMOs)
Each state establishes a timeline for all students to reach “proficient” or higher, no more than 12 years after the 2001-2002 school year
The first increase occurs within the first 2 years
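The timeline above can be illustrated with a minimal sketch. It assumes equal yearly increments from a baseline to 100% proficient, which is only one possible trajectory; states set their own schedules. The function name `amo_targets`, the 40% baseline, and the labeling of school years by their spring year are all hypothetical choices made for illustration.

```python
def amo_targets(baseline_pct, start_year=2002, end_year=2014):
    """Return {year: target percent proficient}, rising linearly to 100.

    Assumption: equal yearly steps. Real state trajectories may be stepped.
    Years are labeled by the spring of the school year (2002 = 2001-2002).
    """
    span = end_year - start_year                  # 12 years in the NCLB timeline
    step = (100.0 - baseline_pct) / span          # increase per year
    return {start_year + i: round(baseline_pct + step * i, 1)
            for i in range(span + 1)}


# Hypothetical baseline of 40% proficient in 2001-2002
targets = amo_targets(40.0)
print(targets[2002], targets[2008], targets[2014])
```

A lower baseline, as is typical for the LEP subgroup, implies a steeper required yearly increase to reach the same 100% endpoint.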
How Should AYP be Reported?
The AYP is reported for schools, school districts, and the state for all students as well as for the following subgroups of students:
(1) Economically disadvantaged students;
(2) students from major racial and ethnic groups;
(3) students with disabilities;
(4) students with limited English proficiency.
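As a rough sketch of the reporting requirement above, the following computes percent proficient for all students and for each subgroup. The record layout, the subgroup labels, and the four student records are hypothetical and are not drawn from any state's data system.

```python
from collections import defaultdict


def percent_proficient_by_subgroup(students):
    """Return {group: percent proficient} for 'all' and every subgroup.

    Each student is a dict with a 'subgroups' list and a 'proficient' flag
    (a hypothetical record layout used only for illustration).
    """
    counts = defaultdict(lambda: [0, 0])          # group -> [proficient, total]
    for s in students:
        for group in ["all"] + s["subgroups"]:
            counts[group][1] += 1
            if s["proficient"]:
                counts[group][0] += 1
    return {g: 100.0 * p / n for g, (p, n) in counts.items()}


# Hypothetical records: a student may belong to several subgroups at once
students = [
    {"subgroups": ["LEP"], "proficient": False},
    {"subgroups": ["LEP", "econ_disadvantaged"], "proficient": True},
    {"subgroups": [], "proficient": True},
    {"subgroups": [], "proficient": True},
]
report = percent_proficient_by_subgroup(students)
print(report)
```

Because one student can count toward several subgroups, a single low score can affect more than one reported percentage.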
Technical Issues in AYP Reporting
Defining academic achievement
Alignment of content standards with test items
Setting achievement levels
Modified Angoff procedure: a test-centered approach that does not fit well with open-ended questions
Item-mapping approach within IRT: works better with open-ended questions but is still subjective (other approaches include contrasting groups, bookmarking, etc.)
Problems with setting the baseline or starting value
NRT versus CRT: most states use NRT tests, so the baseline is set in most states by different NRT tests, while AYP is reported later based on CRT tests
Problems in AYP Reporting for LEP Students: A Case Example
1. Problems in classification/reclassification of LEP students (LEP subgroup is a moving target)
2. Measurement quality for LEP
3. Low baseline
4. Instability of LEP subgroup
5. Sparse LEP population
6. LEP Cutoff points (Conjunctive versus Compensatory model)
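The difference between the two cutoff models in item 6 can be sketched as follows. In a conjunctive model a student must meet the cutoff in every domain; in a compensatory model a high score in one domain can offset a low score in another. The domain names, scores, cutoff values, and the use of an unweighted mean as the composite are assumptions for illustration only.

```python
def conjunctive_pass(scores, cutoffs):
    """Pass only if every domain score meets its own cutoff."""
    return all(scores[d] >= cutoffs[d] for d in cutoffs)


def compensatory_pass(scores, composite_cutoff):
    """Pass if the unweighted mean across domains meets one cutoff.

    Assumption: equal weights. An operational composite may be weighted.
    """
    composite = sum(scores.values()) / len(scores)
    return composite >= composite_cutoff


# Hypothetical student: strong oral skills, weaker literacy skills
scores = {"reading": 45, "writing": 50, "listening": 80, "speaking": 85}
cutoffs = {d: 60 for d in scores}

print(conjunctive_pass(scores, cutoffs))   # fails on reading and writing
print(compensatory_pass(scores, 60))       # mean of 65 clears the cutoff
```

The same student is classified differently under the two models, which is one reason the LEP subgroup's membership depends on the cutoff rule chosen.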
Corrective Action
Year 1
Revise school plan
Use 10% funds for staff development
Provide school choice with paid transportation
District provides technical assistance (TA)
(District J Special Administrators Academy Session, August 13, 2003)
Year 2
Continue
• Staff development
• Choice
• District TA
Add
Supplemental services/tutoring
Year 3
Continue
• District TA
• Choice
• Supplemental Services
Add
• District corrective action
Year 4
Continue
District TA
Choice
Supplemental Services
Add
Development of plan for alternative governance
Year 5
Implement alternative governance plan
Reopen as charter
Replace staff
Contract with external entity
Takeover by state
Measurement Quality for LEP
Validity Issues
Language is a source of construct irrelevant variance
Reliability issues
Language complexity of test items is a source of measurement error
Site 2 Stanford 9 Sub-scale Reliabilities (1998), Grade 9 Alphas
(Hi SES, Low SES, English Only, FEP, and RFEP are non-LEP student groups)

Sub-scale (Items)     Hi SES    Low SES   English Only  FEP       RFEP      LEP
Reading (N)           205,092   35,855    181,202       37,876    21,869    52,720
  Vocabulary (30)     .828      .781      .835          .814      .759      .666
  Reading Comp. (54)  .912      .892      .916          .903      .877      .833
  Average reliability .870      .837      .876          .859      .818      .750
Math (N)              207,155   36,588    183,262       38,329    22,152    54,815
  Total (48)          .899      .853      .898          .898      .876      .802
Language (N)          204,571   35,886    180,743       37,862    21,852    52,863
  Mechanics (24)      .801      .759      .803          .802      .755      .686
  Expression (24)     .818      .779      .823          .804      .757      .680
  Average reliability .810      .769      .813          .803      .756      .683
Science (N)           163,960   28,377    144,821       29,946    17,570    40,255
  Total (40)          .800      .723      .805          .778      .716      .597
Social Science (N)    204,965   36,132    181,078       38,052    21,967    53,925
  Total (40)          .803      .702      .805          .784      .722      .530
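For readers unfamiliar with the alpha coefficients reported here, the following is a minimal sketch of how Cronbach's alpha is computed from item-level scores: alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The tiny response matrix is hypothetical and is unrelated to the Stanford 9 data.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-student item-score lists.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    Population variance is used throughout; the ratio is the same with
    sample variance as long as one choice is applied consistently.
    """
    k = len(item_scores[0])                                  # number of items
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong responses: 4 students x 3 items
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

When item responses are driven partly by language complexity rather than by the content being measured, items correlate less consistently with one another and alpha drops, which is the pattern the LEP column shows.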
To have a more valid AYP assessment for ELLs, some forms of accommodations must be provided
NCLB requires that LEP students receive assessment accommodations when necessary
Accommodation Issues for ELLs
• Among the most important issues in the inclusion and assessment of English Language Learners (ELLs) are issues concerning accommodations for these students.
• By definition, accommodations are used for ELL students to reduce the performance gap between ELL and non-ELL students without jeopardizing the validity of assessments.
• There are many forms of accommodations used for both ELLs and students with disabilities (SDs) by different states (Abedi, Kim-Boscardin, and Lardon, 2000; Rivera, Stansfield, Scialdone, & Sharkey, 2000; Thurlow & Bolt, 2001).
Rivera (2003) presents a list of 73 commonly used accommodations for ELLs. Some examples include:
• Test administered in ESL/Bilingual classroom.
• Directions explained/clarified in native language.
• Key words and phrases in text highlighted.
We categorized these accommodations based on the level of appropriateness for ELL students
Of the 73 accommodations listed:
47 or 64% are not related
7 or 10% are remotely related
8 or 11% are moderately related
11 or 15% are highly related
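The breakdown above can be checked with a few lines of arithmetic. This sketch only recomputes the shares from the counts given on the slide; the variable names are illustrative.

```python
# Counts of accommodations by relevance to ELL students, as listed above
counts = {
    "not related": 47,
    "remotely related": 7,
    "moderately related": 8,
    "highly related": 11,
}

total = sum(counts.values())                       # should equal the 73 listed
shares = {k: round(100 * v / total) for k, v in counts.items()}

for category, pct in shares.items():
    print(f"{category}: {counts[category]} of {total} ({pct}%)")
```

Only the last two categories, 19 of the 73 accommodations, were rated as moderately or highly related to ELL students' needs.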
The most important issue is the concern about the validity of accommodation strategies:
• Research findings suggest that providing accommodations may increase ELL students’ performance, while also benefiting non-ELL students.
• There is not enough research support for many of the accommodations currently being used in national and state assessments.
• The only way to make judgments about the efficiency and validity of these accommodations is to use them in experimentally controlled situations with both ELL and non-ELL students.
• The dictionary as a form of accommodation suffers from another major limitation, the feasibility issue. It is very difficult logistically to provide this form of accommodation to students.
Some forms of accommodation strategies, such as the use of a glossary with extra time, raised the performance of both ELL and non-ELL students (Abedi, Hofstetter, Lord, and Baker, 1998, 2000)
• Recipients of English and bilingual dictionaries may be advantaged over those without access to dictionaries. This may jeopardize the validity of assessment.
• Linguistic modification of test items is among these accommodations (Abedi, Lord, and Hofstetter, 1998; Abedi, Hofstetter, Lord, and Baker, 1998, 2000).
There are, however, some accommodations that help ELL students with their English language limitations without compromising the validity of assessment.
Strategies that are expensive, impractical, or logistically complicated are unlikely to be widely accepted.
Validity: The goal of accommodations is to level the playing field for ELL students, not to alter the construct under measurement.
Consequently, if an accommodation affects the performance of non-ELL students, the validity of the accommodation could be questioned.
Feasibility: For an accommodation strategy to be useful, its implementation must be possible in large-scale assessments.
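The validity criterion above can be sketched as a comparison of gains in a design where both ELL and non-ELL students are tested with and without the accommodation. This is only a descriptive illustration with hypothetical scores; the function name and data layout are assumptions, and an actual study would test the differences for statistical significance rather than compare raw means.

```python
from statistics import mean


def accommodation_gains(scores):
    """Return (ELL gain, non-ELL gain) in mean score under accommodation.

    scores maps (group, condition) -> list of test scores, where group is
    'ELL' or 'non-ELL' and condition is 'standard' or 'accommodated'.
    """
    ell_gain = (mean(scores[("ELL", "accommodated")])
                - mean(scores[("ELL", "standard")]))
    non_ell_gain = (mean(scores[("non-ELL", "accommodated")])
                    - mean(scores[("non-ELL", "standard")]))
    return ell_gain, non_ell_gain


# Hypothetical scores, invented for illustration only
scores = {
    ("ELL", "standard"): [10, 12, 14],
    ("ELL", "accommodated"): [14, 15, 16],
    ("non-ELL", "standard"): [18, 20, 22],
    ("non-ELL", "accommodated"): [18, 20, 22],
}
ell_gain, non_ell_gain = accommodation_gains(scores)
print(ell_gain, non_ell_gain)
```

A positive ELL gain with a non-ELL gain near zero is the pattern consistent with an effective and valid accommodation; a sizable non-ELL gain would suggest the accommodation alters the construct being measured.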
Conclusion
There is not enough research support for many of the accommodations that are currently used in national and state assessments.
The only way to make judgments about the efficiency and validity of these accommodations is to use them in experimentally controlled situations with both ELL and non-ELL students and examine their validity and effectiveness under solid experimental design.
The results of CRESST studies along with other studies nationwide have provided support for some of the accommodations used for ELL students.
Conclusion cont.
Providing a customized dictionary is a viable alternative to providing traditional dictionaries.
The linguistic modification of test items that reduce unnecessary linguistic burdens on students is among the accommodations that help ELL students without affecting the validity of assessments.
Computer testing with added extra time and glossary was shown to be a very effective, yet valid accommodation (Abedi, Courtney, Leon and Goldberg, 2003)
Examples of research-supported accommodations:
Conclusion cont.
Without information on important aspects of accommodations such as validity, it would be extremely difficult to make an informed decision on what accommodation to use and how to report accommodated and non-accommodated results.
It is thus imperative to examine different forms of accommodations before using them in state and/or national assessments.
Now for a visual art representation of invalid accommodations…
Claudia Davis, Shiv Desai, Zitlali Morales, Nina Neulight
Operational Definition of NCLB
Questions for NCLB legislators
When they say 2014, do they mean:
By the year 2014 all students should reach 100% proficiency
OR
After 2014 years from now everyone should reach 100% proficiency.