Jamal Abedi, CRESST/University of California, Los Angeles. Paper presented at the 34th Annual Conference on Large-Scale Assessment, Boston, June 20-23, 2004.

Reporting AYP Measures for Accommodated ELL Students

NCLB Accountability System

States should implement a set of high-quality, yearly student academic assessments in reading/language arts, mathematics, and later science, as part of an accountability system that:

(i) applies the same high standards of academic achievement to all public schools

(ii) is statistically valid and reliable

(iii) results in substantial academic improvement for all students

What is AYP?

The percentage of students scoring at the “proficient” level or higher on Annual Measurable Objectives (AMOs) is referred to as Adequate Yearly Progress (AYP)

Each state establishes a timeline for all students to reach “proficient” or higher, no more than 12 years after the start of the 2001-2002 school year

The first increase occurs within the first 2 years
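The timeline above can be illustrated with a small sketch. It assumes equal yearly increments from a baseline to 100% proficient over 12 years; the slides state only the 12-year limit and the timing of the first increase, so the linear shape, the function name `amo_targets`, and the 40% baseline are hypothetical.

```python
def amo_targets(baseline_pct, start_year=2002, end_year=2014):
    """Return {year: target percent proficient}, rising in equal steps to 100.

    Assumption: equal yearly increments. Actual state timelines may step
    up less often, as long as the first increase comes within 2 years.
    """
    span = end_year - start_year
    step = (100.0 - baseline_pct) / span
    return {start_year + k: round(baseline_pct + k * step, 1) for k in range(span + 1)}

# Hypothetical baseline of 40% proficient in the starting year.
targets = amo_targets(40.0)
for year in (2002, 2004, 2014):
    print(year, targets[year])
```

A lower baseline produces a steeper required slope, which is one reason the low baseline of the LEP subgroup matters for AYP reporting.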

How Should AYP be Reported?

The AYP is reported for schools, school districts, and the state for all students as well as for the following subgroups of students:

(1) Economically disadvantaged students;

(2) students from major racial and ethnic groups;

(3) students with disabilities;

(4) students with limited English proficiency.
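As a rough illustration of subgroup reporting, the sketch below computes the percentage of proficient students for all students and for each subgroup. The records, subgroup labels, and the function name `percent_proficient_by_group` are hypothetical; note that one student can count in several subgroups.

```python
from collections import defaultdict

# Hypothetical records: (subgroups the student belongs to, proficient or higher?)
students = [
    ({"economically_disadvantaged", "lep"}, False),
    ({"lep"}, True),
    ({"students_with_disabilities"}, False),
    (set(), True),
    ({"economically_disadvantaged"}, True),
]

def percent_proficient_by_group(records):
    """Percent proficient for every subgroup, plus the 'all_students' total."""
    counts = defaultdict(lambda: [0, 0])  # group -> [n_proficient, n_total]
    for groups, proficient in records:
        for g in groups | {"all_students"}:
            counts[g][0] += int(proficient)
            counts[g][1] += 1
    return {g: 100.0 * p / n for g, (p, n) in counts.items()}

report = percent_proficient_by_group(students)
for group in sorted(report):
    print(f"{group}: {report[group]:.1f}%")
```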

Technical Issues in AYP Reporting

Defining academic achievement

Alignment of content standards with test items

Setting achievement levels

Modified Angoff procedure: a test-centered approach that does not fit well with open-ended questions

Item-mapping approach within IRT: works better with open-ended questions, but is still subjective (other approaches include contrasting groups, bookmarking, etc.)

Problems with setting the baseline or starting value

NRT versus CRT: most states use NRT tests; the baseline is set in most states by different NRT tests, and AYP is reported later based on CRT tests

Problems in AYP Reporting for LEP Students: A Case Example

1. Problems in classification/reclassification of LEP students (LEP subgroup is a moving target)

2. Measurement quality for LEP

3. Low baseline

4. Instability of LEP subgroup

5. Sparse LEP population

6. LEP Cutoff points (Conjunctive versus Compensatory model)
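The distinction in item 6 can be sketched as follows, using the usual meaning of the two terms: a conjunctive rule requires the cutoff to be met on every measure, while a compensatory rule lets a high score on one measure offset a low score on another. The measure names, scores, and cutoffs are hypothetical.

```python
def conjunctive(scores, cutoffs):
    """Pass only if every measure meets its own cutoff."""
    return all(scores[k] >= cutoffs[k] for k in cutoffs)

def compensatory(scores, cutoffs):
    """Pass if the combined score meets the combined cutoff."""
    return sum(scores[k] for k in cutoffs) >= sum(cutoffs.values())

# Hypothetical student: weak in reading, strong in writing.
cutoffs = {"reading": 50, "writing": 50}
scores = {"reading": 45, "writing": 70}

print(conjunctive(scores, cutoffs))   # False: reading is below its cutoff
print(compensatory(scores, cutoffs))  # True: 115 >= 100
```

The same student is classified differently under the two rules, so the choice of model changes who is counted in the LEP subgroup.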

Corrective Action

Year 1

Revise school plan

Use 10% funds for staff development

Provide school choice with paid transportation

District provides technical assistance (TA)

(District J Special Administrators Academy Session, August 13, 2003)

Year 2

Continue

• Staff development

• Choice

• District TA

Add

Supplemental services/tutoring

Year 3

Continue

• District TA

• Choice

• Supplemental Services

Add

• District corrective action

Year 4

Continue

District TA

Choice

Supplemental Services

Add

Development of plan for alternative governance

Year 5

Implement alternative governance plan

Reopen as charter

Replace staff

Contract with external entity

Takeover by state

Measurement Quality for LEP

Validity Issues

Language is a source of construct irrelevant variance

Reliability issues

Language complexity of test items is a source of measurement error

Site 2 Stanford 9 Sub-scale Reliabilities (1998), Grade 9 Alphas

                      Non-LEP Students
Sub-scale (Items)     Hi SES     Low SES    English Only  FEP        RFEP       LEP
Reading               N=205,092  N=35,855   N=181,202     N=37,876   N=21,869   N=52,720
-Vocabulary (30)      .828       .781       .835          .814       .759       .666
-Reading Comp. (54)   .912       .892       .916          .903       .877       .833
Average reliability   .870       .837       .876          .859       .818       .750
Math                  N=207,155  N=36,588   N=183,262     N=38,329   N=22,152   N=54,815
-Total (48)           .899       .853       .898          .898       .876       .802
Language              N=204,571  N=35,886   N=180,743     N=37,862   N=21,852   N=52,863
-Mechanics (24)       .801       .759       .803          .802       .755       .686
-Expression (24)      .818       .779       .823          .804       .757       .680
Average reliability   .810       .769       .813          .803       .756       .683
Science               N=163,960  N=28,377   N=144,821     N=29,946   N=17,570   N=40,255
-Total (40)           .800       .723       .805          .778       .716       .597
Social Science        N=204,965  N=36,132   N=181,078     N=38,052   N=21,967   N=53,925
-Total (40)           .803       .702       .805          .784       .722       .530
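The alphas in the table above are internal-consistency (Cronbach's alpha) coefficients. A minimal sketch of how such a coefficient is computed from an item-score matrix follows; the tiny response matrix is hypothetical and is not Stanford 9 data.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of per-item scores (one row per student).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 4 students x 3 dichotomously scored items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

When item responses carry more noise, the item variances make up a larger share of the total-score variance and alpha drops, which is the pattern the LEP column shows.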


To have a more valid AYP assessment for ELLs, some forms of accommodation must be provided

NCLB requires that LEP students receive assessment accommodations when necessary

Accommodation Issues for ELLs

• Among the most important issues in the inclusion and assessment of English Language Learners (ELLs) are issues concerning accommodations for these students.

• By definition, accommodations are used for ELL students to reduce the performance gap between ELL and non-ELL students without jeopardizing the validity of assessments.

• There are many forms of accommodations used for both ELLs and students with disabilities (SDs) by different states (Abedi, Kim-Boscardin, and Lardon, 2000; Rivera, Stansfield, Scialdone, & Sharkey, 2000; Thurlow & Bolt, 2001).

Rivera (2003) presents a list of 73 commonly used accommodations for ELLs. Some examples include:

• Test administered in ESL/Bilingual classroom.

• Directions explained/clarified in native language.

• Key words and phrases in text highlighted.

We categorized these accommodations based on the level of appropriateness for ELL students

Of the 73 accommodations listed:

47 or 64% are not related

7 or 10% are remotely related

8 or 11% are moderately related

11 or 15% are highly related
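The percentages above follow directly from the counts; a short check, using only the numbers on the slide:

```python
# Counts of the 73 accommodations by relevance to ELL students (from the slide).
counts = {
    "not related": 47,
    "remotely related": 7,
    "moderately related": 8,
    "highly related": 11,
}
total = sum(counts.values())
shares = {label: round(100 * n / total) for label, n in counts.items()}

print(total)
for label, pct in shares.items():
    print(f"{counts[label]} or {pct}% are {label}")
```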

The most important issue is the concern about the validity of accommodation strategies:

• Research findings suggest that providing accommodations may increase ELL students’ performance, while also benefiting non-ELL students.

• There is not enough research support for many of the accommodations currently being used in national and state assessments.

• The only way to make judgments about the efficiency and validity of these accommodations is to use them in experimentally controlled situations with both ELL and non-ELL students.

• The dictionary as a form of accommodation suffers from another major limitation, the feasibility issue. It is very difficult logistically to provide this form of accommodation to students.

Some forms of accommodation strategies, such as the use of a glossary with extra time, raised the performance of both ELL and non-ELL students (Abedi, Hofstetter, Lord, and Baker, 1998, 2000)

• Recipients of English and bilingual dictionaries may be advantaged over those without access to dictionaries. This may jeopardize the validity of assessment.

There are, however, some accommodations that help ELL students with their English language limitations without compromising the validity of assessment.

• Linguistic modification of test items is among these accommodations (Abedi, Lord, and Hofstetter, 1998; Abedi, Hofstetter, Lord, and Baker, 1998, 2000).

Strategies that are expensive, impractical, or logistically complicated are unlikely to be widely accepted.

Validity: The goal of accommodations is to level the playing field for ELL students, not to alter the construct under measurement.

Consequently, if an accommodation affects the performance of non-ELL students, the validity of the accommodation could be questioned.
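One way to read this validity criterion is as a comparison of gains in a design that gives the accommodation to both groups. The sketch below is illustrative only: the group means, the function names, and the `tolerance` threshold are hypothetical, and a real study would use a significance test on the group-by-condition interaction rather than a fixed threshold.

```python
def accommodation_gains(means):
    """Return (ELL gain, non-ELL gain) of accommodated over standard testing."""
    ell_gain = means[("ELL", "accommodated")] - means[("ELL", "standard")]
    non_ell_gain = means[("non-ELL", "accommodated")] - means[("non-ELL", "standard")]
    return ell_gain, non_ell_gain

def looks_valid(ell_gain, non_ell_gain, tolerance=0.5):
    """Hypothetical screen: ELLs gain, non-ELL scores stay about the same."""
    return ell_gain > tolerance and abs(non_ell_gain) <= tolerance

# Hypothetical mean scores from a study with both groups in both conditions.
means = {
    ("ELL", "standard"): 12.0,
    ("ELL", "accommodated"): 14.5,
    ("non-ELL", "standard"): 18.0,
    ("non-ELL", "accommodated"): 18.25,
}
ell_gain, non_ell_gain = accommodation_gains(means)
print(ell_gain, non_ell_gain, looks_valid(ell_gain, non_ell_gain))
```

If non-ELL students also gained substantially, the accommodation would be changing what the test measures, not only removing a language barrier.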

Feasibility: For an accommodation strategy to be useful, its implementation must be possible in large-scale assessments.

Conclusion

There is not enough research support for many of the accommodations that are currently used in national and state assessments.

The only way to make judgments about the efficiency and validity of these accommodations is to use them in experimentally controlled situations with both ELL and non-ELL students and examine their validity and effectiveness under solid experimental design.

The results of CRESST studies along with other studies nationwide have provided support for some of the accommodations used for ELL students.

Conclusion cont.

Examples of research-supported accommodations:

Providing a customized dictionary is a viable alternative to providing traditional dictionaries.

The linguistic modification of test items that reduces unnecessary linguistic burdens on students is among the accommodations that help ELL students without affecting the validity of assessments.

Computer testing with added extra time and a glossary was shown to be a very effective, yet valid accommodation (Abedi, Courtney, Leon, and Goldberg, 2003).

Conclusion cont.

Without information on important aspects of accommodations such as validity, it would be extremely difficult to make an informed decision on what accommodation to use and how to report accommodated and non-accommodated results.

It is thus imperative to examine different forms of accommodations before using them in state and/or national assessments.

Now for a visual art representation of invalid accommodations…

Claudia Davis, Shiv Desai, Zitlali Morales, Nina Neulight

Operational Definition of NCLB

Questions for NCLB legislators

When they say 2014, do they mean:

By the year 2014, all students should reach 100% proficiency

OR

2014 years from now, everyone should reach 100% proficiency?
