Validity in Action: State Assessment Validity Evidence for Compliance with NCLB
William D. Schafer, Joyce Wang, and Vivian Wang, University of Maryland

Mar 19, 2016

Page 1: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

William D. Schafer, Joyce Wang, and Vivian Wang

University of Maryland

Page 2: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Objectives

• review the evidence that state testing programs provide to the United States Department of Education on the validity of their assessments

• examine in detail the validity evidence that certain selected states provided for their peer reviews

• make recommendations for improving the evidence submissions supporting validity for state assessments

Page 3: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Data Sources

• official decision letters on each state's final assessment system under NCLB from USED; publicly available at www.ed.gov

• peer review reports for five selected states

• technical reports for available states that have received full approval from USED; downloaded from the web sites of each state

Page 4: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Types of Validity Evidence

• the AERA/APA/NCME Standards lists five types of validity evidence

– content-based evidence

– response-process-based evidence

– evidence based on internal structure

– evidence based on relationships with other variables

– evidence based on consequences

• we will look at the judgments that each type should support in the context of statewide assessments of educational achievement

Page 5: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Content-Based Evidence

judgments that need to be supported:

• the domain is described in the academic content standards at the grade level

• the test items sample that content domain appropriately

• achievement level descriptions refer back to the content domain of the test

Page 6: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Response-Process-Based Evidence

judgment that needs to be supported:

• the activities the test demands of students are consistent with the cognitive processes the test is supposed to represent (as implied by the content standards)

Page 7: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Internal Structure

judgment that needs to be supported:

• test score relationships are consistent with the strand structures of the academic content standards

Page 8: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Relationships with Other Variables

judgments that need to be supported:

• higher correlations occur when traits are more similar

• low correlations (perhaps partial on ability) exist with specific traits (e.g., gender, race-ethnicity, disability)
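One way to read "partial on ability" is as a first-order partial correlation: the correlation between test score and a student characteristic after a third variable (an ability measure) has been held constant. The sketch below is illustrative only. The function names and the choice of a single ability covariate are assumptions, not something the presentation specifies.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(score, trait, ability):
    """Correlation of score with trait, controlling for ability.

    Uses the standard first-order formula
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    r_xy = pearson(score, trait)
    r_xz = pearson(score, ability)
    r_yz = pearson(trait, ability)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

A value near zero would be consistent with the judgment on this slide; a large value after controlling for ability would call for follow-up.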

Page 9: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Consequences

judgments that need to be supported:

• test use maximizes positive outcomes

• test use minimizes negative outcomes

Page 10: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Decision Letters

• decision letters were viewed at the USED web site – they are public documents

• 19 of the states were required to provide additional validity evidence

• the evidence was not classified by USED, but we classified it into the five types to help make the project manageable

• decision-letter evidence is required by USED – it is mandatory – these elements may be thought of as necessary for states to submit

Page 11: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Content-Based Evidence

• evidence to show that assessments measure the academic content standards rather than characteristics not specified in the academic content standards or grade level expectations

• blueprints, item specifications, and test development procedures

• evidence of alignment with content standards – this is an emphasis in peer review

• explanations of design and scoring

• standard setting process, results, and impact

Page 12: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Response-Process-Based Evidence

• evidence to show that items are tapping the intended cognitive processes – this sort of evidence is commonly a part of alignment studies

Page 13: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Internal Structure

• item interrelationships

• subscale score correlations showing they are consistent with the structures inherent to the academic content standards

• scoring and reporting are consistent with the subdomain structure of the content standards

• justification of score use given the threat (observed) that the subdomain correlations are higher between content areas than within content areas
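The threat named in the last bullet can be checked directly by comparing the average correlation among subscores within the same content area to the average across content areas. This is a minimal sketch under assumed inputs: the dictionary layout and the subscore and area names are hypothetical, and a real analysis would likely also correct for subscore unreliability.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_within_between(subscores, area_of):
    """Return (mean within-area r, mean between-area r).

    subscores: dict mapping subscore name -> list of student scores
    area_of:   dict mapping subscore name -> content area label
    """
    within, between = [], []
    for a, b in combinations(sorted(subscores), 2):
        r = pearson(subscores[a], subscores[b])
        if area_of[a] == area_of[b]:
            within.append(r)
        else:
            between.append(r)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(within), mean(between)
```

If the second value exceeds the first, the subscores are not behaving as the strand structure implies, which is the situation the slide says needs justification.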

Page 14: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Relationships with Other Variables

• criterion validity

• relationships between test scores and external variables

Page 15: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Consequences

• studies of intended and unintended consequences

Page 16: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence from State Submissions

• each state submitted voluminous evidence to USED

• the Peer Review Reports included descriptions of the evidence submitted

• we had sets of Reports for five states

• this evidence may be over and above what is actually required

Page 17: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence of Purposes

• each state was asked to provide evidence about the purposes of their assessments

• each state did that

• this is an important part of Kane’s (2006) concept of a validity argument

• because it does not fall into the categories of validity evidence in the USED Peer Review Guidance, we did not include it in our review

Page 18: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Content-Based Evidence

• test blueprints & construction process

• alignment reports

– categorical concurrence (each content strand has enough items for a subscore report)

– range of knowledge (the number of content elements in each strand that have items associated with them)

– balance of representation (the distribution of items across the content elements within each strand)

• achievement level descriptions (ALDs) compared with the strand structure
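The three alignment criteria listed above can be computed from a table of item-to-element codings. The sketch below is an assumption-laden illustration: the data layout is hypothetical, and the default cutoffs (at least 6 items per strand, at least half of the elements hit) are values often quoted for Webb-style alignment studies, not figures given in these slides. The balance index used is 1 minus half the summed absolute difference between each hit element's share of items and an even share.

```python
def alignment_summary(item_hits, n_elements, min_items=6, min_range=0.5):
    """Summarize alignment per strand.

    item_hits:  dict strand -> dict element -> number of items coded to it
    n_elements: dict strand -> total number of content elements in the strand
    min_items, min_range: assumed cutoffs, adjust to the study's own rules
    """
    summary = {}
    for strand, hits in item_hits.items():
        hit = {e: k for e, k in hits.items() if k > 0}
        total = sum(hit.values())
        n_hit = len(hit)
        range_prop = n_hit / n_elements[strand]
        if total:
            # 1.0 means items are spread evenly over the elements that were hit
            balance = 1 - sum(abs(1 / n_hit - k / total) for k in hit.values()) / 2
        else:
            balance = 0.0
        summary[strand] = {
            "items": total,
            "categorical_concurrence": total >= min_items,
            "range_of_knowledge": range_prop,
            "range_ok": range_prop >= min_range,
            "balance_index": balance,
        }
    return summary
```

For example, a strand with 4 items on one element and 1 on another, out of 5 elements, would fail the assumed item-count and range cutoffs and have a balance index of about 0.7.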

Page 19: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Response-Process-Based Evidence

• alignment reports

– depth of knowledge (relates the cognition tapped by each item to that implied in the statement of the element in the content standards the item is associated with)

• think-aloud studies (proposed)

Page 20: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Internal Structure

• dimensional analysis at the item level

– principal components analysis

– dimensionality hypothesis testing

• intercorrelations among the subtest scores

Page 21: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Relationships with Other Variables

• correlations with external tests of similar constructs (and dissimilar constructs)

• correlations with student demographics and course-taking patterns

• choosing and implementing accommodations for disabilities and limited English proficiency

• bias studies (e.g., DIF) and passage reviews

• universal design principles

• monitoring of test administration procedures

Page 22: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Evidence Based on Consequences

• longitudinal change in dropout and graduation rates and NAEP results

• use of results to evaluate schools and districts

• use of test data to improve curriculum & instruction

• use of adequate yearly progress reports

• use of tests to make promotion & graduation decisions

Page 23: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Synthesis of Evidentiary Needs

• it would be useful to have a minimum list for state regulatory submissions

• can we use these studies to generate a list?

• most likely over-inclusive using our evidence

• as soon as we do so, it will surely be challenged

• it seems reasonable to submit the following

– for each test series (e.g., regular, alternate)

– for each tested content and grade combination

Page 24: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Content Evidence

• content standards

• test blueprint

• item (and passage) development process

• item categorization rules and process

• forms development process (e.g., item sampling; item location; section timing)

• results of alignment studies

Page 25: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Process Evidence

• test blueprint (if it has a process dimension)

• item categorization rules and method (if items are categorized by process)

• results of alignment studies

• results of other studies, such as think-alouds

Page 26: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Internal Structure Evidence

• subscore correlations

• item-subscore correlations

• dimensionality analyses

Page 27: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Relations with Other Variables

• convergent evidence

– correlations with independent, standardized measures

– correlations with within-class variables, such as grades

• discriminant evidence

– correlations with standardized tests of other traits (e.g., math with reading)

– correlations with within-class variables, such as grades in other contents

– correlations with irrelevant student characteristics (e.g., gender)

– item-level (e.g., DIF) studies
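The slides name DIF studies without saying which statistic is used. One widely used choice is the Mantel-Haenszel common odds ratio, computed over groups of students matched on total score. The sketch below assumes that method; the function names and input layout are hypothetical. The delta transformation (-2.35 times the natural log of the odds ratio) is the conventional rescaling, with values near 0 suggesting little DIF.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio for one item across matched ability strata.

    strata: iterable of (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
            counts, one tuple per total-score level.
    Raises ZeroDivisionError if the denominator sums to zero.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    return num / den

def mh_delta(odds_ratio):
    """Rescale the odds ratio to the delta metric; 0 means no DIF."""
    return -2.35 * math.log(odds_ratio)
```

An odds ratio above 1 (negative delta) means the item favors the reference group among students of comparable ability; below 1 means it favors the focal group.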

Page 28: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Consequential Evidence

• purposes of the test – as they describe intended consequences

• uses of results by educators

• trends over time

• studies that generate and evaluate positive and negative aspects from user input

Page 29: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Validity in the Accountability Context – Role of Processes

• majority of the evidence submitted capitalizes on well-known methods for study of the validity of a particular test form – a product

• but object of study in accountability is actually a process by which tests are developed & used

– a test form is important only as a representative of a process of test development

– programs are expected to engage in a continual process of self-evaluation and improvement

Page 30: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Process Evidence

• assume it is useful to distinguish between product evidence and process evidence

– product evidence focuses on a particular test

– process evidence focuses on a testing program

• will review and extend some suggestions for process evidence that were originally proposed in the context of state assessment and accountability peer reviews

Page 31: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

What is a Process?

• a recurring activity that takes material, operates on it, and produces a product

• concept is borrowed from project management

• could be as large as the entire assessment and accountability program

• could be as small as, say, the production of a test item

• one challenge is to organize the activities of a program into useful processes

Page 32: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Is Validity a Process Concept?

• i.e., is there a sense in which we can use the concept of the validity of a process?

• validity is justification for an interpretation of a score

– a test form is a static element that can contribute support for an interpretation

– a process is a dynamic element that can contribute support for future interpretations

• so we give this one a tentative “yes”

Page 33: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Elements of Process Evidence

• process

– The process is described

– The inputs and operating rules are laid out

• product

– The results of the process are presented or described

• evaluation (how these questions are considered)

– is the process adequate?

– can (or how can) it be improved?

– should it be improved (e.g., do the benefits justify the costs)?

• improvement (how the consideration is done)

– The recommendations from the evaluation are considered for implementation in order to improve the process
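The four elements can be treated as a simple record, which makes it easy to see which parts of a process-evidence submission are still missing. This is only a sketch: the class name, field names, and the completeness check are assumptions for illustration, not a format proposed in the presentation.

```python
from dataclasses import dataclass

# The four elements named on this slide, in order.
ELEMENTS = ("process", "product", "evaluation", "improvement")

@dataclass
class ProcessEvidence:
    """One process in an assessment program, documented by four elements."""
    name: str
    process: str = ""       # description, inputs, and operating rules
    product: str = ""       # results of the process
    evaluation: str = ""    # adequacy, possible and worthwhile improvements
    improvement: str = ""   # how recommendations are considered and acted on

    def missing_elements(self):
        """Return the names of elements that have no documentation yet."""
        return [e for e in ELEMENTS if not getattr(self, e).strip()]
```

For instance, a record with process, product, and evaluation filled in but no improvement text would report `["improvement"]` as missing.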

Page 34: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Examples of Process Evidence

• three examples of these four elements of process evidence follow

• they vary markedly in scope

– small to large

– illustrate the nature of process evidence for different contexts within an assessment and accountability program

Page 35: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Bias and Sensitivity Committee Selection

• process. desired composition, generation of committee members, contacting potential members, proposed meeting schedule, etc.

• product. committee composition, especially the constituencies represented.

• evaluation. comparison of actual with desired composition, follow up with persons who declined, suggestions for improvement.

• improvement. who has responsibility to consider the recommendations generated by the evaluation, how they go about their analysis, how change is implemented in the system, examples of changes that were made in the past to document responsiveness
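The evaluation step for this example, comparing actual with desired composition, amounts to a per-constituency shortfall count. A minimal sketch follows; the function name and the constituency labels in the usage note are hypothetical.

```python
def composition_gaps(desired, actual):
    """Return constituencies that are under-represented on the committee.

    desired: dict constituency -> number of members wanted
    actual:  dict constituency -> number of members seated
    Result maps each under-represented constituency to its shortfall.
    """
    gaps = {}
    for group, wanted in desired.items():
        seated = actual.get(group, 0)
        if seated < wanted:
            gaps[group] = wanted - seated
    return gaps
```

With a desired composition of 4 teachers, 2 parents, and 2 special educators, and an actual committee of 5 teachers and 1 parent, the result would flag one missing parent and two missing special educators, which could feed the follow-up and suggestions described in the evaluation bullet.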

Page 36: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Alignment

• process. test blueprint, items, item categorizations, sampling processes

• product. a test form

• evaluation. alignment study

• improvement. review of study recommendations, plan for future

Page 37: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Psychometric Adequacy of a Test Form

• process. the analyses that are performed.

• product. technical manual

• evaluation. review by a group such as a TAC, recommendations for the manual as well as the testing program

• improvement. consideration of recommendations, plan for future

Page 38: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Making Judgments About Processes

• two typically independent layers of judgment

• first layer is an evaluation that makes recommendations about improvement

• second layer considers them

• in many cases, second layer would be an excellent way for a state to use its TAC

Page 39: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Judging Process Evidence

• process evidence by definition describes processes

• it should be judged by how well it describes processes that support interpretations based on future assessments

• it should also be judged on how well it describes processes that lead to improvements in the program

Page 40: Validity in Action: State Assessment Validity Evidence for Compliance with NCLB

Possible Criteria for Process Evidence

• data are collected from all relevant sources

• data are reported completely and efficiently

• reviewed by persons with appropriate expertise

• review is conducted fairly

• review results are reported completely and efficiently

• recommendations are suggested in the reports

• consideration given to the recommendations

• past actions based on the recommendations are presented as evidence that the process results in improvement