Enhancing the Technical Quality of the North Carolina Testing Program: An Overview of Current Research Studies
Nadine McBride, NCDPI; Melinda Taylor, NCDPI; Carrie Perkis, NCDPI
Overview
• Comparability
• Consequential validity
• Other projects on the horizon
Comparability
• Previous Accountability Conference presentations provided early results
• Research funded by an Enhanced Assessment Grant from the US Department of Education
• Focused on the following topics:
– Translations
– Simplified language
– Computer-based
– Alternative formats
What is Comparability?
Not just “same score”:
• Same content coverage
• Same decision consistency
• Same reliability and validity
• Same other technical properties (e.g., factor structure)
• Same interpretations of test results, with the same level of confidence
Goal
• Develop and evaluate methods for determining the comparability of scores from test variations to scores from the general assessments
• It should be possible to draw the same inferences, with the same level of confidence, from variations of the same test.
Research Questions
• What methods can be used to evaluate score comparability?
• What types of information are needed to evaluate score comparability?
• How do different methods compare in the types of information about comparability they provide?
Products
• Comparability Handbook
  – Current Practice
    • State Test Variations
    • Procedures for Developing Test Variations and Evaluating Comparability
  – Literature Reviews
  – Research Reports
  – Recommendations
    • Designing Test Variations
    • Evaluating Comparability of Scores
Results – Translations
• Replication methodology is helpful when faced with small samples and widely different proficiency distributions
– Gauge variability due to sampling (random) error
– Gauge variability due to distribution differences
• Multiple methods for evaluating structure are helpful
• Effect size criteria are helpful for DIF (a minimal sketch follows this slide)
• Congruence between structural and DIF results
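The slides do not name the specific DIF statistic or effect-size criterion used in the translation studies. As one illustration of the kind of analysis involved, the sketch below flags items using a Mantel-Haenszel common odds ratio converted to the ETS delta metric; the dichotomous item scores, the total-score stratifier, and the cutoffs are assumptions, not NCDPI's documented procedure.

```python
# Illustrative only: Mantel-Haenszel DIF with an ETS-style delta effect size.
# Assumes dichotomous (0/1) item scores, total score as the matching variable,
# and ignores the significance-test portion of the full ETS A/B/C rules.
import numpy as np

def mh_dif(item, total, group):
    """item: 0/1 responses; total: stratifying score; group: 0 = reference, 1 = focal."""
    item, total, group = (np.asarray(v) for v in (item, total, group))
    num = den = 0.0
    for k in np.unique(total):
        m = total == k
        a = np.sum((group[m] == 0) & (item[m] == 1))   # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))   # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))   # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))   # focal, incorrect
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    alpha = num / den if den > 0 else np.nan           # common odds ratio
    delta = -2.35 * np.log(alpha)                       # ETS delta metric
    category = "A" if abs(delta) < 1.0 else ("C" if abs(delta) > 1.5 else "B")
    return alpha, delta, category                       # A = negligible, C = large
```

Comparing which items this flags against the structural (factor) results is one way to check the congruence between structural and DIF evidence mentioned above.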
Results – Simplified Language
• Development procedures that are carefully documented, consistently followed, and focused on maintaining the item construct can support comparability arguments.
• Linking/equating approaches can be used to examine and/or establish comparability (a minimal sketch follows this slide).
• Comparing item statistics using the non-target group can provide information about comparability.
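The slides do not say which linking or equating design was applied to the simplified-language form. As a hedged illustration, the sketch below uses simple mean-sigma linear equating to place variation-form scores on the general-form scale; the score distributions and sample sizes are made up for the example.

```python
# Illustrative only: mean-sigma linear equating of a variation-form score scale
# onto the general-form scale; the actual NCDPI design is not specified here.
import numpy as np

def linear_equate(variation_scores, general_scores):
    """Return a function mapping variation-form scores to the general-form scale."""
    x = np.asarray(variation_scores, dtype=float)
    y = np.asarray(general_scores, dtype=float)
    slope = y.std(ddof=1) / x.std(ddof=1)      # match standard deviations
    intercept = y.mean() - slope * x.mean()    # match means
    return lambda s: slope * np.asarray(s, dtype=float) + intercept

# Hypothetical data, purely for demonstration
rng = np.random.default_rng(0)
simplified = rng.normal(48, 9, size=500)       # simplified-language form scores
general = rng.normal(50, 10, size=500)         # general-form scores
to_general = linear_equate(simplified, general)
print(round(float(to_general(48)), 1))         # a score of 48 maps to roughly 50
```

How closely the equated scores track the general-form scale is one piece of evidence about comparability, necessary but not sufficient on its own.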
Results – Computer-based
• Propensity score matching produced similar results to studies using within-subjects samples.
• The propensity score method provides a viable alternative to the difficult-to-implement repeated measures study (a minimal sketch follows this slide).
• Propensity score method is sensitive to group differences. For instance, the method performed better when 8th and 9th grade groups were matched separately.
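The covariates, matching rule, and caliper used in the NCDPI computer-based study are not given in the slides. The sketch below shows one common form of the approach, logistic-regression propensity scores with greedy 1:1 nearest-neighbor matching within a caliper; all specifics are assumptions for illustration.

```python
# Illustrative only: propensity-score matching of computer-based and paper test
# takers on background covariates, then comparing matched-group mean scores.
import numpy as np
from sklearn.linear_model import LogisticRegression

def matched_mode_effect(X, mode, score, caliper=0.05):
    """X: covariate matrix; mode: 1 = computer, 0 = paper; score: test scores."""
    X = np.asarray(X, dtype=float)
    mode = np.asarray(mode)
    score = np.asarray(score, dtype=float)
    ps = LogisticRegression(max_iter=1000).fit(X, mode).predict_proba(X)[:, 1]
    computer = np.where(mode == 1)[0]
    paper = np.where(mode == 0)[0]
    used, pairs = set(), []
    for i in computer:                               # greedy 1:1 nearest neighbor
        order = paper[np.argsort(np.abs(ps[paper] - ps[i]))]
        for j in order:
            if j not in used and abs(ps[j] - ps[i]) <= caliper:
                used.add(j)
                pairs.append((i, j))
                break
    diff = np.mean([score[i] - score[j] for i, j in pairs])
    return diff, len(pairs)                          # matched mean mode difference
```

Per the slide, running the matching within grade (for example, 8th and 9th graders separately) rather than pooling grades produced better results.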
Results – Alternative Formats
• The burden of proof is much heavier for this type of test variation.
• A study based on students eligible for the general test can provide some, but not solid, evidence of comparability.
• Judgment-based studies combined with empirical studies are needed to evaluate comparability.
• More research is needed on methods for evaluating which constructs each test type measures.
Lessons Learned
• It takes a village…
– Cooperative effort of SBE, IT, districts and schools to implement special studies
– Researchers to conduct studies, evaluate results
– Cooperative effort of researchers and TILSA members to review study design and results
– Assessment community to provide insight and explore new ideas
Consequential Validity
• What is consequential validity?
– An amalgamation of evidence regarding the degree to which use of test results has social consequences
– Consequences can be both positive and negative, intended and unintended
Whose Responsibility?
• Role of the Test Developer versus the Test User?
• Responsibility and roles are not clearly defined in the literature
• State may be designated as both a test developer and a user
Test Developer Responsibility
• Generally responsible for…
– Intended effects
– Likely side effects
– Persistent unanticipated effects
– Promoted use of scores
– Effects of testing
Test Users’ Responsibility
• Generally responsible for…
– Use of scores
• The further from the intended uses, the greater the responsibility
Role of Peer Review
• Element 4.1
– For each assessment, including the alternate assessment, has the state documented the issue of validity… with respect to the following categories:
• g) Has the state ascertained whether the assessment produces intended and unintended consequences?
Study Methodology
• Focus Groups
– Conducted in five regions across the state
– Led by NC State’s Urban Affairs
– Completed in December 2009 and January 2010
– Input of teachers and administration staff
– Included large, small, rural, urban, and suburban schools
Study Methodology
• Survey Creation
– Drafts currently modeled after surveys conducted in other states
– However, most of those surveys were conducted 10+ years ago
– Surveys will be finalized after focus group results are reviewed
Study Methodology
• Survey administration
– Testing Coordinators to receive survey notification
– Survey to be available in late March to April
Study Results
• Stay tuned!
– Hope to make the report publicly available on the DPI testing website
Other Research Projects
• Trying out different item types
• Item location effects
• Auditing
Contact Information
• Nadine McBride, Psychometrician, nmcbride@dpi.state.nc.us
• Melinda Taylor, Psychometrician, mtaylor@dpi.state.nc.us
• Carrie Perkis, Data Analyst, cperkis@dpi.state.nc.us