Kaizen–What Can I Do To Improve My Program? F. Jay Breyer, Ph.D. jay.breyer@thomson.com Presented at the 2005 CLEAR Annual Conference, September 15-17, Phoenix, Arizona

Dec 26, 2015

Page 1

Kaizen–What Can I Do To Improve My Program?

F. Jay Breyer, Ph.D., jay.breyer@thomson.com

Presented at the 2005 CLEAR Annual Conference, September 15-17, Phoenix, Arizona

Page 2

Test Development Process (Where we have been)

• Content: found to be important for the job, as determined by job analysis
• Sampling of content: How many items are needed in the test form to assess minimal competency?
• Importance of content domains: What is the emphasis on specific content domains?
• Based on identified test specifications, select items that match content domains
• Evaluate total item bank
• Pretest new items
• Evaluate statistical parameters: verify appropriate performance of items
• Review and edit items to ensure correct grammatical structure and adherence to fairness and sensitivity guidelines
• Equate test forms following the standard setting to ensure comparability of test scores for different test forms
• Prepare test forms for administration: paper-and-pencil delivery or computer delivery
• Outcome: Valid & reliable test that is sound and defensible

But wait! We can do something else … how can we change what we do to improve the testing program?

Diagram labels: Content; Test Specifications; Item Type; Item Development; Item Writing; Statistical Analysis; Form Assembly; Edit & Fairness Review; Statistical Parameters; Test Modality; Validity, Reliability & Defensibility

Page 3

After the Examination is Over…

It seemed we would never get to this point, but here we are. Before the next test is created:

• What can we learn from this administration?
• What should we do to find out about the examination we just gave and reported?

Activities

• What is the size and quality of my item bank?
• Do I have sufficient numbers of items in each content area for the next examination form?
• Can I assemble the next form to content and statistical specifications?
• How do I find out what my statistical specifications are?
• What is the reliability of my test?
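
The item-bank questions on this slide can be checked with a simple count. The sketch below is an illustration only and not part of the original presentation: the item IDs, content-area names, blueprint counts, and the helper name `bank_shortfalls` are all hypothetical.

```python
from collections import Counter


def bank_shortfalls(bank_items, blueprint):
    """Return {content_area: items_missing} for areas the bank cannot fill.

    bank_items: iterable of (item_id, content_area) pairs (hypothetical layout).
    blueprint:  dict mapping content_area -> items required on the next form.
    """
    available = Counter(area for _, area in bank_items)
    return {
        area: needed - available[area]   # Counter returns 0 for unseen areas
        for area, needed in blueprint.items()
        if available[area] < needed
    }


# Hypothetical item bank and test specifications, for illustration only.
bank = [
    ("I001", "Assessment"),
    ("I002", "Assessment"),
    ("I003", "Treatment"),
    ("I004", "Ethics"),
]
blueprint = {"Assessment": 2, "Treatment": 3, "Ethics": 1}

shortfalls = bank_shortfalls(bank, blueprint)
print(shortfalls)  # {'Treatment': 2}
```

An empty result would suggest the bank can cover the content specifications; it says nothing about whether the statistical specifications can also be met.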

Page 4

Determining appropriate psychometric approaches to item and test development

Challenges

What do you do if your test is
• Too long for the time allotted?
• Too hard/easy for the population tested and the purpose?
• Not sufficiently reliable for the test’s purpose?

Approaches

Item analysis of the test before scores are reported helps ensure validity
• Correct keys are used to grant points
• Items function as intended

But test analyses after the test is reported can be useful for
• Construction of new test forms
• Evaluation of item creation techniques
• Changes that improve the testing program

Page 5

Test Analyses

Help ensure quality for testing programs that wish to verify that appropriate test development and psychometric procedures are being used. These analyses help to verify that the program’s test development activities are psychometrically sound and provide directions for possible continuous improvement.

Analyses should
• Assure the public of meeting basic standards of
– Quality & Fairness
– Reliability
• Answer the question “How are my test development activities doing?”

Analyses should not
• Limit innovation or have a punitive function
• Be ignored

Page 6

Item Analyses at Different Times

• PIA
– Preliminary Item Analysis
• EIA
– Early Item Analysis
• IA after PINS but before equating or cut score study
• FIA
– Final Item Analysis

Page 7

PIA: Only Bad Items

Page 8

PIA: Hard Item

Page 9

PIA: Key Issue

Page 10

FIA: Everything

[Figure: final item analysis output; legible values: 98.2, 72.3, C, 89.0]

Page 11

Post Test Administration Inquiry

A FAIR TEST

Item/Task Information
• Quality of items/tasks from past test
– Difficulty
– Discrimination
– DIF

Total Score Information
• Reliability
• Score Distributions
• Descriptive Information
• Speededness

Subscore Information
• Reliability of reported subscores
• Score Distributions
• Descriptive Information
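
Item difficulty and discrimination, as named on this slide, are commonly summarized by the proportion answering correctly (the p-value) and an item-total correlation. The sketch below is one possible illustration under stated assumptions: items are scored 0/1, discrimination is taken as the point-biserial against the total score with the item itself removed, and the response matrix and function names are hypothetical. The slides do not say which indices the author's reports used.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_stats(responses, item):
    """Return (p_value, corrected point-biserial) for one 0/1-scored item.

    responses: one list of 0/1 item scores per candidate.
    """
    item_scores = [row[item] for row in responses]
    # Total score with the studied item removed, so the item is not
    # correlated with itself.
    rest_scores = [sum(row) - row[item] for row in responses]
    p_value = sum(item_scores) / len(item_scores)
    return p_value, pearson(item_scores, rest_scores)


# Hypothetical scored responses: 6 candidates by 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

p_value, r_pb = item_stats(responses, item=0)
print(round(p_value, 3), round(r_pb, 3))  # 0.667 0.625
```

A very low p-value paired with a near-zero or negative correlation is the usual signal to check the item for a key problem, although real analyses need far more candidates than this toy matrix.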

Page 12

Score Information: Reliability and Validity

• Reliability
– Consistency & Accuracy
• Validity
– Score inferences, score meaning, score interpretations
• What we can say about people

Page 13

Score Information: Reliability

• Reliability
– Consistency and Accuracy
• Credential Testing
– Refers to consistency of test scores across different test forms given the content sampling
• Alpha, Kuder-Richardson (K-R20)
– Refers to consistency of passing and failing the same people as if they were able to take the test twice
• Subkoviak, PF Consistency, RELCLASS
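
For 0/1-scored items, the K-R20 coefficient named on this slide can be computed directly from the scored responses. The sketch below is a minimal illustration, assuming dichotomous items and the population (divide-by-n) variance of total scores; the response matrix and the function name `kr20` are hypothetical. It does not cover the pass/fail consistency indices (Subkoviak, RELCLASS), which need a cut score and a different method.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1-scored items.

    KR-20 = k / (k - 1) * (1 - sum(p_i * q_i) / var_total)
    responses: one list of 0/1 item scores per candidate.
    """
    n_candidates = len(responses)
    k = len(responses[0])

    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_candidates
    # Population variance of total scores (divide by n, not n - 1).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_candidates

    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n_candidates
        sum_pq += p * (1.0 - p)

    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical scored responses: 6 candidates by 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

reliability = kr20(responses)
print(round(reliability, 3))  # 0.622
```

With only four items the coefficient is low, which is expected: all else equal, longer forms tend to be more reliable.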

Page 14

Score Information: Reliability

• Measurement Error
– Refers to random fluctuations in a person’s score due to factors not related to the content of the test
• SEM (standard error of measurement)
• CSEM (conditional standard error of measurement)
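
Both quantities can be sketched numerically. The SEM uses the classical relation SEM = SD × sqrt(1 − reliability). For the CSEM, the sketch assumes Lord's binomial-error approximation for raw number-correct scores, which is only one of several ways to estimate a conditional error and is not necessarily the one the presenter had in mind. The standard deviation, reliability, and test length are hypothetical values.

```python
from math import sqrt


def sem(score_sd, reliability):
    """Classical standard error of measurement for the whole score scale."""
    return score_sd * sqrt(1.0 - reliability)


def csem_binomial(raw_score, n_items):
    """Conditional SEM at one raw score (Lord's binomial-error approximation).

    Assumption: number-correct scoring on n_items dichotomous items.
    """
    return sqrt(raw_score * (n_items - raw_score) / (n_items - 1))


# Hypothetical values, for illustration only.
overall = sem(score_sd=10.0, reliability=0.91)
print(round(overall, 2))  # 3.0

# The conditional error is largest mid-scale and shrinks toward the extremes.
mid_scale = csem_binomial(50, 100)
near_top = csem_binomial(90, 100)
print(mid_scale > near_top)  # True
```

For a credentialing test, the CSEM at or near the cut score is usually the value of most interest, since that is where pass/fail decisions are made.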

Page 15

Test Analyses: Score Information

[Figure: score information output; legible values: 0.88, 75%]

Page 16

Test Analyses: Score Information

Correlations can add to the understanding of score reliability
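
One way correlations inform reliability is through subscores: if two subscores correlate almost perfectly once their unreliability is taken into account, they may not be measuring distinct things. The sketch below is an assumption about what such an analysis could look like, not a reconstruction of the slide's figure. It computes a Pearson correlation between two hypothetical subscores and applies the standard correction for attenuation, r / sqrt(rel_x × rel_y); the subscore values and reliabilities are made up.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def disattenuated(r_xy, rel_x, rel_y):
    """Correlation corrected for unreliability in both measures."""
    return r_xy / sqrt(rel_x * rel_y)


# Hypothetical subscores for five candidates.
subscore_a = [12, 15, 9, 18, 14]
subscore_b = [10, 14, 8, 15, 13]

observed = pearson(subscore_a, subscore_b)
print(0.9 < observed < 1.0)  # True

# Hypothetical observed correlation and subscore reliabilities.
corrected = disattenuated(0.6, rel_x=0.8, rel_y=0.8)
print(round(corrected, 2))  # 0.75
```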

Page 17

Item Information: DIF & Sensitivity

• Sensitivity
– How questions appear
– Review by TD person
• Removes words and phrases from a test that may be
– Insulting
– Defamatory
– Charged
• Differential Item Functioning (DIF)
– How questions behave
– Searches for items with Construct Irrelevant Variance
– Tests differences in item difficulty for k groups when matched on proficiency
– Mantel-Haenszel
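
The Mantel-Haenszel procedure can be sketched for the two-group case. Candidates are first matched on total score; within each matched stratum a 2×2 table of group by right/wrong is formed, and the tables are pooled into a common odds ratio. The sketch below also reports the ETS delta-scale index, MH D-DIF = −2.35 × ln(alpha). The counts are hypothetical, the function name is illustrative, and the significance test that normally accompanies the statistic is omitted.

```python
from math import log


def mantel_haenszel(strata):
    """Return (common odds ratio, MH D-DIF) for one studied item.

    strata: one tuple per matched score level, laid out as
            (ref_right, ref_wrong, focal_right, focal_wrong).
    """
    numerator = denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    alpha = numerator / denominator
    # Negative D-DIF: the item is harder for the focal group than for
    # matched reference-group members.
    return alpha, -2.35 * log(alpha)


# Hypothetical counts at three matched score levels (low, middle, high).
flagged_item = [(20, 30, 10, 40), (40, 20, 25, 35), (45, 5, 30, 20)]
alpha, d_dif = mantel_haenszel(flagged_item)
print(alpha > 1.0, d_dif < 0.0)  # True True

# Same proportion correct in both groups at every level: no DIF.
clean_item = [(20, 20, 10, 10), (30, 10, 15, 5)]
clean_alpha, clean_d_dif = mantel_haenszel(clean_item)
print(clean_alpha)  # 1.0
```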

Page 18

DIF

• Impact is not DIF
– Impact: the assessment of group differences in test performance between unmatched focal and reference group members
– Confounding of item performance differences between focal and reference groups
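
A small numeric example may make the distinction concrete. In the hypothetical counts below, the two groups have the same proportion correct at every matched score level, so the item shows no DIF, yet the unmatched comparison still shows a gap because the groups are distributed differently across score levels. All counts and names are invented for illustration.

```python
def proportion_correct(right, wrong):
    """Proportion of candidates answering the item correctly."""
    return right / (right + wrong)


# Hypothetical counts per matched total-score level:
# (ref_right, ref_wrong, focal_right, focal_wrong)
strata = [
    (10, 30, 20, 60),  # low scorers: 25% correct in both groups
    (30, 30, 20, 20),  # middle scorers: 50% correct in both groups
    (60, 20, 15, 5),   # high scorers: 75% correct in both groups
]

# Impact: compare the groups without matching on proficiency.
p_ref = proportion_correct(sum(s[0] for s in strata), sum(s[1] for s in strata))
p_focal = proportion_correct(sum(s[2] for s in strata), sum(s[3] for s in strata))
impact = p_ref - p_focal
print(impact > 0.15)  # True

# Matched comparison: the gap within each score level.
matched_gaps = [
    proportion_correct(s[0], s[1]) - proportion_correct(s[2], s[3])
    for s in strata
]
print(matched_gaps)  # [0.0, 0.0, 0.0]
```

Here the unmatched gap reflects where each group sits on the score scale, not how the item behaves for equally proficient candidates.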

Page 19

DIF

• How DIF is calculated
– The criterion is the total test score or construct
– The question DIF answers is:
• Is the meaning the same for the focal group as it is for the reference group?
– If the interpretation of the scores (the meaning) is different for subgroups, then DIF is present
• DIF has to do with improving validity

Page 20

In Summary

• Statistical information following test administration can provide
– Item information
• Difficulty and suitability of the items/tasks for your candidate samples
• DIF
– Potential sources of bias (invalidity)
– Decision score information
• Distributions, descriptive statistics, reliability information
– Subscore information
• Reliability information, intercorrelations
• Help highlight areas for continuous improvement
– Kaizen