Testing 07.ppt

8/17/2019 Testing 07.ppt

1/27

Item Response Theory


2/27

Shortcomings of Classical True Score Model

• Sample dependence

• Limitation to the specific test situation

• Dependence on the parallel forms

• Same error variance for all


3/27

Sample Dependence

• The first shortcoming of CTS is that the values of commonly used item statistics in test development, such as item difficulty and item discrimination, depend on the particular examinee samples in which they are obtained. The average level of ability and the range of ability scores in an examinee sample influence, often substantially, the values of the item statistics.

• Difficulty level changes with the level of the sample's ability, and the discrimination index is different between a heterogeneous sample and a homogeneous sample.
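Sample dependence can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: it assumes responses to a single item follow a logistic curve with hypothetical parameters a = 1 and b = 0, and it gives the same item to a low-ability and a high-ability sample. The classical difficulty index (proportion correct) comes out very different for the two samples even though the item itself has not changed.

```python
import math
import random

def p_correct(theta, a=1.0, b=0.0):
    # Assumed logistic response curve; a and b are hypothetical item parameters.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def classical_difficulty(abilities, rng):
    # Classical item difficulty: proportion of examinees answering correctly.
    responses = [1 if rng.random() < p_correct(t) else 0 for t in abilities]
    return sum(responses) / len(responses)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
low_sample = [rng.gauss(-1.0, 1.0) for _ in range(5000)]   # weaker examinees
high_sample = [rng.gauss(1.0, 1.0) for _ in range(5000)]   # stronger examinees

p_low = classical_difficulty(low_sample, rng)
p_high = classical_difficulty(high_sample, rng)
print(f"proportion correct, low-ability sample:  {p_low:.2f}")
print(f"proportion correct, high-ability sample: {p_high:.2f}")
```

The same item looks "hard" in the first sample and "easy" in the second, which is the dependence the slide describes.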


4/27

Limitation to the Specific Test Situation

• The task of comparing examinees who have taken samples of test items of differing difficulty cannot easily be handled with standard testing models and procedures.


5/27

Dependence on the Parallel Forms

• The fundamental concept, test reliability, is defined in terms of parallel forms.


6/27

Same Error Variance For All

• CTS presumes that the variance of errors of measurement is the same for all examinees.
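The last two shortcomings can be seen in the usual classical formulas. The notation below is an assumption for illustration and does not appear in the slides: X and X' are scores on two parallel forms, and σ_X is the standard deviation of observed scores.

```latex
\rho_{XX'} = \operatorname{corr}(X, X'),
\qquad
\mathrm{SEM} = \sigma_X \sqrt{1 - \rho_{XX'}}
```

Reliability is defined through the parallel form X', and the standard error of measurement is a single number for the whole group, so it is applied equally to every examinee regardless of ability.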


7/27

Item Response Theory

• The purpose of any test theory is to describe how inferences from examinee item responses and/or test scores can be made about unobservable examinee characteristics or traits that are measured by a test.

• An individual's expected performance on a particular test question, or item, is a function of both the level of difficulty of the item and the individual's level of ability.


8/27

Item Response Theory

• Examinee performance on a test can be predicted (or explained) by defining examinee characteristics, referred to as traits, or abilities; estimating scores for examinees on these traits (called "ability scores"); and using the scores to predict or explain item and test performance. Since traits are not directly measurable, they are referred to as latent traits or abilities. An item response model specifies a relationship between the observable examinee test performance and the unobservable traits or abilities assumed to underlie performance on the test.


9/27

Assumptions of IRT

• Unidimensionality

• Local independence


10/27

Unidimensionality Assumption

• It is possible to estimate an examinee's ability on the same ability scale from any subset of items in the domain of items that have been fitted to the model. The domain of items needs to be homogeneous in the sense of measuring a single ability. If the domain of items is too heterogeneous, the ability estimates will have little meaning.

• Most of the IRT models that are currently being applied make the specific assumption that the items in a test measure a single, or unidimensional, ability or trait, and that the items form a unidimensional scale of measurement.


11/27

Local Independence

• This assumption states that an examinee's responses to different items in a test are statistically independent. For this assumption to be true, an examinee's performance on one item must not affect, either for better or for worse, his or her responses on any other items in the test.
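In symbols, local independence is usually stated conditional on ability. The notation is an assumption for illustration: U_i is the scored response to item i (1 = correct, 0 = incorrect), θ is the examinee's ability, and n is the number of items.

```latex
P(U_1 = u_1, \ldots, U_n = u_n \mid \theta)
  = \prod_{i=1}^{n} P(U_i = u_i \mid \theta)
```

Responses may still be correlated across a group of examinees who differ in ability; the independence holds among examinees at the same ability level.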


12/27

Item Characteristic Curves

• Specific assumptions about the relationship between the test taker's ability and his or her performance on a given item are explicitly stated in the mathematical formula, or item characteristic curve (ICC).


13/27


• The form of the ICC is determined by the particular mathematical model on which it is based. The types of information about item characteristics may include:

• (1) the degree to which the item discriminates among individuals of differing levels of ability (the 'discrimination' parameter a)


14/27


• (2) the level of difficulty of the item (the 'difficulty' parameter b), and

• (3) the probability that an individual of low ability can answer the item correctly (the 'pseudo-chance' or 'guessing' parameter c).

• One of the major considerations in the application of IRT models, therefore, is the estimation of these item parameters.


15/27

ICC

• Pseudo-chance parameter c: p = 0.20 for two items

• Difficulty parameter b: halfway between the pseudo-chance parameter and one

• Discrimination parameter a: proportional to the slope of the ICC at the point of the difficulty parameter. The steeper the slope, the greater the discrimination parameter.

[Figure: item characteristic curves. Axes: Ability Scale, Probability]
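The three parameters can be tied together with one common form of the ICC, the three-parameter logistic model. The slides do not state which formula they use, so this is a sketch under that assumption, and it omits the scaling constant that some texts include. The parameter values are hypothetical apart from c = 0.20.

```python
import math

def icc_3pl(theta, a, b, c):
    # Three-parameter logistic ICC (assumed form):
    # probability of a correct response at ability theta.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

a, b, c = 1.2, 0.5, 0.20  # hypothetical a and b; c matches the slide's 0.20

# At theta == b the curve is halfway between c and one.
p_at_b = icc_3pl(b, a, b, c)

# The slope at theta == b is proportional to a: it equals a * (1 - c) / 4.
h = 1e-5
slope_at_b = (icc_3pl(b + h, a, b, c) - icc_3pl(b - h, a, b, c)) / (2 * h)

# For very low ability the curve flattens out at the pseudo-chance level c.
p_low_ability = icc_3pl(-50.0, a, b, c)

print(p_at_b, slope_at_b, p_low_ability)
```

Doubling a doubles the slope at b, which is the sense in which a steeper curve means greater discrimination.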


16/27

Ability Score

• 1. The test developer collects a set of observed item responses from a relatively large number of test takers.

• 2. After an initial examination of how well various models fit the data, an IRT model is selected.

• 3. Through an iterative procedure, parameter estimates are assigned to items and ability scores to individuals, so as to maximize the agreement, or fit, between the particular IRT model and the test data.
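The ability half of step 3 can be sketched as follows. This is not the full procedure: it assumes the item parameters are already known, assumes a three-parameter logistic model, and replaces the iterative search with a simple grid search for the ability value that maximizes the likelihood of one examinee's responses. All names and parameter values are hypothetical.

```python
import math

def icc_3pl(theta, a, b, c):
    # Assumed three-parameter logistic ICC.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    # Log-likelihood of a 0/1 response pattern at ability theta,
    # using local independence to sum over items.
    total = 0.0
    for (a, b, c), u in zip(items, responses):
        p = icc_3pl(theta, a, b, c)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def estimate_ability(items, responses, lo=-4.0, hi=4.0, steps=801):
    # Grid search over [lo, hi] for the maximum-likelihood ability.
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical (a, b, c) triples for four items of increasing difficulty.
items = [(1.0, -1.0, 0.2), (1.0, 0.0, 0.2), (1.0, 1.0, 0.2), (1.0, 2.0, 0.2)]

weak = estimate_ability(items, [1, 0, 0, 0])    # only the easiest item correct
strong = estimate_ability(items, [1, 1, 1, 0])  # all but the hardest correct
print(weak, strong)
```

A full calibration would alternate between updating item parameters and updating abilities until the fit stops improving; operational programs typically use dedicated software for that.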


17/27

Ability Score


18/27

Item Information Function

• The limitations of CTS theory approaches to precision of measurement are addressed in the IRT concept of the information function. The item information function refers to the amount of information a given item provides for estimating an individual's level of ability, and is a function of both the slope of the ICC and the amount of variation at each ability level.

• The information function of a given item will be at its maximum for individuals whose ability is at or near the value of the difficulty parameter.
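The dependence on slope and variation can be written out. The notation is an assumption for illustration: P_i(θ) is the ICC of item i and P_i'(θ) is its slope at ability θ.

```latex
I_i(\theta) = \frac{\left[ P_i'(\theta) \right]^2}{P_i(\theta)\,\left[ 1 - P_i(\theta) \right]}
```

The numerator is the squared slope of the ICC and the denominator is the variance of the 0/1 response at that ability level. For a logistic ICC with no guessing (c = 0) this reduces to a² P_i(θ)[1 − P_i(θ)], which peaks exactly at θ = b with value a²/4; when c > 0 the peak lies slightly above b, hence "at or near".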


19/27



20/27



21/27


• The information function of a given item will be at its maximum for individuals whose ability is at or near the value of the difficulty parameter.

• (1) provides the most information about differences in ability at the lower end of the ability scale.

• (2) provides relatively little information at any point on the ability scale.

• (3) provides the most information about differences in ability at the high end of the ability scale.


22/27

Test Information Function

• The test information function (TIF) is the sum of the item information functions, each of which contributes independently to the total, and is a measure of how much information a test provides at different ability levels.

• The TIF is the IRT analog of CTS theory reliability and the standard error of measurement.
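The summation can be sketched in a few lines. This assumes a three-parameter logistic ICC, which the slides do not specify, and uses hypothetical item parameters. It also uses the standard result that the standard error of the ability estimate is approximately the reciprocal square root of the test information, which is what makes the TIF play the role of the standard error of measurement.

```python
import math

def icc_3pl(theta, a, b, c):
    # Assumed three-parameter logistic ICC.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, c):
    # Item information for the 3PL model: squared slope divided by P * Q.
    p = icc_3pl(theta, a, b, c)
    q = 1.0 - p
    return a * a * (q / p) * ((p - c) / (1.0 - c)) ** 2

def tif(theta, items):
    # Test information: the plain sum of the item information functions.
    return sum(item_information(theta, a, b, c) for a, b, c in items)

def standard_error(theta, items):
    # Approximate standard error of the ability estimate at theta.
    return 1.0 / math.sqrt(tif(theta, items))

# Hypothetical (a, b, c) triples: an easy, a medium, and a hard item.
items = [(1.5, -1.5, 0.2), (1.0, 0.0, 0.2), (1.5, 1.5, 0.2)]

for theta in (-2.0, 0.0, 2.0):
    print(theta, round(tif(theta, items), 3), round(standard_error(theta, items), 3))
```

Unlike the single classical standard error, the value returned here changes with theta, so precision is reported separately for each ability level.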


23/27

Item


24/27

Specifications in CTS Item


25/27

Form of Items

• Dichotomous

Listening comprehension

Statement + question + choices

Short conversation + question + choices

Long conversation / passage + some questions + choices

Reading comprehension

Passage + some questions + choices

Passage + T/F questions

Syntactic knowledge / vocabulary

Question stem with blank/underlined parts + choices

Cloze

Passage + choices


26/27

Form of Items

• Nondichotomous

Listening comprehension

Dictation

Dictation passage with blanks to be filled


27/27

Describing data

• Ability measured

• Difficulty index

• Discrimination

• Storage code
