Classical and Bayesian Computerized Adaptive Testing Algorithms
Richard J. Swartz
Department of Biostatistics ([email protected])
Outline
• Principle of computerized adaptive testing
• Basic statistical concepts and notation
• Trait estimation methods
• Item selection methods
• Comparisons between methods
• Current CAT research topics
Computerized Adaptive Tests (CAT)
• First developed for assessment testing
• Test tailored to an individual
– Only questions relevant to individual trait level
– Shorter tests
• Sequential adaptive selection problem
• Requires item bank
– Fit with IRT models
– Extensive initial development before CAT implementation
Item Bank Development I
• Qualitative item development
– Content experts
– Response categories
• Test model fit
– Likelihood ratio based methods
– Model fit indices
Item Bank Development II
• Test assumption: unidimensionality
• Over time and across groups (e.g., men vs. women)
• Across groups
• Many different methods (logistic regression method, area between response curves, and others)
CAT Implementation
[Figure: items 1 to 15 placed along a trait continuum running from "Lo Depression" to "Hi Depression"; response options a, b, c are shown for some of the items.]
CAT Item Selection
Basic Concepts / Notation
• Latent trait of interest: $\theta$
• Bank of items: $B$
• Set of items administered after stage $k$: $A_k$
• Set of items remaining at stage $k$: $R_k = B \setminus A_{k-1}$
• Response to item $i$ ($m_i$ possible categories): $u_i \in \{1, 2, 3, \ldots, m_i\}$
• Single item administered at stage $k$: $i_k$
• Response to item $i_k$ at stage $k$: $u_{i_k}$
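Basic Concepts / Notation II
• Vector of responses up to stage $k$: $\mathbf{u}_{A_k} = (u_{i_1}, u_{i_2}, u_{i_3}, \ldots, u_{i_k})$
• Probability of response $u_i$ to item $i$ (IRT model): $P_i(u_i \mid \theta, \boldsymbol{\xi}_i)$, where $\boldsymbol{\xi}_i$ denotes the item parameters for item $i$
• Local independence assumption: for two items $i$ and $j$, given $\theta$, $P_i(u_i \mid \theta, \boldsymbol{\xi}_i)$ is independent of $P_j(u_j \mid \theta, \boldsymbol{\xi}_j)$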
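The slides leave the IRT model unspecified. As a minimal sketch, assume Samejima's graded response model for a polytomous item. The function names `grm_probs` and `joint_prob`, the parameters `a` (discrimination) and `b` (ordered thresholds), and the example values are illustrative assumptions, not part of the talk. Under local independence, the joint probability of two responses given the trait is the product of the item-level probabilities.

```python
import math

def grm_probs(theta, a, b):
    """Category probabilities P_i(u_i = r | theta) for r = 1..m under an
    assumed graded response model (an illustrative choice of IRT model)."""
    # Boundary curves P(u > r | theta), padded with 1 (r = 0) and 0 (r = m).
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [cum[r] - cum[r + 1] for r in range(len(b) + 1)]

def joint_prob(theta, item_i, u_i, item_j, u_j):
    """Local independence: given theta, the joint probability factorizes."""
    (a_i, b_i), (a_j, b_j) = item_i, item_j
    return grm_probs(theta, a_i, b_i)[u_i - 1] * grm_probs(theta, a_j, b_j)[u_j - 1]

# Hypothetical items: (discrimination, thresholds), m = 4 categories each.
item_1 = (1.5, [-1.0, 0.0, 1.0])
item_2 = (1.0, [-0.5, 0.5, 1.5])
probs = grm_probs(0.3, *item_1)
```

Categories are coded 1..m as on the slide, so response `u` maps to list index `u - 1`.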
TRAIT ESTIMATION
Estimating Traits
• Assumes item parameters are known
• Represent the individual's ability
• Done sequentially in CAT
• Estimate is updated after each additional response
– Maximum Likelihood Estimator
– Bayesian Estimators
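Likelihood
• Model describing a person's response pattern:
$$L(\theta \mid \mathbf{u}_{A_k}, \boldsymbol{\xi}_{A_k}) = f(\mathbf{u}_{A_k} \mid \theta, \boldsymbol{\xi}_{A_k}) = \prod_{j \in A_k} P_j(u_j \mid \theta, \boldsymbol{\xi}_j) \quad \text{(locally independent)}$$
• $\boldsymbol{\xi}_i$: item parameters for item $i$
• $\boldsymbol{\xi}_{A_k} = (\boldsymbol{\xi}_{i_1}, \boldsymbol{\xi}_{i_2}, \boldsymbol{\xi}_{i_3}, \ldots, \boldsymbol{\xi}_{i_k})$
• $P_j(u_j \mid \theta, \boldsymbol{\xi}_j)$ depends on the IRT model used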
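To make the product form concrete, here is a small sketch of the likelihood as a function of the trait value. It assumes a graded response model for the item probabilities (the slides do not fix a model); `grm_probs`, `likelihood`, and the item values are hypothetical names and numbers chosen for illustration.

```python
import math

def grm_probs(theta, a, b):
    """Category probabilities for one item under an assumed graded response model."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [cum[r] - cum[r + 1] for r in range(len(b) + 1)]

def likelihood(theta, items, responses):
    """L(theta | u_A, xi_A): product over administered items of P_j(u_j | theta, xi_j)."""
    value = 1.0
    for (a, b), u in zip(items, responses):
        value *= grm_probs(theta, a, b)[u - 1]  # categories are coded 1..m
    return value

# Hypothetical administered items (discrimination, thresholds) and responses.
administered = [(1.5, [-1.0, 0.0, 1.0]), (1.0, [-0.5, 0.5, 1.5])]
responses = [2, 3]
L_low = likelihood(-2.0, administered, responses)
L_mid = likelihood(0.2, administered, responses)
```

For these mid-category responses the likelihood is larger near the middle of the trait range than at a low trait value, which is what an estimator based on it exploits.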
Maximum Likelihood Estimate
• Frequentist: the "most likely" value of $\theta$ to have generated the responses
• Consistency and efficiency depend on the selection method and item bank used
• Does not always exist
$$\hat{\theta}^{ML}(\mathbf{u}_{A_k}) = \arg\max_{\theta} L(\theta \mid \mathbf{u}_{A_k}, \boldsymbol{\xi}_{A_k})$$
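A rough illustration of both the estimator and its failure case, assuming a graded response model and a simple grid search in place of a proper optimizer. All names, the grid bounds, and the item values are assumptions for the sketch. When every response falls in the lowest category, the likelihood keeps increasing as the trait decreases, so the search returns the lower edge of the grid rather than a finite maximizer.

```python
import math

def grm_probs(theta, a, b):
    """Category probabilities for one item under an assumed graded response model."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [cum[r] - cum[r + 1] for r in range(len(b) + 1)]

def log_likelihood(theta, items, responses):
    """Log of the product over administered items (local independence)."""
    return sum(math.log(grm_probs(theta, a, b)[u - 1])
               for (a, b), u in zip(items, responses))

def mle_on_grid(items, responses, lo=-4.0, hi=4.0, step=0.01):
    """Grid-search stand-in for arg max over theta (illustrative only)."""
    n = int(round((hi - lo) / step))
    grid = [lo + step * i for i in range(n + 1)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical items: (discrimination, thresholds), m = 4 categories each.
items = [(1.5, [-1.0, 0.0, 1.0]), (1.0, [-0.5, 0.5, 1.5]), (2.0, [-1.5, -0.5, 0.5])]
theta_mixed = mle_on_grid(items, [2, 3, 2])   # interior maximum
theta_floor = mle_on_grid(items, [1, 1, 1])   # pinned to the grid boundary
```

In practice this is one reason a CAT may start with a Bayesian estimator or bound the estimate until the response pattern is mixed.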
Bayesian Framework
• $\theta$ is a random variable
• A distribution on $\theta$ describes knowledge prior to data collection (prior distribution)
• Update information about $\theta$ (trait) as data is collected (posterior distribution)
• Describes distribution of $\theta$ values instead of a point estimate
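Bayes Rule
• Combines information about $\theta$ (prior) with information from the data (likelihood)
$$\text{posterior:}\quad g(\theta \mid \mathbf{u}_{A_k}, \boldsymbol{\xi}_{A_k}) = \frac{f(\mathbf{u}_{A_k} \mid \theta, \boldsymbol{\xi}_{A_k})\, g(\theta)}{\int f(\mathbf{u}_{A_k} \mid \theta, \boldsymbol{\xi}_{A_k})\, g(\theta)\, d\theta}$$
$$\text{prior:}\quad g(\theta); \qquad \text{likelihood:}\quad L(\theta \mid \mathbf{u}_{A_k}, \boldsymbol{\xi}_{A_k}) = f(\mathbf{u}_{A_k} \mid \theta, \boldsymbol{\xi}_{A_k})$$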
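Bayes' rule can be approximated numerically by evaluating prior times likelihood on a grid and normalizing. The sketch below assumes a standard normal prior and a graded response model, neither of which is specified on the slide; the function names, grid, and item values are likewise illustrative. It also computes the posterior mean and variance from the normalized weights.

```python
import math

def grm_probs(theta, a, b):
    """Category probabilities for one item under an assumed graded response model."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [cum[r] - cum[r + 1] for r in range(len(b) + 1)]

def posterior_on_grid(items, responses, grid):
    """Normalized posterior weights on a grid, assuming a N(0, 1) prior."""
    weights = []
    for t in grid:
        prior = math.exp(-0.5 * t * t)          # unnormalized N(0, 1) density
        like = 1.0
        for (a, b), u in zip(items, responses):
            like *= grm_probs(t, a, b)[u - 1]
        weights.append(prior * like)
    total = sum(weights)  # plays the role of the integral in the denominator
    return [w / total for w in weights]

def mean_and_variance(grid, post):
    """Posterior mean and variance from normalized grid weights."""
    mean = sum(t * w for t, w in zip(grid, post))
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, post))
    return mean, var

grid = [-4.0 + 0.1 * i for i in range(81)]
prior_mean, prior_var = mean_and_variance(grid, posterior_on_grid([], [], grid))
# Hypothetical items, both answered in the highest category (4).
items = [(1.5, [-1.0, 0.0, 1.0]), (1.0, [-0.5, 0.5, 1.5])]
post_mean, post_var = mean_and_variance(grid, posterior_on_grid(items, [4, 4], grid))
```

With no responses the weights reproduce the prior; after two high-category responses the posterior mean moves upward and the posterior variance shrinks.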
• Choose the item that is "best" for the individual being tested
• Define "best"
– Most information about trait estimate
– Greatest reduction in expected variability of trait estimate
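Fisher's Information
• Information of a given item at a trait value
$$I_{\mathbf{U}_{A_k}}(\hat{\theta}_{k-1}) = E_{\mathbf{U}_{A_k}}\left[\left(\frac{\partial}{\partial\theta} \ln L(\theta \mid \mathbf{U}_{A_k}, \boldsymbol{\xi}_{A_k})\right)^{2}\right]_{\theta = \hat{\theta}_{k-1}}$$
$$= -E_{\mathbf{U}_{A_k}}\left[\frac{\partial^{2}}{\partial\theta^{2}} \ln L(\theta \mid \mathbf{U}_{A_k}, \boldsymbol{\xi}_{A_k})\right]_{\theta = \hat{\theta}_{k-1}}$$
$$= -\sum_{j \in A_k} E_{U_j}\left[\frac{\partial^{2}}{\partial\theta^{2}} \ln P_j(U_j \mid \theta, \boldsymbol{\xi}_j)\right]_{\theta = \hat{\theta}_{k-1}}$$
$$= \sum_{j \in A_k} I_{U_j}(\hat{\theta}_{k-1})$$
• Observed information:
$$J_{\mathbf{u}_{A_k}}(\hat{\theta}_{k-1}) = -\frac{\partial^{2}}{\partial\theta^{2}} \ln L(\theta \mid \mathbf{u}_{A_k}, \boldsymbol{\xi}_{A_k})\Big|_{\theta = \hat{\theta}_{k-1}}$$
Here $\hat{\theta}_{k-1}$ is the trait estimate after stage $k-1$ and $\boldsymbol{\xi}_j$ denotes the item parameters for item $j$.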
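For a polytomous item, the expected information can be computed as the sum over categories of the squared derivative of each category probability divided by that probability. This equals the negative expected second derivative because the category probabilities sum to one. The sketch below assumes a graded response model; the function names and parameter values are hypothetical.

```python
import math

def grm_item_information(theta, a, b):
    """Fisher information of one item at theta under an assumed graded
    response model: sum over categories of (dP_r/dtheta)^2 / P_r."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    # Slope of each boundary curve; the padded ends (1 and 0) have zero slope.
    slope = [a * c * (1.0 - c) for c in cum]
    info = 0.0
    for r in range(len(b) + 1):
        p = cum[r] - cum[r + 1]
        dp = slope[r] - slope[r + 1]
        info += dp * dp / p
    return info

def total_information(theta, items):
    """Local independence makes information additive over administered items."""
    return sum(grm_item_information(theta, a, b) for a, b in items)

# A two-category item reduces to the familiar a^2 * p * (1 - p).
info_binary = grm_item_information(0.0, 2.0, [0.0])
# Hypothetical polytomous items: (discrimination, thresholds).
items = [(1.5, [-1.0, 0.0, 1.0]), (1.0, [-0.5, 0.5, 1.5])]
info_total = total_information(0.3, items)
```

The two-category check gives 2² × 0.5 × 0.5 = 1 at the item's threshold, matching the standard two-parameter logistic result.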
Maximum Fisher's Information
• Myopic algorithm
• Pick the item $i_k$ at stage $k$ ($i_k \in R_k$) that maximizes Fisher's information at the current trait estimate (classically the MLE):
$$i_k = \arg\max_{j} \left\{ I_{\mathbf{U}_{A_{k-1}}, U_j}(\hat{\theta}^{ML}_{k-1}) : j \in R_k \right\}$$
$$= \arg\max_{j} \left\{ I_{\mathbf{U}_{A_{k-1}}}(\hat{\theta}^{ML}_{k-1}) + I_{U_j}(\hat{\theta}^{ML}_{k-1}) : j \in R_k \right\}$$
$$= \arg\max_{j} \left\{ I_{U_j}(\hat{\theta}^{ML}_{k-1}) : j \in R_k \right\}$$
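Because the information already collected does not depend on the candidate item, the selection reduces to ranking the remaining items by their own information at the current estimate. A minimal sketch, assuming a graded response model and a hypothetical three-item bank of two-category items; `select_mfi` and the bank layout are illustrative names, not the author's implementation.

```python
import math

def grm_item_information(theta, a, b):
    """Item information at theta under an assumed graded response model."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    slope = [a * c * (1.0 - c) for c in cum]
    info = 0.0
    for r in range(len(b) + 1):
        p = cum[r] - cum[r + 1]
        dp = slope[r] - slope[r + 1]
        info += dp * dp / p
    return info

def select_mfi(bank, administered, theta_hat):
    """Return the id of the remaining item with maximum information at theta_hat."""
    remaining = [j for j in bank if j not in administered]  # R_k = B \ A_{k-1}
    return max(remaining, key=lambda j: grm_item_information(theta_hat, *bank[j]))

# Hypothetical bank: item id -> (discrimination, thresholds).
bank = {1: (1.0, [-1.0]), 2: (1.0, [1.2]), 3: (1.0, [3.0])}
first = select_mfi(bank, set(), 1.3)       # item whose threshold is nearest 1.3
second = select_mfi(bank, {first}, 1.3)    # next best once the first is used
```

With equal discriminations, the item whose threshold is closest to the current estimate wins, so item 2 is chosen first and item 3 second.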
MFI - Selection
[Figure: MFI item selection illustrated at the current trait estimate $\hat{\theta} = 1.3$.]
Minimum Expected Posterior Variance (MEPV)
• Selects the item that yields the minimum predicted posterior variance given previous responses
• Uses predictive distribution
• Is a myopic Bayesian decision theoretic approach (minimizes Bayes risk)
• First described by Owen (1969, 1975)
Predictive Distribution
• Predict the probability of a response to an item given previous responses
$$p(u_i \mid \mathbf{u}_{A_k}) = \int P_i(u_i \mid \theta, \boldsymbol{\xi}_i)\, g(\theta \mid \mathbf{u}_{A_k}, \boldsymbol{\xi}_{A_k})\, d\theta$$
where $\boldsymbol{\xi}_i$ denotes the item parameters for item $i$.
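The integral can be approximated by averaging the candidate item's category probabilities over a gridded posterior. The sketch assumes a standard normal prior and a graded response model (both illustrative choices); all function names and item values are hypothetical.

```python
import math

def grm_probs(theta, a, b):
    """Category probabilities for one item under an assumed graded response model."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [cum[r] - cum[r + 1] for r in range(len(b) + 1)]

def posterior_on_grid(items, responses, grid):
    """Normalized posterior weights on a grid, assuming a N(0, 1) prior."""
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for (a, b), u in zip(items, responses):
            w *= grm_probs(t, a, b)[u - 1]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def predictive(candidate, items, responses, grid):
    """p(u_i = r | u_A) for r = 1..m: item probabilities averaged over the posterior."""
    a, b = candidate
    post = posterior_on_grid(items, responses, grid)
    return [sum(grm_probs(t, a, b)[r] * w for t, w in zip(grid, post))
            for r in range(len(b) + 1)]

grid = [-4.0 + 0.1 * i for i in range(81)]
candidate = (1.2, [-1.0, 0.0, 1.0])                      # hypothetical next item
given = [(1.5, [-1.0, 0.0, 1.0]), (1.0, [-0.5, 0.5, 1.5])]
pred_before = predictive(candidate, [], [], grid)        # prior predictive
pred_after = predictive(candidate, given, [4, 4], grid)  # after two high responses
```

After two highest-category responses, the predicted probability of the top category on the candidate item rises relative to the prior predictive.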
• In addition to prior and likelihood, specify a loss function (squared error loss):
$$l(\theta, \mathbf{u}_{A_{k-1}}, u_{i_k}) = \left(\theta - \hat{\theta}(\mathbf{u}_{A_{k-1}}, u_{i_k})\right)^{2}$$
Bayesian Decision Theory: Item Selection
• Optimal estimator for squared-error loss is posterior mean (EAP)
• Select item that minimizes Bayes risk:
$$\text{Bayes risk} = E_{U_j \mid \mathbf{u}_{A_{k-1}}}\, E_{\theta \mid \mathbf{u}_{A_{k-1}}, U_j}\left[\, l(\theta, \mathbf{u}_{A_{k-1}}, U_j) \right]$$
$$i_k = \arg\min_{j} \left\{ E_{U_j \mid \mathbf{u}_{A_{k-1}}}\left[ E_{\theta \mid \mathbf{u}_{A_{k-1}}, U_j}\left(\theta - \hat{\theta}^{EAP}(\mathbf{u}_{A_{k-1}}, U_j)\right)^{2} \right] : j \in R_k \right\}$$
– Inner expectation: (predicted) posterior variance given $U_j$
– Outer expectation: expectation over the predicted response to item $j$
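The step from Bayes risk to expected posterior variance can be spelled out. A brief sketch in the slide's notation, where $\hat{\theta}$ is any estimator based on $(\mathbf{u}_{A_{k-1}}, U_j)$:

```latex
\begin{aligned}
E_{\theta \mid \mathbf{u}_{A_{k-1}}, U_j}\!\left[(\theta - \hat{\theta})^{2}\right]
  &= \operatorname{Var}(\theta \mid \mathbf{u}_{A_{k-1}}, U_j)
   + \left(E[\theta \mid \mathbf{u}_{A_{k-1}}, U_j] - \hat{\theta}\right)^{2} \\
  &\ge \operatorname{Var}(\theta \mid \mathbf{u}_{A_{k-1}}, U_j),
\end{aligned}
```

with equality exactly when $\hat{\theta} = E[\theta \mid \mathbf{u}_{A_{k-1}}, U_j]$, the EAP. Substituting the EAP, the Bayes risk for candidate item $j$ becomes the posterior variance averaged over the predictive distribution of $U_j$:

```latex
E_{U_j \mid \mathbf{u}_{A_{k-1}}}\!\left[\operatorname{Var}(\theta \mid \mathbf{u}_{A_{k-1}}, U_j)\right]
  = \sum_{r=1}^{m_j} p(U_j = r \mid \mathbf{u}_{A_{k-1}})\,
    \operatorname{Var}(\theta \mid \mathbf{u}_{A_{k-1}}, U_j = r).
```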
Minimum Expected Posterior Variance (MEPV)
• Pick the item $i_k$ remaining in the bank at stage $k$ ($i_k \in R_k$) that minimizes the expected posterior variance (with respect to the predictive distribution):
$$i_k = \arg\min_{j} \left\{ \sum_{r=1}^{m_j} p(U_j = r \mid \mathbf{u}_{A_{k-1}})\, \operatorname{Var}(\theta \mid \mathbf{u}_{A_{k-1}}, U_j = r) : j \in R_k \right\}$$
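The criterion can be sketched end to end on a grid: for each remaining item, weight the posterior variance that each possible response would produce by that response's predictive probability. The code assumes a standard normal prior and a graded response model, and the bank, function names, and values are hypothetical; it is an illustration of the criterion, not the author's implementation.

```python
import math

def grm_probs(theta, a, b):
    """Category probabilities for one item under an assumed graded response model."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [cum[r] - cum[r + 1] for r in range(len(b) + 1)]

def posterior_on_grid(items, responses, grid):
    """Normalized posterior weights on a grid, assuming a N(0, 1) prior."""
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for (a, b), u in zip(items, responses):
            w *= grm_probs(t, a, b)[u - 1]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def variance(grid, post):
    """Variance of theta under normalized grid weights."""
    mean = sum(t * w for t, w in zip(grid, post))
    return sum((t - mean) ** 2 * w for t, w in zip(grid, post))

def select_mepv(bank, administered, responses, grid):
    """Return (item id, expected posterior variance) minimizing the MEPV criterion."""
    given = [bank[j] for j in administered]
    post = posterior_on_grid(given, responses, grid)
    best_item, best_value = None, float("inf")
    for j, (a, b) in bank.items():
        if j in administered:
            continue  # only items in R_k are candidates
        expected_var = 0.0
        for r in range(1, len(b) + 2):
            # Predictive probability p(U_j = r | u_A).
            p_r = sum(grm_probs(t, a, b)[r - 1] * w for t, w in zip(grid, post))
            # Posterior variance if response r were observed on item j.
            post_r = posterior_on_grid(given + [(a, b)], responses + [r], grid)
            expected_var += p_r * variance(grid, post_r)
        if expected_var < best_value:
            best_item, best_value = j, expected_var
    return best_item, best_value

grid = [-4.0 + 0.1 * i for i in range(81)]
bank = {1: (1.5, [-1.0, 0.0, 1.0]), 2: (0.5, [-1.0, 0.0, 1.0]),
        3: (2.0, [-0.5, 0.5, 1.5]), 4: (1.0, [2.0, 3.0, 4.0])}
administered, responses = [1], [3]
current_var = variance(grid, posterior_on_grid([bank[1]], responses, grid))
best_item, best_value = select_mepv(bank, administered, responses, grid)
```

By the law of total variance, the expected posterior variance for any candidate cannot exceed the current posterior variance, so `best_value` is never larger than `current_var`.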
Other Information Measures
• Weighted measures
– Maximum Likelihood Weighted Fisher's Information (MLWI)
– Maximum Posterior Weighted Fisher's Information (MPWI)
• Kullback-Leibler information: global information measure
Hybrid Algorithms
• Maximum Expected Information (MEI)
– Use observed information
– Predict information for next item
• Maximum Expected Posterior Weighted Information (MEPWI)
– Use observed information
– Predict information for next item
– Weight with posterior
– MEPWI MPWI
Mix-N-Match
• MAP with uniform prior to approximate MLE
• MFI using EAP instead of MLE (any point information function)
• Use EAP for item selection, but MLE for final trait estimate
• Item selection including burden
• Cheating detection
• Response times
Thank You!
References and Further Reading
Choi SW, Swartz RJ. (in press). "Comparison of CAT Item Selection Criteria for Polytomous Items." Applied Psychological Measurement.
Owen RJ. (1969). A Bayesian approach to tailored testing (Research Report 69-92). Princeton, NJ: Educational Testing Service.
Owen RJ. (1975). "A Bayesian sequential procedure for quantal response in the context of adaptive mental testing." Journal of the American Statistical Association, 70, 351-356.
van der Linden WJ. (1998). "Bayesian item selection criteria for adaptive testing." Psychometrika, 63(2), 201-216.
van der Linden WJ, Glas CAW (Eds.). (2000). Computerized Adaptive Testing: Theory and Practice. Dordrecht; Boston: Kluwer Academic.
MLE Properties
• Usually has desirable asymptotic properties
• Consistency and efficiency depend on selection criteria and item bank
• Finite estimate does not exist for repeated responses in categories 1 or $m$