2010 R.M. Luecht 1
Operational CBT Implementation Issues:
Making It Happen
Richard M. Luecht, PhD
Educational Research Methodology
University of North Carolina at Greensboro
Tenth Annual Maryland Assessment Conference: COMPUTERS AND THEIR IMPACT ON STATE ASSESSMENT: RECENT HISTORY AND PREDICTIONS FOR THE FUTURE. 18-19 October, College Park MD
2010 R.M. Luecht 2
What do you get if you combine a psychometrician, a test development specialist, a computer hardware engineer, an LSI software engineer, a human factors engineer, a QC expert, and a cognitive psychologist?
A pretty useful individual to have around if you're implementing CBT!
2010 R.M. Luecht 3
A Naïve View of Operational CBT*
* Includes linear CBT, CAT, CMT, CAST, and other variants
[Figure: schematic of the naïve view. An Item Bank feeds an Item Selection/Test Assembly Algorithm that picks each next item to maximize information at the current ability estimate, reconstructed as $i_{k+1} := \arg\max_{i \in R_j}\{\hat{I}_i(\hat{\theta}_{u_1,\dots,u_k})\}$ given the responses so far, $U_j = u_1,\dots,u_k$. The examinee's response vector (e.g., $U_i = 010120113$) feeds an Ability Estimation/Scoring Algorithm, e.g., $\hat{\theta}_{\mathrm{MAP}} := \arg\max_{\theta}\, g(\theta \mid u_1,\dots,u_k)$. Items and responses flow over a Test Delivery Network (Ethernet, servers, data servers) between the item bank and the examinee.]
2010 R.M. Luecht 4
The Challenge of CBT
Moving more complex data more quickly, more securely, and more accurately from item generation through final scoring
Immediate responsiveness where possible
Re-engineering data management and processing systems, end-to-end
99.999999% accuracy: eliminating costly and error-prone human factors through automation and better QC/QA
2010 R.M. Luecht 5
Systems Impacted by Redesign and QC/QA
Item development and banking
Test assembly and composition
Examinee eligibility, registration, scheduling, and fees
Test delivery
Psychometrics and post-examination processing
  Item analysis, key validation, and quality assurance
  Test analysis
  Final scoring, reporting, and communication
2010 R.M. Luecht 6
Item and Item Set Repositories
2010 R.M. Luecht 7
Test Data Repositories
Test Unit Records
  Test unit identification
  Item list reference
  Timing data
  Navigation controls
  Presentation scripts
  Template reference
Resource Versions
  Version/release
  Test units reference
  Encryption key(s)
  Restructured test units (e.g., XML)
Item-Level Data
  Stimulus information (e.g., MCQ stem, a reading passage)
  Response display labels (e.g., distractors as labels for a check-box control)
  Scripts for interactivity
  Template references
  Content and other item attributes
    Content category codes
    Cognitive and other secondary classifications
    Linguistic features
2010 R.M. Luecht 11
Item-Level Data – cont.
Statistical item data
  Classical item statistics (p-values, biserial correlations, etc.)
  IRT statistics (1PL, 2PL, 3PL, GPCM parameter estimates)
  DIF statistics and other special indices
Operational data
  Reuse history
  Exposure rates and controls (for CAT)
  Equating status
Together, these fields form a single item record, as sketched below.
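To make the preceding slides concrete, here is a minimal sketch of one item-bank record combining the content, statistical, and operational attributes listed above; every field name and value is an illustrative assumption, not a schema from the talk.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ItemRecord:
    """One master record per item (illustrative fields only)."""
    item_id: str
    stem: str                      # stimulus information
    options: dict                  # response display labels, e.g. {"A": "...", ...}
    answer_key: str
    content_codes: list = field(default_factory=list)  # content category codes
    cognitive_level: str = ""      # secondary classification
    p_value: Optional[float] = None        # classical difficulty
    biserial: Optional[float] = None
    irt_params: dict = field(default_factory=dict)     # e.g. {"a": 1.1, "b": 0.4, "c": 0.18}
    reuse_history: list = field(default_factory=list)  # operational data
    exposure_rate: float = 0.0
    equating_status: str = "unequated"

# Hypothetical record; the item ID echoes the raw-results example later in the deck.
item = ItemRecord(
    item_id="SAFM0377",
    stem="Which of the following ...?",
    options={"A": "...", "B": "...", "C": "...", "D": "...", "E": "..."},
    answer_key="E",
    content_codes=["ALG.2"],
    irt_params={"a": 1.1, "b": 0.4, "c": 0.18},
)
```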
2010 R.M. Luecht 12
Test Unit Data
Object list to include (e.g., item identifiers for all items in the test unit)
Navigation functions, including presentation, review, and sequencing rules
Embedded adaptive mechanisms (score + selection)
Timing controls and other information (e.g., how the clock functions, time limit, etc.)
Title and instruction screens
2010 R.M. Luecht 13
Test Unit Data – cont.
Presentation template references
  Helm look-and-feel (navigation style, etc.)
  Functions (e.g., direction of cursor movement after Enter (↵) or Tab is pressed)
Reference and ancillary look-up materials
  Calculators
  Hyperlinks to other BLOBs
  Custom exhibits available to test takers
A combined test-unit record is sketched below.
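A hedged sketch of a test-unit record carrying the object list, navigation, timing, template, and ancillary information from the two slides above; every key name here is an assumption for illustration.

```python
# Illustrative test-unit record; field names are assumptions, not a standard.
# Unit and item IDs echo the raw-results example later in the deck.
test_unit = {
    "unit_id": "CAST2S1",
    "item_list": ["SAFM0377", "SAEB0549", "SAFM0378"],          # object list
    "navigation": {
        "allow_review": True,
        "sequencing": "fixed",                                   # sequencing rules
        "adaptive": {"scorer": "MLE", "selector": "max_info"},   # embedded adaptive mechanisms
    },
    "timing": {"limit_minutes": 35, "clock": "count_down"},     # timing controls
    "template_ref": "helm_v2",                                  # helm look-and-feel
    "ancillary": ["calculator", "exhibit_01"],                  # reference materials
    "screens": ["title", "instructions"],
}

print(test_unit["timing"]["limit_minutes"])   # 35
```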
2010 R.M. Luecht 14
Standard Hierarchical View of a “Test Form”
[Figure: tree diagram. Test Form A branches into Sections I, II, and III; the sections contain Group 1, Group 2, and Set 1; the groups and set contain Items 1–12.]
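The hierarchy in the figure maps naturally onto nested data. A minimal sketch, assuming an arbitrary placement of the twelve items, since the figure's exact assignment is not recoverable:

```python
# Nested representation of the "test form" hierarchy (item placement is illustrative).
test_form_A = {
    "Section I": {
        "Group 1": ["Item1", "Item2", "Item3", "Item4"],
        "Group 2": ["Item5", "Item6", "Item7", "Item8"],
    },
    "Section II": {
        "Set 1": ["Item9", "Item10", "Item11", "Item12"],
    },
    "Section III": {},
}

# Walking the hierarchy recovers the flat item list, e.g. for QC counts.
flat = [item
        for section in test_form_A.values()
        for unit in section.values()
        for item in unit]
print(len(flat))   # 12
```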
2010 R.M. Luecht 15
Examinee Data
Identification information
  Name and identification numbers
  Photo, digital signature, retinal scan information
  Address and other contact information
Demographic information
Eligibility-to-test information
  Jurisdiction
  Eligibility period
  Retest restrictions
2010 R.M. Luecht 16
Examinee Data – cont.
Scheduled test date(s)
Special accommodations required
Scores and score-reporting information
Testing history and exam blocking
Security history (e.g., previous irregular behaviors, flagged for cheating, indeterminate scores, or large score gains)
General correspondence
2010 R.M. Luecht 17
Interactions of Examinee and Items or Test Units
Primary information
  Final responses
  Captured actions/inactions (state and sequencing of actions)
Secondary information
  Cumulative elapsed time on the "unit"
  Notes, marks, or other annotations captured during testing
2010 R.M. Luecht 18
Response Processing in CBT
Response capturing agents convert examinee responses or actions to storable data representations
Example
  Essay, stored as:
  RichTextBox.Item001.text = "There were two important changes that characterized the industrial revolution. First, individuals migrated from rural to urban settings in order to work at new factories and in other industrial settings (geographic change). Second, companies began adopting mechanisms to facilitate mass production (changes in manufacturing procedures)."
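A minimal sketch of a response-capturing agent in the spirit of the example above: it converts a UI control's state or text into a storable key/value record. The function and control names are hypothetical.

```python
# Hypothetical capture of UI control states into storable representations.
def capture(item_id: str, control: str, value) -> dict:
    """Convert an examinee action on a control to a storable record."""
    return {"key": f"{control}.{item_id}", "value": value}

# An essay captured from a rich-text control:
essay = capture("Item001", "RichTextBox", "There were two important changes ...")
# A multiple-choice selection captured from a check-box control:
choice = capture("Item002", "CheckBox", {"state": "ON", "label": "C"})

print(essay["key"])   # RichTextBox.Item001
```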
2010 R.M. Luecht 22
Entering the Psychometric Zone: Data Components of Scoring
Evaluators
Responses
  Selections, actions, or inactions: item.response.state = control.state (ON or OFF)
  Entries: item.response.value = control.value
Answer expressions (rethinking IA is needed)
Answer keys
  Rubrics of idealized responses or patterns of responses
  Functions of other responses
Score evaluators process the responses
  Scoring evaluators convert the stored responses to numerical values, e.g., $f(\text{response}_{ij}, \text{answer key}_i) \rightarrow x_{ij} \in [0,1]$ (sketched below)
  Raw scoring or IRT scoring
  Aggregation and scaling of item-level numerical scores
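A sketch of score evaluators implementing $f(\text{response}_{ij}, \text{answer key}_i) \rightarrow x_{ij} \in [0,1]$; the dichotomous and pattern-match rules below are illustrative stand-ins for operational rubrics.

```python
# Illustrative score evaluators mapping stored responses to [0, 1].
def score_selected_response(response: str, key: str) -> float:
    """Dichotomous evaluator: exact match against the answer key."""
    return 1.0 if response == key else 0.0

def score_pattern(responses: dict, rubric: dict) -> float:
    """Partial-credit evaluator: proportion of rubric elements matched."""
    matched = sum(1 for k, v in rubric.items() if responses.get(k) == v)
    return matched / len(rubric)

x_ij = score_selected_response("E", "E")              # 1.0
x_kj = score_pattern({"blank1": "4", "blank2": "9"},
                     {"blank1": "4", "blank2": "7"})  # 0.5
```

The item-level values x_ij then feed raw or IRT scoring and the aggregation/scaling step named on the slide.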
2010 R.M. Luecht 23
Planning for Painless Data Exchanges and Conversions
Systems and subsystems need to exchange data on a regular basis, providing different views and field conversions
The hand-off must have several fool-proof QC steps (a reconciliation sketch follows this list)
  Verification of all inputs
  Conversion success 100% verified
  Reconciliation of all results, including counts, discrepancies, missing values, etc.
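A minimal sketch of one such fool-proof QC step: reconcile record counts and missing values after a conversion, and refuse the hand-off on any discrepancy. The function name and record layout are assumptions.

```python
# Illustrative reconciliation check for a data hand-off between systems.
def reconcile(source_records: list, converted_records: list) -> None:
    """Verify inputs, confirm 100% conversion, and reconcile counts."""
    if len(source_records) != len(converted_records):
        raise ValueError(
            f"count mismatch: {len(source_records)} in, "
            f"{len(converted_records)} out")
    missing = [r for r in converted_records if None in r.values()]
    if missing:
        raise ValueError(f"{len(missing)} converted records have missing values")
    # Only reaching this point counts as a verified hand-off.

reconcile([{"id": 1}, {"id": 2}],
          [{"id": 1, "x": 0.7}, {"id": 2, "x": 0.4}])
```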
2010 R.M. Luecht 24
Example of a (Partial) Examinee’s Test Results
Record:
testp>wang>marcus>>605533641>0A1CD9>93bw100175>1>>501>001>ENU>CB1_CAST105>>90>0>0>0>DTW>06/26/96>08:41:38>05:58:32>w10>2>apt 75>1000 soldiers field rd>north fayette>IN>47900>USA>1>1235552021>>NOCOMPANYNAME>0>>>>>1>1235551378>>0>0>35>>142/218/0/u>1>-1>7>CBSectionI.12>CB1>s>p>0>36>>72/108/0/u>Survey015>survey15>s>p>0>0>>0/0/0/u>CBSectionI>CAST2S1>s>p>0>28>>28/62/0/u>CBSectionII>CAST2S4>s>p>0>42>>42/48/0/u>CBSectionII>CAST2S3>s>p>0>0>>0/0/0/u>CBSectionII>CAST2S2>s>p>0>0>>0/0/0/u>Survey016>survey2>s>p>0>0>>0/0/0/u>0>372>SAFM0377>2>0>E>5>s>E>1>76>>SAEB0549>2>0>D>5>s>A>0>68>>SAFM0378>2>0>A>5>s>A>1>72>>SAAB1653>2>0>C>5>s>D>0>102>>SABA8868>2>0>B>5>s>C>0>85>>SCAA1388>2>0>E>8>s>E>1>53>>SAAA8447>2>0>D>5>s>E>0>55>>SAAB1934>2>0>A>5>s>A>1>60>>SAAB2075>2>0>E>5>s>E>1>136>>SADA7710>2>0>D>5>s>D>1>40>>SABB1040>2>0>B>5>s>E>0>46>>SCAA1396>2>0>H>10>s>A>0>93>>SACA8906>2>0>D>5>s>E>0>75>>SADA8116>2>0>C>5>s>D>0>53>>SADA8673>2>0>B>5>s>B>1>41>>SACA8626>2>0>B>5>s>D>0>48>>SAFM0374>2>0>C>5>s>D>0>80>>SABA6397>2>0>A>5>s>A>1>110>>SAAB1088>2>0>C>5>s>C>1>55>>SACA8455>2>0>D>4>s>D>1>73>>SAAB1667>2>0>C>5>s>C>1>44>>SAAJ7633>2>0>C>5>s>C>1>89>>SABA5745>2>0>D>5>s>A>0>43>>SCAA1389>2>0>B>8>s>H>0>61>>SADA8650>2>0>A>5>s>C>0>39>>SAFB0112>2>0>C>5>s>C>1>132>>SAAB2513>2>0>B>5>s>B>1>120>>SAFA9248>2>0>E>5>s>A>0>77>>SABJ1042>2>0>D>5>s>C>0>112>>SACJ5894>2>0>C>5>s>D>0>82>>SAAA0410>2>0>D>5>s>E>0>89>>SAAB1681>2>0>C>5>s>C>1>88>>SAFM0365>2>0>A>5>s>A>1>65>>SAEA8980>2>0>A>5>s>B>0>52>>
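The raw record above is ">"-delimited. A hedged tokenizing sketch; the positional field layout is vendor-specific and is not reconstructed here.

```python
# Tokenize a '>'-delimited raw results record; the field semantics are
# specific to the vendor's layout and are not reconstructed here.
raw = "testp>wang>marcus>>605533641>0A1CD9>93bw100175>1>>501"
fields = raw.split(">")
print(fields[:6])   # ['testp', 'wang', 'marcus', '', '605533641', '0A1CD9']
# Empty strings mark empty fields; a downstream view maps positions to variables.
```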
Extracting Data Views
A data view is a set of restructuring functions that produce a data set from raw data
Views begin with a query
  Usually results in a formatted file structure
  Graphing functions produce graphic data sets
  Database functions produce database record sets
Multiple views are possible for different uses (e.g., test assembly, item analysis, calibrations, scoring)
Well-designed views are reusable (see the sketch after this list)
  Standardized queries of the database(s)
  Each view acts as a template with "object" status
  Views can be manipulated by changing their properties (e.g., data types, presentation formats)
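A sketch of a reusable view in this sense, using Python's built-in sqlite3: one standardized query whose record set is re-rendered into different formats by changing a property of the view. Table and column names are illustrative.

```python
import sqlite3

# A reusable "view": one standardized query, multiple output renderings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam (person_id TEXT, test_id TEXT, status TEXT)")
conn.executemany("INSERT INTO exam VALUES (?, ?, ?)",
                 [("107555", "TST0183", "F"), ("517101", "TST0181", "F")])

VIEW_QUERY = "SELECT person_id, test_id, status FROM exam WHERE status = ?"

def render(rows, fmt="csv"):
    """Render the same record set for different uses (a property of the view)."""
    if fmt == "csv":
        return "\n".join(",".join(r) for r in rows)
    if fmt == "fixed":
        return "\n".join(f"{r[0]:<8}{r[1]:<10}{r[2]}" for r in rows)

rows = conn.execute(VIEW_QUERY, ("F",)).fetchall()
print(render(rows, "csv"))      # delimited output
print(render(rows, "fixed"))    # fixed-column output from the same query
```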
2010 R.M. Luecht 28
Types of Data Files (Views)
Implicit files: the file format implies a structure for the data
  Flat files with fixed columns (headers optional)
  Comma-, tab-, or other delimited files
Explicit files: variables, data types, formats, and the actual data are explicitly structured
  Database files: dBASE, Oracle, Access, etc.
  Row-column worksheets with "variable sets" (e.g., Excel in data mode, SPSS)
  XML and SGML
Both styles are contrasted in the sketch below.
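A short sketch contrasting the two styles on the same records: an implicit comma-delimited file, where structure is only implied by column order, versus an explicit XML file, where the variables are declared in the markup. The record fields are illustrative.

```python
import csv
import io
import xml.etree.ElementTree as ET

records = [{"id": "Item_21801", "p": "0.62"}, {"id": "Item_24601", "p": "0.48"}]

# Implicit: structure is implied by commas and column order.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "p"])
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())

# Explicit: variables and structure are declared in the markup itself.
root = ET.Element("items")
for r in records:
    ET.SubElement(root, "item", id=r["id"], p=r["p"])
print(ET.tostring(root, encoding="unicode"))
```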
The P×I (Person × Item) Query
SELECT Examinee.Records IF (Query_Conditions = TRUE)
Exam.PersonID  Exam.TestID  Exam.Status  Exam.Date
107555         TST0183      F            11-Sep-04
517101         TST0181      F            11-Sep-04
670048         TST0181      F            13-Sep-04
758735         TST0183      F            13-Sep-04
754364         TST0183      F            13-Sep-04
827960         TST0183      R            13-Sep-04
619834         TST0183      F            13-Sep-04
615233         TST0182      R            13-Sep-04
429336         TST0182      F            14-Sep-04
Treatment of (Score_Status=1) items: INCLUDE Items and Responses
Item File: MasterItemFile.DAT
  NI = 3857, Total Read = 3857, Excluded = 0
===========================================================
Active_Examinee_Responses File = ActiveExamineeResponse.txt
  No. IDs (from Active_Examinee_Test_Form) = 1687
  File size (examinee transactions) = 506100
  No. nonblank records input = 506100
  No. records with unmatched items = 0
Forms = 8
1687 scored response records saved to Data-ResponseFile-Scored.RSP
1687 raw response records saved to Data-ResponseFile-Raw.RAW
ITEM LISTING and FORM ASSIGNMENTS DETECTED
ID Identifier  NOpt  Opts   N-Count  NFrm  Forms
Item_21801     5     ABCDE  41       1     04
Item_24601     5     ABCDE  97       3     01 03 07
Item_29801     5     ABCDE  97       2     02 07
: <only partial records included to conserve space>
MISMATCH SUMMARY
----------------
NO unmatched item IDs to FORMS
NO unmatched item IDs to RESPONSE RECORDS
Item Counts by Form

Form ID             01   02   03   04   05   06   07   08
01:SampleForm001   300    0    0   37    0   37    0    0
02:SampleForm002     0  300    0    0  200    0    0    0
03:SampleForm003     0    0  300    0    0    0  200  200
04:SampleForm004    37    0    0  300    0  200    0    0
05:SampleForm005     0  200    0    0  300    0    0    0
06:SampleForm006    37    0    0  200    0  300    0    0
07:SampleForm007     0    0  200    0    0    0  300  200
08:SampleForm008     0    0  200    0    0    0  200  300
2010 R.M. Luecht 40
Follow the Single-Source Principle
A unique master record should exist for every entity
Examinees registered/eligible to test
Items
Item sets
Modules, testlets, or groups
Test forms
Changes should be made to the master and forward-propagated for all processing, as sketched below
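A minimal sketch of the principle: edits go only to the master record and are then forward-propagated to every consuming system; downstream copies are never edited directly. The system names and record layout are assumptions.

```python
# Single-source principle: edit the master, propagate to all consumers.
master_items = {"SAFM0377": {"answer_key": "E", "status": "active"}}
downstream = {"assembly": {}, "scoring": {}, "reporting": {}}

def update_master(item_id: str, **changes) -> None:
    """Apply changes to the master record, then forward-propagate."""
    master_items[item_id].update(changes)
    for system in downstream:                 # every consumer gets the same view
        downstream[system][item_id] = dict(master_items[item_id])

update_master("SAFM0377", answer_key="D")     # e.g., a change after key validation
assert downstream["scoring"]["SAFM0377"]["answer_key"] == "D"
```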
2010 R.M. Luecht 41
Example of Single Source
[Figure: contrasting data-flow diagrams; redundant, independently edited copies ("NO!") versus a single master source propagated to all systems ("YES!!").]
2010 R.M. Luecht 42
Ignorable Missing Data?
Very little data is missing completely at random, limiting the legitimate use of imputation
Some preventable causes of missing data
  Lost records due to crashes/transmission errors
  Corrupted response capturing/records
  Purposeful omits
  Running out of time/motivation to finish
2010 R.M. Luecht 43
Challenges of Real-Time Test Assembly (CAT or LOFT)
Real-time item selection requires high bandwidth and fast servers, and pre-fetching reduces precision
A "test form" does not exist until the examination is complete
QC of test forms is very difficult, except by audit sampling and careful refinement of test specifications (objective functions/constraints); a sampling sketch follows below
QC of the data against "known" test-form entities is NOT possible
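Audit sampling can itself be automated. A sketch, assuming a single illustrative content-count constraint checked against a random sample of administered (post hoc) forms:

```python
import random

# Audit-sample delivered forms and check them against the test specifications;
# the minimum-algebra-items constraint below is an illustrative assumption.
def audit(forms: dict, min_algebra_items: int = 3, sample_size: int = 2):
    """Check a random sample of delivered forms against one content constraint."""
    failures = []
    for form_id in random.sample(list(forms), sample_size):
        n_alg = sum(1 for item in forms[form_id] if item["content"] == "ALG")
        if n_alg < min_algebra_items:
            failures.append(form_id)
    return failures

forms = {"F001": [{"content": "ALG"}] * 3,
         "F002": [{"content": "GEO"}] * 3}
print(audit(forms))   # e.g., ['F002'] when that form is sampled
```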