1
Measuring the Link Between Learning and Performance
Eva L. Baker
UCLA Graduate School of Education & Information Studies
National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
Supported by the Naval Education and Training Command, the Office of Naval Research, and the Institute of Education Sciences
July 27, 2005 – Arlington, VA
The findings and opinions expressed in this presentation do not reflect the positions or policies of the Naval Education and Training Command, the Office of Naval Research, or the Institute of Education Sciences
2
Goals for the Presentation
Consider methods to strengthen the link between learning and performance
Use cognitively based assessment to structure and measure objectives during instruction, post-training and on the job
Emphasize design of a core architecture of reusable tools to build and measure effective, life-long competencies
Identify benefits and savings for the Navy
3
National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
Consortium of R&D performers led by UCLA: USC, Harvard, Stanford, RAND, UC Santa Barbara, Colorado
CRESST partners with other R&D organizations
4
National Center for Research on Evaluation, Standards, and Student Testing (CRESST) [Cont’d]
Mission
– R&D in measurement, evaluation, and technology leading to improvement in learning and performance settings
– Set the national agenda in R&D in the field
– Validity, usability, credibility
– Focus on rapidly usable solutions and tools
– Tools allow reduced cycle time from requirements to use
5
National Center for Research on Evaluation, Standards, and Student Testing (CRESST) [Cont’d]
President-Elect, AERA; 7 former presidents
Chair, Board on Testing and Assessment, National Research Council, The National Academies
Standards for Educational and Psychological Testing (1999)
Army Science Board, Defense Science Board task forces
History of DoD R&D: ONR, NETC, OSD, ARI, TRADOC, ARL, U.S. Marine Corps; NATO
Congressional councils and testimony
Multidisciplinary staff
State of Testing in the States
External, varying standards and tests from States
Range of targets (AYP)
Short timeline to serious sanctions
Raised scores are only “OK” evidence of learning
Are there incentives to measure “high standards”?
Are there incentives to create assessments that respond to quality instruction?
Growing enthusiasm for use of classroom assessment for accountability
Benchmark tests
Need for new ways to think about the relationship of accountability, long-term learning and performance
8
Language Check
Cognitive model: research synthesis used to create architecture for tests and measures (and for instruction)
Ontology: formal knowledge representation (in software) of a domain of knowledge, showing relationships—sources, experts, text, observation; used in tools for assessment design
Formative assessment: assessment information to pinpoint needs (gaps, misconceptions) for improvement in instruction or on-the-job
Transfer: ability to use knowledge in different contexts
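As an illustration of the ontology definition above, here is a minimal sketch in Python of a domain ontology stored as typed relations. The concept names and relation labels are assumptions, borrowed from the marksmanship knowledge map later in this deck, not CRESST's actual schema.

```python
# Minimal sketch: a domain ontology as (subject, relation, object) triples.
# Concepts and relation labels are illustrative, not CRESST's schema.
ONTOLOGY = [
    ("Trigger Control", "part of", "Fundamentals of Marksmanship"),
    ("Breath Control", "part of", "Fundamentals of Marksmanship"),
    ("Sight Alignment", "part of", "Aiming Process"),
    ("Wind Velocity", "affects", "Sight Adjustment"),
]

def related(concept, relation):
    """Return every subject linked to `concept` by `relation`."""
    return [s for (s, r, o) in ONTOLOGY if r == relation and o == concept]

# An assessment-design tool could query the ontology to find the content
# any item about the fundamentals must cover:
print(related("Fundamentals of Marksmanship", "part of"))
# -> ['Trigger Control', 'Breath Control']
```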
9
Learning Research
Efficient learning demands understanding of principles or big ideas (schema) and their relationships (mental models)
Learning design needs to take into account limits of working memory
Strong evidence for formative assessment: motivated practice with informative feedback
Assessment design needs to link pre-, formative, end-of-training, and refresher measures
Specification of full domain and potential transfer areas
10
Measure Design: Learning Research
Focus first on what is known about improved learning as the way to design measures: acquisition, retention, expertise, automaticity, transfer
Science-based, domain-independent cognitive demands as reusable objects, paired with content and context to achieve desired knowledge and skills
Criterion performance is based on expertise models (not simply rater judgments)
Design and arrangement of objects is architecture for learning and measurement
11
Measurement Purposes
System or Program:
– Needs sensing
– System monitoring
– Evaluation
– Improvement
– Accountability
Individual/Team:
– Selection/Placement
– Opt out
– Diagnosis
– Formative/Progress
– Achievement
– Certification/Career
– Skill retention
– Transfer of learning
5-Vector
12
Changes in Measurement/ Assessment Policy and Practices
From: One purpose, one measure
To: Multiple purposes—well-designed measure(s) with proficiency standards
Difficult to retrofit a measure designed for one purpose to serve another
Evidence of technical quality? Methods of aggregation? Scaling? Fairness?
13
5-Vector Implications
More than one purpose for data from tests, performance records, assessments
– improvement of trainee KSAs
– improvement of program effectiveness; evaluation of program or system readiness/effectiveness
– certification of individual/team performance
– personnel uses
Challenge: comparability
14
Multipurpose Measurement/ Metrics*
Place higher demands on technical quality of measures
Suggest more front-end design, to support adaptation and repurposing
Full representation (in ontologies or other software-supported structures) to link goals, enabling objectives, and content
A shift in the way to think about learning and training
* Metrics are measures in a framework for interpretation; a ratio of achievement to time, cost, or benchmarks
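To make the footnote concrete, here is a minimal sketch of a metric as a ratio of achievement to time, cost, and a benchmark; all names and values are invented for illustration.

```python
def learning_metrics(achievement, hours, cost, benchmark):
    """Wrap a raw achievement score in an interpretive frame (illustrative)."""
    return {
        "gain_per_hour": achievement / hours,           # achievement per training hour
        "gain_per_dollar": achievement / cost,          # achievement per dollar spent
        "ratio_to_benchmark": achievement / benchmark,  # relative to expected level
    }

# Invented example values:
print(learning_metrics(achievement=80, hours=40, cost=2000, benchmark=75))
# {'gain_per_hour': 2.0, 'gain_per_dollar': 0.04, 'ratio_to_benchmark': 1.066...}
```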
15
CRESST Model-Based Assessment
Reusable measurement objects to be linked to skill objects
First, depends upon cognitive analysis (domain independent, e.g., problem solving)
Essential to institute in a well-represented content or skill area (strategies and knowledge developed from experts)
May use different forms of cognitive analysis
May use different behavioral formats and templates:
– multiple choice, simulated performance, AAR, game settings, written responses, knowledge representations (maps), traces of procedures in technology, checklists
16
Cognitive Human Capital Model-Based Assessment
Content Understanding
Problem Solving
Teamwork and Collaboration
Metacognition
Communication
Learning
17
CRESST Approach
Summarize scientific knowledge about learning
Find cognitive elements that can be adapted and reused in different topics, subjects, and age levels. These elements make a “family” of models
Embed model in subject matter
Focus on “big” content ideas to support learning and application
Create templates, scoring schemes, training, and reporting systems (authoring systems available)
Conduct research (we do) to assure technical quality
[Image credits: U.S. Department of Energy Human Genome Program, http://www.ornl.gov/hgmis; http://www.carinasoft.com]
19
Generally, How HCMBA Works
Understanding a procedure:
– Knowing what the components of the procedure are
– Knowing when to execute the procedure, including symptom detection, and search strategies to confirm the problem
– Knowing principles underlying the procedure
– Knowing how to execute the procedure
– Knowing when the procedure is off task or not working
– Repair options
– Ability to explain the task completed AND describe steps for a different system (transfer)
Embed in content and context:
– Worked example
– Executing procedure with feedback loops
– Criterion testing—comparison benchmarks
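A minimal sketch of how these facets could be recorded and used to target feedback; the facet list is taken from the slide above, while the 0–2 scoring scale and mastery threshold are assumptions.

```python
# Facets of procedural understanding, copied from the slide above.
FACETS = [
    "components",         # what the components of the procedure are
    "when_to_execute",    # symptom detection, search strategies to confirm problem
    "principles",         # principles underlying the procedure
    "how_to_execute",
    "off_task_detection", # when the procedure is off task or not working
    "repair_options",
    "transfer",           # explain the task, describe steps for a different system
]

def needs_remediation(scores, threshold=2):
    """Return facets still below the (assumed) mastery threshold."""
    return [f for f in FACETS if scores.get(f, 0) < threshold]

scores = {"components": 2, "when_to_execute": 1, "principles": 0,
          "how_to_execute": 2, "off_task_detection": 2,
          "repair_options": 1, "transfer": 0}
print(needs_remediation(scores))
# -> ['when_to_execute', 'principles', 'repair_options', 'transfer']
```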
20
[Concept map: NEWTON'S LAWS. First Law: a body in motion remains in motion unless...; Second Law: force equals mass times acceleration (F = MA); Third Law: forces between interacting bodies are equal but opposite.]
Content/Skill Ontology
21
Examples of Model-Based Assessment
Risk Assessment (EDO)
– Cognitive demands of skill include problem identification, judging urgency, constraints, and costs
– Content demands involve prior knowledge in task, e.g., ship repair, knowledge needed to find alternatives, vendors, conflicting missions, etc., principles of optimization vs. cycle time
22
EDO Risk Management Simulation*
*CRESST/USC/BTL’s iRides
23
[Knowledge map figure: USMC Fundamentals of Rifle Marksmanship, CRESST/UCLA. The map links concepts such as the aiming process (sight alignment, sight picture, eye relief), trigger control, breath control, stable firing position, shooting positions (standing, kneeling, sitting, prone), effects of weather (wind, heat, cold, precipitation, sun glare), preventive maintenance, zeroing, target detection, and weapons handling/safety through relations such as “part of,” “type of,” “affects,” “requires,” “leads to,” and “improves.”]
Ontology of M-16 Marksmanship
24
Model-Based Example: M-16 Marksmanship
Marksmanship Inventory
Knowledge Assessment
Knowledge Mapping
Evaluation of Shooter Positions
Shot-to-Shot Analysis
Cognitive Demand
Fidelity
Current Work:
– Performance Sensing
– Diagnosis/Prescription
Building on the science of measures of performance . . .
. . . using technologies – sensors, ontologies, and Bayes nets – to identify knowledge gaps and determine remediation and feedback
25
M-16 Marksmanship Example
• Scenario: “The shooter is calling right but his rounds are hitting left of the target.”
• Task: “Diagnose and then correct the shooter's problem”
• Information sources:
– Position
– Target
– Shooter’s notebook
– Rifle
– Mental state, gear, fatigue, anxiety
– Wind flags
26
Bayesian Network Model of Rifle Marksmanship
[Diagram: sensing and assessment information from the marksmanship domain feeds a Bayesian network of performance and cognitive dependencies built over the ontology of marksmanship; the network produces probabilities of skill acquisition on different shooting variables; a recommender then delivers diagnosis, prescription, and individualized feedback and content for M-16 marksmanship improvement.]
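As a minimal sketch of the inference step the diagram describes, a one-node Bayesian update (probabilities invented for illustration, not the fielded model) turns the shot pattern from the slide-25 scenario into a posterior probability of a skill gap.

```python
# Scenario from slide 25: the shooter calls right, rounds hit left.
# All probabilities below are invented for illustration.
P_GAP = 0.30                # prior: shooter has a sight-alignment gap
P_LEFT_GIVEN_GAP = 0.80     # P(impacts left of call | gap)
P_LEFT_GIVEN_NO_GAP = 0.10  # P(impacts left of call | no gap)

def posterior_gap(observed_left):
    """P(gap | observation) by Bayes' rule over one latent skill node."""
    if observed_left:
        like_gap, like_no_gap = P_LEFT_GIVEN_GAP, P_LEFT_GIVEN_NO_GAP
    else:
        like_gap, like_no_gap = 1 - P_LEFT_GIVEN_GAP, 1 - P_LEFT_GIVEN_NO_GAP
    numerator = like_gap * P_GAP
    return numerator / (numerator + like_no_gap * (1 - P_GAP))

p = posterior_gap(observed_left=True)
print(f"P(sight-alignment gap | left impacts) = {p:.2f}")  # ~0.77
if p > 0.5:  # assumed decision threshold
    print("Recommend: sight-alignment remediation, dry-fire practice")
```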
27
Language Check
Validity: appropriate inferences are drawn from test(s)
Reliability: assessments give consistent and stable findings
Accuracy: respondents are placed in categories where they belong
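To illustrate the reliability and accuracy definitions above, a small sketch using simple percent agreement (data invented; not any particular CRESST statistic) compares category placements across two administrations and against known status.

```python
# Invented category placements for six respondents.
first  = ["pass", "pass", "fail", "pass", "fail", "pass"]  # administration 1
second = ["pass", "fail", "fail", "pass", "fail", "pass"]  # administration 2
truth  = ["pass", "pass", "fail", "pass", "pass", "pass"]  # true status

def agreement(a, b):
    """Fraction of respondents placed in the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

print(f"reliability (test-retest agreement): {agreement(first, second):.2f}")
print(f"accuracy (placement vs. true status): {agreement(first, truth):.2f}")
```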
28
CRESST Evidence-Based Validity Criteria for HC Assessment Models*
Cognitive complexity
Reliable or dependable
Accuracy of content/skill domain
Instructionally sensitive
Transfer and generalization
Learning focused
Validity evidence reported for each purpose
Fair
Credible
* Baker, O’Neil, & Linn, American Psychologist, 1993
29
Interplay of Model-Based Design, Development, and Validity Evidence
Experiment on prompt specificity
Studies of extended embedded assessments
Studies of rater agreement and training
Studies of collaborative assessment
Studies of utility across age ranges and subjects
Reusable models (without CRESST hands-on)
Scaling-up to thousands of examinees in a formal context
Experimental studies of prior knowledge
Criterion validity studies
Studies of generalizability within subject domains
Studies of L1 impact
Studies of OTL
Studies of instructor’s knowledge
Cost and feasibility studies*
Prediction of distal outcomes
Experimental studies of instructional sensitivity
30
Report Objects
31
Measure Authoring Screenshot
32
Summary of Tools
Tools include cognitive demands for particular classes of KSAs, to be applied in templates, objects, or other formats represented in authoring systems
Specific domain or task ontology (knowledge representation of content)
Ontological knowledge fills slots in the templates or objects
Commercial ontology systems available
Measurement authoring systems for HC Assessment Models (with evidence)
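A minimal sketch of that slot-filling step: a reusable, domain-independent template instantiated with content drawn from an ontology. The template wording and ontology entries are assumptions for illustration.

```python
# Reusable, domain-independent item template; the ontology supplies content.
TEMPLATE = ("Explain how {cause} affects {effect}, "
            "and describe how you would compensate for it.")

# Illustrative ontology fragment (subject, relation, object).
EDGES = [
    ("wind velocity", "affects", "sight adjustment"),
    ("breath control", "affects", "weapon movement"),
]

def instantiate(template, edges):
    """Fill the template's slots from each 'affects' relation."""
    return [template.format(cause=c, effect=e)
            for (c, r, e) in edges if r == "affects"]

for item in instantiate(TEMPLATE, EDGES):
    print(item)
```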
33
OUTCOME 1: Coherence
Coherent macro architecture for training, operations, and measurement
Coherent view from sailor, management, and system perspectives, supporting training, retraining, and assessment as they occur in new environments (distance learning); 5-Vector
34
OUTCOME 2: Cost Savings
Each model has reusable templates and objects, empirically validated, to match cognitive requirements
Freestanding measures do not need to be designed and revalidated anew for each task
Cost of design drops, cost of measures drops, throughout life cycle
Common framework supports retention and transfer of learning
Common HCA objects will simplify demands on the trainer
Multiple-purposed measures will need different reporting metrics but should have a common reporting framework
35
OUTCOME 3: More Trustworthy Evidence of Effectiveness, Readiness, or Individual or Team Performance
Common frameworks for assessment
Ontology (full representation of content)
Instructional strategies to support learning and transfer
Aggregation of outcomes using common metrics
Standard reporting formats for each assessment purpose
36
OUTCOME 4: Flexibility and Reduced Volatility Within a General Structure
Plenty of room for differential preferences by leaders for different configurations or different training goals
Evidence in Navy projects, engineering courses, academic topics, across trainees with different backgrounds, in different settings, with different levels of skill of instructor
Easy-to-use guidelines and tools as exemplars
37
Social/Organizational Capital in Knowledge Management: 5-Vector Implications
[Diagram elements: Trust, Efficacy, Networks, Effort, Transparency, Learning Organization, Teamwork Skills]
38
Revolution = Opportunities and Constraints
The Navy needs a common framework so that work can be easily integrated
The Navy needs common metrics to assess effectiveness, and tools to interpret the data
The Navy needs to provide vendors with a framework that permits integration of HCMA achievement and performance data from multiple sources