Introduction & Rationale
(Psychometric)
(Edumetric)
Carver, R.P. (1974). Two dimensions of tests: Psychometric and edumetric. American Psychologist, July, pp. 512-518.
Dimensions of tests: Psychometric vs. Edumetric
1.
2. Psychometric: P 50%, D (variance). Edumetric: gain/growth, 0 to 100% (sensitive to gain).
3. Psychometric: consistency, SEm; dependent on variances. Edumetric: alternate forms; NOT dependent on variances.
4. Psychometric: convergent and discriminant validity. Edumetric: sensitive to gain.
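A minimal sketch of the contrast above, using made-up scores (the data and variable names are illustrative assumptions, not taken from Carver). When every student reaches the top score after instruction, the posttest has no variance, so variance-based indices have nothing to work with, while the mean gain is large.

```python
from statistics import mean, pvariance

# Hypothetical scores on a 10-item test for five students (illustrative only).
pre_scores = [2, 3, 1, 2, 2]
post_scores = [10, 10, 10, 10, 10]

# Psychometric view: indices are built on variance between students.
post_variance = pvariance(post_scores)

# Edumetric view: how much the group gained from instruction.
mean_gain = mean(post_scores) - mean(pre_scores)

print("posttest variance:", post_variance)
print("mean gain:", mean_gain)
```

A test like this would look poor on variance-dependent criteria even though it registers the growth that instruction produced.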
E. L. Thorndike: If a thing exists, it exists in some amount. If it exists in some amount, it can be measured.
Paul Dressel (1957), Basic College Quarterly: A grade is an inadequate report of an inaccurate judgment by a biased and variable judge of the extent to which a student has attained an undefined level of mastery of an unknown proportion of an indefinite amount of material.
D. L. Stufflebeam: The purpose of evaluation is to improve, not to prove.
Measurement
Evaluation: judgment of merit, usually qualitative; measurement is quantitative.
Assessment or testing:
Ability tests: assess the performance or level of skills of individuals in well-defined subject areas. (Satterly, 1990)
Aptitude tests: indicate the probability with which new material will be learned. (Satterly, 1990)
Cognition: includes the processes of perception, thinking, reasoning, understanding, problem solving, and remembering. (Satterly, 1990)
Cognitive style tests: assess individuals' typical approaches to learning and thinking in a variety of tasks. (Satterly, 1990)
Learning ability tests: seek to measure the ability to respond to instruction, and so are measures of potential rather than achievement. (Hegarty, 1990)
1. Evaluation in the Teaching of Science; Evaluation of Science Teaching
1. Achievement Test 2. Aptitude Test 3. Intelligence Test
1. Preference Test 2. Belief
1. Aptitude Test 2. Intelligence Test
Placement Evaluation; Diagnostic Evaluation; Formative Evaluation; Summative Evaluation
1. Norm-Referenced Evaluation (norm group) 2. Criterion-Referenced Evaluation
NRE & CRE
NRE
CRE
NRT & CRT
1. Measurement, Test, Assessment, Evaluation
2. Summative Evaluation, Formative Evaluation
1.
2. --- (Construct)
3.
4.
Predicted Trends in Measurement and Evaluation of Science Instruction

1. From: Primarily group-administered tests
   To: A variety of administrative formats including large groups, small groups, and individuals.
2. From: Primarily paper-and-pencil tests
   To: A variety of test formats including pictorial and laboratory performance tests.
3. From: Primarily end-of-course summative assessment
   To: A variety of pretest, diagnostic, and formative types of measurements.
4. From: Primarily measurement of low-level cognitive outcomes
   To: The inclusion of higher-level cognitive outcomes (analysis, evaluation, critical thinking), as well as the measurement of affective (attitudes, interests, and values) and psychomotor outcomes.
5. From: Primarily norm-referenced achievement testing
   To: The inclusion of more criterion-referenced assessment, mastery testing, and self and peer evaluation.
6. From: Primarily measurement of facts and principles of science
   To: The inclusion of objectives related to the processes of science, the nature of science, and the interrelationship of science, technology, and society.
7. From: Primarily measurement of student achievement
   To: The inclusion of measuring the effects of programs, curricula, and teaching techniques.
8. From: Primarily teacher-made tests
   To: The combined use of teacher-made tests, standardized tests, research instruments, and items from collections assembled by teachers, projects, and other sources.
9. From: Primarily concern with total test scores
   To: Interest in sub-test performance, item difficulty, and discrimination, all aided by mechanical and computerized facilities.
10. From: Primarily a one-dimensional format of evaluation (e.g., a numerical or letter grade)
    To: A multidimensional system of reporting student progress with respect to such variables as concepts, processes, laboratory procedures, classroom discussion, and problem-solving skills.
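Item difficulty and discrimination, mentioned in trend 9, can be computed from scored responses. The sketch below is one common classroom approach, not a method prescribed by the source: difficulty as the proportion answering correctly, and discrimination as the upper-lower group difference D. The function names, the 27% group fraction, and the data are illustrative assumptions.

```python
def item_difficulty(item_responses):
    """Proportion of examinees answering the item correctly (1 = correct, 0 = wrong)."""
    return sum(item_responses) / len(item_responses)


def discrimination_index(item_responses, total_scores, fraction=0.27):
    """Upper-lower D: proportion correct in the top group minus the bottom group.

    Groups are formed from total test scores; `fraction` is the assumed
    share of examinees placed in each group.
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])  # lowest total first
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_responses[i] for i in upper) / k
    p_lower = sum(item_responses[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: ten examinees, one item, and their total test scores.
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
totals = [9, 8, 8, 7, 6, 5, 4, 3, 2, 1]

p = item_difficulty(item)
d = discrimination_index(item, totals)
print(f"difficulty p = {p:.2f}, discrimination D = {d:.2f}")
```

A positive D means high scorers answered the item correctly more often than low scorers did.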
National Science Education Standards
Assessment Standards
Assessments must be consistent with the decisions they are designed to inform.
  Assessments are deliberately designed.
  Assessments have explicitly stated purposes.
  The relationship between the decisions and the data is clear.

Achievement and opportunity to learn science must be assessed.
  Achievement data collected focus on the science content that is most important for students to learn.
  Opportunity-to-learn data collected focus on the most powerful indicators.
  Equal attention must be given to the assessment of opportunity to learn and to the assessment of student achievement.

The technical quality of the data collected is well matched to the decisions and actions taken on the basis of their interpretation.
  The feature that is claimed to be measured is actually measured.
  Assessment tasks are authentic.
  Students have adequate opportunity to demonstrate their achievements.
  Assessment tasks and methods of presenting them provide data that are sufficiently stable to lead to the same decisions if used at different times.

Assessment practices must be fair.
  Stereotype, language,

The inferences from assessments about student achievement and opportunity to learn must be sound.
The End!
1. Peer Evaluation
2. Multi-talent Evaluation
3. Evaluation in Learning Through Inquiry: IRA (Inquiry Role Approach)
4. Laboratory Work Evaluation
   1) Science Process Skills
   2)
   3)
5. Self-Evaluation
(Against)/
()/()(Programmed Instruction)
NRT-CRT.doc
Comparisons Between NRT and CRT

Attribute: State of the Art
  Norm-Referenced Test (NRT): Highly developed; technically sound.
  Criterion-Referenced Test (CRT): Mixed and variable; technology developing.

Attribute: Developmental Cost
  NRT: Major.
  CRT: Moderate to major.

Attribute: Content Validity & Coverage
  NRT: Based on a specified content domain, appropriately sampled, and tending to have fewer items per objective. Tends to be general and broad.
  CRT: Based on a specified content domain, appropriately sampled, and tending to have more items per objective. Tends to be specific and narrow.

Attribute: Score Interpretation
  NRT: In terms of a specified norm group (e.g. percentile ranks, grade equivalents).
  CRT: In terms of a specified criterion of proficiency (e.g. percent mastery).
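The two kinds of score interpretation can be sketched side by side. This is a minimal illustration, not a standard scoring procedure from the source: the function names, the mid-rank percentile formula, the 80% cutoff, and all data are assumptions chosen for the example.

```python
def percentile_rank(score, norm_scores):
    """NRT-style reading: percent of the norm group scoring below `score`,
    counting half of those who tie with it (mid-rank convention)."""
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


def percent_mastery(items_correct, items_total):
    """CRT-style reading: percent of the domain's items answered correctly."""
    return 100.0 * items_correct / items_total


def is_master(items_correct, items_total, cutoff=80.0):
    """Compare percent mastery with a criterion; the 80% cutoff is hypothetical."""
    return percent_mastery(items_correct, items_total) >= cutoff


# Hypothetical norm group and one student's raw score of 15 on a 25-item test.
norm_group = [10, 12, 14, 15, 15, 16, 18, 20, 22, 25]
rank = percentile_rank(15, norm_group)
mastery = percent_mastery(15, 25)
passed = is_master(15, 25)
print(rank, mastery, passed)
```

The same raw score is read in two ways: relative to other examinees, or relative to a fixed standard that does not depend on how anyone else performed.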
Comparisons Between NRT and CRT (continued)

Attribute: Item Development
  Norm-Referenced Test (NRT): Two main considerations: content validity and item discrimination.
  Criterion-Referenced Test (CRT): One main consideration: content validity.

Attribute: Standardized
  NRT: Yes.
  CRT: Usually.

Attribute: Sensitivity to Instruction
  NRT: Tends to be low to moderate, because of its general-purpose nature.
  CRT: Tends to be high, when closely matched to a particular instructional situation.

Attribute: Reliability
  NRT: High.
  CRT: Can be high, but sometimes hard to establish.

Attribute: Application
  NRT: To assess the effectiveness of given instructional treatments in achieving general instructional objectives.
  CRT: To assess the effectiveness of given instructional treatments in achieving specific instructional objectives.
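Sensitivity to instruction can be estimated per item by comparing performance before and after teaching. The sketch below uses a simple pre-to-post difference in proportion correct. This index is an assumption brought in for illustration and is not named in the source. The function names and data are hypothetical.

```python
def proportion_correct(responses):
    """Share of examinees answering an item correctly (1 = correct, 0 = wrong)."""
    return sum(responses) / len(responses)


def pre_post_difference(pre_responses, post_responses):
    """Post-instruction proportion correct minus pre-instruction proportion correct."""
    return proportion_correct(post_responses) - proportion_correct(pre_responses)


# Hypothetical responses of eight students to one item, before and after a unit.
pre = [0, 0, 1, 0, 0, 0, 1, 0]
post = [1, 1, 1, 1, 0, 1, 1, 1]

sensitivity = pre_post_difference(pre, post)
print("pre-to-post difference:", sensitivity)
```

Values near 1 suggest the item tracks what was taught. Values near 0 suggest the item is largely unaffected by the instruction.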