Wortham. Assessment in Early Childhood Education, 5e. © 2008 by Pearson Education, Inc. All Rights Reserved. CHAPTER 3: Standardized Tests How They Are Used, Designed, and Selected Assessment in Early Childhood Education Fifth Edition Sue C. Wortham
Mar 30, 2015
CHAPTER 3: Standardized Tests
How They Are Used, Designed, and Selected
Assessment in Early Childhood Education
Fifth Edition
Sue C. Wortham
Chapter Objectives
1. Understand how standardized tests are used with infants and young children.
2. Understand the process of standardized test design.
3. Understand the differences between test validity and test reliability.
4. Use resources and strategies for selecting and evaluating standardized tests.
5. Understand issues in selecting and using standardized tests.
Tests for Infants and Preschoolers
Tests for young children have questionable validity and reliability because of infants’ and preschoolers’ short attention spans.
Screening Tests for Preschoolers
• Detect if a child might have a developmental problem that needs further investigation.
• Tests screen various developmental domains.
Mental Retardation
• Mild (55-70 IQ)
• Moderate (40-54)
• Severe (25-39)
• Profound (< 25)
Poor adaptive skills
Before age 18
© 2006 The McGraw-Hill Companies, Inc. All rights reserved. Santrock, Educational Psychology, Second Edition, Classroom Update
Diagnostic Tests for Preschoolers
After a screening, tests for diagnostic assessment are administered (if needed).
• Adaptive behavior measures assess possible learning, social or motor disabilities.
  - Vineland Adaptive Behavior Scales (VABS)
• Intelligence tests measure learning potential.
Vineland Adaptive Behavior Scales II (VABS – II)
Parent/Caregiver Rating Form, Interview Form: ages 0 through 90 years
Teacher Rating Form: ages 3 through 21 years, 11 months
Achievement Tests Determine Instructional Effectiveness
National tests:
• compare student achievement across states to address higher standards for education
• identify poor instructional areas, pinpoint weaknesses in a state’s instructional program and facilitate improvements
State-developed tests are used in school districts to:
• determine each student’s progress
• provide diagnostic information on a child’s needs for future instruction
• describe student progress between and within schools
Standardized Tests and Teaching
Criteria for Evaluating Standardized Tests
What Is a Standardized Test?
The Nature of Standardized Tests
The Purposes of Standardized Tests
The Nature of Standardized Tests
Standardized Tests
• Have uniform procedures for administration and scoring.
• Allow comparison of student scores by age, grade level, local and national norms.
• Attempt to include material common across most classrooms.
Purposes of Standardized Tests
• Contribute to accountability
• Provide information about student progress and program placement
• Diagnose students’ strengths and weaknesses
• Provide information for planning and instruction
• Help in program evaluation
Steps in Standardized Test Design
The following steps ensure that the test achieves its goals and purposes:
• specify the purpose
• determine the format
• formulate objectives
• test construction: write, try out, and analyze items
• assemble the final form
• administer the final test form
• establish norms, determine the validity and reliability
• develop a test manual
Steps in Standardized Test Design: Specifying the Purpose
A clearly defined purpose is the framework for the construction of the test.
• It allows evaluation of the instrument when design and construction steps are completed.
• It helps explain what the test will measure, how the test results will be used, and who will take the test.
• It describes the population for whom the test is intended.
Steps in Standardized Test Design: Determining Test Format
Format decisions are based on the purpose of the test
and the characteristics of the test takers:
• how test items will be presented and how the test taker will respond
(e.g., tests designed for very young children are usually presented orally; paper and pencil tests for older students)
• given as a group test or as an individual test
Steps in Standardized Test Design: Test Construction
The test’s purpose guides:
• defining test objectives
• writing test items for each objective
• assembling experimental test forms
Steps in Standardized Test Design: Developing Experimental Forms
For a school achievement test:
• test content is delimited
• curriculum is analyzed to ensure that the test will reflect the instructional programs
• teachers and curriculum experts review content outlines and objectives for the test; later they review test items
• test items are written, edited, tried out, and rewritten or revised
• a preliminary test with selected test items is assembled for trial with a sample of students
Steps in Standardized Test Design: Developing Experimental Forms
• Experimental test forms resemble the final form
• Instructions are written for test administration
• The sample of people who take the preliminary test is similar to the population that will take the final form of the test
Steps in Standardized Test Design: Item Analysis in the Test Tryout Phase
Study each item’s:
• Difficulty level: how many test takers in the trial group answered the question correctly
• Discrimination: the extent to which the question distinguishes between test takers who did well or poorly; test takers who did well on the test should be more successful on the item than those who did poorly
• Grade progression in difficulty: for tests that are taken in different grades, in each successively higher grade, a greater percentage of students should answer it correctly
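As a rough illustration of the first two item statistics, the Python sketch below computes a difficulty index (the proportion of the trial group answering correctly) and a simple discrimination index (the proportion correct among high scorers minus the proportion correct among low scorers). The function name `item_analysis`, the 0/1 response matrix, and the choice to compare the top and bottom halves of the group are assumptions made for this example; test publishers may use other group sizes or statistics.

```python
def item_analysis(responses):
    """Return difficulty and discrimination for each item.

    responses: one row per test taker, each row a list of 0/1 item scores
    (1 = correct). Hypothetical data layout chosen for this sketch.
    """
    n_takers = len(responses)
    n_items = len(responses[0])

    # Rank test takers by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)

    # Assumption: compare the top half with the bottom half of the group.
    half = n_takers // 2
    upper, lower = ranked[:half], ranked[-half:]

    results = []
    for i in range(n_items):
        # Difficulty: proportion of all test takers who answered correctly.
        difficulty = sum(row[i] for row in responses) / n_takers
        # Discrimination: high scorers should succeed more often than low scorers.
        p_upper = sum(row[i] for row in upper) / half
        p_lower = sum(row[i] for row in lower) / half
        results.append({
            "item": i + 1,
            "difficulty": difficulty,
            "discrimination": p_upper - p_lower,
        })
    return results


if __name__ == "__main__":
    # Made-up tryout data: 6 test takers, 3 items.
    trial = [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 1],
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 0],
    ]
    for stats in item_analysis(trial):
        print(stats)
```

An item with a discrimination index near zero or below zero does not separate strong from weak test takers and would be a candidate for revision or removal.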
Steps in Standardized Test Design: The Final Test Form Is Assembled
• In item analysis, test items were revised or eliminated
• Test items that measure each test objective are selected for the test
• Alternative forms of one test must be equivalent in content and difficulty
• Test directions are finalized, with instructions for test administrators about the testing environment and testing procedures, and instructions for test takers
Steps in Standardized Test Design: Standardizing the Test
The final test form is administered to another, larger sample of test takers to acquire norm data.
Norms allow for comparisons of children’s test performance with the performance of a reference or norming group.
• The norming group is chosen to reflect the makeup of the population for whom the test is designed.
Evaluating Standardized Tests
Reliability – Are test scores stable, dependable and relatively free from error?
Validity – Does the test measure what it is purported to measure?
Correlation
Correlation coefficient (example: r = +0.37)
• The sign indicates the direction of the relationship (positive or negative)
• The number indicates the strength of the relationship (0.00 to 1.00)
A correlation coefficient is a statistical measure of the relationship between two variables.
Pearson correlation coefficient
• r = the Pearson coefficient
• r measures the amount that the two variables (X and Y) vary together (i.e., covary), taking into account how much they vary apart
• Pearson’s r is the most common correlation coefficient; there are others.
Computing the Pearson correlation coefficient
• To put it another way:
$$r = \frac{\text{degree to which X and Y vary together}}{\text{degree to which X and Y vary separately}}$$
• Or:
$$r = \frac{\text{covariability of X and Y}}{\text{variability of X and Y separately}}$$
Sum of Products of Deviations
• Measuring X and Y individually (the denominator):
  – compute the sums of squares for each variable ($SS_X$ and $SS_Y$)
• Measuring X and Y together: Sum of Products (SP)
  – Definitional formula:
$$SP = \sum (X - \bar{X})(Y - \bar{Y})$$
  – Computational formula:
$$SP = \sum XY - \frac{\sum X \sum Y}{n}$$
• n is the number of (X, Y) pairs
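Correlation Coefficient:
• the equation for Pearson’s r:
$$r = \frac{SP}{\sqrt{SS_X \, SS_Y}}$$
• expanded form:
$$r = \frac{\sum XY - \dfrac{\sum X \sum Y}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}$$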
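The computational formulas for SP, the sums of squares, and Pearson’s r can be sketched in Python as follows. The function name `pearson_r` and the sample scores are assumptions made for illustration; the sketch assumes both variables have some variability (otherwise the denominator is zero and it raises an error).

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the computational formulas for SP, SS_X, and SS_Y."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (X, Y) pairs of equal length")

    n = len(xs)  # number of (X, Y) pairs
    sum_x, sum_y = sum(xs), sum(ys)

    # SP = sum(XY) - (sum(X) * sum(Y)) / n
    sp = sum(x * y for x, y in zip(xs, ys)) - (sum_x * sum_y) / n
    # SS_X = sum(X^2) - (sum(X))^2 / n, and likewise for Y
    ss_x = sum(x * x for x in xs) - (sum_x ** 2) / n
    ss_y = sum(y * y for y in ys) - (sum_y ** 2) / n

    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")

    # r = SP / sqrt(SS_X * SS_Y)
    return sp / math.sqrt(ss_x * ss_y)


if __name__ == "__main__":
    # Made-up scores on two tests for five children.
    test_a = [10, 12, 15, 18, 20]
    test_b = [11, 14, 14, 19, 22]
    print(round(pearson_r(test_a, test_b), 2))
```

The sign of the returned value gives the direction of the relationship and its absolute value gives the strength.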
Correlation Coefficient Interpretation

Coefficient Range    Strength of Relationship
0.00 - 0.20          Practically None
0.20 - 0.40          Low
0.40 - 0.60          Moderate
0.60 - 0.80          High Moderate
0.80 - 1.00          Very High
Reliability
Test-retest: The extent to which a test yields the same score when given to a student on two different occasions
Alternate-forms: Two different forms of the same test are given on two different occasions to determine the consistency of the scores
Split-half: The test items are divided into two halves, and scores on the two halves are compared to determine test score consistency
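One way to picture the split-half approach is to score the odd-numbered and even-numbered items separately for each test taker and then correlate the two sets of half scores. The sketch below does this in Python; the odd/even split, the function names, and the response data are assumptions for illustration, and other ways of dividing the items are possible.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


def split_half_reliability(responses):
    """Correlate scores on odd-numbered items with scores on even-numbered items.

    responses: one row per test taker, each row a list of 0/1 item scores.
    """
    # Items 1, 3, 5, ... are at list positions 0, 2, 4, ...
    odd_scores = [sum(row[0::2]) for row in responses]
    # Items 2, 4, 6, ... are at list positions 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in responses]
    return pearson(odd_scores, even_scores)


if __name__ == "__main__":
    # Made-up data: 5 test takers, 6 items.
    sample = [
        [1, 1, 1, 1, 1, 0],
        [1, 1, 1, 0, 1, 1],
        [1, 0, 1, 1, 0, 0],
        [0, 1, 0, 0, 1, 0],
        [0, 0, 1, 0, 0, 0],
    ]
    print(round(split_half_reliability(sample), 2))
```

The same `pearson` helper could be applied to two administrations of one test (test-retest) or to scores on two forms (alternate-forms); only the pairs of scores being correlated change.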
Standard Error of Measurement
• an estimate of the amount of variation to be expected in test scores.
• If the reliability correlations are poor, the standard error of measurement will be large.
• The larger the standard error of measurement, the less reliable the test.
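The slide does not give a formula, but the link between reliability and the standard error of measurement can be illustrated with the standard classical test theory expression SEM = SD × √(1 − r), where SD is the standard deviation of test scores and r is the reliability coefficient. The sketch below assumes that formula; the function name and the example values (SD = 15, reliabilities of 0.91 and 0.75) are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), the classical test theory formula.

    sd: standard deviation of the test scores
    reliability: reliability coefficient between 0 and 1
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


if __name__ == "__main__":
    # Hypothetical test with a score standard deviation of 15.
    for r in (0.91, 0.75):
        sem = standard_error_of_measurement(15, r)
        print(f"reliability={r}: SEM={sem:.1f}")
```

With the same score spread, the lower reliability coefficient produces the larger standard error, which matches the two statements above.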
Variables That Affect the Standard Error of Measurement
The following affect test reliability:
• Population sample size: the larger the population sample, the more reliable the test
• Test length: longer tests are usually more reliable because there are more test items, resulting in a better sample of behaviors
• Range of test scores of the norming group: the wider the spread of scores, the more reliably the test can distinguish between good and poor students
Methods of Studying Reliability
Interrater Reliability: The consistency of a test to measure a skill, trait, or domain across examiners. This type of reliability is most important when responses are subjective or open-ended.
Types of Validity…
Content: Test’s ability to sample the content that is being measured
Criterion-related:
1. Concurrent: The relation between a test’s score and other available criteria
2. Predictive: The relationship between a test’s score and future performance
Construct: The extent to which there is evidence that a test measures a particular construct
Considerations in Choosing and Evaluating Tests
To select the best test to meet the developmental characteristics of young children, the following need to be considered:
• the purpose of the testing
• the characteristics to be measured
• how the test results will be used
• the qualifications of those who will interpret the scores and use the results
• any practical constraints: cost, time, ease of scoring and use of test results
Reviewing the Test Manual
The test manual should include information that is adequate for users to determine whether the test is practical and suitable for their purposes.
The manual should address the following:
• Purpose of the test
• Test design
• Establishment of validity and reliability
• Test administration and scoring