Wortham. Assessment in Early Childhood Education, 5e. © 2008 by Pearson Education, Inc. All Rights Reserved. CHAPTER 3: Standardized Tests How They Are.

CHAPTER 3: Standardized Tests: How They Are Used, Designed, and Selected

Assessment in Early Childhood Education

Fifth Edition

Sue C. Wortham


Chapter Objectives

1. Understand how standardized tests are used with infants and young children.

2. Understand the process of standardized test design.

3. Understand the differences between test validity and test reliability.

4. Use resources and strategies for selecting and evaluating standardized tests.

5. Understand issues in selecting and using standardized tests.


Tests for Infants and Preschoolers

Tests for young children have questionable validity and reliability because of infants’ and preschoolers’ short attention spans.


Screening Tests for Preschoolers

• Detect if a child might have a developmental problem that needs further investigation.

• Tests screen various developmental domains.

Mental Retardation

• Mild (55-70 IQ)

• Moderate (40-54)

• Severe (25-39)

• Profound (< 25)

Poor adaptive skills

Before age 18

© 2006 The McGraw-Hill Companies, Inc. All rights reserved. Santrock, Educational Psychology, Second Edition, Classroom Update


Diagnostic Tests for Preschoolers

After a screening, tests for diagnostic assessment are administered (if needed).

• Adaptive behavior measures assess possible learning, social or motor disabilities.

  - Vineland Adaptive Behavior Scales (VABS)

• Intelligence tests measure learning potential.

Vineland Adaptive Behavior Scales II (VABS-II)

Parent/Caregiver Rating Form, Interview Form: ages 0 through 90 years

Teacher Rating Form: ages 3 years through 21 years, 11 months


Achievement Tests Determine Instructional Effectiveness

National tests:

• compare student achievement across states to address higher standards for education

• identify poor instructional areas, pinpoint weaknesses in a state’s instructional program, and facilitate improvements

State-developed tests are used by school districts to:

• determine each student’s progress

• provide diagnostic information on a child’s needs for future instruction

• describe student progress between and within schools

Standardized Tests and Teaching

Criteria for Evaluating Standardized Tests

What Is a Standardized Test?

The Nature of Standardized Tests

The Purposes of Standardized Tests

The Nature of Standardized Tests

Standardized Tests

• Have uniform procedures for administration and scoring.

• Allow comparison of student scores by age, grade level, local and national norms.

• Attempt to include material common across most classrooms.

http://www.eduref.org/cgi-bin/print.cgi/Resources/Evaluation/Testing/Standardized_Tests.html

Purposes of Standardized Tests

• Contribute to accountability

• Provide information about student progress and program placement

• Diagnose students’ strengths and weaknesses

• Provide information for planning and instruction

• Help in program evaluation

http://www.aasa.org/publications/sa/1998_12/herman.htm


Steps in Standardized Test Design

The following steps ensure that the test achieves its goals and purposes:

• specify the purpose

• determine the format

• formulate objectives

• test construction: write, try out, and analyze items

• assemble the final form

• administer the final test form

• establish norms, determine the validity and reliability

• develop a test manual


Steps in Standardized Test Design: Specifying the Purpose

A clearly defined purpose is the framework for the construction of the test.

• It allows evaluation of the instrument when design and construction steps are completed.

• It helps explain what the test will measure, how the test results will be used, and who will take the test.

• It describes the population for whom the test is intended.


Steps in Standardized Test Design: Determining Test Format

Format decisions are based on the purpose of the test and the characteristics of the test takers:

• how test items will be presented and how the test taker will respond (e.g., tests designed for very young children are usually presented orally; paper and pencil tests for older students)

• given as a group test or as an individual test


Steps in Standardized Test Design: Test Construction

The test’s purpose guides:

• defining test objectives

• writing test items for each objective

• assembling experimental test forms


Steps in Standardized Test Design: Developing Experimental Forms

For a school achievement test:

• test content is delimited

• curriculum is analyzed to ensure that the test will reflect the instructional programs

• teachers and curriculum experts review content outlines and objectives for the test; later they review test items

• test items are written, edited, tried out, and rewritten or revised

• a preliminary test with selected test items is assembled for trial with a sample of students


Steps in Standardized Test Design: Developing Experimental Forms

• Experimental test forms resemble the final form

• Instructions are written for test administration

• The sample of people who take the preliminary test is similar to the population that will take the final form of the test


Steps in Standardized Test Design: Item Analysis in The Test Tryout Phase

Study each item’s:

• Difficulty level: how many test takers in the trial group answered the question correctly

• Discrimination: the extent to which the question distinguishes between test takers who did well or poorly; test takers who did well on the test should be more successful on the item than those who did poorly

• Grade progression in difficulty: for tests that are taken in different grades, in each successively higher grade, a greater percentage of students should answer it correctly
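The difficulty and discrimination checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample data, and the use of the top and bottom 27% of scorers as comparison groups are not from the slides (the 27% split is one common convention; other fractions are also used).

```python
from statistics import mean

def item_difficulty(item_scores):
    """Proportion of the trial group that answered the item correctly.

    item_scores holds 1 for a correct answer and 0 for an incorrect one.
    """
    return mean(item_scores)

def item_discrimination(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination index (assumed convention, not from the source).

    Returns the proportion correct among the highest total scorers minus
    the proportion correct among the lowest total scorers.
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each comparison group
    order = sorted(range(n), key=lambda i: total_scores[i])  # low to high
    low_group, high_group = order[:k], order[-k:]
    p_high = mean(item_scores[i] for i in high_group)
    p_low = mean(item_scores[i] for i in low_group)
    return p_high - p_low

# Hypothetical tryout data: one item's results and each child's total score.
item = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
totals = [38, 35, 33, 30, 28, 25, 22, 20, 15, 12]

print(item_difficulty(item))              # 0.5
print(item_discrimination(item, totals))  # about 0.67
```

A positive index means that children who did well on the whole test were more successful on the item, which is the pattern the slide describes; an index near zero or below would flag the item for revision.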


Steps in Standardized Test Design: The Final Test Form Is Assembled

• During item analysis, test items were revised or eliminated

• Test items that measure each test objective are selected for the test

• Alternative forms of a test must be equivalent in content and difficulty

• Test directions are finalized, with instructions for test administrators about the testing environment and testing procedures, and instructions for test takers


Steps in Standardized Test Design: Standardizing the Test

The final test form is administered to another, larger sample of test takers to acquire norm data.

Norms allow for comparisons of children’s test performance with the performance of a reference or norming group.

• The norming group is chosen to reflect the makeup of the population for whom the test is designed.
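One way to picture a norm-based comparison is a percentile rank: where a child's raw score falls relative to the norming group's scores. The slides do not name a specific statistic, so the sketch below is an assumption; the function name, the half-credit treatment of tied scores, and the data are illustrative only.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norming group scoring below `score`.

    Scores equal to `score` count as half (one common convention,
    assumed here rather than taken from the source).
    """
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)

# Hypothetical raw scores from a (very small) norming group.
norm_group = [12, 15, 18, 20, 20, 22, 25, 27, 30, 31]

print(percentile_rank(22, norm_group))  # 55.0
```

A real norming sample is far larger and is drawn to reflect the intended population, which is what makes the comparison meaningful.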

Evaluating Standardized Tests

Reliability – Are test scores stable, dependable and relatively free from error?

Validity – Does the test measure what it is purported to measure?

Correlation

The correlation coefficient is a statistical measure of the relationship between two variables.

Example: r = +0.37

• The sign indicates the direction of the relationship (positive or negative)

• The number indicates the strength of the relationship (0.00 to 1.00)

Pearson correlation coefficient

• r = the Pearson coefficient

• r measures the amount that the two variables (X and Y) vary together (i.e., covary) taking into account how much they vary apart

• Pearson’s r is the most common correlation coefficient; there are others.

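Computing the Pearson correlation coefficient

• To put it another way:

$$ r = \frac{\text{degree to which } X \text{ and } Y \text{ vary together}}{\text{degree to which } X \text{ and } Y \text{ vary separately}} $$

• Or

$$ r = \frac{\text{covariability of } X \text{ and } Y}{\text{variability of } X \text{ and } Y \text{ separately}} $$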

Sum of Products of Deviations

• Measuring X and Y individually (the denominator):

– compute the sums of squares for each variable

• Measuring X and Y together: Sum of Products

– Definitional formula:

$$ SP = \sum (X - \bar{X})(Y - \bar{Y}) $$

– Computational formula:

$$ SP = \sum XY - \frac{\sum X \sum Y}{n} $$

• n is the number of (X, Y) pairs

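Correlation Coefficient:

• the equation for Pearson’s r:

$$ r = \frac{SP}{\sqrt{SS_X \, SS_Y}} $$

• expanded form:

$$ r = \frac{\sum XY - \frac{\sum X \sum Y}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}} $$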
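As a worked illustration, Pearson's r can be computed as the sum of products (SP) divided by the square root of the product of the two sums of squares, using the computational forms. The function name and the five data pairs below are made up for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the computational formulas for SP, SS_X and SS_Y."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    sum_x, sum_y = sum(xs), sum(ys)
    # SP = sum(XY) - (sum X)(sum Y) / n
    sp = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # SS = sum(X^2) - (sum X)^2 / n, computed for each variable
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_y = sum(y * y for y in ys) - sum_y ** 2 / n
    return sp / sqrt(ss_x * ss_y)

# Hypothetical paired scores (X, Y) for five children.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Here SP = 6, SS_X = 10, SS_Y = 6, so r = 6 / sqrt(60).
print(round(pearson_r(x, y), 2))  # 0.77
```

The function divides by zero if either variable has no variability (every score identical), in which case r is undefined.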

Correlation Coefficient Interpretation

Coefficient Range    Strength of Relationship

0.00 - 0.20          Practically None

0.20 - 0.40          Low

0.40 - 0.60          Moderate

0.60 - 0.80          High Moderate

0.80 - 1.00          Very High

Reliability

Test-retest: The extent to which a test yields the same score when given to a student on two different occasions

Alternate-forms: Two different forms of the same test are given on two different occasions to determine the consistency of the scores

Split-half: The test items are divided into two halves, and scores on the two halves are compared to determine test score consistency
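A split-half check can be sketched by scoring the two halves of the test separately for each child and correlating the half scores. The odd/even split, the helper names, and the response data below are assumptions made for illustration; the slides do not say how the halves should be formed.

```python
from math import sqrt

def _pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / sqrt(ss_x * ss_y)

def split_half_correlation(responses):
    """Correlate scores on odd-position items with scores on even-position items.

    responses: one row per test taker, 1 = correct and 0 = incorrect per item.
    """
    first_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return _pearson_r(first_half, second_half)

# Hypothetical results for six children on a six-item test.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]

print(round(split_half_correlation(responses), 3))  # 0.853
```

The same correlation step applies to test-retest and alternate-forms reliability; only the two lists of scores being compared change.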


Standard Error of Measurement

• an estimate of the amount of variation to be expected in test scores.

• If the reliability correlations are poor, the standard error of measurement will be large.

• The larger the standard error of measurement, the less reliable the test.
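The link between reliability and the standard error of measurement can be made concrete with a commonly used formula, SEM = SD × √(1 − r), where SD is the standard deviation of the test scores and r is the reliability coefficient. The slides do not state a formula, so this is offered as an assumption; the function name and the numbers are illustrative.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), a commonly used formula (assumed here)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)

# Hypothetical test with a score standard deviation of 15 points.
print(standard_error_of_measurement(15, 0.75))  # 7.5
print(standard_error_of_measurement(15, 0.96))  # about 3.0
```

With the same score spread, the lower reliability coefficient produces the larger standard error, matching the bullets above.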


Variables That Affect the Standard Error of Measurement

The following affect test reliability:

• Population sample size: the larger the population sample, the more reliable the test

• Test length: longer tests are usually more reliable because there are more test items, resulting in a better sample of behaviors

• Range of test scores of the norming group: the wider the spread of scores, the more reliably the test can distinguish between good and poor students

Methods of Studying Reliability

Interrater Reliability: The consistency with which a test measures a skill, trait, or domain across examiners. This type of reliability is most important when responses are subjective or open-ended.

Types of Validity…

Content: Test’s ability to sample the content that is being measured

Criterion-related:

1. Concurrent: The relation between a test’s score and other available criteria

2. Predictive: The relationship between a test’s score and future performance

Construct: The extent to which there is evidence that a test measures a particular construct


Considerations in Choosing and Evaluating Tests

To select the best test to meet the developmental characteristics of young children, the following need to be considered:

• the purpose of the testing

• the characteristics to be measured

• how the test results will be used

• the qualifications of those who will interpret the scores and use the results

• any practical constraints: cost, time, ease of scoring and use of test results


Reviewing the Test Manual

The test manual should include information that is adequate for users to determine whether the test is practical and suitable for their purposes.

The manual should address the following:

• Purpose of the test

• Test design

• Establishment of validity and reliability

• Test administration and scoring
