ESE444/544 - Types of Assessment

CLASSROOM ASSESSMENT

Assessment for 21st Century Learning

Types of Assessment

Formative
• Occurs during instruction
• Not graded
• Designed to provide information needed to adjust teaching and learning while they are still occurring
• Assessment FOR Learning

Summative
• Occurs after instruction
• Graded
• Designed to provide information about the amount of learning that has occurred at a particular point
• Assessment OF Learning

Formative
• Ongoing (usually daily)
• Multiple opportunities to reach the criteria
• Allows for practice and improvement
• Formative assessments point out areas of incomplete learning to students and teachers
• Formative assessments give teachers time to adapt their instruction
• Can be informal (observations, verbal interactions) or formal (work samples, paper and pencil tasks)

Summative
• Formal (rubric or scoring sheet)
• Meaningful descriptive feedback is important
• Provides a means of determining what has been learned for the purposes of reporting
• Does not provide an opportunity to correct or improve performance
• Provides numbers for statistical review of achievement
• Must be reliable and valid
• Examples: state assessments, interim assessments, end-of-unit assessments

Which of the following is an example of a formative assessment?

1. End-of-chapter test

2. The ACT

3. The graduation exam

4. Ungraded daily assessments

Exit Slips are written responses to questions the teacher poses at the end of a lesson or a class to assess student understanding of key concepts. This is an example of a summative assessment.

1. True

2. False

Types of Summative Assessments

Norm-Referenced Tests

• Made to compare test takers to each other.

• Most appropriate when one wishes to make comparisons across large numbers of students or important decisions regarding student placement and advancement.

Criterion-Referenced Tests

• Intended to measure how well a person has learned a specific body of knowledge and skills

• Most appropriate for quickly assessing what concepts and skills students have learned from a segment of instruction.

Norm-Referenced Tests
• Determine individual performance in comparison to others; standardized, comparisons among people
• It is inappropriate to use NRTs to determine the effectiveness of educational programs and to provide diagnostic information for individual students
• Items cover a broad range of content and often represent a mismatch between what is taught locally and what is taught in other states

Criterion-Referenced Tests
• Determine individual performance in comparison to some standard or criterion
• Items based on standards given to students (i.e., objectives); most students should answer correctly

The SAT, a college entrance exam, compares individual student performance to the performance of a sample of students. The SAT is what type of test?

1. Norm-Referenced

2. Criterion-Referenced

The goal of the driving test is to see whether the test taker is skilled enough to be granted a driver's license, not to see whether one test taker is more skilled than another test taker. This type of test is called what?

1. Norm-Referenced

2. Criterion-Referenced

Concerns with Summative Assessment

Reliability
• The consistency of the test as a measurement
• Same results, time and again

Validity
• How well a test measures what it says it measures

Reliability
• Another way to think of reliability is to imagine a kitchen scale. If you weigh five pounds of potatoes in the morning, and the scale is reliable, the same scale should register five pounds for the potatoes an hour later.
• Likewise, instruments such as classroom tests and national standardized exams should be reliable – it should not make any difference whether a student takes the assessment in the morning or afternoon; one day or the next.

A test designed to assess student learning in math class is given to a group of students twice, with the second administration coming a week after the first. The test would be considered reliable if …

1. Students scored higher on the first exam than the second.

2. Students scored the same on both exams.

3. Students scored higher on the second exam than the first.

Validity
• Refers to the accuracy of an assessment -- whether or not it measures what it is supposed to measure.
• Even if a test is reliable, it may not provide a valid measure.

Validity
• Let’s imagine a bathroom scale that consistently tells you that you weigh 130 pounds. The reliability (consistency) of this scale is very good, but it is not accurate (valid) because you actually weigh 145 pounds (perhaps you re-set the scale in a weak moment)!
• Since teachers, parents, and school districts make decisions about students based on assessments (such as grades, promotions, and graduation), the validity inferred from the assessments is essential -- even more crucial than the reliability.
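
The bathroom-scale example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it treats "reliable" as "little spread across repeated readings" and "valid" as "close to the true value". The four repeated readings and the variable names are made up for the sketch, and real test reliability and validity are estimated with more involved methods.

```python
from statistics import mean, pstdev

# Assumed numbers, taken from the scale example: the true weight is 145,
# but the scale shows 130 every time it is used.
TRUE_WEIGHT = 145
readings = [130, 130, 130, 130]  # hypothetical repeated readings

# Reliability (consistency): how much the readings vary from each other.
spread = pstdev(readings)

# Validity (accuracy): how far the typical reading is from the truth.
bias = mean(readings) - TRUE_WEIGHT

print(f"spread between readings: {spread}")
print(f"average error: {bias} pounds")
```

A spread of zero says the scale is perfectly consistent, while the 15-pound average error says its readings cannot be trusted as a measure of weight.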

A history teacher designs a unit assessment. The questions are written with complicated wording and phrasing. Which of the following is true?

1. This test is NOT valid, because the test could be one of reading comprehension rather than history.

2. This test is NOT reliable, because students will do poorly due to the complicated wording.

3. This test is valid, because the students should be able to comprehend the question wording regardless of its level.

4. This test is reliable, because all students will do poorly, regardless of their knowledge of history.

Evaluation of Summative Assessments

Mean
• Definition: the arithmetic average of a set of numbers
• Applicability: the mean is used for normal distributions
• Limitations: largely influenced by outliers

Median
• Definition: the middle number when the set is sorted in numerical order
• Applicability: the median is generally used for skewed distributions
• Limitations: better suited for skewed distributions
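
To see why the mean is "largely influenced by outliers" while the median is not, here is a small Python sketch. The scores are invented for illustration, and the `summarize` helper is a hypothetical name, not part of the course material.

```python
from statistics import mean, median

# Hypothetical class scores; the last one is an unusually low outlier.
scores = [78, 82, 85, 88, 90, 12]

def summarize(values):
    """Return (mean, median) for a list of scores."""
    return mean(values), median(values)

with_outlier = summarize(scores)
without_outlier = summarize(scores[:-1])  # same class, outlier dropped

print("with outlier    (mean, median):", with_outlier)
print("without outlier (mean, median):", without_outlier)
```

With these numbers, the single low score moves the mean by about 12 points (72.5 versus 84.6) but moves the median by only 1.5 points (83.5 versus 85).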

Data Distribution

Normal Distribution
• Mean = median = mode
• Symmetry about the center
• 50% of values less than the mean and 50% greater than the mean
• Bell curve

Data Distribution

Negative Skew
• The long "tail" is on the negative side of the peak
• Skewed to the left

Positive Skew
• The long tail is on the positive side of the peak
• Skewed to the right
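
One way to check the direction of skew in a set of scores is to compute a skewness statistic, where a positive value indicates a long right tail and a negative value a long left tail. The sketch below uses the average cubed z-score, which is one common definition. The score lists, the function names, and the 0.1 cutoff for "roughly symmetric" are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def skewness(values):
    """Average cubed z-score. Assumes the values are not all identical."""
    m = mean(values)
    sd = pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def describe_skew(values, tolerance=0.1):
    """Label the direction of skew; the tolerance is an arbitrary cutoff."""
    s = skewness(values)
    if s > tolerance:
        return "positive skew (long tail to the right)"
    if s < -tolerance:
        return "negative skew (long tail to the left)"
    return "roughly symmetric"

# Hypothetical scores: most students score low, one scores very high.
hard_test = [40, 45, 50, 50, 55, 60, 95]
# Mirror image: most students score high, one scores very low.
easy_test = [100 - v for v in hard_test]
# Evenly spaced scores with no long tail on either side.
balanced_test = [60, 70, 80, 90, 100]

for name, data in [("hard", hard_test), ("easy", easy_test), ("balanced", balanced_test)]:
    print(name, "->", describe_skew(data))
```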

Other Evaluations of Summative Assessments

Standard Deviation
• Square root of variance in scores (how the scores are arranged around the average score)
• High SD means more space between scores; low SD means clustered scores
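
As a rough illustration, the Python sketch below computes the standard deviation as the square root of the variance for two made-up classes with the same average score. It assumes the population form of the variance (dividing by the number of scores); a sample-based version would divide by one less. The `standard_deviation` helper is a hypothetical name.

```python
from math import sqrt
from statistics import mean, pvariance

def standard_deviation(scores):
    """Square root of the (population) variance of the scores."""
    return sqrt(pvariance(scores))

# Two hypothetical classes with the same mean score of 80.
clustered = [78, 79, 80, 81, 82]     # scores close to the average
spread_out = [60, 70, 80, 90, 100]   # scores far from the average

sd_clustered = standard_deviation(clustered)
sd_spread = standard_deviation(spread_out)

print(f"clustered:  mean={mean(clustered)}, SD={sd_clustered:.2f}")
print(f"spread out: mean={mean(spread_out)}, SD={sd_spread:.2f}")
```

Both classes average 80, but the spread-out class has a standard deviation ten times larger, which is the kind of difference the mean alone hides.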

Which graph has a higher standard deviation?

1. Red

2. Blue

Other Evaluations of Summative Assessments

Percentile
• The value below which a certain percent of values fall
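
A percentile rank can be sketched as a simple count. The function below reports the percent of scores in a group that are at or below a given score. Conventions differ on whether ties count ("below" versus "at or below"), so this is one possible reading rather than an official definition, and the group of scores and the `percentile_rank` name are invented for the example.

```python
def percentile_rank(score, all_scores):
    """Percent of scores in all_scores that are at or below score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)

# Hypothetical exam scores for a group of ten test takers.
group = [15, 17, 18, 19, 20, 21, 22, 24, 27, 31]

rank = percentile_rank(24, group)
print(f"A score of 24 is at the {rank:.0f}th percentile of this group")
```

Note that the result describes standing relative to other test takers; it says nothing about the percent of questions answered correctly.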

A student scored in the 74th percentile on her ACT exam. Which of the following is true?

1. She answered 74% of the test questions correctly.

2. Her ACT score was equal to or better than that of 74% of students taking the ACT exam.

3. She got a C on her ACT exam.

What it all means…
• Is the assessment measuring what we want it to?
• What are the instructional decisions to be made based on this assessment information?
• How will those changes be made?

The use of clickers in this presentation is a form of what kind of assessment?

1. Summative

2. Formative
