Setting Achievement Levels for the 2011 NAEP Writing for Grades 8 and 12


Setting Achievement Levels for the 2011 NAEP Writing for Grades 8 and 12

Presentations

• Historical Context and Perspective

Susan Loomis, Consultant

• Developing Achievement Levels Descriptions

Pat Porter, Instruction Driven Measurement Center

Susan Loomis, Consultant

• Panelist Recruitment

Chris Clough, Measured Progress

• Facilitation

Joseph St. George, Measured Progress

• Technological Enhancements

Luz Bay, Measured Progress

• Discussant: Kevin Sweeney, The College Board

HISTORICAL PERSPECTIVE

NAEP Achievement Levels-Setting

2011 Writing NAEP

2011 was the first fully computer-based assessment for NAEP.

• Grade 8 (ca. 24,000 examinees) and grade 12 (ca. 29,000 examinees)

• 22 unique writing tasks organized into 44 forms for each grade

o Each form is made up of two writing tasks specifying different communicative purposes: to persuade; to explain; and to convey experience, real or imagined (a sketch of this form structure follows below)

o Writing stimuli presented in different formats: text, visual, audio, and video
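To make the two-task form design concrete, here is a minimal sketch in Python. The task IDs, stimulus formats, and the pairing check are illustrative assumptions; the actual NAEP form-assembly rules are not described in this presentation.

```python
# Hypothetical sketch of the two-task NAEP writing form structure.
# Task IDs and the pairing rule are invented for illustration only.
from dataclasses import dataclass

PURPOSES = ("persuade", "explain", "convey experience")

@dataclass(frozen=True)
class WritingTask:
    task_id: str
    purpose: str          # one of PURPOSES
    stimulus_format: str  # text, visual, audio, or video

@dataclass(frozen=True)
class Form:
    task_a: WritingTask
    task_b: WritingTask

    def __post_init__(self):
        # Each form pairs two writing tasks with different communicative purposes.
        if self.task_a.purpose not in PURPOSES or self.task_b.purpose not in PURPOSES:
            raise ValueError("Unknown communicative purpose.")
        if self.task_a.purpose == self.task_b.purpose:
            raise ValueError("A form's two tasks must differ in purpose.")

# Example form: a persuasive task with a text stimulus paired with a
# convey-experience task with a video stimulus.
form = Form(
    WritingTask("W-01", "persuade", "text"),
    WritingTask("W-02", "convey experience", "video"),
)
print(form.task_a.purpose, "/", form.task_b.purpose)
```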

Components of NAEP Achievement Levels

• Achievement Levels Definitions: the criteria

• Cut Scores: the measure of the criteria

• Exemplar Items: the representation of the criteria

• Methodology: the procedure for collecting informed judgments about performance relative to achievement levels descriptions

Features of NAEP Achievement Levels-Setting

• Policy definitions of each level of achievement: same for all subjects and grades (Basic, Proficient, Advanced)

• Achievement Levels Descriptions developed for each subject and grade to specify what students should know and be able to do to demonstrate performance at each level of achievement

• Panelists selected to be nationally representative: sex, race/ethnicity, geographical region, SES of district

Features of NAEP Achievement Levels-Setting

• Panelists selected by type: 55% classroom teachers in subject and grade; 15% other educators (with subject and grade expertise); 30% general public (with subject and grade expertise)

• Methodology must have a research base

• Facilitation by experts in standard setting and in content

Features of NAEP Achievement Levels-Setting

• Procedures must be piloted with panels before operational implementation

• Extra pilot study with panelists required for writing, to test procedures for working with computers

• Extensive training and instructions for panelists in an iterative process with multiple rounds of judgments

• Evidence of validity of procedures implemented and levels set

• Special study required to collect information for the Board to use in evaluating the reasonableness of the 2011 levels, relative to previous results

NAEP Achievement Levels-Setting Procedures

• Extensive training

• Advance materials: frameworks, achievement levels descriptions, logistics, agenda

• General sessions to provide general instructions to all, followed by grade-level sessions to provide specific instructions and practice

• Take the assessment under timed conditions and self-score

• Become conversant in the meaning and use of achievement levels descriptions through exercises and discussion

• Practice ratings only after training is complete!

Rounds of Ratings and Feedback

• Cut Scores

• Cut Score Distribution Chart

• Cut Score Location Feedback

• Classification Tally (Round 1)

• Consequences Data Feedback (Rounds 2 & 3)

Recommendations for Board Approval

Numerical results: cut scores and %s at or above for each grade

Achievement Levels Descriptions for grades 8 and 12

Exemplar performances for each level for each grade

DEVELOPING ACHIEVEMENT LEVELS DESCRIPTIONS

Recommendations for Developing Descriptions of Performance Standards

Persons who develop the statements should be recognized for their expertise in the content area and in student learning at the appropriate grade level(s).

Recommendations for Developing Descriptions of Performance Standards

A generic or policy definition of performance at each level should provide a clear statement of what students should know and be able to do at each level. The policy definition is then operationalized for each grade and subject.

Recommendations for Developing Descriptions of Performance Standards

Descriptions of performance standards should be detailed enough to delineate performance at one level from that at the next, but general enough to avoid association with a specific item or task.

Recommendations for Developing Descriptions of Performance Standards

Performance standards should describe knowledge and skills that can be measured, not beliefs or feelings about students.

Recommendations for Developing Descriptions of Performance Standards

The descriptions of performance standards should be consistent with procedures for scoring and scaling student performance. Performance standards should avoid the use of frequency to calibrate knowledge and skills across levels.

Recommendations for Developing Descriptions of Performance Standards

Descriptions of performance standards must be evaluated to verify alignment:

• There must be an appropriate and logical relationship between the policy definitions and the performance descriptions.

• There must be an appropriate and logical progression of criteria from one level to another within each grade.

• There must be an appropriate and logical progression of criteria within each level across grades.

Performance Levels Descriptions Recommended for Implementation

• NAGB Committee (COSDAM) gave provisional approval to the recommended descriptions for use in setting 2011 writing achievement levels

• Implemented for field trial #1 and the pilot study. Changes needed!

• A second field trial was needed to try out new ALDs and procedures

Grade 8 Achievement Levels, by Dimension

Addressing Communicative Purpose

• Basic: Eighth-grade students writing at the Basic level should be able to address the tasks appropriately and mostly accomplish their communicative purposes.

• Proficient: Eighth-grade students writing at the Proficient level should be able to develop responses that clearly accomplish their communicative purposes.

• Advanced: Eighth-grade students writing at the Advanced level should be able to construct skillful responses that accomplish their communicative purposes effectively.

Text Structure & Coherence

• Basic: Their texts should be coherent and effectively structured.

• Proficient: Their texts should be coherent and well structured, and they should include appropriate connections and transitions.

• Advanced: Their texts should be coherent and well structured throughout, and they should include effective connections and transitions.

Idea Development

• Basic: Many of the ideas in their texts should be developed effectively.

• Proficient: Most of the ideas in the texts should be developed logically, coherently, and effectively.

• Advanced: The ideas in the texts should be developed logically, coherently, and effectively.

Details & Elaboration

• Basic: Supporting details and examples should be relevant to the main ideas they support.

• Proficient: Supporting details and examples should be relevant to the main ideas they support, and contribute to overall communicative effectiveness.

• Advanced: Supporting details and examples should skillfully and effectively support and extend the main ideas in the texts.

Voice

• Basic: Voice should align with the topic, purpose, and audience.

• Proficient: Voice should be relevant to the tasks and support communicative effectiveness.

• Advanced: Voice should be distinct and enhance communicative effectiveness.

Sentence Structure & Complexity

• Basic: Texts should include appropriately varied uses of simple, compound, and complex sentences.

• Proficient: Texts should include a variety of simple, compound, and complex sentence types combined effectively.

• Advanced: Texts should include a well-chosen variety of sentence types, and the sentence structure variations should enhance communicative effectiveness.

Word & Phrase Choice

• Basic: Words and phrases should be relevant to the topics, purposes, and audiences.

• Proficient: Words and phrases should be chosen thoughtfully and used in ways that contribute to communicative effectiveness.

• Advanced: Words and phrases should be chosen strategically, with precision, and in ways that enhance communicative effectiveness.

Grammar, Usage, & Mechanics

• Basic: Knowledge of spelling, grammar, usage, capitalization, and punctuation should be made evident; however, there may be some errors in the texts that impede meaning.

• Proficient: Solid knowledge of spelling, grammar, usage, capitalization, and punctuation should be evident throughout the texts. There may be some errors, but these errors should not impede meaning.

• Advanced: An extensive knowledge of spelling, grammar, usage, capitalization, and punctuation should be evident throughout the texts. Appropriate use of these features should enhance communicative effectiveness. There may be a few errors, but these errors should not impede meaning.

Initial Steps for Evaluating the ALDs for the 2011 NAEP Writing Test

• Identify a series of dimensions to be examined in students' responses, including:

- Addressing Communicative Purpose

- Text Structure and Coherence

- Idea Development

- Details and Elaboration

- Sentence Structure and Complexity

- Word and Phrase Choice

- Grammar, Usage, and Mechanics

Revisions of Achievement Levels Descriptions

A matrix design was developed by the content facilitators to analyze each dimension of the ALDs, to ensure that all critical elements of writing for the 2011 Writing NAEP were addressed in a pedagogically sound and consistent manner and that the ALDs accurately represented the NAGB policy definitions.

Grade 8 Achievement Levels: Basic, Proficient, Advanced

Eighth-grade students writing at the Basic level should be able to address the tasks appropriately and mostly accomplish their communicative purposes. Their texts should be coherent and effectively structured. Many of the ideas in their texts should be developed effectively. Supporting details and examples should be relevant to the main ideas they support. Voice should align with the topic, purpose, and audience. Texts should include appropriately varied uses of simple, compound, and complex sentences. Words and phrases should be relevant to the topics, purposes, and audiences. Knowledge of spelling, grammar, usage, capitalization, and punctuation should be made evident; however, there may be some errors in the texts that impede meaning.

Eighth-grade students writing at the Proficient level should be able to develop responses that clearly accomplish their communicative purposes. Their texts should be coherent and well structured, and they should include appropriate connections and transitions. Most of the ideas in the texts should be developed logically, coherently, and effectively. Supporting details and examples should be relevant to the main ideas they support, and contribute to overall communicative effectiveness. Voice should be relevant to the tasks and support communicative effectiveness. Texts should include a variety of simple, compound, and complex sentence types combined effectively. Words and phrases should be chosen thoughtfully and used in ways that contribute to communicative effectiveness. Solid knowledge of spelling, grammar, usage, capitalization, and punctuation should be evident throughout the texts. There may be some errors, but these errors should not impede meaning.

Eighth-grade students writing at the Advanced level should be able to construct skillful responses that accomplish their communicative purposes effectively. Their texts should be coherent and well structured throughout, and they should include effective connections and transitions. Ideas in the texts should be developed logically, coherently, and effectively. Supporting details and examples should skillfully and effectively support and extend the main ideas in the texts. Voice should be distinct and enhance communicative effectiveness. Texts should include a well-chosen variety of sentence types, and the sentence structure variations should enhance communicative effectiveness. Words and phrases should be chosen strategically, with precision, and in ways that enhance communicative effectiveness. An extensive knowledge of spelling, grammar, usage, capitalization, and punctuation should be evident throughout the texts. Appropriate use of these features should enhance communicative effectiveness. There may be a few errors, but these errors should not impede meaning.

PANELIST RECRUITMENT

Panelist Recruitment: Goals

• Number of Panelists

– Field Trial: 20

– Pilot and Operational: 100 (40 + 60)

• Panel Composition and Diversity

• Inclusion of General Public

– Rationale

– Qualifications

• Staged Selection Process

– Select districts and identify nominators

– Contact nominators

– Contact nominees

– Select and recruit panelists

Panelist Distribution
(Target % in parentheses; cells show Grade 8 % / Grade 12 % / All %)

Panelist Type
  Teachers (55%): 59 / 54 / 56
  Nonteacher Educators (15%): 19 / 18 / 18
  General Public (30%): 22 / 29 / 25

Gender
  Female (50%): 81 / 68 / 75
  Male (50%): 19 / 32 / 25

Race/Ethnicity
  Caucasian (80%): 85 / 96 / 91
  Non-Caucasian (20%): 15 / 4 / 9

NAEP Region
  Midwest (35%): 22 / 29 / 25
  Northeast (20%): 19 / 14 / 16
  South (25%): 22 / 21 / 22
  West (20%): 37 / 36 / 36
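As a quick illustration of how the achieved panel composition compares with the recruitment targets, the sketch below computes the target-versus-achieved gap for the combined (All) column; the percentages are transcribed from the table above.

```python
# Gap between recruitment targets and achieved panel composition,
# using the "All" column of the table above.
targets = {  # attribute: (target %, achieved % across both grades)
    "Teachers": (55, 56),
    "Nonteacher Educators": (15, 18),
    "General Public": (30, 25),
    "Female": (50, 75),
    "Male": (50, 25),
    "Caucasian": (80, 91),
    "Non-Caucasian": (20, 9),
    "Midwest": (35, 25),
    "Northeast": (20, 16),
    "South": (25, 22),
    "West": (20, 36),
}

for attribute, (target, achieved) in targets.items():
    print(f"{attribute:>22}: target {target}%, achieved {achieved}%, gap {achieved - target:+d}%")
```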

Panelist Distribution

[Figure: map showing the number of panelists recruited from each state; individual state labels are not recoverable from the transcript.]

Author Panelists

Donald Ball – Author of published mystery thrillers such as Toll Road, Scenic Route, and Twisted Road Home.

Robin Cody – Author of three books noted by the state of Oregon as top novels celebrating Oregon’s culture and heritage.

Tricia Brown – A mother of two and grandmother of six, Tricia has written several children’s and young adult books.

Amy Koss – This mother has authored 14 young adult novels.

Author Panelists

Ginny Rorby – Author of many young adult novels including Lost in the River of Grass, The Outside of a Horse, and Dolphin Sky.

Vivian Yang – Vivian has authored two young adult novels, Shanghai Girl and Memoirs of a Eurasian.

Tom Allen – Co-authored with his son National Geographic’s Mr. Lincoln’s High-Tech War, voted one of the best non-fiction young adult books by Voice of Youth Advocates Magazine.

Panelist Recruitment: Challenges

• History of a Growing Challenge

• Observed Issues

– Ineligibility: nominees were found not to meet the panelist qualifications

– Homogeneity: lack of diversity among the nominee pool with regard to one or more aspects of the diversity criteria

– Unresponsiveness: lack of reply from the intended recipients of the recruitment communications; the primary obstacle to recruitment

• Gaps

– Relationship

– Relevance

– Calendar

– Comprehension

– Credibility

Panelist Recruitment: Challenges

• Relationship Gap: a significant gap wherein the receivers lack experience with, or a relationship to, people affiliated with the recruitment effort or the NAEP project, which reduces the likelihood of response. Contacts are uninterested because they don't "know" the recruiter or the project.

• Relevance Gap: the pertinence of the initial recruitment invitation is not apparent to the recipients, reducing their likelihood of responding. Contacts ask, "What does this have to do with me?"

Panelist Recruitment: Challenges

• Calendar Gap: a purely logistical gap referring to schedule conflicts, or a lack of commitment due to late receipt of the recruitment invitation.

• Comprehension Gap: understanding of the message is seriously disrupted, reducing the likelihood of response. Contacts (especially from the general public) commented that the initial materials were too lengthy or filled with jargon.

• Credibility Gap: the initial recruitment invitation (typically email) or its claims seem implausible, reducing the recipients' likelihood of responding. The prevalence of spam, phishing, telemarketing, and "junk mail" has created a credibility deficit that any recruitment or mass-communication effort must overcome.

Panelist Recruitment: Recommendations

1. Partner with key contacts from demographically diverse, subject-specific groups to fulfill recruitment needs.

Gaps closed: • Relationship

• Credibility

• (Relevance)

• (Comprehension)

• (Calendar)

Panelist Recruitment: Recommendations

2. Prioritize nominee communications by researching qualifications and experience prior to initial contractor contact.

Gaps closed: • Calendar

• (Relevance)

Panelist Recruitment: Recommendations

3. Plan recruitment communications around knowable nominator and nominee schedule conflicts.

Gaps closed: • Calendar

Panelist Recruitment: Recommendations

4. Prioritize direct, personal phone contact with nominators and nominees as the chief means of securing nominations and nominee acceptances.

Gaps closed: • Relationship

• Credibility

• (Relevance)

• (Comprehension)

• (Calendar)

Panelist Recruitment: Recommendations

5. Craft the initial communication piece to increase accessibility for the nominees, specifically by removing jargon, reducing length, and improving presentation.

Gap closed: • Comprehension

Panelist Recruitment: Recommendations

6. Send the initial recruitment piece in hard copy as well as email.

Gaps closed: • Credibility

• (Relevance)

Panelist Recruitment: Recommendations

7. Provide a simple explanation of the recruitment process to the nominees in the initial recruitment piece.

Gaps closed: • Comprehension

• Relationship

• Credibility

FACILITATION

NAEP Standard Setting vs. State Standard Setting Facilitation: Commonalities and Differences

• The NAEP standard-setting process, as we have heard, was in many ways quite different from statewide standard settings

• Facilitation for the NAEP standard setting (including the pilot study, field trials, and operational standard setting) had certain similarities to, but was in some ways very different from, facilitation for a statewide standard setting

• There were challenges typical of statewide standard settings, as well as new challenges that required different solutions

• Lessons in facilitation learned from the NAEP standard setting may be applicable to future non-typical standard settings

Facilitation Commonalities

• While there were specific differences between NAEP standard setting and a typical statewide standard setting, many aspects of facilitation were similar:

– "Information overload"

– Feelings of insecurity or unease; panelists question, "Am I doing this right?"

– A general desire to do the right thing for the students

Information Overload

CHALLENGE:

• The process facilitator, in both a statewide standard setting and the NAEP standard setting, is responsible for providing a large amount of information.

– Facilitators need to be expert in the process, in order to keep the process moving.

– During the first several days of any standard setting, a large amount of information is presented.

SOLUTION:

• A carefully designed process, especially one where an agenda is presented to panelists, allows them to better understand standard setting. Training and explanation of small steps of the process at one time allows panelists to better grasp what is happening currently, and builds context around the process as a whole.

Feelings of insecurity

CHALLENGE:

• As panelists are asked to move the process along, often with a less than clear understanding of the process, they begin to feel a sense of unease and often develop serious doubts.

– These doubts manifest in one of two ways:

• Doubts about the process.

• Doubts about their ability to complete the task properly.

SOLUTION:

• The facilitator must know and understand the process, in order to assure the panelists that positive progress is being made.

• The facilitator must reassure the panelists that they were chosen for a reason, and that they do have the required skill set to accomplish the task at hand.

Desire to do the right thing for students

• Whether in a statewide standard setting or a nationwide assessment like NAEP, the people who agree to serve on standard-setting panels have good intentions.

• Those who serve as panelists have a strong desire to do the best job they can, and to ensure that students are held accountable, but in a fair manner.

CHALLENGE:

• While most panelists set appropriate performance standards based on the ALDs, some believe that doing the right thing for students means setting standards lower to ensure more students pass the test.

SOLUTION:

• Refocus panelists on the goals of standard setting, and on the ALDs that were agreed upon, rather than on the impact on students.

Facilitation Differences

– Technological enhancement

– Developing a working relationship

– Preconceived notions

– First-time standard setters

Technological enhancement

CHALLENGE:

• The NAEP writing standard setting used technological enhancement, in the form of computers running BoWTIE.

• Panelists for NAEP, as for statewide standard settings, come from a variety of backgrounds, including differing abilities with technology.

SOLUTION:

• The process facilitator took on an additional function as first-round IT support.

• The process facilitator was also responsible for keeping panelists together in the process, which can become more difficult when differing levels of computer literacy among the panelists leave some people bored and others frustrated at slow progress.

Developing a working relationship

CHALLENGE:

• In a statewide standard setting, teachers from across the state are assembled to work together.

• Teachers may know each other through work on statewide initiatives like curriculum committees or assessment review committees.

• There is usually an innate group acceptance, since each individual can relate to the others in the room:

– "We all teach in a school in this state, using the state standards."

• In a nationwide standard setting, like NAEP, panelists from all different backgrounds are brought together.

– Without the common experience of all panelists being educators, panelists took longer than in a statewide standard setting to become a cohesive group.

Developing a working relationship

SOLUTION:

• Panelists received extensive training, and proceeded through multiple activities as a group, prior to beginning the actual work of setting standards.

– This allowed panelists to begin working together and getting to know one another before being asked to set standards or to discuss their ratings.

– Panelists were more confident in each other's abilities, since they had spent more time as a group doing the trainings and discussions.

Preconceived notions

CHALLENGE:

• Teachers from each state are familiar with how writing is taught and evaluated in their state.

– Standards can be vastly different across the country.

– Asking teacher panelists to set aside what they know about writing in their states, and to focus solely on the definitions set out by NAEP, is oftentimes difficult.

SOLUTION:

• Constant refocusing on the NAEP ALDs and rubrics as the tools by which students were being evaluated was needed.

– It was necessary to reframe the standard-setting process around the NAEP ALDs, in order to ensure that ratings reflected the NAEP achievement levels.

First-time standard setters

CHALLENGE:

• Often, during statewide standard settings, the majority of panelists are teachers or administrators.

• Educators usually have a basic understanding of how a standard setting works, whether from previous experience serving on standard-setting panels or from classroom evaluation.

• Most educators also have experience using rubrics to evaluate student performance.

• Many panelists on the NAEP writing standard-setting panels were members of the public.

– While they often had writing experience, they did not have experience reading student writing and evaluating it for specific elements.

First-time standard setters

SOLUTION:

• The facilitators needed to provide an additional level of guidance and explanation to these panelists, to ensure that they understood each task and were appropriately applying basic evaluation techniques.

– Explanations had to be fleshed out to fully communicate the task.

– Additional check-ins with these panelists were necessary to make sure they were able to accomplish the goals.

Lessons learned and future applications

• The NAEP standard setting brought together panelists from across the country, with differing levels of standard-setting experience, resulting in different challenges than a typical statewide standard setting.

• As assessment transitions from single states to consortia-based assessments, understanding the complex interactions between panelists who are educators and non-educators, and who are from different states, will be important.

• As states adopt the Common Core State Standards, the instruction and implementation of these content standards is likely to differ from state to state. Once common assessments are in place, panelists will have to come to a common understanding of the consortia's ALDs developed for standard setting.

TECHNOLOGICAL ENHANCEMENTS

Body of Work (BoW)

• Developed specifically for performance assessments

• Panelists are given sample student responses to classify into performance categories.

• The cut score represents the score that maximizes the difference in performance of students between two levels.

Body of Work (BoW): Stages

• Training and Calibration

– Achievement Levels Descriptions

– Examples of student performance

– Tasks and scoring rubrics

• Rangefinding: classify student work (BoWs) sampled across the score range according to the Achievement Levels Descriptions

• Rangefinding Replication: classify a new sample of student work (BoWs), sampled as before, according to the Achievement Levels Descriptions

Body of Work (BoW): Cut Score Computation

• A cut score is computed (using logistic regression) for each individual panelist, as the score that best differentiates the booklets above and below the score point (a sketch follows below).

• The group cut score is the median (midpoint) across individual panelists.
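To make the cut score computation concrete, here is a minimal sketch in Python. The booklet scores, the panelists' implicit standards, and the use of scikit-learn are illustrative assumptions; the operational computation was carried out within the BoWTIE system.

```python
# Sketch of the BoW cut score computation: per-panelist logistic regression,
# then the median across panelists. All data here are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

def panelist_cut(booklet_scores, at_or_above):
    """Fit P(classified at or above the level | booklet score) and return
    the score where that probability crosses 0.5, i.e., -intercept/slope."""
    X = np.asarray(booklet_scores, dtype=float).reshape(-1, 1)
    model = LogisticRegression(C=1e6)  # large C: effectively unpenalized
    model.fit(X, at_or_above)
    return -model.intercept_[0] / model.coef_[0, 0]

rng = np.random.default_rng(0)
booklet_scores = rng.uniform(100, 300, size=50)    # 50 hypothetical BoWs

cuts = []
for implicit_standard in (195.0, 205.0, 210.0):    # three hypothetical panelists
    noisy = booklet_scores + rng.normal(0, 10, size=50)
    labels = (noisy > implicit_standard).astype(int)
    cuts.append(panelist_cut(booklet_scores, labels))

group_cut = np.median(cuts)  # the group cut score is the panelist median
print(f"individual cuts: {np.round(cuts, 1)}; group cut: {group_cut:.1f}")
```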

Implementation

Process was the same as other NAEP Achievement Levels-Setting (ALS) processes:

• Orientation to NAEP ALS, including taking a form of the NAEP

• Training on the NAEP Writing Framework, Achievement Levels Descriptions, items and scoring rubrics, and the rating methodology

• Rounds of Ratings and Feedback

• Consequences Data Questionnaire

• Selection of Exemplar Items and Responses

• Process Evaluation at Every Major Stage

Body of Work Technological Integration and Enhancements (BoWTIE)

• To overcome the logistical difficulties in materials preparation

• To enhance security of materials

• To promote "green" procedures

• To enhance the overall efficiency and effectiveness of the process

Rounds of Ratings and Feedback: Round 1

• 50 booklets (BoWs), each with responses to two prompts, to classify according to the Achievement Levels Descriptions

• BoWs ordered from highest to lowest scores

• Classify each BoW as below Basic, Basic, Proficient, or Advanced

• Classifications were made independently, without discussion

Rounds of Ratings and Feedback: Feedback from Round 1

• Cut Scores

• Cut Score Distribution Chart

• Cut Score Location Feedback

• Classification Tally (a sketch of the tally follows below)
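The classification tally summarizes, for each booklet, how the panel's Round 1 classifications were distributed across the four levels. A minimal sketch of such a tally, assuming a hypothetical data layout (BoWTIE's internal representation is not shown in this presentation):

```python
# Round 1 classification tally across panelists (hypothetical data layout).
from collections import Counter

LEVELS = ["Below Basic", "Basic", "Proficient", "Advanced"]

# classifications[panelist_id][booklet_id] -> level assigned in Round 1
classifications = {
    "P01": {"B01": "Basic", "B02": "Proficient"},
    "P02": {"B01": "Below Basic", "B02": "Proficient"},
    "P03": {"B01": "Basic", "B02": "Advanced"},
}

booklets = sorted({b for ratings in classifications.values() for b in ratings})
for booklet in booklets:
    tally = Counter(ratings[booklet] for ratings in classifications.values())
    counts = ", ".join(f"{level}: {tally.get(level, 0)}" for level in LEVELS)
    print(f"{booklet} -> {counts}")
```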

Rounds of Ratings and Feedback: Round 2

• Cut Score Distribution Feedback

• Panelists were provided with the same 50 BoWs, in the same order, along with their Round 1 classifications to review.

• Using the NAEP Achievement Levels Descriptions and the feedback information presented after Round 1, panelists reviewed and reclassified each BoW as below Basic, Basic, Proficient, or Advanced.

• Classifications were again made independently, without discussion.

Rounds of Ratings and Feedback: Feedback from Round 2

• All feedback from Round 1

• Consequences Data Feedback (a sketch follows below)
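Consequences data show the percentage of students who would score at or above each level under the panel's current cut scores. A minimal sketch of that computation, with an invented score distribution and invented cut scores:

```python
# Consequences data: percent of students at or above each level implied by
# a set of cut scores. The score distribution and cuts are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
student_scores = rng.normal(150, 35, size=25_000)  # hypothetical scale scores

cuts = {"Basic": 120.0, "Proficient": 173.0, "Advanced": 211.0}
for level, cut in cuts.items():
    pct_at_or_above = (student_scores >= cut).mean() * 100
    print(f"At or above {level}: {pct_at_or_above:.1f}%")
```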

Rounds of Ratings and Feedback: Round 3

• Panelists were provided with a new set of 50 BoWs.

• Otherwise, the process was exactly the same.

Rounds of Ratings and Feedback: Feedback from Round 3

• Same as the feedback from the previous round

Questions

Thank you.

For further questions, please contact Luz Bay: bay.luz@measuredprogress.org

Setting Achievement Levels for the 2011 NAEP Writing

Kevin Sweeney

June 21, 2013

Random Thoughts

• Standard Setting for any Writing Test

• Panelist Selection

• Use of Technology in Standard Setting

Overall comments

• Well Designed

• Well Executed

• Use of Technology

• NAEP is the exception to the rule with regard to resources allocated for standard setting

Standard Setting for Writing

• One of the harder content areas in which to set standards

• Few ‘items’

• Few student observations

• BoW methodology is a good choice here

• Selection of responses is critically important

• Consideration of bringing in other external data?

Panelist Selection

• Who is on the panel matters

• The more diverse the group, the more potential for a ‘problematic’ panel (or panelist)

• The consortia should pay attention to this

Use of Technology

• Always better to press a button than to use pen and paper

• Technology for technology's sake is generally a bad idea

• In this case, the technology helped with logistics, data processing, etc.

• Ease of use for the panelists?

• Information overload?

• Is the software general use or specific to this application?
