Item-writing Orientation & Review

Item-writing Orientation & Review. Quality Test Item-writing Evaluation Measurement Testing.

Jan 11, 2016

Agnes Pitts

Quality Test Item-writing

Evaluation

Measurement

Testing

Goal of quality item-writing in a nutshell:

Examinees should get an item:

Right - because they know the correct answer

Wrong - because they don’t know the correct answer.

1st Questions to Ask Yourself

Why am I testing?

How am I testing?

What results am I getting (or hoping to get)?

How am I going to use the results?

o What kind of interpretations do you want to make with the scores?

Bloom’s Taxonomy for the Cognitive Domain

[Figure: Bloom’s taxonomy diagram; labels: Cognitive Level, Essays, “Objective” Formats]

Multiple Choice Items

General Format of Multiple Choice Items

Stem: a question or incomplete sentence

Options:

A. Distracter

B. Distracter

C. Distracter

D. Correct or best answer (the “keyed” response)

E. Distracter

Item Technical Flaws

2 classes of flaws:

• Issues Related to Irrelevant Difficulty

• Issues Related to Testwiseness

Irrelevant Difficulty

Flaws related to irrelevant difficulty make the question difficult for reasons unrelated to the trait that is the focus of assessment.

Irrelevant Difficulty

Grammatical Inconsistencies:

one or more of the distracters fail to follow grammatically from the stem

Grammatical Inconsistencies

A 60-year-old alcoholic in status epilepticus is brought to the emergency department by the police. After ascertaining that the airway is open, the first step in management should be administration of:

A. examination of cerebrospinal fluid

B. glucose with vitamin B1 (thiamine)

C. CT scan of the head

D. phenytoin

E. diazepam

A testwise examinee would throw out A and C, since neither follows grammatically from “administration of”.

Irrelevant Difficulty

Options are long, complicated, or multifaceted

Peer review committees in HMOs may move to take action against a physician’s credentials to care for participants of the HMOs. There is an associated requirement to assure that the physician receives due process in the course of these activities. Due process must include which of the following?

A. Proper notice, a tribunal empowered to make the decision, a chance to confront witnesses against him/her, and a chance to present evidence in defense.

B. Notice, an impartial forum, council, a chance to hear and confront evidence against him/her.

C. Reasonable and timely notice, impartial panel empowered to make a decision, a chance to hear evidence against him/herself and to confront witnesses, and the ability to present evidence in defense.

This is actually a 5-option item but too long to get on one slide!

Irrelevant Difficulty

Numeric data are not stated consistently

Following a second episode of salpingitis, what is the likelihood that a woman is infertile?

A. 0 - 20%

B. 20 to 30%

C. Greater than 50%

D. 90% (greater than 50%)

E. 75% (greater than 50%)
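The inconsistency in the salpingitis item can be checked mechanically. The sketch below is only an illustration: the interval encoding of each option, the reading of "Greater than 50%" as 51 to 100 in whole percentages, and the function name `overlapping_pairs` are all assumptions made for this example, not part of the original slides.

```python
from itertools import combinations

# Hypothetical encoding of the five options as inclusive whole-percentage ranges.
OPTIONS = {
    "A": (0, 20),
    "B": (20, 30),
    "C": (51, 100),  # assumption: "Greater than 50%" read as 51-100
    "D": (90, 90),
    "E": (75, 75),
}


def overlapping_pairs(options):
    """Return the pairs of option labels whose ranges share at least one value."""
    pairs = []
    for (a, (a_lo, a_hi)), (b, (b_lo, b_hi)) in combinations(sorted(options.items()), 2):
        # Two inclusive ranges overlap when each starts before the other ends.
        if a_lo <= b_hi and b_lo <= a_hi:
            pairs.append((a, b))
    return pairs


print(overlapping_pairs(OPTIONS))
```

Under this encoding, A and B share the value 20%, and D and E both fall inside C, so more than one option can be defended as correct.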

Irrelevant Difficulty

Frequency terms in the options are vague (e.g., often, rarely, usually)

Severe obesity in early adolescence:

A. usually responds dramatically to dietary regimens

B. often is related to endocrine disorders

C. has a 75% chance of clearing spontaneously

D. shows a poor prognosis

E. usually responds to pharmacotherapy and intensive psychotherapy

Irrelevant Difficulty

“None of the above” or “all of the above” is used as an option

The diagnosis of a large ovarian cyst is most strongly suggested by an:

A. anterior dullness, lateral tympany

B. decreased peristalsis

C. fluid wave

D. shifting dullness

E. none of the above

Essentially turns this into a multiple true/false item

Irrelevant Difficulty

Stems are tricky or unnecessarily complicated

Arrange the parents of the following children with Down’s syndrome in order of highest to lowest risk of recurrence. Assume that the maternal age in all cases is within 5 years. The karyotypes of the daughters are:

I: 46, XX, -14, +T (14q21q) pat

II: 46, XX, -14, +T (14q21q) de novo

III: 46, XX, -14, +T (14q21q) mat

IV: 46, XX, -21, +T (14q21q) pat

V: 47, XX, -21, +T (21q21q) (parents not karyotyped)

A. III, IV, I, V, II

B. IV, III, V, I, II

C. III, I, IV, V, II

D. IV, III, I, V, II

E. III, IV, I, II, V

Testwiseness

The probability of answering a question correctly should relate to the examinee’s amount of expertise on the topic being assessed and should not relate to their expertise on test-taking strategies.

Flaws related to testwiseness make it easier for some students to answer the question correctly, based on their test-taking skills alone.

These flaws commonly occur in items that are unfocused and do not satisfy the “cover-the-options” rule.

Testwise examinees work to eliminate item options to increase their odds of guessing the correct response.

Testwise students are aware that…

The Correct Answer is often:

• Longer than the incorrect options

• More qualified or more general

• Written using familiar phraseology

• More grammatically consistent with the item stem

• 1 of the 2 similar statements

• 1 of the 2 opposite statements

Remember to use their testwiseness against them! Use their awareness of these tendencies when writing the WRONG answers.

Testwise students are aware that…

A Wrong Answer often:

• is the first or last option

• contains extreme words (always, never, nonsense, etc.)

• contains unexpected language or technical terms

• contains flippant remarks or completely unreasonable statements

Remember to use their testwiseness against them! Use their awareness of these tendencies when writing the RIGHT answers.

Testwiseness

Logical Cues:

a subset of the options are collectively exhaustive

Logic Cues

Crime is:

A. equally distributed among the social classes

B. overrepresented among the poor

C. overrepresented among the middle class and the rich

D. primarily an indication of psychosexual maladjustment

E. reaching a plateau of tolerability for the nation

A, B, & C together cover all the possibilities (they are collectively exhaustive), so D & E can be thrown out. A is unlikely because few social measures are distributed equally across all social classes.

Testwiseness

Absolute Terms:

terms such as “always” or “never” are used in the options

Absolute Terms

In patients with advanced dementia, Alzheimer’s type, the memory defect

A. can be treated adequately with phosphatidylcholine (lecithin)

B. could be a sequela of early parkinsonism

C. is never seen in patients with neurofibrillary tangles at autopsy

D. is never severe

E. possibly involves the cholinergic system

Testwiseness

Long Correct Answer:

correct answer is longer, more specific, or more complete than other options

Long Correct Answer

Secondary gain is:

A. synonymous with malingering

B. a problem in obsessive-compulsive disorder

C. a complication of a variety of illnesses and tends to prolong many of them

D. never seen in organic brain damage

Testwiseness

Word Repeats:

a word or phrase is included in the stem and in the correct answer

Word Repeats

A 58-year-old man with a history of heavy alcohol use and previous psychiatric hospitalization is confused and agitated. He speaks of experiencing the world as unreal. This symptom is called:

A. derealization

B. depersonalization

C. derailment

D. focal memory deficit

E. signal anxiety

Testwiseness

Convergence Strategy:

the correct answer includes the most elements in common with the other options

Convergence Strategy

Local anesthetics are most effective in the:

A. anionic form, acting from inside the nerve membrane

B. cationic form, acting from inside the nerve membrane

C. cationic form, acting from outside the nerve membrane

D. uncharged form, acting from inside the nerve membrane

E. uncharged form, acting from outside the nerve membrane

Since 3 of the 5 options involve a charge, 2 of those 3 are cationic, and 3 of the 5 act from inside the membrane, testwise examinees will pick “B”.
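The convergence strategy can be imitated with a simple count. The sketch below is an illustration only: the hand-coded feature tuples (including the extra "charged" / "not charged" feature), the scoring rule, and the name `convergence_scores` are assumptions made for this example. It scores each option by how many options share each of its features, which is one rough way to model what a testwise examinee does by eye.

```python
from collections import Counter

# Hypothetical hand-coding of the local-anesthetics options as
# (charge class, form, site of action).
OPTIONS = {
    "A": ("charged", "anionic", "inside"),
    "B": ("charged", "cationic", "inside"),
    "C": ("charged", "cationic", "outside"),
    "D": ("not charged", "uncharged", "inside"),
    "E": ("not charged", "uncharged", "outside"),
}


def convergence_scores(options):
    """Score each option by how common each of its features is across all options."""
    n_features = len(next(iter(options.values())))
    # One frequency table per feature position.
    counts = [Counter(feats[i] for feats in options.values()) for i in range(n_features)]
    return {
        label: sum(counts[i][value] for i, value in enumerate(feats))
        for label, feats in options.items()
    }


scores = convergence_scores(OPTIONS)
guess = max(scores, key=scores.get)
print(scores, guess)
```

If one option clearly outscores the rest, as B does here, the item is giving away its answer; varying the distracters so that no option shares the most features removes the cue.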

General Guidelines for Multiple Choice Item Construction

Make sure the item can be answered without looking at the options.

Include as much of the item as possible in the stem - the stems should be long and the options short.

Avoid superfluous information.

Avoid “tricky” and overly complex items.

Write options that are grammatically consistent and logically compatible with the stem; list them in logical or alphabetical order.

Write distracters that are plausible and about the same length as the answer.

Avoid using absolutes such as always, never, and all in the options. Also avoid using vague terms such as usually and frequently.

Avoid negatively phrased items (those with except or not in the lead-in). If you must use a negative stem, use only short (preferably single word) options.

Focus on important concepts; don’t waste time testing trivial facts.
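Some of these guidelines lend themselves to an automated first pass before human review. The sketch below is a rough illustration, not a validated tool: the word lists, the 1.5 length ratio, and the name `lint_options` are assumptions chosen for this example. It is run on the “Secondary gain” item from the Long Correct Answer slide, assuming C is the keyed answer, as that slide’s title implies.

```python
import re

# Assumed word lists; extend them to suit your own item bank.
ABSOLUTE_TERMS = {"always", "never", "all", "none"}
VAGUE_TERMS = {"usually", "frequently", "often", "rarely"}
LENGTH_RATIO = 1.5  # assumed threshold for "much longer"


def lint_options(options, key):
    """Return warnings for absolute terms, vague terms, and an overlong keyed answer."""
    warnings = []
    for label, text in options.items():
        lowered = text.lower()
        if "none of the above" in lowered or "all of the above" in lowered:
            warnings.append(f"{label}: uses 'none/all of the above'")
            continue
        words = set(re.findall(r"[a-z]+", lowered))
        for term in sorted(words & ABSOLUTE_TERMS):
            warnings.append(f"{label}: absolute term '{term}'")
        for term in sorted(words & VAGUE_TERMS):
            warnings.append(f"{label}: vague frequency term '{term}'")
    # Compare the keyed answer's length with the longest distracter.
    longest_distracter = max(
        (len(text) for label, text in options.items() if label != key), default=0
    )
    if longest_distracter and len(options[key]) > LENGTH_RATIO * longest_distracter:
        warnings.append(f"{key}: keyed answer is much longer than every distracter")
    return warnings


SECONDARY_GAIN = {
    "A": "synonymous with malingering",
    "B": "a problem in obsessive-compulsive disorder",
    "C": "a complication of a variety of illnesses and tends to prolong many of them",
    "D": "never seen in organic brain damage",
}

for warning in lint_options(SECONDARY_GAIN, key="C"):
    print(warning)
```

A check like this only catches surface cues; it cannot judge plausibility, grammatical fit with the stem, or whether the item tests an important concept.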

Evaluating Item Characteristics

Index of Difficulty

Index of Discrimination

Index of Difficulty

The proportion of the group of examinees who answered the item correctly (the p-value). The larger the value, the easier the item.

• Usually expressed in decimal form (range of 0 to 1).

• Is not determined solely by the content of the item, but also reflects the ability of the group responding to that item.
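As a minimal sketch of the calculation, assuming each examinee’s response to the item is already scored 1 (correct) or 0 (incorrect); the function name `difficulty_index` and the sample data are made up for illustration:

```python
def difficulty_index(item_scores):
    """Proportion of examinees who answered the item correctly (the p-value)."""
    if not item_scores:
        raise ValueError("need at least one examinee")
    return sum(item_scores) / len(item_scores)


# The same item given to two hypothetical groups of 10 examinees.
stronger_group = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
weaker_group = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print(difficulty_index(stronger_group))
print(difficulty_index(weaker_group))
```

The two groups produce different values for the same item, which is the point of the second bullet: the index describes the item as taken by a particular group.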

Index of Discrimination

The correlation between the scores on a particular item and the total score on the exam. If a large proportion of the high scoring examinees get an item correct, and a small proportion of the low scoring examinees get it right, that item has discriminated properly and has contributed to the test purpose.

• Usually expressed as a correlation coefficient (range of -1.0 to +1.0)
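One common way to compute such an index is the Pearson correlation between the 0/1 item scores and the total exam scores (the point-biserial correlation). The sketch below assumes that approach and uses made-up data; the function names are illustrative, and the total score here includes the item itself, which some analyses correct for.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Everyone got the same item score (or total), so no correlation is defined.
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)


def discrimination_index(item_scores, total_scores):
    """Item-total correlation: 0/1 item scores against total exam scores."""
    return pearson(item_scores, total_scores)


# Hypothetical data: 10 examinees, ordered from highest to lowest total score.
totals = [48, 45, 44, 40, 38, 35, 30, 28, 25, 20]
item = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

print(round(discrimination_index(item, totals), 2))
```

In this made-up data the high scorers mostly get the item right and the low scorers mostly get it wrong, so the index is positive. A negative value would mean low scorers did better on the item than high scorers, which usually signals a miskeyed or flawed item.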

Ideal range for item difficulty

Discrimination is closely related to difficulty. Items that are too hard or too easy are not as capable of discriminating between high and low achievers as items of moderate difficulty.

Moderate difficulty is generally identified with index scores half-way between the perfect score and the chance score.

For a 5-option multiple choice item:

Perfect score: 1.0

Chance score: 0.20 (1 in 5)

Moderate difficulty score: 0.60
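The half-way rule generalizes to items with other numbers of options. A minimal sketch, assuming the chance score is simply 1 divided by the number of options (the function name is illustrative):

```python
def moderate_difficulty(n_options, perfect=1.0):
    """Midpoint between the perfect score and the chance score for an item."""
    if n_options < 2:
        raise ValueError("a multiple choice item needs at least 2 options")
    chance = 1.0 / n_options  # probability of a blind guess being correct
    return (perfect + chance) / 2


for n in (2, 3, 4, 5):
    print(n, round(moderate_difficulty(n), 3))
```

For 5 options this gives (1.0 + 0.20) / 2 = 0.60, matching the slide; for 4 options it gives 0.625, and for a true/false item 0.75.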

Ideal Range for a discrimination index

o The index of discrimination can be used in the selection of the best (most highly discriminating) items for inclusion on the exam.

o According to Ebel and Frisbie (1991), the following standards should be used:

Index score      Item Evaluation
0.40 and up      Very good items
0.30 to 0.39     Reasonably good
0.20 to 0.29     Marginal items - could be improved
Below 0.19       Poor items - should be rejected or revised
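These standards can be applied as a simple lookup. The sketch below is an illustration with one assumption worth flagging: the table leaves values between 0.19 and 0.20 unassigned, and this example treats everything below 0.20 as a poor item. The function name `evaluate_item` is made up for this example.

```python
def evaluate_item(discrimination):
    """Map a discrimination index to the evaluation bands listed above."""
    if discrimination >= 0.40:
        return "Very good item"
    if discrimination >= 0.30:
        return "Reasonably good"
    if discrimination >= 0.20:
        return "Marginal item - could be improved"
    # Assumption: anything under 0.20 (including negative values) counts as poor.
    return "Poor item - should be rejected or revised"


for value in (0.45, 0.33, 0.21, 0.05, -0.10):
    print(value, evaluate_item(value))
```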

Try to predict item analysis stats…

Difficulty index

Discrimination index

Statistically, items of “medium difficulty” have the best chance of discriminating well.

Medium difficulty: For every 10 examinees, 6-7 get the question right

So, think about how many WILL (not SHOULD) get it right! Know your audience!

Too easy is no good…

Too hard is no good…

Problem: right difficulty range but STILL doesn’t discriminate
