
Making Appropriate Pass-Fail Decisions. Dwight Harley, Ph.D., Division of Studies in Medical Education, University of Alberta.

Dec 14, 2015


Ally Dillard

Page 3: Passing Scores

Essential component of high-stakes exams
Reaffirm standards
Their purpose is to ensure that
  qualified candidates pass
  unqualified candidates do not pass
How much is enough?
Is 50% the passing score on this exam?

Page 4: Reaffirming Standards

Performance standard
  Minimally adequate level of performance to enter practice
Passing score
  Point on the score scale which separates those who are successful and those who are not

Page 5: The Basis for Passing Scores

Arbitrary judgment unavoidable
Reflect consensus of experts on reasonable expectations for evidence of competence
Imposing discrete categories on a continuum
Set to serve the interests of public and profession
Process should be as open as possible
Based on as much relevant data as possible
Rationale presented as clearly as possible

Page 6: Process of Setting Passing Scores

Unreasonable to expect 100% correct
Possible to construct tests with predetermined passing scores
Possible to adjust passing scores to achieve an acceptable pass rate
Possible to estimate a minimum passing score by combining estimates of the importance of individual test items

Page 7: Passing Score Level

Determined by the situation and purpose
Provide society with enough sufficiently competent practitioners
Raising the passing score increases the average competence of those who pass but decreases their number
Proportions passing should remain constant
The more relevant and demanding the requirements for writing the test, the fewer are expected to fail
If more than a small proportion of successful candidates fail the exam, its validity may be subject to serious challenge.
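The trade-off between the level of the passing score and the number who pass can be illustrated with a small sketch. The candidate scores below are hypothetical (they do not come from the slides), and the mean test score of those who pass is used only as a rough stand-in for "average competence".

```python
from statistics import mean


def passing_summary(scores, passing_score):
    """Return (number who pass, mean score of those who pass)."""
    passers = [s for s in scores if s >= passing_score]
    return len(passers), (mean(passers) if passers else None)


# Hypothetical scores out of 5 for ten candidates (not from the slides).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

# Summarize the outcome for three candidate passing scores.
summaries = {cut: passing_summary(scores, cut) for cut in (2, 3, 4)}
for cut, (n_pass, avg) in summaries.items():
    print(f"passing score {cut}: {n_pass} pass, mean score of passers {avg:.2f}")
```

With these illustrative scores, each increase in the passing score leaves fewer passers with a higher mean score.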

Page 8: Criteria for Defensibility

A standard setting method should …
  produce appropriate classification information
  be sensitive to candidate performance
  be sensitive to instruction
  be statistically sound
  identify the “true” standard
  be easy to implement and compute
  be credible and easily interpretable by lay people

Page 9: Standard Setting Methods

More than 3 dozen methods
Some of the better known methods include
  Nedelsky
  Angoff
  Bookmark
  Ebel
  Jaeger
  IRT methods

Page 10: “The Industry Standard”

The Angoff Method is:
  the most commonly used method
  convenient to use
  well-researched
  easily explained
  easily customized
  applicable to several response formats

Page 11: Angoff Method

Judges assign probabilities that a hypothetical minimally competent borderline candidate will be able to answer each item correctly.
For each judge, probabilities are summed to get a minimum performance level (MPL)
MPLs are averaged to get a final passing score
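The two steps above can be written compactly. The notation is an assumption introduced here for illustration, not taken from the slides: p_{ij} is the probability judge j assigns to item i, n is the number of items, and J is the number of judges.

```latex
\mathrm{MPL}_j = \sum_{i=1}^{n} p_{ij},
\qquad
\text{passing score} = \frac{1}{J} \sum_{j=1}^{J} \mathrm{MPL}_j
```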

Page 12: Minimally Competent

The effectiveness of the Angoff method rests on the judges’ ability to accurately conceptualize a “minimally competent, borderline candidate.”
Repeated references to a formal summary of the behaviours and performance indicators are required
Judge training and calibration are essential

Page 13: Angoff Calculations

Item     Judge 1   Judge 2
1        1.00      0.85
2        0.65      0.50
3        0.80      0.75
4        0.45      0.50
5        0.30      0.40
MPL_j    3.2       3.0

Passing score for this test is 3.1 items correct out of 5.
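The calculation on this slide can be reproduced with a short sketch. The ratings are the values from the table; the function and variable names are illustrative only.

```python
# Probabilities from the slide's table: one list per judge, one entry per item.
ratings = {
    "Judge 1": [1.00, 0.65, 0.80, 0.45, 0.30],
    "Judge 2": [0.85, 0.50, 0.75, 0.50, 0.40],
}


def angoff_passing_score(ratings):
    """Sum each judge's probabilities to get an MPL, then average the MPLs."""
    mpls = {judge: sum(probs) for judge, probs in ratings.items()}
    passing_score = sum(mpls.values()) / len(mpls)
    return mpls, passing_score


mpls, passing_score = angoff_passing_score(ratings)
# Round for display only; floating-point sums carry tiny errors.
print({judge: round(mpl, 2) for judge, mpl in mpls.items()})  # {'Judge 1': 3.2, 'Judge 2': 3.0}
print(round(passing_score, 2))  # 3.1
```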

Page 14: A Minor Variant

Judges are asked to imagine a pool of 100 minimally competent borderline students and then estimate the number of these students who would answer the item correctly
Reduces cognitive complexity of the task
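A count out of 100 converts directly to an Angoff probability. In the sketch below the counts are an assumption for illustration: they restate Judge 1's probabilities from the Angoff Calculations slide as counts. The function name is also illustrative.

```python
POOL_SIZE = 100  # size of the imagined pool of borderline students


def mpl_from_counts(counts, pool_size=POOL_SIZE):
    """Convert per-item counts of students answering correctly into an MPL."""
    for count in counts:
        if not 0 <= count <= pool_size:
            raise ValueError(f"count {count} is outside 0..{pool_size}")
    return sum(counts) / pool_size


# Assumed example: Judge 1's earlier probabilities expressed as counts out of 100.
counts = [100, 65, 80, 45, 30]
mpl = mpl_from_counts(counts)
print(mpl)  # 3.2
```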

Page 15: Variations on a Theme

Scales
Iterative process
Feedback between rounds
  Judges’ results
  Past item performance
    p-values
    % passing
Yes/No procedure

Page 16: Scales

Probability scales are sometimes provided to simplify the process. For example:
  5%, 20%, 40%, 60%, 75%, 90%, 95%
  0%, 5%, 10%, 15% … 95%, 100%
  20%, 25%, 30% … 95%, 100%
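In practice judges would pick a value from the scale directly. As a minimal sketch of what restricting estimates to a scale means, the code below maps a free-form percentage onto the nearest point of the first scale listed above. The function name and the tie-breaking rule (the lower value wins) are assumptions, not part of the slides.

```python
# The first example scale from the slide, in percent.
SCALE = [5, 20, 40, 60, 75, 90, 95]


def snap_to_scale(estimate_percent, scale=SCALE):
    """Return the scale point closest to the estimate (ties go to the lower point)."""
    return min(scale, key=lambda point: abs(point - estimate_percent))


print(snap_to_scale(68))   # 75
print(snap_to_scale(3))    # 5
print(snap_to_scale(100))  # 95
```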

Page 17: Angoff with Iteration

Most commonly used modification.
“Angoff-ing” is done a number of times.
Time between rounds is used for discussion among judges.
Intent is to reduce variability among judges on item estimates.
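One possible way to see where judges disagree between rounds is to compute the spread of estimates for each item. The sketch below treats the ratings from the Angoff Calculations slide as a first round. The 0.10 threshold for flagging an item for discussion is a hypothetical choice, not something the slides prescribe.

```python
# Round 1 estimates taken from the Angoff Calculations slide.
ratings = {
    "Judge 1": [1.00, 0.65, 0.80, 0.45, 0.30],
    "Judge 2": [0.85, 0.50, 0.75, 0.50, 0.40],
}

DISCUSSION_THRESHOLD = 0.10  # hypothetical cut-off for "judges disagree"


def item_spreads(ratings):
    """Return, for each item, the gap between the highest and lowest estimate."""
    per_item = zip(*ratings.values())  # regroup estimates item by item
    # Round to 2 decimals so floating-point noise does not affect comparisons.
    return [round(max(values) - min(values), 2) for values in per_item]


spreads = item_spreads(ratings)
flagged = [item for item, spread in enumerate(spreads, start=1)
           if spread > DISCUSSION_THRESHOLD]
print(spreads)  # [0.15, 0.15, 0.05, 0.05, 0.1]
print(flagged)  # [1, 2]
```

Rerunning the same calculation after each round would show whether the discussion is reducing the spread.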

Page 18: Normative Data

Normative or impact data is presented just prior to the final iteration.
Improves inter-rater reliability.
Greatest impact on items that have been greatly over- or underestimated.
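A rough sketch of how normative data might be set beside the judges' estimates: the mean estimate for each item is compared with the item's p-value (the proportion of past candidates who answered it correctly). The estimates are those from the Angoff Calculations slide. The p-values and the 0.20 flagging threshold are hypothetical. Because p-values describe all candidates rather than borderline ones, a gap is only a prompt for discussion, not proof of an error.

```python
# Judges' estimates from the Angoff Calculations slide.
ratings = {
    "Judge 1": [1.00, 0.65, 0.80, 0.45, 0.30],
    "Judge 2": [0.85, 0.50, 0.75, 0.50, 0.40],
}

# Hypothetical p-values from a past administration (not from the slides).
p_values = [0.95, 0.60, 0.50, 0.55, 0.45]

FLAG_THRESHOLD = 0.20  # hypothetical size of gap worth showing to judges


def estimate_gaps(ratings, p_values):
    """Return mean judge estimate minus p-value for each item."""
    mean_estimates = [sum(values) / len(values) for values in zip(*ratings.values())]
    return [round(est - p, 3) for est, p in zip(mean_estimates, p_values)]


gaps = estimate_gaps(ratings, p_values)
# A positive gap means the judges' estimate is above the observed p-value.
flagged = {item: gap for item, gap in enumerate(gaps, start=1)
           if abs(gap) > FLAG_THRESHOLD}
print(flagged)
```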

Page 19: Yes/No Procedure

Judges decide whether a single minimally competent borderline student would answer the item correctly
Attempt to simplify the cognitive complexity of the judges’ task
Comparable results to the traditional method

Page 20: Yes/No Calculations

Item     Judge 1   Judge 2
1        1         1
2        1         0
3        1         1
4        0         0
5        0         0
MPL_j    3         2

Passing score = Average of MPLs = (3 + 2)/2 = 2.5 items correct
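The same arithmetic as a short sketch, with each judgment stored as a boolean. The data are the values from the table; the names are illustrative.

```python
# Yes/No judgments from the slide's table (True = borderline student answers correctly).
votes = {
    "Judge 1": [True, True, True, False, False],
    "Judge 2": [True, False, True, False, False],
}

# Each judge's MPL is the number of "yes" judgments (True counts as 1 when summed).
mpls = {judge: sum(answers) for judge, answers in votes.items()}

# The passing score is the average of the judges' MPLs.
passing_score = sum(mpls.values()) / len(mpls)

print(mpls)           # {'Judge 1': 3, 'Judge 2': 2}
print(passing_score)  # 2.5
```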

Page 21: In an Emergency

When a committee is not available, Angoff-ing can be done solo
Assign Angoff values to each item and sum the values
Ask a colleague to review your Angoff assignments
Use an item analysis as a reality check

Page 22: Rounding Passing Scores

Rarely do derived passing scores produce exact whole numbers
Rounding may have an impact on the pass/fail rate
Consider the consequences of rounding
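To make the effect of rounding concrete, the sketch below takes the derived passing score of 3.1 from the Angoff Calculations slide and compares the pass rate when it is rounded down versus rounded up. The candidate scores are hypothetical, and a candidate is assumed to pass when the score is at or above the passing score.

```python
import math


def pass_rate(scores, passing_score):
    """Proportion of candidates scoring at or above the passing score."""
    return sum(score >= passing_score for score in scores) / len(scores)


derived_passing_score = 3.1  # from the Angoff Calculations slide

# Hypothetical whole-number scores out of 5 for ten candidates (not from the slides).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

rate_rounded_down = pass_rate(scores, math.floor(derived_passing_score))  # passing score 3
rate_rounded_up = pass_rate(scores, math.ceil(derived_passing_score))     # passing score 4

print(rate_rounded_down)  # 0.7
print(rate_rounded_up)    # 0.4
```

With whole-number scores, applying 3.1 unrounded gives the same result as rounding up to 4, so in this illustration rounding down is the choice that changes who passes.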

Page 23: Questions?
