    Item Analysis

    Purpose of Item Analysis

    Evaluates the quality of each item

    Rationale: the quality of the items determines the quality of the test (i.e., reliability & validity)

    May suggest ways of improving the measurement of a test

    Can help with understanding why certain tests predict some criteria but not others

    Item Analysis

    When analyzing the test items, we have several questions about the performance of each item. Some of these questions include:

    Are the items congruent with the test objectives?

    Are the items valid? Do they measure what they're supposed to measure?

    Are the items reliable? Do they measure consistently?

    How long does it take an examinee to complete each item?

    What items are most difficult to answer correctly?

    What items are easy?

    Are there any poor-performing items that need to be discarded?

    Types of Item Analyses for CTT

    Three major types:

    1. Assess the quality of the distractors

    2. Assess the difficulty of the items

    3. Assess how well an item differentiates between high and low performers

    A. Multiple-Choke

    B. Multiply-Choice

    C. Multiple-Choice

    D. Multi-Choice

    DISTRACTOR ANALYSIS

    Distractor Analysis

    First question of item analysis: How many people choose each response?

    If there is only one best response, then all other response options are distractors.

    Example from an in-class assignment (N = 35):

    Which method has the best internal consistency?    #
    a) projective test                                 1
    b) peer ratings                                    1
    c) forced choice                                  21
    d) differences n.s.                               12
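
    A tally like the one above is easy to produce from raw answer sheets. A minimal sketch in Python (the response letters and counts are the illustrative data from this example):

    from collections import Counter

    # Raw responses from 35 examinees (illustrative data matching the tallies above)
    responses = ["c"] * 21 + ["d"] * 12 + ["a"] + ["b"]

    counts = Counter(responses)
    for option in "abcd":
        print(option, counts[option])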

    Distractor Analysis (cont.)

    A perfect test item would have 2 characteristics:

    1. Everyone who knows the item gets it right.

    2. People who do not know the item will have responses equally distributed across the wrong answers.

    It is not desirable to have one of the distractors chosen more often than the correct answer. This result indicates a potential problem with the question: the distractor may be too similar to the correct answer, and/or there may be something misleading in either the stem or the alternatives.

    Distractor Analysis (cont.)

    Calculate the number of people expected to choose each of the distractors. If those answering incorrectly respond at random, the same number is expected for each wrong response (Figure 10-1):

    Expected number choosing each distractor = (N answering incorrectly) / (number of distractors) = 14 / 3 = 4.7
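
    Under this random-guessing assumption, the expected count is uniform across distractors, and observed counts can be compared against it. A minimal sketch (the observed counts come from the earlier example; the "popular" flag is an illustrative check, not from the slides):

    # Expected count per distractor if wrong answers were chosen at random
    n_incorrect = 14       # examinees who missed the item
    n_distractors = 3      # wrong options on the item
    expected = n_incorrect / n_distractors   # about 4.7

    # Observed counts for the three distractors from the earlier example
    observed = {"a": 1, "b": 1, "d": 12}
    for option, count in observed.items():
        print(option, count, "popular" if count > expected else "ok")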

    Distractor Analysis (cont.)

    When the number of persons choosing a distractor significantly exceeds the number expected, there are 2 possibilities:

    1. The choice may reflect partial knowledge

    2. The item is a poorly worded trick question

    An unpopular distractor may lower item and test difficulty because it is easily eliminated; an extremely popular distractor is likely to lower the reliability and validity of the test.

    Item Difficulty Analysis

    Description and How to Compute

    ex: a) (6 X 3) + 4 = ?
        b) 9[ln(-3.68) X (1 - ln(+3.68))] = ?

    It is often difficult to explain or define difficulty in terms of some intrinsic characteristic of the item. The only common thread among difficult items is that individuals did not know the answer.

    Item Difficulty

    Percentage of test takers who respond correctly

    What if p = .00?

    What if p = 1.00?

    Item Difficulty

    An item with a p value of .0 or 1.0 does not contribute to measuring individual differences and thus is certain to be useless.

    When comparing 2 test scores, we are interested in who had the higher score, or in the differences in scores.

    Items with a p value near .5 have the most variation, so seek items in this range and remove those with extreme values.

    p values can also be examined to determine the proportion answering in a particular way for items that don't have a correct answer.
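
    Computing p values from scored responses is straightforward. A minimal sketch, assuming a small 0/1 score matrix (examinees by items) made up for illustration:

    # Rows = examinees, columns = items; 1 = correct, 0 = incorrect (illustrative data)
    scores = [
        [1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [0, 1, 0, 1],
        [1, 0, 0, 1],
    ]

    n = len(scores)
    p_values = [sum(row[j] for row in scores) / n for j in range(len(scores[0]))]
    print(p_values)   # [0.8, 0.6, 0.2, 1.0] -- the last item measures no individual differences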

    Item Difficulty (cont.)

    What is the best p-value?

    The optimal p-value = .50

    Maximum discrimination between good and poor performers

    Should we only choose items of .50?

    When shouldn't we?

    Should we only choose items of .50?

    Not necessarily ...

    When wanting to screen the very top group of applicants (e.g., admission to university or medical school):

    Cutoffs may be much higher

    Other institutions want a minimum level (e.g., a minimum reading level):

    Cutoffs may be much lower

    Item Difficulty (cont.)

    Interpreting the p-value...

    example:

    100 people take a test
    15 got question 1 right

    What is the p-value?

    Is this an easy or hard item?
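
    Working it out: p = 15/100 = .15. By the rules on the later slide (p < .20), this is a difficult item.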

    Item Difficulty (cont.)

    Interpreting the p-value...

    example:

    100 people take a test

    70 got question 1 right

    What is the p-value?

    Is this an easy or hard item?
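
    Working it out: p = 70/100 = .70, which falls in the moderate (.20 - .80) range, toward the easy end.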

    Item Difficulty (cont.)

    General Rules of Item Difficulty

    p low (< .20): difficult test item

    p moderate (.20 - .80): moderately difficult item

    p high (> .80): easy item
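
    These cutoffs translate directly into code. A minimal sketch (the boundaries are the conventions above; the function name is just for illustration):

    def difficulty_label(p):
        # Conventional cutoffs from the slide: < .20 difficult, .20-.80 moderate, > .80 easy
        if p < 0.20:
            return "difficult"
        if p <= 0.80:
            return "moderately difficult"
        return "easy"

    print(difficulty_label(0.15))   # difficult
    print(difficulty_label(0.70))   # moderately difficult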

    ITEM DISCRIMINATION

    ... the extent to which an item differentiates people on the behavior that the test is designed to assess.

    The computed difference between the percentage of high achievers and the percentage of low achievers who got the item right.

    Item Discrimination (cont.)

    Compares the performance of the upper group (with high test scores) and the lower group (with low test scores) on each item: the % of test takers in each group who answered correctly.

    Item Discrimination (cont.):

    Discrimination Index (D)

    Divide the sample into TOP half and BOTTOM half (or TOP and BOTTOM third)

    Compute the Discrimination Index (D)

    Item Discrimination

    D = U - L

    U = (# in the upper group with a correct response) / (total # in the upper group)

    L = (# in the lower group with a correct response) / (total # in the lower group)

    The higher the value of D, the more adequately the item discriminates (the highest value is 1.0).
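
    A minimal sketch of D for a single item, assuming 0/1 item scores already sorted from highest to lowest total test score (the data and the half-split are illustrative):

    # 0/1 scores on one item, ordered from highest to lowest total test score
    item_scores = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]   # illustrative data

    half = len(item_scores) // 2
    upper, lower = item_scores[:half], item_scores[half:]

    U = sum(upper) / len(upper)   # proportion correct in the upper group
    L = sum(lower) / len(lower)   # proportion correct in the lower group
    D = U - L
    print(round(D, 2))   # 0.8 - 0.4 = 0.4 here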

    Item Discrimination

    Seek items with high positive numbers (those who do well on the test tend to get the item correct).

    Negative numbers (lower scorers on the test are more likely to get the item correct) and low positive numbers (about the same proportion of low and high scorers get the item correct) don't discriminate well, and such items are discarded.

    Item Discrimination (cont.):

    Item-Total Correlation

    Correlation between each item (a correct response usually receives a score of 1 and an incorrect response a score of 0) and the total test score.

    To what degree do the item and the test measure the same thing?

    Positive: the item discriminates between high and low scores

    Near 0: the item does not discriminate between high & low

    Negative: scores on the item and scores on the test disagree
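
    A minimal sketch of the item-total correlation using numpy (the score matrix is illustrative; a "corrected" variant that excludes the item from its own total is also common but not shown here):

    import numpy as np

    # Rows = examinees, columns = items; 1 = correct, 0 = incorrect (illustrative data)
    scores = np.array([
        [1, 1, 0, 1],
        [1, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
    ])

    totals = scores.sum(axis=1)
    for j in range(scores.shape[1]):
        r = np.corrcoef(scores[:, j], totals)[0, 1]
        print(f"item {j + 1}: r = {r:.2f}")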

    Item Discrimination (cont.):

    Item-Total Correlation

    Item-total correlations are directly related to reliability.

    Why? Because the more each item correlates with the test as a whole, the higher all items correlate with each other (= higher alpha, internal consistency).
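
    The alpha the slide refers to can be computed directly from the same kind of score matrix. A minimal sketch of Cronbach's alpha, the standard internal-consistency coefficient (illustrative data):

    import numpy as np

    scores = np.array([          # rows = examinees, columns = items
        [1, 1, 0, 1],
        [1, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
    ])

    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)        # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of the total scores
    alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
    print(f"alpha = {alpha:.2f}")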

    Quantitative Item Analysis

    The inter-item correlation matrix displays the correlation of each item with every other item.

    It provides important information for increasing the test's internal consistency.

    Each item should be highly correlated with every other item measuring the same construct, and not correlated with items measuring a different construct.
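
    A minimal sketch of the inter-item correlation matrix with numpy (the same kind of illustrative score matrix as before):

    import numpy as np

    scores = np.array([          # rows = examinees, columns = items
        [1, 1, 0, 1],
        [1, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
    ])

    # rowvar=False treats each column (item) as a variable to correlate
    inter_item_r = np.corrcoef(scores, rowvar=False)
    print(np.round(inter_item_r, 2))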

    Quantitative Item Analysis

    Items that are not highly correlated with other items measuring the same construct can and should be dropped to increase internal consistency.

    Item Discrimination (cont.):

    Inter-item Correlation

    Possible causes for a low inter-item correlation:

    a. The item is badly written (revise)

    b. The item measures an attribute other than the rest of the test (discard)

    c. The item correlates with some items but not with others: the test measures 2 distinct attributes (split into subtests or subscales)