
UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT

Jan 19, 2016


shay

UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT. CHAP 14: ITEM ANALYSIS CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY CHAP 16: DETECTING ITEM BIAS. CHAPTER 14  ITEM ANALYSIS. * The goal of test construction is to create a test with minimum length and good reliability and validity. - PowerPoint PPT Presentation
Transcript
Page 1: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

UNIT IV: ITEM ANALYSIS IN TEST DEVELOPMENT

CHAP 14: ITEM ANALYSIS
CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY
CHAP 16: DETECTING ITEM BIAS

Page 2: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

CHAPTER 14 ITEM ANALYSIS

* The goal of test construction is to create a test with minimum length and good reliability and validity.
* Item Analysis is the computation and examination of any statistical property of an item response distribution.
* Item Analysis is a process that we go through when constructing a new test or subtests from a pool of items with good reliability and validity.

Page 3: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

CHAPTER 14 ITEM ANALYSIS
* Categories of Item Parameter
Item parameters fall into 3 categories of indices.

1. Indices that describe the distribution of responses to a single item (e.g. mean and variance of item responses).

2. Indices that describe the degree of relationship between the response to the item and some criterion of interest.

Ex. next

Page 4: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

CHAPTER 14 ITEM ANALYSIS
Ex. The relationship between the questions (items) and the criterion of interest, i.e., depression in Factor Analysis.

3. Indices that are a function of both, meaning relationship to item variance/mean and a criterion of interest.

Ex. First, find the variance/mean for your items, then calculate the relationship between these items' variance and the criterion of interest (i.e., depression) for two groups.
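The three categories of item indices can be illustrated with a small sketch. The data are made up, and the third index shown (item standard deviation times the item-criterion correlation, often called an item validity index in classical test theory) is an assumption about what the slide has in mind, not something the slides spell out.

```python
from statistics import mean, pstdev, pvariance


def pearson_r(x, y):
    """Pearson correlation; for a 0/1 item this is the point-biserial."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))


# Hypothetical data: 0/1 responses of 8 examinees to one item, and their
# scores on a made-up criterion measure (e.g. a depression scale).
item = [1, 0, 1, 1, 0, 1, 0, 1]
criterion = [14, 6, 12, 15, 7, 11, 9, 13]

# Category 1: distribution of responses to a single item.
item_mean = mean(item)
item_var = pvariance(item)

# Category 2: relationship between the item and a criterion of interest.
item_criterion_r = pearson_r(item, criterion)

# Category 3: a function of both (assumed here: item SD x correlation).
item_validity_index = pstdev(item) * item_criterion_r

print(item_mean, item_var)
print(item_criterion_r, item_validity_index)
```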

Page 5: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

ITEM DIFFICULTIES (P)

It is one of the 7 steps in Item Analysis. We use item difficulties to select the best items.

Page 6: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

ITEM DIFFICULTIES (P)
P = f/N, or number of examinees who answered an item correctly / total number of participants (see your midterm item analysis and Chap 5).

The higher the P value, the easier the item.
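A minimal sketch of P = f/N, assuming items are scored 1 (correct) or 0 (incorrect). The score matrix and the function name are hypothetical.

```python
def item_difficulty(responses):
    """P = f / N for one item; responses are 1 (correct) or 0 (incorrect)."""
    if not responses:
        raise ValueError("need at least one examinee")
    return sum(responses) / len(responses)


# Hypothetical scored responses: rows are examinees, columns are items.
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]

# zip(*scores) turns the rows into one tuple of responses per item.
p_values = [item_difficulty(column) for column in zip(*scores)]
print(p_values)  # [0.8, 0.6, 0.2, 0.8]
```

Here the third item (P = 0.2) is the hardest and the first and fourth (P = 0.8) are the easiest.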

Page 7: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT


Page 8: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

CHAPTER 14 ITEM ANALYSIS
* Steps in Item Analysis
In a typical item analysis the test developer will take 7 steps (they are similar to the process of test construction in Chapter 4).
Next slide

Page 9: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

FYI: PROCESS OF TEST CONSTRUCTION (CHAP IV)

1- Identifying the purposes of test score use
2- Identifying behaviors to represent the construct
3- Preparing test specifications, i.e., Bloom's Taxonomy
4- Item construction
5- Item review

Page 10: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

PROCESS OF TEST CONSTRUCTION

6- Preliminary item tryouts
7- Field test
8- Statistical analysis
9- Reliability and validity
10- Guidelines

Page 11: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

7 STEPS IN ITEM ANALYSIS (P)
1. Describe what properties of the test score are of greatest importance.
Ex. When I select questions for your midterm/final exam, I look for the similarity of the questions to those of qualifying/comprehensive or EPPP exams.

Page 12: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

7 STEPS IN ITEM ANALYSIS (P)

2. Identify the item parameters (e.g. mean, variance) most relevant to these properties.
3. Administer the items to a sample of examinees representative of those for whom the test is intended.
Ex. IQ test for children or depression test for adults.

Page 13: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

7 STEPS IN ITEM ANALYSIS (P)
4. Estimate for each item the parameters identified in step 2 (i.e., variance).
5. Establish a plan for item selection.
Ex. Using item difficulties (P), as in Item Analysis, to select the items.

Page 14: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

7 STEPS IN ITEM ANALYSIS (P)
6. Select the final subset of items, or use the data (items in your Item Analysis) for test revision.
Ex. Take out all questions with very high or very low item difficulties.
7. Conduct a cross-validation (validity) study.
Ex. Use SPSS and compare the results of 2 tests or 2 classes (e.g. this year's class and last year's class), i.e., Confirmatory Factor Analysis.
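Steps 5 and 6 can be sketched as a simple filter on item difficulties. The cut-offs of 0.2 and 0.8 are illustrative assumptions; the slides only say to remove items with "very high or very low" difficulties and do not give thresholds.

```python
def select_items(p_values, low=0.2, high=0.8):
    """Return indices of items whose difficulty P lies within [low, high].

    The default cut-offs are assumed for illustration only.
    """
    return [i for i, p in enumerate(p_values) if low <= p <= high]


# Hypothetical item difficulties estimated in step 4.
p_values = [0.95, 0.62, 0.10, 0.48, 0.81, 0.33]

kept = select_items(p_values)
print(kept)  # [1, 3, 5]
```

Items 0 and 4 are dropped as too easy and item 2 as too hard; the remaining items would then go to the cross-validation study in step 7.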

Page 15: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

UNIT V: TEST SCORING AND INTERPRETATION

CHAP 17: CORRECTING FOR GUESSING AND OTHER SCORING METHODS
CHAP 18: SETTING STANDARDS
CHAP 19: NORMS AND STANDARD SCORES
CHAP 20: EQUATING SCORES FROM DIFFERENT TESTS

Page 16: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

CHAPT 19: NORMS AND STANDARD SCORES

Page 17: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

1895
* Alfred Binet (1910)
Ratio IQ = ratio of MA/CA

NORMS AND STANDARD SCORES

Page 18: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

1912
In 1912 in Germany, Wilhelm Stern proposed the following formula: IQ = [Mental age / Chronological age] × 100 and standardized it.
This formula works fairly well for children but not for adults.
* The abbreviation "IQ" was coined by the psychologist William Stern for the German term Intelligenz-quotient.
Ratio IQ
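As a quick worked example of the ratio IQ formula (the ages are made up and the function name is hypothetical):

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (mental age / chronological age) x 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


print(ratio_iq(10, 8))  # 125.0: an 8-year-old performing like a 10-year-old
print(ratio_iq(8, 8))   # 100.0: mental age equals chronological age
```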

Page 19: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

1916
* 3. Lewis Terman from Stanford University publishes the Stanford-Binet Intelligence Test.
He used the standardized version:

IQ = [Mental age / Chronological age] × 100

NORMS AND STANDARD SCORES

Page 20: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMS AND STANDARD SCORES

* Deviation IQ = uses norms to estimate the IQ

We use norms when we want to compare an examinee's score (raw score) on a test to the distribution of scores (scaled or standard scores) for a sample from a well-defined population.
Ex. next

Page 21: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMS AND STANDARD SCORES
Ex. When we want to estimate the IQ of a 20-year-old person, we compare their raw score on the subtest of an IQ test with the scores of people their age, which is "their norm" (standard score). Using this technique tells us where they stand among the people of their age.
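The comparison with an age norm can be sketched as a z-score rescaled to an IQ metric. The mean of 100 and standard deviation of 15 are the common Wechsler-style convention and are an assumption here, as are the norm-group scores; the slides do not state the scale.

```python
from statistics import mean, stdev


def deviation_iq(raw_score, norm_scores, iq_mean=100.0, iq_sd=15.0):
    """Locate a raw score within its age norm group, then rescale.

    iq_mean and iq_sd are assumed conventions, not taken from the slides.
    """
    z = (raw_score - mean(norm_scores)) / stdev(norm_scores)
    return iq_mean + iq_sd * z


# Hypothetical raw scores of a norm sample of 20-year-olds.
norm_group = [38, 42, 45, 47, 50, 50, 53, 55, 58, 62]

print(deviation_iq(50, norm_group))  # 100.0, exactly at the norm-group mean
print(deviation_iq(58, norm_group))  # above 100: above average for this age
```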

Page 22: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

* 9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

1. Identify the population of interest.
Ex. Students, employees of a company, inmates, patients, etc.
2. Identify the most critical statistics that will be computed for the sample data.
Ex. Standard deviation σ, σ², M, SS, p

Page 23: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMS AND STANDARD SCORES
* 9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

3. Decide on the tolerable amount of sampling error.

That is the discrepancy between the sample statistic (M) and the population parameter (µ) (Central Tendency M = µ). The Central Limit Theorem has 3 characteristics:

1. Central Tendency 2. The Shape of the Distribution (normal) and 3. Variability or Standard Error of the Mean (σm).

Sampling error = M - µ
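A small simulation can make the first and third characteristics concrete. This is only a sketch: the uniform population, the sample size of 36, and the number of replications are arbitrary choices, and it relies on the standard result that the standard error of the mean is σ/√n.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of 20,000 scores (deliberately not normal).
population = [random.uniform(0, 100) for _ in range(20_000)]
mu, sigma = mean(population), pstdev(population)

n = 36  # size of each sample
sample_means = [mean(random.sample(population, n)) for _ in range(2_000)]

# 1. Central tendency: the mean of the sample means is close to mu.
print(mean(sample_means), mu)

# 3. Variability: the SD of the sample means is close to sigma / sqrt(n).
print(pstdev(sample_means), sigma / n ** 0.5)

# Sampling error of one particular sample: M - mu.
print(sample_means[0] - mu)
```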

Page 24: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

4. Devise a procedure for drawing a sample from the population of interest.

There are 4 types of probability sampling:

I. Simple Random Sampling: Give everyone in the population an equal chance to be selected. Ex. Draw names from a hat.

II. Systematic Sampling (N/n): Select every Kth name on the list. Ex. CAU population N = 1500 and your sample size n = 150; N/n = 1500/150 = 10. Select every 10th student.

Page 25: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

SAMPLING CONT.

III. Stratified Sampling: "Strata" means different layers. We use stratified sampling when we want to compare 2 different groups (e.g. male and female CAU doctoral students).

First we randomly select males, then randomly select females.

Page 26: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

SAMPLING CONT.

IV. Cluster Sampling: We use cluster sampling when the population consists of units, not individuals, such as classes. Ex. Miami-Dade school district. If we want to conduct research with the Miami-Dade 2nd graders (1000 2nd-grade classes), we'll randomly select about 10 of these 1000 2nd-grade classes to be in our sample, then we conduct the research.
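The four sampling procedures can be sketched with Python's standard library. All names, group sizes, and the random start in the systematic sample are illustrative assumptions; the population sizes mirror the examples on the slides.

```python
import random


def simple_random_sample(population, n, rng):
    """I. Every member has an equal chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """II. Take every k-th member, k = N // n, from a random start."""
    k = len(population) // n
    start = rng.randrange(k)
    return population[start::k][:n]


def stratified_sample(strata, n_per_stratum, rng):
    """III. Draw a separate random sample inside each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """IV. Randomly select whole units (e.g. classes), keep all members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


rng = random.Random(42)
students = [f"student_{i:04d}" for i in range(1500)]  # N = 1500

srs = simple_random_sample(students, 150, rng)
every_10th = systematic_sample(students, 150, rng)    # k = 1500 // 150 = 10
by_sex = stratified_sample(
    {"male": students[:700], "female": students[700:]}, 75, rng)
classes = {f"class_{c:04d}": [f"class_{c:04d}_pupil_{j}" for j in range(20)]
           for c in range(1000)}
ten_classes = cluster_sample(classes, 10, rng)

print(len(srs), len(every_10th), len(ten_classes))
```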

Page 27: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

5. Estimate the minimum sample size (n) required to hold the sampling error within the specified limits.

There are different statistical procedures to estimate (n). (n) should be ≥ 30 (law of large numbers).

1. n = (σ/d)², where d = effect size, d = (M - µ)/σ
2. n = (σ/σm)²

σm = σ/√n: standard error of the mean for a population. Ex. z score

Sm = S/√n: estimated standard error of the mean for a sample. Ex. t-distribution
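A worked example of the second procedure, n = (σ/σm)², together with the standard error formula it inverts. The values σ = 15 and a tolerable standard error of 1.5 are made up for illustration.

```python
import math


def min_sample_size(sigma, tolerable_se):
    """n = (sigma / sigma_m)^2, rounded up to a whole examinee."""
    return math.ceil((sigma / tolerable_se) ** 2)


def standard_error(sd, n):
    """sigma_m = sigma / sqrt(n); also S_m = S / sqrt(n) when sd is a sample SD."""
    return sd / math.sqrt(n)


n = min_sample_size(sigma=15, tolerable_se=1.5)
print(n)                      # 100
print(standard_error(15, n))  # 1.5, back at the tolerable standard error
```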

Page 28: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMS AND STANDARD SCORES

Page 29: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

THE EFFECT SIZE
EX. TWO INDEPENDENT-SAMPLES T-TEST

Page 30: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMS AND STANDARD SCORES

Page 31: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

6. Draw the sample and collect the data.
7. Compute the values of the group statistics of interest and their standard errors: Sm = S/√n or σm = σ/√n.

Calculate the standard error of the mean, which estimates the typical size of the difference between M and µ, also known as sampling error.

Page 32: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)

8. Identify the types of normative scores that will be needed, and prepare the normative score conversion table (see next 2 slides).

9. Prepare written documentation of the normative scores.

Page 33: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMS AND STANDARD SCORES
Types of Normative Scores

Raw Score: Score on a subtest or a test.

Scaled Score: Normative score for a specific age.
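A normative score conversion table of the kind described in step 8 might be sketched as follows. The norm-group data are made up, the percentile rank uses the usual midpoint definition, and the scaled-score metric (mean 10, SD 3, via a linear z-score transformation) is an assumed convention; published tests often derive scaled scores differently.

```python
from statistics import mean, stdev


def percentile_rank(raw, norm_scores):
    """Percent of the norm group below raw, plus half of those equal to it."""
    below = sum(s < raw for s in norm_scores)
    equal = sum(s == raw for s in norm_scores)
    return 100 * (below + 0.5 * equal) / len(norm_scores)


def scaled_score(raw, norm_scores, scale_mean=10, scale_sd=3):
    """Linear conversion to an assumed scaled-score metric (mean 10, SD 3)."""
    z = (raw - mean(norm_scores)) / stdev(norm_scores)
    return round(scale_mean + scale_sd * z)


# Hypothetical raw scores of one age group in the norming sample.
norm_group = [38, 42, 45, 47, 50, 50, 53, 55, 58, 62]

# Conversion table: raw score -> (percentile rank, scaled score).
table = {raw: (percentile_rank(raw, norm_group), scaled_score(raw, norm_group))
         for raw in sorted(set(norm_group))}

for raw, (pr, ss) in table.items():
    print(raw, pr, ss)
```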

Page 34: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMATIVE SCORES


Page 35: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

* NORMATIVE SCORES

Page 36: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT

NORMS AND STANDARD SCORES

* Usefulness of Scaled Scores

Scaled scores are useful for two purposes:

1. Scaled scores relate the examinee's performance to percentile rank scores of the norm group and their grade level.

2. In evaluation and research, the mean scaled score is a better estimate of average group performance than the mean raw score.

Page 37: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT


Page 38: UNIT IV  ITEM ANALYSIS  IN TEST DEVELOPMENT
