AN ITEM ANALYSIS ON DISCRIMINATING POWER
OF ENGLISH SUMMATIVE TEST
(A Case Study of Second Year of “SMPN 87” Pondok Pinang)
A “Skripsi”
Presented to the Faculty of Tarbiya and Teachers' Training
in Partial Fulfillment of the Requirements
for the Degree of S.Pd. (Bachelor of Arts) in English Language Education
By:
HIKMAH LESTARI
105014000379
DEPARTMENT OF ENGLISH EDUCATION
FACULTY OF TARBIYAH AND TEACHERS’ TRAINING
SYARIF HIDAYATULLAH STATE ISLAMIC UNIVERSITY
JAKARTA
2011
AN ITEM ANALYSIS ON DISCRIMINATING POWER
OF ENGLISH SUMMATIVE TEST
(A Case Study of Second Year of “SMPN 87” Pondok Pinang)
A “Skripsi”
Presented to the Faculty of Tarbiya and Teachers' Training
in Partial Fulfillment of the Requirements
for the Degree of S.Pd. (Bachelor of Arts) in English Language Education
Approved by the advisor:
Dr. Atiq Susilo, MA
NIP: 19491122 197803 1 001
DEPARTMENT OF ENGLISH EDUCATION
FACULTY OF TARBIYAH AND TEACHERS' TRAINING
SYARIF HIDAYATULLAH STATE ISLAMIC UNIVERSITY
JAKARTA
2011
ENDORSEMENT BY THE EXAMINATION COMMITTEE
The examination committee of the Faculty of Tarbiya and Teachers' Training certifies that
the "Skripsi" entitled "AN ITEM ANALYSIS ON DISCRIMINATING POWER OF
ENGLISH SUMMATIVE TEST", written by Hikmah Lestari, student's registration
number 105014000379, was examined at the examination session of the Faculty of Tarbiyah
and Teachers' Training, Syarif Hidayatullah State Islamic University Jakarta, on March
31st, 2011. The "Skripsi" has been accepted and declared to have fulfilled one of the
requirements for the degree of S.Pd. (Bachelor of Arts) in English Education in the
Department of English Education.
Jakarta, March 31st, 2011
Examination Committee:
CHAIRMAN : Drs. Syauki, M.Pd. (…………………….)
NIP. 19641212 199103 1 002
SECRETARY : Neneng Sunengsih, S.Pd. (…………………….)
NIP. 19730625 199903 2 001
EXAMINER 1 : Dr. Fahriany, M.Pd. (…………………….)
NIP. 19700611 199101 2 001
EXAMINER 2 : Drs. Bahrul Hasibuan, M.Ed. (…………………….)
Acknowledged by:
Dean of Tarbiya and Teachers Training Faculty
Prof. Dr. Dede Rosyada, M.A.
NIP. 19571005 198703 1 003
ABSTRACT
Lestari, Hikmah. 2011, An Item Analysis on Discriminating Power of English
Summative Test at the Second Grade Students of SMPN 87 Pondok
Pinang, “Skripsi”, Department of English Education, The Faculty of
Tarbiya and Teachers’ Training, Syarif Hidayatullah State Islamic
University Jakarta.
Advisor: Dr. Atiq Susilo, MA
Key words: Item Analysis, Discriminating Power, SMPN 87
The purpose of this study is to analyze the discriminating power of the English
summative test at the second grade of "SMPN" 87 Pondok Pinang. Through this
study, it is hoped that the teacher can get a clear description of the quality of the
discriminating power of the English summative test, so that the teacher is able to
help the low-achieving students.
This study is categorized as a descriptive analysis, because it is intended to
describe the objective condition of the discriminating power of the students'
English summative test at the odd semester of the second grade of "SMPN 87"
Pondok Pinang by analyzing the quality of the English summative test items in
discriminating students' achievement. This study is considered quantitative
research, because the researcher used numerical data that were analyzed statistically.
The finding of this study is that the English summative test administered
at the second grade of "SMPN 87" Pondok Pinang has good discriminating power,
because 35 items, with indices ranging from 0.25 to 0.75 (70% of the test items),
have fulfilled the criterion of positive discriminating power.
ABSTRAK
Lestari, Hikmah. 2011, An Item Analysis on Discriminating Power of English
Summative Test at the Second Grade of SMPN 87 Pondok Pinang,
Skripsi, Department of English Education, Faculty of Tarbiyah and
Teachers' Training, Syarif Hidayatullah State Islamic University Jakarta.
Advisor: Dr. Atiq Susilo, MA
Key words: Item Analysis, Discriminating Power, SMPN 87
The purpose of this study is to analyze the discriminating power of the
English summative test for the second grade of SMPN 87 Pondok Pinang.
Through this study, it is hoped that teachers obtain a clear picture of the quality
of the discriminating power of the English summative test, so that they can help
low-scoring students.
This study is categorized as a descriptive analysis, because it describes the
objective condition of the discriminating power of the odd-semester English
summative test of the second-grade students of SMPN 87 Pondok Pinang by
analyzing the ability of the English summative test items to distinguish among
the students' abilities. This study is quantitative research, because the researcher
used numerical data analyzed statistically.
The finding of this study is that the English summative test administered
to the second grade of SMPN 87 Pondok Pinang has good discriminating power,
because 35 items with indices from 0.25 to 0.75, or 70% of the test items, have
fulfilled the criterion of positive discrimination.
Mrs. Dra. Farida Hamid, M.Pd., Mr. Prof. Dr. Mulyanto Sumardi, MA., The late
Mr. Drs. Munir Sonhaji, M.Ed., Mr. Drs. Nasifuddin Djalil, M.Ag.,
Mr. Drs Arifin Toy, M.Sc., Mr. Drs. M. Zaenuri, M.Pd., and Mr. Drs Nasrun
Mahmud, M.Ed.
The writer also would like to express her gratitude to Mr. Dr. Atiq Susilo,
MA, as the writer's advisor, who kindly spent his time giving his valuable
advice, guidance, correction, and suggestions in finishing this "skripsi".
Her gratitude also goes to Mr. Drs. Syauki, M.Pd as the head of English
Education Department, Mrs. Neneng Sunengsih S.Pd as the secretary of English
Education Department, and Prof. Dr. Dede Rosyada, MA as the Dean of Faculty
of Tarbiya and Teachers’ Training.
The writer also dedicates many thanks to Mr. Drs. Ishak Idrus, the
headmaster of "SMPN" 87 Pondok Pinang, who gave the writer permission
to do the research at "SMPN" 87 Pondok Pinang.
The writer also would like to express her gratitude and love to all beloved
classmates (2005) of English Education Department, either class A, B, or C, for
sharing their knowledge, support, and time in accomplishing this ”skripsi” and for
the wonderful friendship while studying together.
May Allah the Almighty bless them all, amen.
Finally, the writer realizes that this "skripsi" is still far from perfect.
Constructive criticism and suggestions to make it better are welcome.
Jakarta, March 31st, 2011
The writer
TABLE OF CONTENTS
ABSTRACT .................................................................................................... i
ABSTRAK ...................................................................................................... ii
ACKNOWLEDGEMENT ............................................................................. iii
TABLE OF CONTENTS ............................................................................... v
LIST OF TABLES ......................................................................................... vii
LIST OF APPENDICES ............................................................................... viii
CHAPTER I: INTRODUCTION
A. The Background of the Study .......................................................... 1
B. The Limitation of the Problem ........................................................ 3
C. The Formulation of the Problem ..................................................... 4
D. The Objective of Study .................................................................... 4
E. The Method of Study ....................................................................... 4
F. The Organization of the Writing .................................................... 5
CHAPTER II: THEORETICAL FRAMEWORK
A. Test
1. The Definition of Test ............................................................... 6
2. Types of Test ............................................................................. 7
3. Characteristic of a Good Test .................................................... 11
B. Item Analysis
1. The Definition of Item Analysis ................................................ 13
2. Discriminating Power ................................................................ 14
3. Types of Test Item ..................................................................... 16
4. The Importance of Item Analysis .............................................. 18
CHAPTER III: RESEARCH METHODOLOGY
A. Place and Time of Research ............................................................ 20
B. Technique of Sample Taking ........................................................... 20
C. Technique of Data Collecting .......................................................... 21
D. Research Instrument ........................................................................ 21
E. Technique of Data Analysis ............................................................ 21
CHAPTER IV: RESEARCH FINDINGS
A. Description of Data .......................................................................... 23
B. Analysis of Data .............................................................................. 36
C. Interpretation of Data....................................................................... 37
CHAPTER V: CONCLUSION AND SUGGESTION
A. Conclusion ....................................................................................... 38
B. Suggestion ....................................................................................... 39
BIBLIOGRAPHY .......................................................................................... 40
APPENDICES ................................................................................................ 42
LIST OF TABLES
Table 4.1 The Students’ Score and Group Position of English Summative
Test at the Odd Semester............................................................... 25
Table 4.2 The Students’ Answer Sheet of English Summative Test Items
from the Upper Group ................................................................... 27
Table 4.3 The Students’ Answer Sheet of English Summative Test Items
from the Lower Group .................................................................. 30
Table 4.4 The Discriminating Power Index of the Upper and Lower Group 33
Table 4.5 The Percentage of Discriminating Power ..................................... 35
LIST OF APPENDICES
Appendix 1 Form of Test .......................................................................... 42
Appendix 2 Answer Key ........................................................................... 48
Appendix 3 Students’ Answer Sheet ........................................................ 54
Appendix 4 Surat Pengajuan Judul Skripsi ............................................... 59
Appendix 5 Surat Izin Penelitian .............................................................. 61
Appendix 6 Surat Keterangan Telah Melakukan Penelitian ..................... 62
CHAPTER I
INTRODUCTION
A. The Background of Study
Making an evaluation is an integral part of life; we evaluate all aspects of
our life and work constantly. In the field of education, evaluation plays an
important role because it reflects the results of educational development. Evaluation
may be defined as the systematic process of collecting, analyzing, and interpreting
information to determine the extent to which pupils are achieving instructional objectives.
Evaluation gives information about how successful the efforts of education
have been. It helps teachers obtain information about the progress of students'
achievement of the material they have learned in order to make decisions.
Evaluation cannot be separated from the teaching-learning process. According to
Bachman in his book Fundamental Considerations in Language Testing,
evaluation is defined as "the systematic gathering of information for the purpose of
making decisions".1 In addition, the purpose of evaluation is to provide relevant
information.
There are many techniques for collecting information for evaluation
purposes. One of them is by using a test.2 A test in plain words is "a method of
1 Lyle F. Bachman, Fundamental Considerations in Language Testing, (Toronto: Oxford University Press, 1990), p. 20.
2 Fred Genesee & John A. Upshur, Classroom-Based Evaluation in Second Language Education, (New York: Cambridge University Press, 1996), p. 140.
measuring person’ ability or knowledge in a given domain”.3 Tests are used for
pedagogical purposes, either as a mean of motivating students to study or as a
mean of reviewing the material taught.
Students usually tend to study harder when they are going to have an
examination than when they are not, and they will emphasize studying the
material they expect to be tested on. Thus, if the teacher announces an
examination, most students are motivated to study or to review the material
assigned.
A test is supposed to be well constructed so that it can be used effectively.
To be considered good, a test has to fulfill the characteristics of a good test:
validity, reliability, and practicality. A test is valid if it measures what it is
supposed to measure. It is reliable if it yields the same results even when
administered to the same group several times. A test is practical if it is easy to
administer and score.
In applying the three characteristics above, the teacher should prepare the
test as well as possible. After the teacher administers and scores the test, it is
desirable to evaluate its effectiveness, especially the test items, because teachers
need to use their own judgment to know how well each item works. This is done
by studying the students' responses to each item. When formalized, the procedure
is called item analysis.
Item analysis provides a quick, simple technique for appraising the
effectiveness of individual test items.4 Item analysis procedures provide
information for evaluating the functional effectiveness of each item and for
detecting weaknesses that should be corrected. This information is useful when
reviewing the test with students, and it is indispensable when building a file of
high-quality items.
There are three characteristics which are usually determined for a test
item: first, item difficulty; it indicates how difficult each item was for the group.
3 H. Douglas Brown, Teaching by Principles: An Interactive Approach to Language Pedagogy, (New York: Addison Wesley Longman, 2001), p. 384.
4 Norman E. Gronlund, Measurement and Evaluation in Teaching, (New York: Macmillan Publishing Co., 1981), p. 262.
Second, discriminating power; it tells how well the item performs in separating
the better students from the poorer students. Third, item distracters; for multiple-choice
items, this indicates how effective each alternative was. So it can be
concluded that item analysis provides us with data on whether a test item is too
difficult or too easy, whether it can discriminate among the students, and whether
all the alternatives functioned as intended.
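As a rough illustration of the second characteristic, the discrimination index is commonly computed by comparing an upper-scoring and a lower-scoring group on each item. The sketch below assumes the simple index D = (U - L) / n, where U and L are the numbers of correct answers in equal-sized upper and lower groups of size n; the function name and the example numbers are illustrative, not figures from the test analyzed in this study.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Discrimination index D = (U - L) / n for a single item.

    upper_correct: students in the upper group who answered correctly
    lower_correct: students in the lower group who answered correctly
    group_size: number of students in each group (groups are equal-sized)
    """
    return (upper_correct - lower_correct) / group_size

# An item answered correctly by 18 of 20 upper-group students
# but only 8 of 20 lower-group students:
d = discrimination_index(18, 8, 20)
print(d)  # 0.5, a positively discriminating item
```

An index near +1 means the item strongly favors the better students, 0 means it does not separate the groups at all, and a negative value signals a flawed item that the weaker students answered correctly more often.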
Incidentally, the writer is a practice teacher at "SMPN 87" Pondok
Pinang. She often heard some of the students say that the English test items in
their school are hard, while others said the contrary. Therefore, from this
personal experience, she is interested in figuring out the quality of the
discriminating power of the English summative test items at "SMPN 87" Pondok
Pinang. The writer chooses discriminating power because she thinks it deals
more with the students than the other two characteristics, the level of difficulty
and the effectiveness of the distracters.
Based on the explanation above, the writer tries to limit the problem of
item analysis that she will discuss, so she focuses only on the discriminating power
of the test items. The test that will be analyzed by the writer is the final test of the
odd semester administered to the second grade of "SMPN 87" Pondok Pinang.
The main aim of this study is to know how well the test items can discriminate
between the students who have achieved well and those who have achieved
poorly. So, the writer tries to analyze and interpret it under the title "AN ITEM
ANALYSIS ON THE DISCRIMINATING POWER OF ENGLISH
SUMMATIVE TEST". This study will be done at the second grade of "SMPN
87" Pondok Pinang.
B. Limitation and Scope of the Problem
In order to make this study easier to comprehend and not too broad, the
writer is going to limit the area of study. First, the writer intends to see the
quality of the test items only by doing an item analysis focused on
discriminating power. By analyzing the discriminating power of the test items, the
writer can conclude how well they discriminate between the students who
performed well and those who did poorly on the test as a whole.
C. Formulation of the Problem
Based on the limitation of the problem, the writer formulates the problem
in this research as follows: Do the test items of the English summative test
administered at the second grade of "SMPN 87" Pondok Pinang have good
discriminating power?
D. Objective of the Study
The objective of this study is to measure the quality of the English
summative test and to know whether its items have good discriminating power
or not. High-quality test items are essential for diagnosing the strengths and
weaknesses of students. Thus, the findings of this study are expected to provide
useful information about the quality of the test items.
It is hoped that the analysis of the discriminating power of the English
summative test carried out by the writer can make a significant contribution to
improving the quality of future English summative test items. It is also expected
to reveal the students who have achieved well on the test and those who have
not, so that the teacher is able to help the low-achieving students.
E. Method of the Study
This study is categorized as a descriptive analysis, because it is intended to
describe the objective condition of the discriminating power of the students'
summative test at the odd semester of the second grade of "SMPN 87" Pondok
Pinang. Besides, this study is called an analysis, because it analyzes how well the
items of the English summative test can discriminate between the students who
have achieved well and those who have achieved poorly. This study is considered
quantitative research, because the researcher used numerical data that were
analyzed statistically.
F. Organization of the Writing
The writing is systematically divided into five chapters:
Chapter one deals with the introduction. It consists of the background of
the study, limitation and formulation of the problem, objective of the study,
method of the study, significance of the study, and organization of the writing.
Chapter two discusses the theoretical framework. It is divided into two
sections. The first section discusses tests: the definition of a test, functions of
tests, types of tests, types of test items, and the characteristics of a good test. The
second section discusses item analysis: the definition of item analysis,
discriminating power, and the importance of item analysis.
Chapter three deals with the research methodology. It discusses the
objective of the study, place and time of the study, technique of data collecting,
and technique of data analysis.
In chapter four, the writer focuses on the research findings. The data
description, data analysis, and data interpretation are included in this section.
The last chapter, chapter five, presents the conclusion and suggestions.
CHAPTER II
THEORETICAL FRAMEWORK
A. Test
1. The Definition of Test
A test is one of the instruments for collecting data. Tests can be used in an
instructional program to assess entry behavior, monitor learning progress,
diagnose learning difficulties, and measure performance at the end of
instruction. Tests are given for many different reasons. In order to achieve
such diverse purposes, they need to be carefully planned. In classroom
settings, this planning usually entails instructional objectives.
A test is a procedure designed to elicit certain behavior from which
one can make inferences about certain characteristics of an individual.1 A test
is an instrument, device, or procedure that proposes a sequence of tasks to which
a student is to respond, the results of which are used as measures of a specific
trait. A test may be defined as a task or series of tasks used to obtain systematic
observations presumed to be representative of educational or psychological
traits or attributes.2
Cronbach defines a test as a "systematic procedure for observing a
person's behavior and describing it with the aid of a numerical scale or a
1 Lyle F. Bachman, Statistical Analyses for Language Assessment, (Cambridge: Cambridge University Press, 2004), p. 9.
2 Gilbert Sax, Principles of Educational and Psychological Measurement and Evaluation, (Belmont: Wadsworth Publishing Company, 1980), p. 13.
category system". The phrase "systematic procedure" indicates that a test is
constructed, administered, scored, and described according to prescribed rules.
The term "behavior" implies that a test measures the responses a person makes
to the test items. Tests do not measure a person directly; rather, they infer
his characteristics from his responses to test items. We do not observe all
behavior but only a sample of behavior. A test contains only a sample of all
possible items. The test results are described with the aid of measurement
scales.3
Based on the definitions above, the writer concludes that a test is
an instrument administered to measure students' responses to the test
items.
2. Types of Test
Tests can be categorized according to the types of information they
provide. This categorization will prove useful both in deciding whether an
existing test is suitable for a particular purpose and in writing appropriate new
tests where these are necessary.4 Tests can be classified based on their purpose
and based on their maker.
a. Based on its purpose
1) Aptitude Test
An aptitude test is primarily designed to predict success in some
future learning activity. It is generally given before the student begins
language study, and may be used to select students for a language
course or to place students in sections appropriate to their ability.
Aptitude tests provide information useful in determining learning readiness,
individualizing instruction, organizing classroom groups, identifying
underachievers, diagnosing
3 H.J.X. Fernandes, Testing and Measurement, (Jakarta: National Educational Planning, Evaluation and Curriculum Development, 1984), p. 1.
4 Arthur Hughes, Testing for Language Teachers, (New York: Cambridge University Press, 2003), p. 5.
learning problems, and helping students with their educational and
vocational plans.
Aptitude tests are often used in selecting individuals for jobs, for
admission to training programs, for scholarships, and for many other
purposes. Sometimes aptitude tests are used for classifying individuals,
as when students are assigned to different ability-grouped sections of
the same course.5
2) Achievement Test
Achievement tests measure what a person has learned during a
course of instruction. They are given at the end of the course. The content of
achievement tests is generally based on the course syllabus or the
course textbook.
An achievement test is designed to indicate the degree of success in some
past learning activity. This purpose is obviously different from that of an
aptitude test, which is designed to predict success in some future learning
activity. A distinction between these two tests is made in terms of the use of the
results rather than the qualities of the tests themselves.6
Assessment and evaluation are terms often used in connection with
achievement testing. The purpose of the testing is to assess present
attainment. With achievement tests, we are trying to measure students'
present attainment.7 A common distinction is that achievement tests
measure what a student has learned, and aptitude tests measure the
ability to learn new tasks.8
5 Howard B. Lyman, Test Scores and What They Mean, (Singapore: Allyn & Bacon, 1998), p. 22.
6 Drs. Wilmar Tinambunan, Evaluation of Student Achievement, (Jakarta: Depdikbud, 1998), p. 7.
7 Howard B. Lyman, Test Scores..., p. 22.
8 Robert L. Linn, Norman E. Gronlund, Measurement and Assessment in Teaching, (New Jersey: Prentice-Hall, Inc., 1995), p. 391-392.
There are two kinds of achievement tests: final achievement tests and
progress achievement tests.9 Final achievement tests are those administered
at the end of a course of study. They may be written and administered
by ministries of education, official examining boards, or by members of
teaching institutions. On the other hand, progress achievement tests are
intended to measure the progress that students are making. They
contribute to formative assessment. Since progress is toward the
achievement of course objectives, these tests should relate to those
objectives.
3) Proficiency Test
Proficiency tests are designed to measure people's ability in a
language regardless of any training they may have had in that language.
The content of a proficiency test, therefore, is not based on the content
or objectives of language courses that people taking the test may have
followed. Rather, it is based on a specification of what candidates have
to be able to do in the language in order to be considered proficient.10 It
means that the function of the test is to show whether the candidates
have reached certain specific abilities or not.
Some proficiency tests are intended to show whether students have
reached a given level of general language ability. Others are designed to
show whether students have sufficient ability to use a language in some
specific area, such as medicine, tourism, or academic study.
4) Diagnostic Test
Diagnostic tests seek to identify those areas in which a student
needs further help. These tests can be fairly general and show, for
example, whether a student needs particular help with one of the four
9 Desmond Allison, Language Testing and Evaluation, (Singapore: Singapore University Press, 1999), p. 80.
10 Arthur Hughes, Testing for Language..., p. 11.
main language skills; or they can be more specific, seeking perhaps to
identify weaknesses in a student's use of grammar. These more specific
diagnostic tests are not easy to design, since it is difficult to diagnose
precisely strengths and weaknesses in the complexities of language
ability. For this reason, there are very few purely diagnostic tests.
b. Based on the Test Maker
1) Standardized Test
Standardized tests are constructed by test specialists working with
curriculum experts and teachers. They are standardized in that they
have been administered and scored under standard and uniform testing
conditions, so that results from different classes and different schools
may be compared. The quality of the test items is high because the
items are written by specialists. The items are also pretested and
selected on the basis of their effectiveness.
2) Teacher-Made Test
Teacher-made tests are constructed by teachers for use within their
own classrooms. Their effectiveness depends on the skill of the teacher
and his or her knowledge of test construction. The quality of the items
is unknown unless a test item file is used, but it is typically lower than
that of standardized tests because of teachers' limited time and skill.
Because such a test is conducted within the teacher's own classroom, it
only compares the scores among the students in that classroom.
In general, teacher-made examinations have flexibility for use
within a given classroom, but provide little data for comparing students
in different classes. Standardized tests, in contrast, are used to compare
students' performance in different classes or schools.11
11 Gilbert Sax, Principles of Educational and Psychological Measurement and Evaluation, (Belmont: Wadsworth, Inc., 1980), p. 16-18.
3. Characteristic of a Good Test
A test can be said to be a good test if it has certain qualifications or
certain characteristics. The most essential characteristics of a good test
can be classified into three main aspects: validity, reliability, and
practicality.12
a. Validity
The most simplistic definition of validity is that it is the degree to which a
test measures what it is supposed to measure. J.B. Heaton said, "The
validity of the test is the extent to which it measures what it is supposed to
measure and nothing else".13 The validity of a test must be considered in
measurement; in this case, it must be seen whether the test used really
measures what it is supposed to measure.
b. Reliability
Reliability means dependability or trustworthiness. Basically,
reliability is the degree to which a test consistently measures whatever it
measures. The more reliable a test is, the more confidence we can have that the
scores obtained from one administration of the test are essentially the same
scores that would be obtained if the test were re-administered to the same
group. An unreliable test is essentially useless. If a test were unreliable,
then scores from a given group would be expected to be quite different
every time the test was administered. If an intelligence test were
unreliable, a student scoring an IQ of 120 today might score 140
tomorrow, and 95 the day after tomorrow. If the test were highly reliable
and the student's IQ were 110, then we would not expect that score to
fluctuate too greatly from testing to testing.
12 Norman E. Gronlund, Measurement and..., p. 51.
13 J.B. Heaton, Writing Language Tests, (Longman, 1998), p. 153.
A valid test is always reliable, but a reliable test is not necessarily
valid. If a test measures what it is supposed to measure, it will be
reliable and do so every time. But a reliable test can consistently measure
the wrong thing and be invalid.14
c. Practicality
The third characteristic of a good test is practicality, or usability. In the
preparation of a new test, the teacher must keep in mind a number of very
practical considerations, which involve economy, ease of administration, and
ease of interpretation of the results.
Economy means that the test is not costly. The teacher must take into
account the cost per copy, how many scorers will be needed, and how long
the administering and scoring will take.
Ease of administration means that the test administrator can perform his
task quickly and efficiently. We must also consider the ease with which
the test can be administered.
Ease of interpretation and application: J.B. Heaton states, "The final
point concerns the presentation of the test paper itself. Where possible, it
should be printed or typewritten and appear neat, tidy, and aesthetically
pleasing. Nothing is worse and more disconcerting to the testee than an
untidy test paper, full of misspellings, omissions, and corrections." If the
paper is neat, it will be easy for the student or testee to interpret the test
items.15
14 L.R. Gay, Educational Evaluation and Measurement, (New York: Macmillan, Inc., 1985), p. 167.
15 J. Charles Alderson, Caroline Clapham & Dianne Wall, Language Test Construction and Evaluation, (Melbourne: Cambridge University Press, 1995), p. 187.
B. Item Analysis
Selection of appropriate language items is not enough by itself to ensure a
good test. Each question needs to function properly; otherwise, it can weaken the
exam. Fortunately, there are some rather simple statistical ways of checking
individual items. This is done by studying the students' responses to each item.
When formalized, this procedure is called "item analysis".16 An item analysis tells
us basically three things: how difficult each item is, whether or not the questions
discriminate between high and low students, and which
distracters are working as they should. An analysis like this is used with any
important exam, for example, review tests and tests given at the end of a school
term or course.
1. The Definition of Item Analysis
Item analysis is usually done for the purpose of selecting which items will remain on future revised and improved versions of a test. There are several descriptions of item analysis. According to Nitko, "Item analysis refers to the process of collecting, summarizing, and using information about individual test items, especially information about pupils' responses to items."17 Furthermore, Lado defines item analysis as the study of the validity, reliability, and difficulty of test items taken individually, as if they were separate tests.18
Item analysis usually provides two kinds of information on items: item facility, which helps us decide if test items are at the right level for the target group, and item discrimination, which allows us to see if individual items are providing information on candidates' abilities consistent with that provided by the other items on the test.19
16 Harold S. Madsen, Techniques in Testing, (New York: Oxford University Press, 1983), p. 180.
17 Anthony J. Nitko, Educational Test and Measurement, an Introduction, (New York: Harcourt Brace Jovanovich, Inc., 1983), p. 284.
18 Robert Lado, Language Testing, (London: Longman Group Limited, 1983), p. 342.
19 Tim McNamara, Language Testing, (Oxford: Oxford University Press, 2000), p. 60.
From those opinions, it can be concluded that item analysis is the process of collecting information about pupils' responses to the items in order to see the quality of the test items. More specifically, item analysis information can tell us if an item was too easy or too hard, how well it discriminated between high and low scorers on the test, and whether all of the alternatives functioned as intended. Item analysis data also aid in detecting specific technical flaws and thus provide further information for improving the test items.
2. Discriminating Power
The discriminating power of a test item is its ability to separate good students from poor students. These student groups are defined by their scores on the test as a whole. The difference between the percentage of the top-scoring 27% and of the bottom-scoring 27% of students who get the item right is its discrimination index.20
As well as knowing how difficult an item is, it is important to know how it discriminates, that is, how well it distinguishes between students at different levels of ability. If the item is working well, we should expect more of the top-scoring students to know the answer than the low-scoring ones. If the strongest students get the item wrong while the weaker students get it right, there is clearly a problem with the item, and it needs investigating.
Each item on the test should contribute to the total score and to the
meaning of that total score. Many times the purpose of a test is to discriminate
between groups of students, such as those who have mastered the domain of
content that the test represents and those students who have not achieved
mastery.
The discrimination index can range from -1 to +1. Items with positive values of the discrimination index are desired, because those are the items that are contributing to the usefulness of the total score. When the discrimination index is near zero, it indicates that the item is contributing nothing to the discriminating power of the overall test.21 When a larger proportion of students in the lower group than in the upper group got the item right, the item discriminates negatively; when more students in the upper group than in the lower group got the item right, it discriminates positively.22

20 H.J.X. Fernandes, Testing and Measurement, (Jakarta: National Educational Planning, Evaluation and Curriculum Development, 1984), p. 27.
Item discriminating power can be obtained by subtracting the number of students in the lower group who got the item right (L) from the number of students in the upper group who got the item right (U), and dividing by the number of students in one group included in the item analysis (N).
It is summarized in formula form as below:
DI = (U - L) / N
Where:
DI = the index of discriminating power
U = the number of pupils in the upper group who answered the item
correctly
L = the number of pupils in the lower group who answered the item
correctly
N = number of pupils in each of the groups23
The classifications of the index of discriminating power (D) are:
DI = 0.70 – 1.00 = Excellent
0.40 – 0.70 = Good
0.20 – 0.40 = Satisfactory
≤ 0.20 = Poor
Negative value on D= Very poor 24
21 William Wiersma, Educational Measurement and…, p. 245.
22 Gilbert Sax, Principles of Educational and…, p. 191.
23 Charles D. Hopkins & Richard L. Antes, Classroom Measurement and Evaluation, (Illinois: F.E. Peacock Publishers, Inc., 1990), p. 279.
24 Anas Sudijono, Pengantar Evaluasi Pendidikan, (Jakarta: PT. Raja Grafindo Persada, 2006), p. 389.
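The formula and classification above can be sketched in code. This is only an illustrative sketch (the function names are the writer's own, not from the cited sources); where the published ranges share an endpoint, each boundary is assigned to the higher category, which matches the remarks used in the analysis later in this "skripsi".

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """DI = (U - L) / N, where N is the size of one criterion group."""
    return (upper_correct - lower_correct) / group_size

def classify(di):
    """Classify a DI value following the ranges quoted from Sudijono."""
    if di < 0:
        return "Very poor"
    if di < 0.20:
        return "Poor"
    if di < 0.40:
        return "Satisfactory"
    if di < 0.70:
        return "Good"
    return "Excellent"

# Example: an item answered correctly by 14 of 16 upper-group students
# and 1 of 16 lower-group students.
di = discrimination_index(14, 1, 16)   # 0.8125, reported as 0.81
print(classify(di))                    # Excellent
```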
3. Types of Test Item
In constructing the test items, the test maker may choose from a variety of item types. There are two types of test items: subjective tests and objective tests.
a. Subjective test
A subjective test is a test in which the examinee answers, in his own words and at appropriate length, all or some of a relatively small number of questions. Typical key words in the questions set in examinations of this kind are 'discuss', 'compare', 'contrast', and 'describe', and the answers they elicit may range from a single sentence to a dozen or more paragraphs. These answers are commonly called 'essays'. Here are some subjectively marked tests:
1) Short-Answer Items
The short answer (or completion) item is the only objective item
type that requires the examinee to supply, rather than select, the
answer. Its make-up is similar to a well-stated multiple choice item
without the alternatives. Thus, it consists of a question or incomplete
statement, to which the examinee responds by providing the
appropriate words, numbers, or symbols.
2) Essay
Essay tests are inefficient for measuring knowledge outcomes but
they provide a freedom of response that is needed in measuring certain
complex outcomes. These outcomes include the ability to create, to
organize, to integrate, to express and similar behaviors that call for the
production and synthesis of ideas.
The most noticeable characteristic of the essay test is the freedom
of response it provides. The student is asked a question that requires
him to produce his own answer. He is relatively free to decide how to
approach the problem, what factual information to use, how to
organize his reply, and what degree of emphasis to give each aspect of
his answer. Thus the essay question places a premium on the ability to
produce, integrate, and express ideas.
b. Objective test
An objective test is one that may be scored by comparing examinee responses with an established set of acceptable responses, or scoring key. Objective tests can be scored objectively; that is, equally competent scorers can score them independently and obtain the same results. Objective tests include a variety of item types:
1) Multiple choice items
Multiple-choice refers to test items that require the students to select one or more responses from a set of two or more options. These items consist of a stem, which presents a problem situation, and several alternatives, which provide possible solutions to the problem. The stem may be a question or an incomplete statement. The alternatives include the correct answer and several plausible wrong answers, called distracters. The function of the latter is to distract those students who are uncertain of the answer.25
Multiple choice items can measure a variety of learning outcomes,
ranging from simple to complex, and it is easy to see why this item
type is regarded so highly and used so widely.
2) Matching Items
Another selected-response item, sometimes called an objective
item, is the matching item. The format is not used as extensively as
true-false or multiple-choice items. But the matching item can be used
25 Norman E. Gronlund, Constructing Achievement…, p. 38.
effectively to measure learning and, when used, it provides variety in
the test format for both student and teacher.
The matching item is exactly what the name implies; it requires the
student to use some association criterion in order to match the words or
phrases that represent ideas, concepts, principles, or things. Matching
items are usually presented in two-column format: one column consists
of premises and the other consists of responses.26
3) True-False Items
The true-false item is simply a declarative statement that the
student must judge as true or false. There are modifications of this
basic form in which the student must respond “yes” or “no,” “agree”
or “disagree,” “right” or “wrong,” “fact” or “opinion,” and the like.
Such variations are usually given the more general name of
alternative-response items. In any event this item type is characterized
by the fact that only two responses are possible.27
True-false items can be effective when a few guidelines are
followed in the construction: Statements must be clearly true or false,
statements should not be lifted directly from the text, specific
determiners should be avoided, trick questions should not be used,
some statements should be written at higher cognitive levels, and true-
false items should be of the same frequency and length.28
4. The Importance of Item Analysis
Item analysis is an important and necessary step in the preparation of good multiple-choice tests. Because of this fact, it is suggested that every classroom teacher who uses multiple choice test data should know something of item analysis, how it is done, and what it means.
26 William Wiersma, Educational Measurement and Testing, (Boston: Allyn & Bacon, 1990), p. 48.
27 Norman E. Gronlund, Constructing Achievement…, p. 54.
28 William Wiersma, Educational Measurement and…, p. 47.
The benefits of item analysis are not limited to the improvement of individual test items; there are also a number of fringe benefits of special value to classroom teachers. The most important of these are the following:
a. Item analysis data provide a basis for efficient class discussion of the test results.
b. Item analysis data provide a basis for remedial work.
c. Item analysis data provide a basis for the general improvement of
classroom instruction.
d. Item analysis procedures provide a basis for increased skill in test
construction.29
Meanwhile, Nitko states in his book that the purposes of item analysis are:
a. Determining whether an item functions as the teacher intends,
b. Feedback to students about their performance and a basis for class discussion,
c. Feedback to the teacher about pupils' difficulties,
d. Areas for curriculum improvement,
e. Revising the items,
f. Improving item-writing skills.30
29 Robert L. Linn and Norman E. Gronlund, Measurement and…, p. 316.
30 Anthony J. Nitko, Educational Test and…, p. 284.
CHAPTER III
RESEARCH METHODOLOGY
A. Place and Time of Research
The research was conducted at "SMPN 87" Pondok Pinang, which is located at Jl. Ciputat Raya Pondok Pinang, Kebayoran Lama, South Jakarta. The writer did the research in December 2010. The writer took the English summative test papers and the students' answer sheets of the second grade, period 2010-2011, to be analyzed.
B. Technique of Sample Taking
The writer took the sample from the second year students of "SMPN 87" Pondok Pinang. The total number of second year students is 238; they are divided into 6 classes. The writer took 25% of the total number of the second year students as a sample, that is, 25% x 238 ≈ 60 students. The writer used ordinal sampling to get the students' answer sheets. The writer divided the students into three groups: upper, middle, and lower. Then the writer took only the upper and lower groups to be analyzed.
C. Technique of Data Collecting
To collect the data connected with the topic of discussion, the writer came to the school to get permission from the headmaster to take the students' answer sheets and the question paper of the English summative test of the second year students of "SMPN 87" Pondok Pinang to be analyzed.
D. Research Instrument
a. Students' answer sheets
The students' answer sheets are the papers on which the students give their answers to the English summative test. The English summative test that the writer used is the final odd-semester test for the second year students of "SMPN 87" Pondok Pinang, academic year 2010-2011, prepared by MGMP.
b. The English summative test of the second year students of "SMPN 87" Pondok Pinang.
E. Technique of Data Analysis
In this research, the writer used a quantitative method to analyze the discriminating power of the English summative test items of the second year of "SMPN 87", using a statistical formula, namely the Discriminating Power Index:
DI = (U - L) / N
Where:
DI = the index of discriminating power
U = the number of pupils in the upper group who answered the item correctly
L = the number of pupils in the lower group who answered the item correctly
N = number of pupils in each of the groups1
1 Charles D. Hopkins & Richard L. Antes, Classroom Measurement and Evaluation, (Itasca: F.E. Peacock Publishers, Inc., 1990), p. 279.
The classifications of the index of discriminating power (D) are:
DI = 0.70 – 1.00 = Excellent
0.40 – 0.70 = Good
0.20 – 0.40 = Satisfactory
≤ 0.20 = Poor
Negative value on D = Very poor 2
2 Anas Sudijono, Pengantar Evaluasi…, p. 389.
CHAPTER IV
RESEARCH FINDINGS
A. Description of Data
The data used by the writer is the English summative test in the odd semester of the second grade of "SMPN 87" Pondok Pinang. This English summative test was held on Wednesday, December 8th, 2010, and had to be finished in 120 minutes. The test consists of 50 items, all of which are multiple choice items.
The total number of students taking part in this analysis is 60. Kelley, as cited in the book Classroom Measurement and Evaluation, demonstrated that the selection of criterion groups based upon the upper 27 percent and lower 27 percent of the papers provides the greatest confidence that the upper group is superior, in the trait measured by the test, to the lower group. The middle 46 percent of the papers is not used when 27 percent in the upper and 27 percent in the lower groups are employed in item analysis.1 Based on that statement, the writer classified the students into three groups: upper, middle, and lower. The writer took only the 27% in the lower group and the 27% in the upper group for this analysis, and the remaining students, who belong to the middle group, did not take part in this analysis. The next table shows the students' scores and group positions in the English summative test.

1 Charles D. Hopkins & Richard L. Antes, Classroom Measurement and Evaluation, (Itasca: F.E. Peacock Publishers, Inc., 1990), p. 275.
Table 4.1
The Students' Scores and Group Position of English Summative Test in the Odd Semester

Upper group (No. 1-16): 82, 80, 78, 76, 76, 74, 74, 74, 74, 72, 72, 72, 72, 64, 64, 64
Middle group (No. 17-44): 64, 62, 62, 60, 60, 60, 60, 60, 58, 58, 58, 58, 58, 58, 58, 56, 56, 56, 56, 56, 54, 54, 54, 54, 54, 54, 52, 52
Lower group (No. 45-60): 52, 50, 50, 50, 48, 46, 46, 44, 44, 42, 40, 38, 34, 34, 34, 32
Table 4.1 shows that the students who took the test are classified into 3 groups: upper, middle, and lower. The writer took 27%, or 16 students, from each of the upper and lower groups to be analyzed. The highest score in the upper group, 82, is gained by one student, while the lowest score in the upper group, 64, is shared by three students. Meanwhile, the highest score in the lower group, 52, is gained by one student, and the lowest score in the lower group, 32, is gained by one student.
Table 4.2
The Students' Answer Sheet of English Summative Test Items from the Upper Group
No
Students' Number of items
score 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Answer key * B D C D B B D B B B A C C B C D C C B C D A B D
1 82 0 1 1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 0 1
2 80 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1
3 78 0 1 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1 0 0 1 0 1
4 76 0 1 1 1 1 1 1 0 0 1 1 0 1 1 1 0 1 1 1 0 1 0 1 0 1
5 76 0 1 1 1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 1 0 1 1 1
6 74 0 1 1 0 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 0 1
7 74 0 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 0 1 1 1
8 74 0 1 1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 0 0 1 0 0 1 1 1
9 74 0 1 1 1 1 1 1 1 0 0 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1
10 72 0 1 1 1 0 0 1 1 1 1 1 0 1 0 1 1 1 0 0 1 0 0 1 1 1
11 72 0 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 0 0 1 1 0 0 1 0 0
12 72 0 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 0 1 0 0 0 1 0 0
13 72 0 1 1 0 1 0 0 1 1 1 1 0 1 0 1 0 1 1 0 1 0 1 1 1 1
14 64 0 1 1 1 1 1 1 1 1 1 0 0 1 0 1 1 0 0 0 0 0 0 1 1 1
15 64 0 1 1 0 1 1 0 0 1 1 0 0 0 1 1 1 1 0 0 0 0 0 1 1 1
16 64 0 1 1 0 1 1 0 0 1 1 1 1 0 0 1 1 1 1 0 0 0 0 1 1 1
Correct answer 0 16 16 12 15 13 11 12 12 15 13 1 14 11 15 11 14 10 10 10 5 3 16 7 14
No
Students' Number of Items
score 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
Answer key A D A B B D B D A B C D C C D B C B B D B B D A D
1 82 1 0 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1
2 80 0 1 1 0 1 0 0 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 0
3 78 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 0 1 1 0 1 1 0 1 1 1
4 76 0 1 1 0 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1
5 76 0 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1
6 74 0 0 0 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 0
7 74 0 0 1 0 0 1 0 1 1 1 1 0 1 0 1 1 1 1 0 1 1 0 1 1 0
8 74 0 0 1 1 1 1 0 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 1 1
9 74 0 1 1 1 0 0 0 1 1 1 1 1 1 1 1 0 1 1 1 0 0 1 1 1 1
10 72 1 1 1 0 1 0 0 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 1 0
11 72 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 1 0 0 1 1 1 1 1 1
12 72 1 1 1 1 1 0 0 0 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 0
13 72 1 0 1 1 1 0 0 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1
14 64 0 1 0 1 0 0 0 0 1 1 1 0 1 1 1 1 1 1 0 1 0 1 1 1 1
15 64 0 0 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 1 0 1 1 1 1 0 1
16 64 0 0 1 1 1 1 0 1 0 1 1 1 0 0 1 1 1 1 0 1 1 1 1 0 1
Correct answer 7 9 14 12 13 8 1 12 13 14 16 11 11 10 16 12 16 15 7 15 13 12 15 14 11
Table 4.3
The Students' Answer Sheet of English Summative Test Items from the Lower Group
No
Students' Number of Items
score 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Answer key * B D C D B B D B B B A C C B C D C C B C D A B D
1 52 0 1 1 0 0 0 0 1 1 1 1 0 1 1 0 0 0 0 1 0 0 0 1 1 1
2 50 0 0 1 1 0 0 1 1 1 0 0 0 1 1 0 1 0 1 0 1 0 0 0 1 0
3 50 0 1 1 1 0 1 1 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 1 1 1
4 50 0 1 1 1 1 1 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 1 0
5 48 0 1 1 1 1 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 1
6 46 0 1 1 0 0 0 0 1 0 1 0 1 1 1 0 0 0 1 0 1 0 0 1 1 0
7 46 0 0 1 1 0 1 0 0 0 1 1 0 1 0 1 0 0 1 1 0 0 0 1 1 0
8 44 0 0 1 1 0 1 0 0 1 1 1 0 0 1 0 0 0 0 0 1 0 0 1 1 1
9 44 0 0 1 1 0 1 0 0 1 1 1 0 0 1 0 0 0 1 0 1 0 0 1 0 0
10 42 0 0 1 1 1 0 0 1 1 0 1 0 1 1 0 1 0 0 1 1 0 0 1 0 0
11 40 0 1 1 1 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 1 1
12 38 0 0 1 0 0 1 0 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 1 1 1
13 34 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0
14 34 0 0 1 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1 0 1 0 0 1 1 0
15 34 0 1 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 1
16 32 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 0 1 1 1 1 0 0 1 0 0
Correct answer 0 7 16 11 5 6 2 6 11 9 7 1 10 10 4 2 1 9 6 9 0 0 13 12 7
No
Students Number of Items
score 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
Answer key A D A B B D B D A B C D C C D B C B B D B B D A D
1 52 1 0 1 1 0 0 0 0 0 0 1 1 1 0 1 1 1 0 0 1 0 0 1 1 1
2 50 1 1 1 1 1 0 0 0 1 0 1 1 0 1 1 0 1 0 0 0 0 1 1 0 0
3 50 1 0 1 0 0 0 0 1 0 1 0 1 1 0 1 1 1 0 0 1 0 0 0 1 0
4 50 0 0 1 0 0 1 0 0 0 0 1 1 1 0 0 0 1 1 0 1 0 1 1 1 1
5 48 1 1 1 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1
6 46 1 1 1 1 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 1 1 0 0 1 0
7 46 1 1 0 0 0 0 0 1 1 0 1 0 1 0 1 0 1 1 0 1 0 0 0 0 1
8 44 1 0 1 1 0 1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 0 0 1 0 0
9 44 0 1 0 1 1 0 1 1 1 0 1 0 1 0 0 0 1 0 0 1 0 0 0 1 0
10 42 1 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 1 0 0 1 1 0 1 0 0
11 40 1 1 0 0 0 1 0 0 0 0 0 1 1 0 0 0 1 1 0 1 0 0 0 0 0
12 38 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0
13 34 0 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 1 0 0 1 1 1 1 1 1
14 34 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 0 1 1 0 0 1
15 34 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 1 0 1 1
16 32 0 0 1 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0
Correct answer 10 7 8 7 4 4 2 9 6 2 8 6 7 2 8 2 16 7 2 12 4 7 7 9 7
From table 4.3 above, it can be concluded that the responses of the lower group students to each item of their English test are:
1. There are 16 students who answered items number 3 and 42 correctly.
2. There are 13 students who answered item number 23 correctly.
3. There are 12 students who answered items number 24 and 45 correctly.
4. There are 11 students who answered items number 4 and 9 correctly.
5. There are 10 students who answered items number 13, 14 and 26 correctly.
6. There are 9 students who answered items number 10, 18, 20, 33, and 49 correctly.
7. There are 8 students who answered items number 28, 36, and 40 correctly.
8. There are 7 students who answered items number 2, 11, 25, 27, 29, 38, 43, 47, 48, and 50 correctly.
9. There are 6 students who answered items number 6, 8, 19, 34, and 37 correctly.
10. There are 5 students who answered item number 5 correctly.
11. There are 4 students who answered items number 15, 30, 31 and 46 correctly.
12. There are 2 students who answered items number 7, 16, 32, 35, 39, 41 and 44 correctly.
13. There is 1 student who answered items number 12 and 17 correctly.
14. There is no student who answered items number 1, 21 and 22 correctly.
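The "Correct answer" row at the bottom of tables 4.2 and 4.3 is simply a column-wise sum of the 0/1 response matrix. A minimal sketch, using a made-up three-student, four-item matrix rather than the real data:

```python
# Each row is one student's responses: 1 = correct, 0 = incorrect.
responses = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]

# Summing each column gives the number of correct answers per item,
# i.e. the U or L value later used in the discrimination index.
correct_per_item = [sum(column) for column in zip(*responses)]
print(correct_per_item)  # [0, 2, 3, 1]
```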
Before analyzing the data, the writer calculated the data statistically. The writer used the Discrimination Index formula to find the discriminating power criteria of the English summative test. The table is as follows:
Table 4.4
The Discriminating Power Index of the Upper and Lower Group
Item number | Correct answers, Upper group | Correct answers, Lower group | U - L | DI = (U - L) / N | Remark*
1 0 0 0 0 Poor
2 16 7 9 0.56 Good
3 16 16 0 0 Poor
4 12 11 1 0.06 Poor
5 15 5 10 0.62 Good
6 13 6 7 0.43 Good
7 11 2 9 0.56 Good
8 12 6 6 0.37 Satisfactory
9 12 11 1 0.06 Poor
10 15 9 6 0.37 Satisfactory
11 13 7 6 0.37 Satisfactory
12 1 1 0 0 Poor
13 14 10 4 0.25 Satisfactory
14 11 10 1 0.06 Poor
15 15 4 11 0.68 Good
16 11 2 9 0.56 Good
17 14 1 13 0.81 Excellent
18 10 9 1 0.06 Poor
19 10 6 4 0.25 Satisfactory
20 10 9 1 0.06 Poor
21 5 0 5 0.31 Satisfactory
22 3 0 3 0.18 Poor
23 16 13 3 0.18 Poor
24 7 12 -5 -0.31 Very poor
25 14 7 7 0.43 Good
26 7 10 -3 -0.18 Very poor
27 9 7 2 0.13 Poor
28 14 8 6 0.37 Satisfactory
29 12 7 5 0.31 Satisfactory
30 13 4 9 0.56 Good
31 8 4 4 0.25 Satisfactory
32 1 2 -1 -0.06 Very Poor
33 12 9 3 0.18 Poor
34 13 6 7 0.43 Good
35 14 2 12 0.75 Excellent
36 16 8 8 0.50 Good
37 11 6 5 0.31 Satisfactory
38 11 7 4 0.25 Satisfactory
39 10 2 8 0.50 Good
40 16 8 8 0.50 Good
41 12 2 10 0.62 Good
42 16 16 0 0 Poor
43 15 7 8 0.50 Good
44 7 2 5 0.31 Satisfactory
45 15 12 3 0.18 Poor
46 13 4 9 0.56 Good
47 12 7 5 0.31 Satisfactory
48 15 7 8 0.50 Good
49 14 9 5 0.31 Satisfactory
50 11 7 4 0.25 Satisfactory
*note: the classification of remarks is adopted from Anas Sudijono, Pengantar Evaluasi Pendidikan, (Jakarta: PT. Raja Grafindo Persada, 2006), p. 389.
Based on the data above, the percentage of discriminating power of
English summative test is:
Table 4.5
The percentage of Discriminating Power
No | Discriminating power | Total items | Percentage | Item numbers
1 | Excellent | 2 | 4% | 17, 35
2 | Good | 16 | 32% | 2, 5, 6, 7, 15, 16, 25, 30, 34, 36, 39, 40, 41, 43, 46, 48
3 | Satisfactory | 15 | 30% | 8, 10, 11, 13, 19, 21, 28, 29, 31, 37, 38, 44, 47, 49, 50
4 | Poor | 14 | 28% | 1, 3, 4, 9, 12, 14, 18, 20, 22, 23, 27, 33, 42, 45
5 | Very Poor | 3 | 6% | 24, 26, 32
The table above shows that there are 2 test items (4%) categorized as excellent test items, namely test items number 17 and 35. They are categorized as excellent because their discrimination indices lie in the range 0.70 – 1.00. There are 16 test items (32%) categorized as good items, ranging from 0.40 – 0.69: test items number 2, 5, 6, 7, 15, 16, 25, 30, 34, 36, 39, 40, 41, 43, 46, and 48. There are 15 test items (30%) categorized as satisfactory test items, for their discrimination indices lie in the range 0.20 – 0.39: test items number 8, 10, 11, 13, 19, 21, 28, 29, 31, 37, 38, 44, 47, 49, and 50.
Meanwhile, 14 test items (28%) are categorized as poor test items, because their discrimination indices range from 0.00 – 0.19: test items number 1, 3, 4, 9, 12, 14, 18, 20, 22, 23, 27, 33, 42, and 45. At last, there are 3 test items (6%) categorized as very poor items, as their discrimination indices are negative.
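The tallies above can be reproduced by counting the remarks assigned to the 50 items. A small sketch, rebuilding the remark list from the category totals reported above rather than item by item:

```python
from collections import Counter

# Remarks of the 50 items, reconstructed from the reported totals.
remarks = (["Excellent"] * 2 + ["Good"] * 16 + ["Satisfactory"] * 15
           + ["Poor"] * 14 + ["Very poor"] * 3)

counts = Counter(remarks)
percentages = {cat: 100 * n / len(remarks) for cat, n in counts.items()}
print(percentages["Good"])  # 32.0, i.e. 16 of the 50 items
```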
B. Data Analysis
In analyzing the discriminating power of the data, the writer first listed the students' responses to each item of the test. The list can be seen in tables 4.2 and 4.3 of this "skripsi".
The next step is to make a format of item analysis. This format and its result are labeled table 4.4. The last step is to count the discriminating power of all items using this formula:
DI = (U - L) / N
Where:
DI = the index of discriminating power
U = the number of pupils in the upper group who answered the item correctly
L = the number of pupils in the lower group who answered the item correctly
N = number of pupils in each of the groups
The result of this last step can also be seen in table 4.4. In this table, the result for each item is a decimal value; the writer then categorized each item according to the following classification:
The classifications of the index of discriminating power (D) are:
DI= 0.70 – 1.00 = Excellent
0.40 – 0.70 = Good
0.20 – 0.40 = Satisfactory
≤ 0.20 = Poor
Negative value on D = Very poor
Based on the data of the item analysis of discriminating power above, the writer can conclude that, of the 50 items:
1. There are 33 test items (66%) categorized as good test items, ranging from 0.25 – 0.81.
2. There are 14 test items (28%) categorized as poor test items, because their discrimination indices range from 0.00 – 0.18.
3. There are 3 test items (6%) categorized as very poor items, as their discrimination indices are negative, ranging from -0.06 to -0.31.
C. Data Interpretation
For the whole set of items, the writer can interpret that the discriminating power of the English summative test prepared by "MGMP" and tested on the second grade of "SMPN 87" Pondok Pinang is good, because 33 test items, or 66% of the 50 test items, range from 0.25 – 0.81.
Based on table 4.2 on the previous pages, the writer concluded the achievement of the upper group students in their English test. Of the 50 multiple choice items, none of the students got a perfect score. The following description tells about the responses to each item:
1. There are 16 students who answered items number 2, 3, 23, 36, 40, and 42 correctly.
2. There are 15 students who answered items number 5, 10, 15, 43, 45, and 48 correctly.
3. There are 14 students who answered items number 13, 17, 25, 28, 35, and 49 correctly.
4. There are 13 students who answered items number 6, 11, 30, 34, and 46 correctly.
5. There are 12 students who answered items number 4, 8, 9, 29, 33, 41, and 47 correctly.
6. There are 11 students who answered items number 7, 14, 16, 37, 38, and 50 correctly.
7. There are 10 students who answered items number 18, 19, 20, and 39 correctly.
8. There are 9 students who answered item number 27 correctly.
9. There are 8 students who answered item number 31 correctly.
10. There are 7 students who answered items number 24, 26, and 44 correctly.
11. There are 5 students who answered item number 21 correctly.
12. There are 3 students who answered item number 22 correctly.
13. There is 1 student who answered items number 12 and 32 correctly.
14. There is no student who answered item number 1 correctly.
CHAPTER V
CONCLUSION AND SUGGESTION
A. Conclusion
Based on the analysis and the interpretation in the previous chapter, the writer would like to conclude that the items of the English summative test tested at the second grade of "SMPN 87" Pondok Pinang can be categorized into 5 different ranges of the discriminating power index. First, there are 2 test items (4%) that are categorized as excellent test items. Then, there are 16 test items (32%) that are categorized as good test items. Besides that, 15 test items (30%) are categorized as satisfactory test items. Fourth, there are 14 test items (28%) categorized as poor test items. Lastly, 3 test items (6%) are categorized as very poor test items.
So, 33 test items (66%) of the English summative test are regarded as having good discriminating power, ranging from 0.25 – 0.81, and they can be used for the next test. Meanwhile, 14 test items (28%) need to be revised for their poor value in differentiating the ability of the upper group from that of the lower group, ranging from 0.00 – 0.18. And 3 test items (6%) have to be eliminated, because those items have negative discrimination indices, ranging from -0.06 to -0.31.
From the explanation above, the writer concludes that the English summative test tested at the second grade of "SMPN 87" Pondok Pinang has good discriminating power, because 33 items (66%) of the test items have fulfilled the criteria of positive discriminating power, ranging from 0.25 – 0.81.
B. Suggestion
After doing the research, there are some suggestions that can be given
in relation to the writer’s conclusion. The suggestions are as follows:
1. Teachers have to teach good techniques for answering the items. For instance, encourage the students to do the easier items first and not to get stuck on the difficult items. This technique should be commonly used by the students so that they will not waste their time.
2. Teachers should save test items which have satisfactory, good, and excellent criteria so that they can be used by the teachers for future evaluations.
3. Teachers should revise the test items which have poor criteria and discard those which have very poor criteria, so that the revised items can be used for the next evaluation.
BIBLIOGRAPHY
Alderson, J. Charles, et al. Language Test Construction and Evaluation, Melbourne: Cambridge University Press, 1995.
Bachman, Lyle F. Statistical Analyses for Language Assessment, Cambridge: Cambridge University Press, 2004.
Brown, H. Douglas. Teaching by Principles, An Interactive Approach to
Language Pedagogy, White Plains: Addison Wesley, Longman, 2001.
Fernandes, H.J.X. Testing and Measurement, Jakarta: National Educational
Planning, Evaluation and Curriculum Development, 1984.
Gay, L.R. Educational Evaluation and Measurement, New York: Macmillan, Inc.,
1985.
Gronlund, Norman E. Measurement and Evaluation in Teaching, New York:
Macmillan Publishing Co., Inc, 1981.
Genesee, Fred and John A. Upshur. Classroom-Based Evaluation in Second Language Education, New York: Cambridge University Press, 1996.
Heaton, J.B. Writing Language Test, Longman: 1998.
Hopkins, Charles D. and Antes, Richard L. Classroom Measurement and
Evaluation, Itasca: F.E. Peacock Publishers, Inc., 1990.
Hughes, Arthur. Testing for Language Teachers, Cambridge: Cambridge
University Press, 2003.
Lado, Robert. Language Testing, London: Longman Group Limited, 1983.
Madsen, Harold S. Techniques in Testing, New York: Oxford University Press,
1983.
McNamara, Tim. Language Testing, Oxford: Oxford University Press, 2000.
Nitko, Anthony J. Educational Test and Measurement, an Introduction, New York: Harcourt Brace Jovanovich, Inc., 1983.
Sax, Gilbert. Principles of Educational and Psychological Measurement and
Evaluation, Belmont: University of Washington, 1980.
Sudijono, Anas. Pengantar Evaluasi Pendidikan, Jakarta: PT. Raja Grafindo
Persada, 2006.
Wiersma, William. Educational Measurement and Testing, Boston: Allyn &
Bacon, 1990.