Student ID : 610058427
UNIVERSITY OF EXETER
GRADUATE SCHOOL OF EDUCATION
DESIGNING A VOCABULARY ACHIEVEMENT TEST FOR SAUDI SIX GRADERS
Student Name: Ahmed Saad Abdelsalam
Summative Assignment
Summer 2013
Submitted to Dr Susan Riley
Submitted as part of the requirements for the M.Ed in TESOL
Summer Intensive Course
Contents
Introduction
I. Teacher-based Assessment
II. Testing Vocabulary
III. Context
IV. Test-Takers
V. Test Purpose
VI. Content
VII. Structure, timing, medium/channel
VIII. Test tasks
a. Task types
b. Writing items
IX. Test Administration
a. Test environment
b. Personnel and Procedures
c. Scoring & Feedback
Conclusion
Bibliography
Appendix 1 (unit 4)
Appendix 2 (Vocabulary Achievement Test)
INTRODUCTION
Brown (2003) defines a test as “a method of measuring a person’s ability, knowledge
or performance in a given domain”. As Brown (2003) describes them, testing and assessment
are not synonyms; rather, tests are a subset of assessment. Brown (2003) draws several
distinctions among types of assessment, which arise mainly from differences in the form,
function and score interpretation of tests.
This paper describes the development of an achievement test for young Saudi learners.
The test focuses only on vocabulary knowledge. First, teacher-based assessment is
introduced. Next, the assessment of vocabulary, the ability to be tested, is discussed. Then
the test development steps are presented, following the stages of test development set out
by Fulcher & Davidson (2007), Douglas (2010) and Hughes (2003). The first stage involves
describing the context and test-takers, identifying the test purpose and setting the content,
structure and timing. The second stage involves selecting appropriate task types and writing
the test items. The last stage involves developing the rules and procedures of test
administration: the test environment, personnel, procedures, scoring and feedback. The
concepts of validity, reliability and practicality are considered throughout, so that the
resulting test is valid, reliable and practical.
I. TEACHER-BASED ASSESSMENT
Teacher-based assessment (TBA), as Davison and Leung (2009) state, is widely used in a
number of educational systems internationally. TBA can be defined as teacher-mediated,
classroom-embedded and context-based assessment. As Davison and Leung (2009) discuss,
it has several characteristics that distinguish it from traditional exams. First, the
teacher is involved in the assessment process. The assessment can be adapted and
modified, and it is conducted in the ordinary classroom. It is administered by the
students’ own teacher, not a stranger. TBA gives the teacher the opportunity to give
instant and constructive feedback. In addition, it offers ways to establish the validity
and reliability of the test: because the teacher can assess the students more than once,
there is an opportunity for assessor reflection or standardization. Moreover, TBA is
considered cheaper and more practical, as it is carried out by the teacher and integrated
into the normal curriculum.
II. TESTING VOCABULARY
Vocabulary is one of the controversial topics in language teaching. Many theories and
approaches have discussed the nature of vocabulary. Several scholars, such as Schmitt (2008),
Nagy (2005) and Herman and Dole (1988), have set out approaches, techniques and procedures
for teaching and learning vocabulary. They view vocabulary as the main factor in language
learning and teaching. Wilkins (1972), as cited by Schmitt (2010), says: “Without grammar
very little can be conveyed, without vocabulary nothing can be conveyed.”
Read (2000) argues that the assessment of vocabulary knowledge is necessary and
straightforward. He describes words as small units that form larger structures such as
sentences, paragraphs and whole texts. Read (2000) refers to two perspectives on
vocabulary assessment, which he describes as complementary because they relate to
different purposes. From the first perspective, the meaning and usage of a set of words
are assessed as independent semantic units. Classroom teachers can use this kind of
assessment to check the students’ progress in learning vocabulary and to identify points
of weakness. From the other perspective, vocabulary is assessed in context, integrated
with other language knowledge. This kind of test is a large scale one, in which vocabulary
has less importance. From these perspectives, three dimensions have been suggested for
vocabulary assessment: discrete – embedded, selective – comprehensive and context
independent – context dependent. The first dimension, as Read (2000) states, concerns the
construct, that is, the ability to be measured. A discrete test assesses vocabulary as a
distinct construct, whereas an embedded test assesses vocabulary as part of a larger
language construct. The second dimension relates to the range of vocabulary: whether the
test is based on selected words or on overall language use. The third dimension concerns
context. In a context independent vocabulary test, the vocabulary is presented in
isolation; in a context dependent vocabulary test, it is presented within a context.
III. CONTEXT
The test takes place in a Saudi national school that has an American division. The school is
accredited by AdvancED and follows the California standards for public schools. The Scott
Foresman Reading Street coursebook is used in the elementary stage. It is a story-based
coursebook. The core book is divided into units; each unit consists of five stories that
contain the vocabulary to be learned and the related word knowledge. The teachers are
Arab nationals. They hold bachelor’s degrees in language education; however, they do not
have training qualifications such as CELTA.
IV. TEST-TAKERS
The characteristics of the test-takers generally do not influence the test design very
much. However, this is not the case with young learners. McKay (2006) shows that young
learners have special characteristics that require a special approach to their
assessment. This approach should take into account that young learners are still
growing cognitively, socially, emotionally and physically. They are also not aware of
the consequences of success or failure in the assessment. In addition, as Hughes (2003)
describes, young learners lose attention in long tests. They respond well to pictures
and engage with games. All these features should be taken into account when the target
test is designed, so that the test has a positive impact on the learners and achieves
its main purpose.
In this case, the test-takers match McKay’s (2006) description. They are Saudi sixth
graders and young language learners. They do not pay much attention to test results,
except when it comes to praise and rewards. They get bored in long tests, and this is
reflected in their answers; they may leave some questions unanswered. On the other
hand, their performance in exams affects their attitudes toward the subjects.
V. TEST PURPOSE
Many scholars, such as Douglas (2010), Read (2000), Hughes (2003) and Brown (2003),
describe defining the test purpose as an important step in designing a test. The test
purpose, as Hughes (2003) states, is the basis for test validity: the test is valid as
long as its results help to achieve the goals for which it was designed. Thus, the first
step in designing the target test is to define a clear purpose for it. Read (2000)
suggests three broad purposes for language tests: research, making decisions about
learners and making decisions about a language program. Hughes (2003) also suggests
some key questions that help to build a solid basis for designing the test. These
questions concern the test type, its precise purpose, the abilities to be tested, the
form and accuracy of the results and the importance of backwash. Hughes’s (2003)
questions, together with Read’s (2000) broad purposes, form the framework for defining
the purpose of the target test.
This test is a teacher-based assessment, not a large scale one. It is intended for Saudi
sixth graders in a Saudi national school. It aims at making decisions about learners, not
at research or making decisions about language programs. It assesses the students’
vocabulary knowledge, which mainly includes the meaning and use of words, with a focus
on recognition and recall skills. According to the categorization of Brown (2003) and
Hughes (2003), it is an achievement test, as it takes place at the end of Unit 4 of Scott
Foresman Reading Street (Grade 6). It is a formal assessment, as it is taken regularly at
the end of each unit. It is a discrete-point test, not an integrative one, as it assesses
only vocabulary. This type of testing is used in the target context for several reasons.
First, vocabulary learning is one of the weak points in the students’ performance, so
regular assessment will motivate them to focus on vocabulary. In addition, the school
follows an ongoing assessment scheme that takes up 60% of the overall grade; 30% is
allocated to weekly and unit tests alone. Moreover, this regular test gives the students,
as well as their parents, the opportunity to identify points of weakness in vocabulary.
The teacher can then suggest appropriate solutions to overcome the challenges the
students face.
VI. CONTENT
Read (2000) and Hughes (2003) emphasize the importance of defining the content of
the test. Read (2000) holds that full information on the content helps to make better
decisions in the subsequent steps of test development. Hughes (2003) also shows that
defining the construct to be assessed influences the validity of the test. Content
validity is an important factor in building the validity of the test, and it also
affects backwash. Hughes (2003) shows that content validity is achieved by including
proper samples of the skills or knowledge to be assessed. Alderson, Clapham and Wall
(1995) state that content validation is based on gathering experts’ judgements about
the content.
On the other hand, Hughes (2003) argues that the description of the content varies
according to the abilities to be tested. For instance, the content of a grammar test
would be a list of structures, whereas the content of a language skill test would cover
other dimensions such as type of text, topics, speed of processing and dialects. In the
case of a vocabulary test, another issue arises: selecting the target words. Read (2000)
shows that the selection of target words varies according to the type of test. In
designing a classroom progress test, as Read (2000) describes, the teacher usually
selects the words that the learners have recently been studying. This process is simple
when the test is based on coursebooks that include word lists for each unit. After the
words are selected, the issue of how to present them arises. Read (2000) argues that the
words can be presented either in isolation or in context. The key factor here is
whether the test-takers’ responses will depend on the contextual information.
In this case, as shown in the test purpose, the test is an achievement test on
vocabulary, so the outline will be simple and practical. It will simply include the word
lists to be assessed. The process of selecting the target words will also be simple. The
test is based mainly on the Scott Foresman Reading Street coursebook. This coursebook is
divided into units; each unit has five stories, and each story contains a list of high
frequency words. The target word lists will be based on Unit 4, as shown in Appendix
1. The word lists will be monolingual, with the definitions presented in English. All
the words taught in the unit will be included. The content will then be examined by the
academic supervisor, which supports its content validity.
VII. STRUCTURE, TIMING, MEDIUM/CHANNEL
According to the syllabus outline and the available time, this test usually takes place
at the end of the unit in one period. Therefore, the test will take 30 minutes. As
mentioned before, it will assess only vocabulary. It will include 20 items and will be
a paper-and-pencil test.
VIII. TEST TASKS
a. Task Types
Douglas (2010) shows that various testing techniques can be used according to the
target language skill or knowledge. Techniques used for assessing speaking may differ
from those used for assessing writing. However, Hughes (2003) refers to some common
techniques that can be used with different language skills. These include multiple
choice items, Yes/No items and gap-filling items. Read (2000) suggests several
techniques for assessing vocabulary: matching items, completion items and sentence
writing items. The matching technique is a recognition task that focuses on the meaning
of the word. Read (2000) offers some points to consider when writing matching items.
One is to add one or two extra items. Another is that the definitions should be easy to
understand. A third is that the words should belong to one word class. The completion
technique, on the other hand, is a recall task. One idea suggested by Read (2000) is to
provide the first letter of the target word so that the number of possible answers is
restricted. The sentence writing technique is the simplest vocabulary task from the
perspective of test preparation (Read 2000). In addition, this technique assesses
meaning, grammatical function, word collocation and the ability to use the word in
writing.
In addition, Douglas (2010) groups testing techniques into three categories:
selected response, short response and extended response. In the selected response
category, the test-takers choose answers from the available options. This category
includes multiple choice, matching and ordering tasks. It has the advantage of easy
scoring and is able to assess language knowledge. However, it requires a great deal of
care in development, through revision and trialling. In short response items, the
test-takers produce words or phrases to complete sentences. The main difficulty in
developing such items is that all the acceptable answers must be provided for. These
items also require a great deal of care in development. In the extended response
category, the test-takers are expected to produce written or spoken discourse longer
than a sentence. One common extended response task is the essay or composition.
In this case, selected response items will be used to assess vocabulary; specifically,
matching items and completion items. Several reasons lie behind this choice. First, these
techniques are easy to score, so they are more appropriate for the target test. Because the
test takes place during teaching, there will not be enough time to score short answer and
extended answer techniques. In addition, the matching and completion techniques cover the
vocabulary abilities to be tested: the matching technique will assess recognition and the
completion technique will assess recall. The sentence writing technique is not appropriate
in this case. It is time consuming and involves grammar and writing, which are not
objectives of this test.
b. Writing Items
Hughes (2003) argues that it is not possible to cover all the content in any one
version of a test. To assure validity, the test should sample widely and unpredictably
from the content area. In the case of testing vocabulary, Hughes shows that the
vocabulary items can be selected at random. He also emphasizes the importance of
keeping the test specifications in mind when writing the items. In addition, Hughes
(2003) states that the test should be examined by at least two colleagues. He considers
moderation the best way to identify items that need improvement.
In this case, four vocabulary items will be taken at random from each story, so that
all stories are covered by the test. Two of the words will be used in the matching part
and the other two in the completion part. The matching part will assess recognition. It
will be divided into sections so that it is not too long for the students. Five words
will be set against seven definitions in each section. In the completion part, ten
sentences based on the other ten selected words will be written. The target word will
be removed from each sentence, leaving only its first letter. These sentences can be
extracted from the coursebook or any other text material. Once written, the test items
will be examined by the academic supervisor so that they can be improved. Sometimes the
test is also examined by a teacher who teaches the same grade.
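The selection procedure described above could be sketched in Python as follows. This is
only an illustration under stated assumptions: the function name sample_items, the seed
parameter and the placeholder word lists are hypothetical, and the real lists would come
from the five stories of Unit 4 (Appendix 1).

```python
import random


def sample_items(word_lists, per_story=4, seed=None):
    """Pick `per_story` words at random from each story and split them
    evenly between the matching part and the completion part."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    matching, completion = [], []
    for story, words in word_lists.items():
        chosen = rng.sample(words, per_story)  # no word is drawn twice
        half = per_story // 2
        matching.extend(chosen[:half])
        completion.extend(chosen[half:])
    return matching, completion


# Hypothetical placeholder lists: five stories with seven words each.
word_lists = {
    f"story_{s}": [f"s{s}_word_{w}" for w in range(1, 8)]
    for s in range(1, 6)
}

matching, completion = sample_items(word_lists, seed=2013)
print(len(matching), len(completion))  # 10 10
```

Sampling within each story, rather than from the pooled unit list, is what guarantees
that every story contributes two matching words and two completion words.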
IX. TEST ADMINISTRATION
a. Test Environment
Douglas (2010) refers to the importance of giving the test-takers the best
opportunity to show their language ability. Therefore, the test should be taken in a
setting that is as pleasant as possible. Hughes (2003) says that rooms should be
appropriate to the number of test-takers, with enough space between tables to prevent
copying and cheating. Hughes (2003) also states that the testing materials should be
ready in advance and that all required equipment should be available in plenty of time
for repair or replacement. Douglas (2010) suggests having short breaks in long tests
that last two or three hours.
In this case, as the test is a teacher-based assessment, the students will take it in
their ordinary classroom. The desks are to be checked two days beforehand to allow for
repair or replacement. The desks are arranged in rows with enough space between them to
prevent cheating. The test sheets are to be photocopied two or three days beforehand.
The test will last only 30 minutes, so there will be no breaks.
b. Personnel and Procedures
Hughes (2003) and Douglas (2010) show that many factors related to the personnel and
procedures of the test have to be considered. First, examiners should be given full
instructions and should receive practice in the different aspects of the test. There should
be enough proctors, especially when there is a large number of test-takers. The test-takers
themselves should receive full instructions about the test. Rules should be set about
arrival time, acceptable behavior and identity checks.
In this case, the examiner, or proctor, will be the teacher. The students will be ready on
time, as the test will be taken in one period of the school day. The students will be
notified on the test day. The teacher will give the students full instructions about what
they should do in the test, the testing rules and the time limit.
c. Scoring & Feedback
After the test has been designed, scoring and grading must be considered (Brown 2003).
The way of scoring varies according to the type of subtest. In objective subtests,
Alderson, Clapham and Wall (1995) say that each item is assigned one mark if correct
and 0 if wrong. These marks can then be added together to arrive at a total for the
subtest or the whole test. Such tests can be scored mechanically (Douglas 2010). On the
other hand, holistic and analytic ratings are used to assess performance on a whole
test or on individual tasks. Douglas (2010) says that these tests require trained
raters.
Alderson, Clapham and Wall (1995) show that some items may be more important than
others and could therefore be given extra value. However, they state that the simplest
method of weighting is to give the same mark to each item.
One important step when setting the scoring plan is to determine the pass mark.
Alderson, Clapham and Wall (1995) suggest many ways to set the pass mark. These depend
mainly on whether the test is norm-referenced or criterion-referenced. The main
difference between these approaches is whether the scoring is based on the ranking of
the test-takers’ scores or on a standard or criterion. Alderson, Clapham and Wall
(1995) refer to some traditional pass marks, such as 50% or 75%.
In this case, the test contains 20 objective items: matching items and completion
items. Each item will be assigned one mark if correct and 0 if wrong. All the items
will be given the same mark, as they are all vocabulary items and therefore have the
same importance. In addition, the recall and recognition skills receive the same focus.
Concerning the pass mark, the school itself uses a 50% pass mark; therefore, the test
follows the same pass mark. The test will be marked by the students’ teacher. This
could help to establish the reliability of the test, as the teacher already knows the
target students. He also has the opportunity to carry out more than one assessment to
check reliability.
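The scoring rule above (one mark per correct item, equal weighting, 50% pass mark) could
be sketched in Python as follows. The function name score_test, the answer key and the
responses are hypothetical. Two details are assumptions not stated in the paper: answers
are compared without regard to letter case or surrounding spaces, and a score of exactly
50% counts as a pass.

```python
def score_test(answer_key, responses, pass_fraction=0.5):
    """Return (total, passed): one mark per correct item, 0 otherwise."""
    total = 0
    for item, correct in answer_key.items():
        given = responses.get(item, "")  # an unanswered item scores 0
        # Assumption: ignore letter case and surrounding spaces.
        if given.strip().lower() == correct.strip().lower():
            total += 1
    # Assumption: reaching exactly the pass fraction counts as a pass.
    passed = total >= pass_fraction * len(answer_key)
    return total, passed


# Hypothetical key and responses for a 20-item test.
answer_key = {i: f"answer{i}" for i in range(1, 21)}
responses = {i: f"answer{i}" for i in range(1, 13)}  # items 1-12 correct
responses[13] = "wrong"                              # item 13 wrong, 14-20 blank

total, passed = score_test(answer_key, responses)
print(total, passed)  # 12 True
```

Because every item carries the same mark, the total is simply a count of correct items,
which is what allows the test to be scored mechanically.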
Brown (2003) shows that the scoring process is not complete until feedback is given.
In this case, as the target test is a teacher-based assessment, the students will receive
instant and constructive feedback.
CONCLUSION
To conclude, an achievement test has been designed (see Appendix 2). It can be described as
a teacher-based assessment. The test will be taken in a Saudi national school. The
test-takers are Saudi sixth graders in the American division of the school. The test focuses
only on vocabulary knowledge, specifically recognition and recall skills. It is part of an
ongoing assessment that takes place weekly and at the end of each unit. It also helps to
build beneficial washback. The content of the test is based on the high frequency word lists
of Unit 4 in Scott Foresman Reading Street. These lists are supported by an English
definition for each word. The time limit of the test will be 30 minutes. It consists of 20
items. The test is a paper-and-pencil one. The test tasks will be selected response ones:
completion and matching items. Four vocabulary items will be selected at random from each
story, two for the matching part and the other two for the completion part. The test will be
examined by the academic supervisor. It may also be examined by another teacher. The test
will be conducted in the ordinary classroom. It will be administered by the students’ own
teacher, not a stranger. The teacher will also mark the test. Each item will be assigned one
mark if correct and 0 if wrong. The students will receive instant and constructive feedback
after the test has finished.
BIBLIOGRAPHY
Alderson, J. C., Clapham, C. & Wall, D. (1995). Language Test Construction and Evaluation.
Cambridge University Press: UK.
Brown, H. D. (2003). Language Assessment: Principles and Classroom Practices. Pearson &
Longman.
Davison, C. & Leung, C. (2009). Current Issues in Teacher Based Assessment. TESOL Quarterly,
Volume 43, Issue 3, pages 393–415, September 2009.
Douglas, D. (2010). Understanding Language Testing. Abingdon: Hodder.
Fulcher, G. & Davidson, F. (2007). Language Testing and Assessment: An Advanced Resource Book.
Routledge: USA & Canada.
Herman, A. & Dole, J. (1988). Theory and Practice in Vocabulary Learning and Instruction. The
Elementary School Journal, Vol. 89, No. 1 (Sep. 1988), pp. 42-54.
Hughes, A. (2003). Testing for Language Teachers. Cambridge University Press: UK.
McKay, P. (2006). Assessing Young Language Learners. Cambridge University Press: UK.
Nagy, W. (2005). Why Vocabulary Instruction Needs to Be Long-Term and
Comprehensive. In Hiebert & Kamil (Eds.), Teaching and Learning Vocabulary (pp. 26-40). New
Jersey: Lawrence Erlbaum Associates, Publishers.
Read, J. (2000). Assessing Vocabulary. Cambridge University Press: UK.
Schmitt, N. (2008). Teaching Vocabulary. Pearson Education, Inc.
Schmitt, N. (2010). Researching Vocabulary: A Vocabulary Research Manual. Palgrave
MacMillan.
Appendix 1
Appendix 2
Name : ………………………………………………………………………………………………………………………………..
Grade: 6    Subject: L.A.    Unit: 4    Selections: 1 – 5    Date: ...../……/2014
Vocabulary Test
Directions: Match the word with its meaning. (1 mark each)
( A)
1. Existence ( )
2. Ordeal ( )
3. Encounter ( )
4. Earthen ( )
5. Proclaimed ( )
(B)
a. Made of soil
b. Meet in a battle
c. Condition of being
d. Declared publicly
e. Friendly
f. The state of being alone
g. A severe experience
(A)
1. Generated ( )
2. Destiny ( )
3. Isolation ( )
4. Aliens ( )
5. Hospitable ( )
(B)
a. One’s fate or fortune
b. Condition of being
c. Imaginary creatures
d. The state of being alone
e. Produced
f. Friendly
20
Directions: Complete the following sentences with the appropriate word.
(1 mark each) Note: The first letter of the word is already written.
1. N……………………. is a person in charge of finding the position and course of a ship or
aircraft.
2. E……………………………. is a journey taken for a special purpose.
3. Azzam really enjoyed the c………………………. among his friends.
4. The science class went to visit several wildlife s……………….. .
5. Enslaved people experienced the b…………………….
6. S……………………. is what a town or city of today once was.
7. They mine iron o……………… here.
8. Her v…………………. of the story was hilarious.
9. “It c…………………… food into electrical pulses,” the inventor declared.
10. If the machine worked, it could result in more e…………………….. by saving time.
(Sentences are extracted from Scott Foresman Reading Street / Practice Book)