Measurement

Measurement. Measurement, test, evaluation Measurement: process of quantifying the characteristics of persons according to explicit procedures and rules.

Dec 21, 2015

Measurement

Measurement, test, evaluation

• Measurement: process of quantifying the characteristics of persons according to explicit procedures and rules

• Quantification: assigning numbers, distinguishable from qualitative descriptions

• Characteristics: mental attributes such as aptitude, intelligence, motivation, field dependence/independence, attitude, native language, fluency

• Rules and procedures: observations must be replicable in other contexts and with other individuals.

Test

• Carroll: a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual

• Elicitation: to obtain a specific sample of behavior

• Interagency Language Roundtable (ILR) oral interview: a test of speaking consisting of (1) a set of elicitation procedures, including a sequence of activities and sets of question types and topics; (2) a measurement scale of language proficiency ranging from a low level of 0 to a high level of 5.

Test?

• Years of informal contact with a child used to rate the child’s oral proficiency: the rater did not follow explicit procedures

• A rating based on a collection of personal letters to indicate an individual’s ability to write effective argumentative editorials for a news magazine.

• A teacher’s rating based on informal interactive social language use to indicate the student’s ability to use language to perform various cognitive/academic language functions.

Evaluation

• Definition: systematic gathering of information for the purpose of making decisions.

• Evaluation need not be exclusively quantitative: verbal descriptions, performance profiles, letters of reference, overall impressions

Relation Between Evaluation, Test, and Measurement

Relation Between Evaluation, Test, and Measurement

• 1. qualitative descriptions of student performance for diagnosing learning problems

• 2. teacher’s ranking for assigning grades

• 3. achievement test to determine student progress

• 4. proficiency test as a criterion in second language acquisition research

• 5. assigning code numbers to subjects in second language research according to native language

What Is It: Measurement, Test, or Evaluation?

• placement test

• classroom quiz

• grading of composition

• rating of classroom fast reading exercise

• rating of dictation

Measurement Qualities

• A test must be reliable and valid.

Reliability

• Free from errors of measurement.

• If a student takes a test twice within a short time, and if the test is reliable, the results of the 2 tests should be the same.

• If 2 raters rate the same writing sample, the ratings should be consistent if the ratings are reliable.

• The primary concern in examining reliability is to identify the different sources of error, then to use the appropriate empirical procedures for estimating the effect of these sources of error on test scores.
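
One empirical procedure of the kind mentioned above can be sketched in Python. This is only an illustration, assuming that test-retest consistency is estimated with a Pearson correlation between two administrations; the slides do not name a specific procedure, and the function name and scores below are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of five students on two sittings of the same test.
first_sitting = [62, 75, 81, 90, 55]
second_sitting = [60, 78, 80, 92, 58]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests the two sets of scores rank the students consistently; the same function could be applied to the ratings of two raters on the same writing samples.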

Validity

• Validity: the extent to which the inferences or decisions made on the basis of test scores are meaningful, appropriate and useful. The test should measure the ability and very little else.

• If a test is not reliable, it is not valid.

• Validity is a quality of test interpretation and use.

• The investigation of validity is both a matter of judgment and of empirical research.

Reliability and Validity

• Both are essential to the use of tests.

• Neither is a quality of tests themselves: reliability is a quality of test scores, while validity is a quality of interpretations or uses that are made of test scores.

• Neither is absolute: we can never attain perfectly error-free measures, and the appropriateness of a particular use of a test score depends upon many factors outside the test itself.

Properties of Measurement Scales

• 4 properties

• distinctiveness: different numbers assigned to persons with different values

• ordered in magnitude: the larger the number, the larger the amount of the attribute

• equal interval: equal difference between ability levels

• absolute zero point: the absence of the attribute

Four Types of Scales

• Nominal: naming classes or categories.

• Ordinal: categories are ordered with respect to each other.

• Interval: the distances between the levels are equal.

• Ratio: includes the absolute zero point

Nominal

• Examples: License plate numbers; Social Security numbers; names of people, places, objects; numbers used to identify football players

• Limitations: Cannot specify quantitative differences among categories

Ordinal

• Examples: Letter grades (ratings from excellent to failing), military ranks, order of finishing a test

• Limitations: Restricted to specifying relative differences without regard to absolute amount of difference

Interval

• Examples: Temperature (Celsius and Fahrenheit), calendar dates

• Limitations: Ratios are meaningless; the zero point is arbitrarily defined
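
A short Python sketch of why ratios are meaningless on an interval scale, using the temperature example above. The two temperatures are arbitrary illustrative values: the ratio between them changes with the unit, while the comparison of differences does not.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

warm_c, cool_c = 20.0, 10.0  # illustrative temperatures in Celsius
warm_f = celsius_to_fahrenheit(warm_c)
cool_f = celsius_to_fahrenheit(cool_c)

# The ratio depends on the arbitrary zero point of each scale.
ratio_c = warm_c / cool_c
ratio_f = warm_f / cool_f
print(f"ratio in Celsius: {ratio_c}, ratio in Fahrenheit: {ratio_f}")

# Equal intervals survive the change of unit: two equal 10-degree Celsius
# steps are still equal to each other after conversion to Fahrenheit.
hot_f = celsius_to_fahrenheit(30.0)
step_low_f = warm_f - cool_f
step_high_f = hot_f - warm_f
print(f"steps in Fahrenheit: {step_low_f} and {step_high_f}")
```

"Twice as warm" holds only in Celsius here, so it says nothing about the temperatures themselves; statements about equal differences hold in both units.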

Ratio

• Examples: Distance, weight, temperature in kelvins, time required to learn a skill or subject

• Limitations: None except that few educational variables have ratio characteristics

Nominal, Ordinal, Interval or Ratio?

• 5 in IELTS

• 550 in TOEFL

• C in BEC

• 8 in CET-4 writing

• 58 in the final evaluation of a student

Property and Type of Scale

Property               Nominal   Ordinal   Interval   Ratio
Distinctiveness           +         +         +         +
Ordering                  -         +         +         +
Equal intervals           -         -         +         +
Absolute zero point       -         -         -         +
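
The table can also be read as a cumulative hierarchy: each scale type has all the properties of the type before it plus one more. A minimal Python sketch of that reading follows; the names `PROPERTIES`, `SCALE_LEVEL` and `has_property` are hypothetical and used only for illustration.

```python
# Properties in the order in which the scale types acquire them.
PROPERTIES = (
    "distinctiveness",
    "ordering",
    "equal intervals",
    "absolute zero point",
)

# Number of properties (counted from the top of the table) each scale has.
SCALE_LEVEL = {"nominal": 1, "ordinal": 2, "interval": 3, "ratio": 4}

def has_property(scale, prop):
    """Return True if the given scale type has the given property."""
    return PROPERTIES.index(prop) < SCALE_LEVEL[scale]

# Print the table row by row, with "+" and "-" as in the slide.
for prop in PROPERTIES:
    marks = " ".join("+" if has_property(s, prop) else "-" for s in SCALE_LEVEL)
    print(f"{prop:<20} {marks}")
```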

Limitations in Measurement

• It is essential for us to understand the characteristics of measures of mental abilities and the limitations these characteristics place on our interpretation of test scores.

• These limitations are of two kinds: limitations in specification and limitations in observation and quantification.

Limitation in Specification

• Two levels of the specification of language ability

• Theoretical level

• Task: we need to specify the ability in relation to, or in contrast to, other language abilities and other factors that may affect test performance.

• Reality: the large number of different individual characteristics (cognitive, affective, physical) that could potentially affect test performance makes the task nearly impossible.

Limitation in Specification

• Operational level

• Task: we need to specify the instances of language performance as indicators of the ability we wish to measure.

• Reality: the complexity of and the interrelationships among the factors that affect performance on language tests force us to make simplifying assumptions in designing language tests and interpreting test scores.

Conclusion

• Our interpretations and uses of test scores will be of limited validity.

• Any theory of language test performance we develop is likely to be underspecified and we have to rely on measurement theory to deal with the problem of underspecification.

Limitations in Observation and Quantification

• All measures of mental ability are indirect, incomplete, imprecise, subjective and relative.

Indirectness

• The relationship between test scores and the abilities we want to measure is indirect. Language tests are indirect indicators of the underlying traits in which we are interested. Because scores from language tests are indirect indicators of ability, the valid interpretation and use of such scores depends crucially on the adequacy of the way we have specified the relationship between the test score and the ability we believe it indicates. To the extent that this relationship is not adequately specified, the interpretations and uses made of the test score may be invalid.

Incompleteness

• The performance we observe and measure in a language test is a sample of an individual's total performance in that language.

Incompleteness

• Since we cannot observe an individual's total language use, one of our main concerns in language testing is assuring that the sample we do observe is representative of that total use - a potentially infinite set of utterances, whether written or spoken.

Incompleteness

• It is vitally important that we incorporate into our measurement design principles or criteria that will guide us in determining what kinds of performance will be most relevant to and representative of the abilities we want to measure, for example, real life language use.

Imprecision

• Because of the nature of language, it is virtually impossible (and probably not desirable) to write tests with 'pure' items that test a single construct or to be sure that all items are equally representative of a given ability. Likewise, it is extremely difficult to develop tests in which all the tasks or items are at the exact level of difficulty appropriate for the individuals being tested.

Subjectivity

• As Pilliner (1968) noted, language tests are subjective in nearly all aspects.

• Test developers

• Test writers

• Test takers

• Test scorers

Relativeness

• The presence or absence of language abilities is impossible to define in an absolute sense.

• The concept of 'zero' language ability is a complex one.

• The individual with absolutely complete language ability does not exist.

• All measures of language ability based on domain specifications of actual language performance must be interpreted as relative to some 'norm' of performance.

Steps in Measurement

• Three steps

• 1. identify and define the construct theoretically

• 2. define the construct operationally

• 3. establish procedures for quantifying observations

Defining Constructs Theoretically

• Historically, there were two distinct approaches to defining language proficiency.

• Real-life approach: language proficiency itself is not defined, but a domain of actual language use is identified.

• The approach assumes that if we measure features present in language use, we measure language proficiency.

Real Life Approach: Example

• American Council on the Teaching of Foreign Languages (ACTFL): definition of advanced level

• Able to satisfy the requirements of everyday situations and routine school and work requirements. Can handle with confidence but not with facility complicated tasks and social situations, such as elaborating, complaining, and apologizing. Can narrate and describe with some details, linking sentences together smoothly. Can communicate facts and talk casually about topics of current public and personal interest, using general vocabulary.

Interactional/ability Approach

• Language proficiency is defined in terms of its component abilities. These components can be reading, writing, listening, and speaking (Lado), a functional framework (Halliday), or communicative frameworks (Munby).

Example of Pragmatic Competence

• The knowledge necessary, in addition to organizational competence, for appropriately producing or comprehending discourse. Specifically, it includes illocutionary competence, or the knowledge of how to perform speech acts, and sociolinguistic competence, or the knowledge of the sociolinguistic conventions which govern language use.

Defining Constructs Operationally

• This step involves determining how to isolate the construct and make it observable.

• We must decide what specific procedures we will follow to elicit the kind of performance that will indicate the degree to which the given construct is present in the individual.

• The context in which the language testing takes place influences the operations we would follow.

• The test must elicit language performance in a standard way, under uniform conditions.

Quantifying Observations

• The units of measurement of language tests are typically defined in two ways.

• 1. points or levels of language performance.

• From zero to five in oral interview

• Different levels in mechanics, grammar, organization, content in writing

• Mostly an ordinal scale, therefore needing appropriate statistics for ordinal scales.

• 2. the number of tasks successfully completed

Quantifying Observations

• 2. the number of tasks successfully completed

• We generally treat such a score as being on an interval scale.

• Conditions for an interval scale

Quantifying Observations

• the tasks or items must be defined and selected in a way that enables us to determine their relative difficulty and the extent to which they represent the construct being tested.

• the relative difficulty: determined from the statistical analysis of responses to individual test items.

• How much they represent the construct: depends on the adequacy of the theoretical definition of the construct.
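
As an illustration of determining relative difficulty from item responses, the sketch below computes one commonly used index, the proportion of test takers answering each item correctly (often called item facility). The choice of this index and the response data are assumptions; the slides do not specify which statistic is meant.

```python
# Hypothetical responses: one row per test taker, one column per item,
# 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]

def item_facility(rows):
    """Proportion of test takers who answered each item correctly."""
    n_takers = len(rows)
    return [sum(column) / n_takers for column in zip(*rows)]

facility = item_facility(responses)

# Rank items from easiest (highest proportion correct) to hardest.
easiest_first = sorted(range(len(facility)), key=lambda i: facility[i], reverse=True)
for i in easiest_first:
    print(f"item {i + 1}: {facility[i]:.2f}")
```

A higher proportion means an easier item for this group of test takers, so the index is relative to the sample that took the test.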

Score Sorting

• Raw score

• Score Class

1. Range: $R$ = highest score minus lowest score

2. Number of groups: $K = 1.87(N-1)^{2/5}$, where $N$ is the number of scores

3. Interval: $I = R/K$

4. Highest and lowest score of each group

5. Arrange the data into groups
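
The five steps can be sketched in Python as follows. This is an illustration under stated assumptions: the exponent in the formula for K is read as 2/5, scores are whole numbers, K is rounded to the nearest integer, and the interval is rounded up so that every score falls in some group (which can produce one more group than K). The function name and the scores are hypothetical.

```python
import math

def group_scores(scores):
    """Sort integer raw scores into equal-width score classes."""
    n = len(scores)
    lowest, highest = min(scores), max(scores)
    score_range = highest - lowest                      # step 1: R
    k = max(1, round(1.87 * (n - 1) ** (2 / 5)))        # step 2: K
    width = max(1, math.ceil(score_range / k))          # step 3: I = R / K, rounded up
    groups = {}
    lower = lowest
    while lower <= highest:
        upper = lower + width - 1                       # step 4: limits of each group
        groups[(lower, upper)] = sum(lower <= s <= upper for s in scores)  # step 5
        lower += width
    return groups

# Hypothetical raw scores for 20 students.
scores = [41, 48, 52, 55, 57, 60, 63, 64, 66, 68,
          70, 71, 73, 75, 78, 80, 84, 88, 91, 95]

for (lower, upper), count in group_scores(scores).items():
    print(f"{lower}-{upper}: {count}")
```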

Central Tendency & Dispersion

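• Mean: $\bar{x} = \sum x / N$

• Median: the middle score when the scores are ranked in order

• Mode: the most frequently occurring score, around which the bulk of the data congregate

• Variance: $V = \sum (x - \bar{x})^2 / (N - 1)$

• Standard deviation: $S = \sqrt{\sum (x - \bar{x})^2 / (N - 1)}$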
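
These statistics can be computed with Python's standard `statistics` module, as in the sketch below. The scores are hypothetical. `statistics.variance` and `statistics.stdev` divide by one less than the number of scores, matching the formulas above, and `statistics.median` returns the middle value of the ranked scores.

```python
import statistics

# Hypothetical raw scores for seven students.
scores = [55, 60, 60, 70, 75, 80, 90]

mean = statistics.mean(scores)
median = statistics.median(scores)      # middle value of the ranked scores
mode = statistics.mode(scores)          # most frequent score
variance = statistics.variance(scores)  # sum of squared deviations / (n - 1)
std_dev = statistics.stdev(scores)      # square root of the variance

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, standard deviation={std_dev:.2f}")
```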