VALIDITY AND RELIABILITY

VALIDITY AND RELIABILITY. VALIDITY In scientific research validity refers to whether a study is able to scientifically answer the questions it is intended to answer.

Dec 24, 2015


Sylvia Dixon
Transcript
Page 1: VALIDITY AND RELIABILITY. VALIDITY  In scientific research validity refers to whether a study is able to scientifically answer the questions it is intended.

VALIDITY AND RELIABILITY


VALIDITY

In scientific research validity refers to whether a study is able to scientifically answer the questions it is intended to answer.

Instrument selection is important for validity because instruments are used to collect data, and data are used to make inferences related to the questions.

Thus, the inferences about the specific uses of an instrument should also be validated.


What do we mean by validity of inferences?

Our inferences should be relevant to the purpose of the study (appropriate). E.g. if we want to see what our students’ attitudes towards learning English are, there is no use in making inferences from their scores on English tests.

Our inferences should be meaningful and correct. We should say something about the meaning of the information we collect. E.g. what does a high score on a particular test mean?


Our inferences should be useful. They should help researchers make a decision related to what they are trying to find out. E.g. if you want to see positive effects of formative assessment on student achievement, you should have information that will help you infer whether your students’ achievement is affected by formative assessment or not.

Thus, validity depends on the amount and type of the evidence you have!


Kinds of evidence of validity

Content-Related evidence of validity

Content and format of the instrument: the degree to which an instrument logically appears to measure an intended variable

How appropriate is the content?
Is the format appropriate?
Does it logically get at the intended variable?
How adequately does the sample of items or questions represent the content to be assessed?


Two points to consider in content-related evidence

i) adequacy of sampling

Whether the instrument contains an adequate sample of the domain of content it is supposed to represent. E.g. if you want to see your students’ achievement at the macro level, you should have a sufficient number of items that address this skill.


ii) format of the instrument

Clarity of printing, size of type, adequacy of work space, appropriateness of language, clarity of directions, etc.

E.g. if you want to see students’ attitudes towards English, the questionnaire should be in their native language if their level of target language proficiency is not high enough.


How do we obtain content-related evidence of validity?

Write out the definition of what you want to measure and give this definition (together with the instrument and the intended sample) to a number of judges.

The judges look at the definition and place a checkmark in front of each item in the instrument that they feel does not measure the objectives.

They also place a checkmark in front of each aspect of the definition that is not assessed by the instrument.

They evaluate the appropriateness of the format. Then the researcher rewrites the flagged items and adds items for any aspects that were not covered. This continues until all judges approve of all items.


Example

Judge No: ___________

Match to Portfolio Assessment Objectives (scale: 1 = No Match, 5 = Perfect Match)

A RANGE
1. ability to link ideas in a variety of ways   1 2 3 4 5
2. ability to use wide range of genres (stories, reports, articles, etc)   1 2 3 4 5
3. evidence of various topics   1 2 3 4 5

B FLEXIBILITY
4. evidence of variations in the style, vocab, tone, lang., voice and ideas   1 2 3 4 5
5. evidence for the appropriateness of style, vocab, tone, lang. and voice   1 2 3 4 5

C CONNECTIONS
6. evidence of applications of already-known concepts to newly-learned ones   1 2 3 4 5
7. evidence of new concepts and/or metaphors   1 2 3 4 5


General Aims of the Portfolio Assessment System
1. improving students’ writing abilities
2. improving students’ metacognitive skills
3. leading students to become autonomous language learners

Specific Objectives of the Portfolio Assessment System
I- Helping students improve their linguistic skills in writing in terms of
A) Grammar, punctuation and spelling
B) Vocabulary
C) Coherence and cohesion
II- Helping students improve their metacognitive skills in terms of
A) Applying and/or creating new concepts or ideas
B) Using varieties in writing appropriately
C) Analysing and synthesising what they have learned/read
D) Using other sources
III- Helping students become autonomous language learners in terms of
A) Applying their own views
B) Connecting other sources with what they know


Criterion-Related Evidence

Comparing performance on one instrument with performance on some other instrument or measure (the criterion).

Two forms are available:

a) predictive validity: compares the scores on the original test with scores on one or more criterion measures obtained in a follow-up testing

b) concurrent validity: compares the test results with results obtained through a parallel, substitute measure at about the same time


In both forms, a correlation coefficient is used.

The correlation coefficient (r) shows the degree of relationship that exists between the scores individuals obtain on two instruments.

A positive relationship: a high (low) score on one instrument is accompanied by a high (low) score on the other.

A negative relationship: a high (low) score on one instrument is accompanied by a low (high) score on the other.

Correlation coefficients fall somewhere between +1.00 and -1.00. An r of .00 indicates that no relationship exists.
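As a minimal sketch of how such a coefficient can be computed, the following Python function implements the standard Pearson product-moment formula. The function name and the score lists are hypothetical and serve only as an illustration; they do not come from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one set of scores has no variation")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same five students on three instruments.
test_a = [10, 20, 30, 40, 50]
test_b = [12, 22, 32, 42, 52]  # rises together with test_a (positive relationship)
test_c = [50, 40, 30, 20, 10]  # falls as test_a rises (negative relationship)

r_positive = pearson_r(test_a, test_b)
r_negative = pearson_r(test_a, test_c)
print(r_positive, r_negative)
```

With these deliberately extreme data the two coefficients sit at the ends of the range; real instruments normally give values somewhere in between.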


Construct-Related Evidence

Establishing a link between the underlying theoretical construct we wish to measure and the visible performance we choose to observe

Construct validation consists of building a strong logical case based on circumstantial evidence that a test measures the construct it is intended to measure


Generally there are 3 steps:

i) the variable being measured is clearly defined

ii) hypotheses, based on a theory underlying the variable, are formed about how people who possess a lot versus a little of the variable will behave in a particular situation

iii) the hypotheses are tested both logically and empirically


RELIABILITY

The consistency of the scores obtained.

It is possible to have quite reliable but invalid scores. (Unreliable scores, however, can never be valid!)

What is desirable is to have both high reliability and high validity.


Errors of Measurement

When someone takes the same test twice, they rarely perform exactly the same, due to many factors.

Such factors result in errors of measurement.

Because of errors of measurement, researchers expect some variation in scores.

Reliability estimates help researchers have an idea of how much variation to expect.


This estimate is another application of the correlation coefficient, known as a reliability coefficient.

A reliability coefficient again expresses a relationship, but it is between scores of the same individuals on the same instrument at two different times, or between two parts of the same instrument.

There are three best-known ways to obtain a reliability coefficient.


1. Test-Retest Method

Administering the same test twice to the same group after a certain time. A reliability coefficient indicates the relationship between the two sets of scores obtained.

The reliability coefficient is affected by the length of the time interval. The longer the interval, the lower the reliability coefficient is likely to be.

The researcher should choose an interval over which the individuals can be expected to retain their relative positions in the group.

Most of the time, a 1-3 month interval is sufficient.


2. Equivalent-Forms Method

Two different but equivalent (parallel) forms of an instrument are administered to the same group of individuals during the same period of time.

The questions (items) are different but they sample the same content.

A high reliability coefficient is strong evidence that the two forms are measuring the same thing.


3. Internal-Consistency Methods

There are several internal-consistency methods and they all require only a single administration of an instrument.


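Split-half procedure

The two halves of a test (e.g. odd items vs even items) are scored separately and a correlation coefficient is calculated for the two sets of scores.

Because this correlation describes a test only half as long as the original, the Spearman-Brown prophecy formula is then used to estimate the reliability of the full-length test.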

The reliability of a test (instrument) can be increased by adding more items.
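The split-half procedure can be sketched in Python as follows. The sketch assumes the usual form of the Spearman-Brown prophecy formula, reliability = k·r / (1 + (k − 1)·r), where r is the correlation between the halves and k is the factor by which the test is lengthened (k = 2 for going from a half to the whole test). All function names and the 0/1 response matrix are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r, factor=2):
    """Predicted reliability when test length is multiplied by `factor`."""
    return factor * r / (1 + (factor - 1) * r)

def split_half_reliability(item_scores):
    """item_scores: one row per person, one column per item."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half, factor=2)

# Hypothetical data: 5 students x 6 items, 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

r_half, r_full = split_half_reliability(responses)
print(round(r_half, 3), round(r_full, 3))

# Lengthening a test with comparable items raises the predicted reliability.
print(spearman_brown(0.6, factor=2), spearman_brown(0.6, factor=4))
```

The last line illustrates the point about adding items: under the formula's assumption that the new items behave like the existing ones, a test with reliability .60 is predicted to reach .75 when doubled in length.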


Kuder-Richardson Approaches

Two formulas: KR20 and KR21

KR21 is used when all items are of equal difficulty: you need the number of items on the test, the mean, and the standard deviation

KR20 is more complicated but must be used when you cannot assume that all items are of equal difficulty
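A minimal Python sketch of the two formulas is given below. It assumes the commonly cited forms KR20 = (k / (k − 1)) · (1 − Σpq / σ²) and KR21 = (k / (k − 1)) · (1 − M(k − M) / (kσ²)), where k is the number of items, p the proportion answering an item correctly, q = 1 − p, M the mean total score, and σ² the variance of total scores. Using the population variance (dividing by N) is an assumption of this sketch; textbooks differ on this point. Function names and data are hypothetical.

```python
from math import sqrt

def population_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def kr20(item_scores):
    """KR-20 for items scored 1 (right) or 0 (wrong); rows are persons."""
    n_persons = len(item_scores)
    k = len(item_scores[0])
    total_var = population_variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have no variation")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n_persons  # item difficulty
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_var)

def kr21(k, mean, sd):
    """KR-21 needs only the number of items, the mean and the standard deviation."""
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))

# Hypothetical data: 5 students x 6 items of clearly unequal difficulty.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
totals = [sum(row) for row in responses]
mean_total = sum(totals) / len(totals)
sd_total = sqrt(population_variance(totals))

kr20_value = kr20(responses)
kr21_value = kr21(len(responses[0]), mean_total, sd_total)
print(round(kr20_value, 3), round(kr21_value, 3))
```

Because the items in this made-up data set differ a lot in difficulty, KR21 comes out noticeably lower than KR20, which is why KR20 is the safer choice when equal difficulty cannot be assumed.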


Alpha Coefficient (Cronbach alpha) (α)

A general form of the KR20 formula.

Used to calculate the reliability of items that are not scored right versus wrong, e.g. some essays where more than one answer is possible.
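The following Python sketch assumes the usual formula α = (k / (k − 1)) · (1 − Σσᵢ² / σ²), where σᵢ² is the variance of item i and σ² the variance of total scores. Population variances are used here, which is an assumption of the sketch. The function name and the ratings are hypothetical.

```python
def population_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def cronbach_alpha(item_scores):
    """Cronbach's alpha; rows are persons, columns are items (any numeric scale)."""
    k = len(item_scores[0])
    total_var = population_variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have no variation")
    item_vars = [
        population_variance([row[j] for row in item_scores]) for j in range(k)
    ]
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical essay ratings: 4 students, each rated 1-5 on 3 criteria.
ratings = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 4, 5],
    [2, 2, 3],
]
alpha_ratings = cronbach_alpha(ratings)

# For items scored 0/1 the item variance equals p * (1 - p),
# so the same function gives the KR20 value.
right_wrong = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
alpha_right_wrong = cronbach_alpha(right_wrong)
print(round(alpha_ratings, 3), round(alpha_right_wrong, 3))
```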


Scoring Agreement

When there is subjective evaluation (like essay scoring), there is the possibility of observer differences. In that case, scoring agreement should be reported.

Such cases require training to obtain as high reliability as possible.

The expected level is a correlation of at least .90 or at least 80% agreement.
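A simple way to check the agreement criterion is sketched below in Python. It assumes that "agreement" means two raters giving exactly the same score to the same paper; the slides do not define the term, so this is only one possible reading. The function name and scores are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of papers to which both raters gave exactly the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)

# Hypothetical scores given by two raters to the same ten essays (scale 1-5).
rater_a = [4, 3, 5, 2, 4, 3, 5, 1, 2, 4]
rater_b = [4, 3, 5, 2, 3, 3, 5, 1, 2, 5]

agreement = percent_agreement(rater_a, rater_b)
meets_criterion = agreement >= 80.0  # the 80% threshold mentioned above
print(agreement, meets_criterion)
```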


In case of subjective rating, we can talk about two kinds of reliability:

Intra-rater reliability: similar to test-retest strategy.

The same raters score the papers of the same group of students on two separate occasions (e.g. two weeks apart).

Thus, intra-rater reliability is an estimate of the consistency of judgments over time.


Inter-rater reliability: similar to the equivalent-forms strategy since the scores are obtained from two different raters

Inter-rater reliability estimates the extent to which two or more raters agree on the score that should be assigned to a written sample.

A correlation coefficient is calculated between the scores. Then the obtained coefficient is adjusted using the Spearman-Brown prophecy formula.
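The two steps can be sketched in Python as follows. The sketch assumes that the adjustment uses the Spearman-Brown formula with the number of raters as the lengthening factor, n·r / (1 + (n − 1)·r), which estimates the reliability of the raters' combined score; the slides do not spell out this detail. Function names and scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown_adjust(r, n_raters):
    """Estimated reliability of the combined score of `n_raters` raters."""
    return n_raters * r / (1 + (n_raters - 1) * r)

# Hypothetical scores given by two raters to the same five written samples.
rater_a = [4, 3, 5, 2, 4]
rater_b = [5, 3, 4, 2, 4]

r_between = pearson_r(rater_a, rater_b)
r_adjusted = spearman_brown_adjust(r_between, n_raters=2)
print(round(r_between, 3), round(r_adjusted, 3))
```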