
Applied Psych Test Design: Part E--Calculate norms and derived scores

Dec 03, 2014

Kevin McGrew

The Art and Science of Applied Test Development. This is the fifth in a series of PPT modules explicating the development of psychological tests in the domain of cognitive ability using contemporary methods (e.g., theory-driven test specification; IRT-Rasch scaling; etc.). The presentations are intended to be conceptual and not statistical in nature. Feedback is appreciated.
Transcript
Page 1: Applied Psych Test Design: Part E--Calculate norms and derived scores

The Art and Science of Test Development—Part E

Calculate norms and derived scores

The basic structure and content of this presentation is based extensively on the test development procedures developed by Dr. Richard Woodcock.

Kevin S. McGrew, PhD.

Educational Psychologist

Research Director, Woodcock-Muñoz Foundation

Page 2

Part A: Planning, development frameworks & domain/test specification blueprints

Part B: Test and Item Development

Part C: Use of Rasch Technology

Part D: Develop norm (standardization) plan

Part E: Calculate norms and derived scores

Part F: Psychometric/technical and statistical analysis: Internal

Part G: Psychometric/technical and statistical analysis: External

The Art and Science of Test Development

The above-titled topic is presented as a series of sequential PowerPoint modules. It is strongly recommended that the modules (A-G) be viewed in sequence.

The current module (Part E) is designated by red bold font lettering on the original slide.

Page 3

Norm: A standard or range of values that represents the typical performance of a group or of an individual (of a certain age, for example) against which comparisons can be made

Page 4

How do we construct age-based norms from standardization norm data?

Answer: Curve fitting of sorted subsample data points is the engine that drives the development of all derived scores

Page 5

[Figure: Block Rotation Summary: Final. Rasch graphic item map for the publication test, plotted on the publication test W-score scale (labeled values 546 and 432), with a graphic display of the distribution of Block Rotation person abilities.]

n = 37 norming items (0-74 raw score points)

n = 4,722 norm subjects

These Block Rotation W-scores are then used for developing test “norms” and for validity research.

Page 6

WJ III “classic” norm calculation procedures

[Figure: a row of balls ordered from youngest to oldest, numbered 1, 2, 3, ... 8,000, each labeled with an age and a W-score. Each ball represents an individual norm subject. n = 8,000+ norm subjects.]

1. Sort the 8,000 subjects from youngest to oldest by chronological age (CA) in months.

2. Divide the sorted pool of subjects into successively older blocks of n = 50.

3. Calculate the “weighted” (US Census derived subject weights) median (average) CAMOS (X) and REF W (Y) for each block.

   Mdn CA: x1, x2, ...
   Mdn W: y1, y2, ...

4. Plot the median CAMOS (x1, x2, ...) and REF W (y1, y2, ...) values and smooth the curve.
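The sort, block, and weighted-median steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WJ III production procedure: the tuple layout, the function names, and the synthetic data are all hypothetical, and the "lower" weighted median used here is just one of several reasonable definitions.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout: (age in months, W-score, census-derived weight)
Subject = Tuple[float, float, float]


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the smallest value whose cumulative weight reaches half the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weight for _, weight in pairs) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("need at least one value")


def block_medians(subjects: Sequence[Subject], block_size: int = 50) -> List[Tuple[float, float]]:
    """Sort by age, cut into successive blocks, and return (median age, median W) per block."""
    ordered = sorted(subjects, key=lambda s: s[0])
    points = []
    for start in range(0, len(ordered), block_size):
        block = ordered[start:start + block_size]  # the last block may be smaller
        weights = [s[2] for s in block]
        mdn_age = weighted_median([s[0] for s in block], weights)
        mdn_w = weighted_median([s[1] for s in block], weights)
        points.append((mdn_age, mdn_w))
    return points


# Synthetic, made-up data: 200 subjects aged 24-223 months with equal weights.
subjects = [(float(age), 400.0 + 0.5 * age, 1.0) for age in range(24, 224)]
points = block_medians(subjects)
print(points)  # four (x, y) pairs, one per block of 50
```

Each resulting (x, y) pair is one data point for the plotting and smoothing step.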

Page 7

[Figure: Example: Letter-Word ID Ref W (20-120 months), raw data points. X-axis: age in months (20 to 120); y-axis: Ref W (250 to 550).]

Each data point is a “sample” that contains “sampling error”; this accounts for the “bounce” between data points. How do we deal with this sampling error (bounce) to construct norms and derived scores?

Page 8

[Figure: Letter-Word ID Ref W (20-120 months), polynomial curve solution generated using special curve-fitting software. X-axis: age in months (20 to 120); y-axis: Ref W (250 to 550).]

The smoothed curve represents the best approximation of the population average norm W-score for a test (Reference W or REF W).
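To make the idea of smoothing away "bounce" concrete, here is a minimal least-squares polynomial fit written with the Python standard library only. It is a sketch under assumptions: the slides say only that special curve-fitting software was used, so the quadratic degree, the function names, and the data (a made-up curve with an alternating +/-3 bounce) are illustrative and not the actual procedure.

```python
from typing import List, Sequence


def fit_polynomial(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares fit via the normal equations; coefficients are lowest power first."""
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = []
    for r in range(n):
        row = [sum(x ** (r + c) for x in xs) for c in range(n)]
        row.append(sum(y * x ** r for x, y in zip(xs, ys)))
        rows.append(row)
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = rows[r][n] - sum(rows[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = acc / rows[r][r]
    return coeffs


def evaluate(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Made-up block medians: ages in years, Ref W from a hypothetical curve plus "bounce".
ages = [float(a) for a in range(2, 11)]
true_w = [300.0 + 40.0 * a - 2.0 * a * a for a in ages]
bounce = [3.0 if i % 2 == 0 else -3.0 for i in range(len(ages))]
observed_w = [w + b for w, b in zip(true_w, bounce)]

coeffs = fit_polynomial(ages, observed_w, degree=2)
smoothed = [evaluate(coeffs, a) for a in ages]
print([round(w, 1) for w in smoothed])
```

The fitted values track the underlying trend far more closely than the individual bouncing points do. Real norm curves typically need more flexible functions than a quadratic, and solving the normal equations directly becomes numerically fragile at high polynomial orders.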

Page 9

Obtaining Developmental Scores (age/grade equivalents)

A W-score of 450 (for Letter-Word Identification test) = 2.4 grade equivalent

A W-score of 400 = 1.3 grade equivalent

Smoothed age curves are used in the same manner to obtain age equivalents
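Reading a grade equivalent off a smoothed curve amounts to an inverse lookup: find the grade at which the curve's Reference W equals the obtained W-score. The sketch below assumes the curve is available as a table of (grade, Ref W) points that rises with grade. Only the 400 and 450 points come from the slide; the two end points and the function name are hypothetical.

```python
from typing import List, Tuple


def grade_equivalent(w_score: float, curve: List[Tuple[float, float]]) -> float:
    """Linearly interpolate the grade at which the smoothed Ref W equals w_score.

    curve holds (grade, ref_w) pairs sorted by grade, with ref_w strictly increasing.
    Scores outside the table are clamped to the first or last grade.
    """
    if w_score <= curve[0][1]:
        return curve[0][0]
    for (g0, w0), (g1, w1) in zip(curve, curve[1:]):
        if w0 <= w_score <= w1:
            return g0 + (g1 - g0) * (w_score - w0) / (w1 - w0)
    return curve[-1][0]


# (1.3, 400) and (2.4, 450) are from the slide; the end points are made up.
letter_word_curve = [(1.0, 380.0), (1.3, 400.0), (2.4, 450.0), (3.5, 480.0)]

print(round(grade_equivalent(450.0, letter_word_curve), 2))
print(round(grade_equivalent(425.0, letter_word_curve), 2))
```

In practice the lookup would run against the fitted curve equation or a dense published table rather than four points.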

Page 10

Developing norms and derived scores: What does a tested person’s score on a test mean when compared to the appropriate reference group? (Age norms will be used as the example.)

The meaning of a Block Rotation W-score of X (e.g., 477) will have different interpretations when compared to different age group norm subsamples.

Measures of relative standing (percentile rank, standard score) derive meaning based on how far away the person’s W-score is from average (for age).

[Figure: Block Rotation W-scale marked at 545.7, 477, and 431.6, with groups labeled 2 yr olds, 3 yr olds, 4 yr olds, and 5 yr olds.]

Page 11

Obtaining Measures of Relative Standing: A subject’s W-score for a specific measure is compared to the average W-score for that subject’s specific age (age norms) or grade (grade norms). This average is called the Reference W (REF W).

The expected “average REF W” for someone tested at grade 3.0 (grade norms) is 472.5 (an obtained score of 472.5 would be SS = 100; PR = 50).

Page 12

Obtaining Scores of Relative Standing: A subject’s obtained W-score for a specific measure is compared to the distribution (mean and SD) of W-scores for that subject’s specific age (age norms) or grade (grade norms).

SS (M = 100; SD = 15) = (z × 15) + 100

• e.g., z = -1; SS = 85

“Mean” is the smoothed “Ref W” value for a specific age/grade.

“SD” is the smoothed SD (10/90) for a specific age/grade.
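The standard-score arithmetic can be written out as a short function. Several pieces are assumptions rather than statements from the slides: that z is the obtained W minus the Ref W, divided by the smoothed SD; that SD10 applies to scores below the Ref W and SD90 to scores at or above it; and that percentile ranks come from a normal curve. The SD values in the example are made up.

```python
import math


def standard_score(w: float, ref_w: float, sd10: float, sd90: float) -> float:
    """Convert a W-score to a standard score (M = 100, SD = 15).

    Assumption: the lower-half SD (sd10) is used below the Ref W and the
    upper-half SD (sd90) at or above it.
    """
    sd = sd90 if w >= ref_w else sd10
    z = (w - ref_w) / sd
    return z * 15 + 100


def percentile_rank(ss: float) -> float:
    """Percentile rank under a normal curve (an assumption, not the published method)."""
    z = (ss - 100) / 15
    return 50 * (1 + math.erf(z / math.sqrt(2)))


# Ref W of 472.5 at grade 3.0 is from the slide; the SDs of 12 and 10 are hypothetical.
print(standard_score(472.5, 472.5, sd10=12.0, sd90=10.0))
print(standard_score(460.5, 472.5, sd10=12.0, sd90=10.0))
print(round(percentile_rank(85.0), 1))
```

A score exactly at the Ref W yields SS = 100 and PR = 50, matching the grade 3.0 example, and a score one lower-half SD below it yields SS = 85.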

Page 13

Custom software-generated norm “setup” data file example (Block Rotation)

Input for graphing and polynomial curve fitting

[Table: columns X and Y, shown in two panels; values not reproduced in this transcript.]

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 14

Original Block Rotation Reference W age-based curve fitting: A real-world example of the “art + science” of constructing norms

[Figure: Block Rotation Ref-W by age. X-axis: age in months (ticks at 12, 120, 1200); y-axis: Ref-W (460 to 510). Curve-fitting software header: r^2=0.12670607 Eqn 8160 Line(a,b) Robust None; 7667WLO Eqn 7667 Chebyshev=>Std Rational Order 8/9.]

Solution A: Up to 230 months (note: age scale is a log scale)

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 15

Original Block Rotation Reference W age-based curve fitting: A real-world example of the “art + science” of constructing norms

[Figure: Block Rotation Ref-W by age. X-axis: age in months (ticks at 12, 276, 540, 804, 1068); y-axis: Ref-W (460 to 510). Curve-fitting software header: r^2=0.12670607 Eqn 8160 Line(a,b) Robust None; 6870WHI Eqn 6870 Chebyshev=>Std Polynomial Order 20.]

Solution B: 231 to 1200 months (note: age scale is a regular interval scale, not a log scale)

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 16

Original Block Rotation Reference W age-based curve fitting: A real-world example of the “art + science” of constructing norms

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Curve Solution A is “feathered/blended” with Curve Solution B at 230 months to produce a single final solution. Sometimes more than two curve parts are needed for age norms.
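The slides do not say how the two solutions are feathered together. One common way to join two curves is a linear crossfade over a short window around the junction, sketched below. The window width, both curve functions, and every name here are hypothetical, so treat this as an illustration of the general idea rather than the method actually used.

```python
from typing import Callable

Curve = Callable[[float], float]


def blend_curves(curve_a: Curve, curve_b: Curve, junction: float, half_width: float) -> Curve:
    """Return a curve equal to curve_a below the window, curve_b above it,
    and a linear crossfade of the two inside the window."""
    lo, hi = junction - half_width, junction + half_width

    def blended(x: float) -> float:
        if x <= lo:
            return curve_a(x)
        if x >= hi:
            return curve_b(x)
        t = (x - lo) / (hi - lo)  # 0 at the window start, 1 at the window end
        return (1 - t) * curve_a(x) + t * curve_b(x)

    return blended


# Made-up stand-ins for Solution A (younger ages) and Solution B (older ages).
def solution_a(age_months: float) -> float:
    return 470.0 + 0.1 * age_months


def solution_b(age_months: float) -> float:
    return 494.0 - 0.005 * (age_months - 230.0)


final_curve = blend_curves(solution_a, solution_b, junction=230.0, half_width=12.0)
print(round(final_curve(100.0), 2), round(final_curve(230.0), 2), round(final_curve(600.0), 2))
```

At the junction the two solutions contribute equally, so a small mismatch between them is spread across the window instead of appearing as a step in the norms.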

Page 17

Final smoothed curves serve as the mechanism for the published norms, either in the form of equations in software

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Or,

Page 18

Tables of values for published norms in test manuals

[Table: two panels with columns Age (in months) and Reference W, continuing (“etc.”); values not reproduced in this transcript.]

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 19

Custom software-generated norm “setup” data file example (Block Rotation)

Input for graphing and polynomial curve fitting

[Table: columns X and Y, shown in two panels; values not reproduced in this transcript.]

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 20

Original Block Rotation SD90 age-based curve fitting: A real-world example of the “art + science” of constructing norms

[Figure: Block Rotation SD90 by age. X-axis: age in months (ticks at 12, 120, 1200); y-axis: SD90 (1 to 15).]

Curve-fitting software output:

Rank 2502 Eqn 7938 y=(a+cx^(0.5)+ex+gx^(1.5)+ix^2)/(1+bx^(0.5)+dx+fx^(1.5)+hx^2+jx^(2.5)) [NL]

r^2=0.48235094 DF Adj r^2=0.41998358 FitStdErr=1.6978814 Fstat=8.6968999

a=15.791894 b=0.66619087 c=1.7270779 d=-0.2462822 e=-1.0287721

f=0.02543265 g=0.082451267 h=-0.00095281528 i=-0.0010522608 j=1.5044367e-05

Same is done for SD 10

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 21

Original Block Rotation SD90 age-based curve fitting: A real-world example of the “art + science” of constructing norms

[Figure: Block Rotation SD90 by age. X-axis: age in months (ticks at 12, 276, 540, 804, 1068); y-axis: SD90 (1 to 15).]

Curve-fitting software output:

Rank 2502 Eqn 7938 y=(a+cx^(0.5)+ex+gx^(1.5)+ix^2)/(1+bx^(0.5)+dx+fx^(1.5)+hx^2+jx^(2.5)) [NL]

r^2=0.48235094 DF Adj r^2=0.41998358 FitStdErr=1.6978814 Fstat=8.6968999

a=15.791894 b=0.66619087 c=1.7270779 d=-0.2462822 e=-1.0287721

f=0.02543265 g=0.082451267 h=-0.00095281528 i=-0.0010522608 j=1.5044367e-05

Same is done for SD 10

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 22

Final smoothed curves serve as the mechanism for the published norms, either in the form of equations in software

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Or,

Page 23

Tables of values for published norms in test manuals

[Table: two panels with columns Age (in months) and SD (in W units), continuing (“etc.”); values not reproduced in this transcript.]

Note: These examples are from the original WJ III 2001 norms and not the subsequent WJ III NU (2007) norms.

Page 24

Obtaining Scores of Relative Standing: A subject’s W-score for a specific measure is then compared to the distribution of W-scores for that subject’s specific age (age norms) or grade (grade norms).

[Figure: three smoothed curves labeled Smoothed REF W (average), Smoothed SD90, and Smoothed SD10.]

Note: These are NOT the curves for Block Rotation. They are from another measure and are used here as an example.

Page 25

More is better. The larger the sample, the smaller the sampling error associated with the computed scores.

When calculating norm curves, use medians for each age (or grade) block, not means.

Special test(s)-cluster consistency checks and procedures need to be used to prevent test(s)-cluster score anomalies.

Apply the proposed norms for each measure to the actual norm data as a quality control procedure.

If concurrent validity data are available (correlations with other published and respected measures of similar abilities/constructs), it may be wise to apply the proposed norms to your tests and then compare the resulting derived scores to the external measures via correlations and descriptive statistics (means and SDs).

• This may be particularly informative if you begin to question the variability in your norm sample data (too restricted or too variable).

• You are using other established test batteries as crude “benchmarks.”
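The quality control step of applying proposed norms back to the norm data can be illustrated with a simple check: if the norms fit, the resulting standard scores should have a mean near 100 and an SD near 15. The sketch below makes several assumptions. It uses simulated normal data in place of real norm subjects, one SD instead of separate SD10/SD90 values, and an arbitrary tolerance of one point; the Ref W and SD values are hypothetical.

```python
import random
import statistics


def standard_score(w: float, ref_w: float, sd: float) -> float:
    """Standard score (M = 100, SD = 15) from a W-score, Ref W, and SD in W units."""
    return (w - ref_w) / sd * 15 + 100


def norms_look_reasonable(scores, tol: float = 1.0) -> bool:
    """Flag proposed norms whose standard scores drift from M = 100 or SD = 15."""
    mean_ok = abs(statistics.mean(scores) - 100) <= tol
    sd_ok = abs(statistics.pstdev(scores) - 15) <= tol
    return mean_ok and sd_ok


# Simulated norm subjects for one age block (hypothetical Ref W and SD).
rng = random.Random(0)
REF_W, SD_W = 480.0, 9.0
norm_sample = [rng.gauss(REF_W, SD_W) for _ in range(5000)]

good_scores = [standard_score(w, REF_W, SD_W) for w in norm_sample]
bad_scores = [standard_score(w, REF_W + 5.0, SD_W) for w in norm_sample]  # Ref W set too high

print(norms_look_reasonable(good_scores), norms_look_reasonable(bad_scores))
```

With real data the same check would be repeated within each age or grade block, since a curve can fit well overall and still be off in one region.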

Page 26

Use of bootstrap re-sampling methods in curve fitting

Special proprietary iterative curve fitting Q/A procedures for selecting best possible curve from a pool of plausible curves

Different subject weighting procedures

Calculating other measure norms

• Cluster norms (combinations of tests)

• Differentially weighted cluster norms (e.g., WJ III GIA cluster)

• Discrepancy norms

Special test-cluster consistency checks and procedures

Creating special Rasch (W-score) based interpretative scoring options and features (e.g., RPI, instructional ranges) – explained in separate PPT module

Special test-length correction procedures for calculation of reliabilities and correlations

Linear vs. area (normalization) transformation of scores; the Woodcock combined approach

Page 27

With the publication of the WJ III NU norms, we now use bootstrap-generated “sticks” and not raw single data points.
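The slides do not define how a bootstrap "stick" is computed. One plausible reading is that each age block contributes a range of bootstrap estimates instead of a single median. The sketch below follows that reading by resampling one block with replacement and reporting a percentile interval for its median. The data, the number of resamples, the percentile bounds, and the function name are all hypothetical.

```python
import random
import statistics
from typing import Sequence, Tuple


def bootstrap_median_interval(
    values: Sequence[float],
    n_resamples: int = 1000,
    lower: float = 0.025,
    upper: float = 0.975,
    seed: int = 0,
) -> Tuple[float, float]:
    """Resample with replacement and return a percentile interval for the median."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo = medians[int(lower * (n_resamples - 1))]
    hi = medians[int(upper * (n_resamples - 1))]
    return lo, hi


# Made-up W-scores for one block of 50 norm subjects.
block_w = [470.0 + (i * 7 % 23) for i in range(50)]

stick = bootstrap_median_interval(block_w)
print(statistics.median(block_w), stick)
```

Plotting one such interval per age block, instead of one dot, shows how much each point could move from sampling error alone before any curve is fitted.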

Page 28
Page 29

WJ III NU bootstrapping: If you really want to know, check out ASB9

Page 30

WJ III NU bootstrapping: If you really want to know, check out ASB9

Page 31

WJ III NU bootstrapping: If you really want to know, check out ASB9

Page 32

End of Part E

Additional steps in the test development process will be presented in subsequent modules as they are developed.