Adam Scheller, Ph.D. Senior Educational Consultant
Pearson
Copyright 2013. Pearson Education and its Affiliates. All rights reserved
Disclosures
• Dr. Scheller is an employee of Pearson, publisher of the CELF-5. No other language assessments will be presented in this presentation.
Agenda
1. Introduction
   1. Disclosure
   2. Agenda
   3. Learning Objectives
2. Research Overview
   1. Standardization
   2. Reliability
   3. Validity
3. Summary/Questions
Learning Outcomes
1. Name at least one study conducted to evaluate the reliability of the CELF-5.
2. Describe two procedures used to evaluate test score and index score differences to determine if the differences are significant.
3. Describe the average difference of CELF-4 and CELF-5 scores in a study conducted with typically developing students.
4. Identify the sensitivity/specificity on a chart using the cut score used in the clinicians’ place of employment.
5. Name the optimal cut score on CELF-5 that provides the best balance between the sensitivity and specificity measures.
Research Overview
• Multiple research phases
• Over 4000 students tested in standardization and related reliability and validity studies
• Students tested from March through December 2012
• Over 450 SLPs across the U.S. participated in standardization testing
Multiple Bias Studies
• Multiple phases of objective and subjective reviews of administration directions, cues, test items, and test formats
– Assessment/bias experts examined test items for potential bias related to
  • Socioeconomic status
  • Race/Ethnicity
  • Gender
  • Culture
  • Region
– Clinicians in the field provided feedback about students’ responses and engagement in test tasks
– Statistical analysis of bias verified or refuted subjective bias concerns
A Diverse Standardization Sample: Parent Education Level
Parent Education Level                    N       %
Less than 11 years                        292     12.3%
H.S. Diploma or GED                       544     22.9%
1-3 Years College or Technical School     817     34.3%
4 or more Years of College                727     30.6%
Total Sample                              2380    100%
Other Demographic Variables
Gender          N       %
Female          1190    50%
Male            1190    50%
Total Sample    2380    100%
Reliability 101
• How confident are you in the accuracy of a test score?
• Reliability = accuracy, consistency and stability of test scores across situations
• Observed Score = True Score + Error
– Errors are systematic and random
Internal Consistency & Test-retest
• Internal consistency: estimates how consistently the items of the test measure one construct (homogeneity)
– Split-half method (Spearman-Brown correction): correlation between the total scores of two half-tests
• Test-retest stability: correlation between test and retest scores.
– Time interval between the test and retest is as short as possible.
Test Reliability Coefficients
CELF-5 Test                        Average Reliability Coefficient (across target ages)
Sentence Comprehension             .87 Good
Linguistic Concepts                .91 Excellent
Word Structure                     .89 Good
Word Classes                       .90 Excellent
Following Directions               .91 Excellent
Formulated Sentences               .86 Good
Recalling Sentences                .94 Excellent
Understanding Spoken Paragraphs    .85 Good
Word Definitions                   .89 Good
Sentence Assembly                  .93 Excellent
Semantic Relationships             .89 Good
Pragmatics Profile                 .98 Excellent
Reading Comprehension              .87 Good
Structured Writing                 .75 Acceptable
Clinical Group
Test                               Language Disorder    Learning Disability           Autism Spectrum      Average rxx
                                   (n=166)              (Reading & Writing) (n=69)    Disorder (n=66)
Understanding Spoken Paragraphs    .81 Good             .75 Acceptable                .91 Excellent        .84 Good
Word Definitions                   .87 Good             .91 Excellent                 .95 Excellent        .92 Excellent
Standard Error of Measurement (SEM) & Confidence Interval
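As a rough illustration of how these two quantities relate, the sketch below uses the standard classical test theory formula SEM = SD × √(1 − rxx) and a simple interval centered on the observed score. The inputs are assumptions chosen for easy arithmetic (a standard-score metric with SD 15, a hypothetical reliability of .96, and z = 1.645 for a 90% interval); they are not values taken from the CELF-5 manual, and published tables may use a different interval method.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - rxx)."""
    return sd * sqrt(1.0 - reliability)

def confidence_interval(observed_score, sem, z=1.645):
    """Interval centered on the observed score; z = 1.645 gives roughly 90%."""
    margin = z * sem
    return observed_score - margin, observed_score + margin

# Hypothetical inputs, for illustration only.
sem = standard_error_of_measurement(sd=15, reliability=0.96)
low, high = confidence_interval(85, sem)
print(f"SEM = {sem:.2f}, 90% CI = {low:.1f} to {high:.1f}")
```

The higher the reliability coefficient, the smaller the SEM and the narrower the band around the observed score.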
Standard Error of Measurement (SEM) & Confidence Interval (cont.)
Test-retest Stability
Test-retest stability (n = 137)
Test-retest interval: 7-46 days
Mean: 19 days
Score Differences
• Interpretation of performance:
1. Examine whether the difference is statistically significant
• Reflection of Standard Error
2. Examine whether the difference is clinically significant (rare)
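The first step can be sketched with a common textbook approach: compare the observed difference with a critical value built from the two scores' standard errors of measurement. This is an assumption about the general method, not a reproduction of the CELF-5 critical-value tables; the function names and the example SEM values are hypothetical. The second step (how rare a difference is) depends on base rates from the standardization sample and is not computed here.

```python
from math import sqrt

def critical_difference(sem_a, sem_b, z=1.96):
    """Smallest difference treated as statistically significant.

    z = 1.96 corresponds to a two-tailed .05 level.
    """
    return z * sqrt(sem_a ** 2 + sem_b ** 2)

def is_statistically_significant(score_a, score_b, sem_a, sem_b, z=1.96):
    """True when the absolute difference meets or exceeds the critical value."""
    return abs(score_a - score_b) >= critical_difference(sem_a, sem_b, z)

# Hypothetical SEMs of 3 and 4 points give a critical difference of about 9.8.
print(critical_difference(3, 4))
print(is_statistically_significant(100, 88, 3, 4))
print(is_statistically_significant(100, 95, 3, 4))
```

A difference can clear this statistical threshold and still be common among typically developing students, which is why the two checks are kept separate.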
Validity 101
• How well can your test results predict the presence of a disorder (or, predict one’s skill)?
• Validity is demonstrated through evidence supporting a test’s interpretations and uses.
– Validity Evidence: the degree to which specific data, research, or theory supports that:
1. A test measures the concepts it’s supposed to measure
2. The test is applicable to its intended population
Evidence Based on Test Content
• Validity evidence related to test content
– Content Relevance: when the content areas being measured are accepted as relating to the proposed construct
– Content Coverage: when the content areas measured by the test are accepted to be an adequate sampling of these areas (Also developmentally appropriate)
• Validity evidence includes: literature review; users’ feedback; and expert review
Intercorrelation Studies
CELF-5 – CELF-4 (cont.)
LD Study (cont.)
Sensitivity and Specificity 101
Sensitivity: The probability that someone who has the “condition” will test positive for it.
Specificity: The probability that someone who does not have the “condition” will test negative.
Errors
False Positive: A student who is falsely identified as having a condition or disorder.
False Negative: A student with a condition or disorder who is not correctly identified by a test (the most serious error).
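These definitions map directly onto the four cells of a classification table. The sketch below computes both measures from counts; the counts in the usage lines are made up for illustration and are not CELF-5 results.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of students WITH the condition who test positive."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of students WITHOUT the condition who test negative."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical counts: 100 students with a disorder, 100 without.
print(sensitivity(true_positives=90, false_negatives=10))
print(specificity(true_negatives=85, false_positives=15))
```

Each false negative lowers sensitivity and each false positive lowers specificity, so moving a cut score typically trades one kind of error for the other.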
CELF-5 Diagnostic Accuracy
Hypothetical using 1.5 SD (77) cutoff with CELF-5: 10,000 Students, 10% Prevalence
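A hypothetical of this kind can be worked through in code. The population size (10,000) and prevalence (10%) come from the slide title; the sensitivity and specificity values below are placeholders chosen for round numbers, not the CELF-5 figures at this cutoff, and the function name is introduced here for illustration.

```python
def expected_outcomes(n_students, prevalence, sensitivity, specificity):
    """Expected classification counts for a screened population."""
    with_disorder = round(n_students * prevalence)
    without_disorder = n_students - with_disorder
    true_pos = round(with_disorder * sensitivity)
    false_neg = with_disorder - true_pos
    true_neg = round(without_disorder * specificity)
    false_pos = without_disorder - true_neg
    return {
        "true_positives": true_pos,
        "false_negatives": false_neg,
        "true_negatives": true_neg,
        "false_positives": false_pos,
        # Share of positive results that are correct identifications.
        "positive_predictive_value": true_pos / (true_pos + false_pos),
    }

# Placeholder accuracy values (0.90 / 0.90); substitute the published ones.
result = expected_outcomes(10_000, 0.10, sensitivity=0.90, specificity=0.90)
for name, value in result.items():
    print(name, value)
```

With these placeholder values, the 900 false positives equal the 900 true positives, so only half of the students who test positive actually have the disorder. That is the effect of a low prevalence rate even when the test itself is accurate.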
Another Example: October is Breast Cancer Awareness Month! (These numbers are approximate and based on a 12% prevalence rate)
Autism Spectrum Disorder
Summary/Questions?
• Reliability is a reflection of error
• Validity helps us determine how we apply test interpretations and make predictions
• Choosing a cutoff has implications for including or not including people in groups.