DRAFT – PLEASE DO NOT CITE OR DISTRIBUTE WITHOUT CONSENT
A “Jarring” Experience?
Exploring how Changes to Standardized Tests Impact Teacher Experience Effects
Mark Chin
Harvard Graduate School of Education
Author Note
Mark Chin, Harvard Graduate School of Education, Harvard University. Correspondence concerning this article should be addressed to Mark Chin, Center for Education Policy Research, 50 Church Street 4th Floor, Cambridge, MA 02138. E-mail: [email protected]
Abstract
Experience has long been used by states and districts to indicate a teacher’s quality. More time in
the classroom theoretically leads to the development of skills and the improved implementation
of instructional practices key to student learning. Empirical evidence links teacher experience to
student test performance. Test familiarity, and, subsequently, veteran teachers’ more effective
tailoring of instruction to the content and format of test items, may also contribute to this
relationship. If the teacher experience effect is in part explained by teacher test-experience, this
could lead to non-persistent student learning, and to misallocation of resources or misguided
personnel decisions. I used administrative data from Kentucky before and after the state switched
standardized tests to test whether test experience does factor into the teacher experience effect. I
found that the teacher experience effect on mathematics attenuated following the change,
supporting this hypothesis, as both novice and more veteran teachers became test-inexperienced.
Keywords: teacher quality, teacher experience, test preparation, standardized tests
A “Jarring” Experience?
Exploring how Changes to Standardized Tests Impact Teacher Experience Effects
Experience has long been used by states and districts to indicate a teacher’s quality. As
such, teacher compensation contracts typically connect pay to years of experience (Schools and
Staffing Survey, 2012). Recent federal policies, however, have pushed policymakers to employ
other metrics to measure teacher effectiveness in their updated evaluation systems. One such
measure includes teacher impacts on students’ standardized test outcomes, or “value-added”
measures. Theory would suggest that students taught by more veteran teachers should
demonstrate higher test score growth. For example, the additional years of experience in the
classroom likely translate to better student outcomes through the development of key
proficiencies or the improved implementation of important instructional practices, such as better
understanding of student learning pathways or improved ability to minimize unproductive
classroom time (e.g., Leinhardt, 1989; Scribner & Akiba, 2010). This hypothesis has largely
played out in empirical analyses; most extant research suggests positive within-teacher returns to
experience, particularly in the earlier years of a teacher's time in the classroom (e.g., Harris & Sass, 2011).
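Equation 1 itself does not survive in this excerpt. Assembling the components defined in the text that follows (with $f(\cdot)$ denoting the functional form of teacher experience and $\varepsilon_{ijkgst}$ an idiosyncratic error), a plausible reconstruction, offered here only as a reading aid, is:

$$y_{ijkgst} = \mu f(EXP_{kt})\,\mathbb{1}[t=2011] + \nu f(EXP_{kt})\,\mathbb{1}[t=2012] + \beta_1 Y_{i,t-1} + \beta_2 D_{it} + \beta_3 P_{jkgst} + \beta_4 C_{gst} + \kappa_{gt} + \eta_s + \varepsilon_{ijkgst} \quad (1)$$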
The outcome in Equation 1, $y_{ijkgst}$, captures the performance of student $i$ in class $j$ taught by teacher $k$ in grade $g$ in school $s$ in academic year $t$ on either the mathematics or ELA test from the KCCT or the K-PREP assessment.¹ This performance (i.e., the student's scaled score) was standardized within grade and year to have a mean of zero and a standard deviation of one. The model controlled for a vector of student baseline ability measures ($Y_{i,t-1}$), including a cubic function of prior test achievement; a vector of the student demographic characteristics described above ($D_{it}$); the aggregate of the two covariate vectors for a student's classroom peers ($P_{jkgst}$); the aggregate of the two covariate vectors for a student's grade-level cohort ($C_{gst}$); and grade-by-year fixed effects ($\kappa_{gt}$).

¹ Exploration into the distributions of scaled scores for students on standardized tests in 2010-11 showed a significant ceiling effect and a minor floor effect. To ensure that my results were not influenced by the loss of information regarding students' actual ability levels at these extremes, I dropped students attaining the highest or lowest possible scale score on all tests from my analyses. Sensitivity checks suggested that inclusion of these students had minor impacts on estimates, with overall patterns remaining the same.
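As a concrete illustration of this outcome construction, here is a minimal sketch of the ceiling/floor trimming (footnote 1) and the grade-by-year standardization. The column names (score_raw, min_score, max_score, grade, year) are hypothetical, not the actual Kentucky data schema.

```python
import pandas as pd

def build_outcome(students: pd.DataFrame) -> pd.DataFrame:
    """Drop students at the test's ceiling or floor, then standardize
    scaled scores within grade-by-year cells to mean 0 and SD 1."""
    # score_raw holds each student's scaled score; min_score and max_score
    # give the lowest and highest attainable scores on that student's test.
    kept = students[(students["score_raw"] > students["min_score"]) &
                    (students["score_raw"] < students["max_score"])].copy()
    cell = kept.groupby(["grade", "year"])["score_raw"]
    kept["score"] = (kept["score_raw"] - cell.transform("mean")) / cell.transform("std")
    return kept
```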
My coefficients of interest in Equation 1 are the effects (modeled using different functional forms described below) of being taught by a teacher $k$ with experience $EXP$ on the outcome in different years, captured by $\mu$ for 2010-11 (i.e., the teacher experience effect on KCCT performance) and by $\nu$ for 2011-12 (i.e., the teacher experience effect on K-PREP performance). If the teacher experience effect on outcomes is largely independent of the specific test, I would not expect the difference between the coefficients $\mu$ and $\nu$ to be statistically significant. Such a result would contradict the theory that the positive returns to teacher experience on outcomes seen in extant literature might be caused by the test familiarity (and, subsequently, ability to teach to the test well) of more veteran teachers. However, if the effect of experience is in part explained by the test familiarity of more veteran teachers, as I hypothesize, I would expect the difference between the two coefficients to be statistically significant. Specifically, I would expect the effect of additional experience to attenuate in 2011-12, as both novice and more veteran teachers would be equally (un)familiar with the new K-PREP assessments.
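To make the comparison concrete, below is a minimal sketch (in Python with statsmodels; the paper reports no code) of estimating year-specific experience effects and testing $\mu = \nu$, using the binned specification of Table 1. The data frame is synthetic and every column name in it is an assumption; the demographic, peer, and cohort aggregates are omitted for brevity.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the student-level analysis file.
rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "exp": rng.integers(0, 11, n),        # teacher years of experience
    "year": rng.choice([2011, 2012], n),  # 2011 = KCCT, 2012 = K-PREP
    "prior": rng.normal(size=n),          # standardized baseline score
    "grade": rng.integers(4, 9, n),
    "school": rng.integers(0, 50, n),
})
# Build in an experience effect that attenuates in 2012, mimicking the
# hypothesized pattern, plus noise.
df["score"] = (0.12 * ((df["exp"] >= 2) & (df["year"] == 2011))
               + 0.05 * ((df["exp"] >= 2) & (df["year"] == 2012))
               + 0.5 * df["prior"] + rng.normal(size=n))

# Experience-by-year indicators; novice (0-year) teachers are the
# omitted category, as in Table 1's binned specification.
for label, cond in [("exp1", df["exp"] == 1), ("exp2p", df["exp"] >= 2)]:
    for yr in (2011, 2012):
        df[f"{label}_{yr}"] = (cond & (df["year"] == yr)).astype(int)

# OLS with a cubic in prior achievement, grade-by-year and school fixed
# effects, and school-clustered standard errors.
fit = smf.ols(
    "score ~ exp1_2011 + exp1_2012 + exp2p_2011 + exp2p_2012"
    " + prior + I(prior**2) + I(prior**3)"
    " + C(grade):C(year) + C(school)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["school"]})

# H0: mu = nu for veteran teachers; a significant positive difference
# (attenuation in 2012) would support the test-familiarity hypothesis.
print(fit.t_test("exp2p_2011 - exp2p_2012 = 0"))
```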
Key to my hypothesis is that the KCCT and the K-PREP assessments are sufficiently
different from one another as to cause a “drop” in test familiarity for more veteran teachers
following the standardized test change. Exploration into the alignment of items on the KCCT to
the CCSS (to which the K-PREP was aligned) suggested that, though items on the old
standardized tests did assess many of the newly adopted standards, gaps still existed in terms of the
content and the depth of knowledge assessed (Taylor, Thacker, Koger, Koger, & Dickinson,
2010). Anecdotal evidence also indicated that the K-PREP was more difficult than the KCCT, which
was supported empirically by observations that far fewer students scored the highest possible
scaled score on the new exams.
Another key to my hypothesis is that evidence exists documenting the implementation of
narrow, test-specific instruction by teachers during the time period that the KCCT was
administered. Though anecdotal reports provide such evidence, quantitative analyses relating
year-to-year changes in school-level test performance to school-level averages of FRPL (free or
reduced-price lunch eligibility) in Kentucky have also supported this notion. Specifically, this
relationship, insignificant before the switch to the K-PREP, is negative and significant (i.e.,
schools with higher proportions of disadvantaged students worsen in their average performance
across years) in 2011-12 (Dickinson, Levinson, & Thacker, 2013). Though several reasons might
explain this observed relationship, extant research has found a higher incidence of narrow, test-specific instructional practices in schools with larger proportions of disadvantaged students (e.g., Herman & Golan, 1993). Thus, it stands to reason that such schools would be "hurt" most in terms of
student achievement by the switch from the KCCT to the K-PREP assessments. These findings
informed supplemental analyses (described below) exploring how differences between the
teacher experience effect in 2010-11 and the teacher experience effect in 2011-12 varied across
different types of schools.
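For concreteness, here is a rough sketch of the school-level check described above, extending the synthetic data frame from the earlier sketch with a hypothetical FRPL indicator; this illustrates the shape of the comparison, not the Dickinson et al. (2013) analysis itself.

```python
# Hypothetical binary FRPL flag appended to the synthetic df above.
df["frpl"] = rng.integers(0, 2, n)

# Year-over-year change in each school's mean score vs. its FRPL share;
# the cited finding is a negative relationship emerging in 2011-12.
by_year = df.groupby(["school", "year"])["score"].mean().unstack("year")
school = pd.DataFrame({
    "change": by_year[2012] - by_year[2011],
    "frpl_share": df.groupby("school")["frpl"].mean(),
})
print(school.corr())
```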
A notable inclusion in the model represented in Equation 1 is the control for school fixed
effects ($\eta_s$). Though the appropriate specification for modeling teacher effects on student
achievement is still being debated (see Goldhaber & Theobald, 2012), I opt to include school
fixed effects, as other researchers investigating returns to teacher experience have done in the
past (e.g., Papay & Kraft, 2015), and because prior research has provided evidence for the
systematic sorting of teachers—in particular, inexperienced ones—to certain types of schools
(see Rice, 2013). Further, other literature has documented heterogeneous effects of experience
Figure 1. Distribution of teacher experience in 2011 and 2012 in Kentucky.
Figure 2. Plotted regression coefficients of teachers’ experience on student mathematics achievement growth. Effect of teachers with more than 10 years of experience collapsed into the 10-year category.
[Figure 2 image: line plot of student achievement (y-axis, 0 to .2) against years of experience (x-axis, 0 to 10), with separate series for 2011 and 2012.]
Table 1. Regression coefficients for teachers' experience on students' mathematics achievement growth

                                     (1)         (2)         (3)         (4)         (5)
1-year Exp. Effect in 2011        0.0856***   0.0739**    0.119**     0.0840***   0.0881***
                                 (0.0227)    (0.0351)    (0.0491)    (0.0255)    (0.0234)
1-year Exp. Effect in 2012       -0.000827    0.0282      0.0231      0.00621    -0.00163
                                 (0.0251)    (0.0403)    (0.0584)    (0.0298)    (0.0252)
2-plus-years Exp. Effect in 2011  0.120***    0.0863***   0.111***    0.120***    0.120***
                                 (0.0170)    (0.0190)    (0.0332)    (0.0195)    (0.0174)
2-plus-years Exp. Effect in 2012  0.0495**    0.0609**    0.0159      0.0445**    0.0514**
                                 (0.0208)    (0.0287)    (0.0482)    (0.0218)    (0.0207)

Controls:
  Student Demographics                x           x           x           x           x
  Prior Achievement                   x           x           x           x           x
  Grade-by-year Fixed Effects         x           x           x           x           x
  Cohort Aggregates                   x           x           x           x           x
  Peer Aggregates                     x           x           x           x           x
  School Fixed Effects                x           x           x           x           x

Note: School-level clustered standard errors in parentheses. The Selection Sample includes only students taught by teachers who teach in both 2011 and 2012 (or are novices in 2012). *** p<0.01, ** p<0.05, * p<0.1.
Figure 3. Plotted regression coefficients of teachers’ experience on student ELA achievement growth. Effect of teachers with more than 10 years of experience collapsed into the 10-year category.
[Figure 3 image: line plot of student achievement (y-axis, 0 to .2) against years of experience (x-axis, 0 to 10), with separate series for 2011 and 2012.]
Table 2. Regression coefficients for teachers' experience on students' ELA achievement growth

                                     (1)
1-year Exp. Effect in 2011        0.0473*
                                 (0.0248)
1-year Exp. Effect in 2012        0.0308*
                                 (0.0173)
2-year Exp. Effect in 2011        0.0152
                                 (0.0238)
2-year Exp. Effect in 2012        0.0319*
                                 (0.0179)
3-year Exp. Effect in 2011        0.0208
                                 (0.0230)
3-year Exp. Effect in 2012        0.0104
                                 (0.0180)
4-plus-years Exp. Effect in 2011  0.0525***
                                 (0.0164)
4-plus-years Exp. Effect in 2012  0.0271**
                                 (0.0134)

Controls:
  Student Demographics                x
  Prior Achievement                   x
  Grade-by-year Fixed Effects         x
  Cohort Aggregates                   x
  Peer Aggregates                     x
  School Fixed Effects                x