Leadership Effects: School Principals and Student Outcomes∗

PRELIMINARY: DO NOT CITE

Michael Coelli† and David Green‡

12 October 2009

Abstract

We identify the effect of individual high school principals on high school graduation

rates and grade 12 English exam scores using an administrative data set of grade 12 students

in BC Canada. Many principals were rotated across schools by school districts, permitting

the isolation of the effect of school principals from the effect of schools. A lower bound

estimate of the variance of the idiosyncratic effect of principals on student outcomes using

only within school variation in outcomes is constructed. We also estimate a dynamic model

of unobserved principal effects to allow for changing influences of principals over time.

Keywords: school principals, student outcomes, graduation, test scores, leadership

JEL codes: I20, I21

∗ The statistical analysis presented in this document was produced from administrative micro-data provided by the British Columbia Ministry of Education. The interpretation and opinions expressed are our own and do not represent those of the BC Ministry of Education. We are indebted to David Harris for assistance and advice. Thanks also to seminar participants at the University of British Columbia, University of Melbourne and Monash University.

† Department of Economics, The University of Melbourne

‡ Department of Economics, University of British Columbia

1 Introduction

There remains considerable debate on whether schools can actually improve student outcomes,

despite the large volume of research on the topic. The exchange between Krueger (2003) and

Hanushek (2003) highlights the extent of this debate. Much of this research on schools was

conducted in response to the Coleman Report (1966) on schooling in the United States, which

suggested that schools did “not matter”. The Report’s authors claimed that measurable school

inputs, such as spending per student, pupil-teacher ratios, and the education and experience of

teachers, had no significant effect on student outcomes, once student family background and peer

effects were controlled for. Although the subsequent research has been considerable, the debate

on whether schools matter has yet to be resolved. Many conflicting pieces of empirical evidence

have been found.

Education professionals widely agree that teacher quality is an extremely important deter-

minant of student achievement. The effect of individual teachers on student outcomes may not,

however, be only related to the easily measurable qualities of teachers, such as education levels

and experience. As a result, research may have missed identifying the full effect of teachers

on achievement by only searching on those observable dimensions. Recently, a number of re-

searchers have attempted to identify the idiosyncratic effect of individual teachers (teacher “fixed

effects”) on student outcomes. This research generally finds that there is significant variability

in teacher quality. This research includes Uribe et al. (2003), Rockoff (2004), Nye et al. (2004),

Rivkin et al. (2005), Kane et al. (2006), Aaronson et al. (2007), Leigh (2007) and Koedel (2008).

In this paper, we expand on this line of research by identifying the idiosyncratic effect of school

principals (individual school principal fixed effects) on student outcomes.

We estimate the effect of individual school principals on the high school graduation prob-

abilities and grade 12 provincial final exam scores in English of students employing a unique

administrative data set from the Canadian province of British Columbia. Many high school prin-

cipals were rotated across schools during this period by school districts. This rotation permits

us to isolate the effect of school principals on student graduation probabilities from the effect of

individual school and neighbourhood characteristics. We use administrative information on all

students entering grade 12 in British Columbia over the 1995 to 2004 period.

The particular student outcomes we focus on in our analysis are high school graduation

and provincial grade 12 exam scores in English. Encouraging youth to complete high school

confers benefits to the individual, and potentially to society also. High school completion has

been related to better employment outcomes of individuals, higher worker productivity, lower

reliance on welfare payments, reduced probabilities of individuals committing crime, and higher

involvement in democratic institutions.

School principals can affect student outcomes in many ways. As school leaders, they have

considerable influence on many aspects of the school, including teacher supervision and retention,

the introduction of new curricula and teaching techniques, student discipline, and the allocation of

students to teachers and classes. Certain principals may motivate both teachers and students more

effectively, and may create a school environment, better suited to the particular student body, that

promotes staying in school and studying effectively for exams. See Appendix A for a list of

duties and responsibilities of school principals in the jurisdiction we study.

Education scholars have highlighted several school conditions through which school leader-

ship may influence student outcomes: (1) purposes and goals, (2) structure and social networks,

(3) people, and (4) organizational culture (Hallinger and Heck, 1998). Much of this Education

research was conducted within the wider agenda attempting to identify the particular attributes of

high achieving or effective schools. Hallinger and Heck (1998) reviewed a large set of studies of

principal leadership from the Education literature, and concluded that the evidence on the effect

of principals on student outcomes is mixed. This line of research mostly employed survey data

of individual teacher perceptions of various components of leadership and school conditions.

For example, teachers may be asked to rate how strongly they agree with a statement such as

“Our school administrators have a strong presence in the school” to measure school leadership.

They may be asked to rate their agreement with the statement “I easily understand our school’s

mission/outcomes statement(s)” to measure school purposes and goals (Leithwood and Jantzi,

1999). A significant concern with this type of research is that survey measures of perception

may be endogenous or suffer from considerable mis-measurement, biasing any results.

Our analysis does not rely on measures of perception. Instead, we employ turnover of school

principals within schools over time to identify their effects on student outcomes, purged of any

fixed school, neighbourhood and stable peer group effects. This strategy comes at the expense of

not being able to identify the particular pathways by which principals affect student outcomes,

nor the strategies of principals that are effective.

Our empirical strategy has two main components. First, we employ a semi-parametric tech-

nique to identify a lower bound estimate of the variance in the quality of individual school princi-

pals. Only within school variation in school principal quality is employed in identification, thus

removing any fixed school or peer group effects. This technique is adapted from that employed

by Rivkin et al. (2005). Unlike those authors, however, we cannot remove individual student

fixed effects in this study, as we only observe individual students once each. Identification here

thus requires that principals were not rotated across schools in response to changes in the qual-

ity of students, and that students do not sort themselves across schools in response to principal

quality changes. The technique does allow for principal allocation based on long-run average

student quality in the school, however, just not changes in student quality over time. This tech-

nique assumes that individual principals have a fixed and immediate effect on the schools that

they lead.

The second component of the empirical strategy estimates a simple dynamic model of the

effect of school principals on student outcomes. This strategy allows for a potentially cumulative

effect of school principals on schools over time. Principals may take a certain number of years to

have a measurable influence on student outcomes, as the strategies of the principal may take some

time to have their full effect on a school. In both the semi-parametric and dynamic strategies, we

estimate the effect of individual school principals on student outcomes after first controlling for

aggregate time effects and a number of individual student and neighbourhood characteristics.

The main contribution of this analysis over the previous economic literature is the ability to

measure the effect of individual school principals on the outcomes of students. The idiosyncratic

(unobservable) quality of school principals is one input into the education process that, to our

knowledge, has not previously been investigated.1 Our results suggest that there is sizable het-

erogeneity in school principal quality, as observed in high school graduation rates and grade 12

provincial English exam scores.

The outline of this paper is as follows. The empirical models are outlined in Section 2. The

administrative data we employ is described in Section 3. Semi-parametric lower bound estimates

of the variance of principal quality are provided in Section 4. Estimates from our dynamic model

of school principal effects are presented in Section 5. Section 6 concludes.

2 Empirical Models of Student Outcomes

2.1 General Model

In the empirical analysis below, we estimate the effect of school principals on two main student

outcomes: graduation from high school and performance on grade 12 English Exams. Consider

first the outcome of high school graduation. The outcome of performance on grade 12 Exam

Scores will be modeled in a similar manner. An individual will graduate from high school if she

or he both stays in school and surpasses some threshold level of achievement. Both outcomes

are the result of choices made by the individual, such as the effort expended on studying while

in school. The individual will weigh the expected benefits of graduation - the increased lifetime

1 We employed school principals as instruments for high school graduation in a related study on the effect of high school graduation on subsequent welfare use for youth from welfare backgrounds. See Coelli, Green and Warburton (2007) for details.

flow of utility from graduation - against the costs, including the time spent in school.

The net benefit of graduating from high school, denoted by the unobserved latent variable $I^*_{ist}$,

can be represented by equation 1. This net benefit calculation will include the effort each youth

must expend to graduate, given their prior academic preparation and innate academic ability.

Individuals with better prior preparation and ability will not need to expend as much effort as

others to graduate.

$$I^*_{ist} = F_{ist}\pi_1 + N_{ist}\pi_2 + S_{st}\pi_3 + A_{ist}\pi_4 + P_{(-i)st}\pi_5 + G_{(-i)st}\pi_6 + \varepsilon_{ist} \tag{1}$$

Individual $i$ in cohort $t$ is observed to graduate from high school $s$ (represented by $G_{ist} = 1$)

if $I^*_{ist} > 0$, i.e. if the net benefit of graduating is positive. An individual is observed to drop out

($G_{ist} = 0$) if $I^*_{ist} \leq 0$.

The matrices $F$, $N$ and $S$ denote family inputs, neighbourhood influences and school inputs

respectively. Prior academic preparation and ability may be proxied by measured achievement

via the term $A_{ist}$. Here $G_{(-i)st}$ denotes the percentage of an individual’s classmates who graduate

high school. The subscript $(-i)$ denotes that averages are calculated over all students in school $s$ and

cohort $t$ except student $i$. This term captures purely endogenous peer effects in the graduation

decision. The perceived benefits of graduation may depend directly on the number of one’s peers

who choose to graduate. This is in addition to any impact that the background or predetermined

characteristics of one’s peers may have on the net benefit. The predetermined characteristics of

peers enter the net benefit calculation in the term $P_{(-i)st}$. These exogenous (contextual) characteristics,

including family background, are not affected by the current endogenous behaviour of

students.

School peers can influence outcomes of individual youth in several ways. In terms of school

achievement, peers can be a source of motivation and competition, they may learn more easily

so the class can do more in a given amount of time, they may be less disruptive, and so on. Peers

may also influence an individual’s expectations of the benefits of high school graduation, by

being a source of information regarding the benefits of school success. They may also directly

raise the utility from graduation via “belonging” to the peer group. The impact of peers on

education outcomes has received considerable attention in the economics literature, although

solid empirical evidence of peer effects is not prevalent.

Neighbourhoods $N$ may have their own separate influence on the graduation probabilities of

youth. The expected benefits of graduation, in terms of better job and post-secondary education

prospects, and in terms of expected higher utility, will depend on surroundings. If more role

models exist in the local community who have obviously benefited from further education, the

more likely the individual will decide to graduate.

If the current behaviour of peers does affect an individual’s net benefit of graduation (the

parameter $\pi_6$ is non-zero), there will be a reciprocal impact of one’s own graduation $G_{ist}$ on the

achievement of others in the school and grade, affecting $G_{(-i)st}$ directly. This reciprocal effect

then induces a correlation between $G_{(-i)st}$ and the error term $\varepsilon_{ist}$. This correlation leads to a

bias in regression estimates of equation 1 if measures such as $G_{(-i)st}$ are included directly. This

potential bias is commonly known as the reflection problem in estimating social interactions such

as peer effects (Manski, 1993; Brock and Durlauf, 2001).

Taking expectations of equation 1 conditional on the predetermined variables, one can solve

for the endogenous peer group measure $G_{(-i)st}$. A reduced-form version of equation 1 can then

be written as follows.

$$I^*_{ist} = F_{ist}\phi_1 + N_{ist}\phi_2 + S_{st}\phi_3 + A_{ist}\phi_4 + P_{(-i)st}\phi_5 + \eta_{ist} \tag{2}$$

Here the $\phi$ parameters are combinations of the $\pi$ parameters in equation 1. To recover the

underlying structural parameters (the $\pi$’s) requires strong identifying assumptions. The objective

of this analysis is to identify the overall effect of school principals on student outcomes. Sepa-

rating out the direct effect of school principals on graduation from any potential indirect effect

of school principals on the individual’s probability of graduation via peer effects is not crucial.

If students can move or self-select to schools where certain principals are in charge, or prin-

cipals can move to schools where student quality is changing, it is important to control for all

individual attributes that affect student outcomes, including measures of prior academic preparation

and ability. Including prior measures of high school achievement $A_{ist}$ (such as grade 11 test

scores, which are available) in our estimation would, however, come at a cost. School principals

may have an effect on these prior student achievement levels if they were present at the school

in the prior year. By including these measures, we are removing one avenue by which princi-

pals may affect student outcomes. In light of this, we construct estimates of the effect of school

principals on graduation rates without including these grade 11 prior achievement measures.2

Unfortunately, there are no measures of innate ability that were taken prior to students entering

high school in the data.

Unbiased estimation of equation 2 requires that there are no missing measures of school

inputs. Such missing measures are correlated across students in the same school and grade,

which will bias any estimates of peer group effects in particular. This is called the “correlated

effects” problem identified by Manski (1993) and others. Such missing measures may also bias

estimates of the effect of school principals here, which is the direct concern of this analysis.

Including school fixed effects in the regression estimation is necessary to control for any non-

time varying school characteristics. This identification strategy then attributes changes over

time in individual school characteristics to the school principal, if there is variation in school

principals in a particular school.

This analysis is closely related to the literature on Education Production Functions (EPFs).

Under this approach, schooling outcomes such as exam scores are a function of various family,

school and peer group inputs. A similar model can be formed with an exam score as the depen-

dent variable, and the same regressors as in equation 1 above included. See Appendix B for

2 There are also many missing values in the grade 11 exam score data, making estimation including such measures more difficult.

a description of this approach. We are essentially relying on such a model when we estimate

principal effects on grade 12 English Exam Scores below.

2.2 Model with Fixed Principal Effects

Here we follow the general procedure used by Rivkin et al. (2005) in their study of fixed teacher

effects, with some notable differences to deal with our specific problem of estimating principal

effects. The objective is to set up a model where we directly estimate a parameter governing

the variance in school principal fixed effects. If this parameter is estimated to be significantly

greater than zero, it implies that school principals do affect student outcomes, and that there is

significant variability in school principal quality.

To develop the estimating equation, we begin by assuming the following linear equation for

the school graduation probability of an individual youth. Instead of positing individual inputs

into the graduation decision equation, we focus on the total systematic effect of schools, school

principals and individuals on graduation. Note that in the estimation to follow, we will control

for individual, peer and neighbourhood characteristics as far as we can with the available data in

first stage regressions, as described below.

$$G_{ist} = \delta_s + \theta_{st} + \gamma_{ist} + \upsilon_{ist} \tag{3}$$

The probability that an individual $i$ in cohort $t$ of school $s$ graduates from high school is

written as a linear function of school (δ), principal (θ) and individual (γ) effects, plus a random

error term (υ), assumed independent of the other three components in this equation. The student

effect is the composite of all individual and family factors that affect graduation, such as parental

education and permanent income, individual student ability, etcetera. The fixed school effect is

the composite of all stable elements of schools that affect graduation, such as resources, location,

school district hiring practices, infrastructure, etcetera. The principal component captures the

impact of the particular principal leading the school. The random error term is a composite of all

residual factors affecting graduation, for example, a shock to the individual’s home environment,

or a random shock to the school environment.

In equation 3, the variance of $\theta$ will measure the variance in the effect of school principals

on the student outcome. We use information on principal turnover and student outcomes by

school and cohort to generate a lower bound estimate of the variance in principal effects. After

constructing the mean student outcome in a school and cohort, equation 4 represents this mean

as a linear function of fixed school and principal effects, the average student quality in the cohort

and school, and the school-cohort average random error.

$$G_{st} = \delta_s + \theta_{st} + \gamma_{st} + \upsilon_{st} \tag{4}$$

The school principal may vary across cohorts in the school, but all students in the same cohort

and school (subscripted by $st$) will have the same principal effect ($\theta_{st}$). We can then compare

the mean outcome of an individual student cohort to the average outcome of the school over our

observation period.

$$(G_{st} - G_s) = (\theta_{st} - \theta_s) + (\gamma_{st} - \gamma_s) + (\upsilon_{st} - \upsilon_s) \tag{5}$$

In equation 5, all fixed school effects ($\delta_s$) have been removed by subtracting school mean

outcomes. Thus deviations in the mean outcome of a particular cohort from the school mean are

a function of deviations in principal effects from the school average principal effect, deviations

in average student quality from the school average student quality, and a remaining composite

random error component.3
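The within-school demeaning behind equation 5 can be sketched in a few lines. This is only an illustration: the graduation rates below are made-up numbers, not values from the administrative data, and the array names are hypothetical.

```python
import numpy as np

# Hypothetical cohort-mean graduation rates: rows are schools, columns are cohorts.
G = np.array([
    [0.80, 0.82, 0.78, 0.84],
    [0.60, 0.65, 0.70, 0.65],
])

# Subtract each school's own mean over the period (left-hand side of equation 5).
school_mean = G.mean(axis=1, keepdims=True)
deviations = G - school_mean

# Any additive school-level constant (a fixed school effect) drops out of the deviations.
shifted = G + np.array([[0.05], [-0.10]])
deviations_shifted = shifted - shifted.mean(axis=1, keepdims=True)

# Average squared deviation within each school: the sample analogue of the
# within-school variance in outcomes used later as the regressand.
within_var = (deviations ** 2).mean(axis=1)
print(within_var)
```

Adding a different constant to each school leaves `deviations` unchanged, which is the sense in which fixed school effects are removed.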

Squaring both sides of equation 5 yields equation 6.

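$$(G_{st} - G_s)^2 = (\theta_{st} - \theta_s)^2 + (\gamma_{st} - \gamma_s)^2 + 2(\gamma_{st}\theta_{st} + \gamma_s\theta_s - \gamma_s\theta_{st} - \gamma_{st}\theta_s) + e_{st} \tag{6}$$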

Equation 6 characterizes the squared deviations in mean student outcomes as a sum of terms

denoting the within school variance in fixed principal effects, within school variation in average

3 This final component will capture time-varying school factors uncorrelated with school principal changes.

student quality, the covariance between average student quality deviations and principal quality

deviations within a school, and a final component denoted $e$. This final component $e$ encompasses

the random error’s variance and cross product terms between the random error $\upsilon$ and

principal and student quality deviations.
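For readers who want the intermediate step, the terms of equation 6 can be written out as follows. This is a sketch that follows the verbal description above; the explicit expression for $e_{st}$ is our reading of that description and is not stated in the text.

```latex
\begin{align*}
2(\theta_{st}-\theta_s)(\gamma_{st}-\gamma_s)
  &= 2\left(\gamma_{st}\theta_{st} + \gamma_s\theta_s
            - \gamma_s\theta_{st} - \gamma_{st}\theta_s\right), \\
e_{st}
  &= (\upsilon_{st}-\upsilon_s)^2
     + 2(\upsilon_{st}-\upsilon_s)
       \left[(\theta_{st}-\theta_s) + (\gamma_{st}-\gamma_s)\right].
\end{align*}
```

The first line is the covariance-type term between student quality and principal quality deviations; the second collects everything that involves the random error.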

To identify the variance of school principal effects, we make the following assumption. Devi-

ations in the cohort average quality of students within a school are not correlated with deviations

in individual principal effects within that school. This assumption does not imply that the overall

average quality of students in a school must be unrelated to the average quality of principals in a

school. It merely requires that changes in student quality are not related to changes in principal

quality within a school. If we think that there is positive self-selection of students over time

(good students move to schools where good principals move to) or positive movement of princi-

pals across schools over time (good principals move to schools with improving student quality),

it will bias up our estimate of the variance of principal effects.

We calculate the arithmetic average of these squared deviations in student outcomes within

each school to form a measure of the within school variance in student outcomes. The average is

taken over $T$, the total number of years we observe the school in our sample. Taking expectations

of this measure of the within school variance in student outcomes, employing equation 6 and

imposing the assumption stated above, yields equation 7.

$$E\left[\frac{1}{T}\sum_{t=1}^{T}(G_{st} - G_s)^2\right] = E\left[\frac{1}{T}\sum_{t=1}^{T}(\theta_{st} - \theta_s)^2\right] + \sigma^2_{\gamma_s} + E[e_s] \tag{7}$$

We observe the same principal in a school for more than one cohort. To reflect this, we use

the subscript $j$ to indicate the particular principal in school $s$ when cohort $t$ entered grade 12.

Thus principal $j$ has a principal effect of $\theta_j$. Equation 8 defines the expectation of our term of

interest, the term capturing the within school variance of school principal effects. The term $\sigma^2_{\theta_s}$

denotes the within school variance of principal effects, where $\sigma^2_{\theta_s} = E[\theta_j^2]$. We assume that each

principal is an independent random draw from a common distribution, such that $E[\theta_j \theta_k] = 0$

where $j \neq k$.

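$$E\left[\frac{1}{T}\sum_{t=1}^{T}(\theta_{st} - \theta_s)^2\right] = \sigma^2_{\theta_s}\left[\frac{1}{n}\sum_{j=1}^{J} q_j\left[1 + \frac{1}{n^2}\sum_{k=1}^{J} q_k^2 - \frac{2}{n} q_j\right]\right] \tag{8}$$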

The term on the right hand side of equation 8 after $\sigma^2_{\theta_s}$ is a deterministic number denoting the

amount of school principal turnover within one school $s$. The number $q_j$ denotes the number of

years that principal $j$ is in the school, while $J$ denotes the total number of different principals in

the school over the period.

While this turnover term looks quite complicated, it collapses to easily understood numbers

in most cases. If the same principal leads the school for the entire sample period, this term will

equal zero. The variation in fixed principal effects in a school that has the same principal over the

period must be zero. If there are multiple principals in charge over the period, the turnover term

will be positive and increasing in the number of principals (the amount of principal turnover). For

example, if there are two principals in the school over the period, each one for the same number

of years, then the turnover term equals one half. If there are three principals for equal numbers

of years, the term equals two thirds. The intuition behind this equation is that, if principals affect

student outcomes and there is variability in quality across school principals, the within school

variance in student outcomes should be higher in schools with more principals. This variance

should be increasing in the amount of principal turnover. Details of the construction of this

turnover term, including a simple example, are provided in Appendix C.
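The turnover term in equation 8 is easy to compute directly, and doing so reproduces the values quoted above (zero, one half, two thirds). The sketch below assumes that $n$ is the total number of years the school is observed, so that $n$ equals the sum of the $q_j$; the function names are ours, not the authors'.

```python
from fractions import Fraction


def turnover_term(tenures):
    """Turnover term from equation 8 for one school.

    tenures: list of q_j, the number of years each principal led the school.
    Assumption: n is the total number of observed years, n = sum of the q_j.
    """
    n = sum(tenures)
    sum_sq = sum(q * q for q in tenures)
    total = Fraction(0)
    for q in tenures:
        # Inner bracket of equation 8: 1 + (1/n^2) * sum_k q_k^2 - (2/n) * q_j
        inner = 1 + Fraction(sum_sq, n * n) - Fraction(2 * q, n)
        total += q * inner
    return total / n


def turnover_term_short(tenures):
    """Algebraically equivalent shortcut under the same assumption: 1 - sum_j (q_j / n)^2."""
    n = sum(tenures)
    return 1 - Fraction(sum(q * q for q in tenures), n * n)


print(turnover_term([10]))       # one principal for the whole period
print(turnover_term([5, 5]))     # two principals, equal tenure
print(turnover_term([3, 3, 3]))  # three principals, equal tenure
```

Under the stated assumption the bracketed expression simplifies to one minus the sum of squared tenure shares, which makes clear why the term rises as tenure is spread across more principals.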

The term $\sigma^2_{\gamma_s}$ in equation 7 denotes the variance of cohort average quality of students in a

school. To identify the variance of principal effects using principal turnover, we assume that

principal turnover is not related to this within school variance in cohort average student quality.

Note that this does not imply that turnover is unrelated to the average quality of students in a

school, just its within school variance. The within-school variance in cohort average student

quality $\sigma^2_{\gamma_s}$ will be proportional to the inverse of the number of students in the school. The year

to year variation will be higher in smaller schools. We control for this effect in our estimates by

including the inverse of school size in our estimating equation.

Our primary estimating equation for the fixed principal effects model is equation 7, after sub-

stituting in equation 8. The regressand is the variance in student outcomes across cohorts within

each school. We regress this variance on the term we construct to denote principal turnover (the

summation term on the right hand side of equation 8), plus the inverse of the average size of the

grade 12 entering cohort in the school. Imposing the assumptions stated above, the coefficient

on the turnover term will provide a consistent estimate of the within school variance of fixed

principal effects $\sigma^2_{\theta_s}$.
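The school-level regression just described can be sketched as follows. All numbers are invented for illustration, the outcome variances are constructed to satisfy the estimating equation exactly (so there is no sampling noise), and the inclusion of a constant is our assumption rather than something stated in the text.

```python
import numpy as np

# Hypothetical school-level data (illustrative values, not from the paper).
turnover = np.array([0.0, 0.5, 0.5, 2 / 3, 0.0, 0.48, 0.64, 0.32])
cohort_size = np.array([120.0, 250.0, 80.0, 300.0, 60.0, 150.0, 200.0, 100.0])

# Assumed "true" parameters used only to build the example data.
true_sigma2_theta = 0.004  # within-school variance of principal effects
true_size_coef = 0.09      # coefficient on inverse cohort size
true_const = 0.0002        # constant, standing in for E[e_s]

# Within-school variance of outcomes, built without noise so that
# least squares recovers the parameters exactly.
outcome_var = true_const + true_sigma2_theta * turnover + true_size_coef / cohort_size

# Design matrix: constant, turnover term, inverse of average cohort size.
X = np.column_stack([np.ones_like(turnover), turnover, 1.0 / cohort_size])
coef, *_ = np.linalg.lstsq(X, outcome_var, rcond=None)
const_hat, sigma2_theta_hat, size_coef_hat = coef

print(sigma2_theta_hat)  # coefficient on the turnover term
```

With real data the regressand is noisy, so the coefficient on the turnover term is an estimate of the within-school variance of principal effects rather than an exact recovery.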

Our estimator removes all across school variation in school principal effects, so we are gen-

erating a lower bound estimate of the overall variance in principal effects. If all schools hired

from a common pool of potential school principals, across school variation in average principal

quality would be zero. If, however, certain schools can hire from a larger pool of applicants, by

offering say more advantageous living and working conditions, the average quality of those hired

should be higher. Thus there may be considerable across school variation in principal quality that

we do not identify here.

Another source of downward bias in our estimate of the variance in principal effects is the

violation of the assumption that the effect of leadership does not change when principals do not

change. Principals may take a number of years to make an impact on a school after joining it,

changing the culture and instruction over time. The effect of a particular principal on student

outcomes may therefore also change over time. We develop a second empirical model to analyze

this particular issue further in the next sub-section.

If there is non-random attrition of principals from the BC public high school system, there

may be an upward bias in our estimate of the variance of principal effects. We observe principal

turnover from both rotation of principals across schools and from principals leaving and joining

the sector. If only good or bad principals leave, and new hires are drawn randomly from the

distribution, turnover will be related to the distribution of principal effects in a school. For

example, say a school gets a good draw from the principal distribution, raising student outcomes.

If that principal leaves, and the next principal is drawn randomly, turnover and quality deviations

will be related. More turnover will be observed in schools that have a wider distribution in

principal effects.

A second source of potential upward bias in our estimate of the variance of principal effects

is if there are exogenous changes in the school environment that occur simultaneously with

principal turnover. If there are teacher or instructional changes in a school that coincide with

principal turnover, the effect of these changes on graduation rates is attributed to the principal

with this estimator.

There are two important differences between our estimator here of the within school variance

of principal effects and the estimator of the variance of teacher effects developed by Rivkin et al.

(2005). To begin, we cannot difference out the individual student fixed effect as we only observe

individuals once in our data - the year they enter grade 12. Secondly, our estimator is based on

the within-school variance in student outcomes, and uses the school as the unit of observation.

The Rivkin et al. (2005) estimator uses first differences in student achievement within a school

and grade, and uses individual cohorts within a school as the unit of observation. Our choice

was purposeful here as principals may take some time before making noticeable changes to a

school. A first differencing technique will only identify the immediate effect of a new principal

on a school, potentially missing more important medium term effects.

2.3 Model with Dynamic Principal Effects

In the empirical model of fixed principal effects outlined above, the estimator is based on the

assumption that the impact of principals on student outcomes is immediate and constant year

to year. This assumption may not accord with how principals actually affect schools and indi-

vidual student outcomes. A particular principal may not have their full impact on a school until

the principal has led the school for several years. It may take time to replace under-performing

teachers, to implement new discipline procedures or change learning objectives. A particular principal

may also have an effect on students in lower grades of high school (e.g. from grade 9 when

they usually enter high school) that carries over to their grade 12 outcomes.

We now construct a simple model of principal effects that allows for each principal to have a

cumulative effect (positive or negative) on student outcomes over time. We start with the same

linear equation describing average student outcomes in a school and cohort as above (Equa-

tion 4). Instead of assuming that a principal has the same fixed effect on the school each year, we

allow for the school leadership effect $\theta_{st}$ to be a weighted average of the school leadership effect

in the previous year $\theta_{s,t-1}$ and of the “full” individual school principal effect $\theta_{pt}$, with weight

parameter $\rho$.

$$\theta_{st} = \rho\,\theta_{s,t-1} + (1-\rho)\,\theta_{pt} = \theta_{s,t-1} + (1-\rho)(\theta_{pt} - \theta_{s,t-1}) \tag{9}$$

Here, $\theta_{pt}$ is the individual underlying full principal effect (type) of principal $p$ in school $s$ for

student cohort $t$. If the principal was left to run the school for many years, the leadership effect

$\theta_{st}$ would approach $\theta_{pt}$. This simple model of leadership dynamics is a one parameter model.

More flexible models of leadership dynamics could of course be constructed, but such flexibility

would come at the expense of increasing the number of parameters to be estimated, with the

concern that the precision of any estimates may fall.
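A small numerical illustration of equation 9 may help. The sketch below iterates the partial-adjustment rule for one principal who stays in place; the starting value, the full effect and the value of $\rho$ are arbitrary choices for illustration.

```python
def leadership_path(theta_prev, theta_full, rho, years):
    """Iterate equation 9 for one principal who leads the school for `years` cohorts.

    theta_prev: leadership effect in the year before the principal arrives.
    theta_full: the principal's underlying full effect (theta_pt in the text).
    rho: weight on last year's leadership effect.
    """
    path = []
    for _ in range(years):
        theta_prev = rho * theta_prev + (1 - rho) * theta_full
        path.append(theta_prev)
    return path


path = leadership_path(theta_prev=0.0, theta_full=1.0, rho=0.5, years=4)
print(path)  # [0.5, 0.75, 0.875, 0.9375]
```

Each year the gap between the current leadership effect and the full effect shrinks by the factor $\rho$, so a smaller $\rho$ means a principal reaches their full effect faster, and $\rho = 0$ corresponds to the immediate effect assumed in the fixed effects model.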

Substituting this function of leadership dynamics into Equation 4 yields the following, where $\nu_{st} = \gamma_{st} + \upsilon_{st}$ is a composite error term.

$$G_{st} = \delta_s + \rho\,\theta_{s,t-1} + (1-\rho)\,\theta_{pt} + \nu_{st} \qquad (10)$$

To construct our estimator of this particular model of principal effects, we start with the following notation for the full principal effects in one school $s$.

$$\theta_{pt} = \sum_{p \in P_s} \lambda_{ps} D_{pst} = \lambda_s' D_{st} \qquad (11)$$

In this equation, $D_{pst} = 1$ if principal $p$ is leading school $s$ for cohort $t$, and $D_{pst} = 0$ otherwise. The set of all principals at school $s$ over the period is denoted by $P_s$. Parameter $\lambda_{ps}$ is the unobserved full effect of principal $p$ at school $s$. Using this notation, we can re-write Equation 9 as follows.

$$\theta_{st} = \rho\,\theta_{s,t-1} + (1-\rho)\,\lambda_s' D_{st} \qquad (12)$$

Repeated back substitution of this equation yields:

$$\theta_{st} = \rho^t\,\theta_{s,0} + (1-\rho) \sum_{j=0}^{t-1} \rho^j D_{s,t-j}' \lambda_s = \rho^t\,\theta_{s,0} + D_{st}(\rho)' \lambda_s \qquad (13)$$

As an identifying assumption, we set $\rho^t\,\theta_{s,0}$ to zero. This essentially imposes the restriction that the effect of leadership in the school in the year just prior to the data period we observe is zero in each school. We also set the parameter for the full principal effect of the first principal observed in each school to zero. As a result, all other full principal effects (the remaining $\lambda_{ps}$’s collected in the vector $\lambda_s$) are estimated relative to the first principal.
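The weighted regressors $D_{st}(\rho)$ in Equation 13 can be built recursively rather than by evaluating the sum directly. The sketch below is one possible construction, assuming the starting value is set to zero as in the identifying assumption above. The function name `weighted_dummies`, the six-cohort example with two principals, and the value 0.5 for the weight are all hypothetical.

```python
import numpy as np


def weighted_dummies(D, rho):
    """Build the rho-weighted principal dummies for one school.

    D is a (T, P) array of 0/1 indicators: D[t, p] = 1 if principal p
    leads the school for cohort t + 1. Row t of the result equals
    (1 - rho) * sum_{j=0}^{t} rho**j * D[t - j], which corresponds to the
    weighted sum in Equation 13 with a zero starting value.
    """
    D = np.asarray(D, dtype=float)
    out = np.zeros(D.shape)
    carry = np.zeros(D.shape[1])
    for t in range(D.shape[0]):
        carry = rho * carry + (1.0 - rho) * D[t]
        out[t] = carry
    return out


# Hypothetical school: principal A leads cohorts 1-3, principal B cohorts 4-6.
D = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
rho = 0.5
W = weighted_dummies(D, rho)
print(W)
```

In estimation the column for the first principal observed in each school would be dropped, since that principal's full effect is normalised to zero.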

The estimated model is a panel (by school) of non-linear in parameters regressions with common $\rho$.

$$G_{st} = \delta_s + D_{st}(\rho)'\lambda_s + \nu_{st} = X_{st}(\rho)'\beta_s + \nu_{st} \qquad (14)$$

For a given value of $\rho$, we can solve analytically for $\beta_s$, the vector of the school fixed effect $\delta_s$ (a school level constant) and the full principal effects $\lambda_{ps}$ (Ordinary Least Squares equations). Concentrating on $\rho$, we form the following minimisation problem for estimating $\rho$.

$$\min_{\rho} \sum_{s=1}^{N} \sum_{t=1}^{T} \nu_{st}(\rho)^2 \quad \text{where} \quad \nu_{st}(\rho) = G_{st} - X_{st}(\rho)'\hat{\beta}_s(\rho) \qquad (15)$$

Using the estimated value of $\rho$, we construct our final estimates of the $\delta_s$’s and the $\lambda_{ps}$’s.

This estimator yields an estimate of the speed by which principals affect schools, plus estimates

of the unobserved full effects of individual school principals relative to the first school principal

observed in each school in the data.
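The concentrated least squares problem in Equation 15 can be sketched as a search over candidate values of the weight parameter, with school-by-school OLS at each candidate value. The code below is a minimal, unweighted sketch under several assumptions: a simple grid search stands in for whatever numerical optimiser was actually used, and the function names (`weighted_dummies`, `school_ssr`, `estimate_rho`) and the noise-free data for two schools are hypothetical.

```python
import numpy as np


def weighted_dummies(D, rho):
    """Row t holds (1 - rho) * sum_j rho**j * D[t - j], with a zero starting value."""
    D = np.asarray(D, dtype=float)
    out = np.zeros(D.shape)
    carry = np.zeros(D.shape[1])
    for t in range(D.shape[0]):
        carry = rho * carry + (1.0 - rho) * D[t]
        out[t] = carry
    return out


def school_ssr(G, D, rho):
    """Sum of squared OLS residuals for one school at a given rho.

    Column 0 of D (the first principal observed) is dropped, which sets
    that principal's full effect to zero, as in the normalisation in the text.
    """
    X = np.column_stack([np.ones(len(G)), weighted_dummies(D[:, 1:], rho)])
    beta, *_ = np.linalg.lstsq(X, G, rcond=None)
    resid = G - X @ beta
    return float(resid @ resid)


def estimate_rho(schools, grid):
    """Return the grid value of rho with the smallest total SSR across schools."""
    totals = [sum(school_ssr(G, D, rho) for G, D in schools) for rho in grid]
    return float(grid[int(np.argmin(totals))])


# Hypothetical noise-free outcomes for two schools, generated with rho = 0.6.
rho_true = 0.6
D1 = np.zeros((10, 2))
D1[:5, 0] = 1
D1[5:, 1] = 1
D2 = np.zeros((10, 3))
D2[:3, 0] = 1
D2[3:7, 1] = 1
D2[7:, 2] = 1
G1 = 1.0 + weighted_dummies(D1[:, 1:], rho_true) @ np.array([0.8])
G2 = -0.5 + weighted_dummies(D2[:, 1:], rho_true) @ np.array([-0.4, 0.6])

grid = np.linspace(0.0, 0.95, 20)
rho_hat = estimate_rho([(G1, D1), (G2, D2)], grid)
print(rho_hat)
```

Setting the weight to zero in this sketch reduces the regressors to ordinary principal dummies, which corresponds to the fixed effects case in which each principal has the same effect in every year.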

Note that both this technique and the one assuming fixed leadership effects over time assume

an unobserved effect for each principal that affects student outcomes additively and linearly. In

both cases, school fixed effects are allowed for, so only within school variation in principal effects is estimated. The potential biases in the estimates described above for the first technique will

be the same for this second technique, except that we now allow for a growing effect of principal

leadership on schools.

So what are the main differences between the two estimators? In the first model above,

the technique restricts each principal to have the same fixed effect on student outcomes each

year the principal is leading the school. The estimator yields an estimate of the variance of the

distribution of these unobserved fixed effects, and it produces a direct test of whether there is

observable variation across principals in their effects on student outcomes.

In this second estimator, each principal can have a growing effect (positive or negative) on

student outcomes in the school. In this case, a principal has an initial impact of $(1-\rho)$ times the estimated full unobserved effect (1st year in school). In the second year, the estimated impact is $(1-\rho)(1+\rho) = (1-\rho^2)$ times the unobserved full effect. In year $k$, the effect equals $(1-\rho^k) = (1-\rho)\sum_{j=0}^{k-1}\rho^j$. In this method, the estimator yields an estimate of $(1-\rho)$ (the speed

by which principals affect schools), plus estimates of the full unobserved principal effects. Given

this set of estimated unobserved effects, we can calculate the variance of both the full effects,

and of the year by year impacts of principals on schools.

3 The Data

The data we employ is Ministry of Education records on all youth enrolled at the start of Novem-

ber in grade 12 of standard public (provincially funded) British Columbia high schools from

1995 to 2004. For each grade 12 student, we observe whether and when they graduated from

high school.4 We also observe the high school the individual attended, from which we can iden-

tify the principal at the school when the student was in grade 12. In addition, we often observe

the score or scores the student achieved in particular provincially set final exams (English, Mathematics and Communications).5

4 We only observe high school graduation if it occurred before October 2005.

The administrative school records contain information on each student’s birth month and

year, from which we can construct a variable denoting the student’s age in months. Students

entering grade twelve at older ages are those that either repeated a grade of school earlier on, or

entered school at a later age than normal. The records also contain information on gender, First Nation status, whether they are an English as a Second Language student, plus information on

the language spoken at home.

From 1996 onwards, the data also has the student’s home postcode recorded. Using this in-

formation, we link individual student records to 2001 Census information on the characteristics

of the Census Tract or Subdivision (neighbourhood) where the student lives. Since we do not

have direct information on the income or education level of the students’ parents, the Census information provides an indirect means of controlling for the influence of missing family background characteristics on student outcomes. Our neighbourhood characteristics may also proxy peer quality, as

most youth attended the school nearest to the family home. These characteristics are measured

at the Census Tract level where such tracts are identified (the majority of urban areas), and at

the Census Subdivision level if Census Tracts were not defined or had populations that were too

small to measure neighbourhood characteristics with an appropriate level of accuracy (less than

250 people). In a small number of cases, characteristics were measured at the Census Division

level, if both Tract and Subdivision information were not available or were unreliable due to

those areas having a population of less than 250.

3.1 Data Description

In the analysis to follow, we analyse student outcomes for students attending standard public

high schools only for several reasons. To begin, a number of public school districts employed

5 This data set also has information on each student’s grades in up to three subjects in grade 11 and, if they complete grade 12, their final high school Grade Point Average (GPA), which is a weighted average of their course marks. As discussed above, we chose not to use this grade 11 information in the analysis.

a program where principals were regularly rotated from one school to another by the district’s

school board. This provides us with some exogenous turnover of principals by which to iden-

tify principal effects. Focusing on public schools will also minimize resource differences across

schools, with funding levels for all public schools set by the British Columbia Provincial gov-

ernment. We also restrict attention to standard public high schools with at least 25 students in

each grade 12 entering cohort when analysing graduation rates. This restriction was undertaken

to limit noise in the estimates of mean graduation rates by school and cohort. When analysing

English Grade 12 exam scores, we further limit attention to schools that had at least 25 students

writing such exams each year. We only analyse English scores and not the available scores on

a Mathematics exam and a Communications exam, as the number of students taking these other

exams was not large enough to estimate principal effects with any precision.

We construct two separate indicators of high school graduation from the data. The first

measure (1 year) denotes high school graduation as occurring if the student graduated by October

of the year after the student was identified as being enrolled in grade 12 (measured in November).

This gives individuals on average about a year to complete high school. If they complete high

school on a regular schedule, this should occur in June. The second measure (2 year) denotes

high school graduation occurring if the student graduated before October two years after being

enrolled in grade 12. This gives students an extra year by which to graduate high school, even if

they do not complete high school with their own cohort. Only a very small number of students

complete high school after these times. For our purposes, such students would be denoted as

non-school completers. If they did complete high school at later stages, this was often completed

outside of regular public high schools in specific continuing education institutions.

For the graduation outcomes, our final sample covered 224 schools. Of the 224 schools, 22

had only one principal over the period, another 77 had two, 87 had three, 29 had four, and 9 had five principals. Note that a small number of the schools were not observed over the full 10 year period, as new schools were opened in British Columbia, and some were closed. In

all, we observe 504 separate principals in these schools over the period. We observe 127 (25

per cent) of these principals in more than one school over the period (114 in two schools, 13

in three schools). These switching principals are observed for an average of 3.3 years in each

school. Of these switching principals, 97 were observed only in schools within the same

school district. Thus we observe significant rotation of principals across schools, particularly

within school districts. There is also significant principal turnover unrelated to rotation, due to

new principals entering and others leaving the British Columbia public high school system. The

proportion of all turnover that is due purely to rotation (switching principals) is 37 per cent.

Summary statistics by school are provided in Table 1. On average over this period in British

Columbia, 78 per cent of entering grade 12 students graduate from high school within one year

of entering grade 12 (82 per cent within two years). Average graduation rates varied significantly

across these public high schools, from a low of 35 to a high of 93 per cent. The across school

distribution of mean school graduation rates over the 1996 to 2004 period is presented in Figure 1. Note that the distribution of graduation rates is skewed to the left, with bunching near the upper limit of 100 per cent graduation. Given this distribution, when we construct our estimates of principal effects below, we use the log odds of graduation rates $\ln[G/(1-G)]$ as our outcome measure rather than actual graduation rates in levels $G$.
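The log odds transformation and its inverse can be written in a few lines. The following Python sketch is illustrative only; the function names `log_odds` and `inverse_log_odds` are hypothetical, and the example value of 0.78 is the mean 1 year graduation rate reported in the text.

```python
import math


def log_odds(g):
    """Return ln[g / (1 - g)] for a graduation rate g strictly between 0 and 1."""
    if not 0.0 < g < 1.0:
        raise ValueError("graduation rate must lie strictly between 0 and 1")
    return math.log(g / (1.0 - g))


def inverse_log_odds(z):
    """Map a log odds value back to a rate between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


z = log_odds(0.78)
print(z, inverse_log_odds(z))
```

The transformation is unbounded, so rates bunched near the upper limit are spread out rather than compressed. It is undefined at exactly 0 or 1, which the sketch handles by raising an error.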

For English grade 12 exam scores, we have data for 209 schools that had at least 25 students

write the grade 12 English Exam each year. Over time mean scores varied from 61.7 to 75.7

per cent across schools, with an average of 68.9 per cent and a standard deviation of 2.6. The

across school distribution of mean school Grade 12 English exam scores over the 1995 to 2004

period is presented in Figure 2. This distribution is approximately bell-shaped, with no apparent

skewness. Given this, we analyse these exam scores in levels in the analysis below.

Table 1: Statistics by School

                                 mean     s.d.     min      max
Graduation rate (1 year)         0.78     0.08     0.35     0.93
Graduation rate (2 years)        0.82     0.07     0.41     0.95
Number of students               208.3    116.4    31.5     667.9
Male                             0.51     0.03     0.32     0.66
First Nation                     0.06     0.09     0.00     0.69
English as second language       0.05     0.08     0.00     0.37
Home language: English           0.84     0.20     0.19     1.00
Home language: French            0.00     0.00     0.00     0.02
Home language: Other             0.16     0.20     0.00     0.81
Age (months)                     213.7    5.7      209.7    278.7

Notes: 224 observations. Averages over period 1995 to 2004 (or less if school not in existence for whole period).

Figure 1: School Mean Graduation Rate Distribution - 1 year

[Histogram. Horizontal axis: school mean 1 year graduation rate (grad1), 0.4 to 0.9. Vertical axis: density, 0 to 6.]

Figure 2: School Mean English Scores Distribution

[Histogram. Horizontal axis: school mean English exam score (Mbepcen), 60 to 75. Vertical axis: density, 0 to 0.2.]

3.2 Controlling for individual characteristics

Before employing the two estimation techniques described above to examine whether school

principals affect the student outcome measures of graduation and English exam scores, we first

remove from our measures the influence of a number of available individual, peer and neighbour-

hood characteristics. We also remove the aggregate effect of time from our measures. We control

for these factors by constructing first stage estimations. For the two graduation rate measures,

we employ the Logit technique to estimate the graduation probabilities of individual students in

these first stage estimates. For Grade 12 English Exam Scores, we use Ordinary Least Squares

to construct our first stage estimates.

The coefficient estimates for these first stage estimations are collected in Tables A1, A2 and

A3. We construct three sets of first stage regressions for each outcome measure. In Table A1,

we control for aggregate time effects only. In Table A2, we control for time, individual and

peer effects. In Table A3, we control for time, individual, peer and neighbourhood effects. Note

that when controlling for neighbourhood characteristics, we lose one year of data (1995), as

individual student postcode information was not collected until 1996.

Looking at these first stage regression results, note the strong aggregate time effects for all

three student outcome measures in Table A1. These logit coefficients imply an 8 percentage

point growth in graduation rates over the period from 1995 to 2004 for the 1 year measure. Any

changes over time may reflect changes in Provincial government policy in schools, in British

Columbia labour market conditions (high unemployment rates may keep youth in school), in

Income Assistance rules, and in aggregate post-secondary education possibilities. There is also

a 3 percentage point growth in average English exam scores, all of which occurred after 1999.

This may reflect grade inflation.

When controlling for individual characteristics (Table A2), we include the male indicator plus

its interaction with the other five individual characteristics. Although the coefficient on the male

indicator is positive, when we take into account the influence of these interactions, being male

significantly lowers the probability of graduating from high school and lowers English exam

scores. Being of First Nation background and being an English as a Second Language (ESL)

student lowers the probability of graduation by approximately 20 percentage points. French

speakers have a 6 percentage point lower probability of graduation, while speaking a language

other than English or French at home raises the probability of graduation by around 4 percentage

points. Being one year older than the average student reduces the probability of graduation by

10 percentage points.

We construct peer quality measures for each individual student by calculating the average

of the individual characteristics for all other students in the same school and cohort (year). For

example, we calculate the proportion of a student’s classmates that came from a First Nation

background, but do not include the individual student themselves in the calculation. These are

the exogenous peer measures $P_{(-i)st}$ described in Section 2.1 above. Regarding the estimated

peer effects in Table A2, having more male, First Nation and “Other” home language speakers

in your class lowers the probability of graduation. Having more ESL students in class actually

raises graduation probabilities.
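The peer measures described above are leave-one-out averages within a school and cohort. A minimal Python sketch of that calculation follows; the function name `leave_one_out_mean` and the five-student example are hypothetical.

```python
import numpy as np


def leave_one_out_mean(x):
    """For each student, average x over all OTHER students in the same school and cohort."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two students to form a peer average")
    return (x.sum() - x) / (n - 1)


# Hypothetical cohort of five students; 1 marks a First Nation student.
first_nation = [1, 0, 0, 0, 1]
peers = leave_one_out_mean(first_nation)
print(peers)
```

Because each student is excluded from their own peer average, two students in the same school and cohort can have different peer values, unlike a simple school-cohort mean.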

When we add neighbourhood characteristics in Table A3, recall that we lose students for

1995. We also lose a very small number of students as we were unable to link their home post-

code to the Census data, or the home postcode was not provided.6 Here, many of these neigh-

bourhood characteristics do influence student outcomes, and mostly in the expected directions.

Two variables have unexpected effects. Having a higher proportion of the neighbourhood with

less than a grade 9 education is associated with better student outcomes, while having a higher

average value of dwellings is associated with worse student outcomes.

4 Estimates using Model with Fixed Principal Effects

Our first estimator of the effect of high school principals on student outcomes is constructed using

the hypothesis that the within-school variation in student outcomes should be higher in schools

that have several principals over a period than in schools with only one principal over the same

length of time. We use the technique developed in Section 2.2 above to construct estimates of the

within-school variance in principal fixed effects. This estimator involves regressing the within

school variance in student outcomes on our indicator of principal turnover within the school.

The coefficient on the turnover indicator is our estimate of within school variance in principal

effects. We also include the inverse of grade 12 enrolment in the estimated equation to control for

sampling variability in our measures of the within school variance in student outcomes. Including

this term will control for differences in the within school over time variance of student quality

across different sized schools (the term $\sigma^2_{\gamma s}$ in Equation 7) and for the final error component $e_s$.

As discussed above, principals may take a few years to make an impact on a school they are

put in charge of. Our estimator here will have difficulty picking up such time varying effects.

To investigate this issue, we construct our estimates of the variance of within school principal

6 Only 0.4 of one percent of students could not be linked to Census data.

effects for the whole sample of schools, and for the sub-sample of schools where three or fewer principals led the school over the period. By focussing on schools with fewer principals, each

principal has a longer number of years to make an impact on the school, thus making it easier

for this estimator to identify the underlying variance in principal effects.

Estimates of the within-school variance in principal effects for our de-trended and two ad-

justed measures of student outcomes are presented in Table 2. The “adjusted I” measure controls

for individual, peer and time effects, while the “adjusted II” measure controls for individual,

peer, time and neighbourhood effects. When constructing these estimates for the two graduation

rate measures, we take deviations of actual mean graduation rates in log odds terms by school

and cohort from predicted mean graduation rates from our first stage estimates, also in log odds

terms. Estimates for all schools are presented in the top panel of the Table, while estimates for

the sub-sample of schools with three or fewer principals leading the school over the period are

presented in the bottom panel.

For the 1 year graduation rate measure, the variance in principal effects is estimated to be

only 0.020 using the de-trended measure and all schools, with the estimates being even smaller

for the two adjusted measures. All estimates here are insignificantly different from zero. If we

include only schools where we observe three or fewer principals over the period (bottom panel), our variance estimates increase considerably, but they are still statistically insignificant. For the 2 year graduation rate measure, our estimates of the variance in principal quality are larger than for the 1 year measure, but are again statistically insignificant. When we restrict the sample to schools with three or fewer principals using the 2 year measure, the variance estimate using the

de-trended measure is statistically significant at the 10 per cent level, but the adjusted measures

are statistically insignificant.

Table 2: Variance in Principal Quality: Fixed Effects

                                     coefficient   s.e.     observations
All Schools
Grad. rate (1 yr) - de-trended       0.020         0.037    224
Grad. rate (1 yr) - adjusted I       -0.002        0.033    224
Grad. rate (1 yr) - adjusted II      0.003         0.033    224
Grad. rate (2 yrs) - de-trended      0.046         0.037    224
Grad. rate (2 yrs) - adjusted I      0.022         0.032    224
Grad. rate (2 yrs) - adjusted II     0.015         0.032    224
English Scores - de-trended          2.17**        0.87     209
English Scores - adjusted I          1.02          0.87     209
English Scores - adjusted II         1.36          0.85     209

≤ 3 principals
Grad. rate (1 yr) - de-trended       0.033         0.039    186
Grad. rate (1 yr) - adjusted I       0.000         0.036    186
Grad. rate (1 yr) - adjusted II      0.001         0.035    189
Grad. rate (2 yrs) - de-trended      0.067*        0.039    186
Grad. rate (2 yrs) - adjusted I      0.032         0.035    186
Grad. rate (2 yrs) - adjusted II     0.016         0.035    189
English Scores - de-trended          2.42**        0.94     174
English Scores - adjusted I          1.43          0.97     174
English Scores - adjusted II         1.64*         0.94     177

Notes: Regressions also include a constant and the inverse of school grade 12 enrolment. One and two *’s denotes statistical significance at the 10% and 5% levels respectively.

One straightforward way of interpreting the size of the estimates of the variance in principal effects for our graduation rate measures in log odds terms is as follows. Start from the mean graduation rate in the sample of 82 per cent for the 2 year measure. A one standard deviation increase in the log odds of graduating equals 0.126 using our variance estimate of 0.016 for the adjusted II measure in the bottom panel. An increase of 0.126 from the mean raises the probability of graduating by 1.8 percentage points.
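The conversion from a log odds standard deviation to percentage points can be reproduced as follows. This Python sketch uses only the figures quoted in the text (a mean graduation rate of 0.82 and a variance estimate of 0.016); the variable names are hypothetical.

```python
import math

mean_rate = 0.82          # mean 2 year graduation rate from the text
variance_estimate = 0.016 # adjusted II, bottom panel of Table 2

# One standard deviation of principal effects, in log odds units.
sd_log_odds = math.sqrt(variance_estimate)

# Shift the mean graduation rate by one standard deviation in log odds terms.
base_log_odds = math.log(mean_rate / (1.0 - mean_rate))
shifted_rate = 1.0 / (1.0 + math.exp(-(base_log_odds + sd_log_odds)))

change_pp = 100.0 * (shifted_rate - mean_rate)
print(f"s.d. in log odds: {sd_log_odds:.3f}")
print(f"change in graduation probability: {change_pp:.1f} percentage points")
```

Because the log odds transformation is non-linear, the same shift of 0.126 would translate into a different percentage point change at a starting rate other than 82 per cent.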

We can use the model estimates to construct a decomposition of the variance in student

outcomes across schools, to give us an additional method by which to judge how much school

principals can matter. For the estimates for the adjusted II graduation rate (2 year) and three or fewer principals, the cross-school variance in the log odds of graduation rates in 2004 of 0.2782 can be decomposed into the effect of schools (59%), the effect of student quality variation and remaining shocks (35%), and the effect of school principals (6%). See Appendix D for details

on how this decomposition was constructed. From this decomposition we can see that school

principals do not contribute much to overall variation in student graduation rates according to

these estimates.

For English exam scores, the estimates of the variance of principal effects are statistically

significant at the 5 per cent level and sizable for the de-trended measures. For the adjusted

measures, the size of the estimates falls, and only the estimate using the “adjusted II” measure and the sub-sample of schools with three or fewer principals is statistically significant at the 10 per cent level. A one standard deviation difference in within-school principal effects of 1.28 (for schools with three or fewer principals, i.e. the square root of 1.64) is quite large when compared to a standard deviation in adjusted mean English exam scores across schools of 1.82 percentage points. It implies that if a student attended a school whose principal was one standard deviation higher in the “effective” distribution, their English exam score would be 1.28 percentage points higher.

We can again use the model estimates to decompose the variance in English test scores across

schools into the effects of schools, student quality and remaining shocks, and school principals.

For the adjusted II measure of English exam scores and three or fewer principals, the cross-school variance in scores of 7.18 can be decomposed into the effect of schools (35%), the effect of student quality variation and remaining shocks (42%), and the effect of school principals

(23%). Thus school principals have a larger proportional effect on English exam scores than on

graduation rates.
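The principal shares quoted in the two decompositions appear consistent with dividing the estimated variance of principal effects from Table 2 by the total cross-school variance. The Python sketch below performs only that arithmetic check. It is an assumption about how the principal share arises, and it does not reproduce the full decomposition in Appendix D; the function name `principal_share` is hypothetical.

```python
def principal_share(principal_variance, total_variance):
    """Share of cross-school variance attributed to principals (assumed simple ratio)."""
    return principal_variance / total_variance


# Figures quoted in the text: adjusted II measures, three or fewer principals.
grad_share = principal_share(0.016, 0.2782)   # 2 year graduation rate, log odds
english_share = principal_share(1.64, 7.18)   # English exam scores

print(f"graduation rates: {100 * grad_share:.0f} per cent")
print(f"English scores:   {100 * english_share:.0f} per cent")
```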

To summarize these estimates, there is weak evidence of principals affecting the student

outcome of English exam scores, and even weaker evidence of principals affecting graduation

rates. Generally speaking, the point estimates are larger when the sample is restricted to schools

that have three or fewer principals leading the school over the period. This suggests that there may be some support for the hypothesis that principals may take a number of years to affect a school. It is to this issue that we turn in the next section.

5 Estimates using Model with Dynamic Principal Effects

We now turn to estimating the model of school principal effects described in subsection 2.3

above. In this estimation, the effect of a school principal on student outcomes is allowed to grow

over the time the principal is leading the school. The results here will include estimates of the

speed by which a new principal affects student outcomes (the parameter (1-ρ)), plus estimates of

the unobserved “full” principal effects. Using these “full” principal effect estimates, we construct

estimates of the overall variance of such effects. Unlike the first estimation method, however,

there is no readily available test in this case of whether this estimated variance of the distribution

of principal effects is significantly different from zero or not. The variance estimate will always

be positive. The estimates of the parameter on the speed of adjustment to these unobserved principal effects (1-ρ), however, do provide information about whether principals do have

effects on student outcomes. If estimates of (1-ρ) are significantly different from zero, it implies

that student outcomes do respond significantly to individual principals.
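A test of whether (1-ρ) differs from zero can be sketched as a simple ratio of the estimate to its standard error. The Python sketch below uses the adjusted II estimates and standard errors reported in Table 3. It assumes the estimator is approximately normally distributed and uses two-sided standard normal critical values; since (1-ρ) is a linear function of ρ, its standard error equals that of ρ. The function name `speed_z` is hypothetical.

```python
# (estimate of rho, standard error) for the adjusted II measures in Table 3.
table3 = {
    "Grad. rate (1 yr) - adjusted II": (0.740, 0.173),
    "Grad. rate (2 yrs) - adjusted II": (0.751, 0.176),
    "English Scores - adjusted II": (0.733, 0.052),
}

# Two-sided standard normal critical values (normality is an assumption here).
CRIT_10_PER_CENT = 1.645
CRIT_1_PER_CENT = 2.576


def speed_z(rho_hat, se):
    """z statistic for the null hypothesis that (1 - rho) equals zero."""
    return (1.0 - rho_hat) / se


z_stats = {name: speed_z(rho_hat, se) for name, (rho_hat, se) in table3.items()}
for name, z in z_stats.items():
    print(f"{name}: z = {z:.2f}")
```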

In this section, we construct our estimates using the two measures of graduation rates and

English exam scores. We again use the measures that have been de-trended, adjusted for individual student, peer and time effects, and also adjusted for individual, peer, time and neighbourhood

characteristics. The graduation rates are again analyzed in log odds form.

Before providing our model estimates, information on average changes in student outcomes

by the number of years a principal has led a school is provided in Figure 3. We present figures

for both the average raw and absolute value of year on year changes in the log odds of graduation

rates (2 year) and in English exam scores (the adjusted II measures). The 95 percent confidence

bands around the averages are provided as dotted lines. In the top left graph, the first data point denotes the average year on year change in the graduation rate

for all principals that are in their first year of leading a school. This value is essentially zero,

thus new principals in a school do not appear on average to have an overall positive or negative

effect on graduation rates. Thus new principals do not have on average a positive honeymoon

effect nor a negative disruption effect. The average change is also zero when principals are in

their second year, third year, and so on. Thus there do not appear to be any systematic positive learning effects when a new principal starts at a school, nor are there systematic negative effects

later on, possibly due to a souring effect. The same is true for English exam scores (bottom left

graph).

There is, however, some evidence of declines in the average of the absolute value of these year

on year changes in student outcomes. The average of the absolute value of year on year changes

in the log odds of graduation rates is approximately 0.38 in the first year a principal leads a

school, but this average falls to 0.35 by the fourth year principals are leading a particular school.

The decline appears more significant for English exam scores. There is thus some evidence that

principals have larger effects (positive and negative) on student outcomes earlier in their tenure

at a school, and smaller effects later on. This pattern in absolute changes is consistent with a

growing effect of a school principal over time towards their “true” or full effect level. Note that

these dynamic patterns do not appear to be a composition effect i.e. smaller absolute changes for

those principals that happen to remain in a school for a large number of years. We constructed these same figures for the sub-sample of all principals that stayed in a school for at least five years, and the observed patterns for this sub-sample were the same as those depicted in Figure 3.

Figure 3: Student Outcomes by Year Principal Leading School

[Four panels. Top row: Graduation Rates (log odds adjusted II), raw changes (left) and absolute changes (right). Bottom row: English Exam Scores (adjusted II), raw changes (left) and absolute changes (right). Horizontal axis in each panel: No. yrs principal in school, 1 to 6.]

note: dashed lines are 95 percent confidence bands.

Estimates of the dynamic model of principal effects are presented in Table 3. For the graduation rate measures, none of the estimates of the initial impact of school principals $(1-\rho)$ is statistically significant, although the size of the impact, at approximately 0.25, is quite large. For

English exam scores, however, the impact of school principals is found to be quite precisely esti-

mated and statistically significant for all three versions. Taking the estimate for the English exam

scores adjusted for individual, peer, time and neighbourhood characteristics (“adjusted II”), the

impact factor of 0.267 implies that a new school principal will have about one quarter of their potential full effect in their first year. After two years, they will have had an impact of around 0.463 times the full effect, i.e. $(1-\rho)(1+\rho) = 1-\rho^2$. After three years, the impact will be 0.606 times the full effect, i.e. $(1-\rho)(1+\rho+\rho^2) = 1-\rho^3$, and so on. If the principal were left to lead the school for many years, the effect would gradually approach the full effect.
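The fractions quoted above follow directly from the estimate of ρ for the adjusted II English exam scores (0.733 in Table 3). The short Python sketch below reproduces them; the function name `fraction_of_full_effect` is hypothetical.

```python
def fraction_of_full_effect(rho, years):
    """Share of the full principal effect realised after a given number of years.

    Equals (1 - rho) * (1 + rho + ... + rho**(years - 1)) = 1 - rho**years.
    """
    return 1.0 - rho ** years


rho_english = 0.733  # estimate of rho, English Scores - adjusted II (Table 3)
fractions = [fraction_of_full_effect(rho_english, k) for k in (1, 2, 3)]
for k, share in zip((1, 2, 3), fractions):
    print(f"after {k} year(s): {share:.3f} of the full effect")
```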

This dynamic model also provides estimates of school fixed effects and of the deviations in

principal effects within each school from the first principal observed in each school over the

period. Estimates of the variance of the complete set of estimated principal effects are provided

in Table 4. Standard errors on these estimates of the variance are also provided. Note that the variances of the full principal effects in the first column of the table are larger than the estimates

of the variance of principal effects using the first estimation method that treated the principal

effect as being fixed at the same level each year the principal is in each school. These effects

denote the full impact that a principal would have if left leading the school for many years. The

initial impact of a principal in the first year they lead a school is approximately 0.25 of this full

effect, that is, it equals the full effect multiplied by our estimate of (1− ρ).

The third column of Table 4 reports estimates of the variance of the impacts that each principal is estimated to have on a school each year. For each year, this estimated impact equals the full impact times $(1-\rho^j)$, where $j$ is the number of years the principal has led the school by that year. These estimated variances of the principal effect impacts in column 3 are of a similar order of magnitude as the estimates of the variance of fixed principal effects calculated using the first method.

Table 3: Estimates of Dynamics (ρ)

                                     estimate of ρ   s.e. (ρ)   estimate of (1 - ρ)
Grad. rate (1 yr) - de-trended       0.775***        0.144      0.225
Grad. rate (1 yr) - adjusted I       0.744***        0.174      0.256
Grad. rate (1 yr) - adjusted II      0.740***        0.173      0.260
Grad. rate (2 yrs) - de-trended      0.781***        0.145      0.219
Grad. rate (2 yrs) - adjusted I      0.753***        0.178      0.247
Grad. rate (2 yrs) - adjusted II     0.751***        0.176      0.249
English Scores - de-trended          0.821***        0.056      0.179***
English Scores - adjusted I          0.682***        0.052      0.318***
English Scores - adjusted II         0.733***        0.052      0.267***

Notes: Standard errors (s.e.) constructed as square root of second derivative of the objective function. Three *’s denotes statistical significance at the 1% level. For graduation rates, 1726 observations. For English Exam Scores, 1615 observations. Student numbers used as weights.

Table 4: Variance of Principal Effects Estimates

                                     full effect            impact effect
                                     variance    s.e.       variance    s.e.
Grad. rate (1 yr) - de-trended       0.302       0.371      0.046       0.048
Grad. rate (1 yr) - adjusted I       0.205       0.265      0.036       0.039
Grad. rate (1 yr) - adjusted II      0.199       0.248      0.036       0.038
Grad. rate (2 yrs) - de-trended      0.338       0.432      0.049       0.053
Grad. rate (2 yrs) - adjusted I      0.220       0.299      0.038       0.043
Grad. rate (2 yrs) - adjusted II     0.216       0.284      0.038       0.042
English Scores - de-trended          16.047      9.944      1.390       0.774
English Scores - adjusted I          5.282       1.584      1.244       0.316
English Scores - adjusted II         7.260       2.711      1.260       0.407

Notes: Estimates constructed using the number of years a principal is in each school as weights. The estimated principal effects were first demeaned within each school, including the imposed zero for the first principal observed in each school. Standard errors constructed using delta method, using full variance-covariance matrix of ρ, school fixed effects and principal fixed effects.

These estimates of the full principal fixed effects are constructed based on sample data, thus

they are random estimates of the true full principal fixed effects. This results in the estimated

variance of the full principal fixed effects reported in Table 4 exceeding the true variance of the

full principal effects. A simple simulation exercise was conducted in order to ascertain by how

much these estimates of the variance may exceed the true variance. This simulation exercise

involved drawing randomly from the estimated distribution of the principal fixed effects 500

times, and constructing an estimate of the variance of the principal full effects for each draw.

The exercise used the assumption that the parameter estimates are normally distributed. The

average of these simulated variance estimates was then compared to the relevant estimate from

the first column in Table 4. This exercise implied the estimated variance may overstate the true

variance by on average 20 percent for English exam scores, but by a significantly larger 420

to 500 per cent for graduation rates. The parameter estimates were measured with much less

precision for the graduation rate outcome.
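The logic of this simulation exercise can be illustrated with a small sketch. It is not the authors' code: it assumes independent normal draws around each estimated principal effect (ignoring the covariances between estimates), uses an unweighted variance, and defines the overstatement as the ratio of the average simulated variance to the variance of the point estimates. The function name and the toy inputs are hypothetical.

```python
import random
import statistics


def variance_overstatement(estimates, std_errors, n_draws=500, seed=0):
    """Ratio of the mean simulated variance to the variance of the estimates.

    Each draw perturbs every estimated effect by normal sampling noise with
    the given standard error (assumed independent across principals).
    """
    rng = random.Random(seed)
    base = statistics.pvariance(estimates)
    simulated = []
    for _ in range(n_draws):
        draw = [rng.gauss(est, se) for est, se in zip(estimates, std_errors)]
        simulated.append(statistics.pvariance(draw))
    return statistics.fmean(simulated) / base


# Hypothetical inputs: four principal effects, each with standard error 1.
ratio = variance_overstatement([-1.0, 0.0, 1.0, 2.0], [1.0, 1.0, 1.0, 1.0])
print(f"sampling noise inflates the variance by about {100 * (ratio - 1):.0f} percent")
```

The noisier the individual estimates are relative to their spread, the larger this ratio becomes, which is the pattern reported above for graduation rates compared with English exam scores.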

These estimates can be used to construct some simple measures to gain an understanding of

the potential size of the effect school principals can have on student outcomes. After adjusting for

the sampling variability using the simulation exercise above, the model estimates implied that

a one standard deviation more effective principal would raise graduation rates (2 year) by 2.6

percentage points from the mean of 82 percent, if left in a school “forever” (having their “full”

effect). A one standard deviation more effective principal would raise English exam scores by

2.5 percentage points, if left in a school “forever”. These calculations are based on the adjusted II

measures. Principals would have just 0.25 of this “full” effect in their first year in a school.

A variance decomposition exercise was conducted for this dynamic model in order to obtain another estimate of how much principals can matter. This exercise uses the estimated variance of full principal effects appropriately adjusted for being estimated using sample data. It also uses the estimated variance of within school variation in student quality and remaining errors from the model estimates. This is simply the mean squared error from the estimated model. Finally, it uses an estimated variance of school effects constructed assuming all principals are in their third year of leading a school (the average in the data is 3.3 years). Variance decompositions were constructed firstly assuming all principals were in their first year at a new school, i.e. they had an impact equal to (1 − ρ) times their full effect. A second set of decompositions was constructed assuming all principals were in their second year (impact of (1 − ρ²) times their full effect). A third set was based on all principals being in their third year, and so on, up to six years. A final set was constructed based on all principals being in their school “forever”, i.e. having their “full” effect.

Table 5: Proportion of Outcome Variance Attributable to School Principals

                      year 1   year 2   year 3   year 4   year 5   year 6   FULL
Grad. rate (2 yrs)    0.8      2.5      4.3      5.9      7.3      8.2      11.3
English Scores        9.3      23.5     34.5     42.0     47.0     50.5     58.8

Notes: Authors’ calculations based on estimates using the adjusted II measures. First column assumes all principals are in their first year at a new school, second column that all are in their second year, etcetera.

The results of these hypothetical decompositions for the adjusted II measures of the grad-

uation rate (2 year) and English exam scores are summarized in Table 5. Note that for both

outcome measures, the proportion of the cross-sectional variation in student outcomes that could

be attributed to school principals is quite small if principals were all in their first year in a school.

This proportion grows as school principals lead a school for more years, reaching 22.9 and 46.5

percent for graduation rates and English exam scores respectively if school principals were left in

schools long enough to have their full effect. Thus school principals can have quite a large effect

on student outcomes, if they are given enough time to do so. Their effect is larger on English

exam scores than on graduation rates, a result consistent with the estimates using the first fixed

effects technique.
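The year-by-year shares in Table 5 can be approximated with a short calculation. This is a sketch under our own assumption, not a statement of the authors' exact procedure: the principal component of the variance is scaled by (1 − ρ^j)² while all other variance components are held fixed. The function name is hypothetical. With ρ = 0.733 from Table 3 and the FULL share of 58.8 percent from Table 5, this assumption comes close to the English exam score row.

```python
def principal_share(rho: float, years: int, full_share: float) -> float:
    """Share of outcome variance attributable to principals after `years` years.

    `full_share` is the share when principals have their full effect.
    Assumes the other variance components do not change with tenure.
    """
    # Principal variance relative to all other components at full effect.
    ratio = full_share / (1.0 - full_share)
    # An impact of (1 - rho**years) scales the variance by its square.
    scaled = (1.0 - rho ** years) ** 2 * ratio
    return scaled / (1.0 + scaled)


for j in range(1, 7):
    share = principal_share(rho=0.733, years=j, full_share=0.588)
    print(f"year {j}: {100 * share:.1f} percent")
```

Because the variance scales with the square of the impact factor, the first-year share is far smaller than one quarter of the full share, even though the first-year impact is roughly one quarter of the full effect.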

The dynamic model we have estimated thus far is based on a simple one parameter adjustment

path for principals to affect schools. The effect of leadership on schools is implicitly assumed

to have a smooth concave shape in the number of years a principal is leading a school. We

can of course estimate a much more flexible adjustment path, where the shape can vary over

the number of years a principal leads a school, by allowing our adjustment parameter ρ to vary

over these years. We could allow complete flexibility by estimating a separate ρ parameter for

each year, or we could allow ρ to vary in some particular more limited manner. An analysis was

conducted to uncover an appropriate structure for the adjustment parameters to take. Allowing

full flexibility resulted in quite imprecisely estimated ρ parameters in many cases, particularly

for the parameters corresponding to the fourth, fifth and higher number of years of a principal

leading a school, where the number of observations falls. As a result, adding more structure to

the adjustment path appeared warranted.

A number of structures for the adjustment parameters were estimated, including allowing ρ to follow a linear trend over the years a principal leads a school, or allowing the ρ value for the first one and two years a principal leads a school to differ from the ρ value for the remaining years. This investigation confirmed that a simple one parameter model is quite appropriate for the English exam score outcome. For graduation rates, a model where the ρ value for the first year a principal leads a school was allowed to differ from the ρ value governing the rest of the adjustment process was found to be most appropriate. The estimates of the ρ parameters for this model are presented in Table 6. Note the large estimate of ρ for the first year a principal is in a school, and the smaller ρ for later years. This implies slow adjustment to the principal’s “full” effect initially (impact factor of 1 − ρ), and faster adjustment after that.

Table 6: Estimates of Dynamics (ρ) - Added Flexibility

                                                    estimate of ρ   s.e. (ρ)   estimate of (1 - ρ)
Grad. rate (1 yr) - de-trended     first yr         0.831***        0.125      0.169
                                   all other yrs    0.423*          0.246      0.577**
Grad. rate (1 yr) - adjusted I     first yr         0.773***        0.153      0.227
                                   all other yrs    0.483           0.311      0.517*
Grad. rate (1 yr) - adjusted II    first yr         0.777***        0.154      0.223
                                   all other yrs    0.456*          0.309      0.544*
Grad. rate (2 yrs) - de-trended    first yr         0.850***        0.127      0.150
                                   all other yrs    0.400           0.244      0.600**
Grad. rate (2 yrs) - adjusted I    first yr         0.794***        0.152      0.206
                                   all other yrs    0.461           0.303      0.539*
Grad. rate (2 yrs) - adjusted II   first yr         0.798***        0.153      0.202
                                   all other yrs    0.447           0.300      0.553*

Notes: Standard errors (s.e.) constructed as square root of second derivative of the objective function. One, two and three *’s denote statistical significance at the 10%, 5% and 1% levels respectively. 1726 observations. Student numbers used as weights.

As in the one ρ model, a simple decomposition analysis can be conducted for this particular model also. The results of this decomposition are presented in Table 7. This decomposition reveals the expected pattern of a small observed effect of principals in the first year they lead a school, but a much larger effect after that. The full effect estimates are similar to those for the one ρ model.

Table 7: Proportion of Outcome Variance Attributable to School Principals - Two ρ Model

                      year 1   year 2   year 3   year 4   year 5   year 6   FULL
Grad. rate (1 yr)     0.5      4.0      6.6      7.9      8.5      8.8      9.1
Grad. rate (2 yrs)    0.4      4.2      7.0      8.4      9.0      9.3      9.6

Notes: Authors’ calculations based on estimates using the adjusted II measures. First column assumes all principals are in their first year at a new school, second column that all are in their second year, etcetera.
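The two ρ adjustment path can be sketched in the same way. The functional form below is our assumption, since the text does not state it explicitly: the remaining gap to the full effect is ρ₁ after the first year and shrinks by a factor ρ₂ in each later year, so the impact after j years is 1 − ρ₁ρ₂^(j−1). The function names are hypothetical. With the adjusted II graduation rate (2 yrs) estimates from Table 6 and the FULL share from Table 7, this form comes close to the reported year-by-year shares when other variance components are held fixed.

```python
def two_rho_fraction(rho_first: float, rho_later: float, years: int) -> float:
    """Share of the full effect reached after `years` years (years >= 1).

    Assumed path: the gap is rho_first after year one, then is multiplied
    by rho_later in each subsequent year.
    """
    if years < 1:
        raise ValueError("years must be at least 1")
    return 1.0 - rho_first * rho_later ** (years - 1)


def share_from_fraction(fraction: float, full_share: float) -> float:
    """Variance share when principals have `fraction` of their full effect."""
    ratio = full_share / (1.0 - full_share)
    scaled = fraction ** 2 * ratio
    return scaled / (1.0 + scaled)


# Adjusted II, graduation rate (2 yrs): rho values from Table 6, FULL from Table 7.
for j in range(1, 7):
    frac = two_rho_fraction(0.798, 0.447, j)
    print(f"year {j}: impact {frac:.3f}, share {100 * share_from_fraction(frac, 0.096):.1f} percent")
```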

6 Conclusions

In the analysis above, we have found evidence that school principals can matter in terms of

affecting high school student outcomes. There is considerable evidence that school principals do

affect student scores on common English grade 12 exams, but weaker evidence that principals

affect high school graduation rates. In addition, there is evidence that principals may take a

number of years to have their full effect on schools and student outcomes. Thus individual

principal quality, like idiosyncratic teacher quality, may be an important input by which schools

can affect student outcomes.

Certain potential explanations for why principals may have more influence over exam scores

than graduation rates come to mind. Graduation rates may be a more difficult outcome for

principals to influence. Getting at-risk students to remain in school may take considerable effort.

Raising average English exam scores, however, may simply involve directing teachers to place

a stronger emphasis on “teaching to the test”, or on advising students more carefully on what

specific exams to sit. Principals may also differ more considerably on the weight they place on

raising grade 12 exam scores than they do on the weight they place on improving graduation

rates. All principals may have strong preferences for improving graduation rates, and thus will

devote efforts to improving them. Principals may differ in their preferences for raising grade

12 English exam scores, if an emphasis on English exam scores comes at the cost of reducing

other student outcomes that certain principals have stronger preferences for, such as developing

students’ non-cognitive skills and general life skills.

A considerable amount of principal turnover that we observe in the data stems from school

principals leaving the BC high school system, which may reflect quits from this type of career.

Being a school principal is a stressful job, and many school districts are finding it difficult to

attract quality applicants and to keep successful principals in their jobs. Given the important

effects that school principals can have on student outcomes found in this research, there may be

a role for public policy in increasing efforts to retain good school principals. Such efforts may

be made via increasing salaries or via other measures to improve the desirability of working as a

school principal.

A School Principal Duties and Responsibilities

The following school principal powers and duties were taken from the B.C. Regulation con-

cerning School Regulation (BC Reg. 265/89, amended by BC Reg. 1114/04), made under the

authority of the B.C. School Act.

“...

(6) The principal or, if so authorized by the principal, the vice principal of a school shall,

(a) perform the supervisory, management and other duties required or assigned by the board,

(b) confer with the board on matters of educational policy and, where appropriate, attend

board meetings for that purpose,

(c) evaluate teachers under his or her supervision and report to the board as to his or her

evaluation,

(d) assist in making the Act and this regulation effective and in carrying out a system of

education in conformity with the orders of the minister,

(e) advise and assist the superintendent of schools in exercising his or her powers under the

Act,

(f) recommend to the superintendent of schools the assignment or reassignment of teachers

to positions on the teaching staff of the school board,

(g) recommend to the superintendent of schools the dismissal or discipline of a teacher,

(h) perform teaching duties assigned by the board,

(h.1) administer and grade, as required by the minister, Required Graduation Program Ex-

aminations,

(h.2) ensure the security of Provincial examinations, including retaining completed Provin-

cial examinations for any period of time set by the minister, and

(i) represent the board when meeting with the public in the capacity of principal or vice

principal of a school.

(7) The principal of a school is responsible for administering and supervising the school includ-

ing

(a) the implementation of educational programs,

(b) the placing and programming of students in the school,

(c) the timetables of teachers,

(d) the program of teaching and learning activities,

(e) the program of student evaluation and assessment and reporting to parents,

(f) the maintenance of school records, and

(g) the general conduct of students, both on school premises and during activities that are off

school premises and that are organized or sponsored by the school, and shall, in accordance with

the policies of the board, exercise paramount authority within the school in matters concerning

the discipline of students.

(8) Principals shall ensure that parents or guardians are regularly provided with reports in respect

of the student’s school progress in intellectual development, human and social development and

career development and the student’s attendance and punctuality.

...”

These regulations also include duties related to providing reports on teachers, the details of

student reports, and holding of school assemblies.

B School Achievement Production Function

School achievement can be represented as a function of several inputs, including family inputs,

own ability, peer effects and school inputs (teachers, other resources, curricula, discipline levels,

etc).7 School achievement is a cumulative process, with past inputs affecting achievement as

well as current inputs. Current inputs will determine any gains (or losses) in achievement from

prior levels.

The cumulative nature of school achievement can be seen clearly in the following regression analog of some true EPF for the achievement $A_{iGs}$ of student $i$ in grade $G$ of school $s$.

\[
A_{iGs} = X_{iGs}\alpha_G + P_{(-i)Gs}\beta_G + A_{(-i)Gs}\gamma_G
+ \sum_{g=1}^{G-1} X_{igs}\alpha_g + \sum_{g=1}^{G-1} P_{(-i)gs}\beta_g + \sum_{g=1}^{G-1} A_{(-i)gs}\gamma_g
+ \mu_{is}\delta_G + \varepsilon_{iGs} \tag{16}
\]

The matrix $X$ denotes all family, school and neighbourhood inputs in the EPF. Peer group measures are separated into exogenous (contextual) variables in $P_{(-i)Gs}$ and the endogenous or behavioural variable in $A_{(-i)Gs}$. The term $\mu_{is}$ denotes endowed ability. The endogenous peer group measure here is the actual contemporaneous average achievement level of other students in the class ($A_{(-i)Gs}$). If students are doing better in class, it may lift the achievement of all other students in the class, over and above the measurable exogenous characteristics of those students.

7 Note that individual motivation does not enter the EPF in the standard version of the model in the literature.

If historical input measures are not included in the estimated equation, their impact will be

subsumed into the error term. In this case, the identification of peer effects is difficult. Many of

the missing historic variables (especially school input variables) will be common to peers and

to current included input measures, yielding significant omitted variable bias. The error will

be correlated with our peer measure, potentially biasing upwards the estimated impact of peers on own

achievement. Manski (1993) denoted these potential biases as correlated effects.

Taking first differences of equation 16 yields a value added model of achievement.

\[
\Delta A_{iGs} = X_{iGs}\alpha_G + N_{iGs}\beta_G + S_{Gs}\gamma_G + P_{(-i)Gs}\delta_G + A_{(-i)Gs}\eta_G + \varepsilon_{iGs} \tag{17}
\]

This achievement growth specification is not the only one employed in the literature. Often

past achievement $A^{c}_{i(G-1)s}$ is included as a regressor without imposing a coefficient of one on it.

Researchers support such a specification by claiming that past achievement is a sufficient statistic

for all past inputs and for individual ability. Todd and Wolpin (2003) discuss the identifying

restrictions inherent in these value-added models. In our study, we are not directly estimating an

EPF for achievement, such as equation 16. However, certain estimation problems highlighted by

Todd and Wolpin are common to our study estimating an equation such as 2.

School principals can influence the school inputs in this production function, but further, they

may be able to alter the function f(..) itself. They may be able to bring in a better technology

for the production of individual school outcomes given the inputs at hand.

C School Principal Turnover Term Construction

Here is a simple example of the construction of the principal turnover term in equation 8.

\[
E\left[\frac{1}{n}\sum_{c=1}^{n}\left(\theta_{st}-\bar{\theta}_s\right)^2\right]
= E\left[\frac{1}{n}\sum_{c=1}^{n}\left(\theta_{st}^2+\bar{\theta}_s^{\,2}-2\theta_{st}\bar{\theta}_s\right)\right] \tag{18}
\]

In this example, there are two principals in our school (with principal effects $\theta_j$ and $\theta_k$), and each is leading the school for three years, giving a total number of years of six ($n = 6$). To begin, take the expectation of one term (cohort or year) within the school average, where the principal in charge is principal $j$ (thus $\theta_{st} = \theta_j$).

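\[
E\left[\theta_j^2+\bar{\theta}_s^{\,2}-2\theta_j\bar{\theta}_s\right]
= E\left[\theta_j^2\right]
+ E\left[\left(\tfrac{1}{6}\left(\theta_j+\theta_j+\theta_j+\theta_k+\theta_k+\theta_k\right)\right)^2\right]
- 2\,E\left[\theta_j\left(\tfrac{1}{6}\left(\theta_j+\theta_j+\theta_j+\theta_k+\theta_k+\theta_k\right)\right)\right] \tag{19}
\]

Using our definition of $E[\theta_j^2] = \sigma^2_{\theta_s}$ and our assumption that $E[\theta_j\theta_k] = 0$, this equation can be written as follows.

\[
E\left[\theta_j^2+\bar{\theta}_s^{\,2}-2\theta_j\bar{\theta}_s\right]
= \sigma^2_{\theta_s} + \frac{1}{6^2}\left(3^2+3^2\right)\sigma^2_{\theta_s} - 2\,\frac{1}{6}(3)\,\sigma^2_{\theta_s}
= \frac{1}{2}\,\sigma^2_{\theta_s} \tag{20}
\]

In the turnover term for this particular school, we have six equivalent terms to that above in the school average. So, our turnover term here is also simply one half.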

Now, we can follow the same method to show how we develop the turnover term for the general case. Again let us look at one element of the school average term, where principal $j$ is again in charge. There are a total of $J$ principals in the school over the sample period, each in charge for $q_k$ years, where $k = 1, ..., J$, and $\sum_{k=1}^{J} q_k = n$. In this general case, principal $j$ is in charge for $q_j$ years.

\[
E\left[\theta_j^2+\bar{\theta}_s^{\,2}-2\theta_j\bar{\theta}_s\right]
= E\left[\theta_j^2\right] + \frac{1}{n^2}E\left[\left(\sum_{c=1}^{n}\theta_{cs}\right)^2\right] - 2\,\frac{1}{n}E\left[\theta_j\sum_{c=1}^{n}\theta_{cs}\right]
= \sigma^2_{\theta_s}\left[1+\frac{1}{n^2}\sum_{k=1}^{J}q_k^2-\frac{2}{n}q_j\right] \tag{21}
\]

Averaging this term over the $n$ years that school $s$ is in our sample yields equation 8, our turnover term for the general case. This average in equation 8 uses the fact that for each principal $j$, there are $q_j$ equivalent terms to equation 21 in the school average.
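The averaging step can be checked numerically. The sketch below is ours and the function name is hypothetical: it takes the bracketed factor of equation 21 for each principal, weights it by that principal's years in charge, and divides by n. Equation 8 itself is not reproduced in this appendix, so the closed form in the last comment, 1 − Σq²/n², is our own simplification of that average rather than a quotation of it.

```python
def turnover_term(tenures):
    """Average of the bracketed factor in equation 21 over all n school-years.

    `tenures` lists the number of years q_k each principal led the school.
    """
    n = sum(tenures)
    sum_sq = sum(q * q for q in tenures)
    # Bracketed factor of equation 21 for each principal j.
    per_principal = [1.0 + sum_sq / n**2 - 2.0 * q_j / n for q_j in tenures]
    # Principal j contributes q_j identical terms to the school average.
    return sum(q_j * term for q_j, term in zip(tenures, per_principal)) / n


print(turnover_term([3, 3]))  # the two-principal example: one half
print(turnover_term([6]))     # a single principal: no turnover, so zero
# Algebraically the average simplifies to 1 - sum(q_k**2) / n**2.
```

A school with a single principal over the whole period has a turnover term of zero, so it contributes no within-school variation in principal quality.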

D Variance Decomposition Details

D.1 Fixed Principal Effects

The decomposition of the cross section variance in student outcomes at a point in time employed

the following assumptions:

1. The across school variance in principal quality equals our estimate of $\sigma^2_\theta$, the within school variance we estimate, from Table 2.

2. The across school variance in student quality and any remaining error term, i.e. the variance of $\gamma_{st} + \upsilon_{st}$ from equation 4, is equal to the average within school variation in student quality and remaining error term implied by our estimates. It is calculated by subtracting the implied average principal quality variation within schools (the across school average of the principal turnover term times our estimate of $\sigma^2_\theta$) from the across school average of the within school variance of student outcomes (i.e. the average of $\frac{1}{T}\sum_{t=1}^{T}(G_{st}-\bar{G}_s)^2$ across schools).

3. The across school variance in school effects is equal to the raw across school variance in student outcomes at a point in time (the year of 2004) minus the two variance components identified above, for the variance in principal effects and the variance in student quality and remaining error.
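The three assumptions above amount to a simple accounting identity, sketched below. The function name, argument names and the numbers in the example are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from the paper.

```python
def decompose_variance(raw_across_var, avg_within_var, avg_turnover, sigma2_theta):
    """Split the cross-section outcome variance into three components.

    raw_across_var : raw across school variance of outcomes in one year
    avg_within_var : across school average of within school outcome variance
    avg_turnover   : across school average of the principal turnover term
    sigma2_theta   : estimated variance of principal effects
    """
    # Assumption 1: across school principal variance equals sigma2_theta.
    principal = sigma2_theta
    # Assumption 2: remove implied within school principal variation.
    student_and_error = avg_within_var - avg_turnover * sigma2_theta
    # Assumption 3: school effects are the residual component.
    school = raw_across_var - principal - student_and_error
    return {
        "principal": principal,
        "student_and_error": student_and_error,
        "school": school,
    }


# Hypothetical inputs, for illustration only.
parts = decompose_variance(raw_across_var=10.0, avg_within_var=3.0,
                           avg_turnover=0.5, sigma2_theta=2.0)
for name, value in parts.items():
    print(f"{name}: {value:.1f} ({100 * value / 10.0:.0f} percent of total)")
```

Since the school component is defined as a residual, the three parts sum to the raw across school variance by construction.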

E First Stage Estimates

Table A1: First Stage - De-trended measures

        Grad. rate (1 year)    Grad. rate (2 years)    English Scores

        Coeff.   s.e.          Coeff.   s.e.           Coeff.   s.e.

1996 0.019 0.016 0.013 0.017 0.70a 0.10

1997 0.088a 0.015 0.076a 0.017 -0.06 0.10

1998 0.171a 0.016 0.150a 0.017 -0.83a 0.10

1999 0.233a 0.016 0.218a 0.017 -0.35a 0.10

2000 0.325a 0.016 0.298a 0.017 0.89a 0.10

2001 0.393a 0.016 0.370a 0.017 1.38a 0.10

2002 0.500a 0.016 0.484a 0.018 2.01a 0.10

2003 0.519a 0.017 0.467a 0.018 1.35a 0.10

2004 0.515a 0.016 0.185a 0.017 3.22a 0.10

observations 442,504 442,504 316,248

Notes: Superscripts a, b and c denote statistical significance at the 1%, 5% and 10% levels

respectively. For graduation rates, estimates are from Logit models. For English exam scores,

estimates are from Ordinary Least Squares.

Table A2: First Stage - Adjusted measures I

        Grad. rate (1 year)    Grad. rate (2 years)    English Scores

        Coeff.   s.e.          Coeff.   s.e.           Coeff.   s.e.



male 0.956a 0.199 2.766a 0.207 2.64b 1.06

First Nation -1.024a 0.022 -1.017a 0.023 -4.75a 0.17

ESL student -0.972a 0.025 -0.707a 0.028 -9.26a 0.15

French -0.328a 0.098 -0.457a 0.101 -0.83 0.56

other language 0.236a 0.019 0.346a 0.021 -5.63a 0.09

age -0.050a 0.001 -0.050a 0.001 -0.17a 0.00

male*First Nation 0.159a 0.031 0.194a 0.032 0.80a 0.26

male*ESL student 0.118a 0.032 0.099a 0.035 1.33a 0.20

male*French 0.326b 0.140 0.341b 0.145 0.47 0.85

male*other lang. -0.067a 0.023 -0.074a 0.025 0.65a 0.11

male*age -0.007a 0.001 -0.015a 0.001 -0.03a 0.01

Peer - male -1.363a 0.088 -0.879a 0.095 -2.57a 0.43

Peer - First Nation -1.221a 0.061 -1.176a 0.064 -15.83a 0.55

Peer - ESL students 0.868a 0.049 0.615a 0.053 8.11a 0.28

Peer - French 0.849 0.835 2.312b 0.910 34.90a 4.43

Peer - other lang. -0.550a 0.027 -0.480a 0.029 -0.66a 0.14

1996 0.006 0.016 -0.002 0.018 0.73a 0.10

1997 0.071a 0.016 0.047a 0.018 0.05 0.10

1998 0.141a 0.017 0.099a 0.018 -0.70a 0.10

1999 0.162a 0.017 0.125a 0.018 -0.33a 0.10

2000 0.239a 0.017 0.183a 0.019 0.83a 0.10

2001 0.311a 0.017 0.258a 0.019 1.34a 0.10

2002 0.383a 0.018 0.333a 0.019 1.93a 0.10

2003 0.433a 0.018 0.336a 0.020 1.52a 0.10

2004 0.419a 0.018 0.028 0.019 3.46a 0.10

observations 442,504 442,504 316,248

Notes: Superscripts a, b and c denote statistical significance at the 1%, 5% and 10% levels respectively. For graduation rates, estimates are from Logit models. For English exam scores, estimates are from Ordinary Least Squares.

Table A3: First Stage - Adjusted measures II, including Neighbourhood characteristics

        Grad. rate (1 year)    Grad. rate (2 years)    English Scores

        Coeff.   s.e.          Coeff.   s.e.           Coeff.   s.e.



male 0.996a 0.215 2.776a 0.225 2.20c 1.13

First Nation -1.001a 0.024 -0.994a 0.024 -4.53a 0.17

ESL student -0.998a 0.027 -0.730a 0.030 -9.39a 0.16

French speaking -0.267b 0.109 -0.402a 0.112 -0.70 0.61

other language 0.237a 0.020 0.345a 0.022 -5.57a 0.09

age -0.049a 0.001 -0.050 0.001 -0.16a 0.00

male*First Nation 0.162a 0.032 0.198a 0.033 0.89a 0.27

male*ESL student 0.105a 0.034 0.086b 0.038 1.29a 0.22

male*French speaking 0.295c 0.153 0.321b 0.158 0.39 0.90

male*other lang. -0.046c 0.024 -0.051 0.026 0.69a 0.12

male*age -0.007a 0.001 -0.015a 0.001 -0.03a 0.01

Peer - male -1.659a 0.095 -1.110a 0.102 -5.92a 0.46

Peer - First Nation -0.879a 0.082 -0.841a 0.086 -7.08a 0.71

Peer - ESL students 0.786a 0.055 0.492a 0.059 5.86a 0.31

Peer - French speaking 1.824c 0.934 3.417a 1.015 18.24a 4.85

Peer - other lang. -0.359a 0.038 -0.385a 0.042 0.31 0.20

N - Lone parents -0.277a 0.080 -0.389a 0.086 -0.58 0.46

N - Number rooms 0.094a 0.009 0.098a 0.010 0.16a 0.05

N - Rented proportion -0.221a 0.056 -0.138b 0.060 0.16 0.32

N - Non-English at home -0.216 0.168 -0.021 0.183 -2.72a 0.92

N - Immigrants 0.107 0.111 0.205c 0.120 -3.20a 0.60

N - First nation -0.389a 0.105 -0.191c 0.109 -3.41a 0.71

N - Unemployment rate -0.609a 0.171 -0.555a 0.183 -6.83a 1.00

N - Less than grade 9 1.404a 0.209 0.958a 0.227 13.51a 1.17

N - University educated 0.152 0.140 0.033 0.152 12.95a 0.74

N - other post-secondary 0.525a 0.146 0.541a 0.158 3.81a 0.81

N - ave. family income 0.004a 0.001 0.004a 0.001 0.00b 0.00

N - value of dwellings -0.000a 0.000 -0.000 0.000 -0.00a 0.00

observations 400,116 400,116 287,047

Notes: Superscripts a, b and c denote statistical significance at the 1%, 5% and 10% levels respectively. For graduation rates, estimates are from Logit models. For English exam scores, estimates are from Ordinary Least Squares. Year indicators also included.

References

[1] Aaronson, Daniel, Lisa Barrow and William Sander (2007) “Teachers and Student Achievement in the Chicago Public High Schools,” Journal of Labor Economics, 25 (1), pp. 95-135, January.

[2] Brock, William A. & Steven N. Durlauf (2001) “Interactions-Based Models,” in Handbook of Econometrics Volume 5, edited by James J. Heckman and Edward Leamer, Chapter 54, North-Holland.

[3] Coelli, Michael B., David A. Green & William P. Warburton (2004) “Breaking the Cycle? The Effect of Education on Welfare Receipt Among Children of Welfare Recipients,” Institute for Fiscal Studies Working Paper W04/14, London, June.

[4] Coleman, James S., Ernest Q. Campbell, Carol J. Hobson, James McPartland, Alexander M. Mood, Frederic D. Weinfeld, and Robert L. York (1966) Equality of educational opportunity, U.S. Government Printing Office, Washington, D.C.

[5] Hallinger, Philip & Ronald H. Heck (1998) “Exploring the Principal’s Contribution to School Effectiveness: 1980-1995,” School Effectiveness and School Improvement, 9(2), pp. 157-191.

[6] Hanushek, Eric A. (2003) “The failure of input-based schooling policies,” The Economic Journal, 113, pp. F64-F98, February.

[7] Kane, Thomas J., Jonah E. Rockoff and Douglas O. Staiger (2006) “What Does Certification Tell Us About Teacher Effectiveness? Evidence from New York City,” unpublished manuscript, Columbia Business School, March.

[8] Koedel, Cory (2008) “Teacher quality and dropout outcomes in a large, urban school district,” Journal of Urban Economics, 64 (3), pp. 560-572, November.

[9] Krueger, Alan (2003) “Economic considerations and class size,” The Economic Journal, 113, pp. F34-F63, February.

[10] Leigh, Andrew (2007) “Estimating Teacher Effectiveness from Two-Year Changes in Students’ Test Scores,” unpublished manuscript, Australian National University, May.

[11] Leithwood, Kenneth & Doris Jantzi (1999) “The Relative Effects of Principal and Teacher Sources of Leadership on Student Engagement with School,” Educational Administration Quarterly, 35(Supplemental), pp. 679-706, December.

[12] Manski, Charles F. (1993) “Identification of Endogenous Social Effects: The Reflection Problem,” The Review of Economic Studies, 60 (3), pp. 531-542, July.

[13] McFadden, Daniel L. (1974) “Conditional Logit Analysis of Qualitative Choice Behavior,” in Frontiers in Econometrics, edited by P. Zarembka, pp. 105-142, Academic Press, New York.

[14] Nye, Barbara, Spyros Konstantopoulos and Larry V. Hedges (2004) “How Large are Teacher Effects?” Educational Evaluation and Policy Analysis, 26 (3), pp. 237-257, Fall.

[15] Rivkin, Steven G., Eric A. Hanushek & John F. Kain (2005) “Teachers, Schools and Academic Achievement,” Econometrica, 73(2), pp. 417-458, March.

[16] Rockoff, Jonah E. (2004) “The Impact of Individual Teachers on Student Achievement: Evidence from Panel Data,” American Economic Review: Papers and Proceedings, 94(2), pp. 247-252, May.

[17] Todd, Petra E. & Kenneth I. Wolpin (2003) “On the Specification and Estimation of the Production Function for Cognitive Achievement,” The Economic Journal, 113, pp. F3-F33, February.

[18] Uribe, Claudia, Richard J. Murnane and John B. Willett (2003) “Why Do Students Learn More in Some Classrooms Than in Others? Evidence from Bogota,” Working paper, Harvard Graduate School of Education, November.
