Disrupting ‘Secondary’ Class Effects on Educational
Outcomes*
Aleksei Opacic
Harvard University
June 11, 2021
Abstract
The education system plays a crucial role as both a mediator and moderator of intergenerational
reproduction. While a large portion of the association between parental income and child outcomes op-
erates through educational attainment, the school and college system is a primary locus of intervention
for policy-makers wishing to increase rates of mobility. In this paper, I argue that a theoretical distinc-
tion from sociologists of education - namely, between primary and secondary class effects on educational
outcomes - is useful for constructing a set of realistic and informative policy interventions to promote
equality of life chances. Specifically, I make two contributions. First, I demonstrate how primary and
secondary effects can be understood within a targeted intervention framework, and distinguish between
two types of secondary intervention to neutralize ‘secondary’ class effects on educational outcomes. I
clarify how these interventions can be understood as a hypothetical field experiment. Second, I demon-
strate how, under certain identification and credibility assumptions, the corresponding interventional
effects can be identified with observational data, and propose a set of imputation and weighting estima-
tors that can be combined with machine-learning methods to estimate them. I demonstrate the utility of
distinguishing between these two types of intervention in understanding the sources of and possibilities
to disrupt intergenerational income persistence using the NLSY97 cohort.
1 Introduction
One of the strongest predictors of adult socio-economic attainment is family income during childhood.
On average, a 10 percentile point increase in parent income rank is associated with a 3.41 percentile
point increase in a child’s income rank in adulthood (Chetty et al., 2014, 2020). This association between
parental income and adult attainment - variously called intergenerational persistence or its complement,
*Direct all correspondence to Aleksei Opacic, Department of Sociology, Harvard University, 1737 Cambridge Street, Cambridge MA 02138; email: [email protected]. Many thanks to Xiang Zhou for incredibly helpful comments and advice.
intergenerational mobility - is important because it gives us some indication of the degree of opportu-
nity in society (Hout, 1988; Breen, 2004). When intergenerational persistence is high, individuals from
low income households tend to stay poor, and those from wealthier households tend to stay wealthy. By
contrast, when intergenerational persistence is low - that is, when mobility is high - individuals’ socio-
economic outcomes are to a greater degree independent of the socio-economic advantages or disadvan-
tages that characterized their upbringing.
For social scientists and policy-makers concerned with increasing rates of intergenerational mobility,
a natural question to ask is what sort of interventions might be most effective towards this goal. Provid-
ing an answer is a challenging theoretical and empirical task. Theoretically, it requires an understanding
of the causal factors and processes that shape intergenerational mobility: the complex interactions be-
tween the economic and non-economic resources of an individual and their family, on the one hand, and
broader social institutions such as schooling systems, colleges and the labour market, on the other. Such
an understanding might, for example, lead us to the education system as a crucial site of intergenerational
reproduction, given that a large portion of the association between parental and child income is mediated
through educational attainment (Bloome et al., 2018; Breen and Müller, 2020). At the same time, hypo-
thetical interventions that are truly informative must be tempered with an eye to the practical. It seems
more reasonable to consider an intervention to school resources than to alter class- or race-specific test
score distributions per se (Jackson and VanderWeele, 2018).
Counterfactual interventions also pose an empirical challenge to the researcher. We might conceptu-
alize the goal of suggesting effective interventions as evaluating a hypothetical field experiment whose
results we wish to know without actually undertaking the experiment. Of course, such an aim shifts
our research objective into the counterfactual realm. But while the tools of causal inference have enabled
us, with the due assumptions, to make inferences about counterfactuals in the present, there are some
important differences where we seek to make predictions about causal relationships under some social
setup that does not as yet exist - a scenario in which all potential outcomes are unobserved (see Jackson
and Arah, 2020).
The aim of this paper is to draw on insights from sociologists of education on two dimensions of
class-based educational inequalities to define two theoretically-motivated education-based interventions
designed to promote intergenerational mobility, and to produce a set of credible estimates for levels of
mobility we would observe under these interventions. In the following, I first argue that the distinction
between class inequalities in educational outcomes produced from primary effects (class effects
on academic performance) and those produced from secondary effects (class effects on individuals’ prob-
ability of making an educational transition, net of performance) (Boudon, 1974; Breen and Goldthorpe,
1997) is especially informative from a policy standpoint. While it is difficult to imagine how we might
reduce performance effects given the years of cumulative and multidimensional (dis)advantage among
children from different socio-economic groups that impact on attainment, a more realistic form of inter-
vention might target resource and informational constraints at educational transition junctures. Next,
I argue that, despite the utility of the primary-secondary effects distinction from a policy perspective,
the original theory has a number of weaknesses from a theoretical and a causal identification perspec-
tive. I therefore offer a corrective in the context of interventional effects by distinguishing between two
types of ‘secondary interventions’: what I label ‘weak’ and ‘strong’ secondary interventions, which can
be conceptualized from a field-experimental standpoint. Third, I show how, under some relatively weak
assumptions and by carefully delimiting the scope of the intervention, we can express these interventions
in terms of observational data. I further propose a set of imputation and weighting estimators that can be
used to estimate secondary interventions. Finally, I show empirically the utility of considering alternative
credible interventional effects by examining intergenerational income persistence of racial majority and
minority groups of the NLSY97 cohort.
2 Primary and secondary effects in an interventionist context
2.1 Secondary class effects are conducive to policy interventions
The education system plays a vital role in social reproduction. Children from high income families are
likely to attain higher levels of education than their socio-economically disadvantaged peers (Ziol-Guest
and Lee, 2016; Duncan et al., 2017). In turn, higher levels of education lead to higher returns in the
labour market (Autor et al., 2008; Baum et al., 2010). On the one hand, because of these twin processes,
a major portion of the association between parental and child income is mediated through educational
attainment (Blau and Duncan, 1967; Bloome et al., 2018). On the other hand, these facts point towards
the education system, and in particular, class-based inequalities in educational outcomes, as a key site of
policy intervention if we wish to break the link between social origin and social destination. Nevertheless,
to clearly articulate an effective and practical set of educational interventions designed to disrupt these
inequalities requires a theoretical understanding of the mechanisms producing this inequality.
Sociologists interested in understanding inequality in educational opportunity have typically drawn
a distinction between two distinct mechanisms underpinning such inequality, first articulated by Boudon
(1974). First, primary effects refer to the effects of social background on academic performance - the fact
that children from higher class backgrounds, on account of their superior economic, social, and cultural
resources, tend to outperform their disadvantaged peers in standardized tests and public examinations.1
By contrast, secondary effects are the effects of class background on the decision an individual, in con-
junction with their parents, teachers and peers, makes about whether to transition to a higher level of
education, conditional on prior performance (for instance, whether or not to continue to university
education conditional on high school GPA) (e.g. Breen and Goldthorpe, 1997; Jackson, 2013). Class inequalities
in educational outcomes can then be understood as being produced from these two sources. Indeed, a
salient finding from the empirical literature on primary and secondary effects is that, across a range of
countries and time periods, working-class children are less likely than their advantaged peers to
transition to a higher level of education, even when they have the same attainment level as these
advantaged peers (Jackson et al., 2007; Jackson, 2013; Morgan, 2012). This approach is formalized in the directed
acyclic graphs (DAGs) presented in Figure 1. Consider the top DAG (A), which illustrates the assumed
data-generation process in many applications of primary and secondary effects. Let A denote family
income, Z be a measure of high school GPA, and M be an indicator denoting whether an individual
transitions to college. Family income, or social class background, can then affect individuals’ choice
decisions at an educational juncture both indirectly through performance, A → Z → M, as well as
directly, net of performance, A → M.
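To make the two pathways in DAG (A) concrete, the following is a minimal simulation sketch. It is not part of the original analysis: the binary coding of A, the functional forms and every coefficient are hypothetical values chosen purely for illustration. It shows that a gap in college entry across income groups remains within a narrow GPA band (the direct path, net of performance), and that this gap is smaller than the overall gap, which also includes the path through performance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# A: family income, coded 1 = high, 0 = low (hypothetical binary coding)
A = rng.binomial(1, 0.5, size=n)

# Path A -> Z: income shifts high school GPA (made-up coefficients)
Z = 2.8 + 0.4 * A + rng.normal(0.0, 0.5, size=n)

# Paths Z -> M and A -> M: college entry depends on GPA and, net of GPA, on income
logit = -6.0 + 2.0 * Z + 0.8 * A
M = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Overall gap in college entry (both pathways combined)
overall_gap = M[A == 1].mean() - M[A == 0].mean()

# Gap within a narrow GPA band (approximately holding performance fixed)
band = (Z > 2.9) & (Z < 3.1)
within_band_gap = M[band & (A == 1)].mean() - M[band & (A == 0)].mean()

print(f"overall gap in college entry:   {overall_gap:.3f}")
print(f"gap within GPA band (2.9, 3.1): {within_band_gap:.3f}")
```

In this stylized setting the within-band gap isolates A → M only because no other variables are present; the more elaborate DAG discussed next is precisely what complicates this reading.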
One important payoff to making a distinction between these two types of effects is that it draws at-
tention to the distinct processes underlying different aspects of educational inequality, and thus different
policy solutions. In particular, primary effects on performance are understood as the consequence of
a complex interaction between the cultural, economic and social resources of children and their families
and the educational system: the superior economic resources at the disposal of better-off families, and the
higher cultural fluency of better educated parents and their familiarity with the schooling system, for ex-
ample. By contrast, class differences in educational decisions at transitions can be seen as stemming from
an evaluation of costs and benefits of the various educational options available - decisions conditioned
by class-specific resources and informational constraints – and on the ‘perceived probabilities of more or
less successful outcomes’ of those in different classes (Breen and Goldthorpe, 1997, p.276). For example,
education costs provide one key source of discrepancy in attainment as they decrease the proportion
of working-class families whose resources will meet the costs of further education (in addition to indi-
rect earnings-forgone through pursuing an educational pathway) (Goldthorpe, 2007). While social policy
does target the mechanisms producing primary effects in the form of educational resources, the extent of
cumulative (dis)advantage that underpins primary effects is surely less amenable to policy intervention
1In this paper, I refer to social background broadly as material circumstances of upbringing, and use this term interchangeably with class origin and family income. Although I operationalize social origin in terms of family income, the approach I outline is of course generalizable to other domains of childhood material advantage.
than the cost benefit calculus underpinning secondary effects. One could envisage such policy interven-
tions in the form of informational campaigns, financial incentives and support and types of class-based
affirmative action. In other words, we may be able to credibly infer what gaps in adult income
between individuals from different parental income groups would look like if individuals from less
advantaged backgrounds were to ‘exploit’ their demonstrated academic ability at educational
transitions as advantaged children do - that is, if we were to ‘neutralize’ secondary class effects on educational outcomes.
Despite the important insight that the distinction between primary and secondary effects can bring
to interventional effects, there are several limitations to the traditional definition of these effects that
need to be addressed. Consider next the more elaborated causal DAG (B) below, which features two
additional vectors: an unobserved vector ~U denoting cultural resources as well as genetic
inheritance, which affects all other sets of variables in the DAG, and a vector of ‘intermediate’ variables
~X on the causal path from family income to performance (such as educational expectations, school type
and neighborhood quality). Note that our outcome is now adult income rather than educational transition (in
keeping with this paper’s motivating example of disrupting intergenerational mobility associations).2
If we are to understand the causal structure as more complex than the top DAG presumes, then a
causal interpretation of primary and secondary effects quickly becomes far less straightforward. First,
in this more elaborated DAG, disparities in high school GPA, college attendance and adult income across
family income groups arise in several ways. First, unobserved variables ~U such as parental ability affect
both parental income A as well as intermediate variables ~X such as child ability. Crucially, this is a non-
mediating path - it does not capture the effect of family income on ~X. In addition, there are also forward
paths emanating from family income: for instance, family income might affect the type of neighborhood
a child grows up in or the quality of school they attend.3 If we do not observe all of the components in
~U, then we will be unable to identify the causal effect of family income or class. Second, in addition to
the two causal pathways elaborated in (A) we have an additional causal path A → ~X → M. As Morgan
(2012) writes, this causal path cannot be considered a mechanistic elaboration of Boudon’s conception
of secondary effects understood as simply class-specific cost benefit analyses. Instead, they are a ‘sepa-
rate component of the net association between class and college entry that is best attributed to a broad
structural interpretation’ (p. 33). Thus, if we modeled the data using the naive DAG in Panel A,
this path would be absorbed into the secondary effect, even though these variables are not explicit components of the
choice process that is thought to generate the causal secondary effects suggested by Boudon.
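The path structure of DAG (B) can be checked mechanically. The sketch below is an illustration rather than part of the original paper: the edge list is an assumption reconstructed from the description above (~U pointing into every other vertex; A, ~X and Z each pointing into later variables), the vectors are written as plain `U` and `X`, and the outcome Y is left out because the paths of interest end at M. Under these assumed edges, the enumeration recovers the backdoor paths from A to M listed in footnote 3.

```python
# Directed edges assumed for DAG (B); Y is omitted because the paths end at M.
EDGES = {
    ("U", "A"), ("U", "X"), ("U", "Z"), ("U", "M"),
    ("A", "X"), ("A", "Z"), ("A", "M"),
    ("X", "Z"), ("X", "M"),
    ("Z", "M"),
}


def neighbors(node):
    """Yield (adjacent_node, arrow) pairs, traversing edges in either direction."""
    for src, dst in sorted(EDGES):
        if src == node:
            yield dst, "→"
        elif dst == node:
            yield src, "←"


def simple_paths(start, end):
    """Enumerate all simple paths from start to end, recording arrow directions."""
    found = []

    def walk(node, visited, rendered):
        if node == end:
            found.append(rendered)
            return
        for nxt, arrow in neighbors(node):
            if nxt not in visited:
                walk(nxt, visited | {nxt}, f"{rendered} {arrow} {nxt}")

    walk(start, {start}, start)
    return found


all_paths = simple_paths("A", "M")

# A backdoor path is one whose first edge points into A.
backdoor = [p for p in all_paths if p.startswith("A ←")]

for path in backdoor:
    print(path)
```

Note that the last path the code prints passes through Z as a collider, so whether it transmits association depends on whether Z is conditioned on; the enumeration itself is purely structural.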
2Note in addition that, while ~X is multivariate, we are agnostic about the causal relationships among its constitutive variables. We assume that A and ~X precede Z which precedes M which precedes Y, though we are also agnostic about the temporal ordering of A and ~X.
3In the formal language of DAGs, we now have five backdoor paths from A to M: A ← ~U → M, A ← ~U → ~X → M, A ← ~U → Z → M, A ← ~U → ~X → Z → M, and A ← ~U → Z ← ~X → M.
Figure 1: Two DAGs showing alternative models of class inequalities in educational attainment. The first corresponds to the simple model of primary and secondary effects as described in the literature (e.g. Jackson et al., 2007; Jackson, 2013). The second corresponds to the more elaborated (and realistic) setting where we have two additional vectors: pre-family income unobserved confounders ~U, as well as a set of intermediate variables ~X, which encompasses a range of aspects of social disadvantage during upbringing. Note that ~U has forward paths to every vertex in the DAG.
2.2 We can define two secondary interventions through a field-experimental ideal
How then might we use primary and secondary mechanisms to inform targeted interventions? Under a
targeted intervention, we might want to ask what gaps in adult income, Y, would look like if we
‘neutralized’ secondary effects in the sense of imposing some distribution of college attendance among children from different
backgrounds. I define two types of intervention, which I illustrate graphically in Figure 2:
1. First, ‘weak’ secondary interventions can be understood as those that neutralize class-differentiated
choice decisions at a juncture that operate net of the effect of class origin on intermediate variables ~X.
In other words, such an intervention would not disrupt the path from family income to college
attendance through ~X; it would disrupt only the direct path, net of ~X.
2. By contrast, ‘strong’ secondary interventions are those that would neutralize the path from
parental income to college attendance net of high school performance that operates both through
intermediate variables ~X as well as through other pathways. The distinction compared with weak
secondary interventions therefore entails blocking an additional pathway from family income to
college attendance. We can further subdivide this strong form of intervention into two forms. The
first refers to cases where the strong intervention is applied to both high and low income groups,
while the second captures instances where the strong intervention is applied only to low income
groups. The latter quantity corresponds to an intervention targeted solely to alter the proportion of
lower income children attending college while not altering admissions policies or behavior for high
income children, and as such represents a form of class-based affirmative action (AA).4 I refer to the
former uniformly applied intervention as the ‘uniform-strong’ intervention, and the latter as the
‘AA-strong’ intervention.
The advantages of this approach are twofold. First, when we consider A as simply a descriptive marker
that informs a population disparity, then we do not need to identify the ‘effect’ of income at all.5 It there-
fore conveniently sidesteps identification issues of A on M present in the original primary-secondary
effects framework; many of the backdoor paths considered by Morgan are simply not relevant to inter-
ventional effects. To recall, disparities in high school GPA, college attendance and adult income across
family income groups arise in several ways. First, unobserved variables ~U such as parental ability
affect both parental income A as well as intermediate variables ~X such as child ability. Crucially, this
4This latter quantity is closely aligned with Proposition 4 in Jackson and VanderWeele (2018), where the interventional distribution f(m|a) is applied only to individuals from the disadvantaged social group.
5I adopt notation from the mediation literature concerned with effect identification of A to ease comparison, but A is simply indicative of a collection of individuals, and is not of direct causal interest.
is a non-mediating path - it does not capture the effect of family income on ~X. In addition, there are
also forward paths emanating from family income: for instance, family income might affect the type of
neighborhood a child grows up in or the quality of school they attend. As a result, the interventions I
consider deactivate both the forward path(s) from A to M (net of high school GPA, and, in the case of
the strong intervention, additionally, net of intermediate variables), and the backdoor paths from A to M.
Specifically, the strong intervention which equalizes college attendance within high-school GPA groups
deactivates the forward paths A → M and A → ~X → M as well as the backdoor paths A ← ~U → M
and A ← ~U → ~X → M, while the weak intervention deactivates only one forward path A → M and one
backdoor path A ← ~U → M.
Second, we have approached the issue of defining an intervention to secondary class effects on ed-
ucational transitions by creating two distinct interventions. Clearly, defining a secondary intervention
solely as an effort to neutralize the ‘direct effect’ of family income net of performance begs the question
of how to treat variables on the (backdoor) pathway from A to M, college transition. As we can see in
the DAG, there are two pathways from family income to college attendance (which may be backdoor
paths through the unobserved vector ~U): the pathway that operates through intermediate variables such as
neighborhood type, and the pathway that operates directly, net of these intermediate variables. This
observation then naturally leads us to consider two different types of secondary intervention on what the
traditional literature labels the ‘direct’ pathway from class background to college, net of performance.
To define strong and weak secondary effects formally, I follow Lundberg (2020) in defining my esti-
mand in terms of a hypothetical field experiment. Consider the set-up where a and a∗ are two levels of
family income we wish to compare, i.e. A ∈ {a, a∗} with a∗ representing low parental income groups. For
instance, if we dichotomized family income we may have a = 1 and a∗ = 0. Imagine drawing a sample
S from a population P of interest, and then intervening to assign each individual in the sample a level
of education M = m; the value $Y_i(m)$ then denotes the potential value of adult earnings that individual
i would take under that level of education. For instance, $\frac{1}{n_a}\sum_{i:A_i=a}^{n} Y_i(m)$ is the average outcome
of units in the sample S when each individual with A = a has been exposed to m level of education.
Additionally, let P(M|u) denote the cumulative distribution function (CDF) of M (college attendance)
among those with a particular set of characteristics u. In addition, let $\mathcal{M}|u$ denote a random draw from
this distribution.
Turning first to the weak secondary intervention, let P(M|X, Z, a) denote the cumulative distribution
function (CDF) of M (college attendance) among those with high school GPA Z and intermediate
variables X within the group with family income level A = a, and let $\mathcal{M}|X,Z,a$ denote a random draw from
this distribution. Note that X and Z are random variables, whereas A is fixed to A = a, indicating that
Figure 2: A ‘weak’ secondary intervention removes the association between family income A and its confounders ~U on college attendance, conditional on high school GPA Z and on intermediate variables ~X such as school type, neighborhood of origin, and peer expectations. A ‘strong’ secondary intervention removes the association between family income and its confounders on college attendance, conditional on high school GPA, but unconditional on intermediate variables. Treating family income A purely as a demographic marker, the DAG accommodates unobserved cultural and genetic confounders ~U of the effect of family income on each set of variables in the model. Note that ~U has forward paths to every vertex in the DAG. Moreover, for expositional simplicity I illustrate the case where intermediate variables ~X are ‘post’-family income, though the flexibility of treating family income as a demographic marker means that my framework is agnostic about whether ~X occur before or after family income. The only chronological requirement is that A and ~X occur before Z, which in turn occurs before M, which occurs before Y.
the intervention randomly assigns college attendance M among individuals from all income backgrounds
such that it follows the same marginal distribution as is observed in the group A = a, within Z and X
groups as currently observed. Therefore, the quantity
$\theta_{a,a_1} \triangleq E[Y(\mathcal{M}|X,Z,a) \mid A = a_1]$
reflects the expected mean over repeated samples S where we impose the CDF P(M|X, Z, a) among
all individuals with family income level A = a1.6
While this formula is general in that it denotes the interventional quantity $E[Y(\mathcal{M}|X,Z,a) \mid A = a_1]$ for
any combination of $a, a_1 \in \{a^*, a\}$, for weak secondary interventions we are interested only in the case
where $a \geq a_1$, i.e. $\theta_{a,a^*}$ and $\theta_{a,a}$. For instance, $\theta_{a,a^*} = E[Y(\mathcal{M}|X,Z,a) \mid A = a^*]$ captures the expected outcome
of individuals from low income backgrounds if we intervened to send these individuals to college at the
same rate as individuals from high income backgrounds with the same high school GPA (Z) score and
value of intermediate variables X.7
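For binary M, the weak interventional quantity can be unpacked by averaging over the random draw. The expression below is a sketch added for illustration and does not appear in the original text; $\pi_a$ is shorthand introduced here for the college attendance rate in the high income group within levels of X and Z. It relies only on the fact that, in the hypothetical experiment, the draw $\mathcal{M}|X,Z,a$ is randomized and therefore independent of the potential outcomes given X and Z.

```latex
\begin{aligned}
\theta_{a,a^*}
  &= E\big[\,Y(\mathcal{M}|X,Z,a) \mid A = a^*\,\big] \\
  &= E\big[\,\pi_a(X,Z)\,Y(1) + \{1-\pi_a(X,Z)\}\,Y(0) \mid A = a^*\,\big],
  \qquad \pi_a(x,z) \equiv P(M = 1 \mid X = x, Z = z, A = a).
\end{aligned}
```

Read this way, the intervention leaves each low income individual's potential outcomes Y(1) and Y(0) untouched and changes only the weights placed on them.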
Turning next to the strong secondary intervention, $\mathcal{M}|Z,a$ denotes a random draw from P(M|Z, a) -
the distribution of M (college attendance) among those with high school GPA Z within the group with family
income level a, such that
$\psi_{a,a_1} \triangleq E[Y(\mathcal{M}|Z,a) \mid A = a_1]$
reflects the expected mean over repeated samples S where we impose the distribution P(M|Z, a)
among all individuals with family income level $A = a_1$, and
$\psi_{a,a^*} \triangleq E[Y(\mathcal{M}|Z,a) \mid A = a^*]$
captures the expected mean over repeated samples of individuals from low income backgrounds if
we intervened to send these individuals to college at the same rate as individuals from high income
backgrounds with the same high school GPA (Z) score (i.e. if we imposed the distribution of college
attendance among high income children with a particular GPA on low income children, conditional on
GPA).8 Note in addition that the contrasts
6We could equally define this quantity in terms of a population-level expectation, rather than as a sample average, i.e. $\theta_{a,a^*} \triangleq E[Y(\mathcal{M}|x,z,a) \mid A = a^*]$. Identification proofs for both are identical, but the version I present in the main text is useful for interpretation purposes, as I will clarify in the next section.
7Intuitively, setting M to a particular distribution is equivalent to assigning each individual a random draw from that distribution. Note in addition that, compared with other causal estimands, it is only meaningful in general to talk about interventional effects in the average, since each individual’s $Y(\mathcal{M}|x,z,a)$ depends on the random quantity $\mathcal{M}|x,z,a$.
8Note that we could also write the weak and strong interventions, respectively, as $\theta_{a,a_1} \triangleq E_S[y_{S,a_1}(\mathcal{M}|x,z,a)]$ and $\psi_{a,a_1} \triangleq E_S[y_{S,a_1}(\mathcal{M}|z,a)]$, preserving the notation used in Lundberg (2020), where the expectation is defined over hypothetical repeated
(a) $E[Y(\mathcal{M}|Z,a) \mid A = a] - E[Y(\mathcal{M}|Z,a) \mid A = a^*]$
(b) $E[Y \mid A = a] - E[Y(\mathcal{M}|Z,a) \mid A = a^*]$
represent, respectively, (a) the expected disparity over repeated samples in income between high and
low parental income groups after the strong intervention is applied to both high and low income groups
(the ‘uniform-strong’ intervention), and (b) the expected disparity over repeated samples in income be-
tween high and low parental income groups after the strong intervention is applied only to low income
groups (the ‘AA-strong’ intervention). The latter quantity corresponds to an intervention targeted solely
to alter the proportion of lower income children attending college while not altering admissions policies
or behavior for high income children, and as such represents a form of class-based affirmative action
(AA).9
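The two contrasts can be illustrated with a small simulation in which both potential outcomes are known by construction. This is a sketch under strong simplifying assumptions, not the estimation strategy of the paper: A and Z are binary, there are no intermediate variables or unobserved confounders, college raises income rank by a constant amount, and every coefficient is a hypothetical value. The strong intervention is mimicked by redrawing college attendance from the distribution observed among high income children within GPA groups.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# A: 1 = high family income (a), 0 = low family income (a*); hypothetical coding
A = rng.binomial(1, 0.5, size=n)
# Z: binary indicator of high GPA, more likely for high income children
Z = rng.binomial(1, 0.35 + 0.30 * A)
# Observed college entry depends on GPA and, net of GPA, on income
M = rng.binomial(1, 0.15 + 0.45 * Z + 0.20 * A)

# Potential adult income ranks without and with college (made-up values)
Y0 = 35.0 + 5.0 * Z + 8.0 * A + rng.normal(0.0, 10.0, size=n)
Y1 = Y0 + 15.0
Y = np.where(M == 1, Y1, Y0)

# P(M = 1 | Z = z, A = a), taken from the high income group
rate_high = np.array([M[(A == 1) & (Z == z)].mean() for z in (0, 1)])

# Strong intervention: draw M from P(M | Z, a) for every individual
M_draw = rng.binomial(1, rate_high[Z])
Y_draw = np.where(M_draw == 1, Y1, Y0)

psi_a_a = Y_draw[A == 1].mean()      # E[Y(M|Z,a) | A = a]
psi_a_astar = Y_draw[A == 0].mean()  # E[Y(M|Z,a) | A = a*]

observed_gap = Y[A == 1].mean() - Y[A == 0].mean()
uniform_strong_gap = psi_a_a - psi_a_astar      # contrast (a)
aa_strong_gap = Y[A == 1].mean() - psi_a_astar  # contrast (b)

print(f"observed gap:       {observed_gap:.2f}")
print(f"uniform-strong gap: {uniform_strong_gap:.2f}")
print(f"AA-strong gap:      {aa_strong_gap:.2f}")
```

In this simulation contrasts (a) and (b) nearly coincide, because college attendance among high income children is unrelated to their potential outcomes within GPA groups; where high income children select into college on the basis of expected gains, the two contrasts would generally differ.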
Importantly, these two types of secondary intervention map onto different types of policy as might be
delivered in practice. Figure 3 summarizes this mapping. First, weak secondary interventions can be con-
sidered the weaker form of intervention as they only block pathways from family income to attendance
that operate net of structural aspects of socio-economic upbringing - i.e. pathways which are thought
to capture the cost-benefit calculus aspect of educational transitions. Thus, an intervention of this sort
would pertain to policy aimed directly to alter the cost-benefit calculus of individuals from different class
backgrounds - for instance, to informational resources targeted at low income children, or needs-based
grants or financial incentives to apply or enroll in college. By contrast, strong secondary interventions are
the more radical since they are interventions that would not be sensitive to - or whose efficacy would not
be shaped by - other aspects of disadvantaged students’ upbringing environment. In other words, since
they block the composite path from family income to attendance through both the cost-benefit calculus
and structural aspects of upbringing (but not the path through GPA performance), they capture a world in
which college admission depends only on high school performance and no other class-contingent factors.
An intervention of this sort might refer to targeted university admissions or quotas for a representative
admission of individuals from different class origins, within GPA groups. Two points are of note here.
First, both interventions operate at the level of college enrolment rather than of attainment, which makes
them distinct from the concept of ‘controlled mobility’ introduced in Zhou (2019) (see Section 2.4 for
samples. I opt for population level quantities in the main text for notational simplicity, but I seek to make inferences only about a limited subsample, as I specify in the next section.
9This latter quantity is closely aligned with Proposition 4 in Jackson and VanderWeele (2018), where the interventional distribution f(m|a) is applied only to individuals from the disadvantaged social group.
further discussion).10 Second, neither of these interventions alters the association between family
background and intermediate variables or high school GPA. The intervention is solely with respect to college
attendance conditional on high school GPA (strong intervention), or conditional on both high school GPA
and intermediate variables (weak intervention).
Of course, policy interventions to equalize transition rates to higher education for those within the
same GPA bracket are likely to be unsatisfactory for reducing class educational inequalities in general
insofar as low income students are constrained by their lower average GPA scores, as well as to the extent
that class inequalities in adult income persist among college graduates. I explore these issues more
thoroughly in Section 6.
10Moreover, because Zhou’s (2019) intervention concerns BA completion, rather than college enrolment, it hinges on both ensuring access and ensuring completion among students. It could therefore be considered a composite secondary-primary intervention.
Figure 3: Educational interventions to reduce class-based inequalities in adult earnings operate at different levels. Most broadly, they can be designed to alter either individuals’ educational performance or individuals’ probabilities of transitioning to the subsequent stage of education (e.g. college), net of performance. These correspond, respectively, to primary- and secondary-based interventions, only the latter of which I consider in this paper. Secondary interventions can then be further divided into two types: (a) those that only block the direct pathway from parent income to college attendance net of both performance and intermediate variables such as school, neighborhood and other family characteristics (a weak secondary intervention), and (b) those that block the composite path from parent income to college attendance comprising both the direct effect in (a) and the path through intermediate variables. The important takeaway from this distinction is that weak and strong secondary interventions capture two different forms of policy intervention: whether we intervene to alter individuals’ cost-benefit calculus, for instance through an informational or grant-based approach (weak secondary), or through intervening to specify a particular admissions policy (strong secondary). Whether this admissions policy applies to everyone or just to disadvantaged children underpins the final distinction between ‘uniform-strong’ and ‘AA-strong’ interventions.
2.3 We can approximate the ideal field experiment using observational data
Conceptualizing these interventions as hypothetical experiments is useful because it helps clarify our
research goal, and is unbounded in scope. We may want to ask what mobility would look like under
an intervention to the whole population of high-school goers. Of course, in practice, since we cannot
change admissions policies twice over across colleges, approximating these counterfactual interventions
using observational data is the most feasible way to estimate these estimands. As I demonstrate below,
a key advantage of the interventions that I propose is that they are identified under relatively weak
assumptions - namely, no unobserved confounding of the effect of college attendance M on adult earnings
Y, conditional on all antecedent variables.11 This assumption would be met in the DAG in Figure 2, if ~U
does not affect M and Y.
Despite the advantages of the estimation approach I propose, there are two validity-related difficulties
when we shift from the theoretical field experimental estimand at the population level, to estimating this
estimand with observational data.
The first regards causal identification. While a clear advantage of the approach I propose compared
with the traditional primary-secondary effects literature is the ability to sidestep identification of the effect of family
income on educational transitions/adult income (see Section 2.2), this framework still requires identification
of the effect of college attendance M on adult earnings Y (see formal identification proofs in
Appendix A). In practice, this requires no unobserved confounding of the effect of college attendance
M on adult earnings Y, conditional on all antecedent variables.12 This assumption would be met in the
DAG in Figure 2, but would be violated if there existed unobserved confounders of the M− Y relation-
ship. If we fail to meet this assumption then we would fail to approximate the ideal field experiment
using observational data.
The second regards external validity. In practice, a key limitation of estimating the counterfactual
disparities I suggest with observational data concerns the size of the population about which we are
able to credibly make a counterfactual claim. This trade-off between the credibility and the breadth of a claim
is succinctly laid out in Lundberg (2020, p. 8-12) in the context of ‘gap-closing estimands’,
which closely parallel the interventions I propose here. This is a particular issue if the effect of college
attendance on adult earnings is partly a function of the proportion of individuals in the population who
receive a college degree. In this case, we can only use observational data to estimate the effect of
college attendance at the observed proportion of individuals who obtain a degree. Yet the interventions I
consider in this piece will alter the proportion of individuals attending college, since they serve to make
individuals, particularly those from lower class backgrounds, less constrained in their college options. Claims
about a counterfactual world in which more people attend college are, however, at best speculative.13 We
might expect, for example, labour market returns to college to be very different in this world than in
reality, where fewer than 40% of young adults enroll in postsecondary education.14
11Again, since A is used in purely a descriptive sense as a demographic marker, we do not require the assumptions of no unobserved treatment-mediator or treatment-outcome confounding. Note in addition that we are agnostic about the causal ordering of A and M1. A can be seen as either causally prior to or causally post any of the component variables of M1 (e.g. family income determines the type of school a child attends, but parental education level or neighborhood type affects income (e.g. Wodtke et al., 2011)). Either of these causal relationships may be true, but weak secondary interventions are identified in all cases.
12Again, since A is used in purely a descriptive sense as a demographic marker (of family income), we do not require any additional identification assumptions in the model. Note in addition that we are agnostic about the causal ordering of A and X: A can be seen as either causally prior to or causally post any of the component variables of X (e.g. family income determines the type of school a child attends, but parental education level or neighborhood type affects income, e.g. Wodtke et al., 2011). Either of these causal relationships may be true in reality, but weak and strong secondary interventions are identified in all cases.
There are at least two options for addressing this issue. The first simply concerns interpretation. Infer-
ence about the counterfactual world where we undertake such a set of interventions to college attendance
will be strongest when we interpret it as a local claim about hypothetical mobility rates in a small fraction
of the population, rather than about the whole population. As Lundberg (2020) notes, such a local claim is often
relevant from the perspective of policy-makers, who cannot intervene on the whole population at once
(p.12). Moreover, secondary interventions can readily be conceptualized at the school-level: one could
imagine admissions policies or outreach programs changing at a set of schools, in which case the counterfactual
mobility rates I estimate would apply to the subpopulations attending these particular
schools.
The second is to insist on making a global, population-level claim (i.e. about counterfactual mobility
rates in the whole population) by explicitly changing the estimand into one that preserves the marginal
distribution of college attendance at its observed value. An estimand that changes not the prevalence
of college-goers but rather that simply changes who goes to college - i.e. that simply shuffles college ad-
missions among individuals from different socio-economic backgrounds - can be endowed with a global
rather than local interpretation, and is more credibly estimated using observed data. We can easily ex-
tend the estimation framework I develop here to numerically derive a set of weighting constraints that
preserve the marginal distribution of college attendance. In future analyses, I plan to compare both ap-
proaches - thus exploring counterfactual mobility rates under (a) an intervention to a small sample of the
population (equal to the size of my survey sample) and (b) an intervention to the entire population where
the number of college attendees does not change.
13Jackson and Arah (2020) refer to these extrapolation issues as violations of the assumption that the intervention is system preserving: intervening on the treatment shifts its values but does not change the statistical relationships that define the system. Formally, this general equilibrium threat can be understood as a violation of the Stable Unit Treatment Value Assumption (SUTVA) - the assumption that $y_i(\vec{m}) = y_i(m)$, i.e. that the potential outcome of unit i under an intervention to send that individual to college is independent of the college attendance statuses of other individuals in the population.
14From the NLSY97 cohort, I calculate that approximately 36% of individuals enrol in a 2- or 4-year college by the age of 25.
In Appendix A, I offer formal identification proofs for the strong and weak interventions. Under
the assumption of no unobserved confounding of the effect of college attendance on earnings, the weak
secondary intervention is then non-parametrically identified as follows:
$$\theta_{aa^*} = \iint E[Y \mid a^*, x, z, m]\, dP(m \mid x, z, a)\, dP(x, z \mid a^*) \tag{1}$$
This identification formula for the weak secondary intervention is closely related to the generalized
mediation functional in mediation analysis in the case of no pre-treatment confounders (see Zhou, 2020).
Note additionally that we can factorize E[Y|A = a] as follows:
$$E[Y \mid A = a] = \iint E[Y \mid a, x, z, m]\, dP(m \mid x, z, a)\, dP(x, z \mid a) = E[Y(M \mid x, z, a) \mid A = a] \triangleq \theta_{aa}$$
In other words, the expectation of adult income under the weak intervention among individuals from
high parental income backgrounds is equal to their average observed outcome. For the strong secondary intervention,
in terms of observed data, we can write $\psi_{aa^*} \equiv E[Y(M \mid x, a) \mid A = a^*]$ as
$$\psi_{aa^*} = \iiint E[Y \mid a^*, x, z, m]\, dP(m \mid x, a)\, dP(z \mid x, a^*)\, dP(x \mid a^*) \tag{2}$$
Note the key distinction between this quantity and the identification result for the weak intervention
considered above is simply that the PMF for college attendance in the strong secondary intervention lacks
intermediate variables in the conditioning set, reflecting the fact that this intervention breaks the pathway
from family income to college attendance via intermediate variables, while the weak intervention does
not. Note additionally that the relationship between parental income and ~X (intermediate variables) and Z
(high school GPA) is preserved in both types of intervention.15
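To make the difference between Equations (1) and (2) concrete, the following is a minimal numerical sketch rather than part of the analysis. It assumes a purely hypothetical toy population in which A, X, Z and M are all binary, with a made-up joint distribution `joint` and made-up conditional means `mu`; the function names `theta`, `psi` and `observed_mean` are likewise illustrative. The integrals then reduce to finite sums.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy population: A, X, Z and M are all binary.
# Array axes are ordered [a, x, z, m]; a = 1 is high income, a = 0 low income.
joint = rng.dirichlet(np.ones(16)).reshape(2, 2, 2, 2)   # made-up P(a, x, z, m)
mu = rng.uniform(20.0, 80.0, size=(2, 2, 2, 2))          # made-up E[Y | a, x, z, m]

p_a = joint.sum(axis=(1, 2, 3))                           # P(a)
p_axz = joint.sum(axis=3)                                 # P(a, x, z)
p_xz_given_a = p_axz / p_a[:, None, None]                 # P(x, z | a)
p_m_given_axz = joint / p_axz[..., None]                  # P(m | a, x, z)
p_axm = joint.sum(axis=2)                                 # P(a, x, m)
p_m_given_ax = p_axm / p_axm.sum(axis=2, keepdims=True)   # P(m | a, x)


def observed_mean(a):
    """E[Y | A = a] computed directly from the toy tables."""
    return float((mu[a] * joint[a]).sum() / p_a[a])


def theta(a, a_star):
    """Equation (1): M drawn from P(m | x, z, a); outcome and (x, z) taken from a_star."""
    return float(np.einsum("xzm,xzm,xz->", mu[a_star],
                           p_m_given_axz[a], p_xz_given_a[a_star]))


def psi(a, a_star):
    """Equation (2): M drawn from P(m | x, a); note P(z | x, a*) P(x | a*) = P(x, z | a*)."""
    return float(np.einsum("xzm,xm,xz->", mu[a_star],
                           p_m_given_ax[a], p_xz_given_a[a_star]))


high, low = 1, 0
print("observed gap:               ", observed_mean(high) - observed_mean(low))
print("gap under weak intervention:", theta(high, high) - theta(high, low))
print("gap under AA-strong:        ", observed_mean(high) - psi(high, low))
```

In this sketch `theta(a, a)` coincides with `observed_mean(a)`, mirroring the factorization of E[Y|A = a] above, whereas `psi(a, a)` (the ‘uniform-strong’ quantity) generally does not.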
2.4 Comparison with existing approaches: controlled mobility, conditional equalization and the randomized analog of the mediation formula
Strong and weak secondary interventions are closely related to three alternative estimands recently considered
in the sociological and epidemiological literatures, all of which fall into a general category of ‘interventional
effects’. In general, interventional effects refer to counterfactual quantities in which the natural
distribution of a mediator under one exposure condition is replaced by that mediator’s distribution from
the other exposure condition.
15Additionally, in the case of the weak secondary intervention, the interventional estimand among high income children reduces to the average observed outcome among high income children: ψaaa = E[Y(a)]. By contrast, in the case of the strong secondary intervention, only the ‘AA-strong’ interventional estimand reduces to the average observed outcome among high income children, while the ‘uniform-strong’ intervention does not.
To demonstrate the intimate connections between these, note that we can use the notation introduced
in Section 2.2 to write any stochastic assignment rule that randomly allocates individuals to a level of
educational attainment M as a function of observed characteristics. Generally, then, we have the following
formula:
$$\tau^{\text{stoch.}}_{a,a^*}(m) = E[Y(\underbrace{M \mid u}_{*}) \mid A = a] - E[Y(\underbrace{M \mid u}_{*}) \mid A = a^*] \tag{3}$$
where E[Y(M|U)|A = a] denotes the average potential outcome in category A = a under an intervention
to equalize the distribution of M across social groups conditional on their value of U. U may be the
empty set, in which case the CDF is simply the marginal distribution of M. Alternatively, it may be a vector of
covariates, which means that we are defining an interventional distribution conditional on some set of
covariates U, which can be defined in relation to the specific intervention we consider. Note that U may
consist of both random and fixed elements: random elements denote those characteristics over which we
define an intervention, but on which we do not intervene directly; fixed elements denote the [...] Generally,
this quantity is identifiable so long as E[Y(m)] is identifiable, in which case we marginalize E[Y(m)] over
P(m|u), which denotes the cumulative distribution function (CDF) of M|u.
First, consider the case when P(m|u) = 1 and when u is the empty set. In this case, the interven-
tion is ‘deterministic’: each unit is deterministically assigned a value of the intervention, rather than
stochastically according to some assignment rule. This estimand is equivalent to that of ‘controlled mo-
bility’ proposed in Zhou (2019), when M is defined as an indicator for BA attainment, which captures
the remaining disparity in adult income by parental income groups after an intervention to set college
attainment to M = 1 for all individuals.
Consider next the alternative estimand Lundberg (2020) labels ‘conditional equalization’, which ‘assigns
a treatment to each unit according to the conditional distribution within the covariate stratum’ Z in
which that unit is observed, but independently of the category A within that stratum. Conditional equalization
can be written in terms of the above formulas, with Mm|z and dP(m|z), respectively, replacing the
underbraced components of the equations.
Conditional interventions are further distinct from another estimand considered in the interventional
effects literature: randomized interventional effects (VanderWeele et al., 2014), where the underbraced
components of the equations are replaced by Mm|a. This intervention coincides with Proposition 4 in
Jackson and Vanderweele (2018), though in the case where we have no pre-treatment confounders. Intu-
itively, such an approach would shift the focus away from eliminating ‘direct’ paths from family income
to college attendance net of high school performance, towards eliminating disparities in college atten-
dance entirely, regardless of whether those disparities arise through high school performance or through
alternative pathways. In contrast with the secondary interventions considered earlier, this estimand is
the quantity obtained under randomly assigning college attendance among low income students such
that they follow the same marginal distribution as among all those from high-income backgrounds, but
not conditional on high school performance (strong secondary interventions) or high school performance
and other class-based aspects of upbringing (weak secondary interventions).
Each of these approaches comes with its relative advantages and disadvantages compared with the
secondary interventions I propose here. First, to be sure, an intervention to assign college attainment
to everyone is more radical than strong or weak secondary interventions, but arguably, interventions are
most informative when identifiable from observed data. An intervention to everyone’s college status
would likely drastically change returns to college; further, estimating this quantity is highly susceptible
to positivity extrapolations - when we impute a set of counterfactual outcomes for types of individuals
not or rarely observed in the treatment condition - in disparity estimands where we manipulate everyone’s
treatment condition to M = m. Do we really know how enrolling high school dropouts in higher
education would impact their earnings when we observe next to no such cases in our datasets? The
approach I suggest in this paper sidesteps this issue by considering interventions conditional on GPA.
Further, conditional equalization may not be entirely satisfactory as an intervention. Consider the
motivating setup where A is an indicator variable denoting family income, Z, high school GPA, M an
indicator of college attendance, and Y, earnings. While conditional equalization uses the distribution
f (m|z) for both family income groups, this distribution reflects both the ‘net effect’ of academic perfor-
mance on college attendance (Z → M) and the confounding effect of family income (Z ← A → M). In
other words, conditional equalization represents an intervention that does not fully purge the interven-
tion of the lingering effect of family background. Finally, randomized interventions, unlike secondary
interventions, do not constrain low income students to a college attendance distribution that depends
on their (low) GPA performance, and therefore block not only the path from family income to college
attendance net of GPA but, in addition, the path from family income to college attendance via GPA. The
advantage of this approach is that it represents an even more radical intervention than the strong secondary
intervention considered above, and it neutralizes both secondary and primary class effects on attainment.
The disadvantage is that it is difficult to envisage such an intervention in practice, as it would require the
college admissions process to disregard any signal of academic ability (as proxied through high school
GPA). It is therefore perhaps a less realistic or at any rate foreseeable intervention compared with strong
and weak forms of secondary intervention.
3 Estimating weak and strong interventional effects
3.1 Estimation by imputation and weighting
The equations in the previous section suggest that both weak and strong secondary interventions can be
estimated via numerical integration or a Monte Carlo simulation approach (e.g. Imai et al., 2010).16 A
limitation of this approach however is that estimates of these conditional density or probability functions
tend to be noisy if any of the mediators are continuous or multivariate. Fortunately, we can rewrite both
of the integrals that identify weak and strong interventions, respectively, in terms of probability functions
that are more amenable to estimation.
Consider first weak secondary interventions. Without loss of generality, consider the case where family
income takes two levels. We note that the integral $\theta_{aa^*} = \iint E[Y \mid a^*, x, z, m]\, dP(m \mid x, z, a)\, dP(x, z \mid a^*)$ can
be expressed as a series of iterated expectations of observed/imputed outcomes:
$$\theta_{aa^*} = E_{x,z \mid a^*}\, E_{m \mid a,x,z}\, E[Y \mid a^*, x, z, m]$$
Since this approach does not require modeling of any of the mediators, it is especially advantageous
when one or more mediators is multivariate or continuous. This equation then suggests the following
regression-imputation procedure (see Zhou and Yamamoto, 2020):
1. Since θa,a ≡ E[Y(M|x,z,a)|A = a] = E[Y|A = a] by construction, and because A is treated as a
demographic marker with no ‘pre-treatment’ covariates, θa,a can be estimated by the average of the
observed outcome Y among units with A = a.
2. Estimate E[Y|a∗, x, z, m] by fitting a model for the outcome conditional on A, M, X and Z, and
obtain predicted values for all units at A = a∗ and their observed values of X, Z and M.
3. Fit a model for the imputed outcomes obtained in (2), E[Y|a∗, x, z, m], conditional on A, X and Z,
and obtain predicted values for all units at A = a and their observed values of X and Z.
4. Estimate θaa∗ by averaging the fitted values of Em|a,x,zE[Y|a∗, x, z, m] among all units with A = a∗.
16Specifically, this would involve fitting models for the observed outcome and conditional densities or probabilities of the mediators f(m3|a∗, m2), f(m2|a, m1) and f(m1|a), simulating counterfactual values of the mediator and of the outcome given the simulated mediator values, and computing the integral using the simulated values.
Standard errors can then be obtained by bootstrapping the entire procedure. Steps 1-4 can easily be gen-
eralized to instances where A is multivariate, in which case the procedure is analogous but the contrasts
a and a∗ may refer to arbitrary levels of family income.17 Further, because the iterated expectations are
agnostic about functional form, we can use any model, including flexible machine learning methods,
to estimate the outcome models. This approach can help reduce model dependence, especially in cases
where the mediators are high-dimensional and when non-linearities and interactions are likely to exist. In
order to improve convergence rates of the machine-learning estimators employed, and for semiparamet-
ric/asymptotic efficiency, we can alternatively use a debiased-machine learning approach to estimate this
intervention (Chernozhukov et al., 2017). This approach is characterised by two components: first, the
use of a Neyman orthogonal estimating equation which makes estimates of the target parameter ‘locally
robust’ to estimates of the nuisance function; second, the use of a K-fold cross-fitting algorithm. In my
main analyses, I employ this debiased machine-learning approach for the weak secondary intervention,
which I detail further in Appendix B.
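The four-step regression-imputation procedure for the weak secondary intervention can be sketched as follows. This is an illustrative sketch under stated assumptions, not the debiased machine-learning estimator used in the main analyses: ordinary least squares stands in for the outcome models, the data come from a made-up linear data-generating process (`simulate`), and all function names are hypothetical. In step 2 the imputed outcome is evaluated at each unit's observed X, Z and M with A set to a∗.

```python
import numpy as np


def ols_fit(features, target):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def ols_predict(coef, features):
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def weak_intervention(A, X, Z, M, Y, a=1, a_star=0):
    """Regression-imputation estimates of (theta_aa, theta_aa*)."""
    # Step 1: theta_aa is the observed mean outcome among units with A = a.
    theta_aa = Y[A == a].mean()
    # Step 2: fit the outcome model, then impute outcomes with A set to a*.
    coef_y = ols_fit(np.column_stack([A, X, Z, M]), Y)
    imputed = ols_predict(coef_y, np.column_stack([np.full_like(A, a_star), X, Z, M]))
    # Step 3: regress the imputed outcomes on (A, X, Z); predict with A set to a.
    coef_imp = ols_fit(np.column_stack([A, X, Z]), imputed)
    fitted = ols_predict(coef_imp, np.column_stack([np.full_like(A, a), X, Z]))
    # Step 4: average the fitted values over units observed at A = a*.
    return float(theta_aa), float(fitted[A == a_star].mean())


def simulate(n, rng):
    """Made-up linear data-generating process, used only to exercise the code."""
    A = rng.integers(0, 2, size=n)
    X = 0.3 * A + rng.uniform(size=n)
    Z = 0.2 * A + 0.3 * X + rng.uniform(size=n)
    M = (rng.uniform(size=n) < 0.1 + 0.2 * A + 0.15 * X + 0.2 * Z).astype(float)
    Y = 30 + 5 * A + 4 * X + 6 * Z + 10 * M + rng.normal(scale=5, size=n)
    return A, X, Z, M, Y


def bootstrap_se(data, n_boot, rng):
    """Bootstrap the entire procedure by resampling units with replacement."""
    n = len(data[0])
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws.append(weak_intervention(*(v[idx] for v in data))[1])
    return float(np.std(draws, ddof=1))


rng = np.random.default_rng(1)
data = simulate(20_000, rng)
theta_aa, theta_aa_star = weak_intervention(*data)
se = bootstrap_se(data, n_boot=100, rng=rng)
print(f"theta_aa = {theta_aa:.2f}, theta_aa* = {theta_aa_star:.2f} (bootstrap SE {se:.2f})")
```

Under this made-up process the population values work out to 49.92 for θaa and 40.95 for θaa∗, so the sketch can be checked against a known target. Any flexible learner could replace `ols_fit` and `ols_predict` without changing the four steps.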
Consider next the strong secondary intervention. Again, like the weak secondary intervention, it
would be difficult to estimate the identifying equation directly when any of the mediators are continuous
or multivariate. Further, the integral expression for the strong intervention cannot be expressed as a series
of iterated expectations that would enable us to pursue a regression-imputation approach. Fortunately,
however, using Bayes’ rule, we can rewrite Equation 2 as a function of odds ratios of M:
$$\iiint E[Y \mid a^*, x, z, m]\, dP(m \mid x, a)\, dP(z \mid x, a^*)\, dP(x \mid a^*) = E\left[ Y\, \frac{f(M \mid a, X)}{f(M \mid a^*, X, Z)} \,\middle|\, a^* \right]$$
A proof is shown in Appendix A. Since M is an indicator for college attendance, the density function
f (M|u) becomes a more estimable probability mass function, Pr(M = m|u). This equation therefore sug-
gests the following weighting-based estimator for the strong secondary interventions among low income
children
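$$\psi_{aa^*} = \frac{1}{n} \sum_i \left[ Y_i\, \frac{f(M_i \mid a, X_i)}{f(M_i \mid a^*, Z_i, X_i)} \,\middle|\, a^* \right]$$
and
$$\psi_{aa} = \frac{1}{n} \sum_i \left[ Y_i\, \frac{f(M_i \mid a, X_i)}{f(M_i \mid a, Z_i, X_i)} \,\middle|\, a \right]$$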
for high-income children when applying the ‘uniform-strong’ intervention. This method can be seen
as an extension of the weighting-based estimators proposed in Vanderweele et al. (2014) Approach 3, to
17Note that when A is continuous, an additional step to fit a further model as a function of A would be required.
estimate the ‘randomized interventional analog of the natural (in)direct effect’, and suggests the following
procedure:
1. Estimate a model (1) for M conditional on A and X; for individuals with observed level of family
income A = a∗, obtain predicted values at A = a.
2. Estimate a model (2) for M conditional on A, Z and X; for individuals with observed level of family
income A = a∗, obtain predicted values at A = a∗.
3. For individuals with observed level of family income A = a∗:
(a) Obtain predicted values from model (1) at A = a;
(b) Obtain predicted values from model (2) at A = a∗;
(c) A weighted average of observed Yi among individuals with observed level of family income
A = a∗, where the weights are the ratio of the predicted probabilities obtained in steps (a) and
(b) constitutes an estimate of ψaa∗ .
4. For individuals with observed level of family income A = a:
For the ‘uniform-strong’ intervention:
(a) Obtain predicted values from model (1) at A = a;
(b) Obtain predicted values from model (2) at A = a;
(c) A weighted average of observed Yi among individuals with observed level of family income
A = a, where the weights are the ratio of the predicted probabilities obtained in steps (a) and
(b) constitutes an estimate of ψaa.
For the ‘AA-strong’ intervention:
(a) Calculate E[Y|A = a] as the simple average among individuals from high income backgrounds.
Standard errors can then be obtained by bootstrapping the entire procedure. Again, the procedure can
easily be generalized to instances where A is multivariate, in which case the procedure is analogous but
the contrasts a and a∗ may refer to arbitrary levels of family income. In addition, because the estimator
is derived without parametric assumptions and the identification formulae are agnostic about functional
form, any model, including flexible machine-learning methods, can be used to fit the propensity score models for M.
This approach can help reduce model dependence, especially in cases where the ‘intermediate variables’
~X are high-dimensional and when non-linearities and interactions exist. For all models in the analyses
that follow, I use a super learner consisting of Lasso and random forest.
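The weighting procedure for the strong secondary intervention can be sketched in the same spirit. The sketch below is illustrative only: linear probability models fitted by least squares stand in for the super learner, predicted probabilities are clipped to keep the weights finite (a choice of this sketch, not of the procedure above), and the simulated data and all function names are hypothetical.

```python
import numpy as np


def fit_lpm(features, m):
    """Linear probability model for Pr(M = 1 | features), fitted by least squares."""
    design = np.column_stack([np.ones(len(m)), features])
    coef, *_ = np.linalg.lstsq(design, m, rcond=None)
    return coef


def predict_lpm(coef, features):
    """Predicted Pr(M = 1), clipped away from 0 and 1 so the weights stay finite."""
    design = np.column_stack([np.ones(len(features)), features])
    return np.clip(design @ coef, 0.01, 0.99)


def pmf(prob_one, m):
    """f(M_i | .): probability of the value of M actually observed for each unit."""
    return np.where(m == 1, prob_one, 1.0 - prob_one)


def strong_intervention(A, X, Z, M, Y, a=1, a_star=0):
    coef1 = fit_lpm(np.column_stack([A, X]), M)      # model (1): M given A, X
    coef2 = fit_lpm(np.column_stack([A, Z, X]), M)   # model (2): M given A, Z, X

    def weighted_mean(group):
        rows = A == group
        n_rows = int(rows.sum())
        # Numerator: model (1) evaluated at A = a.
        num = pmf(predict_lpm(coef1, np.column_stack([np.full(n_rows, a), X[rows]])), M[rows])
        # Denominator: model (2) evaluated at the group's own observed level of A.
        den = pmf(predict_lpm(coef2, np.column_stack([np.full(n_rows, group), Z[rows], X[rows]])), M[rows])
        return float(np.mean(Y[rows] * num / den))

    return {
        "psi_aa_star": weighted_mean(a_star),      # step 3
        "psi_aa_uniform": weighted_mean(a),        # step 4, 'uniform-strong'
        "psi_aa_AA": float(Y[A == a].mean()),      # step 4, 'AA-strong'
    }


def simulate(n, rng):
    """Made-up linear data-generating process, used only to exercise the code."""
    A = rng.integers(0, 2, size=n)
    X = 0.3 * A + rng.uniform(size=n)
    Z = 0.2 * A + 0.3 * X + rng.uniform(size=n)
    M = (rng.uniform(size=n) < 0.1 + 0.2 * A + 0.15 * X + 0.2 * Z).astype(float)
    Y = 30 + 5 * A + 4 * X + 6 * Z + 10 * M + rng.normal(scale=5, size=n)
    return A, X, Z, M, Y


rng = np.random.default_rng(2)
estimates = strong_intervention(*simulate(100_000, rng))
print(estimates)
```

Under this made-up process the population value of ψaa∗ is 41.35. Because the simulated outcome is additive in M and Z, the ‘uniform-strong’ and ‘AA-strong’ quantities for the high-income group happen to coincide here (both 49.92); this is a feature of the toy example and would not hold in general.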
3.2 General equilibrium threats: weighting constraints
As I detail in Section 2.3, one key limitation of the estimands I propose is that they involve changing the
proportion of individuals attending college. This is problematic insofar as the effect of college attendance
on adult earnings is partly a function of the proportion of individuals in the population who receive a
college degree. Claims about a counterfactual world in which more people attend college are at best speculative.
Two options I consider are (a) adapting our interpretation of the estimates we obtain as simply
hypothetical mobility rates in a small fraction of the population (see Lundberg, 2020), and (b) changing
the estimand into one that preserves the marginal distribution of college attendance at its observed value.
Since in practice we can only use observational data to estimate the effect of college attendance at the
observed proportion of individuals who obtain a degree, an estimand that changes not the prevalence of
degree-holders but rather simply who holds a degree can be endowed with a global rather
than local interpretation.
We can easily extend the estimation framework I develop above to numerically derive a set of weighting
constraints that preserve the marginal distribution of college attendance. In future analyses, I plan to
compare both approaches (exploring counterfactual mobility rates under both an intervention to a small
sample of the population, and alternatively an intervention to the entire population where the number
of college attendees does not change). To this end, we can additionally impose a constraint on
the quantities obtained via my proposed estimation procedures that ensures P(M = m) is preserved by
selecting the weight $w_i$ such that
$$\int \Pr(M = m \mid a, x, z)\, w_i\, dP(x, z) = \Pr(M = m)$$
for the weak secondary intervention, and
$$\int \Pr(M = m \mid a, z)\, w_i\, dP(z) = \Pr(M = m)$$
for the strong secondary intervention. Using the empirical distribution as estimates of P(x, z) and
P(z), the resultant weights can therefore be calculated numerically as $w_i = \Pr(M = m) / \Pr(M_i = m \mid a_i, x_i, z_i)$
and $w_i = \Pr(M = m) / \Pr(M_i = m \mid a_i, z_i)$, for the weak and strong interventions, respectively.
REDO
These formulas seem incorrect. I was talking about a weighted average of Pr(M=1|a,x,z) across
different values of a as a "composite intervention" such that the marginal distribution is unchanged.
So the weight should be a function of a: w(a) and the intervention is a summation of Pr(M=1|a,x,z)w(a)
over a, subject to the constraint that w(a) sum to 1.
4 Data
To estimate weak and strong secondary interventions, I draw primarily on the National Longitudinal
Survey of Youth 1997 (NLSY97), which began with a nationally representative sample of men and women
at ages 12 to 18 in 1997. The population amenable to the interventions I consider comprises all students who
completed a high school diploma or GED (‘high school graduates’). I exclude students who dropped out
of high school since they would be ineligible for college entry, and thus for the interventions I propose.
I measure parental income (A) as the average family income reported in the five earliest survey waves
(1997 to 2001), and adult income by averaging respondent annual earnings between ages 30 and 33. Both
variables are adjusted for inflation to 2019 dollars using the personal consumption expenditures index
(PCE). I treat respondent’s annual earnings as the sum of their self-reported wage and salary income and
income from farms and businesses. Although total family income arguably captures a more complete
picture of economic (dis)advantage in adulthood, focusing on individual income enables a more focused
analysis for two reasons. First, because family income is a function of extra-labour market processes such
as assortative mating in addition to processes in the labour market, the counterfactual disparities I estimate
would capture the effect of college on labour market and marital outcomes, which may differ
and even counteract each other (Zhou, 2019). Second, for more global interpretations of the counterfactual
disparities, family-level measures of adult income would require the stricter assumption that neither
labour market nor marital market outcomes are a function of the proportion of individuals attending col-
lege, which is harder to maintain than solely the first component. As is common practice in the income
mobility literature (Chetty et al., 2014, 2020; Bloome et al., 2018; Zhou, 2019), I transform both parental
and respondent/adult earnings into their percentile ranks. This enables me to capture ‘relative’ rates of
income mobility, which consider the intergenerational persistence of income net of overall changes in
the marginal distribution of income over time and thus more directly measure equality of opportunity
(Torche, 2015; Bukodi and Goldthorpe, 2018). I calculate the income ranks with respect to the population
of high-school completers, and adjust them using the NLSY97 sampling weights.
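As an illustration of the rank transformation, the sketch below computes percentile ranks within a weighted sample. It is a minimal sketch under assumptions not stated in the text: the function name `weighted_percentile_rank` is hypothetical, ranks use a midpoint convention (the weight strictly below a unit plus half its own weight), and tied values are not averaged.

```python
import numpy as np


def weighted_percentile_rank(values, weights):
    """Percentile rank (0-100) of each value within the weighted distribution.

    Midpoint convention (an assumption of this sketch): a unit's rank is the
    share of total weight strictly below it plus half of its own weight.
    Tied values are ranked in their sorted order rather than averaged.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values, kind="mergesort")   # stable sort
    sorted_w = weights[order]
    cum_w = np.cumsum(sorted_w)
    midpoints = cum_w - 0.5 * sorted_w
    ranks = np.empty_like(values)
    ranks[order] = 100.0 * midpoints / cum_w[-1]
    return ranks


# Toy example with made-up incomes and sampling weights.
incomes = [10.0, 30.0, 20.0]
sampling_weights = [1.0, 1.0, 2.0]
print(weighted_percentile_rank(incomes, sampling_weights))  # [12.5 87.5 50. ]
```

Restricting `values` and `weights` to high-school completers before calling the function would yield ranks defined with respect to that population.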
Compared with the NLSY79, which could also in theory be used to estimate the interventions I con-
sider, the NLSY97 facilitates estimation of interventional effects that are likely to be closer to those we
would observe if we undertook the ideal field experiment, since it traces the educational and labor mar-
ket experiences of a younger cohort. Nevertheless, such an approach necessarily comes with a trade-off,
since it only enables us to measure adult earnings in these cohort members’ early thirties. As has been
noted in the mobility literature, measures of adult income at younger ages are likely to act as poor prox-
ies for permanent adult income, and thus misrepresent true rates of mobility (e.g. Blanden, 2013; Bloome
et al., 2018). I therefore plan to re-estimate the interventional effects I consider using the NLSY79, with
measures of adult income at later ages.
In addition to parent and child income, I construct three sets of variables. First, I measure college
attendance as a binary variable denoting whether an individual has ever enrolled in a 4-year college (with
the reference category indicating completion of high school studies). I set the cutoff of college attendance
at age 25, and only treat individuals who began their postsecondary education as 4-year college enrollees
(“four-year beginners”). Categorizing community-college transfers as high-school graduates enables a
more focused analysis: individuals who transition from 2- to 4-year colleges are qualitatively distinct
from those who begin at 4-year colleges (Ciocca Eller and DiPrete, 2018), and including transfer students
in the analysis would necessitate some measure of associate's degree (AA) performance, which would be
considered in college admissions processes. Second, I measure high school GPA Z as a credit-weighted
average of GPA from high school classes. I use NLSY97 transcript data rather than self-reported GPA and
curricular measures to measure academic performance, which increases confidence in the validity of our
estimates.
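As a small illustration of the credit-weighted average, the following sketch computes a GPA from course-level grade points and credits. The function name and the example transcript are hypothetical and are not drawn from the NLSY97 transcript files.

```python
import numpy as np

def credit_weighted_gpa(grade_points, credits):
    """Average of course grade points, weighted by the credits of each course."""
    grade_points = np.asarray(grade_points, dtype=float)
    credits = np.asarray(credits, dtype=float)
    total_credits = credits.sum()
    if total_credits <= 0:
        raise ValueError("total credits must be positive")
    return float(np.sum(grade_points * credits) / total_credits)

# Hypothetical transcript: two full-credit courses and one half-credit course.
gpa = credit_weighted_gpa([4.0, 3.0, 2.0], [1.0, 1.0, 0.5])
```

Here the half-credit course counts for half as much as the others, giving (4 + 3 + 1) / 2.5 = 3.2.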
Third, I construct a set of background and school characteristic variables as part of the intermediate
variables X in my model. To recall, these variables are important for two reasons: first, they are
necessary in order to plausibly estimate the effect of college attendance on earnings, which is only identified
if I appropriately adjust for all confounding of the M–Y relationship. Second, these variables delineate
a particular pathway through which family income is associated with college attendance, and as such
play an important role in the interventions themselves. In particular, the weak secondary intervention I
propose concerns equalizing college attendance within subgroups defined by both high school GPA and
social-structural aspects of (dis)advantage that affect college attendance. While in practice it is difficult to
measure all of the components of this set of variables, the NLSY97 is advantageous as it contains a wealth
of information on individuals’ family and social background, making identification of the effect of college
attendance on earnings more plausible. I include a range of variables covering demographic
characteristics (age in 1997, gender and race), non-economic aspects of family background (parental education,
whether the respondent lived with both biological parents, presence of a father figure, and southern or
rural residence), ability (percentile score on the ASVAB test), a measure of substance use and of
delinquency, whether the respondent had any children by age 18, and peer and school-level characteristics
(peer college expectations, an indicator for whether the individual attended a private high school, and
three dummy variables denoting whether the respondent ever had property stolen at school, was ever
threatened at school, and was ever in a fight at school). In particular, parental education is measured
using mother’s years of schooling or, if missing, father’s years of schooling. Because I treat family income
as purely a descriptive marker, we need not worry about whether such variables are strictly descendants
of family income as opposed to pre-treatment. I restrict the sample to respondents with non-missing
information on all variables of interest, yielding an analytic sample of N = 3,737 individuals.18
5 Results
5.1 Sample characteristics
To get a sense of the extent of inequalities by parental income groups, Table 1 presents descriptive statis-
tics of my analytic sample. For expositional simplicity, in Table 1 I dichotomize parental income by me-
dian parental income. These dichotomized statistics provide a picture consistent with inequalities across
all parental income strata, as shown in Figures 9 and 10 in the Appendix, which show the fitted values
from regressing each covariate on a spline function of parental income rank. Indeed, many of the
variables I consider have an almost linear relationship with parental income, indicating how socio-economic
advantage persists throughout the income distribution, and not just between the top and bottom halves.
A comparison between low and high income groups reveals considerable differences by class back-
ground in background characteristics, and educational and labour market outcomes. On all measured
social background characteristics, poor children are substantially disadvantaged compared with children
from high income backgrounds. Poor children are more likely to have had school experiences disrupted
by violence or threatening behaviour than their socio-economically advantaged peers, and less likely to
be surrounded by school peers expecting college education or to have attended a private high school. Re-
garding the family unit, poor children are far less likely to have been living with their biological parents,
with both parents, and with parents who have a high level of education (measured in years). The average
poor child also lived in a family with far fewer parental assets.
Stark differences for children of different parental income brackets persist throughout young
adulthood. The average credit-weighted GPA of a high school graduate from the bottom 50% of family
earnings is 2.65, compared with 2.99 for individuals in the top 50% of family earnings. Poorer children are
also less likely to have progressed beyond only high school education, and, if they do ‘make it’ to college,
are less likely to have attended a four-year college (32% as opposed to 60% for high income
children), and are half as likely as their more advantaged peers to have attended a four-year college
without transferring (20% vs 42%). Class origin inequalities also leave their mark on adult socio-economic
outcomes. The gap in annual earnings between individuals from a high and low income background is
on average 12.1 percentiles; the corresponding figure for hourly wages is similar, at 13.6 percentiles. In
terms of real dollars, these differences translate to approximately $16,000 and $6, respectively. In keeping
with my focus on intergenerational income persistence, I focus on these percentile outcomes (specifically,
of earnings) as my primary dependent variable in the analyses that follow. This backdrop of intense
inequalities by parental income motivates an exploration of what might happen to class-based inequalities
under a series of counterfactual secondary interventions.

18 I adopt this approach because of the sheer computational time required to bootstrap on multiply imputed datasets. However, MI increases the sample size by a factor of approximately 1.7. After submission of this project, I will therefore either bootstrap on the imputed datasets or try to develop an alternative variance estimation strategy for the strong secondary intervention (for the weak intervention, I can use the EIFs proposed in Zhou (2020) for inference, as I currently do on the non-bootstrapped sample).
5.2 Weak and strong secondary interventions
The left panel of Figure 4 presents current income gaps in adult percentile earnings rank by parental
income quintiles. Extant class-based inequalities in adult socio-economic outcomes are stark: individuals
coming from the lowest parental income quintile attain, on average, adult earnings in the 37th percentile,
while those who were brought up in the highest parental income quintile on average take home earnings in
the 58th percentile. In other words, coming from a high rather than low income household in childhood is
associated with an increase in expected earnings of 21 percentiles. Using the estimation strategy outlined
above, I next estimate average earnings under a weak secondary intervention. The figure shows that under
such an intervention, this gap in expected income between individuals from the highest and lowest parental
income brackets would diminish to 17 percentiles. Interestingly, a reduction of adult earnings disparities
is unique to the gap between the first and fifth parental income quintiles; the gaps between the 2nd, 3rd
and 4th parental quintiles, on the one hand, and the 5th parental quintile, on the other, remain largely
unchanged under the weak secondary intervention. This interesting finding could point to the fact that the
mechanism of disadvantage
that weak secondary interventions neutralize - namely, the cost-benefit calculus at an educational tran-
sition juncture - is especially relevant to individuals from a low-income background. These individuals’
decisions, one might suspect, are especially sensitive to the cost of higher education in contrast to the
economic payoff to immediately entering the job market, and it is perhaps these individuals who have
minimal access to information about the long-term payoffs to college and financial aid available.
Table 1: Conditional means in educational and labour market outcomes, as well as background characteristics, by family income (dichotomized).

Figure 4: Expected adult income ranking under (a) no intervention and (b) a weak secondary intervention. Standard errors for the weak intervention are constructed using the empirical analog of estimated influence functions.

Next, I estimate the counterfactual expected earnings among different parental income quintiles under
the strong secondary intervention. As Figure 5 demonstrates, the strong secondary intervention raises
expected earnings for all parental income groups, and not just for the bottom quintile, as was the case
with the weak secondary intervention. For example, average earnings among the bottom four parental
income quintiles would be expected to increase by 10.8, 10.2, 9 and 7.5 percentage points, respectively,
under such an intervention. The blue and green point estimates for the 5th parental income quintile
group correspond, respectively, to this group’s expected earnings under (a) the ‘uniform-strong’ and (b)
‘AA-strong’ interventions. To recall, these estimands refer to an intervention that (a) applies the strong
intervention to both high and low income groups, and (b) applies the strong intervention only to low
income groups. What we see is that the ‘uniform-strong’ intervention also serves to increase the average
earnings of individuals from the highest parental income quintile, albeit by a smaller amount (6 percentage
points) than corresponding increases for lower income quintiles. Therefore, although the strong secondary
intervention compresses the gaps in adult income by parental background (gap in expected earnings between
1st and 5th quintiles is 16.0 percentage points) marginally more than the weak intervention (that same gap is
17.2 percentage points) and far more than under no intervention (a gap of 21.0 percentage points),
the equalizing potential of the strong intervention is somewhat attenuated by the fact it promotes the
expected earnings of children from high income backgrounds. Instead, therefore, we could consider (b)
the ‘AA-strong’ intervention, which does not alter the distribution of high income children attending
college (i.e. preserves this distribution at its current level). This quantity can be seen as corresponding
to an intervention targeted solely to alter the proportion of lower income children attending college
while not altering admissions policies or behavior for high income children, and as such represents a
form of class-based affirmative action. As the green point estimate and confidence interval show, under
this alternative type of secondary intervention, individuals from the 4th parental income quintile would
marginally overtake the most advantaged children in terms of expected earnings. Under this ‘AA-strong’
intervention, the gap in average adult earnings between the 1st and 5th parental income quintiles would be
reduced from 16.0 percentage points to 10.2 percentage points.

Figure 5: Expected adult income ranking under (a) no intervention and (b) a strong secondary intervention. Blue and green point estimates for the 5th parental income quintile group correspond to this group’s expected earnings under the ‘uniform-strong’ and ‘AA-strong’ interventions, respectively. Standard errors for the strong intervention are obtained via the nonparametric bootstrap (500 replications).
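The gap arithmetic in this subsection can be checked from the rounded figures quoted in the text. The sketch below uses only those reported numbers (baseline ranks of 37 and 58 for the 1st and 5th quintiles, and strong-intervention gains of 10.8 and 6 points); the variable names are illustrative. Because the inputs are rounded, the recomputed ‘uniform-strong’ gap comes out at 16.2 rather than the reported 16.0.

```python
# Rounded figures as reported in the text (percentile ranks / points).
baseline_rank = {1: 37.0, 5: 58.0}   # mean adult earnings rank, no intervention
strong_gain = {1: 10.8, 5: 6.0}      # gain under the strong intervention

# Gap between the 5th and 1st parental income quintiles, no intervention.
gap_none = baseline_rank[5] - baseline_rank[1]

# 'Uniform-strong': both quintiles receive the strong intervention.
gap_uniform = round(
    (baseline_rank[5] + strong_gain[5]) - (baseline_rank[1] + strong_gain[1]), 1
)

# 'AA-strong': only the lower quintile is treated; the 5th stays at baseline.
gap_aa = round(baseline_rank[5] - (baseline_rank[1] + strong_gain[1]), 1)

print(gap_none, gap_uniform, gap_aa)
```

The ‘AA-strong’ gap is smaller than the ‘uniform-strong’ gap by exactly the gain that the top quintile would otherwise receive (6 points).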
5.3 Heterogeneity by racial groups
I next examine heterogeneity in secondary interventional effects by racial/ethnic groups. To this end, I
repeat the analysis conducted in the previous section for racial-ethnic subgroups. The dependent vari-
able of interest - annual earnings - is again ranked in relation to the distribution of all individuals who
at least complete high school, regardless of racial group. Figures 6, 7 and 8 show expected adult earn-
ings by parental income quintile groups under a series of secondary interventions for whites, Hispanics
and Blacks, respectively. First, there are presently considerable inequalities in adult attainment by racial
group, even among individuals from the same parental quintile. For example, while whites and Hispan-
ics in the bottom income quintile have average earnings in the 39-40th percentile, average earnings for
Blacks in the equivalent group are 8 percentage points lower. Similarly, while whites and Hispanics in
the top income quintile have average earnings in the 57-60th percentiles, Blacks in the same group have
average earnings in only the 47th percentile. Consequently, the gap in average earnings between the bot-
tom and top parent income groups differs across race: this gap is 18.4 percentage points for whites, 20.6
percentage points for Hispanics and 15.6 percentage points for Blacks. The smaller gap for Blacks reflects
the lower and more compressed income distribution among this group.
Turning now to the weak form of secondary intervention, the estimated increase in expected earn-
ings among whites and Hispanics is similar, at approximately 2-4 percentage points for children in the
bottom three parental quintiles. By contrast, Black individuals are not estimated to benefit from such
an intervention; the expected increase in earnings among this group under a weak intervention is sub-
stantively insignificant. Consequently, gaps in annual earnings between the lowest and highest parental
income groups would be expected to decrease by 1.9 percentage points for whites, 4.0 percentage points
for Hispanics, and 0.9 percentage points for Blacks. Under the strong intervention, the story is quite
different. Specifically, for whites, a strong secondary intervention would see expected earnings increase by
6-8 percentiles for the bottom three parental quintiles. The result is even more stark for Hispanics in the
equivalent quintile groups: Hispanics in the bottom three parental quintiles would be expected to earn
between 7 and 12 percentiles more under this form of intervention. Unlike in the case of weak secondary
interventions, Blacks also stand to benefit highly from the strong form of secondary intervention - more
so than whites but slightly less than Hispanics. Black individuals in the bottom parental quintile would
see average earnings increase by 10.6 percentage points, while expected earnings for Blacks in the second
and third quintile groups increase by between 6 and 7.5 percentage points under such an intervention.
Overall, a ‘uniform-strong’ secondary intervention would see the gap in adult percentile earnings be-
tween the lowest and highest quintile groups drop by 3.6 percentage points for whites, 5.8 percentage
points for Hispanics, but −0.2 percentage points for Blacks (that is, the gap would widen marginally). The
fact that the ‘uniform-strong’ intervention does not reduce the overall gap between the first and fifth
quintiles for Black individuals reflects the relatively low earnings of high parental income Black groups
under no intervention, and that Blacks from
all income groups stand to benefit from this intervention. However, under the ‘AA-strong’ intervention
(which applies the strong secondary intervention only to low-income groups), the gap between low and
high income Black students would essentially be reduced to 0.
More generally, how might we make sense of the fact that the equalizing effect of the strong secondary
intervention is more prominent for Blacks than whites, while the weak form of secondary intervention is
slightly more effective for whites? One possible explanation might be that (a) the pathway from family in-
come to college attendance net of both high school GPA and intermediate variables constitutes a weaker
effect for Blacks than whites, while (b) the composite path from family income to college attendance
through both this direct path and through intermediate variables constitutes a stronger effect for Blacks
than whites (see Figure 2 for a comparison of these two pathways). Specifically, (a) might correspond
to the fact that the class-based cost-benefit calculus (Boudon, 1974; Breen and Goldthorpe, 1997) might
interact with one’s racial group. Perhaps ethnic minority students expect larger benefits from higher ed-
ucation, for instance because they expect levels of ethnic discrimination in the labor market to be reduced
if they attain higher qualifications. Alternatively, given the significant obstacles that Black parents would
have faced to make it to a similar class position as that of their white counterparts, we might expect Black
individuals with a particular income level to be more positively selected on attributes salient for their
children’s educational decision-making than whites. On the other hand, (b) might result from the fact
that Black children, on average, suffer from a broader set of structural disadvantages as compared with
their white peers, even within the same income level - such as growing up in an impoverished neigh-
bourhood, being surrounded by lower-income peers and attending lower-performing high schools (e.g.
Wodtke et al., 2011; Sampson, 2012). Therefore, a strong secondary intervention, which intervenes to
block the composite path from family income to college attendance via (i) a class-based cost-benefit cal-
culus and (ii) structural factors encapsulated in the set of intermediate variables, blocks a set of pathways
that are presently more constraining for Blacks than for whites.
Figure 6: Gaps in white adults’ income ranking under a series of stochastic educational interventions. Blue and green point estimates for the 5th parental income quintile group correspond to this group’s expected earnings under the ‘uniform-strong’ and ‘AA-strong’ interventions, respectively. Standard errors for the strong intervention are obtained via the nonparametric bootstrap (500 replications).

Figure 7: Gaps in Hispanic adults’ income ranking under a series of stochastic educational interventions. Blue and green point estimates for the 5th parental income quintile group correspond to this group’s expected earnings under the ‘uniform-strong’ and ‘AA-strong’ interventions, respectively. Standard errors for the strong intervention are obtained via the nonparametric bootstrap (500 replications).

Figure 8: Gaps in Black adults’ income ranking under a series of stochastic educational interventions. Blue and green point estimates for the 5th parental income quintile group correspond to this group’s expected earnings under the ‘uniform-strong’ and ‘AA-strong’ interventions, respectively. Standard errors for the strong intervention are obtained via the nonparametric bootstrap (500 replications).
6 Conclusion and future steps
The aim of this paper has been to define and estimate a series of stochastic interventional effects that can
usefully inform us about the levels of mobility we would likely observe under a series of interventions
to the schooling/college system. Its starting point was the recognition that a long-appreciated distinc-
tion among sociologists of education between two dimensions of class-based educational inequalities is
useful from the perspective of designing policy. Theoretically, I have sought to improve upon the orig-
inal primary-secondary effects theory by re-considering identification issues in the context of mobility
interventions. My proposal is to consider two types of ‘secondary interventions’: what I label ‘weak’
and ‘strong’ secondary interventions, which can be conceptualized from a field-experimental standpoint.
Empirically, I have shown how these estimands can be identified from observational data under the
corresponding assumptions, and have proposed a regression-imputation and weighting strategy that can
be used for estimation in practice. The flexibility of the estimation procedures I propose is such that we
can use any flexible model, including machine-learning models, to implement them. Using the NLSY97 as a case
study, I have shown the insights that these distinct interventional effects can bring for understanding
policy correctives to inequalities of opportunity.
I’ve been constrained by time in terms of the empirical analyses I have been able to undertake, and
finish the piece by simply listing those which I would like to explore further in the immediate future. The
first, most mundanely, is a replication of the main analyses using different measurement specifications,
as I allude to in the data section. This would include analyzing interventional effects at different college
attendance cutoffs, measuring adult income as total family, rather than solely individual, income, and
analyzing the effects using the NLSY79 to capture adult earnings at a later stage in the life-cycle. This
might be more relevant if I pursue this as a substantive rather than as a methods piece, however.
Second, as I highlight in Section 2.3, there are two possibilities in terms of estimating the interven-
tional effects with observational data. The first, which I have pursued here, is to simply interpret the
interventional estimates as ‘local’. The second, which involves applying the weighting procedure I detail
in Section 3.2, involves an additional estimation step that preserves the marginal distribution of college
attendees under the intervention. This second approach therefore enables a more ‘global’ interpretation
to the resulting estimates. I would like to provide comparison analyses for these two approaches.
The third is a potential theoretical concern regarding the anticipatory component of choice.
One potential criticism of my classification of strong and weak interventions is that some of the
variables I include in the vector X of intermediate variables (such as peer expectations) might alternatively
be considered indicators of anticipatory decision-making that is part of the secondary class effect. By
marginalizing out college attendance conditional on these variables, my estimates would artificially
understate the inequality-reducing capability of the weak intervention. I would therefore
like to re-estimate the weak and strong interventions considering this alternative classification of a subset
of the intermediate variables.
Next, I plan to undertake a formal decomposition of the ‘limitations’ of the interventions I propose.
As I have noted at a number of points throughout this paper, policy interventions to equalize transition
rates to higher education for those within the same GPA bracket are likely to be unsatisfactory for re-
ducing class educational inequalities in general insofar as low income students are constrained by their
lower average GPA scores, as well as to the extent that class inequalities in adult income persist among
college graduands. This latter aspect of the persistent effect of social background on attainment, even
among college goers, might be particularly concerning as a source of inequality of opportunity when
within-educational group inequality by parent background has increased in recent years (Bloome et al.,
2018). One could therefore seek to perform a formal decomposition of the sources of lingering inequality
on both a pre- and post-intervention sample. For instance, one could pursue the following decomposi-
tion:
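\begin{align*}
&\underbrace{\Big\{ E[Y \mid a] - \int E[Y \mid a^*, x, z, m]\, dP(m \mid x, a, z)\, dP(z \mid a, x)\, dP(x \mid a) \Big\}}_{\text{residual inequality in X, Z, M groups}} \\
+ &\underbrace{\Big\{ \int E[Y \mid a^*, x, z, m]\, dP(m \mid x, a, z)\, dP(z \mid a, x)\, dP(x \mid a) - \int E[Y \mid a^*, x, z, m]\, dP(m \mid x, a, z)\, dP(z \mid a, x)\, dP(x \mid a^*) \Big\}}_{\text{residual inequality in X (intermediate variables)}} \\
+ &\underbrace{\Big\{ \int E[Y \mid a^*, x, z, m]\, dP(m \mid x, a, z)\, dP(z \mid a, x)\, dP(x \mid a^*) - \int E[Y \mid a^*, x, z, m]\, dP(m \mid x, a, z)\, dP(z \mid a^*, x)\, dP(x \mid a^*) \Big\}}_{\text{residual inequality in GPA}}
\end{align*}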
Such a decomposition could also point towards fruitful research avenues to explore. For instance,
if it turns out that residual inequality in GPA is a large driver of the remaining discrepancy in income
outcomes, then this could act as motivation for a more radical set of interventions that additionally seek to
eradicate the path from parent income to college attendance via high school GPA.
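As a check on this decomposition, it may help to note that the three components telescope. The shorthand $\psi(a_m, a_z, a_x)$ below is introduced only for this illustration and does not appear elsewhere in the paper; its three arguments index the income group whose distribution of M, Z and X, respectively, is used.

```latex
\begin{align*}
\psi(a_m, a_z, a_x) &:= \int E[Y \mid a^*, x, z, m]\, dP(m \mid x, a_m, z)\, dP(z \mid a_z, x)\, dP(x \mid a_x), \\[4pt]
\underbrace{E[Y \mid a] - \psi(a, a, a)}_{\text{X, Z, M groups}}
+ \underbrace{\psi(a, a, a) - \psi(a, a, a^*)}_{\text{X}}
+ \underbrace{\psi(a, a, a^*) - \psi(a, a^*, a^*)}_{\text{GPA}}
&= E[Y \mid a] - \psi(a, a^*, a^*).
\end{align*}
```

The right-hand side is the difference between the observed mean outcome in group $a$ and the identified mean for group $a^*$ under the weak secondary intervention (the final line of the identification result in Appendix A), so the three components sum to the inequality that remains after that intervention.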
Finally, there are two additional components of the project I would love to consider more, especially
with your guidance! The first is the further substantive issue that the interventions I consider concern
college enrolment, rather than completion. Much work has shown vast class- and race-based inequalities
in college completion, and to the extent that many of the low-income individuals ‘sent’ to college under
the interventions I propose subsequently drop out (see Ciocca Eller and DiPrete, 2018), the weak and strong
secondary policies could be deemed relatively inefficient. Future work could therefore consider a series of
dynamic interventions to college attendance and then completion. The second pertains to methodological
aspects of the project that would facilitate estimation. In particular, the bootstrapping procedure neces-
sary for inference of the strong secondary intervention is computationally inefficient; I wonder whether
it is possible to derive a consistent variance estimator for the strong intervention. Perhaps we could also
look into whether semiparametric estimation can be developed for the strong intervention, which would
additionally facilitate uncertainty estimation.
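To make the computational cost concrete, the nonparametric bootstrap referred to above can be sketched as follows. This is a generic illustration: `bootstrap_se` and its arguments are hypothetical names, and `estimator` stands in for the full strong-intervention estimation routine, which would have to be re-run on every resample (nuisance models included). That refitting is what makes 500 replications expensive.

```python
import numpy as np

def bootstrap_se(data, estimator, n_reps=500, seed=0):
    """Nonparametric bootstrap standard error of estimator(data).

    Rows of `data` are resampled with replacement n_reps times, the
    estimator is recomputed on each resample, and the standard deviation
    of the replicate estimates is returned.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    estimates = np.empty(n_reps)
    for b in range(n_reps):
        idx = rng.integers(0, n, size=n)   # resample row indices
        estimates[b] = estimator(data[idx])
    return float(np.std(estimates, ddof=1))

# Toy check with the sample mean, whose analytic SE is sd / sqrt(n).
sample = np.random.default_rng(1).normal(size=400)
se_boot = bootstrap_se(sample, np.mean)
se_analytic = float(sample.std(ddof=1) / np.sqrt(sample.size))
```

For the sample mean the two standard errors should agree closely, which gives a quick sanity check before applying the same loop to a costlier estimator.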
• end up 29 percentiles higher in the income distribution on average relative to children
• a. Check results under (a) alternative age cutoffs for college attendance (I currently use age 25) and
(b) different measures of college attendance (my treatment is 4 year college and I classify transfer
students as high school graduates). b. Replicate all analyses on the NLSY79 and ELS.
A Identification and Estimation of Equations 1 and 2
In this section, I offer formal identification proofs of the ‘weak’ and ‘strong’ secondary interventions I
propose, as well as proofs of the estimation strategies I undertake in the main text. Let A, X, Z, M and
Y be as in the main text. The following single independence assumption is sufficient to identify both
the weak and strong secondary interventions: Y(m) ⊥⊥ M|A, X, Z, i.e. there must be no unobserved
mediator-outcome confounding conditional on all antecedent variables.
First, for the ‘weak’ secondary intervention, we have:
\begin{align*}
E[Y(M_{\mid X,Z,a}) \mid A = a^*]
&= \int E[Y(M_{\mid x,z,a}) \mid A = a^*, X = x, Z = z]\, dP(z \mid A = a^*, X = x)\, dP(x \mid A = a^*) \\
&= \int E[Y(m) \mid A = a^*, X = x, Z = z, M_{\mid x,z,a} = m]\, dP(M_{\mid x,z,a} = m \mid A = a^*, X = x, Z = z)\, dP(z \mid A = a^*, X = x)\, dP(x \mid A = a^*) \\
&= \int E[Y(m) \mid A = a^*, X = x, Z = z, M = m]\, dP(m \mid X = x, Z = z, A = a)\, dP(z \mid A = a^*, X = x)\, dP(x \mid A = a^*) \\
&= \int E[Y \mid A = a^*, X = x, Z = z, M = m]\, dP(m \mid X = x, Z = z, A = a)\, dP(z \mid A = a^*, X = x)\, dP(x \mid A = a^*)
\end{align*}
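For intuition, the final line of this derivation can be evaluated directly when X, Z and M are discrete, since the integral then reduces to a weighted sum. The sketch below is a toy plug-in calculation under that assumption: the function name, the array layout and all numbers are invented for illustration and are not estimates from the NLSY97.

```python
import numpy as np

def weak_intervention_mean(mu, p_m, p_z, p_x):
    """Sum over (x, z, m) of mu * p_m * p_z * p_x.

    mu[x, z, m]  : E[Y | A = a*, X = x, Z = z, M = m]
    p_m[x, z, m] : P(M = m | X = x, Z = z, A = a)   (attendance of group a)
    p_z[x, z]    : P(Z = z | A = a*, X = x)
    p_x[x]       : P(X = x | A = a*)
    """
    return float(np.einsum("xzm,xzm,xz,x->", mu, p_m, p_z, p_x))

# Toy setup with binary X, Z and M (all values hypothetical).
x = np.arange(2)[:, None, None]
z = np.arange(2)[None, :, None]
m = np.arange(2)[None, None, :]
mu = 40.0 + 5.0 * x + 5.0 * z + 10.0 * m        # outcome model for group a*
p_x = np.array([0.5, 0.5])
p_z = np.full((2, 2), 0.5)
p_m_high = np.broadcast_to([0.4, 0.6], (2, 2, 2))  # group a attends at 60%
p_m_low = np.broadcast_to([0.8, 0.2], (2, 2, 2))   # group a* attends at 20%

theta_weak = weak_intervention_mean(mu, p_m_high, p_z, p_x)
theta_observed = weak_intervention_mean(mu, p_m_low, p_z, p_x)
```

In this toy example, giving group a* the attendance probabilities of group a within each (x, z) cell raises the group's expected outcome from 47 to 51, while the distributions of X and Z are left at their a* values.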
Note that this quantity can then be rewritten as follows:
Additionally, π0(a∗) can be estimated as the simple proportion of individuals in the sample
with family income level A = a∗.
(b) For each fold k (estimation sample), calculate the signal θ∗aa∗ for each observation using the
equation given above.
2. Compute an estimate of the weak secondary effect by averaging the estimated influence functions
across all subsamples S_1 through S_K, for all units:
\[
\theta_{aa^*} = n^{-1} \sum_i \theta^*_{i,aa^*}
\]
Standard errors can be constructed using the sample variance of the estimated influence functions:19
\[
\mathrm{Var}(\theta_{aa^*}) = E\big[(\theta^*_{i,aa^*} - \hat{E}(\theta^*_{i,aa^*}))^2\big]
\]
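The averaging step and the associated standard error can be sketched as below. The function name is hypothetical, and `signals` is assumed to hold the per-observation values of the estimated signal pooled across all folds. Dividing the sample variance of the signals by n before taking the square root is the usual convention for the standard error of such an average; it is stated here as an assumption, since the display above shows only the variance of the signals themselves.

```python
import numpy as np

def eif_estimate_and_se(signals):
    """Point estimate and standard error from per-observation signals.

    The estimate is the sample mean of the signals. The standard error is
    sqrt(sample variance / n), an assumed convention for the sampling
    variability of that mean.
    """
    signals = np.asarray(signals, dtype=float)
    n = signals.size
    theta_hat = signals.mean()
    se = np.sqrt(signals.var(ddof=1) / n)
    return float(theta_hat), float(se)

# Hypothetical signals for five units.
theta_hat, se = eif_estimate_and_se([1.0, 2.0, 3.0, 4.0, 5.0])
ci_95 = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
```

A Wald-type interval such as `ci_95` then follows from the normal approximation.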
As shown in the main text, E[Y|A = a] (i.e. the average observed outcome among individuals from
high parent backgrounds) factorizes into the expectation of their income under the weak intervention.
As such, we only require estimation for the lower income group. Note additionally that this debiased
machine-learning approach is valid for any paired contrasts of parent income, A ∈ {a, a∗}. In my main
analyses I bin parental income into quintiles, and thus extend this algorithm for each contrast {a∗ =
1; a = 5}, {a∗ = 2; a = 5}, . . . , {a∗ = 4; a = 5}.
19 Since the sample estimation variance Var(EIF) coincides with the semiparametric efficiency bound.
C Smoothed spline functions of variables used in analyses
Figure 9: Fitted conditional means of labour market and educational outcomes as a natural spline function of parental income rank (with 3 degrees of freedom). Ribbons represent 95% confidence intervals. Age cutoff for attainment of all outcomes is set to 25.

Figure 10: Fitted conditional means of background characteristics as a function of parental income rank. Fitted values are obtained by a natural spline function with 3 degrees of freedom. Ribbons represent 95% confidence intervals. Age cutoff for attainment of all outcomes is set to 25.
References
David H Autor, Lawrence F Katz, and Melissa S Kearney. Trends in US wage inequality: Revising the
revisionists. The Review of Economics and Statistics, 90(2):300–323, 2008. ISSN 0034-6535.
Sandy Baum, Jennifer Ma, and Kathleen Payea. Education Pays, 2010: The Benefits of Higher Education
for Individuals and Society. Trends in Higher Education Series. College Board Advocacy & Policy Center,
2010.
Jo Blanden. Cross-country rankings in intergenerational mobility: a comparison of approaches from
economics and sociology. Journal of Economic Surveys, 27(1):38–73, 2013. doi: 10.1111/j.1467-6419.2011.