Nonlinear Trend in Inequality of Educational Opportunity in the Netherlands 1930-1989
Maarten L. Buis
Harry B.G. Ganzeboom
15 November 2005
Outline
• Main results
• Model selection
– Continuous or discrete education and father’s status
– Importance of mother’s education relative to father’s education
– Difference in effect between sons and daughters
• Non-linearity in the trend in effects: identify periods of negative, positive, and no trend.
Main results
• Model selection
– The distinction between the highest and the lowest educated parent is more important than the distinction between father and mother, or the same-sex parent.
– The effects of parental education and father’s occupational status are the same for sons and daughters.
• Non-linearity in trend
– The effect of father’s status decreases non-linearly over time, slowing down significantly around 1970.
– The effect of parental education most likely decreases linearly.
[Figure: IEO by year in which respondent is 12 (1930-1990) for OLS, SOR, and RC2. For each model, one panel marks the periods with a significant trend and one panel marks the periods with a significant change in trend.]
Data
• International Stratification and Mobility File (ISMF)
• 49 surveys held between 1958 and 2003 with information on cohorts 1930-1989.
• 80,000 observations, of which 66,000 have complete information on child's, father’s and mother’s education and father's status.
• The number of cases is unequally distributed over cohorts.
[Figure: Number of observations per cohort. Frequency (0 to 2500) by year in which respondent is 12 (1920-2000).]
Model 1: linear regression
• The dependent variable is years of education, treated as continuous.
• Parental education is entered as father’s and/or mother’s education, highest and/or lowest educated parent, or education of the same-sex parent.
• Father’s occupational status is measured in ISEI scores.
• Trends in the effects are measured as third-order orthogonal polynomials or lowess curves.
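The third-order orthogonal polynomial trend can be illustrated with a small sketch. This is not the authors' code: the data are simulated, the variable names (cohort, father_isei, years_edu) are hypothetical, and building the polynomials with a QR decomposition is one common construction that is assumed here.

```python
import numpy as np

def orthogonal_poly(x, degree=3):
    """Orthonormal polynomial terms of order 1..degree in x (constant dropped)."""
    x = np.asarray(x, dtype=float)
    # Raw powers 1, x, x^2, ... of the centered variable
    raw = np.vander(x - x.mean(), degree + 1, increasing=True)
    # QR decomposition orthogonalizes the columns in order of increasing power
    q, _ = np.linalg.qr(raw)
    return q[:, 1:]

# Simulated data (hypothetical names and values, for illustration only)
rng = np.random.default_rng(0)
n = 5000
cohort = rng.integers(1930, 1990, size=n)     # year in which respondent is 12
father_isei = rng.uniform(16, 90, size=n)     # father's occupational status
status = (father_isei - father_isei.mean()) / 10
decline = 0.6 - 0.005 * (cohort - 1930)       # effect of status shrinks over time
years_edu = 10 + decline * status + rng.normal(0, 2, size=n)

# Design matrix: constant, status, trend terms, status-by-trend interactions
trend = orthogonal_poly(cohort, degree=3)
X = np.column_stack([np.ones(n), status, trend, status[:, None] * trend])
coef, *_ = np.linalg.lstsq(X, years_edu, rcond=None)

# coef[5:8]: linear, quadratic, and cubic trend in the effect of status
print(dict(zip(["linear", "quadratic", "cubic"], coef[5:8].round(2))))
```

Because the polynomial terms are mutually orthogonal, the linear, quadratic, and cubic trend coefficients can be tested separately without the collinearity that raw powers of the cohort year would introduce.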
Two objections against linear education
• The regression coefficient is affected both by the ‘real’ effects of parental characteristics on the probabilities of making transitions and by educational expansion.
– True, if education is studied as a process.
– False, if education is studied as an outcome.
• Education is discrete.
– This does not have to be a problem if there is no concentration in the lowest or highest category.
[Figure: Highest achieved level of education. Proportion (0 to 1) in each level (primary or less, lower secondary, higher secondary, lower tertiary, higher tertiary) by year in which respondent is 12 (1940-1980).]
Model 2: Stereotype Ordered Regression (SOR)
• SOR allows for an ordered dependent variable.
• SOR will estimate (sequentially) an optimal scaling of education and the effect of independent variables on this scaled education.
Model 3: Row Column Model II (RC2)
• Objection against the use of ISEI:
– The effect of father’s occupation is better represented by a small number of discrete classes than by a continuous scale.
• The classes used are those of the EGP classification.
• RC2 is an extension of SOR that also estimates an optimal scaling for EGP.
Father’s and mother’s education
• Conventional model: only the father matters
• Individual model: both mother and father matter
• Joint model: the effects of father and mother are equal
• Dominance model: the highest educated parent matters
• Modified dominance model: the highest and the lowest educated parent matter
• Sex-role model: the same-sex parent matters
BICs
name                      no.  model                                     OLS     SOR     RC2
baseline model            0a   FIS*BYR^3*FEM + BYR_D*FEM
                          0b   FIS*BYR^3 + BYR_D*FEM
conventional model        1a   (0a) + FED*BYR^3*FEM                      -10375  -20352  -34112
                          1b   (0a) + FED*BYR^3                          -10414  -20391  -34157
                          1c   (0b) + FED*BYR^3                          -10433  -20418  -34194
individual model          2a   (0a) + FED*BYR^3*FEM + MED*BYR^3*FEM      -11003  -21043  -34566
                          2b   (0a) + FED*BYR^3 + MED*BYR^3              -11075  -21124  -34659
                          2c   (0b) + FED*BYR^3 + MED*BYR^3              -11093  -21148  -34698
joint model               3a   (0a) + (FED=MED)*BYR^3*FEM                -11065  -21120  -34580
                          3b   (0a) + (FED=MED)*BYR^3                    -11104  -21159  -34624
                          3c   (0b) + (FED=MED)*BYR^3                    -11121  -21183  -34662
dominance model           4a   (0a) + HI_ED*BYR^3*FEM                    -10923  -20923  -34552
                          4b   (0a) + HI_ED*BYR^3                        -10961  -20963  -34596
                          4c   (0b) + HI_ED*BYR^3                        -10980  -20989  -34634
modified dominance model  5a   (0a) + HI_ED*BYR^3*FEM + LO_ED*BYR^3*FEM  -11071  -21094  -34705
                          5b   (0a) + HI_ED*BYR^3 + LO_ED*BYR^3          -11149  -21180  -34797
                          5c   (0b) + HI_ED*BYR^3 + LO_ED*BYR^3          -11166  -21204  -34835
sex-role model            6a   (0a) + SS_ED*BYR^3*FEM                    -10231  -20217  -33736
                          6b   (0a) + SS_ED*BYR^3*FEM                    -10261  -20252  -33742
                          6c   (0b) + SS_ED*BYR^3                        -10275  -20283  -33790
Scaling of father’s status
EGP                                        mean(ISEI)  RC2
I     Service class, higher grade          66.5         1.000
II    Service class, lower grade           56.4         0.838
IIIa  Routine non-manual employees         48.6         0.651
IIIb  Personal service workers             41.7         0.370
IVa   Small proprietors with employees     45.4         0.467
IVb   Small proprietors without employees  44.7         0.184
V     Manual foremen and technicians       41.3         0.216
VI    Skilled manual workers               34.8        -0.148
VIIa  Semi- and unskilled manual workers   29.7        -0.354
VIIb  Agricultural workers                 17.5        -0.553
IVc   Farmers and smallholders             29.1         0.000
Scaling of education
education mean(educyr) SOR RC2
primary or less 6.0 0.000 0.000
lower secondary 9.3 0.348 0.299
higher secondary 11.0 0.646 0.601
lower tertiary 14.9 0.813 0.793
higher tertiary 17.1 1.000 1.000
Linearity of trend, orthogonal polynomials
                  OLS               SOR               RC2
       trend      t        p        t        p        t        p
FSES   linear     -10.850  0.000    -6.573   0.000    -20.148  0.000
       quadratic    6.940  0.000     4.387   0.000      4.410  0.000
       cubic        0.390  0.694    -0.386   0.699     -0.536  0.591
HI_ED  linear     -12.290  0.000    -4.628   0.000      0.504  0.614
       quadratic    0.940  0.349     1.065   0.112      5.267  0.000
       cubic       -2.290  0.022    -2.080   0.037      2.614  0.009
LO_ED  linear      -6.970  0.000    -3.068   0.000     -0.866  0.386
       quadratic    2.100  0.036     1.587   0.112      1.955  0.050
       cubic        2.470  0.014     1.540   0.123      3.962  0.000
Identifying periods with significant trend
• A negative slope means a negative trend.
• A positive slope means a positive trend.
• A zero slope means no trend.
Identifying periods with significant change in trend
• An accelerating trend means that a negative trend becomes more negative, so a negative change in slope.
• A decelerating trend means that a negative trend becomes less negative, so a positive change in slope.
• A constant trend means no change in slope.
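To make these sign conventions concrete, the sketch below uses a made-up IEO curve (not the estimates from this study) that declines and then levels off. The slope and the change in slope are approximated with finite differences via numpy.gradient, which is an illustrative choice only.

```python
import numpy as np

# Hypothetical IEO curve: declines quickly at first, then levels off
year = np.arange(1930, 1990)
ieo = 2 + 4 * np.exp(-(year - 1930) / 20)

trend = np.gradient(ieo, year)               # slope: change in IEO per year
change_in_trend = np.gradient(trend, year)   # change in slope per year

# Negative slope = negative trend; a positive change in slope while the
# slope is negative = the negative trend is decelerating
negative_trend = trend < 0
decelerating = negative_trend & (change_in_trend > 0)
print(negative_trend.all(), decelerating.all())
```

For this curve every cohort shows a negative trend that becomes less negative over time, which is the decelerating case.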
Data
• The ISMF dataset is converted into a new dataset, containing estimated IEO for 60 annual cohorts.
• The precision of each estimate (derived from its standard error) is used to weight the cohorts.
Lowess
• We have a dataset consisting of estimates of IEO for each annual cohort, each of which uses only information from that cohort.
• If we think that IEO develops like a smooth curve over time, then nearby estimates also contain relevant information.
• The lowess curve creates an improved estimate of the IEO for each cohort using information from nearby cohorts.
• Connecting the lowess estimates results in a smooth line.
• Estimates of the trend and change in trend at each cohort can also be obtained from this curve.
Lowess curve in 1949
• Point on the lowess curve in 1949:
• Select the closest 60% of the points.
• Give larger weights to nearby points.
• Adjust the weights for the precision of the estimated IEO.
• Run an ordinary regression of IEO on time, time squared, and time cubed on the weighted points.
• The predicted value in 1949 is the smoothed value for 1949.
• The first derivative in 1949 is the trend in 1949.
• The second derivative in 1949 is the change in trend in 1949.
• Repeat for all cohorts and connect the dots.
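The steps above can be sketched as follows. This is a minimal illustration on simulated cohort estimates, not the authors' implementation: the function name lowess_at is hypothetical, and taking the precision weight to be 1 / se^2 is an assumption.

```python
import numpy as np

def lowess_at(x0, year, ieo, se, span=0.6):
    """Smoothed IEO, trend, and change in trend at x0 from a local cubic fit."""
    year, ieo, se = (np.asarray(a, dtype=float) for a in (year, ieo, se))
    dist = np.abs(year - x0)
    k = int(np.ceil(span * len(year)))          # closest 60% of the points
    idx = np.argsort(dist)[:k]
    h = dist[idx].max()                         # half-width of the window
    tricube = (1 - (dist[idx] / h) ** 3) ** 3   # larger weights for nearby points
    w = tricube / se[idx] ** 2                  # assumed precision: 1 / se^2
    t = year[idx] - x0                          # time centered at x0
    X = np.column_stack([np.ones(k), t, t**2, t**3])
    sw = np.sqrt(w)
    b, *_ = np.linalg.lstsq(X * sw[:, None], ieo[idx] * sw, rcond=None)
    # At t = 0: value = b0, first derivative = b1, second derivative = 2 * b2
    return b[0], b[1], 2 * b[2]

# Simulated cohort estimates (not the ISMF results)
rng = np.random.default_rng(1)
year = np.arange(1930, 1990)
se = rng.uniform(0.2, 0.6, size=year.size)
ieo = 6 - 0.08 * (year - 1930) + 0.0006 * (year - 1930) ** 2 + rng.normal(0, se)

value, trend, change = lowess_at(1949, year, ieo, se)
smooth = [lowess_at(x, year, ieo, se)[0] for x in year]  # connect the dots
```

Centering time at the target year means the intercept, the linear coefficient, and twice the quadratic coefficient can be read off directly as the smoothed value, the trend, and the change in trend.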
[Figure: Construction of the lowess estimate for 1949, by year in which respondent is 12 (1930-1990). Panels: (a) observations within the window, span = 0.6; (b) tricube weights; (c) tricube (+), precision (x), and joint (o) weights; (d) weighted third-degree polynomial (size of circle proportional to weight), with the smoothed IEO for 1949 marked.]
Selecting spans
• The percentage of closest points used (the span) determines the smoothness of the lowess curve.
• There is a trade-off between smoothness and goodness of fit.
• The trade-off can be judged visually by comparing lowess curves with different spans.
• Numerical representations of this trade-off are Generalized Cross-Validation (GCV) and the Akaike Information Criterion (AIC).
• Lower values mean a better trade-off.
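A rough sketch of span selection by GCV is given below, on simulated data. It uses the textbook form GCV = n * RSS / (n - tr(S))^2 for a linear smoother with smoother matrix S; whether the authors used exactly this form, or weighted the residuals by precision as is done here, is an assumption. The AIC is left out because several variants exist for smoothers.

```python
import numpy as np

def smoother_matrix(year, se, span):
    """Matrix S with smooth = S @ ieo for a precision-weighted local cubic fit."""
    n = len(year)
    k = int(np.ceil(span * n))
    S = np.zeros((n, n))
    for i, x0 in enumerate(year):
        dist = np.abs(year - x0)
        idx = np.argsort(dist)[:k]
        h = dist[idx].max()
        w = (1 - (dist[idx] / h) ** 3) ** 3 / se[idx] ** 2
        u = (year[idx] - x0) / h                 # rescaled time, for stability
        X = np.column_stack([np.ones(k), u, u**2, u**3])
        XtW = X.T * w
        # First row of (X'WX)^-1 X'W gives the fitted value at x0
        S[i, idx] = np.linalg.solve(XtW @ X, XtW)[0]
    return S

def gcv(year, ieo, se, span):
    """Generalized cross-validation score; lower is better."""
    S = smoother_matrix(year, se, span)
    n = len(year)
    rss = np.sum(((ieo - S @ ieo) / se) ** 2)    # precision-weighted residuals
    return n * rss / (n - np.trace(S)) ** 2

# Simulated cohort estimates (not the ISMF results)
rng = np.random.default_rng(2)
year = np.arange(1930, 1990, dtype=float)
se = rng.uniform(0.2, 0.6, size=year.size)
ieo = 6 - 0.08 * (year - 1930) + 0.0006 * (year - 1930) ** 2 + rng.normal(0, se)

spans = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
scores = {s: gcv(year, ieo, se, s) for s in spans}
best_span = min(scores, key=scores.get)
```

The trace of S acts as the effective number of parameters: a small span follows the data closely but uses many effective parameters, and the GCV score penalizes that.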
[Figure: Lowess curves of IEO by year in which respondent is 12 (1930-1990) for OLS, SOR, and RC2, each with span = .5, .6, and .7.]
[Figure: Three pairs of panels showing (a) Generalized Cross-Validation (gcv) and (b) Akaike Information Criterion (aic) as a function of span (0.2 to 1.0).]
Bootstrap confidence intervals
• A confidence interval gives the range of results that could plausibly occur just through sampling error.
• Make many ‘datasets’ that could have occurred just by sampling error.
• Fit lowess curves through each ‘dataset’.
• The area containing 90% of the curves is the 90% confidence interval.
• The estimates of IEO are regression coefficients with standard errors.
• The standard error gives information about what values of IEO could plausibly occur in a ‘new’ dataset.
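The procedure can be sketched as a parametric bootstrap on simulated data. Two points are assumptions rather than details taken from the slides: each ‘new’ IEO value is drawn from a normal distribution centered on the estimate with the standard error as its spread, and the envelope is taken pointwise as the 5th and 95th percentiles of the curves. The function name local_cubic_operators is hypothetical.

```python
import numpy as np

def local_cubic_operators(year, se, span=0.6):
    """Matrices mapping IEO estimates to smoothed level, trend, change in trend."""
    n = len(year)
    k = int(np.ceil(span * n))
    level, trend, change = (np.zeros((n, n)) for _ in range(3))
    for i, x0 in enumerate(year):
        dist = np.abs(year - x0)
        idx = np.argsort(dist)[:k]
        h = dist[idx].max()
        w = (1 - (dist[idx] / h) ** 3) ** 3 / se[idx] ** 2
        u = (year[idx] - x0) / h                 # rescaled time
        X = np.column_stack([np.ones(k), u, u**2, u**3])
        XtW = X.T * w
        B = np.linalg.solve(XtW @ X, XtW)        # coefficients = B @ ieo[idx]
        level[i, idx] = B[0]
        trend[i, idx] = B[1] / h                 # first derivative at x0
        change[i, idx] = 2 * B[2] / h**2         # second derivative at x0
    return level, trend, change

# Simulated cohort estimates (not the ISMF results)
rng = np.random.default_rng(3)
year = np.arange(1930, 1990, dtype=float)
se = rng.uniform(0.2, 0.6, size=year.size)
ieo = 6 - 0.08 * (year - 1930) + 0.0006 * (year - 1930) ** 2 + rng.normal(0, se)

level, trend, change = local_cubic_operators(year, se)

# Each row is one 'dataset' that could have occurred through sampling error
n_boot = 1000
boot = rng.normal(ieo, se, size=(n_boot, year.size))

# Pointwise 5th and 95th percentiles of the bootstrap curves: 90% envelope
level_lo, level_hi = np.percentile(boot @ level.T, [5, 95], axis=0)
trend_lo, trend_hi = np.percentile(boot @ trend.T, [5, 95], axis=0)

# Cohorts for which the envelope of the trend excludes zero
significant_trend = (trend_hi < 0) | (trend_lo > 0)
```

Because the smoother is linear in the IEO estimates, the weights are computed once and every bootstrap curve is a single matrix product. The same percentile step applied to boot @ change.T gives the envelope for the change in trend.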
[Figure: First 25 bootstrap samples, by year in which respondent is 12 (1930-1990). Panels: (a) lowess smooths of IEO; (b) trend in IEO (change in IEO per 100 years); (c) change in trend in IEO (change in trend per 100 years).]
[Figure: 90% confidence envelopes for OLS, SOR, and RC2, by year in which respondent is 12 (1930-1990). Panels for each model: (a) lowess smooth of IEO; (c) trend in IEO (change in IEO per 100 years); (d) change in trend in IEO (change in trend per 100 years).]
[Figure: IEO by year in which respondent is 12 (1930-1990) for OLS, SOR, and RC2. For each model, one panel marks the periods with a significant trend and one panel marks the periods with a significant change in trend.]