22/07/2015
1
Longitudinal Modelling with Longitudinal Households
Paul Clarke
Workshop, Understanding Society Conference, 21 July 2015
Overview
1. Longitudinal households and modelling review
2. Household-level outcomes: Residential mobility example
3. Approaches for modelling person-level outcomes
4. Further issues
LONGITUDINAL HOUSEHOLDS
AND MODELLING REVIEW
Part 1
Longitudinal Households (HHs)
• For British Household Panel Study
  – Sample of 5500 HHs drawn in 1991 from PAF
  – HHs followed up annually
• Sample members classified as
  – Original/Temporary/Permanent
• Families and persons move over time
  – Especially important for maturing studies (BHPS, PSID)
Modelling Panel Data
• Types of longitudinal model:
  – Growth curve
  – Repeated measures
  – Survival/Event history
• All based on adapted regression models
• Common themes: trend and autocorrelation
A Simple Regression Model
• Linear model for row i of data set (person):
$$y_i = x_i\beta + \varepsilon_i$$
  where $x_i\beta$ is the fixed part and $\varepsilon_i$ is the random part or residual
• Variance-covariance structure unspecified
  – Crucial part of modelling exercise
  – Often of substantive interest
Parameter Estimation: No Clustering
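• No clustering, persons independent
• Estimating equations summed over persons:
$$\sum_{i=1}^{n} W(x_i;\theta)\,\varepsilon(y_i \mid x_i;\theta) = 0$$
  – $\varepsilon(y_i \mid x_i;\theta)$ depends wholly on our model
  – $W(x_i;\theta)$ can be chosen for efficiency
• e.g. GLS for heteroskedastic linear regression:
$$\varepsilon(y_i \mid x_i) = y_i - x_i\beta, \qquad W(x_i) = \frac{x_i}{\sigma_e^2(x_i)}$$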
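The GLS estimating equation can be checked numerically. The sketch below is illustrative only: it assumes a single scalar covariate, no intercept, and variances that are known in advance. The data values and the helper names `gls_scalar` and `estimating_equation` are hypothetical, not taken from the workshop.

```python
def estimating_equation(beta, x, y, sigma2):
    """Left-hand side of sum_i W(x_i) * eps_i = 0 with W(x_i) = x_i / sigma2_i."""
    return sum(xi / s2 * (yi - xi * beta) for xi, yi, s2 in zip(x, y, sigma2))


def gls_scalar(x, y, sigma2):
    """Closed-form root of the estimating equation for a scalar beta."""
    num = sum(xi * yi / s2 for xi, yi, s2 in zip(x, y, sigma2))
    den = sum(xi * xi / s2 for xi, s2 in zip(x, sigma2))
    return num / den


# Hypothetical data: four independent persons with unequal residual variances.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.9]
sigma2 = [0.5, 1.0, 2.0, 4.0]  # assumed known, for illustration only

beta_hat = gls_scalar(x, y, sigma2)
residual_sum = estimating_equation(beta_hat, x, y, sigma2)
```

At `beta_hat` the weighted sum of residuals is zero up to rounding, which is all the estimating equation requires.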
Covariance Structure
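$$\operatorname{var}\!\left(\begin{pmatrix} y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix} \,\middle|\, \mathbf{x}\right) = \operatorname{var}\begin{pmatrix}\varepsilon_1\\ \varepsilon_2\\ \vdots\\ \varepsilon_n\end{pmatrix} = \begin{pmatrix}\sigma_\varepsilon^2 & 0 & \cdots & 0\\ 0 & \sigma_\varepsilon^2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \sigma_\varepsilon^2\end{pmatrix}$$
where $\mathbf{x}$ is all sample covariate information
Everything homoskedastic from now on!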
The Effects of Clustering
• Persons can be grouped (e.g. households)
  – Persons in a group more alike than general population (correlation > 0)
  – Less commonly, can also be more unalike (correlation < 0)
• If no clustering then no need to allow for grouping
  – But clustering is usually present
• Multiple groups: membership can be hierarchical
  – Focus on most important/known groups
Autocorrelation
• Repeated measures over time
  – i now indexes unique wave-person observations (i = t,i)
  – Clustering: waves within a person
• Almost inevitable: hence autocorrelation
  – Almost always > 0
  – Decays over time
• For simplicity, I will ignore it (life is hard enough!)
  i.e. $\operatorname{cov}(\varepsilon_{ti}, \varepsilon_{(t+d)i}) = 0$
Parameter Estimation: Clustering
• Household (HH) clustering; each HH independent
• So estimating equations sum over HHs:
$$\sum_{h=1}^{n_H} W(\mathbf{x}_h;\theta)\,\varepsilon(y_{1h},\ldots,y_{mh} \mid \mathbf{x}_h;\theta) = 0$$
  where $\mathbf{x}_h = (x_{1h}, \ldots, x_{mh})'$ are the combined HH covariates
• Two different approaches:
  – Marginal models and conditional models
Multilevel Models
• Very flexible family of models, e.g. Goldstein 2011; Snijders & Bosker 2012
• Two-level linear model for outcomes:
$$y_{ih} = x_{ih}\beta + \varepsilon_{ih} = x_{ih}\beta + (e_{ih} + u_h)$$
• Decompose residual (c.f. variance-components model)
  – Both residuals normally distributed
• HH-level residual explains HH clustering:
$$p(y_{1h},\ldots,y_{mh} \mid \mathbf{x}_h, u_h) = \prod_{i=1}^{m} p(y_{ih} \mid x_{ih}, u_h)$$
Parameter Estimation
• Specify log-likelihood:
$$\sum_h \log p(y_{1h},\ldots,y_{mh} \mid \mathbf{x}_h) = \sum_h \log \int \prod_{i=1}^{m} p(y_{ih} \mid x_{ih}, u)\, dF_u(u)$$
  – $p(y_{ih} \mid x_{ih}, u)$: cond. model plus normality of e
  – $F_u$: normal distn
  – Integrate over unknown u
• Maximise likelihood
  – Numerically (quadrature, MCMC)
  – Iterative least squares (ILS)
  – Distributional assumptions determine form of “Wε”
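To make "integrate over unknown u" concrete, the following sketch evaluates one household's likelihood contribution in two ways for a two-person household in the linear model: by brute-force numerical integration over u, and from the bivariate normal density with variance σ_e² + σ_u² and covariance σ_u². It is an illustration under stated assumptions, not the estimation routine of any package: the trapezoid rule stands in for proper quadrature, and the residual and variance values are hypothetical.

```python
import math


def normal_pdf(z, var):
    """Density of N(0, var) evaluated at z."""
    return math.exp(-0.5 * z * z / var) / math.sqrt(2.0 * math.pi * var)


def marginal_by_integration(resid, s2e, s2u, n_grid=2001, width=10.0):
    """Trapezoid-rule integral of prod_i p(y_i | x_i, u) against the N(0, s2u) density of u.

    resid holds y_i - x_i * beta for each household member.
    """
    sd = math.sqrt(s2u)
    lo, hi = -width * sd, width * sd
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        u = lo + k * step
        f = normal_pdf(u, s2u)
        for r in resid:
            f *= normal_pdf(r - u, s2e)
        total += f * (step if 0 < k < n_grid - 1 else step / 2.0)
    return total


def marginal_closed_form(resid, s2e, s2u):
    """Bivariate normal density with variances s2e + s2u and covariance s2u."""
    r1, r2 = resid
    a, b = s2e + s2u, s2u
    det = a * a - b * b
    quad = (a * r1 * r1 - 2.0 * b * r1 * r2 + a * r2 * r2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


resid = (0.3, -0.4)  # hypothetical residuals for a two-person household
numeric = marginal_by_integration(resid, s2e=1.0, s2u=0.5)
exact = marginal_closed_form(resid, s2e=1.0, s2u=0.5)
```

The two values agree to many decimal places, which shows that integrating out the household effect is what induces the within-household covariance.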
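Covariance Structure
• Allow correlation between household members:
$$\operatorname{cov}(y_{ih}, y_{jh} \mid \mathbf{x}_h) = \operatorname{cov}(\varepsilon_{ih}, \varepsilon_{jh}) \neq 0$$
• Recall decomposition of model residual:
$$y_{ih} = x_{ih}\beta + e_{ih} + u_h, \qquad e_{ih} \sim N(0, \sigma_e^2), \quad u_h \sim N(0, \sigma_u^2)$$
• Leads to …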
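Residual Covariance Matrix
• Example: Three-person HH
$$\operatorname{var}\begin{pmatrix}\varepsilon_{1h}\\ \varepsilon_{2h}\\ \varepsilon_{3h}\end{pmatrix} = \operatorname{var}\begin{pmatrix}e_{1h}+u_h\\ e_{2h}+u_h\\ e_{3h}+u_h\end{pmatrix} = \begin{pmatrix}\sigma_e^2+\sigma_u^2 & & \\ \sigma_u^2 & \sigma_e^2+\sigma_u^2 & \\ \sigma_u^2 & \sigma_u^2 & \sigma_e^2+\sigma_u^2\end{pmatrix}$$
  where $\operatorname{var}(e_{ih}) = \sigma_e^2$ (i.e. homoskedastic); lower triangle shown, the matrix is symmetric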
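A direct consequence of the three-person matrix, stated here as a standard result rather than something shown on the slide, is the correlation between any two members of the same household (the intraclass correlation; the symbol ρ is an assumed label):

```latex
\rho
  = \operatorname{corr}(\varepsilon_{ih}, \varepsilon_{jh})
  = \frac{\operatorname{cov}(\varepsilon_{ih}, \varepsilon_{jh})}
         {\sqrt{\operatorname{var}(\varepsilon_{ih})\,\operatorname{var}(\varepsilon_{jh})}}
  = \frac{\sigma_u^2}{\sigma_e^2 + \sigma_u^2},
  \qquad i \neq j
```

For instance, hypothetical values σ_u² = 0.5 and σ_e² = 1.5 would give ρ = 0.25.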
Going Longitudinal: Fixed HHs
[Figure: Household 1 with persons 1, 2 and 3, shown at Wave 1 and at Wave 2]
BHPS/UKHLS-Style Dataset
PID Wave HHID
10017933 1 1001507
10017933 2 2000717
10017968 1 1001507
10017968 2 2000717
10017992 1 1001507
10017992 2 2000717
Need to convert to …
Dataset for Analysis
PID Wave HHID
1 1 1
1 2 1
2 1 1
2 2 1
3 1 1
3 2 1
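One way to carry out this conversion is sketched below. It rests on an assumption that is mine rather than the workshop's: persons who share the same wave-specific HHID at every wave are treated as one fixed household. The function name `to_fixed_households` is hypothetical.

```python
# (PID, Wave, HHID) rows in the BHPS/UKHLS style, where HHID changes every wave.
rows = [
    (10017933, 1, 1001507), (10017933, 2, 2000717),
    (10017968, 1, 1001507), (10017968, 2, 2000717),
    (10017992, 1, 1001507), (10017992, 2, 2000717),
]


def to_fixed_households(rows):
    """Recode PIDs to 1, 2, ... and give each distinct HHID history one fixed HHID."""
    history = {}
    for pid, wave, hhid in sorted(rows):
        history.setdefault(pid, []).append((wave, hhid))

    new_pid, new_hh, out = {}, {}, []
    for pid in sorted(history):
        key = tuple(history[pid])  # the person's full sequence of wave-specific HHIDs
        new_pid.setdefault(pid, len(new_pid) + 1)
        new_hh.setdefault(key, len(new_hh) + 1)
        for wave, _ in history[pid]:
            out.append((new_pid[pid], wave, new_hh[key]))
    return out


analysis_rows = to_fixed_households(rows)
```

Applied to the six rows above, this reproduces the analysis dataset shown on the slide. Households whose membership changes between waves would be split into separate histories, which is where the approaches in Part 3 differ.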
Fixed Households
• 3-level model (wave–person–household):
$$y_{tih} = x_{tih}\beta + \varepsilon_{tih} = x_{tih}\beta + (e_{tih} + v_{ih} + u_h)$$
• To ignore individual autocorrelation:
  $v_{ih} = 0$ and $\operatorname{cov}(e_{ti}, e_{(t+d)i}) = 0$
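Implied Covariance Matrix
$$\operatorname{var}\begin{pmatrix}\varepsilon_{111}\\ \varepsilon_{121}\\ \varepsilon_{131}\\ \varepsilon_{211}\\ \varepsilon_{221}\\ \varepsilon_{231}\end{pmatrix} = \begin{pmatrix}C_{11} & C_{12}\\ C_{21} & C_{22}\end{pmatrix}$$
$$C_{11} = C_{22} = \begin{pmatrix}\sigma_e^2+\sigma_u^2 & & \\ \sigma_u^2 & \sigma_e^2+\sigma_u^2 & \\ \sigma_u^2 & \sigma_u^2 & \sigma_e^2+\sigma_u^2\end{pmatrix}$$
$$C_{12} = C_{21}' = \begin{pmatrix}\sigma_u^2 & & \\ \sigma_u^2 & \sigma_u^2 & \\ \sigma_u^2 & \sigma_u^2 & \sigma_u^2\end{pmatrix}$$
(lower triangles shown)
• Diagonals of $C_{12}$: autocorrelation due to HH (between-wave, same person)
• Off-diagonals of $C_{12}$: between-wave covariance, different persons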
HOUSEHOLD-LEVEL
OUTCOMES
Part 2: Multilevel Modelling with Longitudinal Households
What is a Household?
• Murphy (1996)
  – “without some additional conditions it is impossible to use the household as the unit of analysis across time”
• i.e. Must be clear what a HH is to understand what constitutes a change, and the effect this has on the residual correlation structure
Residential Mobility
• Residential mobility important in HH change
  – Economically: labour market
  – Demographically: family dissolution
• HH members decide whether to move or remain
  – Couple (married or cohabiting)
  – Singleton
• Dependents of secondary importance
Some HH-Change Scenarios
• Couple
  – Both partners move to new address
  – Separate: one remains, one leaves current address
  – Separate: both leave for different addresses
• Singleton
  – Moves to new address (and remains single)
  – Moves to new address to form couple
  – Remains at address, joined by new partner
Transition Models
• Different type of model
• y = HH moved between waves t – 1 and t:
$$\operatorname{logit}\Pr(y_{th} = 1 \mid x_{th}) = x_{th}\beta + \cdots$$
• All about covariance structure
• Consider approaches based on multilevel models
Review: Part 1
• Pooled cross-sectional, household-level analysis
  – Clark & Huang (2003)
  – Collapse person-level data set to household-level
  – Each HH-wave contribution is considered distinct
$$\operatorname{logit}\Pr(y_{th} = 1 \mid x_{th}) = x_{th}\beta$$
• ‘Solve’ problem by ignoring it
  – Cannot model change
  – Knowingly mis-specified model
Review: Part 2
• Household-level longitudinal analysis
  – Pickles & Davies (1985)
  – Collapse person-level into household-level data set
  – Each household-wave contribution in separate row
  – HH-level random effect for autocorrelation
$$\operatorname{logit}\Pr(y_{th} = 1 \mid x_{th}) = x_{th}\beta + u_h$$
• ‘Fudges’ longitudinal HH problem: ‘intact’ HHs only
  – Ahead of its time … but selection effects?
Dynamic HHs
• u comprises omitted time-invariant person chars.
  – e.g. behaviour, personality, health, attitude
• Made up of
  – HH members’ characteristics
  – Change with HH composition
• Change in HH membership, change in HH residual
• But all random effects are independent:
  – What if two HHs shared individuals: contradiction?
Review: Part 3
• Individual-based approaches
  – Davies Withers (1997), Böheim & Taylor (2002)
  – (Ideally) Covariates characterise household members
  – Person-level residual (B&T 2002)
$$\operatorname{logit}\Pr(y_{ti} = 1 \mid x_{ti}) = x_{ti}\beta + u_i$$
• ‘Double counting’ means SEs underestimated:
$$y_{t1} = y_{t2} = y_t \quad \text{if both in HH } h$$
• Requires extensive HH-level information
  – But previous applications included little info. on partner
Advice from the Study
“Although households were the initial sampling unit, the BHPS does not treat them as longitudinal entities as households do not remain constant over time. For this reason, longitudinal weights are not calculated for households. To study households longitudinally, researchers must choose a household member to follow over time and take observations of their household characteristics from wave to wave.”
Review: Part 4
• Represent HH by a head of household (HoH)
  – Ioannides & Kan (1996)
• HoH is nominal choice but they chose males
• Limited covariates to represent other partner
• Random effect unchanged even if couple separated
$$\operatorname{logit}\Pr(y_{ti} = 1 \mid x_{ti}) = x_{ti}\beta + u_i$$
• Improved by Rabe & Taylor (2010)
  – Included covariates for both partners
• Separate models if household is singleton or couple
• ‘Singleton’ and ‘couple’ random effects independent
$$\operatorname{logit}\Pr(y_{ti} = 1 \mid x_{ti}) = \begin{cases} x_{ti}\beta + u_i^{C} & \text{if couple}\\ \tilde{x}_{ti}\alpha + u_i^{S} & \text{if singleton}\end{cases}$$
Review: Part 5
• Health and place (Chandola et al 2005)
  – Previous work highlighted area-level effects
  – But are apparent area effects actually HH effects?
  – What about changing HHs?
• Used multiple membership multilevel models
• Outcome SF-36 at Wave 9
  – Weight HH by no. waves
  – Not truly longitudinal
Multiple Membership Models
• Due to Browne et al (2001)
  – Fit by MCMC (Rasbash et al 2009; Leckie & Charlton 2013)
• No HH effects, just weighted sum of person effects:
$$x_{ti}\beta + \sum_{k=1}^{n} w_{tk}\,u_k$$
New Approaches: Generalisation
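• Steele et al (2013)
  – Extended HoH model
  – Multiple membership model
• Generalised couple model:
$$\operatorname{logit}\Pr(y_{ti} = 1 \mid x_{ti}) = \left(x_{ti}\beta + u_{ti}^{(C)}\right)c_{ti} + \left(\tilde{x}_{ti}\alpha + u_{ti}^{(S)}\right)(1 - c_{ti})$$
$$c_{ti} = \begin{cases}1 & \text{if HoH in couple at } t\\ 0 & \text{otherwise}\end{cases}$$
  – Use interactions (no constant term)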
Random-Part Specification
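• Extended HoH model:
$$\begin{pmatrix}u_{ti}^{(C)}\\ u_{ti}^{(S)}\end{pmatrix} = \begin{pmatrix}u_i^{C}\\ u_i^{S}\end{pmatrix} \sim N\!\left(\begin{pmatrix}0\\ 0\end{pmatrix}, \begin{pmatrix}\sigma_C^2 & \\ \sigma_{SC} & \sigma_S^2\end{pmatrix}\right)$$
• Multiple membership model:
$$\begin{pmatrix}u_{ti}^{(C)}\\ u_{ti}^{(S)}\end{pmatrix} = \begin{pmatrix}(u_i + u_j)/2\\ u_i\end{pmatrix} \quad \text{if } j \text{ partner of } i, \text{ where } u_i, u_j \overset{\text{i.i.d.}}{\sim} N(0, \sigma_u^2)$$
  – c.f. consensus HH model (Corfman & Lehmann 1987; Chiappori 1992)
• Coupling process independent (given predictors)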
Example: Residential Mobility
[Figure: persons 1 (Gertrude), 2 (Phil) and 3 (Florence) shown at Wave 1 and at Wave 2]
Data View
PID Wave HoHID c Name
1 1 1 1 Gert
1 2 1 0 Gert
2 1 1
2 2 2
3 1 2 0 Flo
3 2 2 1 Flo
Phil’s records deleted
HoH Model: Covariance Matrix
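$$C = \begin{pmatrix}C_{11} & C_{12}\\ & C_{22}\end{pmatrix}$$
Block 1 is Wave 1 and block 2 is Wave 2; within each block the order is Gertrude, Florence.
$$C_{11} = \begin{pmatrix}\sigma_C^2 & 0\\ 0 & \sigma_S^2\end{pmatrix}, \qquad C_{12} = \begin{pmatrix}\sigma_{SC} & 0\\ 0 & \sigma_{SC}\end{pmatrix}, \qquad C_{22} = \begin{pmatrix}\sigma_S^2 & 0\\ 0 & \sigma_C^2\end{pmatrix}$$
• Phil with neither: Correct
• Phil with both: Incorrect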
MM Model: Covariance Matrix
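$$C = \begin{pmatrix}C_{11} & C_{12}\\ & C_{22}\end{pmatrix}$$
$$C_{11} = \begin{pmatrix}\sigma_u^2/2 & 0\\ 0 & \sigma_u^2\end{pmatrix}, \qquad C_{12} = \begin{pmatrix}\sigma_u^2/2 & \sigma_u^2/4\\ 0 & \sigma_u^2/2\end{pmatrix}, \qquad C_{22} = \begin{pmatrix}\sigma_u^2 & 0\\ 0 & \sigma_u^2/2\end{pmatrix}$$
• Consensus means couples’ decisions less variable than singletons’
• Correct but (over?) structured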
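The multiple membership covariances for this example can be checked by writing each HoH-wave random part as a weighted sum of person effects. The sketch below works in units where σ_u² = 1 and assumes, as the Data View suggests, that Phil is Gertrude's partner at Wave 1 and Florence's partner at Wave 2. The dictionary layout and the function name `cov` are illustrative choices, not part of the original analysis.

```python
from fractions import Fraction as F

# Weights on the person effects (u_Gertrude, u_Phil, u_Florence).
# A couple's random part is the average of the two partners' effects.
weights = {
    ("Gertrude", 1): [F(1, 2), F(1, 2), F(0)],  # in couple with Phil at Wave 1
    ("Florence", 1): [F(0), F(0), F(1)],        # single at Wave 1
    ("Gertrude", 2): [F(1), F(0), F(0)],        # single at Wave 2
    ("Florence", 2): [F(0), F(1, 2), F(1, 2)],  # in couple with Phil at Wave 2
}


def cov(a, b):
    """Covariance of two random parts, in units of sigma_u^2 (independent person effects)."""
    return sum(p * q for p, q in zip(weights[a], weights[b]))


order = [("Gertrude", 1), ("Florence", 1), ("Gertrude", 2), ("Florence", 2)]
C = [[cov(a, b) for b in order] for a in order]
```

The entry linking Gertrude at Wave 1 to Florence at Wave 2 comes out as 1/4, reflecting the shared partner, while Florence at Wave 1 and Gertrude at Wave 2 have covariance 0.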
Simulation Study Results
[Results shown for: Ioannides & Kan; Rabe & Taylor; Extended HoH; Multiple membership; Böheim & Taylor. The results themselves are not recoverable from the extracted text.]
Note typo: should equal 1.0402/2 = 0.50201
Table 5a. Estimates of duration effects and residual variances from alternative models of residential mobility, British Household Panel Study (Steele et al 2013)
Variable                          HoH-common      HoH-joint       MM-consensus    MM-double
                                  Est.    SE      Est.    SE      Est.    SE      Est.    SE
Constant                          -1.70   0.13    -1.76   0.14    -1.77   0.13    -1.69   0.13
Years since last move (ref ≤ 1)
(1,2]                             -0.69   0.07    -0.69   0.07    -0.69   0.07    -0.69   0.07
(2,3]                             -0.94   0.09    -0.93   0.09    -0.93   0.09    -0.94   0.09
(3,4]                             -0.98   0.10    -0.96   0.10    -0.96   0.10    -0.98   0.10
⁞
(8,9]                             -1.25   0.18    -1.21   0.19    -1.20   0.19    -1.26   0.18
(9,10]                            -1.13   0.19    -1.09   0.19    -1.08   0.19    -1.14   0.18
(10,11]                           -1.40   0.22    -1.37   0.22    -1.35   0.23    -1.41   0.22
>11                               -1.43   0.09    -1.39   0.09    -1.38   0.09    -1.44   0.09
Residual variances
Between-individual (Singles)       0.14   0.04     0.22   0.06     0.23   0.05     0.12   0.03
Between-couple                     0.14†  0.04†    0.09   0.04     0.12†  0.02†    0.24†  0.06†
Single-couple covariance           -      -        0.03   0.05     0.12†  0.02†    -      -
(Callout on slide: For Singles)
Table 6a. Estimates of covariate effects from alternative models of residential mobility, British Household Panel Study (Steele et al 2013)
Variable                  HoH-common        HoH-joint         MM-consensus      MM-double
                          Est.     SE       Est.     SE       Est.     SE       Est.     SE
Tenure
Owned-mortgage            0        -        0        -        0        -        0        -
Owned-outright            0.30     0.10     0.29     0.10     0.30     0.10     0.29     0.07
Private rented            1.52     0.07     1.49     0.07     1.51     0.06     1.50     0.05
Social rented             0.24     0.08     0.23     0.07     0.23     0.08     0.25     0.06
Living with parents       1.47     0.19     1.45     0.19     1.47     0.19     1.53     0.15
London residence          -0.02    0.09     -0.02    0.09     -0.02    0.09     -0.03    0.07
Rooms per person          -0.52    0.05     -0.501   0.05     -0.51    0.05     -0.51    0.03
Age, centred at 40        -0.03    0.003    -0.03    0.003    -0.03    0.003    -0.03    0.002
(Age, centred at 40)²     -0.001   0.0002   -0.001   0.0002   -0.001   0.0002   -0.001   0.0002
No. dependent children
0                         0        -        0        -        0        -        0        -
1                         0.25     0.08     0.25     0.07     0.25     0.08     0.25     0.06
≥ 2                       -0.28    0.08     -0.27    0.08     -0.28    0.08     -0.26    0.06
Other covariates (estimates not shown): Age of Youngest Child, Cohabiting, Post-School Education, Employment status
Summary: Impact of Mis-specification
• On fixed effects (coefficients)
  – Introduces unnecessary bias
  – (See HoH-common, MM-double columns)
• On random-part (variances & covariances)
  – Can be severely biased/misleading interpretation
• Not big impact in this example
  – Large sample size (sim and data example) limit impact: worse if sample size smaller
PERSON-LEVEL OUTCOMES
Part 3: Multilevel Modelling with Longitudinal Households
Handling Longitudinal HHs
• Back to linear models:
$$y_{tih} = x_{tih}\beta + e_{tih} + v_{ih} + u_h$$
• Drop household subscript and ignore autocorrelation …
$$y_{ti} = x_{ti}\beta + e_{ti} + 0 + u_{i(HH(t))}$$
• How do we specify $u_{i(HH(t))}$?
Approach 1: HH = House
• u comprises omitted household characteristics
  – e.g., draughty, damp, noisy, small
• Random effect unaffected by
  – No. HH members
  – Characteristics of HH members
• u1 and u2 are
  – a) Distinct: HHs 1 and 2 have distinct physical locations
  – b) Independent: person 3 chose new HH ‘randomly’
Longitudinal HHs: Simple Example
[Figure: persons 1, 2 and 3 at Wave 1 and at Wave 2; household labels “HH 1”, “HH ?” and “HH ??”]
Approach 1: HH Definitions
• Persons 1 and 2 still in same household
  – Keep same random effect (u1) at wave 2
• Person 3 left to join all-new household
  – Different random effect (u2) at wave 2
Approach 1: Data View
PID Wave HHID
1 1 1
1 2 1
2 1 1
2 2 1
3 1 1
3 2 2
Approach 1: Model
Wave 1:
$$y_{1i} = x_{1i}\beta + u_1 + e_{1i}$$
Wave 2:
$$y_{2i} = x_{2i}\beta + u_1 + e_{2i} \quad \text{for } i = 1, 2$$
$$y_{23} = x_{23}\beta + u_2 + e_{23}$$
Approach 1: Covariance Matrix
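$$\begin{pmatrix}C_{11} & C_{12}\\ C_{21} & C_{22}\end{pmatrix}$$
$$C_{11} = \begin{pmatrix}\sigma_e^2+\sigma_u^2 & & \\ \sigma_u^2 & \sigma_e^2+\sigma_u^2 & \\ \sigma_u^2 & \sigma_u^2 & \sigma_e^2+\sigma_u^2\end{pmatrix}$$
$$C_{12} = \begin{pmatrix}\sigma_u^2 & & \\ \sigma_u^2 & \sigma_u^2 & \\ 0 & 0 & 0\end{pmatrix}$$
$$C_{22} = \begin{pmatrix}\sigma_e^2+\sigma_u^2 & & \\ \sigma_u^2 & \sigma_e^2+\sigma_u^2 & \\ 0 & 0 & \sigma_e^2+\sigma_u^2\end{pmatrix}$$
(lower triangles shown, as on the slide)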
Approach 2: HoH-Type Models
• No need to worry about double counting
  – Rabe & Taylor (2010) HoH model: HoH = HH
• Wave 1: All in HH 1 (u1)
• Wave 2: Persons 1 and 2 in ‘different’ HH (u2)
• Wave 2: Person 3 left in another HH (u3)
• Steele et al (2013) extended-HoH model
  – Dummy variables for size of original HH
• Random effects uHH1, uHH2, uHH3 etc. with covariances
• Leavers’ HH effects independent of ‘core’ HH
Approach 2: Data View
PID Wave HHID
1 1 1
1 2 2
2 1 1
2 2 2
3 1 1
3 2 3
Approach 2: Rabe&Taylor-type Model
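Wave 1:
$$y_{1i} = x_{1i}\beta + u_1 + \varepsilon_{1i}$$
Wave 2:
$$y_{2i} = x_{2i}\beta + u_2 + \varepsilon_{2i} \quad \text{for } i = 1, 2$$
$$y_{23} = x_{23}\beta + u_3 + \varepsilon_{23}$$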
Approach 2: Covariance Matrix
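$$\begin{pmatrix}C_{11} & C_{12}\\ C_{21} & C_{22}\end{pmatrix}$$
$$C_{11} = \begin{pmatrix}\sigma_e^2+\sigma_u^2 & & \\ \sigma_u^2 & \sigma_e^2+\sigma_u^2 & \\ \sigma_u^2 & \sigma_u^2 & \sigma_e^2+\sigma_u^2\end{pmatrix}$$
$$C_{12} = \begin{pmatrix}0 & & \\ 0 & 0 & \\ 0 & 0 & 0\end{pmatrix}$$
$$C_{22} = \begin{pmatrix}\sigma_e^2+\sigma_u^2 & & \\ \sigma_u^2 & \sigma_e^2+\sigma_u^2 & \\ 0 & 0 & \sigma_e^2+\sigma_u^2\end{pmatrix}$$
(lower triangles shown, as on the slide)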
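The difference between Approaches 1 and 2 lies entirely in how the HHID column is coded, which the following sketch makes explicit. It builds the full residual covariance matrix from (PID, Wave, HHID) rows under the assumption used throughout this part: independent household effects with variance σ_u², independent residuals with variance σ_e², and no individual autocorrelation. The variance values and the names `residual_cov` and `block` are hypothetical.

```python
def residual_cov(rows, sigma2_e, sigma2_u):
    """Covariance matrix implied by a household random effect keyed on HHID.

    rows: list of (PID, Wave, HHID). Two observations share sigma2_u only if
    their HHID codes are equal; sigma2_e appears on the diagonal only.
    """
    return [
        [
            (sigma2_e if a == b else 0.0) + (sigma2_u if ra[2] == rb[2] else 0.0)
            for b, rb in enumerate(rows)
        ]
        for a, ra in enumerate(rows)
    ]


def block(V, r, c):
    """3x3 block (r, c) of a 6x6 matrix ordered Wave 1 persons 1-3, then Wave 2 persons 1-3."""
    return [row[3 * c:3 * c + 3] for row in V[3 * r:3 * r + 3]]


# Data views from the slides, reordered by wave then person.
approach1 = [(1, 1, 1), (2, 1, 1), (3, 1, 1), (1, 2, 1), (2, 2, 1), (3, 2, 2)]
approach2 = [(1, 1, 1), (2, 1, 1), (3, 1, 1), (1, 2, 2), (2, 2, 2), (3, 2, 3)]

s2e, s2u = 1.0, 0.5  # hypothetical variance values
V1 = residual_cov(approach1, s2e, s2u)
V2 = residual_cov(approach2, s2e, s2u)
```

Both codings give the same within-wave blocks. Only the between-wave block differs: Approach 2 sets it to zero, while Approach 1 keeps covariance σ_u² between any Wave 1 observation and persons 1 and 2 at Wave 2.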
Approach 3: Super-Household
• Super-household (SHH)
  – A set of HHs which have shared at least one individual
• Include SHH random effect (v)
  – 3-level model: Persons within HHs within SHHs
  – Allows between-wave correlation
• v is very hard to interpret
  – Earlier wave outcomes can depend on people who have yet to move in
  – Or on people who did not move in but were co-resident with someone who did
• Covariance structure too crude
Approach 3: SHH Definition
• Combine with distinct HHs
  – SHHs can be determined graphically:
• Wave 1:
  – Each node represents person at Wave 1
  – Insert edge between person pairs in same HH
• Wave 2: Leaving existing edges intact:
  – Insert new node for new sample members
  – Insert edge between pairs in same HH
• Repeat for Waves 3, …, T
  – SHH: Set of persons connected by paths
Approach 3: Data View
PID Wave HHID SHHID
1 1 1 1
1 2 2 1
2 1 1 1
2 2 2 1
3 1 1 1
3 2 3 1
N.B. Distinct HHs
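The graphical SHH definition amounts to finding connected components of the graph whose nodes are persons and whose edges join co-residents at any wave. A minimal sketch using a union-find structure follows. The function name `super_households` and the extra persons 4 and 5 are hypothetical additions for illustration, and household IDs are assumed to be comparable only within a wave.

```python
def super_households(rows):
    """Map each PID to an SHHID, given rows of (PID, Wave, HHID).

    Persons are linked whenever they share a household at some wave;
    an SHH is a set of persons connected by a path of such links.
    """
    parent = {}

    def find(p):
        parent.setdefault(p, p)
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_member = {}
    for pid, wave, hhid in rows:
        find(pid)  # register the person as a node
        key = (wave, hhid)
        if key in first_member:
            union(pid, first_member[key])  # edge to a co-resident
        else:
            first_member[key] = pid

    roots = sorted({find(p) for p in list(parent)})
    label = {root: k + 1 for k, root in enumerate(roots)}
    return {p: label[find(p)] for p in sorted(parent)}


# The data view above, plus two hypothetical persons:
# person 4 joins person 3's new household, person 5 always lives alone.
rows = [
    (1, 1, 1), (1, 2, 2), (2, 1, 1), (2, 2, 2), (3, 1, 1), (3, 2, 3),
    (4, 2, 3), (5, 1, 9), (5, 2, 9),
]
shh = super_households(rows)
```

Persons 1 to 3 receive SHHID 1, matching the data view. Person 4 is pulled into the same SHH through person 3, even though persons 1 and 2 never lived with person 4, which illustrates why v is hard to interpret.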
Approach 3: Model
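Wave 1:
$$y_{1i} = x_{1i}\beta + u_1 + v_1 + \varepsilon_{1i}$$
Wave 2:
$$y_{2i} = x_{2i}\beta + u_2 + v_1 + \varepsilon_{2i} \quad \text{for } i = 1, 2$$
$$y_{23} = x_{23}\beta + u_3 + v_1 + \varepsilon_{23}$$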
Approach 3: Covariance Matrix
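$$C_{11} = \begin{pmatrix}\sigma_e^2+\sigma_u^2+\sigma_v^2 & & \\ \sigma_u^2+\sigma_v^2 & \sigma_e^2+\sigma_u^2+\sigma_v^2 & \\ \sigma_u^2+\sigma_v^2 & \sigma_u^2+\sigma_v^2 & \sigma_e^2+\sigma_u^2+\sigma_v^2\end{pmatrix}$$
$$C_{12} = \begin{pmatrix}\sigma_v^2 & & \\ \sigma_v^2 & \sigma_v^2 & \\ \sigma_v^2 & \sigma_v^2 & \sigma_v^2\end{pmatrix}$$
$$C_{22} = \begin{pmatrix}\sigma_e^2+\sigma_u^2+\sigma_v^2 & & \\ \sigma_u^2+\sigma_v^2 & \sigma_e^2+\sigma_u^2+\sigma_v^2 & \\ \sigma_v^2 & \sigma_v^2 & \sigma_e^2+\sigma_u^2+\sigma_v^2\end{pmatrix}$$
(lower triangles shown, as on the slide)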
Approach 4: Multiple Membership
• Pros: Flexible enough to handle larger HHs and person-level outcomes
• Cons: Independence assumptions about who forms HHs with whom become more tenuous
Approach 4: Data View
PID Wave HHID
1 1 1
1 2 2
2 1 1
2 2 2
3 1 1
3 2 3
Approach 4: Model
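Wave 1:
$$y_{1i1} = x_{1i1}\beta + \tfrac{1}{3}u_1 + \tfrac{1}{3}u_2 + \tfrac{1}{3}u_3 + \varepsilon_{1i1}$$
Wave 2:
$$y_{2i2} = x_{2i2}\beta + \tfrac{1}{2}u_1 + \tfrac{1}{2}u_2 + \varepsilon_{2i2} \quad \text{for } i = 1, 2$$
$$y_{233} = x_{233}\beta + u_3 + \varepsilon_{233}$$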
Approach 4: Covariance Matrix
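$$C_{11} = \begin{pmatrix}\sigma_e^2+\sigma_u^2/3 & & \\ \sigma_u^2/3 & \sigma_e^2+\sigma_u^2/3 & \\ \sigma_u^2/3 & \sigma_u^2/3 & \sigma_e^2+\sigma_u^2/3\end{pmatrix}$$
$$C_{12} = \begin{pmatrix}\sigma_u^2/3 & & \\ \sigma_u^2/3 & \sigma_u^2/3 & \\ \sigma_u^2/3 & \sigma_u^2/3 & \sigma_u^2/3\end{pmatrix}$$
$$C_{22} = \begin{pmatrix}\sigma_e^2+\sigma_u^2/2 & & \\ \sigma_u^2/2 & \sigma_e^2+\sigma_u^2/2 & \\ 0 & 0 & \sigma_e^2+\sigma_u^2\end{pmatrix}$$
(lower triangles shown, as on the slide)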
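The fractions in these matrices follow mechanically from the data view once each observation's random part is written as an equally weighted average of the person effects of everyone in the household at that wave. The sketch below derives the weights from the (PID, Wave, HHID) rows and computes the random-effect covariances in units of σ_u², leaving out the σ_e² terms on the diagonal. Equal weights of 1/(household size) are the choice shown on the model slide; the function names `mm_weights` and `cov_u` are hypothetical.

```python
from fractions import Fraction

# (PID, Wave, HHID) rows from the Approach 4 data view.
rows = [(1, 1, 1), (1, 2, 2), (2, 1, 1), (2, 2, 2), (3, 1, 1), (3, 2, 3)]


def mm_weights(rows):
    """Weight vector over person effects for each (wave, PID) observation."""
    members = {}
    for pid, wave, hhid in rows:
        members.setdefault((wave, hhid), []).append(pid)
    pids = sorted({pid for pid, _, _ in rows})
    weights = {}
    for pid, wave, hhid in rows:
        household = members[(wave, hhid)]
        share = Fraction(1, len(household))  # equal weight per household member
        weights[(wave, pid)] = [share if k in household else Fraction(0) for k in pids]
    return weights


def cov_u(wa, wb):
    """Covariance of two weighted sums of independent effects, in units of sigma_u^2."""
    return sum(a * b for a, b in zip(wa, wb))


w = mm_weights(rows)
within_wave1 = cov_u(w[(1, 1)], w[(1, 2)])       # two persons in the three-person HH
between_waves = cov_u(w[(1, 1)], w[(2, 3)])      # person 1 at Wave 1, person 3 at Wave 2
within_wave2_pair = cov_u(w[(2, 1)], w[(2, 2)])  # persons 1 and 2 at Wave 2
leaver_variance = cov_u(w[(2, 3)], w[(2, 3)])    # person 3 alone at Wave 2
```

Unlike Approaches 1 and 2, the leaver at Wave 2 stays correlated with everyone's Wave 1 outcome, because person 3's own effect contributed to the Wave 1 household average.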
FURTHER ISSUES
Part 4
Selection
• Usual ‘random effects’ assumption
  – Covariates and person/HH effects uncorrelated
  – Unlikely if person-HH selection non-random
  – Estimates are purely descriptive: not causal effects
• On-going further work
  – Modelling residential move process incorporating ‘push’ and ‘pull’ factors (Steele et al 2015, under review)
Sample Design
• Stratified multistage sampling design
  – Design weights
• Self-weighting if extended samples excluded
  – Clustering handled by multilevel model (more or less)
• Post-stratification and non-response weights
  – Depends on how correlated these things are with your outcome (given included covariates)
  – Sensitivity analysis (with and without/SEs wrong)
Missing Data
• Multiple imputation
  – Only for standard hierarchical structures
• Using MCMC:
  – Fills in missing outcomes using data augmentation
  – MAR assumption
  – More difficult for missing covariates
• Drop incomplete wave contributions
  – c.f. noninformative censoring in survival analysis
  – Different assumption to MAR but not unrealistic
• Informative drop-out (Washbrook et al 2014)
References
• Böheim, R., M. Taylor. 2002. "Tied down or room to move? Investigating the relationships between housing tenure, employment status and residential mobility in Britain." Scottish Journal of Political Economy 49:369-92.
• Browne, W.J., H. Goldstein, J. Rasbash. 2001. "Multiple membership multiple classification (MMMC) models." Statistical Modelling 1:103-24.
• Chandola, T., P. Clarke, R. Wiggins, M. Bartley. 2005. "Who you live with and where you live: setting the context for health using multiple membership multilevel models." Journal of Epidemiology and Community Health 59(2):170-75.
• Chiappori, P.A. 1992. "Collective labor supply and welfare." Journal of Political Economy 100:437-67.
• Clark, W.A.V., Y. Huang. 2003. "The life course and residential mobility in British housing markets." Environment and Planning Series A 35:323-39.
• Corfman, K.P., D.R. Lehmann. 1987. "Models of cooperative group decision-making and relative influence - an experimental investigation of family purchase decisions." Journal of Consumer Research 14:1-13.
• Davies Withers, S. 1997. "Methodological considerations in the analysis of residential mobility: A test of duration, state dependence, and associated events." Geographical Analysis 29:354-72.
• Goldstein, H. 2011. Multilevel Statistical Models (4th ed.). Wiley: Chichester.
• Goldstein, H., J. Rasbash, W.J. Browne, G. Woodhouse, M. Poulain. 2000. "Multilevel models in the study of dynamic household structures." European Journal of Population 16:373-87.
• Ioannides, Y.M., K. Kan. 1996. "Structural estimation of residential mobility and housing tenure choice." Journal of Regional Science 36:335-63.
• Leckie, G.B., C. Charlton. 2013. "runmlwin - a program to run the MLwiN multilevel modelling software from within Stata." Journal of Statistical Software 52:11.
• Murphy, M.J. 1996. "The dynamic household as a logical concept and its use in demography." European Journal of Population 12:363-81.
• Pickles, A., R.B. Davies. 1985. "The longitudinal analysis of housing careers." Journal of Regional Science 25:85-101.
• Rabe, B., M. Taylor. 2010. "Residential mobility, quality of neighbourhood and life course events." Journal of the Royal Statistical Society Series A (Statistics in Society) 173:531-55.
• Rasbash, J., C. Charlton, W.J. Browne, M.J.R. Healy, B. Cameron. 2009. MLwiN version 2.1. University of Bristol: Centre for Multilevel Modelling.
• Snijders, T.A.B., R. Bosker. 2012. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling (2nd ed.). Sage: London.
• Steele, F., P. Clarke, E. Washbrook. 2013. "Modeling household decisions using longitudinal data from household panel surveys, with applications to residential mobility." Sociological Methodology 43:220-271.
• Washbrook, E.V., P. Clarke, F. Steele. 2014. "Investigating non-ignorable dropout in panel studies of residential mobility." Journal of the Royal Statistical Society Series C (Applied Statistics) 63:239-266.