Multilevel Modeling: Introduction Chongming Yang, Ph.D Social Science Research Institute Social Capital Group Meeting, Spring 2008
Mar 26, 2015
New Paradigm in Data Analysis
“In the past twenty years we have witnessed a paradigm shift in the analysis of correlational data. Confirmatory factor analysis and structural equation modeling have replaced exploratory factor analysis and multiple regression as the standard methods. We are currently in the early stages of a paradigm shift in the analysis of experimental data. Multilevel modeling is replacing ANOVA. Certainly ANOVA will remain a basic tool in the social psychological research, but it can no longer be considered the only technique”
Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), The Handbook of Social Psychology, Vol. 1 (pp. 233-265). New York: McGraw-Hill.
Alternative Labels
• Hierarchical Linear Model (HLM)
• Random Coefficient Model
• Variance Component Model
• Multilevel Model
• Contextual Analysis
• Mixed Linear Model
Hierarchical Data Structure
• Response (outcome) variable at lowest level
• Grouping at higher levels
• Explanatory (predictive) variables at all levels
• Assuming sampling at all levels
Two Types
• Persons nested within a group
• Repeated measures nested within a person
Example of Multilevel Data
Class  Student id  Math (yr1)  Verbal (yr1)  SES  Math (yr2)
1      1           78          72            70   80
1      2           65          60            56   67
1      3           80          78            63   81
1      4           85          80            75   85
2      1           92          90            80   90
2      2           91          92            81   92
2      3           93          91            83   93
2      4           90          92            82   91
2      5           94          93            85   95
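As a minimal sketch of what "students nested within classes" means as a data structure, the rows of the table above can be grouped by class. The field names (`math1`, `verb1`, `ses`, `math2`) are labels chosen here for illustration, not names from the original data file.

```python
from collections import defaultdict

# Rows copied from the example table:
# (class, student id, math yr1, verbal yr1, ses, math yr2)
rows = [
    (1, 1, 78, 72, 70, 80),
    (1, 2, 65, 60, 56, 67),
    (1, 3, 80, 78, 63, 81),
    (1, 4, 85, 80, 75, 85),
    (2, 1, 92, 90, 80, 90),
    (2, 2, 91, 92, 81, 92),
    (2, 3, 93, 91, 83, 93),
    (2, 4, 90, 92, 82, 91),
    (2, 5, 94, 93, 85, 95),
]

# Each level-2 unit (class) holds its level-1 units (students).
students_by_class = defaultdict(list)
for class_id, student_id, math1, verb1, ses, math2 in rows:
    students_by_class[class_id].append(
        {"student": student_id, "math1": math1, "verb1": verb1,
         "ses": ses, "math2": math2}
    )

# Class means of year-1 math give a first look at between-class differences.
class_means = {
    c: sum(s["math1"] for s in members) / len(members)
    for c, members in students_by_class.items()
}
print(class_means)
```

Note that student ids restart within each class, so a student is identified only by the pair (class, student id).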
Properties of Hierarchical Data
• Observations are interdependent: observations within a group are more similar to each other than to observations from other groups, due to shared history, contextual effects, etc.
• Errors are not independent (longitudinal data)
Standard Modeling Assumptions
• Independent observations
• Independent errors
• Equal variances of errors for all observations
Consequences of Ignoring Hierarchical Data Properties
• Standard errors of regression coefficients are underestimated, thus
• Spurious significant effects (inflated Type I error)
Design-based Approach
• Apply standard analysis with sampling weights to adjust standard errors, common in survey research
Design Effects of Two-level Data
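• Intraclass Correlation
$\rho$ = between-level variance / total variance
• Design Effect
$1 + (n - 1)\rho$
where n = average cluster size (a design effect ≥ 2 warrants a multilevel analysis); the effective sample size is $N / [1 + (n - 1)\rho]$, with N the total sample size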
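The two quantities can be computed by hand for the year-1 math scores in the example data. This is a rough sketch: it uses the one-way ANOVA estimator of the intraclass correlation (one of several possible estimators, chosen here as an assumption) and takes the design effect in its usual form, 1 + (n - 1) × ICC. With only two classes the estimate is purely illustrative.

```python
# Year-1 math scores from the example data, grouped by class.
groups = {1: [78, 65, 80, 85], 2: [92, 91, 93, 90, 94]}

def anova_icc(groups):
    """One-way ANOVA estimate of the intraclass correlation."""
    sizes = [len(g) for g in groups.values()]
    n_total = sum(sizes)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups.values()) / n_total
    means = {c: sum(g) / len(g) for c, g in groups.items()}
    ss_between = sum(len(g) * (means[c] - grand_mean) ** 2
                     for c, g in groups.items())
    ss_within = sum((y - means[c]) ** 2
                    for c, g in groups.items() for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Adjusted average cluster size for unequal cluster sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    # Negative variance estimates are truncated at zero.
    var_between = max((ms_between - ms_within) / n0, 0.0)
    return var_between / (var_between + ms_within)

def design_effect(icc, avg_cluster_size):
    """Design effect in the form 1 + (n - 1) * ICC."""
    return 1 + (avg_cluster_size - 1) * icc

icc = anova_icc(groups)
avg_n = sum(len(g) for g in groups.values()) / len(groups)
deff = design_effect(icc, avg_n)
print(round(icc, 3), round(deff, 3))
```

In practice the intraclass correlation is usually read off the variance components of an intercept-only multilevel model rather than computed this way.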
Another Look
Class  Student id  Math (yr1)  Verbal (yr1)  SES  Teachers’ Competence
1      1           78          72            70   4
1      2           65          60            56   4
1      3           80          78            63   4
1      4           85          80            75   4
2      1           92          90            80   3
2      2           91          92            81   3
2      3           93          91            83   3
2      4           90          92            82   3
2      5           94          93            85   3
Intercepts & Slopes for Each Class
[Figure: regression lines of y on x, one per class; horizontal axis x, vertical axis y]
Class Level Summary
Class  Intercept  Slope  …
1      9.72       2.50
2      13.51      3.26
3      7.64       4.07
4      16.25      0.92
5      13.17      1.27
6      11.21      3.85
7      9.05       4.21
8      17.11      1.32
9      15.32      2.11
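A class-level summary of this kind can be produced by fitting a separate least-squares line in each class. The sketch below does this for the two classes of the example data, regressing year-1 math on SES; that choice of predictor is an assumption made here, and the results will not match the nine-class table, whose underlying data are not shown.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# SES (x) and year-1 math (y) for the two classes in the example data.
classes = {
    1: ([70, 56, 63, 75], [78, 65, 80, 85]),
    2: ([80, 81, 83, 82, 85], [92, 91, 93, 90, 94]),
}

# One (intercept, slope) pair per class.
class_lines = {c: ols_line(x, y) for c, (x, y) in classes.items()}
for c, (b0, b1) in class_lines.items():
    print(f"class {c}: intercept = {b0:.2f}, slope = {b1:.2f}")
```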
Modeling Intercepts & Slopes
$\beta_0 = \gamma_{00} + u_0$
$\beta_1 = \gamma_{10} + u_1$
When the variances of $u_0$ and $u_1$ are zero, there are no group differences in $\beta_0$ and $\beta_1$. Thus the variances of $u_0$ and $u_1$ are very important parameters.
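To get a feel for these parameters, the class-level intercepts and slopes from the nine-class summary table can be summarized directly. This is only a descriptive approximation: the raw variance of per-class estimates also contains sampling error, so it tends to overstate the variances of u0 and u1 that a multilevel program would report.

```python
import statistics

# Class-level intercepts and slopes from the summary table (classes 1 to 9).
intercepts = [9.72, 13.51, 7.64, 16.25, 13.17, 11.21, 9.05, 17.11, 15.32]
slopes = [2.50, 3.26, 4.07, 0.92, 1.27, 3.85, 4.21, 1.32, 2.11]

# Means play the role of the fixed parts (gamma_00, gamma_10).
mean_intercept = statistics.mean(intercepts)
mean_slope = statistics.mean(slopes)

# Sample variances are rough stand-ins for var(u0) and var(u1).
var_intercept = statistics.variance(intercepts)
var_slope = statistics.variance(slopes)

print(round(mean_intercept, 2), round(var_intercept, 2))
print(round(mean_slope, 2), round(var_slope, 2))
```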
Model-based Approach: Multilevel Modeling
• (Multiple Equations) Multilevel Model: $y_i = \beta_0 + \beta_1 x_i + r_i$
$\beta_0 = \gamma_{00} + u_0$
$\beta_1 = \gamma_{10} + u_1$
• (Single Equation) Mixed Model: $y_i = \gamma_{00} + u_0 + \gamma_{10} x_i + u_1 x_i + r_i$
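Multilevel Modeling (with 2nd Level Predictors)
• (Multiple Equations) Multilevel Model: $y_i = \beta_0 + \beta_1 x_i + r_i$
$\beta_0 = \gamma_{00} + \gamma_{01} z_j + u_0$ (main effects)
$\beta_1 = \gamma_{10} + \gamma_{11} z_j + u_1$ (cross-level interaction)
• (Single Equation) Mixed Model: $y_i = \gamma_{00} + \gamma_{01} z_j + u_0 + \gamma_{10} x_i + \gamma_{11} z_j x_i + u_1 x_i + r_i$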
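That the multiple-equation and single-equation forms describe the same model can be checked numerically. In the sketch below the function names and all coefficient values are hypothetical, chosen only for illustration.

```python
def y_multilevel(x, z, g00, g01, g10, g11, u0, u1, r):
    """Two-step form: build the group-specific coefficients, then y."""
    b0 = g00 + g01 * z + u0   # intercept equation
    b1 = g10 + g11 * z + u1   # slope equation
    return b0 + b1 * x + r

def y_mixed(x, z, g00, g01, g10, g11, u0, u1, r):
    """Single-equation form obtained by substituting b0 and b1."""
    fixed = g00 + g10 * x + g01 * z + g11 * z * x
    random_part = u0 + u1 * x + r
    return fixed + random_part

# Hypothetical values for one student (x) in one class (z).
params = dict(g00=50.0, g01=2.0, g10=0.5, g11=0.1, u0=-1.5, u1=0.05, r=0.8)
y_two_step = y_multilevel(x=70, z=4, **params)
y_single = y_mixed(x=70, z=4, **params)
print(y_two_step, y_single)
```

Both calls return the same value up to floating-point rounding, because the mixed form is just the algebraic expansion of the multilevel form.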
Rearranged Single Equation
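• $y_i = [\gamma_{00} + \gamma_{10} x_i + \gamma_{01} z_j + \gamma_{11} z_j x_i]$ (fixed effects)
$+ [u_1 x_i + u_0 + r_i]$ (random effects)
• Parameters to be estimated:
intercept: $\gamma_{00}$
slopes: $\gamma_{10}$, $\gamma_{01}$, $\gamma_{11}$
variances: of $r$, $u_0$, $u_1$
covariances: among the $r$s (in longitudinal data), between $u_0$ and $u_1$, among the $\gamma$s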
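One way to see how these parameters generate data is to simulate from the single equation. The sketch below is a hypothetical generator, not part of any HLM or SAS workflow: the function name, the standard-normal predictors, and every parameter value are assumptions made for illustration.

```python
import math
import random

def simulate_two_level(n_groups, n_per_group, g00, g10, g01, g11,
                       sd_u0, sd_u1, corr_u, sd_r, seed=0):
    """Draw y = fixed part + random part for every unit in every group."""
    rng = random.Random(seed)
    data = []
    for j in range(n_groups):
        z = rng.gauss(0, 1)  # level-2 predictor, constant within a group
        # Correlated random intercept and slope built from two
        # independent standard normal draws.
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        u0 = sd_u0 * e1
        u1 = sd_u1 * (corr_u * e1 + math.sqrt(1 - corr_u ** 2) * e2)
        for _ in range(n_per_group):
            x = rng.gauss(0, 1)      # level-1 predictor
            r = rng.gauss(0, sd_r)   # level-1 residual
            fixed = g00 + g10 * x + g01 * z + g11 * z * x
            random_part = u0 + u1 * x + r
            data.append({"group": j, "x": x, "z": z, "y": fixed + random_part})
    return data

sample = simulate_two_level(n_groups=30, n_per_group=10,
                            g00=50, g10=2, g01=1, g11=0.5,
                            sd_u0=3, sd_u1=1, corr_u=0.3, sd_r=5)
print(len(sample), sample[0])
```

In this cross-sectional setup the generator is governed by eight quantities: four fixed coefficients, three variances, and one covariance between the random intercept and slope.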
Fixed or Random?
• Effect
Fixed: all levels are present in the experiment
Random: the levels are a random selection from all possible levels
• Variable
Fixed: has known values, e.g. gender
Random: has an expectation (mean) and a variance
• Coefficient
Fixed: e.g. gender
Random: a probability function of other variables, has a variance, e.g. 1st level coefficients
Cross-level Interaction
• Appears as a product term in the single equation, not in the multiple equations
• The effect of a lower level variable depends on upper level variables
• Example:
The effect of students’ aptitude on math achievement depends on teachers’ competence
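Numerically, a cross-level interaction means the slope of the lower-level variable changes with the upper-level variable. In the sketch below, `g10` stands for the slope of aptitude when competence is zero and `g11` for the cross-level interaction coefficient; both values are hypothetical.

```python
def simple_slope(g10, g11, z):
    """Effect of the level-1 predictor for a group whose level-2 predictor equals z."""
    return g10 + g11 * z

# Hypothetical coefficients, for illustration only.
g10, g11 = 0.40, 0.15

# Slope of aptitude on math achievement at two levels of teachers' competence.
slopes_by_competence = {z: simple_slope(g10, g11, z) for z in (3, 4)}
print(slopes_by_competence)
```

With a positive `g11`, aptitude has a stronger effect in classes whose teachers score higher on competence.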
Estimation
• Restricted Maximum Likelihood (REML): only the variance components are included in the likelihood function; regression coefficients are estimated in a second step (variance estimates are less biased)
• Full Maximum Likelihood (FML): both variance components and regression coefficients are included in the likelihood function (variance components are slightly underestimated)
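The difference between the two estimators is easiest to see in the simplest possible case, a single group with only a mean to estimate. There the full maximum likelihood variance divides the sum of squares by n, while the restricted version divides by n - 1 to account for the estimated mean. The sketch uses the class 1 math scores from the example data; it illustrates the principle only and is not the algorithm a multilevel program runs.

```python
import statistics

# Year-1 math scores of class 1 from the example data.
scores = [78, 65, 80, 85]

# Full ML analogue: sum of squares divided by n (biased downward).
ml_variance = statistics.pvariance(scores)

# REML analogue: sum of squares divided by n - 1 (unbiased).
reml_variance = statistics.variance(scores)

print(ml_variance, reml_variance)
```

The gap between the two shrinks as the number of units grows, which is why the choice matters most when the number of groups is small.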
Deviance
• Deviance = $-2 \times$ log-likelihood; the difference in deviance between two nested models follows a $\chi^2$ distribution, so it can be used for model comparison
• The smaller the deviance, the better the fit
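A deviance comparison between two nested models can be sketched as follows. The deviance values are hypothetical, and `chi2_sf` is a small helper written here for integer degrees of freedom (a statistics library would normally supply it). The degrees of freedom equal the number of extra parameters in the larger model.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return q

def deviance_test(deviance_small, deviance_large, extra_params):
    """Return the deviance difference and its chi-square p-value."""
    diff = deviance_small - deviance_large
    return diff, chi2_sf(diff, extra_params)

# Hypothetical deviances: the larger model adds two parameters.
diff, p_value = deviance_test(deviance_small=1250.4, deviance_large=1243.1,
                              extra_params=2)
print(round(diff, 1), round(p_value, 4))
```

A small p-value suggests the extra parameters improve fit beyond what chance would produce.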
Explore HLM Program
• Create MDM
• Specify and run a model
• Interpret parameters in the output
Model Exploration Procedures
1. Start with an intercept-only model (Calculate intraclass correlation)
2. Add 1st level predictors for a fixed model (Test individual slopes)
3. Model intercept by 2nd level predictors (Test significance & amount of variance explained)
4. Random coefficient model (Test variance component of 1st level slopes one by one)
5. Model random slopes predicted by higher level variables (Test significance and amount of variance explained)
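Two of the quantities named in these steps are simple ratios of variance components. The sketch below assumes the components have already been read from program output; the names `tau00` (intercept variance) and `sigma2` (level-1 residual variance) and all values are hypothetical.

```python
def icc_from_components(var_between, var_within):
    """Intraclass correlation from an intercept-only model (step 1)."""
    return var_between / (var_between + var_within)

def variance_explained(var_before, var_after):
    """Proportional reduction in a variance component after adding predictors."""
    return (var_before - var_after) / var_before

# Hypothetical variance components from two fitted models.
null_model = {"tau00": 20.0, "sigma2": 60.0}
with_level2_predictor = {"tau00": 14.0, "sigma2": 60.0}

icc_null = icc_from_components(null_model["tau00"], null_model["sigma2"])
explained_intercept = variance_explained(null_model["tau00"],
                                         with_level2_predictor["tau00"])
print(icc_null, explained_intercept)
```

Here a quarter of the total variance lies between groups, and the 2nd level predictor accounts for 30% of the intercept variance.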
Longitudinal Data
[Figure: y plotted against Time; horizontal axis Time (0, 1, 2, 3, 4), vertical axis y]
Unconditional Growth Model
• 1st level: Occasion
$y = \pi_0 + \pi_1 t + r$
• 2nd level: Person
$\pi_0 = \gamma_{00} + u_0$
$\pi_1 = \gamma_{10} + u_1$
Parameters to Interpret
• Means of Intercept ($\gamma_{00}$) & Slope ($\gamma_{10}$)
• Variances of Intercept ($u_0$) & Slope ($u_1$)
• Covariance/Correlation of Intercept & Slope ($u_0$ & $u_1$)
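These parameters can be approximated descriptively by fitting a straight line to each person's repeated measures and then summarizing the resulting intercepts and slopes. The data below are hypothetical, and this two-step approach is only a rough stand-in for the estimates a multilevel program produces.

```python
import statistics

def ols_line(t, y):
    """Return (intercept, slope) of the least-squares line of y on t."""
    mean_t = sum(t) / len(t)
    mean_y = sum(y) / len(y)
    stt = sum((ti - mean_t) ** 2 for ti in t)
    sty = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    slope = sty / stt
    return mean_y - slope * mean_t, slope

times = [0, 1, 2, 3, 4]
# Hypothetical repeated measures for four persons.
persons = {
    "A": [10, 12, 14, 16, 18],
    "B": [8, 9, 10, 11, 12],
    "C": [12, 15, 18, 21, 24],
    "D": [9, 11, 12, 14, 16],
}

growth = {p: ols_line(times, y) for p, y in persons.items()}
intercepts = [b0 for b0, _ in growth.values()]
slopes = [b1 for _, b1 in growth.values()]

mean_intercept = statistics.mean(intercepts)  # analogue of gamma_00
mean_slope = statistics.mean(slopes)          # analogue of gamma_10
var_intercept = statistics.variance(intercepts)
var_slope = statistics.variance(slopes)

# Sample covariance of person-level intercepts and slopes.
n = len(intercepts)
cov_intercept_slope = sum(
    (a - mean_intercept) * (b - mean_slope)
    for a, b in zip(intercepts, slopes)
) / (n - 1)

print(mean_intercept, mean_slope, cov_intercept_slope)
```

A positive covariance means that persons who start higher also tend to grow faster.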
Extended Model
• Occasion level: time-variant covariate x
$y = \pi_0 + \pi_1 t + \pi_2 x + r$
• Person level: time-invariant covariate z
$\pi_0 = \gamma_{00} + \gamma_{01} z_j + u_0$
$\pi_1 = \gamma_{10} + \gamma_{11} z_j + u_1$
$\pi_2 = \gamma_{20} + \gamma_{21} z_j + u_2$
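Substituting the three person-level equations into the occasion-level equation gives the single-equation form of the extended model. The derivation below takes the coefficient of x to be the third occasion-level coefficient, the one modeled by the last person-level equation; the Greek symbols are the usual notation for the coefficients written with p and g.

```latex
\begin{aligned}
y &= \pi_0 + \pi_1 t + \pi_2 x + r \\
  &= (\gamma_{00} + \gamma_{01} z_j + u_0)
   + (\gamma_{10} + \gamma_{11} z_j + u_1)\, t
   + (\gamma_{20} + \gamma_{21} z_j + u_2)\, x + r \\
  &= \underbrace{\gamma_{00} + \gamma_{01} z_j + \gamma_{10} t + \gamma_{11} z_j t
   + \gamma_{20} x + \gamma_{21} z_j x}_{\text{fixed effects}}
   + \underbrace{u_0 + u_1 t + u_2 x + r}_{\text{random effects}}
\end{aligned}
```

The product terms in z and t, and in z and x, are the cross-level interactions: they let the growth rate and the effect of x differ by the person-level covariate.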
Nonlinear Growth (by Recoding T Variable)
• Linear, t: 0, 1, 2, 3… (or the actual spacing when occasions are unequally spaced, e.g. 0, 1, 2.5, 3.5…)
• Quadratic, t²: 0, 1, 4, 9…
• Logarithmic, ln(t + 1): 0, 0.69, 1.10, 1.39…
• Exponential, exp(t) - 1: 0, 1.72, 6.39, 19.09…
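The listed codes can be reproduced from the occasion numbers 0 to 3. The logarithmic and exponential formulas used here, ln(t + 1) and exp(t) - 1, are inferred from the values on the slide; both shifts keep the first occasion coded as 0 so that the intercept still refers to the starting point.

```python
import math

occasions = [0, 1, 2, 3]

# Each recoding replaces t in the growth model with a transformed time variable.
time_codes = {
    "linear": [float(t) for t in occasions],
    "quadratic": [float(t ** 2) for t in occasions],
    "logarithmic": [round(math.log(t + 1), 2) for t in occasions],
    "exponential": [round(math.exp(t) - 1, 2) for t in occasions],
}

for name, codes in time_codes.items():
    print(name, codes)
```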
Explore HLM Program Chapter 4 Example
• Create MDM
• Specify and run a model
• Interpret parameters in the output
Explore the SAS program
• Identify levels of the variables in the data
• Identify which variables could have main and/or interaction effects
• Identify random coefficients and then their variances in the output
Minimum Sample Size
• Cluster level: > 20
• Individual level: ≥ 1
Obtain Standardized Coefficients
Standardize continuous variables to obtain standardized coefficients
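Standardizing a variable means subtracting its mean and dividing by its standard deviation before the model is run. A minimal sketch, using the SES values from the example data and the overall (not within-class) mean and standard deviation, which is one possible choice:

```python
import statistics

def standardize(values):
    """Convert values to z-scores using the sample mean and standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# SES values of the nine students in the example data.
ses = [70, 56, 63, 75, 80, 81, 83, 82, 85]
z_ses = standardize(ses)
print([round(z, 2) for z in z_ses])
```

After the outcome and the continuous predictors are all transformed this way, the estimated slopes are expressed in standard deviation units.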
Further Topics
• Categorical dependent variables
• Multivariate dependent variables
• Latent variables + mediating effects (multilevel structural equation modeling)
• Power & Sample Size
• ...
Further Resources
• http://gseweb.harvard.edu/~faculty/singer/
• www.ats.ucla.edu/stat/sas/default.htm
• SSRI consultants
• …