ED VUL | UCSD Psychology: 201ab Quantitative methods, Linear Mixed Effects Models


Jan 29, 2021

Transcript
  • 201ab Quantitative methods: Linear Mixed Effects Models

  • Our expanding model landscape

    [Diagram: nested model classes, from special cases to broader generalizations]
    - General linear model: eq. var t-tests, OLS regression, multiple regression, ANOVA, ANCOVA
    - Generalized linear models: logistic regression, Poisson regression
    - Linear mixed models: repeated measures ANOVA, mixed design (split-plot) ANOVA
    - Generalized linear mixed models
    - Hierarchical Bayesian models

    As we saw, and in general, special cases of broader model classes are usually favored despite being less flexible because they are simpler and allow for easier estimation and inference.

    A few of the methods we covered don’t fit into this scheme:
    - Binomial test
    - Pearson’s Chi-square tests
    - Unequal variance t-tests

    There are other generalizations:
    - Multivariate linear model
    - General(ized) additive model

  • Motivating data

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    str(test.data)

    'data.frame': 400 obs. of 6 variables:
     $ student: Factor w/ 20 levels "S.1","S.10","S.11",..: 1 1 1 1 1 1 1 1 ...
     $ sex    : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
     $ test   : Factor w/ 2 levels "post","pre": 2 2 2 2 2 2 2 2 2 2 ...
     $ part   : Factor w/ 10 levels "P.1","P.10","P.2",..: 1 3 4 5 6 7 8 9 ...
     $ section: Factor w/ 2 levels "qualitative",..: 1 1 1 1 1 2 2 2 2 2 ...
     $ score  : num 53 50 79 67 70 68 68 62 65 79 ...

    head(test.data)

          student  sex test part     section score
    S.1       S.1 male  pre  P.1 qualitative    53
    S.1.2     S.1 male  pre  P.2 qualitative    50
    S.1.3     S.1 male  pre  P.3 qualitative    79
    S.1.4     S.1 male  pre  P.4 qualitative    67
    S.1.5     S.1 male  pre  P.5 qualitative    70
    S.1.6     S.1 male  pre  P.6 quantitative   68

  • Motivation

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
    So: 20 (students) x 20 (time x parts) = 400 measurements

    What’s the correlation structure?

    How do you analyze this?
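The crossed correlation structure can be sketched by simulation. This is an illustrative Python sketch (the course materials use R), and the effect SDs used here (10, 9, 5) are invented for illustration, not estimates from test.data. The point: every score shares a student offset with the other 19 scores from that student, and a part offset with the other 39 scores on that part.

```python
import random

random.seed(0)

# Hypothetical simulation of the design: two crossed sources of
# correlated noise (student and part), plus independent residual noise.
students = [f"S.{i}" for i in range(1, 21)]   # 20 students
parts = [f"P.{i}" for i in range(1, 11)]      # 10 parts (5 qual + 5 quant)
tests = ["pre", "post"]

student_eff = {s: random.gauss(0, 10) for s in students}  # assumed SD ~ 10
part_eff = {p: random.gauss(0, 9) for p in parts}         # assumed SD ~ 9

rows = [(s, t, p, 65 + student_eff[s] + part_eff[p] + random.gauss(0, 5))
        for s in students for t in tests for p in parts]

print(len(rows))  # 20 students x (2 tests x 10 parts) = 400 measurements
```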

  • Motivation

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
    So: 20 (students) x 20 (time x parts) = 400 measurements
    1) Do students improve from pre to post?
    2) Do females outperform males?
    3) Is qualitative or quantitative easier?
    4) Is improvement different for males and females? For qual. vs quant.?
    5) Does qual./quant. improve more? Easier for males/females?
    6) Is the male/female difference different for qual/quant? Pre/post?
    7) Does learning [pre vs post] alter any qual/quant disparity between males/females?
    8) Are some parts easier or harder?
    9) Are some parts easier for males/females? Improve more?
    10) Do some parts improve more for males than females?
    11) Do some students do better or worse?
    12) Are some students better at qual/quant? Improve more?
    13) Do some students improve more on qual/quant?

  • Crossed random effects

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
    So: 20 (students) x 20 (time x parts) = 400 measurements
    • Repeated measures ANOVA worldview: two options:
      – Students as the unit of analysis: sex as a “between student” factor; section and time as “within student” factors.
      – Test part as the unit of analysis: section as a “between part” factor; sex and time as “within part” factors.

  • Analysis by pooling (subject)

    summary(aov(data=test.data,
                score ~ sex*test*section + Error(student/(test*section))))

    Error: student
              Df Sum Sq Mean Sq F value Pr(>F)
    sex        1   3570    3570   1.443  0.245
    Residuals 18  44539    2474

    Error: student:test
              Df Sum Sq Mean Sq F value   Pr(>F)
    test       1   0.90    0.90   0.879    0.361
    sex:test   1  33.06   33.06  32.195 2.21e-05 ***
    Residuals 18  18.48    1.03

    Error: student:section
                Df Sum Sq Mean Sq F value   Pr(>F)
    section      1   5048    5048   661.6 1.20e-15 ***
    sex:section  1   1604    1604   210.2 2.27e-11 ***
    Residuals   18    137       8

    Error: student:test:section
                     Df Sum Sq Mean Sq F value   Pr(>F)
    test:section      1 1242.6  1242.6  1238.1  < 2e-16 ***
    sex:test:section  1   61.6    61.6    61.4 3.29e-07 ***
    Residuals        18   18.1     1.0

    Error: Within
               Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 320  31652   98.91

  • Analysis by pooling (item)

    summary(aov(data=test.data,
                score ~ sex*test*section + Error(part/(test*sex))))

    Error: part
              Df Sum Sq Mean Sq F value Pr(>F)
    section    1   5048    5048   1.302  0.287
    Residuals  8  31014    3877

    Error: part:test
                 Df Sum Sq Mean Sq F value   Pr(>F)
    test          1    0.9     0.9   0.046    0.836
    test:section  1 1242.6  1242.6  62.891 4.65e-05 ***
    Residuals     8  158.1    19.8

    Error: part:sex
                Df Sum Sq Mean Sq F value   Pr(>F)
    sex          1   3570    3570  205.83 5.44e-07 ***
    sex:section  1   1604    1604   92.48 1.14e-05 ***
    Residuals    8    139      17

    Error: part:test:sex
                     Df Sum Sq Mean Sq F value   Pr(>F)
    sex:test          1  33.06   33.06   23.53 0.001270 **
    sex:test:section  1  61.62   61.62   43.86 0.000165 ***
    Residuals         8  11.24    1.41

    Error: Within
               Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 360  45042   125.1

  • Analysis by item/subject pooling

    summary(aov(data=test.data, score~sex*test*section + Error(student/(test*section))))

    (By-student aov output as above.)

    summary(aov(data=test.data, score~sex*test*section + Error(part/(test*sex))))

    (By-part aov output as above.)

    Both strategies give us the wrong answer because they neglect one source of covariation.

    Subject analysis (pooling over parts): ignores within-section among-part variation, which is most relevant for assessing differences between sections.

    Item analysis (pooling over students): ignores within-sex among-student variation, which is most relevant for assessing differences between sexes.

  • Analysis by explicit pooling

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    ANOVA “by student” (disregarding the part random effect) and ANOVA “by part” (disregarding the student random effect) give conflicting results.

    This is because each respects some of the correlation structure (due to students, or due to parts) but not the other.

  • Crossed random effects

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
    So: 20 (students) x 20 (time x parts) = 400 measurements
    • Repeated measures ANOVA worldview: two options:
      – Students as the unit of analysis: sex as a “between student” factor; section and time as “within student” factors.
      – Test part as the unit of analysis: section as a “between part” factor; sex and time as “within part” factors.

    • Both of these omit the other source of correlated noise!

  • We want both

    summary(aov(data=test.data, score~sex*test*section + Error(student/(test*section))))

    summary(aov(data=test.data, score~sex*test*section + Error(part/(test*sex))))

    But we can’t do this:
    summary(aov(data=test.data,
                score ~ sex*test*section + Error(student/(test*section)) + Error(part/(test*sex))))

    Need a more flexible way to specify correlation structure.

  • lme4::lmer

    install.packages('lme4')

    Syntax:
    - Specify the various random effects yourself, manually!
    - (thing that varies randomly | grouping variable it varies over)
    - E.g., (1|student): the intercept varies randomly with student;
      (section|student): the section effect varies across students.

    m = lmer(data=test.data, score ~ sex*test*section +
             (1|student) + (1|student:test) + (1|student:section) + (1|student:test:section) +
             (1|part) + (1|part:test) + (1|part:sex) + (1|part:test:sex) + (1|student:part))

  • Random vs fixed effects: lay theory

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
    So: 20 (students) x 20 (time x parts) = 400 measurements
    (Questions 1–13 as listed above.)

  • Random vs fixed effects: lay theory

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
    So: 20 (students) x 20 (time x parts) = 400 measurements

    Fixed effects: pre/post, male/female, qual/quant.
    Factor levels are of general relevance. We care about offsets for specific levels.

    Random effects: student, part.
    Factor levels are specific to our study. They are random samples of possible levels that might occur in the world. Maybe we care about variance across levels in general, but not the actual offsets for specific levels.

  • Random vs fixed effects: lay theory

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
    So: 20 (students) x 20 (time x parts) = 400 measurements
    (Questions 1–13 as listed above.)

    We can classify these kinds of questions as:
    - Fixed main effects.
    - Fixed 2-way interactions.
    - Fixed 3-way interactions.
    - Random main effects.
    - Random 2-way interactions.
    - Random 3-way interactions.

  • Complete / No / Partial Pooling

    • Complete pooling:  y_i = α + β·x_i + e_i
      We estimate an overall mean starting weight, and an overall weight/time slope. These are completely pooled over all people, disregarding individual differences.

    • No pooling:  y_i = α_j[i] + β·x_i + e_i,  α_j = anything
      We estimate an overall weight/time slope (complete pooling for slope), and individual-specific starting weights with no constraints on their relationship (no pooling for intercept).

    • Partial pooling:  y_i = α_j[i] + β·x_i + e_i,  α_j ~ Normal(γ, σ_α)
      We estimate an overall weight/time slope (complete pooling for slope), and individual-specific starting weights with the understanding that they are somewhat similar, and thus are modeled as Normally distributed around some common mean (partial pooling for intercept).

  • Complete / No / Partial Pooling

    • Complete pooling:  y_i = α + β·x_i + e_i
      Consequently: we get one intercept.

    • No pooling:  y_i = α_j[i] + β·x_i + e_i,  α_j = anything
      Consequently: we get lots of intercepts that may differ a lot and be noisy.

    • Partial pooling:  y_i = α_j[i] + β·x_i + e_i,  α_j ~ Normal(γ, σ_α)
      Consequently: we get lots of intercepts, but they will be drawn toward the overall mean by an amount that scales with our uncertainty about them.
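The "drawn toward the overall mean by an amount that scales with our uncertainty" idea has a standard closed form: the partially pooled intercept is a precision-weighted average of the group's raw mean and the overall mean (the form given in Gelman & Hill). A minimal Python sketch; the course uses R, and all numbers below are invented for illustration.

```python
# Precision-weighted partial pooling for a group intercept alpha_j:
# weight the group's raw mean by its data precision (n_j / sigma^2)
# and the grand mean by the population precision (1 / sigma_alpha^2).
def partial_pool(group_mean, n_j, sigma2, grand_mean, sigma2_alpha):
    w_data = n_j / sigma2
    w_pop = 1.0 / sigma2_alpha
    return (w_data * group_mean + w_pop * grand_mean) / (w_data + w_pop)

grand = 70.0          # overall mean (gamma) -- made-up number
sigma2 = 100.0        # within-group (residual) variance -- made-up
sigma2_alpha = 25.0   # between-group variance -- made-up

# Two groups with the same raw mean (80) but different sample sizes:
big = partial_pool(group_mean=80.0, n_j=50, sigma2=sigma2,
                   grand_mean=grand, sigma2_alpha=sigma2_alpha)
small = partial_pool(group_mean=80.0, n_j=2, sigma2=sigma2,
                     grand_mean=grand, sigma2_alpha=sigma2_alpha)
print(round(big, 2), round(small, 2))  # the small-n group shrinks much more
```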

  • Complete / No / Partial Pooling

    Gelman’s example: our estimates of coefficients for counties with little data are noisy. In the no pooling analysis these counties look extreme because of the noise. In the partial pooling analysis, this noisiness yields uncertainty, and coefficients are adjusted to look more like the overall mean.

    [Figure: estimated county coefficients plotted against sample size per county, under no pooling and partial pooling.]

  • Random vs fixed effects: formal

    Random effect coefficients are thought to come from some distribution. Their variation is treated as noise. They are ‘partially pooled’, so that they ‘shrink’ toward the average (0).

    Fixed effects are parameters. There is no distribution for them. They are not pooled at all. They do not “shrink”.

  • Back to the design we couldn’t handle

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    m = lmer(data=test.data, score ~ sex*test*section + (1|student), REML = F)

    Linear mixed model fit by maximum likelihood ['lmerMod']
    Formula: score ~ sex * test * section + (1 | student)
       Data: test.data
         AIC      BIC    logLik deviance df.resid
    2991.907 3031.822 -1485.953 2971.907      390
    Random effects:
     Groups   Name        Std.Dev.
     student  (Intercept) 10.352
     Residual              9.152
    Number of obs: 400, groups: student, 20
    Fixed Effects:
                            (Intercept)                              sexmale
                                  69.18                                -2.18
                                testpre                  sectionquantitative
                                  -3.64                                 6.80
                        sexmale:testpre          sexmale:sectionquantitative
                                   0.42                                -6.44
            testpre:sectionquantitative  sexmale:testpre:sectionquantitative
                                   8.62                                -3.14

    Here we allow the intercept to vary with student: +(1|student)

  • Back to the design we couldn’t handle

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    m = lmer(data=test.data, score ~ sex*test*section +
             (1|student) + (1|student:test) + (1|student:section) + (1|student:test:section))
    This is the full “student” random effect structure.

    m = lmer(data=test.data, score ~ sex*test*section +
             (1|part) + (1|part:test) + (1|part:sex) + (1|part:test:sex))
    This is the full “part” random effect structure.

    m = lmer(data=test.data, score ~ sex*test*section +
             (1|student) + (1|student:test) + (1|student:section) + (1|student:test:section) +
             (1|part) + (1|part:test) + (1|part:sex) + (1|part:test:sex) + (1|student:part))
    This is the full random effect structure of the design.

  • Linear Mixed Effects Analysis

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    m0 = lmer(data=test.data, score ~ 1
              + (1|student) + (1|part) + (1|student:test) + (1|student:section)
              + (1|part:test) + (1|part:sex) + (1|part:sex:test)
              + (1|student:test:section) + (1|student:part), REML = FALSE)

    This is the exhaustive list of random effects that might exist in our data.

    This exhaustive strategy makes sense for such a factorial design with random intercepts capturing sources of correlated error in our data – to get the by-item and by-subject ANOVAs together, but it’s not something you would do with complex real-world data with variable slopes, etc. In those cases the usually advised procedure is to start with the fixed effects, and add random effects so long as they improve the model.

  • Linear Mixed Effects Analysis

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    m0 = lmer(data=test.data, score ~ 1
              + (1|student) + (1|part) + (1|student:test) + (1|student:section)
              + (1|part:test) + (1|part:sex) + (1|part:sex:test)
              + (1|student:test:section) + (1|student:part), REML = FALSE)
    This is a null model that includes only the random effects, no fixed effects of interest.

    mF = lmer(data=test.data, score ~ sex*test*section
              + (1|student) + (1|part) + (1|student:test) + (1|student:section)
              + (1|part:test) + (1|part:sex) + (1|part:sex:test)
              + (1|student:test:section) + (1|student:part), REML = FALSE)
    This is the full model including all the fixed effects.

    anova(m0,mF)

        Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
    m0  11 1494.4 1538.3 -736.21   1472.4
    mF  18 1440.4 1512.3 -702.22   1404.4 67.977      7  3.783e-12 ***

    The full model is significantly better than null, so something is going on with our fixed effects.
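The LRT behind that anova() call is just the difference in deviances referred to a chi-squared distribution with df equal to the number of extra parameters (18 - 11 = 7). A Python sketch of the arithmetic (the course does this in R via anova()); the chi-squared survival function for odd integer df has a closed form via the complementary error function.

```python
from math import erfc, exp, gamma, sqrt

def chi2_sf(x, k):
    """Survival function P(X > x) for a chi-squared variable with
    odd integer df k, via the closed form built on erfc."""
    assert k % 2 == 1
    tail = erfc(sqrt(x / 2))
    tail += exp(-x / 2) * sum((x / 2) ** (j - 0.5) / gamma(j + 0.5)
                              for j in range(1, (k - 1) // 2 + 1))
    return tail

# Deviances reported by anova(m0, mF): 1472.4 vs 1404.4,
# with 18 - 11 = 7 extra parameters in the full model.
chisq = 1472.4 - 1404.4   # ~68, matching the reported Chisq up to rounding
p = chi2_sf(67.977, 7)    # using the table's (unrounded) Chisq value
print(chisq, p)           # p ~ 3.78e-12, as in the table
```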

  • Interpreting random effect SDs

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    summary(m0)

    Random effects:
     Groups               Name  Std.Dev.
     student:part         (Int)  0.88364
     student:test:section (Int)  0.36681
     part:sex:test        (Int)  0.95617
     student:section      (Int)  0.71859
     student:test         (Int)  0.01777
     part:sex             (Int)  2.94028
     part:test            (Int)  2.54545
     student              (Int) 11.08337
     part                 (Int)  9.27407
     Residual                    0.60322

    Let’s look at the random effects (errors) to see what happened.

    These are partitioning all the variation of our data into independent sources that add up to yield the full covariance. The SDs tell us how much variation is attributable to each source.

    Can I test if these reflect significantly non-zero variance?
    - Why do it? They reflect the structure of your experiment, not really something worth testing.
    - Better to just compare the amount of variation in one source to another (e.g., students vs residuals).
    But I really want to!
    - You could drop the one you are interested in from the model, and see if a model that includes it is better (AIC, or LRT).
    - You could extract coefficients with ranef(), their standard errors with se.ranef(), then calculate X = sum((ranef/se.ranef)^2) and compare to a chi-squared distribution... not really common.
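Because the sources are modeled as independent, their variances (squared SDs) add up to the total variance implied by the model, so comparing sources is just comparing squared SDs. A quick Python check on the Std.Dev. column from summary(m0) above (Python rather than the course's R):

```python
from math import sqrt

# Std.Dev. values from summary(m0); independent components add on the
# variance scale, so the implied total SD is the root-sum-of-squares.
sds = {
    "student:part": 0.88364, "student:test:section": 0.36681,
    "part:sex:test": 0.95617, "student:section": 0.71859,
    "student:test": 0.01777, "part:sex": 2.94028, "part:test": 2.54545,
    "student": 11.08337, "part": 9.27407, "Residual": 0.60322,
}
total_sd = sqrt(sum(sd ** 2 for sd in sds.values()))

# Share of total variance attributable to each source:
shares = {k: sd ** 2 / total_sd ** 2 for k, sd in sds.items()}
print(round(total_sd, 2))                 # ~15.1 overall
print(round(shares["student"], 2),        # student and part dominate
      round(shares["part"], 2))
```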

  • Interpreting random effect SDs

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    summary(m0): random effect SDs as on the previous slide.

    summary(mF)

    Random effects:
     Groups               Name  Std.Dev.
     student:part         (Int)  0.88375
     student:test:section (Int)  0.34807
     part:sex:test        (Int)  0.28981
     student:section      (Int)  0.70360
     student:test         (Int)  0.04714
     part:sex             (Int)  0.80917
     part:test            (Int)  0.85788
     student              (Int) 10.69397
     part                 (Int)  8.95206
     Residual                    0.60474

    Let’s look at the random effect (error) SDs to see what happened.

    SDs reduced by a large proportion tell us that fixed effects explained that variation: part:test, part:sex, and part:sex:test.

    Which fixed effects could account for this reduction?

  • Random effect SDs and residuals

    summary(aov(data=test.data, score ~ sex*test*section + Error(student/(test*section))))

    (By-student aov output as above.)

    summary(aov(data=test.data, score~sex*test*section + Error(part/(test*sex))))

    (By-part aov output as above.)

    These give us estimates roughly comparable to the random effects:
    SD(students) = 10.7 ≈ sqrt(2474/20) = 11.12 ≈ sqrt(125.1) = 11.18
    SD(parts) = 8.95 ≈ sqrt(3877/40) = 9.84 ≈ sqrt(98.91) = 9.94
    However, even these are higher in aov, because in the full mixed model we have more error terms to partition the variance, and partial pooling draws estimates closer together. This is even more true for the other residual terms.
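The arithmetic behind those rough comparisons can be checked directly. Each student mean averages over 20 scores (2 tests x 10 parts) and each part mean over 40 scores (2 tests x 20 students), which is where the divisors come from. A small Python check (the course works in R):

```python
from math import sqrt

# Rough correspondence between aov mean squares and the lmer SDs:
sd_students_aov = sqrt(2474 / 20)   # student-stratum MS, by-student aov
sd_students_item = sqrt(125.1)      # residual MS, by-item aov
sd_parts_aov = sqrt(3877 / 40)      # part-stratum MS, by-part aov
sd_parts_subj = sqrt(98.91)         # residual MS, by-subject aov

# Both pairs land near the lmer estimates (student ~10.7, part ~8.95),
# though the aov-based values are a bit higher.
print(round(sd_students_aov, 2), round(sd_students_item, 2))
print(round(sd_parts_aov, 2), round(sd_parts_subj, 2))
```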

    Subject analysis (pooling over parts): ignores within-section among-part variation, which is most relevant for assessing differences between sections.

    Item analysis (pooling over students): ignores within-sex among-student variation, which is most relevant for assessing differences between sexes.

  • Interpreting fixed effects

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

    Caution! It’s contentious whether it is sensible to get p values from those t values!

    In general, we assess significance by either:
    - Dropping a coefficient from the full model, and comparing it to the full model via LRT (makes sense in cases that are not a factorial experiment).
    - Adopting a sequential ANOVA-like procedure of adding coefficients in sequence, and assessing whether they improve the model.

    summary(mF)

    Fixed effects:
                                        Estimate Std. Error t value
    (Intercept)                          69.1800     5.2767  13.110
    sexmale                              -2.1800     4.8309  -0.451
    testpre                              -3.6400     0.6060  -6.006
    sectionquantitative                   6.8000     5.7284   1.187
    sexmale:testpre                       0.4200     0.3818   1.100
    sexmale:sectionquantitative          -6.4400     0.9640  -6.681
    testpre:sectionquantitative           8.6200     0.8565  10.064
    sexmale:testpre:sectionquantitative  -3.1400     0.5384  -5.833

    These are the various fixed effect coefficients. In this case, because we have a 2x2x2 fixed design, we have one coefficient per main-effect/interaction.

    What do these coefficients mean?
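One way to read them: with R's default treatment (dummy) coding, the intercept is the baseline cell (female, post, qualitative: the first level of each factor), and each coefficient is an offset that switches on when all of its named levels apply. A Python sketch reconstructing predicted cell means from the coefficients above:

```python
# Fixed-effect estimates from summary(mF) (treatment/dummy coding).
coef = {
    "(Intercept)": 69.18, "sexmale": -2.18, "testpre": -3.64,
    "sectionquantitative": 6.80, "sexmale:testpre": 0.42,
    "sexmale:sectionquantitative": -6.44,
    "testpre:sectionquantitative": 8.62,
    "sexmale:testpre:sectionquantitative": -3.14,
}

def cell_mean(male, pre, quant):
    """Predicted cell mean: sum every coefficient whose dummies are all on."""
    on = {"sexmale": male, "testpre": pre, "sectionquantitative": quant}
    total = coef["(Intercept)"]
    for name, beta in coef.items():
        if name != "(Intercept)" and all(on[t] for t in name.split(":")):
            total += beta
    return total

print(cell_mean(False, False, False))         # baseline cell = 69.18
print(round(cell_mean(True, True, True), 2))  # male/pre/quant = 69.62
```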

  • A sequential ANOVA approach

    m0        = lmer(data=test.data, score ~ 1 + (1|student) + (1|part) + (1|student:test)
                     + (1|student:section) + (1|part:test) + (1|part:sex) + (1|part:sex:test)
                     + (1|student:test:section) + (1|student:part), REML=FALSE)
    m0.S      = lmer(data=test.data, score ~ sex + (1|student) + (1|part) + (1|student:test)
                     + (1|student:section) + (1|part:test) + (1|part:sex) + (1|part:sex:test)
                     + (1|student:test:section) + (1|student:part), REML=FALSE)
    m0.ST     = lmer(data=test.data, score ~ sex + test + (1|student) + (1|part) + (1|student:test)
                     + (1|student:section) + (1|part:test) + (1|part:sex) + (1|part:sex:test)
                     + (1|student:test:section) + (1|student:part), REML=FALSE)
    m0.STS    = lmer(data=test.data, score ~ sex + test + section + (1|student) + (1|part)
                     + (1|student:test) + (1|student:section) + (1|part:test) + (1|part:sex)
                     + (1|part:sex:test) + (1|student:test:section) + (1|student:part), REML=FALSE)
    m0.STS.I1 = lmer(data=test.data, score ~ sex + test + section + test:section + (1|student)
                     + (1|part) + (1|student:test) + (1|student:section) + (1|part:test) + (1|part:sex)
                     + (1|part:sex:test) + (1|student:test:section) + (1|student:part), REML=FALSE)
    m0.STS.I2 = lmer(data=test.data, score ~ sex + test + section + test:section + sex:test
                     + (1|student) + (1|part) + (1|student:test) + (1|student:section) + (1|part:test)
                     + (1|part:sex) + (1|part:sex:test) + (1|student:test:section) + (1|student:part), REML=FALSE)
    m0.STS.I3 = lmer(data=test.data, score ~ sex + test + section + test:section + sex:test + sex:section
                     + (1|student) + (1|part) + (1|student:test) + (1|student:section) + (1|part:test)
                     + (1|part:sex) + (1|part:sex:test) + (1|student:test:section) + (1|student:part), REML=FALSE)
    mF        = lmer(data=test.data, score ~ sex + test + section + sex:test + sex:section + test:section
                     + sex:test:section + (1|student) + (1|part) + (1|student:test) + (1|student:section)
                     + (1|part:test) + (1|part:sex) + (1|part:sex:test) + (1|student:test:section)
                     + (1|student:part), REML=FALSE)

              Df    AIC    BIC  logLik deviance   Chisq Chi Df Pr(>Chisq)
    m0        11 1494.4 1538.3 -736.21   1472.4
    m0.S      12 1496.4 1544.3 -736.20   1472.4  0.0065      1    0.93594
    m0.ST     13 1497.0 1548.9 -735.51   1471.0  1.3915      1    0.23815
    m0.STS    14 1497.6 1553.5 -734.78   1469.6  1.4457      1    0.22922
    m0.STS.I1 15 1477.9 1537.7 -723.93   1447.9 21.7012      1  3.186e-06 ***
    m0.STS.I2 16 1476.0 1539.8 -721.98   1444.0  3.9105      1    0.04799 *
    m0.STS.I3 17 1455.6 1523.5 -710.82   1421.6 22.3145      1  2.315e-06 ***
    mF        18 1440.4 1512.3 -702.22   1404.4 17.2073      1  3.351e-05 ***

    So we made models of ever-increasing complexity, and did sequential nested comparisons.This is the sort of thing that the ANOVA command does for us for linear models.

    This is not particularly advisable in LMM analyses that are more naturalistic, less factorial, may involve variable slopes, etc. In those cases it is generally advisable to start with fixed effects, then add random effects, with the goal of building a good model (not ascertaining significance!)

    (Model sequence: 0, + sex, + test, + section, + test:section, + sex:test, + sex:section, + sex:test:section.)

  • Mixed model vs subject/item

    summary(aov(data=test.data, score ~ sex*test*section + Error(student/(test*section))))

    (By-student aov output as above.)

    summary(aov(data=test.data, score~sex*test*section + Error(part/(test*sex))))

    (By-part aov output as above.)

    (Sequential lmer model-comparison table as above.)

    The mixed model gives us the “worst” of both the by-subject and by-item analyses: for each effect it matches the more conservative of the two, by appropriately recognizing that here part-variance is relevant for section, and subject-variance is relevant for sex.


  • Crossed random effects

    • Taking into account only one random effect (separating subject and item analyses) gives us incorrect answers.
    • We need to take both random effects into account jointly.
    • lmer() allows us to add many random effects and their interactions.
    • For such factorial designs, I favor a sequential model-building comparison:
      – Include all random effects from the design.
      – Add explanatory variables sequentially, and do nested model comparison via LRT in anova(), like regular Type I ANOVA.

  • What kind of random effect structure?

    M = lmer(data=test.data, score ~ sex*test*section +
             (1|student) + (1|part) + (1|student:test) + (1|student:section) +
             (1|part:test) + (1|part:sex) + (1|part:sex:test) +
             (1|student:test:section) + (1|student:part))
    This is a dummy coded random effect error structure. It will only work for categorical factors. It makes a few assumptions that may be violated. It is most similar to what the ‘repeated measures’ aov() command does. It will be anti-conservative in a few particular cases with odd covariances for the subject:X interactions.

    M = lmer(data=test.data, score ~ sex*test*section +
             (test*section|student) + (sex*test|part) + (1|student:part))
    This is the full “varying intercept and varying slope” random effect structure. Folks who think deeply about these things believe this to be the most generally effective specification. It uses quite a few more parameters to capture the (more flexible) random effect covariance structure; it estimates more parameters and is sometimes harder to fit.

    In practice: you may run into estimation problems before you’ve managed to add the full random effect structure. You probably should avoid such highly factorial, crossed, structured designs. (If you keep the design simple, the latter specification is better.)


    Other factors calling for LMM
    • Repeated measures designs that are unbalanced (e.g., due to missing data).
    • Need to:
      – Account for lower-level variation when estimating higher-level coefficients.
      – Model variation of lower-level coefficients between groups.
      – Estimate coefficients for specific groups that might have little data by partially pooling across all groups.
      – Make predictions at different grouping levels.
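    As a sketch of those last two points, lme4's standard accessors expose the partially pooled group-level estimates and predictions at different levels (the model formula here is a simplified stand-in, not the slides' full specification):

    library(lme4)
    # Simplified stand-in model: crossed random intercepts only.
    M <- lmer(score ~ sex * test * section + (1|student) + (1|part),
              data = test.data)
    ranef(M)$student   # partially pooled deviations for each student
    coef(M)$student    # per-student coefficients (fixed effects + deviations)
    # Predictions at different grouping levels:
    predict(M)                 # conditional on the fitted random effects
    predict(M, re.form = NA)   # population-level (random effects set to zero)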


    Data structure decomposition

    Generic single-table schema: s.id | A[s.id] | B[s.id] | ... | C1[s.id] | ... | Y[s.id],
    with estimated effects eA[A], eB[B], eAB[A,B], ..., eC, ...

    For example:

        s.id       SEX[s.id]  MAJOR[s.id]  IQ[s.id]  SCORE[s.id]
        JohnDoe    m          psychology   110       60
        JaneSmith  f          business     125       75
        SueChang   f          psychology   105       40
        ...

    with estimated effects:

        eSEX[]:   m = -1, f = +1
        eMAJOR[]: psychology = -1, business = +3, cogsci = -0.5, linguistics = -1.5
        eIQ:      0.13


    Single table decomposition: LM

    Schema: s.id | A[s.id] | B[s.id] | ... | C1[s.id] | ... | Y[s.id]
    Estimated effects: eA[A], eB[B], eAB[A,B], ..., eC, ...

    • Every t-test/ANOVA/ANCOVA (LM/GLM) data structure can be reduced to this single-table "3rd normal form".
    • Each measurement (Y) is uniquely associated with a particular subject/unit.
    • Each subject may have infinitely many other properties (like factor level category, covariate values, etc.)


    Repeated and mixed-design ANOVA

    Subject table: s.id | sA[s.id] | sB[s.id] | ... | sC1[s.id] | ...
      Estimated effects: eSA[A], eSB[B], eSAB[A,B], ..., eSC1, ...
    Measurement table: m.id | s.id[m.id] | Q[m.id] | R[m.id] | ... | W[m.id] | Y[m.id]
      Estimated effects: eMQ[Q], eMR[R], eQR[Q,R], ..., eSW, ...

    • Data structures for repeated measures and mixed-design ANOVA can be reduced to such a 2-table form*
    • Every measurement is associated with a particular subject, and might have some measurement categorization scheme (within-subject factors).
    • In the course of factoring into error strata, we effectively estimate lots of "random effects": Es[s.id], EQs[s.id,Q], ...


    Predictors at many levels

    Schools (design crossing Private/Public with East Coast/West Coast), e.g.:

        SchoolA, Private, East, $1000/student
        SchoolB, Public, East, $2500/student
        ...

    Classes in Schools (design crossing Textbook1/Textbook2 with Written Exam/Oral Exam), e.g.:

        SchoolB, Class.B.1, Textbook1, Written, 24 students
        SchoolB, Class.B.2, Textbook1, Oral, 32 students
        ...

    Students in Classes (e.g., the students in Class.B.1).


    Predictors at many levels

    We can manage this design by partitioning variance into strata and proceeding with a mixed-design ANOVA. However, repeated measures ANOVA can't handle all the possible structures we may need to deal with.

    Mixed-design ANOVA will fail us if we…
    - add explanatory factors to students
    - consider crossed random effects (substitute teachers)
    - add other levels (districts; …, teachers within schools, classes within teachers, …; etc.)
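    To sketch what handling extra levels looks like in lme4 (all variable and data names below are invented for illustration; they are not from the slides):

    library(lme4)
    # Hypothetical three-level design: classes within schools within districts,
    # with explanatory variables at each level.
    M <- lmer(score ~ funding + coast * private +    # school-level predictors
                textbook * exam.type +               # class-level predictors
                (1 | district) +                     # district-level variation
                (1 | district:school) +              # school-level variation
                (1 | district:school:class),         # class-level variation
              data = school.data)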


    Something we can't do…

    More than two explanatory levels. E.g., students (with explanatory variables), in schools (with explanatory variables), in districts (with explanatory variables):

    District table: d.id | dA[d.id] | dB[d.id] | ...
      Estimated effects: eDA[A], eDB[B], eDAB[A,B], ...
    Student table: s.id | d.id[s.id] | sG[s.id] | sH[s.id] | ...
      Estimated effects: eSG[G], eSH[H], eSGH[G,H], ...
    Measurement table: m.id | s.id[m.id] | Q[m.id] | R[m.id] | ... | W[m.id] | Y[m.id]
      Estimated effects: eMQ[Q], eMR[R], eQR[Q,R], ..., eSW, ...


    Something we can't do…

    Crossed "random effects". E.g., each measurement is associated with a particular student (with explanatory variables), and a particular item (with explanatory variables):

    Student table: s.id | sG[s.id] | sH[s.id] | ...
      Estimated effects: eSG[G], eSH[H], eSGH[G,H], ...
    Item table: i.id | iA[i.id] | iB[i.id] | ...
      Estimated effects: eIA[A], eIB[B], eIAB[A,B], ...
    Measurement table: m.id | s.id[m.id] | i.id[m.id] | Q[m.id] | ... | Y[m.id]
      Estimated effects: eMQ[Q], eMR[R], eQR[Q,R], ..., eSW, ...


    Data structures in need of LMM

    Normal LM/GLM structure: each unit is uniquely associated with a measurement.

    Repeated measures structure: explanatory variables at only two levels (e.g., "between subject" and "within subject").

    More levels! Explanatory variables at more than two levels (e.g., classes in schools in districts).

    Crossed random effects! (e.g., subjects with explanatory variables crossed with items with explanatory variables)

    Stuff we can't do appropriately without mixed-effect models.
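    For the crossed case, a minimal lme4 sketch adds one random intercept per crossed random factor, here using the motivating test.data from these slides (the slides' fuller treatments add many more random terms):

    library(lme4)
    # Minimal crossed random effects: students and parts (items) are crossed,
    # and each contributes its own random intercept.
    M <- lmer(score ~ sex * test * section + (1|student) + (1|part),
              data = test.data)
    summary(M)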


    Motivation

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section. So: 20 (students) x 20 (time x parts) = 400 measurements.

    Student table:     s.id | SEX[s.id]
    Test table:        t.id | s.id[t.id] | TIME[t.id]
    Part table:        p.id | SECTION[p.id]
    Measurement table: m.id | t.id[m.id] | p.id[m.id] | SCORE[m.id]


    Motivation

    We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section. So: 20 (students) x 20 (time x parts) = 400 measurements.

    [Figure: the 400 measurements as a grid of students (female | male) by tests (pre | post) by parts (grouped into qualitative and quantitative sections).]