
Lukas Meier (most material based on lecture notes and slides from H.R. Roth)

Repeated Measures ANOVA (RM ANOVA) and Mixed Effects Models

Repeated Measures ANOVA (RM ANOVA)

Now we want to model everything in “one go”.

We use (multiple) ANOVA approaches: the split-plot model and mixed effects models.

RM ANOVA: Growth Curves

We have three factors: sex (2 levels), age (4 levels), person (27 levels).

We treat age as a categorical variable. This gives us maximal flexibility as we

do not have to care about the functional form of the age effect.

We set up a model of the form

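$$Y_{ijk} = \mu + \alpha_i + \delta_{j(i)} + \beta_k + (\alpha\beta)_{ik} + \varepsilon_{ijk}$$

$i$: sex, $j$: person, $k$: time-point

$Y_{ijk}$: distance

$\alpha_i$: effect of sex

$\delta_{j(i)}$: deviation (or error) of person $j$

$\beta_k$: effect of time-point $k$

$(\alpha\beta)_{ik}$: interaction time × sex

[Figure: profiles per person over age (8, 10, 12, 14); panels: boy, girl.]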

Person is a block factor.

We cannot have both sex and person as fixed effects in the model because

this model would not be identifiable anymore.

Here it seems quite natural to treat person as a so-called random effect (better: think of a random error per person). That is, we assume $\delta_{j(i)} \sim N(0, \sigma_\delta^2)$.

This is nothing else than the split-plot model that some have seen in the

ANOVA class. It is not a split-plot design, because age was not randomized!

We have whole plot = person, split plot = age.

[Figure: whole plots Boy 1, Girl 1, Boy 2, Girl 2, each split into age levels; the age levels are marked “not randomized”.]

We therefore have a so-called mixed effects model (containing random and fixed effects).

We can fit this in R with the lmer function in package lmerTest.

Note that the denominator degrees of freedom for sex are only 25, as we only have 27 observations on the whole-plot level (persons!).

You can think of doing a two-sample 𝑡-test with two groups having 16 and 11

observations, respectively: 25 = 16 + 11 – 2.

Going back to the original questions:

Are the profiles of girls and boys parallel? → check interaction

Does dental distance change over time? → check effect of age

Is the average level of boys and girls the same or not? → check effect of sex

RM ANOVA: Induced Correlation Structure

Let us have a closer look at the model

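$$Y_{ijk} = \mu + \alpha_i + \delta_{j(i)} + \beta_k + (\alpha\beta)_{ik} + \varepsilon_{ijk}$$

with $\delta_{j(i)} \sim N(0, \sigma_\delta^2)$ and $\varepsilon_{ijk} \sim N(0, \sigma^2)$.

We get

$$\mathrm{Var}(Y_{ijk}) = \sigma_\delta^2 + \sigma^2$$

$$\mathrm{Cor}(Y_{ijk}, Y_{ijk^*}) = \sigma_\delta^2 / (\sigma_\delta^2 + \sigma^2) \quad \text{if } k \neq k^* \text{ (same person but different time-points)}$$

$$\mathrm{Cor}(Y_{ijk}, Y_{i^*j^*k^*}) = 0 \quad \text{if } i \neq i^* \text{ or } j \neq j^* \text{ (different persons)}$$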

Observations of the same person are correlated. The correlation is constant,

i.e., does not depend on how far away the time-points lie (this is not always a

meaningful assumption for growth curves!).

Observations from different persons are independent.
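The variance and the within-person correlation can be derived in a few lines. The sketch below assumes, as is usual for this model although the slide does not state it explicitly, that the person effect and all error terms are mutually independent:

$$
\begin{aligned}
\mathrm{Var}(Y_{ijk}) &= \mathrm{Var}(\delta_{j(i)}) + \mathrm{Var}(\varepsilon_{ijk}) = \sigma_\delta^2 + \sigma^2, \\
\mathrm{Cov}(Y_{ijk}, Y_{ijk^*}) &= \mathrm{Cov}(\delta_{j(i)} + \varepsilon_{ijk},\; \delta_{j(i)} + \varepsilon_{ijk^*}) = \mathrm{Var}(\delta_{j(i)}) = \sigma_\delta^2 \quad (k \neq k^*), \\
\mathrm{Cor}(Y_{ijk}, Y_{ijk^*}) &= \frac{\sigma_\delta^2}{\sigma_\delta^2 + \sigma^2}.
\end{aligned}
$$

The fixed effects drop out because they are constants, and two observations of the same person share only the term $\delta_{j(i)}$.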

RM ANOVA: Induced Correlation Structure

Hence, we have the following correlation matrix per person

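$$\begin{pmatrix} 1 & \rho & \rho & \rho \\ \rho & 1 & \rho & \rho \\ \rho & \rho & 1 & \rho \\ \rho & \rho & \rho & 1 \end{pmatrix}$$

where $\rho = \sigma_\delta^2 / (\sigma_\delta^2 + \sigma^2)$ is the so-called intra-class correlation.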

This correlation structure is called compound symmetry.
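As a small numerical illustration, the following Python sketch builds the compound symmetry correlation matrix from the two variance components. The function name and the variance values are hypothetical and chosen only for illustration; they are not estimates from the growth data.

```python
def compound_symmetry_corr(var_person, var_resid, n_times):
    """Correlation matrix implied by one random error per person.

    var_person: variance of the person effect (sigma_delta^2)
    var_resid:  residual variance (sigma^2)
    """
    # Intra-class correlation rho = sigma_delta^2 / (sigma_delta^2 + sigma^2)
    icc = var_person / (var_person + var_resid)
    return [[1.0 if r == c else icc for c in range(n_times)]
            for r in range(n_times)]


# Hypothetical variance components (not estimated from any data).
corr = compound_symmetry_corr(var_person=3.0, var_resid=1.0, n_times=4)
for row in corr:
    print(row)
```

The larger the person-to-person variance relative to the residual variance, the closer the intra-class correlation is to 1.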

Here, the empirical correlation matrix is

RM ANOVA: Greenhouse-Geisser / Huynh-Feldt Epsilon

It is not uncommon that repeated measures data violate the compound

symmetry assumption.

There are measures which describe the deviation from the compound

symmetry model.

Greenhouse-Geisser epsilon: $\varepsilon_{GG}$ (rather conservative)

Huynh-Feldt epsilon: $\varepsilon_{HF}$

We have $\varepsilon_{GG} \le \varepsilon_{HF} \le 1$, where “= 1” means no deviation.

The correction is performed by multiplying both the numerator and the denominator degrees of freedom of the $F$-distribution by $\varepsilon_{GG}$ (or $\varepsilon_{HF}$).

Program “by hand” or use function Anova in package car.

This only affects within-subjects factors!
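To make the “by hand” route more concrete, here is a minimal Python sketch of the Greenhouse-Geisser epsilon. It uses the common definition via the double-centred covariance matrix, $\varepsilon = \mathrm{tr}(\tilde S)^2 / ((k-1)\,\mathrm{tr}(\tilde S^2))$. In practice the formula is applied to the sample covariance matrix of the repeated measurements; the two matrices and the degrees of freedom below are made-up inputs, and the function names are hypothetical. The Huynh-Feldt variant is not shown.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser (Box) epsilon of a symmetric k x k covariance matrix."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    grand_mean = sum(row_means) / k
    # Double-centre the matrix: subtract row and column means, add grand mean.
    # Because cov is symmetric, the column means equal the row means.
    centred = [
        [cov[r][c] - row_means[r] - row_means[c] + grand_mean for c in range(k)]
        for r in range(k)
    ]
    trace = sum(centred[r][r] for r in range(k))
    # For a symmetric matrix, trace(A^2) equals the sum of squared entries.
    sum_of_squares = sum(v * v for row in centred for v in row)
    return trace ** 2 / ((k - 1) * sum_of_squares)


def corrected_df(epsilon, df_num, df_den):
    """Multiply numerator and denominator degrees of freedom by epsilon."""
    return epsilon * df_num, epsilon * df_den


# Made-up matrices for 4 time-points.
compound_symmetry = [
    [1.0, 0.5, 0.5, 0.5],
    [0.5, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
]
decaying = [
    [1.0, 0.5, 0.25, 0.125],
    [0.5, 1.0, 0.5, 0.25],
    [0.25, 0.5, 1.0, 0.5],
    [0.125, 0.25, 0.5, 1.0],
]

eps_cs = greenhouse_geisser_epsilon(compound_symmetry)
eps_decay = greenhouse_geisser_epsilon(decaying)
print("epsilon, compound symmetry:", eps_cs)
print("epsilon, decaying correlation:", eps_decay)
# Hypothetical uncorrected degrees of freedom of a within-subjects F-test.
print("corrected df:", corrected_df(eps_decay, 3, 75))
```

Under compound symmetry the value is 1, so nothing is corrected; for the matrix with decaying correlation it lies between $1/(k-1)$ and 1, and both degrees of freedom shrink accordingly.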

Alternative: Use model nesting approach? (structured vs. unstructured approach?)

Example: Pulse (Spector, 1987)

3 groups of 8 patients each: Drug 1, Drug 2, Control (placebo)

Measure pulse at 5, 10, 15 and 20 minutes after taking medication.

[Figure: profiles per subject over period (5, 10, 15, 20); panels: control, drug1, drug2.]

Example: Pulse

We treat time as a categorical predictor to be flexible enough.

We use the model

$$Y_{ijk} = \mu + \alpha_i + \beta_k + (\alpha\beta)_{ik} + \varepsilon_{ijk}$$

$Y_{ijk}$: pulse

$\alpha_i$: effect of drug

$\beta_k$: effect of time-point $k$

$(\alpha\beta)_{ik}$: interaction drug × time

$\varepsilon_{ijk}$: error term (correlated)

We use this example to illustrate how one can use other correlation structures.

At first sight, the random effects (errors) have disappeared!

They are now fully integrated into the error term $\varepsilon_{ij} \sim N(0, \Sigma)$, $\varepsilon_{ij} \in \mathbb{R}^4$.

When dealing with longitudinal data it is quite common to use an autoregressive correlation structure.

Example: Pulse

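An AR(1) structure would be (exponential decay)

$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \phi & \phi^2 & \phi^3 \\ \phi & 1 & \phi & \phi^2 \\ \phi^2 & \phi & 1 & \phi \\ \phi^3 & \phi^2 & \phi & 1 \end{pmatrix}$$

The compound symmetry model would be (slightly different notation)

$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho & \rho & \rho \\ \rho & 1 & \rho & \rho \\ \rho & \rho & 1 & \rho \\ \rho & \rho & \rho & 1 \end{pmatrix}$$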

Many more choices possible.
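The two structures can be compared numerically with a short Python sketch. The function names and parameter values are hypothetical and serve only as an illustration. Note that in this notation $\sigma^2$ is the total variance of an observation, which in the random-effect formulation of compound symmetry would correspond to the sum of the person variance and the residual variance.

```python
def ar1_cov(sigma2, phi, n_times):
    """AR(1) covariance: sigma2 * phi^|r - c|, so correlation decays with lag."""
    return [[sigma2 * phi ** abs(r - c) for c in range(n_times)]
            for r in range(n_times)]


def compound_symmetry_cov(sigma2, rho, n_times):
    """Compound symmetry: sigma2 on the diagonal, sigma2 * rho everywhere else."""
    return [[sigma2 * (1.0 if r == c else rho) for c in range(n_times)]
            for r in range(n_times)]


# Hypothetical parameter values, for illustration only.
ar1 = ar1_cov(sigma2=4.0, phi=0.5, n_times=4)
cs = compound_symmetry_cov(sigma2=4.0, rho=0.5, n_times=4)
print("AR(1), first row:", ar1[0])
print("CS,    first row:", cs[0])
```

The first row shows the difference: under AR(1) the covariance with the first time-point shrinks with every additional lag, while under compound symmetry it stays constant.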

Example: Pulse

For such models, the older package nlme is more user-friendly than lme4 or lmerTest.

We use the function gls (generalized least squares) or lme from package nlme.

Random Intercept / Random Slope Model: Growth Curves

Due to the “nice” profiles we can also try to model the growth curves with a

linear regression model approach (see also summary statistic approach).

We use a random intercept / random slope model:

$$Y_{ijk} = \mu + \alpha_i + \delta_{j(i)} + (\beta_i + \beta_{j(i)}) x_{ijk} + \varepsilon_{ijk}$$

Here: $x_{ijk} \in \{8, 10, 12, 14\}$ is time (as a continuous predictor variable).

It is natural to center time.

That means we use $-3, -1, 1, 3$ instead of $8, 10, 12, 14$; otherwise intercept and slope estimates will always be correlated.
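The effect of centering can be checked with a small Python sketch. As a simplification it looks at ordinary least squares for a single straight line rather than at the mixed model itself: there the covariance matrix of the estimates is proportional to $(X^T X)^{-1}$, which gives a correlation of $-\sum x / \sqrt{n \sum x^2}$ between intercept and slope estimates. The function name is hypothetical.

```python
import math


def intercept_slope_correlation(x):
    """Correlation between the OLS intercept and slope estimates.

    Follows from the inverse of X'X for the design matrix with columns (1, x):
    corr = -sum(x) / sqrt(n * sum(x^2)).
    """
    n = len(x)
    return -sum(x) / math.sqrt(n * sum(v * v for v in x))


ages = [8, 10, 12, 14]
mean_age = sum(ages) / len(ages)
centred = [a - mean_age for a in ages]  # -3, -1, 1, 3

print("raw ages:    ", intercept_slope_correlation(ages))
print("centred ages:", intercept_slope_correlation(centred))
```

With the raw ages the two estimates are strongly negatively correlated; after centering the sum of the time values is zero and the correlation vanishes.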

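The terms of the model are: $Y_{ijk}$: distance; $\alpha_i$: effect of sex; $\beta_i$: slope in group $i$; $\beta_{j(i)}$: person-specific deviation from the population slope, $N(0, \sigma_\beta^2)$; $\delta_{j(i)}$: person-specific deviation from the population intercept, $N(0, \sigma_\delta^2)$.

[Figure: profiles per person over age (8, 10, 12, 14); panels: boy, girl.]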

We can model the two random effects $\delta_{j(i)}$ and $\beta_{j(i)}$ as independent random variables or we can allow any correlation structure.

The most general model is multivariate normal with an unspecified covariance matrix:

$$(\delta_{j(i)}, \beta_{j(i)}) \sim N(0, \Sigma), \quad (\delta_{j(i)}, \beta_{j(i)}) \in \mathbb{R}^2,$$

where $\Sigma$ is a $2 \times 2$ covariance matrix.

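Model with arbitrary correlation structure:

[Model output not reproduced; the annotations mark the estimates $\hat\sigma_\delta$, $\hat\sigma_\beta$, $\hat\sigma$ and $\mathrm{Corr}(\delta_{j(i)}, \beta_{j(i)})$, with the note “Check if model was interpreted correctly”.]

Model with independence assumption:

[Model output not reproduced; the annotations mark the estimates $\hat\sigma_\delta$, $\hat\sigma_\beta$ and $\hat\sigma$, with the note “Check if model was interpreted correctly”.]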

We can compare the nested models with the anova command.

The smaller model (independent intercept and slope) seems to be complex enough.

However, results are very close anyway.

Compare results with summary statistic approach!

Actually, we can further simplify this model (see R-Code).

Random Intercept / Random Slope Model: Additional Insights …

For time points 𝑡𝑘 the expressions for the variances and covariances of the

observations 𝑌𝑖𝑗𝑘 are

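$$\mathrm{Var}(Y_{ijk}) = \sigma_\delta^2 + 2 t_k \, \mathrm{Cov}(\delta_{j(i)}, \beta_{j(i)}) + t_k^2 \sigma_\beta^2 + \sigma^2$$

$$\mathrm{Cov}(Y_{ijk}, Y_{ijk^*}) = \sigma_\delta^2 + (t_k + t_{k^*}) \, \mathrm{Cov}(\delta_{j(i)}, \beta_{j(i)}) + t_k t_{k^*} \sigma_\beta^2 \quad \text{for } k \neq k^*$$

This means that for time points

$$t_k > -\frac{\mathrm{Cov}(\delta_{j(i)}, \beta_{j(i)})}{\sigma_\beta^2}$$

the variance $\mathrm{Var}(Y_{ijk})$ increases (and decreases before that time point).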

Of course the data does not always follow this assumption.
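The quadratic shape of the variance can be made visible with a short Python sketch that evaluates the expressions above. The function names and all parameter values are hypothetical; they are not estimates from the growth data.

```python
def var_y(t, var_int, cov_int_slope, var_slope, var_resid):
    """Variance of an observation at time t (random intercept / random slope)."""
    return var_int + 2 * t * cov_int_slope + t ** 2 * var_slope + var_resid


def cov_y(t1, t2, var_int, cov_int_slope, var_slope):
    """Covariance of two observations of the same person at different times."""
    return var_int + (t1 + t2) * cov_int_slope + t1 * t2 * var_slope


def turning_point(cov_int_slope, var_slope):
    """Time point at which the variance is smallest: -Cov / sigma_beta^2."""
    return -cov_int_slope / var_slope


# Hypothetical variance components, for illustration only.
VAR_INT, COV_INT_SLOPE, VAR_SLOPE, VAR_RESID = 4.0, -0.5, 0.25, 1.0

t_min = turning_point(COV_INT_SLOPE, VAR_SLOPE)
print("variance is smallest at t =", t_min)
for t in [-3, -1, 1, 3]:  # centred time points
    print(t, var_y(t, VAR_INT, COV_INT_SLOPE, VAR_SLOPE, VAR_RESID))
```

With a negative covariance between intercept and slope the minimum lies at a positive centred time; with independent intercept and slope it lies at zero, so the variance is symmetric around the centre of the time axis.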
