University of Copenhagen, Department of Biostatistics
Faculty of Health Sciences
Basics of repeated measurements
Analysis of repeated measurements, NFA 2016
Julie Lyng Forman & Lene Theil Skovgaard
Department of Biostatistics, University of Copenhagen
Contents
- Basic concepts for correlated and clustered data
- Descriptive statistics
- The multivariate normal distribution
- Analysis of balanced longitudinal data
- Baseline-follow up studies
Outline
What are repeated measurements?
Basics of longitudinal data (FLW chapters 1 & 2)
The multivariate normal distribution
Analysis of response profiles (FLW chapters 3 & 5)
SAS proc mixed (FLW section 5.8)
Baseline adjustment (FLW section 5.6)
Appendix: Supplementary SAS-code
What are repeated measurements?
Repeated measurements refer to data where the same outcome has been measured several times, in different situations or at different spots, on the same subjects.
- Special case: longitudinal means repeatedly over time.
What is clustered data?
Repeated measurements are termed clustered data when the same outcome is measured on groups of subjects that are somehow related.
Statistical analysis must account for repetitions
The usual assumption is that observations are independent.
If you have repeated or clustered measurements . . .
- the assumption of independence is violated.
Ignoring the repetitions/clustering most often leads to:
- p-values that are too small or too large.
- confidence intervals that are too wide or too narrow.
It is wrong to analyse repeated measurements data with an ordinary GLM or ANOVA model!!!
Example: A pre-post study
Average daily dietary intake for 10 women over 10 pre-menstrual and 10 post-menstrual days.
D.G. Altman: Practical Statistics for Medical Research, Section 9.5
Paired data
The simplest example of clustered or repeated measurements.
- Two replicates per subject or two subjects per cluster.
Examples of paired data:
- Same person with treatment and placebo.
- A baseline and a follow up measurement.
- Twin study.
- Comparison of two measurement methods or reliability of a measurement method.
Quantitative outcomes are analysed with the paired t-test.
Example: Paired vs unpaired comparison
To compare pre-menstrual and post-menstrual dietary intake:
- Test H0: µ1 = µ2.
- Find a confidence interval for µ1 − µ2.
Explanation
The two-sample t-test assumes the two samples are independent. But there is a strong dependence between pre- and post-intake for the same woman (correlation 0.95, 95% CI: 0.83-0.99).
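The difference between the paired and the unpaired comparison can be sketched in a few lines of Python. This is only an illustration: the numbers are hypothetical (they are not the dietary intake data from Altman), and the function names `paired_t` and `two_sample_t` are made up for this sketch.

```python
import math
import statistics

# Hypothetical measurements on 5 subjects (NOT the data from the slides).
# Subjects differ a lot from each other, but each subject drops by about 1 unit.
pre = [10.0, 12.0, 14.0, 16.0, 18.0]
post = [9.0, 10.5, 13.0, 14.5, 17.0]


def paired_t(x, y):
    """t statistic of the paired t-test: based on within-subject differences."""
    d = [a - b for a, b in zip(x, y)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def two_sample_t(x, y):
    """t statistic of the equal-variance two-sample t-test (ignores the pairing)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / se


print("paired t:     %.2f" % paired_t(pre, post))
print("two-sample t: %.2f" % two_sample_t(pre, post))
```

With strongly correlated pairs the between-subject variation cancels in the differences, so the paired statistic is much larger than the two-sample statistic computed from the same numbers.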
Case: A baseline follow-up study
A randomized clinical trial was conducted to compare Eplerenone with standard treatment of patients with chronic kidney disease.
Outcome: Augmentation index (aix), smaller is better.
Repeated measurements at:
- Baseline,
- after 12 weeks (safety),
- after 24 weeks (end point).
Note: The study was planned with 37 subjects in each group, but only 25 and 26, respectively, could be treated within the time limit.
Boesby et al: Eplerenone Attenuates Pulse Wave Reflection in Chronic Kidney Disease Stage 3–4 - A Randomized Controlled Study, PLOS ONE 2013.
Typical set-up for longitudinal measurements
Two or more groups of subjects
- Often receiving different treatments
- Possibly randomised at baseline.
Longitudinal measurements, typically as a function of
- duration (of treatment or disease)
- age
Do the time courses differ between the groups?
Merits of longitudinal studies
In longitudinal studies measurements are taken repeatedly on the same subjects over time.
- This allows us to study changes over time within subjects and factors that influence these changes, e.g. treatment.
- By comparing each subject's responses at two or more occasions we eliminate extraneous but unavoidable sources of variability among subjects. Thus we obtain more accurate estimates and more certain conclusions about changes over time than in cross-sectional studies.
Visualizing data: Spaghettiplots
Good for visual inspection because replicates are connected!
Note: Missing data due to failed measurements, side effects, relapse or other illness (missing data discussed further in lecture 4).
Why visualization is so important
Graphical description of the data is useful for:
- Exploratory data analysis and hypothesis generation.
- Aiding interpretation of planned analyses.
- Presentation or saying it in figures rather than in numbers.
- Spotting outliers that could otherwise spoil your analysis.
- Rough assessment of model assumptions such as normal distribution or linear trend over time.
Note: Having a large dataset is no excuse for forgetting graphical description. You can divide your data into subgroups or at least look at a random subsample.
Balanced and complete data
In a planned study the times of measurements will usually be the same for all subjects. We have a balanced design.
In practice data is most often somewhat unbalanced due to drop-out, missed visits, or failed measurements.
- In this case we say that data is incomplete.
- But still the design is balanced.
Data from (retrospective) observational studies are most often unbalanced both by design and in practice.
Unbalanced designs are treated in lecture 2.
Missing data is treated in lecture 4.
The distribution of repeated outcomes
Repeated measurements are characterized by being
- mutually dependent or correlated.
We need to characterize their joint distribution.
Standard model for quantitative data: The multivariate normal
- Location: mean-vector
- Variability: covariance-matrix
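For reference, the density of the multivariate normal distribution shows how the mean-vector and the covariance-matrix enter the model. The notation is an assumption made for this note: $n$ is the number of repeated measurements per subject, $\mu$ the mean-vector and $\Sigma$ the covariance-matrix.

```latex
f(y) = (2\pi)^{-n/2} \, |\Sigma|^{-1/2}
       \exp\left\{ -\tfrac{1}{2} (y - \mu)^{\top} \Sigma^{-1} (y - \mu) \right\},
\qquad y = (y_1, \ldots, y_n)^{\top}
```

With $n = 1$ this reduces to the usual normal density with mean $\mu$ and variance $\Sigma = \sigma^2$.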
The multivariate normal distribution
[Figures: two illustrations of the multivariate normal distribution. Source: Wikipedia.]
Eplerenone: Scatter plots
Left: Eplerenone. Right: Controls.
Better check of normal distribution: use residual diagnostics (lecture 4).
Eplerenone: Summary statistics
Means and std.devs for the three time points:

trt=0: Simple Statistics
Variable   N    Mean       Std Dev    Sum         Minimum     Maximum
aix0       24   24.64583    9.37559   591.50000    -2.50000   35.00000
aix1       24   25.31250   10.60333   607.50000     8.00000   49.50000
aix2       24   27.33333    8.70490   656.00000     8.50000   44.50000

trt=1: Simple Statistics
Variable   N    Mean       Std Dev    Sum         Minimum     Maximum
aix0       26   22.28846   11.11321   579.50000    -5.50000   43.00000
aix1       24   19.93750   13.69966   478.50000   -16.50000   38.50000
aix2       22   20.38636   11.43192   448.50000   -10.00000   39.00000

Note: this does not tell us anything about the joint distribution.
Eplerenone: Correlations
Recall: 0 for independence vs ±1 for perfect linear association.

trt=0: Pearson Correlation Coefficients / Number of Observations
        aix0      aix1      aix2
aix0    1.00000   0.78707   0.76148
        24        23        23
aix1    0.78707   1.00000   0.79525
        23        24        24
aix2    0.76148   0.79525   1.00000
        23        24        24

trt=1: Pearson Correlation Coefficients / Number of Observations
        aix0      aix1      aix2
aix0    1.00000   0.67942   0.72694
        26        24        22
aix1    0.67942   1.00000   0.81741
        24        24        22
aix2    0.72694   0.81741   1.00000
        22        22        22

Note: correlations can be misleading if data is not normal.
Basic concepts: covariance and correlation
Both are used to describe the linear association between two variables assumed to have a joint normal distribution.
- The covariance between two measurements is:
$$\mathrm{Cov}(Y_1, Y_2) = E\{(Y_1 - \mu_1)(Y_2 - \mu_2)\}$$
. . . in squared units of the original measurements.
- The correlation between two measurements is:
$$\mathrm{Cor}(Y_1, Y_2) = \frac{\mathrm{Cov}(Y_1, Y_2)}{\mathrm{SD}(Y_1)\,\mathrm{SD}(Y_2)}$$
. . . it has no units - interpretation is free of scale.
Note: SAS PROC MIXED and most other statistical software display the covariances, not correlations, as default output.
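The two definitions can be tried out on a small example. The sketch below uses the sample versions (divisor n − 1, an assumption of this sketch, as the slide states the population definitions) and hypothetical data; the function names are made up for illustration.

```python
import math


def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical paired measurements (not from the Eplerenone study).
y1 = [10.0, 12.0, 14.0, 16.0]
y2 = [11.0, 15.0, 14.0, 20.0]

print("covariance:  %.4f" % covariance(y1, y2))
print("correlation: %.4f" % correlation(y1, y2))

# Changing the unit of y1 (here a factor 100) rescales the covariance
# but leaves the correlation unchanged.
y1_rescaled = [100 * v for v in y1]
print("covariance after rescaling:  %.4f" % covariance(y1_rescaled, y2))
print("correlation after rescaling: %.4f" % correlation(y1_rescaled, y2))
```

The rescaling step illustrates why the correlation is easier to interpret than the covariance: only the covariance depends on the units of measurement.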
Matrix notation
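Covariances and correlations of the 3D (normal) distribution:
$$\mathrm{Cov} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 \end{pmatrix}, \qquad \mathrm{Cor} = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{21} & 1 & \rho_{23} \\ \rho_{31} & \rho_{32} & 1 \end{pmatrix}$$
NOTE:
- Variances $\sigma_1^2, \sigma_2^2, \sigma_3^2$ along the diagonal in Cov.
- 1's along the diagonal in Cor.
- Both are symmetric: $\sigma_{ij} = \sigma_{ji}$ and $\rho_{ij} = \rho_{ji}$.
- Note the relation $\rho_{ij} = \sigma_{ij} / \sqrt{\sigma_i^2 \cdot \sigma_j^2}$.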
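Since software often prints only the covariance matrix, it can be handy to convert it to correlations by hand. A minimal sketch of the relation above, assuming a hypothetical 3 × 3 covariance matrix (the numbers are invented, and `cov_to_cor` is a name made up here):

```python
import math


def cov_to_cor(cov):
    """Convert a covariance matrix (list of rows) to a correlation matrix.

    Uses rho_ij = sigma_ij / sqrt(sigma_i^2 * sigma_j^2).
    """
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


# Hypothetical covariance matrix with variances 4, 9 and 16 on the diagonal.
cov = [
    [4.0, 3.0, 2.0],
    [3.0, 9.0, 6.0],
    [2.0, 6.0, 16.0],
]

cor = cov_to_cor(cov)
for row in cor:
    print(["%.2f" % value for value in row])
```

The result has 1's along the diagonal and is symmetric, as noted on the slide.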
What if data is not normally distributed?
The usual assumption is that outcomes from the same subject follow a multivariate normal distribution.
But linear mixed models for repeated outcomes are robust.
- If sample size is not too small.
- If the distribution of the data is not too skewed.
So your data doesn’t have to be perfectly normal.
Highly skewed data should always be transformed.
Models for counts are treated in lecture 5.
Models for binary outcomes are treated in lecture 6.
Analysis of response profiles
Comparison of change over n time points within g groups of subjects (e.g. different treatments).
- Similar to two-way ANOVA only with correlated data.
- Covariates: group and time (both categorical)
- Balanced design, but possibly incomplete data.
- Do the groups evolve differently with time?
Interest is in the mean parameters (systematic effects)

           group = Control   group = Eplerenone
time=0     µ11               µ21
time=12    µ12               µ22
time=24    µ13               µ23
Eplerenone: Averages over time
Seeming improvement over time with Eplerenone.
- But what about statistical uncertainty?
- We also need to consider the (co)variance . . .
Main hypothesis
Analysis of response profiles allows for testing a large number of different hypotheses about the mean parameters. Which hypotheses are relevant of course depends on the subject matter.
Example: The scientific hypothesis was that there would be a positive effect of Eplerenone compared to the standard treatment at final follow up.
The relevant statistical null hypothesis is:
H0: µ13 − µ11 = µ23 − µ21,
i.e. same change in means in the two groups at last follow-up.
Eplerenone: results of analysis
Changes in mean AIX (%) since baseline estimated by the analysis of response profiles
There is a seeming improvement at last follow-up with Eplerenone compared to standard treatment with a mean difference in change in AIX of -3.61 (95% CI: -7.90 to 0.68, P = 0.10).
Note: The difference between the treatments is not significant.
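A reported estimate, confidence interval and p-value should agree with each other, and this is easy to check roughly. The sketch below back-calculates the standard error from the interval and recomputes a two-sided p-value. It uses a normal approximation, which is an assumption of this sketch: the actual analysis is based on a t-distribution, so the result is only approximate. The function name `se_and_p_from_ci` is made up here.

```python
from statistics import NormalDist


def se_and_p_from_ci(estimate, lower, upper, level=0.95):
    """Approximate standard error and two-sided p-value from a confidence interval.

    Normal approximation: CI = estimate +/- z_crit * SE.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, p


# Numbers reported on the slide: -3.61 (95% CI: -7.90 to 0.68).
se, p = se_and_p_from_ci(-3.61, -7.90, 0.68)
print("approximate SE: %.2f" % se)
print("approximate p:  %.2f" % p)
```

The recomputed p-value is close to the reported P = 0.10, and the interval contains 0, in line with the conclusion that the difference is not significant.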
Merits of analysis of response profiles
We can use a linear mixed model (PROC MIXED in SAS) to describe differences between groups at any time point or changes between any two time points (explanation follows).
Computationally this is an advantage compared to making many different t-tests. Everything is computed in one go.
Linear mixed models handle data that are missing at random optimally whereas t-tests may be biased (more on this in lecture 4).
There is a gain in statistical power when doing baseline adjustment in the analysis of randomized studies (more on this later today).
Linear mixed models (LMMs)
We use linear mixed models to analyze quantitative repeated measurements.
Systematic effects (means) are modeled similar to general linear models (GLM) including relevant explanatory variables such as time, treatment, age, gender, etc.
Additional specification of a model for the covariance is needed due to the repeated measurements. We will consider many such models either given in terms of
- So-called covariance pattern models for the residual covariance
- So-called random effects (e.g. in multi-level models)
- Or a mixture of these for more complex data.
(More about linear mixed models and applications in lectures 2-4).
The unstructured covariance
With a balanced design and few different time points we don’t have to make any specific assumptions about the covariance; an unstructured covariance pattern model is assumed.
- One variance parameter for each time point
- One correlation parameter for each pair of time points
- n + n(n−1)/2 parameters in total with n time points.
Usually all groups are assumed to have the same covariance, but this assumption can be relaxed.
Note: Not feasible with many time points or groups.
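How quickly the unstructured covariance grows can be seen by counting its parameters. A small sketch of the formula n + n(n−1)/2 (the function name is made up for illustration):

```python
def n_unstructured_params(n_times):
    """Number of parameters in an unstructured covariance for n_times time points."""
    variances = n_times                          # one per time point
    correlations = n_times * (n_times - 1) // 2  # one per pair of time points
    return variances + correlations


for n in (3, 5, 10, 20):
    print(n, "time points:", n_unstructured_params(n), "covariance parameters")
```

With the three time points of the Eplerenone study there are 6 covariance parameters, but with 10 time points there would already be 55, which have to be estimated in addition to the mean parameters.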
Two-way ANOVA type model for the means
Describe means for the six time-treatment combinations as:
           group = Control   group = Eplerenone
time=0     β1                β1 + β4
time=12    β1 + β2           β1 + β2 + β4 + β5
time=24    β1 + β3           β1 + β3 + β4 + β6

- Mean of standard treatment at baseline is reference (intercept)
- Change over time with standard treatment (time estimates)
- Difference between groups at baseline (group estimate)
- Differences in time effects (interaction estimates)
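Reading the changes since baseline off the table links this parametrisation to the main hypothesis. The derivation below assumes the notation from the analysis of response profiles, where the first index of µ is the group (1 = Control, 2 = Eplerenone) and the second is the time point (1, 2, 3 for weeks 0, 12, 24).

```latex
\begin{align*}
\mu_{13} - \mu_{11} &= (\beta_1 + \beta_3) - \beta_1 = \beta_3 \\
\mu_{23} - \mu_{21} &= (\beta_1 + \beta_3 + \beta_4 + \beta_6) - (\beta_1 + \beta_4) = \beta_3 + \beta_6 \\
(\mu_{23} - \mu_{21}) - (\mu_{13} - \mu_{11}) &= \beta_6
\end{align*}
```

So the null hypothesis of the same change at last follow-up, $\mu_{13} - \mu_{11} = \mu_{23} - \mu_{21}$, corresponds to $\beta_6 = 0$. The same calculation for week 12 gives $\beta_5$.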
Effect       Num DF   Den DF   F Value   Pr > F
week         2        44.5     0.99      0.3794
treat        1        47       1.84      0.1817
week*treat   2        44.5     1.43      0.2490

(Confidence intervals omitted due to lack of space)
About the test of the group∗time-interaction
MODEL aix = time treat time*treat / SOLUTION CL;
By testing H0: No group*time-interaction we test that
- mean change over time is identical in all groups . . .
- . . . at all follow-up times.
If we aim to show that there is a treatment effect we will get more power by focusing on a specific time point:
- The time point where the difference is largest.
(A so-called one degree of freedom test, FLW section 5.5)
Post hoc testing
A significant group∗time interaction indicates that there is a difference in changes over time between the groups, but
- not between which time points.
- not between which groups, if there are more than two.
To find out where differences occur we have to look at estimated differences between specific groups at specific time points.
- The total number of comparisons may become large, in particular if there are many time points (or several groups).
- Shouldn’t we adjust for multiple testing?
Learn to do this in P.H. Westfall, R.D. Tobias & R.D. Wolfinger: Multiple comparisons and multiple testing in SAS (2nd edition), SAS Press, 2011.
Preparing data for analysis
Most often raw data is stored in the wide format (e.g. in Excel).
- one row per subject
- several columns with the outcomes for different occasions
Example:

id   sex   age   treat   aix0   aix1   aix2
1    1     57    0       10.5   17.5   25.0
2    1     48    0       -2.5    8.0    8.5
3    2     54    1       18.0   24.0   23.5
...

To fit a linear mixed model with any statistical software data must be in the so-called long format . . .
The long format
- Each row contains only one observation of the outcome.
- A time-variable identifies the time of measurement.
- An id-variable identifies measurements from same subject.
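To make the two layouts concrete, here is a minimal Python sketch that reshapes the three example rows shown in the wide format into the long format. It is an illustration only and not the SAS code used in the course; the function name `wide_to_long` and the mapping of aix0, aix1, aix2 to weeks 0, 12, 24 are assumptions based on the study description.

```python
# The three example subjects from the wide-format slide.
wide = [
    {"id": 1, "sex": 1, "age": 57, "treat": 0, "aix0": 10.5, "aix1": 17.5, "aix2": 25.0},
    {"id": 2, "sex": 1, "age": 48, "treat": 0, "aix0": -2.5, "aix1": 8.0, "aix2": 8.5},
    {"id": 3, "sex": 2, "age": 54, "treat": 1, "aix0": 18.0, "aix1": 24.0, "aix2": 23.5},
]

# Assumed mapping from outcome column to week of measurement.
WEEK_OF = {"aix0": 0, "aix1": 12, "aix2": 24}


def wide_to_long(rows):
    """Return one row per subject and time point instead of one row per subject."""
    long_rows = []
    for row in rows:
        for column, week in WEEK_OF.items():
            long_rows.append({
                "id": row["id"],      # identifies measurements from the same subject
                "sex": row["sex"],
                "age": row["age"],
                "treat": row["treat"],
                "week": week,         # time-variable
                "aix": row[column],   # one observation of the outcome
            })
    return long_rows


long_data = wide_to_long(wide)
for r in long_data:
    print(r["id"], r["treat"], r["week"], r["aix"])
```

Three subjects with three measurements each give nine rows in the long format; the covariates that do not change over time (sex, age, treat) are simply repeated.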
Effect       Num DF   Den DF   F Value   Pr > F
week         2        44.5     0.99      0.3794
treat        1        47       1.84      0.1817
week*treat   2        44.5     1.43      0.2490

(confidence intervals omitted due to lack of space)
SAS proc mixed output
Standardized (aka Studentized) residuals: Normal distribution?
(Boxplots of residuals vs time and group omitted)
Plotting the estimated response profiles
Use the output data (ckdfit) from PROC MIXED in:
PROC SGPLOT DATA=ckdfit;
  WHERE id IN (1,3);
  SERIES x = week y = pred / GROUP = treat MARKERS;
RUN;
Note: Not identical to averages over time (due to missing data).
Alternative parametrisations
The same model can be phrased differently to highlight differences between groups at specific time points or changes over time.
To compare change over time between groups:
- Include both main effects and the interaction term.
MODEL aix = time treat time*treat / SOLUTION CL;
To get mean differences between groups at each time point:
- Omit the main effect of group and the intercept.
MODEL aix = time time*treat / NOINT SOLUTION CL;
To get the means for all combinations of group and time:
- Include only the interaction term and omit the intercept.
MODEL aix = time*treat / NOINT SOLUTION CL;
Note: This can be combined with LSMEANS . . .
- Estimates the means for all times and treatments,
- . . . and all possible differences between them (DIFF-option).
- NOINT means that the model does not include an intercept (so there is no need to specify reference groups).
- Use SLICE=week to test for overall differences between multiple groups at each time separately (one-way ANOVA).
ANCOVA with multiple times of follow-up
- Include baseline as a covariate in the data.
- Use change-since-baseline as outcome for direct quantification.
- It is suggested to center the baseline variable around its mean for ease of interpretation, and to omit the main effect of treatment and the intercept in the model.
- Include the baseline*time interaction in the model.
PROC MIXED DATA=ckd; WHERE week > 0;
CLASS id week treat (ref='0');
MODEL aixchange = week treat*week baseline*week
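The data preparation described in the bullets above (change-since-baseline as outcome, centered baseline as covariate, follow-up rows only) can be sketched in Python. This is an illustration under assumptions, not the course's SAS code: it reuses the three example subjects from the wide-format slide, borrows the variable names `aixchange` and `baseline` from the MODEL statement, and centers the baseline around the mean of these three subjects only. The function name `ancova_rows` is made up here.

```python
# The three example subjects from the wide-format slide (outcome columns only).
subjects = [
    {"id": 1, "treat": 0, "aix0": 10.5, "aix1": 17.5, "aix2": 25.0},
    {"id": 2, "treat": 0, "aix0": -2.5, "aix1": 8.0, "aix2": 8.5},
    {"id": 3, "treat": 1, "aix0": 18.0, "aix1": 24.0, "aix2": 23.5},
]


def ancova_rows(rows):
    """Long-format rows for week > 0 with change score and centered baseline."""
    mean_baseline = sum(r["aix0"] for r in rows) / len(rows)
    out = []
    for r in rows:
        for column, week in (("aix1", 12), ("aix2", 24)):
            out.append({
                "id": r["id"],
                "treat": r["treat"],
                "week": week,
                "aixchange": r[column] - r["aix0"],    # change since baseline
                "baseline": r["aix0"] - mean_baseline,  # centered baseline covariate
            })
    return out


prepared = ancova_rows(subjects)
for row in prepared:
    print(row["id"], row["week"], row["aixchange"], round(row["baseline"], 2))
```

After centering, a baseline value of 0 corresponds to a subject with average baseline, which is what makes the estimated changes easier to interpret.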
ANCOVA: predicted changes over time
ANCOVA vs cLMM
The two models have somewhat different interpretations.
- cLMM estimates the population mean response.
- ANCOVA estimates the expected response depending on the baseline value.
The two models estimate the same treatment effect* with similar accuracy/power.
- Estimates and p-values are very similar.
- Except that cLMM can better handle missing data, while ANCOVA merely deletes subjects with missing baseline or missing series of follow-up.
* The feature that treatment effect is the same on the subject mean and the population mean is particular to linear models (lectures 5+6).
Baseline in observational studies
Compare the outcomes for individuals from different groups (e.g. gender or illness groups):
- The groups are likely to differ in many respects . . . including the baseline outcome value!
- Differences in response profiles may be due to many factors, and quantifications will depend on which of these factors are included in the model.
- Adjust for the covariates that are sensible in the context.
Is the baseline measurement a sensible covariate?
Baseline in observational studies
Fitzmaurice et al. (2011)[Section 5.6]:
For example, in an observational study examining gender differences in weight gain of infants between 12 months (baseline) and 24 months (...) At baseline boys are on average 1 1/2 pounds heavier than girls, but there is no evidence of a gender effect on the 12 month change in body weight, with boys and girls both gaining approximately 5 1/4 pound. In contrast the analysis of covariance of the same data reveals a discernible gender effect with boys showing more weight gain than girls.
(...) the analysis of covariance is directed at the conditional question of whether boys are expected to gain more weight than girls given that they have the same initial weight at 12 months. (...) The reasoning is that if a boy and girl have the same initial weight at 12 months, then there are two possibilities: (1) the girl is initially overweight and is expected to gain less weight or (2) the boy is initially underweight and is expected to gain more weight over the 12 months.
We advise readers to employ the analysis of covariance approach in longitudinal settings only if the approach and its implications are fully understood.
From wide to long format
Data was transformed from the wide to the long format with: