Parsimonious Statistical Modeling of Inter-Individual Response Differences to Sleep Deprivation
Greg Maislin
Principal Biostatistician, Biomedical Statistical Consulting
& Director, Biostatistics and Data Management Core,
Center for Sleep and Respiratory Neurobiology,
University of Pennsylvania School of Medicine
Acknowledgements and Thanks
Special thanks to Dr. David F. Dinges, whose experiments exploring the consequences of partial and total sleep deprivation, with and without countermeasures, provided fertile ground for deep thinking about some very interesting statistical issues.
To Dr. Hans Van Dongen, for challenging me to be clear and compelling in my thinking.
And to Bob Hachadoorian, Senior Statistical Programmer, for his dedication in programming SAS production runs capable of mass processing multitudes of assessments across many domains in the partial sleep deprivation (PSD) and total sleep deprivation (TSD) protocols.
Introduction
Inter-individual differences in response to sleep loss are substantial. Sleep deprivation protocols often involve neurobehavioral and physiological testing over multiple days. There is a need for conceptually simple yet quantitatively valid statistical methods that recognize inter-individual variability and accommodate assay-specific non-linearity of time trajectories.
Intraclass Correlation Coefficient (ICC)
Between-subject variance: $\sigma_{bs}^2$
Within-subject variance: $\sigma_{ws}^2$
$$\sigma_{Total}^2 = \sigma_{bs}^2 + \sigma_{ws}^2$$
ICC is defined as follows to quantify trait-like inter-individual variability:
$$\mathrm{ICC} = \frac{\sigma_{bs}^2}{\sigma_{bs}^2 + \sigma_{ws}^2}$$
Intraclass Correlation Coefficient (ICC)
ICC varies by population since $\sigma_{bs}^2$ varies by population.
The magnitudes of $\sigma_{bs}^2$ and $\sigma_{ws}^2$ should be interpreted, not just ICC.
Mixed effects ANOVA can be used to estimate ICC-like measures that include multiple sources of variance and that filter out ‘fixed’ effects such as demographic factors and experimental conditions.
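As an illustration of how the two variance components and the ICC might be estimated, here is a minimal Python sketch using the classical balanced one-way random-effects ANOVA estimators. The function name `icc_oneway`, the input layout, and the data are assumptions made for this example; they are not taken from the presentation, and a mixed effects ANOVA with fixed effects would need dedicated software.

```python
from statistics import mean


def icc_oneway(groups):
    """Return (var_bs, var_ws, icc) from a balanced one-way random-effects ANOVA.

    groups: one list per subject, each holding the same number of repeated
    measurements (hypothetical input layout, chosen for this sketch).
    """
    n_subjects = len(groups)
    k = len(groups[0])
    if n_subjects < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 subjects, each with the same number (>= 2) of measurements")

    grand_mean = mean(x for g in groups for x in g)
    subject_means = [mean(g) for g in groups]

    # Mean squares between and within subjects.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n_subjects - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, subject_means) for x in g
    ) / (n_subjects * (k - 1))

    # Method-of-moments variance components; negative estimates are truncated at 0.
    var_ws = ms_within
    var_bs = max((ms_between - ms_within) / k, 0.0)
    total = var_bs + var_ws
    icc = var_bs / total if total > 0 else float("nan")
    return var_bs, var_ws, icc


# Made-up data: 3 subjects, 2 measurements each.
var_bs, var_ws, icc = icc_oneway([[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]])
print(var_bs, var_ws, icc)
```

Returning both variance components alongside the ICC follows the point made above that their magnitudes should be interpreted, not just the ratio.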
Evidence of Trait-Like Variance
Change in total PVT lapses after two exposures, 2-4 wks apart, to 36 h of sleep deprivation.
Van Dongen HP, Kijkman M, Maislin G, Dinges D. Phenotypic aspect of vigilance decrement during sleep deprivation. Physiologist 1999; 42:A-5.
Evidence of Trait-Like Variance
Preliminary data from Heritability of Sleep Homeostasis, Drs. Allan Pack and Samuel Kuna,
Division of Sleep Medicine.
[Figure: PVT Transformed Lapses Linear Slopes Over 38 Hours (19 trials) of Sleep Deprivation, Dizygotic Twins. x-axis: Twin Pair; y-axis: PVT Transformed Lapses Linear Slopes.]
ICC = 24.5% (N=30 pairs)
Var(B) = 0.93*10E-3
Var(W) = 2.87*10E-3
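The reported ICC can be checked directly from the two variance components shown with the figure. The common scale factor cancels in the ratio, so only the leading digits matter:

```latex
\mathrm{ICC}
  = \frac{\mathrm{Var}(B)}{\mathrm{Var}(B) + \mathrm{Var}(W)}
  = \frac{0.93}{0.93 + 2.87}
  = \frac{0.93}{3.80}
  \approx 0.245
```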
‘Test-bed’ Experiment¹
This sleep restriction experiment involved one adaptation day and two baseline days with 8 h sleep opportunities (time in bed, TIB, 23:30–07:30), followed by randomization to 8 h, 6 h or 4 h periods for nocturnal sleep (TIB ending at 07:30) for 14 days.
13 subjects randomized to 4 h TIB for 14 days
13 subjects randomized to 6 h TIB for 14 days
9 subjects randomized to 8 h TIB for 14 days
Assessments every 2 h during wakefulness
¹ Van Dongen HP, Maislin G, Mullington JM, and Dinges DF. The cumulative cost of additional wakefulness: Dose-response effects on neurobehavioral functions and sleep physiology from chronic sleep restriction and total sleep restriction. Sleep 2003, 26(2):117-126.
Neurobehavioral Test Battery (NAB)
(1) Psychomotor vigilance task (PVT) (Dinges & Powell 1985)
(2) Probed recall memory (PRM) test that controls for reporting bias and evaluates free recall/retention (Dinges et al 1993)
(3) Digit symbol substitution task (DSST) assesses cognitive throughput (speed/accuracy)
(4) Time estimation task (TET)
(5) Performance evaluation and effort rating scales (PEERS) to track self monitoring, compensatory effort, and motivation (Dinges et al 1992)

(1) Karolinska Sleepiness Scale (KSS) (Akerstedt & Gillberg 1990)
(2) Stanford Sleepiness Scale (SSS) (Hoddes et al 1973)
(3) Visual analog scale (VAS) for mental and physical exhaustion
(5) Profile of Mood States (POMS) (McNair, Lorr, & Droppleman 1971)
Psychomotor vigilance task (PVT)
Simple, high-signal-load reaction time (RT) test designed to evaluate the ability to sustain attention and respond in a timely manner to salient signals.
10 minute duration
Yields six primary metrics on the capacity for sustained attention and vigilance performance.
Psychomotor vigilance task (PVT)
Frequency of lapses (RT > 500 msec)
Duration of the lapse domain (mean of 10% slowest reciprocal RTs)
Optimum response times (mean of 10% fastest RTs)
False response frequency (errors)
Frequency of non-responses (caused by spontaneous sleep episodes)
Fatigability function (slope computed from 1 minute bins of mean 1/RTs)
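To make three of these metric definitions concrete, the sketch below computes them from a list of reaction times. It is an illustration only: the function name `pvt_metrics`, the rounding of the 10% tail size, and the choice to express reciprocal RTs in 1/seconds are assumptions of this example, not details given in the presentation, and the sample RTs are made up.

```python
from statistics import mean


def pvt_metrics(rts_ms, lapse_threshold_ms=500.0):
    """Compute three PVT summary metrics from reaction times in milliseconds.

    Assumptions of this sketch: the 10% tails are rounded up to at least one
    trial, and reciprocal RTs are expressed in 1/seconds.
    """
    if not rts_ms:
        raise ValueError("need at least one reaction time")
    rts = sorted(rts_ms)
    n_tail = max(1, (len(rts) + 9) // 10)  # ceiling of 10% of the trials

    return {
        # Frequency of lapses: responses slower than the threshold.
        "lapses": sum(1 for rt in rts if rt > lapse_threshold_ms),
        # Optimum response times: mean of the 10% fastest RTs.
        "fastest_10pct_mean_ms": mean(rts[:n_tail]),
        # Lapse domain: mean reciprocal RT over the 10% slowest responses.
        "slowest_10pct_mean_recip_per_s": mean(1000.0 / rt for rt in rts[-n_tail:]),
    }


# Made-up reaction times (ms) from a single hypothetical trial.
metrics = pvt_metrics([200, 220, 250, 300, 350, 400, 450, 520, 600, 1000])
print(metrics)
```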
[Figure: PVT Lapses Among 9 Subjects with Time in Bed Restricted to 8 Hours. x-axis: Day (0 to 14); y-axis: PVT Lapses.]
[Figure: PVT Lapses Among 13 Subjects with Time in Bed Restricted to 6 Hours. x-axis: Day (0 to 14); y-axis: PVT Lapses.]
[Figure: PVT Lapses Among 13 Subjects with Time in Bed Restricted to 4 Hours.]
Karolinska Sleepiness Scale (KSS)
Place the X next to the ONE statement that best describes your SLEEPINESS during the PREVIOUS 5 MINUTES. You may also use the intermediate steps.
X __ 1. very alert
  __ 2.
  __ 3. alert, normal level
  __ 4.
  __ 5. neither alert nor sleepy
  __ 6.
  __ 7. sleepy, but no effort to keep awake
  __ 8.
  __ 9. very sleepy, great effort to keep awake, fighting sleep
Use {UP/DOWN} cursor keys to move X block, then press {ENTER}
[Figure: Karolinska Sleepiness Score Among 13 Subjects with Time in Bed Restricted to 4 Hours. x-axis: Day (0 to 14); y-axis: KSS.]
[Figure: Karolinska Sleepiness Score Among 13 Subjects with Time in Bed Restricted to 6 Hours. x-axis: Day (0 to 14); y-axis: KSS.]
[Figure: Karolinska Sleepiness Score Among 9 Subjects with Time in Bed Restricted to 8 Hours. x-axis: Day (0 to 14); y-axis: KSS.]
Observations
For the subjective measure, end-of-study values depend heavily on baseline values.
The objective measure increased linearly throughout the PSD protocol.
The increase in the subjective measure was non-linear with decelerating increases. Most of the increase was very early.
There was substantial variability among subjects in both objective and subjective responses to PSD.
Substantial Variability in Responses to PSD
[Figure: PVT Lapses Among 13 Subjects with Time in Bed Restricted to 4 Hours. x-axis: Day (0 to 14); y-axis: PVT Lapses.]
Statistical Approaches for Growth Curves
Classical repeated measures analysis fails to recognize individual response variance. It is a model for the mean response with no recognition of true biological variability among subjects in the magnitudes of their response.
Mixed effects models for each individual observation include random subject effects and can allow for a variety of covariance structures that reflect different assumptions about the nature of within-subject correlations over time (e.g., AR(1)). Although theoretically appealing, concern has been raised about the robustness of these models¹. They also require sophisticated statistical approaches that may not be immediately accessible to all researchers.
¹ Ahnn, Tonidandel, and Overall. Issues in use of Proc.Mixed to test the significance of treatment effects in controlled clinical trials. J of Biopharm Stat 10(2):265-286, 2000
Standard Two Stage (STS) regression
Using simple linear regression, a slope (and intercept) is estimated for each subject at the first stage. Second stage group comparisons are made by comparing mean slopes.
STS gives each subject’s first stage slope estimate equal weight, which is not appropriate if the sample size or layout of time values varies widely among subjects.
STS disguises residual error, pooling it with between-subject variance and biasing the latter upward. If residual variance is small or the numerical values of variance components are not themselves of interest, this is not a problem.
Standard Two Stage (STS) regression (cont.)
STS does not account for the covariance between slopes and intercepts.
Parsimony: It is desirable to reduce the response curve to a single number (eliminating the intercept).
Slopes assume constant accumulations of deficit over time. However, accumulation of deficits can be decelerating or accelerating.
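The two stages of STS can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names, the use of a Welch-type t statistic for the second-stage comparison, and the data are all hypothetical and are not taken from the presentation.

```python
from math import sqrt
from statistics import mean, variance


def ols_line(times, values):
    """Ordinary least squares fit of values on times; returns (intercept, slope)."""
    t_bar, y_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def first_stage_slopes(subjects):
    """Stage 1: one slope per subject, each fitted independently of the others."""
    return [ols_line(times, values)[1] for times, values in subjects]


def welch_t(slopes_a, slopes_b):
    """Stage 2 (assumed form): Welch t statistic comparing two groups' mean slopes."""
    se = sqrt(variance(slopes_a) / len(slopes_a) + variance(slopes_b) / len(slopes_b))
    return (mean(slopes_a) - mean(slopes_b)) / se


# Made-up, noise-free data: 3 subjects per group measured on days 0..4.
days = [0, 1, 2, 3, 4]
group_a = [(days, [1 + 2.0 * d for d in days]),
           (days, [3.0 * d for d in days]),
           (days, [2 + 2.5 * d for d in days])]
group_b = [(days, [0.5 * d for d in days]),
           (days, [1 + 1.0 * d for d in days]),
           (days, [4.0 for _ in days])]

slopes_a = first_stage_slopes(group_a)
slopes_b = first_stage_slopes(group_b)
t_stat = welch_t(slopes_a, slopes_b)
print(slopes_a, slopes_b, t_stat)
```

Each subject's slope enters the second stage with equal weight regardless of how many observations produced it, which is the limitation noted above.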
Mixed linear model determination of slopes
The simultaneous determination of subject specific slopes using maximum likelihood incorporates the assumption that the slopes are normally distributed with condition-specific mean values and a common variance.
STS does not make this assumption, computing each subject-specific slope independently from all other subjects (robustness?).
Proposed Model: Two-Stage Non-Linear Mixed Model Regression
$$\Delta(t)_{i(j)} = B_{i(j)} \cdot t^{\theta} + \varepsilon(t)_{i(j)}$$
$\Delta(t)_{i(j)}$ = performance deficit for subject $i$ in group $j$ at time $t$
$\theta$ is a curvature parameter reflecting the nature of non-linearity of growth in deficits
$B_{i(j)}$ are subject-specific “non-linear” slopes
$\varepsilon(t)_{i(j)}$ are residual errors.
Reference
Van Dongen HPA, Maislin G, Dinges DF. Dealing with inter-individual differences in the temporal dynamics of fatigue and performance: Importance and techniques. Aviat Space Environ Med 2004; 75:A147–A154.
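Writing the curvature parameter as $\theta$ and a subject's non-linear slope as $B$, the role of the curvature parameter can be read from the rate at which deficits accumulate. The short derivation below assumes $B > 0$ and $t > 0$:

```latex
\frac{d}{dt}\left( B\, t^{\theta} \right) = \theta\, B\, t^{\theta - 1}
\qquad
\begin{cases}
\theta < 1: & \text{rate falls with } t \text{ (decelerating accumulation)} \\
\theta = 1: & \text{rate is constant (ordinary linear slope)} \\
\theta > 1: & \text{rate rises with } t \text{ (accelerating accumulation)}
\end{cases}
```

On this reading, the value Theta=0.78 reported for the PVT lapse slopes corresponds to decelerating growth, and the value Theta=2.4 reported for the weight data corresponds to accelerating growth.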
Proposed Model: Two-Stage Non-Linear Mixed Model Regression
The (non-linear) slopes are combinations of group-specific mean values and random effects reflecting individual susceptibilities to the deprivation challenge.
$$B_{i(j)} = \beta_j + b_{i(j)}$$
$\beta_j$ is the mean response in group $j$
$b_{i(j)} \sim \mathrm{Normal}(0, \sigma_b^2)$
$\sigma_b^2$ is a subject specific variance contribution.
$B_{i(j)} \sim \mathrm{Normal}(\beta_j, \sigma_b^2)$
Three Methods of Estimating $B_{i(j)}$
Two-stage Random Effects Regression¹ with grid search varying $\theta$
REML² (treating $\theta$ as fixed)
MLE³ (estimating $\theta$)
¹ Feldman. Families of lines: random effects in linear regression, J Appl. Physiol.
Two-stage Random Effects Regression with Grid Search Varying $\theta$
REML Mixed Linear Model (fixed $\theta$)
The 2-stage approach disguises residual error, pooling it with between-subject variance and biasing the latter upward (compare SD’s in Table 1 and Table 2).
Conditional on the value of the optimal $\theta$ obtained from the grid search, the model is no longer non-linear.
When $\theta$ is assumed known, a mixed linear model can be used to simultaneously derive subject specific slopes (e.g., SAS Proc Mixed).
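The grid search combined with first-stage regressions might look like the following Python sketch. Several choices here are assumptions of the illustration rather than details stated in the presentation: each subject's slope is fitted by least squares through the origin on the transformed time scale, the curvature value is selected by minimizing the residual sum of squares pooled over subjects, and the function names and data are hypothetical.

```python
def fit_subject(times, deficits, theta):
    """Least squares fit of deficit = B * t**theta (no intercept) for one subject.

    Returns (B, residual sum of squares).
    """
    x = [t ** theta for t in times]
    b = sum(xi * yi for xi, yi in zip(x, deficits)) / sum(xi * xi for xi in x)
    sse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, deficits))
    return b, sse


def grid_search_theta(subjects, thetas):
    """Return (best_theta, pooled_sse, slopes) over a grid of curvature values.

    Selection criterion (an assumption of this sketch): smallest residual sum
    of squares pooled across all subjects.
    """
    best = None
    for theta in thetas:
        fits = [fit_subject(times, deficits, theta) for times, deficits in subjects]
        pooled_sse = sum(sse for _, sse in fits)
        if best is None or pooled_sse < best[1]:
            best = (theta, pooled_sse, [b for b, _ in fits])
    return best


# Made-up, noise-free data: 3 subjects over 14 days, true curvature 0.8.
days = list(range(1, 15))
true_slopes = [1.0, 2.5, 0.5]
subjects = [(days, [b * d ** 0.8 for d in days]) for b in true_slopes]

theta_grid = [round(0.5 + 0.1 * k, 1) for k in range(11)]  # 0.5, 0.6, ..., 1.5
best_theta, pooled_sse, slopes = grid_search_theta(subjects, theta_grid)
print(best_theta, pooled_sse, slopes)
```

Once the curvature value is fixed at the selected grid point, each first-stage fit is an ordinary linear regression on the transformed time scale, and the resulting slopes can be carried into a second-stage group comparison.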
ML Mixed Non-Linear Model (simultaneous estimation of $\theta$)
Has greatest theoretical appeal.
Requires specialized software (e.g., SAS Proc NLMIXED).
Model sometimes does not converge.
More precise estimate of $\theta$ (Table 3).
Examples of Second Stage Analysis
[Figure: Fig. 2 Delta PVT Lapses. REML (fixed theta) Non-linear Slopes by Study Group (Theta=0.78). x-axis: Group (4 hr, 6 hr, 8 hr); y-axis: Non-linear Slope.]
$F_{2,30} = 3.67$, $p = 0.037$
Examples of Second Stage Analysis
Table 1. Two-stage Non-linear Slopes ($\theta$ = 0.7753)
Group    N    Mean      SD        Min        Max
4 hr     13   1.9269    1.3493     0.0638    3.8784
6 hr     13   1.2897    1.6999    -0.5383    5.5608
8 hr      9   0.3345    0.6851    -0.3228    2.0517
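A second-stage comparison of group mean slopes can be illustrated from the summary statistics in Table 1 alone. The sketch below computes a one-way ANOVA F statistic from group sizes, means, and SDs; the function name is hypothetical, and no p-value is computed because that would require the F distribution from a statistics library. Because it uses the two-stage slopes of Table 1 rather than the REML slopes behind the reported $F_{2,30} = 3.67$, the result need not match that figure exactly.

```python
def anova_f_from_summary(ns, means, sds):
    """One-way ANOVA F statistic from per-group N, mean, and SD.

    Returns (F, df_between, df_within).
    """
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df_between = len(ns) - 1
    df_within = total_n - len(ns)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Summary statistics copied from Table 1 (4 hr, 6 hr, 8 hr groups).
ns = [13, 13, 9]
means = [1.9269, 1.2897, 0.3345]
sds = [1.3493, 1.6999, 0.6851]

f_stat, df_between, df_within = anova_f_from_summary(ns, means, sds)
print(round(f_stat, 2), df_between, df_within)
```

With these inputs the statistic comes out at roughly 3.6 on (2, 32) degrees of freedom, the same order of magnitude as the REML-based value shown with the figure.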
Examples of Second Stage Analysis
Table 2. REML Mixed Model Non-linear Slopes ($\theta$ = 0.7753)
Note that the mean 2-stage slope and the mean mixed model slope are not identical for KSSQ in the 8 hour condition because there was a missing value. The mixed model slopes correctly adjust for the reduced precision caused by the single missing value.
An Application in Another Area
Berkowitz RI, Stallings VA, Maislin G, Stunkard AJ. Growth of children at high risk for obesity during the first six years: Implications for prevention. Am J Clinical Nutrition 2005;81:140–6.
Body size and composition of high and low risk groups were measured repeatedly from 3 mo. to 6 yrs of age at CHOP. Subjects included 33 children at high risk for obesity and 37 children at lower risk for obesity on the basis of mothers’ overweight¹
¹ Mean (SD) mothers’ BMI: high risk = 30.2 (4.2), low risk = 19.5 (1.1).
An Application in Another Area
At year 2, there were no differences between high and low risk groups in any measure of body size and composition1
(Energy intake and sucking behaviors at Month 3 were predictive of 2 year weight in both groups.)
1 Stunkard AJ, Berkowitz RI, Schoeller D, Maislin G, Stallings VA. Predictors of body size in the first 2 years of life: a high-risk study of human obesity. International Journal of Obesity 2004 1-11.
[Figure: Weight Over Time From Month 24 to Month 72, High Risk Group. x-axis: Month (24 to 72); y-axis: Weight (kg).]
[Figure: Weight Over Time From Month 24 to Month 72, Low Risk Group. x-axis: Month (24 to 72); y-axis: Weight (kg).]
Average Mean Squared Error from First Stage Linear Regressions
Parsimony of Interpretations
Change in Weight from Month 24 to Month 72
[Figure: REML (fixed theta) Non-linear Slopes by Group (Theta=2.4). x-axis: Group (High GT 85th, High LE 85th, Low); y-axis: Non-linear Slope.]
Generally monotonic but varying in direction:
It is possible that trajectories are generally monotonic but vary in direction (i.e., there are subgroups of individuals with generally increasing values and others with generally decreasing values over time).
In this case, $\theta$ can be set to 1 with slope estimated individually using two-stage random effects regression or simultaneously by Proc Mixed.
Generally non-monotonic changes:
If trajectories are non-monotonic (e.g. quadratic), mixed model analyses of longitudinal changes can be performed that incorporate multiple time variables such as linear plus quadratic terms or appropriately constructed sets of time indicator variables.
Correlations among observations within subject can be accounted for by using appropriately constructed covariance matrices¹. A particular covariance structure is the ‘random intercept plus AR(1)’ structure. This structure induces within subject correlations by assuming both systematic variance in overall levels between subjects plus a component that diminishes over time.
¹ Littell RC, Pendergast J, Natarajan R. Tutorial in biostatistics: Modeling covariance structure in the analysis of repeated measures data. Statistics in Medicine. 19:1793-1819, 2000.
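The ‘random intercept plus AR(1)’ structure can be written out explicitly. In the sketch below the symbols are assumptions introduced for illustration: $\sigma_b^2$ is the random intercept (between-subject) variance, $\sigma_e^2$ is the variance of the AR(1) component, and $\rho$ is its lag-one autocorrelation, for equally spaced occasions $s$ and $t$ within subject $i$:

```latex
\mathrm{Cov}(Y_{is}, Y_{it}) = \sigma_b^2 + \sigma_e^2\, \rho^{|t - s|},
\qquad
\mathrm{Corr}(Y_{is}, Y_{it}) = \frac{\sigma_b^2 + \sigma_e^2\, \rho^{|t - s|}}{\sigma_b^2 + \sigma_e^2}
```

For $|\rho| < 1$ the second term shrinks as the lag $|t - s|$ grows, so the within-subject correlation decays toward $\sigma_b^2 / (\sigma_b^2 + \sigma_e^2)$, the part attributable to stable differences in overall level between subjects.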
General Linear Mixed Model:
Verbeke G and Molenberghs G (2000). Linear Mixed Models for Longitudinal Data. Springer Series in Statistics. New-York: Springer.
Fixed effects: $\beta$
Random effects: $b_i$
Variance components: elements in $D$ and $\Sigma_i$
General Linear Mixed Model:
Hierarchical model can be rewritten as: $Y_i \mid b_i \sim N(X_i\beta + Z_i b_i, \Sigma_i)$; $b_i \sim N(0, D)$
Marginal model can be rewritten as: $Y_i \sim N(X_i\beta, Z_i D Z_i' + \Sigma_i)$
The hierarchical model is most naturally interpreted through a Bayesian perspective.
Only the hierarchical model explicitly assumes inter-individual variability.
General Linear Mixed Model:
Prior distribution: $f(b_i) = N(0, D)$
Likelihood function: $f(y_i \mid b_i) = N(X_i\beta + Z_i b_i, \Sigma_i)$
Posterior distribution: $f(b_i \mid Y_i = y_i) \propto f(y_i \mid b_i) \cdot f(b_i)$
Posterior mean: $\int b_i \, f(b_i \mid y_i) \, db_i$ is the Empirical Bayes estimate of $b_i$
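Because the prior and the likelihood are both normal, the posterior mean has a closed form. The expression below is the standard result for the linear mixed model; the notation $V_i$ for the marginal covariance of $Y_i$ is introduced here for convenience:

```latex
E(b_i \mid Y_i = y_i) = D\, Z_i'\, V_i^{-1} \left( y_i - X_i \beta \right),
\qquad
V_i = Z_i D Z_i' + \Sigma_i
```

The estimate is called Empirical Bayes when the unknown $\beta$, $D$, and $\Sigma_i$ are replaced by their estimates (for example, from ML or REML). It shrinks each subject's deviation from the population mean toward zero, more strongly when the subject's data are sparse or noisy relative to $D$.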
Conclusions
The STS, REML, and ML approaches have advantages and disadvantages.
“The great advantage of the STS, aside from conceptual and computational simplicity, is the availability of valid small-sample statistics. STS can thus be relied upon, whereas WLS and REML cannot, to produce accurate P values in the cases of very few subjects, so long as the assumptions of the small-sample model (e.g., normality) are met”¹
The ML approach requires starting values (guesses) for every parameter and sometimes the ML optimization does not converge.
¹ Feldman HA. Families of lines: random effects in linear regression analysis, J. Appl. Physiol. 64(4):1721-1732, 1988.
Conclusions
The “grid search plus STS” method provides good solutions that in most cases are very similar to the optimal ML solutions, facilitates analysis of inter-individual variability in responses to sleep deprivation, and is easy to implement.
The model $\Delta(t)_{i(j)} = B_{i(j)} \cdot t^{\theta} + \varepsilon(t)_{i(j)}$ can be generally recommended for sets of responses that are generally monotonic.
Other methods are needed for non-monotonic trajectories. The cost of non-monotonicity is greater analytical complexity.
Conclusions
The Linear Mixed Model provides a comprehensive platform for evaluation and estimation of inter-individual variability, including the evaluation of prior and posterior distributions of subject specific parameters.
It may be possible to update subject specific parameters reflecting individual performance, and then sum over these individual performance estimates to obtain a summary prediction of unit performance.