The general linear model for fMRI
Methods and Models in fMRI, 17.10.2017
Jakob Heinzle
Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering (IBT), University and ETH Zürich
Many thanks to K. E. Stephan and F. Petzschner for material.
Overview of SPM
GLM for fMRI 2
[Figure: SPM processing pipeline. Image time-series → realignment → normalisation (template) → smoothing (kernel) → general linear model (design matrix) → parameter estimates → statistical parametric map (SPM) → statistical inference (Gaussian field theory, p < 0.05)]
What is the problem we want to solve?
• We have an experimental paradigm and want to test whether brain activity is (linearly) related to the paradigm.
• We will try to solve the problem by modeling the data.
Modelling the measured data
Why? Make inferences about effects of interest.
How?
1. Decompose data into effects and error.
2. Form a statistic using estimates of effects and error.
[Figure: data and stimulus function → linear model → effects estimate and error estimate → statistic]
A very simple experiment
• One session
• 7 cycles of rest and listening
• Blocks of 6 scans with 7 sec TR
[Figure: experimental paradigm plotted against time]
What is the brain's response to such a stimulation?
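The paradigm above can be written down as a stimulus function. The following is a minimal sketch, assuming that each cycle starts with the rest block (the slide does not state the order); the function name `boxcar` is purely illustrative.

```python
import numpy as np

TR = 7.0             # repetition time in seconds (from the slide)
SCANS_PER_BLOCK = 6  # scans per block (from the slide)
N_CYCLES = 7         # rest/listening cycles (from the slide)

def boxcar(n_cycles=N_CYCLES, block_len=SCANS_PER_BLOCK):
    """Return a 0/1 stimulus function: 0 during rest, 1 during listening."""
    # Assumption: every cycle is a rest block followed by a listening block.
    one_cycle = np.concatenate([np.zeros(block_len), np.ones(block_len)])
    return np.tile(one_cycle, n_cycles)

stimulus = boxcar()
scan_times = np.arange(stimulus.size) * TR  # acquisition time of each scan in seconds
```

With these numbers the session has 7 × 2 × 6 = 84 scans, half of them acquired during listening.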
How is brain data related to the input?
[Figure: stimulus function (what we know) and single-voxel time series (what we measure), both plotted against time]
Question: Is there a change in the BOLD response between listening and rest?
A linear model of the data
[Figure: BOLD signal over time = β1 · x1 + β2 · x2 + error e]
Single-voxel regression model:
$$y = x_1\beta_1 + x_2\beta_2 + e$$
where $x_1$ and $x_2$ are the regressors.
Explain your data as a combination of experimental manipulation, confounds and errors.
Writing everything in matrix notation
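[Figure: BOLD signal over time = [x1 x2] · [β1; β2] + error e]
Single-voxel regression model:
$$y = X\beta + e, \qquad X = [x_1 \; x_2], \qquad \beta = \begin{pmatrix}\beta_1 \\ \beta_2\end{pmatrix}$$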
The way it looks in SPM
$$\underbrace{y}_{n \times 1} = \underbrace{X}_{n \times p}\,\underbrace{\beta}_{p \times 1} + \underbrace{e}_{n \times 1}$$
y: data, X: design matrix, β: parameters, e: error
n: number of scans
p: number of regressors
We need …
• … to specify the design matrix.
• … to specify a noise model, e.g. $e \sim N(0, \sigma^2 I)$.
• … and then to estimate the parameters $\beta$ that minimize the error $\sum_{t=1}^{N} e_t^2$.
  – Minimization of the error depends on assumptions about the noise.
Summary: Mass-univariate GLM
$$y = X\beta + e, \qquad e \sim N(0, \sigma^2 I)$$
Dimensions: y is N × 1, X is N × p, β is p × 1, e is N × 1.
N: number of scans
p: number of regressors
Model is specified by
1. Design matrix X
2. Assumptions about e
The design matrix embodies all available knowledge about experimentally controlled factors and potential confounds.
How to fit the model parameters.
$$y = X\beta + e$$
Data predicted by our model:
$$\hat{y} = X\hat{\beta}$$
Error between predicted and actual data:
$$e = y - \hat{y} = y - X\hat{\beta}$$
Goal is to determine the betas that minimize the quadratic error:
$$\min_{\beta}\,(e^T e) = \min_{\beta}\,\big((y - X\beta)^T (y - X\beta)\big)$$
This is OLS (Ordinary Least Squares).
OLS – Ordinary least squares
We want to minimize the quadratic error between data and model:
$$e^T e = (y - X\beta)^T (y - X\beta)$$
$$e^T e = (y^T - \beta^T X^T)(y - X\beta)$$
$$e^T e = y^T y - y^T X\beta - \beta^T X^T y + \beta^T X^T X\beta$$
$$e^T e = y^T y - 2\beta^T X^T y + \beta^T X^T X\beta$$
$$\frac{\partial\, e^T e}{\partial \beta} = -2X^T y + 2X^T X\beta$$
$$0 = -2X^T y + 2X^T X\hat{\beta}$$
OLS estimate for β:
$$\hat{\beta} = (X^T X)^{-1} X^T y$$
Summary: OLS solution
$$y = X\beta + e$$
Objective: estimate parameters to minimize
$$\sum_{t=1}^{N} e_t^2$$
Ordinary least squares estimation (OLS) (assuming i.i.d. error):
$$\hat{\beta} = (X^T X)^{-1} X^T y$$
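As a numerical illustration of the OLS solution, the sketch below fits a two-regressor model to simulated data. The design (boxcar plus constant), the "true" parameter values and the noise level are all made up for the example; nothing here is taken from real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: boxcar regressor (6 scans rest, 6 scans listening, 7 cycles) plus a constant.
x1 = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)
X = np.column_stack([x1, np.ones_like(x1)])

# Simulated voxel time series with made-up parameters and i.i.d. Gaussian noise.
beta_true = np.array([2.0, 100.0])
y = X @ beta_true + rng.normal(scale=1.0, size=x1.size)

# Normal equations: solve (X^T X) beta = X^T y instead of forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# The residuals are orthogonal to every column of X at the OLS solution.
residuals = y - X @ beta_hat
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to computing $(X^T X)^{-1}$ directly, although both give the same estimate for a well-conditioned design.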
Geometric perspective
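[Figure: data vector y, its projection ŷ onto the design space defined by X (spanned by x1 and x2), and the residual e]
OLS estimates:
$$\hat{\beta} = (X^T X)^{-1} X^T y, \qquad \hat{y} = X\hat{\beta}$$
Projection matrix P:
$$\hat{y} = P y, \qquad P = X (X^T X)^{-1} X^T$$
Residual forming matrix R:
$$e = R y, \qquad R = I - P$$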
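The geometric picture can be checked numerically. This is a small sketch with a hypothetical two-column design and an arbitrary random data vector; it only illustrates the properties of P and R stated above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 84
x1 = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)  # hypothetical boxcar regressor
X = np.column_stack([x1, np.ones(n)])            # boxcar plus constant
y = rng.normal(size=n)                           # arbitrary data vector

P = X @ np.linalg.inv(X.T @ X) @ X.T  # projection onto the design space
R = np.eye(n) - P                     # residual forming matrix

y_hat = P @ y  # fitted data, lies in the design space
e = R @ y      # residual, orthogonal to the design space
```

P is idempotent (projecting twice changes nothing), R annihilates every column of X, and the trace of R equals n − p, the residual degrees of freedom.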
Correlated and orthogonalized regressors
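[Figure: y, x1, x2 and the orthogonalized regressor x2* in the design space defined by X]
Correlated regressors = explained variance is shared between regressors.
Original model:
$$y = x_1\beta_1 + x_2\beta_2 + e, \qquad \beta_1 = \beta_2 = 1$$
After orthogonalizing x2 with regard to x1:
$$y = x_1\beta_1^* + x_2^*\beta_2^* + e, \qquad \beta_1^* > 1; \; \beta_2^* = 1$$
When x2 is orthogonalized with regard to x1, only the parameter estimate for x1 changes, not that for x2!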
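The effect of orthogonalization can be reproduced with simulated regressors. In this sketch the regressors, the amount of correlation and the noise level are invented for illustration; `gamma` is the regression coefficient of x2 on x1, so that x2* = x2 − gamma · x1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)                   # correlated with x1 by construction
y = 1.0 * x1 + 1.0 * x2 + 0.1 * rng.normal(size=n)   # beta1 = beta2 = 1, as on the slide

def ols(X, y):
    """Ordinary least squares estimate."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Orthogonalize x2 with regard to x1.
gamma = (x1 @ x2) / (x1 @ x1)
x2_orth = x2 - gamma * x1

b = ols(np.column_stack([x1, x2]), y)            # original model
b_star = ols(np.column_stack([x1, x2_orth]), y)  # model with orthogonalized x2
```

Substituting x2 = x2* + gamma · x1 into the original model shows why: the estimate for x2 is unchanged, while the estimate for x1 absorbs the shared variance and becomes b[0] + gamma · b[1].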
We are nearly there …
[Figure: linear model → effects estimate and error estimate → statistic]
Problems of this model
1. BOLD responses have a delayed and dispersed form, the hemodynamic response function (HRF) (cf. Lecture 1).
2. The BOLD signal includes substantial amounts of low-frequency noise.
3. The data are serially correlated (temporally autocorrelated); this violates the assumptions of the noise model in the GLM.
Summary: Mass-univariate GLM
$$y = X\beta + e, \qquad e \sim N(0, \sigma^2 I)$$
Dimensions: y is N × 1, X is N × p, β is p × 1, e is N × 1.
N: number of scans
p: number of regressors
Model is specified by
1. Design matrix X
2. Assumptions about e
The design matrix embodies all available knowledge about experimentally controlled factors and potential confounds.
Problem 1: The BOLD response
$$(f \otimes g)(t) = \int_0^t f(\tau)\, g(t - \tau)\, d\tau$$
The response of a linear time-invariant (LTI) system is the convolution of the input with the system's response to an impulse (delta function).
Basic math: What is a convolution?
$$(f \otimes g)(t) = \int_0^t f(\tau)\, g(t - \tau)\, d\tau$$
Solution: Convolution with the HRF
expected BOLD response = input function ⊗ impulse response function (HRF)
[Figure legend]
blue = data
green = predicted response, convolved with the HRF
red = predicted response, NOT taking into account the HRF
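The convolution step can be sketched as follows. The HRF here is a generic difference of two gamma densities; its shape parameters (6 and 16) and undershoot ratio (1/6) are assumptions chosen to resemble a typical canonical HRF, and `double_gamma_hrf` is an illustrative name, not a function from SPM. With a TR of 7 s the HRF is sampled very coarsely, which is acceptable for illustration only.

```python
import math
import numpy as np

TR = 7.0  # seconds, from the example experiment

def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1.0 / 6.0):
    """Difference of two unit-scale gamma densities; parameter values are assumptions."""
    t = np.asarray(t, dtype=float)

    def gamma_pdf(t, shape):
        return t ** (shape - 1.0) * np.exp(-t) / math.gamma(shape)

    h = gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, undershoot_shape)
    return h / h.sum()  # normalize so the sampled kernel sums to 1

# Boxcar input: 7 cycles of 6 rest scans followed by 6 listening scans (rest first is an assumption).
stimulus = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)

hrf = double_gamma_hrf(np.arange(0.0, 32.0, TR))          # sampled at t = 0, 7, 14, 21, 28 s
predicted = np.convolve(stimulus, hrf)[: stimulus.size]   # expected BOLD response per scan
```

Because the HRF is zero at t = 0, the predicted response only starts to rise one scan after stimulus onset, which is the delayed form that the unconvolved boxcar misses.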
Problem 2: Low frequency noise
[Figure legend]
blue = data
black = mean + low-frequency drift
green = predicted response, taking into account low-frequency drift
red = predicted response, NOT taking into account low-frequency drift
Solution 2: High-pass filtering
[Figure: discrete cosine transform (DCT) set]
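A DCT set for high-pass filtering can be sketched as below. The 128 s cutoff and the rule used to pick the number of basis functions are assumptions made for the example, and the helper names `dct_basis` and `high_pass` are hypothetical.

```python
import numpy as np

def dct_basis(n_scans, tr, cutoff=128.0):
    """Low-frequency DCT-II basis functions (constant term excluded).

    Assumption: the number of components is floor(2 * n_scans * tr / cutoff + 1),
    counting the constant, which is then dropped.
    """
    n_basis = int(np.floor(2.0 * n_scans * tr / cutoff + 1.0))
    n = np.arange(n_scans)
    k = np.arange(1, n_basis)
    return np.sqrt(2.0 / n_scans) * np.cos(np.pi * np.outer(2 * n + 1, k) / (2 * n_scans))

def high_pass(y, C):
    """Remove the part of y explained by the (orthonormal) columns of C."""
    return y - C @ (C.T @ y)

C = dct_basis(n_scans=84, tr=7.0)   # 84 scans at TR = 7 s
drift = 5.0 * C[:, 0]               # a slow drift lying in the span of the basis
filtered = high_pass(drift, C)      # the drift is removed entirely
```

The same effect is obtained by appending the columns of C to the design matrix as confound regressors, since both approaches project the drift out of the data.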
Problem 3: Serial correlations
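sphericity = i.i.d.: error covariance is a scalar multiple of the identity matrix:
$$\mathrm{Cov}(e) = \sigma^2 I, \qquad \text{e.g. } \mathrm{Cov}(e) = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$$
Examples for non-sphericity:
$$\mathrm{Cov}(e) = \begin{pmatrix}4 & 0\\ 0 & 1\end{pmatrix} \quad \text{(non-identity)}$$
$$\mathrm{Cov}(e) = \begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix} \quad \text{(non-independence)}$$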
1st order autoregressive process, AR(1):
$$e_t = a\,e_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t \sim N(0, \sigma^2)$$
[Figure: autocovariance function and the n × n matrix Cov(e); n: number of scans]
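For a stationary AR(1) process the covariance between two time points depends only on their lag: $\mathrm{Cov}(e_t, e_{t+k}) = \sigma^2 a^{|k|} / (1 - a^2)$. The sketch below builds this matrix and compares it with a simulated series; the value a = 0.4 and both function names are chosen only for illustration.

```python
import numpy as np

def ar1_covariance(n, a, sigma2=1.0):
    """Stationary AR(1) covariance matrix: sigma2 * a**|t - s| / (1 - a**2)."""
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return sigma2 * a ** lags / (1.0 - a ** 2)

def simulate_ar1(n, a, sigma2=1.0, seed=0):
    """Simulate e_t = a * e_{t-1} + eps_t with eps_t ~ N(0, sigma2)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(scale=np.sqrt(sigma2), size=n)
    e = np.empty(n)
    e[0] = eps[0] / np.sqrt(1.0 - a ** 2)  # draw the first sample from the stationary distribution
    for t in range(1, n):
        e[t] = a * e[t - 1] + eps[t]
    return e

cov = ar1_covariance(5, a=0.4)
e = simulate_ar1(10_000, a=0.4)
lag1 = np.corrcoef(e[:-1], e[1:])[0, 1]  # sample lag-1 autocorrelation, should be close to a
```

The off-diagonal entries of `cov` are non-zero, so this error model is non-spherical in the sense defined above.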
Solution 3: Pre-whitening
• Pre-whitening:
1. Use an enhanced noise model with multiple error covariance components, i.e. $e \sim N(0, \sigma^2 V)$ instead of $e \sim N(0, \sigma^2 I)$.
2. Use estimated serial correlation to specify filter matrix W for whitening the data:
$$Wy = WX\beta + We$$
   where the whitened error $We$ is i.i.d.
How to define W?
$$Wy = WX\beta + We$$
• Enhanced noise model:
$$e \sim N(0, \sigma^2 V)$$
• Remember linear transform for Gaussians:
$$x \sim N(\mu, \sigma^2),\; y = ax \;\Rightarrow\; y \sim N(a\mu, a^2\sigma^2)$$
• Choose W such that error covariance becomes spherical:
$$We \sim N(0, \sigma^2 W^2 V) \;\Rightarrow\; W^2 V = I \;\Rightarrow\; W = V^{-1/2}$$
• Conclusion: W is a simple function of V, so how do we estimate V?
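The choice W = V^{-1/2} can be verified numerically. In this sketch V is taken to be a known AR(1) correlation matrix with a = 0.4, which is an assumption for illustration; in practice V has to be estimated. The function names are hypothetical.

```python
import numpy as np

def ar1_correlation(n, a):
    """AR(1) correlation matrix with entries a**|t - s|."""
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return a ** lags

def whitening_matrix(V):
    """Return W = V^(-1/2) for a symmetric positive-definite V via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(V)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

V = ar1_correlation(84, a=0.4)  # assumption: V is known here
W = whitening_matrix(V)

whitened_cov = W @ V @ W.T      # covariance of We, up to the factor sigma^2
```

Since W is symmetric, W V Wᵀ equals W² V, so the whitened error covariance is the identity and the i.i.d. assumption holds for the transformed model.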
Find W – multiple covariance components.
Enhanced noise model:
$$e \sim N(0, \sigma^2 V)$$
Error covariance components Q and hyperparameters λ:
$$V \propto \mathrm{Cov}(e), \qquad V = \sum_i \lambda_i Q_i$$
[Figure: V = λ1 Q1 + λ2 Q2]
Estimation of hyperparameters λ with EM (expectation maximisation) or ReML (restricted maximum likelihood). For more details see (Friston et al, Neuroimage, 16:465; 2002).
[Figure: linear model → effects estimate and error estimate → statistic]
$$c = [1\;0\;0\;0\;0\;0\;0\;0\;0\;0\;0]^T$$
Null hypothesis: $\beta_1 = 0$
$$t = \frac{c^T\hat{\beta}}{\mathrm{Std}(c^T\hat{\beta})}$$
Lecture: Classical (frequentist) inference
Outlook: Contrasts and statistical maps
Q: activation during listening?
$$c = [1\;0\;0\;0\;0\;0\;0\;0\;0\;0\;0]^T$$
Null hypothesis: $\beta_1 = 0$
$$t = \frac{c^T\hat{\beta}}{\mathrm{Std}(c^T\hat{\beta})}$$
[Figure: design matrix X]
Summary of GLM
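$$Wy = WX\beta + We$$
$$\hat{\beta} = (WX)^+ Wy$$
$$c = [1\;0\;0\;0\;0\;0\;0\;0\;0\;0\;0]^T$$
$$t = \frac{c^T\hat{\beta}}{\widehat{\mathrm{std}}(c^T\hat{\beta})}$$
$$\widehat{\mathrm{std}}(c^T\hat{\beta}) = \sqrt{\hat{\sigma}^2\, c^T (WX)^+ (WX)^{+T} c}$$
$$\hat{\sigma}^2 = \frac{\lVert Wy - WX\hat{\beta} \rVert^2}{\mathrm{tr}(R)}, \qquad R = I - WX(WX)^+$$
ReML estimates:
$$W = V^{-1/2}, \qquad \sigma^2 V = \mathrm{Cov}(e), \qquad V = \sum_i \lambda_i Q_i$$
For brevity:
$$(WX)^+ = \big((WX)^T WX\big)^{-1} (WX)^T$$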
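The formulas in this summary can be assembled into one short function. This is a sketch under simplifying assumptions: W is passed in as given rather than obtained from ReML, and the data are simulated with made-up parameters. The name `glm_t_statistic` is illustrative.

```python
import numpy as np

def glm_t_statistic(y, X, c, W):
    """t-statistic for contrast c in the whitened model Wy = WX beta + We."""
    Wy, WX = W @ y, W @ X
    WX_pinv = np.linalg.pinv(WX)              # (WX)^+
    beta_hat = WX_pinv @ Wy
    R = np.eye(len(y)) - WX @ WX_pinv         # residual forming matrix
    resid = Wy - WX @ beta_hat
    sigma2_hat = (resid @ resid) / np.trace(R)
    std_c = np.sqrt(sigma2_hat * (c @ WX_pinv @ WX_pinv.T @ c))
    return beta_hat, (c @ beta_hat) / std_c

# Simulated example: boxcar plus constant, i.i.d. noise, so W = I is appropriate here.
rng = np.random.default_rng(3)
x1 = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)
X = np.column_stack([x1, np.ones_like(x1)])
y = X @ np.array([2.0, 100.0]) + rng.normal(size=x1.size)
c = np.array([1.0, 0.0])                      # tests the first regressor only

beta_hat, t = glm_t_statistic(y, X, c, W=np.eye(x1.size))
```

With W set to the identity the function reduces to the ordinary OLS t-test; for serially correlated data W would be replaced by an estimate of $V^{-1/2}$.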
Physiological confounds
• head movements
• arterial pulsations (particularly bad in brain stem)
• breathing
• eye blinks (visual cortex)
• adaptation effects, fatigue, fluctuations in concentration, etc.
Lecture: Noise models in fMRI and noise correction
Outlook – further challenges
• correction for multiple comparisons
• variability in the HRF across voxels
• slice timing
• limitations of frequentist statistics → Bayesian analyses
• GLM ignores interactions among voxels → models of effective connectivity
These issues are discussed in future lectures.
Correction for multiple comparison
• Mass-univariate approach: We apply the GLM to each of a huge number of voxels (usually > 100,000).
• At a threshold of p < 0.05, more than 5000 voxels are expected to be significant by chance!
• Massive problem with multiple comparisons!
• Solution: Gaussian random field theory
Lecture: Multiple comparison correction
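The figure of 5000 false positives follows from simple arithmetic, which the sketch below reproduces together with a small simulation. The simulation assumes that every voxel is null and that voxels are independent, which is a deliberate simplification: real fMRI data are spatially correlated.

```python
import numpy as np

n_voxels = 100_000
alpha = 0.05

# Expected number of voxels exceeding the threshold when no voxel is truly active.
expected_false_positives = n_voxels * alpha

# Simulation under the null: p-values are uniform on [0, 1].
# Assumption: voxels are treated as independent.
rng = np.random.default_rng(4)
p_values = rng.uniform(size=n_voxels)
observed_false_positives = int(np.sum(p_values < alpha))
```

The observed count scatters around the expected value of 5000, which is why an uncorrected voxel-wise threshold is not usable for whole-brain inference.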
Variability in the BOLD response
• HRF varies substantially across voxels and subjects
• For example, latency can differ by ± 1 second
• Solution: use multiple basis functions
• See talk on event-related fMRI
Summary
• Mass-univariate approach: same GLM for each voxel
• GLM includes all known experimental effects and confounds
• Convolution with a canonical HRF
• High-pass filtering to account for low-frequency drifts, implemented by a set of cosine functions.
• Estimation of multiple variance components (e.g. to account for serial correlations)
Bibliography
• Friston, Ashburner, Kiebel, Nichols, Penny (2007) Statistical Parametric Mapping: The Analysis of Functional Brain Images. Elsevier.
• Christensen R (1996) Plane Answers to Complex Questions: The Theory of Linear Models. Springer.
• Friston KJ et al. (1995) Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping 2: 189-210.
Supplementary slides
1. Express each function in terms of a dummy variable τ.
2. Reflect one of the functions: g(τ)→g( − τ).
3. Add a time-offset, t, which allows g(t − τ) to slide along the τ-axis.
4. Start t at -∞ and slide it all the way to +∞. Wherever the two functions intersect, find the integral of their product. In other words, compute a sliding, weighted average of function f(τ), where the weighting function is g( − τ).
The resulting waveform (not shown here) is the convolution of functions f and g. If f(t) is a unit impulse, the result of this process is simply g(t), which is therefore called the impulse response.
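The steps above translate directly into a discrete version, in which the integral becomes a sum over samples. This is a minimal sketch; `convolve_by_sliding` is an illustrative name and the sample values of g are arbitrary.

```python
import numpy as np

def convolve_by_sliding(f, g):
    """Discrete analogue of the steps above: reflect g, shift it by t, sum the products."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    out = np.zeros(len(f) + len(g) - 1)
    for t in range(len(out)):            # time offset t
        for tau in range(len(f)):        # dummy variable tau
            if 0 <= t - tau < len(g):    # only where f(tau) and g(t - tau) overlap
                out[t] += f[tau] * g[t - tau]
    return out

g = np.array([0.0, 0.5, 1.0, 0.3])       # arbitrary response samples
impulse = np.array([1.0])                # discrete unit impulse

response_to_impulse = convolve_by_sliding(impulse, g)
```

Convolving the unit impulse with g returns g itself, which is the discrete counterpart of the impulse-response statement above; for longer inputs the loop gives the same result as `np.convolve`.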