Top Banner
Computational Biology VI Jianfeng Feng Warwick University
77

Computational Biology VI Jianfeng Feng Warwick University.

Dec 27, 2015

Download

Documents

Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Computational Biology VI Jianfeng Feng Warwick University.

Computational Biology VI

Jianfeng Feng

Warwick University

Page 2: Computational Biology VI Jianfeng Feng Warwick University.

Our module

Preprocessing Data Analysis

Raw Data Acquisition

Slice-time Correction

Motion Correction, Co-registration & Normalization

Spatial Smoothing

Localizing Brain Activity

Applications:Clinical

Reconstruction

Data Processing Pipeline

Connectivity

Experimental Design

Linear Model

Grangercausality

Page 3: Computational Biology VI Jianfeng Feng Warwick University.

Outline

1. GLM

2. Model building

3. Noise module

4. Inferences

Multi-comparison correction ( after next week)

Page 4: Computational Biology VI Jianfeng Feng Warwick University.

1: The General Linear Model

Page 5: Computational Biology VI Jianfeng Feng Warwick University.

Statistical Analysis

• There are multiple goals in the statistical analysis or machine learning side of fMRI data.

• They include:

– localizing brain areas activated by the task;

– determining networks corresponding to brain function; and

– making predictions about psychological or disease states.

Page 6: Computational Biology VI Jianfeng Feng Warwick University.

Human Brain Mapping

• The most common use of fMRI to date has been to localize areas of the brain that activate in response to a certain task.

• These types of human brain mapping studies are necessary for the development of biomarkers and increasing our understanding of brain function.

Page 7: Computational Biology VI Jianfeng Feng Warwick University.

Massive Univariate Approach

• Typically analysis is performed by constructing a separate model at each voxel– The ‘massive univariate approach’.

– Assumes an improbable independence between voxel pairs......

• Typically dependencies between voxels are dealt with later using random field theory, which makes assumptions about the spatial dependencies between voxels.

Page 8: Computational Biology VI Jianfeng Feng Warwick University.

General Linear Model

• The general linear model (GLM) approach treats the data as a linear combination of model functions (predictors) plus noise (error).

• The model functions are assumed to have known shapes, but their amplitudes are unknown and need to be estimated.

• The GLM framework encompasses many of the commonly used techniques in fMRI data analysis (and big data analysis more generally).

y(t) = b0 + b1 f1(t) +… + + bp fp(t) + e(t)

Page 9: Computational Biology VI Jianfeng Feng Warwick University.

• Consider an experiment of alternating blocks of finger-tapping and rest.

• Construct a model to study data from a single voxel for a single subject.

• We seek to determine whether activation is higher during finger-tapping compared with rest.

Illustration

Page 10: Computational Biology VI Jianfeng Feng Warwick University.

fMRI Data Design matrix Model parameters

Residuals

BOLD signal Intercept Predicted task response

= X +[𝛽0

𝛽1]

10

301

300

2

1

01

01

11

11

11

01

01

01

x

y

y

y

y

y

N

Page 11: Computational Biology VI Jianfeng Feng Warwick University.

fMRI Data Design matrix Model parameters

Residuals

= X +[𝛽0

𝛽1]

In simple formula: y(t) = b0 + b1 f1(t) + e1(t)

Page 12: Computational Biology VI Jianfeng Feng Warwick University.

fMRI Data Design matrix Model parameters

Residuals

BOLD signal Intercept Predicted task response

= X +

H0 : 1 0

Null hypothesis

[𝛽0

𝛽1]

Page 13: Computational Biology VI Jianfeng Feng Warwick University.

where

fMRI Data Design matrix

Regression coefficients

GLM

A standard GLM can be written:

y(t) = b0 + b1 f1(t) +… + + bp fp(t) + e(t) or

Y Xβ ε

Noise

ε ~ N (0, V)

V is the covariance matrix whose format depends on the noise model.

The quality of the model depends on our choice of X and V.

Page 14: Computational Biology VI Jianfeng Feng Warwick University.

In matrix format (lecture 4, time series)

For k dimensional case

where the recorded data (Y(1), Y(2), Y(3), …, Y(T)), denoting

Page 15: Computational Biology VI Jianfeng Feng Warwick University.

Problem Formulation

• Assume the model:

• The matrices X and Y are assumed to be known, and the noise is considered to be uncorrelated.

• Our goal is to find the value of that minimizes:

y XβT y Xβ

Y Xβ ε ε ~ N(0, I 2 )

Page 16: Computational Biology VI Jianfeng Feng Warwick University.

OLS Solution

Ordinary least squares solution

βˆ (XT X)1 XT y

Properties:

Maximum likelihood estimate

E ˆ

Var(ˆ) 2 XT X

1

Can we do better?

Page 17: Computational Biology VI Jianfeng Feng Warwick University.

Gauss Markov Theorem

• The Gauss-Markov Theorem states that any other unbiased estimator of will have a larger variance than the OLS solution.

• Assume is an unbiased estimator of .

• Then according to G-M Theorem,

Var() Var()

• ˆ is the best linear unbiased estimator (BLUE) of .

Page 18: Computational Biology VI Jianfeng Feng Warwick University.

Estimation

• If is i.i.d., then Ordinary Least Square (OLS) estimate is optimal

• If Var() =V2 I2, then Generalized Least Squares (GLS) estimate is optimal

Y Xβ εestimate

βˆ (X' X)1 X' Y

estimate

βˆ (X' V 1X)1 X' V 1Y

Y Xβ ε

model

model

Page 19: Computational Biology VI Jianfeng Feng Warwick University.

Summary

estimate

βˆ (X' V 1X)1 X' V 1Y

model

Y Xβ ε

fitted values

r Y

(I (X ' V1X)1

X ' V1 )Y

RY

residuals

Page 20: Computational Biology VI Jianfeng Feng Warwick University.

Estimating the Variance

• Even if we assume is i.i.d., we still need to estimate the residual variance, 2.

• For OLS:

• Estimating V I more difficult, using iterative methods.

T

N p

rT r̂

2

Page 21: Computational Biology VI Jianfeng Feng Warwick University.

Estimation

• If is i.i.d., then Ordinary Least Square (OLS) estimate is optimal

• If Var() =V2 I2, then Generalized Least Squares (GLS) estimate is optimal

Y Xβ εestimate

βˆ (X' X)1 X' Y

estimate

βˆ (X' V 1X)1 X' V 1Y

Y Xβ ε

model

model

Page 22: Computational Biology VI Jianfeng Feng Warwick University.

Model Refinement

• This model has a number of shortcomings.

• We want to use our understanding of the signal and noise properties of BOLD fMRI to aid us in constructing appropriate models.

• This includes deciding on an appropriate design matrix, as well as an appropriate noise model.

Page 23: Computational Biology VI Jianfeng Feng Warwick University.

Issues

1. BOLD responses have a

delayed and dispersed form.

2. The fMRI signal includes substantial amounts of low- frequency noise.

3. The data are serially correlated which needs to be considered in the model.

clear allclose allfor i=1:400 h=0.1; x(i)=i*h; y(i)=x(i)*x(i)*x(i)*x(i)*exp(-x(i));endfor i=41:350 m(i)=y(i-40); y(i)=y(i)-m(i)*0.2;end plot(x([1:300]),y([1:300])/max(y))

Page 24: Computational Biology VI Jianfeng Feng Warwick University.

2: Model Building

Page 25: Computational Biology VI Jianfeng Feng Warwick University.

A standard GLM can be written:

Y Xβ εwhere

fMRI Data Design matrix

Model parameters

Noise

General Linear Model

V is the covariance matrix whose format depends on the noise model.

The quality of the model depends on our choice of X and V.

ε ~ N (0, V)

Page 26: Computational Biology VI Jianfeng Feng Warwick University.

Model Building

• Proper construction of the design matrix is critical for effective use of the GLM.

• This process can be complicated by the following properties of the BOLD response:– It includes low-frequency noise and artifacts related to

head movement and cardiopulmonary-induced brain movement.

– The neural response shape may not be known.

– The hemodynamic response varies in shape across the brain.

Page 27: Computational Biology VI Jianfeng Feng Warwick University.

BOLD Response

• Predict the shape of the BOLD response to a given stimulus pattern. Assume the shape is known and the amplitude is unknown.

• The relationship between stimuli and the BOLD response is typically modeled using a linear time invariant (LTI) system.

• In an LTI system an impulse (i.e., neuronal activity) is convolved with an impulse response function (i.e., HRF).

Page 28: Computational Biology VI Jianfeng Feng Warwick University.

Convolution Examples

Hemodynamic Response Function

Predicted Response

Block Design

Experimental Stimulus Function

Page 29: Computational Biology VI Jianfeng Feng Warwick University.

HRF Models

• Often a fixed canonical HRF is used to model the response to neuronal activity

- Linear combination of 2 gamma functions.

- Optimal if correct.

- If wrong, leads to bias

and power loss. Unlikely that the same HRF

is valid for all voxels.

True response may be faster/slower

True response may have smaller/bigger undershoot

Page 30: Computational Biology VI Jianfeng Feng Warwick University.

fMRI Data Design matrix Model parameters

Residuals

BOLD signal Intercept Predicted task response

= X +

H : 001

[𝛽0

𝛽1]

Page 31: Computational Biology VI Jianfeng Feng Warwick University.

Example

ModelImage of

predictors Data & Fitted

Single HRF

Page 32: Computational Biology VI Jianfeng Feng Warwick University.

fMRI Data Design matrix Model parameters

Residuals

BOLD signal Intercept Predicted task response

= X +

H : 001

[𝛽0

𝛽1]

Page 33: Computational Biology VI Jianfeng Feng Warwick University.

Example

In mathematical term, we have

y(t) = b0 + b1 f1(t) + e(t) or

Y Xβ ε

Page 34: Computational Biology VI Jianfeng Feng Warwick University.

Checkerboard Thermal pain

Aversive picture, Aversive anticipation

Stimulus On

The HRF shape depends both on the vasculature and the time course of neural activity.

Problems

Assuming a fixed HRF is usually not appropriate.

Page 35: Computational Biology VI Jianfeng Feng Warwick University.

Temporal Basis Functions

• To allow for different types of HRFs in different brain regions, it is typically better to use temporal basis functions.

• A linear combination of functions can be used to account for delays and dispersions in the HRF.– The stimulus function is convolved with each of the

basis functions to give a set of regressors.

– The parameter estimates give the coefficients that determine the combination of basis functions that best models the HRF for the trial type and voxel in question.

Page 36: Computational Biology VI Jianfeng Feng Warwick University.

• In an LTI system the BOLD response is modeled

x(t) (s h)(t)

where s(t) is a stimulus function and h(t) the HRF and * is the convolution.

• Model the HRF as a linear combination of temporal basis functions, fi(t), such that

h(t) i fi (t)

Temporal Basis Functions

Page 37: Computational Biology VI Jianfeng Feng Warwick University.

h(t) i fi (t)

Temporal Basis Functions

1

2

3

h(t)

Page 38: Computational Biology VI Jianfeng Feng Warwick University.

• The BOLD response can be rewritten:

x(t) i (s fi )(t)

• In the GLM framework the convolution of the stimulus function with each basis function makes up a separate column of the design matrix.

• Each corresponding i describes the weight of that component.

Temporal Basis Functions

Page 39: Computational Biology VI Jianfeng Feng Warwick University.

Temporal Basis Functions

• Typically-used models vary in the degree they make a priori assumptions about the shape of the response.

• In the most extreme case, the shape of the HRF is fixed and only the amplitude is allowed to vary.

• By contrast, a finite impulse response (FIR) basis set, contains one free parameter for every time- point following stimulation for every cognitive event type.

Page 40: Computational Biology VI Jianfeng Feng Warwick University.

The model estimates an HRF of arbitrary shape for each event type in each voxel of the brain

Finite Impulse Response

Page 41: Computational Biology VI Jianfeng Feng Warwick University.

Basis sets

Time (s)

ModelImage ofpredictors Data & Fitted

Single HRF

HRF +derivatives

Finite Impulse Response (FIR)

Page 42: Computational Biology VI Jianfeng Feng Warwick University.

Basis sets

Time (s)

ModelImage ofpredictors Data & Fitted

Single HRF

HRF +derivatives

Finite Impulse Response (FIR)

Overfitting ?????

Page 43: Computational Biology VI Jianfeng Feng Warwick University.

Nuisance Covariates

• Often model factors associated with known sources of variability, but that are not related to the experimental hypothesis, need to be included in the GLM.

• Examples of possible ‘nuisance regressors’:– Signal drift

– Physiological (e.g., respiration) artifacts

– Head motion, e.g. six regressors comprising of three translations and three rotations.

Sometimes transformations of the six regressors also included.

Page 44: Computational Biology VI Jianfeng Feng Warwick University.

• Slow changes in voxel intensity over time (low- frequency noise) is present in the fMRI signal.

• Scanner instabilities and not motion or physiological noise may be the main cause of the drift, as drift has been seen in cadavers.

• Need to include drift parameters in our models.- Use splines, polynomial basis or discrete cosine basis

Drift

Page 45: Computational Biology VI Jianfeng Feng Warwick University.

• Blue curve is the data

• Red curve is what we expected

Drift

Page 46: Computational Biology VI Jianfeng Feng Warwick University.

=

+

= +Y X

Model with Drift

• Selecting fi as cos(i w t) and sin(i w t) for example (as in MP3, only cos is used)

Page 47: Computational Biology VI Jianfeng Feng Warwick University.

blue =

black =

green =

data

mean + low-frequency drift predicted response, taking into account low-frequency drift

predicted response (with low- frequency drift explained away)

red =

High Pass Filtering

Page 48: Computational Biology VI Jianfeng Feng Warwick University.

• Respiration and heart beat give rise to high- frequency noise.

• This type of noise is difficult to remove and is often left in the data giving rise to temporal autocorrelations.

Physiological Noise

Page 49: Computational Biology VI Jianfeng Feng Warwick University.

3: Noise Models

Page 50: Computational Biology VI Jianfeng Feng Warwick University.

where

fMRI Data Design matrix

Regression coefficients

GLM

A standard GLM can be written:

Y Xβ ε

Noise

ε ~ N (0, V)

V is the covariance matrix whose format depends on the noise model.

The quality of the model depends on our choice of X and V.

Page 51: Computational Biology VI Jianfeng Feng Warwick University.

Design Matrix

• We has previously discussed various signal and nuisance components that can be included in the design matrix to improve the model.– Temporal Basis functions

Allows for flexible HRF

– Motion parameters Corrects for ‘spin history’ artifacts

Page 52: Computational Biology VI Jianfeng Feng Warwick University.

fMRI Noise

• Functional MRI data typically exhibit significant autocorrelation.

– Caused by physiological noise and low frequency drift, that has not been appropriately modeled.

– Typically modeled using either an AR(p) process.

– Single subject statistics are not valid without an accurate model of the noise.

Page 53: Computational Biology VI Jianfeng Feng Warwick University.

AR(1) model

• Serial correlation can be modeled using a first-order autoregressive model, i.e.

• The error term t depends on the previous error term t-1 and a new disturbance term ut.

t t1 ut ut ~ N (0, )2

Page 54: Computational Biology VI Jianfeng Feng Warwick University.

AR(1) model

• The autocorrelation function (ACF) for an AR(1) process:

5 1 0 1 5

- 1. 0

- 0. 5

0. 00. 5

1. 0

1 :1 6

=0.7

𝜌 (h )={1 ,    i f   h    0 ,     | h |      i f   h    0  

Page 55: Computational Biology VI Jianfeng Feng Warwick University.

IID Case AR(1) Case

Error Term

• The format of V will depend on what noise model is used.

Page 56: Computational Biology VI Jianfeng Feng Warwick University.

GLM Summary

estimate

βˆ (X' V 1X)1 X' V 1Y

model

Y Xβ ε

fitted values

r Y

(I (X ' V1X)1

X ' V1 )Y

RY

residuals

Page 57: Computational Biology VI Jianfeng Feng Warwick University.

Estimating V

• In general the form of the covariance matrix is unknown, which means it has to be estimated.

• Estimating V depends on ’s, and estimating’s depends on V. Need iterative procedure.

Page 58: Computational Biology VI Jianfeng Feng Warwick University.

estimated covariance matrix from step 2.

Iterative Procedure

1. Assume that V=I and calculate the OLS solution.

2. Estimate the parameters of V using the residuals.

3. Re-estimate the values using the

4. Iterate until convergence.

Page 59: Computational Biology VI Jianfeng Feng Warwick University.

Spatio-temporal Behavior

• The spatiotemporal behavior of these noise processes is complex.

Spatial maps of the model parameters from an AR(2) model estimated for each voxel’s noise data.

Page 60: Computational Biology VI Jianfeng Feng Warwick University.

4: Inference

Page 61: Computational Biology VI Jianfeng Feng Warwick University.

GLM Summary

estimate

βˆ (X' V 1X)1 X' V 1Y

model

Y Xβ ε

fitted values

r Y

(I (X ' V1X)1

X ' V1 )Y

RY

residuals

Page 62: Computational Biology VI Jianfeng Feng Warwick University.

• After fitting the GLM use the estimated parameters to determine whether there is significant activation present in the voxel.

• Inference is based on the fact that:

ˆ ~ N (, (XT V1X)1 )

• Use t or F test to perform tests on effects of interest.

Inference

Page 63: Computational Biology VI Jianfeng Feng Warwick University.

Contrasts

• It is often of interest to see whether a linear combination of the parameters are significant.

• The term cT specifies a linear combination of the estimated parameters, i.e.

• Here c is called a contrast vector.

1 1 2 2 n nT

c β c c … c

Page 64: Computational Biology VI Jianfeng Feng Warwick University.

• Experiment with two types of stimuli.

=

+ Noise

1

2

3

Example

H0 : 2 3

H0 : c β 0T

0, 1, 1

cT

Page 65: Computational Biology VI Jianfeng Feng Warwick University.

H0 : cβ 0

use the t-statistic:

T Ha : c β 0T

Var cT βˆ

cT

βˆT

T-test

• To test

• Under H0, T is approximately t() with tr((RV))

(tr(RV))2 2

Page 66: Computational Biology VI Jianfeng Feng Warwick University.

Multiple Contrasts

• We often want to make simultaneous tests of several contrasts at once.

• c is now a contrast matrix.

• Suppose

then

𝑪=(1 0 0 0 00 1 0 0 0)

Page 67: Computational Biology VI Jianfeng Feng Warwick University.

=

+

= +Y X

Consider a model with box-car shaped activation and drift modeled using the discrete cosine basis.

Example

Page 68: Computational Biology VI Jianfeng Feng Warwick University.

Do the drift components add anything to the model?

Test: H0 : c 0T

where

Example

𝑪=(0 0 1 0 0 0 0 0 00 0 0 1 0 0 0 0 00 0 0 0 1 0 0 0 00 0 0 0 0 1 0 0 00 0 0 0 0 0 1 0 00 0 0 0 0 0 0 1 00 0 0 0 0 0 0 0 1

)

Page 69: Computational Biology VI Jianfeng Feng Warwick University.

This is equivalent to testing:

H : 0T

89

7650 34

To understand what this implies, we split the design matrix into two parts:

X 0X 1

Example

[1 𝑋 11 𝑋 12 ⋯ 𝑋19

1 𝑋 21 𝑋 22 ⋯ 𝑋 29

⋮ ⋮ ⋮ ¿ ¿𝑋𝑛1¿𝑋𝑛2¿⋯¿𝑋𝑛9¿ ]

Page 70: Computational Biology VI Jianfeng Feng Warwick University.

Example

• Do the drift components add anything to the model?

• The X1 matrix explains the drift. Does it contribute in a significant way to the model?

• Compare the results using the full model, with design matrix X, with those obtained using a reduced model, with design matrix X0.

Page 71: Computational Biology VI Jianfeng Feng Warwick University.

r r r

r tr((R

R)Vˆ 0

2

00F TT

tr(R R

)Vtr(R R

)V 2

0

20

0 tr(RV)2

tr((RV)2 )

and

F-test

• Test the hypothesis using the F-statistic:

• Assuming the errors are normally distributed, F has an approximate F-distribution with (0,) degrees of freedom, where

Page 72: Computational Biology VI Jianfeng Feng Warwick University.

Statistical Images

• For each voxel a hypothesis test is performed. The statistic corresponding to that test is used to create a statistical image over all voxels.

T-value

Page 73: Computational Biology VI Jianfeng Feng Warwick University.

Localizing Activation

1. Construct a model for each voxel of the brain.– “Massive univariate approach”– Regression models (GLM) commonly used.

=

+

= +Y X

Page 74: Computational Biology VI Jianfeng Feng Warwick University.

Localizing Activation

2. Perform a statistical test to determine whether task related activation is present in the voxel.

TH : c β 00

Statistical image: Map of t-tests across all voxels (a.k.a t-map).

Page 75: Computational Biology VI Jianfeng Feng Warwick University.

Localizing Activation

3. Choose an appropriate threshold for determining statistical significance.

Statistical parametric map: (SPM) Each significant voxel is color-coded according to the size of its p-value.

Page 76: Computational Biology VI Jianfeng Feng Warwick University.

How do we determine which voxels are actually active?

Problems:

• The statistics are obtained by performing a large number of hypothesis tests.

• Many of the test statistics will be artificially inflated due to the noise.

• This leads to many false positives.

Statistical Images

Page 77: Computational Biology VI Jianfeng Feng Warwick University.

Multiple Comparisons

• Which of 100,000 voxels are significant?

– =0.05 5,000 false positive voxels

• Choosing a threshold is a balance

between sensitivity (true positive rate)

and specificity (true negative rate).

t > 1 t > 2 t > 3 t > 4 t > 5