Maximum Likelihood Estimation for Multiepoch EEG Analysis

J. Kříž

Department of Physics, University of Hradec Králové

Quantum Circle

January 30, 2007

MOTIVATION

Why maximum likelihood estimation (MLE)?

It is often used in radar signal processing.

RADAR = Radio Detection and Ranging



EEG = Electroencephalography: measures electric potentials on the scalp (generated by neuronal activity in the brain)

Multiepoch EEG: Evoked potentials = responses to an external stimulus (auditory, visual, etc.); they reflect sensory and cognitive processing in the brain

Common properties of EEG and radar signal processing:

• spatial–temporal character

• data of the form X = S + W

• low signal-to-noise ratio (SNR)

Is MLE a suitable topic for the QC seminar?

YES!

QC seminar: on various aspects of quantum theory, primarily for students

MLE: Hradil, Řeháček, Fiurášek, Ježek, Maximum Likelihood Methods in Quantum Mechanics, in Quantum State Estimation, Lecture Notes in Physics (eds. M.G.A. Paris, J. Řeháček), 59-112, Springer, 2004.

Basic concept of MLE

• originally developed by R.A. Fisher in the 1920s

• assume a pdf f of a random vector y depending on a parameter set w, i.e. f(y|w)

• it determines the probability of observing the data vector y (depending on the parameters w)

• however, we are faced with an inverse problem: we are given the data vector and do not know the parameters

• define the likelihood function l by reversing the roles of the data and parameter vectors, i.e. l(w|y) = f(y|w)

• MLE maximizes l over all parameters w

• that is, given the observed data (and a model of interest), find the pdf that is most likely to have produced the given data
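The idea can be illustrated on a toy problem that is not part of the talk: a minimal sketch, assuming i.i.d. Gaussian samples with unknown mean and standard deviation. The function name `log_likelihood`, the sample size and the grid are illustrative choices. The closed-form estimate is compared against a brute-force search over a coarse parameter grid.

```python
import numpy as np

def log_likelihood(w, y):
    """Log of l(w|y) = f(y|w) for i.i.d. Gaussian samples; w = (mean, std)."""
    mean, std = w
    n = y.size
    return (-0.5 * n * np.log(2.0 * np.pi * std**2)
            - 0.5 * np.sum((y - mean) ** 2) / std**2)

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.5, size=500)   # observed data vector (simulated)

# Closed-form maximizer of the Gaussian likelihood:
# sample mean and the 1/n standard deviation (np.std default).
w_mle = (y.mean(), y.std())

# Brute-force check on a coarse grid of candidate parameters.
means = np.linspace(0.0, 4.0, 81)
stds = np.linspace(0.5, 3.0, 51)
grid_best = max(log_likelihood((m, s), y) for m in means for s in stds)

print("MLE:", w_mle, "log-likelihood:", log_likelihood(w_mle, y))
print("best grid value:", grid_best)
```

Working with the logarithm of l is a common convenience: it has the same maximizer and avoids numerical underflow for long data vectors.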

MLE for EEG evoked response analysis

Baryshnikov, B.V., Van Veen, B.D. and Wakai, R.T., IEEE Trans. Biomed. Eng. 51 (2004), p. 1981–1993.

Experiment:
• pattern reversal evoked potentials
• 63-channel EEG device
• 100 epochs
• sampling rate of 1 kHz

Assumptions:
• the response is the same across all epochs
• the noise is independent from trial to trial
• the noise is temporally white, but spatially coloured
• the noise is normally distributed with zero mean

Experiment:

N … number of spatial channels, T … number of time samples per epoch, J … number of epochs (N = 63, T = 666, J = 100)

data for the j-th epoch: X_j = S + W_j … N x T matrix

The estimate of the repeated signal S can be expressed in the form

S = H θ C^T

C … known T x L matrix of temporal basis vectors, i.e. rows of S are linear combinations of columns of C; the known frequency band of interest is used to construct C

H … unknown N x P matrix of spatial basis vectors, i.e. columns of S are linear combinations of columns of H

θ … unknown P x L matrix of coefficients

The model is purely linear and nonlocal in both space and time.

Model

Full dataset of J epochs: X = [X_1 X_2 ... X_J] … N x JT matrix
Noise over J epochs: W = [W_1 W_2 ... W_J] … N x JT matrix

X = [S S ... S] + W,

[S S ... S] = H θ D^T, where D^T = [C^T C^T ... C^T]
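The stacking of epochs can be checked numerically. The following is a minimal sketch with toy sizes and random placeholder matrices (all values are assumptions for illustration; in the real problem H and the coefficient matrix are unknown and C is built from the frequency band of interest).

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, J = 6, 50, 4      # toy sizes; the talk uses N=63, T=666, J=100
P, L = 2, 5             # rank of H and number of temporal basis vectors

H = rng.standard_normal((N, P))       # spatial basis (placeholder values)
theta = rng.standard_normal((P, L))   # coefficient matrix (placeholder values)
C = rng.standard_normal((T, L))       # temporal basis (random placeholder)

S = H @ theta @ C.T                   # N x T repeated signal
D = np.tile(C, (J, 1))                # JT x L, so that D.T = [C.T C.T ... C.T]
W = rng.standard_normal((N, J * T))   # noise over all epochs
X = np.tile(S, (1, J)) + W            # N x JT full dataset [X_1 X_2 ... X_J]

print(X.shape, np.linalg.matrix_rank(S))
```

The signal part of X has rank at most P, which is what makes the choice of P matter later on.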


The noise covariance "supermatrix" is modeled as the Kronecker product of the spatial and temporal covariance matrices, i.e. every element of the N x N "spatial matrix" is replaced by that element times the JT x JT "temporal matrix".

R_T = W^T W … JT x JT temporal covariance matrix (R_T = 1, the identity matrix, in our model)
R = W W^T … N x N spatial covariance matrix (unknown in our model)
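The block structure of the Kronecker product can be seen in a short sketch. The sizes and the spatial covariance below are arbitrary illustrative choices; only `np.kron` and the identity temporal covariance reflect the model described above.

```python
import numpy as np

rng = np.random.default_rng(2)
N, JT = 3, 4                       # toy sizes

A = rng.standard_normal((N, N))
R = A @ A.T + N * np.eye(N)        # an arbitrary symmetric positive definite spatial covariance
R_T = np.eye(JT)                   # temporally white noise, as assumed in the model

R_full = np.kron(R, R_T)           # (N*JT) x (N*JT) covariance "supermatrix"

# Block (i, k) of the supermatrix equals R[i, k] * R_T.
i, k = 0, 2
block = R_full[i * JT:(i + 1) * JT, k * JT:(k + 1) * JT]
print(np.allclose(block, R[i, k] * R_T))
```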

Temporal basis matrix C

Processes of interest in EEG are usually in the frequency band 1-20 Hz. Temporal basis vectors can be chosen as (discretized) sin(2πft), cos(2πft) to cover the frequency band of interest. The number of basis vectors L is given by the frequency band.

In the case L = T we may choose C = 1 (we take all frequencies).
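One possible construction of C is sketched below. It assumes that the frequencies are restricted to the Fourier frequencies k·fs/T of a single epoch and that the columns are orthonormalized by a QR decomposition; neither choice is stated in the talk, and the helper name `temporal_basis` is hypothetical.

```python
import numpy as np

def temporal_basis(T, fs, f_lo, f_hi):
    """Return a T x L matrix with orthonormal columns spanning the sines and
    cosines at the Fourier frequencies k*fs/T that fall inside [f_lo, f_hi]."""
    t = np.arange(T) / fs
    freqs = np.arange(1, T // 2) * fs / T            # Fourier frequencies of a length-T epoch
    freqs = freqs[(freqs >= f_lo) & (freqs <= f_hi)]
    cols = []
    for f in freqs:
        cols.append(np.sin(2 * np.pi * f * t))
        cols.append(np.cos(2 * np.pi * f * t))
    C = np.column_stack(cols)
    Q, _ = np.linalg.qr(C)                           # orthonormalize the columns
    return Q

# Epoch length and sampling rate taken from the experiment (T = 666, 1 kHz).
C = temporal_basis(T=666, fs=1000.0, f_lo=1.0, f_hi=20.0)
print(C.shape)
```

With these settings the frequency resolution is about 1.5 Hz, so the band 1-20 Hz contains 13 frequencies and L = 26.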


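Under all the above assumptions, the pdf can be written as

$$f(X \mid R, H, \theta) = \frac{1}{(2\pi)^{NJT/2} (\det R)^{JT/2}} \exp\left\{ -\frac{1}{2} \mathrm{Tr}\left[ R^{-1} (X - H\theta D^T)(X - H\theta D^T)^T \right] \right\}$$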
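For numerical work it is usually easier to evaluate the logarithm of this pdf. The following is a minimal sketch of such an evaluation; the function name and the toy inputs are assumptions, and the formula implemented is the Gaussian density with identity temporal covariance and spatial covariance R.

```python
import numpy as np

def log_likelihood(X, R, H, theta, D):
    """Log of f(X | R, H, theta) for temporally white, spatially coloured Gaussian noise."""
    N, JT = X.shape
    E = X - H @ theta @ D.T                        # residual X - H theta D^T
    _, logdet = np.linalg.slogdet(R)               # log det R (R is positive definite)
    quad = np.trace(np.linalg.solve(R, E @ E.T))   # Tr[R^{-1} E E^T]
    return -0.5 * N * JT * np.log(2 * np.pi) - 0.5 * JT * logdet - 0.5 * quad

# Sanity check on a trivial case: zero data, zero signal.
N, JT, P, L = 3, 8, 2, 4
X = np.zeros((N, JT))
H = np.zeros((N, P))
theta = np.zeros((P, L))
D = np.zeros((JT, L))
value_unit = log_likelihood(X, np.eye(N), H, theta, D)
value_scaled = log_likelihood(X, 2.0 * np.eye(N), H, theta, D)
print(value_unit, value_scaled)
```

Using `slogdet` and `solve` instead of `det` and an explicit inverse keeps the evaluation stable when N is large, as with 63 channels.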

Maximum-likelihood parameter estimation

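Thus, we are looking for the unknown matrices R, θ and H that maximize the likelihood function for our data X:

$$l(R, H, \theta \mid X) = \frac{1}{(2\pi)^{NJT/2} (\det R)^{JT/2}} \exp\left\{ -\frac{1}{2} \mathrm{Tr}\left[ R^{-1} (X - H\theta D^T)(X - H\theta D^T)^T \right] \right\}$$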

This was done by Baryshnikov et al.

However, the parameter P (the rank of matrix H) remains free. The question of a suitable choice of P is discussed.

Comparison of MLE with independent methods

• Filtering and averaging
  1. Filter the data (4th order Butterworth filter with passband 1-20 Hz)
  2. Average the data over all epochs
  - local in both the temporal and spatial dimensions

• Principal component analysis (PCA)
  Project the data onto the subspace given by the eigenvectors corresponding to the largest eigenvalues of the data covariance matrix.
  PCA in EEG evoked potential analysis requires signal-free data for noise whitening (pre-stimulus based whitening).
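The averaging step and the PCA projection can be sketched as follows. This is a simplified illustration under stated assumptions: the Butterworth filtering step is omitted, the simulated noise is already white so no pre-stimulus whitening is applied, and the function names and toy sizes are hypothetical.

```python
import numpy as np

def average_epochs(epochs):
    """epochs: array of shape (J, N, T); returns the N x T mean over epochs."""
    return epochs.mean(axis=0)

def pca_project(data, P):
    """Project N x T data onto the P leading eigenvectors of its spatial covariance."""
    cov = data @ data.T / data.shape[1]        # N x N data covariance
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    U = eigvecs[:, -P:]                        # eigenvectors of the P largest eigenvalues
    return U @ (U.T @ data)

rng = np.random.default_rng(3)
J, N, T, P = 20, 6, 100, 2
signal = rng.standard_normal((N, P)) @ rng.standard_normal((P, T))   # rank-P toy signal
epochs = signal + rng.standard_normal((J, N, T))                     # white noise per epoch

mean_estimate = average_epochs(epochs)
pca_estimate = pca_project(mean_estimate, P)

err_mean = np.sum((mean_estimate - signal) ** 2)
err_pca = np.sum((pca_estimate - signal) ** 2)
print(err_mean, err_pca)
```

In this toy setting the projection discards the noise lying outside the P-dimensional signal subspace, which is why the PCA estimate is typically closer to the true signal than the plain average.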

Theoretically

• for C = 1, P = N, MLE gives exactly the mean over epochs
• for C ≠ 1, P = N, MLE gives the mean over "filtered" epochs
• for C = 1, P ≠ N, matrix H contains the eigenvectors corresponding to the P largest eigenvalues of a matrix built from the epoch data X_i, X_j and the noise W_i, W_j [equation not recoverable]

Link to PCA

• columns of H … eigenvectors corresponding to the P largest eigenvalues
• increasing P … MLE tends to "filtering and averaging"
• low values of P are interesting
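The full-rank case C = 1, P = N can be checked numerically. The sketch below relies on a standard property of the Gaussian likelihood: for a fixed signal candidate S the maximizing noise covariance is the residual covariance, so the profile log-likelihood is, up to constants, −(JT/2) log det of that covariance. The function name and the toy data are assumptions.

```python
import numpy as np

def profile_logdet(epochs, S):
    """log det of the ML noise covariance estimate for a fixed signal candidate S.
    Up to constants, the profile log-likelihood is -(J*T/2) times this value."""
    J, N, T = epochs.shape
    resid = epochs - S                                          # (J, N, T) residuals X_j - S
    R_hat = np.einsum("jnt,jmt->nm", resid, resid) / (J * T)    # (1/JT) sum_j E_j E_j^T
    return np.linalg.slogdet(R_hat)[1]

rng = np.random.default_rng(4)
J, N, T = 30, 4, 60
A = rng.standard_normal((N, N))                  # mixing matrix: spatially coloured noise
S_true = rng.standard_normal((N, T))
epochs = S_true + np.einsum("nm,jmt->jnt", A, rng.standard_normal((J, N, T)))

S_mean = epochs.mean(axis=0)                     # mean over epochs
at_mean = profile_logdet(epochs, S_mean)
at_other = profile_logdet(epochs, S_mean + 0.1 * rng.standard_normal((N, T)))
print(at_mean, at_other)
```

Any deviation from the epoch mean adds a positive semidefinite term to the residual covariance, so its determinant grows and the likelihood drops.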

Comparison of MLE with filtering/averaging method

Green … unfiltered mean over epochs
Blue … filtered (1-20 Hz) mean over epochs
Red … MLE (P = 5, frequency band 1-20 Hz)

stimulus onset at 200 ms

[Figure: channels 33-36]

[Figure: channels 57-60]

Dependence of MLE on P

Differences between two matrices are calculated in the norm

$$\|A\| = \left( \frac{1}{NT} \sum_{i=1}^{N} \sum_{j=1}^{T} A_{ij}^{2} \right)^{1/2}$$



Dependence of MLE and filtering/averaging on J

Question of a suitable value of P

Problem: we do not know the signal of interest S, so we cannot determine for which P the MLE is closest to S.

Solution: simulated EEG data: take some signal of interest S and add (coloured) noise to it.

$$\mathrm{SNR(dB)} = 10 \log_{10} \frac{\|S\|^2}{\|W\|^2}$$
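A simulation of this kind might look like the following minimal sketch. It assumes a Frobenius-type norm in the SNR definition, noise power averaged over epochs, and spatial colouring produced by a random mixing matrix; the toy signal, the sizes and the function names are illustrative assumptions, not the data used in the talk.

```python
import numpy as np

def snr_db(S, W):
    """SNR(dB) = 10 log10(||S||^2 / ||W||^2); W has shape (J, N, T) and the
    noise power is averaged over the J epochs."""
    noise_power = np.sum(W**2) / W.shape[0]
    return 10.0 * np.log10(np.sum(S**2) / noise_power)

def simulate_epochs(S, J, target_snr_db, rng):
    """Return epochs X_j = S + W_j (shape (J, N, T)) and the noise W, with
    spatially coloured, temporally white noise scaled to the target SNR."""
    N, T = S.shape
    A = rng.standard_normal((N, N))                            # mixing matrix (assumption)
    W = np.einsum("nm,jmt->jnt", A, rng.standard_normal((J, N, T)))
    noise_power = np.sum(W**2) / J                             # mean ||W_j||^2 over epochs
    W = W * np.sqrt(np.sum(S**2) / noise_power) * 10 ** (-target_snr_db / 20.0)
    return S + W, W

rng = np.random.default_rng(5)
t = np.arange(200) / 1000.0                                    # 200 samples at 1 kHz
S = np.outer(rng.standard_normal(8), np.sin(2 * np.pi * 10.0 * t))   # toy rank-1 signal
X, W = simulate_epochs(S, J=50, target_snr_db=-10.0, rng=rng)
print(X.shape, snr_db(S, W))
```

Because S is known in the simulation, any estimate (MLE for various P, or filtering and averaging) can be compared directly against it.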

Question of a suitable value of P: simulated EEG data

Green … real signal of interest
Blue … filtered (1-20 Hz) mean over epochs
Red … MLE (P = 5, frequency band 1-20 Hz)



Simulated EEG data: SNR = 0 dB, P=5, channels 33-36

Simulated EEG data: SNR = 0 dB, P=5, channels 57-60



Simulated EEG data: SNR = -10 dB, P=5, channels 33-36










Conclusions

BETTER RESULTS THAN FILTERING/AVERAGING FOR:

• low number of epochs
• low SNR
