Maximum Likelihood Estimation for Multiepoch EEG Analysis
J. Kříž
Department of Physics, University of Hradec Králové
Quantum Circle, January 30, 2007
MOTIVATION
Why maximum likelihood estimation (MLE)?
It is often used in radar signal processing.
RADAR = Radio Detection and Ranging
EEG = Electroencephalography
measures electric potentials on the scalp (generated by neuronal activity in the brain)
Multiepoch EEG: Evoked potentials
= responses to an external stimulus (auditory, visual, etc.); they reflect sensory and cognitive processing in the brain
Common properties of EEG and radar signal processing:
• spatial–temporal character
• data of the form X = S + W
• low signal to noise ratio (SNR)
MOTIVATION
Is MLE a suitable topic for the QC seminar?
YES!!!
QC seminar: on various aspects of the quantum theory, primarily for students
MLE: Hradil, Řeháček, Fiurášek, Ježek, Maximum Likelihood Methods in Quantum Mechanics, in Quantum State Estimation, Lecture Notes in Physics (eds. M.G.A. Paris, J. Řeháček), 59-112, Springer, 2004.
Basic concept of MLE
• originally developed by R.A. Fisher in the 1920s
• assume a pdf f of a random vector y depending on a parameter set w, i.e. f(y|w)
• it determines the probability of observing the data vector y (in dependence on the parameters w)
• however, we are faced with an inverse problem: we are given the data vector and do not know the parameters
• define the likelihood function l by reversing the roles of the data and parameter vectors, i.e. l(w|y) = f(y|w)
• MLE maximizes l over all parameters w
• that is, given the observed data (and a model of interest), find the pdf that is most likely to produce the given data
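The concept above can be illustrated on a deliberately simple case. The following sketch is not taken from the talk: it assumes i.i.d. Gaussian data with parameters w = (mu, sigma2), for which the maximizers of the likelihood are known in closed form, and checks numerically that no point of a coarse parameter grid gives a higher log-likelihood. All names and numerical values are illustrative.

```python
import numpy as np

def gaussian_log_likelihood(y, mu, sigma2):
    """log l(w|y) = log f(y|w) for i.i.d. Gaussian samples, w = (mu, sigma2)."""
    n = y.size
    return (-0.5 * n * np.log(2.0 * np.pi * sigma2)
            - 0.5 * np.sum((y - mu) ** 2) / sigma2)

rng = np.random.default_rng(0)
# Simulated data vector y (true parameters chosen for illustration only)
y = rng.normal(loc=2.0, scale=1.5, size=500)

# Closed-form maximizers of the Gaussian likelihood
mu_hat = y.mean()
sigma2_hat = np.mean((y - mu_hat) ** 2)
best = gaussian_log_likelihood(y, mu_hat, sigma2_hat)

# Numerical check on a coarse grid of candidate parameters
mu_grid = np.linspace(0.0, 4.0, 81)
sigma2_grid = np.linspace(0.5, 5.0, 91)
grid_best = max(gaussian_log_likelihood(y, m, s)
                for m in mu_grid for s in sigma2_grid)

print(mu_hat, sigma2_hat, best >= grid_best)
```

Working with the logarithm of l is a common convenience: it has the same maximizer and avoids numerical underflow for long data vectors.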
MLE for EEG evoked response analysis
Baryshnikov, B.V., Van Veen, B.D. and Wakai, R.T., IEEE Trans. Biomed. Eng. 51 (2004), pp. 1981-1993.
Experiment:
• pattern reversal evoked potentials
• 63-channel EEG device
• 100 epochs
• sampling rate of 1 kHz
Assumptions:
• the response is the same across all epochs
• the noise is independent from trial to trial
• the noise is temporally white, but spatially coloured
• the noise is normally distributed with zero mean
MLE for EEG evoked response analysis
Experiment:
N … number of spatial channels, T … number of time samples per epoch, J … number of epochs (N = 63, T = 666, J = 100)
data for the j-th epoch: X_j = S + W_j … N x T matrix
The estimate of the repeated signal S can be expressed in the form
S = H θ C^T
C … known T x L matrix of temporal basis vectors, i.e. rows of S are linear combinations of columns of C; the known frequency band of interest is used to construct C
H … unknown N x P matrix of spatial basis vectors, i.e. columns of S are linear combinations of columns of H
θ … unknown P x L matrix of coefficients
The model is purely linear and nonlocal in both the spatial and the temporal dimension.
MLE for EEG evoked response analysis: Model
Full dataset of J epochs: X = [ X_1 X_2 ... X_J ] … N x JT matrix
Noise over J epochs: W = [ W_1 W_2 ... W_J ] … N x JT matrix
X = [ S S ... S ] + W,
[ S S ... S ] = H θ D^T, where D^T = [ C^T C^T ... C^T ] and θ is the unknown P x L matrix of coefficients
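The matrix shapes in this model are easy to mix up, so a small numerical sketch may help. It assumes random stand-ins for C, H and θ and small illustrative sizes (not the N = 63, T = 666, J = 100 of the experiment), and white noise for simplicity; it only verifies that stacking C over J epochs reproduces [ S S ... S ].

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, J = 6, 40, 5     # channels, samples per epoch, epochs (illustrative sizes)
P, L = 2, 4            # rank of H, number of temporal basis vectors

C = rng.standard_normal((T, L))       # stand-in for the known temporal basis
H = rng.standard_normal((N, P))       # spatial basis (unknown in practice)
theta = rng.standard_normal((P, L))   # coefficient matrix (unknown in practice)

S = H @ theta @ C.T                   # N x T repeated signal
D = np.vstack([C] * J)                # JT x L, so that D.T = [C.T C.T ... C.T]

W = rng.standard_normal((N, J * T))   # noise; white here, only for simplicity
X = H @ theta @ D.T + W               # N x JT full dataset

# Recover the individual epochs X_j as N x T blocks
epochs = np.split(X, J, axis=1)
print(X.shape, len(epochs), epochs[0].shape)
```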
MLE for EEG evoked response analysis: Model
The noise covariance "supermatrix" is modeled as the Kronecker product of the spatial and temporal covariance matrices, i.e. every element of the N x N "spatial matrix" is replaced by a JT x JT "temporal matrix" block scaled by that element.
R_T = W^T W … JT x JT temporal covariance matrix (R_T = I, the identity matrix, in our model)
R = W W^T … N x N spatial covariance matrix (unknown in our model)
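The block structure of the Kronecker product can be checked directly with `numpy.kron`. The spatial matrix below is a made-up 3 x 3 example, and the sizes are chosen only to keep the output small.

```python
import numpy as np

N, JT = 3, 4
# Hypothetical spatial covariance R (symmetric, positive definite)
R = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])
R_T = np.eye(JT)                 # temporally white noise: R_T = identity

full_cov = np.kron(R, R_T)       # (N*JT) x (N*JT) noise covariance "supermatrix"

# Block (0, 1) is the JT x JT temporal matrix scaled by R[0, 1]
block_01 = full_cov[0 * JT:1 * JT, 1 * JT:2 * JT]
print(full_cov.shape)
print(block_01)
```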
Temporal basis matrix C
Processes of interest in EEG are usually in the frequency band 1-20 Hz.
Temporal basis vectors can be chosen as (discretized) sin(2πft), cos(2πft) to cover the frequency band of interest.
The number of basis vectors L is given by the frequency band.
In the case L = T we may choose C = I, the identity matrix (we take all frequencies).
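One possible way to build such a basis is sketched below. The slides do not say which frequencies are sampled inside the band; this sketch assumes the frequencies that fit an integer number of periods into one epoch (multiples of fs/T), which makes the columns mutually orthogonal. The function name `temporal_basis` and this frequency grid are assumptions made for illustration.

```python
import numpy as np

def temporal_basis(T, fs, f_lo, f_hi):
    """Columns sin(2*pi*f*t), cos(2*pi*f*t) for f = k*fs/T inside [f_lo, f_hi]."""
    t = np.arange(T) / fs                  # sample times in seconds
    df = fs / T                            # frequency resolution of one epoch
    k_lo = int(np.ceil(f_lo / df))
    k_hi = int(np.floor(f_hi / df))
    freqs = df * np.arange(k_lo, k_hi + 1)
    cols = []
    for f in freqs:
        cols.append(np.sin(2.0 * np.pi * f * t))
        cols.append(np.cos(2.0 * np.pi * f * t))
    return np.column_stack(cols), freqs

# Epoch length and sampling rate as quoted in the talk
C, freqs = temporal_basis(T=666, fs=1000.0, f_lo=1.0, f_hi=20.0)
print(C.shape)                 # T x L
print(freqs[0], freqs[-1])     # lowest and highest frequency used
```

With this grid, L is twice the number of frequencies in the band, which illustrates the remark that L is given by the frequency band.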
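MLE for EEG evoked response analysis: Model
Under all the above assumptions, the pdf can be written as
$$ f(X \mid R, H, \theta) = \frac{1}{(2\pi)^{NTJ/2} \, (\det R)^{TJ/2}} \exp\left\{ -\frac{1}{2} \operatorname{Tr}\left[ R^{-1} (X - H\theta D^T)(X - H\theta D^T)^T \right] \right\} $$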
Maximum-likelihood parameter estimation
Thus, we are looking for the unknown matrices R, θ and H that maximize the likelihood function for our data X:
$$ l(R, H, \theta \mid X) = \frac{1}{(2\pi)^{NTJ/2} \, (\det R)^{TJ/2}} \exp\left\{ -\frac{1}{2} \operatorname{Tr}\left[ R^{-1} (X - H\theta D^T)(X - H\theta D^T)^T \right] \right\} $$
This was done by Baryshnikov et al.
However, the parameter P (the rank of matrix H) remains free. The question of a suitable choice of P is discussed.
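For concreteness, the logarithm of this likelihood can be evaluated numerically for any candidate R, H and θ. The sketch below only evaluates the function; it is not the maximization procedure of Baryshnikov et al., which the slides do not reproduce. The function name `log_likelihood` and the random test data are assumptions.

```python
import numpy as np

def log_likelihood(X, R, H, theta, D):
    """log l(R, H, theta | X) for temporally white, spatially coloured Gaussian noise."""
    N, JT = X.shape
    resid = X - H @ theta @ D.T                   # X - H theta D^T
    sign, logdet = np.linalg.slogdet(R)           # stable log(det R)
    if sign <= 0:
        raise ValueError("R must be positive definite")
    # Tr[R^{-1} resid resid^T], computed without forming R^{-1} explicitly
    quad = np.trace(np.linalg.solve(R, resid @ resid.T))
    return -0.5 * N * JT * np.log(2.0 * np.pi) - 0.5 * JT * logdet - 0.5 * quad

rng = np.random.default_rng(3)
N, T, J, P, L = 3, 8, 4, 2, 3                     # small illustrative sizes
C = rng.standard_normal((T, L))
D = np.vstack([C] * J)
H = rng.standard_normal((N, P))
theta = rng.standard_normal((P, L))
R = np.eye(N)
X = H @ theta @ D.T + rng.standard_normal((N, J * T))

ll_true = log_likelihood(X, R, H, theta, D)                  # generating parameters
ll_zero = log_likelihood(X, R, H, np.zeros_like(theta), D)   # "no signal" model
print(ll_true > ll_zero)
```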
Comparison of MLE with independent methods
• Filtering and averaging
  1. Filter the data (4th order Butterworth filter with passband 1-20 Hz)
  2. Average the data over all epochs
  The method is local in both the temporal and the spatial dimension.
• Principal component analysis (PCA)
  Project the data onto the subspace spanned by the eigenvectors corresponding to the largest eigenvalues of the data covariance matrix.
  PCA in EEG evoked potential analysis requires signal-free data for noise whitening (pre-stimulus based whitening).
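The projection step of the PCA approach can be sketched as follows. This is a bare-bones illustration that assumes zero-mean data and deliberately omits the pre-stimulus noise whitening mentioned above; the function name `pca_project` and the synthetic data are assumptions.

```python
import numpy as np

def pca_project(X, P):
    """Project N x M data onto the P principal eigenvectors of its covariance."""
    cov = X @ X.T / X.shape[1]               # N x N data covariance (zero-mean data assumed)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    U = eigvecs[:, ::-1][:, :P]              # eigenvectors of the P largest eigenvalues
    return U @ (U.T @ X), U

rng = np.random.default_rng(4)
N, M, P = 8, 200, 2
# Synthetic data: a strong rank-2 component plus weak white noise
signal = rng.standard_normal((N, P)) @ rng.standard_normal((P, M))
X = 5.0 * signal + 0.1 * rng.standard_normal((N, M))

X_proj, U = pca_project(X, P)
print(X_proj.shape, np.linalg.matrix_rank(X_proj))
```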
Comparison of MLE with independent methods
Theoretically
(I denotes the identity matrix)
• for C = I, P = N, MLE gives exactly the mean over epochs
• for C ≠ I, P = N, MLE gives the mean over "filtered" epochs
• for C = I, P ≠ N, matrix H contains the eigenvectors corresponding to the P largest eigenvalues of a matrix built from the epoch data X_i, X_j and the noise W [expression not recoverable from the source]
Link to PCA
columns of H … eigenvectors corresponding to the P largest eigenvalues
increasing P … MLE tends to "filtering and averaging"
low values of P are interesting
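The first two statements (P = N) can be checked numerically. The sketch relies on a standard result for Gaussian linear regression rather than on anything shown in the slides: when the product Hθ is an unconstrained N x L matrix, its maximum-likelihood estimate coincides with the least-squares fit, whatever the unknown spatial covariance R is. The helper name `full_rank_estimate` and the random data are assumptions, and the "filtered" case assumes a basis C with orthonormal columns.

```python
import numpy as np

def full_rank_estimate(X, C, J):
    """Least-squares estimate of S = B C^T with an unconstrained N x L matrix B."""
    D = np.vstack([C] * J)                       # JT x L
    B = np.linalg.solve(D.T @ D, D.T @ X.T).T    # B = X D (D^T D)^{-1}
    return B @ C.T

rng = np.random.default_rng(2)
N, T, J = 4, 32, 10
X = rng.standard_normal((N, J * T))              # stand-in for [X_1 X_2 ... X_J]
X_mean = X.reshape(N, J, T).mean(axis=1)         # plain average over epochs

# C = I: the estimate is exactly the mean over epochs
S_all = full_rank_estimate(X, np.eye(T), J)

# C != I with orthonormal columns: the mean projected onto span(C),
# i.e. the mean over "filtered" epochs
C_band, _ = np.linalg.qr(rng.standard_normal((T, 6)))
S_band = full_rank_estimate(X, C_band, J)

print(np.allclose(S_all, X_mean), np.allclose(S_band, X_mean @ C_band @ C_band.T))
```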
Comparison of MLE with filtering/averaging method
Green … nonfiltered mean over epochs
Blue … filtered (1-20 Hz) mean over epochs
Red … MLE (P = 5, frequency band 1-20 Hz)
Stimulus onset at 200 ms
[Figure panels: channels 33-36 and channels 57-60]
Dependence of MLE on P
Differences between two matrices are calculated in the norm
$$ \|A\| = \Big( \sum_{i,j} |A_{ij}|^2 \Big)^{1/2}, \qquad A = (A_{ij}), \quad i = 1, \dots, N, \quad j = 1, \dots, T. $$
Dependence of MLE and filtering/averaging on J
Question of a suitable value of P
Problem: we do not know the signal of interest S, so we cannot determine for which P the MLE is closest to S.
Solution: simulated EEG data: take some signal of interest S and add (coloured) noise to it.
$$ \mathrm{SNR(dB)} = 10 \log_{10} \frac{\|S\|^2}{\|W\|^2} $$
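A simulation of this kind can be sketched as follows. The signal shape, the mixing matrix used to colour the noise spatially, and the choice to evaluate the SNR on the full N x JT arrays are all assumptions made for illustration; the slides do not specify how the simulated data were generated.

```python
import numpy as np

def matrix_norm(A):
    """Square root of the sum of squared matrix elements."""
    return np.sqrt(np.sum(np.abs(A) ** 2))

def snr_db(S, W):
    """SNR in dB as 10 log10(||S||^2 / ||W||^2)."""
    return 10.0 * np.log10(matrix_norm(S) ** 2 / matrix_norm(W) ** 2)

def scale_noise_to_snr(S, W, target_db):
    """Rescale W so that snr_db(S, W_scaled) equals target_db."""
    return W * 10.0 ** ((snr_db(S, W) - target_db) / 20.0)

rng = np.random.default_rng(5)
N, T, J = 4, 100, 20                              # illustrative sizes
t = np.arange(T) / 1000.0                         # 1 kHz sampling
# Hypothetical signal of interest: one 10 Hz component with channel-dependent gain
S = np.outer(rng.standard_normal(N), np.sin(2.0 * np.pi * 10.0 * t))
S_rep = np.hstack([S] * J)                        # [S S ... S]

# Spatially coloured, temporally white noise via a hypothetical mixing matrix
mixing = rng.standard_normal((N, N))
W = scale_noise_to_snr(S_rep, mixing @ rng.standard_normal((N, J * T)), -10.0)

X = S_rep + W                                     # simulated multiepoch data
print(round(snr_db(S_rep, W), 6))
```

Because S is known here, the distance between each estimate and S can be computed directly with `matrix_norm`, which is what makes the comparison across values of P possible.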
Question of a suitable value of P: simulated EEG data
Green … real signal of interest
Blue … filtered (1-20 Hz) mean over epochs
Red … MLE (P = 5, frequency band 1-20 Hz)
[Figures]
Simulated EEG data: SNR = 0 dB, P = 5, channels 33-36
Simulated EEG data: SNR = 0 dB, P = 5, channels 57-60
Simulated EEG data: SNR = -10 dB, P = 5, channels 33-36
Simulated EEG data: SNR = -10 dB, P = 5, channels 57-60
Simulated EEG data: SNR = -20 dB, P = 2, channels 33-36
Simulated EEG data: SNR = -20 dB, P = 2, channels 57-60
Simulated EEG data: SNR = -30 dB, P = 1, channels 33-36
Simulated EEG data: SNR = -30 dB, P = 1, channels 57-60
Conclusions
BETTER RESULTS THAN FILTERING/AVERAGING:
• low number of epochs
• low SNR