Wavelet-Based Denoising Using Hidden Markov Models
ELEC 631 Course Project
Mohammad Jaber Borran
Jan 05, 2016
Some Properties of the DWT
• Primary
  – Locality → matches more signals
  – Multiresolution
  – Compression → sparse DWTs
• Secondary
  – Clustering → dependency within scale
  – Persistence → dependency across scale
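To make the compression property concrete, here is a minimal sketch (my own illustration, not from the slides) that decomposes a piecewise-smooth test signal with PyWavelets and measures how much of the energy the largest 1% of coefficients capture; the test signal and the 1% threshold are arbitrary choices.

```python
import numpy as np
import pywt

# Piecewise-smooth test signal (an arbitrary stand-in for the slides' signals).
t = np.linspace(0, 1, 1024)
signal = np.piecewise(t, [t < 0.3, t >= 0.3],
                      [lambda t: 50 * t, lambda t: 100 * np.exp(-5 * t)])

# Multiresolution decomposition: coarse approximation + details at each scale.
coeffs = pywt.wavedec(signal, "haar")
flat = np.concatenate(coeffs)

# Compression: most of the energy lives in a few large coefficients.
energy = np.sort(flat**2)[::-1]
top = max(1, int(0.01 * len(flat)))
print(f"Energy in largest 1% of coefficients: {energy[:top].sum() / energy.sum():.3f}")
```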
Probabilistic Model for an Individual Wavelet Coefficient
• Compression → many small coefficients, few large coefficients
• Model each coefficient W with a hidden state S that selects one of two conditional densities:
$$f_W(w) = \sum_{m=1}^{2} p_S(m)\, f_{W|S}(w \mid m)$$
[Figure: two-state mixture model; the hidden state S, with pmf $p_S(1)$, $p_S(2)$, selects between the conditional densities $f_{W|S}(w \mid 1)$ and $f_{W|S}(w \mid 2)$, producing the marginal $f_W(w)$.]
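A small sketch of evaluating this mixture pdf, assuming Gaussian conditional densities (one low-variance "small" state, one high-variance "large" state); the numbers in the usage lines are illustrative, not from the slides.

```python
import numpy as np

def gaussian_mixture_pdf(w, p_s, mu, sigma):
    """f_W(w) = sum_m p_S(m) * f_{W|S}(w|m) with Gaussian conditionals."""
    w = np.asarray(w, dtype=float)
    total = np.zeros_like(w)
    for p, m, s in zip(p_s, mu, sigma):
        total += p * np.exp(-((w - m) ** 2) / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)
    return total

# State 1 = "small" coefficients (low variance), state 2 = "large" (high variance).
w = np.linspace(-10, 10, 201)
f = gaussian_mixture_pdf(w, p_s=[0.8, 0.2], mu=[0.0, 0.0], sigma=[0.5, 4.0])
```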
Probabilistic Model for a Wavelet Transform
[Figure: time-frequency tilings of the wavelet coefficients under each model.]
• Ignoring the dependencies → Independent Mixture (IM) Model
• Clustering → Hidden Markov Chain Model
• Persistence → Hidden Markov Tree Model
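The tree behind the HMT links each wavelet coefficient to one parent at the next coarser scale. The sketch below is my own helper, not from the slides: it builds those parent indices for a 1-D dyadic-length signal, where child k at one scale has parent k // 2 at the next coarser scale.

```python
import numpy as np
import pywt

def hmt_parent_links(signal, wavelet="haar"):
    """Flatten the detail coefficients across scales and record, for each
    coefficient, the flat index of its parent at the next coarser scale
    (-1 marks the coarsest-scale roots). Assumes a dyadic-length signal."""
    details = pywt.wavedec(signal, wavelet)[1:]   # drop the approximation band
    offsets = np.cumsum([0] + [len(d) for d in details])  # start of each scale
    parent = []
    for j, d in enumerate(details):               # j = 0 is the coarsest scale
        for k in range(len(d)):
            parent.append(-1 if j == 0 else offsets[j - 1] + k // 2)
    return np.concatenate(details), np.array(parent)
```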
Parameters of the HMT Model
• pmf of the root node: $p_{S_1}(m)$
• transition probabilities: $\epsilon_{i,\rho(i)}^{m,r} = p_{S_i \mid S_{\rho(i)}}(m \mid r)$
• (parameters of the) conditional pdfs: $f_{i,m}(w) = f_{W_i|S_i}(w \mid m)$; e.g. if a Gaussian mixture is used, $\mu_{i,m}$ and $\sigma_{i,m}^2$
• $\boldsymbol{\theta}$: model parameter vector
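A possible container for this parameter vector; the field names and array shapes are my own conventions, not from the slides.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HMTParams:
    """Hypothetical container for the HMT parameter vector theta:
    M hidden states, N wavelet coefficients arranged in a tree."""
    root_pmf: np.ndarray   # p_{S_1}(m), shape (M,)
    trans: np.ndarray      # eps_{i,rho(i)}^{m,r} = P(S_i = m | S_rho(i) = r), shape (N, M, M)
    mix_params: np.ndarray # conditional-pdf parameters, e.g. (mu_{i,m}, sigma2_{i,m}), shape (N, M, 2)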
Dependency between Signs of Wavelet Coefficients
[Figure: example signals on $[0, T]$ and the corresponding Haar wavelets, illustrating how the sign of a parent coefficient $w_1$ (support $[0, T]$) constrains the signs of the child coefficients $w_2$ (supports $[0, T/2]$ and $[T/2, T]$).]
New Probabilistic Model for Individual Wavelet Coefficients
• Use one-sided functions as conditional probability densities:
$$f_W(w) = \sum_{m=1}^{4} p_S(m)\, f_{W|S}(w \mid m)$$
[Figure: four-state mixture model; the hidden state S, with pmf $p_S(1), \ldots, p_S(4)$, selects among four one-sided conditional densities $f_{W|S}(w \mid m)$, producing the marginal $f_W(w)$.]
Proposed Mixture PDF
• Use exponential distributions as the components of the mixture distribution

If $m$ is even:
$$f_{W_i|S_i}(w \mid m) = \begin{cases} \mu_{i,m}\, e^{-\mu_{i,m} w}, & w \ge 0 \\ 0, & w < 0 \end{cases}$$

If $m$ is odd:
$$f_{W_i|S_i}(w \mid m) = \begin{cases} \mu_{i,m}\, e^{\mu_{i,m} w}, & w \le 0 \\ 0, & w > 0 \end{cases}$$
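A direct transcription of these two cases; both branches reduce to $\mu_{i,m}\, e^{-\mu_{i,m}|w|}$ on their respective supports, which keeps the evaluation overflow-safe. Here `m` is the slide's 1-based state label (even = positive side).

```python
import numpy as np

def exp_component_pdf(w, m, mu):
    """One-sided exponential component f_{W|S}(w|m) with rate mu = mu_{i,m}:
    supported on w >= 0 for even m, on w <= 0 for odd m."""
    w = np.asarray(w, dtype=float)
    dens = mu * np.exp(-mu * np.abs(w))           # value on the supported side
    support = (w >= 0) if m % 2 == 0 else (w <= 0)
    return np.where(support, dens, 0.0)
```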
PDF of the Noisy Wavelet Coefficients
Noisy wavelet coefficient: $y = w + n$, where $n \sim \mathcal{N}(0, \sigma^2)$.
The wavelet transform is orthonormal; therefore, if the additive noise is a white, zero-mean Gaussian process with variance $\sigma^2$, the noise in the wavelet domain is also white, zero-mean Gaussian with the same variance, and we have

If $m$ is even:
$$f_{Y_i|S_i}(y \mid m) = \mu_{i,m}\, e^{\frac{1}{2}\mu_{i,m}^2 \sigma^2 - \mu_{i,m} y}\; Q\!\left(\frac{\mu_{i,m}\sigma^2 - y}{\sigma}\right)$$

If $m$ is odd:
$$f_{Y_i|S_i}(y \mid m) = \mu_{i,m}\, e^{\frac{1}{2}\mu_{i,m}^2 \sigma^2 + \mu_{i,m} y}\; Q\!\left(\frac{\mu_{i,m}\sigma^2 + y}{\sigma}\right)$$

where $Q(x) = \int_x^\infty \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt$ is the Gaussian tail function.
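A sketch evaluating these densities, using SciPy's `norm.sf` as the Q function; the closed form can overflow for extreme arguments, so treat this as a sketch rather than production code. Again `m` is the slide's 1-based state label.

```python
import numpy as np
from scipy.stats import norm

def noisy_component_pdf(y, m, mu, sigma):
    """f_{Y|S}(y|m) for y = w + n with w a one-sided exponential (rate mu)
    and n ~ N(0, sigma^2); Q is the Gaussian tail function norm.sf."""
    y = np.asarray(y, dtype=float)
    sign = 1.0 if m % 2 == 0 else -1.0   # even: support w >= 0; odd: w <= 0
    return (mu * np.exp(0.5 * (mu * sigma) ** 2 - sign * mu * y)
            * norm.sf((mu * sigma**2 - sign * y) / sigma))
```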
Training the HMT Model
• y: observed noisy wavelet coefficients
• s: vector of hidden states
• $\boldsymbol{\theta}$: model parameter vector

Maximum-likelihood parameter estimation:
$$\text{maximize}_{\boldsymbol{\theta}}\; f_{\mathbf{Y}}(\mathbf{y} \mid \boldsymbol{\theta})$$
This is intractable, because s is unobserved (hidden).
Model Training Using the Expectation-Maximization (EM) Algorithm
• Define the set of complete data, $\mathbf{x} = (\mathbf{y}, \mathbf{s})$:
$$f_{\mathbf{X}}(\mathbf{x} \mid \boldsymbol{\theta}) = f_{\mathbf{Y},\mathbf{S}}(\mathbf{y}, \mathbf{s} \mid \boldsymbol{\theta}) = f_{\mathbf{Y}|\mathbf{S}}(\mathbf{y} \mid \mathbf{s}, \boldsymbol{\theta})\, p_{\mathbf{S}}(\mathbf{s} \mid \boldsymbol{\theta})$$
$$f_{\mathbf{Y}|\mathbf{S}}(\mathbf{y} \mid \mathbf{s}, \boldsymbol{\theta}) = \prod_{i=1}^{N} f_{Y_i|S_i}(y_i \mid s_i), \qquad p_{\mathbf{S}}(\mathbf{s} \mid \boldsymbol{\theta}) = p_{S_1}(s_1) \prod_{i=2}^{N} \epsilon_{i,\rho(i)}^{s_i, s_{\rho(i)}}$$
• and then, iteratively
$$\text{maximize}\; U(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{l}) = \mathrm{E}_{\mathbf{S}|\mathbf{y},\boldsymbol{\theta}^{l}}\!\left[\log f_{\mathbf{X}}(\mathbf{x} \mid \boldsymbol{\theta})\right]$$
EM Algorithm (continued)
$$U(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{l}) = \sum_{\mathbf{s}} p_{\mathbf{S}|\mathbf{Y}}(\mathbf{s} \mid \mathbf{y}, \boldsymbol{\theta}^{l}) \log f_{\mathbf{Y},\mathbf{S}}(\mathbf{y}, \mathbf{s} \mid \boldsymbol{\theta})$$
$$= \sum_{\mathbf{s}} p_{\mathbf{S}|\mathbf{Y}}(\mathbf{s} \mid \mathbf{y}, \boldsymbol{\theta}^{l}) \left[ \log p_{S_1}(s_1) + \sum_{i=2}^{N} \log \epsilon_{i,\rho(i)}^{s_i, s_{\rho(i)}} + \sum_{i=1}^{N} \log f_{Y_i|S_i}(y_i \mid s_i) \right]$$
• The state a posteriori probabilities are calculated using the upward-downward algorithm.
• The root-state a priori pmf and the state transition probabilities are found using Lagrange multipliers to maximize U.
• The parameters of the conditional pdfs may be calculated analytically or numerically to maximize U.
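As a rough sketch of one EM iteration, the code below implements the simpler independent-mixture (IM) case, where the E-step posteriors factor per coefficient; the full HMT would compute them with the upward-downward recursions instead. The moment-matching update for the exponential rates is my own simplification (it ignores the noise contribution), whereas the slides maximize U analytically or numerically.

```python
import numpy as np

def em_step_im(y, p_s, mu, sigma, component_pdf):
    """One EM iteration for the independent-mixture (IM) simplification.
    component_pdf(y, m, mu, sigma) evaluates f_{Y|S}(y|m) for 1-based state
    label m, e.g. noisy_component_pdf from the earlier sketch."""
    M = len(p_s)
    # E-step: per-coefficient state posteriors gamma[m-1, i] = p(S_i = m | y_i).
    lik = np.stack([p_s[m - 1] * component_pdf(y, m, mu[m - 1], sigma)
                    for m in range(1, M + 1)])
    gamma = lik / lik.sum(axis=0, keepdims=True)
    # M-step: mixture weights are posterior averages; rates via the crude
    # moment match E|W| ~ 1/mu (an assumption of mine, noise ignored).
    p_new = gamma.mean(axis=1)
    mu_new = np.array([1.0 / max(np.average(np.abs(y), weights=gamma[m]), 1e-12)
                       for m in range(M)])
    return p_new, mu_new
```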
Denoising
Given the state, the posterior of the clean coefficient is a half-line-truncated Gaussian; for even $s$ (the odd-$s$ expression is the mirror image on $w \le 0$),
$$f_{W|Y,S}(w \mid y, s) = \frac{f_{W|S}(w \mid s)\, f_{Y|W,S}(y \mid w, s)}{f_{Y|S}(y \mid s)} = \frac{e^{-\left(w - (y - \mu_{i,s}\sigma^2)\right)^2 / (2\sigma^2)}}{\sqrt{2\pi\sigma^2}\; Q\!\left(\frac{\mu_{i,s}\sigma^2 - y}{\sigma}\right)}, \qquad w \ge 0$$
• MAP estimate:
$$\hat{w}_s = \arg\max_{w} f_{W|Y,S}(w \mid y, s), \qquad \hat{s} = \arg\max_{s} p_S(s)\, f_{W|Y,S}(\hat{w}_s \mid y, s), \qquad \hat{w} = \hat{w}_{\hat{s}}$$
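Because the per-state posterior is a half-line-truncated Gaussian, its mode (the MAP value for a fixed state) is simply the clipped center; a minimal sketch, with `s` the slide's 1-based state label:

```python
def map_estimate(y, s, mu, sigma):
    """Per-state MAP shrinkage: the posterior f_{W|Y,S} is a Gaussian
    N(y -/+ mu*sigma^2, sigma^2) truncated to the component's half-line,
    so its mode is the truncation of its center."""
    if s % 2 == 0:                        # even state: support w >= 0
        return max(0.0, y - mu * sigma**2)
    return min(0.0, y + mu * sigma**2)    # odd state: support w <= 0
```

The state $\hat{s}$ is then chosen as on the slide, by maximizing $p_S(s)\, f_{W|Y,S}(\hat{w}_s \mid y, s)$ over s.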
Denoising (continued)
• Conditional-mean estimate:
$$\hat{w} = \sum_{s=1}^{M} p_S(s \mid y)\, \hat{w}_s$$
$$\hat{w}_s = \mathrm{E}\left[w \mid y, s\right] = y - \mu_{i,s}\sigma^2 + \frac{\sigma\, e^{-\left(y - \mu_{i,s}\sigma^2\right)^2 / (2\sigma^2)}}{\sqrt{2\pi}\; Q\!\left(\frac{\mu_{i,s}\sigma^2 - y}{\sigma}\right)} \qquad \text{(even } s\text{; the odd-}s\text{ case is the mirror image)}$$
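A sketch of this conditional-mean shrinkage rule, assuming the state posteriors $p_S(s \mid y)$ have already been computed by the trained model; folding onto the positive axis handles the odd (negative-side) states by symmetry.

```python
import numpy as np
from scipy.stats import norm

def cond_mean_estimate(y, post, mu, sigma):
    """w_hat = sum_s p(s|y) * E[W | y, s], where post[idx] is the posterior
    of 1-based state idx+1 and E[W|y,s] is the mean of the half-line-
    truncated Gaussian posterior."""
    w_hat = 0.0
    for idx, p in enumerate(post):
        m = idx + 1                            # slide states are 1-based
        sign = 1.0 if m % 2 == 0 else -1.0     # even = positive side
        c = sign * y - mu[idx] * sigma**2      # posterior center, folded axis
        alpha = -c / sigma
        e_w = c + sigma * norm.pdf(alpha) / norm.sf(alpha)  # truncated mean
        w_hat += p * sign * e_w
    return w_hat
```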
[Figure: original, noisy, and denoised versions of the first test signal, using four-state Gaussian and exponential mixtures with the Haar and D8 wavelets.]

Initial MSE = 24.639723    4 mix, Haar    4 mix, D8
Gaussian Mixture           3.078267       7.020152
Exponential Mixture        2.326472       7.030970
[Figure: original, noisy, and denoised versions of the second test signal, using two- and four-state Gaussian and exponential mixtures with the D8 wavelet.]

Initial MSE = 2.429741     2 mix, D8      4 mix, D8
Gaussian Mixture           0.471568       0.417795
Exponential Mixture        0.426488       0.397808
[Figure: original, noisy, and denoised versions of the third test signal, using two-state (D4) and four-state (D8) Gaussian and exponential mixtures.]

Initial MSE = 92.907059    2 mix, D4      4 mix, D8
Gaussian Mixture           8.442306       7.873508
Exponential Mixture        8.394187       7.862579
Conclusion
• Mixture distributions for individual wavelet coefficients can effectively model the non-Gaussian nature of the coefficients.
• Hidden Markov models can serve as a powerful tool for wavelet-based statistical signal processing.
• One-sided exponential distributions as mixture components, combined with the hidden Markov tree model, achieve better denoising performance.
[Figure: noisy first test signal and the results denoised using four-component Gaussian and exponential mixtures with the Haar wavelet.]
[Figure: noisy second test signal and the results denoised using four-component Gaussian and exponential mixtures with the Daubechies length-8 (D8) wavelet.]
[Figure: noisy third test signal and the results denoised using four-component Gaussian and exponential mixtures with the Daubechies length-8 (D8) wavelet.]