Discrete Denoising with Shifts Taesup Moon Yahoo! Labs EE477 Guest Lecture November 10, 2011 Taesup Moon (Yahoo! Labs) EE477 Guest Lecture Nov 10, 2011 1 / 24

Apr 26, 2020


Discrete Denoising with Shifts

Taesup Moon

Yahoo! Labs

EE477 Guest Lecture, November 10, 2011

Taesup Moon (Yahoo! Labs) EE477 Guest Lecture Nov 10, 2011 1 / 24


Discrete Denoising with Shifts

1 Prediction with Experts’ Advice

2 Discrete Denoising with Shifts
    Recap of DUDE
    Motivation
    New algorithm: S-DUDE
    Results


Discrete Denoising with Shifts Recap of DUDE

Discrete denoising

$X_t$, $Z_t$, $\hat{X}_t$ take values in finite alphabets

Choose $\hat{X}_1^n$ as close as possible to $X_1^n$, based on the entire $Z_1^n$

Ex) text correction, image denoising, DNA sequence analyses, etc.

Performance metric: per-symbol average loss


DUDE is the first universal discrete denoiser

DUDE - [Weissman et al. 05]

For each location t to be denoised, do:

1 fix the window size k

2 find the left k-context $(\ell_1, \ldots, \ell_k)$ and the right k-context $(r_1, \ldots, r_k)$ of $z_t$

$\ell_1\ \ell_2 \cdots \ell_k\ \ z_t\ \ r_1\ r_2 \cdots r_k$

3 count all occurrences of symbols in $z^n$ with the same context

4 decide on $\hat{x}_t$ according to

$\hat{x}_t(z_{t-k}^{t+k}) = \text{simple rule}\big(\Pi, \Lambda, \text{count vector}[z^n, z_{t-k}^{t-1}, z_{t+1}^{t+k}], z_t\big)$

Whenever DUDE sees $z_{t-k}^{t-1}\, z_t\, z_{t+1}^{t+k}$, it makes the same decision for $z_t$

DUDE is a “sliding window” denoiser
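The four steps above can be sketched for the binary case. This is a minimal sketch, assuming a binary alphabet, a BSC with known crossover probability `delta`, and Hamming loss. For the "simple rule" it uses the DUDE rule of [Weissman et al. 05], which picks the reconstruction minimizing $m^T \Pi^{-1} (\lambda_{\hat{x}} \odot \pi_{z_t})$, where $m$ is the count vector of the context. The function name `dude_binary` and the choice to leave the first and last k symbols untouched are assumptions made for illustration.

```python
from collections import defaultdict


def dude_binary(z, k, delta):
    """Two-pass DUDE sketch for a 0/1 list z observed through BSC(delta)."""
    n = len(z)

    def context(t):
        # Left and right k-contexts of position t.
        return (tuple(z[t - k:t]), tuple(z[t + 1:t + k + 1]))

    # Pass 1: for every two-sided context, count the symbols seen in the middle.
    counts = defaultdict(lambda: [0, 0])
    for t in range(k, n - k):
        counts[context(t)][z[t]] += 1

    # Channel matrix pi[x][z] and its inverse (closed form for the 2x2 BSC).
    pi = [[1 - delta, delta], [delta, 1 - delta]]
    d = 1 - 2 * delta
    pi_inv = [[(1 - delta) / d, -delta / d], [-delta / d, (1 - delta) / d]]

    # Pass 2: apply the rule wherever a full context is available.
    x_hat = list(z)  # boundary symbols are left as observed (an assumption)
    for t in range(k, n - k):
        m = counts[context(t)]
        # q[x] = (m^T Pi^{-1})[x]: estimate of how often clean symbol x
        # occurred in this context.
        q = [m[0] * pi_inv[0][x] + m[1] * pi_inv[1][x] for x in (0, 1)]
        # Estimated Hamming loss of reconstructing xh when z[t] was observed.
        scores = [
            sum(q[x] * pi[x][z[t]] * (x != xh) for x in (0, 1))
            for xh in (0, 1)
        ]
        x_hat[t] = 0 if scores[0] <= scores[1] else 1
    return x_hat
```

For example, `dude_binary([0] * 25 + [1] + [0] * 24, 1, 0.1)` treats the isolated 1 as noise, because a 1 between two 0's is rare in that sequence.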


Ex 1 : stationary bit stream gets corrupted

Xn : 00000011111110000000000111111111100000001111111110000

Zn : 00100011101110010001000111110111100000011110111110001

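source : binary Markov chain with p = 0.1, sequence length $n = 10^6$
(state diagram: states 0 and 1; stay in the current state with probability 1 − p, switch with probability p)

noise : BSC(δ = 0.1)
(channel diagram: each bit passes unchanged with probability 1 − δ and is flipped with probability δ)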

⇒ optimal BER attained by the Forward-Backward Recursion
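The experiment in Ex 1 can be reproduced at small scale. The sketch below simulates the symmetric binary Markov source and the BSC, then runs a normalized forward-backward recursion that outputs the more likely clean bit at each position. It assumes the denoiser is told both `p` and `delta`, which is what makes it a reference point rather than a universal scheme. The function names, the default seed and the uniform initial state are assumptions made for illustration.

```python
import random


def simulate(n, p, delta, seed=0):
    """Symmetric binary Markov source (switch prob. p) seen through BSC(delta)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1)]
    for _ in range(n - 1):
        x.append(x[-1] ^ (rng.random() < p))       # switch state w.p. p
    z = [xt ^ (rng.random() < delta) for xt in x]  # flip each bit w.p. delta
    return x, z


def forward_backward(z, p, delta):
    """Symbol-by-symbol MAP estimate of the clean bits, given p and delta."""
    n = len(z)
    trans = [[1 - p, p], [p, 1 - p]]                 # trans[a][b] = P(b | a)
    emit = [[1 - delta, delta], [delta, 1 - delta]]  # emit[x][z] = P(z | x)

    # Forward pass: alpha[t][x] is proportional to P(x_t = x, z_1..z_t).
    alpha = []
    prev = [0.5, 0.5]  # stationary distribution of the symmetric chain
    for t in range(n):
        if t == 0:
            a = [prev[b] * emit[b][z[0]] for b in (0, 1)]
        else:
            a = [emit[b][z[t]] * (prev[0] * trans[0][b] + prev[1] * trans[1][b])
                 for b in (0, 1)]
        s = a[0] + a[1]
        prev = [a[0] / s, a[1] / s]  # normalize to avoid underflow
        alpha.append(prev)

    # Backward pass: beta[x] is proportional to P(z_{t+1}..z_n | x_t = x).
    beta = [1.0, 1.0]
    x_hat = [0] * n
    for t in range(n - 1, -1, -1):
        x_hat[t] = 0 if alpha[t][0] * beta[0] >= alpha[t][1] * beta[1] else 1
        b = [sum(trans[a][c] * emit[c][z[t]] * beta[c] for c in (0, 1))
             for a in (0, 1)]
        s = b[0] + b[1]
        beta = [b[0] / s, b[1] / s]
    return x_hat


def ber(x, y):
    """Fraction of positions where the two sequences differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)
```

A typical use is `x, z = simulate(20000, 0.1, 0.1)` followed by comparing `ber(x, z)` with `ber(x, forward_backward(z, 0.1, 0.1))`.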


DUDE achieves the optimal BER as the window size grows

[Figure: bit error rate/δ versus window size k (k = 0 to 6). Bayes Optimum = 0.558, DUDE = 0.561]

Window size k is a design parameter for a given sequence length n


DUDE attains the optimum performance for stationary sources

For a denoiser $\hat{X}^n = \{\hat{X}_t(z^n)\}_{t=1}^{n}$,

$$L_{\hat{X}^n}(x^n, z^n) = \frac{1}{n}\sum_{t=1}^{n} \Lambda\big(x_t, \hat{X}_t(z^n)\big)$$

is the performance measure
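As a small illustration of the performance measure, the sketch below computes the per-symbol average loss for a loss matrix `loss[x][x_hat]`. The function name and the `HAMMING` constant are assumptions made for illustration; with Hamming loss the measure is the bit error rate.

```python
HAMMING = [[0, 1], [1, 0]]  # loss[x][x_hat] for the binary alphabet


def average_loss(x, x_hat, loss):
    """Per-symbol average loss (1/n) * sum_t loss[x_t][x_hat_t]."""
    if not x or len(x) != len(x_hat):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(loss[a][b] for a, b in zip(x, x_hat)) / len(x)
```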


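Main results of DUDE : when $k = k_n < \lceil \frac{1}{2} \log_{|\mathcal{Z}|} n \rceil$,

1 For any stationary process $\mathbf{X}$,

$$\lim_{n\to\infty} \Big[ E\big(L_{\hat{X}^n_{\text{DUDE}}}(X^n, Z^n)\big) - \min_{\hat{X}^n \in \mathcal{D}_n} E\big(L_{\hat{X}^n}(X^n, Z^n)\big) \Big] = 0$$

$\mathcal{D}_n$ is the set of all denoisers in the world

DUDE attains the Bayes optimal performance

2 For all $\mathbf{x} \in \mathcal{X}^{\infty}$,

$$\lim_{n\to\infty} \Big[ L_{\hat{X}^n_{\text{DUDE}}}(x^n, Z^n) - D_k(x^n, Z^n) \Big] = 0 \quad \text{w.p. 1}$$

$D_k(x^n, z^n)$ : the best performance among $\mathcal{S}_k$

DUDE is as good as the best sliding window denoiser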


Discrete Denoising with Shifts Motivation

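Ex 2 : piecewise stationary bit stream gets corrupted

Xn : 00000011111110000000000111111101100011011011011010110

Zn : 00100011101110010001000111110101100011111011010010100

source : binary Markov chain with $p_1 = 0.01 \to p_2 = 0.2$ at $t^* = n/2$
(state diagram: states 0 and 1; stay in the current state with probability 1 − p, switch with probability p)

noise : BSC(δ = 0.1)
(channel diagram: each bit passes unchanged with probability 1 − δ and is flipped with probability δ)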

⇒ optimal BER attained by the Forward-Backward Recursion


Does DUDE achieve the optimal BER?


[Figure: bit error rate/δ versus window size k (k = 0 to 6). Bayes Optimum = 0.487, DUDE = 0.574 (+18%)]

DUDE applies the same rule “regardless of the location”

DUDE has a limitation for time- (space-) varying sources


In practice, many sources are time-(space-) varying

text : English → Spanish → German . . .

voice : [figure]

image : [figure]


Discrete Denoising with Shifts New algorithm: S-DUDE

Can we do better than the DUDE when the source varies?

Questions

1 Can we perform as if we knew the source including its change points?

2 If so, can we do it efficiently?

Answers

1 Yes. S-DUDE can do essentially as well as if it knew the source and its change points

2 Yes. S-DUDE is a linear complexity algorithm

[Moon and Weissman, IEEE Trans. Info. Theory, Nov 09]


Take a closer look at the binary example

Binary, BSC(δ)

Suppose DUDE with window size k = 3 decided as follows :

$z_{t-3}^{t+3}$ : 0100110 → $\hat{x}_t$ : 0

$z_{t-3}^{t+3}$ : 0101110 → $\hat{x}_t$ : 1

010 • 110 defined a “say-what-you-see” mapping in the middle

DUDE employs the same mapping whenever it sees 010 • 110


Only 4 single-letter mappings in the binary example:
“say-what-you-see”, “flip-what-you-see”, “always-say-0”, “always-say-1”


DUDE counts $n_0$ and $n_1$ for 010 • 110 and
    if $n_0 \approx n_1$ → “say-what-you-see”
    if $n_0 \gg n_1$ → “always-say-0”
    if $n_0 \ll n_1$ → “always-say-1”
    threshold depends on δ
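The dependence of the threshold on δ can be made explicit for the BSC with Hamming loss. Working the DUDE rule of [Weissman et al. 05] through for this case gives a simple test: an observed symbol is kept if its share of the counts in the context is at least $2\delta(1-\delta)$, and flipped otherwise. The sketch below maps $(n_0, n_1, \delta)$ to the resulting single-letter mapping. The function name is an assumption made for illustration, and the sketch assumes $0 < \delta < 1/2$.

```python
def binary_dude_mapping(n0, n1, delta):
    """Single-letter mapping chosen for a context with counts n0 and n1."""
    if n0 + n1 == 0:
        raise ValueError("the context must occur at least once")
    threshold = 2 * delta * (1 - delta)
    frac0 = n0 / (n0 + n1)
    keep0 = frac0 >= threshold        # an observed 0 is trusted
    keep1 = (1 - frac0) >= threshold  # an observed 1 is trusted
    if keep0 and keep1:
        return "say-what-you-see"
    # For delta < 1/2 the threshold is below 1/2, so at least one symbol is kept.
    return "always-say-0" if keep0 else "always-say-1"
```

With δ = 0.1 the threshold is 0.18, so a context is mapped to “always-say-0” only when 1's make up less than 18% of its occurrences.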


Employing shifting single-letter mappings will be helpful

Suppose 0’s and 1’s at 010 • 110 looked like

0000100011000011111111011101
(swys over the whole sequence)

00001000110000   11111111011101
(all-0 on the first part, all-1 on the second part)

“always-say-0” → “always-say-1” may be better than a fixed “say-what-you-see”

Generally, if single-letter mappings have some freedom to shift, they can attain smaller loss

How can we decide when to shift to what?


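$\mathcal{S}^n_m$ is a class of shifting single-letter mappings

Ideally, shifting every time to the correct mapping would be the best

equivalent to knowing the source sequence ⇒ impossible!

We limit the number of shifts to m

$\mathcal{S}^n_m$ : class of sequences of single-letter mappings $\{s_1, \cdots, s_n\}$ shifting at most m times for sequence length n, e.g.,

[Figure: a sequence $\{s_1, \cdots, s_n\}$ applied along $z^n$, shifting among swys, all-0 and all-1]

$|\mathcal{S}^n_m| \le \binom{n}{m} \cdot |\mathcal{S}|^{m+1}$ (at most m shift positions and one mapping for each of the at most m + 1 segments), $|\mathcal{S}| = |\hat{\mathcal{X}}|^{|\mathcal{Z}|}$ (number of single-letter mappings)

Deciding when to shift to what m times ⇔ Selecting the best combination in $\mathcal{S}^n_m$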
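The size of the class can be checked by brute force for small n. The sketch below assumes that a "shift" is a position where $s_t \ne s_{t-1}$. Under that assumption the exact count is $\sum_{j=0}^{m} \binom{n-1}{j} |\mathcal{S}| (|\mathcal{S}|-1)^j$, and it is compared against a bound of the form $\binom{n}{m} |\mathcal{S}|^{m+1}$. All function names are assumptions made for illustration.

```python
from itertools import product
from math import comb


def count_by_enumeration(n, m, num_mappings):
    """Count length-n mapping sequences with at most m shifts, by brute force."""
    total = 0
    for seq in product(range(num_mappings), repeat=n):
        shifts = sum(a != b for a, b in zip(seq, seq[1:]))
        total += shifts <= m
    return total


def count_closed_form(n, m, num_mappings):
    """Pick j of the n-1 gaps as shifts; each shift moves to a different mapping."""
    return sum(
        comb(n - 1, j) * num_mappings * (num_mappings - 1) ** j
        for j in range(m + 1)
    )


def upper_bound(n, m, num_mappings):
    """Bound of the form C(n, m) * |S|^(m+1)."""
    return comb(n, m) * num_mappings ** (m + 1)
```

In the binary example there are 4 single-letter mappings, so for n = 4 and m = 1 the class has 40 members against a bound of 64. The class grows quickly with n and m, which is why an exhaustive search over it is not practical.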


The key tool is to devise an estimate of the loss Λ

Focus on the single-letter setting ($s(\cdot) : \mathcal{Z} \to \hat{\mathcal{X}}$)

$x \to \Pi \to Z$, $\hat{X} = s(Z)$

$\Lambda(x, s(Z))$ : loss between x and s(Z)

not observable

But, from the knowledge of Π, we devise $\ell(Z, s)$ such that

$$E_x\big(\ell(Z, s)\big) = E_x\big(\Lambda(x, s(Z))\big)$$

$\ell(Z, s)$ is an unbiased estimate of $E_x\big(\Lambda(x, s(Z))\big)$

$\ell(Z, s)$ : loss between Z and s(·)

observable

[Weissman et al., Universal filtering via prediction, IEEE IT 07]
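One construction that satisfies the unbiasedness condition, assuming Π is invertible, is $\ell(z, s) = \sum_x \Pi^{-1}(z, x)\, \rho_x(s)$ with $\rho_x(s) = \sum_{z'} \Pi(x, z')\, \Lambda(x, s(z'))$, since then $\sum_z \Pi(x, z)\, \ell(z, s) = \rho_x(s)$ for every x. The sketch below builds this table for the binary alphabet, BSC(δ) and Hamming loss. The names `MAPPINGS` and `estimated_loss_table` are assumptions made for illustration.

```python
# Each mapping is stored as (s(0), s(1)).
MAPPINGS = {
    "say-what-you-see": (0, 1),
    "flip-what-you-see": (1, 0),
    "always-say-0": (0, 0),
    "always-say-1": (1, 1),
}


def estimated_loss_table(delta):
    """Return ell[name][z], the estimated loss of each mapping given Z = z."""
    pi = [[1 - delta, delta], [delta, 1 - delta]]  # pi[x][z]
    d = 1 - 2 * delta
    pi_inv = [[(1 - delta) / d, -delta / d],       # pi_inv[z][x]
              [-delta / d, (1 - delta) / d]]
    ell = {}
    for name, s in MAPPINGS.items():
        # rho[x] = E_x[Lambda(x, s(Z))] under Hamming loss.
        rho = [sum(pi[x][z] * (x != s[z]) for z in (0, 1)) for x in (0, 1)]
        # ell(z, s) = sum_x Pi^{-1}(z, x) * rho[x].
        ell[name] = [sum(pi_inv[z][x] * rho[x] for x in (0, 1)) for z in (0, 1)]
    return ell
```

Note that the estimate can be negative for an individual observation (for δ = 0.1, “always-say-0” has estimated loss −0.125 when Z = 0); only its expectation matches the true expected loss.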

Taesup Moon (Yahoo! Labs) EE477 Guest Lecture Nov 10, 2011 13 / 24

Page 43: Discrete Denoising with Shifts - Stanford University · Discrete Denoising with Shifts 1 Prediction with Experts’ Advice 2 Discrete Denoising with Shifts Recap of DUDE Motivation

Discrete Denoising with Shifts New algorithm: S-DUDE

The key tool is to devise an estimate ofthe loss ΛFocus on the single-letter setting (s(·) : Z → X )

X = s(Z)x ! Z

Λ(x, s(Z)) : loss between x and s(Z)

not observable

But, from the knowledge of Π, we devise `(Z, s) such that

Ex

(`(Z, s)

)= Ex

(Λ(x, s(Z))

)

`(Z, s) is an unbiased estimate of Ex

(Λ(x, s(Z))

)

`(Z, s) : loss between Z and s(·)

observable

[Weissman et. al., Universal filtering via prediction, IEEE IT 07]

Taesup Moon (Yahoo! Labs) EE477 Guest Lecture Nov 10, 2011 13 / 24

Page 44: Discrete Denoising with Shifts - Stanford University · Discrete Denoising with Shifts 1 Prediction with Experts’ Advice 2 Discrete Denoising with Shifts Recap of DUDE Motivation

Discrete Denoising with Shifts New algorithm: S-DUDE

The key tool is to devise an estimate ofthe loss ΛFocus on the single-letter setting (s(·) : Z → X )

X = s(Z)x ! Z

Λ(x, s(Z)) : loss between x and s(Z)

not observable

But, from the knowledge of Π, we devise `(Z, s) such that

Ex

(`(Z, s)

)= Ex

(Λ(x, s(Z))

)

`(Z, s) is an unbiased estimate of Ex

(Λ(x, s(Z))

)

`(Z, s) : loss between Z and s(·)

observable

[Weissman et. al., Universal filtering via prediction, IEEE IT 07]

Taesup Moon (Yahoo! Labs) EE477 Guest Lecture Nov 10, 2011 13 / 24

Page 45: Discrete Denoising with Shifts - Stanford University · Discrete Denoising with Shifts 1 Prediction with Experts’ Advice 2 Discrete Denoising with Shifts Recap of DUDE Motivation

Discrete Denoising with Shifts New algorithm: S-DUDE

The key tool is to devise an estimate ofthe loss ΛFocus on the single-letter setting (s(·) : Z → X )

X = s(Z)x ! Z

Λ(x, s(Z)) : loss between x and s(Z)

not observable

But, from the knowledge of Π, we devise `(Z, s) such that

Ex

(`(Z, s)

)= Ex

(Λ(x, s(Z))

)

`(Z, s) is an unbiased estimate of Ex

(Λ(x, s(Z))

)

`(Z, s) : loss between Z and s(·)

observable

[Weissman et. al., Universal filtering via prediction, IEEE IT 07]

Taesup Moon (Yahoo! Labs) EE477 Guest Lecture Nov 10, 2011 13 / 24


S-DUDE is defined by minimizing the sum of the estimated losses

For each context c (e.g., 010 • 110), S-DUDE finds

$$\hat{S} \triangleq \arg\min_{S \in \mathcal{S}_m^{n_c}} \sum_{i \in \text{context } c} \ell(z_i, s_i) \quad \text{vs.} \quad \arg\min_{S \in \mathcal{S}_m^{n_c}} \sum_{i \in \text{context } c} \Lambda(x_i, s_i(z_i))$$

and applies them

Question : how can we get $\hat{S} = \{s_1, \ldots, s_{n_c}\} \in \mathcal{S}_m^{n_c}$ efficiently?
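To make the efficiency question concrete, here is a brute-force sketch that enumerates every sequence of single-letter denoisers with at most m shifts and keeps the one with the smallest summed estimated loss. It assumes a hypothetical table `losses[t][j]` holding the estimated loss of the j-th denoiser at the t-th position of one context; `brute_force_min`, `num_shifts`, and `count_candidates` are illustrative names. The enumeration costs |S|^n work, so it is only usable for tiny n.

```python
from itertools import product
from math import comb


def num_shifts(seq):
    """Number of positions where the chosen denoiser changes."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def brute_force_min(losses, m):
    """Exhaustive search over denoiser index sequences with at most m shifts.

    losses[t][j] is the (hypothetical) estimated loss of denoiser j at position t.
    Returns (minimum total loss, minimizing sequence as a tuple of indices).
    """
    n = len(losses)
    num_denoisers = len(losses[0])
    best_cost, best_seq = float("inf"), None
    for seq in product(range(num_denoisers), repeat=n):
        if num_shifts(seq) > m:
            continue
        cost = sum(losses[t][j] for t, j in enumerate(seq))
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_cost, best_seq


def count_candidates(n, m, num_denoisers):
    """Size of the search space: sequences of length n with at most m shifts.

    A sequence with exactly i shifts is fixed by its first denoiser, the i
    shift locations among n - 1 gaps, and a different denoiser after each shift.
    """
    return sum(
        num_denoisers * (num_denoisers - 1) ** i * comb(n - 1, i)
        for i in range(m + 1)
    )


if __name__ == "__main__":
    toy = [[0, 1], [0, 1], [1, 0], [1, 0]]
    print(brute_force_min(toy, 1))
    print(count_candidates(1000, 3, 4))
```

The candidate count grows quickly with n and m, which is why a direct search is not an option in practice.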


S-DUDE can be implemented with a two-pass algorithm

Again, the binary, BSC(δ) example

Problem : find the best $\{s_1, \ldots, s_n\} \in \mathcal{S}_m^n$ that minimizes $\sum_{t=1}^{n} \ell(z_t, s_t)$

$s_i \in \{\text{all-0}, \text{all-1}, \text{swys}, \text{fwys}\}$

To solve,

1 allocate $M_t \in \mathbb{R}^{m \times 4}$ for each $1 \le t \le n$

2 first pass : scan $(z_1, \ldots, z_n)$ and update $\{M_t\}_{t=1}^{n}$ by dynamic programming

3 second pass : from $M_n$, extract the best $\{s_1, \ldots, s_n\}$ by a backward recursion


$M_t$ stores the minimum sum of estimated losses up to t

Again, the binary, BSC(δ) example

Problem : find the best $\{s_1, \ldots, s_n\} \in \mathcal{S}_m^n$ that minimizes $\sum_{t=1}^{n} \ell(z_t, s_t)$

$s_i \in \{\text{all-0}, \text{all-1}, \text{swys}, \text{fwys}\}$

Elements of $M_t$ are defined to be the minimum sum up to t, e.g.,

[Figure: the matrix $M_t$, with rows indexed by the number of shifts i (up to m) and columns labeled all-0, all-1, swys, fwys]

$$M_t(i, \text{swys}) = \min_{\{s_1, \ldots, s_t\} \in \mathcal{S}_i^t} \Big\{ \ell(z_t, s_t = \text{swys}) + \sum_{r=1}^{t-1} \ell(z_r, s_r) \Big\}$$

First pass uses dynamic programming

Only two possible cases to attain $M_t(i, \text{swys})$

1 the i-th shift has occurred at t : $\min_{1 \le j \le |\mathcal{S}|} M_{t-1}(i-1, j) + \ell(z_t, \text{swys})$

2 the i-th shift has occurred before t : $M_{t-1}(i, \text{swys}) + \ell(z_t, \text{swys})$


Combining the two cases gives the first-pass update

$$M_t(i, \text{swys}) = \ell(z_t, \text{swys}) + \min\Big\{ M_{t-1}(i, \text{swys}),\ \min_{1 \le j \le |\mathcal{S}|} M_{t-1}(i-1, j) \Big\}$$

The same update applies to all other elements

Second pass extracts $\hat{S}$ and denoises

When t = n, $s_n = \arg\min_{j \in \{\text{all-0}, \text{all-1}, \text{swys}, \text{fwys}\}} M_n(m, j)$, $\hat{x}_n = s_n(z_n)$

[Figure: the matrix $M_n$ with columns all-0, all-1, swys, fwys; the minimum over row m equals $\min_{S \in \mathcal{S}_m^n} \sum_{t=1}^{n} \ell(z_t, s_t)$]

for t = n − 1, · · · , 1 : follow the optimal path and denoise!
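The two passes can be sketched as follows. This is a minimal illustration rather than the author's implementation: it assumes a precomputed table `losses[t][j]` of estimated losses, reads M_t(i, j) as "at most i shifts, ending in denoiser j", uses rows i = 0, ..., m (so m + 1 rows, a choice of this sketch), and stores explicit back-pointers for the backward recursion. The name `two_pass` is hypothetical.

```python
def two_pass(losses, m):
    """Minimize sum_t losses[t][s_t] over index sequences with at most m shifts.

    losses[t][j]: estimated loss of denoiser j at position t (assumed given).
    Returns (minimum total, list of chosen denoiser indices, one per position).
    """
    n, num = len(losses), len(losses[0])
    # M[t][i][j]: best total up to t, at most i shifts, denoiser j used at t.
    M = [[[0.0] * num for _ in range(m + 1)] for _ in range(n)]
    # back[t][i][j]: the (i, j) cell at time t - 1 that attains M[t][i][j].
    back = [[[None] * num for _ in range(m + 1)] for _ in range(n)]

    for i in range(m + 1):
        for j in range(num):
            M[0][i][j] = losses[0][j]

    # First pass: forward dynamic programming.
    for t in range(1, n):
        for i in range(m + 1):
            if i > 0:
                # Best cell to shift from: any denoiser with one fewer shift.
                j_shift = min(range(num), key=lambda q: M[t - 1][i - 1][q])
            for j in range(num):
                best, prev = M[t - 1][i][j], (i, j)  # no shift at t
                if i > 0 and M[t - 1][i - 1][j_shift] < best:
                    best, prev = M[t - 1][i - 1][j_shift], (i - 1, j_shift)
                M[t][i][j] = losses[t][j] + best
                back[t][i][j] = prev

    # Second pass: start from the best cell in row m and follow back-pointers.
    i = m
    j = min(range(num), key=lambda q: M[n - 1][m][q])
    total = M[n - 1][m][j]
    seq = [0] * n
    for t in range(n - 1, -1, -1):
        seq[t] = j
        if t > 0:
            i, j = back[t][i][j]
    return total, seq


if __name__ == "__main__":
    toy = [[0, 1], [0, 1], [1, 0], [1, 0], [0, 1], [0, 1]]
    for shifts in range(3):
        print(shifts, two_pass(toy, shifts))
```

Each position touches (m + 1)·|S| cells, so the work per context is linear in both n and m; the denoised symbol at position t is then the chosen denoiser applied to z_t.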


The complexity of S-DUDE is linear in n and m

Complexity

space : $O(mn|\mathcal{Z}|^{2k})$

time : $O(mn|\mathcal{Z}|^{2k})$

practical


Summary of S-DUDE

S-DUDE (Shifting DUDE)

For location t to be denoised, do :

1 fix the window size k, set the number of shifts m

2 find the left k-context $(\ell_1, \ldots, \ell_k)$ and the right k-context $(r_1, \ldots, r_k)$ of $z_t$

$\ell_1\ \ell_2 \cdots \ell_k\ \ z_t\ \ r_1\ r_2 \cdots r_k$

3 on all positions that share the same context c with $z_t$, find $\hat{S} = \arg\min_{S \in \mathcal{S}_m^{n_c}} \sum_{t \in \text{context } c} \ell(z_t, s_t)$

4 decide on $\hat{x}_t$ according to $\hat{x}_t = s_t(z_t)$, where $s_t(\cdot)$ comes from $\hat{S}$

We can also show that if we set m = 0, S-DUDE coincides with DUDE
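Steps 2 and 3 rely on collecting all positions that share a two-sided k-context. A minimal sketch of that grouping is given below; `group_by_context` is a hypothetical helper, and skipping the first and last k positions (which lack a full context) is a simplifying assumption of this sketch, not something the slides specify.

```python
from collections import defaultdict


def group_by_context(z, k):
    """Map each two-sided k-context (left, right) to the positions sharing it.

    z: noisy sequence (list or string of symbols).
    Positions t < k and t >= len(z) - k are skipped in this sketch.
    """
    groups = defaultdict(list)
    for t in range(k, len(z) - k):
        left = tuple(z[t - k:t])            # k symbols before z[t]
        right = tuple(z[t + 1:t + k + 1])   # k symbols after z[t]
        groups[(left, right)].append(t)
    return dict(groups)


if __name__ == "__main__":
    z = [0, 1, 0, 0, 1, 0, 1]
    for context, positions in group_by_context(z, 1).items():
        print(context, positions)
```

Positions within each group are listed in increasing order, which is the order in which a per-context minimization over denoiser sequences would scan them.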

Discrete Denoising with Shifts Results

S-DUDE achieves the optimum loss for time-(space-) varying sources

When $k = k_n < \frac{1}{2} \log_{|\mathcal{Z}|} n$,

Theorem 1 (stochastic setting)

For all piecewise stationary processes X,

$$\lim_{n \to \infty} \Big[ E\big(L_{\hat{X}^n_{\text{S-DUDE}}}(X^n, Z^n)\big) - \min_{\hat{X}^n \in \mathcal{D}_n} E\big(L_{\hat{X}^n}(X^n, Z^n)\big) \Big] = 0,$$

provided that the number of stationary segments is m = o(n) w.p.1

Theorem 2 (individual sequence setting)

When m = o(n), for all $x \in \mathcal{X}^\infty$,

$$\lim_{n \to \infty} \Big[ L_{\hat{X}^n_{\text{S-DUDE}}}(x^n, Z^n) - D_{k,m}(x^n, Z^n) \Big] = 0 \quad \text{w.p.1}$$

where $D_{k,m}(x^n, z^n)$ is the best performance attained by k-th order sliding window denoisers that can shift at most m times
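Reading the window-size condition as $k_n < \frac{1}{2}\log_{|\mathcal{Z}|} n$, which is equivalent to $|\mathcal{Z}|^{2k} < n$, the largest admissible k for a given sequence length can be found with integer arithmetic. The helper name `max_window_size` is an assumption of this sketch, and the condition is an asymptotic one, so the value is only a rough guide for finite n.

```python
def max_window_size(n, alphabet_size):
    """Largest integer k >= 0 with alphabet_size ** (2 * k) < n.

    This is the same as k < 0.5 * log_{alphabet_size}(n), checked without
    floating-point logarithms.
    """
    if n < 2 or alphabet_size < 2:
        raise ValueError("need n >= 2 and alphabet_size >= 2")
    k = 0
    while alphabet_size ** (2 * (k + 1)) < n:
        k += 1
    return k


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, max_window_size(n, 2))
```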


No denoiser is better than S-DUDE

Strong converse

If m = Θ(n), no denoiser can achieve the guarantees of the previous theorems.

m = o(n) is a necessary and sufficient condition for the previous theorems!

Ex 2 : piecewise stationary bit stream (revisited)

$X^n$ : 00000011111110000000000111111111100000001111111110000

$Z^n$ : 00100011101110010001000111110111100000011110111110001

source : binary Markov chain with $p_1 = 0.01 \to p_2 = 0.2$ at $t^* = n/2$

[Figure: two-state Markov chain on states 0 and 1 with transition labels p and 1 − p]

noise : flips bits with probability δ = 0.1

[Figure: binary symmetric channel with crossover probability δ]

⇒ optimal BER attained by the Forward-Backward Recursion
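The setup of this example can be reproduced with a short simulation. The sketch below assumes that p is the probability of switching state at each step and that the change from p1 to p2 happens at position n // 2; the function name `simulate`, the seed handling, and the uniformly random initial state are illustrative choices.

```python
import random


def simulate(n, p1=0.01, p2=0.2, delta=0.1, seed=0):
    """Generate a piecewise stationary binary Markov source and its BSC output.

    The source switches state with probability p1 in the first half and p2 in
    the second half (assumed reading of the slide); the channel flips each
    bit independently with probability delta.
    Returns (x, z): clean and noisy sequences as lists of 0/1.
    """
    rng = random.Random(seed)
    x, z = [], []
    state = rng.randint(0, 1)  # assumed: uniformly random initial state
    for t in range(n):
        p = p1 if t < n // 2 else p2
        if t > 0 and rng.random() < p:
            state = 1 - state
        x.append(state)
        noisy = 1 - state if rng.random() < delta else state
        z.append(noisy)
    return x, z


if __name__ == "__main__":
    x, z = simulate(60)
    print("X:", "".join(map(str, x)))
    print("Z:", "".join(map(str, z)))
```

With a small p1 the first half consists of long runs, while the second half switches much more often, which is the kind of shift a single sliding-window rule cannot track.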


Can S-DUDE achieve the Bayes optimal performance?

[Figure: bit error rate plot; x-axis: window size k (0 to 6), y-axis: bit error rate/δ (0.4 to 1)]

Bayes Optimum = 0.487

DUDE = 0.574

S-DUDE (m=1) = 0.498 (+2.3%)

⇒ m can be regarded as another design parameter in devising a discrete denoiser
