
Signal Processing Course : Inverse Problems Regularization

Jun 21, 2015

Gabriel Peyré

Slides for an image processing course.
Transcript
Page 1: Signal Processing Course : Inverse Problems Regularization

Inverse Problems Regularization

www.numerical-tours.com

Gabriel Peyré

Page 2: Signal Processing Course : Inverse Problems Regularization

Overview

• Variational Priors

• Gradient Descent and PDEs

• Inverse Problems Regularization

Pages 3-5: Signal Processing Course : Inverse Problems Regularization

Smooth and Cartoon Priors

Sobolev semi-norm:  J(f) = ||f||²_{W^{1,2}} = ∫_{R²} ||∇f(x)||² dx

Total variation semi-norm:  J(f) = ||f||_{TV} = ∫_{R²} ||∇f(x)|| dx

[Figure: example images contrasting the two penalties, |∇f|² (smooth prior) vs. |∇f| (cartoon prior)]
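
As a rough numerical illustration of how these two priors are evaluated on a discrete image (the discretization is made precise on pages 13-16), here is a minimal sketch assuming NumPy and a periodic forward-difference gradient:

```python
import numpy as np

def grad(f):
    # forward finite differences, periodic boundary; output shape (2, n, n)
    return np.stack([np.roll(f, -1, axis=0) - f, np.roll(f, -1, axis=1) - f])

def sobolev(f):
    # discrete Sobolev (squared) semi-norm: sum_n ||grad f[n]||^2
    return np.sum(grad(f) ** 2)

def total_variation(f):
    # discrete TV semi-norm: sum_n ||grad f[n]||
    return np.sum(np.sqrt(np.sum(grad(f) ** 2, axis=0)))

f = np.random.randn(64, 64)
print(sobolev(f), total_variation(f))
```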

Page 6: Signal Processing Course : Inverse Problems Regularization

Natural Image Priors

Pages 7-8: Signal Processing Course : Inverse Problems Regularization

Discrete Priors

Pages 9-10: Signal Processing Course : Inverse Problems Regularization

Discrete Differential Operators

Pages 11-12: Signal Processing Course : Inverse Problems Regularization

Laplacian Operator

Pages 13-16: Signal Processing Course : Inverse Problems Regularization

Gradient: Images vs. Functionals

Function: f̃ : x ∈ R² ↦ f̃(x) ∈ R

    f̃(x + ε) = f̃(x) + ⟨∇f̃(x), ε⟩_{R²} + O(||ε||²)

    ∇f̃(x) = (∂₁f̃(x), ∂₂f̃(x)) ∈ R²

Discrete image: f ∈ R^N, N = n²

    f[i₁, i₂] = f̃(i₁/n, i₂/n),   ∇f[i] ≈ ∇f̃(i/n)

Functional: J : f ∈ R^N ↦ J(f) ∈ R

    J(f + η) = J(f) + ⟨∇J(f), η⟩_{R^N} + O(||η||²)

    ∇J : R^N → R^N

Sobolev: J(f) = ½ ||∇f||²,   ∇J(f) = (∇* ∘ ∇)f = −Δf
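
The two expansions can be checked against each other numerically. A minimal sketch, assuming a periodic forward-difference grad and the matching div = −grad*, so that the Sobolev gradient is ∇J(f) = −div(grad f) = −Δf:

```python
import numpy as np

def grad(f):
    return np.stack([np.roll(f, -1, 0) - f, np.roll(f, -1, 1) - f])

def div(v):
    # minus the adjoint of grad: <grad f, v> = -<f, div v>
    return (v[0] - np.roll(v[0], 1, 0)) + (v[1] - np.roll(v[1], 1, 1))

J = lambda f: 0.5 * np.sum(grad(f) ** 2)     # Sobolev functional
gradJ = lambda f: -div(grad(f))              # its gradient: minus the Laplacian

f, eta = np.random.randn(2, 64, 64)
t = 1e-6
directional = (J(f + t * eta) - J(f)) / t    # finite-difference derivative
inner = np.sum(gradJ(f) * eta)               # <grad J(f), eta>
print(abs(directional - inner))              # O(t): matches the expansion
```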

Pages 17-18: Signal Processing Course : Inverse Problems Regularization

Total Variation Gradient

If ∀n, ∇f[n] ≠ 0:   ∇J(f) = −div( ∇f / ||∇f|| )

If ∃n, ∇f[n] = 0:  J is not differentiable at f.

Sub-differential:

    C_u = { α ∈ R^{2×N} ;  (u[n] ≠ 0) ⇒ (α[n] = u[n]/||u[n]||) }

    ∂J(f) = { −div(α) ;  ||α[n]|| ≤ 1 and α ∈ C_{∇f} }

[Figure: ||∇f|| and ∇J(f) displayed as images]

Pages 19-20: Signal Processing Course : Inverse Problems Regularization

Regularized Total Variation

||u||_ε = √(||u||² + ε²),   J_ε(f) = Σ_n ||∇f[n]||_ε

∇J_ε(f) = −div( ∇f / ||∇f||_ε )

∇J_ε ∼ −Δ/ε  when  ε → +∞

[Figure: the smoothed absolute value √(x² + ε²) plotted against |x|; ∇J_ε(f) displayed as an image]

Page 21: Signal Processing Course : Inverse Problems Regularization

Overview

• Variational Priors

• Gradient Descent and PDEs

• Inverse Problems Regularization

Pages 22-24: Signal Processing Course : Inverse Problems Regularization

Gradient Descent

f^(k+1) = f^(k) − τ_k ∇J(f^(k)),   f^(0) is given.

Theorem: If J is convex and C¹, ∇J is L-Lipschitz, and 0 < τ < 2/L, then f^(k) → f⋆, a solution of min_f J(f), as k → +∞.

Optimal step size: τ_k = argmin_{τ ∈ R⁺} J( f^(k) − τ ∇J(f^(k)) )

Proposition: one then has ⟨∇J(f^(k+1)), ∇J(f^(k))⟩ = 0.
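
A generic sketch of the iteration; the test functional J(f) = ½||f − y||² (with ∇J(f) = f − y and L = 1) is an illustrative assumption:

```python
import numpy as np

def gradient_descent(gradJ, f0, tau, n_iter=500):
    f = f0.copy()
    for _ in range(n_iter):
        f = f - tau * gradJ(f)               # f^(k+1) = f^(k) - tau grad J(f^(k))
    return f

# test on J(f) = 1/2 ||f - y||^2, whose unique minimizer is y
y = np.random.randn(32, 32)
f_star = gradient_descent(lambda f: f - y, np.zeros_like(y), tau=1.0)
print(np.linalg.norm(f_star - y))            # ~0: converged (tau = 1 < 2/L)
```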

Pages 25-27: Signal Processing Course : Inverse Problems Regularization

Gradient Flows and PDEs

Fixed step size τ_k = τ:   (f^(k+1) − f^(k)) / τ = −∇J(f^(k))

Denoting f_t = f^(k) for t = kτ, one obtains formally, as τ → 0, the gradient flow

    ∀ t > 0,  ∂f_t/∂t = −∇J(f_t)  and  f_0 = f^(0).

Sobolev flow: J(f) = ½ ∫ ||∇f(x)||² dx gives the heat equation ∂f_t/∂t = Δf_t.
Explicit solution: Gaussian filtering, f_t = G_t ∗ f_0 with G_t(x) = e^{−||x||²/(4t)} / (4πt).
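
A sketch of the Sobolev flow integrated with the explicit Euler scheme above, assuming periodic finite differences; τ ≤ 1/4 is the usual stability bound for this 5-point Laplacian:

```python
import numpy as np

def grad(f):
    return np.stack([np.roll(f, -1, 0) - f, np.roll(f, -1, 1) - f])

def div(v):
    return (v[0] - np.roll(v[0], 1, 0)) + (v[1] - np.roll(v[1], 1, 1))

def heat_flow(f0, t, tau=0.2):
    # explicit Euler steps for df/dt = Laplacian(f) = div(grad f)
    f = f0.copy()
    for _ in range(int(round(t / tau))):
        f = f + tau * div(grad(f))
    return f
```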

Page 28: Signal Processing Course : Inverse Problems Regularization

Total Variation Flow

Page 29: Signal Processing Course : Inverse Problems Regularization

Application: Denoising

Noisy observations: y = f + w,  w ∼ N(0, Id_N).

Gradient flow started at the observations:  ∂f_t/∂t = −∇J(f_t)  and  f_{t=0} = y.

Page 30: Signal Processing Course : Inverse Problems Regularization

Optimal Parameter Selection

Optimal choice of t: minimize ||f_t − f||  →  not accessible in practice.

SNR(f_t, f) = −20 log₁₀( ||f − f_t|| / ||f|| )

[Figure: SNR(f_t, f) as a function of the flow time t]
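
The SNR criterion in code, with an oracle scan over flow iterates; the clean image f and the list of iterates are simulation-only assumptions, since in practice f is unknown, as the slide notes:

```python
import numpy as np

def snr(f_ref, f_est):
    # SNR(f_est, f_ref) = -20 log10( ||f_ref - f_est|| / ||f_ref|| )
    return -20 * np.log10(np.linalg.norm(f_ref - f_est) / np.linalg.norm(f_ref))

# oracle selection over hypothetical flow iterates f_t (simulation only)
f = np.random.randn(32, 32)
iterates = [f + 0.05 * (k + 1) * np.random.randn(32, 32) for k in range(10)]
best_t = max(range(len(iterates)), key=lambda k: snr(f, iterates[k]))
print(best_t, snr(f, iterates[best_t]))
```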

Page 31: Signal Processing Course : Inverse Problems Regularization

Overview

• Variational Priors

• Gradient Descent and PDEs

• Inverse Problems Regularization

Pages 32-36: Signal Processing Course : Inverse Problems Regularization

Inverse Problems

Pages 37-39: Signal Processing Course : Inverse Problems Regularization

Inverse Problem Regularization

Pages 40-42: Signal Processing Course : Inverse Problems Regularization

Sobolev Regularization

Sobolev prior: J(f) = ½ ||∇f||²

f⋆ = argmin_{f ∈ R^N} E(f) = ||y − Φf||² + λ||∇f||²   (assuming 1 ∉ ker(Φ))

Proposition:  ∇E(f⋆) = 0  ⟺  (Φ*Φ − λΔ) f⋆ = Φ* y
→ Large-scale linear system.

Gradient descent:  f^(k+1) = f^(k) − τ ( (Φ*Φ − λΔ) f^(k) − Φ* y )
Convergence:  τ < 2/||Φ*Φ − λΔ||,  where ||A|| = σ_max(A)
→ Slow convergence.

Page 43: Signal Processing Course : Inverse Problems Regularization

Example: Inpainting

Mask M,  Φ = diag_i(1_{i ∉ M}):

    (Φf)[i] = 0 if i ∈ M,  f[i] otherwise.

[Figure: the mask M, and convergence curves log₁₀(||f^(k) − f^(∞)||/||f^(0)||) and E(f^(k)) versus the iteration k]
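
The masking operator in code, a sketch with a hypothetical random mask; Φ is diagonal, hence self-adjoint and idempotent, which the projection formula of page 52 exploits:

```python
import numpy as np

mask = np.random.rand(64, 64) < 0.7          # hypothetical mask: True on M
Phi = lambda f: np.where(mask, 0.0, f)       # (Phi f)[i] = 0 on M, f[i] elsewhere

f = np.random.randn(64, 64)
assert np.allclose(Phi(Phi(f)), Phi(f))      # idempotent: Phi^2 = Phi
```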

Pages 44-46: Signal Processing Course : Inverse Problems Regularization

Conjugate Gradient

Symmetric linear system:  Ax = b  ⟺  min_{x ∈ R^n} E(x) = ½ ⟨Ax, x⟩ − ⟨x, b⟩

Intuition:  x^(k+1) = argmin E(x)  s.t.  x − x^(k) ∈ span( ∇E(x^(0)), …, ∇E(x^(k)) )

Proposition:  ∀ ℓ < k,  ⟨∇E(x^(k)), ∇E(x^(ℓ))⟩ = 0

Initialization:  x^(0) ∈ R^N,  v^(0) = ∇E(x^(0)) = A x^(0) − b,  d^(0) = v^(0)

Iterations:
    v^(k) = ∇E(x^(k)) = A x^(k) − b
    d^(k) = v^(k) + ( ||v^(k)||² / ||v^(k−1)||² ) d^(k−1)
    r^(k) = ⟨v^(k), d^(k)⟩ / ⟨A d^(k), d^(k)⟩
    x^(k+1) = x^(k) − r^(k) d^(k)

Pages 47-49: Signal Processing Course : Inverse Problems Regularization

Total Variation Regularization

||u||_ε = √(||u||² + ε²),   J_ε(f) = Σ_n ||∇f[n]||_ε

TV_ε regularization:  f⋆ = argmin_{f ∈ R^N} E(f) = ½ ||Φf − y||² + λ J_ε(f)   (assuming 1 ∉ ker(Φ))

Gradient descent:  f^(k+1) = f^(k) − τ_k ∇E(f^(k))

    ∇E(f) = Φ*(Φf − y) + λ ∇J_ε(f),   ∇J_ε(f) = −div( ∇f / ||∇f||_ε )

Convergence: requires τ ∼ ε.

Newton descent:  f^(k+1) = f^(k) − H_k^{−1} ∇E(f^(k))  where  H_k = ∂²E_ε(f^(k))
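
A sketch of this descent, specialized to denoising (Φ = Id) for brevity; the τ ∼ ε requirement shows up in the step-size bound, where 8/ε is a crude Lipschitz estimate for ∇J_ε under the assumed periodic differences:

```python
import numpy as np

def grad(f):
    return np.stack([np.roll(f, -1, 0) - f, np.roll(f, -1, 1) - f])

def div(v):
    return (v[0] - np.roll(v[0], 1, 0)) + (v[1] - np.roll(v[1], 1, 1))

def tv_eps_denoise(y, lam, eps=0.05, n_iter=2000):
    tau = 1.8 / (1.0 + lam * 8.0 / eps)      # tau ~ eps when lam * 8/eps >> 1
    f = y.copy()
    for _ in range(n_iter):
        g = grad(f)
        gJ = -div(g / np.sqrt(np.sum(g ** 2, axis=0) + eps ** 2))
        f = f - tau * ((f - y) + lam * gJ)   # grad E with Phi = Id
    return f
```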

Page 50: Signal Processing Course : Inverse Problems Regularization

TV vs. Sobolev Convergence

[Figure: energy decay versus the iteration k, for small ε and large ε]

Page 51: Signal Processing Course : Inverse Problems Regularization

Inpainting: Sobolev vs. TV

[Figure: observations y, Sobolev reconstruction, total variation reconstruction]

Page 52: Signal Processing Course : Inverse Problems Regularization

Noiseless problem:  f⋆ ∈ argmin_f J_ε(f)  s.t.  f ∈ H    (⋆)

Constraint set:  H = { f ; Φf = y }.

Projected gradient descent:  f^(k+1) = Proj_H( f^(k) − τ_k ∇J_ε(f^(k)) )

Proj_H(f) = argmin_{Φg = y} ||g − f||² = f + Φ*(ΦΦ*)^{−1}(y − Φf)

Inpainting:  Proj_H(f)[i] = f[i] if i ∈ M,  y[i] otherwise.

Projected Gradient Descent

Proposition: If ∇J_ε is L-Lipschitz and 0 < τ_k < 2/L, then f^(k) → f⋆, a solution of (⋆), as k → +∞.
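
A sketch for the inpainting case, where Proj_H simply resets the observed pixels; the same crude L ≤ 8/ε estimate sets the step:

```python
import numpy as np

def grad(f):
    return np.stack([np.roll(f, -1, 0) - f, np.roll(f, -1, 1) - f])

def div(v):
    return (v[0] - np.roll(v[0], 1, 0)) + (v[1] - np.roll(v[1], 1, 1))

def tv_inpaint(y, mask, eps=0.1, n_iter=2000):
    # mask: True on the missing set M; observed pixels stay pinned to y
    tau = 1.8 * eps / 8.0                    # tau < 2/L with L <= 8/eps (crude)
    f = y.copy()
    for _ in range(n_iter):
        g = grad(f)
        gJ = -div(g / np.sqrt(np.sum(g ** 2, axis=0) + eps ** 2))
        f = np.where(mask, f - tau * gJ, y)  # descend, then Proj_H
    return f
```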

Pages 53-54: Signal Processing Course : Inverse Problems Regularization

Conclusion

Priors: non-quadratic (TV rather than Sobolev) ⟹ better edge recovery.

[Figure: Sobolev and TV reconstructions compared]

Variational regularization ⟺ Optimization:
– Gradient descent.
– Newton.
– Projected gradient.
– Conjugate gradient.

Non-smooth optimization?