Exact Recovery of Multichannel Sparse Blind Deconvolution via Gradient Descent
Qing Qu∗, Xiao Li†, Zhihui Zhu⋄
∗ Center for Data Science, New York University; † EE Department, Chinese University of Hong Kong; ⋄ MINDS, Johns Hopkins University
Basic Task
Given multiple observations yi ∈ Rn of the circulant convolution
yi = a ⊛ xi (1 ≤ i ≤ p),
can we recover both the unknown kernel a ∈ Rn and the sparse signals {xi}i=1..p ⊂ Rn simultaneously?
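The forward model can be simulated in a few lines of NumPy; the dimensions n, p and sparsity level θ below are illustrative choices, and the FFT carries out the circulant convolution:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, theta = 64, 8, 0.2          # length, #channels, sparsity (illustrative)

# Unknown kernel a (normalized) and Bernoulli-Gaussian sparse signals x_i.
a = rng.standard_normal(n)
a /= np.linalg.norm(a)
X = rng.standard_normal((n, p)) * (rng.random((n, p)) < theta)

# Observations y_i = a ⊛ x_i: circulant convolution computed via the FFT.
Y = np.real(np.fft.ifft(np.fft.fft(a)[:, None] * np.fft.fft(X, axis=0), axis=0))
```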
Our Contribution
With random initialization, vanilla Riemannian gradient descent (RGD) followed by a subgradient method converges exactly to the target solution at a linear rate.
Motivations in Imaging Science
• Computational Microscopy Imaging.
• Geophysics and Seismic Imaging.
• Neuroscience: calcium imaging, functional MRI.
• Image Deblurring.
Symmetric Solutions in MCS-BD
• Scaled shifts of (a, {xi}) are also solutions of MCS-BD:
yi = α sℓ[a] ⊛ (1/α) s−ℓ[xi].
[Figure: a toy example of the shift-scaling symmetry.]
- W.l.o.g., fix the scaling of a by ∥a∥ = 1.
- We hope to recover a up to signed shifts {± sℓ[a0]} (−n+1 ≤ ℓ ≤ n−1).
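The symmetry is easy to verify numerically: circularly shifting a by ℓ and xi by −ℓ, while scaling by α and 1/α, leaves yi unchanged (a minimal sketch with illustrative sizes):

```python
import numpy as np

def cconv(u, v):
    """Circulant convolution u ⊛ v via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))

rng = np.random.default_rng(1)
n, alpha, ell = 32, 2.5, 5
a, x = rng.standard_normal(n), rng.standard_normal(n)

y1 = cconv(a, x)
# The scaled, shifted pair (α s_ℓ[a], (1/α) s_{−ℓ}[x]) gives the same observation.
y2 = cconv(alpha * np.roll(a, ell), np.roll(x, -ell) / alpha)
```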
Assumptions & Problem Formulation
• Assumptions.
- Sparse signals xi: xi ∼i.i.d. Bernoulli–Gaussian(θ) with θ ∈ (0, 1);
- Invertible kernel a: its circulant matrix Ca = F∗ diag(â) F is invertible, i.e., |â| > 0 entrywise, where â = Fa is the DFT of a.
• Problem Formulation. Denote
Y = [y1 y2 · · · yp], X = [x1 x2 · · · xp].
- Let h be the inverse kernel of a, h = a⊙−1 (i.e., a ⊛ h = e1). Then
Ch Y = Ch Ca X = X,
since Ch Ca = I and X is sparse.
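Assuming e1 denotes the unit impulse (the identity of circulant convolution), the inverse kernel is obtained by entrywise inversion in the Fourier domain; the kernel below is made well-conditioned by construction, purely for illustration:

```python
import numpy as np

def cconv(u, v):
    """Circulant convolution via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v)))

rng = np.random.default_rng(2)
n = 32
a = np.eye(n)[0] + 0.1 * rng.standard_normal(n)   # |â| > 0: invertible kernel
x = rng.standard_normal(n) * (rng.random(n) < 0.2)

# Inverse kernel h = a^{⊙-1}: entrywise 1/â in the Fourier domain.
h = np.real(np.fft.ifft(1.0 / np.fft.fft(a)))

y = cconv(a, x)
x_rec = cconv(h, y)      # C_h y = C_h C_a x = x
```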
- Ideally, we want to solve
min_q (1/np) ∥Cq Y∥0 = (1/np) ∑i=1..p ∥Cyi q∥0, s.t. q ≠ 0,
where the ℓ0 term promotes sparsity and the constraint q ≠ 0 prevents the trivial solution, so as to recover a = sℓ[α q⋆⊙−1] up to the shift-scaling symmetry.
• Nonconvex Relaxation. We consider
min_q φ(q) := (1/np) ∑i=1..p Hµ(Cyi P q), s.t. q ∈ Sn−1,
where Hµ is a smooth sparsity-promoting function and the sphere constraint replaces q ≠ 0.
- Hµ(·) is the smooth Huber loss promoting sparsity:
Hµ(Z) := ∑i=1..n ∑j=1..p hµ(Zij), hµ(z) := |z| if |z| ≥ µ, and z²/(2µ) + µ/2 if |z| < µ.
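A direct implementation of hµ (note the two branches agree at |z| = µ, so the surrogate is continuously differentiable):

```python
import numpy as np

def huber(z, mu):
    """Smooth Huber loss h_mu: |z| for |z| >= mu, quadratic z^2/(2 mu) + mu/2 inside."""
    az = np.abs(z)
    return np.where(az >= mu, az, z**2 / (2 * mu) + mu / 2)
```

Summing huber over all entries of Cyi P q for every channel gives φ(q).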
- P is a preconditioning matrix:
P = ( (1/(θnp)) ∑i=1..p Cyi⊤ Cyi )−1/2 ≈ (Ca⊤ Ca)−1/2.
- Preconditioning orthogonalizes the kernel Ca:
Cyi P = Cxi (Ca P) ≈ Cxi Ca (Ca⊤ Ca)−1/2 = Cxi Q, where Q is orthogonal.
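The approximation can be checked numerically. A sketch, assuming a well-conditioned kernel and using an eigendecomposition for the inverse matrix square root (sizes are illustrative; the approximation tightens as p grows):

```python
import numpy as np

def circulant(v):
    """Circulant matrix whose action is circular convolution with v."""
    return np.column_stack([np.roll(v, k) for k in range(len(v))])

rng = np.random.default_rng(3)
n, p, theta = 32, 5000, 0.2
a = np.eye(n)[0] + 0.05 * rng.standard_normal(n)   # well-conditioned kernel
X = rng.standard_normal((n, p)) * (rng.random((n, p)) < theta)
Y = circulant(a) @ X                               # y_i = a ⊛ x_i, columnwise

# P = ((1/(theta n p)) Σ_i C_{y_i}^T C_{y_i})^{-1/2} via eigendecomposition.
M = sum(circulant(Y[:, i]).T @ circulant(Y[:, i]) for i in range(p)) / (theta * n * p)
w, V = np.linalg.eigh(M)
P = V @ np.diag(w ** -0.5) @ V.T

Q = circulant(a) @ P                 # approximately orthogonal: Q^T Q ≈ I
err = np.linalg.norm(Q.T @ Q - np.eye(n), 2)
```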
Given Cyi P q ≈ Cxi Q q and supposing Q = I, the problem reduces to
min_q f(q) := (1/np) ∑i=1..p Hµ(Cxi q), s.t. q ∈ Sn−1.
This implies that the standard basis vectors {±ei}i=1..n are global solutions.
[Figure: optimization landscapes of the ℓ1, Huber, and ℓ4 losses.]
Geometric Property
We study the optimization landscape over the union of sets
S^i_{±ξ} := { q ∈ Sn−1 : |qi| / ∥q−i∥∞ ≥ √(1 + ξ), ±qi > 0 },
for some ξ ∈ (0, +∞), where each set
- contains exactly one solution ±ei;
- excludes all saddle points;
- for some small ξ = 1/(5 log n), a random initialization falls in one S^i_{±ξ} with probability ≥ 1/2.
[Figure: the sets S^i_{±ξ} on the sphere for ξ = 0 and ξ = 1/(5 log n).]
• Regularity Condition. With p ≥ Ω(poly(n)), w.h.p.
⟨grad f(q), qi q − ei⟩ ≥ α · ∥q − ei∥
holds for each S^{i+}_ξ (1 ≤ i ≤ n) with some α > 0, for all q ∈ S^{i+}_ξ ∩ { q ∈ Sn−1 : √(1 − qi²) ≥ µ }.
• Implicit Regularization. With p ≥ Ω(poly(n)), w.h.p.
⟨grad f(q), (1/qj) ej − (1/qi) ei⟩ ≥ (c θ(1 − θ)/n) · ξ/(1 + ξ)
for all q ∈ S^{i+}_ξ and any j ≠ i with qj² ≥ (1/3) qi².
From Geometry to Optimization
• Random Initialization. Draw q(0) ∼ U(Sn−1), so that
P[ q(0) ∈ ∪i=1..n S^i_{±ξ} ] ≥ 1/2.
• Phase I: Riemannian Gradient Descent (RGD).
q(k+1) = PSn−1( q(k) − τ · grad f(q(k)) ),
with a small fixed step size τ. The iterates stay in S^i_{±ξ} thanks to the implicit regularization, and RGD produces a solution q⋆ with ∥q⋆ − qtgt∥ ≤ O(µ) at a linear rate, thanks to the regularity condition.
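Phase I can be sketched on the reduced problem (taking Q = I, so Cyi P = Cxi); the step size τ, smoothing µ, and iteration budget below are illustrative choices, not the paper's tuned values:

```python
import numpy as np

def circulant(v):
    return np.column_stack([np.roll(v, k) for k in range(len(v))])

def huber_grad(z, mu):
    # Gradient of the smooth Huber loss h_mu.
    return np.where(np.abs(z) >= mu, np.sign(z), z / mu)

def rgd(Cs, mu=0.1, tau=0.1, iters=2000, seed=0):
    """Phase I sketch: Riemannian gradient descent on the sphere."""
    n = Cs[0].shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)                 # random initialization on S^{n-1}
    for _ in range(iters):
        g = sum(C.T @ huber_grad(C @ q, mu) for C in Cs) / (n * len(Cs))
        g -= (q @ g) * q                   # Riemannian gradient: project to tangent space
        q = q - tau * g
        q /= np.linalg.norm(q)             # retraction back onto the sphere
    return q

# Toy run: with Q = I, the iterates approach a signed standard basis vector.
rng = np.random.default_rng(4)
n, p, theta = 16, 50, 0.3
Cs = [circulant(rng.standard_normal(n) * (rng.random(n) < theta)) for _ in range(p)]
q_star = rgd(Cs)
```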
• Phase II: Rounding. With r = q⋆, solve
min_q ζ(q) := (1/np) ∑i=1..p ∥Cyi P q∥1, s.t. ⟨r, q⟩ = 1,
via projected subgradient descent
q(k+1) = q(k) − τ(k) · Pr⊥ g(k),
with τ(k+1) = β τ(k) and β ∈ (0, 1). It converges linearly, ∥q(k) − qtgt∥ ≤ η^k with η ∈ (0, 1), thanks to the local sharpness of ζ(q).
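Phase II can be sketched as follows (again with Q = I so Cyi P = Cxi; the initial step, decay β, and toy problem are illustrative). Starting from q = r/∥r∥² satisfies the constraint ⟨r, q⟩ = 1, and every step moves only orthogonally to r, so feasibility is preserved:

```python
import numpy as np

def circulant(v):
    return np.column_stack([np.roll(v, k) for k in range(len(v))])

def subgrad_rounding(Cs, r, tau=0.1, beta=0.95, iters=300):
    """Phase II sketch: projected subgradient on {q : <r, q> = 1},
    with geometrically shrinking steps tau_{k+1} = beta * tau_k."""
    n = len(r)
    q = r / (r @ r)                              # feasible start: <r, q> = 1
    Pr = np.eye(n) - np.outer(r, r) / (r @ r)    # projector onto r-perp
    for _ in range(iters):
        g = sum(C.T @ np.sign(C @ q) for C in Cs) / (n * len(Cs))
        q = q - tau * (Pr @ g)                   # move only orthogonally to r
        tau *= beta
    return q

# Toy usage: r is a coarse estimate near e1; rounding sharpens it toward e1 / r[0].
rng = np.random.default_rng(5)
n, p = 16, 50
Cs = [circulant(rng.standard_normal(n) * (rng.random(n) < 0.3)) for _ in range(p)]
r = np.eye(n)[0] + 0.05 * rng.standard_normal(n)
q_hat = subgrad_rounding(Cs, r)
```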
Comparison with Literature
Experiments
• Algorithmic convergence and recovery with varying θ.
[Figure: (a) comparison of iterate convergence; (b) recovery probability with varying θ.]
• Phase transition on (p, n).
[Figure: phase transitions for the (a) ℓ1-loss, (b) Huber-loss, and (c) ℓ4-loss.]
• Experiments on STORM imaging.
[Figure: (a) observation, (b) ground truth, recovery with (c) Huber-loss and (d) ℓ4-loss; kernels: (e) ground truth, (f) Huber-loss, (g) ℓ4-loss.]
References
[1] Q. Qu, X. Li, and Z. Zhu, "A nonconvex approach for exact and efficient multichannel sparse blind deconvolution", NeurIPS, 2019.
[2] Y. Li and Y. Bresler, "Multichannel sparse blind deconvolution on the sphere", NeurIPS, 2018.
[3] L. Wang and Y. Chi, "Blind deconvolution from multiple sparse inputs", IEEE Signal Processing Letters, 2016.