Image Analysis & Retrieval
CS/EE 5590 Special Topics (Class Ids: 44873, 44874)
Piece-wise Linear Models via Query Driven Solution
Subspace Indexing on Grassmann Manifold
Optimization of Subspace on Grassmann Manifold
Sparse Signal Processing
Sparse Representation and Robust PCA
Sparse Signal Processing
L1 norm and L1 Magic Solution
Application in occluded face recognition
Summary
p.12Z. Li, Image Analysis & Retrv, 2016 Fall
Sparse representation
• A signal or image is sparse if it can be represented with very few non-zero coefficients in some subspace:
– E.g. the cameraman image x, represented by its 2-D DCT y:
• How is this related to the classification problem?
– Intuitively, sparsity is good for classification, because it makes samples from different classes easier to separate
– Classification is hard only when data points are dense and intertwined
– How can we characterize this mathematically?
[Figure: cameraman image x and its 2-D DCT, y = dct2(x); Eigenface example]
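As a rough illustration of sparsity in the DCT domain, the sketch below measures how few DCT coefficients carry 99% of an image's energy. It assumes a synthetic smooth image as a stand-in for the cameraman image, and uses `scipy.fft.dctn` as the Python counterpart of `dct2`; the helper name `energy_fraction` is hypothetical.

```python
import numpy as np
from scipy.fft import dctn


def energy_fraction(coeffs, keep=0.99):
    """Fraction of coefficients needed to hold `keep` of the total energy."""
    energy = np.sort(np.abs(coeffs).ravel() ** 2)[::-1]   # largest first
    cumulative = np.cumsum(energy) / energy.sum()
    needed = int(np.searchsorted(cumulative, keep)) + 1
    return needed / coeffs.size


# Synthetic smooth test image: a Gaussian blob on a 64 x 64 grid (stand-in for x).
n = 64
rows, cols = np.mgrid[0:n, 0:n]
x = np.exp(-((rows - n / 2) ** 2 + (cols - n / 2) ** 2) / (2 * 10.0 ** 2))

y = dctn(x, norm="ortho")   # 2-D DCT, the analogue of y = dct2(x)
fraction = energy_fraction(y)
print(f"{100 * fraction:.2f}% of the DCT coefficients hold 99% of the energy")
```

Natural images are less smooth than this blob, so the fraction will be larger for them, but the same measurement applies.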
Sparsity in Human Visual System
Sparse Signal Recovery
If x is sparse, i.e. |x|0 is small, we can recover x from a random projection measurement, y = Ax
Basis pursuit de-noising:
LASSO:
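For reference, the two problems named above are commonly written as follows. These are the standard textbook forms, not necessarily the exact ones on the slide; the noise bound \epsilon and the weight \lambda are assumed symbols.

```latex
% Basis pursuit de-noising (constrained form), with an assumed noise bound \epsilon
\hat{x} = \arg\min_x \|x\|_1 \quad \text{s.t.} \quad \|Ax - y\|_2 \le \epsilon

% LASSO (penalised form), with an assumed regularisation weight \lambda > 0
\hat{x} = \arg\min_x \tfrac{1}{2}\|Ax - y\|_2^2 + \lambda \|x\|_1
```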
Sparse Face Model
Consider a face recognition system
We have K subjects, k = 1, 2, …, K; subject k has n_k training samples, {[v_{1,1}, …, v_{1,n_1}], [v_{2,1}, …, v_{2,n_2}], …, [v_{K,1}, …, v_{K,n_K}]}, each a thumbnail image with d = w × h pixels.
Let us stack all training samples as the column vectors of a d × N matrix A, with N = n_1 + n_2 + … + n_K.
The problem is: for a given thumbnail image y with unknown class label, how do we solve for its label?
Assume y belongs to class i; then,
Or,
where only a small number of the coefficients in x are non-zero, so x is sparse.
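A sketch of the two expressions this slide refers to, written in the notation of the training set above. The coefficient names \alpha_{i,j} are assumptions; the transcript does not preserve the slide's own formulas.

```latex
% y as a linear combination of the training samples of class i
y = \alpha_{i,1} v_{i,1} + \alpha_{i,2} v_{i,2} + \dots + \alpha_{i,n_i} v_{i,n_i}

% equivalently, over the whole dictionary A = [v_{1,1}, \dots, v_{K,n_K}]
y = A x, \qquad
x = [0, \dots, 0, \alpha_{i,1}, \dots, \alpha_{i,n_i}, 0, \dots, 0]^T \in R^N
```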
Sparsity
Assume y belongs to class 1; then,
most coefficients related to the other classes are zero, and only a small number of coefficients in α_1 are non-zero
Illustration of Sparsity
• So the problem is rather straightforward: given y = Ax, where
• y is the unknown face image in R^d,
• A is the d x N training data matrix, or dictionary, with N large
• x holds the coefficients of y as a linear combination of training samples, and is sparse: out of N coefficients in total, only a small number are non-zero
– Mathematically, we are looking for:
• where |x|0 is the L0 norm, which counts the number of non-zero coefficients in x.
Mathematical formulation
$$x_0 = \arg\min_x \|x\|_0 \quad \text{s.t.} \quad Ax = y$$
• The L0 minimization problem is essentially a combinatorial optimization problem
• There is little structure to exploit for a fast algorithm
• Brute-force solution:
– Assume that x has at most 3 non-zero coefficients, then search all
$$\binom{N}{1} + \binom{N}{2} + \binom{N}{3}$$
– possible coefficient combinations and pick the one that gives the best match
– In effect, it is a kNN search!
L0 minimization is NP hard
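The brute-force search described above can be sketched as follows. This is a minimal illustration under assumed names (`brute_force_l0`, `num_candidates`), using a least-squares fit on each candidate support; it is only practical for tiny N.

```python
from itertools import combinations
from math import comb

import numpy as np


def num_candidates(n_atoms, max_nonzeros=3):
    """Number of supports to test: C(N,1) + C(N,2) + ... + C(N,max_nonzeros)."""
    return sum(comb(n_atoms, k) for k in range(1, max_nonzeros + 1))


def brute_force_l0(A, y, max_nonzeros=3):
    """Try every support of size 1..max_nonzeros; keep the best least-squares fit."""
    n_atoms = A.shape[1]
    best_x, best_residual = np.zeros(n_atoms), np.inf
    for size in range(1, max_nonzeros + 1):
        for support in combinations(range(n_atoms), size):
            cols = A[:, list(support)]
            coeffs, *_ = np.linalg.lstsq(cols, y, rcond=None)
            residual = np.linalg.norm(cols @ coeffs - y)
            # Require a real improvement, so smaller supports win ties.
            if residual < best_residual - 1e-10:
                best_residual = residual
                best_x = np.zeros(n_atoms)
                best_x[list(support)] = coeffs
    return best_x


rng = np.random.default_rng(0)
A = rng.standard_normal((8, 12))
x_true = np.zeros(12)
x_true[[2, 7]] = [1.5, -2.0]
x_hat = brute_force_l0(A, A @ x_true)
print(num_candidates(12), np.flatnonzero(np.abs(x_hat) > 1e-8))
```

The candidate count grows as N^3 for three non-zeros and much faster for larger supports, which is why this approach does not scale.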
L0 and L1 norm
Lk norm (recall Minkowski distance)
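For reference, the standard definitions behind this slide, for a vector x with N entries:

```latex
\|x\|_k = \Big( \sum_{i=1}^{N} |x_i|^k \Big)^{1/k}, \qquad
\|x\|_1 = \sum_{i=1}^{N} |x_i|, \qquad
\|x\|_2 = \Big( \sum_{i=1}^{N} x_i^2 \Big)^{1/2}

% L0 "norm": the count of non-zero entries (not a true norm)
\|x\|_0 = \#\{ i : x_i \neq 0 \}
```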
L1 solution
L-2 ball
L1 based recognition
L1 solution for invalid input images
• For non-face images:
– The coefficients in x are not sparse
• We can threshold on the residual to return a "not found" result
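A minimal sketch of L1-based recognition with residual thresholding, on synthetic data. It assumes the equality-constrained problem min |x|1 s.t. Ax = y, solved as a linear program with `scipy.optimize.linprog`; the per-class residual rule, the function names and the `reject_ratio` threshold are illustrative assumptions, not the course's reference implementation.

```python
import numpy as np
from scipy.optimize import linprog


def l1_min(A, y):
    """Minimise ||x||_1 subject to A x = y, via the split x = u - v with u, v >= 0."""
    n = A.shape[1]
    res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=y,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return res.x[:n] - res.x[n:]


def classify(A, labels, y, reject_ratio=0.1):
    """Return (label or None, relative residual of the best class)."""
    x = l1_min(A, y)
    classes = np.unique(labels)
    # Residual when y is rebuilt from the coefficients of one class only.
    residuals = np.array([np.linalg.norm(y - A[:, labels == c] @ x[labels == c])
                          for c in classes])
    ratio = residuals.min() / np.linalg.norm(y)
    label = None if ratio > reject_ratio else int(classes[residuals.argmin()])
    return label, ratio


rng = np.random.default_rng(0)
d, K, n_k = 30, 4, 10                      # feature size, classes, samples per class
A = rng.standard_normal((d, K * n_k))
A /= np.linalg.norm(A, axis=0)             # unit-norm training columns
labels = np.repeat(np.arange(K), n_k)

x_true = np.zeros(K * n_k)
x_true[[21, 27]] = [1.0, 0.6]              # columns 20..29 belong to class 2
y_face = A @ x_true                        # valid probe from class 2
y_other = rng.standard_normal(d)           # stand-in for a non-face input

print(classify(A, labels, y_face))
print(classify(A, labels, y_other))
```

For the valid probe the residual of the true class is close to zero, while the random input spreads its coefficients over all classes and leaves a large residual for every class.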
Occlusion and Disguise
• A big problem in biometrics is disguise and occlusion
• The magic of sparsity and L1 minimization can deal with that effectively!
• Consider a face image with a small fraction p of its pixels
corrupted:
• Let the occluded face image be y = Ax + e
• Then restate the constraint as y = [A, I] [x; e] = Bw, with B = [A, I] and w = [x; e]
• Then solve P1 with y = Bw. Notice that sparsity in w is achieved through sparsity in both x and e.
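The occlusion model can be sketched on synthetic data as below. It assumes B = [A, I] and w = [x; e], and solves min |w|1 s.t. Bw = y as a linear program; the matrix sizes, the corrupted pixel positions and the helper name `l1_min` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def l1_min(B, y):
    """Minimise ||w||_1 subject to B w = y, via the split w = u - v with u, v >= 0."""
    n = B.shape[1]
    res = linprog(np.ones(2 * n), A_eq=np.hstack([B, -B]), b_eq=y,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    return res.x[:n] - res.x[n:]


rng = np.random.default_rng(1)
d, N = 60, 20                               # pixels, training samples
A = rng.standard_normal((d, N))
A /= np.linalg.norm(A, axis=0)

x_true = np.zeros(N)
x_true[[3, 4]] = [0.8, 0.5]                 # sparse face coefficients
e_true = np.zeros(d)
corrupted = [5, 17, 33, 48]                 # a small fraction of the pixels
e_true[corrupted] = [3.0, -2.5, 4.0, -3.5]  # large, non-Gaussian errors
y = A @ x_true + e_true

B = np.hstack([A, np.eye(d)])               # B = [A, I]
w = l1_min(B, y)                            # w = [x; e]
x_hat, e_hat = w[:N], w[N:]
print(np.flatnonzero(np.abs(e_hat) > 1e-5))
```

The recovered e_hat localises the corrupted pixels, so the clean part of the image can be rebuilt as A @ x_hat.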
Sparsity criteria takes care of occlusion
• Occlusion example
– Large L2 errors, not recoverable by Eigenface/Fisherface:
• Accuracy under sunglasses and scarf occlusion:
Occluded face recognition
L1 vs L2 minimization
• A natural question is: why not solve y = Ax with L2 minimization?
– Typically, the number of training samples is smaller than the number of pixels in the training images, so why not use a pseudo-inverse like x = (A^T A)^{-1} A^T y:
– which gives the Maximum Likelihood estimate of the true x, if the noise is Gaussian with covariance sI.
– However, the noise is non-Gaussian and can be unbounded, so the resulting L2 solution is pretty bad
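A small sketch of why the L2 solution degrades under occlusion, on synthetic data. It assumes the usual least-squares form x = (A^T A)^{-1} A^T y, computed with `np.linalg.pinv`; sizes, error values and the name `l2_solution` are illustrative.

```python
import numpy as np


def l2_solution(A, y):
    """Least-squares estimate x = (A^T A)^{-1} A^T y, computed via the pseudo-inverse."""
    return np.linalg.pinv(A) @ y


rng = np.random.default_rng(2)
d, N = 60, 20                               # more pixels than training samples
A = rng.standard_normal((d, N))

x_true = np.zeros(N)
x_true[[3, 4]] = [0.8, 0.5]                 # sparse ground truth
y_clean = A @ x_true

e = np.zeros(d)
e[[5, 17, 33, 48]] = [3.0, -2.5, 4.0, -3.5]  # a few grossly corrupted pixels
y_occluded = y_clean + e

x_clean = l2_solution(A, y_clean)
x_occluded = l2_solution(A, y_occluded)

n_large_clean = int(np.sum(np.abs(x_clean) > 1e-6))
n_large_occluded = int(np.sum(np.abs(x_occluded) > 1e-6))
print(n_large_clean, n_large_occluded)
```

Without occlusion the pseudo-inverse returns the sparse x exactly; with a handful of corrupted pixels the error is smeared over all coefficients and the estimate is no longer sparse.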
L2 solution for Occlusion
• Example with occlusion:
– (a): Occluded face
– (b): x solved from L2 minimization, not sparse at all
– (c): error
– (d): reconstruction from x
L1 vs L2 minimization
L1 vs L2 in 2D space:
y=Ax
Sparsity is bad news for L2
• Given training set A, the unknown image y is under-determined in A:
– R(A): the set of y that satisfy y = Ax:
Numeric solution for L1 minimization
Candès's group (Caltech) provides the L1-Magic MATLAB toolbox
Check out the manual on the course webpage
Stephen Boyd:
Boyd’s nice book on Optimization can be downloaded from his webpage at Stanford.
Excellent book, with slides, homework and solutions.
Super-resolve a lower-resolution patch, say k x k, to 3k x 3k.
Mathematically, learn a function:
$$f: x \to Y, \quad x \in R^d, \; Y \in R^D$$
Basic Framework
Super-resolution is the inverse of down-scaling:
Low-res patch y is the blurred and down-scaled high-res patch x:
$$y = SHx$$
Assume the high-res image is sparse on some dictionary (true for, say, DCT):
[Figure: Input, Output, Original; a patch approximated (≈) by a combination of training patches]
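A toy sketch of the observation model y = SHx, reading H as a blur operator and S as a down-sampling operator. For brevity it uses a 1-D signal, a moving-average blur and a factor of 3; these choices and the function names are assumptions for illustration only.

```python
import numpy as np


def blur_matrix(n, width=3):
    """H: moving-average blur of a length-n signal; rows are renormalised at the borders."""
    H = np.zeros((n, n))
    half = width // 2
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        H[i, lo:hi] = 1.0 / (hi - lo)
    return H


def downsample_matrix(n, factor=3):
    """S: keep the centre sample of every block of `factor` samples."""
    keep = np.arange(factor // 2, n, factor)
    S = np.zeros((len(keep), n))
    S[np.arange(len(keep)), keep] = 1.0
    return S


n = 9
x = np.arange(n, dtype=float)       # toy high-res "patch" (1-D for brevity)
H, S = blur_matrix(n), downsample_matrix(n)
y = S @ H @ x                        # low-res observation, y = SHx
print(y)
```

Because S discards two thirds of the samples, many different x map to the same y; the sparsity prior is what makes the inverse problem tractable.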
Coupled Dictionary Learning
Pre-train a common set of coupled low- and high-resolution dictionaries
Super-resolve by solving L1 minimization on the lower-resolution patch, then use the same coefficients to reconstruct the higher-resolution patch
Coupled Dictionary Learning
Learn two sets of dictionaries, Dh, Dl, that have common sparse coefficients for low and high resolution image patches, y and x:
Reconstruction of low res patch with sparse coefficients:
$$\min \|\alpha\|_0 \quad \text{s.t.} \quad \|D_l \alpha - y\|_2^2 \le \epsilon$$
Furthermore, introduce a linear projection, F, to enforce perceptual metrics:
$$\min \|\alpha\|_0 \quad \text{s.t.} \quad \|F D_l \alpha - F y\|_2^2 \le \epsilon$$
Then the high res patch x can be constructed as
$$x = D_h \alpha$$
J. Yang, J. Wright, T. S. Huang, Y. Ma, Image super-resolution via sparse representation, IEEE Trans. Image Processing, vol. 19 (11), 2861-2873
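A toy sketch of the coupled reconstruction step: code the low-res patch on Dl, then reuse the coefficients with Dh. It makes several simplifying assumptions: the L0 problem is relaxed to the penalised L1 form and solved with a plain ISTA loop, the projection F is omitted, and the dictionaries are random atoms with Dl obtained by 3x3 averaging of Dh rather than learned jointly.

```python
import numpy as np


def ista(D, y, lam=0.01, n_iter=2000):
    """Minimise 0.5*||D a - y||_2^2 + lam*||a||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2      # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - step * (D.T @ (D @ a - y))      # gradient step on the quadratic term
        a = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return a


def pool3(atoms):
    """Average 9x9 patches (rows of `atoms`) over 3x3 blocks, giving 3x3 patches."""
    k = atoms.shape[0]
    return atoms.reshape(k, 3, 3, 3, 3).mean(axis=(2, 4)).reshape(k, 9)


rng = np.random.default_rng(3)
K = 20
Dh = rng.standard_normal((81, K))               # toy high-res dictionary (9x9 atoms)
Dl = pool3(Dh.T).T                              # toy low-res dictionary (3x3 atoms)
scale = np.linalg.norm(Dl, axis=0)
Dh, Dl = Dh / scale, Dl / scale                 # same scaling keeps the atoms coupled

alpha_true = np.zeros(K)
alpha_true[7] = 1.0
y = Dl @ alpha_true                             # observed low-res patch

alpha = ista(Dl, y)                             # sparse code from the low-res patch
x_hat = Dh @ alpha                              # high-res patch, x = Dh * alpha
print(x_hat.shape, np.linalg.norm(Dl @ alpha - y))
```

The key design point is that alpha is estimated only from the low-resolution side, so the quality of x_hat depends entirely on how well the two dictionaries are coupled.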
Coupled Dictionary Learning
Putting it together, super-resolution solves:
Sparse reconstruction of the lower-resolution patch y
Enforce local consistency with high-res patches: extract adjacent overlapping stripes, via P, and require them to agree with w, the previously reconstructed patch pixels:
Solution via Lagrangian relaxation:
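One way to write the combined problem, using the symbols F, P, w, Dl and Dh from these slides with the L0 norm relaxed to L1. The tolerances \epsilon_1, \epsilon_2 and the weights \beta, \lambda are assumed names; treat this as a sketch rather than the slide's exact formulas.

```latex
% constrained form: sparse code of the low-res patch plus agreement on the overlap
\min_\alpha \|\alpha\|_1 \quad \text{s.t.} \quad
  \|F D_l \alpha - F y\|_2^2 \le \epsilon_1, \quad
  \|P D_h \alpha - w\|_2^2 \le \epsilon_2

% Lagrangian relaxation, stacking both constraints into one least-squares term
\min_\alpha \|\tilde{D} \alpha - \tilde{y}\|_2^2 + \lambda \|\alpha\|_1, \qquad
\tilde{D} = \begin{bmatrix} F D_l \\ \beta P D_h \end{bmatrix}, \quad
\tilde{y} = \begin{bmatrix} F y \\ \beta w \end{bmatrix}
```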
Overall Algorithm
Patch level super-resolution, complete with global image gradient search
Dictionary Training
Training data: low and high resolution image patches Yl={yk}, Xh={xk}:
Enforce the common sparse coefficients
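A simplified sketch of joint training with shared coefficients. Z (one column of sparse coefficients per training pair) and \lambda are assumed symbols, and any per-term weights used in practice are omitted.

```latex
% X_h, Y_l: matrices whose columns are the high- and low-res training patches x_k, y_k
\min_{D_h, D_l, Z} \; \|X_h - D_h Z\|_F^2 + \|Y_l - D_l Z\|_F^2 + \lambda \|Z\|_1
```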
Results
Dictionary Training
Trained on flower and animal data sets, covering a variety of textures
Dictionaries trained from more than 100,000 samples
[Figure: learned dictionaries D_h and D_l]
Results
3x super-resolution
[Figure: low-resolution input, Bicubic, Neighbor embedding [Chang CVPR ‘04], Coupled Dictionary, Original]
Related Work
Potential paper review project
Summary
Sparse Signal Processing
If a signal is sparse in some (unknown) domain, then from a random measurement we can reliably recover the signal via L1 minimization
Applications: Robust PCA and Face Recognition with Occlusion
Face images are sparse linear combinations of atoms from a face dictionary
Recovery by solving the L1 problem; caveat: only additive noise can be dealt with.
Applications: Coupled Dictionary for Image Super Resolution
Coupled dictionary: high and low res image patches sharing the same coefficients.