Image Super-resolution Soma Biswas Department of Electrical Engineering, Indian Institute of Science, Bangalore.
Applications
Terminology
In most applications, high-resolution images are desired for:
- Improvement of pictorial information for human interpretation
- Automatic machine perception
Resolution: the capability of a sensor to observe or measure the smallest object clearly, with distinct boundaries. It describes the detail contained in an image.
Spatial resolution: a digital image is made up of small picture elements called pixels. Spatial resolution refers to the pixel density in an image and is measured in pixels per unit area.
[Figure: a classic test target used to determine the spatial resolution of an imaging system.]
Terminology
Low-Resolution (LR): the pixel density within the image is small, so the image offers less detail.
High-Resolution (HR): the pixel density within the image is larger, so the image offers more detail, which may be critical in various applications.
- HR medical images are very helpful for a doctor to make a correct diagnosis.
- HR satellite images make it easy to distinguish an object from similar ones.
Super-resolution (SR): obtaining an HR image from one or multiple LR images.
How to increase resolution?
Sensors are arranged in a 2D array to capture two-dimensional image signals.
The number of sensor elements per unit area determines the spatial resolution.
1) Reduce the pixel size, i.e., increase the sensor density by reducing the sensor size.
Disadvantages:
- As the sensor size decreases, the amount of light incident on each sensor also decreases.
- Shot noise becomes significant.
- Hardware cost increases.
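The link between smaller sensors and shot noise can be made concrete with the standard Poisson photon-counting model. The symbols $N$ (mean photon count per pixel) and $A$ (pixel area) are introduced here for illustration and do not appear in the slides.

```latex
N \propto A, \qquad
\sigma_{\text{shot}} = \sqrt{N}, \qquad
\mathrm{SNR} = \frac{N}{\sigma_{\text{shot}}} = \sqrt{N} \propto \sqrt{A}
```

Under this model, halving the pixel side length quarters $A$, so the shot-noise-limited SNR drops by a factor of 2.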
Super-resolution
Super-resolution is the process of combining multiple low-resolution images to form a higher-resolution one.
The resulting image should represent reality better than any of the input images.
Increase the high-frequency components and remove the degradations caused by the imaging process of the low-resolution camera.
Combine the non-redundant information contained in multiple low-resolution frames to generate a high-resolution image.
Costs less than comparable approaches.
Existing LR imaging systems can still be utilized.
Related Topics
Image restoration: the goal is to recover a degraded (e.g., blurred, noisy) image, but it does not change the size of the image.
Restoration and SR reconstruction are closely related theoretically.
SR reconstruction can be considered a second-generation problem of image restoration.
Image interpolation: used to increase the size of a single image.
The quality of an image magnified from an LR image is inherently limited.
Single-image interpolation cannot recover the high-frequency components lost or degraded during the LR sampling process.
So, image interpolation methods are not considered SR techniques.
SR: The fusion of information from various observations of the same scene
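The limitation of single-image interpolation noted above can be seen on a toy 1D signal. This is a minimal sketch, assuming plain decimation (no blur, no noise) as the LR sampling process and linear interpolation as the magnification method; the signal values are made up for illustration.

```python
import numpy as np

# HR signal containing only the highest representable frequency (alternating samples).
hr = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])

# Assumed LR sampling: keep every second HR sample.
lr = hr[::2]

# Magnify back to the HR grid by linear interpolation between the LR samples.
hr_positions = np.arange(hr.size)
lr_positions = hr_positions[::2]
interpolated = np.interp(hr_positions, lr_positions, lr)

# The alternation is invisible in the LR samples, so interpolation cannot restore it.
max_error = np.abs(interpolated - hr).max()
print(lr, interpolated, max_error)
```

The LR frame carries no trace of the alternating pattern, so no interpolation kernel applied to this single frame can bring it back; additional observations of the scene are needed.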
Classical Reconstruction Based SR
Non-redundant information in the LR images is due to subpixel shifts between them.
Subpixel shifts arise from uncontrolled motion between the imaging system and the scene (e.g., moving objects) or from controlled motion (e.g., a satellite imaging system).
Each LR image imposes a set of linear constraints on the unknown HR intensity values.
If enough LR images are available, the set of equations becomes determined.
In practice, however, this approach is numerically limited to small increases in resolution.
If the LR images are shifted by integer units, each image contains the same information and SR is not possible.
[Figure: HR grid and LR grids]
If these scene motions are known or can be estimated with subpixel accuracy, and if we combine these LR images, SR image reconstruction is possible.
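The difference between integer and subpixel shifts can be checked on a toy 1D signal. This is a sketch under simplifying assumptions: decimation by 2, circular shifts known exactly, no blur and no noise. The helper `observe` and the signal values are hypothetical.

```python
import numpy as np

def observe(hr, shift, factor=2):
    """Shift the HR signal circularly by `shift` HR samples, then decimate."""
    return np.roll(hr, -shift)[::factor]

hr = np.arange(8, dtype=float) ** 2   # hypothetical HR signal: 0, 1, 4, ..., 49

lr_a = observe(hr, shift=0)   # reference frame
lr_b = observe(hr, shift=1)   # half an LR pixel: a subpixel shift
lr_c = observe(hr, shift=2)   # one full LR pixel: an integer shift

# Integer shift: the same samples as the reference frame, only reordered.
same_information = np.array_equal(np.sort(lr_a), np.sort(lr_c))

# Subpixel shift: the two frames interleave into the full HR signal.
fused = np.empty_like(hr)
fused[0::2] = lr_a
fused[1::2] = lr_b
print(same_information, np.array_equal(fused, hr))
```

With blur and noise present the fusion is no longer a simple interleaving, which is why the reconstruction is posed as an inverse problem.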
Image acquisition System
Goal of SR: Restore an HR image from several degraded LR images.
Observation Model
Observation Model – Contd.
Recovering x is an inverse problem.
It combines denoising, deblurring, up-scaling, and fusion of the different images.
Assumption: D, H, and Ft are known, or can be reliably estimated from the given data.
Motion is also assumed to be estimable with sub-pixel accuracy.
Accurate general motion estimation (optical flow) is an under-determined problem.
Inaccurately estimated motion can make the output image inferior to the input.
Simplifying assumptions: global warps or rigid-body motion.
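The slide equations did not survive extraction. A commonly used linear observation model that is consistent with the symbols named above reads as follows. It is a sketch under assumptions: $D$ is taken to be the decimation operator, $H$ the blur, and $F_t$ (written Ft above) the geometric warp of frame $t$; the symbols $y_t$ (observed LR image), $n_t$ (additive noise), and $T$ (number of frames) are introduced here.

```latex
y_t = D \, H \, F_t \, x + n_t, \qquad t = 1, \dots, T
```

Each frame contributes one such set of linear equations in the unknown HR image $x$, which is what makes the recovery of $x$ a linear inverse problem.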
Intuitive Idea
3 stages:
– Registration
– Interpolation
– Deblurring
Classic SR
Goal: find the most likely high-resolution image, given the existing low-resolution images (and the known decimation, blur, and transformations).
The Maximum-Likelihood (ML) estimate of x is obtained by minimizing a penalty function.
In many cases, the measurements are not sufficient for recovering x; regularization is then required.
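The penalty function itself is missing from the extracted slide. Assuming independent Gaussian noise in each frame, the ML estimate reduces to a least-squares fit of the known decimation $D$, blur $H$, and transformations $F_t$ to the data; the symbols $y_t$ (observed LR images) and $T$ (number of frames) are introduced here for illustration.

```latex
\hat{x}_{\mathrm{ML}} = \arg\min_{x} \sum_{t=1}^{T} \left\| y_t - D \, H \, F_t \, x \right\|_2^2
```

When the stacked operators $D H F_t$ do not have full column rank, this minimizer is not unique, which is the situation in which regularization is required.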
Classic SR
Regularization uses prior information about the solution to make the problem well-posed.
Solution: choose x to minimize a regularized cost function.
A priori knowledge: a smoothness constraint, reflecting that most images are naturally smooth with limited high-frequency activity.
In the formulation, the amount of high-pass energy in the restored image is minimized.
C is generally a high-pass filter.
α: Lagrange multiplier (regularization parameter) that controls the tradeoff between fidelity to the data and smoothness of the solution.
Large α: leads to a smoother solution; useful when only a small number of LR images are available or the fidelity of the observed data is low due to registration error and noise.
Small α: appropriate when a large number of LR images are available and the amount of noise is small.
Solved using an iterative method.
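A minimal 1D sketch of the regularized formulation, assuming the cost J(x) = Σ_t ||y_t − D H F_t x||² + α ||C x||², with circular shifts as the warps, a 3-tap blur, decimation by 2, and a second-difference filter as the high-pass C. All sizes, the blur kernel, the value of α, and the choice of plain gradient descent as the iterative method are assumptions for illustration.

```python
import numpy as np

n, factor, alpha = 16, 2, 0.1          # HR length, decimation factor, regularization weight
eye = np.eye(n)

def shift_matrix(s):
    """Circular warp F_t: (F x)[i] = x[(i + s) % n]."""
    return np.roll(eye, s, axis=1)

H = 0.5 * eye + 0.25 * (shift_matrix(1) + shift_matrix(-1))   # blur, kernel [1/4, 1/2, 1/4]
D = eye[::factor]                                             # decimation: keep every 2nd sample
C = 2.0 * eye - shift_matrix(1) - shift_matrix(-1)            # high-pass (second difference)

rng = np.random.default_rng(0)
x_true = np.sin(2 * np.pi * np.arange(n) / n) + (np.arange(n) >= n // 2)
A = [D @ H @ shift_matrix(s) for s in (0, 1)]                 # one operator per LR frame
y = [A_t @ x_true + 0.01 * rng.standard_normal(n // factor) for A_t in A]

def cost(x):
    fidelity = sum(np.sum((y_t - A_t @ x) ** 2) for A_t, y_t in zip(A, y))
    return fidelity + alpha * np.sum((C @ x) ** 2)

# Matrix and right-hand side of the normal equations of the quadratic cost.
M = sum(A_t.T @ A_t for A_t in A) + alpha * C.T @ C
b = sum(A_t.T @ y_t for A_t, y_t in zip(A, y))

step = 1.0 / (2.0 * np.linalg.eigvalsh(M).max())   # 1 / Lipschitz constant of the gradient
x = np.zeros(n)
initial_cost = cost(x)
for _ in range(300):
    gradient = 2.0 * (M @ x - b)
    x = x - step * gradient

x_closed_form = np.linalg.solve(M, b)              # direct solution, for comparison only
print(initial_cost, cost(x), np.abs(x - x_closed_form).max())
```

For real images the operators are far too large to form as dense matrices, so the same iteration would be applied with convolution, warping, and subsampling routines instead.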
Results
The poorest reconstruction is the nearest-neighbour interpolated image.
Its poor performance is attributed to the independent processing of the LR observations.
CLS SR results show significant improvements by retaining detailed information.
Further improvements are obtained by using the edge-preserving prior.
Improving Resolution by Image Registration
Michal Irani and Shmuel Peleg
CVGIP: Graphical Models and Image Processing, 1991
Imaging Process
Each pixel in the resulting LR image is given by:
Image Registration
Algorithm Overview
The HR image, passed through the imaging process, should reproduce the LR images.
Register the LR images.
Guess the HR image $f^{(0)}$.
Iteration n:
- Simulate the imaging process to create $g_k^{(n)}$ from $f^{(n)}$.
- Compare $g_k^{(n)}$ and $g_k$.
- Correct $f^{(n)}$ in the direction of the error.
Output: $f^{(n)}$
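The loop above can be sketched on a 1D toy problem. This is not the authors' implementation: it assumes the registration is already known (circular shifts), a 3-tap blur as the point spread function, decimation by 2, and the transpose of the simulated imaging process as the back-projection operator. All function names are hypothetical.

```python
import numpy as np

KERNEL = np.array([0.25, 0.5, 0.25])   # assumed blur (point spread function)
FACTOR = 2                             # assumed decimation factor

def blur(signal):
    """Circular convolution with the symmetric 3-tap kernel."""
    return (KERNEL[0] * np.roll(signal, 1) + KERNEL[1] * signal
            + KERNEL[2] * np.roll(signal, -1))

def simulate(f, shift):
    """Imaging process: warp (circular shift), blur, then decimate."""
    return blur(np.roll(f, -shift))[::FACTOR]

def back_project(error, shift, n):
    """Spread an LR error back onto the HR grid (transpose of `simulate`)."""
    upsampled = np.zeros(n)
    upsampled[::FACTOR] = error
    return np.roll(blur(upsampled), shift)

def initial_guess(lr_images):
    """f^(0): average of the LR images, replicated onto the HR grid."""
    return np.repeat(np.mean(lr_images, axis=0), FACTOR)

def iterative_back_projection(lr_images, shifts, n, iterations=200, step=1.0):
    f = initial_guess(lr_images)
    for _ in range(iterations):
        correction = np.zeros(n)
        for g_k, shift in zip(lr_images, shifts):
            g_k_sim = simulate(f, shift)                      # simulated LR image g_k^(n)
            correction += back_project(g_k - g_k_sim, shift, n)
        f = f + step * correction                             # correct along the error
    return f

n = 16
f_true = np.sin(2 * np.pi * np.arange(n) / n) + (np.arange(n) >= n // 2)
shifts = [0, 1]                                               # assumed known registration
lr_images = [simulate(f_true, s) for s in shifts]

def residual(f):
    """Total squared mismatch between observed and simulated LR images."""
    return sum(np.sum((g_k - simulate(f, s)) ** 2) for g_k, s in zip(lr_images, shifts))

f0 = initial_guess(lr_images)
f_est = iterative_back_projection(lr_images, shifts, n)
print(residual(f0), residual(f_est))
```

The residual measures how well the current HR guess explains the LR images; frequency content that the blur removes entirely stays at whatever the initial guess contained.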
Details
Results
[Figure panels: one of the input images; initial guess (average of input images); output]
Limitations of Reconstruction Based SR
Multiple low-resolution images of the same scene, aligned with sub-pixel accuracy, are required.
SR image reconstruction is generally a severely ill-posed problem:
- Insufficient number of low-resolution images
- Ill-conditioned registration
- Unknown blurring operators
Performance degrades rapidly when the desired magnification factor is large or the number of available input images is small.
The result may be overly smooth, lacking important high-frequency details.
SR gets much harder as the magnification factor increases.
Partial solution: impose a prior.
At high magnification factors, results obtained with such priors tend to look very smooth.
Example Based Superresolution
William T. Freeman, Thouis R. Jones and Egon C. Pasztor
Image-Based Modeling, Rendering, and Lighting
Example Based SR (Hallucination)
Correspondences between low- and high-resolution image patches are learned from a database of low- and high-resolution image pairs.
The learned relationship is applied to a new low-resolution image to recover its most likely high-resolution version.
Higher SR factors have often been obtained by repeated application of this process.
The reconstructed (“hallucinated”) high-resolution details are not guaranteed to match the true (unknown) high-resolution details.
Prior Knowledge of faces
Algorithm Overview
Training Set
Start with a collection of HR images.
(a,c) Generate LR and HR patches.
(b) Initial interpolation of the LR image: it has the desired resolution (pixel count) but lacks HR details.
(d,e) Store the corresponding pairs.
Patch size: 5x5 or 7x7.
Construct a database (DB) of matching LR-HR patches.
Algorithmically find the most coherent patch to generate a good image.
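The database construction and lookup can be sketched as follows. This is a simplified illustration, not the paper's pipeline: it assumes 2x2 block averaging followed by pixel replication as the degradation and initial interpolation, stores raw intensity patches, uses random arrays as stand-in training images, and ignores coherence between neighboring patches. All function names are hypothetical.

```python
import numpy as np

PATCH = 5  # patch size; the slides mention 5x5 or 7x7

def degrade_and_interpolate(hr):
    """Assumed LR model: 2x2 block averaging, then pixel replication back to HR size."""
    h, w = hr.shape
    lr = hr.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.kron(lr, np.ones((2, 2)))

def extract_patches(image):
    """All PATCH x PATCH patches, flattened to rows, in raster-scan order."""
    h, w = image.shape
    return np.array([image[i:i + PATCH, j:j + PATCH].ravel()
                     for i in range(h - PATCH + 1)
                     for j in range(w - PATCH + 1)])

def build_database(hr_images):
    """Pair each interpolated-LR patch with the HR patch at the same location."""
    lr_rows = [extract_patches(degrade_and_interpolate(hr)) for hr in hr_images]
    hr_rows = [extract_patches(hr) for hr in hr_images]
    return np.vstack(lr_rows), np.vstack(hr_rows)

def lookup(lr_patch, db_lr, db_hr):
    """Return the HR patch whose LR counterpart is nearest (Euclidean) to lr_patch."""
    distances = np.sum((db_lr - lr_patch.ravel()) ** 2, axis=1)
    return db_hr[np.argmin(distances)].reshape(PATCH, PATCH)

rng = np.random.default_rng(0)
training = [rng.random((16, 16)) for _ in range(3)]   # stand-ins for HR training images
db_lr, db_hr = build_database(training)

# Query with a patch cut from a degraded training image: its stored HR partner comes back.
query = degrade_and_interpolate(training[0])[2:2 + PATCH, 3:3 + PATCH]
predicted = lookup(query, db_lr, db_hr)
print(db_lr.shape, predicted.shape)
```

Picking each patch independently like this is exactly the purely local strategy; choosing the most coherent patch additionally requires taking the neighboring patches into account.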
Local image information not sufficient!
Breaking the image into patches and looking up the missing HR details for each patch independently does not work well.
The 16 closest examples to the LR patch look similar to the input patch.
The HR details corresponding to those LR examples look different from one another.
The method should therefore take spatial neighbor effects into account.
Algorithm Block Diagram
One-pass algorithm
The input image is subdivided into LR patches that are traversed in raster-scan order.
Each LR patch is concatenated with the previously determined HR patches.
The HR patch is selected by a nearest-neighbor search in the training set.
Step 1: Increase the size of the image with a simple interpolation technique.
Step 2: Predict the missing image details in the interpolated image to create the super-resolution output.
Training Data
Results
[Figure panels: cubic spline; super-resolution; true high-resolution image]
Generic images can be a good training set for other generic images: there is no need for flower images in the training data.
The algorithm can use training patch examples from source image regions that look different from the regions being enhanced.
Results – Failure Case
The low-level training set cannot distinguish JPEG compression noise from correct image data.
The algorithm interprets the artifacts as image data and enhances them.
How training data affects performance
The training set does not have to be similar to the image being enlarged, but it should be in the same image class (such as text or color images).
High-resolution detail formed out of concatenated characters.
Modifications
Methods based on image patches require large training sets to cover any pattern that might be encountered in testing.
Neighbor embedding [Chang CVPR ‘04]: for each LR patch yt, find its k nearest neighbors Nt.
Compute the reconstruction weights by neighbor embedding.
The reconstruction weights are applied to the neighbors' HR patches to generate the corresponding high-resolution patch.
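The weight computation can be sketched with the constrained least-squares solve used in locally linear embedding: the weights sum to 1 and minimize the error of reconstructing the LR patch from its neighbors. The small regularization term, the tiny 2D "patches", and the linear LR-to-HR map used to fabricate training pairs are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def embedding_weights(y, neighbors, reg=1e-6):
    """Weights w with sum(w) = 1 minimizing ||y - sum_i w_i * neighbors[i]||^2."""
    k = len(neighbors)
    diffs = neighbors - y                          # (k, d): neighbors relative to y
    gram = diffs @ diffs.T                         # local Gram matrix, (k, k)
    gram += reg * (np.trace(gram) + 1.0) * np.eye(k)   # assumed regularization for stability
    w = np.linalg.solve(gram, np.ones(k))
    return w / w.sum()

def neighbor_embedding_sr(y, train_lr, train_hr, k=3):
    """Estimate the HR patch for LR patch y from its k nearest training neighbors."""
    distances = np.sum((train_lr - y) ** 2, axis=1)
    nearest = np.argsort(distances)[:k]            # indices of the neighbor set N_t
    w = embedding_weights(y, train_lr[nearest])
    return w @ train_hr[nearest]                   # same weights applied to the HR patches

# Toy training set: 2D "LR patches" and 3D "HR patches" related by a made-up linear map.
lr_to_hr = np.array([[1.0, 2.0, 0.0],
                     [0.0, 1.0, 3.0]])
train_lr = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
train_hr = train_lr @ lr_to_hr

y = np.array([0.2, 0.3])
hr_estimate = neighbor_embedding_sr(y, train_lr, train_hr)
print(hr_estimate)
```

Because each output is a weighted blend of several training patches, the method can produce patches that are not in the training set, which is how it gets by with a smaller set than a pure nearest-neighbor lookup.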
Super-Resolution from a Single Image
Daniel Glasner, Shai Bagon and Michal Irani
ICCV 2009
Patch Redundancy in a Single Image
Natural images tend to contain repetitive visual content.
Small (e.g., 5x5) image patches in a natural image tend to redundantly recur many times inside the image, both within the same scale and across different scales.
On average, more than 90% of the patches in an image have 9 or more other similar patches in the same image (‘within scale’).
More than 80% of the input patches have 9 or more similar patches at smaller scales.
Employing in-scale Patch Redundancy
Idea: small (5x5) patch repetitions occur abundantly within and across image scales, even when we do not visually perceive any obvious repetitive structure.
Very small patches often contain only an edge, a corner, etc. Such patches are found abundantly in multiple image scales of almost any natural image.
Recurrence of patches within the same image scale forms the basis for applying the classical SR constraints to information from a single image.
Combining cross-scale and in-scale redundancy
Recurrence of patches across different scales gives rise to example-based SR from a single image, with no prior examples.
Build a cascade of images of decreasing resolution from the LR image.
For each LR patch, search for its nearest neighbour in the even lower-resolution image.
Take the found neighbour’s parent in the original LR image and copy it into the HR image.
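The cross-scale lookup can be sketched for a single patch. This is a simplified illustration rather than the paper's procedure: it assumes a single cascade level with an integer scale gap of 2, 2x2 block averaging as the downscaling, an exhaustive search, and a random array as a stand-in image. All function names are hypothetical.

```python
import numpy as np

PATCH, SCALE = 3, 2   # assumed patch size and scale gap between cascade levels

def downscale(image):
    """One cascade level: SCALE x SCALE block averaging (stand-in for blur + subsample)."""
    h, w = image.shape
    return image.reshape(h // SCALE, SCALE, w // SCALE, SCALE).mean(axis=(1, 3))

def nearest_patch(patch, image):
    """Top-left corner of the patch in `image` closest to `patch` (exhaustive search)."""
    h, w = image.shape
    best, best_pos = np.inf, (0, 0)
    for i in range(h - PATCH + 1):
        for j in range(w - PATCH + 1):
            d = np.sum((image[i:i + PATCH, j:j + PATCH] - patch) ** 2)
            if d < best:
                best, best_pos = d, (i, j)
    return best_pos

def parent_patch(lr_image, top, left):
    """HR candidate for the LR patch at (top, left): the parent of its cross-scale match."""
    coarse = downscale(lr_image)
    i, j = nearest_patch(lr_image[top:top + PATCH, left:left + PATCH], coarse)
    size = PATCH * SCALE
    # The matched coarse patch was produced from this region of the LR image.
    return lr_image[i * SCALE:i * SCALE + size, j * SCALE:j * SCALE + size]

rng = np.random.default_rng(0)
image = rng.random((16, 16))              # stand-in for the input LR image
hr_candidate = parent_patch(image, 4, 5)  # 6x6 patch proposed for the 3x3 patch at (4, 5)
print(hr_candidate.shape)
```

The returned parent is SCALE times larger than the query patch, so pasting parents for all patches (and averaging overlaps) yields an image at SCALE times the input resolution.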
Results
Observations
The main improvement in resolution comes from the example-based SR component of the combined framework.
If, for a particular pixel, the only similar patches found are within the input scale L, then the scheme reduces to ‘classical’ single-image SR at that pixel.
Thus, the scheme provides the best resolution increase that the patch recurrence found at each pixel allows.
Image Super-resolution via Sparse Representation
Jianchao Yang, John Wright, Thomas Huang, Yi Ma
CVPR 2008, TIP 2010
Problem Definition
Problem: given a single low-resolution input and a set of (high- and low-resolution) training patch pairs sampled from similar images, reconstruct a high-resolution version of the input.
[Figure panels: input; output; original; training patches]
Advantage: more widely applicable than reconstruction-based (multi-image) approaches.
Difficulty: single-image super-resolution is an extremely ill-posed problem.
Linear Sparse Representation
Single-image SR based on sparse signal representation.
Image patches can be well represented as a sparse linear combination of
elements from an appropriately chosen over-complete dictionary.
We do not directly observe the high-resolution patch x, but rather (features of) its low-resolution version, produced by a downsampling / blurring operator.
The input low-resolution patch is sparsely represented over a dictionary of low-resolution patches.
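The equation annotated by these labels did not survive extraction. One way to write the relationship, using the dictionary names Dh and Dl from the dictionary-preparation slide, is given below. The symbols $y$ (LR patch), $L$ (downsampling / blurring operator), and $\boldsymbol{\alpha}$ (sparse coefficient vector, unrelated to the regularization parameter α of the classic SR slides) are assumptions introduced here.

```latex
x \approx D_h \boldsymbol{\alpha}, \qquad
y = L x \approx L D_h \boldsymbol{\alpha} = D_l \boldsymbol{\alpha}, \qquad
D_l := L D_h
```

The same coefficient vector therefore describes both the unobserved HR patch and the observed LR patch, which is what allows it to be estimated from the LR side alone.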
SR as Compressed Sensing
Theoretical results from compressed sensing suggest that under mild
conditions, the sparse representation can be correctly recovered from the
downsampled signals.
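The recovery claim can be tried out numerically on a synthetic problem. The sketch below uses iterative soft-thresholding (ISTA) on the L1-regularized least-squares form of the problem, which is a substitute for the linear-programming solver mentioned in these slides. The random dictionary, the problem sizes, and the value of `lam` are assumptions for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(D, y, a, lam):
    return 0.5 * np.sum((y - D @ a) ** 2) + lam * np.sum(np.abs(a))

def ista(D, y, lam=0.01, iterations=500):
    """Minimize 0.5 * ||y - D a||^2 + lam * ||a||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2       # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iterations):
        a = soft_threshold(a - step * D.T @ (D @ a - y), step * lam)
    return a

rng = np.random.default_rng(0)
D_l = rng.standard_normal((20, 50))              # stand-in LR dictionary: 20 equations, 50 atoms
D_l /= np.linalg.norm(D_l, axis=0)               # unit-norm atoms

a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]           # sparse ground-truth coefficients
y = D_l @ a_true                                 # observed low-dimensional signal

a_hat = ista(D_l, y)
print(np.flatnonzero(np.abs(a_hat) > 0.1))       # indices of the large recovered coefficients
```

Whether the true support is recovered depends on the dictionary and the sparsity level; the conditions under which this is guaranteed are the subject of the compressed sensing results cited in these slides.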
If we can recover the sparse solution to the underdetermined system of linear equations relating the low-resolution patch to its sparse coefficients, we can reconstruct x from those coefficients.
Formally, we seek the sparsest solution.
Convex relaxation: replace the sparsity count with the L1 norm.
This problem can be efficiently solved by linear programming. In many circumstances it recovers the sparsest solution [Donoho 2006 CPAM].
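The optimization problems referred to here were lost in extraction. In the usual notation they can be written as below, where $y$ (the LR patch) and $\boldsymbol{\alpha}$ (the sparse coefficient vector, not the regularization parameter of the classic SR slides) are assumed symbols, and $D_l$, $D_h$ are the LR and HR dictionaries Dl and Dh.

```latex
\text{Sparsest solution:} \quad
\min_{\boldsymbol{\alpha}} \|\boldsymbol{\alpha}\|_0
\quad \text{s.t.} \quad y = D_l \boldsymbol{\alpha}

\text{Convex relaxation:} \quad
\min_{\boldsymbol{\alpha}} \|\boldsymbol{\alpha}\|_1
\quad \text{s.t.} \quad y = D_l \boldsymbol{\alpha}

\text{Reconstruction:} \quad
x = D_h \boldsymbol{\alpha}^{*}
```

The L1 problem becomes a linear program after splitting $\boldsymbol{\alpha}$ into its positive and negative parts, $\boldsymbol{\alpha} = u - v$ with $u, v \ge 0$, and minimizing $\sum_i (u_i + v_i)$.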
Dictionary Preparation
Randomly sample 100,000 high-resolution / low-resolution patch pairs from each set of training images.
Jointly train two dictionaries, Dh and Dl, for the high- and low-resolution image patches.
Enforce that the image patch pairs have the same sparse representations with respect to Dh and Dl.
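One way to express this joint training is a single sparse-coding objective with a shared coefficient matrix. The form below is a sketch: $X_h$ and $Y_l$ (matrices whose columns are the sampled HR and LR patches), $Z$ (shared sparse codes, one column per patch pair), and $\lambda$ (sparsity weight) are assumed symbols, and the published formulation may weight the two fidelity terms differently.

```latex
\min_{D_h,\, D_l,\, Z} \;
\| X_h - D_h Z \|_F^2 + \| Y_l - D_l Z \|_F^2 + \lambda \| Z \|_1
```

Sharing $Z$ between the two fidelity terms is what forces each HR/LR patch pair to use the same sparse representation.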
Experimental Details
To obtain a locally consistent solution:
- Sample 3 x 3 low-resolution patches on a regular grid.
- Allow 1 pixel of overlap between adjacent patches.
- Enforce agreement between overlapping high-resolution reconstructions.
For color images, first convert RGB to YCbCr, then apply the algorithm to the luminance (Y) channel only, since humans are more sensitive to luminance changes.
Interpolate the color layers (Cb, Cr) using bicubic interpolation.
Evaluation: visually, and quantitatively in terms of Root Mean Square Error (RMSE).
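The color handling and the error metric can be sketched as follows. The slides do not say which YCbCr variant is used; the full-range BT.601 (JFIF) coefficients below are an assumption, and the function names are hypothetical.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Assumed variant: full-range BT.601 (JFIF), float RGB in [0, 255], shape (..., 3)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def rmse(estimate, reference):
    """Root mean square error between two arrays of the same shape."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

white = np.array([255.0, 255.0, 255.0])
ycbcr_white = rgb_to_ycbcr(white)          # neutral colors map to Cb = Cr = 128
error = rmse([0.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0])
print(ycbcr_white, error)
```

In a full pipeline, only the Y plane returned by `rgb_to_ycbcr` would be passed to the SR algorithm, while the Cb and Cr planes would be upscaled by bicubic interpolation.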
Qualitative Comparison
[Figure panels: low-resolution input; bicubic; neighbor embedding [Chang CVPR ‘04]; our method; original]
Qualitative Comparison
[Figure panels: input image; bicubic; neighbor embedding; our method]
Performance on noisy data
Observations
The algorithm requires only two compact learned dictionaries instead of a large training patch database.
The computation, based on linear programming or convex optimization, is much more efficient and scalable.
Online recovery of the sparse representation uses the LR dictionary only.
The HR dictionary is used to calculate the final high-resolution image.
The computed sparse representation adaptively selects the most relevant patch bases in the dictionary to best represent each patch of the given low-resolution image.
The sparse representation is robust to noise.