Manifold Blurring Mean Shift algorithms for manifold denoising, presentation, 2012

Post on 07-Jul-2015

Transcript

Computer Vision

Manifold Blurring Mean Shift algorithms for manifold denoising

Kevin ADDA, Florent RENUCCI

Denoising

(General) Retrieving a clean dataset by removing outliers.

(Computer Vision) Recovering a digital image that has been contaminated by additive white Gaussian noise.

[Figures: noisy spiral dataset; handwritten digit recognition; noisy image]

Computer Vision project

Manifold Blurring Mean Shift algorithm (MBMS)

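Blurring mean-shift update, over the k nearest neighbors $\mathcal{N}_n$ of each point $x_n$:

$$\partial x_n = -x_n + \sum_{m \in \mathcal{N}_n} \frac{K(x_n, x_m)}{\sum_{m' \in \mathcal{N}_n} K(x_n, x_{m'})}\, x_m$$

where K is a Gaussian kernel:

$$K(x_n, x_m) = \exp\left(-\frac{\lVert x_n - x_m \rVert^2}{2\sigma^2}\right)$$

Projection on a sub-dimensional space with PCA:

$$\partial x_n \leftarrow (I - U_n U_n^T)\,\partial x_n$$

such that $U_n$ holds the $L$ leading principal directions of the neighborhood of $x_n$, and each point is then updated as $x_n \leftarrow x_n + \partial x_n$.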

Parameters: sigma, which sets the variance of the Gaussian kernel; k, the number of neighbors to consider; L, the local intrinsic dimension; and the number of iterations of the whole algorithm.
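To make the role of each parameter concrete, here is a minimal NumPy sketch of one MBMS iteration. It is an illustration under stated assumptions, not the code used for the slides: the name `mbms_step` is hypothetical, distances are computed by brute force, and each point is counted among its own k nearest neighbors.

```python
import numpy as np

def mbms_step(X, k, sigma, L):
    """One MBMS iteration on X (N x D): a Gaussian mean-shift step over the
    k nearest neighbors, with the motion along the local tangent space removed."""
    X = np.asarray(X, dtype=float)
    N, D = X.shape
    # Pairwise squared Euclidean distances (N x N), brute force.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Indices of the k nearest neighbors (assumption: the point itself is included).
    nbrs = np.argsort(d2, axis=1)[:, :k]
    X_new = np.empty_like(X)
    for n in range(N):
        idx = nbrs[n]
        P = X[idx]                                     # k x D neighborhood
        w = np.exp(-d2[n, idx] / (2.0 * sigma ** 2))   # Gaussian kernel weights
        shift = w @ P / w.sum() - X[n]                 # mean-shift vector
        if L > 0:
            # Local PCA: the top-L principal directions estimate the tangent space.
            C = P - P.mean(axis=0)
            _, _, Vt = np.linalg.svd(C, full_matrices=False)
            U = Vt[:L].T                               # D x L orthonormal basis
            shift = shift - U @ (U.T @ shift)          # keep only the normal component
        X_new[n] = X[n] + shift
    return X_new
```

Calling `mbms_step` repeatedly gives the full algorithm; with `L = 0` the projection is skipped and the step reduces to plain Gaussian blurring mean-shift on the neighbors.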

Setting the parameters: the kernel variance

sigma is related to the level of local noise outside the manifold;

The larger it is, the stronger the denoising effect;

But a large value can distort the manifold shape over iterations.

Trade-off between kernel variance and iteration number.
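The effect of the kernel width can be seen directly on the averaging weights. The short sketch below uses a hypothetical helper `gaussian_weights` and made-up neighbor distances: a small sigma puts almost all the weight on the closest points (little denoising), while a large sigma averages the neighbors almost uniformly (strong denoising, more risk of distortion).

```python
import numpy as np

def gaussian_weights(distances, sigma):
    """Normalized Gaussian kernel weights for a vector of neighbor distances."""
    d = np.asarray(distances, dtype=float)
    w = np.exp(-d ** 2 / (2.0 * sigma ** 2))
    return w / w.sum()

# Illustrative distances only; these values do not come from the slides.
distances = np.array([0.0, 0.5, 1.0, 2.0])
for sigma in (0.3, 1.1, 5.0):
    print(sigma, np.round(gaussian_weights(distances, sigma), 3))
```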

Setting the parameters: the number of neighbors

k is the number of nearest neighbors used to estimate the local tangent space;

MBMS is quite robust to it. It typically grows sublinearly with N.

However, it strongly affects the mean-shift blurring, since each point is moved toward the Gaussian-kernel-weighted mean of its neighbors.

Trade-off between the number of neighbors and kernel variance.

Setting the parameters: the intrinsic dimensionality

If L is too small, it produces more local clustering and can distort the manifold;

If L is too large, points barely move; if L equals the dimension of the data space, there is no motion at all.

Since we use 2D datasets, we will usually choose L = 1, except for the GBMS algorithm (L = 0).
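The three regimes of L can be checked on a toy neighborhood. This is a sketch with hypothetical names and made-up points: the mean-shift vector loses its component along the top-L local PCA directions, so L = 0 leaves it unchanged, L = 1 keeps only the part normal to the estimated tangent line, and L = 2 (the full dimension here) removes everything.

```python
import numpy as np

def normal_component(shift, neighbors, L):
    """Remove from `shift` its component along the top-L local PCA directions."""
    if L == 0:
        return shift  # no projection: plain blurring mean-shift (GBMS)
    centered = neighbors - neighbors.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    U = Vt[:L].T                       # D x L basis of the estimated tangent space
    return shift - U @ (U.T @ shift)

# Hypothetical neighborhood lying roughly along the x-axis.
neighbors = np.array([[0.0, 0.1], [1.0, -0.1], [2.0, 0.05], [3.0, -0.05]])
shift = np.array([0.4, 0.3])
for L in (0, 1, 2):
    print(L, normal_component(shift, neighbors, L))
```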

Setting the parameters: the number of iterations

A few iterations (1 to 5) achieve most of the denoising.

More iterations can refine this and produce a better result, but shrinkage may occur.

Trade-off between the number of iterations and the other parameters.
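Shrinkage is easiest to observe in the L = 0 case. The sketch below is an assumption-laden toy (random Gaussian data, all points used as neighbors, hypothetical function name `gbms_step`): it tracks the average per-coordinate spread of the data over a few blurring iterations.

```python
import numpy as np

def gbms_step(X, sigma):
    """One Gaussian blurring mean-shift iteration (L = 0, all points as neighbors)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    # Each new point is a convex combination of the old points.
    return (W @ X) / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X0 = rng.normal(size=(100, 2))        # toy data, not one of the slide datasets
X = X0.copy()
spread = [X.std(axis=0).mean()]
for _ in range(5):
    X = gbms_step(X, sigma=0.5)
    spread.append(X.std(axis=0).mean())
print(np.round(spread, 3))
```

Because every update is a weighted average of existing points, the data can only contract; stopping after a few iterations limits how far it collapses.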

Spiral dataset

Pinwheel.m: generates small two-dimensional datasets shaped as noisy spirals.

(credit: Harvard Intelligent Probabilistic Systems)
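For readers without Pinwheel.m, a comparable test set can be sketched in a few lines. This is not a port of that script: `noisy_spiral` is a hypothetical stand-in, and the split into 1000 spiral points and 250 uniform outliers is an assumption chosen only so that the total matches the N = 1250 used in the experiments.

```python
import numpy as np

def noisy_spiral(n_points=1000, n_outliers=250, noise=0.1, turns=2.0, seed=0):
    """Noisy 2D spiral plus uniformly distributed outliers, as an (N x 2) array."""
    rng = np.random.default_rng(seed)
    # Square root of a uniform variable spreads points more evenly along the arm.
    t = np.sqrt(rng.uniform(size=n_points)) * turns * 2.0 * np.pi
    spiral = np.column_stack((t * np.cos(t), t * np.sin(t)))
    spiral += rng.normal(scale=noise, size=spiral.shape)
    # Outliers are drawn uniformly in the bounding square of the noiseless spiral.
    r = turns * 2.0 * np.pi
    outliers = rng.uniform(-r, r, size=(n_outliers, 2))
    return np.vstack((spiral, outliers))

X = noisy_spiral()
print(X.shape)
```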

Spiral dataset: application

Parameters: L = 1; k = 15; sigma = 1.1

Initial set: noisy spiral with uniformly distributed outliers

N = 1250

[Figures: the initial set, then the dataset after each of iterations 1 to 8]

Number of neighbors effect

Initial dataset: [figure]

2 sets of parameters:

L = 1, k = 10, sigma = 1.1

L = 1, k = 100, sigma = 1.1

[Figures: k = 10 and k = 100 side by side, at iterations 1, 2 and 3]

Intrinsic dimension effect

Initial dataset: [figure]

2 sets of parameters:

L = 1, k = 15, sigma = 1.1

L = 0, k = 15, sigma = 1.1

[Figures: L = 1 and L = 0 side by side, at iterations 1, 2 and 3]

MNIST Dataset Classification

Input: 16x8 matrices of 0 and 1 representing the image of a letter.

Parameters:

L = 1;

sigma = 1;

k = 4 (must be an even number);

n_iteration = 1.

Preprocessing algorithm:

Extract the coordinates of the "1" elements (the white points). For example, if m(1,3) = 1, we extract the point (1, 3).

Denoising step.

If a resulting coordinate is not an integer, we round it. For example, if a pixel is to be moved to (12.54, 14.1), we round it to (13, 14).

The vector obtained is transformed back into a matrix of 0 and 1.
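The conversion between a binary image and a point set can be sketched as follows. The function names are hypothetical, indices are zero-based (unlike the slide's example), and clipping points that land outside the image is an assumption; the slides do not say how that case is handled.

```python
import numpy as np

def image_to_points(img):
    """Coordinates (row, col) of the pixels equal to 1, as floats."""
    return np.argwhere(np.asarray(img) == 1).astype(float)

def points_to_image(points, shape):
    """Round denoised coordinates to the pixel grid and rebuild a 0/1 matrix."""
    idx = np.rint(np.asarray(points, dtype=float)).astype(int).reshape(-1, 2)
    # Assumption: points rounded outside the image are clipped to the border.
    idx[:, 0] = np.clip(idx[:, 0], 0, shape[0] - 1)
    idx[:, 1] = np.clip(idx[:, 1], 0, shape[1] - 1)
    img = np.zeros(shape, dtype=int)
    img[idx[:, 0], idx[:, 1]] = 1
    return img

img = np.zeros((16, 8), dtype=int)
img[1, 3] = 1
points = image_to_points(img)            # the denoising step would move these points
restored = points_to_image(points, img.shape)
```

Several denoised points may round to the same pixel, so the rebuilt image can contain fewer white pixels than the input.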

General algorithm:

We train a neural network that labels the dataset.

We compute the correct labelling rate.

We denoise the images.

We train a new neural network.

We compute the correct labelling rate again.

Results:

We first run the algorithm on the whole dataset, then with separate training and test sets, and compare the correct labelling rates.

Correct labelling rates    Whole dataset    Training/test split
No blurring                51%              35%
Blurring                   53%              39%

Conclusion

The Manifold Blurring Mean Shift algorithm blurs an image in order to:

Erase some outliers by merging them into the "real" image;

Merge outliers together, decreasing their number;

Decrease the error rate of a labelling method.

The result is a more congruent image for the human eye.

It is also more congruent for automatic classification.

Thank you
