Wavelets & Wavelet Algorithms: Blur Detection, Text Skew Angle Detection, Texture Recognition with 2D Haar Wavelets


Wavelets & Wavelet Algorithms

Blur Detection, Text Skew Angle Detection, Texture Recognition with 2D Haar Wavelets

Vladimir Kulyukin

www.vkedco.blogspot.com

Outline

● Blur Detection

● Text Skew Angle Detection

● Texture Recognition

Blur Detection

with

2D Haar Wavelet Transform

Theoretical Foundations

● Mallat & Hwang [1] argue that signals carry information via irregularities

● These researchers show that the local maxima of the wavelet transform detect the locations of irregularities

● For example, 2D HWT maxima indicate possible locations of edges in images

Indirect & Direct Blur Detection Methods

● Tong et al. [2] classify image blur detection methods into direct and indirect

● Indirect methods characterize image blur as a linear function I_B = B * I_O + N, where I_O is the original image, B is an unknown blur function, N is a noise function, and I_B is the resulting image after blur and noise are introduced

● Direct methods are based on the detection of distinct features computed directly in the images, e.g., corners, edges, color histograms, etc.

Edge Classification

Tong, H., Li, M., Zhang, H., and Zhang, C. "Blur detection for digital images using wavelet transform." In Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, pp. 27-30, June 2004. doi: 10.1109/ICME.2004.1394114.

Edge-Based Blur Detection

● Tong et al. [2] propose a direct method based on 2D Haar Wavelet Transform

● The main assumption of their research is that the introduction of blur affects the four main types of edges differently

● In blurred images, Dirac and A-Step edges are absent whereas G-Step and Roof edges lose their sharpness

● Images are classified as blurred on the basis of presence/absence of Dirac & A-Step edges

Blur Detection

Haar Tile Clustering

Another 2D HWT-Based Blur Estimation Method

● Another 2D HWT-based blur estimation method is presented in [3]

● This method is based on the hypothesis that it may not be necessary to detect any explicit features such as corners or edges

● Rather, it may be possible to detect regions with pronounced changes without explicitly computing the causes of those changes

● After those regions are detected, they can be combined into larger segments

● Those larger segments can be used to classify images as blurred

Theory

● Sharp images have many easily distinguishable features, e.g., edges, corners, textures

● Blurred images have fewer easily distinguishable features because distinctions among image regions are less pronounced

● Easily distinguishable features consist of regions with pronounced changes

Finding Regions with Pronounced Changes

● 2D HWT can be used to find image regions with pronounced horizontal, vertical, or diagonal changes

● A captured frame is divided into N x N windows (aka tiles) where N is an integral power of 2

● Border pixels at the right and bottom margins are discarded when captured frames are not divisible by N
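As a sketch of the tiling step (assuming the frame is a grayscale image in a NumPy array; the function name split_into_tiles is illustrative):

import numpy as np

def split_into_tiles(gray, n=64):
    """Split a grayscale image into n x n tiles, n an integral power of 2.

    Rows and columns at the right and bottom margins that do not fill
    a complete tile are discarded, as described above."""
    rows, cols = gray.shape[0] // n, gray.shape[1] // n
    return [gray[r*n:(r+1)*n, c*n:(c+1)*n]
            for r in range(rows) for c in range(cols)]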

Finding Regions with Pronounced Changes

Image is split into 64 x 64 blocks (aka tiles)

Tile Processing

● Each tile is processed by four iterations of the 2D HWT

● The number of iterations is a parameter and can be increased or decreased

● Each tile is represented by three 2D Haar wavelet values: horizontal change (HC), vertical change (VC), and diagonal change (DC)

● These values can be thresholded to retain only the tiles with large changes
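A minimal sketch of the per-tile processing, assuming the ordered 2D HWT variant that halves averages and differences at each iteration; summarizing a tile by the mean absolute HC, VC, and DC coefficients is one plausible reading, not necessarily the author's exact statistic:

import numpy as np

def hwt2_iteration(a):
    """One iteration of the ordered 2D Haar wavelet transform: returns
    averages plus horizontal, vertical, and diagonal change coefficients,
    each half the size of the input."""
    top = (a[0::2, :] + a[1::2, :]) / 2.0   # row-pair averages
    bot = (a[0::2, :] - a[1::2, :]) / 2.0   # row-pair differences
    avg = (top[:, 0::2] + top[:, 1::2]) / 2.0
    hc = (top[:, 0::2] - top[:, 1::2]) / 2.0    # horizontal changes
    vc = (bot[:, 0::2] + bot[:, 1::2]) / 2.0    # vertical changes
    dc = (bot[:, 0::2] - bot[:, 1::2]) / 2.0    # diagonal changes
    return avg, hc, vc, dc

def tile_changes(tile, iterations=4):
    """Summarize a tile by mean absolute HC, VC, DC coefficients
    accumulated over the given number of 2D HWT iterations."""
    avg, scores = tile.astype(float), np.zeros(3)
    for _ in range(iterations):
        avg, hc, vc, dc = hwt2_iteration(avg)
        scores += [np.abs(hc).mean(), np.abs(vc).mean(), np.abs(dc).mean()]
    return scores  # threshold these to keep only tiles with large changes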

Applying 2D HWT to Image

● Image on the right shows two iterations of ordered 2D HWT

● The whiteness of each pixel is proportional to the magnitude of the wavelet in the corresponding array cell

Representation of 64 x 64 Blocks with 2D Wavelets

Each region is represented in terms of 3 wavelets: vertical (first double), horizontal (second double), and diagonal (third double). Ignore the numbers after "|": they are used for debugging

Tile Clustering

● After the tiles with pronounced changes are found, depth-first search (DFS) is used to combine them into larger tile clusters

● DFS starts with an unmarked tile with a pronounced change and connects to it its immediate horizontal, vertical, and diagonal neighbors if they also have pronounced changes

● If such tiles are found, they are marked with the same cluster number (shown with a specific color on the subsequent slides) and the search continues recursively

● After all tiles reachable from the current tile are found, the algorithm looks for another unmarked tile

● The algorithm terminates when no more unmarked tiles are found
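The clustering step might look like the following sketch, which uses an explicit stack instead of recursion; pronounced is a hypothetical boolean grid marking the tiles with pronounced changes:

import numpy as np

def cluster_tiles(pronounced):
    """Group pronounced tiles into clusters with depth-first search over
    the 8 horizontal, vertical, and diagonal neighbors. Returns an array
    of cluster numbers (0 = not pronounced) and the cluster count."""
    rows, cols = pronounced.shape
    labels, cluster = np.zeros((rows, cols), dtype=int), 0
    for r0 in range(rows):
        for c0 in range(cols):
            if pronounced[r0, c0] and labels[r0, c0] == 0:
                cluster += 1                  # start a new cluster
                labels[r0, c0] = cluster
                stack = [(r0, c0)]
                while stack:                  # DFS from the seed tile
                    r, c = stack.pop()
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if (0 <= rr < rows and 0 <= cc < cols
                                    and pronounced[rr, cc]
                                    and labels[rr, cc] == 0):
                                labels[rr, cc] = cluster
                                stack.append((rr, cc))
    return labels, cluster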

Tile Cluster Buildup with DFS

● Let us assume that we have found all tiles with pronounced changes, as shown in the image on the right

● Now we can run DFS to find all tile clusters


Tile Clustering in Sharp Images

Eventually we run out of unmarked tiles with sufficiently pronounced changes, as shown in the image on the right

Tile Clustering in Blurred Images

Tile clusters shown in the right image are found by DFS run on the tiles with pronounced changes found in the left image

Tile Cluster Filtering

● After the tile clusters are found, two cluster-related rules are used to classify a whole image as sharp or blurred

● The 1st rule uses the percentage of the total area of the image covered by the found tile clusters

● The 2nd rule uses the number of tiles in each cluster to discard small clusters

● The 1st rule captures the intuition that sharper images have many tiles with pronounced changes

● The 2nd rule captures the intuition that small clusters should be discarded as irrelevant (a sketch of both rules follows)
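A sketch of the two rules; the min_tiles and area_pct thresholds are illustrative placeholders, since the slides do not give concrete values:

import numpy as np

def classify_image(labels, num_clusters, min_tiles=4, area_pct=15.0):
    """Rule 2: drop clusters with fewer than min_tiles tiles.
    Rule 1: classify by the percentage of the tile grid covered by the
    surviving clusters."""
    sizes = np.bincount(labels.ravel(), minlength=num_clusters + 1)
    kept = sum(s for s in sizes[1:] if s >= min_tiles)   # rule 2
    coverage = 100.0 * kept / labels.size                # rule 1
    return 'SHARP' if coverage >= area_pct else 'BLURRED'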

Haar Tile Clustering: Algorithmic Chain

Take Image → Find Tiles → Find Tile Clusters → Filter Tile Clusters → BLURRED / SHARP

Haar Tile Clustering: Pseudocode

Lines 2 – 8: Split image into N x N blocks and run 2D HWT on each block

Lines 9 – 13: Find all tile clusters with DFS

Lines 15 – 20: Eliminate smaller clusters & compute overall area of larger clusters

Lines 21 – 22: Classify image as blurred or sharp
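Pulling the sketches together, a hypothetical driver for the whole chain, with the pseudocode line ranges noted in comments (thresholds remain illustrative):

import numpy as np

def haar_tile_clustering(gray, n=64, iterations=4, change_thr=2.0):
    """End-to-end sketch: tile, transform, cluster, filter, classify."""
    rows, cols = gray.shape[0] // n, gray.shape[1] // n
    pronounced = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):                  # lines 2-8: tiles + 2D HWT
        for c in range(cols):
            tile = gray[r*n:(r+1)*n, c*n:(c+1)*n]
            pronounced[r, c] = tile_changes(tile, iterations).max() > change_thr
    labels, k = cluster_tiles(pronounced)  # lines 9-13: DFS tile clusters
    return classify_image(labels, k)       # lines 15-22: filter & classify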

Blur Detection

Evaluation of Haar Tile Clustering

Image Sample

● 500 random RGB images* were selected from a set of 506 smartphone video recordings of common grocery products

● Three human volunteers were recruited to classify each image as blurred or sharp

● An image was classified as sharp if at least two of the three volunteers classified it as sharp – this is the ground truth

*Images are available at https://app.box.com/s/n4s2ve0dajz5gkzqx9vpm1f6fzhw5upz

Three Evaluated Algorithms

● Algorithm 1: Kulyukin, V. & Andhavarapu, S. "Image Blur Detection with 2D Haar Wavelet Transform and Its Effect on Skewed Barcode Scanning." To appear in Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA

● Algorithm 2: Tong, H., Li, M., Zhang, H., and Zhang, C. "Blur detection for digital images using wavelet transform." In Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, pp. 27-30, June 2004. doi: 10.1109/ICME.2004.1394114

● Algorithm 3: Crete, F., Dolmiere, T., Ladret, P., and Nicolas, M. "The blur effect: perception and estimation with a new no-reference perceptual blur metric." In Proceedings of SPIE 6492, Human Vision and Electronic Imaging XII, 64920I, San Jose, CA, USA, January 28, 2007. doi: 10.1117/12.702790


True & False Positives on Blurred and Sharp Images


Algorithm      True Positives     False Positives    True Positives    False Positives
               (Blurred Images)   (Blurred Images)   (Sharp Images)    (Sharp Images)
Algorithm 1    163                4                  254               79
Algorithm 2    167                0                  183               150
Algorithm 3    81                 86                 268               65
Ground Truth   167                0                  333               0

Relative Difference Table

Algorithm      Relative Difference (%)   Relative Difference (%)
               (Blurred Images)          (Sharp Images)
Algorithm 1    2.39                      23.72
Algorithm 2    0.00                      45.05
Algorithm 3    51.50                     19.52

RelativeDifference(A, G) = 100 · |A − G| / max(A, G), where G is the number of blurred/sharp images in the ground truth and A is the number of blurred/sharp images found by the algorithm.
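As a quick check of the formula against the tables:

def relative_difference(a, g):
    """100 * |A - G| / max(A, G): percent deviation of an algorithm's
    count A from the ground-truth count G."""
    return 100.0 * abs(a - g) / max(a, g)

# Algorithm 1 on sharp images: 254 found vs. 333 in the ground truth
print(f"{relative_difference(254, 333):.2f}")  # 23.72, as in the table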

Observations

● On blurred images, Algorithm 1 & Algorithm 2 barely deviate from the ground truth (2.39% and 0%, respectively); Algorithm 3 shows a significant deviation (51.5%)

● On sharp images, Algorithm 1 & Algorithm 3 deviate from the ground truth by roughly 20% (23.72% and 19.52%), whereas Algorithm 2 deviates by 45%

Text Skew Angle Detection

with

2D Haar Wavelet Transform

Text Skew Angle Detection

● OCR engines have a lot of difficulty with skewed text

● If the text skew angle is known, the image can be rotated and then OCRed

● Or, which is cooler and faster, the image can be OCRed in place without any rotation

Text Skew Angle Detection Algorithms

● A variety of algorithms have been developed to determine text skew angle

● Many of these algorithms use horizontal & vertical projection profiles

● A horizontal projection profile is a 1D array whose size is equal to the number of rows in the image

● A vertical projection profile is a 1D array whose size is equal to the number of columns in the image

Horizontal & Vertical Projections

[Figure: a sample binary image with its vertical projection (1 4 3 2 2 3 6 0) shown under the columns and its horizontal projection shown beside the rows]

1) The vertical projection records the count of black pixels in each column

2) The horizontal projection records the count of black pixels in each row
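In NumPy the two profiles reduce to row and column sums over the binarized image (a sketch, assuming nonzero pixels are black text pixels):

import numpy as np

def projections(binary):
    """Horizontal and vertical projection profiles of a binarized image."""
    horizontal = (binary != 0).sum(axis=1)  # black pixels per row
    vertical = (binary != 0).sum(axis=0)    # black pixels per column
    return horizontal, vertical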

Computing Horizontal & Vertical Projections

● The image of every character from a given alphabet is rotated and horizontal & vertical projections are computed for every rotation

● In the example on the right the image of A is rotated by 90 degrees four times & two projections are computed for each angle

● Such projections are filed away and used at run time for text skew angle detection

[Figure: the image of A at 0, 90, 180, and 270 degrees with horizontal and vertical projections HP0/VP0, HP90/VP90, HP180/VP180, HP270/VP270 computed for each rotation]

Text Skew Angle Determination with Horizontal & Vertical Projections

● At run time, text is segmented into characters (this is not a trivial task, and is error-prone)

● A horizontal & vertical projection is computed for each character (or for selected characters)

● The computed projections are matched against the pre-computed horizontal & vertical projections

● The closest match determines the possible angle

Two Projection-Based Algorithms

● There are two seminal text skew angle detection algorithms: Postl's [7] & Hull's [8]

● Postl's algorithm calculates horizontal projection profiles for every character in the alphabet in small increments (e.g., in increments of 5 degrees) and uses the sum of squared differences for projection matching to determine the skew angle

● Hull's algorithm also uses projection profiles but rotates only black pixels instead of entire images
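A sketch of Postl-style profile matching with the sum of squared differences; it assumes the stored profiles have the same length as the observed one, which a real implementation would guarantee by resampling:

import numpy as np

def best_angle(profile, precomputed):
    """Return the angle whose precomputed profile is closest to the
    observed profile under the sum of squared differences.
    precomputed is a dict mapping angle -> profile."""
    ssd = lambda p, q: float(((np.asarray(p) - np.asarray(q)) ** 2).sum())
    return min(precomputed, key=lambda ang: ssd(profile, precomputed[ang]))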

Computing Horizontal, Vertical, & Diagonal Wavelets

1) 2D HWT is applied to the image a given number of times (twice in this case)

2) Each application of 2D HWT yields an array of four n x n matrices [AVRG, HC, VC, DC] (e.g., if the input image is 1024 x 1024, two iterations return 256 x 256 matrices)

3) AVRG is the matrix of averages; HC is the matrix of horizontal wavelets; VC is the matrix of vertical wavelets; DC is the matrix of diagonal wavelets; their entries are indexed by row and column as HC(r, c), VC(r, c), DC(r, c)

Binarizing Horizontal, Vertical, & Diagonal Wavelets

HC, VC, DC matrices are binarized to eliminate irrelevant wavelets

Combining Horizontal, Vertical, & Diagonal Wavelets

The binarized HC, VC, and DC matrices are combined into one matrix using the following formula:

C(r, c) = HC(r, c) ∨ VC(r, c) ∨ DC(r, c),

where C(r, c) is the result matrix and HC(r, c), VC(r, c), DC(r, c) are the binarized horizontal, vertical, and diagonal wavelet matrices, respectively.
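Reading the formula as a logical OR of the binarized matrices, a sketch (the binarization threshold is a tunable assumption):

import numpy as np

def combine_wavelets(hc, vc, dc, threshold):
    """Binarize the HC, VC, DC matrices and OR them into one matrix C:
    C(r, c) = 1 iff at least one change at (r, c) exceeds the threshold."""
    binarize = lambda m: np.abs(m) > threshold
    return (binarize(hc) | binarize(vc) | binarize(dc)).astype(np.uint8)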

Computing Text Skew Angle

The Convex Hull algorithm is applied to the combined matrix C(r, c) to find the smallest rectangle around the text area; the rectangle is used to determine the text skew angle
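One way to realize this step is OpenCV's convexHull and minAreaRect; this is a sketch of that reading, not necessarily the author's implementation (note that OpenCV's angle convention varies across versions):

import cv2

def text_skew_angle(combined):
    """Estimate the skew angle by fitting the minimum-area rectangle
    around the nonzero cells of the combined matrix C."""
    pts = cv2.findNonZero(combined)   # coordinates of the 1-cells
    if pts is None:
        return 0.0
    hull = cv2.convexHull(pts)        # convex hull of the text area
    rect = cv2.minAreaRect(hull)      # ((cx, cy), (w, h), angle)
    return rect[2]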

Text Skew Angle Detection: Algorithmic Chain

Take Image → Apply 2D HWT → Binarize → Combine → Find Rectangle → Text Skew Angle

Text Skew Angle Computation: Pseudocode

Lines 1 – 4: Overall algorithm

Lines 5 – 14: Binarize Wavelets

Lines 16 – 26: Combine wavelet matrices; threshold combined pixel values; bind thresholded values with a rectangle; compute text skew angle
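A hypothetical driver tying the sketches together (hwt2_iteration comes from the blur section; combine_wavelets and text_skew_angle from the previous slides; the threshold is illustrative):

def detect_text_skew(gray, iterations=2, threshold=1.0):
    """Overall algorithm: 2D HWT -> binarize & combine -> rectangle -> angle."""
    avg = gray.astype(float)
    for _ in range(iterations):                         # apply 2D HWT
        avg, hc, vc, dc = hwt2_iteration(avg)
    combined = combine_wavelets(hc, vc, dc, threshold)  # lines 5-20
    return text_skew_angle(combined)                    # lines 21-26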

Image Sample

● 607 random RGB images* were selected from a set of 607 smartphone video recordings of common grocery products

● Two human volunteers were recruited to determine the text skew angle with an open source protractor program**

● The text skew angles determined are the ground truth

*Images are available at https://usu.app.box.com/s/9zk660t5h1g0dmw4pjj1x1yp6r7zovp3

**Open source protractor program http://sourceforge.net/projects/osprotractor/

Ground Truth with Open Source Protractor

Image on the right shows a human evaluator using the open source protractor program to estimate the text skew angle

Three Evaluated Algorithms

● Algorithm 1: T. Zaman, V. Kulyukin. "Text Skew Angle Detection in Vision-Based Scanning of Nutrition Labels." To appear in Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA

● Algorithm 2: Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986

● Algorithm 3: Hull, J.J. "Document image skew detection: survey and annotated bibliography," In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64


Error Dispersion Plots

[Error dispersion plots for Algorithm 1, Algorithm 2, and Algorithm 3. X-axis: image number (0 to 606); Y-axis: text skew angle error relative to the ground truth (0 = ground truth)]

Performance Comparison Table

Table I. Processing time in milliseconds

               Algorithm 1   Algorithm 2   Algorithm 3
Time (ms)      341.37        6253.02       5908.18

Table II. Median error in text skew angle estimation

               Algorithm 1   Algorithm 2   Algorithm 3
Median Error   4.62          68.85         20.92

Observations

● Algorithm 1 has an average processing time of 341.37 ms, which is significantly faster than Algorithm 2 (Postl) and Algorithm 3 (Hull), because Algorithm 1 does not rotate images; for the sake of objectivity, it should be noted that Algorithms 2 & 3 were originally designed for document scanners with smaller text skew angles and ideal lighting conditions

● Algorithm 1 has a lower median text skew angle error than either Algorithm 2 or Algorithm 3

● The error dispersion plots show that Algorithm 1 clusters more closely around the 0 line (the ground truth) than either Algorithm 2 or Algorithm 3

Texture Recognition

with

2D Haar Wavelet Transform

Hypothesis

● Different materials have different textures

● Different textures may have different 2D wavelet coefficients

● If the 2D wavelet coefficients for specific textures are known, they can be used at run time to find the corresponding materials in images (a matching sketch follows)
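One simple reading of the hypothesis is nearest-neighbor matching of wavelet signatures; the slides do not specify a matching rule, so the Euclidean distance used here is an assumption:

import numpy as np

def nearest_texture(signature, known):
    """Match a (HC, VC, DC) signature against stored texture signatures;
    known is a dict mapping texture name -> (HC, VC, DC) matrices."""
    flat = lambda s: np.concatenate([np.asarray(m).ravel() for m in s])
    v = flat(signature)
    return min(known, key=lambda name: np.linalg.norm(v - flat(known[name])))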

Recognition of Food Items in Nutrition Images

Problem: Is it possible to recognize all food items and their approximate regions in images like the one above?

Recognition of Food Items in Nutrition Images

Possible Ideal Solution

Computing Wavelet Coefficients for Different Textures

● Let us manually crop 256 x 256 regions of three different textures (apple sauce, rice, and meat) and compute their Haar wavelet coefficients

● Let us also compute them at two different levels: 128 x 128 wavelets (7 iterations) and 64 x 64 wavelets (6 iterations)

● The first level (128 x 128) is coarser than the second (64 x 64); a cropping sketch follows
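A cropping-and-transform sketch reusing hwt2_iteration from the blur section; coordinates are caller-supplied, and the iteration counts follow the slide's setup:

def texture_signature(gray, top, left, size=256, iterations=7):
    """Crop a size x size patch and run the 2D HWT the given number of
    times, returning the final HC, VC, DC matrices: for a 256 x 256
    patch, 7 iterations yield the 2 x 2 matrices of the 128 x 128
    wavelets below, 6 iterations the 4 x 4 matrices of the 64 x 64
    wavelets."""
    avg = gray[top:top + size, left:left + size].astype(float)
    for _ in range(iterations):
        avg, hc, vc, dc = hwt2_iteration(avg)
    return hc, vc, dc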

256 x 256 Images of Apple Sauce, Rice, & Meat

Apple Sauce Rice Meat

128 x 128 Apple Sauce Wavelets

        Apple Sauce 1      Apple Sauce 2      Apple Sauce 3
HC      -0.55  1.45        -0.06 -0.79         1.00 -0.16
         0.27  1.72        -0.18  2.36        -1.65  0.67
VC      -1.20  1.02        -0.05 -0.36         2.25 -1.39
        -1.01 -2.05        -0.24 -1.93        -0.77  2.37
DC       0.84 -0.39         0.67 -1.52         2.26 -0.11
         0.29  1.89        -0.92  0.22        -0.81 -0.88

128 x 128 Rice Wavelets

        Rice 1             Rice 2             Rice 3
HC      -3.48   0.11        7.50 -7.44        -2.94  -3.40
        -0.77  13.50        5.95  0.21         2.54  -4.09
VC      -5.93  -6.56       -4.31 -5.53         3.39 -10.65
         0.49  -6.64       -4.09  8.60        -3.83  -8.50
DC     -10.63  -6.51        3.05  3.44         4.23  -2.93
        -3.35   3.40       -0.06 -8.03       -14.00   3.21

128 x 128 Meat Wavelets

        Meat 1             Meat 2             Meat 3
HC      -0.71  2.95         0.21 -0.76         1.76  1.40
         0.38 -0.69        -0.05  3.00        -1.92  1.76
VC       0.80  2.18         3.54  1.16         1.39  2.50
        -2.31 -0.78         0.33 -0.62         3.31  1.75
DC      -0.88  1.93         0.71 -0.45         0.03  0.98
        -2.06  0.87         0.76 -2.69        -0.35  2.14

128 x 128 Wavelets Side by Side: 1

        Apple Sauce 1      Rice 1             Meat 1
HC      -0.55  1.45        -3.48   0.11       -0.71  2.95
         0.27  1.72        -0.77  13.50        0.38 -0.69
VC      -1.20  1.02        -5.93  -6.56        0.80  2.18
        -1.01 -2.05         0.49  -6.64       -2.31 -0.78
DC       0.84 -0.39       -10.63  -6.51       -0.88  1.93
         0.29  1.89        -3.35   3.40       -2.06  0.87

128 x 128 Wavelets Side by Side: 2

        Apple Sauce 2      Rice 2             Meat 2
HC      -0.06 -0.79         7.50 -7.44         0.21 -0.76
        -0.18  2.36         5.95  0.21        -0.05  3.00
VC      -0.05 -0.36        -4.31 -5.53         3.54  1.16
        -0.24 -1.93        -4.09  8.60         0.33 -0.62
DC       0.67 -1.52         3.05  3.44         0.71 -0.45
        -0.92  0.22        -0.06 -8.03         0.76 -2.69

128 x 128 Wavelets Side by Side: 3

        Apple Sauce 3      Rice 3             Meat 3
HC       1.00 -0.16        -2.94  -3.40        1.76  1.40
        -1.65  0.67         2.54  -4.09       -1.92  1.76
VC       2.25 -1.39         3.39 -10.65        1.39  2.50
        -0.77  2.37        -3.83  -8.50        3.31  1.75
DC       2.26 -0.11         4.23  -2.93        0.03  0.98
        -0.81 -0.88       -14.00   3.21       -0.35  2.14

64 x 64 Wavelets Side by Side

Apple Sauce 1
HC:   0.95 -1.38 -1.04  1.53
     -0.78  2.29  1.50  5.18
     -0.36  0.41  0.00  6.53
      0.00 -1.22  4.05 -0.41
VC:   2.87 -1.06 -0.91 -0.40
     -1.50 -1.77  0.72  3.01
     -0.20 -0.41 -0.40  0.69
      0.15 -0.40  0.21  2.75
DC:  -0.35  0.91 -0.59 -0.71
     -2.16 -2.25  1.16 -2.16
      0.32  0.79 -1.39  2.84
     -0.48  0.74 -1.87 -0.55

Rice 1
HC:  18.28 -6.43 11.50 -4.14
     -8.31  3.19  1.29 -4.95
     -7.60  3.21 -3.92  2.91
      8.75 -5.58  2.86 -1.78
VC:   0.88 -2.58 -1.20  3.80
     -2.73  4.24  0.10  1.43
     -0.32  2.81  2.97  0.30
     -2.90  4.28 -0.52 -4.83
DC:   0.63  1.77 13.11  0.58
     -2.57 -8.22 -5.58  1.37
     13.38 -0.31  5.07  5.26
     -3.30 10.13  7.08 -1.18

Meat 1
HC: -11.11  1.88  3.52  1.85
     -1.57 -8.64 -9.23  6.73
     -1.18 -4.43 -6.10 -4.09
     -8.42  0.20 -2.50  9.80
VC:   2.94  2.06  0.22  1.42
      3.57 -2.29 -1.19 -5.15
      0.40  3.90  3.40 -0.80
     -0.98  3.17 -1.37  1.73
DC:   0.63 -1.03 -1.77 -0.20
     -1.00 -1.22 -0.96 -0.03
      1.11  0.65  3.77  0.08
     -7.55  3.96  0.77 -0.23

Book References

● Y. Nievergelt. "Wavelets Made Easy." Birkhäuser, 1999.

● C. S. Burrus, R. A. Gopinath, H. Guo. "Introduction to Wavelets and Wavelet Transforms: A Primer." Prentice Hall, 1998.

● G. P. Tolstov. "Fourier Series." Dover Publications, Inc., 1962.

Paper & Code References

[1] Mallat, S. and Hwang, W. L. "Singularity detection and processing with wavelets." IEEE Transactions on Information Theory, vol. 38, no. 2, March 1992, pp. 617-643.

[2] Tong, H., Li, M., Zhang, H., and Zhang, C. "Blur detection for digital images using wavelet transform." In Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, pp. 27-30, June 2004. doi: 10.1109/ICME.2004.1394114.

[3] Crete, F., Dolmiere, T., Ladret, P., and Nicolas, M. "The blur effect: perception and estimation with a new no-reference perceptual blur metric." In Proceedings of SPIE 6492, Human Vision and Electronic Imaging XII, 64920I, San Jose, CA, USA, January 28, 2007. doi: 10.1117/12.702790.

[4] Kulyukin, V. & Andhavarapu, S. "Image Blur Detection with 2D Haar Wavelet Transform and Its Effect on Skewed Barcode Scanning." To appear in Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA.

[5] Python implementation of the blur detection algorithm proposed in reference [2] is available at https://github.com/VKEDCO/PYPL/blob/master/haar_blur

[6] MATLAB implementation of the blur detection algorithm proposed in reference [3] is available at http://www.mathworks.com/matlabcentral/fileexchange/24676-image-blur-metric

[7] Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986.

[8] Hull, J.J. "Document image skew detection: survey and annotated bibliography." In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64.
