www.ijsetr.com
ISSN 2319-8885, Vol.03, Issue.48, December-2014, Pages: 9727-9735
Copyright @ 2014 IJSETR. All rights reserved.

Robust Document Image Binarization Technique for Degraded Document Images

PRASHANT DEVIDAS INGLE 1, MS. PARMINDER KAUR 2, MS. VIJAYA KALE 3
1 PG Scholar, Dept of CSE, JNEC, Aurangabad, India, E-mail: [email protected].
2 Assistant Professor, Dept of CSE, JNEC, Aurangabad, India, E-mail: [email protected].
3 HOD, Dept of CSE, JNEC, Aurangabad, India, E-mail: [email protected].

Abstract: Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra-variation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to the text and background variation caused by different types of document degradation. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated based on the intensities of detected text stroke edge pixels within a local window. The proposed method is simple, robust, and involves minimal parameter tuning. It has been tested on the three public datasets used in the recent Document Image Binarization Contests (DIBCO) 2009 and 2011 and Handwritten-DIBCO (H-DIBCO) 2010, and achieves accuracies of 93.5%, 87.8%, and 92.03%, respectively, which are significantly higher than or close to those of the best performing methods reported in the three contests. Experiments on the Bickley diary dataset, which consists of several challenging low-quality document images, also show the superior performance of our proposed method compared with other techniques.

Keywords: Adaptive Image Contrast, Document Analysis, Document Image Processing, Degraded Document Image Binarization, Pixel Classification.

I. INTRODUCTION
Document image binarization is performed in the preprocessing stage of document analysis and aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for ensuing document image processing tasks such as optical character recognition (OCR). Though document image binarization has been studied for many years, the thresholding of degraded document images remains an unsolved problem due to the high inter/intra-variation between the text stroke and the document background across different document images. As illustrated in Fig. 1, the handwritten text within degraded documents often shows considerable variation in stroke width, stroke brightness, stroke connection, and document background. In addition, historical documents are often degraded by bleed-through, as illustrated in Fig. 1(a) and (c), where the ink of the other side seeps through to the front. Historical documents are also often degraded by different types of imaging artifacts, as illustrated in Fig. 1(e). These different types of document degradation tend to induce document thresholding errors and make degraded document image binarization a big challenge to most state-of-the-art techniques.

Fig. 1. Five degraded document image examples: (a)-(d) are taken from the DIBCO series datasets and (e) is taken from the Bickley diary dataset.

The recent Document Image Binarization Contest (DIBCO) [2], [3], held under the framework of the International Conference on Document Analysis and Recognition (ICDAR) 2009 & 2011, and the Handwritten Document Image Binarization Contest (H-DIBCO) [4], held under the framework of the International Conference on Frontiers in Handwriting Recognition, show recent efforts on this issue. We participated in DIBCO 2009, where our background estimation method [5] performed the best among 43 algorithms submitted by 35 international research groups. We also participated in H-DIBCO 2010, where our local maximum-minimum method [6] was one of the top two winners among 17 submitted algorithms. In the latest DIBCO 2011, our proposed method achieved the second best results among 18 submitted algorithms. This paper presents a document binarization technique that extends our previous local maximum-minimum method [6] and the method used in the latest DIBCO 2011. The proposed method is simple, robust, and capable of handling different types of degraded document images with minimal parameter tuning. It makes use of the adaptive image contrast that combines the local image contrast and the local image gradient adaptively, and is therefore tolerant to the text and background variation caused by different types of document degradation. In particular, the proposed technique addresses the over-normalization problem of the local maximum-minimum algorithm [6]. At the same time, the parameters used in the algorithm can be estimated adaptively. The rest of this paper is organized as follows. Section II describes sub-block classification and thresholding. Our proposed method is described in Section III. Experimental results are then reported in Section IV to demonstrate the superior performance of our framework. Finally, conclusions are presented in Section V.

II. SUB-BLOCK CLASSIFICATION AND THRESHOLDING
The three feature vectors described below are used to test the local regions and classify them into three types: heavy strokes, faint strokes, or background. Typical examples of these three types of regions can be characterized as follows. The background of a document does not contain any useful content information; a background area typically has low values of edge strength and variance, and a background that is totally noise-free also has a small mean-gradient value. Faint stroke areas contain faint strokes, which are very difficult to distinguish from the background; this kind of area typically has medium values of edge strength and mean gradient but lower variance. Heavy stroke areas have strong edge strength, higher variance, and a larger mean-gradient value. The proposed weighted gradient thresholding method is applied to the different classes of sub-blocks.

A. Faint Handwritten Image Enhancement: Enhancement of faint strokes is necessary for further processing. To avoid enhancing noise, a Wiener filter is first applied. The enhancement itself can be divided into two steps, as sketched in the code below:

1. Use a 3x3 window to enhance the image by finding the maximum and minimum grey values in the window:
   mini = min(elements in the window)
   maxi = max(elements in the window)

2. Compare 'pixel - mini' and 'maxi - pixel', where 'pixel' is the pixel value. If the former is greater, the pixel is closer to the highest grey value than to the lowest value in this window; hence the value of 'pixel' is set to the highest grey value (pixel = maxi). If the former is smaller, the value of 'pixel' is set to the lowest grey value (pixel = mini).
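A minimal Python sketch of this two-step enhancement, assuming `img` is an 8-bit grayscale NumPy array; the Wiener filter and the 3x3 window follow the description above, while the function name and the use of SciPy filters are our own illustrative choices, not the authors'.

```python
import numpy as np
from scipy.signal import wiener
from scipy.ndimage import maximum_filter, minimum_filter

def enhance_faint_strokes(img):
    # Suppress noise first so it is not enhanced along with the strokes.
    denoised = wiener(img.astype(float), mysize=(3, 3))

    # Step 1: local extrema over a 3x3 window.
    maxi = maximum_filter(denoised, size=3)
    mini = minimum_filter(denoised, size=3)

    # Step 2: push each pixel to whichever local extreme it is closer to.
    closer_to_max = (denoised - mini) > (maxi - denoised)
    out = np.where(closer_to_max, maxi, mini)
    return np.clip(out, 0, 255).astype(np.uint8)
```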

Thresholding: A new weighted method based on the mean gradient direction is proposed for thresholding faint strokes. Handwritten English or Western-style scripts normally contain strokes written in several directions.

III. PROPOSED METHOD
This section describes the proposed document image binarization technique, which comprises four stages:

Contrast Image Construction.
Canny Edge Detection.
Local Threshold Segmentation.
Post-Processing Procedure.

First, a contrast image is constructed from the degraded input document image and edges are detected using Canny's edge detection method. The document text is then segmented from the image using a local thresholding method, and finally the quality of the binarized image is improved by applying certain post-processing operations.

A. Contrast Image Construction: The image gradient has been used extensively for edge detection in images with a uniform background. A degraded document, however, may exhibit considerable variation in the input image because of patchy lighting, noise, document aging, bleed-through, etc. In Bernsen's paper, the local contrast is defined as follows:

C(i, j) = I_max(i, j) - I_min(i, j)    (1)

where C(i, j) denotes the contrast of an image pixel (i, j), and I_max(i, j) and I_min(i, j) denote the maximum and minimum intensities within a local neighborhood window of (i, j), respectively. If the local contrast C(i, j) is smaller than a threshold, the pixel is set as background directly; otherwise it is classified into text or background by comparison with the mean of I_max(i, j) and I_min(i, j) in Bernsen's method. Our earlier work [6] proposed a novel document image binarization method using the local image contrast, which is evaluated as follows:

C(i, j) = (I_max(i, j) - I_min(i, j)) / (I_max(i, j) + I_min(i, j) + ε)    (2)

where ε is a positive but infinitesimally small number that is added in case the local maximum is equal to 0. Compared with Bernsen's contrast in Equation 1, the local image contrast in Equation 2 introduces a normalization factor (the denominator): to extract the stroke edges properly, the image gradient is normalized to compensate for the image variation within the document background. The local image contrast thus restrains the background variation, as described in Equation 2. In particular, the numerator (i.e., the difference between the local maximum and the local minimum) captures the local image difference, which is similar to the traditional image gradient. The denominator is a normalization factor that suppresses the image variation within the document background. For image pixels within bright regions, it produces a large normalization factor that neutralizes the numerator, resulting in a relatively low image contrast, as shown in Fig. 2. For image pixels within dark regions, it produces a small denominator, resulting in a relatively high image contrast.
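To make the construction concrete, the sketch below computes the local contrast of Equation 2 together with the adaptive image contrast C_a that Section IV refers to. The combination C_a = α·C + (1 - α)·(I_max - I_min), with α = (Std/128)^γ where Std is the global intensity standard deviation, follows the formulation of Su et al. [1]; since this paper does not reproduce that equation explicitly, its exact form here is an assumption, as are the function and parameter names.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def adaptive_contrast(img, window=3, gamma=1.0, eps=1e-8):
    i_max = maximum_filter(img.astype(float), size=window)
    i_min = minimum_filter(img.astype(float), size=window)

    # Equation 2: normalized local contrast.
    c = (i_max - i_min) / (i_max + i_min + eps)

    # Local image gradient (difference of local extrema), scaled to [0, 1];
    # the 255 scaling is our choice for 8-bit input.
    grad = (i_max - i_min) / 255.0

    # Adaptive combination per [1]: alpha depends on the global intensity spread.
    alpha = (img.std() / 128.0) ** gamma
    return alpha * c + (1.0 - alpha) * grad
```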

Fig.2. Activity Diagram for Proposed Method.

B. Canny's Edge Detection: Through the contrast image construction, the stroke edge pixels of the document text are detected. The edges are detected using the Canny edge detection algorithm: the image is first smoothed to remove noise, and the algorithm then searches for locations of high gradient magnitude and marks the corresponding edges, keeping only local maxima of the gradient. As the local image contrast and the local image gradient are evaluated by the difference between the maximum and minimum intensity in a local window, the pixels on both sides of a text stroke are selected as high-contrast pixels. The binary map can be further improved through combination with the edges detected by Canny's edge detector; through this combination the text stroke edges are identified in the input image, as shown in Fig. 3 below.
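A minimal sketch of this combination, assuming OpenCV is available: the contrast map is binarized with Otsu's global threshold and intersected with the Canny edge map, so that only high-contrast pixels that also lie on detected edges survive as text stroke edge pixels. The Canny thresholds (40, 100) are illustrative values, not taken from the paper.

```python
import cv2
import numpy as np

def stroke_edge_map(img, contrast_map):
    # Scale the (float) contrast map to 8-bit for Otsu thresholding.
    contrast_u8 = np.uint8(255 * contrast_map / max(contrast_map.max(), 1e-8))
    _, high_contrast = cv2.threshold(contrast_u8, 0, 255,
                                     cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    canny = cv2.Canny(img, 40, 100)
    # Keep only pixels flagged by both detectors.
    return cv2.bitwise_and(high_contrast, canny)
```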

Fig.3. Identified Text Image.

C. Local Threshold Segmentation: Once the text stroke edges are detected, the document text can be extracted based on the observation that the document text is surrounded by text stroke edges and also has a lower intensity level than the detected stroke edge pixels. The document text is extracted based on the detected text stroke edges as follows:

R(x, y) = 1, if N_e ≥ N_min and I(x, y) ≤ E_mean;
R(x, y) = 0, otherwise    (3)

where I refers to the normalized document image under study and N_e refers to the number of detected stroke edge pixels within a local neighborhood window. N_min denotes a threshold that specifies the minimum number of detected stroke edge pixels (within the neighborhood window) required to consider the image pixel under study as a possible text pixel. E_mean refers to the mean image intensity of the detected stroke edge pixels within the local neighborhood window, which can be determined as follows:

Fig. 4. The histogram built based on the local image variation.

E_mean = Σ_{(x,y) ∈ W} I(x, y) · E(x, y) / N_e    (4)

where E refers to the detected stroke edge image shown in Fig. 4, with E(x, y) = 1 for detected stroke edge pixels and 0 otherwise, and W denotes the local neighborhood window. As Equation 3 shows, an image pixel is classified as a text pixel if N_e is larger than N_min and I(x, y) is smaller than E_mean; otherwise, it is classified as a background pixel.

As described earlier, the performance of the proposed document image binarization using the text stroke edges depends on two parameters, namely the size of the neighborhood window W and the minimum number of text stroke edge pixels within the neighborhood window, N_min. Both parameters are closely related to the width of the text strokes within the document image under study. In particular, the size of the neighborhood window should not be smaller than the text stroke width; otherwise, the text pixels in the interior of the text strokes will not be extracted properly because there may not be enough text stroke edge pixels within the local neighborhood window. Similarly, the threshold number of text stroke edge pixels N_min (within the local neighborhood window) should be somewhat larger than the window size (if the window size is larger than the text stroke width) due to the double-edge structure of the text strokes. The text stroke width therefore needs to be estimated before the document image thresholding; we estimate it based on the detected text stroke edges.
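The following sketch applies Equations 3 and 4 with box filters, assuming `img` is the normalized grayscale image (floats in [0, 1]) and `edge` is the binary stroke edge map (values 0/1); the window size and N_min would be derived from the estimated stroke width EW as discussed above, and all names here are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def binarize(img, edge, window, n_min):
    area = window * window
    # Per-pixel count of edge pixels in the WxW neighborhood (Ne), and the
    # sum of intensities over those edge pixels (box filter gives the mean,
    # so multiply by the window area to recover the sum).
    n_e = np.round(uniform_filter(edge.astype(float), size=window) * area)
    edge_intensity_sum = uniform_filter(img * edge, size=window) * area
    e_mean = edge_intensity_sum / np.maximum(n_e, 1)  # Equation 4

    # Equation 3: text pixel iff enough edge pixels nearby and the pixel is
    # darker than the mean intensity of those edge pixels.
    return (n_e >= n_min) & (img <= e_mean)
```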

D. Post-Processing Procedure
Document image thresholding often introduces a certain amount of error that can be corrected through a series of post-processing operations. We correct the document thresholding error through three post-processing operations based on the estimated document background surface and some document domain knowledge. In particular, we first remove text components (labeled through connected component analysis) of very small size, which often result from image noise such as salt-and-pepper noise. Based on the observation that real text components are usually composed of many more than 3 pixels, we simply remove text components that contain no more than 3 pixels in our system. Next, we remove falsely detected text components that have a relatively large size. These components are identified based on the observation that they are usually much brighter than the surrounding real text strokes; we capture this observation by the image difference between the labeled text component and the corresponding patch within the estimated document background surface.
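A sketch of the first post-processing rule, assuming OpenCV's connected component analysis: components containing no more than 3 pixels are treated as noise and removed. The other two rules (bright oversized components compared against the background surface) are omitted for brevity, and the function name is ours.

```python
import cv2
import numpy as np

def remove_tiny_components(binary_text, min_pixels=4):
    # Components are labeled on the text (foreground) pixels.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(
        binary_text.astype(np.uint8), connectivity=8)
    keep = np.zeros_like(binary_text, dtype=bool)
    for i in range(1, n):  # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_pixels:
            keep |= labels == i
    return keep
```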

IV. EXPERIMENTS AND DISCUSSION
A few experiments are designed to demonstrate the effectiveness and robustness of our proposed method. We first analyze the performance of the proposed technique on public datasets for parameter selection. The proposed technique is then tested and compared with state-of-the-art methods on three well-known competition datasets: the DIBCO 2009 dataset [2], the H-DIBCO 2010 dataset [4], and the DIBCO 2011 dataset [3]. Finally, the proposed technique is further evaluated on the very challenging Bickley diary dataset. The binarization performance is evaluated using F-Measure, pseudo F-Measure, Peak Signal to Noise Ratio (PSNR), Negative Rate Metric (NRM), Misclassification Penalty Metric (MPM), Distance Reciprocal Distortion (DRD), and the rank score, as adopted from DIBCO 2009, H-DIBCO 2010, and DIBCO 2011 [2]-[4]. Due to the lack of ground truth data in some datasets, not all of the metrics are applied to every image.
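For reference, two of these metrics are easy to state in code. The sketch below computes F-Measure and PSNR for binary result/ground-truth pairs under their standard DIBCO definitions; with 0/1-valued images, the MSE inside the PSNR reduces to the fraction of mismatched pixels. This is our illustration, not the contest evaluation tool.

```python
import numpy as np

def f_measure(result, ground_truth):
    # result and ground_truth are boolean arrays; True marks text pixels.
    tp = np.sum(result & ground_truth)
    precision = tp / max(np.sum(result), 1)
    recall = tp / max(np.sum(ground_truth), 1)
    return 2 * precision * recall / max(precision + recall, 1e-8)

def psnr(result, ground_truth):
    # For binary images, MSE is the fraction of mismatched pixels.
    mse = np.mean(result != ground_truth)
    return 10 * np.log10(1.0 / max(mse, 1e-12))
```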

A. Parameter Selection
In the first experiment, we apply different γ values to obtain different power functions and test their performance on the DIBCO 2009 and H-DIBCO 2010 datasets. γ increases from 2^-10 to 2^10 exponentially and monotonically, as shown in Fig. 5(a). In particular, α is close to 1 when γ is small, and the local image contrast C then dominates the adaptive image contrast C_a. On the other hand, C_a is mainly influenced by the local image gradient when γ is large. At the same time, the variation of α across different document images increases when γ is close to 1. Under such circumstances, the power function becomes more sensitive to the global image intensity variation, and appropriate weights can be assigned to images with different characteristics.

Fig. 5. (a) Means and variations of the α values of the twenty images in the DIBCO and H-DIBCO datasets under different γ values. (b) F-measure performance on the DIBCO 2009 & H-DIBCO 2010 datasets using different γ power functions.

Fig. 6. F-measure performance on the DIBCO 2009, H-DIBCO 2010, and DIBCO 2011 datasets using different local window sizes W (EW denotes the estimated text stroke width).


As shown in Fig. 5(b), our proposed method produces better results on the DIBCO dataset when γ is much smaller than 1 and the local image contrast dominates. On the other hand, the F-Measure performance on the H-DIBCO dataset improves significantly when γ increases to 1, since the proposed method can then assign a more suitable α to different images. Parameter γ should therefore be set around 1, where the adaptability of the proposed technique is maximized and better, more robust binarization results can be derived from different kinds of degraded document images. Another parameter, the local window size W, is tested in the second experiment on the DIBCO 2009, H-DIBCO 2010, and DIBCO 2011 datasets. W is closely related to the stroke width EW. Fig. 6 shows the thresholding results when W varies from EW to 4EW. Generally, a larger local window size helps to reduce the classification error that is often induced by the lack of edge pixels within the local neighborhood window. In addition, the performance of the proposed method becomes stable when the local window size is larger than 2EW, consistently across the three datasets. W can therefore be set around 2EW, because a larger local neighborhood window would increase the computational load significantly.

B. Testing on Competition Datasets
In this experiment, we quantitatively compare our proposed method with other state-of-the-art techniques on the DIBCO 2009, H-DIBCO 2010, and DIBCO 2011 datasets. These methods include Otsu's method (OTSU), Sauvola's method (SAUV), Niblack's method (NIBL), Bernsen's method (BERN), Gatos et al.'s method (GATO), and our previous methods (LMM [6], BE [5]). The three datasets are composed of similar series of document images that suffer from several common document degradations such as smear, smudge, bleed-through, and low contrast. The DIBCO 2009 dataset contains ten testing images consisting of five degraded handwritten documents and five degraded printed documents. The H-DIBCO 2010 dataset consists of ten degraded handwritten documents. The DIBCO 2011 dataset contains eight degraded handwritten documents and eight degraded printed documents. In total, we have 36 degraded document images with ground truth. The evaluation results are shown in Tables I-III. As Tables I and II show, our proposed method achieves the highest scores in F-Measure, pseudo F-Measure, PSNR, and NRM, and its MPM is only slightly lower than LMM [6] on the DIBCO dataset. This means that our proposed method produces a higher overall precision and preserves the text strokes better.

In addition, our proposed method also performs better than the 43 document thresholding algorithms submitted to DIBCO 2009 [2] on the DIBCO 2009 dataset and the 17 algorithms submitted to H-DIBCO 2010 [4] on the H-DIBCO 2010 dataset. Figs. 7-10 further show the binarization results for the four example document images in Fig. 1 using the eight document binarization methods. We also use the rank score to evaluate the methods, which accumulates the ranking of each method among all the tested methods over all testing images and the four evaluation metrics, as described in DIBCO 2011 [3].

TABLE I: Evaluation Results on the DIBCO 2009 Dataset

TABLE II: Evaluation Results on the H-DIBCO 2010 Dataset

TABLE III: Evaluation Results on the DIBCO 2011 Dataset

Based on this ranking score scheme, the performance of our proposed method can be compared directly with the other methods. It is clear that our proposed method extracts the text better than the other comparison methods. Besides the comparison methods mentioned above, our proposed method is also compared with the top three algorithms on the DIBCO 2011 dataset, namely Lelore et al.'s method (LELO), the method submitted by our team (SNUS), and N. Howe's method (HOWE). The quantitative results are shown in Table III. As Table III shows, our proposed technique performs the best in terms of DRD and MPM, which means that it maintains good text stroke contours and provides the best visual quality. In addition, our proposed method also performs well when evaluated at the pixel level: the F-Measure and PSNR of our proposed method are very close to the highest scores, as also shown in Table III. Although it does not reach the lowest ranking score, our proposed technique produces good results on all the testing images, which is reflected in the high F-measure score.

Fig. 7. Binarization results for the sample document image in Fig. 1(a) produced by different methods. (a) OTSU. (b) SAUV. (c) NIBL. (d) BERN. (e) GATO. (f) LMM [6]. (g) BE [5]. (h) Proposed.

Figs. 11-13 further show three example images (PR06, PR07, and HW06) from the DIBCO 2011 dataset and their corresponding binary results produced by different methods. As shown in Fig. 11, the BERN, NIBL, and LELO methods fail to produce reasonable results. In addition, most of the methods, including the HOWE method, introduce some background noise into the final results. LMM and SNUS instead remove too many character strokes. On the other hand, our proposed method produces a binary result with better visual quality that contains most of the text information. Figs. 12 and 13 are more challenging; some of the methods fail, including the LELO, BERN, NIBL, and SNUS methods. The HOWE method and our proposed method produce quite reasonable results with a little noise remaining, compared with the other methods. However, the binary result of our proposed method in Fig. 13 is slightly over-binarized due to the high text stroke variation of the input image. We will improve this in our future study.

Fig. 8. Binarization results for the sample document image in Fig. 1(b) produced by different methods. (a) OTSU. (b) SAUV. (c) NIBL. (d) BERN. (e) GATO. (f) LMM [6]. (g) BE [5]. (h) Proposed.


Fig. 9. Binarization results for the sample document image in Fig. 1(c) produced by different methods. (a) OTSU. (b) SAUV. (c) NIBL. (d) BERN. (e) GATO. (f) LMM [6]. (g) BE [5]. (h) Proposed.

In addition, we test the computation time of our proposed method and other state-of-the-art techniques implemented in Matlab. Experiments on the DIBCO test dataset show that the average execution time of the proposed method is around 21 seconds, while the execution times of the OTSU, BERN, NIBL, SAUV, GATO, BE, and LMM methods are around 0.5, 18, 27, 28, 100, 24, and 20 seconds, respectively. The proposed technique is therefore comparable in speed to the state-of-the-art adaptive document thresholding methods.

Fig. 10. Binarization results for the sample document image in Fig. 1(d) produced by different methods. (a) OTSU. (b) SAUV. (c) NIBL. (d) BERN. (e) GATO. (f) LMM [6]. (g) BE [5]. (h) Proposed.

C. Testing on Bickley Diary Dataset
In the last experiment, we evaluate our method on the Bickley diary dataset to show its robustness and superior performance. The images from the Bickley diary dataset are taken from a photocopy of a diary written about 100 years ago. These images suffer from different kinds of degradation, such as water stains, ink bleed-through, and significant foreground text intensity variation, and are more challenging than the previous DIBCO and H-DIBCO datasets.

Fig. 11. Binarization results for the sample document image (PR06) in the DIBCO 2011 dataset produced by different methods. (a) Input image. (b) OTSU. (c) SAUV. (d) NIBL. (e) BERN. (f) GATO. (g) LMM [6]. (h) BE [5]. (i) LELO. (j) SNUS. (k) HOWE. (l) Proposed.

Fig. 12. Binarization results for the sample document image (PR07) in the DIBCO 2011 dataset produced by different methods. (a) Input image. (b) OTSU. (c) SAUV. (d) NIBL. (e) BERN. (f) GATO. (g) LMM [6]. (h) BE [5]. (i) LELO. (j) SNUS. (k) HOWE. (l) Proposed.


Fig. 13. Binarization results for the sample document image (HW06) in the DIBCO 2011 dataset produced by different methods. (a) Input image. (b) OTSU. (c) SAUV. (d) NIBL. (e) BERN. (f) GATO. (g) LMM [6]. (h) BE [5]. (i) LELO. (j) SNUS. (k) HOWE. (l) Proposed.

Fig. 14. Binary results for the badly degraded document image from the Bickley diary dataset shown in Fig. 1(e), produced by different binarization methods, with the ground truth image. (a) OTSU. (b) SAUV. (c) NIBL. (d) BERN. (e) GATO. (f) LMM [6]. (g) BE [5]. (h) HOWE. (i) Proposed. (j) Ground truth image.

We use seven ground truth images, annotated manually using PixLabeler, to evaluate our proposed method against the other methods. Our proposed method achieves an average accuracy of 78.54% in terms of F-measure, which is at least 10% higher than the other seven methods. Detailed evaluation results are given in Table IV. Fig. 14 shows one example image from the Bickley diary dataset and its corresponding binarization results generated by different methods. It is clear that the proposed algorithm performs better than the other methods, preserving most of the textual information and producing the least noise.

TABLE IV: Evaluation Results on the Bickley Diary Dataset

D. Discussion
As described in the previous sections, the proposed method involves several parameters, most of which can be estimated automatically based on the statistics of the input document image. This makes our proposed technique more stable and easier to use for document images with different kinds of degradation. The superior performance of our proposed method can be explained by several factors. First, the proposed method combines the local image contrast and the local image gradient, which helps to suppress the background variation and avoid the over-normalization of document images with less variation. Second, the combination with the edge map helps to produce a precise text stroke edge map. Third, the proposed method makes use of the text stroke edges, which help to extract the foreground text from the document background accurately. However, the performance on the Bickley diary dataset and on some images of the DIBCO contests still needs to be improved; we will explore this in future work.

V. CONCLUSION
This paper presents an adaptive image contrast based document image binarization technique that is tolerant to different types of document degradation such as uneven illumination and document smear. The proposed technique is simple and robust and involves only a few parameters. Moreover, it works for different kinds of degraded document images. The proposed technique makes use of the local image contrast, which is evaluated based on the local maximum and minimum. The proposed method has been tested on various datasets. Experiments show that the proposed method outperforms most reported document binarization methods in terms of F-measure, pseudo F-measure, PSNR, NRM, MPM, and DRD.

VI. REFERENCES
[1] B. Su, S. Lu, and C. L. Tan, "Robust document image binarization technique for degraded document images," IEEE Trans. Image Process., vol. 22, no. 4, Apr. 2013.
[2] B. Gatos, K. Ntirogiannis, and I. Pratikakis, "ICDAR 2009 document image binarization contest (DIBCO 2009)," in Proc. Int. Conf. Document Anal. Recognit., Jul. 2009, pp. 1375-1382.
[3] I. Pratikakis, B. Gatos, and K. Ntirogiannis, "ICDAR 2011 document image binarization contest (DIBCO 2011)," in Proc. Int. Conf. Document Anal. Recognit., Sep. 2011, pp. 1506-1510.
[4] I. Pratikakis, B. Gatos, and K. Ntirogiannis, "H-DIBCO 2010 handwritten document image binarization competition," in Proc. Int. Conf. Frontiers Handwrit. Recognit., Nov. 2010, pp. 727-732.
[5] S. Lu, B. Su, and C. L. Tan, "Document image binarization using background estimation and stroke edges," Int. J. Document Anal. Recognit., vol. 13, no. 4, pp. 303-314, Dec. 2010.
[6] B. Su, S. Lu, and C. L. Tan, "Binarization of historical handwritten document images using local maximum and minimum filter," in Proc. Int. Workshop Document Anal. Syst., Jun. 2010, pp. 159-166.
[7] G. Leedham, C. Yan, K. Takru, J. Hadi, N. Tan, and L. Main, "Comparison of some thresholding algorithms for text/background segmentation in difficult document images," in Proc. Int. Conf. Document Anal. Recognit., 2003, pp. 859-864.
[8] M. Sezgin and B. Sankur, "Survey over image thresholding techniques and quantitative performance evaluation," J. Electron. Imaging, vol. 13, no. 1, pp. 146-165, Jan. 2004.
[9] O. D. Trier and A. K. Jain, "Goal-directed evaluation of binarization methods," IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 12, pp. 1191-1201, Dec. 1995.
[10] O. D. Trier and T. Taxt, "Evaluation of binarization methods for document images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 3, pp. 312-315, Mar. 1995.
[11] A. Brink, "Thresholding of digital images using two-dimensional entropies," Pattern Recognit., vol. 25, no. 8, pp. 803-808, 1992.
[12] J. Kittler and J. Illingworth, "On threshold selection using clustering criteria," IEEE Trans. Syst., Man, Cybern., vol. 15, no. 5, pp. 652-655, Sep.-Oct. 1985.