International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 1, January-February (2013), © IAEME

FUZZY RULE BASED CLASSIFICATION AND RECOGNITION OF HANDWRITTEN HINDI CURVE SCRIPT

Gunjan Singh (1), Avinash Pokhriyal (1), Sushma Lehri (2)

(1) Faculty of Management & Computer Application, RBS College, Agra, India
(2) Professor, IET, Dr. B. R. Ambedkar University, Agra, India

ABSTRACT

This paper presents a novel system for the classification and recognition of handwritten Hindi script using a fuzzy rule based approach. Classification and recognition of handwritten Hindi script is a complex task because the characters are cursive in nature and share many similar features. The ability of fuzzy logic to deal with vague and imprecise data makes it appropriate for such problems. In this paper, we focus on two- or three-letter words without modifiers. Prior to recognition, handwritten words are preprocessed and segmented into individual characters. The performance of an optical character recognition system depends strongly on the procedure used to extract quality features from characters. During the classification stage, characters are classified into seven classes using fuzzy if-then rules based on one of the most important components of Hindi characters: the vertical bar. Features such as curves, lines, junction points and endpoints are used at the recognition stage. A 3x3 mask is used to extract features from the character image. The system was tested on a total of 450 words written by 30 different people. Experimental results show that the proposed method performs classification and recognition at a rate of 92.02%. The proposed system has been implemented in the MATLAB 2009 environment.

Keywords: Classification, Fuzzy rule based approach, Handwritten Hindi curve script, Vertical bar, 8-neighbourhood

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)
ISSN 0976-6367 (Print), ISSN 0976-6375 (Online)
Volume 4, Issue 1, January-February (2013), pp. 337-357
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2012): 3.9580 (Calculated by GISI), www.jifactor.com


1. INTRODUCTION

Character recognition is a broad field that studies machine recognition of characters in various application domains. It includes the recognition of machine-printed as well as handwritten characters. Handwritten character recognition covers characters written by a person, captured either online or offline. Recognition of machine-printed characters is comparatively easy because the characters have the same size, font and thickness and a regular shape. Handwritten character recognition is more difficult because writing styles vary, so characters may differ in size, width and orientation. A comparison of both approaches is given in [1]. In this paper, we present a fuzzy rule based classification and recognition system for handwritten Hindi script.

Hindi is one of the official languages of India. It is the world’s third most commonly used language after Chinese and English. Hindi script has 13 vowels (‘SWARS’) and 33 consonants (‘VYANJANS’) in its basic character set. All the characters have two common features: (i) their cursive nature and (ii) the presence of a header line (‘SHIROREKHA’). The header line is a characteristic element of Hindi script. These features differentiate the script from English and other Latin scripts. Words are formed by combining characters, half characters and/or modifiers using the header line. Fig. 1 shows the basic character set, a list of modifiers and a few words.

Figure 1. (a) Basic character set, (b) Swars (vowels) and corresponding matras (modifiers), and (c) a few Hindi language words

Nowadays Hindi is used worldwide in many fields such as banking, medicine, science and technology. Many Hindi words are being included in major dictionaries and other vocabulary-building tools. Owing to this increasing popularity, automatic Hindi recognition systems have become important. Research in this area started in the early 1970s. In 1977, Sethi and Chatterjee [2] presented a constrained recognition system for handwritten Hindi characters. In [3], Sinha and Mahabala presented a syntactic pattern analysis system for the recognition of machine-printed and handwritten characters. The first complete OCR system for machine-printed characters was presented in [4]. Recognition of handwritten Hindi characters is still difficult for a machine because the characters are cursive in nature and share many similarities, such as the presence of the header line, the presence or absence of a vertical bar, and loops and curves. A survey of handwritten character recognition was presented by R. Srihari [5] in 2000. Most of the work has focused on the recognition of individual characters, and little attention has been paid to the recognition of words, sentences or text. Recognition of words is difficult because words must first be segmented into individual characters. In the present paper, we propose a fuzzy rule based classification and recognition system for handwritten Hindi curve script words of two or three letters without modifiers.

Fuzzy logic is an organized method for solving problems that involve vague, ambiguous, imprecise, noisy or missing input data. The concept of fuzzy logic was first introduced by Dr. Lotfi A. Zadeh in 1965 [13]. According to Dr. Zadeh, fuzzy logic is a mathematical tool for dealing with uncertainty. In contrast to crisp logic, which deals with precise values, it is a form of multi-valued logic that provides a way to reason approximately. It therefore gives a machine a better means of simulating human reasoning. This ability to deal with approximation makes it appropriate for problems such as handwritten character recognition.

This paper is organized in five sections. Section 2 reviews work done in the field of handwritten Hindi character recognition. Section 3 presents the proposed system. Section 4 shows the experimental results. Finally, the conclusion is given in the last section.

2. LITERATURE REVIEW

Hanmandlu et al. [6] presented a fuzzy model based recognition system for handwritten Hindi characters with 90.65% accuracy. The system performs coarse classification of the preprocessed character image by dividing it into 3x3 windows and then determining the presence and position of the vertical bar. Features are then extracted by applying the box approach. For recognition, an exponential variant of the fuzzy membership function, constructed using the normalized vector distance, is used. Mukherjee and Rege [7] presented a shape feature and fuzzy logic based offline handwritten character recognition system for the language with an 86.4% recognition rate. Structural features, such as end points and junction points, and an adaptive thinning algorithm are used for segmenting characters into strokes. Crisp and fuzzy features are then extracted for each stroke of the character. Classification is performed in two stages. Pre-classification uses a tree classifier in which characters are classified according to the presence and position of the vertical line. Final classification and recognition use unordered stroke classification based on mean stroke features. In [8], a handwritten Hindi vowel recognition system is presented, in which vowels are segmented into five groups using a projection approach. To extract the core character, the header line is removed by applying horizontal projection and modifiers are removed using vertical projection. Feature extraction is done using invariant moments. Holambe and Thool [9] presented a system for the recognition of printed and handwritten Devanagari script using support vector machine and k-nearest neighbour classification techniques. Singh, Mittal and Ghosh [10] evaluated a support vector machine with a radial basis function kernel and a k-nearest neighbour classifier and achieved 93.8% accuracy. Two methods, the curvelet transform and character geometry, were used for extracting features.


3. PROPOSED SYSTEM

The proposed system works in six stages: preprocessing, segmentation, normalization, classification, feature extraction and recognition. The flow diagram is shown in Fig. 2.

3.1 Preprocessing

During preprocessing, the following operations are performed on the collected data to make it suitable for further processing:

(i) Scanning: Handwritten word samples, collected from various people, are scanned with an optical scanner or camera to obtain a gray scale image.

(ii) Noise reduction: Noise may be introduced into the image during scanning. To reduce it, the following operations are performed:

   (a) Filtering: To reduce noise and false points, a nonlinear spatial filter, the median filter, is applied. A predefined mask is moved over the image and the value of the centre pixel is replaced by the median of the intensity values in the neighbourhood of that pixel [14].

   (b) Dilation: Gaps in characters are filled by dilation with a structuring element [14].

   (c) Erosion: Erosion is applied to the image to eliminate spurious objects.

(iii) Slant correction: Characters in a word may be inclined upwards or downwards, which makes feature extraction difficult. Slant correction is performed using the method of [12].

(iv) Binarization: In this paper, features are extracted from binary images of characters, so the image must be converted to binary form. Global thresholding is applied for binarization. A single threshold value is chosen for the whole image; pixels whose value is greater than the threshold are set to 1 and all others to 0.

(v) Thinning: Finally, the binary image is thinned to single-pixel width by the method presented in [11].
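As an illustration of steps (ii)(a) and (iv), the following is a minimal Python sketch of a 3x3 median filter and of global thresholding on an image stored as a list of rows. It is not the authors' MATLAB implementation: the function names, the choice to leave border pixels unchanged and the threshold value 128 are assumptions made only for this example.

```python
from statistics import median


def median_filter_3x3(gray):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged (an assumption; the paper does not
    say how borders are handled).
    """
    rows, cols = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [gray[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


def binarize(gray, threshold):
    """Global thresholding: 1 where the value exceeds the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


# A single bright noise pixel in an otherwise uniform patch.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
smoothed = median_filter_3x3(noisy)

# Threshold 128 is an arbitrary value chosen for the example.
binary = binarize([[200, 40], [90, 130]], threshold=128)
```

In this example the isolated value 255 is replaced by 10, the median of its neighbourhood, and the thresholded patch keeps only the two values above 128.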

Figure 2. Flow diagram of the proposed system. Start, then Preprocessing (Scanning; Noise Reduction: Filtering, Erosion, Dilation; Slant Correction; Binarization; Thinning), then Segmentation, Normalization, Classification, Feature Extraction and Recognition.


3.2 Segmentation

The thinned image of the word is segmented into individual characters using horizontal and vertical histograms (foreground pixel counts per row and per column), as follows:

(i) First, the horizontal histogram is taken to obtain the upper and lower boundaries of the word.

(ii) Then the vertical histogram is taken to obtain the region of each character.

(iii) The number of regions may exceed the number of characters in the word. This can happen when the vertical bar of a character is not connected to the rest of the character. In that case, the region of the vertical bar, which has the highest peak value, is considered to be part of the character to its left.
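Steps (i) and (ii) can be sketched in Python as below. This is only one possible reading of the procedure, not the authors' code. The names `runs`, `segment_word` and the parameter `gap_level` are hypothetical: `gap_level` is the column count at or below which a column is treated as a gap between characters, so that a one-pixel header line joining the characters does not merge them. The sketch assumes the image contains at least one foreground pixel and omits the merging of a detached vertical bar described in step (iii).

```python
def runs(profile, gap_level=0):
    """Return (start, end) pairs, end exclusive, of consecutive entries above gap_level."""
    spans, start = [], None
    for i, count in enumerate(profile):
        if count > gap_level and start is None:
            start = i
        elif count <= gap_level and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_word(img, gap_level=0):
    """Split a binary word image (list of rows of 0/1) into character images."""
    # Horizontal histogram: foreground pixels per row give the upper and lower boundary.
    row_hist = [sum(row) for row in img]
    row_runs = runs(row_hist)
    top, bottom = row_runs[0][0], row_runs[-1][1]
    band = img[top:bottom]
    # Vertical histogram: foreground pixels per column give the character regions.
    col_hist = [sum(col) for col in zip(*band)]
    return [[row[a:b] for row in band] for a, b in runs(col_hist, gap_level)]


# Toy word: a header line in row 1 with two vertical strokes hanging from it.
word = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
characters = segment_word(word, gap_level=1)
```

With `gap_level=1` the toy word splits into two regions; with the default `gap_level=0` the header line keeps every column non-empty and the whole word is returned as a single region.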

3.3 Normalization

Binary images of individual characters are normalized to a size of 9x9 pixels.
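The paper does not state how the 9x9 normalization is carried out. The following Python sketch shows one simple possibility under stated assumptions: every foreground pixel of the source image is mapped to the proportionally corresponding cell of a 9x9 grid, which keeps one-pixel-wide strokes visible when the source is larger than the target. The function name `normalize` is hypothetical, and the sketch assumes the source image is at least 9 pixels high and wide.

```python
def normalize(img, size=9):
    """Map a binary image (list of rows of 0/1) onto a size x size grid.

    A target cell is set to 1 if at least one foreground pixel of the
    source falls into it. Assumes the source is at least size x size.
    """
    height, width = len(img), len(img[0])
    out = [[0] * size for _ in range(size)]
    for r in range(height):
        for c in range(width):
            if img[r][c]:
                out[r * size // height][c * size // width] = 1
    return out


# An 18x18 image whose only stroke is a vertical line in the last column.
source = [[1 if c == 17 else 0 for c in range(18)] for _ in range(18)]
small = normalize(source)
```

Here the vertical line in the last column of the 18x18 image becomes a vertical line in the last column of the 9x9 result.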

3.4 Classification

All Hindi characters are made up of three main components: the header line (SHIROREKHA), the vertical bar, and curves. In the proposed method, the vertical bar is used to classify characters. TABLE 1 shows the features (presence or absence, length, position and connectedness of the vertical bar, and number of junction points) on which the character classes are based. A character can belong to one class only.

Table 1: Features used for classification

Feature                                      Symbol   Values
Presence of vertical bar                     VB       P (present), NP (not present)
Position of vertical bar                     POS      M (middle), RE (right end)
Length of vertical bar                       LEN      S (20%-30% of the character width W), L (70%-80% of the character width W)
Connectedness of vertical bar to character   CON      C (connected), NC (not connected)
Number of junction points                    JP       1, 2, 3, 4, or 5

A junction point is a foreground pixel with 3 or more foreground pixels in its neighbourhood. The methods for extracting these features are given in algorithms VERTICALBAR_INFO and NUM_JUNCTIONPOINTS. A movable 3x3 mask (Fig. 3), which shows the 8-neighbourhood of the pixel P0, is applied to the image.


P8 P1 P2

P7 P0 P3

P6 P5 P4

Figure 3: 3X3 mask

In these algorithms, the following notations are used:

CP -- current pixel

CL -- current location

COUNT_1 -- counter variable to count the number of pixels. Initial value is set to 0.

COUNT_2 -- counter variable to count the number of junction points. Initial value is set to 0.

ROW -- current row number

COL -- current column number

Algorithm VERTICALBAR_INFO

To determine the information about the vertical bar, do the following:

1. Starting from the last column of the first row, i.e. ROW == 0 and COL == 8, move the mask over the binary image of the character and check:
   (i) IF the pixel is a foreground pixel THEN call it P0.
       IF the number of neighbouring pixels of P0 ≥ 3 AND one of them is P5 THEN do the following:
          (a) Set CP = P0.
          (b) Set N = COL.
          (c) Increase COUNT_1 by 1.
   (ii) ELSE move to the next column to the left and repeat step (i) while COL ≥ 4.

2. To identify the presence of the vertical bar, check the value of COUNT_1.
   IF COUNT_1 == 1
      THEN VB is P
      ELSE VB is NP.

3. To identify the position of the vertical bar, check the value of N.
   IF N ≥ 8
      THEN POS is RE
      ELSE POS is M

4. To identify the length and connectedness of the vertical bar, check POS.
   (i) IF POS == M
       THEN do the following as long as P5 is a foreground pixel:
          (a) Set P0 = P5 (move one pixel down)
          (b) Increase COUNT_1 by 1
   (ii) IF COUNT_1 > 3
        THEN LEN is L
        ELSE LEN is S
   (iii) IF POS == RE
         THEN set CP = P0 and check the following as long as P5 is a foreground pixel:
            IF P6 OR P7 OR P8 exists
               THEN CON is C
               ELSE CON is NC


Algorithm NUM_JUNCTIONPOINTS

To determine the number of junction points, do the following:

1. Starting from the upper left corner pixel, move the mask over the image from left to right.
2. Find the first foreground pixel P0.
   IF the number of neighbouring pixels of P0 ≥ 3
      THEN increase COUNT_2 by 1
   In either case, continue with the next pixel to the right (P0 = P3).
3. Repeat step 2 until the lower right pixel is reached.
4. Set JP = COUNT_2
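The junction point count can be sketched in Python as follows, using the P1 to P8 numbering of the 3x3 mask in Fig. 3 (P1 directly above P0, then clockwise). This is a literal reading of the rule "3 or more foreground neighbours", not the authors' code; the names `OFFSETS`, `neighbours` and `count_junction_points` are hypothetical, and pixels outside the image are assumed to be background.

```python
# (row, column) offsets of P1..P8 around P0: P1 is directly above, then clockwise.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def neighbours(img, r, c):
    """Return the values of P1..P8 for the pixel at (r, c); outside the image counts as 0."""
    rows, cols = len(img), len(img[0])
    return [img[r + dr][c + dc] if 0 <= r + dr < rows and 0 <= c + dc < cols else 0
            for dr, dc in OFFSETS]


def count_junction_points(img):
    """Count foreground pixels that have 3 or more foreground pixels in their 8-neighbourhood."""
    count = 0
    for r in range(len(img)):
        for c in range(len(img[0])):
            if img[r][c] and sum(neighbours(img, r, c)) >= 3:
                count += 1
    return count


# A Y-shaped stroke: two diagonal arms meeting a vertical stem.
y_shape = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# A plus-shaped stroke: a horizontal and a vertical line crossing at the centre.
plus_shape = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

jp_y = count_junction_points(y_shape)
jp_plus = count_junction_points(plus_shape)
```

For the Y shape only the meeting pixel is counted. For the plus shape the literal rule also counts the four pixels adjacent to the crossing, because each of them sees the crossing pixel and two diagonal arm pixels, so a practical implementation may need to merge adjacent junction candidates.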

Using the above algorithms, the following fuzzy rules are formed to classify the characters into one of seven classes. The flow process is shown in Fig. 4.

(i) IF VB == NP THEN character belongs to class A ( )

(ii) IF VB == P AND POS == M AND LEN == L THEN character belongs to class B ( )

(iii) IF VB == P AND POS == M AND LEN == S AND JP < 2 THEN character belongs to class C ( )

(iv) IF VB == P AND POS == M AND LEN == S AND JP ≥ 2 THEN character belongs to class D ( )

(v) IF VB == P AND POS == RE AND CON == NC THEN character belongs to class E ( )

(vi) IF VB == P AND POS == RE AND CON == C AND JP < 4 THEN character belongs to class F ( )

(vii) IF VB == P AND POS == RE AND CON == C AND JP ≥ 4 THEN character belongs to class G ( )
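Rules (i) to (vii) can be written as a small decision function. The Python sketch below takes the symbols of Table 1 as inputs and returns the class letter; the function name `classify` and its keyword arguments are hypothetical. The rules are coded as crisp tests, exactly as they are stated above, and the sketch assumes the features have already been extracted.

```python
def classify(vb, pos=None, length=None, con=None, jp=0):
    """Return the class letter A..G for the given vertical bar features.

    vb: "P" or "NP"; pos: "M" or "RE"; length: "S" or "L";
    con: "C" or "NC"; jp: number of junction points.
    """
    if vb == "NP":
        return "A"                      # rule (i)
    if pos == "M":
        if length == "L":
            return "B"                  # rule (ii)
        return "C" if jp < 2 else "D"   # rules (iii) and (iv)
    if pos == "RE":
        if con == "NC":
            return "E"                  # rule (v)
        return "F" if jp < 4 else "G"   # rules (vi) and (vii)
    raise ValueError("position must be 'M' or 'RE' when the vertical bar is present")


example_classes = [
    classify("NP"),
    classify("P", pos="M", length="S", jp=1),
    classify("P", pos="RE", con="C", jp=4),
]
```

Because the seven conditions are mutually exclusive, the order of the tests does not affect the result, which matches the statement that a character can belong to one class only.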


Figure 4. Flow process of classification. The flowchart reads the normalized 9x9 image of the character and then tests, in order, the presence of the vertical bar (VB), its position (POS), its length (LEN) or connectedness (CON), and the number of junction points (JP), ending in one of the classes A to G. Legend: VB: vertical bar; A: absent; POS: position of vertical bar; RE: right end; M: middle; LEN: length of vertical bar; L: large; S: small; JP: junction point; CON: connectedness of vertical bar; NC: not connected.

3.5 Feature Extraction

Steps for extracting features are given in the following algorithm.

Algorithm FEATURE_REC

1. Remove the header line by applying the following method:
   (i) Apply the 3x3 movable mask to the normalized image and scan the first row from right to left.
   (ii) IF the pixel is a foreground pixel THEN call it P0.
        IF P7 is a foreground pixel OR P0 is an end point OR P0 is a disconnected component
           SET P0 = 0
        ELSE move to the pixel on the left.


Image is scanned from right to left to avoid the deletion of character pixels in

characters such as: because these characters, except , may be written

in two ways— (a) header line covers the whole character and, (b) when header line

covers only half or a portion of the character. In the first case, this step may result in

deletion of pixels, which are common to header line and character, in characters

mentioned above as well as characters such as and may produce some

disconnected components with small number of pixels.

2. Delete disconnected components as follows:
   (i) Scan the second row of the image from left to right.
   (ii) Find the first foreground pixel P0.
   (iii) IF P3 == 1
            IF P3 has no other foreground pixel in its 8-neighbourhood
               THEN SET P0 = 0 AND P3 = 0
         ELSE IF P5 == 1
            IF P5 has no other foreground pixel in its 8-neighbourhood
               THEN SET P0 = 0 AND P5 = 0

Fig. 5 shows the process of deleting header line from character and its result.

Figure 5: (a) Character with header line, (b) Character without header line and disconnected component, (c) Character after removing disconnected component

3. Apply the 3x3 movable mask to the normalized image of the classified character and scan the image row by row from top to bottom. Collect the following information for junction points and end points:
   (i) N1: total number of junction points
   (ii) N2: total number of end points
   (iii) JPi: i-th junction point, where i = 1 to N1
   (iv) EPi: i-th end point, where i = 1 to N2
   (v) Curve(JPi): curve on the i-th junction point (Table 2)
   (vi) Curve(EPi): curve on the i-th end point
   (vii) Line(JPi): line on the i-th junction point (Table 2)
   (viii) Line(EPi): line on the i-th end point
   (ix) Loop(JPi): loop on the i-th junction point
   (x) D1(i): direction of the next end point from the i-th end point
   (xi) D2(i): direction of the next junction point from the i-th junction point

Values and symbols of different types of curves, lines & loops are given in the

TABLE 2.

Table 2: Values and symbols for curves, lines and loop

Feature   Value               Symbol
Curve     Left curve          LC
          Upper left curve    ULC
          Lower left curve    LLC
          Right curve         RC
          Upper right curve   URC
          Lower right curve   LRC
          U curve             U
Line      Vertical line       VL
          Horizontal line     HL
          Back slash          BS
Loop      Present             P
          Not present         NP

Different forms of above mentioned curves, lines and loops are shown in Fig. 6.

In the algorithm CURVE_LINE_LOOP_INFO, the following notations are used:

PS -- Starting point

CL -- current location

CP -- Current pixel

COUNT -- counter variable. Initial value is set to 0.


Algorithm CURVE_LINE_LOOP_INFO

To determine the nature of the curve, do the following:

Move the mask over the binary image of the classified character row by row from bottom to top. Let P be the first foreground pixel. Call it the current pixel (CP).

1. If CP is a junction point or end point, then check the 8-neighbourhood of CP.

(a) IF P1 is true THEN

(i) Repeat till P1 is encountered

(ii) Increase COUNT by 1.

ELSE stop.

(b) IF P3 is true THEN

(i) Repeat till P3 is encountered

(ii) Increase COUNT by 1.

ELSE stop.

(c) IF P8 is true THEN

(i) Repeat till P8 is encountered

(ii) Increase COUNT by 1.

ELSE stop.

(d) IF P1 OR P2 is true THEN

(i) Repeat till P1 OR P2 is encountered

(ii) Increase COUNT by 1.

ELSE stop.

(e) IF P1 OR P8 is true THEN

(i) Repeat till P1 OR P8 is encountered

(ii) Increase COUNT by 1.

ELSE stop.

(f) IF P2 OR P3 OR P4 is true THEN

(i) Repeat till P2 OR P3 OR P4 encountered

(ii) Increase COUNT by 1.

ELSE stop.

(g) IF P4 OR P5 is true THEN

(i) Repeat till P4 OR P5 is encountered

(ii) Increase COUNT by 1.

ELSE stop.

(h) IF P6 OR P7 OR P8 is true THEN

(i) Repeat till P6 OR P7 OR P8 is encountered

(ii) Increase COUNT by 1.

ELSE stop.

2. Check the following to know the type of curve and line:

(i) IF step 1(h) is true

IF step 1(a) is true

IF step 1(f) is true

IF COUNT ≥ 3

THEN Curve is LC


(ii) ELSEIF step 1(e) is true

IF step 1(f) is true

IF COUNT ≥2

THEN Curve is ULC

(iii) ELSEIF step1(h) is true

IF step 1(e) is true

IF COUNT ≥2

THEN Curve is LLC

(iv) ELSE IF step 1(f) is true

IF step 1(a) is true

IF step 1 (h) is true

IF COUNT ≥ 3

THEN Curve is RC.

(v) ELSE IF step 1(d) is true

IF step 1(h) is true

IF COUNT ≥ 2

THEN Curve is URC.

(vi) ELSE IF step 1 (f) is true

IF step 1(e) is true

IF COUNT ≥ 2

THEN Curve is LRC.

(vii) ELSEIF step 1(g) is true

IF step 1(h) OR step1 (f) is true

IF step 1(d) is true

IF COUNT ≥3

THEN Curve is U

(viii) IF step 1(a) is true

IF COUNT ≥ 2

THEN Line is VL

(ix) IF step 1(b) is true

IF COUNT ≥ 2

THEN Line is HL

(x) IF step 1(c) is true

IF COUNT ≥ 2

THEN Line is BS

3. If CP is a junction point, then do the following to check the presence of loop:

IF step 1(h) is true

IF step 1(a) OR step 1 (g) is true

IF step 1(f) is true

IF Pi == CP AND COUNT ≥ 5

THEN Loop is P.

Figure 6: Different types of curves, lines and loop: (a) Left curve (LC), (b) Upper left curve (ULC), (c) Lower left curve (LLC), (d) Right curve (RC), (e) Upper right curve (URC), (f) Lower right curve (LRC), (g) U curve (U), (h) Vertical line (VL), Horizontal line (HL), Backward slash (BS), (i) Loop

3.6 Recognition

Fuzzy rules are used for recognition. Class wise rules applied for characters are:

1. IF Class is A

IF Curve (EP1) == RC

THEN character is

ELSE IF Curve (JP1) ==LRC

IF N2==4 OR D1 (3) == P3

THEN character is

ELSE character is

2. IF Class is B

IF Curve (EP2) == LC

THEN character is

ELSE IF Curve (EP2) == URC

IF Curve (JP1) == LC OR Loop(JP1) ==P

THEN character is

ELSE character is

3. IF Class is C

IF Curve (EP1) == LC

THEN character is

ELSE IF Curve (EP1) == RC

IF N2==3

THEN character is

ELSE character is

4. IF Class is D

IF Curve (EP1) == LC

THEN character is

ELSE IF Curve (JP1) == LC

IF N2 < 2

THEN character is

ELSE IF N2==2

THEN character is

ELSE character is

ELSE IF Loop (JP1) ==P

IF Curve (JP1) == RC OR URC

THEN character is

ELSE character is

5. IF Class is E

IF Loop (JP1) ==P

IF N1==2

THEN character is

ELSE character is

ELSE IF Curve (EP1) == U

THEN character is


6. IF Class is F

IF N2 > 3

IF Curve (EP1) == ULC

THEN character is

ELSE IF Curve (EP1) == RC OR Curve (EP2) == RC

THEN character is

ELSE character is

ELSE IF N2==3

IF Curve (JP1) == LLC

THEN character is

ELSE IF Curve (JP1) ==U

THEN character is

ELSE IF Curve (EP1) == ULC

THEN character is

ELSE character is

ELSE

IF Curve (JP1) ==U

THEN character is

ELSE IF Curve (JP1) ==LLC

THEN character is

ELSE IF Curve (JP1) ==LC OR Loop (JP1) ==P

THEN character is

ELSE character is

7. IF Class is G

IF N2>4

IF Curve (EP1) == RC

THEN character is

ELSE IF Line (EP1) == BS

IF D2 (1) ==P3 OR D2(2)==P3

THEN character is

ELSE character is

ELSE IF N2 ==4

IF Loop on JP1 ==P

THEN character is

ELSE character is

ELSE

IF Curve (JP1) ==LLC OR U

THEN character is

ELSE IF Curve (JP1) == LC

THEN character is

ELSE IF Loop on JP1 ==P

IF Loop on JP3 ==P OR LINE (EP2) == HL

THEN character is

ELSE character is
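To show how such a class-wise rule maps onto code, the Python sketch below encodes only the rule for class E. The features are passed in a dictionary keyed by the names used above. The returned labels "E-1", "E-2" and "E-3" are placeholders introduced for this example, since they stand for the three characters named in rule 5; the function name `recognise_class_e` is likewise hypothetical.

```python
def recognise_class_e(features):
    """Apply rule 5 (class E) to a dictionary of extracted features.

    Returns a placeholder label for the matching branch, or None when
    no branch of the rule applies.
    """
    if features.get("Loop(JP1)") == "P":
        # A loop on the first junction point: decide by the number of junction points.
        return "E-1" if features.get("N1") == 2 else "E-2"
    if features.get("Curve(EP1)") == "U":
        return "E-3"
    return None


label_a = recognise_class_e({"Loop(JP1)": "P", "N1": 2})
label_b = recognise_class_e({"Loop(JP1)": "NP", "Curve(EP1)": "U"})
```

The rules for the other classes have the same nested IF/ELSE shape and could be encoded in the same way, one function per class.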


Table 3: Summary of fuzzy rules for each character

Class  N1   N2   Curve(JP)   Curve(EP)  Line(JP)  Line(EP)  Loop(JP)  D1   D2   D3   Character
A      ---  ---  ---         RC         ---       ---       ---       ---  ---  ---
       ---  ---  ---         ---        LRC       ---       ---       ---  ---  ---
       ---  4    ---         ---        LRC       ---       ---       P3   ---  ---
B      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---
       ---  ---  ---         URC        ---       ---       ---       ---  ---  ---
       ---  ---  LC          URC        ---       ---       P         ---  ---  ---
C      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---
       ---  ---  ---         RC         ---       ---       ---       ---  ---  ---
       ---  3    ---         RC         ---       ---       ---       ---  ---  ---
D      ---  ---  ---         LC         ---       ---       ---       ---  ---  ---
       ---  <2   LC          ---        ---       ---       ---       ---  ---  ---
       ---  2    LC          ---        ---       ---       ---       ---  ---  ---
       ---  ---  LC          ---        ---       ---       ---       ---  ---  ---
       ---  ---  ---         ---        ---       ---       P         ---  ---  ---
       ---  ---  RC OR URC   ---        ---       ---       P         ---  ---  ---
E      2    ---  ---         ---        ---       ---       P         ---  ---  ---
       ---  ---  ---         ---        ---       ---       P         ---  ---  ---
       ---  ---  ---         U          ---       ---       ---       ---  ---  ---
F      ---  >3   ---         ---        ---       ---       ---       ---  ---  ---
       ---  >3   ---         ULC        ---       ---       ---       ---  ---  ---
       ---  >3   ---         RC         ---       ---       ---       ---  ---  ---
       ---  3    ---         ---        ---       ---       ---       ---  ---  ---
       ---  3    LLC         ---        ---       ---       ---       ---  ---  ---
       ---  3    U           ---        ---       ---       ---       ---  ---  ---
       ---  3    ---         ULC        ---       ---       ---       ---  ---  ---
       ---  <3   ---         ---        ---       ---       ---       ---  ---  ---
       ---  <3   U           ---        ---       ---       ---       ---  ---  ---
       ---  <3   LLC         ---        ---       ---       ---       ---  ---  ---
       ---  <3   LC          ---        ---       ---       P         ---  ---  ---
G      ---  >4   ---         ---        ---       ---       ---       ---  ---  ---
       ---  >4   ---         RC         ---       ---       ---       ---  ---  ---
       ---  >4   ---         ---        ---       BS        ---       P3   P3   ---
       ---  4    ---         ---        ---       ---       ---       ---  ---  ---
       ---  4    ---         ---        ---       ---       P         ---  ---  ---
       ---  <4   LLC OR U    ---        ---       ---       ---       ---  ---  ---
       ---  <4   LC          ---        ---       ---       ---       ---  ---  ---
       ---  <4   ---         ---        ---       HL        P         ---  ---  ---
       ---  <4   ---         ---        ---       ---       P         ---  ---  ---


4. EXPERIMENTAL RESULTS

The dataset was created by collecting handwritten word samples from 30 people of different age groups. Each person was asked to write 15 predecided words. A part of the dataset is shown in Fig. 7.

Figure 7: Word samples taken for experiment


These word samples were scanned using a flat-bed scanner at 300 dpi. The results of the operations performed on a scanned word image during the recognition process are shown in Figure 8.

Figure 8. Result of operations performed during preprocessing, segmentation and classification on sample word

[Figure 8 panels: Original image; Filtered image; Binarized image; Thinned image; Segmented image; Classification; Eroded and dilated image. Rule conditions shown for the segmented characters: (VB == NP); (VB == P, POS == RE, CON == C, JP ≥ 4); (VB == P, POS == RE, CON == C, JP < 4). Output shown: Character belongs to class G, A, F.]


After classification, the features listed in TABLE 2 are extracted for each character by applying the algorithm FEATURE_REC; these features are then used for recognition. The recognition rate for each word sample, and for the proposed method as a whole, is given in TABLE 4.
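As an illustration of how endpoints and junction points can be located on a thinned character with a 3x3 window, the following is a minimal sketch in Python. It is not the paper's FEATURE_REC algorithm: it assumes a one-pixel-wide binary skeleton (1 = stroke, 0 = background) and uses the crossing number over the 8-neighbourhood, a common technique for this purpose. The names `crossing_number`, `end_and_junction_points` and `T_SHAPE` are hypothetical.

```python
# Offsets of the 8 neighbours of a pixel, listed clockwise starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def pixel(img, r, c):
    """Return the pixel value, treating positions outside the image as 0."""
    if 0 <= r < len(img) and 0 <= c < len(img[0]):
        return img[r][c]
    return 0


def crossing_number(img, r, c):
    """Count 0 -> 1 transitions while walking once around the 3x3 ring."""
    ring = [pixel(img, r + dr, c + dc) for dr, dc in RING]
    shifted = ring[1:] + ring[:1]  # same ring, advanced by one position
    return sum(1 for a, b in zip(ring, shifted) if a == 0 and b == 1)


def end_and_junction_points(img):
    """Classify stroke pixels: crossing number 1 = endpoint, >= 3 = junction."""
    endpoints, junctions = [], []
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value != 1:
                continue
            cn = crossing_number(img, r, c)
            if cn == 1:
                endpoints.append((r, c))
            elif cn >= 3:
                junctions.append((r, c))
    return endpoints, junctions


# Hypothetical test skeleton: a horizontal bar with a vertical stroke below it.
T_SHAPE = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

endpoints, junctions = end_and_junction_points(T_SHAPE)
print("endpoints:", endpoints)
print("junctions:", junctions)
```

The crossing number is used here instead of a plain neighbour count because, on a skeleton, pixels next to a junction can also have three neighbours; counting transitions around the ring marks only the pixel where the strokes actually meet.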

Table 4. Average recognition rate of selected words

Sample word   Recognition rate   Recognition rate   Recognition rate   Avg. recognition
              of character 1     of character 2     of character 3     rate
S1            92.15%             94.08%             88.23%             91.48%
S2            94%                90.11%             87.23%             90.44%
S3            90.93%             97.26%             95.06%             94.41%
S4            94.14%             90.17%             90%                91.43%
S5            83.66%             93.96%             92.07%             89.89%
S6            95%                93.48%             84.36%             90.94%
S7            95.22%             92.01%             89.76%             92.33%
S8            96.31%             92.45%             91.19%             93.31%
S9            88.42%             92.31%             94.21%             91.64%
S10           89.75%             83.52%             93.46%             88.91%
S11           90.68%             88.99%             ---                89.83%
S12           96.29%             94.43%             ---                95.36%
S13           88.57%             93.91%             ---                91.24%
S14           96.81%             97.44%             ---                97.12%
S15           87.41%             96.80%             ---                92.10%

Overall Average Recognition Rate: 92.02%
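The averages in Table 4 can be checked with a short calculation. The sketch below assumes that each word's average is the mean of its per-character rates and that the overall figure is the mean of the 15 word averages, with results truncated (not rounded) to two decimals. That assumption is inferred from the published numbers and is not stated in the paper. Rates are stored as integer hundredths of a percent so that the arithmetic is exact; `RATES`, `PUBLISHED_AVG`, `truncated_mean` and `fmt` are hypothetical names.

```python
# Per-character recognition rates from Table 4, in hundredths of a percent
# (92.15% -> 9215). Words S11 to S15 have only two characters.
RATES = {
    "S1": [9215, 9408, 8823], "S2": [9400, 9011, 8723],
    "S3": [9093, 9726, 9506], "S4": [9414, 9017, 9000],
    "S5": [8366, 9396, 9207], "S6": [9500, 9348, 8436],
    "S7": [9522, 9201, 8976], "S8": [9631, 9245, 9119],
    "S9": [8842, 9231, 9421], "S10": [8975, 8352, 9346],
    "S11": [9068, 8899], "S12": [9629, 9443],
    "S13": [8857, 9391], "S14": [9681, 9744],
    "S15": [8741, 9680],
}

# Average column as published in Table 4, in the same units.
PUBLISHED_AVG = [9148, 9044, 9441, 9143, 8989, 9094, 9233, 9331,
                 9164, 8891, 8983, 9536, 9124, 9712, 9210]


def truncated_mean(values):
    """Mean of integer values, truncated toward zero (floor for positives)."""
    return sum(values) // len(values)


def fmt(hundredths):
    """Format integer hundredths of a percent, e.g. 9202 -> '92.02%'."""
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


word_avgs = [truncated_mean(rates) for rates in RATES.values()]
overall = truncated_mean(word_avgs)

for word, avg, published in zip(RATES, word_avgs, PUBLISHED_AVG):
    status = "matches" if avg == published else "differs from"
    print(f"{word}: {fmt(avg)} ({status} published {fmt(published)})")
print("Overall:", fmt(overall))
```

Under these assumptions every computed word average agrees with the published column, and the overall value comes out at 92.02%.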

Figure 9. Graphical representation of recognition rate of sample words

[Figure 9 chart: y-axis ticks from 84 to 98; x-axis categories S1 to S15; legend "Series1".]

5. CONCLUSION

In this paper, we have presented a novel method for the classification and recognition of simple Hindi two- or three-letter words without modifiers using a fuzzy rule based approach. Characters are first classified into seven different classes and then recognized class-wise. A few misclassifications arise from two causes: characters of similar shape, such as [...] & [...] and [...] & [...], and characters that can be written in more than one way, such as [...] & [...]. We have extracted features for all the basic characters of the language for the recognition process. The algorithms developed perform well and give good results because the most prominent features, such as the vertical bar, curves, loops and lines, are used at the classification and recognition stages. Experimental results verify the significance of the proposed system, with a 92.02% recognition rate. Fuzzy logic performs better than other methods as it can deal with imprecise, incomplete and vague data efficiently without losing any important information. In future, we will work to achieve better results and to improve the recognition rate by placing more emphasis on characters having similar shape, such as [...], and on Hindi words with modifiers.
