Discrimination of similar handwritten numerals based on invariant curvature features

Pattern Recognition 38 (2005) 947–963, www.elsevier.com/locate/patcog

Lihua Yang a,b,*, Ching Y. Suen b, Tien D. Bui b, Ping Zhang b

a School of Mathematics and Computing Science, Sun Yat-sen University, Guangzhou city 510275, P.R. China

b Center for Pattern Recognition and Machine Intelligence, Concordia University, Montreal, Canada H3G 1M8

Received 17 December 2003; accepted 18 January 2005

Abstract

This paper studies the discrimination of similar handwritten numerals based on invariant curvature features. High-order B-splines are used to calculate the curvature of the contours of handwritten numerals. The concept of a distribution center is introduced so that a one-dimensional periodic signal can be normalized to be shift invariant. Consequently, the curvature of the contour of a character becomes rotation invariant. To reduce the dimension of the features, wavelet basis decomposition is used to produce more compact features. Finally, an artificial neural network (ANN) and support vector machines (SVM) are trained on the features to design classifiers with high recognition rates.

© 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Keywords: Discrimination of handwritten numerals; Contour; Curvature; Wavelet; Artificial neural network (ANN); Support vector machines (SVM); Classifier

1. Introduction

Discrimination of similar handwritten numerals has always been a challenging problem. The key difficulty is to extract “good” features of similar handwritten numerals. For similar handwritten numerals, such as those shown in Fig. 1, the visual features are usually similar, which makes discrimination between them difficult [1]. Features of handwritten characters fall mainly into two categories: skeletons and contours. Much research has been done based on these two types of features, such as Refs. [2–8]. In this paper, contour curvature is employed as the feature of numerals. The curvature is the essential feature of an object contour. There has been persistent interest in curvature-based characterization in the field of handwritten character recognition (see Refs. [9–12,4,13]). It is well known in differential geometry that, given the curvature and the direction at the starting point, a curve can be reconstructed accurately [14]. An important advantage of the curvature representation is that, whereas the coordinate representation usually needs two signals $x(t)$ and $y(t)$ to express a contour mathematically, a single signal $\kappa(t)$ suffices with curvature. Another advantage is that the contour curvature of an object is translation invariant. However, there are also two drawbacks in using curvature: (1) the calculation of the curvature of a discrete signal is difficult in practice because the numerical calculation of second derivatives is a typical ill-posed problem [15–17]; (2) the curvature feature depends on the starting point. In practice, a particular point, such as the top left one, can be designated as the starting point. In this case, when the character is rotated, the starting point changes, which corresponds to a translation of the one-dimensional curvature signal. Therefore, such a curvature feature is not rotation invariant.

To overcome the first deficiency, high-order B-splines are used in this paper to smooth the contour and to provide a reliable calculation of the curvature of a character contour. To achieve rotation invariance, a new concept, the distribution center, is introduced for a periodic one-dimensional signal. Subsequently, wavelet basis decomposition is employed to reduce the dimension of the features, producing more compact features of dimension 8, 16 or 32, depending on the level at which the wavelet decomposition is conducted. Finally, artificial neural networks (ANN) and support vector machines (SVM) are trained on the features to design high-performance classifiers.

Fig. 1. Eight handwritten Arabic numerals. The first four come from the ‘4’ category and the last four come from the ‘9’ category.

This work is supported by NSFC (No. 60475042), GDSF (036608), the foundation of scientific and technological planning project of Guangzhou city (2003J1-C0201) and the Natural Sciences and Engineering Research Council of Canada.

* Corresponding author. School of Mathematics and Computing Science, Sun Yat-sen University, Guangzhou city 510275, P.R. China. Tel.: +86 2084035410; fax: +86 2084111696.

E-mail address: [email protected] (L. Yang).

0031-3203/$30.00 © 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2005.01.014

The rest of the paper is organized as follows. Section 2 addresses the numerical calculation of the curvature. After the contour is smoothed by convolution with a high-order B-spline, numerical differentiation is converted into numerical integration, so the curvature of a contour can be computed stably. Section 3 discusses scale, translation and rotation normalization of curvature features. A novel concept, the distribution center of a one-dimensional periodic signal, is introduced and used to make the curvature rotation invariant. Wavelet-based feature dimension reduction is also implemented in that section. To support the theory, ANN and SVM are employed in Section 4 to train classifiers that discriminate among similar handwritten numerals. Our experiments give satisfactory recognition rates, which are compared with those of other methods. Finally, Section 5 concludes the paper.

2. Contour of numerals and its curvature

2.1. The curvature of smoothed curves

The contour of a handwritten numeral contains much information. Human beings can recognize handwritten numerals by their contours, which shows that contours contain enough information to distinguish one numeral from another.

A classical contour-tracing approach can extract the discrete contour of a handwritten numeral. The coordinates $x(t)$ and $y(t)$ of the contour are two periodic one-dimensional signals. If a starting point is designated, it is known in differential geometry that a contour is determined uniquely by its curvature [14] and the orientation at the starting point. Two signals, the coordinates $x(t)$ and $y(t)$, are needed to represent a curve directly, whereas a single signal, the curvature $\kappa(t)$, is enough to represent the contour. Moreover, the curvature displays more intuitive properties of the shape of the contour. It can be employed to produce the shape features of a character.

It should also be pointed out that the numerical calculation of the curvature is not easy. By its mathematical definition, the curvature is given by the first and second derivatives of the coordinates $x(t)$ and $y(t)$, i.e.,

$$\kappa(t) := \frac{x'(t)\,y''(t) - y'(t)\,x''(t)}{\sqrt{\left(x'(t)^2 + y'(t)^2\right)^3}}. \tag{1}$$

The numerical calculation of derivatives is typically ill-posed [15–17]. To calculate the curvature stably, we first smooth the contour with a B-spline. With a convolution-type smoothing, numerical differentiation is replaced by numerical integration, which is much more stable in calculation.
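The instability of direct differentiation can be illustrated with a small numerical experiment. The following is only a sketch under stated assumptions: the test contour (a circle of radius `R`, whose curvature is 1/R), the noise level and the use of periodic central differences are illustrative choices, not taken from the paper.

```python
import numpy as np

def curvature_fd(x, y):
    """Curvature of a closed contour from Eq. (1), using periodic central differences."""
    x1 = (np.roll(x, -1) - np.roll(x, 1)) / 2.0     # approximates x'(t)
    y1 = (np.roll(y, -1) - np.roll(y, 1)) / 2.0     # approximates y'(t)
    x2 = np.roll(x, -1) - 2.0 * x + np.roll(x, 1)   # approximates x''(t)
    y2 = np.roll(y, -1) - 2.0 * y + np.roll(y, 1)   # approximates y''(t)
    return (x1 * y2 - y1 * x2) / np.sqrt((x1**2 + y1**2) ** 3)

# Hypothetical test contour: a circle of radius R sampled at n points.
R, n = 10.0, 200
t = 2.0 * np.pi * np.arange(n) / n
x, y = R * np.cos(t), R * np.sin(t)

# Assumed perturbation, small compared with R, standing in for digitization noise.
rng = np.random.default_rng(0)
noise = 0.05
xn = x + rng.normal(0.0, noise, n)
yn = y + rng.normal(0.0, noise, n)

err_clean = np.max(np.abs(curvature_fd(x, y) - 1.0 / R))
err_noisy = np.max(np.abs(curvature_fd(xn, yn) - 1.0 / R))
print(f"max error, clean contour: {err_clean:.2e}")
print(f"max error, noisy contour: {err_noisy:.2e}")
```

On the clean circle the finite-difference estimate is accurate, while a perturbation that is tiny relative to the radius is amplified strongly by the second differences. This is the behaviour that the convolution-type smoothing is meant to avoid.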

Let $\theta(t)$ be a smoothing kernel function satisfying

$$\int_{-\infty}^{\infty} \theta(t)\,dt = 1. \tag{2}$$

The smoothed version of a function $f(t)$ is calculated by

$$I_s^{\theta} f(t) = f * \theta_s(t) = \int_{-\infty}^{\infty} f(x)\,\theta_s(t - x)\,dx \tag{3}$$

with $\theta_s(x) = \frac{1}{s}\,\theta\!\left(\frac{x}{s}\right)$. It is known that

$$I_s^{\theta} f(t) \longrightarrow f(t) \quad (s \to 0)$$

for all continuous functions $f(t)$ vanishing at infinity [18]. Furthermore, if $C := \int_{-\infty}^{\infty} (1 + |t|)\,|\theta(t)|\,dt < \infty$, then

$$|I_s^{\theta} f(t) - f(t)| \le C\,\omega(f, s), \tag{4}$$

where $\omega(f, s) := \max_{|x-y| \le s} |f(x) - f(y)|$ is called the modulus of continuity [19], which characterizes the continuity of the function $f(t)$. In fact, inequality (4) follows easily from the following inequality (see Ref. [19]):

$$\omega(f, st) \le (1 + |t|)\,\omega(f, s).$$

It is well known that $I_s^{\theta} f(t)$ is as smooth as $\theta(t)$. Therefore, by choosing a kernel $\theta$ that is smooth enough, we obtain a smooth function $I_s^{\theta} f(t)$ that approximates $f(t)$ arbitrarily closely as $s \to 0$. Thus, we can calculate the curvature of $I_s^{\theta} f(t)$ instead of that of $f(t)$.
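For completeness, the step from the modulus-of-continuity inequality to the error bound (4) can be written out as follows. This is a sketch of the standard argument rather than a quotation from the paper; it writes $\theta$ for the smoothing kernel and $\omega$ for the modulus of continuity, and uses the substitution $x = t - su$ in Eq. (3) together with the normalization (2).

```latex
\begin{aligned}
\left| I_s^{\theta} f(t) - f(t) \right|
  &= \left| \int_{-\infty}^{\infty} \bigl[ f(t - su) - f(t) \bigr]\, \theta(u)\, du \right| \\
  &\le \int_{-\infty}^{\infty} \omega\bigl(f, s|u|\bigr)\, |\theta(u)|\, du \\
  &\le \omega(f, s) \int_{-\infty}^{\infty} (1 + |u|)\, |\theta(u)|\, du
   \;=\; C\, \omega(f, s).
\end{aligned}
```

The first line holds because $\int \theta(u)\,du = 1$, the second by the definition of the modulus of continuity, and the third by the inequality quoted from Ref. [19].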

Fig. 2. Graph of the function $\theta$ defined by Eq. (7) for $m = 2$ (left), 4 (middle) and 8 (right).

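Since

$$\begin{aligned}
s\,\frac{d}{dt}\, I_s^{\theta} f(t) &= \int_{-\infty}^{\infty} f(x)\,\theta'_s(t - x)\,dx = I_s^{\theta'} f(t), \\
s^2\,\frac{d^2}{dt^2}\, I_s^{\theta} f(t) &= \int_{-\infty}^{\infty} f(x)\,\theta''_s(t - x)\,dx = I_s^{\theta''} f(t),
\end{aligned} \tag{5}$$

the curvature of the smoothed curve $(I_s^{\theta} x(t), I_s^{\theta} y(t))$ is

$$\kappa(t) = \frac{I_s^{\theta'} x(t)\, I_s^{\theta''} y(t) - I_s^{\theta'} y(t)\, I_s^{\theta''} x(t)}{\sqrt{\left(I_s^{\theta'} x(t)^2 + I_s^{\theta'} y(t)^2\right)^3}}. \tag{6}$$

With Eq. (5), the numerical derivatives $\frac{d}{dt} I_s^{\theta} f(t)$ and $\frac{d^2}{dt^2} I_s^{\theta} f(t)$ can be obtained by computing the numerical integrals $I_s^{\theta'} f(t)$ and $I_s^{\theta''} f(t)$.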

2.2. The calculation of discrete convolution

To calculate the curvature, we need to compute the convolutions $I_s^{\theta'} x(t)$, $I_s^{\theta'} y(t)$, $I_s^{\theta''} x(t)$ and $I_s^{\theta''} y(t)$ discretely. In this paper, the smoothing kernel function is chosen to be

$$\theta(x) := \frac{m}{2}\, N_m\!\left(\frac{m}{2}\,x + \frac{m}{2}\right) \quad (m \ge 1), \tag{7}$$

where $N_m(t)$ is the B-spline of order $m$, defined as follows.

Let $N_1(x)$ be the characteristic function of the interval $[0, 1)$, defined by

$$N_1(x) = \begin{cases} 1 & \text{for } x \in [0, 1), \\ 0 & \text{otherwise.} \end{cases}$$

The B-spline of order $m$ is defined as

$$N_m(x) := (N_{m-1} * N_1)(x) = \int_0^1 N_{m-1}(x - t)\,dt \quad (m = 2, 3, \ldots). \tag{8}$$

B-splines are used extensively; they are symmetric, smooth and compactly supported piecewise polynomials. As the order $m$ increases, the B-splines become smoother and their support becomes wider. The basic properties of B-splines of order $m$ are listed as follows (see Ref. [20]):

$$\begin{aligned}
&\operatorname{supp} N_m = [0, m], \\
&N_m(x) > 0, \quad \forall x \in (0, m), \\
&N_m \text{ is symmetric about } \tfrac{m}{2}, \\
&\sum_{k=-\infty}^{\infty} N_m(x - k) = 1, \quad \forall x \in (-\infty, \infty), \\
&N_m'(x) = N_{m-1}(x) - N_{m-1}(x - 1).
\end{aligned} \tag{9}$$

Based on the properties above, it is easy to verify that

$$\int_{-\infty}^{\infty} \theta(x)\,dx = 1 \tag{10}$$

and

$$\theta \in C^{m-2}(\mathbb{R}), \quad \operatorname{supp} \theta = [-1, 1], \quad \theta(-x) = \theta(x). \tag{11}$$

Fig. 2 shows the graph of $\theta$ defined by Eq. (7) for $m = 2$, 4 and 8. The larger $m$ is, the smoother the kernel function $\theta$ is and the more its mass is concentrated near the origin, while its support remains $[-1, 1]$.
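The kernel of Eq. (7) can be evaluated numerically as in the sketch below, which writes `kernel` for the smoothing kernel. It is an illustration, not the authors' code: the function names are hypothetical, and the B-spline is evaluated with the standard two-term recursion for cardinal B-splines, which is assumed here as an equivalent of the convolution definition (8).

```python
import numpy as np

def bspline(m, x):
    """Cardinal B-spline N_m of order m, evaluated elementwise."""
    x = np.asarray(x, dtype=float)
    if m == 1:
        return ((x >= 0.0) & (x < 1.0)).astype(float)
    # Standard recursion for cardinal B-splines (assumed equivalent to Eq. (8)).
    return (x * bspline(m - 1, x) + (m - x) * bspline(m - 1, x - 1.0)) / (m - 1)

def kernel(m, x):
    """Smoothing kernel of Eq. (7): (m/2) * N_m(m*x/2 + m/2), supported on [-1, 1]."""
    x = np.asarray(x, dtype=float)
    return 0.5 * m * bspline(m, 0.5 * m * x + 0.5 * m)

grid = np.linspace(-1.5, 1.5, 3001)
step = grid[1] - grid[0]
for m in (2, 4, 8):
    values = kernel(m, grid)
    area = values.sum() * step   # Riemann-sum check of Eq. (10)
    peak = float(kernel(m, 0.0)) # height at the origin, compare with Fig. 2
    print(m, round(area, 4), round(peak, 4))
```

The printed areas give a quick check of Eq. (10), and the peak heights can be compared with the vertical ranges of the three panels of Fig. 2.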

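In practice, $(x(t), y(t))$ are discrete signals, so we need to calculate discrete convolutions. For an integer $n$ and $k = 1, 2$, we obtain (details are given in the Appendix)

$$\begin{aligned}
I_s^{\theta^{(k)}} f(n) &= \int_{-\infty}^{\infty} f(n - t)\,\frac{1}{s}\,\theta^{(k)}\!\left(\frac{t}{s}\right) dt \\
&\approx \bigl[f(n + 1) - f(n)\bigr]\,\theta^{(k-1)}(0) \\
&\quad + \sum_{j=1}^{[s]} \Bigl[ f(n - j + 1) - f(n - j) + (-1)^{k-1}\bigl(f(n + j + 1) - f(n + j)\bigr) \Bigr]\,\theta^{(k-1)}\!\left(\frac{j}{s}\right),
\end{aligned} \tag{12}$$

where $[s]$ is the largest integer smaller than $s$.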
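A minimal sketch of how Eq. (12) and Eq. (6) might be combined is given below. Everything beyond the two formulas is an assumption made for illustration: the function names, the treatment of the contour as a periodic sequence via `np.roll`, the default order `m = 4`, and the recursion used to evaluate the B-spline. The derivative of the kernel is obtained from the last property in (9).

```python
import numpy as np

def bspline(m, x):
    """Cardinal B-spline N_m (standard recursion, assumed equivalent to Eq. (8))."""
    x = np.asarray(x, dtype=float)
    if m == 1:
        return ((x >= 0.0) & (x < 1.0)).astype(float)
    return (x * bspline(m - 1, x) + (m - x) * bspline(m - 1, x - 1.0)) / (m - 1)

def kernel(m, x, order=0):
    """Kernel of Eq. (7) (order=0) or its first derivative (order=1); needs m >= 2."""
    u = 0.5 * m * (np.asarray(x, dtype=float) + 1.0)
    if order == 0:
        return 0.5 * m * bspline(m, u)
    # Chain rule plus N_m'(x) = N_{m-1}(x) - N_{m-1}(x - 1) from (9).
    return (0.5 * m) ** 2 * (bspline(m - 1, u) - bspline(m - 1, u - 1.0))

def smoothed_derivative(f, s, k, m=4):
    """Right-hand side of Eq. (12) for a periodic sequence f and k in {1, 2}."""
    f = np.asarray(f, dtype=float)
    diff = np.roll(f, -1) - f                  # diff[n] = f(n+1) - f(n)
    out = diff * kernel(m, 0.0, k - 1)
    top = int(np.ceil(s)) - 1                  # largest integer smaller than s
    sign = (-1.0) ** (k - 1)
    for j in range(1, top + 1):
        # np.roll(diff, j)[n] = diff[n-j]; np.roll(diff, -j)[n] = diff[n+j]
        out = out + (np.roll(diff, j) + sign * np.roll(diff, -j)) * kernel(m, j / s, k - 1)
    return out

def smoothed_curvature(x, y, s, m=4):
    """Eq. (6), with the four convolutions approximated by Eq. (12)."""
    x1, y1 = smoothed_derivative(x, s, 1, m), smoothed_derivative(y, s, 1, m)
    x2, y2 = smoothed_derivative(x, s, 2, m), smoothed_derivative(y, s, 2, m)
    return (x1 * y2 - y1 * x2) / np.sqrt((x1**2 + y1**2) ** 3)

# Hypothetical check on a circle of radius R, whose curvature is 1/R.
R, n = 10.0, 200
t = 2.0 * np.pi * np.arange(n) / n
kappa = smoothed_curvature(R * np.cos(t), R * np.sin(t), s=8.0)
print(kappa.mean(), 1.0 / R)
```

Because the factors $s$ and $s^2$ of Eq. (5) cancel between the numerator and the denominator of Eq. (6), the estimate needs no explicit rescaling by $s$.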

Fig. 3. (a1)–(a6) are six images of two numerals ‘4’ and ‘9’ with different sizes and directions; (b1)–(b6) are their curvatures.

2.3. Examples of curvature calculations

Fig. 3 shows the curvatures calculated by using Eq. (6). Fig. 3(a1)–(a6) show six images of the two Arabic numerals ‘4’ and ‘9’ with different sizes and orientations. The starting points of the chain codes of their contours are their top left points. It is easy to see that the shapes of (b1) and (b3), and of (b4) and (b6), are almost the same if the scales are not considered. However, there is a shift between (b2) and (b1), and between (b5) and (b4). This results from the rotation of the original numerals. When a numeral is rotated, the starting point of the chain code of its contour moves to another point, and thus a shift results. To achieve rotation invariance, a new concept, the distribution center, is introduced and discussed in the next section.

3. Normalization of curvatures and feature dimension reduction

3.1. Scale normalization

As shown in Fig. 3, (b3) and (b1) have the same shape but different scales because of the different sizes of the original numerals (see Fig. 3(a3) and (a1)). The same holds for Fig. 3(b4) and (b6). To achieve scale invariance, a simple linear normalization is employed in this paper.

Let $\{(x_{\mathrm{raw}}(t), y_{\mathrm{raw}}(t)),\ t \in [0, N)\}$ be the contour of a numeral and $T$ be the normalized size. Then the normalized signal is defined as $\{(x(t), y(t)),\ t \in [0, T)\}$, with

$$x(t) = \frac{T}{N}\, x_{\mathrm{raw}}(tN/T), \qquad y(t) = \frac{T}{N}\, y_{\mathrm{raw}}(tN/T), \qquad t \in [0, T). \tag{13}$$

When the curvature is calculated using Eq. (6) from the scale-normalized sequence $(x(t), y(t))_{t \in [0, T)}$ instead of the original sequence $(x_{\mathrm{raw}}(t), y_{\mathrm{raw}}(t))_{t \in [0, N)}$, a scale-invariant curvature of length $T$ is obtained. To approximate the shapes of the original contours, $T$ should be larger than or approximately equal to the length of the original contour. In practice, its value is obtained by training on the samples for recognition.
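Eq. (13) can be sketched as follows. The paper does not say how the raw contour is evaluated at the non-integer positions $tN/T$; linear interpolation on the closed contour is assumed here, and the function name and the toy contour are hypothetical.

```python
import numpy as np

def scale_normalize(x_raw, y_raw, T):
    """Resample a closed contour of N points to T points and rescale by T/N (Eq. (13))."""
    x_raw = np.asarray(x_raw, dtype=float)
    y_raw = np.asarray(y_raw, dtype=float)
    N = len(x_raw)
    pos = np.arange(T) * N / T     # sample positions t*N/T in [0, N)
    knots = np.arange(N + 1)       # the first point is appended to close the contour
    # Linear interpolation between contour points is an assumption of this sketch.
    x = np.interp(pos, knots, np.append(x_raw, x_raw[0]))
    y = np.interp(pos, knots, np.append(y_raw, y_raw[0]))
    return T / N * x, T / N * y

# Toy contour: the four corners of a unit square, normalized to length 8.
x, y = scale_normalize([0, 1, 1, 0], [0, 0, 1, 1], T=8)
print(x)
print(y)
```

Scaling the coordinates by $T/N$ compensates for the fact that the number of contour points grows with the size of the numeral, so contours of the same shape map to approximately the same normalized sequence.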

Fig. 4. (a) Eight periods of a periodic signal; (b)–(e) four separate periods of this signal starting at different points.

3.2. Shift normalization of one-dimensional periodic signals

As seen in the previous section, a shift of the curvature occurs when the original numeral is rotated. To achieve rotation invariance, the curvature should be transformed and normalized so as to be shift invariant.

For a one-dimensional signal with finite energy, the first moment can be used to obtain the statistic center [21,22]:

$$\mu = \frac{1}{m} \int_{-\infty}^{\infty} x f(x)\,dx, \tag{14}$$

where

$$m = \int_{-\infty}^{\infty} f(x)\,dx. \tag{15}$$

However, for a periodic signal, the above concept of a statistic center is invalid, for two reasons: (1) a periodic signal $f(t)$ is not of finite energy, so the integrals in Eqs. (14) and (15) do not exist and the above definition is invalid; (2) a periodic signal obviously has no global center. If a single period is considered, it is certainly not difficult to define a center. However, such a center depends on which period is considered. Fig. 4(a) shows 8 periods of a periodic signal whose period is 1. We need to determine a “center” of this signal. If the period shown in Fig. 4(b), which starts at 0, is considered, the center should intuitively be 0.58 according to Formula (14); if the period in Fig. 4(c), which starts at 0.5, is considered, 1 should be its center; if the period in Fig. 4(d), which starts at 0.7, is considered, 1.15 should be the center; and if the period in Fig. 4(e), which starts at 0.2, is considered, the center should be about 0.8. This discussion shows that it is not easy to define a statistic center, or a reference point, for a periodic signal.

In this paper, a new way to define a center, or reference point, for a periodic signal is presented as follows.

Definition 1. Let $f(t)$ be a periodic signal with period $T$. Its centers are defined as all the points $\tau$ satisfying

$$\int_{\tau - T/2}^{\tau} f(t)\,dt = \int_{\tau}^{\tau + T/2} f(t)\,dt. \tag{16}$$

Fig. 5. The positions marked by ‘o’ are the centers according to Definition 1 within a given period of the signal shown in Fig. 4(a). (a)–(d) correspond to the four periods shown in Fig. 4(b)–(e), respectively.

Each $\tau$ satisfying Eq. (16) is called a center point of $f$.

It should be pointed out that centers always exist. In fact, denote

$$F(x) = \int_{x - T/2}^{x} f(t)\,dt - \int_{x}^{x + T/2} f(t)\,dt.$$

It is easy to show that $F(x + T/2) = -F(x)$. If $F(0) = 0$, then $\tau = 0$ satisfies Eq. (16). Otherwise, if $F(0) \ne 0$, then $F(0)\,F(T/2) < 0$. By the intermediate value theorem (see Ref. [23]), there exists $\tau \in (0, T/2)$ such that $F(\tau) = 0$, i.e., $\tau$ satisfies Eq. (16).

According to the definition, the set of centers of a periodic signal is infinite. In fact, if $\tau$ satisfies Eq. (16), so does $\tau + kT/2$ for each $k \in \mathbb{Z}$. Therefore, within each period of a periodic signal, there exist at least two center points.

To locate the centers of a periodic function $f(t)$, let $g(t)$ be a primitive function of $f(t)$, i.e., $g'(t) = f(t)$. It is easy to see that Eq. (16) holds if and only if $\Delta^2_{T/2}\, g(\tau) = 0$, where $\Delta^2_{T/2}\, g(\tau)$ is the second central difference of $g(t)$ at $\tau$ with step $T/2$, which is defined by

$$\Delta^2_{T/2}\, g(\tau) = g\!\left(\tau + \frac{T}{2}\right) + g\!\left(\tau - \frac{T}{2}\right) - 2 g(\tau).$$

Therefore, the centers according to Definition 1 can be located by solving $\Delta^2_{T/2}\, g(\tau) = 0$ analytically or numerically. Fig. 5 shows the centers located within several different given periods of a periodic signal. The centers are marked with ‘o’ within the period. The solid curve shows the given period and the dotted parts are added to improve the visual effect. Fig. 5(a)–(d) correspond to the four periods shown in Fig. 4(b)–(e), respectively. We notice that the centers located from different periods of the signal are the same. This shows that such centers are an essential global feature of a periodic signal. They can be used to normalize the curvature of a contour to achieve shift invariance, i.e., rotation invariance of the original image.

Fig. 6. The variable domain map from $[a, a + T)$ to $[0, T)$ in Eq. (17).
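For a sampled signal, the centers of Definition 1 can be located approximately as in the sketch below. The discretization is an assumption of this illustration: one period is given as an array of even length, the two integrals of Eq. (16) are replaced by sums over the half periods before and after a sample, and the sample where the two sums are closest is taken as a center. The function names and the test signal are hypothetical.

```python
import numpy as np

def half_period_imbalance(f):
    """Discrete analogue of F(x) in the existence argument, for one period of f.

    imbalance[c] = sum(f[c-h:c]) - sum(f[c:c+h]) with h = T/2 and cyclic indexing.
    """
    f = np.asarray(f, dtype=float)
    T = len(f)
    h = T // 2
    csum = np.concatenate(([0.0], np.cumsum(np.concatenate((f, f)))))
    right = csum[h:h + T] - csum[:T]   # right[c] = f[c] + ... + f[c+h-1]
    return f.sum() - 2.0 * right       # left + right = f.sum() when T is even

def centers(f):
    """Two samples closest to solving Eq. (16); they lie half a period apart."""
    imbalance = half_period_imbalance(f)
    h = len(f) // 2
    c = int(np.argmin(np.abs(imbalance[:h])))
    return c, c + h

# Hypothetical periodic signal, sampled over one period of length 64.
T = 64
t = np.arange(T)
signal = 1.0 + np.sin(2 * np.pi * t / T + 0.7) + 0.5 * np.cos(4 * np.pi * t / T + 0.3)
shifted = np.roll(signal, -5)   # the same signal, with the period starting 5 samples later
print(centers(signal), centers(shifted))
```

Starting the period 5 samples later moves the located centers back by 5 samples (modulo half a period), so they mark the same positions on the underlying periodic signal, which is the property illustrated in Fig. 5.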

Definition 2. Consider a given period, starting at a point $a$, of a periodic signal $f(t)$ whose least positive period is $T$, and let $t_1, \ldots, t_k$ be its centers within this period. The shift normalized signal is defined as one of the following signals:

$$\tilde{f}(t) = \begin{cases} f(t + t_i), & t \in [0, a + T - t_i), \\ f(t + t_i - T), & t \in [a + T - t_i, T), \end{cases} \qquad (i = 1, \ldots, k). \tag{17}$$

The map from the domain $[a, a + T)$ to $[0, T)$ is displayed graphically in Fig. 6.

Fig. 7. (a0), (b0), (c0) and (d0) are four periods, with different starting points, of the periodic signal shown in Fig. 4. The centers are located and marked by ‘o’ in (a1), (b1), (c1) and (d1), respectively. The normalization results of Definition 2 are shown in (a2), (b2), (c2) and (d2), respectively, and (a3), (b3), (c3) and (d3) are the synthetic shift normalization results according to Definition 3.

Fig. 7 shows the normalization results of Definition 2. Fig. 7(a0), (b0), (c0) and (d0) are four periods, with different starting points, of the periodic signal shown in Fig. 4. The centers are located by Eq. (16) and marked by ‘o’ in Fig. 7(a1), (b1), (c1) and (d1), respectively. The normalization results defined by Definition 2 are shown in Fig. 7(a2), (b2), (c2) and (d2), respectively.

There are at least two center points within each period of a periodic signal, as discussed above (usually there are only two in practice). The distance between the two centers is $T/2$. Therefore, a pair of normalization results is obtained, as shown in Fig. 7(a2), (b2), (c2) and (d2). We notice that the order of the two results within a pair is not always the same from one period to another. This suggests that we should combine them to produce a single feature for each signal.

For a given period of a periodic signal, let $f(t)$ and $f(t - T/2)$ with $t \in [0, T)$ be the pair of shift normalized signals. For another period of the same periodic signal, let $g(t)$ and $g(t - T/2)$ with $t \in [0, T)$ be the corresponding pair of shift normalized signals. As analyzed above, one of the following two cases must hold:

$$\begin{cases} f(t) = g(t), \\ f(t - T/2) = g(t - T/2), \end{cases} \quad t \in [0, T) \qquad \text{or} \qquad \begin{cases} f(t) = g(t - T/2), \\ f(t - T/2) = g(t), \end{cases} \quad t \in [0, T).$$

In either case, it always holds that

$$f(t) + f(t - T/2) = g(t) + g(t - T/2), \quad t \in [0, T).$$

Denoting $F(t) = f(t) + f(t - T/2)$, it is easy to see that $F(t + T/2) = F(t)$, which means that $F(t)$ is a function of period $T/2$. Therefore, $F(t) = f(t) + f(t - T/2)$ with $t \in [0, T/2)$ is a new feature of the original signal, which is based on the pair of normalization results in accordance with Definition 2 and is independent of their order.

Fig. 8. (4a), (4b), (4c), (9a), (9b) and (9c) are six handwritten Arabic numerals of ‘4’ and ‘9’ with different scales and orientations. (4a-1), (4b-1), (4c-1), (9a-1), (9b-1) and (9c-1) are their synthetic shift normalizations according to Definition 3.

Definition 3. For a given period of a periodic signal $f(t)$ whose least positive period is $T$, let $\tilde{f}(t)$, $t \in [0, T)$, be its shift normalized signal according to Definition 2. Then

$$F(t) := \tilde{f}(t) + \tilde{f}(t - T/2), \quad t \in [0, T/2),$$

is called the synthetic shift normalization of $f(t)$.

Fig. 7(a3), (b3), (c3) and (d3) are the synthetic shift normalization results of Fig. 7(a0), (b0), (c0) and (d0), respectively. It can easily be seen that the results are shift independent. Note also that the data size $T/2$ is just half of the original size $T$.
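Definitions 2 and 3 can be combined into a short routine for sampled signals, sketched below under the same assumptions as a discrete reading of Eq. (16): one period of even length, half-period sums in place of integrals, and the best-balanced sample taken as the center. The names `find_center` and `synthetic_shift_normalization` and the test signal are hypothetical.

```python
import numpy as np

def find_center(f):
    """Sample in [0, T/2) where the sums over the previous and next half period
    are closest, i.e. an approximate solution of Eq. (16)."""
    f = np.asarray(f, dtype=float)
    T = len(f)
    h = T // 2
    csum = np.concatenate(([0.0], np.cumsum(np.concatenate((f, f)))))
    right = csum[h:h + T] - csum[:T]          # sum of f over [c, c+h)
    imbalance = f.sum() - 2.0 * right         # left sum minus right sum
    return int(np.argmin(np.abs(imbalance[:h])))

def synthetic_shift_normalization(f):
    """Definition 3: F(t) = f~(t) + f~(t - T/2) for t in [0, T/2)."""
    f = np.asarray(f, dtype=float)
    T = len(f)
    h = T // 2
    c = find_center(f)
    idx = (c + np.arange(h)) % T
    # f~(t) = f(t + c) cyclically (Definition 2); by periodicity
    # f~(t - T/2) equals f(t + c + T/2).
    return f[idx] + f[(idx + h) % T]

# Hypothetical periodic signal of length 64 and a cyclically shifted copy.
T = 64
t = np.arange(T)
signal = 1.0 + np.sin(2 * np.pi * t / T + 0.7) + 0.5 * np.cos(4 * np.pi * t / T + 0.3)
feature_a = synthetic_shift_normalization(signal)
feature_b = synthetic_shift_normalization(np.roll(signal, -11))
print(len(feature_a), np.max(np.abs(feature_a - feature_b)))
```

Because the two centers are half a period apart, using either of them gives the same sum, so the feature does not depend on which center is picked or on where the period starts.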

3.3. Rotation normalization of curvature of closed contours

The contour of an object is obviously a periodic curve and, correspondingly, its curvature is a periodic one-dimensional signal. When the object is rotated, the starting point of the curvature shifts by some distance. Fig. 8 shows the results of synthetic shift normalization. Fig. 8(4a), (4b) and (4c) are three handwritten Arabic numerals ‘4’ with different scales and orientations. Fig. 8(4a-1), (4b-1) and (4c-1) exhibit the synthetic shift normalization of the curvatures of their contours according to Definition 3. Similarly, Fig. 8(9a), (9b) and (9c) are three handwritten Arabic numerals ‘9’ with different scales and orientations, and Fig. 8(9a-1), (9b-1) and (9c-1) are their synthetic shift normalizations. As expected, the results are scale and rotation invariant.

3.4. Feature dimension reduction through wavelet decomposition

The features obtained above can be used to discriminate among similar handwritten numerals, such as ‘4’ and ‘9’. As described above, the dimension $T/2$ of the normalized feature is half of the original length of the discrete signal. To reduce the computational complexity, it is necessary to extract more compact features of the curvature. Since the main features of a numeral are those of large curvature, which usually correspond to local maxima on the original numeral, wavelets are a good candidate for feature dimension reduction here. Theoretically, as a filter, the ideal wavelet should be (1) symmetric, (2) compactly supported, (3) regular, i.e., of good continuity, and (4) orthogonal. The reasons are as follows: (1) symmetry implies that the filter has linear phase; (2) compact support means that the filter is local and has lower computational complexity; (3) regularity is needed since the curvature is usually a continuous curve, which can be approximated better with smooth wavelets; and (4) orthogonality eliminates redundancy and yields more compact features.

Fig. 9. Graphs of sym6: (a) the scaling function; (b) the wavelet function.

Fig. 10. The flow chart for wavelet decomposition to the second level. [Flow chart: cA0 is filtered by Lo_D and Hi_D, each followed by ↓2, giving cA1 and cD1; cA1 is processed in the same way, giving cA2 and cD2.]

Strictly speaking, it is mathematically impossible to find a wavelet that satisfies all four properties above. However, in our application symmetry is not so important, since we do not need to preserve the geometry when a signal is filtered. Near symmetry serves our need, and therefore we choose the symlets developed by Daubechies [24]. There is a series of symlets with $N = 2, 3, \ldots$. Each of them is orthogonal, regular, compactly supported with support length $2N - 1$, and nearly symmetric. The number of vanishing moments of the wavelet is $N$. The higher $N$ is, the more regular and symmetric the wavelet becomes. The detailed properties can be found in Ref. [24, pp. 254–257]. Balancing symmetry against compact support, we choose $N = 6$, i.e., sym6, as the wavelet to condense the curvature features obtained in the last section. Fig. 9 shows the scaling function $\phi(x)$ and the wavelet function $\psi(x)$ of sym6, and Table 1 lists the filters for decomposition and reconstruction, respectively.

Fig. 10 is the flow chart for the wavelet decomposition to the second level, where Lo_D and Hi_D denote the low-frequency and high-frequency analysis filters, respectively, whose coefficients are listed in the first two columns of Table 1, and ↓2 denotes downsampling, which means that only the even-indexed elements are kept. At the edges of the data, periodic extension is used. For original data cA0, conducting the wavelet decomposition to a given level, for example the second level, yields an approximation cA2. By applying wavelet decomposition to $\kappa(t)$ of the numerals shown in Fig. 8(4a)–(4c), (9a)–(9c), more compact features are produced, which are shown in Fig. 11(4a)–(4c), (9a)–(9c), respectively. The contours are sampled with normalization length 128; consequently, the synthetic normalization length is 64. After the wavelet decomposition is conducted to the second level, each feature consists of 16 values. For a contour curvature of length $L$ decomposed by wavelets to the $k$th level, the final feature dimension $D$ is

$$D = \frac{L}{2^{k+1}}.$$

Table 1. The filters of sym6 for decomposition and reconstruction

Lo_D       Hi_D       Lo_R       Hi_R
 0.0154     0.0078    −0.0078     0.0154
 0.0035     0.0018     0.0018    −0.0035
−0.1180    −0.0447     0.0447    −0.1180
−0.0483    −0.0211    −0.0211     0.0483
 0.4911     0.0726    −0.0726     0.4911
 0.7876     0.3379     0.3379    −0.7876
 0.3379    −0.7876     0.7876     0.3379
−0.0726     0.4911     0.4911     0.0726
−0.0211     0.0483    −0.0483    −0.0211
 0.0447    −0.1180    −0.1180    −0.0447
 0.0018    −0.0035     0.0035     0.0018
−0.0078     0.0154     0.0154     0.0078
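The approximation branch of Fig. 10 can be sketched directly from the Lo_D column of Table 1. This is an illustration under assumptions, not the authors' implementation: the coefficients are the four-decimal values printed in the table, the edges are handled by periodic extension as stated in the text, and the alignment of the samples kept by the downsampling step is an arbitrary choice of this sketch. The function names are hypothetical.

```python
import numpy as np

# Lo_D column of Table 1 (sym6 low-pass analysis filter, four decimals).
LO_D = np.array([0.0154, 0.0035, -0.1180, -0.0483, 0.4911, 0.7876,
                 0.3379, -0.0726, -0.0211, 0.0447, 0.0018, -0.0078])

def approximation_step(a):
    """One level: periodic convolution with Lo_D, then keep every second sample."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    out_idx = 2 * np.arange(n // 2)[:, None] + 1   # assumed alignment of the kept samples
    taps = np.arange(len(LO_D))[None, :]
    # out[i] = sum_k LO_D[k] * a[(2i + 1 - k) mod n]
    return (a[(out_idx - taps) % n] * LO_D).sum(axis=1)

def wavelet_features(signal, level):
    """Approximation coefficients cA_level of a periodic signal."""
    a = np.asarray(signal, dtype=float)
    for _ in range(level):
        a = approximation_step(a)
    return a

# A synthetic normalization of length 64 (contour length L = 128) gives
# L / 2**(k + 1) features at level k.
dummy = np.ones(64)   # placeholder input, standing in for a normalized curvature
sizes = [len(wavelet_features(dummy, k)) for k in (1, 2, 3)]
print(sizes)
```

Only the low-pass branch is needed for the features, because the detail coefficients cD1 and cD2 are discarded at each level.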

Fig. 12 shows 20 handwritten Arabic numerals of ‘4’ and ‘9’. Fig. 12(4a)–(4j) are 10 different handwritten Arabic numerals of ‘4’ and Fig. 12(4a-1)–(4j-1) are their corresponding wavelet transform features. Similarly, Fig. 12(9a)–(9j) are 10 different handwritten Arabic numerals of ‘9’ and Fig. 12(9a-1)–(9j-1) are their corresponding wavelet transform features.

Fig. 11. (4a), (4b), (4c), (9a), (9b) and (9c) are the wavelet features of Fig. 8(4a-1)–(9c-1), respectively.

4. Discrimination experiments of similar handwritten numerals

4.1. Database

In order to distinguish the similar character pair ‘4’ and ‘9’, we extracted two banks of training and testing samples from the MNIST database [25]. The first bank consists of 7000 training samples, 3500 for each numeral, and 1600 testing samples, 800 for each numeral. The second bank consists of 1600 training samples, 800 for each numeral, and the same 1600 testing samples as the first bank. They are listed in Table 2. These two banks of data are used to conduct our recognition experiments. To give readers a rough impression of these samples, 240 are selected from Bank 2 and shown in Fig. 13. The first eight rows of Fig. 13 come from the training category of Bank 2: the first four rows are ‘4’ and the second four rows are ‘9’. Similarly, the ninth to the 16th rows of Fig. 13 come from the testing category of Bank 2: the first four rows are ‘4’ and the second four rows are ‘9’.

For simplicity, in this section, we use the following abbreviations:

• No.: number,

• TRS: training samples,

• TES: testing samples.

After contour extraction, the curvature is calculated and normalized to a length of 128. Then the symlet wavelet is used to extract wavelet features at different resolutions; namely, feature vectors of sizes 8, 16, and 32 are extracted through wavelet transforms.
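As a quick check of these sizes, the relation D = L/2^(k+1) from the previous section can be evaluated for the normalized length L = 128. The helper below is a hypothetical illustration, and the pairing of decomposition levels 1, 2 and 3 with the sizes 32, 16 and 8 is inferred from that relation rather than stated explicitly in the text.

```python
def feature_dimension(length, level):
    """Feature dimension D = L / 2**(k + 1) for a contour curvature of
    length L decomposed to level k."""
    return length // 2 ** (level + 1)


for k in (1, 2, 3):
    print(k, feature_dimension(128, k))   # sizes 32, 16 and 8
```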

4.2. Discrimination experiment conducted on artificial neural networks (ANN)

A 3-layer ANN with the back propagation (BP) algorithm is used as the classifier in our first experiment. It is configured as follows:

• The number of nodes in the first layer is equal to the number of features.

• The number of nodes in the hidden layer is 20.

• The number of nodes in the output layer is 2, representing the 2 classes.

We encode the ideal output for character ‘4’ as {1, 0}, and the ideal output for character ‘9’ as {0, 1}.
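The configuration above can be pictured with a small forward-pass sketch. Everything beyond the layer sizes and the output encoding is an assumption made for illustration: the sigmoid activation, the random weight initialization, the bias terms and all function names are hypothetical, and the BP training loop itself is omitted.

```python
import math
import random

# Ideal outputs as encoded in the text.
IDEAL_OUTPUT = {"4": [1.0, 0.0], "9": [0.0, 1.0]}


def make_layer(n_in, n_out, rng):
    """One row of weights per node; the last entry of each row is a bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def make_network(n_features, n_hidden=20, n_outputs=2, seed=0):
    """Input layer sized to the features, 20 hidden nodes, 2 output nodes."""
    rng = random.Random(seed)
    return make_layer(n_features, n_hidden, rng), make_layer(n_hidden, n_outputs, rng)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(weights, inputs):
    # zip() stops at len(inputs), so row[-1] is used only as the bias.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]


def forward(network, features):
    hidden_weights, output_weights = network
    return layer_output(output_weights, layer_output(hidden_weights, features))


network = make_network(n_features=8)
outputs = forward(network, [0.1] * 8)   # two values, one per class
```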

Fig. 12. (4a)–(4j) are 10 different handwritten Arabic numerals of ‘4’. (4a-1)–(4j-1) are their corresponding wavelet transform features. (9a)–(9j) are 10 different handwritten Arabic numerals of ‘9’. (9a-1)–(9j-1) are their corresponding wavelet features.

The recognition acceptance and rejection policy is incorporated into the neural network with the following rules. We set an acceptance threshold T1 and a rejection threshold T2. If the differences between the ideal ANN output and the actual output values of both output nodes for a testing sample are smaller than T1, the testing character is assumed to be correctly recognized. If the difference between the ideal ANN output and the actual output value of either output node is greater than T1 and smaller than T2, the testing sample is rejected. Otherwise, the testing sample is misrecognized. In our experiments, T1 is set to 0.1 and T2 is set to 0.45.
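The three-way decision can be expressed compactly. The sketch below is one possible reading of the rule, not the authors' code: it assumes the outcome is governed by the largest deviation over the two output nodes, and it treats a deviation exactly equal to a threshold as belonging to the next category, a case the text does not specify. The function name is hypothetical.

```python
T1 = 0.1    # acceptance threshold used in the experiments
T2 = 0.45   # rejection threshold used in the experiments


def judge(ideal, actual, t1=T1, t2=T2):
    """Classify a test result as recognized, rejected or misrecognized
    from the deviation between ideal and actual ANN outputs."""
    worst = max(abs(i - a) for i, a in zip(ideal, actual))
    if worst < t1:
        return "recognized"
    if worst < t2:
        return "rejected"
    return "misrecognized"


print(judge([1.0, 0.0], [0.95, 0.03]))   # recognized
print(judge([1.0, 0.0], [0.70, 0.20]))   # rejected
print(judge([1.0, 0.0], [0.20, 0.90]))   # misrecognized
```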

The termination condition of ANN training is set as follows: training stops either after 500 iterations or when the ANN output errors for all training samples are smaller than a predefined threshold.

We conducted three experiments with feature lengths of 8, 16, and 32, respectively, on both the first bank data and the second bank data.

Table 2
The numbers of training samples and testing samples used

Character              4       9       Total No.
Bank 1   No. of TRS    3500    3500    7000
         No. of TES    800     800     1600
Bank 2   No. of TRS    800     800     1600
         No. of TES    800     800     1600


Table 3
ANN recognition rates conducted on the data in Bank 1 of Table 2 (ANN trained by 7000 training samples)

No. of      Recognition   Rejection rate (%)       Misrecognition rate (%)        Recognition
features    rate (%)      and No. of rejections    and No. of misrecognitions     reliability (%)
8           98.63         0.62 (10)                0.75 (12)                      99.25
16          98.19         0.87 (14)                0.94 (15)                      99.06
32          97.94         1.06 (17)                1.00 (16)                      99.00

Fig. 13. 240 samples from Bank 2. The first four rows come from the training category of ‘4’, the second four rows come from the training category of ‘9’, the third four rows come from the testing category of ‘4’ and the fourth four rows come from the testing category of ‘9’.

4.2.1. Experiment conducted on the first bank data

We used 7000 training samples to conduct ANN training and then 1600 samples for testing. Table 3 summarizes the recognition rate, number of rejections, number of misrecognitions and recognition reliability rate obtained on the 1600 testing samples. Here, Recognition Reliability = (No. of testing samples − No. of misrecognitions)/No. of testing samples × 100.
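To make the reliability definition concrete, the following sketch recomputes the quantities for the 8-feature row of Table 3 (1600 testing samples, 10 rejections, 12 misrecognitions). The function name is hypothetical, and the recognition rate is assumed here to count the samples that are neither rejected nor misrecognized, which agrees with the tabulated values.

```python
def ann_rates(n_test, n_rejected, n_misrecognized):
    """Return the four percentages reported for an ANN test run."""
    n_recognized = n_test - n_rejected - n_misrecognized
    return {
        "recognition": 100.0 * n_recognized / n_test,
        "rejection": 100.0 * n_rejected / n_test,
        "misrecognition": 100.0 * n_misrecognized / n_test,
        # Recognition reliability as defined in the text.
        "reliability": 100.0 * (n_test - n_misrecognized) / n_test,
    }


rates = ann_rates(1600, 10, 12)   # 8-feature row of Table 3
print(rates["reliability"])       # 99.25
```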

4.2.2. Experiment conducted on the second bank data

For the training procedure on the second bank data, we used another 1600 training samples to conduct ANN training, and the same 1600 testing samples as above are used for testing. Table 4 summarizes the testing results.

Generally speaking, more characters of the ‘4’ category are rejected and misrecognized than of the ‘9’ category, because handwritten ‘4’ has more writing styles that are easily confused with handwritten ‘9’.

Fig. 14 shows some characters of ‘4’ and ‘9’ rejected in the above recognition experiments, and Fig. 15 displays some characters that are misrecognized in our experiments.

4.3. Discrimination experiment conducted on support vector machines (SVM)

In order to verify the performance of the features extracted by our proposed method, we conducted recognition experiments using SVM.

Firstly, the original features are normalized over the range [0, N] as follows:

$$x'_{ij} = \frac{x_{ij} - \min_{1 \le k \le l} x_{kj}}{\max_{1 \le k \le l} x_{kj} - \min_{1 \le k \le l} x_{kj}}\, N,$$

where $x_{ij}$ is the jth feature of the ith training sample (j = 1, ..., n; i = 1, ..., l), $x'_{ij}$ is the corresponding normalized feature, l is the total number of training samples and n is the number of features. In our experiment, n = 8, 16, 32, respectively. N is set to 10.
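Reading the normalization as a column-wise min–max scaling onto [0, N], it can be sketched as follows. The function name is hypothetical, and mapping a constant feature column to 0 is an assumption added only to avoid division by zero; the text does not discuss that case.

```python
def normalize_features(samples, upper=10.0):
    """Scale every feature (column) of `samples` onto [0, upper],
    using the minimum and maximum taken over all training samples."""
    n_features = len(samples[0])
    lows = [min(row[j] for row in samples) for j in range(n_features)]
    highs = [max(row[j] for row in samples) for j in range(n_features)]
    normalized = []
    for row in samples:
        new_row = []
        for j, value in enumerate(row):
            span = highs[j] - lows[j]
            # Assumption: a constant column carries no information -> 0.
            new_row.append(upper * (value - lows[j]) / span if span else 0.0)
        normalized.append(new_row)
    return normalized


toy = [[0.0, 2.0], [1.0, 2.0], [0.5, 2.0]]   # hypothetical 3 samples x 2 features
print(normalize_features(toy))               # [[0.0, 0.0], [10.0, 0.0], [5.0, 0.0]]
```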

We used an RBF kernel to form the SVM classifier and conducted three recognition experiments using different numbers of features. As we use a hard threshold (0 level) to distinguish between the two categories, the output of the SVM only has two states: correct recognition and misrecognition. Table 5 lists the recognition rates obtained on the training and testing samples of both the first bank data and the second bank data.

The second and third rows of Table 5 give the recognition rates obtained on the data in Bank 1 of Table 2. The fourth and fifth rows of Table 5 are the recognition rates on the data in Bank 2 of Table 2.
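The hard-threshold decision of an RBF-kernel SVM mentioned above can be illustrated with the standard decision function f(x) = Σ c_i K(s_i, x) + b, thresholded at 0. This is a generic sketch rather than the authors' setup: the support vectors, coefficients, bias and γ below are toy values, the function names are hypothetical, and assigning the non-negative side to ‘4’ is an arbitrary convention.

```python
import math


def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """f(x) = sum_i coef_i * K(sv_i, x) + bias, with coef_i = alpha_i * y_i."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


def classify(x, support_vectors, dual_coefs, bias, gamma):
    # Hard threshold at level 0: there is no rejection state.
    value = svm_decision(x, support_vectors, dual_coefs, bias, gamma)
    return "4" if value >= 0.0 else "9"


# Toy model with two support vectors in a 2-feature space (hypothetical).
svs = [[0.0, 0.0], [10.0, 10.0]]
coefs = [1.0, -1.0]
print(classify([1.0, 1.0], svs, coefs, bias=0.0, gamma=0.1))   # 4
print(classify([9.0, 9.0], svs, coefs, bias=0.0, gamma=0.1))   # 9
```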

Table 4
ANN recognition rates conducted on the data in Bank 2 of Table 2 (ANN trained by 1600 training samples)

No. of      Recognition   Rejection rate (%)       Misrecognition rate (%)        Recognition
features    rate (%)      and No. of rejections    and No. of misrecognitions     reliability (%)
8           95.69         2.37 (38)                1.94 (31)                      98.06
16          95.43         2.50 (40)                2.07 (33)                      97.93
32          95.13         2.75 (44)                2.12 (34)                      97.88

Table 5
SVM recognition rates conducted on the data in Banks 1 and 2 of Table 2

No. of features                           8        16       32
Recognition rate on TRS (Bank 1) (%)      99.25    99.10    99.05
Recognition rate on TES (Bank 1) (%)      98.80    98.45    98.10
Recognition rate on TRS (Bank 2) (%)      99.12    98.76    98.50
Recognition rate on TES (Bank 2) (%)      96.82    96.64    95.81

Fig. 14. Some examples of rejected characters ‘4’ and ‘9’. The first row comes from the category of ‘4’ and the second row comes from the category of ‘9’.

From the recognition results produced by the ANN and SVM classifiers, it can be concluded that:

(1) As handwritten numerals vary considerably, especially for the discrimination of the similar character pair ‘4’ and ‘9’, the appropriate use of training samples will increase ANN and SVM recognition rates. Furthermore, for SVM training, what is more important is to find those characters which are representative of different handwriting styles.

(2) The feature vector of size 8 can achieve a better recognition rate for both classifiers. Two reasons may explain this result: the smaller number of features reduces the training complexity of the classifiers; on the other hand, the wavelet transform is a good method for reducing feature dimensionality without sacrificing much information.

4.4. Comparisons with other methods and discussions

In order to compare our proposed method with some baseline methods, we conducted two comparative experiments using two sets of features, namely, mesh features and wavelet features. For simplicity, we only use the 7000 training samples to train the classifiers (ANN and SVM) and another set of 1600 as testing samples, i.e., Bank 1 as described in Section 4.1.

Fig. 15. Some examples of misrecognized characters ‘4’ and ‘9’. The first row comes from the category of ‘4’ and the second row comes from the category of ‘9’.

4.4.1. Mesh features

The grayscale character images with a size of 28 × 28 of the 7000 training samples and 1600 testing samples are scaled into different sizes of 32 × 32, 16 × 16, 8 × 8, and 4 × 4, respectively. The values of the scaled images are directly used as mesh features. The different sets of mesh features are fed to ANN and SVM for training. Table 6 summarizes the recognition results obtained with ANN on the different sets of features. Table 7 lists the recognition results obtained with SVM.
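A minimal sketch of the mesh-feature baseline is given below. The text does not say which interpolation was used for scaling, so nearest-neighbour resampling is assumed here purely for illustration, and both function names are hypothetical.

```python
def rescale_nearest(image, size):
    """Resample a 2D list `image` to size x size by nearest-neighbour
    lookup (an assumed scaling method, not stated in the text)."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def mesh_features(image, size):
    """Flatten the scaled image row by row; pixel values are the features."""
    return [pixel for row in rescale_nearest(image, size) for pixel in row]


# Hypothetical 28 x 28 grayscale image with a simple gradient.
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]
for size in (4, 8, 16, 32):
    print(size, len(mesh_features(image, size)))   # 16, 64, 256, 1024 features
```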

4.4.2. Wavelet features

We apply 2D Daubechies-4 wavelets to a 32 × 32 character image to extract the low-frequency component coefficients with sizes of 16 × 16, 8 × 8, and 4 × 4 at different decomposition levels as wavelet features; we then feed the wavelet features to ANN and SVM, respectively, for training and testing. Tables 8 and 9 show the recognition results of the ANN and SVM classifiers on the different sets of wavelet features.

Table 6
Recognition results of ANN conducted on different mesh features

No. of features                                       16 (4 × 4)     64 (8 × 8)      256 (16 × 16)    1024 (32 × 32)
Recognition rate (%)                                  56.88 (910)    84.13 (1346)    95.00 (1520)     98.50 (1576)
Rejection rate (%) and No. of rejections              19.37 (310)    13.31 (213)     2.81 (45)        0.688 (11)
Misrecognition rate (%) and No. of misrecognitions    23.75 (380)    2.56 (41)       2.18 (35)        0.81 (13)
Recognition reliability (%)                           76.25          97.43           97.81            99.19

Table 7
Recognition results of SVM conducted on different mesh features

No. of features                           16 (4 × 4)    64 (8 × 8)    256 (16 × 16)    1024 (32 × 32)
Recognition rate on TRS (Bank 1) (%)      71.50         95.60         97.71            99.20
Recognition rate on TES (Bank 1) (%)      71.06         95.25         97.31            98.69

Table 8
ANN recognition results conducted on different wavelet features

No. of features                                       16 (4 × 4)      64 (8 × 8)      256 (16 × 16)
Recognition rate (%)                                  89.19 (1427)    96.62 (1546)    97.93 (1567)
Rejection rate (%) and No. of rejections              7.81 (125)      1.88 (30)       1.13 (18)
Misrecognition rate (%) and No. of misrecognitions    3.00 (48)       1.50 (24)       0.94 (15)
Recognition reliability (%)                           97.00           98.50           99.06

Table 9
SVM recognition results conducted on different wavelet features

No. of features                           16 (4 × 4)    64 (8 × 8)    256 (16 × 16)
Recognition rate on TRS (Bank 1) (%)      78.50         98.60         99.06
Recognition rate on TES (Bank 1) (%)      74.56         98.06         98.75

4.4.3. Comparison and discussions

For the ANN classifier, if our proposed feature extraction method is employed, the classifier can achieve a 98.63% recognition rate and 99.25% reliability with only 8 features. However, the recognition rate and reliability are much lower when the same number of mesh or wavelet features is used, and they remain lower even when many more features are applied. Our proposed method achieves the highest recognition performance among the three feature extraction methods.

For the SVM classifier, higher recognition rates can be achieved by all three methods (our proposed method and the two baseline methods) as compared to the ANN classifier. To obtain a higher recognition rate and reliability, our proposed method only uses 8 features, whereas the other methods need to use many more features.

Table 10 lists the comparison of ANN training time for different numbers of features on the 7000 training samples. Our proposed method only extracts feature vectors with dimensions of 8, 16, and 32, which are much fewer than the numbers of mesh features and wavelet features.

From our experiments, it can be concluded that the feature dimensionality reduction can greatly speed up the ANN training procedure and reduce the ANN computation complexity. It makes the ANN classifier more reliable.

In Ref. [25], Yann LeCun et al. introduced convolutional neural networks with a gradient-based learning technique for the recognition of handwritten numerals. They achieved a higher recognition rate, as more elegant and complicated neural networks are employed and more training samples are used in the training procedure. In our experiments, we only use BP networks and the SVM with an RBF kernel as classifiers in order to evaluate our proposed feature extraction method. Compared with the other two baseline feature sets, our proposed feature extraction method shows a higher recognition performance with far fewer features and much less training time.

In order to increase the recognition rate, some schemes may be considered in our future research, such as the combination of classifiers, increasing the number of training samples, and using more discriminant classifiers.

Table 10
ANN training time for different numbers of features (all the ANN training procedures are conducted on a Pentium (R) 4 CPU with 2.80 GHz and 1.00 GB of RAM)

No. of features      1024 (32 × 32)    256 (16 × 16)    64 (8 × 8)    32     16 (4 × 4)    8
Training time (s)    26400             6180             1500          810    300           160

5. Conclusion

To discriminate among similar handwritten numerals, curvature-based features are used and their invariance is discussed in this paper. High-order B-splines are used to smooth the contours of handwritten numerals. Consequently, the curvatures can be calculated without numerical instability. A novel concept of a distribution center for a one-dimensional periodic signal is introduced to provide rotation invariance of the curvature features. Wavelet decomposition is utilized to reduce the size of the feature vectors. Finally, artificial neural network (ANN) and support vector machines (SVM) are employed for classification. Experiments show high recognition rates from 95.13% up to 99.80%.

Acknowledgments

This work started at CENPARMI (Center for Pattern Recognition and Machine Intelligence, Concordia University) when the first author visited, and was finished at Sun Yat-sen University. The authors thank their colleagues from these two universities.

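Appendix. Numerical calculation of the discrete convolution $I_s^{\theta^{(k)}} f(n)$ for integer n and k = 1, 2

It is easily seen that

$$
\begin{aligned}
I_s^{\theta^{(k)}} f(n) &= \int_{-\infty}^{\infty} f(n-t)\,\frac{1}{s}\,\theta^{(k)}\!\left(\frac{t}{s}\right) dt \\
&\approx \sum_{j=-\infty}^{\infty} f(n-j)\,\frac{1}{s}\int_{j}^{j+1} \theta^{(k)}\!\left(\frac{t}{s}\right) dt \\
&= \sum_{j=-\infty}^{\infty} f(n-j)\left[\theta^{(k-1)}\!\left(\frac{j+1}{s}\right) - \theta^{(k-1)}\!\left(\frac{j}{s}\right)\right] \\
&= \sum_{j=-\infty}^{\infty} f(n-j+1)\,\theta^{(k-1)}\!\left(\frac{j}{s}\right) - \sum_{j=-\infty}^{\infty} f(n-j)\,\theta^{(k-1)}\!\left(\frac{j}{s}\right) \\
&= \sum_{j=-\infty}^{\infty} \left[f(n-j+1) - f(n-j)\right]\theta^{(k-1)}\!\left(\frac{j}{s}\right).
\end{aligned}
$$

Since $\theta$ is an even function, which implies that $\theta^{(k)}(-t) = (-1)^k \theta^{(k)}(t)$, and $\operatorname{supp}\theta = [-1, 1]$, we have that

$$
\begin{aligned}
I_s^{\theta^{(k)}} f(n) &\approx \left[f(n+1) - f(n)\right]\theta^{(k-1)}(0) + \sum_{j=1}^{\infty} \left[f(n-j+1) - f(n-j)\right]\theta^{(k-1)}\!\left(\frac{j}{s}\right) \\
&\quad + \sum_{j=-\infty}^{-1} \left[f(n-j+1) - f(n-j)\right]\theta^{(k-1)}\!\left(\frac{j}{s}\right) \\
&= \left[f(n+1) - f(n)\right]\theta^{(k-1)}(0) + \sum_{j=1}^{\infty} \left[f(n-j+1) - f(n-j)\right]\theta^{(k-1)}\!\left(\frac{j}{s}\right) \\
&\quad + (-1)^{k-1} \sum_{j=1}^{\infty} \left[f(n+j+1) - f(n+j)\right]\theta^{(k-1)}\!\left(\frac{j}{s}\right) \\
&= \left[f(n+1) - f(n)\right]\theta^{(k-1)}(0) \\
&\quad + \sum_{j=1}^{\infty} \left[f(n-j+1) - f(n-j) + (-1)^{k-1}\bigl(f(n+j+1) - f(n+j)\bigr)\right]\theta^{(k-1)}\!\left(\frac{j}{s}\right) \\
&= \left[f(n+1) - f(n)\right]\theta^{(k-1)}(0) \\
&\quad + \sum_{j=1}^{[s]} \left[f(n-j+1) - f(n-j) + (-1)^{k-1}\bigl(f(n+j+1) - f(n+j)\bigr)\right]\theta^{(k-1)}\!\left(\frac{j}{s}\right),
\end{aligned}
$$

where $[s]$ is the largest integer smaller than $s$.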
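The final truncated sum lends itself to a direct implementation. The sketch below is illustrative only: the kernel (written `theta` in the code) is a hypothetical even function supported on [−1, 1], not the B-spline used in the paper, the sampled signal f is assumed to be periodic (as for a closed contour), and all function names are invented for this example.

```python
import math


def theta(t):
    """Hypothetical even kernel supported on [-1, 1]."""
    return (1.0 - t * t) ** 2 if abs(t) <= 1.0 else 0.0


def theta_prime(t):
    """First derivative of the hypothetical kernel above (an odd function)."""
    return -4.0 * t * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def smoothed_derivative(f, n, s, k, kernel_deriv):
    """Evaluate the truncated sum; `kernel_deriv` must be the (k-1)th
    derivative of the even kernel."""
    size = len(f)

    def at(i):
        return f[i % size]           # assumption: f is periodic

    upper = math.ceil(s) - 1         # largest integer smaller than s
    sign = (-1) ** (k - 1)
    total = (at(n + 1) - at(n)) * kernel_deriv(0.0)
    for j in range(1, upper + 1):
        backward = at(n - j + 1) - at(n - j)
        forward = at(n + j + 1) - at(n + j)
        total += (backward + sign * forward) * kernel_deriv(j / s)
    return total


contour = [math.sin(0.3 * i) for i in range(32)]   # toy periodic samples
first = smoothed_derivative(contour, 5, 4.0, 1, theta)
second = smoothed_derivative(contour, 5, 4.0, 2, theta_prime)
```

Folding the negative indices onto the positive ones roughly halves the number of kernel evaluations compared with summing over both signs of j.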

References

[1] R. Legault, C.Y. Suen, C. Nadal, Classification of confusing handwritten numerals by human subjects, in: C.Y. Suen (Ed.), Proceedings of the First International Workshop on Frontiers in Handwriting Recognition, Montreal, April 1990, pp. 181–193.

[2] H.-P. Chiu, D.-C. Tseng, A feature-preserved thinning algorithm for handwritten Chinese characters, Signal Process. 58 (2) (1997) 203–214.

[3] Y.Y. Chung, M.T. Wong, High accuracy handwritten character recognition system using contour sequence moments, Proceedings of ICSP’98, 1998, pp. 1249–1252.

[4] R. Legault, C.Y. Suen, A comparison of methods of extracting curvature features, in: Pattern Recognition, Conference C: Image, Speech and Signal Analysis, Proceedings of the 11th IAPR International Conference, vol. III, 30 August–3 September 1992, pp. 134–138.

[5] S. Madhvanath, V. Govindaraju, Contour-based image preprocessing for holistic handwritten word recognition, in: Document Analysis and Recognition, 1997, Proceedings of the Fourth International Conference, vol. 2, 18–20 August 1997, pp. 536–539.

[6] K. Liu, Y.S. Huang, C.Y. Suen, Identification of fork points on the skeletons of handwritten Chinese characters, IEEE Trans. Pattern Anal. Mach. Intell. 21 (10) (1999) 1095–1100.

[7] J.J. Zou, H. Yan, Skeletonization of ribbon-like shapes based on regularity and singularity analyses, IEEE Trans. System Man Cybernet. Part B: Cybernet. 31 (3) (2001) 401–407.

[8] F. Chang, Y.-C. Lu, T. Pavlidis, Feature analysis using line sweep thinning algorithm, IEEE Trans. Pattern Anal. Mach. Intell. 21 (2) (1999) 145–158.

[9] P.D. Gader, D. Hepp, B. Forester, T. Peurach, B.T. Mitchell, Pipelined systems for recognition of handwritten digits in USPS zip codes, in: Proceedings of the US Postal Service Advanced Technology Conference, November 1990, pp. 539–548.

[10] J.J. Hull, A. Commike, T.K. Ho, Multiple algorithms for handwritten character recognition, in: Proceedings of the International Workshop on Frontiers in Handwriting Recognition, Concordia University, Montreal, April 1990, pp. 117–124.

[11] Y. Le Cun, B. Boser, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, L.D. Jackel, H.S. Baird, Constrained neural network for unconstrained handwritten digit recognition, in: Proceedings of the International Workshop on Frontiers in Handwriting Recognition, Concordia University, Montreal, April 1990, pp. 145–154.

[12] C.Y. Suen, C. Nadal, T.A. Mai, R. Legault, L. Lam, Recognition of handwritten numerals based on the concept of multiple experts, in: Proceedings of the International Workshop on Frontiers in Handwriting Recognition, Concordia University, Montreal, April 1990, pp. 131–144.

[13] R. Plamondon, S.N. Srihari, On-line and off-line handwriting recognition: a comprehensive survey, IEEE Trans. Pattern Anal. Mach. Intell. 22 (1) (2000) 63–84.

[14] D. Wu, Lecture on Differential Geometry, fourth ed., People Education Press of China, 1981.

[15] A.N. Tikhonov, V.Y. Arsenin, Solutions of Ill-posed Problems, Wiley, New York, 1977.

[16] J. Cullum, Numerical differentiation and regularization, SIAM J. Numer. Anal. 8 (1971) 254–265.

[17] A. Kirsch, An Introduction to the Mathematical Theory of Inverse Problems, Springer, New York, 1996.

[18] E.M. Stein, G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, Princeton, NJ, 1971.

[19] R.A. DeVore, G.G. Lorentz, Constructive Approximation, Springer, Berlin, Heidelberg, 1993.

[20] C.K. Chui, An Introduction to Wavelets, Academic Press, Boston, 1992.

[21] H.L. Xiong, T.X. Zhang, Y.S. Moon, A translation- and scale-invariant adaptive wavelet transform, IEEE Trans. Image Process. 9 (12) (2000) 2100–2108.

[22] K.R. Castleman, Digital Image Processing, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1996.

[23] P.M. Fitzpatrick, Advanced Calculus: A Course in Mathematical Analysis, PWS Publishing Company, 1996.

[24] I. Daubechies, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, 1992.

[25] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86 (11) (1998) 2278–2324.

About the Author: LIHUA YANG received the B.S. degree from the Mathematics Department, Hunan Normal University, China, in 1984, the M.S. degree from the Mathematics Department, Beijing Normal University, China, in 1987, and the Ph.D. degree from the Department of Scientific Computing and Computer Application, Sun Yat-sen University, China, in 1995. From 1996 to 1998, he was a postdoctoral fellow in the Institute of Mathematics, Academia Sinica, China. From 1997 to 1999, he worked in the Department of Computer Science of Hong Kong Baptist University as a visiting scholar. From July 2001 to January 2002, he was a visiting scientist at the Center for Pattern Recognition and Machine Intelligence, Concordia University, Canada. He is now a professor at the School of Mathematics and Computing Science, Sun Yat-sen University, China. Dr. Yang is the author or coauthor of more than 30 papers and 1 book on wavelets and pattern recognition. His current research interests include wavelet and time-frequency analysis, pattern recognition and image processing.

About the Author: CHING Y. SUEN received an M.Sc. (Eng.) degree from the University of Hong Kong and a Ph.D. degree from the University of British Columbia, Canada. In 1972, he joined the Department of Computer Science of Concordia University, where he became Professor in 1979 and served as Chairman from 1980 to 1984, and as Associate Dean for Research of the Faculty of Engineering and Computer Science from 1993 to 1997. He has guided/hosted 65 visiting scientists and professors, and supervised 60 doctoral and master’s graduates. Currently he holds the distinguished Concordia Research Chair in Artificial Intelligence and Pattern Recognition, and is the Director of CENPARMI, the Centre for PR & MI. Prof. Suen is the author/editor of 11 books and more than 400 papers on subjects ranging from computer vision and handwriting recognition to expert systems and computational linguistics. He is the founder of “The International Journal of Computer Processing of Oriental Languages” and served as its first Editor-in-Chief for 10 years. Presently he is an Associate Editor of several journals related to pattern recognition. A Fellow of the IEEE, IAPR, and the Academy of Sciences of the Royal Society of Canada, he has served several professional societies as President, Vice-President, or Governor. He is also the founder and chair of several conference series including ICDAR, IWFHR, and VI. He was the General Chair of numerous international conferences, including the International Conference on Computer Processing of Chinese and Oriental Languages held in Toronto in August 1988, the International Conference on Document Analysis and Recognition held in Montreal in August 1995, and the International Conference on Pattern Recognition held in Quebec City in August 2002. Dr. Suen has given 150 seminars at major computer industries and various government and academic institutions. He has been the principal investigator of 25 industrial/government research contracts, and is the recipient of prestigious awards, including the ITAC/NSERC Award from the Information Technology Association of Canada and the Natural Sciences and Engineering Research Council of Canada in 1992 and the Concordia “Research Fellow” award in 1998.

About the Author: TIEN D. BUI received the Bachelor of Engineering degree from the University of Ottawa in 1968, the M.Eng. degree from Carleton University in 1969, and the Ph.D. degree from York University in 1971. Before joining Concordia University he was with the Department of Mechanical Engineering at McGill from 1971 to 1974. He joined the Department of Computer Science at Concordia in 1974, was promoted to full professor in 1984, and was Chair of the Department from 1985 to 1990. In June 1992 he was appointed Associate Vice-Rector Research at the same university. He served in this position until 1996. Dr. Bui has served for a long time on various governing bodies of the University, including its Senate. He has been a member of the Boards of Directors of many research centers and research institutes in Quebec, including the Centre de Recherche Informatique de Montreal Inc. (CRIM), the Institut de Recherche sur les Populations (IREP), the Institut des Sciences Mathematiques (ISM), and GRIAO (a consortium of many research labs on VLSI in Quebec universities), and was a member of the Committee of Vice-Rectors Research in Quebec (CREPUQ). Dr. Bui is currently an Associate Editor of the International Journal of Wavelets, Multiresolution and Information Processing, and the Transactions of the Society for Modeling and Simulation. He has been a member of the organizing committees/program committees of many international conferences, including ICIAR 2005 in Toronto, Canada. He has served as a member of grant selection committees and as an external reviewer for federal and provincial granting agencies. Dr. Bui has received many research grants over the years, and has published more than 120 papers in different areas in scientific journals and conference proceedings. He is co-author of the book Computer Transformation of Digital Images and Patterns, published by World Scientific Publishing Co., 1989. Dr. Bui was an invited professor at the Istituto per le Applicazioni del Calcolo, Rome, Italy, in 1978–1979, and in 1983–1984 he was a visiting professor at the Department of Mechanical Engineering and the Lawrence Berkeley Lab. of the University of California at Berkeley. His current research interests are in wavelet transforms, partial differential equations, mathematical approaches to machine intelligence, pattern recognition, and signal, image and video processing.

About the Author: PING ZHANG graduated from Chongqing University, China, and then became a member of the academic staff at the Department of Automation, Chongqing University, China. In 1998, he was a visiting researcher at the School of Computer Science and Mathematics, Victoria University, Australia. From 1999 to 2001, he was a member of the research staff of the Soft Computing Research Group, School of Electronic and Electrical Engineering, Nanyang Technological University, Singapore. He is now a research assistant and Ph.D. candidate at the Center for Pattern Recognition and Machine Intelligence (CENPARMI), Concordia University, Canada. His research interests include OCR, pattern recognition, image processing and computer vision.