CHAPTER 2 LITERATURE REVIEW
A Neural Network Based Handwritten Character Recognition for Marathi Script
2. Literature Review 16
2.1 Introduction
There is a great need for OCR-related research in Indian languages, despite the many
technical challenges and the lack of a commercial market [1]. With
the spread of computers in organizations and homes, automatic processing of paper
documents is rapidly gaining importance in India [2]. A short description of the
advancements in OCR of Indian scripts including Bangla, Tamil, Telugu, Gurmukhi,
Oriya, Gujarati, Kannada, and Devanagari up to 2002 can be seen in [3]. This chapter
attempts to cover the advancements up to 2010 in printed as well as handwritten
Devanagari script recognition, along with their reported performance. Devanagari is the script
used for writing many official languages in India, such as Hindi, Marathi, Sindhi, Nepali,
Sanskrit, and Konkani, where Marathi is the language spoken in Maharashtra state.
Several other Indian languages like Gujarati, Punjabi, and Bengali use scripts similar to
Devanagari. More than 300 million people use Devanagari script for documentation in
central and northern parts of India [4]. This chapter presents a comprehensive review of
the work carried out in Devanagari OCR. Section 2.2 discusses the literature review in
the field of machine-printed Devanagari script. Section 2.3 presents the review in the
handwritten character recognition field. In both cases, the research carried out at
each stage of OCR, namely pre-processing, feature extraction, and
classification/recognition, is discussed in detail. Section 2.4 puts forth some observations,
and the chapter ends with concluding remarks in Section 2.5.
2.2 Recognition of machine-printed Devanagari script
The work on automatic recognition of printed Devanagari script started in the early
1970s. The efforts were initiated by Sinha [9], [10] at the Indian Institute of
Technology, Kanpur. A syntactic pattern analysis system for Devanagari script
recognition is presented in Sinha’s Ph.D. thesis [9]. Other OCR systems for
printed Devanagari were developed by Palit and Chaudhuri [11] and by Pal and Chaudhuri [12]. A
team comprising Prof. B. B. Chaudhuri, U. Pal, M. Mitra, and U. Garain of the Indian
Statistical Institute, Kolkata, developed the first commercial-level product for printed
Devanagari OCR. The technology was transferred to the Centre for Development
of Advanced Computing (C-DAC) in 2001 for commercialization and is marketed as
“Chitrankan” [3]. The following sections discuss the preprocessing, feature-extraction,
and classification techniques reported so far for machine-printed Devanagari OCR.
2.2.1 Pre-processing and segmentation techniques
When a document is scanned using an optical scanner, a small degree of skew
(tilt) is unavoidable. Skew angle is the angle that the text lines in the digital image make
with the horizontal direction. Skew estimation and correction are important
preprocessing steps of document layout analysis. As far as documents containing
Devanagari text are concerned, the most important characteristic to be considered for
skew estimation is the header line (shirorekha) joining all the characters in a word.
An approach based on the detection of “shirorekha” is proposed by Chaudhuri
and Pal [13] and in [14]. Das and Chanda [15] also proposed a fast and script-
independent skew estimation technique based on mathematical morphology. After layout
preprocessing like skew elimination, the separation of paragraphs, text lines, words, and
characters is to be carried out for effective feature extraction. Text blocks in the
document pages are extracted first, and then, lines and words are separated. Separation of
text lines from text blocks is called line segmentation and separation of words from each
text line is called word segmentation. In [5], projection profiles and the spacing between
words and lines are used to achieve this.
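As an illustration of projection-profile segmentation, the following minimal sketch splits a binary page into text lines at rows that contain no black pixels. It is not the implementation of [5]; the function name segment_lines and the 0/1 list-of-rows image format are assumptions made only for this example.

```python
def segment_lines(image):
    """Return (start_row, end_row) pairs, end exclusive, one per text line.

    `image` is a list of rows; each pixel is 1 (black/ink) or 0 (white).
    """
    profile = [sum(row) for row in image]  # horizontal projection profile
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                      # first inked row of a new line
        elif count == 0 and start is not None:
            lines.append((start, y))       # blank row closes the current line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


page = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
print(segment_lines(page))  # [(1, 3), (5, 6)]
```

The same idea applied to column sums of a single text line separates words, provided the gaps between words are wider than the gaps inside a word.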
Separating words into constituent characters is called character segmentation.
In [5] and [16], characters are segmented from each Devanagari word by removing the
shirorekha (header line). Garain and Chaudhuri [17] presented another technique for
identification and segmentation of touching machine-printed Devanagari characters
based on fuzzy multifactorial analysis. Bansal and Sinha [18] presented a two-pass
algorithm for the segmentation of machine-printed composite characters into their
constituent symbols. The proposed algorithm extensively uses structural properties of the
script. Kompalli et al. [19] used a graph representation method to segment characters
from printed words. In the methodology described by Bansal and Sinha [20], the
segmentation by smearing leaves the overlapping text lines and touching characters
unsegmented. The selection of image regions for further segmentation is based on
statistical analysis of height or width depending on the context. Sharma et al. [21]
presented a rule-based approach for skew correction along with removing insignificant
data like dark band, thumb mark, and specks.
In the method proposed by Kompalli et al. [22], the shirorekha is determined
using projection profile and run length. Once the shirorekha is removed, the top, middle,
and bottom zones are identified easily. Components in top and bottom zones are part of
vowel modifiers. Each of these components is then scaled to a standard size before
feature extraction and classification [23]. To segment touching printed Devanagari
characters on degraded documents, a technique based on fuzzy multifactorial analysis is
proposed in [96], where a predictive algorithm selects the cut points used to
segment touching Devanagari characters.
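The projection-profile step for locating the shirorekha can be sketched as follows. This is a simplified illustration, not the method of [22]: it assumes a binary word image given as a list of 0/1 rows, takes the row with the largest black-pixel count as the header line, and ignores the run-length check and the thickness of real header lines. The names find_shirorekha and split_at_shirorekha are hypothetical.

```python
def find_shirorekha(word):
    """Index of the row with the most black pixels (assumed header line)."""
    profile = [sum(row) for row in word]
    return max(range(len(profile)), key=profile.__getitem__)


def split_at_shirorekha(word):
    """Remove the header row and return (row_index, top_zone, body)."""
    r = find_shirorekha(word)
    top_zone = word[:r]      # upper modifiers lie above the header line
    body = word[r + 1:]      # core characters and lower modifiers
    return r, top_zone, body


word = [
    [0, 0, 1, 0, 0, 0],  # part of an upper modifier
    [1, 1, 1, 1, 1, 1],  # shirorekha
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
]
r, top_zone, body = split_at_shirorekha(word)
print(r, len(top_zone), len(body))  # 1 1 2
```

Separating the middle and bottom zones needs a further rule for the lower boundary of the core characters, which this sketch does not attempt.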
For the binarization of natural scene images containing Devanagari textual
information, an adaptive thresholding technique is proposed in [80]. A water-reservoir-
based analogy is proposed in [39] to extract individual text lines from such documents. It
is necessary to identify the scripts before applying their corresponding recognition
engine. Many techniques on line-wise and word-wise script identification have been
proposed in the literature [79], [82], [84], [86], [91], [95], [98], [106]. In [106], a line-
wise script identification approach is proposed, where different structural features are
used. In [86], appearance-based models are employed for the script identification of the
printed text. These models are based on principal component analysis (PCA) and linear
discriminant analysis (LDA)/Fisher’s linear discriminant (FLD).
Words are identified in multilingual document images using SVM in [95]. In
[98], for word-wise script identification, the document is initially segmented into lines,
and then, the lines are segmented into words. Individual script words are identified from
document images using different topological and structural features. Texture features
have been applied in [84] for script identification. In [79], a technique to identify
Kannada, Hindi, and English text lines from a printed document is presented. To get
higher accuracy, a two-stage approach is proposed for printed script identification in
[82].
2.2.2 Feature extraction techniques
Different features have been used for the recognition of Devanagari characters.
The system described by Sinha and Mahabala [10] for printed Devanagari characters
stores structural descriptions for each symbol of the script in terms of primitives and
their relationships. Sinha [24] also demonstrated how the spatial association among the
constituent symbols of Devanagari script plays an important role in understanding
Devanagari words. In [5], a character is assigned to one of the three groups, namely
basic, modifier, and compound character groups and group-wise features are considered.
Also, it is observed that the compound characters (around 250) in the script occupy only
6% of the text.
The two major features considered for printed Devanagari characters by Jayanthi
et al. [25] are the main horizontal line and the various vertical lines. The third feature tests
whether vertical lines are present on the rightmost side of the character. The other
features are the height-to-width (aspect) ratio of the character, whether the
character is narrow or broad ended, and the number of free ends it has. Govindaraju et al.
[16] used gradient features to represent the characters. Kompalli et al.
[22], [26] used gradient, structural, and concavity (GSC) features for OCR of machine-
printed, multifont Devanagari text. The gradient features were used to classify
segmented images. In the method proposed by Dhurandhar et al. [27], the significant
contours of the printed character are extracted and characterized as a contour set based
on a reference coordinate system. Jawahar et al. [23] used PCA for feature extraction of
printed characters. A word-level matching scheme for searching in printed document
images is proposed by Meshesha and Jawahar [28]. The feature-extraction scheme
extracts local features by scanning vertical strips of the word image and combines them
automatically based on their discriminatory potential. The features considered are word
profiles, moments, and transform-domain representations. In [1], printed Hindi words are
initially identified from bilingual or multilingual documents based on features of the
Devanagari script using SVM. Identified words are then segmented into individual
characters in the next step, where the composite characters are identified and further
segmented based on the structural properties of the script and statistical information.
In [79], a technique to identify Kannada, Hindi, and English text lines from a
printed document is presented. The features used for script identification of machine-
printed text in [82] are 64-D CH features and 400-D gradient features. For the purpose of
indexing in [87], printed Devanagari word images are represented in the form of
geometric feature graphs (GFG). It is a graph-based representation of the features
extracted from the image of the word. A set of features including percentiles, horizontal,
and vertical derivatives of percentiles, angles, correlations, and energy was used for
printed Devanagari character recognition in [94]. LDA was then used to
reduce the dimensionality of the feature set from 81 to 15.
Zernike moments and directional features are used as the features for printed
characters in [95]. Using background and foreground information, a scheme toward the
recognition of Indian complex documents is proposed in [107].
2.2.3 Recognition/Classification techniques
Many classifiers like artificial neural network (ANN) [22], [23], [61], [77],
hidden Markov model (HMM) [42], support vector machine (SVM) [35], [61], modified
quadratic discriminant function (MQDF) [50], [56], etc., have been used for Devanagari
character recognition. Several compound discriminant functions have been derived from
the projection distance (PD) and the MQDF is one of them [35].
Some contemporary techniques like rough sets, fuzzy rules, evolutionary
algorithms, and Mahalanobis and Hausdorff distances [54], [68], [69], [96], etc., are also
used for the recognition purpose of Devanagari characters. A feature-based tree classifier
has been used in [5] to recognize the basic characters. A top-down binary-tree-based
recognition of printed Devanagari characters is proposed by Jayanthi et al. [25], as a binary
tree is one of the fastest decision-making structures for a computer program.
Govindaraju et al. [16] considered 38 characters and 83 frequently occurring
conjunct character classes in a multistage classification approach. Initially, they were
classified into four categories depending on their structural properties. Each category was
then classified using a separate classifier of three-level ANN, where the network is
trained using a standard back propagation algorithm. The recognition of printed
characters in the method proposed by Dhurandhar et al. [27] involves comparing the
contour sets with those in the enrolled database.
In [10], the recognition of printed characters involves a search for primitives on the
labeled pattern based on the stored description. Contextual constraints are also utilized to
arrive at the correct interpretation. In [19], multiple hypotheses are obtained for each
composite character by considering all possible combinations of the classifier results for
the primitive components. Meshesha and Jawahar [28] proposed a dynamic time warping (DTW)
based partial matching algorithm for morphological matching that takes care of word-form
variations at the beginning and at the end of a word.
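For readers unfamiliar with DTW, the sketch below computes the classical full-sequence DTW distance between two one-dimensional feature sequences (for example, column-wise profile values of two word images). It does not implement the partial matching of [28]; the function name dtw_distance and the absolute-difference local cost are assumptions chosen for this example.

```python
def dtw_distance(a, b):
    """Classical DTW distance between two sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])          # local cost
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                 cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # 2.0
```

The first call returns zero because DTW may stretch one sequence so that a repeated value aligns with a single value, which is why the technique tolerates differences in word width.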
Kompalli et al. [26] outlined two different techniques for OCR of machine-
printed, multifont Devanagari text. In [22], neural network classifiers are used for the
recognition of printed characters and words. Jawahar et al. [23] used SVM for
classifying printed characters. In [1], segmented printed characters are recognized using
generalized Hausdorff image comparisons. In [29], the classification of printed
Devanagari characters is done through five filters: 1) coverage of the region of the core
strip; 2) vertical bar feature; 3) horizontal zero crossings; 4) number and position of
vertex points; and 5) moments.
In [94], for printed Devanagari character recognition, each basic glyph and
ligature is modeled with a 14-state left-to-right HMM with a maximum of 256 Gaussians
per HMM. The training of HMM was carried out using the standard expectation–
maximization procedure. For classification of printed characters in [95], generalized
Hausdorff image comparison, nearest neighbor classifier, weighted Euclidean distance,
and hierarchical classification technique were employed.
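The distance underlying Hausdorff image comparison can be sketched as below. This is the classical (not the generalized) Hausdorff distance between two sets of black-pixel coordinates, so it is only an approximation of what is used in [1] and [95]; the function names are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point of `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


template = [(0, 0), (1, 0)]
candidate = [(0, 0), (1, 0), (1, 3)]   # same shape plus one stray pixel
print(hausdorff(template, candidate))  # 3.0
```

A single stray pixel dominates the classical distance, which is one reason generalized (rank-based) variants are preferred for noisy character images.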
General OCR techniques produce poor results on noisy and degraded documents
such as old books or newspapers, photocopied material, and faxed documents [31]. The
quality degradation of old documents and books is mainly due to old print
technology and poor paper quality. As a result, the main difficulty in recognizing
images of such documents is the distortion of characters caused by the spreading of
ink. Imperfections in scanning may also result in noisy images. To handle such degraded
documents, Dhingra et al. [31] presented an approach for the development of a minimum
classification error (MCE) based system. Gabor filters are used to extract the features for
classification directly, as they have been applied successfully to Chinese OCR in [32]. The MCE-
based classifiers provide robustness to the system against random noise by adjusting the
system feature space according to the loss function computed.
Dhingra et al. [31] used a degradation model [33] to simulate the distortions
caused by imperfections in scanning. In [71] and [91], the effectiveness of Gabor
and discrete cosine transform (DCT) features was independently evaluated using nearest
neighbor, linear discriminant, and SVM classifiers for the blind recognition of 11
different printed scripts including Devanagari. The experiments showed
that the Gabor–SVM combination had an edge over the other combinations. The
classification of a machine-printed word to a particular script was done in [82] using
SVM via majority voting over the recognized character components of the word. SVM is
also used in [107] for the recognition of multi-oriented Devanagari characters.
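The majority-voting step can be illustrated with a few lines of code. The sketch assumes that some per-component classifier has already produced one script label for each character component of a word; the function name word_script and the label strings are hypothetical and not taken from [82].

```python
from collections import Counter


def word_script(component_labels):
    """Assign a word to the script predicted for most of its components."""
    if not component_labels:
        raise ValueError("a word needs at least one component label")
    return Counter(component_labels).most_common(1)[0][0]


labels = ["Devanagari", "Devanagari", "Roman", "Devanagari"]
print(word_script(labels))  # Devanagari
```

Ties are not treated specially here; a real system would need an explicit tie-breaking rule, for example the summed classifier confidence per script.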
Only a few works are reported on post-processing for Devanagari OCR.
Bansal and Sinha [30] described a method for the correction of optically read printed
character strings using a Hindi word dictionary. Pal and Chaudhuri [12], [99] also
proposed a suffix- and prefix-based error correction technique, which can take care of
different inflectional languages. Only a few works are reported regarding document
retrieval and word spotting. In [88], a search system for retrieval of relevant documents
from a large collection of document images is presented. A DTW-based partial matching
scheme is employed to group together similar words for the indexing purpose. Upper and
lower word profiles and projection and transition profiles are used as
features for word representation. Two different approaches are proposed for spotting
words in images of printed Sanskrit documents in [97]. In the first approach, a block
adjacency graph (BAG) based scheme for word recognition is used. In the second
approach, a moment-based word matching technique, which maintains a script invariant
representation of all word images, is employed. Word matching is then carried out using
cosine similarity.
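Cosine similarity between two feature vectors is straightforward to compute. In the sketch below the vectors stand in for moment-based word descriptors; their values are made up for illustration and the function name is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


query = [1.0, 2.0, 3.0]       # hypothetical descriptor of the query word
candidate = [2.0, 4.0, 6.0]   # same direction, different magnitude
print(round(cosine_similarity(query, candidate), 6))  # 1.0
```

Because the measure depends only on direction, two descriptors that differ by a constant scale factor are treated as identical.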
A shape-code-based word-spotting matching technique for retrieval of
multilingual Indian documents is proposed by Tarafdar et al. [100], where different
primitive shape codes like 1) zonal information of extreme points; 2) vertical-shape-
based feature; 3) crossing count (with respect to the position of vertical bar); 4) loop
shape and position; and 5) background information, etc., are used. An inexact matching
technique is employed to measure the similarity for possible spotting.
The details of many printed Devanagari character and word recognition systems
are summarized in Tables 2.1 and 2.2, respectively. It is evident from Table 2.1
that for printed Devanagari characters, the method proposed by Dhingra et al. [31] is
superior to other methods in terms of recognition accuracy. For printed word recognition,
the method proposed by Kompalli et al. [19] has the highest accuracy, as shown in Table
2.2.
Table 2.1 Details of printed Devanagari character recognition systems
Method                    Feature                     Classifier                              Data set (size)  Accuracy (%)
------------------------  --------------------------  --------------------------------------  ---------------  ------------
Govindaraju et al. [16]   Gradient                    Neural networks                         4,506            84
Kompalli et al. [22]      GSC                         Neural networks                         32,413           84.77
Bansal et al. [20]        Statistical and structural  Statistical knowledge sources           Unspecified      87
Huanfeng Ma et al. [1]    Structural and statistical  Hausdorff image comparison              2,727            88.24
Sinha et al. [10]         Structural                  Syntactic pattern recognition           Unspecified      90
Natarajan et al. [94]     Derivatives                 HMM                                     21,982           91.3
Bansal et al. [29]        Filters                     Five filters                            Unspecified      93
Dhurandhar et al. [27]    Contours                    Interpolation                           546              93.03
Kompalli et al. [26]      GSC                         K-nearest neighbor                      9,297            95
Jayanthi et al. [25]      Statistical                 Binary tree                             4,863            95.08
Chaudhuri et al. [5]      Statistical                 Tree classifier and template matching   10,000           95.08
Kompalli et al. [19]      SFSA                        Stochastic finite state automaton       10,606           96
Jawahar et al. [23]       PCA                         Support vector machine                  200,000          96.7
Dhingra et al. [31]       Gabor                       MCE                                     30,000           98.5
Table 2.2 Details of printed Devanagari word recognition systems
Method                    Feature                     Classifier                              Data set (size)  Accuracy (%)
------------------------  --------------------------  --------------------------------------  ---------------  ------------
Govindaraju et al. [16]   Gradient                    Neural networks                         4,506            53
Kompalli et al. [26]      GSC                         K-nearest neighbor                      1,882            58.51
Kompalli et al. [22]      GSC                         Neural networks                         14,353           61.8
Huanfeng Ma et al. [1]    Statistical and structural  Hausdorff image comparison              578              66.78
Chaudhuri et al. [5]      Statistical                 Tree classifier and template matching   10,000           83.67
Kompalli et al. [19]      SFSA                        Stochastic finite state automaton       10,606           87
2.3 Recognition of handwritten Devanagari script
Research on Indian handwritten character recognition has received increased attention
only in recent years, although the first research report on offline
handwritten Devanagari characters was published in 1977 [34]. Many approaches have
been proposed toward handwritten Devanagari numeral, character, and word recognition
in the past decade [35].
2.3.1 Pre-processing and segmentation techniques
Some handwritten documents (e.g., Indian postal documents) may contain
non-text parts (such as a stamp or seal). Before such a document is recognized, its
text and non-text parts need to be separated. Many techniques [37], [38] based on connected
component analysis, the run-length smoothing approach (RLSA), and morphological
operations are used for this.
For converting gray-scale images to binary, many techniques are employed in the
literature. In [38], images are binarized using a histogram based global binarization
algorithm [39]. In [41] and [42], the Devanagari word image is first smoothed using a
median filter, and then, binarized by Otsu’s [43] thresholding method. The binarized
image is then smoothed using a median filter. Both local and global methods are used in
some of the works [37]. Noise removal of the document is also an important step toward
the recognition. Bajaj et al. [44] used a median filtering-based approach for noise
removal from the images of handwritten Devanagari numerals.
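Otsu's method selects the gray level that maximizes the between-class variance of the two pixel classes it creates. The following is a minimal sketch of that idea on a flat list of 8-bit gray values; it omits the median smoothing used in [41] and [42], and the function names and sample values are assumptions for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, t):
    """Mark dark pixels (ink) as 1 and bright pixels (paper) as 0."""
    return [1 if p <= t else 0 for p in pixels]


gray = [12, 15, 10, 200, 210, 205, 14, 198]
t = otsu_threshold(gray)
print(binarize(gray, t))  # [1, 1, 1, 0, 0, 0, 1, 0]
```

The between-class term here is the usual variance multiplied by the constant squared pixel count, which does not change the location of the maximum.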
For skew angle detection of handwritten Devanagari words and characters, an
extension to the work in [13] is proposed in [67]. The method treats shirorekha (header
line) as an inherent feature of Devanagari script. The authors assume that a
handwritten Devanagari word never has a perfectly straight shirorekha and hence
consider the straightest part of the shirorekha for skew determination. A heuristic
approach is applied to detect the skew angle. Initially, the document is scanned
from all four sides to obtain the coordinates of the pixels encountered along the
demarcation of the word boundaries. First-order differential of the coordinate
information gives the spatial-level curve. Various levels are then clustered using the
nearest neighborhood algorithm to form various regions. The biggest region is treated as
the region of importance. The skew angle is then calculated through a heuristic weight
assignment scheme.
In [41], mathematical morphological operations, namely erosion and dilation
were used to detect the shirorekha of each Devanagari word. With the assumption that
the shirorekha is piecewise linear, the skew correction of the word is performed after
detecting the shirorekha. The skew angle is found using eigenvectors of the scatter
matrix of each component (piece) of shirorekha. For correcting the skew of the word, it
is again divided into slabs of a particular number of columns. Each slab is pushed up or
down depending on the skew angle of the shirorekha component of that particular slab.
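The eigenvector step can be sketched as follows. For a 2x2 scatter matrix the direction of the dominant eigenvector has a closed form, so no numerical eigen-solver is needed. The sketch assumes the pixels of one shirorekha piece are given as (x, y) coordinates; it is an illustration of the principle, not the code of [41], and the function name skew_angle is hypothetical.

```python
import math


def skew_angle(points):
    """Angle (radians) of the dominant eigenvector of the scatter matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Closed-form orientation of the principal axis of [[sxx, sxy], [sxy, syy]]
    return 0.5 * math.atan2(2 * sxy, sxx - syy)


# Pixels of a header-line piece rising by 0.1 pixel per column
piece = [(x, 0.1 * x) for x in range(10)]
print(round(math.degrees(skew_angle(piece)), 2))  # 5.71
```

In image coordinates the y axis usually points downward, so the sign of the returned angle has to be interpreted accordingly before a slab is pushed up or down.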
Text-line segmentation is an important task in the automatic recognition of offline
handwritten text documents. Variations in interline distance, inconsistent
baseline skew, and touching and overlapping text lines make this task more
complex. The correctness of text-line segmentation directly affects the
accuracy of word/character segmentation, which consequently changes the accuracy of
word/character recognition.
Several techniques for text-line segmentation are reported in the literature [101],
[102]. The techniques may be categorized into four groups, which are as follows: 1)
projection-profile-based techniques; 2) Hough-transform based techniques; 3) smearing
techniques; and 4) methods based on thinning operation. As a conventional technique for
text-line segmentation, global horizontal projection analysis of black pixels has been
utilized for line segmentation in printed documents [3]. However, this technique cannot
be used directly on unconstrained handwritten text documents due to text-line skew
variability, inconsistent interline distances, and overlapping and touching components of
two consecutive text lines. Partial or piecewise horizontal projection analysis of black
pixels is employed by many researchers to separate handwritten text lines of different
languages [60], [103], [104].
In the piecewise horizontal projection technique, a text-page image is initially
decomposed into a number of vertical stripes. The positions of potential piecewise
separating lines (PSL) are obtained for each stripe using partial horizontal projection on
each stripe. To compute the PSLs, the row-wise sum of all black pixels in a stripe is
calculated. A row where this sum is zero is a PSL. The extra pieces of lines are
removed based on some heuristic rules. The potential separating lines are then connected
to achieve complete separating lines for all respective text lines of the image [40], [104].
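The first step of the piecewise technique, finding the candidate separating rows in each vertical stripe, can be sketched as below. The heuristic pruning and the joining of the pieces into complete separating lines are not shown; the function name, the stripe width, and the tiny test image are assumptions for illustration.

```python
def piecewise_separating_rows(image, stripe_width):
    """For each vertical stripe, list the rows that contain no black pixel."""
    width = len(image[0])
    result = []
    for x0 in range(0, width, stripe_width):
        rows = [y for y, row in enumerate(image)
                if sum(row[x0:x0 + stripe_width]) == 0]
        result.append(rows)
    return result


# Two "skewed" lines: every full row contains ink, so a global
# horizontal projection finds no separating row at all.
page = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
print(piecewise_separating_rows(page, 2))  # [[1, 3], [0, 2]]
```

The example shows why stripes help: each stripe has clear blank rows even though the global projection of the page has none.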
For line segmentation of handwritten Devanagari text in [83], a method based on
header line detection, base line detection, and contour-following technique is proposed.
The proposed method is free from preprocessing techniques like skew correction,
thinning, and noise removal. Roy et al. [105] proposed morphology based handwritten
line segmentation using foreground and background information. Hanmandlu et al. [59]
used a structural approach for segmentation of handwritten Hindi text. In [81], a dual
method based on interdependency between text line and interline gap is proposed for the
identification of handwritten Devanagari text. The method draws curves simultaneously
through the text and interline gap points found from strip-wise histogram peaks and
inter-peak valleys. The curves stabilize after several iterations and then define the final
text lines and interline gaps. Also, because of the upper and lower modifiers of Devanagari
text, consecutive lines often touch each other, and more research is
needed to solve this problem for Devanagari script. After a text line is segmented,
words are separated from it.
Most of the existing techniques use the vertical projection profile for this purpose [3],
[60]. For the segmentation of characters from words, there are two types of segmentation
schemes: recognition-free and recognition-based segmentation. In recognition-free
segmentation, a character string is divided into segments by rules, without
recognition. In recognition-based segmentation, candidate segmentation points are
verified by a recognizer. In the past years, many algorithms for the segmentation of
character strings have been proposed [3], [41], [59].
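The vertical-projection word separation mentioned at the start of this paragraph can be sketched as follows. The sketch assumes a binary text-line image as a list of 0/1 rows and treats a run of at least min_gap empty columns as a word boundary; the function name and the min_gap parameter are hypothetical, and real systems usually derive the gap threshold from the line itself.

```python
def segment_words(line, min_gap=2):
    """Return (start_col, end_col) pairs, end exclusive, one per word."""
    width = len(line[0])
    profile = [sum(row[x] for row in line) for x in range(width)]
    words, start, gap = [], None, 0
    for x, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = x          # first inked column of a new word
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:     # gap wide enough to end the word
                words.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:          # word running to the right edge
        words.append((start, width - gap))
    return words


text_line = [[1, 1, 0, 1, 0, 0, 0, 1, 1, 0]]
print(segment_words(text_line))  # [(0, 4), (7, 9)]
```

The single empty column at index 2 is narrower than min_gap, so it is kept inside the first word as an intra-word gap.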
One class of approaches uses contour features for segmentation. By analyzing the
contour of a connected pattern, the corresponding valley and mountain points are
derived. A cutting path is then decided to segment the connected pattern by joining
valley and mountain points. In general, contour-based methods do not provide accurate
results. Some researchers use profile features for segmentation. Profile-based methods
fail when the handwriting is strongly skewed or overlapped. A multiagent-based
approach to the segmentation of touching handwritten Hindi numerals is presented in
[65]. The first agent locates possible touching points based on the thickness of the handwriting.
The second agent works on the thinned image to locate possible touching points based on the
rules that govern the connection of different segments to form digits. The two agents
then negotiate and try to agree on the actual touching points. The distortions in
handwritten Devanagari characters are removed in [72] using a thickening process
followed by thinning and pruning operations.
Hanmandlu et al. [59] attempt to segment handwritten Devanagari
words into constituent characters and modifiers. Initially, the handwritten text is
segmented into lines and words using the technique given in [60]. The segmentation of
each word includes its separation into characters, lower modifiers, upper modifiers, and
separation of compound (composite) characters into consonants and half consonants.
Initially, the header line is located and removed after correcting the skew. Analysis of
horizontal pixel density in the top half of a word gives the location of the header line.
After removing the header line, upper modifiers and characters below the header line are
separated using connected component analysis. The characters below the header line are
analyzed further for the presence of lower modifiers. This is done by horizontally
scanning the thinned image from top to bottom. A window-based approach is used to
find whether the segmented character is a composite one or not. The segmentation of
characters from a Devanagari word in [41] is based on the assumption that a shirorekha
(header line) is always present in a word.
Some works are reported on script identification from handwritten documents. In [84],
it is done using texture features. The texture features are extracted based on the
co-occurrence histograms of wavelet-decomposed images, which capture information
about the relationship between each high-frequency subband and the corresponding low-
frequency subband of the transformed image.
The correlation between the subbands at the same resolution is significant in
characterizing a texture. For script identification in handwritten documents in [85],
denoising, thinning, pruning, m-connectivity, and text size normalization are done in
sequence. Afterward, multichannel Gabor filtering is used to extract texture features that
characterize the visual appearance of the document image. Some documents contain
both machine-printed and handwritten text.
In [92], a scheme for classifying machine-printed and handwritten text in Devanagari and
Bangla is presented. The scheme is based on both the structural and statistical features of
printed and handwritten text lines.
2.3.2 Feature extraction techniques
Even though researchers test different features, statistical and structural features
are mostly used for handwritten numeral/character recognition. The feature-extraction
methods in [8] for handwritten Devanagari numeral recognition are based on both
statistical and structural features. Sethi and Chatterjee [34] described handwritten
Devanagari numeral recognition based on a structural approach. The primitives used are
horizontal and vertical line segments, right and left slants. For handwritten numerals in
[2], a wavelet filter-based multiresolution analysis of input numeral images is carried out
in a cascaded manner. The authors note that the Daubechies wavelet is well suited to
digital computation because its basis functions are defined using only multiplication and
addition operators, with no derivatives or integrals involved. They considered
high-level features based on contour representations of all four frequency
components (high–high, high–low, low–high, and low–low) of the wavelet-filtered
image.
Bajaj et al. [44] represented each handwritten Devanagari numeral using three
types of features: 1) density features; 2) moment features of right, left, upper, and lower
profile curves; and 3) descriptive component features. For extracting the features, a box
approach is proposed by Hanmandlu et al. [45], [46] for handwritten numbers, which
requires a spatial division of the numeral image into boxes. Ramteke and Mehrotra [47]
evaluated the performance of various techniques based on moment invariants on
handwritten Devanagari numerals. The features that have been extracted are based on
moments, image partition, principal component axes, correlation coefficient, and
perturbed moments. Thinning-based features are also used in Devanagari handwritten
character recognition. From the thinned images of handwritten Hindi numerals, three
different types of feature points, namely end, branch, and cross points are extracted first
in [62]. The strokes between these feature points and their cavity information are also
used for recognition. In [75], translation and scale invariance for handwritten
Devanagari numerals is achieved using simple geometric moments. Higher-order
Zernike moments are also used in the same work as shape descriptors.
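As a hedged illustration of how geometric moments yield translation and scale invariance, the sketch below computes normalized central moments of a binary image. It is the textbook construction, not necessarily the exact formulation of [75]; the function name is hypothetical. Centering on the centroid removes translation, and dividing by a power of the zeroth moment removes scale (only approximately on a pixel grid).

```python
def normalized_central_moment(image, p, q):
    """Normalized central moment eta_pq of a binary image (list of 0/1 rows)."""
    pixels = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    m00 = len(pixels)  # zeroth moment: number of foreground pixels
    if m00 == 0:
        raise ValueError("image has no foreground pixels")
    # Centroid; measuring from it makes the moment translation invariant.
    cx = sum(x for x, _ in pixels) / m00
    cy = sum(y for _, y in pixels) / m00
    mu_pq = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
    # Normalizing by m00^(1 + (p+q)/2) makes it (approximately) scale invariant.
    return mu_pq / m00 ** (1 + (p + q) / 2)
```

The same numeral drawn at two positions gives the same eta_pq up to floating-point error, and drawn at two sizes gives nearly the same value.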
The feature used for classifying handwritten digits in [90] is the quad-tree-based
longest run feature (QTLR). Chain code and gradient-based features are used for
Devanagari numeral recognition in [56]. Fourier descriptors (FD) capable of representing
shapes have been used as features in [57] for handwritten numerals. Sixty-four-
dimensional FD invariant to rotation, scale, and translation represent each handwritten
numeral. Kumar [48] compared performances of five feature-extraction methods on
handwritten characters. The various features covered are Kirsch directional edges,
distance transform, chain code, gradient, and directional distance distribution. The
experiments show that, with SVM classifiers, Kirsch directional edges perform worst and
the gradient feature performs best. With multilayer perceptrons (MLP), the gradient and
directional distance distribution features perform almost equally well. The
chain-code-based feature performs better than Kirsch directional edges and the distance
transform. A new feature is also proposed in the paper, where the gradient direction is
quantized into four directional levels and each gradient map is divided into 4 × 4
regions. This is combined with total distances in four directions and neighborhood pixels
weight. Kaur [49] used Zernike moments along with zoning for feature extraction from
handwritten Devanagari characters.
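The gradient part of the feature proposed in [48] can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: central differences stand in for the gradient operator, the four levels are taken to be orientations near 0, 45, 90, and 135 degrees, and the distance and neighborhood-weight components are omitted. The function name is hypothetical.

```python
import math

def gradient_direction_features(image, levels=4, grid=4):
    """Gradient magnitude accumulated per direction level and per region.

    Returns levels * grid * grid values (64 for the defaults).
    """
    rows, cols = len(image), len(image[0])
    features = [0.0] * (levels * grid * grid)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # central differences
            gy = image[r + 1][c] - image[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the direction into [0, pi) and round to the nearest level.
            angle = math.atan2(gy, gx) % math.pi
            step = math.pi / levels
            level = int((angle + step / 2) / step) % levels
            # Index of the grid cell that contains this pixel.
            region = (r * grid // rows) * grid + (c * grid // cols)
            features[level * grid * grid + region] += magnitude
    return features
```

Each block of 16 consecutive values is one direction map laid out over the 4 × 4 regions, so a vertical stroke contributes mainly to the first block and a horizontal stroke to the third.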
The application of moments as a feature extractor provides a method for
describing the properties of an object in terms of its area, position, orientation, and other
precisely defined parameters. For the recognition of handwritten Devanagari
non-compound characters, shadow features and CH features are computed in [61]. In [63],
the handwritten Devanagari characters are represented using chain-code features. In [68],
the features are extracted from handwritten Devanagari characters using a box approach
presented in [69]. Each character image is divided into 24 boxes. The features are
represented using normalized vector distances for each character. The shirorekha and
spine in a handwritten character are detected using a differential-distance-based
technique in [72]. Features such as crossing points, end points, and corners are also
considered in the same work.
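A minimal sketch of a box-based feature with 24 boxes is given below. The details are assumptions made for illustration only: a 6 × 4 grid, distances measured from the top-left corner of each box, and normalization by the box diagonal. The formulation in [68], [69] may differ, and the function name is hypothetical.

```python
import math

def box_features(image, box_rows=6, box_cols=4):
    """Mean normalized distance of foreground pixels from each box origin.

    image is a binary character image (list of 0/1 rows) whose height and
    width are assumed to be multiples of box_rows and box_cols.
    """
    rows, cols = len(image), len(image[0])
    bh, bw = rows // box_rows, cols // box_cols  # box height and width
    diagonal = math.hypot(bh, bw)
    features = []
    for br in range(box_rows):
        for bc in range(box_cols):
            distances = [
                math.hypot(r, c)
                for r in range(bh) for c in range(bw)
                if image[br * bh + r][bc * bw + c]
            ]
            mean = sum(distances) / len(distances) if distances else 0.0
            features.append(mean / diagonal)  # empty boxes contribute 0
    return features
```

The result is one value per box, so the default grid gives a 24-dimensional feature vector for each character image.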
In [73], a feature-extraction technique to improve the recognition results of two
similar shaped handwritten characters is discussed. The technique is based on Fisher
ratio (F-ratio), a statistical measure defined by the ratio of the between-class variance to
the within-class variance. The main features for handwritten Devanagari characters
considered in [77] are CH features, four-side-view-based features, and shadow-based
features. Features used by Sharma et al. [50] for handwritten Devanagari characters are
obtained from the directional chain code information of the contour points of the
characters. The bounding box of a character is segmented into blocks and a CH is
computed in each of the blocks. Based on the CH, they have used 64-D features for
recognition. The features used by Pal et al. [51] for handwritten characters are mainly
based on directional information obtained from the arc tangent of the gradient and
Gaussian filter.
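The F-ratio mentioned above for [73] can be made concrete with a short sketch for a single scalar feature. It assumes that the between-class variance is the variance of the class means and that the within-class variance is the average of the per-class variances; [73] may weight the terms differently. The function name is hypothetical.

```python
from statistics import mean, pvariance

def fisher_ratio(samples_by_class):
    """F-ratio of one feature: between-class over within-class variance.

    samples_by_class is a list with one list of feature values per class.
    """
    class_means = [mean(samples) for samples in samples_by_class]
    between = pvariance(class_means)
    within = mean([pvariance(samples) for samples in samples_by_class])
    if within == 0:
        return float("inf")  # classes have no internal spread
    return between / within
```

A feature whose values separate two similar-shaped characters well gets a large ratio, so ranking features by this value is one way to select the most discriminative ones.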
In [35], a comparative study of Devanagari handwritten character recognition
using 12 different classifiers and four sets of features is presented. Feature sets used in
the classifiers are computed based on curvature and gradient information obtained from
binary as well as gray-scale images. The histogram of chain-code directions in the
image-strips scanned from left to right by a sliding window is used by Shaw et al. [42] as
a feature vector for handwritten Devanagari word recognition.
2.3.3 Recognition/Classification techniques
A decision tree is employed to perform the analysis of hand printed Devanagari
numerals by Sethi and Chatterjee [34] depending on the presence/absence of primitives
like horizontal and vertical line segments, right and left slants and their interconnections.
A similar strategy is applied to the constrained hand printed characters in [52].
Bhattacharya and Chaudhuri [2] use a distinct MLP classifier at each stage of their
recognition scheme for handwritten numerals. Each such classifier either classifies or
rejects an input numeral at the corresponding resolution level. If the MLP classifier at a
coarser resolution level rejects a numeral, the classifier of the following stage attempts to
recognize it at the next higher resolution level. Finally, if rejection still occurs at the
highest resolution level, the output vector of each of these three MLP classifiers is
transformed into a kind of likelihood measurement. Another MLP classifier has been
used to obtain the final decision by combining these three likelihood measurement
vectors. Patil and Sontakke [53] proposed a general fuzzy hyperline segment neural
network for rotation, scale, and translation invariant handwritten numeral recognition. It
combines supervised and unsupervised learning in a single algorithm so that it can be
used for pure classification, pure clustering, and hybrid classification/clustering. Bajaj et
al. [44] combined decisions of multiple classifiers for handwritten Devanagari numerals.
A neural network-based classification scheme is designed for this task. Three different
neural classifiers have been used for classification. The outputs of the three classifiers are
combined using a connectionist scheme.
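The multistage scheme of [2] described above follows a classify-or-reject pattern that can be sketched as below. The threshold value, the callable interface of the classifiers, and the function name are hypothetical. In particular, the final combiner here is a plain average of the stage outputs, swapped in for the additional MLP that [2] uses.

```python
def cascade_classify(views, classifiers, threshold=0.9, combine=None):
    """Classify with a coarse-to-fine cascade of reject-capable classifiers.

    views: the input at each resolution level, coarsest first.
    classifiers: one callable per level, returning a list of class scores.
    Returns (label, stage); stage is None when every level rejected and
    the label comes from the combined outputs.
    """
    outputs = []
    for stage, (view, classify) in enumerate(zip(views, classifiers), start=1):
        scores = classify(view)
        outputs.append(scores)
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            return best, stage  # confident enough: accept at this level
    if combine is None:
        # Assumption: average the score vectors instead of a trained combiner.
        combine = lambda outs: [sum(col) / len(outs) for col in zip(*outs)]
    combined = combine(outputs)
    return max(range(len(combined)), key=combined.__getitem__), None
```

Most inputs are settled cheaply at the coarse level, and only ambiguous numerals reach the finer levels and the final combination.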
Hanmandlu et al. [46] proposed a fuzzy model-based scheme for recognition of
handwritten Devanagari numerals by representing them in the form of exponential
membership functions, which serve as a fuzzy model. Recognition is performed by modifying
the exponential membership functions fitted to the fuzzy sets. These fuzzy sets are
derived from features consisting of normalized distances obtained using the Box
approach. The Gaussian distribution function has been adopted by Ramteke and
Mehrotra [47] for classification of handwritten numerals. In [76], a method is proposed
based on cubic spline interpolation for determining smooth and continuous edges in the
images of handwritten Devanagari numerals.
In [90], a Hough transform-based technique is used to localize the postal
code blocks in structured postal documents with a defined address block region.
Isolated handwritten digits are then extracted from the localized postal-code region. In
[58], a system is proposed to classify handwritten Devanagari characters into several
groups based on similarity measure. The header line (shirorekha) is located based on end
points and pixels positions in the top half part of the character image. The header line is
removed from the images of every character before coarse classification. Three different
classifiers, namely nearest neighbor, k-NN, and SVM were tested independently to
recognize handwritten Devanagari numerals in [57]. The performance of SVM in terms
of accuracy was better than the other two classifiers.
A syntactic representation (SR) of features is used in [62] for handwritten
numeral recognition. This representation is matched against the set of prototype SRs of
handwritten numerals for a possible match. Edge direction histogram features are used
along with PCA for enhancing recognition accuracies of handwritten Devanagari
numerals in [76]. Recognition of handwritten numeric postal codes in a multiscript
environment is presented in [90]. Similar shaped digit patterns of four scripts, namely
Latin, Devanagari, Bangla, and Urdu are grouped in 25 clusters. A script-independent
pattern SVM-based classifier is designed to classify the numeric postal codes into one of
these 25 clusters. Based on the classification decisions, a rule-based script inference
engine is developed to infer about the script of the numeric postal code. One of the four
script-specific SVM-based classifiers is then invoked to recognize the digits of the
corresponding script.
The work in [64] explores the potentiality of a clonal selection algorithm (CSA)
in recognition. In particular, a retraining scheme for the CSA is proposed for better
recognition of handwritten Devanagari numerals. A size-normalized binary image matrix is
used as the feature map. In [49], the feature vector is fed as input to a
feed-forward back-propagation neural network for the classification of
handwritten Devanagari characters. Kumar [48] compared the performances of SVM and
MLP classifiers with six different features on handwritten characters and found that the
performance of the SVM classifier was superior to that of MLP in all six cases. However,
the classification time required for SVM was greater than that for MLP. Sharma et al. [50]
proposed a quadratic classifier-based scheme for the recognition of handwritten
characters. A modified quadratic classifier is applied by Pal et al. [51] on the features of
handwritten characters for recognition.
In [55], two classifiers are combined to get higher accuracy of character
recognition with the same features. A combination of SVM and MQDF is used for this
purpose. A comparative study was done by Pal et al. [35] on Devanagari handwritten
character recognition using 12 different classifiers like PD, subspace method (SM),
linear discriminant function (LDF), SVM, MQDF, mirror image learning (MIL),
Euclidean distance (ED), nearest neighbor, k-NN, modified PD (MPD), compound PD
(CPD), and compound MQDF (CMQDF). From the experiments, they noted that the MIL
classifier gave the best results and ED the worst results among all the 12
classifiers considered. A divide-and-conquer strategy is adopted in [58] for the
recognition of handwritten Devanagari characters, where each category is divided into
subcategories based on structural properties to make the classification process simpler.
The subcategories considered in the paper are connected characters, non-connected
characters, end-bar characters, middle-bar characters, without-bar characters, end-bar
characters with one closed loop, end-bar characters with two closed loops, and
without-bar characters with loop. This coarse classification is done by identifying the
presence and position of vertical line segments and closed loops.
In [59], the top modifiers of Devanagari script are classified into one touching-
point and two touching-point modifiers by checking whether a modifier touches the
header line at two positions or not. Further classification of two touching point modifiers
is done by analyzing the core strip of the word. Two MLPs and a minimum edit distance
(MED) method are used for classification of handwritten Devanagari non-compound
characters in [61]. In the first stage of classification, characters with distinct shapes are
classified using two MLPs. Shadow features are used for one MLP and CH features are
used for the other MLP for classification. In the second stage of classification, confused
characters having similar shapes are classified using a MED method. This method makes
use of corners detected in a character image using a modified Harris corner detection
technique. The work reported in [72] presents a two-stage classification approach for
handwritten Devanagari characters. The first stage uses structural properties like
shirorekha and spine in a character. The second stage exploits intersection features of
characters, which are then fed to a feed forward neural network (FFNN) for further
classification.
In [77], three MLPs are designed for three types of features. Each MLP is trained
with a back-propagation learning algorithm. The results of the three MLPs are then combined
using a weighted majority scheme. The work reported in [63] discusses the use of regular
expressions (RE) in handwritten Devanagari character recognition, where a handwritten
character is converted into an encoded string based on chain-code features. The REs of
stored templates are then matched against it. Rejected samples are then sent to a MED classifier
for recognition.
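The weighted majority combination used in [77] can be sketched as follows. The weights are hypothetical; one plausible choice, assumed here and not stated in the source, is the validation accuracy of each MLP. The function name is also hypothetical.

```python
from collections import defaultdict

def weighted_majority(predictions, weights):
    """Return the label with the largest total weight of votes.

    predictions: the label chosen by each classifier.
    weights: one non-negative weight per classifier (assumed given).
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

With equal weights this reduces to a plain majority vote, while unequal weights let one reliable classifier outvote two weaker ones.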
An elastic matching (EM) technique based on an eigen deformation (ED) for
recognition of handwritten Devanagari characters is proposed in [66]. The method
consists of two phases: a training phase for the estimation of EDs, and a recognition
process using the estimated EDs. EDs are the intrinsic deformations within each
character category and can be estimated by the PCA of actual deformations collected
through the EM.
A coarse classification is done in [68] prior to recognition, where the handwritten
Devanagari characters are classified into three major categories, namely end-bar
characters, middle-bar characters, and characters without any bar, based on the presence
of a vertical bar. The recognition of handwritten characters in [68] is based on the
modified exponential membership function fitted to the fuzzy sets derived from the
features of the characters. A Reuse Policy that provides guidance from the past policies
is also utilized in the paper to improve the speed of the learning process. Not much work
is reported toward handwritten character string (word) recognition of Devanagari.
A segmentation-based approach to handwritten Devanagari word recognition is
proposed by Shaw et al. [41]. On the basis of the header line, a word image is segmented
into pseudo characters. HMMs are proposed to recognize the pseudo characters. The
word-level recognition is done on the basis of string edit distance.
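String edit distance, used above for word-level matching and in the MED classifiers of [61] and [63], can be sketched with the standard Levenshtein dynamic program. Unit costs for insertion, deletion, and substitution are an assumption; the cited works may use other costs. The lexicon lookup helper and both function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, using unit costs."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if item_a == item_b else 1)
            current.append(min(previous[j] + 1,      # delete item_a
                               current[j - 1] + 1,   # insert item_b
                               substitution))
        previous = current
    return previous[-1]

def closest_word(recognized, lexicon):
    """Pick the lexicon entry nearest to the recognized label sequence."""
    return min(lexicon, key=lambda word: edit_distance(recognized, word))
```

Because the functions work on any sequence, a word can be represented as a tuple of recognized pseudo-character labels rather than a plain string.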
A continuous density HMM is also proposed by Shaw et al. [42] to recognize
handwritten word images. The states of the HMM are not determined a priori, but are
determined automatically based on a database of handwritten word images. An HMM is
constructed for each word. To classify an unknown word image, its class conditional
probability for each HMM is computed. The class that gives the highest such probability is
finally selected. In [74], a dynamic programming (DP) based technique is proposed for
pin code string recognition. Initially, the pin code string is segmented into primitives.
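The selection rule described above for [42], one HMM per word and the class with the highest likelihood wins, can be sketched with the scaled forward algorithm. For brevity the sketch assumes discrete observation symbols instead of the continuous densities used in [42], and all names and model parameters are hypothetical.

```python
import math

def log_likelihood(observations, start, trans, emit):
    """Log P(observations | model) for a discrete-observation HMM.

    start[s]: initial probability of state s.
    trans[p][s]: probability of moving from state p to state s.
    emit[s][o]: probability that state s emits symbol o.
    observations must be a non-empty sequence of symbol indices.
    """
    n = len(start)
    alpha = [start[s] * emit[s][observations[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n))
                * emit[s][observations[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob

def classify_word(observations, models):
    """Return the key of the model that gives the highest log-likelihood."""
    return max(models, key=lambda w: log_likelihood(observations, *models[w]))
```

Here models maps each word label to a (start, trans, emit) tuple, so adding a word to the vocabulary amounts to adding one more trained model.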
Table 2.3 Details of handwritten Devanagari numeral recognition systems
Method                    Feature                       Classifier                    Data set (size)  Accuracy (%)
Bajaj et al. [44]         Statistical                   Neural networks               2,460            89.68
Ramteke et al. [47]       Moment invariants             Gaussian distribution         2,000            92
Lakshmi et al. [76]       Gradient                      PCA                           9,800            94.25
Hanmandlu et al. [46]     Box approach                  Fuzzy model                   3,500            95
Hanmandlu et al. [54]     Box approach                  Bacterial foraging            3,500            96
Elnagar et al. [62]       Structural                    Matching SR                   1,500            96
Garain et al. [64]        Binary image                  Clonal selection              12,000           96
Basu et al. [90]          QTLR                          SVM                           3,000            97
Rajput et al. [57]        Fourier descriptor            SVM                           13,000           97.85
Pal et al. [74]           Gradient                      MQDF                          23,340           98.41
Sharma et al. [50]        Chain code                    Quadratic                     22,556           98.86
Bhattacharya et al. [2]   Wavelet                       MLP                           22,556           99.27
Patil et al. [53]         Structural                    General fuzzy neural network  2,000            99.50
Pal et al. [56]           Gradient                      MQDF                          22,556           99.56
The details of many handwritten Devanagari numeral, character, and word
recognition systems are summarized in Tables 2.3-2.5, respectively. Table 2.3 shows
that, for Devanagari handwritten numerals, the method proposed by Pal et al. [56]
reports the highest accuracy (99.56%). Similarly, for Devanagari handwritten
characters, the method proposed by Pal et al. [35] reports the highest accuracy
(95.19%), as shown in Table 2.4. These figures were obtained on data sets of different
sizes, so they are not directly comparable across systems.
Table 2.4 Details of handwritten Devanagari character recognition systems
Method                    Feature                       Classifier                    Data set (size)  Accuracy (%)
Sharma et al. [16]        Chain code                    Quadratic                     11,270           80.36
Deshpande et al. [63]     Chain code                    RE and MED                    5,000            82
Arora et al. [72]         Structural                    FFNN                          50,000           89.12
Arora et al. [77]         Combined                      MLP                           1,500            89.98
Hanmandlu et al. [68]     Vector distance               Fuzzy sets                    4,750            90.65
Arora et al. [61]         Shadow and CH                 MLP and MED                   7,154            90.74
Kumar et al. [48]         Gradient                      SVM                           25,000           94.10
Pal et al. [51]           Gradient and Gaussian filter  Quadratic                     36,172           94.24
Mane et al. [66]          Eigen deformation             Elastic matching              3,600            94.91
Pal et al. [55]           Gradient                      SVM and MQDF                  36,172           95.13
Pal et al. [35]           Gradient                      MIL                           36,172           95.19
Table 2.5 Details of handwritten Devanagari word recognition systems
Method                    Feature                       Classifier                    Data set (size)  Accuracy (%)
Shaw et al. [42]          Chain code                    HMM                           39,700           80.20
Shaw et al. [41]          Segments                      HMM                           39,700           84.31
2.4 Some observations
In the recent past, the Department of Information Technology (DIT), Government of
India, formed a consortium of several Indian institutions and universities involved in OCR
activities and provided considerable funding to this consortium to improve the
quality of research related to Indian-language OCR. The creation of benchmark databases
for Devanagari script is essential for successful research.
Efforts have been made in India and the U.S. to create test beds for printed and
handwritten Devanagari character recognition [2], [36], [93]. The database developed by
the Indian Statistical Institute, Kolkata [2] contains 22,556 isolated handwritten Devanagari
numeral samples collected from real-life situations, and is made available free of cost to
researchers of other academic institutions.
Setlur et al. [36] also made some efforts in creating data resources and designing
an evaluation test bed for Devanagari script recognition. Jawahar et al. [93] of the
International Institute of Information Technology, Hyderabad succeeded in
generating a corpus consisting of more than 600,000 document images printed in Indian
scripts. However, there is still no standard database for handwritten composite characters
and words written in Devanagari.
For the segmentation-based recognition of handwritten text, initially it has to be
separated into words, and then, words into individual characters and modifiers. As the
recognition has to be performed on isolated characters, segmentation of words into
characters is a critical step for handwritten text recognition as incorrect segmentation of
words may lead to incorrect recognition. Most of the segmentation errors are due to
various writing styles of different individuals. The presence of many touching
characters is another major problem for segmentation. As a result, much research is
needed and expected in this area. Identifying compound and touching characters is also a
challenging task. Some authors are of the opinion that the use of contextual information
may improve the results of segmentation. Characters are joined together using a
shirorekha to frame words in Devanagari script. It has been observed that some people
write words without using a shirorekha on them. Thus, a word may or may not have a
header line in some handwritten text, and such an absence creates problems in
recognition. Skew detection will be difficult in the absence of shirorekha as some of the
existing works related to skew detection are based on the presence of shirorekha on
words. The straightness of the shirorekha is also an issue of concern. A study on the
irregularities in Devanagari handwriting is presented in [70]. These irregularities occur
during the writing process and make the word-level recognition of the text more
complex. Some of them are: abnormal size of a vowel symbol, incomplete and inaccurate
representation of a vowel symbol, merging of vowel symbols with the headline, intrusion of
upper vowel symbols into the middle region, intrusion of lower vowel symbols into the middle
region, improper attachment of lower vowel symbols, writing vowel symbols in
isolation, wrong position of header line, incomplete writing, presence of unwanted or
extra strokes, narrow writing, and over writing.
In [78], some problems in designing OCRs for Indian scripts are discussed. The
authors presented a data collection tool, a segmentation analysis tool, and a feature
selection tool capable of tuning the features used for the classification of a particular font
or script of another set. With the increasing popularity of digital cameras attached to
various handheld electronic devices, some new computational challenges have gained
significance recently [80]. One such problem is the extraction of textual information
from natural scene images captured by such devices. The extracted text information can
be sent to an OCR or to a text-to-speech engine for further processing. Because of the
recent surge in digital library projects globally and large scale intensification of
digitization efforts, it is expected that almost all of man’s knowledge will be available in
digital form on the Web in the coming years [89].
2.5 Concluding remarks
With the advent of computer and information technology, there has been a
dramatic increase in research in the field of Devanagari OCR since 1990. Different
strategies using combinations of multiple features, multiple classifiers, and multiple
templates have been considered extensively in the state of the art. Only a few works have
been reported in the area of unconstrained Devanagari handwriting recognition.
Lexicon-based approaches can be used for recognizing legal amounts on bank cheques
and city names on postal documents. There is great scope for future research in these
areas of handwritten Devanagari OCR. Word spotting in handwritten Devanagari
documents is also an interesting area of research, as it will be helpful in indexing as
well as searching the document images of handwritten archives. Holistic approaches can
be employed for this purpose. Further research is required to find ideal combinations of
classifiers for recognition. It is still not clear how a combination strategy can fully
utilize the power of its subclassifiers or how to deal with the tradeoff between
combination and effectiveness. Information about the classification power of each
subclassifier may also help in assigning weights to them. Only a few papers have been
published on script identification. Generally, researchers assume that a given document
is written in a specific script. In countries like India, where many languages and
scripts exist, the script has to be identified prior to recognition in applications like
postal address readers, where the address can be written in
any Indian script. More research in this direction on handwritten documents is
expected in the near future.
In India huge volumes of historical documents and books (handwritten or printed
in Devanagari script) remain to be digitized for better access, sharing, indexing, etc. This
will definitely be helpful for other research communities in India in the areas of social
sciences, economics, and linguistics. From the survey, it is noted that the errors in
recognizing printed Devanagari characters are mainly due to incorrect character
segmentation of touching or broken characters. Because of the upper and lower modifiers
of Devanagari text, portions of two consecutive lines may also overlap, and proper
segmentation of such overlapped portions is needed to achieve higher accuracy. Many
authors suggest that post-processing the classifier outputs by integrating a dictionary
with the OCR system can significantly reduce misclassifications in printed as well as
handwritten word recognition. Recently, some efforts have been reported toward
building benchmark databases to enhance the quality of OCR-related research in India. It
is also observed that special keyboards are required to key in Devanagari text, as the
number of characters and modifiers in Devanagari script is greater than the number of
characters in Latin script. Since typing is tiring and time-consuming,
digitization of documents and their automatic processing would be easier than keying in
the Devanagari text. The next chapter discusses the problems in recognition of
handwritten Marathi characters and the strategies to overcome them.