
Arabic Handwritten Script Recognition Towards Generalization: A Survey

Jul 15, 2015


Randa Elanwar
Page 1: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Arabic Handwritten Script Recognition Towards Generalization: A Survey

Authors:

Randa I. M. Elanwar, Assistant Researcher, Electronic Research Institute

Prof. Dr. Mohsen A. A. Rashwan, Professor of Digital Signal Processing, Electronic and Communication Dept., Cairo University

Prof. Dr. Samia A. A. Mashali, Head of Computers and Systems Dept., Electronic Research Institute

Page 2: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Presentation Contents

Introduction

Paper Objective

Arabic handwriting recognition problem

Main Challenges

Recent off-line Arabic handwriting recognition systems

Recent on-line Arabic handwriting recognition systems

Summary and Conclusion

Page 3: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

Handwriting recognition can be defined as the task of transforming text represented in the spatial form of graphical marks into its symbolic representation.

The main components of a recognizer are:

1. Data capture and acquisition

2. Preprocessing and segmentation

3. Pattern definition and model selection

4. Feature extraction

5. Training

6. Classification

Page 4: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

• First, the input device captures an image and converts it to a usable format.

• The data is then preprocessed to eliminate noise and simplify it without losing relevant information, and it may also be segmented into smaller data units.

Page 5: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

• Each data unit is sent to a feature extractor, which reduces it by measuring certain “features” or “properties”.

• Patterns (or classes) should be defined and models should be selected. These models are trained using the extracted features.

Page 6: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

• The model for a pattern may be a single specific set of features.

• To recognize (or classify) a novel pattern means to recover the model that generated the pattern, based on the extracted features.

Page 7: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

The feature extractor has reduced the data unit to a point or feature vector X in a 2D feature space (or observation space).

Classification rule: Classify the input as Class I if its feature vector falls below the decision boundary shown, and as Class II otherwise.
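The classification rule above can be sketched in a few lines. The figure with the actual boundary is not part of this transcript, so the sketch assumes a straight-line boundary x2 = slope * x1 + intercept; the function name `classify` and both coefficients are hypothetical and chosen only for illustration.

```python
def classify(x1, x2, slope=-1.0, intercept=1.0):
    """Return "Class I" if the 2D feature vector (x1, x2) lies below the
    assumed linear boundary x2 = slope * x1 + intercept, else "Class II".

    The slope and intercept are illustrative assumptions, not values
    taken from the presentation.
    """
    boundary_at_x1 = slope * x1 + intercept
    return "Class I" if x2 < boundary_at_x1 else "Class II"


label_low = classify(0.2, 0.3)   # boundary at x1=0.2 is 0.8, point is below it
label_high = classify(0.9, 0.8)  # boundary at x1=0.9 is 0.1, point is above it
print(label_low, label_high)
```

A more flexible boundary (a curve fitted tightly around the training points) would separate the training data better, which is exactly where the generalization question on the next slide starts.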

Page 8: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

The problem is that a very complex recognizer is unlikely to generalize well, since it ends up “tuned” to the particular training samples.

The question is how to optimize this tradeoff: generalization versus a simple classifier?

Page 9: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

Usually there is an action taken based on the classification decision. Each action should be assigned a certain cost.

We design our decision boundary (classification rule) so that, on average, the Risk will be as small as possible.

The Risk (R) is the expected value of the cost.

Minimizing R leads to complex boundaries.

The question is how to optimize this tradeoff: generalization versus minimum risk?
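As a minimal sketch of the risk idea above, the code below computes the expected cost of each action for a single input and picks the cheapest one. The cost table and the class posteriors are made-up numbers, and the names `expected_costs` and `min_risk_action` are hypothetical. Averaging this per-input expected cost over all inputs would give the overall Risk R.

```python
def expected_costs(cost, posteriors):
    """cost[a][c] is the cost of taking action a when the true class is c;
    posteriors[c] is the probability of class c given the observed input.
    Returns the expected cost of each action for this input."""
    return [sum(c_ac * p for c_ac, p in zip(row, posteriors)) for row in cost]


def min_risk_action(cost, posteriors):
    """Return (index of the action with the smallest expected cost, all costs)."""
    risks = expected_costs(cost, posteriors)
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Illustrative numbers only: deciding class 0 when the truth is class 1 costs 5,
# the opposite mistake costs 1, correct decisions cost nothing.
cost = [[0.0, 5.0],
        [1.0, 0.0]]
posteriors = [0.7, 0.3]

action, risks = min_risk_action(cost, posteriors)
print(action, risks)
```

With these numbers the less probable class is chosen, because the mistake it guards against is five times more expensive. This is how unequal costs move the decision boundary away from the plain "most probable class" rule.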

Page 10: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Introduction

In order to achieve a general-purpose (unbiased) recognizer, we should have a sufficient number of training samples (N) for each class in the data set.

A theoretical estimate claims that

N ≅ 100 / P, where P ≡ probability of misclassification

I.e., for P ≈ 0.01, N ≈ 10,000, and for P ≈ 0.03, N ≈ 3,300

Such a large data set (if available) needs large storage and long processing time (time complexity).

The question is how to optimize this tradeoff: generalization versus complexity?
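The rule of thumb N ≅ 100 / P is easy to evaluate for any target error rate. The helper below is only a direct transcription of that formula; the name `required_samples` is hypothetical.

```python
def required_samples(p_error):
    """Approximate number of training samples per class suggested by the
    rule of thumb N = 100 / P, where P is the misclassification probability."""
    if not 0.0 < p_error < 1.0:
        raise ValueError("p_error must be strictly between 0 and 1")
    return 100.0 / p_error


for p in (0.01, 0.03, 0.05):
    print(f"P = {p:.2f} -> N is about {required_samples(p):.0f}")
```

Halving the target error rate doubles the number of samples needed per class, which is why storage and processing time grow quickly as the accuracy target tightens.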

Page 11: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Paper Objective

Our concern in this paper is to:

1. provide a comprehensive review of recent off-line and on-line trends in Arabic cursive handwriting recognition (publications of the last 10 years)

2. clarify the challenges standing in the way of a reliable, accurate, simple, general-purpose recognizer based on these trends.

Page 12: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Arabic Handwriting Recognition Problem

Arabic Script Recognition Systems are categorized as:

1. On-line or Off-line

2. Writer Dependent or Writer Independent

3. Open-vocabulary or closed-vocabulary

Page 13: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Arabic Handwriting Recognition Problem

Types of Recognition:

When the input device is a digitizer tablet that transmits the signal in real time or includes timing information together with pen position, this is mostly referred to as on-line or dynamic recognition.

Page 14: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Arabic Handwriting Recognition Problem

Types of Recognition:

When the input device is a still camera or a scanner, which captures the position of digital ink on the page but not the order in which it was laid down, this is defined as off-line or image-based OCR.

Page 15: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Arabic Handwriting Recognition Problem

Special Characteristics of Arabic Script:

Always written from right to left

An Arabic word consists of one or more portions (word parts); each has one or more characters

Many characters differ only by the position and the number of dots attached

Page 16: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Arabic Handwriting Recognition Problem

Special Characteristics of Arabic Script:

Every character has more than one shape, depending on its position in the word

Characters overlap

Page 17: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Arabic Handwriting Recognition Problem

Special Characteristics of Arabic Script:

Existence of ligatures

Because of these special characteristics, Arabic handwriting recognition systems still need more research before they can be established commercially.

Page 18: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Feature Extraction

Noise

Model Selection and Complexity

Segmentation

Context

Evidence Pooling

Costs and Risks

Computational Complexity

Learning and Adaptation

Page 22: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Feature Extraction:

A good feature set should help distinguish a class from other classes, be invariant to differences within a class, and contain no redundant information.

… How do we know which features are most promising?

… Are there ways to automatically learn which features are best for a classifier?

It should be limited in number for computational ease and to limit the amount of training data.

… How many features should we use?

… How do we train or use a classifier when some features are missing?

Page 24: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Noise:

Random error in a pixel value (deformation) due to signal-independent, signal-dependent and salt & pepper noise.

Noise cannot always be totally eliminated, but smoothing is done.

… Is the deformation in a signal noise, or natural variation in the true models?

… How can we use this information to improve our classifier?
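As one concrete instance of the smoothing mentioned above, the sketch below applies a 3x3 median filter, a common choice against salt & pepper noise. The presentation does not say which smoothing method is meant, so the median filter is an assumption, as are the function name `median_filter3` and the tiny test image.

```python
from statistics import median_low


def median_filter3(image):
    """Apply a 3x3 median filter to a 2D list of pixel values.

    Border pixels use only the neighbours that exist. median_low keeps the
    output values in the set of input pixel values (no averaging).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = median_low(window)
    return out


# A blank 5x5 patch with a single "salt" pixel in the middle.
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 255
smoothed = median_filter3(noisy)
print(smoothed[2][2])
```

The same filter would also erase a genuine isolated dot, and dots are what distinguish many Arabic characters. That is the practical face of the question above: whether a deformation is noise or part of the true model.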

Page 26: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Model Selection and Complexity:

Determining the complexity of the model: not so simple that it cannot explain the differences between the categories, yet not so complex as to give poor classification on novel patterns.

… How do we know when to reject a class of models and try another one?

… Are there principled methods for finding the best complexity for a classifier?

… Or is it a matter of random trial and error, not even guided by expectations of performance?

Page 28: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Segmentation:

Segmentation subdivides an image into its constituent regions or objects. Segmentation should stop when the objects of interest in an application have been isolated.

… How do we know where one character “ends” and the next one “begins”?

… Shall we segment the images before they have been categorized, or categorize them before they have been segmented?

Page 29: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Context:

The accuracy of automatic handwriting recognition systems based on purely visual information seems to have a ceiling.

Incorporating semantic and syntactic knowledge sources into the automatic recognition of text can offer potential improvements in performance.

… How, precisely, should we incorporate such information?

Page 31: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Evidence Pooling:

For high classification performance or for increased class coverage, different classification tools are developed, either in parallel or sequentially.

When there are several component classifiers and these categorizers agree on a particular pattern, there is no difficulty. But suppose they disagree!

… How should a “super” classifier pool the evidence from the component recognizers to achieve the best decision?

… How would the “super” categorizer know when to base a decision on a minority opinion when required?
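One simple way to pool evidence is weighted voting, sketched below. It is only one of many possible combination schemes and is not attributed to any system in this survey. The function name `pool_votes`, the labels and the weights are hypothetical; in practice the weights might come from each component's accuracy on held-out data.

```python
from collections import defaultdict


def pool_votes(predictions, weights=None):
    """Combine the labels proposed by several component classifiers.

    predictions: one label per component classifier.
    weights: optional confidence per classifier (defaults to equal votes).
    Returns the label with the largest total weight; ties go to the label
    that was seen first.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


votes = ["ba", "ta", "ta"]
majority = pool_votes(votes)                      # plain majority vote
weighted = pool_votes(votes, [0.9, 0.3, 0.3])     # one trusted classifier outweighs two
print(majority, weighted)
```

The second call shows how a minority opinion can win: a single classifier with weight 0.9 beats two classifiers with a combined weight of 0.6.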

Page 33: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Costs and Risks:

A classifier is generally used to recommend actions, each action having an associated cost or risk.

We often design our classifier to recommend actions that minimize some total expected cost or risk.

… How do we incorporate knowledge about such risks, and how will they affect the classification decision?

… Is there a way to estimate the total risk and thus tell whether our classifier is acceptable even before we field it?

Page 35: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Computational Complexity:

Even if we could achieve error-free recognition, the time and storage requirements would be quite prohibitive.

Some pattern recognition problems can be solved only by algorithms that are highly impractical.

… What is the tradeoff between computational ease and performance?

… How can we optimize an excellent recognizer within the engineering constraints?

Page 37: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Main Challenges

Learning and Adaptation:

Any method that incorporates information from training samples in the design of a classifier employs learning.

If the models are extremely complicated, the classifier will have complex decision boundaries.

To overcome this, more training samples are needed to obtain a better estimate of the true underlying features.

With limited training samples, we should incorporate knowledge of the problem domain. The production representation is the “best” representation for classification.

… How many training samples are needed for good generalization?

… How can we ensure that the learning algorithm favors “simple” solutions rather than complicated ones?

Page 38: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent off-line Arabic handwriting recognition systems

Example: Pechwitz et al. [17]

The authors proposed a recognition system based on a semi-continuous 1-D HMM, using the IFN/ENIT database of handwritten Tunisian town/village names.

Preprocessing:

1. The image contour is extracted and a noise-reduction filter is applied.

2. Skeletonization and normalization are performed.

3. Baseline estimation and word-length normalization are performed.

Page 39: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent off-line Arabic handwriting recognition systems

Example: Pechwitz et al. [17]

Feature Extraction:

1. A rectangular window is shifted from right to left across the normalized gray-level script image.

2. A Karhunen-Loève transform is performed on the gray values of each frame to reduce the number of features.

Modeling:

1. An HMM is generated for each character shape (all possible positions), up to 160 different HMMs.

2. Semi-continuous HMMs are used, with 7 states per character.
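The sliding-window step in the feature extraction above can be sketched as follows. The window width, the step and the tiny "image" are hypothetical, and the function name `sliding_frames` is made up for this example. The Karhunen-Loève reduction that follows in the surveyed system is not shown; each flattened frame below is what such a transform would take as input.

```python
def sliding_frames(image, width=3, step=1):
    """Cut a gray-level image (list of rows) into overlapping frames,
    moving a window of `width` columns from the right edge to the left.

    Each frame is flattened row by row into one list of gray values.
    Trailing columns that cannot fill a whole window are ignored.
    """
    n_cols = len(image[0])
    frames = []
    right = n_cols
    while right - width >= 0:
        left = right - width
        frames.append([value for row in image for value in row[left:right]])
        right -= step
    return frames


# A 2x4 toy image; real word images would be much larger.
toy_image = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
frames = sliding_frames(toy_image, width=2, step=1)
print(frames)
```

The first frame comes from the rightmost columns, matching the right-to-left writing direction, so the frame sequence follows the order in which the word was written.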

Page 40: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent off-line Arabic handwriting recognition systems

Example: Pechwitz et al. [17]

Database:

1. The database is split into four sets: A, B, C and D.

2. The 4 sets contain 26,459 images of segmented Tunisian town names (115,585 PAWs) handwritten by 411 unique writers.

3. There are 946 unique word labels and 762 unique PAW labels.

4. Ground truth information is available for each image.

Lexicon:

The character-shape HMMs are combined into valid word models using a tree-structured lexicon with all 946 different Tunisian town/village names.
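A tree-structured lexicon is commonly implemented as a prefix tree (trie), in which words that share a beginning share a path. The sketch below stores plain characters; in the system described above each node would correspond to a character-shape HMM instead. The function names, the `"$"` end-of-word marker and the transliterated town names are assumptions for illustration and are not taken from the database.

```python
def build_trie(words):
    """Build a prefix tree as nested dicts. The key "$" marks the end of a
    word and stores the word itself (assumes "$" never occurs in a word)."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = word
    return root


def words_with_prefix(trie, prefix):
    """Return, in sorted order, all lexicon words that start with `prefix`."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key == "$":
                found.append(child)
            else:
                stack.append(child)
    return sorted(found)


lexicon = build_trie(["tunis", "tozeur", "tabarka", "sfax"])  # hypothetical entries
print(words_with_prefix(lexicon, "t"))
```

Sharing prefixes means the recognizer can evaluate a common word beginning once for all words that start with it, rather than once per lexicon entry.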

Page 41: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent off-line Arabic handwriting recognition systems

Example: Pechwitz et al. [17]

Recognition:

The standard Viterbi algorithm is used together with the lexicon.

The authors applied the recognition algorithm to the database twice: once using the baseline taken from the ground truth (GT) and once using the baseline they estimated.

Results:

Recognition rates of 82 – 89% are obtained using the estimated baseline.

Recognition rates of 89 – 95% are obtained using the GT baseline.
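For readers unfamiliar with the Viterbi algorithm mentioned above, here is a minimal log-domain sketch for a discrete HMM. It is not the surveyed system: that one uses semi-continuous HMMs with 7 states per character plus a lexicon, while this toy uses 3 states, 2 observation symbols and made-up probabilities. The function name `viterbi` and all numbers are assumptions.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (most likely state path, its log-probability) for a discrete HMM.

    obs: list of observation symbol indices.
    start_p[s], trans_p[q][s], emit_p[s][o]: model probabilities.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(start_p)
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            prev = max(range(n_states),
                       key=lambda q: score[q] + log(trans_p[q][s]))
            pointers.append(prev)
            new_score.append(score[prev] + log(trans_p[prev][s]) + log(emit_p[s][o]))
        score = new_score
        back.append(pointers)

    last = max(range(n_states), key=score.__getitem__)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Toy left-to-right model: each state either repeats or moves one state forward.
start_p = [1.0, 0.0, 0.0]
trans_p = [[0.6, 0.4, 0.0],
           [0.0, 0.6, 0.4],
           [0.0, 0.0, 1.0]]
emit_p = [[0.9, 0.1],
          [0.1, 0.9],
          [0.9, 0.1]]
path, log_prob = viterbi([0, 0, 1, 1, 0], start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

In a lexicon-driven recognizer the same recursion runs over the concatenated character models of each allowed word, and the word whose best path scores highest is reported.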

Page 45: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent off-line Arabic handwriting recognition systems

Example: Pechwitz et al. [17]

Challenges:

1. Working on an available database sidesteps the challenge of limited training samples.

2. It is not easy to generalize this classifier to open-vocabulary applications, because it is a segmentation-free recognizer that works on a limited lexicon of words; otherwise, context will be a must.

3. Model selection and complexity: the same HMM structure is generated for all characters and ligatures. We think it would be much better to vary the model structure according to the requirements of each character (ض shouldn’t have the same model as ة, for example).

4. Feature extraction: the idea of normalizing the word width in order to use a sliding-window feature extractor is good, except for its heavy dependency on baseline estimation, which is itself a major source of error.

Page 46: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent on-line Arabic handwriting recognition systems

Example: Biadsy et al. [24]

Preprocessing:

1. A geometrical processing phase minimizes handwriting variations.

2. A low-pass filter is used to reduce noise and remove imperfections caused by acquisition devices.

3. The writing speed is normalized by resampling the point sequences.

Feature Extraction:

Mainly angles (with the x-axis) and loop presence
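Two of the steps above can be sketched directly: resampling a pen trajectory to equally spaced points, which removes the effect of writing speed, and measuring the angle of each segment with the x-axis. This is only an illustration under assumptions. The spacing value, the function names `resample` and `direction_angles` and the sample strokes are hypothetical, and loop detection is not shown.

```python
import math


def resample(points, spacing):
    """Resample a pen trajectory [(x, y), ...] so that consecutive points are
    `spacing` apart along the path. A fast stroke (few, distant samples) and a
    slow one (many, close samples) then yield the same point sequence."""
    resampled = [points[0]]
    carried = 0.0  # path length travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue  # repeated sample, pen did not move
        d = spacing - carried  # distance into this segment of the next point
        while d <= seg:
            t = d / seg
            resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    return resampled


def direction_angles(points):
    """Angle in degrees between each segment and the positive x-axis."""
    return [math.degrees(math.atan2(y1 - y0, x1 - x0))
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


stroke = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]   # unevenly sampled straight line
even = resample(stroke, spacing=2.0)
angles = direction_angles([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
print(even, angles)
```

Because the modeling stage uses discrete HMMs, such angles would in practice be quantized into a small number of direction codes before being fed to the models; the quantization scheme is not specified in this presentation.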

Page 47: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent on-line Arabic handwriting recognition systems

Example: Biadsy et al. [24]

Modeling:

1. The recognition framework uses discrete left-to-right HMMs to represent each Arabic letter shape (isolated, initial, medial, and final).

2. The number of states for each letter-shape model is based on the geometric complexity of the letter shape. It varies from 5 to 11 states.

For example: 11 states are assigned to isolated ش, and 5 states to isolated أ.

Page 48: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent on-line Arabic handwriting recognition systems

Example: Biadsy et al. [24]

Lexicon:

1. The Arabic dictionary D is subdivided into a set of sub-dictionaries {D1, D2, …, Dn} based on the number of word parts in each word.

2. Letter-shape models are embedded in a network that represents a word-part dictionary. The segmentation of word parts into letter shapes and their recognition are performed simultaneously in an integrated process.

D = {وسام، هل، معلم، محمود، محمد، فادى، رواية، جامعة، ثقافة، التحدى، انسان}

Sub-dictionaries of D:

D1 = {هل، معلم، محمد}

D2 = {محمود، جامعة، ثقافة}

D3 = {وسام، فادى، التحدى، انسان}

D4 = {رواية}

Word-part dictionary for D3:

WPD3,1 = {و، فا، ا}

WPD3,2 = {سا، د، لتحد، نسا}

WPD3,3 = {م، ى، ن}
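The grouping above can be reproduced with a short script. A word part ends after any letter that does not connect to the letter following it. The set of such letters used below comes from standard Arabic orthography and is an assumption of this sketch, not something stated in the presentation; the function names are hypothetical, and the input is assumed to carry no diacritics.

```python
# Letters that never join to the following letter (assumed set).
NON_CONNECTING = set("اأإآدذرزوؤ")


def word_parts(word):
    """Split an Arabic word into its word parts: a part is closed after
    every non-connecting letter and at the end of the word."""
    parts, current = [], ""
    for letter in word:
        current += letter
        if letter in NON_CONNECTING:
            parts.append(current)
            current = ""
    if current:
        parts.append(current)
    return parts


def sub_dictionaries(words):
    """Group words by their number of word parts (D1, D2, ... in the slide)."""
    groups = {}
    for word in words:
        groups.setdefault(len(word_parts(word)), []).append(word)
    return groups


D = ["وسام", "هل", "معلم", "محمود", "محمد", "فادى",
     "رواية", "جامعة", "ثقافة", "التحدى", "انسان"]
groups = sub_dictionaries(D)
for n in sorted(groups):
    print(f"D{n}:", groups[n])
print(word_parts("وسام"))
```

Collecting the k-th part of every word in a group gives the word-part dictionaries (WPD3,1, WPD3,2, WPD3,3 for the three-part words), which is what lets the recognizer match one pen-connected component at a time.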

Page 49: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent on-line Arabic handwriting recognition systems

Example: Biadsy et al. [24]

Database:

1. 4 trainers are asked to write 800 selected words each.

2. For testing, 10 testers (the 4 trainers, in addition to 6 new volunteers) are asked to write 280 words not in the training data (2,358 words in total).

3. 5 different dictionary sizes (5K, 10K, 20K, 30K, and 40K words) selected from different Arabic websites are used. The 280 test words are present in all dictionary sizes.

Recognition:

Writer-dependent (WD) and writer-independent (WI) experiments are done, and average word recognition rates of 88 – 96% are obtained. The performance degrades as ambiguity (dictionary size) increases.

Page 51: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Recent on-line Arabic handwriting recognition systems

Example: Biadsy et al. [24]

Challenges:

1. Feature extraction: the features they use are not enough for satisfactory classification of general unconstrained handwriting, so the system depends heavily on a limited vocabulary. The word parts must be present in the dictionary or they will not be recognized.

2. The database they use looks unnatural. Volunteers are asked to follow a restricted writing methodology, which affects their individual writing style. Besides, the system handles a limited variety of handwriting because of the small number of volunteers who wrote the database.

Page 54: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Summary and Conclusion

Foreign recognizers have been on the market as commercial products for years, while Arabic recognizers still need more time.

In the case of Arabic handwritten words, many researchers use a specific, more or less small data set of their own, so it is impossible to compare different results, which would be important for improving existing methods.

The complexity of the problem is greatly increased by noise and by the infinite variability of handwriting.

Page 57: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Summary and Conclusion

Cursive script requires the segmentation of words into characters or parts of characters, i.e. graphemes, and then the detection of individual features.

Generally, the holistic approach can be used if the size of the vocabulary is small (such as the recognition of the legal amount in cheques).

The character-based approach is the preferred method for recognition applications that are unconstrained or involve large vocabularies, to ensure good generalization together with reasonable complexity.

Page 58: Arabic Handwritten Script Recognition Towards Generalization: A Survey

Thank You