
ONLINE HANDWRITTEN SCRIPT RECOGNITION

Presentation By: Priya Ahuja
CSE 6C
10-CSU-110

CONTENTS
Online Recognition
Why Handwriting Recognition?
Why is Handwriting Recognition Difficult?
Properties of Scripts
Features of Handwritten Script
Steps in Handwritten Script Recognition
Future Scope
References

Online Recognition

On-line handwriting recognition involves the automatic conversion of text as it is written on a special digitizer or PDA, where a sensor picks up the pen-tip movements as well as pen-up/pen-down switching. This kind of data is known as digital ink and can be regarded as a dynamic representation of handwriting. The obtained signal is converted into letter codes that are usable within computer and text-processing applications.

The elements of an on-line handwriting recognition interface typically include:
1) a pen or stylus for the user to write with.
2) a touch-sensitive surface, which may be integrated with, or adjacent to, an output display.
3) a software application which interprets the movements of the stylus across the writing surface, translating the resulting strokes into digital text.
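As a rough illustration of what "digital ink" looks like in software, the sketch below groups raw sensor events into strokes. The event layout `(x, y, t, pen_is_down)` and the names `Stroke` and `strokes_from_events` are assumptions made for this example; real digitizers expose their own formats.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Stroke:
    """One pen-down to pen-up trace: (x, y, t) samples in writing order."""
    points: List[Tuple[float, float, float]] = field(default_factory=list)

    @property
    def pen_down(self) -> Tuple[float, float, float]:
        return self.points[0]

    @property
    def pen_up(self) -> Tuple[float, float, float]:
        return self.points[-1]


def strokes_from_events(events):
    """Group hypothetical (x, y, t, pen_is_down) events into strokes."""
    strokes, current = [], None
    for x, y, t, pen_is_down in events:
        if pen_is_down:
            if current is None:          # pen has just touched the surface
                current = Stroke()
            current.points.append((x, y, t))
        elif current is not None:        # pen has just been lifted
            strokes.append(current)
            current = None
    if current is not None:              # input ended while the pen was down
        strokes.append(current)
    return strokes


events = [
    (0, 0, 0.0, True), (1, 0, 0.1, True), (1, 0, 0.2, False),
    (2, 1, 0.3, True), (3, 1, 0.4, True),
]
strokes = strokes_from_events(events)
```

Keeping the time stamp with every point preserves the temporal order of writing, which is the information that off-line (scanned) handwriting lacks.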

Figure: Devices that accept on-line handwritten data. From the top left: Pocket PC, CrossPad, Ink Link, Cell Phone, Smart Board, Tablet with display, Anoto pen, Wacom Tablet, Tablet PC.

Why Handwriting Recognition?

Online documents may be written in different languages and scripts. A single document page may itself contain text written in multiple scripts.

A script is defined as a graphic form of a writing system. Different scripts may follow the same writing system. For example, the alphabetic system is adopted by scripts like Roman and Greek, and the phonetic-alphabetic system is adopted by most Indian scripts, including Devnagari. A specific script like Roman may be used by multiple languages such as English, German, and French.

The general class of Han-based scripts includes Chinese, Japanese, and Korean (we do not consider Kana or Han-Gul).
Devnagari script is used by many Indian languages, including Hindi, Sanskrit, Marathi, and Rajasthani.
Arabic script is used by Arabic, Farsi, Urdu, etc.
Roman script is used by many European languages like English, German, French, and Italian.

The most important characteristic of online documents is that they capture the temporal sequence of strokes while writing the document.

We use stroke properties as well as the spatial and temporal information of a collection of strokes to identify the script used in the document.

Why is Handwriting Recognition Difficult?

High variability of individual characters:
  Writing style
  Stroke width and quality
  Size of the writing
  Variation even for a single writer!

Reliable segmentation of cursive script is extremely problematic due to merging of adjacent characters.

Properties of Scripts

Arabic: Arabic is written from right to left within a line and the lines are written from top to bottom. A typical Arabic character contains a relatively long main stroke which is drawn from right to left, along with one to three dots. The character set contains three long vowels. Short markings (diacritics) may be added to the main character to indicate short vowels. Due to these diacritical marks and the dots in the script, the length of the strokes varies considerably.

Cyrillic: Cyrillic script looks very similar to the cursive Roman script. The most distinctive features of Cyrillic script, compared to Roman script, are: 1) individual characters, connected together in a word, form one long stroke, and 2) the absence of delayed strokes. Delayed strokes cause movement of the pen in the direction opposite to the regular writing direction.

Figure: The word "trait" contains three delayed strokes, shown as bold dotted lines.

Devnagari: The most important characteristic of Devnagari script is the horizontal line present at the top of each word, called Shirorekha. These lines are usually drawn after the word is written and hence are similar to delayed strokes in Roman script. The words are written from left to right in a line.

Han: Characters of Han script are composed of multiple short strokes. The strokes are usually drawn from top to bottom and left to right within a character. The direction of writing of words in a line is either left to right or top to bottom.

Hebrew: Words in a line of Hebrew script are written from right to left and, hence, the script is temporally similar to Arabic. The most distinguishing factor of Hebrew from Arabic is that the strokes are more uniform in length in the former.

Roman: Roman script has the same writing direction as Cyrillic, Devnagari, and Han scripts. In addition, the length of the strokes tends to fall between that of Devnagari and Cyrillic scripts.

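Figure: The word "devnagari" written in Devnagari script. The Shirorekha is shown in bold.

Features of Handwritten Script

Horizontal Interstroke Direction (HID): This is the sum of the horizontal directions between the starting points of consecutive strokes in the pattern. The feature essentially captures the writing direction within a line.
$$\mathrm{HID} = \sum_{i=1}^{n-r} \mathrm{dir}(i, i+r), \qquad \mathrm{dir}(i, j) = \begin{cases} +1 & \text{if } X_{start}(stroke_i) < X_{start}(stroke_j) \\ -1 & \text{otherwise} \end{cases}$$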

where X_start(.) denotes the x-coordinate of the pen-down position of the stroke, n is the number of strokes in the pattern, and r is set to 3 to reduce errors due to abrupt changes in direction between successive strokes. The value of HID falls in the range [r - n, n - r].

Average Stroke Length (ASL): Each stroke is resampled during preprocessing so that the sample points are equidistant. Hence, the number of sample points in a stroke is used as a measure of its length. The Average Stroke Length is defined as the average length of the individual strokes in the pattern.
$$\mathrm{ASL} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{length}(stroke_i)$$

where n is the number of strokes in the pattern. The value of ASL is a real number which falls in the range [1.0, R0], where the value of R0 depends on the resampling distance used during preprocessing.

Shirorekha Strength: This feature measures the strength of the horizontal line component in the pattern using the Hough transform. The value of this feature is computed as:
[Equation: Shirorekha Strength, not recoverable from the slide text]

Stroke Density: This is the number of strokes per unit length (x-axis) of the pattern. Note that the Han script is written using short strokes, while Roman and Cyrillic are written using longer strokes.
$$\mathrm{Stroke\ Density} = \frac{n}{\mathrm{width}(pattern)}$$

where n is the number of strokes in the pattern. The value of Stroke Density is a real number and can vary within the range (0.0, R1), where R1 is a positive real number.

Aspect Ratio: This is the ratio of the width to the height of a pattern. The value of Aspect Ratio is a real number and can vary within the range (0.0, R2), where R2 is a positive real number.
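The features described so far (HID, ASL, Stroke Density, and Aspect Ratio) can be sketched in a few lines. This is an illustrative reading of the prose definitions, not the original implementation: a pattern is assumed to be a list of strokes, each stroke a list of already-resampled `(x, y)` points, and all function names are made up for the example.

```python
def hid(strokes, r=3):
    """Horizontal Interstroke Direction: +1/-1 votes between strokes r apart."""
    starts = [s[0][0] for s in strokes]          # x of each pen-down point
    return sum(1 if starts[i] < starts[i + r] else -1
               for i in range(len(starts) - r))


def asl(strokes):
    """Average Stroke Length, measured in sample points per stroke."""
    return sum(len(s) for s in strokes) / len(strokes)


def bounding_box(strokes):
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    return min(xs), min(ys), max(xs), max(ys)


def stroke_density(strokes):
    """Strokes per unit width; a zero-width pattern is not handled here."""
    x0, _, x1, _ = bounding_box(strokes)
    return len(strokes) / (x1 - x0)


def aspect_ratio(strokes):
    """Width over height; a zero-height pattern is not handled here."""
    x0, y0, x1, y1 = bounding_box(strokes)
    return (x1 - x0) / (y1 - y0)


# Five short strokes written left to right, each rising slightly.
pattern = [[(10 * i, 0), (10 * i + 2, 1), (10 * i + 4, 2)] for i in range(5)]
features = (hid(pattern), asl(pattern), stroke_density(pattern), aspect_ratio(pattern))
```

For this left-to-right pattern HID is positive; the same strokes listed in reverse order would give a negative HID, which is how the feature separates right-to-left scripts such as Arabic and Hebrew from the others.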

Reverse Distance: This is the distance by which the pen moves in the direction opposite to the normal writing direction. The normal writing direction is different for different scripts. The value of Reverse Distance is a nonnegative integer and its observed values were in the range [0,1200].
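One possible way to compute Reverse Distance is sketched below. It assumes the pattern is a list of strokes made of `(x, y)` points, that movement is measured along the x-axis only, and that pen-up jumps between strokes count as movement; the `direction` parameter (+1 for left-to-right scripts, -1 for right-to-left) is an assumption of this example.

```python
def reverse_distance(strokes, direction=+1):
    """Total x-distance travelled against the normal writing direction."""
    xs = [x for stroke in strokes for x, _ in stroke]   # full pen trajectory
    return sum(max(0, -direction * (b - a)) for a, b in zip(xs, xs[1:]))


# A long rightward stroke followed by a short stroke added further back,
# e.g. the crossbar of a "t" drawn after the word is finished.
pattern = [[(0, 0), (10, 0)], [(4, 5), (6, 5)]]
ltr = reverse_distance(pattern)                  # treated as a left-to-right script
rtl = reverse_distance(pattern, direction=-1)    # treated as a right-to-left script
```

Delayed strokes are the main source of reverse movement, so scripts that use them heavily tend to score higher on this feature.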

Average Horizontal Stroke Direction: Horizontal Stroke Direction (HD) of a stroke, s, can be understood as the horizontal direction from the start of the stroke to its end. Formally, we define HD(s) as:
$$\mathrm{HD}(s) = \begin{cases} +1 & \text{if } X_{pen\text{-}down}(s) < X_{pen\text{-}up}(s) \\ -1 & \text{otherwise} \end{cases}$$
where X_pen-down(.) and X_pen-up(.) are the x-coordinates of the pen-down and pen-up positions, respectively. For an n-stroke pattern, the Average Horizontal Stroke Direction is computed as the average of the HD values of its component strokes. The value of Average Horizontal Stroke Direction falls in the range [-1.0, 1.0].

Average Vertical Stroke Direction: It is defined similarly to the Average Horizontal Stroke Direction. The Vertical Direction (VD) of a single stroke s is defined as:
$$\mathrm{VD}(s) = \begin{cases} +1 & \text{if } Y_{pen\text{-}down}(s) < Y_{pen\text{-}up}(s) \\ -1 & \text{otherwise} \end{cases}$$
where Y_pen-down(.) and Y_pen-up(.) are the y-coordinates of the pen-down and pen-up positions, respectively. For an n-stroke pattern, the Average Vertical Stroke Direction is computed as the average of the VD values of its component strokes. The value of Average Vertical Stroke Direction falls in the range [-1.0, 1.0].

Vertical Interstroke Direction (VID): The Vertical Interstroke Direction is defined as:
$$\mathrm{VID} = \sum_{i=1}^{n-1} \mathrm{dir}(i, i+1), \qquad \mathrm{dir}(i, j) = \begin{cases} +1 & \text{if } \bar{Y}(stroke_i) < \bar{Y}(stroke_j) \\ -1 & \text{otherwise} \end{cases}$$

Here, $\bar{Y}(s)$ is the average of the y-coordinates of the stroke points and n is the number of strokes in the pattern. The value of VID is an integer and falls in the range (1 - n, n - 1).

Variance of Stroke Length: This is the variance in sample lengths of individual strokes within a pattern. The value of Variance of Stroke Length is a nonnegative integer.
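The direction and variance features above could be computed along the following lines. This is a sketch under stated assumptions: strokes are lists of `(x, y)` points, a direction is +1 when the coordinate increases and -1 otherwise, and the variance is the population variance (the text does not say which variant is meant). All function names are illustrative.

```python
def sign_dir(a, b):
    """+1 if the coordinate increases from a to b, otherwise -1."""
    return 1 if a < b else -1


def avg_horizontal_direction(strokes):
    """Mean of HD(s): pen-down x compared with pen-up x for each stroke."""
    return sum(sign_dir(s[0][0], s[-1][0]) for s in strokes) / len(strokes)


def avg_vertical_direction(strokes):
    """Mean of VD(s): pen-down y compared with pen-up y for each stroke."""
    return sum(sign_dir(s[0][1], s[-1][1]) for s in strokes) / len(strokes)


def vid(strokes):
    """Vertical Interstroke Direction between consecutive stroke centres."""
    mean_y = [sum(y for _, y in s) / len(s) for s in strokes]
    return sum(sign_dir(a, b) for a, b in zip(mean_y, mean_y[1:]))


def stroke_length_variance(strokes):
    """Population variance of the number of sample points per stroke."""
    lengths = [len(s) for s in strokes]
    mean = sum(lengths) / len(lengths)
    return sum((n - mean) ** 2 for n in lengths) / len(lengths)


pattern = [
    [(0, 0), (1, 1), (2, 2)],            # rightward, y increasing
    [(5, 4), (3, 4)],                    # leftward, y constant
    [(6, 0), (7, 1), (8, 2), (9, 3)],    # rightward, y increasing
]
```

Whether an increasing y means "up" or "down" depends on the coordinate system of the digitizer, so the sign of the vertical features should be interpreted with that in mind.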

STEPS IN HANDWRITTEN SCRIPT RECOGNITION

1. Preprocessing
Goal: remove unwanted variation.

Common methods:
  Skew / slant / size normalization:
    Trajectory data mapped to 2D representation
    Baselines / core area estimated similar to the offline case

Special online methods:
  Outlier elimination: remove position measurements caused by interference
  Resampling and smoothing of the trajectory
  Elimination of delayed strokes

Resampling and smoothing of the trajectory:
  Goal: normalize variations in writing speed (no identification!)
  Equidistant resampling & interpolation

Elimination of delayed strokes:
  Handling of delayed strokes is problematic: additional time variability!
  Remove by heuristic rules
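The equidistant resampling step mentioned above can be illustrated with linear interpolation along the pen trajectory. This is a minimal sketch, assuming a stroke is a list of `(x, y)` points and that the spacing is chosen by the caller; the function name and the choice of linear interpolation are assumptions of the example, and smoothing is not shown.

```python
import math


def resample_equidistant(points, spacing):
    """Return points spaced `spacing` apart along the polyline `points`."""
    if not points:
        return []
    out = [points[0]]
    prev = points[0]
    needed = spacing                 # arc length left before the next sample
    for cur in points[1:]:
        seg = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        while seg > 0 and seg >= needed:
            t = needed / seg         # fraction of the remaining segment
            prev = (prev[0] + t * (cur[0] - prev[0]),
                    prev[1] + t * (cur[1] - prev[1]))
            out.append(prev)
            seg -= needed
            needed = spacing
        needed -= seg                # carry the leftover into the next segment
        prev = cur
    return out


# The pen moved slowly at first (dense samples), then quickly (sparse samples).
raw = [(0, 0), (1, 0), (2, 0), (10, 0)]
resampled = resample_equidistant(raw, spacing=2.5)
```

After resampling, the number of points in a stroke depends only on its length and not on how fast it was written, which is what lets the sample count serve as a length measure.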

2. Feature Extraction
Basic idea: describe the shape of the pen trajectory locally.

Typical features:
  Slope angle of the local trajectory (represented as sin and cos: continuous variation)
  Binary pen-up vs. pen-down feature
  Hat feature for describing delayed strokes (strokes that spatially correspond to removed delayed strokes are marked)

Feature dynamics: In all applications of HMMs, dynamic features greatly enhance performance.
  Discrete time derivative of features
  Here: differences between successive slope angles

3. Classification
The last big step is classification. In this step, various models are used to map the extracted features to different classes and thus identify the characters or words the features represent.

Andrei Andreyevich Markov

Born: 14 June 1856 in Ryazan, Russia
Died: 20 July 1922 in Petrograd (now St Petersburg), Russia

Markov is particularly remembered for his study of Markov chains, sequences of random variables in which the future variable is determined by the present variable but is independent of the way in which the present state arose from its predecessors. This work launched the theory of stochastic processes.

Markov random processes

A random sequence has the Markov property if the distribution of its next state is determined solely by its current state. Any random process having this property is called a Markov random process.
For observable state sequences (the state is known from data), this leads to a Markov chain model.
For non-observable states, this leads to a Hidden Markov Model (HMM).

Chain Rule & Markov Property

Bayes rule: $P(q_0, q_1, \dots, q_T) = P(q_0) \prod_{t=1}^{T} P(q_t \mid q_0, \dots, q_{t-1})$
Markov property: $P(q_t \mid q_0, \dots, q_{t-1}) = P(q_t \mid q_{t-1})$

A Markov System

Has N states, called s1, s2, ..., sN.
There are discrete timesteps, t=0, t=1, ...
On the t-th timestep the system is in exactly one of the available states. Call it q_t. Note: q_t ∈ {s1, s2, ..., sN}.
Between each timestep, the next state is chosen randomly.
The current state determines the probability distribution for the next state.

Example with N = 3: at t=0 the current state is q_0 = s3; at t=1 the current state is q_1 = s2.

Transition probabilities P(q_{t+1} = column | q_t = row), often notated with arcs between states:

          s1     s2     s3
  s1      0      0      1
  s2      1/2    1/2    0
  s3      1/3    2/3    0

Markov Property

q_{t+1} is conditionally independent of {q_{t-1}, q_{t-2}, ..., q_1, q_0} given q_t. In other words:
P(q_{t+1} = s_j | q_t = s_i) = P(q_{t+1} = s_j | q_t = s_i, any earlier history)
Notation: a_ij = P(q_{t+1} = s_j | q_t = s_i)
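The three-state Markov system above can be simulated directly from its transition probabilities. The sketch below uses the probabilities given on the slide; the dictionary layout, the function name `simulate`, and the fixed random seed are choices made for this example.

```python
import random
from fractions import Fraction as F

STATES = ["s1", "s2", "s3"]

# A[current][next] = P(q_{t+1} = next | q_t = current), as listed on the slide.
A = {
    "s1": {"s1": F(0), "s2": F(0), "s3": F(1)},
    "s2": {"s1": F(1, 2), "s2": F(1, 2), "s3": F(0)},
    "s3": {"s1": F(1, 3), "s2": F(2, 3), "s3": F(0)},
}


def simulate(start, steps, rng):
    """Sample a state sequence; each step depends only on the current state."""
    path = [start]
    for _ in range(steps):
        row = A[path[-1]]
        weights = [float(row[s]) for s in STATES]
        path.append(rng.choices(STATES, weights=weights)[0])
    return path


path = simulate("s3", 200, random.Random(0))
```

Only `path[-1]` is consulted when drawing the next state, which is the Markov property in code form: the earlier history never enters the computation.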

Example: A Simple Markov Model For Weather Prediction

On any given day, the weather can be described as being in one of three states:
State 1: precipitation (rain, snow, hail, etc.)
State 2: cloudy
State 3: sunny

Transitions between states are described by the transition matrix

This model can then be described by the following directed graph

Basic Calculations-1

Example: What is the probability that the weather for eight consecutive days is sun-sun-sun-rain-rain-sun-cloudy-sun?

Solution:
O = (sun, sun, sun, rain, rain, sun, cloudy, sun)
  = (3, 3, 3, 1, 1, 3, 2, 3)
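The calculation for this observation sequence is a product of transition probabilities, one per day-to-day change. The transition matrix from the slide is not reproduced in the text, so the values of `A` below are illustrative assumptions, as is the choice to treat the first day as known to be sunny (initial probability 1).

```python
# ASSUMED transition matrix: A[i][j] = P(tomorrow is state j+1 | today is state i+1).
# States: 1 = precipitation, 2 = cloudy, 3 = sunny.
A = [
    [0.4, 0.3, 0.3],
    [0.2, 0.6, 0.2],
    [0.1, 0.1, 0.8],
]


def sequence_probability(states, transition, initial_prob=1.0):
    """P(states) for a first-order Markov chain with 1-based state labels."""
    p = initial_prob
    for prev, cur in zip(states, states[1:]):
        p *= transition[prev - 1][cur - 1]
    return p


O = [3, 3, 3, 1, 1, 3, 2, 3]
p = sequence_probability(O, A)   # a33 * a33 * a31 * a11 * a13 * a32 * a23
```

With these assumed values the product is 0.8 x 0.8 x 0.1 x 0.4 x 0.3 x 0.1 x 0.2, roughly 1.5e-4. A different matrix or a different initial distribution changes the number but not the method.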

From Markov To Hidden Markov

The previous model assumes that each state can be uniquely associated with an observable event.
Once an observation is made, the state of the system is then trivially retrieved.
This model, however, is too restrictive to be of practical use for most realistic problems.
To make the model more flexible, we will assume that the outcomes or observations of the model are a probabilistic function of each state.
Each state can produce a number of outputs according to a unique probability distribution, and each distinct output can potentially be generated at any state.
These are known as Hidden Markov Models (HMMs) because the state sequence is not directly observable; it can only be approximated from the sequence of observations produced by the system.
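To make the idea of states emitting observations concrete, the sketch below samples from a small two-state model. All numbers, the symbols "a" and "b", and the function name `sample_hmm` are hypothetical and chosen only for illustration.

```python
import random


def sample_hmm(pi, A, B, symbols, length, rng):
    """Draw a hidden state path and the observations it emits."""
    n = len(pi)
    states, observations = [], []
    state = rng.choices(range(n), weights=pi)[0]
    for _ in range(length):
        states.append(state)
        # Each state has its own distribution over the same output symbols.
        observations.append(rng.choices(symbols, weights=B[state])[0])
        state = rng.choices(range(n), weights=A[state])[0]
    return states, observations


pi = [0.5, 0.5]                      # hypothetical starting probabilities
A = [[0.9, 0.1], [0.2, 0.8]]         # hypothetical state transitions
B = [[0.7, 0.3], [0.1, 0.9]]         # hypothetical emission probabilities
states, observations = sample_hmm(pi, A, B, ["a", "b"], 10, random.Random(1))
```

An observer sees only `observations`. Because both states can emit both symbols, the same output sequence is compatible with many different state paths, which is why the states are called hidden.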

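HMM Formal Definition

An HMM, λ, is a 5-tuple consisting of:
N: the number of states
M: the number of possible observations
{π_1, π_2, ..., π_N}: the starting state probabilities, P(q_0 = S_i) = π_i
The state transition probabilities, P(q_{t+1} = S_j | q_t = S_i) = a_ij:
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{pmatrix}$$
The observation probabilities, P(O_t = k | q_t = S_i) = b_i(k):
$$\begin{pmatrix} b_1(1) & b_1(2) & \cdots & b_1(M) \\ b_2(1) & b_2(2) & \cdots & b_2(M) \\ \vdots & \vdots & & \vdots \\ b_N(1) & b_N(2) & \cdots & b_N(M) \end{pmatrix}$$

The coin-toss problem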

To illustrate the concept of an HMM, consider the following scenario:
Assume that you are placed in a room with a curtain.
Behind the curtain there is a person performing a coin-toss experiment.
This person selects one of several coins, and tosses it: heads (H) or tails (T).
The person tells you the outcome (H, T), but not which coin was used each time.
Your goal is to build a probabilistic model that best explains a sequence of observations O = {o1, o2, o3, o4, ...} = {H, T, T, H, ...}.
The coins represent the states; these are hidden because you do not know which coin was tossed each time.
The outcome of each toss represents an observation.
A likely sequence of coins may be inferred from the observations, but this state sequence will not be unique.

The Coin Toss Example: 1 coin

With a single coin, the Markov model is observable since there is only one state.
In fact, we may describe the system with a deterministic model where the states are the actual observations (see figure).
The model parameter P(H) may be found from the ratio of heads and tails.
O = H H H T T H
S = 1 1 1 2 2 1

The Coin Toss Example: 2 coins

From Markov to Hidden Markov Model: The Coin Toss Example 3 coins

1, 2 or 3 coins?

Which of these models is best?
Since the states are not observable, the best we can do is select the model that best explains the data (e.g., Maximum Likelihood criterion).
Whether the observation sequence is long and rich enough to warrant a more complex model is a different story, though.
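Comparing candidate models by likelihood can be sketched with the forward algorithm, which sums over all hidden state paths. The code below is an illustration only: the observation coding (H = 0, T = 1), the example sequence, and every model parameter are hypothetical, and the per-step scaling is one common way to avoid numerical underflow.

```python
import math


def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | pi, A, B)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)               # P(o_t | o_1 .. o_{t-1})
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


H, T = 0, 1
obs = [H, H, H, H, T, T, T, T]

# 1-coin model: a single state whose P(H) is the observed ratio of heads (0.5).
one_coin = log_likelihood(obs, [1.0], [[1.0]], [[0.5, 0.5]])

# 2-coin model (hypothetical): two biased coins that are rarely swapped.
two_coin = log_likelihood(
    obs,
    [0.5, 0.5],
    [[0.8, 0.2], [0.2, 0.8]],
    [[0.9, 0.1], [0.1, 0.9]],
)
```

For this sequence the two-coin model scores higher because it can explain the run of heads followed by a run of tails. On a short sequence such a gain may simply reflect the extra parameters, which is the caution raised above about whether the data warrant a more complex model.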

Future Scope

Over the past three decades, many different methods for recognizing characters have been explored by a large number of scientists. Researchers in different parts of the world have proposed and tested a variety of approaches to improve usability.

References

ieeexplore.ieee.org/iel5/34/28182/01261096.pdf
F. Coulmas, Writing Systems: An Introduction to Their Linguistic Analysis. Cambridge University Press, 2003.
R. Plamondon and S. N. Srihari, "On-line and off-line handwriting recognition: A comprehensive survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, pp. 63-84, 2000.
A. L. Spitz, "Determination of the script and language content of document images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 235-245, 1997.
G. X. Tan, C. Viard-Gaudin, and A. Kot, "Automatic Writer Identification Framework for Online Handwritten Documents Using Character Prototypes," Pattern Recogn., 2009.

THANK YOU !