1. INTRODUCTION
In today's world there is a growing demand for software systems that can recognize the characters in
paper documents once the documents have been scanned into a computer. A great number of newspapers
and books exist only in printed form, covering many different subjects, and there is strong interest in
storing the information available in these paper documents on computer storage media so that it can later
be retrieved by searching, avoiding damage to or loss of the originals.
One simple way to store the information in paper documents on a computer is to scan the
documents and store them as IMAGES. However, images make it very difficult to reuse the information:
the contents cannot be read or searched line by line and word by word, because the font characteristics of
the characters in the paper documents differ from the fonts used by the computer. As a result, the
computer is unable to recognize the characters while reading them. The process of storing the contents of
paper documents in computer storage and then reading and searching that content is called DOCUMENT
PROCESSING. Sometimes document processing must also handle information in languages other than
English. For this purpose an application software system called a CHARACTER RECOGNITION
SYSTEM is needed. This process is also called CHARACTER RECOGNITION AND
CONVERSION (CRC).
Our goal, therefore, is to develop character recognition software that performs Document Image
Analysis, transforming documents from paper format to electronic format. Various techniques exist for
this process; we have chosen Character Recognition and Conversion.
HISTORY OF OCR
Early optical character recognition may be traced to technologies involving telegraphy
and the creation of reading devices for the blind. In 1914, Emanuel Goldberg developed a machine
that read characters and converted them into standard telegraph code. Concurrently, Edmund Fournier
d'Albe developed the Optophone, a handheld scanner that, when moved across a printed page,
produced tones that corresponded to specific letters or characters.
In the late 1920s and into the 1930s, Emanuel Goldberg developed what he called a "Statistical
Machine" for searching microfilm archives using an optical code recognition system. In 1931 he
was granted U.S. Patent number 1,838,389 for the invention. The patent was later acquired by IBM.
With the advent of smartphones and smart glasses, OCR can be used in internet-connected
mobile applications that extract text captured with the device's camera. Devices that do not
have OCR functionality built into the operating system typically use an OCR API to extract the
text from the image file captured and provided by the device. The OCR API returns the extracted
text, along with information about the location of the detected text in the original image, back to
the device app for further processing (such as text-to-speech) or display.
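As a sketch of the round trip just described, the snippet below parses the kind of JSON payload such an OCR API might return to the device app. This is a Python illustration only; the field names (`words`, `text`, `left`, `top`, `width`, `height`) are assumptions for the example, not any particular vendor's schema:

```python
import json

def parse_ocr_response(payload):
    """Extract recognized text and word bounding boxes from a
    hypothetical OCR API JSON response (field names assumed)."""
    data = json.loads(payload)
    words = data["words"]
    text = " ".join(w["text"] for w in words)
    boxes = [(w["left"], w["top"], w["width"], w["height"]) for w in words]
    return text, boxes

# Example response a device app might receive back from the API.
sample = json.dumps({"words": [
    {"text": "HELLO", "left": 10, "top": 5, "width": 60, "height": 14},
    {"text": "WORLD", "left": 80, "top": 5, "width": 64, "height": 14},
]})
text, boxes = parse_ocr_response(sample)
print(text)      # HELLO WORLD
print(boxes[0])  # (10, 5, 60, 14)
```

The bounding boxes are what allow the app to highlight detected text in the original image or feed individual words to a text-to-speech engine.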
%Number
one=imread('letters_numbers\1.bmp');
two=imread('letters_numbers\2.bmp');
three=imread('letters_numbers\3.bmp');
four=imread('letters_numbers\4.bmp');
five=imread('letters_numbers\5.bmp');
six=imread('letters_numbers\6.bmp');
seven=imread('letters_numbers\7.bmp');
eight=imread('letters_numbers\8.bmp');
nine=imread('letters_numbers\9.bmp');
zero=imread('letters_numbers\0.bmp');
%*-*-*-*-*-*-*-*-*-*-*-
letter=[A B C D E F G H I J K L M...
    N O P Q R S T U V W X Y Z];
number=[one two three four five...
    six seven eight nine zero];
lowercase = [a b c d e f g h i j k ...
    l m n o p q r s t u v w x y z];
character=[letter number lowercase];
templates=mat2cell(character,42,[24 24 24 24 24 24 24 ...
%figure,imshow(im_texto);
%title('line sent in the function letter');
for s=1:num_filas
    s;
    sum_col = sum(im_texto(:,s));
    if sum_col==0
        k = 'true';
        nm=im_texto(:,1:s-1); % First letter matrix
        %figure,imshow(nm);
        %title('first letter in the function letter_in_a_line');
        %pause(1);
        rm=im_texto(:,s:end); % Remaining line matrix
        %figure,imshow(rm);
        %title('remaining letters in the function letter_in_a_line');
        %pause(1);
        fl = clip(nm);
        %pause(1);
        re=clip(rm);
        space = size(rm,2)-size(re,2);
        %*-*-*Uncomment lines below to see the result*-*-*-*-
        %subplot(2,1,1);imshow(fl);
        %subplot(2,1,2);imshow(re);
        break
    else
        fl=im_texto; %Only one letter left.
        re=[ ];
        space = 0;
    end
end
function img_out=clip(img_in)
[f c]=find(img_in);
img_out=img_in(min(f):max(f),min(c):max(c));
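The segmentation idea used above can be restated compactly: a column of the binary line image whose sum is zero is a gap between letters, and the line is split at the first such column. The Python sketch below mirrors that column-sum test (it is an illustration of the logic, not a drop-in replacement for the MATLAB code; a binary image is represented as a list of rows of 0/1):

```python
def split_first_letter(img):
    """Split a binary line image at its first all-zero column,
    returning (first_letter, remainder).  Mirrors the column-sum
    test in the MATLAB segmentation code above."""
    if not img:
        return img, []
    ncols = len(img[0])
    for s in range(ncols):
        col_sum = sum(row[s] for row in img)
        if col_sum == 0:
            first = [row[:s] for row in img]   # columns before the gap
            rest = [row[s:] for row in img]    # gap column onward
            return first, rest
    return img, []  # no blank column: only one letter left

# Two 2-pixel-wide "letters" separated by a blank column.
line = [[1, 1, 0, 1, 1],
        [1, 0, 0, 0, 1]]
first, rest = split_first_letter(line)
print(first)  # [[1, 1], [1, 0]]
```

Repeated calls to this split, each followed by a bounding-box clip like the `clip` function above, peel letters off the line one at a time.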
5.4. Recognize letters
%function read_letter
function letter=read_letter(imagn,num_letras)
% Computes the correlation between template and input image
% and its output is a string containing the letter.
% Size of 'imagn' must be 42 x 24 pixels
% Example:
%   imagn=imread('D.bmp');
%   letter=read_letter(imagn)
%load templates
global templates
comp=[ ];
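The recognition step correlates the resized letter image against every stored template and picks the best match. The Python sketch below illustrates that technique with a hand-rolled Pearson correlation (the equivalent of MATLAB's `corr2`); the 2x2 toy templates are invented purely for the example:

```python
def corr2(a, b):
    """Pearson correlation between two equal-size images
    (lists of rows of numbers), like MATLAB's corr2."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    n = len(flat_a)
    ma, mb = sum(flat_a) / n, sum(flat_b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(flat_a, flat_b))
    da = sum((x - ma) ** 2 for x in flat_a) ** 0.5
    db = sum((y - mb) ** 2 for y in flat_b) ** 0.5
    if da == 0 or db == 0:
        return 0.0  # a constant image correlates with nothing
    return num / (da * db)

def read_letter(img, templates, alphabet):
    """Return the alphabet symbol whose template correlates best
    with the (already resized) input image."""
    scores = [corr2(img, t) for t in templates]
    return alphabet[scores.index(max(scores))]

# Toy 2x2 "templates" (assumed shapes, for illustration only).
t_I = [[0, 1], [0, 1]]
t_O = [[1, 1], [1, 1]]
print(read_letter([[0, 1], [0, 1]], [t_I, t_O], "IO"))  # I
```

In the real system each template is 42 x 24 pixels, which is why every cropped letter is resized to that shape before correlation.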
% PRINCIPAL PROGRAM
warning off %#ok<WNOFF>
% Clear workspace
clc, close all, clear all
% Read image
imagen=imread('testcheck1.jpg');
% Show image
imagen1 = imagen;
figure,imshow(imagen1);
title('INPUT IMAGE WITH NOISE')
% Convert to gray scale
if size(imagen,3)==3 %RGB image
    imagen=rgb2gray(imagen);
end
% Convert to BW
threshold = graythresh(imagen);
imagen = ~im2bw(imagen,threshold);
imagen2 = imagen;
%figure,imshow(imagen2);
%title('before bwareaopen')
% Remove all objects containing fewer than 15 pixels
imagen = bwareaopen(imagen,15);
imagen3 = imagen;
%figure,imshow(imagen3);
%title('after bwareaopen')
%Storage matrix word from image
word=[ ];
re=imagen;
%Open text.txt as file for write
fid = fopen('text.txt', 'wt');
% Load templates
load templates
global templates
% Compute the number of letters in template file
num_letras=size(templates,2);
while 1
    %Fcn 'lines_crop' separates lines in text
    [fl re]=lines_crop(re); %fl = first line, re = remaining image
    imgn=fl;
    n=0;
    %Uncomment line below to see lines one by one
    %figure,imshow(fl);pause(2)
    %-----------------------------------------------------------------
    spacevector = []; % to compute the total spaces between
                      % adjacent letters
    rc = fl;
    while 1
        %Fcn 'letter_crop' separates letters in a line
        [fc rc space]=letter_crop(rc);
        %fc = first letter in the line
        %rc = remaining cropped line
        %space = space between the cropped letter and the next letter
        %Uncomment line below to see letters one by one
        %figure,imshow(fc);pause(0.5)
        img_r = imresize(fc,[42 24]); %resize letter so that correlation
                                      %can be performed
        n = n + 1;
        spacevector(n)=space;
        %Fcn 'read_letter' correlates the cropped letter with the images
        %given in the folder 'letters_numbers'
        letter = read_letter(img_r,num_letras);
        %letter concatenation
        word = [word letter];
        if isempty(rc) %break loop when there are no more characters
            break;
        end
    end
    max_space = max(spacevector); % widest gap seen in this line (assumed
                                  % initialization; missing from the listing)
    no_spaces = 0;                % spaces inserted so far (assumed)
    for x = 1:n %loop to introduce spaces at the requisite locations
        if spacevector(x+no_spaces) > (0.75 * max_space)
            no_spaces = no_spaces + 1;
            for m = x:n
                word(n+x-m+no_spaces) = word(n+x-m+no_spaces-1);
            end
            word(x+no_spaces) = ' ';
            spacevector = [0 spacevector];
        end
    end
    %fprintf(fid,'%s\n',lower(word)); %Write 'word' in text file (lower)
    fprintf(fid,'%s\n',word); %Write 'word' in text file (upper)
    % Clear 'word' variable
    word=[ ];
    %When the sentences finish, break the loop
    if isempty(re) %See variable 're' in Fcn 'lines'
        break
    end
end
fclose(fid);
%Open 'text.txt' file
winopen('text.txt')
clear all
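The space-insertion rule in the loop above is: a gap wider than 0.75 times the widest gap in the line marks a word break. The MATLAB version inserts spaces by shifting characters in place; the Python sketch below restates the same rule more directly (an illustration of the heuristic, not the original implementation):

```python
def insert_spaces(word, gaps, ratio=0.75):
    """Rebuild a recognized line with spaces.  gaps[i] is the blank
    width that followed letter i when it was cropped; a gap wider
    than ratio * max(gaps) is treated as a word break."""
    if not gaps or max(gaps) == 0:
        return word
    threshold = ratio * max(gaps)
    out = []
    for ch, gap in zip(word, gaps):
        out.append(ch)
        if gap > threshold:
            out.append(" ")
    return "".join(out).rstrip()

# Letters H,I,A,L,L with a 9-pixel gap after the I.
print(insert_spaces("HIALL", [2, 9, 2, 2, 0]))  # HI ALL
```

Because the threshold is relative to the widest gap on each line, the heuristic adapts to different font sizes, though it can over-segment lines whose inter-letter gaps are nearly uniform.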
6. SAMPLE TESTING
During execution, the user selects an image file from a directory using a dialog box.
Figure 6.1: Select an Image Dialog box
The selected image is then displayed in the processing platform.
Figure 6.2: Show Image with error
The processed image is displayed as text in Notepad.
Figure 6.3: Text generation in a textbox
7. PERFORMANCE TESTING
There are many types of testing which can be performed on the software. One of them is
black box testing.
7.1 Black Box Testing
In this type of testing the internal design of the product is not examined; the only things to check
are each input and its corresponding output. Out of the millions of possible inputs, we have
considered only a few representative ones.
Figure 7.1: Conversion of text from a single line image to text format
Figure 7.2: Conversion of text from a multi-line image to text format
Figure 7.3: Conversion of text from a colored background image to text format
Figure 7.4: Conversion of text from a lower case letter format
Figure 7.5: Conversion of text from a lower case letter and upper case letter format
Figure 7.6: Conversion of text from a distorted image to text format
8. ARCHITECTURE
The architecture of the Character Recognition and Conversion system on a grid infrastructure
consists of three main components:
Scanner
Character Recognition Software
Output Interface
Figure 8.1: CRC Architecture
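The three components form a simple pipeline: the scanner produces an image, the recognition software turns it into text, and the output interface delivers that text to the user. The Python sketch below wires stub stages together to make the data flow explicit (the stage bodies are placeholders for illustration, not the MATLAB implementation):

```python
def scanner(path):
    """Stub for the Scanner: would return a raster image of the page."""
    return {"source": path, "pixels": "..."}  # placeholder image record

def recognize(image):
    """Stub for the Character Recognition Software: would segment the
    image into lines and letters and match them against templates."""
    return "RECOGNIZED TEXT"  # placeholder result

def output_interface(text, out_path):
    """Stub for the Output Interface: hands the text to the user,
    e.g. by writing it to a text file."""
    return f"{out_path}: {text}"

def crc_pipeline(path):
    image = scanner(path)
    text = recognize(image)
    return output_interface(text, "text.txt")

print(crc_pipeline("page1.bmp"))  # text.txt: RECOGNIZED TEXT
```

Keeping the stages separate like this is what allows, for example, the scanner to be replaced by a camera capture step without touching the recognition software.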
9. APPLICATIONS
Language Conversion
Along with English, a variety of other languages can be converted into a user-readable
format by language translation software, which uses Document Image Analysis (DIA) as its main
functioning component.
Automatic Number Plate Recognition
Automatic number plate recognition is a mass surveillance method that uses optical
character recognition on images to read vehicle registration plates. They can use existing closed-
circuit television or road-rule enforcement cameras, or ones specifically designed for the task.
They are used by various police forces and as a method of electronic toll collection on pay-per-
use roads and cataloging the movements of traffic or individuals.
Data Entry for Business Documents
OCR is widely used as a form of data entry from printed paper records, such as
passport documents, invoices, bank statements, computerized receipts, business cards, mail,
printouts of static data, or any other suitable documentation.
Other applications include:
Quickly making textual versions of printed documents
Making electronic images of printed documents searchable
Automatically extracting key information from insurance documents
Converting handwriting in real time to control a computer
Assistive technology for blind and visually impaired users