Scene Text Detection on Images using Cellular Automata

1. Scene Text Detection on Images using Cellular Automata
Konstantinos Zagoris and Ioannis Pratikakis
Image Processing and Multimedia Lab, Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi

2. Outline
Introduction
State of the Art
Disadvantages
Architecture of the proposed method
Canny Edge Detector
Coordinating Logic Filters (CLF)
Proposed Cellular Automata Text Detection Method
Evaluation and Experimental Results

3. Introduction
Textual information in images or video constitutes a very rich source of high-level semantics for retrieval and indexing.
It can be acquired as scene text that was captured by a video or photo camera as part of a scene.
Text detection on natural scenes is still a hard task to solve.
Existing methods have a very high computational cost.

4. State of the Art
Methods split into two categories: region-based and texture-based.
Region-based algorithms group pixels based on common characteristics.
Texture-based methods scan the image at different scales using a sliding window and classify text areas based on texture information.
From another perspective, methods can be divided into heuristic-based and machine learning-based.
Heuristic-based algorithms segment the image into small regions and then group them by some constraints.
Machine learning-based methods use directly

5. Disadvantages
Many parameters have to be estimated experimentally, which condemns these methods to data dependency and lack of generality.
When the background is really complex, they become computationally expensive.
Texture-based techniques cannot satisfactorily capture text larger than the sliding window.
Increasing the window size makes these methods quite costly.
In addition, they still use empirical thresholds on specific features, and therefore lack adaptability.

6. Proposed Method
Address the scene text detection problem by modeling texture in a cellular automata (CA) context.
Replace costly image processing operations with their equivalent cellular operations.
Eliminate most limitations, such as the empirical thresholds and heavy computational procedures.

7. Architecture of the proposed method
Original Image -> Canny Edge Map -> Cellular Automata (Coordinating Logic Filters: Logical OR, Logical AND, Logical OR; Majority State Rule) -> Edge Projection Filtering -> Final Text

8. Coordinating Logic Filters (CLF)
CLFs execute coordinate logic operations among the pixels of the image.
The CLF operations are similar to the morphological operations and achieve similar functionality.
Morphological dilation corresponds to the logical OR.
Morphological erosion corresponds to the logical AND.

9. Canny Edge Detector
Detects the salient image edges.
Uses Sobel masks, thresholding and non-maxima suppression (low threshold equal to 20 and high threshold equal to 100).
The final edge map is a binarised image with the contour pixels set to one (white) and the remaining pixels set to zero (black).
This approach exploits the fact that text lines produce strong vertical edges that are horizontally aligned with a high density.
gives us the opportunity to detect normal or

10. Canny Edge Detector

11. Proposed Cellular Automata
The proposed CA is considered to be a 2-D lattice of cells where every pixel is represented by a cell.
The CA grid width and height are defined by the edge image width and height.
Each cell has two states, as the input image is binary.
Taking advantage of the CA flexibility, the transition rules change and are applied in four consecutive steps, resulting in a four-time-step CA evolution.

12. 1st Step Logical OR
13. 1st Step Logical OR
14. 2nd Step Logical AND
15. 2nd Step Logical AND
16. 3rd Step Logical OR
17. 3rd Step Logical OR
18. Majority State Rule
19. 4th Step - Majority State Rule

20. Edge Projection Filtering
In high edge density images, the method produces a number of false positives.
Post-processing filtering is required in order to remove them.
They are filtered based on horizontal and vertical projections.
Areas with mean horizontal and vertical projections below a threshold are discarded.

21. Edge Projection Filtering
22. Examples
23. Examples
24. Evaluation

25. Evaluation
1. Wolf, C., Jolion, J.M.: Object count/area graphs for the evaluation of object detection and segmentation algorithms. International Journal on Document Analysis and Recognition 8(4), 280-296 (2006)

26. Experimental Results
In order to showcase the advantages of our proposed method, we test it against a machine-learning edge-based scene text detection system.
We replace the CLF with the corresponding morphological operations (dilation and opening) and the majority state rule with a Support Vector Machines (SVMs) classifier.

Method                           Recall    Precision   Harmonic Mean
Proposed CA-based method         0.7942    0.7462      0.7652
Machine-learning based method    0.7134    0.5234      0.6038

27. Experimental Results
Mean execution time of each method for a set of images (15 in total) on an Intel Core 2 Quad CPU Q9550 (2.83 GHz) machine.

Method                           Mean Execution Time (sec)
Proposed CA-based method         2.75
Machine-learning based method    5.96

28. Conclusions
A method based on Cellular Automata was presented for the detection of scene text in natural images.
Initially, the Canny edge detector is employed in order to expose the dominant edges of the image.
Then a CA is used for the calculation of the candidate text areas. Its rules depend on Coordinating Logic Filters and on the majority state rule.
A post-processing technique based on edge projection analysis is employed for the high density edge images in order to eliminate the false positives.

29. !Thank You!
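As an illustration of the four-step CA evolution described in slides 11 to 19 (logical OR, logical AND, logical OR, majority state rule), the following is a minimal NumPy sketch rather than the authors' implementation. The slides do not state the neighborhood shape, the boundary handling, or the majority threshold, so the 3x3 Moore neighborhood, the zero-valued border, and the "at least 5 of 9 cells" rule are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def neighborhood_stack(grid: np.ndarray) -> np.ndarray:
    """Return a (9, H, W) stack with each cell's 3x3 neighborhood.

    ASSUMPTION: 3x3 Moore neighborhood; cells outside the lattice count as 0.
    """
    h, w = grid.shape
    padded = np.pad(grid, 1, mode="constant", constant_values=0)
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])


def clf_or(grid: np.ndarray) -> np.ndarray:
    """Logical OR over the neighborhood (plays the role of dilation)."""
    return neighborhood_stack(grid).max(axis=0)


def clf_and(grid: np.ndarray) -> np.ndarray:
    """Logical AND over the neighborhood (plays the role of erosion)."""
    return neighborhood_stack(grid).min(axis=0)


def majority(grid: np.ndarray) -> np.ndarray:
    """Set a cell to 1 when most of its neighborhood is 1.

    ASSUMPTION: "majority" means at least 5 of the 9 cells.
    """
    votes = neighborhood_stack(grid).sum(axis=0)
    return (votes >= 5).astype(grid.dtype)


def evolve(edge_map) -> np.ndarray:
    """Run the four time steps on a binary edge map: OR, AND, OR, majority."""
    grid = (np.asarray(edge_map) > 0).astype(np.uint8)
    for rule in (clf_or, clf_and, clf_or, majority):
        grid = rule(grid)
    return grid
```

Calling `evolve` on a binarised edge map returns a binary grid of the same shape. Because every rule only reads a fixed local neighborhood, each time step is a handful of array operations, which is the kind of cheap cellular operation the slides contrast with heavier image processing.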

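The edge projection filtering of slide 20 can be sketched in the same spirit. The slides only say that areas whose mean horizontal and vertical projections fall below a threshold are discarded. The box format, the use of raw edge-pixel counts as projections, the reading that both means must be below the threshold, and the threshold value itself are assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np


def mean_projections(edge_map, box):
    """Mean horizontal and vertical edge projections inside a candidate area.

    ASSUMPTION: box is (top, left, bottom, right), half-open and non-empty.
    """
    top, left, bottom, right = box
    region = np.asarray(edge_map)[top:bottom, left:right] > 0
    horizontal = region.sum(axis=1)  # edge pixels in each row
    vertical = region.sum(axis=0)    # edge pixels in each column
    return float(horizontal.mean()), float(vertical.mean())


def keep_area(edge_map, box, threshold):
    """Return False when the area should be discarded as a false positive.

    ASSUMPTION: an area is discarded only when both means are below threshold.
    """
    mean_h, mean_v = mean_projections(edge_map, box)
    return not (mean_h < threshold and mean_v < threshold)
```

A candidate area densely covered by edge pixels yields high means in both directions and is kept, while an area containing only a few stray edge pixels falls below the threshold and is dropped.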