Robust extraction of text from scene images is essential for successful scene text recognition. Scene images usually exhibit nonuniform illumination, complex backgrounds, and text-like objects. In this paper, we propose a text extraction algorithm that combines adaptive binarization with a perceptual color clustering method. Adaptive binarization can handle gradual illumination changes on character regions, so it extracts whole character regions even when shadows and/or lighting variations degrade image quality. However, binarization of gray-scale images cannot distinguish color components that share the same luminance. The perceptual color clustering method complements binarization by extracting text regions whose colors lie within a small perceptual distance, thereby avoiding this limitation. Text verification based on local information from a single component and the global relationship between multiple components determines the true text components. It is demonstrated that the proposed method achieves reasonable text extraction accuracy on moderately difficult examples from the ICDAR 2003 database.
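The adaptive binarization step described above can be sketched with a Sauvola-style local threshold, in which each pixel is compared against the mean and standard deviation of its surrounding window; this is an illustrative sketch, and the window size and the parameters `k` and `r` below are common defaults, not values taken from the paper:

```python
import numpy as np

def adaptive_binarize(gray, window=15, k=0.2, r=128.0):
    """Sauvola-style local thresholding: each pixel is thresholded by the
    mean and standard deviation of its local window, so a gradual
    illumination gradient does not merge characters into the background."""
    g = gray.astype(np.float64)
    pad = window // 2
    padded = np.pad(g, pad, mode="edge")  # replicate borders for edge windows
    h, w = g.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + window, j:j + window]
            m, s = patch.mean(), patch.std()
            t = m * (1.0 + k * (s / r - 1.0))  # Sauvola threshold
            out[i, j] = 255 if g[i, j] > t else 0
    return out
```

Because the threshold tracks the local mean, a dark character sitting on a smoothly brightening background is still separated cleanly at every position, which is the property the abstract relies on for shadowed or unevenly lit text.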
However, systems developed to recognize words (the term word is used loosely here to mean any string of characters in a scene image) still obtain unsatisfactory results. Indeed, the majority of classical approaches propose to segment the text into characters, generally relying on a pre-processing step that distinguishes text from background (a complete survey of character segmentation methods is presented in ), and then recognize the extracted characters. However, the various distortions in scene text images make segmentation very hard, leading to poor recognition results.
ABSTRACT: Understanding text captured in real-world scenes is a challenging problem in visual pattern recognition and continues to generate significant interest in the OCR (Optical Character Recognition) community. This paper proposes a novel method for recognizing scene text that avoids the conventional character segmentation step. The idea is to scan the text image with multi-scale windows and apply a robust recognition model, relying on a neural classification approach, to every window in order to recognize valid characters and identify invalid ones. Recognition results are represented as a graph model in order to determine the best sequence of characters. Linguistic knowledge is also incorporated to remove errors due to recognition confusions. The designed method is evaluated on the ICDAR 2003 database of scene text images and outperforms state-of-the-art approaches.
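The multi-scale window scan can be sketched as follows; `classify` is a hypothetical stand-in for the paper's neural recognition model, and the window widths, step, and confidence threshold are illustrative choices:

```python
import numpy as np

def multiscale_scan(img, classify, widths=(16, 24, 32), step=4, thresh=0.5):
    """Slide windows of several widths across a text-line image and keep
    the windows the classifier accepts as valid characters.
    `classify` maps a window to (label, confidence); it stands in for
    the neural classifier, which is not reproduced here."""
    h, w = img.shape
    hits = []
    for win in widths:
        if win > w:
            continue  # window wider than the image: skip this scale
        for x in range(0, w - win + 1, step):
            label, conf = classify(img[:, x:x + win])
            if conf >= thresh:
                hits.append((x, win, label, conf))
    return hits
```

In the full method, the accepted windows would then be arranged in a graph, and the best-scoring path through that graph would yield the output character sequence.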
Proc. of Int. Workshop on Document Analysis Systems (DAS'12); 03/2012
ABSTRACT: Text detection and recognition in real images taken in unconstrained environments, such as street view images, remain surprisingly challenging in computer vision. Extraction of text and captions from images and videos is important and in great demand for video retrieval, annotation, indexing, and content analysis. In this paper, we propose a text extraction algorithm using the Dual-Tree Complex Wavelet Transform. It is demonstrated that the proposed method achieves reasonable text extraction accuracy on moderately difficult examples.
ABSTRACT: In this paper, we propose a framework for isolating text regions from natural scene images. The main algorithm has two functions: it generates text region candidates, and it verifies the labels of those candidates (text or non-text). The text region candidates are generated by a modified K-means clustering algorithm that draws on texture features, edge information, and color information. The candidate labels are then verified in a global sense by a Markov Random Field model in which a collinearity weight is added, since most text is aligned. The proposed method achieves reasonable accuracy for text extraction on moderately difficult examples from the ICDAR 2003 database.
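The candidate-generation step can be sketched with a plain K-means over pixel colors, yielding one binary mask per cluster as a text-region candidate. The paper's modified K-means additionally incorporates texture and edge features, which are omitted here, and the deterministic initialization over sorted unique colors is an illustrative simplification:

```python
import numpy as np

def kmeans_color_candidates(img, k=3, iters=10):
    """Cluster pixel colors with plain K-means and return one binary mask
    per cluster; each mask is a text-region *candidate* for later
    verification. Centers are seeded from evenly spaced entries of the
    sorted unique colors (duplicates possible if the image has fewer
    than k distinct colors)."""
    h, w, c = img.shape
    pix = img.reshape(-1, c).astype(np.float64)
    uniq = np.unique(pix, axis=0)
    idx = np.linspace(0, len(uniq) - 1, num=k).astype(int)
    centers = uniq[idx].copy()
    for _ in range(iters):
        # distance of every pixel to every center, shape (n_pixels, k)
        d = np.linalg.norm(pix[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for ci in range(k):
            members = pix[labels == ci]
            if len(members):
                centers[ci] = members.mean(axis=0)
    return [(labels == ci).reshape(h, w) for ci in range(k)]
```

The returned masks partition the image; in the full framework each mask would then be scored by the MRF-based verification stage rather than accepted directly.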
20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 23-26 August 2010; 08/2010