ThesisPDF Available

Generalized Haar-like filters for document analysis : application to word spotting and text extraction from comics


Abstract and Figures

The presented thesis follows two directions. The first one disposes a technique for text and graphic separation in comics. The second one points out a learning free segmentation free word spotting framework based on the query-by-string problem for manuscript documents. The two approaches are based on human perception characteristics. Indeed, they were inspired by several characteristics of human vision such as the Preattentive processing. These characteristics guide us to introduce two multi scale approaches for two different document analysis tasks which are text extraction from comics and word spotting in manuscript document. These two approaches are based on applying generalized Haar-like filters globally on each document image whatever its type. Describing and detailing the use of such features throughout this thesis, we offer the researches of document image analysis field a new line of research that has to be more explored in future. The two approaches are layout segmentation free and the generalized Haar-like filters are applied globally on the image. Moreover, no binarization step of the processed document is done in order to avoid losing data that may influence the accuracy of the two frameworks. Indeed, any learning step is performed. Thus, we avoid the process of extraction features a priori which will be performed automatically, taking into consideration the different characteristics of the documents.
Content may be subject to copyright.
A preview of the PDF is not available
ResearchGate has not been able to resolve any citations for this publication.
Full-text available
Lexicon-based handwritten text keyword spotting (KWS) has proven to be a faster and more accurate alternative to lexicon-free methods. Nevertheless, since lexicon-based KWS relies on a predefined vocabulary, fixed in the training phase, it does not support queries involving out-of-vocabulary (OOV) keywords. In this paper, we outline previous work aimed at solving this problem and present a new approach based on smoothing the (null) scores of OOV keywords by means of the information provided by “similar” in-vocabulary words. Good results achieved using this approach are compared with previously published alternatives on different data sets.
Line-level keyword spotting (KWS) is presented on the basis of frame-level word posterior probabilities. These posteriors are obtained using word graphs derived from the recognition process of a full-fledged handwritten text recognizer based on hidden Markov models and N-gram language models. This approach has several advantages. First, since it uses a holistic, segmentation-free technology, it does not require any kind of word or character segmentation. Second, the use of language models allows the context of each spotted word to be taken into account, thereby considerably increasing KWS accuracy. And third, the proposed KWS scores are based on true posterior probabilities, taking into account all (or most) possible word segmentations of the input image. These scores are properly bounded and normalized. This mathematically clean formulation lends itself to smooth, threshold-based keyword queries which, in turn, permit comfortable trade-offs between search precision and recall. Experiments are carried out on several historic collections of handwritten text images, as well as a well-known data set of modern English handwritten text. According to the empirical results, the proposed approach achieves KWS results comparable to those obtained with the recently-introduced “BLSTM neural networks KWS” approach and clearly outperform the popular, state-of-the-art “Filler HMM” KWS method. Overall, the results clearly support all the above-claimed advantages of the proposed approach.