Conference Paper

Inkball Models as Features for Handwriting Recognition

... What are the advantages of unsupervised pretraining in the context of training neural networks for classification? 2. Does pretraining work in the same way for standard models and convolutional models? 3. What are the advantages of automatic feature extraction compared to handcrafted features? 4. How much tuning of the model and training is necessary to generate features to be used by basic classifiers? 5. Can the same features easily be used by different classifiers? ...
... The algorithm is illustrated in Figure 2.4. It is very similar to the stochastic gradient descent procedure used to train an ANN with backpropagation. ...
... Benchmark details are available in Appendix A.4. ...
Thesis
In this thesis, we propose to use methodologies that automatically learn how to extract relevant features from images. We are especially interested in evaluating how these features compare against handcrafted features. More precisely, we are interested in the unsupervised training that is used for the Restricted Boltzmann Machine (RBM) and Convolutional RBM (CRBM) models. These models sparked the renewed interest in Deep Learning over the last decade. During the course of this thesis, auto-encoder approaches, especially Convolutional Auto-Encoders (CAE), have been used more and more. Therefore, one objective of this thesis is also to compare the CRBM approach with the CAE approach. The scope of this work is defined by several machine learning tasks. The first one, handwritten digit recognition, is analysed to see how much the unsupervised pretraining technique introduced with the Deep Belief Network (DBN) model improves the training of neural networks. The second, detection and recognition of Sudoku puzzles in images, evaluates the efficiency of DBN and Convolutional DBN (CDBN) models for classifying images of poor quality. Finally, features are learned fully unsupervised from images for a keyword spotting task and are compared against well-known handcrafted features. Moreover, the thesis also has a software engineering dimension: a complete machine learning framework was developed during this thesis to explore possible optimizations and algorithms in order to train the tested models as fast as possible.
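The pipeline evaluated in the thesis, unsupervised feature learning followed by a basic classifier, can be illustrated with a small self-contained sketch. The snippet below uses scikit-learn's BernoulliRBM on the bundled digits data purely as an illustration of that idea; it is not the framework developed in the thesis, and the hyperparameters are arbitrary placeholders.

```python
# A minimal sketch of "unsupervised pretraining + basic classifier",
# using scikit-learn's BernoulliRBM. Illustrative only: not the thesis's
# own framework, and all hyperparameters are placeholder values.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values to [0, 1] as expected by BernoulliRBM
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Unsupervised feature learning (RBM) followed by a simple supervised classifier.
model = Pipeline([
    ("rbm", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```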
... Ink verification: This kind of application is also used to assist in deciding whether a given document is genuine, specifically by helping to answer the question of whether some text or strokes were added with a different pen or ink. Howe et al. [147] have proposed a method of ink verification in which the authors utilize inkball models to generate a varying feature set used to train a hidden Markov model for character recognition. With this model, the hidden states correspond to characters in the target language, and each of these characters has a corresponding inkball model produced from a prototype of that character. ...
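The way such character-level models could supply HMM features can be sketched roughly as follows: each sliding-window frame is scored against every character prototype, and the resulting cost vector becomes that frame's observation. The chamfer-style cost below is only a stand-in for actual inkball fitting, which optimizes node displacements under elastic constraints; the prototypes, window geometry, and toy data are hypothetical.

```python
# Rough sketch: per-character prototype matching costs as frame-wise HMM features.
# The chamfer cost is a stand-in for real inkball fitting; data is synthetic.
import numpy as np
from scipy.ndimage import distance_transform_edt

def chamfer_cost(window_ink, prototype_nodes):
    """Mean distance from prototype node positions to the nearest ink pixel."""
    # distance_transform_edt measures distance to the nearest zero pixel,
    # so the ink mask is inverted to make ink the "zero" (target) set.
    dist_to_ink = distance_transform_edt(~window_ink)
    rows, cols = prototype_nodes[:, 0], prototype_nodes[:, 1]
    return dist_to_ink[rows, cols].mean()

def frame_features(line_ink, prototypes, frame_width=32, step=4):
    """One feature vector per sliding-window frame: cost against each character model."""
    feats = []
    for x in range(0, line_ink.shape[1] - frame_width + 1, step):
        window = line_ink[:, x:x + frame_width]
        feats.append([chamfer_cost(window, p) for p in prototypes])
    return np.array(feats)  # shape: (num_frames, num_character_models)

# Toy usage with random stand-ins for a binarized text line and five prototypes.
rng = np.random.default_rng(0)
line = rng.random((64, 256)) > 0.95
protos = [rng.integers(0, [64, 32], size=(10, 2)) for _ in range(5)]
print(frame_features(line, protos).shape)
```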
Thesis
The fast-growing information technologies and digital imaging technology of the past decades have made digital document images more ubiquitous than ever. In practice, there is a wide variety of legal documents, including administrative and business documents such as certificates, diplomas, contracts, invoices, etc. These documents are used in government agencies, banks, educational institutions and so on. For convenience of exchanging information, genuine documents are often transferred from one place to another through digital channels. Tampering with these documents during transmission has become an unavoidable concern, especially in the field of cybercrime. Hence, the credibility and trustworthiness of legal digital documents have been diminished, which often results in serious consequences with respect to criminal, economic and social issues. To secure genuine digital documents against unauthorized interference, the field of document forensics has evolved, and it has drawn much attention from researchers in the document analysis and recognition community. One of the efficient solutions to this problem is data hiding in conjunction with pattern recognition techniques. The objective of this work is to develop a data hiding framework, as trustworthy as possible, that makes it possible to verify whether a document is genuine or forged. The challenging problems dealt with in this thesis are: (1) extracting sufficiently stable features from documents even in the presence of various distortions; and (2) precisely detecting the hidden information embedded to secure documents in watermarked documents that have undergone real distortions caused by print-and-scan or print-photocopy-scan processes. For the former issue, we take advantage of conventional pattern recognition techniques and deep learning based approaches. Specifically, we utilize well-known detectors to detect feature points in the documents, and propose a new feature point detector for developing a steganography scheme. To enhance feature stability against real distortions, we develop watermarking systems based on stable regions instead of feature points, using both conventional techniques and fully convolutional networks (FCN). In addition, generative adversarial networks (GAN) are applied to produce a reference document, and character variations or fonts used in the watermarking process. For the latter issue, we have come up with two approaches to develop data hiding and detection algorithms: one is based on changing pixel intensities, and the other relies on the shape of characters and symbols. The assessments show that our approaches properly detect the hidden information when the watermarked documents are subjected to various distortions. In comparison with state-of-the-art methods, our approaches give competitive performance in terms of robustness in applications to various types of documents.
... This approach has been introduced as a technique for segmentation-free word spotting that requires little training data. In addition to keyword spotting, inkball models have been used for handwriting recognition as complex features in conjunction with HMMs [25]. Inkball models are visually similar to keypoint graphs since they use very similar points on the handwriting as nodes. ...
Preprint
Graphs provide a powerful representation formalism that offers great promise to benefit tasks like handwritten signature verification. While most state-of-the-art approaches to signature verification rely on fixed-size representations, graphs are flexible in size and allow modeling local features as well as the global structure of the handwriting. In this article, we present two recent graph-based approaches to offline signature verification: keypoint graphs with approximated graph edit distance and inkball models. We provide a comprehensive description of the methods, propose improvements both in terms of computational time and accuracy, and report experimental results for four benchmark datasets. The proposed methods achieve top results for several benchmarks, highlighting the potential of graph-based signature verification.
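A common way to approximate graph edit distance between keypoint graphs is to reduce it to a bipartite assignment over node substitution, insertion and deletion costs. The sketch below shows that reduction in a heavily simplified form, ignoring edge structure and using placeholder cost values; it is not the exact approximation used in the article.

```python
# Simplified bipartite approximation of graph edit distance between keypoint graphs.
# Edge structure is ignored and cost values are placeholders.
import numpy as np
from scipy.optimize import linear_sum_assignment

def approx_ged(nodes_a, nodes_b, node_cost=10.0):
    n, m = len(nodes_a), len(nodes_b)
    big = node_cost * 1000.0
    cost = np.full((n + m, n + m), 0.0)
    # substitution block: Euclidean distance between node coordinates
    cost[:n, :m] = np.linalg.norm(nodes_a[:, None, :] - nodes_b[None, :, :], axis=-1)
    # deletion block (nodes of A matched to dummies): only the diagonal is allowed
    cost[:n, m:] = big
    cost[:n, m:][np.arange(n), np.arange(n)] = node_cost
    # insertion block (dummies matched to nodes of B): only the diagonal is allowed
    cost[n:, :m] = big
    cost[n:, :m][np.arange(m), np.arange(m)] = node_cost
    # dummy-to-dummy matches stay free (cost 0 in the bottom-right block)
    row, col = linear_sum_assignment(cost)
    return cost[row, col].sum()

# Toy usage: a perturbed copy of a graph scores much lower than an unrelated one.
rng = np.random.default_rng(1)
reference = rng.random((40, 2)) * 100
genuine = reference + rng.normal(scale=1.0, size=reference.shape)
forgery = rng.random((35, 2)) * 100
print(approx_ged(reference, genuine))   # small: nodes align well
print(approx_ged(reference, forgery))   # larger
```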
Article
In this article we propose a local descriptor for an unconstrained handwritten word spotting task. The presented features are inspired by the SIFT keypoint descriptor, widely employed in computer vision and object recognition, but underexploited in the handwriting recognition field. In our approach, a sliding window moves from left to right over a word image. At each position, the window is subdivided into cells, and in each cell a histogram of orientations is accumulated. Experiments using two different word spotting systems, hidden Markov models and dynamic time warping, demonstrate a very significant improvement when using the proposed features with respect to the state-of-the-art ones.
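The descriptor described above can be sketched in a few lines: slide a window over the word image, split each window into cells, and accumulate a gradient-orientation histogram per cell. The cell grid, bin count and normalisation below are placeholder choices rather than the published parameters.

```python
# Sketch of a sliding-window histogram-of-orientations descriptor.
# Grid size, bin count and normalisation are illustrative placeholders.
import numpy as np

def orientation_histograms(window, grid=(4, 4), bins=8):
    gy, gx = np.gradient(window.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % np.pi          # unsigned orientation in [0, pi)
    h, w = window.shape
    ch, cw = h // grid[0], w // grid[1]
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            sl = (slice(i * ch, (i + 1) * ch), slice(j * cw, (j + 1) * cw))
            hist, _ = np.histogram(orientation[sl], bins=bins, range=(0, np.pi),
                                   weights=magnitude[sl])
            feats.append(hist)
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-8)

def sliding_window_features(word_image, window_width=32, step=4):
    return np.array([orientation_histograms(word_image[:, x:x + window_width])
                     for x in range(0, word_image.shape[1] - window_width + 1, step)])

# Toy usage on a random "word image".
img = np.random.default_rng(2).random((48, 160))
print(sliding_window_features(img).shape)   # (num_positions, grid cells * bins)
```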
Conference Paper
Automatic transcription of historical documents is vital for the creation of digital libraries. In this paper we propose graph similarity features as a novel descriptor for handwriting recognition in historical documents based on Hidden Markov Models. Using a structural graph-based representation of text images, a sequence of graph similarity features is extracted by means of dissimilarity embedding with respect to a set of character prototypes. On the medieval Parzival data set it is demonstrated that the proposed structural descriptor significantly outperforms two well-known statistical reference descriptors for single word recognition.
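The dissimilarity-embedding idea itself is simple to sketch: each observation is represented by its vector of distances to a fixed set of character prototypes. The node-set distance used below is a crude placeholder for the graph dissimilarity measure of the paper, and the data is synthetic.

```python
# Sketch of dissimilarity embedding: one distance per character prototype.
# The node-set distance is a crude stand-in for a proper graph dissimilarity.
import numpy as np

def node_set_distance(nodes_a, nodes_b):
    """Symmetric mean nearest-neighbour distance between two (N, 2) coordinate arrays."""
    d = np.linalg.norm(nodes_a[:, None, :] - nodes_b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def dissimilarity_embedding(graph_nodes, prototype_nodes_list):
    """Feature vector: distance of the observed graph to each character prototype."""
    return np.array([node_set_distance(graph_nodes, p) for p in prototype_nodes_list])

# Toy usage: 26 random prototypes (e.g. one per letter) and one observation.
rng = np.random.default_rng(3)
prototypes = [rng.random((20, 2)) for _ in range(26)]
observation = rng.random((18, 2))
print(dissimilarity_embedding(observation, prototypes).shape)   # (26,)
```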
Article
In this paper, a system for the reading of totally unconstrained handwritten text is presented. The kernel of the system is a hidden Markov model (HMM) for handwriting recognition. This HMM is enhanced by a statistical language model. Thus linguistic knowledge beyond the lexicon level is incorporated in the recognition process. Another novel feature of the system is that the HMM is applied in such a way that the difficult problem of segmenting a line of text into individual words is avoided. A number of experiments with various language models and large vocabularies have been conducted. The language models used in the system were also analytically compared based on their perplexity.
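The perplexity used to compare language models can be illustrated with a toy bigram model; the corpus, vocabulary and add-one smoothing below are minimal stand-ins for the models actually evaluated in the paper.

```python
# Toy bigram language model with add-one smoothing and its perplexity on a test sentence.
import math
from collections import Counter

train = "the cat sat on the mat the dog sat on the rug".split()
test = "the dog sat on the mat".split()

vocab = set(train)
unigrams = Counter(train)
bigrams = Counter(zip(train, train[1:]))

def bigram_prob(prev, word):
    # add-one (Laplace) smoothing
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

log_prob = sum(math.log2(bigram_prob(p, w)) for p, w in zip(test, test[1:]))
perplexity = 2 ** (-log_prob / (len(test) - 1))
print(f"bigram perplexity on the test sentence: {perplexity:.2f}")
```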
Article
Since their first inception more than half a century ago, automatic reading systems have evolved substantially, thereby showing impressive performance on machine-printed text. The recognition of handwriting can, however, still be considered an open research problem due to its substantial variation in appearance. With the introduction of Markovian models to the field, a promising modeling and recognition paradigm was established for automatic offline handwriting recognition. However, so far, no standard procedures for building Markov-model-based recognizers have been established, though trends toward unified approaches can be identified. It is therefore the goal of this survey to provide a comprehensive overview of the application of Markov models in the research field of offline handwriting recognition, covering both the widely used hidden Markov models and the less complex Markov-chain or n-gram models. First, we will introduce the typical architecture of a Markov-model-based offline handwriting recognition system and make the reader familiar with the essential theoretical concepts behind Markovian models. Then, we will give a thorough review of the solutions proposed in the literature for the open problem of how to apply Markov-model-based approaches to automatic offline handwriting recognition.
Conference Paper
Paleographers study ancient and historical handwriting in order to learn more about documents of significant interest and their creators. Computational tools and methods can aid this task in numerous ways, particularly for languages and scripts that are not widely known today. One project currently underway seeks to gather a collection of securely dated letter samples from Syriac documents dating between 500 and 1100 CE. The set comprises over 60,000 human-selected character samples. This paper gives details on the collection and describes the automatic techniques used to process the initial human input so as to produce high-quality segmented character samples ready for analysis.
Conference Paper
Many document collections of historical interest are handwritten and lack transcripts. Scholars need tools for high-quality information retrieval in such environments, preferably without the burden of extensive system training. This paper presents a novel approach to word spotting designed for manuscripts or degraded print that requires minimal initial training. It can infer a generative word appearance model from a single instance, and then use the model to retrieve similar words from arbitrary documents. An approximation to the retrieval statistic runs efficiently on graphics processing hardware. Tested on two standard data sets, the method compares favorably with prior results.
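A heavily simplified version of single-exemplar retrieval in this spirit scores a rigid set of template ink points against every offset of a page's distance transform and returns the lowest-cost positions. The actual inkball model is a tree of spring-connected nodes that can deform elastically (and is evaluated efficiently on graphics hardware), which this rigid-template sketch omits; the page, template and parameters below are synthetic.

```python
# Simplified rigid-template retrieval via a page-level distance transform.
# Real inkball models allow elastic deformation of a node tree, omitted here.
import numpy as np
from scipy.ndimage import distance_transform_edt

def rank_offsets(page_ink, template_points, top_k=5):
    dist = distance_transform_edt(~page_ink)          # distance of each pixel to nearest ink
    th = template_points[:, 0].max() + 1
    tw = template_points[:, 1].max() + 1
    H, W = page_ink.shape
    scores = np.full((H - th + 1, W - tw + 1), np.inf)
    for dy in range(scores.shape[0]):
        for dx in range(scores.shape[1]):
            scores[dy, dx] = dist[template_points[:, 0] + dy,
                                  template_points[:, 1] + dx].mean()
    flat = np.argsort(scores, axis=None)[:top_k]
    return [np.unravel_index(i, scores.shape) for i in flat]

# Toy usage: plant a small cross-shaped "word" in a sparse page and retrieve it.
page = np.zeros((80, 120), dtype=bool)
page[40, 60:70] = True
page[35:45, 65] = True
template = np.argwhere(page[30:50, 55:75])            # exemplar cut from the page itself
print(rank_offsets(page, template)[0])                # recovers the offset (30, 55)
```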
Conference Paper
This paper presents a word spotting method based on line segmentation, a sliding window, continuous dynamic programming, and a slit-style HOG feature. Our method is applicable regardless of what language is written in the manuscript because it does not require any language-dependent preprocessing. The slit-style HOG feature is a gradient-distribution-based feature with overlapping normalization and redundant expression, and its use improved word spotting performance. We compared our method with some previously developed word spotting methods, and confirmed that our method outperforms them on both English and Japanese manuscripts.
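The dynamic-programming core of such matching can be sketched with plain DTW between two sequences of frame descriptors; the paper's continuous DP performs subsequence matching along whole text lines with slit-style HOG frames, for which the random features below are only stand-ins.

```python
# Sketch of DTW alignment between two sequences of frame features,
# the common core behind dynamic-programming word spotting.
import numpy as np

def dtw_distance(seq_a, seq_b):
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)              # length-normalised alignment cost

# Toy usage: a noisy copy of the query aligns more cheaply than an unrelated sequence.
rng = np.random.default_rng(4)
query = rng.random((20, 64))              # 20 frames of a 64-dimensional descriptor
same_word = query + rng.normal(scale=0.05, size=query.shape)
other_word = rng.random((25, 64))
print(dtw_distance(query, same_word))     # small
print(dtw_distance(query, other_word))    # larger
```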
Conference Paper
Most offline handwriting recognition approaches proceed by segmenting words into smaller pieces (usually characters) which are recognized separately. The recognition result of a word is then the composition of the individually recognized parts. Inspired by results in cognitive psychology, researchers have begun to focus on holistic word recognition approaches. Here we present a holistic word recognition approach for single-author historical documents, which is motivated by the fact that for severely degraded documents a segmentation of words into characters will produce very poor results. The quality of the original documents does not allow us to recognize them with high accuracy; our goal here is to produce transcriptions that will allow successful retrieval of images, which has been shown to be feasible even in such noisy environments. We believe that this is the first systematic approach to recognizing words in historical manuscripts with extensive experiments. Our experiments show recognition accuracy of 65%, which exceeds the performance of other systems that operate on non-degraded input images (non-historical documents).
Article
This tutorial provides an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and gives practical details on methods of implementation of the theory along with a description of selected applications of the theory to distinct problems in speech recognition. Results from a number of original sources are combined to provide a single source of acquiring the background required to pursue further this area of research. The author first reviews the theory of discrete Markov chains and shows how the concept of hidden states, where the observation is a probabilistic function of the state, can be used effectively. The theory is illustrated with two simple examples, namely coin-tossing and the classic balls-in-urns system. Three fundamental problems of HMMs are noted and several practical techniques for solving these problems are given. The various types of HMMs that have been studied, including ergodic as well as left-right models, are described.
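The first of the three fundamental problems, evaluating the probability of an observation sequence given the model, is solved by the forward algorithm, which can be sketched for a toy discrete HMM as follows; the two-state model below is an arbitrary example, not one taken from the tutorial.

```python
# Minimal forward algorithm for a toy two-state, two-symbol discrete HMM.
import numpy as np

A = np.array([[0.7, 0.3],            # state transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],            # emission probabilities (2 states x 2 symbols)
              [0.2, 0.8]])
pi = np.array([0.6, 0.4])            # initial state distribution
obs = [0, 1, 1, 0]                   # observed symbol sequence

alpha = pi * B[:, obs[0]]            # initialisation
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]    # induction step
print("P(observations | model) =", alpha.sum())   # termination
```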
Article
We describe linear-time algorithms for solving a class of problems that involve transforming a cost function on a grid using spatial information. These problems can be viewed as a generalization of classical distance transforms of binary images, where the binary image is replaced by an arbitrary function on a grid. Alternatively they can be viewed in terms of the minimum convolution of two functions, which is an important operation in grayscale morphology. A consequence of our techniques is a simple and fast method for computing the Euclidean distance transform of a binary image. Our algorithms are also applicable to Viterbi decoding, belief propagation, and optimal control.
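The one-dimensional case of the generalized squared-Euclidean distance transform can be sketched directly from the lower-envelope-of-parabolas construction; the two-dimensional transform then follows by applying it first along rows and then along columns. The brute-force comparison at the end is only a sanity check on toy data.

```python
# One-dimensional generalized squared-Euclidean distance transform
# (lower envelope of parabolas), linear time in the length of f.
import numpy as np

def dt1d(f):
    """d[q] = min_p ((q - p)^2 + f[p]) for a sampled cost function f."""
    n = len(f)
    v = np.zeros(n, dtype=int)        # indices of parabolas in the lower envelope
    z = np.zeros(n + 1)               # boundaries between adjacent parabolas
    z[0], z[1] = -np.inf, np.inf
    k = 0
    for q in range(1, n):
        s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        while s <= z[k]:
            k -= 1
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = np.inf
    d = np.empty(n)
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d

# Sanity check against the brute-force definition on a small example.
f = np.array([4.0, 0.0, 1.0, 3.0, 0.0, 2.0])
brute = np.array([min((q - p) ** 2 + f[p] for p in range(len(f))) for q in range(len(f))])
print(dt1d(f))                        # [1. 0. 1. 1. 0. 1.]
print(np.allclose(dt1d(f), brute))    # True
```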