Conference Paper

Database Development and Recognition of Handwritten Devanagari Legal Amount Words

Dept. of IT, PICT, Pune, India
DOI: 10.1109/ICDAR.2011.69 Conference: 2011 International Conference on Document Analysis and Recognition, ICDAR 2011, Beijing, China, September 18-21, 2011
Source: IEEE Xplore

ABSTRACT A dataset containing 26,720 handwritten legal amount words written in Hindi and Marathi languages (Devanagari script) is presented in this paper, along with a training-free technique to recognize such handwritten legal amounts present on Indian bank cheques. The recognition of handwritten legal amount words in Hindi and Marathi is challenging because of the similar size and shape of many words in the lexicon. Moreover, many words share the same suffixes or prefixes. The proposed recognition technique is a combination of two approaches. The first approach is based on gradient, structural and cavity (GSC) features along with a binary vector matching (BVM) technique. The second approach is based on the vertical projection profile (VPP) feature and dynamic time warping (DTW). In the combined approach, a number of highly matched words from both approaches are considered for the recognition step, based on a ranking scheme. Syntactical knowledge related to the languages is also used to achieve higher reliability. To the best of our knowledge, this is the first work of its kind in recognizing handwritten legal amounts written in Hindi and Marathi. Researchers interested in the dataset can contact the authors to get it through a shared link.
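
The second approach named in the abstract (a vertical projection profile compared by dynamic time warping) can be illustrated with a minimal sketch. The abstract does not give the paper's preprocessing, profile normalization, or DTW cost function, so everything below is an assumption made for illustration: the function names are hypothetical, word images are taken to be binary arrays with 1 marking ink, and the local cost is a plain absolute difference.

```python
import numpy as np


def vertical_projection_profile(binary_image):
    """Count ink pixels (value 1) in each column of a binary word image."""
    return np.asarray(binary_image).sum(axis=0).astype(float)


def dtw_distance(a, b):
    """Textbook DTW between two 1-D sequences with absolute-difference cost.

    This is the standard dynamic-programming formulation, not necessarily
    the variant used in the paper.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return float(cost[n, m])


def rank_lexicon(query_image, templates):
    """Rank lexicon labels by DTW distance between projection profiles.

    `templates` maps a word label to one reference binary image
    (a hypothetical layout; no training step is involved).
    """
    q = vertical_projection_profile(query_image)
    scores = {label: dtw_distance(q, vertical_projection_profile(img))
              for label, img in templates.items()}
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    query = [[1, 0, 1],
             [1, 0, 0]]
    templates = {
        "a": [[1, 0, 0, 1],
              [1, 0, 0, 0]],   # same shape as the query, stretched by one column
        "b": [[0, 1, 0],
              [0, 1, 1]],
    }
    print(rank_lexicon(query, templates))
```

Because DTW tolerates local stretching along the writing direction, the stretched template "a" matches the query at zero cost, while "b" does not. A ranked list of this kind is the sort of output that a ranking scheme could merge with the candidates from the GSC/BVM approach; how the paper actually merges them is not stated in the abstract.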

  • ABSTRACT: We propose a statistical, script-independent, line-based word spotting framework for offline handwritten documents based on Hidden Markov Models. We present an exhaustive comparative study of filler models and background models for better representation of background or non-keyword text. The candidate keywords are pruned in a two-stage spotting framework using the character-based and lexicon-based background models. The system deals with large vocabularies without the need for word or character segmentation. The script-independent word spotting system is evaluated on a mixed corpus of public datasets from several scripts, such as IAM for English, AMA for Arabic and LAW for Devanagari.
    Pattern Recognition 03/2014; 47(3):1039–1050. DOI:10.1016/j.patcog.2013.09.019 · 2.58 Impact Factor
  • ABSTRACT: Offline Script Identification (OSI) facilitates many important applications, such as automatic archiving of multilingual documents, searching online/offline archives of document images, and the selection of script-specific Optical Character Recognition (OCR) in a multilingual environment. In a multilingual country like India, a document containing text words in more than one language is a common scenario. A state-of-the-art survey of the techniques available in the area of OSI for Indic scripts would be of great aid to researchers. Hence, this article discusses the advancements reported in the literature during the last few decades. Various feature extraction and classification techniques associated with the OSI of Indic scripts are discussed in this survey. We hope that this survey will serve as a compendium not only for researchers in India, but also for policymakers and practitioners in India. It will also help to bring together researchers working on different Indic scripts. Taking the recent developments in OSI of Indian regional scripts into consideration, this article will provide a better platform for future research activities.
    Computer Science Review 12/2014; DOI:10.1016/j.cosrev.2014.12.001
  • ABSTRACT: A dissertation submitted in partial fulfilment for the award of the degree of Master of Technology in the Department of Computer Engineering.
    08/2012, Degree: Master of Technology (Computer Engineering), Supervisor: Dr Rakesh Rathi
