Database Development and Recognition of Handwritten Devanagari Legal Amount Words
ABSTRACT A dataset containing 26,720 handwritten legal amount words written in Hindi and Marathi languages (Devanagari script) is presented in this paper along with a training-free technique to recognize such handwritten legal amounts present on Indian bank cheques. The recognition of handwritten legal amount words in Hindi and Marathi is challenging because many words in the lexicon are similar in size and shape. Moreover, many words share the same suffixes or prefixes. The proposed recognition technique is a combination of two approaches. The first approach is based on gradient, structural and cavity (GSC) features along with a binary vector matching (BVM) technique. The second approach is based on the vertical projection profile (VPP) feature and dynamic time warping (DTW). In the combined approach, a number of highly matched words from both approaches are considered for the recognition step, based on a ranking scheme. Syntactical knowledge related to the languages is also used to achieve higher reliability. To the best of our knowledge, this is the first work of its kind in recognizing handwritten legal amounts written in Hindi and Marathi. Researchers interested in the dataset can contact the authors to obtain it through a shared link.
Keywords- handwritten database; Devanagari script; legal
amount recognition; handwriting recognition; Hindi and Marathi
languages; bank cheque processing.
I. INTRODUCTION
In India, around 300 million people use Devanagari script
for writing languages like Hindi, Marathi, Sindhi, Nepali,
Sanskrit, and Konkani, where Hindi is the national language
of the country [1]. Hindi and Marathi are the most popular
languages written in Devanagari script. As the national
language, Hindi is accepted all over India and is used for
documentation, especially in the Indian states of Bihar,
Chhattisgarh, Haryana, Himachal Pradesh, Jharkhand,
Madhya Pradesh, New Delhi, Rajasthan, Uttar Pradesh and
Uttarakhand. Marathi is the official language of the
Indian state of Maharashtra, which is one of the biggest
states in the country.
People use Devanagari script to fill out various paper
documents such as bank cheques, envelopes, application forms,
railway reservation forms and answer sheets. The script
is written from left to right and has 13 vowels, 34 consonants
and 14 modifiers. More characteristics of Devanagari script
can be found in [1]. Unlike Latin script, Devanagari has only
one style of writing; there is no concept of a cursive style.
Normally, to write a word, the
constituent characters are written from left to right and then
joined by a header line called 'shirorekha', as shown in
Figure 1.
Legal amount, courtesy amount, date, payee details and
signature are the fields to be filled in by an account holder on a
bank cheque, as shown in Figures 2 and 3. The value of a
cheque is written in two ways in two areas. The first area
(legal) contains the amount written in words and the second
area (courtesy) contains the amount written in numerals.
Amount recognition is generally considered to rely on both
courtesy and legal amount recognition. A state-of-the-art
survey on cheque recognition techniques can be found in
[17]. This paper mainly deals with the recognition of
handwritten legal amounts. Two types of approaches are
mainly used for handwritten word recognition: analytical
(local) and global (holistic). In analytical approaches, each
handwritten word of the legal amount is recognized by
recognizing its constituent characters. For this purpose, a word
is first divided (segmented) into components such as characters or
graphemes (parts of characters). In global (holistic)
approaches, the entire word is considered as a single unit
(pattern) and recognition is done without any type of
segmentation. Various techniques for the recognition of
handwritten words can be found in the surveys in
[9], [10] and [11]. A survey of character-level segmentation
of handwritten words is given in [12].
Figure 1. Machine printed and handwritten samples of a Hindi word (meaning 'fifty').
Although many works have been reported on non-Indian
bank cheque recognition [8, 13-16], to the best of our
knowledge no work has been reported on bank cheques
written in Indian languages. Only a few works are reported
in the literature on handwritten Devanagari word recognition. For
recognizing handwritten city names, which are distinct in
their size and shape, Shaw et al. [2, 3] used a segmentation-based
approach and a segmentation-free approach. In the
segmentation-based approach [2], a word image is
segmented into pseudo-characters and hidden Markov
models (HMMs) are used to recognize them. In the
segmentation-free approach [3], continuous-density HMMs
are used to recognize handwritten Devanagari words. An
HMM is constructed for each Devanagari word. The states of
the HMMs are determined automatically based on a database
of handwritten city names. In [4], Shaw and Parui used a
global approach that extracts global features from
handwritten Devanagari words. A two-stage recognition
scheme is used, where the first
stage of recognition is based on an HMM classifier and the
second stage is based on the Bayes discriminant function.

R. Jayadevan, Department of IT, PICT, Pune, India (rjayadevan@rediffmail.com)
S. R. Kolhe, Department of CS, NMU, Jalgaon, India (srkolhe2000@gmail.com)
P. M. Patil, Department of EE, VIT, Pune, India (patil_pm@rediffmail.com)
Umapada Pal, CVPR Unit, ISI, Kolkata, India (umapada@isical.ac.in)
2011 International Conference on Document Analysis and Recognition
1520-5363/11 $26.00 © 2011 IEEE
DOI 10.1109/ICDAR.2011.69

For the offline recognition of handwritten legal amounts
in Marathi and Hindi, a technique that combines two
approaches in a single-writer
environment is presented in this paper. An individual can use
106 words in Hindi and 114 words in Marathi
for writing a valid legal amount, whereas in
English there are only 30 to 36 words depending on the
currency. It is assumed here that these word samples can be
collected from an individual while opening an account with
the bank. We have developed handwritten word image
databases for Hindi and Marathi (Devanagari
script) for the present study, as no such database
was available. We designed special forms to
collect the handwritten samples from the writers. Each form
contains boxes in which a writer has to write all the
possible words in the lexicon in a specified order. Most of
the writers were in the age group 18 to 22. No
restriction was imposed on the writers regarding the style and
speed of writing. The handwritten forms were then
scanned at 300 dpi to obtain gray-scale images of
the forms. The images were skew corrected using the Radon
transform. The words were then extracted using location
information, as the words are written in ascending order of
their value. Figures 4 and 5 show the words
used for writing legal amounts in Marathi and Hindi,
collected from two individuals. The value of each
word is shown on its right side. Handwritten words similar in
size or shape are grouped together to illustrate the
complexity of recognition, as many of the words in a group
have the same suffixes. To illustrate the type of cheques being
used, two handwritten cheques in Marathi and Hindi
are also shown in Figures 2 and 3. The
recognition technique proposed here is a combination of two
approaches. The first approach is based on GSC features
along with a BVM technique, whereas the second approach
is based on the VPP feature and DTW. The two subsequent
sections of the paper discuss the recognition technique and
the details of experimentation, and the last section
concludes the paper.
Figure 2. A cheque written in Marathi language
Figure 3. A cheque written in Hindi language
Figure 4. Marathi words belonging to an individual (total 114) from the database
Figure 5. Hindi words belonging to an individual (total 106) from the database
II. WORD RECOGNITION
As a valid legal amount written in Hindi or Marathi
consists of a number of words, the words have
to be extracted before being recognized. To
extract words from a cheque image, a layout-based approach
is employed. The area of interest is located with the help of
guidelines present on the cheque. The guidelines are then
removed using morphological operations as described in [8]
and stroke reconstruction is performed using structuring
elements in different directions. Handwritten words are
separated by observing the connected components and the
distance between two consecutive connected components.
This is based on the assumption that intra-word gaps are
always smaller than inter-word gaps. Each extracted
word is then recognized by comparing its features with those
of the words in the database written by the same writer. The
features, the matching and the final word selection are described
in the following subsections. An overview of the entire word
recognition scheme is shown in Figure 6.
Figure 6. Overview of the proposed recognition scheme
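The gap-based word separation described in this section can be illustrated with a small sketch. It assumes that each connected component has already been reduced to its horizontal extent (left and right column); the function name and the `min_word_gap` threshold are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (left column, right column) of one connected component


def group_components_into_words(spans: List[Span], min_word_gap: int) -> List[Span]:
    """Merge component spans into word spans.

    Gaps smaller than min_word_gap are treated as intra-word gaps;
    larger gaps are treated as inter-word gaps and start a new word.
    """
    words: List[Span] = []
    for left, right in sorted(spans):
        if words and left - words[-1][1] < min_word_gap:
            # Small (or overlapping) gap: extend the current word.
            words[-1] = (words[-1][0], max(words[-1][1], right))
        else:
            words.append((left, right))
    return words


components = [(0, 20), (23, 40), (70, 95), (97, 120), (160, 190)]
print(group_components_into_words(components, min_word_gap=15))
# [(0, 40), (70, 120), (160, 190)]
```

In practice the threshold would have to be chosen per cheque or per writer; a fixed value is used here only to keep the example short.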
A. GSC Features and BVM
After skew and slant corrections, each binary handwritten
word image is divided vertically into 4 segments such that
every segment has the same number of foreground pixels. Then
the image is divided horizontally into 8 segments in the same
way. This results in 32 (4 × 8) image sub-segments. For
each sub-segment, the corresponding gradient, structural and
cavity (GSC) features are extracted as described in [5]. The
GSC features considered in this work are gradient values of
the contours in seven major directions (22.5°, 45°, 67.5°, 90°,
112.5°, 135° and 157.5°), presence of line segments in three
different directions (90°, 45° and 135°), presence of four
types of corners, and presence of junction points, end points,
loops, and upper, lower, left and right cavities. A 21-bit
binary vector is created for the feature representation
of each sub-segment such that a bit is set to 1 if the
corresponding feature is present. For gradient features, a
threshold can be set on the number of contour pixels (with the
same gradient value) required to set the corresponding bit to 1.
The binary vectors of the sub-segments are subsequently
concatenated to form a 672-bit (32 × 21) binary vector for
each handwritten word. The word is then matched against
those in the database collected from the same individual
using the binary vector matching technique
described in [6].
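The equal-foreground partitioning can be sketched as follows. This is only an illustration under stated assumptions: the input is a list of foreground-pixel counts per column (or per row), and the function name `equal_mass_cuts` is hypothetical. When the counts do not divide evenly, the segments hold only approximately equal numbers of pixels.

```python
from typing import List


def equal_mass_cuts(counts: List[int], parts: int) -> List[int]:
    """Return parts - 1 cut indices over counts.

    counts[i] is the number of foreground pixels in column (or row) i.
    Cut k is placed right after the first index at which the running
    total reaches k/parts of all foreground pixels.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("image has no foreground pixels")
    cuts: List[int] = []
    running = 0
    k = 1
    for i, c in enumerate(counts):
        running += c
        while k < parts and running * parts >= k * total:
            cuts.append(i + 1)  # segment k ends just before index i + 1
            k += 1
    return cuts


column_counts = [0, 2, 2, 0, 4, 0, 0, 4, 2, 2]
cuts = equal_mass_cuts(column_counts, parts=4)
bounds = [0] + cuts + [len(column_counts)]
print(cuts)  # [3, 5, 8]
print([sum(column_counts[a:b]) for a, b in zip(bounds, bounds[1:])])  # [4, 4, 4, 4]

# Size of the concatenated word descriptor: 4 x 8 sub-segments, 21 bits each.
assert 4 * 8 * 21 == 672
```

Applying the same function to the row counts with `parts=8` would give the second set of cuts needed for the 32 sub-segments.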
Let $S_{ij}$ ($i, j = 0$ or $1$) be the number of bit-level matches
between $i$ in the first pattern and $j$ in the second pattern at
the corresponding positions. The four possible values
are $S_{00}$, $S_{01}$, $S_{10}$ and $S_{11}$. Let X and Y be the feature
vectors of two words to be compared; then the dissimilarity
between them can be calculated using the following
equation.
\[ D(X, Y) = \frac{1}{2} - \frac{S_{11} S_{00} - S_{10} S_{01}}{2 \left( (S_{10} + S_{11}) (S_{01} + S_{00}) (S_{11} + S_{01}) (S_{00} + S_{10}) \right)^{1/2}} \qquad (1) \]
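A minimal sketch of this dissimilarity is given below. It assumes Equation (1) is the correlation-based measure D = 1/2 - (S11 S00 - S10 S01) / (2 sqrt((S10 + S11)(S01 + S00)(S11 + S01)(S00 + S10))); the function name and the handling of a zero denominator (returning the neutral value 0.5) are assumptions made here, not part of the paper.

```python
import math
from typing import Sequence


def bvm_dissimilarity(x: Sequence[int], y: Sequence[int]) -> float:
    """Correlation-based dissimilarity between two equal-length bit vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    s00 = s01 = s10 = s11 = 0
    for a, b in zip(x, y):
        if a and b:
            s11 += 1
        elif a:
            s10 += 1
        elif b:
            s01 += 1
        else:
            s00 += 1
    denom = math.sqrt((s10 + s11) * (s01 + s00) * (s11 + s01) * (s00 + s10))
    if denom == 0:
        # Assumption: a constant vector carries no correlation information.
        return 0.5
    return 0.5 - (s11 * s00 - s10 * s01) / (2 * denom)


print(bvm_dissimilarity([1, 0, 1, 0], [1, 0, 1, 0]))  # 0.0 (identical)
print(bvm_dissimilarity([1, 0, 1, 0], [0, 1, 0, 1]))  # 1.0 (complementary)
```

With this form the value lies between 0 and 1, so the stored words with the smallest values are the best matches for a test word.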
B. VPP and DTW
Vertical projection profile (VPP) features are considered
here along with dynamic time warping (DTW) for matching
handwritten Devanagari words. The same technique was earlier
applied for spotting handwritten English words in [7]. After
skew and slant corrections, the VPP feature vector is
computed by counting the foreground pixels in each column of
the binary image. If the foreground pixels of the image
I with height h are indicated by 0's, the value of the
projection vector at column j can be computed as follows.
\[ VP(I, j) = \sum_{r=1}^{h} \left( 1 - I(r, j) \right) \qquad (2) \]
A rectangular grid is used for matching two vectors in
DTW. Each point (node) in the grid is associated with a
'Cost', which is the difference between the values of the
two vectors at the corresponding column and row positions.
The feature vector R of the reference (stored)
word, of length Lr, is aligned along the X-axis of the grid and
the feature vector T of the test word, of length Lt, is aligned
along the Y-axis.
Let (xk, yk) represent a point (node) on a warping path
at instance 'k' of matching. A path starting from node (1,
1) and ending at node (xk, yk) has a cost D(xk, yk)
associated with it. The problem of finding the optimal path
can be reduced to finding a sequence of nodes (xk, yk)
that minimizes the accumulated cost for a complete path
ending at node (T, R), as follows.
\[ D_{Min}(x_k, y_k) = \mathrm{Min}\left[ D_{Min}(x_{k-1}, y_{k-1}) \right] + \mathrm{Cost}(x_k, y_k) \qquad (3) \]
Here the minimum is taken over the nodes (xk-1, yk-1) from
which (xk, yk) can be reached in a single step.
The dissimilarity between the two feature vectors is equal to
DMin(T, R). It has to be ensured that the path does not turn
back on itself: the indices xk and yk either stay the same
or increase, and each can increase by at most 1 on each
step along the path. The path starts at (1, 1) and ends at
(T, R).
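The VPP feature and the DTW matching of this subsection can be sketched together. The sketch makes several assumptions that are not stated in the paper: the image is a list of rows with 0 for foreground and 1 for background, the node cost is the absolute difference of the two profile values, and the accumulated cost is not normalised by path length. Function names are hypothetical.

```python
from typing import List, Sequence


def vertical_projection(image: Sequence[Sequence[int]]) -> List[int]:
    """Count foreground pixels (value 0) in every column, as in Equation (2)."""
    height = len(image)
    width = len(image[0]) if height else 0
    return [sum(1 - image[r][j] for r in range(height)) for j in range(width)]


def dtw_dissimilarity(ref: Sequence[float], test: Sequence[float]) -> float:
    """Accumulated cost of the cheapest warping path, as in Equation (3).

    Allowed steps: advance along ref, advance along test, or advance
    along both; indices never decrease and grow by at most 1 per step.
    """
    n, m = len(ref), len(test)
    if n == 0 or m == 0:
        raise ValueError("both profiles must be non-empty")
    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]
    for x in range(n):
        for y in range(m):
            cost = abs(ref[x] - test[y])  # assumed node cost
            if x == 0 and y == 0:
                best = 0.0
            else:
                best = min(
                    acc[x - 1][y] if x > 0 else inf,
                    acc[x][y - 1] if y > 0 else inf,
                    acc[x - 1][y - 1] if x > 0 and y > 0 else inf,
                )
            acc[x][y] = best + cost
    return acc[n - 1][m - 1]


image = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
]
profile = vertical_projection(image)
print(profile)  # [1, 3, 1]
print(dtw_dissimilarity(profile, [1, 1, 3, 3, 1]))  # 0.0
```

The second call shows why warping is useful: a horizontally stretched version of the same profile still matches with zero cost.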
C. Recognition
A legal amount word extracted from the cheque has to be
matched with the words collected from the same individual
using the two techniques mentioned above. A number (n) of
highly matched words from both the techniques are
considered to finalize the recognition result. A ranking
scheme is employed for this purpose, as follows.
Let $S_G = \{w_{G1}, w_{G2}, \ldots, w_{Gn}\}$ be the set of highly
matched words from the GSC-BVM method and
$S_V = \{w_{V1}, w_{V2}, \ldots, w_{Vn}\}$ be the set of highly
matched words from the VPP-DTW method. The intersection of
these two sets gives the final set ($S_F$) of words to be
considered.
\[ S_F = S_G \cap S_V \qquad (4) \]
The rank 'R' associated with a word 'w' in SF is
calculated as:
\[ R = G_i + V_j \qquad (5) \]
where Gi and Vj are the ranks of the same word in SG and SV
respectively.
The word having the highest rank is selected from the
final set as the recognition result. If two words share the
highest rank, then their rank in SG (i.e. Gi) is considered for
the final selection. If $S_F = \emptyset$, the test word is
rejected to attain higher reliability.
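The ranking scheme can be sketched as below. The sketch rests on an interpretation that the paper does not state explicitly: rank 1 denotes the best match in each list, so the "highest rank" corresponds to the smallest sum Gi + Vj. The function name and the word labels are hypothetical.

```python
from typing import List, Optional


def combine_rankings(gsc_top: List[str], vpp_top: List[str]) -> Optional[str]:
    """Pick one word from the top-n lists of the two methods.

    Both lists are ordered best match first and are assumed to contain
    no repeated words. Returns None when the lists share no word, which
    corresponds to rejecting the test word.
    """
    g_rank = {w: i for i, w in enumerate(gsc_top, start=1)}
    v_rank = {w: j for j, w in enumerate(vpp_top, start=1)}
    common = set(g_rank) & set(v_rank)  # Equation (4)
    if not common:
        return None
    # Equation (5), with ties broken by the GSC-BVM rank.
    return min(common, key=lambda w: (g_rank[w] + v_rank[w], g_rank[w]))


print(combine_rankings(["w50", "w15", "w5"], ["w15", "w50", "w60"]))  # w50
print(combine_rankings(["w50", "w15"], ["w60", "w70"]))               # None
```

In the first call both shared words have a combined rank of 3, and the tie is resolved in favour of the word ranked higher by GSC-BVM.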
D. Syntactical Knowledge
After recognizing all the words on a cheque, the syntax
of the entire legal amount has to be analyzed to reject
words appearing at wrong positions. The syntactical
knowledge (SK) is always language dependent and can be
implemented as a set of rules. Some of the common rules
applicable to Indian bank cheques are listed below.
• The first word should not be a word equivalent to 'crore', 'lakh', 'thousand', 'hundred', 'only' etc.
• The word equivalent to 'only' should appear only at the end of the amount.
• A word equivalent to 'crore' or 'lakh' or 'thousand' or 'hundred' or 'only' or 'rupees' should appear only once in the entire legal amount.
• Two consecutive words shall not belong to the group of the words equivalent to 'crore', 'lakh', 'thousand', 'hundred' etc.
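The four rules listed above can be expressed as a small checker. This is a sketch under stated assumptions: recognized words are represented by their English equivalents, only the words named explicitly in the rules are covered (the "etc." is not expanded), and the function reports violations rather than deciding which individual word to reject. All names are hypothetical.

```python
from typing import List

MULTIPLIERS = {"crore", "lakh", "thousand", "hundred"}
ONCE_ONLY = MULTIPLIERS | {"only", "rupees"}


def syntax_violations(words: List[str]) -> List[str]:
    """Return a description of every rule that the word sequence breaks."""
    problems: List[str] = []
    # Rule 1: the amount must not start with a multiplier or 'only'.
    if words and words[0] in MULTIPLIERS | {"only"}:
        problems.append(f"amount starts with '{words[0]}'")
    for i, word in enumerate(words):
        # Rule 2: 'only' may appear only at the end.
        if word == "only" and i != len(words) - 1:
            problems.append("'only' is not the last word")
        # Rule 4: two multipliers must not be adjacent.
        if i > 0 and word in MULTIPLIERS and words[i - 1] in MULTIPLIERS:
            problems.append(f"consecutive multipliers: {words[i - 1]} {word}")
    # Rule 3: each of these words may appear at most once.
    for word in sorted(ONCE_ONLY):
        if words.count(word) > 1:
            problems.append(f"'{word}' appears more than once")
    return problems


print(syntax_violations(["rupees", "five", "thousand", "two", "hundred", "only"]))  # []
print(syntax_violations(["five", "hundred", "thousand", "only"]))
```

An empty list means the recognized sequence passes all four rules; any entry would trigger rejection in a system that favours reliability.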
III. EXPERIMENTATION
The dataset has been grouped into three sub-datasets,
namely DB1, DB2 and DB3. DB1 contains data collected
from 90 individuals in Marathi, where each
individual contributed 114 word templates and a handwritten
cheque. Thus DB1 has 10,260 (114 × 90) handwritten words
and 90 handwritten cheques in Marathi. DB2 also
has data in Marathi, collected from 70 individuals
with comparatively poor handwriting. DB2 has
7,980 (114 × 70) handwritten words and 70 handwritten
cheques. DB3 contains data in Hindi collected
from 80 individuals. Each individual contributed 106 word
templates and a handwritten cheque. Thus DB3 has
8,480 (106 × 80) handwritten words and 80 handwritten
cheques in Hindi. The three sub-datasets
collectively have 26,720 handwritten Devanagari words and
240 handwritten cheques.
The word-level correctness, error, rejection and reliability
of the different experimental setups are given in
Tables I, II and III. Table I gives the individual performance
of the GSC-BVM and VPP-DTW methods. It is clear
from Table I that the GSC-BVM method outperforms the
VPP-DTW method on all three datasets. It is also evident from
the table that syntactical knowledge (SK) can increase
the reliability of the system by 5% to 15%. Table II shows
the performance of the combined approach with
different values of 'n'. Table III shows the impact of
syntactical knowledge (SK) on the combined technique, which
improves the reliability by 1% to 6%.
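The paper does not spell out how reliability is computed. The reported figures are consistent with the common definition correct / (correct + error), with rejected words excluded, and that definition is assumed in the sketch below. The function name is hypothetical.

```python
def reliability(correct_pct: float, error_pct: float) -> float:
    """Share of accepted (non-rejected) words that were recognized correctly."""
    return 100.0 * correct_pct / (correct_pct + error_pct)


# GSC-BVM with SK on DB1: 78.08% correct, 13.75% error, 8.15% rejected.
print(round(reliability(78.08, 13.75), 2))
# Without rejection, reliability coincides with correctness (up to rounding).
print(round(reliability(80.65, 19.34), 2))
```

Under this definition the first value agrees with the reliability reported for that configuration (85.02%) to within rounding, which also explains why rejecting doubtful words raises reliability even when correctness drops.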
TABLE I. INDIVIDUAL PERFORMANCE DETAILS OF THE METHODS WITH AND WITHOUT SYNTACTICAL KNOWLEDGE (WORD LEVEL RESULTS)
                             Without SK               With SK
Attribute         Database   GSC & BVM   VPP & DTW    GSC & BVM   VPP & DTW
Correctness (%)   DB1        80.65       76.69        78.08       71.56
                  DB2        76.50       61.31        72.77       53.86
                  DB3        79.45       65.23        72.68       55.07
Rejection (%)     DB1        0.00        0.00         8.15        14.91
                  DB2        0.00        0.00         12.60       27.79
                  DB3        0.00        0.00         14.67       29.34
Error (%)         DB1        19.34       23.31        13.75       13.51
                  DB2        23.49       38.68        14.61       18.33
                  DB3        20.54       34.76        12.64       15.57
Reliability (%)   DB1        80.65       76.69        85.02       84.10
                  DB2        76.50       61.31        83.27       74.60
                  DB3        79.45       65.23        85.18       77.95
IV. CONCLUSIONS
In India, there is vast scope for research related to
handwritten Devanagari text processing, as almost 300
million people use Devanagari script for writing. A database
containing 26,720 handwritten Hindi and
Marathi legal amount words has been developed and made available to the
global research community. This database can be used for
benchmarking segmentation and classification
(recognition) techniques. To the best of our knowledge, it is
the first database of its type (legal amount words). A
combined approach to recognizing
legal amount words in a single-writer environment is discussed in the paper. It is also
demonstrated that the combined approach performs better
than the individual methods in terms of correctness and
reliability. The individual performance details
obtained through experimentation show that the
GSC-BVM technique is superior to the
VPP-DTW technique. Reliability is important in automatic
cheque processing, as incorrect recognition can result in
huge financial losses. The experimental results show
that the use of syntactical knowledge improves the
reliability of the recognition system. It is also clear that
more research is required to achieve a fully reliable system.
ACKNOWLEDGMENT
The Department of Science & Technology (DST),
Ministry of Science and Technology, New Delhi, India
supports this work under the Grant SR/FTP/ETA-58/2007.
TABLE II. WORD LEVEL PERFORMANCE DETAILS OF THE COMBINED APPROACH WITH DIFFERENT 'n' VALUES
     Correctness (%)        Error (%)              Rejection (%)          Reliability (%)
n    DB1    DB2    DB3      DB1    DB2    DB3      DB1    DB2    DB3      DB1    DB2    DB3
3    82.05  72.77  79.00    6.99   10.02  11.73    10.95  17.19  9.25     92.14  87.88  87.06
4    84.38  74.78  80.36    8.39   12.60  13.31    7.22   12.60  6.32     90.95  85.57  85.78
5    85.08  76.50  81.26    9.79   15.47  14.44    5.12   8.02   4.28     89.68  83.17  84.90
6    85.78  77.36  82.16    10.48  17.76  16.25    3.72   4.87   1.58     89.10  81.32  83.48
7    85.78  78.22  82.61    11.18  18.05  16.70    3.03   3.72   0.67     88.46  81.25  83.18
8    85.54  78.79  83.07    12.12  18.62  16.70    2.33   2.57   0.22     87.58  80.88  83.25
TABLE III. WORD LEVEL PERFORMANCE DETAILS OF THE COMBINED APPROACH (HAVING SK) WITH DIFFERENT 'n' VALUES
     Correctness (%)        Error (%)              Rejection (%)          Reliability (%)
n    DB1    DB2    DB3      DB1    DB2    DB3      DB1    DB2    DB3      DB1    DB2    DB3
3    81.81  72.49  72.91    6.29   9.74   6.32     11.88  17.7   20.76    92.85  88.15  92.02
4    83.68  74.49  74.26    6.75   12.32  7.44     9.55   13.18  18.28    92.52  85.80  90.88
5    84.14  75.93  74.94    7.92   13.75  7.90     7.92   10.31  17.15    91.39  84.66  90.46
6    84.61  76.50  75.62    7.92   15.18  8.57     7.45   8.30   15.80    91.43  83.43  89.81
7    84.61  77.36  76.07    8.62   15.47  9.02     6.75   7.16   14.89    90.75  83.33  89.38
8    84.38  77.65  76.29    8.62   15.18  8.80     6.99   7.16   14.89    90.72  83.64  89.65
REFERENCES
[1] R. Jayadevan, S. R. Kolhe, P. M. Patil and U. Pal, "Offline
Recognition of Devanagari Script: A Survey", IEEE Transactions on
Systems, Man and Cybernetics-Part C: Applications and Reviews,
2011 (in press).
[2] B. Shaw, S. K. Parui and M. Shridhar, "A Segmentation Based
Approach to Offline Handwritten Devanagari Word Recognition", in
Proceedings of IEEE ICIT, 2008, pp. 256-257.
[3] B. Shaw, S. K. Parui and M. Shridhar, "Off-line Handwritten
Devanagari Word Recognition: A holistic approach based on
directional chain code feature and HMM", in Proceedings of IEEE
ICIT, 2008, pp. 203-208.
[4] B. Shaw and S. K. Parui, "A Two Stage Recognition Scheme for
Offline Handwritten Devanagari Words", Machine Interpretation of
Patterns-Image Analysis and Data Mining, R. K. De, D. P. Mandal
and A. Ghosh, Eds. World Scientific, pp. 145-165, 2010.
[5] J. T. Favata, G. Srikantan and S. N. Srihari, "Handprinted
Character/Digit Recognition Using a Multiple Feature/Resolution
Philosophy", in Proceedings of IWFHR, 1994, pp. 57-66.
[6] B. Zhang, S. N. Srihari and C. Huang, "Word image retrieval using
binary features", Proc. SPIE 5296, 45, 2003.
[7] T. M. Rath and R. Manmatha, "Word Image Matching Using
Dynamic Time Warping", in Proceedings of IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, 2003, pp.
521-527.
[8] X. Ye, M. Cheriet, C. Y. Suen and K. Liu, "Extraction of bankcheck
items by mathematical morphology", IJDAR, vol. 2, pp. 53-66, 1999.
[9] A. Vinciarelli, "A survey on off-line cursive word recognition", Pattern
Recognition, vol. 35, no. 7, pp. 1433-1446, 2002.
[10] T. Steinherz, E. Rivlin and N. Intrator, "Offline cursive script word
recognition - a survey", International Journal on Document Analysis
and Recognition, vol. 2, no. 2, pp. 90-110, 1999.
[11] S. Madhvanath and V. Govindaraju, "The role of holistic paradigms in
handwritten word recognition", IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 23, no. 2, pp. 149-164, 2001.
[12] Y. Lu and M. Shridhar, "Character segmentation in handwritten words
- An overview", Pattern Recognition, vol. 29, no. 1, pp. 77-96, 1996.
[13] T. Paquet and Y. Lecourtier, "Handwriting Recognition: Application
on Bank cheques", Proceedings of 1st Intl. Conf. Document Anal.
And Recog., pp. 749-755, 1991.
[14] G. Dimauro, S. Impedovo, G. Pirlo and A. Salzo, "A Multi-Expert
Signature Verification System for Bankcheck Processing",
International Journal of Pattern Recognition and Artificial
Intelligence, vol. 11, pp. 827-844, 1997.
[15] L. L. Lee, M. G. Lizárraga, N. R. Gomes and A. L. Koerich, "A
Prototype for Brazilian Bankcheck Recognition", International
Journal of Pattern Recognition and Artificial Intelligence, vol. 11, pp.
549-569, 1997.
[16] M. Shridhar, Gilles F. Houle and F. Kimura, "Comprehensive Check
Image Reader", in Proc. ICCTA, pp. 407-416, 2007.
[17] R. Jayadevan, S. R. Kolhe, P. M. Patil and U. Pal, "Automatic
Processing of Handwritten Bank Cheque Images: A Survey", IJDAR,
(Accepted).