Article

Research of Chinese Handwritten Text Segmentation Algorithm

Abstract

OCR is a complicated process, and many factors influence the recognition rate. Early work tried to optimize the classifier to obtain a high recognition rate, on the premise that the input is a single character, whether printed or handwritten. Because classifier performance has improved considerably, the recognition rate for single characters is now high enough for commercial use. As demand for handwritten text recognition grows, raising the recognition rate of the OCR system as a whole becomes very important. Unlike OCR systems for printed text, which focus on the classifier, research on OCR systems for handwritten text concentrates mainly on character segmentation. Statistical analysis shows that segmentation errors account for more mistakes than classifier errors. This follows from the nature of handwritten text: it is more irregular and its lines are not horizontal; in addition, handwritten Chinese characters are more likely to overlap, and the gaps between characters are smaller. These properties are what make handwritten Chinese text difficult to segment. This paper proposes a multi-step searching nonlinear line extraction algorithm that is simple and highly accurate, and that addresses some weaknesses of the direct projection and indirect projection methods.
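The abstract names the direct projection method without describing it. As a rough illustration only, the sketch below assumes the conventional meaning of the term: count ink pixels per row and cut the page at blank rows. The function names and the toy images are hypothetical, and this is not the paper's multi-step searching algorithm. The second image shows the weakness the abstract alludes to: when lines slant, no blank row separates them and the projection finds a single block.

```python
def horizontal_projection(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def split_lines(image, min_ink=1):
    """Return (start, end) row ranges, end exclusive, whose ink count >= min_ink."""
    profile = horizontal_projection(image)
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                    # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))     # a blank row closes the line
            start = None
    if start is not None:                # line runs to the bottom edge
        lines.append((start, len(profile)))
    return lines


# Two horizontal text lines separated by a blank row (toy data).
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Two slanted lines: every row contains ink, so no blank row exists (toy data).
skewed = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

print(split_lines(page))    # [(1, 3), (4, 5)]
print(split_lines(skewed))  # [(0, 4)]
```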

Article
Segmentation of off-line handwritten Chinese characters is the premise of recognition, and connected characters are the most difficult to segment. A novel algorithm based on stroke analysis and background thinning is proposed to segment connected handwritten Chinese characters. The feature points, viz. end points, fork points and corner points, are detected in the thinned image of connected characters. The segments between the feature points are considered as substrokes and are extracted. The lengths of the substrokes, their topological relations and projection information are employed to locate connected points. By thinning the background, a suitable separation path is determined. The experimental results show that the presented method achieves satisfactory performance for segmentation of connected handwritten Chinese characters.
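The feature-point step described in this abstract can be illustrated with a crossing-number test on a one-pixel-wide skeleton. This is a minimal sketch under assumptions: the abstract does not say how the points are detected, the function names are hypothetical, corner points are not handled, and the T-shaped skeleton is toy data. The crossing number counts 0-to-1 transitions around a pixel's 8-neighbourhood; 1 marks an end point and 3 or more marks a fork point.

```python
# 8-neighbour offsets (row, col) in clockwise order, starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def pixel(img, r, c):
    """Read a pixel, treating everything outside the image as background."""
    if 0 <= r < len(img) and 0 <= c < len(img[0]):
        return img[r][c]
    return 0


def crossing_number(img, r, c):
    """Count background-to-ink transitions while walking around (r, c)."""
    ring = [pixel(img, r + dr, c + dc) for dr, dc in RING]
    return sum(1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1)


def feature_points(img):
    """Return (end_points, fork_points) of a thinned binary image."""
    ends, forks = [], []
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if not value:
                continue
            n = crossing_number(img, r, c)
            if n == 1:
                ends.append((r, c))
            elif n >= 3:
                forks.append((r, c))
    return ends, forks


# A T-shaped skeleton: one horizontal stroke with a vertical stroke below it.
skeleton = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

ends, forks = feature_points(skeleton)
print(ends)   # [(1, 0), (1, 4), (3, 2)]
print(forks)  # [(1, 2)]
```

The crossing number is used here instead of a plain neighbour count because pixels diagonally adjacent to a junction, such as (1, 1) above, have three ink neighbours but are not forks.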
Article
In this paper, a metasynthetic method is proposed to segment handwritten Chinese character strings. The Viterbi algorithm is first applied to search for segmentation paths, and several rules are used to remove redundant paths. A background-thinning method is then adopted to obtain non-linear segmentation paths. If there are no touching characters, a dynamic programming algorithm is applied to merge components. For touching characters, background and foreground information is used to obtain candidate segmentation paths, and feature vectors are constructed from peripheral features. A mixture probability density function, whose parameters are obtained by the EM algorithm, is then used to choose the best segmentation path. Experimental results demonstrate that the proposed scheme effectively segments handwritten Chinese characters and achieves an improvement over previous methods.
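To make the idea of a Viterbi-style search for a non-linear segmentation path concrete, the sketch below runs a simple dynamic program over a binary image. It is an illustration under assumptions, not the cited method: the cost (number of ink pixels crossed), the move set (straight down or one column left or right per row), the function name and the toy image are all hypothetical.

```python
def min_ink_path(image):
    """Find a top-to-bottom path that crosses the fewest ink pixels.

    Returns (total_cost, path) where path[y] is the column chosen in row y.
    Between consecutive rows the path may shift by at most one column.
    """
    height, width = len(image), len(image[0])
    cost = [list(image[0])]        # cost[y][x]: cheapest path ending at (y, x)
    back = [[None] * width]        # back[y][x]: column used in row y - 1
    for y in range(1, height):
        row_cost, row_back = [], []
        for x in range(width):
            candidates = [(cost[y - 1][px], px)
                          for px in (x - 1, x, x + 1) if 0 <= px < width]
            best_cost, best_px = min(candidates)
            row_cost.append(best_cost + image[y][x])
            row_back.append(best_px)
        cost.append(row_cost)
        back.append(row_back)

    # Pick the cheapest end column, then follow the back-pointers upward.
    x = min(range(width), key=lambda col: cost[-1][col])
    total = cost[-1][x]
    path = [x]
    for y in range(height - 1, 0, -1):
        x = back[y][x]
        path.append(x)
    path.reverse()
    return total, path


# Two overlapping blobs: every straight vertical cut crosses ink,
# but a bending path through the background does not (toy data).
pair = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
]

total, path = min_ink_path(pair)
print(total, path)
```

In this toy image the best straight cut (column 2 or column 3) crosses one ink pixel, while the bending path found by the search crosses none, which is the motivation for non-linear paths when characters overlap.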
Conference Paper
Segmentation of connected handwritten Chinese characters is a very difficult task in document image analysis. In this paper, a novel algorithm based on stroke analysis and background thinning is proposed to segment connected handwritten Chinese characters. The feature points, viz. end points, fork points and corner points, are detected in the thinned image. The segments between feature points are considered as substrokes and are extracted. The lengths of the substrokes and the topological relations between them are employed to locate connected points. A new method based on background thinning is developed to decide a proper segmentation path. The experimental results show that the presented method achieves satisfactory performance for segmentation of connected handwritten Chinese characters.
L. Y, "Machine Printed Character Segmentation - An Overview", Pattern Recognition, vol. 28, no. 1, (1995), pp. 67-80.
L. Qingzhong, "The Research of Character Segmentation in OCR and Text Extract", Tianjin: Nankai University, (2001).
S. Jie and C. Yu, "A Survey of Methods in Handwritten Chinese Character Segmentation", Computer Technology and Development, vol. 16, no. 6, (2006), pp. 184-190.
M. Rui, "Research on Segmentation of Unconstrained Handwritten Characters", Nanjing University of Science and Technology, (2007).