June 2024
Text lines in handwritten examination papers often exhibit inconsistent, nonuniform word sizes, erased characters, widely varying text lengths, and dense long texts. This paper proposes an improved Vision Transformer (ViT) method that strengthens its ability to recognize text lines in handwritten Chinese examination papers. First, the method adopts a segmentation method suited to text line recognition and introduces a repeated multiscale linear projection (RMLP) to enrich the spatial information of the image vectors, which improves the model's ability to integrate patch vectors at multiple scales. Second, ViT is combined with connectionist temporal classification (CTC) to produce a prediction for each patch, which improves the robustness of ViT in Chinese handwritten text recognition. Experiments show that RMLP-ViT is promising for recognizing examination paper text lines and achieves good performance on the SCUT-EPT dataset.
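
To illustrate how a per-patch prediction can be turned into a text line under CTC, the following is a minimal sketch of standard CTC greedy (best-path) decoding. It is not the paper's implementation: the function name `ctc_greedy_decode`, the choice of index 0 for the blank label, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(
    patch_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Collapse per-patch argmax labels into a label sequence."""
    decoded: List[int] = []
    previous = blank
    for scores in patch_scores:
        # Pick the highest-scoring label for this patch.
        label = max(range(len(scores)), key=scores.__getitem__)
        # Keep it only if it is not blank and not a repeat of the previous patch.
        if label != blank and label != previous:
            decoded.append(label)
        previous = label
    return decoded


# Toy scores for six patches over three labels: blank, character 1, character 2.
toy_scores = [
    [0.1, 0.8, 0.1],    # character 1
    [0.2, 0.7, 0.1],    # character 1 again (merged with the previous patch)
    [0.9, 0.05, 0.05],  # blank separates two genuine occurrences
    [0.1, 0.8, 0.1],    # character 1, a new occurrence
    [0.1, 0.1, 0.8],    # character 2
    [0.1, 0.1, 0.8],    # character 2 again (merged)
]
print(ctc_greedy_decode(toy_scores))
```

Because repeated labels are merged and blanks are dropped, the number of patches does not need to match the number of characters, which is one reason CTC fits text lines of widely varying length.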