Preprint

Billet Number Recognition Based on Test-Time Adaptation


Abstract

During the steel billet production process, it is essential to recognize machine-printed or manually written billet numbers on moving billets in real-time. To address the issue of low recognition accuracy for existing scene text recognition methods, caused by factors such as image distortions and distribution differences between training and test data, we propose a billet number recognition method that integrates test-time adaptation with prior knowledge. First, we introduce a test-time adaptation method into a model that uses the DB network for text detection and the SVTR network for text recognition. By minimizing the model's entropy during the testing phase, the model can adapt to the distribution of test data without the need for supervised fine-tuning. Second, we leverage the billet number encoding rules as prior knowledge to assess the validity of each recognition result. Invalid results, which do not comply with the encoding rules, are replaced. Finally, we introduce a validation mechanism into the CTC algorithm using prior knowledge to address its limitations in recognizing damaged characters. Experimental results on real datasets, including both machine-printed billet numbers and handwritten billet numbers, show significant improvements in evaluation metrics, validating the effectiveness of the proposed method.
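The abstract does not include an implementation, but the two core ideas it names can be illustrated. Below is a minimal, hypothetical PyTorch sketch of Tent-style test-time adaptation (updating only the affine parameters of normalization layers by minimizing prediction entropy on unlabeled test batches), followed by a rule-based validity check standing in for the billet number encoding rules. The model interface, optimizer settings, and the example encoding pattern are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of Tent-style test-time adaptation by entropy minimization,
# plus a rule-based validity check. Model, optimizer, and encoding rule are
# illustrative assumptions, not taken from the paper.
import re
import torch
import torch.nn as nn
import torch.nn.functional as F

def collect_norm_params(model: nn.Module):
    """Freeze all weights, then re-enable gradients only for normalization-layer
    affine parameters, which are the only parameters updated at test time."""
    for p in model.parameters():
        p.requires_grad_(False)
    params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm2d, nn.LayerNorm)):
            for p in (m.weight, m.bias):
                if p is not None:
                    p.requires_grad_(True)
                    params.append(p)
    return params

def prediction_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Mean Shannon entropy of the softmax outputs over the class dimension."""
    log_probs = F.log_softmax(logits, dim=-1)
    return -(log_probs.exp() * log_probs).sum(dim=-1).mean()

def adapt_step(model: nn.Module, images: torch.Tensor, optimizer: torch.optim.Optimizer):
    """One unsupervised adaptation step on a batch of unlabeled test images."""
    logits = model(images)            # e.g. (batch, seq_len, num_classes) for a recognizer
    loss = prediction_entropy(logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return logits.detach()

# Prior-knowledge check: a purely illustrative encoding rule (two letters followed
# by seven digits); the real billet-number rule is plant-specific.
BILLET_PATTERN = re.compile(r"^[A-Z]{2}\d{7}$")

def is_valid_billet_number(text: str) -> bool:
    return BILLET_PATTERN.fullmatch(text) is not None

# Usage sketch:
# optimizer = torch.optim.SGD(collect_norm_params(recognizer), lr=1e-3)
# for images in test_loader:
#     logits = adapt_step(recognizer, images, optimizer)
```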

References
Article
The surface characteristics of billets are crucial for subsequent traceability, yet the production process generates intricate digital features on their surfaces. This paper introduces BDR-Net, a novel billet surface digit recognition network. Drawing inspiration from Inception, the network adopts a ResNeXt-like architecture as its primary framework. It distributes the output uniformly across dimensions, extracts positional and scale features separately, and introduces a mixed dilated convolution block to reduce parameters while expanding the receptive field. To address the loss of up-sampled features during fusion, a stream alignment-based up-sampled feature fusion algorithm is proposed. Additionally, to strengthen the network's focus on salient spatial and channel features, a mixed-dimensional attention mechanism (scSE) is integrated into the alignment-based up-sampling feature fusion module. Experimental results show BDR-Net's strong performance: it achieves 95.6% accuracy in classifying digits on billet surfaces, surpassing the ResNeXt50_32x4d benchmark model by 4.3% in recognition accuracy, and its mAP@0.95 reaches 0.897, exceeding current classification networks. These findings underscore the model's performance in billet surface digit recognition, offering an effective solution for digit recognition on billet surfaces in steel mills.
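The scSE attention mentioned in this abstract combines a channel squeeze-and-excitation gate with a spatial one. Below is a minimal sketch of such a block in PyTorch; the channel count, reduction ratio, and the choice to combine the two gates by addition are common conventions rather than values taken from BDR-Net.

```python
import torch
import torch.nn as nn

class SCSEBlock(nn.Module):
    """Concurrent spatial and channel squeeze-and-excitation (scSE) for 2D feature maps."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel gate: global pool, bottleneck 1x1 convs, per-channel sigmoid weight.
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial gate: 1x1 conv producing a per-pixel sigmoid weight.
        self.sse = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.cse(x) + x * self.sse(x)

# Example: attach to a 256-channel feature map from a fusion module.
# feats = torch.randn(2, 256, 32, 32)
# out = SCSEBlock(256)(feats)   # same shape as the input
```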
Article
The automatic recognition of labels marked on steel slab surfaces is of significance to information management and intelligent manufacturing in steel plants. However, it is not an easy task due to complex factors like low printing quality, motion distortion and thermal blurring, especially while handling the prone-to-deform dot-matrix labels generated by a hot spray marking (HSM) technique. In this paper, a machine vision system is presented for the HSM dot-matrix label recognition. After a brief description of the imaging system, the emphasis is put on image analysis. First, a coarse-to-fine strategy is applied to locate HSM characters from captured images, where a weighted gravity-center estimation method is extended to search the enclosure of label regions, and an edge projection scheme is adopted to refine the label extraction. Subsequently, a Multidirectional Line Scanning (MLS) method is proposed to determine the boundaries between adjacent dot-matrix characters with tilt, adhesion or dot-missing abnormalities. Finally, by converting the dot-matrix character into a 2D point set, we introduce a Point Cloud registration for DOt-matrix Character (PC4DOC) method to recognize prone-to-deform characters, which accommodates various distortions and abnormalities owing to the inherent deformation correction of affine transformation and fault tolerance of robust correspondence matching. According to our experiments, the proposed method can achieve real-time recognition with an accuracy of 93.84% in spite of severely degraded images and incomplete characters. The system has been installed and run in a steel mill for more than one year, and its stability has also been verified.
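The registration-based recognition described above can be illustrated with a much simpler stand-in: an ICP-style affine alignment between an observed dot set and a character template, scored by the residual. This is not the PC4DOC method from the paper; the nearest-neighbor correspondence, least-squares affine fit, iteration count, and residual scoring below are illustrative assumptions.

```python
# Simplified ICP-style affine registration of dot-matrix characters (illustrative only).
import numpy as np
from scipy.spatial import cKDTree

def affine_fit(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares 2x3 affine transform mapping src points onto dst points."""
    A = np.hstack([src, np.ones((len(src), 1))])      # (N, 3)
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)       # (3, 2)
    return X.T                                        # (2, 3)

def register_score(dots: np.ndarray, template: np.ndarray, iters: int = 10) -> float:
    """Align an observed dot set onto a character template; lower residual = better match."""
    tree = cKDTree(template)
    pts = dots.astype(float)
    for _ in range(iters):
        _, idx = tree.query(pts)                      # nearest template dot per observed dot
        M = affine_fit(dots, template[idx])           # refit affine from the original dots
        pts = dots @ M[:, :2].T + M[:, 2]             # re-project the observed dots
    dists, _ = tree.query(pts)
    return float(dists.mean())

# Classification sketch: pick the template with the lowest registration residual.
# label = min(templates, key=lambda c: register_score(dots, templates[c]))
```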
Article
With the rise and development of deep learning, computer vision has been tremendously transformed and reshaped. As an important research area in computer vision, scene text detection and recognition has inevitably been influenced by this wave of revolution, consequently entering the era of deep learning. In recent years, the community has witnessed substantial advancements in mindset, methodology and performance. This survey aims to summarize and analyze the major changes and significant progress of scene text detection and recognition in the deep learning era. Through this article, we aim to: (1) introduce new insights and ideas; (2) highlight recent techniques and benchmarks; (3) look ahead into future trends. Specifically, we will emphasize the dramatic differences brought by deep learning and the remaining grand challenges. We expect that this review paper will serve as a reference book for researchers in this field. Related resources are also collected in our GitHub repository (https://github.com/Jyouhou/SceneTextPapers).
Article
We propose a simple and efficient method of semi-supervised learning for deep neural networks. Basically, the proposed network is trained in a supervised fashion with labeled and unlabeled data simultaneously. For unlabeled data, pseudo-labels, obtained by simply picking the class with the maximum predicted probability, are used as if they were true labels. This is in effect equivalent to entropy regularization. It favors a low-density separation between classes, a commonly assumed prior for semi-supervised learning. With a denoising auto-encoder and dropout, this simple method outperforms conventional methods for semi-supervised learning with very little labeled data on the MNIST handwritten digit dataset.
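A minimal PyTorch sketch of the pseudo-labeling objective described above is given below; the interface of the model and the weighting coefficient alpha (which the original method ramps up over training) are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, labeled_x, labels, unlabeled_x, alpha: float):
    """Supervised cross-entropy on labeled data plus cross-entropy against argmax
    pseudo-labels on unlabeled data, weighted by alpha."""
    sup_loss = F.cross_entropy(model(labeled_x), labels)

    with torch.no_grad():
        pseudo = model(unlabeled_x).argmax(dim=-1)   # class with maximum predicted probability
    unsup_loss = F.cross_entropy(model(unlabeled_x), pseudo)

    return sup_loss + alpha * unsup_loss
```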
Conference Paper
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited. This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems. An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline HMM and a hybrid HMM-RNN.
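The CTC objective introduced in this paper is available off the shelf in modern frameworks. Below is a minimal sketch using PyTorch's nn.CTCLoss; the number of time steps, batch size, alphabet size, and label length (e.g. nine-character billet numbers over a digit alphabet) are chosen purely for illustration.

```python
import torch
import torch.nn as nn

# CTC loss over unsegmented label sequences; class index 0 is reserved for the blank symbol.
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

T, N, C = 40, 8, 11                                  # time steps, batch size, classes (10 digits + blank)
logits = torch.randn(T, N, C, requires_grad=True)    # stand-in for recognizer outputs
log_probs = logits.log_softmax(dim=-1)               # (T, N, C), as required by CTCLoss
targets = torch.randint(1, C, (N, 9))                # e.g. nine-character labels (no blanks)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 9, dtype=torch.long)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()                                      # gradients flow back to the recognizer outputs
```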
Article
In this study, a novel framework for the recognition of a billet identification number (BIN) using deep learning is proposed. Because a billet, which is a semi-finished product, can be rolled, the BIN may be rotated at various angles. Most product numbers, including BINs, are combinations of individual characters. Such product numbers are determined by the class of each character and its order (or positioning), and these two pieces of information remain constant even if the product number is rotated. Inspired by this observation, the proposed deep neural network has two outputs: one for the class of an individual character, and the other for the order of that character within the BIN. Compared with a previous study, the proposed network requires an additional annotation but does not require additional labeling labor. The multi-task learning over the two annotations plays a positive role in the network's representation learning, as shown in the experimental results. Furthermore, to achieve good BIN identification performance, we analyzed various networks using the proposed framework. The proposed algorithm was then compared with a conventional algorithm to evaluate BIN identification performance.
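The two-output design described above can be sketched as a shared backbone with a class head and an order head. In the PyTorch sketch below, the backbone, feature width, alphabet size, and maximum number of character positions are illustrative assumptions, not the architecture from the cited study.

```python
import torch
import torch.nn as nn

class TwoHeadBIN(nn.Module):
    """Shared backbone with two outputs: character class and character position within the BIN."""
    def __init__(self, num_classes: int = 36, max_positions: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(                       # placeholder feature extractor
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.class_head = nn.Linear(32, num_classes)         # which character it is
        self.order_head = nn.Linear(32, max_positions)       # where it sits in the number

    def forward(self, x: torch.Tensor):
        feats = self.backbone(x)
        return self.class_head(feats), self.order_head(feats)

# Multi-task loss over the two annotations:
# cls_logits, ord_logits = model(crops)
# loss = F.cross_entropy(cls_logits, cls_targets) + F.cross_entropy(ord_logits, ord_targets)
```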
Article
Research and development of OCR systems are considered from a historical point of view. The historical development of commercial systems is included. Both template matching and structure analysis approaches to R&D are considered. It is noted that the two approaches are coming closer and tending to merge. Commercial products are divided into three generations, for each of which some representative OCR systems are chosen and described in some detail. Some comments are made on recent techniques applied to OCR, such as expert systems and neural networks, and some open problems are indicated. The authors' views and hopes regarding future trends are presented.
Minghui Liao, Zhaoyi Wan, Cong Yao, Kai Chen, and Xiang Bai. Real-time scene text detection with differentiable binarization. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 11474-11481, 2020.
Yongkun Du, Zhineng Chen, Caiyan Jia, Xiaoting Yin, Tianlun Zheng, Chenxia Li, Yuning Du, and Yu-Gang Jiang. SVTR: Scene text recognition with a single visual model. arXiv preprint arXiv:2205.00159, 2022.
Xiaojun Zhang. Batch number of billets recognition system design. Master's thesis, Shanghai Jiao Tong University, 2009.
Xiao Zhou. Study on steel plate character recognition in iron and steel logistics. Industrial Control Computer, (2):93-94, 2016.
Bin Dong. Study of rolling mill production line of heavy rail steel billet recognition system. Master's thesis, Wuhan Institute of Technology, 2015.
Di Wu, Dongsheng Jiao, Xiao Zhang, and Zhanguo Shi. Inspection method for steel billet characters based on SVM. Microcomputer Applications, (10):49-51, 2011.
MJ Sadiq, A Kaleem, et al. Content-based image retrieval system using k-means and KNN approach by feature extraction. Journal of Peritherapeutic Neuroradiology, 24(6):643-649, 2018.
Qiaojie Sun, Dali Chen, Sen Wang, and Shixin Liu. Recognition method for handwritten steel billet identification number based on YOLO deep convolutional neural network. In 2020 Chinese Control and Decision Conference (CCDC), pages 5642-5646. IEEE, 2020.
Zijia Wang, Yichao Dong, Dan Niu, Minghao Liu, Qi Li, and Xisong Chen. Billet number recognition based on ESRGAN and improved YOLOv5. In 2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC), pages 1384-1389. IEEE, 2022.
Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei Efros, and Moritz Hardt. Test-time training with self-supervision for generalization under distribution shifts. In International Conference on Machine Learning, pages 9229-9248. PMLR, 2020.
Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Olshausen, and Trevor Darrell. Tent: Fully test-time adaptation by entropy minimization. arXiv preprint arXiv:2006.10726, 2020.
Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Zhiquan Wen, Yaofo Chen, Peilin Zhao, and Mingkui Tan. Towards stable test-time adaptation in dynamic wild world. arXiv preprint arXiv:2302.12400, 2023.
Jian Liang, Dapeng Hu, and Jiashi Feng. Do we really need to access the source data? Source hypothesis transfer for unsupervised domain adaptation. In International Conference on Machine Learning, pages 6028-6039. PMLR, 2020.
Chenxia Li, Weiwei Liu, Ruoyu Guo, Xiaoting Yin, Kaitao Jiang, Yongkun Du, Yuning Du, Lingfeng Zhu, Baohua Lai, Xiaoguang Hu, et al. PP-OCRv3: More attempts for the improvement of ultra lightweight OCR system. arXiv preprint arXiv:2206.03001, 2022.