January 2025
IEEE Internet of Things Journal
Visual Internet of Things (VIoT) systems and text detection technology have both developed rapidly and are now widely combined at industrial production sites, for tasks such as label text detection, with impressive results. However, text detection in this setting still has several shortcomings: 1) existing VIoT systems have very limited detection precision for text with large scale variation, especially small-scale text; 2) existing text detection algorithms do not match real conditions, where labels often contain handwritten text and the text to be detected has arbitrary shapes; 3) in practice, text labels often have creases or defects. To address these problems, this paper designs a text detection method based on a multi-scale selective fusion feature pyramid and a multi-semantic spatial network to assist the VIoT system in detecting label text. First, a multi-scale selective fusion feature pyramid is designed. It uses a texture extraction module to improve the extraction of text texture features and multi-scale features, and a cross-scale selective fusion block to selectively fuse features from different stages, reducing the influence of label contamination on detection. In addition, a multi-semantic spatial network is designed to capture the multi-semantic spatial information of each feature channel using multi-scale deep shared one-dimensional convolution, which effectively integrates global context dependence and multi-semantic spatial priors. Experimental results show that the comprehensive index F-measure on the public datasets ICDAR2015, Total-Text, and CTW1500 increases by 5.7%, 3.3%, and 3.8%, respectively. Furthermore, the precision, recall, and F-measure on the Label-Text dataset are 94.6%, 90.7%, and 92.6%, respectively. The label text detection VIoT system we designed has been deployed in the field and has achieved excellent performance.
The code for our proposed method is available at: https://github.com/rebornone1/MSNet
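The Label-Text figures reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that F-measure denotes the usual harmonic mean of precision and recall, as is conventional in text detection benchmarks; the abstract does not state the formula, and the function name `f_measure` is illustrative rather than taken from the authors' repository.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F-measure)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Label-Text precision and recall quoted in the abstract, as fractions.
label_text_f = f_measure(0.946, 0.907)
print(f"{label_text_f:.1%}")  # prints 92.6%
```

Under that assumption, the computed value matches the reported F-measure of 92.6%.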