September 2023
Signal Image and Video Processing
Person re-identification is an image retrieval technique for retrieving images of a person in real-world scenes. Due to factors such as camera angle, lighting, and occlusion, representations of the same person exhibit high intra-class variation. Furthermore, discriminative local regions such as hats and shoes are often ignored, so useful local information goes unused during retrieval. In this paper, a multi-scale feature fusion network model combining global and local features is proposed. The network is built from four stacked building blocks, in which multi-scale features are assigned different weights and fused according to the output of each branch. In addition, a multi-attention mechanism network is combined with the multi-scale feature fusion. This mechanism aims to enable the network to model the relations between input images, so as to effectively aggregate the features of neighbouring person samples and obtain a more robust image representation. Experimental results show that the proposed method improves retrieval performance.
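The two ideas in the abstract, weighted fusion of multi-scale branch features and attention-based aggregation over neighbouring samples, can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the abstract does not say how the branch weights are computed, so the gating below (a softmax over each branch's mean activation) is an assumption, as are the function names `fuse_branches` and `aggregate_neighbours` and the use of plain scaled dot-product attention across the samples of a batch.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def fuse_branches(branch_feats):
    """Fuse per-scale features with sample-dependent weights.

    branch_feats: list of arrays of shape (batch, dim), one per scale.
    Assumption: each branch is scored by its mean activation; the paper's
    actual weighting rule is not given in the abstract.
    """
    stacked = np.stack(branch_feats, axis=1)      # (batch, n_branches, dim)
    scores = stacked.mean(axis=2)                 # (batch, n_branches)
    weights = softmax(scores, axis=1)             # weights sum to 1 per sample
    fused = (weights[:, :, None] * stacked).sum(axis=1)  # (batch, dim)
    return fused, weights


def aggregate_neighbours(feats):
    """Aggregate features across the samples in a batch.

    Assumption: scaled dot-product self-attention over samples, so each
    output row is a convex combination of all input rows.
    """
    dim = feats.shape[1]
    attn = softmax(feats @ feats.T / np.sqrt(dim), axis=1)  # (batch, batch)
    return attn @ feats, attn


# Usage with random stand-in features: 3 scales, 4 images, 8 dimensions.
rng = np.random.default_rng(0)
branches = [rng.standard_normal((4, 8)) for _ in range(3)]
fused, weights = fuse_branches(branches)
robust, attn = aggregate_neighbours(fused)
```

In this sketch each fused vector is a convex combination of the branch outputs, and each aggregated vector is a convex combination of the fused vectors in the batch, which is one simple way to let similar samples reinforce one another's representations.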