Article

Person re-identification by symmetry-driven accumulation of local features

ABSTRACT In this paper, we present an appearance-based method for person re-identification. It consists of the extraction of features that model three complementary aspects of human appearance: the overall chromatic content, the spatial arrangement of colors into stable regions, and the presence of recurrent local motifs with high entropy. All this information is derived from different body parts and weighted appropriately by exploiting symmetry and asymmetry perceptual principles. In this way, robustness against very low resolution, occlusions, and pose, viewpoint, and illumination changes is achieved. The approach applies to situations where the number of candidates varies continuously, considering single images or bunches of frames for each individual. It has been tested on several public benchmark datasets (VIPeR, iLIDS, ETHZ), achieving new state-of-the-art performance.
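As an illustration of the symmetry-driven weighting idea sketched in the abstract, the snippet below weights an HSV color histogram by each pixel's proximity to a pre-estimated vertical symmetry axis, so that pixels near the axis (more stable across pose changes) contribute more. The Gaussian bandwidth, bin count, and the Bhattacharyya comparison are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def symmetry_weighted_histogram(img_hsv, axis_x, sigma=10.0, bins=8):
    """Chromatic descriptor for one body part: an HSV histogram in which
    each pixel's vote is weighted by its proximity to the vertical
    symmetry axis (hypothetical weighting scheme for illustration).

    img_hsv : (H, W, 3) array with all channels scaled to [0, 1]
    axis_x  : column index of the estimated vertical symmetry axis
    """
    h, w, _ = img_hsv.shape
    cols = np.arange(w)
    # Gaussian weight decaying with horizontal distance from the axis.
    col_weights = np.exp(-((cols - axis_x) ** 2) / (2.0 * sigma ** 2))
    weights = np.tile(col_weights, (h, 1)).ravel()

    hist, _ = np.histogramdd(
        img_hsv.reshape(-1, 3),
        bins=(bins, bins, bins),
        range=((0, 1), (0, 1), (0, 1)),
        weights=weights,
    )
    hist = hist.ravel()
    return hist / (hist.sum() + 1e-12)  # normalize to a distribution

def bhattacharyya_distance(p, q):
    """Distance between two normalized histograms, a common way to
    compare chromatic descriptors in re-identification."""
    return -np.log(np.sum(np.sqrt(p * q)) + 1e-12)
```

Descriptors built this way for a probe and a gallery image can then be compared with `bhattacharyya_distance`, smaller values indicating a more likely match.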

Full-text

Available from: Loris Bazzani, Jun 02, 2015
  • Source
    ABSTRACT: This paper proposes a novel approach to person re-identification, a fundamental task in distributed multi-camera surveillance systems. Although a variety of powerful algorithms have been presented in the past few years, most of them focus on designing hand-crafted features and learning metrics either individually or sequentially. Unlike previous works, we formulate a unified deep ranking framework that jointly tackles both of these key components to maximize their strengths. We start from the principle that the correct match of the probe image should be positioned in the top rank within the whole gallery set. An effective learning-to-rank algorithm is proposed to minimize the cost corresponding to the ranking disorders of the gallery. The ranking model is solved with a deep Convolutional Neural Network (CNN) that builds the relation between input image pairs and their similarity scores through joint representation learning directly from raw image pixels. The proposed framework allows us to dispense with hand-crafted feature engineering. An extensive comparative evaluation demonstrates that our approach significantly outperforms all state-of-the-art approaches, including both traditional and CNN-based methods, on the challenging VIPeR and CUHK-01 datasets. Additionally, our approach generalizes better across datasets without fine-tuning.
  • Source
    ABSTRACT: Identifying the same individual across different scenes is an important yet difficult task in intelligent video surveillance. Its main difficulty lies in how to preserve the similarity of the same person against large appearance and structure variations while discriminating different individuals. In this paper, we present a scalable distance-driven feature learning framework based on a deep neural network for person re-identification, and demonstrate its effectiveness in handling the existing challenges. Specifically, given training images with class labels (person IDs), we first produce a large number of triplet units, each of which contains three images: an anchor image of one person together with a matched reference and a mismatched reference. Treating these units as input, we build a convolutional neural network to generate the layered representations, followed by a distance metric. By means of parameter optimization, our framework tends to maximize the relative distance between the matched pair and the mismatched pair for each triplet unit. Moreover, a nontrivial issue arising with the framework is that the triplet organization cubically enlarges the number of training samples, as one image can be involved in several triplet units. To overcome this problem, we develop an effective triplet generation scheme and an optimized gradient descent algorithm, making the computational load depend mainly on the number of original images instead of the number of triplets. On several challenging databases, our approach achieves very promising results and outperforms other state-of-the-art approaches.
    Pattern Recognition 04/2015; DOI:10.1016/j.patcog.2015.04.005 · 2.58 Impact Factor
  • Source
    ABSTRACT: This dissertation addresses the challenges in large-scale deployment of wide-area camera networks and automated analysis of the resulting big data. Analysis of such data is limited by communication bottlenecks and low computational power at individual nodes. Specific focus is on distributed tracking and search/retrieval. For object tracking in overlapping camera views, we propose a strategy for inducing priors on scene-specific information and explicitly modeling object appearance. Contextual information such as known trajectories and entry/exit points is leveraged as scene-specific priors. A novel probabilistic multiple-camera tracking algorithm with a distributed loss function for incorporating scene priors is proposed, which leads to a significant improvement in overall tracking accuracy. The proposed algorithm is validated with extensive experimentation on challenging camera-network data, and is found to compare favorably with state-of-the-art trackers. For non-overlapping views, a novel graph-based model is proposed to represent spatio-temporal relationships between objects for search and retrieval tasks. A graph ranking strategy is used to order the items based on similarity with an emphasis on diversity. Extensive experimental results on a ten-camera network are presented. The proposed person re-identification methodology is compared with state-of-the-art algorithms on benchmark datasets.
    12/2014, Degree: Doctor of Philosophy, Supervisor: B. S. Manjunath
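The relative distance objective described in the triplet-based abstract above can be sketched as a hinge loss on squared Euclidean distances between embeddings: the mismatched pair should be at least a margin farther apart than the matched pair. This is a minimal numpy illustration; the margin value and the use of squared Euclidean distance are common choices assumed here, not details taken from that paper:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style relative distance loss for one triplet unit.

    anchor, positive, negative : embedding vectors (numpy arrays);
    the loss is zero once the mismatched pair (anchor, negative) is at
    least `margin` farther apart, in squared Euclidean distance, than
    the matched pair (anchor, positive).
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(0.0, margin + d_pos - d_neg)
```

Summing this loss over all triplet units and minimizing it by gradient descent pushes matched pairs together and mismatched pairs apart in the learned embedding space.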