Conference Paper

FPGA-based pedestrian detection using array of covariance features

DOI: 10.1109/ICDSC.2011.6042923
Conference: 2011 Fifth ACM/IEEE International Conference on Distributed Smart Cameras, Ghent, Belgium, Aug. 22-25, 2011
Source: DBLP


In this paper we propose a pedestrian detection algorithm and its implementation on a Xilinx Virtex-4 FPGA. The algorithm is a sliding-window classifier that exploits a recently designed descriptor, the covariance of features, to characterize pedestrians robustly. We show how this descriptor, originally designed to maximize accuracy without regard to timing, can be computed quickly and in parallel on the FPGA board. A grid of overlapping covariances extracts information from the sliding window and feeds a linear Support Vector Machine that performs the detection. Experiments are performed on the INRIA pedestrian benchmark; the performance of the FPGA-based detector is discussed in terms of required computational effort and accuracy, showing state-of-the-art detection performance with excellent timings and economical memory usage.
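
The pipeline described in the abstract (a grid of overlapping covariance descriptors feeding a linear SVM) can be illustrated with a minimal software sketch. This is not the authors' FPGA design: the per-pixel feature set (x, y, |Ix|, |Iy|), the 16-pixel patch, the 8-pixel stride, and all function names are assumptions chosen for illustration, and any manifold mapping of the covariance matrices that the published detector may apply is omitted.

```python
import numpy as np

def pixel_features(img):
    """Per-pixel feature stack of shape (H, W, 4): x, y, |Ix|, |Iy|.
    The feature choice is an assumption, not taken from the paper."""
    img = img.astype(np.float64)
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    iy, ix = np.gradient(img)  # derivatives along rows, then columns
    return np.stack([xs, ys, np.abs(ix), np.abs(iy)], axis=-1)

def patch_covariance(feats):
    """Covariance matrix (d, d) of the feature vectors inside one patch."""
    flat = feats.reshape(-1, feats.shape[-1])
    return np.cov(flat, rowvar=False)

def window_descriptor(window, patch=16, stride=8):
    """Concatenate the upper-triangular covariance entries of every patch
    on an overlapping grid (patch and stride values are hypothetical)."""
    feats = pixel_features(window)
    h, w = window.shape
    iu = np.triu_indices(feats.shape[-1])  # covariance is symmetric
    parts = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            cov = patch_covariance(feats[top:top + patch, left:left + patch])
            parts.append(cov[iu])
    return np.concatenate(parts)

def linear_svm_score(descriptor, weights, bias):
    """Linear SVM decision value; a positive score would mean 'pedestrian'."""
    return float(descriptor @ weights + bias)
```

For a 128 × 64 window these settings give 15 × 7 patches with 10 unique covariance entries each. Each patch covariance is independent of the others, which is the kind of structure that lends itself to the parallel computation the abstract mentions.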



Available from: Marco Cristani, May 21, 2014

  • Source
    • "The works of [3], [4], [5] and [6] follow this concept: the hardware is programmed from scratch in order to find an optimised solution for a specific problem. More in detail, [3], [4] and [5] present an implementation of the Histogram of Oriented Gradients (HOG) on streaming video flow, while [6] extracts aggregated features into a covariance matrix. This work aims at overcoming the state-of-the-art limits in FPGA-based SCN nodes by proposing, implementing and evaluating the performance of an innovative and reconfigurable architecture for designing computer vision pipeline inside nodes embedding FPGAs. "
    ABSTRACT: Smart Camera Networks (SCNs) are an emerging research field that represents the natural evolution of centralized computer vision applications towards fully distributed and pervasive systems. In such a scenario, one of the biggest efforts is the definition of a flexible and reconfigurable SCN node architecture that supports remotely updating the application parameters and changing the running computer vision applications at run-time. In this respect, this paper presents a novel SCN node architecture based on a device in which a microcontroller manages all the network functionality as well as the remote configuration, while an FPGA implements all the necessary modules of a full computer vision pipeline. In the paper the envisioned architecture is first detailed in general terms, then a real implementation is presented to show the feasibility and the benefits of the proposed solution. Finally, performance evaluation results prove the potential of hardware-software codesign in reaching flexibility and reduced latency.
    ACM/IEEE International Conference on Distributed Smart Cameras; 09/2013
  • Source
    • "Kadota et al. [23] perform HOG feature extraction in FPGA, then classify the results on a microprocessor. Martelli et al. [24] perform FPGA-based pedestrian detection using covariance features, and Hiromoto et al. [25] describe a similar system using co-occurrence HoG. We compare our implementation to these versions in the results."
    ABSTRACT: This paper presents a new implementation, with complete analysis, of the processing operations required in a widely-used pedestrian detection algorithm (the histogram of oriented gradients (HOG) detector) when run in various configurations on a heterogeneous platform suitable for use as an embedded system. The platform consists of field-programmable gate array (FPGA), graphics processing unit (GPU), and central processing unit (CPU) and we detail the advantages of such an image processing system for real-time performance. We thoroughly analyze the consequent tradeoffs made between power consumption, latency and accuracy for each possible configuration. We thus demonstrate that prioritization of each of these factors can be made by selecting a specific configuration. These separate configurations may then be changed dynamically to respond to changing priorities of a real-time system, e.g., on a moving vehicle. We compare the performance of real-time implementations of linear and kernel support vector machines in HOG and evaluate the entire system against the state-of-the-art in real-time person detection. We also show that our FPGA implementation detects pedestrians more accurately than existing implementations, and that a heterogeneous configuration which performs image scaling on the GPU, and histogram extraction and classification on the FPGA, produces a good compromise between power and speed.
    IEEE Journal on Emerging and Selected Topics in Circuits and Systems 06/2013; 3(2-2):236-247. DOI:10.1109/JETCAS.2013.2256821 · 1.52 Impact Factor
  • ABSTRACT: This paper focuses on real-time pedestrian detection on Field Programmable Gate Arrays (FPGAs) using the Histograms of Oriented Gradients (HOG) descriptor in combination with a Support Vector Machine (SVM) for classification as a basic method. We propose to process image data at twice the pixel frequency and to normalize blocks with the L1-Sqrt-norm resulting in an efficient resource utilization. This implementation allows for parallel computation of different scales. Combined with a time-multiplex approach we increase multiscale capabilities beyond resource limitations. We are able to process 64 high resolution images (1920 × 1080 pixels) per second at 18 scales with a latency of less than 150 µs. 1.79 million HOG descriptors and their SVM classifications can be calculated per second and per scale, which outperforms current FPGA implementations by a factor of 4.
    IEEE Conference on Computer Vision and Pattern Recognition Workshops (Embedded Computer Vision), Portland, Oregon; 06/2013
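
The L1-Sqrt block normalization mentioned in the last abstract above can be sketched as follows. The formula is the one commonly given in the HOG literature, sqrt(v / (||v||_1 + eps)), and is assumed here to be what that abstract refers to; the function name and the `eps` value are hypothetical.

```python
import numpy as np

def l1_sqrt_normalize(block, eps=1e-6):
    """L1-Sqrt normalization of one HOG block (non-negative histogram bins):
    divide by the L1 norm, then take the element-wise square root."""
    block = np.asarray(block, dtype=np.float64)
    return np.sqrt(block / (np.abs(block).sum() + eps))
```

The normalization needs only a sum, a division, and a square root per block, with no sum of squares, which is one plausible reason it suits a resource-constrained hardware implementation.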