Conference Paper

Cross-Articulation Learning for Robust Detection of Pedestrians

Technical University Darmstadt, Darmstadt, Hesse, Germany
DOI: 10.1007/11861898_25
Conference: Pattern Recognition, 28th DAGM Symposium, Berlin, Germany, September 12-14, 2006, Proceedings
Source: DBLP


Recognizing categories of articulated objects in real-world scenarios is a challenging problem for today's vision algorithms. Due to the large appearance changes and intra-class variability of these objects, it is hard to define a model that is both general and discriminative enough to capture the properties of the category. In this work, we propose an approach that aims for a suitable trade-off for this problem. On the one hand, the approach is made more discriminative by explicitly distinguishing typical object shapes. On the other hand, the method generalizes well and requires relatively few training samples thanks to cross-articulation learning. The effectiveness of the approach is shown and compared to previous approaches on two datasets containing pedestrians with different articulations.
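The "typical object shapes" idea — partitioning the training examples into clusters of similar silhouettes so that each cluster can be modeled more discriminatively — can be sketched in a few lines. This is an illustrative sketch only: the descriptor format, the farthest-point initialization, and plain k-means are assumptions made here for brevity, not the clustering actually used in the paper.

```python
import numpy as np

def cluster_shapes(descriptors, k, iters=20):
    """Partition shape descriptors (one row per training silhouette) into
    k clusters of "typical shapes" with a plain k-means loop."""
    X = np.asarray(descriptors, dtype=float)
    # deterministic farthest-point initialization: start from the first
    # descriptor, then repeatedly add the descriptor farthest from all centers
    centers = [X[0]]
    for _ in range(k - 1):
        dists = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dists.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        # assign every descriptor to its nearest cluster center
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        # move each center to the mean of its members (skip empty clusters)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def nearest_cluster(descriptor, centers):
    """Index of the typical-shape cluster closest to a query descriptor."""
    return int(np.linalg.norm(centers - np.asarray(descriptor, float), axis=1).argmin())
```

At detection time, a candidate shape would be compared only against the model of its nearest cluster, which is what makes each per-cluster model more discriminative than a single monolithic one.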

Available from:
  • Source
    • "The MMHT [3] approach adapts the probabilistic framework in the ISM approach and learns the weights of Hough votes in a discriminative max-margin framework. As for handling object scale variations, the ISM [1], 4D- ISM [15], partISM [5], IRD [6], Fast PRISM [7] and MMHT [3] approaches use local feature descriptors to estimate the scales of local features and cast Hough votes in a scale space. The positions where the voting points are most concentrated in the scale space are considered as the locations of object hypotheses. "
    ABSTRACT: Popular Hough Transform-based object detection approaches usually construct an appearance codebook by clustering local image features. However, how to choose appropriate values for the parameters used in the clustering step remains an open problem. Moreover, some popular histogram features extracted from overlapping image blocks may introduce a high degree of redundancy and multicollinearity. In this paper, we propose a novel Hough Transform-based object detection approach. First, to address the above issues, we exploit a Bridge Partial Least Squares (BPLS) technique to establish context-encoded Hough Regression Models (HRMs), which are linear regression models that cast probabilistic Hough votes to predict object locations. BPLS is an efficient variant of Partial Least Squares (PLS). PLS-based regression techniques (including BPLS) can reduce the redundancy and eliminate the multicollinearity of a feature set, and the appropriate value of the only parameter used in PLS (i.e., the number of latent components) can be determined by a cross-validation procedure. Second, to efficiently handle object scale changes, we propose a novel multi-scale voting scheme in which multiple Hough images corresponding to multiple object scales are obtained simultaneously. Third, an object in a test image may correspond to multiple true and false positive hypotheses at different scales. Based on the proposed multi-scale voting scheme, a principled strategy is proposed to fuse hypotheses and reduce false positives by evaluating the normalized pointwise mutual information between hypotheses. In the experiments, we also compare the proposed HRM approach with several of its variants to evaluate the influence of its components on performance. Experimental results show that the proposed HRM approach achieves desirable performance on popular benchmark datasets.
    Full-text · Article · Nov 2014 · Neurocomputing
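The scale-space voting described in this excerpt can be caricatured in a few lines: every local feature casts weighted votes for possible object centers and scales, and object hypotheses are read off the maxima of the vote accumulator. The vote format, grid resolution, and weights below are illustrative assumptions standing in for the codebook-driven votes of ISM/HRM, not a reproduction of either system.

```python
import numpy as np

def hough_votes(features, votes, shape, n_scales):
    """Accumulate probabilistic Hough votes in a (scale, y, x) grid.

    features: list of (y, x) feature positions.
    votes: per feature, a list of (dy, dx, scale_idx, weight) tuples, each
    predicting an object center at an offset from the feature and a scale.
    """
    acc = np.zeros((n_scales, *shape))
    for (y, x), feature_votes in zip(features, votes):
        for dy, dx, s, w in feature_votes:
            cy, cx = y + dy, x + dx
            if 0 <= cy < shape[0] and 0 <= cx < shape[1]:
                acc[s, cy, cx] += w
    return acc

def best_hypothesis(acc):
    """(scale, y, x) of the strongest vote concentration in the accumulator."""
    return np.unravel_index(acc.argmax(), acc.shape)
```

In a real system the maxima would be found with mean-shift or non-maximum suppression rather than a single argmax, and — as the HRM abstract describes — overlapping hypotheses across scales would then be fused before reporting detections.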
  • Source
    • "Besides, due to computational constraints, these approaches usually employ simplified person models (ellipse, human shape templates, etc.). On the other side, the systems that do not operate in real time [3] [7] [17] [21] [27] [28] [29] obtain these initial candidate locations by scanning the complete image at various scales and rotations; in this case, person models must be complex enough to correctly classify many negative examples. The scanning and the use of more complex models improve the detection rate, but the computational costs are too high to allow real-time processing. "
    ABSTRACT: In this paper an improved real-time algorithm for detecting pedestrians in surveillance video is proposed. The algorithm is based on people's appearance and defines a person model as the union of four body-part models. First, motion segmentation is performed to detect moving pixels. Then, moving regions are extracted and tracked. Finally, the detected moving objects are classified as human or non-human. To test and validate the algorithm, we have developed a dataset containing annotated surveillance sequences of different complexity levels focused on pedestrian detection. Experimental results over this dataset show that our approach performs considerably well in real time, and even better than other real-time and non-real-time approaches from the state of the art.
    Full-text · Article · Aug 2010
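The three-stage pipeline this abstract lists (motion segmentation → moving-region extraction → person/non-person classification) can be reduced to a toy version. Frame differencing, a single bounding box, and an aspect-ratio test are stand-ins chosen for brevity; the actual system uses tracking and a four-part person model, which this sketch does not attempt.

```python
import numpy as np

def motion_mask(prev_frame, curr_frame, thresh=25):
    """Stage 1: motion segmentation by thresholded absolute frame differencing."""
    diff = np.abs(curr_frame.astype(np.int32) - prev_frame.astype(np.int32))
    return diff > thresh

def moving_region(mask):
    """Stage 2: bounding box (top, left, bottom, right) of the moving pixels,
    or None when nothing moved."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1

def is_person_like(box, min_aspect=1.5):
    """Stage 3: crude appearance test -- pedestrians are taller than wide.
    The 1.5 aspect threshold is a placeholder, not a value from the paper."""
    top, left, bottom, right = box
    return (bottom - top) >= min_aspect * (right - left)
```

Restricting the expensive classification stage to the few regions that survive the cheap motion stage is exactly the trade-off the quote above draws against full sliding-window scanning.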
  • Source
    • "Furthermore, other interesting ways to overcome this variability are being explored (e.g., multiple instance learning), which may provide additional benefits, like relaxing the annotation process. Of course, any improvement in existing algorithms, like the proposals in [93], [107], or new features that exploit typical measures (i.e., intensity, gradient, etc.) in new ways, like shape context [105] or HOG [31], will contribute to the improvement of these systems. "
    ABSTRACT: Advanced driver assistance systems (ADASs), and particularly pedestrian protection systems (PPSs), have become an active research area aimed at improving traffic safety. The major challenge of PPSs is the development of reliable on-board pedestrian detection systems. Due to the varying appearance of pedestrians (e.g., different clothes, changing size, aspect ratio, and dynamic shape) and the unstructured environment, it is very difficult to cope with the demanded robustness of this kind of system. Two problems arising in this research area are the lack of public benchmarks and the difficulty in reproducing many of the proposed methods, which makes it difficult to compare the approaches. As a result, surveying the literature by enumerating the proposals one after another is not the most useful way to provide a comparative point of view. Accordingly, we present a more convenient strategy to survey the different approaches. We divide the problem of detecting pedestrians from images into different processing steps, each with attached responsibilities. Then, the different proposed methods are analyzed and classified with respect to each processing stage, favoring a comparative viewpoint. Finally, discussion of the important topics is presented, putting special emphasis on the future needs and challenges.
    Full-text · Article · Jul 2010 · IEEE Transactions on Software Engineering