Conference Paper

The 2005 PASCAL visual object classes challenge

RWTH Aachen University, Aachen, North Rhine-Westphalia, Germany
DOI: 10.1007/11736790_8 Conference: Machine Learning Challenges, Evaluating Predictive Uncertainty, Visual Object Classification and Recognizing Textual Entailment, First PASCAL Machine Learning Challenges Workshop, MLCW 2005, Southampton, UK, April 11-13, 2005, Revised Selected Papers
Source: DBLP


The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). Four object classes were selected: motorbikes, bicycles, cars, and people. Twelve teams entered the challenge. In this chapter we provide details of the datasets, the algorithms used by the teams, the evaluation criteria, and the results achieved.

    • "The key idea explored here is the use of large number of features represented in many search trees, in contrast to many existing action classification methods based on a single, small codebook and SVM [5] [38] [45]. This message also comes from the static object recognition [6], where efficient search methods using many different features from large number of examples provide the best results. The advantage of using multiple trees has been demonstrated in image retrieval [42]. "
    ABSTRACT: In this paper we propose an approach for action recognition based on a vocabulary of local appearance-motion features and fast approximate search in a large number of trees. Large numbers of features with associated motion vectors are extracted from video data and are represented by many trees. Multiple interest point detectors are used to provide features for every frame. The motion vectors for the features are estimated using optical flow and descriptor-based matching. The features are combined with image segmentation to estimate dominant homographies, and then separated into static and moving ones despite the camera motion. Features from a query sequence are matched to the trees and vote for action categories and their locations. The large number of trees makes the process efficient and robust. The system is capable of simultaneous categorisation and localisation of actions using only a few frames per sequence. The approach obtains excellent performance on standard action recognition sequences. We perform large-scale experiments on 17 challenging real action categories from various sport disciplines. We demonstrate the robustness of our method to appearance variations, camera motion, scale change, asymmetric actions, background clutter and occlusion.
    Article · Mar 2011 · Computer Vision and Image Understanding
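The matching-and-voting scheme this abstract describes can be illustrated with a minimal, hypothetical sketch: query descriptors are matched to stored labelled features (brute-force nearest neighbour here, standing in for the paper's approximate tree search), and each match votes for its action category. All names and data are illustrative, not taken from the cited work:

```python
import math

def nearest_label(desc, database):
    """Return the label of the stored feature closest to desc (Euclidean distance)."""
    return min(database, key=lambda item: math.dist(desc, item[0]))[1]

def classify(query_descs, database):
    """Each query descriptor votes for the category of its nearest match;
    the category with the most votes wins."""
    votes = {}
    for d in query_descs:
        label = nearest_label(d, database)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy database of (descriptor, action-category) pairs.
db = [((0.0, 0.0), "walk"), ((1.0, 1.0), "run"), ((0.1, 0.0), "walk")]
```

The real system replaces the brute-force loop with approximate search over many randomized trees, which is what makes voting over very large feature sets tractable.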
    • "The increased popularity of the problem witnessed in the recent years and the advent of powerful computer hardware have led to a seeming success of categorization approaches on the standard datasets such as Caltech 101 [15]. However, the high discrepancy between the accuracy of object classification and detection/segmentation [14] suggests that the problem still poses a significant and open challenge. The recent preoccupation with tuning the approaches to specific datasets might have precluded the attention from the most crucial issue: the representation [41]. "
    ABSTRACT: Visual categorization of objects has captured the attention of the vision community for decades [10]. The increased popularity of the problem witnessed in the recent years and the advent of powerful computer hardware have led to a seeming success of categorization approaches on the standard datasets such as Caltech 101 [15].
    Article · Dec 2010
    • "The concept-based measures AUC and EER can be calculated from the Receiver Operating Characteristics (ROC) curve and are common measures used in the evaluation of classification tasks [6], [7]. A ROC curve graphically plots the true-positive rate against the false-positive rate. "
    ABSTRACT: The Photo Annotation Task is performed as one task in the Image CLEF@ICPR contest and poses the challenge to annotate 53 visual concepts in Flickr photos. Altogether 12 research teams met the multilabel classification challenge and submitted solutions. The participants were provided with a training and a validation set consisting of 5,000 and 3,000 annotated images, respectively. The test was performed on 10,000 images. Two evaluation paradigms have been applied, the evaluation per concept and the evaluation per example. The evaluation per concept was performed by calculating the Equal Error Rate and the Area Under Curve (AUC). The evaluation per example utilizes a recently proposed Ontology Score. For the concepts, an average AUC of 86.5% could be achieved, including concepts with an AUC of 96%. The classification performance for each image ranged between 59% and 100% with an average score of 85%.
    Conference Paper · Aug 2010
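The ROC-derived measures used above (AUC and EER) can be computed directly from ranked classifier scores. A minimal sketch in plain Python, with illustrative names: score ties are not broken specially, and at least one positive and one negative label are assumed:

```python
def roc_points(scores, labels):
    """Sweep the decision threshold over descending scores and record
    (false-positive rate, true-positive rate) points; labels are 0/1."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(labels)
    neg = len(labels) - pos  # assumes pos > 0 and neg > 0
    tp = fp = 0
    pts = [(0.0, 0.0)]
    for _, y in pairs:
        if y:
            tp += 1
        else:
            fp += 1
        pts.append((fp / neg, tp / pos))
    return pts

def auc(pts):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

def eer(pts):
    """Equal Error Rate: FPR at the point where it equals the miss rate 1 - TPR."""
    return min(pts, key=lambda p: abs(p[0] - (1 - p[1])))[0]
```

A classifier that ranks every positive above every negative gives AUC 1.0 and EER 0.0; library implementations (e.g. scikit-learn's `roc_curve`) additionally interpolate between thresholds with tied scores.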