Evaluation of Local Feature Extraction Methods Generated through Genetic Programming on Visual SLAM

Working Paper · January 2014

  • Article
    This work presents a novel local image descriptor based on the concept of pointwise signal regularity. Local image regions are extracted using either an interest point or an interest region detector, and discriminative feature vectors are constructed by uniformly sampling the pointwise Hölderian regularity around each region center. Regularity estimation is performed using local image oscillations, the most straightforward method directly derived from the definition of the Hölder exponent. Furthermore, estimating the Hölder exponent in this manner has proven to be superior, in most cases, to wavelet-based estimation, as was shown in previous work. Our detector shows invariance to illumination change, JPEG compression, image rotation and scale change. Results show that the proposed descriptor is stable with respect to variations in imaging conditions, and reliable performance metrics prove it to be comparable and in some instances better than SIFT, the state-of-the-art in local descriptors.
  • Article
    This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper encompasses a detailed description of the detector and descriptor and then explores the effects of the most important parameters. We conclude the article with SURF's application to two challenging, yet converse goals: camera calibration as a special case of image registration, and object recognition. Our experiments underline SURF's usefulness in a broad range of topics in computer vision.
  • Article
    The detection of stable and informative image points is one of the most important low-level problems in modern computer vision. This paper proposes a multiobjective genetic programming (MO-GP) approach for the automatic synthesis of operators that detect interest points. The proposal is unique for interest point detection because it poses a MO formulation of the point detection problem. The search objectives for the MO-GP search consider three properties that are widely expressed as desirable for an interest point detector: (1) stability; (2) point dispersion; and (3) high information content. The results suggest that the point detection task is a MO problem, and that different operators can provide different trade-offs among the objectives. In fact, MO-GP is able to find several sets of Pareto optimal operators, whose performance is validated on standardized procedures including an extensive test with 500 images; as a result, we could say that all solutions found by the system dominate previously man-made detectors in the Pareto sense. In conclusion, the MO formulation of the interest point detection problem provides the appropriate framework for the automatic design of image operators that achieve interesting trade-offs between relevant performance criteria that are meaningful for a variety of vision tasks.
  • Article
    The regularity of a signal can be numerically expressed using Hölder exponents, which characterize the singular structures a signal contains. In particular, within the domains of image processing and image understanding, regularity-based analysis can be used to describe local image shape and appearance. However, estimating the Hölder exponent is not a trivial task, and current methods tend to be computationally slow and complex. This work presents an approach to automatically synthesize estimators of the pointwise Hölder exponent for digital images. This task is formulated as an optimization problem and Genetic Programming (GP) is used to search for operators that can approximate a traditional estimator, the oscillations method. Experimental results show that GP can generate estimators that achieve a low error and a high correlation with the ground truth estimation. Furthermore, most of the GP estimators are faster than traditional approaches; in some cases their runtime is orders of magnitude smaller. This result allowed us to implement a real-time estimation of the Hölder exponent on a live video signal, the first such implementation in current literature. Moreover, the evolved estimators are used to generate local descriptors of salient image regions, a task for which a stable and robust matching is achieved, comparable with state-of-the-art methods. In conclusion, the evolved estimators produced by GP could help expand the application domain of Hölder regularity within the fields of image analysis and signal processing.
  • Book
    This is one of the only books to provide a complete and coherent review of the theory of genetic programming (GP). In doing so, it provides a coherent consolidation of recent work on the theoretical foundations of GP. A concise introduction to GP and genetic algorithms (GA) is followed by a discussion of fitness landscapes and other theoretical approaches to natural and artificial evolution. Having surveyed early approaches to GP theory it presents new exact schema analysis, showing that it applies to GP as well as to the simpler GAs. New results on the potentially infinite number of possible programs are followed by two chapters applying these new techniques.
  • Article
    In this survey, we give an overview of invariant interest point detectors, how they evolved over time, how they work, and what their respective strengths and weaknesses are. We begin with defining the properties of the ideal local feature detector. This is followed by an overview of the literature over the past four decades organized in different categories of feature extraction methods. We then provide a more detailed analysis of a selection of methods which had a particularly significant impact on the research field. We conclude with a summary and promising future research directions.
  • Article
    This work describes a way of designing interest point detectors using an evolutionary-computer-assisted design approach. Nowadays, feature extraction is performed through the paradigm of interest point detection due to its simplicity and robustness for practical applications such as image matching and view-based object recognition. Genetic programming is used as the core functionality of the proposed human-computer framework that significantly augments the scope of interest point design through a computer-assisted learning process. Indeed, genetic programming has produced numerous interest point operators, many with unique or unorthodox designs. The analysis of those best detectors gives us an advantage to achieve a new level of creative design that improves the perspective for human-machine innovation. In particular, we present two novel interest point detectors produced through the analysis of multiple solutions that were obtained through single and multi-objective searches. Experimental results using a well-known testbed are provided to illustrate the performance of the operators and hence the effectiveness of the proposal.
  • Article
    This work describes how evolutionary computation can be used to synthesize low-level image operators that detect interesting points on digital images. Interest point detection is an essential part of many modern computer vision systems that solve tasks such as object recognition, stereo correspondence, and image indexing, to name but a few. The design of the specialized operators is posed as an optimization/search problem that is solved with genetic programming (GP), a strategy still mostly unexplored by the computer vision community. The proposed approach automatically synthesizes operators that are competitive with state-of-the-art designs, taking into account an operator's geometric stability and the global separability of detected points during fitness evaluation. The GP search space is defined using simple primitive operations that are commonly found in point detectors proposed by the vision community. The experiments described in this paper extend previous results (Trujillo and Olague, 2006a,b) by presenting 15 new operators that were synthesized through the GP-based search. Some of the synthesized operators can be regarded as improved manmade designs because they employ well-known image processing techniques and achieve highly competitive performance. On the other hand, since the GP search also generates what can be considered as unconventional operators for point detection, these results provide a new perspective to feature extraction research.
  • Article
    In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context, steerable filters, PCA-SIFT, differential invariants, spin images, SIFT, complex filters, moment invariants, and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
  • Article
    We present a real-time algorithm which can recover the 3D trajectory of a monocular camera, moving rapidly through a previously unknown scene. Our system, which we dub MonoSLAM, is the first successful application of the SLAM methodology from mobile robotics to the "pure vision" domain of a single uncontrolled camera, achieving real time but drift-free performance inaccessible to Structure from Motion approaches. The core of the approach is the online creation of a sparse but persistent map of natural landmarks within a probabilistic framework. Our key novel contributions include an active approach to mapping and measurement, the use of a general motion model for smooth camera movement, and solutions for monocular feature initialization and feature orientation estimation. Together, these add up to an extremely efficient and robust algorithm which runs at 30 Hz with standard PC and camera hardware. This work extends the range of robotic systems in which SLAM can be usefully applied, but also opens up new areas. We present applications of MonoSLAM to real-time 3D localization and mapping for a high-performance full-size humanoid robot and live augmented reality with a hand-held camera.
  • Conference Paper
    This paper addresses the problem of real-time 3D model-based tracking by combining point-based and edge-based tracking systems. We present a careful analysis of the properties of these two sensor systems and show that this leads to some non-trivial design choices that collectively yield extremely high performance. In particular, we present a method for integrating the two systems and robustly combining the pose estimates they produce. Further we show how on-line learning can be used to improve the performance of feature tracking. Finally, to aid real-time performance, we introduce the FAST feature detector which can perform full-frame feature detection at 400 Hz. The combination of these techniques results in a system which is capable of tracking average prediction errors of 200 pixels. This level of robustness allows us to track very rapid motions, such as 50° camera shake at 6 Hz.
  • Conference Paper
    An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
  • Article
    This paper describes the simultaneous localization and mapping (SLAM) problem and the essential methods for solving the SLAM problem and summarizes key implementations and demonstrations of the method. While there are still many practical issues to overcome, especially in more complex outdoor environments, the general SLAM method is now a well understood and established part of robotics. Another part of the tutorial summarized more recent works in addressing some of the remaining issues in SLAM, including computation, feature representation, and data association.
  • No feature-based vision system can work unless good features can be identified and tracked from frame to frame. Although tracking itself is by and large a solved problem, selecting features that can be tracked well and correspond to physical points in the world is still hard. We propose a feature selection criterion that is optimal by construction because it is based on how the tracker works, and a feature monitoring method that can detect occlusions, disocclusions, and features that do not correspond to points in the world. These methods are based on a new tracking algorithm that extends previous Newton-Raphson style search methods to work under affine image transformations. We test performance with several simulations and experiments.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR94), Seattle, June 1994.
    1 Introduction. Is feature tracking a solved problem? The extensive studies of image correlation [4], [3], [15], [18], [7], [17] and sum-of-squared-difference (SSD...
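
The last abstract above mentions a feature selection criterion tied to how the tracker works but does not spell it out. A common reading of that criterion is to score a window by the smaller eigenvalue of its 2x2 gradient matrix and accept the window when the score exceeds a threshold. The sketch below illustrates that idea in plain Python. The function names, the central-difference gradients, the window size, and the threshold value are all assumptions made for illustration, not details taken from the abstract.

```python
import math


def min_eigenvalue_score(image, row, col, half_window=1):
    """Smaller eigenvalue of the 2x2 gradient matrix summed over a window.

    `image` is a list of rows of numbers. The window plus a one-pixel
    border must lie inside the image (assumption of this sketch).
    """
    sxx = sxy = syy = 0.0
    for r in range(row - half_window, row + half_window + 1):
        for c in range(col - half_window, col + half_window + 1):
            # Central differences approximate the image gradient.
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2.0
    spread = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return mean - spread


def is_good_feature(image, row, col, threshold, half_window=1):
    """Accept the window when its smaller eigenvalue exceeds the threshold."""
    return min_eigenvalue_score(image, row, col, half_window) > threshold


# Three hypothetical 7x7 test images: flat, a vertical edge, and a corner.
SIZE = 7
flat = [[10.0] * SIZE for _ in range(SIZE)]
edge = [[100.0 if c >= 3 else 0.0 for c in range(SIZE)] for _ in range(SIZE)]
corner = [[100.0 if (r >= 3 and c >= 3) else 0.0 for c in range(SIZE)]
          for r in range(SIZE)]

flat_score = min_eigenvalue_score(flat, 3, 3)
edge_score = min_eigenvalue_score(edge, 3, 3)
corner_score = min_eigenvalue_score(corner, 3, 3)
```

A flat patch and a straight edge both give a score near zero, because at least one gradient direction carries no energy. A corner gives a clearly positive score, which is why the smaller eigenvalue separates trackable windows from ambiguous ones.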