Herbert Bay’s research while affiliated with ETH Zurich and other places

What is this page?


This page lists works by an author who doesn't have a ResearchGate profile or hasn't yet added these works to their profile. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (14)


Robust interest point detector and descriptor
  • Patent

March 2014 · 39 Reads · 22 Citations

Ryuji Funayama · Hiromichi Yanagihara · [...] · Herbert Bay

A method for operating on images is described for interest point detection and/or description working under different scales and with different rotations, e.g. for scale-invariant and rotation-invariant interest point detection and/or description.


Speeded-up robust features (SURF)

June 2008 · 2,274 Reads · 12,257 Citations

Computer Vision and Image Understanding

This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper encompasses a detailed description of the detector and descriptor and then explores the effects of the most important parameters. We conclude the article with SURF's application to two challenging, yet converse goals: camera calibration as a special case of image registration, and object recognition. Our experiments underline SURF's usefulness in a broad range of topics in computer vision.
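The key speed ingredient mentioned above, integral images, can be illustrated with a short self-contained sketch (plain NumPy, not the authors' code): once the integral image is built, the sum over any rectangular box, and hence any box-filter approximation of a Gaussian second-order derivative, costs only four array lookups regardless of the filter size. The function names and the random test image below are illustrative.

```python
import numpy as np

def integral_image(img):
    """ii[y, x] holds the sum of img[:y, :x]; padded with one extra row/column of zeros."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1], computed in constant time from the integral image."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((240, 320))
    ii = integral_image(img)
    # Four lookups give the same result as summing the whole window explicitly,
    # independent of the window (i.e. filter) size.
    assert np.isclose(box_sum(ii, 10, 20, 50, 80), img[10:50, 20:80].sum())
```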



Lecture Notes in Computer Science

January 2008 · 40 Reads · 22 Citations

Lecture Notes in Computer Science

We present a system that allows users to request information on physical objects by taking a picture of them. This way, using a mobile phone with an integrated camera, users can interact with objects or "things" in a very simple manner. A further advantage is that the objects themselves don't have to be tagged with any kind of markers. At the core of our system lies an object recognition method which identifies an object from a query image through multiple recognition stages, including local visual features, global geometry, and optionally also metadata such as GPS location. We present two applications for our system, namely a slide-tagging application for presentation screens in smart meeting rooms and a city guide on a mobile phone. Both systems are fully functional, including an application on the mobile phone which allows simple point-and-shoot interaction with objects. Experiments evaluate the performance of our approach in both application scenarios and show good recognition results under challenging conditions.
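To make the multi-stage recognition idea more concrete, here is a hedged sketch of the overall flow: an optional metadata (GPS) pre-filter, local feature matching, and global geometric verification with RANSAC. It uses OpenCV's ORB features as a stand-in for the paper's local features; the candidate-list format, thresholds, and the recognize() helper are assumptions for illustration, not the authors' implementation.

```python
import cv2
import numpy as np

def recognize(query_path, candidates, max_gps_dist_m=None, query_gps=None):
    """candidates: list of (name, image_path, gps) tuples; gps is assumed to be
    already projected into a local metric frame (metres)."""
    orb = cv2.ORB_create(nfeatures=1000)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    q_img = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
    q_kp, q_des = orb.detectAndCompute(q_img, None)

    best_name, best_inliers = None, 0
    for name, path, gps in candidates:
        # Optional metadata stage: skip database objects that are too far away.
        if max_gps_dist_m is not None and query_gps is not None:
            if np.linalg.norm(np.subtract(gps, query_gps)) > max_gps_dist_m:
                continue
        d_img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        d_kp, d_des = orb.detectAndCompute(d_img, None)
        matches = bf.match(q_des, d_des)
        if len(matches) < 10:
            continue
        src = np.float32([q_kp[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([d_kp[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        # Global geometry stage: count matches consistent with a single homography.
        _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        inliers = int(mask.sum()) if mask is not None else 0
        if inliers > best_inliers:
            best_name, best_inliers = name, inliers
    return best_name, best_inliers
```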


Dense Stereo by Triangular Meshing and Cross Validation

September 2006 · 12 Reads · 1 Citation

Lecture Notes in Computer Science

Dense depth maps can be estimated in a Bayesian sense from multiple calibrated still images of a rigid scene relative to a reference view [1]. This well-established probabilistic framework is extended by adaptively refining a triangular meshing procedure and by automatic cross-validation of model parameters. The adaptive refinement strategy locally adjusts the triangular meshing according to the measured image data. The new method substantially outperforms the competing techniques both in terms of robustness and accuracy.
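The "automatic cross-validation of model parameters" can be illustrated in isolation with a toy example that selects a model parameter (here, a polynomial degree) by k-fold cross-validation. This is only a sketch of the selection mechanism and has nothing to do with the paper's Bayesian depth-map model; all names and values are illustrative.

```python
import numpy as np

def cross_validate_degree(x, y, degrees, n_folds=5, seed=0):
    """Pick the polynomial degree with the lowest held-out mean squared error."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, n_folds)
    scores = []
    for deg in degrees:
        err = 0.0
        for hold in folds:
            train = np.setdiff1d(idx, hold)          # fit on the remaining folds
            coeffs = np.polyfit(x[train], y[train], deg)
            err += np.mean((np.polyval(coeffs, x[hold]) - y[hold]) ** 2)
        scores.append(err / n_folds)
    return degrees[int(np.argmin(scores))]

if __name__ == "__main__":
    x = np.linspace(0, 1, 200)
    y = np.sin(6 * x) + 0.1 * np.random.default_rng(1).normal(size=x.size)
    print("selected degree:", cross_validate_degree(x, y, [1, 3, 5, 9, 15]))
```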


[Figure previews from the paper: Fig. 1, the (discretised and cropped) Gaussian second-order partial derivatives in the y- and xy-directions and their box-filter approximations; Fig. 2, detected interest points on a sunflower field, the Haar wavelet types used for SURF, and the descriptor window at different scales on the Graffiti scene; Fig. 3, descriptor entries of a sub-region for homogeneous, high-frequency, and gradually increasing intensity patterns; Fig. 5, an example image from the reference set and the test set, differing in viewpoint and colours.]
SURF: Speeded up robust features
  • Conference Paper
  • Full-text available

July 2006 · 11,698 Reads · 10,452 Citations

Lecture Notes in Computer Science

In this paper, we present a novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features). It approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (in casu, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper presents experimental results on a standard evaluation set, as well as on imagery obtained in the context of a real-life object recognition application. Both show SURF’s strong performance.
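For readers who want to try the detector and descriptor directly, the following hedged usage sketch relies on OpenCV's contrib module: SURF is exposed as cv2.xfeatures2d.SURF_create, but only in builds compiled with the nonfree option, so availability depends on the installed package. The file names and the Hessian/ratio thresholds below are placeholders.

```python
import cv2

img1 = cv2.imread("view_1.png", cv2.IMREAD_GRAYSCALE)     # placeholder image paths
img2 = cv2.imread("view_2.png", cv2.IMREAD_GRAYSCALE)

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # Hessian-based detector
kp1, des1 = surf.detectAndCompute(img1, None)              # 64-D descriptors by default
kp2, des2 = surf.detectAndCompute(img2, None)

# Nearest-neighbour matching with a ratio test to discard ambiguous matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.7 * n.distance]
print(f"{len(kp1)} / {len(kp2)} keypoints, {len(good)} good matches")
```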

3D from Line Segments in Two Poorly-Textured, Uncalibrated Images

June 2006 · 179 Reads · 20 Citations

This paper addresses the problem of camera self-calibration, bundle adjustment and 3D reconstruction from line segments in two images of poorly-textured indoor scenes. First, we generate line segment correspondences, using an extended version of our previously proposed matching scheme. The first main contribution is a new method to identify polyhedral junctions resulting from the intersections of the line segments. At the same time, the images are segmented into planar polygons. This is done using an algorithm based on a binary space partitioning (BSP) tree. The junctions are matched end points of the detected line segments and hence can be used to obtain the epipolar geometry. The essential matrix is considered for metric camera calibration. For better stability, the second main contribution consists in a bundle adjustment on the line segments and the camera parameters that reduces the number of unknowns by a maximum flow algorithm. Finally, a piecewise-planar 3D reconstruction is computed based on the segmentation of the BSP tree. The system's performance is tested on some challenging examples.
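The epipolar-geometry step mentioned in the abstract (obtaining the essential matrix from matched junctions and using it for metric calibration) can be sketched with standard OpenCV primitives. This is not the paper's pipeline, and pts1, pts2, and the intrinsics guess K are placeholders the reader must supply.

```python
import cv2
import numpy as np

def relative_pose(pts1, pts2, K):
    """pts1, pts2: Nx2 float arrays of matched image points; K: 3x3 intrinsics guess."""
    # Robustly estimate the essential matrix from the matched junction points.
    E, inlier_mask = cv2.findEssentialMat(pts1, pts2, K,
                                          method=cv2.RANSAC, threshold=1.0)
    # Decompose E into rotation R and unit-norm translation t, keeping the
    # cheirality-consistent solution.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inlier_mask)
    return R, t, inlier_mask
```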


[Figure previews from the paper: Fig. 1, correspondences after SURF feature matching and after guided matching, plus an example graph of images connected by pairwise matching; Fig. 2, the steps of multi-band blending of two images over three octaves using Gaussian and Laplacian pyramids of the warped images and their blending masks; Fig. 3, three sample mosaics with different pathologies composed of 6, 6, and 7 retina images, fused by multi-band blending over 6 octaves.]
Retina Mosaicing Using Local Features

February 2006 · 2,510 Reads · 83 Citations

Lecture Notes in Computer Science

Laser photocoagulation is a proven procedure to treat various pathologies of the retina. Challenges such as motion compensation, correct energy dosage, and avoiding incidental damage are responsible for the still low success rate. They can be overcome with improved instrumentation, such as a fully automatic laser photocoagulation system. In this paper, we present a core image processing element of such a system, namely a novel approach to retina mosaicing. Our method relies on recent developments in region detection and feature description to automatically fuse retina images. In contrast to the state of the art, the proposed approach works even for retina images with no discernible vascularity. Moreover, an efficient scheme to determine the blending masks of arbitrarily overlapping images for multi-band blending is presented.
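The multi-band blending mentioned in the abstract can be sketched as a standard two-image Laplacian-pyramid blend: build Gaussian pyramids of both pre-aligned images and of the blending mask, form Laplacian levels, mix them with the mask level by level, and collapse the pyramid. This generic sketch does not reproduce the paper's exact scheme for arbitrarily overlapping images; the function name and the assumption of pre-warped, equally sized float32 inputs are illustrative.

```python
import cv2
import numpy as np

def multiband_blend(img1, img2, mask, levels=6):
    """img1, img2, mask: float32 arrays of the same shape; mask holds weights in [0, 1]
    (use a 3-channel mask for colour images)."""
    gp1, gp2, gpm = [img1], [img2], [mask]
    for _ in range(levels):
        gp1.append(cv2.pyrDown(gp1[-1]))
        gp2.append(cv2.pyrDown(gp2[-1]))
        gpm.append(cv2.pyrDown(gpm[-1]))

    blended = []
    for i in range(levels):
        size = (gp1[i].shape[1], gp1[i].shape[0])
        lap1 = gp1[i] - cv2.pyrUp(gp1[i + 1], dstsize=size)   # Laplacian level of image 1
        lap2 = gp2[i] - cv2.pyrUp(gp2[i + 1], dstsize=size)   # Laplacian level of image 2
        m = gpm[i]
        blended.append(lap1 * m + lap2 * (1.0 - m))           # mask-weighted mix per level
    # Coarsest Gaussian level is mixed directly.
    blended.append(gp1[levels] * gpm[levels] + gp2[levels] * (1.0 - gpm[levels]))

    # Collapse the blended pyramid back into a full-resolution image.
    out = blended[-1]
    for i in range(levels - 1, -1, -1):
        size = (blended[i].shape[1], blended[i].shape[0])
        out = cv2.pyrUp(out, dstsize=size) + blended[i]
    return out
```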


SURF: Speeded Up Robust Features.

January 2006 · 3,247 Reads · 1,562 Citations

Computer Vision and Image Understanding

This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper encompasses a detailed description of the detector and descriptor and then explores the effects of the most important parameters. We conclude the article with SURF's application to two challenging, yet converse goals: camera calibration as a special case of image registration, and object recognition. Our experiments underline SURF's usefulness in a broad range of topics in computer vision.
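As a final illustration of the distribution-based descriptor idea, the sketch below splits a patch around a keypoint into 4 x 4 sub-regions and accumulates sum(dx), sum(|dx|), sum(dy), sum(|dy|) in each, giving a 64-D vector. Plain finite differences stand in for the Haar wavelet responses, and the paper's orientation assignment and Gaussian weighting are omitted; the function name is illustrative.

```python
import numpy as np

def surf_like_descriptor(patch):
    """patch: square float array (side divisible by 4) centred on the keypoint."""
    dy, dx = np.gradient(patch.astype(np.float64))   # simple gradient responses
    n = patch.shape[0] // 4
    desc = []
    for i in range(4):
        for j in range(4):
            sx = dx[i * n:(i + 1) * n, j * n:(j + 1) * n]
            sy = dy[i * n:(i + 1) * n, j * n:(j + 1) * n]
            # Four sums per sub-region capture the local intensity structure.
            desc.extend([sx.sum(), np.abs(sx).sum(), sy.sum(), np.abs(sy).sum()])
    v = np.array(desc)
    return v / (np.linalg.norm(v) + 1e-12)            # unit length for contrast invariance

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(surf_like_descriptor(rng.random((20, 20))).shape)  # -> (64,)
```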



Citations (14)


... This involves extracting image features and looking for the closest neighbor from an image database. Traditional approaches rely on handcrafted features [9,41]. NetVLAD and its variants [4,16,44] use deep-learned image features to improve recall performance and robustness. The emergence of self-supervised foundation models, such as DINOv2 [49], enables universal image representations, offering significant progress [34,36] across many VPR tasks. ...

Reference: Multiview Scene Graph
Surf: Speeded up robust features
  • Citing Chapter
  • January 2006

... Interest points (Funayama et al., 2012) need to have an exact location in the image, which is extracted from the response maps by finding all the local maxima. SURF finds local maxima using a non-maximum suppression in a 3 × 3 × 3 neighbourhood around each pixel (Arnesen, 2010). ...

Robust interest point detector and descriptor
  • Citing Patent
  • March 2014

... The two images to be registered are commonly referred to as the moving image I_m and the fixed image I_f; minimizing their differences can facilitate clinicians in disease analysis and diagnosis [3]. The traditional registration method mainly achieves the registration process by minimizing the grayscale difference between images [4][5][6] or utilizing key feature points between images [7,8] to iteratively optimize each pair of input images from scratch. Although the registration accuracy is high, its limitation is that the iteration time is long and the similarity measurement function is prone to getting stuck in local optima. ...

Speeded-up Robust Features (SURF). Comput. Vis. Image Underst.
  • Citing Article

... Currently, significant progress has been made in image stitching techniques. Researchers have achieved many results in feature point extraction (e.g., SIFT, SURF), image alignment (e.g., RANSAC), and image fusion (e.g., seamless cloning) [4,5]. Existing techniques still face challenges in dealing with some specific problems, such as real-time stitching in dynamic scenes, image quality enhancement in low-light environments, and computational efficiency when dealing with large-scale panoramic images. ...

SURF: Speeded up robust features
  • Citing Article
  • January 2008

Computer Vision and Image Understanding

... The realm of handcrafted visual descriptors has reached maturity, with representations such as SIFT [11], SURF [12], and BRIEF [13] being prominent examples. These descriptors encode local image details surrounding 2D key points into floating-point or binary arrays, showcasing resilience against various transformations. ...

Speeded-up robust features (SURF)
  • Citing Article
  • June 2008

Computer Vision and Image Understanding

... U-SURF is a simplified version of SURF, which does not consider image rotation, so it can improve efficiency. The PD PRPS patterns collected in the substation and in the laboratory are all upright images, and there is no change in the viewing angle; therefore, the simpler U-SURF feature extraction method was selected to simplify the experimental process and increase the speed [14,15]. This research applied the local feature extraction method of images to the field of PD pattern recognition for the first time. ...

Interactive museum guide: Fast and robust recognition of museum objects

... These two algorithms have been commonly employed in image stitching assignments, especially in the generation of panoramas and seamless stitching of multiple images, where alignment and synthesis between images can be effectively achieved by extracting and matching feature points [14,15]. However, traditional feature detection and matching methods are often difficult to cope with complex motion patterns and non-rigid deformations when dealing with dynamic scenes or drastic environmental changes, resulting in less than ideal stitching results. ...

SURF: Speeded up robust features

Lecture Notes in Computer Science

... In this paper, we are aiming at a disparity map that facilitates reconstruction and as such accurate disparity in non-occluded regions takes more preference over detection of occlusion. It has been reported in [11, 12] that triangulation schemes may be used to accelerate multiview reconstruction algorithms such as those proposed in [13]. [12] adaptively subdivides the scene as it estimates disparity. ...

Dense Stereo by Triangular Meshing and Cross Validation
  • Citing Conference Paper
  • September 2006

Lecture Notes in Computer Science

... It also does not prevent self-intersections nor guarantees watertightness. The authors of [BENV06] segment images into likely planar polygons based on 3D corner junctions and use best supporting lines to reconstruct polygons in 3D. For 2.5D reconstruction, extracted 3D lines [SZ97] are used with a dense height map to build a line arrangement on the ground plane and create geometric primitives and building masks [ZBKB08]. ...

3D from Line Segments in Two Poorly-Textured, Uncalibrated Images