Gerhard Roth

  • Carleton University


94
Publications
41,604
Reads
3,737
Citations
Current institution
Carleton University

Publications

Conference Paper
We describe a real-time authoring tool for supervising a group of users in an augmented reality environment. The overall system is composed of a server, the Command Center (CC), and several clients, the Wearable Systems (WS). The server is used to visualize the virtual representation of the real environment, 3d features map where augmentation...
Conference Paper
Video copy detection is an important task with many applications, especially since detecting copies is an alternative to watermarking. In this paper we describe a simple, but efficient approach that is easy to parallelize, works well, and has low storage requirements. We represent each video frame by a count of the number of SURF interest points in...
Conference Paper
Full-text available
We present a biologically motivated classifier and feature descriptors that are designed for execution on single-instruction, multiple-data (SIMD) hardware and are applied to high-speed multiclass object recognition. Our feature extractor uses a cellular tuning approach to select the optimal Gabor filters to process a given input, followed by the computation...
Conference Paper
This paper presents a novel algorithm to iteratively compute camera paths of long image sequences. Scale Invariant Features are first extracted from the ordered set of images. These images are then matched pair-wise sequentially and correspondences are computed. An initial geometric path can then be found by applying a bundle adjustment algorithm...
Article
Full-text available
Augmented reality (AR) is the concept of inserting virtual objects into real scenes. Often, augmentations are aligned with rigid planar objects in the scene. However, a more difficult task is to align non-rigid augmentations with flexible objects like cloth. To address this problem, we present a method to perform real-time flexible augmentations on...
Conference Paper
Full-text available
This paper studies the performance of various scale-invariant detectors in the context of feature matching. In particular, we propose an implementation of the Hessian-Laplace operator that we called scale-interpolated Hessian-Laplace. This research also proposes to use Haar descriptors which are derived from the Haar wavelet transform. It offers t...
Article
Augmented Reality (AR) brings virtual objects into the real world instead of making people go into the computer world. With AR, virtual objects appear to coexist with the real in the user's real environment [Bier et al. 1993].
Conference Paper
Full-text available
Strongly similar subimages contain different views of the same object. In subimage search, the user selects an image region and the retrieval system attempts to find matching subimages in an image database that are strongly similar. Solutions have been proposed using salient features or "interest points" that have associated descriptor vectors. How...
Article
Full-text available
Image thresholding is a common task in many computer vision and graphics applications. The goal of thresholding an image is to classify pixels as either "dark" or "light". Adaptive thresholding is a form of thresholding that takes into account spatial variations in illumination. We present a technique for real-time adaptive thresholding using the i...
Conference Paper
This paper presents a system for introducing augmented reality (AR) enhancements into an image-based cubic panorama sequence. Panoramic cameras, such as the Point Grey Research Ladybug, allow rapid capture and generation of panoramic sequences for users to navigate and view. Our AR system provides the ability for authors to add virtual content into...
Article
Full-text available
This paper addresses the problem of computing the three-dimensional (3-D) path of a moving rigid object using a calibrated stereoscopic vision setup. The proposed system begins by detecting feature points on the moving object. By tracking these points over time, it produces clouds of 3-D points that can be registered, thus giving information about...
Conference Paper
Full-text available
The power of Markov random field formulations of low-level vision problems, such as stereo, has been known for some time. However, recent advances, both algorithmic and in processing power, have made their application practical. This paper presents a novel implementation of Bayesian belief propagation for graphics processing units found in most mode...
Conference Paper
Full-text available
A process is described to determine the shot accuracy of an automatic robotic pool playing system. The system comprises a ceiling-mounted gantry robot, a special purpose cue end-effector, a ceiling-mounted camera, and a standard bar pool table. Two methods are compared for extracting the homography between the camera and the table plane. A challeng...
Conference Paper
Full-text available
This paper presents a new locally refined collision detection approach for large scale complex meshes in distributed virtual environments (DVEs) where exact and interactive interference detection is required. Transmitting models with millions of polygons is time consuming in comparison with transmitting simple models. Even if the models are transmi...
Article
Full-text available
We introduce a novel approach, multiresolution collision detection, for fast and exact interference detection on continuous level-of-detail (LOD) representations of arbitrary triangle meshes undergoing rigid-body motion. A new algorithm, active bounding tree (AB-Tree), is presented to accelerate interference queries of three-dimensional models,...
Conference Paper
Full-text available
Panoramic cameras can capture a 360° view from a point providing new capabilities for multimedia, tele-presence and robotic applications. For example, virtual walk-throughs of an environment can be created from a sequence of panoramic images, where perspective views are created according to a user's position and view direction. For this and other a...
Conference Paper
Full-text available
We present a system for virtual navigation in real environments using image-based panorama rendering. Multiple overlapping images are captured using a Point Grey Ladybug camera and a single cube-aligned panorama image is generated for each capture location. Panorama locations are connected in a graph topology and registered with a 2D map for naviga...
Conference Paper
Full-text available
The computational cost of a collision detection (CD) algorithm on polygonal surfaces depends highly on the complexity of the models. A novel "locally refined" approach is introduced in this paper for fast CD in haptic rendering applications, e.g. haptic surgery and haptic sculpture simulations. Exact interference detections are performed on propose...
Conference Paper
Full-text available
This paper presents a 3D pose estimation and reconstruction system based on a calibrated stereoscopic vision setup. The proposed approach consists in robustly tracking the movements of the cameras with respect to a rigid scene along a sequence. In addition, a novel correction scheme is proposed, that compensates for the accumulated error in the com...
Conference Paper
Full-text available
A common task in computer entertainment is the ability to interact with virtual 3D objects. Interacting with these objects using standard computer input devices such as a mouse and keyboard can often be a difficult task. For this reason, Tangible User Interfaces (TUIs) were developed to allow more natural interaction with complex virtual objects by ma...
Article
Full-text available
Interaction with 3D objects using standard computer input devices such as a mouse and keyboard is often a difficult task. For this reason, Tangible User Interfaces (TUIs) are developed to allow more natural 3D interaction by manipulating physical objects in a familiar way. We present a new TUI system that includes a passive optical tracking metho...
Article
Full-text available
UAVs are becoming ubiquitous and will be widely deployed in many applications. The result will be a large amount of video data that needs to be organized and searched. A critical image processing application for UAVs will be a Google-like realtime search engine for large image and video databases. We have developed a novel indexing and search metho...
Conference Paper
Full-text available
We propose a method to augment live video based on the tracking of natural features, and the online estimation of the trinocular geometry. Previous without-marker approaches require the computation of camera pose to render virtual objects. The strength of our proposed method is that it doesn't require tracking of camera pose, and exploits the usual...
Conference Paper
Full-text available
Videoconferencing systems in use today typically rely on either fixed or pan/tilt/zoom cameras for image acquisition, and close-talking microphones for good quality audio capture. These sensors are unsuitable for scenarios involving multiple users seated at a meeting table, or non-stationary users. In these situations, the focus of attention should...
Article
Due to the recent increase in computer power and decrease in camera cost, it has become very common to see a camera on top of a computer monitor. This paper presents the vision-based technology which allows one in such a setup to significantly enhance the perceptual power of the computer. The described techniques for tracking a face using a convex-shape no...
Article
Full-text available
Calibration is the process of computing the intrinsic (internal) camera parameters from a series of images. Normally calibration is done by placing predefined targets in the scene or by having special camera motions, such as rotations. If these two restrictions do not hold, then this calibration process is called autocalibration because it is done...
Article
Full-text available
The problem of building geometric models has been a central application in photogrammetry. Our goal is to partially automate this process by finding the features necessary for computing the exterior orientation. This is done by robustly computing the fundamental matrix and trilinear tensor for all image pairs and some image triples. The correspon...
Conference Paper
Full-text available
Collaboration is an essential mechanism for productivity. Projection tables such as the SociaDesk enable collaboration through the sharing of audio, video and data. To enhance this form of interaction, it is beneficial to enable local, multi-user interaction with this media. This paper introduces a computer vision-based gesture recognition system t...
Conference Paper
Full-text available
A robotic system is presented which automatically pots (i.e., sinks) pool balls. A homography is estimated that relates the gantry robot coordinate frame to the overhead (global) camera coordinate frame. This homography is computed by first calculating the mapping between the camera frame and a projection of the robot frame, and then solving the po...
Article
Full-text available
In augmented reality, virtual objects are added to the real world by superimposing them onto a video stream or a head-mounted display in real-time. Typical augmented reality applications track 2D patterns on rigid planar objects in order to acquire the pose of the camera in the scene. Rigid augmentations are then performed on the planar objects. We...
Article
Full-text available
Laser scanning range sensors are widely used for high-precision, high-density three-dimensional (3D) reconstruction and inspection of the surface of physical objects. The process typically involves planning a set of views, physically altering the relative object-sensor pose, taking scans, registering the acquired geometric data in a common coordina...
Article
Full-text available
A modern tool being explored by researchers is the technological augmentation of human perception known as Augmented Reality (AR). In this paradigm the user sees the real world along with virtual objects and annotations of that world. This synthesis requires the proper registration of virtual information with the real scene, implying the computer's...
Article
Full-text available
Projective vision research has recently received a lot of attention and has claimed some important results in current literature. In this paper, we present a compilation of tools that we have created to allow further research into the field. Not only can experienced projective vision researchers use these tools, but they also have use as a visual l...
Article
Full-text available
This paper presents a parallel method for progressive mesh simplification. A progressive mesh (PM) is a continuous mesh representation of a given 3D object which makes it possible to efficiently access all mesh representations between a low and a high level of resolution. The creation of a progressive mesh is a time consuming process and has a need...
Article
Full-text available
Object reconstruction or inspection using a range camera requires a positioning system to configure relative sensor-object geometry in a sequence of poses. Discrepancies between commanded and actual poses can result in serious scanning deficiencies. This paper provides analytical and experimental characterization of pose error effects for a common ty...
Article
Full-text available
Vision-based registration techniques for augmented reality systems have been the subject of intensive research recently due to their potential to accurately align virtual objects with the real world. The downfall of these vision-based approaches, however, is their high computational cost and lack of robustness.
Article
Full-text available
For humans, viewing a scene with two eyes is clearly more advantageous than viewing it with one. In computer vision, however, most high-level vision tasks, such as face tracking, are still done with only one camera. This is due to the fact that, unlike in human brains, the relationship between the images observed by two arbitra...
Conference Paper
Full-text available
We describe a new method of achieving autocalibration that uses a stochastic optimization approach taken from the field of evolutionary computing and we perform a number of experiments on standardized data sets that show the effectiveness of the approach. The basic assumption of this method is that the internal (intrinsic) camera parameters remain...
Article
Full-text available
Automated 3D object reconstruction or inspection using a range camera requires a positioning system to configure sensor-object relative geometry in a sequence of poses defined by a computed view plan. Discrepancies between commanded and actual poses can result in serious scanning deficiencies. This paper examines the view planning impact of positionin...
Article
Full-text available
A new theoretical framework for the view planning problem is presented. In this framework, view planning is defined as an instance of the well-known set covering problem from the field of combinatorial optimization. We include an image-based registration constraint and express the result in the form of an integer programming problem.
Article
Full-text available
The view planning problem, also known as the next-best-view (NBV) problem, for object reconstruction and inspection has been shown to be isomorphic to the set covering problem which is NP-Complete. In this paper we express a theoretical framework for the NBV problem as an integer programming problem including a registration constraint. Experimental...
Conference Paper
Full-text available
Autocalibration algorithms based on the fundamental matrix must solve the problem of finding the global minimum of a cost function which has many local minima. We describe a new method of achieving this goal, which uses a stochastic optimization approach taken from the field of evolutionary algorithms. In theory, approaches that use the fundamental...
Conference Paper
Full-text available
Interaction with virtual objects in an augmented environment enhances the user's interpretation of their presence. An augmented reality (AR) system that uses computer vision for registration can use the same technology for simple gesture recognition. This paper describes hand detection and simple gesture recognition techniques useful in pattern-bas...
Conference Paper
Full-text available
Pattern-based augmented reality systems are considered the most promising approach for accurately registering virtual objects with real-time video feeds. The problem with existing solutions is the lack of robustness to partial occlusions of the pattern, which is important when attempting natural interactions with virtual objects. This paper describ...
Article
Full-text available
With the invention of fast USB interfaces and the recent increase in computer power and decrease in camera cost, it has become very common to see a camera on top of a computer monitor. Vision-based games and interfaces, however, are still not common, despite the realization of the benefits vision could bring: hands-free control, multiple-user inter...
Conference Paper
Full-text available
This paper presents a method of computing camera positions from a sequence of overlapping images obtained from a binocular/trinocular camera head. First, we find matching features among the images at each camera head position. Because the individual cameras are calibrated we can directly compute the 3D coordinates of these features using triangulat...
Conference Paper
The view planning problem, also known as the next-best-view (NBV) problem, for object reconstruction and inspection, has been shown to be isomorphic to the set covering problem which is NP-Complete. In this paper we express a theoretical framework for the NBV problem as an integer programming problem including a registration constraint. Experimenta...
Conference Paper
Full-text available
We propose a key frame extraction mechanism to aid the Structure from Motion (SfM) problem when dealing with image sequences from video cameras. Due to high frame rates (15 frames per second or more) the baseline between frames can be very small and the number of frames can become impractical to deal with effectively. The mechanism described in thi...
Article
Full-text available
Previous work has presented a view planning concept for automated object reconstruction. The multi-stage concept comprises a rapid initial exploration stage to produce a low resolution approximate object model — the rough model. With the rough model as the new knowledge base, a view plan is developed for a subsequent precision measurement stage cap...
Article
We present a volumetric method that can efficiently create triangular meshes from 3-D geometric data. This data can be presented in the form of images, profiles or unordered points. The mesh model can be created at different resolutions and can also be closed to make a true volumetric model.
Article
Full-text available
The paradigm of projective vision has recently become popular. In this paper we describe a system for computing camera positions from an image sequence using projective methods. Projective methods are normally used to deal with uncalibrated images. However, we claim that even when calibration information is available it is often better to use the pr...
Article
Creating virtual environment models often requires geometric data from range sensors as well as photometric data from CCD cameras. The model must be geometrically correct, visually realistic, and small enough in size to allow real-time rendering. We present an approach based on 3D range sensor data, multiple CCD cameras, and a colour high-resolutio...
Article
Full-text available
We present a volumetric method that can efficiently create triangular meshes from 3-D geometric data. This data can be presented in the form of images, profiles or unordered points. The mesh model can be created at different resolutions and can also be closed to make a true volumetric model. Keywords: Mesh Creation, Triangular Meshes, Model Building, R...
Article
Full-text available
Reverse engineering is a process by which a geometric model is created from sensor data. In this paper we show how to create a geometric model that can be read into a Computer Aided Design system using as input multiple range images of an object. Our reverse engineering method combines a number of new algorithms in a novel way to create the parametr...
Article
Full-text available
The problem of building virtual models from sensor data increases in importance as powerful graphics rendering hardware becomes widespread. Model building stands at the interface between computer vision and computer graphics, and researchers from both areas have made contributions. We believe that only by a systematic review of the remaining open rese...
Article
Full-text available
Applications for 3D models of objects and scenes are rapidly growing in number. Active sensors are the most commonly used means of acquiring geometric models. The current acquisition process of view planning, sensing, registration and integration requires a high level of intervention by imaging specialists with extensive training and experience. Au...
Conference Paper
The paper describes a method of automatically performing the registration of two range images that have significant overlap. We first find points of interest in the intensity data that comes with each range image. Then we perform a triangulation of the 3D range points associated with these 2D interest points. All possible pairs of triangles between...
Conference Paper
Full-text available
The paper presents an algorithm for constructing tangent plane continuous (G^1) surfaces with piecewise polynomials over triangular meshes. The input mesh can be of arbitrary topological type, that is, any number of faces can meet at a mesh vertex. The mesh is first refined to one solely with quadrilateral cells. Rectangular Bezier patche...
Conference Paper
This paper describes a method of automatically performing the registration of two range images that have significant overlap. We first find points of interest in the intensity data that comes with each range image. Then we perform a tetrahedrization of the 3D range points associated with these 2D interest points. The triangle pairs of these tetrahe...
Article
Selecting the appropriate 3-D capture and modeling technologies to create the contents of a virtual environment (VE) remains a challenging task. One of the objectives of the environment-modeling project in our laboratory is to develop a design strategy for selecting the optimum sensor technology and its configuration to meet the requirements for vi...
Article
Full-text available
Creating virtual environment models often requires geometric data from range sensors as well as photometric data from CCD cameras. The model must be geometrically correct, visually realistic, and small enough in size to allow real-time rendering. We present an approach based on 3D range sensor data, multiple CCD cameras, and a colour high-resolutio...
Article
Full-text available
Creating virtual environment models often requires geometric data from range sensors as well as photometric data from CCD cameras. The model must be geometrically correct, visually realistic, and small enough in size to allow real-time rendering. We present an approach based on 3D range sensor data, multiple CCD cameras, and a high-resolution digit...
Conference Paper
Full-text available
We present a volumetric method that can efficiently create triangular meshes from 3-D geometric data. This data can be presented in the form of images, profiles or unordered points. The mesh model can be created at different resolutions and can also be closed to make a true volumetric model.
Article
Full-text available
In this paper we present a way of integrating a number of different views taken by a rangefinder in order to create a single surface model. This model consists of a mesh of triangular planar patches which can be easily and efficiently rendered on graphics hardware. Our method is based on the marching cubes algorithm which was created for rendering...
Article
Full-text available
A genetic algorithm based on a minimal subset representation of a geometric primitive is used to perform primitive extraction. A genetic algorithm is an optimization method that uses the metaphor of evolution, and a minimal subset is the smallest number of points necessary to define a unique instance of a geometric primitive. The approach is capabl...
Article
This paper discusses the problem of tracking a geometric object using data from a real-time laser rangefinder. Because of the volume of data we choose to preprocess each range image to find the edge points. Then 3D lines and circles are extracted from these edge points. By matching these extracted curves to the model of each object we are able to c...
Article
Extracting geometric primitives is an important task in model-based computer vision. The Hough transform is the most common method of extracting geometric primitives. Recently, methods derived from the field of robust statistics have been used for this purpose. We show that extracting a single geometric primitive is equivalent to finding the optimu...
Article
Two related and important problems in the field of model-based computer vision are the extraction of predefined primitives from geometric data, and the computation of correspondences among such primitives. We show that both problems are equivalent to the optimization of a cost function, which often has very many local minima. One implication of thi...
Conference Paper
Full-text available
This paper discusses the problem of extracting curves, such as lines, circles and ellipses, from 2D edge data. Our proposed extraction algorithm operates by taking random samples of minimal subsets, and then matching the curve through each minimal subset against the edge data. We perform the time-critical curve matching step using a new appro...
Conference Paper
The problem of geometric primitive extraction in a range image is considered. A segmentation algorithm segments the image into quadric patches and provides an efficient initialization for the extraction of geometric primitives by a genetic algorithm. These algorithms are shown to be simple, fast and robust to noise. A parallel implementation has be...
Article
Pose determination is the process of finding the pose (position and orientation) of a part with a known geometry in a scene using sensor data. No prior estimate of the pose is assumed to be available. If an estimate of the pose is available then pose determination becomes pose refinement. It is shown that both can be modelled as optimizations of a...
Conference Paper
The problem of segmenting an image provided by a geometric sensor into geometric primitives is addressed by a two-step iterative process. In the first step, the largest connected region bounded by edge pixels is hypothesized as containing a geometric primitive. In the second step, the resulting set of pixels is sent to a robust fitter based on the...
Article
Parts acquisition by a robot is an important industrial problem. Our method makes possible the acquisition of various types of parts which are jumbled together in a pile. An intensity image along with selected range data is used to guide a robot to a location, called a holdsite, at which it can grasp a part. The intensity image gives a rough indica...
Article
Full-text available
Selecting the appropriate 3-D capture and modeling technologies to create the contents of a virtual environment (VE) remains a challenging task. One of the objectives of the environment-modeling project in our laboratory is to develop a design strategy for selecting the optimum sensor technology and its configuration to meet the requirements for vi...
Article
Full-text available
1. Briefly, what approach or combination of approaches did you test in each of your submitted runs?
  • VIVAlab-uOttawa.v: The video search approach is based on a method that finds keypoints in each video frame, and then uses a descriptor of keypoint counts in 16 (4 by 4) equally sized regions of the images.
  • VIVAlab-uOttawa.a: The audio-only copy det...
Article
Full-text available
Accurate 3-D models of environments, or sites, are essential to the success of navigation and positioning of autonomous vehicles in these environments. However, creating such models in large indoor environments is a challenging task indeed. This paper describes the procedures and the components of a system designed to solve the problems of precise...
Article
Augmented reality (AR) is the concept of inserting virtual objects into real scenes. Typically, these augmentations are aligned with rigid planar objects in the scene, which can sometimes be restrictive. This poster presents a method to perform real-time 2D augmentations on non-rigid objects, such as clothing. In addition, a novel technique to e...
Article
This thesis applies the consensus paradigm to an important problem in model-based vision, that of primitive extraction. Primitive extraction is the process of finding geometric primitives in geometric data. Such data are obtained directly by active sensors such as laser rangefinders, by processes that operate on passive sensor data to create depth...
Article
Full-text available
Laser scanning range sensors are widely used for high-precision, high-density three-dimensional (3D) reconstruction and inspection of the surface of physical objects. The process typically involves planning a set of views, physically altering the relative object-sensor pose, taking scans, registering the acquired geometric data in a common coordina...
Article
Full-text available
The applications of vision-based face tracking to HCI are evident. Face tracking based program control can be used as a hands-free alternative and/or extension to conventional pointing devices such as mouse, joystick, track pad or trackball. This can be used, for example, to switch the focus of attention in windows environment. Vision-based percept...
Article
Calibration is the process of computing the intrinsic (internal) camera parameters from a series of images. Normally calibration is done by placing predefined targets in the scene or by having special camera motions, such as rotations. If these two restrictions do not hold, then this calibration process is called autocalibration because it is done...
Article
Full-text available
Vision-based registration techniques for augmented reality systems have been the subject of intensive research recently due to their potential to accurately align virtual objects with the real world. The downfall of these vision-based approaches, however, is their high computational cost and lack of robustness. This paper describes the implementati...
