Robert Laganiere, PhD
University of Ottawa · School of Electrical Engineering and Computer Science
About
192 Publications · 71,068 Reads
5,747 Citations
Additional affiliations
July 1995 - present
Publications (192)
Multiple Object Tracking (MOT) in thermal imaging presents unique challenges due to the lack of visual features and the complexity of motion patterns. This paper introduces an innovative approach to improve MOT in the thermal domain by developing a novel box association method that utilizes both thermal object identity and motion similarity. Our me...
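The abstract above is truncated and the paper's exact association rule (combining thermal identity and motion similarity) is not given here. As a rough illustration of the box-association step common to MOT pipelines, a greedy IoU matcher might look like the sketch below; the function names and the 0.3 threshold are illustrative assumptions, not the paper's method:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def associate(tracks, detections, threshold=0.3):
    """Greedy one-to-one matching of track boxes to detection boxes by IoU."""
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)), reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to match
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return matches
```

In a full tracker the IoU score would be blended with the identity and motion similarity terms before matching.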
Fully sparse 3D detection has attracted increasing interest in recent years. However, the sparsity of the features in these frameworks challenges the generation of proposals because of the limited diffusion process. In addition, the quest for efficiency has led to only a few works on vision-assisted fully sparse models. In this paper, we propos...
For achieving better performance, the majority of deep convolutional neural networks have endeavored to increase the model capacity by adding more convolutional layers or increasing the size of the filters. Consequently, the computational cost increases proportionally with the model capacity. This problem can be alleviated by dynamic convolution. I...
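As a minimal sketch of the dynamic-convolution idea mentioned above (aggregating several candidate kernels with input-dependent attention weights instead of enlarging the network), assuming a 1-D signal and attention logits supplied directly rather than produced by a small gating network:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dynamic_conv1d(signal, kernels, attention_logits):
    """Mix K candidate kernels with attention weights, then apply a single
    1-D convolution (valid padding) -- the cost of one conv, not K."""
    weights = softmax(attention_logits)          # one weight per kernel
    k = len(kernels[0])
    mixed = [sum(w * kern[i] for w, kern in zip(weights, kernels))
             for i in range(k)]                  # weighted kernel mixture
    n = len(signal) - k + 1
    return [sum(mixed[j] * signal[i + j] for j in range(k)) for i in range(n)]
```

Because the kernels are combined before the convolution, the extra capacity comes almost for free at inference time.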
Perception systems in autonomous vehicles need to accurately detect and classify objects within their surrounding environments. Numerous types of sensors are deployed on these vehicles, and the combination of such multimodal data streams can significantly boost performance. The authors introduce a novel sensor fusion framework using deep convolutio...
link: https://rdcu.be/dkIhZ
Deep convolution neural networks (DCNNs) in deep learning have been widely used in semantic segmentation. However, the filters of most regular convolutions in DCNNs are spatially invariant to local transformations, which reduces localization accuracy and hinders the improvement of semantic segmentation. Dynamic convolu...
Effective detection of road objects in diverse environmental conditions is a critical requirement for autonomous driving systems. Multi-modal sensor fusion is a promising approach for improving perception, as it enables the combination of information from multiple sensor streams in order to optimize the integration of their respective data. Fusion...
Features from LiDAR and cameras are considered to be complementary. However, due to the sparsity of the LiDAR point clouds, a dense and accurate RGB/3D projective relationship is difficult to establish especially for distant scene points. Recent works try to solve this problem by designing a network that learns missing points or dense point density...
Object detection utilizing Frequency Modulated Continuous Wave radar is becoming increasingly popular in the field of autonomous systems. Radar does not possess the same drawbacks seen by other emission-based sensors such as LiDAR, primarily the degradation or loss of return signals due to weather conditions such as rain or snow. However, radar does...
Autonomous driving requires effective capabilities to detect road objects in different environmental conditions. One promising solution to improve perception is to leverage multi-sensor fusion. This approach aims to combine various sensor streams in order to best integrate the information coming from the different sensors. Fusion operators are used...
In this work, we propose a novel deep learning based sensor fusion framework that uses both camera and LiDAR sensors in a multi-modal and multi-view setting. In order to leverage both data streams, we incorporate two new sophisticated fusion mechanisms: element-wise multiplication and multi-modal factorized bilinear pooling. When compared to pr...
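As a toy illustration of the multi-modal factorized bilinear (MFB) pooling mentioned above: both modalities are projected into a shared higher-dimensional space, fused by element-wise multiplication, and then sum-pooled over groups of k. The projection matrices below are hand-written stand-ins for what would be learned parameters:

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def mfb_fuse(x, y, U, V, k):
    """Multi-modal factorized bilinear pooling: project both modalities,
    multiply element-wise (Hadamard product), then sum-pool over groups of k."""
    px = matvec(U, x)                       # projection of modality x
    py = matvec(V, y)                       # projection of modality y
    prod = [a * b for a, b in zip(px, py)]  # element-wise fusion
    return [sum(prod[i * k:(i + 1) * k]) for i in range(len(prod) // k)]
```

The factorization keeps the expressiveness of a bilinear interaction between the two streams while avoiding the full outer-product cost.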
With the popularity of attention layers in natural language processing, many researchers have experimented with attention-based architectures in different applications of computer vision. In this paper, we extend an existing 3D point cloud based deep odometry estimation model [9] by introducing attention layers in the model architecture. The resu...
A two stage real-time hand gesture recognition system is presented. It combines a machine learning trained detection step with a colour processing contour shape validation step. The detection step is done with either Adaboost Cascades or Support Vector Machines using HOG features. The system achieves a low false positive rate and a sufficient true...
Pedestrian detection has a pivotal role in the field of computer vision. Recently, deep convolutional neural networks (CNNs) have been demonstrated to achieve appealing performance in object detection compared to hand-crafted methods, with the single shot multiBox detector (SSD) being one of the state-of-the-art methods in terms of both speed and accuracy....
Recently convolutional neural networks (CNNs) have been employed to address the problem of hand pose estimation. In this work, we introduce an end-to-end deep architecture that can accurately estimate hand pose through the joint use of model-based and fine-tuning methods. In the model-based stage, we make use of the prior information in hand model...
Object detection using automotive radars has not been explored with deep learning models in comparison to the camera based approaches. This can be attributed to the lack of public radar datasets. In this paper, we collect a novel radar dataset that contains radar data in the form of Range-Azimuth-Doppler tensors along with the bounding boxes on the...
Processing point clouds using deep neural networks is still a challenging task. Most existing models focus on object detection and registration with deep neural networks using point clouds. In this paper, we propose a deep model that learns to estimate odometry in driving scenarios using point cloud data. The proposed model consumes raw point cloud...
Camera and Lidar processing have been revolutionized with the rapid development of deep learning model architectures. Automotive radar is one of the crucial elements of automated driver assistance and autonomous driving systems. Radar still relies on traditional signal processing techniques, unlike camera and Lidar based methods. We believe this is...
Multispectral pedestrian detection is a difficult task, especially with pedestrian images of different sizes. In most convolutional neural network (CNN) models, the shared receptive fields of each layer are of the same size, which constrains the detection of pedestrians at multiple scales. In this paper, we propose a dynamic selection scheme to ad...
Recently, many tracking methods have been proposed to improve the performance of visual tracking in videos with challenging situations, such as background clutter, severe occlusion, rotation, and so on. In real unmanned aerial vehicle (UAV) based tracking systems, various types of noise occur during video capturing, transmission, and proce...
Pedestrian detection is considered one of the most challenging problems in computer vision, as it involves the combination of classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been demonstrated to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You...
In this work, we propose the use of radar with advanced deep segmentation models to identify open space in parking scenarios. A publicly available dataset of radar observations called SCORP was collected. Deep models are evaluated with various radar input representations. Our proposed approach achieves low memory usage and real-time processing sp...
In this paper, we present a pedestrian detection method by leveraging multispectral images which consist of color and thermal image information. Our method is based on the observation that a multispectral image enables us to overcome inherent limitations for pedestrian detection under challenging situations, e.g., insufficient illumination, small s...
In this paper, we investigate effective neural network layers for optical flow estimation and in particular for omni-directional optical flow. Optical flow has many applications in computer graphics, augmented reality and in 3D modeling. We create a simple dataset that enables us to efficiently assess the effectiveness of different neural network l...
In recent years, deep learning models have resulted in a huge amount of progress in various areas, including computer vision. By nature, the supervised training of deep models requires a large amount of data to be available. This ideal case is usually not attainable, as data annotation is a tremendously exhausting and costly task to perform. An a...
In this paper we introduce a 2D convolutional neural network (CNN) which exploits the additive depth map, a minimal representation of volume, for reconstructing occluded portions of objects captured using commodity depth sensors. The additive depth map represents the amount of depth needed to transform the input into the “back” depth map taken with...
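Taking the description above at face value, the additive depth map encodes, per pixel, the thickness to add to the observed front surface in order to recover the occluded back surface. A minimal sketch of that reconstruction step (the CNN that predicts the additive map is omitted; depth maps are plain nested lists here):

```python
def reconstruct_back_depth(front_depth, additive_depth):
    """Recover the 'back' depth map by adding the predicted per-pixel
    thickness (the additive depth) to the observed front depth."""
    return [[f + a for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(front_depth, additive_depth)]
```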
Visual tracking is a fundamental computer vision task. Recent years have seen many tracking methods based on correlation filters exhibiting excellent performance. The strength of these methods comes from their ability to efficiently learn changes of the target appearance over time. A fundamental drawback to these methods is that the background of t...
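As a minimal, pure-Python illustration of the correlation-tracking principle behind the methods above: the learned filter is slid over a search window, and the new target position is the offset with the peak response. Real correlation-filter trackers learn and apply the filter in the Fourier domain for efficiency; that learning step is omitted here and the template stands in for the filter:

```python
def correlation_response(search, template):
    """Slide the template over a 1-D search window and record the correlation
    score at each offset (the response map of a correlation tracker)."""
    n, k = len(search), len(template)
    return [sum(template[j] * search[i + j] for j in range(k))
            for i in range(n - k + 1)]

def locate_target(search, template):
    """The new target position is the offset with the peak response."""
    response = correlation_response(search, template)
    return max(range(len(response)), key=response.__getitem__)
```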
In this paper we present an approach for 6 DoF panoramic videos from omni-directional stereo (ODS) images using convolutional neural networks (CNNs). More specifically, we use CNNs to generate panoramic depth maps from ODS images in real-time. These depth maps would then allow for re-projection of panoramic images thus providing 6 DoF to a viewer i...
Object detection is a hot topic with various applications in computer vision, e.g., image understanding, autonomous driving, and video surveillance. Much of the progress has been driven by the availability of object detection benchmark datasets, including PASCAL VOC, ImageNet, and MS COCO. However, object detection on the drone platform is still...
Single-object tracking, also known as visual tracking, on the drone platform has attracted much attention recently, with various applications in computer vision such as filming and surveillance. However, the lack of commonly accepted annotated datasets and a standard evaluation platform hinders the development of algorithms. To address this issue, the Vi...
In this paper, we exploit convolutional features extracted from multiple layers of a pre-trained deep convolutional neural network. The outputs of the multiple convolutional layers encode both low-level and high-level information about the targets. The earlier convolutional layers provide accurate positional information while the late convolutional...
In this paper, we develop a scale-aware Region Proposal Network (RPN) model to address the problem of vehicle detection in challenging situations. Our model introduces two built in sub-networks which detect vehicles with scales from disjoint ranges. Therefore, the model is capable of training the specialized sub-networks for large-scale and small-s...
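The routing idea behind the scale-aware model above (specialized sub-networks for disjoint scale ranges) can be sketched as follows; the square-root-of-area scale proxy and the threshold of 32 pixels are illustrative assumptions, not the paper's exact design:

```python
import math

def split_by_scale(proposals, threshold=32.0):
    """Route candidate boxes (x1, y1, x2, y2) to a small-scale or
    large-scale branch based on the square root of their area."""
    small, large = [], []
    for box in proposals:
        x1, y1, x2, y2 = box
        scale = math.sqrt(max(0.0, x2 - x1) * max(0.0, y2 - y1))
        (small if scale < threshold else large).append(box)
    return small, large
```

Each branch can then be trained only on vehicles in its own scale range, which is the intuition behind the specialized sub-networks.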
This work proposes a sensor-based control system for fully automated object detection and exploration (surface following) with a redundant industrial robot. The control system utilizes both offline and online trajectory planning for reactive interaction with objects of different shapes and color using RGB-D vision and proximity/c...
In this paper, we present a multi-scale Fully Convolutional Networks (MSP-RFCN) to robustly detect and classify human hands under various challenging conditions. In our approach, the input image is passed through the proposed network to generate score maps, based on multi-scale predictions. The network has been specifically designed to deal with sm...
The difference between the sample distributions of public datasets and specific scenes can be very significant. As a result, the deployment of generic human detectors in real-world scenes most often leads to sub-optimal detection performance. To avoid the labour-intensive task of manual annotation, we propose a semi-supervised approach for training dee...
Although many visual attention models have been proposed, very few saliency models have investigated the impact of audio information. To develop audiovisual attention models, researchers need a ground truth of eye movements recorded while exploring complex natural scenes in different audio conditions. They also need tools to compare eye movemen...
We examine the problem of creating immersive virtual reality (VR) scenes using a single moving RGB-D camera. Our approach takes as input a RGB-D video containing one or more actors and constructs a complete 3D background within which human actors are properly embedded. A user can then view the captured video from any viewpoint and interact with the...
The Thermal Infrared Visual Object Tracking challenge 2016, VOT-TIR2016, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2016 is the second benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented....
The Visual Object Tracking challenge VOT2016 aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 70 trackers are presented, with a large number of trackers being published at major computer vision conferences and journals in the recent years. The number of tested state-of-...
This paper presents a technology readiness assessment framework called PROVE-IT(), which allows one to assess the readiness of face recognition and video analytic technologies for video surveillance applications, and the roadmap for the deployment of technologies for automated recognition of people and their activities in video, based on the propos...
This paper presents the design and integration of a vision-guided robotic system for automated and rapid vehicle inspection. The main goal of this work is to scan and explore regions of interest over an automotive vehicle while a manipulator’s end effector operates in close proximity of the vehicle and safely accommodates its curves and inherent su...
This paper introduces an action recognition system based on a multiscale local part model. This model includes both a coarse primitive-level root patch covering global information and higher resolution overlapping part patches incorporating local structure and temporal relations. Descriptors are then computed over the local part models by app...
Camera calibration plays a key role in every computer vision application dealing with the problems of recovering a camera's geometry with respect to a 3D world reference, making 3D measurement in a captured scene or extracting 3D data from observed objects. These problems emerge in various applications such as structure from motion, robotics, augme...
Modeling human attention has been attracting a lot of interest due to its numerous applications. The process that allows us to focus on some more important stimuli is defined as "attention". Seam carving is an approach to resize images or video sequences while preserving the semantic content. To define what is important, gradient was first used...
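The gradient-based seam-carving step described above can be sketched in a few lines: compute a gradient-magnitude energy map, then find the cheapest top-to-bottom vertical seam by dynamic programming. This is the classic formulation, using a grayscale image as nested lists, not the attention-based energy the paper investigates:

```python
def energy(img):
    """Simple gradient-magnitude energy map (absolute forward differences)."""
    h, w = len(img), len(img[0])
    return [[abs(img[y][min(x + 1, w - 1)] - img[y][x]) +
             abs(img[min(y + 1, h - 1)][x] - img[y][x])
             for x in range(w)] for y in range(h)]

def min_seam(img):
    """Dynamic programming: cheapest connected top-to-bottom vertical seam."""
    e = energy(img)
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w, x + 2)
            cost[y][x] += min(cost[y - 1][lo:hi])
    # Backtrack from the cheapest bottom cell, staying within +/- 1 column
    seam = [min(range(w), key=cost[-1].__getitem__)]
    for y in range(h - 2, -1, -1):
        x = seam[-1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam.append(min(range(lo, hi), key=cost[y].__getitem__))
    return seam[::-1]
```

Removing the returned seam shrinks the image by one column while preferentially preserving high-gradient (semantically important) regions.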
Gradient features play an important role in the problem of pedestrian detection, especially the histogram of oriented gradients (HOG) feature. To improve detection accuracy in terms of feature extraction, HOG has been combined with multiple kinds of low-level features. However, it is still possible to exploit further discriminative information from...
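The core of the HOG feature referred to above is an orientation histogram of image gradients, weighted by gradient magnitude, computed per cell. A minimal single-cell sketch (real HOG adds block normalization and overlapping cells, omitted here):

```python
import math

def hog_cell_histogram(patch, bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) over a
    grayscale patch, weighted by gradient magnitude -- the core of HOG."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]   # vertical gradient
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(ang / (180.0 / bins)), bins - 1)] += mag
    return hist
```

A vertical intensity edge, for example, contributes all of its magnitude to the 0-degree bin.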