John S. Zelek

University of Waterloo | UWaterloo · Department of Systems Design Engineering

About

168 Publications
21,554 Reads
1,834 Citations
Citations since 2017: 40 Research Items, 938 Citations


Publications (168)
Article
Tracking and identifying players is a fundamental step in computer vision-based ice hockey analytics. The data generated by tracking is used in many other downstream tasks, such as game event detection and game strategy analysis. Player tracking and identification is a challenging problem since the motion of players in hockey is fast-paced and non-...
Preprint
Full-text available
Tracking and identifying players is an important problem in computer vision based ice hockey analytics. Player tracking is a challenging problem since the motion of players in hockey is fast-paced and non-linear. There is also significant player-player and player-board occlusion, camera panning and zooming in hockey broadcast video. Prior published...
Preprint
Causal discovery between collections of time-series data can help diagnose causes of symptoms and hopefully prevent faults before they occur. However, reliable causal discovery can be very challenging, especially when the data acquisition rate varies (i.e., non-uniform data sampling), or in the presence of missing data points (e.g., sparse data sam...
Preprint
Full-text available
Identifying players in video is a foundational step in computer vision-based sports analytics. Obtaining player identities is essential for analyzing the game and is used in downstream tasks such as game event recognition. Transformers are the existing standard in Natural Language Processing (NLP) and are swiftly gaining traction in computer vision...
Article
Full-text available
Over the past years, researchers have proposed various methods to discover causal relationships among time-series data as well as algorithms to fill in missing entries in time-series data. Little to no work has been done in combining the two strategies for the purpose of learning causal relationships using unevenly sampled multivariate time-series...
Article
Why can’t neural networks (NN) forecast better? In the major super-forecasting competitions, NN have typically under-performed when compared to traditional statistical methods. When they have performed well, the underlying methods have been ensembles of NN and statistical methods. Forecasting stock markets, medical, infrastructure dynamics, social...
Conference Paper
Deep learning methods for ophthalmic diagnosis have shown success for tasks like segmentation and classification but their implementation in the clinical setting is limited by the black-box nature of the algorithms. Very few studies have explored the explainability of deep learning in this domain. Attribution methods explain the decisions by assign...
Article
Full-text available
Presented in this paper is a novel method for the mapping and semantic modeling of an underground parking lot using 3D point clouds collected by a low-cost Backpack Laser Scanning (BLS) or LiDAR system. Our method consists of two parts: a Simultaneous Localization and Mapping (SLAM) algorithm based on Sparse Point Clouds (SPC) and a semantic modeli...
Preprint
Full-text available
Puck location in ice hockey is essential for hockey analysts for determining the location of play and analyzing game events. However, because of the difficulty involved in obtaining accurate annotations due to the extremely low visibility and commonly occurring occlusions of the puck, the problem is very challenging. The problem becomes even more c...
Article
Full-text available
This paper presents a real-time 3D object detector based on LiDAR-based Simultaneous Localization and Mapping (LiDAR-SLAM). The 3D point clouds acquired by mobile LiDAR systems inside buildings are usually highly sparse and irregularly distributed, and often contain occlusion and structural ambiguity. Existing 3D object detection...
Conference Paper
Globally, diabetic retinopathy (DR) is one of the leading causes of blindness. However, due to the low doctor-to-patient ratio, performing clinical retinal screening for all such patients is not always possible. In this paper, a deep learning-based automated diabetic retinopathy detection method is presented. Though different frameworks exist for cla...
Preprint
Full-text available
Recognizing actions in ice hockey using computer vision poses challenges due to bulky equipment and inadequate image quality. A novel two-stream framework has been designed to improve action recognition accuracy for hockey using three main components. First, pose is estimated via the Part Affinity Fields model to extract meaningful cues from the pl...
Conference Paper
Segmentation of optical coherence tomography (OCT) images is crucial for investigation of individual layers of the retina for detection of possible pathologies. However, the task of segmentation is confounded by factors such as poor image contrast and speckle noise. Also, the time and subjectivity involved in manual segmentation (MS) limits the app...
Article
Visual Odometry (VO) can be categorized as either direct or feature-based. When the system is calibrated photometrically and images are captured at high rates, direct methods have been shown to outperform feature-based ones in terms of accuracy and processing time; they are also more robust to failure in feature-deprived environments. On the down...
Conference Paper
Accurate segmentation of spectral-domain Optical Coherence Tomography (SD-OCT) images helps diagnose retinal pathologies and facilitates the study of their progression/remission. Manual segmentation is clinical-expertise dependent and highly time-consuming. Furthermore, poor image contrast due to high-reflectivity of some retinal layers and the pre...
Conference Paper
Segmentation of spectral-domain Optical Coherence Tomography (SD-OCT) images facilitates visualization and quantification of sub-retinal layers for diagnosis of retinal pathologies. However, manual segmentation is subjective, expertise dependent, and time-consuming, which limits applicability of SD-OCT. Efforts are therefore being made to implement...
Article
Full-text available
Analysis of retinal fundus images is essential for eye-care physicians in the diagnosis, care and treatment of patients. Accurate fundus and/or retinal vessel maps give rise to longitudinal studies able to utilize multimedia image registration and disease/condition status measurements, as well as applications in surgery preparation and biometrics....
Conference Paper
Full-text available
Diabetes is a chronic condition affecting millions of people worldwide. One of its major complications is diabetic retinopathy (DR), which is the most common cause of legal blindness in the developed world. Early screening and treatment of DR prevents vision deterioration, however the recommendation of yearly screening is often not being met. Mobil...
Article
Extensive research in the field of monocular SLAM for the past fifteen years has yielded workable systems that found their way into various applications in robotics and augmented reality. Although filter-based monocular SLAM systems were common at some time, the more efficient keyframe-based solutions are becoming the de facto methodology for build...
Conference Paper
The segmentation of retinal morphology has numerous applications in assessing ophthalmologic and cardiovascular disease pathologies. The early detection of many such conditions is often the most effective method for reducing patient risk. Computer-aided segmentation of the vasculature has proven to be a challenge, mainly due to inconsistencies suc...
Conference Paper
Analysis of retinal fundus images is essential for physicians, optometrists and ophthalmologists in the diagnosis, care and treatment of patients. The first step of almost all forms of automated fundus analysis begins with the segmentation and subtraction of the retinal vasculature, while analysis of that same structure can aid in the diagnosis of...
Article
Automatic calibration of structured-light systems, generally consisting of a projector and camera, is of great importance for a variety of practical applications. We propose a novel optimization approach for geometric calibration of a projector-camera system that estimates the intrinsic, extrinsic and distortion parameters of both the camera and pr...
Article
Full-text available
The paper proposes a new scheme to promote the robustness of 3D structure and motion factorization from uncalibrated image sequences. First, an augmented affine factorization algorithm is proposed to circumvent the difficulty in image registration with imperfect data. Then, an alternative weighted factorization algorithm is designed to handle the m...
Article
This paper explores the viability of applying 3D optical flow techniques on 3D heart sequences to diagnose cardiac abnormalities and disease. Tagged magnetic resonance imaging (TMRI) is a non-invasive method to visualize in vivo myocardium motion during a cardiac cycle. By tracking the 3D trajectories of tagged material points it is possible to con...
Article
Google Street View is a useful database that houses a large amount of information. This information, however, is unlabelled. We explore the use of superpixel methods for segmentation of images in this database, specifically road segmentation.
Article
Full-text available
This work implements a method to improve correspondence matching in stereo vision by using varying illumination intensities from an external light source. By iteratively increasing the light intensity on the scene, different parts of the scene become saturated in the left and right images. These saturated areas are assumed to correspond to each o...
Article
Full-text available
The image-based localization problem consists of estimating the 6 DoF camera pose by matching the image to a 3D point cloud (or equivalent) representing a 3D environment. The robustness and accuracy of current solutions are not objective and quantifiable. We have completed a comparative analysis of the main state-of-the-art approaches, namely Brute Fo...
Article
Acquiring accurate dense depth maps is crucial for accurate 3D reconstruction. Current high quality depth sensors capable of generating dense depth maps are expensive and bulky, while compact low-cost sensors can only reliably generate sparse depth measurements. We propose a novel multilayer conditional random field (MCRF) approach to reconstruct a...
Article
Full-text available
The smartphone is beginning to offer low-cost competition for some expensive medical diagnostics, particularly in areas where access to funds, medical professionals and equipment is scarce.
Article
This paper presents a novel visual servoing system that is simple, robust, and relatively fast. Our system utilizes depth information available from an RGB-D Microsoft Kinect, rather than a stereo camera that is known to fail in poorly textured environments. The technique is based on a two-phase controller. The first phase provides a coarse image p...
Article
This paper presents algorithmic advances and field trial results for autonomous exploration and proposes a solution to perform simultaneous localization and mapping (SLAM), complete coverage, and object detection without relying on GPS or magnetometer data. We demonstrate an integrated approach to the exploration problem, and we make specific contr...
Article
Human action recognition in video is important in many computer vision applications such as automated surveillance. Human actions can be compactly encoded using a sparse set of local spatio-temporal salient features at different scales. The existing bottom-up methods construct a single dictionary of action primitives from the joint features of all...
Conference Paper
A novel statistical textural distinctiveness approach for robustly detecting salient regions in natural images is proposed. Rotational-invariant neighborhood-based textural representations are extracted and used to learn a set of representative texture atoms for defining a sparse texture model for the image. Based on the learnt sparse texture model...
Conference Paper
In robotics and computer vision, saliency maps are frequently used to identify regions that contain potential objects of interest and to restrict object detection to those regions only. However, common saliency approaches do not provide information as to whether there really is an interesting object triggering saliency and therefore tend to highlig...
Conference Paper
The paper focuses on robust 3D structure from motion of nonrigid objects from uncalibrated image sequences. A new affine factorization algorithm is first proposed to avoid the difficulty in image alignment for imperfect data, followed by a robust factorization scheme to handle outlying and missing data. The novelty and main contributions of the pap...
Conference Paper
The paper focuses on 3D structure and motion factorization from uncalibrated image sequences. A rank-4 affine factorization algorithm and a robust structure and motion factorization scheme are proposed to handle outlying and missing data. The novelty and main contribution of the paper are as follows: (i) The rank-4 factorization algorithm is a new...
Chapter
Structure and motion recovery of nonrigid objects and dynamic scenes has received a lot of attention in recent years. In this chapter, the state-of-the-art techniques for the problem of structure and motion factorization of nonrigid objects and dynamic scenes are reviewed. First, an introduction of structure from motion and some mathematical backgr...
Article
This paper focuses on the problem of structure and motion recovery from uncalibrated image sequences. It has been empirically proven that image measurement uncertainties can be modeled spatially and temporally by virtue of reprojection residuals. Consequently, a spatial-and-temporal-weighted factorization (STWF) algorithm is proposed to handle sign...
Conference Paper
The majority of driver assistance systems focus on assisting the driver when the car is in motion. There is relatively little to almost no work on assistance systems for when the car is stationary. This paper focuses on such a system and describes the design and realization of a novel, camera-based anti-trap protection system for smart car doors...
Conference Paper
Driver assistance systems in today's cars assist drivers when the car is in motion. However, there is relatively little to almost no work on assistance systems for when the car is stationary. This paper focuses on such a system, and describes the design and realization of a collision avoidance system for a novel car door assistance system, the smart car...
Article
Feature detection is a crucial step in many Computer Vision applications such as matching, tracking, visual odometry and object recognition, etc. Detecting robust features that are persistent, rotation-invariant, and quickly calculated is a major problem in computer vision. Feature detectors using the difference of Gaussian (DoG) are computationall...
Article
An automatic adjustment of the seat position according to the driver height significantly increases the level of comfort when entering a car. A camera attached to a vehicle can estimate the body heights of approaching drivers. However, absolute height estimation based on a single camera leads to several problems. Cost-sensitive cameras used in auto...
Conference Paper
This paper proposes an extension to anisotropic diffusion filtering for a better preservation of semantically meaningful structures such as edges in an image in its smoothing/denoising process. The problem of separation of the gradients due to edges and the gradients due to noise is formulated as a nonlinearly separable classification problem. More...
Conference Paper
Full-text available
Local spatio-temporal salient features are used for a sparse and compact representation of video contents in many computer vision tasks such as human action recognition. To localize these features (i.e., key point detection), existing methods perform either symmetric or asymmetric multi-resolution temporal filtering and use a structural or a motion...
Conference Paper
The paper focuses on the problem of structure from motion and proposes a spatial-and-temporal-weighted factorization algorithm. The contributions of the paper are as follows: First, it is demonstrated that the image reprojection error is generally in proportion to the error magnitude contained in the image. Second, the error distribution contained...
Conference Paper
Filter-based Structure from Motion (SfM) approaches usually work in two steps: prediction and update. Prediction is the process of determining a prior distribution of the state vector at time t+1 from the previous distribution at time t. Update is the process of adjusting the predicted distribution so that it complies with the newly received measurements...
Article
Persons with dementias, such as Alzheimer's disease, have well‐documented deficiencies in way-finding, which often renders these individuals house bound and/or unable to perform daily activities without significant frustrations. A wearable belt has recently been developed that may have the capability to facilitate navigation for this population. Th...
Article
In a typical surveillance installation, a human operator has to constantly monitor a large array of video feeds for suspicious behaviour. As the number of cameras increases, information overload makes manual surveillance increasingly difficult, adding to other confounding factors such as human fatigue and boredom. The objective of an intelligent vi...
Article
Full-text available
Spatio-temporal salient features can localize the local motion events and are used to represent video sequences for many computer vision tasks such as action recognition. The robust detection of these features under geometric variations such as affine transformation and view/scale changes is, however, an open problem. Existing methods use the same f...
Conference Paper
Human action recognition can be performed using multiscale salient features which encode the local events in the video. Existing feature extraction methods use non-causal spatio-temporal filtering, and hence, they are not biologically plausible. To address this inconsistency, new features extracted from a biologically plausible perception model are...
Conference Paper
Full-text available
The Extended Kalman Filter (EKF) is still one of the most widely used approaches for small scale Structure from Motion (SFM) and Simultaneous Localization And Mapping (SLAM) problems. However, the EKF does not have the ability to take into account the motion information carried by features matched only between two consecutive frames. This informati...
Technical Report
Full-text available
To improve operational effectiveness for the Canadian Forces (CF), the Joint Unmanned Aerial Vehicle Surveillance Target Acquisition System (JUSTAS) project is acquiring a medium-altitude, long-endurance (MALE) uninhabited aerial vehicle (UAV). In support of the JUSTAS project, Defence Research and Development Canada (DRDC) - Toronto is investigati...
Conference Paper
Full-text available
Computer vision (i.e., image understanding) involves understanding the 3D scene creating the image. Computer vision is challenging because it is the computer that decides how to act based on an understanding of the image. Key image understanding tasks include depth computation, as well as object detection, localization, recognition and tracking. Te...
Article
A wearable belt has recently been developed which may have the capability to facilitate navigation. Through a series of four, small, vibrating motors adjusted to the cardinal positions of front, back, right, and left, the belt provides the wearer with vibrotactile signals indicating the direction and distance to their destination. The study investi...
Conference Paper
Cognitive assistance of a rollator (wheeled walker) user tends to reduce the attentional capacity of the user and may impact her stability. Hence, it is important to understand and track the pose of rollator users before augmenting a rollator with some form of cognitive assistance. While the majority of current markerless vision systems focus on es...
Conference Paper
Rao-Blackwellized particle filters have achieved a breakthrough in the scalability of filters used for Structure from Motion (SFM) and Simultaneous Localization And Mapping (SLAM). The new generations of these filters employ the optimal proposal distribution, i.e., the one taking into consideration not only the previous motion of the camera, but a...
Conference Paper
Robust visual tracking is a challenging problem, especially when a target undergoes complete occlusion or leaves and later re-enters the camera view. The mean-shift tracker is an efficient appearance-based tracking algorithm that has become very popular in recent years. Many researchers have developed extensions to the algorithm that improve the ap...
Conference Paper
Full-text available
Understanding the human gait is an important objective towards improving elderly mobility. In turn, gait analyses largely depend on kinematic and dynamic measurements. While the majority of current markerless vision systems focus on estimating 2D and 3D walking motion in the sagittal plane, we wish to estimate the 3D pose of rollator users' lower l...
Conference Paper
Spatio-temporal salient features are widely used for compact representation of objects and motions in video, especially for event and action recognition. The existing feature extraction methods have two main problems: First, they work in batch mode and mostly use Gaussian (linear) scale-space filtering for multi-scale feature extraction. This...
Article
We present a method to determine the essential matrix using both discrete and differential matching constraints. Differential constraints, derived from optical flow, are abundant in contrast to the discrete constraints, derived from feature correspondences, which are scarce when just a limited number of salient features are available. We formulate...