Yi Zhou’s research while affiliated with Hunan University and other places


Publications (9)


Motion and Structure from Event-Based Normal Flow
  • Chapter

October 2024 · 8 Reads · 2 Citations

Zhongyang Ren · Bangyan Liao · Delei Kong · [...] · Yi Zhou


Fig. 2: Toy example of applying the proposed normal-flow constraint. (a) A 2D registration task using as input either optical flow or normal flow; the ground-truth displacement is defined by a global flow of (1.732, −1)ᵀ. (b)-(d) The loss landscapes obtained using different geometric measurements in the registration task; the red dot denotes the resulting displacement and the green cross the ground-truth displacement, with the corresponding registration results shown in the bottom-left regions.
Fig. 8: An event camera (orange) capturing different sequences in simulated scenes.
Fig. 10: Results of the continuous-time angular velocity estimator on the sequence dynamic_rotation.
Evaluation of our linear solver for differential homography estimation.
Motion and Structure from Event-based Normal Flow
  • Preprint
  • File available

July 2024 · 39 Reads

Recovering the camera motion and scene geometry from visual data is a fundamental problem in the field of computer vision. Its success in standard vision is attributed to the maturity of feature extraction, data association and multi-view geometry. The recent emergence of neuromorphic event-based cameras places great demands on approaches that use raw event data as input to solve this fundamental problem. Existing state-of-the-art solutions typically infer data association implicitly by iteratively reversing the event data generation process. However, the nonlinear nature of these methods limits their applicability in real-time tasks, and the constant-motion assumption leads to unstable results under agile motion. To this end, we rethink the problem formulation in a way that aligns better with the differential working principle of event cameras. We show that event-based normal flow can be used, via the proposed geometric error term, as an alternative to the full flow in solving a family of geometric problems that involve instantaneous first-order kinematics and scene geometry. Furthermore, we develop a fast linear solver and a continuous-time nonlinear solver on top of the proposed geometric error term. Experiments on both synthetic and real data show the superiority of our linear solver in terms of accuracy and efficiency, and indicate its complementary value as an initialization method for existing nonlinear solvers. Moreover, our continuous-time nonlinear solver exhibits exceptional capability in accommodating sudden variations in motion, since it does not rely on the constant-motion assumption.
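To make the role of the normal-flow constraint concrete, below is a minimal sketch of the kind of linear least-squares solver such a constraint admits, shown for the simplest case of pure-rotation (angular velocity) estimation. The interaction matrix is the standard rotational image Jacobian for a calibrated camera; the function names and data layout are illustrative and do not reproduce the paper's implementation.

```python
import numpy as np

def rotational_interaction_matrix(x, y):
    """Standard 2x3 matrix mapping angular velocity to image flow
    at the normalized image point (x, y) of a calibrated camera."""
    return np.array([
        [x * y, -(1.0 + x * x),  y],
        [1.0 + y * y, -x * y,   -x],
    ])

def solve_angular_velocity(points, n_dirs, n_mags):
    """Least-squares angular velocity from normal-flow measurements.

    points : (N, 2) normalized image coordinates of the measurements
    n_dirs : (N, 2) unit directions of the normal flow (local gradient direction)
    n_mags : (N,)   signed normal-flow magnitudes

    Under a pure-rotation model, each measurement contributes one scalar
    equation  n_dir^T * B(x) * omega = n_mag,  which is linear in omega.
    """
    A = np.stack([d @ rotational_interaction_matrix(px, py)
                  for (px, py), d in zip(points, n_dirs)])   # (N, 3)
    omega, *_ = np.linalg.lstsq(A, n_mags, rcond=None)
    return omega
```

In practice a robust loss or RANSAC-style inlier selection would wrap this plain least squares, and the paper's solvers additionally address the general setting in which linear velocity and scene depth enter the constraint.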



Event-Aided Time-to-Collision Estimation for Autonomous Driving

July 2024 · 55 Reads

Predicting a potential collision with leading vehicles is an essential functionality of any autonomous/assisted driving system. One bottleneck of existing vision-based solutions is that their update rate is limited to the frame rate of the standard cameras used. In this paper, we present a novel method that estimates the time to collision using a neuromorphic event-based camera, a biologically inspired visual sensor that can sense at exactly the same rate as the scene dynamics. The core of the proposed algorithm is a two-step approach for efficient and accurate geometric model fitting on event data in a coarse-to-fine manner. The first step is a robust linear solver based on a novel geometric measurement that overcomes the partial observability of event-based normal flow. The second step further refines the resulting model via a spatio-temporal registration process formulated as a nonlinear optimization problem. Experiments on both synthetic and real data demonstrate the effectiveness of the proposed method, which outperforms alternative methods in terms of efficiency and accuracy.
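As a rough illustration of how a linear solver can recover time to collision from normal flow alone, the sketch below assumes a purely radial expansion flow about an unknown focus of expansion, which makes the problem linear in (1/TTC, FOE/TTC). This simplified model is a stand-in and not the paper's geometric measurement; all names are illustrative.

```python
import numpy as np

def fit_ttc_radial(points, n_dirs, n_mags):
    """Linear time-to-collision fit from normal-flow measurements, assuming a
    purely radial expansion flow u(x) = (x - foe) / ttc about an unknown
    focus of expansion (FOE).

    points : (N, 2) image coordinates of the measurements
    n_dirs : (N, 2) unit normal-flow directions
    n_mags : (N,)   signed normal-flow magnitudes

    Each measurement gives the scalar equation
        (n_dir . x) * s - n_dir . c = n_mag,
    with s = 1/ttc and c = foe/ttc, which is linear in (s, c).
    """
    A = np.column_stack([
        np.einsum("ij,ij->i", n_dirs, points),  # n_dir . x
        -n_dirs[:, 0],
        -n_dirs[:, 1],
    ])
    (s, cx, cy), *_ = np.linalg.lstsq(A, n_mags, rcond=None)
    ttc = 1.0 / s
    foe = np.array([cx, cy]) * ttc
    return ttc, foe
```

A robust estimator (e.g., RANSAC or an M-estimator) would replace the plain least squares to mirror the robust first step described above, before any nonlinear spatio-temporal refinement.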


BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream

July 2024 · 5 Reads

Neural implicit representation of visual scenes has attracted much attention in recent computer vision and graphics research. Most prior methods focus on reconstructing the 3D scene representation from a set of images. In this work, we demonstrate the possibility of recovering neural radiance fields (NeRF) from a single blurry image and its corresponding event stream. We model the camera motion with a cubic B-Spline in SE(3) space. Both the blurry image and the brightness change within a time interval can then be synthesized from the 3D scene representation, given the 6-DoF poses interpolated from the cubic B-Spline. Our method jointly learns the implicit neural scene representation and recovers the camera motion by minimizing the differences between the synthesized data and the real measurements, without pre-computed camera poses from COLMAP. We evaluate the proposed method on both synthetic and real datasets. The experimental results demonstrate that we are able to render view-consistent latent sharp images from the learned NeRF and bring a blurry image to life in high quality. Code and data are available at https://github.com/WU-CVGL/BeNeRF.
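For intuition, here is a minimal sketch of the two image-formation models the abstract refers to, under the common assumptions that a motion-blurred image is the average of sharp renders along the camera trajectory during the exposure and that the accumulated brightness change follows the logarithmic event-generation model. The callables `render` and `spline_pose` are hypothetical placeholders for the learned scene representation and the cubic B-spline pose interpolation.

```python
import numpy as np

def synthesize_measurements(render, spline_pose, t0, t1, num_samples=16):
    """Sketch of the image-formation models, under simplifying assumptions.

    render      : placeholder; render(pose) returns a sharp image from the
                  learned scene representation at a given 6-DoF pose
    spline_pose : placeholder; spline_pose(t) returns the SE(3) pose
                  interpolated from the cubic B-spline at time t
    t0, t1      : start / end of the exposure (or accumulation) interval
    """
    ts = np.linspace(t0, t1, num_samples)
    renders = np.stack([render(spline_pose(t)) for t in ts])

    # Motion-blurred image: average of sharp renders along the trajectory
    # during the exposure time.
    blurry = renders.mean(axis=0)

    # Accumulated brightness change over the interval: difference of log
    # intensities between the end and the start of the interval.
    eps = 1e-6
    brightness_change = np.log(renders[-1] + eps) - np.log(renders[0] + eps)
    return blurry, brightness_change
```

Both synthesized quantities can then be compared against the captured blurry image and the measured event accumulation to drive the joint optimization described in the abstract.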



Event-based Motion Segmentation by Cascaded Two-Level Multi-Model Fitting

November 2021 · 37 Reads

Among the prerequisites for a synthetic agent to interact with dynamic scenes, the ability to identify independently moving objects is particularly important. In practice, however, standard cameras may deteriorate remarkably under aggressive motion and challenging illumination conditions. In contrast, event-based cameras, a category of novel biologically inspired sensors, offer advantages in dealing with these challenges: their rapid response and asynchronous nature enable them to capture visual stimuli at exactly the same rate as the scene dynamics. In this paper, we present a cascaded two-level multi-model fitting method for identifying independently moving objects (i.e., the motion segmentation problem) with a monocular event camera. The first level leverages tracking of event features and solves the feature clustering problem under a progressive multi-model fitting scheme. Initialized with the resulting motion model instances, the second level further addresses the event clustering problem using a spatio-temporal graph-cut method. This combination leads to efficient and accurate event-wise motion segmentation that neither level can achieve alone. Experiments demonstrate the effectiveness and versatility of our method in real-world scenes with different motion patterns and an unknown number of independently moving objects.
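The cascaded structure can be summarized by the skeleton below, which substitutes a greedy sequential fit for the progressive multi-model fitting of the first level and a per-event nearest-model assignment for the spatio-temporal graph cut of the second level. The helpers `fit_model` and `model_cost` are hypothetical, so this is an outline of the data flow rather than the paper's algorithm.

```python
import numpy as np

def segment_motions(feature_tracks, events, fit_model, model_cost, inlier_thresh=1.0):
    """Outline of a cascaded two-level motion segmentation.

    feature_tracks : list of short 2D feature trajectories extracted from events
    events         : sequence of events, e.g. (x, y, t, polarity) tuples
    fit_model      : hypothetical helper fitting a motion model to a set of
                     tracks (assumed robust to outliers, e.g. a RANSAC wrapper)
    model_cost     : hypothetical helper scoring how well a model explains a
                     track or a single event
    """
    # Level 1 (stand-in): greedy sequential fitting as a simplified proxy for
    # progressive multi-model fitting over the feature tracks.
    models, remaining = [], list(range(len(feature_tracks)))
    while len(remaining) >= 4:
        model = fit_model([feature_tracks[i] for i in remaining])
        inliers = [i for i in remaining
                   if model_cost(model, feature_tracks[i]) < inlier_thresh]
        if len(inliers) < 4:
            break
        models.append(fit_model([feature_tracks[i] for i in inliers]))
        remaining = [i for i in remaining if i not in inliers]

    # Level 2 (stand-in): per-event assignment to the best-fitting model.
    # The paper instead solves a spatio-temporal graph cut, which additionally
    # enforces label smoothness between neighboring events.
    if not models:
        return models, np.zeros(len(list(events)), dtype=int)
    costs = np.array([[model_cost(m, ev) for m in models] for ev in events])
    return models, costs.argmin(axis=1)
```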


Fig. 7: Mapping module. (a) Stereo observations (time surfaces) are created at selected timestamps t, …, t−M (e.g., at 20 Hz) and fed to the mapping module along with the events and camera poses. Inverse depth estimates, represented by probability distributions p(D_{t−k}), are propagated to a common time t and fused to produce an inverse depth map p(D_t); estimates from 20 stereo observations (i.e., M = 19) are fused to create p(D_t). (b) Taking the fusion from t−1 to t as an example, the fusion rules are indicated in the dashed rectangle, which represents a 3×3 region of the image plane (pixels are marked by a grid of gray dots). A 3D point corresponding to the mean depth of p(D_{t−1}) projects onto the image plane at time t at the blue dot; this blue dot and p(D_{t−1}) influence (i.e., assign, fuse or replace) the distributions p(D_t) estimated at the four closest pixels.
Fig. 8: Tracking. The point cloud recovered from the inverse depth map in (a) is warped onto the time-surface negative at the current time (b) using the estimated relative pose. The result in (b) is a good alignment between the projection of the point cloud and the minima (dark areas) of the time-surface negative.
Fig. 12: Mapping. Qualitative comparison of mapping results (depth estimation) on several sequences using various stereo algorithms. The first column shows intensity frames from the DAVIS camera (not used; for visualization only). Columns 2 to 5 show inverse depth estimates of GTS [26], SGM [45], CopNet [62] and our method, respectively. Depth maps are color coded from red (close) to blue (far) over a black background, in the range 0.55–6.25 m for the top four rows (sequences from [21]) and 1–6.25 m for the bottom two rows (sequences from [55]).
Event-Based Stereo Visual Odometry

March 2021 · 231 Reads · 224 Citations

IEEE Transactions on Robotics

Event-based cameras are bioinspired vision sensors whose pixels work independently from each other and respond asynchronously to brightness changes, with microsecond resolution. Their advantages make it possible to tackle challenging scenarios in robotics, such as high-speed and high dynamic range scenes. We present a solution to the problem of visual odometry from the data acquired by a stereo event-based camera rig. Our system follows a parallel tracking-and-mapping approach, where novel solutions to each subproblem (three-dimensional (3-D) reconstruction and camera pose estimation) are developed with two objectives in mind: being principled and efficient, for real-time operation with commodity hardware. To this end, we seek to maximize the spatio-temporal consistency of stereo event-based data while using a simple and efficient representation. Specifically, the mapping module builds a semidense 3-D map of the scene by fusing depth estimates from multiple viewpoints (obtained by spatio-temporal consistency) in a probabilistic fashion. The tracking module recovers the pose of the stereo rig by solving a registration problem that naturally arises due to the chosen map and event data representation. Experiments on publicly available datasets and on our own recordings demonstrate the versatility of the proposed method in natural scenes with general 6-DoF motion. The system successfully leverages the advantages of event-based cameras to perform visual odometry in challenging illumination conditions, such as low light and high dynamic range, while running in real time on a standard CPU. We release the software and dataset under an open-source license to foster research in the emerging topic of event-based simultaneous localization and mapping.
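As a small illustration of the probabilistic depth fusion described above (see also the Fig. 7 caption), the sketch below fuses two inverse-depth hypotheses at a pixel under the simplifying assumption that each is summarized by a Gaussian. The compatibility test and the assign/fuse/replace logic are illustrative and not the paper's exact rules.

```python
def fuse_inverse_depth(mu_a, var_a, mu_b, var_b, compat_sigma=2.0):
    """Sketch of fusing two inverse-depth hypotheses at the same pixel, each
    summarized (as a simplifying assumption) by a Gaussian N(mu, var).

    Loosely following the assign / fuse / replace idea from the mapping figure:
      - a lone hypothesis is simply assigned (handled by the caller);
      - two statistically compatible hypotheses are fused;
      - otherwise the more confident one (smaller variance) is kept.
    """
    # Compatibility test: do the means agree within compat_sigma std. deviations?
    if abs(mu_a - mu_b) ** 2 <= compat_sigma ** 2 * (var_a + var_b):
        var = var_a * var_b / (var_a + var_b)                 # product of Gaussians
        mu = (mu_a * var_b + mu_b * var_a) / (var_a + var_b)
        return mu, var
    # Incompatible: keep the lower-variance hypothesis.
    return (mu_a, var_a) if var_a < var_b else (mu_b, var_b)
```

Propagating every per-viewpoint estimate to a common time and applying such a fusion rule pixel-wise yields the semidense inverse depth map that the tracking module then registers against.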

Citations (4)


... With this observation, some existing methods focus on predicting normal flow (NF) and demonstrate that normal flow is useful for tasks like egomotion estimation [23,28,41]. However, current NF estimation approaches are predominantly model-based, relying on fitting a plane to the local space-time event surface [9,32]. ...

Reference:

Learning Normal Flow Directly From Event Neighborhoods
Motion and Structure from Event-Based Normal Flow
  • Citing Chapter
  • October 2024
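For context on the plane-fitting approach mentioned in the citing snippet above, here is a minimal sketch of model-based normal-flow estimation: fit a plane to the timestamps of a local space-time neighborhood of events and read the normal flow off the spatial gradient. The function name and input format are illustrative.

```python
import numpy as np

def normal_flow_from_plane_fit(neighborhood):
    """Estimate normal flow by fitting a plane to the local space-time
    surface of events.

    neighborhood : (N, 3) array of events (x, y, t) around the query location.

    Fit t = a*x + b*y + c by least squares; the spatial gradient of the event
    timestamps is (a, b), and the normal flow equals grad_t / ||grad_t||^2
    (magnitude 1/||grad_t||, direction along the gradient).
    """
    x, y, t = neighborhood[:, 0], neighborhood[:, 1], neighborhood[:, 2]
    A = np.column_stack([x, y, np.ones_like(x)])
    (a, b, _), *_ = np.linalg.lstsq(A, t, rcond=None)
    grad = np.array([a, b])
    return grad / (grad @ grad + 1e-12)
```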

... Nguyen et al. [16] developed a real-time continuous-time LiDAR-inertial odometry (SLICT2), which achieved efficient optimization with few iterations using a simple solver. Lu et al. [28] proposed an event-based visual-inertial velometer that incrementally incorporates measurements from a stereo event camera and IMU. Li et al. [17] proposed a spline-based approach (SFUISE) for continuous-time Ultra-wideband-Inertial sensor fusion, which addressed the limitations of discrete-time sensor fusion schemes in asynchronous multi-sensor fusion and online calibration. ...

Event-based Visual Inertial Velometer
  • Citing Conference Paper
  • July 2024

... The tracking module projects the accumulated event frames onto these reference frames for pose estimation. Building on these two modules, different event representation methods have been introduced in event-based visual odometry; see [2]-[4]. However, both the mapping and tracking modules of these systems depend on fixed-rate triggers determined by the platform's processing capacity, resulting in considerable computational overhead. ...

IMU-Aided Event-based Stereo Visual Odometry
  • Citing Conference Paper
  • May 2024

... Similarly, event-based methods for unsupervised depth estimation and egomotion estimation utilize the high temporal resolution of DVS outputs to generate real-time depth maps and motion trajectories [43,44]. Event-based SLAM frameworks [45,46,47] and visual odometry solutions [48,49] highlight the robustness of neuromorphic perception for localization and mapping under resource-constrained conditions. Techniques such as contrast maximization [50] and reward-based refinements [51] have further improved feature extraction and motion estimation, showcasing the flexibility of neuromorphic vision systems. ...

Event-Based Stereo Visual Odometry

IEEE Transactions on Robotics