Fig 1 - uploaded by Ghazaleh Panahandeh

Source publication

In this paper, we address the problem of ego-motion estimation by fusing visual and inertial information. The hardware consists of an inertial measurement unit (IMU) and a monocular camera. The camera provides visual observations in the form of features on a horizontal plane. Exploiting the geometric constraint of features on the plane into visual...

## Contexts in source publication

**Context 1**

... setups, for indoor or GPS-denied environments. Moreover, estimating the metric distance to the plane can provide useful information for take-off and landing without using any markers or pre-built maps. Currently, the main application for the considered work is the close-to-landing maneuvers of quadcopters and other UAVs, as illustrated in Fig. ...

**Context 2**

... focus on estimator inconsistency is therefore suggested for our system. Moreover, to evaluate the performance of our proposed method for height estimation, we performed the following experiment: the IMU-camera was placed on a table, and, without any motion along the x − y plane, the table was lifted up and down using a step motor. Fig. 10 illustrates the estimated height of the system for about 373 seconds. The initial and final measured heights were approximately 68 cm, and the height of the peaks was in the range of 113 ± 2 cm. It is worth mentioning that the estimated error along the x and y directions was on the order of millimeters. For this estimate, the maximum ...

**Context 3**

... successive images. For the reported experiments in this section, we have tried to capture different types of motions, such as sharp and smooth turns, constant speed and sudden acceleration. Moreover, in all the experiments, for some period of time the sensors were left stationary, without any movement. The stationary periods are well depicted in Fig. 10. Additionally, in a separate experiment, to investigate the stability of the system under stationary conditions over a long period of time, we evaluated the performance of the system for a case where the system was left stationary on a table without any movement. The results show that after about 30 min the error along the x and y ...

**Context 4**

... time, we evaluated the performance of the system for a case where the system was left stationary on a table without any movement. The results show that after about 30 min the error along the x and y axes was about 50 cm and along the z axis it was about 150 cm. The estimated biases of the accelerometer and gyro for this experiment are plotted in Fig. 11, where it can be seen that the biases are quite stable for a long period of time. It is worth mentioning that, for this case where there is no excitation in the system (linear acceleration and rotational velocity), the observability analysis is not strictly valid since the required conditions regarding existence of excitation are ...

## Similar publications

Motion estimation is the most essential part of video coding standards, but it alone consumes more than half of the encoding time. To reduce the computational complexity, the time required for motion estimation must be reduced, which calls for efficient algorithms. A new Modified Cross Hexagon Diamond Search (MCHDS)...

This paper presents a method for image motion estimation for event-based sensors. Accurate and fast image flow estimation still challenges Computer Vision. A new paradigm based on asynchronous event-based data provides an interesting alternative and has been shown to provide good estimation at high contrast contours by estimating motion based on very ac...

Block matching motion estimation is the essence of video coding systems. This paper is a survey of the existing block matching algorithms used for motion estimation in video coding. The algorithms that are surveyed in this paper are widely accepted by the video coding community and have been used in implementing various standards, ran...

Integer Motion Estimation (IME) for block-based video coding introduces significant challenges in power consumption and silicon area usage with the adoption of more complex coding tools and higher resolution. To conquer these problems, this paper proposes a Binary Adaptive Luminance Mapping (BALM) algorithm by exploiting the local correlation in i...

It is important to reduce the time cost of video compression for image sensors in video sensor networks. Motion estimation (ME) is the most time-consuming part of video compression. Previous work on ME exploited intra-frame data reuse in a reference frame to improve the time efficiency but neglected inter-frame data reuse. We propose a novel inter-f...

## Citations

... The set is explicitly characterized, and its volume is bounded as a function of its characteristics. Lie derivatives are used to find the observable and unobservable modes of time-varying nonlinear VINS [29,30]. To measure the observability of a set of states, the smallest singular value of the block of the local observability Gramian that includes only the states of interest is utilized [31]. ...
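The singular-value measure mentioned in this context can be illustrated with a minimal sketch. It assumes a linear, discrete-time toy system rather than the nonlinear VINS model of the cited work; the function names, the two-state model, and the 50-step horizon are all hypothetical choices made for illustration.

```python
import numpy as np

def observability_gramian(A, C, steps):
    """Discrete-time observability Gramian: sum over k of (C A^k)^T (C A^k)."""
    n = A.shape[0]
    W = np.zeros((n, n))
    Ak = np.eye(n)
    for _ in range(steps):
        CAk = C @ Ak
        W += CAk.T @ CAk
        Ak = A @ Ak
    return W

def block_observability(W, idx):
    """Smallest singular value of the Gramian block for the states in idx."""
    block = W[np.ix_(idx, idx)]
    return np.linalg.svd(block, compute_uv=False).min()

# Hypothetical toy model: state = [position, velocity], sample time 0.1 s.
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])
C_pos = np.array([[1.0, 0.0]])  # position is measured
C_vel = np.array([[0.0, 1.0]])  # only velocity is measured

W_pos = observability_gramian(A, C_pos, steps=50)
W_vel = observability_gramian(A, C_vel, steps=50)

# State of interest: position (index 0).
sigma_pos = block_observability(W_pos, [0])
sigma_vel = block_observability(W_vel, [0])
print(sigma_pos, sigma_vel)
```

With a velocity-only measurement the position block of the Gramian is zero, so the measure flags position as unobservable, which mirrors the unobservable global position in VINS without global measurements.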

... There are exponential terms in (27) that complicate the observability analysis. Compared with the TOM, the stripped observability matrix (SOM) shown in (28) is simpler and can be used for the observability analysis if (29) holds. ...

... In this way, we begin to prove (29). If x_0 belongs to the null space of Q_j, we divide x_0 into two parts, x_{0,1} and x_{0,2}, each consisting of three states. ...

In the development of automotive electronics, nearly every automobile is equipped with an inertial measurement unit (IMU). However, the yaw misalignment of the IMU is inevitable when mounted to a vehicle's body. It is difficult to measure directly, so its estimation is required to acquire accurate data from the IMU. This study proposes a method for the IMU and automotive onboard sensors to estimate the yaw misalignment autonomously. In order to estimate the IMU yaw misalignment, first, the attitude and velocity integration method in the vehicle level frame is presented. In addition, the attitude error dynamics consisting of yaw misalignment, pitch, and roll, and velocity error dynamics consisting of longitudinal and lateral velocities, are derived. Then, on the basis of the error dynamics and observation equations, the degree of the observability of the yaw misalignment is analyzed through the piece-wise constant system (PWCS) and singular value decomposition (SVD) theory. Next, a Kalman filter is implemented to estimate the yaw misalignment and the velocity error. Finally, an experimental test in straight line acceleration and deceleration maneuvers is conducted to verify the yaw misalignment estimation method. When the longitudinal or lateral acceleration varies, the yaw misalignment is observable and can be estimated without aids from external information. After compensating for the yaw misalignment, the accuracy of the state estimation result, such as lateral velocity, can be improved significantly when the IMU is integrated with other sensors.
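The PWCS and SVD analysis described in this abstract can be sketched on a toy problem. The three-state model below (velocity error, yaw misalignment, accelerometer bias) is a hypothetical simplification and not the error dynamics derived in the paper; it only illustrates how stacking per-segment observability matrices into a stripped observability matrix (SOM) shows the effect of varying acceleration. Using the SOM in place of the total observability matrix is itself an assumption that holds only under a null-space condition.

```python
import numpy as np

def segment_observability(F, H):
    """Observability matrix [H; HF; ...; HF^(n-1)] of one constant segment."""
    n = F.shape[0]
    return np.vstack([H @ np.linalg.matrix_power(F, k) for k in range(n)])

def stripped_observability_matrix(segments):
    """Stack the per-segment observability matrices of a PWCS."""
    return np.vstack([segment_observability(F, H) for F, H in segments])

def toy_segment(accel):
    """Hypothetical model: d(dv)/dt = accel * psi + bias, with dv measured."""
    F = np.array([[0.0, accel, 1.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
    H = np.array([[1.0, 0.0, 0.0]])
    return F, H

# Two segments with the same acceleration versus two with different ones.
constant = stripped_observability_matrix([toy_segment(2.0), toy_segment(2.0)])
varying = stripped_observability_matrix([toy_segment(2.0), toy_segment(-3.0)])

rank_constant = np.linalg.matrix_rank(constant)
rank_varying = np.linalg.matrix_rank(varying)
singular_values = np.linalg.svd(varying, compute_uv=False)
print(rank_constant, rank_varying, singular_values)
```

In this toy model the misalignment and the bias cannot be separated while the acceleration stays constant (rank 2), and become separable once it changes between segments (rank 3). The singular values give a graded indication of how well each direction is observed.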

... A monocular VIO system that uses only ground plane features within a UKF was proposed in [26]. They also showed that the translation in the direction of the ground plane normal becomes globally observable, reducing the total number of unobservable directions to three. ...

Modern visual-inertial navigation systems (VINS) are faced with a critical challenge in real-world deployment: they need to operate reliably and robustly in highly dynamic environments. Current best solutions merely filter dynamic objects as outliers based on the semantics of the object category. Such an approach does not scale as it requires semantic classifiers to encompass all possibly-moving object classes; this is hard to define, let alone deploy. On the other hand, many real-world environments exhibit strong structural regularities in the form of planes such as walls and ground surfaces, which are also crucially static. We present RP-VIO, a monocular visual-inertial odometry system that leverages the simple geometry of these planes for improved robustness and accuracy in challenging dynamic environments. Since existing datasets have a limited number of dynamic elements, we also present a highly-dynamic, photorealistic synthetic dataset for a more effective evaluation of the capabilities of modern VINS systems. We evaluate our approach on this dataset, and three diverse sequences from standard datasets including two real-world dynamic sequences and show a significant improvement in robustness and accuracy over a state-of-the-art monocular visual-inertial odometry system. We also show in simulation an improvement over a simple dynamic-features masking approach. Our code and dataset are publicly available.

... With a similar idea, in [74,129,160], the observability of IMU-camera (monocular, RGBD) calibration was analytically studied, which shows that the extrinsic transformation between the IMU and camera is observable given generic motions. Additionally, in [161,162], the system with a downward-looking camera measuring point features from horizontal planes was shown to have an observable global z position of the sensor. ...

As inertial and visual sensors are becoming ubiquitous, visual-inertial navigation systems (VINS) have prevailed in a wide range of applications from mobile augmented reality to aerial navigation to autonomous driving, in part because of the complementary sensing capabilities and the decreasing costs and size of the sensors. In this paper, we survey thoroughly the research efforts taken in this field and strive to provide a concise but complete review of the related work -- which is unfortunately missing in the literature while being greatly demanded by researchers and engineers -- in the hope to accelerate the VINS research and beyond in our society as a whole.

... With a similar idea, in [53,54,20], the observability of IMU-camera (monocular, RGBD) calibration was analytically studied, which shows that the extrinsic transformation between the IMU and camera is observable given generic motions. Additionally, in [55,56], the system with a downward-looking camera measuring point features from horizontal planes was shown to have an observable global z position of the sensor. ...

In this paper, we perform a thorough observability analysis for linearized inertial navigation systems (INS) aided by exteroceptive range and/or bearing sensors (such as cameras, LiDAR and sonars) with different geometric features (points, lines and planes). While the observability of vision-aided INS (VINS) with point features has been extensively studied in the literature, we analytically show that the general aided INS with point features preserves the same observability property: that is, 4 unobservable directions, corresponding to the global yaw and the global position of the sensor platform. We further prove that there are at least 5 (and 7) unobservable directions for the linearized aided INS with a single line (and plane) feature; and, for the first time, analytically derive the unobservable subspace for the case of multiple lines/planes. Building upon this, we examine the system observability of the linearized aided INS with different combinations of points, lines and planes, and show that, in general, the system preserves at least 4 unobservable directions, while if global measurements are available, as expected, some unobservable directions diminish. In particular, when using plane features, we propose to use a minimal, closest point (CP) representation; and we also study in-depth the effects of 5 degenerate motions identified on observability. To numerically validate our analysis, we develop and evaluate both EKF-based visual-inertial SLAM and visual-inertial odometry (VIO) using heterogeneous geometric features in Monte Carlo simulations.

... Also, the convergence of the filter requires a lot of motion [26][27][28]. The observability of the coupled VIF system has been analyzed using different methods in [27][28][29][30]. Persistent excitation of the vehicle increases the number of observable states as compared ...

... It requires excitation of the vehicle (A_g not being zero) to achieve the full system observability, which can be proven by analyzing the system observability [28][29][30][39]. The requirement can also be inferred from the fact that a monocular camera estimates speed up to an unknown scale factor (scale ambiguity). ...

In this paper, a visual inertial fusion framework is proposed for estimating the metric states of a Micro Aerial Vehicle (MAV) using optic flow (OF) and a homography model. Aided by the attitude estimation from the on-board Inertial Measurement Unit (IMU), the computed homography matrix is reshaped into a vector and directly fed into an Extended Kalman Filter (EKF). The sensor fusion method is able to recover metric distance, speed, acceleration bias and surface normal of the observed plane. We further consider reducing the size of the filter by using only part of the homography matrix as the system observation. Simulation results show that these smaller filters have reduced observability compared with the filter using the complete homography matrix; however, it is still possible to estimate the metric states as long as one of the axes is linearly excited. Experiments using real sensory data show that our method is superior to the homography decomposition method for state and slope estimation. The proposed method is also validated in closed-loop flight tests of a quadrotor.
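The scale ambiguity that motivates fusing the homography with inertial data can be checked numerically. The sketch below assumes the common plane-induced homography form H = R + t nᵀ / d in normalized image coordinates (sign conventions differ between texts); the function name and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def plane_homography(R, t, n, d):
    """Plane-induced homography H = R + t n^T / d (one common convention)."""
    return R + np.outer(t, n) / d

R = np.eye(3)                      # no rotation between the two views
t = np.array([0.2, -0.1, 0.05])    # camera translation (metres, hypothetical)
n = np.array([0.0, 0.0, 1.0])      # unit normal of the observed plane
d = 1.5                            # distance to the plane (metres, hypothetical)

H = plane_homography(R, t, n, d)

# Scaling translation and distance by the same factor leaves H unchanged.
scale = 4.0
H_scaled = plane_homography(R, scale * t, n, scale * d)

same = np.allclose(H, H_scaled)
print(same)
```

Because only the ratio t/d enters the homography, vision alone cannot recover metric distance or speed. An accelerometer measures acceleration in metric units, which is why excitation along at least one axis lets a filter resolve the scale.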

... One of the fundamental problems in VIF is determining which states of the coupled system are observable, and this has been extensively analyzed using different methods [130,131,132,133]. The observability of the system depends mainly on the type of the visual algorithms and the motion of the vehicle. ...

... It requires excitation of the vehicle (A_g not being zero) to achieve the full system observability, which can be proven by analyzing the system observability [130,131,132,133,135]. The requirement can also be inferred from the fact that a monocular camera estimates speed up to an unknown scale factor (scale ambiguity). ...

This thesis is concerned with the motion estimation and control of Micro Aerial Vehicles (MAVs) using a monocular camera and Inertial Measurement Unit (IMU). The primary contribution is the development of robust visual algorithms dealing with real-world challenges like illumination variation and low-texture scenes, and the design of simple sensor fusion techniques for reliable and high-frequency motion estimation and even structure estimation. Real-time performance is achieved and the algorithms have been validated using real sensory data and tested for on-board closed-loop control of MAVs. Hover, take-off and landing of a low-cost quadrotor are first considered. Image loom is used for determining the rough height during take-off and for providing an initial height value for the biologically-inspired 3D snapshot algorithm. The problem of illumination change is addressed by developing a fast and robust template matching algorithm. External disturbances can temporarily push the drone away from the visual anchor point. This condition is detected using a confidence measure and it is shown how by integrating frame-to-frame motion the vehicle can be guided back to achieve loop closure. The approach has been tested extensively in flight tests both indoors and outdoors. The estimation of the visual scale factor is then considered by fusing visual estimation with inertial data. The robustness of a popular OF algorithm is improved using a transformed binary image from the intensity image. A new homography model is developed in which it is proposed to directly obtain the speed up to an unknown scale factor from the homography matrix. For sensor fusion, Kalman filters are initially applied separately to the three axes for state estimation. It is then proposed to use the whole homography matrix as the system measurement to further compute the surface normal. It is also shown that only part of the homography matrix is needed for metric state estimation. Real images and IMU data recorded from our quadrotor platform show the superiority of the proposed method over the traditional approach that decomposes the homography matrix for both state estimation and slope estimation. This thesis also touches upon the online estimation of camera intrinsic parameters.

This paper focuses mainly on the analysis of observable modes and absolute navigation capability of the landmark-based IMU/Vision Navigation System (IMU/VNS) for Unmanned Aerial Vehicles (UAVs). Firstly, the mathematical model of the IMU/VNS is established. Secondly, the observability matrix of the landmark-based IMU/VNS is obtained based on Lie derivatives, and the observable modes with different numbers of landmarks are derived by solving for the null space of the observability matrix. Finally, by deriving the navigation parameters of the UAV from the observable modes, the absolute navigation capability of the landmark-based IMU/VNS is analyzed for the case where the absolute positions of the landmarks are available. Simulation results verify the correctness of the above analysis methods.

In this article, we perform a thorough observability analysis for linearized inertial navigation systems (INS) aided by exteroceptive range and/or bearing sensors (such as cameras, LiDAR, and sonars) with different geometric features (points, lines, planes, or their combinations). In particular, by reviewing common representations of geometric features, we introduce two sets of unified feature representations, i.e., the quaternion and closest point (CP) parameterizations. While the observability of vision-aided INS (VINS) with point features has been extensively studied in the literature, we analytically show that the general aided INS with point features preserves the same observability property, i.e., four unobservable directions, corresponding to the global yaw and the global translation of the sensor platform. We further prove that there are at least five (or seven) unobservable directions for the linearized aided INS with a single line (plane) feature, and, for the first time, analytically derive the unobservable subspace for the case of multiple lines or planes. Building upon this analysis for homogeneous features, we examine the observability of the same system but with combinations of heterogeneous features, and show that, in general, the system preserves at least four unobservable directions, while if global measurements are available, as expected, the unobservable subspace will have lower dimensions. We validate our analysis in Monte–Carlo simulations using both EKF-based visual-inertial SLAM and visual-inertial odometry (VIO) with different geometric features.