Vision Based UAV Attitude Estimation: Progress
and Insights
Abd El Rahman Shabayek, Cédric Demonceaux, Olivier Morel, David Fofi
April 14, 2011
Le2i, UMR CNRS 5158
IUT Le Creusot
Université de Bourgogne, France
Abdelrahman.Shabayek@members.ema.eu, Cedric.Demonceaux@u-bourgogne.fr, Olivier.Morel@u-bourgogne.fr, David.Fofi@u-bourgogne.fr
Abstract
Unmanned aerial vehicles (UAVs) are increasingly replacing manned systems in situations that are dangerous, remote, or difficult for manned aircraft to access. Their control tasks are empowered by computer vision technology. Visual sensors are robustly used for stabilization, as primary or at least secondary sensors. Hence, UAV stabilization by attitude estimation from visual sensors is a very active research area. Vision based techniques are proving their effectiveness and robustness in handling this problem. In this work, a comprehensive review of UAV vision based attitude estimation approaches is given, starting from horizon based methods and passing by vanishing points, optical flow, and stereoscopic techniques. A novel segmentation approach for UAV attitude estimation based on polarization is proposed. Our future insights for attitude estimation from uncalibrated catadioptric sensors are also discussed.
1 Introduction
In order to determine the pose of the vehicle accurately and rapidly, the regular approach is to use inertial sensors together with other sensors and to apply sensor fusion. Some sensors used for this purpose are the Global Positioning System (GPS) receiver and the inertial navigation system (INS), as well as other sensors such as altitude sensors (ALS) and speedometers. These sensors have some limitations. The GPS sensor, for example, is not available at some locations, or its readings are subject to error. The INS has the disadvantage of error accumulation. To overcome these limitations, vision-based navigation approaches have been developed. These approaches can be used where GPS or INS systems are not available, or can be combined with other sensors to obtain better estimations. UAV attitude estimation has been deeply studied in terms of data fusion of multiple low cost sensors in a Kalman filter (KF) framework to obtain the vehicle's full state of position and orientation. But in pure vision based methods, if a horizontal world reference is visible (e.g. the horizon), the camera attitude can be obtained.
In order to control a flying vehicle, at least six parameters (the pose of the vehicle) should be known: Euler angles representing the orientation of the vehicle and a vector of coordinates representing the position of the vehicle. Pose estimation basically depends on viewing an unchanging physical world reference (e.g. landmarks on the ground) for accurate estimation. Our main concern in this work is to review the work that focuses on attitude (roll, pitch, and yaw angles, shown in figure (1)) estimation rather than pose estimation.
Figure 1: An illustrative sketch of the attitude (roll, pitch, and yaw angles)
In a typical flight, the demand for yaw angle will be largely constant and hence disturbances tend to have a relatively small effect on yaw. Further, small steady state errors are normally acceptable since (unlike roll and pitch) any errors will have no further effect on the UAV motion. Therefore, for the sake of UAV stabilization, the most important angles to be estimated are the pitch and roll angles, as most of the work in the literature proposes. In this work, the focus will be on attitude estimation from perspective and omnidirectional cameras. It is intended to give a complete review with some views to enhance current work and to propose novel ideas under investigation and development by our research group.
1.1 Vision sensors for attitude estimation
Vision based methods were first introduced by [1]. They proposed to equip a Micro Air Vehicle (MAV) with a perspective camera to obtain a vision-guided flight stability and autonomy system. Omnidirectional sensors for attitude estimation were first introduced by [2]. The omnidirectional sensors (fisheye and catadioptric cameras, shown in figure (2)) were used in different scenarios. Catadioptric sensors are commercially available at reasonable prices. A catadioptric sensor has two main parts, the mirror and the lens. The lens can be telecentric or perspective. The sensor is in general assembled as shown in figure (2c).
Omnidirectional sensors were used alone or in stereo configurations. Omnidirectional vision presents several advantages: a) a complete surrounding of the UAV can be captured and the horizon is totally visible, b) possible occlusions have a lower impact on the estimation of the final results, c) whatever the attitude of the UAV, the horizon is always present in the image, even partially, and the angles can always be computed, d) it is also possible to compute the roll and pitch angles without any prior hypothesis, contrary to the applications using a perspective camera. Yet, catadioptric vision also presents some drawbacks. For example: a) a catadioptric image contains significant deformations due to the geometry of the mirror and to the sampling of the camera, b) catadioptric cameras should be redesigned at a lower scale to be attached to a micro air vehicle (MAV).
(a) Perspective (b) Fisheye
(c) Catadioptric
Figure 2: Perspective and omnidirectional (Fisheye and Catadioptric) cameras
1.2 The main techniques for attitude estimation
In the literature, the first group of methods tries to detect a horizontal reference frame in the world to estimate the up direction and hence the attitude of the vehicle. The horizon, if visible, is the best natural horizontal reference to be used [1]. However, in urban environments the horizon might not be visible. Hence, the second group tries to find the vanishing points from parallel vertical and horizontal lines, which are basic features of man-made structures (e.g. [3]). The third group was biologically inspired by insects; it employs the UAV motion (optical flow) for the required estimation [4]. Stereo vision based techniques came into play to open the door for more accurate estimation, especially if combined with optical flow (e.g. [5]). All these techniques will be discussed in the following sections.
Most of the employed techniques in the literature use the Kalman filter (KF) or one of its variations in order to obtain an accurate and reliable estimation, especially if more than one sensor is used and their measurements are fused. For general parameter estimation, the extended Kalman filter (EKF) technique is widely adopted. Because the EKF processes the system in a linearized manner, it may lead to suboptimal estimation and even filter divergence. Moreover, state estimation using the EKF assumes that both the state recursion and the covariance propagation are Gaussian. The unscented Kalman filter (UKF) addresses nonlinear parameter estimation and machine learning problems. It can outperform the EKF, especially for highly nonlinear system dynamics/measurement processes, and no Jacobians or derivatives of any functions are needed in UKF processing [6]. For example, in [7], using an EKF, the candidate horizon lines are propagated and tracked through successive image frames, with statistically unlikely horizon candidates eliminated. In [8], they followed the EKF framework to combine inertial and visual sensors for real time attitude estimation. They designed a KF for image line measurements.
1.3 Paper organization
The paper is organized as follows: sections (2, 3, 4) review the general techniques for attitude estimation from visual sensors (perspective and omnidirectional only) in detail. In section (2), horizon detection algorithms are briefly explained and reviewed. Vanishing point based techniques are reviewed in section (3). The classical and hybrid approaches using stereo vision and optical flow are reviewed in section (4). Finally, we conclude in section (5).
2 Horizon Detection
The visual sensor is not only self-contained and passive, like an INS, but also interacts with its environment. An absolute attitude can be provided by detecting a reliable world reference frame. Attitude computation by vision is based on the detection of the horizon, which appears as a line in perspective images or a curve in omnidirectional images as shown in figure (3), and on the estimation of the angle between the horizon and a horizontal reference.
Due to the difficulty of obtaining ground truth for aircraft attitude, most of the work in the literature does not provide a quantitative measure of the error in the roll and pitch estimates. In [9], they provided a complexity and performance comparison between their method and other methods in the literature. They included a comparison table of execution times for various published studies on visual attitude estimation.
In the following subsections, we will cover in detail the different segmentation approaches for horizon detection in section (2.1) and a proposal to segment using polarization in section (2.2); both the perspective and omnidirectional scenarios will be reviewed. Section (2.3) will briefly discuss horizon estimation and attitude computation in the perspective case. Section (2.4) will briefly discuss the same in the omnidirectional case, especially the catadioptric scenario, which is frequently used.
(a) Perspective (b) Non-central catadioptric
Figure 3: Horizon in a) a perspective image, b) a non-central catadioptric image
2.1 Sky/Ground Segmentation
As the segmentation of sky and ground is a crucial step toward extracting the horizon
line/curve, which is used for attitude estimation, these segmentation methods will be
discussed here.
Using perspective vision, algorithms employing Gaussian assumptions for sky/ground segmentation fail in scenarios where the underlying Gaussian assumption for the sky and ground appearances is not appropriate [1]. These assumptions might be enhanced by a statistical image modeling framework that builds prior models of the sky and ground which are then trained. Since the appearances of the sky and ground vary enormously, no single feature is sufficient for accurate modeling; as such, these algorithms rely on both color and texture as critical features. They may use hue and intensity for color representation, and the complex wavelet transform for texture representation. Then they may use Hidden Markov Tree models as the underlying statistical models over the feature space [10]. In [7], the algorithm is based on detecting lines in an image which may correspond to the horizon, followed by testing the optical flow against the measurements expected by the motion filter.
Using omnidirectional vision, some algorithms use a Markovian formulation of sky/ground segmentation based on color information [2], or the sky/ground partitioning is done in the spherical image thanks to the optimization of the Mahalanobis distance between these regions, with the search for points in either region taking place in RGB space [11]. In order to isolate the sky from the ground [12, 13], an approach based on the method employed by [14] weights the RGB components of each pixel using the function f(RGB) = 3B²/(R+G+B).
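As a rough illustration, this pixel weighting can be sketched as follows (a minimal NumPy sketch assuming an 8-bit RGB image and the weighting f(RGB) = 3B²/(R+G+B) as quoted; the fixed threshold value is a hypothetical placeholder, since a stage such as b) in [9] would compute an optimal one):

```python
import numpy as np

def sky_ground_mask(rgb, threshold=96.0):
    """Weight each pixel by f(RGB) = 3*B^2 / (R+G+B), then threshold.

    rgb: (H, W, 3) uint8 array. Returns a boolean mask (True = sky),
    exploiting the fact that sky pixels are blue-dominant.
    """
    r, g, b = [rgb[..., i].astype(np.float64) for i in range(3)]
    total = r + g + b + 1e-9          # avoid division by zero
    f = 3.0 * b**2 / total            # blue-dominance weighting
    return f > threshold              # hypothetical fixed threshold

# toy example: a blue "sky" pixel next to a green "ground" pixel
img = np.array([[[40, 80, 220], [60, 140, 50]]], dtype=np.uint8)
mask = sky_ground_mask(img)
```

The weighting strongly favors blue-dominant pixels, so a simple global threshold already separates a clear sky from vegetation or soil in this toy case.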
In [9], they propose an algorithm which can be incorporated into any vision system (e.g. narrow angle, wide angle or panoramic), irrespective of the way in which the environment is imaged (e.g. through lenses or mirrors). The proposed horizon detection method consists of four stages: a) enhancing the sky/ground contrast, b) determining the optimum threshold for sky and ground segmentation, c) converting horizon points to vectors in the view sphere, and d) fitting a 3D plane to the horizon vectors to estimate the attitude. In [15], they proposed segmentation using temperature from thermopile sensors in the thermal infrared band. However, in this work the focus will be on attitude estimation from perspective and omnidirectional sensors only.
The previous segmentation solutions are either complex and/or time consuming. A method based on polarization for segmentation is proposed in section (2.2). We believe it will bring significant improvements in both complexity and time due to its simplicity. We propose a novel non-central catadioptric sensor, where the mirror is a freeform shape and the camera is polarimetric (e.g. the FD1665P polarization camera [16]), to be used for attitude estimation.
2.2 Polarization based segmentation
Instead of using color information or edge detection algorithms for segmentation, which may require different complex models and offline processing as shown, we propose to use the polarization information which exists in the surrounding nature. Polarization information is directly computed from three intensity images taken at three different angles of a linear polarization filter (0, 45, and 90 degrees), or in one shot using a polarimetric camera.
Using polarization for segmentation is not new. It was used for rough surface segmentation [17], material classification [18], water hazard detection for autonomous off-road navigation [19], and similar applications. However, to the best of our knowledge, this is the first time polarization has been proposed for sky/ground segmentation for UAV attitude estimation.
The most important polarization parameters are the phase (angle) and the degree of polarization. According to [18], the phase of polarization is computed as follows:

θ = 0.5 · tan⁻¹((I₀ + I₉₀ − 2I₄₅)/(I₉₀ − I₀)) + 90    (1)

and, if I₉₀ < I₀:
    θ = θ + 90 if I₄₅ < I₀,
    θ = θ − 90 otherwise,

and the degree of polarization is:

ρ = (I₉₀ − I₀)/((I₉₀ + I₀) · cos(2θ))    (2)

where I₀, I₄₅, and I₉₀ are the intensity images taken at 0, 45, and 90 degrees of the rotating polarizer respectively (or in one shot from a polarimetric camera).
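A minimal NumPy sketch of the phase and degree computation of equations (1) and (2), assuming three registered intensity images; the quadrant-correction branching follows the formulation cited from [18]:

```python
import numpy as np

def polarization_params(i0, i45, i90):
    """Phase (degrees) and degree of polarization from three polarizer images.

    i0, i45, i90: float arrays captured at 0, 45 and 90 degrees of a
    rotating linear polarizer (or one shot of a polarimetric camera).
    """
    i0, i45, i90 = (np.asarray(a, dtype=np.float64) for a in (i0, i45, i90))
    # Equation (1): phase of polarization, in degrees
    theta = 0.5 * np.degrees(np.arctan((i0 + i90 - 2.0 * i45) /
                                       (i90 - i0 + 1e-12))) + 90.0
    # Quadrant correction when I90 < I0
    corr = np.where(i45 < i0, 90.0, -90.0)
    theta = np.where(i90 < i0, theta + corr, theta)
    # Equation (2): degree of polarization
    rho = (i90 - i0) / ((i90 + i0) * np.cos(np.radians(2.0 * theta)) + 1e-12)
    return theta, rho

# toy single-pixel example
theta, rho = polarization_params([120.0], [100.0], [80.0])
```

Sky regions exhibit a markedly different degree and angle of polarization than the ground, which is what the segmentation in figure (4) exploits.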
Figure (4) shows the segmentation results for non-central catadioptric images, with the horizon detected by simply detecting the transition area. This technique is very simple and can be optimized by a kind of binary search in the image, yielding very rapid and robust results for the detected horizon. Only a few regions of the image need to be inspected for their degree or angle of polarization to decide on the search direction. Unlike conventional segmentation methods, thanks to polarization, we do not face the illumination problem caused by the sun being in the image.
In future work, we will provide detailed algorithms with complexity and run time comparisons with other methods found in the literature.
(a) 0 degrees (b) 45 degrees (c) 90 degrees
(d) Segmentation based on the degree of polarization
(e) Segmentation based on the angle of polarization
(f) Extracted horizon curve
Figure 4: Sky/ground segmentation and horizon extraction based on polarization from non-central catadioptric images
2.3 Using perspective sensors
The horizon is projected as a line in the perspective image. Intuitively, it is required to extract that line. Most methods first segment the image into sky/ground areas, then take the separating points as the horizon line. The attitude depends on the gradient of that horizon line on the image plane. In the literature, the general approach is to find the normal to the plane of the horizon in order to estimate the roll and pitch angles. The normal vector has a direct mathematical relation with the attitude, as expressed in different methods. The works in [20, 21] are examples of successful autonomous control of a MAV based on attitude estimation from the detected horizon.
In the literature, the horizon detection problem has been addressed by segmentation and edge detection. In [1, 22], they proposed to equip a MAV with a perspective camera to obtain a vision-guided flight stability and autonomy system. They detected the horizon by extracting the straight line that separates the sky from the ground using the context difference of the two regions. In [10], they treated horizon detection as a subset of image segmentation and object recognition, and used the percentage of visible sky as an error signal to a flight stability controller on a MAV. The resulting system was stable enough to be safely flown by an untrained operator in real time. In contrast, [20] uses a direct edge-detection technique, followed by automatic thresholding and a Hough-like algorithm to generate a "projection statistic" for the horizon. It claims a 99% success rate over several hours of video. Importantly, it deals only with detection, not estimation of attitude. In [7], they propose an algorithm slightly similar to [20] in that it uses an edge detection technique followed by a Hough transform. However, they propose different image prefiltering. In [23, 24, 25, 14], they use the centroids of the sky and ground to extract the horizon and derive the different angles. They simplify their work by using a circular mask to reduce image asymmetry and to simplify the calculations.
2.4 Using omnidirectional sensors
The use of a single perspective camera generates several drawbacks. Firstly, a partial view of the environment and important occlusions of the horizon can have a serious influence on the final result. Secondly, the horizon is visible only in a particular interval of roll and pitch values. If the UAV gets out of this interval, the final image is exclusively made of sky or earth and the horizon cannot be detected. Thirdly, it is only possible to compute the roll angle, while the pitch is only approximated thanks to a hypothesis on the altitude of the UAV. All that pushed the need toward employing omnidirectional sensors to capture the horizon in almost all scenarios. The horizon appears as a curve in the omnidirectional image. It is common to use both fisheye and central catadioptric sensors, as both are treated by the equivalence sphere theory proposed by [26]. The particular geometric characteristics of the catadioptric sensor will be briefly explained in the next section. Once the horizon is detected, these characteristics are used to compute the attitude of the UAV.
2.4.1 Central catadioptric projection of the horizon
As demonstrated in [26], a 3D sphere projects onto the equivalence sphere as a small circle, and then onto the catadioptric image plane as an ellipse (see figure (5)). Consequently, the attitude computation is based on searching for an ellipse in the omnidirectional image or a small circle on the equivalence sphere which corresponds to the horizon. The geometric properties of the equivalence sphere allow the roll and pitch angles to be deduced. Indeed, the normal of the projected horizon on the sphere, which coincides with the line passing through the center of the equivalence sphere and the center of the earth, represents the attitude of the UAV depending on the position of the optical axis. The computation of the coordinates of the optical axis is therefore sufficient to deduce the roll and pitch angles.
2.4.2 Horizon estimation and attitude computation
To estimate the horizon, first the catadioptric image should be segmented into the sky and ground to obtain the points belonging to the horizon. Next, the horizon points should be back-projected onto the equivalence sphere. Finally, the best plane passing through the horizon on that sphere should be estimated to deduce its normal, which gives the roll and pitch angles (e.g. [2, 11]).
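The final plane-fitting step can be sketched as follows (a minimal NumPy sketch, not the exact method of [2, 11]; it assumes the horizon points are already segmented and back-projected to unit vectors on the equivalence sphere, fits the best plane through them by SVD, and reads roll and pitch off the normal under an assumed angle convention):

```python
import numpy as np

def attitude_from_horizon(sphere_pts):
    """Roll and pitch (radians) from horizon points on the equivalence sphere.

    sphere_pts: (N, 3) array of unit vectors. The best-fit plane normal is
    the right singular vector with the smallest singular value; this normal
    plays the role of the world 'up' direction as seen by the camera.
    """
    pts = np.asarray(sphere_pts, dtype=np.float64)
    pts = pts - pts.mean(axis=0)        # the plane need not pass through the origin
    _, _, vt = np.linalg.svd(pts)
    n = vt[-1]                          # normal of the best-fit plane
    if n[2] < 0:                        # fix the sign ambiguity (z up)
        n = -n
    roll = np.arctan2(n[1], n[2])       # angle convention assumed for this sketch
    pitch = np.arcsin(np.clip(-n[0], -1.0, 1.0))
    return roll, pitch

# toy example: a level horizon is a great circle in the z = 0 plane
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
roll, pitch = attitude_from_horizon(circle)
```

For a level horizon the fitted normal is the z-axis, so both angles vanish; a tilted horizon circle tilts the normal and yields the corresponding roll and pitch.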
In [2], they proposed to use an omnidirectional visual sensor in order to compute the attitude of a UAV. They extended the work of [1, 22] to detect the curved horizon line. They show an adaptation of Markov Random Fields (MRFs) to treat the deformations in the catadioptric images in order to detect the horizon; the catadioptric geometric characteristics are then used to compute the UAV attitude. This method gives interesting results but does not sufficiently use the geometric characteristics of catadioptric vision. Moreover, the segmentation step is time consuming and does not permit a real time implementation. In [11], they present higher accuracy and lower computation time. They use the geometric characteristics of the central catadioptric sensor to formulate the process as an optimization problem which is solved on the equivalence sphere in order to compute the attitude angles directly. In [27], a hybrid method using the horizon and the homography is proposed. In [12, 13], they propose an approach similar to [2] for attitude estimation and a stereo-based system for height and motion estimation.
Figure 5: The relation between the horizon projection and the roll and pitch angles. (Adapted from [2]).
3 Vanishing Points
In [11, 2], the horizon was determined with Markov Random Fields or an RGB based Mahalanobis distance. This approach requires conditions where the horizon is visible (e.g. low altitude in urban environments). In addition, it cannot be used to estimate the yaw angle. In urban environments, the world reference can be the parallel lines which are a basic property of man-made structures. In these situations, vanishing points at the intersection of parallel vertical and horizontal lines can be used for attitude estimation (e.g. [3]).
In [28], a batch process was developed to recover the history of camera orientations from a nonlinear optimization (bundle adjustment) of the vanishing points. In [8], the approach is based on vanishing point detection using raw line measurements directly to refine the attitude. They do not require any line tracking, but they fuse these line measurements with the IMU gyro angles and compare each line segment with the current best attitude estimate.
Vanishing points were further exploited with omnidirectional sensors. In [3], they use lines that are available in urban areas, which avoids the limitations of horizon determination, but it is still not possible to estimate the yaw angle, and it also requires determining the sky. Therefore, their approach is not suitable in dense city environments or closed areas. A more recent work proposes the use of vanishing points and the infinite homography to estimate the helicopter attitude [29]. This approach can be used in urban environments; however, the method has never been applied to a real UAV. In [30], they used the approach described in [29] to estimate the helicopter attitude and improved it using a KF.
The research area of using vanishing points for attitude estimation is very active. It provides an intuitive solution for the attitude estimation problem, especially in urban environments. Due to its importance, the following subsections will explain these methods in more detail, using perspective and omnidirectional sensors. For a comprehensive evaluation of several approaches to vanishing point detection, the reader is referred to [31, 32].
3.1 Perspective
The perspective projections of parallel lines intersect at a single point in the image called the vanishing point. In [33], given the camera calibration matrix, the geometric relationship between the vanishing points, the horizon, and the camera orientation is well established on a Gaussian sphere using 2D projective geometry. All vanishing points can be considered in the Gaussian sphere representation, even those at infinity. For more details on representing vanishing points on a Gaussian sphere from a calibrated camera (see figure (6)), the reader is referred to [33, 34, 8].
3.1.1 Gaussian sphere
Figure 6: Gaussian Sphere adapted from [34]
The Gaussian sphere is a unit sphere which shares the optical center of the pinhole camera. In the 2D projective space, an image line is represented as the normal vector of a great circle in homogeneous coordinates. The intersection of two parallel edges is a vanishing point, which can be computed by the duality between points and lines in the projective plane, i.e. v_ij = l_i × l_j, where v_ij is a vanishing point and l_i and l_j are parallel lines. The vanishing point is the direction to the corresponding 3D point at infinity.
In a calibrated camera, the vanishing points formed by vertical edges and those formed by horizontal edges are geometrically constrained to:

v_vertical^T · v_horizontal^i = 0,  i = 1, ..., n.    (3)
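The point-line duality and the constraint of equation (3) can be sketched as follows (a minimal NumPy sketch with hypothetical line coefficients; lines are homogeneous 3-vectors (a, b, c) for ax + by + c = 0, and vanishing points are normalized onto the Gaussian sphere):

```python
import numpy as np

def vanishing_point(l1, l2):
    """Intersection of two image lines (homogeneous 3-vectors), mapped to
    a unit vector on the Gaussian sphere: v = l1 x l2 / ||l1 x l2||."""
    v = np.cross(np.asarray(l1, float), np.asarray(l2, float))
    return v / np.linalg.norm(v)

# two parallel "vertical" edges x = 1 and x = 2: lines (1, 0, -1), (1, 0, -2)
v_vert = vanishing_point([1.0, 0.0, -1.0], [1.0, 0.0, -2.0])
# two parallel "horizontal" edges y = 1 and y = 2: lines (0, 1, -1), (0, 1, -2)
v_horiz = vanishing_point([0.0, 1.0, -1.0], [0.0, 1.0, -2.0])

# equation (3): vertical and horizontal vanishing points are orthogonal
dot = float(np.dot(v_vert, v_horiz))
```

Both cross products land on ideal points (third coordinate zero), illustrating that even vanishing points at infinity are handled naturally on the Gaussian sphere.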
10
Page 11
Vanishing points that lie on the same plane define a vanishing line in the image. The horizon is then the vanishing line that links any two horizontal vanishing points. The horizon is dual to the vertical vanishing point. Geometrically, the horizon is the projection of the world ground plane, and the normal of the ground plane is projected onto the vertical vanishing point, i.e.:

horizon = v_horizontal^i × v_horizontal^j.    (4)
The UAV attitude can be determined when either the vertical vanishing point or at least two horizontal vanishing points are recovered from the image, given that a) the great circle in the Gaussian sphere has the same orientation as the world ground plane, and b) the relative camera pose with respect to the UAV is known. In general, it is assumed that the camera is attached to the UAV with the camera's principal axis aligned along the UAV centerline.
3.1.2 Vertical vanishing points
In urban environments, vertical edges meet at a single vanishing point in the same direction as gravity in the world coordinates. The vertical vanishing point is the perspective projection of the world z-axis with the camera pose matrix. Let v_vertical = (v_x, v_y)^T be the vertical vanishing point; once it is found, the attitude can be immediately computed by (see figure (7)):

roll = φ = atan2(v_x, v_y),  pitch = θ = atan(1/√(v_x² + v_y²)).    (5)

The horizon line in the image is a line defined by the vertical vanishing point, where:

(sin φ cos θ)x + (cos φ cos θ)y + sin θ = 0.    (6)
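Equations (5) and (6) can be sketched as follows (a minimal sketch assuming the vertical vanishing point is given in normalized image coordinates; angles are in radians):

```python
import math

def attitude_from_vertical_vp(vx, vy):
    """Roll and pitch from a vertical vanishing point, following eq. (5)."""
    roll = math.atan2(vx, vy)
    pitch = math.atan(1.0 / math.hypot(vx, vy))
    return roll, pitch

def horizon_residual(x, y, roll, pitch):
    """Left-hand side of eq. (6); zero when (x, y) lies on the horizon."""
    return (math.sin(roll) * math.cos(pitch) * x
            + math.cos(roll) * math.cos(pitch) * y
            + math.sin(pitch))

# toy example: vanishing point straight "up" in the image, 5 units away
roll, pitch = attitude_from_vertical_vp(0.0, 5.0)
```

With zero roll, equation (6) reduces to y = -tan(θ), so points on that image line satisfy the residual to machine precision.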
3.1.3 Horizontal vanishing points
In urban environments, horizontal edges, which are orthogonal to the gravity direction, meet at vanishing points in the world ground plane (see figure (8)). One of the horizontal vanishing points is the perspective projection of the world x-axis with the camera pose matrix. The horizontal vanishing point is then:

v_horizontal = [(cos φ sin ψ − sin φ sin θ cos ψ)/(cos θ cos ψ), (−sin φ sin ψ − cos φ sin θ cos ψ)/(cos θ cos ψ)]^T    (7)

where ψ is the yaw angle. All the horizontal vanishing points lie along the horizon, and their locations are determined by the different yaw angles.
Figure 7: Illustration of the relation between a vertical vanishing point and the roll and
pitch angles.
Figure 8: Horizontal vanishing points.
3.2Catadioptric
As previously mentioned, projection of 3D world points to the image plane can be done
in three steps: first the point is projected onto the equivalent sphere, then onto the plane
at infinity, and finally onto the image plane. Moreover, the projection of a 3D line generates a
great circle on the equivalent sphere (see figure (5)). By back-projecting every candidate
edge onto the sphere and checking whether it verifies the great-circle constraint, one can
decide which edges belong to real 3D lines. To do this, the edges, divided according to
their gradient orientations and selected by their lengths, are back-projected to the sphere.
The plane normal of the great circle is then computed as the cross product of the first and
last edgel directions. In addition, parallel lines share the same vanishing direction on the
equivalent sphere, so dominant sets of parallel lines can be extracted by counting lines
whose vanishing directions agree up to a similarity threshold. By excluding the parallel
lines already found and repeating the same algorithm, the remaining dominant vanishing
directions can be found. Based on an orthogonality threshold, if ‖u₁ × u₂‖ ≥ OrthogonalityThreshold,
the cross product u₃ = u₁ × u₂ is computed to determine the third vanishing direction,
where the uᵢ are the vanishing directions of orthogonal sets of parallel lines. If the
inequality is not satisfied, the detection of orthogonal parallel lines has failed, so attitude
estimation at that frame is skipped and the UAV is assumed not to have changed its orientation.
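The great-circle constraint and the orthogonality test above can be sketched as follows (a simplified illustration, not the authors' implementation; function names and threshold values are assumptions):

```python
import numpy as np

def great_circle_normal(edgel_dirs):
    """Plane normal of the great circle through an edge chain back-projected
    onto the unit sphere: cross product of the first and last edgel directions."""
    n = np.cross(edgel_dirs[0], edgel_dirs[-1])
    return n / np.linalg.norm(n)

def fits_great_circle(edgel_dirs, tol=0.02):
    """An edge belongs to a real 3D line if every back-projected edgel lies on
    the great-circle plane through the sphere centre, i.e. n . p is near 0."""
    n = great_circle_normal(edgel_dirs)
    return bool(np.all(np.abs(edgel_dirs @ n) < tol))

def third_direction(u1, u2, orth_threshold=0.9):
    """If two dominant vanishing directions are nearly orthogonal, the norm of
    u1 x u2 is close to 1; their cross product then gives the third direction.
    Otherwise the frame is skipped (returns None)."""
    u3 = np.cross(u1, u2)
    norm = np.linalg.norm(u3)
    if norm < orth_threshold:
        return None  # orthogonal pair not detected; skip attitude at this frame
    return u3 / norm
```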
4 Stereo Vision And Optical Flow
Figure 9: Stereo vision and optical flow. (a) Stereo vision system. (b) Phase-based estimation of the optical flow field, adapted from [35].
4.1 Stereo vision
Computer stereo vision is a part of computer vision in which two cameras capture the same
scene while separated by a distance, as shown in figure (9a). A computer compares the
images, shifting one over the other to find the parts that match; the amount of shift is
called the disparity.
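A minimal sketch of disparity search by block matching along a rectified row pair, together with the standard triangulation relation Z = fB/d (window size and search range are illustrative choices):

```python
import numpy as np

def match_disparity(left_row, right_row, x, window=3, max_disp=16):
    """Find the disparity at column x of a rectified image-row pair by
    shifting the right row and picking the shift with minimum sum of
    absolute differences (SAD) over a small window."""
    patch = left_row[x:x + window]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp):
        if x - d < 0:
            break
        cost = np.abs(patch - right_row[x - d:x - d + window]).sum()
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Rectified-stereo triangulation: depth Z = f * B / d."""
    return focal_px * baseline_m / disparity_px
```

For instance, a feature shifted 4 pixels between the two views of a camera with a 500-pixel focal length and a 12 cm baseline lies at a depth of 15 m.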
In [36], the authors used a dual-CCD stereo vision system to improve the computation of
the attitude by determining the complete pose of the UAV, taking advantage of a UKF.
However, this system relies on the capture of ground targets/landmarks in both images,
which limits the environment in which the UAV can move. In [37], a mixed stereoscopic
vision system made of fisheye and perspective cameras was presented for altitude
estimation. Since there exists a homography between the two captured views, where the
sensor is calibrated and the attitude is estimated by the fisheye camera using the
techniques in [2, 3], the algorithm searches for the altitude which verifies this
homography, and it allows real-time implementation. In [12, 13], a conventional stereo
system was used for altitude computation, but for attitude computation the authors used
an approach similar to [2].
4.2 Optical flow
Optical flow is the approximation of the motion field which can be computed from
time-varying image sequences (see figure (9b)). It provides many important visual cues [38].
It is possible to estimate the flight altitude from the optical flow observed in the downward
direction: faster optical flow indicates a lower flight altitude. Obstacles can be detected
in the forward direction by detecting expansion, or divergence, in the forward visual field.
Optical flow estimation methods are based on a) differential techniques (dense motion
field), where spatial and temporal variations of the image brightness at all pixels are
considered, b) phase methods, where the responses of filters to energy signals are used,
and c) matching techniques (sparse motion field), where the disparity of special image
points (features) between frames is estimated.
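Category a) can be illustrated by the classic Lucas-Kanade scheme, which solves the brightness-constancy equations over a small window by least squares (a simplified sketch, not any of the surveyed implementations; central differences and the window size are illustrative choices):

```python
import numpy as np

def lucas_kanade_point(I0, I1, x, y, win=3):
    """Differential (Lucas-Kanade) optical flow at pixel (x, y): solve the
    brightness-constancy normal equations Ix*u + Iy*v = -It by least squares
    over a (2*win+1)^2 window."""
    # Spatial gradients by central differences, temporal gradient by a frame
    # difference (np.roll wraps at the borders; evaluate away from edges).
    Ix = (np.roll(I0, -1, axis=1) - np.roll(I0, 1, axis=1)) / 2.0
    Iy = (np.roll(I0, -1, axis=0) - np.roll(I0, 1, axis=0)) / 2.0
    It = I1 - I0
    sl = (slice(y - win, y + win + 1), slice(x - win, x + win + 1))
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)
    b = -It[sl].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

Applied to a textured image translated by one pixel, the recovered flow is close to (1, 0); near-uniform regions suffer from the aperture problem, which is why dense differential methods need textured scenes or regularization.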
In [39, 40], the authors derived a form of the KF that uses the relationship between
vision-based measurements and the motion of the camera. The resulting implicit extended
Kalman filter (IEKF) can be used to recover the camera motion states. In [41], this work
was reused for an aircraft state-estimation problem by incorporating aircraft dynamics
into the IEKF framework. The resulting formulation partially estimated the aircraft
states but exhibited relatively slow convergence. Improvements have been demonstrated
by [42, 43], who also used an aircraft model. Unfortunately, accurate MAV models are
often not available within an aggressive flight regime, where the aerodynamics are
difficult to characterize.
Several techniques have utilized the kinematic relationship between camera motion
and the resulting optical flow to directly solve for unknown motion parameters using
constrained optimization. The techniques in [44, 45, 46] depend on at least partial
knowledge of the translational velocity for use in the optimization, and this knowledge
often depends on GPS measurements. In [47], the problem of estimating aircraft states
during a GPS-denied mission segment was addressed. An iterative optimization approach
is adopted to determine the angular rates and the wind-axis angles; no knowledge of
vehicle velocity is required. The coupled aircraft-camera kinematics are used to solve
for aircraft states in a similar fashion to previous efforts; however, velocity dependencies
are removed by decoupling the optical flow resulting from angular and translational
motion, respectively. Angular rate estimates are obtained first and used to set up a
simple linear least-squares problem for the aerodynamic angles. Performance of the
least-squares problem is further improved through the application of a weighting scheme
derived from parallax measurements.
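The decoupling step relies on the fact that the rotation-induced part of the motion field is independent of scene depth and can be predicted exactly from the angular rates. A sketch under the standard perspective motion-field model with unit focal length (a generic illustration, not the specific formulation of [47]; function names are assumptions):

```python
import numpy as np

def rotational_flow(x, y, omega):
    """Rotation-induced motion-field component at normalized image point
    (x, y) for angular rates omega = (wx, wy, wz), from the standard
    perspective motion-field equations with unit focal length."""
    wx, wy, wz = omega
    u = x * y * wx - (1.0 + x * x) * wy + y * wz
    v = (1.0 + y * y) * wx - x * y * wy - x * wz
    return np.array([u, v])

def derotate_flow(points, flows, omega):
    """Subtract the rotational component so the residual flow depends only
    on translation (scaled by inverse scene depth)."""
    return np.array([f - rotational_flow(px, py, omega)
                     for (px, py), f in zip(points, flows)])
```

For a purely rotating camera, derotation leaves a zero residual field; any remaining flow can then be attributed to translation and used in the least-squares stage.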
However, optical flow is inherently noisy, and obtaining dense and accurate optical flow
images is computationally expensive. Additionally, systems that rely on optical flow for
extracting range information need to discount the components of optical flow induced by
rotations of the aircraft and use only those components generated by the translational
motion of the vehicle. This requires either an often noisy numerical estimate of the roll,
pitch, and yaw rates of the aircraft, or additional apparatus for their explicit
measurement, such as a three-axis gyroscope. Furthermore, the range perceived from a
downward-facing camera or optical flow sensor depends only upon altitude, velocity, and
the aircraft's attitude [48].
Stereo vision provides an attractive approach to solving some of the problems of
providing guidance for autonomous aircraft operating in low-altitude or cluttered
environments [5, 48]. In [7], the optical flow of the image for each candidate horizon
line is calculated, and using these measurements from the perspective camera, the authors
are able to estimate the body rates of the aircraft. In [49], the heading of a small
fixed-pitch four-rotor helicopter is estimated. Heading estimates are computed using the
optical flow technique of phase correlation on images captured by a downward-facing
camera. The camera is fitted with an omnidirectional lens, and the images are transformed
into the log-polar domain before the main computational step.
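The phase-correlation step used in [49] can be illustrated in isolation: the translation between two images appears as the peak of the inverse FFT of the normalized cross-power spectrum (a generic sketch; the log-polar mapping that turns rotation into a shift is omitted):

```python
import numpy as np

def phase_correlation(a, b):
    """Integer translation between two same-size images via phase
    correlation. Applied to log-polar images, a shift along the angular
    axis corresponds to a rotation, i.e. a heading change."""
    F = np.conj(np.fft.fft2(a)) * np.fft.fft2(b)
    F /= np.maximum(np.abs(F), 1e-12)   # keep phase only
    corr = np.fft.ifft2(F).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # map peaks in the upper half of each axis to negative shifts
    return tuple(int(p) - s if p > s // 2 else int(p)
                 for p, s in zip(peak, corr.shape))
```

Because only the phase of the spectrum is kept, the peak is sharp and the method is robust to global illumination changes.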
4.3 Optical flow from stereo vision
In [5, 48, 50], the authors proposed a stereo vision system built from two non-central
catadioptric cameras. The profile of the mirror is designed to ensure that equally spaced
points on the ground, on a line parallel to the camera's optical axis, are imaged to
points that are equally spaced in the camera's image plane. However, they did not use
physical mirrors; instead, they used high-resolution video cameras equipped with
wide-angle fisheye lenses and simulated the imaging properties of the mirrors by means of
software lookup tables. Given the measured disparity surface from the optical flow, the
attitude (roll and pitch) and altitude can be estimated by iteratively fitting the
modelled surface to the measurements. They propose to enhance their method by estimating
attitude and altitude with respect to an assumed ground plane by reprojecting the
disparity points into 3D coordinates. In [51], a technique is presented for estimating
the aerodynamic attitude in the presence of dynamic obstacles. This technique relies on
optical flow and stereo vision to remove dynamic objects from the static background. The
resulting flow field is used for attitude computation from the calculated flow centroids.
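The proposed enhancement, estimating attitude and altitude from reprojected 3D points, amounts to fitting a ground plane. A sketch assuming a roughly planar ground and illustrative axis/sign conventions (the mapping from plane slopes to roll and pitch is an assumption, not the cited authors' formulation):

```python
import numpy as np

def ground_plane_attitude(points_3d):
    """Fit the plane z = a*x + b*y + c to reprojected 3D ground points by
    linear least squares, then read roll, pitch, and altitude off the plane
    parameters. Axis conventions here are illustrative assumptions."""
    A = np.column_stack([points_3d[:, 0], points_3d[:, 1],
                         np.ones(len(points_3d))])
    (a, b, c), *_ = np.linalg.lstsq(A, points_3d[:, 2], rcond=None)
    pitch = np.arctan(a)       # slope along x (assumed convention)
    roll = np.arctan(b)        # slope along y (assumed convention)
    altitude = abs(c) / np.sqrt(1.0 + a * a + b * b)  # point-plane distance
    return roll, pitch, altitude
```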
5 Conclusion
Any UAV may fly at low, middle, or high altitude. We believe that omnidirectional
sensors should always be used, because either the horizon will always be visible (middle
and high altitudes) or vanishing-point directions will be available (low altitudes). If
the horizon is visible, then the attitude should be estimated based on it; we proposed a
simpler segmentation and horizon-detection method based on polarization which can be used
for this purpose. In urban environments, techniques based on vanishing points should be
used. If obstacle