Real-Time Monocular Obstacle Avoidance using Underwater Dark Channel Prior

Paulo Drews-Jr 1,2,3, Emili Hernández 1, Alberto Elfes 1, Erickson R. Nascimento 2, Mario Campos 2
Abstract— In this paper we propose a new vision-based obstacle avoidance strategy using the Underwater Dark Channel Prior (UDCP) that can be applied to any Unmanned Underwater Vehicle (UUV) equipped with a simple monocular camera and minimal on-board processing capabilities. For each incoming image, our method first computes a relative depth map to estimate the obstacles nearby. Then, the map is segmented and the most promising Region of Interest (RoI) is identified. Finally, an escape direction is computed within the RoI and a control action is performed accordingly to avoid the obstacles. We tested our approach on a video sequence from a natural environment and compared it against a state-of-the-art method, showing better performance, especially under changing light conditions. We also provide online results on a low-cost Remotely Operated Vehicle (ROV) in a controlled environment.
I. INTRODUCTION
In the last few years there has been an increase in the number of Unmanned Underwater Vehicles (UUVs) available to the general public. These modern vehicles differ from traditional commercially available ones and from those built for research purposes, as they tend to be small, affordable and with highly limited sensing capabilities. One example is the OpenROV [1], which carries a color camera in its standard configuration.
Vision-based sensors have been extensively used in many
underwater robotic applications such as habitat and animal
classification [2], mapping [3], 3D scene reconstruction [4],
visualization [5], docking [6], tracking [7], inspection [8]
and robot localization [9]. However, very few works have
addressed the vision-based obstacle avoidance problem in
the underwater domain as it is usually solved with sonar-
based sensors [10]. The work by Roser et al. [11] is based
on binocular vision. The main limitation of their method is
the requirement of a calibrated stereo pair and the associated
computational cost. Recently, Rodríguez-Telles et al. [12]
proposed a method to avoid obstacles using a monocular camera that requires an offline learning phase and superpixel-based segmentation [13]. The training step is used to obtain
This research is partly supported by CNPq, CAPES, FAPEMIG. This paper also represents a contribution of the INCT-Mar COI funded by CNPq Grant Number 610012/2011-8.
1 E. Hernández and A. Elfes are with the Autonomous Systems Laboratory, Data61-CSIRO, Brisbane, Australia. [Emili.Hernandez, Alberto.Elfes]@csiro.au
2 E.R. Nascimento and M. Campos are with the Computer Vision and Robotics Laboratory of the Dep. de Ciência da Computação, Univ. Federal de Minas Gerais - UFMG, Belo Horizonte, Brazil. {erickson,mario}@dcc.ufmg.br
3 P. Drews-Jr is also with NAUTEC, Intelligent Robotics and Automation Group, Univ. Federal do Rio Grande - FURG, Rio Grande - Brazil. paulodrews@furg.br
the water color, which is then generalized to the whole dataset. The main drawback of this approach is its strong dependence on the medium conditions; it often requires specific training and manual tuning of the algorithm's parameters for each dataset.
In this paper, we propose a real-time monocular obstacle avoidance method suitable for small ROVs. Our approach estimates a depth map using statistical priors [14], [15] and a physical underwater light attenuation model. Unlike images captured in air, underwater images carry information about scene depth because the medium effects are depth dependent. We exploit this property using a statistical prior to estimate the depth map. The map is then segmented using an adaptive threshold, and a set of Regions of Interest (RoIs) is identified based on an ellipse fitting technique. We also compute an escape direction, using the center of mass of the most promising RoI, that avoids collision with the nearby obstacles. Finally, we use a simple but effective control strategy to turn the direction vector between the robot and the escape direction into thruster setpoints encoded as Pulse Width Modulation (PWM) signals.
The main contribution of this work is an underwater obstacle avoidance method that achieves real-time performance using monocular images. We apply statistical model-based depth map estimation for obstacle avoidance purposes. We also present results from offline experiments in real oceanic conditions and compare them against Rodríguez-Telles et al. [12], showing better results. Furthermore, we show that our algorithm achieves real-time performance on an OpenROV, a low-cost ROV equipped with a single camera, in a controlled environment.
The remainder of the paper is organized as follows: Section II describes the proposed obstacle avoidance method; Section III evaluates the methodology using experimental field data; finally, in Section IV, we summarize the paper's contributions and outline future research directions.
II. METHODOLOGY
Our approach uses images from a single monocular color camera and generates a depth map of the scene based on a light attenuation model and statistical priors. Light is absorbed and scattered by the medium before reaching the camera, and understanding these effects allows us to estimate the depth map. The depth map is segmented into RoIs, which allow us to compute an escape direction. This direction is turned into control setpoints and fed directly to the controller of each thruster.
[Fig. 1 block diagram: Video Stream → Monocular Depth Estimation → Depth Map → Segmentation → RoIs → Escape Direction and Control → PWMs]
Fig. 1: Method overview. Input images from a video stream are used to estimate depth maps, which are segmented into RoIs. These allow us to compute the escape direction and to control the vehicle.
Fig. 1 shows the main steps of our method, and Fig. 2 depicts the intermediate steps for a single frame.
A. Monocular Depth Map Estimation
1) Physical Underwater Light Attenuation Model: Underwater images are the result of a complex interaction between light rays, the medium and the scene structure. Jaffe and McGlamery proposed one of the most widely used models to describe this interaction [16], [17], in which the image intensity is composed of three terms: the direct illumination ($E_d$), the forward scattering ($E_{fs}$) and the backscattering ($E_{bs}$):

$E_T = E_d + E_{fs} + E_{bs}$.    (1)
Part of the light radiated from objects is scattered and absorbed by the medium, and the remaining portion, called direct illumination, reaches the sensor. Direct illumination [17] is formulated as:

$E_d = J e^{-\eta d} = J t_r$,    (2)

where $J$ is the scene radiance, $d$ is the depth, and $\eta$ is the attenuation coefficient. The attenuation coefficient $\eta$ is composed of the scattering and absorption coefficients, both wavelength dependent [18]. $t_r$ is the medium transmission, modeled as the exponential term.
Since the backscattering $E_{bs}$ is the main cause of image contrast degradation in most cases, the forward scattering $E_{fs}$ is usually neglected [19]. The backscattering does not originate from the object's radiance; it results from the interaction between the sources of ambient illumination and particles dispersed in the medium. A simplified model for the $E_{bs}$ component can be described as:

$E_{bs} = A(1 - e^{-\eta d}) = A(1 - t_r)$,    (3)

where $A$ is the global light, which is wavelength dependent. It is estimated by finding the brightest pixel in the dark channel [20]. The other terms are the same as for the direct component.
The final model describing the formation of an image $I$ acquired in an underwater homogeneous medium with natural light can be formulated as:

$I(x) = J(x) t_r(x) + A(1 - t_r(x))$,    (4)

where $x$ denotes the pixel coordinates.
2) Transmission Prior and Depth Estimation: Inverting the image formation model described in Eq. 4 is an ill-posed problem, since it is not possible to solve for the depth ($d$) and the true appearance of the scene ($J$) without prior knowledge about the scene.

He et al. [21] proposed the Dark Channel Prior (DCP), a statistical prior based on the observation that, in natural images, at least one color channel exhibits mostly dark intensities within a square patch. It is difficult to validate this assumption and the corresponding statistical correlation in underwater images, since it is impossible to obtain real underwater images without the medium. Despite this difficulty, the main assumption stated in [21] remains plausible: at least one color channel has some pixels whose intensity is close to zero. These low-intensity pixels are due to shadows, to objects or surfaces where at least one color channel has low intensity, such as fish, algae or corals, and to dark objects or surfaces such as rocks or dark sediment.
However, the wavelength independence claim is false in most cases due to the high absorption rates in the red channel in typical oceanic conditions. Hence, we adopt a prior called the Underwater Dark Channel Prior (UDCP) [14], [15]:

$J^{UDCP}(x) = \min_{y \in \Omega(x)} \left( \min_{c \in \{G,B\}} J^c(y) \right)$.    (5)
Considering Eq. 4 and the UDCP assumption, it is possible to isolate the transmission $\tilde{t}_r$ in a local patch $\Omega$. Applying the minimum operation to both sides, we can estimate $\tilde{t}_r$ from the image $I$ and the global light $A$ as:

$\tilde{t}_r(x) = 1 - \min_{y \in \Omega(x)} \left( \min_{c \in \{G,B\}} \frac{I^c(y)}{A^c} \right)$,    (6)

where the global light $A$ is estimated by finding the brightest pixel in $J^{UDCP}$ (Eq. 5). [15] provides an experimental verification of the UDCP assumption and more details about its applicability.
We define the square patch $\Omega = 15 \times 15$ for images of $640 \times 360$ pixels. The minimum operator is similar to the classical erosion morphological operator. Thus, we compute the minimum filter using the fast operator proposed in [22], which has linear complexity with respect to the image size. Fig. 2b depicts an example of a transmission map $t_r$.
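For concreteness, Eqs. 5-6 map to a few lines of C++ with OpenCV, the toolchain used in our implementation. The snippet below is a minimal sketch, assuming a CV_32FC3 BGR image normalized to [0, 1] and an already-estimated global light $A$; function and variable names are illustrative, not taken from our code.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of Eqs. (5)-(6): UDCP transmission from the G and B channels only.
cv::Mat estimateTransmission(const cv::Mat& bgr, const cv::Scalar& A,
                             int patch = 15) {
    std::vector<cv::Mat> ch;
    cv::split(bgr, ch);                  // ch[0] = B, ch[1] = G, ch[2] = R
    cv::Mat b = ch[0] / A[0];            // I^c(y) / A^c for c = B
    cv::Mat g = ch[1] / A[1];            // I^c(y) / A^c for c = G
    cv::Mat minGB;
    cv::min(b, g, minGB);                // inner minimum over c in {G, B}
    // The outer minimum over the square patch Omega(x) is a grayscale
    // erosion with a box kernel, which OpenCV computes efficiently (cf. [22]).
    cv::Mat box = cv::getStructuringElement(cv::MORPH_RECT,
                                            cv::Size(patch, patch));
    cv::erode(minGB, minGB, box);
    return 1.0 - minGB;                  // t_r(x) of Eq. (6)
}
```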
Fig. 2: Intermediate results of the proposed method: a) input image; b) transmission map; c) depth map; d) segmented RoIs, with the largest one in blue; e) direction of escape (circle) on the selected RoI (ellipse) with thruster setpoints.

Based on the transmission map, we can estimate the depth map $D$, up to the unknown attenuation coefficient $\eta$, as:

$D(x) = \eta d(x) = -\log t_r(x)$.    (7)

In the actual implementation, the log operator is computed using lookup tables (LUTs) to improve performance. Differently from image restoration works, we do not perform any refinement procedure due to time constraints. The depth map obtained is adequate for robotic tasks such as obstacle avoidance. However, some filtering operations are performed to improve the segmentation step: a median filter with a $5 \times 5$ pixel kernel, and a Gaussian filter with the same size as $\Omega$. Fig. 2c illustrates the final depth map.
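A matching sketch of Eq. 7 together with the filtering above; we call cv::log with a small guard value where the actual implementation uses a LUT, and the epsilon is our assumption, not a value from the paper.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of Eq. (7) plus the pre-segmentation filtering: D = -log t_r,
// followed by a 5x5 median filter and a patch-sized Gaussian filter.
cv::Mat estimateDepth(const cv::Mat& tr, int patch = 15) {
    cv::Mat d, filtered;
    cv::log(cv::max(tr, 1e-3), d);       // guard against log(0)
    d = -d;                              // D(x) = eta * d(x) = -log t_r(x)
    cv::medianBlur(d, filtered, 5);      // 5x5 median filter
    cv::GaussianBlur(filtered, filtered, cv::Size(patch, patch), 0);
    return filtered;
}
```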
B. Segmentation
For each incoming depth map, we first perform a binary segmentation. The threshold level is estimated as a fraction of the global light $A$. This simple approach is robust to light variation because $A$ changes according to the illumination of each frame. Thus, the segmentation is partially invariant to illumination changes [21].
Similar to [12], we assume that a RoI is safe for the robot if it is possible to fit within it a circle of radius $r$. Therefore, we apply an erosion operation to the segmented pixels using a circular kernel with radius $r$. The effect of this operation is similar to the obstacle inflation typically performed in path planning methods [23]. RoIs are then extracted as connected components of the segmented pixels. Fig. 2d shows the largest RoI, in blue, and the others, in green.
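One possible realization of this step is sketched below. The threshold fraction k, the binarization polarity (far pixels taken as free) and the reading of RoIs as external contours of the eroded free-space mask are our assumptions.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of Sec. II-B: threshold the depth map at a fraction k of the global
// light A, erode the free space with a circular kernel of radius r (the dual
// of inflating obstacles, as in path planning [23]), and extract RoIs as
// connected regions.
std::vector<std::vector<cv::Point>> extractRoIs(const cv::Mat& depth,
                                                double A, double k, int r) {
    cv::Mat freeMask;
    cv::threshold(depth, freeMask, k * A, 255.0, cv::THRESH_BINARY);
    freeMask.convertTo(freeMask, CV_8U);
    cv::Mat disk = cv::getStructuringElement(cv::MORPH_ELLIPSE,
                                             cv::Size(2 * r + 1, 2 * r + 1));
    cv::erode(freeMask, freeMask, disk); // a safe RoI must admit an r-circle
    std::vector<std::vector<cv::Point>> rois;
    cv::findContours(freeMask, rois, cv::RETR_EXTERNAL,
                     cv::CHAIN_APPROX_SIMPLE);
    return rois;
}
```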
C. Escape Direction Estimation and Control Scheme
1) Escape Direction: The RoIs obtained are sorted according to their size, and those smaller than the area of the circle with radius $r$ are removed. Then, we fit an ellipse to the largest RoI using least-squares optimization [24]. Based on the ellipse shape, a circle of radius $r$ is fitted within the ellipse at the RoI center of mass (see Fig. 2e). If the circle is contained in the ellipse, it is accepted as an escape direction. Otherwise, this process is repeated for the next valid RoI until a suitable escape direction is found. The radius $r$ is estimated empirically, as its value depends on the camera, the robot and the environment.
As proposed in [12], the pitch angle is set to an upward direction, based on the camera's field of view, in case a valid escape direction is not found.

To prevent sudden changes caused by estimating the escape direction in each frame independently, we generate a stable escape direction by averaging the current and the previous valid values. The robustness of the method is not affected despite the delay introduced by this filtering.
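The selection loop can be sketched as follows. The containment test is simplified here to a comparison against the ellipse's minor semi-axis, a conservative stand-in for the full circle-in-ellipse check, and the previous valid direction is kept as state for the averaging.

```cpp
#include <algorithm>
#include <vector>
#include <opencv2/opencv.hpp>

// Sketch of Sec. II-C.1: sort RoIs by area, fit an ellipse to the largest
// valid one [24], accept its center if a circle of radius r fits, and
// average with the previous valid direction for stability.
bool escapeDirection(std::vector<std::vector<cv::Point>> rois, int r,
                     cv::Point2f& previous, cv::Point2f& escape) {
    std::sort(rois.begin(), rois.end(),
              [](const std::vector<cv::Point>& a,
                 const std::vector<cv::Point>& b) {
                  return cv::contourArea(a) > cv::contourArea(b);
              });
    for (const auto& roi : rois) {
        if (roi.size() < 5 || cv::contourArea(roi) < CV_PI * r * r)
            continue;                    // too small to contain an r-circle
        cv::RotatedRect e = cv::fitEllipse(roi);
        if (0.5f * std::min(e.size.width, e.size.height) >= r) {
            escape = 0.5f * (e.center + previous);  // stable escape direction
            previous = escape;
            return true;
        }
    }
    return false;  // no valid RoI: pitch the vehicle upward, as in [12]
}
```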
2) Reactive Control Scheme: Given a valid and stable escape direction, the thruster setpoints are computed based on the position error $e = (e_x, e_y, e_z)$ with respect to the center of the image $P_c = (c_x, c_y)$:

$e_x = D_{RoI}$,
$e_y = \frac{x_{RoI} - c_x}{c_x}$,
$e_z = \frac{y_{RoI} - c_y}{c_y}$,    (8)

where $D_{RoI}$ is the average depth in the selected RoI, and $p_{RoI} = (x_{RoI}, y_{RoI})$ is the escape direction in the image reference frame. Based on these references, we implemented a P controller for each degree of freedom of the OpenROV. The controllers are responsible for the heave and surge motions and the yaw rotation:

$u_s = K_{p_s} \cdot e_x$,
$u_y = K_{p_y} \cdot e_y$,
$u_h = K_{p_h} \cdot e_z$,    (9)

where $K_{p_s}$, $K_{p_y}$ and $K_{p_h}$ are their proportional gains. In the actual implementation, the control signals are properly scaled to the range used by the Electronic Speed Controllers (ESCs).

The output signal of the depth controller $u_h$ is fed directly to the top thruster because it only affects the heave motion of the vehicle. The horizontal thrusters are driven by a combination of the signals from the $u_s$ and $u_y$ controllers. We add these control signals, but with a different sign of $u_y$ for each thruster [25].
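The whole control step of Eqs. 8-9, including the thruster mixing, can be sketched as below. The gains, the saturation and the assumed 1100-1900 µs PWM range are placeholders, not the values used onboard.

```cpp
#include <algorithm>
#include <opencv2/opencv.hpp>

// Sketch of Eqs. (8)-(9): P controllers on the normalized errors, with
// opposite signs of u_y on the two horizontal thrusters [25].
struct Thrusters { double port, starboard, vertical; };  // PWM in microseconds

Thrusters reactiveControl(double dRoI, cv::Point2f pRoI, cv::Size image,
                          double Kps, double Kpy, double Kph) {
    const double cx = 0.5 * image.width, cy = 0.5 * image.height;
    const double ex = dRoI;               // average depth in the selected RoI
    const double ey = (pRoI.x - cx) / cx; // horizontal offset in [-1, 1]
    const double ez = (pRoI.y - cy) / cy; // vertical offset in [-1, 1]
    const double us = Kps * ex;           // surge
    const double uy = Kpy * ey;           // yaw
    const double uh = Kph * ez;           // heave
    auto pwm = [](double u) {             // scale to an assumed ESC range
        return 1500.0 + 400.0 * std::max(-1.0, std::min(1.0, u));
    };
    return { pwm(us + uy), pwm(us - uy), pwm(uh) };
}
```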
III. EXPERIMENTAL RESULTS
We evaluated our algorithm on an offline sequence acquired in a real oceanic environment, and tested its online performance using a standard OpenROV [1] equipped with a single camera in a controlled environment. In both the offline and online settings, we compare our method against the monocular obstacle avoidance method proposed by Rodríguez-Telles et al. [12].
For the sake of a fair comparison, all methods were implemented in standard C++ with OpenCV [26] for efficient image processing, and with socket communication for fast communication with the robot. We used two processing units: a notebook with an Intel i7-4510U @ 2.0 GHz CPU and 8 GB of RAM, and the standard OpenROV v2.7 onboard computer, a BeagleBone Black (BBB) with a Cortex-A8 @ 1 GHz CPU and 512 MB of RAM. The experiments run on the notebook require the BBB to acquire the images and transmit them through the tether.
The method of [12] was implemented as described in the paper, using a superpixel segmentation algorithm based on a modified version of the Simple Linear Iterative Clustering (SLIC) algorithm [13]. This modified SLIC was coded based on an open source project (https://github.com/PSMM/SLIC-Superpixels). We also implemented the training step following the offline approach proposed by the authors, in which the user indicates, in some training images, the superpixels corresponding to the RoI.
Although all the evaluated sequences were acquired at 720p resolution, we rescaled the images to $640 \times 360$ pixels to achieve real-time performance in the control step (≈10 Hz). This resolution was enough to maintain the robustness of our system while running at a high frame rate.
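Putting the pieces together, a hypothetical per-frame loop over the rescaled stream could look as follows, reusing the sketches from Section II. Every numeric parameter here (threshold fraction, radius, gains) is a placeholder, not a value tuned on the vehicle.

```cpp
#include <algorithm>
#include <vector>
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap(0);                 // video source is an assumption
    cv::Point2f prev(320.f, 180.f), escape;  // start from the image center
    cv::Mat frame, bgr;
    while (cap.read(frame)) {
        cv::resize(frame, frame, cv::Size(640, 360));
        frame.convertTo(bgr, CV_32FC3, 1.0 / 255.0);
        // Global light A: image color at the brightest pixel of the UDCP
        // dark channel (Eq. 5).
        std::vector<cv::Mat> ch;
        cv::split(bgr, ch);
        cv::Mat dark;
        cv::min(ch[0], ch[1], dark);         // min over c in {G, B}
        cv::erode(dark, dark, cv::getStructuringElement(
                                   cv::MORPH_RECT, cv::Size(15, 15)));
        cv::Point p;
        cv::minMaxLoc(dark, nullptr, nullptr, nullptr, &p);
        const cv::Vec3f a = bgr.at<cv::Vec3f>(p);
        const cv::Scalar A(a[0], a[1], a[2]);
        // Pipeline; mapping A to a depth threshold is an assumption.
        cv::Mat depth = estimateDepth(estimateTransmission(bgr, A));
        auto rois = extractRoIs(depth, std::max(a[0], a[1]), 0.7, 40);
        if (escapeDirection(rois, 40, prev, escape)) {
            // Whole-image mean stands in for the RoI average depth D_RoI.
            Thrusters t = reactiveControl(cv::mean(depth)[0], escape,
                                          bgr.size(), 1.0, 0.5, 0.5);
            (void)t;  // send t.port / t.starboard / t.vertical to the ESCs
        }
    }
    return 0;
}
```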
A. Offline Experiments

We carried out offline experiments using a real oceanic sequence obtained with a Seabotix LBV300-5 ROV equipped with a GoPro Hero3+ Black Edition camera. The images were acquired over a coral reef with a sandy seabed, on Brazil's northeast coast, at approximately 10 m water depth. The video sequence shows challenging conditions such as floating sediment, moving fish, and illumination variation such as sun flicker in a narrow-passage scene.
Fig. 3 shows the offline experiment results for some key frames, with the challenging situations stated before in the first row (figs. 3a-3d). The results of our method are depicted in the second row (figs. 3e-3h), and the results obtained with the Rodríguez-Telles et al. [12] method are in the last row (figs. 3i-3l).
For our approach, we show the estimated depth map in gray scale, with the fitted ellipse (in cyan) and the escape direction (in yellow). Although the illumination changes across the scene, the RoI size is similar in all images, since the adaptive threshold is based on the value of the global light $A$. As stated before, it is estimated by finding the intensity of the brightest pixel in the underwater dark channel (Eq. 5), and it changes according to the scene illumination.
The results obtained with [12] depict the superpixel segmentation and its classification. Blue dots indicate superpixels classified as RoI and red dots represent obstacles. The escape direction is also shown, as a yellow circle. In all images, the method had some difficulty discriminating between free and occupied areas, especially on the coral reef on the right side, where many superpixels are shown as free. This causes the estimated escape direction to be unsafe in figs. 3j and 3l.
Table I shows the algorithms' running times. With the current implementation, our algorithm is ≈25× faster than [12]. Our algorithm can run at up to ≈30 Hz, while we could only achieve ≈1.3 Hz with [12], which is not enough for the control loop. Their execution time is highly dependent on the superpixel segmentation method (≈90% of the time), whereas our algorithm is limited by the erosion operation with a circular kernel, responsible for the obstacle inflation, which takes ≈35% of the time. The performance of our method on the BBB board is still limited to ≈1 Hz.

TABLE I: Comparative analysis of the proposed method against the state-of-the-art in terms of running time per frame in the offline experiments.

Method                         | Average Time (s) | Std. Deviation (s)
Rodríguez-Telles et al. [12]   | 0.7604           | 0.0087
Our Method - Notebook          | 0.0295           | 0.0030
Our Method - BBB               | 1.03             | 0.032
B. Online Experiments

We also performed an online evaluation of the proposed algorithm with an OpenROV v2.7 [1], a small tethered ROV with three thrusters: two in the horizontal plane for surge motion and yaw rotation, and a vertical one for heave motion. The vehicle is equipped with a Genius KYE F100 ultra-wide-angle full-HD webcam. We assembled our standard unit without the laser pointers due to safety regulations. The OpenROV offers a low-cost alternative to traditional ROV platforms when operating in shallow water with no water currents.
We obtained the experimental results in a small circular pool of 1.5 m radius and 0.5 m water depth. Several marking cones were used as obstacles (see Fig. 4). The small water depth limited the escape directions the method is able to compute. Hence, we noted that the method had a tendency to move the ROV to the surface in these experiments. We reduced the proportional gain of the heave motion controller to compensate for this effect.
Fig. 5 shows the results of our algorithm and of the method of [12] during three different experiments. Our approach (figs. 5g-5l) computed the depth map and estimated a valid escape direction. The depth map estimation was not accurate because of the white floor. This is due to the limitation of the statistical prior, which assumes darkness in at least one color channel of the image. The water surface can also be misclassified as free space due to reflection, which generates a mirror effect of the white floor of the pool. The method therefore tried to compensate by increasing the pitch of the vehicle. The results of the online experiments are shown in the attached video.
Our implementation of [12] was unable to correctly detect free and occupied areas. The obstacles were successfully detected only in images where they are in the center and near the camera, as well as in some areas in the vicinity of the center (figs. 5m, 5o, and 5r). Despite the correct detection in these cases, the algorithm was not able to compute a valid escape direction (highlighted with a red circle). Due to its limited capability to correctly identify occupied areas, the method estimated the escape direction at the center of the image, i.e., the center of mass of the RoI as proposed in [12], and some colliding directions were incorrectly accepted as valid escape directions, e.g., Fig. 5n.
Fig. 3: Offline results for the real oceanic sequence: a-d) samples of the collected frames; e-h) results of our method showing the depth map, the fitted ellipse in the selected RoI and the escape direction; i-l) results of our implementation of the Rodríguez-Telles et al. method, where blue dots indicate superpixels classified as free areas, red dots represent obstacles, and the escape direction is also shown.
Fig. 4: Experimental setup: an OpenROV platform in the pool where we conducted field tests. Obstacles (red cones) were introduced to evaluate the performance of the algorithms.
TABLE II: Comparative analysis of the proposed method against the state-of-the-art in terms of running time per frame in the online experiments.

Method                         | Average Time (s) | Std. Deviation (s)
Rodríguez-Telles et al. [12]   | 0.7698           | 0.0141
Our Method - Notebook          | 0.0396           | 0.0081
Table II shows the running times of the algorithms in the online experiments. Similarly to the offline case, our algorithm was ≈19× faster than [12], with a smaller standard deviation. The difference in our method's execution time with respect to the offline experiments is due to the size of the RoIs, which increases the processing power requirements.
IV. CONCLUSIONS AND FUTURE WORK
This paper proposed a novel obstacle avoidance method for underwater environments using a single monocular camera. For each incoming frame, and with no previous information about the environment, our approach computes an estimate of the depth map with respect to the camera using statistical priors and a physical underwater light attenuation model. After identifying the free areas in the depth map with an adaptive threshold, a fast segmentation method estimates the most promising RoI and computes the escape direction. This is turned into a reactive control action to avoid obstacles. We compared our approach against a state-of-the-art method on an offline dataset taken in a natural environment and in online experiments using an OpenROV platform in a controlled environment.
Future work will focus on evaluating the accuracy of the depth map estimation under different illumination and water conditions. We will also exploit the depth information to find an adaptive radius for a safer escape direction and to provide multi-object segmentation. We will also install laser pointers or a simple sonar-based range finder to turn the depth map into actual distances. Furthermore, we will improve the code so that it can run in real time on the OpenROV onboard computer.
Fig. 5: Results of the online experiments using an underwater vehicle in a controlled environment: a-f) two sample frames for each of the three experiments; g-l) results obtained using our algorithm; m-r) results obtained with the Rodríguez-Telles et al. method.

ACKNOWLEDGMENTS

We thank the colleagues from the Autonomous Systems Laboratory at CSIRO for hosting Paulo Drews-Jr during his sandwich program (sponsored by CAPES grant no. 99999.003584/2014-03), both for the prolific discussions and for their kind support in providing equipment and the necessary infrastructure for some of the experiments in this work. We also thank VeRLab-UFMG and NAUTEC-FURG for providing equipment and assistance with part of the experimental data. This research is also partly supported by CNPq, CAPES and FAPEMIG.
REFERENCES
[1] E. Stackpole and D. Lang, "OpenROV - Underwater Exploration Robots," http://www.openrov.com/, accessed July 30, 2016.
[2] F. Codevilla, S. Botelho, N. Duarte, S. Purkis, A. Shihavuddin, R. Garcia, and N. Gracias, "Geostatistics for context-aware image classification," in Computer Vision Systems, L. Nalpantidis, V. Krüger, J.-O. Eklundh, and A. Gasteratos, Eds., vol. 9163 of LNCS, pp. 228–239. Springer, 2015.
[3] R. Campos, R. Garcia, P. Alliez, and M. Yvinec, "A surface reconstruction method for in-detail underwater 3D optical mapping," IJRR, vol. 34, no. 1, pp. 64–89, 2015.
[4] A. Concha, P. Drews-Jr, M. Campos, and J. Civera, "Real-time localization and dense mapping in underwater environments from a monocular sequence," in IEEE/OES Oceans, 2015.
[5] P. Drews-Jr, E. Nascimento, M. Campos, and A. Elfes, "Automatic restoration of underwater monocular sequences of images," in IEEE/RSJ IROS, 2015, pp. 1058–1064.
[6] F. Maire, D. Prasser, M. Dunbabin, and M. Dawson, "A vision based target detection system for docking of an autonomous underwater vehicle," in ACRA, 2009, pp. 1–7.
[7] P. Drews-Jr, E. Nascimento, A. Xavier, and M. Campos, "Generalized optical flow model for scattering media," in ICPR, 2014, pp. 3999–4004.
[8] F. Hover, R. Eustice, A. Kim, B. Englot, H. Johannsson, M. Kaess, and J. Leonard, "Advanced perception, navigation and planning for autonomous in-water ship hull inspection," IJRR, vol. 31, no. 12, pp. 1445–1464, 2012.
[9] S. Botelho, P. Drews-Jr, G. Oliveira, and M. Figueiredo, "Visual odometry and mapping for underwater autonomous vehicles," in IEEE LARS, 2009, pp. 1–6.
[10] Y. Petillot, I. Tena Ruiz, and D. M. Lane, "Underwater vehicle obstacle avoidance and path planning using a multi-beam forward looking sonar," IEEE JOE, vol. 26, no. 2, pp. 240–251, 2001.
[11] M. Roser, M. Dunbabin, and A. Geiger, "Simultaneous underwater visibility assessment, enhancement and improved stereo," in IEEE ICRA, 2014, pp. 3840–3847.
[12] F. Rodríguez-Telles, R. Pérez-Alcocer, A. Maldonado-Ramírez, L. Torres-Méndez, B. Dey, and E. Martínez-García, "Vision-based reactive autonomous navigation with obstacle avoidance: Towards a non-invasive and cautious exploration of marine habitat," in IEEE ICRA, 2014, pp. 3813–3818.
[13] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, "SLIC superpixels compared to state-of-the-art superpixel methods," IEEE TPAMI, vol. 34, no. 11, pp. 2274–2282, 2012.
[14] P. Drews-Jr, E. Nascimento, F. Codevilla, S. Botelho, and M. Campos, "Transmission estimation in underwater single images," in IEEE ICCVw, 2013, pp. 825–830.
[15] P. Drews-Jr, E. Nascimento, S. Botelho, and M. Campos, "Underwater depth estimation and image restoration based on single images," IEEE CG&A, vol. 36, no. 2, pp. 50–61, 2016.
[16] B. McGlamery, "A computer model for underwater camera systems," in SPIE 0208, Ocean Optics VI, 1980, vol. 208, pp. 221–231.
[17] J. Jaffe, "Computer modeling and the design of optimal underwater imaging systems," IEEE JOE, vol. 15, no. 2, pp. 101–111, 1990.
[18] C. D. Mobley, Light and Water: Radiative Transfer in Natural Waters, Academic Press, 1994.
[19] Y. Schechner and N. Karpel, "Recovery of underwater visibility and structure by polarization analysis," IEEE JOE, vol. 30, no. 3, pp. 570–587, 2005.
[20] F. Codevilla, S. Botelho, P. Drews-Jr, N. Duarte Filho, and J. Gaya, "Underwater single image restoration using dark channel prior," in NAVCOMP, 2014, pp. 18–21.
[21] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," in IEEE CVPR, 2009, pp. 1956–1963.
[22] M. van Herk, "A fast algorithm for local minimum and maximum filters on rectangular and octagonal kernels," PRL, vol. 13, no. 7, pp. 517–521, 1992.
[23] H. Choset, K. M. Lynch, S. Hutchinson, G. A. Kantor, W. Burgard, L. E. Kavraki, and S. Thrun, Principles of Robot Motion: Theory, Algorithms, and Implementations, MIT Press, 2005.
[24] W. Gander, G. H. Golub, and R. Strebel, "Least-squares fitting of circles and ellipses," BIT Numer. Math., vol. 34, no. 4, pp. 558–578, 1994.
[25] V. N. Kuhn, P. Drews-Jr, S. Gomes, M. Cunha, and S. Botelho, "Automatic control of a ROV for inspection of underwater structures using a low-cost sensing," JBSMSE, vol. 37, no. 1, pp. 361–374, 2015.
[26] G. Bradski, "The OpenCV library," Dr. Dobb's Journal of Software Tools, 2000.