
Abstract

In underwater environments, the scattering and absorption phenomena affect the propagation of light, degrading the quality of captured images. In this work, the authors present a method based on a physical model of light propagation that takes into account the most significant effects to image degradation: absorption, scattering, and backscattering. The proposed method uses statistical priors to restore the visual quality of the images acquired in typical underwater scenarios.
Underwater Depth Estimation and Image Restoration Based on Single Images
Paulo Drews-Jr, Erickson R. Nascimento, Silvia Botelho and Mario Campos
Images acquired in underwater environments undergo a degradation process due to the inherent
complexity of the interaction of light with the medium. Such interaction includes numerous phenomena
such as multipath refraction and reflection of light rays on particles in suspension with dynamic motion
patterns. The complexity of a possibly complete model may render it unfeasible for applications that require frame-rate performance. Thus, we have adopted a simplified model that
takes into account the most significant effects to image degradation, i.e. absorption, scattering and
backscattering. In this work, we present a restoration method based on a physical model of light
propagation along with the use of statistical priors of the scene. Our approach is able to simultaneously recover
the medium transmission and the scene depth as well as to restore the visual quality of the images
acquired in typical underwater scenarios.
Introduction
An increasing number of real-world applications are related to underwater environments, among which
are fisheries, environmental and structural monitoring and inspections, and oil and gas exploration.
Petroleum and natural gas are still the most important sources of energy in the world and researchers
have recently discovered relevant oil and gas reserves along the coast of Brazil and Africa underneath
what is known as pre-salt rock formations. Pre-salt layers of rocks in the earth's crust are composed only
of petrified salt covering large areas on the ocean floor. Recent findings have unveiled that over millions
of years large amounts of organic matter have been deposited beneath the layers of pressed salt between
the west coast of Africa and the eastern shores of South America. This organic matter has been
transformed into oil, which in many areas is engulfed with gas.
In Brazil, the pre-salt area spans a range of about 800 kilometers along the Brazilian coast. Geological
studies have estimated that the oil and gas reserves in that area are in the order of 80 billion barrels,
which would place Brazil as the sixth largest holder of reserves in the world behind Saudi Arabia, Iran,
Iraq, Kuwait and the United Arab Emirates.
Exploring and working in the pre-salt reserves presents important technological challenges, including the ability to perceive the underwater environment. Techniques based on machine vision can help humans to
monitor and to supervise activities in these scenarios, as well as to enable carrying out missions with
autonomous robotic vehicles. In general, computer vision algorithms assume that the medium does not
affect light propagation. However, this assumption does not hold in scattering media such as underwater
scenes. Indeed, the phenomena of scattering and absorption affect the propagation of light, degrading
the quality of the captured images.
Thus, efforts in the fields of image processing and computer vision to improve the quality of underwater images may contribute to several applications, especially those related to the offshore oil and gas industry. In this paper, we address the problems of image restoration, to improve the visual quality of underwater images, and scene depth estimation, to extract geometrical information about the objects therein.
Image restoration and depth estimation are ambiguous problems, since in general the available number
of constraints is smaller than the number of unknown variables. One of the strategies most commonly
adopted to tackle these problems in computer vision is to impose extra constraints that are based on
some a priori knowledge about the scene. These extra constraints are called priors. In general, a prior can
be a statistical/physical property, ad-hoc rules, or even heuristic assumptions. The performance of the
algorithms is limited by the extent to which the prior is valid. Some of the widely used priors in image
processing are smoothness, sparsity, and symmetry.
Inspired by the observation of Kaiming He and colleagues [1] that natural scenes tend to be dark in at least
one of the RGB color channels, we derived a new prior by making observations in underwater images on
the relevance of the absorption rate in the red color channel. By collecting a large number of images from
several image search engines, we tested our prior and show its applicability and limitations on images
acquired from real scenes.
The main contribution of the work described in this paper is an extension of our previous work to deal
with underwater image restoration called Underwater Dark Channel Prior (UDCP) [2]. We present a
deeper study on the method, including an extensive statistical experimental verification of the assumption
following the guidelines described in He and colleagues [1]. Additionally, we present a new application of
the UDCP prior for underwater image restoration and depth estimation. We evaluated the algorithm using
qualitative and quantitative analysis through a new set of data, including images acquired on the Brazilian
coast. The techniques presented in this work open new opportunities to develop automatic algorithms for underwater applications that require high-quality visual information.
Previous Approaches in Image Restoration of Underwater Images
Works in the literature have approached the problem of restoring images acquired from underwater scenes from several perspectives: using special-purpose hardware, stereo images, and polarization filters [3]. Despite the improvements achieved by these approaches, they still present several limitations. For instance, methods that rely on specialized hardware are expensive and complex to deploy. The use of polarizers, for example, requires moving parts and is hard to implement in automatic acquisition tasks.
In a stereo vision system approach, the correspondence problem becomes even harder due to the strong effects imposed by the medium. Methods based on multiple images require at least two images of the same scene taken under different environment conditions, which makes them inadequate for real-time
applications. Thus, the problem of image restoration for underwater scenes still demands much research
effort in spite of the advances that have already been attained.
In the past few years, a large number of algorithms for image restoration based on single image have been
proposed, and the works of He and colleagues [1] and of Raanan Fattal [4] are among the most cited in the field.
While these works have shown good performance for enhancement in the visual quality for outdoor
terrestrial images, there is still room for improvement when they are applied to underwater images. As
far as single image methods are concerned, He and colleagues have proposed one of the most popular
methods called Dark Channel Prior (DCP). Liu Chao and Meng Wang [5], John Chiang and Ying-Ching Chen
[6], and Seiichi Serikawa and Huimin Lu [7] have also applied the DCP method to restore the visual quality of underwater images. However, these works do not address some of the fundamental DCP limitations related to the absorption rate of the red channel, and do not discuss relevant issues with the basic DCP
assumptions.
Differently from outdoor scenes, the underwater medium imposes wavelength-dependent absorption rates, most markedly in the red channel. Thus, Paulo Drews-Jr and colleagues [2] proposed a modified version of the DCP to overcome this limitation for applications in underwater imaging. Here
we build upon and extend the work presented in [2], providing an extensive study about the prior with
applications to image restoration and depth estimation. Furthermore, we provide new results of image
restoration using qualitative and quantitative analysis.
Underwater Attenuation Light Modelling
The underwater image formation results from a complex interaction between the light, the medium, and
the scene. A simplified analysis of this interaction is possible, yet maintaining physical plausibility. To this
end, first order effects are the forward scattering and the backscattering, i.e. the scattering of light rays
in small and large angles. The absorption of light is associated with these two effects since they respond
for contrast degradation and color shift in images. Fig. 1 illustrates these effects.
According to Yoav Schechner and Nir Karpel [3], backscattering is the prime reason for image contrast
degradation, thus the forward scattering can be neglected. Therefore, the underwater attenuation light
model is a linear combination of the direct light and the backscattering. The direct light is defined as the
fraction of light irradiated by the scene where a part is lost due to scattering and absorption. On the other
hand, the backscattering does not originate from the object's radiance, but it results from the interaction
between the environment's illumination sources and the particles dispersed in the medium. For a homogeneously illuminated environment, the backscattered light can be assumed to be constant and it can
be obtained from the image by using a completely haze-opaque region or by finding the farthest pixel in the scene. However, this information cannot be automatically extracted from a single image, so finding the brightest pixel in the dark channel is assumed to be an adequate approximation.
Fig. 1 Diagram illustrating the underwater attenuation light model. The dashed lines show the forward
scattering and backscattering effects. The scattering of light rays in small and large angles creates these
effects, respectively. Direct light is the portion of light irradiated by the scene that reaches the image
plane.
One important aspect of the linear model is the weight of the direct and backscattering components in the final image. Experimental analysis indicates that this weight decays exponentially with the product of the scene depth and the attenuation coefficient. This coefficient is an inherent property of the medium, defined as the sum of the absorption and scattering rates. Since both rates are wavelength dependent, the attenuation
coefficient is different for each wavelength. In the literature, this exponential weight is called the medium
transmission. The depths in the scene are estimated up to a scale factor by applying the log operation to
the medium transmission value.
The image restoration is performed by inverting the underwater attenuation light model. Assuming that we are able to estimate the medium transmission and the backscattering light, the restored image is computed by subtracting the backscattering component from the captured image and dividing the result by the medium transmission.
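The inversion can be sketched as follows, assuming the common linear formation model I = J·t + B·(1 − t); the clipping threshold t0 is a customary safeguard against division blow-up and is an assumed value here.

```python
import numpy as np

def restore(img, t, B, t0=0.1):
    """Invert the attenuation model I = J*t + B*(1 - t): subtract the
    backscattering component and divide by the (clipped) transmission.
    `img` is HxWx3 in [0, 1], `t` is HxW, `B` is a length-3 vector."""
    t = np.clip(t, t0, 1.0)[..., None]   # clip to avoid amplifying noise
    J = (img - B) / t + B
    return np.clip(J, 0.0, 1.0)          # keep the result a valid image
```

Synthesizing an image from known J, t and B and then inverting recovers the original scene radiance.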
Dark Channel Prior
The Dark Channel Prior is a statistical prior based on the observation that natural outdoor images captured on clear days exhibit mostly dark intensities within small square patches [1]. It was inspired by the well-known dark-object subtraction method from the remote sensing field. The authors considered that in most of
the non-sky patches on images of outdoor scenes, at least one color channel in the RGB representation
would have some pixels whose intensity were almost zero. This low intensity in the dark channel was due
to three factors: a) shadows in the images; b) colorful objects or surfaces where at least one color has low
intensity and c) dark objects or surfaces. They collected a large number of outdoor images and built
histograms, and with those, they have shown that about 75 percent of the pixels in the dark channel had
zero values, and the intensity of 90 percent of the pixels was below 25 on a scale of [0, 255]. Those results
provide strong support for the dark channel prior assumption in outdoor images. This prior allows one to approximate the medium transmission in local patches. He and colleagues have shown that the Dark Channel Prior provides excellent results on hazy scenes.
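The dark channel itself is straightforward to compute; a minimal sketch follows, where the patch size is an assumed parameter (15 is a typical choice in the dehazing literature).

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Dark channel: per-pixel minimum over the color channels,
    followed by a minimum filter over a square patch.
    `img` is an HxWx3 float array in [0, 1]."""
    min_channels = img.min(axis=2)                   # min over R, G, B
    return minimum_filter(min_channels, size=patch)  # min over the patch
```

A single dark pixel darkens the whole patch around it, which is what makes the channel a local, rather than per-pixel, statistic.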
The use of a local patch affects the performance of the medium transmission estimation. He and
colleagues proposed the use of a spectral matting method to refine the estimated transmission. Their
method presents good results but it requires a high computational effort to process the Laplacian matrix.
Other works proposed approximate solutions to make it faster by using quadtrees, Markov Random Fields,
or filtering techniques, e.g. guided filter or bilateral filter.
Dark Channel Prior on Underwater Images and its Variations
Due to the good results obtained by the DCP method for haze scenes and the similarities in the modelling
of a haze image and an underwater image, some previous works applied the Dark Channel Prior to process
underwater images. One of the first works to use DCP in underwater images was Chao and Wang [5]. The
reported results comprise a limited number of experiments in which the visual quality does not present a significant improvement, even for images with small degradation. Chiang and Chen [6] also
proposed an underwater image restoration method using standard DCP. Their method obtained good
results for real underwater images, but it was limited by the standard DCP method in underwater images
and by the assumption that the image is predominantly blue. Recently, Serikawa and Lu [7] proposed a variation of the DCP that filters the medium transmission by using a joint trilateral filter. Despite the
improvement attained in the image restoration when compared to standard DCP, the limitation related
to the red channel remains the same.
Kristofor Gibson and colleagues [8] proposed a variation of the DCP where they replaced the minimum
operator in an image patch by the median operator. They named the method MDCP. They chose the
median operator due to its ability to preserve edges. Their approach could provide good estimation when
the effects of the medium are approximately wavelength independent; in this case, the behavior tends to
be similar to standard DCP.
Nicholas Carlevaris-Bianco and colleagues [9] proposed an underwater image restoration method using a new
interpretation of the DCP for underwater conditions. The proposed prior explores the fact that the
attenuation of light in water varies depending on the color of the light: the underwater medium attenuates the red color channel at a much higher rate than the green and blue channels. Differently from the standard DCP, their prior is based on the difference between the maximum of the red channel and each of the other channels (G and B), instead of only the minimum as in the DCP. The method works well when the absorption coefficient of the red channel is large, but it shows some shortcomings in estimating the medium transmission in typical shallow waters.
Underwater Dark Channel Prior and the Image Restoration
The statistical correlation of a low dark channel in haze-free images is not easy to test for underwater images due to the difficulty of obtaining real images of underwater scenes in an out-of-water condition. However, the assumptions made by He and colleagues are still plausible, i.e. at least one color channel has some pixels whose intensities are close to zero. These low intensities are due to a) shadows; b)
color objects or surfaces having at least one color channel with low intensity, e.g. fishes, algae or corals;
c) dark objects or surfaces, e.g. rocks or dark sediment.
Despite the fact that dark channel assumption seems to be correct, some problems arise from the
wavelength independence assumption. There are many practical situations where the red channel is
nearly dark, which corrupts the transmission estimate by the standard DCP. Indeed, the red channel
suffers an aggressive decay caused by the absorption of the medium, making it approximately zero even in shallow waters. Thus, the information in the red channel is unreliable.
We proposed a new prior that considers just the green and the blue color channels to overcome this issue.
We named this prior Underwater Dark Channel Prior (UDCP). This prior allows us to invert the model and
to obtain an estimate of the medium transmission. The medium transmission and the backscattering light
constants provide enough information to restore the images.
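A minimal sketch of this transmission estimate follows. The weighting factor omega is borrowed from He and colleagues' DCP formulation and is an assumption here, as is the choice of clipping before taking the log to obtain depth up to scale.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def udcp_transmission(img, B, patch=15, omega=0.95):
    """Estimate medium transmission with the Underwater Dark Channel
    Prior: the dark channel is taken over the GREEN and BLUE channels
    only, since the red channel is unreliable underwater.
    `img` is HxWx3 (RGB) in [0, 1]; `B` is the backscattering light."""
    gb = img[..., 1:] / np.asarray(B)[1:]        # normalize G and B by B
    udcp = minimum_filter(gb.min(axis=2), size=patch)
    t = 1.0 - omega * udcp                       # transmission estimate
    depth = -np.log(np.clip(t, 1e-3, 1.0))       # scene depth up to scale
    return t, depth
```

For a fully hazy region (green/blue intensities equal to the backscattering light) the estimated transmission approaches its minimum, and the log map assigns it the largest relative depth.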
We performed an experimental verification to evaluate the assumption of the new prior based on two
assumptions: a) the main assumption of the DCP for outdoor scenes remains valid if only applied to green
and blue channels, and b) the behavior of the UDCP histogram in underwater scenes is plausible.
Since He's dataset has not been made publicly available, we created our own following the guidelines proposed by He and colleagues in [1]. The dataset is composed of 1,022 outdoor landscape images larger than 0.2 Mpixels from the SUN database [10]; see Fig. 2 for image samples. We selected a subset of images of natural scenes, i.e. images without any human-made object, comprising 274 images (first row in Fig.
2).
We then computed the distribution of pixel intensities, where each bin contains 16 intensity levels from an interval of [0, 255] (Fig. 3). The histograms were obtained by using i. only the natural images and ii. all images of the extended dataset. In this figure, each row depicts the results for the minimum operator in a small patch window using only the RED, GREEN and BLUE channels, the DCP (dark channel over all channels) and the UDCP (dark channel over green and blue).
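The binning described above can be reproduced as follows. This is a simple sketch: `dark_values` is a hypothetical array of dark-channel intensities on the [0, 255] scale, and the function returns the per-bin probabilities used to draw the histograms.

```python
import numpy as np

def dark_channel_histogram(dark_values):
    """Distribution of dark-channel intensities using bins of 16
    intensity levels over [0, 255] (16 bins in total)."""
    edges = np.arange(0, 257, 16)                 # 0, 16, ..., 256
    counts, _ = np.histogram(dark_values, bins=edges)
    return counts / counts.sum()                  # probability per bin
```

The first bin then directly gives the fraction of dark-channel pixels with intensities in [0, 15], the statistic reported throughout this section.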
Fig. 2 - Sample images of our dataset. The first row shows the images of natural scenes and the second
row shows scenes that include human-made structures/objects (Images acquired from SUN Dataset
[10]).
Even though our dataset and those of He and colleagues are different, ours was collected using the same guidelines as theirs, and thus some similarity is to be expected. Indeed, the histograms present similarities, but also important differences. The probability of the first bin (intensities between 0-15) is smaller than the one presented by He and colleagues [1]. They reported ≈ 90% for the first bin in the DCP, while the probability for our dataset is 45% (Fig. 3). The highest probability, 50%, is obtained for the histograms of natural scenes (Fig. 3 - 1st and 3rd rows), which is the expected case for typical underwater scenes. The most important observation is related to the significance of each channel for the prior. The lower intensity bins of the blue channel (Fig. 3) are dominant, mainly in natural scenes. The red channel is still dark, but it has the most equalized histogram in all scenarios. The green channel presents similar behavior. Thus, the absence of blue color in the composition of the scene explains the prevalence of this channel in both the DCP and the UDCP.
One can observe the close similarity between the DCP and UDCP statistics in the histograms of Fig. 3. Table I shows the Pearson's linear correlation coefficient, which quantifies these similarities. The correlation coefficient ranges over [-1, 1]; a coefficient close to one indicates an almost perfect linear relationship, while values close to zero indicate that the data are uncorrelated.
Fig. 3 - The distribution of pixel intensity of the dark channel for natural scenes of the extended
dataset (1st and 3rd rows) and for all images of the extended dataset (2nd and 4th rows). We show the
histogram for the red, green, blue channels, DCP (in black color), and UDCP (in cyan color),
respectively.
Table I - Pearson's correlation coefficient between the DCP and the UDCP, Red, Green and Blue channels.

           Natural Scenes   Extended Dataset
UDCP       0.9999           0.9998
Red        0.8049           0.8856
Green      0.7680           0.8351
Blue       0.9998           0.9995
One can readily see that there is a strong correlation between DCP and UDCP, which means that both
methods are based on similar assumptions about the scene, i.e. a low-intensity dark channel. The correlation coefficients between the blue channel and the DCP are approximately equal to one, meaning that
they are also strongly correlated. In natural scenes, the correlation between DCP and green channel is the
smallest due to the presence of grass and trees in the scenes, which causes an increase in the intensities
of this color channel.
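The correlation values in Table I can be computed directly from the binned distributions; a trivial sketch using NumPy:

```python
import numpy as np

def histogram_correlation(h1, h2):
    """Pearson linear correlation between two binned distributions
    (e.g. the DCP histogram and a per-channel dark-channel histogram)."""
    return np.corrcoef(h1, h2)[0, 1]
```

Two histograms that differ only by a scale factor correlate perfectly, which is why near-identical bin shapes in Fig. 3 translate into coefficients close to one.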
Fig. 4 - Sample images from the underwater datasets. Images from the reduced dataset are shown in the top row and images from the extended dataset are shown in the bottom row (First Row Courtesy of Rémi Forget, Second Row Courtesy of Kristina Maze).
Fig. 5 - The distribution of pixel intensity of the dark channel for the reduced dataset (1st and 3rd rows)
and for the extended dataset (2nd and 4th rows). We show the histogram for the red, green, blue
channels, DCP (in black color), and UDCP (in cyan color), respectively.
We also created two datasets of underwater images to evaluate the influence of the medium and to verify the UDCP assumptions. The creation of these datasets follows the guidelines of He and colleagues [1]. The first
dataset (reduced) was created by extracting the images from a single user of the Flickr website. This
dataset contains 65 high quality photos acquired with the same camera. The images, which include coral
reefs, rocks, marine animals, wrecks, etc., were acquired during diving activities in several places of the world (thus with different turbidity levels). The first row of Fig. 4 shows sample images of the reduced
dataset.
The second dataset (extended) was obtained by collecting images from several image search engines on
the internet. This dataset is composed of 171 underwater images acquired under diverse media
conditions, water depths and scenes, which provides a rich source of information. All the images are approximately homogeneously illuminated, which limits the water depth to shallow waters. The second row of Fig. 4 shows sample images of the extended dataset.
Differently from the histograms of outdoor scenes, the red channel is truly dark, i.e. ≈ 90% of the pixels fall in the first bin (Fig. 5). This agrees with the assumption of the UDCP, i.e. that the highest absorption rate occurs in the red channel. As expected, the dark channels for the blue and the green channels are similar, but many values cover a broader range due to the effects of the interaction of light with the medium. The histograms of the DCP are somewhat consistent with what we would expect for non-participating media. Hence, the DCP is not able to adequately recover the medium transmission. The bin values in the UDCP histograms, however, are more evenly distributed, which indicates that the UDCP is a better approach to estimate the medium transmission.
Experimental verification shows that the UDCP assumption is a more general supposition than the DCP assumption. However, these results do not guarantee the quality of the estimated transmission. The UDCP and the DCP obtain similar histograms for natural scenes, as shown by the correlation analysis. These results indicate that both are based on similar assumptions.
Another important characteristic concerns the blue channel, which in natural scenes tends to be darker
than the other channels. The underwater medium is typically blue, thus increasing the intensities of this
color channel. This fact corroborates the underwater dark channel assumption.
Experimental Results
While the experiments showed that the assumptions of the UDCP are valid, it is also important to find out whether the UDCP outperforms other DCP-based methods for restoring images. In
order to evaluate the performance of UDCP, we applied the standard DCP to underwater images, as
proposed by Chao and Wang [5], and Chiang and Chen [6]. The MDCP [8] was also applied to underwater
images, but with the refinement proposed by He and colleagues [1]. We also obtained results for Bianco's
prior (BP) [9]. Our evaluation is based on qualitative and quantitative analysis. Figs. 6 and 7 show the
qualitative results for underwater images collected from the internet. Fig. 8 shows sample images from three underwater videos that we captured. We acquired these videos in a coral reef in the Brazilian Northeast coastal area at a depth of approximately 10 m. They are composed of 150, 138 and 610 frames, and the sample images of these videos are figs. 8(a), 8(b) and 8(c), respectively.
Fig. 6 Restored images and depth estimation: (a) Underwater image with regions where the
backscattering constant was estimated, using UDCP (orange patch), DCP (red patch), MDCP (yellow
patch), and the BP (purple patch). Restored images using UDCP (b), DCP (d), MDCP (e), and BP (f).
Colorized depth maps obtained using UDCP (c), DCP (g), MDCP (h), and BP (i). (Image (a) credits: Kevin Clarke.)
In the quantitative evaluation, we used a metric proposed by Nicolas Hautière and colleagues [11], originally devised to analyze their method for weather-degraded images. We adopted this metric in the present work due to the similarities between weather-degraded and underwater images. Three different indexes are defined in the metric: e, r̄ and s. The value of e evaluates the ability of a method to restore edges that were not visible in the degraded image but are visible in the restored image. The value of r̄ measures the quality of the contrast restoration; a similar technique was adopted by [3] to evaluate restoration in an underwater medium. Finally, the value of s is obtained from the fraction of pixels which are saturated (black or white) after applying the restoration method but were not before. These three indexes allow us to estimate an empirical restoration score = e + r̄ + (1 − s) [11], where larger values mean better restoration. Table II shows the obtained results.
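A simplified sketch of such a score is given below. The gradient-based edge detection and the thresholds are stand-ins for the Sobel/visibility-level machinery of [11], not the metric's exact definition; both images are assumed to be HxW grayscale floats in [0, 1].

```python
import numpy as np

def restoration_score(orig, restored, grad_thresh=0.02, eps=1e-6):
    """Simplified sketch of the blind assessment of Hautiere et al.:
    e     - relative gain in visible edges,
    r_bar - geometric mean of gradient ratios at visible edges,
    s     - fraction of pixels newly saturated by the restoration.
    Returns the composite score e + r_bar + (1 - s)."""
    def grad(im):
        gy, gx = np.gradient(im)
        return np.hypot(gx, gy)
    g0, g1 = grad(orig), grad(restored)
    vis0, vis1 = g0 > grad_thresh, g1 > grad_thresh   # "visible" edges
    n0, n1 = vis0.sum(), vis1.sum()
    e = (n1 - n0) / max(n0, 1)
    ratios = (g1[vis1] + eps) / (g0[vis1] + eps)
    r_bar = np.exp(np.log(ratios).mean()) if n1 else 0.0
    newly_sat = ((restored <= 0) | (restored >= 1)) & (orig > 0) & (orig < 1)
    s = newly_sat.mean()
    return e + r_bar + (1.0 - s)
```

As a sanity check, comparing an image against itself yields e = 0, r̄ = 1 and s = 0, i.e. a score of 2.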
Fig. 7 A second example of restored images and depth estimation: (a) Underwater image with
regions where the backscattering constant was estimated, using UDCP (orange patch), DCP (red
patch), MDCP (yellow patch), and the BP (purple patch). Restored images using UDCP (b), DCP (d),
MDCP (e), and BP (f). Colorized depth maps obtained using UDCP (c), DCP (g), MDCP (h), and BP (i).
(Image (a) credits: Ancuti and colleagues [12].)
One example of these experiments is depicted in Fig. 6, which shows the original image, Fig. 6(a), the restored image, Fig. 6(b), and the colorized depth map, Fig. 6(c), obtained using the UDCP approach. We colorized the depth maps to aid visualization: reddish colors represent closer points and bluish colors represent points that are farther away.
Figs. 6 and 7 also show the results obtained by applying the methods proposed by other authors (i.e. DCP, MDCP, and BP) on images from the extended dataset. This dataset is detailed in the section Underwater Dark Channel Prior and the Image Restoration. We show the underwater images with the backscattering light estimation in figs. 6(a) and 7(a). The estimation of the backscattering constant obtained by the UDCP
seems to be the most plausible, i.e. near the farthest point in the image. The other methods fail in the
estimation in at least one of the images. They identify the backscattering light in bright surfaces of the
scene instead of the farthest point.
Table II - Quantitative evaluation of the underwater restoration methods using the metric of [11]. We show the results for the sample images in Figs. 6, 7 and 8 and the averages over the extended dataset and over the videos of Fig. 8. The best method in each row is the UDCP.

                                  UDCP      DCP      MDCP     BP
Fig. 6(a)                         4.52      2.22     2.14     3.50
Fig. 7(a)                         60.43     41.44    56.97    3.01
Average of the Extended Dataset   3.68      2.77     2.75     3.01
Fig. 8(a)                         705.61    640.95   641.72   2.85
Average of Video 1                3460.2    1500.0   1942.8   2.79
Fig. 8(b)                         5.82      5.49     4.80     2.21
Average of Video 2                32.02     23.09    24.09    2.30
Fig. 8(c)                         10.81     9.56     8.98     2.23
Average of Video 3                90.37     69.56    73.11    2.36
The images restored by the UDCP, figs. 6(b) and 7(b), show that there was an improvement as far as contrast and color fidelity are concerned. The values in Table II show that the restoration using the UDCP presented the best values of the metric in all experiments, including the full dataset. In Fig. 6, the UDCP (b) and the BP (f) presented the best results for contrast and visibility. However, the BP fails to estimate the backscattering constant, generating incorrect depth information, as shown by the colorization of the restored image. This is corroborated by the fact that the depth maps estimated by both methods are similar. The improvement in the estimation of the ocean floor of the scene is noticeable in the image restored using the UDCP. The improvement provided by the DCP and the MDCP is imperceptible because neither method is able to recover the depth map correctly.
The UDCP method obtained the best results in Fig. 7, while the BP, Fig. 7(f), presented the worst results.
The values in Table II also confirm this fact. This can be explained since the BP underestimates the attenuation coefficient, limiting the quality of the map. This is due to the behavior of the red channel, which is not completely absorbed. The results obtained by the standard DCP, Fig. 7(d), and the MDCP, Fig. 7(e), are also related to this fact, since both methods are able to provide good results for the depth map and the restoration.
Fig. 8 shows the results obtained by applying the methods to the videos that we have captured. We
depicted one sample image from each video, shown in figs. 8(a), 8(b) and 8(c). For these sample images,
the backscattering constant is well balanced in all wavelengths due to the characteristics of the water and
the small water depth. In this case, the standard DCP, MDCP and UDCP present similar results in
qualitative terms. The BP method fails to estimate the depth map, in a similar way to that shown in Fig. 7; thus, we omit the BP results because they are similar to the raw images from the underwater camera, i.e. the first row. The UDCP attained the best results for scenes located at greater depths, as evidenced by the visibility of the rock in the top left of the restored image in Fig. 8(i).
Table II shows the average values for the videos illustrated by sample images in Fig. 8. We can clearly see
that our method presents better results under the metric, especially due to its ability to improve edges. The average over the extended dataset is also shown, and the results are still favorable to the UDCP. One can see that the metric yields large values for the video associated with Fig. 8(a). This is because the number of edges in the original underwater image is small, given the parameters adopted by Hautière and colleagues [11]; the increase provided by the restoration is therefore large, producing large values of the e index and, consequently, of the overall metric.
Conclusions
Although the standard DCP is intuitive, its use in underwater conditions is limited by the high absorption of the red channel. The BP method presented good results in specific contexts, but it underestimated the medium transmission. The MDCP presents results similar to those of the DCP. Finally, the UDCP presented the most significant results in underwater conditions, providing good restoration and depth estimation even in situations where the other methods fail.
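The key difference between the standard DCP and the UDCP is which color channels enter the dark channel computation. A minimal sketch, assuming an RGB float image in [0, 1]; the function name is illustrative, and the naive patch-minimum loop is written for clarity (production code would use an erosion filter such as scipy.ndimage.minimum_filter):

```python
import numpy as np

def dark_channel(img, patch=15, channels=(1, 2)):
    """Local dark channel of an H x W x 3 RGB image.

    The standard DCP takes the per-pixel minimum over all three
    channels (channels=(0, 1, 2)); the UDCP considers only green
    and blue (channels=(1, 2)), since red is strongly absorbed
    underwater and would bias the estimate toward zero.
    """
    # Per-pixel minimum over the selected color channels.
    min_ch = img[..., list(channels)].min(axis=-1)
    # Local minimum (erosion) over a patch x patch window.
    r = patch // 2
    padded = np.pad(min_ch, r, mode="edge")
    out = np.empty_like(min_ch)
    for i in range(min_ch.shape[0]):
        for j in range(min_ch.shape[1]):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

# The medium transmission then follows the usual DCP estimate,
# t(x) = 1 - dark_channel(I(x) / A), with A the background light.
```

On a typical underwater pixel with bright green/blue and near-zero red, the standard dark channel collapses to zero (implying full transmission), while the green/blue dark channel retains usable information.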
Although the UDCP presented meaningful results, it still lacks reliability and robustness due to the limitations imposed by its assumptions. On the one hand, single-image restoration methods can enhance image quality; on the other hand, they are susceptible to variations in scene characteristics. Thus, one important direction is to use the information provided by the image itself to estimate a confidence level, which would be useful in practical applications, e.g. robotics. Another important direction is to use image sequences to disambiguate the parameters of the model. Video acquisition is supported by almost all underwater cameras used by divers and ROVs. In this case, a single-image restoration method can provide an initial estimate, followed by successive refinements as further images become available. Finally, for several applications it may be necessary to extend the model with the effect of artificial illumination in the scene, which would enable handling deep-water conditions.
Fig. 8 Image restoration of three underwater videos acquired on the Brazilian northeast coast. The first row shows one sample image from each video. The restoration results for these sample images obtained by the standard DCP, UDCP and MDCP are shown in the second, third and fourth rows, respectively.
References
[1] K. He, J. Sun, and X. Tang. Single image haze removal using dark channel prior, in IEEE CVPR, pages 1956–1963, 2009.
[2] P. Drews-Jr, E. Nascimento, F. Moraes, S. Botelho, and M. Campos. Transmission estimation in underwater single images, in IEEE ICCV - Workshop on Underwater Vision, pages 825–830, 2013.
[3] Y. Schechner and N. Karpel. Recovery of underwater visibility and structure by polarization analysis. IEEE JOE, 30(3):570–587, 2005.
[4] R. Fattal. Single image dehazing. ACM TOG, 27(3), 2008.
[5] L. Chao and M. Wang. Removal of water scattering, in ICCET, volume 2, pages 35–39, 2010.
[6] J. Chiang and Y. Chen. Underwater image enhancement by wavelength compensation and dehazing. IEEE TIP, 21(4):1756–1769, 2012.
[7] S. Serikawa and H. Lu. Underwater image dehazing using joint trilateral filter. Computers & Electrical Engineering, 40(1):41–50, 2014.
[8] K. Gibson, D. Vo, and T. Nguyen. An investigation of dehazing effects on image and video coding. IEEE TIP, 21(2):662–673, 2012.
[9] N. Carlevaris-Bianco, A. Mohan, and R. Eustice. Initial results in underwater single image dehazing, in IEEE OCEANS, pages 1–8, 2010.
[10] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN database: Large-scale scene recognition from abbey to zoo, in IEEE CVPR, pages 3485–3492, 2010.
[11] N. Hautière, J.-P. Tarel, D. Aubert, and E. Dumont. Blind contrast enhancement assessment by gradient ratioing at visible edges. Image Analysis & Stereology, 27(2):87–95, 2008.
[12] C. Ancuti, C. Ancuti, T. Haber, and P. Bekaert. Enhancing underwater images and videos by fusion, in IEEE CVPR, pages 81–88, 2012.