ArticlePDF Available

Abstract and Figures

Shadow removal is an important problem in computer vision, since the presence of shadows complicates core computer vision tasks, including image segmentation and object recognition. Most state-of-the-art shadow removal methods are based on complex deep learning architectures, which require training on a large amount of data. In this paper a novel and efficient methodology is proposed aiming to provide a simple solution to shadow removal, both in terms of implementation and computational cost. The proposed methodology is fully unsupervised, based solely on color image features. Initially, the shadow region is automatically extracted by a segmentation algorithm based on Electromagnetic-Like Optimization. Superpixel-based segmentation is performed and pairs of shadowed and non-shadowed regions, which are nearest neighbors in terms of their color content, are identified as parts of the same object. The shadowed part of each pair is relighted by means of histogram matching, using the content of its non-shadowed counterpart. Quantitative and qualitative experiments on well-recognized publicly available benchmark datasets are conducted to evaluate the performance of proposed methodology in comparison to state-of-the-art methods. The results validate both its efficiency and effectiveness, making evident that solving the shadow removal problem does not necessarily require complex deep learning-based solutions.
This content is subject to copyright. Terms and conditions apply.
Vol.:(0123456789)
Multimedia Tools and Applications (2024) 83:19517–19539
https://doi.org/10.1007/s11042-023-16282-0
1 3
SUShe: simple unsupervised shadow removal
Dimitra‑ChristinaC.Koutsiou1 · MichalisA.Savelonas1 · DimitrisK.Iakovidis1
Received: 16 November 2022 / Revised: 31 May 2023 / Accepted: 4 July 2023 /
Published online: 28 July 2023
© The Author(s) 2023
Abstract
Shadow removal is an important problem in computer vision, since the presence of shadows com-
plicates core computer vision tasks, including image segmentation and object recognition. Most
state-of-the-art shadow removal methods are based on complex deep learning architectures, which
require training on a large amount of data. In this paper a novel and efficient methodology is pro-
posed aiming to provide a simple solution to shadow removal, both in terms of implementation
and computational cost. The proposed methodology is fully unsupervised, based solely on color
image features. Initially, the shadow region is automatically extracted by a segmentation algorithm
based on Electromagnetic-Like Optimization. Superpixel-based segmentation is performed and
pairs of shadowed and non-shadowed regions, which are nearest neighbors in terms of their color
content, are identified as parts of the same object. The shadowed part of each pair is relighted by
means of histogram matching, using the content of its non-shadowed counterpart. Quantitative
and qualitative experiments on well-recognized publicly available benchmark datasets are con-
ducted to evaluate the performance of proposed methodology in comparison to state-of-the-art
methods. The results validate both its efficiency and effectiveness, making evident that solving
the shadow removal problem does not necessarily require complex deep learning-based solutions.
Keywords Shadow removal· Color· Histogram matching· Electromagnetism-like
optimization
1 Introduction
Core computer vision tasks, such as navigation and obstacle detection are complicated
in the presence of shadows, which can be misinterpreted as objects or obstacles. Several
methods have been proposed to address shadow detection and removal in various appli-
cation domains, including satellite imaging, traffic monitoring, and obstacle detection in
* Dimitris K. Iakovidis
diakovidis@uth.gr
Dimitra-Christina C. Koutsiou
dkoutsiou@uth.gr
Michalis A. Savelonas
msavelonas@uth.gr
1 Department ofComputer Science andBiomedical Informatics, University ofThessaly,
Papasiopoulou 2-4, 35131Lamia, Greece
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19518
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
navigation systems [4, 28, 52, 57, 59]. Figure1 shows example images illustrating objects
in a shading environment.
The shadow problem is commonly treated using the Shadow Model Theory (SMT) pro-
posed by Barrow etal. [6], which enables the calculation of the intensity of light, reflected
at a given point on a surface. It is a physics-based model, which decomposes illumination
in two components: direct and ambient illumination. According to Guo etal. [23], the illu-
mination model for an RGB image is provided by:
where Ii is the color intensity of pixel i in R, G and B color channels, Ld and Le are the light
intensities associated with the light source and ambient sources respectively, θi is the angle
defined by direct lighting direction and surface, and ti [0, 1] is a variable which denotes
the amount of light inside an area.
Early shadow removal methods, such as [7, 1719, 36, 41], were also based on models
relying on a number of assumptions, and usually evaluated on datasets with a small number
of images, focusing mainly on qualitative aspects. Later, there was a decisive shift towards
supervised methods, such as deep neural networks, which have brought new trends, but
also the requirement for large training sets. New benchmark datasets were created, to cover
this requirement, and evaluations begun to encompass also a more quantitative aspect [33,
46, 50, 53]. A drawback was that in these datasets shadows needed to be manually anno-
tated, and such annotations are time-consuming and usually costly. Furthermore, deep neu-
ral networks are usually computationally expensive and demanding in terms of computa-
tional resources.
The state-of-the-art shadow detection and removal approaches can be divided into two
categories: a) unsupervised methods, and b) supervised methods. Most methods of both
categories encompass the SMT and use a coefficient to indicate the intensity in the shadow
areas based on Eq. (1). This coefficient describes the brightness reduction in relation to
the area of the image without any shadows [15, 35, 47] . Finding the right parameters that,
when multiplied by the intensity of each pixel in the shadow zone, the initial illumina-
tion will be recovered in the shadowed areas. Unsupervised methods are usually based on
intrinsic image features, such as color and texture, and strategies that enable the recovery
of detail and luminosity of shadow regions [5, 12, 1619, 22, 25, 29, 31, 36, 41, 60, 63].
Supervised methods, are mainly based on complex deep learning architectures, and they
(1)
Ii
=
(
t
i
cos𝜃
i
L
d
+L
e)
R
i
(a) (b) (c)(d)
Fig. 1 Example images illustrating objects in shadowed environments(a-d) from various publicly available
shadow datasets considered in this study [4, 16, 17]
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19519
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
usually provide results of higher quality than the unsupervised ones [1, 9, 14, 26, 27, 33,
34, 37, 42, 53, 56, 64].
This work aims to address the need for both efficient and effective shadow removal in
computer vision workflows, by the following contributions:
A novel, fully unsupervised shadow removal method, named Simple Unsupervised
Shadow Removal (SUShe), which is based solely on color image features. Unlike pre-
vious methods it is very simple, both in terms of implementation and computational
complexity, whereas it obtains results that are comparable to state-of-the-art deep
learning-based methods.
A unique shadow segmentation approach efficiently combining a physics-inspired opti-
mization algorithm, superpixel segmentation, and histogram matching, in the context
of a lightweight pre-processing function.
An extensive experimental study on various publicly available benchmark datasets,
highlighting the tradeoff of efficiency and effectiveness it offers.
The remainder of this paper is organized into six sections. Section2 provides an over-
view of related work, and Section3 details the proposed methodology. Section4 provides
information on the experimental setup and the evaluation framework. Section5 presents
results of the proposed method in comparison with the most relevant state-of-the-art meth-
ods. Section6 provides a perspective in terms of computational complexity. Section7 pro-
vides a discussion of the experimental results, and the main conclusions of this work as
well.
2 Related work
Several shadow detection and removal methods have been based on SMT. This theory
is the foundation for most relighting methods that have been published in the last dec-
ade. Still, it is incomplete, in the sense that it fails to accurately model umbra regions,
to enable correct relighting in the proximity of shadow borders. Apart from the SMT,
several works have been proposed for shadow removal. These include model-based
unsupervised methods, such as the method proposed in [19], where each RGB image
is projected to an 1D invariant direction, in order to recover hues by means of a 2D
chromaticity feature space. In [47], a pyramid-based process is employed for shadow
removal with user assistance. In [36], the main aim was shadow removal in a way that is
robust against texture variations. Later, methods, such as [60], were based on classical
machine learning techniques, which use engineered features, such as texture elements,
and aim to match regions in different lighting conditions [22, 23]. In [12] clustering-
based shadow detection relied on color and texture features, whereas in [41] a shadow
removal method was proposed for images with uniform background. Shadow and lit
regions were separated by ignoring the low-frequency image details. Also, an unsuper-
vised shadow removal method using differential operations for a recent osmosis model
was proposed in [7]. However, most of the afore-mentioned methods have been tested
only in subsets of benchmark datasets. This is because the implementation of these stud-
ies is usually limited to images with specific types of textures and features.
Recently, the focus of research on shadow detection and removal turned to supervised
deep learning-based architectures, such as Convolutional Neural Networks (CNNs), and
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19520
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
Generative Adversarial Networks (GANs). In [33], a shadow-free image is generated using
two deep networks, SP-Net and M-Net. In [34], a GAN-based framework was trained with
patches extracted from shadow images, following the physics-inspired model of [33]. In
[1], Channel Attention GAN detects and removes shadows by using two networks, which
consider physical shadow properties and equipment parameters. Another network, called
G2R-ShadowNet consists of three subnetworks, and it requires a small number of images
for training [38]. Stacked Conditional Generative Adversarial Network (ST-CGAN) [53]
combines two stacked conditional GANs, which provide generators for the shadow detec-
tion mask and for the shadow-free image. In [20], shadow removal was treated as an image
fusion problem via FusionNet, a network that generates weight maps facilitating the fusion
process. Feature fusion was also employed in [9], integrated with multiple dictionary learn-
ing. During the last years, several studies have also been proposed to increase the effec-
tiveness of shadow removal in the benchmark shadow datasets. In [26] shadow removal
architecture was proposed by Hu etal., aiming to learn direction-aware and spatial char-
acteristics of the images at various levels, using a CNN. Additionally, Hu etal. proposed a
weighted cross entropy loss, to train a neural network for shadow detection. That method
addressed color and luminosity inconsistencies in the training pairs for shadow removal,
by applying a color transfer function. In [62] in order to investigate residual images and
illumination estimation with GANs for shadow removal discrepancies in the training pairs,
a framework named RIS-GAN was proposed. To refine the coarse shadow-removal result
in the shadow-free image of that approach, indirect shadow-removal images were created
by estimating negative residual images and inverse illumination maps, in conjunction with
the coarse shadow-removal image. In [11], the shadow removal problem was approached
in two ways. Firstly, a dual hierarchically aggregation network was proposed to carefully
learn the border artifacts in a shadowed image. Without any down-sampling, a foundation
of dilated convolutions was considered for attention estimation, using multi-context infor-
mation. Secondly, taking into account that training on a small dataset limits the network
ability to recognize textural differences, resulting in color inconsistencies in the shadowed
region, the authors developed a dataset synthesis method based on shadow matting. In [10]
a two-stage context-aware network, called Context-Aware Network (CANet) was proposed
for shadow removal. In CANet the shadow regions receive contextual information from the
corresponding non-shadowed regions. As a next step, encoder-decoder was used to enhance
the results. Mask-ShadowNet was proposed in [24], where a masked adaptive instance
normalization method along with embedded aligners were applied to remove shadows,
considering illumination uniformity and the different feature statistics in the shadow and
non-shadow areas. In [65] a Bidirectional Mapping Network was presented, combining the
learning process of shadow removal and shadow generation into a unified parameter shar-
ing framework. In [51], a style-guided shadow removal network was proposed to address
the issue of visually disharmonic images after shadow removal, and to ensure better image
style coherence. The training of all these deep learning-based methods is associated with a)
a high computational cost, b) non-trivial hardware specifications, and c) a requirement for a
large number of annotated images.
Shadow removal is useful in a variety of computer vision applications, such as the detec-
tion of moving objects and pedestrians, either in indoor or in outdoor environments [29,
31, 63], in navigation-aid systems [44], and the recognition of regions of interest in remote
sensing images [26, 27]. In [49], an automatic shadow mask estimation approach was intro-
duced, aiming to replace manual labeling in a supervised context, using known solar angles
and 3D point clouds. Shadow removal can be an essential component of remote sensing
object detection algorithms, aiming to cope with several challenges, such as the complex
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19521
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
background, and the variations of scale and density [39, 55]. Wang etal. [54] proposed an
automatic cloud shadow screening mechanism, which was utilized for PlanetScope, a con-
stellation of over 130 satellites of European Space Agency (ESA) that can regularly image
the entire surface of the Earth. In an unsupervised context, a statistical method [3] was pro-
posed for aerial imaging, based on decision trees.
Unlike current, either unsupervised or supervised, shadow removal methods, this work
provides a very simple methodology for automatic shadow removal, based on a novel com-
bination of superpixel segmentation with a strategy for matching shadow and lit regions.
3 Proposed methodology
The proposed methodology is based on a simple, yet very effective strategy. Initially, the
shadow mask is extracted using an evolutionary physics-inspired algorithm. Next, both
shadowed or non-shadowed image regions, which are coherent in terms of texture and
color, are identified and shadow/non-shadow pairs of neighboring superpixels, adjacent to
shadow borders, are determined. The shadowed part of each pair is relighted by means of
histogram matching.
3.1 Shadow detection
Shadow detection refers to the segmentation of a natural image, either in indoor or out-
door settings, in order to extract the shadowed region. Algorithm1 summarizes the shadow
detection stage with a pseudocode, and Fig.2 presents a visual summary of this algorithm.
The Electromagnetism-like Optimization (EMO) [32, 45, 48] algorithm is employed for
multilevel segmentation, aiming to cope with the issue of computational complexity. Ini-
tially, the color space used to represent the input image is converted from RGB to HSV
(line 2). The Hue (H) component of the input image is segmented using the EMO-based
method described in [8], considering that the component Η is invariant to changes in light-
ing. A set of k images hi, i = 1, 2, …, k, is the output of this operation, representing roughly
hue-homogeneous regions (line 3). As a next step, the Value component (V) of the HSV
image is multiplied by each hi, i = 1, 2, …, k, image, resulting in a series of new images
Algorithm1 Shadow Detection Pseudocode
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19522
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
vi = hi · V, i = 1, 2, …, k, which represent regions with weighted intensities (line 5). This
weighting is performed because the generated image regions have lower intensities in shad-
owed areas. Thus, a subsequent bilevel thresholding step is facilitated. Bi-level threshold-
ing is performed by using EMO, on each of vi, i = 1, 2, …, k, with only one threshold to be
optimized. The result of this operation is a set of k binary images, bi, i = 1,2, …, k (line 6).
In these images, the pixels corresponding to lower intensities (potential shadowed regions)
are set to white, and the remaining pixels are set to black. As a final step, the binary masks
bi, i = 1,2, …, k, obtained for each input image, are aggregated to create a mask B repre-
senting the shadowed regions of the input image (line 7).
3.2 Superpixel matching strategy
Following the application of the shadow detection algorithm, SUShe performs unsu-
pervised shadow removal on image regions with approximately uniform color features.
These regions, which are characterized as superpixels, are obtained using the SLIC
Superpixel segmentation algorithm. Α superpixel matching strategy is then applied
to identify superpixels in the shadow areas that are similar to superpixels in the non-
shadow areas. Relighting is performed by transforming the histogram of the shadow
superpixels, so that it matches the histogram of the respective non-shadow superpixels.
SLIC Superpixel segmentation The Simple Linear Iterative Clustering (SLIC) superpixel
segmentation algorithm performs local clustering of pixels, considering both color and
spatial information, by means of a metric proposed in [2]. The algorithm takes as input an
Fig. 2 Outline of the shadow detection algorithm
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19523
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
image and the number of superpixels K, in which the input image should be divided. The
initial RGB image is separated into K smaller grid intervals S defined in the xy plane. Each
pixel inside S has spatial coordinates (xi, yi) and color coordinates (Li, ai, bi) in CIE-Lab
color space. Every grid interval S would have a superpixel center Ck = [Lk, ak, bk, xk, yk], for
superpixels that are similar in size. Thus, the following distances are calculated for each
pixel (xi, yi, Li, ai, bi) in S, to the superpixel center Ckusing Eqs. (2) and (3) in order to
define the metric provided in Eq. (4):
where Ds is the final metric, which combines Euclidean distances of the color (in CIE-
Lab) dLab and the Euclidean distances of the spatial coordinates dxy,normalized by the grid
interval S, and m is a variable of the SLIC algorithm, which controls the compactness of
the superpixel. The default value for m was set equal to 10 according to [2]. Each cluster
center Ck is assigned to the best matching pixels from the 2S × 2S area, according to the
distance metric Ds. This process is iterative, until convergence.
SUShe: Simple unsupervised shadow removal The proposed methodology combines two
very simple techniques for region segmentation and relighting of the shadowed areas. The
SLIC Superpixel algorithm is used to segment the input image into many small, approxi-
mately uniform (with respect to color), regions. Algorithm 2 summarizes the shadow
removal stage with pseudocode, and Fig.3 presents a visual summary.
Initially, the binary shadow mask B (obtained in Subsection3.1) is used to split the
input image I (line 1) into a shadowed regions IS and lit regions IL (line 2). These
regions are obtained by multiplying IS = B · I andIL = (1 − B) · I.
Next, SLIC superpixel segmentation is applied to IS and IL separately (line 3): IS and
IL are broken down into the shadowed (ISLIC(S)) and the lit image regions ISLIC(L)),with
respect to the color and spatial features of each region.
The spatial gravity centers of each superpixel
I
SLIC(S)
i
inside the shadow region
GC
shadow
i,i=1, 2 K
and the corresponding ones of the lit region
GC
litj
,j=1, 2, K
are calculated (line 4), where Kis the number of superpixels (line 1).
In each channel of the RGB color space (line 6), for each gravity center
GCshadowi
of
I
SLIC(S)
i
, its Euclidean spatial distance
d
d
(
GCshadowi,GClitj
)
from each gravity
center of
GC
lit
j
of superpixel ISLIC(L)j is calculated (line 8) in order to find the minimum one
(line 9). The minimum spatial distance corresponds to the distance of the superpixel that
will illuminate the shadowed superpixel located in the area
.
The lit superpixel pairi, that has the minimum distance d from the shadowed one,
I
SLIC(
S
)
i
,
is considered as the optimal counterpart for the respective shadow superpixel.
Next, the histograms, and the corresponding cumulative distribution functions (cdfs) of the
shadow superpixel
ISLIC(S)i
and of its pairi, are calculated (line 10).
(2)
dLab
=
(
L
k
L
i)
2
+
(
a
k
a
i)
2
+
(
b
k
b
i)2
(3)
dxy
=
(
x
k
x
i)
2
+
(
y
k
y
i)
2
(4)
D
s=
d2
Lab +m2
S
d2
xy
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19524
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
Histogram matching [40] is performed on
I
SLIC(S)
i
to transform the shadow histogram, so
that it matches the corresponding lit cdf of pairi. The shadow superpixels are relighted,
using the color values of the lit counterpart pairi (line 11).
These steps are iteratively performed for all shadow superpixels
I
SLIC(S)
i=1,..K
.
Finally, all the relighted shadowed superpixels are merged to extract the relighted region
IR(line 12). After completing these iterations, the relighted shadowed region IR is merged
with the initial lit region IL (line 13).
The entire process is repeated three times, for R, G, and B channels, and the results are
concatenated (line 14) to produce the final shadow-free image Inon shadow (line 15).
4 Evaluation
4.1 Experimental setup anddatasets
Τhe proposed methodology and all the experiments have been implemented in MATLAB
R2019a, on an AMD Ryzen Core-75800H 3.2GHz, with 16 GB RAM. The experimen-
tal evaluation has been based on three benchmark datasets, namely the Image Shadow
Algorithm2 Simple Unsupervised Shadow Removal (SUShe) Pseudocode
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19525
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
Triplets Dataset (ISTD), the Adjusted Image Shadow Triplets Dataset (AISTD), and the
Shadow Removal Dataset (SRD). ISTD is the most challenging shadow dataset employed
in state-of-the-art works. It consists of 2410 triplets, each comprising the initial RGB
images with shadows, the shadow mask, and the ground truth RGB shadow-free image.
ISTD is divided in two subsets; the first subset is composed of 1870 images for training
and the second one contains 540 images for testing. Each image has a size of 480 × 640
pixels. Another well-recognized dataset is AISTD, which is an improved version of the
ISTD, as described in [33], with 1870 training and 540 testing images. Experiments were
also performed using the SRD dataset. This dataset has been proposed in [46] and consists
of 3088 images in total, from which 2680 are used for training and 408 are used for test-
ing. This dataset includes images of various scenes, illumination conditions and object
types, in order to enable the investigation of various shadow and reflectance phenomena.
4.2 Evaluation metrics
The results of the proposed method have been evaluated both quantitatively and qualita-
tively. The quantitative evaluation was based on the Root Mean-Squared Error (RMSE)
and the Peak Signal to Noise Ratio (PSNR). The RMSE between two given images has
been calculated for the shadowed area, the non-shadow area, and for all areas, using the
evaluation code proposed in [21], which has also been used in major state-of-the-art works,
such as [33, 34, 53]. In that code, the RMSE was implemented as follows:
Fig. 3 Illustration of the proposed SUShe shadow removal framework
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19526
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
where GT is the ground truth image, Outputis the predicted shadow-free image and i = 1, .
. , n represents the index of each pixel in the area of interest (i.e., shadow, non-shadow, all
areas) in the image, and n is the total number of pixels in that area.
The Mean-Squared Error (MSE) between the output of the shadow removal and the
ground truth without shadows is calculated by:
PSNR has been calculated using Eq. (6):
where M is the maximum pixel value in the area of interest. PSNR is measured in deci-
bels (db). A higher PSNR value is linked to higher output image quality (). The RMSE
decreases as the output image becomes more similar to the ground truth; therefore, the
output image quality is improved.
Furthermore, we have also assessed our experiments using the novel evaluation metric
Learned Perceptual Image Patch Similarity (LPIPS), which closely matches human recep-
tion [61]. For some predefined network, LPIPS merely computes the similarity between the
activations of two image patches. An image patch perceived similarity is indicated by a low
LPIPS score ().
5 Results
In this section, quantitative and qualitative results of the proposed methodology are pre-
sented. Different values of K with respect to superpixel segmentation were tested to find
the most appropriate segmentation level; specifically, K = 70, 80, 90, 100, 400 and 700
superpixels. Tables1, 2 and 3 summarize the results obtained by SUShe in ISTD, AISTD
and SRD datasets, respectively(the best results are indicated in bold). In Table1 it can
be noted that by setting K = 90 (PSNR = 24.82, RMSE = 8.14, LPIPS = 0.079), SUShe
achieves the best shadow removal results in ISTD. Second best results were obtained
(5)
RMSE
=
1
n
n
i=1
GTiOutputi
2
(6)
MSE
=
1
n
n
i=1
(
GTiOutputi
)2
(7)
PSNR
=10 log
(
M2
MSE )
Table 1 Quantitative Results of
the proposed methodology for
different superpixel values in
ISTD
ΚPSNRRMSELPIPS
All regions All regions Shadow Non-shadow All regions
70 24.84 8.15 14.91 6.83 0.078
80 24,77 8.19 15.16 6.83 0.079
90 24.82 8.14 14.87 6.83 0.079
100 24.75 8.17 15.02 6.83 0.080
400 24.71 8.17 14.99 6.83 0.085
700 24.68 8.16 14.99 6.83 0.088
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19527
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
for K = 70 (PSNR = 24.84, RMSE = 8.15, LPIPS = 0.078). For K = 100, K = 400, K = 700
(mean values approximately equal to PSNR = 24.73, RMSE = 8.17 and LPIPS = 0.084)
the results are comparable. For K = 80 the lowest performance is obtained. In the case
of AISTD (Table2), the best results of SUShe are also obtained for the lowest values
K, in the range tested, i.e. K = 70 (PSNR = 30.09, RMSE = 4.12, LPIPS = 0.076) and
K = 90 (PSNR = 30.06, RMSE = 4.11, LPIPS = 0.076). Again, for larger values of K, i.e.
K = 100, Κ = 400, Κ = 700, the results are comparable to each other (mean values approx-
imately equal to PSNR = 29.60, RMSE = 4.24, LPIPS = 0.083). The results obtained for
K = 70 are the best all over the datasets we have used for evaluation. In the case of SRD,
the optimal results obtained for K = 70 lead to the lowest RMSE score (PSNR = 22.05,
RMSE = 8.68, LPIPS = 0.167) The values K = 80, 90 lead to slightly inferior accuracy. It
can be observed that higher values for K are not linked with higher quality results.
Figure4 indicates that the proposed methodology is relatively insensitive to K. Espe-
cially in terms of LPIPS, the results are comparable using different values of K. Yet, its
performance is slightly better for the lowest values of K, in the range tested.
Tables 4, 5 and 6 present experimental comparisons between SUShe (indicated in
bold)and state-of-the-art shadow removal algorithms. The results presented for the latter
are derived from the literature. RMSE is computed in three ways: a) inside the shadowed
region (Shadow), outside the shadowed region (Non-shadow) and in the entire image (All
regions). Figures5, 6 and 7 illustrate qualitative results of SUShe and other state-of-the-art
algorithms, including all unsupervised methods and some of the supervised ones, which
are publicly available by the authors.
Table 4 presents comparisons on ISTD. SUShe outperforms all non-neural net-
work-based methods ([21, 23, 58]), as it achieves the best results in terms of all met-
rics (PSNR = 24.82, RMSE = 8.14, LPIPS = 0.079). The methods of Guo et al. [23]
and Gong etal. [21] have been created using supervised techniques in the context of
Table 2 Quantitative Results of
the proposed methodology for
different superpixel values in
AISTD
ΚPSNRRMSELPIPS
All regions All regions Shadow Non-shadow All regions
70 30.09 4.12 12.26 2.53 0.076
80 29.99 4.14 12.43 2.53 0.076
90 30.06 4.11 12.22 2.53 0.076
100 29.91 4.14 12.40 2.53 0.077
400 29.51 4.24 13.03 2.53 0.083
700 29.24 4.34 13.59 2.53 0.086
Table 3 Quantitative results of
the proposed methodology for
different superpixel values in
SRD
ΚPSNRRMSELPIPS
All regions All regions Shadow Non-shadow All regions
70 22.05 8.68 16.31 7.25 0.167
80 21.95 9.69 23.25 4.54 0.168
90 21.91 9.66 23.16 4.54 0.168
100 21.85 9.74 23.44 4.54 0.168
400 21.03 10.56 26.41 4.54 0.168
700 20.64 10.94 27.81 4.54 0.178
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19528
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
shadow detection and removal. On one hand, Guo etal. use pairwise classification for
shadow removal. On the other hand, the method of Gong etal. requires indication of the
shadowed and lit areas using a GUI tool to apply shadow detection. In addition, SUShe
outperforms three state-of-the-art neural network-based methods: ARGAN [14], Cycle-
GAN [64], the method proposed by Nagae etal. [43], and the well-known SP + M Net,
DHAN [33] (Table4). The rest of the neural network-based methods achieve RMSE
values that are lower than the RMSE of SUShe. Still, SUShe obtains LPIPS = 0.079
in the ISTD, which is the lowest value with the exception of ST-CGAN. Approaches
such as [26, 27, 33, 37, 46], [20, 30] lead to results comparable to SUShe (with a dif-
ference in RMSE that is not exceeding 2.0); however, these approaches require training.
Figure5 illustrates indicative shadow removal results of SUShe and the state-of-the-art
Fig. 4 LPIPS sensitivity for different K values (a) in ISTD, (b) in AISTD and (c) in SRD
Table 4 Quantitative results in comparison with other state-of-the-art methodologies for ISTD
N/A: Not Applicable.
Methodology Training PSNRRMSELPIPS
All regions All regions Shadow Non-shadow All regions
Yang etal. (2012) [58] N/A 20.11 15.63 19.82 14.83
Guo etal. (2013) [23] N/A 22.33 9.30 18.95 7.46
Gong etal. (2016) [21] N/A 24.07 8.53 14.98 7.29
ARGAN (2019) [14] paired 24.25 11.02 15.49 10.21
ST-CGAN (2018) [53] paired 24.85 7.47 10.33 6.93 0.067
DSC (2020) [26] paired 26.62 6.67 9.76 6.93 0.202
DeshadowNet (2017) [46] paired 24.25 7.60
Fu etal. (2021) [20] paired 27.19 5.92 7.77 5.56 0.086
SP + M-Net (2019) [33] paired 24.10 8.32 11.22 7.77 0.080
Nagae etal. (2020) [43] paired 24.87 10.63 17.80 8.62
DHAN (2020) [11] paired 27.88 8.46 10.56 7.84
Cycle-GAN (2017) [27, 64] unpaired 8.16 0.120
Mask-ShadowGAN(2019) [27] unpaired 25.07 7.41 12.67 6.68 0.250
LG-ShadowNet (2021) [37] unpaired 25.92 6.67 10.93 5.94
DC-ShadowNet (2021) [30] unpaired 22.94 6.57 10.64 5.80 0.090
SUShe N/A 24.82 8.14 14.87 6.83 0.079
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19529
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
methods compared. As can be observed in Fig.8b or e, some methods completely alter
the image, while others fail in completely removing the shadow (Fig.8c, d, f, j). In addi-
tion, in Fig.8h, the pair of shadowed/non-shadowed regions is erroneous, leading to a
relighting from an erroneous non-shadow area and eventually to an erroneous bright-
ness reset. SUShe is the only completely unsupervised methodology with a satisfactory
performance along the entire ISTD. The comparative results in the ISTD are graphically
represented in Fig.5.
Table5 presents comparisons between SUShe and state-of-the-art shadow removal
methods, for AISTD. In this case, SUShe is ranked second with respect to PSNR and
Table 5 Quantitative results in comparison with other state-of-the-art methodologies for AISTD
Methodology Training PSNRRMSELPIPS
All regions All regions Shadow Non-shadow All regions
Yang etal. (2012) [58] N/A 19.77 16.00 24.70 14.40
Guo etal. (2013) [23] N/A 24.27 6.10 22.00 3.10
Gong etal. (2016) [21] N/A 27.78 5.10 14.42 3.39 0.086
ST-CGAN (2018) [53] paired 23.14 8.70 13.40 7.70 0.150
DeshadowNet (2017) [46] paired 7.60 15.90 6.00
SG-ShadowNet (2022) [51] paired 32.45 3.40 5.90 3.40 0.070
Fu etal. (2021) [20] paired 29.44 3.80 6.50 4.20 0.106
SP + M-Net (2019) [33] paired 30.02 4.41 8.84 3.64 0.085
Nagae etal. (2020) [43] paired 26.73 9.40 17.11 6.54
DHAN (2020) [11] paired 24.86 10.05 12.09 9.42
Mask-ShadowGAN(2019) [27] unpaired 26.89 5.30 12.50 4.00 0.095
LG-ShadowNet (2021) [37] unpaired 29.22 5.00 10.60 5.00 0.103
DC-ShadowNet (2021) [30] unpaired 28.76 4.60 10.30 3.50 0.170
SUShe N/A 30.06 4.11 12.22 2.53 0.076
Table 6 Quantitative results in comparison with other state-of-the-art methodologies for SRD
Methodology Training PSNRRMSELPIPS
All regions All regions All regions
Yang etal. (2012) [58] N/A 22.57
Guo etal. (2013) [23] N/A 12.60
Gong etal. (2016) [21] N/A 8.73
DSC (2020) [26] paired 21.52 6.21 0.147
DeshadowNet (2017) [46] paired 6.64 0.165
SG-ShadowNet (2022) [51] paired 31.16 6.23 0.099
Fu etal. (2021) [20] paired 27.74 6.51 0.153
Nagae etal. (2020) [43] paired 22.15 15.59
DHAN (2020) [11] paired 27.66 8.68
Cycle-GAN (2017) [64] unpaired 9.14
Mask-ShadowGAN (2019) [27] unpaired 7.32
DC-ShadowNet (2021) [30] unpaired 30.55 4.66 0.167
SUShe N/A 22.05 8.68 0.167
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19530
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
LPIPS, after SP + M Net and SG-ShadowNet, and is ranked third with respect to RMSE.
The difference between SUShe results and these methods is notably small, given the
higher computational complexity of the latter. Figure9 illustrates comparative results
on images from AISTD. Once again, the methods proposed by Guo etal., Gong etal.,
and the methods DC-ShadowNet and LG-ShadowNet (Fig.9c-e, h) fail to remove the
shadow in the second and third image (center and right column), whereas the method
of Yang etal. (Fig.9b) alters the image inside and outside the shadow regions. The best
results are obtained by SUShe, whereas comparable results are obtained by SG-Shad-
owNet, DHAN, and by the method of Fu etal. (Fig.9f-g, i). The comparative results in
the AISTD are also graphically represented in Fig.6.
Fig. 5 Comparative results on the ISTD (from Table4), in terms of RMSE, PSNR, and LPIPS metrics. The
horizontal lines represent the mean scores obtained from all the previously proposed methods per metric
Fig. 6 Comparative results on the AISTD (from Table5), in terms of RMSE, PSNR, and LPIPS metrics.
The horizontal lines represent the mean scores obtained from all the previously proposed methods per
metric
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19531
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
Table 6 presents comparisons on SRD. In this case, SUShe outperforms all unsuper-
vised methods, and notably outperforms Cycle-GAN and the method of Nagae etal. as
well. Furthermore, the performance of SUShe is comparable to the performances of
DHAN (RMSE = 8.68) and DeShadowNet [46], in terms of LPIPS. Figure 10 illustrates
indicative qualitative results of SUShe and other state-of-the-art algorithms on images
from SRD, including the supervised methods. In the case of the first image of Fig.10 (left),
the output of SUShe is obviously closer to the ground truth than the rest of the methods. In
the case of the second image of Fig.10 (right), the result of SUShe is comparable to that
of DeshadowNet and ARGAN, DSC, DC-ShadowNet, Fu etal., and SG-ShadowNet. It is
also worth noting that SUShe preserves the details of the original image, unlike ST-CGAN,
which introduces blur artifacts on the original image. Overall, SUShe achieves compara-
ble shadow removal results with some supervised methods. The comparative results in the
SRD are also graphically represented in Fig.7.
6 Computational complexity
The computational complexity of SUShe is estimated in order to quantitatively assess its
efficiency. More specifically:
a) SLIC Superpixels are used for segmentation of both lit and shadowed regions by
employing hue and spatial coordinates. SLIC bypasses tens of thousands of point-to-point
redundant distance calculations by localizing the search in the clustering process. SLIC is
O(N), where N is the number of the pixels of an image [2].
b) For each RGB channel (c = 3 for the loops in the following process):
Histogram calculation of an image is O(N), since N =width × height.
Calculation of the gravity centers of the lit superpixels is O(K), where Kis the number
of lit superpixels.
Scanning the shadow superpixels to calculate the following features, is O(K):
Fig. 7 Comparative results on the SRD (from Table6), in terms of RMSE, PSNR, and LPIPS metrics. The
horizontal lines represent the mean scores obtained from all the previously proposed methods per metric
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19532
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
(a)
Shadow Image
(b)
Yang et al., (2012)
(c)
Guo et al., (2012)
(d)
Gong et al., (2016)
(e)
STCGAN, (2018)
(f)
DSC (2020)
(g)
SP+M Net (2019)
(h)
DC-ShadowNet (2
021)
(i)
Fu et al., (2021)
(j)
LG-ShadowNet, (202
1)
(k)
DHAN, (2020)
(l)
Ground Truth Image
(m)
SUShe
Fig. 8 Indicative results of SUShe and other state-of-the-art methods in ISTD. a Shadow Image, b Yang
etal., (2012)[58], c Guo etal., (2012)[23], d Gong etal., (2016)[21], e STCGAN, (2018)[53], f DSC (2020)
[26], g SP + M Net (2019)[33], h DC-ShadowNet (2021)[30], i Fu etal., (2021)[20], j LG-ShadowNet,
(2021)[37], k DHAN, (2020)[11], l Ground Truth Image, m SUShe
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19533
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
(a)
Shadow Image
(b)
Yang et al., (2012)
(c)
Guo et al., (2012)
(d)
Gong et al., (2016)
(e)
DC-ShadowNet, (2021)
(f)
Fu et al., (2021)
(g)
SG-ShadowNet, (2022)
(h)
LG-ShadowNet, (2021)
(i)
DHAN, (2020)
(i)
Ground Truth Image
(j)
SUShe
Fig. 9 Indicative results of SUShe and other state-of-the-art methods in AISTD, (a) Shadow Image, (b)
Yang etal., (2012)[58], (c) Guo etal., (2012)[23], (d) Gong etal., (2016)[21], (e) DC-ShadowNet, (2021)
[30], (f) Fu etal., (2021)[20], (g) SG-ShadowNet, (2022)[51], (h) LG-ShadowNet, (2021)[37], (i) DHAN,
(2020)[11], (i) Ground Truth Image, (j) SUShe
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19534
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
(a)
Shadow Image
(b)
DeshadowNet, (2017)
(c)
STCGAN, (2018)
(d)
ARGAN, (2019)
(e)
DSC, (2020)
(f)
DC-ShadowNet, (2021)
(g)
Fu et al., (2021)
(h)
SG-ShadowNet. (2022)
(i)
DHAN, (2020)
(j)
Ground Truth Image
(k)
SUShe
Fig. 10 Indicative results of SUShe and other state-of-the-art methods in SRD. a Shadow Image, b DeshadowNet,
(2017)[46], c STCGAN, (2018)[53], d ARGAN, (2019)[14], e DSC, (2020)[26], f DC-ShadowNet, (2021)[30], g
Fu etal., (2021)[20], h SG-ShadowNet. (2022)[51], i DHAN, (2020)[11], j Ground Truth Image, k SUSh
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19535
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
Calculation of the Euclidean distance with all lit superpixels and finding the minimum
amounts to O(K).
Thecalculationof cdfsis estimated almost equal to O(N).
Histogram matching amounts to O(greylevels2).
A quantitative comparison of the Floating-point Operations per Second (FLOPS)
between SUShe and available deep learning-based methods is presented in Table 7 and
Fig.11. It can be noticed that SUShe has a significantly lower FLOPS value (indicated in
bold).
7 Discussion andconclusions
This work investigated a simple, efficient and effective solution for the complex prob-
lem of shadow removal, which can affect object detection and recognition algorithms,
deteriorating their performance. The experimental results showed that by combin-
ing simple segmentation and color enhancement algorithms, the original brightness
of shadowed regions can be restored. This was validated by quantitative and qualita-
tive comparisons performed with both unsupervised and supervised state-of-the-art
methods. All experiments were performed in three widely adopted publicly available
Table 7 The number of FLOPS
per image for different shadow
removal methods
Method FLOPS×109
Mask-ShadowGAN 266.4
Fu etal. 104.8
LG-ShadowNet 82.7
SP + M-Net 39.8
SG-ShadowNet 39.7
SUShe 0.7
Fig. 11 The number of FLOPS
per image (from Table7) for dif-
ferent shadow removal methods
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19536
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
benchmark datasets. From Tables4, 5 and 6 and Figs. 5, 6 and 7, it is evident that
SUShe outperforms all state-of-the-art unsupervised methods compared, as well as
some supervised ones (Cycle-GAN, ARGAN, Nagae etal., DHAN, LG-ShadowNet,
DC-ShadowNet). As for those that SUShe does notoutperform, such as SG-Shad-
owNet, the results of SUShe are comparable both quantitatively and qualitatively
(Figs. 8, 9 and 10). To the best of our knowledge SUShe is the only unsupervised
algorithm which provides a similar shadow removal performance with most of the
state-of-the-art supervised methods compared, on the full, widely used benchmark
datasets considered in this study. Furthermore, the comparisons with state-of-the-art
deep-learning-based methods in terms computational complexity showed that SUShe
is much more efficient (Table7 and Fig.11).
Overall, the following conclusions can be derived:
SUShe is very simple to implement and of low computational complexity.
Its computational complexity is generally lower than that of the state-of-the-art algo-
rithms for shadow removal.
The results obtained indicate that SUShe can remove shadows better than any of the
compared state-of-the-art unsupervised shadow removal methods.
In comparison with the supervised state-of-the-art shadow removal methods, its per-
formance is comparable or better.
Solving the shadow removal problem does not necessarily require complex deep
learning-based solutions.
Shadow removal is instrumental for object recognition in various domains such as
remote sensing image processing, traffic monitoring and object recognition. Future work
will involve evaluation of SUShe in various applications where rapid system response
is required, such as assistive navigation systems [13]. Furthermore, SUShe can be also
applied in the medical domain to investigate how shadow removal can optimize the results
of medical imaging in the shadowed regions of an internal body organ, e.g., the shadowed
regions in images obtained from gastrointestinal capsules, diagnostic ultrasounds etc.
Acknowledgements The authors would like to thank Prof. Yandong Tang for the SRD dataset provision.
This work was co-financed by Greece and the European Union (European Social Fund-ESF) through the
Operational Programme «Human Resources Development, Education and Lifelong Learning» in the context
of the Act “Enhancing Human Resources Research Potential by undertaking a Doctoral Research” Sub-
action 2: IKY Scholarship Programme for PhD candidates in the Greek Universities.
Funding Open access funding provided by HEAL-Link Greece.
Data availability All data analyzed during this study are available in https:// github. com/ DeepI nsight-
PCALab/ ST- CGAN (ISTD) and https:// github. com/ cvlab- stony brook/ SID (AISTD), and included in this
published article [46] (SRD).
Declarations
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long
as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Com-
mons licence, and indicate if changes were made. The images or other third party material in this article
are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19537
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.
References
1. Abiko R, Ikehara M (2022) Channel Attention GAN Trained with Enhanced Dataset for Single-Image
Shadow Removal. IEEE Access 10:12322–12333
2. Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Süsstrunk S (2010) Slic superpixels. ΙEEE Trans Pat-
tern Anal Mach Intell 34(11):2274–82.https:// doi. org/ 10. 1109/ TPAMI. 2012. 120
3. Alvarado-Robles G, Osornio-Rios RA, Solis-Munoz FJ, Morales-Hernandez LA (2021) An approach
for shadow detection in aerial images based on multi-channel statistics. IEEE Access 9:34240–34250
4. Avina-Cervantes JG, Mart𝚤nez-Jiménez L, Devy M, Hernández-Gutierrez A, Almanza DL, Ibarra MA
(2007) Shadows attenuation for robust object recognition. In: Mexican International Conference on
Artificial Intelligence, pp 650–659
5. Baba M, Mukunoki M, Asada N (2004) Shadow removal from a real image based on shadow density.
In: Proceedings of the ACM SIGGRAPH, p 60
6. Barrow H, Tenenbaum J, Hanson A, Riseman E (1978) Recovering intrinsic scene characteristics.
Comput Vis Syst 2(3–26):2
7. Benalia S, Hachama M (2022) A nonlocal method for image shadow removal. Comput Math Appl
107:95–103
8. Birbil SL, Fang S-C (2003) An electromagnetism-like mechanism for global optimization. J Glob
Optim 25(3):263–282
9. Chen Q, Zhang G, Yang X, Li S, Li Y, Wang HH (2018) Single image shadow detection and removal
based on feature fusion and multiple dictionary learning. Multimed Tools Appl 77(14):18601–18624.
https:// doi. org/ 10. 1007/ s11042- 017- 5299-0
10. Chen Z, Long C, Zhang L, Xiao C (2021) Canet: A context-aware network for shadow removal. In:
Proceedings of the IEEE/CVF Int Conf Comput Vis (ICCV), pp 4743–4752
11. Cun X, Pun C-M, Shi C (2020) Towards ghost-free shadow removal via dual hierarchical aggregation
network and shadow matting GAN. In: Proceedings of the AAAI Conf Artif Intell 34(07), pp 10680–
10687 https:// doi. org/ 10. 1609/ aaai. v34i07. 6695
12. Dhingra G, Kumar V, Joshi HD (2021) Clustering-based shadow detection from images with texture
and color analysis. Multimed Tools Appl 80(25):33763–33778
13. Dimas G, Diamantis DE, Kalozoumis P, Iakovidis DK (2020) Uncertainty-aware visual perception sys-
tem for outdoor navigation of the visually challenged. Sensors 20(8):2385
14. Ding B, Long C, Zhang L, Xiao C (2019) Argan: attentive recurrent generative adversarial network
for shadow detection and removal. In: Proceedings of the IEEE/CVF Int Conf Comput Vis (ICCV), pp
10213–10222
15. Einy T, Immer E, Vered G, & Avidan S (2022) Physics based image deshadowing using local linear
model. In: Proceedings of the IEEE/CVF Conf Comput Vis Patt Rec (CVPR), pp 3012–3020
16. Fan X, Wu W, Zhang L, Yan Q, Fu G, Chen Z, Long C, Xiao C (2020) Shading-aware shadow detec-
tion and removal from a single image. Vis Comput 36(10):2175–2188. https:// doi. org/ 10. 1007/
s00371- 020- 01916-3
17. Finlayson GD, Hordley SD, Drew MS (2002) Removing shadows from images. In: Proceedings of the
Eur Conf Comput Vis (ECCV), pp 823–836
18. Finlayson GD, Hordley SD, Lu C, Drew MS (2005) On the removal of shadows from images. IEEE
Trans Pattern Anal Mach Intell 28(1):59–68
19. Finlayson GD, Drew MS, Lu C (2009) Entropy minimization for shadow removal. Int J Comput Vis
85(1):35–57
20. Fu L, Zhou C, Guo Q, Juefei-Xu F, Yu H, Feng W, Liu Y, Wang S (2021) Auto-exposure fusion for
single-image shadow removal. In: Proceedings of the IEEE/CVF Conf Comput Vis Patt Rec (CVPR),
pp 10571–10580
21. Gong H, Cosker D (2016) Interactive removal and ground truth for difficult shadow scenes. JOSA A
33(9):1798–1811. https:// doi. org/ 10. 1364/ JOSAA. 33. 001798
22. Guo R, Dai Q, Hoiem D (2011) Single-image shadow detection and removal using paired regions. In:
Proceedings of the IEEE/CVF Conf Comput Vis Patt Rec (CVPR) 2011, pp 2033–2040
23. Guo R, Dai Q, Hoiem D (2012) Paired regions for shadow detection and removal. IEEE Trans Pattern
Anal Mach Intell 35(12):2956–2967
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19538
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
24. He S, Peng B, Dong J, Du Y (2021) Mask-ShadowNet: toward shadow removal via masked adaptive
instance normalization. IEEE Signal Process Lett 28:957–961
25. Hiary H, Zaghloul R, Al-Zoubi MB (2018) Single-image shadow detection using quaternion cues.
Comput J 61(3):459–468
26. Hu X, Fu C-W, Zhu L, Qin J, Heng P-A (2019) Direction-aware spatial context features for shadow
detection and removal. IEEE Trans Pattern Anal Mach Intell 42(11):2795–2808
27. Hu X, Jiang Y, Fu C-W, Heng P-A (2019) Mask-shadowgan: learning to remove shadows from
unpaired data. In: Proceedings of the IEEE/CVF Int Conf Comput Vis (ICCV), pp 2472–2481
28. Hu X, Wang T, Fu C-W, Jiang Y, Wang Q, Heng P-A (2021) Revisiting shadow detection: a new
benchmark dataset for complex world. IEEE Trans Image Process 30:1925–1934
29. Jarraya SK, Hammami M, Ben-Abdallah H (2016) Adaptive moving shadow detection and removal by
new semi-supervised learning technique. Multimed Tools Appl 75(18):10949–10977
30. Jin Y, Sharma A, Tan RT (2021) DC-Shadownet: single-image hard and soft shadow removal using
unsupervised domain-classifier guided network. In: Proceedings of the IEEE/CVF Int Conf Comput
Vis (ICCV), pp 5027–5036
31. Khare M, Srivastava RK, Jeon M (2018) Shadow detection and removal for moving objects using
Daubechies complex wavelet transform. Multimed Tools Appl 77(2):2391–2421. https:// link. sprin ger.
com/ artic le/ 10. 1007/ s11042- 017- 4371-0
32. Koutsiou D-CC, Savelonas M, Iakovidis DK (2021) HV shadow detection based on electromagnetism-
like optimization. In: Proceedings of the Eur Sig Process Conf (EUSIPCO), pp 635–639
33. Le H, Samaras D (2019) Shadow removal via shadow image decomposition. In: Proceedings of the
IEEE/CVF Int Conf Comput Vis (ICCV), pp 8578–8587
34. Le H, Samaras D (2020) From shadow segmentation to shadow removal. In: Proceedings of the Eur
Conf Comput Vis (ECCV), pp 264–281
35. Le H, Samaras D (2021) Physics-based shadow image decomposition for shadow removal. IEEE Trans
Pattern Anal Mach Intell 44(12):9088–9101. https:// doi. org/ 10. 1109/ tpami. 2021. 31249 34
36. Liu F, Gleicher M (2008) Texture-consistent shadow removal. In: Proceedings of the Eur Conf Comput
Vis (ECCV), pp 437–450
37. Liu Z, Yin H, Mi Y, Pu M, Wang S (2021) Shadow removal by a lightness-guided network with train-
ing on unpaired data. IEEE Trans Image Process 30:1853–1865
38. Liu Z, Yin H, Wu X, Wu Z, Mi Y, Wang S (2021) From shadow generation to shadow removal. In:
Proceedings of the IEEE/CVF Conf Comput Vis Patt Rec (CVPR), pp 4927–4936
39. Liu Y, Li Q, Yuan Y, Du Q, Wang Q (2021) ABNet: adaptive balanced network for multiscale object
detection in remote sensing imagery. IEEE Trans Geosci Remote Sens 60:1–14
40. Maini R, Aggarwal H (2010) A comprehensive review of image enhancement techniques.
arXiv:1003.4053. https:// doi. org/ 10. 48550/ arXiv. 1003. 4053
41. Murali S, Govindan V, Kalady S (2019) Shadow removal from uniform-textured images using iterative
thresholding of shearlet coefficients. Multimed Tools Appl 78(15):21167–21186
42. Murali S, Govindan V, Kalady S (2022) Quaternion-based image shadow removal. Vis Comput
38(5):1527–1538
43. Nagae T, Abiko R, Yamaguchi T, Ikehara M (2021) Shadow detection and removal using GAN. In:
Proceedings of the Eur Sig Process Conf (EUSIPCO), pp 630–634
44. Ntakolia C, Dimas G, Iakovidis DK (2022) User-centered system design for assisted navigation of
visually impaired individuals in outdoor cultural environments. Univ Access Inf Soc 21(1):249–274.
https:// doi. org/ 10. 1007/ s10209- 020- 00764-1
45. Oliva D, Cuevas E, Pajares G, Zaldivar D, Osuna V (2014) A multilevel thresholding algorithm using
electromagnetism optimization. Neurocomputing 139:357–381
46. Qu L, Tian J, He S, Tang Y, Lau RW (2017) Deshadownet: a multi-context embedding deep network for
shadow removal. In: Proceedings of the IEEE/CVF Conf Comput Vis Patt Rec (CVPR), pp 4067–4075
47. Shor Y, Lischinski D (2008) The shadow meets the mask: pyramid-based shadow removal. Comput
Graph Forum 27(2):577–586
48. Sovatzidi G, Savelonas M, Koutsiou D-CC, Iakovidis DK (2020) Image segmentation based on deter-
minative brain storm optimization. In: Proceedings of the Int Work Sem Soc Med Adapt Pers (SMAP),
2020:1–6
49. Ufuktepe DK, Collins J, Ufuktepe E, Fraser J, Krock T, Palaniappan K (2021) Learning-based shadow
detection in aerial imagery using automatic training supervision from 3D point clouds. In: Proceedings
of the IEEE/CVF Int Conf Comput Vis (ICCV), pp 3926–3935
50. Vicente TFY, Hou L, Yu C-P, Hoai M, Samaras D (2016) Large-scale training of shadow detectors
with noisily-annotated shadow examples. In: Proceedings of the Eur Conf Comput Vis (ECCV), part
VI, 14, pp 816–832
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
19539
Multimedia Tools and Applications (2024) 83:19517–19539
1 3
51. Wan J., H. Yin, Z. Wu, X. Wu, Y. Liu, and S. Wang (2022) Style-guided shadow removal. In: Proceed-
ings of the Eur Conf Comput Vis (ECCV), pp 361–378
52. Wang J-M, Chung Y-C, Chang C, Chen S-W (2004) Shadow detection and removal for traffic images.
IEEE Int Conf Network Sens Control 1:649–654
53. Wang J, Li X, Yang J (2018) Stacked conditional generative adversarial networks for jointly learn-
ingshadow detection and shadow removal. In: Proceedings of the IEEE Conf Comput Vis Patt Rec
(CVPR), pp 1788–1797
54. Wang J, Yang D, Chen S, Zhu X, Wu S, Bogonovich M, Guo Z, Zhu Z, Wu J (2021) Automatic cloud
and cloud shadow detection in tropical areas for PlanetScope satellite images. Remote Sens Environ
264:112604. https:// doi. org/ 10. 1016/j. rse. 2021. 112604
55. Wang Q, Liu Y, Xiong Z, Yuan Y (2022) Hybrid feature aligned network for salient object detection in
optical remote sensing imagery. IEEE Trans Geosci Remote Sens 60:1–15
56. Wu W, Wu X, Wan Y (2022) Single-image shadow removal using detail extraction and illumination
estimation. Vis Comput 38(5):1677–1687. https:// doi. org/ 10. 1007/ s00371- 021- 02096-4
57. Xiao M, Han C-Z, Zhang L (2007) Moving shadow detection and removal for traffic sequences. Int J
Autom Comput 4(1):38–46
58. Yang Q, Tan K-H, Ahuja N (2012) Shadow removal using bilateral filtering. IEEE Trans Image Pro-
cess21(10):4361–4368. https:// doi. org/ 10. 1109/ TIP. 2012. 22089 76
59. Zhang H, Sun K, Li W (2014) Object-oriented shadow detection and removal from urban high-reso-
lution remote sensing images. IEEE Trans Geosci Remote Sens 52(11):6972–6982. https:// doi. org/ 10.
1109/ TGRS. 2014. 23062 33
60. Zhang L, Zhang Q, Xiao C (2015) Shadow remover: image shadow removal based on illumination
recovering optimization. IEEE Trans Image Process 24(11):4623–4636
61. Zhang R, Isola P, Efros AA, Shechtman E, Wang O (2018) The unreasonable effectiveness of deep
features as a perceptual metric. In: Proceedings of the IEEE Conf Compu Vis Patt Rec (CVPR), pp
586–595
62. Zhang L, Long C, Zhang X, Xiao C (2020) Ris-Gan: explore residual and illumination with gen-
erative adversarial networks for shadow removal. In: Proceedings of the AAAI Conf Artif Intell
34(07):12829–12836
63. Zheng L, Ruan X, Chen Y, Huang M (2017) Shadow removal for pedestrian detection and track-
ing in indoor environments. Multimed Tools Appl 76(18):18321–18337. https:// doi. org/ 10. 1007/
s11042- 016- 3880-6
64. Zhu J-Y, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consist-
ent adversarial networks. In: Proceedings of the IEEE international conference on computer vision
(ICCV), pp 2223–2232
65. Zhu Y, Huang J, Fu X, Zhao F, Sun Q, Zha Z-J (2022) Bijective mapping network for shadow removal.
In: Proceedings of the IEEE/CVF Conf Comput Vis Patt Rec (CVPR), pp 5627–5636
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
1.
2.
3.
4.
5.
6.
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center
GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers
and authorised users (“Users”), for small-scale personal, non-commercial use provided that all
copyright, trade and service marks and other proprietary notices are maintained. By accessing,
sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of
use (“Terms”). For these purposes, Springer Nature considers academic use (by researchers and
students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and
conditions, a relevant site licence or a personal subscription. These Terms will prevail over any
conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription (to
the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of
the Creative Commons license used will apply.
We collect and use personal data to provide access to the Springer Nature journal content. We may
also use these personal data internally within ResearchGate and Springer Nature and as agreed share
it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not otherwise
disclose your personal data outside the ResearchGate or the Springer Nature group of companies
unless we have your permission as detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial
use, it is important to note that Users may not:
use such content for the purpose of providing other users with access on a regular or large scale
basis or as a means to circumvent access control;
use such content where to do so would be considered a criminal or statutory offence in any
jurisdiction, or gives rise to civil liability, or is otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association
unless explicitly agreed to by Springer Nature in writing;
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a
systematic database of Springer Nature journal content.
In line with the restriction against commercial use, Springer Nature does not permit the creation of a
product or service that creates revenue, royalties, rent or income from our content or its inclusion as
part of a paid for service or for other commercial gain. Springer Nature journal content cannot be
used for inter-library loans and librarians may not upload Springer Nature journal content on a large
scale into their, or any other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not
obligated to publish any information or content on this website and may remove it or features or
functionality at our sole discretion, at any time with or without notice. Springer Nature may revoke
this licence to you at any time and remove access to any copies of the Springer Nature journal content
which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or
guarantees to Users, either express or implied with respect to the Springer nature journal content and
all parties disclaim and waive any implied warranties or warranties imposed by law, including
merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published
by Springer Nature that may be licensed from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a
regular basis or in any other manner not expressly permitted by these Terms, please contact Springer
Nature at
onlineservice@springernature.com
... Most of the current datasets developed for benchmarking shadow removal from images contain scenes that are very simple, e.g., including one or a couple of objects, or limited background variation. This can limit the capacity of supervised DL-based shadow removal Simple Unsupervised Shadow removal (SUShe), was proposed [14]. That method combines a physics-based optimization algorithm with color feature extraction for shadow detection, and it recovers the luminosity of the shadowed areas by leveraging superpixel segmentation and histogram matching. ...
... Experimental Setup. The comparative evaluation of the proposed SHAU architecture involved nine state-of-the-art shadow removal methods that cover a diverse range of approaches that rely on different types of networks to remove shadows, including ST-CGAN [17], DC-ShadowNet [31], LG-ShadowNet [58], SP+M+I-Net [19], Fu et al. [59], CNSNet [23], SG-ShadowNet [22], ShadowFormer [24] and a physics- [14], which does not require training. For a fair comparison, the results are reproduced utilizing the official source code and hyperparameters of each reported method, with the exception of CNSNet for which the source code was not available. ...
Preprint
Effective shadow removal is pivotal in enhancing the visual quality of images in various applications, ranging from computer vision to digital photography. During the last decades physics and machine learning -based methodologies have been proposed; however, most of them have limited capacity in capturing complex shadow patterns due to restrictive model assumptions, neglecting the fact that shadows usually appear at different scales. Also, current datasets used for benchmarking shadow removal are composed of a limited number of images with simple scenes containing mainly uniform shadows cast by single objects, whereas only a few of them include both manual shadow annotations and paired shadow-free images. Aiming to address all these limitations in the context of natural scene imaging, including urban environments with complex scenes, the contribution of this study is twofold: a) it proposes a novel deep learning architecture, named Soft-Hard Attention U-net (SHAU), focusing on multiscale shadow removal; b) it provides a novel synthetic dataset, named Multiscale Shadow Removal Dataset (MSRD), containing complex shadow patterns of multiple scales, aiming to serve as a privacy-preserving dataset for a more comprehensive benchmarking of future shadow removal methodologies. Key architectural components of SHAU are the soft and hard attention modules, which along with multiscale feature extraction blocks enable effective shadow removal of different scales and intensities. The results demonstrate the effectiveness of SHAU over the relevant state-of-the-art shadow removal methods across various benchmark datasets, improving the Peak Signal-to-Noise Ratio and Root Mean Square Error for the shadow area by 25.1% and 61.3%, respectively.
... Still image subdivision, which needs to be, fragmented the actual object from the groundwork to inspect the image appropriately and recognize the contents of the images very carefully through a sequence of steps can be seen in the Figure 2. flowchart. In this unique circumstance, the edge detection is a key contraption for image processing [14,15] . Flowchart of the proposed shadow removal method [5] . ...
Article
Full-text available
This research’s main objective is to study and evaluate the detection and removal of undesired shadows from still images since these shadows might mask important information caused by light sources and other obstructions. A variety of methods for detecting and eliminating shadows as well as object tracking approaches based on movement estimation and identification are investigated. This includes shadow removal methods like background subtraction, which are intended to improve obstacle recognition of the source item and increase the accuracy of shadow removal from objects. When new items enter the frame, they are first distinguished from the background using a reference frame. The tracking procedure is made more difficult by the merging of the shadow with the foreground object. The approach highlights the difficulties in object detection owing to frequent occurrences of obstacles by using morphological procedures for shadow identification and removal. The proposed approach uses feature extraction is also discussed, highlighting its importance in image processing research and the use of suggested methods to get over obstacles in image sequences. The proposed method for shadow identification and removal offers a novel approach to improve image processing when dealing with still images. The purpose of this technique is to better detect and remove shadows from images, which will increase the precision of object tracking and detection. Depending on the type of images being processed, the process begins with initializing a background model, which is based on a static image background.
Article
Full-text available
Even today, where many deep-learning-based methods have been published, single-image shadow removal is a challenging task to achieve high accuracy. This is because the shadow changes depending on various conditions such as the target material or the light source, and it is difficult to estimate all the physical parameters. In this paper, we propose a new single-image shadow removal method (Channel Attention GAN: CANet) using two networks for detecting shadows and removing shadows. Intensity change in shadowed regions has different characteristics depending on the wavelength of light. In addition, the image acquisition system of the camera acquires an image in a state where the RGB values influence each other. Therefore, our method focused on the physical properties of shadows and the camera’s image acquisition system. The proposed network has a structure considering the relationship between color channels. When training this network, we modified the color and added some artifacts to the training images in order to make the training dataset more complex. These image processing are based on the shadow model, considering the camera image acquisition system. With these new proposals, our method can remove shadows in all ISTD, ISTD+, SRD, and SRD+ datasets with higher accuracy than the state-of-the-art methods. The code is available on GitHub: https://github.com/ryo-abiko/CANet .
Chapter
Shadow removal is an important topic in image restoration, and it can benefit many computer vision tasks. State-of-the-art shadow-removal methods typically employ deep learning by minimizing a pixel-level difference between the de-shadowed region and their corresponding (pseudo) shadow-free version. After shadow removal, the shadow and non-shadow regions may exhibit inconsistent appearance, leading to a visually disharmonious image. To address this problem, we propose a style-guided shadow removal network (SG-ShadowNet) for better image-style consistency after shadow removal. In SG-ShadowNet, we first learn the style representation of the non-shadow region via a simple region style estimator. Then we propose a novel effective normalization strategy with the region-level style to adjust the coarsely re-covered shadow region to be more harmonized with the rest of the image. Extensive experiments show that our proposed SG-ShadowNet outperforms all the existing competitive models and achieves a new state-of-the-art performance on ISTD+, SRD, and Video Shadow Removal benchmark datasets. Code is available at: https://github.com/jinwan1994/SG-ShadowNet.KeywordsShadow removalRegion styleNormalization
Article
Recently, salient object detection in optical remote sensing images (RSI-SOD) has attracted great attention. Benefiting from the success of deep learning and the inspiration of natural SOD task, RSI-SOD has achieved fast progress over the past two years. However, existing methods usually suffer from the intrinsic problems of optical RSIs: 1) cluttered background; 2) scale variation of salient objects; 3) complicated edges and irregular topology. To remedy these problems, we propose a hybrid feature aligned network (HFANet) jointly modeling boundary learning to detect salient objects effectively. Specifically, we design a hybrid encoder by unifying two components to capture global context for mitigating the disturbance of complex background. Then, to detect multiscale salient objects effectively, we propose a Gated Fold-ASPP (GF-ASPP) to extract abundant context in the deep semantic features. Furthermore, an adjacent feature aligned module (AFAM) is presented for integrating adjacent features with unparameterized alignment strategy. Finally, we propose a novel interactive guidance loss (IGLoss) to combine saliency and edge detection, which can adaptively perform mutual supervision of the two subtasks to facilitate detection of salient objects with blurred edges and irregular topology. Adequate experimental results on three optical RSI-SOD datasets reveal that the presented approach exceeds 18 state-of-the-art ones. All codes and detection results are available at https://github.com/lyf0801/HFANet .
Article
This paper proposes a new model for image shadow removal. The model reformulates a recent osmosis model with nonlocal differential operators. This allows to benefit from distant pixels similarities and thus improves restoration results. Some properties of this model are established and discussed, making it suitable for our application. Experimental results show that the nonlocal model obtained very good qualitative and quantitative results compared with state-of-the-art techniques.
Article
Benefiting from the development of convolutional neural networks (CNNs), many excellent algorithms for object detection have been presented. Remote sensing object detection is a challenging task mainly due to: 1) complicated background of remote sensing images; 2) extremely imbalanced scale and sparsity distribution of remote sensing objects. Existing methods can not effectively solve these problems with excellent detection accuracy and rapid speed. To address these issues, we propose an Adaptive Balanced Network in this paper. Firstly, we design an Enhanced Effective Channel Attention (EECA) mechanism to improve the feature representation ability of backbone, which can alleviate the obstacles of complex background on foreground objects. Then, to combine multi-scale features adaptively in different channels and spatial positions, an Adaptive Feature Pyramid Network (AFPN) is designed to capture more discriminative features. Furthermore, considering that the original FPN ignores rich deep-level features, a Context Enhancement Module (CEM) is proposed to exploit abundant semantic information for multi-scale object detection. Experimental results on three public datasets demonstrate that our approach exhibits superior performance over baseline by only introducing less than 1.5M extra parameters.