Citation: Kan, Q.; Liu, X.; Meng, A.; Yu, L. Intelligent Recognition of Road Internal Void Using Ground-Penetrating Radar. Appl. Sci. 2024, 14, 11848. https://doi.org/10.3390/app142411848
Academic Editor: Amerigo Capria
Received: 22 November 2024; Revised: 12 December 2024; Accepted: 16 December 2024; Published: 18 December 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Intelligent Recognition of Road Internal Void Using
Ground-Penetrating Radar
Qian Kan 1,2, Xing Liu 2,*, Anxin Meng 2 and Li Yu 1
1 School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China; d201980602@hust.edu.cn (Q.K.); hustlyu@hust.edu.cn (L.Y.)
2 Shenzhen Urban Transport Planning Center Co., Ltd., Shenzhen 518057, China; 19b932015@stu.hit.edu.cn
* Correspondence: 19b932014@stu.hit.edu.cn
Abstract: Internal road voids can lead to decreased load-bearing capacity, which may result in
sudden road collapse, posing threats to traffic safety. Three-dimensional ground-penetrating radar
(3D GPR) detects internal road structures by transmitting high-frequency electromagnetic waves
into the ground and receiving reflected waves. However, due to noise interference during detection,
accurately identifying void areas based on GPR-collected images remains a significant challenge.
Therefore, in order to more accurately detect and identify the void areas inside the road, this study
proposes an intelligent recognition method for internal road voids based on 3D GPR. First, extensive
data on internal road voids was collected using 3D GPR, and the GPR echo characteristics of void
areas were analyzed. To address the issue of poor image quality in GPR images, a GPR image
enhancement model integrating multi-frequency information was proposed by combining the Unet
model, Multi-Head Cross Attention mechanism, and diffusion model. Finally, the intelligent recogni-
tion model and enhanced GPR images were used to achieve intelligent and accurate recognition of
internal road voids, followed by engineering validation. The research results demonstrate that the
proposed road internal void image enhancement model achieves significant improvements in both
visual effects and quantitative evaluation metrics, while providing more effective void features for
intelligent recognition models. This study offers technical support for precise decision making in
road maintenance and ensuring safe road operations.
Keywords: ground-penetrating radar; void; Unet model; image enhancement; YOLO v8; intelligent
recognition
1. Introduction
The formation of internal road voids is primarily influenced by natural geological con-
ditions, engineering factors, and environmental effects, specifically including groundwater
loss, construction quality, drainage systems, vehicle loads, underground pipeline leakage,
and environmental changes. These factors collectively lead to the loss of subgrade materials
and structural damage, significantly affecting road structure stability, reducing load-bearing
capacity, and potentially causing road collapse under the stress of heavy vehicles. Ad-
ditionally, if internal road voids are not repaired in a timely manner, they will continue
to develop and expand under the influence of vehicle loads, increasing the cost of road
maintenance and affecting traffic efficiency during the repair process [1–3]. To ensure the
health of road operations and traffic safety, scholars have conducted research on detection
methods for concealed void-related diseases within roads. By accurately understanding
the distribution and development of voids, repair strategies can be formulated to minimize
their impact on traffic. At the same time, numerous intelligent recognition algorithms have
been proposed, enhancing the efficiency of internal road void detection [4,5]. However,
due to factors such as noise interference and the complexity of the medium materials, it is
challenging to maintain high-quality internal road image data, affecting the accuracy of
void area discrimination [6–8]. Therefore, researching image enhancement methods for
low-quality internal road void image data can improve the accuracy of void identification.
Unlike apparent road surface defects, internal defects cannot be assessed directly through visual observation. Detection methods include invasive techniques such as slot cutting [9–11], as well as non-invasive techniques such as acoustic testing and electromagnetic wave detection [12–14]. Compared to destructive testing, non-
destructive testing does not compromise the integrity of the road structure, allowing for a
comprehensive assessment of the internal structure of the road and ensuring the continuity
of operations. Among non-destructive methods, GPR is widely used. This method analyzes
the internal state of roads by interpreting electromagnetic wave signals. Compared to
other methods, GPR features strong penetration, high resolution, and fast data collection.
Moreover, the detection process does not require contact with the surface of the object,
enabling the rapid, comprehensive, and safe detection of internal road voids. Therefore,
scholars often use GPR to conduct research on road defects.
GPR signal processing significantly impacts data accuracy and interpretation speed.
Researchers have achieved road internal structure assessment through Neyman–Pearson
spatial correlation analysis [15,16] and clarified radar signal attenuation characteristics and penetration depth [17]. To improve signal quality, methods including two-dimensional digital filter clutter removal, fast independent component analysis with multifractal spectrum denoising [18,19], and parameter-optimized generalized S-transform [20] were proposed. These studies enhanced adaptability in complex environments by reducing noise and strengthening signal features. Research has shown that void areas appear as bright features in radar images, with high-frequency components closely related to details and edge variations [21–23]. Analysis of high-frequency image information significantly improves void area interpretation accuracy, providing reliable evidence for internal road anomaly detection.
Researchers have conducted intelligent void detection studies using YOLO series
models [24–27], knowledge distillation techniques [28], and 3D convolutional neural networks [29], but have focused mainly on algorithm improvements while neglecting image quality enhancement. Although image quality enhancement research has been conducted in other fields, such as remote sensing image enhancement based on relation cross-attention modules [30], hybrid CNN for fruit classification [31], exposure difference network-based enhancement [32], DEANet for low-light image enhancement [33], underwater image enhancement [34], and wavelet-based MRI enhancement [35], these techniques are difficult to directly apply to GPR void image quality enhancement due to the unique characteristics of GPR images, necessitating targeted research.
In recent years, research on GPR image enhancement has gradually developed. Re-
searchers have proposed image enhancement methods based on non-linear technology,
using LIP model and CDF-HSD functions to improve target signal contrast while sup-
pressing background noise [36]; data enhancement methods based on AutoAugment have improved the performance of intelligent recognition models [37]; and deep learning methods based on diffusion models have achieved simultaneous optimization of image resolution enhancement and clutter removal, improving the quality of deep GPR images [38]. However, although existing enhancement methods have achieved good results in single tasks,
they lack in-depth exploration and effective modeling of the feature correspondence rela-
tionship between high-quality and low-quality images. This limitation makes it difficult to
accurately capture and maintain key feature information during the enhancement process,
affecting the reliability of enhancement results.
Therefore, this study aims to conduct research on the intelligent recognition of road
internal voids using 3D GPR. By proposing a quality enhancement model specifically
designed for GPR images of internal road voids, this study aims to reduce the impact of
noise interference and complex media materials during GPR image acquisition on image
quality, thereby improving the accuracy of intelligent recognition models for internal road
voids. The research assumes that the proposed image enhancement model can effectively
improve the feature representation of void areas in GPR images and enhance image quality.
Additionally, the enhanced images are expected to provide clearer void features for the
intelligent recognition model, thus increasing the detection accuracy. Initially, internal
road void data were collected using 3D GPR. For the collected images, an image quality
enhancement model was proposed specifically for void images, which integrated the
advantages of the Unet model, Multi-Head Self-Attention (MHSA) mechanism, and Multi-
Head Cross-Attention (MHCA) mechanism. Finally, intelligent recognition models of road
voids were constructed using images before and after enhancement to clearly determine
the impact of image enhancement on model performance. This research aims to provide an
accurate method for recognizing internal road voids, rapidly assessing internal void areas,
and providing technical support for comprehensive road maintenance.
2. Materials and Methods
2.1. Data Acquisition
This study employed 3D GPR technology to detect voids within road structures. The 3D GPR system comprises a radar control module, an antenna array, Doppler range-finding technology, and an RTK system. The center frequency of the radar antenna is 200 MHz. The system is equipped with 15 antenna sets, which improves the efficiency of data acquisition, and the effective scanning width is 180 cm. To cover an entire lane, each lane is surveyed with two GPR passes.
In this study, 3D GPR was utilized to survey major transportation routes in Shen-
zhen. The survey covered hundreds of roads across multiple administrative districts in
Shenzhen, including Bao’an, Longhua, Guangming, Luohu, Futian, and Yantian Districts.
The detection mileage exceeded 1000 km, and more than 1700 void images were obtained. LabelImg v1.8.1 software was used to label the void areas in the images, which were then split into training, validation, and test sets for model training.
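As an illustration of this preparation step, the split can be scripted as follows; the directory layout, file extensions, and the 70/20/10 ratio are assumptions for this sketch, not values reported in the paper.

```python
import random
import shutil
from pathlib import Path

random.seed(42)  # reproducible split

images = sorted(Path("gpr_images").glob("*.png"))
random.shuffle(images)

n = len(images)
splits = {
    "train": images[: int(0.7 * n)],
    "val": images[int(0.7 * n): int(0.9 * n)],
    "test": images[int(0.9 * n):],
}

for split, files in splits.items():
    out_dir = Path("dataset") / split
    out_dir.mkdir(parents=True, exist_ok=True)
    for img in files:
        shutil.copy(img, out_dir / img.name)
        label = img.with_suffix(".xml")  # LabelImg Pascal VOC annotation
        if label.exists():
            shutil.copy(label, out_dir / label.name)
```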
The on-site inspection image is shown in Figure 1.
Figure 1. The on-site inspection images: (a) on-site detection; (b) on-site marking.
2.2. Echo Characteristics of Internal Voids in Roads
The echo characteristics of voids in roads affect the accuracy of identifying void areas.
Therefore, this section will conduct research on the echo characteristics of internal voids in
roads. An example of an internal road void appears in Figure 2.
Figure 2. Image of an internal road void. The red boxed area indicates the void area.
When electromagnetic waves propagate through the interior of a road and encounter a
void area, the significant difference in dielectric properties between air and road materials
results in a pronounced enhancement of the reflected electromagnetic signal. This is
displayed in the image as a prominent bright area. Additionally, when electromagnetic
waves encounter these voids, their propagation speed changes, which affects the continuity
of the co-axial lines in the image. As the GPR is mobile during detection, the path of
reflection is at its minimum when positioned directly over a void. As the radar distances
itself from the void area, the path of reflection progressively extends, appearing in the radar
image as a hyperbolic shape, creating unique wing-shaped patterns flanking the hyperbolic
shape within the image.
The GPR image characteristics vary with different void formation causes and void
dimensions. Three void images with different causes and dimensions were selected for
comparison, as shown in Figure 3. For Type 1, the void has dimensions of length 1.4 m,
width 1.1 m, burial depth 0.27 m, and net height 0.31 m. For Type 2, the void dimensions
are length 1.3 m, width 1.2 m, burial depth 0.51 m, and net height 0.28 m. For Type 3, the
void dimensions are length 2.8 m, width 2.3 m, burial depth 0.65 m, and net height 0.51 m.
Type 1 and 2 voids were caused by pipeline damage, while Type 3 voids were caused by
nearby construction activities.
Figure 3. Three types of void images: (a) Type 1; (b) Type 2; (c) Type 3. The red boxed area indicates the void area.
Comparison of these three void images reveals that larger-sized voids produce wider
reflection ranges with more pronounced extensibility in the images. Greater net height
generates stronger reflection signals, while increased burial depth weakens signal strength,
resulting in relatively blurred reflection features. Additionally, voids with different causes
exhibit distinct characteristics. Pipeline damage-induced voids show localized strong reflec-
tions with clear hyperbolic features, well-defined boundaries, and relatively concentrated
spatial distribution. In contrast, construction-induced voids display broader reflection
ranges, more expanded hyperbolic features, more complex internal structures, and more
dispersed boundary reflections.
During the data annotation process, based on clearly defined void image charac-
teristics, we implemented a three-level expert annotation system. This included initial
annotation by technicians, secondary review by technical staff, and final verification by
inspection personnel. For cases with disputes, team discussions were mandatory to reach
consensus, ensuring consistency and standardization in the annotation process. Addition-
ally, we established a problem case database to record and analyze challenging annotation
cases, providing references and accumulated experience for improving annotation quality.
2.3. Image Quality Evaluation Metrics
In this study, four metrics—PSNR, SSIM, LPIPS, and NIQE—were employed to eval-
uate the quality of enhanced images. These metrics are commonly used in various fields
such as medical imaging, security surveillance, and remote sensing for assessing image
compression and denoising effects, as well as evaluating image processing quality [39,40].
(1) PSNR
PSNR is a measure used to assess the quality of reconstructed images. A higher PSNR
indicates a better quality of image reconstruction, suggesting closer approximation to the
original image.
PSNR is calculated as shown in Equation (1).
$$PSNR = 10 \times \log_{10}\!\left(\frac{MAX_I^2}{MSE}\right) \quad (1)$$
where $MAX_I$ denotes the highest pixel value and $MSE$ is the mean square error.
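As a minimal sketch of Equation (1) in Python, assuming 8-bit grayscale images so that $MAX_I = 255$:

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, max_i: float = 255.0) -> float:
    """Peak signal-to-noise ratio per Equation (1)."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_i ** 2 / mse)
```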
(2) SSIM
SSIM is utilized to assess the visual resemblance between two images and is suitable
for evaluating the effects of image quality enhancement. SSIM takes into account the
structural information of images, reflecting the perceptual differences in image quality as
seen by the human eye. The calculation method for SSIM is shown in Equation (2).
$$SSIM(x,y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \quad (2)$$
where $x$ and $y$ are the window regions of the two images; $\mu_x$ and $\mu_y$ are the mean values; $\sigma_x^2$ and $\sigma_y^2$ are the variances; $\sigma_{xy}$ is the covariance; and $c_1$ and $c_2$ are constants used to maintain stability.
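A direct NumPy transcription of Equation (2) for a single pair of windows might look as follows; the stability constants $c_1 = (0.01L)^2$ and $c_2 = (0.03L)^2$ are the common convention and an assumption here:

```python
import numpy as np

def ssim(x: np.ndarray, y: np.ndarray, L: float = 255.0) -> float:
    """SSIM of two same-sized windows per Equation (2)."""
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2  # common stability constants
    x, y = x.astype(np.float64), y.astype(np.float64)
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
```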
(3) LPIPS
LPIPS is a metric for assessing image similarity that considers the characteristics
of human visual perception, using deep learning models to evaluate visual differences
between two images. The lower the LPIPS value, the smaller the perceptual differences
between the two images, indicating that the corresponding algorithm maintains higher
visual quality and similarity.
Unlike metrics based on simple pixel comparison, LPIPS utilizes the advanced features
of images in calculating differences between them, providing better performance and
robustness when dealing with complex features. The calculation method is detailed in
Equation (3).
$$LPIPS(x,y) = \sum_{n=1}^{N} \frac{1}{H_n W_n} \sum_{i=1}^{H_n} \sum_{j=1}^{W_n} \left\lVert \omega_n \odot \left(F_n(x) - F_n(y)\right) \right\rVert_2 \quad (3)$$
where $LPIPS(x,y)$ represents the perceptual similarity measurement between images $x$ and $y$; $N$ represents the count of layers; $F$ is the pretrained network; $H_n$ and $W_n$ are the height and width of the feature map; $F_n(x)$ and $F_n(y)$ are the feature representations; $\omega_n$ is the weight parameter at layer $n$; $\odot$ denotes element-wise multiplication; and $\lVert \cdot \rVert_2$ is the L2 norm.
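Because LPIPS depends on a pretrained network $F$, it is typically computed with an off-the-shelf implementation rather than reimplemented; a usage sketch with the open-source lpips package (the AlexNet backbone choice and the [-1, 1] input scaling follow that package's conventions and are assumptions here):

```python
import lpips
import torch

loss_fn = lpips.LPIPS(net="alex")  # pretrained AlexNet backbone

# Images as (N, 3, H, W) tensors scaled to [-1, 1], as the package expects.
img0 = torch.rand(1, 3, 256, 256) * 2 - 1
img1 = torch.rand(1, 3, 256, 256) * 2 - 1

distance = loss_fn(img0, img1)  # lower = perceptually more similar
print(distance.item())
```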
(4) NIQE
NIQE is a no-reference image quality evaluation technique that operates without the
need for an original or an ideal reference image. Based on a natural scene statistics model,
NIQE learns statistical characteristics from high-quality natural images and constructs a
probabilistic model describing these statistical features. NIQE assesses image quality by
comparing its features with those of the learned model, determining deviations from this
norm. A lower NIQE value indicates that the image quality is closer to that of high-quality
natural images. The computational method for NIQE is shown in Equation (4).
$$NIQE = \sqrt{(x - \mu)^T \Sigma^{-1} (x - \mu)} \quad (4)$$
In the equation, $x$ represents the feature vector; $\mu$ and $\Sigma$ are the mean and covariance of the pretrained model. $(x - \mu)^T \Sigma^{-1} (x - \mu)$ represents the Mahalanobis distance, which quantifies the degree of deviation between the image features and the model features.
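The core of Equation (4) is a Mahalanobis distance between the image's feature vector and the pretrained natural-image model; a minimal sketch of that final step (the natural scene statistics feature extraction itself is omitted):

```python
import numpy as np

def niqe_distance(x: np.ndarray, mu: np.ndarray, sigma: np.ndarray) -> float:
    """Mahalanobis distance of feature vector x from the (mu, sigma) model, per Equation (4)."""
    diff = x - mu
    # Pseudo-inverse guards against a singular covariance matrix.
    return float(np.sqrt(diff @ np.linalg.pinv(sigma) @ diff))
```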
3. Results
3.1. Research on Preprocessing Techniques for Internal Void Images in Roads
3.1.1. Image Size Optimization
Due to the diverse causes and random distribution of void areas within road interiors,
influenced by the coverage of 3D GPR and the propagation characteristics of electromagnetic
waves, some void regions are located near the edges of the radar images. In such images, it is
particularly challenging to fully capture neighborhood pixel information at the edges.
In response to the aforementioned challenges, this section proposes expanding the image dimensions to mitigate the loss of recognition accuracy stemming from void areas located near the image edges.
Consequently, padding is added symmetrically along the width of the image, based
on its length, thereby increasing the overall image dimensions. To maintain proportional
balance and prevent distortion of the targets due to uneven expansion, the expanded image
is formatted into a square shape. This standardization facilitates more uniform processing
within the algorithmic models and minimizes variability caused by image size differences.
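A minimal OpenCV sketch of this pad-to-square step is given below; the replicated border mode is an assumption, as the paper does not state the fill rule:

```python
import cv2

def pad_to_square(image):
    """Symmetrically pad the shorter side so the image becomes square."""
    h, w = image.shape[:2]
    diff = abs(h - w)
    top = bottom = left = right = 0
    if h > w:   # pad width symmetrically, based on the image length
        left, right = diff // 2, diff - diff // 2
    else:       # pad height symmetrically
        top, bottom = diff // 2, diff - diff // 2
    return cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_REPLICATE)
```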
The effectiveness of this padding technique is demonstrated in Figure 4.
Figure 4. Comparison before and after image filling: (a) original image; (b) filled image.
3.1.2. Image Quality Degradation and Enhancement
In practical scenarios, images of voids inside roads captured using GPR are affected
by factors such as signal attenuation, noise interference, equipment resolution, and the
complexity of the underground medium. These factors lead to instability in the quality
and clarity of the images. Therefore, research into image quality enhancement techniques
is necessary.
Typically, supervised learning processes require paired datasets, which comprise
input images of low quality and target images of high quality. However, in the field,
high-quality images corresponding to the low-quality images of voids inside roads are
lacking, which hampers the learning process and renders supervised learning methods
inapplicable. Unsupervised learning, on the other hand, mainly relies on analyzing the
intrinsic structure and patterns within the data itself, independent of paired label data.
However, it suffers from a lack of intuitiveness and a clear optimization goal, leading to
unstable improvements in image quality and an inability to guarantee achieving the desired
quality level.
Based on these challenges, this study proposes a method for constructing void image
data aimed at supervised learning. This method primarily generates low-quality and
high-quality images by degrading and enhancing the image quality, respectively, based on
images collected in the field.
When identifying void areas within road interior radar images, the information at
the image edges is particularly important as it can be used to determine the precise loca-
tion, shape, size, and distribution of the void areas. In the image processing phase, the
high-frequency information in an image can reflect details such as image feature charac-
teristics, target area edges, and spatial changes. Therefore, during the process of image
quality degradation, emphasis will be placed on processing the high-frequency information
characteristics of the image. Fourier transform low-pass filtering is a technique used in the
frequency domain to process images. It can be used to diminish and remove high-frequency
elements in an image while retaining the low-frequency elements, thus achieving the goal
of image quality degradation.
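A minimal sketch of such Fourier-domain low-pass degradation for a grayscale image (the circular cutoff radius is an illustrative assumption):

```python
import numpy as np

def fourier_lowpass(image: np.ndarray, cutoff: int = 30) -> np.ndarray:
    """Suppress high-frequency content to simulate a degraded GPR image."""
    f = np.fft.fftshift(np.fft.fft2(image.astype(np.float64)))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h / 2) ** 2 + (xx - w / 2) ** 2 <= cutoff ** 2  # keep low frequencies
    filtered = np.fft.ifft2(np.fft.ifftshift(f * mask))
    return np.clip(np.abs(filtered), 0, 255).astype(np.uint8)
```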
When collecting images of road interiors using 3D GPR, various types of noise are
encountered. Electromagnetic interference noise mainly originates from electromagnetic
waves in the surrounding environment, which can disrupt the reception of radar signals.
Random noise is typically generated by the electronic components of the radar equipment
itself, influenced by the quality of equipment manufacturing and external environmental
factors. System noise includes noises produced by reflections and scatterings within the
radar system, originating from radar wave reflections at the interfaces between different
mediums. Echo blurring occurs due to the multi-directional scattering of radar waves
during transmission, leading to the overlapping and blurring of echo signals, making it
difficult to discern image details. The presence of these noises reduces the quality of the
images. Gaussian filtering is effective at decreasing image noise through image smoothing
while preserving edge information without damaging the edges.
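The high-quality counterparts are then produced by Gaussian smoothing; a one-call OpenCV sketch (the kernel size is an assumption; sigma=0 lets OpenCV derive it from the kernel size):

```python
import cv2

def gaussian_denoise(image, ksize=(5, 5), sigma=0):
    """Suppress radar noise by Gaussian smoothing."""
    return cv2.GaussianBlur(image, ksize, sigma)
```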
The effects of image quality degradation and image quality enhancement are illustrated
in Figure 5.
Figure 5. Image quality degradation and enhancement: (a) original; (b) degraded; (c) enhanced. The red box area is the key comparison area.
From Figure 5a, it can be observed that the void area displays alternating black and
white highlighted regions. This is due to the dielectric constant at the void being lower
than the surrounding materials, causing strong reflections of electromagnetic waves in this
area. The image colors in non-void areas are relatively uniform, with some areas appearing
wavy, which is caused by the inhomogeneity of the road medium and the differences
in the propagation of electromagnetic waves through it. As shown in Figure 5b, after
applying Fourier low-pass filtering to the image, the clear edges and details in the original
image become relatively blurred, and the image contrast is reduced, resulting in a softer
and smoother image. From Figure 5c, it is evident that after processing the image with
Gaussian filtering, the edge information in the void area is well preserved, and the pixel
value changes in non-void areas are relatively gentle. Gaussian filtering effectively smooths
such areas and reduces noise.
3.2. Research on a Road Internal Void Image Enhancement Method Based on Improved Unet Model
In this section, while conducting research on the network model for road internal
void imagery, the Unet neural network architecture was referenced [41]. Based on MHSA modules and MHCA modules, an image quality enhancement model aimed at internal road void imagery was designed [42,43].
3.2.1. Unet Neural Network Model
The Unet model, characterized by its symmetric encoder and decoder architecture,
along with effective skip connections, is capable of deeply extracting and utilizing multi-
level features, significantly enhancing its ability to process complex images. This makes the
Unet model particularly well-suited for processing GPR images of road internals. Through
end-to-end learning, the Unet can adaptively recover clear and precise structures from
damaged images, effectively enhancing image quality and detail representation.
The Unet network employs an encoder–decoder structure, which includes a contract-
ing path and a symmetric expansive path, connected through a bottleneck layer. The
network’s architectural diagram is depicted in Figure 6.
Figure 6. Unet structure diagram.
The primary function of the encoder is to capture the image’s feature information and
decrease its spatial dimensions. Simultaneously, it boosts the number of feature channels to
augment the network’s ability to represent features. Within the encoder, the input image
undergoes four stages of downsampling, each containing two 3
×
3 convolution layers,
which are then succeeded by a 2
×
2 max pooling layer, effectively extracting multi-scale
features and capturing a broader range and more complex features.
The decoder primarily focuses on gradually upsampling the encoded feature maps
back to the original image size and conducts feature learning through convolution layers,
thereby enhancing feature extraction capabilities. The decoder architecture includes four
upsampling stages. Each upsampling module contains two convolution layers and one
transpose convolution layer, effectively increasing the spatial dimensions of the feature
maps and restoring their spatial resolution. After the upsampling module, a convolution
layer is used to produce two-channel feature maps, preserving the spatial dimensions of the feature layers while altering the channel count and mapping high-dimensional features to the required output dimensions for classifying and locating features and background in the target areas. Skip connections are a crucial component of the Unet structure. These
connections directly tie feature maps from the encoder to matching feature maps in the
decoder, helping to retain more contextual and positional information in the output and
enhancing the restoration of image details through feature fusion.
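A compact PyTorch sketch of the repeating building blocks described above (two 3x3 convolutions per stage, 2x2 max pooling, transpose-convolution upsampling, and skip-connection fusion); channel widths and activation choice are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 convolutions, as used in each encoder/decoder stage."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class Down(nn.Module):
    """2x2 max pooling followed by double convolution (one encoder stage)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.stage = nn.Sequential(nn.MaxPool2d(2), DoubleConv(in_ch, out_ch))
    def forward(self, x):
        return self.stage(x)

class Up(nn.Module):
    """Transpose-convolution upsampling plus skip-connection fusion."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, in_ch // 2, 2, stride=2)
        self.conv = DoubleConv(in_ch, out_ch)
    def forward(self, x, skip):
        x = self.up(x)
        return self.conv(torch.cat([skip, x], dim=1))  # fuse encoder features
```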
3.2.2. MHSA Module and MHCA Module
(1) MHSA Module
The MHSA mechanism, prevalent in deep learning models, allows the model to
concurrently absorb information from various representational subspaces, thus boosting
its ability to handle complex data. The mechanism primarily operates by dividing the
“attention” operation into multiple heads, each independently learning different aspects of
the input data, which are then combined to achieve a more comprehensive understanding.
The Query (Q), Key (K), and Value (V) vectors are the core components of the attention
mechanism. Here, the query vector is used to locate areas of interest, the key vector identifies
all potential points of focus, and the value vector stores the corresponding feature content.
In the MHSA mechanism, each head performs calculations independently, allowing for
the parallel processing of various features and parts of the image. Ultimately, the outputs
from all heads are merged together, forming a comprehensive feature representation that
aids in improving the accuracy of image feature extraction.
(2) Self-Attention Mechanism (SAM)
When utilizing the SAM, the procedure starts by converting the model input into
Query, Key, and Value matrices. Subsequently, attention scores are calculated by conducting
a dot product between Query and each Key. The specific computational process is illustrated
in Figure 7.
Figure 7. Self-attention mechanism.
where $X$ symbolizes the feature representation; the width, height, and channel count of the feature map are denoted by $w$, $h$, and $d$, respectively; and $W_Q$, $W_K$, and $W_V$ are the learnable linear transformation matrices corresponding to $Q$, $K$, and $V$.
(3) MHSA mechanism
The layout of the MHSA mechanism is illustrated in Figure 8.
Figure 8. MHSA mechanism.
For each head h, different weight matrices are used to transform the input, and the
attention scores for each head are calculated. The outputs of all heads are concatenated
together and passed through a final linear transformation.
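This per-head transform, concatenation, and final linear projection is the pattern packaged by PyTorch's built-in multi-head attention module; a usage sketch (the dimensions are illustrative assumptions):

```python
import torch
import torch.nn as nn

d_model, num_heads = 256, 8
mhsa = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

tokens = torch.rand(1, 64 * 64, d_model)  # flattened h*w feature map
out, _ = mhsa(tokens, tokens, tokens)     # Q = K = V for self-attention
```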
(4) MHCA Module
The MHCA is an extension of the SAM that enables the model to effectively utilize the
information from one input while processing another, enhancing the model’s recognition
and understanding capabilities. As indicated in the red box in Figure 9, the feature map S,
after being processed by the module CBSU, is used as the value V, and the features of the
feature map Y are used as the query Q and key K. This utilizes the information from S to
improve the model’s capability for feature extraction from Y.
Figure 9. MHCA.
The process for calculating MHCA is conducted as illustrated in Figure 9.
In the schematic, feature map A undergoes processing through the CBSU module to
further extract and refine the information contained within feature map A and to expand the
feature dimensions. Feature map S, after being processed by the CBR module, undergoes a
dot product operation with the result from feature map A after CBSU module processing.
This operation further emphasizes the important features within feature map S in the
output’s weighting. Meanwhile, feature map Y, after processing through the UC module,
is concatenated with the output from the aforementioned dot product operation. This
enhances the richness of the model output, providing a deep insight into the image content
and enhancing the accuracy and robustness of target area recognition.
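A hedged sketch of the cross-attention wiring described above, with feature map Y supplying the query and key and feature map S supplying the value; the surrounding CBSU/CBR/UC processing is simplified away and the shapes are assumptions:

```python
import torch
import torch.nn as nn

d_model, num_heads = 256, 8
mhca = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

y = torch.rand(1, 32 * 32, d_model)  # feature map Y -> query and key
s = torch.rand(1, 32 * 32, d_model)  # feature map S -> value
out, _ = mhca(query=y, key=y, value=s)
```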
3.2.3. Image Enhancement Model Design Based on an Improved Unet Model—MHUnet
3D GPR captures road interior void images that often suffer from inconsistent quality.
Moreover, void areas feature diverse scales and complex, variable backgrounds. Con-
sequently, an image enhancement network model for void images needs robust feature
extraction capabilities. This model must accurately identify and analyze both global and
detailed features of void areas across different scales of receptive fields, while also balanc-
ing the extraction of local features with the integration of global information, adapting to
complex and variable image background conditions.
Although the Unet network model attempts to expand the receptive field through
its multi-layer structure, each layer’s receptive field is limited, making it challenging to
cover global information throughout the image. The structure’s skip connections help
maintain local feature information, but they struggle with processing complex void images
over large areas. Furthermore, the Unet model inadequately addresses image quality
variations caused by noise and signal attenuation in diverse underground media, leading
to inconsistent image reconstruction quality.
Therefore, this section introduces MHSA and MHCA mechanisms into the Unet net-
work model structure. MHSA, by interacting with other units, can acquire rich global
information. Simultaneously, by learning long-range dependencies across the entire in-
put image, it effectively expands the model’s receptive field, enhancing the handling of
large-scale, multiple voids, and complex images. MHCA effectively merges information
between different network layers and modules, focusing on different feature combinations
to improve the feature extraction performance of void areas and improve image quality
enhancement effects.
The structure of the designed void image enhancement model is illustrated in Figure 10.
The initial part of the model is the StemConv module, which comprises four convo-
lutional layers (Conv), batch normalization layers (BN), and GeLU activation functions.
This module primarily serves to extract preliminary features from the image and enhance
non-linear processing capabilities. The DownConv module reduces the spatial dimensions
of the feature map through maximum pooling and deepens feature extraction with two
convolutional passes to capture abstract features. The MHSA module enhances the capture
of global dependency features, enhancing the model’s capability to identify key features
across the entire image. The MHCA module integrates features from different levels to
enhance feature expressiveness. The UpConv module is used to gradually reconstruct
the spatial resolution of the features, providing a foundation for high-quality image re-
construction. Concat helps recover more detailed information in void areas. TConv uses
two different sizes of convolutional kernels to further process and optimize the feature
map, maintaining contextual information while adjusting the number of channels, proving
helpful in modifying the dimensions and depth of the feature map.
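As a rough sketch of what such a StemConv stage could look like in PyTorch, following the Conv-BN-GeLU description above (the kernel size and channel width are assumptions):

```python
import torch.nn as nn

class StemConv(nn.Sequential):
    """Four Conv-BN-GeLU stages for preliminary feature extraction."""
    def __init__(self, in_ch=1, width=32):
        layers = []
        ch = in_ch
        for _ in range(4):
            layers += [
                nn.Conv2d(ch, width, kernel_size=3, padding=1),
                nn.BatchNorm2d(width),
                nn.GELU(),
            ]
            ch = width
        super().__init__(*layers)
```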
Figure 10. Image enhancement model structure.
3.2.4. Analysis of Image Enhancement Effects Based on the MHUnet Model
The MHUnet model was utilized to enhance the quality of low-quality void maps
within road interiors, with corresponding results shown in Figure 11.
Figure 11. Before and after comparison of image quality enhancement based on the MHUnet model: (a) Image 1—low quality; (b) Image 1—enhanced quality; (c) Image 2—low quality; (d) Image 2—enhanced quality.
As evident from the figure, compared to the low-quality image, the enhanced image
exhibits a deepening of black areas and a brightness increase in white and gray areas,
significantly improving the image contrast. The enhanced contrast makes the distinction
between void areas and the surrounding background more pronounced. The increase in
brightness aids in better recognition of details within the image. In the enhanced image,
the edges of the internal structure areas of the road are sharper and clearer, which helps
in further distinguishing between void and non-void areas. In summary, the enhanced
image has a higher degree of detail recognition, which helps to improve the accuracy and
reliability of the interpretation of the void map.
The MHUnet model was used to enhance the quality of collected GPR images, with
the comparison between pre- and post-enhancement shown in Figure 12.
Figure 12. Comparison of original image quality before and after enhancement: (a) Image 1, original image; (b) Image 1, enhanced image; (c) Image 2, original image; (d) Image 2, enhanced image.
As can be seen from the figure, compared to the original collected images, the en-
hanced images show improvements in multiple aspects. The void area boundaries are
sharper, detail features are more distinguishable, and overall image blur is reduced; the
contrast between void areas and background is more pronounced with richer grayscale
levels; background noise is effectively suppressed while main features are more prominent;
additionally, the hyperbolic features of void areas are more distinct, structural boundaries
are clearer, and target areas show better differentiation from surrounding environments.
These improvements contribute to higher accuracy in subsequent intelligent recognition.
To avoid having two different feature distributions (original and enhanced) in the
dataset and reduce the negative impacts of data distribution differences on model training,
the image enhancement method was applied to all 1700 images in the dataset. Additionally,
this approach significantly simplifies the processing workflow by eliminating the need for
image quality judgment and classification.
To demonstrate the advancement of our proposed method, comparison with existing
image enhancement models is necessary. Among existing image enhancement models,
NAFNet and Uformer are widely applied. NAFNet has extensive applications in image
processing, primarily in low-light image enhancement, underwater image enhancement,
and remote sensing image denoising. This method shows excellent performance in motion
blur removal, Gaussian noise removal, and image quality improvement, with advantages
in computational efficiency and small model size [44–46]. The Uformer method is mainly
applied in low-light image enhancement, particularly excelling in ultra-high-definition
image processing. This method not only improves image quality but also serves as a
preprocessing step for downstream visual tasks such as face detection [47–49].
The image enhancement effect was quantitatively analyzed using four indicators: PSNR, SSIM, LPIPS, and NIQE. For both the quality-degraded images and the MHUnet-enhanced images, these indicators were computed against the preprocessed images as reference images, as shown in Table 1.
Table 1. Image quality metrics for different image types.
Image Type PSNR (dB) SSIM LPIPS NIQE
Not enhanced 28.21 0.9137 0.0253 12.1723
Unet 30.12 0.9309 0.0211 10.8357
NAFNet 32.38 0.9488 0.0197 10.2675
Uformer 34.06 0.9364 0.0145 9.3738
Enhanced by MHUnet 34.65 0.9695 0.0165 9.6543
The table reveals that the quality-degraded images have lower PSNR and SSIM values and higher LPIPS and NIQE scores, indicating that although the low-quality images retain a certain structural similarity to the reference images, they still differ significantly in visual quality. After applying the MHUnet method, the PSNR and
SSIM values increased from 28.21 and 0.9137 to 34.65 and 0.9695, respectively. This increase
indicates effective suppression of image noise and overall improvement in image quality,
bringing the enhanced image nearer to the high-quality reference in aspects like brightness,
contrast, and structural details. Additionally, the preservation of local features and textural
elements in the image is well-maintained.
The LPIPS and NIQE scores decreased from 0.0253 and 12.1723 to 0.0165 and 9.6543,
respectively. This reduction shows that the image enhancement method decreases the
perceptual differences between the quality-degraded images and the reference images,
which helps to improve the accuracy of identifying voids in road interiors where high
resolution is crucial. Notably, LPIPS is a metric calculated using deep learning methods
that can recognize the deep features of void images and is particularly sensitive to complex
textures and shapes. Therefore, the enhanced image quality in this method aids in the
accurate identification of void areas.
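For reference, the three full-reference indicators can be computed as in the sketch below, using scikit-image for PSNR/SSIM and the lpips package for LPIPS; NIQE is a no-reference metric available in third-party toolboxes and is omitted here. The images are placeholders, not data from this study.

```python
import numpy as np
import torch
import lpips  # pip install lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def full_reference_metrics(ref: np.ndarray, img: np.ndarray) -> dict:
    """ref/img: grayscale images in [0, 1] with the same shape
    (the reference is the preprocessed high-quality image)."""
    psnr = peak_signal_noise_ratio(ref, img, data_range=1.0)
    ssim = structural_similarity(ref, img, data_range=1.0)
    # LPIPS expects 3-channel tensors scaled to [-1, 1]
    to_t = lambda a: torch.from_numpy(a).float().mul(2).sub(1).expand(1, 3, *a.shape)
    with torch.no_grad():
        lp = lpips.LPIPS(net='alex')(to_t(ref), to_t(img)).item()
    return {'PSNR': psnr, 'SSIM': ssim, 'LPIPS': lp}

ref = np.random.rand(256, 256)                                # placeholder reference image
enh = np.clip(ref + 0.01 * np.random.randn(256, 256), 0, 1)   # placeholder enhanced image
print(full_reference_metrics(ref, enh))
```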
On the other hand, compared to the enhancement results of the three baseline models (Unet, NAFNet, and Uformer), the MHUnet model achieves the best PSNR and SSIM and remains competitive on LPIPS and NIQE, where Uformer scores marginally better. Although the NAFNet model performs excellently in conventional image
processing, it is not entirely suitable for enhancing GPR images of road internal voids. This
is mainly because GPR images contain specialized geological structural information, which
differs significantly from the features of natural images. Similarly, the Uformer model is
not fully applicable to enhancing GPR images of road internal voids. This is primarily
because GPR images are formed based on electromagnetic wave reflection signals, which
are fundamentally different from natural lighting scenes.
This further validates the effectiveness and novelty of the proposed method. The improvement is attributed to the MHUnet model, which builds on the Unet architecture and can reconstruct images at multiple scales, enhancing image brightness, clarity, and contrast. The inclusion of MHSA and MHCA in the model adjusts and optimizes
perceptual image features, improving visual quality and consistency in image perception.
Through comprehensive feature learning and multi-level information integration, the model
can produce results that are statistically closer to high-quality reference images, thereby
enhancing the accuracy and reliability of intelligent void area recognition.
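As an illustration of how such cross-attention fusion can be wired into a Unet-style decoder, the sketch below uses PyTorch's built-in multi-head attention. Treating decoder features as queries and encoder skip features as keys/values is our illustrative reading; the dimensions and head count are assumptions rather than the authors' settings.

```python
import torch
import torch.nn as nn

class MHCAFusion(nn.Module):
    """Multi-Head Cross Attention sketch: decoder features attend to encoder
    skip features, integrating information from different levels."""
    def __init__(self, dim=128, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, dec, enc):
        # dec, enc: (batch, channels, H, W) feature maps with equal channels
        b, c, h, w = dec.shape
        q = dec.flatten(2).transpose(1, 2)   # (B, H*W, C) queries from decoder
        kv = enc.flatten(2).transpose(1, 2)  # (B, H*W, C) keys/values from encoder
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(fused + q)         # residual connection
        return fused.transpose(1, 2).reshape(b, c, h, w)

dec = torch.randn(1, 128, 32, 32)
enc = torch.randn(1, 128, 32, 32)
out = MHCAFusion()(dec, enc)  # -> (1, 128, 32, 32)
```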
3.3. Comparative Analysis of Void Intelligent Recognition Performance
Using the YOLOv8 model, the performance between original images and enhanced
images was compared. When constructing the dataset, the images were divided into a
training set, validation set, and test set in a ratio of 6:2:2. The model performance was
analyzed using precision, recall, F1, and mAP. Furthermore, to demonstrate the effectiveness
of enhanced images in improving intelligent void recognition performance, statistical analysis
was conducted using a t-test. Specifically, the test set was divided into 5 groups, and the mean
and standard deviation of corresponding metrics for each group were calculated, along with
the t-value in the t-test. The model performance metrics and statistical indicators are shown
in Table 2, where AV represents average value and SD represents standard deviation.
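As a pointer to reproduction, training and evaluating YOLOv8 on such a dataset is typically done through the ultralytics package; the model scale, dataset YAML, and hyperparameters below are placeholders rather than the settings used in this study.

```python
from ultralytics import YOLO  # pip install ultralytics

# Load a pretrained YOLOv8 detection model (the scale chosen here is an assumption)
model = YOLO("yolov8s.pt")

# 'voids.yaml' is a placeholder dataset config pointing at the 6:2:2
# train/val/test split of GPR images with void bounding-box labels
model.train(data="voids.yaml", epochs=100, imgsz=640)

metrics = model.val()  # reports precision, recall, and mAP on the validation split
```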
As shown in Table 2, the YOLOv8 model trained with MHUnet-enhanced images
demonstrates overall superior performance compared to the model trained with original
images, indicating that enhanced images contribute to improving the performance of road
internal void recognition models. In the t-test analysis, the calculated t-values for precision,
recall, F1, and mAP were 10.43, 16.86, 17.24, and 13.79, respectively. Referring to the t-distribution table, the critical value is 2.306 for 8 degrees of freedom at a significance level of 0.05. Since the t-values
for all four metrics exceed 2.306, this indicates that the void recognition model trained with
enhanced images significantly outperforms the model trained with original images.
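The reported comparison corresponds to a standard two-sample t-test over the five group means per condition (giving 5 + 5 - 2 = 8 degrees of freedom). A minimal SciPy sketch, using illustrative group values rather than the actual measurements:

```python
from scipy import stats

# Per-group precision means (%) for the 5 test-set groups;
# the values are illustrative placeholders, not the study's data
original = [86.1, 86.4, 86.6, 86.8, 86.9]
enhanced = [87.5, 87.8, 87.9, 88.1, 88.4]

t, p = stats.ttest_ind(enhanced, original)  # df = 5 + 5 - 2 = 8
print(f"t = {t:.2f}, p = {p:.4f}")  # |t| > 2.306 rejects H0 at alpha = 0.05
```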
To further demonstrate the universality of the MHUnet model-enhanced images in improving recognition model performance, we also compared the training effects of different images on the YOLOv7, YOLOv9, and Faster R-CNN models. The performance indicators for these void intelligent recognition models, together with YOLOv8, are shown in Table 3.
Table 2. Comparison of YOLOv8 model performance metrics and statistical indicators.
Metric Original Image AV (%) Original Image SD (%) MHUnet Enhancement AV (%) MHUnet Enhancement SD (%) t-Value
Precision 86.55 0.553 87.94 0.601 10.43
Recall 80.31 0.512 81.99 0.486 16.86
F1 83.31 0.526 84.86 0.415 17.24
mAP 86.74 0.412 87.62 0.387 13.79
Table 3. Comparison of detection performance between original and enhanced images.
Model Type Image Type Precision (%) Recall (%) F1 (%) mAP (%)
YOLOv7 Original image 85.61 78.43 81.68 85.85
MHUnet Enhancement 86.99 79.75 82.63 86.73
YOLOv8 Original image 86.55 80.31 83.31 86.74
MHUnet Enhancement 87.94 81.99 84.86 87.62
YOLOv9 Original image 86.28 80.54 83.38 85.99
MHUnet Enhancement 87.38 81.84 84.95 87.35
Faster R-CNN Original image 88.61 81.57 84.69 87.85
MHUnet Enhancement 90.38 83.31 86.15 88.42
According to the table, all four models trained on the MHUnet-enhanced data achieve higher precision, recall, F1, and mAP than their counterparts trained on the original data. The enhancement of image quality effectively improves the learning and generalization abilities of intelligent recognition models for road internal voids, yielding higher accuracy and reliability.
3.4. Engineering Validation
The image enhancement technology corresponding to the MHUnet model and the
intelligent recognition model were applied and validated in the detection of road interior
defects on Menghai Avenue and Qianhai Avenue in the Nanshan District of Shenzhen.
Specific site images and validation results are shown in Figure 13 and Table 4.
Figure 13. On-site verification of internal voids in the road: (a) searching for void positions; (b) drilling holes at void positions.
Table 4. Verification of model detection accuracy.
Type Number of Voids Accuracy (%)
Model detection 10 90
Accurate verification 9
The intelligent recognition model identified a total of 10 void areas on Menghai
Avenue and Qianhai Avenue, with 9 of these void areas being accurately validated. This
demonstrates that the image enhancement proposed in this study can be effectively applied
in the detection and recognition of road interior void areas.
4. Discussion
This study utilized 3D GPR to detect internal road voids across multiple administrative
regions in Shenzhen, establishing a dataset of 1700 void images. In existing related research,
scholars have employed two approaches to build void image datasets: numerical simulation
and on-site GPR collection. While numerical simulation is cost-effective, simulated images
show characteristic differences from real images [50]. On-site collection, though costly, results in smaller datasets typically not exceeding 1000 images [51–55]. Therefore, compared
to existing research, our dataset provides a larger-scale data foundation for internal road
void image enhancement and intelligent recognition.
A supervised learning-oriented void image processing method was proposed based on Fourier transform low-frequency filtering and Gaussian filtering. This method increases pixel differences between degraded and enhanced images by decreasing and increasing grayscale values in void regions, thereby expanding the model's learning space, as sketched below.
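A minimal sketch of this pair-construction step, assuming a centered circular low-pass mask in the Fourier domain followed by Gaussian smoothing (the cutoff radius and sigma are illustrative, not the values used in the study):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(image: np.ndarray, cutoff: int = 30, sigma: float = 1.5) -> np.ndarray:
    """Build a quality-degraded training input from a GPR image by keeping
    only low spatial frequencies, then smoothing with a Gaussian filter."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    mask = (y - rows // 2) ** 2 + (x - cols // 2) ** 2 <= cutoff ** 2
    low_pass = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))
    return gaussian_filter(low_pass, sigma=sigma)

img = np.random.rand(256, 256)   # placeholder GPR image in [0, 1]
pair = (degrade(img), img)       # (degraded input, enhancement target)
```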
In road-related research, image enhancement technology primarily focuses on image quality improvement under complex conditions, dynamic scene correction, and target detection enhancement [56–58]. Image enhancement includes traditional and intelligent methods.
Traditional methods comprise histogram processing, spatial domain processing, and mor-
phological processing. Compared to traditional methods, intelligent enhancement methods
offer advantages such as scene-adaptive parameter adjustment, stronger robustness and
generalization ability, end-to-end processing without manual feature design and rules,
and efficient processing after training through learning optimal enhancement strategies
from large datasets. Among intelligent enhancement algorithms, the Unet model and
its improved structures are widely applied, demonstrating advantages in medical imag-
ing, remote sensing imaging, and natural scene imaging, including good image structure
preservation, strong detail recovery capability, and fast processing speed [59–61]. However,
research on internal road void image enhancement technology is still in its early stages.
Our study combines the advantages of MHSA and MHCA in global modeling and cross-
domain feature fusion [62,63], and based on the Unet model structure, introduces MHSA
and MHCA to construct an improved Unet model (MHUnet) for enhancing internal road
void images. Compared to original images, this model significantly improves enhanced
image quality in both visual effects and quantitative indicators. The model improves the
brightness, clarity, and contrast of internal road void images, enhancing visual quality and
image perception consistency. The enhanced internal road void images can provide more
effective features for intelligent recognition models, effectively enhancing their learning
capability and improving model accuracy and sensitivity.
5. Conclusions
This study proposes an intelligent recognition method for road internal voids based
on improved image enhancement technology. Three-dimensional GPR detection was
conducted in Shenzhen to establish a void image dataset. The proposed MHUnet model
incorporates MHSA and MHCA mechanisms, achieving significant improvements in
image brightness, clarity, and contrast, enhancing perceptual consistency, and matching
statistical distributions with high-quality reference images. For the enhanced images,
the quantitative indicators improved significantly: PSNR increased by 6.44 dB and SSIM by 0.0558, while LPIPS decreased by 0.0088 and NIQE by 2.518. Experimental validation demonstrates
that the enhanced images provide higher-quality feature information, improving the
model’s accuracy and sensitivity while exhibiting superior recognition capabilities and
generalization performance. Compared to the original images, the improved model
showed increased performance metrics: precision by 1.39%, recall by 1.68%, F1 score by
1.55%, and mAP by 0.88%. The research findings provide a new technical solution for
the intelligent detection of road internal voids, with significant practical implications for
improving the intelligence level of road maintenance.
The next steps will focus on further optimizing the intelligent recognition model for
internal road voids and conducting larger-scale engineering practice verification. Addition-
ally, we will conduct quantitative assessment research based on void images to evaluate
void size, depth, deterioration degree, void grade, and impact on road damage. By estab-
lishing a comprehensive void assessment system, we aim to better guide road maintenance
work prioritization, improve maintenance efficiency, and provide important technical sup-
port for road lifecycle management, thereby enhancing the engineering application value
of the research.
Author Contributions: Conceptualization, Q.K. and X.L.; Data curation, Q.K. and A.M.; Formal anal-
ysis, A.M. and L.Y.; Funding acquisition, Q.K. and A.M.; Investigation, Q.K. and X.L.; Methodology,
Q.K. and X.L.; Project administration, X.L. and A.M.; Resources, X.L. and A.M.; Software, A.M. and
L.Y.; Supervision, X.L.; Validation, A.M. and L.Y.; Visualization, A.M. and L.Y.; Writing—original
draft, Q.K. and X.L.; Writing—review and editing, A.M. and L.Y. All authors have read and agreed to
the published version of the manuscript.
Funding: This research was funded by the National Key Research and Development Program, grant
number 2022YFB2602100. This project was sponsored by the Ministry of Science and Technology
of China.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author. The data are not publicly available due to confidentiality agreements with
government administrative departments and partner institutions who jointly collected the ground-
penetrating radar data. The authors require prior authorization from these collaborating parties
before sharing the data.
Conflicts of Interest: Authors Qian Kan, Xing Liu, and Anxin Meng were employed by the company Shenzhen Urban Transport Planning Center Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Liu, Z.; Sun, T.; Huang, T.; Liu, G.; Tao, Y.; Wang, G.; Wang, L. Forward Modeling and Model Test of Ground-Penetrating Radar toward Typical Asphalt Pavement Distresses. Adv. Civ. Eng. 2023, 2023, 2227326. [CrossRef]
2. Primusz, P.; Abdelsamei, E.; Mahmoud, A.; Sipos, G.; Fi, I.; Herceg, A.; Tóth, C. Assessment of In Situ Compactness and Air Void Content of New Asphalt Layers Using Ground-Penetrating Radar Measurements. Appl. Sci. 2024, 14, 614. [CrossRef]
3. Xiong, X.; Meng, A.; Lu, J.; Tan, Y.; Chen, B.; Tang, J.; Zhang, C.; Xiao, S.; Hu, J. Automatic detection and location of pavement internal distresses from ground penetrating radar images based on deep learning. Constr. Build. Mater. 2024, 411, 134483. [CrossRef]
4. Leng, Z.; Al-Qadi, I.L.; Shangguan, P.; Son, S. Field application of ground-penetrating radar for measurement of asphalt mixture density: Case study of Illinois route 72 overlay. Transp. Res. Rec. 2012, 2304, 133–141. [CrossRef]
5. Leng, Z. Prediction of In-Situ Asphalt Mixture Density Using Ground Penetrating Radar: Theoretical Development and Field Verification; University of Illinois at Urbana-Champaign: Champaign, IL, USA, 2011.
6. Wang, W.; Xiang, W.; Li, C.; Qiu, S.; Wang, Y.; Wang, X.; Bu, S.; Bian, Q. A Case Study of Pavement Foundation Support and Drainage Evaluations of Damaged Urban Cement Concrete Roads. Appl. Sci. 2024, 14, 1791. [CrossRef]
7. Xu, Y.; Shi, X.; Yao, Y. Performance Assessment of Existing Asphalt Pavement in China's Highway Reconstruction and Expansion Project Based on Coupling Weighting Method and Cloud Model Theory. Appl. Sci. 2024, 14, 5789. [CrossRef]
8. Leng, Z.; Al-Qadi, I.L. An innovative method for measuring pavement dielectric constant using the extended CMP method with two air-coupled GPR systems. NDT E Int. 2014, 66, 90–98. [CrossRef]
9. Zhang, Z.; Huang, S.; Zhang, K. Accurate detection method for compaction uniformity of asphalt pavement. Constr. Build. Mater. 2017, 145, 88–97. [CrossRef]
10. Tang, J.; Huang, Z.; Li, W.; Yu, H. Low compaction level detection of newly constructed asphalt pavement based on regional index. Sensors 2022, 22, 7980. [CrossRef]
11. Mioduszewski, P.; Sorociak, W. Acoustic evaluation of road surfaces using different Close Proximity testing devices. Appl. Acoust. 2023, 204, 109255. [CrossRef]
12. Kim, S.Y.; Kang, S.; Park, G.; Lee, D.; Lim, Y.; Lee, J.S. Detection of roadbed layers in mountainous area using down-up-crosshole penetrometer and ground penetrating radar. Measurement 2024, 224, 113889. [CrossRef]
13. Kang, S.; Lee, J.S.; Park, G.; Kim, N.; Park, J. Unpaved road characterization during rainfall scenario: Electromagnetic wave and cone penetration assessment. NDT E Int. 2023, 139, 102930. [CrossRef]
14. Sabery, S.M.; Bystrov, A.; Gardner, P.; Stroescu, A.; Gashinova, M. Road surface classification based on radar imaging using convolutional neural network. IEEE Sens. J. 2021, 21, 18725–18732. [CrossRef]
15. Benedetto, A.; Tosti, F.; Ciampoli, L.B.; D'amico, F. An overview of ground-penetrating radar signal processing techniques for road inspections. Signal Process. 2017, 132, 201–209. [CrossRef]
16. Benedetto, A.; Benedetto, F.; De Blasiis, M.R.; Giunta, G. Reliability of signal processing technique for pavement damages detection and classification using ground penetrating radar. IEEE Sens. J. 2005, 5, 471–480. [CrossRef]
17. Leucci, G. Ground penetrating radar: The electromagnetic signal attenuation and maximum penetration depth. Sch. Res. Exch. 2008, 2008, 926091. [CrossRef]
18. Potin, D.; Duflos, E.; Vanheeghe, P. Landmines ground-penetrating radar signal enhancement by digital filtering. IEEE Trans. Geosci. Remote Sens. 2006, 44, 2393–2406. [CrossRef]
19. Li, R.; Zhang, H.; Chen, Z.; Yu, N.; Kong, W.; Li, T.; Wang, E.; Wu, X.; Liu, Y. Denoising method of ground-penetrating radar signal based on independent component analysis with multifractal spectrum. Measurement 2022, 192, 110886. [CrossRef]
20. Xue, W.; Zhu, J.; Rong, X.; Huang, Y.; Yang, Y.; Yu, Y. The analysis of ground penetrating radar signal based on generalized S transform with parameters optimization. J. Appl. Geophys. 2017, 140, 75–83. [CrossRef]
21. Li, F.; Yang, F.; Qiao, X.; Xing, W.; Zhou, C.; Xing, H. 3D ground penetrating radar cavity identification algorithm for urban roads using transfer learning. Meas. Sci. Technol. 2023, 34, 055106. [CrossRef]
22. Li, F.; Yang, F.; Xie, Y.; Qiao, X.; Du, C.; Li, C.; Ru, Q.; Zhang, F.; Gu, X.; Yong, Z. Research on 3D ground penetrating radar deep underground cavity identification algorithm in urban roads using multi-dimensional time-frequency features. NDT E Int. 2024, 143, 103060. [CrossRef]
23. Kang, M.S.; Kim, N.; Lee, J.J.; An, Y.K. Deep learning-based automated underground cavity detection using three-dimensional ground penetrating radar. Struct. Health Monit. 2020, 19, 173–185. [CrossRef]
24. Xiong, X.; Tan, Y.; Hu, J.; Hong, X.; Tang, J. Evaluation of Asphalt Pavement Internal Distresses Using Three-Dimensional Ground-Penetrating Radar. Int. J. Pavement Res. Technol. 2024, 1–12. [CrossRef]
25. Xue, W.; Li, T.; Peng, J.; Liu, L.; Zhang, J. Road underground defect detection in ground penetrating radar images based on an improved YOLOv5s model. J. Appl. Geophys. 2024, 229, 105491. [CrossRef]
26. Zhu, J.; Zhao, D.; Luo, X. Evaluating the optimised YOLO-based defect detection method for subsurface diagnosis with ground penetrating radar. Road Mater. Pavement Des. 2024, 25, 186–203. [CrossRef]
27. Zhang, B.; Cheng, H.; Zhong, Y.; Chi, J.; Shen, G.; Yang, Z.; Li, X.; Xu, S. Real-Time Detection of Voids in Asphalt Pavement Based on Swin-Transformer-Improved YOLOv5. IEEE Trans. Intell. Transp. Syst. 2023, 25, 2615–2626. [CrossRef]
28. Kan, Q.; Liu, X.; Meng, A.; Yu, L. Identification of internal voids in pavement based on improved knowledge distillation technology. Case Stud. Constr. Mater. 2024, 21, e03555. [CrossRef]
29. Khudoyarov, S.; Kim, N.; Lee, J.J. Three-dimensional convolutional neural network–based underground object classification using three-dimensional ground penetrating radar data. Struct. Health Monit. 2020, 19, 1884–1893. [CrossRef]
30. Lu, K.; Huang, X.; Xia, R.; Zhang, P.; Shen, J. Cross attention is all you need: Relational remote sensing change detection with transformer. GISci. Remote Sens. 2024, 61, 2380126. [CrossRef]
31. Sameera, P.; Deshpande, A.A. Disease detection and classification in pomegranate fruit using hybrid convolutional neural network with honey badger optimization algorithm. Int. J. Food Prop. 2024, 27, 815–837.
32. Jiang, S.; Mei, Y.; Wang, P.; Liu, Q. Exposure difference network for low-light image enhancement. Pattern Recognit. 2024, 156, 110796. [CrossRef]
33. Jiang, Y.; Li, L.; Zhu, J.; Xue, Y. DEANet: Decomposition enhancement and adjustment network for low-light image enhancement. Tsinghua Sci. Technol. 2023, 28, 743–753. [CrossRef]
34. Saleem, A.; Paheding, S.; Rawashdeh, N.; Awad, A.; Kaur, N. A non-reference evaluation of underwater image enhancement methods using a new underwater image dataset. IEEE Access 2023, 11, 10412–10428. [CrossRef]
35. Prakash, A.; Bhandari, A.K. Cuckoo search constrained gamma masking for MRI image contrast enhancement. Multimed. Tools Appl. 2023, 82, 40129–40148. [CrossRef]
36. Wu, G.X.; Liu, Y. Image Enhancement Algorithm for Ground Penetrating Radar Based on Nonlinear Technology. J. Phys. Conf. Ser. 2024, 2887, 012045.
37. Liu, Z.; Gu, X.Y.; Li, J.; Dong, Q.; Jiang, J. Deep learning-enhanced numerical simulation of ground penetrating radar and image detection of road cracks. Chin. J. Geophys. 2024, 67, 2455–2471.
38. Lan, T.; Luo, X.; Yang, X.; Gong, J.; Li, X.; Qu, X. A Constrained Diffusion Model for Deep GPR Image Enhancement. IEEE Geosci. Remote Sens. Lett. 2024, 21, 3003505. [CrossRef]
39. Sara, U.; Akter, M.; Uddin, M.S. Image quality assessment through FSIM, SSIM, MSE and PSNR—A comparative study. J. Comput. Commun. 2019, 7, 8–18. [CrossRef]
40. Qu, Q.; Liang, H.; Chen, X.; Chung, Y.Y.; Shen, Y. NeRF-NQA: No-Reference Quality Assessment for Scenes Generated by NeRF and Neural View Synthesis Methods. IEEE Trans. Vis. Comput. Graph. 2024, 30, 2129–2139. [CrossRef] [PubMed]
41. Krithika Alias AnbuDevi, M.; Suganthi, K. Review of semantic segmentation of medical images using modified architectures of UNET. Diagnostics 2022, 12, 3064. [CrossRef] [PubMed]
42. Tan, H.; Liu, X.; Yin, B.; Li, X. MHSA-Net: Multihead self-attention network for occluded person re-identification. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 8210–8224. [CrossRef] [PubMed]
43. Wen, Z.; Lin, W.; Wang, T.; Xu, G. Distract your attention: Multi-head cross attention network for facial expression recognition. Biomimetics 2023, 8, 199. [CrossRef] [PubMed]
44. Chheda, R.R.; Priyadarshi, K.; Muragodmath, S.M.; Dehalvi, F.; Kulkarni, U.; Chikkamath, S. EnhanceNet: A Deep Neural Network for Low-Light Image Enhancement with Image Restoration. In International Conference on Recent Trends in Machine Learning, IOT, Smart Cities & Applications; Springer Nature: Singapore, 2023; pp. 283–300.
45. Li, C.; Yang, B. Underwater Image Enhancement Based on the Fusion of PUIENet and NAFNet. In Proceedings of the Advances in Computer Graphics: 40th Computer Graphics International Conference, CGI 2023, Shanghai, China, 28 August–1 September 2023; Springer Nature: Cham, Switzerland, 2023; pp. 335–347.
46. Wang, C.; Liu, K.; Shi, J.; Yuan, H.; Wang, W. An Image Enhancement Method for Domestic High-Resolution Remote Sensing Satellite. In Proceedings of the 2023 6th International Conference on Big Data Technologies, Qingdao, China, 22–24 September 2023; pp. 340–345.
47. Sun, Y.; Sun, J.; Sun, F.; Wang, F.; Li, H. Low-light image enhancement using transformer with color fusion and channel attention. J. Supercomput. 2024, 80, 18365–18391. [CrossRef]
48. Wang, T.; Zhang, K.; Shen, T.; Luo, W.; Stenger, B.; Lu, T. Ultra-high-definition low-light image enhancement: A benchmark and transformer-based method. Proc. AAAI Conf. Artif. Intell. 2023, 37, 2654–2662.
49. Hu, X.; Wang, J.; Xu, S. Lightweight and Fast Low-Light Image Enhancement Method Based on PoolFormer. IEICE Trans. Inf. Syst. 2024, 107, 157–160. [CrossRef]
50. Warren, C.; Giannopoulos, A.; Giannakis, I. gprMax: Open source software to simulate electromagnetic wave propagation for Ground Penetrating Radar. Comput. Phys. Commun. 2016, 209, 163–170. [CrossRef]
51. Niu, F.; Huang, Y.; He, P.; Su, W.; Jiao, C.; Ren, L. Intelligent recognition of ground penetrating radar images in urban road detection: A deep learning approach. J. Civ. Struct. Health Monit. 2024, 14, 1917–1933. [CrossRef]
52. Todkar, S.S.; Le Bastard, C.; Baltazart, V.; Ihamouten, A.; Dérobert, X. Comparative study of classification algorithms to detect interlayer debondings within pavement structures from step-frequency radar data. In Proceedings of the IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 6820–6823.
53. Sezgin, M.; Yoldemir, B.; Özkan, E.; Nazlı, H. Identification of buried objects based on peak scatter modelling of GPR A-scan signals. In Proceedings of the Detection and Sensing of Mines, Explosive Objects, and Obscured Targets XXIV, Baltimore, MD, USA, 15–17 April 2019; Volume 11012, pp. 39–48.
54. Du, P.; Liao, L.; Yang, X. Intelligent recognition of defects in railway subgrade. J. China Railw. Soc. 2010, 32, 142–146.
55. Xu, J.; Zhang, J.; Sun, W. Recognition of the typical distress in concrete pavement based on GPR and 1D-CNN. Remote Sens. 2021, 13, 2375. [CrossRef]
56. Xu, Z.; Dai, Z.; Sun, Z.; Li, W.; Dong, S. Pavement Image Enhancement in Pixel-Wise Based on Multi-Level Semantic Information. IEEE Trans. Intell. Transp. Syst. 2023, 24, 15077–15091. [CrossRef]
57. Arezoumand, S.; Mahmoudzadeh, A.; Golroo, A.; Mojaradi, B. Automatic pavement rutting measurement by fusing a high speed-shot camera and a linear laser. Constr. Build. Mater. 2021, 283, 122668. [CrossRef]
58. Ren, Z.; Tian, X.; Qin, G.; Zhou, D. Lightweight recognition method of infrared sensor image based on deep learning method. In Proceedings of the 2024 8th International Conference on Control Engineering and Artificial Intelligence, Shanghai, China, 26–28 January 2024; pp. 273–277.
59. Ullah, F.; Ansari, S.U.; Hanif, M.; Ayari, M.A.; Chowdhury, M.E.H.; Khandakar, A.A.; Khan, M.S. Brain MR image enhancement for tumor segmentation using 3D U-Net. Sensors 2021, 21, 7528. [CrossRef]
60. Zhao, M.; Yang, R.; Hu, M.; Liu, B. Deep learning-based technique for remote sensing image enhancement using multiscale feature fusion. Sensors 2024, 24, 673. [CrossRef] [PubMed]
61. Liu, F.; Hua, Z.; Li, J.; Fan, L. Dual UNet low-light image enhancement network based on attention mechanism. Multimed. Tools Appl. 2023, 82, 24707–24742. [CrossRef]
62. Zhang, Y.; Xu, B.; Zhao, T. Convolutional multi-head self-attention on memory for aspect sentiment classification. IEEE/CAA J. Autom. Sin. 2020, 7, 1038–1044. [CrossRef]
63. Li, Y.; Liu, Z.; Zhou, L.; Yuan, X.; Shangguan, Z.; Hu, X.; Hu, B. A facial depression recognition method based on hybrid multi-head cross attention network. Front. Neurosci. 2023, 17, 1188434. [CrossRef] [PubMed]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.