
A Rapid Bridge Crack Detection Method Based on Deep Learning


Abstract

The aim of this study is to enhance the efficiency and lower the expense of detecting cracks in large-scale concrete structures. A rapid crack detection method based on deep learning is proposed. A large number of artificial samples from existing concrete crack images were generated by a deep convolutional generative adversarial network (DCGAN), and the artificial samples were balanced and feature-rich. Then, the dataset was established by mixing the artificial samples with the original samples. You Only Look Once v5 (YOLOv5) was trained on this dataset to implement rapid detection of concrete bridge cracks, and the detection accuracy was compared with the results using only the original samples. The experiments show that DCGAN can mine the potential distribution of image data and extract crack features through the deep transposed convolution layer and down sampling operation. Moreover, the light-weight YOLOv5 increases channel capacity and reduces the dimensions of the input image without losing pixel information. This method maintains the generalization performance of the neural network and provides an alternative solution with a low cost of data acquisition while accomplishing the rapid detection of bridge cracks with high precision.
Citation: Liu, Y.; Gao, W.; Zhao, T.; Wang, Z.; Wang, Z. A Rapid Bridge Crack Detection Method Based on Deep Learning. Appl. Sci. 2023, 13,
Academic Editor: Andrea Carpinteri
Received: 25 July 2023
Revised: 26 August 2023
Accepted: 30 August 2023
Published: 31 August 2023
Copyright: © 2023 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://
Yifan Liu 1,2, Weiliang Gao 3,*, Tingting Zhao 1,2, Zhiyong Wang 1,2,* and Zhihua Wang 1,2
1 Institute of Applied Mechanics, College of Mechanical and Vehicle Engineering, Taiyuan University of Technology, Taiyuan 030024, China; (Y.L.); (T.Z.); (Z.W.)
2 Shanxi Key Laboratory of Material Strength and Structural Impact, Taiyuan University of Technology, Taiyuan 030024, China
3 Institute of Defense Engineering, Academy of Military Sciences (AMS), People's Liberation Army (PLA), Beijing 100850, China
* Correspondence: (W.G.); (Z.W.)
Keywords: crack detection; concrete; DCGAN; YOLOv5
1. Introduction
Concrete is widely used in dams, bridges, and other large-scale engineering structures [ ]. For these structures, maintenance, monitoring, and life assessment are very important tasks during the post-construction period [ ]. Among the various disasters that can occur during the maintenance period of bridge engineering, cracks often appear first [ ]. This is due to the uneven settlement of the bridge foundation in the vertical direction and displacement in the horizontal direction, which lead to internal stresses in the concrete structure and result in cracks [ ]. For foundations that are built in phases or subjected to the effects of frost in cold areas, deformation of the structure and cracks can also occur [ ]. Additionally, bridge cracks can cause significant harm: (1) Cracks can result in leaks, allowing water to flow into the interior of the bridge and degrading the concrete's mechanical characteristics and physical properties. When this water freezes, it causes deeper and wider cracks to form, which destabilizes the main structure of the bridge and decreases its safety level. The water in the bridge cracks will cause further development and expansion of cracks under the influence of gravity and pressure during construction [ ]. (2) The cracks will lead to carbonation in the concrete structure of the bridge. The concrete material reacts with CO2 from the environment in the presence of moisture to produce calcium carbonate. As a result, the safety and mechanical properties of the concrete structure are reduced [ ]. (3) Bridge cracks will destroy the passivation film of steel bars and metal components, which then corrode under the simultaneous penetration and action of external air and water. The rust generated after the corrosion of steel bars is more than ten times larger than the initial volume, which will reduce the stability of the reinforced concrete engineering [ ]. In summary, it is very important to obtain the location, length, width, and extension condition of bridge cracks in time through detection technology.
Appl. Sci. 2023, 13, 9878.
However, with continuous increases in the number of completed bridges and bridge spans, crack detection work becomes increasingly demanding. At the same time, the economic costs also become higher [ ]. With the accelerated growth of computing capabilities in recent years, advancements in deep learning technology have motivated us to develop a novel approach to address these problems [ ]. Li et al. [ ] presented a novel approach for bridge crack detection by enhancing the encoder–decoder framework and utilizing a mixed pooling module. The encoder employs dilated convolutions to extract the fundamental characteristics of crack images. This methodology preserves the feature image resolution and provides a wide receptive field. Notably, the experimental results demonstrated that this technique achieved significantly higher precision and recall metrics. Li et al. [ ] proposed the utilization of deep learning techniques for bridge crack detection using unmanned aerial vehicles (UAVs). They adopted the faster region convolutional neural network (faster R-CNN) algorithm based on VGG16 transfer learning to detect bridge cracks effectively. The experimental results indicated that the automatic detection of bridge cracks using UAVs and the faster R-CNN algorithm could provide enhanced efficiency without compromising accuracy. By leveraging the advantages of depthwise separable convolution, atrous convolution, and the atrous spatial pyramid pooling module, Xu et al. [ ] presented an end-to-end convolutional neural network (CNN) crack detection method. This study showed that the model had better capabilities compared with conventional classification models. Moreover, the model had the potential to be integrated into any convolutional network, serving as an efficient module for feature extraction. Based on the current research status, it is clear that obtaining large amounts of high-quality data on cracks is still a time-consuming and costly task [ ]. Additionally, it is worth discussing which form of neural network is most suitable for the rapid detection of bridge cracks.
Therefore, the DCGAN is employed to generate a large number of artificial crack samples to expand the dataset. As a representative artificial neural network (ANN), DCGAN can mine the potential distribution of image data and achieve image data fitting so that the neural network can produce high-quality realistic images based on learned image features [ ]. In terms of application fields, generative adversarial networks, a popular means of dataset augmentation and generation, have been widely used in medicine, safety inspection, agricultural production, and other fields [ ]. Luo et al. [ ] proposed an imbalanced fault diagnosis method based on the generative model of DCGAN to solve the problem of limited datasets. In order to expand fake fingerprint data, Choi et al. [ ] proposed a method to investigate whether a fake fingerprint generated by DCGAN was similar to a fake fingerprint from the dataset. At present, image generation technology is still relatively rare in the field of roads, and the intelligent detection of road cracks is also in its exploratory stages. For example, the collected images of roads often have complex backgrounds and are heavily affected by light, and features such as cracks are often difficult to distinguish [ ]. Therefore, it is of value to study bridge crack data augmentation methods based on DCGAN.
YOLOv5, which is considered to be an efficient and stable target detection neural network, is used to achieve the rapid detection of bridge cracks [ ]. In recent years, the YOLOv5 object detection architecture has gained increasing attention in the field of computer vision due to its outstanding performance in detecting various objects in real-time scenarios. The utilization of YOLOv5 for bridge crack detection brings several advantages [ ]. Firstly, YOLOv5 offers high accuracy in detecting and localizing cracks on bridge surfaces. With its advanced anchor-based and anchor-free mechanisms, YOLOv5 can effectively identify and delineate fine cracks, regardless of their length, width, or orientation. This enables engineers and researchers to obtain precise information regarding crack locations and sizes. Secondly, the real-time capabilities of YOLOv5 enable rapid crack detection, aiding in efficient inspections. Its speed-optimized architecture allows inspectors to quickly assess the conditions of bridges. This reduces the time and resources required for manual inspections, making crack detection more time-efficient and cost-effective. Moreover, YOLOv5 can adapt to varying crack features and textures, ensuring consistent and reliable crack detection across different bridge types. This flexibility eliminates the need for multiple detection techniques, simplifying the crack detection process. This study shows that the combination of DCGAN and YOLOv5 can detect engineering defects rapidly and accurately.
The sections of this study are structured as follows: Section 2 presents a detailed introduction of the establishment of the dataset and the structure and theory of DCGAN and YOLOv5. In Section 3, the training process, training environment, and predicted results are presented. Subsequently, Section 4 discusses the research motivation, limitations of this work, and future research directions. Finally, Section 5 provides the concluding remarks for this study.
2. Materials and Methods
2.1. Establishment of the Dataset
A total of 2068 bridge crack images with various morphologies were collected, as shown in Figure 1. The crack dataset is feature-rich. Figure 1a–d shows vertical cracks, cross cracks, horizontal cracks, and wider cracks, respectively. These 2068 images were used as the training data for DCGAN.
Figure 1. Partial bridge crack image data: (a–p) The representative cracks.
2.2. The Theory of DCGAN
The GAN was first proposed by Goodfellow in 2014 [ ], aiming to produce generated samples that are nearly consistent with the distribution of real data; it belongs to the category of unsupervised learning [ ]. GANs have attracted a large number of researchers and have subsequently achieved many research results in computer vision fields such as image synthesis, style transfer, image repair, and object detection [ ]. DCGAN combines CNN with GAN, which greatly improves the stability of GAN and the quality of the output results. DCGAN consists of a generator and a discriminator.
The primary objective of the generator is to understand and absorb the attributes
present in the training data. It accomplishes this by aligning the random noise distribution
with the actual distribution of the training data under the guidance of the discriminator.
This process enables the generator to produce synthetic data that closely resemble the
characteristics observed in the training dataset. The training objective function can be
expressed as follows:
$$ G^{*} = \arg\min_{G} \operatorname{Div}\left(P_{data}(x), P_{G}\right) \tag{1} $$
In the equation, $P_{data}(x)$ and $P_{G}$ are the real and generated data distributions, respectively, and $\operatorname{Div}()$ is the difference between the distributions.
The primary role of the discriminator is to differentiate between real data and the data produced by the generator while providing feedback to the generator. Both networks undergo iterative training, where their capabilities grow simultaneously until the generator is capable of producing data that can deceive the discriminator. This process focuses on refining the discerning abilities of the discriminator and enhancing the generative potential of the generator. Finally, a certain balance will be reached in terms of the capabilities of the generator and discriminator. The training objective function can be expressed as follows:
$$ D^{*} = \arg\max_{D} \operatorname{Div}\left(P_{G}, P_{data}\right) \tag{2} $$
The true likelihood of the input sample is manifested in the discriminator objective
function value, reflecting a binary classification challenge. The formula for the training
objective function can be articulated as follows:
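$$ V(G, D) = E_{x \sim P_{data}(x)}[\log D(x)] + E_{z \sim P_{z}(z)}[\log(1 - D(G(z)))] \tag{3} $$
In the equation, $E_{x \sim P_{data}(x)}$ and $E_{z \sim P_{z}(z)}$ are the expectations over the real data $x$ and the noise $z$, respectively; $D(x)$ is the discriminator output for the real data $x$; and $G(z)$ represents the noise $z$ passed through the generator.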
In a continuous space, Equation (3) can be expressed as:
$$ V(G, D) = \int_{x} \left[ P_{data}(x) \log D(x) + P_{G}(x) \log(1 - D(x)) \right] dx \tag{4} $$
For the integrand function $F(x)$,
$$ F(x) = P_{data}(x) \log D(x) + P_{G}(x) \log(1 - D(x)) \tag{5} $$
If $P_{data}(x)$ and $P_{G}(x)$ are arbitrary non-zero real numbers, Equation (5) obtains the maximum value when Equation (6) is satisfied. Equation (6) can be expressed in the following form:
$$ D^{*}(x) = \frac{P_{data}(x)}{P_{data}(x) + P_{G}(x)} \tag{6} $$
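Equation (7) can be obtained by substituting Equation (6) into Equation (4):
$$ V(G, D^{*}) = -2 \log 2 + 2\,\mathrm{JSD}\left(P_{data} \,\|\, P_{G}\right) \tag{7} $$
In the equation, $D^{*}$ is the discriminator that satisfies Equation (6), and $\mathrm{JSD}$ is the JS divergence between $P_{data}$ and $P_{G}$. The more similar $P_{data}$ and $P_{G}$ are, the smaller the JS divergence; conversely, the more different $P_{data}$ and $P_{G}$ are, the larger the JS divergence.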
Considering the generator and the discriminator together, the objective function of DCGAN can be expressed as:
$$ \min_{G} \max_{D} V(G, D) = E_{x \sim P_{data}(x)}[\log D(x)] + E_{z \sim P_{z}(z)}[\log(1 - D(G(z)))] \tag{8} $$
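The relation between the optimal discriminator and the JS divergence can be checked numerically. The following is a minimal sketch, assuming discrete distributions with strictly positive probabilities in place of the continuous ones; the function names and the example distributions are illustrative and do not come from the study.

```python
import math

def optimal_discriminator(p_data, p_g):
    # Pointwise maximizer of the integrand: D*(x) = P_data(x) / (P_data(x) + P_G(x)).
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

def value_function(p_data, p_g, d):
    # Discrete analogue of V(G, D): sum of P_data*log(D) + P_G*log(1 - D).
    return sum(pd * math.log(dx) + pg * math.log(1.0 - dx)
               for pd, pg, dx in zip(p_data, p_g, d))

def kl(p, q):
    # Kullback-Leibler divergence for discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    # Jensen-Shannon divergence: average KL to the mixture (p + q) / 2.
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical three-bin distributions, used only for illustration.
p_data = [0.5, 0.3, 0.2]
p_g = [0.2, 0.3, 0.5]

d_star = optimal_discriminator(p_data, p_g)
v_star = value_function(p_data, p_g, d_star)
identity_rhs = -2.0 * math.log(2.0) + 2.0 * jsd(p_data, p_g)
print(v_star, identity_rhs)  # the two values should agree up to rounding
```

When the two distributions coincide, the optimal discriminator outputs 0.5 everywhere and the value function reduces to -2 log 2, which is the lower bound that the generator tries to reach.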
2.3. The Architecture of the DCGAN
Figure 2 shows the generator architecture used in this study. First, the generator is supplied with a 100-dimensional random noise vector, which is up-sampled by deep transposed convolution layers to obtain feature maps at different scales. Transposed convolution is a special kind of forward convolution that first expands the dimensions of the input image [ ]. This process is also called up-sampling, and it converts images to higher resolutions. Every up-sampling operation is followed by a batch normalization layer and an activation function layer. The last layer uses the Tanh function, and the other activation function layers use the ReLU function. The initial random noise vector undergoes a series of seven up-sampling operations, resulting in the generation of an image with dimensions of 256 × 256 × 1. The complete network structure and the parameters of each layer of the generator are shown in Table 1.
Figure 2. The architecture of the DCGAN.
Figure 2 also shows the discriminator architecture used in this study. The discriminator undertakes seven down-sampling operations when presented with both the real bridge crack image and the generated crack image. These operations efficiently diminish both the size of the images and the dimensions of their respective features [ ]. Each convolution layer employs convolution kernels measuring 4 × 4. The count of convolution kernels employed in each respective layer is 64, 128, 256, 512, 1024, and 1. Each down-sampling operation is followed by a batch normalization layer and a Leaky ReLU function layer. The Leaky ReLU function layer uses a negative slope coefficient of 0.2. Consequently, the complete down-sampling process extracts the features of the images. The final loss value is derived from the features extracted by the down-sampling process. When the image of the generated bridge crack or the real bridge crack is received, the loss value is close to 1 or 0, respectively. The complete network structure and the parameters of each layer of the discriminator are shown in Table 2.
Table 1. The complete network structure of the generator.
Layer                 Output Shape        Parameters
Reshape_1             (1 × 1 × 100)       0
Conv2D_transpose_1    (4 × 4 × 1024)      1,639,424
BN_1                  (4 × 4 × 1024)      4096
Activation_1          (4 × 4 × 1024)      0
Conv2D_transpose_2    (8 × 8 × 512)       8,389,120
BN_2                  (8 × 8 × 512)       2048
Activation_2          (8 × 8 × 512)       0
Conv2D_transpose_3    (16 × 16 × 256)     2,097,408
BN_3                  (16 × 16 × 256)     1024
Activation_3          (16 × 16 × 256)     0
Conv2D_transpose_4    (32 × 32 × 128)     524,416
BN_4                  (32 × 32 × 128)     512
Activation_4          (32 × 32 × 128)     0
Conv2D_transpose_5    (64 × 64 × 64)      131,136
BN_5                  (64 × 64 × 64)      256
Activation_5          (64 × 64 × 64)      0
Conv2D_transpose_6    (128 × 128 × 32)    32,800
BN_6                  (128 × 128 × 32)    128
Activation_6          (128 × 128 × 32)    0
Conv2D_transpose_7    (256 × 256 × 1)     513
Activation_7          (256 × 256 × 1)     0
Table 2. The complete network structure of the discriminator.
Layer           Output Shape        Parameters
Conv2D_1        (128 × 128 × 64)    1088
BN_1            (128 × 128 × 64)    256
Leaky Relu_1    (128 × 128 × 64)    0
Conv2D_2        (64 × 64 × 128)     131,200
BN_2            (64 × 64 × 128)     512
Leaky Relu_2    (64 × 64 × 128)     0
Conv2D_3        (32 × 32 × 256)     524,544
BN_3            (32 × 32 × 256)     1024
Leaky Relu_3    (32 × 32 × 256)     0
Conv2D_4        (16 × 16 × 512)     2,097,664
BN_4            (16 × 16 × 512)     2048
Leaky Relu_4    (16 × 16 × 512)     0
Conv2D_5        (8 × 8 × 1024)      8,389,632
BN_5            (8 × 8 × 1024)      4096
Leaky Relu_5    (8 × 8 × 1024)      0
Conv2D_6        (4 × 4 × 1)         16,385
Flatten_1       16                  0
Dense_1         1                   17
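The parameter counts in Tables 1 and 2 can be reproduced with simple arithmetic. The sketch below assumes that every (transposed) convolution uses a 4 × 4 kernel with a bias term (the text states the 4 × 4 kernel only for the discriminator, so it is an assumption for the generator) and that each batch normalization layer stores four values per channel, as Keras does. The helper names are illustrative.

```python
def conv_params(kernel, in_channels, out_channels):
    # Weights (kernel * kernel * in * out) plus one bias per output channel.
    return kernel * kernel * in_channels * out_channels + out_channels

def bn_params(channels):
    # Scale, shift, moving mean, and moving variance for each channel.
    return 4 * channels

def dense_params(inputs, outputs):
    # Fully connected weights plus one bias per output unit.
    return inputs * outputs + outputs

KERNEL = 4  # assumed for the generator, stated in the text for the discriminator

# Channel progression read from the output shapes in Table 1.
gen_channels = [100, 1024, 512, 256, 128, 64, 32, 1]
gen_conv = [conv_params(KERNEL, c_in, c_out)
            for c_in, c_out in zip(gen_channels, gen_channels[1:])]
gen_bn = [bn_params(c) for c in gen_channels[1:-1]]

# Spatial size after each of the seven up-sampling steps: 4, 8, ..., 256.
gen_sizes = [4 * 2 ** i for i in range(7)]

# Channel progression read from the output shapes in Table 2.
disc_channels = [1, 64, 128, 256, 512, 1024, 1]
disc_conv = [conv_params(KERNEL, c_in, c_out)
             for c_in, c_out in zip(disc_channels, disc_channels[1:])]

print(gen_conv)
print(disc_conv)
print(dense_params(16, 1))
```

Under these assumptions the computed values match the tabulated ones, which suggests that the flattened tables were read correctly.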
2.4. The Architecture of YOLOv5
YOLOv5 is an efficient target detection algorithm [ ]. Similar to previous generations of YOLO algorithms, YOLOv5 adopts the concept of grids; that is, the image is divided into multiple grid cells, each of which is responsible for predicting one or more objects. Each grid cell can produce prediction boxes (i.e., anchors), and templates for three prediction boxes are generally pre-stored. Each anchor has a preset width, height, coordinates, and confidence level. The confidence level indicates the probability that an object is present in the grid cell. When the grid cell in which the anchor is located contains an object, the confidence level is 1; otherwise, it is 0. If we regard the differences in width, height, and coordinates of the anchor as losses, and the binary cross entropy as the confidence loss, then the problem of target detection is greatly simplified into a regression prediction and classification problem.
Figure 3 shows the YOLOv5 architecture used in this study. YOLOv5 mainly consists of a Backbone network, a Neck network, and a Head network. The Backbone is mainly used for feature extraction and continuous reduction of the feature map. The Neck network fuses shallow graphic features with deep semantic features, and the Head network is the classifier and regressor of YOLOv5. The main modules of YOLOv5 include Focus, CBL, CSP1_X, CSP2_X, and SPP. Among them, the Focus module operates as part of the initial processing before the Backbone network. This operation essentially retrieves a value from alternate pixels in an image, creating an effect akin to down-sampling over neighboring pixels. The original 640 × 640 × 3 image is input into the Focus module, where a slicing operation first turns it into a 320 × 320 × 12 feature map; a convolution operation then yields a feature map with a size of 320 × 320 × 32. The CBL module mainly consists of a convolutional layer, a batch normalization layer, and a Leaky ReLU function. The CSP1_X module uses the CSPNet structure, consisting of three convolutional layers and X Res unit modules, while the CSP2_X module replaces the Res unit with CBL. The SPP module carries out multi-scale fusion with Maxpool operations using kernel sizes of 5 × 5, 9 × 9, and 13 × 13.
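The slicing step of the Focus module can be illustrated without any deep learning library. The following is a minimal sketch on nested Python lists, assuming a height × width × channels layout and a particular ordering of the four pixel groups; the function name and the small 8 × 8 × 3 test image are hypothetical, and the subsequent convolution is not shown.

```python
def focus_slice(image):
    """Rearrange an H x W x C image into an (H/2) x (W/2) x 4C feature map.

    Every 2 x 2 block of pixels is moved into the channel dimension, so the
    spatial size halves while no pixel value is discarded.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    sliced = []
    for r in range(0, height, 2):
        row = []
        for c in range(0, width, 2):
            # Assumed channel order: (even row, even col), (odd row, even col),
            # (even row, odd col), (odd row, odd col).
            row.append(image[r][c] + image[r + 1][c]
                       + image[r][c + 1] + image[r + 1][c + 1])
        sliced.append(row)
    return sliced

# Hypothetical 8 x 8 x 3 image whose values encode their own position.
h, w, ch = 8, 8, 3
image = [[[(r * w + c) * ch + k for k in range(ch)] for c in range(w)]
         for r in range(h)]

sliced = focus_slice(image)
shape = (len(sliced), len(sliced[0]), len(sliced[0][0]))
print(shape)  # (4, 4, 12)
```

Applied to a 640 × 640 × 3 input, the same rearrangement gives the 320 × 320 × 12 feature map described above.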
Figure 3. The architecture of YOLOv5.
3. Results
3.1. The Training Process and Results of DCGAN
The experimental environment is presented in Table 3. The neural network is trained for 2000 epochs with a batch size of 64, and the weight parameters of both the discriminator and generator are automatically saved every 50 epochs. The loss function adopts the binary cross entropy, which can be expressed by Equation (9) [ ]. The loss curves of the generator and discriminator are presented in Figure 4. The training process is mainly divided into two phases (an unstable phase and a stable phase). The loss values fluctuate greatly in the unstable phase, after which the loss curves gradually converge. Real-time visualization allows for visual inspection of the samples. As shown in Figure 5, the training results of four representative epochs are selected. In the early phase of training (approximately epochs 0–1200), the generated cracks have obvious defects, and the morphology of the cracks is only slightly learned by the neural network. At approximately epochs 1200–1600, the morphology of the cracks gradually appears, but the generated results are still unstable. The quality of the generated cracks improves notably as the training progresses into its later stages.
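$$ Loss = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_{i} \cdot \log \hat{y}_{i} + (1 - y_{i}) \cdot \log(1 - \hat{y}_{i}) \right] \tag{9} $$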
Table 3. Parameters of the computer environment.
Name             Parameter
System           Windows 10
CPU              Intel Core i7-11800H CPU @ 2.3 GHz
Memory           8 GB
Graphics card    NVIDIA GeForce RTX3060
Environment      Python 3.6, TensorFlow 2.8.0, Keras 2.8.0, NumPy 1.22.2
Configuration    CUDA 11.2
Figure 4. The loss curves of the DCGAN: (a) The generator loss curve of the training process; (b) The discriminator loss curve of the training process.
Figure 5. The partial real-time visualization.
In this equation, $N$ is the output size of the predicted results of the neural network, $y_{i}$ is the true value of the $i$-th label, and $\hat{y}_{i}$ is the predicted value of the $i$-th label.
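The binary cross entropy of Equation (9) can be sketched in a few lines of plain Python. The clipping constant `eps` is an added assumption that keeps the logarithm finite when a prediction is exactly 0 or 1; it is not part of the equation itself.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross entropy over N label/prediction pairs."""
    n = len(y_true)
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip predictions away from 0 and 1 so that log() stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / n

# Confident, correct predictions give a small loss.
print(binary_cross_entropy([1, 0], [0.9, 0.1]))
# An undecided prediction of 0.5 gives log(2).
print(binary_cross_entropy([1], [0.5]))
```

For the discriminator, the label is 1 or 0 depending on whether the input is real or generated, so the same function serves both networks with different label assignments.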
For computational efficiency, the best training results are chosen from the images generated at the 1950th training epoch. The corresponding weight parameters are saved for future use in generating bridge cracks efficiently. Under our computer hardware conditions, the time cost of generating 1000 artificial crack samples is on the order of 10 s. Figure 6 shows some of the original crack images and generated crack images. The diversity in shapes of the generated cracks is evident, including vertical cracks, cross cracks, horizontal cracks, and X-type cracks, which are consistent with the real dataset. From a subjective perspective, the bridge cracks produced by the trained DCGAN effectively capture the primary features of the original cracks, and it can be challenging to differentiate the generated samples from the original cracks. Finally, 90 generated bridge cracks are used as the dataset in the following experiments and analysis.
Figure 6. The partial original samples and generated samples: (a–d) The original crack samples; (e–h) The generated crack samples.
3.2. The Training Process and Results of YOLOv5
The experimental environment is presented in Table 4. The network is trained for a total of 300 epochs with a batch size of 4. The binary cross entropy, box loss, and object loss are employed as the loss functions of the neural network. The classification loss is evaluated by the binary cross entropy; that is, it measures whether the classification of the anchor matches the corresponding labeled class. The box loss function evaluates the location loss; that is, it measures the error between the anchor and the labeled box, and it can be expressed by Equation (10). The object loss is calculated from the IOU [ ], which is the ratio of the intersection to the union of the real and predicted boxes.
Table 4. Parameters of the computer environment.
Name             Parameter
System           Windows 10
CPU              Intel Core i7-11800H CPU @ 2.3 GHz
Memory           8 GB
Graphics card    NVIDIA GeForce RTX3060
Environment      Python 3.8.5, PyTorch 1.8.0, NumPy 1.21.5
Configuration    CUDA 11.2
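$$ L_{CIOU} = 1 - IOU + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha v \tag{10} $$
$$ \alpha = \frac{v}{(1 - IOU) + v} \tag{11} $$
$$ v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2} \tag{12} $$
In Equations (10)–(12), $w$ and $w^{gt}$ are the widths of the predicted and real boxes, and $h$ and $h^{gt}$ are the heights of the predicted and real boxes, respectively; $b$ and $b^{gt}$ are the central points of the predicted and real boxes, respectively; $\rho$ represents the Euclidean distance between the two central points; and $c$ is the diagonal length of the smallest enclosing area that is able to encompass both the predicted and real boxes.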
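The box loss of Equations (10)–(12) can be evaluated directly for a pair of boxes. The sketch below assumes boxes given as (center x, center y, width, height) tuples and treats the weight of the aspect-ratio term as zero when both boxes coincide exactly; the function name and the box format are illustrative choices rather than the implementation used in the study.

```python
import math

def ciou_loss(pred, gt):
    """CIoU loss for two boxes given as (x_center, y_center, width, height)."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    # Corner coordinates of both boxes.
    p_x1, p_y1, p_x2, p_y2 = px - pw / 2, py - ph / 2, px + pw / 2, py + ph / 2
    g_x1, g_y1, g_x2, g_y2 = gx - gw / 2, gy - gh / 2, gx + gw / 2, gy + gh / 2
    # Intersection over union.
    inter_w = max(0.0, min(p_x2, g_x2) - max(p_x1, g_x1))
    inter_h = max(0.0, min(p_y2, g_y2) - max(p_y1, g_y1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter
    iou = inter / union
    # Squared distance between the two central points.
    rho2 = (px - gx) ** 2 + (py - gy) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = ((max(p_x2, g_x2) - min(p_x1, g_x1)) ** 2
          + (max(p_y2, g_y2) - min(p_y1, g_y1)) ** 2)
    # Aspect-ratio consistency term and its weight.
    v = 4.0 / math.pi ** 2 * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    denom = (1.0 - iou) + v
    alpha = v / denom if denom > 0 else 0.0  # assumption: no penalty for identical boxes
    return 1.0 - iou + rho2 / c2 + alpha * v

print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(ciou_loss((0, 0, 2, 2), (4, 0, 2, 2)))  # disjoint boxes of equal shape
```

Unlike a plain IOU loss, the distance term keeps the loss informative even when the boxes do not overlap, as the second call shows.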
The learning rate is a very important hyperparameter of the neural network, and it influences the accuracy and speed of the training process [ ]. The learning rate curves of the training process of YOLOv5 are shown in Figure 7. The warm-up algorithm is used in the initial phase of training (phase 1 in Figure 7). The purpose of the warm-up algorithm is as follows [ ]: At the beginning of training, the weights of the model are randomly initialized, and a large learning rate at this time may make the model unstable (oscillation). The warm-up algorithm instead trains for several epochs with a small learning rate at the beginning. The model can therefore stabilize gradually before switching to the pre-set learning rate for the subsequent training, which makes the model converge faster and perform better. The cosine annealing algorithm is used throughout the remainder of the training (phase 2 in Figure 7) [ ]. Cosine annealing uses the cosine function to reduce the learning rate: as x increases, the cosine value first decreases slowly, then decreases more rapidly, and finally decreases slowly again, which suits the learning rate requirements of the gradient descent algorithm. In this study, the initial learning rate and the pre-set learning rate are set to 0.001 and 0.01, respectively.
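A two-phase schedule of this kind can be sketched as follows. The initial and pre-set learning rates (0.001 and 0.01) and the 300 training epochs are taken from the text; the three-epoch linear warm-up and the final learning rate of zero are assumptions made only for illustration, since the text does not state them.

```python
import math

def learning_rate(epoch, total_epochs=300, warmup_epochs=3,
                  lr_init=0.001, lr_base=0.01, lr_final=0.0):
    """Linear warm-up followed by cosine annealing.

    warmup_epochs and lr_final are hypothetical values chosen for this sketch.
    """
    if epoch < warmup_epochs:
        # Phase 1: ramp linearly from the initial to the pre-set learning rate.
        return lr_init + (lr_base - lr_init) * epoch / warmup_epochs
    # Phase 2: cosine decay from the pre-set learning rate to lr_final.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return lr_final + 0.5 * (lr_base - lr_final) * (1.0 + math.cos(math.pi * progress))

for epoch in (0, 1, 3, 150, 299):
    print(epoch, learning_rate(epoch))
```

The cosine phase changes slowly near its start and end and fastest in the middle, which is the slow-fast-slow behavior described above.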
The original dataset (including 180 training samples and 10 validation samples) and the extended dataset (including 180 training samples and 10 validation samples) were established. The samples in the original dataset are all real images, while half of the training samples in the extended dataset are generated. The detection accuracy and model performance were evaluated using Precision, Recall, MAP-0.5, and MAP-0.5:0.95. The precision and recall can be expressed by Equations (13) and (14), respectively. AP is the average precision; that is, the area under the PR curve (recall on the horizontal axis and precision on the vertical axis) of a specific class over all images. MAP is the mean of the AP values of all classes over all images. Therefore, MAP-0.5 represents the value of MAP when the IOU threshold is equal to 0.5, and MAP-0.5:0.95 is the average of the MAP values obtained at IOU thresholds of 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, and 0.95.
$$ Precision = \frac{TP}{TP + FP} \tag{13} $$
In Equation (13), TP is the number of positive samples (IOU ≥ threshold) with the correct classification, and FP is the number of negative samples (IOU < threshold) with the wrong classification.
$$ Recall = \frac{TP}{TP + FN} \tag{14} $$
In Equation (14), FN is the number of positive samples with the wrong classification.
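Equations (13) and (14) and the threshold grid behind MAP-0.5:0.95 can be written out as a short sketch. The counts in the example are hypothetical and serve only to show the arithmetic; returning 0 for an empty denominator is a convention chosen here, not something the text specifies.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def map_iou_thresholds():
    # IOU thresholds from 0.5 to 0.95 in steps of 0.05.
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]

def mean_over_thresholds(map_values):
    # MAP-0.5:0.95 is the plain average of the MAP values at each threshold.
    return sum(map_values) / len(map_values)

# Hypothetical counts: 8 correct detections, 2 false alarms, 4 missed cracks.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(precision, recall)
print(map_iou_thresholds())
```

Raising the IOU threshold turns loosely localized detections into false positives, which is why MAP-0.5:0.95 is normally lower than MAP-0.5.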
Figure 7. The learning rate curve of YOLOv5.
Each loss curve and index of the training process is shown in Figure 8. It can be seen that the loss and accuracy curves of the original and extended datasets almost coincide. The final loss and accuracy convergence values are also close. Therefore, the YOLOv5 models trained on the original dataset and the extended dataset have similar detection accuracies. Moreover, two representative YOLOv5 models (YOLOv5m and YOLOv5l) were trained on the extended dataset and compared with the presented model [ ]. Figure 9 shows that the three models reach similar training losses and detection accuracies. However, the model sizes of YOLOv5m and YOLOv5l are larger, leading to longer training times and greater GPU memory requirements. Figure 10 shows the detection results of the trained YOLOv5 based on the original dataset (Figure 10a–c) and the extended dataset (Figure 10d–f). Under our computer hardware conditions, the time cost of detecting one crack image is on the order of 10 ms. Zheng et al. [ ] used an improved YOLOv5 to carry out automatic concrete pavement crack detection, and the time cost of detecting one crack image was approximately 5.1 ms. Pei et al. [ ] also employed a deep learning method to extend the dataset and the faster R-CNN to detect the cracks. The average precision was approximately 0.86, but they used a large dataset containing 1000 real images and 3000 generated cracks. The results indicate that the proposed method is effective.
Figure 8. The loss and accuracy curves of YOLOv5: (a) The box loss curves of the training dataset; (b) The box loss curves of the validation dataset; (c) The object loss curves of the training dataset; (d) The object loss curves of the validation dataset; (e) The Precision curves of the training process; (f) The Recall curves of the training process; (g) The MAP-0.5 curves of the training process; (h) The MAP-0.5:0.95 curves of the training process.
Figure 9. Comparison of three different YOLOv5 models: (a) The object loss curves of the validation dataset; (b) The MAP-0.5:0.95 curves of the training process.
Figure 10. The detection results of YOLOv5: (a–c) The detection results based on the original dataset; (d–f) The detection results based on the extended dataset.
4. Discussion
Large-scale engineering projects play a pivotal role in modern society, ranging from infrastructure development to environmental protection. However, the success and sustainability of these projects heavily rely on their post-construction maintenance and monitoring. Effective post-construction management is important for ensuring the long-term functionality and safety of such projects [ ]. One primary reason for the significance of post-construction maintenance and monitoring is the dynamic properties of engineering structures and systems. As these structures are exposed to changing environmental conditions, they undergo various forms of deterioration over time. Without proper maintenance, this deterioration can escalate, resulting in impaired functionality, decreased performance, and compromised safety. Regular inspections, repairs, and replacements are essential to rectify damage, avoid catastrophic failures, and extend the lifespan of large-scale engineering projects [ ]. Additionally, post-construction monitoring provides crucial data for understanding the behavior and performance of these projects in real-world conditions. By collecting and analyzing information on structural stresses, vibrations, deflections, environmental impacts, and defects, it is possible to identify potential issues and optimize the design and operation of similar future projects. Such monitoring also facilitates prompt interventions, allowing for early identification of problems and effective maintenance strategies [48].
Bridge crack detection is an important task in large-scale engineering project monitoring, as cracks on bridges may pose significant hazards [49]. It is important to recognize that the severity and potential hazards associated with cracks depend on various factors, including the type, size, and location of the crack, as well as the overall condition of the bridge. The damage to the bridge caused by cracks can also be diverse [50,51]:
• Cracks can weaken the structure, making it susceptible to sudden failure, particularly under heavy traffic loads or extreme weather conditions.
• Cracks can accelerate the degradation and deterioration of bridge materials. Moisture penetration through cracks can promote corrosion in reinforced concrete or steel elements, further compromising their structural integrity.
• Cracks can affect the dynamic behavior of bridges, leading to decreased stability and increased vulnerability to external forces, such as earthquakes or strong winds. Moreover, fatigue cracks caused by repetitive loading and stress cycles introduce a gradual deterioration process that can eventually lead to catastrophic failure.
• A key concern related to cracks on bridges is their impact on the safety of transportation users.
• Settlement cracks, arising from the differential settlement of bridge foundations, can cause misalignments and deformations that affect the overall stability and functionality of the structure.
To address the hazards posed by cracks on bridges, effective maintenance and repair strategies are essential. Routine inspections and monitoring programs play a crucial role in detecting cracks at an early stage and evaluating their severity. Depending on the size and extent of a crack, appropriate repair techniques, such as crack injection or grouting, need to be implemented to restore structural integrity. In summary, cracks pose significant hazards to bridges, jeopardizing their structural integrity and safety. Understanding the causes, types, detection methods, and consequences of cracks in bridges is crucial for designing appropriate maintenance and repair strategies [12].
However, traditional inspection methods are costly for massive bridges spanning large distances. Further research and technological advances in rapid crack detection are instrumental for ensuring the long-term sustainability of bridge infrastructure and the safety of the public. With the rapid development of computer technology, deep learning provides a new way to solve this problem [52]. Firstly, drone technology can be used to sample the entire bridge [53]. This technique has the following advantages: (1) It is safe and reliable. The use of drones for inspecting bridges eliminates the need for manual inspection, avoids casualties, improves operational safety, and saves inspection costs. (2) Unmanned inspection is accurate, since the drone can carry high-definition cameras to capture images of cracks. (3) Drone-based bridge inspection is more efficient. Drone sampling technology is relatively mature. Jiang et al. [54] proposed a collision-tolerant UAV together with a two-stage inspection method for bridge coatings. Junwon et al. [55] extensively described the principles of drone-facilitated inspections as well as key factors to consider for optimal data collection.
The images of the entire bridge captured by the UAV can be divided into many small pictures and fed into a computer for crack detection. Although deep learning techniques enable rapid crack detection, they still require a large dataset to train the neural network. Additionally, the existing small-sample training strategy is not sufficient to solve the issue of data scarcity [56]. To further reduce the cost of data acquisition and manual detection, the DCGAN can be employed to generate artificial crack samples from existing crack images. Peng et al. [57] used the DCGAN to reconstruct an oil reservoir fracture model, and the results showed that the reconstructed model could predict the pressure distribution accurately.
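The step of dividing a large UAV image into small pictures can be sketched as follows. This is a minimal, framework-free illustration and not the implementation used in this study: the function name `tile_image`, the nested-list image representation, and the tile size are assumptions made for the example. The handling of image borders is also an assumption; here, edge tiles are simply allowed to be smaller than the nominal tile size.

```python
from typing import Iterator, List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values (assumed representation)


def tile_image(image: Image, tile: int) -> Iterator[Tuple[int, int, Image]]:
    """Yield (top, left, patch) for non-overlapping tiles of side `tile`.

    Edge patches may be smaller than `tile` when the image size is not a
    multiple of the tile size (an assumption; padding is another option).
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Slice the rows first, then the columns of each selected row.
            patch = [row[left:left + tile] for row in image[top:top + tile]]
            yield top, left, patch


# Example: a 5 x 7 dummy image split into tiles of nominal size 4 x 4.
demo = [[r * 10 + c for c in range(7)] for r in range(5)]
tiles = list(tile_image(demo, 4))
```

Keeping the `(top, left)` offset of each tile allows a detection made on a small picture to be mapped back to its position in the full bridge image.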
Finally, YOLOv5 is used to carry out the detection process. YOLOv5 uses adaptive anchor boxes: it learns the best anchor boxes for the dataset during training, without having to run the K-means clustering algorithm offline to obtain the K anchor boxes and then modify the head network parameters. Overall, the training process of YOLOv5 is simple and automated. Moreover, YOLOv5 uses the Focus module, which increases the channel count (the channel count has a minimal influence on the amount of computation) and reduces the spatial dimensions of the input image without losing pixel information. As a result, the computational cost of the model is greatly reduced [58].
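The slicing idea behind the Focus module can be illustrated with a short sketch. It is written in plain Python on nested lists rather than with a deep learning framework, and it covers only the pixel rearrangement; in YOLOv5 the rearranged tensor is then passed through a convolution, which is omitted here. The function name `focus_slice`, the channel ordering, and the requirement of even height and width are assumptions made for the example.

```python
from typing import List

Channel = List[List[float]]  # one H x W channel as rows of pixel values


def focus_slice(channels: List[Channel]) -> List[Channel]:
    """Rearrange C x H x W into 4C x H/2 x W/2 by taking every second pixel.

    H and W are assumed to be even. Each input pixel lands in exactly one
    output channel, so no pixel value is discarded.
    """
    height, width = len(channels[0]), len(channels[0][0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    out: List[Channel] = []
    # Four sub-grids, selected by (row offset, column offset).
    for row_off, col_off in ((0, 0), (1, 0), (0, 1), (1, 1)):
        for ch in channels:
            out.append([row[col_off::2] for row in ch[row_off::2]])
    return out


# Example: one 4 x 4 channel becomes four 2 x 2 channels.
x = [[[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11],
      [12, 13, 14, 15]]]
y = focus_slice(x)
```

The spatial size is halved in each direction while the channel count is multiplied by four, so the 16 input values are all still present in the output; this is the sense in which the dimensions are reduced without losing pixel information.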
This study has several limitations. The quality of the samples generated by the DCGAN needs to be improved; it is influenced by many factors, such as the size of the training dataset, the computing power of the computer, and hyperparameter optimization. During the training of the DCGAN, problems such as mode collapse and image collapse may also be encountered [ ]. In future work, we will investigate more efficient DCGAN training strategies and achieve a higher-quality reconstruction of the cracks.
5. Conclusions
A method combining DCGAN and YOLOv5 that can detect bridge cracks rapidly and accurately was proposed. Moreover, we described the theory, training environment, parameter setting, and neural architecture of the DCGAN and YOLOv5. This work investigated the performance of the proposed model and compared the training results of the extended dataset with those of the original dataset. The main findings are presented as follows:
• The trained DCGAN can learn the characteristics of cracks and quickly generate a large number of artificial bridge crack images, which are used to extend the real dataset. The time cost of generating 1000 artificial crack samples was on the order of 10 s. The generated images were balanced and feature-rich.
• The YOLOv5 target detection neural network can perform crack identification and rapid detection. The time cost of detecting one crack image is on the order of 10 ms.
• The results indicate that when YOLOv5 was trained on the extended dataset, its detection accuracy was similar to that obtained with the original (real) dataset, which provides a new idea for the cost control of maintenance and monitoring of large-scale concrete structures.
The proposed method of combining DCGAN and YOLOv5 has been shown to be acceptable, especially in terms of cost-effectiveness. However, the quality of the images generated by the DCGAN and the detection accuracy of YOLOv5 need to be improved. Both are influenced by factors such as the computing power of the computer, hyperparameter optimization, and training strategy. In future work, advanced training strategies and optimization algorithms for neural networks will be the focus of our research.
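The orders of magnitude reported in the findings (about 10 s to generate 1000 artificial samples and about 10 ms to detect one image) can be turned into a rough time budget. The sketch below is only a back-of-the-envelope aid: the function names are hypothetical, the two constants are the order-of-magnitude figures quoted above rather than measured values, and the batch size of 5000 images in the example is an arbitrary assumption.

```python
GEN_SECONDS_PER_1000 = 10.0       # order of magnitude quoted for DCGAN sampling
DETECT_SECONDS_PER_IMAGE = 0.010  # order of magnitude quoted for YOLOv5 detection


def generation_time(n_samples: int) -> float:
    """Approximate time in seconds to generate n_samples artificial images."""
    return n_samples / 1000 * GEN_SECONDS_PER_1000


def detection_time(n_images: int) -> float:
    """Approximate time in seconds to run detection on n_images images."""
    return n_images * DETECT_SECONDS_PER_IMAGE


def detection_throughput() -> float:
    """Approximate number of images processed per second."""
    return 1.0 / DETECT_SECONDS_PER_IMAGE


# Example: a hypothetical inspection producing 5000 small images.
budget = detection_time(5000)
```

Under these assumptions, generating one artificial sample and detecting one image both cost about 10 ms, which corresponds to roughly 100 images per second and about 50 s for the hypothetical batch of 5000 images.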
Author Contributions: Conceptualization, Y.L. and W.G.; Methodology, Y.L. and W.G.; Software, T.Z.; Validation, W.G., T.Z., Z.W. (Zhiyong Wang) and Z.W. (Zhihua Wang); Formal analysis, Y.L.; Investigation, Y.L.; Data curation, Y.L.; Writing – original draft, Y.L.; Writing – review & editing, W.G., T.Z., Z.W. (Zhiyong Wang) and Z.W. (Zhihua Wang); Supervision, Z.W. (Zhiyong Wang) and Z.W. (Zhihua Wang); Funding acquisition, Z.W. (Zhiyong Wang) and Z.W. (Zhihua Wang). All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (Grant Nos. 12102294, 12272257), the Natural Science Foundation of Shanxi Province (202203021211169), and the special fund for Science and Technology Innovation Teams of Shanxi Province (No. 202204051002006).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Khem, F.C.; Kai, S.W.; Jee, K.H.; Jee, H.L.; Foo, W.L.; Yee, L.L. Experimental and numerical study of the strength performance of deep beams with perforated thin mild steel plates as shear reinforcement. Appl. Sci. 2023, 13, 8217.
2. Jack, M.; Marcus, P.; Christos, V.; Lorena, B.; Brenden, L. Robotic spray coating of self-sensing metakaolin geopolymer for concrete monitoring. Automat. Constr. 2021, 121, 103415.
3. Zhang, C.Y.; Wang, M.; Liu, R.T.; Li, X.H.; Yan, J.; Du, H.J. Enhancing self-healing efficiency of concrete using multifunctional granules and PVA fibers. J. Build. Eng. 2023, 76, 107314.
4. Gabriele, B.; Mario, F.; Luca, G.; Marzia, M. Preliminary investigation on steel jacketing retrofitting of concrete bridges half-joints. Appl. Sci. 2023, 13, 8181.
5. Jang, K.; Jung, H.; An, Y. Automated bridge crack evaluation through deep super resolution network-based hybrid image matching. Automat. Constr. 2022, 137, 104229.
6. Zhang, T.J. Analysis on the causes of cracks in bridges. J. Constr. Res. 2018, 1, 13–26.
7. Huang, Y.F.; Chen, Y.G.; Deng, F.M.; Wang, X.M. Design of CB-PDMS flexible sensing for monitoring of bridge cracks. Sensors 2022, 22, 9817.
8. Yu, S.; He, F.C.; Zhang, J.R. Experimental PIV radial splitting study on expansive soil during the drying process. Appl. Sci. 2023, 13, 8050.
9. Hawley, C.J.; Gräbe, P.J. Water leakage mapping in concrete railway tunnels using LiDAR generated point clouds. Constr. Build. Mater. 2022, 361, 129644.
10. Jiang, C.; Gu, X.L.; Huang, Q.H.; Zhang, W.P. Carbonation depth predictions in concrete bridges under changing climate conditions and increasing traffic loads. Cement. Concrete Comp. 2018, 93, 140–154.
11. Oday, I.M.; Salah, S.A.; Alaa, S.A.; Hassane, L.; Belkheir, H.; Abdelkarim, C.; Young, G.K. On the development of an intelligent Poly(aniline-co-o-toluidine)/Fe3O4/Alkyd coating for corrosion protection in carbon steel. Appl. Sci. 2023, 13, 8189.
12. Li, G.; Liu, T.; Fang, Z.Y.; Shen, Q.; Ali, J. Automatic bridge crack detection using boundary refinement based on real-time segmentation network. Struct. Control. Health Monitor. 2022, 29, 2991.
13. Sepasdar, R.; Karpatne, A.; Shakiba, M. A data-driven approach to full-field nonlinear stress distribution and failure pattern prediction in composites using deep learning. Comput. Method. Appl. Mech. Eng. 2022, 397, 115126.
14. Yaser, G.; Jonny, N.; Taufik, N.; Andrzej, C. Formwork pressure prediction in cast-in-place self-compacting concrete using deep learning. Automat. Constr. 2023, 151, 104869.
15. Masi, F.; Stefanou, I. Multiscale modeling of inelastic materials with Thermodynamics-based Artificial Neural Networks (TANN). Comput. Method. Appl. Mech. Eng. 2022, 398, 115190.
16. Li, G.; Fang, Z.Y.; Mohammed, A.M.; Liu, T.; Deng, Z.H. Automated bridge crack detection based on improving encoder–decoder network and strip pooling. J. Infrastruct. Syst. 2023, 29, 218.
17. Li, R.X.; Yu, J.Y.; Li, F.; Yang, R.T.; Wang, Y.D.; Peng, Z.H. Automatic bridge crack detection using unmanned aerial vehicle and Faster R-CNN. Constr. Build. Mater. 2023, 362, 129659.
18. Xu, H.Y.; Su, X.; Wang, Y.; Cai, H.Y.; Cui, K.R.; Chen, X.D. Automatic bridge crack detection using a convolutional neural network. Appl. Sci. 2019, 9, 2867.
19. Oh, K.; Kim, E.; Park, C.Y.; Chen, X.D. A physical model-based data-driven approach to overcome data scarcity and predict building energy consumption. Sustainability 2022, 14, 9464.
20. Abdelhalim, I.S.A.; Mohamed, M.F.; Mahdy, Y.B. Data augmentation for skin lesion using self-attention based progressive generative adversarial network. Expert. Syst. Appl. 2021, 165, 113922.
21. Pawar, S.P.; Talbar, S.N. LungSeg-Net: Lung field segmentation using generative adversarial network. Biomed. Signal Process. Control 2021, 64, 102296.
22. Kazuhiro, K.; Werner, R.A.; Toriumi, F.; Javadi, M.S.; Pomper, M.G.; Solnes, L.B.; Verde, F.; Higuchi, T.; Rowe, S.P. Generative adversarial networks for the creation of realistic artificial brain magnetic resonance images. Tomography, 4, 159–163.
23. Luo, J.; Huang, J.; Li, H. A case study of conditional deep convolutional generative adversarial networks in machine fault diagnosis. J. Intell. Manuf. 2021, 32, 407–425.
24. Choi, S.H.; Jung, S.H.; Li, H. Similarity analysis of actual fake fingerprints and generated fake fingerprints by DCGAN. Int. J. Fuzzy Log. Intell. Syst. 2019, 19, 40–47.
25. Zhang, K.G.; Zhang, Y.T.; Cheng, H.D. CrackGAN: Pavement crack detection using partially accurate ground truths based on generative adversarial learning. IEEE Trans. Intell. Transp. 2021, 22, 1306–1319.
26. Yang, H.Y.; Yang, L.N.; Wu, T.; Meng, Z.Q.; Huang, Y.J.; Wang, P.S.; Li, P.; Li, X.C. Automatic detection of bridge surface crack using improved Yolov5s. Int. J. Pattern. Recogn. 2022, 36, 2250047.
27. Mahaur, B.; Mishra, K.K. Small-object detection based on Yolov5 in autonomous driving systems. Pattern. Recogn. Lett. 115–122.
28. Zhou, S.; Bi, Y.; Wei, X.; Liu, J.; Ye, Z.; Li, F.; Du, Y. Automated detection and classification of spilled loads on freeways based on improved YOLO network. Mach. Vis. Appl. 2021, 32, 44.
29. Hu, W.X.; Xiong, J.T.; Liang, J.H.; Xie, Z.M.; Liu, Z.Y.; Huang, Q.Y.; Yang, Z.G. A method of citrus epidermis defects detection based on an improved YOLOv5. Biosyst. Eng. 2023, 227, 19–35.
30. Tang, Z.; Zhou, L.; Qi, F.; Chen, H.R. An improved lightweight and real-time YOLOv5 network for detection of surface defects on indocalamus leaves. J. Real-Time Image Process. 2023, 20, 14.
31. Jiang, Q.; Li, H. Silicon energy bulk material cargo ship detection and tracking method combining YOLOv5 and DeepSort. Energy. Rep. 2023, 9, 151–158.
32. Lan, J.G.; Jean, P.; Mehdi, M.; Xu, B.; David, W.; Sherjil, O.; Aaron, C.; Yoshua, B. Generative Adversarial Networks. CoRR. 1–9.
33. Won, U.Y.; An, V.Q.; Park, S.B.; Park, M.H.; Dam, D.V.; Park, H.J.; Yang, H.; Lee, Y.; Yu, W.J. Multi-neuron connection using multi-terminal floating-gate memristor for unsupervised learning. Nat. Commun. 2023, 14, 3070.
34. Koochak, R.; Sayyafzadeh, M.; Nadian, A.; Bunch, M.; Haghighi, M. A variability aware GAN for improving spatial representativeness of discrete geobodies. Comput. Geosci. 2022, 166, 105188.
35. Kenshi, M.; Isal, N.; Yasuhiro, W. Transposed convolution as alternative preprocessor for brain-computer interface using electroencephalogram. Appl. Sci. 2023, 13, 3578.
36. Swamy, A.S.A.; Shylashree, N. HDR Image compression by multi-scale down sampling of intensity levels. Int. J. Image Graph. 2021, 21, 2150048.
37. Hamzah, N.A.b.A.; Hadhrami, B.A.G.; Al-Selwi, H.F.; Hassan, N.; Aziz, A.b.A. Facial Mask Detection and Energy Monitoring Dashboard Using YOLOv5 and Jetson Nano. In Proceedings of the Multimedia University Engineering Conference (MECON 2022); Atlantis Press: Amsterdam, The Netherlands, 2022.
38. Liu, Y.F.; Zhang, J.; Zhao, T.T.; Wang, Z.Y.; Wang, Z.H. Reconstruction of the meso-scale concrete model using a deep convolutional generative adversarial network (DCGAN). Constr. Build. Mater. 2023, 370, 130704.
39. Xu, X.; Qiao, H.B.; Ma, X.M.; Yin, G.H.; Wang, Y.K.; Zhao, J.P.; Li, H.Y. An automatic wheat ear counting model based on the minimum area intersection ratio algorithm and transfer learning. Measurement 2023, 216, 112849.
40. Yevick, D.; Melko, R. The accuracy of restricted Boltzmann machine models of Ising systems. Comput. Phys. Commun. 107518.
41. He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Deep residual learning for image recognition. CoRR. 2015, 1–12.
42. Llya, L.; Frank, H. Decoupled weight decay regularization. CoRR. 2019, 1–11.
43. Huang, W.X.; Huo, Y.; Yang, S.C.; Liu, M.J.; Li, H.; Zhang, M. Detection of Laodelphax striatellus (small brown planthopper) based on improved YOLOv5. Comput. Electron. Agric. 2023, 206, 107657.
44. Zheng, X.; Qian, S.R.; Wei, S.D.; Zhou, S.Y.; Hou, Y. The combination of transformer and you only look once for automatic concrete pavement crack detection. Appl. Sci. 2023, 13, 9211.
45. Pei, L.L.; Sun, Z.Y.; Sun, J.; Li, W.; Zhang, H. Generation method of pavement crack images based on deep convolutional generative adversarial networks. J. Cent. South Univ. (Sci. Technol.) 2021, 52, 2899–3906.
46. Marwan, H.; Khaled, K. Studying the effectiveness of changing parameters in pavement management systems on optimum maintenance strategies of low-volume paved roads. J. Transp. Eng. Part B Pavements 2021, 147, 04020075.
47. La, M.L.; Oddo, M.C.; Cucchiara, C.; Granata, M.F.; Barile, S.; Pappalardo, F.; Pennisi, A. Experimental investigation on innovative stress sensors for existing masonry structures monitoring. Appl. Sci. 2023, 13, 3712.
48. Orlowsky, J.; Beßling, M.; Kryzhanovskyi, V. Prospects for the use of textile-reinforced concrete in buildings and structures maintenance. Buildings 2023, 13, 189.
49. Wang, D.; Zhao, Y.; Wang, J.F.; Wang, Q.; Liu, X.D.; Pappalardo, F.; Pennisi, A. Establishment and effect analysis of traffic load for long-span bridge via fusion of parameter correlation. Structure 2023, 55, 1992–2002.
50. Eslam, M.A.; Osama, M.; Mohamed, M.; Tarek, Z. Entropy-Based automated method for detection and assessment of spalling severities in reinforced concrete bridges. J. Perform. Constr. Fac. 2021, 35, 04020132.
51. Barros, J.A.O.; Baghi, H.; Ventura-Gouveia, A. Assessing the applicability of a smeared crack approach for simulating the behaviour of concrete beams flexurally reinforced with GFRP bars and failing in shear. Eng. Struct., 227, 111391.
52. Qian, H.J.; Li, Y.; Yang, J.F.; Xie, L.H.; Tang, K.H. Segmentation and analysis of cement particles in cement paste with deep learning. Cement Concrete Comp. 2023, 136, 104819.
53. Dai, X.; Nagahara, M. Platooning control of drones with real-time deep learning object detection. Adv. Robot., 37, 220–225.
54. Jiang, S.; Wu, Y.Q.; Zhang, J. Bridge coating inspection based on two-stage automatic method and collision-tolerant unmanned aerial system. Automat. Constr. 2023, 146, 104685.
55. Junwon, S.; Luis, D.; Jim, W. Drone-enabled bridge inspection methodology and application. Automat. Constr., 94, 112–126.
56. Zhang, X.Y.; Zhao, T.T.; Liu, Y.F.; Chen, Q.Q.; Wang, Z.Y.; Wang, Z.H. A data-driven model for predicting the mixed-mode stress intensity factors of a crack in composites. Eng. Fract. Mech. 2023, 288, 109385.
57. Peng, X.Y.; Rao, X.; Zhao, H.; Xun, Y.F.; Zhong, X.; Zhen, W.T.; Huang, L.Y. A proxy model to predict reservoir dynamic pressure profile of fracture network based on deep convolutional generative adversarial networks (DCGAN). J. Petrol. Sci. Eng. 109577.
58. Klinwichit, P.; Yookwan, W.; Limchareon, S.; Chinnasarn, K.; Jang, J.S.; Onuean, A. BUU-LSPINE: A Thai open lumbar spine dataset for spondylolisthesis detection. Appl. Sci. 2023, 13, 8646.
Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.