Bicrack: a bilateral network for real-time crack detection


Abstract and Figures

Crack detection is an important task for ensuring structural safety. Traditional manual inspection is extremely time-consuming and labor-intensive, while existing deep learning-based methods commonly suffer from low inference speed and from interruptions in cracks that should be detected as continuous. To solve these problems, a novel bilateral crack detection network (BiCrack) is proposed for real-time crack detection. Specifically, the network fuses two feature branches to achieve the best trade-off between accuracy and speed. A detail branch with a shallow convolutional layer preserves as much crack detail as possible and generates high-resolution features. Meanwhile, a semantic branch with a fast-downsampling strategy extracts high-level semantic information. A simple pyramid pooling module (SPPM) is then proposed to aggregate multi-scale context information at low computational cost. In addition, to enhance feature representation, an attention-based feature fusion module (FFM) is introduced, which uses spatial and channel attention to generate weights and then fuses the input features with these weights. To demonstrate its effectiveness, the proposed method was evaluated on five challenging datasets and compared with state-of-the-art crack detection methods. Extensive experiments show that BiCrack achieves the best performance in the crack detection task compared to other methods.
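To make the idea of aggregating multi-scale context concrete, the following is a minimal pure-Python sketch of pyramid pooling on a single 2D feature map. It is an illustration under stated assumptions, not the paper's SPPM: the abstract does not give the bin sizes, the fusion rule, or any convolution layers, so the bin sizes (1, 2, 4), nearest-neighbour upsampling, element-wise summation, and the function names `adaptive_avg_pool`, `upsample_nearest` and `pyramid_pool` are all hypothetical choices made here.

```python
def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D map (list of rows) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        r0 = (i * h) // bins                    # first row of this bin
        r1 = ((i + 1) * h + bins - 1) // bins   # one past the last row (ceil)
        row = []
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def upsample_nearest(grid, h, w):
    """Resize a square grid back to h x w by nearest-neighbour lookup."""
    bins = len(grid)
    return [[grid[(r * bins) // h][(c * bins) // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2, 4)):
    """Sum the pooled-and-upsampled maps over all (assumed) bin sizes."""
    h, w = len(fmap), len(fmap[0])
    acc = [[0.0] * w for _ in range(h)]
    for bins in bin_sizes:
        up = upsample_nearest(adaptive_avg_pool(fmap, bins), h, w)
        for r in range(h):
            for c in range(w):
                acc[r][c] += up[r][c]
    return acc


feature_map = [
    [0, 0, 4, 4],
    [0, 0, 4, 4],
    [8, 8, 12, 12],
    [8, 8, 12, 12],
]
context = pyramid_pool(feature_map)
```

Each output position combines a global average (1 bin), a regional average (2 x 2 bins) and a local value (4 x 4 bins), which is the sense in which the pooled map carries context at several scales. A real module would operate on multi-channel tensors and typically applies learned convolutions around the pooling steps.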
International Journal of Machine Learning and Cybernetics
https://doi.org/10.1007/s13042-024-02438-3
ORIGINAL ARTICLE
Bicrack: abilateral network forreal‑time crack detection
SaileiWang1,2· RongshengLu1,2· BingtaoHu1,2· DahangWan1,2· MingtaoFang1,2
Received: 24 November 2023 / Accepted: 17 October 2024
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024
Keywords Crack detection · Bilateral network · Semantic segmentation · Real-time · Deep learning
1 Introduction
Cracks are an early form of damage in buildings, roads, bridges and other man-made infrastructure, and they pose a serious threat to structural safety [1–3]. If a crack is not found and repaired in time, it may cause immeasurable losses. Manual inspection is a very common method of crack detection. However, because it depends on subjective human judgment, manual inspection is not only extremely time-consuming but can also be significantly inaccurate. To solve these problems, researchers have conducted in-depth research on fast and accurate automatic crack detection algorithms. Automatic crack detection methods based on image processing have attracted much attention due to their low cost and high accuracy. Most of these methods rely on threshold segmentation [4, 5], edge detection [6–8], the wavelet transform [9] and other techniques. Compared with manual inspection, automatic crack detection significantly improves efficiency and accuracy and is not affected by subjectivity. However, due to the uneven distribution of cracks, low contrast between cracks and the surrounding background, background noise and the influence of shadows, traditional automatic crack detection methods do not adapt well to complex environments, which severely limits their use scenarios and practical effectiveness.
In recent years, deep learning technology has made great progress in the field of computer vision, which provides a new method to solve the traditional automatic crack
* Rongsheng Lu
rslu@hfut.edu.cn
Sailei Wang
saileiwang@mail.hfut.edu.cn
Bingtao Hu
hubingtao@mail.hfut.edu.cn
Dahang Wan
wandahang@mail.hfut.edu.cn
Mingtao Fang
mingtaofang@mail.hfut.edu.cn
1 School ofInstrument Science andOpto-electronic
Engineering, Hefei University ofTechnology, Hefei230009,
China
2 Anhui Province Key Laboratory ofMeasuring Theory
andPrecision Instrument, Hefei University ofTechnology,
Hefei230009, China
Article
Cracks in concrete surfaces are one of the most prominent causes of the degradation of concrete structures such as bridges, roads, buildings, etc. Hence, it is very crucial to detect cracks at an early stage to inspect the structural health of the concrete structure. To solve the drawbacks of manual inspection, Image Processing Techniques (IPTs), especially those based on Deep Learning (DL) methods, have been investigated for the past few years. Due to the groundbreaking development of this field, researchers have devoted their endeavors to detecting cracks using DL-based IPTs and as a result, the techniques have given answers to many challenging problems. However, to the best of our knowledge, a state-of-the-art systematic review paper is lacking in this field that would present a scientometric analysis as well as a critical survey of the existing works to document the research trends and summarize the prominent IPTs for detecting cracks in concrete structures. Therefore, this article comes forward to spur researchers with a systematic review of the relevant literature, which will present both scientometric and critical analysis of the papers published in this research area. The scientometric data that are brought out from the articles are analyzed and visualized by using VOSviewer and CiteSpace text mining tools in terms of some parameters. Furthermore, this article elucidates research from all over the world by highlighting and critically analyzing the incarnated essence of some of the most influential papers. Moreover, this research raises some common questions as well as extracts answers from the analyzed papers to highlight various features of the utilized methods.
Article
Pixel-level road crack detection has always been a challenging task in intelligent transportation systems. Due to the external environments, such as weather, light, and other factors, pavement cracks often present low contrast, poor continuity, and different sizes in length and width. However, most of the existing studies pay less attention to crack data under different situations. Meanwhile, recent algorithms based on deep convolutional neural networks (DCNNs) have promoted the development of cutting-edge models for crack detection. Nevertheless, they usually focus on complex models for good performance, but ignore detection efficiency in practical applications. In this article, to address the first issue, we collected two new databases (i.e. Rain365 and Sun520) captured on rainy and sunny days respectively, which enrich the data of the open source community. For the second issue, we reconsider how to improve detection efficiency with excellent performance, and then propose our lightweight encoder-decoder architecture termed CarNet. Specifically, we introduce a novel olive-shaped structure for the encoder network, a lightweight multi-scale block and a new up-sampling method in the decoder network. Numerous experiments show that our model can better balance detection performance and efficiency compared with previous models. Especially, on the Sun520 dataset, our CarNet significantly advances the state-of-the-art performance with ODS F-score from 0.488 to 0.514. Meanwhile, it does so with an improved detection speed (104 frames per second) which is orders of magnitude faster than some recent DCNNs-based algorithms specially designed for crack detection.
Article
Face hallucination in low-light environments is an extremely challenging task due to the significant loss of facial structure and facial texture information. Although cascading image relighting and face hallucination tasks is a feasible strategy, simply cascading these two tasks does not achieve satisfactory results because they do not fit into each other naturally. In this paper, we propose a novel duplex fusing-embedding learning approach to tackle this challenge in low-light environments. The core of the proposed approach is the duplexity of feature fusion and embedding between the relighting and hallucination tasks. In the feature fusion phase, the shallow features from the two tasks are bidirectionally fused and activated into a consistent feature space. In the feature embedding phase, the fused features from the previous iteration are fed back and bidirectionally embedded into the deep features of the two tasks in the current iteration so that they can learn feature representations that consistently represent both tasks, thereby boosting the performance of relighting and hallucination to generate photorealistic HR face images. Experimental results show that the proposed approach allows current face hallucination methods to learn to hallucinate faces in the dark.
Article
With the development of convolutional neural networks and deep learning, pavement crack detection has attracted growing attention. Pavement cracks vary in size and shape, and many problems remain in the practical application of detection algorithms. On the one hand, many detection methods use a quadrilateral contour frame as the detection frame, which is not very accurate for narrow cracks in the image. On the other hand, the detection frame obtained by some methods cannot accurately capture the uneven crack texture, resulting in false detections. To solve these two problems, this paper proposes a deep learning-based pavement crack detection method that uses multi-region segmentation. In this method, the pavement crack instances in the image scene are mapped into the overall-area, core-area, and frame-area spaces to obtain segmentation maps of the crack instances in these three areas. The overall-region and border-region segmentation maps are then used to guide the generation of the core-region segmentation map. To obtain more accurate detection results, the proposed method uses the detected crack frame region to supervise the learning of the core region. Finally, the detection image of the core area is converted into crack lines with higher coincidence, yielding the detection results. The experimental results show that the accuracy of the proposed method on the CrackForest dataset exceeds 83%. Compared with existing detection algorithms, its F value is improved by more than 1%, and the algorithm achieves good detection results on different datasets.
Article
Recent face hallucination methods either feed the whole face image into convolutional neural networks (CNNs) or utilize extra facial priors (e.g., facial parsing maps and landmarks) to focus on global facial structure and constrain facial texture generation. However, the limited receptive fields of CNNs and inaccurate facial priors reduce the naturalness and fidelity of the restored faces. In this paper, we propose FaceFormer, which aggregates the global representation of Transformers and the local representation of CNNs to maintain the consistency of facial structure while restoring local facial details. The reason for this design is that the Transformer can capture global facial information by exploiting long-distance visual relation modeling, while the local modeling capability of CNNs can recover fine-grained facial details. Therefore, aggregating these two independent representations can help to maximize their merits and reconstruct high-quality and high-fidelity face images. Experimental results on face reconstruction and recognition verify that the proposed FaceFormer significantly outperforms current state-of-the-art methods.
Article
Accurate pavement surface crack detection is essential for pavement assessment and maintenance. This study aims to improve pavement crack detection under noisy conditions. A novel model named Crack Transformer (CT), which combines a Swin Transformer encoder with a decoder built entirely from multi-layer perceptron (MLP) layers, is proposed for the automatic detection of long and complicated pavement cracks. Based on a comprehensive investigation of training performance metrics and visualization results on three public datasets, the proposed CT model shows improved performance. Experimental results demonstrate the effectiveness and robustness of the Transformer-based network for accurate pavement crack detection. This study shows the feasibility of using a Transformer-based network for robust automatic pavement crack detection under noisy conditions.
Article
One of the emerging and powerful tools of Artificial Intelligence (AI) in computer vision is the Convolutional Neural Network (CNN), which can outperform traditional algorithms for crack detection by extracting unique image features. The segmentation of crack images is strongly affected by the imbalance between crack and non-crack elements. To tackle the influence of class-imbalanced datasets on network training, we propose an additive attention gate-based network architecture called Crack Segmentation Network-II (CSN-II). CSN-II has fewer encoder–decoder blocks, with improved accuracy and reduced computational cost compared to other crack segmentation network architectures. An additive attention gate is used as a connecting block between the encoder and decoder sections of CSN-II and focuses on significant crack regions in the image. The network performance is evaluated on two different crack image datasets, i.e., MSCI (500 images) and CFD (118 images). The experimental results show that the CSN-II architecture using a local balanced cross-entropy (LBCE) loss function achieves 98.48% and 94.39% mean accuracy on the MSCI and CFD datasets, respectively. Furthermore, extensive experiments are performed on the MSCI and CFD datasets to identify the best combination of five network architectures (U-Net, SegNet, DeepLabv3+, CSN, and CSN-II) and twelve loss functions for crack segmentation on imbalanced datasets.
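As a rough illustration of why a balanced loss helps when crack pixels are rare, the sketch below implements a generic class-balanced binary cross-entropy in pure Python. It is not the LBCE loss named in the abstract above, whose definition is not given here. The weighting scheme (each class weighted by the frequency of the other class) and the function name `balanced_bce` are assumptions made for illustration.

```python
import math


def balanced_bce(probs, labels, eps=1e-7):
    """Class-balanced binary cross-entropy over a flat list of pixels.

    probs  : predicted crack probabilities in [0, 1]
    labels : ground-truth labels, 1 = crack, 0 = background
    Crack pixels are weighted by the background fraction (beta) and
    background pixels by the crack fraction (1 - beta); this weighting
    is an assumed, generic scheme.
    """
    n = len(labels)
    beta = (n - sum(labels)) / n  # fraction of background pixels
    loss = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            loss -= beta * math.log(p)
        else:
            loss -= (1.0 - beta) * math.log(1.0 - p)
    return loss / n


# One crack pixel among four: with uniform predictions of 0.5, the single
# crack pixel contributes as much to the loss as the three background pixels.
loss = balanced_bce([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0])
```

With an unweighted cross-entropy the three background pixels would dominate this toy example three to one; the weights restore an equal contribution from each class.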
Article
With the advancement of deep learning, newly proposed neural networks are growing increasingly complicated in order to achieve high performance. In this context, we propose a simple but effective neural network called MiniCrack for narrow crack detection. We also propose a lightweight version, MiniCrack-Light, for scenarios with limited computing resources. MiniCrack and MiniCrack-Light outperform the current state-of-the-art neural networks on all three challenging test datasets, with fewer parameters and stronger robustness. PixelShuffle and PixelUnshuffle, originally designed for image super-resolution, are successfully applied to image segmentation, which effectively alleviates the problems caused by pooling.
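The appeal of PixelUnshuffle as a replacement for pooling is that it reduces spatial resolution without discarding any values: each r x r block of pixels is moved into r*r separate channels, and PixelShuffle reverses the rearrangement exactly. The pure-Python sketch below shows this for a single-channel map. It is only an illustration of the general operation, not MiniCrack's implementation, and the function names and the channel ordering used here are choices made for this example.

```python
def pixel_unshuffle(fmap, r):
    """Rearrange an (H*r) x (W*r) map into r*r channels of size H x W.

    Channel i*r + j holds the pixels at row offset i and column offset j
    inside each r x r block, so no value is dropped.
    """
    h, w = len(fmap) // r, len(fmap[0]) // r
    return [[[fmap[y * r + i][x * r + j] for x in range(w)] for y in range(h)]
            for i in range(r) for j in range(r)]


def pixel_shuffle(channels, r):
    """Inverse of pixel_unshuffle: r*r channels of H x W -> (H*r) x (W*r)."""
    h, w = len(channels[0]), len(channels[0][0])
    return [[channels[(y % r) * r + (x % r)][y // r][x // r]
             for x in range(w * r)]
            for y in range(h * r)]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
packed = pixel_unshuffle(image, 2)    # 4 channels, each 2 x 2
restored = pixel_shuffle(packed, 2)   # back to the original 4 x 4 map
```

In contrast, 2 x 2 max or average pooling of the same image would keep only one value per block, which is the information loss the abstract refers to.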