Article

Pavement crack detection through a deep-learned asymmetric encoder-decoder convolutional neural network


Abstract

Crack detection on road surfaces is an important task in pavement management, as it indicates the quality of the road and its deterioration over time. Pavement cracks are among the most common types of road damage, and they can be identified visually. Although detection alone does not repair the damage, understanding the extent of cracking is essential for the upkeep of roads. This paper presents a novel approach to automatically detecting pavement cracks using the orthoimage generated by a consumer-grade photogrammetric Unmanned Aerial Vehicle (UAV) and a deep learning algorithm. We trained an autoencoder Convolutional Neural Network (CNN) on a dataset full of challenging factors such as road lines and markings, oil and colour spots, and water stains. The model was tested on a dataset of RGB patches showing different crack patterns and achieved an overall accuracy (OA) and F1 score of about 0.98. The results demonstrate the effectiveness of the proposed method in accurately detecting pavement cracks in challenging real-world conditions. This approach provides an efficient and cost-effective solution for pavement crack detection that can be used to measure and monitor road quality.
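
The two metrics reported in the abstract, overall accuracy (OA) and F1 score, can both be derived from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function names and the patch counts are hypothetical and are not taken from the paper.

```python
def overall_accuracy(tp, fp, fn, tn):
    """Fraction of all patches classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for 1000 test patches (illustrative only, not from the paper).
tp, fp, fn, tn = 480, 10, 10, 500
print(f"OA = {overall_accuracy(tp, fp, fn, tn):.3f}")
print(f"F1 = {f1_score(tp, fp, fn):.3f}")
```

OA counts true negatives while F1 does not, so the two values can diverge when crack patches are rare; reporting both, as the abstract does, guards against that.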

... The emergence of crack further increases the challenge of asphalt pavement detection and maintenance [8][9][10][11][12]. By the end of 2022, China's pavement detection and maintenance mileage accounted for 99.9% of the total mileage, and the conventional asphalt pavement crack detection means is inefficient [13][14][15][16], difficult to achieve rapid and large-scale detection goals [17][18][19][20]. Therefore, the use of intelligent methods based on deep learning (DL) to identify asphalt pavement crack images has become a development trend in the field [21][22][23][24][25]. ...
Article
Owing to the significant application potential of unmanned aerial vehicles (UAVs) and infrared imaging technologies, researchers from different fields have conducted numerous experiments on aerial infrared image processing. To continuously detect small road objects 24 h/day, this study proposes an efficient Rep-style Gaussian–Wasserstein network (ERGW-net) for small road object detection in infrared aerial images. This method aims to resolve problems of small object size, low contrast, few object features, and occlusions. The ERGW-net adopts the advantages of ResNet, Inception net, and YOLOv8 networks to improve object detection efficiency and accuracy by improving the structure of the backbone, neck, and loss function. The ERGW-net was tested on a DroneVehicle dataset with a large sample size and the HIT-UAV dataset with a relatively small sample size. The results show that the detection accuracy of different road targets (e.g., pedestrians, cars, buses, and trucks) is greater than 80%, which is higher than the existing methods.
Article
This paper proposes a framework for obstacle-avoiding autonomous unmanned aerial vehicle (UAV) systems with a new obstacle avoidance method (OAM) and localization method for autonomous UAVs for structural health monitoring (SHM) in GPS-denied areas. There are high possibilities of obstacles in the planned trajectory of autonomous UAVs used for monitoring purposes. A traditional UAV localization method with an ultrasonic beacon is limited to the scope of the monitoring and vulnerable to both depleted battery and environmental electromagnetic fields. To overcome these critical problems, a deep learning-based OAM with the integration of You Only Look Once version 3 (YOLOv3) and a fiducial marker-based UAV localization method are proposed. These new obstacle avoidance and localization methods are integrated with a real-time damage segmentation method as an autonomous UAV system for SHM. In indoor testing and outdoor tests in a large parking structure, the proposed methods showed superior performances in obstacle avoidance and UAV localization compared to traditional approaches.
Article
As technology continues to develop, computer vision (CV) applications are becoming increasingly widespread in the intelligent transportation systems (ITS) context. These applications are developed to improve the efficiency of transportation systems, increase their level of intelligence, and enhance traffic safety. Advances in CV play an important role in solving problems in the fields of traffic monitoring and control, incident detection and management, road usage pricing, and road condition monitoring, among many others, by providing more effective methods. This survey examines CV applications in the literature, the machine learning and deep learning methods used in ITS applications, the applicability of computer vision applications in ITS contexts, the advantages these technologies offer and the difficulties they present, and future research areas and trends, with the goal of increasing the effectiveness, efficiency, and safety level of ITS. The present review, which brings together research from various sources, aims to show how computer vision techniques can help transportation systems to become smarter by presenting a holistic picture of the literature on different CV applications in the ITS context.
Article
Current Multi-View Stereo (MVS) algorithms are tools for high-quality 3D model reconstruction, strongly depending on image spatial resolution. In this context, the combination of image Super-Resolution (SR) with image-based 3D reconstruction is turning into an interesting research topic in photogrammetry, around which however only a few works have been reported so far in the literature. Here, a thorough study is carried out on various state-of-the-art image SR techniques to evaluate the suitability of such an approach in terms of its inclusion in the 3D reconstruction process. Deep-learning techniques are tested here on a UAV image dataset, while the MVS task is then performed via the Agisoft Metashape photogrammetric tool. The data under experimentation are oblique cultural heritage imagery. According to results, point clouds from low-resolution images present quality inferior to those from upsampled high-resolution ones. The SR techniques HAT and DRLN outperform bicubic interpolation, yielding high precision/recall scores for the differences of reconstructed 3D point clouds from the reference surface. The current study indicates spatial image resolution increased by SR techniques may indeed be advantageous for state-of-the art photogrammetric 3D reconstruction.
Article
Bridges are often at risk due to the effects of natural disasters, such as earthquakes and typhoons. Bridge inspection assessments normally focus on cracks. However, numerous concrete structures with cracked surfaces are highly elevated or over water, and are not easily accessible to a bridge inspector. Furthermore, poor lighting under bridges and a complex visual background can hinder inspectors in their identification and measurement of cracks. In this study, cracks on bridge surfaces were photographed using a UAV-mounted camera. A YOLOv4 deep learning model was trained to identify cracks and was then employed for object detection. To perform the quantitative crack test, the images with identified cracks were first converted to grayscale images and then to binary images using the local thresholding method. Next, two edge detection methods, the Canny and morphological edge detectors, were applied to the binary images to extract the edges of the cracks and obtain two types of crack edge images. Then, two scale methods, the planar marker method and the total station measurement method, were used to calculate the actual size of the cracks in the crack edge images. The results indicated that the model had an accuracy of 92%, with width measurements as precise as 0.22 mm. The proposed approach can thus support bridge inspections and provide objective and quantitative data.
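
The measurement steps described above (grayscale conversion, local thresholding, and scaling pixels to physical units with a planar marker) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the abstract does not say which local thresholding rule was used, so a local-mean rule is assumed here, and all function names and parameter values are hypothetical.

```python
def to_gray(rgb_rows):
    """Convert rows of (r, g, b) tuples to luma using the common BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_rows]


def local_threshold(gray, radius=1, offset=0.0):
    """Mark a pixel as crack (1) if it is darker than its local mean minus offset.

    Assumption: a local-mean rule; the source only says "local thresholding".
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < mean - offset else 0
    return out


def pixels_to_mm(length_px, marker_mm, marker_px):
    """Scale a pixel length using a planar marker of known physical size."""
    return length_px * marker_mm / marker_px


# A bright surface with one dark column standing in for a crack.
gray = [[200, 200, 50, 200, 200] for _ in range(3)]
binary = local_threshold(gray, radius=1, offset=10.0)
print(binary)
# Hypothetical marker: 50 mm wide and spanning 1000 px in the image.
print(pixels_to_mm(4, marker_mm=50.0, marker_px=1000))
```

The marker-based scale is only valid when the marker and the crack lie in roughly the same plane and at a similar distance from the camera, which is presumably why the study also compares it with a total station measurement.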
Article
Existing deep learning (DL) models can detect wider or thicker segments of cracks that occupy multiple pixels in the width direction, but fail to distinguish the thin tail shallow segment or propagating crack occupying fewer pixels. Therefore, in this study, we proposed a scheme for tracking missing thin/propagating crack segments during DL-based crack identification on concrete surfaces in a computationally efficient manner. The proposed scheme employs image processing as a preprocessor and a postprocessor for a 1D DL model. Image-processing-assisted DL as a precursor to DL eliminates labor-intensive labeling and the plane structural background without any distinguishable features during DL training and testing; the model identifies potential crack candidate regions. Iterative differential sliding-window-based local image processing as a postprocessor to DL tracks missing thin cracks on segments classified as cracks. The capability of the proposed method is demonstrated on low-resolution images with cracks of single-pixel width, captured using unmanned aerial vehicles on concrete structures with different surface textures, different scenes with complicated disturbances, and optical variability. Due to the multi-threshold-based image processing, the overall approach is invariant to the choice of initial sensitivity parameters, hyperparameters, and the sequence of neuron arrangement. Further, this technique is a computationally efficient alternative to semantic segmentation that results in pixelated mapping/classification of thin crack regimes, which requires labor-intensive and skilled labeling.
Article
The latest developments in the field of road asphalt materials and pavement construction/maintenance technologies, as well as the spread of life-cycle-based sustainability assessment techniques, have posed issues in the continuous and efficient management of data and relative decision-making process for the selection of appropriate road pavement design and maintenance solutions; Infrastructure Building Information Modeling (IBIM) tools may help in facing such challenges due to their data management and analysis capabilities. The present work aims to develop a road pavement life cycle sustainability assessment framework and integrate such a framework into the IBIM of a road pavement project through visual scripting to automatically provide the informatization of an appropriate pavement information model and evaluate sustainability criteria already in the design stage through life cycle assessment and life cycle cost analysis methods. The application of the proposed BIM-based tool to a real case study allowed us (a) to draw considerations about the long-term environmental and economic sustainability of alternative road construction materials and (b) to draft a maintenance plan for a specific road section that represents the best compromise solution among the analyzed ones. The IBIM tool represents a practical and dynamic way to integrate environmental considerations into road pavement design, encouraging the use of digital tools in the road industry and ultimately supporting a pavement maintenance decision-making process oriented toward a circular economy.
Article
Detection of colorectal polyps through colonoscopy is an essential practice in prevention of colorectal cancers. However, the method itself is labor intensive and is subject to human error. With the advent of deep learning-based methodologies, and specifically convolutional neural networks, an opportunity to improve upon the prognosis of potential patients suffering with colorectal cancer has appeared with automated detection and segmentation of polyps. Polyp segmentation is subject to a number of problems such as model overfitting and generalization, poor definition of boundary pixels, as well as the model’s ability to capture the practical range in textures, sizes, and colors. In an effort to address these challenges, we propose a dual encoder–decoder solution named Polyp Segmentation Network (PSNet). Both the dual encoder and decoder were developed by the comprehensive combination of a variety of deep learning modules, including the PS encoder, transformer encoder, PS decoder, enhanced dilated transformer decoder, partial decoder, and merge module. PSNet outperforms state-of-the-art results through an extensive comparative study against 5 existing polyp datasets with respect to both mDice and mIoU at 0.863 and 0.797, respectively. With our new modified polyp dataset we obtain an mDice and mIoU of 0.941 and 0.897 respectively.
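
The mDice and mIoU figures quoted above are averages of two overlap measures between a predicted mask and a ground-truth mask. The sketch below shows the standard definitions on flattened binary masks; averaging per image is an assumption (some works average per class or per dataset), and the function names are hypothetical.

```python
def dice_and_iou(pred, truth):
    """Dice coefficient and IoU for two flattened binary masks (lists of 0/1)."""
    inter = sum(p & t for p, t in zip(pred, truth))
    p_sum, t_sum = sum(pred), sum(truth)
    union = p_sum + t_sum - inter
    # Two empty masks agree perfectly, so both scores are defined as 1.0.
    dice = 2 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


def mean_scores(pairs):
    """Average Dice and IoU over (pred, truth) pairs, one pair per image."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


pairs = [
    ([1, 1, 0, 0], [1, 0, 1, 0]),  # partial overlap
    ([1, 1, 0, 0], [1, 1, 0, 0]),  # perfect overlap
]
m_dice, m_iou = mean_scores(pairs)
print(f"mDice = {m_dice:.3f}, mIoU = {m_iou:.3f}")
```

For a single mask pair, Dice is never lower than IoU (Dice = 2·IoU / (1 + IoU)), which is consistent with the mDice values above exceeding the corresponding mIoU values.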
Article
In recent years, artificial neural networks (ANN) and artificial intelligence (AI), in general, have garnered significant attention with respect to their applications in several scientific fields, varying from big data management to medical diagnosis [...]
Article
In recent years, as the use of Unmanned Aerial Vehicle (UAV) imaging systems has increased, the photogrammetry community has conducted extensive research on the unique advantages of these systems. The UAVs are considered as one of the most important platforms for photogrammetry applications from various urban and non-urban areas at different scales. In UAV photogrammetry projects the spatial resolution of the images must be determined prior to the imaging stage. The spatial resolution of the images is a commonly-used criterion for detecting the smallest distance between two adjacent separable objects in the images. Numerous methods have been developed to precisely evaluate the spatial resolution of images. In this study, the Siemens star target, which is one of the most commonly used artificial targets for analysing spatial resolution was studied. The objective of this paper is to evaluate and compare the reduction of spatial resolution coefficient using the Siemens star target in images captured by UAVs. To this end, a method for automatically detecting the radius of the circle of ambiguity and calculating spatial resolution in UAV images has been developed. According to the findings of this study, the initial step in creating the Siemens star target, in terms of size and the number of acceptable arms, is dependent on the flying altitude of the UAV and the level of image blur. In addition, the reduction in spatial resolution of images captured by various UAVs varies, and its coefficient must be calculated for each project.
Article
Non-destructive testing and characterization of internal vertical cracks are critical for road maintenance by ground penetrating radar (GPR). This paper describes a mask region-based convolutional neural network (R-CNN) that automatically detects and segments small cracks in asphalt pavement at the pixel level. Simulation using Gprmax software and field detection were performed to determine the crack features in GPR images of asphalt pavement and the relationship between the width of vertical cracks and their area in GPR images. Results showed that a 0.833 precision, 0.822 F1 score, 0.701 mean intersection-over-union (mIoU) and 4.2 frames per second (FPS) were achieved on 429 GPR images (1024×1024 pixels), and the mean error between the segmented crack width and the true values was 2.33%. The research results represent a further step toward accurately detecting and characterizing internal vertical cracks in asphalt pavement.
Article
As the battery cycles between charging and discharging, the working conditions or improper operations such as overcharge and over-discharge will aggravate the negative reaction inside the battery, generate irreversible chemical substances, and reduce the amount of active substances involved in the electrochemical reaction, resulting in a decrease in battery capacity. Batteries that lose 20% of their capacity can be considered to have failed. In a failed battery, capacity and power decay faster, and the electrical characteristics, stability, and safety of the battery drop significantly. As a means of improving the machine learning model's accuracy and generalization for remaining useful life (RUL) prediction of zinc-ion batteries, this paper mainly discusses the design of the encoder-decoder model structure and the application of optimization methods. Then, the method of neural network hyperparameter optimization is studied. Finally, the validity of the research work done in this paper is verified by a series of comparative experiments.
Article
Previous research has shown the high accuracy of convolutional neural networks (CNNs) in asphalt and concrete crack detection in controlled conditions. Yet, human-like generalisation remains a significant challenge for industrial applications where the range of conditions varies significantly. Given the intrinsic biases of CNNs, this paper proposes a vision transformer (ViT)-based framework for crack detection on asphalt and concrete surfaces. With transfer learning and the differentiable intersection over union (IoU) loss function, the encoder-decoder network equipped with ViT could achieve an enhanced real-world crack segmentation performance. Compared to the CNN-based models (DeepLabv3+ and U-Net), TransUNet with a CNN-ViT backbone achieved up to ~61% and ~3.8% better mean IoU on the original images of the respective datasets with very small and multi-scale crack semantics. Moreover, ViT assisted the encoder-decoder network to show a robust performance against various noisy signals where the mean Dice score attained by the CNN-based models significantly dropped (<10%).
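
A differentiable IoU loss, as mentioned above, is commonly obtained by replacing the hard set intersection and union with sums over predicted probabilities. The following is a minimal sketch of that common "soft IoU" formulation; it is an assumption that the paper uses this exact form, and the function name and `eps` value are hypothetical.

```python
def soft_iou_loss(probs, targets, eps=1e-7):
    """1 - soft IoU between predicted probabilities and 0/1 targets.

    Products and sums stand in for intersection and union, so the value
    varies smoothly with the predicted probabilities.
    """
    inter = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    # eps keeps the ratio defined when both prediction and target are empty.
    return 1.0 - (inter + eps) / (total - inter + eps)


print(soft_iou_loss([1.0, 0.0], [1, 0]))  # confident and correct
print(soft_iou_loss([0.5, 0.5], [1, 0]))  # uncertain prediction
```

In a training framework the same expression would be written with tensor operations so that gradients can flow back through `probs`; plain Python lists are used here only to keep the sketch self-contained.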
Article
Pavement crack detection methods based on deep learning and computer vision can greatly improve detection efficiency and accuracy, but in many cases the training data are scarce or imbalanced, making them insufficient to train an accurate detection model. This paper proposes a detection method for small samples, which is composed of two steps. First, a generative adversarial network (GAN) is constructed and trained on a small-sample data set of pavement cracks captured by an unmanned aerial vehicle (UAV). The best trained model is used to generate new images. Second, the original small-sample data set is expanded with images generated by the GAN model, and a convolutional neural network (CNN) model is constructed at the same time. Then, the CNN is trained and tested by transfer learning on the data set before and after the expansion, to verify the effectiveness of the expanded data. Compared with the unexpanded data set, the CNN model trained after expansion improves the test set detection accuracy from 80.75% to 91.61%, which is regarded as a significant improvement. In addition, this paper uses class activation maps (CAM) to visually evaluate the CNN model and to extend the detection ability of the classification model.
Article
The identification of clayey soil desiccation cracks is an important practical issue in geotechnical engineering and engineering geology. The desiccation cracks can dramatically increase the hydraulic conductivity and deteriorate the mechanical performance of clayey soils. Traditionally, the analysis of soil desiccation cracks relies on visual inspection and image processing techniques, which lack automation and intelligence. Therefore, there is an increasing need for an automated algorithm to meet accuracy and efficiency requirements for various engineering scenarios. In this study, a state-of-the-art deep-learning algorithm, Mask R-CNN, was utilized for the clayey soil crack detection, locating and segmentation. A comprehensive dataset including 1200 annotated crack images of 256×256 resolution was prepared for the algorithm training and validation. The proposed Mask R-CNN algorithm achieved precision, recall and F1 score of 73.29%, 82.76% and 77.74%, respectively. Besides, the algorithm gained a mean locating accuracy (AP bb) of 64.14% and a mean segmentation accuracy (AP m) of 47.59%. The detection performance of the Mask R-CNN was also compared with the U-Net on three different scenarios. The test results have demonstrated the superiority of the Mask R-CNN over the U-Net algorithm in crack detection, locating and segmentation.
Article
To establish a system for managing road pavement, it is necessary to prepare information components based on various perspectives of pavement management. One of the most significant information components in these systems is the quality assessment of road pavement status. Accordingly, data containing details of surface pavement failures and defects are of great significance. Apart from causing vehicle depreciation and damage, increasing maintenance costs, and reducing the useful lifespan of the pavement structure, road pavement failures also lead to accidents and reduce road safety. The most important surface damages in road pavement are longitudinal, transverse, oblique, alligator, and block cracks. Because such cracks and defects can be assessed visually and non-destructively, imaging-based approaches can provide details such as the type of defect, its severity, extent, and location, and prove to be highly useful. In this paper, Unmanned Aerial Vehicle (UAV)/drone photogrammetry is proposed as a complementary approach for providing information on crack-related defects in the country's road pavement management system. According to the author, the output of UAV photogrammetric products will significantly improve if the system parameters are adjusted. Consequently, a procedure is presented for investigating the optimal parameters in the design of a UAV photogrammetric network, and an automated algorithm based on image processing operations and a decision tree classifier, independent of scale and image dimensions, was implemented. After removing the road edges and determining the asphalt area, a pixel detection operation was carried out to reveal the cracks.
Furthermore, after preparing the ground truth from a selected orthophoto mosaic, crack pixel detection by the proposed algorithm was evaluated with three methods. An accuracy of 96% was determined for the main orthophoto mosaic. For the test orthophotos, which were produced from images taken by Phantom 4 Pro and Mavic Pro at different altitudes, an accuracy of approximately 82% to 91% was determined.
Article
Significant evolution in deep learning took place in 2010, when software developers started using graphical processing units for general-purpose applications. From that date, the deep neural network (DNN) started progressive steps across different applications ranging from natural language processing to hyperspectral image processing. The convolutional neural network (CNN) attracts the most interest, as it is considered one of the most powerful ways to learn useful representations of images and other structured data. The revolution of DNNs in medical imaging (MI) came in 2012, when Li launched ImageNet, a free database of more than 14 million labeled medical images. This state-of-the-art work presents a comprehensive study of recent DNN research directions applied in MI analysis. Clinical and pathological analysis through a selected set of the most cited works is introduced. It will be shown how DNNs are able to tackle medical problems: classification, detection, localization, segmentation, and automatic diagnosis. Datasets comprise a range of imaging technologies: X-Ray, MRI, CT, Ultrasound, PET, Fluorescein Angiography, and even photographic images. This work surveys different patterns of DNNs and focuses to some extent on the CNN, which offers an outstanding percentage of solutions compared to other DNN structures. CNN emphasizes image features and has well-known architectures. On the other hand, limitations beyond DNN training and execution time will be explained. Problems related to data augmentation and image annotation will be analyzed across a number of high-standard publications. Finally, a comparative study of existing software frameworks supporting DNNs and future research directions in the area will be presented. From all presented works it could be deduced that the use of DNNs in healthcare is still in its early stages, although there are strong initiatives in academia and industry to pursue healthcare projects based on DNNs.
Article
Recently, crack segmentation has been investigated using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground truth data, consideration of complex scenes, development of an object-specific network for crack segmentation, and use of an evaluation method, among other issues. In this paper, a novel semantic transformer representation network (STRNet) is developed for crack segmentation at the pixel level in complex scenes in real time. STRNet is composed of a squeeze and excitation attention-based encoder, a multi-head attention-based decoder, coarse upsampling, a focal-Tversky loss function, and a learnable swish activation function, chosen to keep the network concise while maintaining its fast processing speed. A method for evaluating the level of complexity of image scenes was also proposed. The proposed network is trained with 1203 images with further extensive synthesis-based augmentation, and it is evaluated on 545 testing images (1280 × 720, 1024 × 512); it achieves 91.7%, 92.7%, 92.2%, and 92.6% in terms of precision, recall, F1 score, and mIoU (mean intersection over union), respectively. Its performance is compared with those of recently developed advanced networks (Attention U-net, CrackSegNet, Deeplab V3+, FPHBN, and Unet++), with STRNet showing the best performance in the evaluation metrics; it also achieves the fastest processing at 49.2 frames per second.
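
Two of the components named above, the focal-Tversky loss and the swish activation, have compact closed forms. The sketch below uses one common formulation of each; the weights `alpha`, `beta`, the exponent `gamma`, and the swish slope are illustrative assumptions rather than STRNet's settings, and the exponent convention for the focal term differs between papers (some use 1/gamma).

```python
import math


def tversky_index(probs, targets, alpha=0.7, beta=0.3, eps=1e-7):
    """Tversky index: TP / (TP + alpha*FN + beta*FP) on soft predictions."""
    tp = sum(p * t for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)


def focal_tversky_loss(probs, targets, alpha=0.7, beta=0.3, gamma=0.75):
    """(1 - Tversky index) ** gamma; the exponent reshapes the loss curve."""
    return (1.0 - tversky_index(probs, targets, alpha, beta)) ** gamma


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x, slope=1.0):
    """Swish: x * sigmoid(slope * x). A learnable variant would train `slope`."""
    return x * _sigmoid(slope * x)


probs, targets = [0.9, 0.8, 0.2, 0.1], [1, 0, 1, 0]
print(focal_tversky_loss(probs, targets))
print(swish(1.0), swish(-1.0))
```

With `alpha` greater than `beta`, missed crack pixels (false negatives) are penalized more than false alarms, a common choice when the foreground class is thin and rare; with `alpha = beta = 0.5` the Tversky index reduces to the Dice coefficient.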
Article
Python is one of the most widely adopted programming languages, having replaced a number of those in the field. Python is popular with developers for a variety of reasons, one of which is because it has an incredibly diverse collection of libraries that users can run. The most compelling reasons for adopting Keras come from its guiding principles, particularly those related to usability. Aside from the simplicity of learning and model construction, Keras has a wide variety of production deployment options and robust support for multiple GPUs and distributed training. A strong and easy-to-use free, open-source Python library is the most important tool for developing and evaluating deep learning models. The aim of this paper is to provide the most current survey of Keras in different aspects, which is a Python-based deep learning Application Programming Interface (API) that runs on top of the machine learning framework, TensorFlow. The mentioned library is used in conjunction with TensorFlow, PyTorch, CODEEPNEATM, and Pygame to allow integration of deep learning models such as cardiovascular disease diagnostics, graph neural networks, identifying health issues, COVID-19 recognition, skin tumors, image detection, and so on, in the applied area. Furthermore, the author used Keras's details, goals, challenges, significant outcomes, and the findings obtained using this method.
Article
Cracking in concrete structures affects performance and is a major durability problem. Cracks must be detected and repaired in time in order to maintain the reliability and performance of the structure. This study focuses on vision-based crack detection algorithms based on deep convolutional neural networks, which detect and classify cracks with higher classification rates by using transfer learning. The image dataset, consisting of two image classes (no-cracks and cracks), was used to train the AlexNet model. Transfer learning was applied to AlexNet, including fine-tuning the weights of the architecture, replacing the classification layer for two output classes (no-cracks and cracks), and augmenting image datasets with random rotation angles. The fine-tuned AlexNet model was trained with the stochastic gradient descent with momentum optimizer. The precision, recall, accuracy, and F1 metrics were used to evaluate the performance of the trained AlexNet model. The accuracy and loss obtained through the training process were 99.9% and 0.1% at a learning rate of 0.0001 and 6 epochs. The trained AlexNet model correctly predicted 1998 of 2000 validation images and 3998 of 4000 test images, a prediction accuracy of about 99.9%. The trained model also achieved precision, recall, accuracy, and F1 scores of 0.99 each.
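
The optimizer named above, stochastic gradient descent with momentum, keeps a running velocity per weight and adds it to the weights at every step. The sketch below shows one common form of that update together with the accuracy arithmetic quoted in the abstract. The learning rate of 0.0001 comes from the abstract; the momentum value of 0.9 and all function names are assumptions.

```python
def sgd_momentum_step(weights, grads, velocity, lr=1e-4, momentum=0.9):
    """One update: v <- momentum*v - lr*g, then w <- w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def accuracy(correct, total):
    """Fraction of images predicted correctly."""
    return correct / total


# Two steps on a single weight with a constant gradient (illustrative values).
w, v = [1.0], [0.0]
for _ in range(2):
    w, v = sgd_momentum_step(w, [2.0], v, lr=0.1, momentum=0.9)
print(w, v)

# Accuracy figures implied by the counts reported in the abstract.
print(accuracy(1998, 2000), accuracy(3998, 4000))
```

Because the velocity accumulates past gradients, the second step above moves the weight further than the first even though the gradient is unchanged, which is the effect momentum is meant to provide.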
Article
Pavement crack detection is essential for safe driving. The traditional manual crack detection method is highly subjective and time-consuming. Hence, an automatic pavement crack detection system is needed to facilitate this process. However, this is still a challenging task due to the complex topology and large noise interference of crack images. Recently, although deep learning-based technologies have achieved breakthrough progress in crack detection, there are still some challenges, such as large parameter counts and low detection efficiency. Besides, most deep learning-based crack detection algorithms find it difficult to establish a good balance between detection accuracy and detection speed. Inspired by the latest deep learning technology in the field of image processing, this paper proposes a novel crack detection algorithm based on the deep feature aggregation network with the spatial-channel squeeze & excitation (scSE) attention mechanism module, which we call CrackDFANet. Firstly, we cut the collected crack images into 512 × 512 pixel image blocks to establish a crack dataset. Then, through iterative optimization on the training and validation sets, we obtained a crack detection model with good robustness. Finally, the CrackDFANet model was verified on a total of 3516 images in five datasets with different sizes and containing different noise interferences. Experimental results show that the trained CrackDFANet has strong anti-interference ability, and has better robustness and generalization ability under interference from lighting, parking lines, water stains, plants, oil stains, and shadows. Furthermore, CrackDFANet is found to be better than other state-of-the-art algorithms, with more accurate detection and faster detection speed. Meanwhile, the number of model parameters and the error rate of our algorithm are significantly reduced.
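
The dataset preparation step described above, cutting each image into 512 × 512 pixel blocks, amounts to enumerating non-overlapping crop boxes over the image. The following is a minimal sketch of that step. Discarding partial tiles at the right and bottom edges is an assumption (the abstract does not say how edges were handled), and the function name is hypothetical.

```python
def tile_boxes(width, height, tile=512):
    """Return (left, top, right, bottom) boxes for non-overlapping tiles.

    Assumption: partial tiles at the right and bottom edges are discarded.
    """
    boxes = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


# A hypothetical 1024 x 1536 image yields a 2 x 3 grid of tiles.
boxes = tile_boxes(1024, 1536)
print(len(boxes), boxes[0], boxes[-1])
```

Each box can be passed to an image library's crop routine; padding the image up to a multiple of the tile size is an alternative when edge content must be kept.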
Article
Conventional methods for monitoring pavement health are inefficient, time-consuming, and destructive. Current studies indicate that traditional machine learning algorithms show poor performance and low generalization capacity in identifying asphalt pavement aging and damage conditions. Further, deep learning network models have rarely been applied to the detection of asphalt pavement aging types and damage objects from UAV imagery. In this study, we first used a low-altitude UAV platform to acquire multispectral images of road pavement with centimeter-level spatial resolution. The fine spatial resolution can provide detailed textural information on pavement damage objects such as cracks and potholes. Afterwards, we combined multiscale semantic segmentation, using a CNN model and an SVM classifier, into a framework to extract pavement potholes and cracks and classify the pavement surfaces into three aging states. Results demonstrated that the proposed framework achieved the highest overall accuracy (87.83% and 92.96%) and recall rate (85.4% and 90.65%) in the classification of the asphalt pavement images in the two segments of roads in Xinjiang, China. We concluded that the combination of CNN + SVM and low-altitude UAV multispectral images would help improve the accuracy of detecting asphalt pavement aging states and damaged objects.
Article
With recent advances in non-contact sensing technology such as cameras, unmanned aerial and ground vehicles, the structural health monitoring (SHM) community has witnessed a prominent growth in deep learning-based condition assessment techniques of structural systems. These deep learning methods rely primarily on convolutional neural networks (CNNs). The CNN networks are trained using a large number of datasets for various types of damage and anomaly detection and post-disaster reconnaissance. The trained networks are then utilized to analyze newer data to detect the type and severity of the damage, enhancing the capabilities of non-contact sensors in developing autonomous SHM systems. In recent years, a broad range of CNN architectures has been developed by researchers to accommodate the extent of lighting and weather conditions, the quality of images, the amount of background and foreground noise, and multiclass damage in the structures. This paper presents a detailed literature review of existing CNN-based techniques in the context of infrastructure monitoring and maintenance. The review is categorized into multiple classes depending on the specific application and development of CNNs applied to data obtained from a wide range of structures. The challenges and limitations of the existing literature are discussed in detail at the end, followed by a brief conclusion on potential future research directions of CNN in structural condition assessment.
Article
In recent years, maintenance work on public transport routes has drastically decreased in many countries due to difficult economic situations. The various studies that have been conducted by groups of drivers and groups related to road safety concluded that accidents are increasing due to the poor conditions of road surfaces, even affecting the condition of vehicles through costly breakdowns. Currently, the processes of detecting any type of damage to a road are carried out manually or are based on the use of a road vehicle, which incurs a high labor cost. To solve this problem, many research centers are investigating image processing techniques to identify poor-condition road areas using deep learning algorithms. The main objective of this work is to design a distributed platform that allows the detection of damage to transport routes using drones and to provide the results of the most important classifiers. A case study is presented using a multi-agent system based on PANGEA that coordinates the different parts of the architecture using techniques based on ubiquitous computing. The results obtained by means of the customization of the You Only Look Once (YOLO) v4 classifier are promising, reaching an accuracy of more than 95%. The images used have been published in a dataset for use by the scientific community. https://github.com/luisaugustos/Pothole-Recognition
Article
In this study, an essential application of remote sensing using deep learning functionality is presented. Gaofen-1 satellite mission, developed by the China National Space Administration (CNSA) for the civilian high-definition Earth observation satellite program, provides near-real-time observations for geographical mapping, environment surveying, and climate change monitoring. Cloud and cloud shadow segmentation are a crucial element to enable automatic near-real-time processing of Gaofen-1 images, and therefore, their performances must be accurately validated. In this paper, a robust multiscale segmentation method based on deep learning is proposed to improve the efficiency and effectiveness of cloud and cloud shadow segmentation from Gaofen-1 images. The proposed method first implements feature map based on the spectral-spatial features from residual convolutional layers and the cloud/cloud shadow footprints extraction based on a novel loss function to generate the final footprints. The experimental results using Gaofen-1 images demonstrate the more reasonable accuracy and efficient computational cost achievement of the proposed method compared to the cloud and cloud shadow segmentation performance of two existing state-of-the-art methods.
Article
In the past two decades, structural health monitoring (SHM) systems have been widely installed on various civil infrastructures for the tracking of the state of their structural health and the detection of structural damage or abnormality, through long-term monitoring of environmental conditions as well as structural loadings and responses. In an SHM system, there are plenty of sensors to acquire a huge number of monitoring data, which can factually reflect the in-service condition of the target structure. In order to bridge the gap between SHM and structural maintenance and management (SMM), it is necessary to employ advanced data processing methods to convert the original multi-source heterogeneous field monitoring data into different types of specific physical indicators in order to make effective decisions regarding inspection, maintenance and management. Conventional approaches to data analysis are confronted with challenges from environmental noise, the volume of measurement data, the complexity of computation, etc., and they severely constrain the pervasive application of SHM technology. In recent years, with the rapid progress of computing hardware and image acquisition equipment, the deep learning-based data processing approach offers a new channel for excavating the massive data from an SHM system, towards autonomous, accurate and robust processing of the monitoring data. Many researchers from the SHM community have made efforts to explore the applications of deep learning-based approaches for structural damage detection and structural condition assessment. This paper gives a review on the deep learning-based SHM of civil infrastructures with the main content, including a brief summary of the history of the development of deep learning, the applications of deep learning-based data processing approaches in the SHM of many kinds of civil infrastructures, and the key challenges and future trends of the strategy of deep learning-based SHM.
Article
Cloud detection is a crucial preprocessing step for optical satellite remote sensing (RS) images. This article focuses on the cloud detection for RS imagery with cloud-snow coexistence and the utilization of the satellite thumbnails that lose considerable amount of high resolution and spectrum information of original RS images to extract cloud mask efficiently. To tackle this problem, we propose a novel cloud detection neural network with an encoder-decoder structure, named CDnetV2, as a series work on cloud detection. Compared with our previous CDnetV1, CDnetV2 contains two novel modules, that is, adaptive feature fusing model (AFFM) and high-level semantic information guidance flows (HSIGFs). AFFM is used to fuse multilevel feature maps by three submodules: channel attention fusion model (CAFM), spatial attention fusion model (SAFM), and channel attention refinement model (CARM). HSIGFs are designed to make feature layers at decoder of CDnetV2 be aware of the locations of the cloud objects. The high-level semantic information of HSIGFs is extracted by a proposed high-level feature fusing model (HFFM). By being equipped with these two proposed key modules, AFFM and HSIGFs, CDnetV2 is able to fully utilize features extracted from encoder layers and yield accurate cloud detection results. Experimental results on the ZY-3 satellite thumbnail data set demonstrate that the proposed CDnetV2 achieves accurate detection accuracy and outperforms several state-of-the-art methods.
Article
Automated detection of road pavement cracks is one of the key factors in evaluating road distress quality, and it is a difficult issue for the construction of intelligent maintenance systems. However, automated pavement crack detection has been a challenging task because of strong nonuniformity, complex topology, strong noise-like problems in the crack images, and so on. To address these challenges, we propose CrackSeg, an end-to-end trainable deep convolutional neural network for pavement crack detection, which is effective in achieving pixel-level, automated detection via high-level features. In this work, we introduce a novel multiscale dilated convolutional module that can learn rich deep convolutional features, making the crack features acquired under a complex background more discriminative. Moreover, in the upsampling module, the high spatial resolution features of the shallow network are fused to obtain more refined pixel-level pavement crack detection results. We train and evaluate the CrackSeg net on our CrackDataset; the experimental results show that CrackSeg achieves high performance, with a precision of 98.00%, recall of 97.85%, F-score of 97.92%, and a mIoU of 73.53%. Compared with other state-of-the-art methods, CrackSeg performs more efficiently and robustly for automated pavement crack detection.
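As a quick consistency check on the figures quoted in this abstract, the sketch below recomputes the F-score from the reported precision and recall. It assumes the reported F-score is the usual F1 (the harmonic mean of precision and recall); the abstract itself does not state the formula.

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the CrackSeg abstract.
crackseg_f = f_score(0.9800, 0.9785)
print(f"{100 * crackseg_f:.2f}%")
```

Under that assumption the result rounds to the 97.92% stated in the abstract.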
Article
The extraction of pavement cracks is always a hard task in image processing. In airport and road construction, cracking is the main factor in pavement damage, which can decrease the quality of pavement and seriously affect transportation. Cracks also exist in other artificial or natural objects, such as buildings, bridges, tunnels, etc. Among all these object images, pavement crack images are the most complex, so their processing and analysis is harder than for other crack images. From the early image acquisition based on photography technology to the current 3D laser scanning technology, pavement crack image acquisition is becoming more convenient and efficient, but there are still challenges in the automatic processing and recognition of cracks in images. From the early global thresholding to deep learning algorithms, research on crack extraction has been developing for about 40 years. There are many methods and algorithms that are satisfactory in pavement crack applications, but there is no standard to date. Therefore, to trace the development history and the state of the art, we have collected a body of literature on this topic to summarize the current research status, and we give a review of the pavement crack image acquisition methods and 2D crack extraction algorithms. In addition, more detailed comparisons and discussions are made for image acquisition methods and pavement crack image segmentation. Keywords: Highway engineering, Pavement crack, Image acquisition, Image processing, Crack extraction
Chapter
The continued understanding of the influence of flight planning characteristics on data quality is crucial in the demand for minimizing costs and maximizing the output potential of Uncrewed Aerial Systems (UAS) for forestry applications. This study was conducted to ascertain the effects of various combinations of flying height and percentage overlaps on the quality of photogrammetry data products generated from images acquired by a low-cost UAS (Mavic 2 Pro), with emphasis on tree crown delineation in a Mangium plantation forest in the Philippines. The quality of the products is evaluated based on their completeness and the accuracy of tree crown delineations. Results suggest that the percentage completeness increases as the flying height and percentage overlap increase. More than 90% completeness was achieved for 90% overlap regardless of the flying height. Tree crown delineations using multiresolution segmentation of Digital Surface Models (DSMs) generated from images with a flying height of 120 m and percentage overlap of 80% and 90%, achieved the highest overall accuracy of 43.35%. This study showed that a minimum of 80% overlap must be aimed when acquiring images to ensure higher completeness of the data products and that flying at 120 m above ground with at least 80% overlaps can provide more accurate tree crown delineations.
Article
Maintaining urban roads in a good condition is a basic duty of any city government. Traditionally, urban road maintenance is performed by a contractor that provides the required manpower and equipment resources to complete the necessary daily inspection and repair works. Although the quality of roads is acceptable, the inefficiency is evident everywhere. A performance-based contract (PBC) is used in the field of road maintenance in various countries, primarily for highway or large-area pavement rehabilitation and follow-up maintenance as the subject of the contract. In this study, we applied the design thinking approach to establish a PBC implementation model for urban road patrolling and sporadic repair. Moreover, to ensure the feasibility of the proposed model and to decrease the political risk of execution outcome, we adopted a two-stage validation in this study. The proposed performance indicators were examined by the historical data of road maintenance in Taipei City, and the feasibility of the proposed PBC implementation model was then confirmed by domain experts. The research outcomes could be used to develop PBCs for maintenance management of other infrastructures with better maintenance performances.
Article
Periodic grooved cement concrete pavement crack detection is of great importance for pavement condition monitoring and maintenance. The current state-of-the-art (SOTA) detection solutions highly depend on datasets. However, due to the limited access to crack images, more efficient methods are urgently needed to advance the detection of cracking on grooved cement concrete pavement. This study proposes an improved deeper Wasserstein generative adversarial network with gradient penalty (WGAN-GP) to generate datasets of pavement images with a size of 512 × 512 pixels. Poisson blending is adopted to create the synthesized grooved cement concrete pavement crack images based on the generated crack images and groove images. The robustness of the proposed improved deeper WGAN-GP model is validated by Faster R-CNN, YOLOv3, and YOLOv4 models trained on original crack images and generated crack images for region-level detection. U-Net and W-segnet are used to achieve pixel-level crack detection to evaluate the effectiveness of the proposed model. Results show that the improved deeper WGAN-GP could generate more realistic transverse, longitudinal and oblique crack images. In addition, the Poisson blending algorithm contributes to synthesizing grooved cement concrete pavement crack images. Moreover, it is observed that YOLOv3 trained on the augmented dataset could achieve a mean average precision (MAP) of 81.98%, 6% MAP higher than with the non-augmented dataset. U-Net and W-segnet benefit from the augmented dataset with a better pixel-level segmentation result. Based on the results, it can be concluded that the improved deeper WGAN-GP image generation method can provide a straightforward way to fill the data shortage gap of grooved cement concrete pavement cracks, thus increasing the problem-solving capability of the SOTA crack detection models.
Article
Pavement, as a kind of common public transit infrastructure, plays an important part in the daily passing and transportation-associated activities. The good conditions and smooth traffics of pavements matter significantly to the pavement users. However, due to long-time services, pavements often suffer from different kinds and severities of distresses, which might bring inconvenience to the pavement-related events, or even cause terrible traffic hazards. In this regard, we put forward a novel hybrid-window attentive vision transformer framework, called CrackFormer, for pavement crack detection aiming at providing an effective and automated solution to serving the pavement distress inspecting and repairing works. The CrackFormer employs a transformer-based high-resolution network architecture to rationally exploit and fuse multiscale feature semantics. To be specific, a hybrid-window based self-attention scheme is designed to extract feature semantics of entities both locally with dense windows and globally with sparse windows, which effectively improves the semantic details and accuracies. Moreover, a weighted multi-head self-attention philosophy is developed to recalibrate the contributions of different heads according to their relevance, which well enhances the feature encoding robustness and saliency. The CrackFormer is systematically tested on seven pavement crack detection datasets. Quantitative evaluations show that the CrackFormer achieves an overall performance with the precision of 0.9376, recall of 0.9352, and F1-score of 0.9364, respectively. In addition, qualitative examinations and comparative analyses all confirm the excellent performance of the CrackFormer for recognizing and delineating the pavement cracks of varying patterns under diverse pavement surface conditions.
Article
Background: A challenge for experimental fracture mechanics studies using vision-based methods is the accuracy with which the crack tip can be located in the region of interest for extracting fracture parameters. When using full-field displacement measurement methods such as digital image correlation (DIC), positioning the crack tip coordinate system could greatly influence the accuracy of stress intensity factors for brittle materials. Objective: The objective of the present work is to develop improved methods of tracking crack tip position for fracture parameter extraction for problems involving moving fracture fronts (e.g. dynamic crack growth). Methods: An improved image processing-based automated method for identifying the location of a propagating crack tip is proposed here. The primary inputs to the method are two-dimensional displacement fields measured using DIC. An edge detection methodology using a series of partial derivative computations is used to locate the crack tip. Results: The proposed method’s performance is verified using simulated displacement fields with a sequence of controlled crack tip positions for mode I and mixed-mode examples. The method is used to locate crack tip positions from mixed-mode dynamic fracture experiments and extract instantaneous stress intensity factor histories. Consistency is shown between baseline and automated methods and post-initiation stress intensity factor histories varied by approximately 5% with the maximum variation being under 10% for the mixed-mode experiments. Conclusions: The automated fracture parameter extraction method produced consistent results with those extracted using traditionally accepted methods, indicating that the proposed automated approach is a marked improvement due to its systematic nature and processing efficiency.
Article
Cracks are a major sign of aging transportation infrastructure. The detection and repair of cracks is the key to ensuring the overall safety of the transportation infrastructure. In recent years, due to the remarkable success of deep learning (DL) in the field of crack detection, many researches have been devoted to developing pixel-level crack image segmentation (CIS) models based on DL to improve crack detection accuracy, but as far as we know there is no review of DL-based CIS methods yet. To address this gap, we present a comprehensive thematic survey of DL-based CIS techniques. Our review offers several contributions to the CIS area. First, more than 40 papers of journal or top conference most published in the last three years are identified and collected based on the systematic literature review method. Second, according to the backbone network architecture of the models proposed in them, they are grouped into 10 topics: FCN, U-Net, encoder-decoder model, multi-scale, attention mechanism, transformer, two-stage detection, multi-modal fusion, unsupervised learning and weakly supervised learning, to be reviewed. Meanwhile, our survey focuses on discussing strengths and limitations of the models in each topic so as to reveal the latest research progress in the CIS field. Third, publicly accessible data sets, evaluation metrics, and loss functions that can be used for pixel-level crack detection are systematically introduced and summarized to facilitate researchers to select suitable components according to their own research tasks. Finally, we discuss six common problems and existing solutions to them in the field of DL-based CIS, and then suggest eight possible future research directions in this field.
Article
Pavement damage detection is essential for subsequent road maintenance decisions. However, recent detection networks have low accuracy and fail to detect most diseases on the road, which means that testing is very inefficient. Therefore, this study uses the unmanned aerial vehicle (UAV) road damage database and describes a multi-level attention mechanism called Multi-level Attention Block (MLAB) to strengthen the utilization of essential features by the You Only Look Once version 3 (YOLO v3). Adding MLAB between the backbone and feature fusion parts effectively increases the mAP value of the proposed network to 68.75%, while the accuracy of the original network is only 61.09%. The network is able to detect longitudinal cracks, transverse cracks, repairs, and potholes with high accuracy, and significantly improves the accuracy of alligator cracks and oblique cracks. The findings of this study will accelerate the application of non-destructive automatic road damage detection.
Article
Civil infrastructure (e.g., buildings, roads, underground tunnels) could lose its expected physical and functional conditions after years of operation. Timely and accurate inspection and assessment of such infrastructures are essential to ensure safety and serviceability, e.g., by preventing unsafe working conditions and hazards. Cracks, which are one of the most common distress, can indicate severe structural integrity issues that threaten the safety of the structure and people in the environment. As such, accurate, fast, and automatic detection of cracks on structure surfaces is a major issue for a variety of civil engineering applications. Due to advances in hardware data acquisition systems, significant progress has been made in the automatic detection and quantification of cracks in recent decades. This paper provides a comprehensive review of the research progress and prospects in computer vision frameworks for crack detection of civil infrastructures from multiple materials, including asphalt, concrete, and metal-like materials. The review encompasses major components of typical frameworks, i.e., data acquisition techniques, publicly available datasets, detection algorithms, and evaluation metrics. In particular, we provide a taxonomy of detection algorithms with a detailed discussion of the advantages, limitations, and application scenarios of the methods in each category, as well as the relationships between methods of different categories. We also discuss unsolved issues and key challenges in crack detection that could drive future research directions.
Article
Computer vision-based crack analysis for civil infrastructure has become popular to automatically process inspection imaging data for crack detection, localisation and quantification. Some literature reviews have been conducted, which mostly focus on qualitative damage evaluation or damage segmentation, missing the methodology categorisation for applicability-oriented quantitative crack assessment. To fill the gap, this review provides a comprehensive overview of state-of-the-art image-based crack analysis under various conditions in both qualitative and quantitative aspects, particularly focusing on image processing and deep learning-based methodologies from image-level detection to pixel-level segmentation and quantification. The key challenges and research gaps are also discussed as follows, which indicate the importance of future research: (1) developing data model methodologies to resolve the difficulties due to the image data deficiency; (2) building a learning-based model capable of processing data with complex backgrounds; (3) enhancing the scene generalisation on different detection tasks; (4) establishing a lightweight mechanism for real-time crack analysis; (5) constructing learning-based systems that comprehend the local and global contexts during crack evaluation; (6) developing a semi-supervised mechanism for more information capturing and (7) establishing attention-based models for enhanced segmentation performance.
Article
Stereo matching algorithms of binocular vision suffer from low accuracy when dealing with natural scenes (such as industrial robot scenes). Biological vision is sensitive to object edges; it divides objects by their edges, and then perceives their distances. Similar to the biological eye mechanism, this study proposes a matching algorithm that combines segment- and edge-matching to obtain the disparity. In segment matching, pixel strings from the same row of the left and right images are divided into pixel segments, whose colors and lengths are used as clues to determine several types of matching pixel segment pairs according to non-crossing mapping. The analysis of the spatial state yields several types of stimulus bars. Disparities can be obtained from the relation between pixel segment pairs and stimulus bars. In edge matching, the DTW (Dynamic Time Warping) algorithm and the gradient are used to determine the initial edge pixel matching results. The remaining edge point disparity is obtained by fitting a fill to the existing edge point disparity. Finally, the segment and edge matching results are combined for checking, filling, and post-processing. This new matching method transforms pixel matching into pixel segment matching and edge matching, which can reduce the time complexity. The algorithm can be implemented in an industrial robot environment for high-precision needle threading guidance, which neither traditional binocular matching nor deep learning matching algorithms can do.
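For readers unfamiliar with DTW, the following is a minimal sketch of the classic dynamic time warping cost between two one-dimensional sequences. It is a textbook formulation, not the matching procedure of the paper: the intensity rows, the absolute-difference cost, and the function name `dtw_distance` are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW: minimum cumulative |a[i] - b[j]| over monotone alignments."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats a match with b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a match with a[i-1]
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


# Hypothetical intensity rows from a left and a right image: the same edge
# profile, shifted by one pixel, aligns with zero cost.
left_row = [10, 10, 80, 80, 20]
right_row = [10, 80, 80, 80, 20]
print(dtw_distance(left_row, right_row))
```

Because DTW allows elements to be matched more than once, the shifted edge in the two rows aligns without penalty, which is the property that makes it attractive for matching along image rows.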
Article
Landslide susceptibility analysis at the regional scale is the focus of landslide risk management. To obtain more accurate and practically useful landslide susceptibility results, it is necessary to conduct a refined analysis on the basis of the regional scale. Therefore, we proposed a refined method for landslide susceptibility assessment. This method comprehensively considered the geological and dynamic surface deformation information, and was applied in some areas of Maoxian County. We selected twelve influencing factors (elevation, slope, relief amplitude, curvature, aspect, engineering geological rock group, distance to fault, distance to river, land type, vegetation type, topographic wetness index, and rainfall) and used the traditional machine learning method for landslide susceptibility. Then, the Interferometric Synthetic Aperture Radar (InSAR) technology was introduced to modify the unsuitable zoning of the traditional landslide susceptibility. Based on the improved landslide susceptibility mapping, field investigation and UAV models were used for verification. The results showed that the introduction of InSAR technology and UAV multi-source data can rationalize the inappropriate zoning in traditional landslide susceptibility, and landslides (L1 and L7) with very high susceptibility were identified. The field investigation and spatial-temporal evolution characteristics of typical landslides indicated that both L1 and L7 were severely damaged and in the deformation stage. L1 showed significant deformation during road construction. However, the deformation of L1 reacted on the pavement, resulting in many tensile cracks. The deformation of L7 was mainly affected by rainfall and presented the characteristics of seasonal variation, and the deformation was accelerated in the rainy season. Therefore, the proposed method had a better performance in landslide susceptibility and improved the accuracy. It can realize the refined analysis of landslide susceptibility in a large area and provide technical support for geological hazard susceptibility assessment and reference for disaster prevention and mitigation.
Article
Detailed maps and road maps have broad and vital applications in military, urban design, crisis management, and accidents. Today, on the one hand, with the development of remote sensing sciences, the use of aerial and satellite images has expanded, and on the other hand, new advances in the processing of these images have led to improved analysis results on remote sensing images. One of the most up-to-date branches of image processing is deep learning networks, which have wide applications in various fields of engineering sciences, including remote sensing sciences. In this study, to extract the roads, a deep learning method with a customized U-Net architecture was used on the freely available Massachusetts, USA, dataset. Many studies have been done on this data, generally emphasizing the architecture and post-processing methods. In this research, which focuses on the data, deep learning networks have been used in two different scenarios. Scenario 1 uses RGB images only; in Scenario 2, an additional channel containing a random forest road/non-road classified image is added to the RGB images before training the network. The U-Net training results indicate high accuracy, with an average F1-score of 0.92 for Scenario 2. The method in this article can be used as a reliable method in road detection.
Article
The prompt detection of early decay in the pavement could be an auspicious technique in road maintenance. Admittedly, early crack detection allows preventive measures to be taken to avoid damage and possible failure. With regards to the advancement in computer vision and image processing in civil engineering, traditional visual inspection has been replaced by semi-automatic/automatic techniques. The process of detecting objects from the images is a fundamental stage of any image processing technique since the accuracy rate of the classification will depend heavily on the quality of the results obtained from the segmentation step. The major challenge of pavement image segmentation is the detection of thin, irregular dark lines cracks that are buried into the textured backgrounds. Although the pioneering works on image processing methodologies have proven great merit of such techniques in detecting pavement surface distresses, there is still a need for further improvement. The academic community is already working on image-based identification of pavement cracks, but there is currently no standard structure. This literature review establishes the history of development and interpretation of existing studies before conducting new research; and focuses heavily on three major types of approaches in the field of image segmentation, namely thresholding-based, edge-based, and data driven-based methods. With comparison and analysis of various image segmentation algorithms, this research provides valuable information for researchers working on enhanced segmentation strategies that potentially yield a fully automated distress detection process for pavement images with varying conditions.
Article
Transportation networks are severely affected by natural hazards, including landslides. The prioritization of maintenance works is required to preserve the efficiency and functionality of road infrastructure. To overcome the subjectivity of traditional visual inspections for road pavement condition assessment, advanced (semi-)automatic approaches have been emerging. Still, the quantitative and objective description of damage typology and extent, and its severity classification remain the major issues for the assessment of landslide impacts on transportation routes. The objective of this work is to provide a ready-to-use tool for semi-automatic damage assessment of asphalt-paved roads in landslide-affected areas to support risk analysis and planning of mitigation measures. The use of 3D models and 2D images as reconstructed from UAV-based photogrammetry is investigated to detect longitudinal and transverse cracks on the road pavement and assess their severity in landslide areas, as a rapid, systematic, objective and less laborious alternative to traditional field surveys. A semi-automatic procedure is proposed to i) select asphalt-paved roads exposed to landslides, ii) rapidly map distresses on selected road sections, iii) quantitatively detect and describe longitudinal and transverse cracks, and iv) classify their severity according to the International Roughness Index (IRI). The procedure is applied to the Province of Como (northern Italy), where three test sites are selected for detailed analyses. The results indicate that the proposed procedure is useful for road management purposes at different levels of details by providing four outputs: i) a road damage hotspot map to detect deformations, ii) a multi-criteria binary classifier to detect pavement damage, iii) an IRI-based criterion to rate the pavement quality, and iv) a road damage severity map.
Article
An autonomous unmanned aerial vehicle (UAV) system integrated with a modified faster region-based convolutional neural network (Faster R-CNN) is proposed to identify various types of structural damage and to map the detected damage in a GPS-denied environment. The proposed method significantly reduces the number of false positives using a real-time streaming protocol and multi-processing, particularly in the case of very small cracks in videos blurred by UAV vibrations. In comparative studies, the modified Faster R-CNN using ResNet-101 as the base network showed superior performance in detecting small and blurry defects, with a mean average precision of 93.31% and a mean intersection-over-union of 92.16% in video frames captured by the low-cost autonomous UAV. The autonomous flights of the UAV were tested in a real large-scale parking structure to account for high wind effects during flight. The UAV successfully followed the desired trajectories, and the Faster R-CNN detected defects accurately.
Article
Point clouds are essential for storage and transmission of 3D content. As they can entail significant volumes of data, point cloud compression is crucial for practical usage. Recently, point cloud geometry compression approaches based on deep neural networks have been explored. In this paper, we evaluate the ability to predict perceptual quality of typical voxel-based loss functions employed to train these networks. We find that the commonly used focal loss and weighted binary cross entropy are poorly correlated with human perception. We thus propose a perceptual loss function for 3D point clouds which outperforms existing loss functions on the ICIP2020 subjective dataset. In addition, we propose a novel truncated distance field voxel grid representation and find that it leads to sparser latent spaces and loss functions that are more correlated with perceived visual quality compared to a binary representation. The source code is available at https://github.com/mauriceqch/2021_pc_perceptual_loss .
Article
Research in artificial intelligence for radiology and radiotherapy has recently become increasingly reliant on the use of deep learning‐based algorithms. While the models these algorithms produce can significantly outperform more traditional machine learning methods, they rely on larger datasets being available for training. To address this issue, data augmentation has become a popular method for increasing the size of a training dataset, particularly in fields where large datasets aren’t typically available, which is often the case when working with medical images. Data augmentation aims to generate additional data which is used to train the model and has been shown to improve performance when validated on a separate unseen dataset. This approach has become commonplace, so to help understand the types of data augmentation techniques used in state‐of‐the‐art deep learning models, we conducted a systematic review of the literature where data augmentation was utilised on medical images (limited to CT and MRI) to train a deep learning model. Articles were categorised into basic, deformable, deep learning or other data augmentation techniques. As artificial intelligence models trained using augmented data make their way into the clinic, this review aims to give insight into these techniques and confidence in the validity of the models produced.
Article
Automatic pavement crack detection has attracted increasing attention in the field of national infrastructure maintenance and rehabilitation. However, due to inhomogeneous crack pixel values, varied crack topology, noisy surrounding texture and changing illumination conditions, automatic pavement crack detection is still a challenging problem. In this paper, a triple-thresholds pavement crack detection method is proposed leveraging random structured forest. Specifically, channel features and pairwise difference features are exploited to enrich the information of the patches that comprise the crack image. We tackle the task of predicting local crack patches in a structured learning framework leveraging random structured forest. After that, a crack score map is generated in which each position shows the crack score at the corresponding position of the original image. Then, a triple-thresholds method is introduced to obtain the preliminary crack detection result from the crack score map and to suppress the noise. Finally, a new morphological operation is proposed to enhance the continuity of the crack while keeping the crack width within the given tolerance margin. Our approach is evaluated by comparison with six state-of-the-art crack detection algorithms on the public CFD dataset. Experimental results show that our method outperforms its counterparts, achieving 95.95%, 90.59% and 92.59% precision, recall and F1-score, respectively.
Article
Aerial imaging of an open-pit mine integrated with visual analytics offers a novel approach for routine monitoring of tension cracks for mine safety. Tension cracks may occur on work- or catch-benches that are excavated according to a CAD model. The size of the tension cracks, their locations, and their evolution are commonly used to predict slope failures and to ensure safe mine operations. The goal of this research was to replace the current manual interventions with an automated platform that generates routine reports for the mine controller. First, a drone was flown on a pre-programmed flight trajectory at a constant elevation to generate a mosaic and a depth map image. Next, work-benches, catch-benches, and access roads were automatically identified and represented by their medial axes. Subsequently, the waypoints from each medial axis were sequentially uploaded into the drone for scanning the corresponding regions at high resolution. These high-resolution images were then used to delineate tension cracks. The delineation of tension cracks was performed using steerable filters and the ENet and UNet deep learning models for comparison. The ENet model, with the leave-one-out cross-validation method, produced the best performance profile, with an Aggregated Jaccard Index and F1-Score of 0.51 and 0.79, respectively.
Article
Discovering and assessing cracks is widely thought to be critical for maintaining the healthy condition of asphalt pavement. Unfortunately, the manual inspection of pavement for cracks is not only labor-intensive, time-consuming, inefficient, and costly, but it is also unable to detect and quantify cracks accurately at the pixel level. To address this problem, we propose an integrated approach that combines the convolutional neural network DeepLabv3+ for crack detection with an algorithm for crack quantification at the pixel level. The quantification algorithm is used to evaluate five important indicators: crack length, mean width, maximum width, area, and ratio. To fully verify the performance of DeepLabv3+, 50 images were studied; the best image showed a mean intersection over union (MIoU) of 0.8342. For testing, 80 new images (including both asphalt pavement images and concrete pavement images) were used. DeepLabv3+ was found to be reliable and widely applicable for crack detection, and it demonstrated an MIoU of 0.7331. Of the various quantitative indicators, the crack length had the lowest relative error rate of the predicted values and therefore had the highest accuracy (its relative error rate ranged from −25.93% to 14.11%). We also compared our system with four state-of-the-art methods. The results showed our integrated approach to be more effective and more accurate in both the detection and quantification of cracks. The integrated approach could potentially serve as the basis of an automated, cost-effective pavement-condition assessment scheme for the operation and maintenance of pavement.
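To make the pixel-level measures in this abstract concrete, here is a minimal sketch of intersection over union for binary crack masks, together with two of the five indicators (area and ratio). It is not the authors' algorithm: the function names are hypothetical, MIoU is assumed here to be the average over the crack and background classes, and crack length and width are left out because they would need a skeletonisation step the abstract does not describe.

```python
import numpy as np


def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over union of two binary masks of the same shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:  # both masks empty: treat as perfect agreement
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


def mean_iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Average IoU over the crack and background classes (assumed definition)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    return 0.5 * (iou(pred, truth) + iou(~pred, ~truth))


def crack_area_and_ratio(mask: np.ndarray, pixel_size_mm: float = 1.0):
    """Crack area in mm^2 and the fraction of image pixels marked as crack."""
    n_crack = int(mask.astype(bool).sum())
    return n_crack * pixel_size_mm ** 2, n_crack / mask.size


if __name__ == "__main__":
    truth = np.zeros((4, 4), dtype=bool)
    truth[1, :] = True       # a 4-pixel horizontal crack
    pred = np.zeros((4, 4), dtype=bool)
    pred[1, :2] = True       # half of the crack is detected
    pred[2, 0] = True        # plus one false-positive pixel
    print("crack IoU:", iou(pred, truth))
    print("mean IoU:", mean_iou(pred, truth))
    print("area (mm^2), ratio:", crack_area_and_ratio(truth, pixel_size_mm=2.0))
```

Because cracks occupy few pixels, the background IoU is usually high, so a class-averaged MIoU can look better than the crack-only IoU; which definition a study reports is worth checking before comparing figures.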
Article
In recent years, deep learning-based crack detection methods have been widely explored and applied due to their high versatility and adaptability. In civil engineering applications, recent research on crack detection through deep convolutional neural networks (DCNN) includes road pavement crack detection, bridge inspection, defect detection in shield tunnel lining, etc. Despite the increasing popularity of DCNN for crack detection, many challenges have yet to be properly addressed. For crack detection using three-dimensional (3D) range (i.e., elevation) images, disturbances such as surface variation can negatively affect the detection performance. Besides, some typical non-crack patterns such as grooves can be easily misidentified as cracks, i.e., false positives. Another issue lies in the selection of hyperparameters related to the design of a DCNN architecture. For example, the hyperparameters related to network structure (e.g., kernel size, network depth and width) and training (e.g., mini-batch size and learning rate) can impact the network performance to a significant extent. Therefore, they need to be properly determined for optimal performance. However, for deep learning-based roadway crack classification using laser-scanned range images, hyperparameter selection and tuning have not been comprehensively discussed. This study develops a hyperparameter selection process involving a series of experiments on highly diverse laser-scanned range images, investigating the optimal joint hyperparameter configuration of network structure and training for DCNN-based roadway crack classification. In a comparative study, 36 DCNN architectures with varying layouts are developed for crack classification. These architecture candidates differ in kernel sizes (e.g., 3 × 3, 7 × 7, and 11 × 11), network depths (from 5 to 8 weight layers), and widths (from 16 to 96 kernels in each convolutional layer).
The 7-layer DCNN with constant 7 × 7 kernels and increasing network widths yields the highest classification performance among the proposed 36 DCNN classifiers, which may be because it can best reflect the complexity of the acquired laser-scanned roadway range images. Once the optimal architecture layout is determined, further discussion on the selection of mini-batch sizes, learning rates, dropout factor and leaky rectified linear unit (LReLU) factor is performed. Experimental results show that the optimal architecture with its associated training configuration can achieve consistent and accurate performance under the contamination of surface variations and grooved patterns in laser-scanned range images. Discussion on the hyperparameter selection can provide insights for the development of DCNN in similar applications using laser-scanned range images.