Article
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Many municipalities and road authorities seek to implement automated evaluation of road damage. However, they often lack the technology, know-how, and funds to afford state-of-the-art equipment for collecting and analyzing road damage data. Although some countries, like Japan, have developed less expensive and readily available smartphone-based methods for automatic road condition monitoring, other countries still struggle to find efficient solutions. This work makes the following contributions in this context. Firstly, it assesses the usability of the Japanese model for other countries. Secondly, it proposes a large-scale heterogeneous road damage dataset comprising 26,620 images collected from multiple countries (India, Japan, and the Czech Republic) using smartphones. Thirdly, it proposes models capable of detecting and classifying road damage in more than one country. Lastly, the study provides recommendations for readers, local agencies, and municipalities of other countries on how to proceed when another country publishes its data and model for automatic road damage detection and classification. A part of the proposed dataset was utilized for the Global Road Damage Detection Challenge’2020 and can be accessed at https://github.com/sekilab/RoadDamageDetector/.


... Varying the input image resolution, the batch size, and the learning rate, the paper reports F1-scores ranging from 52% to 56%. The authors used the dataset from [29], an upgraded version of the dataset from [8], consisting of 26,620 images collected from India, the Czech Republic, and partially from Slovakia, with four damage categories: longitudinal cracks, lateral cracks, alligator cracks, and potholes. The authors of [30] analyzed this dataset using SSD models based on MobileNet architectures. ...
... These strategies involve test time augmentation and averaging predictions from multiple trained models, aiming to improve robustness and accuracy. The approaches are evaluated using the dataset from [29], demonstrating an F1 score up to 0.67. ...
... Based on the RDD2020 dataset from [29], the same authors examined an upgraded dataset, RDD2022 [41], with 47,420 road images from six countries: Japan, India, Czech Republic, Norway, the United States, and China. The annotations are based on four types of road damage: longitudinal cracks, transverse cracks, alligator cracks, and potholes. ...
Article
Full-text available
The rapid advancement of autonomous vehicle technology has brought into focus the critical need for enhanced road safety systems, particularly in the areas of road damage detection and surface classification. This paper explores these two essential components, highlighting their importance in autonomous driving. In the domain of road damage detection, this study explores a range of deep learning methods, particularly focusing on one-stage and two-stage detectors. These methodologies, including notable ones like YOLO and SSD for one-stage detection and Faster R-CNN for two-stage detection, are critically analyzed for their efficacy in identifying various road damages under diverse conditions. The review provides insights into their comparative advantages, balancing between real-time processing and accuracy in damage localization. For road surface classification, the paper investigates the classification techniques based on both environmental conditions and material road composition. It highlights the role of different convolutional neural network architectures and innovations at the neural level in enhancing classification accuracy under varying road and weather conditions. The main finding of this work is that it offers a comprehensive overview of the current state of the art, showcasing significant strides in utilizing deep learning for road analysis in autonomous vehicle systems. The study concludes by underscoring the importance of continued research in these areas to further refine and improve the safety and efficiency of autonomous driving.
... Among the various technologies and high-precision methods used for automating road inspection [7][8][9][10], image-based approaches have gained prominence [11,12]. These approaches employ machine learning models to analyze images captured by cameras attached to vehicles or drones, offering a cost-effective solution [13,14]. ...
... While generic datasets like PASCAL-VOC [16,17], KITTI [18], and MSCOCO [19] have been extensively explored [20], domain-specific datasets are required for the road sector. In recent years, various datasets have been introduced to support road damage detection, and researchers have experimented with data from different locations, emphasizing the need for location-specific datasets and models [11]. Nonetheless, another concern is the solutions proposed utilizing a given dataset. ...
... Several datasets have been introduced in the recent years [41][42][43][44], each with its own characteristics and challenges [12]. Some notable examples are: RDD2018 [13], RDD2019 [45], EdmCrack600 [14]; RDD2020 [11] and [46], RDD2022 [26]; CQU-BPDD [47] and CQU-BPMDD [48]. These datasets differ in various aspects, such as the study area, number of images, image size, data collection device, data acquisition method, and view captured, etc. ...
Article
Monitoring road conditions is crucial for safe and efficient transportation infrastructure, but developing effective models for automatic road damage detection is challenging, requiring large-scale annotated datasets. Cross-country collaboration provides access to diverse datasets and insights into factors affecting road damage detection models. This paper presents a review of winning strategies of the Crowdsensing-based Road Damage Detection Challenge (CRDDC) held in 2022 as a Big Data Cup, with 90+ teams from 20+ countries proposing solutions for six countries: India, Japan, the Czech Republic, Norway, the United States, and China. The best solution achieved an F1-score of 77% for all six countries, which is 2.7% better than the 2nd ranked solution. This study explores the impact of factors influencing dataset and model selection by CRDDC winners. The study’s insights can guide future research in making data-related choices and developing more effective road damage detection models accounting for the diverse road conditions across different countries.
... To evaluate the causative factors of road pavement, we need to consider several things, such as weather conditions, traffic loads, construction materials, and maintenance activities. (Arya D. M., 2021) Road pavement is a layer of construction on top of the ground (subgrade) that has been compacted to support the traffic load, which is then distributed to the road so that the soil does not receive the load due to the allowable soil capacity (Isradi M. R., 2022). To provide a comfortable, safe, and durable road surface. ...
Article
Full-text available
The development of road infrastructure in various countries is growing rapidly in line with the increasing need for more efficient transportation and better road access to various regions. Suppose in a country the road infrastructure is damaged. In that case, it will result in limited access to more efficient transportation, hinder the mobility of road and goods users, and potentially slow down economic growth. This study aims to determine the level of satisfaction of road users with the condition of the pavement on Jl. Pejuang-Sindangkasih, and determine the level of existing damage. This research is located on Jl. Pejuang-Sindangkasih, which is located in Majalengka District, Majalengka Regency, West Java. The data collection technique used is a questionnaire, which is intended to collect data on the level of damage felt, the impact of damage on comfort and safety, and road users' expectations for road repairs. Overall, this data provides a clear picture of the profile of road users on Jl. Pejuang-Sindangkasih and how they see the condition of the road. This information can be an important basis for road repair and maintenance to improve comfort and safety for all road users. Some respondents also emphasized the importance of immediate repairs to road damage to reduce the risk of accidents and vehicle damage. In addition, respondents from younger age groups tend to be more critical of road conditions and show a high awareness of the importance of good road infrastructure. Road damage is very dangerous for road users, especially for motorcycle users, and urgent repairs are needed. Overall, this information provides important information about the planning and implementation of road repairs, with a special emphasis on the most important elements to improve the safety and comfort of road users.
... A sample architecture of a CNN includes an input layer, multiple groups of convolutional layers and pooling layers, followed by a fully connected layer and an output layer. It is cited from [86]. ...
Article
Full-text available
The hot deconfined matter called quark–gluon plasma (QGP) can be generated in relativistic heavy-ion collisions (HICs). Its properties under high temperatures have been widely studied. Since the short-lived QGP is not directly observable, data-driven methods, including deep learning, are often used to infer the initial-state properties from the final distributions of hadrons. This paper reviews various applications of machine learning in relativistic heavy-ion collisions, explains the fundamental concepts of deep learning, and discusses how the properties of HIC data can be interpreted using efficient machine learning models.
... This study enhances pothole detection accuracy by optimizing smartphone-accelerometer alignment with vehicle axes, identifying significant threshold values for diverse algorithms, and validating results through an external tri-axis accelerometer [4]. The author employed a smartphone-based approach for pothole detection, creating a heterogeneous road image dataset comprising 26,620 images collected from various countries, with results demonstrating a notable F1-score of 0.674 using a YOLO-based ensemble method [5]. The proposed SVM excels in accuracy, managing parameters like Pothole Depth, Latitude, and Longitude, utilizing ultrasonic sensors, gyroscope, GSM, and GPS modules on vehicles, with data being sent to the Thing Speak Cloud [6]. This proposed Internet of Things prototype combines a GPS, accelerometer, and ultrasonic sensor to detect potholes and humps. ...
Conference Paper
Full-text available
The need for transportation increases as the population ages, making road safety even more vital. We propose a straightforward system that utilizes already-existing parts to address this issue. Using vehicle acceleration sensors, our solution employs real-time road condition detection to identify hazards such as potholes. The data is seamlessly integrated into our website using the ESP-NOW communication protocol, providing users with a clear view of the road conditions and their precise locations. The locations can be viewed on the website’s Google Map, and a straightforward and trustworthy graphical plot can be used to track the conditions. Our system enhances safety and efficiency by enabling drivers to make informed decisions through location-based insights and timely updates. Government officials can also utilize this information to plan for infrastructure maintenance.
... Compared with the CNN based method using sliding window, Maeda et al. [25], Wang et al. [26], Arya et al. [27] applied object detection frames to crack detection. To provide quasi real-time simultaneous detection of multiple types of damages, Cha et al. [28] proposed a structural crack detection method based on the Faster R-CNN. ...
Article
Full-text available
Crack detection in pavements is a critical task for infrastructure maintenance, but it often requires extensive manual labeling of training samples, which is both time-consuming and labor-intensive. To address this challenge, this paper proposes a semi-supervised learning approach based on a DenseNet classification model to detect pavement cracks more efficiently. The primary objective is to leverage a small set of labeled samples to improve the model's performance by incorporating a large number of unlabeled samples through semi-supervised learning. This method enhances the DenseNet model's ability to generalize by iteratively learning from new unlabeled datasets. As a result, the proposed approach not only reduces the need for extensive manual labeling but also mitigates issues related to label inconsistency and errors in the original labels. The experimental results demonstrate that the semi-supervised DenseNet model achieves a prediction precision of 96.77% and a recall of 94.17%, with an F1 score of 95.45% and an Intersection over Union (IoU) of 91.30%. These metrics highlight the model's high accuracy and effectiveness in crack detection. The proposed method not only improves label quality and model performance but also offers practical value for engineering applications in the field of pavement maintenance, making it a valuable tool for infrastructure management.
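The figures reported in this abstract can be cross-checked by hand. The following is a sketch only, assuming that the F1 score and the IoU are both computed from the same true-positive, false-positive, and false-negative counts, which the abstract does not state. Under that assumption, F1 = 2TP/(2TP + FP + FN) and IoU = TP/(TP + FP + FN), so IoU = F1/(2 - F1). Here P and R denote the reported precision and recall.

```latex
\[
F_1 = \frac{2PR}{P + R}
    = \frac{2\,(0.9677)(0.9417)}{0.9677 + 0.9417}
    \approx 0.9545,
\qquad
\mathrm{IoU} = \frac{F_1}{2 - F_1}
             = \frac{0.9545}{1.0455}
             \approx 0.9130 .
\]
```

Both values agree with the reported 95.45% and 91.30%, which suggests the four metrics were derived from one set of counts.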
... Guo et al. [48] introduce MN-YOLOv5, incorporating MobileNetV3 [49] and coordinate attention [50] for mobile deployment. In the Crowdsensing-based Road Damage Detection Challenge (CRDDC'2022 [51][52][53][54][55]), team Shi Yu_Sea View fuses YOLO and Faster R-CNN algorithms to enhance global information utilization and achieve top results. ...
Article
Full-text available
As an essential object detection application, road damage detection aims to identify and mark road damage. Timely maintenance of detected damage can improve road safety. However, the proportion of the image occupied by damage varies widely because of the variety of road damage textures and shapes. Additionally, it is challenging to localize road damage accurately because the damaged regions are blurred by external environmental factors. In this study, we propose a Road Damage Detector with a Local Sensing Feature Network (LSF-RDD), which constructs a Local Sensing Feature Network (LSF-Net) as a neck to fuse multi-scale features extracted from the backbone network and can focus on the location of the damaged area. First, the CSP-Darknet53 backbone network extracts the feature maps of three scales layer-by-layer from the input images. Second, these three feature maps are input into LSF-Net for multi-scale feature fusion to generate three local feature representations. LSF-Net comprises four interconnected blocks, enabling top-down and bottom-up feature fusion. Feature maps from the backbone perform multi-scale feature fusion through connections between different blocks. Finally, three local feature representations are sent into the detection head for detection. Experiments show that LSF-RDD performs well on the adopted datasets, especially on the China_motorbike dataset of RDD2022, with mAP@0.5 reaching 94.4%.
... Flight platforms allow for automated flight trajectory planning and execution, and many UAS are equipped with proximity sensors that enable dynamic detection (i.e., during flight execution) of obstacles and, in many cases, the reprogramming of flight trajectories to avoid them [17]. Alongside, increasingly efficient and automated photogrammetric data processing procedure, based on image analysis [18,19] or artificial intelligence (AI) algorithms (machine learning, deep learning, neural networks) [20,21], have undergone unprecedented technological advancements. ...
Conference Paper
Cold mix patching materials (CMPMs) have become increasingly popular solutions for the emergency asphalt repairs of small-to medium-sized potholes in severe winter conditions or where short reopening times are required. Their performances need to be carefully monitored in the early post-application phases, when these materials have a high potential to face several distresses. The research aimed to identify a set of methods for the rapid and cost-effective acquisition of asphalt pavement surface geometric data to analyze the effectiveness of a given CMPM, with particular attention to procedures that minimize interference with the normal traffic flow or limit the lanes closure whilst ensuring the safety of operators and pedestrians. A road trial section in a suburban industrial zone, where ideal potholes were cut and filled with different types of CMPMs, was set up for this purpose. A low-cost unmanned aerial vehicle (UAV), was used for the experiments, repeating the drone survey in four successive epochs (up to 30 days). The captured images were post-processed with a photogrammetric software and the co-registered point clouds at the different epochs were compared to highlight the early patches performances and deteriorations. This ready-to-use methodology, tailored on the urban scale, made it possible to identify over time the occurrence of raveling on the patch area and to monitor the trend of patch surface depressions, extracting desired transverse and longitudinal profiles.
... Typical CNN architecture. Reprinted from Arya et al. (2020). ...
Thesis
Full-text available
Biodiversity conservation is a critical issue, driven by rapid species loss and ecosystem degradation. Effective conservation strategies require accurate monitoring and prediction of species presence and distribution. Traditional biodiversity assessments are often time-consuming and resource-intensive, highlighting the need for automated, scalable approaches using advanced computational techniques. This thesis investigates location-based species presence prediction using multi-modal data, encompassing satellite imagery, bioclimatic and other environmental variables. The GeoLife CLEF 2024 Challenge provides a dataset for developing predictive species presence models. The competition's dataset includes satellite images, environmental rasters, and species observation records. This thesis explores the application of various machine learning techniques, including k-Nearest Neighbors, Full Model Ensembles, Top k Binary Classifiers, and Graph Neural Networks (GraphSAGE) to this task. Through a benchmark experiment, the Top k Binary Classifier emerged as the best-performing approach with the highest mean and median F1-scores across five runs. The study demonstrates significant improvements over the established baseline model, emphasizing the potential of advanced techniques in enhancing predictive accuracy. The thesis also discusses limitations, such as model explainability and hyperparameter tuning, and suggests areas for future research, including model optimization and integration with real-world applications. The findings underscore the importance of multi-label classification approaches in ecological monitoring, contributing to more informed biodiversity conservation efforts.
... In this article, we reclassify and organize the RDD2022 [27] used in Crowdsensing-based Road Damage Detection Challenge [27,[50][51][52]. Based on the shooting perspective, we change the dataset from its original classification based on countries to a classification based on different photographic perspectives. ...
Article
Full-text available
Road damage detection using computer vision and deep learning to automatically identify all kinds of road damage is an efficient application of object detection, which can significantly improve the efficiency of road maintenance planning and repair work and ensure road safety. However, due to the complexity of target recognition, existing road damage detection models usually carry a large number of parameters and a large amount of computation, resulting in slow inference, which limits the deployment of these models on equipment with limited computing resources. In this study, we propose a road damage detector named LMFE-RDD for balancing speed and accuracy, which constructs a Lightweight Multi-Feature Extraction Network (LMFE-Net) as the backbone network and an Efficient Semantic Fusion Network (ESF-Net) for multi-scale feature fusion. First, as the backbone feature extraction network, LMFE-Net takes road damage images as input and produces feature maps at three different scales. Second, ESF-Net fuses these three feature maps and outputs three fused features. Finally, the fused features are sent to the detection head for target identification and positioning, and the final result is obtained. In addition, we use WDB loss, a multi-task loss function with a non-monotonic dynamic focusing mechanism, to pay more attention to bounding box regression losses. The experimental results show that the proposed LMFE-RDD model has competitive accuracy while ensuring speed. On the Multi-Perspective Road Damage Dataset, combining the data from all perspectives, LMFE-RDD achieves a detection speed of 51.0 FPS and 64.2% mAP@0.5 with only 13.5 M parameters.
... Automated detection of road damage utilizes machine learning to identify and classify various types of road damage, such as cracks and potholes [1][2][3][4][5]. This technology leverages vehicle-mounted cameras to capture and record road conditions, significantly improving detection accuracy, which is crucial for maintaining road safety. ...
Preprint
Full-text available
Road damage detection involves identifying cracks, potholes, and other surface irregularities from collected images. This technology is crucial for road maintenance and ensuring traffic safety. Despite significant progress in object detection algorithms, challenges such as weather-induced variability, dispersed key features, and diverse forms of damage persist. To address these issues, this paper proposes a road damage detection algorithm named Flexi-Weather Hard Detection, which integrates data augmentation based on AIGC and corner point feature aggregation. One of the modules, named Weather Trim Augment, utilizes stable diffusion technology to generate road damage data under various weather conditions. This enhancement expands the training dataset and reduces the negative impact of weather on detection accuracy. The Flexi Corner Block utilizes deformable convolutions and combines a lightweight MLP with a learnable visual center mechanism to leverage corner points, enhancing local feature learning and improving the detection of subtle and dispersed features in a multi-scale context. Additionally, the HXIOU loss function is designed, employing weighted calculations and multiple metrics to effectively mine hard examples with significant variability, thus enhancing the detection accuracy of difficult cases such as blurred potholes and fine cracks. Comprehensive experiments on the RDD2020 and CNRDD datasets demonstrate that the proposed approach significantly improves performance, achieving 64.9% in the Test1 metric and 40.6% in the F1-Score. Notably, the algorithm achieves robust detection in unannotated, adverse weather conditions such as snow and rain, showcasing excellent generalization capabilities.
... In 2020, Arya, Maeda, Ghosh, Toshniwal, Mraz, et al. (2020); Arya, Maeda, Ghosh, Toshniwal, and Sekimoto (2021) attempted to apply the models trained using RDD2019 to detect road damage outside Japan. Experiments were performed using the data from India and the Czech Republic. ...
Article
Full-text available
The data article describes the Road Damage Dataset, RDD2022, comprising 47,420 road images drawn mainly from six countries: Japan, India, the Czech Republic, Norway, the United States, and China. The dataset incorporates over 55,000 instances of road damage, specifically longitudinal cracks, transverse cracks, alligator cracks, and potholes. Designed to facilitate the development of deep learning methodologies for automated road damage detection and classification, RDD2022 was unveiled as part of the Crowdsensing‐based Road Damage Detection Challenge (CRDDC'2022), with a major contribution from the challenge winners. This challenge garnered global participation, urging researchers to propose solutions for automatic road damage detection in multiple countries. A noteworthy outcome of CRDDC'2022 was the emergence of a top‐performing model achieving an F1 Score of 76.9% for road damage detection in all six countries using RDD2022. This success underscores the dataset's practical applicability for municipalities and road agencies, enabling low‐cost, automatic monitoring of road conditions. Beyond its immediate utility, RDD2022 stands as a valuable benchmark for researchers in computer vision, geoscience, and machine learning, offering a rich resource for algorithmic evaluation in diverse image‐based applications, including classification and object detection. The latest big data cup, Optimized Road Damage Detection Challenge (ORDDC'2024), is also based on RDD2022, underscoring its continued relevance and pivotal role in current research and development endeavors.
... Despite the significant research contributions in road condition detection [34][35][36][37], the problem of accurately classifying cracks remains open and challenging. One of the primary problems is the substantial training time required to achieve accurate classification. ...
Article
Full-text available
Detecting and promptly identifying cracks on road surfaces is of paramount importance for preserving infrastructure integrity and ensuring the safety of road users, including both drivers and pedestrians. Presently, the predominant approach for crack detection relies on labor-intensive and costly manual inspections, a burden made worse because concrete road cracks are susceptible to environmental factors like temperature fluctuations, heavy traffic loads, and prolonged exposure to harsh weather conditions. This paper presents a comprehensive research initiative aimed at advancing road crack detection techniques. The proposed method leverages a classical Convolutional Neural Network (CNN) in conjunction with the Extreme Learning Machine (ELM) for efficient feature extraction and classification. A pivotal breakthrough is achieved by replacing the fully connected layer of the CNN with ELM, thus circumventing the time-consuming backpropagation process and significantly expediting the training process. This integration harnesses ELM’s swift learning capabilities and strong generalization performance, complemented by the CNN’s exceptional feature extraction abilities. The model’s effectiveness is rigorously assessed using the widely recognized CCIC, SDNET2018, and Synthesized datasets, and its proficiency in distinguishing between Crack and No-Crack regions is demonstrated through various performance metrics. Furthermore, the model’s performance is benchmarked against existing deep learning models using these performance metrics, showcasing its superior performance.
... This paper refers to the evaluation metrics for road damage detection in CRDDC'2022, using the F1 score and mean average precision (mAP) to assess the performance of the trained model. The F1 score is the harmonic mean of precision and recall values; maximizing the F1 score ensures reasonably high precision and recall [31]. Precision is the ratio of true positives to all predicted positives, while recall is the ratio of true positives to all actual positives. ...
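The definitions in the excerpt above can be written out as a minimal Python sketch. The function name `detection_scores` and the example counts are hypothetical and chosen only for illustration. The sketch also does not reproduce how CRDDC'2022 decides whether a prediction counts as a true positive, since the excerpt does not describe that step.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, F1) from raw detection counts.

    tp: predictions that match a ground-truth damage instance
    fp: predictions with no matching ground-truth instance
    fn: ground-truth instances that no prediction matched
    """
    # Precision: share of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = detection_scores(tp=60, fp=20, fn=40)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot reach a high F1 score by trading recall away for precision, or the reverse.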
Article
Full-text available
The detection of road damage is highly important for traffic safety and road maintenance. Conventional detection approaches frequently require significant time and expenditure, the accuracy of detection cannot be guaranteed, and they are prone to misdetection or omission problems. Therefore, this paper introduces an enhanced version of the You Only Look Once version 8 (YOLOv8) road damage detection algorithm called RDD-YOLO. First, the simple attention mechanism (SimAM) is integrated into the backbone, which successfully improves the model’s focus on crucial details within the input image, enabling the model to capture features of road damage more accurately, thus enhancing the model’s precision. Second, the neck structure is optimized by replacing traditional convolution modules with GhostConv. This reduces redundant information, lowers the number of parameters, and decreases computational complexity while maintaining the model’s excellent performance in damage recognition. Last, the upsampling algorithm in the neck is improved by replacing the nearest interpolation with more accurate bilinear interpolation. This enhances the model’s capacity to maintain visual details, providing clearer and more accurate outputs for road damage detection tasks. Experimental findings on the RDD2022 dataset show that the proposed RDD-YOLO model achieves an mAP50 and mAP50-95 of 62.5% and 36.4% on the validation set, respectively. Compared to baseline, this represents an improvement of 2.5% and 5.2%. The F1 score on the test set reaches 69.6%, a 2.8% improvement over the baseline. The proposed method can accurately locate and detect road damage, save labor and material resources, and offer guidance for the assessment and upkeep of road damage.
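The difference between nearest-neighbor and bilinear upsampling mentioned in this abstract can be illustrated with a small pure-Python sketch. It is not the RDD-YOLO implementation: the function names are hypothetical, and the corner-aligned coordinate mapping is an assumed convention that deep learning frameworks do not all share.

```python
def upsample_nearest(grid, scale):
    """Repeat every value `scale` times along both axes."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def upsample_bilinear(grid, out_h, out_w):
    """Bilinear resize with corner-aligned sampling (assumed convention)."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional source row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Map the output column back to a fractional source column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Blend horizontally on the two bracketing rows, then vertically.
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


coarse = [[0.0, 1.0],
          [1.0, 0.0]]
nearest = upsample_nearest(coarse, 2)        # blocky 4x4 copy of the input
bilinear = upsample_bilinear(coarse, 3, 3)   # smooth 3x3 interpolation
```

Nearest-neighbor output contains only the values already present in the input, which produces blocky edges. Bilinear output introduces intermediate values, the kind of smoother transition the abstract credits with preserving visual detail.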
... Deeksha Arya et al. [2] proposed a large-scale heterogeneous road damage dataset comprising 26,620 images collected from multiple countries using smartphones. They also proposed generalized models capable of detecting and classifying road damage in more than one country. ...
Article
Full-text available
Road detection is a fundamental task in autonomous driving, making accurate and efficient road area segmentation essential for the safe and precise navigation of autonomous vehicles. This paper proposes various models for road segmentation, employing an encoder-decoder architecture for fully automatic segmentation of road areas. As part of the encoder, this work explores different models, such as ResNet50V2, DenseNet121, DenseNet169, and DenseNet201, and utilizes them in one of the few dedicated methods for road area segmentation. Here, the dataset, derived from the Mapillary Vistas Dataset, has been meticulously pre-processed to convert it into a binary segmentation problem for road detection, comprising 8041 training images and 919 validation images with their respective masks. The models were trained on our dataset, achieving the highest Dice coefficient value of 99.61% on the training dataset and 93.85% on the validation dataset using the DenseNet169 encoder model. This research contributes to advancing the state-of-the-art in road segmentation for autonomous driving applications.
... The manual method is a subjective judgment that involves observing the road surface by either walking or using a slow-moving vehicle. It is time consuming and requires significant human intervention [8]. Moreover, in a semi-automated approach, images are collected with a camera mounted on a fast-moving vehicle; however, human effort is still required for damage identification [9]. ...
Article
Road damage detection (RDD) through computer vision and deep learning techniques can ensure the safety of vehicles and humans on the roads. Integrating unmanned aerial vehicles (UAVs) in RDD and infrastructure evaluation (IE) has also emerged as a key enabler, contributing significantly to data acquisition and real-time monitoring of road damages such as potholes, cracks, and surface anomalies, facilitating proactive maintenance and improved road conditions. These UAVs are low-powered and resource-constrained devices that work autonomously to perform pattern detection and decision-making leveraging tiny machine learning (Tiny ML) algorithms. These Tiny ML algorithms are designed to run on edge devices, IoT devices, UAVs, etc. In this study, the RDD2022 dataset collected using UAVs and dashboard cameras of vehicles was utilized to train pure and mixed models that exhibit class instance imbalance in certain classes, which is addressed by implementing data augmentation as a regularization technique. A state-of-the-art two-stage detector (Faster R-CNN ResNet101) and one-stage detectors (SSD MobileNet V1 FPN, YOLOv5, and Efficientdet D1) are employed. The results indicate that the two-stage detector achieved an impressive mAP of 88.49% overall and 96.62% for focused classes. Notably, the state-of-the-art Efficientdet D1 approach achieved a competitive mAP of 86.47% overall and 95.12% for focused classes, with significantly lower computational cost. These findings highlight the potential of advanced object detection techniques, particularly Efficientdet D1, to enhance the accuracy and efficiency of RDD systems, thereby improving passenger safety and overall performance.
... The potential applications in ITS services and road network management, coupled with the practicality of using 2D images, make it a noteworthy advancement in the ongoing efforts to enhance road maintenance and safety. Vinodhini and Sidhaarth [17], Arya et al. [18] presents an innovative approach, where users actively contribute to data collection, is an exciting methodology that taps into the widespread use of smartphones. The study's examination of participatory sensing for gathering information about road irregularities adds a crowd sourced element to the data collection process. ...
Article
Full-text available
Road maintenance poses challenges, particularly in detecting potholes and cracks, and the proposed method using transfer learning and convolutional neural networks (CNNs) is a significant advancement in this domain. Transfer learning is particularly beneficial, as it allows pre-trained models to be leveraged to enhance the performance of the pothole detection system. CNNs, with their ability to capture spatial hierarchies in data, are well-suited for image-based tasks like pothole detection. The potential applications of the suggested method for intelligent transportation systems (ITS) services, such as alerting drivers about real-time potholes, demonstrate our research's practical implications. This contributes to road safety and aligns with the broader goals of innovative city initiatives and infrastructure management. Achieving a 96% accuracy rate is a significant result, indicating the robustness of the proposed approach. Using this information to assess initial maintenance needs in a road management system is forward-thinking. Overall, our work is a valuable contribution to intelligent transportation and infrastructure management, showcasing the potential of advanced machine-learning techniques for addressing critical issues in road maintenance.
Preprint
Full-text available
Maintaining roadway infrastructure is essential for ensuring a safe, efficient, and sustainable transportation system. However, manual data collection for detecting road damage is time-consuming, labor-intensive, and poses safety risks. Recent advancements in artificial intelligence, particularly deep learning, offer a promising solution for automating this process using road images. This paper presents a comprehensive workflow for road damage detection using deep learning models, focusing on optimizations for inference speed while preserving detection accuracy. Specifically, to accommodate hardware limitations, large images are cropped, and lightweight models are utilized. Additionally, an external pothole dataset is incorporated to enhance the detection of this underrepresented damage class. The proposed approach employs multiple model architectures, including a custom YOLOv7 model with Coordinate Attention layers and a Tiny YOLOv7 model, which are trained and combined to maximize detection performance. The models are further reparameterized to optimize inference efficiency. Experimental results demonstrate that the ensemble of the custom YOLOv7 model with three Coordinate Attention layers and the default Tiny YOLOv7 model achieves an F1 score of 0.7027 with an inference speed of 0.0547 seconds per image. The complete pipeline, including data preprocessing, model training, and inference scripts, is publicly available on the project's GitHub repository, enabling reproducibility and facilitating further research.
Article
Road damage seriously affects the comfort and safety of drivers. The detection of road damage is of great importance not only for transportation safety, but also in terms of cost. The detection of road damage is critical for enabling early intervention and repair. In this study, the road damage detection performance of the YOLO (You Only Look Once) v8 algorithm has been evaluated using datasets obtained from different geographies, including Czechia-Türkiye, India-Türkiye, USA-Türkiye, and Japan-Türkiye. The findings revealed both the capabilities of the algorithm in damage detection and the challenges it faced in distinguishing certain types of damage. For the creation of the Türkiye dataset, images of roads in the province of Hatay were recorded. These images were labeled using Microsoft's VoTT application. Comparisons and evaluations have been made among the developed models. Among these models, the Japan-Türkiye model yielded the best results with a 0.55 mAP and 0.54 F1 score. The results of the models indicated that the appearance of damage varies according to the geographical location and the quality of road data. The importance of training on local images and indistinct types of damage has been observed.
Chapter
Road maintenance technology is crucial for safe driving and accident prevention. Traditional methods using sensor-equipped trucks are costly for many local governments. Affordable devices like smartphones can scan road surfaces. Machine learning (ML) and deep learning (DL) models can effectively detect road damages. The authors used various versions of the “You Only Look Once” (YOLO) algorithm (YOLOv3, YOLOv5, YOLOv6, YOLOv7) with the Road Damage Detection 2020 (RDD 2020) dataset. Data augmentation through transformations and generative adversarial network (GAN) enhanced the dataset. DL models were trained using three methods: from scratch, transfer learning, and hyperparameter tuning. YOLOv6, trained with GAN-based augmentation and hyperparameter tuning, achieved the best results: 0.80 Precision, 0.85 Recall, 0.676 mAP@0.5, and 0.438 mAP@0.5:0.95. Dynamic range quantization reduced the model size by 75% without compromising accuracy. This study highlights YOLOv6 with GAN-based augmentation and hyperparameter tuning as a cost-effective road maintenance solution.
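The 75% size reduction reported above is consistent with storing 32-bit floating-point weights as 8-bit integers (1 byte instead of 4). The sketch below illustrates that arithmetic with a simple symmetric per-tensor scheme. It is an assumption-laden illustration, not the authors' pipeline: the function names, the random weights, and the choice of symmetric scaling are all hypothetical.

```python
import random
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8 (illustrative)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clip to the int8 range used here.
    quantized = array(
        "b", (max(-127, min(127, round(w / scale))) for w in weights)
    )
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate float weights."""
    return [v * scale for v in quantized]


random.seed(0)
# Hypothetical weight tensor stored as 32-bit floats.
weights = array("f", (random.uniform(-1.0, 1.0) for _ in range(10_000)))
quantized, scale = quantize_int8(weights)

float_bytes = len(weights) * weights.itemsize      # 4 bytes per weight
int8_bytes = len(quantized) * quantized.itemsize   # 1 byte per weight
size_reduction = 1.0 - int8_bytes / float_bytes    # fraction of storage saved

# Worst-case rounding error is about half a quantization step.
max_error = max(
    abs(w - r) for w, r in zip(weights, dequantize(quantized, scale))
)
print(f"size reduction: {size_reduction:.0%}, max error: {max_error:.5f}")
```

In this toy setting the storage saving comes entirely from the narrower data type, while the rounding error stays bounded by half of `scale`, which is one way to see why accuracy can survive the conversion.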
Article
Firstly, the effect of damages (crack and delamination) on frequency responses of the polymeric composite structures is predicted numerically in this research. The responses are computed numerically using the finite element technique associated with a higher‐order deformation kinematic model. The model accuracy has been verified by comparing the published frequency responses and in‐house experimental data. The verified model is extended to generate the desired data (frequencies) utilizing various input parameters related to the geometrical forms and damage types (shapes, sizes, and positions). Further, different machine learning models (MLMs) are developed using Python algorithms for the identification of structural health. In this regard, the extracted data sets are initially used to train the MLM, detect the damages, and identify types of damage and damage‐related data of polymeric structures. Out of all kinds of MLMs, it is understood that the Random Forest Classifier provides the best result, which had an accuracy of 94.66% with the optimal parameters. The precision accomplished is 97% for intact and 94% for damaged structures. The proposed algorithm is also capable of identifying the damage‐related parameters (shape, size, type, and position) and predicting the defects early to prevent unintended mishaps.
Article
Full-text available
The Internet of Things (IoT) has been extensively deployed for Smart Cities due to its ability to process many different and heterogeneous end systems. The IoT innovation encourages artificial intelligence applications to process data. In Smart City infrastructure, the road is a critical component of transportation infrastructure that supports the economic, social, and cultural aspects of community life. Road conditions affect a variety of community activities. Good roads enhance comfort and support local businesses. However, many roads remain in bad condition, with defects such as potholes. Various methods have been attempted to identify potholes, especially the two-dimensional imaging method. This paper proposes the real-time Artificial Intelligence detection of potholes using the Convolutional Neural Network (CNN), which leverages the Edge Tensor Processing Unit (TPU) with the MobileNet SSD v2. The system was set up on a Jetson Nano with a few extras, including a camera and GPS, to support the IoT infrastructure. Evaluation for the model consists of device implementation, model evaluation, GPS position deviation, and on-road implementation. The effectiveness is confirmed through experiments using a system test-bed that generates an ideal mAP of 0.22 and recall values.
Article
Full-text available
Understanding road conditions is essential for implementing effective road safety measures and driving solutions. Road situations encompass the day-to-day conditions of roads, including the presence of vehicles and pedestrians. Surveillance cameras strategically placed along streets have been instrumental in monitoring road situations and providing valuable information on pedestrians, moving vehicles, and objects within road environments. However, these video data and information are stored in large volumes, making analysis tedious and time-consuming. Deep learning models are increasingly utilized to monitor vehicles and identify and evaluate road and driving comfort situations. However, the current neural network model requires the recognition of situations using time-series video data. In this paper, we introduced a multi-directional detection model for road situations to uphold high accuracy. Deep learning methods often integrate long short-term memory (LSTM) into long-term recurrent network architectures. This approach effectively combines recurrent neural networks to capture temporal dependencies and convolutional neural networks (CNNs) to extract features from extensive video data. In our proposed method, we form a multi-directional long-term recurrent convolutional network approach with two groups equipped with CNN and two layers of LSTM. Additionally, we compare road situation recognition using convolutional neural networks, long short-term networks, and long-term recurrent convolutional networks. The paper presents a method for detecting and recognizing multi-directional road contexts using a modified LRCN. After balancing the dataset through data augmentation, the number of video files increased, resulting in our model achieving 91% accuracy, a significant improvement from the original dataset.
Article
Road damage presents a significant risk to traffic safety, including pavement distress and disaster-induced damage. Thanks to their high efficiency, computer vision-based methods for pavement distress detection have been widely developed. In disaster scenarios, the automatic extraction of road damage information from extensive social media images plays a critical role in rescue efforts. However, few existing studies have focused on detecting object-level disaster-induced road damage. To fill the gap, this paper presents a Social media image dataset of Object detection for Disaster-induced Road damage (SODR), including 1,552 images and two categories (i.e., collapses and blockages). Additionally, this paper proposes an ensemble learning approach with attention mechanisms based on YOLOv5 (You Only Look Once) network. Initially, attention modules are employed to create two distinct detectors for ensemble learning. Subsequently, one standard YOLOv5 and two variant networks are trained with consistent settings, and test time augmentation is applied during the inference phase. The proposed method has been implemented across five scales of YOLOv5, offering alternatives for balancing accuracy and computational cost. To demonstrate the validity, comprehensive experiments were conducted on two datasets. Compared with some mainstream detectors and ensemble learning methods, our approach achieved competitive results with a fewer number of parameters and a simpler training and testing process. The SODR dataset and source code are available at ( https://github.com/nonondayo/yolov5_SODRv1 ).
Article
Road damage detection (RDD) based on front-view images of roads is more in line with practical application scenarios and is suitable for automatic road damage detection systems. The road damage objects in the front-view images have the characteristics of complex background, multi-scale and large aspect ratio, which greatly increase the difficulty of detection. We propose an anchor-free road damage detection model YOLOX-RDD for front-view images. YOLOX is used as the basic network and three optimization strategies are implemented according to the characteristics of road damage objects. The refined switchable atrous convolution (RSAC) is used to adaptively adjust the receptive field according to the size of the object, which can satisfy the requirements of the detection of the damages of multi-scale and large aspect ratio. For unobvious road damage detection in complex background, four feature enhancement attention (FEA) modules are added to the network to extract more salient information and enhance the fusion effect. Two-level adaptive spatial feature fusion (ASFF) is performed by fusing dark2 with the three output feature maps of neck respectively, and the optimal fusion weights are learned through training to further improve the detection capability of multi-scale objects. The experiments on CNRDD, RDD2020 and USRDD datasets demonstrate the effectiveness and high generalization of our method. Compared with the baseline model, the mAP@0.5 can be improved by up to 2.78%, and F1-score can be improved by up to 2.55%. The FPS can reach up to 90, achieving a balance between detection accuracy and speed.
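For readers unfamiliar with the mAP@0.5 notation used in several of these abstracts: the 0.5 is the intersection-over-union (IoU) threshold at which a predicted box counts as matching a ground-truth box. A minimal sketch of that matching test follows. The box coordinates and variable names are made up for illustration and are not taken from any of the cited datasets.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth damage box and two candidate predictions.
ground_truth = (0.0, 0.0, 10.0, 10.0)
close_pred = (2.0, 0.0, 12.0, 10.0)   # overlaps 80 of 120 units of union
loose_pred = (5.0, 0.0, 15.0, 10.0)   # overlaps 50 of 150 units of union

close_iou = box_iou(ground_truth, close_pred)
loose_iou = box_iou(ground_truth, loose_pred)

# At the 0.5 threshold only the first prediction is a true positive.
print(close_iou >= 0.5, loose_iou >= 0.5)
```

Precision, recall, and F1 are then computed from the counts of matched and unmatched boxes, so a stricter threshold (as in mAP@0.5:0.95) generally lowers the reported score.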
Article
To address two key challenges, namely limited grid-level detection capability and difficulty in detecting pavement cracks in complex environments, this study proposes a novel neural network model called CrackcellNet. This innovative model incorporates an output structure that enables end-to-end grid recognition and a module that enhances shadow image data to enhance crack detection. The model relies on the design of consecutive pooling layers to achieve adaptive target size grid output. By utilizing image fusion techniques, it enhances the quantity of shadow data in road surface detection. The results of ablation experiments indicate that the optimal configuration for CrackcellNet includes V-Block and shadow augmentation operations, dilation rates of 1 or 2, and a convolutional layer in the CBA module. Through extensive experimentation, we have demonstrated that our model achieved an accuracy rate of 94.5% for grid-level crack detection and an F1 value of 0.839. Furthermore, practical engineering validation confirms the model’s efficacy with an average PCIe of 0.045, providing valuable guidance for road maintenance decisions.
Article
Advanced driver assistance systems (ADASs) rely heavily on the resilient spatial perception of the surrounding environment of vehicles enabled by sensor systems, such as radio detection and ranging (RADAR), light detection and ranging (LIDAR), and cameras. The electric/electronic (E/E) architecture of the vehicle must ensure that the sensor systems and data gathering and processing for the perception operate under all circumstances. However, certain factors can affect the resilience of the sensor system and the supporting E/E architecture, such as internal software failures and external weather conditions. This paper provides a comprehensive review of the latest developments in automotive sensor systems and architectures, highlighting potential disturbances and presenting corresponding countermeasures for hardware and software failures. We also identify challenging applications and assess the requirements for resilience to make these applications feasible under all circumstances. Based on this review, we propose a comprehensive framework for developing resilient sensor systems and architectures that can serve as a basis for further research towards fully automated driving.
Article
Segmentation of road negative obstacles (i.e., potholes and cracks) is important to the safety of autonomous driving. Although existing RGB-D fusion networks could achieve acceptable performance, most of them only conduct binary segmentation for negative obstacles, which does not distinguish potholes and cracks. Moreover, their performance is susceptible to depth noises, in which case the fluctuations of depth data caused by the noises may make the networks mistakenly treat the area as a negative obstacle. To provide a solution to the above issues, we design a novel RGB-D semantic segmentation network with dual semantic-feature complementary fusion for road negative obstacle segmentation. We also re-label an RGB-D dataset for this task, which distinguishes road potholes and cracks as two different classes. Experimental results show that our network achieves state-of-the-art performance compared to existing well-known networks.
Article
Deep learning is widely used for road damage detection, but it requires extensive, diverse, and well‐labeled data. Centralized model training can be difficult due to large data transfers, storage needs, and computational resources. Data privacy concerns can also hinder data sharing among clients, leaving them to train models on their own data, leading to less robust models. Federated learning (FL) addresses these problems by training models without data sharing, only exchanging model parameters between clients and the server. This study deploys FL along with YOLOv5l to generate models for single‐ and multi‐country applications. These models yielded 21%–25% lower mean average precision (mAP) than centralized models but outperformed local client models by 1.33%–163% on the global test data.
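The parameter exchange described above can be sketched as a single aggregation round. The abstract does not say which aggregation rule the study used, so the sample-weighted averaging below (in the style of federated averaging) is an assumption, and the function name, parameter vectors, and client sizes are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Combine client parameter vectors, weighting each by its sample count.

    Only parameters and sample counts reach the server; no images are shared.
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes))
        / total
        for i in range(n_params)
    ]


# Hypothetical two-parameter models trained locally by three clients,
# for example one client per country.
client_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [100, 100, 200]  # number of local training images per client

global_params = federated_average(client_params, client_sizes)
print(global_params)  # [3.5, 4.5]
```

In a full training loop the server would send `global_params` back to every client, each client would continue training on its own data, and the round would repeat. The client with more images pulls the average toward its own parameters, which is one reason a federated model can differ from both the centralized and the purely local models.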
Article
It is known that road pavements are damaged due to time, climatic conditions and construction errors. Considering these damages, the most important road defect that reduces road safety and comfort is potholes. Especially as the width and depth of the pothole increases, driving safety is also endangered. In addition, the locations of these potholes, especially on urban roads, are determined manually in many regions. This process causes delays in the maintenance and repair of the potholes. To this end, the authors plan an in-vehicle integrated system consisting of multiple stages to automatically detect potholes occurring in the road network. The main purpose of the planned system is to identify potholes with high accuracy. However, the effect of vehicle speed on pothole detection in this system is unknown. In order to investigate this, real-time video recordings were made on the same road and pothole at different vehicle speeds. Then, the pothole detection process was carried out on these videos with the single-stage detectors YOLOv7 and YOLOv8. When the results obtained were examined, no exact relationship could be determined between vehicle speed and pothole detection. This situation may vary according to various parameters such as camera angle, image quality, and sunlight conditions. In addition, when both models are compared according to the performance criteria, YOLOv7 has a partial superiority over YOLOv8 in mAP0.5, precision, recall and F1 score values. It is especially significant that these criteria are close to 1. Finally, the detection results obtained from the images extracted from the video showed that there was no overfitting in the models.
Article
Full-text available
This data article provides details for the RDD2020 dataset comprising 26336 road images from India, Japan, and the Czech Republic with more than 31000 instances of road damage. The dataset captures four types of road damage: longitudinal cracks, transverse cracks, alligator cracks, and potholes; and is intended for developing deep learning-based methods to detect and classify road damage automatically. The images in RDD2020 were captured using vehicle-mounted smartphones, making it useful for municipalities and road agencies to develop methods for low-cost monitoring of road pavement surface conditions. Further, the machine learning researchers can use the datasets for benchmarking the performance of different algorithms for solving other problems of the same type (classification, object detection, etc.). RDD2020 is freely available at [1]. The latest updates and the corresponding articles related to the dataset can be accessed at [2].
Article
Full-text available
Proper maintenance of roads is an extremely complex task and also an important issue all over the world. One of the most critical road monitoring and maintenance activities is the detection of road anomalies such as potholes. Identification of potholes is necessary to avoid road accidents, prevent damage to vehicles, enhance travelling comfort, etc. Although maintenance of roads has been considered a serious issue by the authorities over the years, lack of proper detection and mapping of road potholes makes the issue more severe. To overcome this problem, an end-to-end system called PotSpot is built for real-time detection, monitoring, and spatial mapping of potholes across the city. A Convolutional Neural Network (CNN) model is proposed and evaluated on a real-world dataset for pothole detection. Additionally, real-time pothole-marked maps are generated with the help of Google Maps API (Application Programming Interface). To provide an end-to-end service through this system, both the pothole detection and pothole mapping are integrated through an Android application. The proposed model is also compared with six baselines, namely Artificial Neural Network (ANN), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and three pre-trained CNN models (InceptionV3, VGG19 and VGG16), in terms of performance metrics to verify its effectiveness. The proposed model achieves better accuracy (≈ 97.6 %) as compared to the above-mentioned baseline methods. It is also observed that the Area Under the Curve (AUC) value for the proposed pothole detection model (AUC = 0.97) is higher than the baseline methods.
Conference Paper
Full-text available
Information about road damages is of great interest for federal road authorities and their infrastructure management as well as the automated driving task and thus safety and comfort of vehicle occupants. Therefore, the investigation of the automatic detection of different types of road damages by images from a front-facing camera in the vehicle is of utmost importance. Here we show a novel deep learning approach that utilizes the pre-trained Faster Region Based Convolutional Neural Networks (R-CNN). The data basis of our work is provided by the 'IEEE BigData Cup Challenge' and its dataset 'RDD-2020' with a large number of labelled images from Japan, India and the Czech Republic. In the first step, we classify the region of origin of the image, followed by expert networks for each region. Alongside the explanation of our applied deep learning methodology, some remaining sources of errors are discussed and further, partly failed approaches during our development period are presented, which could be of interest for future work. Our results are convincing and we are able to achieve an F1 score of 0.487 across all regions for longitudinal and lateral cracks, alligator cracks and potholes.
Conference Paper
Full-text available
Deep learning-based technology is a good key to unlock the object detection tasks in our real world. By using deep neural networks, we could tackle a problem that is dangerous and very time-consuming but has to be done every day, like detecting the road state. This paper describes the solution using YOLO to detect the various types of road damage in the IEEE BigData Cup Challenge 2020. Our YOLOv5x-based solution is lightweight and fast, yet it retains good accuracy. We achieved an F1 score of 0.58 using our ensemble model with TTA, and it could be an adequate candidate for detecting real road damage in real-time.
Article
Full-text available
A variety of civil engineering applications require the identification of cracks in roads and buildings. In such cases, it is frequently helpful for the precise location of cracks to be identified as labelled parts within an image to facilitate precision repair for example. CrackIT is known as a crack detection algorithm that allows a user to choose between a block-based or a pixel-based approach. The block-based approach is noise-tolerant but is not accurate in edge localization while the pixel-based approach gives accurate edge localisation but is not noise-tolerant. We propose a new approach that combines both techniques and retains the advantages of each. The new method is evaluated on three standard crack image datasets. The method was compared with the CrackIT method and three deep learning methods namely, HED, RCF and the FPHB. The new approach outperformed the existing arts and reduced the discretisation errors significantly while still being noise-tolerant.
Conference Paper
Full-text available
Pavement condition evaluation is essential to time the preventative or rehabilitative actions and control distress propagation. Failing to conduct timely evaluations can lead to severe structural and financial loss of the infrastructure and complete reconstructions. Automated computer-aided surveying measures can provide a database of road damage patterns and their locations. This database can be utilized for timely road repairs to gain the minimum cost of maintenance and the asphalt's maximum durability. This paper introduces a deep learning-based surveying scheme to analyze the image-based distress data in real-time. A database consisting of a diverse population of crack distress types such as longitudinal, transverse, and alligator cracks, photographed using a mobile device, is used. Then, a family of efficient and scalable models that are tuned for pavement crack detection is trained. The proposed models resulted in F1-scores ranging from 52% to 56%, and average inference speeds ranging from 178 to 10 images per second. Finally, the performance of the object detectors is examined, and error analysis is reported against various images. The source code is available at https://github.com/mahdi65/roadDamageDetection2020.
Article
Full-text available
Machine learning can produce promising results when sufficient training data are available; however, infrastructure inspections typically do not provide sufficient training data for road damage. Given the differences in the environment, the type of road damage and the degree of its progress can vary from structure to structure. The use of generative models, such as a generative adversarial network (GAN) or a variational autoencoder, makes it possible to generate a pseudoimage that cannot be distinguished from a real one. Combining a progressive growing GAN along with Poisson blending artificially generates road damage images that can be used as new training data to improve the accuracy of road damage detection. The addition of a synthesized road damage image to the training data improves the F‐measure by 5% and 2% when the number of original images is small and relatively large, respectively. All of the results and the new Road Damage Dataset 2019 are publicly available (https://github.com/sekilab/RoadDamageDetector).
Article
Full-text available
Automatic crack detection on pavement surfaces is an important research field in the scope of developing an intelligent transportation infrastructure system. In this paper, a cost-effective solution for road crack inspection by mounting the commercial grade sport camera, GoPro, on the rear of the moving vehicle is introduced. Also, a novel method called ConnCrack combining conditional Wasserstein generative adversarial network and connectivity maps is proposed for road crack detection. In this method, a 121-layer densely connected neural network with deconvolution layers for multi-level feature fusion is used as generator, and a 5-layer fully convolutional network is used as discriminator. To overcome the scattered output issue related to deconvolution layers, connectivity maps are introduced to represent the crack information within the proposed ConnCrack. The proposed method is tested on a publicly available dataset as well as our collected data. The results show that the proposed method achieves state-of-the-art performance compared with other existing methods in terms of precision, recall and F1 score.
Article
Full-text available
Governments are faced with countless challenges to maintain conditions of road networks. This is due to financial and physical resource deficiencies of road authorities. Therefore, low-cost automated systems are sought after to alleviate these issues and deliver adequate road conditions for citizens. There have been several attempts at creating such systems and integrating them within Pavement management systems. This paper utilizes replicable deep learning techniques to carry out hotspot analyses on urban road networks highlighting important pavement distress types and associated severities. Following this, analyses were performed illustrating how the hotspot analysis can be carried out to continuously monitor the structural health of the pavement network. The methodology is applied to a road network in Sicily, Italy where there are numerous roads in need of rehabilitation and repair. Damage detection models were created which accurately highlight the location and a severity assessment. Harmonized distress categories, based on industry standards, are utilized to create practical workflows. This creates a pipeline for future applications of automated pavement distress classification and a platform for an integrated approach towards optimizing urban pavement management systems.
Preprint
Full-text available
Automated pavement distress detection using road images remains a challenging topic in the computer vision research community. Recent developments in deep learning have led to considerable research activity directed towards improving the efficacy of automated pavement distress identification and rating. Deep learning models require a large ground truth data set, which is often not readily available in the case of pavements. In this study, a labeled dataset approach is introduced as a first step towards a more robust, easy-to-deploy pavement condition assessment system. The technique is termed herein as the Pavement Image Dataset (PID) method. The dataset consists of images captured from two camera views of an identical pavement segment, i.e., a wide-view and a top-down view. The wide-view images were used to classify the distresses and to train the deep learning frameworks, while the top-down view images allowed calculation of distress density, which will be used in future studies aimed at automated pavement rating. For the wide view group dataset, 7,237 images were manually annotated and distresses classified into nine categories. Images were extracted using the Google Application Programming Interface (API), selecting street-view images using a Python-based code developed for this project. The new dataset was evaluated using two mainstream deep learning frameworks: You Only Look Once (YOLO v2) and Faster Region Convolution Neural Network (Faster R-CNN). Accuracy scores using the F1 index were found to be 0.84 for YOLOv2 and 0.65 for the Faster R-CNN model runs; both quite acceptable considering the convenience of utilizing Google maps images.
Article
Full-text available
Pavement crack detection is a critical task for ensuring road safety. Manual crack detection is extremely time-consuming. Therefore, an automatic road crack detection method is required to speed up this process. However, it remains a challenging task due to the intensity inhomogeneity of cracks and complexity of the background, e.g., the low contrast with surrounding pavements and possible shadows with a similar intensity. Inspired by recent advances of deep learning in computer vision, we propose a novel network architecture, named feature pyramid and hierarchical boosting network (FPHBN), for pavement crack detection. The proposed network integrates context information to low-level features for crack detection in a feature pyramid way, and it balances the contributions of both easy and hard samples to loss by nested sample reweighting in a hierarchical way during training. In addition, we propose a novel measurement for crack detection named average intersection over union (AIU). To demonstrate the superiority and generalizability of the proposed method, we evaluate it on five crack datasets and compare it with the state-of-the-art crack detection, edge detection, and semantic segmentation methods. The extensive experiments show that the proposed method outperforms these methods in terms of accuracy and generalizability. Code and data can be found in https://github.com/fyangneil/pavement-crack-detection.
Article
SDNET2018 is an annotated image dataset for training, validation, and benchmarking of artificial intelligence based crack detection algorithms for concrete. SDNET2018 contains over 56,000 images of cracked and non-cracked concrete bridge decks, walls, and pavements. The dataset includes cracks as narrow as 0.06 mm and as wide as 25 mm. The dataset also includes images with a variety of obstructions, including shadows, surface roughness, scaling, edges, holes, and background debris. SDNET2018 will be useful for the continued development of concrete crack detection algorithms based on deep convolutional neural networks (DCNNs), which are a subject of continued research in the field of structural health monitoring. The authors present benchmark results for crack detection using SDNET2018 and a crack detection algorithm based on the AlexNet DCNN architecture. SDNET2018 is freely available at https://doi.org/10.15142/T3TD19.
Article
Edge detection is a fundamental problem in computer vision. Recently, convolutional neural networks (CNNs) have pushed forward this field significantly. Existing methods which adopt specific layers of deep CNNs may fail to capture complex data structures caused by variations of scales and aspect ratios. In this paper, we propose an accurate edge detector using richer convolutional features (RCF). RCF encapsulates all convolutional features into a more discriminative representation, which makes good use of rich feature hierarchies and is amenable to training via backpropagation. RCF fully exploits multiscale and multilevel information of objects to perform the image-to-image prediction holistically. Using the VGG16 network, we achieve state-of-the-art performance on several available datasets. When evaluated on the well-known BSDS500 benchmark, we achieve an ODS F-measure of 0.811 while retaining a fast speed (8 FPS). Besides, our fast version of RCF achieves an ODS F-measure of 0.806 at 30 FPS. We also demonstrate the versatility of the proposed method by applying RCF edges to classical image segmentation.
Article
Over the past few years, several countries, including Spain, have been experiencing a period of economic recession. As a result, these governments have reduced their budgets for transport infrastructures (both construction and maintenance operations). The main objective of this study is to analyze whether these budget reductions have an effect on increased accident rates and to perform an assessment of their real economic benefit. Thus, we analyze whether significant changes over recent years are perceptible in the road safety indexes in Spain, in terms of risk, accident fatality, and accident severity. The relation between lower budgets and higher road safety indices is analyzed through linear regression techniques. The results show a strong relation between the Risk Index and the maintenance budget, measured as an average of the last years. In addition, a final economic assessment demonstrates that this reduction in investment had no real economic benefits, especially as the costs of the accidents exceeded the savings in the conservation plans.
Article
Research on damage detection of road surfaces using image processing techniques has been actively conducted. This study makes three contributions to address road damage detection issues. First, to the best of our knowledge, for the first time, a large‐scale road damage data set is prepared, comprising 9,053 road damage images captured using a smartphone installed on a car, with 15,435 instances of road surface damage included in these road images. Next, we used state‐of‐the‐art object detection methods based on convolutional neural networks to train the damage detection model with our data set, and compared the accuracy and runtime speed on both a GPU server and a smartphone. Finally, we demonstrate that the type of damage can be classified into eight types with high accuracy by applying the proposed object detection method. The road damage data set, our experimental results, and the developed smartphone application used in this study are publicly available (https://github.com/sekilab/RoadDamageDetector/).
Article
Automated pavement distress detection based on 2D images is facing various challenges. To efficiently complete the crack and pothole segmentation in a practical environment, an automated pixel-level pavement distress detection framework integrating stereo vision and deep learning is developed in this study. Based on the multi-view stereo imaging system, multi-feature pavement image datasets containing color images, depth images and color-depth overlapped images are established, providing a new perspective for deep learning. To alleviate computational burden, a modified U-net deep learning architecture introducing depthwise separable convolution is proposed for crack and pothole segmentation. These methods are tested in asphalt roads with different circumstances. The results show that the 3D pavement image achieves millimeter-level accuracy. The enhanced 3D crack segmentation model outperforms other models in terms of segmentation accuracy and inference speed. After obtaining the high-resolution pothole segmentation map, the automated pothole volume measurement is realized with high accuracy.
Conference Paper
Fast and accurate road damage detection is essential for the automatization of road inspection. This paper describes our solution submitted to the Global Road Damage Detection Challenge of the 2020 IEEE International Conference on Big Data, for typical road damage detection in digital images based on deep learning. The recently proposed YOLOv4 is chosen as the baseline network, while the effects of data augmentation, transfer learning, Optimized Anchors, and their combination are evaluated. We propose a novel road damage data generation method based on a generative adversarial network, which can generate multi-class samples with a single model. The evaluation results demonstrate the effectiveness of different tricks and their combinations on the road damage detection task, which provides a reference for practical application. The code of our solution is available at https://github.com/ZhangXG001/RoadDamgeDetection.git.
Conference Paper
This paper summarizes the Global Road Damage Detection Challenge (GRDDC), a Big Data Cup organized as a part of the IEEE International Conference on Big Data’2020. The Big Data Cup challenges involve a released dataset and a well-defined problem with clear evaluation metrics. The challenges run on a data competition platform that maintains a leaderboard for the participants. In the presented case, the data comprise 26,336 road images collected from India, Japan, and the Czech Republic, and the task is to propose methods for automatically detecting road damage in these countries. In total, 121 teams from several countries registered for this competition. The submitted solutions were evaluated using two datasets, test1 and test2, comprising 2,631 and 2,664 images, respectively. This paper encapsulates the top 12 solutions proposed by these teams. The best performing model utilizes YOLO-based ensemble learning to yield an F1 score of 0.67 on test1 and 0.66 on test2. The paper concludes with a review of the facets that worked well for the presented challenge and those that could be improved in future challenges.
Article
Routine visual inspection is essential to maintain adequate safety and serviceability of civil infrastructures. Computer vision and machine learning based software techniques are becoming recognized methods that can potentially help inspectors analyze the physical and functional condition of infrastructures from images and/or videos of the region of interest. More recently, deep learning approaches have been shown to be robust in identifying damage; yet these methods require a large amount of precisely labeled training data to reach an accuracy that complements the visual assessment of inspectors. Especially in image segmentation operations, in which damage is separated from the image background for further analysis, there is a strong need to localize the damaged region prior to the segmentation operation. However, available segmentation methods mostly focus on the latter step (i.e., delineation), and mis-localization of damaged regions causes accuracy drops. Inspired by the human cognitive system, which recognizes objects more simply and efficiently than machine learning algorithms, whereas those algorithms are superior to humans in local tasks, this paper describes a novel method to dramatically improve the accuracy of damage quantification (detection + segmentation) using an attention-guided technique. In the proposed method, a fast object detection model, Single Shot Detector (SSD) trained on the VGG-16 base classifier architecture, performs real-time crack and spall detection while working interactively with the human inspector to ensure that the region of interest is recognized correctly. Upon the inspector's verification, which happens in real time, the detected damage region is used for damage segmentation for further analysis. This initial region of interest selection drastically lowers the computational cost and the required amount of training data, and reduces the number of outliers. For optimal performance, a modified version of the SegNet architecture was used for damage segmentation. Based on various performance criteria, the proposed attention-guided infrastructure damage analysis technique provides 30% more precision with a very minor sacrifice in computational speed compared to analysis without the attention guide.
Article
The condition of the road surface should be inspected to increase the service life of the road and to ensure safety and comfort. This study aims to automatically detect and measure road distress from unmanned aerial vehicle (UAV)-based images. The proposed methodology consists of three steps. First, images acquired from the UAV are used to generate the three-dimensional point cloud. Then, the road surface is extracted from the 3D point cloud. Finally, the developed algorithm is used to automatically detect and measure road distress. The accuracy assessment is conducted by comparing the analyses from point cloud data with measurements obtained from the traditional inspection method. The root mean square error values range from 2.09 to 6.72 cm. The outcomes of the proposed methodology are also compared with those of commercial GIS software. Both produce statistically similar results for detecting road surface distress.
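The accuracy assessment above rests on a root mean square error between point-cloud estimates and reference field measurements. As a minimal sketch of that comparison (the function name `rmse` and the sample values are hypothetical and are not taken from the study):

```python
import math


def rmse(estimates, references):
    """Root mean square error between paired measurements (same units)."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical distress dimensions in cm: point-cloud estimate vs. field survey.
uav_cm = [12.0, 30.0, 45.0, 20.0]
field_cm = [10.0, 33.0, 41.0, 21.0]

print(f"RMSE = {rmse(uav_cm, field_cm):.2f} cm")
```

An RMSE is reported in the units of the measurements themselves, which is why the abstract can state it directly in centimetres.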
Article
The detection and classification of pavement distress (PD) play a critical role in pavement maintenance and rehabilitation. Research on automated PD detection and measurement has been actively conducted. However, road managers also need to know the type of PD in order to take effective action. Moreover, the lack of a unified PD dataset means there is no benchmark for comparing methods. This study makes three contributions to address these issues. Firstly, a large-scale PD dataset is prepared. This dataset is composed of 45,788 images captured with a high-resolution industrial camera installed on vehicles, in a variety of weather and illuminance conditions. Each image is annotated with bounding boxes representing the location and type of distress. Secondly, a deep learning-based object detection framework, the YOLO network, is adopted to predict possible distress location and category. Comprehensive detection accuracy reaches 73.64%. The processing time is 0.0347 s per image, 9 times faster than Faster R-CNN and only 70% of that of SSD. Finally, the applicability of the model under various illumination conditions is also explored. The results reveal that the method performs significantly better under appropriate illumination. We conclude that the proposed YOLO-based approach is able to detect PD with high accuracy and requires no manual feature extraction or calculation during detection.
Chapter
Damage detection of road surfaces has been an active area of research, but most studies so far have focused on detecting the presence of damage. However, in real-world scenarios, road managers need to clearly understand the type of damage and its extent in order to take effective action in advance or to allocate the necessary resources. Moreover, there are currently few uniform and openly available road damage datasets, leading to a lack of a common benchmark for road damage detection. Such a dataset could be used in a great variety of applications; herein, it is intended to serve as the acquisition component of a physical asset management tool that can aid government agencies or infrastructure maintenance companies for planning purposes. In this paper, we make two contributions to address these issues. First, we present a large-scale road damage dataset, which includes a more balanced and representative set of damages. This dataset is composed of 18,034 road damage images captured with a smartphone, with 45,435 instances of road surface damage. Second, we trained different types of object detection methods, both traditional (an LBP-cascaded classifier) and deep learning-based, specifically MobileNet and RetinaNet, which are amenable to embedded and mobile implementations. We compare the accuracy and inference time of all these models with others in the state of the art.
Article
Accurate quantitative information on crack length and width is important for assessing the severity level of cracks and making accurate pavement maintenance decisions. Nevertheless, due to the noise and fluctuation at the crack edge in pavement images, it is difficult to precisely determine the edge correspondence of a crack, which is necessary for measuring the crack width at a specific pixel. Different automated or manually performed measurements are likely to yield different results at the same pixel. Instead of measuring the crack width on a pixel-by-pixel basis, this paper presents an accurate and robust segment-based method for measuring crack width. Firstly, based on the distinctive curved structure of pavement cracks, a structured edge detector is trained to obtain the confidence map of crack edges. Secondly, with the crack edge map, a morphological operation is used to extract the crack skeleton, which characterizes the propagation of the crack. Adaptive segmentation of the crack skeleton is performed to partition a crack curve into crack segments. Each crack segment has the same width because its edges become almost parallel after the segmentation. Finally, combining the structured edge confidence and grayscale contrast at crack edges, an enhanced edge map of the crack is proposed to measure the width of each crack segment by translating the skeleton towards both edges. A large number of experiments on synthetic and real-world pavement images demonstrate that the proposed method can accurately and robustly quantify various cracks, with an average accuracy of 93.7% for crack width. It is promising for quantitative pavement condition assessment and maintenance.
Article
Modern smartphones have a large variety of built-in sensors that can measure different information about users and the environment around them. Given the increasing popularity of these devices, their high processing power, and the ability to transfer data over wireless networks, different smartphone-based applications have emerged in recent years to solve old problems with new approaches more efficiently and cheaply. One example is the assessment and monitoring of asphalt quality. This task has been done manually by experts since the 1930s, and with the help of expensive equipment since the 1960s. Currently, we are experiencing the emergence of next-generation tools to perform this monitoring with smartphones, significantly reducing costs, time, and effort of experts. However, there is a trade-off between the costs and precision of smartphone sensors, requiring the use of sophisticated software solutions. In this paper, we propose Asfault, a low-cost system to evaluate and monitor road pavement conditions in real time using smartphone sensors and machine learning algorithms. The system is composed of an Android application responsible for doing automatic evaluations and a web application that aims to show the evaluations in an informative way. We propose to employ accelerometer sensors to measure the vehicle vibration while driving and use this data to evaluate the pavement conditions. Asfault achieves a classification performance above 90% in a 5-class problem considering the following road qualities: Good, Average, Fair, and Poor, as well as the occurrence of obstacles in the road. Our system is publicly available for use and could be useful for practitioners responsible for urban and highway maintenance, as well as for regular drivers in planning better routes based on pavement quality and travel comfort.
Article
Crack detection is a crucial task in periodic pavement survey. This study establishes and compares the performance of two intelligent approaches for automatic recognition of pavement cracks. The first model relies on the edge detection approaches of the Sobel and Canny algorithms. Since the implementation of the two edge detectors requires the setting of threshold values, Differential Flower Pollination, a metaheuristic, is employed to fine-tune the model parameters. The second model is constructed by implementing a Convolution Neural Network (CNN), a deep learning algorithm. CNN has the advantage of performing the feature extraction and the prediction of crack/non-crack condition in an integrated and fully automated manner. Experimental results show that the model based on CNN achieves a good prediction performance, with a Classification Accuracy Rate (CAR) of 92.08%. This performance is significantly better than that of the method based on the edge detection algorithms (CAR = 79.99%). Accordingly, the proposed CNN-based crack detection model is a promising alternative to support transportation agencies in the task of periodic pavement inspection.
Article
In general, potholes on asphalt pavements can be detected and represented in 2D and 3D. However, pothole detection through 3D imaging and image reconstruction has proven to be expensive in terms of acquisition equipment, computational and processing requirements, and time. For potholes at incipient formation, their detection, representation and quantification in terms of surface area are important for timely maintenance and repairs. By casting pavement image segmentation for pothole detection as a problem of clustering multivariate features within mixed pixels (mixels), this study presents a low-cost 2D vision image-based approach for the detection of potholes on asphalt road pavements in urban areas. The approach is based on the a priori integration of multiscale texture-based image filtering for texton representation using the wavelet transform into the superpixel clustering of pavement defects and non-defects using the fuzzy c-means (FCM) algorithm. For the extraction of the defect extrema (minima and maxima) in the hybrid wavelet-FCM clustering results, fine segmentation based on morphological reconstruction is adopted to further smooth and recognize the contour of the detected potholes. The methodology is implemented in a MATLAB prototype, tested and validated using 75 experimental image datasets. With a mean CPU run-time of 95 seconds, the average detection accuracies, obtained by comparing the study results with the manually segmented ground-truth data, were determined using the Dice coefficient of similarity, Jaccard Index and sensitivity metric as 87.5%, 77.7% and 97.6% respectively. The average magnitudes of the mean and standard deviation of the percentage errors in pothole size extraction were found to be 8.5% and 4.9% respectively. The results of the study show that with well-planned road condition surveys, the proposed algorithm is suitable for the detection and extraction of incipient potholes from 2D vision images acquired using low-cost consumer-grade imaging sensors.
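The three overlap measures named above (Dice coefficient, Jaccard index, sensitivity) can all be derived from the same pixel counts. The following is a minimal sketch using their standard definitions on flattened binary masks; the function name `overlap_metrics` and the toy masks are illustrative assumptions and do not come from the study, which used a MATLAB prototype.

```python
def overlap_metrics(pred, truth):
    """Return (dice, jaccard, sensitivity) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # defect in both
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # false alarm
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)  # missed defect
    union = tp + fp + fn
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    jaccard = tp / union if union else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, jaccard, sensitivity


# Toy 8-pixel masks (1 = pothole pixel), purely illustrative.
pred = [1, 1, 1, 1, 0, 0, 0, 0]
truth = [0, 1, 1, 1, 1, 0, 0, 0]

dice, jaccard, sensitivity = overlap_metrics(pred, truth)
print(dice, jaccard, sensitivity)  # 0.75 0.6 0.75
```

For a single pair of masks the two overlap scores are tied by J = D / (2 - D), so a Dice value of 0.875 would correspond to a Jaccard value of about 0.778. The reported 77.7% is an average over images, so it need not match that figure exactly.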
Conference Paper
Can a large convolutional neural network trained for whole-image classification on ImageNet be coaxed into detecting objects in PASCAL? We show that the answer is yes, and that the resulting system is simple, scalable, and boosts mean average precision, relative to the venerable deformable part model, by more than 40% (achieving a final mAP of 48% on VOC 2007). Our framework combines powerful computer vision techniques for generating bottom-up region proposals with recent advances in learning high-capacity convolutional neural networks. We call the resulting system R-CNN: Regions with CNN features. The same framework is also competitive with state-of-the-art semantic segmentation methods, demonstrating its flexibility. Beyond these results, we execute a battery of experiments that provide insight into what the network learns to represent, revealing a rich hierarchy of discriminative and often semantically meaningful features.
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
Conference Paper
State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [7] and Fast R-CNN [5] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully-convolutional network that simultaneously predicts object bounds and objectness scores at each position. RPNs are trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. With a simple alternating optimization, RPN and Fast R-CNN can be trained to share convolutional features. For the very deep VGG-16 model [18], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007 (73.2% mAP) and 2012 (70.4% mAP) using 300 proposals per image. The code will be released.
Article
CrackNet, an efficient architecture based on the Convolutional Neural Network (CNN), is proposed in this article for automated pavement crack detection on 3D asphalt surfaces with the explicit objective of pixel-perfect accuracy. Unlike the commonly used CNN, CrackNet does not have any pooling layers, which downsize the outputs of previous layers. CrackNet fundamentally ensures pixel-perfect accuracy using the newly developed technique of invariant image width and height through all layers. CrackNet consists of five layers and includes more than one million parameters that are trained in the learning process. The input data of CrackNet are feature maps generated by the feature extractor using the proposed line filters with various orientations, widths, and lengths. The output of CrackNet is the set of predicted class scores for all pixels. The hidden layers of CrackNet are convolutional layers and fully connected layers. CrackNet is trained with 1,800 3D pavement images and is then demonstrated to be successful in detecting cracks under various conditions using another set of 200 3D pavement images. The experiment using the 200 testing 3D images showed that CrackNet can achieve high Precision (90.13%), Recall (87.63%) and F-measure (88.86%) simultaneously. Compared with recently developed crack detection methods based on traditional machine learning and imaging algorithms, CrackNet significantly outperforms the traditional approaches in terms of F-measure. Using parallel computing techniques, CrackNet is programmed to be efficiently used in conjunction with the data collection software.
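The F-measure quoted above is consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). A short check, assuming that standard definition (the helper name `f_measure` is illustrative):

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
score = f_measure(90.13, 87.63)
print(f"{score:.2f}")  # 88.86
```

Because the harmonic mean is pulled towards the smaller of its two inputs, the F-measure falls slightly below the arithmetic mean of 88.88%.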