
A review of deep learning in the study of materials degradation



Abstract Deep learning is revolutionising the way that many industries operate, providing a powerful method to interpret large quantities of data automatically and relatively quickly. Deterioration is often multi-factorial and difficult to model deterministically due to limits in measurability, or unknown variables. Deploying deep learning tools to the field of materials degradation should be a natural fit. In this paper, we review the current research into deep learning for detection, modelling and planning for material deterioration. Driving such research are factors such as budget reductions, increasing safety and increasing detection reliability. Based on the available literature, researchers are making headway, but several challenges remain, not least of which is the development of large training data sets and the computational intensity of many of these deep learning models.
Will Nash, Tom Drummond and Nick Birbilis
npj Materials Degradation (2018) 2:37 ; doi:10.1038/s41529-018-0058-x
The degradation of engineered materials presents significant environmental, safety and economic risks. Modern society depends on the ongoing integrity of materials, from the reliability of aircraft to the efficacy of sanitary systems. Designers impose ever increasing demands on man-made materials that are thermodynamically driven to deteriorate. For all the novel materials created in laboratories around the world, their potential degradation in service is a significant barrier to adoption. Magnesium alloys provide a salient example, promising lightweight and strong parts, but suffering from rapid corrosion rates.
Aside from the mechanistic research regarding materials degradation, research is now underway that seeks to employ deep learning to understand how to detect defects, improve durability and manage the risks associated with materials degradation.
Background on Deep Learning
Recently, advances in Artificial Intelligence (A.I.) seem to be broadcast weekly, even daily. To a large extent the burgeoning A.I. revolution has been supported by silicon transistor technology, arguably the material technology that defines our current age. Alongside the development of cheaper, more powerful Graphics Processing Units (GPUs), A.I. improvement in recent years has been driven by the collection of massive data sets via the Internet, novel learning architectures and programming languages. A recent review by Dimiduk et al. reveals that materials design and development is benefiting from Deep Learning; and quantum matter researchers using artificial neural nets have revealed previously hidden patterns in the cuprate superconductor pseudogap, providing insight into fundamental questions that have gone unanswered for decades. The critical review herein intends to explore how Deep Learning methods are being used to automate the detection of degradation, improve modelling of materials durability and assist decision making by analysis of large sets of degradation data. The true power of Deep Learning arises when the computer is able to discover its own interpretation of the data, often leading to faster and more accurate predictive power than hand-crafted algorithms.
Common taxonomy
The field of A.I. is awash with apparently complicated terminology, in many cases with different descriptors having an identical meaning (i.e. due to the pace of research there can be various names given to the same concepts); understandably this can create confusion, even for researchers in the field. The common taxonomy of terms is defined below; for a more detailed description of Deep Learning A.I. systems, readers should refer to Deep Learning.
Artificial Intelligence
Within this review we define A.I. to refer to machine learning models that can process data to make meaningful decisions. This definition is narrower than the traditional one, and excludes A.I. that is hard coded such as expert systems.
Artificial Neural Network
The Artificial Neural Network (ANN) was first proposed in 1958 by Rosenblatt as a computer "Perceptron" that mimics the brain. As the name suggests, an ANN is made up of artificial neurons, each represented by an activation function. Each neuron is fed inputs that are weighted and summed; once the activation threshold is exceeded the neuron fires, producing an output signal. The neurons are arranged in a layered network with neurons taking inputs from preceding layers, thus transforming an input signal to an output. The weights of the neurons can be tuned to adjust how they react to the inputs. Rosenblatt's original diagram of an ANN has been reproduced in Fig. 1.
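The weighted-sum-and-threshold behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not Rosenblatt's original formulation: the function names, the hard step activation and the numerical values are all assumptions chosen for the example.

```python
def neuron_fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0


def layer_output(inputs, layer_weights, threshold):
    """Apply several neurons to the same inputs, as in one layer of a network."""
    return [neuron_fires(inputs, weights, threshold) for weights in layer_weights]


# Illustrative values only: two neurons looking at the same three inputs.
example = layer_output(
    [1.0, 0.0, 1.0],
    [[0.6, 0.2, 0.5], [0.1, 0.9, 0.2]],
    threshold=1.0,
)
print(example)
```

Changing the weights changes which neurons fire for a given input, which is what tuning the network amounts to.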
The measured error of a network can be passed backwards, using the chain rule of derivatives to determine the contribution of each weight to that error; this is termed backpropagation. This method was developed independently by a number of researchers in the 1970s and 1980s. Its use in machine learning was first popularised in 1986.
Received: 26 July 2018 Accepted: 22 October 2018
Department of Materials Science and Engineering, Monash University, Clayton 3800 VIC, Australia;
Woodside Innovation Centre, Monash University, Clayton 3800 VIC, Australia
Department of Electrical and Computer Systems Engineering, Monash University, Clayton 3800 VIC, Australia
Correspondence: Will Nash
Published in partnership with CSCP and USTB
Data sets
The development of accurate deep learning models relies heavily on "good quality" data sets. Any underlying biases and systemic errors that are present in data sets utilised for training can compromise the accuracy and effectiveness of deep learning. For this reason, design of data sets is a major concern for A.I. researchers and takes considerable effort. Ideally the distribution of information contained in data sets will match the distribution encountered in deployment. During training researchers typically break the data set into the following subsets: a training set, used for training the model; a validation set, used during training to check the accuracy of the model on "unseen" samples; and a testing set, reserved to evaluate performance after training. Thankfully, online competition sites provide benchmark data sets for researchers to develop their models and compete for prizes, removing the data set bottleneck and helping drive research.
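The three-way split described above can be sketched as follows. The function name, the 70/15/15 proportions and the fixed random seed are assumptions made for illustration; the source does not prescribe particular fractions.

```python
import random


def split_data_set(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the samples and split them into training, validation and testing subsets."""
    if not (0 < train_frac < 1 and 0 <= val_frac < 1 and train_frac + val_frac < 1):
        raise ValueError("fractions must be positive and sum to less than 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # held back until training is finished
    return train, validation, test


train, validation, test = split_data_set(range(100))
print(len(train), len(validation), len(test))
```

Shuffling before splitting is one simple way to keep the three subsets similarly distributed; it does not by itself guarantee that the data match what will be encountered in deployment.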
Deep Learning
Until the 1990s ANNs were largely limited to three layers, comprising one input, one hidden, and one output layer. In 2009 parallelisation of ANN training using Graphics Processing Units (GPUs) was demonstrated. Subsequently ANNs have been successfully extended to so-called Deep Learning models, extending to hundreds of hidden layers. It is useful to consider that each neuron in the network transforms the incoming data to a distinct output signal. As the depth of the ANN is increased the network can transform the data in more complex ways, effectively adding variables to the learned relationship between inputs and outputs.
Convolutional layers
There is a special class of neural network layer called a "convolutional layer" that was first proposed in 1982. Convolutional layers consist of neurons grouped into filters that convolve the input data to produce activated outputs. For example, if the input is an image made up of an array of red, green and blue channels, the filters scan across the image and produce an output map where the filter neurons are activated. Extending the explanation of Deep Learning above, the lower layers of convolutional neural networks close to the input have been found to detect simple features such as edges or colours, whereas the higher layers are able to use these lower level representations to interpret more complex features such as faces and text. Networks that use convolutional layers are commonly called Convolutional Neural Networks (CNN) or ConvNets.
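To make the idea of a filter scanning across an image concrete, the sketch below applies a single hand-written 2 × 2 filter to a tiny single-channel image. The filter values, the ReLU activation and the function name are assumptions for illustration; in a trained CNN the filter weights are learned rather than specified by hand.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and apply a ReLU activation.

    As in most CNN libraries, the kernel is not flipped, so strictly speaking
    this is a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(max(0.0, total))  # the filter is "activated" where this is > 0
        output.append(row)
    return output


# A dark-to-bright vertical edge, and a filter that responds to such edges.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_filter = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, edge_filter)
print(feature_map)
```

The output map is non-zero only in the column where the edge sits, which is the kind of low level feature the lower layers of a CNN tend to pick up.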
Recurrent neural network and long short-term memory
Recurrent neural nets (RNNs) are designed to process sequential data, using a connection from the output to the input of the next step in the sequence. This network architecture is particularly suited to processing temporal data. Simple RNNs suffer from gradient instability: as the sequence of inputs grows, the gradient vanishes or explodes. To overcome this issue Long Short-Term Memory (LSTM) networks were introduced, and later refined. LSTMs incorporate a memory cell into RNNs to store the state of the neuron, mitigating the gradient instability problem.
Semantic segmentation
Within object detection researchers typically use semantic segmentation, a term that refers to segmenting an image into its semantic components. Semantic segmentation models produce a label for each pixel, and are trained using data sets that are themselves segmented into their different objects. Fig. 2 provides an example image alongside its semantic segmentation for illustration, from the Pascal VOC Dataset. Accuracy of semantic segmentation is typically reported using the F1-score, the harmonic mean of the precision (the fraction of predicted positives that were true) and recall (the fraction of labelled positives that were predicted). We can examine human performance on labelling data sets to formulate a benchmark for performance; e.g. on the Microsoft Common Objects in Context semantic segmentation data set, expert labellers achieve an average F1-score of 0.81.
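The F1-score defined above can be computed directly from per-pixel predictions and labels. The sketch below treats the problem as binary (defect or no defect) for a single class; the function name and the six-pixel example are illustrative assumptions.

```python
def f1_score(predicted, labelled):
    """Harmonic mean of precision and recall for binary per-pixel labels."""
    pairs = list(zip(predicted, labelled))
    true_pos = sum(1 for p, t in pairs if p and t)
    false_pos = sum(1 for p, t in pairs if p and not t)
    false_neg = sum(1 for p, t in pairs if not p and t)
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Six pixels: two correct detections, one false alarm, one missed defect.
predicted = [1, 1, 1, 0, 0, 0]
labelled = [1, 1, 0, 1, 0, 0]
score = f1_score(predicted, labelled)
print(round(score, 3))
```

Because true negatives do not appear in the formula, the F1-score is not inflated by a large background class, which is one reason it is preferred over raw pixel accuracy for segmentation.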
Training Deep Learning models
Deep learning models are trained to be able to interpret the input data in a useful way. Simply put, models are initialised with random weights, and example inputs are fed through the network. The difference between the target labels and the model outputs is then measured as the error. The contribution of each neuron to the error is determined using backpropagation, and the weights are updated to reduce the error. This process is repeated until a set number of iterations are completed, or the error is reduced to an acceptable level, and the model adequately interprets the input data into the desired output. The whole process is termed Stochastic Gradient Descent (SGD), although there are several variants in use that employ different methods to increase the speed of converging on a solution. To set up the training phase there are several so-called hyper-parameters that affect the speed of convergence, including the number of iterations to train with, the learning rate (i.e. how large a step to take with each iteration), and the specific calculation of the error signal. Selecting an appropriate measurement of error is important and depends on the problem space. Within the literature the error is also referred to as the "cost" and the "loss".
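The training loop described above can be illustrated on the smallest possible model: a single linear neuron with one weight and one bias, trained with a squared-error loss. Everything specific here (the model, the loss, the learning rate, the iteration count and the synthetic data) is an assumption for the sake of a runnable sketch; real deep learning models apply the same update to millions of weights via backpropagation.

```python
import random


def train_sgd(samples, learning_rate=0.05, iterations=2000, seed=0):
    """Fit y = w * x + b by stochastic gradient descent on a squared-error loss."""
    rng = random.Random(seed)
    # Initialise with random weights.
    w, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    for _ in range(iterations):
        x, target = rng.choice(samples)  # one randomly chosen example per step
        error = (w * x + b) - target     # model output minus target label
        # Gradients of 0.5 * error**2 with respect to w and b (chain rule).
        w -= learning_rate * error * x
        b -= learning_rate * error
    return w, b


# Synthetic, noise-free data generated from y = 2x + 1.
samples = [(x, 2.0 * x + 1.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
w, b = train_sgd(samples)
print(round(w, 3), round(b, 3))
```

The learning rate and iteration count are the hyper-parameters mentioned above: too large a learning rate makes the updates diverge, too small a rate needs many more iterations to reach the same error.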
Fig. 1 The original perceptron concept from Rosenblatt (ref. 7) [public domain]; artificial neurons mimic the function of the brain, transforming inputs at the retina into responses
Fine-tuning models
Deep Learning models can be trained on one task, and then fine-tuned on another task, otherwise known as transfer learning. Typically, fine-tuning involves locking all the previously learned weights bar those on the output layer. Commonly this approach is used by training on a large and freely available data set, and then fine-tuning on a specific task with a smaller data set. This works for tasks in similar domains where the weights learned at lower levels are similar.
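The idea of locking the lower-level weights and updating only the output layer can be sketched with a toy two-layer model. The "pre-trained" hidden weights, the tanh activation, the learning rate and the synthetic target below are all hypothetical; in practice the frozen layers would come from a model such as one trained on a large public image data set.

```python
import math


def hidden_features(x, frozen_weights):
    """Lower layer with locked weights: these are never updated during fine-tuning."""
    return [math.tanh(w * x) for w in frozen_weights]


def fine_tune_output_layer(samples, frozen_weights, learning_rate=0.1, epochs=5000):
    """Train only the output-layer weights on the new task."""
    output_weights = [0.0] * len(frozen_weights)
    for _ in range(epochs):
        for x, target in samples:
            features = hidden_features(x, frozen_weights)
            prediction = sum(w * f for w, f in zip(output_weights, features))
            error = prediction - target
            # Gradient step on the output layer only.
            output_weights = [
                w - learning_rate * error * f
                for w, f in zip(output_weights, features)
            ]
    return output_weights


# Hypothetical weights "learned" on an earlier task.
frozen = [0.5, 1.5]
# New task: targets that can be expressed with the existing features.
samples = [
    (x, 1.0 * math.tanh(0.5 * x) - 2.0 * math.tanh(1.5 * x))
    for x in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)
]
tuned = fine_tune_output_layer(samples, frozen)
print([round(w, 3) for w in tuned])
```

Only two numbers are learned here, which is why a small data set suffices; the approach works in this toy case because the frozen features happen to be suitable for the new task, mirroring the similar-domain condition noted above.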
ImageNet large scale visual recognition challenge
In 2010 the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) was launched and has become the benchmark for object detection and classification computer vision models. The ILSVRC provides a data set of 1.4 M labelled images in 1000 classes for competitors to develop and train models for ~4 months. Models are assessed on a reserved data set where labels are only known to the organisers, and scored based on the number of accurate predictions.
In 2014 the Visual Geometry Group from Oxford University placed second in the ILSVRC for classification using a very deep but simple convolutional neural network architecture that has come to be known as VGG-16. This model has become very popular in the research community due to its simple approach and because the pre-trained weights were made freely available online, facilitating the fine-tuning of this powerful model on new tasks. Several of the papers reviewed make use of this model, and so its network architecture is provided in Fig. 3.
Fig. 3 An overview of the VGG-16 model architecture; this model uses simple convolutional blocks to transform the input image to a 1000-class vector representing the classes of the ILSVRC, figure reproduced from ref.
Fig. 2 Example Images and Ground Truth Maps illustrating semantic segmentation. Adapted by permission from Springer Customer Service Centre GmbH (ref.
Detection of degradation is necessary to allow intervention prior to failure; undetected deterioration can lead to catastrophic failure in extreme cases. Direct detection involves measuring changes in materials that are detectable in ambient conditions, for example the visual presence of corrosion products, cracks and changes in dimensions. Indirect detection requires the application of an excitation signal for which the response of the material can be measured to indicate deterioration; for example, ultrasonic thickness testing may reveal loss of wall thickness in pipes. Both direct and indirect detection methods are used throughout industry, and provide complementary functions. Typically, direct detection is used to focus indirect detection efforts to areas of
Direct detection of degradation
Research by the European project MINOAS (Marine INspection rObotic Assistant System) has demonstrated the effectiveness of a simple Artificial Neural Network (ANN) for corrosion and crack detection using a micro-aerial vehicle in ships' ballast tanks. Traditional computer vision techniques were used to produce inputs related to colour and texture for various ANNs comprising one hidden layer. The analysis determined that the optimum configuration consisted of 34 inputs and 37 neurons, achieving accuracies of 74 to 87%. This hybrid computer vision + ANN approach may be necessary with shallow networks that are unable to learn to discern higher order features, such as texture. Colour information was provided to the network by filtering hue and saturation values, and texture information by processing the distribution of neighbouring pixel intensity. Thus, the approach does not exploit the true power of deep learning, i.e. allowing the computer to determine the best representation of the input data to achieve the task. It is likely that these models overfit the limited training data of ship ballasts; although this is appropriate for the task at hand, transferring this approach to other environments and subjects may require significant rework.
Deep learning models and traditional computer vision systems for corrosion detection were compared in 2016. The deep learning architecture utilised transfer learning of the AlexNet model architecture that won the ImageNet competition; thus the model was pre-trained to identify low level features like edges. The AlexNet model incorporates five convolutional layers, and consists of ~650,000 neurons. Even with a small data set of 3,500 images it was demonstrated that Deep Learning outperforms computer vision, with total accuracies of 78% and 69%, respectively. Unfortunately, neither accuracy would be considered equivalent to human performance, measured as 88 to 95% when tested on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). The authors posit that a computer vision system could augment Deep Learning to improve classification accuracy further. The model also requires images to be downsized to 256 × 256 pixels, discarding some (perhaps significant) available information. Image classification for the presence of corrosion at the accuracy achieved would, however, still require humans to review nearly all the data captured.
Deep Learning Fully Convolutional Networks (DLFCNs) have been trained to detect degradation of railway ties from greyscale images. In such work, a four-layer material classification network with 493,226 trainable weights and biases was able to discern crumbled and chipped concrete from good concrete, as well as other materials such as ballast, rail and fasteners, with an accuracy of 95.02%. A classifier uses the output of the material detector as its input and has been trained to identify five types of fasteners and whether they are broken or not. Example detections from the model are presented in Fig. 4. Learning to detect defects in railway ties and fasteners benefits greatly from the nature of data capture: the position of the camera is fixed with respect to the subject, and therefore the images are well controlled. Even so, the authors were required to manipulate their data set to enable good training: applying a global gain normalisation, preferentially training on good quality images, and resampling data to balance the data set to include difficult images. Furthermore, the alignment of images was strictly controlled to avoid intra-class variation, necessitating additional annotation by an individual researcher to frame the region of interest inside bounding boxes; this extra constraint complicates and effectively prevents outsourcing of data set creation.
Fig. 4 Results from deep learning semantic segmentation of railway ties: pink segmentation indicates crumbled concrete, red segmentation indicates chipped concrete. © 2017 IEEE. Reprinted, with permission, from ref. 24
A CNN has been successfully used for classification of cracked and un-cracked pavement regions; the approach was able to achieve greater than 90% classification accuracy. This model of Wang and Hu took varying image sizes and divided them into the input grid size, thus providing a quasi-localisation function; an example output from the model is presented in Fig. 5. However, it is unclear how well regulated the ratio of feature size to grid size needs to be for the model to operate accurately. The pre-processing required introduces more complexity than the more straightforward semantic segmentation techniques. Images are also downsized and made greyscale, reducing the information available for the model.
CrackNet was developed using a CNN to detect cracks in 3D images of asphalt, with a reported precision of over 90%. By forgoing max-pooling layers (i.e. layers that down-size resolution) the CrackNet model was able to preserve the dimensionality of the input and produce pixel level segmentation. A specialised hand-coded feature extractor was used to feed data into the model, which somewhat limited robustness and was blamed for the false negatives reported for hairline cracks. Although the technique required a specialised PaveVision3D scanning camera, it is expected that similar results can be achieved with standard 2D scanning cameras.
Two deep neural nets were trained to detect corrosion, on ZF Net, VGG-16, and two smaller CNNs of 5 and 7 layers; where ZF Net and VGG-16 are freely available models that ranked highly in the ILSVRC. A so-called sliding window was used to scan across the images and provide localisation of features. Different window sizes and input image colour formats were investigated, as well as the impact of fine-tuning after training on the ILSVRC data set against training end to end on a corrosion data set. Increasing the window size improved the models' success rate, at the cost of decreased fineness of detection. Accuracy was similar for the RGB and YCbCr colour formats, which is anticipated as one is a simple transform of the other. Overall, however, a recurring outcome in the literature is that CNNs demonstrate superior accuracy to traditional computer vision filtering techniques in feature detection. An important finding from this work is that the networks trained end to end on corrosion learned colour and texture based filters, but suffered from overfitting; whereas the fine-tuned models provide more general representations at a significant computational cost. It is not clear that the scope of training images included features that may confuse a CNN, and the example images demonstrate false positives when the model is faced with gravel, presumably due to the similar texture to corrosion. A robust A.I. detection of corrosion is likely to require contextual clues that are provided by detecting other features, such as structural steel frames versus foliage, that may be confused when relying on colour and texture alone. The drawbacks of fineness of segmentation, overfitting and computational demands remain to be overcome for an automated A.I. corrosion detector.
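The sliding window localisation discussed above can be sketched independently of any particular network. In the example below the patch classifier is a deliberately trivial stand-in (a mean-intensity threshold) rather than a CNN, and the function names, window size and stride are assumptions for illustration only.

```python
def sliding_windows(height, width, window, stride):
    """Yield the (top, left) corner of every square window that fits in the image."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left


def localise(image, window, stride, classify_patch):
    """Run a patch classifier over every window and return the positive locations."""
    detections = []
    for top, left in sliding_windows(len(image), len(image[0]), window, stride):
        patch = [row[left:left + window] for row in image[top:top + window]]
        if classify_patch(patch):
            detections.append((top, left))
    return detections


def mean_above_half(patch):
    """Hypothetical stand-in for a trained CNN: flags mostly bright patches."""
    values = [value for row in patch for value in row]
    return sum(values) / len(values) > 0.5


image = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
hits = localise(image, window=2, stride=1, classify_patch=mean_above_half)
print(hits)
```

Each window is classified independently, so the classifier runs once per window position: a smaller window or stride gives finer localisation at a higher computational cost, which is the trade-off noted above.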
The "Faster R-CNN" model, recently presented in the work of Cha and co-workers, fuses a region of interest detector with an object detector, and has been trained to detect a variety of infrastructure defects, including cracked concrete, bolt corrosion, steel delamination, and general corrosion. This method produces region of interest bounding boxes around the detected defect in real time on a video resolution of 500 × 375 pixels. Fusing a region of interest with the object detector is intuitively similar to how human vision operates, focusing on the important features. Although an impressive average precision of 87.8% was reported, the data set was limited to two bridges and a building at the University of Manitoba. Unfortunately, the subjects used for performance evaluation were not fully described, and the model appears to be applied to a restricted domain; therefore, it is not clear whether the model will perform reliably in other environments. Fineness of detection is likely to be improved by replacing the object detector with a semantic segmentation method. Unequivocally however, such a model shows both the value of CNN models, and the possibilities of automated defect detection removing the need for difficult site access and
Returning to the domain of ship ballast tanks, another "Faster R-CNN" based on VGG19 has been trained to segment Coating Breakdown and Corrosion (CBC) in natural colour images. The model detects four classes of defects: CBC on edges, CBC on welds, surface corrosion (termed hard rust) and pitting; example output from the model is presented in Fig. 6. Accuracy is reported to vary from 45 to 95%, although this performance is distorted by the data set bias toward the background class (no CBC), representing 40% of pixel labels in the ground truths. Excluding the background class the F1-score is calculated to be 0.69, approaching individual human performance of 0.81 in the MS-COCO semantic segmentation task. As the training data set increases in size we should expect the F1-score to reach and even exceed individual human level performance.
Aircraft fuselage inspection is required frequently; typically a visual inspection is undertaken between each flight. An automated system has been developed that utilised a deep learning convolutional neural network, based on the VGG-16 model pre-trained on ILSVRC and fine-tuned on fuselage images. The fuselage inspection system coupled the CNN with a traditional computer vision feature detector (called SURF), which locates areas of change in the image, increasing inspection speed up to 6 times. A Gaussian filter was also used to smooth the image, reducing false positives occurring from dirt on an unwashed plane. Images were divided into 64 × 64 pixel patches, which the model classified as defect/no defect. Accuracy of more than 96% was measured for new unseen images, with an average run time of 15.78 s. This system could be improved by providing pixel level segmentation, and reducing the run time, which may be achieved if pre-processing is reduced.
Fig. 5 Crack detection and classification using grid tiling coupled with a Convolutional Neural Network, from Wang and Hu, 2017. © 2017 IEEE. Reprinted, with permission, from ref. 25
Recently researchers have demonstrated an ability to perform semantic segmentation of video at 12 frames-per-second on a resolution of 1024 × 1280 pixels. The model was trained to segment coating, water, rivet, wet or corroded surfaces of penstocks. In order to compensate for the class imbalance in a relatively small data set of 40 images, the authors weight the loss function to focus on the less common classes. This technique achieved an F-score of 52.5%, and although the performance achieved falls short of human level accuracy, with an increased data set size this can be expected to improve, particularly within this restricted domain.
The work was extended to produce a 3D volume rendering of the
penstock, which is useful for automating inspection of inaccessible
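Weighting the loss function toward less common classes, as described for the penstock data set above, can be sketched as follows. The inverse-frequency weighting scheme, the function names and the ten-pixel example are assumptions; the source does not state how the authors chose their weights.

```python
import math


def inverse_frequency_weights(labels):
    """Give each class a weight inversely proportional to how often it occurs."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    n_classes = len(counts)
    return {cls: len(labels) / (n_classes * n) for cls, n in counts.items()}


def weighted_cross_entropy(probabilities, labels, class_weights):
    """Mean cross-entropy where each sample is scaled by the weight of its true class."""
    total = 0.0
    for probs, label in zip(probabilities, labels):
        total += -class_weights[label] * math.log(probs[label])
    return total / len(labels)


# Imbalanced toy labels: 8 "coating" pixels and 2 "corroded" pixels.
labels = ["coating"] * 8 + ["corroded"] * 2
weights = inverse_frequency_weights(labels)

# A model that is equally unsure about every pixel.
uniform = [{"coating": 0.5, "corroded": 0.5} for _ in labels]
loss = weighted_cross_entropy(uniform, labels, weights)
print(weights, round(loss, 4))
```

With these weights a mistake on a rare "corroded" pixel costs four times as much as a mistake on a "coating" pixel, so the two classes contribute equally to the total loss despite the imbalance.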
Combining visual and infrared imaging was shown to permit a CNN to detect concrete cracks smaller than 0.5 mm. The additional information from the infrared camera improved the F1-score from 0.45 to 0.99, approaching "near perfect" crack detection. This method relies on a laser based excitation unit to provide a signal for the infrared detector. If this system can be successfully transferred from the laboratory to site it would replace the tedious task of crack mapping. This approach of fusing different detection methods shows great potential to be extended to other domains.
Indirect detection of degradation
Indirect detection of degradation aims to identify signals of change that arise as a result of material deterioration. Typically, these methods measure the response to an energy source, either imparted by inspectors as in the case of microwave thermal imaging of composites, or from operating conditions as commonly used for vibration analysis. Deep Learning is well suited to seek out the signs of deterioration hidden in the enormous amount of data generated by these indirect methods.
Cracks in welds
Deep neural networks have been trained to detect flaws in welds from radiographic scans in an automated process presented by Hou et al. that achieved a maximum of 91.84% classification accuracy. The training process uses unlabelled data to pre-train Stacked Sparse Autoencoders, before fine-tuning on labelled data, reducing the need for a very large data set. This and similar approaches should be considered by researchers with difficult to produce data sets. A sliding window is used to provide location of the defects; a convolutional architecture could reduce the computational complexity and provide pixel level labelling. Extensive data pre-processing was implemented to train on a limited data set of 88 scans, and would then need to be undertaken on new scans, placing a (minor) bottleneck on the
Carbon fibre reinforced polymer composites
The astounding performance of carbon fibre reinforced polymer (CFRP) composites can be undermined by subsurface flaws that grow in service hidden from view. In-service integrity checks are commonly performed using ultrasonic non-destructive testing. Interpretation of the ultrasonic signal requires expert knowledge from experienced inspectors. In order to provide faster and more reliable inspection, a deep learning Convolutional Neural Network was trained on the ultrasonic wavelet packet decomposition signal to detect flaws deliberately introduced to the composite. The two-layer CNN was able to detect flaws with 95% classification accuracy. Post processing of the ultrasonic mapping denoised the detection by removing defects with different class neighbouring areas, thus introducing a lower limit for detectable defect size. Just ten defective CFRP composite samples were used to produce the data set, increasing the likelihood that the CNN presented is overfit. The authors also do not provide a measure of the speed of detection, a significant factor in deployment.
Aircraft fuselage composites
An early attempt by the US National Aeronautics and Space Administration (NASA) to automate detection of corrosion in aluminium composites used simple neural networks to analyse thermal data. Two ANNs were trained to detect flaws and extent of corrosion from the thermal response of composite panels subjected to quartz lamp heating. The flaw detector model was binary, while the corrosion detector was binned into 10 percentile ranges. By averaging the training data over several frames of the imaging, the signal to noise ratio was effectively increased. The research went on to compare the performance of the individual models against a combined architecture and demonstrated that this combined architecture provided superior performance. Unfortunately, the authors did not provide details of the accuracy of this approach according to any metrics.
Fig. 6 Model output Semantic Segmentation of Coating Breakdown Corrosion in ship ballast tanks from Liu et al. © 2018 IEEE. Reprinted, with permission, from ref. 29
Aluminium plate
Giant Magnetoresistive (GMR) sensing data has been used as an input to a simple neural network to detect defects in aluminium. The method successfully identified cracks, holes and deformation using Eddy current testing, with the aim of producing a low-cost, fast and robust defect detection sensor array. The network architecture utilised is described as a multilayer perceptron (1 input, 1 hidden and 1 output layer) followed by a competitive neural network of one layer. This competitive neural network effectively performs the softmax function on the output. The authors did not provide details on the size of their data set, although it appears to be small. The classification accuracy for holes and cracks is reported as 83 and 95%, respectively. The GMR sensor ANN has a heavy reliance on filtering and feature extraction prior to input to the neural network; thus the method does not leverage the ability of deep neural networks to extract relevant features, and makes it difficult to retrain the network on other sensor geometries or material types. The small number of machined defects used to generate the data set raises issues of overfitting. No field performance evaluation was undertaken, and demonstrating the efficacy of the method on aluminium plate with unknown dimensions would be necessary before the technique could be deployed.
Stainless steel coupons
The onset of pitting and crevice corrosion in stainless steels was
shown to be predictable from electrochemical data using
a simple ANN in 1993.
The ANN was trained for 30,000 iterations
on a data set of 50 files (based on potentiodynamic scans of
304 stainless steel, presumably in chloride-containing electrolyte),
after which it achieved 90% accuracy at identifying the initiation
of localised corrosion. This approach shows promise for detecting,
through potential monitoring, insidious forms of corrosion that are
not easily observed otherwise. At first glance, however, the method of
assigning a pitting or crevice corrosion initiation event to the
electrochemical data is straightforward enough that an algorithm
may be a better choice than a neural network; it was stated
that "The start of the corrosion event was arbitrarily assigned as the
first of at least three data points above the mean baseline plus
2 standard deviations calculated from baseline noise." The authors
contend that the neural network identified corrosion earlier than
simple current limit monitoring, and can distinguish between
pitting and crevice corrosion. A sensor based on this technology
may be able to detect the onset of pitting by measuring
potential changes of a coupon due to chemical excursion events
in processing industries. This work could be revisited using Deep
Neural Networks to improve the prediction capability.
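The labelling rule quoted above can be written as a few lines of deterministic code, which illustrates why a hard-coded algorithm is a plausible alternative to a neural network for this step. The following is a minimal sketch only: the function name, the use of the sample standard deviation, and the reading of "at least three data points" as three consecutive points are assumptions made for illustration, not details taken from the original work.

```python
from statistics import mean, stdev


def corrosion_onset_index(signal, baseline_len, n_sigma=2.0, min_run=3):
    """Return the index of the first point in a run of at least `min_run`
    consecutive points above (baseline mean + n_sigma * baseline std).

    `baseline_len` is the number of leading samples treated as baseline
    noise (an assumption; the source does not say how it was chosen).
    Returns None if no such run exists.
    """
    baseline = signal[:baseline_len]
    threshold = mean(baseline) + n_sigma * stdev(baseline)

    run_start, run_len = None, 0
    for i in range(baseline_len, len(signal)):
        if signal[i] > threshold:
            if run_len == 0:
                run_start = i  # candidate onset
            run_len += 1
            if run_len >= min_run:
                return run_start
        else:
            run_len = 0  # isolated spike, reset the run
    return None


# Illustrative (made-up) current readings: a single spike, then a sustained rise.
readings = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 3.0, 1.0, 2.5, 2.8, 3.1, 3.3]
onset = corrosion_onset_index(readings, baseline_len=6)
```

In this made-up series the isolated spike is ignored and the onset is assigned to the start of the sustained rise.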
Steel pipelines for subsea oil transmission
Simple neural networks were effective at processing multiple NDT
sensor inputs to predict the degree of oil pipeline corrosion under
laboratory conditions.
Ultrasonic and magnetic flux leakage
sensors were used to collect a data set from machined defects in
steel pipes. This work showed great promise for automating the
detection of corrosion on subsea pipelines, an ongoing concern in
the oil and gas industries. It is unclear if field testing has validated
this approach, and there would be questions of overfitting from
the small data set. The ANN training methods used in this work
have since fallen out of favour; this is another example where
modern deep neural networks could yield accuracy improvements,
if the data set were made available and ideally enlarged.
Steel transmission tower footings
Prediction of corrosion of electrical transmission tower footings
was achieved using a basic neural network. The network consisted
of 6 input neurons, 5 hidden neurons and one output neuron that
estimates the degree of corrosion on a 0–50 scale. Input data
consisted of close and remote soil resistivity, corrosion potential,
polarization and noise resistance. The reported accuracy of
0.999 shows that the ANN method is able to learn very well
when clear correlations exist, even with small, shallow networks.
Sensitivity analysis of the model to the inputs revealed that the
corrosion potential is the most influential in determining the
degree of corrosion.
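To give a sense of how small such a network is, the sketch below builds a 6-5-1 multilayer perceptron and runs one forward pass. It is illustrative only: the sigmoid activations, the random (untrained) weights, the scaling of the output to a 0 to 50 range, and all names are assumptions, since the activation functions and training procedure are not described here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_network(n_in=6, n_hidden=5, seed=0):
    """Create random, untrained weights for a 6-5-1 style network."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def count_parameters(network):
    w_hidden, b_hidden, w_out, _ = network
    return sum(len(row) for row in w_hidden) + len(b_hidden) + len(w_out) + 1


def predict_corrosion_degree(inputs, network, scale=50.0):
    """Forward pass: inputs -> hidden layer -> single output on [0, scale]."""
    w_hidden, b_hidden, w_out, b_out = network
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return scale * output


network = make_network()
# Six normalised input features (placeholder values, not measured data).
example_inputs = [0.2, 0.4, 0.7, 0.1, 0.5, 0.3]
degree = predict_corrosion_degree(example_inputs, network)
n_params = count_parameters(network)
```

With 6 inputs, 5 hidden neurons and one output, the model has only 41 trainable parameters, which is consistent with the observation that shallow networks suffice when the correlations are clear.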
Concrete reinforcement steel
A machine learning approach was used to predict the linear
polarization resistance (nominally determined by electrochemical
testing) of reinforcing steel in concrete without requiring
destructive breakout.
NDT measurements of concrete resistivity,
galvanostatic resistivity and air temperature were provided as
inputs to the simple ANNs. The author investigated the network
architecture to find the best arrangement of neurons, which
provided R-squared values above 95% on the testing data. A
tool to measure reinforcing steel corrosion without breakout
would prove very useful. Although no information on in-field
performance was reported, the work is extremely promising.
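For readers unfamiliar with the metric, the coefficient of determination (R-squared) compares the squared prediction errors with the variance of the measured values. The sketch below is a generic implementation with illustrative names and made-up numbers; it is not the evaluation code of the reviewed study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up polarization resistance values and predictions, for illustration.
measured = [10.0, 12.0, 15.0, 20.0]
predicted = [10.5, 11.5, 15.5, 19.5]
score = r_squared(measured, predicted)
```

A value of 1 indicates perfect prediction, while a model that always predicts the mean of the measurements scores 0.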
ANNs have also been trained to interpret ElectroMagnetic
Anomaly Detection (EMAD) of reinforcing steel in concrete.
EMAD is a non-destructive technique developed in 2009 that
magnetizes the reinforcement via electromagnetic induction, and
can detect defects from Magnetic Flux Leakage (MFL) sensors. In
real world performance testing, recurrent network architectures
were shown to provide the best predictive accuracy due to the
time-dependence of the EMAD signal. Recently developed
attention-based neural networks may be able to further improve
accuracy for EMAD.
For reinforced concrete bridge monitoring, research using
vibration sensors as inputs to CNNs has shown the capability to
learn features that correspond with vibration mode.
The networks were trained on simulated data for simple beams, but
extending the technique to real bridges with complex mixed
modes appears straightforward. Automating the deployment and
tuning of these sensors should increase their use thanks to savings
in time and costs. The accuracy and speed achieved by
researchers is suitable for field deployment on simple bridges,
although small defects are undetected in the presence of noise.
The authors recognise that obtaining unbiased data is difficult
because bridges are generally maintained in good condition, thus
the data set needs to be augmented by robust numerical
simulation.
Machine health monitoring
Machine condition monitoring typically involves measuring
vibration to detect faults in bearings and rotating parts. Once
again, deep learning methods are well suited to identifying fault
signals from copious amounts of data. A comprehensive survey of
DL research into machine health monitoring was undertaken by
Zhao et al. in 2016; a handful of salient examples are reviewed
here.
CNNs were trained to identify bearing faults with 93.61% accuracy.
Fifty minutes of vibration data from eight different
bearing fault conditions was used for training. Feature extraction
A review of deep learning in the study of materials degradation
W Nash et al.
Published in partnership with CSCP and USTB npj Materials Degradation (2018) 37
segmented the data into one-minute windows, which were fed
into two neural nets: the first classified the machine state as
balanced/unbalanced, and the second classified the bearing fault
type. Machine learning improved accuracy by ~6.4% compared to
hand-crafted features. Similar CNNs trained for classification of
gearbox faults outperformed manual feature extraction by roughly
Using CNNs to analyse temporal data requires pre-processing
into discrete time-windows, and the window length dictates the
fineness of detection. To
overcome this limitation, researchers have turned to Recurrent
Neural Network (RNN) methods, in particular using Long Short-Term
Memory (LSTM) networks. LSTM models incorporate a
hidden state that acts as a memory of previous inputs, providing
an advantage when interpreting time series data such as machine
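The windowing step that CNN-based approaches require can be sketched as follows. This is a generic illustration with assumed names and parameters (window length and step size in samples), not the pre-processing used in any of the reviewed studies; it shows how the choice of window fixes the time resolution at which a fault can be localised.

```python
def segment_windows(samples, window, step):
    """Split a 1D time series into fixed-length windows.

    `window` is the number of samples per window and `step` is the hop
    between consecutive window starts (step < window gives overlap).
    Trailing samples that do not fill a whole window are dropped.
    """
    return [
        samples[start:start + window]
        for start in range(0, len(samples) - window + 1, step)
    ]


# Ten placeholder vibration samples, windows of four with 50% overlap.
series = list(range(10))
windows = segment_windows(series, window=4, step=2)
```

Each window is classified independently, so an event shorter than the step size cannot be resolved more finely than one window.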
LSTMs have been trained to predict CNC machine tool wear
from vibration and cutting force data.
This research indicated
that Deep LSTMs outperform basic LSTMs, RNN, MLP and
traditional regression models. Follow-up research trained a novel
Convolutional Bi-Directional LSTM (CBLSTM) network to monitor
machine health.
The CBLSTM extracted features from the data
using a CNN; these features were then analysed by two
bidirectional LSTMs both forward and backward in time. The
CBLSTM outperformed the compared state-of-the-art
methods across all data sets, achieving a root mean square error
of ~10 compared to offline tool wear measurement. The work
presented is promising because it works on raw data, and is able
to analyse time series data continuously. However, the test set-up
is problematic, and it is anticipated that health monitoring would
need tuning for each individual machine. It appears that the deep
LSTM models sometimes produce significant error excursions,
which may trigger improper maintenance decisions. Furthermore,
the models produce errors that show a reversal of wear, which is
not logically consistent.
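Because tool wear cannot physically decrease, predicted wear curves can be checked, and if desired post-processed, for monotonicity. The sketch below counts reversals and applies a running maximum; this post-processing is an illustrative suggestion with assumed names and made-up values, and it was not part of the reviewed work.

```python
from itertools import accumulate


def count_reversals(predictions):
    """Number of steps at which predicted wear decreases."""
    return sum(1 for a, b in zip(predictions, predictions[1:]) if b < a)


def enforce_monotonic_wear(predictions):
    """Replace each prediction by the running maximum so wear never reverses."""
    return list(accumulate(predictions, max))


# Made-up wear predictions containing two physically implausible dips.
raw = [0.0, 1.0, 0.8, 1.5, 1.4, 2.0]
reversals = count_reversals(raw)
corrected = enforce_monotonic_wear(raw)
```

A running maximum removes the inconsistency but does not correct the underlying model error, so the reversal count remains a useful diagnostic in its own right.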
An alternative approach for machine health monitoring was
proposed by Jia et al.
The authors designed a Normalised Sparse
Auto-Encoder network coupled with a Local Connection Network
(NSAE-LCN) to predict planetary gearbox health from vibration
data. A data set of 4,000 samples over ten classes was used to
train the network to detect machine faults from vibration
accelerometers. The authors posited that the NSAE component
enabled them to automatically find relevant features from the
inputs, while the LCN component ensured that the features
identified are independent and shift-invariant. Although it seems
that this method is a complex implementation of a straightforward
deep network, the advantage is that the features learned
can be directly extracted from the NSAE. Impressively, the
NSAE-LCN achieved greater than 99.9% classification accuracy
on the testing set.
Moving from detection of existing defects to prediction of
deterioration is important for managing critical assets and
forecasting budgets. Deep learning methods have the potential
to improve prediction of materials deterioration, especially where
the interaction of variables is not empirically understood, and
there is significant uncertainty, to the extent that many
variables may remain unknown.
Remaining useful life of aero-engines
The US National Aeronautics and Space Administration (NASA) has
developed the C-MAPSS aero-engine simulator that has been used
to produce a Remaining Useful Life (RUL) data set. The RUL data
set provides time or cycles to failure labels based on 21 input
channels from temperature and pressure sensors. This data set has
been used to train and evaluate predictive models based on Deep
Convolutional Neural Network, DCNN,
Long Short-Term
Memory, LSTM,
and Deep Belief Networks.
The mean Root
Mean Squared Error (RMSE) across the four C-MAPSS data sets
is presented in Table 1.
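As a reminder of how the comparison metric is computed, the following generic sketch evaluates the RMSE between true and predicted RUL values (lower is better). The function name and the numbers are illustrative assumptions and are not results from the cited studies.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up remaining-useful-life values in cycles, for illustration only.
true_rul = [120, 80, 40, 10]
predicted_rul = [110, 85, 45, 12]
error = rmse(true_rul, predicted_rul)
```

Because errors are squared before averaging, a few large excursions dominate the score, which matters when comparing models that are accurate near end of life but erratic early on.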
The best accuracy achieved on the RUL data set to date used a
DCNN model.
In order to input the time series data into a
convolutional network the multi-sensor vectors were concatenated
into 2D arrays. Using the DCNN in this way reduced the
computational demands compared to an LSTM; however, it
required the model to observe a limited time window. Example
outputs of the DCNN are presented in Fig. 7.
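The concatenation of multi-sensor vectors into 2D arrays can be sketched as below, where each training sample is a (time window × sensor channel) array and the label is the number of cycles remaining after the last cycle in the window. The function name, the window length and the labelling convention are assumptions for illustration; the cited work may differ in detail (for example by capping the RUL label).

```python
def make_rul_samples(sensor_rows, window):
    """Build 2D samples and RUL labels from one run-to-failure record.

    `sensor_rows` holds one sensor vector per cycle (e.g. 21 channels);
    the final row is taken to be the last cycle before failure.
    """
    n_cycles = len(sensor_rows)
    samples, labels = [], []
    for end in range(window, n_cycles + 1):
        samples.append(sensor_rows[end - window:end])  # window x channels
        labels.append(n_cycles - end)                  # cycles remaining
    return samples, labels


# Placeholder record: 5 cycles of 21 synthetic channel readings.
record = [[float(channel + cycle) for channel in range(21)] for cycle in range(5)]
samples, labels = make_rul_samples(record, window=3)
```

Anything that happened before the start of a window is invisible to the network, which is the limitation of the fixed time window noted above.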
Inspection of the data in Fig. 7 indicates that the DCNN fails to
model the underlying phenomenon driving deterioration of the
aero-engine: the prediction is not similar to the expected
behaviour, and occasionally the RUL prediction increases with
increasing cycles. This behaviour may frustrate
engineers who rely on the prediction to plan maintenance,
although the authors posit that accuracy at the critical end-of-life
stage is adequate for decision making.
A major drawback of the DCNN method is that a limited time
window has to be selected for analysis; this discards historical
information that may be indicative of premature failures. Although
the LSTM achieved a lower accuracy, its memory gate allows it to
base decisions on all the historical information, at the cost of
increased computational demand. Presumably with more training
and tuning of hyper-parameters an LSTM could match or exceed
the accuracy of the DCNN. An output from the Vanilla LSTM is
presented in Fig. 8, where the general form appears to more
closely match the underlying degradation of the engine, although
again there are instances of increasing RUL prediction with
increasing cycles. Interestingly, the bulk of the error occurs prior to
the decline in RUL, and the model detects an anomaly that
presumably causes the subsequent deterioration. Although these
methods show promise for predicting failures from NASA's
C-MAPSS data set, new applications would require obtaining a data
set by running the subjects to failure.
Lithium-ion battery remaining cycles
The remaining useful life of lithium-ion batteries
has been
predicted using a deep LSTM network with two hidden layers. This
network is able to provide "early failure warning", enabling users
to switch batteries prior to insufficient charge availability. Only
one battery lifetime running at 25 °C was used to train the
network, and it is likely that this prediction model is overfit. To
develop a more generalized model, more training data on battery
lifetimes over various temperatures and other operating conditions
needs to be captured and utilised.
Table 1. A comparison of the accuracy (root mean squared error, RMSE)
of various Deep Learning methods to predict remaining useful life
from the NASA C-MAPSS data sets
Model architecture | Reference | Mean RMSE for C-MAPSS data sets
Deep Convolutional Neural Network
Vanilla LSTM
Deep Belief Network (MODBNE)
Deep Convolutional Neural Network
Fig. 7 DCNN remaining useful life predictions from the NASA C-MAPSS RUL of aero-engines data set from Li, Ding, and Sun. Reprinted from
ref. 50. Copyright (2018), with permission from Elsevier
Fig. 8 RUL prediction from NASA C-MAPSS data set using Vanilla LSTM. Reprinted from ref. 53. Copyright (2018), with permission from Elsevier
Beyond individual asset deterioration we can envision deep
learning systems providing decision support for managing large
infrastructure portfolios with complex interdependencies. Decision
support systems based on so-called "big data" rely on
collecting vast amounts of disparate sensor measurements to
monitor and forecast system health. Deep learning A.I. has the
capability to assess the quantity and complexity of this heterogeneous
and unstructured data in real time. Although we are not
aware of any successful implementations, the potential of deep
learning for interpreting big data has been explored.
Deep learning architectures were evaluated for suitability to
handle the challenges of big data analytics, including volume of
data, speed of processing and low data quality.
While not strictly materials degradation modelling, deep
learning has been used to model risk management of the San
Jose-Mountain View transportation network in the event of an
extreme natural disaster such as an earthquake.
The simulation data
set illustrates the capability of deep learning to process
interdependencies of assets, and could feasibly be coupled with
health monitoring sensors to provide city planners with a forecast
risk profile.
Several challenges remain for deploying deep learning decision
support systems. Not least amongst these is the task-specific
nature of deep learning models, which require training or tuning, as
well as the increasing computing complexity with increasing
inputs. One final issue is the case when unknown factors are
driving deterioration: if these are not captured by sensors then the
computer is as blind to their influence as human operators.
A summary of the methods reviewed herein is presented in Table
2. The challenge for deploying deep learning to materials
degradation has less to do with computing power and model
architecture, and more to do with lack of good-quality training
data. This latter point relates to everything from a lack of useful (or
available) collected data, to appropriately (or expertly) labelled
data. It is instructive that the areas where machine learning is
making great strides are supported by freely available large data
sets: KITTI for self-driving cars,
object detection and BRATS for brain tumours
to name but a
few. Recognising the social and economic costs of materials
degradation, the creation of these data sets to drive innovation in
the field may be an appropriate undertaking for public institutions.
A discrete effort at addressing this point has been recently
made via a web based corrosion detection resource called, although significant data sets that are
free and readily available are still needed.
Nonetheless, the transformative nature of deep learning in
many related fields is illuminating for researchers in materials
degradation. Just as A.I. is becoming adept at detecting
"deterioration" of the human body within the medical imaging
field, we are beginning to see these advances for our built
infrastructure. In particular vibration analysis,
detection of
railway defects,
and corrosion of ship ballast tanks
have been
successfully demonstrated using deep learning.
Materials degradation researchers who are interested in deploying
deep learning would benefit from standardising data collection
Table 2. Deep Learning methods for degradation that have been reviewed and their applications
Network | Application | Input
Hybrid + ANN | Corrosion segmentation of ship ballast tanks | Natural colour images
CNN (AlexNet) | Corrosion classification | Natural colour images
FCN | Degradation segmentation of railway ties and fasteners | Natural colour images
CNN | Pavement crack classification | Greyscale images
CNN (CrackNet) | Asphalt crack segmentation | 3D scans (PaveVision3D)
CNN | Corrosion detection and localization | Natural colour images
Faster-RCNN | Defect detection and localization on infrastructure | Natural colour video
Faster-RCNN | Coating defects and corrosion on steel | Natural colour images
CNN (VGG-16) | Aircraft fuselage defects | Natural colour images
U-Net | Defect segmentation of penstocks | Natural colour images
CNN | Concrete crack detection | Natural colour and infrared images
Deep NN | Flaw detection in welds | Radiographic scans
CNN | Flaw detection in carbon fibre reinforced polymer | Ultrasonic scans
ANN | Flaw detection in aluminium composites | Thermal response
ANN | Hole and crack detector in aluminium plate | Eddy current (giant magnetoresistive sensing)
ANN | Pitting and crevice corrosion in stainless steels | Electrochemical data
ANN | Corrosion of steel pipes | Magnetic flux leakage
ANN | Corrosion of transmission tower footings | Soil resistivity, corrosion potential, polarization and noise resistance
ANN | Corrosion of concrete reinforcement | Concrete resistivity, galvanostatic resistivity and air temperature
ANN | Defects of concrete reinforcement | Magnetic flux leakage
CNN | Damage detection of bridges | Vibration sensors (simulation)
CNN | Bearing faults | Vibration sensors
LSTM | Tool wear | Vibration and cutting force
CNN | Aero-engine remaining useful life | Pressure, temperature and vibration sensors
LSTM | Aero-engine remaining useful life | Pressure, temperature and vibration sensors
Deep Belief Network | Aero-engine remaining useful life | Pressure, temperature and vibration sensors
LSTM | Lithium-ion batteries | Battery capacity (Ah)
where possible, so that data sets can be effectively built up from
multiple published sources. Providing as much information as
possible about the set-up of experiments will also aid machine
learning, as will publishing results in a digital format. As much
as possible, deep learning will benefit from having raw data
available that represents the expected distribution of data that
will be seen in the field. To leverage the power of deep learning,
models should be free to find the representation that best fits the
data, which is a mental barrier for some researchers, for whom
allowing a computer to determine mechanistic trends is (or was)
considered anathema to basic science. Researchers using Deep
Learning methods should adhere to established data practices,
most importantly splitting data sets into training, validation and
test sets, and reporting standard metrics on the reserved test set
that the model has not previously seen.
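A minimal sketch of the recommended split into training, validation and test sets is given below. The 70/15/15 proportions, the fixed random seed and the function name are illustrative assumptions; in practice the split should also respect groupings in the data (for example, keeping all samples from one specimen or asset in the same subset) to avoid leakage.

```python
import random


def split_dataset(items, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle once, then carve out disjoint test, validation and training sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# 100 placeholder sample identifiers.
train, val, test = split_dataset(range(100))
```

The test set is held back until model selection on the validation set is complete, and the metrics reported are those measured on it.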
Finally, within this review we have not reported on the speed of
the models, largely due to the difficulty in comparing models
developed on different hardware and with different objectives. It
must be noted that where models are to be deployed into real
world environments researchers need to strive to reduce the
computational cost of prediction so that models run on available
hardware; in many cases
the hardware for training of models is required to be vastly more
powerful than that available in the field.
Deep learning has produced some very impressive results for
detecting materials degradation to date, in particular the use
of DLFCN for railway tie defect detection, which
achieved greater
than 95% accuracy. Furthermore, examples that employed
CNNs indicated the possibility of automated defect detection
of infrastructure
and aircraft fuselage,
making it clear that
autonomous A.I. has merit in detecting materials degradation,
with ramifications for the future role of personnel and access.

In the case of deep learning tools applied to indirect detection
of degradation, it was shown that the most promise to date
was in machine health monitoring via vibration sensors,
where an accuracy of 99% was achieved on the testing set.

When forecasting degradation, preliminary results summarised
herein are promising, but are based on small data
sets, and exhibit errors that indicate that the models do not
satisfactorily reflect the underlying degradation phenomenon.
The latter is anticipated in the case of incomplete learning,
and highlights that deep learning is a method that relies on
"learning" as opposed to mechanistic hard coding. The critical
review herein has identified that models with large training
data sets are likely to outperform those with small data sets,
and that large data sets are, to date, not widely available.

Many of the deep learning models investigated for materials
degradation have incorporated traditional hard-coded algorithms
to filter and transform input data. This approach does
not fully leverage the power of deep learning to learn its own
representations, and limits the ability to deploy models to
different problems. This conclusion also highlights a reluctance
of researchers to "let go" of mechanistic rules; letting go would
allow the deep learning models to learn their own weightings
of relevance.

There are limited data sets publicly available for deep learning
of degradation, forcing most researchers to develop their own
data sets. Producing large and high-quality data sets is
resource intensive, especially if it requires running multiple
assets to failure. The relatively small data sets produced for
training tend to show evidence of overfitting based on the
works reviewed herein.

Several of the examples presented utilising simple ANNs or
outdated methods could be revisited using modern deep
learning models to yield improvements in accuracy, in
particular for indirect detection.
Where industry is interested in driving research into degradation,
it is suggested that the "competition" formula is followed, such
as the ImageNet Large Scale Visual Recognition Challenge,
where data sets are created and made publicly available, and
subsequently some incentive is awarded for the best performing
We acknowledge Woodside Energy for support.
W.N. undertook the collation and detailed review of research papers, and drafted the
manuscript. T.D. and N.B. provided review of the manuscript.
Competing interests: The authors declare no competing interest.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims
in published maps and institutional affiliations.
1. Gin, S., Dillmann, P. & Birbilis, N. Material degradation foreseen in the very long term: The case of glasses and ferrous metals. npj Mater. Degrad. 1, 10 (2017).
2. Schmidhuber, J. Deep Learning in neural networks: An overview. Neural Netw. 61, 85–117 (2015).
3. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet Classification with Deep Convolutional Neural Networks. in Proceedings of the 25th International Conference on Neural Information Processing Systems 1–9 (IEEE, New Jersey, 2012).
4. Dimiduk, D. M., Holm, E. A. & Niezgoda, S. R. Perspectives on the impact of machine learning, deep learning, and artificial intelligence on materials, processes, and structures engineering. Integr. Mater. Manuf. Innov. 7, 157–172
5. Zhang, Y. et al. Using machine learning for scientific discovery in electronic quantum matter visualization experiments. arXiv pre-print at 1808.00479 (2018).
6. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
7. Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958).
8. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
9. Raina, R., Madhavan, A. & Ng, A. Y. Large-scale Deep Unsupervised Learning using Graphics Processors. in Proc of the 26th International Conference on Machine Learning (IEEE, New Jersey, 2009).
10. Fukushima, K. & Miyake, S. Neocognitron: a new algorithm for pattern recognition tolerant of deformations and shifts in position. Pattern Recognit. 15, 455–469
11. Yosinski, J., Clune, J., Nguyen, A., Fuchs, T. & Lipson, H. Understanding Neural Networks Through Deep Visualization. In Proc. Deep Learning Workshop, 31st International Conference on Machine Learning 12 (2015).
12. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
13. Gers, F. A., Schmidhuber, J. & Cummins, F. Learning to forget: continual prediction with LSTM. Neural Comput. 12, 2451–2471 (2000).
14. Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J. & Zisserman, A. The Pascal Visual Object Classes (VOC) challenge. Int. J. Comput. Vis. 88, 303–338 (2010).
15. Lin, T.-Y. et al. Microsoft COCO: Common Objects in Context. (eds Fleet D., Pajdla T., Schiele B., Tuytelaars T.) In Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science Vol 8693. (Springer, New York, 2014).
16. Caruana, R. Learning Many Related Tasks at the Same Time With Backpropagation. in NIPS'94 Proc of the 7th International Conference on Neural Information Processing Systems 657–664 (MIT Press Cambridge, MA, 1994).
17. Bengio, Y. Deep learning of representations for unsupervised and transfer learning. JMLR Work. Conf. Proc. 7, 1–20 (2011).
18. Russakovsky, O. et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115, 211–252 (2015).
19. Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. in ICLR 2015 1–14 (2014).
20. Blier, L. A brief report of the heuritech deep learning meetup#5. (2016). Available learning-meetup-5/. (Accessed 21 June 2018).
21. Ortiz, A., Bonnin-Pascual, F., Garcia-Fidalgo, E. & Company, J. P. Visual Inspection of Vessels by Means of a Micro-Aerial Vehicle: An Artificial Neural Network Approach for Corrosion Detection. in Proc. Robot 2015: Second Iberian Robotics Conference 418, 543–555 (Springer International Publishing, New York, 2016).
22. Petricca, L., Moss, T., Figueroa, G. & Broen, S. Corrosion Detection using A.I.: A Comparison of Standard Computer Vision Techniques and Deep Learning Model. In Proc. The Sixth International Conference on Computer Science, Engineering and Information Technology 91–99 (IEEE, New Jersey, 2016).
23. Gibert, X., Patel, V. M. & Chellappa, R. Material Classification and Semantic Segmentation of Railway Track Images with Deep Convolutional Neural Networks. in Proc. IEEE International Conference on Image Processing (IEEE, New Jersey, 2015).
24. Gibert, X., Patel, V. M. & Chellappa, R. Deep multitask learning for railway track inspection. IEEE Trans. Intell. Transp. Syst. 18, 153–164 (2017).
25. Wang, X. & Hu, Z. Grid-based pavement crack analysis using deep learning. In Proc. The 4th International Conference on Transportation Information and Safety (ICTIS) 917–924 (IEEE, New Jersey, 2017).
26. Zhang, A. et al. Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network. Comput. Civ. Infrastruct. Eng. 32, 805–819
27. Atha, D. J. & Jahanshahi, M. R. Evaluation of deep learning approaches based on convolutional neural networks for corrosion detection. Struct. Heal. Monit. An Int. J. 1–19 (2017).
28. Cha, Y. J., Choi, W., Suh, G., Mahmoudkhani, S. & Büyüköztürk, O. Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types. Comput. Civ. Infrastruct. Eng. 00, 1–17 (2017).
29. Liu, L., Tan, E., Zhen, Y. & Yin, X. J. AI-facilitated Coating Corrosion Assessment System for Productivity Enhancement. In Proc. 2018 13th IEEE Conf. Ind. Electron. Appl. 606–610 (IEEE, New Jersey, 2018).
30. Malekzadeh, T., Abdollahzadeh, M., Nejati, H. & Cheung, N.-M. Aircraft Fuselage Defect Detection using Deep Neural Networks. arXiv pre-print at abs/1712.09213 (2017).
31. Nguyen, T. et al. U-Net for MAV-based Penstock Inspection: an Investigation of Focal Loss in Multi-class Segmentation for Corrosion Identification. arXiv pre-print at (2018).
32. Jang, K., Kim, B., Cho, S. & An, Y. Deep learning-based concrete crack detection using hybrid images. in Proc. Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems 2018 (ed Sohn, H.). 1059812, 36 (SPIE, Washington, USA, 2018).
33. Hou, W., Wei, Y., Guo, J., Jin, Y. & Zhu, C. Automatic detection of welding defects using deep neural network. J. Phys. Conf. Ser. 933, 012006 (2018).
34. Meng, M., Chua, Y. J., Wouterson, E. & Ong, C. P. K. Ultrasonic signal classification and imaging system for composite materials via deep convolutional neural networks. Neurocomputing 257, 128–135 (2017).
35. Prabhu, D. R. & Winfree, W. P. Neural network based processing of thermal NDE data for corrosion detection. Rev. Prog. Quant. Nondestruct. Eval. 12, 775–782
36. Postolache, O., Ramos, H. G. & Ribeiro, A. L. Detection and characterization of defects using GMR probes and artificial neural networks. Comput. Stand. Interfaces 33, 191–200 (2011).
37. Barton, T. F., Tuck, D. I. & Wells, D. B. The identification of pitting and crevice corrosion using a neural network. In Proc. 1993 The First New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems. 01 (IEEE, New Jersey, 1993). 10.1109/ANNES.1993.323012
38. Jingwen, T., Meijuan, G. & Jin, L. Corrosion detection system for oil pipelines based on multi-sensor data fusion by improved simulated annealing neural network. In 2006 International Conference on Communication Technology (IEEE, New Jersey, 2006).
39. Uruchurtu-Chavarin, J., M. Malo-Tamayo, J. & Hernandez-Perez, A. J. Artificial intelligence for the assessment on the corrosion conditions diagnosis of transmission line tower foundations. Recent Pat. Corros. Sci. 2, 98–111 (2012).
40. Sadowski, L. Non-destructive investigation of corrosion current density in steel reinforced concrete by artificial neural networks. Arch. Civ. Mech. Eng. 13, 104–111 (2013).
41. Butcher, J. B. et al. Defect detection in reinforced concrete using random neural architectures. Comput. Civ. Infrastruct. Eng. 29, 191–207 (2014).
42. Butcher, J. B. et al. in Concrete Solutions (eds. Grantham, M., Majorana, C. & Valentina, S.). 417–424 (CRC Press, USA, 2009).
43. Lin, Y. Z., Nie, Z. H. & Ma, H. W. Structural damage detection with automatic feature-extraction through deep learning. Comput. Civ. Infrastruct. Eng. 32, 1025–1046 (2017).
44. Zhao, R. Deep learning and its applications to machine health monitoring. Mech. Syst. Signal Process. 115, 213–237 (2019).
45. Janssens, O. et al. Convolutional neural network based fault detection for rotating machinery. J. Sound Vib. 377, 331–345 (2016).
46. Jing, L., Zhao, M., Li, P. & Xu, X. A convolutional neural network based feature learning and fault diagnosis method for the condition monitoring of gearbox. Measurement 111, 1–10 (2017).
47. Zhao, R., Wang, J., Yan, R. & Mao, K. Machine health monitoring with LSTM networks. In Proc. 2016 10th International Conference on Sensing Technology (ICST) (2016).
48. Zhao, R., Yan, R., Wang, J. & Mao, K. Learning to monitor machine health with convolutional Bi-directional LSTM networks. Sens. (Switz.) 17, 1–18 (2017).
49. Jia, F., Lei, Y., Guo, L., Lin, J. & Xing, S. A neural network constructed by deep learning technique and its application to intelligent fault diagnosis of machines. Neurocomputing 272, 619–628 (2017).
50. Li, X., Ding, Q. & Sun, J. Q. Remaining useful life estimation in prognostics using deep convolution neural networks. Reliab. Eng. Syst. Saf. 172, 1–11 (2018).
51. Sateesh Babu, G., Zhao, P. & Li, X.-L. Deep Convolutional Neural Network Based Regression Approach for Estimation of Remaining Useful Life. (eds Navathe S., Wu W., Shekhar S., Du X., Wang X., Xiong H.) Database Systems for Advanced Applications. DASFAA 2016. Lecture Notes in Computer Science, vol 9642. (Springer International Publishing AG, Cham, Switzerland, 2016).
52. Yuan, M., Wu, Y. & Lin, L. Fault diagnosis and remaining useful life estimation of aero engine using LSTM neural network. Int. Conf. Aircr. Util. Syst. 135–140 (2016).
53. Wu, Y., Yuan, M., Dong, S., Lin, L. & Liu, Y. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing 275, 167–179 (2017).
54. Zhang, C., Lim, P., Qin, A. K. & Tan, K. C. Multiobjective deep belief networks ensemble for remaining useful life estimation in prognostics. IEEE Trans. Neural Netw. Learn. Syst. 28, 2306–2318 (2017).
55. Zhang, Y., Xiong, R., He, H. & Liu, Z. A LSTM-RNN method for the lithium-ion battery remaining useful life prediction. 2017 Progn. Syst. Heal. Manag. Conf. PHM-Harbin 2017 - Proc. (2017).
56. Zhang, Q., Yang, L. T., Chen, Z. & Li, P. A survey on deep learning for big data. Inf. Fusion 42, 146–157 (2018).
57. Nabian, M. A. & Meidani, H. Deep learning for accelerated seismic reliability analysis of transportation networks. Comput. Civ. Infrastruct. Eng. 33, 443–458
58. Fritsch, J., Kuhnl, T. & Geiger, A. A new performance measure and evaluation benchmark for road detection algorithms. In Proc. 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013) 1693–1700 (IEEE, New Jersey, 2013).
59. Menze, B. H. et al. The multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Trans. Med Imaging 34, 1993–2024 (2016).
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article's Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this license, visit http://creativecommons.
© The Author(s) 2018
npj Materials Degradation (2018) 37 Published in partnership with CSCP and USTB