DEVELOPING HEALTH INDICATORS FOR COMPOSITE
STRUCTURES BASED ON A TWO-STAGE SEMI-SUPERVISED
MACHINE LEARNING MODEL USING ACOUSTIC EMISSION DATA
MORTEZA MORADI*, JUAN CHIACHÍO† AND DIMITRIOS ZAROUCHAS*
*Center of Excellence in Artificial Intelligence for structures, prognostics & health management,
Aerospace Engineering Faculty, Delft University of Technology
Kluyverweg 1, Delft, 2629 HS, The Netherlands
e-mail: M.Moradi-1@tudelft.nl
† Dept. Structural Mechanics & Hydraulics Engineering, Andalusian Research Institute in Data
Science and Computational Intelligence (DaSCI), University of Granada
Granada 18001, Spain
Abstract. Composite structures are highly valued for their strength-to-weight ratio, durability,
and versatility, making them ideal for a variety of applications, including aerospace,
automotive, and infrastructure. However, potential damage scenarios like impact, fatigue, and
corrosion can lead to premature failure and pose a threat to safety. This highlights the
importance of monitoring composite structures through structural health monitoring (SHM) and
prognostics and health management (PHM) to ensure their safe and reliable operation. SHM
provides information on the current state of the structure, while PHM predicts its future
behavior and determines necessary maintenance. Health indicators (HIs) play a crucial role in
both SHM and PHM, providing information on structural health and behavior, but accurate
determination of these indicators can be challenging due to the complexity of material behavior
and multiple sources of damage in composite structures. In the present work, a model containing
a developed adaptive standardization, a dimension reduction sub-model, a time-independent
sub-model, and a time-dependent sub-model is introduced to address this challenge. First, the
raw data collected by the acoustic emission technique monitoring composite structures under
fatigue loading is processed to provide plenty of statistical features. The extracted features are
adaptively standardized according to the available data until the current time. Then, the
principal component analysis algorithm is employed to reconstruct a few yet highly informative
features out of those statistical features. An artificial neural network is used to regress the
principal components to the HI that meets the prognostic criteria. Finally, the last sub-model
takes into account the time dependency of HI values during fatigue loading. Compared with a
previous deep learning model, the proposed model achieves a higher overall HI quality score
with about 0.7% of the learnable parameters.
Key words: Prognostics and Health Management, Structural Health Monitoring, Intelligent
Health Indicator, Artificial Intelligence, Composite Structures, Acoustic Emission.
X ECCOMAS Thematic Conference on Smart Structures and Materials
SMART 2023
D.A. Saravanos, A. Benjeddou, N. Chrysochoidis and T. Theodosiou (Eds)
Available online at www.eccomasproceedia.org
Eccomas Proceedia SMART (2023) 923-934
© 2023 The Authors. Published by Eccomas Proceedia.
Peer-review under responsibility of the organizing committee of SMART 2023.
doi: 10.7712/150123.9844.451295
1 INTRODUCTION
The use of composite structures is increasing in different industries thanks to their desirable
mechanical properties, such as low weight and high strength. However, interpreting and
predicting their behavior is more challenging than that of isotropic materials such as metals,
especially during complex operational loading conditions like compression-compression
fatigue loading [1]. In fact, describing the behavior of composite structures with the existing
physics-based models has not been completely successful, taking different aspects into account
[2, 3]. This becomes even more unpredictable when some uncertain events, such as impacts,
occur that were not accounted for in the design calculations [4]. With this in mind, the
question is how to use these structures safely over a long period of time in sensitive
industries such as aviation. The simple and fast answer could be increasing the design (safety)
factor, making the structures thicker and the aircraft heavier. However, this solution is clearly
inefficient in several respects, including emissions, cost, and sustainability. Thus, the
prediction of the structure's behavior, especially its remaining useful life (RUL), in such
industries not only increases safety and efficiency but also saves time and money regarding
maintenance. However, a health indicator (HI) is needed to first represent the damage status of
the structure and secondly predict its RUL [5]. The first part, namely diagnosis, can help to
interpret the behavior of the structure and its damages, which will be useful to discover the
weak points and improve the design. The second part, namely prognosis, can improve safety
by predicting the failure time of the component in advance. Diagnosis and prognosis together can
significantly improve the decision-making process for maintenance, which is very costly in
aviation: diagnosis indicates which actions are needed, and prognosis determines when they
should be taken.
Designing (or discovering) a HI that satisfies the needs of both diagnostics and prognostics
is challenging, and even more so for complicated cases such as composite structures. The HI
(or damage index) of a structure always decreases (or increases) during operation if no
maintenance or self-healing occurs. This property should be built into the design of a HI
and checked by a metric, namely monotonicity (Mo). When a group of
similar structures reach their end-of-life (EoL), their comprehensive HIs should logically and
ideally end up at the same value, representing the failure threshold. Unfortunately, HIs at the
EoL do not end up with the same value and fluctuate; this deviation can be measured by a metric
called prognosability (Pr). Finally, if the HIs for similar structures follow a similar correlation
in terms of usage time and have the same pattern, they are more predictable. This similarity in
HIs’ trends can be measured by the criterion of trendability (Tr) [6-8]. The first two expectations
(Mo and Pr) can be considered as facts, while the maximum Tr obviously may not be achievable
due to the stochasticity and uncertainty of influential phenomena, including different
progressive damage scenarios and loading conditions. Nevertheless, achieving HIs with a high
Tr is still a target in order to enhance the RUL prediction accuracy. From the perspective of
prognostics, which is the main focus of the current work, a HI should meet these three
evaluation criteria: Mo, Pr, and Tr.
Complex time-dependent patterns (e.g., progressive damages in composite laminates) and
uncertain events (e.g., a bird strike to an airplane) cannot be taken into account without online
condition monitoring [9], which is termed structural health monitoring (SHM) for the case of
structures. Thus, SHM plays an important role in the diagnostics of the structures [10]. An
extension of SHM is termed "prognostics and health management (PHM)" technology, which
includes RUL prediction and is more comprehensive. Among different SHM techniques,
acoustic emission (AE) is one of the most popular and promising ones [11]. The principles of
elastic wave dispersion through the structure are the basis for this passive SHM technique,
which is highly sensitive to damage initiation and propagation. An AE system continuously
gathers signals from sensors attached to the structure. Since these signals are not steady over
time due to the nonlinear nature of the physical system in operation, they should be processed
and mined over time-windowed intervals. Moreover, interpreting the raw AE data and
translating it to the health state of the structure is not straightforward [12]. Thus, feature
extraction from the time-windowed AE data not only helps with this interpretation but also
reduces the storage needed for the massive raw data, which is problematic in long-duration
applications such as the fatigue loading of structures (a reduction from billions to thousands)
[13]. It should be noted that in complex applications where no features are known in advance to
be informative, it is important to first extract as many statistical features as possible from
the time and frequency domains of the signals. However, such a large number of features requires
a more complicated fusion model to construct desirable HIs. To mitigate this issue, dimension
reduction techniques, which preserve most of the variation among features, lead to a simpler
HI construction model.
One of the common characteristics of the prognostics and HI construction models is that
they are time-dependent, meaning that the correlation between the historic data from the
beginning (often in a healthy state) and the current time should be taken into account to improve
the performance of the HI and RUL prediction models. The HI construction model can therefore
benefit from accounting for the time dependency of the SHM data.
An AE dataset of composite panels with a single stiffener that were subjected to impact and
run-to-failure compression-compression (C-C) fatigue loading is examined in the current work
[14]. The dataset consists of 201 statistical (time- and frequency-domain) features extracted
from AE data that had been windowed using different window lengths and slide sizes. In the
current work, a window length and a slide of 500 cycles each have been selected. The 201
statistical features are first adaptively
standardized for each composite panel individually using a new standardization method. Then,
using the principal component analysis (PCA) method, 201 features are reduced to 10 features,
which are then imported into a multilayer perceptron (MLP) to be regressed to HI labels without
taking the time-dependency relationship of the SHM data at various time steps into account. A
semi-supervised learning technique is used to train the model using the simulated ideal labels
because the true HI labels are not available [15]. With an objective function combining the HIs
evaluation criteria and the regression error, the Bayesian optimization algorithm is used to
determine the hyperparameters of this time-independent model (TIM). The predicted HI values
with TIM, or "1st level HI," are then imported into the next model to take into account the time-
dependency from the prior SHM data up to the current time step. But before using this time-
dependent model (TDM), the 1st level HIs are resampled based on usage time to have the same
length in each batch with the goal of enhancing the TDM section's performance. The
performance of the suggested approach is finally confirmed by a comparison of the final outputs
(the 2nd level HIs) alongside the output of TIM in terms of the criteria scores.
2 CRITERIA
To evaluate the quality of a prognostic signature (HI), three confirmed criteria (Mo, Pr, and
Tr) are used, which are expressed as follows:
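$$Mo = \frac{1}{M}\sum_{j=1}^{M}\left|\frac{\sum_{k=1}^{N_j}\sum_{p=k+1}^{N_j}(t_p - t_k)\,\mathrm{sgn}\big(x_j(t_p) - x_j(t_k)\big)}{\sum_{k=1}^{N_j}\sum_{p=k+1}^{N_j}(t_p - t_k)}\right| \qquad (1)$$

$$Pr = \exp\left(-\frac{\mathrm{std}_j\big(x_j(t_{N_j})\big)}{\mathrm{mean}_j\left|x_j(t_1) - x_j(t_{N_j})\right|}\right) \qquad (2)$$

$$Tr = \min_{j,k}\left|\frac{\mathrm{cov}(x_j, x_k)}{\sigma_{x_j}\,\sigma_{x_k}}\right|, \quad j, k = 1, \dots, M \qquad (3)$$

where $x_j(t_k)$ and $x_j(t_p)$ represent the measurements at the times of $t_k$ and $t_p$,
respectively. $\mathrm{cov}(x_j, x_k)$ is the covariance, where $x_j$ is the vector of
measurements on the $j$-th specimen (among $M$ specimens) that has $N_j$ measurements.
$\sigma_{x_j}$ and $\sigma_{x_k}$ are the standard deviations of $x_j$ and $x_k$,
respectively. The selected metric for Mo in Eq. (1), the so-called Modified Mann-Kendall
(MMK), compared to the other versions (Sign and Mann-Kendall), is more robust to noise and
also considers the relation of data points with a time gap of more than one unit [13, 16]. All
three criteria get a score in the range of [0, 1], with 1 representing the optimum score for the
HIs. After considering all of the above-mentioned criteria, the Fitness metric is defined as
follows:

$$Fitness = a \cdot Mo + b \cdot Pr + c \cdot Tr \qquad (4)$$

which ranges from 0 (minimum quality) to 3 (maximum quality) for the evaluated HIs,
assuming that the control constants a, b, and c are 1.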
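To make the three criteria concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly used forms of the metrics: a time-gap-weighted Mann-Kendall statistic for Mo, the exponential of the negative ratio between the spread of end-of-life values and the mean start-to-end change for Pr, and the minimum absolute Pearson correlation over specimen pairs for Tr. The function names, the use of NumPy, and the resampling of HIs of different lengths onto a common grid before correlating them are all assumptions made for illustration.

```python
import numpy as np

def monotonicity(his, times):
    """Time-gap-weighted Mann-Kendall score, averaged over specimens (assumed form)."""
    scores = []
    for x, t in zip(his, times):
        x = np.asarray(x, dtype=float)
        t = np.asarray(t, dtype=float)
        k, p = np.triu_indices(len(x), k=1)  # every pair of points with p > k
        dt = t[p] - t[k]
        scores.append(abs(np.sum(dt * np.sign(x[p] - x[k])) / np.sum(dt)))
    return float(np.mean(scores))

def prognosability(his):
    """exp(-std of end-of-life values / mean absolute start-to-end change)."""
    start = np.array([x[0] for x in his], dtype=float)
    end = np.array([x[-1] for x in his], dtype=float)
    return float(np.exp(-np.std(end) / np.mean(np.abs(start - end))))

def trendability(his, n_points=100):
    """Minimum absolute Pearson correlation over all specimen pairs.

    Assumption: HIs of different lengths are first resampled onto a common
    normalized grid so that they can be correlated point by point.
    """
    grid = np.linspace(0.0, 1.0, n_points)
    rows = [np.interp(grid, np.linspace(0.0, 1.0, len(x)), x) for x in his]
    return float(np.min(np.abs(np.corrcoef(np.vstack(rows)))))

def fitness(his, times, a=1.0, b=1.0, c=1.0):
    """Weighted sum of the three criteria; equals 3 for ideal HIs when a = b = c = 1."""
    return a * monotonicity(his, times) + b * prognosability(his) + c * trendability(his)
```

For two ideal linear HIs rising from 0 to 100, each criterion evaluates to 1 and the sum to 3. The pairwise construction in `monotonicity` uses memory quadratic in the sequence length, which is acceptable for a sketch but may need chunking for long HIs.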
3 SEMI-SUPERVISED CRITERIA-BASED FUSION MODEL
In the present work, a machine learning approach is developed that is based on a combination
of a dimension reduction model (PCA), a time-independent model (TIM), and a time-dependent
model (TDM) after up-sampling of timeseries in each batch. Since no true value is available as
HIs, the ideal HIs labels are simulated in terms of the usage time, and a semi-supervised
framework is employed through implicitly implementing the prognostic metrics (Mo, Pr, and
Tr) as well as exploiting the given EOL [17]. This framework falls under the inductive
learning algorithms termed intrinsically semi-supervised [15], which extend existing
supervised algorithms so that labeled and unlabeled data can be used directly to
optimize a multi-component objective function. The overall framework from the raw SHM
data to the final HI, including the proposed model, is shown in Figure 1. From the pre-
Figure 1.
3.1 Adaptive standardization
(5)
(6)
(7)
This process is performed for the extracted AE features of each composite specimen separately,
which is acceptable and applicable from the perspective of the prognostics.
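As an illustration of what standardizing "according to the available data until the current time" could look like, here is a minimal Python sketch. It assumes that each feature value at time step t is standardized with the mean and standard deviation of that feature over steps 1 to t only, so that no future data are used. This expanding-window form, the function name, and the `eps` guard are assumptions for illustration and may differ from the method defined in Eqs. (5)-(7).

```python
import numpy as np

def adaptive_standardize(features, eps=1e-8):
    """Causal z-score: row t is standardized with statistics of rows 0..t only.

    features: array of shape (time_steps, n_features) for ONE specimen.
    Assumption: an expanding-window mean and (population) standard deviation.
    """
    x = np.asarray(features, dtype=float)
    n = np.arange(1, x.shape[0] + 1)[:, None]          # number of samples seen so far
    mean = np.cumsum(x, axis=0) / n                     # running mean
    var = np.cumsum(x ** 2, axis=0) / n - mean ** 2     # running variance
    std = np.sqrt(np.maximum(var, 0.0))
    # Avoid division by zero while a feature has not varied yet (e.g. the first row).
    return (x - mean) / np.where(std > eps, std, 1.0)
```

Because only past and current samples enter the statistics, the standardized value at a given time step does not change when later measurements arrive, which is the property an online prognostic setting needs.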
3.2 Dimension reduction
Since the number of features extracted from the AE data is high (201), which would make the
subsequent models more complex, the features are first reduced. To this end, PCA, a
well-established dimension reduction technique, is used to reduce the 201 statistical features
to 10 principal components (PCs). The percentage of the total variance explained by the 10 PCs
for the twelve composite specimens is listed in Table 1. As can be seen, the minimum variance
covered by the first 10 PCs is 86.97%, for composite specimen 3. More PCs could be extracted as
inputs for the subsequent models, but this level of retained variance is accepted in order to
keep those models simple. The PCs for all twelve composite specimens are shown in Figure 2.
Table 1: The percentage of the total variance covered by 10 PCs for different composite specimens.
Specimen              1      2      3      4      5      6      7      8      9      10     11     12
Covered variance (%)  90.79  91.24  86.97  93.16  95.13  92.46  92.45  97.97  91.75  96.81  94.46  97.69
Figure 2: The first 10 PCs extracted from 201 AE features for twelve single-stiffener composite panels.
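The "covered variance" values in Table 1 can be illustrated with a short Python sketch of PCA computed through a singular value decomposition. It assumes the input is the standardized feature matrix of one specimen (time windows by features); the function name and the choice of NumPy's SVD are assumptions for illustration, not the implementation used in the paper.

```python
import numpy as np

def pca_reduce(features, n_components=10):
    """Project features onto the leading principal components.

    features: array of shape (time_windows, n_features), e.g. (T, 201).
    Returns the PC scores and the percentage of total variance they cover.
    """
    x = np.asarray(features, dtype=float)
    centered = x - x.mean(axis=0)                        # PCA requires centered data
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T              # shape (T, n_components)
    covered = 100.0 * np.sum(s[:n_components] ** 2) / np.sum(s ** 2)
    return scores, float(covered)
```

The squared singular values are proportional to the variance along each component, so their normalized cumulative sum gives the percentage of variance retained by the first 10 PCs.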
3.3 Time-independent model (TIM)
In this section, an initial neural network architecture is established first, and the BO
algorithm is then employed to optimize the pertinent constructive hyperparameters. Given that
only ten features (PCs) remain after the dimension reduction step by PCA, an MLP with a few
layers can be a suitable starting point for the regression task, which entails fitting 10 PCs to a
value (HI).
3.3.1 Multilayer perceptron (MLP)
To fit the 10 PCs to the ideal simulated HI, an MLP network including 4 layers is designed
with a linear transfer function as the output layer. A modified mean absolute error (MAE) is
used as the loss function between predictions and targets:
(8)
where R represents the number of responses, and denotes target value and the network’s
output for response , respectively. is the regularization parameter to improve generalization
by modifying the performance function. Using this performance function leads the neural
network to have smaller weights and biases, resulting in a smoother response and less
overfitting. is considered 1 for the current work. Only the training (composite specimens) set
is used to train and validate the MLP model, with 30% of the training data used for validation.
Although the maximum number of training epochs was set to 1000, the output of the MLP is
based on the best validation loss, with the validation check patience set to 10. According to the
optimizers and default values in the MATLAB R2022a framework, the other hyperparameters
are determined using the BO algorithm.
3.3.2 Bayesian optimization (BO)
The hyperparameters optimized by the BO include the training optimizer algorithm as well as
the number of neurons and the activation function of each layer. The first optimizable variable
is the optimizer algorithm, chosen from three options: Levenberg-Marquardt (LM), Bayesian
regularization (BR), and resilient backpropagation (RB). The search ranges for the number of
neurons in fully connected (FC) layers 1, 2, 3, and 4 are set to [1,50], [1,50], [1,50], and
[1,10], respectively, based on trial and error. The last
optimizable variable is the activation function, which is assigned the same type for all hidden
layers and is selected from a categorical space including linear, Rectified Linear Units (ReLU),
saturating linear, symmetric saturating linear, hard-limit, symmetric hard-limit, log-sigmoid,
hyperbolic tangent sigmoid, Elliot symmetric sigmoid, radial basis, normalized radial basis,
triangular basis, inverse, softmax, and competitive according to the MATLAB definition. The
BO algorithm was given 100 trials with an exploration ratio of 0.8 in parallel computing to
optimize the hyperparameters.
A new objective function for the BO algorithm is introduced in order to consider the HI’s
evaluation metrics. The BO objective includes two parts: regression loss and criteria loss. The
first one is based on the root-mean-square error (RMSE) between the targets and predictions
over only the validation (composite specimens) set. The second loss includes the Mo, Pr, and
Tr, which are calculated considering all data sets, including training and validation portions.
The relevant equations are as follows:
(9)
(10)
(11)
where is the importance coefficient of against . has been
normalized based on the maximum fitness score to fall in the range of [0, 1], while
has been normalized based on the maximum target value, which is 100. It should be noted that
the ideal HI values are simulated in a range from 0 (healthy state) to 100 (failure state).
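The structure of such a two-part objective can be sketched in Python as follows. This is only an illustration under stated assumptions: the regression loss is taken as the validation RMSE divided by the maximum target value (100), the criteria loss as the shortfall of the Fitness score from its maximum (3) divided by that maximum, and the two are combined as a weighted sum with a hypothetical coefficient `alpha`. The exact expressions of Eqs. (9)-(11) may differ.

```python
def bo_objective(rmse_val, mo, pr, tr, alpha=1.0, max_target=100.0, max_fitness=3.0):
    """Objective to MINIMIZE: normalized regression loss plus weighted criteria loss.

    Assumptions (not taken from the paper's equations):
      - regression loss = validation RMSE / max_target, so it lies in [0, 1]
        whenever the RMSE does not exceed the target range;
      - criteria loss = (max_fitness - (Mo + Pr + Tr)) / max_fitness, in [0, 1];
      - alpha weights the criteria loss against the regression loss.
    """
    regression_loss = rmse_val / max_target
    criteria_loss = (max_fitness - (mo + pr + tr)) / max_fitness
    return regression_loss + alpha * criteria_loss
```

With this form, a model that reproduces the ideal labels exactly and yields perfect criteria scores has an objective of 0, and the criteria term keeps penalizing HIs that fit the labels on average but fluctuate or end at inconsistent values.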
3.4 Time-based resampling
After the TIM step, the 1st level predicted HI could be considered as a prognostic parameter
to import into a prognostic model for predicting RUL. However, the time-dependency between
the data has not yet been considered, even though this relationship is a fact according to the
physics of the phenomenon. Before designing the time-dependent model, its input data should
be resampled in such a way that all input HI(1) sequences (1st level predicted HIs) within a
batch have the same length. However, typical padding techniques, such as zero padding, are not
appropriate in this case since the HI values at a given percentage of lifetime should
be similar. For instance, if the batch size is 2 and the lengths of the HIs are 100 and 1000, the
first HI cannot be extended by 900 zero values to have the same length as the second HI. In this
case, the HI at the EOL for the first specimen becomes 0, while it should be 100, the same as
the second HI. Similarly, the typical interpolation cannot be performed since the correlation
between the number of data points in HI and the EOL is not constant or even linear. Sometimes
the length of HI for a longer EOL is less than for one with a shorter EOL, as a result of the
varying sampling frequency of the AE system depending on the pre-determined amplitude
threshold value and uncertain progressive damage in composite panels. With this in mind, a
new up-sampling technique called "time-based resampling" is employed in the current research.
First, the time vectors of HIs are converted to percent lifespan, i.e., [0%, 100%]. Then, in each
batch, the shorter HI vectors (in terms of the number of data points) are up-sampled to equal
the longer HI vector’s length according to the relevant time vectors. This process is repeated
for each batch separately. It should be noted that the size of the batch cannot be equal to the
number of all training data sets in this case. Because the TIM will learn only the position of the
data regardless of its value, it starts to predict from zero (healthy) at the beginning up to 100
(failure) at the EOL according to the position of the coming data. For instance, for HIs with an
equal length of 1000, the TIM model, by using only a bias, will learn that position 1 should
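The time-based resampling described in this section can be sketched in Python as follows. The sketch converts each time vector to percent lifespan and then evaluates every HI in the batch at the percent-lifespan positions of the longest sequence. The use of linear interpolation, the assumption that the last time stamp of each sequence is its EOL, and the function name are assumptions for illustration.

```python
import numpy as np

def time_based_resample(batch_times, batch_his):
    """Up-sample all HIs in a batch to the length of the longest one.

    batch_times: list of increasing time vectors (e.g. in cycles), one per specimen.
    batch_his:   list of 1st level HI vectors with matching lengths.
    Returns an array of shape (batch_size, longest_length).
    """
    # Percent lifespan of every sample, assuming the last time stamp is the EOL.
    lives = [100.0 * np.asarray(t, dtype=float) / float(t[-1]) for t in batch_times]
    longest = max(range(len(batch_his)), key=lambda i: len(batch_his[i]))
    grid = lives[longest]  # target positions in percent lifespan
    # Interpolate in the lifespan domain, not the index domain (assumed linear).
    rows = [np.interp(grid, life, np.asarray(h, dtype=float))
            for life, h in zip(lives, batch_his)]
    return np.vstack(rows)
```

Unlike zero padding, this keeps the value at 100% lifespan equal to the original end-of-life HI for every specimen in the batch, and the longest sequence is returned unchanged.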
3.5 Time-dependent model (TDM)
Figure 3.
0.01
learning rate drop factor of 0.1, a learning rate drop period of 10, and a gradient threshold of 1,
all of which have been selected after trial and error. Despite the fact that the maximum number
of training epochs was set to 2000, the network’s output is based on the best validation loss,
with the validation check frequency set to 50 iterations (the number of trained batches) and the
validation check patience set to 50. As mentioned in the previous section, a batch size of 2
was used; no padding is needed since the sequences in each batch are already of equal
length after time-based resampling.
4 RESULTS AND DISCUSSIONS
Table 2 displays the hyperparameters for the TIM sub-model that have been optimized using
BO. The 1st and 2nd level HIs that are outputted after the TIM and TDM sub-models can be seen
in Figure 4. It is important to note that specimens 11 and 12 were utilized as validation and test
samples, respectively. The error shown in Figure 4 represents the RMSE between the simulated
ideal HIs and the constructed (1st and 2nd level) HIs. As shown in the figure, the HI(1)s generated
by TIM have high fluctuations, while TDM upon TIM produces smooth HI(2)s. The HI(1)s for
specimens 6 (training) and 11 (validation) exhibit a decreasing trend, which highlights the
limitations of the TIM. On the other hand, the TDM was able to correct the trend for these two
specimens. In terms of the behavior of HI(2)s, there are several (1 to 3) increasing steps observed
over the fatigue life, which can be interpreted as different damage states and can contribute to
the subsequent prognostic model for RUL prediction.
The evaluation metrics for the constructed HIs are presented in Table 3. Thanks to
considering the time-dependency, all the scores for HI(2) are higher than those for HI(1). The
proposed model not only offers a simpler and faster approach, but it also matches the Mo and Tr
scores and improves the Pr and overall Fitness scores (2.91 versus 2.89) of the
state-of-the-art results [17]. Notably, the deep learning
model in [17] comprises 193418 learnable parameters, whereas the proposed method has only
1319 learnable parameters (~0.7%), with 828 assigned to TIM and 491 assigned to TDM.
Table 2: The hyperparameters of the TIM optimized by the BO.
Optimizer algorithm      FCL1  FCL2  FCL3  FCL4  Activation function          Objective value
Bayesian regularization  7     4     50    9     symmetric saturating linear  0.6949
Table 3: The HIs’ evaluation metrics for the 1st and 2nd level HIs constructed by TIM and TDM, respectively.
HIs’ evaluation criteria  TIM (HI(1))  TIM-TDM (HI(2))  Ref. [17]
Mo                        0.93         1                1
Pr                        0.65         0.97             0.95
Tr                        0.62         0.94             0.94
Fitness                   2.21         2.91             2.89
Figure 4: The 1st and 2nd level HIs outputted after the TIM and TDM sub-models.
5 CONCLUSIONS
The study examined an AE dataset of composite panels with a single stiffener subjected to
impact and run-to-failure C-C fatigue loading. Using a new standardization method and PCA,
the 201 statistical features were reduced to 10 features, which were then imported into an MLP
to be regressed to simulated HI labels. The TIM sub-model was optimized using the BO
algorithm, and the predicted HI values were imported into the TDM sub-model to account for
time dependency and further improve the HIs. The proposed approach produced a higher Fitness
score than the state-of-the-art results with far fewer learnable parameters (~0.7%), making it
much simpler and faster than the deep learning model in the previous study.
REFERENCES
[1] B. Ameri, M. Moradi, B. Mohammadi, and D. Salimi-Majd, "Investigation of nonlinear
post-buckling delamination in curved laminated composite panels via cohesive zone
model," Thin-Walled Structures, vol. 154, p. 106797, 2020.
[2] T. Peng, Y. Liu, A. Saxena, and K. Goebel, "In-situ fatigue life prognosis for composite
laminates based on stiffness degradation," Composite Structures, vol. 132, pp. 155-165,
2015.
[3] F. Wu and W. Yao, "A fatigue damage model of composite materials," International
Journal of Fatigue, vol. 32, no. 1, pp. 134-138, 2010.
[4] C. S. Kumar, M. Fotouhi, M. Saeedifar, and V. Arumugam, "Acoustic emission based
investigation on the effect of temperature and hybridization on drop weight impact and
post-impact residual strength of hemp and basalt fibres reinforced polymer composite
laminates," Composites Part B: Engineering, vol. 173, p. 106962, 2019.
[5] M. Moradi, P. Komninos, R. Benedictus, and D. Zarouchas, "Interpretable neural
network with limited weights for constructing simple and explainable HI using SHM
data," in Annual Conference of the PHM Society, 2022, vol. 14, no. 1.
[6] J. B. Coble, "Merging data sources to predict remaining useful life–an automated
method to identify prognostic parameters," 2010.
[7] Y. Lei, Intelligent fault diagnosis and remaining useful life prediction of rotating
machinery. Butterworth-Heinemann, 2016.
[8] L. Saidi, J. B. Ali, E. Bechhoefer, and M. Benbouzid, "Wind turbine high-speed shaft
bearings health prognosis through a spectral Kurtosis-derived indices and SVR,"
Applied Acoustics, vol. 120, pp. 1-8, 2017.
[9] J. Contreras Lopez, J. Chiachío, A. Saleh, M. Chiachío, and A. Kolios, "A cross-sectoral
review of the current and potential maintenance strategies for composite structures," SN
Applied Sciences, vol. 4, no. 6, p. 180, 2022.
[10] C. Kralovec and M. Schagerl, "Review of structural health monitoring methods
regarding a multi-sensor approach for damage assessment of metal and composite
structures," Sensors, vol. 20, no. 3, p. 826, 2020.
[11] M. Saeedifar and D. Zarouchas, "Damage characterization of laminated composites
using acoustic emission: A review," Composites Part B: Engineering, vol. 195, p.
108039, 2020.
[12] M. Saeedifar, M. Fotouhi, M. A. Najafabadi, and H. H. Toudeshky, "Prediction of
delamination growth in laminated composites using acoustic emission and cohesive
zone modeling techniques," Composite Structures, vol. 124, pp. 120-127, 2015.
[13] M. Moradi, A. Broer, J. Chiachío, R. Benedictus, T. H. Loutas, and D. Zarouchas,
"Intelligent health indicator construction for prognostics of composite structures
utilizing a semi-supervised deep neural network and SHM data," Engineering
Applications of Artificial Intelligence, vol. 117, p. 105502, 2023.
[14] M. Moradi, A. Broer, and D. Zarouchas. Acoustic emission dataset of single-stiffener
composite panels subjected to impact and run-to-failure fatigue loading [Online].
Available: https://doi.org/10.17632/ys8r8m7bx2.2
[15] J. E. Van Engelen and H. H. Hoos, "A survey on semi-supervised learning," Machine
learning, vol. 109, no. 2, pp. 373-440, 2020.
[16] N. Eleftheroglou, "Adaptive prognostics for remaining useful life of composite
structures," 2020.
[17] M. Moradi, A. Broer, J. Chiachío, R. Benedictus, and D. Zarouchas, "Intelligent Health
Indicators Based on Semi-supervised Learning Utilizing Acoustic Emission Data," in
European Workshop on Structural Health Monitoring: EWSHM 2022-Volume 3, 2022,
pp. 419-428: Springer.