Reducing black-box nonlinear state-space models: a real-life case study
P. Z. Csurcsia1,2, J. Decuyper1,2, B. Renczes3, M.C. Runacres1,2, T. De Troyer1,2
1Thermo and Fluid Dynamics (FLOW), Faculty of Engineering, Vrije Universiteit Brussel (VUB),
Pleinlaan 2, 1050, Brussels, Belgium
2Brussels Institute for Thermal-Fluid Systems and clean Energy (BRITE), Vrije Universiteit Brussel
(VUB) and Université Libre de Bruxelles (ULB)
3Department of Measurement and Information Systems (MIT),
Budapest University of Technology and Economics
ABSTRACT
A known challenge when building nonlinear models from data is to limit the size of the model in terms of the number of
parameters. Especially for complex nonlinear systems, which require a substantial number of state variables, the classical
formulation of the nonlinear part (e.g. through a basis expansion) tends to lead to a rapid increase in the model size. In this
work, we propose two strategies to counter this effect:
1) The introduction of a novel nonlinear-state selection algorithm. The method relies on the non-parametric nonlinear distortion
analysis of the Best Linear Approximation framework to identify the state variables which are the most impacted by
nonlinearities. Pre-selecting only the most appropriate states when constructing the nonlinear terms results in a considerable
reduction of the model size.
2) The use of so-called ‘decoupled’ functions directly in the model estimation procedure. While it is known that function
decoupling can reduce the model size in a secondary step, we show how a decoupled formulation can be imposed to advantage
from the start. The results of this approach are benchmarked against the state-of-the-art a posteriori decoupling technique.
Our strategies are demonstrated on real-life data of a multiple-input, multiple-output (MIMO) ground vibration test of an F-16
aircraft, a prime example of a complex, nonlinear dynamic system.
Keywords: MIMO systems, nonlinearity, data-driven modeling, decoupling, ground vibration testing
1 INTRODUCTION
Engineers and scientists want mathematical models of the observed system for understanding, design and control. Modeling
nonlinear systems is essential because many systems are inherently nonlinear. The challenge lies in the fact that there are
several differently behaving nonlinear structures and therefore modeling is very involved. As it becomes increasingly important
to cope with nonlinear analysis and modeling, various approaches have been proposed; for a detailed overview we refer to [1]
[2] and [3].
In this work, we propose a data-driven nonlinear modeling procedure where we build upon a number of well-known, matured,
system identification techniques, and add two novel tools in order to overcome some of the drawbacks of the classical approach.
In doing so, we provide a complete modeling strategy which allows retrieving compact nonlinear state-space models from data.
The procedure combines both nonparametric and parametric nonlinear modeling techniques and is particularly useful when
dealing with complex nonlinear systems, such as dynamic structures with many resonances. An important domain of application
is found in the modeling of multiple-input, multiple-output (MIMO) real-life vibro-acoustic measurements. We illustrate the
methodologies on a ground vibration test of an F-16 aircraft.
The recommended nonlinear modeling procedure is listed below and illustrated in Figure 1.
▪ In the experiment design step, systems are excited by broadband (multisine) signals at multiple excitation levels. The
recommended multisine (also known as pseudo-random noise) excitation signal consists of a series of periodic multisines that
are mutually independent over the experiments. The main advantage of the recommended signals is that there is no problem
with spectral leakage or transients. They deliver excellent linear models while providing useful information about the level and
type of nonlinearities.
▪ In the second step, the measured signals are (nonparametrically) analyzed by applying the (multisine-driven) Best Linear
Approximation (BLA) framework of MIMO systems as a generalization of the conceptual work [4]. Even though the technique
works best with the recommended multisines, (with some loss of accuracy) any (orthogonal) signal can be applied. This
(multisine-driven) BLA analysis differs from the classical H1 Frequency Response Function (FRF) estimation process [5]. The
key idea is to make use of the statistical features of the excitation signal. The outcome of the BLA analysis results in a series
of nonparametric FRFs together with noise and nonlinear distortion estimates.
▪ In the third step, a classical discrete (parametric) linear state-space model is built based on the BLA estimates on a level
closest to the linear regime of operation. The model complexity (number of states) can easily be determined with a cross-
validation-based model order scanning.
▪ The fourth step is the building of a nonlinear model. The considered nonlinear model class is of the polynomial nonlinear
state-space (PNLSS) type. A PNLSS model consists of the classical linear state-space part (initialized by a BLA FRF estimated
on a linear excitation level) and a nonlinear extension part with high-dimensional multivariate polynomials of inputs and states.
The PNLSS model structure is a flexible representation. Examples show that it can capture many different nonlinear phenomena
such as hysteresis, nonlinear feedback, bifurcations, etc. [6]. An additional advantage is that the structure can also easily deal
with multiple inputs and multiple outputs. Because of the state-space representation, it is well suited for control and simulation.
Note that nonlinear state-space models relying on other representations than polynomials have been proposed. In [7] artificial
neural networks were introduced as nonlinear terms. It is important to notice that the contributions which are introduced below
remain relevant irrespective of the choice of nonlinear formulation.
Although PNLSS models have already been successfully applied in a large range of applications, until now, the application
was restricted to problems of low to moderate complexity (e.g.: [8] [9] [10]). The reason is that even for a system of moderate
complexity, the number of model parameters is prohibitively large when resorting to the classical PNLSS framework. In this
work, we demonstrate that the parameter estimation problem remains tractable by introducing a novel nonlinear-state selection
mechanism.
In addition, we introduce a second strategy to constrain the number of model parameters through function decoupling. The
current state-of-the-art is to decouple in a post-processing step. Indeed, function decoupling transforms the large multivariate
polynomials present in PNLSS models into a simplified representation, thereby reducing the number of parameters, and – in
many cases – leading to physical interpretability of the nonlinear terms [11]. In this work, however, we show how a decoupled
model can be imposed from the start of the modeling. Therefore, the main contributions in this work are:
1. Multisine-driven BLA distortion information is used to provide a novel robust method for choosing the optimal
nonlinear representation (methodology explained in Section 3.1, illustration is given in Section 5.6).
2. Directly enforcing a decoupled model structure allows skipping the computationally heavy PNLSS step
(methodology explained in Section 3.2, illustration is given in Section 5.7).
3. The first end-to-end implementation, from BLA to decoupling, of a strategy for the system identification of complex,
nonlinear systems (Sections 4-5).
4. The first instance of the nonlinear modeling of MIMO GVT data of an F-16 aircraft (Section 4).
The numerical results of this work are obtained by the use of the tools developed within our research group: the SAMI
(Simplified Analysis for Multiple Input Systems) toolbox [12] and the freely available PNLSS toolbox [13].
The paper is organized as follows. Section 2 briefly describes the considered systems, excitation signal design, the main
assumptions, and the modeling frameworks applied in this work. Then, the novel strategies to constrain the model order for
complex, nonlinear systems are introduced in Section 3. The description and preliminary analysis of the GVT experiments are
given in Section 4. Section 5 then presents a step-by-step description of the state-of-the art in nonlinear modeling including an
illustration of the benefit of these model-reduction strategies. Section 6 gives a concise summary of the obtained results.
Conclusions can be found in Section 7.
Figure 1: The entire process of the proposed nonlinear modeling technique from experiment design until model validation.
The black boxes refer to the state-of-the-art techniques, while the red boxes refer to the novel components of the proposed
work.
2 BASIC ELEMENTS AND ASSUMPTIONS FOR NONLINEAR MIMO MODELING
Assumptions on the systems considered
The dynamics of a linear MIMO system can be nonparametrically characterized in the frequency domain by its Frequency Response Matrix (FRM, a matrix whose elements are FRFs [14]) model $G(k)$ at frequency index $k$, which relates the inputs $U(k)$ to the outputs $Y(k)$ of a measurement of $N$ samples as follows:

$$Y(k) = G(k)\,U(k) \quad (1)$$

where the (complex valued) $Y(k) \in \mathbb{C}^{n_y \times 1}$, $G(k) \in \mathbb{C}^{n_y \times n_u}$, $U(k) \in \mathbb{C}^{n_u \times 1}$, at frequency $k f_s / N$ with sampling frequency $f_s$.
To make the text more accessible, the frequency indices and dimensionalities will be omitted.
The system represented by $G$ is linear when the superposition principle is satisfied in steady-state, i.e.:

$$G\{aU_1 + bU_2\} = a\,G\{U_1\} + b\,G\{U_2\} \quad (2)$$

where $a$ and $b$ are scalar values. If $G$ does not vary for any $a$, $b$ (and excitation), then the system is called linear time-invariant (LTI). On the other hand, when $G$ varies with $a$ and $b$ (and the variation depends also on the excitation signal – e.g. level of excitation, distribution, etc.), then the system is called nonlinear.
Since time-varying systems are often misinterpreted as nonlinear systems, it is important to mention that when G varies over
the measurement time, but at each time instant the principle of superposition is satisfied, then the system is called linear time-
varying (LTV) [15] [16] [17].
In this article, we assume that the underlying systems are damped, bounded-input, bounded-output stable, time-invariant,
nonlinear systems that can be adequately described with a (smooth) low degree Volterra-series [18] and the linear response of
the system is still present and identifiable.
In addition, we assume that the output of the underlying system has the same period as the excitation signal (i.e. the system exhibits PISPO (period in, same period out) behavior [19]). To ensure that we work in the PISPO regime, the frequency excitation range
will be limited with respect to the sampling frequency (see Section 4) such that the output remains (nearly) periodic.
Assumptions on the instrumentation and measurement
From an instrumentation point of view, it is assumed that the measurement system is perfectly synchronized: the samples at
different channels (input and output nodes) are acquired at the same time instant; the sampling frequency is kept constant.
Furthermore, there is an adequate antialiasing filter that suppresses frequency components higher than the Nyquist frequency.
The actuator (e.g. shaker) of the system is linear. The excitation signal is measured accurately such that the signal-to-noise ratio
(SNR) is at least . In this case, the systematic errors introduced by the input noise, both on the estimated FRF, and its
confidence bounds, are negligibly small [20]. As most of the mechanical (force and acceleration) measurements have a much
better SNR, this is a reasonable assumption.
The steady-state output (denoted by $Y_0$ in the frequency domain) is assumed to be measured with time-domain additive, independent and identically distributed Gaussian noise (denoted by $N_Y$ in the frequency domain) with zero mean and finite variance $\sigma^2$, such that the measurement is given by:

$$Y = Y_0 + N_Y \quad (3)$$
The benefit of multisines in a (MIMO) Best Linear Approximation framework
There exists a wide variety of excitation signals to test the underlying structures in a user-friendly, time-efficient manner [21].
Many prefer noise excitation signals as they are simple to implement, but there is a possible leakage (transient) error and the
possibility to detect nonlinearities is limited. We recommend using so-called multisine signals [14] [22] (pseudo-random noise
signal) because they are easy to generate, periodic, and they are noise-like: in the time domain they look like white noise, act
like it, but they are not noise (see [3]). The magnitude characteristics of multisines are set by the user in the frequency domain
(typically flat) but the phases are randomly chosen. The proposed (flat magnitude) multisine signals are wide-sense stationary
signals [23].
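As an illustration, a flat-magnitude random-phase multisine can be generated in a few lines. This is a minimal Python/NumPy sketch (the function name is our own, not part of the SAMI toolbox); the example reuses the F-16 settings from Section 4, which indeed yield 420 excited bins.

```python
import numpy as np

def random_phase_multisine(n_samples, fs, f_min, f_max, rms=1.0, rng=None):
    """One period of a flat-magnitude, random-phase multisine.

    Only bins between f_min and f_max are excited; the phases are drawn
    uniformly, so the signal looks noise-like but is exactly periodic.
    """
    rng = np.random.default_rng(rng)
    df = fs / n_samples                       # frequency resolution
    k_min = int(np.ceil(f_min / df))
    k_max = int(np.floor(f_max / df))
    U = np.zeros(n_samples // 2 + 1, dtype=complex)
    k = np.arange(k_min, k_max + 1)
    U[k] = np.exp(1j * 2 * np.pi * rng.random(k.size))  # flat magnitude, random phase
    u = np.fft.irfft(U, n=n_samples)
    return u * rms / np.sqrt(np.mean(u**2))   # scale to the requested RMS level

# Example with the Section 4 settings: fs = 200 Hz, N = 4096, 4.541-25 Hz, 8 N RMS
u = random_phase_multisine(4096, fs=200.0, f_min=4.541, f_max=25.0, rms=8.0)
```

A new phase realization is obtained simply by changing the random seed, which is exactly what the BLA framework of the next subsection exploits.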
For experiments with multiple inputs we recommend using orthogonal multisine excitation signals [24, 25]. The proposed
procedure is to generate independent random excitations for every input channel [26] [27], as opposed to the classical Hadamard
technique [4]. A freeware implementation of a multisine toolbox can be found in [28].
The essence of a Best Linear Approximation (BLA) [4] is to minimize the mean squared error between the measured nonlinear
response of a system and the response of a linear nonparametric frequency response function model (the BLA), for a given
level of excitation. In the BLA framework, systems are excited by multiple periods and realizations of (orthogonally shifted)
random phase multisines. The key idea is to use statistical features of the multisine excitation signal to separate the phase-
coherent (linear) part of the response signal from the non-phase-coherent part (noise and nonlinear distortions), and then to
reduce the random nonlinear distortions by averaging over multiple realizations of the multisine [29]. The multisine-driven
BLA analysis requires the calculation of different variance quantities.
The BLA framework is defined in Figure 2. $G_{BLA}$ is the “classical” linear (phase-coherent) component of the model. The nonlinear component of the model is $Y_S$. This component is phase non-coherent, meaning that the varying (harmonic-wise) phase at the input results in a random phase rotation at the output. Increasing the number of random realizations of the multisine signal allows us to treat the random output phase rotations as a nonlinear noise source. The classical additive measurement noise is represented by the component $N_Y$. $G_B$ represents the bias error, i.e. the remaining (coherent) nonlinearities after multiple realizations.
Figure 2: The building blocks of the BLA.
The processing and analysis of the measured signals differ from the well-known H1 estimation process [14]. The key idea is to use some statistical features of the excitation signal. In this framework, there are $M$ (input number times phase-rotated) different realizations of the multisine excitation signal, and each realization is repeated for $P$ periods. The usage of periodic excitation reduces the effects of the measurement noise ($N_Y$). The usage of multiple realizations reduces the impact of nonlinear noise (non-coherent nonlinearities, $Y_S$). The considered steady-state model at period $p$ and realization $m$ at frequency bin $k$ is given by:

$$Y^{[m,p]}(k) = G_{BLA}(k)\,U^{[m]}(k) + Y_S^{[m]}(k) + N_Y^{[m,p]}(k) \quad (4)$$

where the estimate $\hat{G}_{BLA}(k)$ is obtained by averaging $Y^{[m,p]}(k)\,U^{[m]+}(k)$ over the periods and realizations, and $U^{+}$ is the generalized (Moore–Penrose) inverse of $U$.
For further computational and fundamental details we refer to [4]. A freeware implementation of a BLA toolbox can be found
in [30].
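To make the averaging idea concrete, the following sketch estimates a SISO BLA at a single frequency bin together with its noise and total variances. This is an illustrative simplification of the MIMO procedure of [4], not the toolbox implementation; the function name and variance bookkeeping are our own.

```python
import numpy as np

def bla_siso(U, Y):
    """SISO sketch of the averaging behind eq. (4).

    U: (M,) complex input spectra at one frequency bin (one per realization).
    Y: (M, P) complex output spectra (P periods per realization).
    """
    M, P = Y.shape
    Y_mean = Y.mean(axis=1)                     # averaging over periods -> less noise
    G_m = Y_mean / U                            # FRF per realization
    G_bla = G_m.mean()                          # averaging over realizations
    # period-to-period scatter estimates the measurement-noise variance on the FRF
    var_Y = (np.abs(Y - Y_mean[:, None])**2).sum(axis=1) / (P - 1)
    var_noise = (var_Y / (P * np.abs(U)**2)).mean() / M
    # realization-to-realization scatter contains noise + nonlinear distortions
    var_total = (np.abs(G_m - G_bla)**2).sum() / (M * (M - 1))
    return G_bla, var_noise, var_total
```

Comparing `var_total` against `var_noise` is what separates nonlinear distortions from measurement noise in the BLA analysis.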
Polynomial Nonlinear State-Space models
The nonlinear model structure, which will serve as a starting point, is based on the extension of the classical linear parametric state-space modeling framework. An $n$th order (noise-free) discrete-time polynomial nonlinear state-space (PNLSS) model can be expressed as:

$$x(t+1) = A\,x(t) + B\,u(t) + E\,\zeta(x(t),u(t))$$
$$y(t) = C\,x(t) + D\,u(t) + F\,\eta(x(t),u(t)) \quad (5)$$

where the $u(t) \in \mathbb{R}^{n_u}$ and $y(t) \in \mathbb{R}^{n_y}$ vectors contain the input-output values at discrete time instance $t$; the state vector $x(t) \in \mathbb{R}^{n}$ represents the memory of the system; $A$, $B$ and $E$ define the state equation; $C$, $D$ and $F$ define the output equation. When dealing with measurements, the ideal noise-free output $y(t)$ in (5) should be substituted with the experimental data.
The state vector includes the common dynamics present in the different outputs. The state equation represents the evolution of the state as a function of the input and the previous state. The $E$ and $F$ matrices realize the nonlinear extension part where auto- and cross-terms of the input and states are considered. These terms can include (but are not limited to) monomials such as $x_1^2$, $x_1 x_2$ or $x_1 u_1$. The vectors $\zeta(x,u)$ and $\eta(x,u)$ contain the nonlinear monomials in $x$ and $u$ of degree two up to a chosen degree (and they can include degree zero as well). The PNLSS framework has been implemented as a freely accessible Matlab toolbox [13].
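A minimal simulation of the PNLSS structure (5) can be sketched as follows. This is illustrative Python; the actual toolbox [13] differs in implementation details such as the exact monomial ordering, and the function names here are our own.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomials(z, degrees):
    """All monomials in the entries of z with total degree in `degrees`."""
    return np.array([np.prod([z[i] for i in c])
                     for d in degrees
                     for c in combinations_with_replacement(range(len(z)), d)])

def simulate_pnlss(A, B, C, D, E, F, u, degrees=(2, 3)):
    """Simulate the discrete-time PNLSS model of eq. (5) (minimal sketch)."""
    x = np.zeros(A.shape[0])
    y = np.zeros((u.shape[0], C.shape[0]))
    for t in range(u.shape[0]):
        z = np.concatenate([x, u[t]])          # states and inputs feed the monomials
        zeta = monomials(z, degrees)
        y[t] = C @ x + D @ u[t] + F @ zeta     # output equation
        x = A @ x + B @ u[t] + E @ zeta        # state equation
    return y
```

Setting `E` and `F` to zero recovers the underlying linear state-space (BLA) model, which is exactly how the PNLSS optimization is initialized.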
Since the nonlinear extension part in (5) typically involves a polynomial basis expansion in the states and inputs, the number
of model parameters rapidly increases for complex systems with many states. Especially for applications in optimization or
(real-time) control, models cannot become too computationally heavy. The concern is that the PNLSS model is sensitive to
unseen input distributions and might produce an unstable simulation output. One available solution is the a posteriori model
reduction through function decoupling [11]. In the next section, however, we introduce two novel, alternative strategies to
constrain the number of model parameters.
3 CONSTRAINED-ORDER PNLSS MODELING: STATE SELECTION AND FUNCTION DECOUPLING
The main issue with the classical PNLSS models is that the number of parameters grows combinatorially with the model
order. Apart from being computationally more cumbersome, increasing model complexity beyond a critical threshold will also
lead to less accuracy. This threshold is related to the ratio of the number of parameters in the model and the number of training
data points. This ratio should be well below 1 for ordinary least-squares-based algorithms. There exist indicators to verify
whether the model order is not excessive [3] but these indicators fail to direct the user towards a better model structure. This is
especially relevant when the system is nonlinear, when the nonlinear part (see (5)) is expanded in a basis of the (linear) states
and (multiple) inputs.
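The combinatorial growth is easy to quantify: the number of monomials of total degree $d$ in $v = n + n_u$ variables is $\binom{v+d-1}{d}$. The following sketch counts the parameters of the nonlinear part alone (whether degree-0 and degree-1 terms are included depends on the chosen configuration; here only degrees 2 and up are counted).

```python
from math import comb

def n_pnlss_parameters(n_states, n_inputs, n_outputs, max_degree):
    """Parameters in the nonlinear part (E and F in eq. (5)) when all
    cross-products of states and inputs of degree 2..max_degree are used."""
    v = n_states + n_inputs
    # monomials of total degree d in v variables: C(v + d - 1, d)
    n_monomials = sum(comb(v + d - 1, d) for d in range(2, max_degree + 1))
    return (n_states + n_outputs) * n_monomials

# Example: a 24-state model with 2 inputs, 2 outputs and cubic terms
n_params = n_pnlss_parameters(24, 2, 2, 3)   # 94302 parameters in E and F alone
```

With only tens of thousands of training samples, such a model clearly violates the parameter-to-data ratio requirement, which motivates the reduction strategies below.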
Nonlinear state-selection method for PNLSS
The first step in the proposed method is to obtain the nonparametric BLA models (i.e., the FRM and distortion level estimates,
see Section 2.3). This step – together with nonlinearity assessment – is elaborated in Sections 4.2-4.3.
Next, the parametric BLA model parameters (i.e., the parameters of the MIMO SS model) have to be estimated using the BLA
FRMs and their total distortions. The state-space model parameters are obtained by using a weighted frequency domain fitting
method – where the weights are the total covariance estimates of the FRM (see methodological and numerical details in Section
5.3).
The last classical step would be estimating the (fully built) PNLSS models. The PNLSS models are initialized with the help of the state-space BLA models using all (linear) states, which may result in the above-mentioned problem.
Provided that multisine excitation signals have been used in a BLA framework (Section 2.3), the nonlinear distortion
information of the BLA FRM ( in Figure 2) can be used to make an informed choice of the (nonlinear) states that should be
included or excluded in the model represented by (5). Indeed, using the proposed BLA framework, one can estimate the
significance of the nonlinear contribution of every state individually (for a concrete example see Figure 12). For each state, the
corresponding resonance frequency $f_i$ can be found using the eigenvalues $\lambda_i$ of the state matrix $A$ as:

$$f_i = \frac{f_s\,\left|\ln \lambda_i\right|}{2\pi} \quad (6)$$
In general, not every state is expected to contribute significantly to the nonlinearity in the system. Therefore, it is unnecessary,
even counterproductive, to blindly include all linear states in the nonlinear part of the model. Instead, we propose to use the
nonlinear distortion analysis to select only those monomials of the multivariate polynomial which contain highly nonlinear
states. The resulting model will hence be sparser in the nonlinear part.
In the proposed technique, we recommend ordering the states from the most significant (dominant) nonlinear state to the least
significant nonlinear (thus nearly linear) one. Analyzing the relative contributions of the states, it becomes possible to determine
what the optimal number of nonlinear states is while guaranteeing that only those states contributing most to the nonlinear
behaviors are included. The algorithm is as follows:
1. Obtain the resonance frequencies from the state-space representation using (6):
With this step one can link the states of the parametric BLA model to the resonances of the nonparametric BLA FRM model
and (nonlinear) distortions.
2. Estimate the bandwidth (3 dB) interval around the identified modes (frequencies):
The isolation of the frequency band of the modes is needed because we are interested in the noise and nonlinearity levels of the
modes represented by the BLA states – not in the distortion levels at the resonance frequency alone.
3. Integrate the level of nonlinear distortions within the bandwidths (neglecting the samples where the noise distortions
are higher than the nonlinear distortions):
This step gives an idea of the relative and absolute amount of nonlinear distortions that occur in the bandwidth of the identified
states (modes).
4. Order the states according to the level of nonlinearity (obtained from step 3) and choose the appropriate states:
By sorting the states according to the levels of nonlinear distortions in descending order, one can choose to include the states
where significant nonlinear distortions are present. It is recommended to normalize the integrated nonlinearity levels (by the
total amount of nonlinearities (see e.g. Figure 13)) and to estimate the cumulative sum of the nonlinear distortions over the
(ordered) states (see e.g. Figure 12).
The proposed method gives the user an indication what percentage of the total nonlinearities will be accounted for in the model.
We recommend discarding the states in the nonlinear part of (5) where only insignificant (very low) levels of nonlinearity are present. Furthermore, at each given number of (ordered) states it is possible to calculate the critical threshold (i.e. the ratio of
the number of parameters and the data samples) such that the end-users can make an informed choice about the model
complexity and the extent to which nonlinearities are covered by the selected states.
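The four steps above can be sketched as follows. This is illustrative Python: the bandwidth estimate from the pole damping ratio and all function names are our own simplifications of the procedure.

```python
import numpy as np

def rank_nonlinear_states(A, fs, freq, nl_level, noise_level):
    """Order BLA states by integrated nonlinear distortion (Section 3.1 sketch).

    A: discrete-time BLA state matrix; fs: sampling frequency [Hz];
    freq: frequency grid of the BLA FRM; nl_level, noise_level: nonlinear
    and noise distortion levels on that grid.
    """
    lam = np.linalg.eigvals(A)
    s = fs * np.log(lam)                        # discrete poles -> continuous poles
    f_res = np.abs(s) / (2 * np.pi)             # step 1: resonance frequencies, cf. eq. (6)
    damp = -s.real / np.abs(s)                  # damping ratio of each pole
    levels = np.empty(len(lam))
    for i, (f0, z) in enumerate(zip(f_res, damp)):
        bw = 2.0 * z * f0                       # step 2: half-power (3 dB) bandwidth
        band = (freq >= f0 - bw / 2) & (freq <= f0 + bw / 2)
        keep = band & (nl_level > noise_level)  # step 3: skip noise-dominated bins
        levels[i] = nl_level[keep].sum()
    order = np.argsort(levels)[::-1]            # step 4: most nonlinear state first
    cumulative = np.cumsum(levels[order]) / levels.sum()
    return order, cumulative
```

The returned cumulative sum tells the user what fraction of the total nonlinear distortion is covered by the first k ordered states, supporting the informed complexity trade-off described above.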
A detailed example and further elaboration can be found in Sections 5.4 and 5.6.
Direct decoupling
A general decoupled function is given by

$$f(q) = W\,g(V^T q) \quad (7)$$

where $g$ is a vector function collecting $r$ univariate functions, such that the $i$th function is given by $g_i(v_i^T q)$ with $v_i$ the $i$th column of $V$. With $m$ inputs and $n$ outputs to the function, the elements are of the following dimensions: $V \in \mathbb{R}^{m \times r}$ and $W \in \mathbb{R}^{n \times r}$. Decoupled functions can be regarded as additive structures built up out of univariate functions, and are therefore also known as additive index models [31]. Notice that a decoupled function is in fact a single-hidden-layer neural network with flexible activation functions. The decoupled form is often preferred over classical regression models, e.g. deep neural networks with fixed activation functions or basis expansions, for two main reasons:
1. The univariate functions (also known as branches or ridge functions) can easily be visualized. It is known that these internal
nonlinear elements can convey information of the nonlinear system under study, hence increasing the explainability of the
model.
2. When compared to neural networks with predefined activation functions or basis expansions, the decoupled form typically
results in a more efficient parameterization, leading to a substantial model reduction [32].
Figure 3. Graphical illustration of the decoupling structure, which can be regarded as a single-hidden layer neural network
with flexible activation functions.
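Evaluating the structure of eq. (7) is straightforward, as the following minimal sketch shows (the branch choices in the example are purely illustrative):

```python
import numpy as np

def decoupled_function(q, W, V, g_branches):
    """Evaluate f(q) = W g(V^T q), the decoupled structure of eq. (7).

    V (m x r) mixes the m inputs into r internal variables, each branch
    g_i is a univariate function, and W (n x r) recombines the branch
    outputs into the n outputs.
    """
    z = V.T @ q                                        # r internal variables
    g = np.array([gi(zi) for gi, zi in zip(g_branches, z)])
    return W @ g

# Example: two branches, one cubic and one piecewise-linear (illustrative)
V = np.array([[1.0, 0.0], [0.0, 1.0]])
W = np.array([[1.0, 0.5]])
branches = [lambda z: z**3, lambda z: np.maximum(z, 0.0)]
y = decoupled_function(np.array([2.0, -1.0]), W, V, branches)
```

The parameter count scales linearly with the number of branches r, instead of combinatorially with the polynomial degree, which is the source of the model reduction.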
Many studies have focused on devising algorithms which are capable of approximating any given multivariate function with a
decoupled function [11] [32]. A state-of-the-art method is reviewed in the Appendix. However, the computationally heavy
decoupling strategy can, in some cases, be omitted. We propose to introduce the decoupled function structure from the start in
the dynamic state-space models. The resulting state-space model assumes the following form:

$$x(t+1) = A\,x(t) + B\,u(t) + W_x\, g\!\left(V^T \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}\right)$$
$$y(t) = C\,x(t) + D\,u(t) + W_y\, g\!\left(V^T \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}\right) \quad (8)$$

The challenge is then shifted from a posteriori decoupling to the a priori selection of an appropriate basis for the functions $g_i$ and an adequate initialization of $V$ and $W$.
We suggest initializing the parameters of the decoupled function using either random values or unity (instead of the results which would normally be obtained from the a posteriori decoupling, as detailed in the Appendix). Random or unity initialization means that the parameters of $g$ and $V$ in (8) take normally distributed random or unity (1) values, while $W$ is initialized with zeros, thus bypassing the decoupling process during the first optimization step. The latter guarantees that the stability of the initial linear part (obtained from the BLA) is not compromised by extending the model with decoupled functions. All model parameters may then be tuned simultaneously to the input-output data by means of a Levenberg-Marquardt optimization routine. Note that the basis of $g$ need not necessarily be of the polynomial type. In fact, discontinuous nonlinearities can also be considered, e.g. when $g_i$ is a piece-wise linear function.
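The proposed initialization can be sketched as follows (a hypothetical helper; the number of branches r is a user choice, and which parameter sets receive random versus zero values follows the description above):

```python
import numpy as np

def init_decoupled_pnlss(n_states, n_inputs, n_outputs, r, scheme="random", rng=None):
    """Initialize the decoupled nonlinear part of eq. (8) (Section 3.2 sketch).

    With W_x and W_y set to zero, the extended model initially reproduces
    the (stable) linear BLA model exactly; V gets random or unity values
    so the branches receive distinct internal variables once the
    Levenberg-Marquardt optimization starts.
    """
    rng = np.random.default_rng(rng)
    m = n_states + n_inputs                 # branch inputs: states and inputs
    V = rng.standard_normal((m, r)) if scheme == "random" else np.ones((m, r))
    W_x = np.zeros((n_states, r))           # zero -> nonlinear part is off at start
    W_y = np.zeros((n_outputs, r))
    return V, W_x, W_y
```

Because the nonlinear extension contributes nothing at the first iteration, the optimizer starts from the BLA cost and can only improve on it.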
The benefit of nonlinear state selection and direct decoupling are illustrated on a complex, MIMO, nonlinear system in the next
sections.
4 EXPERIMENTAL, MIMO GROUND VIBRATION DATA ON AN F-16 AIRCRAFT
Introduction
Here we introduce the ground vibration testing campaign of a decommissioned F-16 aircraft with two payloads mounted at the
wing tips, see Figure 4. These payloads, T-shaped connecting elements, slide into the wing tip rails of the F-16. This mounting
interface is a major source of nonlinear distortions – due to clearance/free play and friction. Furthermore, the aircraft is not
entirely symmetrical due to the configuration features. The measurement setup consists of 2 shakers–2 force cells (placed under
the wings), and various acceleration sensors. The shaker reference signals are full random phase multisine signals. The sampling
frequency is 200 Hz. The period length is 4096 resulting in a frequency resolution of 0.0489 Hz. The smallest excited frequency
is 4.541 Hz, the highest excited frequency is 25 Hz. The data acquisition system has an integrated antialiasing filter (with
attenuation of 150 dB per octave) with a cutoff frequency of 90 Hz. In the range of excitation, 420 frequency bins can be found.
There are 7 different multisine realizations (M) for each input channel per experiment (in total 14 realizations with the
recommended orthogonal phase rotation). There are two excitation levels measured: 1) the low-level is 8 N RMS per input
channel, 2) the high-level is 50 N RMS per input. The low-level excitation is considered to be in the linear operating regime
whereas high-level excitation corresponds more to the nonlinear regime (see an elaboration in the next subsection). Each
multisine realization is repeated 3 times (i.e., P=3).
In the excitation frequency band, the aircraft possesses about 12 resonance modes. The first few modes below 5 Hz correspond
to rigid body motions of the structure. The first flexible mode around 5.2 Hz corresponds to wing bending deformations. The
mode involving the most substantial nonlinear distortions is the wing torsion mode located around 7.3 Hz. An in-dept modal
analysis of the F-16 aircraft can be found at [33].
A similar single-input measurement campaign and its benchmark data are openly accessible [34]. As previous work pointed
out, the SISO F-16 GVT has proven to be highly challenging. A collection of these works can be found on the benchmark
website [34]. In this article, we consider for the first time a MIMO F-16 GVT setup – with a broader frequency range of
excitation.
Figure 4: F-16 ground vibration testing measurement.
Nonparametric Data Analysis
Checking the transients revealed that only the first period is disturbed by the transient, therefore each first period is discarded
in a (phase-shifted) realization. The input-output signals and FRFs are shown only at the driving points (for conciseness). The
input (force) measurements are shown in Figure 5, left. Observe that the excitation signals are almost ideally flat, while the
noise characteristics are different. For instance, at high-level excitation, the SNR is approximately 50 dB at the right wing, and
around 45 dB at the left wing. Furthermore, noise levels move together with the variation of the excitation power: this is an
indication of (weak, even type [4] of) nonlinear behavior because increasing the excitation level results in decreasing SNR.
The output (acceleration) measurements are shown in Figure 5, right and in Figure 6. Observe that the level-wise noise
characteristics are almost identical, but the spectra are different: there is a large difference between the responses at left and
right wings – due to the asymmetric configuration of the airplane. It can also be observed that the resonances have been shifted,
which is an indication of (strong, odd type [4] of) nonlinearities. Furthermore, observe in Figure 6 that the output of the low-
level excitation drops down to the noise estimate level outside of the excited frequency band, confirming the assumption that
this level is (nearly) linear. In case of high-level excitation, the output remains at a relatively high level outside of the excitation
band, and it converges slowly to the estimated noise level. Furthermore, clear resonances (harmonics) can be seen outside of
the excited frequency band. This information will be later used at the building stage of the nonlinear models (see Sections 4.3
and 5.2-5.3).
An in-depth modal analysis of the F-16 aircraft can be found at [33], while the nonparametric modeling is described in [35].
Figure 5: Input and output signals and their noise estimates measured at the right and left wings in the frequency domain of
excitation at the low and high amplitude levels.
Figure 6: Output signals and their noise estimates measured at the right and left wings in the whole frequency domain at the
low and high amplitude levels.
Preliminary FRF Analysis
Figure 7 shows the FRFs at the driving points. Despite the fact that the high-level excitation is only 16 dB higher than the low-
level excitation, it can be clearly observed that the FRFs differ from each other (compare grey and black thick lines, particularly
around 18-22 Hz). Using the proposed BLA framework we can directly estimate the noise level (black-grey thin lines) and
nonlinearity level estimation (red-orange thin lines). For instance, at the resonance of the wing torsion mode at low-level of
excitation (around 7.3 Hz) there is an SNR of 47 dB and an SNLR (signal-to-nonlinearity ratio) of 20 dB. This means that the
main error comes from the nonlinear distortions at this resonance. This extra information would have been impossible to derive
using the classical H1 estimator. A complete, end-to-end nonlinear modeling strategy, including a further analysis of the SNR,
SNLR levels and their connections to the modeling errors, is discussed in Section 5. For the interpretation of the modes
(resonance frequencies) we refer to [33].
Figure 7: FRFs, noise and nonlinear distortions estimated at driving points at low and high excitation levels.
5 END-TO-END MODELING OF COMPLEX, NONLINEAR DYNAMIC SYSTEMS
In this section we provide a step-by-step description of the suggested nonlinear modeling approach, comparing the state-of-
the-art techniques to the advances described in Section 3, and illustrate this on the MIMO F-16 data introduced in Section 4.
The datasets
In this section, the workflow of the modeling process illustrated in Figure 1 is implemented. To build and test the models, 9
realizations were taken at each of the low and high levels of excitation. From each realization, only the steady-state periods
are used and averaged (to reduce the noise and the computational load), and the resulting data are split into three
parts:
▪ 4 realizations at each excitation level – with a total number of 32768 samples – are used as the estimation dataset to
build the models,
▪ 4 realizations at each excitation level – with a total number of 32768 samples – are used as the validation dataset to
assess the different model parametrizations/versions and to keep the model complexity under control,
▪ 1 realization at each excitation level – with a total number of 8192 samples – is used as the test dataset to compare the
different modeling approaches on a completely independent dataset.
The main metric used for comparison is the relative root mean squared error (rrmse), calculated in the time domain as

rrmse = \sqrt{\sum_{n=1}^{N} (y(n)-\hat{y}(n))^2} \Big/ \sqrt{\sum_{n=1}^{N} y^2(n)},    (9)

where \hat{y} is the modeled output test signal and y is the measured output test signal. This rrmse is calculated for the
left and right wings, for both the low- and high-level data independently.
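As a minimal sketch (the function name is ours), the metric of Equation (9) can be computed as:

```python
import numpy as np

def rrmse(y_measured: np.ndarray, y_model: np.ndarray) -> float:
    """Relative root mean squared error: rms of the output error,
    normalized by the rms of the measured output."""
    error_rms = np.sqrt(np.mean((y_measured - y_model) ** 2))
    return error_rms / np.sqrt(np.mean(y_measured ** 2))
```

With this normalization, a model that outputs only zeros scores an rrmse of 1.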
First, we will discuss the nonparametric, linear and nonlinear parametric modeling steps (as shown in Figure 1). In the last
subsection, we will comment on the user-oriented practical usage (computational complexities, memory needs) of a few
selected modeling approaches.
FRF fitting results
The first models to be assessed are the nonparametric FRM models. Table 1 shows the detailed fitting results of three modeling
scenarios: 1) the FRM based on low-level data, 2) the FRM based on high-level data, and 3) the FRM based on a combination
of low- and high-level data using 4 low-level and 4 high-level realizations. The low- and high-level FRM estimates are shown in
Figure 7. As there are 420 excited frequency lines in the range of interest, the resulting models are described by 420 parameters.
Unsurprisingly, each FRM performs best at the level at which it was built. The mixed-level FRM performs almost identically
in the low- and high-level cases. Furthermore, due to the asymmetric configuration of the plane, the left and right wing results
differ slightly. Of these three scenarios, the best performance was obtained by the mixed-level FRM, with a total rrmse of
0.56; the output of this FRM on a segment of the test dataset is shown in Figure 16.
What might seem strange at first is that the high-level BLA FRM performs poorly on the high-level dataset. This is
because of the large nonlinear distortion at high level. In principle, adding more random realizations should decrease the
error, as more of the stochastic nonlinear response is filtered out. In this GVT case, when all random realizations are used, the
total rrmse at high level drops from 0.76 to 0.49. This is, however, still a large error.
From the FRF BLA distortion analysis (see Section 4.3) the SNR and SNLR levels can be estimated. The SNLR/SNR levels
provide a (rough) indication of the lower bounds of modeling error using a linear/nonlinear framework.
The averaged SNLR estimate is 22 dB at low level and 14 dB at high level. These values correspond to rrmse levels
of 0.08 at low level and 0.15 at high level; these rrmse values are the lowest possible modeling error levels using a linear
modeling framework. The averaged SNR levels are 35 dB at low level, corresponding to an rrmse of 0.02, and 40 dB at high
level (an rrmse of 0.01); these rrmse values are the lowest possible modeling error levels using an advanced nonlinear
framework.
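These conversions follow the standard amplitude-ratio rule, which we assume underlies the figures above: a level of L dB corresponds to an error-to-signal ratio of 10^(−L/20). A one-line sketch:

```python
def db_to_rrmse(level_db: float) -> float:
    """Convert an SNR or SNLR level in dB to the corresponding
    rrmse lower bound (an amplitude ratio)."""
    return 10.0 ** (-level_db / 20.0)
```

For example, 40 dB gives 0.01 and 22 dB gives roughly 0.08, matching the bounds quoted above.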
These levels are only indications of the lower error bounds because the out-of-band distortions (distortions outside of the
excited range) are not included in these metrics. For highly nonlinear systems, the out-of-band distortions can be substantial,
increasing the expected rrmse values by a factor of three to five in this particular case.
Overall, these BLA FRM models would be insufficient in terms of modeling quality.
Table 1. Cross-validation results based on the nonparametric BLA FRF estimates (relative rms error on test data)

| Model | Low: right wing | Low: left wing | Low: mean | High: right wing | High: left wing | High: mean | Total average |
|---|---|---|---|---|---|---|---|
| Nonparametric BLA trained on low-level estimation data | 0.42 | 0.24 | 0.33 | 0.81 | 0.91 | 0.86 | 0.59 |
| Nonparametric BLA trained on high-level estimation data | 0.97 | 0.75 | 0.86 | 0.88 | 0.63 | 0.76 | 0.81 |
| Nonparametric BLA trained on mixed-level estimation data | 0.57 | 0.42 | 0.50 | 0.64 | 0.60 | 0.62 | 0.56 |
Parametric BLA
As a second step, different parametric state-space (SS) models are built based on the low-, high- and mixed-level nonparametric
BLA estimates – using the FRMs and their total distortions. To determine the best model order (i.e. the number of states), a
cross-validation model-order scan is performed over orders 1 to 20.
The overview of the model order scanning results, obtained on the validation dataset (which was not used for training), can be
seen in Figure 8. Next, three model orders are considered, grouped according to model complexity/memory needs:
▪ model with 11 states: the model complexity is low (169 parameters) while still providing a good result,
▪ model with 18 states: a more in-depth analysis is shown for this model order, which involves 400 parameters,
▪ model with 20 states: this is the best performing model, with 484 parameters.
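The order scan itself is a plain cross-validation loop. A sketch (the fit_ss and simulate callables stand in for the actual subspace fitting and simulation routines, which are not shown here):

```python
import numpy as np

def scan_model_order(fit_ss, simulate, val_u, val_y, orders=range(1, 21)):
    """Fit one state-space model per candidate order and score each
    on the validation data; return the best order and all scores."""
    errors = {}
    for n in orders:
        model = fit_ss(n)                    # estimate a model of order n
        y_hat = simulate(model, val_u)       # simulate on the validation input
        errors[n] = float(np.sqrt(np.mean((val_y - y_hat) ** 2)))  # absolute rms error
    best_order = min(errors, key=errors.get)
    return best_order, errors
```

The returned error dictionary is what is plotted per order in Figure 8.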
The FRM fitting results of these three models, trained on low-level data, are shown in Figure 9. The highest model order
provides the best fitting results (as expected), even though it still fails to capture a few resonances. The numerical results of
different scenarios are shown in Table 2. The low-, high- and mixed-level parametric BLA estimates improve on their
nonparametric versions: this is due to the smoothing effect of the parametric models. This smoothing effect is
important because certain nonlinearities may manifest as a kind of noise (see Section 4).
The first row of Table 2 shows the results of model fitting via a weighted frequency domain Levenberg–Marquardt optimization
with 100 iterations [36]. The weights are the total covariance estimates of the FRM (see Section 4.3). In the second row, no
variance information has been used to obtain the state-space models, which clearly leads to a worse fit. This conclusion holds
for all scenarios, but for the sake of simplicity the results are not detailed here.
The difference in performance between the models with 11 and 20 states is not as significant as one might expect (rrmse of
0.49 vs 0.41). The improvement in rrmse comes from the improved accuracy of the low-level fit (rrmse of 0.24 with 11 states
and 0.10 with 20 states).
The output of the low-level BLA SS FRM on a segment of the test dataset is shown in Figure 18. In short, we can conclude
that using the parametric SS representation improves the modeling quality, but it is still insufficient for accurate description.
Figure 8: Parametric model fitting based on low-level BLA FRF. State-space models are trained on the estimation dataset.
Absolute rms errors are calculated on the validation datasets.
Figure 9: State-space models of order 11, 18, and 20 estimated on the BLA FRFs at the low excitation level.
Table 2. Cross-validation results based on the parametric (state-space) BLA estimates (relative rms error on test data)

| Model | Low: right wing | Low: left wing | Low: mean | High: right wing | High: left wing | High: mean | Total average |
|---|---|---|---|---|---|---|---|
| SS model of order 18 fitted on low-level BLA FRF | 0.29 | 0.12 | 0.21 | 0.64 | 0.86 | 0.75 | 0.48 |
| SS model of order 18 on low-level FRF without variance information | 0.56 | 0.28 | 0.42 | 0.72 | 0.90 | 0.81 | 0.62 |
| SS model of order 18 fitted on high-level BLA FRF | 0.53 | 0.62 | 0.58 | 0.24 | 0.23 | 0.23 | 0.40 |
| SS model of order 18 fitted on mixed-level BLA FRF | 0.35 | 0.38 | 0.37 | 0.37 | 0.46 | 0.42 | 0.39 |
| SS model of order 11 fitted on low-level BLA FRF | 0.31 | 0.18 | 0.24 | 0.62 | 0.84 | 0.73 | 0.49 |
| SS model of order 20 fitted on low-level BLA FRF | 0.09 | 0.10 | 0.10 | 0.62 | 0.81 | 0.72 | 0.41 |
The classical PNLSS modeling
The PNLSS models are initialized with the help of the previously obtained SS BLA models. To reduce the model complexity
and the computational needs, only nonlinearities in the state equation of (13) are considered, with 2nd and 3rd
order multivariate (state and input) monomials.
As a continuation of the previous subsection, PNLSS models are built for the following model orders:
▪ 11, resulting in a total of 6006 parameters – the results are shown for the low-level SS BLA only,
▪ 18, resulting in a total of 31500 parameters – the results are detailed below,
▪ 20, resulting in a total of 45540 parameters – the results are shown for the low-level SS BLA only.
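These totals can be reproduced by counting the multivariate monomials of degree 2 and 3 in the combined state and input variables (2 inputs here), with one full set of coefficients per state equation. A sketch of the count (function name is ours):

```python
from math import comb

def pnlss_nonlinear_params(n_states: int, n_inputs: int, degrees=(2, 3)) -> int:
    """Number of nonlinear coefficients when each of the n_states state
    equations carries every monomial of the given degrees in the
    n_states + n_inputs variables. Monomials of degree d in v variables:
    comb(v + d - 1, d), i.e. combinations with repetition."""
    n_vars = n_states + n_inputs
    n_monomials = sum(comb(n_vars + d - 1, d) for d in degrees)
    return n_states * n_monomials
```

With 2 inputs this gives 6006, 31500 and 45540 for orders 11, 18 and 20, matching the counts above and showing the combinatorial growth with the number of states.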
A PNLSS model is obtained using a Levenberg–Marquardt optimization routine with 40 iteration steps [6]. In each iteration,
all terms in (5) are fine-tuned. The performance of the selected PNLSS models is detailed in Table 3. Illustrations are shown
in Figure 19 and Figure 21.
The separation of the linear and nonlinear parts is of no importance for the behavior of the model. However, this distinction
will turn out to be practical, since the first step of the identification procedure is to estimate a linear parametric state-space
linear model on top of the nonparametric linear BLA model using a frequency domain subspace method [36]. The subspace
method provides a good initial estimate of the linear model parameters; to further improve on those estimates, a
Levenberg–Marquardt optimization is executed on these parameters. The total distortion level that was calculated in the
previous section is used as a frequency weighting in the optimization problem.
Next, the entire nonlinear model from (5) is estimated. By appropriately setting the transient parameters, we can again make
sure that the modeled output is computed in steady-state. This steady-state modeled output is used during the optimization to
evaluate the cost function.
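A stripped-down, single-input single-output sketch of how such a model is simulated (the transient handling assumes a periodic input; all names are ours, not a toolbox API):

```python
import numpy as np

def simulate_pnlss(A, B, E, C, D, zeta, u, n_transient=0):
    """Simulate a PNLSS model with the nonlinearity in the state equation only:
        x(t+1) = A x(t) + B u(t) + E zeta(x(t), u(t)),   y(t) = C x(t) + D u(t).
    zeta(x, u) returns the stacked nonlinear monomials. Prepending n_transient
    samples of the (periodic) input and discarding the corresponding outputs
    approximates the steady-state response."""
    u_ext = np.concatenate([u[-n_transient:], u]) if n_transient else u
    x = np.zeros(A.shape[0])
    y = np.empty(len(u_ext))
    for t, ut in enumerate(u_ext):
        y[t] = C @ x + D * ut
        x = A @ x + B * ut + E @ zeta(x, ut)
    return y[n_transient:]
```

Setting zeta to return zeros reduces this to a plain linear state-space simulation, which is a convenient sanity check.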
Not surprisingly, the best results are obtained using low-level SS BLA. This is because the PNLSS works best when a high-
quality linear SS model is used to initialize the model structure. In the case of this F-16 GVT, the closest level to the linear
regime is the low-level excitation.
As in the SS model-building case, the best results are obtained when using the total distortion information from
Equation (4): the performance (rrmse) of the 18-state model improves by 0.08 when the FRM covariance is used (see Table 3).
The parameters, and therefore the performance of the linear model, change slightly after the optimization cycles. In this
situation, the rrmse improved slightly, by 0.01 (the results are not detailed here). The absolute best results (at the highest
computational cost) were obtained using the model with 20 states (see Table 3).
Table 3. Cross-validation results based on the PNLSS estimates (relative rms error on test data)

| Model | Low: right wing | Low: left wing | Low: mean | High: right wing | High: left wing | High: mean | Total average |
|---|---|---|---|---|---|---|---|
| PNLSS 18 states with low-level BLA SS model | 0.20 | 0.16 | 0.18 | 0.24 | 0.24 | 0.24 | 0.21 |
| PNLSS 18 states with high-level BLA SS model | 0.54 | 0.63 | 0.59 | 0.17 | 0.17 | 0.17 | 0.38 |
| PNLSS 18 states with mixed-level SS model | 0.36 | 0.39 | 0.37 | 0.26 | 0.31 | 0.29 | 0.33 |
| PNLSS 18 states with low-level SS model without variance information | 0.22 | 0.24 | 0.23 | 0.29 | 0.32 | 0.31 | 0.27 |
| PNLSS 11 states with low-level BLA SS model | 0.30 | 0.22 | 0.26 | 0.29 | 0.24 | 0.27 | 0.27 |
| PNLSS 20 states with low-level BLA SS model | 0.13 | 0.13 | 0.13 | 0.24 | 0.26 | 0.25 | 0.19 |
Classical decoupling – the indirect approach
In this subsection, the results of the application of the filtered tensor decomposition method (see Appendix) are shown. The
number of operating points on which to decouple the function is 250. The number of iterations of the alternating optimization
routine was set to 200. The resulting univariate functions are parameterized using fourth-order polynomials. Notice that this
is higher than the degree of the original PNLSS model (degree 2 and 3). While increasing the degree of the PNLSS model
would result in a combinatorial growth in the number of parameters, the size of a decoupled function scales only linearly with
the degree. This is an important benefit of the decoupling procedure.
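The linear scaling can be made concrete with a rough parameter count for the decoupled nonlinearity W g(V^T [x; u]): one column of V and of W per branch, plus a handful of polynomial coefficients per univariate function. This accounting is ours (whether constant terms are counted per branch shifts the total by the number of branches); it reproduces the 276 nonlinear parameters reported elsewhere in this paper for 6 branches and 20 states:

```python
def decoupled_nl_params(n_states: int, n_inputs: int, r: int, degree: int = 4) -> int:
    """Parameter count of a decoupled nonlinearity with r branches:
    per branch, one column of V (n_states + n_inputs entries), one column
    of W (n_states entries) and `degree` polynomial coefficients
    (constant terms excluded). Grows linearly in both r and degree."""
    return r * (n_states + n_inputs + n_states + degree)
```

Raising the polynomial degree by one adds only r parameters, in contrast to the combinatorial growth of the full monomial basis.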
The function g of a single-branch model of order 11, 18 and 20 is depicted in Figure 10. The functions clearly show a dominant
cubic behavior. Notice that in the case of a single-branch model, there is only one nonlinear element in the PNLSS model, i.e. a
univariate function g in the state equation. This allows for easy visualization and interpretation. The cubic behavior is in line
with previous work on the aircraft which showed that the dominant nonlinearity acts as a cubic spring [33]. This can be linked
to the variable stiffness of the bolted T joints (between the payload and the wing) acting as hardening springs. When the number
of branches is increased, similar shapes are retrieved. The visualization of the nonlinearity, enabled through function
decoupling, hence directly increases the interpretability of the model.
Next, we consider decoupled models with the number of branches ranging between 1 and 10. Figure 11 depicts the results
obtained for state-space models of order 20, as a function of the number of branches. The left panel shows that the accuracy
of the decoupled models has fallen back to the level of the BLA. The right panel indicates why: the decoupled
functions result in large approximation errors over the given range of branch numbers. Substituting the obtained decoupled
functions back into the state-space model hence considerably decreases the performance. It is believed that a much larger
number of branches would be required to accurately approximate the large original multivariate polynomial.
The only benefit of the transition to the decoupled form is a considerable reduction in the number of parameters, as can be
seen from the middle figure. In the case of a single-branch model, the number of parameters of a nonlinear model with 20 states is
reduced from 45540 to 531. The benefit of small model sizes is that parameter tuning is much less computationally intensive.
We will therefore use the models as initialization for a new optimization run in which all the parameters of the decoupled model
are fine-tuned. The results of 40 iterations of Levenberg–Marquardt optimization are shown in Figure 11 (post-optimized line)
and listed in Table 4.
Figure 10: Illustration of one branch of the filtered decoupled univariate functions for model orders 11 (left), 18 (middle) and
20 (right). Black dots show the nonparametric estimates, the red lines show the polynomial fourth order LS fit and the green
dots represent the residuals.
Figure 11: Fitting results on decoupled PNLSS models with 20 nonlinear states (“full PNLSS”) compared to BLA FRF and
PNLSS models with 20 linear states and 16 nonlinear states (“PNLSS with 16 NL states”). Left: the variation of relative rms
errors as a function of branches. Middle: the variation of the number of parameters as a function of branches. Right: the
variation of the relative errors of the decoupling process as a function of branches.
Apparently, no large (linear) model orders are required when an appropriate nonlinear expansion is used in (5). One might
wonder how to select an optimal model order and number of branches for a given dataset. Too simple a model might fail to
capture all important aspects of the output, resulting in overly large errors in most cases. Figure 11
illustrates that single-branch decoupled models achieve an rrmse similar to that of the original high-complexity PNLSS
models. Models with more than two branches even outperform the original PNLSS models. The results illustrate that for
large model orders, the classical monomial basis is not the preferred choice. The smaller size of the decoupled functions
makes the parameter tuning more tractable, ultimately leading to higher accuracy.
Table 4. Cross-validation results based on the decoupled estimates (relative rms error on test data)

| Model | Low: right wing | Low: left wing | Low: mean | High: right wing | High: left wing | High: mean | Total average |
|---|---|---|---|---|---|---|---|
| Post-optimized decoupled based on PNLSS of 20 states, 1 branch | 0.31 | 0.18 | 0.25 | 0.17 | 0.15 | 0.16 | 0.20 |
| Post-optimized decoupled based on PNLSS of 20 states, 6 branches (best) | 0.22 | 0.11 | 0.17 | 0.12 | 0.11 | 0.11 | 0.14 |
Illustration of the nonlinear state selection strategy
The model complexities and the obtained results in Table 3 show that the rrmse difference between the models with 11 (with
6006 parameters) and 20 states (with 45540 parameters) is only 0.08, despite the fact that the model of order 20 has
approximately 7.5 times more parameters. This can partly be explained by the ratio of training data samples (32768) to the number
of parameters (45540), which is 0.72. Least squares (LS) based fitting algorithms work best when this ratio is much larger
than 1 [21]. In this case there is no unique LS solution, and to cope with this the LS fitting algorithm pushes many parameter
values towards zero (in order to be able to solve the equations effectively).
Second, until now we have only partially used the nonlinear distortion information of the BLA FRM (Figure 7): it has been
used as frequency weighting (the higher the variance around a resonance, the less the weight). However, using the proposed
BLA framework, one can also directly estimate the significance of the nonlinear contributions per state (resonance frequency)
with respect to the noise estimate, see Figure 12.
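The selection idea can be sketched as a simple greedy ranking (our illustration; in the paper, the per-state distortion powers come from the BLA distortion analysis):

```python
import numpy as np

def select_nonlinear_states(nl_power: np.ndarray, coverage: float = 0.8) -> np.ndarray:
    """Given one nonlinear-distortion power estimate per state, return the
    indices of the smallest set of states covering the requested fraction
    of the total nonlinear contribution."""
    order = np.argsort(nl_power)[::-1]                  # strongest states first
    cum = np.cumsum(nl_power[order]) / nl_power.sum()   # cumulative coverage
    k = int(np.searchsorted(cum, coverage)) + 1
    return order[:k]
```

Only the monomials involving the selected states are then kept in the nonlinear expansion, which is what shrinks the parameter count.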
The right panel of Figure 13 and Table 5 show the benefit of the nonlinear state selection proposed in Section 3.1. Using only 2
nonlinear states (5460 parameters), around 24% of the nonlinear contributions are covered, improving the model
performance from an rrmse of 0.41 (for the linear 20-state SS model) to 0.32. With 10 nonlinear states (12340 parameters),
around 80% of the nonlinear distortions are covered, resulting in a massive rrmse improvement to 0.11. The best possible model
is obtained with 16 nonlinear states (27860 parameters, sample-to-parameter ratio 1.156), covering 97%. Further analyzing the BLA noise estimates
(Sections 4.2-4.3) and the rrmse, one can see that the best performing models nearly reach the noise floor: this
means that no further significant improvement can be made on the models.
Figure 12: Fitting of a PNLSS model order 20 with 2nd and 3rd order terms using the nonlinear state selection method
illustrated on a segment of the test data.
Figure 13: FRFs, noise, nonlinear distortions and relative nonlinear contributions estimated at the driving points at the high
excitation level.
Table 5. Cross-validation results based on the PNLSS estimates using the state selection method (relative rms error on test data)

| Model | Low: right wing | Low: left wing | Low: mean | High: right wing | High: left wing | High: mean | Total average |
|---|---|---|---|---|---|---|---|
| PNLSS 20 states, 2 NL states with low-level BLA SS model | 0.32 | 0.32 | 0.32 | 0.30 | 0.34 | 0.32 | 0.32 |
| PNLSS 20 states, 4 NL states with low-level BLA SS model | 0.29 | 0.35 | 0.32 | 0.19 | 0.22 | 0.20 | 0.26 |
| PNLSS 20 states, 6 NL states with low-level BLA SS model | 0.20 | 0.25 | 0.22 | 0.10 | 0.11 | 0.11 | 0.16 |
| PNLSS 20 states, 8 NL states with low-level BLA SS model | 0.13 | 0.14 | 0.14 | 0.23 | 0.31 | 0.27 | 0.20 |
| PNLSS 20 states, 10 NL states with low-level BLA SS model | 0.13 | 0.14 | 0.14 | 0.08 | 0.08 | 0.08 | 0.11 |
| PNLSS 20 states, 12 NL states with low-level BLA SS model | 0.13 | 0.15 | 0.14 | 0.09 | 0.10 | 0.09 | 0.12 |
| PNLSS 20 states, 14 NL states with low-level BLA SS model | 0.11 | 0.13 | 0.12 | 0.07 | 0.08 | 0.08 | 0.10 |
| PNLSS 20 states, 16 NL states with low-level BLA SS model | 0.11 | 0.12 | 0.11 | 0.05 | 0.06 | 0.05 | 0.08 |
| PNLSS 20 states, 18 NL states with low-level BLA SS model | 0.10 | 0.11 | 0.10 | 0.30 | 0.45 | 0.37 | 0.24 |
| PNLSS 20 states, 20 NL states with low-level BLA SS model | 0.13 | 0.13 | 0.13 | 0.24 | 0.26 | 0.25 | 0.19 |
Illustration of the direct decoupling strategy
Figure 14, Figure 15 and Table 6 show the results of the direct decoupling technique with random and unity initialization for the
18-state scenario. Three random and three unity realizations were run; the reported rrmse is the average error over these
realizations. Figure 14 (right) shows the computation time relative to that of the linear model building (FRM-SS). The
computation time of the direct method is the cumulative sum over all three realizations. The results show that models initialized
by the decoupling procedure perform better than models with random or unity initialization. The difference is, however, limited,
especially compared to the considerably larger computing times. For the direct approach, random initialization has proven
slightly better than unity initialization. In many cases where the complexity of the nonlinear system is not too high, the direct
methodology will be a valuable approach. If direct decoupling is possible, it can save a significant amount of computational time.
Figure 14: The direct method (with random and unity initialization) compared to the indirect method for state-space model
orders 18 (left) and 20 (middle).
Figure 15: The direct method (with random initialization) compared to the indirect method for state-space model order 20
with 20, 16 and 10 nonlinear states.
Table 6. Cross-validation results based on the direct decoupled estimates (relative rms error on test data)

| Model | Low: right wing | Low: left wing | Low: mean | High: right wing | High: left wing | High: mean | Total average |
|---|---|---|---|---|---|---|---|
| 18 states with random initialization, 1 branch | 0.33 | 0.21 | 0.27 | 0.16 | 0.15 | 0.15 | 0.21 |
| 18 states with random initialization, 9 branches (best) | 0.21 | 0.15 | 0.18 | 0.14 | 0.13 | 0.14 | 0.16 |
| 18 states with unity initialization, 1 branch | 0.34 | 0.22 | 0.28 | 0.17 | 0.16 | 0.16 | 0.22 |
| 18 states with unity initialization, 10 branches (best) | 0.21 | 0.13 | 0.17 | 0.16 | 0.17 | 0.17 | 0.17 |
| 20 states with random initialization, 1 branch | 0.11 | 0.11 | 0.11 | 0.17 | 0.16 | 0.17 | 0.14 |
| 20 states with random initialization, 6 branches (best) | 0.11 | 0.11 | 0.11 | 0.13 | 0.15 | 0.14 | 0.13 |
| 20 states with unity initialization, 1 branch | 0.11 | 0.11 | 0.11 | 0.19 | 0.22 | 0.21 | 0.16 |
| 20 states with unity initialization, 10 branches (best) | 0.11 | 0.11 | 0.11 | 0.15 | 0.17 | 0.16 | 0.13 |
| 20 states, 16 NL states with random initialization, 1 branch | 0.10 | 0.10 | 0.10 | 0.16 | 0.15 | 0.16 | 0.13 |
| 20 states, 16 NL states with random initialization, 5 branches (best) | 0.10 | 0.10 | 0.10 | 0.14 | 0.17 | 0.16 | 0.13 |
| 20 states, 10 NL states with random initialization, 1 branch | 0.19 | 0.20 | 0.19 | 0.39 | 0.54 | 0.47 | 0.33 |
| 20 states, 10 NL states with random initialization, 5 branches (best) | 0.12 | 0.14 | 0.13 | 0.16 | 0.17 | 0.17 | 0.15 |
6 SUMMARY
The performance of the nonparametric BLA, parametric BLA, PNLSS and decoupled models is detailed in Table 1-Table 6. Here,
a user-oriented overview is given along the route illustrated in Figure 1; the results of this process are summarized
in Table 7.
As the nonparametric analysis (Section 4) shows, the MIMO F-16 GVT measurement is highly nonlinear. If one wants to use
a nonparametric FRM model to simulate the system, an unacceptably low accuracy is to be expected. It was shown that in this case,
the best choice is to use a mixed-level FRM. However, if one wants to obtain an accurate PNLSS or decoupled model,
the low-level BLA FRM (or that of the level closest to the linear regime) has to be used. An illustration of the fitting capabilities of
the BLA FRM is shown in Figure 16 on a zoomed-in section of the high-level part of the test signal segment. The advantages
of FRM-based modeling are that it is very simple to calculate, very fast, and has very low memory needs.
The next step is to build a parametric SS model. We recommend using a model with 18 or 20 states for the given experiment.
If memory requirements are critical, then model order 18 provides a good trade-off between rrmse and the number of parameters.
The estimation process in the latter case is still simple and fast, resulting in a slight parameter reduction (from 420 to 400)
while improving the results (the rrmse is reduced from 0.59 to 0.48). Simulation results on the high-level test data segment are
illustrated in Figure 17. The scenario with 20 states is illustrated in Figure 18: at the cost of more parameters (484 instead of 400),
the rrmse is reduced from 0.48 to 0.41. These parametric SS fits look better, but they would still be insufficiently accurate
for most applications.
The next stage is building classical PNLSS models. The fitting results of the full PNLSS models are already acceptable for
most applications (with an rrmse of 0.21 for 18 states and 0.19 for 20 states), see Figure 19 and
Figure 20. The improvements, however, come with a hefty price tag: the tuning time (on the order of days) and the memory needs
are excessive. This is due to the high number of parameters (400+5606 for 18 states and 484+45056 for 20 states) and the
length of the training data (which results in large Jacobian matrices during the optimization process).
The performance of the full PNLSS models can be further improved with the filtered tensor decomposition method, which
yields a better rrmse of 0.14 with a significantly lower number of parameters (484+276).
In many – though not all – cases, the time-consuming PNLSS modeling steps may be avoided by directly estimating the
decoupled models. This can be seen from the results obtained with the direct decoupled model, which provides an
optimal balance between the number of parameters (484+276) and the performance (an rrmse of 0.13) while avoiding the
time-consuming step of first building and decoupling a full PNLSS model. An illustration is shown in Figure 22. A further
advantage of the decoupling technique is that in many cases, the decoupled function may provide some physical insight into
the nonlinear source.
The best overall performing models were PNLSS models obtained using the BLA based nonlinear state-selection method. With
10 nonlinear states (484+4976 parameters), the rrmse resulted in 0.11, while with 16 carefully selected nonlinear states
(484+27376 parameters) the rrmse resulted in 0.08.
Table 7. Overview of the modeling results (relative rms error on test data).

| Model | Low-level | High-level | Average | Computational time* | Number of parameters (linear + nonlinear) | Memory need of the tuning method* |
|---|---|---|---|---|---|---|
| Nonparametric low-level BLA | 0.33 | 0.86 | 0.59 | < 1 sec | 420 | 0.1 GiB |
| Parametric low-level BLA of 18 states | 0.21 | 0.75 | 0.48 | < 1 minute | 400 | 0.5 GiB |
| Parametric low-level BLA of 20 states | 0.10 | 0.72 | 0.41 | < 5 minutes | 484 | 0.5 GiB |
| PNLSS model with 18 states | 0.18 | 0.24 | 0.21 | ± 1 day | 400+5606 | 8 GiB |
| PNLSS model with 20 states | 0.13 | 0.25 | 0.19 | ± 3 days | 484+45056 | 32 GiB |
| PNLSS model with 16 NL states | 0.11 | 0.05 | 0.08 | ± 1.5 days | 484+27376 | 16 GiB |
| PNLSS model with 10 NL states | 0.14 | 0.08 | 0.11 | ± 1 day | 484+4976 | 8 GiB |
| Indirect decoupled model with 6 branches based on full PNLSS with 20 states | 0.17 | 0.11 | 0.14 | hours + PNLSS | 484+276 | 4 GiB + PNLSS |
| Direct model with 6 branches and 20 states | 0.11 | 0.14 | 0.13 | < 1 hour | 484+276 | 4 GiB |

* exact figures are a function of hardware and software implementations.
Figure 16: Fitting illustration of the output of the nonparametric low-level BLA model is shown on a segment of the test
data.
Figure 17: Fitting illustration of the output of the 18 states low-level SS BLA model is shown on a segment of the test data.
Figure 18: Fitting illustration of the output of the 20 states low-level SS BLA model is shown on a segment of the test data.
Figure 19: Fitting illustration of the output of the 18 states low-level PNLSS model is shown on a segment of the test data.
Figure 20: Fitting illustration of the output of the 20 states low-level PNLSS model is shown on a segment of the test data.
Figure 21: Fitting illustration of the output of the 20 states low-level PNLSS model with 16 nonlinear states is shown on a
segment of the test data.
Figure 22: Fitting illustration of the output of the full PNLSS decoupled model with 6 branches and 20 states is shown on a
segment of the test data.
Figure 23: Fitting illustration of the output of the direct decoupled model with random initialization and 6 branches is shown
on a segment of the test data.
7 CONCLUSIONS
When considering a popular nonlinear modeling approach, such as PNLSS, on a real-life, highly complex nonlinear system,
one might quickly run into problems. The model structures become very large, resulting in:
▪ computationally demanding high complexity optimization problems,
▪ poor estimates of the parameters, leading to low accuracy,
▪ a complete loss of interpretability.
In this work, two novel strategies to constrain the model size for complex, nonlinear, MIMO systems have been introduced.
In addition, these strategies have been embedded in a detailed, step-by-step description of a nonlinear MIMO modeling
procedure within the multisine-driven MIMO Best Linear Approximation (BLA) framework. The two strategies fully exploit
the potential of this framework:
▪ a nonlinear-state selection procedure, informed by the signal-to-noise and signal-to-nonlinearity ratios established
within the BLA, which precedes the parameter estimation,
▪ the use of decoupled functions, which both reduce the size of the model and allow regaining some insight.
Moreover, the strategies proved useful for modeling the nonlinear ground vibration test of an aircraft, which is a
system typically characterized by a high number of modes as well as nonlinear behavior, and therefore challenging to
model using classical data-driven approaches.
ACKNOWLEDGEMENTS
This work was funded by the Strategic Research Program SRP60 of the Vrije Universiteit Brussel, and by the Flemish fund
for scientific research FWO under license number G0068.18N.
APPENDIX
Function decoupling
Decoupling aims at transforming generic multivariate nonlinear functions into a simpler, structured form. The decoupled
structure is characterized by a number of univariate functions of intermediate variables. Decoupling is designed to process
the multivariate nonlinearities which emerge naturally in a large number of dynamical models. The objective is to achieve
model reduction while gaining insight
into the nonlinear mapping [11]. Given a generic nonlinear function

y = f(x),    (10)

with x \in R^m and y \in R^n, the idea is to approximate it by an alternative, so-called decoupled form

y \approx W g(V^T x),    (11)

where the ith function is g_i: R \to R, emphasizing that all the internal functions are strictly univariate. The rationale
behind the idea is that by introducing an appropriate linear transformation of x, denoted V, an alternative basis is obtained in
which univariate functions may be used to describe the nonlinear mapping. Given that classical regression tools, e.g. a
polynomial basis expansion, do not necessarily result in a sparse representation, a rotation towards a more favorable basis can lead
to a more efficient representation of the nonlinearity in terms of the number of parameters.
Notice that the structure in (11) is in fact that of a single-hidden layer neural network with flexible activation functions. It was
shown in [32] that the universal approximation properties, known for classical single-hidden layer networks, extend to this
related structure. The number of univariate functions, denoted r, is a user choice which can be used to control the model
complexity (r may be larger or smaller than the dimension of x). It plays a crucial role since it influences the decoupling
process; as a result, the procedure may return an approximative rather than an exact representation of the original function.
A second linear transformation W maps the function back onto the correct dimensions of the output. The matrices then have
the following dimensions: V \in R^{m x r} and W \in R^{n x r}. The decoupled structure is represented graphically in Figure 24.
Figure 24. Illustration of the decoupling technique. A generic multivariate function is depicted on the left. On the right the
decoupled form is shown, which is in fact a single-hidden layer neural network with flexible activation functions.
Decoupled functions have a number of attractive features. The fact that the nonlinearity is captured by a set of univariate
functions enables the user to easily visualize the relationship. This may lead to valuable insight. Moreover, decoupled functions
are often a much more efficient parametrization of the nonlinearity, resulting in a significant reduction in the number of
parameters [32]. Function decoupling was pioneered in [37]. The cornerstone of that method is the establishment of a connection between the coupled and decoupled forms based on first-order derivative information.
Applying the chain rule, one obtains the Jacobian of (11) as:
\[ J(q) = W \operatorname{diag}\!\big( g_1'(x_1), \ldots, g_r'(x_r) \big) V^{\top} \tag{12} \]
where \( x_i = v_i^{\top} q \). The diagonal structure is a consequence of using univariate functions \( g_i \). [37] found that the underlying diagonality of the Jacobian, which can be observed in the second factor in (12), can be exploited in the decoupling process. It was suggested
to construct a third-order tensor, \( \mathcal{J} \), out of evaluations of the Jacobian of the known function, \( f \). This is achieved by calculating the Jacobian at \( N \) operating points, and stacking the resulting \( N \) Jacobian matrices behind one another. Then, a diagonal decomposition is computed according to (12), denoted \( [\![ W, V, H ]\!] \), such that \( \mathcal{J} \approx [\![ W, V, H ]\!] \). The tensor decomposition is depicted in Figure 25. The decomposition corresponds to the canonical polyadic decomposition (CPD) [30], which returns three matrix factors: the required linear transformation matrices \( W \) and \( V \), together with a third matrix \( H \) which stores nonparametric estimates of the first-order derivatives of the functions. Formally, the CPD is given as
\[ \mathcal{J} \approx \sum_{i=1}^{r} w_i \circ v_i \circ h_i , \]
where \( \circ \) denotes the outer product of vectors [38]. The decomposition can be performed using the Alternating Least Squares (ALS) approach
[38]. This entails that each matrix factor is updated in turn, while the other two are held constant. After the decomposition, the univariate functions are obtained by integrating the derivative estimates stored in \( H \).
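The stacking-and-ALS procedure described above can be illustrated on synthetic data. In the sketch below, a tensor of exact rank r stands in for the stacked Jacobian evaluations; the dimensions, the data and the bare-bones ALS loop are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Toy dimensions: n outputs, m inputs, N operating points, r branches (all assumed)
n, m, N, r = 4, 5, 50, 3
rng = np.random.default_rng(0)

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x r) and B (J x r) -> (I*J x r)."""
    return np.einsum('ir,jr->ijr', A, B).reshape(A.shape[0] * B.shape[0], -1)

def unfold(T, mode):
    """Mode-n unfolding, ordered consistently with khatri_rao above."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

# Synthetic exactly rank-r tensor standing in for the N stacked Jacobian matrices
W0, V0, H0 = (rng.standard_normal(s) for s in ((n, r), (m, r), (N, r)))
J = np.einsum('ir,jr,kr->ijk', W0, V0, H0)          # J[:, :, k] is the k-th Jacobian

# Alternating Least Squares: solve for one factor by least squares, fix the others
W, V, H = (rng.standard_normal(s) for s in ((n, r), (m, r), (N, r)))
for _ in range(500):
    W = np.linalg.lstsq(khatri_rao(V, H), unfold(J, 0).T, rcond=None)[0].T
    V = np.linalg.lstsq(khatri_rao(W, H), unfold(J, 1).T, rcond=None)[0].T
    H = np.linalg.lstsq(khatri_rao(W, V), unfold(J, 2).T, rcond=None)[0].T

err = np.linalg.norm(J - np.einsum('ir,jr,kr->ijk', W, V, H)) / np.linalg.norm(J)
print(err)   # relative reconstruction error; small for an exactly rank-r tensor
```

In the actual procedure the tensor is filled with Jacobians of the known coupled function, and the columns of H are subsequently integrated to recover the branch functions.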
Figure 25: Center: a collection of evaluations of the Jacobian of the decoupled function, stacked in the third dimension. Left:
corresponding third order tensor. Right: extracting the diagonal plane reveals a diagonal tensor decomposition.
A downside of the original method, however, is that non-meaningful results can be obtained. To guarantee a successful
decoupling, the CPD needs to be both exact (equivalence implied in (8)) and unique. It turns out that the uniqueness requirement
imposes strong limitations on the applicability of the method. Whenever a non-unique CPD is obtained, the resulting branch functions can be nonsmooth (not admitting an accurate parametric fit) and hence non-meaningful in the search for a decoupled functional form.
To address this issue, the filtered tensor decomposition (FTD) was introduced in [32]. In the FTD, the original
approach was modified by introducing a smoothness objective in the cost function via regularization. The filtered tensor
decomposition is a generic tool. Since it no longer relies on the uniqueness properties of the CPD, it is capable of retrieving
decoupled functions, regardless of the function family. An additional advantage of removing the uniqueness requirement on
the CPD is that the number of branches (which follows from the number of rank-one terms in the CPD) becomes a design parameter. As a result, the user is able to balance the model complexity against the accuracy of the obtained approximation. For an in-depth discussion of the method the reader is referred to [32].
The results in this work show that the nonlinear functions found in PNLSS models may typically be replaced by decoupled
functions with a low number of univariate branches. This points to the fact that nonlinear dynamical systems are in many cases
driven by a low number of internal nonlinearities.
REFERENCES
[1]
G. Kerschen, K. Worden, A.F. Vakakis, J.-C. Golinval, "Past, present and future of nonlinear system identification in
structural dynamics," Mechanical Systems and Signal Processing, vol. 20, no. 3, pp. 505-592, 2006.
[2]
K. Worden, G.R. Tomlinson, Nonlinearity in Structural Dynamics: Detection, Identification and Modelling, Bristol:
Institute of Physics Publishing, 2001.
[3]
J. Schoukens, L. Ljung, "Nonlinear System Identification: A User-Oriented Road Map," IEEE Control Systems
Magazine, vol. 39, no. 6, pp. 28-99, 2019.
[4]
P. Z. Csurcsia, B. Peeters, J. Schoukens, "User-friendly nonlinear nonparametric estimation framework for vibro-acoustic
industrial measurements with multiple inputs," Mechanical Systems and Signal Processing, vol. 145, 2020.
[5]
R. Priemer, Introductory Signal Processing, World Scientific, ISBN: 9971509199, 1991.
[6]
J. Paduart, Identification of nonlinear systems using Polynomial Nonlinear State Space models, PhD thesis, Vrije Universiteit Brussel, Belgium, 2008.
[7]
M. Schoukens, "Improved Initialization of State-Space Artificial Neural Networks," 2021 European Control Conference
(ECC), pp. 1913-1918, 2021.
[8]
J. Paduart, L. Lauwers, R. Pintelon, J. Schoukens, "Identification of a Wiener-Hammerstein System Using the
Polynomial Nonlinear State Space Approach," in Proceedings of the 15th IFAC Symposium on System Identification,
Saint-Malo, France, 2009.
[9]
M. Schüssler, O. Nelles, "Extrapolation Behavior Comparison of Nonlinear State Space Models," in 19th IFAC
Symposium on System Identification SYSID 2021, Padova, Italy, 2021.
[10]
J. Schoukens, D. Westwick, L. Ljung, T. Dobrowiecki, "Nonlinear System Identification with Dominating Output Noise
- A Case Study on the Silverbox," in 19th IFAC Symposium on System Identification SYSID 2021, Padova, Italy, 2021.
[11]
J. Decuyper, K. Tiels, J. Schoukens and M. C. Runacres, "Retrieving highly structured models starting from black-box
nonlinear state-space models using polynomial decoupling," Mechanical Systems and Signal Processing, vol. 146, 2021.
[12]
P. Z. Csurcsia, B. Peeters, J. Schoukens, T. De Troyer, "Simplified Analysis for Multiple Input Systems: A Toolbox
Study Illustrated on F-16 Measurements," Vibration, vol. 3, no. 2, pp. 70-84, 2020.
[13]
J. Schoukens, "PNLSS toolbox," 2018. [Online]. Available: http://sysidguy.eu/PNLSS_v1_0.zip. [Accessed 1 January 2020].
[14]
R. Pintelon, J. Schoukens, System Identification: A Frequency Domain Approach, 2nd ed., New Jersey: Wiley-IEEE
Press, ISBN: 978-0470640371, 2012.
[15]
P. Z. Csurcsia and J. Lataire, "Nonparametric Estimation of Time-variant Systems Using 2D Regularization," IEEE
Transactions on Instrumentation & Measurement, vol. 65, no. 5, pp. 1259-1270, 2016.
[16]
P. Z. Csurcsia, J. Schoukens, I. Kollár, "Identification of time-varying systems using a two-dimensional B-spline
algorithm," in 2012 IEEE International Instrumentation and Measurement Technology Conference, Graz, Austria, 2012.
[17]
P. Z. Csurcsia, J. Schoukens, I. Kollár, "A first study of using B-splines in nonparametric system identification," in IEEE
8th International Symposium on Intelligent Signal Processing, Funchal, Portugal, 2013.
[18]
G. Birpoutsoukis, P. Z. Csurcsia and J. Schoukens, "Efficient multidimensional regularization for Volterra series
estimation," Mechanical Systems and Signal Processing, vol. 104, pp. 896-914, 2018.
[19]
P. Z. Csurcsia, "Static nonlinearity handling using best linear approximation: An introduction," Pollack Periodica, vol. 8, no. 1, 2013.
[20]
R. Pintelon, Y. Rolain, W. Van Moer, "Probability density function for frequency response function measurements using
periodic signals," IEEE Transactions on Instrumentation and Measurement, vol. 52, no. 1, pp. 61-68, 2003.
[21]
J. Schoukens, R. Pintelon, Y. Rolain, Mastering System Identification in 100 exercises, New Jersey: John Wiley & Sons,
ISBN: 978047093698, 2012.
[22]
W. Heylen, P. Sas, Modal Analysis Theory and Testing, Leuven: Lirias, 2005.
[23]
S. M. Kay, Modern Spectral Estimation: Theory and Application, Prentice Hall Signal Processing Series, 1988.
[24]
T. Dobrowiecki, J. Schoukens, P. Guillaume, "Optimized excitation signals for MIMO frequency function
measurements," IEEE Transactions on Instrumentation and Measurement, vol. 55, pp. 2072-2079, 2006.
[25]
M. Blanco, P. Z. Csurcsia, B. Peeters, K. Janssens and W. Desmet, "Nonlinearity assessment of MIMO electroacoustic
systems on direct field environmental acoustic testing," Proceedings of ISMA 2018 - International Conference on Noise
and Vibration Engineering and USD 2018 - International Conference on Uncertainty in Structural Dynamics, pp. 457-471, 2018.
[26]
P. Z. Csurcsia, B. Peeters, J. Schoukens, "The Best Linear Approximation of MIMO Systems: First Results on Simplified
Nonlinearity Assessment," in Nonlinear Structures and Systems, Volume 1. Conference Proceedings of the Society for
Experimental Mechanics Series. Springer, Cham., 2020.
[27]
B. Cornelis, A. Toso, W. Verpoest, B. Peeters, "Improved MIMO FRF estimation and model updating for robust Time
Waveform Replication on durability test rigs," in International Conference on Noise and Vibration Engineering, Leuven,
2014.
[28]
P. Z. Csurcsia, "MUMI: Multisine for multiple input systems: A user-friendly excitation toolbox for physical systems,"
Software Impacts, vol. 11, p. 100192, 2022.
[29]
P. Z. Csurcsia, B. Peeters and J. Schoukens, "The best linear approximation of MIMO systems: Simplified nonlinearity
assessment using a toolbox," Proceedings of ISMA 2020 - International Conference on Noise and Vibration Engineering
and USD 2020 - International Conference on Uncertainty in Structural Dynamics, pp. 2239-2252, 2020.
[30]
P. Z. Csurcsia, "LPRM: A user-friendly iteration-free combined Local Polynomial and Rational Method toolbox for
measurements of multiple input systems," Software Impacts, vol. 12, p. 100238, 2022.
[31]
L. Ruan and M. Yuan, "Dimension reduction and parameter estimation for additive index models," Statistics and its
Interface, vol. 3, no. 2010, pp. 493-499, 2010.
[32]
J. Decuyper, K. Tiels, S. Weiland and J. Schoukens, "Decoupling multivariate functions using a nonparametric filtered
tensor decomposition," Mechanical Systems and Signal Processing, vol. 179, no. 1, 2022.
[33]
T. Dossogne, J. Noël, C. Grappasonni, G. Kerschen, B. Peeters, J. Debille, M. Vaes, J. Schoukens, "Nonlinear Ground
Vibration Identification of an F-16 Aircraft - Part II Understanding Nonlinear Behaviour in Aerospace Structures Using
Sine-sweep Testing," in International Forum on Aeroelasticity and Structural Dynamics, Saint Petersburg,
Russia, 2015.
[34]
J.P. Noël and M. Schoukens, "F-16 aircraft benchmark based on ground vibration test data," in 2017 Workshop on
Nonlinear System Identification Benchmarks, Brussels, Belgium, 2017.
[35]
P. Z. Csurcsia, "User-Friendly Method to Split Up the Multiple Coherence Function Into Noise, Nonlinearity and
Transient Components Illustrated on Ground Vibration Testing of an F-16 Fighting Falcon," Journal of Vibration
Engineering and Technologies, vol. 10, no. 7, pp. 2577-2591, 2022.
[36]
T. McKelvey, H. Akçay, L. Ljung, "Subspace-based multivariable system identification from frequency response data," IEEE Transactions on Automatic Control, vol. 41, pp. 960-979, 1996.
[37]
P. Dreesen, M. Ishteva and J. Schoukens, "Decoupling multivariate polynomials using first-order information," SIAM
Journal on Matrix Analysis and Applications, vol. 36, no. 2, pp. 864-879, 2015.
[38]
T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455-500,
2009.