Table 1 - uploaded by Damien François
Source publication
Aircraft engines are designed to be used for several decades. Their maintenance is a challenging and costly task, for obvious safety reasons. The goal is to ensure proper operation of the engines, in all conditions, with a zero probability of failure, while taking aging into account. The fact that the same engine is sometimes used on s...
Contexts in source publication
Context 1
... this study, the available variables are those listed in Table 1. The goal of this study is to visualize the Y_ij vectors. ...
Citations
... Euclidean distance (Kohonen, 1982; Olteanu and Villa-Vialaneix, 2015). Self-organizing maps have been used for aircraft engine fleet monitoring in Cottrell et al. (2009); Côme et al. (2010b,a); Forest et al. (2018) and to classify transient flight phases in Faure et al. (2017). No specific study has yet been conducted on using SOM to validate and categorize anomalies, especially on production test data. ...
My CIFRE thesis work continues the theses of Tsirizo Rabenoro (Rabenoro, 2015), Cynthia Faure (Faure, 2018), and Florent Forest (Forest, 2021). The objective of this thesis is to develop a methodology for understanding and highlighting specific typologies of aircraft engine behavior during acceptance tests carried out on test benches, and to help the domain engineers at Safran Aircraft Engines analyze the results. From a theoretical point of view, the main challenge of this thesis is to model and explain observed and unobserved physical phenomena using statistical methods, and to interpret their cause using a (small) subset of the explanatory variables. This work therefore covers both the supervised and the unsupervised settings, under a strong constraint: the models must be interpretable. We seek to indicate the contribution of each variable to the construction of the model, which is known as measuring variable importance. In addition, we seek to improve interpretability by building parsimonious, or sparse, models, that is, models from which the variables that do not contribute to their construction are excluded. In the unsupervised setting, methods for partitioning groups of observations, or clustering methods, are studied. The solution we seek must be usable in high dimension, must be interpretable, and must be able to take the group structure of the variables into account. This type of method is known as sparse clustering. In the supervised setting, we propose methods that model a phenomenon described by continuous (respectively categorical) variables, that is, regression (respectively classification) methods.
We impose the same constraints as in the unsupervised case (handling high-dimensional data, being sparse, indicating variable importance, accounting for group structures among variables). Note, moreover, that we want to model a physical phenomenon in order to explain its principles, so we are not (directly) interested in predictive aspects. This point is extremely important because, as we will see, explaining a phenomenon and explaining the prediction of the associated model are two goals that can conflict. A characteristic of sparse algorithms is that they depend on a tuning parameter (as do clustering methods). Different values of this parameter yield different models, and one must choose among them. It is therefore essential to have an effective model selection method. It must be stressed that model selection is a major challenge in unsupervised clustering. Indeed, there is no universally accepted method for evaluating clustering results, for the obvious reason that there is no ground truth against which the results could be compared. The same holds for variable selection (sparse models) and variable importance, even in the supervised setting, because one never has access to the true set of variables and the true variable importances that define the underlying phenomenon under study.
... Euclidean distance Kohonen (1982); Olteanu & Villa-Vialaneix (2015). Self-organizing maps have been used for aircraft engine fleet monitoring in Cottrell et al. (2009); Côme et al. (2010b,a); Forest et al. (2018) and to classify transient flight phases in Faure et al. (2017). No specific study has yet been conducted on using SOM to validate and categorize anomalies, especially on production test data. ...
Engines are verified through production tests before they are delivered to customers. During those tests, many measurements are taken on different parts of the engine, covering multiple physical parameters. Unexpected measurements can be observed, so it is important to assess whether these unusual observations are statistically significant. However, anomaly detection is a difficult problem in unsupervised learning. The obvious reason is that, unlike in supervised classification, there is no ground truth against which results can be evaluated. Therefore, we propose a methodology based on two independent statistical algorithms to double-check our results. The first approach is the Isolation Forest (IF) model, which is specific to anomaly detection and able to handle a large number of variables. The goal of the algorithm is to find rare items, events or observations that raise suspicion by differing significantly from the majority of the data while, at the same time, discriminating against non-informative variables. One main issue with IF is its lack of interpretability. To address this, we extend Shapley values, which are interpretation indicators, to the unsupervised context in order to interpret the model outputs. The second approach is the Self-Organizing Map (SOM) model, which has useful properties for data mining because it provides both clustering and a visual representation. The performance of the method and its interpretability depend on the chosen subset of variables. We therefore first implement a sparse-weighted K-means to reduce the input space, allowing the SOM to give an interpretable discretized representation. We apply the two methodologies to aircraft engine measurement data. Both approaches give similar results, which are easily interpretable and exploitable by the experts.
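To make the Isolation Forest idea in the abstract above concrete, here is a minimal sketch in pure Python. It is not the authors' implementation: the function names, the tree depth limit, the sample size and the toy data are all assumptions chosen for illustration, and it assumes at least two observations. It only shows the core principle, namely that points which are isolated after few random splits receive a score closer to 1.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup (normaliser)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # harmonic number approximation
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:  # nothing left to split on this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {"feature": feature, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, plus a correction for unsplit leaves."""
    if "feature" not in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if x[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=64, seed=0):
    """Score in (0, 1]; higher means easier to isolate, hence more anomalous."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))  # assumes len(data) >= 2
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2.0 ** (-sum(path_length(t, x) for t in trees) / (n_trees * norm))
            for x in data]
```

Calling `anomaly_scores` on a list of measurement vectors returns one score per vector; an expert would then inspect the highest-scoring observations. The variable-weighting and Shapley-based interpretation steps mentioned in the abstract are not covered by this sketch.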
... Self-organizing maps have already been used for aircraft engine fleet monitoring (Cottrell et al., 2009; Côme, Cottrell, Verleysen, & Lacaille, 2010, 2011; Forest et al., 2018). These works focus on the performance health state of the engine and not on vibration aspects. ...
Vibration analysis is an important component of industrial equipment health monitoring. Aircraft engines in particular are complex rotating machines where vibrations, mainly caused by unbalance, misalignment, or damaged bearings, put engine parts under dynamic structural stress. Thus, monitoring the vibratory behavior of engines is essential to detect anomalies and trends, avoid faults and improve availability. Intrinsic properties of parts can be described by the evolution of vibration as a function of rotation speed, called a vibration signature. This work presents a methodology for large-scale vibration monitoring of operating civil aircraft engines, based on unsupervised learning algorithms and a flight recorder database. First, we present a pipeline for massive extraction of vibration signatures from raw flight data, consisting of time-domain medium-frequency sensor measurements. Then, signatures are classified and visualized using interpretable self-organized clustering algorithms, yielding a visual cartography of vibration profiles. Domain experts can then extract various insights from the resulting models. An abnormal temporal evolution of a signature gives early warning before failure of an engine. In a post-finding situation after an event has occurred, similar at-risk engines are detectable. The approach is global, end-to-end and scalable, which is still uncommon in our industry, and has been tested on real flight data.
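The notion of a vibration signature, vibration as a function of rotation speed, can be illustrated with a very small sketch. The abstract does not describe how the authors extract signatures, so everything below is an assumption: the function name, the fixed-width speed bins and the use of a per-bin mean are illustrative choices only.

```python
from collections import defaultdict


def vibration_signature(speeds, amplitudes, bin_width=100.0):
    """Mean vibration amplitude per rotation-speed bin.

    Returns a list of (bin start speed, mean amplitude) pairs sorted by speed.
    bin_width is a hypothetical parameter, in the same unit as the speeds.
    """
    bins = defaultdict(list)
    for speed, amplitude in zip(speeds, amplitudes):
        bins[int(speed // bin_width)].append(amplitude)
    return [(index * bin_width, sum(values) / len(values))
            for index, values in sorted(bins.items())]
```

Each flight then yields one curve on a common speed grid, which is the kind of fixed-length vector a clustering algorithm such as a self-organizing map can take as input.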
... This approach makes it possible to detect degradation patterns and the nature of the problem, and to derive the remaining useful life [9,10,12]. This network has been widely used in various application fields for its particular abilities [13][14][15][16][17][18]. ...
One of the main goals of predictive maintenance is to trigger the right maintenance actions at the right moment, building upon the monitoring of the health status of the concerned systems and their components. As such, it allows incipient faults to be identified and the moment of failure to be forecast at the earliest stage. Many different data-driven methods are used in such approaches (Naderi and Khorasani in 2017 IEEE 30th Canadian conference on electrical and computer engineering (CCECE), Windsor, ON, IEEE, pp 1–6, 2017. https://doi.org/10.1109/ccece.2017.7946715; Sarkar et al. in J Eng Gas Turbines Power 1338(8):081602, 2011. https://doi.org/10.1115/1.4002877; Svärd et al. in Mech Syst Signal Process 45(1):170–192, 2014. https://doi.org/10.1016/j.ymssp.2013.11.002; Pourbabaee et al. in Mech Syst Signal Process 76–77:136–156, 2016. https://doi.org/10.1016/j.ymssp.2016.02.023). This work uses self-organizing maps (SOMs), also known as Kohonen maps, for their ability to emphasize underlying behavior such as fault modes. An automatic fault mode detection method is presented, based on a SOM network and kernel density estimation, with as little prior knowledge as possible. The different SOM development steps are presented, and the solutions proposed to structure the approach are accompanied by mathematical methods. The generated maps are then used with kernel density analysis to isolate fault modes on them. Finally, a methodology is presented to identify the different fault modes. The work is illustrated with an aircraft jet engine case study.
... A number of modified SOM versions have been developed and proposed to improve vector quantization and topology preservation performance [11][12][13][14][15][16][17][18]. Brugger et al. and Bogdan et al. proposed a method for detecting clusters by applying different clustering algorithms to the SOM [12,19]. ...
Advances in technology produce huge amounts of data that need to be categorised within an acceptable time so that end users and decision makers can make use of their contents. Present unsupervised algorithms are not capable of processing such amounts of generated data in a short time. This increases the challenges posed by storing, analyzing, recognizing patterns in, reducing the dimensionality of, and processing data. The Self-Organizing Map (SOM) is a specialized clustering technique that has been used in a wide range of applications to solve different problems. Unfortunately, it suffers from slow convergence and high steady-state error. The work presented in this paper is based on the recently proposed modified SOM technique that introduces a Robust Adaptive learning approach to the SOM (RA-SOM). RA-SOM helps to overcome many of the current drawbacks of the conventional SOM and is able to outperform the SOM in obtaining the winner neuron in a shorter learning time. To verify the improved performance of the RA-SOM, it was compared against other versions of the SOM algorithm, namely GF-SOM, PLSOM, and PLSOM2. The test results showed that the RA-SOM algorithm outperformed the conventional SOM and the other algorithms in terms of convergence rate, Quantization Error (QE), and Topology Error (TE), using datasets of different sizes. The results also showed that RA-SOM maintained an efficient performance on all the different types of datasets used, while the other algorithms showed a more inconsistent performance, which suggests that their performance could depend on the type of data.
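The two quality measures named in the abstract above can be sketched as follows. Quantization error is taken here as the mean distance between each sample and its closest prototype, and topology error (often called topographic error) as the share of samples whose two closest prototypes are not neighbours on the grid. The representation of the map as a dictionary from grid coordinates to prototype vectors, and the choice to count diagonal units as neighbours, are assumptions of this sketch rather than details taken from the paper.

```python
import math


def quantization_error(weights, data):
    """Mean Euclidean distance between each sample and its closest prototype."""
    total = 0.0
    for x in data:
        total += min(math.dist(w, x) for w in weights.values())
    return total / len(data)


def topographic_error(weights, data):
    """Share of samples whose two closest prototypes are not grid neighbours."""
    errors = 0
    for x in data:
        ranked = sorted(weights, key=lambda unit: math.dist(weights[unit], x))
        (r1, c1), (r2, c2) = ranked[0], ranked[1]
        # Assumption: units are neighbours if they touch, diagonals included.
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)
```

A low quantization error indicates that the prototypes sit close to the data, while a low topographic error indicates that nearby units on the grid represent nearby regions of the data space.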
... Come et al. [172] applied SOM for aircraft engines data visualization in combination with two other modules, one to normalize the effect of ambient condition variations on the measurements (based on a linear regression approach) and the other for fault detection (based on the joint use of a recursive least squares (RLS) and GLR algorithms). In another study, Cottrell et al. [173] used SOM to visualize an aircraft engine health evolution based on preprocessed data through a General Linear Model (GLM). A fault diagnostic algorithm for gas turbine fuel systems was also introduced by Cao et al. [162] based on an improved SOM approach. ...
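The normalization step described above, removing the effect of ambient conditions with a linear regression before visualizing the data, can be sketched with a one-covariate least-squares fit. This is only an illustration under stated assumptions: the cited works may use several context variables and a different model, and the names `context` and `measurement` are hypothetical (for instance, outside air temperature and exhaust gas temperature).

```python
def ols_residuals(context, measurement):
    """Residuals of the least-squares fit measurement ~ intercept + slope * context."""
    n = len(context)
    mean_x = sum(context) / n
    mean_y = sum(measurement) / n
    sxx = sum((x - mean_x) ** 2 for x in context)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(context, measurement))
    slope = sxy / sxx  # assumes the context variable is not constant
    intercept = mean_y - slope * mean_x
    # What remains after removing the part explained by the context variable.
    return [y - (intercept + slope * x) for x, y in zip(context, measurement)]
```

The residuals, rather than the raw measurements, would then be passed to the map, so that two flights in different ambient conditions become comparable.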
Gas-path diagnostics is an essential part of gas turbine (GT) condition-based maintenance (CBM). There exists extensive literature on GT gas-path diagnostics and a variety of methods have been introduced. The fundamental limitations of the conventional methods such as the inability to deal with the nonlinear engine behavior, measurement uncertainty, simultaneous faults, and the limited number of sensors available remain the driving force for exploring more advanced techniques. This review aims to provide a critical survey of the existing literature produced in the area over the past few decades. In the first section, the issue of GT degradation is addressed, aiming to identify the type of physical faults that degrade a gas turbine performance, which gas-path faults contribute more significantly to the overall performance loss, and which specific components often encounter these faults. A brief overview is then given about the inconsistencies in the literature on gas-path diagnostics followed by a discussion of the various challenges against successful gas-path diagnostics and the major desirable characteristics that an advanced fault diagnostic technique should ideally possess. At this point, the available fault diagnostic methods are thoroughly reviewed, and their strengths and weaknesses summarized. Artificial intelligence (AI) based and hybrid diagnostic methods have received a great deal of attention due to their promising potentials to address the above-mentioned limitations along with providing accurate diagnostic results. Moreover, the available validation techniques that system developers used in the past to evaluate the performance of their proposed diagnostic algorithms are discussed. Finally, concluding remarks and recommendations for further investigations are provided.
... Many versions of the SOM have been proposed to improve the vector quantization and the topology preservation performances [16][17][18][19][20][21][22][23][24][25][26][27]. In [19], the authors proposed a new way to detect clusters automatically by applying a cluster algorithm [28] to the SOM. ...
... Section IV details each component of the pipeline along with their requirements and motivations. Our guiding thread throughout the development of the pipeline will be the engine health monitoring (EHM) use case using a self-organizing map (SOM) described in [15]-[18], which we will refer to as the SOM-EHM use case. However, the range of possible applications is much wider. ...
... It maps the data space onto a two-dimensional grid preserving the topology of the original data space. The approaches presented by [15]-[18] use this algorithm to visualize the state of a fleet of aircraft engines. In [17], their data contains 20 variables (15 context variables and 5 engine variables) measured on a fleet of 91 engines during approximately one year. ...
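The mapping onto a two-dimensional grid can be sketched with the classic online SOM update: find the best matching unit for a sample, then pull that unit and its grid neighbours towards the sample. This is a minimal pure-Python sketch, not the implementation used in the cited works; the function names, the Gaussian neighbourhood, the linear decay schedules and the default parameters are assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid coordinates of the prototype closest to x (squared Euclidean distance)."""
    return min(weights, key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns {(row, col): prototype vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    # One prototype per grid unit, initialised from randomly chosen samples.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

After training, `best_matching_unit(weights, x)` gives the grid cell of any observation, so successive measurements of one engine can be drawn as a trajectory on the map.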
... Another way to detect anomalies is to use Kohonen's Self-Organizing Maps (SOM). For example, in (Cottrell et al., 2009), Cottrell et al. project the measured flight data onto a two-dimensional map in order to analyze the trajectory of these data flight after flight. In (Come, 2011), Côme et al. also use this method to visualize the evolution of "healthy" data corrected for exogenous data. ...
The analysis of multivariate time series, produced by sensors on the aircraft engine during a flight or a test, represents a new challenge for domain experts in aeronautics. Each time series can be decomposed, in a univariate manner, into a succession of transient phases, which are well known to the experts, and stabilized phases, which are less explored even though they carry a great deal of information about how an engine operates. Our project aims to convert these time series into a succession of labels designating transient and stabilized phases in a bivariate context. This transformation of the data opens up several perspectives: spotting similar patterns during a flight in a univariate or bivariate context, finding curve segments similar to a given curve, identifying atypical phases, detecting frequent and rare label sequences during a flight, finding the most representative flight, and determining the "fickle" flights (vols «volages»). This manuscript proposes a methodology to automatically identify transient and stabilized phases, classify the transient phases, label time series, and analyze them. All algorithms are applied to flight data and the results are validated by the experts.
... SOM are dual layer (input and output) artificial neural networks that were originally developed for pattern recognition in artificial intelligence research. SOM have been applied to the residuals of linear models for predicting faults in aircraft engines [29], and for pattern detection in EEMs [21,30-34], aquatic systems [35,36], and ecology-toxicology [37,38]. SOM were applied both to the residuals and factor loadings (i.e., Fmax) from PARAFAC models in order to assess: (1) patterns in the residuals and their relationship with DOM from different tree species; (2) the relative differentiability of DOM by tree species/headwater; and (3) model overfitting by comparing the distributions of component loadings. ...
Parallel factor analysis (PARAFAC) has facilitated an explosion in research connecting the fluorescence properties of dissolved organic matter (DOM) to its functions and biogeochemical cycling in natural and engineered systems. However, the validation of robust PARAFAC models using split-half analysis requires an often unrealistically large number (hundreds to thousands) of excitation-emission matrices (EEMs), and models with too few components may not adequately describe differences between DOM. This study used self-organizing maps (SOM) and compared changes in residuals with the effects of adding components to estimate the number of PARAFAC components in DOM from two data sets: MS (110 EEMs from nine leaf leachates and headwaters) and LR (64 EEMs from the Lena River). Clustering by SOM demonstrated that peaks clearly persisted in model residuals after validation by split-half analysis. Plotting the changes to residuals was an effective method for visualizing the removal of fluorophore-like fluorescence caused by increasing the number of PARAFAC components. Extracting additional PARAFAC components via residuals analysis increased the proportion of correctly identified size-fractionated leaf leachates from 56.0 ± 0.8 to 75.2 ± 0.9%, and from 51.7 ± 1.4 to 92.9 ± 0.0% for whole leachates. Model overfitting was assessed by considering the correlations between components, and their distributions amongst samples. Advanced residuals analysis improved the ability of PARAFAC to resolve the variation in DOM fluorescence, and presents an enhanced validation approach for assessing the number of components that can be used to supplement the potentially misleading results of split-half analysis.