Seunghwan Jung’s research while affiliated with Pusan National University and other places


Publications (27)


Development of the events (driftwise, stepwise, malfunction, and failure) from a fault [10].
Fault detection procedure using AAKR.
Fault detection procedure using proposed method.
Selection of shared neighbor using SNN.
Comparison of nearest neighbors between kNN and SNN for faulty data: (a) when neighbors are located at the center; (b) when neighbors are close to the center of cluster; (c) when many neighbors are located on the outskirts.


Fault Detection Method Using Auto-Associative Shared Nearest Neighbor Kernel Regression for Industrial Processes
  • Article
  • Full-text available

February 2025 · 14 Reads · Eunkyeong Kim · Seunghwan Jung · [...]

As industrial systems grow larger and more interconnected, timely fault detection is essential to minimize downtime, enhance reliability, and reduce costs. However, conventional methods focus on reactive maintenance, limiting their ability to detect faults before escalation. Additionally, fault propagation in large-scale systems can degrade detection performance. To address these challenges, we propose an auto-associative shared nearest neighbor kernel regression method for fault detection in complex industrial processes. Inspired by attention mechanisms, the proposed approach assigns higher weights to relevant training data. Shared nearest neighbor is used to assess similarity between faults and training data, rescaling distances accordingly. These adjusted distances are then utilized in auto-associative kernel regression for fault detection. The performance of the proposed method is evaluated by applying it to benchmark data from the Tennessee Eastman Process and a real-world, unplanned shutdown case concerning a circulating fluidized bed boiler. The experimental results show that the proposed method can detect anomalies up to 2 h earlier than conventional fault detection methods.
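The abstract above builds on auto-associative kernel regression (AAKR). As a rough illustration, the sketch below shows plain AAKR only, not the shared-nearest-neighbor variant proposed in the paper: a query vector is reconstructed as a Gaussian-kernel-weighted average of stored normal-operation vectors, and a large reconstruction error is read as a sign of a fault. The function names, the toy data, and the bandwidth value are assumptions made for this example.

```python
import math


def aakr_reconstruct(memory, query, bandwidth):
    """Reconstruct `query` as a kernel-weighted average of memory vectors."""
    weights = []
    for x in memory:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, query))
        # Gaussian kernel: closer memory vectors receive higher weights.
        weights.append(math.exp(-dist_sq / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from every memory vector")
    return [
        sum(w * x[j] for w, x in zip(weights, memory)) / total
        for j in range(len(query))
    ]


def squared_prediction_error(query, estimate):
    """Sum of squared residuals between the query and its reconstruction."""
    return sum((q - e) ** 2 for q, e in zip(query, estimate))


# Hypothetical normal-operation data with two process variables.
memory = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [1.0, 2.1]]
normal_query = [1.0, 2.0]
faulty_query = [1.0, 3.0]  # second variable has drifted

spe_normal = squared_prediction_error(
    normal_query, aakr_reconstruct(memory, normal_query, bandwidth=0.5))
spe_faulty = squared_prediction_error(
    faulty_query, aakr_reconstruct(memory, faulty_query, bandwidth=0.5))
```

In practice the error would be compared against a control limit estimated from normal data; that limit is not shown here.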


A Hybrid Fault Detection Method of Independent Component Analysis and Auto-Associative Kernel Regression for Process Monitoring in Power Plant

January 2025 · 19 Reads · 1 Citation

IEEE Access

In complex industrial processes, distributed control systems (DCSs) are currently operated to prevent unplanned shutdowns and major accidents. However, although DCSs have the advantage of collecting large amounts of operational history data, their monitoring capabilities, such as early detection, are limited because they generate fault alarms from simple threshold values. To improve the stability and reliability of industrial processes, it is essential to operate DCSs in conjunction with data-driven process monitoring technologies. In this paper, we propose a novel hybrid model combining independent component analysis (ICA) and auto-associative kernel regression (AAKR) to address the limitations of both models. The proposed model (ICA+AAKR) introduces a new method, cumulative percentage distance (CPD), which can determine the appropriate number of independent components (ICs) for dimensionality reduction in ICA. By inputting the dimension-reduced IC matrix into AAKR, the issue of excessive computation time caused by lazy learning in AAKR is effectively mitigated. We applied the proposed fault detection method to two well-known benchmarks (multivariate dynamic process and Tennessee Eastman process) and a real-world application (actual tube leakage in a power plant) to verify its monitoring performance. The experimental results showed superior detection performance compared to existing methods for the two benchmark problems. In addition, the method demonstrated the potential to enhance process stability and reliability by enabling remarkably early detection of tube leakage in a circulating fluidized bed boiler at the power plant.


Improved Surface Solar Irradiation Estimation Using Satellite Data and Feature Engineering

December 2024 · 22 Reads

Planning an optimal installation site to maximize power-generation efficiency is crucial for the effective operation of photovoltaic power plants. Achieving this requires accurate, reliable information on solar irradiation across different regions. However, ground-based measurements using pyranometers are resource-intensive, requiring substantial time and human effort, and their measurement range is limited, complicating data collection. To address this, we propose a method to accurately estimate surface solar irradiation (SSI) using satellite data and feature engineering. By leveraging satellite data as the primary input, we overcome the spatial and temporal limitations of ground-based measurements. Additionally, we improve SSI-estimation performance through designed features based on the geometric information of the Sun and satellite. A hybrid deep neural network model is used for SSI estimation, effectively handling data of varying dimensions. Hourly SSI data from 12 synoptic observation stations collected over one year, excluded from the model’s training and validation sets, are utilized to evaluate the proposed method. Experimental results demonstrate strong SSI-estimation performance, with an average root mean square error (RMSE) of 0.1813 MJ/m², a relative RMSE of 0.1601, mean absolute error of 0.1159 MJ/m², and coefficient of determination of 0.9680.
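The four performance indices quoted in the abstract above can be computed as in the following sketch. It assumes that relative RMSE means RMSE divided by the mean of the observed values, which is one common convention; the abstract does not state which normalization the paper uses. The function name and sample values are illustrative only.

```python
import math


def regression_metrics(observed, estimated):
    """Return RMSE, relative RMSE, MAE and R^2 for paired value lists."""
    n = len(observed)
    errors = [e - o for o, e in zip(observed, estimated)]
    mean_obs = sum(observed) / n
    ss_res = sum(err ** 2 for err in errors)             # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    return {
        "rmse": rmse,
        "rrmse": rmse / mean_obs,  # assumption: normalized by the observed mean
        "mae": sum(abs(err) for err in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical hourly values in MJ/m^2.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```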


Enhanced Sequence-to-Sequence Attention-Based PM2.5 Concentration Forecasting Using Spatiotemporal Data

December 2024 · 18 Reads

Severe air pollution problems continue to increase because of accelerated industrialization and urbanization. Specifically, fine particulate matter (PM2.5) causes respiratory and cardiovascular diseases and, according to the World Health Organization (WHO), leads to millions of premature deaths and significant health burdens annually. Therefore, PM2.5 concentration forecasting is essential. This study proposes a method to forecast PM2.5 concentrations one hour ahead using Sequence-to-Sequence Attention (Seq2Seq-attention). The proposed method selects neighboring stations using minimum redundancy maximum relevance (mRMR) and integrates their data using a convolutional neural network (CNN). The proposed attention score and Seq2Seq are applied to the integrated data to forecast the PM2.5 concentration one hour ahead. The performance of the proposed method is validated through two case studies. The first case study compared the conventional attention score with the proposed attention score. The second compared the forecasting results with and without neighboring stations. The first study showed that the proposed attention score improved the performance index (Root Mean Square Error (RMSE): 3.48%p, Mean Absolute Error (MAE): 8.60%p, R2: 0.49%p, relative Root Mean Square Error (rRMSE): 3.64%p, Percent Bias (PBIAS): 59.29%p). The second case study showed that considering neighboring stations’ data can be more effective in forecasting than considering that of a standalone station (RMSE: 5.49%p, MAE: 0.51%p, R2: 0.67%p, rRMSE: 5.44%p, PBIAS: 46.56%p). This confirmed that the proposed method can effectively forecast the PM2.5 concentration one hour ahead.


Spatio-Temporal Deep Learning-Based Forecasting of Surface Solar Irradiance: Leveraging Satellite Data and Feature Selection

March 2024 · 42 Reads · 2 Citations

This paper proposes a method for forecasting surface solar irradiance (SSI), the most critical factor in solar photovoltaic (PV) power generation. The proposed method uses 16-channel data obtained by the GEO-KOMPSAT-2A (GK2A) satellite of South Korea as the main data for SSI forecasting. To determine feature variables related to SSI from the 16-channel data, the differences and ratios between the channels were utilized. Additionally, to consider the fundamental characteristics of SSI originating from the sun, solar geometry parameters, such as solar declination (SD), solar elevation angle (SEA), and extraterrestrial solar radiation (ESR), were used. Deep learning-based feature selection (Deep-FS) was employed to select appropriate feature variables that affect SSI from various feature variables extracted from the 16-channel data. Lastly, spatio-temporal deep learning models, such as convolutional neural network–long short-term memory (CNN-LSTM) and CNN–gated recurrent unit (CNN-GRU), which can simultaneously reflect temporal and spatial characteristics, were used to forecast SSI. Experiments conducted to verify the proposed method against conventional methods confirmed that the proposed method delivers superior SSI forecasting performance.
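The solar geometry parameters named in the abstract above (SD, SEA, ESR) can be approximated with standard textbook formulas, as in the sketch below. This is an illustration under stated assumptions, not the paper's implementation: it uses Cooper's approximation for declination, a solar constant of 1367 W/m², and local solar time as input. The function names are hypothetical.

```python
import math

SOLAR_CONSTANT = 1367.0  # W/m^2; a commonly used value (assumption)


def solar_declination_deg(day_of_year):
    """Solar declination in degrees (Cooper's approximation)."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def solar_elevation_deg(latitude_deg, day_of_year, solar_hour):
    """Solar elevation angle in degrees for a given local solar time."""
    decl = math.radians(solar_declination_deg(day_of_year))
    lat = math.radians(latitude_deg)
    # The hour angle is 0 at solar noon and changes by 15 degrees per hour.
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    sin_elev = (math.sin(lat) * math.sin(decl)
                + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
    # Clamp against floating-point overshoot before taking the arcsine.
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_elev))))


def extraterrestrial_radiation(day_of_year):
    """Extraterrestrial irradiance on a plane normal to the Sun, in W/m^2."""
    angle = math.radians(360.0 * day_of_year / 365.0)
    return SOLAR_CONSTANT * (1.0 + 0.033 * math.cos(angle))
```

For example, `solar_elevation_deg(35.0, 81, 12.0)` gives the noon elevation at latitude 35° N near the March equinox, which is close to 90° minus the latitude.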


The Early Detection of Faults for Lithium-Ion Batteries in Energy Storage Systems Using Independent Component Analysis with Mahalanobis Distance

January 2024 · 28 Reads · 2 Citations

In recent years, battery fires have become more common owing to the increased use of lithium-ion batteries. Therefore, monitoring technology is required to detect battery anomalies because battery fires cause significant damage to systems. We used Mahalanobis distance (MD) and independent component analysis (ICA) to detect early battery faults in a real-world energy storage system (ESS). The fault types included historical data of battery overvoltage and humidity anomaly alarms generated by the system management program. These are typical preliminary symptoms of thermal runaway, the leading cause of lithium-ion battery fires. The alarms were generated by the system management program based on thresholds. If a fire occurs in an ESS, the humidity inside the ESS will increase very quickly, which means that threshold-based alarm generation methods can be risky. In addition, industrial datasets contain many outliers for various reasons, including measurement and communication errors in sensors. These outliers can lead to biased training results for models. Therefore, we used MD to remove outliers and performed fault detection based on ICA. The proposed method determines confidence limits based on statistics derived from normal samples with outliers removed, resulting in well-defined thresholds compared to existing fault detection methods. Moreover, it demonstrated the ability to detect faults earlier than the point at which alarms were generated by the system management program: 15 min earlier for battery overvoltage and 26 min earlier for humidity anomalies.
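The outlier-removal step described above can be sketched with the Mahalanobis distance, which scales each sample's deviation from the mean by the covariance of the data. To stay dependency-free, the example below is restricted to two variables so that the covariance matrix can be inverted in closed form. The sample values (voltage, humidity) and the threshold of 2.5 are hypothetical, and the paper's actual threshold rule is not given in the abstract.

```python
import math


def mahalanobis_2d(samples):
    """Mahalanobis distance of each (x, y) sample from the sample mean."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in samples:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for 2x2.
        d_sq = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d_sq))
    return distances


def remove_outliers(samples, threshold):
    """Keep only the samples whose distance does not exceed `threshold`."""
    distances = mahalanobis_2d(samples)
    return [s for s, d in zip(samples, distances) if d <= threshold]


# Hypothetical (cell voltage in V, relative humidity in %) readings.
samples = [(3.70, 40.0), (3.71, 41.0), (3.69, 39.0), (3.70, 41.0),
           (3.72, 40.0), (3.68, 40.0), (3.71, 39.0), (3.69, 41.0),
           (4.30, 80.0)]  # last reading mimics a sensor or communication error
cleaned = remove_outliers(samples, threshold=2.5)
```

With a sample covariance, the largest possible distance among n samples is bounded by (n - 1)/sqrt(n), so the threshold has to be chosen with the sample size in mind.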


A Fault Detection and Isolation Method via Shared Nearest Neighbor for Circulating Fluidized Bed Boiler

December 2023 · 23 Reads · 2 Citations

Accurate and timely fault detection and isolation (FDI) improve the availability, safety, and reliability of target systems and enable cost-effective operations. In this study, a shared nearest neighbor (SNN)-based method is proposed to identify the fault variables of a circulating fluidized bed boiler. SNN is a variant of k-nearest neighbor (kNN) that uses shared-neighbor information. The distance information between these neighbors can be applied to FDI. In particular, the proposed method can effectively detect faults by weighting the distance values according to the number of neighbors the points share. Moreover, the data distribution is not constrained; therefore, it can be applied to various processes. Unlike principal component analysis and independent component analysis, which are widely used to identify fault variables, the main advantage of SNN is that it does not suffer from smearing effects, because it calculates the contributions from the original input space. The proposed method is applied to two case studies and to the failure case of a real circulating fluidized bed boiler to confirm its effectiveness. The results show that the proposed method can detect faults earlier (1 h 39 min 46 s) and identify fault variables more effectively than conventional methods.
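The shared-neighbor idea in the abstract above can be illustrated with a minimal sketch: two points are considered similar when their k-nearest-neighbor sets overlap. The code only counts shared neighbors; how the paper turns that count into a distance weight is not stated in the abstract and is not reproduced here. The function names and the toy data are assumptions.

```python
import math


def k_nearest(points, index, k):
    """Indices of the k nearest neighbors of points[index], excluding itself."""
    others = [j for j in range(len(points)) if j != index]
    others.sort(key=lambda j: math.dist(points[index], points[j]))
    return set(others[:k])


def shared_neighbor_count(points, i, j, k):
    """Number of neighbors that points[i] and points[j] have in common."""
    return len(k_nearest(points, i, k) & k_nearest(points, j, k))


# Two hypothetical, well-separated clusters of four points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]

same_cluster = shared_neighbor_count(points, 0, 1, k=3)
different_cluster = shared_neighbor_count(points, 0, 4, k=3)
```

Points from the same cluster share neighbors, while points from different clusters share none, which is the property a shared-neighbor weighting can exploit.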


Anomaly Detection Using Puzzle-Based Data Augmentation to Overcome Data Imbalances and Deficiencies

November 2023 · 14 Reads

Machine tools are used in a wide range of applications, and they can manufacture workpieces flexibly. They also require maintenance, and maintenance accounts for a significant portion of the overall costs, alongside the costs involved in ensuring product quality. Therefore, anomaly detection in tool conditions is required, because these tools are essential industrial elements. However, the data related to tool conditions present some challenges: data imbalances and deficiencies. Data imbalances and deficiencies can affect the performance of anomaly detection models. A model trained using data with imbalances and deficiencies may misclassify abnormal data as normal data, leading to errors. To overcome these problems, the proposed method has been designed using the wavelet transform, color space conversion, color extraction, puzzle-based data augmentation, and double transfer learning. The proposed method generated image data from time-series data, effectively extracted features, and generated new image data using puzzle-based data augmentation. The color information was processed to highlight features, and the proposed puzzle-based data augmentation was applied during processing to increase the amount of data to improve the performance of the anomaly detection model. The experimental results showed that the proposed method can classify normal and abnormal data with greater accuracy. In particular, the accuracy of abnormal data classification increased from 25.00% to 91.67%. This demonstrates that the proposed method is effective and can overcome data imbalances and deficiencies.



PM2.5 Concentration Forecasting Using Weighted Bi-LSTM and Random Forest Feature Importance-Based Feature Selection

June 2023 · 88 Reads · 7 Citations

Particulate matter (PM) in the air can cause various health problems and diseases in humans. In particular, the small size of PM2.5 particles enables them to penetrate deep into the lungs, causing severe health impacts. Exposure to PM2.5 can result in respiratory, cardiovascular, and allergic diseases, and prolonged exposure has also been linked to an increased risk of cancer, including lung cancer. Therefore, forecasting the PM2.5 concentration in the surrounding environment is crucial for preventing these adverse health effects. This paper proposes a method for forecasting the PM2.5 concentration after 1 h using bidirectional long short-term memory (Bi-LSTM). The proposed method involves selecting input variables based on the feature importance calculated by random forest, classifying the data to assign weight variables to reduce bias, and forecasting the PM2.5 concentration using Bi-LSTM. To evaluate the performance of the proposed method, two case studies were conducted. The first compared forecasting performance according to preprocessing. The second compared forecasting performance between deep learning models (long short-term memory, gated recurrent unit, and Bi-LSTM) and conventional machine learning models (multi-layer perceptron, support vector machine, decision tree, and random forest). In case study 1, the proposed method improved the performance indices (RMSE: 3.98%p, MAE: 5.87%p, RRMSE: 3.96%p, and R2: 0.72%p) because weights are assigned according to the input variables before forecasting is performed. In case study 2, Bi-LSTM, which considers both directions (forward and backward), forecast more effectively than conventional models (RMSE: 2.70, MAE: 0.84, RRMSE: 1.97, R2: 0.16). These results show that the proposed method can effectively forecast PM2.5 even if the data in the high-concentration section are insufficient.


Citations (11)


... In addition, we designed feature variables representing the Sun's and satellites' geometric characteristics for estimation location to improve the SSI-estimation performance. Moreover, to optimize the input variables for the SSI-estimation model, a performance-contribution-based feature selection method using a deep learning model (Deep-FS) used in a previous study was utilized [22]. The estimation model used a hybrid deep neural network (HDNN) that could handle data with various dimensions. ...

Reference:

Improved Surface Solar Irradiation Estimation Using Satellite Data and Feature Engineering
Spatio-Temporal Deep Learning-Based Forecasting of Surface Solar Irradiance: Leveraging Satellite Data and Feature Selection

... The LOF is highly effective at detecting local anomalies in datasets. The Mahalanobis distance measures how far a data point is from a distribution [17][18][19]. This distance index detects multidimensional anomalies by taking into account the covariance structure of data points. ...

The Early Detection of Faults for Lithium-Ion Batteries in Energy Storage Systems Using Independent Component Analysis with Mahalanobis Distance

... The CFBC shutdown occurred on 9 September 2020, at 14:35, when the operator responded to an emergency alarm, unexpectedly shut down the boiler for inspection, and confirmed the failure. As shown in Figure 10, the failure was caused by tube rupture in Figure 9. Diagram of power generation system in CFBC boiler [41]. ...

A Fault Detection and Isolation Method via Shared Nearest Neighbor for Circulating Fluidized Bed Boiler

... Air pollution is a worldwide problem, where the main pollutants affecting the atmosphere are gases (CO, NO2, SO2, and Pb) and particulate matter PM10 and PM2.5 [3,4]. These are generally released by anthropogenic and natural sources, where respiratory morbidity studies are implicated for certain types of air pollutants according to the World Health Organization (WHO), the main one being particulate matter, PM2.5 [5,6]. PM10 and PM2.5 particles are those that generate a negative effect on health, as well as on the environment and ecosystems; these dangerous particles suspended in the air are composed of solid and liquid particles [7,8]. ...

PM2.5 Concentration Forecasting Using Weighted Bi-LSTM and Random Forest Feature Importance-Based Feature Selection

... This is because image data can be generated simply by data augmentation and can be distinguished better than time-series data. This overcomes data deficiency and imbalance, and improves the classification accuracy of the normal and abnormal data [21,22]. To remove unnecessary data, the current data were extracted during pre-processing. ...

Tool Diagnosis Method of CNC Machine based on Color Space Conversion and Deep Learning
  • Citing Conference Paper
  • November 2022

... This process involves manufacturing products by cutting or milling workpieces according to pre-designed shapes. Manufacturing products using CNC machine tools can affect the cutting tools of the CNC machine tool, as the machine tool processes the workpiece while rotating the device fixed on the spindle motor of the CNC machine tool [5,6]. This paper proposes a most probable explanation algorithm based on the beetle antennae search algorithm (BAS-MPE). ...

Fault Detection for CNC Machine Tools Using Auto-Associative Kernel Regression Based on Empirical Mode Decomposition

... Among classical representation-based methods such as one-class support vector machines (OCSVM) [5], Fong [6] proposed a fault detection method to detect degradation based on one-class support vectors and a hierarchical Bayesian framework. Among nearest-neighbor-based methods such as the local outlier factor (LOF) [7], Kim [8] improved the LOF algorithm and its detection performance, and verified it on the Tennessee Eastman process data. Cheng [9] proposed an LOF fault detection method based on cross-variable variance for the multi-source, high-dimensional data of high-speed trains. ...

Fault Detection Method Using Inverse Distance Weight-based Local Outlier Factor
  • Citing Conference Paper
  • October 2021

... We compare the original signal to a reconstruction of that signal, and if the reconstructed signal and the original signal are sufficiently different (larger than some value), we classify this data point as anomalous; conversely, if the difference is small, the data point is classified as normal. The signal is reconstructed using auto associative kernel regression (AAKR), which "compares the similarity between the training data stored in memory and the query vector (testing data), and then assigns high weights to the high similarity vectors to compute the estimated vector" (Jung, Kim, Kim, & Kim, 2021). A bandwidth parameter ℎ is used to control the weighting function that calculates the weights. ...

Fault Detection Method based on Auto-associative Kernel Regression and Interval Type-2 Fuzzy Logic System for Multivariate Process
  • Citing Conference Paper
  • July 2021

... Recently, diverse LSTM-structured models have been presented due to the benefits induced by the structural characteristics of LSTM in processing time series data. Kim et al. predicted the hourly insolation by constructing an LSTM model based on weather data, with a CV(RMSE) of 26.87% [36], and Jeon et al. built an LSTM model using cloudiness and the previous day's insolation data. In this study, the LSTM model derived a high prediction performance only with other regions' insolation data and the previous day's insolation pattern [37]. ...

A Study on Solar Radiation Forecasting Based on Long Short-term Memory Considering Hourly Weather Changes
  • Citing Article
  • February 2021

Journal of Korean institute of intelligent systems

... Measuring the mean accuracy decrease quantifies the importance of a variable by gauging the change in prediction accuracy that occurs when the values of the variable are randomly permuted compared to the original observations (Calle and Urrea, 2010). This is widely accepted as the most efficient measure of the importance of variables in RF procedures (Lee et al., 2019). The most important variables for the RF classification were distance to water and the canopy height model, with mean contribution values of 45% and 30%, respectively. ...

Case Dependent Feature Selection using Mean Decrease Accuracy for Convective Storm Identification
  • Citing Conference Paper
  • November 2019
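
The mean-decrease-accuracy measure described in the excerpt above can be sketched in a model-agnostic way: permute one variable at a time and record how much the prediction accuracy drops. The example below uses a hypothetical rule-based classifier in place of a random forest, and the function names, data, and repeat count are assumptions made for illustration.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per feature when that feature's values are shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[col] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only this one column permuted.
            permuted = [r[:col] + [v] + r[col + 1:]
                        for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical classifier that looks only at the first feature."""
    return 1 if row[0] > 0.5 else 0


rows = [[i % 2, (i * 7) % 5] for i in range(20)]
labels = [r[0] for r in rows]
importances = permutation_importance(toy_model, rows, labels)
```

Because the toy classifier ignores the second feature, permuting it leaves the accuracy unchanged, while permuting the first feature lowers it.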