Fig 7 - uploaded by Serkan Kartal
The architecture of the utilized LSTM/GRU models for SST prediction.

Source publication
Article
Spatiotemporal time series prediction plays a crucial role in a wide range of applications. However, in most of the studies, spatial information was ignored and predictions were carried out either on a few points or on average values. In this study, 37 different configurations of 4 traditional ML models and 3 Neural Network (NN) based models were u...

Contexts in source publication

Context 1
... In other words, these models receive information from both the previous layer and the previous moment (in this study, "moment" corresponds to 1 day, 1 month, or 3 months, depending on the prediction process performed). These two input values connected to each neuron are shown in Fig. 4, where the inner structure of RNN cells is given, and in Fig. 7, where the general structure of LSTM and GRU is given. This structure makes RNNs more dynamic than standard MLPs. Thus, RNN-based models have achieved incredible success in various areas such as language modeling, translation, and time series prediction in the last few ...
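The two inputs described in this context (the output of the previous layer and the state from the previous moment) can be sketched as a single recurrent step. This is a minimal illustration of a plain Elman-style RNN cell, not the LSTM/GRU models of the source publication; the names `rnn_step`, `run_sequence`, `W_x`, `W_h`, and `b` are hypothetical and chosen only for this example.

```python
import math


def rnn_step(x_t, h_prev, W_x, W_h, b):
    """Compute one step of a simple recurrent layer.

    x_t    : values arriving from the previous layer at the current moment
    h_prev : this layer's hidden state from the previous moment
    W_x    : input weights, one row per neuron (hypothetical name)
    W_h    : recurrent weights, one row per neuron (hypothetical name)
    b      : one bias per neuron
    """
    h_t = []
    for i in range(len(b)):
        total = b[i]
        # Contribution of the previous layer (current moment).
        total += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        # Contribution of the same layer at the previous moment.
        total += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t


def run_sequence(xs, W_x, W_h, b):
    """Apply rnn_step over a sequence, starting from a zero hidden state."""
    h = [0.0] * len(b)
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states
```

With the recurrent weights set to zero, the layer reduces to an ordinary feed-forward layer applied independently at each moment, which is the contrast with standard MLPs that the passage draws.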
Context 2
... on the LSTM and GRU architectures, NN models having a different number of hidden layers and a different number of neurons were built. The general architecture of the LSTM/GRU models utilized for SST prediction is given in Fig. ...

Citations

... Additionally, elevated sea temperatures incite the migration of marine organisms, disrupting existing ecological chains and threatening the stability of fisheries and the integrity of marine ecosystems. Therefore, accurately predicting sea temperatures and implementing potential precautionary measures is of paramount importance (Cheng et al., 2023;Haghbin et al., 2021;Kartal, 2023;Kim et al., 2020;Lin et al., 2023a;Sun et al., 2021;Xu et al., 2023a). ...
... In recent years, deep learning has been widely used to construct predictive models in various applications, demonstrating superior performance compared to traditional machine learning, particularly when large historical datasets are involved (Kang et al., 2024). For instance, Kartal (2023) utilized four traditional machine learning techniques (Decision Tree, KNN, SVM, and LR) and three deep learning techniques (MLP, GRU, and LSTM) to train models with 37 configurations for predicting SST. The results revealed that the LSTM model outperformed the others. ...
... Additionally, considering the ocean's open nature, each area interacts with adjacent areas, necessitating a comprehensive spatial-temporal modelling approach. The spatiotemporal variation model has been widely used in various environmental predictions with temporal and spatial characteristics (Amato et al., 2020;Hamdi et al., 2022;Li et al., 2020;Sharma et al., 2022;Wang et al., 2023;Wu et al., 2023), such as marine ecological information (Kartal, 2023), air quality (Yu et al., 2023), traffic flow (Ji et al., 2023), and temperature (Faridi et al., 2023) etc. In these applications, it's crucial to consider both temporal and spatial influences to enhance predictive accuracy. ...
... Therefore, there are still accuracy issues in SST prediction through numerical methods. Traditional machine learning methods [15] directly learn the rules of SST changes from massive historical databases and use them for prediction. Common machine learning methods include linear regression [16], the Support Vector Machine (SVM) models [17], and Artificial Neural Networks (ANNs) [18]. ...
Article
Sea surface temperature (SST) is an important factor in the marine environment and has significant impacts on climate, ecology, and maritime activities. Most existing SST prediction methods consider the ocean as a uniform field and use a uniform grid to predict SST. However, the marine environment is a complex system, and factors such as solar radiation, differences in land and sea thermal properties, and ocean circulation lead to uneven spatial distributions of SSTs. We propose a non-uniform grid construction method based on an SST spatial gradient to encode SST data, as well as a Non-uniform Grid Graph Convolutional Network (NGGCN) model. The NGGCN consists of two spatiotemporal modules, each of which extracts spatial features from the GCN module, captures temporal correlations through the GRU module, and performs feature restoration and output results through the fully connected module. We selected data from the Yellow Sea and Bohai Sea to validate the effectiveness of the NGGCN in predicting SST at different time scales and prediction steps. The results indicate that our model shows a significant improvement in prediction performance compared to other models.
... Machine learning methods, utilizing rich data and advanced statistical theories, enable computers to automatically identify data patterns and make predictive decisions. Various machine learning techniques such as linear regression (Kug et al. 2004), the orthogonal function method (Sharma et al. 2010), the random forest method, and support vector machines (SVM) (Lins et al. 2013; Bonino et al. 2024; Kartal 2023; Boschetti et al. 2023) are widely used in SST prediction due to their ability to learn from historical data. These methods offer conditional adaptability through iterative optimization without needing to reconstruct the model for each new environment or condition. ...
Article
The dynamics of Sea Surface Temperature (SST) are crucial for maintaining the balance of marine ecosystems. While existing artificial intelligence methods offer powerful tools for SST prediction, they struggle with data sparsity issues. To enhance SST prediction accuracy under sparse data conditions, this study proposes an innovative prediction model: TL-iTransformer. This model is based on the iTransformer architecture and incorporates transfer learning techniques specifically tailored for SST prediction. We begin by extracting SST features from data-rich sea areas (source sea areas) using a transfer learning strategy, integrating these features into the iTransformer model for pre-training. This process imparts the model with a priori knowledge and basic prediction capabilities, enabling it to adapt to data-sparse sea areas (target sea areas). The model is then fine-tuned using domain adaptive techniques to accurately capture the data characteristics and distribution patterns of the target sea area. We conducted a series of experiments using a real SST dataset from the sea area of British Columbia, Canada. The results demonstrate that TL-iTransformer maintains the Mean Absolute Error (MAE) and Mean Squared Error (MSE) within 0.144 and 0.356, respectively, under data sparsity conditions. Additionally, it outperforms four mainstream time-series prediction baseline models as the prediction time span increases. The proposed model can effectively address the issue of SST prediction in situations with sparse data.
... Traditional methods, like threshold-based approaches, are simple to implement but lack sensitivity to subtle anomalies and can trigger false alarms due to environmental fluctuations (Corradino et al., 2023;Jeffrey et al., 2023;Harrou et al., 2024). Statistical methods offer more robustness but struggle to capture early signs of sensor degradation or subtle temperature drifts (Harrou et al., 2023;Kartal, 2023;Zou et al., 2023). Both approaches rely on assumptions about normal temperature distribution, which may not always hold true in real-world scenarios. ...
Article
Maintaining consistent and accurate temperature is critical for the safe and effective storage of vaccines. Traditional monitoring methods often lack real-time capabilities and may not be sensitive enough to detect subtle anomalies. This paper presents a novel deep learning-based system for real-time temperature fault detection in refrigeration systems used for vaccine storage. Our system utilizes a semi-supervised Convolutional Autoencoder (CAE) model deployed on a resource-constrained ESP32 microcontroller. The CAE is trained on real-world temperature sensor data to capture temporal patterns and reconstruct normal temperature profiles. Deviations from the reconstructed profiles are flagged as potential anomalies, enabling real-time fault detection. Evaluation using real-time data demonstrates an impressive 92% accuracy in identifying temperature faults. The system's low energy consumption (0.05 watts) and memory usage (1.2 MB) make it suitable for deployment in resource-constrained environments. This work paves the way for improved monitoring and fault detection in refrigeration systems, ultimately contributing to the reliable storage of life-saving vaccines
... Using in-situ and satellite remote sensing-based ocean observation datasets of SST and other parameters, quite a number of studies have been conducted to model SST and make accurate forecasts and hindcasts. These models can be categorized into two main categories; data-driven and numerical [2][3][4]. Data-driven models, as the name suggests, are driven by past SST data alone. These models involve training models to recognize patterns in the SST time series and adapting to them using mathematical equations. ...
... A combined CNN and FC-LSTM (CFFC-LSTM) model was proposed in 2018 for spatiotemporal SST forecasting and it achieved predictions with over 0.97 accuracy and less than 0.8 RMSE values [13]. An assessment of different ML and DL algorithms with different configurations was carried out for the Mediterranean Sea and the study concluded that LSTM and GRU models performed best among the considered models [4]. ...
Article
Sea Surface Temperature (SST) is of great importance to study several major phenomena due to ocean interactions with other earth systems. Previous studies on SST based on statistical inference methods were less accurate for longer prediction lengths. A considerable number of studies in recent years involve machine learning for SST modeling. These models were able to mitigate this problem to some length by modeling SST patterns and trends. Sequence analysis by decomposition is used for SST forecasting in several studies. Ensemble Empirical Mode Decomposition (EEMD) has been proven in previous studies as a useful method for this. The application of EEMD in spatiotemporal modeling has been introduced as Multidimensional EEMD (MEEMD). The aim of this study is to employ fast MEEMD methods to decompose the SST spatiotemporal dataset and apply a Convolutional Long Short-Term Memory (ConvLSTM)-based model to model and forecast SST. The results show that the fast MEEMD method is capable of enhancing spatiotemporal SST modeling compared to the Linear Inverse Model (LIM) and ConvLSTM model without decomposition. The model was further validated by making predictions from April to May 2023 and comparing them to original SST values. There was a high consistency between predicted and real SST values.
... BP is the most efficient and widely used algorithm in deep learning [15]. Both LSTM and GRU are variations of basic recurrent neural networks and capable of learning long-term sequences [16]. The purpose of adopting different deep neural networks is to find the more skillful one in prediction and the more efficient one in computation. ...
... Remote Sens. 2024, 16, 1034 ...
Article
We explore to what extent data-driven prediction models have skills in forecasting daily sea-surface temperature (SST), which are comparable to or perform better than current physics-based operational systems over long-range forecast horizons. Three hybrid deep learning-based models are developed within the South China Sea (SCS) basin by integrating deep neural networks (back propagation, long short-term memory, and gated recurrent unit) with traditional empirical orthogonal function analysis and empirical mode decomposition. Utilizing a 40-year (1982–2021) satellite-based daily SST time series on a 0.25° grid, we train these models on the first 32 years (1982–2013) of detrended SST anomaly (SSTA) data. Their predictive accuracies are then validated using data from 2014 and tested over the subsequent seven years (2015–2021). The models’ forecast skills are assessed using spatial anomaly correlation coefficient (ACC) and root-mean-square error (RMSE), with ACC proving to be a stricter metric. A forecast skill horizon, defined as the lead time before ACC drops below 0.6, is determined to be 50 days. The models are equally capable of achieving a basin-wide average ACC of ~0.62 and an RMSE of ~0.48 °C at this horizon, indicating a 36% improvement in RMSE over climatology. This implies that on average the forecast skill horizon for these models is beyond the available forecast length. Analysis of one model, the BP neural network, reveals a variable forecast skill horizon (5 to 50 days) for each individual day, showing that it can adapt to different time scales. This adaptability seems to be influenced by a number of mechanisms arising from the evident regional and global atmosphere–ocean coupling variations on time scales ranging from intraseasonal to decadal in the SSTA of the SCS basin.
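The forecast skill horizon described in this abstract (the lead time before ACC drops below 0.6) can be sketched as follows. This is a rough illustration under stated assumptions, not the cited study's code: it uses the uncentered form of the anomaly correlation coefficient over a flattened list of grid points, and the function names `acc`, `rmse`, and `skill_horizon` are hypothetical.

```python
import math


def acc(forecast_anom, observed_anom):
    """Uncentered anomaly correlation coefficient over grid points (assumed form)."""
    num = sum(f * o for f, o in zip(forecast_anom, observed_anom))
    den = math.sqrt(
        sum(f * f for f in forecast_anom) * sum(o * o for o in observed_anom)
    )
    # Return 0.0 when either field has no anomaly signal at all.
    return num / den if den > 0 else 0.0


def rmse(forecast, observed):
    """Root-mean-square error between two equally long sequences."""
    n = len(forecast)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def skill_horizon(acc_by_lead, threshold=0.6):
    """Return the last lead time reached before ACC first falls below threshold.

    acc_by_lead : list of (lead_time_in_days, acc_value) pairs sorted by lead time
    Returns 0 if even the first lead time is below the threshold.
    """
    horizon = 0
    for lead, value in acc_by_lead:
        if value < threshold:
            break
        horizon = lead
    return horizon
```

If ACC never falls below the threshold within the available lead times, `skill_horizon` returns the longest lead time supplied, which matches the abstract's remark that the horizon can lie beyond the available forecast length.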
... The Jacobian matrix of acceleration measurement can be got as (35). ...
... The output gate determines what output should be generated. These gates surpass the vanishing gradient problem and make the LSTM suitable for learning long-term dependencies [35]. The entire calculation of the LSTM models can be expressed using the equations (54) to (59) below: ...
Article
Nine-axis MEMS Inertial Measurement Units (IMUs) have been widely used in various fields, such as underwater vehicles, unmanned aerial vehicles and bionic robots. Because of gyroscope sensor noise and errors introduced in the solution process, rotation angles estimated using only angular velocity data usually contain large accumulated errors and have to be corrected by acceleration and geomagnetic measurements. A serious problem is that, if there is a strong magnetic anomaly field in the environment, the aiding performance of the geomagnetic field decreases quickly and probably leads to extra errors. To improve the heading and attitude estimation accuracy of the nine-axis MEMS IMU in a magnetic anomaly field, a partially adaptive Extended Kalman Filter (PADEKF) using double quaternions is proposed in this work. To reduce the coupled influence of magnetic measurement noise on attitude estimation in a single quaternion, the heading and attitude angles are represented with two independent quaternions in the state vector. A self-adaptive design is adopted in the EKF to improve the robustness to spatially varying magnetic anomaly data. For the case in which a strong and quickly varying magnetic anomaly field cannot be well modelled by the PADEKF, an algorithm combining a long short-term memory (LSTM) neural network with the Runge-Kutta method is given to obtain a good heading estimate. Field experiments in different scenarios verified the effectiveness of the proposed approach.
... Despite its successful application in MODIS, it has been shown that for Landsat-8, this method has a large error and still needs further improvement [15]. In recent years, with the development of big data technology, numerous methods based on statistical regression [16], machine learning [17,18] and data fusion [19] have emerged to estimate SST. ...
Article
The Landsat-8 Collection 2 provides Level-2 surface temperature product (L8-L2ST) at a spatial resolution of 30 m, catering to various applications. However, discrepancies in the spatial resolution of certain parameters involved in L8-L2ST production often result in noticeable “checkerboard” patterns in images over oceanic waters. To enhance the accuracy and reasonability of sea surface temperature (SST) products derived from the Landsat-8 measurements, this study introduces a neural network (NN) based algorithm for the estimation of SST. By sidestepping the conventional radiative-transfer-based method, which relies on numerous auxiliary data products, the SST generated by the NN-based algorithm could avoid the “checkerboard” issues encountered in the L8-L2ST products. Compared to the reference MODIS SST products, the Root Mean Square Error (RMSE) of NN-based SST is 0.7°C, while the RMSE of L8-L2ST is 1.42°C. In comparison to buoy data, the RMSE of this method is 1.18°C, while the RMSE of L8-L2ST is 2°C. This work thus presents a valuable framework for acquiring more consistent and better-quality SST products from Landsat-8 measurements.
... Jahanbakht et al. [28] used an ensemble of two stacked Deep NNs for SST prediction. Most SST prediction studies primarily focus on temporal data and fail to incorporate spatial information [29], [30], [31], [32]. However, SST prediction is broadly a spatiotemporal sequence prediction task. ...
Article
Accurate forecasting of sea surface temperature (SST) is pivotal for a wide range of applications ranging from climate modelling to marine ecosystem management. This study introduces an ingenious multi-dilated model that employs dilated Convolutional Long Short-Term Memory (ConvLSTM) networks alongside a U-Net architecture which aims to enhance the precision of monthly SST predictions at lead periods of 3, 6 and 12 months. Our approach innovatively combines dilated convolutions within ConvLSTM with the segmentation capabilities of U-Net to adeptly capture the complex Spatio-temporal dynamics as well as relevant attributes of SST. The study utilizes a high-resolution MPI-ESM1-2-HR SST dataset for a series of rigorous tests, achieving a Mean Square Error of 0.01, indicating a high accuracy level in SST predictions. The model at a lead time of 12 months also showed a Sea Surface Microwave Index of 0.036, illustrating its effectiveness in reflecting SST’s microwave emissivity characteristics, and an Earth Mover’s Distance score of 0.97, highlighting its ability to closely match the predicted SST distribution with the actual one. Furthermore, a cosine similarity score of 0.99 suggests a significant alignment between the predicted and actual SST patterns. By merging multi-dilated convolutions with segmentation, our model tackles the intricate challenge of simultaneously capturing the spatial and temporal dependencies of SST, setting a new standard for forecasting accuracy in the field.
... On the other hand, new AI-based methods have recently gained attention for their ability to improve predictive performance and to model nonlinear patterns [5]. These algorithms are used in many different areas, from weather and temperature forecasting to stock market prediction, and from crop yield forecasting to vehicle sales forecasting [6,7,8]. In particular, machine learning (ML) models have come into wide use for solving business problems because of their flexibility in detecting patterns in data [9]. ...
Conference Paper
Summary (translated from Turkish): In today's dynamic business world, sales-focused industries such as retail and e-commerce continually strive to improve their ability to forecast future demand. In the literature, demand forecasting is usually addressed with traditional time series methods such as ARIMA; however, because the problem is complex and influenced by both external and internal factors, the search for better methods continues. In this context, the development of artificial intelligence (AI) technologies offers businesses the potential to analyze large amounts of data and use them to forecast future demand more precisely and accurately. In this study, the Prophet model, which captures seasonal trends well and can make long-term forecasts, is used for stock forecasting. This model is an open-source tool developed by Facebook's data science team in 2017 and offers a valuable solution, particularly for businesses that run into the limitations of traditional methods. A dataset consisting of five years of sales values for 50 different products from 10 different stores is used to train and test the model. Because the data form a time series, the model is trained on the first four years of data, and its performance is evaluated on daily, weekly, and monthly forecasts over one year of test data. The results show that, according to the MAPE metric (mean absolute percentage error; MOYH in the Turkish original), the Prophet model produces forecasts with error rates of 14.42% for daily sales, 6.65% for weekly forecasts, and 5.78% for monthly forecasts. This indicates that the model is highly effective at making both long-term and short-term forecasts by learning the patterns in stock sales values. Based on these results, businesses seeking a competitive advantage in the retail sector can benefit from the Prophet model's forecasts when making decisions on production planning and human resources management.
Abstract: In today's dynamic business world, sales-focused industries such as retail and e-commerce are constantly striving to enhance their ability to forecast future demand. In the literature, demand forecasting is often addressed using traditional time series methods like ARIMA. However, due to the complexity of the problem, influenced by both external and internal factors, the search for better methods continues. In this context, the advancement of Artificial Intelligence (AI) technologies offers the potential to analyse vast amounts of data and make more accurate and precise forecasting of future demands. In this study, the Prophet model, which is successful in capturing seasonal trends and making long-term forecasting, is used for stock forecasting. Developed by Facebook's data science team in 2017, this open-source tool provides a valuable solution, particularly for businesses encountering limitations with traditional methods. During the training and testing process of the model, a dataset consisting of five years of sales data of 50 different products in 10 different stores is used. As the data is in the form of a time series, the model is trained on the first four years of data, and its performance is evaluated on one year of test data, making daily, weekly, and monthly forecasting. The results indicate that the Prophet model can make daily sales forecasts with an 84